Isolate Conda on Linux Servers
I’m a huge fan of conda (especially on Windows) for any scientific Python work - it is a cross-platform package manager and virtual environment manager with batteries included. Conda is especially helpful for getting my colleagues set up and ready to go, and for sharing environments. There is one big downside I have encountered so far: installing it on a Linux system has a weird default - it adds itself to the system PATH and overrides your system's Python alias. This is highly problematic in any (server) environment that depends on a stable Python version.
Likelihood of being hit on the head by a falling rocket on New Year's Eve
This New Year's Eve a friend of mine was hit on the head by the debris of a rocket and had to spend the evening in the hospital. Being hit on the head seemed very unlikely to me, but how unlikely exactly? As a fan of the TheyDidTheMath subreddit I calculated how unlikely her New Year's Eve was. It was also a great excuse to read up on probability theory, which I haven’t used since I graduated from school 10 years ago.
Parallel Pandas
Performance in pandas is usually not an issue when you use the well-optimized internal functions. However, sometimes you have to perform a lot of column-wise calculations on a large dataframe. I recently ran into this issue while calculating time series features. I increased the speed of the calculation 5x by chunking the dataframe and using parallel processing with Python's multiprocessing library.
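As a rough illustration of the chunk-and-parallelize idea, here is a minimal sketch. It assumes the dataframe is split by columns and that every column can be processed independently. The names `rolling_features` and `parallel_apply_columns` are made up for this example, and the rolling mean is only a stand-in for whatever time series features you actually compute.

```python
import multiprocessing as mp

import numpy as np
import pandas as pd


def rolling_features(chunk: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical feature function: a 3-row rolling mean per column."""
    return chunk.rolling(window=3, min_periods=1).mean().add_suffix("_rollmean")


def parallel_apply_columns(df, func, n_workers=4, n_chunks=None):
    """Split df into column chunks, apply func to each, and rejoin the results.

    Assumes func treats every column independently, so splitting by
    columns does not change the result.
    """
    if n_chunks is None:
        n_chunks = n_workers
    # Never create more chunks than there are columns.
    n_chunks = max(1, min(n_chunks, df.shape[1]))
    # Split the column positions into roughly equal groups.
    positions = np.array_split(np.arange(df.shape[1]), n_chunks)
    chunks = [df.iloc[:, idx] for idx in positions]

    if n_workers <= 1:
        # Serial fallback, handy for debugging.
        results = [func(chunk) for chunk in chunks]
    else:
        # func must be defined at module level so it can be pickled.
        with mp.Pool(processes=n_workers) as pool:
            results = pool.map(func, chunks)

    # pool.map keeps the input order, so the column order is preserved.
    return pd.concat(results, axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = pd.DataFrame(
        rng.normal(size=(10_000, 8)),
        columns=[f"sensor_{i}" for i in range(8)],
    )
    features = parallel_apply_columns(data, rolling_features, n_workers=4)
    print(features.shape)
```

The `if __name__ == "__main__":` guard matters on Windows, where worker processes are started by re-importing the script. Whether you see a speedup depends on how expensive the per-chunk work is compared to the cost of sending the chunks to the workers, so it is worth timing both the serial and the parallel version on your own data.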