Tools

I think:

  • ML research quality is often bounded by engineering skill;
  • this is still true in a post-LLM (vibe-coding) world.[1]

Here I’ve put together some of the tools, tips, and tricks that have helped me throughout my PhD. You might find more on my GitHub.

ML

  • Here is a practical I put together for the AIMS CDT covering:
    • Creating high-quality Python research repositories with minimal effort using cookiecutter.
    • Simplifying the configuration of ML experiments using Hydra.
    • Deep learning at scale, using Ray on Google Cloud TPUs.
  • Hydra is the best thing since sliced bread.
  • I use MLflow to track experiments and trained models:
    • You can self-host it locally, on-premises, or in the cloud.
    • It has better abstractions and a cleaner UI than e.g. WandB.

Misc.

Security

  • Buy a pair of YubiKeys:
    • To hold secret keys (e.g. SSH) on the device itself, so that private keys are never stored on any machine.
    • For more secure 2FA.

    See this excellent guide on how to get started.

  • Use a password manager (e.g. 1Password, LastPass).

  1. Frontier research is largely an engineering feat: async RL, efficient parallelism, etc.

  2. e.g. improving security (!33542), and running Ray in Docker (!33541, !40311).