Tools

I think it’s increasingly true that being a good ML researcher involves being a good ML engineer. It has always been true that being a good (ML) engineer means balancing exploration of new tools with exploitation of existing ones.

On this page I’ve put together some of the tools, tips, and tricks that have helped me throughout my PhD. You might find more on my GitHub.

ML

Here is a practical I put together for the AIMS CDT covering:
- Creating high-quality Python research repositories with minimal effort using cookiecutter.
- Simplifying the configuration of ML experiments using Hydra.
- Running deep learning at scale using Ray on Google Cloud TPUs.

Hydra is the best thing since sliced bread:
- Slides on why you should use it.
- Examples of how you can use it.

I use MLflow to track experiments and trained models:
- You can self-host it locally, on-premises, or in the cloud.
- I find it has better abstractions and a cleaner UI than, e.g., WandB.
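
To make the Hydra point above a bit more concrete, here is a rough plain-Python sketch of the core idea: a typed, nested experiment config whose fields can be overridden with dotted `key=value` strings. This deliberately does not use Hydra's own API. The names `OptimizerConfig`, `ExperimentConfig`, `apply_override`, and `load_config` are hypothetical and exist only for illustration; Hydra itself adds much more on top (config files, composition, multirun).

```python
from dataclasses import dataclass, field, fields, is_dataclass, replace


@dataclass(frozen=True)
class OptimizerConfig:
    # Hypothetical optimizer settings, for illustration only.
    name: str = "adam"
    lr: float = 1e-3


@dataclass(frozen=True)
class ExperimentConfig:
    # Hypothetical top-level experiment settings.
    seed: int = 0
    epochs: int = 10
    optimizer: OptimizerConfig = field(default_factory=OptimizerConfig)


def apply_override(cfg, override: str):
    """Return a copy of cfg with one 'dotted.key=value' override applied."""
    dotted_key, raw_value = override.split("=", 1)
    head, _, rest = dotted_key.partition(".")
    if head not in {f.name for f in fields(cfg)}:
        raise KeyError(f"unknown config key: {head!r}")
    current = getattr(cfg, head)
    if rest:
        # Descend into a nested config group, e.g. 'optimizer.lr'.
        if not is_dataclass(current):
            raise KeyError(f"{head!r} has no nested keys")
        new_value = apply_override(current, f"{rest}={raw_value}")
    else:
        if is_dataclass(current):
            raise KeyError(f"{head!r} is a group; override one of its fields")
        # Cast the string to the type of the default value.
        # This is enough for int/float/str; bools would need special handling.
        new_value = type(current)(raw_value)
    return replace(cfg, **{head: new_value})


def load_config(overrides):
    """Start from the defaults and apply each override in order."""
    cfg = ExperimentConfig()
    for override in overrides:
        cfg = apply_override(cfg, override)
    return cfg
```

Calling `load_config(["optimizer.lr=0.01", "epochs=20"])` returns a config with those two fields changed and every other default intact, while a misspelled key raises a `KeyError` instead of being silently ignored.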

Misc.

My personal dotfiles: joncarter1/dotfiles
A cookiecutter template for ML research: joncarter1/cookiecutter_research
The Mojo programming language is worth keeping an eye on.
I’m super excited by Ray, a framework for distributed computing, and enjoy making the occasional contribution when I get time.¹

Security

Use a YubiKey:
- To store secret keys (e.g. SSH keys) on the device, so that private keys never have to live on any machine.
- For more secure two-factor authentication (2FA).
See this excellent guide on how to get started.

Use a password manager (e.g. 1Password, LastPass).

¹ For example, improving security (!33542) and running Ray in Docker (!33541, !40311).