Another person blogging about ML… Ho hum.
Yes, you’re right. And no. Hopefully not.
My name is W. Max Lees. As a Senior Software Engineer, my job is to build robust, predictable systems. But when it came to my passion, Reinforcement Learning (RL), my process was anything but: all guesswork and late-night hyperparameter fiddling. This blog is my public commitment to change that, applying a principled, engineering-first approach to the messy art of RL.
My goal with this blog is to go from an amateur RL enthusiast to a top-tier RL researcher and to document my process along the way. So why will this be different? I’m not going to present easy, step-by-step guides to simple benchmarks. You won’t see me explain how to create a CNN for MNIST or ImageNet classification here. And I’m not going to pretend that I know all the answers.
I’m going to give as honest a look at my RL practice and research process as I possibly can. The triumphs. The failures. The learnings. When I do something incredibly stupid, you’ll see it. When I figure out something interesting or powerful, you’ll see all the work I did to get there.
Hopefully that also means we’ll all learn together how to be better. We’ll become principled agents of RL research.
I look forward to you joining me in this process!
My Principles
To keep this journey honest and productive, I’m holding myself to a few key principles:
- Community First: I will encourage feedback and discussion wherever possible. We are all smarter together.
- Data Over Intuition: I will not change my models or agents until I have data that justifies the change.
- Radical Transparency: I will document and share everything. The good, the bad, and the ugly.
- AI as a Tool, Not a Crutch: I’ll use AI as a collaborator for brainstorming and polishing, but never for generating the core code or primary insights. The goal is to document my learning process.