The Principled Agent

The journey to a better policy.

About

Another person blogging about ML… Ho hum.

Yes, you’re right. And no. Hopefully not.

My name is W. Max Lees. As a Senior Software Engineer, my job is to build robust, predictable systems. But when it came to my passion, Reinforcement Learning (RL), my process was anything but: all guesswork and late-night hyperparameter fiddling. This blog is my public commitment to change that, applying a principled, engineering-first approach to the messy art of RL.

My goal with this blog is to go from an amateur RL enthusiast to a top-tier RL researcher and to document my process along the way. So why will this be different? I’m not going to present easy, step-by-step guides to simple benchmarks. You won’t see me explain how to create a CNN for MNIST or ImageNet classification here. And I’m not going to pretend that I know all the answers.

I’m going to show as honest a look into my process of RL practice and research as I possibly can. The triumphs. The failures. The learnings. When I do something incredibly stupid, you’ll see it. When I figure out something interesting or powerful, you’ll see all the work I did to get there.

Hopefully that also means we’ll all learn together how to be better. We’ll become principled agents of RL research.

I look forward to you joining me in this process!

My Principles

To keep this journey honest and productive, I’m holding myself to a few key principles:

Community First: I will encourage feedback and discussion wherever possible. We are all smarter together.
Data Over Intuition: I will not change my models or agents until I have data to back up why I want to make that change.
Radical Transparency: I will document and share everything. The good, the bad, and the ugly.
AI as a Tool, Not a Crutch: I’ll use AI as a collaborator for brainstorming and polishing, but never for generating the core code or primary insights. The goal is to document my learning process.
