Exploration has always been a hot topic in RL research, and for good reason. It can help agents discover primitive skills that can later be reused for more complex tasks, and it can help avoid local ...
Alignment research, especially research on practical alignment methods for current AI systems, contributes significantly to AI progress, and will do so even more in the future. For instance,...
I want to share some thoughts on verifiability, inspired by recent work such as AlphaEvolve and RLVR. I first clarify that there are several different scenarios in which verifiability manifests, and ...
Noether’s theorem is a fundamental result in physics that relates symmetries to conservation laws. The usual statement of Noether’s theorem requires the action to be invariant under a continuous tr...
When I started this blog, I thought I would be writing a bit more than I am right now. What happened? One reason is that it’s not as rewarding as I initially thought. For instance, I used to t...
One of the most famous examples used to introduce Lagrangian mechanics is deriving the dynamics of a pendulum. I was never taught d’Alembert’s principle, which is a crucial part of the derivation. T...
Note (Dec 5, 2025): I do have some original thoughts in the “Some additional thoughts” section, but as for the rest, you should just watch Marcus Hutter’s one-hour lecture if you want to understand Il...