Mark Towers

A collection of blog posts that may appear without link on topics that I think are interesting.

The Nuances of Autoreset Modes and Vectorised Rollouts

March 12, 2026 · 5 min read

Next-Step and Same-Step autoreset handle episode boundaries differently. That difference quietly changes what ends up in your rollout buffer and how GAE must...

How does Retrace fix Off-Policyness?

February 12, 2026 · 9 min read

When your RL agent trains on data from an old policy, the value estimates go wrong. Retrace was designed to address this problem.

Generalized Advantage Estimation (GAE) Explained

January 31, 2026 · 5 min read

Generalised Advantage Estimation (GAE) is a critical component of PPO, but how does it work, and what does it achieve?

The Nuances of Autoreset Modes and Vectorised Rollouts