Aspiring AI Interpretability Researcher

Luis Miguel Montoya

Understand to Align

Exploring the journey from physics to AI interpretability research.

Journey Oct 2025

Highlights, milestones, and lessons learned from the September–October leg of my AI safety journey.

Project Sep 2025

Project log for my MATS 9.0 application work, which probes how latent content and behavioral inhibition interact inside Taboo-fine-tuned Gemma models.

Literature Sep 2025

Anthropic’s foundational write-up that formalizes how circuits emerge inside transformers and how to reason about them.

Paper