Luis Miguel Montoya
Understand to Align
AI Safety & Interpretability Research
Aspiring AI Interpretability Researcher

Exploring the journey from physics to AI interpretability research.

Journey Oct 2025

September–October 2025 Recap

Highlights, milestones, and lessons learned from the September–October leg of my AI safety journey.

Project Sep 2025

Towards Disentangling Latent Content and Behavioral Inhibition in Taboo Language Models

Project log for my MATS 9.0 application work probing how latent content and behavioral inhibition interact inside Taboo-fine-tuned Gemma models.

Literature Sep 2025

A Mathematical Framework for Transformer Circuits

Anthropic’s foundational write-up that formalizes how circuits emerge inside transformers and how to reason about them.
