Isabelle Lee

lee.isabelle.g [at] gmail [dot] com

I’m a third-year ML/AI PhD student at USC, working with Dani Yogatama and Yan Liu. I’m currently visiting Harvard, where I work with David Alvarez-Melis and Naomi Saphra. My work is supported by the Viterbi School of Engineering Graduate Fellowship and Coefficient Giving’s Technical AI Safety Research Grant.

I’m broadly interested in training, reasoning, and interpretability: how we make sense of models, and how that understanding might uncover the underlying science of large-scale models. In particular, I aim to (1) predict training, and (2) predict failures.

Currently, I’m exploring dynamical-systems and physics-inspired approaches to analyzing small-scale toy learning problems, as well as ways to predict and control larger-scale training from a developmental perspective.


Updates

Jun '26 Headed to ICML 2026 and FAR.AI’s Alignment Workshop.

May '26 Rigorous Interpretation Is a Form of Evaluation was accepted to EvalEval at ACL 2026 as an oral presentation.

Jan '26 FOL-Traces was accepted to the Findings of EACL 2026.

Jan '26 New preprint: Evaluating Large Language Models for Fair and Reliable Organ Allocation.

Dec '25 Awarded Coefficient Giving’s Technical AI Safety Research Grant.

Featured publications

Rigorous Interpretation Is a Form of Evaluation

If held to scientific standards of falsifiability, reproducibility, and predictability, interpretation can become evaluation.

FOL-Traces: Verified First-Order Logic Reasoning Traces at Scale

We introduce a large-scale, complexity-annotated, verified CoT-like reasoning trace dataset showing that current LLMs still struggle with structured inference.