About Me
I am currently a Kempner Research Fellow at the Kempner Institute at Harvard University. In Fall 2026, I will start as an Assistant Professor at MIT with a shared appointment between Mathematics and EECS (AI+D).
I received my Ph.D. in Applied and Computational Mathematics at Princeton University under the supervision of Jason D. Lee, and my B.S. in Mathematics at Duke University, where I was fortunate to work with Cynthia Rudin and Hau-Tieng Wu.
Research Interests
My research is focused on the mathematical foundations of deep learning. Some fun directions I’ve worked on are:
Deep Learning Optimization Dynamics
My goal is to develop a predictive, and ultimately prescriptive, theory of deep learning optimization. This requires grappling with settings not captured by classical optimization theory. For example, large-batch training typically occurs in a chaotic regime called the Edge of Stability (pictured). I’ve studied how different optimizers navigate the Edge of Stability regime to give simple explanations of their dynamics. This line of work is summarized in this blogpost and in the papers Self-Stabilization and Central Flows.
You can also click this link for a fun visualization of limit cycles and chaos in Adam.
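For context, the threshold behind the Edge of Stability can be seen in the simplest quadratic model (a standard back-of-the-envelope calculation, not the analysis from the papers above). For gradient descent with step size \(\eta\) on \(f(\theta) = \tfrac{S}{2}\theta^2\),
\[
\theta_{t+1} = \theta_t - \eta \nabla f(\theta_t) = (1 - \eta S)\,\theta_t,
\]
which converges if and only if the sharpness satisfies \(S < 2/\eta\). In deep networks, gradient descent tends to drive the sharpness (the top Hessian eigenvalue) up to roughly \(2/\eta\) and then hover there; this hovering is the Edge of Stability regime.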
Representation Learning in Simple Models
The miracle of deep learning is that neural networks automatically extract meaningful representations from raw data during the optimization process. To gain insight into this process, I’ve studied the optimization dynamics of simple models trained on synthetic data to ask: What representations are learned? How many samples does the network need to learn them? What signals in the gradient help guide optimization towards them? I’ve worked on these questions in both feed-forward neural networks (MLPs) [1][2][3] and Transformers [4][5].
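To make this kind of setup concrete, here is a minimal, purely illustrative PyTorch sketch: a two-layer MLP trained on synthetic data from a single-index teacher, followed by a crude probe of whether the first-layer weights align with the hidden direction. The dimensions, link function, and optimizer are arbitrary choices for illustration, not the settings from the papers above.

```python
import torch

# Toy setup (illustrative only): a 2-layer MLP trained on synthetic data
# generated by a single-index teacher y = g(<w*, x>) with Gaussian inputs.
torch.manual_seed(0)
d, n, width = 32, 4096, 256
w_star = torch.randn(d)
w_star /= w_star.norm()                # hidden direction w*
X = torch.randn(n, d)                  # x ~ N(0, I_d)
y = torch.relu(X @ w_star)             # teacher link g = ReLU (arbitrary choice)

model = torch.nn.Sequential(
    torch.nn.Linear(d, width),
    torch.nn.ReLU(),
    torch.nn.Linear(width, 1),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(1000):
    opt.zero_grad()
    loss = ((model(X).squeeze(-1) - y) ** 2).mean()
    loss.backward()
    opt.step()

# One way to probe "what representation was learned": how strongly do the
# first-layer weight vectors align with the teacher direction w*?
W1 = model[0].weight.detach()                     # shape (width, d)
alignment = (W1 @ w_star).abs() / W1.norm(dim=1)  # |cos(w_i, w*)| per neuron
print(f"final loss {loss.item():.4f}, mean alignment {alignment.mean().item():.3f}")
```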
Computational-to-Statistical Gaps
Many high-dimensional learning problems exhibit a conjectured gap between the number of samples needed information-theoretically to solve the problem and the number of samples needed by polynomial-time algorithms. This implies a fundamental tradeoff between runtime and sample complexity. I’ve studied this tradeoff in Gaussian single-index [3][6] and multi-index [7] models to identify structures that can make learning problems hard or easy.
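As a concrete instance (notation is mine and may differ from the papers), the Gaussian single-index model is
\[
x \sim \mathcal{N}(0, I_d), \qquad y = g(\langle w^*, x \rangle), \qquad w^* \in \mathbb{S}^{d-1},
\]
where the goal is to recover the hidden direction \(w^*\) from samples \((x, y)\). For typical link functions \(g\), on the order of \(d\) samples suffice information-theoretically, while the sample complexity believed to be achievable by polynomial-time algorithms scales as a larger power of \(d\) that depends on structural properties of \(g\), such as the index of its first nonvanishing Hermite coefficient.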
Recruiting
I am actively looking for students starting in Fall 2026. If you are interested in working with me, please apply to either the Mathematics or the EECS department at MIT and list my name in your application.
Selected Publications
The Generative Leap: Sharp Sample Complexity for Efficiently Learning Gaussian Multi-Index Models
Alex Damian, Jason D. Lee, Joan Bruna
Learning Compositional Functions with Transformers from Easy-to-Hard Data
Zixuan Wang*, Eshaan Nichani*, Alberto Bietti, Alex Damian, Daniel Hsu, Jason D. Lee, Denny Wu
Understanding Optimization in Deep Learning with Central Flows
Jeremy M. Cohen*, Alex Damian*, Ameet Talwalkar, J. Zico Kolter, Jason D. Lee
Computational-Statistical Gaps in Gaussian Single-Index Models
Alex Damian, Loucas Pillaud-Vivien, Jason D. Lee, Joan Bruna
How Transformers Learn Causal Structure with Gradient Descent
Eshaan Nichani, Alex Damian, Jason D. Lee
Smoothing the Landscape Boosts the Signal for SGD: Optimal Sample Complexity for Learning Single Index Models
Alex Damian, Eshaan Nichani, Rong Ge, Jason D. Lee
Provable Guarantees for Nonlinear Feature Learning in Three-Layer Neural Networks
Eshaan Nichani, Alex Damian, Jason D. Lee
Self-Stabilization: The Implicit Bias of Gradient Descent at the Edge of Stability
Alex Damian*, Eshaan Nichani*, Jason D. Lee
Neural Networks can Learn Representations with Gradient Descent
Alex Damian, Jason D. Lee, Mahdi Soltanolkotabi
Label Noise SGD Provably Prefers Flat Global Minimizers
Alex Damian, Tengyu Ma, Jason D. Lee
Awards
- Jane Street Graduate Research Fellowship (2024–2025)
- NSF Graduate Research Fellowship (2021–2024)
- Julia Dale Award, Duke University (2020)
- Putnam Honorable Mention (2019)
- Angier B. Duke Scholarship, Duke University (2016)