Atli Kosson

Machine Learning Researcher · PhD, EPFL

I am a machine learning researcher working on the optimization and training dynamics of large neural networks. I recently completed my PhD in Computer Science at EPFL, advised by Prof. Martin Jaggi, with a thesis on the training dynamics of large language model pre-training — examining how weight decay, learning-rate warmup and schedules, and hyperparameter transfer shape optimization at scale.

Before my doctorate, I worked on large-scale neural network training as an Autopilot Software Engineer at Tesla and as a Machine Learning Research Engineer at Cerebras Systems. Most recently, I was an Applied Scientist intern at Amazon. I hold an MS in Electrical Engineering from Stanford University and a BS from the University of Iceland, where I grew up.

Research Interests

I aim to improve our understanding of how and why modern deep learning works, with a focus on optimization. I lean toward a top-down, empirical style: large-scale experimentation and applied analysis rather than purely bottom-up theory. Recurring themes in my work include:

Publications

My work has appeared at ICLR, NeurIPS, ICML, AAAI, MLSys, and TMLR. A full, up-to-date list is available on my Google Scholar profile.