I am a machine learning researcher working on the optimization and training dynamics of large neural networks. I recently completed my PhD in Computer Science at EPFL, advised by Prof. Martin Jaggi, with a thesis on the training dynamics of large language model pre-training. It examines how weight decay, learning-rate warmup and schedules, and hyperparameter transfer shape optimization at scale.
Before my doctorate, I worked on large-scale neural network training as an Autopilot Software Engineer at Tesla and as a Machine Learning Research Engineer at Cerebras Systems. Most recently, I was an Applied Scientist intern at Amazon. I hold an MS in Electrical Engineering from Stanford University and a BS from the University of Iceland, where I grew up.