🎉 Our paper SWAN is accepted at ICML 2025!

Apr 1, 2025 · Wenbo Gong · 1 min read

🥳🥂 Our paper, SWAN: SGD with Normalization and Whitening Enables Stateless LLM Training, has been accepted at ICML 2025. This optimizer enables stateless LLM training, maximizing memory efficiency while achieving performance on par with or better than the standard AdamW optimizer. 🥳🥂

Last updated on Apr 1, 2025

Academic

Authors

Wenbo Gong

Senior Researcher

Senior Researcher at Microsoft Research Cambridge working on learning dynamics and optimization for foundation models, with prior work on causality and approximate inference.
