Gradient Multi-Normalization for Stateless and Scalable LLM Training

Jan 1, 2025 · M Scetbon, C Ma, Wenbo Gong, E Meeds
Type
Publication
NeurIPS 2025
Authors
Wenbo Gong
Senior Researcher
Senior Researcher at Microsoft Research Cambridge working on learning dynamics and optimization for foundation models, with prior work on causality and approximate inference.