Towards Efficient Optimizer Design for LLM via Structured Fisher Approximation with a Low-Rank Extension

Jan 1, 2025

Wenbo Gong

M Scetbon

C Ma

E Meeds

Type

Preprint

Publication

arXiv:2502.07752

Last updated on Jan 1, 2025

Optimization, Learning Dynamics, LLM

Authors

Wenbo Gong

Senior Researcher

Senior Researcher at Microsoft Research Cambridge working on learning dynamics and optimization for foundation models, with prior work on causality and approximate inference.
