Scaling Stochastic Momentum from Theory to LLMs

CMSA Room G10, 20 Garden Street, Cambridge

New Technologies in Mathematics Seminar

Speaker: Courtney Paquette, McGill University

Title: Scaling Stochastic Momentum from Theory to LLMs

Abstract: Given the massive scale of modern ML models, we now often get only […]