BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//CMSA - ECPv6.15.20//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-WR-CALNAME:CMSA
X-ORIGINAL-URL:https://cmsa.fas.harvard.edu
X-WR-CALDESC:Events for CMSA
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20240310T020000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20241103T020000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20250309T020000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20251102T020000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20260308T020000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20261101T020000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20251008T140000
DTEND;TZID=America/New_York:20251008T150000
DTSTAMP:20260508T113814Z
CREATED:20250930T181425Z
LAST-MODIFIED:20251009T195959Z
UID:10003801-1759932000-1759935600@cmsa.fas.harvard.edu
SUMMARY:Understanding Optimization in Deep Learning with Central Flows
DESCRIPTION:New Technologies in Mathematics Seminar \nSpeaker: Alex Damian\, Harvard \nTitle: Understanding Optimization in Deep Learning with Central Flows \nAbstract: Traditional theories of optimization cannot describe the dynamics of optimization in deep learning\, even in the simple setting of deterministic training. The challenge is that optimizers typically operate in a complex\, oscillatory regime called the “edge of stability.” In this paper\, we develop theory that can describe the dynamics of optimization in this regime. Our key insight is that while the *exact* trajectory of an oscillatory optimizer may be challenging to analyze\, the *time-averaged* (i.e. smoothed) trajectory is often much more tractable. To analyze an optimizer\, we derive a differential equation called a “central flow” that characterizes this time-averaged trajectory. We empirically show that these central flows can predict long-term optimization trajectories for generic neural networks with a high degree of numerical accuracy. By interpreting these central flows\, we are able to understand how gradient descent makes progress even as the loss sometimes goes up\; how adaptive optimizers “adapt” to the local loss landscape\; and how adaptive optimizers implicitly navigate towards regions where they can take larger steps. Our results suggest that central flows can be a valuable theoretical tool for reasoning about optimization in deep learning.
URL:https://cmsa.fas.harvard.edu/event/newtech_10825/
LOCATION:Hybrid – G10
CATEGORIES:New Technologies in Mathematics Seminar
ATTACH;FMTTYPE=image/png:https://cmsa.fas.harvard.edu/media/CMSA-NTM-Seminar-10.8.2025-scaled.png
END:VEVENT
END:VCALENDAR