BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//CMSA - ECPv6.15.20//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-ORIGINAL-URL:https://cmsa.fas.harvard.edu
X-WR-CALDESC:Events for CMSA
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20250309T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20251102T060000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20260308T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20261101T060000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20270314T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20271107T060000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20260520T140000
DTEND;TZID=America/New_York:20260520T150000
DTSTAMP:20260506T155055Z
CREATED:20260429T133019Z
LAST-MODIFIED:20260429T143145Z
UID:10003942-1779285600-1779289200@cmsa.fas.harvard.edu
SUMMARY:Separation of timescales controls feature learning and overfitting in large neural networks
DESCRIPTION:New Technologies in Mathematics Seminar\nSpeaker: Pierfrancesco Urbani\, Universite Paris-Saclay\, CNRS\, CEA\, Institut de physique theorique\nTitle: Separation of timescales controls feature learning and overfitting in large neural networks\nAbstract: To understand the inductive bias and generalization capabilities of large\, overparameterized machine learning models\, it is essential to analyze the dynamics of their training algorithms. Using dynamical mean field theory we investigate the learning dynamics of large two-layer neural networks. Our findings reveal that\, for networks with a large width\, the training process exhibits a separation of timescales phenomenon. This leads to several key observations:\n1. The emergence of a slow timescale linked to the growth in Gaussian/Rademacher complexity of the network;\n2. An inductive bias favoring low complexity when the initial model complexity is sufficiently small;\n3. A dynamical decoupling between feature learning and overfitting phases;\n4. A non-monotonic trend in test error\, characterized by a “feature unlearning” regime at later stages of training.\nJoint work with Andrea Montanari.
URL:https://cmsa.fas.harvard.edu/event/newtech_52026/
LOCATION:Virtual
CATEGORIES:New Technologies in Mathematics Seminar
ATTACH;FMTTYPE=image/png:https://cmsa.fas.harvard.edu/media/CMSA-NTM-Seminar-5.20.2026.docx.png
END:VEVENT
END:VCALENDAR