BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//CMSA - ECPv6.15.18//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-WR-CALNAME:CMSA
X-ORIGINAL-URL:https://cmsa.fas.harvard.edu
X-WR-CALDESC:Events for CMSA
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20250309T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20251102T060000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20260308T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20261101T060000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20270314T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20271107T060000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20260204T140000
DTEND;TZID=America/New_York:20260204T150000
DTSTAMP:20260503T110045Z
CREATED:20250128T214750Z
LAST-MODIFIED:20260126T163315Z
UID:10003708-1770213600-1770217200@cmsa.fas.harvard.edu
SUMMARY:Automated Theory Formation and Interestingness in Mathematics
DESCRIPTION:New Technologies in Mathematics Seminar \nSpeaker: George Tsoukalas\, UT Austin Dept. of Computer Science and Google DeepMind. \nTitle: Automated Theory Formation and Interestingness in Mathematics \nAbstract: Advances in modern learning systems are beginning to demonstrate utility for select problems in research mathematics. A broader challenge is that of developing new theories automatically. This area has a rich history and is tied to some of the earliest work in AI. In particular\, a central question in this line of work was measuring the “interestingness” of mathematical concepts. \nIn this talk\, I will review this historical context and present our recent work on using large language models to synthesize interestingness measures that guide theory exploration in elementary number theory from scratch. I will conclude by outlining potential future research directions in this domain. \nJoint work done at UT Austin with Rahul Saha\, Amitayush Thakur\, Sabrina Reguyal\, and Swarat Chaudhuri.
URL:https://cmsa.fas.harvard.edu/event/newtech_2426/
LOCATION:CMSA Room G10\, CMSA\, 20 Garden Street\, Cambridge\, MA\, 02138\, United States
CATEGORIES:New Technologies in Mathematics Seminar
ATTACH;FMTTYPE=image/png:https://cmsa.fas.harvard.edu/media/CMSA-NTM-Seminar-2.4.2026.docx-1-scaled.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20260211T140000
DTEND;TZID=America/New_York:20260211T150000
DTSTAMP:20260503T110045Z
CREATED:20260126T152202Z
LAST-MODIFIED:20260126T212834Z
UID:10003878-1770818400-1770822000@cmsa.fas.harvard.edu
SUMMARY:ReLU and Softplus neural nets as zero-sum\, turn-based\, stopping games
DESCRIPTION:New Technologies in Mathematics Seminar \nSpeaker: Yiannis Vlassopoulos\, Athena Research Center \nTitle: ReLU and Softplus neural nets as zero-sum\, turn-based\, stopping games \nAbstract: Neural networks are for the most part treated as black boxes. In an effort to begin elucidating the mathematical structure they encode\, we will explain how ReLU neural nets can be interpreted as zero-sum\, turn-based\, stopping games. The game runs in the opposite direction to the net. The input to the net is the terminal reward of the game\, and the output of the net is the value of the game at its initial states. The bias at each neuron is used to define the reward\, and the weights are used to define state-transition probabilities. One player –Max– is trying to maximize the reward\, and the other –Min– to minimize it. Every neuron gives rise to two game states\, one where Max plays and one where Min plays. In fact\, running the ReLU net is equivalent to the Shapley-Bellman backward recursion for the value of the game. As a corollary of this construction\, we get a path integral expression for the output of the net\, given the input. Moreover\, using the fact that the Shapley operator is monotonic (with respect to the coordinate-wise order)\, we get bounds for the output of the net\, given bounds for the input. Adding an entropic regularization to the ReLU net game allows us to interpret Softplus neural nets as games in an analogous fashion.\nThis is joint work with Stéphane Gaubert.
URL:https://cmsa.fas.harvard.edu/event/newtech_21126/
LOCATION:Virtual
CATEGORIES:New Technologies in Mathematics Seminar
ATTACH;FMTTYPE=image/png:https://cmsa.fas.harvard.edu/media/CMSA-NTM-Seminar-2.11.2026.docx-1-scaled.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20260225T140000
DTEND;TZID=America/New_York:20260225T150000
DTSTAMP:20260503T110045Z
CREATED:20260210T192336Z
LAST-MODIFIED:20260210T194238Z
UID:10003894-1772028000-1772031600@cmsa.fas.harvard.edu
SUMMARY:Scaling Stochastic Momentum from Theory to LLMs
DESCRIPTION:New Technologies in Mathematics Seminar \nSpeaker: Courtney Paquette\, McGill University \nTitle: Scaling Stochastic Momentum from Theory to LLMs \nAbstract: Given the massive scale of modern ML models\, we now often get only a single shot to train them effectively. This limits our ability to sweep architectures and hyperparameters\, making it essential to understand how learning algorithms scale\, so that insights from small models transfer to large ones. \nIn this talk\, I present a framework for analyzing scaling laws of stochastic momentum methods using a power-law random features model\, leveraging tools from high-dimensional probability and random matrix theory. We show that standard SGD with momentum does not improve scaling exponents\, while dimension-adapted Nesterov acceleration (DANA)—which explicitly adapts momentum to model size and data/target complexity—achieves strictly better loss and compute scaling. DANA does this by rescaling its momentum parameters with dimension\, effectively matching the optimizer’s memory to the problem geometry. \nMotivated by these theoretical insights\, I introduce logarithmic-time scheduling for large language models and propose ADANA\, an AdamW-like optimizer with growing memory and explicit damping. Across transformer scales (45M to 2.6B parameters)\, ADANA yields up to 40% compute savings over tuned AdamW\, with gains that improve at scale. \nBased on joint work with Damien Ferbach\, Elliot Paquette\, Katie Everett\, and Gauthier Gidel.
URL:https://cmsa.fas.harvard.edu/event/newtech_22526/
LOCATION:CMSA Room G10\, CMSA\, 20 Garden Street\, Cambridge\, MA\, 02138\, United States
CATEGORIES:New Technologies in Mathematics Seminar
ATTACH;FMTTYPE=image/png:https://cmsa.fas.harvard.edu/media/CMSA-NTM-Seminar-2.25.2026.docx-scaled.png
END:VEVENT
END:VCALENDAR