BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//CMSA - ECPv6.15.20//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-ORIGINAL-URL:https://cmsa.fas.harvard.edu
X-WR-CALDESC:Events for CMSA
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20230312T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20231105T060000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20240310T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20241103T060000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20250309T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20251102T060000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20260308T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20261101T060000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20270314T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20271107T060000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20260506T140000
DTEND;TZID=America/New_York:20260506T150000
DTSTAMP:20260515T085426
CREATED:20260421T144955Z
LAST-MODIFIED:20260421T150144Z
UID:10003935-1778076000-1778079600@cmsa.fas.harvard.edu
SUMMARY:New directions in synthetic data
DESCRIPTION:New Technologies in Mathematics Seminar \nSpeaker: Tatsunori Hashimoto\, Stanford \nTitle: New directions in synthetic data \nAbstract: Synthetic data has been an effective\, if boring\, set of techniques: prompt some language model to restructure your corpus to match some downstream task\, occasionally with some distillation. In this talk\, we will take a more expansive view of synthetic data as a general algorithmic tool for generative modeling\, arguing that the design space and possibilities of synthetic data are much bigger than they might seem. Through a few recent works\, we will show that synthetic data has major benefits beyond transforming the data – improving in-domain perplexities\, and enabling unique algorithmic primitives\, such as neighborhood smoothing and concatenated ‘mega’ documents. With this broader view\, we will point towards a nascent but interesting possibility of treating data itself as an algorithmic object to be engineered and optimized end-to-end. \n 
URL:https://cmsa.fas.harvard.edu/event/newtech_5626/
LOCATION:Virtual
CATEGORIES:New Technologies in Mathematics Seminar
ATTACH;FMTTYPE=image/png:https://cmsa.fas.harvard.edu/media/CMSA-NTM-Seminar-5.6.2026.docx.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20260318T140000
DTEND;TZID=America/New_York:20260318T150000
DTSTAMP:20260515T085426
CREATED:20260309T145907Z
LAST-MODIFIED:20260311T161332Z
UID:10003916-1773842400-1773846000@cmsa.fas.harvard.edu
SUMMARY:Dynamic reasoning
DESCRIPTION:New Technologies in Mathematics Seminar \nSpeaker: Emmanuel Abbé\, EPFL\, Institute of Mathematics and School of Computer and Communication Sciences & Apple \nTitle: Dynamic reasoning \nAbstract: In the current AI landscape\, reasoning is frequently equated with the generation of intermediate “thinking traces”. However\, these traces are merely a mechanism\, not the ultimate objective.\nRelying solely on the presence of a trace can be deceptive\, as models often learn to mimic the format of reasoning while effectively overfitting to specific training distributions.\nTo build more robust and versatile reasoners\, we shift our focus to more specific structural properties of the thinking process\, in particular compositionality (inductive CoT\, AdaBack) and abstraction (AbstRaL).
URL:https://cmsa.fas.harvard.edu/event/newtech_31826-2/
LOCATION:Virtual
CATEGORIES:New Technologies in Mathematics Seminar
ATTACH;FMTTYPE=image/png:https://cmsa.fas.harvard.edu/media/CMSA-NTM-Seminar-3.18.2026.docx-1-scaled.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20260225T140000
DTEND;TZID=America/New_York:20260225T150000
DTSTAMP:20260515T085426
CREATED:20260210T192336Z
LAST-MODIFIED:20260210T194238Z
UID:10003894-1772028000-1772031600@cmsa.fas.harvard.edu
SUMMARY:Scaling Stochastic Momentum from Theory to LLMs
DESCRIPTION:New Technologies in Mathematics Seminar \nSpeaker: Courtney Paquette\, McGill University \nTitle: Scaling Stochastic Momentum from Theory to LLMs \nAbstract: Given the massive scale of modern ML models\, we now often get only a single shot to train them effectively. This limits our ability to sweep architectures and hyperparameters\, making it essential to understand how learning algorithms scale so insights from small models transfer to large ones. \nIn this talk\, I present a framework for analyzing scaling laws of stochastic momentum methods using a power-law random features model\, leveraging tools from high-dimensional probability and random matrix theory. We show that standard SGD with momentum does not improve scaling exponents\, while dimension-adapted Nesterov acceleration (DANA)—which explicitly adapts momentum to model size and data/target complexity—achieves strictly better loss and compute scaling. DANA does this by rescaling its momentum parameters with dimension\, effectively matching the optimizer’s memory to the problem geometry. \nMotivated by these theoretical insights\, I introduce logarithmic-time scheduling for large language models and propose ADANA\, an AdamW-like optimizer with growing memory and explicit damping. Across transformer scales (45M to 2.6B parameters)\, ADANA yields up to 40% compute savings over tuned AdamW\, with gains that improve at scale. \nBased on joint work with Damien Ferbach\, Elliot Paquette\, Katie Everett\, and Gauthier Gidel.
URL:https://cmsa.fas.harvard.edu/event/newtech_22526/
LOCATION:CMSA Room G10\, CMSA\, 20 Garden Street\, Cambridge\, MA\, 02138\, United States
CATEGORIES:New Technologies in Mathematics Seminar
ATTACH;FMTTYPE=image/png:https://cmsa.fas.harvard.edu/media/CMSA-NTM-Seminar-2.25.2026.docx-scaled.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20260211T140000
DTEND;TZID=America/New_York:20260211T150000
DTSTAMP:20260515T085426
CREATED:20260126T152202Z
LAST-MODIFIED:20260126T212834Z
UID:10003878-1770818400-1770822000@cmsa.fas.harvard.edu
SUMMARY:ReLU and Softplus neural nets as zero-sum\, turn-based\, stopping games
DESCRIPTION:New Technologies in Mathematics Seminar \nSpeaker: Yiannis Vlassopoulos\, Athena Research Center \nTitle: ReLU and Softplus neural nets as zero-sum\, turn-based\, stopping games \nAbstract: Neural networks are for the most part treated as black boxes. In an effort to begin elucidating the mathematical structure they encode\, we will explain how ReLU neural nets can be interpreted as zero-sum\, turn-based\, stopping games. The game runs in the opposite direction to the net. The input to the net is the terminal reward of the game\, and the output of the net is the value of the game at its initial states. The bias at each neuron is used to define the reward and the weights are used to define state-transition probabilities. One player\, Max\, is trying to maximize reward and the other\, Min\, to minimize it. Every neuron gives rise to two game states\, one where Max plays and one where Min plays. In fact\, running the ReLU net is equivalent to the Shapley-Bellman backward recursion for the value of the game. As a corollary of this construction\, we get a path integral expression for the output of the net\, given input. Moreover\, using the fact that the Shapley operator is monotonic (with respect to the coordinate-wise order)\, we get bounds for the output of the net\, given bounds for the input. Adding an entropic regularization to the ReLU net game allows us to interpret Softplus neural nets as games in an analogous fashion.\nThis is joint work with Stéphane Gaubert. \n 
URL:https://cmsa.fas.harvard.edu/event/newtech_21126/
LOCATION:Virtual
CATEGORIES:New Technologies in Mathematics Seminar
ATTACH;FMTTYPE=image/png:https://cmsa.fas.harvard.edu/media/CMSA-NTM-Seminar-2.11.2026.docx-1-scaled.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20260204T140000
DTEND;TZID=America/New_York:20260204T150000
DTSTAMP:20260515T085426
CREATED:20250128T214750Z
LAST-MODIFIED:20260126T163315Z
UID:10003708-1770213600-1770217200@cmsa.fas.harvard.edu
SUMMARY:Automated Theory Formation and Interestingness in Mathematics
DESCRIPTION:New Technologies in Mathematics Seminar \nSpeaker: George Tsoukalas\, UT Austin Dept. of Computer Science and Google DeepMind. \nTitle: Automated Theory Formation and Interestingness in Mathematics \nAbstract: Advances in modern learning systems are beginning to demonstrate utility for select problems in research mathematics. A broader challenge is that of developing new theories automatically. This area has a rich history\, and is tied to some of the earliest work in AI. In particular\, a central question in this study was measuring the “interestingness” of mathematical concepts. \nIn this talk\, I will review this historical context and present our recent work on using large language models to synthesize interestingness measures that guide theory exploration in elementary number theory from scratch. I will conclude by outlining potential future research directions in this domain. \nJoint work done at UT Austin with Rahul Saha\, Amitayush Thakur\, Sabrina Reguyal\, and Swarat Chaudhuri.
URL:https://cmsa.fas.harvard.edu/event/newtech_2426/
LOCATION:CMSA Room G10\, CMSA\, 20 Garden Street\, Cambridge\, MA\, 02138\, United States
CATEGORIES:New Technologies in Mathematics Seminar
ATTACH;FMTTYPE=image/png:https://cmsa.fas.harvard.edu/media/CMSA-NTM-Seminar-2.4.2026.docx-1-scaled.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20251203T140000
DTEND;TZID=America/New_York:20251203T150000
DTSTAMP:20260515T085426
CREATED:20251110T191407Z
LAST-MODIFIED:20251110T225824Z
UID:10003833-1764770400-1764774000@cmsa.fas.harvard.edu
SUMMARY:Machine learning tools for mathematical discovery
DESCRIPTION:New Technologies in Mathematics Seminar \nSpeaker: Adam Zsolt Wagner\, Google DeepMind \nTitle: Machine learning tools for mathematical discovery \nAbstract: I will discuss various ML tools we can use today to try to find interesting constructions to various mathematical problems. I will briefly mention simple reinforcement learning setups and PatternBoost\, but the talk will mainly focus on LLM-based tools such as FunSearch and AlphaEvolve. We will discuss the pros and cons of several of these methods\, and try to figure out which one is best for the problems we care about.\nJoint work with François Charton\, Jordan Ellenberg\, Bogdan Georgiev\, Javier Gómez-Serrano\, Terence Tao\, and Geordie Williamson.
URL:https://cmsa.fas.harvard.edu/event/newtech_12325/
LOCATION:Virtual
CATEGORIES:New Technologies in Mathematics Seminar
ATTACH;FMTTYPE=image/png:https://cmsa.fas.harvard.edu/media/CMSA-NTM-Seminar-12.3.2025-scaled.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20251105T140000
DTEND;TZID=America/New_York:20251105T150000
DTSTAMP:20260515T085426
CREATED:20251027T142022Z
LAST-MODIFIED:20251027T144043Z
UID:10003826-1762351200-1762354800@cmsa.fas.harvard.edu
SUMMARY:Discovery of unstable singularity with machine precision
DESCRIPTION:New Technologies in Mathematics Seminar \nSpeaker: Yongji Wang\, NYU Courant Institute of Mathematical Sciences \nTitle: Discovery of unstable singularity with machine precision \nAbstract: Whether singularities can form in fluids remains a foundational unanswered question in mathematics. This phenomenon occurs when solutions to governing equations\, such as the 3D Euler equations\, develop infinite gradients from smooth initial conditions. Historically\, numerical approaches have primarily identified stable singularities. However\, these are not expected to exist for key open problems\, such as the boundary-free Euler and Navier-Stokes cases\, namely the Millennium Prize problem. For these problems\, the true challenge lies in finding unstable singularities\, which are exceptionally elusive\, as any tiny perturbation can divert the system from its blow-up trajectory. \nIn this talk\, I will present a new computational framework which has led to the first systematic discovery of new families of unstable singularities in various fluid equations. Our approach merges curated machine learning architectures with a multi-stage training scheme and a high-precision Gauss-Newton optimizer\, creating a powerful tool for navigating the complex landscape of nonlinear PDEs. Beyond discovering these singularities\, the precision of this method is another key breakthrough\, achieving unprecedented accuracies on the order of $O(10^{-13})$—a level constrained only by the round-off errors of the GPU hardware. This level of precision meets the stringent requirements for rigorous mathematical validation of the discovered solution via computer-assisted proofs\, offering a new pathway to resolving long-standing challenges in mathematical physics. \n 
URL:https://cmsa.fas.harvard.edu/event/newtech_11525/
LOCATION:CMSA 20 Garden Street Cambridge\, Massachusetts 02138 United States
CATEGORIES:New Technologies in Mathematics Seminar
ATTACH;FMTTYPE=image/png:https://cmsa.fas.harvard.edu/media/CMSA-NTM-Seminar-11.5.2025-scaled.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20251022T140000
DTEND;TZID=America/New_York:20251022T150000
DTSTAMP:20260515T085426
CREATED:20251008T132005Z
LAST-MODIFIED:20251008T133142Z
UID:10003808-1761141600-1761145200@cmsa.fas.harvard.edu
SUMMARY:The Carleson project: A collaborative formalization
DESCRIPTION:New Technologies in Mathematics Seminar \nSpeaker: María Inés de Frutos Fernández\, Mathematical Institute\, University of Bonn \nTitle: The Carleson project: A collaborative formalization \nAbstract: A well-known result in Fourier analysis establishes that the partial Fourier sums of a smooth periodic function $f$ converge uniformly to $f$\, but the situation is a lot more subtle for e.g. continuous functions. However\, in 1966 Carleson proved that they do converge at almost all points for $L^2$ periodic functions on the real line. Carleson’s proof is famously hard to read\, and there are no known easy proofs of this theorem. As a large collaborative project\, we have formalized in Lean a generalization of Carleson’s theorem in the setting of doubling metric measure spaces (proven in 2023)\, and Carleson’s original result as a corollary. In this talk I will give an overview of the project\, with a focus on how the collaboration was organized. \n 
URL:https://cmsa.fas.harvard.edu/event/newtech_102225/
LOCATION:Virtual
CATEGORIES:New Technologies in Mathematics Seminar
ATTACH;FMTTYPE=image/png:https://cmsa.fas.harvard.edu/media/CMSA-NTM-Seminar-10.22.2025-scaled.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20251008T140000
DTEND;TZID=America/New_York:20251008T150000
DTSTAMP:20260515T085426
CREATED:20250930T181425Z
LAST-MODIFIED:20251009T195959Z
UID:10003801-1759932000-1759935600@cmsa.fas.harvard.edu
SUMMARY:Understanding Optimization in Deep Learning with Central Flows
DESCRIPTION:New Technologies in Mathematics Seminar \nSpeaker: Alex Damian\, Harvard \nTitle: Understanding Optimization in Deep Learning with Central Flows \nAbstract: Traditional theories of optimization cannot describe the dynamics of optimization in deep learning\, even in the simple setting of deterministic training. The challenge is that optimizers typically operate in a complex\, oscillatory regime called the “edge of stability.” In this paper\, we develop theory that can describe the dynamics of optimization in this regime. Our key insight is that while the *exact* trajectory of an oscillatory optimizer may be challenging to analyze\, the *time-averaged* (i.e. smoothed) trajectory is often much more tractable. To analyze an optimizer\, we derive a differential equation called a “central flow” that characterizes this time-averaged trajectory. We empirically show that these central flows can predict long-term optimization trajectories for generic neural networks with a high degree of numerical accuracy. By interpreting these central flows\, we are able to understand how gradient descent makes progress even as the loss sometimes goes up; how adaptive optimizers “adapt” to the local loss landscape; and how adaptive optimizers implicitly navigate towards regions where they can take larger steps. Our results suggest that central flows can be a valuable theoretical tool for reasoning about optimization in deep learning. \n 
URL:https://cmsa.fas.harvard.edu/event/newtech_10825/
LOCATION:Hybrid – G10
CATEGORIES:New Technologies in Mathematics Seminar
ATTACH;FMTTYPE=image/png:https://cmsa.fas.harvard.edu/media/CMSA-NTM-Seminar-10.8.2025-scaled.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20251001T140000
DTEND;TZID=America/New_York:20251001T150000
DTSTAMP:20260515T085426
CREATED:20250128T214901Z
LAST-MODIFIED:20251002T140605Z
UID:10003710-1759327200-1759330800@cmsa.fas.harvard.edu
SUMMARY:Tropicalized quantum field theory
DESCRIPTION:New Technologies in Mathematics Seminar \nSpeaker: Michael Borinsky\, Perimeter Institute  \nTitle: Tropicalized quantum field theory \nAbstract: Quantum field theory (QFT) is one of the most accurate methods for making phenomenological predictions in physics\, but it has a significant drawback: obtaining concrete predictions from it is computationally very demanding. The standard perturbative approach expands an interacting QFT around a free QFT\, using Feynman diagrams. However\, the number of these diagrams grows superexponentially\, making the approach quickly infeasible. \nI will talk about arXiv:2508.14263\, which introduces an intermediate layer between free and interacting field theories: a tropicalized QFT. Often\, this tropicalized QFT can be solved exactly. The exact solution manifests as a non-linear recursion equation fulfilled by the expansion coefficients of the quantum effective action. Geometrically\, this recursion computes volumes of moduli spaces of metric graphs and is thereby analogous to Mirzakhani’s volume recursions on the moduli space of curves. Building on this exact solution\, an algorithm can be constructed that samples points from the moduli space of graphs approximately proportional to their perturbative contribution. Via a standard Monte Carlo approach we can evaluate the original QFT using this algorithm. Remarkably\, this algorithm requires only polynomial time and memory\, suggesting that perturbative quantum field theory computations actually lie in the polynomial-time complexity class\, while all known algorithms for evaluating individual Feynman integrals are at least exponential in time and memory. The (potential) capabilities of this approach are remarkable: For instance\, we can compute perturbative expansions of massive scalar D=3 phi^3 and D=4 phi^4 quantum field theories up to loop orders between 20 and 50 using a basic proof-of-concept implementation. 
 These perturbative orders are completely inaccessible using a naive approach.
URL:https://cmsa.fas.harvard.edu/event/newtech_10125/
LOCATION:Virtual
CATEGORIES:New Technologies in Mathematics Seminar
ATTACH;FMTTYPE=image/png:https://cmsa.fas.harvard.edu/media/CMSA-NTM-Seminar-10.1.2025.docx-1-scaled.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20250915T090000
DTEND;TZID=America/New_York:20250918T170000
DTSTAMP:20260515T085426
CREATED:20250710T134311Z
LAST-MODIFIED:20250930T154307Z
UID:10003755-1757926800-1758214800@cmsa.fas.harvard.edu
SUMMARY:The Geometry of Machine Learning
DESCRIPTION:The Geometry of Machine Learning \nDates: September 15–18\, 2025 \nLocation: Harvard CMSA\, Room G10\, 20 Garden Street\, Cambridge MA 02138 \nDespite the extraordinary progress in large language models\, mathematicians suspect that other dimensions of intelligence must be defined and simulated to complete the picture. Geometric and symbolic reasoning are among these. In fact\, there seems to be much to learn about existing ML by considering it from a geometric perspective\, e.g. what is happening to the data manifold as it moves through a NN? How can geometric and symbolic tools be interfaced with LLMs? A more distant goal\, one that seems only approachable through AIs\, would be to gain some insight into the large-scale structure of mathematics as a whole: the geometry of math\, rather than geometry as a subject within math. This conference is intended to begin a discussion on these topics. \nSpeakers \n\nMaissam Barkeshli\, University of Maryland\nEve Bodnia\, Logical Intelligence\nAdam Brown\, Stanford\nBennett Chow\, UCSD & IAS\nMichael Freedman\, Harvard CMSA\nElliot Glazer\, Epoch AI\nJames Halverson\, Northeastern\nJesse Han\, Math Inc.\nJunehyuk Jung\, Brown University\nAlex Kontorovich\, Rutgers University\nYann LeCun\, New York University & META*\nJared Duker Lichtman\, Stanford & Math Inc.\nBrice Ménard\, Johns Hopkins\nMichael Mulligan\, UCR & Logical Intelligence\nPatrick Shafto\, DARPA & Rutgers University\n\nOrganizers: Michael R. Douglas (CMSA) and Mike Freedman (CMSA) \n  \nGeometry of Machine Learning YouTube Playlist \n  \nSchedule \nMonday\, Sep. 15\, 2025 \n\n\n\n8:30–9:00 am\nMorning refreshments\n\n\n9:00–10:00 am\nJames Halverson\, Northeastern \nTitle: Sparsity and Symbols with Kolmogorov-Arnold Networks \nAbstract: In this talk I’ll review Kolmogorov-Arnold nets\, as well as new theory and applications related to sparsity and symbolic regression\, respectively. 
 I’ll review essential results regarding KANs\, show how sparsity masks relate deep nets and KANs\, and how KANs can be utilized alongside multimodal language models for symbolic regression. Empirical results will necessitate a few slides\, but the bulk will be chalk.\n\n\n10:00–10:30 am\nBreak\n\n\n10:30–11:30 am\nMaissam Barkeshli\, University of Maryland \nTitle: Transformers and random walks: from language to random graphs \nAbstract: The stunning capabilities of large language models give rise to many questions about how they work and how much more capable they can possibly get. One way to gain additional insight is via synthetic models of data with tunable complexity\, which can capture the basic relevant structures of real data. In recent work we have focused on sequences obtained from random walks on graphs\, hypergraphs\, and hierarchical graphical structures. I will present some recent empirical results for work in progress regarding how transformers learn sequences arising from random walks on graphs. The focus will be on neural scaling laws\, unexpected temperature-dependent effects\, and sample complexity.\n\n\n11:30 am–12:00 pm\nBreak\n\n\n12:00–1:00 pm\nAdam Brown\, Stanford \nTitle: LLMs\, Reasoning\, and the Future of Mathematical Sciences \nAbstract: Over the last half decade\, the mathematical capabilities of large language models (LLMs) have leapt from preschooler to undergraduate and now beyond. This talk reviews recent progress\, and speculates as to what it will mean for the future of mathematical sciences if these trends continue.\n\n\n\n  \nTuesday\, Sep. 16\, 2025 \n\n\n\n8:30–9:00 am\nMorning refreshments\n\n\n9:00–10:00 am\nJunehyuk Jung\, Brown University \nTitle: AlphaGeometry: a step toward automated math reasoning \nAbstract: Last summer\, Google DeepMind’s AI systems made headlines by achieving Silver Medal level performance on the notoriously challenging International Mathematical Olympiad (IMO) problems. 
 For instance\, AlphaGeometry 2\, one of these remarkable systems\, solved the geometry problem in a mere 19 seconds! \nIn this talk\, we will delve into the inner workings of AlphaGeometry\, exploring the innovative techniques that enable it to tackle intricate geometric puzzles. We will uncover how this AI system combines the power of neural networks with symbolic reasoning to discover elegant solutions.\n\n\n10:00–10:30 am\nBreak\n\n\n10:30–11:30 am\nBennett Chow\, UCSD and IAS \nTitle: Ricci flow as a test for AI\n\n\n11:30 am–12:00 pm\nBreak\n\n\n12:00–1:00 pm\nJared Duker Lichtman\, Stanford & Math Inc. and Jesse Han\, Math Inc. \nTitle: Gauss – towards autoformalization for the working mathematician \nAbstract: In this talk we’ll highlight some recent formalization progress using a new agent – Gauss. We’ll outline a recent Lean proof of the Prime Number Theorem in strong form\, completing a challenge set in January 2024 by Alex Kontorovich and Terry Tao. We hope Gauss will help assist working mathematicians\, especially those who do not write formal code themselves.\n\n\n5:00–6:00 pm\nSpecial Lecture: Yann LeCun\, Science Center Hall C\n\n\n\n  \nWednesday\, Sep. 17\, 2025 \n\n\n\n8:30–9:00 am\nRefreshments\n\n\n9:00–10:00 am\nMichael Mulligan\, UCR and Logical Intelligence \nTitle: Spontaneous Kolmogorov-Arnold Geometry in Vanilla Fully-Connected Neural Networks \nAbstract: The Kolmogorov-Arnold (KA) representation theorem constructs universal\, but highly non-smooth inner functions (the first layer map) in a single (non-linear) hidden layer neural network. Such universal functions have a distinctive local geometry\, a “texture\,” which can be characterized by the inner function’s Jacobian\, $J(\mathbf{x})$\, as $\mathbf{x}$ varies over the data. It is natural to ask if this distinctive KA geometry emerges through conventional neural network optimization. 
 We find that indeed KA geometry often does emerge through the process of training vanilla single hidden layer fully-connected neural networks (MLPs). We quantify KA geometry through the statistical properties of the exterior powers of $J(\mathbf{x})$: number of zero rows and various observables for the minor statistics of $J(\mathbf{x})$\, which measure the scale and axis alignment of $J(\mathbf{x})$. This leads to a rough phase diagram in the space of function complexity and model hyperparameters where KA geometry occurs. The motivation is first to understand how neural networks organically learn to prepare input data for later downstream processing and\, second\, to learn enough about the emergence of KA geometry to accelerate learning through a timely intervention in network hyperparameters. This research is the “flip side” of KA-Networks (KANs). We do not engineer KA into the neural network\, but rather watch KA emerge in shallow MLPs.\n\n\n10:00–10:30 am\nBreak\n\n\n10:30–11:30 am\nEve Bodnia\, Logical Intelligence \nTitle: \nAbstract: We introduce a method of topological analysis on spiking correlation networks in neurological systems. This method explores the neural manifold as in the manifold hypothesis\, which posits that information is often represented by a lower-dimensional manifold embedded in a higher-dimensional space. After collecting neuron activity from human and mouse organoids using a micro-electrode array\, we extract connectivity using pairwise spike-timing time correlations\, which are optimized for time delays introduced by synaptic delays. We then look at network topology to identify emergent structures and compare the results to two randomized models – constrained randomization and bootstrapping across datasets. 
 In histograms of the persistence of topological features\, we see that the features from the original dataset consistently exceed the variability of the null distributions\, suggesting that the observed topological features reflect significant correlation patterns in the data rather than random fluctuations. In a study of network resiliency\, we found that random removal of 10% of nodes still yielded a network with a lesser but still significant number of topological features in the homology group H1 (counts 2-dimensional voids in the dataset) above the variability of our constrained randomization model; however\, targeted removal of nodes in H1 features resulted in rapid topological collapse\, indicating that the H1 cycles in these brain organoid networks are fragile and highly sensitive to perturbations. By applying topological analysis to neural data\, we offer a new complementary framework to standard methods for understanding information processing across a variety of complex neural systems.\n\n\n11:30 am–12:00 pm\nBreak\n\n\n12:00–1:00 pm\nAlex Kontorovich\, Rutgers University \nTitle: The Shape of Math to Come \nAbstract: We will discuss some ongoing experiments that may have meaningful impact on what working in research mathematics might look like in a decade (if not sooner).\n\n\n5:00–6:00 pm\nMike Freedman Millennium Lecture: The Poincaré Conjecture and Mathematical Discovery (Science Center Hall D)\n\n\n\n  \nThursday\, Sep. 18\, 2025 \n\n\n\n8:30–9:00 am\nMorning refreshments\n\n\n9:00–10:00 am\nElliot Glazer\, Epoch AI \nTitle: FrontierMath to Infinity \nAbstract: I will discuss FrontierMath\, a mathematical problem solving benchmark I developed over the past year\, including its design philosophy and what we’ve learned about AI’s trajectory from it. 
 I will then look much further out\, speculate about what a “perfectly efficient” mathematical intelligence should be capable of\, and discuss how high-ceiling math capability metrics can illuminate the path towards that ideal.\n\n\n10:00–10:30 am\nBreak\n\n\n10:30–11:30 am\nBrice Ménard\, Johns Hopkins \nTitle: Demystifying the over-parametrization of neural networks \nAbstract: I will show how to estimate the dimensionality of neural encodings (learned weight structures) to assess how many parameters are effectively used by a neural network. I will then show how their scaling properties provide us with fundamental exponents on the learning process of a given task. I will comment on connections to thermodynamics.\n\n\n11:30 am–12:00 pm\nBreak\n\n\n12:00–12:30 pm\nPatrick Shafto\, Rutgers \nTitle: Math for AI and AI for Math \nAbstract: I will briefly discuss two DARPA programs aiming to deepen connections between mathematics and AI\, specifically through geometric and symbolic perspectives. The first aims for mathematical foundations for understanding the behavior and performance of modern AI systems such as Large Language Models and Diffusion models. The second aims to develop AI for pure mathematics through an understanding of abstraction\, decomposition\, and formalization. I will close with some thoughts on the coming convergence between AI and math.\n\n\n12:30–12:45 pm\nBreak\n\n\n12:45–2:00 pm\nMike Freedman\, Harvard CMSA \nTitle: How to think about the shape of mathematics \nFollowed by group discussion \n\nSupport provided by Logical Intelligence.
URL:https://cmsa.fas.harvard.edu/event/mlgeometry/
LOCATION:CMSA 20 Garden Street Cambridge\, Massachusetts 02138 United States
CATEGORIES:Conference,Event
ATTACH;FMTTYPE=image/jpeg:https://cmsa.fas.harvard.edu/media/GML_2025.7-scaled.jpg
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20250911T090000
DTEND;TZID=America/New_York:20250912T170000
DTSTAMP:20260515T085426
CREATED:20250502T175902Z
LAST-MODIFIED:20251026T044243Z
UID:10003743-1757581200-1757696400@cmsa.fas.harvard.edu
SUMMARY:Big Data Conference 2025
DESCRIPTION:Big Data Conference 2025 \nDates: Sep. 11–12\, 2025 \nLocation: Harvard University CMSA\, 20 Garden Street\, Cambridge & via Zoom \nThe Big Data Conference features speakers from the Harvard community as well as scholars from across the globe\, with talks focusing on computer science\, statistics\, math and physics\, and economics. \nInvited Speakers \n\nMarkus J. Buehler\, MIT\nYiling Chen\, Harvard\nJordan Ellenberg\, UW Madison\nYue M. Lu\, Harvard\nPankaj Mehta\, BU\nNick Patterson\, Harvard\nGautam Reddy\, Princeton\nTrevor David Rhone\, Rensselaer Polytechnic Institute\nTess Smidt\, MIT\n\nOrganizers: \nMichael M. Desai\, Harvard OEB | Michael R. Douglas\, Harvard CMSA | Yannai A. Gonczarowski\, Harvard Economics | Efthimios Kaxiras\, Harvard Physics | Melanie Weber\, Harvard SEAS \n  \nBig Data YouTube Playlist \n  \nSchedule \nThursday\, Sep. 11\, 2025 \n  \n\n\n\n9:00 am\nRefreshments\n\n\n9:30 am\nIntroductions\n\n\n9:45–10:45 am\nGautam Reddy\, Princeton \nTitle: Global epistasis in genotype-phenotype maps\n\n\n10:45–11:00 am\nBreak\n\n\n11:00 am–12:00 pm\nNick Patterson\, Harvard \nTitle: The Origin of the Indo-Europeans \nAbstract: Indo-European is the largest family of human languages\, with very wide geographical distribution and more than 3 billion native speakers. How did this family arise and spread? This question has been discussed for nearly 250 years but\, with the availability of DNA from ancient fossils\, is now largely understood\, at least in broad outlines. We will describe what we now know about the origins.\n\n\n12:00–1:30 pm\nLunch break\n\n\n1:30–2:30 pm\nMarkus Buehler\, MIT \nTitle: Superintelligence for scientific discovery \nAbstract: AI is moving beyond prediction to become a partner in invention. While today’s models excel at interpolating within known data\, true discovery requires stepping outside existing truths. 
This talk introduces superintelligent discovery engines built on multi-agent swarms: diverse AI agents that interact\, compete\, and cooperate to generate structured novelty. Guided by Gödel’s insight that no closed system is complete\, these swarms create gradients of difference – much like temperature gradients in thermodynamics – that sustain flow\, invention\, and surprise. Case studies in protein design and music composition show how swarms escape data biases\, invent novel structures\, and weave long-range coherence\, producing creativity that rivals human processes. By moving from “big data” to “big insight”\, these systems point toward a new era of AI that composes knowledge across science\, engineering\, and the arts.\n\n\n2:30–2:45 pm\nBreak\n\n\n2:45–3:45 pm\nJordan Ellenberg\, UW Madison \nTitle: What does machine learning have to offer mathematics?\n\n\n3:45–4:00 pm\nBreak\n\n\n4:00–5:00 pm\nPankaj Mehta\, Boston University \nTitle: Thinking about high-dimensional biological data in the age of AI \nAbstract: The molecular biology revolution has transformed our view of living systems. Scientific explanations of biological phenomena are now synonymous with the identification of the genes and proteins. The preeminence of the molecular paradigm has only become more pronounced as new technologies allow us to make measurements at scale. Combining this wealth of data with new artificial intelligence (AI) techniques is widely viewed as the future of biology. Here\, I will discuss the promise and perils of this approach. I will focus on our unpublished work with collaborators on two fronts: (i) transformer-based models for understanding genotype-to-phenotype maps\, and (ii) LLM-based ‘foundational models’ for cellular identity\, such as TranscriptFormer\, which is trained on single-cell RNA sequencing (scRNAseq) data. 
While LLMs excel at capturing complex evolutionary and demographic structure in DNA sequence data\, they are much less adept at elucidating the biology of cellular identity. We show that simple parameter-free models based on linear-algebra outperform TranscriptFormer on downstream tasks related to cellular identity\, even though TranscriptFormer has nearly a billion parameters. If time permits\, I will conclude by showing how we can combine ideas from linear algebra\, bifurcation theory\, and statistical physics to classify cell fate transitions using scRNAseq data.\n\n\n\n  \nFriday\, Sep. 12\, 2025  \n\n\n\n9:00-9:45 am\nRefreshments\n\n\n9:45–10:45 am\nYiling Chen\, Harvard \nTitle: Data Reliability Scoring \nAbstract: Imagine you are trying to make a data-driven decision\, but the data at hand may be noisy\, biased\, or even strategically manipulated. Can you assess whether such a dataset is reliable—without access to ground truth?\nWe initiate the study of reliability scoring for datasets reported by potentially strategic data sources. While the true data remain unobservable\, we assume access to auxiliary observations generated by an unknown statistical process that depends on the truth. We introduce the Gram Determinant Score\, a reliability measure that evaluates how well the reported data align with the unobserved truth\, using only the reported data and the auxiliary observations. The score comes with provable guarantees: it preserves several natural reliability orderings. Experimentally\, it effectively captures data quality in settings with synthetic noise and contrastive learning embeddings.\nThis talk is based on joint work with Shi Feng\, Fang-Yi Yu\, and Paul Kattuman.\n\n\n10:45–11:00 am\nBreak\n\n\n11:00 am –12:00 pm\nYue M. Lu\, Harvard \nTitle: Nonlinear Random Matrices in High-Dimensional Estimation and Learning \nAbstract: In recent years\, new classes of structured random matrices have emerged in statistical estimation and machine learning. 
Understanding their spectral properties has become increasingly important\, as these matrices are closely linked to key quantities such as the training and generalization performance of large neural networks and the fundamental limits of high-dimensional signal recovery. Unlike classical random matrix ensembles\, these new matrices often involve nonlinear transformations\, introducing additional structural dependencies that pose challenges for traditional analysis techniques. \nIn this talk\, I will present a set of equivalence principles that establish asymptotic connections between various nonlinear random matrix ensembles and simpler linear models that are more tractable for analysis. I will then demonstrate how these principles can be applied to characterize the performance of kernel methods and random feature models across different scaling regimes and to provide insights into the in-context learning capabilities of attention-based Transformer networks.\n\n\n12:00–1:30 pm\nLunch break\n\n\n1:30–2:30 pm\nTrevor David Rhone\, Rensselaer Polytechnic Institute \nTitle: Accelerating the discovery of van der Waals quantum materials using AI \nAbstract: van der Waals (vdW) materials are exciting platforms for studying emergent quantum phenomena\, ranging from long-range magnetic order to topological order. A conservative estimate for the number of candidate vdW materials exceeds ~10^6 for monolayers and ~10^12 for heterostructures. How can we accelerate the exploration of this entire space of materials? Can we design quantum materials with desirable properties\, thereby advancing innovation in science and technology? A recent study showed that artificial intelligence (AI) can be harnessed to discover new vdW Heisenberg ferromagnets based on Cr2Ge2Te6 [1]\, [2] and magnetic vdW topological insulators based on MnBi2Te4 [3]. 
In this talk\, we will harness AI to efficiently explore the large chemical space of vdW materials and to guide the discovery of vdW materials with desirable spin and charge properties. We will focus on crystal structures based on monolayer Cr2I6 of the form A2X6\, which are studied using density functional theory (DFT) calculations and AI. Magnetic properties\, such as the magnetic moment are determined. The formation energy is also calculated and used as a proxy for the chemical stability. We also investigate monolayers based on MnBi2Te4 of the form AB2X4 to identify novel topological materials. Further to this\, we study heterostructures based on MnBi2Te4/Sb2Te3 stacks. We show that AI\, combined with DFT\, can provide a computationally efficient means to predict the thermodynamic and magnetic properties of vdW materials [4]\,[5]. This study paves the way for the rapid discovery of chemically stable vdW quantum materials with applications in spintronics\, magnetic memory and novel quantum computing architectures.\n[1]        T. D. Rhone et al.\, “Data-driven studies of magnetic two-dimensional materials\,” Sci. Rep.\, vol. 10\, no. 1\, p. 15795\, 2020.\n[2]        Y. Xie\, G. Tritsaris\, O. Granas\, and T. Rhone\, “Data-Driven Studies of the Magnetic Anisotropy of Two-Dimensional Magnetic Materials\,” J. Phys. Chem. Lett.\, vol. 12\, no. 50\, pp. 12048–12054.\n[3]        R. Bhattarai\, P. Minch\, and T. D. Rhone\, “Investigating magnetic van der Waals materials using data-driven approaches\,” J. Mater. Chem. C\, vol. 11\, p. 5601\, 2023.\n[4]        T. D. Rhone et al.\, “Artificial Intelligence Guided Studies of van der Waals Magnets\,” Adv. Theory Simulations\, vol. 6\, no. 6\, p. 2300019\, 2023.\n[5]        P. Minch\, R. Bhattarai\, K. Choudhary\, and T. D. Rhone\, “Predicting magnetic properties of van der Waals magnets using graph neural networks\,” Phys. Rev. Mater.\, vol. 8\, no. 11\, p. 114002\, Nov. 
2024.\nThis work used the Extreme Science and Engineering Discovery Environment (XSEDE)\, which is supported by National Science Foundation Grant No. ACI-1548562. This research used resources of the Argonne Leadership Computing Facility\, which is a DOE Office of Science User Facility supported under Contract No. DE-AC02-06CH11357. This material is based on work supported by the National Science Foundation CAREER award under Grant No. 2044842.\n\n\n2:30–2:45 pm\nBreak\n\n\n2:45–3:45 pm\nTess Smidt\, MIT \nTitle: Applications of Euclidean neural networks to understand and design atomistic systems \nAbstract: Atomic systems (molecules\, crystals\, proteins\, etc.) are naturally represented by a set of coordinates in 3D space labeled by atom type. This poses a challenge for machine learning due to the sensitivity of coordinates to 3D rotations\, translations\, and inversions (the symmetries of 3D Euclidean space). Euclidean symmetry-equivariant Neural Networks (E(3)NNs) are specifically designed to address this issue. They faithfully capture the symmetries of physical systems\, handle 3D geometry\, and operate on the scalar\, vector\, and tensor fields that characterize these systems. \nE(3)NNs have achieved state-of-the-art results across atomistic benchmarks\, including small-molecule property prediction\, protein-ligand binding\, and force prediction for crystals\, molecules\, and heterogeneous catalysis. By merging neural network design with group representation theory\, they provide a principled way to embed physical symmetries directly into learning. In this talk\, I will survey recent applications of E(3)NNs to materials design and highlight ongoing debates in the AI for atomistic sciences community: how to balance the incorporation of physical knowledge with the drive for engineering efficiency.\n\n\n\n 
URL:https://cmsa.fas.harvard.edu/event/bigdata_2025/
LOCATION:CMSA Room G10\, CMSA\, 20 Garden Street\, Cambridge\, MA\, 02138\, United States
CATEGORIES:Big Data Conference,Conference,Event
ATTACH;FMTTYPE=image/jpeg:https://cmsa.fas.harvard.edu/media/Big-Data-2025_11x17.9-scaled.jpg
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20250908T090000
DTEND;TZID=America/New_York:20250910T170000
DTSTAMP:20260515T085426
CREATED:20250502T174228Z
LAST-MODIFIED:20260422T141418Z
UID:10003660-1757322000-1757523600@cmsa.fas.harvard.edu
SUMMARY:Math and Machine Learning Reunion Workshop
DESCRIPTION:Math and Machine Learning Reunion Workshop \nDates: September 8–10\, 2025 \nLocation: Harvard CMSA\, Room G10\, 20 Garden Street\, Cambridge MA \nMachine learning and AI are increasingly important tools in all fields of research. In the fall of 2024\, the CMSA Mathematics and Machine Learning Program hosted 70 mathematicians and machine learning experts\, ranging from beginners to established leaders in their field\, to explore ML as a research tool for mathematicians\, and mathematical approaches to understanding ML. More than 20 papers came out of projects started and developed during the program. The MML Reunion workshop will be an opportunity for the participants to share their results\, review subsequent developments\, and develop directions for future research. \nInvited Speakers \n\nAngelica Babei\, Howard University\nGergely Bérczi\, Aarhus University\nJoanna Bieri\, University of Redlands\nGiorgi Butbaia\, University of New Hampshire\nRandy Davila\, RelationalAI\, Rice University\nAlyson Deines\, IDA-CCR La Jolla\nSergei Gukov\, Caltech\nYang-Hui He\, University of Oxford\nMark Hughes\, Brigham Young University\nKyu-Hwan Lee\, University of Connecticut\nEric Mjolsness\, UC Irvine\nMaria Prat Colomer\, Brown University\nSébastien Racanière\, Google DeepMind\nEric Ramos\, Stevens Institute of Technology\nTamara Veenstra\, IDA-CCR La Jolla\n\nOrganizer: Michael Douglas\, CMSA \n\nSchedule \nMonday Sep. 8\, 2025 \n\n\n\n9:00–9:30 am\nMorning refreshments\n\n\n9:30–9:45 am\nIntroductions\n\n\n9:45–10:45 am\nAngelica Babei\, Howard University\nTitle: Predicting Euler factors of elliptic curves\nAbstract: Two non-isogenous elliptic curves will have distinct traces of Frobenius at a large enough prime\, and a finite set of $a_p(E)$ values determines all others. 
However\, even when enough $a_p(E)$ values are provided to uniquely identify the isogeny class\, no efficient algorithm is known for determining the remaining $a_p(E)$ values from this finite set. Preliminary results show that ML models can learn to predict the next trace of Frobenius with a surprising degree of accuracy from relatively few nearby entries. We investigate some possible reasons for this performance. Based on joint work with François Charton\, Edgar Costa\, Xiaoyu Huang\, Kyu-Hwan Lee\, David Lowry-Duda\, Ashvni Narayanan\, and Alexey Pozdnyakov.\n\n\n10:45–11:00 am\nBreak\n\n\n11:00 am–12:00 pm\nKyu-Hwan Lee\, University of Connecticut\nTitle: Machine learning mutation-acyclicity of quivers\n\n\n12:00–1:30 pm\nLunch\n\n\n1:30–2:30 pm\nGergely Bérczi\, Aarhus University\nTitle: Diffusion Models for Sphere Packings\n\n\n2:30–2:45 pm\nBreak\n\n\n2:45–3:45 pm\nRandy Davila\, RelationalAI\, Rice University\nTitle: Recent Developments in Automated Conjecturing\nAbstract: The dream of a machine capable of generating deep mathematical insight has inspired decades of research—from Fajtlowicz’s Graffiti program in graph theory and chemistry to DeepMind’s neural breakthroughs in knot theory. In this talk\, we briefly trace the evolution of automated conjecturing systems and present recent advances that deepen our understanding of what it means for machines to conjecture—a pursuit long embodied by our system\, TxGraffiti. Building on this legacy\, we introduce a new framework that integrates optimization\, enumeration\, and convex geometric methods with creative heuristics and symbolic translation. This extended system produces not only conjectured inequalities\, but also necessary and sufficient condition statements\, which can then be automatically ranked by IRIS (Inequality Ranking and Inference System) model and translated into Lean 4 for formal verification. 
The result is a flexible architecture capable of generating precise\, human-readable\, and logically rigorous conjectures with minimal manual intervention.\nWe showcase results across a range of mathematical areas\, including graph theory\, polyhedral theory\, number theory\, and—for the first time—conjectures in string theory\, derived from the dataset of complete intersection Calabi–Yau (CICY) threefolds. Together\, these developments suggest that with the right blend of structure\, strategy\, and aesthetic\, machines can generate conjectures that not only withstand scrutiny but invite it—offering a glimpse into a future where AI contributes meaningfully to the creative process of mathematics.\n\n\n3:45–4:00 pm\nBreak\n\n\n4:00–5:00 pm\nEric Ramos\, Stevens Institute of Technology\nTitle: An AI approach to a conjecture of Erdos\nAbstract: Given a graph G\, its independence sequence is the integral sequence a_1\,a_2\,…\,a_n\, where a_i is the number of independent sets of vertices of size i. In the 90’s Erdos and coauthors showed that this sequence need not be unimodal for general graphs\, but conjectured that it is always unimodal whenever G is a tree. This conjecture was then naturally generalized to claim that the independence sequence of trees should be log concave\, in the sense that a_i^2 is always above a_{i-1}a_{i+1}. This stronger version of the conjecture was shown to hold for all trees of at most 25 vertices. In 2023\, however\, using improved computational power and a considerably more efficient algorithm\, Kadrawi\, Levit\, Yosef\, and Mirzrachi proved that there were exactly two trees on 26 vertices whose independence sequence was not log concave. They also showed how these two examples could be generalized to create two families of trees whose members are all not log concave. 
Finally\, in early 2025\, Galvin provided a family of trees with the property that for any chosen positive integer k\, there is a member T of the family where log concavity breaks at index alpha(T) – k\, where alpha(T) is the independence number of T. Outside of these three families\, not much else was known about what causes log concavity to break. In this presentation\, I will discuss joint work with Shiqi Sun\, where we used the PatternBoost architecture to train a machine to find counter-examples to the log concavity conjecture. We will discuss the successes of this approach – finding tens of thousands of new counter-examples with vertex set sizes varying from 27 to 101 – and some of its fascinating failures.\n\n\n\n  \nTuesday\, Sep. 9\, 2025 \n\n\n\n9:00–9:30 am\nMorning refreshments\n\n\n9:30–10:30 am\nMaria Prat Colomer\, Brown University\nTitle: From PINNs to Computer-Assisted Proofs for Fluid Dynamics\nAbstract: Physics-Informed Neural Networks (PINNs) have emerged as an alternative to traditional numerical methods for solving partial differential equations (PDEs). We apply PINNs to the study of low regularity problems in fluid dynamics\, focusing on the incompressible 2D Euler equations. In particular\, we study V-states\, which are a class of weak\, non-smooth solutions for which the vorticity is the characteristic function of a domain that rotates with constant angular velocity. We have obtained an approximate solution of a limiting V-state using a PINN and we are currently working on a rigorous proof of the existence of a nearby solution through a computer-assisted proof. 
Our PINN-based numerical approximation significantly improves on traditional methods\, a key factor being the integration of prior mathematical knowledge of the problem to effectively explore the solution space.\n\n\n10:30–11:00 am\nBreak\n\n\n11:00 am–12:00 pm\nSébastien Racanière\, Google DeepMind\nTitle: Generative models and high dimensional symmetries: the case of Lattice QCD\nAbstract: Applying normalizing flows\, a machine learning technique for mapping distributions\, to Lattice QCD offers a promising route to enhance simulations and overcome limitations of traditional methods like Hybrid Monte Carlo. LQCD aims to compute expectation values of observables from an intractable distribution defined over a lattice of fields. Normalizing flows can learn this complex distribution and generate new configurations\, improving efficiency and addressing challenges such as critical slowing down and topological freezing. Topological freezing\, in particular\, traps simulations in local minima and prevents exploration of the full configuration space\, affecting accuracy. This approach incorporates the symmetries of LQCD through gauge equivariant flows\, leading to successful definitions and good effective sample sizes on smaller lattices. Beyond accelerating configuration generation\, normalizing flows also find application in variance reduction for observable calculation and exploring phenomena at different scales within LQCD. 
While further research is needed to apply these methods at the scale of state-of-the-art LQCD calculations\, these advancements hold significant potential to improve the accuracy\, efficiency\, and reach of future simulations.\n\n\n12:00–1:30 pm\nLunch break\n\n\n1:30–2:30 pm\nSergei Gukov\, Caltech\nTitle: On sparse reward problems in mathematics\nAbstract: An alternative title for this talk could be “Learning Hardness.” To see why\, we will explore some long-standing open problems in mathematics and examine what makes them hard from a computational perspective. We will argue that\, in many cases\, the difficulty arises from a highly uneven distribution of hardness within families of related problems\, where the truly hard cases lie far out in the tail. We will then discuss how recent advances in AI may provide new tools to tackle these challenges. Based in part on the recent work with A.Shehper\, A.Medina-Mardones\, L.Fagan\, B.Lewandowski\, A.Gruen\, Y.Qiu\, P.Kucharski\, and Z.Wang.\n\n\n2:30–2:45 pm\nBreak\n\n\n2:45–3:45 pm\nAlyson Deines\, IDA-CCR La Jolla; Tamara Veenstra\, IDA-CCR La Jolla; Joanna Bieri\, University of Redlands\nTitle: Machine learning $L$-functions\nAbstract: We study the vanishing order of rational $L$-functions and Maass form $L$-functions from a data scientific perspective. Each $L$-function is represented by finitely many Dirichlet coefficients\, the normalization of which depends on the context. We observe murmurations by averaging over these datasets. For rational $L$-functions\, we find that PCA clusters rational $L$-functions by their vanishing order and record that LDA and neural networks may accurately predict this quantity. 
For Maass form $L$-functions\, while PCA does not cluster these $L$-functions\, we still find that LDA and neural networks may accurately predict this quantity.\n\n\n3:45–4:00 pm\nBreak\n\n\n4:00–5:00 pm\nMark Hughes\, Brigham Young University\nTitle: Modelling the concordance group via contrastive learning\nAbstract: The concordance group of knots in 3-space is an abelian group formed by the equivalence classes of knots under the relation of concordance\, where two knots are concordant if they are the boundary of a smooth annulus properly embedded in the 4-dimensional product space S^3 x I. Though studied since 1966\, properties of the concordance groups (and even the recognition problem of deciding when a knot is null-concordant\, or slice) are difficult to study. In this talk I will outline ongoing attempts to model the concordance group using contrastive learning. This is joint work with Onkar Singh Gujral.\n\n\n\n  \n  \nWednesday Sep. 10\, 2025 \n\n\n\n9:00–9:30 am\nMorning refreshments\n\n\n9:30–10:30 am\nYang-Hui He\, University of Oxford (Via Zoom)\nTitle: AI for Mathematics: Bottom-up\, Top-Down\, Meta-\nAbstract: We argue how AI can assist mathematics in three ways: theorem-proving\, conjecture formulation\, and language processing. Inspired by initial experiments in geometry and string theory in 2017\, we summarize how this emerging field has grown over the past years\, and show how various machine-learning algorithms can help with pattern detection across disciplines ranging from algebraic geometry to representation theory\, to combinatorics\, and to number theory. 
At the heart of the programme is the question how does AI help with theoretical discovery\, and the implications for the future of mathematics.\n\n\n10:30–11:00 am\nBreak\n\n\n11:00 am–12:00 pm\nGiorgi Butbaia\, University of New Hampshire\nTitle: Computational String Theory using Machine Learning\nAbstract: Calabi-Yau compactifications of the $E_8\times E_8$ heterotic string provide a promising route to recovering the four-dimensional particle physics described by the Standard Model. While the topology of the Calabi-Yau space determines the overall matter content in the low-energy effective field theory\, further details of the compactification geometry are needed to calculate the normalized physical couplings and masses of elementary particles. In this talk\, we present novel numerical techniques for computing physically normalized Yukawa couplings in a number of heterotic models in the standard embedding using geometric machine learning and equivariant neural networks. We observe that the results produced using these techniques are in excellent agreement with the expected values for certain special cases\, where the answers are known. In the case of the Tian-Yau manifold\, which defines a model with three generations and has $h^{2\,1}>1$\, we provide a first-of-its-kind calculation of the normalized Yukawa couplings. As part of this work\, we have developed a Python library called cymyc\, which streamlines calculation of the Calabi-Yau metric and the Yukawa couplings on arbitrary Calabi-Yau manifolds that are realized as complete intersections and provides a framework for studying the differential geometric properties\, such as the curvature.\n\n\n12:00–1:30 pm\nLunch break\n\n\n1:30–2:30 pm\nEric Mjolsness\, UC Irvine\nTitle: Graph operators for science-applied AI/ML\nAbstract: Scalable\, structured graphs play a central role in mathematical problem definition for scientific applications of artificial intelligence and machine learning. 
Qualitatively diverse kinds of operators are necessary to bring these graphs to life. Continuous-time processes govern the evolution of spatial graph embeddings and other graph-local differential equation systems\, as well as the flow of probability between locally similar graph structures in a probabilistic Fock space\, according to rules in a dynamical graph grammar (DGG). Both kinds of dynamics have biophysical applications\, e.g. to the dynamic cytoskeleton\, and both obey graph-centric time-evolution operators in an operator algebra that can be differentiated for learning. On the other hand\, coarse-scale discrete jumps in graph structure such as global mesh refinement can be modeled with a “graph lineage”: a sequence of sparsely interrelated graphs whose size grows roughly exponentially with level number. Graph lineages permit the definition of substantially more cost-efficient skeletal graph products\, as versions of classic binary graph operators such as the Cartesian product and direct product of graphs\, with analogous but not identical properties. Applications to deep neural networks and to multigrid numerical methods are shown.\nThese two graph operator frameworks are interrelated. Further graph lineage operators allow the definition of graph frontier spaces\, accommodating graph grammars and supporting the definition of skeletal graph-graph function spaces. In return\, “confluent” graph grammars\, e.g. for adaptive mesh generation\, permit the definition of graph lineages through iteration. I will also sketch the design of compatible AI for Science systems that may exploit DGGs.\nJoint work with Cory Scott and Matthew Hur.\n\n\n2:30–3:00 pm\nBreak\n\n\n3:00–5:00 pm\nPanel and Discussion Group: Jordan Ellenberg\, Tamara Veenstra\, Sébastien Racanière\, Kyu-Hwan Lee\, Sergei Gukov\n\n\n\n  \n\n  \n  \n 
URL:https://cmsa.fas.harvard.edu/event/mml_2025/
LOCATION:CMSA 20 Garden Street Cambridge\, Massachusetts 02138 United States
CATEGORIES:Event,Workshop
ATTACH;FMTTYPE=image/jpeg:https://cmsa.fas.harvard.edu/media/MML_Reunion_poster.2.jpg
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20250423T140000
DTEND;TZID=America/New_York:20250423T150000
DTSTAMP:20260515T085426
CREATED:20250128T214818Z
LAST-MODIFIED:20250311T184354Z
UID:10003709-1745416800-1745420400@cmsa.fas.harvard.edu
SUMMARY:Machine learning for analytic calculations in theoretical physics
DESCRIPTION:New Technologies in Mathematics Seminar \nSpeaker: Matthias Wilhelm (University of Southern Denmark) \nTitle: Machine learning for analytic calculations in theoretical physics \nAbstract: In this talk\, we will present recent progress on applying machine-learning techniques to improve calculations in theoretical physics\, in which we desire exact and analytic results. One example is the so-called integration-by-parts reduction of Feynman integrals\, which poses a frequent bottleneck in state-of-the-art calculations in theoretical particle and gravitational-wave physics. These reductions rely on heuristic approaches for selecting a finite set of linear equations to solve\, and the quality of the heuristics heavily influences the performance. In this talk\, we investigate the use of machine-learning techniques to find improved heuristics. We use funsearch\, a genetic programming variant based on code generation by a Large Language Model\, in order to explore possible approaches\, then use strongly typed genetic programming to zero in on useful solutions. Both approaches manage to re-discover the state-of-the-art heuristics recently incorporated into integration-by-parts solvers\, and in one example find a small advance on this state of the art.
URL:https://cmsa.fas.harvard.edu/event/newtech_42325/
LOCATION:CMSA Room G10\, CMSA\, 20 Garden Street\, Cambridge\, MA\, 02138\, United States
CATEGORIES:New Technologies in Mathematics Seminar
ATTACH;FMTTYPE=image/png:https://cmsa.fas.harvard.edu/media/CMSA-NTM-Seminar-4.23.2025.docx-1.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20250409T140000
DTEND;TZID=America/New_York:20250409T150000
DTSTAMP:20260515T085426
CREATED:20250128T214458Z
LAST-MODIFIED:20250410T150618Z
UID:10003707-1744207200-1744210800@cmsa.fas.harvard.edu
SUMMARY:Can Transformers Do Enumerative Geometry?
DESCRIPTION:New Technologies in Mathematics Seminar \nSpeaker: Baran Hashemi\, Technical University of Munich \nTitle: Can Transformers Do Enumerative Geometry? \nAbstract: How can Transformers model and learn enumerative geometry? What is a systematic procedure for using Transformers in abductive knowledge discovery within a mathematician-machine collaboration? In this work\, we introduce a Neural Enumerative Reasoning model for computation of ψ-class intersection numbers on the moduli space of curves. By reformulating the problem as a continuous optimization task\, we compute intersection numbers across a wide value range from 10e-45 to 10e45. To capture the recursive nature inherent in these intersection numbers\, we propose the Dynamic Range Activator (DRA)\, a new activation function that enhances the Transformer’s ability to model recursive patterns and handle severe heteroscedasticity. Given precision requirements for computing the intersections\, we quantify the uncertainty of the predictions using Conformal Prediction with a dynamic sliding window adaptive to the partitions of equivalent number of marked points. Beyond simply computing intersection numbers\, we explore the enumerative “world-model” of Transformers. Our interpretability analysis reveals that the network is implicitly modeling the Virasoro constraints in a purely data-driven manner. Moreover\, through abductive hypothesis testing\, probing\, and causal inference\, we uncover evidence of an emergent internal representation of the large-genus asymptotics of ψ-class intersection numbers. This opens up new possibilities in inferring asymptotic closed-form expressions directly from a limited amount of data. \nThis talk is based on https://openreview.net/pdf?id=4X9RpKH4Ls. \n 
URL:https://cmsa.fas.harvard.edu/event/newtech_4925/
LOCATION:CMSA Room G10\, CMSA\, 20 Garden Street\, Cambridge\, MA\, 02138\, United States
CATEGORIES:New Technologies in Mathematics Seminar
ATTACH;FMTTYPE=image/png:https://cmsa.fas.harvard.edu/media/CMSA-NTM-Seminar-4.9.2025.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20250402T140000
DTEND;TZID=America/New_York:20250402T150000
DTSTAMP:20260515T085426
CREATED:20250128T214417Z
LAST-MODIFIED:20250403T144343Z
UID:10003706-1743602400-1743606000@cmsa.fas.harvard.edu
SUMMARY:Learning Dynamical Transport without Data
DESCRIPTION:New Technologies in Mathematics Seminar \nSpeaker: Michael Albergo (Harvard) \nTitle: Learning Dynamical Transport without Data \nAbstract: Algorithms based on dynamical transport of measure\, such as score-based diffusion models\, have resulted in great progress in the field of generative modeling. However\, these algorithms rely on access to an abundance of data from the target distribution. A complementary problem to this is learning to generate samples from a target distribution when only given query access to the unnormalized log-likelihood or energy function associated to it\, with myriad application in statistical physics\, chemistry\, and Bayesian inference. I will present an algorithm based on dynamical transport to sample from a target distribution in this context\, which can be seen as an augmentation of annealed importance sampling and sequential Monte Carlo. Time permitting\, I will also discuss how to generalize these ideas to dynamics of discrete distributions. This is joint work with Eric Vanden-Eijnden\, Peter Holderrieth\, and Tommi Jaakkola. \n 
URL:https://cmsa.fas.harvard.edu/event/newtech_4225/
LOCATION:CMSA Room G10\, CMSA\, 20 Garden Street\, Cambridge\, MA\, 02138\, United States
CATEGORIES:New Technologies in Mathematics Seminar
ATTACH;FMTTYPE=image/png:https://cmsa.fas.harvard.edu/media/CMSA-NTM-Seminar-4.2.2025.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20250327T100000
DTEND;TZID=America/New_York:20250327T110000
DTSTAMP:20260515T085426
CREATED:20250128T214249Z
LAST-MODIFIED:20250327T192309Z
UID:10003666-1743069600-1743073200@cmsa.fas.harvard.edu
SUMMARY:AlphaProof: when reinforcement learning meets formal mathematics
DESCRIPTION:New Technologies in Mathematics Seminar \nSpeaker: Thomas Hubert (Google DeepMind) \nTitle: AlphaProof: when reinforcement learning meets formal mathematics \nAbstract: Galileo\, the renowned Italian astronomer\, physicist\, and mathematician\, famously described mathematics as the language of the universe. Progress since then has only confirmed his intuition\, as the world we live in can be described with extreme precision with just a few mathematical equations.\nIn the last 70 years\, the rise of computers has also enriched our understanding of the world we live in and revolutionized it. Mathematics has benefited tremendously from this digital revolution as well: while Gauss had to compute primes by hand\, computers and computation are now routinely used in research mathematics and contribute to grand problems like the Birch and Swinnerton-Dyer conjecture\, one of the Millennium Prize Problems.\nToday\, computers are entering a new age\, one in which computation can be transformed into reasoning. In this talk\, I would like to discuss two such developments that will undoubtedly have an integral role to play in the future of mathematics: the concurrent rise of formal mathematics and of machine intelligence.
URL:https://cmsa.fas.harvard.edu/event/newtech_32625/
LOCATION:Virtual
CATEGORIES:New Technologies in Mathematics Seminar
ATTACH;FMTTYPE=image/png:https://cmsa.fas.harvard.edu/media/CMSA-NTM-Seminar-3.27.2025.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20250312T140000
DTEND;TZID=America/New_York:20250312T150000
DTSTAMP:20260515T085426
CREATED:20250123T195100Z
LAST-MODIFIED:20250327T194539Z
UID:10003665-1741788000-1741791600@cmsa.fas.harvard.edu
SUMMARY:Discovery in Mathematics with Automated Conjecturing
DESCRIPTION:New Technologies in Mathematics Seminar \nSpeaker: Randy Davila\, RelationalAI and Rice University \nTitle: Discovery in Mathematics with Automated Conjecturing \nAbstract: Automated conjecturing is a form of artificial intelligence that applies heuristic-driven methods to mathematical discovery. Since the late 1980s\, systems such as Fajtlowicz’s Graffiti\, DeLaViña’s Graffiti.pc\, and TxGraffiti have collectively contributed to over 130 publications in mathematical journals. In this talk\, we outline the evolution of automated conjecturing\, focusing on TxGraffiti\, a program that employs linear optimization methods and several distinct heuristics to generate mathematically meaningful conjectures. We will then introduce GraphMind\, a dueling framework where the Optimist proposes conjectures while the Pessimist seeks counterexamples\, fostering a feedback loop that strengthens automated reasoning. Finally\, we will present GraffitiAI\, a Python package that extends automated conjecturing across various mathematical domains. \nBio: Randy R. Davila is a Lecturer in the Department of Computational Applied Mathematics & Operations Research at Rice University and a Library Engineer at RelationalAI\, specializing in relational knowledge graph systems for intelligent data management. He earned his PhD in Mathematics from the University of Johannesburg in 2019\, with research focused on graph theory and combinatorial optimization. His work explores artificial intelligence in mathematical conjecture generation\, graph theory\, and neural network applications to combinatorial problems. As the creator of TxGraffiti\, he has developed AI-driven systems that have contributed to numerous mathematical publications. His recent projects include GraphMind\, a dueling agent-based framework that pairs conjecture generation with counterexample discovery\, and GraffitiAI\, a Python package for automated conjecturing across mathematical disciplines. \n 
URL:https://cmsa.fas.harvard.edu/event/newtech_31225/
LOCATION:Hybrid – G10
CATEGORIES:New Technologies in Mathematics Seminar
ATTACH;FMTTYPE=image/png:https://cmsa.fas.harvard.edu/media/CMSA-NTM-Seminar-3.12.2025.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20250305T140000
DTEND;TZID=America/New_York:20250305T150000
DTSTAMP:20260515T085426
CREATED:20250123T192715Z
LAST-MODIFIED:20250307T154830Z
UID:10003664-1741183200-1741186800@cmsa.fas.harvard.edu
SUMMARY:Machine Learning G2 Geometry
DESCRIPTION:New Technologies in Mathematics Seminar \nSpeaker: Elli Heyes\, Imperial College \nTitle: Machine Learning G2 Geometry \nAbstract: Compact Ricci-flat Calabi-Yau and holonomy G2 manifolds appear in string and M-theory respectively as descriptions of the extra spatial dimensions that arise in the theories. Since 2017 machine-learning techniques have been applied extensively to study Calabi-Yau manifolds but until 2024 no similar work had been carried out on holonomy G2 manifolds. In this talk\, I will firstly show how topological properties of these manifolds can be learnt using neural networks. I will then discuss how one could try to numerically learn metrics on compact holonomy G2 manifolds using machine-learning and why these approximations would be useful in M-theory.
URL:https://cmsa.fas.harvard.edu/event/newtech_3525/
LOCATION:Hybrid
CATEGORIES:New Technologies in Mathematics Seminar
ATTACH;FMTTYPE=image/png:https://cmsa.fas.harvard.edu/media/CMSA-NTM-Seminar-3.5.2025.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20250226T140000
DTEND;TZID=America/New_York:20250226T150000
DTSTAMP:20260515T085426
CREATED:20250124T154400Z
LAST-MODIFIED:20250623T124501Z
UID:10003663-1740578400-1740582000@cmsa.fas.harvard.edu
SUMMARY:Datasets for Math: From AIMO Competitions to Math Copilots for Research
DESCRIPTION:New Technologies in Mathematics Seminar \nSpeaker: Simon Frieder\, Oxford \nTitle: Datasets for Math: From AIMO Competitions to Math Copilots for Research \nAbstract: This talk begins with a brief exposition of the AI Mathematical Olympiad (AIMO) on Kaggle\, now in its second iteration\, outlining datasets and models available to contestants. Taking a broader perspective\, I then examine 1) the overarching issues the current datasets suffer from\, such as binary evaluation or constrained sets of use cases\, and 2) the trajectory they set for competition-style mathematical problem-solving\, which is different from mathematical research practice. I argue for a fundamental shift in dataset structure and composition\, both for training and evaluation\, and introduce the idea of mapping mathematical workflows to data\, a key example underscoring the need for this shift. I touch upon new thinking LLMs and their role in redefining LLM math evaluation\, highlighting their implications for dataset design. Finally\, I propose general improvements to the current state of mathematical datasets\, including mathematical adaptations of dataset documentation (e.g.\, datasheets). \n 
URL:https://cmsa.fas.harvard.edu/event/newtech_22625/
LOCATION:Virtual
CATEGORIES:New Technologies in Mathematics Seminar
ATTACH;FMTTYPE=image/png:https://cmsa.fas.harvard.edu/media/1740494700974-e6086db9-08ab-4681-9ecd-580092fe27b62025-1_1.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20250212T140000
DTEND;TZID=America/New_York:20250212T150000
DTSTAMP:20260515T085426
CREATED:20250123T194306Z
LAST-MODIFIED:20250228T212617Z
UID:10003661-1739368800-1739372400@cmsa.fas.harvard.edu
SUMMARY:Discovering Data Structures: Nearest Neighbor Search and Beyond
DESCRIPTION:New Technologies in Mathematics Seminar \nSpeaker: Omar Salemohamed\, Mila \nTitle: Discovering Data Structures: Nearest Neighbor Search and Beyond \nAbstract: As neural networks learn increasingly sophisticated tasks—from image recognition to mastering the game of Go—we ask: can deep learning discover data structures entirely from scratch? We introduce a general framework for data structure discovery\, which adapts to the underlying data distribution and provides fine-grained control over query and space complexity. For nearest neighbor (NN) search\, our model (re)discovers classic algorithms like binary search in one dimension and learns structures reminiscent of k-d trees and locality-sensitive hashing in higher dimensions. Additionally\, the model learns useful representations of high-dimensional data such as images and exploits them to design effective data structures. Beyond NN search\, we believe the framework could be a powerful tool for data structure discovery for other problems and adapt our framework to the problem of estimating frequencies over a data stream. To encourage future work in this direction\, we conclude with a discussion on some of the opportunities and remaining challenges of learning data structures end-to-end.
URL:https://cmsa.fas.harvard.edu/event/newtech_21225/
LOCATION:Virtual
CATEGORIES:New Technologies in Mathematics Seminar
ATTACH;FMTTYPE=image/png:https://cmsa.fas.harvard.edu/media/CMSA-NTM-Seminar-2.12.2025.docx-1.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20241204T140000
DTEND;TZID=America/New_York:20241204T150000
DTSTAMP:20260515T085426
CREATED:20240907T180227Z
LAST-MODIFIED:20241212T205959Z
UID:10003410-1733320800-1733324400@cmsa.fas.harvard.edu
SUMMARY:Can Transformers Reason Logically? A Study in SAT-Solving
DESCRIPTION:New Technologies in Mathematics Seminar \nSpeaker: Leyan Pan\, Georgia Tech \nTitle: Can Transformers Reason Logically? A Study in SAT-Solving \nAbstract: Transformer-based LLMs have apparently demonstrated capabilities that resemble human reasoning. In our recent work\, we investigated the Boolean reasoning abilities of decoder-only Transformers equipped with Chain-of-Thought\, establishing that a Transformer model can decide all 3-SAT instances up to a bounded size (i.e.\, number of variables and clauses). In this talk\, I will first review recent studies that formally examine the expressiveness of Transformer models. Next\, I will explain how we establish an equivalence between Chain-of-Thought reasoning and an algorithm\, in our case\, the DPLL SAT-solving algorithm. I will then discuss how to encode 3-SAT formulas and partial assignments as vectors so that the high-level operations in DPLL can be represented as vector operations and implemented using attention mechanisms within Transformers. Finally\, I will present experimental results that support our theoretical predictions. I will also address why standard Transformers can only solve reasoning problems of bounded length\, leading to failures in length-generalization\, and discuss potential solutions to overcome this limitation.
URL:https://cmsa.fas.harvard.edu/event/newtech_12424/
LOCATION:CMSA Room G10\, CMSA\, 20 Garden Street\, Cambridge\, MA\, 02138\, United States
CATEGORIES:New Technologies in Mathematics Seminar
ATTACH;FMTTYPE=image/png:https://cmsa.fas.harvard.edu/media/CMSA-NTM-Seminar-12.4.24.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20241120T100000
DTEND;TZID=America/New_York:20241120T230000
DTSTAMP:20260515T085426
CREATED:20241017T153402Z
LAST-MODIFIED:20241115T183929Z
UID:10003614-1732096800-1732143600@cmsa.fas.harvard.edu
SUMMARY:Thinking Like Transformers - A Practical Session
DESCRIPTION:New Technologies in Mathematics Seminar \nSpeaker: Gail Weiss\, EPFL \nTitle: Thinking Like Transformers – A Practical Session \nAbstract: With the help of the RASP programming language\, we can better imagine how transformers—the powerful attention based sequence processing architecture—solve certain tasks. Some tasks\, such as simply repeating or reversing an input sequence\, have reasonably straightforward solutions\, but many others are more difficult. To unlock a fuller intuition of what can and cannot be achieved with transformers\, we must understand not just the RASP operations but also how to use them effectively.\nIn this session\, I would like to discuss some useful tricks with you in more detail. How is the powerful selector_width operation yielded from the true RASP operations? How can a fixed-depth RASP program perform arbitrary length long-addition\, despite the equally large number of potential carry operations such a computation entails? How might a transformer perform in-context reasoning? And are any of these solutions reasonable\, i.e.\, realisable in practice? I will begin with a brief introduction of the base RASP operations to ground our discussion\, and then walk us through several interesting task solutions. Following this\, and armed with this deeper intuition of how transformers solve several tasks\, we will conclude with a discussion of what this implies for how knowledge and computations must spread out in transformer layers and embeddings in practice.
URL:https://cmsa.fas.harvard.edu/event/newtech_112024/
LOCATION:Virtual
CATEGORIES:New Technologies in Mathematics Seminar
ATTACH;FMTTYPE=image/png:https://cmsa.fas.harvard.edu/media/CMSA-NTM-Seminar-11.20.24.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20241113T100000
DTEND;TZID=America/New_York:20241113T230000
DTSTAMP:20260515T085426
CREATED:20241017T141250Z
LAST-MODIFIED:20241115T175125Z
UID:10003613-1731492000-1731538800@cmsa.fas.harvard.edu
SUMMARY:Frontier of Formal Theorem Proving with Large Language Models: Insights from the DeepSeek-Prover Series
DESCRIPTION:New Technologies in Mathematics Seminar \nSpeaker: Huajian Xin\, DeepSeek \nTitle: Frontier of Formal Theorem Proving with Large Language Models: Insights from the DeepSeek-Prover Series \nAbstract: Recent advances in large language models have markedly influenced mathematical reasoning and automated theorem proving within artificial intelligence. Yet\, despite their success in natural language tasks\, these models face notable obstacles in formal theorem proving environments such as Lean and Isabelle\, where exacting derivations must adhere to strict formal specifications. Even state-of-the-art models encounter difficulty generating accurate and complex formal proofs\, revealing the unique blend of mathematical rigor required in this domain. In the DeepSeek-Prover series (V1 and V1.5)\, we have explored specialized methodologies aimed at addressing these challenges. This talk will delve into three foundational areas: the synthesis of training data through autoformalization\, reinforcement learning that utilizes feedback from proof assistants\, and test-time optimization using Monte Carlo tree search. I will also provide insights into current model capabilities\, persistent challenges\, and the future potential of large language models in automated theorem proving.
URL:https://cmsa.fas.harvard.edu/event/newtech_111324/
LOCATION:Virtual
CATEGORIES:New Technologies in Mathematics Seminar
ATTACH;FMTTYPE=image/png:https://cmsa.fas.harvard.edu/media/CMSA-NTM-Seminar-11.13.24.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20241106T140000
DTEND;TZID=America/New_York:20241106T150000
DTSTAMP:20260515T085426
CREATED:20241021T164918Z
LAST-MODIFIED:20241108T192620Z
UID:10003617-1730901600-1730905200@cmsa.fas.harvard.edu
SUMMARY:Is Behavior Cloning All You Need? Understanding Horizon in Imitation Learning
DESCRIPTION:New Technologies in Mathematics Seminar \nSpeaker: Dylan Foster\, Microsoft Research \nTitle: Is Behavior Cloning All You Need? Understanding Horizon in Imitation Learning \nAbstract: Imitation learning (IL) aims to mimic the behavior of an expert in a sequential decision making task by learning from demonstrations\, and has been widely applied to robotics\, autonomous driving\, and autoregressive language generation. The simplest approach to IL\, behavior cloning (BC)\, is thought to incur sample complexity with unfavorable quadratic dependence on the problem horizon\, motivating a variety of different online algorithms that attain improved linear horizon dependence under stronger assumptions on the data and the learner’s access to the expert. In this talk\, we revisit the apparent gap between offline and online IL from a learning-theoretic perspective\, with a focus on general policy classes up to and including deep neural networks. Through a new analysis of behavior cloning with the logarithmic loss\, we will show that it is possible to achieve horizon-independent sample complexity in offline IL whenever (i) the range of the cumulative payoffs is controlled\, and (ii) an appropriate notion of supervised learning complexity for the policy class is controlled. When specialized to stationary policies\, this implies that the gap between offline and online IL is smaller than previously thought. We will then discuss implications of this result and investigate the extent to which it bears out empirically. \nBio: Dylan Foster is a principal researcher at Microsoft Research\, New York. Previously\, he was a postdoctoral fellow at MIT\, and received his PhD in computer science from Cornell University\, advised by Karthik Sridharan. His research focuses on problems at the intersection of machine learning\, AI\, and interactive decision making. He has received several awards for his work\, including the best paper award at COLT (2019) and best student paper award at COLT (2018\, 2019). \n 
URL:https://cmsa.fas.harvard.edu/event/newtech_11624/
LOCATION:CMSA Room G10\, CMSA\, 20 Garden Street\, Cambridge\, MA\, 02138\, United States
CATEGORIES:New Technologies in Mathematics Seminar
ATTACH;FMTTYPE=image/png:https://cmsa.fas.harvard.edu/media/CMSA-NTM-Seminar-11.6.24.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20241025T103000
DTEND;TZID=America/New_York:20241025T120000
DTSTAMP:20260515T085426
CREATED:20240912T144420Z
LAST-MODIFIED:20240912T145420Z
UID:10003501-1729852200-1729857600@cmsa.fas.harvard.edu
SUMMARY:Math and Machine Learning Program Discussion
DESCRIPTION:Math and Machine Learning Program Discussion \n 
URL:https://cmsa.fas.harvard.edu/event/mml_meeting_102524/
LOCATION:CMSA Room G10\, CMSA\, 20 Garden Street\, Cambridge\, MA\, 02138\, United States
CATEGORIES:MML Meeting
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20241023T140000
DTEND;TZID=America/New_York:20241023T150000
DTSTAMP:20260515T085426
CREATED:20241021T140701Z
LAST-MODIFIED:20241108T192710Z
UID:10003616-1729692000-1729695600@cmsa.fas.harvard.edu
SUMMARY:How Far Can Transformers Reason? The Globality Barrier and Inductive Scratchpad
DESCRIPTION:New Technologies in Mathematics Seminar \nSpeaker: Aryo Lotfi (EPFL) \nTitle: How Far Can Transformers Reason? The Globality Barrier and Inductive Scratchpad \nAbstract: Can Transformers predict new syllogisms by composing established ones? More generally\, what type of targets can be learned by such models from scratch? Recent works show that Transformers can be Turing-complete in terms of expressivity\, but this does not address the learnability objective. This paper puts forward the notion of ‘globality degree’ of a target distribution to capture when weak learning is efficiently achievable by regular Transformers\, where the latter measures the least number of tokens required in addition to the tokens histogram to correlate nontrivially with the target. As shown experimentally and theoretically under additional assumptions\, distributions with high globality cannot be learned efficiently. In particular\, syllogisms cannot be composed on long chains. Furthermore\, we show that (i) an agnostic scratchpad cannot help to break the globality barrier\, (ii) an educated scratchpad can help if it breaks the globality at each step\, however not all such scratchpads can generalize to out-of-distribution (OOD) samples\, (iii) a notion of ‘inductive scratchpad’\, that composes the prior information more efficiently\, can both break the globality barrier and improve the OOD generalization. In particular\, some inductive scratchpads can achieve length generalizations of up to 6x for some arithmetic tasks depending on the input formatting.
URL:https://cmsa.fas.harvard.edu/event/newtech_102324/
LOCATION:CMSA Room G10\, CMSA\, 20 Garden Street\, Cambridge\, MA\, 02138\, United States
CATEGORIES:New Technologies in Mathematics Seminar
ATTACH;FMTTYPE=application/pdf:https://cmsa.fas.harvard.edu/media/CMSA-NTM-Seminar-10.23.24.docx-1-1.pdf
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20241023T103000
DTEND;TZID=America/New_York:20241023T120000
DTSTAMP:20260515T085426
CREATED:20240911T205240Z
LAST-MODIFIED:20240911T205240Z
UID:10003495-1729679400-1729684800@cmsa.fas.harvard.edu
SUMMARY:Math and Machine Learning Program Discussion
DESCRIPTION:Math and Machine Learning Program Discussion \n 
URL:https://cmsa.fas.harvard.edu/event/mml_meeting_102324/
LOCATION:CMSA Room G10\, CMSA\, 20 Garden Street\, Cambridge\, MA\, 02138\, United States
CATEGORIES:MML Meeting
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20241021T103000
DTEND;TZID=America/New_York:20241021T120000
DTSTAMP:20260515T085426
CREATED:20240911T195747Z
LAST-MODIFIED:20240911T195747Z
UID:10003482-1729506600-1729512000@cmsa.fas.harvard.edu
SUMMARY:Math and Machine Learning Program Discussion
DESCRIPTION:Math and Machine Learning Program Discussion \n 
URL:https://cmsa.fas.harvard.edu/event/mml_meeting_102124/
LOCATION:CMSA Room G10\, CMSA\, 20 Garden Street\, Cambridge\, MA\, 02138\, United States
CATEGORIES:MML Meeting
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20241018T103000
DTEND;TZID=America/New_York:20241018T120000
DTSTAMP:20260515T085426
CREATED:20240912T145729Z
LAST-MODIFIED:20240912T145729Z
UID:10003503-1729247400-1729252800@cmsa.fas.harvard.edu
SUMMARY:Math and Machine Learning Program Discussion
DESCRIPTION:Math and Machine Learning Program Discussion \n 
URL:https://cmsa.fas.harvard.edu/event/mml_meeting_101824/
LOCATION:CMSA Room G10\, CMSA\, 20 Garden Street\, Cambridge\, MA\, 02138\, United States
CATEGORIES:MML Meeting
END:VEVENT
END:VCALENDAR