BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//CMSA - ECPv6.15.18//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-WR-CALNAME:CMSA
X-ORIGINAL-URL:https://cmsa.fas.harvard.edu
X-WR-CALDESC:Events for CMSA
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20230312T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20231105T060000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20240310T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20241103T060000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20250309T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20251102T060000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20241028T090000
DTEND;TZID=America/New_York:20241030T170000
DTSTAMP:20260417T225829Z
CREATED:20240105T032648Z
LAST-MODIFIED:20241106T191859Z
UID:10001111-1730106000-1730307600@cmsa.fas.harvard.edu
SUMMARY:Mathematics and Machine Learning Closing Workshop
DESCRIPTION:Mathematics and Machine Learning Closing Workshop \nDates: October 28–30\, 2024 \nLocation: Room G10\, CMSA\, 20 Garden Street\, Cambridge\, MA \nThe closing workshop will provide a forum for discussing the most current research in these areas\, including work in progress and recent results from program participants. We will devote one day to frontier topics in interactive theorem proving\, such as mathematical library development and AI for mathematical search and theorem proving. \n  \nYouTube Playlist \n \nOrganizers \n\nFrançois Charton (Meta AI)\nMichael R. Douglas (Harvard CMSA)\nMichael Freedman (Harvard CMSA)\nFabian Ruehle (Northeastern)\nGeordie Williamson (Univ. of Sydney)\n\nSpeakers \n\nAnkit Anand\, Google DeepMind Montreal\nJeremy Avigad\, Carnegie Mellon University\nAngelica Babei\nMatej Balog\, DeepMind\nGergely Bérczi\, Aarhus University\nTristan Buckmaster\, New York University\nGiorgi Butbaia\, University of New Hampshire\nEdgar Costa\, MIT\nAlex Davies\, DeepMind\nBin Dong\, Beijing International Center for Mathematical Research\nKit Fraser-Taliente\, University of Oxford\nJavier Gomez-Serrano\, Brown University\nJim Halverson\, Northeastern University\nThomas Harvey\, MIT\nAmaury Hayat\, École des Ponts ParisTech\nYang-Hui He\, City University of London\nJürgen Jost\, Max Planck Institute for Mathematics in the Sciences\nPetros Koumoutsakos\, Harvard University\nKyu-Hwan Lee\, University of Connecticut\nDavid Lowry-Duda\, ICERM\nStéphane Mallat\, Flatiron/Collège de France\nAbbas Mehrabian\, Google DeepMind Montreal\nCengiz Pehlevan\, Harvard University\nFabian Ruehle\, Northeastern University\nEric Vanden-Eijnden\, Courant/NYU\nAdam Wagner\, Worcester Polytechnic Institute\nMelanie Matchett Wood\, Harvard University\n\n  \nSchedule (download PDF) \nMonday Oct. 
28\, 2024 \n9:00–9:30 am Morning refreshments \n9:30–9:45 am Introductions \n9:45–10:45 am Jürgen Jost\, Max Planck Institute for Mathematics in the Sciences \nTitle: Data visualization with category theory and geometry \nAbstract: While data often come in a high-dimensional feature space\, they typically exhibit intrinsic constraints and regularities\, and they can therefore often be represented on a lower-dimensional\, but possibly highly curved\, Riemannian manifold. Still\, for visualization purposes\, the dimension needs to be lowered to 2 or 3. We present the mathematical foundations for such schemes\, in particular UMAP\, and describe an improved method. \n10:45–11:00 am Break \n11:00 am–12:00 pm Ankit Anand\, Google DeepMind Montreal\, Abbas Mehrabian\, Google DeepMind Montreal \nTitle: From Theorem Proving to Disproving: Modern machine learning versus classical heuristic search in automated theorem proving and extremal graph theory \nAbstract: Machine learning is widely believed to outperform classical methods\, but this is not always the case. First\, we describe how we adapted the idea of hindsight experience replay from reinforcement learning to the automated theorem proving domain\, so as to use the intermediate data generated during unsuccessful proofs. We show that provers trained in this way can outperform previous machine learning approaches and compete with the state-of-the-art heuristic-based theorem prover E in its best configuration\, on the popular benchmarks MPTP2078\, M2k\, and Mizar40. The proofs generated by our algorithm are also almost always significantly shorter than E’s proofs. Based on this paper\, which was presented at ICML 2022: https://proceedings.mlr.press/v162/aygun22a.html. Second\, we study a central extremal graph theory problem inspired by a 1975 conjecture of Erdős\, which aims to find graphs of a given size (number of nodes) that maximize the number of edges without containing 3- or 4-cycles. 
We formulate this problem as a sequential decision-making problem and compare AlphaZero\, a neural-network-guided tree search\, with tabu search\, a heuristic local search method. With either method\, by introducing a curriculum—jump-starting the search for larger graphs using good graphs found at smaller sizes—we improve the state-of-the-art lower bounds for several sizes. Joint work with Tudor Berariu\, Joonkyung Lee\, Anurag Murty Naredla\, Adam Zsolt Wagner\, and other colleagues at Google DeepMind. Based on this paper\, which was presented at IJCAI 2024: https://arxiv.org/abs/2311.03583. \n12:00–1:30 pm Lunch Break \n1:30–2:30 pm Fabian Ruehle\, Northeastern University\, Giorgi Butbaia\, University of New Hampshire \nTitle: Rigorous results from ML using RL \nAbstract: We explain how to use reinforcement learning in mathematics to obtain provably correct results. After a brief introduction to reinforcement learning\, I will illustrate the idea using an example from number theory\, where we solve a Diophantine equation related to string theory\, and two from knot theory. The first knot theory problem is to identify unknots\, while the second is concerned with identifying so-called ribbon knots. The latter play an important role in the search for counterexamples to the smooth Poincaré conjecture. \n2:30–2:45 pm Break \n2:45–3:45 pm Cengiz Pehlevan\, Harvard University \nTitle: Solvable Models of Scaling and Emergence in Deep Learning \n3:45–4:00 pm Break \n4:00–4:30 pm Matej Balog\, DeepMind (via Zoom) \nTitle: FunSearch: Mathematical discoveries from program search with large language models \nAbstract: Large language models (LLMs) have demonstrated tremendous capabilities in solving complex tasks\, from quantitative reasoning to understanding natural language. However\, LLMs sometimes suffer from confabulations (or hallucinations)\, which can result in them making plausible but incorrect statements. 
This hinders the use of current large models in scientific discovery. We introduce FunSearch (short for searching in the function space)\, an evolutionary procedure based on pairing a pretrained LLM with a systematic evaluator. We demonstrate the effectiveness of this approach by surpassing the best-known results on important problems\, pushing the boundary of existing LLM-based approaches. Applying FunSearch to a central problem in extremal combinatorics—the cap set problem—we discover new constructions of large cap sets going beyond the best-known ones\, in both the finite-dimensional and asymptotic cases. This shows that it is possible to make discoveries on established open problems using LLMs. We showcase the generality of FunSearch by applying it to an algorithmic problem\, online bin packing\, finding new heuristics that improve on widely used baselines. In contrast to most computer search approaches\, FunSearch searches for programs that describe how to solve a problem\, rather than what the solution is. Beyond being an effective and scalable strategy\, the discovered programs tend to be more interpretable than raw solutions\, enabling feedback loops between domain experts and FunSearch\, and the deployment of such programs in real-world applications. \n4:30–5:00 pm Edgar Costa\, MIT \nTitle: Machine learning L-functions \nAbstract: We report on multiple experiments related to L-functions data. L-functions are complex functions that encode significant information about number theory and algebraic geometry\, playing a crucial part in the Langlands program\, a foundational set of conjectures connecting number theory with other mathematical domains. We focused on two L-function datasets. The first includes about 250k rational L-functions of small arithmetic complexity with diverse origins. Multiple dimensionality reduction techniques were used to analyze invariants and behavioral trends\, focusing on how differing origins impact the results. 
The second dataset is composed of L-functions associated with Maass forms. Although these L-functions are non-rational\, they share the low arithmetic complexity of the first dataset. The crux of our investigation here is determining whether this set manifests characteristics similar to the first one. Based on this exploration\, we propose a simple heuristic method to deduce their Fricke sign\, an invariant unknown for 40% of the data. This is joint work with: Joanna Biere\, Giorgi Butbaia\, Alyson Deines\, Kyu-Hwan Lee\, David Lowry-Duda\, Tom Oliver\, Tamara Veenstra\, and Yidi Qi. \n  \n  \nTuesday Oct. 29\, 2024 \n9:15–9:45 am Morning refreshments \n9:45–10:45 am Yang-Hui He\, London Institute for Mathematical Sciences (via Zoom) \nTitle: AI-assisted mathematics \nAbstract: We argue that AI can assist mathematics in three ways: theorem proving\, conjecture formulation\, and language processing. Inspired by initial experiments in geometry and string theory in 2017\, we summarize how this emerging field has grown over the past years\, and show how various machine-learning algorithms can help with pattern detection across disciplines ranging from algebraic geometry to representation theory\, to combinatorics\, and to number theory. At the heart of the program is the question of how AI helps with theoretical discovery\, and the implications for the future of mathematics. \n10:45–11:00 am Break \n11:00–11:30 am Angelica Babei \nTitle: Learning Euler factors of elliptic curves with transformers \nAbstract: The L-function of an elliptic curve is at the core of the BSD conjecture\, and its Euler factors encode important arithmetic information about the curve. For example\, understanding these Euler factors using machine learning techniques has recently led to discovering the phenomenon of murmurations. In this talk\, we present some results on learning Euler factors based on (1) other nearby factors and (2) the Weierstrass equation of the curve. 
\n11:30 am–12:00 pm David Lowry-Duda\, ICERM \nTitle: Exploring patterns in number theory with deep learning: a case study with the Möbius and squarefree indicator functions \nAbstract: We report on experiments using neural networks and Int2Int\, the integer-sequence-to-integer-sequence transformer made by François Charton for this CMSA program. We initially study the Möbius function. This function appears as the coefficients of the reciprocal of the Riemann zeta function and is famously hard to understand. Predicting the Möbius function is closely related to predicting the squarefree indicator function\, leading us to perform similar experiments there. Finally\, we’ll discuss how varying the input representation and model affects the strength of the predictions and allows us to explain most (but not all) of the predictive strength and behavior. \n12:00–1:30 pm Lunch \n1:30–2:30 pm Amaury Hayat\, École des Ponts ParisTech\, Melanie Matchett Wood\, Harvard University\, Alex Davies\, DeepMind\, Jeremy Avigad\, Carnegie Mellon University \nTitle: Machine learning and theorem proving \nAbstract: Recent successes in machine learning have raised hopes that neural networks could one day assist mathematicians in proving theorems. This raises the question of an appropriate setting in which to apply machine learning methods to theorem proving. Formal languages\, such as Lean\, provide automatic verification of mathematical proofs and thus offer a natural environment. Nevertheless\, some challenges emerge\, particularly because these languages are often designed to verify correctness rather than find a solution\, while mathematicians often perform reasoning steps that do both at the same time. This talk will present recent applications of machine learning methods to theorem proving in Lean\, highlight current challenges\, and explore what these developments might mean for the future of mathematics. 
\n  \n2:30–2:45 pm Break \n2:45–3:45 pm Adam Wagner\, Worcester Polytechnic Institute\, Kit Fraser-Taliente\, University of Oxford\, Gergely Bérczi\, Aarhus University\, Thomas Harvey\, MIT \nTitle: Sparse subgraphs of the d-cube with diameter d \nAbstract: Erdős et al. studied spanning subgraphs of the $d$-cube which have the same diameter $d$ as the cube itself. They asked the following natural question: what is the maximum number of edges one can delete from the $d$-dimensional hypercube without increasing its diameter? We will discuss how we can use PatternBoost\, a simple machine learning algorithm that alternates local and global optimization steps\, to find good constructions for this problem. \n3:45–4:00 pm Break \n4:00–4:30 pm Petros Koumoutsakos\, Harvard University \n4:30–5:00 pm Stéphane Mallat\, Flatiron/Collège de France \nTitle: Image Generation by Score Diffusion and the Renormalisation Group \nAbstract: Score-based diffusions generate impressive models of images\, sounds\, and complex physical systems. Are they generalising or memorising? How can deep networks estimate high-dimensional scores without the curse of dimensionality? This talk shows that generalisation does occur for deep-network estimation of scores\, given enough training data. The ability to avoid the curse of dimensionality seems to rely on multiscale properties revealed by a renormalisation-group decomposition coming from statistical physics. Applications to models of turbulence will be introduced and discussed. \n  \nWednesday Oct. 30\, 2024 \n9:15–9:45 am Morning refreshments \n9:45–10:45 am Bin Dong\, Beijing International Center for Mathematical Research (via Zoom) \nTitle: AI for Mathematics: From Digitization to Intelligentization \nAbstract: This presentation explores the synergistic relationship between AI and mathematics\, beginning with a brief historical overview of their mutually beneficial interactions. 
It then examines notable existing work in AI for mathematics\, highlighting its achievements and limitations. Next\, I will share preliminary findings from the ongoing AI4M research project at Peking University\, including our work on creating high-quality mathematical datasets through formalization (digitization)\, and our future plans for developing intelligent applications using these datasets. The presentation concludes with a forward-looking perspective on the opportunities and challenges within this exciting interdisciplinary field. \n10:45–11:00 am Break \n11:00 am–12:00 pm Eric Vanden-Eijnden\, Courant/NYU (via Zoom) \nTitle: Generative modeling with flows and diffusions\, with applications to scientific computing \nAbstract: Generative models based on dynamical transport have recently led to significant advances in unsupervised learning. At the mathematical level\, these models are primarily designed around the construction of a map between two probability distributions that transforms samples from the first into samples from the second. While these methods were first introduced in the context of image generation\, they have found a wide range of applications\, including in scientific computing\, where they offer interesting ways to reconsider complex problems once thought intractable because of the curse of dimensionality. In this talk\, I will discuss the mathematical underpinnings of generative models based on flows and diffusions\, and show how a better understanding of their inner workings can help improve their design. These results indicate how to structure the transport to best reach complex target distributions while maintaining computational efficiency\, at both the learning and sampling stages. 
I will also discuss applications of generative AI in scientific computing\, in particular in the context of applications with models and no data (as opposed to the more standard data and no model)\, such as Monte Carlo sampling\, with applications to statistical mechanics and Bayesian inference\, as well as to the numerical integration and interpretation of random dynamical systems driven out of equilibrium. \n12:00–1:30 pm Lunch \n1:30–2:30 pm Kyu-Hwan Lee\, University of Connecticut \nTitle: Discovering New Mathematical Structures with Machine Learning \nAbstract: Can machine learning help discover new mathematical structures? In this talk\, I will present two case studies: murmurations in number theory\, and loadings of partitions related to Kronecker coefficients in representation theory and combinatorics. The focus will be on the paradigm of examining mathematical objects collectively\, rather than individually\, to create datasets suitable for machine learning experiments and interpretations. \n2:30–2:45 pm Break \n2:45–3:45 pm James Halverson\, Northeastern University \nTitle: Learning the Topological Invariance of Knots \nAbstract: This talk focuses on using machine learning for the defining problem of knot theory: the classification of knots up to ambient space isotopy. We will train transformers and convolutional neural networks to distinguish topologically inequivalent knots\, given only representatives of the classes and no a priori knowledge of topological invariants. In this scheme\, we find that equivalent knots are well clustered in the embedding space of the neural network\, and a trained decoder maps effectively from the embedding space back to knot space. Preliminary results will be presented on a new approach to resolving the Jones unknot conjecture. \n3:45–4:00 pm Break \n4:00–5:00 pm Tristan Buckmaster\, New York University\, Javier Gomez-Serrano\, Brown University (via Zoom) \n  \n  \nImage by Sue Side. https://www.sueside.com/\n 
URL:https://cmsa.fas.harvard.edu/event/mmlworkshop_1024/
LOCATION:Room G10\, CMSA\, 20 Garden Street\, Cambridge\, MA\, 02138\, United States
CATEGORIES:Workshop
ATTACH;FMTTYPE=image/png:https://cmsa.fas.harvard.edu/media/ML_Closing-workshop_v3-1.png
END:VEVENT
END:VCALENDAR