Infinite Limits and Scaling Laws for Deep Neural Networks

CMSA Room G10 CMSA, 20 Garden Street, Cambridge, MA, United States

https://youtu.be/0998FJhPdj8
New Technologies in Mathematics Seminar
Speaker: Blake Bordelon
Title: Infinite Limits and Scaling Laws for Deep Neural Networks
Abstract: Scaling up the size and training horizon of deep learning models has enabled breakthroughs in computer vision and natural language processing. Empirical evidence suggests that these neural network models are described by regular scaling laws where performance of […]

Hierarchical data structures through the lenses of diffusion models

CMSA Room G10 CMSA, 20 Garden Street, Cambridge, MA, United States

https://youtu.be/x7LPDDYZn94
New Technologies in Mathematics Seminar
Speaker: Antonio Sclocchi, EPFL
Title: Hierarchical data structures through the lenses of diffusion models
Abstract: The success of deep learning with high-dimensional data relies on the fact that natural data are highly structured. A key aspect of this structure is hierarchical compositionality, yet quantifying it remains a challenge. In […]

From Word Prediction to Complex Skills: Data Flywheels for Mathematical Reasoning

CMSA Room G10 CMSA, 20 Garden Street, Cambridge, MA, United States

https://youtu.be/OYOuSAAE7QQ
New Technologies in Mathematics Seminar
Speaker: Anirudh Goyal (University of Montreal)
Title: From Word Prediction to Complex Skills: Data Flywheels for Mathematical Reasoning
Abstract: This talk examines how large language models (LLMs) evolve from simple word prediction to complex skills, with a focus on mathematical problem solving. A major driver of AI products today is the […]

How Far Can Transformers Reason? The Globality Barrier and Inductive Scratchpad

CMSA Room G10 CMSA, 20 Garden Street, Cambridge, MA, United States

https://youtu.be/C6NDdnSaluU
New Technologies in Mathematics Seminar
Speaker: Aryo Lotfi (EPFL)
Title: How Far Can Transformers Reason? The Globality Barrier and Inductive Scratchpad
Abstract: Can Transformers predict new syllogisms by composing established ones? More generally, what type of targets can be learned by such models from scratch? Recent works show that Transformers can be Turing-complete in terms of […]

Is Behavior Cloning All You Need? Understanding Horizon in Imitation Learning

CMSA Room G10 CMSA, 20 Garden Street, Cambridge, MA, United States

https://youtu.be/KOgh-FFDlvg
New Technologies in Mathematics Seminar
Speaker: Dylan Foster, Microsoft Research
Title: Is Behavior Cloning All You Need? Understanding Horizon in Imitation Learning
Abstract: Imitation learning (IL) aims to mimic the behavior of an expert in a sequential decision making task by learning from demonstrations, and has been widely applied to robotics, autonomous driving, and autoregressive language […]