From Word Prediction to Complex Skills: Data Flywheels for Mathematical Reasoning
October 16, 2024 @ 2:00 pm - 3:00 pm
New Technologies in Mathematics Seminar
Speaker: Anirudh Goyal (University of Montreal)
Title: From Word Prediction to Complex Skills: Data Flywheels for Mathematical Reasoning
Abstract: This talk examines how large language models (LLMs) evolve from simple word prediction to complex skills, with a focus on mathematical problem solving. A major driver of AI products today is the fact that new skills emerge in language models when their parameter set and training corpora are scaled up. This phenomenon is poorly understood, and a mechanistic explanation via mathematical analysis of gradient-based training seems difficult. The first part of the talk focuses on analysing emergence using the famous (and empirical) Scaling Laws of LLMs. Then I talk about how LLMs can verbalize these skills by assigning labels to problems and clustering them into interpretable categories. This metacognitive ability allows us to leverage skill-based prompting, significantly improving performance on mathematical reasoning. I then present a framework that combines LLMs with human oversight to generate challenging, out-of-distribution math questions. This process led to the creation of the MATH^2 dataset, which enhances both model and human performance, driving further advances in mathematical reasoning capabilities.