
Datasets for Math: From AIMO Competitions to Math Copilots for Research
February 26, 2025 @ 2:00 pm - 3:00 pm

New Technologies in Mathematics Seminar
Speaker: Simon Frieder, Oxford
Title: Datasets for Math: From AIMO Competitions to Math Copilots for Research
Abstract: This talk begins with a brief exposition of the AI Mathematical Olympiad (AIMO) on Kaggle, now in its second iteration, outlining datasets and models available to contestants. Taking a broader perspective, I then examine 1) the overarching issues that current datasets suffer from, such as binary evaluation or constrained sets of use cases, and 2) the trajectory they set for competition-style mathematical problem-solving, which differs from mathematical research practice. I argue for a fundamental shift in dataset structure and composition, both for training and evaluation, and introduce the idea of mapping mathematical workflows to data, a key example underscoring the need for this shift. I touch upon new thinking LLMs and their role in redefining LLM math evaluation, highlighting their implications for dataset design. Finally, I propose general improvements to the current state of mathematical datasets, including mathematical adaptations of dataset documentation (e.g., datasheets).