  • Minerva: Solving Quantitative Reasoning Problems with Language Models

    CMSA Room G10 CMSA, 20 Garden Street, Cambridge, MA, United States

    https://youtu.be/HUTWime3d6w New Technologies in Mathematics Seminar Speaker: Guy Gur-Ari, Google Research Title: Minerva: Solving Quantitative Reasoning Problems with Language Models Abstract: Quantitative reasoning tasks, which can involve mathematics, science, and programming, are often challenging for machine learning models in general and for language models in particular. We show that transformer-based language models obtain significantly better performance […]

  • Towards Faithful Reasoning Using Language Models

    CMSA Room G10 CMSA, 20 Garden Street, Cambridge, MA, United States

    New Technologies in Mathematics Seminar Speaker: Antonia Creswell, DeepMind Title: Towards Faithful Reasoning Using Language Models Abstract: Language models are showing impressive performance on many natural language tasks, including question-answering. However, language models – like most deep learning models – are black boxes. We cannot be sure how they obtain their answers. Do they reason […]

  • From Engine to Auto

    CMSA Room G10 CMSA, 20 Garden Street, Cambridge, MA, United States

    https://youtu.be/h3jcSg359E8 New Technologies in Mathematics Seminar Speakers: João Araújo, Mathematics Department, Universidade Nova de Lisboa and Michael Kinyon, Department of Mathematics, University of Denver Title: From Engine to Auto Abstract: Bill McCune produced the program EQP, which deals with first-order logic formulas, and in 1996 managed to solve Robbins' Conjecture. This very powerful tool reduces […]

  • How do Transformers reason? First principles via automata, semigroups, and circuits

    CMSA Room G10 CMSA, 20 Garden Street, Cambridge, MA, United States

    https://youtu.be/g8zdumOAWzw New Technologies in Mathematics Seminar Speaker: Cyril Zhang, Microsoft Research Title: How do Transformers reason? First principles via automata, semigroups, and circuits Abstract: The current "Transformer era" of deep learning is marked by the emergence of combinatorial and algorithmic reasoning capabilities in large sequence models, leading to dramatic advances in natural language understanding, program synthesis, […]

  • Special Lectures on Machine Learning and Protein Folding

    CMSA Room G10 CMSA, 20 Garden Street, Cambridge, MA, United States

    The CMSA hosted a series of three 90-minute lectures on the subject of machine learning for protein folding. Thursday Feb. 9, Thursday Feb. 16, & Thursday March 9, 2023, 3:30–5:00 pm ET. Location: G10, CMSA, 20 Garden Street, Cambridge MA 02138 & via Zoom. Speaker: Nazim Bouatta, Harvard Medical School. Abstract: AlphaFold2, a […]

  • How to steer foundation models?

    CMSA Room G10 CMSA, 20 Garden Street, Cambridge, MA, United States

    https://youtu.be/ztk5TPYTKZA New Technologies in Mathematics Seminar Speaker: Jimmy Ba, University of Toronto Title: How to steer foundation models? Abstract: By conditioning on natural language instructions, foundation models and large language models (LLMs) have displayed impressive capabilities as general-purpose computers. However, task performance depends significantly on the quality of the prompt used to steer the model. […]

  • Toolformer: Language Models Can Teach Themselves to Use Tools

    Virtual

    https://youtu.be/UID_oXuN-0Y New Technologies in Mathematics Seminar Speaker: Timo Schick, Meta AI Title: Toolformer: Language Models Can Teach Themselves to Use Tools Abstract: Language models exhibit remarkable abilities to solve new tasks from just a few examples or textual instructions, especially at scale. They also, paradoxically, struggle with basic functionality, such as arithmetic or factual lookup, where […]

  • Modern Hopfield Networks for Novel Transformer Architectures

    Virtual

    https://youtu.be/5LXiQUsnHrI New Technologies in Mathematics Seminar Speaker: Dmitry Krotov, IBM Research - Cambridge Title: Modern Hopfield Networks for Novel Transformer Architectures Abstract: Modern Hopfield Networks or Dense Associative Memories are recurrent neural networks with fixed point attractor states that are described by an energy function. In contrast to conventional Hopfield Networks, which were popular in […]

  • The TinyStories Dataset: How Small Can Language Models Be And Still Speak Coherent English?

    Virtual

    https://youtu.be/wTQH6mRDXhw New Technologies in Mathematics Seminar Speaker: Ronen Eldan, Microsoft Research Title: The TinyStories Dataset: How Small Can Language Models Be And Still Speak Coherent English? Abstract: While generative language models exhibit powerful capabilities at large scale, when either the model or the number of training steps is too small, they struggle to produce coherent and fluent […]

  • Transformers for maths, and maths for transformers

    CMSA Room G10 CMSA, 20 Garden Street, Cambridge, MA, United States

    https://youtu.be/Sc6k06wVX3s New Technologies in Mathematics Seminar Speaker: François Charton, Meta AI Title: Transformers for maths, and maths for transformers Abstract: Transformers can be trained to solve problems of mathematics. I present two recent applications, in mathematics and physics: predicting integer sequences, and discovering the properties of scattering amplitudes in a close relative of Quantum Chromodynamics. Problems of […]

  • LeanDojo: Theorem Proving with Retrieval-Augmented Language Models

    CMSA Room G10 CMSA, 20 Garden Street, Cambridge, MA, United States

    https://youtu.be/u-pkmdkQoMU New Technologies in Mathematics Seminar Speaker: Alex Gu, MIT Dept. of EE&CS Title: LeanDojo: Theorem Proving with Retrieval-Augmented Language Models Abstract: Large language models (LLMs) have shown promise in proving formal theorems using proof assistants such as Lean. However, existing methods are difficult to reproduce or build on, due to private code, data, and large compute […]

  • Physics of Language Models: Knowledge Storage, Extraction, and Manipulation

    CMSA Room G10 CMSA, 20 Garden Street, Cambridge, MA, United States

    https://youtu.be/M25cbX5do8Y New Technologies in Mathematics Seminar Speaker: Yuanzhi Li, CMU Dept. of Machine Learning and Microsoft Research Title: Physics of Language Models: Knowledge Storage, Extraction, and Manipulation Abstract: Large language models (LLMs) can memorize a massive amount of knowledge during pre-training, but can they effectively use this knowledge at inference time? In this work, we show several striking […]