New Technologies in Mathematics Seminar
Speaker: Timo Schick, Meta AI
Title: Toolformer: Language Models Can Teach Themselves to Use Tools
Abstract: Language models exhibit remarkable abilities to solve new tasks from just a few examples or textual instructions, especially at scale. They also, paradoxically, struggle with basic functionality, such as arithmetic or factual lookup, where much simpler and smaller models excel. In this talk, we show how these limitations can be overcome by letting language models teach themselves to use external tools via simple APIs. We discuss Toolformer, a model trained to independently decide which APIs to call, when to call them, what arguments to pass, and how to best incorporate the results into future token prediction. Through this, it achieves substantially improved zero-shot performance across a variety of downstream tasks without sacrificing its core language modeling abilities.
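
Illustration: the idea of incorporating API results into text can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the speaker's implementation: the bracketed call notation, the `TOOLS` registry, and the `calculator` and `execute_api_calls` helpers are hypothetical names chosen for this example. It only shows the execution step, where a call already present in the text is run and its result is spliced back in; deciding which calls to make and where is the part the model learns.

```python
import ast
import operator
import re

# Arithmetic operators the illustrative calculator tool accepts.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calculator(expression: str) -> str:
    """Evaluate a basic arithmetic expression without using eval()."""

    def ev(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        raise ValueError("unsupported expression")

    return f"{ev(ast.parse(expression, mode='eval').body):.2f}"


# Hypothetical tool registry: tool name -> function from argument string to result string.
TOOLS = {"Calculator": calculator}

# Assumed inline call notation for this sketch: [ToolName(arguments)]
CALL_PATTERN = re.compile(r"\[(\w+)\(([^)\]]*)\)\]")


def execute_api_calls(text: str) -> str:
    """Run every recognized inline call and append its result inside the brackets."""

    def run(match):
        name, args = match.group(1), match.group(2)
        tool = TOOLS.get(name)
        if tool is None:
            return match.group(0)  # unknown tool: leave the text unchanged
        try:
            result = tool(args)
        except (ValueError, SyntaxError, ZeroDivisionError):
            return match.group(0)  # failed call: leave the text unchanged
        return f"[{name}({args}) -> {result}]"

    return CALL_PATTERN.sub(run, text)


if __name__ == "__main__":
    sample = "Out of 1400 participants, 400 (or [Calculator(400 / 1400)]) passed the test."
    print(execute_api_calls(sample))
```

Keeping the result inside the original text means later tokens can be predicted with the tool output in context, which is the "incorporate the results into future token prediction" step the abstract mentions.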