New Technologies in Mathematics Seminar
Speaker: Cyril Zhang, Microsoft Research
Title: How do Transformers reason? First principles via automata, semigroups, and circuits
Abstract: The current “Transformer era” of deep learning is marked by the emergence of combinatorial and algorithmic reasoning capabilities in large sequence models, leading to dramatic advances in natural language understanding, program synthesis, and theorem proving. What is the nature of these models’ internal representations (i.e. how do they represent the states and computational steps of the algorithms they execute)? How can we understand and mitigate their weaknesses, given that they resist interpretation? In this work, we present some insights (and many further mysteries) through the lens of automata and their algebraic structure.
Specifically, we investigate the apparent mismatch between recurrent models of computation (automata & Turing machines) and Transformers (which are typically shallow and non-recurrent). Using tools from circuit complexity and semigroup theory, we characterize shortcut solutions, whereby a shallow Transformer with only o(T) layers can exactly replicate T computational steps of an automaton. We show that Transformers can efficiently represent these shortcuts in theory; furthermore, in synthetic experiments, standard training successfully finds these shortcuts. We demonstrate that shortcuts can lead to statistical brittleness, and discuss mitigations.
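A rough intuition for why o(T) depth can suffice: an automaton's per-symbol transition maps form a semigroup under composition, and composition is associative, so the T sequential state updates can be reorganized into a parallel prefix scan of depth O(log T). The sketch below (not the construction from the talk; all function names such as prefix_scan and compose are illustrative) checks this log-depth "shortcut" against the naive T-step recurrent simulation on a parity automaton.

```python
# Illustrative sketch: simulating T steps of a finite automaton with a
# log-depth parallel prefix scan over transition maps, rather than T
# recurrent steps. This is one standard way to realize a "shortcut";
# it is not claimed to be the exact construction from the talk.

from typing import Dict, List, Tuple

State, Symbol = int, str

def as_map(delta: Dict[Tuple[State, Symbol], State],
           states: List[State], a: Symbol) -> Tuple[State, ...]:
    # Transition map induced by symbol a: state q -> delta(q, a).
    return tuple(delta[(q, a)] for q in states)

def compose(f: Tuple[State, ...], g: Tuple[State, ...]) -> Tuple[State, ...]:
    # "Apply f, then g". Function composition is associative,
    # which is what licenses the parallel reorganization.
    return tuple(g[f[q]] for q in range(len(f)))

def prefix_scan(maps: List[Tuple[State, ...]]) -> List[Tuple[State, ...]]:
    # Inclusive prefix composition in ceil(log2 T) rounds (Hillis-Steele scan).
    out = list(maps)
    shift = 1
    while shift < len(out):
        out = [out[i] if i < shift else compose(out[i - shift], out[i])
               for i in range(len(out))]
        shift *= 2
    return out

if __name__ == "__main__":
    # Parity automaton over {0, 1}: the state flips on '1', stays put on '0'.
    states = [0, 1]
    delta = {(0, "0"): 0, (1, "0"): 1, (0, "1"): 1, (1, "1"): 0}
    word = "1101001"

    # Log-depth shortcut: compose all prefixes of transition maps, then
    # apply each composite to the start state 0.
    prefixes = prefix_scan([as_map(delta, states, a) for a in word])
    shortcut_states = [m[0] for m in prefixes]

    # Baseline: the T-step recurrent simulation.
    q, recurrent_states = 0, []
    for a in word:
        q = delta[(q, a)]
        recurrent_states.append(q)

    assert shortcut_states == recurrent_states
    print(shortcut_states)
```

In a Transformer, each round of pairwise composition in the scan corresponds roughly to one layer of attention, which is why the depth needed scales with the number of scan rounds rather than with the sequence length T.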
Joint work with Bingbin Liu, Jordan Ash, Surbhi Goel, and Akshay Krishnamurthy.