How do Transformers reason? First principles via automata, semigroups, and circuits
CMSA Room G10, CMSA, 20 Garden Street, Cambridge, MA, United States
https://youtu.be/g8zdumOAWzw
New Technologies in Mathematics Seminar
Speaker: Cyril Zhang, Microsoft Research
Title: How do Transformers reason? First principles via automata, semigroups, and circuits
Abstract: The current "Transformer era" of deep learning is marked by the emergence of combinatorial and algorithmic reasoning capabilities in large sequence models, leading to dramatic advances in natural language understanding, program synthesis, […]