How do Transformers reason? First principles via automata, semigroups, and circuits
https://youtu.be/g8zdumOAWzw New Technologies in Mathematics Seminar Speaker: Cyril Zhang, Microsoft Research Title: How do Transformers reason? First principles via automata, semigroups, and circuits Abstract: The current "Transformer era" of deep learning is marked by the emergence of combinatorial and algorithmic reasoning capabilities in large sequence models, leading to dramatic advances in natural language understanding, program synthesis, […]