Asymptotic Theory of Attention: In-Context Learning and Sparse Token Detection
CMSA Room G10, 20 Garden Street, Cambridge, MA, United States
Colloquium
Speaker: Yue M. Lu, Harvard University
Title: Asymptotic Theory of Attention: In-Context Learning and Sparse Token Detection
Abstract: Attention-based architectures exhibit striking emergent abilities, from learning tasks directly from context to detecting rare, weak features in long sequences, yet a rigorous theory explaining these behaviors remains limited. In this talk, I will present two recent exactly solvable models that […]