An introduction to mixture of experts in deep learning
CMSA Room G10, 20 Garden Street, Cambridge, MA, United States

Member Seminar

Speaker: Samy Jelassi

Title: An introduction to mixture of experts in deep learning

Abstract: Scale has opened new frontiers in natural language processing – but at a high cost. Mixture-of-Experts (MoE) models have been proposed as a path to even larger and more capable language models. They select different parameters for each incoming example. By […]
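To make the per-example routing concrete, below is a minimal sketch of an MoE layer with top-1 gating, written in PyTorch. It is an illustration of the general idea only, not material from the talk; the names (`MoELayer`, `router`, `num_experts`) and the choice of top-1 routing are assumptions made for this example.

```python
# Minimal MoE layer sketch: a learned router sends each token to one
# expert, so different inputs activate different parameters.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, d_model: int, num_experts: int):
        super().__init__()
        # Each expert is an independent feed-forward network.
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, 4 * d_model),
                nn.ReLU(),
                nn.Linear(4 * d_model, d_model),
            )
            for _ in range(num_experts)
        )
        # The router scores each token against every expert.
        self.router = nn.Linear(d_model, num_experts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, d_model)
        gate_probs = F.softmax(self.router(x), dim=-1)   # (num_tokens, num_experts)
        top_prob, top_idx = gate_probs.max(dim=-1)       # top-1 expert per token
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = top_idx == e                          # tokens routed to expert e
            if mask.any():
                # Scale by the gate probability so routing stays differentiable.
                out[mask] = top_prob[mask].unsqueeze(-1) * expert(x[mask])
        return out

# Usage: route a batch of 8 token embeddings through 4 experts.
layer = MoELayer(d_model=16, num_experts=4)
tokens = torch.randn(8, 16)
print(layer(tokens).shape)  # torch.Size([8, 16])
```

Because only one expert runs per token, total parameters grow with the number of experts while per-token compute stays roughly constant, which is the scaling appeal the abstract points to.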