An introduction to mixture of experts in deep learning
Member Seminar

Speaker: Samy Jelassi
Title: An introduction to mixture of experts in deep learning

Abstract: Scale has opened new frontiers in natural language processing – but at a high cost. Mixture-of-Experts (MoE) models have been proposed as a path to even larger and more capable language models: they select different parameters for each incoming example. By […]
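To illustrate the idea in the abstract – selecting different parameters per example – here is a minimal, illustrative sketch of top-k MoE routing in NumPy. The function name `moe_forward` and all shapes are hypothetical, not from the talk: a learned gate scores the experts, and each input is processed only by its top-k experts, with outputs mixed by the gate's softmax weights.

```python
import numpy as np

def moe_forward(x, gate_w, expert_ws, k=1):
    """Toy top-k Mixture-of-Experts layer (illustrative sketch).

    x         : (batch, d_in) input rows
    gate_w    : (d_in, n_experts) gating weights
    expert_ws : list of (d_in, d_out) per-expert weight matrices
    """
    logits = x @ gate_w                        # (batch, n_experts) gate scores
    topk = np.argsort(logits, axis=1)[:, -k:]  # top-k expert indices per row
    out = np.zeros((x.shape[0], expert_ws[0].shape[1]))
    for i in range(x.shape[0]):
        sel = topk[i]
        # softmax over only the selected experts' scores
        w = np.exp(logits[i, sel] - logits[i, sel].max())
        w /= w.sum()
        # each row is transformed only by its chosen experts
        for weight, e in zip(w, sel):
            out[i] += weight * (x[i] @ expert_ws[e])
    return out, topk
```

Because only k of the experts run per example, total parameters can grow with the number of experts while per-example compute stays roughly constant – the scaling argument the abstract alludes to.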