Abstract: Scaling laws and associated downstream trends can be used as an organizing principle when thinking about current and future ML progress. I will briefly review scaling laws for generative models in a number of domains, emphasizing language modeling. Then I will discuss scaling results for transfer from natural language to code, and results on Python programming performance from Codex and other models. If there's time, I'll discuss prospects for the future: limitations from dataset sizes, and the potential of RL and other techniques.
2022-03-02 18:11 - 19:11