The TinyStories Dataset: How Small Can Language Models Be And Still Speak Coherent
Virtualhttps://youtu.be/wTQH6mRDXhw New Technologies in Mathematics Seminar Speaker: Ronen Eldan, Microsoft Research Title: The TinyStories Dataset: How Small Can Language Models Be And Still Speak Coherent Abstract: While generative language models exhibit powerful capabilities at large scale, when either the model or the number of training steps is too small, they struggle to produce coherent and fluent […]