Abstract: For the first time, we automatically solve, explain, and generate university-level course problems from thirty STEM courses at MIT, Harvard, and Columbia.
We curate a new dataset of course questions and answers across a dozen departments: Aeronautics and Astronautics, Chemical Engineering, Chemistry, Computer Science, Economics, Electrical Engineering, Materials Science, Mathematics, Mechanical Engineering, Nuclear Science, Physics, and Statistics.
We generate new questions and use them in a Columbia University course, and we perform A/B tests demonstrating that these machine-generated questions are indistinguishable from human-written questions and that machine-generated explanations are as useful as human-written explanations, again for the first time.
Our approach consists of five steps (a code sketch of steps (ii) and (iii) follows the list):
(i) Given course questions, turn them into programming tasks;
(ii) Automatically generate programs from the programming tasks using a Transformer model, OpenAI Codex, pre-trained on text and fine-tuned on code;
(iii) Execute the programs to obtain and evaluate the answers;
(iv) Automatically explain the correct solutions using Codex;
(v) Automatically generate new questions that are qualitatively indistinguishable from human-written questions.
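To make the pipeline concrete, here is a minimal sketch of steps (ii) and (iii), assuming the legacy OpenAI Completion API (openai-python < 1.0) and a since-deprecated Codex engine name; the prompt format, helper names, and example question are illustrative rather than the exact ones used in this work.

```python
import contextlib
import io

import openai  # legacy openai-python (< 1.0) Completion API; assumes openai.api_key is set

# Assumed engine name for illustration; hosted Codex engines have since been deprecated.
CODEX_ENGINE = "code-davinci-002"


def question_to_program(question: str) -> str:
    """Step (ii): prompt Codex with the question recast as a programming task."""
    prompt = (
        "# Write a Python program that solves the following problem and prints the answer.\n"
        f"# Problem: {question}\n"
    )
    response = openai.Completion.create(
        engine=CODEX_ENGINE,
        prompt=prompt,
        max_tokens=512,
        temperature=0.0,      # greedy decoding for a single deterministic program
        stop=["# Problem:"],  # stop before the model invents a new problem
    )
    return response["choices"][0]["text"]


def run_program(program: str) -> str:
    """Step (iii): execute the generated program and capture its printed answer."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(program, {})  # isolated namespace; a real pipeline should sandbox execution
    return buffer.getvalue().strip()


if __name__ == "__main__":
    question = "What is the probability that two fair six-sided dice sum to 7?"
    program = question_to_program(question)
    print(run_program(program))
```

The executed program's printed output can then be compared against the course's reference answer, and the same generated program can be fed back to the model as context for the explanation step (iv).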
This work is a significant step forward in applying machine learning to education, automating a considerable part of the work involved in teaching.
Our approach allows questions to be personalized by difficulty level and student background, and it scales to a broad range of courses across the schools of engineering and science.
This is joint work with students and colleagues at MIT, Harvard University, Columbia University, Worcester Polytechnic Institute, and the University of Waterloo.