Special Lectures on Machine Learning and Protein Folding

2023-02-09 15:30 - 17:00
CMSA Room G10
Address: CMSA, 20 Garden Street, Cambridge, MA 02138 USA

The CMSA will host a series of three 90-minute lectures on the subject of machine learning for protein folding.

Speaker: Nazim Bouatta, Harvard Medical School

Abstract: AlphaFold2, a neural network-based model that predicts protein structures from amino acid sequences, is revolutionizing the field of structural biology. This lecture series, given by a leader of the OpenFold project, which created an open-source version of AlphaFold2, will explain the protein structure problem and the detailed workings of these models, along with many new results and directions for future research.

Thursday, Feb 9, 2023

3:30–5:00 pm ET Lecture 1: A brief intro to protein biology. AlphaFold2's impact on experimental structural biology. Co-evolutionary approaches. Space of ‘algorithms’ for protein structure prediction. Proteins as images (CNNs for protein structure prediction). End-to-end differentiable approaches. Attention and long-range dependencies. AlphaFold2 in a nutshell.


Thursday, Feb 16, 2023

3:30–5:00 pm ET Lecture 2: AlphaFold2 architecture. Turning the co-evolutionary principle into an algorithm: EvoFormer. Structure module and symmetry principles (equivariance and invariance). OpenFold: retraining AlphaFold2 and insights into its learning mechanisms and capacity for generalization. Applications of variants of AlphaFold2 beyond protein structure prediction: AlphaFold Multimer for protein complexes, RNA structure prediction.


Thursday, March 9, 2023

3:30–5:00 pm ET Lecture 3: Limitations of AlphaFold2 and evolutionary ML pipelines. Current single sequence models. Protein language models (LMs): single sequence + LM embeddings. Combining language models with Frenet-Serret construction for protein structure prediction. Applying AlphaFold2 and OpenFold for language models.



Biography: Nazim Bouatta received his doctoral training in high-energy theoretical physics, and transitioned to systems biology at Harvard Medical School, where he received training in cellular and molecular biology in the group of Prof. Judy Lieberman. He is currently a Senior Research Fellow in the Laboratory of Systems Pharmacology led by Prof. Peter Sorger at Harvard Medical School, and an affiliate of the Department of Systems Biology at Columbia, in the group of Prof. Mohammed AlQuraishi. He is interested in applying machine learning, physics, and mathematics to biology at multiple scales. He recently co-supervised the OpenFold project, an optimized, trainable, and completely open-source version of AlphaFold2. OpenFold has paved the way for many breakthroughs in biology, including the release of the ESM Metagenomic Atlas containing over 600 million predicted protein structures.


Chair: Michael Douglas (Harvard CMSA)

Moderators: Farzan Vafa & Sergiy Verstyuk (Harvard CMSA)