UBI seminar (hosted by the Furusawa Lab)
 Date: 12th April, 2023, 14:00-15:30 (JST)
 Place: Room 413 (Faculty of Science Bldg.1, Hongo campus) and Zoom
 Speaker: Dr. John McBride (Ulsan National Institute of Science and Technology)
 Title: Quantifying protein evolution using physical models, experimental data, and machine learning.

Abstract: Progress in the theory of protein evolution has been held back by gaps between model complexity and accuracy, and by a lack of data. Models are typically either cheap but inaccurate, where the low cost facilitates the study of evolution through predictions over many sequences, or accurate but too expensive to run at scale. With the growth of curated databases and high-throughput datasets, it is becoming easier to create simple physical models and test them on experimental data. Likewise, the deep learning revolution has produced algorithms that can make fast, accurate predictions.

I will first introduce AlphaFold (AF), the new structure prediction algorithm from DeepMind, and show that it is accurate enough to predict the effects of single mutations.[1] We show this directly by comparing with structures from the Protein Data Bank, and indirectly by demonstrating that AF can be used to reliably predict changes in fluorescence in GFP and BFP. As an example of AF’s utility, I will show recent results where AF predictions correlate with changes in guanylate kinase activity. To link AF with physical models, I will show that AF can identify evolutionary allosteric coupling, which closely mirrors the propagation of mechanical force through elastic networks.

Next, I will introduce our new theory of molecular discrimination by proteins.[2] We construct a model of protein-ligand binding that is complex, yet computationally efficient. We show that the affinity and specificity of a protein (with respect to a set of ligands) depend non-linearly on flexibility, ligand mismatch, and binding energy, which we summarize in a phase diagram. The key to achieving high specificity is precision: the right amount of flexibility, coupled with an appropriate degree of shape and chemical complementarity. We find that mutations to residues far from the binding site lead to increasingly small changes at the binding site, which enables fine-tuning of protein-ligand interactions; this is more evident in larger proteins, and as such, they are more evolvable and robust. These findings lead to the hypothesis that the need for specificity results in a hard constraint on the minimum size of proteins that can discriminate between certain ligands: proteins are large because specificity requires it.

I will finish by discussing credible future directions for combining physical models/theory, experimental data, and machine learning tools, and how they will transform our understanding of protein evolution.

[1] McBride, Polev, Reinharz, Grzybowski, Tlusty, AlphaFold2 can predict structural and phenotypic effects of single mutations, arXiv (2022), https://arxiv.org/abs/2204.06860
[2] McBride, Eckmann, Tlusty, General theory of specific binding: insights from a genetic-mechano-chemical protein model, Mol. Biol. Evol. (2022), https://doi.org/10.1093/molbev/msac217