Lucy Colwell (Cambridge University)
Date: Thu. March 3rd, 2022, 4:00 pm-5:00 pm
Location: Via Zoom
Machine learning for biological sequence discovery and design
Prediction of protein function from sequence is a central challenge. Solving it would enable us to discover new proteins with specific functionality. Experimental breakthroughs now allow data on the relationship between sequence and function to be acquired rapidly, and these data can be used to train and validate machine learning models that predict protein function directly from sequence. However, the cost and latency of wet-lab experiments call for methods that find good sequences in only a few experimental rounds, where each round contains large batches of sequence designs. In this setting, I will discuss model-based optimization approaches that allow us to take advantage of sample-inefficient methods and find diverse, optimal sequence candidates for experimental evaluation. The potential of this approach is illustrated through the design and experimental validation of proteins and peptides for therapeutic applications.