Gapped Sequences Alignment, Statistical Significance, and Biomolecular Interactions – Yi-Kuo Yu

Date: Mon. November 25th, 2002, 12:30 pm-1:30 pm
Location: Rockefeller 221

By comparing a query against the biological sequences deposited in databases, sequence alignment tools pull out sequences that are potentially similar in function to the query. To quantify the significance of the sequences found, one usually asks for their associated p-values: how probable it is that a completely unrelated sequence would be pulled out “by accident”. This important and difficult problem has been solved only for one specific type of alignment method, gapless alignment, which is incapable of detecting weak homology. For methods that allow gaps, the p-values must be obtained through time-consuming shuffling methods. By employing fundamental concepts from statistical physics, we have made important progress on the statistics of gapped alignment. In the main part of my talk, I will present our recent theoretical advances, including a statistical theory, a few alignment methods, an interesting cooling map, and tests on synthetic data as well as real biological data. Since the purpose of sequence comparison is to identify proteins with similar functionality, one may also ask how much it can help us understand issues such as protein-protein interactions or DNA-protein interactions. I will spend a few moments discussing one of my recent exact results on anti-Wishart random matrices. In addition to direct molecular dynamics simulations, the combination of my new mathematical result and the sequence alignment method will be shown to be a promising procedure for detailed, unbiased classification of biomolecular interactions.
