Complex systems and Biological physics seminar [before December 2013]

Improved prediction of direct contact in protein structure with Potts model

by Cecilia Lövkvist (KTH)

Europe/Stockholm
122:026

Description
Spatially proximate amino acids in a protein tend to coevolve. A protein's 3D structure hence leaves an echo of correlations in the evolutionary record. Reverse engineering 3D structures from such correlations is an open problem in structural biology, pursued with increasing vigor as new protein sequences continue to fill the data banks. Within this task lies a statistical inference problem, rooted in the following: correlation between two sites in a protein sequence can arise from first-hand interaction, but can also be network-propagated via intermediate sites; observed correlation is not enough to guarantee proximity. Separating direct from indirect interactions is in this context referred to as direct-coupling analysis, an instance of the general problem of inverse statistical mechanics, where the task is to learn model parameters (fields, couplings) from observables (magnetizations, correlations, samples) in large systems. We here show that the pseudo-likelihood method, applied to a 21-state Potts model describing the probability distributions of amino acids along the backbone chain within one protein family, significantly outperforms other approaches to direct-coupling analysis. The results are verified by comparing to known crystal structures of specific instances of a family.
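
For readers unfamiliar with the pseudo-likelihood idea, the following is a minimal sketch of the objective for a Potts model, not the speaker's implementation. It assumes sequences are encoded as integers in 0..q-1, fields are stored in an array `h` of shape (L, q), and couplings in an array `J` of shape (L, L, q, q); these names and layouts, and the function names, are illustrative assumptions. Instead of the full likelihood, which requires an intractable partition function, each site is scored by its conditional probability given all other sites. Practical refinements such as regularization and sequence reweighting are left out.

```python
import numpy as np

def site_log_conditionals(seq, h, J):
    """Return log P(s_i = a | all other sites) for each site i and state a.

    seq: integer array of length L with entries in 0..q-1 (assumed encoding)
    h:   fields, shape (L, q)          (hypothetical layout)
    J:   couplings, shape (L, L, q, q) (hypothetical layout)
    """
    L, q = h.shape
    logits = h.astype(float)  # astype returns a copy, so h is not modified
    for i in range(L):
        for j in range(L):
            if j != i:
                # Contribution of site j's observed state to every state of site i.
                logits[i] += J[i, j, :, seq[j]]
    # Numerically stable log-softmax over the q states of each site.
    logits -= logits.max(axis=1, keepdims=True)
    return logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

def neg_pseudo_log_likelihood(seqs, h, J):
    """Average negative pseudo-log-likelihood over the sequences in an alignment."""
    total = 0.0
    for seq in seqs:
        logp = site_log_conditionals(seq, h, J)
        # Pick out the log-probability of the state actually observed at each site.
        total -= logp[np.arange(len(seq)), seq].sum()
    return total / len(seqs)

if __name__ == "__main__":
    L, q = 4, 21  # q = 21 as in the abstract; L = 4 is a toy length
    rng = np.random.default_rng(0)
    seqs = rng.integers(0, q, size=(10, L))  # random toy data, not real sequences
    h = np.zeros((L, q))
    J = np.zeros((L, L, q, q))
    print(neg_pseudo_log_likelihood(seqs, h, J))
```

Minimizing this quantity over `h` and `J` with a generic optimizer gives the inferred couplings; how the couplings are then turned into contact scores is a separate choice not covered by this sketch.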