Stockholm Bioinformatics Center seminars

Using multi-data hidden Markov models trained on local neighborhoods of protein structure to predict residue-residue contacts

by Patrik Björkhom (SBC)

Europe/Stockholm
RB35 (RB35)

Seminar room RB35 (Roslagstullsbacken 35, the SBC house)
Description
We propose a novel hidden Markov model-based method for predicting residue-residue contacts from protein sequences, using as training data homologous sequences, predicted secondary structure and a library of local neighborhoods (local descriptors of protein structure). The library consists of recurring structural entities incorporating short-, medium- and long-range interactions and is general enough to reassemble the cores of most proteins in the PDB. The method is tested on an external test set of 606 domains with no significant sequence similarity to the training set, as well as on 151 domains with SCOP folds not present in the training set. Considering the top L/5 predictions (L = sequence length), our hidden Markov models obtained an accuracy of 22.8% for long-range interactions in new-fold targets, and an average accuracy of 28.6% for long-, medium- and short-range contacts. This is a significant performance increase over state-of-the-art methods.
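
To make the top L/5 evaluation concrete, the following is a minimal sketch of how such an accuracy could be computed for one protein. It is not the speaker's implementation: the function name, the dictionary-of-scores input format, and the sequence-separation cutoff `min_sep` are illustrative assumptions (the abstract does not state how short-, medium- and long-range contacts are delimited), and the data in the example are made up.

```python
def top_l_over_5_accuracy(scores, true_contacts, seq_len, min_sep):
    """Fraction of the top L/5 scored residue pairs that are true contacts.

    scores        -- dict mapping (i, j) residue index pairs to a predicted score
    true_contacts -- set of (i, j) pairs that are in contact in the known structure
    seq_len       -- sequence length L
    min_sep       -- minimum sequence separation |i - j| (assumed cutoff)
    """
    # Keep only pairs far enough apart in sequence for the chosen range class.
    candidates = [
        (score, pair)
        for pair, score in scores.items()
        if abs(pair[0] - pair[1]) >= min_sep
    ]
    # Rank by predicted score, highest first.
    candidates.sort(key=lambda item: item[0], reverse=True)

    # Take the top L/5 predictions (at least one).
    n_top = max(1, seq_len // 5)
    top_pairs = [pair for _, pair in candidates[:n_top]]
    if not top_pairs:
        return 0.0

    hits = sum(1 for pair in top_pairs if pair in true_contacts)
    return hits / len(top_pairs)


# Toy example with invented numbers: L = 20, so the top 4 pairs are evaluated.
example_scores = {
    (0, 10): 0.9,
    (1, 15): 0.8,
    (2, 9): 0.7,
    (3, 19): 0.6,
    (4, 12): 0.5,
    (5, 6): 0.99,  # excluded: sequence separation below min_sep
}
example_true = {(0, 10), (2, 9), (4, 12)}
example_accuracy = top_l_over_5_accuracy(
    example_scores, example_true, seq_len=20, min_sep=6
)
print(example_accuracy)
```

In this toy case two of the four top-ranked pairs are true contacts. A per-protein value like this would then be averaged over the test domains to give a single figure such as those quoted in the abstract.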