1–26 Jul 2019
Nordita, Stockholm
Europe/Stockholm timezone

Towards inferring Potts models for evolutionarily correlated sequence data

11 Jul 2019, 14:00
30m
FB52 (Nordita, Stockholm)

Speaker

Pierre Barrat-Charlaix

Description

Global coevolutionary models of protein families have become increasingly popular because they can predict residue-residue contacts from sequence information, and can also predict the fitness effects of amino-acid substitutions or infer protein-protein interactions. The central idea in these models is to construct a probability distribution, a Potts model, that reproduces the single-site and pairwise amino-acid frequencies found in natural sequences of the protein family. This approach treats the sequences of the family as independent samples and completely ignores the phylogenetic relations between them. This simplification is known to potentially bias the estimates of the model parameters, which decreases their biological relevance. Current workarounds for this problem, such as re-weighting sequences, are poorly understood and not principled. Here, we propose an inference scheme that takes the phylogeny of a protein family into account in order to correct biases in the estimated amino-acid frequencies. Using artificial data, we show that a Potts model inferred from these corrected frequencies performs better at predicting contacts and the fitness effects of mutations.
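
To make the quantities in the abstract concrete, the following is a minimal sketch of the re-weighting workaround it mentions, not of the phylogeny-aware scheme proposed in the talk. It assumes the commonly used heuristic in which each sequence is down-weighted by the number of sequences that share at least a given fraction of identical positions with it. The 0.8 threshold, the integer encoding of the alignment, and the function names sequence_weights and weighted_frequencies are illustrative assumptions, not taken from the talk.

```python
import numpy as np


def sequence_weights(msa, max_identity=0.8):
    """Heuristic re-weighting (assumed convention, not the talk's method).

    msa: (M, L) integer array, one aligned sequence per row.
    Returns w with w[m] = 1 / (number of sequences, including m itself,
    whose fraction of identical positions with m is >= max_identity).
    """
    msa = np.asarray(msa)
    # Pairwise fraction of identical positions, shape (M, M).
    identity = (msa[:, None, :] == msa[None, :, :]).mean(axis=2)
    neighbours = (identity >= max_identity).sum(axis=1)
    return 1.0 / neighbours


def weighted_frequencies(msa, weights, q):
    """Weighted single-site and pairwise frequencies of an alignment.

    q: number of states per site (e.g. 21 for amino acids plus gap).
    Returns f_i with shape (L, q) and f_ij with shape (L, q, L, q).
    """
    msa = np.asarray(msa)
    onehot = np.eye(q)[msa]                 # (M, L, q) one-hot encoding
    w = weights / weights.sum()             # normalise weights to sum to 1
    f_i = np.einsum("m,mia->ia", w, onehot)
    f_ij = np.einsum("m,mia,mjb->iajb", w, onehot, onehot)
    return f_i, f_ij


# Toy alignment with q = 3 states: the first two sequences are identical.
toy_msa = [[0, 1, 2, 0],
           [0, 1, 2, 0],
           [1, 0, 2, 1]]
w = sequence_weights(toy_msa)
f_i, f_ij = weighted_frequencies(toy_msa, w, q=3)
```

In this toy alignment the two identical sequences each receive weight 0.5 and the third receives weight 1, so the duplicated sequence no longer dominates the frequencies. A Potts model would then be fitted so that its one- and two-site marginals match f_i and f_ij; that fitting step is not shown here.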

Primary author

Pierre Barrat-Charlaix
