Complex Systems and Biological Physics Seminars

Fitness Inference from Genomic Data by Inverse Ising Models

by Dr Hongli Zeng

Europe/Stockholm
112:028 (Nordita South)


Description

The Ising/Potts techniques (known in the biological community as Direct Coupling Analysis (DCA) methods) have been widely studied and have been very successful at predicting protein structure. Going beyond protein data, in this talk I will show how they can be understood in the framework of population genetics, based on genome-scale data.

The genetic composition of a naturally evolving population is considered to result from mutation, selection, genetic drift and recombination. Selection is modeled with single-locus terms (additive fitness) and two-locus terms (pairwise epistatic fitness).
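As an illustration of the selection model described above, the following minimal sketch evaluates a fitness function with additive and pairwise epistatic terms. It assumes each locus is biallelic and encoded as a spin of -1 or +1, which is a common Ising-style convention but is not stated in the abstract; the names `fitness`, `f` and `F` are hypothetical.

```python
import numpy as np

def fitness(s, f, F):
    """Fitness of one genome with additive and pairwise epistatic terms.

    s : length-L sequence of alleles encoded as -1/+1 (assumed encoding)
    f : length-L additive (single-locus) coefficients
    F : L x L symmetric matrix of epistatic (two-locus) coefficients,
        with a zero diagonal
    """
    s = np.asarray(s, dtype=float)
    f = np.asarray(f, dtype=float)
    F = np.asarray(F, dtype=float)
    additive = f @ s
    # F is symmetric, so the factor 0.5 counts each pair (i, j) once.
    epistatic = 0.5 * (s @ F @ s)
    return additive + epistatic

# Small random example (illustrative values only).
rng = np.random.default_rng(0)
L = 5
f_demo = rng.normal(size=L)
upper = np.triu(rng.normal(size=(L, L)), k=1)
F_demo = upper + upper.T          # symmetric, zero diagonal
s_demo = rng.choice([-1, 1], size=L)
print(fitness(s_demo, f_demo, F_demo))
```

With this form, the additive part shifts fitness locus by locus, while the epistatic part rewards or penalizes specific pairs of alleles occurring together.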

We test for the first time whether DCA can be used to infer biological fitness from population-wide whole-genome data in the form of a time series of an evolving population. We generate such data in silico and show that in the Quasi-Linkage Equilibrium (QLE) phase of Kimura, Neher and Shraiman, which pertains at high enough recombination rates and low enough mutation rates, epistatic fitness can be inferred quantitatively correctly by DCA.
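To give a sense of what a DCA-style inference step looks like, here is a minimal sketch of naive mean-field inverse Ising inference, one of the simplest DCA variants. It is not necessarily the variant used in the talk. The sketch assumes genomes encoded as -1/+1 spins; the function name, the `ridge` regularizer and the toy data are hypothetical. The mapping from inferred couplings to epistatic fitness in the QLE phase involves the population-genetic parameters and is not spelled out in this abstract, so the sketch stops at the couplings.

```python
import numpy as np

def naive_mean_field_couplings(samples, ridge=1e-3):
    """Naive mean-field estimate of Ising couplings: J = -C^{-1}.

    samples : (N, L) array of genomes encoded as -1/+1 (assumed encoding)
    ridge   : small diagonal regularizer so that C is invertible
              (an assumption of this sketch)
    """
    X = np.asarray(samples, dtype=float)
    C = np.cov(X, rowvar=False)            # L x L connected correlations
    C = C + ridge * np.eye(C.shape[0])
    J = -np.linalg.inv(C)
    np.fill_diagonal(J, 0.0)               # diagonal entries are not couplings
    return J

# Toy data: loci 0 and 1 are correlated, loci 2 and 3 are independent.
rng = np.random.default_rng(0)
N, L = 5000, 4
genomes = rng.choice([-1, 1], size=(N, L))
agree = rng.random(N) < 0.8
genomes[:, 1] = np.where(agree, genomes[:, 0], -genomes[:, 0])

J_demo = naive_mean_field_couplings(genomes)
print(np.round(J_demo, 2))
```

In this toy example the inferred coupling between loci 0 and 1 should stand out clearly from the others, which stay close to zero.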