Learning about natural selection in populations evolving under recombination, with an application to epistasis between SARS-CoV-2 genes

by Erik Aurell (KTH)


The distribution over genotypes in a population is shaped by evolution, a process driven by natural selection, mutation, genetic drift (finite-N noise), and recombination. Motoo Kimura discovered in the mid-1960s that if recombination is the fastest of these mechanisms, the stationary distribution takes the Ising/Potts form familiar from statistical mechanics. With advances in genome sequencing, it is now often possible to sample the distribution over genotypes directly. One can therefore leverage methodological advances in statistical physics and AI to infer properties of natural selection from data.

I will describe how this works in simple models of evolving populations, and how it applies to a large collection of SARS-CoV-2 genomes.
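
As a rough illustration of what inferring an Ising-type distribution from sampled genotypes can look like, the sketch below draws samples from a toy three-locus Ising model and recovers the pairwise couplings with naive mean-field inversion of the covariance matrix. This is an assumption-laden toy, not necessarily the method used in the talk or in the paper cited below: the function names `sample_ising` and `naive_mean_field_couplings`, the `ridge` regulariser, the +/-1 allele encoding, and the toy coupling values are all hypothetical choices made for illustration.

```python
import itertools
import numpy as np


def sample_ising(couplings, n_samples, rng):
    """Draw exact samples from P(s) proportional to exp(sum_{i<j} J_ij s_i s_j).

    All 2^L states are enumerated, so this only works for a handful of loci.
    `couplings` must be symmetric with a zero diagonal.
    """
    n_loci = couplings.shape[0]
    states = np.array(list(itertools.product([-1, 1], repeat=n_loci)), dtype=float)
    # 0.5 * s^T J s equals sum_{i<j} J_ij s_i s_j for symmetric J with zero diagonal.
    log_weights = 0.5 * np.einsum("ki,ij,kj->k", states, couplings, states)
    probs = np.exp(log_weights - log_weights.max())
    probs /= probs.sum()
    idx = rng.choice(len(states), size=n_samples, p=probs)
    return states[idx]


def naive_mean_field_couplings(samples, ridge=0.1):
    """Estimate couplings as the negated inverse of the sample covariance matrix.

    `ridge` (an illustrative regulariser, not taken from the source) is added
    to the diagonal so the covariance matrix stays invertible.
    """
    samples = np.asarray(samples, dtype=float)
    n_loci = samples.shape[1]
    cov = np.cov(samples, rowvar=False, bias=True) + ridge * np.eye(n_loci)
    couplings = -np.linalg.inv(cov)
    np.fill_diagonal(couplings, 0.0)  # self-couplings carry no meaning here
    return couplings


# Toy ground truth (hypothetical): loci 0 and 1 are coupled, locus 2 is independent.
rng = np.random.default_rng(0)
true_couplings = np.zeros((3, 3))
true_couplings[0, 1] = true_couplings[1, 0] = 0.5

samples = sample_ising(true_couplings, n_samples=5000, rng=rng)
inferred = naive_mean_field_couplings(samples)
print(np.round(inferred, 2))
```

In this toy setting the inferred coupling between loci 0 and 1 should stand out clearly from the other two, which stay near zero. Naive mean-field is chosen here only because it fits in a few lines; the estimate is biased by the regulariser and by the mean-field approximation itself.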

The talk is partially based on

Hong-Li Zeng et al., "Global analysis of more than 50,000 SARS-CoV-2 genomes reveals epistasis between eight viral genes", PNAS 117:31519-31526 (December 8, 2020)