Speaker
Jacob Shenker
Description
The difficulty of predicting the course of evolution, even on the single-
protein scale, stems from: (1) the immense size of sequence space on
which evolutionary trajectories lie; (2) the fact that fitness may be a
complex function of many phenotypes; and (3)
phenotypic heterogeneity, which implies that the fitness of a particular
genotype is determined not solely by its mean phenotypes but by its
full distribution of phenotypes. Collecting the data needed to predict
evolution therefore requires a technique that offers extremely
high-throughput measurements of multiple phenotypes at the
single-cell level.
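To illustrate point (3), here is a minimal sketch showing how two genotypes with the same mean phenotype can differ in fitness when fitness is a nonlinear function of the phenotype. The threshold fitness function, the Gaussian phenotype distributions, and all numerical values are hypothetical and chosen only for illustration; they are not taken from the work described in this talk.

```python
import random
import statistics


def fitness(phenotype, threshold=1.0):
    """Hypothetical threshold fitness: a cell contributes only if its
    phenotype value reaches the threshold (assumption for illustration)."""
    return 1.0 if phenotype >= threshold else 0.0


def mean_fitness(phenotypes):
    """Population fitness as the average of single-cell fitness values."""
    return statistics.fmean(fitness(p) for p in phenotypes)


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Two hypothetical genotypes with the same mean phenotype (0.9)
# but different cell-to-cell variability.
narrow = [rng.gauss(0.9, 0.01) for _ in range(10_000)]
broad = [rng.gauss(0.9, 0.30) for _ in range(10_000)]

# Fitness evaluated at the mean phenotype is the same for both genotypes.
fitness_of_mean_narrow = fitness(statistics.fmean(narrow))
fitness_of_mean_broad = fitness(statistics.fmean(broad))

# Fitness averaged over the single-cell distribution is not.
narrow_fitness = mean_fitness(narrow)
broad_fitness = mean_fitness(broad)

print(fitness_of_mean_narrow, fitness_of_mean_broad)
print(narrow_fitness, broad_fitness)
```

In this toy setting the mean phenotype alone predicts identical fitness for both genotypes, while the single-cell distributions show that only the more variable genotype has cells above the threshold. This is the kind of difference that bulk (population-averaged) measurements cannot resolve.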
We have recently developed the capability to phenotype a million
bacterial strains per day using “mother machine” microfluidic chips, fast
fluorescence microscopes, and a petabyte-scale data analysis pipeline.
We are working towards producing a genotype-to-phenotype map of all
11 million double mutants of GFP, measuring brightness,
maturation kinetics, photobleaching kinetics, propensity to aggregate,
and two-dimensional emission-excitation spectra. We are also
measuring induction curves for millions of double mutants of the lacI
repressor. These datasets will, for the first time, comprehensively
reveal how different biophysical properties trade off against each other
in a local fitness neighborhood of a protein. Data of this kind will
enable the engineering of variants of these proteins with specific desired
properties; more generally, they will inform theoretical models of
protein evolution dynamics by characterizing epistasis and tradeoffs in
real proteins.
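As a rough illustration of what characterizing epistasis from double-mutant data can mean, the sketch below computes pairwise epistasis as the deviation of a double mutant from the sum of its two single-mutant effects. This additive-scale definition is one common convention, not necessarily the one used in this work, and the function name and phenotype values are hypothetical.

```python
def pairwise_epistasis(f_wt, f_a, f_b, f_ab):
    """Additive-scale pairwise epistasis.

    f_wt: phenotype of the wild type
    f_a, f_b: phenotypes of the two single mutants
    f_ab: phenotype of the double mutant

    Returns the double-mutant effect minus the sum of the single-mutant
    effects; zero means the two mutations combine additively.
    """
    effect_a = f_a - f_wt
    effect_b = f_b - f_wt
    effect_ab = f_ab - f_wt
    return effect_ab - (effect_a + effect_b)


# Hypothetical phenotype values (e.g. log brightness), for illustration only.
additive = pairwise_epistasis(f_wt=1.0, f_a=0.8, f_b=0.7, f_ab=0.5)
negative = pairwise_epistasis(f_wt=1.0, f_a=0.8, f_b=0.7, f_ab=0.2)

print(additive, negative)
```

Applying a calculation like this to each measured phenotype separately would show whether a given pair of mutations interacts in the same way for, say, brightness and maturation kinetics, which is one way tradeoffs between properties could be examined.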
Primary author
Jacob Shenker