Speaker
Prof.
Sergey Nechaev
(CNRS)
Description
Finding analytically the statistics of the longest common
subsequence (LCS) of a pair of random sequences
drawn from c alphabets is a challenging
problem in computational evolutionary biology. We
present exact asymptotic results for the distribution of
the LCS in a simpler, yet nontrivial, variant of the
original model called the Bernoulli matching (BM) model.
We show that in the BM model, for all c, the
distribution of the asymptotic length of the LCS, suitably
scaled, is identical to the Tracy-Widom distribution of
the largest eigenvalue of a random matrix whose
entries are drawn from a Gaussian unitary ensemble.