Speaker
Prof.
Max Garzon
(The University of Memphis)
Description
Finding large sets of single DNA strands that do not
crosshybridize to themselves or to their complements is an
important problem in DNA computing, self-assembly, DNA
memories and phylogenetic analyses, because of their error
correction and prevention properties. The problem is in
itself NP-complete, even in very simplified versions using
any single reasonable measure that approximates the Gibbs
energy, thus practically excluding the possibility of
finding any efficient procedure to find maximal sets
efficiently. After a quick survey of advances in this area
in the last few years, we focus on a novel
combinatorial/geometric framework to analyze this problem.
In this framework, codeword design is reduced to finding
large sets of strands maximally separated in DNA spaces and
therefore the size of such sets depends on the geometry of
these DNA spaces. We present a new general technique to
embed DNA spaces in Euclidean spaces and thus, among others,
reduce the word design problem to the well known sphere
packing problem in information theory. The embedding sheds
some insights into the geometry of DNA spaces by enabling a
quantitative analysis via well established approximations of
the Gibbs energy, namely the nearest neighbor model of
duplex formation. The main tool is an efficiently computable
combinatorial approximation which is also a mathematical
metric. As illustration, we show two applications to produce
provably nearly optimal codeword sets (modulo the goodness
of the Gibbs energy approximation) and a new methodology for
phylogenetic analyses in Biology. We conclude with a brief
discussion of some qualitative properties of the Gibbs
energy landscapes for short DNA oligo spaces.
Primary author
Prof.
Max Garzon
(The University of Memphis)