Speaker
Jaakko Hollmen
(Aalto University)
Description
DNA copy number aberrations, i.e. copy number amplifications and
copy number deletions, are hallmarks of nearly all advanced
tumors. We present a collection of genome-wide DNA copy number
amplification data comprising over 4500 cases of human neoplasms.
The data set has been gathered from scientific journal articles
covering a period of ten years and is naturally represented as
0-1 data. We motivate the use of mixture models in
probabilistic clustering of amplification data and
present a mixture model of multivariate Bernoulli distributions
to yield patterns that are relevant to all cancer types. An
appropriate complexity for the mixture model of each chromosome
is selected with a model selection procedure. A methodology to create a
naming scheme for the identified patterns is also presented.
Results are interpreted and the diagnostic value of the findings
is further investigated in the light of background risk factors.
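
To make the modelling step concrete, the following is a minimal sketch of
fitting a mixture of multivariate Bernoulli distributions to 0-1 data with
the EM algorithm. It is an illustration under stated assumptions, not the
speaker's implementation: the function names, the random initialisation,
the fixed iteration count and the clipping constant are all hypothetical
choices made for this example.

```python
import numpy as np


def log_likelihood(X, weights, theta):
    """Total log-likelihood of 0-1 data X under a Bernoulli mixture."""
    # log p(x_i, component k) for every row i and component k
    log_p = X @ np.log(theta).T + (1 - X) @ np.log(1 - theta).T + np.log(weights)
    return float(np.logaddexp.reduce(log_p, axis=1).sum())


def fit_bernoulli_mixture(X, n_components, n_iter=100, seed=0, eps=1e-9):
    """Fit a Bernoulli mixture with EM (illustrative sketch).

    X is an (n_samples, n_features) array of zeros and ones.
    Returns mixing weights, per-component Bernoulli parameters,
    and the final log-likelihood.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    rng = np.random.default_rng(seed)
    weights = np.full(n_components, 1.0 / n_components)
    # Assumption: random start away from 0 and 1 to break symmetry
    theta = rng.uniform(0.25, 0.75, size=(n_components, d))

    for _ in range(n_iter):
        # E-step: posterior probability of each component for each row
        log_p = X @ np.log(theta).T + (1 - X) @ np.log(1 - theta).T + np.log(weights)
        log_norm = np.logaddexp.reduce(log_p, axis=1, keepdims=True)
        resp = np.exp(log_p - log_norm)

        # M-step: re-estimate weights and Bernoulli parameters
        nk = resp.sum(axis=0)
        weights = np.clip(nk / n, eps, None)
        weights /= weights.sum()
        theta = (resp.T @ X + eps) / (nk[:, None] + 2 * eps)
        # Keep parameters strictly inside (0, 1) so the logs stay finite
        theta = np.clip(theta, eps, 1 - eps)

    return weights, theta, log_likelihood(X, weights, theta)
```

Each row of `theta` can be read as one amplification pattern: the probability
that each position is amplified within that cluster. The abstract does not say
which model selection procedure is used; one common option, assumed here only
for illustration, is to fit the model for several values of `n_components` and
compare the log-likelihood on held-out data.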