Speaker
Michael Biehl
(University of Groningen)
Description
An introduction to distance-based classification of
multidimensional data is given. The popular Learning Vector
Quantization (LVQ) will serve as the main example in this
talk. Here, typical representatives of the classes
(prototypes) are determined from labelled example data in
a supervised training process. In the working phase, the
prototypes parameterize a classifier which can be applied to
novel, unlabelled data.
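
As a rough illustration of the two phases described above, the following Python sketch trains prototypes with the basic LVQ1 update (the winning prototype is moved towards a correctly labelled example and away from an incorrectly labelled one) and then classifies by the nearest prototype. The abstract does not say which LVQ variant the talk uses, so LVQ1, the squared Euclidean distance, the function names, the learning rate, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def nearest_prototype(x, prototypes):
    """Index of the prototype closest to x (squared Euclidean distance)."""
    d = np.sum((prototypes - x) ** 2, axis=1)
    return int(np.argmin(d))

def train_lvq1(X, y, prototypes, proto_labels, lr=0.05, epochs=30, seed=0):
    """Training phase: adapt prototypes from labelled examples (LVQ1, assumed)."""
    rng = np.random.default_rng(seed)
    W = np.array(prototypes, dtype=float)
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            j = nearest_prototype(X[i], W)
            # Attract the winner if its label matches, repel it otherwise.
            sign = 1.0 if proto_labels[j] == y[i] else -1.0
            W[j] += sign * lr * (X[i] - W[j])
    return W

def classify(X, prototypes, proto_labels):
    """Working phase: assign each sample the label of its nearest prototype."""
    return np.array([proto_labels[nearest_prototype(x, prototypes)] for x in X])

# Toy data (hypothetical): two well-separated Gaussian clusters in 2D.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2.0, 0.5, size=(50, 2)),
               rng.normal(+2.0, 0.5, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)

proto_labels = np.array([0, 1])
W = train_lvq1(X, y, [[-0.5, -0.5], [0.5, 0.5]], proto_labels)
accuracy = float(np.mean(classify(X, W, proto_labels) == y))
```

Only the prototypes and their labels are needed once training is finished, which is what makes the resulting classifier compact and easy to inspect.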
A key issue in LVQ and many related methods is the
choice of a suitable similarity or distance measure.
So-called relevance learning schemes employ parameterized
distance measures which are optimized in the data-driven
training process. The recently introduced Matrix Relevance
LVQ, based on generalized Euclidean distances, will be
discussed in greater detail. It is straightforward to extend
the framework beyond Euclidean measures. As an
important example, the use of statistical divergences in LVQ
is introduced.
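
A generalized Euclidean distance of the kind used in matrix relevance learning is commonly written as d(w, x) = (x - w)^T Lambda (x - w) with Lambda = Omega^T Omega, which keeps the distance non-negative for any Omega. The sketch below assumes this form and a unit-trace normalization of Lambda; the function names and example values are hypothetical, and the training of Omega itself is not shown.

```python
import numpy as np

def matrix_distance(x, w, omega):
    """Generalized squared Euclidean distance (x - w)^T Omega^T Omega (x - w)."""
    diff = omega @ (np.asarray(x, dtype=float) - np.asarray(w, dtype=float))
    return float(diff @ diff)

def relevance_matrix(omega):
    """Lambda = Omega^T Omega, scaled to unit trace (assumed normalization)."""
    lam = omega.T @ omega
    return lam / np.trace(lam)

x = np.array([1.0, 2.0, 3.0])
w = np.array([0.0, 0.0, 0.0])

# With Omega equal to the identity, the plain squared Euclidean distance is recovered.
d_plain = matrix_distance(x, w, np.eye(3))

# A hypothetical Omega that ignores the third feature entirely.
omega = np.diag([1.0, 1.0, 0.0])
d_weighted = matrix_distance(x, w, omega)
lam = relevance_matrix(omega)
```

In this reading, the diagonal of Lambda indicates how strongly each feature contributes to the distance, and the off-diagonal entries weight pairs of features.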
Divergences can serve as generalized distances when data
correspond to positive or normalized measures, as for
instance in the histogram-based classification of image data.
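
To make the idea concrete, the sketch below compares normalized histograms with the Kullback-Leibler divergence and assigns a sample to the prototype with the smallest divergence. The abstract does not name the divergences considered in the talk, so the choice of Kullback-Leibler, the small smoothing constant, and the example histograms are assumptions for illustration only.

```python
import numpy as np

def normalize(h, eps=1e-12):
    """Turn non-negative bin counts into a probability vector (eps avoids log(0))."""
    h = np.asarray(h, dtype=float) + eps
    return h / h.sum()

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) between two histograms."""
    p, q = normalize(p), normalize(q)
    return float(np.sum(p * np.log(p / q)))

def nearest_by_divergence(h, prototypes):
    """Index of the prototype histogram with the smallest divergence from h."""
    return int(np.argmin([kl_divergence(h, w) for w in prototypes]))

# Hypothetical 3-bin color histograms serving as class prototypes.
prototypes = [[60, 30, 10], [10, 30, 60]]

sample = [20, 60, 120]            # same shape as the second prototype, other scale
winner = nearest_by_divergence(sample, prototypes)
d_same = kl_divergence(sample, prototypes[1])
d_other = kl_divergence(sample, prototypes[0])
```

Unlike a distance, a divergence need not be symmetric, so the order of the arguments (data first or prototype first) is itself a modelling choice.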
Matrix Relevance Learning and divergence-based LVQ are
illustrated in terms of a number of real-world applications
from the biomedical domain. These include adrenal tumor
classification based on steroid excretion values and the
detection and classification of plant diseases using color
histograms.