DisTools examples: Generalized Dissimilarities
Instead of a representation by dissimilarities between objects, distances to models may be used. Some simple examples will be treated. It is assumed that readers are familiar with PRTools and will consult the following pages where needed:
- PRTools User Guide (see the bottom of the page for a TOC)
- Introduction to DisTools
- Dissimilarity Representation Course
The following packages should be in the Matlab path: PRTools, DisTools, PRDisData.
Possibilities for computing models on the training set are cluster analysis and the computation of subspaces. After a cluster analysis, objects may be represented by some distance measure defined for clusters, e.g. the minimum, the maximum or the mean of the distances to all objects in a cluster. Alternatively, the cluster may be represented by a central point or a subspace.
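The cluster distance measures mentioned above can be illustrated with a small sketch. It is written in plain Python rather than with PRTools, and the names cluster_distances and medoid, as well as the toy matrix, are hypothetical and chosen for illustration only. The medoid (the member with the smallest summed dissimilarity to the other members) is one possible choice of a central point, assumed here.

```python
def cluster_distances(D, clusters, measure="mean"):
    """Return one row per object and one column per cluster.

    D[i][j] is the dissimilarity between objects i and j.
    clusters is a list of lists holding the object indices of each cluster.
    """
    aggregate = {
        "min": min,
        "max": max,
        "mean": lambda values: sum(values) / len(values),
    }[measure]
    return [
        [aggregate([row[j] for j in members]) for members in clusters]
        for row in D
    ]


def medoid(D, members):
    """Member with the smallest summed dissimilarity to all members."""
    return min(members, key=lambda i: sum(D[i][j] for j in members))


# Toy symmetric dissimilarity matrix (illustrative values, not real data).
D = [
    [0.0, 1.0, 4.0, 5.0],
    [1.0, 0.0, 3.0, 4.0],
    [4.0, 3.0, 0.0, 1.0],
    [5.0, 4.0, 1.0, 0.0],
]
clusters = [[0, 1], [2, 3]]

D_min = cluster_distances(D, clusters, "min")
D_max = cluster_distances(D, clusters, "max")
D_mean = cluster_distances(D, clusters, "mean")
centre = medoid(D, [0, 1, 2])
```

Each of D_min, D_max and D_mean has as many columns as there are clusters, so the size of the representation is controlled by the number of clusters instead of the number of training objects.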
Exercise
- Take a dissimilarity dataset, e.g. one of the chickenpieces datasets.
- Compute as a baseline approach its learning curve for the 1-NN rule in dissimilarity space (use clevald and knnc).
- Cluster the training set by a routine that can use dissimilarities as inputs, e.g. kcentres, modeseek or hclust.
- Compute a cluster-based dissimilarity matrix by computing for every object the distance to each cluster.
- Compute learning curves for some classifiers on the new representation and compare them with the baseline approach.
- Repeat for various numbers of clusters.
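The steps of the exercise can be sketched outside PRTools as follows. This is a minimal illustration under stated assumptions, not the PRTools implementation: farthest_first_centres is a simple stand-in for a clustering routine and does not reproduce kcentres, modeseek or hclust; all function names and the toy data are hypothetical. Only a single train/test split is evaluated; repeating the evaluation for growing training sets would give a learning curve.

```python
def farthest_first_centres(D, train, k):
    """Stand-in clustering: pick k centre objects by farthest-first traversal."""
    centres = [train[0]]
    while len(centres) < k:
        # Next centre: the training object farthest from its nearest centre.
        centres.append(max(train, key=lambda i: min(D[i][c] for c in centres)))
    return centres


def assign_clusters(D, train, centres):
    """Assign every training object to its nearest centre."""
    clusters = [[] for _ in centres]
    for i in train:
        nearest = min(range(len(centres)), key=lambda c: D[i][centres[c]])
        clusters[nearest].append(i)
    return clusters


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def represent(D, objects, groups, aggregate=mean):
    """One row per object, one column per group of training objects."""
    return [[aggregate(D[i][j] for j in g) for g in groups] for i in objects]


def one_nn_error(train_X, train_y, test_X, test_y):
    """Error of the 1-NN rule using squared Euclidean distance between rows."""
    errors = 0
    for x, y in zip(test_X, test_y):
        nearest = min(
            range(len(train_X)),
            key=lambda t: sum((a - b) ** 2 for a, b in zip(x, train_X[t])),
        )
        if train_y[nearest] != y:
            errors += 1
    return errors / len(test_y)


# Toy data (illustrative only): six objects on a line, two classes.
positions = [0, 1, 2, 10, 11, 12]
labels = [0, 0, 0, 1, 1, 1]
D = [[abs(p - q) for q in positions] for p in positions]
train, test = [0, 2, 3, 5], [1, 4]
train_y = [labels[i] for i in train]
test_y = [labels[i] for i in test]

# Baseline: dissimilarity space spanned by all training objects.
singletons = [[i] for i in train]
baseline_error = one_nn_error(
    represent(D, train, singletons), train_y,
    represent(D, test, singletons), test_y,
)

# Cluster-based representation for various numbers of clusters.
cluster_errors = {}
for k in (1, 2):
    groups = assign_clusters(D, train, farthest_first_centres(D, train, k))
    cluster_errors[k] = one_nn_error(
        represent(D, train, groups), train_y,
        represent(D, test, groups), test_y,
    )
```

The baseline is treated here as the special case in which every training object forms its own cluster, so both representations pass through the same represent and one_nn_error code and their errors can be compared directly.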
elements: datasets, datafiles, cells and doubles, mappings, classifiers, mapping types.
operations: datasets, datafiles, cells and doubles, mappings, classifiers, stacked, parallel, sequential, dyadic.
user commands: datasets, representation, classifiers, evaluation, clustering, examples, support routines.
introductory examples: Introduction, Scatterplots, Datasets, Datafiles, Mappings, Classifiers, Evaluation, Learning curves, Feature curves, Dimension reduction, Combining classifiers, Dissimilarities.
advanced examples.