### DisTools examples: Pseudo-Euclidean Embedding

Instead of constructing a dissimilarity space, the dissimilarity matrix may be used for finding an embedding. Non-Euclidean dissimilarities may be embedded in a pseudo-Euclidean space if they are symmetric. Some examples of such an embedding will be shown. It is assumed that readers are familiar with PRTools and will consult the following pages where needed:

- PRTools User Guide (see the bottom of the page for a table of contents)
- Introduction to DisTools
- Dissimilarity Representation Course

The following packages should be in the Matlab path: PRTools, DisTools, PRDisData.

The main DisTools routine for PE embedding is `pe_em`:

`d = chickenpieces(15,60)*makesym;`

`w = d*pe_em; % compute PE mapping`

`getsig(w) % get signature of the mapping`

`x = d*w; % map dismat in PE space`

`getsig(x) % get signature of the data`
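What `pe_em` computes internally is not shown on this page. The following is a minimal plain-Matlab sketch of the standard construction of a pseudo-Euclidean embedding, assuming an eigendecomposition of the centered Gram matrix derived from the squared dissimilarities. The toy matrix `D`, the tolerance `1e-9` and all variable names are illustrative assumptions, not DisTools code.

```matlab
% Plain-Matlab sketch of a pseudo-Euclidean embedding (no PRTools needed).
% D is a small symmetric toy dissimilarity matrix, not the chickenpieces data.
% It violates the triangle inequality (9 > 3 + 2), so it is non-Euclidean.
D = [0 3 4 9;
     3 0 5 2;
     4 5 0 6;
     9 2 6 0];
n = size(D,1);

J = eye(n) - ones(n)/n;          % centering matrix
B = -0.5 * J * (D.^2) * J;       % Gram matrix of the centered configuration
B = (B + B')/2;                  % remove numerical asymmetry

[V,L] = eig(B);                  % B = V*L*V'
lam = diag(L);
keep = abs(lam) > 1e-9*max(abs(lam));   % drop (near-)zero eigenvalues
lam = lam(keep);  V = V(:,keep);

% order: positive eigenvalues first, then negative ones, each by magnitude
pos = find(lam > 0);  neg = find(lam < 0);
[~,ipos] = sort(lam(pos),'descend');
[~,ineg] = sort(abs(lam(neg)),'descend');
idx = [pos(ipos); neg(ineg)];
lam = lam(idx);  V = V(:,idx);

p = numel(pos);  q = numel(neg); % signature (p,q)
X = V * diag(sqrt(abs(lam)));    % embedded objects as rows, n x (p+q)
```

The first `p` columns of `X` span the positive subspace and the last `q` columns the negative one, which is the layout the signature describes.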

Note that many dissimilarity datasets are not symmetric, whereas embedding requires symmetry. `pe_em` symmetrizes by default, but further results are confusing if we map non-symmetric data into a PE space. The routine `makesym` takes care of this (by default by averaging the matrix and its transpose). The signature (the dimensionalities of the positive and negative subspaces) is stored in the mapping as well as in the projected data. The accuracy of a PE mapping and the reconstruction of the dissimilarities can be inspected in the following way:

`[p,q] = getsig(x); % get the signature`

`+d(1:5,1:5).^2 % show part of original data, squared`

`r = distm(x(:,1:p)) - distm(x(:,p+1:p+q)); % recompute squared dissimilarities`

`+r(1:5,1:5) % show and compare`

`mean(mean(abs(+d.^2-(+r)))) % total average difference is small!`

The routine `pe_distm` does exactly the same as the above construct based on `distm` and automatically extracts the signature.
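The reconstruction above subtracts the squared Euclidean distances in the negative subspace from those in the positive subspace. A small plain-Matlab sketch of that rule follows, assuming (as the comparison with `d.^2` suggests) that `distm` returns squared Euclidean distances. The helper `sqeuclid` and the toy coordinates are hypothetical and only stand in for the PRTools routines.

```matlab
% X: objects as rows; the first p columns span the positive subspace,
% the remaining q columns the negative subspace (toy values).
X = [0 0 0;
     3 4 1;
     1 1 2];
p = 2;  q = 1;

% squared Euclidean distances between all rows of A (stand-in for distm)
sqeuclid = @(A) bsxfun(@plus, sum(A.^2,2), sum(A.^2,2)') - 2*(A*A');

% pseudo-Euclidean "squared distance": positive part minus negative part
R = sqeuclid(X(:,1:p)) - sqeuclid(X(:,p+1:p+q));
```

Here `R(1,3)` equals `-2`: in a pseudo-Euclidean space a squared distance may be negative, which cannot happen in a Euclidean space.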

Let us now compute a PE space defined by all data except the first 3 objects and then project these objects into the PE space, followed by a reconstruction of their distances.

`w = d(4:end,4:end)*pe_em; % mapping defined by all but 3 objects`

`x = d(:,4:end)*w; % map all objects`

`+d(1:5,1:5).^2 % show part of original data, squared`

`r = pe_distm(x); % recompute squared dissimilarities`

`+r(1:5,1:5) % show and compare`

Note that the reconstructed distance between objects 4 and 5 is correct, but that all distances with and between objects 1, 2 and 3 are entirely wrong: objects not participating in the construction of the space cannot be mapped correctly. This behavior can be significantly improved by constructing approximate mappings to the PE space that use just a subset of the eigenvectors. One way to do this is by setting a parameter in `pe_em` which determines the retained eigen-fraction:

`alf = 0.99;`

`w = d(4:end,4:end)*pe_em([],alf);`

`x = d(:,4:end)*w; % map all objects`

`+d(1:5,1:5).^2 % show part of original data, squared`

`r = pe_distm(x); % recompute squared dissimilarities`

`+r(1:5,1:5) % show and compare`
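To make the idea of a reduced mapping concrete, here is a plain-Matlab sketch of projecting objects into a PE space built from a subset of the data while keeping only part of the eigenvectors. It assumes that the retained eigen-fraction means the share of the summed eigenvalue magnitudes; the exact criterion used by `pe_em` may differ. The toy matrix and all variable names are illustrative, and `alf` is assumed to be smaller than 1 so that the zero eigenvalue is never selected.

```matlab
% Toy symmetric dissimilarities: the first 4 objects define the space,
% object 5 is projected afterwards (values are illustrative only).
D = [0 3 4 9 5;
     3 0 5 2 4;
     4 5 0 6 3;
     9 2 6 0 7;
     5 4 3 7 0];
tr = 1:4;  alf = 0.9;            % alf: retained eigen-fraction (assumed meaning)

D2 = D(tr,tr).^2;  n = numel(tr);
J  = eye(n) - ones(n)/n;         % centering matrix
B  = -0.5 * J * D2 * J;  B = (B + B')/2;
[V,L] = eig(B);  lam = diag(L);

% rank eigenvalues by magnitude and keep the smallest leading set whose
% summed magnitude reaches the fraction alf of the total
[mag,order] = sort(abs(lam),'descend');
k = find(cumsum(mag) >= alf*sum(mag), 1);
sel = order(1:k);
lam = lam(sel);  V = V(:,sel);

X = V * diag(sqrt(abs(lam)));    % training objects in the reduced space

% project objects from their squared dissimilarities to the training objects
Dn2  = D(:,tr).^2;               % all 5 objects x 4 training objects
Bn   = -0.5 * (Dn2 - ones(size(Dn2,1),1)*mean(D2,1)) * J;
Xall = Bn * V * diag(sign(lam) ./ sqrt(abs(lam)));
```

For the training objects the projection reproduces `X`; only the held-out row of `Xall` is a genuine out-of-sample projection, and its quality is what varies with `alf`.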

#### Exercise 1

Try to find an acceptable value for `alf`.

#### Exercise 2

- Split the dataset 50-50 in both directions, obtaining four matrices `D(S,S)`, `D(S,T)`, `D(T,S)` and `D(T,T)`. `genddat` may be used to find proper, randomized sets of indices `T` and `S`.
- Use `D(T,T)` to find a PE mapping.
- Project `D(S,T)` as well as `D(T,T)` in the PE space.
- Reconstruct `D`, obtaining a distance matrix `R`. Compare the four submatrices of `D` with the corresponding four submatrices of `R`. Define an accuracy measure. Plot the four accuracies as a function of `alf`.
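As a possible starting point for the split, the sketch below works on a plain double matrix and uses `randperm` instead of `genddat`, so it ignores class labels; with a PRTools dataset one would first extract the data with `+d`. The toy matrix and the function handle `acc` are assumptions, and `acc` is only one of many measures that could be defined.

```matlab
% Toy symmetric dissimilarity matrix with zero diagonal (illustrative only)
n = 10;
A = rand(n);  D = (A + A')/2;  D(1:n+1:end) = 0;

% random 50-50 split of the object indices
perm = randperm(n);
T = sort(perm(1:n/2));           % objects that define the PE space
S = sort(perm(n/2+1:end));       % held-out objects

% the four submatrices named in the exercise
DTT = D(T,T);  DST = D(S,T);  DTS = D(T,S);  DSS = D(S,S);

% one possible accuracy measure: mean absolute difference between the
% squared original dissimilarities and a reconstructed submatrix Rsub
acc = @(Dsub,Rsub) mean(mean(abs(Dsub.^2 - Rsub)));
```

Calling `acc` on each of the four submatrices and the matching part of `R`, for a range of `alf` values, gives the curves the exercise asks for.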
