PRTools examples: Mappings

Mappings are, together with datasets, a key element of PRTools. It is assumed that the reader is familiar with the introductory sections of the user guide.

The basic idea of a mapping is that it transforms the representation of objects from one space to another. This can be from a higher-dimensional vector space to a lower one, but also the reverse. Vector spaces might also be rescaled or rotated. A set of objects might also be given in a non-vectorial domain, e.g. images of different sizes, a set of graphs or a set of strings in a datafile. These, too, can be transformed by a mapping. In the following examples we assume the availability of prdatasets and prdatfiles. Add them to the Matlab path or give the following commands to create their directories:

prdatasets    % accept or give the path
prdatfiles    % accept or give the path

Fixed mappings

Observe how the fixed mapping featsel (feature selection) changes the feature size of the dataset:

A = sonar                  % 60 dimensional dataset
B = featsel(A,[2 4 10])    %  3 dimensional dataset
B = A(:,[2,4,10])          %  3 dimensional dataset

A simple rescaling of the features of a dataset by a sigmoid:

delfigs
A = gendatb;
scatterd(A)
figure; scatterd(A*sigm)
showfigs
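
What such a rescaling does to the feature values can be sketched in plain Matlab, without PRTools. The sketch assumes that sigm applies the standard logistic function 1/(1+exp(-x)) to every feature value; the matrix X and its values are made up for illustration.

```matlab
% Plain Matlab sketch, no PRTools needed.
% Assumption: the sigmoid rescaling is the elementwise logistic function.
X = [-5 0 5; -1 0 1];          % made-up feature values, one object per row
S = 1 ./ (1 + exp(-X));        % elementwise logistic sigmoid
disp(S)                        % every value lies strictly between 0 and 1
```

Large negative values are squeezed towards 0 and large positive values towards 1, while the ordering of the values within a feature is preserved.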

The fixed mapping affine is used in many routines. It applies a linear transformation (such as a rotation) and a shift to the data:

delfigs
A = gendatb;
scatterd(A);
B = A*affine([1 1;1 -1],[10 100]);
figure; scatterd(B)
showfigs
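
The arithmetic behind this example can be checked on a few points in plain Matlab. The sketch assumes that the affine mapping computes X*R + offset for objects stored as rows; the names R, offset, X and Y are chosen here for illustration only.

```matlab
% Plain Matlab sketch, no PRTools needed.
% Assumption: the affine mapping computes X*R + offset, objects as rows.
R      = [1 1; 1 -1];          % linear part used in the example
offset = [10 100];             % shift used in the example
X      = [0 0; 1 0; 0 1];      % the origin and the two unit vectors

Y = X*R + repmat(offset, size(X,1), 1);
disp(Y)                        % first row shows where the origin ends up

disp(R*R')                     % 2*eye(2): orthogonal axes, scaled by sqrt(2)
disp(det(R))                   % negative: the transformation also mirrors
```

The origin ends up at [10 100], and the two unit vectors are mapped onto the diagonal directions [1 1] and [1 -1].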

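The data is rotated over 45 degrees (the matrix [1 1;1 -1] also mirrors the data and scales it by sqrt(2)) and the origin is shifted to [10 100].

Fixed mappings can also be applied to images, for example to filter or resize them: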

delfigs
A = kimia
B = selclass(A,[7 11])
show(B,6)
C = B*im_unif([],11)
figure; show(C,6);
C = B*im_resize([],[8 8]) % sampling to 64 pixels
figure; show(C,6);
C = B*im_resize([],[8 8],'cubic') % interpolation to 64 pixels
figure; show(C,6);
showfigs

Exercises

  1. Repeat the above experiment for the kimia_images datafile. Look at what happens to the image sizes in the figures.
  2. Consider the entire kimia dataset. Use testk(A,1) to see its nearest neighbor error. What happens to the error if the images are reduced to 32*32, 16*16, 8*8 and 4*4? How does this depend on the interpolation method?

Trainable mappings

Trainable mappings optimize the mapping for a training set. There are always two steps:

  1. Train the untrained version of the mapping with a training set.
  2. Apply the trained mapping to the same or new data.
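
The two steps can be illustrated in plain Matlab with a hand-made principal component mapping. This is only a minimal sketch of the train/apply idea, not how PRTools implements its mappings; the data values and the names Xtrain, Xnew, mu and V are made up for illustration.

```matlab
% Plain Matlab sketch, no PRTools needed.
Xtrain = [2 0; 4 2; 6 4; 8 6; 10 8];   % made-up training set, objects as rows
Xnew   = [0 0; 5 3];                   % made-up new objects

% Step 1: train. Estimate the mean and the main direction of variation.
mu = mean(Xtrain, 1);
[V, D] = eig(cov(Xtrain));             % eigenvectors of the covariance matrix
[~, order] = sort(diag(D), 'descend'); % largest variance first
V = V(:, order(1));                    % keep the first principal direction

% Step 2: apply. Use the stored mu and V on the same or on new data.
Ytrain = (Xtrain - repmat(mu, size(Xtrain,1), 1)) * V;
Ynew   = (Xnew   - repmat(mu, size(Xnew,1),   1)) * V;
disp(Ynew)
```

Everything learned in step 1 is stored in mu and V, so step 2 never looks at the training set again. The sign of the projected values depends on the sign of the eigenvector returned by eig.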

Feature extraction and classification are well-known examples. Here is the famous eigenface example:

delfigs
A = faces
show(A,20);
W = A*pcam([],20);  % compute first 20 eigenfaces
figure; show(W,5);
B = A*W;            % apply the mapping to the same data
figure; scatterd(B(:,[1:2]))
showfigs

Are the eigenfaces the best ones? Test it yourself with the following exercise:

Exercise

  1. Use testk(B(:,1:n),1) to find the classification error for n = 1:20
  2. Rank the 20 features by the trainable mapping featself: V = B*featself([],[],20)
  3. Inspect the feature ranking by +V
  4. Reorder the features by C = B*V
  5. Use testk(C(:,1:n),1) to find the classification error for n = 1:20
  6. What is wrong?

Another interesting experiment is:

  1. Plot the scatterplot for the first two eigenfaces
  2. Plot the scatterplot for the first two Fisher faces (use fisherm)
  3. Explain the differences
