PRTools Introductory Examples

Make sure PRTools is in the MATLAB path. The following statement should not result in an error (copy and paste statements like this into the MATLAB command line):

which ldc

Get rid of old figures

delfigs

Take an arbitrary 2D dataset of two classes and plot it

A = gendatb; % The banana set
scatterd(A); % Show a scatterplot

The dataset A is a PRTools ‘object’. The formal name of this variable type is prdataset, which is the name of its constructor. Typing A without a trailing ‘;’ gives some info:

A

It can be converted to a structure by

struct(A)

Its data matrix, holding the feature values of all objects, can be listed by

+A

The structure obtained by struct(A) shows all information that can possibly be stored in a dataset. PRTools routines make use of it in computations and in the annotation of plots.
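As a minimal sketch, the main parts of a dataset might also be retrieved one by one. It assumes that size and getlabels are overloaded for prdataset objects as described in the PRTools documentation; the variable names m, k, X and labels are chosen here for illustration only.

```matlab
A = gendatb;              % the banana set, as above
[m,k] = size(A);          % m objects, k features
X = +A;                   % plain numeric data matrix of size m-by-k
labels = getlabels(A);    % one label per object (assumed PRTools call)
fprintf('%d objects, %d features\n', m, k);
```

Working on X instead of A gives ordinary MATLAB arrays, at the cost of losing the labels and other annotation stored in the dataset.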

The 2 feature values of the first 5 objects can be inspected by

+A(1:5,:)

Mark them in the scatterplot

hold on; scatterd(A(1:5,:),'o');

Compute a simple classifier: Fisher’s Linear Discriminant

W1 = A*fisherc;

Note that in PRTools A*PROC([],PARS) is an alternative to PROC(A,PARS). The advantage of the first notation is that PROC([],PARS) can be stored in a variable and supplied as a parameter to a function. PRTools operations on datasets are called mappings. The formal name of this variable type is prmapping, after the name of its constructor. In the above example fisherc is an untrained mapping. It is trained by the dataset A, resulting in a trained mapping: the classifier W1.
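The two notations can be compared in a short sketch. It only uses routines that appear in this text; the variable names Wa, Wb, U, Wc, ea and eb are illustrative, and the claim that both errors coincide assumes that fisherc is deterministic for a fixed dataset.

```matlab
A  = gendatb;                 % the banana set, as above
Wa = A*fisherc;               % mapping notation: train fisherc by A
Wb = fisherc(A);              % direct call, the PROC(A) form
U  = polyc([],fisherc,3);     % untrained mapping stored in a variable
Wc = A*U;                     % trained later by A
ea = A*Wa*testc;              % training error of Wa
eb = A*Wb*testc;              % training error of Wb
disp([ea eb])                 % both numbers are expected to be equal
```

Storing U separately is what makes it possible to hand an untrained classifier to another routine, as polyc does here with fisherc.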

The error on the training set A can be found by:

A*W1*testc

Here the original dataset A, used for training the classifier, is used for testing it as well. Formally this is not such a good idea (why not?). The classifier can be plotted in the scatterplot

plotc(W1)

Compute a 3rd degree polynomial classifier based on fisherc and plot it

W2 = A*polyc([],fisherc,3);
plotc(W2,'r')

The error on the training set

A*W2*testc

Now, let us split the dataset into a separate part for training and one for testing

[AT,AS] = gendat(A,0.5)              % 50-50 split in trainset and testset
W = AT*{fisherc,polyc([],fisherc,3)} % Train classifiers by AT
testc(AS,W)                          % Test  classifiers by AS
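A single random split gives a test error that depends on which objects happen to end up in AS. As a hedged sketch, the split can be repeated and the errors averaged; the number of repetitions nrep and the variable e are arbitrary choices made here, not part of PRTools.

```matlab
A = gendatb;                   % the banana set, as above
nrep = 10;                     % number of random splits (arbitrary)
e = zeros(nrep,1);             % test error for each split
for i = 1:nrep
  [AT,AS] = gendat(A,0.5);     % new random 50-50 split
  W = AT*fisherc;              % train by AT only
  e(i) = AS*W*testc;           % test by AS
end
fprintf('mean test error %.3f, std %.3f\n', mean(e), std(e));
```

The standard deviation gives a rough impression of how much a single split can be trusted for a dataset of this size.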

Exercises:

(skip these exercises if you continue with the introductory scatterplot examples)

• Repeat the above three statements for a much larger dataset, e.g. of 1000 objects per class, generated by A = gendatb([1000 1000])
• Try other classifiers, e.g. the nearest neighbor classifier knnc or a decision tree, dtc. Note that most classifiers have useful default parameters, e.g. W = A*dtc will do.
• Find in the user guide another 2D dataset, generate data, produce a scatterplot, compute a classifier and plot it.