### PRTools examples: Classifiers

Classifiers are a special type of trainable mapping. It is assumed that the reader is familiar with the introductory sections of the user guide.

PRTools offers a very large set of classifiers:

- **Linear and quadratic**: fisherc, ldc, loglc, nmc, nmsc, qdc, udc, statslinc
- **Support vector machines**: libsvc, nulibsvc, rblibsvc, pklibsvc, svc, nusvc, rbsvc
- **Neural networks**: bpxnc, lmnc, perlc, rbnc, rnnc, vpc, drbmc
- **Various**: nbayesc, mogc, knnc, statsknnc, parzenc, adaboostc, treec, dtc, statsdtc, randomforestc, naivebc, statsnbc, fdsc
- **Special**: weakc, stumpc, treec

There is also an extensive set of combining rules; some examples are in preparation.

As for all trainable mappings, there are two steps in using a classifier (see the sketch below the list):

- Train the untrained version of the classifier, e.g.
`W = A*qdc([],0.1)`

- Apply the trained classifier to some data, either the same training set, or, for evaluation, an independent test set, e.g.
`D = B*W`

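For example, the following minimal sketch puts both steps together. It assumes the standard PRTools generator `gendatb`, which creates a two-class banana-shaped dataset:

`A = gendatb([50 50]); % training set, 50 objects per class`
`B = gendatb([20 20]); % independent test set`
`W = A*qdc([],0.1);    % train a regularized quadratic classifier`
`D = B*W;              % apply it: classify the test set`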

The classification dataset `D` has `c` columns (features), where `c` is the number of classes. So every classifier is a `k*c` mapping, in which `k` is the dimension of the feature space. There are three types of classification matrices: distances, densities, or confidences (posterior probabilities); see the FAQ on this topic. In any case, the higher the value, the closer the object is to the particular class.

`prdatasets % make sure prdatasets is in the path`
`A = iris % get the Iris dataset`
`[S,T] = gendat(A,[2 2 2]);`
`W = T*{fisherc,qdc};`
`D = S*W;`
`+D{1}`
`+D{2}`

The first classifier, `fisherc`, is distance based. The results for `D{1}` show that it returns confidences (numbers in the interval $$[0,1]$$, summing to 1). By a proper scaling and a sigmoid transform the distances are mapped to confidences. By the inverse sigmoid a back transformation to distances is realized (for a multi-class problem these are the distances to the one-against-rest classifiers):

`+(D{1}*invsigm)`

The second classifier, `qdc`, is density based. The results for `D{2}` show the densities, weighted by the class prior probabilities. By normalization, confidences, or in this case class posteriors, are found:

`+(D{2}*classc)`

The routine `classc` mainly normalizes. If applied to distance based classifiers it has no influence. So users who don't know whether a particular classifier is distance based or density based can always apply `classc` in case confidences are needed. It can already be applied to the untrained versions of the classifiers:

`U = {fisherc*classc, qdc*classc};`
`W = T*U;`
`D = S*W;`
`+D{1}`
`+D{2}`

The columns (features) correspond to the classes of the training set, as can be verified by

`classnames(T)`

`getfeatlab(D{1})`

The classification routine `labeld` finds the assigned classes by determining the maximum over the columns for every object:

`D{1}*labeld`

By comparing these assigned labels to the true labels of the test objects in `S`, the classification accuracies can be estimated:
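For example, using `testc` (introduced below), which estimates the classification error of a classified dataset:

`D{1}*testc % estimated error of fisherc on the test set`
`D{2}*testc % estimated error of qdc on the test set`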

Evaluation routines like `testc`, `testd`, `confmat`, `prcrossval`, `cleval` and `clevalf` make use of it.
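As an illustration, a confusion matrix for the first classifier might be computed as follows; this is a sketch assuming `confmat` accepts a list of true labels and a list of estimated labels:

`confmat(getlabels(S),D{1}*labeld) % true labels versus assigned labels`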
