PRTools examples: Cross validation
Cross validation is a standard technique for the evaluation of pattern recognition systems. It is assumed that the reader is familiar with the introductory sections of the user guide.
There is a specific set of evaluation routines. Here the PRTools routine prcrossval is discussed and illustrated. See also the introductory examples on classifiers and evaluation. In cross validation a single labeled dataset is randomly split into $$n$$ subsets of about the same size, called the folds. All but one of the folds are used for training and the remaining one is used for testing. This is rotated over all $$n$$ folds, so that every sample in the dataset is used exactly once for testing and $$n-1$$ times for training. For sufficiently large $$n$$ the training sets are nearly identical and almost equivalent to the total set. Consequently, cross validation approximates the use of all samples for training as well as for testing, while the test set remains independent of the training set.
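To make the rotation explicit, the following sketch performs the fold loop by hand on a PRTools dataset. It only illustrates what prcrossval automates and is not its actual implementation; the banana dataset and the ldc classifier are arbitrary choices, and the fold assignment here is random but not stratified.

A = gendatb([50 50]);                 % any labeled PRTools dataset
n = 8;                                % number of folds
m = size(A,1);                        % number of objects
foldid = ceil((1:m)'*n/m);            % about equally sized folds
foldid = foldid(randperm(m));         % random, non-stratified fold assignment
e = zeros(n,1);
for f = 1:n
  T = A(find(foldid ~= f),:);         % train on the other n-1 folds
  V = A(find(foldid == f),:);         % test on the left-out fold
  e(f) = V*(T*ldc)*testc;             % error of ldc on this fold
end
mean(e)                               % the cross validation estimate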
The entire process is often repeated a number of times and the results are averaged in order to rule out the effect of the random splitting into folds. PRTools uses the so-called stratified splitting strategy to reduce the variability of the splitting as far as possible. With stratified splitting the split is not fully random: the relative class sizes in the folds are kept as close as possible to those of the original dataset. Another, systematic (non-random) splitting strategy is ‘density preserving splitting’ (DPS), in which case the number of folds should be a power of 2. As it is non-stochastic, there is no need to iterate it. Consequently it is faster.
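As an illustration of stratification (this is a sketch, not PRTools' internal code), the following assigns fold indices class by class, so every fold receives about the same class proportions as the full dataset.

A = gendatb([120 80]);                % a deliberately unbalanced dataset
n = 8;
nlab = getnlab(A);                    % numeric class label of every object
foldid = zeros(size(nlab));
for c = 1:max(nlab)                   % assign folds per class
  J = find(nlab == c);                % objects of class c
  J = J(randperm(length(J)));         % shuffle within the class
  foldid(J) = ceil((1:length(J))'*n/length(J));
end
% every fold now holds about 15 objects of class 1 and 10 of class 2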
In the following experiment we take a single dataset of 200 objects per class and try to predict the performance of the Parzen classifier parzenc by 8-fold cross validation. This is done with 1, 5 and 25 repetitions, and once using density preserving splitting. The results are compared with the ‘true’ performance of the Parzen classifier trained on the full dataset and tested on a larger test set of 2*1000 objects.
A = gendatb([200 200]);               % banana set, 200 objects per class
e1 = prcrossval(A,parzenc,8,1);       % 8-fold cross validation, 1 repetition
e5 = prcrossval(A,parzenc,8,5);       % 5 repetitions
e25 = prcrossval(A,parzenc,8,25);     % 25 repetitions
edps = prcrossval(A,parzenc,8,'DPS'); % density preserving splitting
S = gendatb([1000,1000]);             % large independent test set
etrue = S*(A*parzenc)*testc;          % 'true' error: train on all of A, test on S
disp([etrue,e1,e5,e25,edps])
Repeat this a few times. The results and conclusions will be different every time. In order to obtain a more definite conclusion, run the following experiment.
Exercise
1. Take one of the mfeat datasets, e.g. mfeat_kar.
2. Split it 50-50 in a training and a test set using gendat.
3. Estimate the performance of the ldc classifier on the training set only by 8-fold cross validation, using prcrossval. Do this for 1, 5 (and 25, if feasible) repetitions as well as by density preserving splitting (NREP = 'DPS').
4. Compare the results with those of a single classifier trained on the entire training set and tested on the test set (the ‘true’ error). Compute for each of the 4 cross validation errors the absolute value of its deviation from the true error.
5. Repeat steps 2-4 10 times and report averages and standard deviations of the deviations of the cross validation errors from the true error (a possible skeleton is sketched after this list).
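A possible skeleton for this exercise is sketched below. It assumes the mfeat_kar dataset from the PRTools data collection (prdatasets) is on the path; the 25-repetition run may be slow and can be omitted.

A = mfeat_kar;                          % Multiple Features dataset, 10 classes
dev = zeros(10,4);                      % deviations per repetition and method
for r = 1:10
  [T,S] = gendat(A,0.5);                % 50-50 split in training and test set
  etrue = S*(T*ldc)*testc;              % 'true' error on the independent test set
  e = [prcrossval(T,ldc,8,1), ...       % 8-fold, 1 repetition
       prcrossval(T,ldc,8,5), ...       % 8-fold, 5 repetitions
       prcrossval(T,ldc,8,25), ...      % 8-fold, 25 repetitions (slow)
       prcrossval(T,ldc,8,'DPS')];      % density preserving splitting
  dev(r,:) = abs(e - etrue);            % absolute deviations from the true error
end
disp([mean(dev); std(dev)])             % mean and standard deviation per method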