Operations on Classifiers
Classifiers are a special type of mapping. All operations defined for mappings apply to classifiers as well. What is specific for trained classifiers is that their output space, the space they map to, has axes that correspond to the classes the classifier has been trained for. So the basic mapping definition:
input space --> mapping --> output space
becomes, for a classifier:
input space --> classifier --> classifier space
Applied to a collection of objects represented in these spaces by the datasets A and D, this is coded as D = A*W:
input dataset A --> classifier W --> classification matrix D
The dataset D has as many columns (dimensions) as there are classes defined for W. The class names stored in the classifier W (derived from the dataset used for training W; they can be found by classname(W)) are used for setting the feature labels of D. They can be retrieved by getfeatlab(D).
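As a minimal sketch (assuming PRTools is on the path; gendatb is its banana set generator and ldc a standard linear density based classifier):

    A = gendatb([50 50]);   % two-class dataset, 50 objects per class
    W = A*ldc;              % train a linear density based classifier
    D = A*W;                % classification matrix, one column per class
    classname(W)            % class names stored in the classifier
    getfeatlab(D)           % the same names, set as feature labels of D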
The standard classifiers used in PRTools are of two types: density based and distance based classifiers. The density based classifiers (e.g. qdc and parzenc) output in D the densities of the objects of A, weighted by the class priors as set by the dataset used for training W. The distance based classifiers result in class confidences: numbers between 0 and 1 derived from a sigmoid mapping (sigm) of linearly scaled distances to the classifier. These distances lie between -inf and +inf. The scaling is optimized for the training set by cnormc, using an ML approach that interprets the confidences as class posteriors. The scaled distances can be retrieved by D*invsigm.
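The difference may be illustrated by a small sketch (assuming, as above, gendatb and the standard routines named in this section):

    A  = gendatb([50 50]);
    D1 = A*(A*qdc);         % density based: weighted class densities
    D2 = A*(A*fisherc);     % distance based: confidences between 0 and 1
    X  = D2*invsigm;        % the underlying scaled distances (-inf, +inf)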
The outputs of density based classifiers can be converted to class posteriors (which can also be understood as class confidences) by D = D*classc. This has no influence on the outcomes of distance based classifiers. The following constructs are equivalent (suppose U is an untrained classifier and T a dataset used for training):
    W = T*U; D = A*W; D = D*classc;
    W = T*U; D = A*W*classc;
    W = T*U*classc; D = A*W;
    W = T*(U*classc); D = A*W;
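For a density based classifier the effect of classc can be checked in a short sketch (assuming gendatb and qdc as above; +D retrieves the data array stored in the dataset D):

    A = gendatb([50 50]);
    D = A*(A*qdc)*classc;   % densities converted to class posteriors
    sum(+D,2)               % every row now sums to one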
Like mappings, classifiers can be combined in several ways. They are discussed separately below, and a short sketch of each construction follows the list.
- Stacked combining combines sets of classifiers defined for the same feature space. This is done by a horizontal concatenation: W = [W1 W2 W3 ... Wn]. As D = A*W = A*[W1 W2 W3 ... Wn], the dimensionality of D (number of columns) is equal to n*c, if c is the number of classes for each of the classifiers. Combining rules like D*maxc operate class wise over the columns, so they combine all columns defined for the same class by the combining operator. There are many such operators defined in PRTools, fixed ones (like maxc, prodc and minc) and trainable ones. See the combining classifier page for an overview, and the first sketch after this list.
- Parallel combining combines sets of classifiers defined for different input spaces. This is done by a vertical concatenation of mappings: W = [W1; W2; W3; ...; Wn]. It should be called by a concatenated set of datasets represented in the various feature spaces: D = [A1 A2 A3 ... An]*W. Like for the stacked combiner, this results in a dataset D of n*c columns. The same combining rules apply. See the combining classifier page for an overview, and the second sketch after this list.
- Sequential combining combines a first mapping with a classifier into a new classifier, using the output space of the mapping as input for the classifier. The resulting classifier is defined for the original input space of the mapping. The mapping and the classifier can both be trained or untrained, e.g. U = pca*fisherc. Training by W = T*U first trains the pca and then computes a classifier in the resulting pca-space. Afterwards the mappings are combined such that W can operate on a dataset A by D = A*W, provided that A is defined for the same space as T. See the third sketch after this list.
- Dyadic combining combines two classifiers, or a classifier with doubles (scalars or matrices of the proper size), e.g. W = W1+W2, W = s*W1 in which s is a scalar, or W = W1>W2. See the final sketch after this list.
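A minimal sketch of stacked combining (assuming gendatb and the fixed combiner maxc):

    A = gendatb([50 50]);         % training set
    B = gendatb([20 20]);         % test set in the same feature space
    W = [A*ldc A*qdc A*knnc];     % three classifiers, one feature space
    D = B*W*maxc;                 % combine the n*c columns class wise
    labeld(D)                     % final labels after combining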
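A sketch of parallel combining, constructing two feature spaces by splitting the features of a single dataset (an assumption made for illustration only):

    A  = gendatb([50 50]);        % two-dimensional, two-class data
    A1 = A(:,1); A2 = A(:,2);     % two one-dimensional feature spaces
    W  = [A1*ldc; A2*ldc];        % one classifier per feature space
    D  = [A1 A2]*W*meanc;         % concatenated datasets, mean combiner
    labeld(D)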
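A sketch of sequential combining, following the pca*fisherc example above:

    T = gendatb([50 50]);         % training set
    A = gendatb([20 20]);         % test set in the same space
    U = pca*fisherc;              % untrained mapping * untrained classifier
    W = T*U;                      % trains pca, then fisherc in pca space
    D = A*W;                      % W operates on the original input space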
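Finally, a sketch of dyadic combining, summing two trained classifiers (whether such a sum is sensible depends on the classifier outputs; here both are density based):

    A  = gendatb([50 50]);
    W1 = A*ldc;  W2 = A*qdc;      % two trained, density based classifiers
    W  = W1 + W2;                 % dyadic combination of the two
    D  = A*W*classc;              % renormalize and evaluate as usual
    labeld(D)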
Links
operations: basic, datasets, datafiles, mappings, classifiers, stacked, parallel, sequential, dyadic
commands: datasets, representation, classifiers, evaluation, clustering and regression, examples, support