selproto

SELPROTO

Select prototypes from dataset, generator mapping

[P,I,J] = SELPROTO(A,N,TYPE,PAR,SEED)
[P,I,J] = A*SELPROTO(N,TYPE,PAR,SEED)

Input
A PRTools dataset or double matrix
N Scalar, number of prototypes to be selected. If N is a row vector with as many elements as A has classes, the selection is done class-wise. If 0 < N < 1, the corresponding fraction of A is selected. Default is N = 1.
TYPE Character string naming the algorithm (lower case supported):
'F' or 'FFT', the Farthest First Traversal.
'K' or 'KMEANS', the k-means algorithm. The nearest objects in A are returned. (default)
'M' or 'MMEANS', the traditional k-means returning the cluster means instead of their nearest objects. In I a NaN is returned.
'C' or 'KCENTRES', the k-centres algorithm.
'R' or 'RANDOM', random selection.
PAR Initialisation: an index for an object in A or a character:
'R', random selection.
'D', deterministic selection (default): the object in A nearest to the mean of A.
SEED A desired state of random number generation applied to RANDRESET.

Output
P PRTools dataset, or double matrix in case A is double, containing the selected prototypes. If TYPE is 'M' these are not objects from A and P is a double array.
I The indices of the selected objects in A, P = A(I,:). I = NaN in case TYPE is 'M'.
J Indices of the not-selected objects. J = NaN in case TYPE is 'M'.
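
The call forms listed above can be sketched as follows. This is only an illustration, assuming PRTools is on the path; the class sizes, prototype counts and the seed value 1 are arbitrary choices made for this example.

```matlab
a = gendatb([50 50]);              % two-class banana set, 50 objects per class

% 5 prototypes using the default TYPE ('K') and PAR ('D')
[p,i,j] = selproto(a,5);           % i: selected objects, j: remaining objects

% class-wise selection: 3 prototypes per class by Farthest First Traversal
pf = selproto(a,[3 3],'f');

% a fraction (10%) of A by random selection, with a fixed seed (assumed value)
pr = selproto(a,0.1,'r','d',1);

% the same routine used as a mapping, here with the k-centres algorithm
pc = a*selproto(5,'c');
```

According to the Output section, p should equal a(i,:) here, and i and j together should cover all objects of a.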

Description

This routine selects some possibly interesting objects, e.g. for building a representation set from a feature representation. Except for TYPE = 'M', objects from A are returned. If PAR = 'D', the procedures are deterministic (except for TYPE = 'R'): FFT starts with the object most remote from the dataset mean. The KMEANS algorithms start with the N objects selected by the FFT algorithm. KCENTRES has a greedy, deterministic solution.
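
The idea behind the deterministic Farthest First Traversal can be sketched in plain MATLAB. The code below is not the PRTools implementation; it is a minimal illustration on a made-up data matrix x, using squared Euclidean distances and starting, as described above, from the object most remote from the data mean. The names x, n, sel and rest are hypothetical.

```matlab
% Farthest First Traversal on a plain data matrix (illustrative sketch only)
x = [0 0; 1 0; 0 1; 5 5; 6 6; 10 0];    % made-up 2-D objects, one per row
n = 3;                                   % number of prototypes wanted
m = size(x,1);

% start with the object most remote from the dataset mean
d0 = sum((x - repmat(mean(x,1),m,1)).^2, 2);
[~,first] = max(d0);
sel = first;

% dmin(k): squared distance from object k to its nearest selected prototype
dmin = sum((x - repmat(x(first,:),m,1)).^2, 2);
for k = 2:n
    [~,next] = max(dmin);                % object farthest from the selection
    sel(end+1) = next;
    dnew = sum((x - repmat(x(next,:),m,1)).^2, 2);
    dmin = min(dmin,dnew);               % update nearest-prototype distances
end

rest = setdiff(1:m,sel);                 % indices of the not-selected objects
p = x(sel,:);                            % the selected prototypes
```

Here sel and rest play the roles of the outputs I and J. Each new prototype is the object whose distance to the already selected prototypes is largest, which tends to spread the prototypes over the data.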

If A is a cell array of datasets the command is executed for each dataset separately. Results are stored in cell arrays. For each dataset the random seed is reset, resulting in aligned sets for the generated datasets P if the sets in A were aligned.
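
A hedged sketch of the cell array case, assuming PRTools is on the path. The two datasets are aligned by construction (the second holds one feature of the same objects), and the seed value 1 is an arbitrary choice. That the index output is also returned as a cell array is an assumption based on the statement that results are stored in cell arrays.

```matlab
a = gendatb([50 50]);              % two-class banana set
c = {a, a(:,2)};                   % two aligned datasets: same objects, different features

% random selection of 5 prototypes from each dataset, seed reset per dataset
[p,i] = selproto(c,5,'r','d',1);

% p{1} and p{2} hold the prototypes selected from c{1} and c{2};
% since the seed is reset for each dataset, i{1} and i{2} are expected to match
```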

Example(s)

% compute a dissimilarity-based classifier for a representation set of
% 10 objects using a Minkowski-1 distance.
a = gendatb;
u = selproto(10)*proxm('m',1)*fisherc;
w = a*u;
scatterd(a)
plotc(w)
