PRTools Contents

PRTools User Guide

selproto

SELPROTO

Select prototypes from dataset, generator mapping

   [P,I,J] = SELPROTO(A,N,TYPE,PAR,SEED)
   [P,I,J] = A*SELPROTO(N,TYPE,PAR,SEED)

Input
 A PRTools dataset or double matrix
 N Scalar, number of prototypes to be selected. If N is a row vector with as many elements as A has classes, the selection is done class-wise. If 0 < N < 1, the corresponding fraction of A is selected. Default is N = 1.
 TYPE Character string naming the algorithm (lower case supported):
 'F' or 'FFT', the Farthest First Traversal.
 'K' or 'KMEANS', the k-means algorithm (default). The objects in A nearest to the cluster means are returned.
 'M' or 'MMEANS', the traditional k-means, returning the cluster means instead of their nearest objects. I is returned as NaN.
 'C' or 'KCENTRES', the k-centres algorithm.
 'R' or 'RANDOM', random selection.
 PAR Initialisation: an index for an object in A or a character:
 'R', random selection.
 'D', deterministic selection (default): the object in A nearest to the mean of A.
 SEED A desired state of random number generation applied to RANDRESET.

Output
 P PRTools dataset, or double matrix in case A is double, containing the selected prototypes. If TYPE is 'M', these are not objects from A and P is a double array.
 I The indices of the selected objects in A, P = A(I,:). I = NaN in case TYPE is 'M'.
 J The indices of the not-selected objects. J = NaN in case TYPE is 'M'.
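
The calls below are a minimal sketch of how the three outputs might be used, assuming PRTools is on the MATLAB path. The variable names (p, i, j, r, q, iq) are illustrative only; the calls follow the signature given above.

```matlab
% Requires PRTools on the MATLAB path.
a = gendatb;                      % example two-class dataset
[p,i,j] = selproto(a,10,'k');     % 10 prototypes, k-means, nearest objects in a
r = a(j,:);                       % the objects that were not selected
[q,iq] = selproto(a,[3 5],'f');   % class-wise: 3 and 5 prototypes by FFT
```

With TYPE = 'M' the indices are NaN, so indexing as in a(j,:) only applies to the other types.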

Description

This routine selects some possibly interesting objects, e.g. for building a representation set from a feature representation. Except for TYPE = 'M', objects from A are returned. If PAR = 'D', the procedures are deterministic (except for TYPE = 'R'): FFT starts with the object most remote from the dataset mean. The KMEANS algorithms start with the N objects selected by the FFT algorithm. KCENTRES has a greedy, deterministic solution.
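
To illustrate the Farthest First Traversal mentioned above, here is a minimal sketch in plain MATLAB. It is not the PRTools implementation: it assumes squared Euclidean distances and the start rule described above (the object most remote from the mean), and the names x, n, idx, dmin and rest are hypothetical.

```matlab
% Farthest First Traversal sketch (plain MATLAB, no PRTools needed).
% x: m-by-k data matrix, n: number of prototypes to select.
x = [0 0; 1 0; 0 1; 10 10; 10 11; -8 3];
n = 3;

m = size(x,1);
d2mean = sum(bsxfun(@minus, x, mean(x,1)).^2, 2);  % squared distance to the mean
[~, first] = max(d2mean);                          % start at the most remote object

idx = zeros(n,1);
idx(1) = first;
% dmin(r): squared distance from object r to its nearest selected object
dmin = sum(bsxfun(@minus, x, x(first,:)).^2, 2);
for t = 2:n
    [~, idx(t)] = max(dmin);                       % farthest from all selected so far
    dnew = sum(bsxfun(@minus, x, x(idx(t),:)).^2, 2);
    dmin = min(dmin, dnew);                        % update nearest-selected distances
end

p = x(idx,:);                 % selected prototypes, analogous to P
rest = setdiff((1:m)', idx);  % indices of the not-selected objects, analogous to J
```

Each step adds the object whose distance to its nearest already-selected object is largest, which spreads the prototypes over the data.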

If A is a cell array of datasets, the command is executed for each dataset separately. Results are stored in cell arrays. For each dataset the random seed is reset, resulting in aligned sets for the generated datasets P if the sets in A were aligned.

Example(s)

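 % Compute a dissimilarity-based classifier for a representation set of
 % 10 objects using a Minkowski-1 distance.
 a = gendatb;                            % two-class banana-shaped dataset
 u = selproto(10)*proxm('m',1)*fisherc;  % untrained: prototypes, distances, Fisher
 w = a*u;                                % train the combined mapping on a
 scatterd(a)                             % scatter plot of the data
 plotc(w)                                % overlay the decision boundary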

See also

datasets, mappings, gendat, randreset, prkmeans, kcentres

This file has been generated automatically. If it is hard to read, use the help command in MATLAB.