modeclust_batch
MODECLUST_BATCH
KNN mode-seeking clustering, batch processing for large datasets
[LAB, K] = MODECLUST_BATCH(A, K, DIST)
Input
A | Dataset
K | Vector with numbers of neighbours to search for local mode. Default: smart sampling.
DIST | Distance function (name or handle) or mapping to be used for clustering (optional; default: @DISTM). If DIST is a function, it is expected to take two double arrays as input arguments: if D = DIST(A1, A2) then SIZE(D) is [SIZE(A1, 1) SIZE(A2, 1)]. If DIST is a mapping, it should be possible to use it like D = A1*(A2*DIST), i.e. it has to be a trainable mapping and it should be able to convert double arrays to datasets automatically. E.g., for PROXM it is possible to call MODECLUST(A, K, PROXM()), but for DISTM we need to use MODECLUST(A, K, @DISTM) or MODECLUST(A, K, 'distm').
Output
LAB | Indices of mode samples
K | The K vector that was used (useful if K was not specified in the call and was automatically generated)
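The size contract for a function-valued DIST can be illustrated with a small sketch in plain MATLAB. The handle name `sqeucl` and the toy arrays are hypothetical; the squared Euclidean distance is only one possible choice, used here because it is easy to check by hand.

```matlab
% Hypothetical helper: squared Euclidean distances between the rows of a1 and a2.
% It takes two double arrays and returns a size(a1,1)-by-size(a2,1) matrix,
% which is the shape the DIST argument is documented to expect.
sqeucl = @(a1, a2) bsxfun(@plus, sum(a1.^2, 2), sum(a2.^2, 2)') - 2 * (a1 * a2');

a1 = [0 0; 3 4; 1 1];     % 3 objects, 2 features (toy values)
a2 = [0 0; 6 8];          % 2 objects, 2 features (toy values)
d  = sqeucl(a1, a2);      % 3-by-2 distance matrix

disp(size(d));            % rows follow a1, columns follow a2
```

With PRTools on the path, such a handle could presumably be passed as the third argument, e.g. `modeclust_batch(a, [], sqeucl)`, in the same way as `@distm`.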
Description
A K-NN mode-seeking method is used to assign objects to their nearest mode. The density of an object is defined as one over the distance to its K-th nearest neighbour. Clusters are found by recursively jumping, for every object, to the object with the highest density in its local neighborhood.
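The procedure described above can be sketched in a few lines of plain MATLAB. This is a minimal illustration and not the PRTools implementation: the toy data are hypothetical, the neighbourhood is assumed to include the object itself, and ties in density are resolved in favour of the nearest neighbour.

```matlab
x = [0; 1; 2; 4; 20; 21; 22; 25];   % toy 1-D data: two separated groups (hypothetical)
k = 3;                              % neighbourhood size, counting the object itself (assumption)
n = size(x, 1);

d = abs(bsxfun(@minus, x, x'));     % n-by-n pairwise distances
[ds, idx] = sort(d, 2);             % per row: neighbours ordered by distance, self first
dens = 1 ./ ds(:, k);               % density = 1 / distance to the k-th nearest neighbour

% Every object points to the densest object among its k nearest neighbours.
ptr = zeros(n, 1);
for i = 1:n
    nb = idx(i, 1:k);
    [~, best] = max(dens(nb));      % first maximum wins, so an object keeps itself on ties
    ptr(i) = nb(best);
end

% Follow the pointers until every object has reached a fixed point (its mode).
lab = ptr;
while any(lab ~= ptr(lab))
    lab = ptr(lab);
end

disp(lab');                         % index of the mode object for every sample
```

As in the Output description, the labels are indices of mode samples, so the number of clusters is `numel(unique(lab))`.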
K can be a vector of neighborhood sizes, which is much faster than calling the routine separately for every size; LAB then has one column per value of K. Default K: a set of values determined by a geometric series.
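Why a vector of neighborhood sizes is cheap can be seen in the following plain-MATLAB sketch: the distances are computed and sorted once and then reused for every K. The data and the series 2, 4, 8 are illustrative assumptions; the routine's actual default series is not reproduced here.

```matlab
x  = [0; 1; 2; 4; 20; 21; 22; 25];  % toy 1-D data (hypothetical)
n  = size(x, 1);
ks = 2 .^ (1:3);                    % 2, 4, 8: an illustrative geometric series, not the real default

d = abs(bsxfun(@minus, x, x'));     % pairwise distances, computed once
[ds, idx] = sort(d, 2);             % sorted once, shared by all K

lab    = zeros(n, numel(ks));       % one column of mode indices per K
nclust = zeros(1, numel(ks));
for j = 1:numel(ks)
    k = ks(j);
    dens = 1 ./ ds(:, k);                            % density for this K
    [~, best] = max(dens(idx(:, 1:k)), [], 2);       % densest object among the k neighbours
    ptr = idx(sub2ind(size(idx), (1:n)', best));     % pointer to that object
    cur = ptr;
    while any(cur ~= ptr(cur))                       % jump until a mode is reached
        cur = ptr(cur);
    end
    lab(:, j) = cur;
    nclust(j) = numel(unique(cur));
end

disp(nclust);                       % number of clusters found for each K
```

On this toy set, larger neighborhoods merge clusters, which is the multilevel behaviour the K vector is meant to expose.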
This routine is about twice as fast as MODECLUST because it stores the distances that are needed on disk.
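How the routine organises its batches and disk storage is not described here. As a rough illustration of the general idea of batch processing, the sketch below computes the distances one block of rows at a time and keeps only the K nearest neighbours per object, so the full distance matrix never has to be held at once. The batch size `bs`, the variable names, and the in-memory storage are assumptions.

```matlab
x  = [0; 1; 2; 4; 20; 21; 22; 25];  % toy 1-D data (hypothetical)
n  = size(x, 1);
k  = 3;                             % neighbours to keep per object, self included
bs = 3;                             % hypothetical batch size (rows per block)

knnd = zeros(n, k);                 % distances to the k nearest neighbours
knni = zeros(n, k);                 % indices of the k nearest neighbours
for s = 1:bs:n
    rows = s:min(s + bs - 1, n);            % rows handled in this batch
    d = abs(bsxfun(@minus, x(rows), x'));   % numel(rows)-by-n block of distances
    [ds, idx] = sort(d, 2);
    knnd(rows, :) = ds(:, 1:k);             % keep only what mode seeking needs
    knni(rows, :) = idx(:, 1:k);
end

dens = 1 ./ knnd(:, k);             % densities follow directly from the stored distances
disp(dens');
```

In a disk-based variant, `knnd` and `knni` would be written to a file per batch instead of being kept in memory.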
The multilevel clustering can be made nested by RECLUSTN.
Reference(s)
R.P.W. Duin, A.L.N. Fred, M. Loog, and E. Pekalska, Mode Seeking Clustering by KNN and Mean Shift Evaluated, Proc. SSPR & SPR 2012, LNCS, vol. 7626, Springer, 2012, 51-59.
Example(s)
delfigs
a = gendatm(5000); % generate 5000 objects in 8 classes
[lab,k] = modeclust(a);
for j=1:size(lab,2)
  nclust = numel(unique(lab(:,j))); % number of clusters found for K = k(j)
  if nclust < 20 && nclust > 1
    figure; scattern(prdataset(a,lab(:,j)));
    title(['K = ' int2str(k(j)) ' --> ' int2str(nclust) ' Clusters']);
  end
end
showfigs
lab = modeclust_batch(a, [], @distm); % explicitly using DISTM
lab = modeclust_batch(a, [], proxm([], 'm', 1)); % using PROXM mapping
See also
mappings, datasets, distm, proxm, dclustm, modeclust_batch, modeclustf, reclustn