ct_clustprocs Review of cluster procedures.
PRTools and ClusterTools should be in the path.
Go to ClusterTools examples for an overview of all examples.
This example shows the ground truth of a 2D problem with 10 clusters and compares it with the results of 11 cluster procedures. Evaluation is done by assigning all objects in a cluster to the true class of its prototype.
In CLUSTEX1 the same experiment is coded in a more elaborate way.
Prepare data and set parameters
randreset             % initialise random generator
k = 10;               % Search for k clusters
n = 1000;             % Speed up hierarchical and exemplar clustering by reducing
                      % the dataset to at most n points by modeseeking
m = 3000;             % total number of objects in the dataset
x = gendatclust1(m);  % generate a 10 cluster problem (3000 objects)
a = +x;               % create unlabeled data
prtime(5)             % stop iterations after some seconds
Show the original data with target classes
figure;
lab = getnlab(x);
scatn(lab,a,'Ground truth');
Define cluster routines
The cluster routines are defined as PRTools mappings for creating k clusters. For every cluster procedure its scatterplot for 10 clusters is shown. In addition the classification error is printed based on the assignment of all objects in a cluster to the true class of its prototype.
proc = cell(1,11);
proc{1}  = clustk(k,'KMeans');              % define KMeans
proc{2}  = clustk(k,'KCenters');            % define KCenters
proc{3}  = clustk(k,'KMedoids');            % define KMedoids
proc{4}  = clusth(k,'Single Linkage',n);    % define Single Linkage
proc{5}  = clusth(k,'Average Linkage',n);   % define Average Linkage
proc{6}  = clusth(k,'Central Linkage',n);   % define Central Linkage
proc{7}  = clusth(k,'Complete Linkage',n);  % define Complete Linkage
proc{8}  = cluste(k,n);                     % define Exemplar
proc{9}  = clustm(k);                       % define KNN ModeSeek
proc{10} = clusts(k);                       % define Mean Shift
proc{11} = clustf(k);                       % define FFT
figure;
set(gcf,'Position',[100 100 1200 800])
subplot(3,4,1)
lab = getnlab(x);
scatn(lab,a,'Ground truth');
for j=1:11
  subplot(3,4,j+1);
  scatn(ones(size(x,1),1),x,getname(proc{j}));
  text(8,-2,'... in progress ...','FontSize',15);
end
% compute results
for j=1:11
  subplot(3,4,j+1);
  lab = a*proc{j};
  scatn(lab,a,getname(proc{j}));
  % compute active learning error based on prototypes
  e = clusteval(lab,x,'actl');
  text(10,-2,num2str(e,'%5.3f'),'FontSize',15);
  drawnow;
end
PR_Warning: EXEMPLAR: Examplar clustering updating stopped by PRTIME after 193 iterations
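The evaluation rule described above (every object in a cluster receives the true class of the cluster prototype) can be illustrated with a small sketch in plain MATLAB. It does not use PRTools and is not the implementation of clusteval; the toy labels and the variable names truelab, clab and proto are assumptions made only for this illustration.

```matlab
% Toy data, made up for illustration only (not produced by PRTools)
truelab = [1 1 1 2 2 2 3 3]';   % true class of each of the 8 objects
clab    = [1 1 1 1 2 2 3 3]';   % cluster index of each object
                                % (object 4 ended up in the wrong cluster)
proto   = [2 5 7]';             % hypothetical: index of the prototype
                                % object of clusters 1, 2 and 3

% Every object receives the true class of the prototype of its cluster
assigned = truelab(proto(clab));

% Error: fraction of objects whose assigned class differs from the true class
e = mean(assigned ~= truelab);
fprintf('prototype-based error: %5.3f\n', e);
```

In this toy case only object 4 gets a wrong class, so the error is 1/8 = 0.125. With this rule only the true classes of the prototypes have to be known in order to label all objects.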
Comments
The main purpose of this example is to show how the software can be used. Note that using 100000 objects (m) instead of 3000 is also feasible, due to the preclustering used in the routines and the default settings of the parameters.