ClusterTools Guide
This guide describes the main properties and commands of ClusterTools, a Matlab toolbox for cluster analysis. It is built on top of PRTools, which must be on the Matlab path. It can also be used for active learning.
- PRTools introduction page
- PRTools Table of Contents
- ClusterTools Table of Contents
- ClusterTools download
Significant properties:
- ClusterTools contains routines for traditional algorithms such as KMeans, KCentres and various hierarchical clustering schemes, as well as for more advanced algorithms such as MeanShift, KNN-modeseeking and Exemplar.
- Most algorithms run on feature representations as well as on dissimilarity matrices.
- By default, multilevel clusterings are returned. These are sets of clusterings that range from a small number of clusters (2-10) up to thousands of clusters.
- Each cluster is represented by a prototype; results are given as pointers to the cluster prototypes.
- Evaluation is always done by comparing clusterings, in various ways, with given sets of labels.
- The toolbox contains routines for active learning and semi-supervised learning.
- Some routines, like KNN-modeseeking, may run on millions of objects given by hundreds of features.
- All routines have a preclustering option that uses KNN-modeseeking, so they can be applied to any dataset of a size that KNN-modeseeking can handle.
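The prototype-pointer representation and the multilevel output mentioned in the list above can be made concrete with a small sketch. It is written in plain Python and does not use the ClusterTools or PRTools API. The function names, the zero-based indices, the fine-to-coarse ordering of the levels and the toy data are all assumptions made for illustration. The purity score stands in for the toolbox's own evaluation measures, which are not specified here.

```python
from collections import Counter

# Hypothetical representation: one list per clustering level. Entry i holds
# the index of the prototype object that represents the cluster of object i.
# A prototype points to itself. This is an illustration, not ClusterTools output.

def n_clusters(pointers):
    """Number of distinct prototypes, i.e. the number of clusters."""
    return len(set(pointers))

def is_consistent(pointers):
    """Check that every prototype points to itself."""
    return all(pointers[p] == p for p in set(pointers))

def purity(pointers, labels):
    """Fraction of objects that carry the majority label of their cluster."""
    counts = {}
    for p, y in zip(pointers, labels):
        counts.setdefault(p, Counter())[y] += 1
    agree = sum(c.most_common(1)[0][1] for c in counts.values())
    return agree / len(labels)

# Toy multilevel clustering of six objects, finest level first (assumed order).
multilevel = [
    [0, 0, 2, 2, 4, 5],  # prototypes 0, 2, 4, 5
    [0, 0, 2, 2, 4, 4],  # prototypes 0, 2, 4
    [0, 0, 0, 0, 4, 4],  # prototypes 0, 4
]
labels = ["a", "a", "b", "b", "c", "c"]  # made-up labels, used for evaluation only

if __name__ == "__main__":
    for level, pointers in enumerate(multilevel):
        print(level, n_clusters(pointers), is_consistent(pointers),
              round(purity(pointers, labels), 3))
```

Storing only a prototype index per object keeps each level as compact as a label vector, while the prototype itself remains available as a representative object of the cluster.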
Documentation
As ClusterTools is based on PRTools, users should be aware of how the two toolboxes relate, as well as of the global settings made by PRTools.
A summary of the user commands to find clusters in unlabeled data and to evaluate them by labeled data:
cluster routines, classification, reclustering, evaluation, support