PRTools examples: Datafiles

Datafiles are children of datasets. It is assumed that the reader is familiar with the introductory sections of the user guide.

Here we present the same example as for datasets, but now for datafiles. The kimia dataset is available as a datafile. First add the PRTools datafile base to the path if it is not yet available:


Accept the download, then load the images:

A = kimia_images % accept the download again and ignore any warnings

Note that the images have different sizes. They remain on disk and are loaded only when needed. The datafile contains 18 classes. We select just two of them:

B = selclass(A,{'elephant','camel'})

It is clearly visible that the images have different sizes. We will now show how features can be defined for datafiles.

feat1 = im_stat([],'sum')
feat2 = filtim('bwperim')*im_stat([],'sum')
C = B*[feat1 feat2]

The first two commands define filters: fixed mappings that convert every object in a datafile (in our case, an image) into a feature. feat1 is the sum of all pixels, which for black-and-white images is the area of the objects. feat2 first finds the contour pixels of the blobs and then counts them. The third command concatenates the two feature mappings and applies them to the datafile. All three commands merely store the processing in the new datafile C. It is executed on conversion to a dataset:

X = C*datasetm % this is the same as X = prdataset(C);
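As an aside, the two features can be made concrete on a toy image. The sketch below uses plain MATLAB only (no PRTools, no Image Processing Toolbox). The image, the variable names, and the neighbour test are illustrative assumptions: the contour rule (a blob pixel with at least one background pixel among its 4 neighbours) is meant to mimic the default behaviour of bwperim, not to reproduce its implementation.

```matlab
% Toy binary image (made up for illustration): a 3x4 blob on a 5x6 background.
im = zeros(5,6);
im(2:4,2:5) = 1;

% Counterpart of feat1: the sum of all pixels, i.e. the blob area.
area = sum(im(:));

% Counterpart of feat2, sketched by hand: a blob pixel is a contour pixel
% if at least one of its 4 neighbours is background.
p = zeros(size(im) + 2);              % zero-padded copy of the image
p(2:end-1,2:end-1) = im;
allNeighboursSet = p(1:end-2,2:end-1) & p(3:end,2:end-1) & ...
                   p(2:end-1,1:end-2) & p(2:end-1,3:end);
contour = im & ~allNeighboursSet;     % blob pixels that touch the background
perimeter = sum(contour(:));

fprintf('area = %d, perimeter = %d\n', area, perimeter);
```

In the datafile C the same two numbers are computed per image, but only once the conversion to a dataset is requested.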

Now we have a dataset which can be plotted and tested:

scatterd(X,'legend'); axis equal

A better result is obtained by scaling the axes:

Y = X*mapex(scalem,'variance');
figure; scatterd(Y,'legend'); axis equal
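Scaling matters because the area and the contour length live on very different numeric ranges, so one axis dominates an unscaled plot. The following plain-MATLAB sketch shows the idea on made-up numbers. It assumes that the 'variance' option of scalem shifts every feature to zero mean and scales it to unit variance; consult help scalem for the exact definition.

```matlab
% Hypothetical feature values, made up for illustration: [area contour_length]
X = [1200 150; 1800 210; 900 120; 2400 260];

mu = mean(X, 1);                      % per-feature mean
sd = std(X, 0, 1);                    % per-feature standard deviation

% Centre each column and divide by its standard deviation.
Y = bsxfun(@rdivide, bsxfun(@minus, X, mu), sd);

disp(mean(Y, 1));                     % column means, close to zero
disp(var(Y, 0, 1));                   % column variances, close to one
```

After this transformation both features contribute comparably to distances in the scatterplot.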


Apply the above operations to the entire datafile A.
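One possible way to approach this, using only the commands introduced above, is sketched below. It assumes that PRTools is on the path and that the kimia datafile has been downloaded. The variable names CA, XA and YA are chosen here for illustration.

```matlab
% Sketch only: requires PRTools and the downloaded kimia datafile.
A = kimia_images;                             % all 18 classes
feat1 = im_stat([],'sum');                    % blob area
feat2 = filtim('bwperim')*im_stat([],'sum');  % number of contour pixels
CA = A*[feat1 feat2];                         % processing is stored, not yet run
XA = CA*datasetm;                             % executes the processing
YA = XA*mapex(scalem,'variance');             % scale the two features
figure; scatterd(YA,'legend'); axis equal
```

With 18 classes the legend and the plot become crowded, so expect considerably more overlap between classes than in the two-class example.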
