Batch processing

Some mappings that handle datasets or datafiles create large internal arrays, thereby maximizing the speedup offered by Matlab array processing. Sometimes, however, these arrays become too large. Datasets applied to fixed mappings or to trained mappings may be split into smaller arrays without affecting the final result. This is usually not possible for untrained mappings, as during training all objects have to be related to each other.

The PRTools command setbatch, applied to a mapping, ensures that datasets are automatically split into smaller batches. It can set or reset batch processing. The default batch size is 1000 objects; this can be changed by the prglobal command. PRTools commands such as plotc that may profit from batch processing use it internally.
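For instance, batch processing may be enabled or disabled per mapping (a sketch; the optional flag and batch-size arguments of setbatch are assumed here, see help setbatch for the exact interface):

w = my_fixed_mapping*setbatch;
% enable batch processing with the default batch size
w = setbatch(my_fixed_mapping,1,500);
% assumed calling syntax: enable batch processing, batches of 500 objects
w = setbatch(w,0);
% reset (disable) batch processing for this mapping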

Users may set batch processing for fixed and trainable mappings. During training itself batch processing is automatically disabled, but the batch settings are copied from the untrained to the trained classifier. In a sequential combination the setting applies to all mappings involved. Some incorrect and correct examples:

data_out = data_in*my_fixed_mapping*setbatch
% wrong! batch processing is set after mapping is applied


data_out = data_in*(my_fixed_mapping*setbatch)
% batch processing used during mapping


w = trainset*my_untrained_mapping*setbatch
% batch processing set for the resulting trained mapping
testset*w
% batch processing used during execution of testset


w = a*(pca([],0.90)*my_untrained_mapping*setbatch)
% batch processing set for the resulting sequential mapping
% it is only used when w is applied to new data


w = a*(pca([],0.90)*(my_untrained_mapping*setbatch))
% same as above


w = a*((pca([],0.90)*setbatch)*my_untrained_mapping)
% same as above
