37Steps Updates
Here is a list of recent posts and other major updates to this website.
29 Jan 2018 | ClusterTools examples |
ClusterTools | Examples added, including all experiments of the ModeSeeking paper |
24 Jan 2018 | ClusterTools 1.1.3 |
ClusterTools | A new version of ClusterTools |
2017 | Documents |
Documents | The documents most important for understanding and using the software. |
2017 | DisTools |
DisTools | Pages of a DisTools course have been made publicly accessible. |
11 Oct 2017 | PRTools environment |
PRTools | Description of the globals in the PRTools environment |
10 Mar 2017 | ClusterTools |
ClusterTools | Toolbox for cluster analysis and active learning added |
9 Mar 2017 | PRTools 5.3.3 |
PRTools | Minor updates |
31 Aug 2016 | PRTools 5.3.2 |
PRTools | Minor updates |
14 Oct 2015 | The role of densities in pattern classification |
Post | Fundamentally pattern recognition is not about statistics. So why are probabilities and densities important? |
12 Oct 2015 | Pattern Recognition: Introduction and Terminology |
eBook | An ebook introducing concepts and terminology, including a large glossary with many internal and external links and examples. |
8 Sep 2015 | Discovered by accident |
Post | An accidental observation by a student pointed to interesting possibilities of sequentially combining multi-class classifiers. |
2 Sep 2015 | PRTools 5.3.1 |
PRTools | Minor updates |
16 Aug 2015 | How to handle memory and time available for computations? |
FAQ | Describes the possible control of memory and computing time in PRTools. |
16 Aug 2015 | How to switch off the prwarning messages? |
FAQ | Introduces the new, streamlined version of the prwarning system. |
16 Aug 2015 | PRTools 5.3.0 |
PRTools | Handling of very small training sets, training time control, streamlined warning system. |
15 July 2015 | Cross-validation |
Post | How to define a proper cross-validation: How many folds? Do we need repeats? How to determine the significance? Here are some considerations. |
12 July 2015 | How to generate a multi-dimensional banana set? |
FAQ | The standard routine for generating the 2-dimensional banana set may be used to generate a multi-dimensional problem. |
19 June 2015 | How should the PRTools ROC plot be interpreted? |
FAQ | The relation between the PRTools ROC plot and the more standard one is discussed. A conversion statement is supplied. |
17 June 2015 | PRTools advanced examples |
Examples | A set of advanced examples has been added that may be copied and run by the user. |
8 June 2015 | Adaboost and the Random Fisher Combiner |
Post | The significant elements in the Adaboost classifier are the generation of base classifiers and the combining rule. Can they be simplified? A simple experiment. |
5 May 2015 | Using the test set for training |
Post | Never use the test set for training? Sometimes this rule can be neglected. It might be proper and helpful. |
3 April 2015 | My classifier scores 50% error. How bad is that? |
Post | What error rates can we expect for a trained classifier? How good or bad is a 50% error? Some observations and bounds. |
23 February 2015 | Is every pattern recognition problem a small sample size problem? |
Post | Applications with large training sets also have to face the small sample size problem. |
21 December 2014 | PRTools 5.2.3 |
PRTools | Minor bug fixes |
24 November 2014 | Aristotle and the ugly duckling theorem |
Post | If Aristotelian designers of PR systems do not make a step in the Platonic direction, they may suffer from the ugly duckling theorem: all differences are equal. |
13 October 2014 | Why is the nearest neighbor rule so good? |
Post | Because it matches the problems of interest. |
13 September 2014 | PRTools 5.2 |
PRTools | Using categorical data, converting cell arrays into datasets and vice versa, and more; see updates. |
12 September 2014 | There is no best classifier |
Post | Every problem has its own best classifier. Every classifier has at least one dataset for which it is the best. So every classifier is sometimes the best. |
14 August 2014 | The ten Aristotelian categories, features and dissimilarities |
Post | A relation between the ten categories and the problem of pattern recognition. |
20 July 2014 | Surprisingly good results in flow-cytometry classification |
Post | Surprisingly good results might be a warning of something that is wrong. A report of a mistake. |
3 July 2014 | Are football results random? |
Post | Sometimes the result of a football match seems arbitrary. Watching the FIFA world cup gives clear examples. Is a test on significance possible? |
18 June 2014 | Good recognition is non-metric: true or false? |
Post | A recent paper in Pattern Recognition claims that good recognition is non-metric. Is this statement true or false? |
14 May 2014 | The Eurovision Song Contest analyzed |
Post | The results of the 2014 Eurovision Song Festival are analyzed by a clustering procedure to detect possible cultural similarities between countries. |
29 April 2014 | PRTools 5.1 available |
PRTools | Beautifications and additions |
20 March 2014 | Regularization and invariants |
Post | Regularization is equivalent to the use of invariants. Knowledge about invariants is thereby helpful for choosing an effective regularization. |
28 January 2014 | Who invented the nearest neighbor rule? |
Post | Discussion on a paper by Pelillo on Alhazen. |
8 January 2014 | Random representations |
Post | Why and when are random representations good? |
24 November 2013 | Hume’s fork in pattern recognition |
Post | Facts can be true, or just happen to be true. This results in two essentially different lines of research in pattern recognition: on the models or on the observed world. |
4 November 2013 | Choosing or learning a representation? |
Post | Learning a representation needs another, initial representation on which learning can be based. A human choice is inevitable. |
30 October 2013 | How can I control classifier parameter optimization? |
FAQ | Classifier parameters, in particular the ones needed for regularization, can be optimized by the PRTools routine regoptc. Some details are discussed here. |
28 August 2013 | PRTools Examples added to the user guide |
PRTools | A large set of examples introducing the use of PRTools |
28 August 2013 | PRTools 5.0.2 available |
PRTools | Minor upgrade, generator and fixed_cell mappings introduced |
14 August 2013 | PRTools user guide |
PRTools | Major upgrade of the user guide, important for PRTools use on the command line |
10 August 2013 | Mapping types in user guide |
PRTools | The user guide has been extended with sections on the various mapping types: fixed, fixed_cell, untrained, trained, combiner, generator |
3 August 2013 | How should I interpret the outcomes of a classifier? |
FAQ | What do the numbers mean that are shown as classifier outputs? Distances? Densities? Posteriors? Confidences? |
23 July 2013 | The error in the error |
Post | How large is the classification error? How large should the test set be to have a small error in the error? The worn out test set. |
16 July 2013 | PRTools5 introduction |
Page | A summary of the whys and whats of the upgrade and transition information |
15 July 2013 | PRTools5 available |
PRTools | This small but significant upgrade makes integration with Stats, the Matlab Statistical Toolbox, possible. It integrates some of its classifiers as PRTools routines. |
9 July 2013 | PRTools 4.2.5 now available |
PRTools | A summary of changes |
8 July 2013 | Pattern recognition and the art of naming |
Post | Finding names for concepts makes them useful as building blocks in our thinking. |
17 June 2013 | Pattern recognition, for better or worse? |
Post | It starts with an innocent curiosity, then it is applied. The applications may be used for targets that the scientist and the engineer did not foresee. |
3 June 2013 | Qualities and Quantities |
Post | Properties can be divided into qualities and quantities. Human decision making is based on the first, automatic pattern recognition on the second. Can they do the same thing? |
27 May 2013 | Classifying the exception |
Post | Exceptions do not follow the rules. That is their nature. Humans know how to handle them. Can that be learned? |
20 May 2013 | Fraud and pattern recognition |
Post | More and more reports appear about fraud in science. Does the field of pattern recognition suffer from fraud as well? Or, does it profit from it? |
13 May 2013 | Recognition, belief or knowledge? |
Post | Pattern recognition may be based on machine learning. But what constitutes the training of the machine? Belief or knowledge? Nilsson is writing a book on beliefs. |
6 May 2013 | Pattern recognition and neural networks, Kuhn and Popper |
Post | Is the neural network model good for pattern recognition? Can this be decided by conjectures and refutations? Or is the answer determined by paradigm shifts? |
28 April 2013 | Peaking summarized |
Post | Peaking (overtraining) of the real, expected and mean classification error. |
21 April 2013 | Platonic thinking |
Post | Using ideas and concepts as a basis for research. |
14 April 2013 | Trunk’s example of the peaking phenomenon |
Post | A discussion on the clearest peaking example. |
7 April 2013 | Pattern recognition at Easter |
Post | The search for Easter eggs and a cycling tour with grandma show the pattern recognition abilities of grandchildren. |
1 April 2013 | A crisis in the theory of pattern recognition |
Post | In 1972 the Russian scientist A. Lerner published a paper under the title: “A crisis in the theory of Pattern Recognition”. What was the crisis? The answer is surprising and still of current interest. |
30 March 2013 | PRTools cheat sheet |
PRTools | PRTools at a glance, an active pdf sheet |
25 March 2013 | The curse of dimensionality |
Post | Imagine a two-class problem represented by 100 training objects in a 100-dimensional feature space. However they are labeled, a perfect linear classifier can be found. Such a classifier can therefore not be expected to generalize. So, 100 objects in a 100-dimensional space should be avoided. Or not? |
18 March 2013 | Hughes phenomenon |
Post | Hughes' explanation of peaking and why it was wrong. |
11 March 2013 | How to prepare my data for PRTools? |
FAQ | A basic question: what are the first steps to take from raw data towards the use of PRTools? |
11 March 2013 | The peaking paradox |
Post | Why is the intuitive truth: “to measure is to know”, limited by statistics? |
4 March 2013 | Non-metric dissimilarities are all around |
Post | Some examples are given showing that non-metric dissimilarities arise easily, both in daily life and in science. |
25 February 2013 | Metric learning, a problem in consciousness |
Post | Defining a proper distance measure is a consciousness problem. Can this be done by automatic means? |
18 February 2013 | Kernel-induced space versus the dissimilarity space |
Post | The dissimilarity representation has a strong resemblance to a kernel. There are, however, essential differences in assumptions and usage. Here they are summarized and illustrated by some examples. |
11 February 2013 | Personal history on the dissimilarity representation |
Post | The previous post briefly explains arguments for the steps taken by us between 1995 and 2005. From the perspective we have now, it has become much more clear what we did in those years. In this post historical remarks are made that may sketch how research proceeds. |
9 February 2013 | Two FAQs answered on scatter plots: usage and gridsize effects |
FAQ | Discussions on the limited usage of scatter plots for 2D datasets only and accuracy effects on classification boundaries caused by changes of the gridsize setting. |
4 February 2013 | The dissimilarity space – a step into the darkness |
Post | Representation by features may neglect relevant aspects, leading to class overlap. Pixels describe everything but tear the objects apart because their structure is not encoded in the representation. Structural descriptions are rich and describe the structure well, yet they do not construct a vector space. As a result, we lack a proper representation for learning from examples. Is there a way out, or are we trapped? |
28 January 2013 | What is new in PRTools |
Post | In the recent PRTools updates of September 2012 (4.2.2), November 2012 (4.2.3) and January 2013 (4.2.4), a number of tools have been added or changed of which not everybody might be aware. Here we pay more attention to them and give some background information about their use. |
26 January 2013 | PRTools 4.2.4 now available |
PRTools | A summary of changes |
23 January 2013 | The user guide in progress |
PRTools | A number of pages have been added to the user guide, in particular on elementary operations on and between datasets and mappings. |
21 January 2013 | Non-Euclidean embedding |
Post | Non-Euclidean dissimilarities may be good for including knowledge about the objects in the dissimilarity measure, but how to embed them in a vector space if we want to use the standard linear algebra tools for generalization? Here the so-called pseudo-Euclidean space will be discussed. |
20 January 2013 | Follow us by RSS, Twitter or Recent Updates |
Follow us icons on sidebar added. | |
14 January 2013 | Non-euclidean and non-metric dissimilarities |
Post | Dissimilarity measures may be defined as distances in a Euclidean space, or such that they can be understood as Euclidean distances. Euclidean distances satisfy the triangle inequality: the direct distance between two points is never larger than any detour. They are thereby metric. |
7 January 2013 | Generalization by dissimilarities |
Post | Dissimilarities have the advantage over features that they potentially consider the entire objects and thereby may avoid class overlap. Dissimilarities have the advantage over pixels that they potentially consider the objects as connected totalities, where pixels tear them apart in thousands of pieces. Consequently, the use of dissimilarities may result in better classification performances and may require smaller training sets. But how should this be realized? How to generalize from dissimilarities? |
1 January 2013 | Batch processing |
PRTools | Some mappings handling datasets or datafiles create large internal arrays. This maximizes the speedup offered by Matlab array processing. Sometimes, however, these arrays become too large. Datasets applied to fixed mappings or to trained mappings may be split into smaller arrays without affecting the final result. Usually this is not possible for untrained mappings, as during training all objects have to be related to each other. |
31 December 2012 | Dissimilarity measures |
Post | It has been argued that dissimilarities are potentially a good alternative to features. How to build a good representation will be discussed later. Here the question will be faced: what is a good measure? What type of measurement device should be used? What properties do we demand? |
20 December 2012 | How can PRTools be used for studying the dissimilarity representation? |
FAQ | Many studies have been presented on the dissimilarity representation in which experiments are based on PRTools. A large collection of research tools has thereby been developed. As research in this direction is still in progress by various researchers, a stable and consistent toolbox is not yet ready. There are however many possibilities to develop and run dissimilarity based experiments directly on PRTools. Here follows a very short introduction. |
17 December 2012 | Dissimilarities |
Post | In previous posts the use of features and pixels for representing objects was discussed. Pros and cons were sketched. Here a third alternative will be considered: the direct use of dissimilarities. First the conclusions on features and pixels will be summarized. |