In 1972 the Russian scientist A. Lerner published a paper titled “A crisis in the theory of Pattern Recognition”. This is definitely a title that attracts the attention of researchers interested in the history of the field. What appeared to be the crisis? The answer is surprising, in short it…
PR System Archives
A crisis in the theory of pattern recognition
The curse of dimensionality
Imagine a two-class problem represented by 100 training objects in a 100-dimensional feature (vector) space. If the objects are in general position (not by accident in a low-dimensional subspace) then they still fit perfectly in a 99-dimensional subspace. This is a ‘plane’, formally a hyperplane, in the 100-dimensional feature space. We will argue that this…
Hughes phenomenon
The peaking paradox was heavily discussed in pattern recognition after a general mathematical analysis of the phenomenon was published by Hughes in 1968. It has puzzled researchers for at least a decade. This peaking is a real world phenomenon and it has been observed many times. Although the explanation by Hughes seemed general and convincing,…
The peaking paradox
To measure is to know. Hence, if we measure more, we know more. This is a fundamental insight in science, phrased by Kelvin. If we want to know more, or want to increase the accuracy of our knowledge, we should observe more. How to realize this in pattern recognition, however, is a perpetually recurring problem…
Generalization by dissimilarities
Dissimilarities have the advantage over features that they potentially consider the entire objects and thereby may avoid class overlap. Dissimilarities have the advantage over pixels that they potentially consider the objects as connected totalities, whereas pixels tear them apart into thousands of pieces. Consequently, the use of dissimilarities may result in better classification performances…
Dissimilarity measures
It has been argued that dissimilarities are potentially a good alternative to features. How to build a good representation will be discussed later. Here we face the question: what is a good measure? What type of measurement device should be used? What properties do we demand? If features are given or can be…
Dissimilarities
In previous posts the use of features and pixels for representing objects numerically was discussed, and the pros and cons were sketched. Here, a third alternative will be considered: the direct use of dissimilarities. First, we summarize the conclusions on the use of features and pixels. Features are well suited to represent objects by numbers…
The pixel representation loses information
The pixel representation in its broadest sense samples the objects and uses them to build a vector space. If the sampling is sufficiently dense, it covers everything. How can we lose information? What is wrong? Let us take an image and sample it as on the left above. The pixels can now be ordered as…
PRTools: building blocks for pattern recognition
In science, knowledge grows from new observations. Pattern recognition aims to contribute to this process in a systematic way. How is this organized in PRTools? What are the building blocks and how are they glued together? Do they constitute a sprawl or an interesting castle? The simplest place…