Imagine a two-class problem represented by 100 training objects in a 100-dimensional feature (vector) space. If the objects are in general position (not accidentally lying in a lower-dimensional subspace), they still fit perfectly in a 99-dimensional subspace. This is a ‘plane’, formally a hyperplane, in the 100-dimensional feature space. We will argue that this…
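The claim above can be sketched numerically. A minimal example (hypothetical random data, not from the post): 100 points in general position in 100 dimensions have a centered data matrix of rank 99, so they span a 99-dimensional subspace, and any two-class labeling of them is linearly separable.

```python
import numpy as np

# 100 objects in a 100-dimensional feature space (hypothetical data)
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 100))

# After centering, points in general position span a 99-dimensional subspace
X_centered = X - X.mean(axis=0)
rank = np.linalg.matrix_rank(X_centered)
print(rank)  # 99

# Any two-class labeling of these points is perfectly linearly separable:
# with 101 parameters (weights + bias) and 100 equations, least squares
# fits the labels exactly
y = np.where(np.arange(100) < 50, 1.0, -1.0)
Xa = np.hstack([X, np.ones((100, 1))])          # append bias column
w, *_ = np.linalg.lstsq(Xa, y, rcond=None)
errors = np.sum(np.sign(Xa @ w) != y)
print(errors)  # 0
```

This is exactly the situation the post describes: a separating hyperplane always exists, regardless of the labels, which is why perfect separation of the training set says nothing by itself.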

## Representation Archives

## The curse of dimensionality

## Hughes phenomenon

The peaking paradox was heavily discussed in pattern recognition after Hughes published a general mathematical analysis of the phenomenon in 1968. It puzzled researchers for at least a decade. Peaking is a real-world phenomenon and it has been observed many times. Although the explanation by Hughes seemed general and convincing,…

## The peaking paradox

To measure is to know. Thus, if we measure more, we know more. This is a fundamental understanding in science, as phrased by Kelvin. If we want to know more, or want to increase the accuracy of our knowledge, we should observe more. How to realize this in pattern recognition, however, is a continually recurring problem…

## Non-metric dissimilarities are all around

A big advantage of representing objects by a dissimilarity space, over the use of kernels, is that it has no problems with non-Euclidean dissimilarity measures. More specifically, it can handle non-metric measures as well. Here we will show by common examples that such dissimilarities arise easily, both in daily life as…

## Kernel-induced space versus the dissimilarity space

The dissimilarity representation has a strong resemblance to a kernel. There are, however, essential differences in assumptions and usage. Here they will be summarized and illustrated by some examples. Dissimilarities and kernels are both functions describing the pairwise relations between objects. Dissimilarities can be considered a special type of kernel if kernels are understood…

## Personal history on the dissimilarity representation

The previous post briefly explains the arguments for the steps we took between 1995 and 2005. From the perspective we have now, it has become much clearer what we did in those years. Below, a few personal and historical remarks sketch how research may proceed. It all started when…

## The dissimilarity space – a step into the darkness

Features are defined such that they focus on isolated aspects of objects. They may neglect other relevant aspects, leading to class overlap. Pixels in images describe everything, but pixel-based vector spaces tear the objects apart because their structure is not explicitly encoded in the representation. Structural descriptions are rich and describe the structure well, yet they…

## Non-Euclidean and non-metric dissimilarities

Dissimilarity measures may be defined as distances in a Euclidean space, or such that they can be interpreted as Euclidean distances. Euclidean distances satisfy the triangle inequality: the direct distance between two points is never larger than any detour. They are thereby metric. Euclidean: assume we are given a set of pairwise dissimilarities between…
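The Euclidean property can be tested numerically. A sketch, using the classical multidimensional-scaling criterion: a dissimilarity matrix D embeds isometrically in a Euclidean space if and only if the doubly-centered matrix G = -½ J D² J (with J the centering matrix) is positive semi-definite. The 4-point counterexample below is a construction of ours, not taken from the post: it satisfies the triangle inequality (with equality), so it is metric, yet it is non-Euclidean.

```python
import numpy as np

def is_euclidean(D, tol=1e-10):
    """True if D embeds isometrically in a Euclidean space (MDS test)."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    G = -0.5 * J @ (D ** 2) @ J               # doubly-centered Gram matrix
    return np.linalg.eigvalsh(G).min() > -tol # Euclidean iff G is PSD

# Distances between actual points are Euclidean by construction
rng = np.random.default_rng(1)
X = rng.standard_normal((5, 3))
D_eucl = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
print(is_euclidean(D_eucl))    # True

# A metric dissimilarity that is nevertheless non-Euclidean:
# the centered Gram matrix has a negative eigenvalue
D_metric = np.array([[0., 2., 1., 1.],
                     [2., 0., 1., 1.],
                     [1., 1., 0., 2.],
                     [1., 1., 2., 0.]])
print(is_euclidean(D_metric))  # False
```

This illustrates that metric and Euclidean are different requirements: every Euclidean dissimilarity is metric, but not the other way around.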

## Generalization by dissimilarities

Dissimilarities have the advantage over features that they potentially consider objects in their entirety and may thereby avoid class overlap. They have the advantage over pixels that they potentially treat objects as connected wholes, where pixels tear them apart into thousands of pieces. Consequently, the use of dissimilarities may result in better classification performance…
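The basic construction can be sketched in a few lines (hypothetical toy data; the prototype choice and classifier are illustrative assumptions, not the post's method): each object is represented by its distances to a fixed set of prototype objects, and an ordinary vector-space classifier is trained on these distance vectors.

```python
import numpy as np

# Toy two-class problem in a 2-dimensional measurement space
rng = np.random.default_rng(2)
A = rng.standard_normal((50, 2)) + [2.0, 0.0]   # class +1
B = rng.standard_normal((50, 2)) - [2.0, 0.0]   # class -1
X = np.vstack([A, B])
y = np.repeat([1.0, -1.0], 50)

# Dissimilarity space: represent every object by its distances
# to a small set of prototype objects
prototypes = X[::10]                             # 10 prototypes
D = np.linalg.norm(X[:, None] - prototypes[None, :], axis=-1)

# Train an ordinary linear classifier in the 10-dimensional
# dissimilarity space (least squares, with a bias term)
Da = np.hstack([D, np.ones((len(D), 1))])
w, *_ = np.linalg.lstsq(Da, y, rcond=None)
acc = np.mean(np.sign(Da @ w) == y)
print(acc)
```

The point of the construction is that any dissimilarity measure can be plugged in here, metric or not, since the classifier only ever sees the resulting distance vectors.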

## Dissimilarity measures

It has been argued that dissimilarities are potentially a good alternative to features. How to build a good representation will be discussed later. Here the question will be faced: what is a good measure? What type of measurement device should be used? What properties do we demand? If features are given or can be…