Evaluation Archives

The role of densities in pattern classification

Is pattern recognition about statistics? Well, it depends. If you see its goal as understanding how new knowledge can be gained by learning from examples, the role of statistics may be disputable. Knowledge lives in the human mind. It is born in the marriage between observations and reasoning. If we follow this process consciously…

Discovered by accident

Some discoveries are made by accident. The wrong road brought a beautiful view. An arbitrary book from the library gave a great, new insight. A procedure was suddenly understood in a discussion with colleagues during a poster session. In a physical experiment a failure in controlling the circumstances showed a surprising phenomenon. Children playing with…

Cross-validation

A recurring question from students and colleagues is how to define a proper cross-validation: How many folds? Do we need repeats? How to determine the significance? Here are some considerations. Why cross-validation? Cross-validation is a procedure for obtaining an error estimate of a trainable system like a classifier. The resulting estimate is specific to the training…
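
As a rough illustration of the procedure described above, here is a minimal sketch of k-fold cross-validation in plain Python. The function names (`kfold_indices`, `cross_val_error`) and the `fit`/`predict` callables are hypothetical and chosen for this example only; they are not taken from the post or from any particular library.

```python
import random

def kfold_indices(n, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)          # random assignment to folds
    folds = [idx[i::k] for i in range(k)]     # k roughly equal-sized folds
    for i in range(k):
        test = folds[i]
        train = [j for m, f in enumerate(folds) if m != i for j in f]
        yield train, test

def cross_val_error(X, y, fit, predict, k=10, seed=0):
    """Fraction of objects misclassified when each is tested exactly once."""
    errors = 0
    for train, test in kfold_indices(len(X), k, seed):
        # train on k-1 folds, test on the fold that was left out
        model = fit([X[j] for j in train], [y[j] for j in train])
        errors += sum(predict(model, X[j]) != y[j] for j in test)
    return errors / len(X)
```

Repeating the call with different `seed` values gives a feeling for how much the estimate varies with the random split.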

Using the test set for training

Never use the test set for training. It is meant for independent validation of the training result. If it has been included anywhere in the training stage, it is no longer independent and the evaluation result will be optimistically biased. This has been my guideline for a long time. Some students were shocked when I…

My classifier scores 50% error. How bad is that?

What error rates can we expect for a trained classifier? How good or bad is a 50% error? Well, if classes are separable, a zero-error classifier is possible. But a very bad classifier may assign every object to the wrong class. Generally, all errors between zero and one are possible. Much more can be…

Are football results random?

The recent results in the round of 16 of the football world championship in Brazil showed a remarkable statistic. The eight group winners all had to play against a runner-up of another group. All group winners won. Is that significant? Does this show that the result of a match is not random? Watching them strongly…
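
To get a feeling for the question of significance, one can compute how likely eight wins out of eight would be if every match were a fair coin flip. The sketch below assumes independent matches and a win probability of 0.5 for the group winner; both are assumptions made for this example, and the helper name `prob_at_least` is hypothetical.

```python
from math import comb

def prob_at_least(k, n, p=0.5):
    """Probability of k or more successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Assumed null hypothesis: each match is a fair coin flip.
p_all_eight = prob_at_least(8, 8)
print(p_all_eight)  # 0.00390625, i.e. 1/256
```

Under that null hypothesis the observed outcome has a probability of about 0.4%.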

The error in the error

How large is the classification error? What is the performance of the recognition system? In the end this is the main question, in applications, in proposing novelties, in comparative studies. But how trustworthy is the number that is measured, how accurate is the error estimate? The most common way to estimate the error of a…
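
One simple way to quantify the accuracy of an error estimate is the binomial standard deviation. The sketch below assumes the error is estimated by counting mistakes on an independent test set of `n_test` objects; the function name `error_std` is hypothetical, and the full post may well treat the matter differently.

```python
from math import sqrt

def error_std(error, n_test):
    """Standard deviation of an error estimated from n_test independent test objects."""
    return sqrt(error * (1.0 - error) / n_test)

# A measured error of 10% on 100 test objects:
print(round(error_std(0.1, 100), 3))  # 0.03
```

So an error of 0.10 measured on 100 objects has a standard deviation of about 0.03, which is large compared with the differences often reported between classifiers.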

Peaking summarized

Pattern recognition learns from examples. Thereby, generalization is needed. This can only be done if the objects, or at least the differences between pattern classes, have a finite complexity. That is what peaking teaches us. We will go once more through the steps. (See also our previous discussions on peaking and overtraining). The basic cause…

Trunk’s example of the peaking phenomenon

In 1979 G.V. Trunk published a very clear and simple example of the peaking phenomenon. It has been cited many times to explain the existence of peaking. Here, we will summarize and discuss it for those who want to have a better idea about the peaking problem. The paper presents an extreme example. Its value…

The curse of dimensionality

Imagine a two-class problem represented by 100 training objects in a 100-dimensional feature (vector) space. If the objects are in general position (not by accident in a low-dimensional subspace) then they still fit perfectly in a 99-dimensional subspace. This is a ‘plane’, formally a hyperplane, in the 100-dimensional feature space. We will argue that this…
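
The claim that 100 objects in general position span only a 99-dimensional subspace can be checked numerically. The sketch below draws random Gaussian points (an assumption of this example, standing in for objects in general position) and computes the rank of the difference vectors with respect to the first point. The `rank` helper is a hypothetical plain-Python Gaussian elimination written for this illustration.

```python
import random

def rank(rows, tol=1e-9):
    """Rank of a matrix (list of lists) by Gaussian elimination with pivoting."""
    m = [row[:] for row in rows]
    r = 0
    for c in range(len(m[0])):
        # choose the remaining row with the largest entry in column c
        pivot = max(range(r, len(m)), key=lambda i: abs(m[i][c]))
        if abs(m[pivot][c]) < tol:
            continue                      # no usable pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

rng = random.Random(0)
n = dim = 100
points = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n)]
# n points give only n - 1 difference vectors, so the rank is at most n - 1
diffs = [[a - b for a, b in zip(p, points[0])] for p in points[1:]]
print(rank(diffs))
```

Since there are only 99 difference vectors, the printed rank cannot exceed 99, whatever the dimensionality of the feature space.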
