Classification Archives

The role of densities in pattern classification

Is pattern recognition about statistics? Well, it depends. If you see its goal as understanding how new knowledge can be gained by learning from examples, the role of statistics is debatable. Knowledge lives in the human mind. It is born in the marriage between observations and reasoning. If we follow this process consciously…


Discovered by accident

Some discoveries are made by accident. The wrong road brought a beautiful view. An arbitrary book from the library gave a great, new insight. A procedure was suddenly understood in a discussion with colleagues during a poster session. In a physical experiment a failure in controlling the circumstances showed a surprising phenomenon. Children playing with…


Cross-validation

A recurring question from students and colleagues is how to define a proper cross-validation: How many folds? Do we need repeats? How to determine the significance? Here are some considerations. Why cross-validation? Cross-validation is a procedure for obtaining an error estimate of a trainable system like a classifier. The resulting estimate is specific for the training…
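As a rough illustration of the procedure named in this excerpt, here is a minimal k-fold cross-validation sketch in plain Python. The function names (`k_fold_indices`, `cross_val_error`), the `train_fn`/`predict_fn` interface, the default of 10 folds and the fixed seed are all assumptions made for this example; they are not taken from the post.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Split the indices 0..n-1 into k disjoint folds after one random shuffle."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    # Striding over the shuffled list gives folds whose sizes differ by at most one.
    return [idx[i::k] for i in range(k)]

def cross_val_error(X, y, train_fn, predict_fn, k=10, seed=0):
    """Pooled test error: every object is tested once by a model not trained on it."""
    folds = k_fold_indices(len(X), k, seed)
    errors = 0
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        # Hypothetical interface: train_fn(objects, labels) returns a model,
        # predict_fn(model, obj) returns a predicted label.
        model = train_fn([X[i] for i in train_idx], [y[i] for i in train_idx])
        errors += sum(predict_fn(model, X[i]) != y[i] for i in test_idx)
    return errors / len(X)
```

Repeating the call with different `seed` values gives a feeling for how much the estimate depends on the particular split.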


Adaboost and the Random Fisher Combiner

As in most areas, pattern classification and machine learning have their hypes. In the early 1990s the neural networks awoke and enlarged the community significantly. This was followed by the support vector machine reviving the applicability of kernels. Then, from the turn of the century the combining of classifiers became popular, with significant fruits like adaboost…


Using the test set for training

Never use the test set for training. It is meant for independent validation of the training result. If it has been included anywhere in the training stage, it is no longer independent and the evaluation result will be optimistically biased. This has been my guideline for a long time. Some students were shocked when I…


My classifier scores 50% error. How bad is that?

What error rates can we expect for a trained classifier? How good or bad is a 50% error? Well, if classes are separable, a zero-error classifier is possible. But a very bad classifier may assign every object to the wrong class. Generally, every error rate between zero and one is possible. Much more can be…


Why is the nearest neighbor rule so good?

Just compare the new observations with the ones stored in memory. Take the most similar one and use its label. What is wrong with that? It is simple, intuitive, implementation is straightforward (everybody will get the same result), there is no training involved and it has asymptotically a very nice guaranteed performance, the Cover…
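The rule described in this excerpt fits in a few lines. The sketch below assumes that objects are numeric feature vectors and that "most similar" means smallest Euclidean distance; both are assumptions for illustration, as is the function name `nearest_neighbor_label`.

```python
import math

def nearest_neighbor_label(x, stored_objects, stored_labels):
    """Return the label of the stored object closest to x.

    Assumption: similarity is measured by Euclidean distance.
    """
    best = min(range(len(stored_objects)),
               key=lambda i: math.dist(x, stored_objects[i]))
    return stored_labels[best]

# Two stored observations with labels "a" and "b"; the new one lies near the second.
label = nearest_neighbor_label((0.9, 1.0), [(0.0, 0.0), (1.0, 1.0)], ["a", "b"])
```

There is no training step: the stored observations are the model, which is why the excerpt can say that everybody will get the same result.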


There is no best classifier

Every problem has its own best classifier. Every classifier has at least one dataset for which it is the best. So there is no end to pattern recognition research as long as there are problems that are at least slightly different from all other ones that have been studied so far. The reason for this…


Are the evaluation results of the new procedure you worked on for months worse, or at most marginally better, than the baseline procedure? Don’t worry, it happens all the time. Are they surprisingly good? Congratulations! You may write an interesting paper. But can you really understand why they are so good? Check, check, and double-check…


Are football results random?

The recent results in the round of 16 of the football world championship in Brazil showed a remarkable statistic. The eight group winners all had to play against a runner-up of another group. All group winners won. Is that significant? Does this show that the result of a match is not random? Watching them strongly…
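One simple way to put a number on the question is a binomial calculation under a null hypothesis. The sketch below assumes that the eight matches are independent and that, if results were purely random, each group winner would win with probability 0.5. These assumptions and the function name `prob_at_least` are chosen for this example and are not necessarily the analysis used in the post.

```python
from math import comb

def prob_at_least(wins, matches, p=0.5):
    """Probability of at least `wins` successes in `matches` independent trials."""
    return sum(comb(matches, k) * p**k * (1 - p)**(matches - k)
               for k in range(wins, matches + 1))

# All eight group winners winning, if every match were a fair coin flip.
p_all_eight = prob_at_least(8, 8)  # 0.5**8 = 1/256
```

Under these assumptions the observed outcome has a probability of 1/256, or about 0.4%. Whether that counts as significant depends on the chosen significance level and on how the question was selected in the first place.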

