What do we want to recognize? What are the objects to be classified? What are our examples? What will be our observations? What are the classes we want to distinguish? Are they human defined or are we searching for some truth that might be hidden in the observations but not clearly defined? Many choices to be made, many issues to be studied.
When do classes overlap?
First it should be stressed that for many recognition problems the classes that are distinguished by a human observer are separable. Given the same task again he will recognize the same objects as ‘chair’, ‘table’, ‘bicycle’, ‘car’ and so on. This also holds for handwritten characters, faces or animals, under the condition that he is given exactly the same observation, e.g. the same picture.
Only for complicated and ill-defined patterns, like lung X-ray images or microscope pictures of tissue, may an experienced observer such as a pathologist disagree with himself over time. Only in such cases may the classes of the real-world objects overlap: one and the same observation may receive different class labels. In many tasks, however, a human observer will classify objects consistently, and the classes of real-world objects are therefore separable.
It has already been discussed elsewhere that even if classes are separable they may still overlap in the representation. Representations, for instance by features, reduce the objects in such a way that some differences disappear. Consequently, different objects may have the same representation and we have to face the overlapping class problem.
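A minimal sketch of how a feature representation can make separable classes overlap. The tiny binary images and the two features (ink count and width) are hypothetical choices made for illustration only.

```python
# Hypothetical toy objects: each object is a small binary image (rows of 0/1).
def features(image):
    """Reduce an image to two features: number of ink pixels and image width."""
    ink = sum(sum(row) for row in image)
    width = len(image[0])
    return (ink, width)

vertical_bar = [[0, 1, 0],
                [0, 1, 0],
                [0, 1, 0]]
horizontal_bar = [[0, 0, 0],
                  [1, 1, 1],
                  [0, 0, 0]]

# The objects themselves differ, so an observer looking at the images
# can always tell the two classes apart.
objects_differ = vertical_bar != horizontal_bar

# After the reduction to features the difference has disappeared.
same_representation = features(vertical_bar) == features(horizontal_bar)

print(objects_differ, same_representation)
```

Both bars map to the same feature vector, so any classifier working on these features alone has to treat the two classes as overlapping, although the original images are perfectly distinguishable.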
What is in fact the recognition problem to be solved?
The problem of class overlap arises when the class assignments of different observers are mixed, or when the classes are not assigned on the basis of what is observed but on their origin (e.g. the character the writer intended to write) or on the future development (e.g. the disease the patient will develop or not).
In the definition of the recognition problem a first crucial step is already made that may complicate the task severely: is the intention to mimic a single human observer, or is the task defined over a group of people, or even over the development of the object over time? The latter ambitions are much higher: we want to build a system that performs better than a single human observer can. A proper setup of such a task needs careful consideration. One of the consequences may be class overlap.
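The effect of mixing observers can be illustrated with a small sketch. The observation names and labels below are invented for the example; the point is only that two observers who are each fully consistent can still produce overlapping classes once their labels are pooled.

```python
# Hypothetical labelings: each observer is consistent with himself,
# i.e. every observation gets exactly one label per observer.
observer_1 = {"scan_a": "benign", "scan_b": "malignant", "scan_c": "benign"}
observer_2 = {"scan_a": "benign", "scan_b": "benign", "scan_c": "benign"}

# Pool the labels of both observers per observation.
pooled = {}
for labeling in (observer_1, observer_2):
    for observation, label in labeling.items():
        pooled.setdefault(observation, set()).add(label)

# Observations that received more than one label are where the classes overlap.
overlapping = sorted(obs for obs, labels in pooled.items() if len(labels) > 1)
print(overlapping)
```

Here only `scan_b` ends up with two different labels, so the pooled problem has class overlap although neither observer contradicts himself.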
Consequences for the classifier
The difference between a task for which the classes can be considered to be separable and one for which they will overlap has severe consequences. Classifiers that have to distinguish separable classes do not need to estimate probabilities, as different classes will never have a non-zero probability for the same objects. So all classification procedures that use probabilities in one way or another do more than is necessary: they make estimates that are not needed. But what are the generalization procedures that do not use statistics? They should be based on distances only, see a previous discussion. We will leave this point for the time being.
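As one possible illustration of a procedure based on distances only, here is a sketch of the nearest neighbor rule. It is an assumption that this rule is a suitable example; the text does not name a specific procedure. The points and labels are made up.

```python
import math

def nearest_neighbor_label(x, examples):
    """Return the label of the example closest to x (Euclidean distance).

    No probabilities or densities are estimated; only distances are compared.
    """
    nearest = min(examples, key=lambda example: math.dist(x, example[0]))
    return nearest[1]

# Hypothetical training examples: (point, label) pairs from two separable classes.
examples = [
    ((0.0, 0.0), "A"),
    ((1.0, 0.0), "A"),
    ((4.0, 4.0), "B"),
    ((5.0, 4.0), "B"),
]

label_1 = nearest_neighbor_label((0.5, 0.5), examples)
label_2 = nearest_neighbor_label((4.5, 3.0), examples)
print(label_1, label_2)
```

The rule assigns a label by comparing distances to stored examples and never produces a probability estimate.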
Consequences for the training set
The second consequence of being sure that classes do not overlap is very significant and hardly studied until now. We pointed this out recently: the set of examples does not have to be representative, in the statistical sense, of the future objects to be classified. We can drop the demand that the training set should be drawn from the same distribution as the objects to be classified. This is great news, as this demand is severe and often impossible to fulfil.
If a classifier between non-overlapping classes has to be found on the basis of examples, it is sufficient to select objects close to the class boundaries. These may be selected on the basis of expert knowledge. This is a new way of using given knowledge, hardly exploited so far.
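A toy sketch of this idea, under stated assumptions: the objects are single numbers, the true classes are separated at the value 5, and an expert has supplied just two examples close to that boundary. The nearest neighbor rule used here is an illustrative choice, not a method taken from the text.

```python
def true_label(x):
    """Hypothetical ground truth: two separable classes with a boundary at 5."""
    return "A" if x < 5.0 else "B"

# Training set chosen near the class boundary only (assumed expert selection).
# It is clearly not a representative sample of the objects classified later.
boundary_examples = [(4.9, "A"), (5.1, "B")]

def classify(x, examples):
    """1-nearest-neighbor rule on the real line."""
    nearest = min(examples, key=lambda example: abs(x - example[0]))
    return nearest[1]

# Objects to classify, spread very differently from the training examples.
test_objects = [-100.0, 0.0, 3.0, 4.95, 5.05, 7.0, 1000.0]

errors = sum(classify(x, boundary_examples) != true_label(x) for x in test_objects)
print(errors)
```

In this constructed case all test objects are labeled correctly, even though the two training examples say nothing about how the objects are distributed. How close to the boundary the examples must be, and how many are needed in higher dimensions, is exactly the kind of question that remains open.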
Both problems, how to select a set of examples in a smart way and how to build a classifier on such a dataset, need to be studied further. In 2005 we wrote an unpublished paper on this topic.