More and more reports appear about fraud in science. Does the field of pattern recognition suffer from fraud as well? Or does it profit from it? Googling “fraud” and “pattern recognition” returns almost exclusively projects, meetings and studies on the use of pattern recognition for fraud detection. The main issue there is not fraud in science but fraud in finance, e.g. the use of stolen credit card information or stolen identities. Reports on fraud in the pattern recognition field itself could not be found. Can we conclude that this is an honest field?
Fraud in science
Fraud in science is usually based on wrong observations. They might be the result of just a mistake or bad experimentation (faster than the speed of light, cold fusion) combined with overenthusiastic publicity. In that case it might be too strong to call it fraud. But there are also clear examples of removing undesired results and even entirely fabricating data, which for instance results in statements such as “Meat eaters are selfish and less social” (Diederik Stapel). See The great betrayal: fraud in science for some other examples.
Publications like these harm the field. Initially, a lot of positive publicity is generated. When the error or the fraud is detected, it turns into the opposite. Could things like these happen in pattern recognition? What type of observation would have to be reported to cause a lot of fuss without being directly refutable?
Amazing claims are possible and have been made concerning human perception and biological studies on learning by animals: people who can see in the dark or read other people's thoughts, pigeons that learn from a single example, or monkeys that show an amazing intelligence and develop a language. But what astonishing claims could be made about automatic recognition?
Also in pattern recognition?
Some technological developments in the field have flabbergasted me, e.g. when real-time face recognition was suddenly available, or when I discovered that Google already knows what I am searching for before I have finished typing. There is no question that these results are real: we see the correctness before our eyes. Other inventions, however, are more difficult to verify. Recently, it was discovered that a frequently used bomb detector was fake. Its results were no better than random. The reason it took a long time before this was discovered is that clear, independent test procedures are difficult to develop. Think about a government that demands that the error in fingerprint recognition should be less than 10^-6. Large databases are needed to verify a claim that this demand is met.
An automatic bomb detector is not so strange. Dogs are able to smell some explosives. A lot of research has been dedicated to the design of an artificial nose. It is therefore to be expected that at some moment explosives can be detected by artificial means. The fraud in this case is just about a sensor that does not yet work properly. More fundamental research tries to find good procedures for learning from examples.
On the internet many statements can be found on the performance of specific algorithms. For instance, “SVM is the best”, “Adaboost is the best” or “neural networks are the best” yield hundreds of thousands of hits. Most of them, however, are phrased in a well-defined context and just report the result of a particular experiment. As such there may be nothing wrong with conclusions like these.
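To get a feeling for how large such a database has to be, here is a minimal sketch. It rests on assumptions that are not part of the requirement itself: the test trials are independent, no errors at all are observed, and a 95% confidence level is considered sufficient. The function name `trials_needed` is just chosen for this illustration.

```python
import math

def trials_needed(max_error_rate, confidence=0.95):
    """Number of independent, error-free trials needed before we may claim,
    at the given confidence, that the true error rate is below max_error_rate.

    Assumption: trials are independent and zero errors are observed.
    If the true error rate were p, the chance of n error-free trials is
    (1 - p)^n. We want that chance to be at most 1 - confidence.
    """
    return math.ceil(math.log(1.0 - confidence) / math.log1p(-max_error_rate))

n = trials_needed(1e-6)
print(f"error-free trials needed for an error below 1e-6: {n}")
```

Under these assumptions about three million error-free comparisons are needed, and a single observed error pushes the number up further. This is the familiar "rule of three": roughly 3 divided by the demanded error rate.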
More general pattern recognition research
Researchers in pattern recognition are often looking for more general conclusions than just to establish the best procedure in a specific application. They want to establish results that hold for a class of problems. If possible they want to characterize this class. For that reason many papers report results for a number of applications. Reviewers often demand confirmation over several datasets. In principle this is fair, but it is also the cause of a problem.
There is a very large set of problems available on the internet. In addition, many groups are able to deliver new datasets for real-world applications. Obviously a selection has to be made. It is easy to make it such that it shows the desired point. It is good practice to test an algorithm within a single problem with a randomly selected test set from that problem. But how do you make a random choice from a set of problems if this set is not yet sharply defined?
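How easily a selection can show the desired point may be illustrated by a small simulation. It is a toy sketch, not a study of real datasets: two hypothetical algorithms A and B are assumed to be equally good on every problem, so that any measured difference is pure noise, and the numbers (50 available datasets, 5 reported, a noise level of 0.02 in accuracy) are arbitrary choices.

```python
import random

def simulate_selection(n_datasets=50, n_reported=5, noise=0.02, seed=0):
    """Return the mean accuracy difference (A minus B) over all datasets
    and over the few datasets on which A happens to look best."""
    rng = random.Random(seed)
    # The true difference is zero everywhere; we only measure noise.
    diffs = [rng.gauss(0.0, noise) for _ in range(n_datasets)]
    mean_all = sum(diffs) / n_datasets
    # Selecting after the fact: keep the datasets most favourable for A.
    chosen = sorted(diffs, reverse=True)[:n_reported]
    mean_chosen = sum(chosen) / n_reported
    return mean_all, mean_chosen

mean_all, mean_chosen = simulate_selection()
print(f"mean difference over all datasets:      {mean_all:+.4f}")
print(f"mean difference over selected datasets: {mean_chosen:+.4f}")
```

Over all datasets the difference stays close to zero, while the selected handful suggests a clear advantage for A that does not exist.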
Testing the suitability of a procedure for a set of problems may be done in a similar way. I have never seen, however, a study in which the authors first define such a set (e.g. the result of a previous study) and then carefully describe how they test the hypothesis that a certain procedure is good for this set by making a proper sampling of problems out of this set. That is what should be done if we want to arrive at more general conclusions about the suitability of pattern recognition procedures.
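Such a protocol could be sketched as follows. This is one possible reading, not an established procedure: the set of problems is fixed beforehand, a sample is drawn from it with a published random seed, and the hypothesis "the procedure is better than the baseline" is tested by a one-sided sign test over the sampled problems. The problem names and the win and loss counts are hypothetical placeholders.

```python
import math
import random

def sample_problems(problem_set, k, seed):
    """Draw k problems at random from a set that was defined beforehand.
    Publishing the seed makes the selection reproducible by others."""
    rng = random.Random(seed)
    return rng.sample(sorted(problem_set), k)

def sign_test_p_value(wins, losses):
    """One-sided exact sign test: the probability of at least `wins` wins
    in wins + losses comparisons if both procedures were equally good."""
    n = wins + losses
    return sum(math.comb(n, i) for i in range(wins, n + 1)) / 2 ** n

# Hypothetical set of 40 problems, e.g. taken from a previous study.
problem_set = {f"problem_{i:02d}" for i in range(40)}
sampled = sample_problems(problem_set, k=10, seed=2024)

# Placeholder outcome: the procedure beats the baseline on 9 of 10 problems.
p_value = sign_test_p_value(wins=9, losses=1)
print(sampled)
print(f"p-value of the sign test: {p_value:.4f}")
```

The essential point is the order of the steps: the set and the sampling are fixed before any result is seen, so the outcome on the sampled problems says something about the whole set.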
It might be too strong to call it fraud if somebody searches for a suitable problem for his solution. But we should realize that this will not confirm a general statement about the suitability of his solution.