The structure of the question is as follows: first I introduce the concept of collective recognition, then I explain the various methods of group classification that I have found, and at the end I state the question itself. Experts in this field who do not need the explanations can skim the headings and go straight to the question.
What is collective recognition/classification?
The term collective recognition refers to the task of using multiple classifiers (a committee, an ensemble, etc.), each of which decides on the class of an entity, with their decisions subsequently reconciled by some combining algorithm. Using a set of classifiers typically leads to higher recognition accuracy and better computational efficiency.
Some approaches to integrating the decisions of multiple classifiers:
- approaches based on the concept of classifiers' competence areas, with procedures that assess each classifier's competence with respect to each input of the classification system;
- methods that combine classifier decisions using neural networks.
Competence areas method
The idea of collective classification based on competence areas is that each base classifier may work well in some region of the feature space (its area of competence), outperforming the remaining classifiers there in terms of accuracy and reliability. The area of competence of each base classifier must somehow be estimated; the program that does this is called a referee. The classification task is then solved so that each algorithm is used only within its own area of competence, i.e. where it produces the best results compared to the other classifiers, and within each area the decision of only one classifier is taken into account. However, this requires an algorithm that, for any input, determines which of the classifiers is the most competent.
So, one approach associates each classifier with a special algorithm (the referee), which is designed to assess the competence of that classifier. The "competence of the classifier" in a given region of the space of object representations means its accuracy there, i.e. the probability of correctly classifying objects whose descriptions belong to that region.
The general training scheme for competence-based collective recognition consists of two steps (Fig. 1). In the first step, each individual base classifier is trained and tested; this step does not differ from conventional training schemes. In the second step, after each classifier has been tested, the sample used in its testing phase is divided into two subsets, L+ and L-. The first subset contains the test instances that were classified correctly; the second contains the remaining instances, i.e. those that were classified incorrectly. Treating these two sets as the classifier's areas of competence and incompetence, they can be used as training data for the referee algorithm. When classifying new data, the referee's task is to determine, for each input instance, whether it belongs to the classifier's area of competence and, if so, to estimate the probability that this particular instance will be classified correctly. The referee then delegates the classification task to the most competent classifier.
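The two-step scheme above can be sketched roughly as follows, assuming scikit-learn-style estimators; the particular base classifiers, the use of a random forest as the referee, and names like `base_clfs` and `referees` are my own illustrative choices, not part of the scheme itself.

```python
# Sketch of the two-step competence-area (referee) scheme; classifier
# choices and variable names are illustrative assumptions.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Step 1: train the base classifiers as usual.
base_clfs = [KNeighborsClassifier(3), LogisticRegression(max_iter=1000)]
for clf in base_clfs:
    clf.fit(X_train, y_train)

# Step 2: split the test sample into L+ (classified correctly) and
# L- (classified incorrectly), and train one referee per classifier
# to predict membership in the competence area.
referees = []
for clf in base_clfs:
    in_l_plus = (clf.predict(X_test) == y_test).astype(int)  # 1 = L+, 0 = L-
    referees.append(RandomForestClassifier(random_state=0).fit(X_test, in_l_plus))

def competence(referee, x):
    """Estimated probability that the associated classifier is correct on x."""
    proba = referee.predict_proba(x)
    if proba.shape[1] == 1:  # edge case: referee saw only one class in training
        return float(referee.classes_[0])
    return proba[0, 1]

def classify(x):
    """Delegate the decision to the most competent base classifier."""
    x = np.asarray(x).reshape(1, -1)
    scores = [competence(r, x) for r in referees]
    return base_clfs[int(np.argmax(scores))].predict(x)[0]
```

In a real system the referees would be trained on data held out from the base classifiers' training, exactly as the two-step scheme prescribes; here the base classifiers' test split doubles as the referee training set for brevity.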
Neural network approaches
Neural network approaches to collective classification fall into three groups: methods that use a neural network to combine multiple classifiers, ensembles of neural networks, and neural networks built from modules.
A neural network for combining classifiers
One approach uses a neural network to combine the decisions of several base classifiers (Fig. 2). The output of each base classifier is a decision vector (a vector of "soft label" values) whose components lie in some numerical interval [a, b]. These vectors are fed to the input of a neural network that has been trained to combine the base classifiers' decisions. The output of the neural network is the final decision in favor of one class or another. The output can also be a vector whose dimension equals the number of classes to be recognized, holding at each position a degree of confidence in the corresponding class; in that case the class with the maximum confidence value is selected as the decision.
Decision integration works as follows:
- a number of base classifiers are selected and trained;
- metadata for training the neural network is prepared: the base classifiers are tested on a labeled data sample, a vector of the base classifiers' decisions is generated for each test instance, and a component holding the true class of that instance is appended to the vector;
- the metadata sample is used to train the neural network that performs decision integration.
Modular neural networks method
For modular neural networks it is proposed to use a so-called gating network: a neural network that assesses the classifiers' competence for a particular input vector. This is the neural network version of competence-based decision integration; the corresponding theory is known as the mixture of experts. A set of base classifiers is provided, and each classifier is associated with a "referee" that predicts its degree of competence with respect to a specific input (Fig. 3).
Depending on the input vector X, the decisions of different classifiers can be selected and combined to obtain a final decision. The number of inputs of the gating network equals the dimension of the feature space; the number of its outputs equals the number of classifiers L. Given a specific input vector, the gating network is trained to predict the degree of competence of each classifier, i.e. an estimate of the probability that the classifier will produce the correct decision. The degree of competence is represented by a number in the interval [0, 1].
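A minimal mixture-of-experts sketch along these lines, assuming scikit-learn; training the gate to regress per-expert correctness indicators is one simple way to obtain competence values in [0, 1], and all names here are illustrative:

```python
# Gating-network sketch: a multi-output MLP predicts each expert's
# competence for a given input; the final decision weights the experts'
# soft outputs by those competences. All choices are illustrative.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPRegressor

X, y = load_digits(return_X_y=True)
X_train, X_gate, y_train, y_gate = train_test_split(X, y, random_state=0)

experts = [KNeighborsClassifier(3), LogisticRegression(max_iter=1000)]
for clf in experts:
    clf.fit(X_train, y_train)

# Gate targets: one value in [0, 1] per expert
# (1 if the expert classified the instance correctly, else 0).
targets = np.column_stack(
    [(clf.predict(X_gate) == y_gate).astype(float) for clf in experts]
)
gate = MLPRegressor(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
gate.fit(X_gate, targets)

def predict(X_new):
    """Combine expert outputs weighted by predicted competence."""
    w = np.clip(gate.predict(X_new), 0.0, 1.0)          # (n, L) competences
    probs = [clf.predict_proba(X_new) for clf in experts]
    mix = sum(w[:, i:i + 1] * p for i, p in enumerate(probs))
    return np.argmax(mix, axis=1)
```

The classical mixture-of-experts formulation trains the gate and the experts jointly with a softmax gate; the decoupled two-stage version above is only the simplest approximation of the idea.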
Ensembles of Neural Networks
A decision integration architecture consisting of several "experts" (neural networks) has also been proposed. Combining the knowledge of neural networks in an ensemble has proved effective, demonstrating the promise of collective recognition technology for overcoming the problem of "fragility".
An ensemble of neural networks is a set of neural network models that makes a decision by averaging the results of the individual models. Depending on how the ensemble is constructed, it addresses one of two problems: the tendency of the base architecture to underfit (addressed by the boosting meta-algorithm) or its tendency to overfit (addressed by the bagging meta-algorithm).
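The two meta-algorithms can be contrasted with scikit-learn's stock implementations; the use of the default tree-based members (rather than neural networks) and the estimator counts below are illustrative assumptions, chosen only to keep the sketch short.

```python
# Bagging vs. boosting on the digits data, using scikit-learn defaults.
from sklearn.datasets import load_digits
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Bagging: each member is trained on a bootstrap resample and the
# ensemble averages their votes -- this mainly combats overfitting.
bagging = BaggingClassifier(n_estimators=25, random_state=0)
bagging.fit(X_train, y_train)

# Boosting: members are trained sequentially, each focusing on the
# mistakes of its predecessors -- this mainly combats underfitting.
boosting = AdaBoostClassifier(n_estimators=50, random_state=0)
boosting.fit(X_train, y_train)

bag_acc = bagging.score(X_test, y_test)
boost_acc = boosting.score(X_test, y_test)
```

The same meta-algorithms apply unchanged when the base models are neural networks, at correspondingly higher training cost.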
There are various universal voting schemes, in which the winner is the class:
- maximum: with the maximum response among the ensemble members;
- average: with the highest average response of the ensemble members;
- majority: with the largest number of votes from the ensemble members.
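The three schemes can disagree. A tiny NumPy illustration with made-up soft outputs from three ensemble members over three classes (the numbers are fabricated purely to show the difference):

```python
# Maximum, average, and majority voting on illustrative soft outputs.
import numpy as np

# responses[i, j] = confidence of ensemble member i in class j
responses = np.array([
    [0.90, 0.05, 0.05],
    [0.05, 0.55, 0.40],
    [0.05, 0.50, 0.45],
])

maximum = int(np.argmax(responses.max(axis=0)))    # single highest response
average = int(np.argmax(responses.mean(axis=0)))   # highest mean response
votes = np.bincount(responses.argmax(axis=1), minlength=3)
majority = int(np.argmax(votes))                   # most members' first choice
# maximum -> 0 (one very confident member), average and majority -> 1
```

Here one overconfident member sways the maximum rule, while the average and majority rules side with the two agreeing members.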
My question
Which scheme of collective recognition is most suitable for symbol/digit recognition?
Mostly, I will work with digits. The sources from which I took the information about the various schemes of collective classification date from 2006, and I am afraid that some of the methods may be obsolete.
I have mentioned the following methods:
- on the basis of competence areas
- on the basis of using a neural network to combine classifiers
- on the basis of modular neural networks
- on the basis of neural network ensembles.
Which method can potentially show better classification accuracy and performance in the task of symbol/digit recognition?
Or are there newer, more efficient methods of collective recognition/classification that I have not mentioned here?