Classifier chains is a machine learning method for problem transformation in multi-label classification. It combines the computational efficiency of the Binary Relevance method with the ability to take label dependencies into account during classification.
Problem transformation
Problem transformation methods transform a multi-label classification problem into one or more single-label classification problems. In this way, existing single-label classification algorithms such as SVM and Naive Bayes can be used without modification.
Several problem transformation methods exist. One of them is the Binary Relevance method (BR). Given a set of labels L = {λ1, …, λk}, BR trains one independent binary classifier per label; each classifier decides, from the instance's features alone, whether its label applies. BR is computationally efficient, but it ignores any correlations between labels.
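A minimal sketch of BR, assuming any binary base classifier with fit/predict would do; the `NearestCentroid` stand-in below is an illustrative assumption, not part of the method:

```python
import numpy as np

class NearestCentroid:
    """Trivial binary base classifier used as a stand-in; any classifier
    with fit/predict would work (assumes both classes occur in y)."""
    def fit(self, X, y):
        self.c0 = X[y == 0].mean(axis=0)
        self.c1 = X[y == 1].mean(axis=0)
        return self
    def predict(self, X):
        d0 = np.linalg.norm(X - self.c0, axis=1)
        d1 = np.linalg.norm(X - self.c1, axis=1)
        return (d1 < d0).astype(int)

class BinaryRelevance:
    """One independent binary classifier per label column of Y."""
    def fit(self, X, Y):
        # Y is an (n_samples, n_labels) 0/1 matrix
        self.models = [NearestCentroid().fit(X, Y[:, j]) for j in range(Y.shape[1])]
        return self
    def predict(self, X):
        # each model predicts its own label, with no interaction between labels
        return np.column_stack([m.predict(X) for m in self.models])

# Toy data: label 1 fires when x0 > 0, label 2 when x1 > 0
X = np.array([[1., 1.], [1., -1.], [-1., 1.], [-1., -1.]])
Y = np.array([[1, 1], [1, 0], [0, 1], [0, 0]])
pred = BinaryRelevance().fit(X, Y).predict(X)
```

Because the per-label classifiers are trained independently, any dependence between the two labels is simply never modeled.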
Another approach, which takes label correlations into account, is the Label Powerset method (LP). Each distinct combination of labels occurring in the data set is treated as a single class. After transformation, an ordinary single-label classifier is trained over these combined classes. A drawback of LP is that the number of possible label combinations grows exponentially with the number of labels, and many combinations may appear only rarely in the training data.
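The LP transformation itself is a simple mapping from label combinations to class ids; a sketch on hypothetical toy targets:

```python
# Toy multi-label targets: each row is one instance's 0/1 label vector
# (illustrative data, not from any particular data set)
Y = [(1, 0), (0, 1), (1, 0), (1, 1)]

combos = {}   # label combination -> single-class id
y_lp = []     # transformed single-label targets
for row in Y:
    if row not in combos:
        combos[row] = len(combos)   # assign the next class id to a new combination
    y_lp.append(combos[row])
# y_lp can now be fed to any standard single-label classifier;
# predicting a class id recovers a full label combination via combos
```

Note that only combinations seen during training get a class id, so LP can never predict a label set that did not occur in the training data.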
The Classifier Chains (CC) method is based on the BR method and remains efficient even for a large number of labels. Furthermore, it takes dependencies between labels into account.
Method description
For a given set of labels L = {λ1, …, λk}, a chain of k binary classifiers C1, …, Ck is built, one per label, in some fixed order.
Given a data set where each instance is associated with a subset of L, classifier Ci is trained on an augmented input space: each instance's feature vector is extended with the binary values of the labels λ1, …, λ(i−1) that precede λi in the chain. In this way each classifier can exploit the label information passed along by all earlier classifiers.
When classifying new instances, the labels are again predicted by traversing the chain of classifiers. Classification begins with the first classifier C1, which predicts the first label from the features alone; each subsequent classifier Ci appends the predictions of the previous classifiers to the instance's features and predicts the i-th label. Prediction errors made early in the chain can therefore propagate to later classifiers.
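The training and prediction steps above can be sketched as follows; the `centroid_fit`/`centroid_predict` base learner is a trivial stand-in (an assumption for illustration; any binary classifier would do):

```python
import numpy as np

def centroid_fit(X, y):
    # Stand-in base learner: remember the mean feature vector of each class
    # (assumes both classes occur in y)
    return (X[y == 0].mean(axis=0), X[y == 1].mean(axis=0))

def centroid_predict(model, X):
    c0, c1 = model
    return (np.linalg.norm(X - c1, axis=1) < np.linalg.norm(X - c0, axis=1)).astype(int)

def cc_fit(X, Y):
    """Train the chain: classifier i sees the TRUE values of labels
    0..i-1 as extra input features."""
    models = []
    for i in range(Y.shape[1]):
        Xi = np.hstack([X, Y[:, :i]])
        models.append(centroid_fit(Xi, Y[:, i]))
    return models

def cc_predict(models, X):
    """Traverse the chain: feed earlier PREDICTIONS to later classifiers."""
    preds = np.empty((X.shape[0], len(models)))
    for i, m in enumerate(models):
        Xi = np.hstack([X, preds[:, :i]])
        preds[:, i] = centroid_predict(m, Xi)
    return preds.astype(int)

# Toy data: label 1 fires when x0 > 0, label 2 when x1 > 0
X = np.array([[1., 1.], [1., -1.], [-1., 1.], [-1., -1.]])
Y = np.array([[1, 1], [1, 0], [0, 1], [0, 0]])
models = cc_fit(X, Y)
out = cc_predict(models, X)
```

Note the asymmetry: training uses the true earlier labels, while prediction must use the (possibly wrong) earlier predictions, which is why early errors can propagate down the chain.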
In the Ensemble of Classifier Chains (ECC), several CC classifiers are trained with random chain orders (i.e. random orders of the labels) on random subsets of the data set. The labels of a new instance are predicted by each classifier separately. Then the total number of predictions, or "votes", is counted for each label, and a label is accepted if the fraction of classifiers that predicted it exceeds some threshold value.
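The voting step can be sketched in isolation; the per-chain prediction matrices below are hypothetical outputs of three already-trained CC classifiers:

```python
import numpy as np

# Hypothetical 0/1 predictions of three trained CC classifiers,
# for two instances and three labels (rows = instances, cols = labels)
chain_preds = [
    np.array([[1, 0, 1], [0, 1, 0]]),
    np.array([[1, 1, 1], [0, 1, 1]]),
    np.array([[1, 0, 0], [1, 1, 0]]),
]

votes = sum(chain_preds) / len(chain_preds)  # vote share per instance/label
threshold = 0.5                              # illustrative threshold choice
final = (votes > threshold).astype(int)      # accept labels above the threshold
```

Averaging over chains with different label orders reduces the sensitivity of CC to any single (possibly unfortunate) chain order.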
Another extension of CC, related to ECC, is Monte Carlo CC (MCC), which employs Monte Carlo methods to find a good chain order and to perform efficient inference. Other variants of CC, using different random search methods or considering different dependence structures among the classifiers, have been proposed in the literature.