Stability, also known as algorithmic stability, is a notion in computational learning theory of how a machine learning algorithm is perturbed by small changes to its inputs. A stable learning algorithm is one for which the prediction does not change much when the training data is modified slightly. For instance, consider a machine learning algorithm that is being trained to recognize handwritten letters of the alphabet, using 1000 examples of handwritten letters and their labels ("A" to "Z") as a training set. One way to modify this training set is to leave out an example, so that only 999 examples of handwritten letters and their labels are available. A stable learning algorithm would produce a similar classifier with both the 1000-element and 999-element training sets.
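The leave-one-out comparison described above can be made concrete with a small experiment. The following is a minimal sketch, not taken from the article: it uses a toy 1-nearest-neighbour classifier on synthetic two-dimensional data in place of a handwriting recognizer, and all data, sizes, and function names are illustrative assumptions.

```python
# Sketch: compare a classifier trained on all m examples with one trained on m - 1
# examples (one example left out), and measure how often their predictions differ.
import numpy as np

rng = np.random.default_rng(0)

def predict_1nn(train_x, train_y, query_x):
    """Predict labels for query_x with a 1-nearest-neighbour rule."""
    dists = np.linalg.norm(query_x[:, None, :] - train_x[None, :, :], axis=-1)
    return train_y[np.argmin(dists, axis=1)]

# Toy training set: two Gaussian blobs standing in for the 1000 handwritten letters.
m = 1000
train_x = np.concatenate([rng.normal(0, 1, (m // 2, 2)), rng.normal(3, 1, (m // 2, 2))])
train_y = np.concatenate([np.zeros(m // 2, dtype=int), np.ones(m // 2, dtype=int)])

# Fresh test points on which the two classifiers are compared.
test_x = np.concatenate([rng.normal(0, 1, (500, 2)), rng.normal(3, 1, (500, 2))])

full_pred = predict_1nn(train_x, train_y, test_x)                       # trained on m examples
loo_pred = predict_1nn(np.delete(train_x, 0, axis=0),                   # trained on m - 1 examples
                       np.delete(train_y, 0), test_x)

disagreement = np.mean(full_pred != loo_pred)
print(f"fraction of predictions that change after removing one example: {disagreement:.4f}")
```

For a stable learner, the printed disagreement stays small and shrinks as the training set grows.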
Stability can be studied for many types of learning problems, from language learning to inverse problems in physics and engineering, as it is a property of the learning process rather than the type of information being learned. The study of stability gained importance in computational learning theory in the 2000s when it was shown to have a connection with generalization. It was shown that for large classes of learning algorithms, notably empirical risk minimization algorithms, certain types of stability ensure good generalization.
History
A central goal in designing a machine learning system is to guarantee that the learning algorithm will generalize, or perform accurately on new examples after being trained on a finite number of them. In the 1990s, milestones were reached in obtaining generalization bounds for supervised learning algorithms. The technique historically used to prove generalization was to show that an algorithm was consistent, using the uniform convergence properties of empirical quantities to their means. This technique was used to obtain generalization bounds for the large class of empirical risk minimization (ERM) algorithms. An ERM algorithm is one that selects a solution from a hypothesis space $H$ in such a way as to minimize the empirical error on a training set $S$.
A general result, proved by Vladimir Vapnik for ERM binary classification algorithms, is that for any target function and input distribution, any hypothesis space $H$ with VC-dimension $d$, and $n$ training examples, the algorithm is consistent and will produce a training error that is at most $O\!\left(\sqrt{d/n}\right)$ (plus logarithmic factors) from the true error.
Vapnik's work, using what became known as VC theory, established a relationship between the generalization of a learning algorithm and properties of the hypothesis space $H$ of functions being learned. However, these results could not be applied to algorithms with hypothesis spaces of unbounded VC-dimension; some of the simplest machine learning algorithms, for instance basic regression and language-learning algorithms, have hypothesis spaces of this kind.
Stability analysis was developed in the 2000s for computational learning theory and is an alternative method for obtaining generalization bounds. The stability of an algorithm is a property of the learning process, rather than a direct property of the hypothesis space $H$, so it can be assessed even for algorithms whose hypothesis spaces have unbounded or undefined VC-dimension, such as nearest neighbor.
Summary of classic results
Preliminary definitions
We define several terms related to learning algorithms and their training sets, so that we can then define stability in multiple ways and present theorems from the field.
A machine learning algorithm, also known as a learning map $L$, maps a training data set, which is a set of labeled examples $(x, y)$, onto a function $f$ from $X$ to $Y$, where $X$ and $Y$ are in the same space as the training examples. The functions $f$ are selected from a hypothesis space of functions called $H$.
The training set from which an algorithm learns is defined as
$S = \{\, z_1 = (x_1, y_1),\ \ldots,\ z_m = (x_m, y_m) \,\}$
and is of size $m$ in $Z = X \times Y$, drawn i.i.d. from an unknown distribution D.
Thus, the learning map $L$ is a mapping from $Z^m$ into $H$, taking a training set $S$ onto a function $f_S$ from $X$ to $Y$. Here we consider only deterministic algorithms for which $L$ is symmetric with respect to $S$, i.e. it does not depend on the order of the elements in the training set.
The loss $V$ of a hypothesis $f$ with respect to an example $z = (x, y)$ is defined as $V(f, z) = V(f(x), y)$.
The empirical error of $f$ is $I_S[f] = \frac{1}{m}\sum_{i=1}^{m} V(f, z_i)$.
The true error of $f$ is $I[f] = \mathbb{E}_z\, V(f, z)$.
Given a training set S of size m, we will build, for all i = 1, ..., m, modified training sets as follows (see the sketch after these definitions):
- By removing the i-th element: $S^{|i} = \{z_1, \ldots, z_{i-1},\ z_{i+1}, \ldots, z_m\}$
- By replacing the i-th element: $S^{i} = \{z_1, \ldots, z_{i-1},\ z_i',\ z_{i+1}, \ldots, z_m\}$
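As a minimal sketch (illustrative, not from the article), the two modified training sets can be built from a training set represented as a Python list of $(x, y)$ pairs; the helper names are assumptions made here for clarity.

```python
# Sketch: construct S^{|i} (i-th example removed) and S^{i} (i-th example replaced).
def remove_ith(S, i):
    """S^{|i}: the training set with the i-th example removed (0-based index here)."""
    return S[:i] + S[i + 1:]

def replace_ith(S, i, z_new):
    """S^{i}: the training set with the i-th example replaced by a fresh example z_new."""
    return S[:i] + [z_new] + S[i + 1:]

# Example usage with a toy training set of labelled points.
S = [((0.0, 1.0), 0), ((1.0, 2.0), 1), ((2.0, 0.5), 0)]
print(remove_ith(S, 1))                       # two examples remain
print(replace_ith(S, 1, ((9.0, 9.0), 1)))     # same size, one example swapped
```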
Hypothesis Stability
An algorithm $L$ has hypothesis stability $\beta$ with respect to the loss function $V$ if the following holds: $\forall i \in \{1, \ldots, m\},\ \mathbb{E}_{S,z}\left[\,\left|V(f_S, z) - V(f_{S^{|i}}, z)\right|\,\right] \leq \beta$.
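The expectation in this definition can be approximated numerically. The following is a rough Monte-Carlo sketch (an illustrative assumption, not from the article) for a deliberately simple learner, the constant predictor $f_S(x) = $ mean of the training labels, with squared loss; the distribution, sample size, and variable names are all toy choices.

```python
# Sketch: estimate E_{S,z} |V(f_S, z) - V(f_{S^{|i}}, z)| for a constant-mean predictor.
import numpy as np

rng = np.random.default_rng(1)
m, trials, i = 50, 2000, 0          # training-set size, Monte-Carlo repetitions, index left out

diffs = []
for _ in range(trials):
    y = rng.normal(0.0, 1.0, m)     # labels of the training set S (inputs play no role here)
    y_loo = np.delete(y, i)         # labels of S^{|i}
    f_S, f_loo = y.mean(), y_loo.mean()
    y_new = rng.normal(0.0, 1.0)    # a fresh example z drawn from the same distribution
    diffs.append(abs((f_S - y_new) ** 2 - (f_loo - y_new) ** 2))

print(f"estimated hypothesis-stability beta for m={m}: {np.mean(diffs):.5f}")
```

Rerunning the sketch with larger m shows the estimated $\beta$ shrinking, which is the behaviour the stability definitions below quantify.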
Point-wise Hypothesis Stability
An algorithm $L$ has point-wise hypothesis stability $\beta$ with respect to the loss function $V$ if the following holds: $\forall i \in \{1, \ldots, m\},\ \mathbb{E}_{S}\left[\,\left|V(f_S, z_i) - V(f_{S^{|i}}, z_i)\right|\,\right] \leq \beta$.
Error Stability
An algorithm $L$ has error stability $\beta$ with respect to the loss function $V$ if the following holds: $\forall S \in Z^m,\ \forall i \in \{1, \ldots, m\},\ \left|\mathbb{E}_z[V(f_S, z)] - \mathbb{E}_z[V(f_{S^{|i}}, z)]\right| \leq \beta$.
Uniform Stability
An algorithm $L$ has uniform stability $\beta$ with respect to the loss function $V$ if the following holds: $\forall S \in Z^m,\ \forall i \in \{1, \ldots, m\},\ \sup_{z \in Z} \left|V(f_S, z) - V(f_{S^{|i}}, z)\right| \leq \beta$.
A probabilistic version of uniform stability $\beta$ is: $\forall S \in Z^m,\ \forall i \in \{1, \ldots, m\},\ \mathbb{P}_S\left\{\sup_{z \in Z} \left|V(f_S, z) - V(f_{S^{|i}}, z)\right| \leq \beta\right\} \geq 1 - \delta$.
An algorithm is said to be stable when the value of $\beta$ decreases as $O\!\left(\tfrac{1}{m}\right)$.
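For example, Bousquet and Elisseeff (02) show that Tikhonov regularization in a reproducing kernel Hilbert space with a bounded kernel ($k(x, x) \leq \kappa^2$) and a $\sigma$-admissible (Lipschitz-like) loss has uniform stability of roughly $\beta \leq \frac{\sigma^2 \kappa^2}{2 \lambda m}$, where $\lambda$ is the regularization parameter; for fixed $\lambda$ this is $O\!\left(\tfrac{1}{m}\right)$, so such algorithms are stable in the above sense.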
Leave-one-out cross-validation (CVloo) Stability
An algorithm $L$ has CVloo stability $\beta$ with respect to the loss function $V$ if the following holds: $\forall i \in \{1, \ldots, m\},\ \mathbb{P}_S\left\{\left|V(f_S, z_i) - V(f_{S^{|i}}, z_i)\right| \leq \beta_{CV}\right\} \geq 1 - \delta_{CV}$.
The definition of CVloo stability is equivalent to the point-wise hypothesis stability seen earlier.
Expected-leave-one-out error ($Eloo_{err}$) Stability
An algorithm $L$ has $Eloo_{err}$ stability if for each $m$ there exist a $\beta_{EL}^{m}$ and a $\delta_{EL}^{m}$ such that $\forall i \in \{1, \ldots, m\},\ \mathbb{P}_S\left\{\left|I[f_S] - \frac{1}{m}\sum_{i=1}^{m} V(f_{S^{|i}}, z_i)\right| \leq \beta_{EL}^{m}\right\} \geq 1 - \delta_{EL}^{m}$, with $\beta_{EL}^{m}$ and $\delta_{EL}^{m}$ going to zero as $m \to \infty$.
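The leave-one-out quantity $\frac{1}{m}\sum_{i=1}^{m} V(f_{S^{|i}}, z_i)$ is exactly the ordinary leave-one-out cross-validation error. As a minimal sketch (an illustrative assumption, not from the article), it can be computed for a simple ridge-regression learner with squared loss and compared against the empirical error $I_S[f_S]$; the data, dimensions, and regularization value below are toy choices.

```python
# Sketch: empirical error I_S[f_S] versus the leave-one-out error (1/m) * sum_i V(f_{S^{|i}}, z_i)
# for closed-form ridge regression (a Tikhonov-regularized learner).
import numpy as np

rng = np.random.default_rng(2)
m, lam = 100, 1.0                                   # training-set size and regularization parameter
X = rng.normal(size=(m, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.3, size=m)

def ridge_fit(X, y, lam):
    """Closed-form ridge regression weights."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

w_full = ridge_fit(X, y, lam)
empirical_error = np.mean((X @ w_full - y) ** 2)    # I_S[f_S]

loo_losses = []
for i in range(m):
    w_i = ridge_fit(np.delete(X, i, axis=0), np.delete(y, i), lam)  # trained on S^{|i}
    loo_losses.append((X[i] @ w_i - y[i]) ** 2)                     # evaluated on the left-out z_i

print(f"empirical error:      {empirical_error:.4f}")
print(f"leave-one-out error:  {np.mean(loo_losses):.4f}")
```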
Classic theorems
From Bousquet and Elisseeff (02):
For symmetric learning algorithms with bounded loss, if the algorithm has Uniform Stability with the probabilistic definition above, then the algorithm generalizes.
Uniform Stability is a strong condition which is not met by all algorithms but is, surprisingly, met by the large and important class of Regularization algorithms. The generalization bound is given in the article.
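One commonly quoted form of this bound, for an algorithm with uniform stability $\beta$ and a loss bounded by $M$, states that with probability at least $1 - \delta$ over the draw of the training set, $I[f_S] \leq I_S[f_S] + 2\beta + (4m\beta + M)\sqrt{\frac{\ln(1/\delta)}{2m}}$.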
From Mukherjee et al. (06):
For symmetric learning algorithms with bounded loss, if the algorithm has both Leave-one-out cross-validation (CVloo) Stability and Expected-leave-one-out error ($Eloo_{err}$) Stability as defined above, then the algorithm generalizes; for ERM algorithms, CVloo Stability is both necessary and sufficient for consistency and generalization.
This is an important result for the foundations of learning theory, because it shows that two previously unrelated properties of an algorithm, stability and consistency, are equivalent for ERM (and certain loss functions). The generalization bound is given in the article.
Algorithms that are stable
This is a list of algorithms that have been shown to be stable, and the article where the associated generalization bounds are provided.