
Cover's theorem


Cover's theorem is a statement in computational learning theory and is one of the primary theoretical motivations for the use of non-linear kernel methods in machine learning applications. The theorem states that, given a set of training data that is not linearly separable, one can with high probability transform it into a training set that is linearly separable by projecting it into a higher-dimensional space via some non-linear transformation. The theorem is named after the information theorist Thomas M. Cover, who stated it in 1965.
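
In quantitative form, the result rests on Cover's function counting theorem, commonly stated as follows (the notation N, d and C(N, d) is introduced here only for illustration): for N points in general position in d-dimensional space, the number of dichotomies (labelings of the points into two classes) that can be realized by a linear separator is

C(N, d) = 2 \sum_{k=0}^{d-1} \binom{N-1}{k}

so the probability that a randomly chosen dichotomy is linearly separable is C(N, d) / 2^N. This probability approaches 1 as d grows relative to N, which is the sense in which mapping the data non-linearly into a higher-dimensional space makes linear separability more likely.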

The proof is straightforward, and a deterministic mapping suffices. Suppose there are n samples. Lift them onto the vertices of the standard simplex in (n − 1)-dimensional real space; because these vertices are affinely independent, every partition of the samples into two sets is separable by a linear separator. A minimal numerical sketch of this construction is given below.
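
A minimal sketch of the argument in Python (assuming NumPy; the variable names are chosen here purely for illustration): each of n samples is lifted onto a distinct vertex of the simplex, i.e. a standard basis vector of R^n, and the label vector itself then serves as the normal of a separating hyperplane for any chosen partition.

import numpy as np

rng = np.random.default_rng(0)

n = 10                                # number of samples
labels = rng.choice([-1, 1], size=n)  # an arbitrary partition into two sets

# Lift the i-th sample onto the i-th vertex of the simplex,
# i.e. the i-th standard basis vector of R^n.
lifted = np.eye(n)

# The label vector is a valid separating normal: w . e_i = labels[i],
# so the hyperplane {x : w . x = 0} separates the two sets exactly.
w = labels.astype(float)
scores = lifted @ w

assert np.all(np.sign(scores) == labels)
print("separable for this arbitrary labeling; scores:", scores)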

The result is often summarized as follows: a complex pattern-classification problem, cast in a high-dimensional space nonlinearly, is more likely to be linearly separable than in a low-dimensional space, provided that the space is not densely populated.
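
A standard illustration is the XOR-style problem: four points in the plane labeled by the product of their coordinates' signs cannot be separated by any line, but a simple non-linear lift makes them separable. The sketch below (again assuming NumPy; the particular feature map x -> (x1, x2, x1*x2) is one convenient choice, not prescribed by the theorem) verifies this.

import numpy as np

# XOR-style data: the label is the product of the coordinates' signs,
# which no line in the plane can separate.
X = np.array([[-1, -1], [-1, 1], [1, -1], [1, 1]], dtype=float)
y = X[:, 0] * X[:, 1]

# Non-linear lift into three dimensions: append the product feature x1 * x2.
phi = np.column_stack([X, X[:, 0] * X[:, 1]])

# In the lifted space, the hyperplane with normal (0, 0, 1) separates the classes.
w = np.array([0.0, 0.0, 1.0])
assert np.all(np.sign(phi @ w) == y)
print("separable after the non-linear lift; scores:", phi @ w)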
