The sample complexity of a machine learning algorithm represents the number of training samples that it needs in order to successfully learn a target function.
More precisely, the sample complexity is the number of training samples that we need to supply to the algorithm, so that the function returned by the algorithm is within an arbitrarily small error of the best possible function, with probability arbitrarily close to 1.
There are two variants of sample complexity:
- The weak variant fixes a particular input-output distribution;
- The strong variant takes the worst-case sample complexity over all input-output distributions.
The No Free Lunch theorem, discussed below, proves that, in general, the strong sample complexity is infinite, i.e., there is no algorithm that can learn the globally optimal target function using a finite number of training samples.
However, if we are only interested in a particular class of target functions (e.g., linear functions), then the sample complexity is finite, and it depends linearly on the VC dimension of the class of target functions.
Definition
Let $X$ be a space which we call the input space, and $Y$ be a space which we call the output space, and let $Z$ denote the product $X \times Y$. For example, in the setting of binary classification, $X$ is typically a finite-dimensional vector space and $Y$ is the set $\{-1, +1\}$.
Fix a hypothesis space $\mathcal{H}$ of functions $h \colon X \to Y$. A learning algorithm over $\mathcal{H}$ is a computable map from $Z^*$ to $\mathcal{H}$; that is, it is an algorithm that takes as input a finite sequence of training samples and outputs a function from $X$ to $Y$. Typical learning algorithms include empirical risk minimization, with or without Tikhonov regularization.
Fix a loss function $L \colon Y \times Y \to \mathbb{R}_{\geq 0}$, for example the square loss $L(y, y') = (y - y')^2$. For a given distribution $\rho$ on $X \times Y$, the expected risk of a hypothesis $h \in \mathcal{H}$ is
$$\mathcal{E}(h) := \mathbb{E}_\rho[L(h(x), y)] = \int_{X \times Y} L(h(x), y) \, d\rho(x, y).$$
In our setting, we have $h = \mathcal{A}(S_n)$, where $\mathcal{A}$ is a learning algorithm and $S_n = ((x_1, y_1), \ldots, (x_n, y_n)) \sim \rho^n$ is a sequence of samples drawn independently from $\rho$. Define the optimal risk $\mathcal{E}^*_{\mathcal{H}} = \inf_{h \in \mathcal{H}} \mathcal{E}(h)$ and set $h_n = \mathcal{A}(S_n)$ for each $n$; note that $h_n$ is a random variable, since it depends on the random sample $S_n$. The algorithm $\mathcal{A}$ is called consistent if $\mathcal{E}(h_n)$ converges in probability to $\mathcal{E}^*_{\mathcal{H}}$, that is, for all $\epsilon, \delta > 0$ there exists a positive integer $N$ such that, for all $n \geq N$,
$$\Pr_{\rho^n}\left[\mathcal{E}(h_n) - \mathcal{E}^*_{\mathcal{H}} > \epsilon\right] < \delta.$$
The sample complexity of $\mathcal{A}$ is then the minimum $N$ for which this holds, as a function of $\rho$, $\epsilon$, and $\delta$; we write it as $N(\rho, \epsilon, \delta)$. If $\mathcal{A}$ is not consistent, we set $N(\rho, \epsilon, \delta) = \infty$. If there exists an algorithm for which $N(\rho, \epsilon, \delta)$ is finite, we say that the hypothesis space $\mathcal{H}$ is learnable.
In words, the sample complexity $N(\rho, \epsilon, \delta)$ defines the rate of consistency of the algorithm: given a desired accuracy $\epsilon$ and confidence $\delta$, one needs to sample $N(\rho, \epsilon, \delta)$ data points to guarantee that the risk of the output function is within $\epsilon$ of the best possible, with probability at least $1 - \delta$.
In probably approximately correct (PAC) learning, one is concerned with whether the sample complexity is polynomial, that is, whether $N(\rho, \epsilon, \delta)$ is bounded by a polynomial in $1/\epsilon$ and $1/\delta$. If it is polynomial for some learning algorithm, then the hypothesis space $\mathcal{H}$ is said to be PAC-learnable. This is a stronger notion than learnability.
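As a concrete illustration of consistency, the following minimal Python sketch trains a one-dimensional threshold classifier by empirical risk minimization and shows its estimated excess risk shrinking as the sample size grows. The distribution, hypothesis class, and sample sizes are illustrative assumptions chosen for this example, not part of the definition.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample(n):
    """Illustrative distribution rho: x ~ Uniform[0, 1]; the label y is +1
    with probability 0.9 when x > 0.5 and with probability 0.1 otherwise."""
    x = rng.uniform(size=n)
    p_plus = np.where(x > 0.5, 0.9, 0.1)
    y = np.where(rng.uniform(size=n) < p_plus, 1, -1)
    return x, y

def erm_threshold(x, y):
    """Empirical risk minimization over threshold classifiers h_t(x) = sign(x - t)."""
    candidates = np.concatenate(([0.0], np.sort(x)))
    errors = [np.mean(np.where(x > t, 1, -1) != y) for t in candidates]
    return candidates[int(np.argmin(errors))]

# Large held-out sample, used only to estimate the true risk of a threshold.
x_test, y_test = sample(200_000)

def est_risk(t):
    return np.mean(np.where(x_test > t, 1, -1) != y_test)

best = est_risk(0.5)  # best-in-class risk; about 0.1 for this distribution
for n in (10, 100, 1_000, 10_000):
    t_hat = erm_threshold(*sample(n))
    # Estimated excess risk E(h_n) - E*_H; it shrinks as n grows.
    print(n, round(est_risk(t_hat) - best, 4))
```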
Unrestricted hypothesis space: infinite sample complexity
One can ask whether there exists a learning algorithm so that the sample complexity is finite in the strong sense, that is, there is a bound on the number of samples needed so that the algorithm can learn any distribution over the input-output space with a specified target error. More formally, one asks whether there exists a learning algorithm $\mathcal{A}$ such that, for all $\epsilon, \delta > 0$, there exists a positive integer $N$ such that, for all $n \geq N$,
$$\sup_\rho \Pr_{\rho^n}\left[\mathcal{E}(h_n) - \mathcal{E}^*_{\mathcal{H}} > \epsilon\right] < \delta,$$
where $h_n = \mathcal{A}(S_n)$ with $S_n \sim \rho^n$ as above. The No Free Lunch theorem says that without restrictions on the hypothesis space $\mathcal{H}$, this is not the case: there always exist "bad" distributions for which the sample complexity is arbitrarily large.
Thus, in order to make statements about the rate of convergence of the quantity
$$\sup_\rho \left[\mathcal{E}(h_n) - \mathcal{E}^*_{\mathcal{H}}\right],$$
one must either constrain the space of probability distributions $\rho$, e.g. via a parametric approach, or constrain the space of hypotheses $\mathcal{H}$, as in distribution-free approaches.
Restricted hypothesis space: finite sample complexity
The latter approach leads to concepts such as VC dimension and Rademacher complexity, which control the complexity of the space $\mathcal{H}$. A smaller hypothesis space introduces more bias into the inference process, meaning that $\mathcal{E}^*_{\mathcal{H}}$ may be greater than the best possible risk in a larger space. However, by restricting the complexity of the hypothesis space, it becomes possible for an algorithm to produce more uniformly consistent functions, and the sample complexity becomes finite.
It is a theorem from VC theory that the following three statements are equivalent for a hypothesis space $\mathcal{H}$:
- $\mathcal{H}$ is PAC-learnable.
- The VC dimension of $\mathcal{H}$ is finite.
- $\mathcal{H}$ is a uniform Glivenko–Cantelli class.
This gives a way to prove that certain hypothesis spaces are PAC learnable, and by extension, learnable.
An example of a PAC-learnable hypothesis space
Let $X = \mathbb{R}^d$, $Y = \{-1, 1\}$, and let $\mathcal{H}$ be the space of affine classifiers on $X$, that is, functions of the form $x \mapsto \operatorname{sign}(\langle w, x \rangle + b)$ for some $w \in \mathbb{R}^d$, $b \in \mathbb{R}$. This is the problem of linear classification with offset. The VC dimension of $\mathcal{H}$ is $d + 1$, which is finite. It then follows from the equivalence above that $\mathcal{H}$ is PAC-learnable, and by extension, learnable.
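The finiteness of the VC dimension can also be checked computationally on small instances. The sketch below is an illustration, with point sets and helper functions chosen for this example: it uses a linear-programming feasibility test to verify that affine classifiers on $\mathbb{R}^2$ shatter three points in general position, but cannot realize every labeling of the four corners of a square (the "XOR" labeling fails).

```python
import itertools
import numpy as np
from scipy.optimize import linprog

def separable(points, labels):
    """Return True if some affine classifier sign(<w, x> + b) realizes `labels`."""
    d = points.shape[1]
    # Feasibility LP in variables (w, b): require y_i * (<w, x_i> + b) >= 1.
    A_ub = -labels[:, None] * np.hstack([points, np.ones((len(points), 1))])
    b_ub = -np.ones(len(points))
    res = linprog(c=np.zeros(d + 1), A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * (d + 1))
    return res.success

def shattered(points):
    """True if every labeling of `points` in {-1, +1} is realizable."""
    return all(separable(points, np.array(lab))
               for lab in itertools.product([-1, 1], repeat=len(points)))

three = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])               # general position
square = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # a square

print(shattered(three))   # True: 3 points can be shattered
print(shattered(square))  # False: the "XOR" labeling is not linearly separable
```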
Sample complexity bounds
Suppose $\mathcal{H}$ is a class of binary functions (functions to $\{0, 1\}$). Then $\mathcal{H}$ is $(\epsilon, \delta)$-PAC-learnable with a sample of size
$$N = O\!\left(\frac{VC(\mathcal{H}) + \ln\frac{1}{\delta}}{\epsilon}\right),$$
where $VC(\mathcal{H})$ is the VC dimension of $\mathcal{H}$. Moreover, any $(\epsilon, \delta)$-PAC-learning algorithm for $\mathcal{H}$ must have sample complexity
$$N = \Omega\!\left(\frac{VC(\mathcal{H}) + \ln\frac{1}{\delta}}{\epsilon}\right);$$
thus, the sample complexity is a linear function of the VC dimension of the hypothesis space.
Suppose $\mathcal{H}$ is a class of real-valued functions with range in $[0, T]$. Then $\mathcal{H}$ is $(\epsilon, \delta)$-PAC-learnable with a sample of size
$$N = O\!\left(\frac{T^2}{\epsilon^2}\left(PD(\mathcal{H}) \ln\frac{T}{\epsilon} + \ln\frac{1}{\delta}\right)\right),$$
where $PD(\mathcal{H})$ is the pseudo-dimension of $\mathcal{H}$.
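For a rough sense of scale, the binary-classification bound can be evaluated numerically while ignoring the universal constant hidden in the $O(\cdot)$. The helper below is an illustrative sketch, not a tight bound: for the affine classifiers of the previous section with $d = 2$ (VC dimension 3), accuracy $\epsilon = 0.1$, and confidence $\delta = 0.05$, it gives on the order of $(3 + \ln 20)/0.1 \approx 60$ samples, times the hidden constant.

```python
import math

def pac_sample_bound(vc_dim, eps, delta):
    """Order-of-magnitude sample size (VC(H) + ln(1/delta)) / eps,
    ignoring the universal constant hidden in the O(.)."""
    return (vc_dim + math.log(1.0 / delta)) / eps

# Affine classifiers in R^2 have VC dimension 3; for eps = 0.1, delta = 0.05
# this gives roughly (3 + ln 20) / 0.1 ≈ 60 samples, up to the hidden constant.
print(pac_sample_bound(3, 0.1, 0.05))
```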
Other settings
In addition to the supervised learning setting, sample complexity is relevant to semi-supervised learning problems, including active learning, where the algorithm can request labels for specifically chosen inputs in order to reduce the cost of obtaining many labels. The concept of sample complexity also shows up in reinforcement learning, online learning, and unsupervised algorithms, e.g. for dictionary learning.