Berry–Esseen theorem - Alchetron, The Free Social Encyclopedia

In probability theory, the central limit theorem states that, under certain circumstances, the probability distribution of the scaled mean of a random sample converges to a normal distribution as the sample size increases to infinity. Under stronger assumptions, the Berry–Esseen theorem, or Berry–Esseen inequality, gives a more quantitative result, because it also specifies the rate at which this convergence takes place by giving a bound on the maximal error of approximation between the normal distribution and the true distribution of the scaled sample mean. The approximation is measured by the Kolmogorov–Smirnov distance. In the case of independent samples, the convergence rate is n^−1/2, where n is the sample size, and the constant is estimated in terms of the third absolute normalized moments.

Statement of the theorem

Statements of the theorem vary, as it was independently discovered by two mathematicians, Andrew C. Berry (in 1941) and Carl-Gustav Esseen (1942), who then, along with other authors, refined it repeatedly over subsequent decades.

Identically distributed summands

One version, sacrificing generality somewhat for the sake of clarity, is the following:

There exists a positive constant C such that if X₁, X₂, ..., are i.i.d. random variables with E(X₁) = 0, E(X₁²) = σ² > 0, and E(|X₁|³) = ρ < ∞, and if we define

$$Y_n = \frac{X_1 + X_2 + \cdots + X_n}{n}$$

the sample mean, with F_n the cumulative distribution function of $\frac{Y_n \sqrt{n}}{\sigma}$, and Φ the cumulative distribution function of the standard normal distribution, then for all x and n,

$$|F_n(x) - \Phi(x)| \le \frac{C \rho}{\sigma^3 \sqrt{n}}. \tag{1}$$

That is: given a sequence of independent and identically distributed random variables, each having mean zero and positive variance, if additionally the third absolute moment is finite, then the cumulative distribution functions of the standardized sample mean and the standard normal distribution differ (vertically, on a graph) by no more than the specified amount. Note that the approximation error is bounded for every n, not only in the limit, and that the bound (and hence the guaranteed rate of convergence as n grows) is of order n^(−1/2).
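As an illustration of inequality (1), the following is a minimal sketch that computes the exact Kolmogorov distance for centered Bernoulli summands and compares it with the bound. The choice of Bernoulli summands, the function names, and the default C = 0.4748 (the 2012 estimate quoted in this article) are assumptions made for the example, not part of the theorem.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def kolmogorov_distance_bernoulli(n, p):
    """Exact sup_x |F_n(x) - Phi(x)| for X_i = B_i - p, B_i ~ Bernoulli(p).

    F_n is a step function, so the supremum is attained at a jump point,
    either just before the jump (left limit) or at the jump itself.
    """
    sigma = math.sqrt(p * (1.0 - p))
    cdf_left = 0.0  # value of F_n just before the current jump
    worst = 0.0
    for k in range(n + 1):
        # Location of the k-th jump of the standardized sum.
        x = (k - n * p) / (sigma * math.sqrt(n))
        pmf = math.comb(n, k) * p**k * (1.0 - p) ** (n - k)
        cdf_right = cdf_left + pmf
        phi = normal_cdf(x)
        worst = max(worst, abs(cdf_left - phi), abs(cdf_right - phi))
        cdf_left = cdf_right
    return worst


def berry_esseen_bound(n, p, C=0.4748):
    """Right-hand side of (1) for centered Bernoulli(p) summands.

    C = 0.4748 is an assumed default taken from the estimate quoted in the text.
    """
    sigma2 = p * (1.0 - p)
    rho = p * (1.0 - p) * ((1.0 - p) ** 2 + p**2)  # E|B - p|^3
    return C * rho / (sigma2**1.5 * math.sqrt(n))


if __name__ == "__main__":
    for n in (1, 10, 100, 400):
        d = kolmogorov_distance_bernoulli(n, 0.5)
        b = berry_esseen_bound(n, 0.5)
        print(f"n={n:4d}  distance={d:.5f}  bound={b:.5f}")
```

The distance is evaluated only at the jump points because F_n is constant between jumps while Φ is increasing, so the largest gap on each interval occurs at an endpoint.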

Calculated values of the constant C have decreased markedly over the years, from the original value of 7.59 by Esseen (1942), to 0.7882 by van Beek (1972), then 0.7655 by Shiganov (1986), then 0.7056 by Shevtsova (2007), then 0.7005 by Shevtsova (2008), then 0.5894 by Tyurin (2009), then 0.5129 by Korolev & Shevtsova (2009), then 0.4785 by Tyurin (2010). A detailed review can be found in Korolev & Shevtsova (2009) and Korolev & Shevtsova (2010). The best estimate as of 2012, C < 0.4748, follows from the inequality

$$\sup_{x \in \mathbb{R}} |F_n(x) - \Phi(x)| \le \frac{0.33554\,(\rho + 0.415\,\sigma^3)}{\sigma^3 \sqrt{n}},$$

due to Shevtsova (2011), since σ³ ≤ ρ and 0.33554 · 1.415 < 0.4748. However, if ρ ≥ 1.286σ³, then the estimate

$$\sup_{x \in \mathbb{R}} |F_n(x) - \Phi(x)| \le \frac{0.3328\,(\rho + 0.429\,\sigma^3)}{\sigma^3 \sqrt{n}},$$

which is also proved in Shevtsova (2011), gives an even tighter upper estimate.
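The relationship between these estimates can be checked numerically. The sketch below expresses each bound in terms of the ratio r = ρ/σ³ (which is at least 1, since σ³ ≤ ρ); the function names and the parameter name `ratio` are illustrative assumptions, while the numerical constants are the ones quoted above.

```python
import math


def bound_single_constant(ratio, n, C=0.4748):
    """C * rho / (sigma^3 * sqrt(n)), written with ratio = rho / sigma^3."""
    return C * ratio / math.sqrt(n)


def bound_first(ratio, n):
    """0.33554 * (rho + 0.415 sigma^3) / (sigma^3 * sqrt(n))."""
    return 0.33554 * (ratio + 0.415) / math.sqrt(n)


def bound_second(ratio, n):
    """0.3328 * (rho + 0.429 sigma^3) / (sigma^3 * sqrt(n))."""
    return 0.3328 * (ratio + 0.429) / math.sqrt(n)


if __name__ == "__main__":
    n = 100
    for ratio in (1.0, 1.2, 1.286, 1.5, 3.0):
        print(
            f"ratio={ratio:5.3f}"
            f"  single={bound_single_constant(ratio, n):.6f}"
            f"  first={bound_first(ratio, n):.6f}"
            f"  second={bound_second(ratio, n):.6f}"
        )
```

At ratio = 1 the first bound equals 0.33554 · 1.415 / √n, which is where the estimate C < 0.4748 comes from; for larger ratios the second bound overtakes the first, with the two nearly equal around ratio ≈ 1.286.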

Esseen (1956) proved that the constant also satisfies the lower bound

$$C \ge \frac{\sqrt{10} + 3}{6\sqrt{2\pi}} \approx 0.40973 \approx \frac{1}{\sqrt{2\pi}} + 0.01079.$$
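The arithmetic in this lower bound is easy to verify. A short sketch (the function name is an assumption chosen for the example):

```python
import math


def esseen_lower_bound():
    """Evaluate (sqrt(10) + 3) / (6 * sqrt(2 * pi))."""
    return (math.sqrt(10.0) + 3.0) / (6.0 * math.sqrt(2.0 * math.pi))


if __name__ == "__main__":
    lower = esseen_lower_bound()
    # Excess of the lower bound over 1 / sqrt(2 * pi).
    excess = lower - 1.0 / math.sqrt(2.0 * math.pi)
    print(f"lower bound = {lower:.5f}, excess = {excess:.5f}")
```

Together with the upper estimate quoted above, this places the best possible constant C between roughly 0.4097 and 0.4748.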

Non-identically distributed summands

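Let X₁, X₂, ..., be independent random variables with E(X_i) = 0, E(X_i²) = σ_i² > 0, and E(|X_i|³) = ρ_i < ∞. Also, let

$$S_n = \frac{X_1 + X_2 + \cdots + X_n}{\sqrt{\sigma_1^2 + \sigma_2^2 + \cdots + \sigma_n^2}}$$

be the normalized n-th partial sum. Denote by F_n the cdf of S_n, and by Φ the cdf of the standard normal distribution. For the sake of convenience, denote $\vec{\sigma} = (\sigma_1, \ldots, \sigma_n)$ and $\vec{\rho} = (\rho_1, \ldots, \rho_n)$.

In 1941, Andrew C. Berry proved that for all n there exists an absolute constant C₁ such that

$$\sup_{x \in \mathbb{R}} |F_n(x) - \Phi(x)| \le C_1 \cdot \psi_1, \tag{2}$$

where

$$\psi_1 = \psi_1(\vec{\sigma}, \vec{\rho}) = \Big( \sum_{i=1}^{n} \sigma_i^2 \Big)^{-1/2} \cdot \max_{1 \le i \le n} \frac{\rho_i}{\sigma_i^2}.$$

Independently, in 1942, Carl-Gustav Esseen proved that for all n there exists an absolute constant C₀ such that

$$\sup_{x \in \mathbb{R}} |F_n(x) - \Phi(x)| \le C_0 \cdot \psi_0, \tag{3}$$

where

$$\psi_0 = \psi_0(\vec{\sigma}, \vec{\rho}) = \Big( \sum_{i=1}^{n} \sigma_i^2 \Big)^{-3/2} \cdot \sum_{i=1}^{n} \rho_i.$$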

It is easy to verify that ψ₀ ≤ ψ₁. For this reason, inequality (3) is conventionally called the Berry–Esseen inequality, and the quantity ψ₀ is called the Lyapunov fraction of the third order. Moreover, in the case where the summands X₁, ..., X_n have identical distributions,

$$\psi_0 = \psi_1 = \frac{\rho_1}{\sigma_1^3 \sqrt{n}},$$

and thus the bounds stated by inequalities (1), (2) and (3) coincide apart from the constant.
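The two quantities can be compared directly in code. The sketch below assumes the standard definitions ψ₀ = (Σ ρ_i) / (Σ σ_i²)^(3/2) for the Lyapunov fraction and ψ₁ = max_i(ρ_i / σ_i²) / (Σ σ_i²)^(1/2) for Berry's quantity; the function names and the sample values of σ_i and ρ_i are hypothetical. The inequality ψ₀ ≤ ψ₁ follows from Σ ρ_i = Σ (ρ_i / σ_i²) σ_i² ≤ max_i(ρ_i / σ_i²) · Σ σ_i².

```python
import math


def lyapunov_fraction(sigmas, rhos):
    """psi_0: sum of third absolute moments over (total variance)^(3/2)."""
    total_var = sum(s * s for s in sigmas)
    return sum(rhos) / total_var**1.5


def berry_psi1(sigmas, rhos):
    """psi_1: max of rho_i / sigma_i^2 over sqrt(total variance)."""
    total_var = sum(s * s for s in sigmas)
    worst_ratio = max(r / (s * s) for s, r in zip(sigmas, rhos))
    return worst_ratio / math.sqrt(total_var)


if __name__ == "__main__":
    # Hypothetical non-identical summands (each rho_i >= sigma_i^3, as required).
    sigmas = [1.0, 2.0, 3.0]
    rhos = [1.5, 9.0, 30.0]
    print("psi_0 =", lyapunov_fraction(sigmas, rhos))
    print("psi_1 =", berry_psi1(sigmas, rhos))

    # Identical summands: both reduce to rho / (sigma^3 * sqrt(n)).
    n, sigma, rho = 25, 2.0, 10.0
    print("i.i.d. psi_0 =", lyapunov_fraction([sigma] * n, [rho] * n))
    print("i.i.d. psi_1 =", berry_psi1([sigma] * n, [rho] * n))
```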

Regarding C₀, obviously, the lower bound established by Esseen (1956) remains valid:

$$C_0 \ge \frac{\sqrt{10} + 3}{6\sqrt{2\pi}} = 0.4097\ldots.$$

The upper bounds for C₀ were subsequently lowered from the original estimate 7.59 due to Esseen (1942) to (considering recent results only) 0.9051 due to Zolotarev (1967), 0.7975 due to van Beek (1972), 0.7915 due to Shiganov (1986), 0.6379 and 0.5606 due to Tyurin (2009) and Tyurin (2010). As of 2011 the best estimate is 0.5600 obtained by Shevtsova (2010).

References

Berry–Esseen theorem Wikipedia

(Text) CC BY-SA
