Hoeffding's lemma

In probability theory, Hoeffding's lemma is an inequality that bounds the moment-generating function of any bounded random variable. It is named after the Finnish–American mathematical statistician Wassily Hoeffding.

Statement of the lemma

Let $X$ be any real-valued random variable with expected value $\mathbb{E}(X) = 0$ and such that $a \le X \le b$ almost surely. Then, for all $\lambda \in \mathbb{R}$,

$$\mathbb{E}\left[e^{\lambda X}\right] \le \exp\left(\frac{\lambda^2 (b-a)^2}{8}\right).$$

Note that because the random variable $X$ has zero expectation, the $a$ and $b$ in the lemma must satisfy $a \le 0 \le b$.
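As an illustration of the statement, the following minimal sketch compares the exact moment-generating function of a small discrete random variable with the bound from the lemma. The three-point distribution and the helper names `mgf` and `hoeffding_bound` are assumptions chosen for this example and do not come from the source.

```python
import math

def mgf(values, probs, lam):
    """Exact E[exp(lam * X)] for a random variable with finite support."""
    return sum(p * math.exp(lam * x) for x, p in zip(values, probs))

def hoeffding_bound(a, b, lam):
    """Right-hand side of the lemma: exp(lam^2 (b - a)^2 / 8)."""
    return math.exp(lam ** 2 * (b - a) ** 2 / 8)

# Hypothetical example: X takes the values -1, 0, 2 and has mean zero.
values = [-1.0, 0.0, 2.0]
probs = [0.5, 0.25, 0.25]
a, b = min(values), max(values)

mean = sum(p * x for x, p in zip(values, probs))
assert abs(mean) < 1e-12  # the lemma requires E(X) = 0

for lam in [-3.0, -1.0, -0.1, 0.0, 0.1, 1.0, 3.0]:
    lhs = mgf(values, probs, lam)
    rhs = hoeffding_bound(a, b, lam)
    print(f"lam={lam:+.1f}  mgf={lhs:.6f}  bound={rhs:.6f}  ok={lhs <= rhs}")
```

A check like this illustrates the inequality for one distribution only. It is not a substitute for the proof.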

A brief proof of the lemma

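Since $e^{\lambda x}$ is a convex function of $x$, we have

$$e^{\lambda x} \le \frac{b-x}{b-a}\, e^{\lambda a} + \frac{x-a}{b-a}\, e^{\lambda b} \qquad \text{for all } a \le x \le b.$$

So,

$$\mathbb{E}\left[e^{\lambda X}\right] \le \frac{b-\mathbb{E}(X)}{b-a}\, e^{\lambda a} + \frac{\mathbb{E}(X)-a}{b-a}\, e^{\lambda b}.$$

Let $h = \lambda(b-a)$, $p = \frac{-a}{b-a}$ and $L(h) = -hp + \ln\left(1 - p + p e^{h}\right)$.

Then, since $\mathbb{E}(X) = 0$,

$$\frac{b-\mathbb{E}(X)}{b-a}\, e^{\lambda a} + \frac{\mathbb{E}(X)-a}{b-a}\, e^{\lambda b} = e^{L(h)}.$$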
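The identity between the chord expression and $e^{L(h)}$ can be checked numerically. The sketch below assumes hypothetical endpoints $a = -1$, $b = 3$. The function names `chord_mean` and `L` are chosen for this example only.

```python
import math

def chord_mean(a, b, lam):
    """(b/(b-a)) e^{lam a} + (-a/(b-a)) e^{lam b}: the upper bound when E(X) = 0."""
    return (b * math.exp(lam * a) - a * math.exp(lam * b)) / (b - a)

def L(h, p):
    """L(h) = -h p + ln(1 - p + p e^h)."""
    return -h * p + math.log(1 - p + p * math.exp(h))

a, b = -1.0, 3.0      # hypothetical endpoints with a < 0 < b
p = -a / (b - a)      # p as defined in the proof

for lam in [-2.0, -0.5, 0.0, 0.5, 2.0]:
    h = lam * (b - a)
    # Both sides should agree up to floating-point rounding.
    assert math.isclose(chord_mean(a, b, lam), math.exp(L(h, p)), rel_tol=1e-9)
```

The identity follows from $\lambda a = -hp$, which lets $e^{\lambda a}$ be factored out of both terms.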

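Differentiating $L$ shows that

$$L(0) = L'(0) = 0 \quad \text{and} \quad L''(h) \le \frac{1}{4} \text{ for all } h.$$

By Taylor's theorem, for some $\xi$ between $0$ and $h$,

$$L(h) = L(0) + h L'(0) + \frac{1}{2} h^2 L''(\xi) \le \frac{1}{8} h^2 = \frac{1}{8} \lambda^2 (b-a)^2.$$

Hence,

$$\mathbb{E}\left[e^{\lambda X}\right] \le e^{\frac{1}{8} \lambda^2 (b-a)^2}.$$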

(The proof below is the same proof with more explanation.)

More detailed proof

First note that if one of $a$ or $b$ is zero, then, because $\mathbb{E}(X) = 0$, we have $P(X = 0) = 1$, so $\mathbb{E}\left[e^{sX}\right] = 1$ and the inequality follows. If both are nonzero, then $a$ must be negative and $b$ must be positive.

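Next, recall that $e^{sx}$ is a convex function of $x$ on the real line, so on $[a, b]$ it lies below the chord joining its values at the endpoints:

$$\forall x \in [a, b] : \quad e^{sx} \le \frac{b-x}{b-a}\, e^{sa} + \frac{x-a}{b-a}\, e^{sb}.$$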
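The chord inequality can be illustrated on a grid of points. This is a minimal sketch with a hypothetical interval $[-1, 2]$ and a few arbitrary values of $s$. The helper name `chord` is an assumption of this example.

```python
import math

def chord(x, a, b, s):
    """Value at x of the line through (a, e^{sa}) and (b, e^{sb})."""
    return (b - x) / (b - a) * math.exp(s * a) + (x - a) / (b - a) * math.exp(s * b)

a, b = -1.0, 2.0  # hypothetical interval
for s in [-2.0, -0.5, 0.5, 2.0]:
    for k in range(101):
        x = a + (b - a) * k / 100  # grid point in [a, b]
        # Small relative slack absorbs rounding at the endpoints, where equality holds.
        assert math.exp(s * x) <= chord(x, a, b, s) * (1 + 1e-12)
```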

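Taking expectations on both sides of the convexity inequality and using $\mathbb{E}(X) = 0$ gives:

$$\begin{aligned}
\mathbb{E}\left[e^{sX}\right] &\le \frac{b-\mathbb{E}(X)}{b-a}\, e^{sa} + \frac{\mathbb{E}(X)-a}{b-a}\, e^{sb} \\
&= \frac{b}{b-a}\, e^{sa} + \frac{-a}{b-a}\, e^{sb} && \mathbb{E}(X) = 0 \\
&= \left(-\frac{a}{b-a}\right) e^{sa} \left(-\frac{b}{a} + e^{sb-sa}\right) \\
&= \left(-\frac{a}{b-a}\right) e^{sa} \left(-\frac{b-a+a}{a} + e^{s(b-a)}\right) \\
&= \left(-\frac{a}{b-a}\right) e^{sa} \left(-\frac{b-a}{a} - 1 + e^{s(b-a)}\right) \\
&= \left(1 - \theta + \theta e^{s(b-a)}\right) e^{-s\theta(b-a)} && \theta = -\frac{a}{b-a} > 0
\end{aligned}$$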

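Let $u = s(b-a)$ and define $\varphi : \mathbb{R} \to \mathbb{R}$ by

$$\varphi(u) = -\theta u + \log\left(1 - \theta + \theta e^{u}\right).$$

The function $\varphi$ is well defined on $\mathbb{R}$ because the argument of the logarithm is positive:

$$1 - \theta + \theta e^{u} = \theta\left(\frac{1}{\theta} - 1 + e^{u}\right) = \theta\left(-\frac{b}{a} + e^{u}\right) > 0,$$

since $\theta > 0$ and $\frac{b}{a} < 0$.

The definition of $\varphi$ implies

$$\mathbb{E}\left[e^{sX}\right] \le e^{\varphi(u)}.$$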

By Taylor's theorem, for every real $u$ there exists a $v$ between $0$ and $u$ such that

$$\varphi(u) = \varphi(0) + u \varphi'(0) + \frac{1}{2} u^2 \varphi''(v).$$

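Note that:

$$\begin{aligned}
\varphi(0) &= 0, \\
\varphi'(0) &= \left. \left(-\theta + \frac{\theta e^{u}}{1 - \theta + \theta e^{u}}\right) \right|_{u=0} = 0, \\
\varphi''(v) &= \frac{\theta e^{v}\left(1 - \theta + \theta e^{v}\right) - \theta^2 e^{2v}}{\left(1 - \theta + \theta e^{v}\right)^2} \\
&= \frac{\theta e^{v}}{1 - \theta + \theta e^{v}} \left(1 - \frac{\theta e^{v}}{1 - \theta + \theta e^{v}}\right) \\
&= t(1-t) \le \frac{1}{4}, \qquad \text{where } t = \frac{\theta e^{v}}{1 - \theta + \theta e^{v}} > 0.
\end{aligned}$$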
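The last step, that $t(1-t)$ never exceeds $\frac{1}{4}$, is an elementary fact that can be seen by completing the square. It is written out here for completeness:

```latex
t(1-t) \;=\; \frac{1}{4} - \left(t - \frac{1}{2}\right)^{2} \;\le\; \frac{1}{4}
\qquad \text{for every real } t,
```

with equality exactly when $t = \frac{1}{2}$.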

Therefore,

$$\varphi(u) \le 0 + u \cdot 0 + \frac{1}{2} u^2 \cdot \frac{1}{4} = \frac{1}{8} u^2 = \frac{1}{8} s^2 (b-a)^2.$$
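As a numerical sanity check of this bound, the sketch below evaluates $\varphi$ and the closed form of $\varphi''$ on a grid. The grid range $[-20, 20]$, the sample values of $\theta$, and the function names `phi` and `phi_second` are assumptions made for this example.

```python
import math

def phi(u, theta):
    """phi(u) = -theta*u + log(1 - theta + theta*e^u), for 0 < theta < 1."""
    return -theta * u + math.log(1 - theta + theta * math.exp(u))

def phi_second(v, theta):
    """phi''(v) = t(1 - t) with t = theta e^v / (1 - theta + theta e^v)."""
    t = theta * math.exp(v) / (1 - theta + theta * math.exp(v))
    return t * (1 - t)

for theta in [0.1, 0.5, 0.9]:
    for k in range(-200, 201):
        u = k / 10  # grid on [-20, 20]
        # Small absolute slack absorbs floating-point rounding near u = 0.
        assert phi(u, theta) <= u * u / 8 + 1e-12
        assert phi_second(u, theta) <= 0.25 + 1e-12
```

For $\theta = \frac{1}{2}$ the second derivative reaches $\frac{1}{4}$ at $v = 0$, so the constant $\frac{1}{8}$ in the bound cannot be improved by this argument.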

This implies

$$\mathbb{E}\left[e^{sX}\right] \le \exp\left(\frac{1}{8} s^2 (b-a)^2\right).$$

References

Hoeffding's lemma Wikipedia

(Text) CC BY-SA
