
Fano's inequality


In information theory, Fano's inequality (also known as the Fano converse and the Fano lemma) relates the average information lost in a noisy channel to the probability of the categorization error. It was derived by Robert Fano in the early 1950s while teaching a Ph.D. seminar in information theory at MIT, and later recorded in his 1961 textbook.

It is used to find a lower bound on the error probability of any decoder, as well as lower bounds for minimax risks in density estimation.

Let the random variables X and Y represent input and output messages with a joint probability P(x, y). Let e represent an occurrence of error; i.e., that X ≠ X̃, with X̃ = f(Y) being an approximate version of X. Fano's inequality is

H(X | Y) ≤ H(e) + P(e) log(|𝒳| − 1),

where 𝒳 denotes the support of X,

H(X | Y) = −∑_{i,j} P(x_i, y_j) log P(x_i | y_j)

is the conditional entropy,

P(e) = P(X ≠ X̃)

is the probability of the communication error, and

H(e) = −P(e) log P(e) − (1 − P(e)) log(1 − P(e))

is the corresponding binary entropy.
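The inequality above can be checked numerically. The sketch below uses a hypothetical 3-symbol channel (the joint matrix P is an illustrative assumption, not from the article): it computes H(X | Y), takes X̃ = f(Y) to be the MAP decoder, and verifies that the resulting error probability satisfies Fano's bound.

```python
import math

# Hypothetical joint distribution P(x, y) for a 3-symbol channel
# (rows = x, columns = y); illustrative values, not from the article.
P = [
    [0.30, 0.03, 0.02],
    [0.02, 0.25, 0.03],
    [0.03, 0.02, 0.30],
]

n = len(P)
p_y = [sum(P[x][y] for x in range(n)) for y in range(n)]

# Conditional entropy H(X|Y) = -sum_{x,y} P(x,y) log P(x|y), natural log.
H_XY = -sum(
    P[x][y] * math.log(P[x][y] / p_y[y])
    for x in range(n) for y in range(n) if P[x][y] > 0
)

# MAP decoder: f(y) = argmax_x P(x|y); error probability P(e) = P(X != f(Y)).
f = [max(range(n), key=lambda x: P[x][y]) for y in range(n)]
P_e = sum(P[x][y] for x in range(n) for y in range(n) if x != f[y])

# Binary entropy of the error event.
H_e = -P_e * math.log(P_e) - (1 - P_e) * math.log(1 - P_e)

# Fano's bound: H(X|Y) <= H(e) + P(e) log(|X| - 1).
bound = H_e + P_e * math.log(n - 1)
print(f"H(X|Y) = {H_XY:.4f} <= {bound:.4f}")
assert H_XY <= bound
```

For this channel the two sides come out close (about 0.523 vs 0.527 nats), showing the bound can be nearly tight when the decoder is good.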

Alternative formulation

Let X be a random variable with density equal to one of r + 1 possible densities f_1, …, f_{r+1}. Furthermore, the Kullback–Leibler divergence between any pair of densities cannot be too large,

D_KL(f_i ∥ f_j) ≤ β for all i ≠ j.

Let ψ(X) ∈ {1, …, r + 1} be an estimate of the index. Then

sup_i P_i(ψ(X) ≠ i) ≥ 1 − (β + log 2) / log r,

where P_i is the probability induced by f_i.
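To make the bound concrete, the sketch below evaluates it for r + 1 = 5 hypothetical discrete densities (small perturbations of the uniform distribution on three points; the densities are illustrative assumptions, not from the article). β is taken as the largest pairwise KL divergence, and the lower bound on the worst-case error of any index estimator ψ follows directly.

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence D_KL(p || q) for discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Five hypothetical densities (r + 1 = 5, so r = 4), each a small
# perturbation of the uniform distribution on three points.
eps = 0.05
densities = [
    (1/3, 1/3, 1/3),
    (1/3 + eps, 1/3 - eps, 1/3),
    (1/3 - eps, 1/3 + eps, 1/3),
    (1/3, 1/3 + eps, 1/3 - eps),
    (1/3, 1/3 - eps, 1/3 + eps),
]
r = len(densities) - 1

# beta bounds every pairwise KL divergence; here we take the maximum.
beta = max(kl(p, q) for p in densities for q in densities if p != q)

# Fano lower bound: sup_i P_i(psi(X) != i) >= 1 - (beta + log 2) / log r.
lower_bound = 1 - (beta + math.log(2)) / math.log(r)
print(f"beta = {beta:.4f}, lower bound = {lower_bound:.4f}")
```

Because the densities are hard to tell apart (β ≈ 0.03), the bound is nontrivial: no estimator of the index can have worst-case error below about 0.48.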

Generalization

The following generalization is due to Ibragimov and Khasminskii (1979) and Assouad and Birgé (1983).

Let F be a class of densities with a subclass of r + 1 densities f_θ such that for any θ ≠ θ′,

‖f_θ − f_{θ′}‖_{L1} ≥ α,  D_KL(f_θ ∥ f_{θ′}) ≤ β.

Then in the worst case the expected error of estimation is bounded from below,

sup_{f ∈ F} E ‖f_n − f‖_{L1} ≥ (α/2) (1 − (nβ + log 2) / log r),

where f_n is any density estimator based on a sample of size n.
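The generalized bound is easy to evaluate for given subclass parameters. The sketch below (with illustrative values of α, β, and r that are assumptions, not from the article) shows how the minimax lower bound shrinks as the sample size n grows, becoming vacuous once nβ + log 2 exceeds log r.

```python
import math

# Hypothetical subclass parameters: r + 1 = 17 densities, pairwise
# L1-separated by at least alpha, pairwise KL at most beta.
alpha, beta, r = 0.5, 0.01, 16

def minimax_lower_bound(n):
    """Lower bound on sup_f E ||f_n - f||_L1 for sample size n."""
    return (alpha / 2) * (1 - (n * beta + math.log(2)) / math.log(r))

# The bound decreases with n and is only informative while positive.
for n in (10, 100, 200):
    print(n, round(minimax_lower_bound(n), 4))
```

In practice the subclass (and hence α, β, r) is chosen to make this bound as large as possible for the given n.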
