In information theory, Fano's inequality (also known as the Fano converse and the Fano lemma) relates the average information lost in a noisy channel to the probability of the categorization error. It was derived by Robert Fano in the early 1950s while teaching a Ph.D. seminar in information theory at MIT, and later recorded in his 1961 textbook.
It is used to find a lower bound on the error probability of any decoder as well as the lower bounds for minimax risks in density estimation.
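Let the random variables $X$ and $Y$ represent input and output messages with a joint probability $P(x,y)$. Let $e$ denote the occurrence of an error, that is, the event $X \neq \tilde{X}$, where $\tilde{X} = f(Y)$ is an estimate of $X$ computed from $Y$. Fano's inequality is
$$H(X \mid Y) \le H(e) + P(e)\,\log(|\mathcal{X}| - 1),$$
where $\mathcal{X}$ denotes the support of $X$,
$$H(X \mid Y) = -\sum_{i,j} P(x_i, y_j)\,\log P(x_i \mid y_j)$$
is the conditional entropy,
$$P(e) = P(X \neq \tilde{X})$$
is the probability of the communication error, and
$$H(e) = -P(e)\,\log P(e) - (1 - P(e))\,\log(1 - P(e))$$
is the corresponding binary entropy.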
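The inequality can be checked numerically on a small discrete channel. The following is a minimal sketch, not a reference implementation: the function name `fano_check`, the choice of the maximum a posteriori decoder for $\tilde{X}$, and the example channel are all assumptions made for illustration. It takes the usual form of the bound, $H(X \mid Y) \le H(e) + P(e)\log(|\mathcal{X}| - 1)$, and assumes every row of the joint table has positive mass, so that the number of rows equals $|\mathcal{X}|$.

```python
import numpy as np


def fano_check(joint, base=2.0):
    """Return (H(X|Y), Fano upper bound, P(e)) for a discrete joint pmf.

    joint[i, j] = P(X = x_i, Y = y_j); rows index X, columns index Y.
    The decoder is assumed to be MAP: for each y it guesses argmax_x P(x, y).
    """
    joint = np.asarray(joint, dtype=float)
    n_x = joint.shape[0]
    if n_x < 2:
        raise ValueError("X needs at least two possible values")

    log = lambda v: np.log(v) / np.log(base)

    # Conditional entropy H(X|Y) = -sum P(x, y) log P(x | y), skipping zero cells.
    p_y = joint.sum(axis=0)
    cond = joint / np.where(p_y > 0, p_y, 1.0)  # P(x_i | y_j), column-wise
    mask = joint > 0
    h_x_given_y = float(-np.sum(joint[mask] * log(cond[mask])))

    # The MAP decoder is correct with probability sum_y max_x P(x, y).
    p_e = float(np.clip(1.0 - joint.max(axis=0).sum(), 0.0, 1.0))

    # Binary entropy H(e), with the convention 0 log 0 = 0.
    h_e = -sum(p * float(log(p)) for p in (p_e, 1.0 - p_e) if p > 0)

    bound = h_e + p_e * float(log(n_x - 1))
    return h_x_given_y, bound, p_e


# Hypothetical example: uniform ternary input, each symbol received correctly
# with probability 0.8 and confused with each other symbol with probability 0.1.
example_joint = (np.full((3, 3), 0.1) + 0.7 * np.eye(3)) / 3.0
h, bound, p_e = fano_check(example_joint)
print(f"H(X|Y) = {h:.4f} bits, Fano bound = {bound:.4f} bits, P(e) = {p_e:.2f}")
```

For this symmetric example the errors are spread evenly over the wrong symbols, so the two sides should coincide up to rounding; for a less regular channel the conditional entropy falls strictly below the bound.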
Alternative formulation
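Let $X$ be a random variable with density equal to one of $r + 1$ possible densities $f_1, \ldots, f_{r+1}$. Furthermore, assume that the Kullback–Leibler divergence between any pair of densities is not too large,
$$D_{KL}(f_i \,\|\, f_j) \le \beta \quad \text{for all } i \neq j.$$
Let $\psi(X) \in \{1, \ldots, r+1\}$ be an estimate of the index. Then
$$\sup_i P_i(\psi(X) \neq i) \ge 1 - \frac{\beta + \log 2}{\log r},$$
where $P_i$ is the probability induced by $f_i$.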
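As an illustration of how this formulation is typically applied, the sketch below evaluates the lower bound $1 - (\beta + \log 2)/\log r$ for a hypothetical family of $r + 1$ Gaussian densities with a common standard deviation, for which the Kullback–Leibler divergence has the closed form $(\mu_i - \mu_j)^2 / (2\sigma^2)$ in nats. The function name `fano_index_bound` and the Gaussian family are assumptions chosen for the example; natural logarithms are used throughout so that $\beta$ and the logarithms share units.

```python
import math
from itertools import permutations


def fano_index_bound(means, sigma):
    """Lower bound on the worst-case probability of misidentifying the index.

    means: the r + 1 means of Gaussian densities N(mu_i, sigma^2).
    Returns (bound, beta), where beta is the largest pairwise KL divergence.
    """
    r = len(means) - 1
    if r < 2:
        raise ValueError("need at least three densities so that log r > 0")

    # KL(N(a, s^2) || N(b, s^2)) = (a - b)^2 / (2 s^2), in nats.
    beta = max((a - b) ** 2 / (2.0 * sigma**2) for a, b in permutations(means, 2))

    bound = 1.0 - (beta + math.log(2)) / math.log(r)
    # A negative value carries no information, so report it as 0.
    return max(0.0, bound), beta


# Hypothetical example: nine unit-variance Gaussians with means 0.00, 0.05, ..., 0.40.
bound, beta = fano_index_bound([0.05 * k for k in range(9)], sigma=1.0)
print(f"beta = {beta:.3f} nats, error probability >= {bound:.3f}")
```

The bound is only informative when the densities are numerous and mutually close: a larger $r$ or a smaller $\beta$ pushes it toward 1, while a large $\beta$ makes it vacuous.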
Generalization
The following generalization is due to Ibragimov and Khasminskii (1979), Assouad and Birgé (1983).
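Let $\mathbf{F}$ be a class of densities with a subclass of $r + 1$ densities $f_\theta$ such that for any $\theta \neq \theta'$
$$\|f_\theta - f_{\theta'}\|_{L_1} \ge \alpha,$$
$$D_{KL}(f_\theta \,\|\, f_{\theta'}) \le \beta.$$
Then in the worst case the expected value of the estimation error is bounded from below,
$$\sup_{f \in \mathbf{F}} E\,\|f_n - f\|_{L_1} \ge \frac{\alpha}{2}\left(1 - \frac{n\beta + \log 2}{\log r}\right),$$
where $f_n$ is any density estimator based on a sample of size $n$.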
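To see how the sample size enters, the sketch below evaluates the lower bound in the form $\frac{\alpha}{2}\left(1 - \frac{n\beta + \log 2}{\log r}\right)$, where $\alpha$ is the minimum $L_1$ separation and $\beta$ the maximum Kullback–Leibler divergence within the subclass. The function name `minimax_l1_lower_bound` and the parameter values are hypothetical and chosen only for illustration; natural logarithms are assumed.

```python
import math


def minimax_l1_lower_bound(alpha, beta, r, n):
    """Evaluate (alpha / 2) * (1 - (n * beta + log 2) / log r).

    alpha: minimum L1 distance between distinct densities in the subclass.
    beta:  maximum KL divergence (in nats) between distinct densities.
    r:     the subclass contains r + 1 densities (r >= 2 so that log r > 0).
    n:     sample size.
    """
    if r < 2:
        raise ValueError("r must be at least 2")
    return 0.5 * alpha * (1.0 - (n * beta + math.log(2)) / math.log(r))


# Hypothetical values: 1025 densities, L1 separation 0.2, KL divergence at most 0.001.
for n in (10, 100, 1000):
    value = minimax_l1_lower_bound(alpha=0.2, beta=0.001, r=1024, n=n)
    print(f"n = {n:5d}: lower bound = {value:.4f}")
```

Because the divergence term grows linearly in $n$, a fixed subclass eventually yields a negative, and therefore vacuous, bound; in applications the subclass is usually chosen to depend on $n$ so that $n\beta$ stays small relative to $\log r$.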