Information projection

In information theory, the information projection or I-projection of a probability distribution q onto a set of distributions P is

$$p^* = \arg\min_{p \in P} D_{\mathrm{KL}}(p \,\|\, q)$$

where $D_{\mathrm{KL}}$ is the Kullback–Leibler divergence from p to q. Viewing the Kullback–Leibler divergence as a measure of distance, the I-projection $p^*$ is the "closest" distribution to q among all the distributions in P.
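As a concrete illustration, the following minimal sketch computes an I-projection numerically on a finite alphabet. The constraint set P (distributions over {0, 1, 2, 3} with a prescribed mean), the reference distribution q, and the use of SciPy's SLSQP solver are illustrative assumptions, not part of the definition above.

```python
import numpy as np
from scipy.optimize import minimize

def kl(p, q):
    """Kullback-Leibler divergence D_KL(p || q) for finite distributions."""
    return float(np.sum(p * np.log(p / q)))

def i_projection(q, x, target_mean):
    """argmin_{p in P} D_KL(p || q), with P = {p : E_p[x] = target_mean} (illustrative choice)."""
    n = len(q)
    constraints = [
        {"type": "eq", "fun": lambda p: p.sum() - 1.0},        # normalisation
        {"type": "eq", "fun": lambda p: p @ x - target_mean},  # mean constraint
    ]
    result = minimize(kl, x0=np.full(n, 1.0 / n), args=(q,),
                      bounds=[(1e-9, 1.0)] * n,
                      constraints=constraints, method="SLSQP")
    return result.x

# Project a skewed q onto distributions over {0, 1, 2, 3} with mean 1.0.
x = np.arange(4)
q = np.array([0.1, 0.2, 0.3, 0.4])
p_star = i_projection(q, x, target_mean=1.0)
print("p* =", np.round(p_star, 4), " D_KL(p*||q) =", round(kl(p_star, q), 4))
```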

The I-projection is useful in setting up information geometry, notably because of the following inequality, which holds for every p in P when P is convex:

$$D_{\mathrm{KL}}(p \,\|\, q) \geq D_{\mathrm{KL}}(p \,\|\, p^*) + D_{\mathrm{KL}}(p^* \,\|\, q)$$

This inequality can be interpreted as an information-geometric analogue of the Pythagorean theorem, with the KL divergence playing the role of squared distance in a Euclidean space.
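The inequality can also be checked numerically. The sketch below (again assuming NumPy/SciPy, with the convex set P = {p : p(0) ≥ 0.5} chosen purely for illustration) computes the I-projection p* and verifies that the right-hand side never exceeds the left-hand side for random distributions in P.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
q = np.array([0.1, 0.2, 0.3, 0.4])

def kl(p, q):
    return float(np.sum(p * np.log(p / q)))

# I-projection of q onto the convex set P = {p : p[0] >= 0.5}, found numerically.
constraints = [{"type": "eq",   "fun": lambda p: p.sum() - 1.0},
               {"type": "ineq", "fun": lambda p: p[0] - 0.5}]
p_star = minimize(kl, np.array([0.55, 0.15, 0.15, 0.15]), args=(q,),
                  bounds=[(1e-9, 1.0)] * 4,
                  constraints=constraints, method="SLSQP").x

# For every p in P, D(p||q) >= D(p||p*) + D(p*||q) should hold (up to solver tolerance).
for _ in range(5):
    p = rng.dirichlet(np.ones(4))
    if p[0] < 0.5:                                   # push the sample into P
        p = np.concatenate(([0.5], 0.5 * p[1:] / p[1:].sum()))
    lhs = kl(p, q)
    rhs = kl(p, p_star) + kl(p_star, q)
    print(f"{lhs:.4f} >= {rhs:.4f}  ({lhs >= rhs - 1e-6})")
```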

It is worth noting that, since $D_{\mathrm{KL}}(p \,\|\, q) \geq 0$ and is continuous in p, if P is closed and non-empty then at least one minimizer of the optimization problem framed above exists. Furthermore, if P is convex, the optimal distribution is unique.

The reverse I-projection, also known as the moment projection or M-projection, is

$$p^* = \arg\min_{p \in P} D_{\mathrm{KL}}(q \,\|\, p)$$
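Because the KL divergence is not symmetric, the reverse I-projection generally differs from the I-projection. The following sketch contrasts the two on the same illustrative mean-constraint set used earlier; the specific alphabet, q, and solver are assumptions for demonstration only.

```python
import numpy as np
from scipy.optimize import minimize

x = np.arange(4)                         # alphabet {0, 1, 2, 3}
q = np.array([0.1, 0.2, 0.3, 0.4])

def kl(p, q):
    return float(np.sum(p * np.log(p / q)))

def project(objective):
    """Minimize the given objective over P = {p : E_p[x] = 1.0} (illustrative choice)."""
    constraints = [{"type": "eq", "fun": lambda p: p.sum() - 1.0},
                   {"type": "eq", "fun": lambda p: p @ x - 1.0}]
    return minimize(objective, np.full(4, 0.25), bounds=[(1e-9, 1.0)] * 4,
                    constraints=constraints, method="SLSQP").x

p_i = project(lambda p: kl(p, q))   # I-projection:         argmin_p D_KL(p || q)
p_r = project(lambda p: kl(q, p))   # reverse I-projection: argmin_p D_KL(q || p)
print("I-projection:        ", np.round(p_i, 4))
print("reverse I-projection:", np.round(p_r, 4))
```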
