Law of the unconscious statistician

In probability theory and statistics, the law of the unconscious statistician (sometimes abbreviated LOTUS) is a theorem used to calculate the expected value of a function g(X) of a random variable X when one knows the probability distribution of X but does not explicitly know the distribution of g(X).

The form of the law depends on the form in which one states the probability distribution of the random variable X. If it is a discrete distribution and one knows its probability mass function f_X (but not f_{g(X)}), then the expected value of g(X) is

E[g(X)] = \sum_x g(x) f_X(x),

where the sum is over all possible values x of X.
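As a concrete illustration, here is a minimal Python sketch of the discrete case; the fair-die pmf and the choice g(x) = x² are illustrative assumptions, not part of the statement above.

```python
# Minimal sketch of discrete LOTUS: E[g(X)] computed from the pmf of X alone.
# The fair-die pmf and g(x) = x**2 are illustrative choices.

pmf = {1: 1/6, 2: 1/6, 3: 1/6, 4: 1/6, 5: 1/6, 6: 1/6}  # fair six-sided die

def g(x):
    return x ** 2

# LOTUS: sum g(x) * f_X(x) over all possible values x of X;
# the distribution of g(X) itself is never constructed.
expected_g = sum(g(x) * p for x, p in pmf.items())
print(expected_g)  # 91/6 ≈ 15.1667
```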

If it is a continuous distribution and one knows its probability density function f_X (but not f_{g(X)}), then the expected value of g(X) is

E[g(X)] = \int_{-\infty}^{\infty} g(x) f_X(x) \, dx

(provided the values of X are real numbers as opposed to vectors, complex numbers, etc.).
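A hedged numerical sketch of the continuous case, using SciPy's quad routine; X ~ Exponential(1) and g(x) = √x are illustrative choices:

```python
# Sketch of continuous LOTUS via numerical integration.
import math
from scipy.integrate import quad

def f_X(x):
    return math.exp(-x) if x >= 0 else 0.0  # density of Exponential(1)

def g(x):
    return math.sqrt(x)

# LOTUS: E[g(X)] = ∫ g(x) f_X(x) dx; the density of g(X) is never needed.
value, abs_err = quad(lambda x: g(x) * f_X(x), 0, math.inf)
print(value)  # ≈ 0.8862 = Γ(3/2) = √π / 2
```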

Regardless of whether the distribution is discrete or continuous, if one knows the cumulative distribution function F_X (but not F_{g(X)}), then the expected value of g(X) is given by a Riemann–Stieltjes integral

E[g(X)] = \int_{-\infty}^{\infty} g(x) \, dF_X(x)

(again assuming X is real-valued).
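One can also approximate the Riemann–Stieltjes form directly from the CDF, without differentiating it. A minimal sketch, again assuming X ~ Exponential(1) (so F_X(x) = 1 − e^(−x)) and g(x) = √x:

```python
# Sketch: approximating ∫ g(x) dF_X(x) by summing
# g(x_i) * (F_X(x_{i+1}) - F_X(x_i)) over a fine grid.
import math

def F_X(x):
    return 1.0 - math.exp(-x) if x >= 0 else 0.0  # CDF of Exponential(1)

def g(x):
    return math.sqrt(x)

n, hi = 200_000, 50.0          # grid resolution and truncation point
h = hi / n
approx = sum(g(i * h) * (F_X((i + 1) * h) - F_X(i * h)) for i in range(n))
print(approx)  # ≈ 0.8862, matching √π / 2 from the density form
```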

This identity is sometimes known as the law of the unconscious statistician, as statisticians have been accused of using it without realizing that it must be treated as the result of a rigorously proved theorem, not merely a definition.

However, the result is so well known that it is usually applied without being named; in practice the name is rarely used.

Joint distributions

A similar property holds for joint distributions. For discrete random variables X and Y, a function g of two variables, and joint probability mass function f_{X,Y}(x, y):

E[g(X, Y)] = \sum_y \sum_x g(x, y) f_{X,Y}(x, y)
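A small sketch of the discrete joint case; two independent fair coin flips and g(x, y) = xy are illustrative assumptions:

```python
# Sketch of joint discrete LOTUS: E[g(X, Y)] from a joint pmf.
joint_pmf = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}  # two fair coins

def g(x, y):
    return x * y

expected = sum(g(x, y) * p for (x, y), p in joint_pmf.items())
print(expected)  # 0.25 = P(X=1, Y=1)
```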

In the continuous case, with f_{X,Y}(x, y) now the joint probability density function,

E[g(X, Y)] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(x, y) f_{X,Y}(x, y) \, dx \, dy
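A sketch of the continuous joint case using SciPy's dblquad; independent Uniform(0, 1) variables and g(x, y) = x + y are illustrative choices:

```python
# Sketch of joint continuous LOTUS with scipy's dblquad.
from scipy.integrate import dblquad

def integrand(y, x):          # dblquad integrates over y first, then x
    f_xy = 1.0                # joint density of two Uniform(0,1) variables
    return (x + y) * f_xy

value, abs_err = dblquad(integrand, 0, 1, 0, 1)
print(value)  # 1.0 = E[X] + E[Y]
```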

From the perspective of measure theory

A technically complete derivation of the result is available using arguments from measure theory, in which the probability space of the transformed random variable g(X) is related to that of the original random variable X. The key step is to define the pushforward measure X_* P for the transformed space; the result is then an instance of the change of variables formula:

\int_\Omega g \circ X \, dP = \int_{\mathbb{R}} g \, d(X_* P)

We say X has a density if the pushforward measure X_* P is absolutely continuous with respect to the Lebesgue measure \mu. In that case

d(X_* P) = f \, d\mu,

where f : \mathbb{R} \to \mathbb{R} is the density (the Radon–Nikodym derivative of X_* P with respect to \mu). So the above can be rewritten in the more familiar form

E[g(X)] = \int_\Omega g \circ X \, dP = \int_{\mathbb{R}} g(x) f(x) \, dx
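The identity can be checked numerically: estimate the abstract integral over Ω by Monte Carlo sampling of X, and compare it with the density-form integral over the real line. A sketch assuming X ~ N(0, 1) and g(x) = x², both illustrative choices:

```python
# Sketch checking the change-of-variables identity numerically:
# ∫_Ω g∘X dP is estimated by Monte Carlo; ∫ g(x) f(x) dx uses the density.
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

rng = np.random.default_rng(0)
samples = rng.standard_normal(1_000_000)   # draws of X on (Ω, P)

g = lambda x: x ** 2

mc_estimate = g(samples).mean()            # ≈ ∫_Ω g∘X dP
density_form, _ = quad(lambda x: g(x) * norm.pdf(x), -np.inf, np.inf)

print(mc_estimate, density_form)  # both ≈ 1.0 = Var(X)
```

Both estimates converge to Var(X) = 1, illustrating that the pushforward identity and the familiar density formula agree.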
