In statistical hypothesis testing, a uniformly most powerful (UMP) test is a hypothesis test that has the greatest power 1 − β among all possible tests of a given size α. For example, according to the Neyman–Pearson lemma, the likelihood-ratio test is UMP for testing simple (point) hypotheses.
Let X denote a random vector (corresponding to the measurements), taken from a parametrized family of probability density functions or probability mass functions f_θ(x), which depends on the unknown deterministic parameter θ ∈ Θ. The parameter space Θ is partitioned into two disjoint sets Θ_0 and Θ_1. Let H_0 denote the hypothesis that θ ∈ Θ_0, and let H_1 denote the hypothesis that θ ∈ Θ_1. The binary test of hypotheses is performed using a test function ϕ(x).
$$
\phi(x) = \begin{cases} 1 & \text{if } x \in R \\ 0 & \text{if } x \in A \end{cases}
$$

meaning that H_1 is in force if the measurement X ∈ R and that H_0 is in force if the measurement X ∈ A. Note that A ∪ R is a disjoint covering of the measurement space.
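In code, a nonrandomized test function is simply the indicator of the rejection region. A minimal sketch in Python, where the one-sided region R = {x : x > 1.645} is an illustrative choice and not prescribed by the text:

```python
def phi(x, x0=1.645):
    """Nonrandomized test function: returns 1 (decide H1) if x falls in
    the rejection region R = (x0, inf), and 0 (decide H0) if x falls in
    the acceptance region A = (-inf, x0].  The threshold 1.645 is only
    an illustrative default."""
    return 1 if x > x0 else 0
```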
A test function ϕ(x) is UMP of size α if for any other test function ϕ′(x) satisfying

$$
\sup_{\theta \in \Theta_0} E_\theta \phi'(X) = \alpha' \le \alpha = \sup_{\theta \in \Theta_0} E_\theta \phi(X)
$$

we have

$$
\forall \theta \in \Theta_1, \quad E_\theta \phi'(X) = 1 - \beta' \le 1 - \beta = E_\theta \phi(X).
$$

The Karlin–Rubin theorem can be regarded as an extension of the Neyman–Pearson lemma to composite hypotheses. Consider a scalar measurement having a probability density function parameterized by a scalar parameter θ, and define the likelihood ratio l(x) = f_{θ_1}(x) / f_{θ_0}(x). If l(x) is monotone non-decreasing in x for any pair θ_1 ≥ θ_0 (meaning that the greater x is, the more likely H_1 is), then the threshold test:
$$
\phi(x) = \begin{cases} 1 & \text{if } x > x_0 \\ 0 & \text{if } x < x_0 \end{cases}
$$

where x_0 is chosen such that E_{θ_0} ϕ(X) = α, is the UMP test of size α for testing H_0: θ ≤ θ_0 vs. H_1: θ > θ_0.
Note that exactly the same test is also UMP for testing H_0: θ = θ_0 vs. H_1: θ > θ_0.
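As a concrete numeric illustration of the Karlin–Rubin threshold test (the Gaussian family, θ_0 = 0, α = 0.05, and the alternative θ = 1 are assumptions for this sketch, not taken from the text): for a single measurement X ~ N(θ, 1), the likelihood ratio is monotone in x, so the UMP test rejects for large x, with the threshold set by the null distribution.

```python
from scipy.stats import norm

theta0, alpha = 0.0, 0.05   # illustrative null value and size

# Choose x0 so that E_{theta0} phi(X) = P_{theta0}(X > x0) = alpha.
x0 = norm.ppf(1 - alpha, loc=theta0, scale=1.0)

def phi(x):
    """Threshold test of H0: theta <= theta0 vs H1: theta > theta0."""
    return 1 if x > x0 else 0

# Size: sup over theta <= theta0 of the rejection probability is
# attained at theta = theta0.
size = norm.sf(x0, loc=theta0, scale=1.0)

# Power at an illustrative alternative theta = 1 > theta0.
power = norm.sf(x0, loc=1.0, scale=1.0)
```

Because the rejection probability norm.sf(x0, loc=θ) is increasing in θ, the same threshold is simultaneously most powerful at every θ > θ_0, which is exactly what "uniformly" most powerful means here.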
Although the Karlin–Rubin theorem may seem weak because of its restriction to a scalar parameter and a scalar measurement, it turns out that there exists a host of problems for which the theorem holds. In particular, the one-dimensional exponential family of probability density functions or probability mass functions with
$$
f_\theta(x) = g(\theta)\, h(x)\, \exp\big(\eta(\theta)\, T(x)\big)
$$

has a monotone non-decreasing likelihood ratio in the sufficient statistic T(x), provided that η(θ) is non-decreasing.
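A quick numerical check of this property, using the Poisson family as an example member of the one-parameter exponential family (the Poisson choice and the pair θ_0 = 2, θ_1 = 5 are illustrative assumptions): here f_θ(x) = e^{−θ} (1/x!) exp(x log θ), so g(θ) = e^{−θ}, h(x) = 1/x!, η(θ) = log θ is non-decreasing, and T(x) = x.

```python
from scipy.stats import poisson

# Illustrative pair of parameter values with theta1 >= theta0.
theta0, theta1 = 2.0, 5.0

# Likelihood ratio l(x) = f_{theta1}(x) / f_{theta0}(x) on a grid of x.
xs = range(20)
ratio = [poisson.pmf(x, theta1) / poisson.pmf(x, theta0) for x in xs]

# Monotone likelihood ratio: l(x) is non-decreasing in the sufficient
# statistic T(x) = x, as the theorem requires.
assert all(a <= b for a, b in zip(ratio, ratio[1:]))
```

Analytically the ratio is e^{−(θ_1 − θ_0)} (θ_1/θ_0)^x, which is strictly increasing in x whenever θ_1 > θ_0, so the assertion is a direct numerical confirmation.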
Let X = (X_0, …, X_{M−1}) denote M i.i.d. normally distributed N-dimensional random vectors with mean θm and covariance matrix R. We then have
$$
f_\theta(X) = (2\pi)^{-\frac{MN}{2}} |R|^{-\frac{M}{2}} \exp\left\{ -\frac{1}{2} \sum_{n=0}^{M-1} (X_n - \theta m)^T R^{-1} (X_n - \theta m) \right\}
$$
$$
= (2\pi)^{-\frac{MN}{2}} |R|^{-\frac{M}{2}} \exp\left\{ -\frac{1}{2} \sum_{n=0}^{M-1} \theta^2 m^T R^{-1} m \right\} \exp\left\{ -\frac{1}{2} \sum_{n=0}^{M-1} X_n^T R^{-1} X_n \right\} \exp\left\{ \theta\, m^T R^{-1} \sum_{n=0}^{M-1} X_n \right\}
$$

which is exactly in the form of the exponential family shown in the previous section, with the sufficient statistic being

$$
T(X) = m^T R^{-1} \sum_{n=0}^{M-1} X_n .
$$

Thus, we conclude that the test
$$
\phi(T) = \begin{cases} 1 & T > t_0 \\ 0 & T < t_0 \end{cases}, \qquad E_{\theta_0} \phi(T) = \alpha
$$

is the UMP test of size α for testing H_0: θ ≤ θ_0 vs. H_1: θ > θ_0.
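This example can be sketched numerically. Under H_0 with θ = θ_0, the statistic T(X) = m^T R^{−1} Σ_n X_n is Gaussian with mean Mθ_0 m^T R^{−1} m and variance M m^T R^{−1} m, which gives the threshold t_0 in closed form. The dimensions, the vector m, the covariance R, and the alternative θ = 0.5 below are illustrative assumptions, not values from the text:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Illustrative problem setup.
N, M = 3, 50
m = np.array([1.0, 0.5, -0.5])
R = np.eye(N)                      # illustrative covariance matrix
theta0, alpha = 0.0, 0.05

Rinv_m = np.linalg.solve(R, m)     # R^{-1} m
q = m @ Rinv_m                     # m^T R^{-1} m

# Under H0, T ~ N(M*theta0*q, M*q), so the size-alpha threshold is:
t0 = M * theta0 * q + norm.ppf(1 - alpha) * np.sqrt(M * q)

def phi(X):
    """Threshold test on the sufficient statistic T(X)."""
    T = Rinv_m @ X.sum(axis=0)     # m^T R^{-1} sum_n X_n
    return 1 if T > t0 else 0

# Simulated data under an illustrative alternative theta = 0.5.
X = rng.multivariate_normal(0.5 * m, R, size=M)
```

The variance of T follows from Var(a^T S) = a^T Cov(S) a with a = R^{−1}m and Cov(Σ_n X_n) = MR, which collapses to M m^T R^{−1} m.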
Finally, we note that in general, UMP tests do not exist for vector parameters or for two-sided tests (a test in which the alternative hypothesis lies on both sides of the null). The reason is that in these situations, the most powerful test of a given size for one possible value of the parameter (e.g., for θ_1 where θ_1 > θ_0) differs from the most powerful test of the same size for a different value of the parameter (e.g., for θ_2 where θ_2 < θ_0). As a result, no single test is uniformly most powerful in these situations.
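This failure can be made concrete with the Gaussian family (the values θ_0 = 0, θ_1 = 1, θ_2 = −1, α = 0.05 are illustrative assumptions): against an alternative to the right of θ_0 the likelihood ratio is increasing in x and the most powerful test rejects in the right tail, while against an alternative to the left it is decreasing and the most powerful test rejects in the left tail. The two rejection regions are disjoint, so no single size-α test can be best against both sides at once.

```python
from scipy.stats import norm

theta0, alpha = 0.0, 0.05

# MP size-alpha test against theta1 = 1 (> theta0): by the
# Neyman-Pearson lemma, reject for large x.
right_tail_cut = norm.ppf(1 - alpha)     # reject if x > right_tail_cut

# MP size-alpha test against theta2 = -1 (< theta0): the likelihood
# ratio is now decreasing in x, so reject for small x.
left_tail_cut = norm.ppf(alpha)          # reject if x < left_tail_cut

# Power of each region against theta1 = 1: the right-tail region is far
# more powerful there, and the situation reverses against theta2 = -1.
power_right_vs_theta1 = norm.sf(right_tail_cut, loc=1.0)
power_left_vs_theta1 = norm.cdf(left_tail_cut, loc=1.0)
```

Each one-sided region is optimal only on its own side of θ_0, which is precisely why no test is uniformly most powerful over a two-sided alternative.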