In statistics, the Horvitz–Thompson estimator, named after Daniel G. Horvitz and Donovan J. Thompson, is a method for estimating the total and mean of a superpopulation in a stratified sample. Inverse probability weighting is applied to account for different proportions of observations within strata in a target population. The Horvitz–Thompson estimator is frequently applied in survey analyses and can be used to account for missing data.
Formally, let $Y_i$, $i = 1, 2, \ldots, n$, be an independent sample from $n$ of $N \geq n$ distinct strata with a common mean $\mu$. Suppose further that $\pi_i$ is the inclusion probability that a randomly sampled individual in a superpopulation belongs to the $i$th stratum. The Horvitz–Thompson estimate of the total is given by:
$$\hat{Y}_{HT} = \sum_{i=1}^{n} \pi_i^{-1} Y_i,$$
and the estimate of the mean is given by:
$$\hat{\mu}_{HT} = N^{-1} \hat{Y}_{HT} = N^{-1} \sum_{i=1}^{n} \pi_i^{-1} Y_i.$$
In a Bayesian probabilistic framework, $\pi_i$ is considered the proportion of individuals in a target population belonging to the $i$th stratum. Hence, $\pi_i^{-1} Y_i$ could be thought of as an estimate of the complete sample of persons within the $i$th stratum. The Horvitz–Thompson estimator can also be expressed as the limit of a weighted bootstrap resampling estimate of the mean. It can also be viewed as a special case of multiple imputation approaches.
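The two estimates can be illustrated with a minimal Python sketch. The function name `horvitz_thompson` and the toy values are hypothetical, chosen only for illustration; the sketch assumes the inclusion probabilities are known and strictly positive.

```python
def horvitz_thompson(y, pi, N):
    """Return the (total, mean) Horvitz-Thompson estimates.

    y  : observed values Y_i for the n sampled units
    pi : inclusion probabilities pi_i for the same units (assumed known, > 0)
    N  : number of strata/units in the target population
    """
    if len(y) != len(pi):
        raise ValueError("y and pi must have the same length")
    if any(p <= 0 or p > 1 for p in pi):
        raise ValueError("inclusion probabilities must lie in (0, 1]")
    # Each observation is weighted by the inverse of its inclusion probability.
    total = sum(yi / p for yi, p in zip(y, pi))
    mean = total / N
    return total, mean


# Toy data (illustrative only): three sampled units out of N = 8.
total, mean = horvitz_thompson([10, 20, 30], [0.5, 0.25, 0.5], N=8)
print(total, mean)
```

An observation with a small inclusion probability stands in for many unobserved units, so it receives a correspondingly large weight.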
For post-stratified study designs, estimation of $\pi$ and $\mu$ is done in distinct steps. In such cases, computing the variance of $\hat{\mu}_{HT}$ is not straightforward. Resampling techniques such as the bootstrap or the jackknife can be applied to obtain consistent estimates of the variance of the Horvitz–Thompson estimator. The survey package for R conducts analyses for post-stratified data using the Horvitz–Thompson estimator.
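As a rough illustration of the resampling idea, the following Python sketch applies a naive with-replacement bootstrap to the Horvitz–Thompson mean. It is a simplification under stated assumptions: the inclusion probabilities are treated as fixed and known, the sampled units are resampled as if exchangeable, and no design-specific corrections are made. The function names are hypothetical, and this is not the procedure used by any particular software package.

```python
import random
import statistics


def ht_mean(y, pi, N):
    """Horvitz-Thompson estimate of the mean for values y with probabilities pi."""
    return sum(yi / p for yi, p in zip(y, pi)) / N


def bootstrap_variance(y, pi, N, n_boot=1000, seed=0):
    """Naive bootstrap estimate of the variance of the HT mean.

    Assumption: (Y_i, pi_i) pairs are resampled with replacement and the
    inclusion probabilities are NOT re-estimated in each replicate.
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    n = len(y)
    replicates = []
    for _ in range(n_boot):
        # Draw n indices with replacement and recompute the estimator.
        idx = [rng.randrange(n) for _ in range(n)]
        replicates.append(ht_mean([y[i] for i in idx], [pi[i] for i in idx], N))
    return statistics.variance(replicates)


# Toy data (illustrative only).
print(bootstrap_variance([1, 5, 9], [0.5, 0.5, 0.5], N=6, n_boot=200))
```

In a post-stratified analysis the weights themselves would typically be recomputed within each replicate, which this sketch deliberately omits.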
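The Horvitz–Thompson estimator can be shown to be unbiased by evaluating its expectation, $E\left[\bar{X}_n^{HT}\right]$, as follows:
$$\begin{aligned}
E\left[\bar{X}_n^{HT}\right] &= E\left[\frac{1}{N} \sum_{i=1}^{n} \frac{X_{I_i}}{\pi_{I_i}}\right] \\
&= E\left[\frac{1}{N} \sum_{i=1}^{N} \frac{X_i}{\pi_i} \mathbf{1}_{i \in D_n}\right] \\
&= \sum_{b=1}^{B} P\left(D_n^{(b)}\right) \left[\frac{1}{N} \sum_{i=1}^{N} \frac{X_i}{\pi_i} \mathbf{1}_{i \in D_n^{(b)}}\right] \\
&= \frac{1}{N} \sum_{i=1}^{N} \frac{X_i}{\pi_i} \sum_{b=1}^{B} \mathbf{1}_{i \in D_n^{(b)}} P\left(D_n^{(b)}\right) \\
&= \frac{1}{N} \sum_{i=1}^{N} \left(\frac{X_i}{\pi_i}\right) \pi_i \\
&= \frac{1}{N} \sum_{i=1}^{N} X_i,
\end{aligned}$$
where $D_n = \{x_1, x_2, \ldots, x_n\}$ is the sample, $I_i$ is the population index of the $i$th sampled unit, and $D_n^{(b)}$, $b = 1, \ldots, B$, ranges over the possible samples. The second-to-last step uses the fact that the probabilities of all samples containing unit $i$ sum to its inclusion probability, $\sum_{b=1}^{B} \mathbf{1}_{i \in D_n^{(b)}} P\left(D_n^{(b)}\right) = \pi_i$.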
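The derivation can be checked numerically on a small population by enumerating every possible sample. The Python sketch below is illustrative only: the function name `ht_expectation`, the toy population, and the unequal sample probabilities are all assumptions. It uses fixed-size samples without replacement and exact rational arithmetic, so the comparison involves no rounding error.

```python
from fractions import Fraction
from itertools import combinations


def ht_expectation(x, n, weights=None):
    """Exact expectation of the HT mean over all size-n samples.

    x       : population values X_1, ..., X_N (integers here, for exactness)
    n       : sample size
    weights : positive integer weights, one per sample, proportional to the
              sample probabilities P(D_n^(b)); equal weights if omitted
    Returns (expectation, inclusion probabilities pi_i).
    """
    N = len(x)
    samples = list(combinations(range(N), n))  # all B possible samples
    if weights is None:
        weights = [1] * len(samples)
    total_weight = sum(weights)
    probs = [Fraction(w, total_weight) for w in weights]

    # pi_i is the sum of the probabilities of the samples that contain unit i.
    pi = [sum(p for s, p in zip(samples, probs) if i in s) for i in range(N)]

    # E[estimator] = sum_b P(D^(b)) * (1/N) * sum_{i in D^(b)} X_i / pi_i
    expectation = sum(
        p * sum(Fraction(x[i]) / pi[i] for i in s) / N
        for s, p in zip(samples, probs)
    )
    return expectation, pi


x = [3, 5, 10, 2]  # toy population (illustrative only)
expectation, pi = ht_expectation(x, n=2, weights=[1, 2, 3, 4, 5, 6])
print(expectation == Fraction(sum(x), len(x)))
```

The equality holds for any sampling design in which every unit has a positive inclusion probability, which is the condition the derivation relies on when dividing by $\pi_i$.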