Mean squared prediction error

In statistics, the mean squared prediction error (MSPE) of a smoothing or curve-fitting procedure is the expected value of the squared difference between the fitted values implied by the predictive function $\hat{g}$ and the values of the (unobservable) function $g$. It is an inverse measure of the explanatory power of $\hat{g}$ and can be used in the cross-validation of an estimated model.

If the smoothing or fitting procedure has operator matrix (i.e., hat matrix) $L$, which maps the observed values vector $y$ to the predicted values vector $\hat{y}$ via $\hat{y} = Ly$, then

$$\operatorname{MSPE}(L) = \operatorname{E}\left[\sum_{i=1}^{n} \left(g(x_i) - \hat{g}(x_i)\right)^2\right].$$

The MSPE can be decomposed into two terms, just as the mean squared error is decomposed into bias and variance. For the MSPE, one term is the sum of squared biases of the fitted values and the other is the sum of the variances of the fitted values:

$$\operatorname{MSPE}(L) = \sum_{i=1}^{n} \left(\operatorname{E}[\hat{g}(x_i)] - g(x_i)\right)^2 + \sum_{i=1}^{n} \operatorname{var}[\hat{g}(x_i)].$$

Note that knowledge of g is required in order to calculate MSPE exactly.
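
The decomposition can be seen by adding and subtracting $\operatorname{E}[\hat{g}(x_i)]$ inside each squared term. A brief sketch for a single point $x_i$, treating $g(x_i)$ as fixed (non-random):

$$
\begin{aligned}
\operatorname{E}\left[\left(g(x_i) - \hat{g}(x_i)\right)^2\right]
&= \operatorname{E}\left[\left(\left(g(x_i) - \operatorname{E}[\hat{g}(x_i)]\right) + \left(\operatorname{E}[\hat{g}(x_i)] - \hat{g}(x_i)\right)\right)^2\right] \\
&= \left(\operatorname{E}[\hat{g}(x_i)] - g(x_i)\right)^2 + \operatorname{var}[\hat{g}(x_i)].
\end{aligned}
$$

The cross term vanishes because $\operatorname{E}\left[\operatorname{E}[\hat{g}(x_i)] - \hat{g}(x_i)\right] = 0$. Summing over $i = 1, \dots, n$ gives the two sums in the decomposition.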

Estimation of MSPE

For the model $y_i = g(x_i) + \sigma \varepsilon_i$, where $\varepsilon_i \sim \mathcal{N}(0, 1)$, one may write

$$\operatorname{MSPE}(L) = g'(I - L)'(I - L)g + \sigma^2 \operatorname{tr}[L'L].$$
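
As a numerical illustration of this matrix form, the following sketch compares the closed-form value with a Monte Carlo average of the summed squared prediction error. Everything specific here is an assumption made for the example and not part of the article: the Gaussian kernel smoother used to build $L$, the sine function standing in for the true $g$, and the values of `sigma`, `bandwidth`, and `reps`. The noise terms are drawn independently.

```python
import math
import random


def kernel_smoother_matrix(xs, bandwidth):
    """Hat matrix of a Gaussian kernel smoother; each row of weights sums to 1."""
    L = []
    for xi in xs:
        weights = [math.exp(-0.5 * ((xi - xj) / bandwidth) ** 2) for xj in xs]
        total = sum(weights)
        L.append([w / total for w in weights])
    return L


def matvec(L, v):
    """Matrix-vector product L v for a list-of-rows matrix."""
    return [sum(lij * vj for lij, vj in zip(row, v)) for row in L]


def mspe_closed_form(L, g, sigma):
    """g'(I - L)'(I - L)g + sigma^2 tr[L'L]."""
    Lg = matvec(L, g)
    bias_sq = sum((gi - lgi) ** 2 for gi, lgi in zip(g, Lg))
    # tr[L'L] equals the sum of the squared entries of L.
    trace_LtL = sum(lij ** 2 for row in L for lij in row)
    return bias_sq + sigma ** 2 * trace_LtL


def mspe_monte_carlo(L, g, sigma, reps, seed=0):
    """Average of sum_i (g(x_i) - ghat(x_i))^2 over simulated data sets."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        y = [gi + sigma * rng.gauss(0.0, 1.0) for gi in g]
        fitted = matvec(L, y)
        total += sum((gi - fi) ** 2 for gi, fi in zip(g, fitted))
    return total / reps


n = 25
xs = [i / (n - 1) for i in range(n)]
g = [math.sin(2 * math.pi * x) for x in xs]  # hypothetical "true" function
sigma = 0.3                                   # assumed noise scale
L = kernel_smoother_matrix(xs, bandwidth=0.08)

exact = mspe_closed_form(L, g, sigma)
simulated = mspe_monte_carlo(L, g, sigma, reps=2000)
print(f"closed form: {exact:.4f}, Monte Carlo: {simulated:.4f}")
```

The two numbers should agree up to simulation noise. The closed form is only computable here because $g$ is known by construction, which is not the case in practice.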

The first term is equivalent to

$$\sum_{i=1}^{n} \left(\operatorname{E}[\hat{g}(x_i)] - g(x_i)\right)^2 = \operatorname{E}\left[\sum_{i=1}^{n} \left(y_i - \hat{g}(x_i)\right)^2\right] - \sigma^2 \operatorname{tr}\left[(I - L)'(I - L)\right].$$

Thus,

$$\operatorname{MSPE}(L) = \operatorname{E}\left[\sum_{i=1}^{n} \left(y_i - \hat{g}(x_i)\right)^2\right] - \sigma^2 \left(n - 2\operatorname{tr}[L]\right).$$
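
The simplification in this last step rests on expanding the trace, using its linearity together with $\operatorname{tr}[I] = n$ and $\operatorname{tr}[L'] = \operatorname{tr}[L]$. Spelled out as a short sketch:

$$
\begin{aligned}
\operatorname{tr}\left[(I - L)'(I - L)\right]
&= \operatorname{tr}[I] - \operatorname{tr}[L] - \operatorname{tr}[L'] + \operatorname{tr}[L'L] \\
&= n - 2\operatorname{tr}[L] + \operatorname{tr}[L'L],
\end{aligned}
$$

so that

$$
\sigma^2 \operatorname{tr}[L'L] - \sigma^2 \operatorname{tr}\left[(I - L)'(I - L)\right] = -\sigma^2 \left(n - 2\operatorname{tr}[L]\right).
$$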

If $\sigma^2$ is known or well-estimated by $\hat{\sigma}^2$, it becomes possible to estimate MSPE by

$$\widehat{\operatorname{MSPE}}(L) = \sum_{i=1}^{n} \left(y_i - \hat{g}(x_i)\right)^2 - \hat{\sigma}^2 \left(n - 2\operatorname{tr}[L]\right).$$
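
A minimal sketch of this estimator for a linear smoother given as a list-of-rows matrix. The function name `estimate_mspe`, the toy data, and the value of `sigma2_hat` are hypothetical choices for illustration. The two smoothers are deliberately extreme: the identity matrix (which interpolates the data) and the constant-mean smoother.

```python
def estimate_mspe(y, L, sigma2_hat):
    """Residual sum of squares minus sigma2_hat * (n - 2 tr[L])."""
    n = len(y)
    fitted = [sum(lij * yj for lij, yj in zip(row, y)) for row in L]
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    trace_L = sum(L[i][i] for i in range(n))
    return rss - sigma2_hat * (n - 2 * trace_L)


y = [1.0, 2.0, 4.0, 3.0]   # hypothetical observations
n = len(y)
sigma2_hat = 0.5           # assumed (known or estimated) noise variance

# Identity smoother: fitted values equal y, so RSS = 0 and tr[L] = n.
identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
# Mean smoother: every fitted value is the sample mean, so tr[L] = 1.
mean_smoother = [[1.0 / n] * n for _ in range(n)]

mspe_identity = estimate_mspe(y, identity, sigma2_hat)
mspe_mean = estimate_mspe(y, mean_smoother, sigma2_hat)
print(mspe_identity, mspe_mean)
```

For the identity smoother the estimate reduces to $n\hat{\sigma}^2$, which matches the exact MSPE of a fit with no bias and variance $\sigma^2$ at each point. Because a correction is subtracted from the residual sum of squares, the estimate can be negative in a particular sample.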

Colin Mallows advocated this method in the construction of his model selection statistic $C_p$, which is a normalized version of the estimated MSPE:

$$C_p = \frac{\sum_{i=1}^{n} \left(y_i - \hat{g}(x_i)\right)^2}{\hat{\sigma}^2} - n + 2\operatorname{tr}[L],$$

where $p$ comes from the fact that the number of parameters $p$ estimated for a parametric smoother is given by $p = \operatorname{tr}[L]$, and $C$ is in honor of Cuthbert Daniel.
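
As an illustration, the sketch below computes $C_p$ for two nested least-squares fits: an intercept-only model ($\operatorname{tr}[L] = 1$) and a simple linear regression ($\operatorname{tr}[L] = 2$). The data and the helper names are hypothetical, and estimating $\hat{\sigma}^2$ from the larger model's residuals is a common convention assumed here, not something the text above prescribes.

```python
def mallows_cp(rss, sigma2_hat, n, trace_L):
    """C_p = RSS / sigma2_hat - n + 2 tr[L]."""
    return rss / sigma2_hat - n + 2 * trace_L


def rss_intercept_only(y):
    """Residual sum of squares when every fitted value is the sample mean."""
    mean_y = sum(y) / len(y)
    return sum((yi - mean_y) ** 2 for yi in y)


def rss_simple_linear(x, y):
    """Residual sum of squares of the least-squares line y = a + b x."""
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))


x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [0.1, 1.9, 4.2, 5.8, 8.1, 9.9]   # hypothetical data, roughly y = 2x
n = len(y)

rss_small = rss_intercept_only(y)     # one parameter: tr[L] = 1
rss_full = rss_simple_linear(x, y)    # two parameters: tr[L] = 2
sigma2_hat = rss_full / (n - 2)       # assumed: variance estimate from the larger model

cp_small = mallows_cp(rss_small, sigma2_hat, n, trace_L=1)
cp_full = mallows_cp(rss_full, sigma2_hat, n, trace_L=2)
print(f"C_p intercept-only: {cp_small:.1f}, C_p linear: {cp_full:.1f}")
```

With $\hat{\sigma}^2$ taken from the larger model, that model's $C_p$ equals its parameter count by construction, so the comparison is informative mainly for the smaller candidates. Here the intercept-only fit has a far larger $C_p$, reflecting its large bias.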
