In statistics, the mean squared prediction error (MSPE) of a smoothing or curve-fitting procedure is the expected value of the squared difference between the fitted values implied by the predictive function $\hat{g}$ and the values of the (unobservable) true function $g$.
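If the smoothing or fitting procedure has operator matrix (i.e., hat matrix) $L$, which maps the vector of observed values $y$ to the vector of fitted values $\hat{y} = Ly$, then the MSPE, averaged over the $n$ design points $x_1, \dots, x_n$, is
$$\operatorname{MSPE}(L) = \frac{1}{n}\sum_{i=1}^{n} \operatorname{E}\left[\left(\hat{g}(x_i) - g(x_i)\right)^2\right].$$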
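The MSPE can be decomposed into two terms, just as mean squared error is decomposed into bias and variance; for MSPE, however, one term is the sum of squared biases of the fitted values and the other is the sum of variances of the fitted values:
$$n \cdot \operatorname{MSPE}(L) = \sum_{i=1}^{n} \left(\operatorname{E}\left[\hat{g}(x_i)\right] - g(x_i)\right)^2 + \sum_{i=1}^{n} \operatorname{Var}\left[\hat{g}(x_i)\right].$$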
Note that knowledge of g is required in order to calculate MSPE exactly.
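The decomposition can be checked numerically in a simulation, where g is known by construction. The following is a minimal sketch, not a reference implementation: the sine-shaped g, the additive Gaussian noise with standard deviation `sigma`, and the ridge-penalised polynomial smoother built by the hypothetical helper `smoother_matrix` are all assumptions chosen for illustration. It compares the sum of squared biases plus the sum of variances against a Monte Carlo average taken directly from the definition.

```python
import numpy as np

def smoother_matrix(x, degree, ridge):
    """Hypothetical linear smoother: L = X (X'X + ridge*I)^-1 X' for a polynomial basis."""
    X = np.vander(x, degree + 1, increasing=True)
    return X @ np.linalg.solve(X.T @ X + ridge * np.eye(degree + 1), X.T)

def mspe_decomposition(L, g, sigma):
    """Return (sum of squared biases, sum of variances) of the fitted values L @ y."""
    bias = L @ g - g                              # E[g_hat] - g, because E[y] = g
    sq_bias = float(bias @ bias)
    variance = float(sigma**2 * np.sum(L * L))    # sigma^2 * tr(L L') = sum_i Var[g_hat(x_i)]
    return sq_bias, variance

rng = np.random.default_rng(0)
n, sigma = 50, 0.5                                # assumed sample size and noise level
x = np.linspace(0.0, 1.0, n)
g = np.sin(2 * np.pi * x)                         # "true" function, known only because we simulate
L = smoother_matrix(x, degree=3, ridge=1e-3)

sq_bias, variance = mspe_decomposition(L, g, sigma)
mspe_exact = (sq_bias + variance) / n

# Monte Carlo check straight from the definition of MSPE.
reps = 20000
Y = g + sigma * rng.standard_normal((reps, n))    # each row is one simulated data set
G_hat = Y @ L.T                                   # fitted values for every data set
mspe_mc = float(np.mean((G_hat - g) ** 2))

print("decomposition:", mspe_exact, "Monte Carlo:", mspe_mc)
```

With real data g is unknown, so the bias term in `mspe_decomposition` cannot be evaluated; the sketch only works because the data are simulated.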
Estimation of MSPE
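For the model
$$y_i = g(x_i) + \sigma\varepsilon_i,$$
where the errors $\varepsilon_i$ are independent with mean 0 and variance 1 (for example, $\varepsilon_i \sim \mathcal{N}(0,1)$), one may write
$$n \cdot \operatorname{MSPE}(L) = g^{\top}(I-L)^{\top}(I-L)\,g + \sigma^2 \operatorname{tr}\left[L^{\top}L\right],$$
where $g$ here denotes the vector $(g(x_1), \dots, g(x_n))^{\top}$.
The first term is equivalent to
$$\sum_{i=1}^{n} \left(\operatorname{E}\left[\hat{g}(x_i)\right] - g(x_i)\right)^2 = \operatorname{E}\left[\sum_{i=1}^{n} \left(y_i - \hat{g}(x_i)\right)^2\right] - \sigma^2 \operatorname{tr}\left[(I-L)^{\top}(I-L)\right].$$
Thus, since $\operatorname{tr}\left[(I-L)^{\top}(I-L)\right] = n - 2\operatorname{tr}[L] + \operatorname{tr}\left[L^{\top}L\right]$,
$$n \cdot \operatorname{MSPE}(L) = \operatorname{E}\left[\sum_{i=1}^{n} \left(y_i - \hat{g}(x_i)\right)^2\right] - \sigma^2\left(n - 2\operatorname{tr}[L]\right).$$
If $\sigma^2$ is known or well estimated by $\hat{\sigma}^2$, it becomes possible to estimate MSPE by
$$n \cdot \widehat{\operatorname{MSPE}}(L) = \sum_{i=1}^{n} \left(y_i - \hat{g}(x_i)\right)^2 - \hat{\sigma}^2\left(n - 2\operatorname{tr}[L]\right).$$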
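The estimator of n·MSPE, the residual sum of squares minus the variance estimate times (n − 2 tr[L]), can be tried out on simulated data. This is a minimal sketch under stated assumptions: the function name `estimate_n_mspe`, the exponential g, the straight-line least-squares smoother, and the noise level are all illustrative choices, and the error variance is treated as known. Averaged over many simulated data sets, the estimate should land close to the exact value of n·MSPE computed from g.

```python
import numpy as np

def estimate_n_mspe(y, L, sigma2_hat):
    """Plug-in estimate of n * MSPE(L): RSS - sigma2_hat * (n - 2 tr L)."""
    resid = y - L @ y
    n = y.size
    return float(resid @ resid - sigma2_hat * (n - 2.0 * np.trace(L)))

rng = np.random.default_rng(1)
n, sigma = 40, 0.3                       # assumed sample size and noise level
x = np.linspace(-1.0, 1.0, n)
g = np.exp(x)                            # illustrative true function (unknown in practice)

# Hypothetical smoother: least-squares straight-line fit, L = X (X'X)^-1 X'.
X = np.column_stack([np.ones(n), x])
L = X @ np.linalg.solve(X.T @ X, X.T)

# Exact n * MSPE from the bias and variance terms (needs g, so simulation only).
I = np.eye(n)
n_mspe_exact = float(g @ (I - L).T @ (I - L) @ g + sigma**2 * np.trace(L.T @ L))

# Average the estimator over simulated data sets, treating sigma^2 as known.
reps = 5000
estimates = [
    estimate_n_mspe(g + sigma * rng.standard_normal(n), L, sigma**2)
    for _ in range(reps)
]
mean_estimate = float(np.mean(estimates))

print("exact n*MSPE:", n_mspe_exact, "mean estimate:", mean_estimate)
```

A single estimate can be far from the exact value, or even negative, because the residual sum of squares is itself random; only its expectation matches.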
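Colin Mallows advocated this method in the construction of his model selection statistic $C_p$, which is a normalized version of the estimated MSPE:
$$C_p = \frac{\sum_{i=1}^{n} \left(y_i - \hat{g}(x_i)\right)^2}{\hat{\sigma}^2} - n + 2\operatorname{tr}[L],$$
where $\hat{\sigma}^2$ is an estimate of the error variance and $p$ comes from the fact that the number of parameters $p$ estimated for a parametric smoother is given by $p = \operatorname{tr}[L]$.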
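For ordinary least squares the hat matrix is a projection, so its trace equals the number of fitted coefficients. The sketch below illustrates this and computes Mallows's statistic for nested polynomial fits. It is an illustration under assumptions, not a canonical recipe: the helper names `hat_matrix` and `mallows_cp`, the simulated quadratic data, and the choice to estimate the error variance from the largest candidate model are all introduced here.

```python
import numpy as np

def hat_matrix(X):
    """Least-squares hat matrix; equals X (X'X)^-1 X' when X has full column rank."""
    return X @ np.linalg.pinv(X)

def mallows_cp(y, X, sigma2_hat):
    """C_p = RSS / sigma2_hat - n + 2 tr[L] for the least-squares fit on X."""
    L = hat_matrix(X)
    resid = y - L @ y
    return float(resid @ resid / sigma2_hat - y.size + 2.0 * np.trace(L))

rng = np.random.default_rng(2)
n = 60
x = np.linspace(-1.0, 1.0, n)
# Simulated data with a quadratic truth (an assumption made for illustration).
y = 1.0 + 2.0 * x - 1.5 * x**2 + 0.2 * rng.standard_normal(n)

max_degree = 5
X_full = np.vander(x, max_degree + 1, increasing=True)

# Assumed variance estimate: residual variance of the largest candidate model.
resid_full = y - hat_matrix(X_full) @ y
sigma2_hat = float(resid_full @ resid_full / (n - X_full.shape[1]))

for degree in range(max_degree + 1):
    X = X_full[:, : degree + 1]
    p = X.shape[1]                               # number of fitted coefficients
    trace_L = float(np.trace(hat_matrix(X)))     # should match p up to rounding
    print(degree, p, round(trace_L, 6), mallows_cp(y, X, sigma2_hat))
```

With this variance estimate, the statistic for the largest model equals its own parameter count by construction, so comparisons are only informative for the smaller candidates.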