DFFITS is a diagnostic meant to show how influential a point is in a statistical regression. It was proposed in 1980. It is defined as the Studentized DFFIT, where the latter is the change in the predicted value for a point, obtained when that point is left out of the regression; Studentization is achieved by dividing by the estimated standard deviation of the fit at that point:
where
DFFITS is very similar to the externally Studentized residual, and is in fact equal to the latter times
As when the errors are Gaussian the externally Studentized residual is distributed as Student's t (with a number of degrees of freedom equal to the number of residual degrees of freedom minus one), DFFITS for a particular point will be distributed according to this same Student's t distribution multiplied by the leverage factor
For a perfectly balanced experimental design (such as a factorial design or balanced partial factorial design), the leverage for each point is p/n, the number of parameters divided by the number of points. This means that the DFFITS values will be distributed (in the Gaussian case) as
Although the raw values resulting from the equations are different, Cook's distance and DFFITS are conceptually identical and there is a closed-form formula to convert one value to the other.
Development
Previously when assessing a dataset before running a linear regression, the possibility of outliers would be assessed using histograms and scatterplots. Both methods of assessing data points were subjective and there was little way of knowing how much leverage each potential outlier had on the results data. This led to a variety of quantitative measures, including DFFIT, DFBETA.