In statistics and in particular in regression analysis, leverage is a measure of how far away the independent variable values of an observation are from those of the other observations.
Contents
High-leverage points are those observations, if any, made at extreme or outlying values of the independent variables such that the lack of neighboring observations means that the fitted regression model will pass close to that particular observation.
Modern computer packages for statistical analysis include, as part of their facilities for regression analysis, various quantitative measures for identifying influential observations: among these measures is partial leverage, a measure of how a variable contributes to the leverage of a datum.
Definition
In the linear regression model, the leverage score for the i-th data unit is defined as:
the i-th diagonal element of the projection matrix 
  
    
      
        
          
where 
  
    
      
        
Bounds on leverage
Proof
First, note that H is an idempotent matrix: 
  
    
      
        
and
Effect on residual variance
If we are in an ordinary least squares setting with fixed X, regression errors 
  
    
      
        
then 
  
    
      
        
In other words, if the model errors 
  
    
      
        
Proof
First, note that 
  
    
      
        
Thus 
  
    
      
        
Studentized residuals
The corresponding studentized residual—the residual adjusted for its observation–specific residual variance—is then
where 
  
    
      
        
          
            
