Stein's example is an important result in decision theory, which can be stated as follows:
The ordinary decision rule for estimating the mean of a multivariate Gaussian distribution is inadmissible under mean squared error risk in dimension at least 3.
The following is an outline of its proof. The reader is referred to the main article for more information.
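The risk function of the decision rule $d(\mathbf{x}) = \mathbf{x}$ is
$$R(\theta, d) = E_\theta\left[|\theta - \mathbf{X}|^2\right] = n,$$
where the last equality holds because $\mathbf{X}$ has $n$ coordinates, each with unit variance about the corresponding coordinate of $\theta$.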
Now consider the decision rule
$$d'(\mathbf{x}) = \mathbf{x} - \frac{\alpha}{|\mathbf{x}|^2}\,\mathbf{x},$$
where $\alpha = n - 2$. We will show that $d'$ is a better decision rule than $d$. The risk function is
$$R(\theta, d') = E_\theta\left[\left|\theta - \mathbf{X} + \frac{\alpha}{|\mathbf{X}|^2}\,\mathbf{X}\right|^2\right]$$
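which is a quadratic in $\alpha$. We may simplify the middle term by considering a general "well-behaved" function $h : \mathbf{x} \mapsto h(\mathbf{x}) \in \mathbb{R}$ and using integration by parts. For $1 \le i \le n$ and any continuously differentiable $h$ growing sufficiently slowly for large $x_i$, we have:
$$\begin{aligned}
E_\theta\left[(\theta_i - X_i)\, h(\mathbf{X}) \,\middle|\, X_j = x_j \ (j \ne i)\right]
&= \left[ h(\mathbf{x}) \left(\frac{1}{2\pi}\right)^{n/2} e^{-(1/2)(\mathbf{x}-\theta)^T(\mathbf{x}-\theta)} \right]_{x_i=-\infty}^{\infty}
 - \int \frac{\partial h}{\partial x_i}(\mathbf{x}) \left(\frac{1}{2\pi}\right)^{n/2} e^{-(1/2)(\mathbf{x}-\theta)^T(\mathbf{x}-\theta)} \, m(dx_i) \\
&= -E_\theta\left[ \frac{\partial h}{\partial x_i}(\mathbf{X}) \,\middle|\, X_j = x_j \ (j \ne i) \right],
\end{aligned}$$
where the boundary term vanishes because of the growth condition on $h$.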
Therefore,
$$E_\theta\left[(\theta_i - X_i)\, h(\mathbf{X})\right] = -E_\theta\left[\frac{\partial h}{\partial x_i}(\mathbf{X})\right].$$
(This result is known as Stein's lemma.)
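As a sanity check rather than a proof, the identity can be tested by simulation. The sketch below assumes $\mathbf{X} \sim N(\theta, I)$ and uses a hypothetical smooth, bounded test function `h` (not taken from the argument above) together with its hand-computed partial derivative; the names `stein_lemma_sides`, `h` and `dh_dx0` are illustrative only.

```python
import math
import random


def stein_lemma_sides(theta, i, h, dh_dxi, n_samples=100_000, seed=0):
    """Monte Carlo estimates of E[(theta_i - X_i) h(X)] and -E[dh/dx_i (X)].

    Assumes X ~ N(theta, I): independent coordinates with unit variance.
    """
    rng = random.Random(seed)
    lhs = 0.0
    rhs = 0.0
    for _ in range(n_samples):
        x = [rng.gauss(t, 1.0) for t in theta]
        lhs += (theta[i] - x[i]) * h(x)
        rhs -= dh_dxi(x)
    return lhs / n_samples, rhs / n_samples


# Hypothetical test function, chosen only because it is smooth and bounded.
def h(x):
    return math.tanh(x[0] + 0.5 * x[1])


# Its partial derivative with respect to the first coordinate x[0].
def dh_dx0(x):
    return 1.0 - math.tanh(x[0] + 0.5 * x[1]) ** 2


if __name__ == "__main__":
    lhs, rhs = stein_lemma_sides([1.0, -0.5, 2.0], 0, h, dh_dx0)
    print(f"E[(theta_0 - X_0) h(X)] ~ {lhs:.4f}")
    print(f"-E[dh/dx_0 (X)]         ~ {rhs:.4f}")
```

The two printed estimates should agree up to Monte Carlo error, which shrinks roughly like the inverse square root of `n_samples`.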
Now, we choose
$$h(\mathbf{x}) = \frac{x_i}{|\mathbf{x}|^2}.$$
If $h$ met the "well-behaved" condition (it doesn't, but this can be remedied; see below), we would have
$$\frac{\partial h}{\partial x_i} = \frac{1}{|\mathbf{x}|^2} - \frac{2 x_i^2}{|\mathbf{x}|^4}$$
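and so
$$\begin{aligned}
E_\theta\left[\frac{(\theta - \mathbf{X})^T \mathbf{X}}{|\mathbf{X}|^2}\right]
&= \sum_{i=1}^{n} E_\theta\left[(\theta_i - X_i)\,\frac{X_i}{|\mathbf{X}|^2}\right] \\
&= -\sum_{i=1}^{n} E_\theta\left[\frac{1}{|\mathbf{X}|^2} - \frac{2 X_i^2}{|\mathbf{X}|^4}\right] \\
&= -(n-2)\, E_\theta\left[\frac{1}{|\mathbf{X}|^2}\right].
\end{aligned}$$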
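The final equality rests on a pointwise identity: summing $1/|\mathbf{x}|^2 - 2x_i^2/|\mathbf{x}|^4$ over $i$ gives $(n-2)/|\mathbf{x}|^2$, because the $x_i^2$ sum to $|\mathbf{x}|^2$. A minimal sketch that checks this numerically at an arbitrary point follows; the function names are hypothetical.

```python
import math


def sum_of_partials(x):
    """Sum over i of 1/|x|^2 - 2 x_i^2 / |x|^4 (the derivative of x_i/|x|^2)."""
    r2 = sum(v * v for v in x)
    return sum(1.0 / r2 - 2.0 * v * v / r2 ** 2 for v in x)


def closed_form(x):
    """(n - 2) / |x|^2, where n is the dimension of x."""
    r2 = sum(v * v for v in x)
    return (len(x) - 2) / r2


if __name__ == "__main__":
    x = [1.0, -2.0, 0.5, 3.0]  # arbitrary nonzero point, n = 4
    print(sum_of_partials(x), closed_form(x))
    print(math.isclose(sum_of_partials(x), closed_form(x), rel_tol=1e-9))
```

In dimension 2 both expressions are zero, which is consistent with the shrinkage coefficient $n - 2$ vanishing there.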
Then returning to the risk function of $d'$:
$$R(\theta, d') = n - 2\alpha(n-2)\, E_\theta\left[\frac{1}{|\mathbf{X}|^2}\right] + \alpha^2\, E_\theta\left[\frac{1}{|\mathbf{X}|^2}\right].$$
This quadratic in $\alpha$ is minimized at
$$\alpha = n - 2,$$
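To see the minimization concretely, the risk can be treated as a function of $\alpha$ with $c = E_\theta[1/|\mathbf{X}|^2]$ held fixed. The sketch below evaluates that quadratic and its vertex. As an illustrative input it assumes $\theta = 0$, where $|\mathbf{X}|^2$ is chi-squared with $n$ degrees of freedom and $c = 1/(n-2)$ for $n \ge 3$; the function names are hypothetical.

```python
def risk_quadratic(alpha, n, c):
    """n - 2*alpha*(n - 2)*c + alpha**2 * c, where c stands for E[1/|X|^2]."""
    return n - 2.0 * alpha * (n - 2) * c + alpha ** 2 * c


def minimizing_alpha(n):
    """Vertex of the quadratic: -2(n - 2)c + 2*alpha*c vanishes at alpha = n - 2."""
    return n - 2


if __name__ == "__main__":
    n = 5
    c = 1.0 / (n - 2)  # value of E[1/|X|^2] in the special case theta = 0
    for alpha in (0.0, 1.0, 2.0, 3.0, 4.0):
        print(alpha, risk_quadratic(alpha, n, c))
```

Setting `alpha` to zero recovers the risk $n$ of the rule $d$, so any value of $\alpha$ strictly between $0$ and $2(n-2)$ already lowers the risk; $n - 2$ is simply the best such choice.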
giving
$$R(\theta, d') = R(\theta, d) - (n-2)^2\, E_\theta\left[\frac{1}{|\mathbf{X}|^2}\right]$$
which, in dimension $n \ge 3$, satisfies
$$R(\theta, d') < R(\theta, d),$$
making $d$ an inadmissible decision rule.
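The comparison can also be illustrated by simulation. The following is a minimal Monte Carlo sketch, assuming $\mathbf{X} \sim N(\theta, I)$ with an arbitrarily chosen $\theta$ in dimension 6; it estimates the risks of $d$ and $d'$ and the quantity $E_\theta[1/|\mathbf{X}|^2]$ that links them. The function name `simulate_risks` and the chosen $\theta$ are illustrative assumptions, not part of the proof.

```python
import random


def simulate_risks(theta, n_samples=100_000, seed=0):
    """Monte Carlo estimates for X ~ N(theta, I).

    Returns (risk of d, risk of d', estimate of E[1/|X|^2]).
    """
    n = len(theta)
    alpha = n - 2
    rng = random.Random(seed)
    loss_d = 0.0
    loss_d_prime = 0.0
    inv_norm2 = 0.0
    for _ in range(n_samples):
        x = [rng.gauss(t, 1.0) for t in theta]
        r2 = sum(v * v for v in x)
        shrink = 1.0 - alpha / r2  # d'(x) = (1 - alpha/|x|^2) x
        loss_d += sum((t - v) ** 2 for t, v in zip(theta, x))
        loss_d_prime += sum((t - shrink * v) ** 2 for t, v in zip(theta, x))
        inv_norm2 += 1.0 / r2
    return loss_d / n_samples, loss_d_prime / n_samples, inv_norm2 / n_samples


if __name__ == "__main__":
    theta = [1.0] * 6  # arbitrary illustrative mean, n = 6
    n = len(theta)
    risk_d, risk_d_prime, c = simulate_risks(theta)
    print(f"risk of d  ~ {risk_d:.3f} (theory: {n})")
    print(f"risk of d' ~ {risk_d_prime:.3f}")
    print(f"n - (n-2)^2 E[1/|X|^2] ~ {n - (n - 2) ** 2 * c:.3f}")
```

Up to sampling error, the second and third printed values should coincide and both should lie below the first.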
It remains to justify the use of
$$h(\mathbf{X}) = \frac{\mathbf{X}}{|\mathbf{X}|^2}.$$
This function is not continuously differentiable, since it is singular at $\mathbf{x} = 0$. However, the function
$$h(\mathbf{X}) = \frac{\mathbf{X}}{\epsilon + |\mathbf{X}|^2}$$
is continuously differentiable, and after following the algebra through and letting $\epsilon \to 0$, one obtains the same result.
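A small numerical illustration of this limit: for the regularized function, the sum of the partial derivatives $\partial/\partial x_i$ of $x_i/(\epsilon + |\mathbf{x}|^2)$ is $n/(\epsilon + |\mathbf{x}|^2) - 2|\mathbf{x}|^2/(\epsilon + |\mathbf{x}|^2)^2$, which tends to $(n-2)/|\mathbf{x}|^2$ at any fixed nonzero point. The sketch below only shows this pointwise convergence; it does not by itself justify exchanging the limit with the expectation. The function names are hypothetical.

```python
def regularized_divergence(x, eps):
    """Sum over i of d/dx_i [x_i / (eps + |x|^2)]."""
    n = len(x)
    r2 = sum(v * v for v in x)
    return n / (eps + r2) - 2.0 * r2 / (eps + r2) ** 2


def limiting_value(x):
    """(n - 2) / |x|^2, the value obtained for eps = 0 away from the origin."""
    r2 = sum(v * v for v in x)
    return (len(x) - 2) / r2


if __name__ == "__main__":
    x = [1.0, 2.0, 2.0]  # arbitrary nonzero point with |x|^2 = 9
    for eps in (1.0, 1e-2, 1e-4, 1e-6):
        print(eps, regularized_divergence(x, eps), limiting_value(x))
```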