In numerical linear algebra, the conjugate gradient method is an iterative method for numerically solving the linear system
Contents
- Derivation from the conjugate direction method
- The conjugate direction method
- Derivation from the ArnoldiLanczos iteration
- The general Arnoldi method
- The direct Lanczos method
- The conjugate gradient method from imposing orthogonality and conjugacy
- References
where                               
The intent of this article is to document the important steps in these derivations.
Derivation from the conjugate direction method
The conjugate gradient method can be seen as a special case of the conjugate direction method applied to minimization of the quadratic function
The conjugate direction method
In the conjugate direction method for minimizing
one starts with an initial guess                     
where                     
for any                     
The conjugate direction method is imprecise in the sense that no formulae are given for selection of the directions                     
Derivation from the Arnoldi/Lanczos iteration
The conjugate gradient method can also be seen as a variant of the Arnoldi/Lanczos iteration applied to solving linear systems.
The general Arnoldi method
In the Arnoldi iteration, one starts with a vector                     
by defining                     
In other words, for                     
Put in matrix form, the iteration is captured by the equation
where
with
When applying the Arnoldi iteration to solving linear systems, one starts with                     
The direct Lanczos method
For the rest of discussion, we assume that                               
This enables a short three-term recurrence for                     
Since                               
with convenient recurrences for                     
Rewrite                     
with
It is now important to observe that
In fact, there are short recurrences for                     
With this formulation, we arrive at a simple recurrence for                     
The relations above straightforwardly lead to the direct Lanczos method, which turns out to be slightly more complex.
The conjugate gradient method from imposing orthogonality and conjugacy
If we allow                     
As premises for the simplification, we now derive the orthogonality of                     
The residuals are mutually orthogonal because                     
To see the conjugacy of                     
is symmetric and lower triangular simultaneously and thus must be diagonal.
Now we can derive the constant factors                     
Due to the orthogonality of                     
Similarly, due to the conjugacy of                     
This completes the derivation.
