In numerical linear algebra, the conjugate gradient method is an iterative method for numerically solving the linear system
Contents
- Derivation from the conjugate direction method
- The conjugate direction method
- Derivation from the ArnoldiLanczos iteration
- The general Arnoldi method
- The direct Lanczos method
- The conjugate gradient method from imposing orthogonality and conjugacy
- References
where
The intent of this article is to document the important steps in these derivations.
Derivation from the conjugate direction method
The conjugate gradient method can be seen as a special case of the conjugate direction method applied to minimization of the quadratic function
The conjugate direction method
In the conjugate direction method for minimizing
one starts with an initial guess
where
for any
The conjugate direction method is imprecise in the sense that no formulae are given for selection of the directions
Derivation from the Arnoldi/Lanczos iteration
The conjugate gradient method can also be seen as a variant of the Arnoldi/Lanczos iteration applied to solving linear systems.
The general Arnoldi method
In the Arnoldi iteration, one starts with a vector
by defining
In other words, for
Put in matrix form, the iteration is captured by the equation
where
with
When applying the Arnoldi iteration to solving linear systems, one starts with
The direct Lanczos method
For the rest of discussion, we assume that
This enables a short three-term recurrence for
Since
with convenient recurrences for
Rewrite
with
It is now important to observe that
In fact, there are short recurrences for
With this formulation, we arrive at a simple recurrence for
The relations above straightforwardly lead to the direct Lanczos method, which turns out to be slightly more complex.
The conjugate gradient method from imposing orthogonality and conjugacy
If we allow
As premises for the simplification, we now derive the orthogonality of
The residuals are mutually orthogonal because
To see the conjugacy of
is symmetric and lower triangular simultaneously and thus must be diagonal.
Now we can derive the constant factors
Due to the orthogonality of
Similarly, due to the conjugacy of
This completes the derivation.