In mathematics, and in particular linear algebra, a pseudoinverse A+ of a matrix A is a generalization of the inverse matrix. The most widely known type of matrix pseudoinverse is the Moore–Penrose pseudoinverse, which was independently described by E. H. Moore in 1920, Arne Bjerhammar in 1951 and Roger Penrose in 1955. Earlier, Fredholm had introduced the concept of a pseudoinverse of integral operators in 1903. When referring to a matrix, the term pseudoinverse, without further specification, is often used to indicate the Moore–Penrose pseudoinverse. The term generalized inverse is sometimes used as a synonym for pseudoinverse.
Contents
- Notation
- Definition
- Properties
- Existence and uniqueness
- Basic properties
- Identities
- Reduction to Hermitian case
- Products
- Projectors
- Geometric construction
- Subspaces
- Limit relations
- Continuity
- Derivative
- Scalars
- Vectors
- Linearly independent columns
- Linearly independent rows
- Orthonormal columns or rows
- Circulant matrices
- Rank decomposition
- The QR method
- Singular value decomposition SVD
- Block matrices
- The iterative method of Ben Israel and Cohen
- Updating the pseudoinverse
- Software libraries
- Linear least squares
- Obtaining all solutions of a linear system
- Minimum norm solution to a linear system
- Condition number
- Generalizations
- References
A common use of the pseudoinverse is to compute a 'best fit' (least squares) solution to a system of linear equations that lacks a unique solution (see below under § Applications). Another use is to find the minimum (Euclidean) norm solution to a system of linear equations with multiple solutions. The pseudoinverse facilitates the statement and proof of results in linear algebra.
The pseudoinverse is defined and unique for all matrices whose entries are real or complex numbers. It can be computed using the singular value decomposition.
Notation
In the following discussion, the following conventions are adopted.
Definition
For
-
A A + A = A (AA+ need not be the general identity matrix, but it maps all column vectors of A to themselves); -
A + A A + = A + (A+ is a weak inverse for the multiplicative semigroup); -
( A A + ) ∗ = A A + (AA+ is Hermitian); and -
( A + A ) ∗ = A + A (A+A is also Hermitian).
In particular, when
This particular pseudoinverse constitutes a left inverse, since, in this case,
When
This is a right inverse, as
Properties
Proofs for some of these facts may be found on a separate page Proofs involving the Moore-Penrose pseudoinverse.
Existence and uniqueness
A matrix satisfying the first condition of the definition is known as a generalized inverse. If the matrix also satisfies the second definition, it is called a generalized reflexive inverse. Generalized inverses always exist but are not in general unique. Uniqueness is a consequence of the last two conditions.
Basic properties
Identities
The following identities can be used to cancel certain subexpressions or expand expressions involving pseudoinverses. Proofs for these properties can be found in the proofs subpage.
Reduction to Hermitian case
The computation of the pseudoinverse is reducible to its construction in the Hermitian case. This is possible through the equivalences:
as
Products
If
then
The last property yields the equivalences:
Projectors
Another property is the following: if
This can be proven by defining matrices
Geometric construction
If we view the matrix as a linear map
In other words: To find
This description is closely related to the Minimum norm solution to a linear system.
Subspaces
Limit relations
Continuity
Derivative
The derivative of a real valued pseudoinverse matrix which has constant rank at a point
Scalars
It is also possible to define a pseudoinverse for scalars and vectors. This amounts to treating these as matrices. The pseudoinverse of a scalar x is zero if x is zero and the reciprocal of x otherwise:
Vectors
The pseudoinverse of the null (all zero) vector is the transposed null vector. The pseudoinverse of a non-null vector is the conjugate transposed vector divided by its squared magnitude:
Linearly independent columns
If the columns of
It follows that
Linearly independent rows
If the rows of
It follows that
Orthonormal columns or rows
This is a special case of either full column rank or full row rank (treated above). If
Circulant matrices
For a circulant matrix
Rank decomposition
Let
The QR method
For
Considering the case when
which may be solved by forward substitution followed by back substitution.
The Cholesky decomposition may be computed without forming
so R is the Cholesky factor of
The case of full row rank is treated similarly by using the formula
Singular value decomposition (SVD)
A computationally simple and accurate way to compute the pseudo inverse is by using the singular value decomposition. If
The computational cost of this method is dominated by the cost of computing the SVD, which is several times higher than matrix–matrix multiplication, even if a state-of-the art implementation (such as that of LAPACK) is used.
The above procedure shows why taking the pseudo inverse is not a continuous operation: if the original matrix A has a singular value 0 (a diagonal entry of the matrix
Block matrices
Optimized approaches exist for calculating the pseudoinverse of block structured matrices.
The iterative method of Ben-Israel and Cohen
Another method for computing the pseudoinverse uses the recursion
which is sometimes referred to as hyper-power sequence. This recursion produces a sequence converging quadratically to the pseudoinverse of
Updating the pseudoinverse
For the cases where A has full row or column rank, and the inverse of the correlation matrix (
Similarly, it is possible to update the Cholesky factor when a row or column is added, without creating the inverse of the correlation matrix explicitly. However, updating the pseudoinverse in the general rank-deficient case is much more complicated.
Software libraries
The Python package NumPy provides a pseudoinverse calculation through its functions matrix.I
and linalg.pinv
; its pinv
uses the SVD-based algorithm. SciPy adds a function scipy.linalg.pinv
that uses a least-squares solver. High quality implementations of SVD, QR, and back substitution are available in standard libraries, such as LAPACK. Writing one's own implementation of SVD is a major programming project that requires a significant numerical expertise. In special circumstances, such as parallel computing or embedded computing, however, alternative implementations by QR or even the use of an explicit inverse might be preferable, and custom implementations may be unavoidable.
The MASS package for R provides a calculation of the Moore–Penrose pseudoinverse through the ginv
function. The ginv
function calculates a pseudoinverse using the singular value decomposition provided by the svd
function in the base R package.
The Octave programming language provides a pseudo inverse through the standard package function pinv
as well as the pseudo_inverse()
method.
Linear least-squares
The pseudoinverse provides a least squares solution to a system of linear equations. For
in general, a vector
This result is easily extended to systems with multiple right-hand sides, when the Euclidean norm is replaced by the Frobenius norm. Let
Obtaining all solutions of a linear system
If the linear system
has any solutions, they are all given by
for arbitrary vector w. Solution(s) exist if and only if
Minimum norm solution to a linear system
For linear systems
This result is easily extended to systems with multiple right-hand sides, when the Euclidean norm is replaced by the Frobenius norm. Let
Condition number
Using the pseudoinverse and a matrix norm, one can define a condition number for any matrix:
A large condition number implies that the problem of finding least-squares solutions to the corresponding system of linear equations is ill-conditioned in the sense that small errors in the entries of A can lead to huge errors in the entries of the solution.
Generalizations
In order to solve more general least-squares problems, one can define Moore–Penrose pseudoinverses for all continuous linear operators A : H1 → H2 between two Hilbert spaces H1 and H2, using the same four conditions as in our definition above. It turns out that not every continuous linear operator has a continuous linear pseudoinverse in this sense. Those that do are precisely the ones whose range is closed in H2.
In abstract algebra, a Moore–Penrose pseudoinverse may be defined on a *-regular semigroup. This abstract definition coincides with the one in linear algebra.