Let A be an m-by-n matrix over 𝕂, where 𝕂 denotes either the field of real numbers or the field of complex numbers. Then there is a unique n-by-m matrix A^{+} over 𝕂 satisfying all of the following four criteria:
1. A A^{+} A = A
2. A^{+} A A^{+} = A^{+}
3. (A A^{+})^{*} = A A^{+}
4. (A^{+} A)^{*} = A^{+} A
A^{+} is called the Moore–Penrose pseudoinverse of A. Notice that A is also the Moore–Penrose pseudoinverse of A^{+}. That is, (A^{+})^{+} = A.
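The four criteria are easy to check numerically. The following sketch is not part of the proof; it assumes NumPy is available and uses np.linalg.pinv, which computes the Moore–Penrose pseudoinverse via the SVD:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))
A_pinv = np.linalg.pinv(A)  # Moore–Penrose pseudoinverse via SVD

# The four defining criteria:
assert np.allclose(A @ A_pinv @ A, A)                  # A A+ A = A
assert np.allclose(A_pinv @ A @ A_pinv, A_pinv)        # A+ A A+ = A+
assert np.allclose((A @ A_pinv).conj().T, A @ A_pinv)  # (A A+)* = A A+
assert np.allclose((A_pinv @ A).conj().T, A_pinv @ A)  # (A+ A)* = A+ A

# A is in turn the pseudoinverse of A+: (A+)+ = A.
assert np.allclose(np.linalg.pinv(A_pinv), A)
```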
Useful lemmas
These results are used in the proofs below. In the following lemmas, A is a matrix with complex elements and n columns, and B is a matrix with complex elements and n rows.
Lemma 1: A*A = 0 ⇒ A = 0
The assumption says that all elements of A*A are zero. In particular, for each j,
0 = (A*A)_{jj} = ∑_{i} (A*)_{ji} A_{ij} = ∑_{i} |A_{ij}|².
Therefore, all A_{ij} equal 0, i.e., A = 0.
Lemma 2: A*AB = 0 ⇒ AB = 0
Multiplying the assumption on the left by B* gives 0 = B*A*AB = (AB)*(AB), so by Lemma 1, AB = 0.
Lemma 3: ABB* = 0 ⇒ AB = 0
This is proved in a manner similar to the argument of Lemma 2 (or by simply taking the Hermitian conjugate).
Proof of uniqueness
Suppose that B and C are two n-by-m matrices over 𝕂 satisfying the Moore–Penrose criteria for A. Observe that then
AB = (AB)* = B*A* = B*(ACA)* = B*A*C*A* = (AB)*(AC)* = (AB)(AC) = (ABA)C = AC.
Analogously we conclude that BA=CA. The proof is completed by observing that then
B = BAB = BAC = CAC = C.
Proof of existence
The proof proceeds in stages.
1-by-1 matrices
For any x ∈ 𝕂, define x^{+} = x^{−1} if x ≠ 0 and x^{+} = 0 if x = 0. It is easy to see that x^{+}, viewed as a 1-by-1 matrix, is a pseudoinverse of x: all four criteria reduce to scalar identities (such as x x^{+} x = x) that hold trivially in both cases.
Square diagonal matrices
Let D be a square matrix over 𝕂 with zeros off the main diagonal, and define D^{+} entrywise by (D^{+})_{ij} = (D_{ij})^{+}, using the 1-by-1 pseudoinverse above. Notice that D^{+} is also a square diagonal matrix.
We now show that D^{+} satisfies the four criteria. All the matrices involved are diagonal, so products may be computed entrywise; moreover, for every x ∈ 𝕂 the product x x^{+} = x^{+} x is either 0 or 1, hence real.

(D D^{+} D)_{ij} = D_{ij} D_{ij}^{+} D_{ij} = D_{ij} ⇒ D D^{+} D = D
(D^{+} D D^{+})_{ij} = D_{ij}^{+} D_{ij} D_{ij}^{+} = D_{ij}^{+} ⇒ D^{+} D D^{+} = D^{+}
((D D^{+})^{*})_{ij} = (D_{ji} D_{ji}^{+})^{*} = D_{ji} D_{ji}^{+} = D_{ij} D_{ij}^{+} ⇒ (D D^{+})^{*} = D D^{+}
((D^{+} D)^{*})_{ij} = (D_{ji}^{+} D_{ji})^{*} = D_{ji}^{+} D_{ji} = D_{ij}^{+} D_{ij} ⇒ (D^{+} D)^{*} = D^{+} D

(In the last two lines, the conjugate may be dropped because x x^{+} is real, and the indices may be swapped because both sides vanish unless i = j.)
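The diagonal construction can be illustrated numerically. This is an informal sketch assuming NumPy; diag_pinv is an illustrative helper name, not a standard API:

```python
import numpy as np

def diag_pinv(D):
    """Pseudoinverse of a square diagonal matrix: invert each nonzero
    diagonal entry and leave the zero entries at zero."""
    d = np.diag(D).astype(complex)
    nonzero = d != 0
    # Avoid dividing by zero: substitute 1 where d is zero, then mask it out.
    d_plus = np.where(nonzero, 1.0 / np.where(nonzero, d, 1.0), 0.0)
    return np.diag(d_plus)

D = np.diag([3.0, 0.0, 0.5, 0.0])
Dp = diag_pinv(D)

# The four Moore–Penrose criteria:
assert np.allclose(D @ Dp @ D, D)
assert np.allclose(Dp @ D @ Dp, Dp)
assert np.allclose((D @ Dp).conj().T, D @ Dp)
assert np.allclose((Dp @ D).conj().T, Dp @ D)
```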
Arbitrary matrices
The singular value decomposition theorem states that there exists a factorization of the form

A = U Σ V^{*}

where:
U is an m-by-m unitary matrix over 𝕂, Σ is an m-by-n matrix over 𝕂 with nonnegative real numbers on the diagonal and zeros off the diagonal, and V is an n-by-n unitary matrix over 𝕂.
Define Σ^{+} as the n-by-m matrix obtained from Σ by transposing it and taking the reciprocal of each nonzero diagonal entry. By the argument for diagonal matrices above (which extends directly to rectangular diagonal matrices), Σ^{+} satisfies the four criteria for Σ. Now define A^{+} = V Σ^{+} U^{*}.
We now show that A^{+} = V Σ^{+} U^{*} satisfies the four criteria, using the unitarity relations U^{*}U = I and V^{*}V = I:

A A^{+} A = U Σ V^{*} V Σ^{+} U^{*} U Σ V^{*} = U Σ Σ^{+} Σ V^{*} = U Σ V^{*} = A
A^{+} A A^{+} = V Σ^{+} U^{*} U Σ V^{*} V Σ^{+} U^{*} = V Σ^{+} Σ Σ^{+} U^{*} = V Σ^{+} U^{*} = A^{+}
(A A^{+})^{*} = (U Σ V^{*} V Σ^{+} U^{*})^{*} = (U Σ Σ^{+} U^{*})^{*} = U (Σ Σ^{+})^{*} U^{*} = U Σ Σ^{+} U^{*} = U Σ V^{*} V Σ^{+} U^{*} = A A^{+}
(A^{+} A)^{*} = (V Σ^{+} U^{*} U Σ V^{*})^{*} = (V Σ^{+} Σ V^{*})^{*} = V (Σ^{+} Σ)^{*} V^{*} = V Σ^{+} Σ V^{*} = V Σ^{+} U^{*} U Σ V^{*} = A^{+} A
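The construction A^{+} = V Σ^{+} U^{*} can be carried out directly. The sketch below (assuming NumPy; pinv_via_svd is an illustrative name) builds Σ^{+} by transposing Σ and inverting its nonzero singular values, and agrees with NumPy's built-in pseudoinverse:

```python
import numpy as np

def pinv_via_svd(A):
    """Compute A+ = V @ Sigma+ @ U* from the full SVD A = U @ Sigma @ V*."""
    U, s, Vh = np.linalg.svd(A, full_matrices=True)
    m, n = A.shape
    sigma_plus = np.zeros((n, m))  # Sigma+ is n-by-m
    tol = max(m, n) * np.finfo(float).eps * (s[0] if s.size else 0.0)
    for i, sv in enumerate(s):
        if sv > tol:
            sigma_plus[i, i] = 1.0 / sv  # reciprocal of each nonzero singular value
    return Vh.conj().T @ sigma_plus @ U.conj().T

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))
assert np.allclose(pinv_via_svd(A), np.linalg.pinv(A))
```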
(A*)^{+} = (A^{+})*
The proof works by showing that A^{+*} satisfies the four criteria for the pseudoinverse of A*. Since this amounts to just substitution, it is not shown here.
The proof of this relation is given as Exercise 1.18c in.
A+ = A+ A+* A*
A^{+} = A^{+}AA^{+} and AA^{+} = (AA^{+})* imply that A^{+} = A^{+}(A A^{+})* = A^{+}A^{+*}A*.
A+ = A* A+* A+
A^{+} = A^{+}AA^{+} and A^{+}A = (A^{+}A)* imply that A^{+} = (A^{+}A)*A^{+} = A*A^{+*}A^{+}.
A = A+* A* A
A = A A^{+} A and A A^{+} = (A A^{+})* imply that A = (A A^{+})* A = A^{+}* A* A.
A = A A* A+*
A = A A^{+} A and A^{+} A = (A^{+} A)* imply that A = A (A^{+} A)* = A A* A^{+}*.
A* = A* A A+
This is the conjugate transpose of A = A^{+}* A* A above.
A* = A+ A A*
This is the conjugate transpose of A = A A* A^{+}* above.
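All six identities above can be spot-checked numerically on a random complex matrix. This is an informal check assuming NumPy, not a substitute for the algebraic proofs:

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))
Ap = np.linalg.pinv(A)
Ah = A.conj().T    # A*
Aph = Ap.conj().T  # A+*

assert np.allclose(Ap, Ap @ Aph @ Ah)  # A+ = A+ A+* A*
assert np.allclose(Ap, Ah @ Aph @ Ap)  # A+ = A* A+* A+
assert np.allclose(A, Aph @ Ah @ A)    # A  = A+* A* A
assert np.allclose(A, A @ Ah @ Aph)    # A  = A A* A+*
assert np.allclose(Ah, Ah @ A @ Ap)    # A* = A* A A+
assert np.allclose(Ah, Ap @ A @ Ah)    # A* = A+ A A*
```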
Reduction to the Hermitian case
The results of this section show that the computation of the pseudoinverse is reducible to its construction in the Hermitian case. It suffices to show that the putative constructions satisfy the defining criteria.
A+ = A* (A A*)+
This relation is given as exercise 18(d) in, for the reader to prove, "for every matrix A". Write D = A*(AA*)^{+}. Observe that AA* = AA*(AA*)^{+}AA* = (AD)AA*, so that (AD − I)AA* = 0; by Lemma 3 (applied with AD − I and A in the roles of A and B), (AD − I)A = 0, i.e. ADA = A.
Similarly, (AA*)^{+}AA*(AA*)^{+} = (AA*)^{+} implies that A*(AA*)^{+}AA*(AA*)^{+} = A*(AA*)^{+} i.e. DAD = D.
Additionally, AD = AA*(AA*)^{+} so AD = (AD)*.
Finally, DA = A*(AA*)^{+}A implies that (DA)* = A* ((AA*)^{+})*A = A* ((AA*)^{+})A = DA.
Therefore D = A^{+}.
A+ = (A* A)+A*
This is proved in an analogous manner to the case above, using Lemma 2 instead of Lemma 3.
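Both reductions to the Hermitian case can be spot-checked numerically (a sketch assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 5))
Ah = A.conj().T  # A* (= A.T here, since A is real)

# Both formulas reduce the pseudoinverse of A to that of a Hermitian matrix.
lhs = np.linalg.pinv(A)
assert np.allclose(lhs, Ah @ np.linalg.pinv(A @ Ah))  # A+ = A*(AA*)+
assert np.allclose(lhs, np.linalg.pinv(Ah @ A) @ Ah)  # A+ = (A*A)+A*
```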
Products
For the first three proofs, we consider products C = AB.
A has orthonormal columns
If A has orthonormal columns, i.e. A*A = I, then A^{+} = A*. Write D = B^{+}A^{+} = B^{+}A*. We show that D satisfies the Moore–Penrose criteria.

CDC = ABB^{+}A*AB = ABB^{+}B = AB = C
DCD = B^{+}A*ABB^{+}A* = B^{+}BB^{+}A* = B^{+}A* = D
(CD)* = D*B*A* = A(B^{+})*B*A* = A(BB^{+})*A* = ABB^{+}A* = CD
(DC)* = B*A*D* = B*A*A(B^{+})* = (B^{+}B)* = B^{+}B = B^{+}A*AB = DC

Therefore D = C^{+}.
B has orthonormal rows
If B has orthonormal rows, i.e. BB* = I, then B^{+} = B*. Write D = B^{+}A^{+} = B*A^{+}. We show that D satisfies the Moore–Penrose criteria.

CDC = ABB*A^{+}AB = AA^{+}AB = AB = C
DCD = B*A^{+}ABB*A^{+} = B*A^{+}AA^{+} = B*A^{+} = D
(CD)* = D*B*A* = (A^{+})*BB*A* = (A^{+})*A* = (AA^{+})* = AA^{+} = ABB*A^{+} = CD
(DC)* = B*A*D* = B*A*(A^{+})*B = B*(A^{+}A)*B = B*A^{+}AB = DC

Therefore D = C^{+}.
A has full column rank and B has full row rank
Since A has full column rank, A*A is invertible, so (A*A)^{+} = (A*A)^{−1}. Similarly, since B has full row rank, BB* is invertible, so (BB*)^{+} = (BB*)^{−1}.
Write D = B^{+}A^{+} = B*(BB*)^{−1}(A*A)^{−1}A*. We show that D satisfies the Moore–Penrose criteria.

CDC = ABB*(BB*)^{−1}(A*A)^{−1}A*AB = AB = C
DCD = B*(BB*)^{−1}(A*A)^{−1}A*ABB*(BB*)^{−1}(A*A)^{−1}A* = B*(BB*)^{−1}(A*A)^{−1}A* = D
CD = ABB*(BB*)^{−1}(A*A)^{−1}A* = A(A*A)^{−1}A* = (A(A*A)^{−1}A*)* ⇒ (CD)* = CD
DC = B*(BB*)^{−1}(A*A)^{−1}A*AB = B*(BB*)^{−1}B = (B*(BB*)^{−1}B)* ⇒ (DC)* = DC

Therefore D = C^{+}.
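A numerical illustration of this full-rank product case (a sketch assuming NumPy; a random real A and B have full column and row rank almost surely, and for real matrices A* = Aᵀ):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((6, 4))  # full column rank (almost surely)
B = rng.standard_normal((4, 5))  # full row rank (almost surely)
C = A @ B

# D = B+ A+ = B*(BB*)^{-1} (A*A)^{-1} A*
D = B.T @ np.linalg.inv(B @ B.T) @ np.linalg.inv(A.T @ A) @ A.T
assert np.allclose(D, np.linalg.pinv(C))  # (AB)+ = B+ A+ in this case
```

Note that (AB)^{+} = B^{+}A^{+} does not hold for arbitrary products; the rank conditions above are what make this check succeed.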
Conjugate transpose
Here, B = A*, and thus C = AA*. Using the relation A^{+} = A*(AA*)^{+} established above, and the fact that (AA*)^{+} is Hermitian, we have A^{+*} = ((AA*)^{+})*A = (AA*)^{+}A. Therefore
A^{+*}A^{+} = (AA*)^{+}AA*(AA*)^{+} = (AA*)^{+} = C^{+}
and, since A*^{+} = A^{+*}, this shows that C^{+} = (AA*)^{+} = A*^{+}A^{+} = B^{+}A^{+}.
Projectors and subspaces
Define P = AA^{+} and Q = A^{+}A. Observe that P^{2} = AA^{+}AA^{+} = AA^{+} = P. Similarly Q^{2} = Q, and finally, P = P* and Q = Q*. Thus P and Q are orthogonal projection operators. Orthogonality follows from the relations P = P* and Q = Q*. Indeed, consider the operator P: any vector decomposes as
x = Px + (I − P)x
and for all vectors x and y satisfying Px = x and (I − P)y = y, we have
x*y = (Px)*(I − P)y = x*P*(I − P)y = x*P(I − P)y = x*(P − P²)y = 0.
It follows that PA = AA^{+}A = A and A^{+}P = A^{+}AA^{+} = A^{+}. Similarly, QA^{+} = A^{+} and AQ = A. The orthogonal components are now readily identified.
If y belongs to the range of A then for some x, y = Ax and Py = PAx = Ax = y. Conversely, if Py = y then y = AA^{+}y so that y belongs to the range of A. It follows that P is the orthogonal projector onto the range of A. I − P is then the orthogonal projector onto the orthogonal complement of the range of A, which equals the kernel of A*.
A similar argument using the relation QA* = A* establishes that Q is the orthogonal projector onto the range of A* and (I − Q) is the orthogonal projector onto the kernel of A.
Using the relations P(A^{+})* = P*(A^{+})* = (A^{+}P)* = (A^{+})* and P = P* = (A^{+})*A*, it follows that the range of P equals the range of (A^{+})*, which in turn implies that the range of I − P equals the kernel of A^{+}. Similarly, QA^{+} = A^{+} implies that the range of Q equals the range of A^{+}. Therefore, we find:
P is the orthogonal projector onto the range of A (equivalently, of (A^{+})*), and I − P is the orthogonal projector onto the kernel of A* (equivalently, of A^{+});
Q is the orthogonal projector onto the range of A* (equivalently, of A^{+}), and I − Q is the orthogonal projector onto the kernel of A.
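The projector properties can be illustrated numerically (a sketch assuming NumPy, for real matrices so that A* = Aᵀ):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((5, 3))
A_pinv = np.linalg.pinv(A)

P = A @ A_pinv  # orthogonal projector onto ran(A)
Q = A_pinv @ A  # orthogonal projector onto ran(A*)

# Idempotent and Hermitian:
assert np.allclose(P @ P, P) and np.allclose(P, P.T)
assert np.allclose(Q @ Q, Q) and np.allclose(Q, Q.T)

# P fixes the range of A, and Q A* = A*:
x = rng.standard_normal(3)
assert np.allclose(P @ (A @ x), A @ x)
assert np.allclose(Q @ A.T, A.T)
```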
Least-squares minimization
In the general case, it is shown here that for any m-by-n matrix A and any vectors b ∈ 𝕂^{m} and x ∈ 𝕂^{n},
‖Ax − b‖₂ ≥ ‖Az − b‖₂, where z = A^{+}b.
To prove this, we first note that (stating the complex case), using the fact that P = AA^{+} satisfies PA = A and P = P* = P², the vector Ax − Pb = P(Ax − b) lies in the range of P, while (I − P)b lies in its orthogonal complement,
so that
‖Ax − b‖₂² = ‖Ax − Pb‖₂² + ‖(I − P)b‖₂² ≥ ‖(I − P)b‖₂² = ‖Az − b‖₂²,
since Az = AA^{+}b = Pb, as claimed.
If A is injective, i.e. A has full column rank, then the minimum is attained only at x = z: equality forces Ax = Pb = Az, and injectivity then gives x = z.
Minimum-norm solution to a linear system
The proof above also shows that if the system Ax = b is satisfiable, i.e. has a solution, then necessarily z = A^{+}b is a solution, and in fact the solution of smallest Euclidean norm.
To see this, note first, with Q = A^{+}A, that Qz = A^{+}AA^{+}b = A^{+}b = z, and that for any solution x of Ax = b, Qx = A^{+}Ax = A^{+}b = z.
Thus, since Q is an orthogonal projector,
‖x‖₂² = ‖Qx‖₂² + ‖(I − Q)x‖₂² = ‖z‖₂² + ‖(I − Q)x‖₂² ≥ ‖z‖₂²,
with equality if and only if (I − Q)x = 0, that is, if and only if x = Qx = z.
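Both minimization claims can be illustrated numerically (a sketch assuming NumPy): for an underdetermined consistent system, z = A^{+}b is an exact solution whose norm is no larger than that of any other solution.

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((3, 5))  # underdetermined: Ax = b has many solutions
b = rng.standard_normal(3)

A_pinv = np.linalg.pinv(A)
z = A_pinv @ b  # minimum-norm solution

# z solves the system (consistent here, since A has full row rank almost surely)...
assert np.allclose(A @ z, b)

# ...and any other solution z + (I - Q)w has norm at least ||z||.
Q = A_pinv @ A
for _ in range(100):
    x = z + (np.eye(5) - Q) @ rng.standard_normal(5)  # another exact solution
    assert np.allclose(A @ x, b)
    assert np.linalg.norm(z) <= np.linalg.norm(x) + 1e-9
```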