Eigenvalue perturbation

In mathematics, an eigenvalue perturbation problem is that of finding the eigenvectors and eigenvalues of a system that is perturbed from one with known eigenvectors and eigenvalues. This is useful for studying how sensitive the original system's eigenvectors and eigenvalues are to changes in the system. This type of analysis was popularized by Lord Rayleigh, in his investigation of harmonic vibrations of a string perturbed by small inhomogeneities.

The derivations in this article are essentially self-contained and can be found in many texts on numerical linear algebra or numerical functional analysis.

Example

Suppose we have solutions to the generalized eigenvalue problem,

K_0 x_{0i} = \lambda_{0i} M_0 x_{0i}, \qquad (0)

where K0 and M0 are matrices. That is, we know the eigenvalues λ0i and eigenvectors x0i, for i = 1, ..., N. Now suppose we want to change the matrices by a small amount. That is, we want to find the eigenvalues and eigenvectors of

K x_i = \lambda_i M x_i \qquad (1)

where

K = K_0 + \delta K, \qquad M = M_0 + \delta M,

with the perturbations δK and δM much smaller than K and M respectively. Then we expect the new eigenvalues and eigenvectors to be similar to the original, plus small perturbations:

\lambda_i = \lambda_{0i} + \delta\lambda_i, \qquad x_i = x_{0i} + \delta x_i.
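As a concrete numerical sketch of this setup (the sizes, scales, and helper names below are illustrative assumptions, not part of the problem statement), the unperturbed pair can be built and solved with SciPy, which handles the symmetric-definite generalized problem directly:

```python
# Minimal sketch: build an unperturbed pair (K0, M0), solve the generalized
# eigenproblem, and form small symmetric perturbations dK, dM.
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
N = 5

def random_spd(n):
    """Random symmetric positive definite n-by-n matrix (illustrative)."""
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)

K0 = random_spd(N)
M0 = random_spd(N)

# Unperturbed solutions of K0 x = lambda M0 x.
lam0, X0 = eigh(K0, M0)          # lam0[i] = lambda_0i, X0[:, i] = x_0i

# Small symmetric perturbations, "much smaller" than K0 and M0.
dK = 1e-5 * random_spd(N)
dM = 1e-5 * random_spd(N)
K, M = K0 + dK, M0 + dM
```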

Steps

We assume that the matrices are symmetric and positive definite, and that we have scaled the eigenvectors such that

x_{0j}^\top M_0 x_{0i} = \delta_{ij} \qquad (2)

where δij is the Kronecker delta. Now we want to solve the equation

K x_i = \lambda_i M x_i.
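This normalization is not an extra burden in practice; scipy.linalg.eigh already returns eigenvectors scaled exactly this way, which a quick sketch (with illustrative random matrices again) can confirm:

```python
# Check the normalization (2): the eigenvectors are M0-orthonormal,
# i.e. X0.T @ M0 @ X0 is the identity matrix.
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5)); K0 = A @ A.T + 5 * np.eye(5)
B = rng.standard_normal((5, 5)); M0 = B @ B.T + 5 * np.eye(5)

lam0, X0 = eigh(K0, M0)
assert np.allclose(X0.T @ M0 @ X0, np.eye(5))
```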

Substituting the perturbation expansions into this equation, we get

(K_0 + \delta K)(x_{0i} + \delta x_i) = (\lambda_{0i} + \delta\lambda_i)(M_0 + \delta M)(x_{0i} + \delta x_i),

which expands to

K_0 x_{0i} + \delta K\, x_{0i} + K_0\, \delta x_i + \delta K\, \delta x_i = \lambda_{0i} M_0 x_{0i} + \lambda_{0i} M_0\, \delta x_i + \lambda_{0i}\, \delta M\, x_{0i} + \delta\lambda_i M_0 x_{0i} + \lambda_{0i}\, \delta M\, \delta x_i + \delta\lambda_i\, \delta M\, x_{0i} + \delta\lambda_i M_0\, \delta x_i + \delta\lambda_i\, \delta M\, \delta x_i.

Canceling the unperturbed equation (0) from both sides leaves

\delta K\, x_{0i} + K_0\, \delta x_i + \delta K\, \delta x_i = \lambda_{0i} M_0\, \delta x_i + \lambda_{0i}\, \delta M\, x_{0i} + \delta\lambda_i M_0 x_{0i} + \lambda_{0i}\, \delta M\, \delta x_i + \delta\lambda_i\, \delta M\, x_{0i} + \delta\lambda_i M_0\, \delta x_i + \delta\lambda_i\, \delta M\, \delta x_i.

Removing the higher-order terms, this simplifies to

K_0\, \delta x_i + \delta K\, x_{0i} = \lambda_{0i} M_0\, \delta x_i + \lambda_{0i}\, \delta M\, x_{0i} + \delta\lambda_i M_0 x_{0i}. \qquad (3)
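A useful way to see that (3) is the right first-order balance is to compute δλi and δxi from an exact re-solve and check that the residual of (3) is an order smaller than the perturbation. A sketch, under the same kind of illustrative random setup as above:

```python
# The residual of the first-order equation (3) should be O(eps^2),
# even though its individual terms are O(eps).
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(1)
N, eps, i = 5, 1e-6, 2

def spd(n):
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)

K0, M0 = spd(N), spd(N)
dK, dM = eps * spd(N), eps * spd(N)

lam0, X0 = eigh(K0, M0)
lam, X = eigh(K0 + dK, M0 + dM)

# Sign-align the perturbed eigenvector, then form the exact changes.
x = X[:, i] * np.sign(X[:, i] @ M0 @ X0[:, i])
dlam, dx = lam[i] - lam0[i], x - X0[:, i]

residual = (K0 @ dx + dK @ X0[:, i]
            - lam0[i] * M0 @ dx - lam0[i] * dM @ X0[:, i]
            - dlam * M0 @ X0[:, i])
print(np.linalg.norm(residual))   # ~ eps^2, far below eps
```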

Because the matrices are symmetric (and M0 is positive definite), the unperturbed eigenvectors are M0-orthogonal, and so we use them as a basis for the perturbed eigenvectors. That is, we want to construct

\delta x_i = \sum_{j=1}^{N} \varepsilon_{ij} x_{0j} \qquad (4)

where the εij are small constants that are to be determined. Substituting (4) into (3) and rearranging gives

K_0 \sum_{j=1}^{N} \varepsilon_{ij} x_{0j} + \delta K\, x_{0i} = \lambda_{0i} M_0 \sum_{j=1}^{N} \varepsilon_{ij} x_{0j} + \lambda_{0i}\, \delta M\, x_{0i} + \delta\lambda_i M_0 x_{0i} \qquad (5)

\sum_{j=1}^{N} \varepsilon_{ij} K_0 x_{0j} + \delta K\, x_{0i} = \lambda_{0i} M_0 \sum_{j=1}^{N} \varepsilon_{ij} x_{0j} + \lambda_{0i}\, \delta M\, x_{0i} + \delta\lambda_i M_0 x_{0i} \qquad \text{(applying } K_0 \text{ to the sum)}

\sum_{j=1}^{N} \varepsilon_{ij} \lambda_{0j} M_0 x_{0j} + \delta K\, x_{0i} = \lambda_{0i} M_0 \sum_{j=1}^{N} \varepsilon_{ij} x_{0j} + \lambda_{0i}\, \delta M\, x_{0i} + \delta\lambda_i M_0 x_{0i} \qquad \text{(using equation (0))}

Because the eigenvectors are M0-orthogonal when M0 is positive definite, we can remove the summations by left-multiplying by x_{0i}^\top:

\varepsilon_{ii} \lambda_{0i} x_{0i}^\top M_0 x_{0i} + x_{0i}^\top \delta K\, x_{0i} = \lambda_{0i} \varepsilon_{ii} x_{0i}^\top M_0 x_{0i} + \lambda_{0i} x_{0i}^\top \delta M\, x_{0i} + \delta\lambda_i x_{0i}^\top M_0 x_{0i}.

By use of equation (0) again:

\varepsilon_{ii} x_{0i}^\top K_0 x_{0i} + x_{0i}^\top \delta K\, x_{0i} = \lambda_{0i} \varepsilon_{ii} x_{0i}^\top M_0 x_{0i} + \lambda_{0i} x_{0i}^\top \delta M\, x_{0i} + \delta\lambda_i x_{0i}^\top M_0 x_{0i}. \qquad (6)

The two terms containing εii are equal, because left-multiplying (0) by x_{0i}^\top gives

x_{0i}^\top K_0 x_{0i} = \lambda_{0i} x_{0i}^\top M_0 x_{0i}.

Canceling those terms in (6) leaves

x_{0i}^\top \delta K\, x_{0i} = \lambda_{0i} x_{0i}^\top \delta M\, x_{0i} + \delta\lambda_i x_{0i}^\top M_0 x_{0i}.

Rearranging gives

\delta\lambda_i = \frac{x_{0i}^\top (\delta K - \lambda_{0i}\, \delta M)\, x_{0i}}{x_{0i}^\top M_0 x_{0i}}.

But by (2), this denominator is equal to 1. Thus

\delta\lambda_i = x_{0i}^\top (\delta K - \lambda_{0i}\, \delta M)\, x_{0i}.
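This is the key result: a first-order eigenvalue correction that needs only the unperturbed eigenpairs. A sketch of a numerical check (with illustrative random matrices and scales), comparing the prediction against an exact re-solve:

```python
# First-order eigenvalue corrections vs. exact recomputation.
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(2)
N, eps = 6, 1e-6

def spd(n):
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)

K0, M0 = spd(N), spd(N)
dK, dM = eps * spd(N), eps * spd(N)
lam0, X0 = eigh(K0, M0)

# dlam_i = x_0i^T (dK - lam_0i dM) x_0i, for every i at once.
dlam = np.array([X0[:, i] @ (dK - lam0[i] * dM) @ X0[:, i] for i in range(N)])

lam_exact, _ = eigh(K0 + dK, M0 + dM)
print(np.max(np.abs(lam0 + dlam - lam_exact)))   # ~ eps^2
```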

Then, by left-multiplying equation (5) by x_{0k}^\top:

\varepsilon_{ik} = \frac{x_{0k}^\top (\delta K - \lambda_{0i}\, \delta M)\, x_{0i}}{\lambda_{0i} - \lambda_{0k}}, \qquad i \neq k.

Or, renaming the indices:

\varepsilon_{ij} = \frac{x_{0j}^\top (\delta K - \lambda_{0i}\, \delta M)\, x_{0i}}{\lambda_{0i} - \lambda_{0j}}, \qquad i \neq j.
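Since (2) and (4) give εij = x_{0j}^\top M_0 δx_i, the formula can also be checked coefficient by coefficient against an exact re-solve. A sketch, again with illustrative random matrices:

```python
# eps_ij from the formula vs. the M0-projection of the exact change dx_i.
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(5)
N, eps, i = 5, 1e-6, 1

def spd(n):
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)

K0, M0 = spd(N), spd(N)
dK, dM = eps * spd(N), eps * spd(N)
lam0, X0 = eigh(K0, M0)
lam, X = eigh(K0 + dK, M0 + dM)

x = X[:, i] * np.sign(X[:, i] @ M0 @ X0[:, i])   # sign-align
dx = x - X0[:, i]

for j in range(N):
    if j == i:
        continue
    formula = X0[:, j] @ (dK - lam0[i] * dM) @ X0[:, i] / (lam0[i] - lam0[j])
    projection = X0[:, j] @ M0 @ dx               # eps_ij by (2) and (4)
    print(j, formula, projection)                 # agree to O(eps^2)
```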

To find εii, use the fact that the perturbed eigenvector is normalized,

x_i^\top M x_i = 1.

Expanding to first order and applying (2) and (4), this becomes

(x_{0i} + \delta x_i)^\top (M_0 + \delta M)(x_{0i} + \delta x_i) = 1 + x_{0i}^\top \delta M\, x_{0i} + 2\varepsilon_{ii} + \cdots = 1,

which implies

\varepsilon_{ii} = -\tfrac{1}{2}\, x_{0i}^\top \delta M\, x_{0i}.

Summary

\lambda_i = \lambda_{0i} + x_{0i}^\top (\delta K - \lambda_{0i}\, \delta M)\, x_{0i}

x_i = x_{0i}\left(1 - \tfrac{1}{2}\, x_{0i}^\top \delta M\, x_{0i}\right) + \sum_{j=1,\, j \neq i}^{N} \frac{x_{0j}^\top (\delta K - \lambda_{0i}\, \delta M)\, x_{0i}}{\lambda_{0i} - \lambda_{0j}}\, x_{0j}

for infinitesimal δK and δM (the higher-order terms in (3) being negligible).
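The summary translates directly into a few lines of NumPy. The sketch below assumes distinct unperturbed eigenvalues (so the denominators λ0i − λ0j do not vanish); the sizes and scales are again illustrative:

```python
# First-order perturbed eigenpair from the summary formulas.
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(3)
N, eps, i = 6, 1e-6, 0

def spd(n):
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)

K0, M0 = spd(N), spd(N)
dK, dM = eps * spd(N), eps * spd(N)
lam0, X0 = eigh(K0, M0)

x0i = X0[:, i]
lam_i = lam0[i] + x0i @ (dK - lam0[i] * dM) @ x0i

# The eps_ii term keeps x_i^T M x_i = 1 to first order.
x_i = x0i * (1 - 0.5 * (x0i @ dM @ x0i))
for j in range(N):
    if j != i:
        eps_ij = X0[:, j] @ (dK - lam0[i] * dM) @ x0i / (lam0[i] - lam0[j])
        x_i = x_i + eps_ij * X0[:, j]

# Compare against the exact perturbed eigenpair (sign-aligned).
lam_exact, X_exact = eigh(K0 + dK, M0 + dM)
xe = X_exact[:, i] * np.sign(X_exact[:, i] @ x_i)
print(abs(lam_i - lam_exact[i]), np.linalg.norm(x_i - xe))   # both ~ eps^2
```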

Results

This means it is possible to perform an efficient sensitivity analysis on λi as a function of changes in the entries of the matrices. (Recall that the matrices are symmetric, so changing the entry K(kℓ) also changes K(ℓk), hence the (2 − δkℓ) factor below.)

\frac{\partial \lambda_i}{\partial K_{(k\ell)}} = \frac{\partial}{\partial K_{(k\ell)}} \left( \lambda_{0i} + x_{0i}^\top (\delta K - \lambda_{0i}\, \delta M)\, x_{0i} \right) = x_{0i(k)}\, x_{0i(\ell)} \left( 2 - \delta_{k\ell} \right)

\frac{\partial \lambda_i}{\partial M_{(k\ell)}} = \frac{\partial}{\partial M_{(k\ell)}} \left( \lambda_{0i} + x_{0i}^\top (\delta K - \lambda_{0i}\, \delta M)\, x_{0i} \right) = -\lambda_i\, x_{0i(k)}\, x_{0i(\ell)} \left( 2 - \delta_{k\ell} \right).

Similarly

\frac{\partial x_i}{\partial K_{(k\ell)}} = \sum_{j=1,\, j \neq i}^{N} \frac{x_{0j(k)}\, x_{0i(\ell)} \left( 2 - \delta_{k\ell} \right)}{\lambda_{0i} - \lambda_{0j}}\, x_{0j}

\frac{\partial x_i}{\partial M_{(k\ell)}} = -x_{0i} \frac{x_{0i(k)}\, x_{0i(\ell)}}{2} \left( 2 - \delta_{k\ell} \right) - \sum_{j=1,\, j \neq i}^{N} \frac{\lambda_{0i}\, x_{0j(k)}\, x_{0i(\ell)}}{\lambda_{0i} - \lambda_{0j}}\, x_{0j} \left( 2 - \delta_{k\ell} \right).
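These derivatives can be spot-checked by finite differences. The sketch below (the indices i, k, ℓ and the random setup are illustrative choices) nudges the symmetric pair K(kℓ), K(ℓk) together, which is exactly why the (2 − δkℓ) factor appears:

```python
# Finite-difference check of d(lambda_i)/dK_(kl) for symmetric K.
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(4)
N, i, k, l = 5, 2, 0, 3         # arbitrary choices, here with k != l

def spd(n):
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)

K0, M0 = spd(N), spd(N)
lam0, X0 = eigh(K0, M0)

# Analytic sensitivity: x_0i(k) x_0i(l) (2 - delta_kl).
analytic = X0[k, i] * X0[l, i] * (2.0 - (k == l))

# Finite difference: the entries (k, l) and (l, k) move together.
h = 1e-7
dK = np.zeros((N, N))
dK[k, l] += h
if k != l:
    dK[l, k] += h
lam_h, _ = eigh(K0 + dK, M0)
numeric = (lam_h[i] - lam0[i]) / h

print(analytic, numeric)        # agree to several significant digits
```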

Existence of eigenvectors

Note that in the above example we assumed that both the unperturbed and the perturbed systems involved symmetric matrices, which guaranteed the existence of N linearly independent eigenvectors. An eigenvalue problem involving non-symmetric matrices is not guaranteed to have N linearly independent eigenvectors, though a sufficient condition is that K and M be simultaneously diagonalisable.
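A standard minimal illustration of the failure mode is the 2 × 2 Jordan block: it has a double eigenvalue but only one independent eigenvector direction, so an expansion like (4) has no complete basis to draw on:

```python
# A non-symmetric matrix without a full set of eigenvectors.
import numpy as np

J = np.array([[0.0, 1.0],
              [0.0, 0.0]])      # 2x2 Jordan block

w, v = np.linalg.eig(J)
print(w)   # double eigenvalue: [0. 0.]
print(v)   # both columns are (numerically) parallel to (1, 0)
```

Defective cases like this are also where first-order perturbation theory itself breaks down: perturbing the lower-left entry of J by ε moves the eigenvalues to ±√ε, a change of order √ε rather than ε.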
