Derivative of the exponential map

In the theory of Lie groups, the exponential map is a map from the Lie algebra g of a Lie group G into G. If G is a matrix Lie group, the exponential map reduces to the matrix exponential. The exponential map, denoted exp: g → G, is analytic and as such has a derivative d/dt exp(X(t)): Tg → TG, where X(t) is a C¹ path in the Lie algebra, and a closely related differential dexp: Tg → TG.

The formula is important in both pure and applied mathematics. It enters into proofs of theorems such as the Baker–Campbell–Hausdorff formula, and it is used frequently in physics, for example in quantum field theory, in the Magnus expansion in perturbation theory, and in lattice gauge theory.

Throughout, the notations exp(X) and e^X are used interchangeably to denote the exponential of an argument, except where, as noted, the two notations have distinct dedicated meanings. The calculus-style notation is preferred here for better readability in equations. On the other hand, the exp-style is sometimes more convenient for inline equations, and is necessary on the rare occasions when there is a real distinction to be made.

Statement

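The derivative of the exponential map is given by

$$ \frac{d}{dt} e^{X(t)} = e^{X} \, \frac{1 - e^{-\mathrm{ad}_X}}{\mathrm{ad}_X} \, \frac{dX(t)}{dt}. \qquad (1) $$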

Explanation

X = X(t) is a C¹ (continuously differentiable) path in the Lie algebra with derivative X ´(t) = dX(t)/dt. The argument t is omitted where not needed.

ad_X is the linear transformation of the Lie algebra given by ad_X(Y) = [X, Y]. It is the adjoint action of a Lie algebra on itself.
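For matrix Lie algebras, ad_X can be written down as an ordinary matrix acting on column-stacked matrices. The following is a minimal sketch, assuming NumPy and the standard identity vec(AYB) = (Bᵀ ⊗ A) vec(Y); the helper names `ad_matrix` and `vec` are illustrative and not part of the article.

```python
import numpy as np

def ad_matrix(X):
    """Matrix of ad_X acting on column-stacked (vectorized) n x n matrices."""
    n = X.shape[0]
    identity = np.eye(n)
    # vec(XY) = (I kron X) vec(Y) and vec(YX) = (X^T kron I) vec(Y)
    return np.kron(identity, X) - np.kron(X.T, identity)

def vec(Y):
    """Stack the columns of Y into one vector (column-major order)."""
    return Y.flatten(order="F")

X = np.array([[0.0, 1.0], [2.0, 0.0]])
Y = np.array([[1.0, 3.0], [0.0, -1.0]])

commutator = X @ Y - Y @ X
# ad_X applied to vec(Y) should reproduce vec([X, Y]).
residual = np.max(np.abs(ad_matrix(X) @ vec(Y) - vec(commutator)))
print(residual)
```

Representing ad_X as an n² × n² matrix makes it possible to apply power series in ad_X with plain matrix arithmetic.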

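The fraction (1 − exp(−ad_X))/ad_X is given by the power series

$$ \frac{1 - e^{-\mathrm{ad}_X}}{\mathrm{ad}_X} = \sum_{k=0}^{\infty} \frac{(-1)^k}{(k+1)!} (\mathrm{ad}_X)^k, \qquad (2) $$

derived from the power series of the exponential map of a linear endomorphism, as in matrix exponentiation.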

When G is a matrix Lie group, all occurrences of the exponential are given by their power series expansion.

When G is not a matrix Lie group, (1 − exp(−ad_X))/ad_X is still given by its power series (2), while the other two occurrences of exp in the formula, which are now the exponential map of Lie theory, refer to the time-one flow of the left-invariant vector field X (that is, an element of the Lie algebra as defined in the general case) on the Lie group G viewed as an analytic manifold. This still amounts to exactly the same formula as in the matrix case.

The formula applies to the case where exp is considered as a map on matrix space over ℝ or ℂ, see matrix exponential. When G = GL(n, ℂ) or GL(n, ℝ), the notions coincide precisely.

To compute the differential dexp of exp at X, dexp_X: Tg_X → TG_exp(X), the standard recipe

$$ d\exp_X Y = \left. \frac{d}{dt} e^{Z(t)} \right|_{t=0}, \qquad Z(0) = X, \quad Z'(0) = Y $$

is employed. With Z(t) = X + tY the result

$$ d\exp_X Y = e^{X} \, \frac{1 - e^{-\mathrm{ad}_X}}{\mathrm{ad}_X} \, Y \qquad (3) $$

follows immediately from (1). In particular, dexp₀: Tg₀ → TG_exp(0) = TG_e is the identity because Tg_X ≃ g (since g is a vector space) and TG_e ≃ g.
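As a sanity check of the differential formula for matrices, one can compare the series expression e^X Σ_k (−1)^k/(k+1)! (ad_X)^k Y with a central finite difference of exp(X + tY) at t = 0. This is only a sketch, assuming NumPy; `expm_taylor` is a hypothetical helper (a truncated power series, adequate only for matrices of moderate norm), and the test matrices and step size are arbitrary choices.

```python
import numpy as np

def expm_taylor(A, terms=40):
    """Matrix exponential by truncated power series (small norms only)."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        result = result + term
    return result

def dexp(X, Y, terms=40):
    """e^X * sum_k (-1)^k/(k+1)! * (ad_X)^k Y, via nested commutators."""
    total = np.zeros_like(Y)
    nested = Y.copy()   # (ad_X)^0 Y
    coeff = 1.0         # (-1)^k / (k+1)!
    for k in range(terms):
        total = total + coeff * nested
        nested = X @ nested - nested @ X
        coeff = -coeff / (k + 2)
    return expm_taylor(X) @ total

X = np.array([[0.3, -0.5], [0.8, 0.1]])
Y = np.array([[0.2, 0.4], [-0.1, 0.6]])

h = 1e-5
finite_difference = (expm_taylor(X + h * Y) - expm_taylor(X - h * Y)) / (2 * h)
error = np.max(np.abs(finite_difference - dexp(X, Y)))
print(error)
```

The two results should agree up to the discretization error of the finite difference.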

Proof

The proof given below assumes a matrix Lie group. This means that the exponential mapping from the Lie algebra to the matrix Lie group is given by the usual power series, i.e. matrix exponentiation. The conclusion of the proof still holds in the general case, provided each occurrence of exp is correctly interpreted. See comments on the general case below.

The outline of proof makes use of the technique of differentiation with respect to s of the parametrized expression

$$ \Gamma(s, t) = e^{-sX(t)} \frac{\partial}{\partial t} e^{sX(t)} $$

to obtain a first order differential equation for Γ which can then be solved by direct integration in s. The solution is then e^X Γ(1, t).

Lemma
Let Ad denote the adjoint action of the group on its Lie algebra. The action is given by Ad_A X = AXA⁻¹ for A ∈ G, X ∈ g. A frequently useful relationship between Ad and ad is given by

$$ \mathrm{Ad}_{e^X} = e^{\mathrm{ad}_X}, \qquad X \in \mathfrak{g}. \qquad (4) $$
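The relationship between Ad and ad can be checked numerically for matrices by comparing e^X Y e^{−X} with the series Σ_k (ad_X)^k Y / k!. The sketch below assumes NumPy; `expm_taylor` and `exp_ad` are hypothetical helper names, and the truncated series is only suitable for matrices of moderate norm.

```python
import numpy as np

def expm_taylor(A, terms=40):
    """Matrix exponential by truncated power series (small norms only)."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        result = result + term
    return result

def exp_ad(X, Y, terms=40):
    """e^{ad_X} Y = sum_k (ad_X)^k Y / k!, via nested commutators."""
    total = np.zeros_like(Y)
    nested = Y.copy()   # (ad_X)^k Y / k!, starting at k = 0
    for k in range(terms):
        total = total + nested
        nested = (X @ nested - nested @ X) / (k + 1)
    return total

X = np.array([[0.3, -0.5], [0.8, 0.1]])
Y = np.array([[0.2, 0.4], [-0.1, 0.6]])

conjugated = expm_taylor(X) @ Y @ expm_taylor(-X)   # Ad_{e^X} Y
error = np.max(np.abs(conjugated - exp_ad(X, Y)))
print(error)
```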

Proof
Using the product rule twice one finds,

$$ \frac{\partial \Gamma}{\partial s} = e^{-sX}(-X) \frac{\partial}{\partial t} e^{sX(t)} + e^{-sX} \frac{\partial}{\partial t} \left[ X(t) e^{sX(t)} \right] = e^{-sX} \frac{dX}{dt} e^{sX}. $$

Then one observes that

$$ \frac{\partial \Gamma}{\partial s} = \mathrm{Ad}_{e^{-sX}} X' = e^{-\mathrm{ad}_{sX}} X', $$

by (4) above. Integration yields

$$ \Gamma(1, t) = e^{-X(t)} \frac{\partial}{\partial t} e^{X(t)} = \int_0^1 \frac{\partial \Gamma}{\partial s} \, ds = \int_0^1 e^{-\mathrm{ad}_{sX}} X' \, ds. $$

Using the formal power series to expand the exponential, integrating term by term, and finally recognizing (2),

$$ \Gamma(1, t) = \int_0^1 \sum_{k=0}^{\infty} \frac{(-1)^k s^k}{k!} (\mathrm{ad}_X)^k \frac{dX}{dt} \, ds = \sum_{k=0}^{\infty} \frac{(-1)^k}{(k+1)!} (\mathrm{ad}_X)^k \frac{dX}{dt} = \frac{1 - e^{-\mathrm{ad}_X}}{\mathrm{ad}_X} \frac{dX}{dt}, $$

and the result follows. The proof, as presented here, is essentially the one given in Rossmann (2002). A proof with a more algebraic touch can be found in Hall (2015).
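The integration step of the proof can be illustrated numerically: the integral over s of e^{−sX} X′ e^{sX} from 0 to 1 should match the series Σ_k (−1)^k/(k+1)! (ad_X)^k X′. The following sketch assumes NumPy and uses Gauss–Legendre quadrature; `expm_taylor`, `gamma_by_quadrature`, and `gamma_by_series` are hypothetical names, and `dX` stands in for an arbitrary value of X′(t).

```python
import numpy as np

def expm_taylor(A, terms=40):
    """Matrix exponential by truncated power series (small norms only)."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        result = result + term
    return result

def gamma_by_quadrature(X, dX, nodes=20):
    """Integrate e^{-sX} dX e^{sX} over s in [0, 1] by Gauss-Legendre."""
    x, w = np.polynomial.legendre.leggauss(nodes)   # nodes/weights on [-1, 1]
    total = np.zeros_like(X)
    for xi, wi in zip(x, w):
        s = 0.5 * (xi + 1.0)                        # map node to [0, 1]
        total = total + 0.5 * wi * (expm_taylor(-s * X) @ dX @ expm_taylor(s * X))
    return total

def gamma_by_series(X, dX, terms=40):
    """sum_k (-1)^k/(k+1)! * (ad_X)^k dX, via nested commutators."""
    total = np.zeros_like(dX)
    nested = dX.copy()
    coeff = 1.0
    for k in range(terms):
        total = total + coeff * nested
        nested = X @ nested - nested @ X
        coeff = -coeff / (k + 2)
    return total

X = np.array([[0.3, -0.5], [0.8, 0.1]])
dX = np.array([[0.2, 0.4], [-0.1, 0.6]])

error = np.max(np.abs(gamma_by_quadrature(X, dX) - gamma_by_series(X, dX)))
print(error)
```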

Comments on the general case

The formula in the general case is given by

$$ \frac{d}{dt} \exp(C(t)) = \exp(C) \, \phi(-\mathrm{ad}(C)) \, C', $$

where

$$ \phi(z) = \frac{e^z - 1}{z} = 1 + \frac{1}{2!} z + \frac{1}{3!} z^2 + \cdots, $$

which formally reduces to

$$ \frac{d}{dt} \exp(C(t)) = \exp(C) \, \frac{1 - e^{-\mathrm{ad}_C}}{\mathrm{ad}_C} \, \frac{dC(t)}{dt}. $$

Here the exp-notation is used for the exponential mapping of the Lie algebra and the calculus-style notation in the fraction indicates the usual formal series expansion. For more information and two full proofs in the general case, see the freely available Sternberg (2004) reference.

Local behavior of the exponential map

The inverse function theorem together with the derivative of the exponential map provides information about the local behavior of exp. Any C^k map f between vector spaces (here first considering matrix Lie groups), with 1 ≤ k ≤ ∞ or k = ω (analytic), has a local C^k inverse, so that f is a C^k bijection on an open set around a point x in the domain, provided df_x is invertible. From (3) it follows that this will happen precisely when

$$ \frac{1 - e^{-\mathrm{ad}_X}}{\mathrm{ad}_X} $$

is invertible. This, in turn, happens when the eigenvalues of this operator are all nonzero. The eigenvalues of (1 − exp(−ad_X))/ad_X are related to those of ad_X as follows. If g is an analytic function of a complex variable expressed in a power series such that g(U) converges for a matrix U, then the eigenvalues of g(U) are g(λ_ij), where λ_ij are the eigenvalues of U; the double subscript is explained below. In the present case, with g(U) = (1 − exp(−U))/U and U = ad_X, the eigenvalues of (1 − exp(−ad_X))/ad_X are

$$ \frac{1 - e^{-\lambda_{ij}}}{\lambda_{ij}}, $$

where the λ_ij are the eigenvalues of ad_X. Putting (1 − exp(−λ_ij))/λ_ij = 0 one sees that dexp is invertible precisely when

$$ \lambda_{ij} \neq 2\pi i k, \qquad k = \pm 1, \pm 2, \ldots. $$

The eigenvalues of ad_X are, in turn, related to those of X. Let the eigenvalues of X be λ_i. Fix an ordered basis e_i of the underlying vector space V such that X is lower triangular. Then

$$ X e_i = \lambda_i e_i + \cdots, $$

with the remaining terms multiples of e_n with n > i. Let E_ij be the corresponding basis for matrix space, i.e. (E_ij)_kl = δ_ikδ_jl. Order this basis such that E_ij < E_nm if i − j < n − m. One checks that the action of ad_X is given by

$$ \mathrm{ad}_X E_{ij} = (\lambda_i - \lambda_j) E_{ij} + \cdots \equiv \lambda_{ij} E_{ij} + \cdots, $$

with the remaining terms multiples of E_mn with E_mn > E_ij. This means that ad_X is lower triangular with its eigenvalues λ_ij = λ_i − λ_j on the diagonal. The conclusion is that dexp_X is invertible, hence exp is a local bi-analytic bijection around X, when the eigenvalues of X satisfy

$$ \lambda_i - \lambda_j \neq 2\pi i k, \qquad k = \pm 1, \pm 2, \ldots, \qquad 1 \le i, j \le n = \dim V. $$
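The eigenvalue condition can be illustrated with a 2 × 2 example. For the rotation generator J with eigenvalues ±i, the matrix X = πJ has eigenvalue difference 2πi, so the operator (1 − exp(−ad_X))/ad_X should be singular there, while X = 0.5 J should give an invertible operator. The sketch below assumes NumPy and builds the operator from its power series on vectorized matrices; `ad_matrix` and `phi_minus_ad` are hypothetical helper names.

```python
import numpy as np

def ad_matrix(X):
    """Matrix of ad_X acting on column-stacked n x n matrices."""
    n = X.shape[0]
    identity = np.eye(n)
    return np.kron(identity, X) - np.kron(X.T, identity)

def phi_minus_ad(X, terms=80):
    """Matrix of (1 - exp(-ad_X))/ad_X from its power series."""
    A = ad_matrix(X)
    total = np.zeros_like(A)
    power = np.eye(A.shape[0])   # A^k
    coeff = 1.0                  # (-1)^k / (k+1)!
    for k in range(terms):
        total = total + coeff * power
        power = power @ A
        coeff = -coeff / (k + 2)
    return total

J = np.array([[0.0, -1.0], [1.0, 0.0]])   # eigenvalues +i and -i

# Smallest singular value: zero means the operator is not invertible.
sv_regular = np.linalg.svd(phi_minus_ad(0.5 * J), compute_uv=False).min()
sv_singular = np.linalg.svd(phi_minus_ad(np.pi * J), compute_uv=False).min()
print(sv_regular, sv_singular)
```

In the singular case the smallest singular value is zero only up to rounding error, since the series is summed in floating point.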

In particular, in the case of matrix Lie groups, since dexp₀ is invertible, it follows by the inverse function theorem that exp is a bi-analytic bijection in a neighborhood of 0 ∈ g in matrix space. Furthermore, exp is a bi-analytic bijection from a neighborhood of 0 ∈ g in g to a neighborhood of e ∈ G. The same conclusion holds for general Lie groups using the manifold version of the inverse function theorem.

It also follows from the implicit function theorem that dexp_ξ itself is invertible for ξ sufficiently small.

Derivation of a Baker–Campbell–Hausdorff formula

If Z(t) is defined such that

$$ e^{Z(t)} = e^{X} e^{tY}, $$

an expression for Z(1) = log(exp X exp Y), the BCH formula, can be derived from the above formula,

$$ \exp(-Z(t)) \frac{d}{dt} \exp(Z(t)) = \frac{1 - e^{-\mathrm{ad}_Z}}{\mathrm{ad}_Z} Z'(t). $$

Its left-hand side is easily seen to equal Y, since exp(−Z(t)) = e^{−tY} e^{−X} and d/dt exp(Z(t)) = e^X e^{tY} Y. Thus,

$$ Y = \frac{1 - e^{-\mathrm{ad}_Z}}{\mathrm{ad}_Z} Z'(t), $$

and hence, formally,

$$ Z'(t) = \frac{\mathrm{ad}_Z}{1 - e^{-\mathrm{ad}_Z}} Y \equiv \psi\left(e^{\mathrm{ad}_Z}\right) Y, \qquad \psi(w) = \frac{w \log w}{w - 1} = 1 + \sum_{m=1}^{\infty} \frac{(-1)^{m+1}}{m(m+1)} (w - 1)^m, \qquad \|w - 1\| < 1. $$

However, using the relationship between Ad and ad given by (4), it is straightforward to further see that

$$ e^{\mathrm{ad}_Z} = e^{\mathrm{ad}_X} e^{t \, \mathrm{ad}_Y} $$

and hence

$$ Z'(t) = \psi\left(e^{\mathrm{ad}_X} e^{t \, \mathrm{ad}_Y}\right) Y. $$

Putting this into the form of an integral in t from 0 to 1 yields,

$$ Z(1) = \log(\exp X \exp Y) = X + \left( \int_0^1 \psi\left(e^{\mathrm{ad}_X} e^{t \, \mathrm{ad}_Y}\right) dt \right) Y, $$

an integral formula for Z(1) that is more tractable in practice than the explicit Dynkin's series formula due to the simplicity of the series expansion of ψ. Note this expression consists of X+Y and nested commutators thereof with X or Y. A textbook proof along these lines can be found in Hall (2015) and Miller (1972).
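For small matrices the integral formula can be evaluated directly and checked against exp(X) exp(Y). The following is a rough sketch, assuming NumPy, with ad_X represented as a matrix on column-stacked matrices, ψ summed from its power series in (w − 1), and the t-integral done by Gauss–Legendre quadrature. All helper names (`expm_taylor`, `ad_matrix`, `psi`, `bch_integral`) are hypothetical, and X and Y are chosen small so that the ψ series converges.

```python
import numpy as np

def expm_taylor(A, terms=40):
    """Matrix exponential by truncated power series (small norms only)."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        result = result + term
    return result

def ad_matrix(X):
    """Matrix of ad_X acting on column-stacked n x n matrices."""
    n = X.shape[0]
    identity = np.eye(n)
    return np.kron(identity, X) - np.kron(X.T, identity)

def psi(W, terms=80):
    """psi(W) = 1 + sum_{m>=1} (-1)^(m+1)/(m(m+1)) (W - 1)^m; needs ||W - 1|| < 1."""
    n = W.shape[0]
    U = W - np.eye(n)
    total = np.eye(n)
    power = np.eye(n)
    for m in range(1, terms):
        power = power @ U
        total = total + (-1) ** (m + 1) / (m * (m + 1)) * power
    return total

def bch_integral(X, Y, nodes=20):
    """Z(1) = X + (integral_0^1 psi(e^{ad_X} e^{t ad_Y}) dt) Y."""
    n = X.shape[0]
    exp_ad_x = expm_taylor(ad_matrix(X))
    ad_y = ad_matrix(Y)
    x, w = np.polynomial.legendre.leggauss(nodes)   # nodes/weights on [-1, 1]
    integral = np.zeros_like(ad_y)
    for xi, wi in zip(x, w):
        t = 0.5 * (xi + 1.0)                        # map node to [0, 1]
        integral = integral + 0.5 * wi * psi(exp_ad_x @ expm_taylor(t * ad_y))
    z_vec = integral @ Y.flatten(order="F")
    return X + z_vec.reshape((n, n), order="F")

X = np.array([[0.0, 0.05], [0.1, 0.0]])
Y = np.array([[0.05, 0.0], [0.02, -0.05]])

Z = bch_integral(X, Y)
residual = np.max(np.abs(expm_taylor(Z) - expm_taylor(X) @ expm_taylor(Y)))
print(residual)
```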

Derivation of Dynkin's series formula

Dynkin's formula mentioned may also be derived analogously, starting from the parametric extension

$$ e^{Z(t)} = e^{tX} e^{tY}, $$

whence

$$ e^{-Z(t)} \frac{d e^{Z(t)}}{dt} = e^{-t \, \mathrm{ad}_Y} X + Y, $$

so that, using the above general formula,

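$$ Z' = \frac{\mathrm{ad}_Z}{1 - e^{-\mathrm{ad}_Z}} \left( e^{-t \, \mathrm{ad}_Y} X + Y \right) = \frac{\mathrm{ad}_Z}{e^{\mathrm{ad}_Z} - 1} \left( X + e^{t \, \mathrm{ad}_X} Y \right). $$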

Since, however,

$$ \mathrm{ad}_Z = \log(\exp(\mathrm{ad}_Z)) = \log\left(1 + (\exp(\mathrm{ad}_Z) - 1)\right) = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} \left(\exp(\mathrm{ad}_Z) - 1\right)^n, \qquad \|\mathrm{ad}_Z\| < \log 2, $$

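the last step by virtue of the Mercator series expansion, it follows that

$$ Z' = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} \left( e^{\mathrm{ad}_Z} - 1 \right)^{n-1} \left( X + e^{t \, \mathrm{ad}_X} Y \right), \qquad e^{\mathrm{ad}_Z} = e^{t \, \mathrm{ad}_X} e^{t \, \mathrm{ad}_Y}, $$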

and, thus, integrating,

$$ Z(1) = \int_0^1 dt \, \frac{dZ(t)}{dt} = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} \int_0^1 dt \, \left( e^{t \, \mathrm{ad}_X} e^{t \, \mathrm{ad}_Y} - 1 \right)^{n-1} \left( X + e^{t \, \mathrm{ad}_X} Y \right). $$

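It is at this point evident that the qualitative statement of the BCH formula holds, namely that Z lies in the Lie algebra generated by X and Y and is expressible as a series in repeated brackets. For each k, the terms for each partition thereof are organized inside the integral ∫ dt t^{k−1}. The resulting Dynkin's formula is then

$$ Z = \sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{k} \sum_{s \in S_k} \frac{1}{i_1 + j_1 + \cdots + i_k + j_k} \, \frac{[X^{(i_1)} Y^{(j_1)} \cdots X^{(i_k)} Y^{(j_k)}]}{i_1! \, j_1! \cdots i_k! \, j_k!}, \qquad i_r, j_r \ge 0, \quad i_r + j_r > 0, \quad 1 \le r \le k, $$

where S_k is the set of index sequences s = (i_1, j_1, …, i_k, j_k) satisfying the stated conditions, and the bracket denotes the right-nested commutator with i_1 copies of X, then j_1 copies of Y, and so on.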

For a similar proof with detailed series expansions, see Rossmann (2002).
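The lowest-order terms of the series are commonly quoted as Z = X + Y + ½[X, Y] + (1/12)([X, [X, Y]] + [Y, [Y, X]]) + ⋯; this expansion is not written out in the text above and is assumed here. The sketch below, using NumPy, checks that successive truncations bring exp(Z) closer to exp(X) exp(Y) for small matrices; `expm_taylor` and `comm` are hypothetical helper names.

```python
import numpy as np

def expm_taylor(A, terms=40):
    """Matrix exponential by truncated power series (small norms only)."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        result = result + term
    return result

def comm(A, B):
    """Matrix commutator [A, B]."""
    return A @ B - B @ A

X = np.array([[0.0, 0.05], [0.1, 0.0]])
Y = np.array([[0.05, 0.0], [0.02, -0.05]])
target = expm_taylor(X) @ expm_taylor(Y)

# Truncations of the series at first, second, and third order in X and Y.
Z1 = X + Y
Z2 = Z1 + 0.5 * comm(X, Y)
Z3 = Z2 + (comm(X, comm(X, Y)) + comm(Y, comm(Y, X))) / 12.0

errors = [np.max(np.abs(expm_taylor(Z) - target)) for Z in (Z1, Z2, Z3)]
print(errors)
```

Each additional order should reduce the mismatch, with the remaining error set by the omitted fourth-order brackets.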

References

Derivative of the exponential map Wikipedia

(Text) CC BY-SA
