Functional derivative

In the calculus of variations, a field of mathematical analysis, the functional derivative (or variational derivative) relates a change in a functional to a change in a function on which the functional depends.

In the calculus of variations, functionals are usually expressed in terms of an integral of functions, their arguments, and their derivatives. In an integrand L of a functional, if a function f is varied by adding to it another function δf that is arbitrarily small, and the resulting integrand is expanded in powers of δf, the coefficient of δf in the first order term is called the functional derivative.

For example, consider the functional

$$J[f] = \int_a^b L\bigl(x, f(x), f'(x)\bigr)\,dx,$$

where f′(x) ≡ df/dx. If f is varied by adding to it a function δf, and the resulting integrand L(x, f + δf, f′ + δf′) is expanded in powers of δf, then the change in the value of J to first order in δf can be expressed as follows:

$$\delta J = \int_a^b \frac{\delta J}{\delta f(x)}\,\delta f(x)\,dx.$$

The coefficient of δf(x), denoted δJ/δf(x), is called the functional derivative of J with respect to f at the point x. For this example functional, the functional derivative is the left-hand side of the Euler–Lagrange equation,

$$\frac{\delta J}{\delta f(x)} = \frac{\partial L}{\partial f} - \frac{d}{dx}\frac{\partial L}{\partial f'}.$$
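The relation between δJ and the Euler–Lagrange expression can be checked numerically. The following is a minimal sketch, assuming the illustrative integrand L = f′²/2 + f²/2 on [0, 1] with f(x) = x² and a variation δf(x) = sin(πx) that vanishes at both ends; none of these choices come from the text, and the grid size and step are arbitrary.

```python
import math

def J(f, h):
    """Approximate J[f] = ∫ (f'(x)^2 / 2 + f(x)^2 / 2) dx on a uniform grid."""
    # Derivative term: one forward difference per cell (midpoint-type rule).
    kinetic = sum(((f[i + 1] - f[i]) / h) ** 2 for i in range(len(f) - 1)) * h / 2
    # Value term: trapezoidal rule.
    potential = (sum(v * v for v in f) - (f[0] ** 2 + f[-1] ** 2) / 2) * h / 2
    return kinetic + potential

n = 1000
h = 1.0 / n
x = [i * h for i in range(n + 1)]
f = [xi ** 2 for xi in x]                      # f(x) = x^2
df = [math.sin(math.pi * xi) for xi in x]      # variation δf, zero at both ends
eps = 1e-4

f_plus = [a + eps * b for a, b in zip(f, df)]
f_minus = [a - eps * b for a, b in zip(f, df)]
# First-order change of J per unit eps, by central difference.
delta_J_numeric = (J(f_plus, h) - J(f_minus, h)) / (2 * eps)

# Euler–Lagrange expression for this L: ∂L/∂f − d/dx ∂L/∂f' = f − f'' = x^2 − 2.
delta_J_formula = sum((xi ** 2 - 2.0) * b for xi, b in zip(x, df)) * h

print(delta_J_numeric, delta_J_formula)
```

The two printed numbers should agree up to discretization error, because the boundary term from integrating by parts vanishes for this choice of δf.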

Definition

In this section, the functional derivative is defined. Then the functional differential is defined in terms of the functional derivative.

Given a manifold M representing (continuous/smooth) functions ρ (with certain boundary conditions etc.), and a functional F defined as

$$F\colon M \to \mathbb{R} \quad\text{or}\quad F\colon M \to \mathbb{C},$$

the functional derivative of F[ρ], denoted δF/δρ, is defined by

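$$\int \frac{\delta F}{\delta\rho}(x)\,\phi(x)\,dx = \lim_{\varepsilon\to 0}\frac{F[\rho+\varepsilon\phi]-F[\rho]}{\varepsilon} = \left[\frac{d}{d\varepsilon}F[\rho+\varepsilon\phi]\right]_{\varepsilon=0},$$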

where ϕ is an arbitrary function. The quantity εϕ is called the variation of ρ. In other words,

$$\phi \mapsto \left[\frac{d}{d\varepsilon}F[\rho+\varepsilon\phi]\right]_{\varepsilon=0}$$

is a linear functional, so by the Riesz–Markov–Kakutani representation theorem, this functional is given by integration against some measure. Then δF/δρ is defined to be the Radon–Nikodym derivative of this measure.

We think of the function δF/δρ as the gradient of F at the point ρ and

$$\int \frac{\delta F}{\delta\rho}(x)\,\phi(x)\,dx$$

as the directional derivative at the point ρ in the direction of ϕ. As in vector calculus, the inner product with the gradient gives the directional derivative.
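As a short worked instance of the definition, take the illustrative functional F[ρ] = ∫ ρ(x)² dx (an assumption chosen for simplicity, not an example from the text). Differentiating under the integral sign gives

$$\left[\frac{d}{d\varepsilon}\int \bigl(\rho(x)+\varepsilon\phi(x)\bigr)^2\,dx\right]_{\varepsilon=0} = \int 2\rho(x)\,\phi(x)\,dx \quad\Longrightarrow\quad \frac{\delta F}{\delta\rho}(x) = 2\rho(x),$$

so the "gradient" of this F at ρ is the function 2ρ, and its inner product with ϕ is the directional derivative.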

Functional differential

The differential (or variation or first variation) of the functional F[ρ] is

$$\delta F(\rho, \phi) = \int \frac{\delta F}{\delta\rho}(x)\,\phi(x)\,dx.$$

Heuristically, ϕ is the change in ρ, so we 'formally' have ϕ = δρ, and then this is similar in form to the total differential of a function F(ρ₁, ρ₂, ..., ρ_n),

$$dF = \sum_{i=1}^{n}\frac{\partial F}{\partial\rho_i}\,d\rho_i,$$

where ρ₁, ρ₂, ... , ρ_n are independent variables. Comparing the last two equations, the functional derivative δF/δρ(x) has a role similar to that of the partial derivative ∂F/∂ρ_i , where the variable of integration x is like a continuous version of the summation index i.
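The analogy with partial derivatives can be made concrete by sampling ρ on a grid. The sketch below assumes the illustrative functional F[ρ] = ∫ ρ³ dx on [0, 1] (not taken from the text), approximates it by a Riemann sum in the variables ρ_i = ρ(x_i), and compares the ordinary partial derivative ∂F/∂ρ_i, divided by the grid spacing, with δF/δρ(x_i) = 3ρ(x_i)².

```python
def F(rho, h):
    """Riemann-sum approximation of F[rho] = ∫ rho(x)^3 dx."""
    return sum(r ** 3 for r in rho) * h

n = 200
h = 1.0 / n
x = [(i + 0.5) * h for i in range(n)]   # midpoint grid on [0, 1]
rho = [1.0 + xi for xi in x]            # sample values rho_i = rho(x_i)

def partial(i, step=1e-4):
    """Central-difference estimate of ∂F/∂rho_i for the discretized functional."""
    up = rho[:]
    up[i] += step
    down = rho[:]
    down[i] -= step
    return (F(up, h) - F(down, h)) / (2 * step)

i = 50
# Dividing by the cell width turns the partial derivative into a density in x.
functional_derivative_at_xi = partial(i) / h
expected = 3 * rho[i] ** 2              # δF/δρ(x) = 3 ρ(x)^2 for this F

print(functional_derivative_at_xi, expected)
```

The factor 1/h reflects that the sum over i in dF becomes an integral over x, so each partial derivative carries one cell width.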

Formal description

The definition of a functional derivative may be made more mathematically precise and formal by defining the space of functions more carefully. For example, when the space of functions is a Banach space, the functional derivative becomes known as the Fréchet derivative, while one uses the Gâteaux derivative on more general locally convex spaces. Note that Hilbert spaces are special cases of Banach spaces. The more formal treatment allows many theorems from ordinary calculus and analysis to be generalized to corresponding theorems in functional analysis, as well as numerous new theorems to be stated.

Properties

Like the derivative of a function, the functional derivative satisfies the following properties, where F[ρ] and G[ρ] are functionals:

Linearity:

$$\frac{\delta(\lambda F + \mu G)[\rho]}{\delta\rho(x)} = \lambda\,\frac{\delta F[\rho]}{\delta\rho(x)} + \mu\,\frac{\delta G[\rho]}{\delta\rho(x)},$$

where λ, μ are constants.

Product rule:

$$\frac{\delta(FG)[\rho]}{\delta\rho(x)} = \frac{\delta F[\rho]}{\delta\rho(x)}\,G[\rho] + F[\rho]\,\frac{\delta G[\rho]}{\delta\rho(x)}.$$

Chain rules:

If F is a functional and G an operator, then

$$\frac{\delta F[G[\rho]]}{\delta\rho(y)} = \int dx\,\left.\frac{\delta F[G]}{\delta G(x)}\right|_{G=G[\rho]} \cdot \frac{\delta G[\rho](x)}{\delta\rho(y)}.$$

If G is an ordinary differentiable function g, then this reduces to

$$\frac{\delta F[g(\rho)]}{\delta\rho(y)} = \frac{\delta F[g(\rho)]}{\delta g[\rho(y)]}\,\frac{dg(\rho)}{d\rho(y)}.$$
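The product rule can be spot-checked on a grid. This sketch assumes two illustrative functionals, F[ρ] = ∫ ρ² dx and G[ρ] = ∫ sin ρ dx on [0, 1], with δF/δρ = 2ρ and δG/δρ = cos ρ; the functionals, the sample ρ and the direction ϕ are all hypothetical choices made for the demonstration.

```python
import math

n = 400
h = 1.0 / n
x = [(i + 0.5) * h for i in range(n)]          # midpoint grid on [0, 1]
rho = [math.exp(-xi) for xi in x]
phi = [math.cos(3.0 * xi) for xi in x]         # arbitrary direction

def integrate(values):
    return sum(values) * h

def F(r):
    return integrate([v * v for v in r])        # δF/δρ = 2ρ

def G(r):
    return integrate([math.sin(v) for v in r])  # δG/δρ = cos ρ

eps = 1e-5
plus = [r + eps * p for r, p in zip(rho, phi)]
minus = [r - eps * p for r, p in zip(rho, phi)]
# Directional derivative of the product FG, by central difference.
lhs = (F(plus) * G(plus) - F(minus) * G(minus)) / (2 * eps)

# Product rule integrated against phi: ∫ (δF/δρ · G + F · δG/δρ) φ dx.
F0, G0 = F(rho), G(rho)
rhs = integrate([(2.0 * r * G0 + F0 * math.cos(r)) * p for r, p in zip(rho, phi)])

print(lhs, rhs)
```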

Determining functional derivatives

We give a formula to determine functional derivatives for a common class of functionals that can be written as the integral of a function and its derivatives. This is a generalization of the Euler–Lagrange equation: indeed, the functional derivative was introduced in physics within the derivation of the Lagrange equation of the second kind from the principle of least action in Lagrangian mechanics (18th century). The first three examples below are taken from density functional theory (20th century), the fourth from statistical mechanics (19th century).

Formula

Given a functional

$$F[\rho] = \int f\bigl(\mathbf{r}, \rho(\mathbf{r}), \nabla\rho(\mathbf{r})\bigr)\,d\mathbf{r},$$

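and a function ϕ(r) that vanishes on the boundary of the region of integration, the definition of the functional derivative given above yields

$$\begin{aligned}
\int \frac{\delta F}{\delta\rho(\mathbf{r})}\,\phi(\mathbf{r})\,d\mathbf{r}
&= \left[\frac{d}{d\varepsilon}\int f(\mathbf{r}, \rho+\varepsilon\phi, \nabla\rho+\varepsilon\nabla\phi)\,d\mathbf{r}\right]_{\varepsilon=0} \\
&= \int\left(\frac{\partial f}{\partial\rho}\,\phi + \frac{\partial f}{\partial\nabla\rho}\cdot\nabla\phi\right)d\mathbf{r} \\
&= \int\left[\frac{\partial f}{\partial\rho}\,\phi + \nabla\cdot\left(\frac{\partial f}{\partial\nabla\rho}\,\phi\right) - \left(\nabla\cdot\frac{\partial f}{\partial\nabla\rho}\right)\phi\right]d\mathbf{r} \\
&= \int\left[\frac{\partial f}{\partial\rho}\,\phi - \left(\nabla\cdot\frac{\partial f}{\partial\nabla\rho}\right)\phi\right]d\mathbf{r} \\
&= \int\left(\frac{\partial f}{\partial\rho} - \nabla\cdot\frac{\partial f}{\partial\nabla\rho}\right)\phi(\mathbf{r})\,d\mathbf{r}.
\end{aligned}$$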

The second line is obtained using the total derivative, where ∂f/∂∇ρ is a derivative of a scalar with respect to a vector. The third line is obtained by use of a product rule for divergence. The fourth line is obtained using the divergence theorem and the condition that ϕ = 0 on the boundary of the region of integration. Since ϕ is also an arbitrary function, applying the fundamental lemma of calculus of variations to the last line, the functional derivative is

$$\frac{\delta F}{\delta\rho(\mathbf{r})} = \frac{\partial f}{\partial\rho} - \nabla\cdot\frac{\partial f}{\partial\nabla\rho},$$

where ρ = ρ(r) and f = f(r, ρ, ∇ρ). This formula is for the case of the functional form given by F[ρ] at the beginning of this section. For other functional forms, the definition of the functional derivative can be used as the starting point for its determination. (See the example Coulomb potential energy functional.)
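As a quick worked instance, assume the illustrative integrand f = ½ ∇ρ · ∇ρ (a Dirichlet-type energy, not one of the examples in the text). Applying δF/δρ = ∂f/∂ρ − ∇ · ∂f/∂∇ρ term by term gives

$$\frac{\partial f}{\partial\rho} = 0, \qquad \frac{\partial f}{\partial\nabla\rho} = \nabla\rho, \qquad\text{so}\qquad \frac{\delta F}{\delta\rho(\mathbf{r})} = -\nabla\cdot\nabla\rho = -\nabla^2\rho .$$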

The above equation for the functional derivative can be generalized to the case that includes higher dimensions and higher order derivatives. The functional would be,

$$F[\rho(\mathbf{r})] = \int f\bigl(\mathbf{r}, \rho(\mathbf{r}), \nabla\rho(\mathbf{r}), \nabla^{(2)}\rho(\mathbf{r}), \dots, \nabla^{(N)}\rho(\mathbf{r})\bigr)\,d\mathbf{r},$$

where the vector r ∈ ℝⁿ, and ∇⁽ⁱ⁾ is a tensor whose nⁱ components are partial derivative operators of order i,

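$$\left[\nabla^{(i)}\right]_{\alpha_1\alpha_2\cdots\alpha_i} = \frac{\partial^{\,i}}{\partial r_{\alpha_1}\partial r_{\alpha_2}\cdots\partial r_{\alpha_i}} \qquad\text{where}\quad \alpha_1, \alpha_2, \dots, \alpha_i = 1, 2, \dots, n.$$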

An analogous application of the definition of the functional derivative yields

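$$\begin{aligned}
\frac{\delta F[\rho]}{\delta\rho}
&= \frac{\partial f}{\partial\rho} - \nabla\cdot\frac{\partial f}{\partial(\nabla\rho)} + \nabla^{(2)}\cdot\frac{\partial f}{\partial(\nabla^{(2)}\rho)} + \dots + (-1)^N\,\nabla^{(N)}\cdot\frac{\partial f}{\partial(\nabla^{(N)}\rho)} \\
&= \frac{\partial f}{\partial\rho} + \sum_{i=1}^{N}(-1)^i\,\nabla^{(i)}\cdot\frac{\partial f}{\partial(\nabla^{(i)}\rho)}.
\end{aligned}$$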

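In the last two equations, the nⁱ components of the tensor ∂f/∂(∇⁽ⁱ⁾ρ) are partial derivatives of f with respect to partial derivatives of ρ,

$$\left[\frac{\partial f}{\partial(\nabla^{(i)}\rho)}\right]_{\alpha_1\alpha_2\cdots\alpha_i} = \frac{\partial f}{\partial\rho_{\alpha_1\alpha_2\cdots\alpha_i}} \qquad\text{where}\quad \rho_{\alpha_1\alpha_2\cdots\alpha_i} \equiv \frac{\partial^{\,i}\rho}{\partial r_{\alpha_1}\partial r_{\alpha_2}\cdots\partial r_{\alpha_i}},$$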

and the tensor scalar product is,

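$$\nabla^{(i)}\cdot\frac{\partial f}{\partial(\nabla^{(i)}\rho)} = \sum_{\alpha_1,\alpha_2,\dots,\alpha_i=1}^{n}\frac{\partial^{\,i}}{\partial r_{\alpha_1}\partial r_{\alpha_2}\cdots\partial r_{\alpha_i}}\,\frac{\partial f}{\partial\rho_{\alpha_1\alpha_2\cdots\alpha_i}}.$$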

Thomas–Fermi kinetic energy functional

The Thomas–Fermi model of 1927 used a kinetic energy functional for a noninteracting uniform electron gas in a first attempt at a density-functional theory of electronic structure:

$$T_\mathrm{TF}[\rho] = C_\mathrm{F}\int \rho^{5/3}(\mathbf{r})\,d\mathbf{r}.$$

Since the integrand of T_TF[ρ] does not involve derivatives of ρ(r), the functional derivative of T_TF[ρ] is,

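$$\frac{\delta T_\mathrm{TF}}{\delta\rho(\mathbf{r})} = C_\mathrm{F}\,\frac{\partial\rho^{5/3}(\mathbf{r})}{\partial\rho(\mathbf{r})} = \frac{5}{3}\,C_\mathrm{F}\,\rho^{2/3}(\mathbf{r}).$$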
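A minimal numerical sketch of this result, assuming a one-dimensional grid for simplicity (the physical functional lives in three dimensions), a Gaussian sample density, and the atomic-units value C_F = (3/10)(3π²)^(2/3) for the constant; the density and direction ϕ are illustrative choices.

```python
import math

# Thomas–Fermi constant in atomic units (assumed unit system for this sketch).
C_F = 0.3 * (3.0 * math.pi ** 2) ** (2.0 / 3.0)

n = 500
h = 4.0 / n
x = [-2.0 + (i + 0.5) * h for i in range(n)]   # 1D midpoint grid on [-2, 2]
rho = [math.exp(-xi * xi) for xi in x]         # positive sample density
phi = [math.cos(xi) for xi in x]               # arbitrary direction

def T_TF(r):
    return C_F * sum(v ** (5.0 / 3.0) for v in r) * h

eps = 1e-5
plus = [r + eps * p for r, p in zip(rho, phi)]
minus = [r - eps * p for r, p in zip(rho, phi)]
numeric = (T_TF(plus) - T_TF(minus)) / (2 * eps)

# ∫ (5/3) C_F ρ^(2/3) φ dx, using the functional derivative from the text.
formula = sum((5.0 / 3.0) * C_F * r ** (2.0 / 3.0) * p for r, p in zip(rho, phi)) * h

print(numeric, formula)
```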

Coulomb potential energy functional

For the electron-nucleus potential, Thomas and Fermi employed the Coulomb potential energy functional

$$V[\rho] = \int \frac{\rho(\mathbf{r})}{|\mathbf{r}|}\,d\mathbf{r}.$$

Applying the definition of functional derivative,

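$$\int \frac{\delta V}{\delta\rho(\mathbf{r})}\,\phi(\mathbf{r})\,d\mathbf{r} = \left[\frac{d}{d\varepsilon}\int \frac{\rho(\mathbf{r})+\varepsilon\phi(\mathbf{r})}{|\mathbf{r}|}\,d\mathbf{r}\right]_{\varepsilon=0} = \int \frac{1}{|\mathbf{r}|}\,\phi(\mathbf{r})\,d\mathbf{r}.$$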

So,

$$\frac{\delta V}{\delta\rho(\mathbf{r})} = \frac{1}{|\mathbf{r}|}.$$

For the classical part of the electron-electron interaction, Thomas and Fermi employed the Coulomb potential energy functional

$$J[\rho] = \frac{1}{2}\iint \frac{\rho(\mathbf{r})\,\rho(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,d\mathbf{r}\,d\mathbf{r}'.$$

From the definition of the functional derivative,

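$$\begin{aligned}
\int \frac{\delta J}{\delta\rho(\mathbf{r})}\,\phi(\mathbf{r})\,d\mathbf{r}
&= \left[\frac{d}{d\varepsilon}\,J[\rho+\varepsilon\phi]\right]_{\varepsilon=0} \\
&= \left[\frac{d}{d\varepsilon}\left(\frac{1}{2}\iint \frac{[\rho(\mathbf{r})+\varepsilon\phi(\mathbf{r})]\,[\rho(\mathbf{r}')+\varepsilon\phi(\mathbf{r}')]}{|\mathbf{r}-\mathbf{r}'|}\,d\mathbf{r}\,d\mathbf{r}'\right)\right]_{\varepsilon=0} \\
&= \frac{1}{2}\iint \frac{\rho(\mathbf{r}')\,\phi(\mathbf{r})}{|\mathbf{r}-\mathbf{r}'|}\,d\mathbf{r}\,d\mathbf{r}' + \frac{1}{2}\iint \frac{\rho(\mathbf{r})\,\phi(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,d\mathbf{r}\,d\mathbf{r}'.
\end{aligned}$$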

The first and second terms on the right hand side of the last equation are equal, since r and r′ in the second term can be interchanged without changing the value of the integral. Therefore,

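$$\int \frac{\delta J}{\delta\rho(\mathbf{r})}\,\phi(\mathbf{r})\,d\mathbf{r} = \int\left(\int \frac{\rho(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,d\mathbf{r}'\right)\phi(\mathbf{r})\,d\mathbf{r}$$

and the functional derivative of the electron-electron Coulomb potential energy functional J[ρ] is

$$\frac{\delta J}{\delta\rho(\mathbf{r})} = \int \frac{\rho(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,d\mathbf{r}'.$$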

The second functional derivative is

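$$\frac{\delta^2 J[\rho]}{\delta\rho(\mathbf{r}')\,\delta\rho(\mathbf{r})} = \frac{\partial}{\partial\rho(\mathbf{r}')}\left(\frac{\rho(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\right) = \frac{1}{|\mathbf{r}-\mathbf{r}'|}.$$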
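The first derivative of J[ρ] can be checked on a small grid. This is a sketch under two loudly stated assumptions: the grid is one-dimensional, and the kernel 1/|r − r′| is replaced by a softened stand-in 1/√((x − x′)² + a²) so that the diagonal x = x′ is finite. Neither simplification comes from the text; only the symmetry of the kernel matters for the argument.

```python
import math

n = 60
h = 0.1
x = [(i - n / 2 + 0.5) * h for i in range(n)]       # midpoint grid on [-3, 3]

def kernel(a, b, soft=0.5):
    """Softened stand-in for 1/|r - r'| (assumption: avoids the r = r' singularity)."""
    return 1.0 / math.sqrt((a - b) ** 2 + soft ** 2)

K = [[kernel(a, b) for b in x] for a in x]

def J(rho):
    """Discretized (1/2) ∬ rho(x) K(x, x') rho(x') dx dx'."""
    total = sum(rho[i] * K[i][j] * rho[j] for i in range(n) for j in range(n))
    return 0.5 * total * h * h

rho = [math.exp(-xi * xi) for xi in x]
phi = [math.sin(xi) + 0.5 for xi in x]              # arbitrary direction

# Functional derivative in the form given in the text: δJ/δρ(x) = ∫ K(x, x') ρ(x') dx'.
dJ = [sum(K[i][j] * rho[j] for j in range(n)) * h for i in range(n)]

eps = 1e-3
plus = [r + eps * p for r, p in zip(rho, phi)]
minus = [r - eps * p for r, p in zip(rho, phi)]
numeric = (J(plus) - J(minus)) / (2 * eps)
formula = sum(d * p for d, p in zip(dJ, phi)) * h

print(numeric, formula)
```

Because J is quadratic in ρ, the central difference is exact up to rounding, and the factor ½ disappears precisely because the kernel is symmetric in its two arguments.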

Weizsäcker kinetic energy functional

In 1935 von Weizsäcker proposed adding a gradient correction to the Thomas–Fermi kinetic energy functional to make it better suited to a molecular electron cloud:

$$T_\mathrm{W}[\rho] = \frac{1}{8}\int \frac{\nabla\rho(\mathbf{r})\cdot\nabla\rho(\mathbf{r})}{\rho(\mathbf{r})}\,d\mathbf{r} = \int t_\mathrm{W}\,d\mathbf{r},$$

where

$$t_\mathrm{W} \equiv \frac{1}{8}\,\frac{\nabla\rho\cdot\nabla\rho}{\rho} \qquad\text{and}\qquad \rho = \rho(\mathbf{r}).$$

Using a previously derived formula for the functional derivative,

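$$\begin{aligned}
\frac{\delta T_\mathrm{W}}{\delta\rho(\mathbf{r})}
&= \frac{\partial t_\mathrm{W}}{\partial\rho} - \nabla\cdot\frac{\partial t_\mathrm{W}}{\partial\nabla\rho} \\
&= -\frac{1}{8}\,\frac{\nabla\rho\cdot\nabla\rho}{\rho^2} - \left(\frac{1}{4}\,\frac{\nabla^2\rho}{\rho} - \frac{1}{4}\,\frac{\nabla\rho\cdot\nabla\rho}{\rho^2}\right) \qquad\text{where}\quad \nabla^2 = \nabla\cdot\nabla,
\end{aligned}$$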

and the result is,

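$$\frac{\delta T_\mathrm{W}}{\delta\rho(\mathbf{r})} = \frac{1}{8}\,\frac{\nabla\rho\cdot\nabla\rho}{\rho^2} - \frac{1}{4}\,\frac{\nabla^2\rho}{\rho}.$$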
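A one-dimensional sanity check of this result, as a sketch only: it assumes the sample density ρ(x) = exp(−x²) on [−2, 2], for which the formula above reduces to (1/8)(ρ′/ρ)² − (1/4)ρ″/ρ = (1 − x²)/2, and a direction ϕ that vanishes at both ends so that no boundary term appears. The density, interval and ϕ are illustrative choices.

```python
import math

n = 1000
h = 4.0 / n
x = [-2.0 + i * h for i in range(n + 1)]                    # nodes on [-2, 2]
rho = [math.exp(-xi * xi) for xi in x]
phi = [(4.0 - xi * xi) * math.exp(-xi * xi) for xi in x]    # zero at x = ±2

def T_W(r):
    """(1/8) ∫ (ρ')^2 / ρ dx, evaluated cell by cell at midpoints."""
    total = 0.0
    for i in range(len(r) - 1):
        slope = (r[i + 1] - r[i]) / h          # ρ' at the cell midpoint
        mid = 0.5 * (r[i + 1] + r[i])          # ρ at the cell midpoint
        total += slope * slope / mid
    return total * h / 8.0

eps = 1e-4
plus = [r + eps * p for r, p in zip(rho, phi)]
minus = [r - eps * p for r, p in zip(rho, phi)]
numeric = (T_W(plus) - T_W(minus)) / (2 * eps)

# For ρ = exp(-x^2): (1/8)(ρ'/ρ)^2 - (1/4) ρ''/ρ = (1 - x^2)/2.
formula = sum(0.5 * (1.0 - xi * xi) * p for xi, p in zip(x, phi)) * h

print(numeric, formula)
```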

Entropy

The entropy of a discrete random variable is a functional of the probability mass function.

$$H[p(x)] = -\sum_x p(x)\,\log p(x)$$

Thus,

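$$\begin{aligned}
\sum_x \frac{\delta H}{\delta p(x)}\,\phi(x)
&= \left[\frac{d}{d\varepsilon}\,H[p(x)+\varepsilon\phi(x)]\right]_{\varepsilon=0} \\
&= \left[-\frac{d}{d\varepsilon}\sum_x\,[p(x)+\varepsilon\phi(x)]\,\log[p(x)+\varepsilon\phi(x)]\right]_{\varepsilon=0} \\
&= -\sum_x\,[1+\log p(x)]\,\phi(x).
\end{aligned}$$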

Thus,

$$\frac{\delta H}{\delta p(x)} = -1 - \log p(x).$$
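In the discrete case the functional derivative is just the vector of partial derivatives of H with respect to the individual probabilities, which makes it easy to check. The sketch below uses a hypothetical four-outcome distribution and, like the derivation above, treats the components of p as independent variables (the normalization constraint is not imposed).

```python
import math

def H(p):
    """Entropy of a probability mass function given as a list, natural logarithm."""
    return -sum(pi * math.log(pi) for pi in p)

p = [0.1, 0.2, 0.3, 0.4]   # illustrative distribution
step = 1e-6

def partial(i):
    """Central-difference estimate of ∂H/∂p_i, other components held fixed."""
    up = p[:]
    up[i] += step
    down = p[:]
    down[i] -= step
    return (H(up) - H(down)) / (2 * step)

numeric = [partial(i) for i in range(len(p))]
formula = [-1.0 - math.log(pi) for pi in p]

for a, b in zip(numeric, formula):
    print(a, b)
```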

Exponential

Let

$$F[\varphi(x)] = e^{\int \varphi(x)\,g(x)\,dx}.$$

Using the delta function as a test function,

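$$\begin{aligned}
\frac{\delta F[\varphi(x)]}{\delta\varphi(y)}
&= \lim_{\varepsilon\to 0}\frac{F[\varphi(x)+\varepsilon\delta(x-y)] - F[\varphi(x)]}{\varepsilon} \\
&= \lim_{\varepsilon\to 0}\frac{e^{\int(\varphi(x)+\varepsilon\delta(x-y))\,g(x)\,dx} - e^{\int\varphi(x)\,g(x)\,dx}}{\varepsilon} \\
&= e^{\int\varphi(x)\,g(x)\,dx}\,\lim_{\varepsilon\to 0}\frac{e^{\varepsilon\int\delta(x-y)\,g(x)\,dx} - 1}{\varepsilon} \\
&= e^{\int\varphi(x)\,g(x)\,dx}\,\lim_{\varepsilon\to 0}\frac{e^{\varepsilon g(y)} - 1}{\varepsilon} \\
&= e^{\int\varphi(x)\,g(x)\,dx}\,g(y).
\end{aligned}$$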

Thus,

$$\frac{\delta F[\varphi(x)]}{\delta\varphi(y)} = g(y)\,F[\varphi(x)].$$

This is particularly useful in calculating the correlation functions from the partition function in quantum field theory.
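The delta-function calculation has a direct grid analogue: on a grid with spacing h, a "delta" at node j is a spike of height 1/h. The following sketch assumes illustrative choices of φ and g on [0, 1] (not from the text) and compares the finite-difference quotient with g(y) F[φ].

```python
import math

n = 200
h = 1.0 / n
x = [(i + 0.5) * h for i in range(n)]            # midpoint grid on [0, 1]
varphi = [math.sin(2.0 * xi) for xi in x]        # sample field
g = [1.0 + xi * xi for xi in x]                  # sample weight function

def F(values):
    """Discretized F[φ] = exp(∫ φ(x) g(x) dx)."""
    return math.exp(sum(v * w for v, w in zip(values, g)) * h)

def derivative_at(j, eps=1e-6):
    """Perturb by eps times a grid delta (height 1/h at node j) and difference."""
    plus = varphi[:]
    plus[j] += eps / h
    minus = varphi[:]
    minus[j] -= eps / h
    return (F(plus) - F(minus)) / (2 * eps)

j = 120
numeric = derivative_at(j)
formula = g[j] * F(varphi)                        # g(y) F[φ] with y = x[j]

print(numeric, formula)
```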

Functional derivative of a function

A function can be written in the form of an integral like a functional. For example,

$$\rho(\mathbf{r}) = F[\rho] = \int \rho(\mathbf{r}')\,\delta(\mathbf{r}-\mathbf{r}')\,d\mathbf{r}'.$$

Since the integrand does not depend on derivatives of ρ, the functional derivative of ρ(r) is,

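$$\frac{\delta\rho(\mathbf{r})}{\delta\rho(\mathbf{r}')} \equiv \frac{\delta F}{\delta\rho(\mathbf{r}')} = \frac{\partial}{\partial\rho(\mathbf{r}')}\bigl[\rho(\mathbf{r}')\,\delta(\mathbf{r}-\mathbf{r}')\bigr] = \delta(\mathbf{r}-\mathbf{r}').$$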

Functional derivative of iterated function

The functional derivative of the iterated function f(f(x)) is given by:

$$\frac{\delta f(f(x))}{\delta f(y)} = f'(f(x))\,\delta(x-y) + \delta(f(x)-y)$$

and

$$\frac{\delta f(f(f(x)))}{\delta f(y)} = f'(f(f(x)))\,\bigl(f'(f(x))\,\delta(x-y) + \delta(f(x)-y)\bigr) + \delta(f(f(x))-y)$$

In general:

$$\frac{\delta f^{N}(x)}{\delta f(y)} = f'\bigl(f^{N-1}(x)\bigr)\,\frac{\delta f^{N-1}(x)}{\delta f(y)} + \delta\bigl(f^{N-1}(x)-y\bigr)$$

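Putting in N = 0, for which f⁰(x) = x does not depend on f and the left-hand side therefore vanishes, and solving for the derivative of the inverse function gives:

$$\frac{\delta f^{-1}(x)}{\delta f(y)} = -\frac{\delta\bigl(f^{-1}(x)-y\bigr)}{f'\bigl(f^{-1}(x)\bigr)}$$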
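Integrating the recursion against a test function ϕ(y) removes the delta functions and gives something that can be checked numerically: the first-order change of f^N(x) under f → f + εϕ. The sketch below assumes f = sin and ϕ(t) = exp(−t²) as illustrative choices; the helper names are hypothetical.

```python
import math

def iterate(func, n, x):
    """Apply func to x a total of n times; n = 0 returns x unchanged."""
    for _ in range(n):
        x = func(x)
    return x

f, df = math.sin, math.cos          # sample function and its derivative

def phi(t):
    return math.exp(-t * t)         # test function (direction of the variation)

def variation_formula(n, x):
    """∫ δf^n(x)/δf(y) ϕ(y) dy built from the recursion, starting at δf^0/δf = 0."""
    d = 0.0
    for k in range(1, n + 1):
        inner = iterate(f, k - 1, x)        # f^(k-1)(x)
        d = df(inner) * d + phi(inner)      # delta term integrates to ϕ(f^(k-1)(x))
    return d

def variation_numeric(n, x, eps=1e-6):
    """Central difference of (f + eps*ϕ)^n(x) with respect to eps at eps = 0."""
    plus = lambda t: f(t) + eps * phi(t)
    minus = lambda t: f(t) - eps * phi(t)
    return (iterate(plus, n, x) - iterate(minus, n, x)) / (2 * eps)

for n in (1, 2, 3):
    print(n, variation_formula(n, 0.7), variation_numeric(n, 0.7))
```

For n = 2 the recursion reproduces f′(f(x)) ϕ(x) + ϕ(f(x)), which is the first formula of this section integrated against ϕ.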

Using the delta function as a test function

In physics, it is common to use the Dirac delta function δ(x − y) in place of a generic test function ϕ(x) to obtain the functional derivative at the point y (this is one point of the whole functional derivative, just as a partial derivative is one component of the gradient):

$$\frac{\delta F[\rho(x)]}{\delta\rho(y)} = \lim_{\varepsilon\to 0}\frac{F[\rho(x)+\varepsilon\delta(x-y)] - F[\rho(x)]}{\varepsilon}.$$

This works in cases when F[ρ(x) + εf(x)] can formally be expanded as a series (or at least up to first order) in ε. The formula is, however, not mathematically rigorous, since F[ρ(x) + εδ(x − y)] is usually not even defined.

The definition given in a previous section is based on a relationship that holds for all test functions ϕ, so one might think that it should hold also when ϕ is chosen to be a specific function such as the delta function. However, the latter is not a valid test function (it is not even a proper function).

In the definition, the functional derivative describes how the functional F[φ(x)] changes as a result of a small change in the entire function φ(x). The particular form of the change in φ(x) is not specified, but it should stretch over the whole interval on which x is defined. Employing the particular form of the perturbation given by the delta function means that φ(x) is varied only at the point y. Except for this point, there is no variation in φ(x).
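One way to see what the delta-function prescription means in practice is to replace the delta by a narrow but smooth bump, which is a legitimate test function, and let its width shrink. The sketch below assumes the illustrative functional F[ρ] = ∫ ρ² dx on [0, 1] with ρ(x) = sin(πx), for which δF/δρ(y) = 2ρ(y), and uses normalized Gaussians of decreasing width as stand-ins for δ(x − y); all of these choices are assumptions made for the demonstration.

```python
import math

n = 2000
h = 1.0 / n
x = [(i + 0.5) * h for i in range(n)]            # midpoint grid on [0, 1]
rho = [math.sin(math.pi * xi) for xi in x]

def F(r):
    return sum(v * v for v in r) * h             # F[ρ] = ∫ ρ^2 dx

def gaussian(xi, y, sigma):
    """Normalized Gaussian centered at y, a smooth approximation to δ(x - y)."""
    return math.exp(-0.5 * ((xi - y) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def derivative_at(y, sigma, eps=1e-3):
    bump = [gaussian(xi, y, sigma) for xi in x]
    plus = [r + eps * b for r, b in zip(rho, bump)]
    minus = [r - eps * b for r, b in zip(rho, bump)]
    return (F(plus) - F(minus)) / (2 * eps)

y = 0.3
estimates = {sigma: derivative_at(y, sigma) for sigma in (0.1, 0.03, 0.01)}
exact = 2.0 * math.sin(math.pi * y)              # δF/δρ(y) = 2ρ(y)

for sigma, value in estimates.items():
    print(sigma, value, abs(value - exact))
```

The error shrinks with the width of the bump, which is the sense in which the delta-function formula picks out the value of the functional derivative at the single point y.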

References

Functional derivative Wikipedia

(Text) CC BY-SA
