Smooth maximum

In mathematics, a smooth maximum of an indexed family $x_1, \ldots, x_n$ of numbers is a differentiable approximation to the maximum function

$$\{x_1, \ldots, x_n\} \mapsto \max\{x_1, \ldots, x_n\},$$

and the concept of smooth minimum is similarly defined.

For large positive values of the parameter $\alpha$, the following formulation is a smooth, differentiable approximation of the maximum function. For negative values of $\alpha$ that are large in absolute value, it approximates the minimum.

$$S_\alpha\bigl(\{x_i\}_{i=1}^{n}\bigr) = \frac{\sum_{i=1}^{n} x_i e^{\alpha x_i}}{\sum_{i=1}^{n} e^{\alpha x_i}}$$

$S_\alpha$ has the following properties:

$S_\alpha \to \max$ as $\alpha \to \infty$
$S_0$ is the average of its inputs
$S_\alpha \to \min$ as $\alpha \to -\infty$
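
These three properties can be checked numerically. The following is a minimal sketch in plain Python, not a reference implementation; the function name `smooth_max` and the sample inputs are chosen here for illustration. Subtracting the largest exponent before calling `exp` is an added safeguard against overflow and does not change the value, since the common factor cancels between numerator and denominator.

```python
import math

def smooth_max(xs, alpha):
    """Weighted average of xs with weights exp(alpha * x_i)."""
    # Shift all exponents by the largest one so exp() cannot overflow;
    # the shift cancels in the ratio below.
    shift = max(alpha * x for x in xs)
    weights = [math.exp(alpha * x - shift) for x in xs]
    return sum(w * x for w, x in zip(weights, xs)) / sum(weights)

xs = [1.0, 2.0, 3.0]
print(smooth_max(xs, 0.0))    # the average of xs
print(smooth_max(xs, 50.0))   # close to max(xs)
print(smooth_max(xs, -50.0))  # close to min(xs)
```

For moderate values of alpha the result lies strictly between the minimum and the maximum, which is what makes the function a smooth interpolation between the two.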

The gradient of $S_\alpha$ is closely related to softmax and is given by

$$\nabla_{x_i} S_\alpha\bigl(\{x_i\}_{i=1}^{n}\bigr) = \frac{e^{\alpha x_i}}{\sum_{j=1}^{n} e^{\alpha x_j}} \Bigl[1 + \alpha\bigl(x_i - S_\alpha\bigl(\{x_i\}_{i=1}^{n}\bigr)\bigr)\Bigr].$$
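
The gradient formula follows from the quotient rule. As a sketch, write $N = \sum_j x_j e^{\alpha x_j}$ and $Z = \sum_j e^{\alpha x_j}$, so that $S_\alpha = N / Z$; the names $N$ and $Z$ are shorthand introduced here and do not appear in the source.

```latex
\begin{aligned}
\frac{\partial N}{\partial x_i} &= e^{\alpha x_i}\,(1 + \alpha x_i), \qquad
\frac{\partial Z}{\partial x_i} = \alpha\, e^{\alpha x_i}, \\
\frac{\partial S_\alpha}{\partial x_i}
  &= \frac{1}{Z}\,\frac{\partial N}{\partial x_i} - \frac{N}{Z^2}\,\frac{\partial Z}{\partial x_i}
   = \frac{e^{\alpha x_i}}{Z}\,\bigl[1 + \alpha x_i - \alpha S_\alpha\bigr]
   = \frac{e^{\alpha x_i}}{Z}\,\bigl[1 + \alpha\,(x_i - S_\alpha)\bigr].
\end{aligned}
```

The prefactor $e^{\alpha x_i} / Z$ is the $i$-th component of the softmax of the scaled inputs $\alpha x_1, \ldots, \alpha x_n$, which is the relation to softmax mentioned above.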

Because the gradient is available in closed form, the smooth maximum is useful for optimization techniques that use gradient descent.
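
As an illustration, the closed-form gradient can be compared against a finite-difference estimate. This is a hedged sketch in plain Python: the helper names `smooth_max`, `smooth_max_grad`, and `numeric_grad`, the step size `h`, and the test point are all assumptions made for this example.

```python
import math

def smooth_max(xs, alpha):
    """Weighted average of xs with weights exp(alpha * x_i)."""
    shift = max(alpha * x for x in xs)  # guards exp() against overflow
    weights = [math.exp(alpha * x - shift) for x in xs]
    return sum(w * x for w, x in zip(weights, xs)) / sum(weights)

def smooth_max_grad(xs, alpha):
    """Closed form: softmax weight times [1 + alpha * (x_i - S_alpha)]."""
    shift = max(alpha * x for x in xs)
    weights = [math.exp(alpha * x - shift) for x in xs]
    total = sum(weights)
    s = sum(w * x for w, x in zip(weights, xs)) / total
    return [(w / total) * (1.0 + alpha * (x - s)) for w, x in zip(weights, xs)]

def numeric_grad(xs, alpha, h=1e-6):
    """Central finite differences, used only as a sanity check."""
    grad = []
    for i in range(len(xs)):
        up = xs[:i] + [xs[i] + h] + xs[i + 1:]
        down = xs[:i] + [xs[i] - h] + xs[i + 1:]
        grad.append((smooth_max(up, alpha) - smooth_max(down, alpha)) / (2 * h))
    return grad

xs = [0.5, -1.0, 2.0]
analytic = smooth_max_grad(xs, 1.5)
numeric = numeric_grad(xs, 1.5)
# Largest disagreement between the two estimates; expected to be very small.
print(max(abs(a - n) for a, n in zip(analytic, numeric)))
```

The components of the gradient sum to 1 for every alpha, since the softmax weights sum to 1 and their weighted average of $x_i - S_\alpha$ is zero.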

Another formulation is:

$$g(x_1, \ldots, x_n) = \log\bigl(\exp(x_1) + \cdots + \exp(x_n) - (n - 1)\bigr)$$

The $(n - 1)$ term corrects for the fact that $\exp(0) = 1$ by canceling out all but one zero exponential: if every argument except one, say $x_k$, is zero, then $g = \log(\exp(x_k)) = x_k$.
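
The effect of the correction term can be seen in a short example. This is a minimal sketch in plain Python; the function name `shifted_logsumexp` is hypothetical, and the sketch assumes nonnegative inputs so that the argument of the logarithm is at least 1 (for sufficiently negative inputs the argument can become nonpositive and `math.log` raises `ValueError`).

```python
import math

def shifted_logsumexp(xs):
    """log(exp(x_1) + ... + exp(x_n) - (n - 1)), assuming nonnegative xs."""
    return math.log(sum(math.exp(x) for x in xs) - (len(xs) - 1))

# All inputs but one are zero: the result is the nonzero input,
# up to floating-point rounding.
print(shifted_logsumexp([0.0, 0.0, 2.5, 0.0]))

# All inputs are zero: the argument of the log is n - (n - 1) = 1.
print(shifted_logsumexp([0.0, 0.0, 0.0]))

# General case: slightly above the true maximum of 9.0.
print(shifted_logsumexp([1.0, 4.0, 9.0]))
```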
