In mathematics, a smooth maximum of an indexed family $x_1, \ldots, x_n$ of numbers is a differentiable approximation to the maximum function $\max(x_1, \ldots, x_n)$. The concept of smooth minimum is defined similarly.
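For large positive values of the parameter $\alpha$, the following formulation is a smooth, differentiable approximation of the maximum function. For negative values of the parameter that are large in absolute value, it approximates the minimum.

$$S_\alpha(x_1, \ldots, x_n) = \frac{\sum_{i=1}^n x_i e^{\alpha x_i}}{\sum_{i=1}^n e^{\alpha x_i}}$$

$S_\alpha$ has the following properties: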
- $S_\alpha \to \max$ as $\alpha \to \infty$
- $S_0$ is the arithmetic mean of its inputs
- $S_\alpha \to \min$ as $\alpha \to -\infty$
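The three limiting behaviours can be checked numerically. The following is a minimal sketch, assuming $S_\alpha$ is the exponentially weighted average $\sum_i x_i e^{\alpha x_i} / \sum_i e^{\alpha x_i}$. The function name `smooth_max` and the sample values are illustrative choices, not part of any library.

```python
import math

def smooth_max(xs, alpha):
    """Exponentially weighted average of xs (assumed form of S_alpha)."""
    # Subtract the largest exponent so every exp() argument is <= 0.
    # The common factor cancels between numerator and denominator,
    # so the result is unchanged but overflow is avoided.
    shift = max(alpha * x for x in xs)
    weights = [math.exp(alpha * x - shift) for x in xs]
    total = sum(weights)
    return sum(x * w for x, w in zip(xs, weights)) / total

xs = [1.0, 2.0, 4.0]
print(smooth_max(xs, 50.0))   # close to max(xs)
print(smooth_max(xs, 0.0))    # arithmetic mean of xs
print(smooth_max(xs, -50.0))  # close to min(xs)
```

At $\alpha = 0$ all weights are equal, which is why the value reduces to the arithmetic mean.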
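The gradient of $S_\alpha$ is closely related to softmax and is given by

$$\nabla_{x_i} S_\alpha(x_1, \ldots, x_n) = \frac{e^{\alpha x_i}}{\sum_{j=1}^n e^{\alpha x_j}} \left[ 1 + \alpha \left( x_i - S_\alpha(x_1, \ldots, x_n) \right) \right].$$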
This makes the softmax function useful for optimization techniques that use gradient descent.
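As a rough illustration of the link to softmax, the sketch below assumes $S_\alpha = \sum_i x_i e^{\alpha x_i} / \sum_i e^{\alpha x_i}$, whose partial derivative with respect to $x_i$ is the softmax weight of $x_i$ multiplied by $1 + \alpha (x_i - S_\alpha)$. It compares that closed form with a central finite difference. All function names are hypothetical.

```python
import math

def softmax_weights(xs, alpha):
    """Softmax of alpha * xs, computed with a shift for numerical stability."""
    shift = max(alpha * x for x in xs)
    exps = [math.exp(alpha * x - shift) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def smooth_max(xs, alpha):
    """Softmax-weighted average of xs (assumed form of S_alpha)."""
    return sum(x * p for x, p in zip(xs, softmax_weights(xs, alpha)))

def smooth_max_grad(xs, alpha):
    """Closed-form gradient: softmax weight times (1 + alpha * (x_i - S_alpha))."""
    s = smooth_max(xs, alpha)
    return [p * (1.0 + alpha * (x - s))
            for x, p in zip(xs, softmax_weights(xs, alpha))]

def numeric_grad(xs, alpha, h=1e-6):
    """Central finite-difference approximation, used only as a cross-check."""
    grads = []
    for i in range(len(xs)):
        up, down = list(xs), list(xs)
        up[i] += h
        down[i] -= h
        grads.append((smooth_max(up, alpha) - smooth_max(down, alpha)) / (2 * h))
    return grads

xs, alpha = [0.5, 1.0, 2.0], 1.5
print(smooth_max_grad(xs, alpha))
print(numeric_grad(xs, alpha))
```

Under this assumed form the gradient components sum to 1, because the softmax weights sum to 1 and their weighted average of $x_i - S_\alpha$ is zero.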
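Another formulation is LogSumExp:

$$\mathrm{LSE}_\alpha(x_1, \ldots, x_n) = \frac{1}{\alpha} \log \sum_{i=1}^n \exp(\alpha x_i)$$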
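As a sketch, assuming the alternative meant here is the scaled LogSumExp $\frac{1}{\alpha} \log \sum_i e^{\alpha x_i}$, it can be evaluated stably by factoring out the largest exponent. The name `logsumexp_max` is a hypothetical helper, not a library function.

```python
import math

def logsumexp_max(xs, alpha):
    """Scaled LogSumExp: (1 / alpha) * log(sum(exp(alpha * x))), alpha != 0."""
    if alpha == 0:
        raise ValueError("alpha must be nonzero")
    # Factor out the largest exponent so exp() never overflows:
    # log(sum(exp(a_i))) = m + log(sum(exp(a_i - m))) with m = max(a_i).
    m = max(alpha * x for x in xs)
    return (m + math.log(sum(math.exp(alpha * x - m) for x in xs))) / alpha

xs = [1.0, 2.0, 4.0]
print(logsumexp_max(xs, 10.0))   # slightly above max(xs)
print(logsumexp_max(xs, -10.0))  # slightly below min(xs)
```

For $\alpha > 0$ this value lies between $\max(x_i)$ and $\max(x_i) + \log(n)/\alpha$, so the approximation tightens as $\alpha$ grows.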
The