In mathematics, specifically differential calculus, the inverse function theorem gives sufficient conditions for a function to be invertible in a neighborhood of a point in its domain. The theorem also gives a formula for the derivative of the inverse function. In multivariable calculus, this theorem can be generalized to any continuously differentiable, vector-valued function whose Jacobian determinant is nonzero at a point in its domain. In this case, the theorem gives a formula for the Jacobian matrix of the inverse. There are also versions of the inverse function theorem for complex holomorphic functions, for differentiable maps between manifolds, for differentiable functions between Banach spaces, and so forth.
For functions of a single variable, the theorem states that if $f$ is a continuously differentiable function with nonzero derivative at the point $a$, then $f$ is invertible in a neighborhood of $a$, the inverse is continuously differentiable, and
$$\left(f^{-1}\right)'(f(a)) = \frac{1}{f'(a)},$$
where the left side denotes the derivative of the inverse function evaluated at $f(a)$.
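The single-variable formula can be checked numerically. The following is a minimal sketch, assuming the illustrative function $f(x) = x^3 + x$ (not taken from the text above), which is strictly increasing so its inverse can be computed by bisection. All function names are hypothetical.

```python
def f(x):
    # Illustrative choice: strictly increasing, so globally invertible.
    return x**3 + x

def f_prime(x):
    return 3 * x**2 + 1

def f_inverse(y, lo=-10.0, hi=10.0, tol=1e-13):
    # Bisection on [lo, hi]; valid because f is increasing and
    # the target y is assumed to lie between f(lo) and f(hi).
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def inverse_derivative_numeric(a, h=1e-5):
    # Central difference of the inverse function at the point f(a).
    b = f(a)
    return (f_inverse(b + h) - f_inverse(b - h)) / (2 * h)

a = 1.0
numeric = inverse_derivative_numeric(a)
predicted = 1.0 / f_prime(a)  # the theorem's formula 1 / f'(a)
print(numeric, predicted)
```

The two printed values should agree to several decimal places, since the finite-difference slope of the inverse at $f(a)$ approximates $1/f'(a)$.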
For functions of more than one variable, the theorem states that if the total derivative of a continuously differentiable function $F$ defined from an open set of $\mathbb{R}^n$ into $\mathbb{R}^n$ is invertible at a point $p$ (i.e., the Jacobian determinant of $F$ at $p$ is nonzero), then $F$ is an invertible function near $p$. That is, an inverse function to $F$ exists in some neighborhood of $F(p)$. Moreover, the inverse function $F^{-1}$ is also continuously differentiable. In the infinite-dimensional case it is required that the Fréchet derivative have a bounded inverse at $p$. Finally, the theorem says that
$$J_{F^{-1}}(F(p)) = [J_F(p)]^{-1},$$
where $[\cdot]^{-1}$ denotes matrix inverse and $J_F(p)$ is the Jacobian matrix of the function $F$ at the point $p$. This formula can also be derived from the chain rule. The chain rule states that for functions $G$ and $H$ which have total derivatives at $H(p)$ and $p$ respectively,
$$J_{G \circ H}(p) = J_G(H(p)) \cdot J_H(p).$$
Letting $G$ be $F^{-1}$ and $H$ be $F$, $G \circ H$ is the identity function, whose Jacobian matrix is also the identity. In this special case, the formula above can be solved for $J_{F^{-1}}(F(p))$. Note that the chain rule assumes the existence of the total derivatives of both functions, whereas the inverse function theorem proves that $F^{-1}$ has a total derivative at $F(p)$.
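Written out, the chain-rule argument is a short computation. The sketch below assumes, as the chain rule requires, that $F^{-1}$ is already known to be differentiable at $F(p)$; $I$ denotes the $n \times n$ identity matrix.

```latex
\begin{align*}
  F^{-1} \circ F &= \mathrm{id}
    && \text{(on a neighborhood of } p\text{)} \\
  J_{F^{-1} \circ F}(p) &= I
    && \text{(Jacobian of the identity map)} \\
  J_{F^{-1}}(F(p)) \cdot J_F(p) &= I
    && \text{(chain rule with } G = F^{-1},\ H = F\text{)} \\
  J_{F^{-1}}(F(p)) &= [J_F(p)]^{-1}
    && \text{(multiply on the right by } [J_F(p)]^{-1}\text{)}
\end{align*}
```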
The existence of an inverse function to $F$ is equivalent to saying that the system of $n$ equations $y_i = F_i(x_1, \dots, x_n)$ can be solved for $x_1, \dots, x_n$ in terms of $y_1, \dots, y_n$ if we restrict $x$ and $y$ to small enough neighborhoods of $p$ and $F(p)$, respectively.
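To illustrate solving such a system locally, here is a minimal sketch using Newton's method in two variables. The map $F(x_1, x_2) = (x_1 + x_2^2,\; x_1^2 + x_2)$, the base point $p = (1, 0)$, and the target $y$ are all hypothetical choices made for this example. The theorem itself only asserts that a solution exists near $p$; Newton's method is one common way to compute it.

```python
def F(x1, x2):
    # Hypothetical example map; its Jacobian determinant is 1 - 4*x1*x2.
    return (x1 + x2**2, x1**2 + x2)

def jacobian(x1, x2):
    return ((1.0, 2.0 * x2), (2.0 * x1, 1.0))

def solve_near(p, y, steps=20):
    """Newton iteration for F(x) = y, started at the base point p."""
    x1, x2 = p
    for _ in range(steps):
        f1, f2 = F(x1, x2)
        r1, r2 = y[0] - f1, y[1] - f2          # current residual y - F(x)
        (a, b), (c, d) = jacobian(x1, x2)
        det = a * d - b * c                    # nonzero near p by assumption
        # Solve J * delta = r using the explicit 2x2 inverse.
        d1 = (d * r1 - b * r2) / det
        d2 = (-c * r1 + a * r2) / det
        x1, x2 = x1 + d1, x2 + d2
    return (x1, x2)

p = (1.0, 0.0)            # det J_F(p) = 1, so the theorem applies at p
y = (1.05, 0.95)          # a target close to F(p) = (1, 1)
solution = solve_near(p, y)
fx = F(*solution)
residual = max(abs(fx[0] - y[0]), abs(fx[1] - y[1]))
print(solution, residual)
```

The iteration is only expected to converge when $y$ is close enough to $F(p)$, which mirrors the restriction to small neighborhoods in the statement above.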
Consider the vector-valued function $F$ from $\mathbb{R}^2$ to $\mathbb{R}^2$ defined by
$$F(x, y) = \begin{bmatrix} e^x \cos y \\ e^x \sin y \end{bmatrix}.$$
Then the Jacobian matrix is
$$J_F(x, y) = \begin{bmatrix} e^x \cos y & -e^x \sin y \\ e^x \sin y & e^x \cos y \end{bmatrix}$$
and the determinant is
$$\det J_F(x, y) = e^{2x} \cos^2 y + e^{2x} \sin^2 y = e^{2x}.$$
The determinant $e^{2x}$ is nonzero everywhere. By the theorem, for every point $p$ in $\mathbb{R}^2$, there exists a neighborhood of $p$ on which $F$ is invertible. Note that this is different from saying that $F$ is invertible over its entire image. In this example, $F$ is not globally invertible because it is not injective, since $F(x, y) = F(x, y + 2\pi)$.
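The example can be verified numerically. The sketch below assumes one particular local inverse, $(u, v) \mapsto (\tfrac{1}{2}\ln(u^2 + v^2),\ \operatorname{atan2}(v, u))$, which inverts $F$ on the strip $-\pi < y < \pi$. That branch and the helper names are choices made for illustration.

```python
import math

def F(x, y):
    # The map from the example: F(x, y) = (e^x cos y, e^x sin y).
    return (math.exp(x) * math.cos(y), math.exp(x) * math.sin(y))

def jacobian_F(x, y):
    ex = math.exp(x)
    return ((ex * math.cos(y), -ex * math.sin(y)),
            (ex * math.sin(y), ex * math.cos(y)))

def local_inverse(u, v):
    # One branch of the inverse; returns y in (-pi, pi].
    return (0.5 * math.log(u * u + v * v), math.atan2(v, u))

def jacobian_local_inverse(u, v):
    # Partial derivatives of local_inverse, computed by hand.
    r2 = u * u + v * v
    return ((u / r2, v / r2), (-v / r2, u / r2))

def matmul2(A, B):
    # Product of two 2x2 matrices stored as nested tuples.
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

p = (0.3, 0.7)
q = F(*p)
recovered = local_inverse(*q)                       # should be close to p
product = matmul2(jacobian_local_inverse(*q), jacobian_F(*p))
shifted = F(p[0], p[1] + 2.0 * math.pi)             # same value as F(p)
print(recovered, product, shifted, q)
```

Here `product` should be close to the identity matrix, as the Jacobian formula predicts, while `shifted` agreeing with `q` shows that $F$ is not injective on the whole plane.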
The inverse function theorem can be generalized to differentiable maps between differentiable manifolds. In this context the theorem states that for a differentiable map $F : M \to N$, if the differential of $F$,
$$dF_p : T_pM \to T_{F(p)}N,$$
is a linear isomorphism at a point $p$ in $M$, then there exists an open neighborhood $U$ of $p$ such that
$$F|_U : U \to F(U)$$
is a diffeomorphism. Note that this implies that $M$ and $N$ must have the same dimension at $p$. If the derivative of $F$ is an isomorphism at all points $p$ in $M$, then the map $F$ is a local diffeomorphism.
The inverse function theorem can also be generalized to differentiable maps between Banach spaces. Let $X$ and $Y$ be Banach spaces and $U$ an open neighborhood of the origin in $X$. Let $F : U \to Y$ be continuously differentiable and assume that the derivative $dF_0 : X \to Y$ of $F$ at $0$ is a bounded linear isomorphism of $X$ onto $Y$. Then there exists an open neighborhood $V$ of $F(0)$ in $Y$ and a continuously differentiable map $G : V \to X$ such that $F(G(y)) = y$ for all $y$ in $V$. Moreover, $G(y)$ is the only sufficiently small solution $x$ of the equation $F(x) = y$.
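One way to make the map $G$ concrete is the fixed-point iteration $x \mapsto x + (dF_0)^{-1}(y - F(x))$, which is a contraction near the origin when $y$ is close to $F(0)$. The sketch below uses $\mathbb{R}^2$ as a finite-dimensional stand-in for a Banach space. The map $F(x_1, x_2) = (2x_1 + x_2^2,\ x_2 + x_1 x_2)$ and the target $y$ are hypothetical, and this iteration is offered as an illustration, not as the construction used in the statement above.

```python
def F(x1, x2):
    # Hypothetical map with F(0) = 0 and dF_0 = diag(2, 1).
    return (2.0 * x1 + x2**2, x2 + x1 * x2)

def apply_dF0_inverse(r1, r2):
    # Inverse of the derivative at the origin, diag(2, 1)^(-1) = diag(1/2, 1).
    return (0.5 * r1, r2)

def G(y, iterations=50):
    """Approximate the small solution x of F(x) = y by fixed-point iteration."""
    x1, x2 = 0.0, 0.0                      # start at the origin
    for _ in range(iterations):
        f1, f2 = F(x1, x2)
        d1, d2 = apply_dF0_inverse(y[0] - f1, y[1] - f2)
        x1, x2 = x1 + d1, x2 + d2          # x <- x + dF_0^{-1}(y - F(x))
    return (x1, x2)

y = (0.1, 0.05)                            # a target close to F(0) = (0, 0)
x = G(y)
fx = F(*x)
residual = max(abs(fx[0] - y[0]), abs(fx[1] - y[1]))
print(x, residual)
```

Unlike Newton's method, this scheme never recomputes the derivative away from the origin, so it needs only the single bounded inverse $(dF_0)^{-1}$ that the hypothesis provides.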
These two directions of generalization can be combined in the inverse function theorem for Banach manifolds.
The inverse function theorem (and the implicit function theorem) can be seen as a special case of the constant rank theorem, which states that a smooth map with constant rank near a point can be put in a particular normal form near that point. Specifically, if $F : M \to N$ has constant rank near a point $p \in M$, then there are open neighborhoods $U$ of $p$ and $V$ of $F(p)$ and there are diffeomorphisms $u : T_pM \to U$ and $v : T_{F(p)}N \to V$ such that $F(U) \subseteq V$ and such that the derivative $dF_p : T_pM \to T_{F(p)}N$ is equal to $v^{-1} \circ F \circ u$. That is, $F$ "looks like" its derivative near $p$. Semicontinuity of the rank function implies that the set of points near which the derivative has constant rank is an open dense subset of the domain of the map. So the constant rank theorem applies "generically" across the domain.
When the derivative of $F$ is injective (resp. surjective) at a point $p$, it is also injective (resp. surjective) in a neighborhood of $p$, and hence the rank of $F$ is constant on that neighborhood, so the constant rank theorem applies.
If the Jacobian (in this context the matrix formed by the complex derivatives) of a holomorphic function $F$, defined from an open set $U$ of $\mathbb{C}^n$ into $\mathbb{C}^n$, is invertible at a point $p$, then $F$ is an invertible function near $p$. This follows immediately from the theorem above. One can also show that this inverse is again a holomorphic function.
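A one-dimensional case ($n = 1$) can be checked numerically. The sketch below assumes the illustrative function $f(z) = z^2$ near $p = 1 + i$, where $f'(p) = 2p \neq 0$, and uses the principal square root as the local inverse. These choices are for illustration only.

```python
import cmath

def f(z):
    return z * z

def f_prime(z):
    return 2 * z

def local_inverse(w):
    # Principal branch of the square root; it inverts f near p = 1 + 1j.
    return cmath.sqrt(w)

p = 1 + 1j
w = f(p)                                   # equals 2j
h = 1e-6

# Difference quotients of the inverse along the real and imaginary directions.
d_real = (local_inverse(w + h) - local_inverse(w - h)) / (2 * h)
d_imag = (local_inverse(w + 1j * h) - local_inverse(w - 1j * h)) / (2j * h)

predicted = 1 / f_prime(p)                 # 1 / f'(p) from the theorem
print(d_real, d_imag, predicted)
```

That both directional quotients agree with each other is what holomorphy of the inverse requires, and that they agree with $1/f'(p)$ matches the derivative formula.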