In mathematical optimization, Zermelo's navigation problem, proposed in 1931 by Ernst Zermelo, is a classic optimal control problem that deals with a boat navigating on a body of water, starting from a point $A$ and heading to a destination point $B$. The boat is capable of a certain maximum speed, and the goal is to derive the best possible control to reach $B$ in the least possible time.
Without considering external forces such as current and wind, the optimal control is for the boat to always head towards $B$. Its path is then a line segment from $A$ to $B$, which is trivially optimal. When current and wind are taken into account and the combined force applied to the boat is non-zero, the control for no current and wind does not yield the optimal path.
In his 1931 article, Ernst Zermelo formulates the following problem:
In an unbounded plane where the wind distribution is given by a vector field as a function of position and time, a ship moves with constant velocity relative to the surrounding air mass. How must the ship be steered in order to come from a starting point to a given goal in the shortest time?
This is an extension of the classical optimisation problem for geodesics, minimising the length of a curve $I[c] = \int_a^b \sqrt{1 + y'^2}\,dx$ connecting points $A$ and $B$, with the added complexity of considering some wind velocity. Although it is usually impossible to find an exact solution, the general case was solved by Zermelo himself in the form of a partial differential equation, known as Zermelo's equation, which can be solved numerically.
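The case of constant wind is easy to solve exactly. Let $\mathbf{d} = \overrightarrow{AB}$, and suppose that to minimise the travel time the ship travels at a constant maximum speed $V$, that is, with a constant velocity $\mathbf{v}$ relative to the air, $|\mathbf{v}| = V$, in a constant wind $\mathbf{w}$. Thus the position of the ship at time $t$, measured from $A$, is $\mathbf{x} = t(\mathbf{v} + \mathbf{w})$. Let $T$ be the time of arrival at $B$, so that $\mathbf{d} = T(\mathbf{v} + \mathbf{w})$. Taking the dot product of this with $\mathbf{w}$ and $\mathbf{d}$ respectively results in $\mathbf{d} \cdot \mathbf{w} = T(\mathbf{v} \cdot \mathbf{w} + \mathbf{w}^2)$ and $\mathbf{d}^2 = T^2(\mathbf{v}^2 + 2\mathbf{v} \cdot \mathbf{w} + \mathbf{w}^2)$. Eliminating $\mathbf{v} \cdot \mathbf{w}$ and writing this system as a quadratic in $T$ results in $(\mathbf{v}^2 - \mathbf{w}^2)T^2 + 2(\mathbf{d} \cdot \mathbf{w})T - \mathbf{d}^2 = 0$. Upon solving this, taking the positive square-root since $T$ is positive, we obtain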
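$$T[\mathbf{d}] = \frac{-2(\mathbf{d} \cdot \mathbf{w}) \pm \sqrt{4(\mathbf{d} \cdot \mathbf{w})^2 + 4\mathbf{d}^2(\mathbf{v}^2 - \mathbf{w}^2)}}{2(\mathbf{v}^2 - \mathbf{w}^2)} = \sqrt{\frac{\mathbf{d}^2}{\mathbf{v}^2 - \mathbf{w}^2} + \frac{(\mathbf{d} \cdot \mathbf{w})^2}{(\mathbf{v}^2 - \mathbf{w}^2)^2}} - \frac{\mathbf{d} \cdot \mathbf{w}}{\mathbf{v}^2 - \mathbf{w}^2}$$
Claim: This defines a metric on $\mathbb{R}^2$, provided $|\mathbf{v}| > |\mathbf{w}|$.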
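By our assumption, clearly $T[\mathbf{d}] \geq 0$ with equality if and only if $\mathbf{d} = 0$. Symmetry requires care: if $\tilde{\mathbf{d}} = \overrightarrow{BA} = -\mathbf{d}$, the last term of $T$ changes sign, so $T[\tilde{\mathbf{d}}] = T[\mathbf{d}] + 2(\mathbf{d} \cdot \mathbf{w})/(\mathbf{v}^2 - \mathbf{w}^2)$, and $T[\mathbf{d}] = T[\tilde{\mathbf{d}}]$ holds only when $\mathbf{d} \cdot \mathbf{w} = 0$. Strictly speaking, $T$ is therefore an asymmetric metric (a quasi-metric) whenever $\mathbf{w} \neq 0$. It remains to show $T$ satisfies a triangle inequality $T[\mathbf{d}_1 + \mathbf{d}_2] \leq T[\mathbf{d}_1] + T[\mathbf{d}_2]$.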
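Indeed, letting $c^2 := \mathbf{v}^2 - \mathbf{w}^2$, we note that this is true if and only if
$$\sqrt{\frac{(\mathbf{d}_1 + \mathbf{d}_2)^2}{c^2} + \frac{((\mathbf{d}_1 + \mathbf{d}_2) \cdot \mathbf{w})^2}{c^4}} - \frac{(\mathbf{d}_1 + \mathbf{d}_2) \cdot \mathbf{w}}{c^2} \leq \sqrt{\frac{\mathbf{d}_1^2}{c^2} + \frac{(\mathbf{d}_1 \cdot \mathbf{w})^2}{c^4}} - \frac{\mathbf{d}_1 \cdot \mathbf{w}}{c^2} + \sqrt{\frac{\mathbf{d}_2^2}{c^2} + \frac{(\mathbf{d}_2 \cdot \mathbf{w})^2}{c^4}} - \frac{\mathbf{d}_2 \cdot \mathbf{w}}{c^2}$$
if and only if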
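$$\frac{\mathbf{d}_1 \cdot \mathbf{d}_2}{c^2} + \frac{(\mathbf{d}_1 \cdot \mathbf{w})(\mathbf{d}_2 \cdot \mathbf{w})}{c^4} \leq \left[\frac{\mathbf{d}_1^2}{c^2} + \frac{(\mathbf{d}_1 \cdot \mathbf{w})^2}{c^4}\right]^{1/2} \left[\frac{\mathbf{d}_2^2}{c^2} + \frac{(\mathbf{d}_2 \cdot \mathbf{w})^2}{c^4}\right]^{1/2},$$
which is true if and only if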
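$$\frac{(\mathbf{d}_1 \cdot \mathbf{d}_2)^2}{c^4} + \frac{2(\mathbf{d}_1 \cdot \mathbf{d}_2)(\mathbf{d}_1 \cdot \mathbf{w})(\mathbf{d}_2 \cdot \mathbf{w})}{c^6} \leq \frac{\mathbf{d}_1^2 \cdot \mathbf{d}_2^2}{c^4} + \frac{\mathbf{d}_1^2(\mathbf{d}_2 \cdot \mathbf{w})^2 + \mathbf{d}_2^2(\mathbf{d}_1 \cdot \mathbf{w})^2}{c^6}$$
Using the Cauchy–Schwarz inequality, we obtain $(\mathbf{d}_1 \cdot \mathbf{d}_2)^2 \leq \mathbf{d}_1^2 \cdot \mathbf{d}_2^2$ with equality if and only if $\mathbf{d}_1$ and $\mathbf{d}_2$ are linearly dependent. The $c^6$ terms compare in the same way, since $2(\mathbf{d}_1 \cdot \mathbf{d}_2)(\mathbf{d}_1 \cdot \mathbf{w})(\mathbf{d}_2 \cdot \mathbf{w}) \leq 2|\mathbf{d}_1||\mathbf{d}_2||\mathbf{d}_1 \cdot \mathbf{w}||\mathbf{d}_2 \cdot \mathbf{w}| \leq \mathbf{d}_1^2(\mathbf{d}_2 \cdot \mathbf{w})^2 + \mathbf{d}_2^2(\mathbf{d}_1 \cdot \mathbf{w})^2$, and so the inequality is indeed true. $\blacksquare$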
Note: Since this is a strict inequality if $\mathbf{d}_1$ and $\mathbf{d}_2$ are not linearly dependent, it immediately follows that a straight line from $A$ to $B$ is always a faster path than any other path made up of straight line segments. We use a limiting argument to prove this is true for any curve.
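The closed-form travel time for constant wind can be checked numerically. The following is a minimal Python sketch, not part of Zermelo's treatment: the function names `travel_time` and `triangle_gap` and the sample values are assumptions chosen for illustration. It evaluates $T[\mathbf{d}]$ directly from the formula and measures the slack in the triangle inequality for a two-leg route.

```python
import math


def travel_time(d, w, V):
    """Minimum travel time T[d] for displacement d in a constant wind w.

    d and w are (x, y) tuples; V is the ship's speed relative to the air.
    The formula requires V > |w| (hypothetical helper, for illustration).
    """
    c2 = V * V - (w[0] ** 2 + w[1] ** 2)  # c^2 = v^2 - w^2
    if c2 <= 0:
        raise ValueError("the formula requires |v| > |w|")
    d2 = d[0] ** 2 + d[1] ** 2            # d^2
    dw = d[0] * w[0] + d[1] * w[1]        # d . w
    return math.sqrt(d2 / c2 + (dw / c2) ** 2) - dw / c2


def triangle_gap(d1, d2, w, V):
    """Slack T[d1] + T[d2] - T[d1 + d2]; non-negative if the inequality holds."""
    total = (d1[0] + d2[0], d1[1] + d2[1])
    return travel_time(d1, w, V) + travel_time(d2, w, V) - travel_time(total, w, V)


if __name__ == "__main__":
    wind, speed = (3.0, 0.0), 5.0
    print(travel_time((8.0, 0.0), wind, speed))    # leg flown with the wind
    print(travel_time((-8.0, 0.0), wind, speed))   # same leg flown against the wind
    print(travel_time((0.0, 8.0), wind, speed))    # leg flown across the wind
    print(triangle_gap((4.0, 3.0), (4.0, -3.0), wind, speed))  # detour penalty
```

With these sample values the ground speed along the wind is $5 + 3 = 8$ and against it is $5 - 3 = 2$, so the first two legs should take about 1 and 4 time units respectively, which gives a quick sanity check on the formula.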
Consider the general example of a ship moving against a variable wind $\mathbf{w}(x, y)$. Writing this component-wise, we have the drift in the $x$-axis as $u(x, y)$ and the drift in the $y$-axis as $v(x, y)$. Then for a ship moving at maximum velocity $V$ at variable heading $\theta$, we have
$$\begin{aligned} \dot{x} &= V\cos\theta + u(x, y) \\ \dot{y} &= V\sin\theta + v(x, y) \end{aligned}$$
The Hamiltonian of the system is thus
$$H = \lambda_x(V\cos\theta + u) + \lambda_y(V\sin\theta + v) + 1$$
Using the Euler–Lagrange equation, we obtain
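$$\begin{aligned} \dot{\lambda}_x &= -\frac{\partial H}{\partial x} = -\lambda_x \frac{\partial u}{\partial x} - \lambda_y \frac{\partial v}{\partial x} \\ \dot{\lambda}_y &= -\frac{\partial H}{\partial y} = -\lambda_x \frac{\partial u}{\partial y} - \lambda_y \frac{\partial v}{\partial y} \\ 0 &= \frac{\partial H}{\partial \theta} = V(-\lambda_x \sin\theta + \lambda_y \cos\theta) \end{aligned}$$
The last equation implies that $\tan\theta = \lambda_y / \lambda_x$.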
We note that the system is autonomous: the Hamiltonian does not depend explicitly on time $t$, so $H$ is constant along an optimal trajectory. Because the final time is free in a minimum-time problem, that constant is equal to 0. Solving $H = 0$ together with $\tan\theta = \lambda_y / \lambda_x$ gives
$$
\begin{aligned}
\lambda_x &= \frac{-\cos\theta}{V + u\cos\theta + v\sin\theta} \\
\lambda_y &= \frac{-\sin\theta}{V + u\cos\theta + v\sin\theta}
\end{aligned}
$$
Substituting these values into our EL-equations results in the differential equation
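$$
\frac{d\theta}{dt} = \sin^2\theta\,\frac{\partial v}{\partial x} + \sin\theta\cos\theta\left(\frac{\partial u}{\partial x} - \frac{\partial v}{\partial y}\right) - \cos^2\theta\,\frac{\partial u}{\partial y}
$$
This result is known as Zermelo's equation. Solving this with our system allows us to find the general optimum path.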
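The substitution leading to Zermelo's equation can be sketched as follows. The shorthand $u_x = \partial u/\partial x$ (and likewise $u_y$, $v_x$, $v_y$) is introduced only for this sketch. Only the ratio $\lambda_y/\lambda_x = \tan\theta$ is needed: differentiating it with respect to time and inserting $\dot{\lambda}_x = -\lambda_x u_x - \lambda_y v_x$ and $\dot{\lambda}_y = -\lambda_x u_y - \lambda_y v_y$ gives

$$
\begin{aligned}
\sec^2\theta\,\frac{d\theta}{dt} &= \frac{\dot{\lambda}_y \lambda_x - \lambda_y \dot{\lambda}_x}{\lambda_x^2} \\
&= -u_y + (u_x - v_y)\frac{\lambda_y}{\lambda_x} + v_x \frac{\lambda_y^2}{\lambda_x^2} \\
&= -u_y + (u_x - v_y)\tan\theta + v_x \tan^2\theta
\end{aligned}
$$

Multiplying both sides by $\cos^2\theta$ recovers the equation above.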
If we return to the case of a wind $\mathbf{w}$ that is constant in both position and time, we have
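$$
\frac{\partial v}{\partial y} = \frac{\partial v}{\partial x} = \frac{\partial u}{\partial x} = \frac{\partial u}{\partial y} = 0
$$
so our general solution implies $\frac{d\theta}{dt} = 0$, thus $\theta$ is constant, i.e. the optimum path is a straight line, as we had obtained before with an algebraic argument.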
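The results of this section lend themselves to a numerical check. The following Python sketch is an illustration only: the function names `zermelo_rhs`, `rk4_step` and `integrate`, the two wind fields and the step size are assumptions made for this example. It assumes the kinematics $\dot{x} = V\cos\theta + u$, $\dot{y} = V\sin\theta + v$ and integrates them together with Zermelo's equation using a fixed-step fourth-order Runge–Kutta scheme. It only propagates a trajectory forward from a chosen initial heading; reaching a prescribed destination $B$ would additionally require a search over the initial heading, which is not shown.

```python
import math

def zermelo_rhs(state, V, wind):
    """Time derivatives (dx/dt, dy/dt, dtheta/dt) for state (x, y, theta)."""
    x, y, theta = state
    # wind(x, y) is a hypothetical callback returning the drift components
    # and their partial derivatives: (u, v, du/dx, du/dy, dv/dx, dv/dy).
    u, v, u_x, u_y, v_x, v_y = wind(x, y)
    s, c = math.sin(theta), math.cos(theta)
    return (
        V * c + u,                                        # dx/dt
        V * s + v,                                        # dy/dt
        s * s * v_x + s * c * (u_x - v_y) - c * c * u_y,  # Zermelo's equation
    )

def rk4_step(state, dt, V, wind):
    """Advance the state by one classical Runge-Kutta step of size dt."""
    def shifted(k, h):
        return tuple(si + h * ki for si, ki in zip(state, k))
    k1 = zermelo_rhs(state, V, wind)
    k2 = zermelo_rhs(shifted(k1, dt / 2), V, wind)
    k3 = zermelo_rhs(shifted(k2, dt / 2), V, wind)
    k4 = zermelo_rhs(shifted(k3, dt), V, wind)
    return tuple(
        si + dt / 6 * (a + 2 * b + 2 * c + d)
        for si, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def integrate(state, V, wind, dt, steps):
    """Return the list of states visited, starting with the initial state."""
    path = [state]
    for _ in range(steps):
        state = rk4_step(state, dt, V, wind)
        path.append(state)
    return path

def constant_wind(x, y):
    # Assumed example: uniform drift, so every partial derivative is zero.
    return (0.3, -0.2, 0.0, 0.0, 0.0, 0.0)

def shear_wind(x, y):
    # Assumed example: u = -y, v = 0, so du/dy = -1 and the rest vanish.
    return (-y, 0.0, 0.0, -1.0, 0.0, 0.0)

# Integrate both cases for one time unit with V = 1.
straight = integrate((0.0, 0.0, 0.5), 1.0, constant_wind, 0.01, 100)
curved = integrate((0.0, 0.0, 0.0), 1.0, shear_wind, 0.01, 100)
print("final heading, constant wind:", straight[-1][2])
print("final heading, shear wind:", curved[-1][2])
```

With the constant wind the heading keeps its initial value, so the track is a straight line, in agreement with the argument above. With the linear shear $u = -y$, Zermelo's equation reduces to $d\theta/dt = \cos^2\theta$, so $\tan\theta$ grows linearly in time and the path curves.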