Finding the Shortest Path, using integration by parts

The Shortest Path -- A “Standard” Derivation

This is a “classic” derivation of the minimization condition for a path, using integration by parts. It's similar to proofs which appear in any number of mechanics texts. (See, also, the visual derivation of the minimization condition.)

Integration by parts is pretty simple in principle, but in practice I am typically left feeling as if the magician just pulled a rabbit from a hat after it has been applied. My preference would be for a purely "visual" derivation. But let us proceed.

The definition of a path and statement of the problem are the same as those given with the visual derivation. If you've already seen them there, you might as well just skip down to the derivation.

Definition of a Path

We define a path in R^N from point X_a to point X_b as a smooth mapping from the unit interval [0,1] on the real line into R^N:

$$X : [0,1] \to \mathbb{R}^N, \qquad X(0) = X_a, \quad X(1) = X_b \tag{1}$$

Figure 1 -- Some possible paths in R² [image: three paths]

Statement of the Problem

We are given a function f which is defined along any path. Along any particular path, f is a function of the components xⁱ of the path X and of their derivatives with respect to t, which we show with an overdot. f maps each point on the path into R. So, we could also say that f maps a particular path, and a point in the unit interval, into R; thus, we have:

$$f : (X, t) \mapsto f\big(x^i(t), \dot{x}^i(t)\big) \in \mathbb{R}, \qquad t \in [0,1] \tag{2}$$

We wish to find a particular path which will minimize (or extremize) the integral of f over the path. That integral we will call F; it maps each path into a point in R:

$$F(X) = \int_0^1 f\big(x^i(t), \dot{x}^i(t)\big)\, dt \tag{3}$$

More specifically, we wish to find a path such that the integral of f along any nearby path is at least as large as the integral along our chosen path.
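To make the problem concrete, here is a minimal numerical sketch. It assumes one particular integrand, f = speed along the path, so that F is simply the length of the path; the names path_integral, speed, straight and bowed are illustrative and not part of the original text. It approximates F for a straight line and for a nearby path with the same endpoints.

```python
import math

def path_integral(path, f, n=2000):
    """Approximate F = integral over [0,1] of f(x, xdot) dt (midpoint rule).

    path(t) returns a tuple of coordinates; the derivatives xdot are
    estimated with central differences.
    """
    h = 1.0 / n
    eps = 1e-6
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x = path(t)
        ahead, behind = path(t + eps), path(t - eps)
        xdot = tuple((a - b) / (2 * eps) for a, b in zip(ahead, behind))
        total += f(x, xdot) * h
    return total

def speed(x, xdot):
    # Assumed example integrand: f = |dX/dt|, so F is the path length.
    return math.sqrt(sum(v * v for v in xdot))

def straight(t):
    # Straight line from X_a = (0, 0) to X_b = (1, 1).
    return (t, t)

def bowed(t):
    # A nearby path with the same endpoints, pushed sideways in the middle.
    bump = 0.1 * math.sin(math.pi * t)
    return (t + bump, t - bump)

length_straight = path_integral(straight, speed)
length_bowed = path_integral(bowed, speed)
print(length_straight, length_bowed)
```

With this choice of f, the straight line should give a value close to the square root of 2, and the bowed path a somewhat larger one, which is what "at least as large along any nearby path" means here.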

Figure 2 -- Some "nearby" paths in R² [image: some nearby paths]

Note, however, that something important isn't shown in the illustrations: the function f depends on the location along the path, and also on how fast we are moving along the path. The derivatives of x and y with respect to t do not appear in Figures 1 and 2, but they are important nonetheless.

The condition a path must satisfy, to be a minimum (or maximum) of F, is the familiar one: The derivative of F (with respect to the path) must vanish. That means that any “infinitesimal” deviation from the path will result in no first-order change in F -- or, in other words, the path must be a stationary point for F in the space of all possible paths. We wish to find purely local conditions on the function X(t) which will allow us to determine if a path is minimal.

At this point we have our definitions in place, and it would make sense to bail out, look at the visual derivation of the minimization conditions, and call it a day. But what follows is more rigorous.

The Derivation of the Minimality Conditions

We need to define a variation from a path. That's just another path, but one that starts and ends at the origin of R^N, so that adding it to X leaves the endpoints X_a and X_b fixed. We'll call it η:

$$\eta : [0,1] \to \mathbb{R}^N, \qquad \eta(0) = \eta(1) = 0 \tag{4}$$

Now, we want to take the derivative in the direction of a particular variation, η. We do that by multiplying η by a scalar s, adding the result to the path X, and then differentiating with respect to s at s=0:

$$\left. \frac{d}{ds} F(X + s\eta) \right|_{s=0} = 0 \tag{5}$$

Rewriting that in terms of f, the condition we need to satisfy is:

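$$\left. \frac{d}{ds} \int_0^1 f\big(x^i + s\eta^i,\; \dot{x}^i + s\dot{\eta}^i\big)\, dt \,\right|_{s=0} = 0 \tag{6}$$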
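The derivative in the direction of η can also be estimated numerically, which gives a quick check on what the condition says. The sketch below again assumes f = speed, so that F is the path length; the helper names F, eta and directional_derivative are illustrative only. It shifts a path by a small multiple s of η in both directions and takes a central difference in s.

```python
import math

def F(path, n=2000):
    """Length of a path in R^2 (f = speed), summed over short chords."""
    total = 0.0
    prev = path(0.0)
    for k in range(1, n + 1):
        cur = path(k / n)
        total += math.dist(prev, cur)
        prev = cur
    return total

def straight(t):
    # Straight line from (0, 0) to (1, 1).
    return (t, t)

def parabola(t):
    # Another path with the same endpoints.
    return (t, t * t)

def eta(t):
    # A variation: it vanishes at t = 0 and t = 1.
    return (math.sin(math.pi * t), -math.sin(math.pi * t))

def directional_derivative(path, variation, s=1e-4):
    """Central-difference estimate of d/ds F(path + s*variation) at s = 0."""
    def shifted(sign):
        return lambda t: tuple(
            x + sign * s * e for x, e in zip(path(t), variation(t))
        )
    return (F(shifted(+1)) - F(shifted(-1))) / (2 * s)

d_straight = directional_derivative(straight, eta)
d_parabola = directional_derivative(parabola, eta)
print(d_straight, d_parabola)
```

For the straight line the estimate should be zero to within rounding error, while for the parabola it should be clearly nonzero: the parabola is not a stationary point of the length.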

Since we're only interested in the derivative at s=0, we can replace the integrand with its Taylor series in s, truncated after the first-order term:

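$$f\big(x^i + s\eta^i,\; \dot{x}^i + s\dot{\eta}^i\big) \approx f\big(x^i, \dot{x}^i\big) + s \sum_{i=1}^{N} \left( \frac{\partial f}{\partial x^i}\, \eta^i + \frac{\partial f}{\partial \dot{x}^i}\, \frac{d\eta^i}{dt} \right) \tag{7}$$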

The derivative now becomes:

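$$\int_0^1 \sum_{i=1}^{N} \left( \frac{\partial f}{\partial x^i}\, \eta^i + \frac{\partial f}{\partial \dot{x}^i}\, \frac{d\eta^i}{dt} \right) dt = 0 \tag{8}$$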

We'd like to eliminate references to dη/dt from this, so we integrate the second term by parts:

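$$\int_0^1 \sum_{i=1}^{N} \frac{\partial f}{\partial x^i}\, \eta^i \, dt + \left[ \sum_{i=1}^{N} \frac{\partial f}{\partial \dot{x}^i}\, \eta^i \right]_0^1 - \int_0^1 \sum_{i=1}^{N} \frac{d}{dt}\!\left( \frac{\partial f}{\partial \dot{x}^i} \right) \eta^i \, dt = 0 \tag{9}$$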

Because η(0) = η(1) = 0, the bracketed term in (9) disappears, and we're left with:
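The integration-by-parts step can be checked numerically in one dimension. In the sketch below, g is only a stand-in for the partial derivative of f with respect to the velocity, evaluated along the path; choosing g(t) = exp(t) and η(t) = sin(πt) is an assumption made purely for illustration. The check confirms that the integral of g·dη/dt equals the boundary term minus the integral of (dg/dt)·η, and that the boundary term vanishes because η is zero at both ends.

```python
import math

def integrate(func, n=20000):
    """Midpoint-rule approximation of the integral of func over [0, 1]."""
    h = 1.0 / n
    return sum(func((k + 0.5) * h) for k in range(n)) * h

# Stand-in for df/d(xdot) along the path (assumed for illustration).
g = math.exp
dg = math.exp  # derivative of exp is exp

def eta(t):
    # Variation that vanishes at both endpoints.
    return math.sin(math.pi * t)

def deta(t):
    # Its derivative with respect to t.
    return math.pi * math.cos(math.pi * t)

lhs = integrate(lambda t: g(t) * deta(t))
boundary = g(1.0) * eta(1.0) - g(0.0) * eta(0.0)
rhs = boundary - integrate(lambda t: dg(t) * eta(t))
print(lhs, boundary, rhs)
```

The boundary term comes out as zero up to floating-point rounding, and the two sides agree to within the accuracy of the quadrature.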

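$$\int_0^1 \sum_{i=1}^{N} \left( \frac{\partial f}{\partial x^i} - \frac{d}{dt}\, \frac{\partial f}{\partial \dot{x}^i} \right) \eta^i \, dt = 0 \tag{10}$$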

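Since this must be zero for every smooth η which vanishes at the endpoints, the expression multiplying η in the integrand must itself vanish at every t. (If it were nonzero somewhere, we could choose an η concentrated near that point and make the integral nonzero.) We must, therefore, have: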

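$$\frac{\partial f}{\partial x^i} - \frac{d}{dt}\, \frac{\partial f}{\partial \dot{x}^i} = 0, \qquad i = 1, \dots, N \tag{11}$$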

For the path to be extremal, that is the condition which must be satisfied.
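As a worked example, take the case the title refers to: the length of a path in R², with f equal to the speed along the path. This sketch assumes the condition above has the usual Euler-Lagrange form, one equation per component; the symbol v for the speed is introduced here only for convenience.

```latex
% Integrand: speed along the path, so F is the path length.
f(x, y, \dot{x}, \dot{y}) = \sqrt{\dot{x}^2 + \dot{y}^2} \equiv v

% f does not depend on position, so the position derivatives vanish:
\frac{\partial f}{\partial x} = \frac{\partial f}{\partial y} = 0

% Velocity derivatives:
\frac{\partial f}{\partial \dot{x}} = \frac{\dot{x}}{v}, \qquad
\frac{\partial f}{\partial \dot{y}} = \frac{\dot{y}}{v}

% The condition for an extremal path therefore reduces to:
\frac{d}{dt}\!\left( \frac{\dot{x}}{v} \right) = 0, \qquad
\frac{d}{dt}\!\left( \frac{\dot{y}}{v} \right) = 0
```

The pair (ẋ/v, ẏ/v) is the unit tangent vector of the path, so the condition says the direction of travel never changes: the extremal path is a straight line from X_a to X_b, as expected.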

Go to the visual derivation of the condition.

Page created 10/31/06; updated with prettier equations on 11/14/06