Like the integral of $x^n$, this is one of the most basic facts in calculus. Consequently, it deserves to be thoroughly understood -- yet a thorough understanding of it is not provided by most introductory calculus books (not the ones I've seen, anyway).
It's easy enough to prove it algebraically. Before we get into our geometric argument, here's a brief and sloppy proof using the binomial theorem (whose coefficients are the entries of Pascal's Triangle, which may be familiar from high school algebra classes).
We need to find the increase in $x^n$ when we increase x by a small amount, which we'll call δx:
$$\delta(x^n) = (x + \delta x)^n - x^n \qquad (1)$$
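It's not obvious from (1) what the difference is. We need to separate x and δx, which appear mushed together in the "$(\;)^n$" expression on the right of (1). Expanding using the binomial theorem will do the job:
$$(x + \delta x)^n = x^n + n\,x^{n-1}\,\delta x + \binom{n}{2}\,x^{n-2}\,(\delta x)^2 + \cdots + (\delta x)^n \qquad (2)$$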
Discarding everything higher than first order in δx, and plugging (2) into (1), we obtain:
$$\delta(x^n) \doteq n\,x^{n-1}\,\delta x \qquad (3)$$
where we've used the dotted equal sign to indicate equality only to first
order.
Dividing through (3) by δx and reducing the change to an infinitesimal
value (and replacing the 'δ' characters with 'd' characters) leads to the
derivative.
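The bookkeeping in (1) through (3) can be checked numerically. The following is a minimal Python sketch, not part of the original argument; the helper name `binomial_terms` and the sample values of x, δx and n are assumptions chosen only for illustration.

```python
from math import comb


def binomial_terms(x, dx, n):
    """Terms C(n, k) * x**(n - k) * dx**k of (x + dx)**n, for k = 0..n."""
    return [comb(n, k) * x ** (n - k) * dx ** k for k in range(n + 1)]


# Arbitrary sample values (assumptions for this sketch).
x, dx, n = 2.0, 1e-3, 5

terms = binomial_terms(x, dx, n)
exact_change = (x + dx) ** n - x ** n   # right-hand side of (1)
first_order = terms[1]                  # n * x**(n-1) * dx, the term kept in (3)
higher_order = sum(terms[2:])           # everything discarded in (3)

print("exact change      :", exact_change)
print("first-order term  :", first_order)
print("discarded terms   :", higher_order)
print("discarded / kept  :", higher_order / first_order)
```

Making `dx` smaller shrinks the last ratio roughly in proportion to `dx`, which is the sense in which the discarded terms stop mattering.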
That was fast, painless, and ... kind of opaque. One almost has a feeling that the coefficient 'n' wasn't entirely pulled from a hat. It represents the number of ways we can form the second term in the binomial expansion. However, I, at least, still can't see any physical significance to the (n choose 1) coefficient on the second term when I look at (2).
On the rest of this page, we'll produce a visual justification for both
the form of the result and the coefficient, and we'll also extend the
result to arbitrary real exponents.
1. Working from the (Flat) Graphs: No Joy Here
We can see pretty easily by inspection of the graphs that the derivative of $x^0$ is 0 (since it's a constant function), and the derivative of $x^1$ is 1 (since it's a diagonal line, with slope 1). But when we get to $x^2$, it's a lot less clear.
If we peek at the integral of x we get a strong clue as to what's going on with $x^2$; that integral is the area of a triangle, and we can almost see where that 2 comes from in $d(x^2)/dx = 2x$. For higher order functions, though, just looking at their graphs does not make it clear what's going on (at least for me).
But $x^n$ isn't just a function we can graph; it has a physical interpretation, as well.
2. What Is $x^n$?
We encounter this in high school algebra and physics; we just need to
recall it in the context of calculus:
- x is a length.
- $x^2$ is an area.
- $x^3$ is a volume.
In general, $x^n$ is the n-volume of an n dimensional hypercube with side length x.
And that was the "Aha!" moment; the rest is details.
3. The Change in $x^n$ Due to a Change in x, for integer n>0
Figure 1: Change in $x^2$
If x is a length, then a tiny change in x is also a tiny length. And so we have the remarkably undistinguished result
$$\delta(x^1) = \delta x \qquad (3.1)$$
where we have used the symbol "δ" (rather than "d") to indicate a tiny, but finite, difference.
But now, consider $x^2$, which is the area of a square with side length x. When we increase x by a small increment, δx, we can see from figure 1 that the increase in $x^2$ must be:
$$\delta(x^2) = 2\,x\cdot\delta x + (\delta x)^2 \qquad (3.2)$$
When we divide through (3.2) by δx and reduce it to an infinitesimal value, we obtain the derivative:
$$\frac{d(x^2)}{dx} = 2x \qquad (3.2b)$$
The picture tells the story pretty well all on its own. Nonetheless, there are three things worth emphasizing here.
- The term x⋅δx is the product of the small change in x,
with the length (or 1-dimensional "area") of one face
of the square.
- The factor of 2 is there because we're growing the
square along two axes.
If the square has its lower left corner at the origin, then the faces
which lie along the vertical line passing through (x,0) and the
horizontal line which passes through (0,x) have each shifted
outward by δx.
In other words, for each pair of faces (of which the square
has two), we've moved one face out by δx. The square has two
pairs of faces; hence, we need to add up 2 copies of the term
xδx.
- The piece shown as "second order" in figure 1,
which is the fragment in the upper right corner, is second order in
δx. As we shrink δx down to an infinitesimal size, the second
order term essentially vanishes, since it's a factor of δx smaller
than the two rectangles of size xδx. Consequently, we can ignore
it. To put it another way, when we're finding a derivative we
only care about the first order (linear) behavior of the function.
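As a quick numeric illustration of the three points above (the values x = 3 and δx = 0.1 are arbitrary choices for this sketch, not taken from the figure), the pieces of the enlarged square add up as follows:

```latex
\begin{aligned}
(x+\delta x)^2 - x^2 &= 3.1^2 - 3^2 = 9.61 - 9 = 0.61 \\
2\,x\cdot\delta x &= 2 \cdot 3 \cdot 0.1 = 0.60 \\
(\delta x)^2 &= 0.1^2 = 0.01
\end{aligned}
```

The two rectangles account for 0.60 of the 0.61 increase. Shrinking δx to 0.01 gives an increase of 0.0601, of which the corner piece contributes only 0.0001.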
Figure 2: Change in $x^3$
And now let's consider the change in $x^3$ when x changes by a tiny amount. We've shown it in figure 2, and it's exactly analogous to the change in $x^2$ shown in figure 1. The picture tells the story; we'll now describe in words what's going on.
The cube itself has volume $x^3$. When we increase x by a small amount, we "grow" the cube along each of the three axes. The result is that we add on three panels, one on each face of the cube that's being "grown".
The volume of one of the "panels" is the area of the face, times the amount by which we're increasing x, which is the thickness of the "panel". The area of one face is $x^2$, and the amount we're increasing x by is δx, so the volume of each of the "panels" is $x^2\cdot\delta x$.
We're adding three "panels" because we're growing along three axes -- one axis for each pair of faces. So, to first order, the total amount by which we're increasing the volume of the cube is
$$\delta(x^3) \doteq 3\,x^2\cdot\delta x \qquad (3.3)$$
In summary:
- The reason the factor of 3 appears is that we're growing along three axes.
- The reason for the factor $x^2$ is that it is the area of one face of the cube, which has dimension one smaller than the cube as a whole. One "face" of a square has dimension 1, one "face" of a cube has dimension 2, one "face" of a 4-dimensional hypercube has dimension 3, and so forth.
Figure 3: Cube with higher order "caulking" shown
Before I go on, I should mention that I left out the "higher order" changes in the volume entirely in figure 2, to save clutter and keep the visualization simple. There's actually a bit of second order "caulking" along each of the three visible edges of the cube, and there's a tiny third order "cubelet" at the corner. Since they all are higher than first order in δx they don't matter to the derivative, as their contribution to the change in volume is vanishingly small when we make δx very small.
For completeness, I've shown the cube with the "caulking" in place in figure 3. The three green boxes have total volume $3x\cdot(\delta x)^2$, and the gray "cubelet" in the corner has volume $(\delta x)^3$. The ratio of the total volume of the caulking to the volume of the three "panels" approaches zero as we make δx very small, which is why we can ignore it when finding the derivative.
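The shrinking ratio described above is easy to tabulate. Here is a small Python sketch under stated assumptions: the function name `cube_growth_pieces` and the sample side length are made up for the illustration, and the three volumes are the panel, caulking and cubelet volumes quoted in the text.

```python
def cube_growth_pieces(x, dx):
    """Volumes of the pieces added when a cube of side x grows to side x + dx."""
    panels = 3 * x**2 * dx      # three first-order panels, one per axis
    caulking = 3 * x * dx**2    # three second-order strips along the edges
    cubelet = dx**3             # one third-order corner piece
    return panels, caulking, cubelet


x = 2.0  # arbitrary side length (an assumption for this sketch)
for dx in (0.1, 0.01, 0.001):
    panels, caulking, cubelet = cube_growth_pieces(x, dx)
    ratio = (caulking + cubelet) / panels
    print(f"dx={dx}: panels={panels:.6g}, higher order={caulking + cubelet:.6g}, ratio={ratio:.6g}")
```

Each tenfold reduction in `dx` cuts the ratio by roughly a factor of ten, so in the limit only the three panels contribute to the derivative.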
3.1 Higher Powers of x
For higher powers of x, the situation is exactly analogous to the two and three dimensional cases discussed above, though I won't attempt to draw any higher dimensional figures.
We can visualize the expression $x^n$ as representing the volume of an n dimensional hypercube with side length x.
As we saw on the hypercubes page, an n dimensional hypercube has n pairs of faces, with one pair of faces lying along each of the n axes of $R^n$.
The area of one face of a hypercube with side length x is $x^{n-1}$.
When we grow the hypercube, by increasing x by δx, we increase the volume of the cube by adding a (thin) "panel" to one face of each pair of faces. The volume of a "panel" is the area of the face, times its thickness, which is δx. We're growing the hypercube, and adding one "panel", along each axis, and there are n axes. So, the amount of volume we add, to first order, is
$$\delta(x^n) \doteq n\,x^{n-1}\,\delta x \qquad (3.4)$$
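One way to see where the count of n panels comes from is to enumerate the pieces of the enlarged hypercube directly. In the Python sketch below (an illustration only; the function names `pieces_by_order` and `volume_change` are hypothetical), each piece is identified by choosing, independently on every axis, either the original segment of length x or the added segment of length δx. Pieces that use the added segment on exactly one axis are the first-order panels.

```python
from collections import Counter
from itertools import product


def pieces_by_order(n):
    """Count pieces of the grown n-cube by how many axes use the added dx segment."""
    counts = Counter()
    for choice in product(("x", "dx"), repeat=n):
        counts[choice.count("dx")] += 1
    return dict(counts)


def volume_change(x, dx, n):
    """Total added volume: every piece except the original cube (order 0)."""
    counts = pieces_by_order(n)
    return sum(c * x ** (n - k) * dx ** k for k, c in counts.items() if k >= 1)


for n in (2, 3, 4):
    print(n, pieces_by_order(n))

print(volume_change(2.0, 0.5, 3), (2.0 + 0.5) ** 3 - 2.0 ** 3)
```

For every n, the number of order-1 pieces is n (one panel per axis), and the counts at each order are the binomial coefficients that appear in (2).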
This completes our comments on $d(x^n)/dx$ for positive integer values of n. On the rest of this page we'll be generalizing that result to other values for n, and to negative values of x.
4. The derivative of $x^{-n}$, for integer n>0
Figure 4: Change in $(1/x)^2$
To find the derivative of a negative integer power of x, we just invert x and use the same visualization we already used in section 3 for a positive integer power. The only tricky bit is that the hypercube, as shown in figures 1 and 2 above, shrinks instead of growing when we increase x. Consequently, the "panel thickness" is going to be negative, rather than positive (see figure 4 for the case n=2).
To complete this, we need to figure out, to first order, how much 1/x changes when we increase x by δx:
$$\delta\!\left(\frac{1}{x}\right) = \frac{1}{x+\delta x} - \frac{1}{x} \qquad (4.1)$$
That's not obvious by inspection (to me, at least) so let's expand the fraction 1/(x+δx) in terms of 1/x. We only need the value to first order in δx so it's easy and there's no need to use any fancy tools we haven't developed yet (like Taylor series). We can just divide it out, using polynomial long division, and stop when the remainder is second order in δx (the division's easy, but typesetting it in LaTeX was a major pain, BTW):
$$\frac{1}{x+\delta x} = \frac{1}{x} - \frac{\delta x}{x^2} + \frac{(\delta x)^2}{x^2\,(x+\delta x)} \qquad (4.2)$$
And now, using (4.2), we can find the small change in 1/x due to a small change in x:
$$\delta\!\left(\frac{1}{x}\right) \doteq -\frac{\delta x}{x^2} \qquad (4.3)$$
(Note that we have again used the dotted equal sign to mean "equal to first order only". Note also that we just found the derivative of 1/x, though that wasn't what we set out to do.)
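For readers who want the division spelled out, here is one way to write its two steps. This is a sketch of the standard long-division procedure, with 1 as the dividend and x+δx as the divisor; it is not a transcription of the original typeset figure.

```latex
\begin{aligned}
1 &= (x+\delta x)\cdot\frac{1}{x} \;-\; \frac{\delta x}{x}
  && \text{first quotient term } \tfrac{1}{x},\ \text{remainder } -\tfrac{\delta x}{x} \\
-\frac{\delta x}{x} &= (x+\delta x)\cdot\left(-\frac{\delta x}{x^2}\right) \;+\; \frac{(\delta x)^2}{x^2}
  && \text{second quotient term } -\tfrac{\delta x}{x^2},\ \text{remainder } \tfrac{(\delta x)^2}{x^2} \\
\frac{1}{x+\delta x} &= \frac{1}{x} - \frac{\delta x}{x^2} + \frac{(\delta x)^2}{x^2\,(x+\delta x)}
\end{aligned}
```

The remainder after the second step is second order in δx, so we stop there; subtracting 1/x from the last line leaves −δx/x² to first order.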
Now, using the change in 1/x provided by (4.3) in place of the simple change in x, and using the fact that the area of one face of a hypercube of side length L is $L^{n-1}$, we can find the volume of one of the panels we're subtracting from the cube (in exact analogy to the arguments given for positive n, in the previous section). We've shown this graphically for n=2 in figure 4.
$$\left(\frac{1}{x}\right)^{n-1}\cdot\left(-\frac{\delta x}{x^2}\right) = -\frac{\delta x}{x^{n+1}} \qquad (4.4)$$
And finally, using the fact that an n dimensional hypercube is going to be growing (or, rather, shrinking) along n axes, so its total change in volume will be n times the volume of one panel, we can find the change in volume of a hypercube of side length 1/x when x changes by a small amount:
$$\delta\!\left[\left(\frac{1}{x}\right)^{n}\right] \doteq -\,n\,\frac{\delta x}{x^{n+1}} \qquad (4.5)$$
And we're done with the hard part.
Replacing 1/x with $x^{-1}$ and then dividing through by δx, reducing the change to an infinitesimal value, and replacing the δ characters with 'd' characters, gives us the derivative. To make it look more like the usual result, we've also substituted h=-n in the following (note that h<0):
$$\frac{d(x^{h})}{dx} = h\,x^{h-1} \qquad (4.6)$$
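The panel argument for negative powers can be compared against the actual change in $(1/x)^n$. The Python sketch below is only an illustration; the name `panel_estimate` and the sample values are assumptions, and the thickness −δx/x² is the first-order change in 1/x found above.

```python
def panel_estimate(x, dx, n):
    """First-order change in (1/x)**n: n panels of face area (1/x)**(n-1)."""
    side_change = -dx / x**2            # first-order change in the side length 1/x
    face_area = (1.0 / x) ** (n - 1)    # area of one face of the hypercube
    return n * face_area * side_change


# Arbitrary sample values (assumptions for this sketch).
x, dx, n = 2.0, 1e-6, 3

actual = (1.0 / (x + dx)) ** n - (1.0 / x) ** n
estimate = panel_estimate(x, dx, n)
print("actual change  :", actual)
print("panel estimate :", estimate)
print("-n * x**(-n-1) :", -n * x ** (-n - 1), " vs estimate / dx:", estimate / dx)
```

The estimate divided by `dx` is exactly −n·x^(−n−1), and the actual change agrees with it up to terms of higher order in `dx`.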
5. The derivative of $x^n$, for integer n, and x<0
This is most easily dealt with through inspection of the graphs of even
and odd functions.
From the graphs, it seems clear that, for an even function, f,
$$f'(-x) = -f'(x) \qquad (5.1)$$
Conversely, for an odd function, g, we have
$$g'(-x) = g'(x) \qquad (5.2)$$
(5.1 and 5.2 can be proved with a little symbol pushing but it doesn't add
a lot to the exposition so we won't.)
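Consequently, for any even power of x, and for an arbitrary negative value u, we have
$$
\begin{aligned}
\left.\frac{d(x^n)}{dx}\right|_{x=u} &= -\left.\frac{d(x^n)}{dx}\right|_{x=-u} = -n\,(-u)^{n-1} \\
&= -n\,(-1)^{n-1}\,u^{n-1} = n\,u^{n-1}
\end{aligned} \qquad (5.3)
$$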
where we've used the fact that even powers of x are even functions, and we already know the derivative for x>0. We also used the fact that -1 raised to an odd power is -1, as a result of which the two "-" signs in the second line cancel.
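For any odd power of x, and for an arbitrary negative value u, we have:
$$
\begin{aligned}
\left.\frac{d(x^n)}{dx}\right|_{x=u} &= \left.\frac{d(x^n)}{dx}\right|_{x=-u} = n\,(-u)^{n-1} \\
&= n\,(-1)^{n-1}\,u^{n-1} = n\,u^{n-1}
\end{aligned} \qquad (5.4)
$$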
where we've used the fact that odd powers of x are odd functions, and we already know the derivative for x>0. We also used the fact that -1 raised to an even power is 1, so the "-" sign in the second line drops out.
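The claim that the same formula holds for negative x can be spot-checked with a finite difference. This Python sketch is an illustration only: the helper names `central_difference` and `power`, the step size, and the test point u = -1.5 are all assumptions, and a numerical check is of course not a proof.

```python
def central_difference(f, x, h=1e-6):
    """Symmetric finite-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def power(n):
    """Return the function x -> x**n for an integer n."""
    return lambda x: x ** n


u = -1.5  # an arbitrary negative test point
for n in (2, 3, 4, 5, -1, -2):
    numeric = central_difference(power(n), u)
    formula = n * u ** (n - 1)
    print(f"n={n}: numeric={numeric:.6f}, formula={formula:.6f}")
```

Both even and odd exponents, positive and negative, agree with n·u^(n−1) to within the accuracy of the finite difference.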
6. The derivative of $x^{1/n}$, for integer n≠0
For a function of a single variable, it should be clear that
$$\frac{dy}{dx} = \frac{1}{dx/dy} \qquad (6.1)$$
wherever its derivative is defined and nonzero.
Furthermore, if we have $y=x^{1/n}$, then, if we raise each side to the nth power, we see that $x=y^n$. (This would benefit from a figure and I may add one at some point.)
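So, if $y=x^{1/n}$, we have
$$\frac{dy}{dx} = \frac{1}{dx/dy} = \frac{1}{n\,y^{n-1}} = \frac{1}{n}\,x^{-(n-1)/n} = \frac{1}{n}\,x^{\frac{1}{n}-1} \qquad (6.2)$$
as was to be shown.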
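As a concrete instance of this argument, take n = 2 and x > 0 (the square root is chosen here only as a familiar example):

```latex
\begin{aligned}
y &= x^{1/2} \quad\Longrightarrow\quad x = y^2, \qquad \frac{dx}{dy} = 2y \\
\frac{dy}{dx} &= \frac{1}{dx/dy} = \frac{1}{2y} = \frac{1}{2\,x^{1/2}} = \frac{1}{2}\,x^{\frac{1}{2}-1}
\end{aligned}
```

This matches the general pattern: the exponent 1/2 comes down as a coefficient, and the new exponent is 1/2 − 1.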
7. The derivative of $x^{m/n}$ for integers m≠0, n≠0
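We just use the chain rule. We rewrite $x^{m/n}$ as:
$$x^{m/n} = \left(x^{1/n}\right)^{m} \qquad (7.1)$$
Then we have:
$$\frac{d(x^{m/n})}{dx} = m\left(x^{1/n}\right)^{m-1}\cdot\frac{1}{n}\,x^{\frac{1}{n}-1} = \frac{m}{n}\,x^{\frac{m}{n}-1} \qquad (7.2)$$
which was to be shown.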
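The chain-rule step can be mirrored in a few lines of Python. This is a sketch under assumptions: the function names are hypothetical, x is taken to be positive so that the root is real, and the (m, n) pairs are arbitrary samples.

```python
def derivative_via_chain_rule(x, m, n):
    """d/dx of (x**(1/n))**m for x > 0, built from the root rule and the integer power rule."""
    root = x ** (1.0 / n)                        # inner function y = x**(1/n)
    d_root = (1.0 / n) * x ** (1.0 / n - 1.0)    # dy/dx for the nth root
    d_outer = m * root ** (m - 1)                # d(y**m)/dy, integer power rule
    return d_outer * d_root


def derivative_direct(x, m, n):
    """The closed form (m/n) * x**(m/n - 1)."""
    return (m / n) * x ** (m / n - 1.0)


x = 2.5  # arbitrary positive test point
for m, n in ((3, 2), (2, 3), (-1, 2), (5, -3)):
    print(m, n, derivative_via_chain_rule(x, m, n), derivative_direct(x, m, n))
```

The two columns agree up to floating-point rounding, which is just the exponent arithmetic (m−1)/n + 1/n − 1 = m/n − 1 carried out numerically.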
8. The derivative of $x^r$, for real r≠0
Fourth down, and I'm tired. Time to punt.
$d(x^n)/dx$ is a continuous function of x and n, so its value at any irrational point is the limit of its value at rational points which approach that point. Consequently the formula for the derivative we've found, which is valid at all rational points, must be valid at all real points.
Taking the limit, and showing that it converges to the same expression we found for the rational points, is straightforward, and is very similar to what we did on the Integrals of Powers of X page, §5, "The Integral of $x^s$, for any positive real s".
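To get a feel for the limiting process, one can evaluate the rational-exponent formula at rational numbers that close in on an irrational exponent. The Python sketch below is an illustration, not a proof: the target exponent √2, the test point x = 2, and the function name `rational_power_derivative` are assumptions, and the rational approximations come from the standard library's `Fraction.limit_denominator`.

```python
import math
from fractions import Fraction


def rational_power_derivative(x, r):
    """Slope r * x**(r - 1) from the rational-exponent formula, for x > 0."""
    r = float(r)
    return r * x ** (r - 1.0)


x = 2.0
target_exponent = math.sqrt(2.0)
limit_value = target_exponent * x ** (target_exponent - 1.0)

errors = []
for max_denominator in (10, 100, 1000, 10000):
    r = Fraction(target_exponent).limit_denominator(max_denominator)
    value = rational_power_derivative(x, r)
    errors.append(abs(value - limit_value))
    print(f"r = {r}: formula gives {value:.10f}, gap to limit {errors[-1]:.2e}")
```

As the rational exponents approach √2, the values of r·x^(r−1) approach √2·x^(√2−1), which is what the continuity argument above relies on.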
I may expand this section later.
Page created on 9/18/2013