Like the integral of x^n,
this is one of the most basic facts in calculus. Consequently, it
deserves to be thoroughly understood -- yet a thorough understanding of
this is not provided by most introductory calculus books (not the ones
I've seen, anyway).
It's easy enough to prove it algebraically. Before we
get into our geometric argument, here's a brief and sloppy proof using
the binomial theorem (aka Pascal's Triangle, which may be familiar from
high school algebra classes).
We need to find the increase in x^n when we increase x
by a small amount, which we'll call δx:
$$\delta(x^n) = (x + \delta x)^n - x^n \tag{1}$$
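It's not obvious from (1) what the difference is. We need to
separate x and δx, which appear mushed together in the
"(x+δx)^n" expression on the right of (1). Expanding using
the binomial theorem will do the job:
$$(x + \delta x)^n = x^n + n\,x^{n-1}\,\delta x + \binom{n}{2}\,x^{n-2}\,(\delta x)^2 + \cdots + (\delta x)^n \tag{2}$$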
Discarding everything higher than first order in δx, and plugging (2)
into (1), we obtain:
$$\delta(x^n) \doteq n\,x^{n-1}\,\delta x \tag{3}$$
where we've used the dotted equal sign to indicate equality only to
first order in δx.
Dividing through (3) by δx and reducing the change to an infinitesimal
value (and replacing the 'δ' characters with 'd' characters) leads to
$$\frac{d(x^n)}{dx} = n\,x^{n-1}$$
That was fast, painless, and ... kind of opaque. One almost
has the feeling that the coefficient 'n' was pulled from a hat.
It wasn't, entirely: it represents the number of ways we can form the
second term in the binomial expansion. However, I, at least,
can't see any physical significance to the (n choose 1)
coefficient on the second term when I look at (2).
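If you'd like to check the algebra above numerically, here's a minimal Python sketch. The function names (`binomial_terms`, `difference_quotient`) and the sample values of x, n and δx are made up for this illustration. It uses exact rational arithmetic, so the comparison isn't muddied by rounding.

```python
from fractions import Fraction
from math import comb

def binomial_terms(x, dx, n):
    """Terms C(n, k) * x**(n-k) * dx**k of (x + dx)**n, for k = 0..n."""
    return [comb(n, k) * x ** (n - k) * dx ** k for k in range(n + 1)]

def difference_quotient(x, dx, n):
    """((x + dx)**n - x**n) / dx, i.e. equation (1) divided by dx."""
    return ((x + dx) ** n - x ** n) / dx

# Illustrative values only; any positive x and integer n > 0 would do.
x, n = Fraction(3), 5
for dx in (Fraction(1, 10), Fraction(1, 1000), Fraction(1, 100000)):
    terms = binomial_terms(x, dx, n)
    assert sum(terms) == (x + dx) ** n       # the expansion (2) is exact
    first_order = terms[1] / dx              # this is n * x**(n-1)
    error = difference_quotient(x, dx, n) - first_order
    print(f"dx={float(dx):g}  quotient - n*x^(n-1) = {float(error):g}")
```

The printed error is the contribution of the discarded higher order terms, and it shrinks along with δx.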
On the rest of this page, we'll produce a visual justification for both
the form of the result and the coefficient, and we'll also extend the
result to arbitrary real exponents.
1. Working from the (Flat) Graphs: No Joy Here
We can see pretty easily by inspection of the graphs that the
derivative of x^0 is 0 (since it's a constant
function), and the derivative of x^1 is 1 (since it's
a diagonal line, with slope 1). But when we get to x^2
it's a lot less clear.
If we peek at the integral of x we get a strong clue as to
what's going on with x^2; it's the area of a
triangle, and we can almost see where that 2 comes from. For higher order
functions, though, just looking at their graphs does not make it clear
what's going on (at least for me).
But x^n isn't just a function we can graph; it has a
physical interpretation, as well.
2. What Is x^n?
We encounter this in high school algebra and physics; we just need to
recall it in the context of calculus:
- x is a length.
- x^2 is an area.
- x^3 is a volume.
In general, x^n is the n-volume of an n dimensional hypercube
with side length x.
And that was the "Aha!" moment; the rest is details.
3. The Change in x^n Due to a Change in x, for integer n>0
If x is a length, then a tiny change in x is also a
tiny length. And so we have the remarkably undistinguished result
$$\delta(x^1) = \delta x \tag{3.1}$$
where we have used the symbol "δ" (rather than "d") to indicate
a tiny, but finite, difference.
But now, consider x^2, which is the area of a square
with side length x. When we increase x by a small
amount, δx, we can see from figure 1
the increase in x^2:
$$\delta(x^2) = 2x\,\delta x + (\delta x)^2 \tag{3.2}$$
When we divide through (3.2) by δx and reduce it to an infinitesimal
value, we obtain the derivative:
$$\frac{d(x^2)}{dx} = 2x$$
The picture tells the story pretty well all on its own. None the
less, there are three things worth emphasizing here.
- The term x⋅δx is the product of the small change in x,
with the length (or 1-dimensional "area") of one face
of the square.
- The factor of 2 is there because we're growing the
square along two axes.
If the square has its lower left corner at the origin, then the faces
which lie along the vertical line passing through (x,0) and the
horizontal line which passes through (0,x) have each shifted
outward by δx. In other words, for each pair of faces (of which the square
has two), we've moved one face out by δx. The square has two pairs
of faces; hence, we need to add up 2 copies of the term xδx.
- The piece shown as "second order" in figure 1,
which is the fragment in the upper right corner, is second order in
δx. As we shrink δx down to an infinitesimal size, the second
order term essentially vanishes, since it's a factor of δx smaller
than the two rectangles of size xδx. Consequently, we can ignore
it. To put it another way, when we're finding a derivative we
only care about the first order (linear) behavior of the function.
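The three points above can be sketched in a few lines of Python. This is only an illustration under my own naming (`square_growth` is a hypothetical helper, and the side length 2 is arbitrary): it splits the added area into the two rectangles and the corner fragment, and shows the corner becoming negligible next to the rectangles as δx shrinks.

```python
from fractions import Fraction

def square_growth(x, dx):
    """Split the area added to a square of side x into (strips, corner)."""
    strips = 2 * x * dx      # two rectangles of size x*dx, one per axis
    corner = dx * dx         # the "second order" fragment
    return strips, corner

x = Fraction(2)              # arbitrary side length for the demonstration
for dx in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
    strips, corner = square_growth(x, dx)
    # Together the pieces account for the whole change in area.
    assert strips + corner == (x + dx) ** 2 - x ** 2
    print(f"dx={float(dx)}: strips={float(strips)}, "
          f"corner={float(corner)}, corner/strips={float(corner / strips)}")
```

The ratio corner/strips works out to δx/(2x), which is the sense in which the corner is "a factor of δx smaller".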
Figure 2: Change in x^3
And now let's consider the change in x^3 when x
changes by a tiny amount. We've shown it in figure 2, and it's
exactly analogous to the change in x^2 shown in
figure 1. The picture tells the story; we'll now describe in
words what's going on.
The cube itself has volume x^3. When we increase x
by a small amount, we "grow" the cube along each of
the three axes. The result is that we add on three panels,
one on each face of the cube that's being "grown".
The volume of one of the "panels" is the area of the face, times
the amount by which we're increasing x, which is the thickness
of the "panel". The area of one face is x^2
and the amount we're increasing x by is δx, so the volume
of each of the "panels" is x^2⋅δx.
We're adding three "panels" because we're growing along three
axes -- one axis for each pair of faces. So, the total
amount by which we're increasing the volume of the cube is
$$\delta(x^3) \doteq 3x^2\,\delta x$$
The reason the factor of 3 appears is that we're growing
along three axes.
The reason for the factor x^2
is that it is the area of one face of the cube, which has
dimension one smaller than the cube as a whole. One
"face" of a square has dimension 1, one "face" of a cube has dimension
2, one "face" of a 4-dimensional hypercube has dimension 3, and so on.
Figure 3: Change in x^3, with higher order "caulking" shown
Before I go on, I should mention that I left out the "higher order"
changes in the volume entirely in figure 2, to
save clutter and keep the visualization simple. There's actually
a bit of second order "caulking" along each of the three visible edges
of the cube, and there's a tiny third order "cubelet" at the
corner. Since they all are higher than first order in δx they
don't matter to the derivative, as their contribution to the change in
volume is vanishingly small when we make δx very small.
For completeness, I've shown the cube with the "caulking" in place in figure 3.
The three green boxes have total volume 3x(δx)^2,
and the gray "cubelet" in the corner has
volume (δx)^3. The ratio
$$\frac{3x\,(\delta x)^2 + (\delta x)^3}{3x^2\,\delta x} = \frac{\delta x}{x} + \frac{(\delta x)^2}{3x^2}$$
of the total
volume of the caulking (cubelet included) to the volume of the three "panels" approaches
zero as we make δx very small, which is why we can ignore it when
finding the derivative.
3.1 Higher Powers of x
For higher powers of x, the situation is exactly analogous
to the two and three dimensional cases discussed above,
though I won't attempt to draw any higher dimensional figures.
We can visualize the expression x^n as
the volume of an n dimensional hypercube with side length x.
As we saw on the hypercubes
page, an n dimensional hypercube has n pairs
of faces, with one
pair of faces lying along each of the n axes of R^n. The
area of one face of a hypercube with side length x is x^(n-1). When
we grow the hypercube, by increasing x by δx, we increase
the volume of the cube by adding a (thin) "panel" to each face of the
cube. The volume of a "panel" is the area of the face, times
its thickness, which is δx. We're growing the hypercube, and
adding one "panel", along each axis, and there are n
axes. So, the amount of volume we add, to first order, is
$$\delta(x^n) \doteq n\,x^{n-1}\,\delta x$$
This completes our comments on d(x^n)/dx for positive integer
values of n. On the rest of this page we'll be
generalizing that result to other values for n, and to negative
values of x.
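Since I can't draw the higher dimensional pictures, here's a rough Python sketch that does the bookkeeping instead. It assumes the standard counting rule that the added volume breaks into C(n, k) pieces of order k, each of volume x^(n-k)⋅(δx)^k (the k = 1 pieces are the n "panels"). The helper name `growth_pieces` and the sample values are mine, chosen only for illustration.

```python
from fractions import Fraction
from math import comb

def growth_pieces(x, dx, n):
    """Volume added to an n-cube of side x, grouped by order k in dx.

    Returns {k: (count, total_volume)}, assuming C(n, k) pieces of
    order k, each of volume x**(n-k) * dx**k.
    """
    return {k: (comb(n, k), comb(n, k) * x ** (n - k) * dx ** k)
            for k in range(1, n + 1)}

x, dx = Fraction(2), Fraction(1, 100)    # arbitrary sample values
for n in (2, 3, 4, 7):
    pieces = growth_pieces(x, dx, n)
    total = sum(volume for _, volume in pieces.values())
    assert total == (x + dx) ** n - x ** n   # the pieces fill the whole change
    count, panels = pieces[1]                # first order: the n panels
    ratio = float((total - panels) / panels)
    print(f"n={n}: {count} panels, higher order / panels = {ratio:.5f}")
```

For n=3 this reproduces the three panels, three strips of caulking and one cubelet from the cube; for any n the number of panels is n, which is where the coefficient comes from.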
4. The derivative of x^(-n), for integer n>0
Figure 4: Change in (1/x)^2
To find the derivative of a negative integer power of x we
just invert x
and use the same visualization we already used in
the positive integer power case. The only tricky bit is that the
hypercube, shown growing in figure 1 and the figures after it,
is now shrinking instead of growing when
we increase x. Consequently, the "panel thickness" is
going to be negative, rather than positive (see figure 4
for the case n=2).
To complete this, we need to figure out, to first order, how much 1/x
changes when we increase x by δx:
$$\delta\!\left(\frac{1}{x}\right) = \frac{1}{x + \delta x} - \frac{1}{x} \tag{4.1}$$
That's not obvious by inspection (to me, at least) so let's expand the
first term, 1/(x+δx), in terms of 1/x. We only need the value
to first order
in δx so it's easy and there's no need to use
any fancy tools we haven't developed yet (like Taylor series). We
can just divide it out, using polynomial long division, and stop when
the remainder is second order in δx (the division's easy, but
typesetting it in LaTeX was a major pain, BTW):
$$\frac{1}{x + \delta x} = \frac{1}{x} - \frac{\delta x}{x^2} + \frac{(\delta x)^2}{x^2\,(x + \delta x)} \tag{4.2}$$
And now, using (4.2), we can find the small change in 1/x due to a
small change in x:
$$\delta\!\left(\frac{1}{x}\right) \doteq -\frac{\delta x}{x^2} \tag{4.3}$$
(Note that we have again used the dotted equal sign to mean "equal to first
order". Note also that we just found the derivative of
1/x, though that wasn't what we set out to do.)
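The first order change in 1/x is easy to sanity-check with exact fractions. The sketch below is only an illustration: `reciprocal_expansion` is a hypothetical helper of mine, and the remainder formula in it, (δx)^2 / (x^2 (x+δx)), is what the long division leaves behind if you stop after the first order term.

```python
from fractions import Fraction

def reciprocal_expansion(x, dx):
    """Return (first_order, remainder) with 1/(x+dx) == first_order + remainder."""
    first_order = 1 / x - dx / x ** 2              # quotient, kept to first order
    remainder = dx ** 2 / (x ** 2 * (x + dx))      # second order in dx
    return first_order, remainder

x = Fraction(4)                                    # arbitrary sample value
for dx in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
    first_order, remainder = reciprocal_expansion(x, dx)
    assert first_order + remainder == 1 / (x + dx)  # the split is exact
    change = 1 / (x + dx) - 1 / x                   # true change in 1/x
    estimate = -dx / x ** 2                         # first order change
    print(f"dx={float(dx):g}: change={float(change):.8f}, "
          f"estimate={float(estimate):.8f}, remainder={float(remainder):.2e}")
```

The remainder falls off like (δx)^2, so it disappears from the derivative just as the "caulking" did.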
Now, using the change in 1/x provided by (4.3) in place of the simple
change in x, and using the fact that the area of one face of a
hypercube of side length L is L^(n-1), we can
find the volume of one
of the panels we're subtracting
from the cube (in exact analogy to the arguments given for positive n
in the previous section). We've shown this graphically for n=2
in figure 4:
$$\text{volume of one panel} \doteq \left(\frac{1}{x}\right)^{n-1}\left(-\frac{\delta x}{x^2}\right) = -\frac{\delta x}{x^{n+1}}$$
And finally, using the fact that an n dimensional hypercube is
going to be growing (or, rather, shrinking) along n axes, so that
its total change in volume will be n times the volume of one
panel, we can find the change in volume of a hypercube of side length 1/x
when x changes by a small amount:
$$\delta\!\left(\left(\frac{1}{x}\right)^{n}\right) \doteq -\frac{n\,\delta x}{x^{n+1}}$$
And we're done with the hard part.
Rewriting (1/x)^n as x^(-n), and then dividing
through by δx, reducing the change to an infinitesimal value, and
replacing the δ characters with 'd' characters, gives us the
derivative. To make it look more like the usual result, we've
also substituted h=-n
in the following (note that h is negative):
$$\frac{d(x^{-n})}{dx} = -n\,x^{-n-1} \qquad\Longrightarrow\qquad \frac{d(x^{h})}{dx} = h\,x^{h-1}$$
5. The derivative of x^n, for integer n, and x<0
This is most easily dealt with through inspection of the graphs of even
and odd functions.
From the graphs, it seems clear that, for an even function, f, we have
$$f'(-x) = -f'(x) \tag{5.1}$$
Conversely, for an odd
function, g, we have
$$g'(-x) = g'(x) \tag{5.2}$$
(5.1 and 5.2 can be proved with a little symbol pushing but it doesn't add
a lot to the exposition so we won't.)
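Consequently, for any even
power of x, and for an
arbitrary negative value u, we have
$$\begin{aligned}
\left.\frac{d(x^n)}{dx}\right|_{x=u} &= -\left.\frac{d(x^n)}{dx}\right|_{x=-u} \\
&= -n\,(-u)^{n-1} = -n\,(-1)^{n-1}\,u^{n-1} \\
&= n\,u^{n-1}
\end{aligned}$$
where we've used the fact that even powers of x are even
functions, and we already know the derivative for x>0.
We also used the fact that -1 raised to an odd power is -1, as a result
of which the two "-" signs in the second line cancel.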
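For any odd power of x, and for an arbitrary negative value u, we have
$$\begin{aligned}
\left.\frac{d(x^n)}{dx}\right|_{x=u} &= \left.\frac{d(x^n)}{dx}\right|_{x=-u} \\
&= n\,(-u)^{n-1} = n\,(-1)^{n-1}\,u^{n-1} \\
&= n\,u^{n-1}
\end{aligned}$$
where we've used the fact that odd powers of x are odd
functions, and we already know the derivative for x>0.
We also used the fact that -1 raised to an even power is 1, so the "-"
sign in the second line drops out.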
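As a numerical cross-check of the even and odd cases, here's a small Python sketch that compares a symmetric difference quotient against n⋅u^(n-1) at a couple of negative values of u. The helper names, the step size h, and the tolerance are my own choices for the illustration, not part of the argument above.

```python
def central_difference(f, x, h=1e-6):
    """Estimate f'(x) with a symmetric difference quotient."""
    return (f(x + h) - f(x - h)) / (2 * h)

def power_rule(x, n):
    """The claimed derivative of x**n, namely n * x**(n - 1)."""
    return n * x ** (n - 1)

# Even and odd exponents, positive and negative, at negative x only.
for n in (2, 3, 4, -1, -2, -3):
    for u in (-0.5, -2.0):
        numeric = central_difference(lambda t: t ** n, u)
        claimed = power_rule(u, n)
        # Loose relative tolerance: this is a floating point estimate.
        assert abs(numeric - claimed) <= 1e-5 * abs(claimed)
        print(f"n={n:2d}, u={u}: numeric={numeric:.6f}, n*u^(n-1)={claimed:.6f}")
```

Note that the sign of the result takes care of itself: for even n the derivative at negative u comes out negative, and for odd n it comes out positive, with no special casing needed.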
6. The derivative of x^(1/n), for integer n≠0
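For a function of a single variable, it should be clear that
$$\frac{dy}{dx} = \frac{1}{dx/dy}$$
wherever its derivative is defined and nonzero.
Furthermore, if we have y=x^(1/n), then, if we raise
each side to the nth
power, we see that x=y^n. (This
would benefit from a figure and I may add one at some point.)
So, if y=x^(1/n), we have
$$\frac{dy}{dx} = \frac{1}{dx/dy} = \frac{1}{n\,y^{n-1}} = \frac{1}{n}\,x^{-(n-1)/n} = \frac{1}{n}\,x^{\frac{1}{n}-1}$$
as was to be shown.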
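7. The derivative of x^(m/n) for integers m≠0, n≠0
We just use the chain rule. We rewrite x^(m/n) as (x^(1/n))^m.
Then we have:
$$\frac{d}{dx}\left(x^{1/n}\right)^{m} = m\left(x^{1/n}\right)^{m-1}\cdot\frac{1}{n}\,x^{\frac{1}{n}-1} = \frac{m}{n}\,x^{\frac{m-1}{n}+\frac{1}{n}-1} = \frac{m}{n}\,x^{\frac{m}{n}-1}$$
which was to be shown.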
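The chain rule step can be checked numerically, too. The following is a minimal sketch under my own naming (`chain_rule_derivative`, `power_rule_real`) with a handful of arbitrary exponents m/n and positive values of x; it multiplies the "outer" and "inner" derivatives separately and compares the product with (m/n)⋅x^(m/n-1).

```python
import math

def chain_rule_derivative(x, m, n):
    """d/dx of (x**(1/n))**m, built from the two pieces of the chain rule."""
    y = x ** (1.0 / n)
    outer = m * y ** (m - 1)                      # d(y**m)/dy
    inner = (1.0 / n) * x ** (1.0 / n - 1.0)      # d(x**(1/n))/dx
    return outer * inner

def power_rule_real(x, r):
    """The claimed derivative of x**r, namely r * x**(r - 1)."""
    return r * x ** (r - 1.0)

# Sample exponents m/n, including negative m and negative n; x > 0 only.
for m, n in ((1, 2), (3, 2), (-2, 3), (5, -4)):
    for x in (0.5, 2.0, 10.0):
        via_chain = chain_rule_derivative(x, m, n)
        direct = power_rule_real(x, m / n)
        assert math.isclose(via_chain, direct, rel_tol=1e-9)
        print(f"m/n={m}/{n}, x={x}: {via_chain:.6f} vs {direct:.6f}")
```

This only confirms that the two expressions agree in floating point; it isn't a substitute for the algebra.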
8. The derivative of x^r, for real r≠0
Fourth down, and I'm tired. Time to punt.
Since d(x^r)/dx is a continuous function of the exponent r as well as of x,
its value at any irrational exponent is the limit of its values at rational
exponents which approach that exponent. Consequently the formula for
the derivative we've found, which is valid at all rational exponents, must
be valid at all real exponents.
Taking the limit, and showing that it converges to the same expression
we found for the rational points, is straightforward, and is very
similar to what we did on the Integrals of Powers of X page, §5, "The
Integral of x^s, for any positive real s".
I may expand this section later.
Page created on 9/18/2013