Like the integral of $x^n$, the power rule for derivatives,
$$\frac{d\,x^n}{dx} = n\,x^{n-1}$$
is one of the most basic facts in calculus. Consequently, it
deserves to be thoroughly understood -- yet a thorough understanding of
this is not provided by most introductory calculus books (not the ones
I've seen, anyway).
It's easy enough to prove it algebraically. Before we get
into our geometric argument, here's a brief and sloppy proof using the
binomial theorem (aka Pascal's Triangle, which may be familiar from high
school algebra classes).
We need to find the increase in $x^n$ when we increase x by a small
amount, which we'll call δx:
$$\delta(x^n) = (x+\delta x)^n - x^n \tag{1}$$
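It's not obvious from (1) what the difference is. We need to separate
x and δx, which appear mushed together in the "$(x+\delta x)^n$"
expression on the right of (1). Expanding using the binomial theorem
will do the job:
$$(x+\delta x)^n = x^n + n\,x^{n-1}\,\delta x + \binom{n}{2}\,x^{n-2}\,(\delta x)^2 + \cdots + (\delta x)^n \tag{2}$$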
Discarding everything higher than first order in δx, and plugging (2) into
(1), we obtain:
$$\delta(x^n) \doteq n\,x^{n-1}\,\delta x \tag{3}$$
where we've used the dotted equal sign to indicate equality only to first
order.
Dividing through (3) by δx and reducing the change to an infinitesimal
value (and replacing the 'δ' characters with 'd' characters) leads to the
familiar result:
$$\frac{d\,x^n}{dx} = n\,x^{n-1}$$
That was fast, painless, and ... kind of opaque. One almost
has a feeling that the coefficient 'n' was pulled out of a hat. It
wasn't entirely pulled out of a hat, of course. It represents the
number of ways we can form the second term in the binomial expansion.
However, I, at least, still can't see any intuitive meaning to the
(n choose 1) coefficient on the second term when I look at (2).
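As a quick sanity check on the first-order claim, here is a minimal Python
sketch. The function names are mine, made up purely for illustration. It
compares the exact increase in $x^n$ with the estimate $n\,x^{n-1}\,\delta x$;
if the algebra above is right, the ratio of the two should approach 1 as δx
shrinks.

```python
def exact_change(x, n, dx):
    """Exact increase in x**n when x grows by dx."""
    return (x + dx) ** n - x ** n


def first_order_change(x, n, dx):
    """First-order estimate of the same increase: n * x**(n-1) * dx."""
    return n * x ** (n - 1) * dx


# Arbitrary sample values, chosen only for the demonstration.
x, n = 2.0, 5
for dx in (1e-1, 1e-2, 1e-3):
    ratio = exact_change(x, n, dx) / first_order_change(x, n, dx)
    print(f"dx={dx:g}  exact/first-order = {ratio:.6f}")
```

The leftover difference between the two is exactly the pile of higher order
terms we discarded from the binomial expansion.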
On the rest of this page, we'll produce a visual justification for both
the form of the result and the coefficient, and we'll also extend the
result to arbitrary real exponents.
1. Working from the (Flat) Graphs: No Joy Here
We can see pretty easily by inspection of the graphs that the derivative
of $x^0$ is 0 (since it's a constant function), and the derivative of
$x^1$ is 1 (since it's a diagonal line, with slope 1). But when we get
to $x^2$, it's a lot less obvious.
If we peek at the integral of x we get a strong clue as to what's
going on with $x^2$; it's the area of a triangle, and we can almost see
where that 2 comes from in $dx^2/dx$. For higher order functions,
though, just looking at their graphs does not make it clear what's going
on (at least, not to me). But $x^n$ isn't just a function we can graph;
it has a physical interpretation, as well.
2. What Is $x^n$?
We encounter this in high school algebra and physics; we just need to
recall it in the context of calculus:
- x is a length.
- $x^2$ is an area.
- $x^3$ is a volume.
In general, $x^n$ is the n-volume of an n-cube with side length x.
And that was the "Aha!" moment; the rest is details.
3. The Change in $x^n$ Due to a Change in x, for integer n>0
If x is a length, then a tiny change in x is also a tiny
length. And so we have the remarkably undistinguished result
$$\delta(x^1) = \delta x \tag{3.1}$$
where we have used the symbol "δ" (rather than "d") to indicate a
tiny, but finite, difference.
But now, consider $x^2$, which is the area of a square with side
length x. When we increase x by a small amount, δx, we can see from
figure 1 that the increase in $x^2$ is
$$\delta(x^2) = 2\,x\,\delta x + (\delta x)^2 \tag{3.2}$$
When we divide through (3.2) by δx and reduce it to an infinitesimal
value, we obtain the derivative:
$$\frac{d\,x^2}{dx} = 2\,x$$
The picture tells the story pretty well all on its own. None the
less, there are three things worth emphasizing here.
- The term x⋅δx is the product of the small change in x,
with the length (or 1-dimensional "area") of one face
of the square.
- The factor of 2 is there because we're growing the
square along two axes.
If the square has its lower left corner at the origin, then the faces
which lie along the vertical line passing through (x,0) and the
horizontal line which passes through (0,x) have each shifted
outward by δx.
In other words, for each pair of faces (of which the square
has two), we've moved one face out by δx. The square has two
pairs of faces; hence, we need to add up 2 copies of the term x⋅δx.
- The piece shown as "second order" in figure 1,
which is the fragment in the upper right corner, is second order in
δx. As we shrink δx down to an infinitesimal size, the second
order term essentially vanishes, since it's a factor of δx smaller
than the two rectangles of size xδx. Consequently, we can ignore
it. To put it another way, when we're finding a derivative we
only care about the first order (linear) behavior of the function.
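The bookkeeping in figure 1 can be sketched in a few lines of Python. This is
only an illustration under my own naming (`square_growth_pieces` is a made-up
helper): it splits the growth of the square into the two thin rectangles and
the small corner square, checks that they add up to the full change in area,
and shows the corner becoming negligible next to the rectangles.

```python
def square_growth_pieces(x, dx):
    """Split the growth of a square of side x into the pieces of figure 1."""
    rectangles = 2 * x * dx  # two thin strips, one for each axis
    corner = dx * dx         # the small "second order" square
    return rectangles, corner


x = 3.0  # arbitrary side length for the demonstration
for dx in (0.1, 0.01, 0.001):
    rectangles, corner = square_growth_pieces(x, dx)
    total = (x + dx) ** 2 - x ** 2
    print(f"dx={dx:g}  total={total:.6f}  pieces={rectangles + corner:.6f}"
          f"  corner/rectangles={corner / rectangles:.6f}")
```

The last column is just δx/(2x), so it shrinks in step with δx, which is the
sense in which the corner is "a factor of δx smaller" than the rectangles.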
[Figure 2: Change in $x^3$]
And now let's consider the change in $x^3$ when x changes by a tiny
amount. We've shown it in figure 2, and it's exactly analogous to the
change in $x^2$ shown in figure 1. The picture tells the story; we'll
now describe in words what's going on.
The cube itself has volume $x^3$. When we increase x
by a small amount, we "grow" the cube along each of the three axes.
The result is that we add on three panels, one on each face of the
cube that's being "grown".
The volume of one of the "panels" is the area of the face, times
the amount by which we're increasing x, which is the thickness of
the "panel". The area of one face is $x^2$, and the amount we're
increasing x by is δx, so the volume of one of the "panels" is
$x^2\cdot\delta x$.
We're adding three "panels" because we're growing along three axes
-- one axis for each pair of faces. So, the total amount by which
we're increasing the volume of the cube is, to first order,
$$\delta(x^3) \doteq 3\,x^2\,\delta x$$
The reason the factor of 3 appears is that we're growing along
three axes. The reason for the factor $x^2$ is that it is the area of
one face of the cube, which has dimension one smaller than the cube as
a whole. One "face" of a square has dimension 1, one "face" of a cube
has dimension 2, one "face" of a 4-dimensional hypercube has
dimension 3, and so forth.
[Figure 3: The cube, with higher order "caulking" shown]
Before I go on, I should mention that I left out the "higher order"
changes in the volume entirely in figure 2, to
save clutter and keep the visualization simple. There's actually a
bit of second order "caulking" along each of the three visible edges of
the cube, and there's a tiny third order "cubelet" at the corner.
Since they all are higher than first order in δx they don't matter to the
derivative, as their contribution to the change in volume is vanishingly
small when we make δx very small.
For completeness, I've shown the cube with the "caulking" in place in
figure 3. The three green boxes have total volume $3x\cdot(\delta x)^2$
and the gray "cubelet" in the corner has volume $(\delta x)^3$. The
ratio of the total volume of the caulking to the volume of the
three "panels" approaches zero as we make δx very small, which is why we
can ignore it when finding the derivative.
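Here is the same accounting for the cube as a small Python sketch. The helper
name `cube_growth_pieces` is hypothetical, and the piece volumes are the ones
quoted above for the panels, the caulking, and the cubelet. It confirms that
the three kinds of pieces account for the whole change in volume, and that
everything except the panels fades away as δx shrinks.

```python
def cube_growth_pieces(x, dx):
    """Volumes added to a cube of side x when each side grows by dx."""
    panels = 3 * x ** 2 * dx    # three first-order slabs, one per axis
    caulking = 3 * x * dx ** 2  # three second-order strips along the edges
    cubelet = dx ** 3           # one third-order cube at the corner
    return panels, caulking, cubelet


x = 2.0  # arbitrary side length for the demonstration
for dx in (0.1, 0.01, 0.001):
    panels, caulking, cubelet = cube_growth_pieces(x, dx)
    total = (x + dx) ** 3 - x ** 3
    leftover_ratio = (caulking + cubelet) / panels
    print(f"dx={dx:g}  total={total:.9f}"
          f"  pieces={panels + caulking + cubelet:.9f}"
          f"  (caulking+cubelet)/panels={leftover_ratio:.6f}")
```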
3.1 Higher Powers of x
For higher powers of x, the situation is exactly analogous
to the two and three dimensional cases discussed above, though I won't
attempt to draw any higher dimensional figures.
We can visualize the expression $x^n$ as representing the volume of an
n dimensional hypercube with side length x. As we saw on the hypercubes
page, an n dimensional hypercube has n pairs of faces, with one pair
of faces lying along each of the n axes of $R^n$. The area of one face
of a hypercube with side length x is $x^{n-1}$. When we grow the
hypercube, by increasing x by δx, we increase the volume of the cube by
adding a (thin) "panel" to each face that's being "grown".
The volume of a "panel" is the area of the face, times its
thickness, which is δx. We're growing the hypercube, and adding one
"panel", along each axis, and there are n axes. So, the amount of
volume we add, to first order, is
$$\delta(x^n) \doteq n\,x^{n-1}\,\delta x$$
This completes our comments on $dx^n/dx$ for positive integer
values of n. On the rest of this page we'll be generalizing
that result to other values for n, and to negative values of x.
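Since I can't draw the higher dimensional pieces, here is a rough Python
sketch that counts them instead. It leans on the binomial expansion (2) from
the top of the page: the term of order k in δx has coefficient (n choose k),
so the k=1 entry is the n "panels" and everything after it is higher order
filler. The function name is my own invention for this illustration.

```python
from math import comb


def hypercube_growth_by_order(x, dx, n):
    """Volume added to an n-cube of side x, grouped by order k = 1..n in dx."""
    return [comb(n, k) * x ** (n - k) * dx ** k for k in range(1, n + 1)]


# Arbitrary sample values, chosen only for the demonstration.
x, dx, n = 2.0, 0.01, 6
pieces = hypercube_growth_by_order(x, dx, n)
total = (x + dx) ** n - x ** n
print(f"exact change        = {total:.9f}")
print(f"sum of all pieces   = {sum(pieces):.9f}")
print(f"panels alone (k=1)  = {pieces[0]:.9f}")
```

The panels alone already capture nearly all of the change, and the share
taken by the higher order pieces shrinks further as δx does.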
4. The derivative of $x^{-n}$, for integer n>0
[Figure 4: Change in $(1/x)^2$]
To find the derivative of a negative integer power of x, we just
rewrite it as a positive power of 1/x,
$$x^{-n} = \left(\frac{1}{x}\right)^n \tag{4.1}$$
and use the same visualization we already used in section 3
for a positive integer power. The only tricky bit is that
the hypercube, which grew in figures 1 and 2, shrinks
instead of growing when we increase x.
Consequently, the "panel thickness" is going to be negative
rather than positive (see figure 4 for the case n=2).
To complete this, we need to figure out, to first order, how much 1/x
changes when we increase x by δx.
That's not obvious by inspection (to me, at least) so let's expand the
expression 1/(x+δx) in terms of 1/x. We only need the value to first
order in δx so it's easy and there's no need to use any fancy
tools we haven't developed yet (like Taylor series). We can just
divide it out, using polynomial long division, and stop when the remainder
is second order in δx (the division's easy, but typesetting it in LaTeX
was a major pain, BTW):
$$\frac{1}{x+\delta x} = \frac{1}{x} - \frac{\delta x}{x^2} + \frac{(\delta x)^2}{x^2\,(x+\delta x)} \tag{4.2}$$
And now, using (4.2), we can find the small change in 1/x due to a small
change in x:
$$\delta\left(\frac{1}{x}\right) = \frac{1}{x+\delta x} - \frac{1}{x} \doteq -\frac{\delta x}{x^2} \tag{4.3}$$
(Note that we have again used the dotted equal sign to mean "equal to first
order". Note also that we just found the derivative of 1/x,
though that wasn't what we set out to do.)
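For anyone who (like me) doesn't find the change in 1/x obvious by
inspection, here is a small numerical check in Python. It is only a sketch
with made-up function names: it compares the exact change in 1/x with the
first-order estimate -δx/x², and the ratio should head toward 1 as δx
shrinks.

```python
def exact_change_in_reciprocal(x, dx):
    """Exact change in 1/x when x increases by dx."""
    return 1.0 / (x + dx) - 1.0 / x


def first_order_change_in_reciprocal(x, dx):
    """First-order estimate of that change: -dx / x**2."""
    return -dx / x ** 2


x = 2.0  # arbitrary positive sample value
for dx in (1e-1, 1e-2, 1e-3):
    exact = exact_change_in_reciprocal(x, dx)
    estimate = first_order_change_in_reciprocal(x, dx)
    print(f"dx={dx:g}  exact={exact:.8f}  estimate={estimate:.8f}"
          f"  ratio={exact / estimate:.6f}")
```

Both numbers are negative for positive x and δx, which is the "negative panel
thickness" in action.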
Now, using the change in 1/x provided by (4.3) in place of the simple
change in x, and using the fact that the area of one face of a
hypercube of side length L is $L^{n-1}$, we can find the volume of one
of the panels we're subtracting from the cube (in exact analogy to the
arguments given for positive n in the previous section). We've shown
this graphically for n=2 in figure 4.
$$\left(\frac{1}{x}\right)^{n-1}\cdot\left(-\frac{\delta x}{x^2}\right) = -\frac{\delta x}{x^{n+1}}$$
And finally, using the fact that an n dimensional hypercube is
going to be growing (or, rather, shrinking) along n axes, so its
total change in volume will be n times the volume of one panel, we
can find the change in volume of a hypercube of side length 1/x
when x changes by a small amount:
$$\delta\left[\left(\frac{1}{x}\right)^n\right] \doteq -n\,\frac{\delta x}{x^{n+1}}$$
And we're done with the hard part.
Rewriting $(1/x)^n$ as $x^{-n}$, and then dividing through
by δx, reducing the change to an infinitesimal value, and replacing the δ
characters with 'd' characters, gives us the derivative. To make it
look more like the usual result, we've also substituted h=-n in
the following (note that h is negative):
$$\frac{d\,x^h}{dx} = h\,x^{h-1}$$
5. The derivative of $x^n$, for integer n, and x<0
This is most easily dealt with through inspection of the graphs of even
and odd functions.
From the graphs, it seems clear that, for an even function, f, we have
$$f'(-x) = -f'(x) \tag{5.1}$$
Conversely, for an odd function, g, we have
$$g'(-x) = g'(x) \tag{5.2}$$
(5.1 and 5.2 can be proved with a little symbol pushing but it doesn't add
a lot to the exposition so we won't.)
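In place of the symbol pushing, here is a quick numerical spot check in
Python. It is a sketch, not a proof. The two example polynomials and the
central-difference helper are my own choices for illustration. For the even
example the slopes at -x and x should be negatives of each other, and for the
odd example they should match.

```python
def numeric_derivative(f, x, dx=1e-6):
    """Central-difference estimate of the derivative of f at x."""
    return (f(x + dx) - f(x - dx)) / (2 * dx)


def even_example(x):
    """An even function: only even powers of x."""
    return x ** 4 - 3 * x ** 2


def odd_example(x):
    """An odd function: only odd powers of x."""
    return x ** 3 + x


x = 1.2  # arbitrary sample point
print("even:", numeric_derivative(even_example, -x),
      numeric_derivative(even_example, x))
print("odd: ", numeric_derivative(odd_example, -x),
      numeric_derivative(odd_example, x))
```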
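Consequently, for any even power of x, and for an
arbitrary negative value u, we have
$$\begin{aligned}
\left.\frac{d\,x^n}{dx}\right|_{x=u} &= -\left.\frac{d\,x^n}{dx}\right|_{x=-u} \\
&= -n\,(-1)^{n-1}\,u^{n-1} \\
&= n\,u^{n-1}
\end{aligned}$$
where we've used the fact that even powers of x are even
functions, and we already know the derivative for x>0. We
also used the fact that -1 raised to an odd power is -1, as a result of
which the two "-" signs in the second line cancel.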
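For any odd power of x, and for an arbitrary negative value u,
we have
$$\begin{aligned}
\left.\frac{d\,x^n}{dx}\right|_{x=u} &= \left.\frac{d\,x^n}{dx}\right|_{x=-u} \\
&= n\,(-1)^{n-1}\,u^{n-1} \\
&= n\,u^{n-1}
\end{aligned}$$
where we've used the fact that odd powers of x are odd functions,
and we already know the derivative for x>0. We also used
the fact that -1 raised to an even power is 1, so the "-" sign in the
second line drops out.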
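To see that the same formula really does hold on the negative side, here is
a short Python sketch that compares a finite-difference slope with
$n\,u^{n-1}$ at a negative point u, for a few even and odd exponents
(negative ones included). The helper names and sample values are
hypothetical, picked only for this check.

```python
def numeric_derivative(f, x, dx=1e-6):
    """Central-difference estimate of the derivative of f at x."""
    return (f(x + dx) - f(x - dx)) / (2 * dx)


def power_rule(n, x):
    """The claimed derivative of x**n at x, for integer n."""
    return n * x ** (n - 1)


u = -1.5  # arbitrary negative sample point
for n in (-3, -2, 2, 3):
    estimate = numeric_derivative(lambda x, n=n: x ** n, u)
    print(f"n={n:2d}  numeric={estimate:.6f}  formula={power_rule(n, u):.6f}")
```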
6. The derivative of $x^{1/n}$, for integer n≠0
For a function of a single variable, it should be clear that
$$\frac{dx}{dy} = \frac{1}{dy/dx}$$
wherever its derivative is defined and nonzero.
Furthermore, if we have $y=x^{1/n}$, then, if we raise
each side to the nth power, we see that $x=y^n$. (This section
would benefit from a figure and I may add one at some point.)
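So, if $y=x^{1/n}$, we have
$$\frac{dy}{dx} = \frac{1}{dx/dy} = \frac{1}{n\,y^{n-1}} = \frac{1}{n}\,x^{-\frac{n-1}{n}} = \frac{1}{n}\,x^{\frac{1}{n}-1}$$
as was to be shown.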
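The inverse-function step can be traced numerically with a small Python
sketch. The function names are mine, and I'm assuming x>0 and a positive
integer n to keep the floating point roots well behaved. One function follows
the argument literally (find y, take the derivative of $y^n$, invert it); the
other just evaluates the final formula.

```python
def root_derivative_via_inverse(x, n):
    """dy/dx for y = x**(1/n), computed as 1 / (dx/dy) with x = y**n."""
    y = x ** (1.0 / n)        # the point on the curve
    dx_dy = n * y ** (n - 1)  # derivative of y**n from the integer case
    return 1.0 / dx_dy


def root_derivative_formula(x, n):
    """The closed form (1/n) * x**(1/n - 1)."""
    return (1.0 / n) * x ** (1.0 / n - 1.0)


# Arbitrary sample values, chosen only for the demonstration.
for x, n in ((8.0, 3), (16.0, 4), (2.0, 5)):
    print(f"x={x:g} n={n}  via inverse={root_derivative_via_inverse(x, n):.10f}"
          f"  formula={root_derivative_formula(x, n):.10f}")
```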
7. The derivative of $x^{m/n}$ for integers m≠0, n≠0
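We just use the chain rule. We rewrite $x^{m/n}$ as $(x^{1/n})^m$.
Then we have:
$$\frac{d}{dx}\left(x^{1/n}\right)^m = m\left(x^{1/n}\right)^{m-1}\cdot\frac{1}{n}\,x^{\frac{1}{n}-1} = \frac{m}{n}\,x^{\frac{m-1}{n}+\frac{1}{n}-1} = \frac{m}{n}\,x^{\frac{m}{n}-1}$$
which was to be shown.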
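Here is the chain rule step spelled out as a Python sketch, with
function names I made up for the purpose and x>0 assumed. The first function
multiplies the outer power rule by the inner root derivative from section 6;
the second evaluates (m/n)·x^(m/n-1) directly. The two should agree to
rounding error.

```python
def rational_power_derivative_via_chain_rule(x, m, n):
    """Derivative of (x**(1/n))**m via the chain rule, for x > 0."""
    y = x ** (1.0 / n)                        # inner function x**(1/n)
    dy_dx = (1.0 / n) * x ** (1.0 / n - 1.0)  # its derivative (section 6)
    return m * y ** (m - 1) * dy_dx           # outer power rule times inner


def rational_power_rule(x, m, n):
    """The closed form (m/n) * x**(m/n - 1)."""
    return (m / n) * x ** (m / n - 1.0)


# Arbitrary sample values, chosen only for the demonstration.
for x, m, n in ((4.0, 3, 2), (8.0, 2, 3), (5.0, -3, 4)):
    print(f"x={x:g} m={m} n={n}"
          f"  chain rule={rational_power_derivative_via_chain_rule(x, m, n):.10f}"
          f"  formula={rational_power_rule(x, m, n):.10f}")
```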
8. The derivative of $x^r$, for real r≠0
Fourth down, and I'm tired. Time to punt.
Since $dx^r/dx$ is a continuous function of x and of r,
its value at any irrational point is the limit of its value at rational
points which approach that point. Consequently the formula for the
derivative we've found, which is valid at all rational points, must be
valid at all real points.
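The limiting argument can at least be watched in action. The Python sketch
below is an illustration rather than a proof; the function names and the
choice of r = √2 are mine. It builds rational approximations to an irrational
exponent using the standard library's `Fraction.limit_denominator`, and
evaluates the rational-exponent formula at each one.

```python
from fractions import Fraction
import math


def power_rule(r, x):
    """The formula r * x**(r - 1), evaluated in floating point."""
    return r * x ** (r - 1)


def rational_approximations(r, denominators=(10, 100, 1000, 10000)):
    """Closest fractions to r with bounded denominators."""
    return [Fraction(r).limit_denominator(d) for d in denominators]


# Arbitrary sample values, chosen only for the demonstration.
r, x = math.sqrt(2), 3.0
target = power_rule(r, x)
for q in rational_approximations(r):
    value = power_rule(float(q), x)
    print(f"{str(q):>12}  {value:.10f}  error={abs(value - target):.2e}")
```

As the fractions close in on √2, the values of the formula close in on
r·x^(r-1), which is all the punt above is claiming.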
Taking the limit, and showing that it converges to the same expression we
found for the rational points, is straightforward, and is very similar to
what we did on the Integrals of Powers of X page, §5, "The
Integral of $x^s$, for any positive real s".
I may expand this section later.
Page created on 9/18/2013