Some Physics Insights

The Derivative of $x^n$

Like the integral of $x^n$, this is one of the most basic facts in calculus.  Consequently, it deserves to be thoroughly understood -- yet a thorough understanding of this is not provided by most introductory calculus books (not the ones I've seen, anyway).

It's easy enough to prove it algebraically.  Before we get into our geometric argument, here's a brief and sloppy proof using the binomial theorem (whose coefficients come from Pascal's Triangle, which may be familiar from high school algebra classes).

We need to find the increase in $x^n$ when we increase x by a small amount, which we'll call δx:

$$\delta(x^n) = (x + \delta x)^n - x^n \tag{1}$$

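It's not obvious from (1) what the difference is.  We need to separate x and δx, which appear mushed together in the "$(\;)^n$" expression on the right of (1).  Expanding using the binomial theorem will do the job:

$$(x + \delta x)^n = x^n + \binom{n}{1} x^{n-1}\,\delta x + \binom{n}{2} x^{n-2}\,(\delta x)^2 + \cdots + (\delta x)^n \tag{2}$$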

Discarding everything higher than first order in δx, and plugging (2) into (1), we obtain:

$$\delta(x^n) \doteq n\,x^{n-1}\,\delta x \tag{3}$$

where we've used the dotted equal sign to indicate equality only to first order.

Dividing through (3) by δx and reducing the change to an infinitesimal value (and replacing the 'δ' characters with 'd' characters) leads to the derivative, $d(x^n)/dx = n\,x^{n-1}$.
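
The algebraic argument above can be checked numerically. The following is a minimal sketch, not part of the original page; the helper name `binomial_terms` and the sample values of x and n are chosen here purely for illustration. It lists the terms of the binomial expansion and compares the full change in $x^n$, divided by δx, with the first-order term alone.

```python
from math import comb

def binomial_terms(x, dx, n):
    """Terms C(n, k) * x**(n-k) * dx**k of (x + dx)**n, for k = 0..n."""
    return [comb(n, k) * x ** (n - k) * dx ** k for k in range(n + 1)]

x, n = 2.0, 5  # illustrative values, not from the original page
for dx in (1e-1, 1e-2, 1e-3):
    terms = binomial_terms(x, dx, n)
    full_change = sum(terms[1:])   # (x + dx)**n - x**n: every term except x**n
    first_order = terms[1]         # the k = 1 term, n * x**(n-1) * dx
    # The first column of quotients approaches the second as dx shrinks.
    print(dx, full_change / dx, first_order / dx)
```

As δx shrinks, the two quotients should agree to more and more digits, which is all that "equal to first order" is claiming.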

That was fast, painless, and ... kind of opaque.  One almost has a feeling that the coefficient 'n' wasn't entirely pulled from a hat.  It represents the number of ways we can form the second term in the binomial expansion.  However, I, at least, still can't see any physical significance to the (n choose 1) coefficient on the second term when I look at (2).

On the rest of this page, we'll produce a visual justification for both the form of the result and the coefficient, and we'll also extend the result to arbitrary real exponents.

1. Working from the (Flat) Graphs:  No Joy Here

We can see pretty easily by inspection of the graphs that the derivative of $x^0$ is 0 (since it's a constant function), and the derivative of $x^1$ is 1 (since it's a diagonal line, with slope 1).  But when we get to $x^2$, it's a lot less clear.
If we peek at the integral of x we get a strong clue as to what's going on with $x^2$; it's the area of a triangle, and we can almost see where that 2 comes from in $d(x^2)/dx = 2x$.  For higher order functions, though, just looking at their graphs does not make it clear what's going on (at least for me).
But $x^n$ isn't just a function we can graph; it has a physical interpretation, as well.

2. What Is $x^n$?

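We encounter this in high school algebra and physics; we just need to recall it in the context of calculus: x is a length, $x^2$ is the area of a square with side length x, and $x^3$ is the volume of a cube with side length x.
In general, $x^n$ is the n-volume of an n dimensional hypercube with side length x.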

And that was the "Aha!" moment; the rest is details.

3. The Change in $x^n$ Due to a Change in x, for integer n>0

Figure 1:  Change in $x^2$

If x is a length, then a tiny change in x is also a tiny length.  And so we have the remarkably undistinguished result

$$\delta(x^1) = \delta x \tag{3.1}$$

where we have used the symbol "δ" (rather than "d") to indicate a tiny, but finite, difference.

But now, consider $x^2$, which is the area of a square with side length x.  When we increase x by a small increment, δx, we can see from figure 1 that the increase in $x^2$ must be:

$$\delta(x^2) = 2\,x\,\delta x + (\delta x)^2 \tag{3.2}$$

When we divide through (3.2) by δx and reduce it to an infinitesimal value, we obtain the derivative:

$$\frac{d(x^2)}{dx} = 2x \tag{3.3}$$

The picture tells the story pretty well all on its own.  Nonetheless, there are three things worth emphasizing here.
  1. The term x⋅δx is the product of the small change in x, with the length (or 1-dimensional "area") of one face of the square.

  2. The factor of 2 is there because we're growing the square along two axes.
    If the square has its lower left corner at the origin, then the faces which lie along the vertical line passing through (x,0) and the horizontal line which passes through (0,x) have each shifted outward by δx.
    In other words, for each pair of faces (of which the square has two), we've moved one face out by δx.  The square has two pairs of faces; hence, we need to add up 2 copies of the term xδx.

  3. The piece shown as "second order" in figure 1, which is the fragment in the upper right corner, is second order in δx.  As we shrink δx down to an infinitesimal size, the second order term essentially vanishes, since it's a factor of δx smaller than the two rectangles of size xδx.  Consequently, we can ignore it.  To put it another way, when we're finding a derivative we only care about the first order (linear) behavior of the function.
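
The three points above can be sketched in a few lines of code. This is a hedged illustration, not something from the original page; the function name `square_growth` and the sample side length are assumptions made here. It uses exact fractions so that the pieces can be compared to the true change in area without rounding noise.

```python
from fractions import Fraction

def square_growth(x, dx):
    """Pieces added to a square of side x when the side grows by dx."""
    strips = 2 * x * dx   # two rectangles, each x by dx (one per axis)
    corner = dx * dx      # the small second-order square in the corner
    return strips, corner

x = Fraction(3)  # illustrative side length
for dx in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
    strips, corner = square_growth(x, dx)
    # The pieces account for the whole added area exactly.
    assert strips + corner == (x + dx) ** 2 - x ** 2
    # The corner's share of the change shrinks in proportion to dx.
    print(dx, float(strips / dx), float(corner / strips))
```

The printed ratio of corner to strips drops by a factor of ten each time δx does, which is the sense in which the corner is "a factor of δx smaller".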

Figure 2: Change in $x^3$

And now let's consider the change in $x^3$ when x changes by a tiny amount.  We've shown it in figure 2, and it's exactly analogous to the change in $x^2$ shown in figure 1.  The picture tells the story; we'll now describe in words what's going on.

The cube itself has volume $x^3$.  When we increase x by a small amount, we "grow" the cube along each of the three axes.  The result is that we add on three panels, one on each face of the cube that's being "grown".
The volume of one of the "panels" is the area of the face, times the amount by which we're increasing x, which is the thickness of the "panel".  The area of one face is $x^2$, and the amount we're increasing x by is δx, so the volume of each of the "panels" is $x^2 \cdot \delta x$.
We're adding three "panels" because we're growing along three axes -- one axis for each pair of faces.  So, the total amount by which we're increasing the volume of the cube is, to first order,

$$\delta(x^3) \doteq 3\,x^2\,\delta x \tag{3.4}$$

In summary,

The reason the factor of 3 appears is that we're growing along three axes.
The reason for the factor $x^2$ is that it is the area of one face of the cube, which has dimension one smaller than the cube as a whole.  One "face" of a square has dimension 1, one "face" of a cube has dimension 2, one "face" of a 4-dimensional hypercube has dimension 3, and so forth.

Figure 3:  Cube with higher order "caulking" shown

Before I go on, I should mention that I left out the "higher order" changes in the volume entirely in figure 2, to save clutter and keep the visualization simple.  There's actually a bit of second order "caulking" along each of the three visible edges of the cube, and there's a tiny third order "cubelet" at the corner.  Since they all are higher than first order in δx they don't matter to the derivative, as their contribution to the change in volume is vanishingly small when we make δx very small.

For completeness, I've shown the cube with the "caulking" in place in figure 3. The three green boxes have total volume $3x \cdot (\delta x)^2$, and the gray "cubelet" in the corner has volume $(\delta x)^3$.  The ratio of the total volume of the caulking to the volume of the three "panels" approaches zero as we make δx very small, which is why we can ignore it when finding the derivative.
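
The bookkeeping for the cube can be checked the same way. The sketch below is an illustration added under stated assumptions (the name `cube_growth` and the sample values are not from the original page). It adds up the panels, the caulking, and the cubelet, and confirms with exact fractions that together they make up the whole change in volume.

```python
from fractions import Fraction

def cube_growth(x, dx):
    """Volumes added to a cube of side x when the side grows by dx."""
    panels = 3 * x ** 2 * dx     # three first-order panels, one per axis
    caulking = 3 * x * dx ** 2   # three second-order strips along the edges
    cubelet = dx ** 3            # one third-order cube at the corner
    return panels, caulking, cubelet

x = Fraction(2)  # illustrative side length
for dx in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
    panels, caulking, cubelet = cube_growth(x, dx)
    # Together the pieces are exactly the change in volume.
    assert panels + caulking + cubelet == (x + dx) ** 3 - x ** 3
    # Higher-order volume relative to the panels; it shrinks with dx.
    print(dx, float((caulking + cubelet) / panels))
```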

3.1 Higher Powers of x

For higher powers of x,  the situation is exactly analogous to the two and three dimensional cases discussed above, though I won't attempt to draw any higher dimensional figures.
We can visualize the expression $x^n$ as representing the volume of an n dimensional hypercube with side length x.  As we saw on the hypercubes page, an n dimensional hypercube has n pairs of faces, with one pair of faces lying along each of the n axes of $\mathbb{R}^n$.  The area of one face of a hypercube with side length x is $x^{n-1}$.  When we grow the hypercube, by increasing x by δx, we increase the volume of the cube by adding a (thin) "panel" to each face of the cube. The volume of a "panel" is the area of the face, times its thickness, which is δx.  We're growing the hypercube, and adding one "panel", along each axis, and there are n axes.  So, the amount of volume we add, to first order, is

$$\delta(x^n) \doteq n\,x^{n-1}\,\delta x \tag{3.5}$$

This completes our comments on $d(x^n)/dx$ for positive integer values of n.  On the rest of this page we'll be generalizing that result to other values for n, and to negative values of x.
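
Since the higher dimensional pictures can't be drawn, a numeric check may stand in for them. The following is a rough sketch with made-up sample values; `panel_estimate` is a hypothetical helper name, not something from the original page. It computes the volume of n panels, each with face area $x^{n-1}$ and thickness δx, and compares that with the actual change in the hypercube's volume.

```python
def panel_estimate(x, dx, n):
    """First-order volume added to an n-cube: n panels of face area x**(n-1)."""
    face_area = x ** (n - 1)
    return n * face_area * dx

x, n = 1.5, 7  # illustrative values; any x > 0 and integer n > 0 will do
for dx in (1e-2, 1e-4, 1e-6):
    actual = (x + dx) ** n - x ** n
    estimate = panel_estimate(x, dx, n)
    # Fraction of the change that the panels miss; it shrinks along with dx.
    print(dx, (actual - estimate) / actual)
```

The missed fraction is the higher-order "caulking" of the n-cube, and it falls roughly in proportion to δx.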

4. The derivative of $x^{-n}$, for integer n>0

Figure 4:  Change in $(1/x)^2$

To find the derivative of a negative integer power of x, we just invert x and use the same visualization we already used in section 3 for a positive integer power.  The only tricky bit is that the hypercube, as shown in figures 1 and 2 above, shrinks instead of growing when we increase x.  Consequently, the "panel thickness" is going to be negative, rather than positive (see figure 4 for the case n=2).

To complete this, we need to figure out, to first order, how much 1/x changes when we increase x by δx:

$$\delta\!\left(\frac{1}{x}\right) = \frac{1}{x + \delta x} - \frac{1}{x} \tag{4.1}$$

That's not obvious by inspection (to me, at least) so let's expand the fraction 1/(x+δx) in terms of 1/x.  We only need the value to first order in δx so it's easy and there's no need to use any fancy tools we haven't developed yet (like Taylor series).  We can just divide it out, using polynomial long division, and stop when the remainder is second order in δx (the division's easy, but typesetting it in LaTeX was a major pain, BTW):

$$\frac{1}{x + \delta x} = \frac{1}{x} - \frac{\delta x}{x^2} + \frac{(\delta x)^2}{x^2\,(x + \delta x)} \tag{4.2}$$

And now, using (4.2), we can find the small change in 1/x due to a small change in x:

$$\delta\!\left(\frac{1}{x}\right) \doteq -\frac{\delta x}{x^2} \tag{4.3}$$

(Note that we have again used the dotted equal sign to mean "equal to first order only".  Note also that we just found the derivative of 1/x, though that wasn't what we set out to do.)
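
For readers who would rather not redo the long division, here is a small check. It is a sketch under assumptions: `reciprocal_change` is a name invented here, and the remainder written in the code is what the division leaves over when it is stopped after the first-order term. Exact fractions are used so the identity can be asserted rather than eyeballed.

```python
from fractions import Fraction

def reciprocal_change(x, dx):
    """Exact change in 1/x, its first-order part, and the leftover remainder."""
    exact = 1 / (x + dx) - 1 / x
    first_order = -dx / x ** 2
    remainder = dx ** 2 / (x ** 2 * (x + dx))   # second order in dx
    return exact, first_order, remainder

x = Fraction(2)  # illustrative value
for dx in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
    exact, first_order, remainder = reciprocal_change(x, dx)
    # First-order part plus remainder reproduces the exact change.
    assert exact == first_order + remainder
    # Size of the remainder relative to the first-order part.
    print(dx, float(exact / dx), float(-remainder / first_order))
```

The first printed quotient heads toward $-1/x^2$, and the second column shows the remainder becoming negligible next to the first-order term.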

Now, using the change in 1/x provided by (4.3) in place of the simple change in x, and using the fact that the area of one face of a hypercube of side length L is $L^{n-1}$, we can find the volume of one of the panels we're subtracting from the cube (in exact analogy to the arguments given for positive n, in the previous section).  We've shown this graphically for n=2 in figure 4.

$$\text{panel volume} \doteq \left(\frac{1}{x}\right)^{n-1} \cdot \left(-\frac{\delta x}{x^2}\right) = -\frac{\delta x}{x^{n+1}} \tag{4.4}$$

And finally, using the fact that an n dimensional hypercube is going to be growing (or, rather, shrinking) along n axes, so its total change in volume will be n times the volume of one panel, we can find the change in volume of a hypercube of side length 1/x when x changes by a small amount:

$$\delta\!\left[\left(\frac{1}{x}\right)^{n}\right] \doteq -n\,\frac{\delta x}{x^{n+1}} \tag{4.5}$$

And we're done with the hard part.
Replacing 1/x with $x^{-1}$ and then dividing through by δx, reducing the change to an infinitesimal value, and replacing the δ characters with 'd' characters, gives us the derivative.  To make it look more like the usual result, we've also substituted h=-n in the following (note that h<0):

$$\frac{d(x^{-n})}{dx} = -n\,x^{-n-1} \quad\Longrightarrow\quad \frac{d(x^h)}{dx} = h\,x^{h-1} \tag{4.6}$$

5. The derivative of $x^n$, for integer n, and x<0

This is most easily dealt with through inspection of the graphs of even and odd functions.
From the graphs, it seems clear that, for an even function, f,

$$f'(-x) = -f'(x) \tag{5.1}$$

Conversely, for an odd function, g, we have

$$g'(-x) = g'(x) \tag{5.2}$$

(5.1 and 5.2 can be proved with a little symbol pushing but it doesn't add a lot to the exposition so we won't.)

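Consequently, for any even power of x, and for an arbitrary negative value u, we have

$$\begin{aligned} \left.\frac{d(x^n)}{dx}\right|_{x=u} &= -\left.\frac{d(x^n)}{dx}\right|_{x=-u} \\ &= -n\,(-u)^{n-1} \\ &= n\,u^{n-1} \end{aligned} \tag{5.3}$$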

where we've used the fact that even powers of x are even functions, and we already know the derivative for x>0.  We also used the fact that -1 raised to an odd power is -1, as a result of which the two "-" signs in the second line cancel.

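For any odd power of x, and for an arbitrary negative value u, we have:

$$\begin{aligned} \left.\frac{d(x^n)}{dx}\right|_{x=u} &= \left.\frac{d(x^n)}{dx}\right|_{x=-u} \\ &= n\,(-u)^{n-1} \\ &= n\,u^{n-1} \end{aligned} \tag{5.4}$$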

where we've used the fact that odd powers of x are odd functions, and we already know the derivative for x>0.  We also used the fact that -1 raised to an even power is 1, so the "-" sign in the second line drops out.
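
A quick numeric spot check of the claim for negative x is sketched below. It is an illustration only: the helper `central_difference`, the step size, and the sample point are assumptions made here rather than anything from the original page. It estimates the slope of $x^n$ at a negative point for several even and odd integer exponents, both positive and negative, and prints it next to $n\,u^{n-1}$.

```python
def central_difference(f, x, h=1e-6):
    """Symmetric finite-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

u = -1.7  # an arbitrary negative sample point
for n in (2, 3, 4, 5, -1, -2):
    numeric = central_difference(lambda x: x ** n, u)
    formula = n * u ** (n - 1)
    # The two columns should agree closely, including in sign.
    print(n, numeric, formula)
```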

6. The derivative of $x^{1/n}$, for integer n≠0

For a function of a single variable, it should be clear that

$$\frac{dy}{dx} = \frac{1}{dx/dy} \tag{6.1}$$

wherever its derivative is defined and nonzero.

Furthermore, if we have $y = x^{1/n}$, then, if we raise each side to the nth power, we see that $x = y^n$.  (This would benefit from a figure and I may add one at some point.)

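So, if $y = x^{1/n}$, we have

$$\frac{dy}{dx} = \frac{1}{dx/dy} = \frac{1}{n\,y^{n-1}} = \frac{1}{n}\,x^{-(n-1)/n} = \frac{1}{n}\,x^{\frac{1}{n} - 1} \tag{6.2}$$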

as was to be shown.
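
The inverse-function step can be traced in code. This is a minimal sketch, assuming x>0 and a positive integer n; the function name `root_derivative_via_inverse` and the sample values are invented here. It computes $y = x^{1/n}$, takes the slope of $x = y^n$ with respect to y, inverts it, and compares the result with $(1/n)\,x^{1/n - 1}$.

```python
def root_derivative_via_inverse(x, n):
    """dy/dx for y = x**(1/n), found as 1 / (dx/dy) with x = y**n."""
    y = x ** (1.0 / n)          # the point on the curve
    dx_dy = n * y ** (n - 1)    # power rule applied to x = y**n
    return 1.0 / dx_dy

x, n = 5.0, 3  # illustrative values
via_inverse = root_derivative_via_inverse(x, n)
via_formula = (1.0 / n) * x ** (1.0 / n - 1)
print(via_inverse, via_formula)  # the two should agree up to rounding
```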

7. The derivative of $x^{m/n}$ for integers m≠0, n≠0

We just use the chain rule.  We rewrite $x^{m/n}$ as:

$$x^{m/n} = \left(x^{1/n}\right)^m \tag{7.1}$$

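Then we have:

$$\frac{d}{dx}\left(x^{1/n}\right)^m = m\left(x^{1/n}\right)^{m-1} \cdot \frac{1}{n}\,x^{\frac{1}{n} - 1} = \frac{m}{n}\,x^{\frac{m-1}{n} + \frac{1}{n} - 1} = \frac{m}{n}\,x^{\frac{m}{n} - 1} \tag{7.2}$$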

which was to be shown.
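
As a concrete instance of the chain-rule argument, here is the case m=3, n=2 worked out (the choice of exponents is made here purely for illustration):

```latex
\frac{d}{dx}\,x^{3/2}
  = \frac{d}{dx}\left(x^{1/2}\right)^{3}
  = 3\left(x^{1/2}\right)^{2} \cdot \frac{1}{2}\,x^{-1/2}
  = \frac{3}{2}\,x^{1 - \frac{1}{2}}
  = \frac{3}{2}\,x^{1/2}
```

The outer factor comes from the integer power rule, the inner factor from the result for $x^{1/n}$, and the exponents simply add.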

8. The derivative of $x^r$, for real r≠0

Fourth down, and I'm tired.  Time to punt.

$d(x^n)/dx$ is a continuous function of x and n, so its value at any irrational point is the limit of its value at rational points which approach that point.  Consequently the formula for the derivative we've found, which is valid at all rational points, must be valid at all real points.

Taking the limit, and showing that it converges to the same expression we found for the rational points, is straightforward, and is very similar to what we did on the Integrals of Powers of X page, §5, "The Integral of $x^s$, for any positive real s".
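
The limiting argument is not carried out here, but it can at least be illustrated numerically. The sketch below is an assumption-laden illustration, not a proof: the exponent $\sqrt{2}$, the sample point, and the helper `central_difference` are all chosen here. It approximates the exponent by rationals of increasing denominator, evaluates the rational-exponent formula at each, and compares with a finite-difference slope of $x^{\sqrt{2}}$.

```python
from fractions import Fraction
from math import sqrt

def central_difference(f, x, h=1e-6):
    """Symmetric finite-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

x, r = 2.0, sqrt(2)  # illustrative point and irrational exponent
numeric = central_difference(lambda t: t ** r, x)

for max_den in (10, 1000, 100000):
    # Closest rational to r with denominator at most max_den.
    q = Fraction(r).limit_denominator(max_den)
    rational_formula = float(q) * x ** (float(q) - 1)
    # The formula's value at q approaches the numeric slope as q approaches r.
    print(q, rational_formula, numeric)
```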

I may expand this section later.

Page created on 9/18/2013