
## The Fundamental Theorem of Calculus

As I've stated elsewhere, the primary purpose of these calculus pages is to motivate some results which are all too often proven but not explained.  On this page, we provide (what we hope is) clear motivation for the fundamental theorem of calculus -- but, at least initially, we will not be providing a rigorous proof of it.  (I may add one at a later date.)

After we present the two parts of the fundamental theorem, we'll say a bit more about how the dx notation relates to this, and discuss the visualization of dx as "a small change in x" and its use in understanding these formulas.   And finally, we'll say a bit about the issue of "lack of rigor" when using "sloppy infinitesimals", and as an illustration of the contrast between the "sloppy infinitesimal" motivation of a theorem and a "rigorous proof", we'll present a proof of the chain rule.

### The Derivative of an Integral:  The Fundamental Theorem, Part I

Figure 1: A definite integral
The integral of a function is the area under its curve (figure 1).

The derivative of the integral, with respect to its upper bound, is the rate at which the area increases as we move the upper bound to the right.

That is -- rather obviously! -- just the value of f(x) at the upper bound.

If we add one more little "piece" to the total area under the curve, and the width of that "piece" is dx units, then the area of that piece must be f(x) · dx units.

If the area we added was f(x) · dx units, when we moved dx units to the right, then the rate at which we're adding area must be -- once again -- f(x).   In other words:

$$\frac{d}{dx} \int_a^x f(t)\,dt = f(x) \tag{1}$$

 Figure 2: Change in an Integral
I just looked up the fundamental theorem in Thomas's ninth edition, a respectable calculus text, and the authors emphasize how surprising this result is.  I find that statement inexplicable; this is among the most obvious facts in calculus.  It's also very important, however.  So we will dwell on it a bit longer.

In figure 2, we have increased the upper bound by Δx2, and the value of the integral has increased by the area shown in pink, which is f(x) · Δx2.   The derivative -- the rate at which the integral increases, as the bound increases -- is that additional area, divided by the distance we moved the bound:

$$\frac{f(x) \cdot \Delta x_2}{\Delta x_2} = f(x) \tag{2}$$

I should hasten to add that I've left out a limit operation here; for a finite change in the bound, f will typically vary over the width of the added "panel".  I wrote the equation and drew the picture as though f were constant, which is only really legitimate in the limit of infinitesimal Δx2.  Taking the limit explicitly, and paying attention to boundary cases, adds a page or so of algebra to the operation and doesn't add much clarity; I may add a formal proof later but for the time being I'm going to stop here.

Again, what we've just shown is the rate of change in the integral as we move the upper bound; to sum up:

 The derivative of the integral of a function is the function itself.
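The claim is easy to check numerically.  Below is a quick sketch of my own (not part of the original argument; the function f = sin and the step sizes are arbitrary choices): we build the integral of f as a running sum, differentiate the result numerically, and compare it with f itself.

```python
import numpy as np

# f(x) = sin(x); F(x) approximates the integral of f from 0 to x.
x = np.linspace(0.0, 3.0, 3001)
f = np.sin(x)

# Running trapezoid-rule sum: F[i] ~ integral of f from 0 to x[i].
F = np.concatenate(([0.0], np.cumsum(0.5 * (f[1:] + f[:-1]) * np.diff(x))))

# Differentiate the integral numerically; this should give back f.
dFdx = np.gradient(F, x)

# Largest mismatch away from the endpoints (where np.gradient is one-sided).
err = np.max(np.abs(dFdx[1:-1] - f[1:-1]))
print(err)
```

The mismatch shrinks with the step size, exactly as the "limit of infinitesimal Δx" picture suggests.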

### The Integral of a Derivative:  The Fundamental Theorem, Part II

The derivative of a function, f, with respect to x, is the rate of change of f as x changes.

If we multiply the (average) rate at which f changes as x changes, by the total change in x, we will, of course, find the total amount by which f changed.  And that is all this theorem says.  We'll illustrate this with a simple example.  We'll use speed in the example; speed is the change of position over time; in other words, speed is the derivative of location with respect to time.

As our first, totally trivial example, if the average speed of a car is 30 MPH, and the car travels for an hour, it will cover 1 hour * 30 MPH = 30 miles.

If, however, the speed of the car varies, we need to be a little more clever.  We might proceed as follows:
If it starts out going 10 MPH, we can assume it maintains that speed (with little change) for a brief period -- say, 1 minute.  So, multiply 10 MPH by 1/60 of an hour, and we find out how far it traveled in the first minute.

At the end of a minute, (we suppose) the car is traveling 12 MPH.  So, to get the distance traveled during the second minute, we multiply 12 MPH by 1/60 of an hour.

And we proceed like this for the entire duration of the trip.

If the car's speed varies very rapidly, our result may not be very accurate; in that case we need to "divide up the time" into smaller intervals.  We might check the car's speed every 10 seconds rather than every minute -- or every second, or twice a second.  As we divide up the time into ever smaller intervals and sum up the product of speed and time in each of those (tiny) intervals, our total sum will certainly approach the correct answer, which is the total distance traveled.
The point in all this is that speed is the derivative of position with respect to time.  By summing the speed times time over many tiny intervals, we are integrating the speed of the car, over time -- and the result of that integral is the total distance traveled.  In this particular example, what we're finding is this:

$$\int_{t_0}^{t_1} \frac{ds}{dt}\,dt = s(t_1) - s(t_0) \tag{3}$$

(where s(t) is the position of the car at time t).

In other words, we integrated the derivative of the position, and obtained the change in position.
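We can act out that minute-by-minute bookkeeping in a few lines of code.  The speed profile below is a made-up example of my own (a car accelerating steadily from 0 to 60 MPH over one hour, so the exact distance traveled is 30 miles); it is not the 10-then-12 MPH car from the text.

```python
# Hypothetical speed profile: speed(t) = 60 * t MPH, with t in hours.
# The exact integral of speed over the hour is 30 miles.
def speed(t):
    return 60.0 * t

def distance(n_intervals):
    """Sum speed * dt over n_intervals equal slices of the hour,
    sampling the speed at the start of each slice, as in the text."""
    dt = 1.0 / n_intervals
    return sum(speed(i * dt) * dt for i in range(n_intervals))

# Checking once a minute, every 6 seconds, and every second:
for n in (60, 600, 3600):
    print(n, distance(n))
```

The sum creeps toward 30 miles as the intervals shrink, which is the whole content of the theorem in this example.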

This should not be surprising; indeed it should seem completely natural.  It is an illustration of the second part of the fundamental theorem of calculus, which can be written as:

$$\int_a^b \frac{df}{dx}\,dx = f(b) - f(a) \tag{4}$$

Stated succinctly,

 The integral of the derivative of a function is the function itself.

As with the first part of the fundamental theorem, the conclusion seems clear; a detailed proof won't add a lot to the clarity.  I may add one at some future date but for now I'm going to stop here.
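As a sanity check on the statement (again, an illustration of mine, with an arbitrarily chosen function), we can integrate a known derivative numerically and compare against the change in the original function:

```python
import numpy as np

# An arbitrary test function and its derivative: f(x) = x**3 - 2x.
f = lambda x: x**3 - 2.0 * x
fprime = lambda x: 3.0 * x**2 - 2.0

a, b = 0.5, 2.0
x = np.linspace(a, b, 100001)
y = fprime(x)

# Trapezoid-rule integral of the derivative over [a, b] ...
integral = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))

# ... compared with the total change in f over the same interval.
change = f(b) - f(a)
print(integral, change)
```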

### A Little More about "∫", and about "d", the "differential operator"

There are a number of interpretations of d, the "differential operator".  For calculus of a single real variable, the simplest interpretation is the best:  dx is a "tiny" (but imprecisely specified) change in x.  The "d" stands for difference; it is the difference between the new and old values of x.  This "visualization" of what is going on is not only simple, but powerful.

The symbol ∫ also has a very simple interpretation:  It's a stretched-out "S", and it means Sum.  It indicates we should take the sum of its arguments.  Let's see where these notions lead.

We have:

$$dx = \text{a small change in } x, \qquad df = \frac{df}{dx} \cdot dx \tag{5}$$

That last one might come as a bit of a surprise; it just seems too simple.  But it makes sense: If x changes, then f will surely change by that same amount, times the ratio of the amount f changes by to the amount x changes by!  And, if we continue to think of d as meaning "a small change in", then it's also obvious:  The dx term simply "cancels":

$$df = \frac{df}{dx} \cdot dx$$

Now, let's consider the symbol ∫ in this context.  It means, "Sum all the tiny changes over a range".  The simplest possible integral is this (where we haven't specified the bounds):

$$\int dx = x + C \tag{6}$$

Specifying the bounds, that's:

$$\int_{x_0}^{x_1} dx = x_1 - x_0 \tag{7}$$

If we add up all the changes in x in getting from x=x0 to x=x1, of course we'll just get the total change!
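The telescoping is easy to see in code.  A small sketch of mine, with arbitrary endpoints: however finely we chop up the interval, the tiny changes in x sum to the total change x1 − x0.

```python
import numpy as np

# Chop the interval from x0 = 2 to x1 = 7 into many small pieces ...
x = np.linspace(2.0, 7.0, 1001)

# ... and sum all the tiny changes in x.  The sum telescopes:
dx = np.diff(x)
total = dx.sum()
print(total)  # ~ 5.0, i.e. x1 - x0, no matter how fine the chopping
```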

There is nothing special about the variable x in the integral; regardless of what we call the tiny differences we're summing, the result will be the total change in the integrand.  Let's look at a few examples of how this can be applied.

Suppose we have the derivative of some function:

$$g(x) = \frac{df}{dx} \tag{8}$$

Let's integrate it:

$$\int_{x_0}^{x_1} g(x)\,dx \tag{9}$$

But now, let's plug in equation (8):

$$\int_{x_0}^{x_1} \frac{df}{dx}\,dx = \int_{x_0}^{x_1} df = f(x_1) - f(x_0) \tag{10}$$

This is, of course, just Part II of the fundamental theorem, which we discussed earlier on this page.  The point is that, if we treat dx simply as a "small change in x", the fundamental theorem simply "falls out".

Given an arbitrary integral, what is a "small change" in that integral?  What does it mean to apply the d operator to the integral?  In general, what we mean in that case is that we are asking for a small change in the value of the integral when we change the upper bound of integration by a small amount.

But in that case, the "change" will just be the area of the last "panel" in the sum.  So, we'll have

$$d \int_a^x f(t)\,dt = f(x) \cdot dx \tag{11}$$

This is a strange-looking expression.  What can we do with it?

If we divide through by dx, we just get back part I of the fundamental theorem:

$$\frac{d}{dx} \int_a^x f(t)\,dt = f(x) \tag{12}$$

In general, it's often convenient to use d where one is actually interested in obtaining some derivative, and then divide by one of the d terms and rearrange the equation to get the result one wants.  For example:

$$f(x) = g(y) \;\;\Longrightarrow\;\; f'(x) \cdot dx = g'(y) \cdot dy \;\;\Longrightarrow\;\; \frac{dx}{dy} = \frac{g'(y)}{f'(x)} \tag{13}$$

In (13), we started with two functions which are equal.  We differentiated both sides -- but we did it by taking "differentials".  A small change in f is the derivative of f with respect to x, times a small change in x; similarly, since g is written as a function of y, a small change in g is its derivative with respect to y, times dy.  And finally, we decided we wanted the derivative of x with respect to y -- so we just divided through by f' · dy, and voila, we have the derivative of x with respect to y.
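To make this concrete, here is a numerical check of the manipulation, using a pair of functions I've picked arbitrarily for illustration: f(x) = x³ and g(y) = y², with x, y > 0, so that f(x) = g(y) gives x = y^(2/3).

```python
# f(x) = x**3 and g(y) = y**2 with x, y > 0; f(x) = g(y) means x = y**(2/3).
def x_of_y(y):
    return y ** (2.0 / 3.0)

def dxdy_differentials(y):
    """dx/dy = g'(y) / f'(x), the result of dividing through by f' * dy."""
    x = x_of_y(y)
    return (2.0 * y) / (3.0 * x**2)

def dxdy_numeric(y, h=1e-6):
    """Direct centered difference quotient of x(y), for comparison."""
    return (x_of_y(y + h) - x_of_y(y - h)) / (2.0 * h)

y = 1.7
print(dxdy_differentials(y), dxdy_numeric(y))
```

The two values agree to many digits: the "sloppy" differential shuffle gives the same answer as differentiating x(y) directly.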

### A Caveat: The Return of Clutter, and a Return to Rigor

No doubt some "purists" may disapprove of the preceding section.  And, in fact, it is possible to get in trouble using differentials that way.  There are two things to keep in mind.
• The differences are not independent.  dy and dx cannot actually be pulled out of the equations and treated as independent numbers (though we can think of them that way if we're a little careful).
• There is a limit operation which we're not writing down.  Without the limit process, the equations are not actually correct.  (But we can nearly always leave out the explicit limit ... as long as we know it's there, in the background.)
It's also worth keeping in mind that most simple visualizations only work for well behaved -- i.e., smooth -- functions.  It has been said that all functions used in physics are so smooth you could ski on their graphs.  This is not completely accurate but there's some truth in it -- the really weird counter-example functions which crop up in mathematics don't typically have any physical significance.

How can we put this on a more rigorous footing?  First, we need to recognize that, the way we're using dx, it isn't really infinitesimal.   We're thinking of it as being very small.  So, it might be better to write it that way.  We can, for example, say:

$$\Delta f \equiv f(x + \Delta x) - f(x) \tag{14}$$

with perfect assurance that it's correct.  However, a lot of the equations we're interested in are no longer true when written this way.  For example, in general we'll have:

$$\Delta f \ne \frac{df}{dx} \cdot \Delta x \tag{15}$$

This is where the limit operation comes in; it's present, implicitly, when we use the dx notation.  What we really intend, rather than equation (15), is:

$$\frac{df}{dx} = \lim_{\Delta x \to 0} \frac{\Delta f}{\Delta x} \tag{16}$$

But we still have one more imprecision to deal with:  We've written Δf and Δx as though they're independent.  They're not.  We actually need to expand Δf, so that the equation becomes:

$$\frac{df}{dx} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x} \tag{17}$$

At this point, we're back to the ordinary definition of a derivative.  The point is that the "missing" pieces actually are present; we're just not writing them down.
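We can watch the "missing" limit at work numerically.  In this sketch (an illustration of mine, with f = sin and an arbitrary evaluation point), the difference quotient marches toward the true derivative as Δx shrinks:

```python
import math

f = math.sin
x = 1.0
exact = math.cos(x)  # the true derivative of sin at x

# The difference quotient (f(x + dx) - f(x)) / dx for shrinking dx:
for dx in (0.1, 0.01, 0.001, 0.0001):
    quotient = (f(x + dx) - f(x)) / dx
    print(dx, quotient, abs(quotient - exact))
```

For any finite dx the quotient is a little off, as equation (15) warns; the error goes to zero only in the limit.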

Let's look at one more example, which is the chain rule.  In dx notation it's trivial:

$$\frac{du}{dx} = \frac{du}{dv} \cdot \frac{dv}{dx} \tag{18}$$

The meaning is crystal clear, and the "reason" it is true is obvious:  The dv's cancel.  But now let's expand it into "rigorous form".  First we rewrite it in terms of Δx and put in the limit operations:

$$\lim_{\Delta x \to 0} \left[ \frac{\Delta u}{\Delta v} \cdot \frac{\Delta v}{\Delta x} \right] = \frac{du}{dx} \tag{19}$$

where $\Delta v = v(x + \Delta x) - v(x)$ and $\Delta u = u(v(x) + \Delta v) - u(v(x))$.

This is still more or less recognizable, and the "reason" it's true is still apparent.  Unfortunately, though, there's something else wrong here:  The thing on the left isn't exactly the product of two derivatives.  The problem is that the two dv terms in the original expression, one in the denominator and one in the numerator, are actually independent!  They aren't necessarily the same value -- and that's the source of the claim that you can't just "cancel" them out.  Written properly, (19) turns into this:

$$\left[ \lim_{\Delta v \to 0} \frac{\Delta u}{\Delta v} \right] \cdot \left[ \lim_{\Delta x \to 0} \frac{\Delta v}{\Delta x} \right] = \frac{du}{dx} \tag{20}$$

and now we have completely lost the intuitive clarity of equation (18).  In fact, at first glance it's not even obvious how to prove (20) is true -- or, indeed, if it is true.   Nearly all of the squirrelly manipulations which follow are directed at showing that we can treat the two dv's in (18) as having the same value, and so we really can just "cancel them out" after all.

Throughout the following, we're going to assume that v is not constant at the point where we're trying to evaluate the derivative.  (If it's constant there, then Δv is zero and both (19) and (20) are in deep trouble.)

To proceed with the proof, let's start by defining

$$g(\Delta v) \equiv \frac{u(v + \Delta v) - u(v)}{\Delta v} - u'(v), \qquad h(\Delta x) \equiv \frac{v(x + \Delta x) - v(x)}{\Delta x} - v'(x) \tag{21}$$

Next let's rewrite the term in brackets on the left side of equation (19) using the definitions in (21):

$$\frac{\Delta u}{\Delta v} \cdot \frac{\Delta v}{\Delta x} = \bigl[ u'(v(x)) + g(\Delta v) \bigr] \cdot \bigl[ v'(x) + h(\Delta x) \bigr] \tag{22}$$

To reduce the clutter a bit, let's substitute:

$$u' \equiv u'(v(x)), \qquad v' \equiv v'(x), \qquad g \equiv g(\Delta v), \qquad h \equiv h(\Delta x) \tag{22b}$$

After multiplying out and substituting (22b), the right hand side of (22) becomes:

$$u'v' + u'h + gv' + gh \tag{23}$$

Now, as we vary Δx, the terms u'(v(x)) and v'(x) are fixed -- they don't change.  But as Δx goes to zero, both g(Δv) and h(Δx) go to zero.  So, all but the first term in (23) will vanish as Δx  goes to zero.  But (23) was the term inside brackets in equation (19).  So, applying these results to the left side of (19), we can conclude that,

$$u'(v(x)) \cdot v'(x) = \lim_{\Delta x \to 0} \left[ \frac{\Delta u}{\Delta v} \cdot \frac{\Delta v}{\Delta x} \right] \tag{24}$$

We're finally back to something we can work with!  The two Δv terms inside the brackets in (24) certainly do cancel, giving us:

$$u'(v(x)) \cdot v'(x) = \lim_{\Delta x \to 0} \frac{u(v(x + \Delta x)) - u(v(x))}{\Delta x} \tag{25}$$

We recognize the right hand side of (25) as the derivative of u(v(x)):

$$\lim_{\Delta x \to 0} \frac{u(v(x + \Delta x)) - u(v(x))}{\Delta x} = \frac{d}{dx}\, u(v(x)) \tag{26}$$

Combining (25) and (26), we finally obtain the result we want:

$$\frac{d}{dx}\, u(v(x)) = u'(v(x)) \cdot v'(x) \tag{27}$$

And so we see, first, how easily we can come to the right conclusion using the dy/dx "sloppy infinitesimal" notation, and second, how hard it can be to actually prove that the conclusion we came to is "really correct".
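For what it's worth, the conclusion itself is easy to verify numerically.  Here is a quick check of the chain rule with an arbitrarily chosen pair of functions (u = exp, v = sin -- my choice, not from the text):

```python
import math

# u = exp, v = sin; their derivatives are known exactly.
u, uprime = math.exp, math.exp
v, vprime = math.sin, math.cos

def composed(x):
    return u(v(x))

x = 0.8
chain = uprime(v(x)) * vprime(x)  # the chain-rule answer, as in (27)

# Compare against a direct centered difference quotient of u(v(x)).
h = 1e-6
numeric = (composed(x + h) - composed(x - h)) / (2.0 * h)
print(chain, numeric)
```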

Page created on 11/03/2007