As I've stated elsewhere, the primary purpose of these calculus pages is to motivate
some results which are all too often proven but not explained. On
this page, we provide (what we hope is) clear motivation for the
fundamental theorem of calculus -- but, at least initially, we will not
be providing a rigorous proof of it. (I may add one at a later date.)
After we present the two parts of the fundamental theorem, we'll say a bit more about how the dx
notation relates to this, and discuss the visualization of dx
as "a small change in x" and its use in understanding these formulas.
And finally, we'll say a bit about the issue of "lack of rigor"
when using "sloppy infinitesimals", and as an illustration of the
contrast between the "sloppy infinitesimal" motivation
of a theorem and a "rigorous proof", we'll present a proof of the chain rule.
The Derivative of an Integral: The Fundamental Theorem, Part I
Figure 1: A definite integral
The integral of a function is the area under its curve (figure 1). The derivative
of the integral, with respect to its upper bound, is the rate at which the area increases as we move the upper bound to the right.
That is -- rather obviously! -- just the value of f(x)
at the upper bound.
If we add one more little "piece" to the total area under the curve, and the width of that "piece" is dx
units, then the area of that piece must be f(x) · dx units.
If the area we added was f(x) · dx
units, when we moved dx
units to the right, then the rate
at which we're adding area must be -- once again -- f(x)
. In other words:
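In symbols (writing the integrand's dummy variable as t, so it doesn't clash with the upper bound):

```latex
\frac{d}{dx} \int_a^x f(t)\,dt = f(x)
```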
|Figure 2: Change in an Integral|
I just looked up the fundamental theorem in Thomas's ninth edition, a respectable calculus text, and the authors emphasize how surprising
this result is. I find that statement inexplicable; this is among the most
obvious facts in calculus. It's also very important, however. So we
will dwell on it a bit longer.
In figure 2, we have increased the upper bound by Δx2
, and the value of the integral has increased by the area shown in pink, which is f(x) · Δx2
. The derivative -- the rate
at which the integral increases, as the bound increases -- is that additional area, divided by the distance we moved the bound:
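That quotient is (treating f as constant over the panel, as in the figure):

```latex
\frac{f(x) \cdot \Delta x_2}{\Delta x_2} = f(x)
```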
I should hasten to add that I've left out a limit operation here; for a finite change in the bound, f
will typically vary
over the width of the added "panel". I wrote the equation and drew the picture as though f
were constant, which is only really legitimate in the limit of infinitesimal Δx2.
Taking the limit explicitly, and paying attention to boundary cases, adds a page or so of algebra to the
operation and doesn't add much clarity; I may add a formal proof later, but for the time being I'm going to stop here.
Again, what we've just shown is the rate of change
in the integral as we move the upper bound; to sum up:
|The derivative of the integral of a function is the function itself.|
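As a quick numerical illustration (not part of the original argument; the integrand t² and the step sizes are just convenient choices of mine), we can build the integral as a Riemann sum and check that its difference quotient recovers the integrand:

```python
# Build F(x), the integral of f from 0 to x, as a fine left Riemann sum,
# then check that the difference quotient of F recovers f itself.

def f(t):
    return t * t  # an arbitrary example integrand

def F(x, n=100_000):
    """Left-Riemann-sum approximation of the integral of f from 0 to x."""
    h = x / n
    return sum(f(i * h) * h for i in range(n))

x = 2.0
dx = 1e-3
derivative_of_integral = (F(x + dx) - F(x)) / dx
print(abs(derivative_of_integral - f(x)) < 1e-2)  # → True
```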
The Integral of a Derivative: The Fundamental Theorem, Part II
The derivative of a function, f
, with respect to x
, is the rate of change of f as x changes.
If we multiply the (average) rate
at which f
changes as x
changes, by the total change in x
, we will, of course, find the total amount by which f
changed. And that is all this theorem says. We'll illustrate this with a simple example. We'll use speed
in the example; speed is the change of position as time passes
; in other words, speed is the derivative of location with respect to time.
In our first, totally trivial example, if the average speed of a car is 30
MPH, and the car travels for an hour, it will cover 1 hour * 30 MPH = 30 miles.
If, however, the speed of the car varies, we need to be a little more clever. We might proceed as follows:
If it starts out going 10 MPH, we can assume it maintains that speed (with
little change) for a brief period -- say, 1 minute. So, multiply
10 MPH by 1/60 of an hour, and we find out how far it traveled in the first minute.
At the end of a minute, (we suppose) the car is traveling 12 MPH. So, to get the distance traveled during the second minute, we multiply 12 MPH by 1/60 of an hour.
And we proceed like this for the entire duration of the trip.
If the car's speed varies very rapidly, our result may not be very
accurate; in that case we need to "divide up the time" into smaller
intervals. We might check the car's speed every 10 seconds rather
than every minute -- or every second, or twice a second. As we
divide up the time into ever smaller intervals and sum up the product
of speed and time in each of those (tiny) intervals, our total sum will
certainly approach the correct answer, which is the total distance traveled.
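The procedure just described is easy to carry out numerically. Here's a sketch, with a made-up speed profile (10 MPH at the start of a one-hour trip, rising steadily to 30 MPH):

```python
# A hypothetical speed profile: 10 MPH at t = 0, rising to 30 MPH at t = 1 hour.
def speed(t):
    return 10 + 20 * t

# The exact distance (the integral of 10 + 20t from 0 to 1) is 20 miles.
# Sum speed * dt over ever-finer intervals; the total approaches 20.
for n in (60, 600, 3600):  # one-minute, six-second, and one-second intervals
    dt = 1.0 / n
    total = sum(speed(i * dt) * dt for i in range(n))
    print(n, total)
```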
The point in all this is that speed is the derivative
of position with respect to time. By summing the speed times time over many tiny intervals, we are integrating
the speed of the car, over time -- and the result of that integral is the total distance traveled
. In this particular example, what we're finding is this:
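With x(t) denoting the car's position at time t (my labels, chosen to match the discussion):

```latex
\int_{t_0}^{t_1} \frac{dx}{dt}\,dt = x(t_1) - x(t_0)
```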
In other words, we integrated the derivative
of the position, and obtained the change in the position. This
should not be surprising; indeed it should seem completely natural.
It is an illustration of the second part of the fundamental
theorem of calculus, which can be written as:
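In symbols:

```latex
\int_a^b f'(x)\,dx = f(b) - f(a)
```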
|The integral of the derivative of a function is the function itself.|
As with the first part of the fundamental theorem, the conclusion seems
clear; a detailed proof won't add a lot to the clarity. I may add
one at some future date but for now I'm going to stop here.
A Little More about "∫", and about "d", the "differential operator"
There are a number of interpretations of d
, the "differential operator". For calculus of a single real variable, the simplest interpretation is the best: dx
is a "tiny" (but imprecisely specified) change in x
. The "d" stands for difference
; it is the difference
between the new and old values of x
. This "visualization" of what is going on is not only simple, but powerful.
The ∫ symbol also has a very simple interpretation: It's a stretched-out "S", and it means Sum
. It indicates we should take the sum
of its arguments. Let's see where these notions lead.
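One relation they lead to (presumably the "last one" referred to next) is:

```latex
df = \frac{df}{dx}\,dx
```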
That last one might come as a bit of a surprise; it just seems too simple. But it makes sense: If x
changes, then f
will surely change by that same amount, times
the ratio of the amount f
changes by to the amount x
changes by! And, if we continue to think of d
as meaning "a small change in", then it's also obvious: The dx
term simply "cancels".
Now, let's consider the ∫
symbol in this context. It means, "Sum all the tiny changes over
a range". The simplest possible integral is this (where we
haven't specified the bounds):
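A reasonable reading of that simplest integral (C is the usual constant of integration):

```latex
\int dx = x + C
```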
Specifying the bounds, that's:
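That is, with bounds x0 and x1:

```latex
\int_{x_0}^{x_1} dx = x_1 - x_0
```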
If we add up all the changes in x
in getting from x0 to x1
, of course we'll just get the total change in x.
There is nothing special about the variable x
in the integral; regardless of what we call the tiny differences we're summing, the result will be the total
change in the integrand. Let's look at a few examples of how this can be applied.
Suppose we have the derivative of some function:
Let's integrate it:
But now, let's plug in equation (8):
This is, of course, just Part II of the fundamental theorem, which we
discussed earlier on this page. The point is that, if we treat dx
simply as a "small change in x
", the fundamental theorem simply "falls out".
Given an arbitrary integral, what is a "small change
" in that integral? What does it mean to apply the d
operator to the integral? In general, what we mean in that case
is that we are asking for a small change in the value of the integral
when we change the upper bound of integration
by a small amount.
But in that case, the "change" will just be the area of the last "panel" in the sum. So, we'll have
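In symbols (with a as the fixed lower bound):

```latex
d \int_a^x f(t)\,dt = f(x)\,dx
```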
This is a strange-looking expression. What can we do with it?
If we divide through by dx
, we just get back part I of the fundamental theorem:
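Dividing the panel's area, f(x) dx, by the width dx gives:

```latex
\frac{d}{dx} \int_a^x f(t)\,dt = f(x)
```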
In general, it's often convenient to use d
where one is actually interested in obtaining some derivative, and then divide by one of the d
terms and rearrange the equation to get the result one wants. For example:
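Here is a sketch of the manipulation the next paragraph describes, assuming (13) equates a function f of x with a function g of y:

```latex
f(x) = g(y)
\;\Longrightarrow\;
f'(x)\,dx = g'(y)\,dy
\;\Longrightarrow\;
\frac{dx}{dy} = \frac{g'(y)}{f'(x)}
```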
In (13), we started with two functions which are equal. We
differentiated both sides -- but we did it by taking "differentials".
A small change in f
is the derivative of f
with respect to x
, times a small change in x
; similarly, since g
is written as a function of y
, a small change in g
is its derivative with respect to y
, times dy
. And finally, we decided we wanted the derivative of x
with respect to y
-- so we just divided through by f'
, and voila, we have the derivative of x
with respect to y.
A Caveat: The Return of Clutter, and a Return to Rigor
No doubt some "purists" may disapprove of the preceding section.
And, in fact, it is possible to get in trouble using
differentials that way. There are two things to keep in mind.
- The differences are not independent. dy and dx cannot actually be pulled out of the equations and treated as independent numbers (though we can think of them that way if we're a little careful).
- There is a limit operation
which we're not writing down. Without the limit process, the
equations are not actually correct. (But we can nearly always
leave out the explicit limit ... as long as we know it's there, in the background.)
It's also worth keeping in mind that most simple visualizations only work for well behaved
-- i.e., smooth
-- functions. It has been said that all functions used in physics
are so smooth you could ski on their graphs. This is not
completely accurate but there's some truth in it -- the really weird
counter-example functions which crop up in mathematics don't typically
have any physical significance.
How can we put this on a more rigorous footing? First, we need to recognize that, the way we're using dx
, it isn't really
infinitesimal. We're thinking of it as being very small
. So, it might be better to write it that way. We can, for example, say:
with perfect assurance that it's correct. However, a lot of the
equations we're interested in are no longer true when written this
way. For example, in general we'll have:
This is where the limit operation comes in; it's present, implicitly, when we use the dx
notation. What we really intend, rather than equation (15), is:
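Namely:

```latex
\lim_{\Delta x \to 0} \frac{\Delta f}{\Delta x} = \frac{df}{dx}
```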
But we still have one more imprecision to deal with: We've written Δf and Δx
as though they're independent. They're not. We actually need to expand Δf
, so that the equation becomes:
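With Δf expanded to f(x + Δx) - f(x), the equation reads:

```latex
\lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x} = \frac{df}{dx}
```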
At this point, we're back to the ordinary definition of a derivative.
The point is that the "missing" pieces actually are present; we're
just not writing them down.
Let's look at one more example, which is the chain rule. In dx
notation it's trivial:
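In the u, v notation used in the proof below, with u a function of v and v a function of x:

```latex
\frac{du}{dx} = \frac{du}{dv} \cdot \frac{dv}{dx}
```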
The meaning is crystal clear, and the "reason" it is true is obvious: The dv
terms cancel. But now let's expand it into "rigorous form". First
we rewrite it in terms of Δx and put in the limit operations:
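Presumably something like this (the bracketed term is the one manipulated in the proof below):

```latex
\lim_{\Delta x \to 0} \left[ \frac{\Delta u}{\Delta v} \cdot \frac{\Delta v}{\Delta x} \right] = \frac{du}{dx}
```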
This is still more or less recognizable, and the "reason" it's true is still
apparent. Unfortunately, though, there's something else wrong
here: The thing on the left isn't exactly
the product of two derivatives. The problem is that the two dv
terms in the original expression, one in the denominator and one in the numerator, are actually independent!
They aren't necessarily the same value -- and that's the source
of the claim that you can't just "cancel" them out. Written
properly, (19) turns into this:
But now we have completely lost the intuitive clarity of equation (18).
In fact, at first glance it's not even obvious how to prove (20)
is true -- or, indeed, if
it is true. Nearly all of the squirrelly manipulations which follow are directed at showing that we can
treat the two dv
's in (18) as having the same value, and so we really can
just "cancel them out" after all.
Throughout the following, we're going to assume that v
is not constant
at the point where we're trying to evaluate the derivative. (If it's constant there, then Δv
is zero and both (19) and (20) are in deep trouble.)
To proceed with the proof, let's start by defining
Next let's rewrite the term in brackets on the left side of equation (19) using definitions (21):
To reduce the clutter a bit, let's substitute:
After multiplying out and substituting (22b), the right hand side of (22) becomes:
Now, as we vary Δx, the terms u'(v(x)) and v'(x)
are fixed -- they don't change. But as Δx goes to zero, both g(Δv) and Δv
go to zero. So, all but the first term in (23) will vanish
as Δx goes to zero. But (23) was the term inside brackets in equation (19
). So, applying these results to the left side of (19
), we can conclude that,
We're finally back to something we can work with! The two Δv terms
inside the brackets in (24) certainly do
cancel, giving us:
We recognize the right hand side of (25) as the derivative of u(v(x)):
Equating the right hand sides of (24) and (26) we finally obtain the result we want:
And so we see, first, how easily we can come to the right conclusion using the dx
"sloppy infinitesimal" notation, and second, how hard it can be to actually prove
that the conclusion we came to is "really correct".
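As a closing sanity check (the functions sin and x² are arbitrary examples of mine, not from the discussion above), the chain rule can at least be verified numerically:

```python
import math

# d/dx u(v(x)) should equal u'(v(x)) * v'(x).
u, u_prime = math.sin, math.cos
v = lambda x: x * x
v_prime = lambda x: 2 * x

x = 0.7
h = 1e-6
lhs = (u(v(x + h)) - u(v(x))) / h   # difference quotient of the composite
rhs = u_prime(v(x)) * v_prime(x)    # the product promised by the chain rule
print(abs(lhs - rhs) < 1e-4)  # → True
```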
created on 11/03/2007