Some Physics Insights

Introducing Our Approach to Calculus

Elementary calculus, as I said elsewhere,  is geometry in a tuxedo.  It's got a fair amount of algebra thrown in for decoration, but for the most part it can be explained in pictures.

Too often, the fundamental facts of calculus are explained in terms of epsilonics (which is valuable for proofs but rarely for explaining things), and are presented as though they ought to be obscure or difficult to grasp.  My introduction to calculus was a case in point:  we used a textbook that wasn't very good, and my instructor was, to put it charitably, uninspired.  The understanding of the subject I got from that course was, to say the least, weak.

One of the more remarkable moments in the class came when we had a guest lecture by the head of the math department, who was also less than stellar at explaining the Zen of calculus.  At one point in the lecture he talked about the chain rule.  He wrote it out on the chalkboard, then observed,
"A lot of students, when they see this, want to just do this..."
and he crossed out the common terms, like this:

      [Figure: the chain rule, with the common terms crossed out]

Then he gave a kind of conspiratorial laugh, and said, "Of course, you can't do that!"  And I sat there, wondering why you can't do that...  Surely my ignorance must have been deep, for I didn't see the problem with doing that....

The aggravating thing here is that you certainly CAN do that; in fact, that's all there is to the chain rule!  It's a product of two ratios, and the common terms cancel.  I must hasten to add that "canceling terms" that way does not produce a rigorous proof of the rule all by itself; for that you need to be careful of exactly how you take the limits for small differences.  But at the level of understanding what the rule means, it's legitimate.

After all, we can restate the rule in English:  "If Joe runs twice as fast as Sally, and Sally runs three times as fast as Tim, then Joe must run six times as fast as Tim".  There really isn't anything more to it than that -- yet it's sometimes presented as though it's obscure, deep, and hard to fathom.
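
The runner picture, and the "cancel the common terms" picture, can both be checked with a few lines of arithmetic.  This is only a sketch:  the functions y = sin(u) and u = x**2, the point x = 1 and the step size are all assumptions picked for illustration, and a small finite step stands in for the infinitesimal.

```python
import math

# Runners: Joe is 2x Sally, Sally is 3x Tim, so Joe is 2 * 3 = 6x Tim.
joe_per_sally = 2.0
sally_per_tim = 3.0
joe_per_tim = joe_per_sally * sally_per_tim


def u_of_x(x):
    """Hypothetical inner function: u = x**2."""
    return x * x


def y_of_u(u):
    """Hypothetical outer function: y = sin(u)."""
    return math.sin(u)


def chain_rule_demo(x=1.0, dx=1e-6):
    """Compare (dy/du) * (du/dx) with dy/dx for one small step dx."""
    u = u_of_x(x)
    du = u_of_x(x + dx) - u          # change in u caused by the step dx
    dy = y_of_u(u + du) - y_of_u(u)  # change in y caused by that change in u
    product = (dy / du) * (du / dx)  # the du's cancel, leaving dy / dx
    direct = dy / dx
    exact = math.cos(x * x) * 2 * x  # textbook derivative of sin(x**2)
    return product, direct, exact


if __name__ == "__main__":
    print(joe_per_tim)
    print(chain_rule_demo())
```

The first two numbers returned by chain_rule_demo agree to within rounding error, because they are the same ratio written two ways.  Both differ slightly from the third, and that small gap is exactly what the careful limit argument has to deal with.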

My wretched calculus class worked out well in the end, though.  Through a combination of great good luck and lack of preparation, I bombed the calculus AP test, and consequently had to take the subject over again in college.  That second time around, we used Thomas (which is actually a pretty good book for studying calculus restricted to 3 dimensions, in my opinion).  Better still, my instructor in the course, Gene Kleinberg, was one of the clearest lecturers I've ever encountered.  I can still recall sitting in class one day (about 30 years ago...) in absolute amazement as he explained why the Taylor series is the way it is -- I had learned it "by rote" in my high school class, with no idea that it actually could be presented in a way that made sense.  But we'll go into that more later on.

Some Definitions:  What's That "dx" Thing?

We need to start somewhere and this is as good a place as any.  I lost sleep over what "dx" meant when I was first learning this stuff.  The instructor couldn't really say, beyond saying it was something more than just "a small change", and the text we used went off into hyperspace by attempting to introduce the concept of a 1-form into an otherwise rather shallow elementary calculus course.

You don't need to do any of that.  There are actually multiple legitimate ways to define "dx", but for understanding elementary calculus, the simplest is the best:  It's a tiny change in "x".  We can call it an "infinitesimal" change if we like.

You can use "dx" in an equation, standing alone; you can divide "dy" by "dx" to get the rate at which y is changing as x changes, and in most cases it will work out with no problem.  The technical term for what we're doing when we think of it this way is using "physicist's sloppy infinitesimals".  It's not rigorous -- not without some care and more work -- but it conveys the meaning very well, and in general it behaves just fine.  I dare say it's the way Leibniz thought of it, too.

Now, I keep saying "It's not rigorous".  What's that mean?  It means we haven't proved it works in all cases, and in fact, absent such a proof, one should usually suspect that something doesn't work in all cases.  But the cases where using "sloppy infinitesimals" doesn't work are likely to be pretty pathological, and we don't need to worry about them if all we're trying to do is understand the subject.

Just how far are we from "rigor" when we say "dx is an infinitesimal change in x"?  Not very far, it turns out -- an equation with dx and dy can typically be replaced with an almost identical equation which uses "Δx" in place of "dx" and "Δy" in place of "dy", and which becomes true in the limit as Δx is made very small.  But for picturing it all, we can treat dx as an infinitesimal, and forget, for a while, that there's a limit operation lurking in the background.
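
As a small worked example of that replacement, take the function y = x^2 (chosen here purely for illustration).  The Δ version carries an extra term which disappears in the limit, and the infinitesimal version is just shorthand for what is left:

```latex
\begin{aligned}
\Delta y &= (x + \Delta x)^2 - x^2 = 2x\,\Delta x + (\Delta x)^2 \\
\frac{\Delta y}{\Delta x} &= 2x + \Delta x \;\longrightarrow\; 2x
    \quad \text{as } \Delta x \to 0 \\
dy &= 2x\,dx
\end{aligned}
```

The leftover Δx on the second line is the "sloppiness" in sloppy infinitesimals:  it is dropped without comment in the last line, and the limit is what justifies dropping it.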


The integral is, of course, the area under a curve ... or so it's usually represented.

The key here is that it's an area, or more generally a volume of some sort.

The common notation is actually a recipe for taking an integral:

$$ \int_a^b f(x)\,dx $$

That symbol on the left is actually a large, stylized "S", and it stands for "Sum".  The "dx" is, as we already said, a tiny change in x.  So, the whole thing says you slice up the region you're interested in into tiny chunks, each of which is dx units wide, you find the area (or volume) of each one, and you add them up.  And that is the definition we will take as "an integral".  To restate it:

Definition 1:  The integral of a function, from point a to point b, is the area under the function's curve between a and b, if you plot it ... and it's found by dividing up the region over which we're integrating into tiny (infinitesimal) slices, finding the area (or volume) of each slice, and adding them up.

The area (or volume) of an infinitesimal slice will just be the value of the function we're integrating, times the width of the slice.

This isn't the most general definition of an integral but it's a sensible way to think of it.
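
The slice-and-add recipe translates almost word for word into code.  The following is a minimal sketch rather than a serious integrator:  the name riemann_sum, the choice to measure each slice at its midpoint, and the test function x**2 are all assumptions made for the example, and the slices are merely small rather than infinitesimal.

```python
def riemann_sum(f, a, b, n):
    """Approximate the area under f between a and b using n equal slices."""
    dx = (b - a) / n                # width of each slice
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx      # sample f at the middle of slice i
        total += f(x) * dx          # area of one slice: height times width
    return total


if __name__ == "__main__":
    # Area under y = x**2 between 0 and 3; the exact answer is 9.
    for n in (10, 100, 1000):
        print(n, riemann_sum(lambda x: x * x, 0.0, 3.0, n))
```

As the number of slices goes up, the total settles toward the exact area, which is the whole idea behind the definition.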

To make it rigorous, one either needs to rigorously define infinitesimals, or define it in terms of limits.  Just to be complete we'll mention a definition of the Riemann integral here, which is what the notation corresponds to most closely:

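Definition 1b:

$$ \int_a^b f(x)\,dx = \lim_{n \to \infty} \sum_{i=1}^{n} f(x_i)\,\Delta x, \qquad \Delta x = \frac{b-a}{n}, \quad x_i = a + i\,\Delta x $$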


Again, this isn't the most general definition of an integral, but for our purposes here it will be just fine.


The classic example of a derivative is speed:  how much distance is covered per unit time.

Leibniz's notation is nice because it says exactly what the derivative is:

Definition 2:  The derivative of f with respect to x is the ratio of the change in f to the change in x when x changes a little (infinitesimal) bit:

$$ \frac{df}{dx} = \frac{\text{change in } f}{\text{change in } x} $$

Now, formally, we haven't defined "infinitesimal" so it's sloppy to say "dx is an infinitesimal".   To be rigorous we either need to precisely define infinitesimals (which takes more work than I'm willing to put in on this, and takes us far afield from what I want to do with this subject) or we need to present the definition in terms of limits.  Just to be complete we'll mention it in passing:

Definition 2b:

$$ \frac{df}{dx} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x} $$


and that's about all we'll have to say about limits here.
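
The speed example can be made concrete with a short numerical sketch.  Everything specific in it is an assumption for illustration:  the position function (an object falling from rest with g = 9.8 m/s^2), the moment t = 2 seconds, and the particular time steps.  The point is only that the ratio of distance covered to time elapsed settles down as the time step shrinks.

```python
def position(t):
    """Hypothetical distance fallen, in metres, after t seconds."""
    return 0.5 * 9.8 * t * t


def average_speed(f, t, dt):
    """Distance covered between t and t + dt, divided by the time dt."""
    return (f(t + dt) - f(t)) / dt


def speeds(t=2.0, steps=(1.0, 0.1, 0.01, 0.001)):
    """Average speed starting at time t, for a series of shrinking steps."""
    return [(dt, average_speed(position, t, dt)) for dt in steps]


if __name__ == "__main__":
    for dt, v in speeds():
        print(dt, v)
```

For this position function the derivative at t = 2 is 9.8 * 2 = 19.6 metres per second, and the computed average speeds close in on that value as dt gets smaller.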

Fundamental Properties

There are a number of basic properties of integrals and derivatives which should be obvious, simply from the definitions given above.  I won't be giving proofs for any of these.


And that about does it for the introduction.  From here on it will be pictures (almost) all the way.

One More Thing:  What About Robinson, and Non-Standard Analysis?

Back in the 1960s, the use of infinitesimals was placed on a secure footing by Abraham Robinson.  It turns out it's possible to prove that infinitesimals exist (well, they exist in the world of mathematics, anyway, if not exactly in the physical universe).  Use of these "rigorous infinitesimals" provides a powerful tool for understanding analysis.  However, the proofs which result tend to be highly algebraic, and that's not a direction I want to go in this section of the website.  My goal is to explain many of the basic principles using simple geometry and pictures, and the use of nonstandard analysis doesn't help with that.

To show what I mean, I'll give an example.  (Since this is chosen to show that the hyperreal approach is not a gateway to instant clarity, don't expect the meaning to be glaringly obvious!)

Here's  a proof of the product rule for derivatives, using infinitesimals.  (We'll be going over this later, using pictures -- it's another example of something which is obvious when pictured properly, and the derivation I'm about to give is certainly not what I mean by a "proper explanation"):
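
The derivation can be written out along the following lines.  This is a sketch whose notation is partly assumed:  the symbol for the infinitesimal step (written here as an epsilon with a tilde), the placement of the "*" operator in front of a bracket, and the exact line breaks are choices made to fit the explanation that follows, in which s, t and u are infinitesimals and "*" takes the real part.

```latex
\begin{aligned}
(fg)'(x)
 &= {}^{*}\!\left[ \frac{f(x+\tilde{\epsilon})\,g(x+\tilde{\epsilon}) - f(x)\,g(x)}{\tilde{\epsilon}} \right] \\
 &= {}^{*}\!\left[ \frac{\bigl(f(x) + (f'(x)+\tilde{s})\,\tilde{\epsilon}\bigr)
                        \bigl(g(x) + (g'(x)+\tilde{t})\,\tilde{\epsilon}\bigr) - f(x)\,g(x)}{\tilde{\epsilon}} \right] \\
 &= {}^{*}\!\left[ f(x)\bigl(g'(x)+\tilde{t}\bigr) + g(x)\bigl(f'(x)+\tilde{s}\bigr)
                  + \bigl(f'(x)+\tilde{s}\bigr)\bigl(g'(x)+\tilde{t}\bigr)\,\tilde{\epsilon} \right] \\
 &= {}^{*}\!\left[ f(x)\,g'(x) + g(x)\,f'(x) + \tilde{u} \right] \\
 &= f(x)\,g'(x) + g(x)\,f'(x)
\end{aligned}
```

In this version the fourth line uses the abbreviation $\tilde{u} = f(x)\,\tilde{t} + g(x)\,\tilde{s} + (f'(x)+\tilde{s})(g'(x)+\tilde{t})\,\tilde{\epsilon}$, which gathers every term that carries an infinitesimal factor.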


A little explanation is needed here, of course, since we've just introduced an entirely new discipline, with new notation and new semantics!

(I should also mention that the notation used here is what I learned in college, and it may not match anything you find in current use in this field.  Hence, there is a double need for an explanation of what I just did...)

Anything with a tilde over it is an infinitesimal (an actual, legitimate one this time).  The "*" operator takes the "real" (non-infinitesimal) part of a value: it returns the real part, leaving off the infinitesimal fuzz.  Two values are equal within the first-order real numbers if their real parts are equal -- in other words, if the "*" operator returns the same value for both.

In hyperreal calculus, we take a derivative by finding the change in a function when we move an infinitesimal amount.  We divide that change by the (infinitesimal) distance moved, and then take the real part of the result.  The proof above shows that the (real part of the) derivative of the product of two functions, given on the first line, is equal to the product of one with the other's derivative plus the product of the second with the first's derivative, which is the point of the exercise.

A couple of the lines in the middle may not be obvious, however.

On the second line, I introduced s and t, which are both (unknown) infinitesimals.  I replaced $f(x + \tilde{\epsilon})$ with $f(x) + (f'(x) + \tilde{s})\,\tilde{\epsilon}$, where $\tilde{\epsilon}$ is the infinitesimal distance moved -- that is, I replaced the value of f at a point infinitesimally far from x with its value at x, plus the sum of the derivative of f and an infinitesimal error term, multiplied by the (infinitesimal) distance moved.  That's legitimate as long as f is well-behaved at x -- and if it's not, then it's not differentiable there either and the proof is a bust no matter how you do it.  And I did the same for g(x), using t for the second unknown infinitesimal.

On the fourth line, "u" suddenly appears, pulled out of a hat, as it were.  But it's nothing very obscure:  I just collected all the terms which I can see are infinitesimal, and lumped them together in a new variable, which I named "u".  I did that because I know that the "*" operator is going to throw away everything that's infinitesimal.  And I lumped every term which was multiplied by any infinitesimal into "u" because I know that any infinitesimal times any (finite) real number is also infinitesimal.

In many ways it's a very cute proof.  It didn't require taking any explicit limits; in fact all it used was simple fractions, straight out of high school algebra -- yet it really is a rigorous proof (unless I botched the algebra).  I haven't done it justice here; it's possible to write it out far more readably than my rather hasty scribble.  But -- and for me, this is a big "but" -- it is wholly non-pictorial.  It's purely symbolic.

So, since the whole point of this section is to get away from the (too often opaque) use of symbols and do as much as possible with pictures, we won't be pursuing the hyperreal path any farther -- and a little farther along, we will see the product rule again, "done right".

Page first posted on 11/04/2007.  Minor typos corrected on 2/27/2008.