 Path:  physics insights > basics > numbers >

## Cardinality of the Real Numbers

On this page, we'll show that the rational numbers are countable, then show that the set of all numbers that can ever be named is countable; we'll then go on to show that the real numbers are uncountable.  We'll also show that the set of all numbers which can ever be named has measure zero, and from that conclude that, if you threw a dart at the number line, the probability that you'd hit a number which could be named would be zero.

### Countability

A set is called countable if it can be placed in 1:1 correspondence with the nonnegative integers.  In other words, if we can assign a different integer to every member of the set, and not have any members left over,  then the set is countable.

Another way to say the same thing is a countable set can be sorted into a sequence which can be indexed by the (nonnegative) integers.

To start with we'll count the rational numbers between 0 and 1, which will show what we mean by this.

### Countability of the Rational Numbers between 0 and 1

Write each rational number as a reduced fraction.  That is, express it as a ratio of integers, "a/b", where a and b have no factors besides 1 in common.  Then sort the set, using the comparison:

(1) If we then write down the list of all rationals between 0 and 1 in the order given by , we can see that each of them must (eventually) appear on the list.  Furthermore, if we number the items on the list using the integers, then for any particular rational, we can easily find an upper bound on the integer it will be assigned.  Call its index on the list n.  Then it's clear that:

(2) This shows that the set of rational values between 0 and 1 is countable.  (Actually determining the precise value assigned to each rational is feasible but tricky, and not necessary for our purposes; all we care about is showing that each rational gets assigned some integer.)

We could take this particular case farther, showing that the set of all rational numbers is countable, and then showing that all Cartesian products of the rationals are countable.  We could then go on and extend this to other sets besides the rational numbers.  However, we're going to take a different approach.

We're going to show that everything we can ever talk about, even in principle, forms a countable set.

### The Set of All Symbols

The Unicode character encoding scheme assigns a number to all symbols commonly in use.  But we can go farther than that.

Consider the set of all symbols ever written down by anyone anywhere on Earth since time began.  That set is clearly finite; in fact it probably has no more than a few million members.

Let's merge all those symbols into a single giant alphabet, which we'll call Ecode (for Extended Unicode).   Using Ecode, we can assign to each symbol ever used anywhere on Earth its own unique integer.

To avoid problems later, we will reserve the value 0 and assign it to no symbol.  (The ASCII null symbol will be given some other code point.)

### The Set of All Finite Strings

With the addition of a few special characters to indicate what direction the text is running in, we are in a position to write any statement -- any string, any document ever written, in fact any document which ever could be written -- in Ecode.

Suppose there are N characters in Ecode, where N is some number on the order of a billion or less.  Then any Ecode character string may be treated as a string of "base N" digits, and may be interpreted as a number written in base N.  (Note that we did not assign the digit value 0 to any symbol, so we don't have problems with ambiguity due to extra leading zeros.)

Conversely, given any integer, if we convert it to base N, we can read off its digits as an Ecode string.

Thus, we have established a 1:1 mapping between a subset of the integers and all possible strings in Ecode.  In particular, we have shown that the set of all possible character strings of finite length is countable.  Consequently, we realize that the set of all things which can be described by use of character strings is also countable.  That includes all possible images (since they can be expressed as Ecode strings -- to see this, consider that they may be character-encoded and attached to email messages), all possible books (a book is just a long character string), and all possible descriptions of anything (a description is, after all, of finite length, and consists of statements and images, all of which can be encoded as finite Ecode strings).

We can summarize this as follows:
An object which can be described by a statement of finite length is nameable.
The set of all nameable objects is countable.

### Numbers with (Finite) Names

Consider any number which we might care to discuss.  To do anything with the number we must be able to specify its value.  That specification is its name, and any number which has such a specification is nameable.

Of course all the integers and all the rationals have names.  Furthermore, all the real numbers we can think of -- in fact, all the real numbers we can ever specify -- must also have names.  As examples, consider three irrational numbers:  π, e, and √2.  Each of these has a name -- that is, it has a finite specification of its value; in fact each of them has many names.  Here are examples which may make it clearer what we mean by names for them:

(3) Each of these names consists of a finite number of symbols, and can certainly be expressed as an Ecode string as we've described them.  Consequently, each of these names corresponds to a unique integer in the mapping we gave.   In a similar way, any other number of which we can speak must have (at least one) name.

And so we see that the set of all numbers which can be named is countable.

### Countability of the Real Numbers

So the set of all real numbers which we could ever name is countable.  But does that set include all real numbers?  No; it's easy to show the set of all real numbers is uncountable.

The usual approach is a diagonalization proof.  I'll give an overview, then a more detailed proof.

Assume we have a "counting" of the real numbers -- a list of all of them, indexed by the integers.  Write out each one as a series of terms, written horizontally, with the list running vertically.  Now run down the list, taking the first term from the first number, the second term from the second number, and so on, sliding out along the diagonal as we go down the list.  For each term, substitute a different value.  Put all the new terms together in a string, and the result must be a number which was not on the list.

Now, for the detailed proof, let's consider all the real numbers between 0 and 1.  Express each one as a decimal expansion, which is  a decimal point followed by a string of decimal digits:

(4) The whole list would look like this:

(5) Now construct a new number, by adding 5 mod 10 to each digit on the diagonal, and fudging to avoid accidentally producing an infinite string of 9's (which would be an alias for a single 1).  We'll map the digits as follows:

(6) We now apply this mapping to the digits d0,0, d1,1, d2,2, ... and so forth, and obtain a new value,

(7) By construction, r differs from rk at the kth decimal place, and so is not equal to any of the values on our list.  From this we conclude that any list of real numbers which is indexed by the integers must be incomplete.

Since the set of nameable numbers is countable, there must, therefore, be real numbers which cannot be named -- they cannot be represented by any finite string of characters.

### Measure of the Nameable Numbers

We have shown that there is at least one real number which cannot be named.  But is there more than one?  Obviously; otherwise we could add the unnamed one to our list and we'd be done.  So just how many unnameable reals are there?

A moment's reflection should be enough to realize that there are an uncountable number of unnameable real numbers.  For, if the set of nameless numbers were countable, we could just index it with the integers.  By indexing it with the even integers, and indexing the nameable numbers by the odd integers, we would obtain a counting for the whole set.

This has strange consequences.  Consider the measure of the interval (0,1).  It's 1 -- the total length of the interval is 1.  (Well, obviously, eh?)  But what is the measure of the nameable numbers?

To find it, let's start with a counting of the nameable numbers -- a list of all of them, indexed by the integers.  Let's center an interval of length a on the first entry in the list, then center an interval of length a/2 on the second entry on the list, a/4 on the third entry, and so forth, dividing the length of the new interval by 2 at each step.  The union of all of those intervals certainly contains all the nameable numbers.  So, the total length -- or the "measure" -- of the nameable numbers must be no larger than the sum of the lengths of those intervals.

We can compute the sum of the lengths of the intervals.  If the total measure of the nameable numbers is L then we must have:

(8) But a was arbitrary -- this is true for any positive value of a.  That can only be true if

(9) So the nameable numbers have total measure zero.  But the length of the interval (0,1) is 1 -- what's taking up all the space?

The space is taken up by the unnameable numbers.  Almost every point on the number line is unnameable.

### A Dart and the Number Line

Suppose we had a dart with an infinitely sharp point, and we threw it at the number line.  It would strike one single value on the line.  Let's pull out the dart and look at the number it hit. What's the probability that the number is one we can name?

The probability that it's nameable is equal to the ratio of the measure of the nameable numbers to the total measure of the number line.

That's zero.  The probability of hitting a number with a name is nonexistent.

The set of numbers that can be named is utterly insignificant in comparison to the set of numbers which cannot be named.

Many people have found this result disturbing.

The rationals are dense.  That means that, in any finite interval, there is at least one rational number.

In particular, between any two different real numbers, there is a rational number.

Suppose we have two unnameable numbers.  Then sandwiched between them there is a rational number.  This is easy to see:  Take the decimal expansions of the two unnameable numbers.  Find the first digit at which they differ.  Copy all the identical digits, copy the larger value for the two "differing" digits, and stop right there.  The new string represents a rational number which lies between the two expansions.

Intuitively, this would seem to indicate that there are exactly as many rational numbers as there are unnameable numbers!  But intuition fails terribly when dealing with uncountable sets.  In this case, intuition has nothing to say about how many unnameable numbers might lie between our "middle-value" rational number and either of the two original unnameable numbers -- and intuition is also mute in the face of this:  You can't select two unnameable numbers to start with! We can never find any unnameable number, because they are, simply, unnameable.  But our argument assumed from the start that we could somehow select two of these values; from there on it's all down hill.

One can obtain a slightly better idea of how large the set of unnameable numbers is by taking the decimal expansions of the values between 0 and 1 and "turning them around", or "reversing" them.  Use the same digits, but put them to the left of the decimal point.  To make this work out nicely we need an extra convention -- we need to add one more digit value for use as a "flag value", so we can write exact values for repeating decimals using a finite number of digits. With that small (and obviously practical) change, we can map every rational fraction to a finite integer by reversing its digits and putting the decimal point at the right hand end.

But what of the unnameable values?  When these are reversed, they extend off to the left, without ever terminating; they represent infinitely large values.  The rationals are represented as finite integers, and the difference between any two "reversed" rationals is finite.  However, the difference between any two unnameable numbers is infinite.  One can count from any reversed rational to any other reversed rational in a finite number of steps, but to count from any unnameable number to any other requires an infinite number of steps.  Intuitively, this suggests that the the ratio of the number of unnameables to rationals must be infinite -- and indeed it is.

But I should hasten to say that you shouldn't try to take this visualization too literally.  It doesn't address the difference between nameable and unnameable irrational numbers, and it makes very sloppy use of the concept of "infinitely large" integers.

Page created on 11/23/2007