Why you can’t square the circle, and why you can’t work out √2 and π exactly


Squaring the circle

“Squaring the circle” is a common phrase which means “solving an impossible problem”. For mathematicians in ancient Greece, it meant constructing, with compass and straight-edge only, a square of exactly the same area as a circle.

They couldn’t do it. We now know that’s because it is impossible, and that some other constructions they couldn’t do, trisecting an arbitrary angle (dividing it into three equal parts) and doubling the cube (constructing a cube with exactly twice the volume of another), are also impossible.

But some constructions they couldn’t do were possible. In 1796, when he was 19, Carl Friedrich Gauss solved a problem which had baffled everyone else for two thousand years, and found a construction for a regular 17-sided polygon. He also made a good guess, which turned out to be correct, about which other regular polygons could be constructed with compass and straight-edge only. (The key is to find which prime-number-sided polygons can be constructed. The ancient Greeks did 3 and 5. Gauss guessed that 7, 11, and 13 were impossible; the next possible ones are 17, 257, and 65,537.)
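Gauss’s guess can be turned into a quick test. The sketch below assumes the usual statement of the rule (a regular n-sided polygon is constructible exactly when n is a power of 2 times a product of distinct primes from the list 3, 5, 17, 257, 65,537) and assumes that those five are the only primes of that special type, which is all anyone has found so far. The function name is made up for this illustration.

```python
# The five known primes of the form 2^(2^k) + 1 ("Fermat primes").
# Assumption: there are no others; none has ever been found.
FERMAT_PRIMES = (3, 5, 17, 257, 65537)

def is_constructible_ngon(n: int) -> bool:
    """True if a regular n-gon can be built with compass and straight-edge."""
    if n < 3:
        return False
    while n % 2 == 0:          # any number of doublings is allowed
        n //= 2
    for p in FERMAT_PRIMES:
        if n % p == 0:
            n //= p
            if n % p == 0:     # the same Fermat prime twice is not allowed
                return False
    return n == 1              # nothing else may be left over

constructible = [n for n in range(3, 21) if is_constructible_ngon(n)]
print(constructible)  # [3, 4, 5, 6, 8, 10, 12, 15, 16, 17, 20]
```

The missing values up to 20 are 7, 9, 11, 13, 14, 18 and 19, which include the primes 7, 11 and 13 mentioned above.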

The Mathematical Sciences Research Institute in Berkeley, California, has 17 Gauss Way as its address, and a display showing the construction for the 17-sided regular polygon

That “squaring the circle” is impossible was proved by the German mathematician Ferdinand von Lindemann in 1882. Lindemann’s other claim to fame is that he taught David Hilbert, generally reckoned the foremost mathematician of the late 19th and early 20th century.

Lindemann in fact proved that π is not algebraic. That is, it cannot be a root of any algebraic equation $a_0 + a_1x + a_2x^2 + \dots + a_nx^n = 0$, for any whole number n ≥ 1 and any whole numbers $a_0, a_1, a_2, \dots, a_n$ which are not all zero. (By contrast, √2 is algebraic: it is a root of $x^2 - 2 = 0$.) We’ll see below why this implies that “squaring the circle” is impossible.

Lindemann’s proof built on a proof by Charles Hermite in 1873 that Euler’s number e, which is about 2.71828, is not algebraic. Hermite was the foremost French mathematician of the late 19th century, a time when France was second only to Germany as a mathematical centre (as today it is second only to the USA).

Unlike Evariste Galois, Hermite did not actually fail his entrance exam to the Ecole Polytechnique, the top university in Paris for maths, but that was only because the examiners had been tipped off in advance that Hermite had already started publishing research papers while at school. The Ecole Polytechnique then threw him off the course after a year because Hermite was lame in his right foot (yes, you read that right), but after some more brilliant research Hermite became a teacher at the same Ecole Polytechnique.

Complex numbers, and translating the geometrical problem into an algebraic one

The theory of complex numbers tells us that the points which can be constructed by compass and straight-edge are the numbers which can be calculated by any combination of addition, subtraction, multiplication, division, complex conjugate, and square root.

Take a starting set of points in the plane. Identify one of them with zero, and another with 1, make a choice about which way is up and which way is down, and we’ve identified all the points with complex numbers.

But then all the points we can construct with compass and straight-edge are all the points which can be calculated from the starting points using any combination of addition, subtraction, multiplication, division, complex conjugate, and square root. All those arithmetical operations correspond to constructions with compass and straight-edge.
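As an illustration of what “calculated using square roots” looks like, here is a sketch of the number behind Gauss’s 17-sided polygon. The nested-square-root expression for cos(2π/17) is the standard closed form usually credited to Gauss; it is quoted here as an assumption rather than derived, and the function name is invented for the example. The code only checks numerically that the formula agrees with the cosine.

```python
from math import cos, pi, sqrt

def cos_2pi_over_17() -> float:
    """cos(2*pi/17) built from whole numbers using only + - * / and square roots."""
    r17 = sqrt(17)
    a = sqrt(34 - 2 * r17)
    b = sqrt(34 + 2 * r17)
    inner = sqrt(17 + 3 * r17 - a - 2 * b)
    return (-1 + r17 + a + 2 * inner) / 16

print(cos_2pi_over_17())   # value from square roots only
print(cos(2 * pi / 17))    # value from the library cosine, for comparison
```

Because every step is an addition, subtraction, multiplication, division or square root, each step corresponds to a compass-and-straight-edge construction. No such expression exists for π.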

In other words, we’d be able to construct a square of area equal to a given circle if we could find a formula for π by applying any combination or repetition of addition, subtraction, multiplication, division, and square root to a whole number.

If we prove that π is not algebraic, we have proved that we can’t find a formula like that, because every number built from whole numbers by those operations is algebraic. And we’ve proved a lot more too.

https://en.wikipedia.org/wiki/Compass-and-straightedge_construction#Compass_and_straightedge_constructions_as_complex_arithmetic

Whole, rational, irrational, transcendental

We learn to count before we can read. All human communities which have no written language (so, no reading and writing) still have counting. Those communities whose languages include very few words for numbers (some, just for 1, 2, 3, 4; at least one, I believe, just for 1 and 2) still count, even if they have to describe 5, say, as two 2s plus 1. So the counting numbers 1, 2, 3, 4… are basic.

We then stretch our imagination about numbers step by step.

First stretch

The first leap of imagination is to think of the counting numbers going on for ever. We usually start by counting the first few numbers. My daughter Molly, when she was about one and a half, would toddle around chanting “one, two, three, four, five, six”, with great glee, even before she could really talk in sentences. Rose remembers how at pre-school she first came across the number eleven, and realised with a shock that numbers went on after ten, and maybe much further.

The mathematician Ron Graham, well-known for compiling the biggest specific number ever used in a mathematical proof (“Graham’s number”, so big that the entire universe isn’t big enough to spell it out, however small we write), has commented:

“The trouble with the whole numbers is that we have examined only the small ones. Maybe all the exciting stuff happens at really big numbers.”

Most of us have difficulty getting a grip on numbers which come up in economics (like 1.76 trillion, the wealth in dollars of the richest 62 individuals in the world, also equal to the total wealth of half the world’s population). “Graham’s number” is much bigger. But if whole numbers go on for ever, then there are infinitely more of them bigger than “Graham’s number” than smaller than it. There are infinitely more of them hugely bigger than “Graham’s number” than smaller than it!

Second stretch

The second stretch is to think of zero as a number – the ancient Romans and Greeks couldn’t do that, and their number systems included no symbol for zero – and to imagine counting backwards before zero, that is, negative whole numbers.

This calls for a bit more imagination. It is harder to draw a picture representing zero or −2 than one representing 2.

“Integers” is the short name for the positive and negative counting numbers, plus zero.

Third stretch

The third stretch is to see fractions in between the integers. Numbers which can be written as fractions are called rational numbers (because they can be equated with ratios).

Fourth stretch

The fourth stretch is to include surds, like √2. You might wonder why in GCSE you are taught special rules for adding and multiplying and dividing surds, when you thought you already knew all about how to add and multiply and divide numbers. Though school maths takes some care about explaining the “stretch” from counting numbers to negative numbers and fractions, surds are sort of slipped in without much comment.

But the fact, the shocking fact, is that √2 is a different sort of number from rational numbers. √2 cannot be written exactly as a fraction or as a decimal (terminating or recurring). It can never be worked out exactly, no matter how powerful your calculator or computer.

The proof of this fact is simple but involves high-powered thinking: https://mathsmartinthomas.wordpress.com/2016/06/22/look-and-see-proofs-that-the-square-root-of-2-is-irrational/.

There is a story that Hippasus, the ancient Greek mathematician said to have first discovered this proof, was drowned by other mathematicians in order to keep the shocking fact secret. The story is probably not true, but the very fact that the story got around shows us how shocking the fact was.


Fifth stretch

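If we draw a number line and mark in all the rational numbers, then there are no gaps that we can see: between any two points on the line, however close together, there is a rational number. However, we haven’t included the surds. Where do they fit in, if there are no gaps?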

One way to answer that question is to stretch our imagination further and think of numbers as ways of chopping the number line into two bits, one to the left and one to the right. Make the left-hand bit include all negative numbers, and all non-negative numbers with squares less than 2, and the right-hand bit all positive numbers with squares greater than 2. Then that way of chopping the number line represents √2.

And the ordinary number 3, for example, is represented by the way of chopping the number line where the left-hand bit includes all numbers less than 3, and the right-hand bit all numbers greater than 3. There is no gap between the left-hand bit and the right-hand bit, but “3” squeezes in between them.
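The chop for √2 can be tried out using nothing but fractions. The sketch below is an illustration under its own assumptions (the function names and the starting fractions 1 and 2 are chosen for the example): it decides which bit a fraction belongs to by squaring it, never by using √2 itself, and then squeezes the chop between two fractions which get closer and closer.

```python
from fractions import Fraction

def in_left_bit(q: Fraction) -> bool:
    """True if q is in the left-hand bit of the chop representing sqrt(2)."""
    return q < 0 or q * q < 2

def squeeze(steps: int):
    """Halve the distance between a left-bit fraction and a right-bit fraction."""
    lo, hi = Fraction(1), Fraction(2)   # 1*1 < 2 and 2*2 > 2
    for _ in range(steps):
        mid = (lo + hi) / 2
        if in_left_bit(mid):
            lo = mid
        else:
            hi = mid
    return lo, hi

lo, hi = squeeze(30)
print(float(lo), float(hi))
```

However many steps we take, `lo` stays in the left-hand bit and `hi` in the right-hand bit; no fraction ever lands exactly on the chop.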

Chop the number line so the left-hand bit includes all the numbers less than the circumference of a circle with diameter=1, and the right-hand bit includes all the numbers greater than that circumference.

Is that chop equivalent to an algebraic number? Lindemann’s proof shows that it is not. Numbers which are not algebraic are called transcendental. π is transcendental.

A twist to the tale

We’ll see the proofs that e and π are transcendental. The proofs are complicated and clever. That also shows that $\pi^2$ and $e^3$ and whatever are transcendental. When we come across particular transcendental numbers in maths, they are usually e and π and others built from them.

The third most used distinct “transcendental” number is γ (gamma), which is the limit as n→∞ of 1+(1/2)+(1/3)+(1/4)+…+(1/n)−ln(n), and about 0.577. But I’ve written “transcendental” in scare-quotes, because no-one has yet proved that γ is transcendental, or even that it is irrational. Mathematicians will be surprised if it’s not transcendental, but…

So, not many distinct numbers proved transcendental. Not many distinctively different transcendental numbers appearing in mathematical equations. And yet in 1874, only a year after Charles Hermite proved e was transcendental, and eight years before π was proved transcendental, Georg Cantor proved that there are infinitely more transcendental numbers than algebraic numbers. Almost all numbers are transcendental.

Cantor found a second proof of this theorem about transcendental numbers some years later. The second proof is more famous, because it is simpler, and because it introduced a new method of proof, Cantor’s diagonal argument, later used by other mathematicians for other important theorems.

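[Figure: Cantor’s diagonal argument]

List all the algebraic numbers between 0 and 1 as $s_1, s_2$, etc. (in binary). Construct the number s (shown in blue in the figure) to have its k’th digit 1 if the k’th digit of $s_k$ is 0, and 0 if the k’th digit of $s_k$ is 1. Then s is different from $s_k$ for every k, so s is a number between 0 and 1 which is not on the list, and therefore not algebraic. The same construction defeats any attempted list of all the numbers between 0 and 1. Therefore the set of all numbers between 0 and 1 is an infinity too big to be countable.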
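The digit-flipping rule is easy to try on a small scale. The four binary strings below are made up for the illustration (a real list would be infinite, with infinitely long strings), and the function name is hypothetical.

```python
def diagonal(rows):
    """Build a string whose k-th digit differs from the k-th digit of row k."""
    return "".join("1" if rows[k][k] == "0" else "0" for k in range(len(rows)))

# A made-up "list" of four numbers, each given by its first four binary digits.
listed = ["0101", "1100", "0011", "1111"]
s = diagonal(listed)
print(s)  # 1000
```

The new string disagrees with the first row in the first digit, with the second row in the second digit, and so on, so it cannot equal any row.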

Another argument for the result, not rigorous enough to be a proof, and in essence more complicated than Cantor’s diagonal, but maybe more helpful first time round, comes from integration.

$\int_0^1 1\,dx = 1$; it’s the area of a 1×1 square.

But that square is made up of lots of lines, each line drawn vertically up from each number on the x-axis between 0 and 1 up to the line y=1.

The algebraic numbers are a “countable” infinity, that is, you can pair them off one-by-one with the counting numbers 1, 2, 3…. But then you can add up the areas of the lines drawn vertically up from the algebraic numbers in the same way as you add an infinite series like 1+(1/2)+(1/4)+(1/8)+(1/16)+….

0+0+0+…. = 0, since each line has area 0. Since the area is in fact 1, there must be a much bigger total infinity of numbers between 0 and 1, including the transcendental numbers, than just the “countable” infinity of algebraic numbers.
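One way to tighten up the “0+0+0+…” step, offered as a sketch rather than a full proof: instead of giving each line zero width, cover the k’th algebraic number $s_k$ with a thin strip. The small positive number ε and the strip widths ε/2^k are choices introduced for this illustration.

```latex
% Cover s_k by a strip of width \varepsilon / 2^k and height 1.
% The total area of all the strips is at most
\sum_{k=1}^{\infty} \frac{\varepsilon}{2^{k}}
  = \varepsilon \left( \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \dots \right)
  = \varepsilon .
```

Since ε can be as small as we like, the lines above the algebraic numbers take up less than any positive area, while the whole square has area 1.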

If we marked in all the algebraic numbers, or even just all the rational numbers, on the number line, then there would be no gaps. But we would have missed out infinitely more numbers than we had included.

While it takes an educated imagination to be able to deal properly even with whole numbers as big as 1.76 trillion, it takes more imagination to deal with all the real numbers.

A digression about transfinite numbers

The “infinity” of real numbers, which is $2^{\aleph_0}$, is bigger than the “infinity” of integers, $\aleph_0$.

Arithmetic with transfinite numbers is different from with finite numbers. So $\aleph_0 + 1 = \aleph_0$, $\aleph_0 + \aleph_0 = \aleph_0$, $\aleph_0 \times \aleph_0 = \aleph_0$, etc. Raising 2 to the power $\aleph_0$ is the first arithmetical operation with $\aleph_0$ which produces a result bigger than $\aleph_0$ itself. (There’s nothing special about 2 here. $3^{\aleph_0} = 2^{\aleph_0}$.)

So maybe $2^{\aleph_0} = \aleph_1$? The “infinity” of real numbers is the next biggest “infinity” after the “infinity” of integers?

After Georg Cantor introduced the idea of these “transfinite” numbers, at first most mathematicians thought yes, this hypothesis, that the “infinity” of real numbers is the next biggest “infinity” after the “infinity” of integers, called the continuum hypothesis, was true. In old age David Hilbert even published what he thought was a proof of that claim: it was the only one of his papers which had to be rejected from his Collected Works because it was seriously wrong.

Then, in 1963, Paul Cohen proved a stunning result. From the ordinarily-used axioms of mathematics, there is no way of deciding whether the continuum hypothesis is true or not. We can say it’s true, or untrue, only by deciding that those axioms do not fully reflect what we know to be true about ordinary numbers, and adding another axiom.

Even more stunningly, Cohen developed a new method of mathematical proof – “forcing”, now widely used – to get his result.

Cohen’s result was the most-talked-about recent discovery in mathematics when I went to uni to study maths, in 1966.

Cohen himself said that he thought it might “eventually come to be accepted that the CH is obviously false”. Here is a video of Cohen, in old age, talking about his work.

Proof that e is irrational

That e is irrational can be proved simply, and Hermite’s proof that e is transcendental is based on the same idea. The idea is, paradoxically, that e is irrational precisely because it is approximated so very well by rational numbers with denominators 3!, 4!, 5!…. n!….

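We go by way of proving that $e^{-1}$ is irrational, which comes to the same thing as proving e irrational.

$$e^{-1} = 1 - \frac{1}{1!} + \frac{1}{2!} - \frac{1}{3!} + \frac{1}{4!} - \dots \mp \frac{1}{n!} \pm \frac{1}{(n+1)!} \mp \dots$$

so

$$n!\,e^{-1} = (\text{a whole number}) \pm \frac{n!}{(n+1)!} \mp \frac{n!}{(n+2)!} \pm \frac{n!}{(n+3)!} \mp \dots$$

$$= (\text{a whole number}) \pm \frac{1}{n+1} \mp \frac{1}{(n+1)(n+2)} \pm \frac{1}{(n+1)(n+2)(n+3)} \mp \dots$$

The “remainder” bit on the right

$$= \pm\left[\frac{1}{n+1} - \left\{\frac{1}{(n+1)(n+2)} - \frac{1}{(n+1)(n+2)(n+3)}\right\} - \{\dots\} - \dots\right]$$

and also

$$= \pm\left[\left\{\frac{1}{n+1} - \frac{1}{(n+1)(n+2)}\right\} + \left\{\frac{1}{(n+1)(n+2)(n+3)} - \frac{1}{(n+1)(n+2)(n+3)(n+4)}\right\} + \{\dots\} + \dots\right]$$

All the terms in curly brackets are positive. So the sum in square brackets is strictly between 0 and $\frac{1}{n+1}$, i.e., if n≥1, the “remainder” bit is strictly between −½ and ½, and it’s not zero. So $n!\,e^{-1}$ is not a whole number. But if $e^{-1}$ were a fraction with denominator n, then $n!\,e^{-1}$ would be a whole number. So, whatever n is, $e^{-1}$ is not a fraction with denominator n. ▇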
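The size of the “remainder” bit can be checked with exact fractions. This is a sketch with invented function names; it cuts the infinite series off after an even number of extra terms (30 by default, an arbitrary choice), which is enough to keep the cut-off sum strictly between 0 and 1/(n+1) in size, just like the full series.

```python
from fractions import Fraction
from math import factorial

def whole_part(n: int) -> Fraction:
    """n! times the series for 1/e up to and including the 1/n! term."""
    return sum(Fraction((-1) ** k * factorial(n), factorial(k))
               for k in range(n + 1))

def remainder(n: int, extra_terms: int = 30) -> Fraction:
    """n! times the next `extra_terms` terms of the series (use an even number)."""
    return sum(Fraction((-1) ** k * factorial(n), factorial(k))
               for k in range(n + 1, n + 1 + extra_terms))

for n in range(1, 8):
    r = remainder(n)
    print(n, whole_part(n), float(r), float(Fraction(1, n + 1)))
```

For each n the first sum really is a whole number, and the remainder is a non-zero fraction smaller in size than 1/(n+1), so adding the two can never give a whole number.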

Charles Hermite

Proof that e is transcendental

A simplified version of Hermite’s proof, worked out by David Hilbert, is explained in Michael Spivak’s textbook Calculus.

It is a proof by contradiction. It starts by assuming that e is algebraic, in other words that for some n and some integers $a_0, a_1, a_2, \dots, a_n$

$$a_0 + a_1e + a_2e^2 + \dots + a_ne^n = 0$$

and then it shows that is impossible.

It does this by choosing a (big) prime p, bigger than n, and, with p, constructing a (big) integer M such that e is approximated by $M_1/M$, $e^2$ is approximated by $M_2/M$, … $e^k$ is approximated by $M_k/M$ … , with all the $M_k$ integers, and all so well that the sum of the remainders $\varepsilon_k = e^k - M_k/M$, each multiplied by the corresponding $a_k$ and also by M, is more than −1 and less than 1.

The proof also shows that $a_0M + a_1M_1 + a_2M_2 + \dots + a_nM_n$ is an integer. And it is not zero, because we prove it is not divisible by p.

Then $a_0M + a_1M_1 + a_2M_2 + \dots + a_nM_n$ plus the remainder term $M \times (\text{sum of the } a_k\varepsilon_k)$ can’t be zero, and thus $a_0 + a_1e + a_2e^2 + \dots + a_ne^n$ can’t be zero.

We never have to say in the proof exactly what p is; it just has to be a prime number much bigger than n.

The proof uses two facts about the integral $\int_0^\infty x^m e^{-x}\,dx$.

The first is that this integral = m! That’s something you find by iterated integration by parts. You learn integration by parts in C4, and iterated integration by parts in FP3. It also means that $\int_0^\infty x^m e^{-x}\,dx$ is much, much bigger for big values of m than for smaller ones.

The second is that most of the value of this integral is gained in the neighbourhood of x=m. You can tell by differentiation that the maximum value of $x^m e^{-x}$ is at x=m. For x much smaller than m, $x^m$ is much smaller than $m^m$, and $e^{-x}$ isn’t that big; as x gets bigger than m, $e^{-x}$ becomes a tinier and tinier fraction, more and more quickly, and reduces $x^m e^{-x}$ to a tiny value however big $x^m$ is.
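Both facts can be checked numerically for small m. The sketch below uses Simpson’s rule, a standard way of estimating an area by adding up thin slices; the cut-off at x=80 and the number of slices are arbitrary choices made for this illustration, and the function names are invented.

```python
from math import exp, factorial

def integrand(x: float, m: int) -> float:
    """The function x^m * e^(-x)."""
    return x ** m * exp(-x)

def simpson(m: int, upper: float = 80.0, steps: int = 80_000) -> float:
    """Estimate the integral of x^m * e^(-x) from 0 to `upper` (steps must be even)."""
    h = upper / steps
    total = integrand(0.0, m) + integrand(upper, m)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h, m)
    return total * h / 3

for m in (3, 5, 8):
    print(m, simpson(m), factorial(m))   # the estimate should be close to m!

# Where is x^5 * e^(-x) biggest, looking at x = 0.0, 0.1, 0.2, ... 19.9?
peak = max(range(200), key=lambda i: integrand(i / 10, 5)) / 10
print(peak)  # 5.0
```

Stopping at x=80 rather than going on for ever loses almost nothing, because by then $e^{-x}$ has crushed the integrand to a negligible size, which is the second fact in action.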

M is a total (with whole-number coefficients) of integrals of type $\int_0^\infty x^m e^{-x}\,dx$, for m=(p−1) to m=(np+p−1), all divided by (p−1)!

$M_k$ is $e^k$ × a similar total of similar integrals, all divided by (p−1)!, but from k to ∞ rather than from 0 to ∞

and so $M\varepsilon_k$ is $e^k$ × a similar total of similar integrals, all divided by (p−1)!, but from 0 to k

The $\varepsilon_k$ are dominated by the integrals for high powers of x, and so long as p is big enough those integrals from 0 to k will be small, because k is much smaller than those high powers.

Another trick is calculating the $M_k$ integrals using integration by substitution, which you learn in C4. That shows that they all come out as integers, and, more than that, all integers divisible by p.

Meanwhile you show that the total of integrals M includes one integral which comes out as an integer not divisible by p (it’s $\pm(n!)^p$, which is not divisible by p if p>n), and all the other integrals which are part of M come out as integers which are divisible by p.

So M itself is not divisible by p, and $a_1M_1 + a_2M_2 + \dots + a_nM_n$ is divisible by p. Therefore, so long as p is also chosen bigger than the size of $a_0$ (and $a_0$ is not zero), $a_0M + a_1M_1 + a_2M_2 + \dots + a_nM_n$ is not divisible by p; therefore it ≠ 0.
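The divisibility claims can be checked exactly for small cases. The sketch below assumes the form of M used in the Hilbert and Spivak version of the proof, namely the integral from 0 to ∞ of $x^{p-1}[(x-1)(x-2)\cdots(x-n)]^p e^{-x}/(p-1)!$, and handles $M_k$ by the substitution u = x − k. The function names and the choice n=2, p=5 are made up for the illustration; such a small p shows the divisibility pattern but is not meant to make the remainders small.

```python
from math import factorial

def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (lowest power first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def hilbert_poly(n, p, shift=0):
    """Coefficients of (u+shift)^(p-1) * [(u+shift-1)...(u+shift-n)]^p in powers of u."""
    poly = [1]
    for _ in range(p - 1):
        poly = poly_mul(poly, [shift, 1])
    for j in range(1, n + 1):
        for _ in range(p):
            poly = poly_mul(poly, [shift - j, 1])
    return poly

def integrate_against_exp(poly, p):
    """Integral of poly(u) e^(-u) from 0 to infinity, divided by (p-1)!.

    Uses the fact that the integral of u^m e^(-u) is m!.
    """
    total = sum(c * factorial(m) for m, c in enumerate(poly))
    quotient, rest = divmod(total, factorial(p - 1))
    assert rest == 0   # the result is always a whole number
    return quotient

n, p = 2, 5
M = integrate_against_exp(hilbert_poly(n, p), p)
M_k = [integrate_against_exp(hilbert_poly(n, p, shift=k), p) for k in range(1, n + 1)]
print(M % p, [m % p for m in M_k])  # 2 [0, 0]
```

M leaves the same remainder on division by 5 as $(2!)^5 = 32$ does, so it is not divisible by p, while each $M_k$ is.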

Proof that π is transcendental

Click here to read a step-by-step explanation of Lindemann’s proof

Lindemann used the equation

$e^{i\pi} + 1 = 0$

which you will learn in FP2

and some facts about formulas involving roots of polynomials (which build on what you learned about sums and products of roots of polynomials in FP1)

to show that if you have an algebraic equation for π, then from it you can build an algebraic equation for e. Therefore, the fact that there is no algebraic equation for e shows that there is no algebraic equation for π. Therefore, you can’t square the circle! ▇