Sunday, November 21, 2021

From Counting to Complex by Inverse and Closure

Walking the path from counting numbers to complex numbers.


Preface

Many years ago I read that Richard Feynman gave a talk to a room full of scientists in which he rederived basic abstract algebra on real numbers in under an hour. Since then I have found that Feynman gives this derivation in a discussion of algebra in his Lectures on Physics.

I'm not going to compete with Feynman, but doing this derivation seemed like a fun challenge to undertake. Below I present my explanation of how one gets to complex numbers based on a few simple concepts: repetition, inverse and closure. Along the way I try to throw in a few comments about abstract algebra. By the end, we will look at Euler's Identity, e^(iπ) + 1 = 0, and maybe make it a little less mystical than it might appear.

It is not necessary for you to understand all of the references to math terms, so you don't need to follow those links unless you want to learn about that concept. Similarly, it is not necessary for you to follow and understand in detail every proof. Hopefully you can simply ignore any parts you don't immediately understand and yet still get something out of the overall presentation.

I walked this path mostly for my own entertainment, but I thought perhaps others might get something out of it. It is quite long and likely contains some errors, so caveat lector.


Introduction

Imagine that none of this stuff exists, so we are making it all up as we go. We are going to define our numbering system from the ground up, gradually building up a structure of definitions and operations that all manage to work together nicely. It's not just by random chance that things work nicely: we are defining our numbers and operations precisely to make them work together nicely.

In the code blocks below, I label each assumption (or definition) with a name such as A1 enclosed in square brackets, like this: [A1]. Lemmas (things which can be proved from the assumptions and are used in later proofs) are labeled similarly but with L rather than A. Other intermediate steps in a proof which are not referenced outside of that proof are labeled similarly but with I. These names may be referenced later to build up additional lemmas. The references look the same, but appear in the text or in comments after an equation rather than before.

Concepts

There are three basic ways we will be extending our system:
  • Repetition: performing the same operation many times. For example, multiplication is repeated addition.
  • Inverse: an operation that has the opposite effect of some other operation. For example, subtraction is the inverse of addition.
  • Closure: the results of an operation are in the same set as the operands. For example, the natural numbers (or positive integers) are closed under addition, because you can add any two natural numbers and get another natural number; but they are not closed under subtraction, because there are some expressions on natural numbers using subtraction whose results are not natural numbers, such as (3 - 5).

Preview

Here is the quick preview of how we will move from counting to complex:
  • start with zero and the successor function
  • repeated successors yields counting and the natural numbers
  • repeated counting yields addition
  • inverse of addition yields subtraction
  • closure on subtraction yields negative numbers
  • repeated addition yields multiplication
  • inverse of multiplication yields division
  • closure on division yields rational numbers
  • repeated multiplication yields exponentiation
  • inverse of exponentiation yields logarithms
  • closure on exponentiation with positive rational numbers yields real numbers
  • closure on exponentiation with negative rational numbers yields complex numbers
  • all of our operations on complex numbers are already closed, so we are done
If you enjoy playing with math you might want to try doing all of these derivations yourself before reading my derivations.

Counting

At the most basic level, we start with some simple assumptions, which happen to be a subset of the Peano axioms.

We define a starting point for counting. Historically, people typically started with one, but for later simplicity in this exercise we start with zero. We define a successor function s(x) that takes a number x and produces the next number, which by definition is distinct from x.
[A1] zero exists
[A2] given x, s(x) generates another number, where s(x) is not the same as x
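As an aside for programmers: these two axioms are already enough to represent numbers in code. Here is a minimal Python sketch (the names zero and s are my own, not part of the formal system) in which a number is just a nested stack of successor applications:

    # [A1] zero exists: we pick an arbitrary token to stand for it.
    zero = ("zero",)

    # [A2] s(x) generates another number distinct from x: wrapping x in a
    # tuple always yields a value that is not equal to x itself.
    def s(x):
        return ("s", x)

    three = s(s(s(zero)))   # the number we will later name 3
    print(three)            # ('s', ('s', ('s', ('zero',))))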

Equals

We define an equals operator (=) so that the statement a=a is true, and the statement a=b means that, for any true statement containing a, we can replace any or all instances of a by b and the resulting statement will also be true. We further assume that if a=b is false, then the same replacements as described above will generally (but not always) yield a false statement.
[A3] a=a is true for all a
[A4] a=b is a replacement rule (described above)
The equals operator is:
  • Reflexive: a=a (by definition)
  • Symmetric: if a=b then b=a. Starting with the true statement a=a and the predicate a=b, by our definition of equals we can replace any instance of a by b in a=a and still have a true statement; we chose to replace the first a by b, yielding b=a.
  • Transitive: if a=b and b=c, then a=c. Taking the assumed true statement a=b, and applying our equals rule using the second statement b=c, we replace b by c in the first statement, yielding a=c.
[L5.1] if a=b then b=a (demonstrated above)
[L5.2] if a=b and b=c then a=c (demonstrated above)
For convenience, we define the not-equals operator != to be false whenever equals on the same values is true, and vice versa.

The above definition also leads almost directly to one of the common ways of solving algebraic equations: performing the same operation to both sides of an equation, such as adding the same number to both sides of an equation, or multiplying both sides by the same number. Here's an example of adding the same amount to both sides of an equation.
a = b            Assume this is the equation we are working with
a + c = a + c    True by definition [A3]
a + c = b + c    From [A4]
Note that this works for any function:
[I6.1] a = b           Assume this is the equation we are working with
[I6.2] f(a) = f(a)     True by definition [A3]
[I6.3] f(a) = f(b)     From [A4], using [I6.2] as a starting equation and [I6.1] as our replacement rule
[L6.4] if a = b then f(a) = f(b) for any f defined for a
f(x) might be 2*x, x+3, sin(x), or anything else we desire. Thus we can start with any true equation, perform the same valid operation on both sides, and still have a true equation.

Natural Numbers

Given our previously defined starting point of zero, we now define the natural numbers:
[A7.0] 0 = zero
[A7.1] 1 = s(0)
[A7.2] 2 = s(1)
[A7.3] 3 = s(2)
etc. to infinity.
By definition, s(x)!=x, so 1!=0, 2!=1, etc. Note that we did not assume that repeated application of s(x) would not eventually give us the same number. Without that assumption it is possible that, for example, s(s(s(x)))=x, or in other words, 3=0. This yields a "modulo" system, which can be useful. But for this particular exposition, I want to use the "normal" numbers, so we will add the assumption that s(x) is never equal to any previous value in the sequence. More precisely, we assume:
[A8] For any x, repeated application of the successor function any number of times will never generate x.
We have now defined an unending stream of distinct numbers, each of which is a successor to one other number.

Greater Than

We next define the relational operators less than (<) and greater than (>) with the following statements:
[A9] s(a) > a
[A10] if (a > b) and (b > c) then (a > c)
[A11] (b < a) always has the same truth value as (a > b)
We are now at the point where we can count and know (by definition) that each time we count we get a number that is greater than all of the previous numbers. We can start with any number and count up from there by repeated application of the successor function. For example, if we start with 4 (which is s(s(s(s(zero))))), we can count up by three by applying the successor function three times, to get s(s(s(4))), which we can calculate is 7. This gets unwieldy pretty fast. To make this simpler, let's define an "addition" operator + that gives us the same results as repeated counting.

Addition

We define the addition operator (+) as follows:
[A21] a + 0 = a
[A22] a + s(b) = s(a + b)
Some quick examples:
[L23.1] a + 1 = a + s(0) = s(a + 0) = s(a)
[L23.2] a + 2 = a + s(1) = s(a + 1) = s(s(a))
Since s(a) = a+1, we also have
[L23.3] a + s(b) = a + (b+1)
[L23.4] s(a + b) = (a + b) + 1
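The two defining axioms translate directly into a recursive function. A small sketch, reusing the tuple encoding of the successor function from earlier (again, the names are my own):

    zero = ("zero",)
    def s(x):
        return ("s", x)

    def add(a, b):
        if b == zero:            # [A21] a + 0 = a
            return a
        return s(add(a, b[1]))   # [A22] a + s(b') = s(a + b')

    two = s(s(zero))
    three = s(s(s(zero)))
    print(add(three, two))       # s(s(s(s(s(zero))))), i.e. 5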
For some of what we want to do below, we are going to need to use the rule of induction:
[A24] If an equation is true for a known value of n, and it can be demonstrated to be true for n+1 for any n when true for n, then it is true for all natural numbers x where x > n.

Associative

We now show that our addition operator is associative. We want to prove that (a+b)+n = a+(b+n) for all n. We start by showing this is true for n=1, then use induction:
[L25.1] a + (b + 1) = (a + b) + 1              From [A22], [L23.3] and [L23.4]
[I25.2] a + (b + n) = (a + b) + n              Inductive assumption, true for n=1
        a + (b + (n + 1)) = a + ((b + n) + 1)  From [L25.1] on (b+(n+1))
                          = (a + (b + n)) + 1  From [L25.1] with (b+n) for b
                          = ((a + b) + n) + 1  From [I25.2] applied to (a+(b+n))
                          = (a + b) + (n + 1)  From [L25.1] in reverse with (a+b) for a and n for b
[L26] a + (b + c) = (a + b) + c                Above lines summarized, with c for n+1
Thus by induction we have our proof of associativity.

Commutative

We use a similar approach to show that addition is commutative, such that a+b=b+a. We start by showing that 0 commutes with a for any a.
[I27.1] 0 + 0 = 0                  From [A21] with 0 for a
        0 + 1 = 0 + s(0)           From [L23.1]
              = s(0 + 0)           From [A22] with 0 for a and b
              = s(0)               From [I27.1]
              = 1
[L27.2] 0 + 1 = 1                  Summary of the above few lines
[I27.3] 0 + n = n                  Inductive assumption, true for n=1 from [L27.2]
[I27.4] 0 + (n + 1) = (0 + n) + 1  From [L26]
[I27.5] 0 + (n + 1) = n + 1        By induction from [I27.3] and [I27.4]
[L27.6] 0 + a = a                  From [I27.5] with a for n+1
[I27.7] 0 + a = a = a + 0          From [L27.6] and [A21]
[L27.8] 0 + a = a + 0              From [L5.2]
Now we show that 1 commutes with any number by induction.
1 + (n + 1) = 1 + s(n)     From [L23.1] on (n+1) with n for a
            = s(1 + n)     From [A22] with 1 for a and n for b
            = s(n + 1)     From inductive assumption that 1 commutes with n, known true for n=0
            = n + s(1)     From [A22] with n for a and 1 for b
            = n + (1 + 1)  From [L23.1] on s(1) with 1 for a
            = (n + 1) + 1  From [L25.1]
[L28] 1 + a = a + 1        Summary of the above with a for n+1
Finally, we use induction again to show that any two numbers commute.
a + (n + 1) = (a + n) + 1  From [L25.1]
            = (n + a) + 1  From inductive assumption that a commutes with n, known true for n=1 by [L28]
            = n + (a + 1)  From [L25.1]
            = n + (1 + a)  From [L28]
            = (n + 1) + a  From [L25.1]
[L29] a + b = b + a        Summary of the above with b for n+1
As a final note for addition, since we have demonstrated that (a+b)+c=a+(b+c), we can omit the parentheses when adding multiple terms without creating any ambiguity.
[A30] a + b + c = (a + b) + c = a + (b + c)
Repeated application of this rule can be used for addition with four or more terms without parentheses. By combining this rule with [L29] commutative law, we can see that we can take an expression with multiple terms added together, such as a + b + c + d + e and rearrange and group the terms any way we want.

The associative rule also makes it easy to calculate our addition facts. We already know that 1=0+1, 2=1+1, 3=2+1 etc from our definitions [A7] with [L23.1]. That lets us fill in the first row of our addition fact table. We can then calculate all of the n+2 values based on the n+1 values, and repeat ad infinitum for the rest of the numbers.
n + 2 = n + (1 + 1) = (n + 1) + 1
n + 3 = n + (2 + 1) = (n + 2) + 1
n + 4 = n + (3 + 1) = (n + 3) + 1
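That fill-in process is easy to mimic in code. A sketch using ordinary Python integers, where each new column of the table comes from the previous column plus 1:

    # First column comes from the definitions: n + 1 = s(n).
    table = {(n, 1): n + 1 for n in range(10)}

    # Each later column uses n + k = (n + (k - 1)) + 1.
    for k in range(2, 10):
        for n in range(10):
            table[(n, k)] = table[(n, k - 1)] + 1

    print(table[(4, 3)])   # 7, matching s(s(s(4))) from earlier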
Wikipedia has proofs of associativity and commutativity of addition, which are similar to mine but actually a little more concise, and here is a proof of commutativity that does not rely on associativity - but I wanted to think through these derivations myself and present them here in-line with the rest of my exposition.

Identity

At this point we know that a+0=a [A21] and 0+a=a [L27.6], or in other words adding zero to any number (on either side, since we showed addition is commutative) yields that number. This is an interesting enough fact that we will give this number a special name: the Identity for addition.

It's easy to show that there is only one identity for addition.
Assume two identity values e and f. Consider the expression e+f. Because e is an identity, e+f=f. Because f is an identity, e+f=e. Therefore e=f. [L31] Since this is true for any two identities, all are in fact the same one identity.

Algebra

We have built up our concepts in layers, like building a house: we set a foundation with zero and the successor function, put in some rim joists with the natural numbers, and laid on some flooring with the addition operator and its identity element. We have created a little structure from our concepts. Whereas a house is a physical structure, this is an algebraic structure.

It turns out that this algebraic structure is useful enough that mathematicians have given this kind of structure a name: a monoid. A monoid has these characteristics (with our case in parentheses):
  • It has a set of elements (the natural numbers).
  • It has a binary operation on those elements (the + operator).
  • The operation is associative (+ is associative).
  • The operation is closed (adding two natural numbers always produces another natural number).
  • It has an identity element (zero).
There are a few rules from the above section that we will use often enough that we want to reference them by name rather than lemma number. We use the first letter of the name of the characteristic, followed by the operator character.
[a+] a + (b + c) = (a + b) + c    [L26] Associativity of addition
[c+] a + b = b + a                [L29] Commutativity of addition
[i+] a + 0 = 0 + a = a            [A21], [L27.6] Identity for addition

Subtraction

At this point we have the ability to perform addition, which allows us to calculate a value for x in such equations as x = a + b. But we don't yet have the ability to solve for x in the equation a + x = b. We want to add an operation that is the opposite of addition. In other words, if we start with a and add b to it, we want to be able to take the result and perform another operation using b in order to get back to a. An operator that has this characteristic is called an inverse. We are going to define an operation that is the inverse of addition. We will call that operation subtraction, and we will use the dash character (-) as the operator.

Before we defined addition, we already had the successor function [A2] and we defined the numbers [A7] in terms of the successor function. We defined addition with two axioms [A21] and [A22], then showed that adding 1 to any number is the same [L23] as applying the successor function. Including the successor function and the definitions of the numbers in terms of the successor function, we really had four pieces going into the definition of addition.

We could follow the same path and define a predecessor function that is the inverse of the successor function, but instead we will skip that step and work in terms of adding and subtracting 1 instead of successor and predecessor functions.

We define our subtraction operator (-) recursively, similarly to how we defined the addition operator, using an additional axiom [A41.1] in place of defining a predecessor function p(x):
[A41] a - 0 = a
[A41.1] (a + 1) - 1 = a
[A42] a - (b + 1) = (a - b) - 1
So let's see how this works:
3 - 0 = 3                From [A41]
3 - 1 = (2 + 1) - 1 = 2  From [A41.1], and since 3 is the successor to 2 (i.e. 3 = 2+1)
3 - 2 = 3 - (1 + 1) = (3 - 1) - 1 = 2 - 1 = (1 + 1) - 1 = 1
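In the same sketch style as before, subtraction becomes a recursive function. Note how it fails when the result would fall outside the natural numbers, which foreshadows the closure problem discussed below:

    zero = ("zero",)
    def s(x):
        return ("s", x)

    def sub(a, b):
        if b == zero:              # [A41] a - 0 = a
            return a
        c = sub(a, b[1])           # [A42]: a - (b' + 1) = (a - b') - 1
        if c == zero:              # we would need 0 - 1: no rule applies
            raise ValueError("not closed under subtraction")
        return c[1]                # [A41.1] (c' + 1) - 1 = c'

    three = s(s(s(zero)))
    two = s(s(zero))
    print(sub(three, two))         # ('s', ('zero',)), i.e. 1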

Associative

We want to prove the associative laws for subtraction so we know how we can transform various combinations of parentheses and operators. We already know about a + (b + c), so there are three other possible combinations of + and - with the parentheses in the same position:
  • a - (b + c)
  • a + (b - c)
  • a - (b - c)
We start with a - (b + c).
[L43.1] a - (b + n) = (a - b) - n              Inductive assumption, true for n=1 from [A42]
        a - (b + (n + 1)) = a - ((b + n) + 1)  From [a+]
                          = (a - (b + n)) - 1  From [A42]
                          = ((a - b) - n) - 1  From [L43.1] on (a-(b+n))
                          = (a - b) - (n + 1)  From [A42] with (a-b) for a and n for b
[L43.2] a - (b + c) = (a - b) - c              Above lines summarized, with c for n+1
Next we do a + (b - c), which we do by induction after first doing a + (b - 1).
(a + (n + 1)) - 1 = ((a + n) + 1) - 1  From [a+]
                  = a + n              From [A41.1] with a+n for a
                  = a + ((n + 1) - 1)  From [A41.1] with n for a
[L44] (a + b) - 1 = a + (b - 1)        Above lines summarized, with b for n+1
[L45.1] a + b = a + (b - 0)                    From [A41] with b for a
[L45.2] a + b = (a + b) - 0                    From [A41] with (a+b) for a
[L45.3] a + (b - 0) = (a + b) - 0              From [L45.1] and [L45.2] by [A4]
[L45.4] a + (b - n) = (a + b) - n              Inductive assumption, true for n=0 by [L45.3]
        a + (b - (n + 1)) = a + (b - (1 + n))  From [c+] with n for a and 1 for b
                          = a + ((b - 1) - n)  From [L43.2] on b-(1+n)
                          = (a + (b - 1)) - n  From [L45.4] with b-1 for b
                          = ((a + b) - 1) - n  From [L44]
                          = (a + b) - (1 + n)  From [L43.2] with a+b for a, 1 for b, n for c
                          = (a + b) - (n + 1)  From [c+] with n for a and 1 for b
[L45.5] a + (b - c) = (a + b) - c              Above lines summarized, with c for n+1
Finally we tackle a - (b - c), which we build up to through quite a few lemmas.
[L46.1] 0 - 0 = 0                              From [A41] with 0 for a
[L46.2] (0 + 1) - 1 = 0                        From [A41.1] with 0 for a
[L46.3] 1 - 1 = 0                              From [L27.2] on 0+1
[L46.4] n - n = 0                              Inductive assumption, true for n=1 from [L46.3]
        (n + 1) - (n + 1) = (n + 1) - (1 + n)  From [c+]
                          = ((n + 1) - 1) - n  From [L43.2] with n+1 for a, 1 for b, n for c
                          = n - n              From [A41.1] on (n+1)-1 with n for a
                          = 0                  From [L46.4]
[L46.5] a - a = 0                              Above lines summarized, with a for n+1
a - b = a - (b + 0)              From [i+] with b for a
      = a - (b + (n - n))        From [L46.5] with n for a
      = a - ((b + n) - n)        From [L45.5] with b for a, n for b and c
      = a - ((n + b) - n)        From [c+]
      = a - (n + (b - n))        From [L45.5]
      = (a - n) - (b - n)        From [L43.2]
[L47] a - b = (a - n) - (b - n)
Substituting a = (c + n), b = (d + n) in [L47] yields
[L48.1] (c + n) - (d + n) = ((c + n) - n) - ((d + n) - n) = c - d
[L48.2] c - d = (c + n) - (d + n)    From [L48.1], last and first parts
(a - n) + n = n + (a - n)  From [c+]
            = (n + a) - n  From [L45.5]
            = (a + n) - n  From [c+]
            = a + (n - n)  From [L45.5]
            = a + 0        From [L46.5]
            = a            From [i+]
[L49] (a - n) + n = a      Above lines summarized
a - (b - c) = (a + c) - ((b - c) + c)  From [L48.2] with a for c, b-c for d, c for n
            = (a + c) - b              From [L49] with b for a and c for n
            = (c + a) - b              From [c+] on a+c
            = c + (a - b)              From [L45.5]
            = (a - b) + c              From [c+]
[L50] a - (b - c) = (a - b) + c        Above lines summarized
We now have all of our rules of association for addition and subtraction. The following four equations, repeated from above, show all eight possible combinations of + and - operators and grouping of three variables.
[L26] a + (b + c) = (a + b) + c
[L43.2] a - (b + c) = (a - b) - c
[L45.5] a + (b - c) = (a + b) - c
[L50] a - (b - c) = (a - b) + c
Earlier we saw that, because of [L26], we can write a + b + c and know that it is unambiguous. But that is not true if we write a - b - c, because the statement (a - b) - c = a - (b - c) is not in general true. In order to be able to write fewer parentheses, we arbitrarily choose to have a - b - c mean the same thing as (a - b) - c.
[A51] a - b - c = (a - b) - c
We have specified that the middle variable (b in our equation), following the - operator, should be grouped with the variable on its left, so we call the - operator left-associative; but we generally say it is not associative, meaning it does not associate both ways as does addition.

Unlike addition, subtraction is not commutative, and it has no identity. More precisely, we could say that zero is a right identity for subtraction, but since it is not also a left identity, it is not a simple identity and we usually don't mention it.

Negative Numbers

You may already have noticed that adding the subtraction operator to our structure has created a bit of a problem: we are now able to write expressions which we can not evaluate within our structure. For example, the expression 2 - 4 can not be reduced to a single natural number. When we reduce this equation according to our rules, we eventually get to the point where we need to solve for 0 - 1, and we have no rule to reduce that any further. In other words, our system is no longer a closed system: to state the problem more precisely, the natural numbers are not closed under subtraction.
A pet peeve of mine: elementary school math teachers who tell their students "You cannot subtract 5 from 3." This statement is misleading in its imprecision, since it can be solved with the use of negative numbers. Math is a precise field. The correct statement should include that qualification: "You cannot subtract 5 from 3 using the counting numbers we are studying."

Likewise for other incorrect statements such as "You can not divide 3 by 2" and "You can not take the square root of -4."
We would like to be able to solve any equation we can write with our subtraction operator, so we will define new numbers that we can use for that purpose. We call these numbers negative numbers. We choose to write them using the same digits as we write our natural numbers, with a leading - character, such as -1 and -2.

In our house-building analogy, so far we have built a little house from the foundation upwards, and now we realize we need some more support in order to finish subtraction. Adding negative numbers is like adding another room to that house: in order to have a solid structure, we need to extend our foundation. To save on design work, we are going to reuse the same basic plan as we used when we built up the natural numbers. This is like using the same blueprint for the second room of our house as for the first, except in mirror image because we find symmetry pleasing. Here is a little diagram:
[Diagram: a series of small house outlines built from labeled blocks, one per stage:
 1. Natural Numbers; 2. Addition on Naturals; 3. Subtraction (oops!);
 4. Negative Numbers; 5. Addition on Negatives; 6. Completion of Subtraction]
Thus we go back to the beginning of our derivation of natural numbers. To distinguish our original numbers from our newly defined negative numbers, we will call all of the numbers generated by our successor function (that would be all numbers 1 and above) the positive numbers. We will call the collection of all of these numbers (positive, negative and zero) the integers. We will call the characteristic of being "positive" and "negative" the sign of the number.

Since we want our rules to apply to all integers, we start by stating that in any of our previous assumptions and derivations, a variable name can refer to any integer unless the specific proof or assumption states otherwise (such as for induction proofs).

We started by defining a successor operator s(x) [A2], and we now define a corresponding predecessor operator p(x) that generates our negative numbers in a way which is symmetric to s(x):
[A61] given x, p(x) generates another number, where p(x) is not the same as x
In all of our original assumptions and following proofs, we now state that variable names in those assumptions refer to any integer. We define the predecessor function as the inverse of the successor function and vice-versa. In other words:
[A62.1] p(s(a)) = a
[A62.2] s(p(a)) = a
We define our negative numbers in the same way as we defined our natural (positive) numbers [A7]:
[A63.1] -1 = p(0)
[A63.2] -2 = p(-1)
[A63.3] -3 = p(-2)
etc. to negative infinity.
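One way to realize this symmetric scheme in the running Python sketch is to keep numbers normalized so that successor and predecessor always cancel, which is exactly [A62.1] and [A62.2] (names mine):

    zero = ("zero",)

    def s(x):
        # [A62.2] s(p(a)) = a: a leading "p" is cancelled instead of wrapped.
        return x[1] if x[0] == "p" else ("s", x)

    def p(x):
        # [A62.1] p(s(a)) = a
        return x[1] if x[0] == "s" else ("p", x)

    minus_two = p(p(zero))   # ('p', ('p', ('zero',)))
    print(s(minus_two))      # ('p', ('zero',)), i.e. -1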
We take our no-duplicates assumption [A8] on the successor function and state it for the predecessor function:
[A64] For any x, repeated application of the predecessor function any number of times will never generate x.
For the relational operators, we can derive their meaning relative to the predecessor operator:
s(a) > a            [A9]
p(s(a)) > p(a)      Apply p(x) to both sides
[L6.6] a > p(a)     From [A62.1]
[L65] p(a) < a      From [A11]

Addition

We add to our definition of Addition ([A21] and [A22]) to handle negative numbers, and we extend our induction assumption [A24] to negative numbers:
[A71] a + p(b) = p(a + b)
[A72] If an equation is true for a known value of n, and it can be demonstrated to be true for n+(-1) for any n when true for n, then it is true for all integers x where x < n.
For each of our original assumptions through addition, we have now added similar assumptions to handle our negative numbers. All of our assumptions are completely symmetrical: take any of the original assumptions, replace successor by predecessor, replace 1 by -1, and exchange < with >, and you will get the equivalent assumption for our negative numbers. Because all of our other proofs in those sections are based on those assumptions, the symmetric proofs for negative numbers follow from the symmetric assumptions in exactly the same way as for the natural numbers. Thus all of the results and conclusions in those sections are valid for addition of negative numbers: commutative, associative, identity, algebra.

We list the results of one lemma here, leaving the details of the derivation as an exercise to the reader:
[L73] a + -1 = p(a)
We derive a couple of other useful results:
p(s(a)) = a        [A62.1]
p(a + 1) = a       From [L23.1]
(a + 1) + -1 = a   From [L73]
a + (1 + -1) = a   From [a+]
(1 + -1) = 0       Since the additive identity is unique [L31]
[L74] -1 + 1 = 0   From [c+]
(1 + -1) = 0              From [L74]
n + -n = 0                Inductive assumption, true for n=1 from [L74]
(n + -n) + (1 + -1) = 0   From [i+], because (1 + -1) = 0
(n + 1) + (-n + -1) = 0   Rearranged by [a+] and [c+]
(n + 1) + (-(n+1)) = 0    From the definition of p(x), since -n + -1 = p(-n) = -(n+1)
[L75] a + -a = 0          Above lines summarized, with a for n+1
The above statement says that, for any element a in our set of natural numbers, there is an element -a (a negative number, negative a) which can be added to that natural number to produce zero (our identity element). We call negative a the inverse element of a, and likewise a is the inverse element of -a.
-a + a = 0            From [L75] and [c+]
(-a + a) - a = 0 - a  Subtract a from each side
-a + (a - a) = 0 - a  From [L45.5]
[L76] -a = 0 - a      From [L46.5] and [i+]
a + -a = 0              From [L75]
(a + -a) - -a = 0 - -a  Subtract -a from each side
a + (-a - -a) = 0 - -a  From [L45.5]
a = 0 - -a              From [L46.5] and [i+]
[L76.1] a = -(-a)       From [L76]
a + -b = a + (0 - b)  From [L76]
       = (a + 0) - b  From [L45.5]
       = a - b        From [i+]
[L77] a + -b = a - b

Subtraction

As with addition, we note that we can create a set of symmetric assumptions using negative numbers in place of positive numbers, so that all of our results and conclusions of subtraction on positive numbers also work on negative numbers.

For improved symmetry with the definition of addition, we restate our assumptions defining subtraction to use the successor and predecessor functions, and we add a symmetric assumption that covers negative numbers. We no longer need (a+1)-1=a [A41.1] as an assumption for subtraction, because it is equivalent to p(s(a))=a [A62.1]. Since these assumptions are just a rewriting of our original assumptions for subtraction, all of our derivations remain the same.
[A41] a - 0 = a            Repeat of original [A41]
[A81] a - s(b) = p(a - b)  [A42] restated in terms of s and p
[A82] a - p(b) = s(a - b)  Symmetric assumption to [A81]

Algebra

With the addition of negative numbers to our structure, our set is closed with respect to subtraction. We now have a set (the integers) with an associative binary operator (+) with an identity (0) and inverse elements (the negative numbers). This algebraic structure is called a group. Because our operator (addition) is commutative, our algebraic structure is an abelian group. The group, however, ignores the subtraction operator.

Multiplication

Once we start using addition for real tasks, we find that we are often adding the same number many times, such as 3+3+3+3. Because this is so common, we would like to define a shortcut - a new operator - that means the same thing. We call this operation multiplication.

There are various conventions for how the multiplication operator is written: x, * and dot are common, and in some cases a convention is adopted that two variables written next to each other with no operator between them are to be multiplied. Most computer programming languages use the asterisk character (*), and I will use that here.

In order to have as much symmetry as we can, and to minimize our design work, we will define multiplication using a similar approach as we did when we defined addition:
[A101] a * 0 = 0
[A102] a * (b + 1) = (a * b) + a
[A103] a * (b - 1) = (a * b) - a
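Here is a sketch of these three axioms as a recursive Python function, using built-in integers for brevity rather than the Peano encoding:

    def mul(a, b):
        if b == 0:                    # [A101] a * 0 = 0
            return 0
        if b > 0:
            return mul(a, b - 1) + a  # [A102] a * (b' + 1) = (a * b') + a
        return mul(a, b + 1) - a      # [A103] a * (b' - 1) = (a * b') - a

    print(mul(3, 4), mul(3, -4))      # 12 -12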
We could equivalently have used a slightly different formulation for [A103] in which we add -1 rather than subtracting 1, as supported by [L77]:
a * (-1) = a * (0 - 1)  From [L76]
         = (a * 0) - a  From [A103] with 0 for b
         = 0 - a        From [A101]
         = -a           From [L76]
[L104.1] a * -1 = -a    Above lines summarized
a * (b + -1) = a * (b - 1)                  From [L77]
             = (a * b) - a                  From [A103]
             = (a * b) + -a                 From [L77]
             = (a * b) + (a * -1)           From [L104.1]
[L104.2] a * (b + -1) = (a * b) + (a * -1)  Above lines summarized
If the second operand is negative, we can factor that out and we see that it changes the sign of the result.
a * -n = -(a * n)                 Inductive assumption, true for n=1 from [L104.1]
a * -(n + 1) = a * (-n - 1)       Since -(n+1) = -n + -1 = -n - 1 by [L77]
             = (a * -n) - a       From [A103]
             = -(a * n) - a       From the inductive assumption
             = 0 - (a * n) - a    From [L76]
             = 0 - ((a * n) + a)  From [L43.2]
             = 0 - (a * (n + 1))  From [A102]
             = -(a * (n + 1))     From [L76]
[L104.3] a * -b = -(a * b)        Above summarized, with b for n+1
[L104.4] -a * b = -(a * b)        Swap a with b and use [c*]
-a * -b = -(-a * b)       From [L104.3] with -a for a
        = -(-(a * b))     From [L104.4]
        = a * b           From [L76.1]
[L104.5] -a * -b = a * b  Above lines summarized

Identity and Zero

By setting b=0 in [A102], we see that 1 is a right-identity for multiplication:
a * (0 + 1) = (a * 0) + a  From [A102] with 0 for b
a * 1 = 0 + a              From [i+] on the left side, [A101] on the right side
[L105] a * 1 = a           From [i+]
We show by induction that zero multiplied on either side gives zero:
[L106.1] 0 * 0 = 0                  From [A101] with 0 for a
[L106.2] 0 * n = 0                  Inductive assumption, true for n=0
[L106.3] 0 * (n + 1) = (0 * n) + 0  From [A102] with 0 for a, n for b
[L106.4] 0 * (n + 1) = 0 + 0        From [L106.2]
[L106.5] 0 * (n + 1) = 0
[L106.6] 0 * a = 0                  Above summarized, with a for n+1
By doing the same proof using [A103] we can conclude that [L106.6] holds for all integers.

We show that 1 is a left identity:
1 * 1 = 1                  From [L105] with a=1
1 * n = n                  Inductive assumption, true for n=1
1 * (n + 1) = (1 * n) + 1  From [A102] with a=1 and b=n
            = n + 1        From the inductive assumption
[L106.8] 1 * a = a         Above summarized, with a for n+1
Since 1 is both a left identity and a right identity, we can drop the handedness and just refer to it as an identity.

With addition we had one special number, 0, which when added to any number yielded that number. With multiplication we see that we have two special numbers: the number 1 is an identity for multiplication, but 0 is also special, since anything multiplied by 0 yields 0. We choose to use the word "zero", when associated with a specific operation such as multiplication, to mean a value that, when given as an operand to that operator, always yields zero. Our multiplication operator has only one zero, but other systems and operators may have more than one zero.

By the same argument [L31] as for the additive identity, we can see that there is only one multiplicative identity and only one multiplicative zero.

Distributive

We show that multiplication is distributive over addition by induction:
[L107.1] a * (b + 0) = a * b = (a * b) + 0 = (a * b) + (a * 0)
         a * (b + 1) = (a * b) + a                    From [A102]
         a * (b + 1) = (a * b) + (a * 1)              From [L105] on the rightmost a
[L107.2] a * (b + n) = (a * b) + (a * n)              Inductive assumption, true for n=1
         a * (b + (n + 1)) = a * ((b + n) + 1)        From [a+]
                           = (a * (b + n)) + a        From [A102]
                           = ((a * b) + (a * n)) + a  From [L107.2]
                           = (a * b) + ((a * n) + a)  From [a+]
                           = (a * b) + (a * (n + 1))  From [A102]
[L107.3] a * (b + c) = (a * b) + (a * c)              Above summarized, with c for n+1
The above proof can be repeated using -1 instead of 1 (by [L104.2]), so [L107.3] covers all integers.

Using the same proof steps using [A103] rather than [A102] demonstrates that multiplication distributes over subtraction as well. Since by [L77] subtraction is the equivalent of adding the negative of a number, this is consistent.
[L107.4] a * (b - c) = (a * b) - (a * c)
Before proving commutativity, we need two supporting results. First, multiplying by 2 is the same as adding a number to itself:
2 * 1 = 2 = 1 + 1
2 * n = n + n                   Inductive assumption, true for n=1
2 * (n + 1) = (2 * n) + 2       From [A102]
            = (n + n) + (1 + 1)
            = (n + 1) + (n + 1) From [a+] and [c+]
2 * a = a + a                   Above summarized, with a for n+1
Second, we show by induction that (a + 1) * b = (a * b) + b, which lets us peel a 1 off the left operand of a multiplication:
(a + 1) * 0 = 0 = (a * 0) + 0                From [A101] and [i+]
(a + 1) * n = (a * n) + n                    Inductive assumption, true for n=0
(a + 1) * (n + 1) = ((a + 1) * n) + (a + 1)  From [A102]
                  = ((a * n) + n) + (a + 1)  From the inductive assumption
                  = ((a * n) + a) + (n + 1)  From [a+] and [c+]
                  = (a * (n + 1)) + (n + 1)  From [A102]
[L107.5] (a + 1) * b = (a * b) + b           Above summarized, with b for n+1

Associative

We show multiplication is associative by induction:
[L108.1] (a * b) * 0 = 0 = a * 0 = a * (b * 0)
[L108.2] (a * b) * 1 = a * b = a * (b * 1)            From [L105] on each side
[L108.3] (a * b) * n = a * (b * n)                    Inductive assumption, true for n=1
         (a * b) * (n + 1) = ((a * b) * n) + (a * b)  From [A102]
                           = (a * (b * n)) + (a * b)  From [L108.3]
                           = a * ((b * n) + b)        From [L107.3] with b*n for b, b for c
                           = a * (b * (n + 1))        From [A102] with b for a, n for b
[L108.4] (a * b) * c = a * (b * c)                    Above lines summarized, with c for n+1
As with the distributive law, we can replace 1 by -1 to show that our conclusion also covers negative numbers.

Commutative

m * n = n * m                                Inductive assumption, true when m or n is 0 or 1 (by [L106.6], [L105], [L106.8])
(m + 1) * (n + 1) = ((m + 1) * n) + (m + 1)  From [A102]
                  = ((m * n) + n) + (m + 1)  From [L107.5]
                  = ((n * m) + m) + (n + 1)  From the inductive assumption, [a+] and [c+]
                  = ((n + 1) * m) + (n + 1)  From [L107.5] in reverse
                  = (n + 1) * (m + 1)        From [A102]
[L109] a * b = b * a                         Above summarized, with a for m+1 and b for n+1
As with addition, the fact that multiplication is associative [L108.4] means that, if we have an expression that is a string of values multiplied together, we can drop the parentheses from the expression without creating any ambiguity; and the fact that it is commutative means that we can rearrange all of those multiplied values to any order we want.

Algebra

We have added a second operator to our repertoire that, like addition, is an associative binary operator with an identity. With two such operators, where one distributes over the other, we have a ring (for a more precise definition, follow the link). In the same way that group ignores subtraction, the ring ignores the division operator. As with addition, there are a few rules from the above section that we will use often enough that we want to reference them by name rather than lemma number.
[a*] a * (b * c) = (a * b) * c        [L108.4] Associativity of multiplication
[c*] a * b = b * a                    [L109] Commutativity of multiplication
[z*] a * 0 = 0 * a = 0                [L106.6] Zero for multiplication
[i*] a * 1 = 1 * a = a                [L106.8] Identity for multiplication
[d*] a * (b + c) = (a * b) + (a * c)  [L107.3] Distributivity of multiplication over addition

Division

As when we defined subtraction to be the inverse operation of addition, we want an inverse operation to multiplication so that we can solve for x in equations such as a * x = b.

We call our inverse operation division. As with multiplication, there are a number of common ways this operation is expressed. For use in this presentation, we choose to use the slash character (/) to represent the division operation. We want division and multiplication each to be the inverse of the other, as is the case with addition and subtraction, so we have two candidate definitions:
[A120.1] (a * b) / b = a  for all a and b except b=0
[A120.2] (a / b) * b = a  for all a and b except b=0
Our definitions exclude zero because we already have a rule that says anything times zero is zero, so we know a priori that we can't make these new rules work for all a when b is zero.

The fact that we can't divide by zero is the first time we have encountered a special case in our structure, where we have to add a qualification to one of our rules stating that you can't do something rather than extending our structure to make it possible to do that. When, in building our structure of numbers, we realized that we could not answer the question "what is 3 - 5?", we expanded the structure to allow us to answer that question ("negative 2"). In this case, we can't answer the question "what is 5 / 0?", but, for the first time, instead of trying to expand our structure to be able to answer that question, we make the statement "you can't do that". As we will see later, the further we go in defining our structure, the more such exceptions and caveats we need to make.

We check that the two assumptions above are compatible by starting with one and converting it into the other.
(a * b) / b = a            From [A120.1]
((a * b) / b) * b = a * b  Right-multiply both sides by b
(c / b) * b = c            Previous line with c for a*b; this is [A120.2]
We can quickly get some useful lemmas by plugging in a few different values for a and b:
[L121] a / 1 = a      From [A120.1] with b=1, after a*1=a
[L122] b / b = 1      From [A120.1] with a=1, after 1*b=b
[L123] (1/b) * b = 1  From [A120.2] with a=1
[L124] 0 / b = 0      From [A120.1] with a=0, after 0*b=0
[L124.2] a / a = 1    [L122] restated with a for b
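Python's Fraction type models exactly this kind of exact division, so we can spot-check the definitions and lemmas for a couple of values (a sketch, not a proof):

    from fractions import Fraction as F

    a, b = F(7), F(3)
    assert (a * b) / b == a   # [A120.1]
    assert (a / b) * b == a   # [A120.2]
    assert a / F(1) == a      # [L121]
    assert b / b == 1         # [L122], [L124.2]
    assert F(0) / b == 0      # [L124]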
If we are looking at the equation
[I125] a = c / b
what does that mean? If we assume
[A126] c = a * b
then [I125] becomes
[I127] a = (a * b) / b
which is [A120.1]. This is true by definition, so our assumption [A126] is a valid assumption to use in solving [I125]. What we are saying here is that the solution (a) to [I125] is the value that, when multiplied by b, gives c.
[L128] If a = c / b, then c = a * b, and vice-versa (from [I125] and [A126])

Associative

As we did with subtraction, we want to prove the associative laws for division so we know how we can transform various combinations of parentheses and the multiplication and division operations. We already know about a * (b * c), so there are three other possible combinations of * and / with the parentheses in the same position:
  • a / (b * c)
  • a * (b / c)
  • a / (b / c)
[I129.1] a / (b * c) = d            Given
         a = d * (b * c)            From [L128]
         a = (d * c) * b            From [a*] and [c*]
         a / b = d * c              From [L128]
[I129.2] (a / b) / c = d            From [L128]
[L129.3] a / (b * c) = (a / b) / c  From [I129.1] and [I129.2]
[I130.1] a * (b / c) = d            Given
         a * (b / c) * c = d * c    Multiply both sides by c
         a * b = d * c              Reduce (b / c) * c = b by [A120.2]
[I130.2] (a * b) / c = d            From [L128]
[L130.3] a * (b / c) = (a * b) / c  From [I130.1] and [I130.2]
[I131.1] a / (b / c) = d            Given
         a = d * (b / c)            From [L128]
           = (d * b) / c            From [L130.3]
         a * c = d * b              From [L128]
         c * a = d * b              From [c*]
         (c * a) / b = d            From [L128]
         c * (a / b) = d            From [L130.3]
[I131.2] (a / b) * c = d            From [c*]
[L131.3] a / (b / c) = (a / b) * c  From [I131.1] and [I131.2]
We now have all of our rules of association for multiplication and division. The following four equations, repeated from above, show all eight possible combinations of * and / operators and grouping of three variables. Note that this table is identical to the table of rules of association for addition and subtraction, with * instead of + and / instead of -.
[a*] a * (b * c) = (a * b) * c
[L129.3] a / (b * c) = (a / b) / c
[L130.3] a * (b / c) = (a * b) / c
[L131.3] a / (b / c) = (a / b) * c
We derive a few more useful lemmas.
a / b = (a * 1) / b         From [i*]
      = a * (1 / b)         From [L130.3]
[L132] a / b = a * (1 / b)  Summary of the above lines
1 / (a / b) = (1 / a) * b   From [L131.3]
            = b * (1 / a)   From [c*]
            = b / a         From [L132]
[L133] 1 / (a / b) = b / a  Summary of the above lines
(a / b) * (c / d) = ((a / b) * c) / d         From [L130.3]
                  = (c * (a / b)) / d         From [c*]
                  = ((c * a) / b) / d         From [L130.3]
                  = (c * a) / (b * d)         From [L129.3]
                  = (a * c) / (b * d)         From [c*]
[L134] (a / b) * (c / d) = (a * c) / (b * d)  Summary of the above lines
(a / b) / (c / d) = ((a / b) * 1) / (c / d)   From [i*]
                  = (a / b) * (1 / (c / d))   From [L130.3]
                  = (a / b) * (d / c)         From [L133]
                  = (a * d) / (b * c)         From [L134]
[L135] (a / b) / (c / d) = (a * d) / (b * c)  Summary of the above lines

Rational Numbers

You may have noticed in the above section about the division operation that we discussed expressions like 1 / a without commenting on the fact that our number system, which up to now includes only integers, does not in general contain values for them. The proper sequence would have been to introduce rational numbers first, but I wanted to finish the discussion of the properties of the division operation before discussing rational numbers. With that out of the way, let's turn to rational numbers.

We can easily build a table for specific values of a, b and c for equation [I125] by taking all pairs of integer values for a and b, generating c as their product, and defining the value of c/b to be a for all of those triplets. For example, 2*3=6, therefore 6/3=2.

Our division table does not include all possible combinations of c/b, so there are some division equations for which the answer can not be found in our tables. For example, 3/2 does not appear in our table because, in our system of numbers up to this point, which is all integers, there is no number that, when multiplied by 2, yields 3.

In order for our numbers to be closed under division, we have to add some new numbers, which are the numbers needed to solve the equation c/b when there is no integer number a such that a*b=c. We call these numbers rational numbers, because they are the ratio of two integers, and we choose to represent them as a fraction using the division operator. In other words, when we ask what is the answer to the equation c/b, we are simply defining the answer to be c/b and stating that that value is a number. We will then examine how to manipulate these numbers.

We have defined rational numbers as numbers of the form c/b. We also know from our table-based enumeration of division equations that, for any number c which can be written as a*b, the value of the division equation c/b is a. We define the value of our rational number that we write as c/b to be consistent with the known solutions of our division equations written the same way. Thus the value of the rational number 6/3 is defined to be 2, etc.
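Python's fractions module behaves just like this definition: a ratio that reduces to an integer has that integer's value, and one that doesn't is a new number in its own right. A quick sketch:

    from fractions import Fraction

    print(Fraction(6, 3))           # 2: already in our division table
    print(Fraction(3, 2))           # 3/2: a genuinely new number
    assert Fraction(3, 2) * 2 == 3  # defined precisely so that (c/b) * b = c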

Algebra

With division as the inverse of multiplication, the multiplicative identity 1, and rational numbers, our ring is now a field.

This is as far as we will go with algebra. When we continue with exponentiation to derive real numbers and then complex numbers, those structures are still fields.

Operator Precedence

Up to now, we have been using parentheses to ensure that the order of application of operators in an expression is unambiguous. We noted earlier that we don't need those parentheses in an expression that consists solely of a number of values added together, and likewise that we don't need parentheses in an expression that consists solely of a number of values multiplied together. This is nice because it reduces the amount of writing we need to do.

We can further reduce the need for parentheses by defining a rule that tells us which operations to evaluate first when there are no parentheses to guide us. When we start with an operation and then define a second operation as the repeated application of the first operation, we can think of that second operation as being more powerful than the first operation. We then give priority to the more powerful operator, defining our rule of precedence to be that, in an expression in which the order of evaluation would otherwise be ambiguous, we will evaluate the more powerful operators first.

We define addition (+) and subtraction (-) to be at the first level, and multiplication (*) and division (/) to be at the second level and higher power than the first level. Thus, for example, the expression a + b * c will be equal to a + (b * c), and the expression a / b - c will be equal to (a / b) - c.

In cases where there are multiple operators of the same power, we define the order of evaluation to be left to right. Thus, for example, the expression a / b * c will be equal to (a / b) * c, and the expression a - b + c will be equal to (a - b) + c.
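These are the same precedence and left-to-right rules most programming languages use, so we can check the examples directly in Python:

    a, b, c = 12.0, 3.0, 2.0
    assert a + b * c == a + (b * c)  # * binds tighter than +
    assert a / b - c == (a / b) - c
    assert a / b * c == (a / b) * c  # equal power: evaluate left to right
    assert a - b + c == (a - b) + c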

Exponentiation

Up to this point the structure we have built is pretty clean. With rational numbers and our four operators (+, -, *, /), we have a system that is closed and mostly complete and consistent, with the only exception being that we can't divide by zero. Other than that one exception, operations are well-defined, we have a nice set of rules including our commutative, associative, and distributive rules, and we have a host of identities and lemmas we can apply to our rational numbers.

Once we add exponentiation, things get a lot messier: we will have expressions that have multiple values, bigger swaths of undefined operations, and many places where our lemmas and rules of manipulation no longer apply. It might seem like it's hardly worth trading our nice clean rational numbers for this mess. But despite all of the rough edges, there are enough useful things you can do with real and complex numbers that it is worth carefully defining where those rough edges are and avoiding them. So, let's forge ahead.

As with addition, once we start using multiplication for real problems, we often find we want to multiply the same number together many times, such as 3*3*3*3. As we did when defining multiplication, we define a new operator that means the same as repeated multiplication. We call this new operation exponentiation. There are various ways to write it: in print the exponent usually appears as a superscript, and many programming languages use an up-arrow or caret (^) operator. In this plain-text presentation I will use the caret. For example the expression 3^4 means 3 multiplied by itself 4 times, or 3 * 3 * 3 * 3. We call the number on the left the base, and the number after the caret the exponent. The operation of exponentiation is also referred to as taking a base to a power, where the power is the exponent.
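As with the earlier operators, the definition is just a repeated loop. A Python sketch for counting-number exponents (fractional and negative exponents come later):

    def power(a, n):
        # Exponentiation as repeated multiplication: a^n = a * a * ... * a.
        result = 1
        for _ in range(n):
            result = result * a
        return result

    print(power(3, 4))   # 81, i.e. 3 * 3 * 3 * 3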

In line with our precedence rules by which we evaluate higher-power operations first, we will evaluate exponentiation before multiplication, division, addition, and subtraction, when there are no parentheses to otherwise indicate the order of evaluation.

From [a*] we know we can group repeated multiplication any way we want, so for example 3 * 3 * 3 * 3 = (3 * 3 * 3) * 3 = (3 * 3) * (3 * 3). Using our new caret notation, we can write this as 3^4 = 3^3 * 3^1 = 3^2 * 3^2. More generally, we can see these things from our definition of exponentiation and [a*]:
[L201.1] a^(b + c) = a^b * a^c
[L201.2] a^1 = a
[L201.3] (a^b)^c = a^(b * c)
[L201.4] (a^b)^c = a^(b*c) = a^(c*b) = (a^c)^b    From [L201.3] and [c*]
We can figure out how to deal with (a * b)^n by starting with n=2:
(a * b)^2 = (a * b) * (a * b)
          = a * b * a * b       From [a*]
          = a * a * b * b       From [c*]
          = a^2 * b^2
[L201.5] (a * b)^2 = a^2 * b^2  Summary of above lines
Then we use induction for the general case:
Assume (a * b)^n = a^n * b^n for some n
(a * b)^(n + 1) = (a * b)^n * (a * b)^1  From [L201.1]
                = (a^n * b^n) * (a * b)  From the assumption
                = a^n * a * b^n * b      From [a*] and [c*]
                = a^(n + 1) * b^(n + 1)  From [L201.1]
True when n=2 from [L201.5], so by induction true for all positive n
[L201.6] (a * b)^n = a^n * b^n
Unlike addition and multiplication, we can quickly see from counterexamples that exponentiation is neither commutative:
2^3 = 2 * 2 * 2 = 8
3^2 = 3 * 3 = 9
8 != 9, so 2^3 != 3^2
nor associative:
2^(3^2) = 2^(3 * 3) = 2^9 = 2 * 2 * 2 * 2 * 2 * 2 * 2 * 2 * 2 = 512
(2^3)^2 = (2 * 2 * 2)^2 = 8^2 = 8 * 8 = 64
512 != 64, so 2^(3^2) != (2^3)^2
These initial lemmas are based on our intuitive definition of exponentiation as repeated multiplication, which provides obvious answers only in the case where the exponent is a counting number (strictly positive integer). Let's extend our definition to cover other numbers in our algebra.
[A202.1] d = b + c                      Starting assumption
[I202.2] b = d - c
         a^d = a^b * a^c                From [A202.1] and [L201.1]
         a^d / a^c = (a^b * a^c) / a^c  Assuming a^c != 0
[I202.3] a^b = a^d / a^c
[L202.4] a^(d - c) = a^d / a^c          Substitute b from [I202.2]
We can't divide by zero, so the above is not valid when ac is zero. When is that expression zero? From the definition of exponentiation, this expression represents repeated multiplication of a. What number when multiplied by itself is zero? There is only one such number: zero. So [L202.4] is not valid when a = 0, but it is valid for any other base.

Let's look at two special cases of [L202.4].
a^0 = a^(1 - 1)  From [L46.5], a != 0
    = a^1 / a^1  From [L202.4]
    = a / a      From [L201.2]
    = 1          From [L124.2]
[L203] a^0 = 1   Above lines summarized, a != 0
a^-b = a^(0 - b)        From [L76], a != 0
     = a^0 / a^b        From [L202.4], b != 0
     = 1 / a^b          From [L203]
[L204] a^-b = 1 / a^b   Above lines summarized, a != 0, b != 0
[L204.1] a^-1 = 1 / a   From [L204] with b = 1, and [L201.2]
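As a quick computational check of [L203] and [L204], we can extend the earlier power sketch to zero and negative exponents:

    def power(a, n):
        if n == 0:
            return 1                  # [L203] a^0 = 1 (a must be nonzero)
        if n < 0:
            return 1 / power(a, -n)   # [L204] a^-b = 1 / a^b
        return a * power(a, n - 1)

    print(power(2, -3))               # 0.125, i.e. 1/8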
The above extends our exponentiation operator to all integer exponents and all bases other than zero. What about rational exponents?

Remember that our goal is to define a set of consistent and useful operations. To that end, we want to ask ourselves how we can define exponentiation using a rational exponent such that it is consistent with the rest of our algebra. Rational numbers are equivalent to division using integers, which is the inverse of multiplication. Our exponentiation rule [L201.3] includes multiplication, from which we can derive a rule for division.
a = a^1                 From [L201.2]
  = a^(b / b)           From [L124.2], b != 0
  = a^(b * 1/b)         From [L132]
  = a^(1/b * b)         From [c*]
  = (a^(1/b))^b         From [L201.3]
[L205] (a^(1/b))^b = a  Summary of the above lines
What the above says is that the value of a^(1/b) is the number that, when raised to the power b, is equal to a. For example, the number a^(1/2) is the number that, when raised to the power 2, is equal to a. We call a^(1/b) the b-th root of a. The case where b is 2 or 3 is common enough that we define special names: we call a^2 a squared and a^(1/2) the square root of a; we call a^3 a cubed and a^(1/3) the cube root of a.

Previously when we added a new operation to represent repeated application of an earlier operation (addition as repeated counting and multiplication as repeated addition), we did not encounter closure problems until we added an inverse operation to the newly added operation (subtraction, division). As we will see below, this is not the case for exponentiation: here we will run into closure problems even without an inverse operation. But to keep the flow the same as with the other operators, I will discuss the inverse operation before getting back to closure.

Logarithms

As when we defined division to be the inverse operation of multiplication, we want an inverse operation to exponentiation so that we can solve for x in equations such as a^x = b.

We call our inverse operation logarithm.
There is a curious hole in math terminology about logarithms. Our other operations all have names: we talk about performing addition, multiplication, or exponentiation. We do addition by adding two addends to get a sum. But we don't "do logarithm": we "take a logarithm". The word logarithm refers to one of the elements in that operation, similar to how the word exponent refers to one of the elements in the operation of exponentiation. There seems to be no single word for logarithms that corresponds to the operation names such as addition, multiplication, and exponentiation. Talking about logarithms is like talking about sums rather than addition.
[A221.1] log_a(a^b) = b    for all a and b except a=0 or b=0
[A221.2] a^(log_a(b)) = b  for all a and b except a=0 or b=0
We can derive a few lemmas for log.
[L222.1] log_a(a) = log_a(a^1) = 1      From [L201.2] and [A221.1] with b=1
[L222.2] log_a(1) = log_a(a^0) = 0      From [L203] and [A221.1] with b=0
[L222.3] log_a(1/a) = log_a(a^-1) = -1  From [L204.1] and [A221.1] with b=-1
[I223.1] log_a(a^c) = c                            From [A221.1] using c instead of b
[I223.2] log_a(a^d) = d                            From [A221.1] using d instead of b
[I223.3] log_a(a^c) + log_a(a^d) = c + d           Add left sides and right sides of [I223.1] and [I223.2]
[I223.4] log_a(a^(c+d)) = c + d                    From [A221.1] using c+d instead of b
[L223.5] log_a(a^(c+d)) = log_a(a^c) + log_a(a^d)  Transitive equals on [I223.3] and [I223.4]
[I224.1] log_a(a^c) - log_a(a^d) = c - d           Subtract left sides and right sides of [I223.1] and [I223.2]
[I224.2] log_a(a^(c-d)) = c - d                    From [A221.1] using c-d instead of b
[L224.3] log_a(a^(c-d)) = log_a(a^c) - log_a(a^d)  Transitive equals on [I224.1] and [I224.2]
[I225.1] log_a(a^(c+d)) = log_a(a^c * a^d)          From [L201.1]
[I225.2] log_a(a^(c+d)) = log_a(a^c) + log_a(a^d)   From [L223.5]
[I225.3] log_a(a^c * a^d) = log_a(a^c) + log_a(a^d) Transitive equals on [I225.1] and [I225.2]
[L225.4] log_a(x*y) = log_a(x) + log_a(y)           Substitute x for a^c and y for a^d
[I226.1] log_a(a^(c-d)) = log_a(a^c / a^d)          From [L202.4], a^d != 0
[I226.2] log_a(a^(c-d)) = log_a(a^c) - log_a(a^d)   From [L224.3]
[I226.3] log_a(a^c / a^d) = log_a(a^c) - log_a(a^d) Transitive equals on [I226.1] and [I226.2]
[L226.4] log_a(x/y) = log_a(x) - log_a(y)           Substitute x for a^c and y for a^d, y != 0
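Floating-point logarithms in Python's math module satisfy these lemmas up to rounding error, which makes for a cheap sanity check:

    import math

    a, x, y = 2.0, 8.0, 4.0
    def log_a(v):
        return math.log(v, a)

    assert math.isclose(a ** log_a(x), x)                   # [A221.2]
    assert math.isclose(log_a(x * y), log_a(x) + log_a(y))  # [L225.4]
    assert math.isclose(log_a(x / y), log_a(x) - log_a(y))  # [L226.4]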

Principal Values

Previously, we noted that, when we added division to our algebraic structure, we had to add a small complication in that we can't divide by zero. When we add square root (or, more generally, exponentiation with any non-integer exponent), we run into another kind of special case where we have to take additional care: multivalued functions. We note that every positive number has two square roots: for example, the square root of 4 is 2 or -2, because either of those numbers, when multiplied by itself, is equal to 4. With multivalued functions like square root, we can run into trouble if we are not careful about choosing which value to use. Here's an example of this problem:
(4^(1/2))^2 = 4
4^(1/2) * 4^(1/2) = 4
2 * 4^(1/2) = 4    Substitute 2 as the first square root
2 * -2 = 4         Substitute -2 as the second square root
-4 = 4             Wrong!
The bad substitution in the above sequence may be easy to spot and understand, but as we go further into building our algebra, problems of this nature become subtler and harder to recognize.

We can reduce the probability of running into this kind of problem by carefully selecting which of these multiple values to use. When we have one preferred value for a multivalued function, we call that the principal value of the function. For example, the principal value of sqrt(4) is 2.
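Library routines make the same choice: Python's math.sqrt always returns the principal (non-negative) square root, and the caller must remember the other root exists:

    import math

    r = math.sqrt(4)   # principal value only
    print(r, -r)       # 2.0 -2.0: both square to 4
    assert r * r == 4 and (-r) * (-r) == 4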

Irrational Numbers

The ancient Greeks knew that 2^(1/2) (the square root of two) is not a rational number. There are a lot of proofs of this. I happen to like this one that demonstrates that all roots (square root, cube root, and others) that are not integers are not rational.
Assume a^b = c (b=2 for square root, b=3 for cube root, etc.) and a = d/e, e != 1, where d/e is reduced to lowest form, so d and e have no prime factors in common. Then a^b = (d/e)^b = d^b/e^b = c = c/1. But d^b has no prime factors that are not in d, and e^b has no prime factors that are not in e, so d^b and e^b have no prime factors in common, and the fraction can not be reduced at all; in particular it can not be reduced to c/1, therefore it can not be equal to c. Since there is no rational number satisfying the original assumption, any solution must not be a rational number, except in the case that e=1, which means the root is an integer.
In order for our numbering system to be closed under exponentiation, we need to extend our numbers to include these values that are not rational numbers. We call them irrational numbers.

When we added negative numbers and rational numbers, that was after we had added not only an operation defined by repetition, but also its inverse. In this case, we had to extend our numbers to provide closure even without having yet added that inverse operation.
A brief aside about infinity: before adding irrational numbers, our set of numbers was always countably infinite, which means there was always a way to map the entire set of numbers onto the counting numbers. For example, we can count off all the integers, both positive and negative, by ordering them like this: 0, 1, -1, 2, -2, 3, -3, and so on. We can count off all the rational numbers by ordering them according to the sum of the numerator and denominator and alternating positive and negative, like this: 0, 1/1, -1/1, 1/2, -1/2, 2/1, -2/1, 1/3, -1/3, 2/2, -2/2, 3/1, -3/1, 1/4, and so on, then removing duplicates (any fraction that is not reduced). But once we add all the irrational numbers we can no longer come up with a counting order like this, which is why we say the set of all irrational numbers is uncountable.

For a proof of this assertion, look up Cantor's diagonal argument.
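The counting order described above for the rationals is easy to turn into a generator. Here is a sketch that orders fractions by numerator plus denominator and skips the unreduced duplicates:

    from fractions import Fraction
    from itertools import islice
    from math import gcd

    def rationals():
        yield Fraction(0)
        total = 2                          # numerator + denominator
        while True:
            for num in range(1, total):
                den = total - num
                if gcd(num, den) == 1:     # skip duplicates like 2/2
                    yield Fraction(num, den)
                    yield Fraction(-num, den)
            total += 1

    # 0, 1, -1, 1/2, -1/2, 2, -2, 1/3, -1/3
    print(list(islice(rationals(), 9)))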

Decimal Notation

When we introduced rational numbers, such as 1/2, we defined their values in terms of the division operation, but did not provide any other representation. That was perhaps acceptable, as we can easily manipulate rational numbers in order to answer questions about them.

With irrational numbers, it is not quite so easy. How can we tell, for example, which of 2^(1/2), 3^(1/3), or 723/510 is the largest? We would like a representation that allows us to do real-world calculations with these values.

When counting up with integers, we use a place-notation system in which each digit, as we move to the left, represents a value ten times that of the digit just to its right. For example, 1234 means 1 * 1000 + 2 * 100 + 3 * 10 + 4. We extend this sequence by giving each place to the right of the ones digit a place value one tenth of that of the place to its left. In order to unambiguously know which place is the ones place, we put a decimal point (.) just to the right of the ones digit (we in America, that is; in some other parts of the world people use a comma (,) instead). For example, 0.5678 means 5 * 1/10 + 6 * 1/100 + 7 * 1/1000 + 8 * 1/10000.

We can convert fractions to decimal form such as a.bcde by remembering that it means a + b/10 + c/100 + d/1000 + e/10000:
723/510 = (510 + 213) / 510
        = 510/510 + 213/510
        = 1 + 213/510
        = 1 + 10 * 213/510 / 10
        = 1 + 2130/510 / 10
        = 1 + (2040 + 90)/510 / 10
        = 1 + 2040/510 / 10 + 90/510 / 10
        = 1 + 4/10 + 10 * 90/510 / 100
        = 1 + 4/10 + 900/510 / 100
        = 1 + 4/10 + (510 + 390)/510 / 100
        = 1 + 4/10 + (510/510 + 390/510) / 100
        = 1 + 4/10 + 1/100 + 390/510 / 100
        = 1 + 4/10 + 1/100 + 10 * 390/510 / 1000
        = 1 + 4/10 + 1/100 + 3900/510 / 1000
        = 1 + 4/10 + 1/100 + (3570 + 330)/510 / 1000
        = 1 + 4/10 + 1/100 + (3570/510 + 330/510) / 1000
        = 1 + 4/10 + 1/100 + 7/1000 + 330/510 / 1000
        = 1.417 + more digits from 330/510 / 1000
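The repeated multiply-by-ten-and-carve-off-a-digit steps above are just integer long division, which we can sketch in a few lines of Python (the function name is mine):

def decimal_digits(num, den, places):
    # Build the decimal expansion of num/den to the given number of places.
    out = [str(num // den), "."]
    rem = num % den
    for _ in range(places):
        rem *= 10
        out.append(str(rem // den))
        rem %= den
    return "".join(out)

print(decimal_digits(723, 510, 6))    # 1.417647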
Figuring out the decimal representation for a number such as 2^(1/2) is not quite as straightforward, but we can start with the brute-force approach of trial and error to get an estimate.
1^2 = 1, 1 < 2
2^2 = 4, 4 > 2, so our number must start with 1
1.1^2 = 1.21
1.2^2 = 1.44
1.3^2 = 1.69
1.4^2 = 1.96
1.5^2 = 2.25, so our number must start with 1.4
1.41^2 = 1.9881
1.42^2 = 2.0164, so our number must start with 1.41
1.411^2 = 1.990921
1.412^2 = 1.993744
1.413^2 = 1.996569
1.414^2 = 1.999396
1.415^2 = 2.002225, so our number must start with 1.414
From this much we can determine that 2^(1/2) is less than 723/510. We don't have an exact answer, but for real-world questions we often don't need very many decimal digits to get the answer.
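The trial-and-error search is also easy to automate. Here is a sketch using exact fractions so that floating-point rounding can't mislead us (the function name is mine):

from fractions import Fraction

def root_digits(target, places):
    # Largest number with `places` decimal digits whose square is <= target.
    x, step = Fraction(0), Fraction(1)
    for _ in range(places + 1):
        while (x + step) ** 2 <= target:
            x += step
        step /= 10
    return x

print(float(root_digits(2, 4)))    # 1.4142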

Our decimal notation is a sum of fractions, so any finite decimal number can be converted to a rational number. Conversely, irrational numbers cannot be exactly represented as decimal numbers; we can only approximate them when using decimal notation. If we want to maintain an exact representation of an irrational number such as 2^(1/2), we have to keep it in that notation or something similar.

Imaginary Numbers

Adding irrational numbers extends our numbers to include the value of 2^(1/2) and other fractional roots of positive numbers, but it doesn't cover everything. In particular, our numbers don't yet include a value for the expression (-1)^(1/2), the square root of negative 1: the number that, when multiplied by itself, equals negative 1. But any positive number multiplied by itself is a positive number, and from [L104.5] any negative number multiplied by itself is also a positive number, so we don't have any numbers that are candidates to be the square root of negative 1. In order for exponentiation to be closed for negative bases, we need to extend our numbers. We need to add a set of numbers that, when multiplied by themselves, produce negative numbers.

When we added negative numbers, we used our existing counting numbers with an added character (-) in front to indicate a negative number. We will do something similar here, using our existing counting numbers with an added character, in this case the letter i, following the number to indicate the new kind of numbers we are adding. We define 1i (or just i) to be the number such that i^2 = -1, and given a number a, we define ai = a * i (consistent with the common convention that writing two quantities next to each other, as in ab, means a * b).
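Many programming languages have this number built in. Python, for example, spells i as 1j (an electrical-engineering convention):

import cmath

print(1j * 1j)          # (-1+0j): i * i = -1
print(3 * 1j)           # 3j: ai is just a * i
print(cmath.sqrt(-1))   # 1j: the principal square root of -1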

We need to pick a name to distinguish these new numbers from what we had before, and "the square root of negative one" is too unwieldy, so we pick a shorter name and call them imaginary numbers.

When we defined negative numbers, we might have instead called them imaginary numbers, because you can't have negative lengths or a negative number of apples in the real world, so those numbers are not real, right? In the sense that they are highly useful for certain mathematical calculations, imaginary numbers are no more "imaginary" than negative numbers. It is unfortunate that we are stuck with a name that causes some people to get distracted from thinking about these new numbers as simply the next step in expanding our numbering system to be closed under exponentiation.

To distinguish them from our newly added imaginary numbers, we go back and lump together our previously defined rational and irrational numbers and call those real numbers. Having made the distinction between real and imaginary numbers, we note that we can have imaginary rational numbers, such as (1/2)i, or imaginary irrational numbers, such as 2^(1/2)i, as well as negative imaginary numbers such as -4i or negative irrational imaginary numbers such as -2^(1/2)i.

If we work through the mechanics of addition and subtraction with imaginary numbers, we find that they work the same as real numbers but with that extra i everywhere. To put it another way, imaginary numbers are closed under addition and subtraction. This is not the case with multiplication: imaginary numbers are not closed under multiplication, since i * i = -1, which is not an imaginary number. Similarly, imaginary numbers are not closed under division, since i / i = 1, which is not imaginary.
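We can check these closure claims directly:

print(2j + 3j)    # 5j: imaginary, so closed under addition
print(2j - 3j)    # -1j: imaginary, so closed under subtraction
print(2j * 3j)    # (-6+0j): a real result, so not closed under multiplication
print(2j / 3j)    # (0.6666666666666666+0j): a real result, so not closed under division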

Complex Numbers

Since we defined imaginary numbers as a different set of numbers from real numbers, we can't convert one to the other. If we try to add a real number a and an imaginary number bi, there is no way to reduce the sum, so we just write it as a + bi. We call this kind of number a complex number, and since a or b could be zero, we note that all real numbers and all imaginary numbers are complex numbers.

We are, in a sense, cheating when we use the + symbol to join the real and imaginary parts of a complex number, because, as just stated, we can't actually do anything with that operator to reduce the number. In that sense, we could have used any special character in that location. But we choose to use the + sign because it turns out the rules we have that deal with the + operator on real numbers also work with complex numbers: the commutative, associative, and distributive rules all work consistently when applied to complex numbers when we use a + sign between the real and imaginary parts.
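We can spot-check those rules numerically (cmath.isclose tolerates floating-point rounding):

import cmath

a, b, c = 1 + 2j, 3 - 1j, -2 + 0.5j
print(a + b == b + a)                           # True: commutative
print((a + b) + c == a + (b + c))               # True: associative
print(cmath.isclose(a * (b + c), a*b + a*c))    # True: distributive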

As with square root, complex numbers come with multivalued functions, some with an infinite number of solutions. It's easy to get bad results if you're not careful, so it's important to define a principal value for these functions and consistently use it.

Cartesian Coordinates

Since real and imaginary numbers can't be reduced to each other, they are in effect orthogonal, so we can represent complex numbers as points on a plane. We choose real to be the X axis and imaginary to be the Y axis.

With this Cartesian representation, we can express complex numbers in polar coordinates using the standard conversion: (r, θ) = (sqrt(x^2 + y^2), arctan(y/x)), where x is the real part and y is the imaginary part (and with the appropriate sign adjustments for quadrants other than I). Converting the other way, we have (x, y) = (r * cos(θ), r * sin(θ)). Sometimes we refer to a complex number as z, where we can decompose it either by real and imaginary parts, written as x = Re(z), y = Im(z), or by polar coordinates, written as r = |z|, θ = Arg(z), where |z| is the magnitude of z and Arg(z) is the argument of z. More precisely, arg(z) is the argument of z, and Arg(z) is the principal argument of z. arg(z) is a multivalued function equal to Arg(z) + n*2*π for all integer values of n.
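Python's cmath module provides exactly these conversions, which makes a handy illustration:

import cmath

z = 3 + 4j
r, theta = cmath.polar(z)       # (|z|, Arg(z)), with Arg(z) in (-π, π]
print(r, theta)                 # 5.0 0.9272952180016122
print(cmath.rect(r, theta))     # back to (approximately) 3+4j
print(abs(z), cmath.phase(z))   # |z| and Arg(z) individually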

We can treat our complex numbers as vectors in the two dimensional complex plane, so that adding two complex numbers can be displayed in our plane as vector addition. More interesting is multiplication, where we can see that when we use polar coordinates we get this nice result: (r1,θ1) * (r2,θ2) = (r1*r2, θ1+θ2).
(r1,θ1) * (r2,θ2) = (r1*cos(θ1) + r1*sin(θ1)i) * (r2*cos(θ2) + r2*sin(θ2)i)
                  = r1*(cos(θ1) + sin(θ1)i) * r2*(cos(θ2) + sin(θ2)i)
                  = r1*r2 * (cos(θ1) + sin(θ1)i) * (cos(θ2) + sin(θ2)i)
                  = r1*r2 * (cos(θ1)*cos(θ2) + cos(θ1)*sin(θ2)i + sin(θ1)*cos(θ2)i + sin(θ1)*sin(θ2)*i^2)
                  = r1*r2 * ((cos(θ1)*cos(θ2) - sin(θ1)*sin(θ2)) + (cos(θ1)*sin(θ2) + sin(θ1)*cos(θ2))i)
                  = r1*r2 * (cos(θ1+θ2) + sin(θ1+θ2)i)
                  = r1*r2*cos(θ1+θ2) + r1*r2*sin(θ1+θ2)i
                  = (r1*r2, θ1+θ2)
[L301] (r1,θ1) * (r2,θ2) = (r1*r2, θ1+θ2)    The above summarized
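A quick numeric spot-check of [L301]:

import cmath

z1 = cmath.rect(2.0, 0.5)      # r1 = 2, θ1 = 0.5
z2 = cmath.rect(3.0, 1.2)      # r2 = 3, θ2 = 1.2
print(cmath.polar(z1 * z2))    # approximately (6.0, 1.7): magnitudes multiply, angles add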

Euler's Formula

Here is Euler's Formula:
e^(i*θ) = cos(θ) + i*sin(θ)
Feynman calls this "one of the most remarkable, almost astounding, formulas in all of mathematics" and refers to it as an "amazing jewel".

As described in an article at Brilliant, Euler's Formula can be derived using the series expansions of sin(x), cos(x), and ex:
cos(x) = 1 - x^2/2! + x^4/4! - ...
sin(x) = x - x^3/3! + x^5/5! - ...
e^x = 1 + x + x^2/2! + x^3/3! + ...
so:
e^(i*x) = 1 + i*x + (i*x)^2/2! + (i*x)^3/3! + (i*x)^4/4! + (i*x)^5/5! + ...
        = 1 + i*x - x^2/2! - i*x^3/3! + x^4/4! + i*x^5/5! - ...
        = (1 - x^2/2! + x^4/4! - ...) + i*(x - x^3/3! + x^5/5! - ...)
        = cos(x) + i*sin(x)
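We can watch the series do its work numerically. In this sketch, exp_series is my own helper computing a partial sum of the series for e^z:

import math

def exp_series(z, terms=30):
    # Partial sum of 1 + z + z**2/2! + z**3/3! + ...
    total, term = 0, 1
    for n in range(terms):
        total += term
        term *= z / (n + 1)
    return total

x = 0.7
print(exp_series(1j * x))                  # approximately (0.7648+0.6442j)
print(complex(math.cos(x), math.sin(x)))   # cos(x) + i*sin(x): the same value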
In the section on Cartesian Coordinates above, we noted that any complex number can be represented in polar coordinates using r and theta, but we didn't have a good place to put the i. With Euler's Formula, we can now unambiguously represent any complex number z = x + i*y as |z| * e^(i*arg(z)), where |z| is the magnitude of z and arg(z) is the argument of z.

Complex Exponentiation

Given w = u + i*v and z = x + i*y, how do we calculate w^z?

We would like w^z to satisfy the rules of exponentiation that we derived for real numbers, such as k^(a+b) = k^a * k^b. We will assume that we can apply this rule to complex exponentiation and see how that works out.

From the discussion of Euler's Formula above we know that we can represent any nonzero complex number w as |w|*e^(i*arg(w)), and we can represent the real number |w| as e^ln(|w|). Let's see where that takes us.
w^z = (|w| * e^(i*arg(w)))^z                                       Expand w
    = (e^ln(|w|) * e^(i*arg(w)))^z                                 Use exp form for magnitude of w
    = (e^(ln(|w|) + i*arg(w)))^z                                   e^a * e^b = e^(a+b)
    = e^((ln(|w|) + i*arg(w))*z)                                   (e^a)^b = e^(a*b)
    = e^((ln(|w|) + i*arg(w))*(x + i*y))                           Expand z to real and imaginary parts
    = e^(ln(|w|)*x + ln(|w|)*i*y + i*arg(w)*x + i*arg(w)*i*y)      (a+b)*(c+d) = ac+ad+bc+bd
    = e^((ln(|w|)*x - arg(w)*y) + i*(ln(|w|)*y + arg(w)*x))        i^2 = -1 and rearrange terms
[L310] w^z = e^((ln(|w|)*x - arg(w)*y) + i*(ln(|w|)*y + arg(w)*x))    The above summarized
This gives us a number of the form r * e^(i*θ), where r = e^(ln(|w|)*x - arg(w)*y) and θ = ln(|w|)*y + arg(w)*x, both of which we can evaluate.
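Translating [L310] directly into code and comparing it against Python's built-in complex power (which also uses the principal branch) makes a good sanity check. The function cpow below is my own name, not a library routine:

import cmath, math

def cpow(w, z):
    # w**z via [L310], choosing the principal value Arg(w).
    lw = math.log(abs(w))     # ln(|w|)
    aw = cmath.phase(w)       # Arg(w)
    x, y = z.real, z.imag
    r = math.exp(lw * x - aw * y)
    theta = lw * y + aw * x
    return cmath.rect(r, theta)

w, z = 1 + 2j, 0.5 - 1.5j
print(cpow(w, z))
print(w ** z)     # agrees, up to floating-point rounding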

Note that the above result includes arg(w) in two places, once multiplied by x and once multiplied by y. arg is a multivalued function, and thus complex exponentiation is, in general, also multivalued (as we will see shortly, integer exponents are an exception).

If we are raising to a real power, then y is zero, so [L310] reduces to
w^x = e^(ln(|w|)*x + i*(arg(w)*x))    [L310] with y=0
    = |w|^x * e^(i*arg(w)*x)          For real x and all w
This equation says the magnitude of the result is the magnitude of w raised to the x power and the arg of the result is the arg of w multiplied by x. If, for example, we are squaring and thus x is 2, we square the magnitude of the number and double the angle. This result is consistent with our earlier observation that, when multiplying two complex numbers, we can multiply the magnitudes and add the angles.

If y is zero and x is an integer, then e^(i*arg(w)*x) gives the same result for all of the multiple values of arg(w), so the overall function is single-valued. If x is not an integer, this is not the case. For example, if x is 1/2, then we get two different answers by plugging in Arg(w) and Arg(w) + 2*π. These are the two square roots of a number: they always have the same magnitude and differ in angle by π.
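For example, plugging both Arg(w) and Arg(w) + 2*π into the x = 1/2 case recovers both square roots of 4:

import cmath, math

w = complex(4)
for k in (0, 1):
    theta = (cmath.phase(w) + 2 * math.pi * k) / 2
    print(cmath.rect(math.sqrt(abs(w)), theta))
# prints (2+0j), then approximately -2: same magnitude, angles differ by π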

If we consider the path that would be traced out for powers of some fixed w as we change the real exponent, we can see that it generates a circle or a spiral. Here is a nice visualization of z^x from Suitcase of Dreams for when |z|>1:


If we are raising to an imaginary power, then x is zero, so [L310] reduces to
[L311] w^(i*y) = e^(-arg(w)*y + i*ln(|w|)*y)    [L310] with x=0
Let's evaluate ii. We use [L311] with w=i and y=1:
i^i = e^(-arg(i) + i*ln(|i|))    [L311] with w=i and y=1
    = e^(-π/2) * e^(i*0)         Arg(i) = π/2; |i| = 1, and ln(1) is 0
    = e^(-π/2)                   Imaginary part drops out completely!
    = 0.207879...
Surprisingly, i^i is a real number, a little larger than one fifth. At least, that's one answer: we can use any of the values e^(-(π/2 + k*2*π)) for any integer k.
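Python agrees with the principal value, since its power operator uses Arg(w):

import math

print(1j ** 1j)                                # (0.20787957635076193+0j)
print(math.exp(-math.pi / 2))                  # 0.20787957635076193: the principal value
print(math.exp(-(math.pi / 2 + 2*math.pi)))    # 0.000388...: another valid value (k=1)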

We see that we can represent any nonzero complex number in the form e^(i*z), given z = x + i*y.
e^(i*z) = e^(i*(x + i*y))
        = e^(i*x + i*i*y)
        = e^(-y + i*x)
        = e^(-y) * e^(i*x)
One interesting thing we can do now is to extend Euler's Formula from real theta to complex theta, which allows us to define sin and cos for the entire complex plane:
e^(i*z) = cos(z) + i*sin(z)
e^(-i*z) = cos(z) - i*sin(z)             cos is an even function, sin is an odd function
e^(i*z) + e^(-i*z) = 2*cos(z)            so cos(z) = 1/2 * (e^(i*z) + e^(-i*z))
e^(i*z) - e^(-i*z) = 2*i*sin(z)          so sin(z) = 1/(2*i) * (e^(i*z) - e^(-i*z))
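We can check these extended definitions against cmath, whose sin and cos already accept complex arguments:

import cmath

z = 0.3 + 1.2j
print(cmath.isclose(cmath.cos(z), (cmath.exp(1j*z) + cmath.exp(-1j*z)) / 2))     # True
print(cmath.isclose(cmath.sin(z), (cmath.exp(1j*z) - cmath.exp(-1j*z)) / 2j))    # True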

Euler's Identity

We evaluate Euler's Formula with theta set to pi:
e^(i*π) = cos(π) + i*sin(π) = -1 + 0 = -1
We add one to both sides to get the typical presentation, e^(i*π) + 1 = 0.
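One line of Python confirms it, up to floating-point rounding:

import cmath
print(cmath.exp(1j * cmath.pi) + 1)    # about 1.2e-16j: zero, as close as floats get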

Not only does this identity tie together five of the key values of algebra (e, π, i, 1, and 0), it does so with one instance each of the key operations we derived above (equality, addition, multiplication, exponentiation). That's a pretty sweet equation.

Final Closure

Throughout this presentation, we have expanded our system of numbers as we defined new operators and discovered our system of numbers was not closed under the new operators. But with complex numbers, we have reached a point where we don't need to define any new number types: every polynomial equation with complex coefficients has all of its solutions within the complex numbers. This is one of the interpretations of the Fundamental Theorem of Algebra, but the proofs are pretty difficult, so I'm not going to try to prove it here.

Tuesday, August 17, 2021

You Are Not Alone

Imagine that you live alone. I don't mean living by yourself in an apartment or house, I mean imagine you are the only person in the world. Furthermore, imagine that no other people have ever touched the world, so that you are living in a wilderness without any of the artifacts of humanity. Kind of like Brian in Hatchet, but without the hatchet. And without any clothes or any other manufactured items.

Think about what you have to do to survive:
  • Gather or hunt your own food and prepare it
  • Protect yourself from predators and parasites
  • Create protection from the environment, such as clothes and a shelter
  • Care for your own injuries and illnesses
In that situation, how much could you accomplish? What could you create? What wealth could you accumulate?

The Only Human

But let's go one step further. Think about all of the knowledge you have that you learned from someone else rather than from direct experience. Now imagine that you did not know any of that. You only know the things that you have learned through your own interactions with the world. We'll be generous and say you also know things that you might reasonably have discovered on your own.

Now how much could you accomplish?

Remember, you can only work with available natural materials such as wood and stone. You can't use metal, ceramic, plastic, rubber, or cloth unless and until you can make it yourself. Remember also, we are assuming you don't have any knowledge except that which you have learned through direct experience. You are unlikely to even know that any of those materials exist or are possible, let alone know how to create them.

Would you be able to survive? Would you have any time left over to start the long process of discovering, learning about, and making any of the unavailable materials just mentioned? Compared to what you own today, and the accomplishments of your real life so far, how much could you have collected or accomplished in our imaginary situation?

Knowledge is Power

Let's ease up on the restrictions a bit and allow you to retain all of the knowledge you have. In fact, let's take it one step further and make available to you all of the collected knowledge and experience of humankind. Basically let's say you have internet access. Now you can look up anything you want, even if you have never thought about it before. You can read about and watch videos on how to make a bow and arrow, or how to knap flint to make an arrowhead, or how to make steel, or how a computer works.

Of course, reading about how to do something and actually being able to do it are not the same thing. If you want to make an arrowhead, first you'll have to find and identify some flint, then you'll have to practice, practice, practice knapping before you get a decent arrowhead. You should eventually be able to make your flint arrowhead and an arrow to attach it to, and with a lot more work you'll be able to make a functional bow. Your internet connection will provide you with many details that would take much longer to get right if you had to figure them out yourself, such as what kind of wood to use, how to fletch and nock the arrow, how to make string, how to make glue, and how to string your bow.

The knowledge you can get from your internet connection will help you much more quickly learn how to identify edible and poisonous plants, skin and cure animal hides, make fire (it's harder than you might think; rubbing two sticks together is not an effective approach), make and fire ceramic (clay), and maybe, if you are lucky enough to find some copper ore (which your internet knowledge can help you identify), create some metal tools.

There are many things you will not be able to create by yourself, even with a long and healthy life and with access to all that information. As examples, producing integrated circuits and stainless steel require far more prerequisite infrastructure than you could create in one lifetime. But having access to the distilled knowledge of millions of lifetimes of exploration and experimentation will allow you to create much more than you could if, as in our initial supposition, you had to learn everything yourself.

With all that knowledge available to you, how much could you create and accomplish in a world without other people and their creations as compared to your current life?

We've seen how much more you would likely be able to create if you had access to the knowledge of humankind via the internet. In the real world as well, we use that knowledge to help us accomplish much more than we could without it. We don't have to rely solely on what we have directly learned from our own experience. We benefit from the experiences and knowledge collected by many other people.

The Wealth of the World

What if, in addition to the knowledge humankind has collected, you also had access to the physical things humankind has created? Let's now assume that the world exists just as it does today, with all of its roads, factories, and other infrastructure, but with no other people. What could you accomplish?

The first question is, how long will all that infrastructure continue to operate without any people? How long will you continue to have electricity, water, communications, or the internet? If you were to apply your time and energy towards keeping those systems up, how much difference would it make? Probably not a lot. Those systems are too big, there are too many of them, and they require too much experience for your efforts as one person to make much difference. Without the continuing work of a very large number of people, all of these systems that we rely on in the ordinary course of our lives would likely fail relatively quickly.

If all those systems fail, what could you accomplish? You could perhaps figure out how to generate some electricity, but keeping that system running would certainly take some of your time. And you would still have to spend some time collecting and preparing food. For a while you could live off canned and preserved food that you could raid from a grocery store, but eventually you'd have to start gathering or hunting again, and that would cut into the time you have available for doing other work.

But for our imaginary scenario, let's say all of those systems continued to work. Let's even take it a step further, and stipulate that all of the factories and supply chains continue to operate. We'll even say you can order stuff online. So basically, everything works as it does in the real world, except that you don't have the ability to communicate or collaborate with any people. Now we are essentially asking, how much can you create or accomplish in the real world if you do not collaborate with anyone else or specifically ask anyone else to do some custom work for you?

People Power

This is not that much different from the way many people operate, and some people can create amazing things. One person can create a wonderful piece of art, or a fun computer program, or an elegant piece of furniture. But most of the things in the world, and all of the most complex and sophisticated things, are made by groups of people, sometimes very large groups of people, collaborating towards a common goal.

I hope that this exercise has helped you see how much all of us rely on the work of other people to accomplish what we do. In all of our lives, there are innumerable people who have helped us get to where we are and whose labors continue to contribute to our success. There is no person walking this earth who has not been helped by someone else at some point. As babies, we would have died if there were no one feeding us and caring for us. We have all learned things from teachers, friends, strangers, and, through media, from people we have never met. We have all inherited wealth from our ancestors, whether it is a personal mansion or the use of our public streets, bridges, and other infrastructure. We use knowledge from around the world and across time. We benefit from the factories and other capital created by our ancestors that provide us with better and less expensive goods. We rely on the labor of others to provide us with food, clean water, electricity, and many other things, so that we can focus on our own specialty. For large projects, we collaborate with others to get more done, and even for small projects we may solicit some piece of custom work from someone else. In all of these ways, the work of other people, both past and present, makes it possible for us to own more, do more, and produce more than we could without them.

The next time you think "I did it all myself", please remember to be grateful for all the people who helped you do it: all the people who kept you alive and cared for you as a baby or beyond, all the people who gained the knowledge of the world, all the people who helped you learn some of it, all the people who built the world around you, all the people who made things that you now have, and all the people who are still providing goods and services to you. You are not alone.