Contents
- Preface
- Introduction
- Counting
- Addition
- Subtraction
- Negative Numbers
- Multiplication
- Division
- Rational Numbers
- Exponentiation
- Logarithms
- Principal Values
- Irrational Numbers
- Imaginary Numbers
- Complex Numbers
- Final Closure
Preface
Many years ago I read that Richard Feynman gave a talk to a room full of scientists in which he rederived basic abstract algebra on real numbers in under an hour. Since then I found that Feynman gave this derivation in a discussion on Algebra in his Lectures on Physics, for which I give a link a few paragraphs below.
I'm not going to compete with Feynman, but doing this derivation seemed like a fun challenge to undertake. Below I present my explanation of how one gets to complex numbers based on a few simple concepts: repetition, inverse and closure. Along the way I try to throw in a few comments about abstract algebra. By the end, we will look at Euler's Identity, e^(iπ) + 1 = 0, and maybe make it a little less mystical than it might appear.
It is not necessary for you to understand all of the references to math terms, so you don't need to follow those links unless you want to learn about that concept. Similarly, it is not necessary for you to follow and understand in detail every proof. Hopefully you can simply ignore any parts you don't immediately understand and yet still get something out of the overall presentation.
I walked this path mostly for my own entertainment, but I thought perhaps others might get something out of it. It is quite long and likely contains some errors, so caveat lector.
Here are a couple of other documents that discuss Algebra that you might find interesting:
- Feynman Lectures on Physics, chapter 22: Algebra, including a discussion of Euler's Formula, which Feynman referred to as "one of the most remarkable, almost astounding, formulas in all of mathematics."
- Elementary Algebra by J. H. Tanner, PhD, 1904
Introduction
Imagine that none of this stuff exists, so we are making it all up as we go. We are going to define our numbering system from the ground up, gradually building up a structure of definitions and operations that all manage to work together nicely. It's not just by random chance that things work nicely: we are defining our numbers and operations precisely to make them work together nicely.
In the code blocks below, I label each assumption (or definition) with a name such as A1 enclosed in square brackets, like this: [A1]. Lemmas (things which can be proved from the assumptions and are used in later proofs) are labeled similarly but with L rather than A. Other intermediate steps in a proof which are not referenced outside of that proof are labeled similarly but with I. These names may be referenced later to build up additional lemmas. The references look the same, but appear in the text or in comments after an equation rather than before.
Concepts
There are three basic ways we will be extending our system:
- Repetition: performing the same operation many times. For example, multiplication is repeated addition.
- Inverse: an operation that has the opposite effect of some other operation. For example, subtraction is the inverse of addition.
- Closure: the results of an operation are in the same set as the operands. For example, the natural numbers (or positive integers) are closed under addition, because you can add any two natural numbers and get another natural number; but they are not closed under subtraction, because there are some expressions on natural numbers using subtraction whose results are not natural numbers, such as (3 - 5).
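As a rough illustration of closure, the following Python sketch tests a small finite sample of natural numbers. The helper name `is_natural` and the sample range are hypothetical choices made for this example, 0 is treated as a natural number by assumption, and a finite check is only evidence, not a proof.

```python
from itertools import product

def is_natural(x: int) -> bool:
    """Membership test for the naturals; 0 is included by assumption."""
    return isinstance(x, int) and x >= 0

sample = range(10)  # a small finite sample, not the whole set

# Addition never leaves the set on this sample.
closed_under_add = all(is_natural(a + b) for a, b in product(sample, repeat=2))

# Subtraction does: collect the pairs whose difference is not natural.
escapes_under_sub = [(a, b) for a, b in product(sample, repeat=2)
                     if not is_natural(a - b)]

print(closed_under_add)              # True on this sample
print((3, 5) in escapes_under_sub)   # True: 3 - 5 is not a natural number
```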
Preview
Here is the quick preview of how we will move from counting to complex:
- start with zero and the successor function
- repeated successors yields counting and the natural numbers
- repeated counting yields addition
- inverse of addition yields subtraction
- closure on subtraction yields negative numbers
- repeated addition yields multiplication
- inverse of multiplication yields division
- closure on division yields rational numbers
- repeated multiplication yields exponentiation
- inverse of exponentiation yields logarithms
- closure on exponentiation with positive rational numbers yields real numbers
- closure on exponentiation with negative rational numbers yields complex numbers
- all of our operations on complex numbers are already closed, so we are done
Counting
At the most basic level, we start with some simple assumptions, which happen to be a subset of the Peano axioms.
We define a starting point for counting. Historically, people typically started with one, but for later simplicity in this exercise we start with zero. We define a successor function s(x) that takes a number x and produces the next number, which by definition is distinct from x.
    [A1] zero exists
    [A2] given x, s(x) generates another number, where s(x) is not the same as x
Equals
We define an equals operator (=) so that the statement a=a is true, and the statement a=b means that, for any true statement containing a, we can replace any or all instances of a by b and the resulting statement will also be true. We further assume that if a=b is false, then the same replacements as described above will generally (but not always) yield a false statement.
    [A3] a=a is true for all a
    [A4] a=b is a replacement rule (described above)
The equals operator is:
- Reflexive: a=a (by definition)
- Symmetric: if a=b then b=a. Starting with the true statement a=a and the predicate a=b, by our definition of equals we can replace any instance of a by b in a=a and still have a true statement; we choose to replace the first a by b, yielding b=a.
- Transitive: if a=b and b=c, then a=c. Taking the assumed true statement a=b, and applying our equals rule using the second statement b=c, we replace b by c in the first statement, yielding a=c.
    [L5.1] if a=b then b=a (demonstrated above)
    [L5.2] if a=b and b=c then a=c (demonstrated above)
For convenience, we define the not-equals operator != to be false whenever equals on the same values is true, and vice versa.
The above definition also leads almost directly to one of the common ways of solving algebraic equations: performing the same operation to both sides of an equation, such as adding the same number to both sides of an equation, or multiplying both sides by the same number. Here's an example of adding the same amount to both sides of an equation.
    a = b            Assume this is our starting equation we are working with
    a + c = a + c    True by definition [A3]
    a + c = b + c    From [A4]
Note that this works for any function:
    [I6.1] a = b          Assume this is our starting equation we are working with
    [I6.2] f(a) = f(a)    True by definition [A3]
    [I6.3] f(a) = f(b)    From [A4] using [I6.2] as a starting equation and [I6.1] as our replacement rule
    [L6.4] if a = b then f(a) = f(b) for any f defined for a
f(x) might be 2*x, x+3, sin(x), or anything else we desire. Thus we can start with any true equation, perform the same valid operation on both sides, and still have a true equation.
Natural Numbers
Given our previously defined starting point of zero, we now define the natural numbers:
    [A7.0] 0=zero
    [A7.1] 1=s(0)
    [A7.2] 2=s(1)
    [A7.3] 3=s(2)
    etc. to infinity.
By definition, s(x)!=x, so 1!=0, 2!=1, etc. Note that we did not assume that repeated application of s(x) would not eventually give us the same number. Without that assumption it is possible that, for example, s(s(s(x)))=x, or in other words, 3=0. This yields a "modulo" system, which can be useful. But for this particular exposition, I want to use the "normal" numbers, so we will add the assumption that s(x) is never equal to any previous value in the sequence. More precisely, we assume:
    [A8] For any x, repeated application of the successor function any number of times will never generate x.
We have now defined an unending stream of distinct numbers, each of which (other than zero) is the successor of one other number.
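One way to make these definitions concrete is to model numbers as nested tuples, as in the Python sketch below. The representation (an empty tuple for zero, one extra tuple layer per successor) and the helper `to_int` are assumptions of this illustration, not part of the derivation itself.

```python
ZERO = ()                      # [A1] zero exists

def s(x):
    """[A2] successor: wrap x in one more tuple layer."""
    return (x,)

def to_int(x) -> int:
    """Count the layers, purely for display."""
    n = 0
    while x != ZERO:
        x = x[0]
        n += 1
    return n

ONE = s(ZERO)                  # [A7.1]
TWO = s(ONE)                   # [A7.2]
THREE = s(TWO)                 # [A7.3]

print(to_int(THREE))           # 3
print(s(THREE) != THREE)       # True: a successor is distinct from its argument
```

In this model every application of `s` adds a layer, so repeated successors can never return to an earlier value, which mirrors [A8].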
Greater Than
We next define the relational operators less than (<) and greater than (>) with the following statements:
    [A9] s(a) > a
    [A10] if (a > b) and (b > c) then (a > c)
    [A11] (b < a) always has the same truth value as (a > b)
We are now at the point where we can count and know (by definition) that each time we count we get a number that is greater than all of the previous numbers. We can start with any number and count up from there by repeated application of the successor function. For example, if we start with 4 (which is s(s(s(s(zero))))) we can count up from there by three by repeated application of the successor function three times to get s(s(s(4))), which we can calculate is 7. This gets unwieldy pretty fast. To make this simpler, let's define an "addition" operator + that gives us the same results as repeated counting.
Addition
We define the addition operator (+) as follows:
    [A21] a + 0 = a
    [A22] a + s(b) = s(a + b)
Some quick examples:
    [L23.1] a + 1 = a + s(0) = s(a + 0) = s(a)
    [L23.2] a + 2 = a + s(1) = s(a + 1) = s(s(a))
Since s(a) = a+1, we also have
    [L23.3] a + s(b) = a + (b+1)
    [L23.4] s(a + b) = (a + b) + 1
For some of what we want to do below, we are going to need to use the rule of induction:
    [A24] If an equation is true for a known value of n, and it can be demonstrated to be true for n+1 for any n when true for n, then it is true for all natural numbers x where x > n.
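The two defining rules for addition translate almost directly into a recursive function. The sketch below assumes a nested-tuple model of the numbers (empty tuple for zero, one layer per successor); the helpers `from_int` and `to_int` are hypothetical conveniences for reading the results.

```python
ZERO = ()

def s(x):
    """Successor: one more tuple layer."""
    return (x,)

def add(a, b):
    if b == ZERO:
        return a               # [A21] a + 0 = a
    return s(add(a, b[0]))     # [A22] a + s(b') = s(a + b'), where b = s(b')

def from_int(n: int):
    """Build the number n by applying s n times to zero."""
    x = ZERO
    for _ in range(n):
        x = s(x)
    return x

def to_int(x) -> int:
    """Count layers, for display only."""
    n = 0
    while x != ZERO:
        x, n = x[0], n + 1
    return n

print(to_int(add(from_int(4), from_int(3))))  # 7
```

The recursion peels one successor off the second operand at each step, which is the same "repeated counting" described in the text.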
Associative
We now show that our addition operator is associative. We want to prove that (a+b)+n = a+(b+n) for all n. We start by showing this is true for n=1, then use induction:
    [L25.1] a + (b + 1) = (a + b) + 1               From [A22], [L23.3] and [L23.4]
    [I25.2] a + (b + n) = (a + b) + n               Inductive assumption, true for n=1
            a + (b + (n + 1)) = a + ((b + n) + 1)   From [L25.1] on (b+(n+1))
                              = (a + (b + n)) + 1   From [L25.1] with (b+n) for b
                              = ((a + b) + n) + 1   From [I25.2] applied to (a+(b+n))
                              = (a + b) + (n + 1)   From [L25.1] in reverse with (a+b) for a and n for b
    [L26]   a + (b + c) = (a + b) + c               Above lines summarized, with c for n+1
Thus by induction we have our proof of associativity.
Commutative
We use a similar approach to show that addition is commutative, such that a+b=b+a. We start by showing that 0 commutes with a for any a.
    [I27.1] 0 + 0 = 0                     From [A21] with 0 for a
            0 + 1 = 0 + s(0)              From [L23.1]
                  = s(0 + 0)              From [A22] with 0 for a and b
                  = s(0)                  From [I27.1]
                  = 1                     From [A7.1]
    [L27.2] 0 + 1 = 1                     Summary of the above few lines
    [I27.3] 0 + n = n                     Inductive assumption, true for n=1 from [L27.2]
    [I27.4] 0 + (n + 1) = (0 + n) + 1     From [L26]
    [I27.5] 0 + (n + 1) = n + 1           By induction from [I27.3] and [I27.4]
    [L27.6] 0 + a = a                     From [I27.5] with a for n+1
    [I27.7] 0 + a = a = a + 0             From [L27.6] and [A21]
    [L27.8] 0 + a = a + 0                 From [L5.2]
Now we show that 1 commutes with any number by induction.
            1 + (n + 1) = 1 + s(n)        From [L23.1] on (n+1) with n for a
                        = s(1 + n)        From [A22] with 1 for a and n for b
                        = s(n + 1)        From inductive assumption that 1 commutes with n, known true for n=0
                        = n + s(1)        From [A22] with n for a and 1 for b
                        = n + (1 + 1)     From [L23.1] on s(1) with 1 for a
                        = (n + 1) + 1     From [L25.1]
    [L28]   1 + a = a + 1                 Summary of the above with a for n+1
Finally, we use induction again to show that any two numbers commute.
            a + (n + 1) = (a + n) + 1     From [L25.1]
                        = (n + a) + 1     From inductive assumption that a commutes with n, known true for n=1 [L28]
                        = n + (a + 1)     From [L25.1]
                        = n + (1 + a)     From [L28]
                        = (n + 1) + a     From [L26]
    [L29]   a + b = b + a                 Summary of the above with b for n+1
As a final note for addition, since we have demonstrated that (a+b)+c=a+(b+c), we can omit the parentheses when adding multiple terms without creating any ambiguity.
    [A30] a + b + c = (a + b) + c = a + (b + c)
Repeated application of this rule can be used for addition with four or more terms without parentheses. By combining this rule with the commutative law [L29], we can see that we can take an expression with multiple terms added together, such as
a + b + c + d + e
and rearrange and group the terms any way we want.
The associative rule also makes it easy to calculate our addition facts. We already know that 1=0+1, 2=1+1, 3=2+1 etc from our definitions [A7] with [L23.1]. That lets us fill in the first row of our addition fact table. We can then calculate all of the n+2 values based on the n+1 values, and repeat ad infinitum for the rest of the numbers.
    n + 2 = n + (1 + 1) = (n + 1) + 1
    n + 3 = n + (2 + 1) = (n + 2) + 1
    n + 4 = n + (3 + 1) = (n + 3) + 1
Wikipedia has proofs of associativity and commutativity of addition, which are similar to mine but actually a little more concise, and here is a proof of commutativity that does not rely on associativity - but I wanted to think through these derivations myself and present them here in-line with the rest of my exposition.
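The induction proofs cover all natural numbers, but it can be reassuring to spot-check the results on a small range. This Python sketch assumes ordinary integers stand in for our numbers, with Python's `+ 1` and `- 1` playing the role of the successor and of peeling a successor off; the function name `add` and the sample size are hypothetical choices.

```python
from itertools import product

def add(a: int, b: int) -> int:
    """Addition built only from [A21] and [A22]."""
    if b == 0:
        return a                   # [A21] a + 0 = a
    return add(a, b - 1) + 1       # [A22] a + s(b') = s(a + b')

sample = range(8)

# [L26] associativity and [L29] commutativity, checked exhaustively on the sample.
associative = all(add(add(a, b), c) == add(a, add(b, c))
                  for a, b, c in product(sample, repeat=3))
commutative = all(add(a, b) == add(b, a)
                  for a, b in product(sample, repeat=2))

# One row of the addition fact table: n + 2 for n = 0..4.
row_for_2 = [add(n, 2) for n in range(5)]

print(associative, commutative)    # True True
print(row_for_2)                   # [2, 3, 4, 5, 6]
```

A passing check on a finite sample does not replace the proofs; it only confirms that the recursive definition behaves as the lemmas say it should.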
Identity
At this point we know that a+0=a [A21] and 0+a=a [L27.6], or in other words adding zero to any number (on either side, since we showed addition is commutative) yields that number. This is an interesting enough fact that we will give this number a special name: the Identity for addition.
It's easy to show that there is only one identity for addition.
    Assume two identity values e and f.
    Consider the expression e+f.
    Because e is an identity, e+f=f.
    Because f is an identity, e+f=e.
    Therefore e=f.
    [L31] Since this is true for any two identities, all are in fact the same one identity.
Algebra
We have built up our concepts in layers, like building a house: we set a foundation with zero and the successor function, put in some rim joists with the natural numbers, and laid on some flooring with the addition operator and its identity element. We have created a little structure from our concepts. Whereas a house is a physical structure, this is an algebraic structure.
It turns out that this algebraic structure is useful enough that mathematicians have given this kind of structure a name: a monoid. A monoid has these characteristics (with our case in parentheses):
- It has a set of elements (the natural numbers).
- It has a binary operation on those elements (the + operator).
- The operation is associative (+ is associative).
- The operation is closed (adding two natural numbers always produces another natural number).
- It has an identity element (zero).
    [a+] a + (b + c) = (a + b) + c    [L26] Associativity of addition
    [c+] a + b = b + a                [L29] Commutativity of addition
    [i+] a + 0 = 0 + a = a            [A21], [L27.6] Identity for addition
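The monoid requirements listed above can be turned into a small checker. The sketch below is only a finite spot-check under assumed names (`looks_like_monoid`, `is_natural`); it cannot prove that a structure is a monoid, but it can show that one is not.

```python
from itertools import product

def looks_like_monoid(sample, op, identity, contains) -> bool:
    """Check closure, associativity and identity on a finite sample."""
    items = list(sample)
    closed = all(contains(op(a, b)) for a, b in product(items, repeat=2))
    assoc = all(op(op(a, b), c) == op(a, op(b, c))
                for a, b, c in product(items, repeat=3))
    ident = all(op(a, identity) == a == op(identity, a) for a in items)
    return closed and assoc and ident

def is_natural(x: int) -> bool:
    """0 counts as a natural number here, by assumption."""
    return x >= 0

print(looks_like_monoid(range(6), lambda a, b: a + b, 0, is_natural))  # True
print(looks_like_monoid(range(6), lambda a, b: a - b, 0, is_natural))  # False
```

Subtraction fails every one of the three checks on the naturals: it leaves the set, it is not associative, and 0 is not an identity on the left.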
Subtraction
At this point we have the ability to perform addition, which allows us to calculate a value for x in such equations as x = a + b. But we don't yet have the ability to solve for x in the equation a + x = b. We want to add an operation that is the opposite of addition. In other words, if we start with a and add b to it, we want to be able to take the result and perform another operation using b in order to get back to a. An operator that has this characteristic is called an inverse. We are going to define an operation that is the inverse of addition. We will call that operation subtraction, and we will use the dash character (-) as the operator.
Before we defined addition, we already had the successor function [A2] and we defined the numbers [A7] in terms of the successor function. We defined addition with two axioms [A21] and [A22], then showed that adding 1 to any number is the same [L23.1] as applying the successor function. Including the successor function and the definitions of the numbers in terms of the successor function, we really had four pieces going into the definition of addition.
We could follow the same path and define a predecessor function that is the inverse of the successor function, but instead we will skip that step and work in terms of adding and subtracting 1 instead of successor and predecessor functions.
We define our subtraction operator (-) recursively, similarly to how we defined the addition operator, using an additional axiom [A41.1] in place of defining a predecessor function p(x):
    [A41]   a - 0 = a
    [A41.1] (a + 1) - 1 = a
    [A42]   a - (b + 1) = (a - b) - 1
So let's see how this works:
    3 - 0 = 3                       From [A41]
    3 - 1 = (2 + 1) - 1 = 2         From [A41.1], and since 3 is the successor to 2 (i.e. 3=2+1)
    3 - 2 = 3 - (1 + 1) = (3 - 1) - 1 = 2 - 1 = (1 + 1) - 1 = 1
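The recursive definition of subtraction can be sketched in Python as follows. This assumes ordinary non-negative integers stand in for the natural numbers; the helper `minus_one` is a hypothetical name for the "(a + 1) - 1 = a" step, and it deliberately refuses to go below zero because the natural numbers offer nothing there.

```python
def minus_one(x: int) -> int:
    """[A41.1] (a + 1) - 1 = a; undefined when x is 0."""
    if x == 0:
        raise ValueError("0 is not of the form a + 1 for any natural number a")
    return x - 1

def sub(a: int, b: int) -> int:
    if b == 0:
        return a                       # [A41] a - 0 = a
    return minus_one(sub(a, b - 1))    # [A42] a - (b' + 1) = (a - b') - 1

print(sub(3, 2))                       # 1

try:
    sub(2, 4)
except ValueError as err:
    print("stuck:", err)               # the recursion reaches 0 - 1
```

The first call follows the same steps as the worked example for 3 - 2. The second shows what happens when the second operand is larger than the first: the rules run out before an answer is reached.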
Associative
We want to prove the associative laws for subtraction so we know how we can transform various combinations of parentheses and operators. We already know about a + (b + c), so there are three other possible combinations of + and - with the parentheses in the same position:
    a - (b + c)
    a + (b - c)
    a - (b - c)
We start with a - (b + c).
    [L43.1] a - (b + n) = (a - b) - n               Inductive assumption, true for n=1 from [A42]
            a - (b + (n + 1)) = a - ((b + n) + 1)   From [a+]
                              = (a - (b + n)) - 1   From [A42]
                              = ((a - b) - n) - 1   From [L43.1] on (a-(b+n))
                              = (a - b) - (n + 1)   From [A42] with (a-b) for a and n for b
    [L43.2] a - (b + c) = (a - b) - c               Above lines summarized, with c for n+1
Next we do a + (b - c), which we do by induction after first doing a + (b - 1).
            (a + (n + 1)) - 1 = ((a + n) + 1) - 1   From [a+]
                              = a + n               From [A41.1] with a+n for a
                              = a + ((n + 1) - 1)   From [A41.1] with n for a
    [L44]   (a + b) - 1 = a + (b - 1)               Above lines summarized, with b for n+1
    [L45.1] a + b = a + (b - 0)                     From [A41] with b for a
    [L45.2] a + b = (a + b) - 0                     From [A41] with (a+b) for a
    [L45.3] a + (b - 0) = (a + b) - 0               From [L45.1] and [L45.2] by [A4]
    [L45.4] a + (b - n) = (a + b) - n               Inductive assumption, true for n=0 by [L45.3]
            a + (b - (n + 1)) = a + (b - (1 + n))   From [c+] with n for a and 1 for b
                              = a + ((b - 1) - n)   From [L43.2] on b-(1+n)
                              = (a + (b - 1)) - n   From [L45.4] with b-1 for b
                              = ((a + b) - 1) - n   From [L44]
                              = (a + b) - (1 + n)   From [L43.2] with a+b for a, 1 for b, n for c
                              = (a + b) - (n + 1)   From [c+] with n for a and 1 for b
    [L45.5] a + (b - c) = (a + b) - c               Above lines summarized, with c for n+1
Finally we tackle a - (b - c), which we build up to through quite a few lemmas.
    [L46.1] 0 - 0 = 0                               [A41] with 0 for a
    [L46.2] (0 + 1) - 1 = 0                         [A41.1] with 0 for a
    [L46.3] 1 - 1 = 0                               From [L27.2] on 0+1
    [L46.4] n - n = 0                               Inductive assumption, true for n=1 from [L46.3]
            (n + 1) - (n + 1) = (n + 1) - (1 + n)   From [c+]
                              = ((n + 1) - 1) - n   From [L43.2] with n+1 for a, 1 for b, n for c
                              = n - n               From [A41.1] on (n+1)-1 with n for a
                              = 0                   From [L46.4]
    [L46.5] a - a = 0                               Above lines summarized, with a for n+1

            a - b = a - (b + 0)                     From a+0=a [A21] with b for a
                  = a - (b + (n - n))               From a-a=0 [L46.5] with n for a
                  = a - ((b + n) - n)               From [L45.5] with b for a, n for b and c
                  = a - ((n + b) - n)               From [c+]
                  = a - (n + (b - n))               From [L45.5]
                  = (a - n) - (b - n)               From [L43.2]
    [L47]   a - b = (a - n) - (b - n)

    Substituting a = (c + n), b = (d + n) in [L47] yields
    [L48.1] (c + n) - (d + n) = ((c + n) - n) - ((d + n) - n) = c - d
    [L48.2] c - d = (c + n) - (d + n)               [L48.1] last and first parts

            (a - n) + n = n + (a - n)               From [c+]
                        = (n + a) - n               From [L45.5]
                        = (a + n) - n               From [c+]
                        = a + (n - n)               From [L45.5]
                        = a + 0                     From [L46.5]
                        = a                         From [i+]
    [L49]   (a - n) + n = a                         Above lines summarized
            a - (b - c) = (a + c) - ((b - c) + c)   From [L48.2] with a for c, b-c for d, c for n
                        = (a + c) - b               From [L49] with c for n
                        = (c + a) - b               From [c+] on a+c
                        = c + (a - b)               From [L45.5]
                        = (a - b) + c               From [c+]
    [L50]   a - (b - c) = (a - b) + c               Above lines summarized
We now have all of our rules of association for addition and subtraction. The following four equations, repeated from above, show all eight possible combinations of + and - operators and grouping of three variables.
    [L26]   a + (b + c) = (a + b) + c
    [L43.2] a - (b + c) = (a - b) - c
    [L45.5] a + (b - c) = (a + b) - c
    [L50]   a - (b - c) = (a - b) + c
Earlier we saw that, because of [L26], we can write a + b + c and know that it is unambiguous. But that is not true if we write a - b - c, because the statement (a - b) - c = a - (b - c) is not in general true. In order to be able to write fewer parentheses, we arbitrarily choose to have a - b - c mean the same thing as (a - b) - c.
    [A51] a - b - c = (a - b) - c
We have specified that the middle variable (b in our equation), following the - operator, should be grouped with the variable on its left, so we call the - operator left-associative; but we generally say it is not associative, meaning it does not associate both ways as does addition.
Unlike addition, subtraction is not commutative, and it has no identity. More precisely, we could say that zero is a right identity for subtraction, but since it is not also a left identity, it is not a simple identity and we usually don't mention it.
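The four regrouping rules [L26], [L43.2], [L45.5] and [L50] can be spot-checked numerically. The sketch below uses Python's built-in integer arithmetic as a stand-in for the operators defined in the text, which is an assumption of this example; the dictionary name `rules` and the sample range are hypothetical.

```python
from itertools import product

rules = {
    "L26":   lambda a, b, c: a + (b + c) == (a + b) + c,
    "L43.2": lambda a, b, c: a - (b + c) == (a - b) - c,
    "L45.5": lambda a, b, c: a + (b - c) == (a + b) - c,
    "L50":   lambda a, b, c: a - (b - c) == (a - b) + c,
}

sample = range(-4, 5)
triples = list(product(sample, repeat=3))

# Each rule should hold for every triple in the sample.
holds = {name: all(rule(a, b, c) for a, b, c in triples)
         for name, rule in rules.items()}

# The tempting "rule" (a - b) - c == a - (b - c) fails for most triples.
naive_holds = all((a - b) - c == a - (b - c) for a, b, c in triples)

print(holds)          # every entry is True
print(naive_holds)    # False
print(10 - 4 - 3)     # 3, because Python also reads this as (10 - 4) - 3
```

Python adopts the same left-associative convention as [A51], so `10 - 4 - 3` gives 3 rather than the 9 that `10 - (4 - 3)` would give.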
Negative Numbers
You may already have noticed that adding the subtraction operator to our structure has created a bit of a problem: we are now able to write expressions which we cannot evaluate within our structure. For example, the expression 2 - 4 cannot be reduced to a single natural number. When we reduce this expression according to our rules, we eventually get to the point where we need to solve for 0 - 1, and we have no rule to reduce that any further. In other words, our system is no longer a closed system: to state the problem more precisely, the natural numbers are not closed under subtraction.
We would like to be able to solve any equation we can write with our subtraction operator, so we will define new numbers that we can use for that purpose. We call these numbers negative numbers. We choose to write them using the same digits as we write our natural numbers, with a leading - character, such as -1 and -2.
A pet peeve of mine: elementary school math teachers who tell their students "You cannot subtract 5 from 3." This statement is misleading in its imprecision, since it can be solved with the use of negative numbers. Math is a precise field. The correct statement should include that qualification: "You cannot subtract 5 from 3 using the counting numbers we are studying."
Likewise for other incorrect statements such as "You cannot divide 3 by 2" and "You cannot take the square root of -4."
In our house-building analogy, so far we have built a little house from the foundation upwards, and now we realize we need some more support in order to finish subtraction. Adding negative numbers is like adding another room to that house: in order to have a solid structure, we need to extend our foundation. To save on design work, we are going to reuse the same basic plan as we used when we built up the natural numbers. This is like using the same blueprint for the second room of our house as for the first, except in mirror image because we find symmetry pleasing. Here is a little diagram:
[Diagram: a house built up in stages, with panels labeled "1. Natural Numbers", "2. Addition on Naturals", "3. Subtraction (Oops!)", "4. Negative Numbers", "5. Addition on Negatives" and "3. Completion of Subtraction"]
Thus we go back to the beginning of our derivation of natural numbers. To distinguish our original numbers from our newly defined negative numbers, we will call all of the numbers generated by our successor function (that would be all numbers 1 and above) the positive numbers. We will call the collection of all of these numbers (positive, negative and zero) the integers. We will call the characteristic of being "positive" and "negative" the sign of the number.
Since we want our rules to apply to all integers, we start by stating that in any of our previous assumptions and derivations, a variable name can refer to any integer unless the specific proof or assumption states otherwise (such as for induction proofs).
We started by defining a successor operator s(x) [A2], and we now define a corresponding predecessor operator p(x) that generates our negative numbers in a way which is symmetric to s(x):
    [A61] given x, p(x) generates another number, where p(x) is not the same as x
In all of our original assumptions and following proofs, we now state that variable names in those assumptions refer to any integer. We define the predecessor function as the inverse of the successor function and vice versa. In other words:
    [A62.1] p(s(a)) = a
    [A62.2] s(p(a)) = a
We define our negative numbers in the same way as we defined our natural (positive) numbers [A7]:
    [A63.1] -1 = p(0)
    [A63.2] -2 = p(-1)
    [A63.3] -3 = p(-2)
    etc. to negative infinity.
We take our no-duplicates assumption [A8] on the successor function and state it for the predecessor function:
    [A64] For any x, repeated application of the predecessor function any number of times will never generate x.
For the relational operators, we can derive their meaning relative to the predecessor operator:
            s(a) > a              [A9]
            p(s(a)) > p(a)        Apply p(x) to both sides [L6.6]
            a > p(a)              From [A62.1]
    [L65]   p(a) < a              From [A11]
Addition
We add to our definition of Addition ([A21] and [A22]) to handle negative numbers, and we extend our induction assumption [A24] to negative numbers:
    [A71] a + p(b) = p(a + b)
    [A72] If an equation is true for a known value of n, and it can be demonstrated to be true for n+(-1) for any n when true for n, then it is true for all integers x where x < n.
For each of our original assumptions through addition, we have now added similar assumptions to handle our negative numbers. All of our assumptions are completely symmetrical: take any of the original assumptions, replace successor by predecessor, replace 1 by -1, and exchange < with >, and you will get the equivalent assumption for our negative numbers. Because all of our other proofs in those sections are based on those assumptions, the symmetric proofs for negative numbers follow from the symmetric assumptions in exactly the same way as for the natural numbers. Thus all of the results and conclusions in those sections are valid for addition of negative numbers: commutative, associative, identity, algebra.
We list the results of one lemma here, leaving the details of the derivation as an exercise to the reader:
    [L73] a + -1 = p(a)
We derive a couple of other useful results:
            p(s(a)) = a                   [A62.1]
            p(a + 1) = a                  [L23.1]
            (a + 1) + -1 = a              [L73]
            a + (1 + -1) = a              [a+]
            (1 + -1) = 0
    [L74]   -1 + 1 = 0                    [c+]

            (1 + -1) = 0                  [L74]
            n + -n = 0                    Inductive assumption, true for n=1 [L74]
            (n + -n) + (1 + -1) = 0       From [i+] because (1 + -1) = 0
            (n + 1) + (-n + -1) = 0
            (n + 1) + (-(n+1)) = 0        From p(x) defn
    [L75]   a + -a = 0                    Above lines summarized, with a for n+1
The above statement says that, for any element a in our set of natural numbers, there is an element -a (a negative number, negative a) which can be added to that natural number to produce zero (our identity element). We call negative a the inverse element of a, and likewise a is the inverse element of -a.
            -a + a = 0                    [L75]
            (-a + a) - a = 0 - a          Subtract a from each side
            -a + (a - a) = 0 - a          [L45.5]
    [L76]   -a = 0 - a                    [L46.5] and [i+]

            a + -a = 0                    [L75]
            (a + -a) - -a = 0 - -a        Subtract -a from each side
            a + (-a - -a) = 0 - -a        [L45.5]
            a = 0 - -a                    [L46.5]
    [L76.1] a = -(-a)                     [L76]

            a + -b = a + (0 - b)          [L76]
                   = (a + 0) - b          [L45.5]
                   = a - b                [i+]
    [L77]   a + -b = a - b
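With a predecessor function available, addition can be extended to negative second operands by adding a third case to the recursion. The Python sketch below assumes built-in integers as the underlying set and implements `s` and `p` with `+ 1` and `- 1`; the function names are hypothetical.

```python
def s(x: int) -> int:
    """Successor."""
    return x + 1

def p(x: int) -> int:
    """Predecessor, the inverse of s ([A62.1], [A62.2])."""
    return x - 1

def add(a: int, b: int) -> int:
    if b == 0:
        return a                   # [A21] a + 0 = a
    if b > 0:
        return s(add(a, p(b)))     # [A22] a + s(b') = s(a + b'), with b' = p(b)
    return p(add(a, s(b)))         # [A71] a + p(b') = p(a + b'), with b' = s(b)

print(add(3, -5))                  # -2
print(add(7, -1) == p(7))          # True, as in [L73]
print(all(add(a, -a) == 0 for a in range(-6, 7)))   # True, as in [L75]
```

The negative branch mirrors the positive one exactly, with `s` and `p` exchanged, which is the symmetry the text relies on.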
Subtraction
As with addition, we note that we can create a set of symmetric assumptions using negative numbers in place of positive numbers, so that all of our results and conclusions of subtraction on positive numbers also work on negative numbers.
For improved symmetry with the definition of addition, we restate our assumptions defining subtraction to use the successor and predecessor functions, and we add a symmetric assumption that covers negative numbers. We no longer need (a+1)-1=a [A41.1] as an assumption for subtraction, because it is equivalent to p(s(a))=a [A62.1]. Since these assumptions are just a rewriting of our original assumptions for subtraction, all of our derivations remain the same.
    [A41] a - 0 = a               Repeat of original [A41]
    [A81] a - s(b) = p(a - b)     [A42] restated in terms of s and p
    [A82] a - p(b) = s(a - b)     Symmetric assumption to [A81]
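The restated subtraction rules can be sketched the same way. As before, this assumes Python's built-in integers as the underlying set, with hypothetical helper names `s`, `p` and `sub`.

```python
def s(x: int) -> int:
    """Successor."""
    return x + 1

def p(x: int) -> int:
    """Predecessor."""
    return x - 1

def sub(a: int, b: int) -> int:
    if b == 0:
        return a                   # [A41] a - 0 = a
    if b > 0:
        return p(sub(a, p(b)))     # [A81] a - s(b') = p(a - b'), with b' = p(b)
    return s(sub(a, s(b)))         # [A82] a - p(b') = s(a - b'), with b' = s(b)

print(sub(2, 4))                   # -2
print(sub(3, -2))                  # 5
```

Because `p` is now defined for every integer, the expression 2 - 4 reduces all the way to an answer.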
Algebra
With the addition of negative numbers to our structure, our set is closed with respect to subtraction. We now have a set (the integers) with an associative binary operator (+) with an identity (0) and inverse elements (the negative numbers). This algebraic structure is called a group. Because our operator (addition) is commutative, our algebraic structure is an abelian group. The group, however, ignores the subtraction operator.
Multiplication
Once we start using addition for real tasks, we find that we are often adding the same number many times, such as 3+3+3+3. Because this is so common, we would like to define a shortcut - a new operator - that means the same thing. We call this operation multiplication.
There are various conventions for how the multiplication operator is written: x, * and dot are common, and in some cases a convention is adopted that two variables written next to each other with no operator between them are to be multiplied. Most computer programming languages use the asterisk character (*), and I will use that here.
In order to have as much symmetry as we can, and to minimize our design work, we will define multiplication using a similar approach as we did when we defined addition:
    [A101] a * 0 = 0
    [A102] a * (b + 1) = (a * b) + a
    [A103] a * (b - 1) = (a * b) - a
We could equivalently have used a slightly different formulation for [A103] in which we add -1 rather than subtracting 1, as supported by [L77]:
             a * (-1) = a * (0 - 1)           [L76]
                      = (a * 0) - a           [A103]
                      = 0 - a                 [A101]
                      = -a                    [L76]
    [L104.1] a * -1 = -a                      Above lines summarized

             a * (b + -1) = a * (b - 1)            [L77]
                          = (a * b) - a            [A103]
                          = (a * b) + -a           [L77]
                          = (a * b) + (a * -1)     [L104.1]
    [L104.2] a * (b + -1) = (a * b) + (a * -1)     Above lines summarized
If the second operand is negative, we can factor that out and we see that it changes the sign of the result.
             a * -n = -(a * n)                Inductive assumption, true for n=1
             a * -(n + 1) = a * (-n - 1)
                          = (a * -n) - a
                          = -(a * n) - a
                          = 0 - (a * n) - a
                          = 0 - ((a * n) + a)
                          = 0 - (a * (n + 1))
                          = -(a * (n + 1))
    [L104.3] a * -b = -(a * b)                Above summarized, with b for n+1
    [L104.4] -a * b = -(a * b)                Swap a with b and use [c*]

             -a * -b = -(-a * b)              [L104.3]
                     = -(-(a * b))            [L104.4]
                     = a * b                  [L76.1]
    [L104.5] -a * -b = a * b                  Above lines summarized
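The three multiplication rules give a recursive procedure that works for any integer second operand. The sketch below assumes Python's built-in integers and uses built-in `+` and `-` for the addition and subtraction already defined; the name `mul` is a hypothetical choice.

```python
def mul(a: int, b: int) -> int:
    if b == 0:
        return 0                   # [A101] a * 0 = 0
    if b > 0:
        return mul(a, b - 1) + a   # [A102] a * (b' + 1) = (a * b') + a
    return mul(a, b + 1) - a       # [A103] a * (b' - 1) = (a * b') - a

print(mul(3, 4))      # 12, the same as 3 + 3 + 3 + 3
print(mul(3, -4))     # -12, as in [L104.3]
print(mul(-3, 4))     # -12, as in [L104.4]
print(mul(-3, -4))    # 12, as in [L104.5]
```

The recursion only ever moves the second operand one step toward zero, so the sign rules are consequences of the definition rather than separate cases in the code.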
Identity and Zero
By setting b=0 in [A102], we see that 1 is a right-identity for multiplication:
             a * (0 + 1) = (a * 0) + a        From [A102] with 0 for b
             a * 1 = 0 + a                    From [i+] on LHS, [A101] on RHS
    [L105]   a * 1 = a
We show by induction that zero multiplied on either side gives zero:
    [L106.1] 0 * 0 = 0                        From [A101] with 0 for a
    [L106.2] 0 * n = 0                        Inductive assumption, true for n=0
    [L106.3] 0 * (n + 1) = (0 * n) + 0        From [A102] with 0 for a, n for b
    [L106.4] 0 * (n + 1) = 0 + 0              From [L106.2]
    [L106.5] 0 * (n + 1) = 0
    [L106.6] 0 * a = 0                        Above summarized with a for n+1
By doing the same proof using [A103] we can conclude that [L106.6] holds for all integers.
We show that 1 is a left identity:
             1 * 1 = 1                        From [L105] with a=1
             1 * n = n                        Inductive assumption, true for n=1
             1 * (n + 1) = (1 * n) + 1        From [A102] with a=1 and b=n
                         = n + 1              From Inductive assumption
    [L106.8] 1 * a = a                        Above summarized, with a for n+1
Since 1 is both a left identity and a right identity, we can drop the handedness and just refer to it as an identity.
With addition we had one special number, 0, which when added to any number yielded that number. With multiplication we see that we have two special numbers: the number 1 is an identity for multiplication, but 0 is also special, since anything multiplied by 0 yields 0. We choose to use the word "zero", when associated with a specific operation such as multiplication, to mean a value that, when given as an operand to that operator, always yields zero. Our multiplication operator has only one zero, but other systems and operators may have more than one zero.
By the same argument [L31] as for the additive identity, we can see that there is only one multiplicative identity and only one multiplicative zero.
Distributive
We show that multiplication is distributive over addition by induction:
    [L107.1] a * (b + 0) = a * b = (a * b) + 0 = (a * b) + (a * 0)
             a * (b + 1) = (a * b) + a                    [A102]
             a * (b + 1) = (a * b) + (a * 1)              From [L105] on rightmost a
    [L107.2] a * (b + n) = (a * b) + (a * n)              Inductive assumption, true for n=1
             a * (b + (n + 1)) = a * ((b + n) + 1)        From [a+]
                               = (a * (b + n)) + a        From [A102]
                               = ((a * b) + (a * n)) + a  From [L107.2]
                               = (a * b) + ((a * n) + a)  From [a+]
                               = (a * b) + (a * (n + 1))  From [A102]
    [L107.3] a * (b + c) = (a * b) + (a * c)              Above summarized, with c for n+1
The above proof can be repeated using -1 instead of 1 (by [L104.2]), so [L107.3] covers all integers.
Using the same proof steps using [A103] rather than [A102] demonstrates that multiplication distributes over subtraction as well. Since by [L77] subtraction is the equivalent of adding the negative of a number, this is consistent.
[L107.4] a * (b - c) = (a * b) - (a * c)
    2 * 1 = 2 = 1 + 1
    2 * n = n + n                          Inductive assumption, true for n=1
    2 * (n + 1) = 2 * n + 2 = (n + n) + (1 + 1) = (n + 1) + (n + 1)
    2 * a = a + a

    1 * b = b
    (0 + 1) * b = (0 * b) + b
    (n + 1) * b = (n * b) + b              Inductive assumption, true for n=0
    (n + 2) * b = (n * b) + b + b          Inductive assumption, true for n=0
    ((n + 1) + 1) * b = (n + 2) * b = (n * b) + b + b = ((n + 1) * b) + b
    (a + 1) * b = (a * b) + b
Associative
We show multiplication is associative by induction:
    [L108.1]  (a * b) * 0 = 0 = a * 0 = a * (b * 0)
    [L108.2]  (a * b) * 1 = a * b = a * (b * 1)     From [L105] on each side
    [L108.3]  (a * b) * n = a * (b * n)             Inductive assumption, true for n=1
    (a * b) * (n + 1) = ((a * b) * n) + (a * b)
                      = (a * (b * n)) + (a * b)     From [L108.3]
                      = a * ((b * n) + b)           From [L107.3] with b*n for b, b for c
                      = a * (b * (n + 1))           From [A102] with b for a, n for b
    [L108.4]  (a * b) * c = a * (b * c)             Above lines summarized, with c for n+1
As with the distributive law, we can replace 1 by -1 to show that our conclusion covers negative numbers as well.
Commutative
    m * n = n * m                                  Inductive assumption, true for m=0 or 1 and n=0 or 1
    (m + 1) * (n + 1) = (m + 1) * n + (m + 1)      From [A102]
                      = (m * n) + n + (m + 1)      From [(a+1)*b = a*b+b]
                      = (n * m) + m + (n + 1)      From inductive assumption and [a+]
                      = (n + 1) * m + (n + 1)      From [same as two lines up]
                      = (n + 1) * (m + 1)          From [A102]
    [L109]  a * b = b * a
As with addition, the fact that multiplication is associative [L108.4] means that, if we have an expression that is a string of values multiplied together, we can drop the parentheses from the expression without creating any ambiguity; and the fact that it is commutative means that we can rearrange all of those multiplied values to any order we want.
Algebra
We have added a second operator to our repertoire that, like addition, is an associative binary operator with an identity. With two such operators, where one distributes over the other, we have a ring (for a more precise definition, follow the link). In the same way that a group ignores subtraction, a ring ignores the division operator. As with addition, there are a few rules from the above section that we will use often enough that we want to reference them by name rather than lemma number.
    [a*]  a * (b * c) = (a * b) * c          [L108.4]  Associativity of multiplication
    [c*]  a * b = b * a                      [L109]    Commutativity of multiplication
    [z*]  a * 0 = 0 * a = 0                  [L106.6]  Zero for multiplication
    [i*]  a * 1 = 1 * a = a                  [L106.8]  Identity for multiplication
    [d*]  a * (b + c) = (a * b) + (a * c)    [L107.3]  Distributivity of multiplication over addition
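The named rules can be spot-checked mechanically. The following is a minimal Python sketch, not part of the original derivation: the function names `mul` and `check_ring_rules` are invented for illustration, and the sketch assumes that [A103] reads a * (b - 1) = (a * b) - a, which this section uses but does not restate. A finite check like this is evidence, not a proof.

```python
def mul(a: int, b: int) -> int:
    """Multiply two integers using only addition, subtraction, and recursion on b."""
    if b == 0:
        return 0                     # [A101]: a * 0 = 0
    if b > 0:
        return mul(a, b - 1) + a     # [A102]: a * (b + 1) = (a * b) + a
    return mul(a, b + 1) - a         # assumed form of [A103]: a * (b - 1) = (a * b) - a


def check_ring_rules(limit: int = 4) -> bool:
    """Check [z*], [i*], [c*], [a*] and [d*] for all integers in -limit..limit."""
    values = range(-limit, limit + 1)
    for a in values:
        assert mul(a, 0) == mul(0, a) == 0                              # [z*]
        assert mul(a, 1) == mul(1, a) == a                              # [i*]
        for b in values:
            assert mul(a, b) == mul(b, a)                               # [c*]
            for c in values:
                assert mul(a, mul(b, c)) == mul(mul(a, b), c)           # [a*]
                assert mul(a, b + c) == mul(a, b) + mul(a, c)           # [d*]
    return True
```

Calling `check_ring_rules()` walks every triple in a small range and raises an AssertionError if any rule fails.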
Division
As when we defined subtraction to be the inverse operation of addition, we want an inverse operation to multiplication so that we can solve for x in equations such as a * x = b.
We call our inverse operation division. As with multiplication, there are a number of common ways this operation is expressed. For use in this presentation, we choose to use the slash character (/) to represent the division operation. We want division and multiplication each to be the inverse of the other, as is the case with addition and subtraction, so we have two candidate definitions:
    [A120.1]  (a * b) / b = a    for all a and b except b=0
    [A120.2]  (a / b) * b = a    for all a and b except b=0
Our definitions exclude zero because we already have a rule that says anything times zero is zero, so we know a priori that we can't make these new rules work for all a when b is zero.
The fact that we can't divide by zero is the first time we have encountered a special case in our structure, where we have to add a qualification to one of our rules stating that you can't do something rather than extending our structure to make it possible to do that. When, in building our structure of numbers, we realized that we could not answer the question "what is 3 - 5?", we expanded the structure to allow us to answer that question ("negative 2"). In this case, we can't answer the question "what is 5 / 0?", but, for the first time, instead of trying to expand our structure to be able to answer that question, we make the statement "you can't do that". As we will see later, the further we go in defining our structure, the more such exceptions and caveats we need to make.
We check that the two assumptions above are compatible by starting with one and converting it into the other.
    (a * b) / b = a                Given, [A120.1]
    ((a * b) / b) * b = a * b      Right-multiply both sides by b
    (c / b) * b = c                Previous line with c for a*b; this is [A120.2]
We can quickly get some useful lemmas by plugging in a few different values for a and b:
    [L121]    a / 1 = a        From [A120.1] or [A120.2] with b=1, after a*1=a
    [L122]    b / b = 1        From [A120.1] with a=1, after b*1=b
    [L123]    (1/b)*b = 1      From [A120.2] with a=1
    [L124]    0 / b = 0        From [A120.1] with a=0, after 0*b=0
    [L124.2]  a / a = 1        From [A120.1] with a=1 and b=a
If we are looking at the equation
    [I125]  a = c / b
what does that mean? If we assume
    [A126]  c = a * b
then [I125] becomes
    [I127]  a = (a * b) / b
which is [A120.1]. This is true by definition, so our assumption [A126] is a valid assumption to use in solving [I125]. What we are saying here is that the solution (a) to [I125] is the value that, when multiplied by b, gives c.
    [L128]  If a = c / b, then c = a * b, and vice-versa (from [I125] and [A126])
Associative
As we did with subtraction, we want to prove the associative laws for division so we know how we can transform various combinations of parentheses and the multiplication and division operations. We already know about a * (b * c), so there are three other possible combinations of * and / with the parentheses in the same position:
a / (b * c)
a * (b / c)
a / (b / c)
    [I129.1]  a / (b * c) = d                 Given
              a = d * (b * c)                 From [L128]
              a = (d * c) * b                 From [a*] and [c*]
              a / b = d * c                   From [L128]
    [I129.2]  (a / b) / c = d                 From [L128]
    [L129.3]  a / (b * c) = (a / b) / c       From [I129.1] and [I129.2]

    [I130.1]  a * (b / c) = d                 Given
              a * (b / c) * c = d * c         Multiply both sides by c
              a * b = d * c                   Reduce (b / c) * c = b by [A120.2]
    [I130.2]  (a * b) / c = d                 From [L128]
    [L130.3]  a * (b / c) = (a * b) / c       From [I130.1] and [I130.2]

    [I131.1]  a / (b / c) = d                 Given
              a = d * (b / c)                 From [L128]
                = (d * b) / c                 From [L130.3]
              a * c = d * b                   From [L128]
              c * a = d * b                   From [c*]
              (c * a) / b = d                 From [L128]
              c * (a / b) = d                 From [L130.3]
    [I131.2]  (a / b) * c = d                 From [c*]
    [L131.3]  a / (b / c) = (a / b) * c       From [I131.1] and [I131.2]
We now have all of our rules of association for multiplication and division. The following four equations, repeated from above, show all eight possible combinations of * and / operators and grouping of three variables. Note that this table is identical to the table of rules of association for addition and subtraction, with * instead of + and / instead of -.
    [a*]      a * (b * c) = (a * b) * c
    [L129.3]  a / (b * c) = (a / b) / c
    [L130.3]  a * (b / c) = (a * b) / c
    [L131.3]  a / (b / c) = (a / b) * c
We derive a few more useful lemmas.
    a / b = (a * 1) / b                       From [i*]
          = a * (1 / b)                       From [L130.3]
    [L132]  a / b = a * (1 / b)               Summary of the above lines

    1 / (a / b) = (1 / a) * b                 From [L131.3]
                = b * (1 / a)                 From [c*]
                = b / a                       From [L132]
    [L133]  1 / (a / b) = b / a               Summary of the above lines

    (a / b) * (c / d) = ((a / b) * c) / d     From [L130.3]
                      = (c * (a / b)) / d     From [c*]
                      = ((c * a) / b) / d     From [L130.3]
                      = (c * a) / (b * d)     From [L129.3]
                      = (a * c) / (b * d)     From [c*]
    [L134]  (a / b) * (c / d) = (a * c) / (b * d)    Summary of the above lines

    (a / b) / (c / d) = ((a / b) * 1) / (c / d)      From [i*]
                      = (a / b) * (1 / (c / d))      From [L130.3]
                      = (a / b) * (d / c)            From [L133]
                      = (a * d) / (b * c)            From [L134]
    [L135]  (a / b) / (c / d) = (a * d) / (b * c)    Summary of the above lines
Rational Numbers
You may have noticed in the above section about the division operation that we discussed things like 1 / a without commenting on the fact that our number system, which up to now includes only integers, does not in general include the numbers that can represent that. The proper sequence would have been to introduce rational numbers first, but I wanted to finish the discussion about the properties of the division operation before discussing rational numbers. With that out of the way, let's turn to rational numbers.
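Once values such as 1 / a are admitted as numbers, the division lemmas from the previous section can be spot-checked with exact arithmetic. Here is a minimal Python sketch using the standard library's `fractions.Fraction`; the names `check_division_lemmas` and `SAMPLE` are made up for this illustration, and the sample deliberately contains no zero so that no division by zero occurs.

```python
from fractions import Fraction
from itertools import product


def check_division_lemmas(values) -> bool:
    """Check [L129.3] through [L135] for every 4-tuple drawn from nonzero values."""
    for a, b, c, d in product(values, repeat=4):
        assert a / (b * c) == (a / b) / c                  # [L129.3]
        assert a * (b / c) == (a * b) / c                  # [L130.3]
        assert a / (b / c) == (a / b) * c                  # [L131.3]
        assert a / b == a * (1 / b)                        # [L132]
        assert 1 / (a / b) == b / a                        # [L133]
        assert (a / b) * (c / d) == (a * c) / (b * d)      # [L134]
        assert (a / b) / (c / d) == (a * d) / (b * c)      # [L135]
    return True


# Nonzero sample values; Fraction keeps every intermediate result exact.
SAMPLE = [Fraction(n) for n in (-3, -2, -1, 1, 2, 5)]
```

Running `check_division_lemmas(SAMPLE)` exercises each lemma on 1296 combinations. It does not replace the proofs, but it would catch a mis-stated lemma.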
We can easily build a table for specific values of a, b and c for equation [I125] by taking all pairs of integer values for a and b, generating c as their product, and defining the value of c/b to be a for all of those triplets. For example, 2*3=6, therefore 6/3=2.
Our division table does not include all possible combinations of c/b, so there are some division equations for which the answer can not be found in our tables. For example, 3/2 does not appear in our table because, in our system of numbers up to this point, which is all integers, there is no number that, when multiplied by 2, yields 3.
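The table construction just described is easy to sketch in Python. The function name `build_division_table` and the bound of 10 are arbitrary choices for this illustration; a real table would be unbounded.

```python
def build_division_table(limit: int) -> dict:
    """Map (c, b) to a for every integer pair a, b in -limit..limit with b != 0 and c = a * b."""
    table = {}
    for a in range(-limit, limit + 1):
        for b in range(-limit, limit + 1):
            if b != 0:
                # Record c / b = a, where c is the product a * b.
                table[(a * b, b)] = a
    return table


table = build_division_table(10)
```

Looking up `table[(6, 3)]` gives 2, while the key `(3, 2)` is absent because no integer multiplied by 2 yields 3.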
In order for our numbers to be closed under division, we have to add some new numbers, which are the numbers needed to give a value to the expression c/b when there is no integer number a such that a*b=c. We call these numbers rational numbers, because they are the ratio of two integers, and we choose to represent them as a fraction using the division operator. In other words, when we ask what is the value of the expression c/b, we are simply defining the answer to be c/b and stating that that value is a number. We will then examine how to manipulate these numbers.
We have defined rational numbers as numbers of the form c/b. We also know from our table-based enumeration of division equations that, for any number c which can be written as a*b, the value of the division equation c/b is a. We define the value of our rational number that we write as c/b to be consistent with the known solutions of our division equations written the same way. Thus the value of the rational number 6/3 is defined to be 2, etc.
Algebra
With division as the inverse of multiplication, the multiplicative identity 1, and rational numbers, our ring is now a field.
This is as far as we will go with algebra. When we continue with exponentiation to derive real numbers and then complex numbers, those structures are still fields.
Operator Precedence
Up to now, we have been using parentheses to ensure that the order of application of operators in an expression is unambiguous. We noted earlier that we don't need those parentheses in an expression that consists solely of a number of values added together, and likewise that we don't need parentheses in an expression that consists solely of a number of values multiplied together. This is nice because it reduces the amount of writing we need to do.
We can further reduce the need for parentheses by defining a rule that tells us which operations to evaluate first when there are no parentheses to guide us. When we start with an operation and then define a second operation as the repeated application of the first operation, we can think of that second operation as being more powerful than the first operation. We then give priority to the more powerful operator, defining our rule of precedence to be that, in an expression in which the order of evaluation would otherwise be ambiguous, we will evaluate the more powerful operators first.
We define addition (+) and subtraction (-) to be at the first level, and multiplication (*) and division (/) to be at the second level and higher power than the first level. Thus, for example, the expression a + b * c will be equal to a + (b * c), and the expression a / b - c will be equal to (a / b) - c.
In cases where there are multiple operators of the same power, we define the order of evaluation to be left to right. Thus, for example, the expression a / b * c will be equal to (a / b) * c, and the expression a - b + c will be equal to (a - b) + c.
Exponentiation
Up to this point the structure we have built is pretty clean. With rational numbers and our four operators (+, -, *, /), we have a system that is closed and mostly complete and consistent, with the only exception being that we can't divide by zero. Other than that one exception, operations are well-defined, we have a nice set of rules including our commutative, associative, and distributive rules, and we have a host of identities and lemmas we can apply to our rational numbers.
Once we add exponentiation, things get a lot messier: we will have expressions that have multiple values, bigger swaths of undefined operations, and many places where our lemmas and rules of manipulation no longer apply. It might seem like it's hardly worth trading our nice clean rational numbers for this mess. But despite all of the rough edges, there are enough useful things you can do with real and complex numbers that it is worth carefully defining where those rough edges are and avoiding them. So, let's forge ahead.
As with addition, once we start using multiplication for real problems, we often find we want to multiply the same number together many times, such as 3*3*3*3. As we did when defining multiplication, we define a new operator that means the same as repeated multiplication. We call this new operation exponentiation. In programming languages this is sometimes written using the up-arrow (^) as an operator, but since this is HTML we have the luxury of using the standard notation, which is to write the exponent as a superscript. For example the expression 3⁴ (or 3^4 with the up-arrow) means 3 multiplied by itself 4 times, or 3 * 3 * 3 * 3. We call the number on the left the base, and the superscript number the exponent. The operation of exponentiation is also referred to as taking a base to a power, where the power is the exponent.
In line with our precedence rules by which we evaluate higher-power operations first, we will evaluate exponentiation before multiplication, division, addition, and subtraction, when there are no parentheses to otherwise indicate the order of evaluation.
From [a*] we know we can group repeated multiplication any way we want, so for example 3 * 3 * 3 * 3 = (3 * 3 * 3) * 3 = (3 * 3) * (3 * 3). Using our new exponent notation, we can write this as 3^4 = (3^3) * (3^1) = (3^2) * (3^2). More generally, we can see these things from our definition of exponentiation and [a*]:
    [L201.1]  a^(b + c) = a^b * a^c
    [L201.2]  a^1 = a
    [L201.3]  (a^b)^c = a^(b * c)
    [L201.4]  (a^b)^c = a^(b*c) = a^(c*b) = (a^c)^b      From [L201.3] and [c*]
We can figure out how to deal with (a * b)^n by starting with n=2:
    (a * b)^2 = (a * b) * (a * b)
              = a * b * a * b                  From [a*]
              = a * a * b * b                  From [c*]
              = a^2 * b^2
    [L201.5]  (a * b)^2 = a^2 * b^2            Summary of above lines
Then we use induction for the general case:
    Assume (a * b)^n = a^n * b^n for some n
    (a * b)^(n + 1) = (a * b)^n * (a * b)^1    From [L201.1]
                    = (a^n * b^n) * (a * b)    From the assumption and [L201.2]
                    = a^n * a * b^n * b        From [a*] and [c*]
                    = a^(n + 1) * b^(n + 1)
    True when n=2 from [L201.5], so by induction true for all positive n
    [L201.5]  (a * b)^n = a^n * b^n
Unlike addition and multiplication, we can quickly see from counterexamples that exponentiation is neither commutative:
    2^3 = 2 * 2 * 2 = 8
    3^2 = 3 * 3 = 9
    8 != 9, so 2^3 != 3^2
nor associative:
    2^(3^2) = 2^(3 * 3) = 2^9 = 2 * 2 * 2 * 2 * 2 * 2 * 2 * 2 * 2 = 512
    (2^3)^2 = (2 * 2 * 2)^2 = 8^2 = 8 * 8 = 64
    512 != 64, so 2^(3^2) != (2^3)^2
These initial lemmas are based on our intuitive definition of exponentiation as repeated multiplication, which provides obvious answers only in the case where the exponent is a counting number (strictly positive integer). Let's extend our definition to cover other numbers in our algebra.
    [A202.1]  d = b + c                        Starting assumption
    [I202.2]  b = d - c
              a^d = a^b * a^c                  From [A202.1] and [L201.1]
              a^d / a^c = a^b * a^c / a^c      Assuming a^c != 0
    [I202.3]  a^b = a^d / a^c
    [L202.4]  a^(d - c) = a^d / a^c            Substitute b from [I202.2]
We can't divide by zero, so the above is not valid when a^c is zero. When is that expression zero? From the definition of exponentiation, this expression represents repeated multiplication of a. What number when multiplied by itself is zero? There is only one such number: zero. So [L202.4] is not valid when a = 0, but it is valid for any other base.
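The exponent lemmas above can be spot-checked by taking the definition literally. This is a minimal Python sketch: `power` and `check_exponent_rules` are illustrative names, the exponent is restricted to counting numbers as in the text so far, and `Fraction` is used only so that the quotient in [L202.4] stays exact.

```python
from fractions import Fraction


def power(a, n: int):
    """Raise a to a counting-number exponent n (n >= 1) by repeated multiplication."""
    result = a
    for _ in range(n - 1):
        result = result * a
    return result


def check_exponent_rules(bases=(-3, -2, 2, 3, 5), max_exp: int = 4) -> bool:
    """Check [L201.1], [L201.3], [L201.5] and [L202.4] on a small nonzero sample."""
    exps = range(1, max_exp + 1)
    for a in bases:
        assert power(a, 1) == a                                          # [L201.2]
        for b in exps:
            for c in exps:
                assert power(a, b + c) == power(a, b) * power(a, c)      # [L201.1]
                assert power(power(a, b), c) == power(a, b * c)          # [L201.3]
                if b > c:                                                # keeps b - c a counting number
                    assert power(a, b - c) == Fraction(power(a, b), power(a, c))  # [L202.4]
        for other in bases:
            for n in exps:
                assert power(a * other, n) == power(a, n) * power(other, n)       # [L201.5]
    return True
```

The two counterexamples from the text also come out as expected: `power(2, 3)` is 8 while `power(3, 2)` is 9, and `power(2, power(3, 2))` is 512 while `power(power(2, 3), 2)` is 64.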
Let's look at two special cases of [L202.4].
    a^0 = a^(1 - 1)             From [L46.5], a!=0
        = a^1 / a^1             From [L202.4]
        = a / a                 From [L201.2]
        = 1                     From [L124.2]
    [L203]  a^0 = 1             Above lines summarized, a!=0

    a^-b = a^(0 - b)            From [L76], a!=0
         = a^0 / a^b            From [L202.4], b!=0
         = 1 / a^b              From [L203]
    [L204]    a^-b = 1 / a^b    Above lines summarized, a!=0, b!=0
    [L204.1]  a^-1 = 1 / a      From [L204] with b = 1, and [L201.2]
The above extends our exponentiation operator to all integer exponents and all bases other than zero. What about rational exponents?
Remember that our goal is to define a set of consistent and useful operations. To that end, we want to ask ourselves how we can define exponentiation using a rational exponent such that it is consistent with the rest of our algebra. Rational numbers are equivalent to division using integers, which is the inverse of multiplication. Our exponentiation rule [L201.3] includes multiplication, from which we can derive a rule for division.
    a = a^1                       [L201.2]
      = a^(b / b)                 From [L124.2], b!=0
      = a^(b * 1/b)               From [L132]
      = a^(1/b * b)               From [c*]
      = (a^(1/b))^b               From [L201.3]
    [L205]  (a^(1/b))^b = a       Summary of the above lines
What the above says is that the value of a^(1/b) is the number that, when raised to the power b, is equal to a. For example, the number a^(1/2) is the number that, when raised to the power 2, is equal to a. We call a^(1/b) the b-th root of a. The case where b is 2 or 3 is common enough that we define special names: we call a^2 a squared and a^(1/2) the square root of a; we call a^3 a cubed and a^(1/3) the cube root of a.
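As a numerical illustration of [L205], here is a small Python sketch. The name `root` is hypothetical, the sketch is restricted to positive bases, and because floating point only approximates most roots, the check uses `math.isclose` rather than exact equality.

```python
import math


def root(a: float, b: int) -> float:
    """Approximate the positive b-th root of a positive number a as a ** (1 / b)."""
    return a ** (1.0 / b)


def satisfies_l205(a: float, b: int) -> bool:
    """Check that raising the b-th root of a to the power b gives back a, up to rounding."""
    return math.isclose(root(a, b) ** b, a)
```

For instance, `satisfies_l205(2, 2)` and `satisfies_l205(10, 3)` both hold, even though neither root can be written down exactly as a float.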
Previously when we added a new operation to represent repeated application of an earlier operation (addition as repeated counting and multiplication as repeated addition), we did not encounter closure problems until we added an inverse operation to the newly added operation (subtraction, division). As we will see below, this is not the case for exponentiation: here we will run into closure problems even without an inverse operation. But to keep the flow the same as with the other operators, I will discuss the inverse operation before getting back to closure.
Logarithms
As when we defined division to be the inverse operation of multiplication, we want an inverse operation to exponentiation so that we can solve for x in equations such as a^x = b.
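Numerically, solving such an equation means finding the exponent that turns a into b. A minimal Python sketch follows, using the standard library's two-argument `math.log(b, a)` as that inverse. The name `solve_exponent` is made up for this example, and the sketch assumes a > 0, a != 1 and b > 0, which is narrower than the cases treated in the text.

```python
import math


def solve_exponent(a: float, b: float) -> float:
    """Return x with a ** x approximately equal to b (assumes a > 0, a != 1, b > 0)."""
    return math.log(b, a)


x = solve_exponent(2, 10)    # the exponent that turns 2 into 10
```

Raising 2 to the returned `x` gives 10 again up to floating-point rounding, which is the sense in which the operation inverts exponentiation.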
We call our inverse operation logarithm, and we write log_a(b) for the logarithm of b to the base a.
    [A221.1]  log_a(a^b) = b       for all a and b except a=0 or b=0
    [A221.2]  a^(log_a(b)) = b     for all a and b except a=0 or b=0
There is a curious hole in math terminology about logarithms. Our other operations all have names: we talk about performing addition, multiplication, or exponentiation. We do addition by adding two addends to get a sum. But we don't "do logarithm": we "take a logarithm". The word logarithm refers to one of the elements in that operation, similar to how the word exponent refers to one of the elements in the operation of exponentiation. There seems to be no single word for logarithms that corresponds to the operation names such as addition, multiplication, and exponentiation. Talking about logarithms is like talking about sums rather than addition.
We can derive a few lemmas for log.
    [L222.1]  log_a(a) = log_a(a^1) = 1                       [L201.2] and [A221.1] with b=1
    [L222.2]  log_a(1) = log_a(a^0) = 0                       [L203] and [A221.1] with b=0
    [L222.3]  log_a(1/a) = log_a(a^-1) = -1                   [L204.1] and [A221.1] with b=-1

    [I223.1]  log_a(a^c) = c                                  [A221.1] using c instead of b
    [I223.2]  log_a(a^d) = d                                  [A221.1] using d instead of b
    [I223.3]  log_a(a^c) + log_a(a^d) = c + d                 Add left sides and right sides of [I223.1] and [I223.2]
    [I223.4]  log_a(a^(c+d)) = c + d                          [A221.1] using c+d instead of b
    [L223.5]  log_a(a^(c+d)) = log_a(a^c) + log_a(a^d)        Transitive equals on [I223.3] and [I223.4]

    [I224.1]  log_a(a^c) - log_a(a^d) = c - d                 Subtract left sides and right sides of [I223.1] and [I223.2]
    [I224.2]  log_a(a^(c-d)) = c - d                          [A221.1] using c-d instead of b
    [L224.3]  log_a(a^(c-d)) = log_a(a^c) - log_a(a^d)        Transitive equals on [I224.1] and [I224.2]

    [I225.1]  log_a(a^(c+d)) = log_a(a^c * a^d)               [L201.1]
    [I225.2]  log_a(a^(c+d)) = log_a(a^c) + log_a(a^d)        [L223.5]
    [I225.3]  log_a(a^c * a^d) = log_a(a^c) + log_a(a^d)      Transitive equals on [I225.1] and [I225.2]
    [L225.4]  log_a(x*y) = log_a(x) + log_a(y)                Substitute x for a^c and y for a^d

    [I226.1]  log_a(a^(c-d)) = log_a(a^c / a^d)               [L202.4], a^d!=0
    [I226.2]  log_a(a^(c-d)) = log_a(a^c) - log_a(a^d)        [L224.3]
    [I226.3]  log_a(a^c / a^d) = log_a(a^c) - log_a(a^d)      Transitive equals on [I226.1] and [I226.2]
    [L226.4]  log_a(x/y) = log_a(x) - log_a(y)                Substitute x for a^c and y for a^d, y!=0
Principal Values
Previously, we noted that, when we added division to our algebraic structure, we had to add a small complication in that we can't divide by zero. When we add square root (or, more generally, exponentiation with any non-integer exponent), we run into another kind of special case where we have to take additional care: multivalued functions. We note that every number has two square roots: for example, the square root of 4 is 2 or -2, because either of those numbers, when multiplied by itself, is equal to 4. With multivalued functions like square root, we can run into trouble if we are not careful about choosing which value to use. Here's an example of this problem:
    (4^(1/2))^2 = 4
    4^(1/2) * 4^(1/2) = 4
    2 * 4^(1/2) = 4           Substitute 2 as the first square root
    2 * -2 = 4                Substitute -2 as the second square root
    -4 = 4                    Wrong!
The bad substitution in the above sequence may be easy to spot and understand, but as we go further into building our algebra, problems of this nature become subtler and harder to recognize.
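The same pitfall can be reproduced in a few lines of Python. In this sketch the helper name `square_roots` is invented for the example; `math.sqrt` is the standard library function, which returns only the non-negative root.

```python
import math


def square_roots(a: float):
    """Return both square roots of a non-negative number, non-negative one first."""
    first = math.sqrt(a)     # math.sqrt returns the non-negative root
    return first, -first


first, second = square_roots(4)
consistent = first * first       # same root used twice: 4.0
mixed = first * second           # two different roots mixed: -4.0
```

Using one root consistently recovers 4, while mixing the two roots reproduces the wrong result -4 from the example above.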
We can reduce the probability of running into this kind of problem by carefully selecting which of these multiple values to use. When we have one preferred value for a multivalued function, we call that the principal value of the function. For example, the principal value of sqrt(4) is 2.
Irrational Numbers
The ancient Greeks knew that 2^(1/2) (the square root of two) is not a rational number. There are a lot of proofs of this. I happen to like this one that demonstrates that all roots (square root, cube root, and others) that are not integers are not rational.
    Assume a^b = c (b=2 for square root, b=3 for cube root, etc)
    and a = d/e, e!=1
    where d/e is reduced to the lowest form, so they have no prime factors in common.
    Then a^b = (d/e)^b = d^b/e^b = c = c/1
    But d^b has no prime factors that are not in d,
    and e^b has no prime factors that are not in e,
    so d^b and e^b have no prime factors in common,
    and the fraction can not be reduced at all,
    and in particular can not be reduced to c/1,
    therefore it can not be equal to c.
    Since there is no rational number satisfying the original assumption,
    any solution must not be a rational number,
    except in the case that e=1, which means the root is an integer.
In order for our numbering system to be closed under exponentiation, we need to extend our numbers to include these values that are not rational numbers. We call them irrational numbers.
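One practical consequence of the argument above is that, to decide whether the b-th root of a counting number c is rational, it is enough to search the integers. The Python sketch below does that; `integer_root` is a hypothetical helper name, and the sketch assumes c >= 1 and b >= 1.

```python
def integer_root(c: int, b: int):
    """Return the counting number a with a ** b == c, or None if no such integer exists.

    By the argument in the text, None means the b-th root of c is irrational.
    Assumes c >= 1 and b >= 1.
    """
    a = 1
    while a ** b < c:
        a += 1
    return a if a ** b == c else None
```

For example, `integer_root(8, 3)` returns 2, while `integer_root(2, 2)` returns None, matching the claim that the square root of two is not rational.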
When we added negative numbers and rational numbers, that was after we had added not only an operation defined by repetition, but also its inverse. In this case, we had to extend our numbers to provide closure even without having yet added that inverse operation.
A brief aside about infinity: before adding irrational numbers, our set of numbers was always countably infinite, which means there was always a way to map the entire set of numbers onto the counting numbers. For example, we can count off all the integers, both positive and negative, by ordering them like this: 0, 1, -1, 2, -2, 3, -3, and so on. We can count off all the rational numbers by ordering them according to the sum of the numerator and denominator and alternating positive and negative, like this: 0, 1/1, -1/1, 1/2, -1/2, 2/1, -2/1, 1/3, -1/3, 2/2, -2/2, 3/1, -3/1, 1/4, and so on, then removing duplicates (any fraction that is not reduced). But once we add all the irrational numbers we can no longer come up with a counting order like this, which is why we say the set of all irrational numbers is uncountable.
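The counting order for the rational numbers described in the aside can be written as a Python generator. This is only a sketch of that ordering: the name `rationals` is made up, and unreduced fractions are skipped as they are generated rather than removed afterwards.

```python
from fractions import Fraction
from itertools import count, islice
from math import gcd


def rationals():
    """Yield every rational number exactly once, ordered by numerator + denominator."""
    yield Fraction(0)
    for total in count(2):                       # total = numerator + denominator
        for numerator in range(1, total):
            denominator = total - numerator
            if gcd(numerator, denominator) == 1:  # skip fractions that are not reduced
                yield Fraction(numerator, denominator)
                yield Fraction(-numerator, denominator)


first_eleven = list(islice(rationals(), 11))
```

The first eleven values are 0, 1, -1, 1/2, -1/2, 2, -2, 1/3, -1/3, 3, -3: the sequence from the text with 2/2 and -2/2 dropped.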
For a proof of this assertion, look up Cantor's diagonalization argument.
Decimal Notation
When we introduced rational numbers, such as 1/2, we defined their values in terms of the division operation, but did not provide any other representation. This was perhaps acceptable, as we can easily manipulate rational numbers in order to answer questions about them.
With irrational numbers, it is not quite so easy. How can we tell, for example, which of 2^(1/2), 3^(1/3), or 723/510 is the largest? We would like a representation that allows us to do real-world calculations with these values.
When counting up with integers, we use a place-notation system in which each digit, as we move to the left, represents a value that is ten times as much as the digit just to its right. For example, 1234 means 1 * 1000 + 2 * 100 + 3 * 10 + 4. We extend this sequence by defining each place to the right of the ones digit as having a place value of one tenth of the digit to its left. In order to unambiguously know which place is the ones place, we put a decimal point (.) just to the right of the ones digit (we in America, that is; in some other parts of the world people use a comma (,) instead). For example, 0.5678 means 5 * 1/10 + 6 * 1/100 + 7 * 1/1000 + 8 * 1/10000.
We can convert fractions to decimal form such as a.bcde by remembering that that means a + b/10 + c/100 + d/1000 + e/10000
    723/510 = (510 + 213) / 510
            = 510/510 + 213/510
            = 1 + 213/510
            = 1 + 10 * 213/510 / 10
            = 1 + 2130/510 / 10
            = 1 + (2040 + 90)/510 / 10
            = 1 + 2040/510 / 10 + 90/510 / 10
            = 1 + 4/10 + 10 * 90/510 / 100
            = 1 + 4/10 + 900/510 / 100
            = 1 + 4/10 + (510 + 390)/510 / 100
            = 1 + 4/10 + (510/510 + 390/510) / 100
            = 1 + 4/10 + 1/100 + 390/510 / 100
            = 1 + 4/10 + 1/100 + 10 * 390/510 / 1000
            = 1 + 4/10 + 1/100 + 3900/510 / 1000
            = 1 + 4/10 + 1/100 + (3570 + 330)/510 / 1000
            = 1 + 4/10 + 1/100 + (3570/510 + 330/510) / 1000
            = 1 + 4/10 + 1/100 + 7/1000 + 330/510 / 1000
            = 1.417 + more digits from 330/510 / 1000
Figuring out the decimal representation for a number such as 2^(1/2) is not quite as straightforward, but we can start by the brute-force approach of trial and error to get an estimate.
    1^2 = 1, 1<2
    2^2 = 4, 4>2, so our number must start with 1
    1.1^2 = 1.21
    1.2^2 = 1.44
    1.3^2 = 1.69
    1.4^2 = 1.96
    1.5^2 = 2.25 so our number must start with 1.4
    1.41^2 = 1.9881
    1.42^2 = 2.0164 so our number must start with 1.41
    1.411^2 = 1.990921
    1.412^2 = 1.993744
    1.413^2 = 1.996569
    1.414^2 = 1.999396
    1.415^2 = 2.002225 so our number must start with 1.414
From this much we can determine that 2^(1/2) is less than 723/510. We don't have an exact answer, but for real world questions we often don't need to go to very many decimal digits to get the answer.
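Both hand calculations above are mechanical enough to sketch in Python with integer arithmetic only. The function names `fraction_digits` and `sqrt_digits` are invented for this illustration; the first repeats the multiply-by-ten long division, and the second repeats the trial-and-error search one decimal place at a time. Both assume positive inputs and at least one decimal place.

```python
def fraction_digits(numerator: int, denominator: int, places: int) -> str:
    """Truncated decimal expansion of numerator/denominator, by long division."""
    whole, remainder = divmod(numerator, denominator)
    digits = []
    for _ in range(places):
        # Multiply the remainder by ten and see how many times the denominator fits.
        digit, remainder = divmod(remainder * 10, denominator)
        digits.append(str(digit))
    return f"{whole}." + "".join(digits)


def sqrt_digits(n: int, places: int) -> str:
    """Truncated decimal expansion of the square root of n, by trial and error."""
    x = 0                                  # digits found so far, held as an integer
    for place in range(places + 1):
        target = n * 100 ** place          # n scaled so that x stays an integer
        x *= 10
        while (x + 1) ** 2 <= target:      # try the next larger candidate
            x += 1
    text = str(x)
    return text[:-places] + "." + text[-places:]
```

With three decimal places, `fraction_digits(723, 510, 3)` gives "1.417" and `sqrt_digits(2, 3)` gives "1.414", the same leading digits found by hand above.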
Our decimal notation is a sum of fractions, so any finite decimal number can be converted to a rational number. Conversely, irrational numbers can not be exactly represented as a decimal number; we can only approximate them when using decimal notation. If we want to maintain an exact representation of an irrational number such as 2^(1/2), we have to keep it in that notation or something similar.
Imaginary Numbers
Adding irrational numbers extends our numbers to include the value of 2^(1/2) and other fractional roots of positive numbers, but it doesn't cover everything. In particular, our numbers don't yet include a value for the expression (-1)^(1/2). This is the square root of negative 1, which is equal to the number that, when multiplied by itself, equals negative 1. But any positive number multiplied by itself is a positive number, and from [L104.5] any negative number multiplied by itself is also a positive number, so we don't have any numbers that are candidates to be the square root of negative 1. In order to have exponentiation be closed for negative bases, we need to extend our numbers. We need to add a set of numbers that, when multiplied by themselves, produce negative numbers.
When we added negative numbers, we used our existing counting numbers with an added character (-) in front to indicate a negative number. We will do something similar here, using our existing counting numbers with an added character, in this case the letter i, following the number to indicate the new kind of numbers we are adding. We define 1i (or just i) to be the number such that i^2 = -1, and given a number a, we define ai = a * i (which is consistent with a common convention of defining ab = a * b).
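Python has this kind of number built in, so the definition can be tried out directly. One notational difference to be aware of: Python writes the imaginary unit as `1j` rather than i. The helper name `imaginary` below is made up for this sketch.

```python
# Python's built-in imaginary unit, written 1j; bound to the name i to match the text.
i = 1j


def imaginary(a: float) -> complex:
    """Build the imaginary number ai as a * i."""
    return a * i


square_of_i = i * i                  # equals -1, as the definition requires
product = imaginary(2) * imaginary(3)  # 2i * 3i = 6 * i^2 = -6
```

Both products come out as ordinary negative numbers, which is the defining property of i at work.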
We need to pick a name to distinguish these new numbers from what we had before, and "the square root of negative one" is too unwieldy, so we pick a shorter name and call them imaginary numbers.
When we defined negative numbers, we might have instead called them imaginary numbers, because you can't have negative lengths or a negative number of apples in the real world, so those numbers are not real, right? In the sense that they are highly useful for certain mathematical calculations, imaginary numbers are no more "imaginary" than negative numbers. It is unfortunate that we are stuck with a name that causes some people to get distracted from thinking about these new numbers as simply the next step in expanding our numbering system to be closed under exponentiation.
To distinguish them from our newly added imaginary numbers, we go back and lump together our previously defined rational and irrational numbers and call those real numbers. Having made the distinction between real and imaginary numbers, we note that we can have imaginary rational numbers, such as (1/2)i, or imaginary irrational numbers, such as (2^(1/2))i, as well as negative imaginary numbers such as -4i or negative irrational imaginary numbers such as -(2^(1/2))i.
If we work through the mechanics of addition and subtraction with imaginary numbers, we find that they work the same as real numbers but with that extra i everywhere. To put it another way, imaginary numbers are closed under addition and subtraction. This is not the case with multiplication: imaginary numbers are not closed under multiplication, since i * i = -1, which is not an imaginary number. Similarly, imaginary numbers are not closed under division, since i / i = 1, which is not imaginary.
Complex Numbers
Since we defined imaginary numbers as being a different set of numbers from real numbers, we can't convert from one to the other, so if we try to add a real number a and an imaginary number bi together, we can't reduce that, so we just write it as a + bi. We call this kind of number a complex number, and since a or b could be zero, we note that all real numbers and all imaginary numbers are complex numbers.
We are, in a sense, cheating when we use the + symbol to enumerate the real and imaginary parts of a complex number, because, as just stated, we can't actually do anything with that operator to reduce the number. In that sense, we could have used any special character in that location. But we choose to use the + sign because it turns out the rules we have that deal with the + operator on real numbers also work with complex numbers: commutative, associative, and distributive rules all work consistently when applied to complex numbers when we use a + sign between the real and imaginary parts.
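The claim that the familiar rules carry over can be spot-checked with Python's built-in `complex` type. This is a sketch rather than a proof: `check_complex_rules` is a hypothetical name, and the sample uses small integer parts so that the floating-point arithmetic behind `complex` is exact.

```python
from itertools import product


def check_complex_rules(parts=(-2, -1, 0, 1, 3)) -> bool:
    """Check commutative, associative and distributive rules on a grid of complex numbers."""
    numbers = [complex(x, y) for x, y in product(parts, repeat=2)]
    for u, v, w in product(numbers, repeat=3):
        assert u + v == v + u                    # commutativity of addition
        assert u * v == v * u                    # commutativity of multiplication
        assert (u + v) + w == u + (v + w)        # associativity of addition
        assert (u * v) * w == u * (v * w)        # associativity of multiplication
        assert u * (v + w) == (u * v) + (u * w)  # distributivity
    return True
```

Calling `check_complex_rules()` tries every triple from a 25-element grid and raises an AssertionError if any rule fails.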
As with square root, complex numbers come with multivalued functions, some with an infinite number of solutions. It's easy to get bad results if you're not careful, so it's important to define a principal value for these functions and consistently use it.
Cartesian Coordinates
Since real and imaginary numbers can't be reduced to each other and are thus orthogonal, we can represent them on the plane. We choose real to be the X axis and imaginary to be the Y axis.
With this Cartesian environment, we can represent complex numbers in polar coordinates using the standard conversion: (r, θ) = (sqrt(x^2 + y^2), arctan(y/x)), where x is the real part and y is the imaginary part (and with the appropriate sign adjustments for quadrants other than I). Converting the other way, we have (x, y) = (r * cos(θ), r * sin(θ)). Sometimes we refer to a complex number as z, where we can decompose it either by real and imaginary parts, written as x = Re(z), y = Im(z), or by polar coordinates, written as r = |z|, θ = Arg(z), where |z| is the magnitude of z and Arg(z) is the argument of z. More precisely, arg(z) is the argument of z, and Arg(z) is the principal argument of z. arg(z) is a multi-valued function equal to Arg(z) + n*2*π for all integer values of n.
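The two conversions can be sketched in Python as follows. The helper names to_polar and to_cartesian are my own, and I use math.atan2 in place of a bare arctan(y/x) because it applies the quadrant adjustment for us and returns the principal argument in the range (-π, π]. The sample point is arbitrary.

```python
import math

def to_polar(z):
    """Return (r, theta), where theta is the principal argument Arg(z)."""
    r = math.hypot(z.real, z.imag)       # sqrt(x^2 + y^2)
    theta = math.atan2(z.imag, z.real)   # arctan(y/x) with the quadrant adjustment built in
    return r, theta

def to_cartesian(r, theta):
    """Return the complex number r*cos(theta) + i*r*sin(theta)."""
    return complex(r * math.cos(theta), r * math.sin(theta))

z = complex(-1, 1)              # a point in the second quadrant
r, theta = to_polar(z)
print(r, theta)                 # approximately sqrt(2) and 3*pi/4
print(to_cartesian(r, theta))   # approximately -1 + 1i again

# Every value of the multi-valued arg(z) = Arg(z) + n*2*pi maps back to the same point.
for n in (-1, 0, 1):
    print(n, to_cartesian(r, theta + n * 2 * math.pi))
```

Python's standard cmath module offers cmath.polar and cmath.rect, which do the same job; the explicit version above is only meant to mirror the formulas in the text.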
We can treat our complex numbers as vectors in the two dimensional complex plane, so that adding two complex numbers can be displayed in our plane as vector addition. More interesting is multiplication, where we can see that when we use polar coordinates we get this nice result: (r1,θ1) * (r2,θ2) = (r1*r2, θ1+θ2).
    (r1,θ1) * (r2,θ2)
      = (r1*cos(θ1) + r1*sin(θ1)i) * (r2*cos(θ2) + r2*sin(θ2)i)
      = r1*(cos(θ1) + sin(θ1)i) * r2*(cos(θ2) + sin(θ2)i)
      = r1*r2 * (cos(θ1) + sin(θ1)i) * (cos(θ2) + sin(θ2)i)
      = r1*r2 * (cos(θ1)*cos(θ2) + cos(θ1)*sin(θ2)i + sin(θ1)*cos(θ2)i + sin(θ1)*sin(θ2)*i^2)
      = r1*r2 * ((cos(θ1)*cos(θ2) - sin(θ1)*sin(θ2)) + (cos(θ1)*sin(θ2) + sin(θ1)*cos(θ2))i)
      = r1*r2 * (cos(θ1+θ2) + sin(θ1+θ2)i)
      = r1*r2*cos(θ1+θ2) + r1*r2*sin(θ1+θ2)i
      = (r1*r2, θ1+θ2)
    [L301] (r1,θ1) * (r2,θ2) = (r1*r2, θ1+θ2)        The above summarized
Euler's Formula
Here is Euler's Formula:
    e^(i*θ) = cos(θ) + i*sin(θ)
Feynman calls this "one of the most remarkable, almost astounding, formulas in all of mathematics" and refers to it as an "amazing jewel".
As described in an article at Brilliant, Euler's Formula can be derived using the series expansions of sin(x), cos(x), and e^x:
    cos(x) = 1 - x^2/2! + x^4/4! - ...
    sin(x) = x - x^3/3! + x^5/5! - ...
    e^x = 1 + x + x^2/2! + x^3/3! + ...
so:
    e^(i*x) = 1 + i*x + (i*x)^2/2! + (i*x)^3/3! + (i*x)^4/4! + (i*x)^5/5! + ...
            = 1 + i*x - x^2/2! - i*x^3/3! + x^4/4! + i*x^5/5! - ...
            = (1 - x^2/2! + x^4/4! - ...) + i*(x - x^3/3! + x^5/5! - ...)
            = cos(x) + i*sin(x)
In the section on Cartesian Coordinates above, we noted that any complex number can be represented in polar coordinates using r and theta, but we didn't have a good place to put the i. With Euler's Formula, we can now unambiguously represent any complex number z = x + i*y as |z| * e^(i*arg(z)) where |z| is the magnitude of z and arg(z) is the argument of z.
Complex Exponentiation
Given w = u + i*v and z = x + i*y, how do we calculate w^z?
We would like w^z to satisfy the rules of exponentiation that we derived for real numbers, such as k^(a+b) = k^a * k^b. We will assume that we can apply this rule to complex exponentiation and see how that works out.
From the discussion of Euler's Formula above we know that we can represent any nonzero complex number w as |w|*e^(i*arg(w)), and we can represent the real number |w| as e^(ln(|w|)). Let's see where that takes us.
    w^z = (|w|*e^(i*arg(w)))^z                                     Expand w
        = (e^(ln(|w|))*e^(i*arg(w)))^z                             Use exp form for magnitude of w
        = (e^(ln(|w|)+i*arg(w)))^z                                 e^a * e^b = e^(a+b)
        = e^((ln(|w|)+i*arg(w))*z)                                 (e^a)^b = e^(a*b)
        = e^((ln(|w|)+i*arg(w))*(x+i*y))                           Expand z to real and imaginary parts
        = e^(ln(|w|)*x + ln(|w|)*i*y + i*arg(w)*x + i*arg(w)*i*y)  (a+b)*(c+d)=ac+ad+bc+bd
        = e^((ln(|w|)*x - arg(w)*y) + i*(ln(|w|)*y + arg(w)*x))    i^2=-1 and rearrange terms
    [L310] w^z = e^((ln(|w|)*x - arg(w)*y) + i*(ln(|w|)*y + arg(w)*x))    The above summarized
This gives us a number of the form r * e^(i*θ) where r = e^(ln(|w|)*x - arg(w)*y) and θ = ln(|w|)*y + arg(w)*x, both of which we can evaluate.
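To see that [L310] really can be evaluated, here is a minimal Python sketch. The function name complex_power and its parameter k are my own additions: k selects which value of arg(w) to use, with k=0 meaning the principal argument. The sample values of w and z are arbitrary. As far as I know, Python's built-in ** on complex numbers uses the principal argument, so it should agree with the k=0 case up to rounding.

```python
import cmath
import math

def complex_power(w, z, k=0):
    """Evaluate w^z via [L310] using arg(w) = Arg(w) + 2*pi*k. Requires w != 0."""
    ln_mag = math.log(abs(w))                 # ln(|w|)
    arg = cmath.phase(w) + 2 * math.pi * k    # one value of the multi-valued arg(w)
    x, y = z.real, z.imag
    r = math.exp(ln_mag * x - arg * y)        # magnitude of the result
    theta = ln_mag * y + arg * x              # angle of the result
    return cmath.rect(r, theta)               # r * e^(i*theta)

w = complex(1, 2)         # arbitrary sample base
z = complex(0.5, -1.5)    # arbitrary sample exponent

ours = complex_power(w, z)
builtin = w ** z
other_branch = complex_power(w, z, k=1)

print(ours)
print(builtin)
print(other_branch)       # a different value from a different choice of arg(w)
```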
Note that the above result includes arg(w) in two places, once multiplied by x and once multiplied by y. arg is a multi-valued function, and thus complex exponentiation is in general also multi-valued. As we will see next, integer exponents are the exception.
If we are raising to a real power, then y is zero, so [L310] reduces to
    w^x = e^((ln(|w|)*x) + i*(arg(w)*x))      [L310] with y=0
        = |w|^x * e^(i*arg(w)*x)              For real x and all w
This equation says the magnitude of the result is the magnitude of w raised to the x power and the arg of the result is the arg of w multiplied by x. If, for example, we are squaring and thus x is 2, we square the magnitude of the number and double the angle. This result is consistent with our earlier observation that, when multiplying two complex numbers, we can multiply the magnitudes and add the angles.
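Here is a small Python sketch of the squaring example and of the multiply-magnitudes, add-angles rule [L301]. The sample numbers are arbitrary choices of mine, and cmath.polar and cmath.rect are the standard library's conversions to and from polar form.

```python
import cmath

w = complex(1, 1)                 # magnitude sqrt(2), angle pi/4
r, theta = cmath.polar(w)

squared_polar = cmath.rect(r ** 2, 2 * theta)   # square the magnitude, double the angle
squared_direct = w * w                          # ordinary complex multiplication

print(squared_polar, squared_direct)            # both are approximately 2i

# The same idea for two different numbers: multiply magnitudes, add angles.
u = complex(3, -4)
v = complex(-2, 0.5)
ru, tu = cmath.polar(u)
rv, tv = cmath.polar(v)
product_polar = cmath.rect(ru * rv, tu + tv)
product_direct = u * v

print(product_polar, product_direct)
```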
If y is zero and x is an integer, then e^(i*arg(w)*x) gives the same result for all of the multiple values of arg(w), so the overall function is single-valued. If x is not an integer, this is not the case. For example, if x is 1/2, then we get two different answers by plugging in Arg(w) and Arg(w) + 2*π. These are the two square roots of a number: they always have the same magnitude and differ in angle by π.
If we consider the path that would be traced out for powers of some fixed w as we change the real exponent, we can see that it generates a circle or a spiral. Suitcase of Dreams has a nice visualization of z^x for when |z|>1.
If we are raising to an imaginary power, then x is zero, so [L310] reduces to
    [L311] w^(i*y) = e^(-arg(w)*y + i*ln(|w|)*y)      [L310] with x=0
Let's evaluate i^i. We use [L311] with w=i and y=1:
    i^i = e^(-arg(w) + i*ln(|w|))      [L311] with w=i and y=1
        = e^(-π/2) * e^(i * 0)         Arg(i)=π/2, |w|=1, ln(1) is 0
        = e^(-π/2)                     Imaginary part drops out completely!
        = 0.207879...
Surprisingly, i^i is a real number, a little larger than one fifth. At least, that's one answer. We can use any of the answers e^(-π/2 + k*2π) for any integer k.
We see that we can represent any nonzero complex number in the form e^(i*z), given z = x + i*y.
    e^(i*z) = e^(i*(x+i*y))
            = e^(i*x + i*i*y)
            = e^(-y + i*x)
            = e^(-y) * e^(i*x)
One interesting thing we can do now is to extend Euler's Formula from real theta to complex theta, which allows us to define sin and cos for the entire complex plane:
    e^(i*z) = cos(z) + i*sin(z)
    e^(-i*z) = cos(z) - i*sin(z)               cos is an even function, sin is an odd function
    e^(i*z) + e^(-i*z) = 2*cos(z)
    cos(z) = 1/2 (e^(i*z) + e^(-i*z))
    e^(i*z) - e^(-i*z) = 2*i*sin(z)
    sin(z) = 1/(2*i) (e^(i*z) - e^(-i*z))
Euler's Identity
We evaluate Euler's Formula with theta set to pi:
    e^(i*π) = cos(π) + i*sin(π)
            = -1 + 0
            = -1
We add one to both sides to get the typical presentation, e^(i*π) + 1 = 0.
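For the skeptical, here is a numeric check in Python. It evaluates e^(i*π) two ways: with the library function cmath.exp, and by summing the series 1 + z + z^2/2! + z^3/3! + ... directly. The helper exp_series and its cutoff of 30 terms are my own arbitrary choices. Because floating point only approximates π, the results come out extremely close to zero rather than exactly zero.

```python
import cmath
import math

def exp_series(z, terms=30):
    """Partial sum of 1 + z + z^2/2! + z^3/3! + ... (terms is an arbitrary cutoff)."""
    total = 0
    term = 1                      # z^0 / 0!
    for n in range(terms):
        total += term
        term = term * z / (n + 1) # next term: z^(n+1) / (n+1)!
    return total

theta = math.pi
lhs = cmath.exp(1j * theta)                        # e^(i*pi) from the library
rhs = complex(math.cos(theta), math.sin(theta))    # cos(pi) + i*sin(pi)
series = exp_series(1j * theta)                    # e^(i*pi) from the series

print(lhs + 1)       # very close to 0
print(series + 1)    # also very close to 0
```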
Not only does this identity tie together five of the key values of algebra (e, π, i, 1, and 0), it does it with one each of the key operations we derived above (equality, addition, multiplication, exponentiation). That's a pretty sweet equation.
Final Closure
Throughout this presentation, we have expanded our system of numbers as we defined new operators and discovered our system of numbers was not closed under the new operators. But with complex numbers, we have reached a point where we don't need to define any new number types. Complex numbers are sufficient to solve all algebraic equations. This is one of the interpretations of the Fundamental Theorem of Algebra, but the proofs are pretty difficult, so I'm not going to try to prove it here.
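While a proof is out of reach here, we can at least watch closure at work in the simplest interesting case. This Python sketch applies the quadratic formula using cmath.sqrt, so a negative discriminant is no obstacle. The function names are my own, and this covers only quadratics, whereas the theorem applies to polynomials of every degree.

```python
import cmath

def quadratic_roots(a, b, c):
    """Both roots of a*x^2 + b*x + c = 0 for a != 0; coefficients may be complex."""
    d = cmath.sqrt(b * b - 4 * a * c)    # principal square root of the discriminant
    return (-b + d) / (2 * a), (-b - d) / (2 * a)

def residual(a, b, c, x):
    """Plug x back into the polynomial; should be approximately 0 for a root."""
    return a * x * x + b * x + c

print(quadratic_roots(1, 0, 1))     # x^2 + 1 = 0 has no real solution
print(quadratic_roots(1, -2, 5))    # neither does x^2 - 2x + 5 = 0
```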