Mathematical Reasoning
Introductory Notes on Mathematical Proofs and Discrete Math
Abstract
This document presents a compilation of notes I took while taking Discrete Math & Probability Theory (CS70) at the University of California at Berkeley, with an emphasis on Discrete Math. These notes were inspired by the book Introduction to Mathematical Proofs.
Note 1 Numbers, Sets, and Functions
1.1 Quadratic Equations
One of the simplest classes of equations we regularly solve is the linear equation, an equation of the form below, where the greatest power of our variable is one. As a matter of convention, we write the general equation as
$$ax + b = 0 \tag{1.1}$$
Solving this equation can be done by rearranging and then dividing both sides by $a$, yielding
$$x = -\frac{b}{a}$$
Naturally, we may want to solve some more complicated quadratic equations, where the greatest power of the variable is two. Such equations take the general form
$$ax^2 + bx + c = 0, \qquad a \neq 0 \tag{1.2}$$
To solve this equation, we need to isolate $x$ and, in doing so, reduce the number of terms containing the variable from two to one. This can be accomplished by factoring, through a problem-solving strategy called wishful thinking. First, we begin by taking out the coefficient of the first term, $a$:
$$a\left(x^2 + \frac{b}{a}x\right) + c = 0$$
Writing the left-hand side of the equation like this, we can see that by adding the right constant inside the parentheses, the expression there can be reduced to a square:
$$x^2 + \frac{b}{a}x + \left(\frac{b}{2a}\right)^2 = \left(x + \frac{b}{2a}\right)^2$$
Returning to our original equation, this results in the following steps:
$$a\left(x + \frac{b}{2a}\right)^2 - \frac{b^2}{4a} + c = 0$$
We’ve derived the exact type of equation we can easily solve, one in which the variable is restricted to a single term:
$$\left(x + \frac{b}{2a}\right)^2 = \frac{b^2 - 4ac}{4a^2}$$
We needed to include the plus-minus ($\pm$) sign since squaring removes the sign, and therefore the term $x + \frac{b}{2a}$ could be either positive or negative and have the same square. By simplifying the square root and moving the constant beside $x$ to the other side, we derive the quadratic formula, which gives the solutions to a quadratic equation in terms of its coefficients.
$$x + \frac{b}{2a} = \pm\sqrt{\frac{b^2 - 4ac}{4a^2}} \tag{1.3}$$
$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} \tag{1.4}$$
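Though not part of the original notes, the formula is easy to spot-check numerically. A minimal Python sketch (the function name is my own):

```python
import math

def solve_quadratic(a, b, c):
    """Real roots of a*x^2 + b*x + c = 0 via the quadratic formula (1.4)."""
    disc = b * b - 4 * a * c
    if disc < 0:
        return []                                  # no real solutions
    root = math.sqrt(disc)
    return [(-b + root) / (2 * a), (-b - root) / (2 * a)]

# x^2 - 5x + 6 = (x - 2)(x - 3), so the roots are 3 and 2
print(solve_quadratic(1, -5, 6))
```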
From this, we can construct our first theorem in this course.
Theorem 1.
A quadratic equation $ax^2 + bx + c = 0$ has at least one real solution when
$$b^2 - 4ac \geq 0 \tag{1.5}$$
Proof.
In the case where $b^2 - 4ac \geq 0$, we have explicitly constructed a solution to the quadratic equation. To verify this, we will plug the positive root $x = \frac{-b + \sqrt{b^2 - 4ac}}{2a}$ into the equation to ensure the result is 0.
Therefore, we have shown a solution exists when $b^2 - 4ac \geq 0$ for a quadratic equation. More than that, we have determined the specific form of such a solution. End of proof
We can actually show a stronger result about the solutions of a quadratic equation, though the proof is left to the reader.
Theorem 2.
If the discriminant $b^2 - 4ac$ is positive, a quadratic equation has two solutions; if it is negative, the equation has no real solutions; and if it is exactly equal to zero, there is only one unique solution. This can be expressed in the following way.
$$\text{number of real solutions} = \begin{cases} 2 & b^2 - 4ac > 0 \\ 1 & b^2 - 4ac = 0 \\ 0 & b^2 - 4ac < 0 \end{cases}$$
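The case split above translates directly into code. A small sketch, with a function name of my own choosing:

```python
def count_real_solutions(a, b, c):
    """Number of real solutions of a*x^2 + b*x + c = 0 (assumes a != 0)."""
    disc = b * b - 4 * a * c
    if disc > 0:
        return 2      # two distinct real roots
    if disc == 0:
        return 1      # one repeated root
    return 0          # no real roots

print(count_real_solutions(1, -5, 6))  # positive discriminant
print(count_real_solutions(1, 2, 1))   # zero discriminant: (x + 1)^2
print(count_real_solutions(1, 0, 1))   # negative discriminant
```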
The techniques used to solve this problem and prove its solution provide a preview of the kinds of thinking and problem solving which will be invaluable to this course. More than any one theorem or result, this course is an introduction to mathematical thinking.
1.2 Inequalities
Our discussion so far has revolved around statements of equality, and so naturally we might wonder about inequalities. To clearly define our terminology going forward, we will use the following definitions.
| Terminology | Definition |
|---|---|
| Positive | $x > 0$ |
| Negative | $x < 0$ |
| Nonnegative | $x \geq 0$ |
In order to work with these, it is important to review some basic properties of inequalities.
-
i.
For any real numbers $x$ and $y$, exactly one of the following statements is true: $x < y$, $x = y$, or $x > y$. This is known as the trichotomy property.
-
ii.
For real numbers $x$, $y$, and $z$, if $x < y$ and $y < z$, then $x < z$ (transitivity).
-
iii.
Given that $x < y$, adding or subtracting constants from both sides preserves the inequality, meaning for any $c$, the inequalities below are also true.
$$x + c < y + c, \qquad x - c < y - c$$
-
iv.
If $x < y$, the effect of multiplying by another number $c$ depends on the sign of $c$.
$$c > 0 \implies cx < cy, \qquad c < 0 \implies cx > cy$$
In other words, if $c$ is negative, the sign of the inequality is flipped.
-
v.
The square of any real number is nonnegative. Furthermore, the square of a number is 0 if and only if the number itself is 0.
These five fundamental properties are essential in proving some foundational inequalities, the first of which is called the Arithmetic-Geometric Mean Inequality. In the presentation of the theorem, we explain how to approach proofs.
Theorem 3 (AGM Inequality).
If $a$ and $b$ are real numbers, then:
$$\left(\frac{a+b}{2}\right)^2 \geq ab \tag{1.6}$$
Rough Work.
A mathematical proof begins with true statements and derives the desired result through a series of logically sound steps. In looking for an appropriate place to start, it often helps to rearrange the statement we want to prove until we reach a true statement, then work backwards to construct the proof. In this case:
$$\left(\frac{a+b}{2}\right)^2 \geq ab \iff a^2 + 2ab + b^2 \geq 4ab \iff a^2 - 2ab + b^2 \geq 0 \iff (a-b)^2 \geq 0$$
We know the last line is true, since the square of any number must be nonnegative; hence writing the proof is now easy.
Proof.
For any real numbers $a$ and $b$, we know the square of the difference must be nonnegative.
$$(a-b)^2 \geq 0 \implies a^2 - 2ab + b^2 \geq 0 \implies a^2 + 2ab + b^2 \geq 4ab \implies \left(\frac{a+b}{2}\right)^2 \geq ab$$
Therefore, the Arithmetic-Geometric Mean Inequality holds for any real numbers. End of proof
Furthermore, in the case where both $a$ and $b$ are nonnegative, we can take positive square roots to obtain another form of the inequality.
$$\frac{a+b}{2} \geq \sqrt{ab} \tag{1.7}$$
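Not part of the original notes, but the inequality (1.7) can be spot-checked in Python; the helper name is my own:

```python
import math

def agm_holds(a, b):
    """Check (a + b)/2 >= sqrt(a*b) for nonnegative a and b."""
    return (a + b) / 2 >= math.sqrt(a * b)

pairs = [(0, 0), (1, 9), (4, 4), (2, 8), (0.5, 50)]
assert all(agm_holds(a, b) for a, b in pairs)
# equality exactly when a == b, strict inequality otherwise
assert (4 + 4) / 2 == math.sqrt(4 * 4)
assert (1 + 9) / 2 > math.sqrt(1 * 9)
```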
This form elucidates the name of the inequality: the left-hand side is the arithmetic mean of $a$ and $b$, while the right-hand side is their geometric mean, and this powerful inequality relates the two types of averages. Our proof also provides some additional insight, since it begins with the claim
$$(a-b)^2 \geq 0$$
Remember the square of any real number is equal to 0 if and only if the number is 0. Therefore we have a simple condition for equality in the AGM Inequality.
Theorem 4.
Equality holds in the AGM Inequality if and only if $a = b$. In other words,
$$\frac{a+b}{2} = \sqrt{ab} \iff a = b \tag{1.8}$$
To introduce the second major inequality, the absolute value of a number, traditionally denoted as $|x|$, needs to be defined.
$$|x| = \begin{cases} x & x \geq 0 \\ -x & x < 0 \end{cases}$$
The absolute value can be thought of as the positive distance to the origin. For example, $-2$ is exactly 2 units away from the origin, so $|-2| = 2$. The absolute value can also be visualized through its graph.
Once again, we present fundamental properties of absolute values, in order to provide a foundation for working with these mathematical functions.
-
i.
For any real number $x$, $|x| \geq 0$ and $|x| \geq x$.
-
ii.
Given real numbers $x$ and $y$, $|xy| = |x||y|$.
-
iii.
Squaring a number strips its negative sign, and taking the positive root ensures the result will be nonnegative. Therefore the [positive] square root of the square is the absolute value: $\sqrt{x^2} = |x|$.
The last property is rather interesting, and can be proved. Notice that for proving these simple properties of absolute values, a proof by cases is often most effective: a proof technique in which we break all possibilities into distinct cases or scenarios and prove the claim for each of them. Since the absolute value function behaves differently for positive and negative numbers, these are naturally our cases.
Theorem 5.
For any real number $x$, we have:
$$\sqrt{x^2} = |x| \tag{1.9}$$
Proof by Cases.
Consider the cases when $x$ is positive, zero, and negative separately. Note that, formally, exactly one of these cases must hold by the trichotomy property (see the first property of inequalities above).
Case 1.
If $x > 0$, then $|x| = x$ and $\sqrt{x^2} = x$, so the claim holds.
Case 2.
If $x = 0$, then $\sqrt{0^2} = 0 = |0|$.
Case 3.
Assume that $x = -a$ for some positive real number $a$. Notice that $x^2 = (-a)^2 = a^2$. Plugging this into the expression for the square root of the square:
$$\sqrt{x^2} = \sqrt{a^2} = a = -x = |x|$$
End of proof
At this point, we can go ahead and introduce the next major inequality. Note that we mentioned that the absolute value of the product of two numbers is the product of the absolute values, or $|xy| = |x||y|$. Unfortunately, the same does not hold true for sums; the analogous statement $|x + y| = |x| + |y|$ is false. For example, $|1 + (-1)| = 0$ while $|1| + |-1| = 2$.
However, there is a relationship between the two quantities which can be expressed as an inequality.
Theorem 6 (Triangle Inequality).
Given real numbers $x$ and $y$, the sum of the absolute values is always greater than or equal to the absolute value of the sum.
$$|x| + |y| \geq |x + y| \tag{1.10}$$
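The triangle inequality is also easy to check exhaustively over a small grid of values. A quick Python sketch (function name my own):

```python
from itertools import product

def triangle_holds(x, y):
    """Check |x| + |y| >= |x + y|."""
    return abs(x) + abs(y) >= abs(x + y)

vals = [-3, -1, 0, 2, 5]
assert all(triangle_holds(x, y) for x, y in product(vals, repeat=2))
# strict when the signs differ, equality when they agree
assert abs(-3) + abs(2) > abs(-3 + 2)
assert abs(2) + abs(5) == abs(2 + 5)
```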
Rough work.
Once again, we begin with rough work to determine how to progress on the proof. In order to remove the absolute value signs, it seems reasonable to square both sides.
$$(|x| + |y|)^2 \geq |x+y|^2 \iff x^2 + 2|x||y| + y^2 \geq x^2 + 2xy + y^2 \iff |xy| \geq xy$$
Notice we arrived at a statement which is clearly true. Remember that the absolute value of any real number is always greater than or equal to the number. At this point we can write the proof.
Proof.
Consider any real numbers $x$ and $y$. From the basic properties of absolute values, we know the product of these numbers must be less than or equal to its absolute value.
$$xy \leq |xy| = |x||y|$$
Multiplying by 2 and adding $x^2 + y^2$ to both sides gives
$$(x+y)^2 = x^2 + 2xy + y^2 \leq x^2 + 2|x||y| + y^2 = (|x| + |y|)^2$$
Both terms being squared are nonnegative, hence we can safely take the square root of both sides without affecting the inequality.
$$|x + y| \leq |x| + |y|$$
End of proof
Once again in this proof, we used rough work to arrive at a statement we knew was true, and then worked backward to derive the result we wanted from this true statement. We had to be more careful working backward in this example since we took the square root of an inequality. Though this works when both terms being squared are nonnegative, it may fail spectacularly otherwise.
1.3 Sets
So far we have been relying on intuition to define the real numbers, the focus of our attention. Although the formal definitions of the real numbers are outside the scope of these notes, it is worth paying attention to this fundamental notion of a group of numbers. In mathematics, this concept is represented as a set, a well-defined collection of objects. Some examples are listed below.
$$\{1, 2, 3\}, \qquad \{n^2 : n \in \mathbb{Z}, n \text{ even}\}, \qquad \emptyset = \{\}$$
The last example presents two ways to denote the empty set, the set with no elements. Notice the second example is the set of squares of all even numbers; the ':' introduces a condition in the set definition. Since every set must be well-defined, every object is either an element of the set or not. If a set $A$ contains the element $x$ but does not contain $y$, we can write $x \in A$ and $y \notin A$.
In defining sets, we often specify the elements to be included from some universe $U$ of objects. However, the elements not in $A$ clearly make a set as well, one which is intrinsically linked to $A$. This set is called the complement of $A$, written as $A^c$ and defined as
$$A^c = \{x \in U : x \notin A\}$$
Though we will explore notions of size in more depth later on, for now we define the cardinality of a set $A$, written as $|A|$, as its size. For finite sets, sets without an infinite number of elements, the cardinality is simply the number of elements. All such cardinalities are in the set $\{0, 1, 2, \ldots\}$, with 0 reserved for the empty set.
Since a set is unordered and cannot contain duplicate items, a natural definition of equality is that both sets contain the same elements. On the other hand, if every element of set $A$ is in set $B$, then in some sense the latter set contains the former. Mathematically, we write that $A$ is a subset of $B$, or $A \subseteq B$.
Notice that under this definition, the empty set is a subset of every set. Similarly, a set is always a subset of itself. In cases where we want to exclude this possibility, that the sets are equal, when describing the relationship between the two sets, we can write that $A$ is a proper subset of $B$, or $A \subset B$, which means $A \subseteq B$ but $A \neq B$. Along with the real numbers, there are a handful of other important sets used to refer to classes of numbers.
| Symbol | Name | Set |
|---|---|---|
| $\mathbb{N}$ | Natural Numbers | $\{1, 2, 3, \ldots\}$ |
| $\mathbb{Z}$ | Integers | $\{\ldots, -2, -1, 0, 1, 2, \ldots\}$ |
| $\mathbb{Q}$ | Rational Numbers | $\{p/q : p, q \in \mathbb{Z}, q \neq 0\}$ |
| $\mathbb{R}$ | Real Numbers | all rational and irrational numbers |
Notice we defined the natural numbers to exclude 0; the symbol $\mathbb{N}_0$ could be used to refer to all naturals along with 0. Based on the descriptions of each of these sets, each contains the previous plus some additional elements. Therefore, we can write
$$\mathbb{N} \subset \mathbb{Z} \subset \mathbb{Q} \subset \mathbb{R} \tag{1.11}$$
Since each set contains the previous, we may wonder what elements are in one but not the other. In some cases, this set itself might be valuable. For example, all real numbers which are not rational are referred to as irrational, and the set which contains all such numbers is the irrational numbers. This set, then, is the result of taking the real numbers and subtracting away the rational numbers. This operation, along with a few more, is formalized below for arbitrary sets $A$ and $B$.
$$A \setminus B = \{x : x \in A, x \notin B\}, \qquad A \cap B = \{x : x \in A \text{ and } x \in B\}, \qquad A \cup B = \{x : x \in A \text{ or } x \in B\}$$
The intersection is the set of all elements in both, while the union is the set of all elements in either. Two sets are called disjoint if their intersection is the empty set, meaning they have no common elements. Written out, this would mean for disjoint sets $A$ and $B$,
$$A \cap B = \emptyset$$
The Cartesian product of two sets outputs a set with a completely different structure: a set of tuples containing all possible pairs of elements from $A$ and $B$. Unlike sets, tuples are ordered lists of elements which may contain duplicates.
$$A \times B = \{(a, b) : a \in A, b \in B\}$$
This idea of product can be extended beyond two sets. Repeated products of the same set can be abbreviated as a power, similar to the real numbers:
$$\underbrace{A \times A \times \cdots \times A}_{n \text{ times}} = A^n$$
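These set operations are not part of the original notes as code, but Python's built-in `set` type mirrors them directly; the sets `A`, `B`, and universe `U` below are illustrative choices of my own:

```python
from itertools import product

A = {1, 2, 3}
B = {3, 4}
U = {1, 2, 3, 4, 5}        # a universe, for taking complements

print(A & B)               # intersection
print(A | B)               # union
print(A - B)               # set difference
print(U - A)               # complement of A relative to U
print(set(product(A, B)))  # cartesian product: set of ordered pairs
```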
One final concept that must be introduced to complete our discussion of introductory set theory is the power set of a set $S$, which contains all possible subsets. Formally, we write this as
$$\mathcal{P}(S) = \{T : T \subseteq S\}$$
To ensure this section includes a proof, we end with a simple theorem about the size of a power set. We will explore another proof of this theorem later in the course.
Theorem 7.
For a finite set $S$, if $|S| = n$, then its power set has cardinality $|\mathcal{P}(S)| = 2^n$.
Examples.
One way to begin working on a proof is to test the claim with examples, since they may yield insight into why the result is true.
In each of these cases, the result holds. When the sizes of the sets were 0 and 2, their power sets had sizes of 1 and 4 respectively. Notice we would also expect the number of subsets to pair up, since for every subset $T$, its complement $S \setminus T$ is also one, and this theorem seems to check out in that regard. At this point we can present the proof.
Proof.
Since $S$ is finite and $|S| = n$, the set must have $n$ elements. Let us arbitrarily assign labels so that
$$S = \{s_1, s_2, \ldots, s_n\}$$
In constructing a subset, each of these $n$ distinct elements can be chosen either to be included or excluded from the subset. Since every subset can be created with this series of choices, and each choice has two options, the number of subsets is $2^n$. End of proof
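The counting argument above can be checked by generating every subset explicitly. A small sketch, not from the original notes, using `itertools.combinations` to enumerate subsets by size:

```python
from itertools import combinations

def power_set(s):
    """Every subset of s: for each size r, take all r-element subsets."""
    elems = list(s)
    return [set(c) for r in range(len(elems) + 1) for c in combinations(elems, r)]

S = {'a', 'b', 'c'}
subsets = power_set(S)
print(len(subsets))        # 2^3 = 8
```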
A major class of proofs involves proving two sets are equal. One possible way of doing this involves showing each set is a subset of the other. To prove set $A$ is a subset of $B$, we must show that if $x$ is an element of $A$ then it is also an element of $B$. This would mean every element of $A$ is contained in $B$, and therefore $A \subseteq B$. To provide an example, we introduce another theorem.
Theorem 8 (De Morgan’s laws).
Given two sets $A$ and $B$, the following statements are true.
$$(A \cup B)^c = A^c \cap B^c \tag{1.12}$$
$$(A \cap B)^c = A^c \cup B^c \tag{1.13}$$
Proof.
Since the proofs of the two statements are nearly identical, only the first will be proved; the second is left as an exercise. Since we are proving the equality of sets, we proceed by showing the left-hand side (LHS) is a subset of the right-hand side (RHS) and vice versa.
Part 1.
Assume that $x \in (A \cup B)^c$; we will show $x$ is also an element of $A^c \cap B^c$ through a series of implications.
$$x \in (A \cup B)^c \implies x \notin A \cup B \implies x \notin A \text{ and } x \notin B \implies x \in A^c \text{ and } x \in B^c \implies x \in A^c \cap B^c$$
Part 2.
Now assume that $x \in A^c \cap B^c$; we will show $x$ is also an element of $(A \cup B)^c$.
$$x \in A^c \cap B^c \implies x \notin A \text{ and } x \notin B \implies x \notin A \cup B \implies x \in (A \cup B)^c$$
Since we have proved that $(A \cup B)^c \subseteq A^c \cap B^c$ and $A^c \cap B^c \subseteq (A \cup B)^c$, this is enough to show equality.
End of proof
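Although the proof above is the real argument, De Morgan's laws can also be spot-checked on concrete sets. A quick Python sketch; the sets and the helper `comp` are illustrative choices of my own:

```python
A = {1, 2}
B = {2, 3}
U = {1, 2, 3, 4}           # universe for taking complements

def comp(S):
    """Complement relative to the universe U."""
    return U - S

assert comp(A | B) == comp(A) & comp(B)    # (A ∪ B)^c = A^c ∩ B^c
assert comp(A & B) == comp(A) | comp(B)    # (A ∩ B)^c = A^c ∪ B^c
print(comp(A | B), comp(A & B))
```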
1.4 Functions
At this point we can introduce a very special type of set which has profound uses in mathematics. This, of course, is a function: a set of input-output ordered pairs, which can be represented as
$$f = \{(x, f(x)) : x \in \text{domain of } f\}$$
| Term | Definition |
|---|---|
| Domain | The set of all allowed inputs |
| Codomain (Target Space) | The set of all allowed outputs |
| Image (Range) | The set of all actual outputs |
The function, then, takes an input from the domain and returns a single output in the codomain. This property of linking the domain to the codomain explains why functions are often called maps. Since the image is the set of actual outputs, and a subset of the codomain, it is the set that results from applying the function to every element in the domain. Thus, for a function $f$ with domain $A$, the image is often denoted as $f(A)$.
In defining a function, the domain and codomain must be explicitly specified, hence the following notation is often used:
$$f : A \to B$$
A function is called real-valued if its image is a subset of the real numbers, and thus all its outputs are real numbers. The graph of a function is a plot of all the input-output pairs. In working with functions there are a few elementary ways to combine them.
$$(f + g)(x) = f(x) + g(x) \tag{1.14}$$
$$(fg)(x) = f(x)g(x) \tag{1.15}$$
$$(f \circ g)(x) = f(g(x)) \tag{1.16}$$
We are often extremely interested in the image of a real-valued function. In particular, we may wonder about the nature of this set: does it have a maximum or minimum, or does the function go to infinity? Formally, a function $f$ is bounded if there exists some $M > 0$ such that
$$|f(x)| \leq M$$
for all $x$ in the domain. Notice $M$ in this definition acts as both a positive and negative bound, since the statement is equivalent to
$$-M \leq f(x) \leq M$$
It turns out that sums and products of bounded functions are bounded as well, results we prove below. Notice in these proofs we clearly state our assumptions and what we plan on showing, a technique that helps inform our approach.
Theorem 9.
If functions $f$ and $g$ are bounded, so are the functions $fg$ and $f + g$, formed by the product and sum of these functions respectively.
Proof.
Since $f$ and $g$ are bounded, there exist $M > 0$ and $N > 0$ for which
$$|f(x)| \leq M, \qquad |g(x)| \leq N$$
for all $x$. We hope to show there exists some bound on $fg$, therefore our aim should be to express $|f(x)g(x)|$ in terms of the quantities we already have bounds for.
$$|f(x)g(x)| = |f(x)||g(x)| \leq MN$$
This shows that we have discovered a bound for the function: the product of the two bounds of our original functions. Hence the function $fg$ is bounded. End of proof
Proof.
Once again, assume there exist $M > 0$ and $N > 0$ for which
$$|f(x)| \leq M, \qquad |g(x)| \leq N$$
We wish to find a bound for $f + g$. However, in this case it is trickier to decompose $|f(x) + g(x)|$ into $|f(x)|$ and $|g(x)|$, and doing so requires the triangle inequality.
$$|f(x) + g(x)| \leq |f(x)| + |g(x)| \leq M + N$$
Thus, we have proved $|f(x) + g(x)| \leq M + N$, showing the sum of the bounds of the functions is a bound for the sum of the functions. End of proof
In general, approach bounding proofs by expressing the bounded functions through their definitional inequalities. Then manipulate these inequalities using inequality or absolute value rules to obtain the desired result.
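The two bounding results can be illustrated numerically. In this sketch (not from the notes), I take $f = \sin$ with bound $M = 1$ and $g = \cos$ with bound $N = 1$, so the theorems predict $|fg| \leq MN = 1$ and $|f + g| \leq M + N = 2$:

```python
import math

M, N = 1.0, 1.0
xs = [k / 10 for k in range(-100, 101)]   # sample points in [-10, 10]

# check the predicted bounds from the two proofs above
prod_bounded = all(abs(math.sin(x) * math.cos(x)) <= M * N for x in xs)
sum_bounded = all(abs(math.sin(x) + math.cos(x)) <= M + N for x in xs)
print(prod_bounded, sum_bounded)
```

Note the bound $M + N$ need not be tight: $|\sin x + \cos x|$ never exceeds $\sqrt{2}$, but the theorem only promises some bound exists.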
1.5 Fields
So far in our dealings with $\mathbb{R}$, we have been implicitly relying on properties, or axioms, of arithmetic in the real numbers. Many of these properties are not exclusive to the reals, and apply to a class of algebraic structures called fields. A field $(F, +, \times)$ consists of the following.
$F$ is the set of elements in the field, including at least the additive identity 0 and the multiplicative identity 1. A field also includes two predefined binary operations, addition and multiplication, which take two inputs in $F$ and return the sum and product in $F$ respectively. These functions must obey the following axioms for all $a, b, c \in F$.
| Name | Addition | Multiplication |
|---|---|---|
| Closure | $a + b \in F$ | $a \times b \in F$ |
| Associativity | $(a + b) + c = a + (b + c)$ | $(a \times b) \times c = a \times (b \times c)$ |
| Commutativity | $a + b = b + a$ | $a \times b = b \times a$ |
| Identity | $a + 0 = a$ | $a \times 1 = a$ |
| Inverses | each element has an additive inverse | each non-zero element has a multiplicative inverse |
| Distributivity | $a \times (b + c) = (a \times b) + (a \times c)$ | |
To clarify the inverse axioms, the first states that for any $a \in F$, there exists a $-a$ in the field such that
$$a + (-a) = 0$$
In this case, $-a$ is simply an element of the field, and is called the additive inverse of $a$. By commutativity, $a$ is the additive inverse of $-a$. Sometimes, the additive inverse of a field element is the element itself; the trivial example is 0:
$$0 + 0 = 0$$
For non-zero elements $a$ in the field, a multiplicative inverse exists as well, which means there exists an $a^{-1} \in F$ such that
$$a \times a^{-1} = 1$$
Similar to the additive inverse, this inverse is an element of the field and likewise can be the same number as the original. With this axiom, we can define two new operations in the field. Subtraction amounts to adding the additive inverse, and division is multiplication by the multiplicative inverse. For field elements $a, b$ in some field $F$:
$$a - b := a + (-b) \tag{1.17}$$
$$a / b := a \times b^{-1}, \qquad b \neq 0 \tag{1.18}$$
Notice the $:=$ here emphasizes that the expression on the left is defined to be equal to the expression on the right. To avoid confusion when working in fields, we will refrain from using subtraction and division and instead refer to the inverses.
Naturally, having laid out these axioms, we may now wonder which classes of numbers, with regularly defined addition and multiplication, constitute fields. Proving whether a set with defined operations is a field is simply a matter of checking each of the axioms.
Theorem 10.
The natural numbers and integers are not fields.
Proof.
We will prove this theorem in two parts, first dealing with the natural numbers, followed by the integers.
Part 1.
The additive inverse axiom fails for all natural numbers. Specifically, the number 1, for example, has no additive inverse, since $1 + n \neq 0$ for all $n \in \mathbb{N}$. Since 1 has no additive inverse, $\mathbb{N}$ is not a field.
Part 2.
The multiplicative inverse axiom fails for $\mathbb{Z}$. One way of rigorously showing this is to notice $2n$ is even for all $n \in \mathbb{Z}$. Therefore this value can never be 1, and hence no multiplicative inverse of 2 exists.
End of proof
Theorem 11.
The rational numbers and real numbers with regularly defined addition and multiplication constitute fields.
Proof.
The process of proving these are fields is rather tedious, verifying each of the field axioms individually, and is left as an exercise. However, note that the inverse axioms that failed for the natural numbers and integers hold for these two sets. The additive and multiplicative inverses of an element $a$ are respectively $-a$ and $\frac{1}{a}$ (for $a \neq 0$).
End of proof
This suite of axioms leads to a host of simple properties, which should feel familiar after working with the real numbers. We introduce an expansive theorem which lays out many of these properties.
Theorem 12.
Given a field $F$ defined with addition and multiplication and three elements $a$, $b$, and $c$, the following properties must hold.
$$a \times 0 = 0 \tag{1.19}$$
$$-(-a) = a \tag{1.20}$$
$$(-a) \times b = -(a \times b) \tag{1.21}$$
$$(a^{-1})^{-1} = a, \qquad a \neq 0 \tag{1.22}$$
$$a \times b = 0 \implies a = 0 \text{ or } b = 0 \tag{1.23}$$
$$a \times b = a \times c \text{ and } a \neq 0 \implies b = c \tag{1.24}$$
Proof.
Proving every statement in the theorem would take up a lot of space, hence we will present proofs for the first and last statement, hopefully providing a model for approaching proofs to the remaining statements.
Statement 1.19.
Notice that although we take this for granted in the real numbers, it is not a field axiom and must be proved. Since 0 is the additive identity, in searching for a proof we should look to exploit its properties.
$$a \times 0 = a \times (0 + 0) \qquad \text{identity}$$
$$a \times 0 = a \times 0 + a \times 0 \qquad \text{distributive}$$
At this point, notice by closure we know $a \times 0$ lies within the field. Therefore, by the inverses axiom it must have an additive inverse $-(a \times 0)$. Now we add this to both sides of the equation.
$$a \times 0 + (-(a \times 0)) = (a \times 0 + a \times 0) + (-(a \times 0))$$
$$0 = a \times 0 + \big(a \times 0 + (-(a \times 0))\big) \qquad \text{associative}$$
$$0 = a \times 0 + 0 \qquad \text{inverse}$$
$$0 = a \times 0 \qquad \text{identity}$$
This is the result we were looking for. Notice instead of subtracting $a \times 0$ from each side, as we would in the real numbers, we added the additive inverse to both sides. In this proof we clearly listed which axiom allowed us to make each step, and although this is not necessary, it makes it less likely that we perform an operation which is not allowed.
Statement 1.24.
Once again, we attempt a direct proof based on the field axioms. Since $a \neq 0$, from the inverses axiom we know there exists some $a^{-1}$ such that
$$a \times a^{-1} = 1$$
We simply multiply both sides of our equation $a \times b = a \times c$ by this term.
$$a^{-1} \times (a \times b) = a^{-1} \times (a \times c)$$
$$(a^{-1} \times a) \times b = (a^{-1} \times a) \times c \qquad \text{associativity}$$
$$1 \times b = 1 \times c \qquad \text{inverse}$$
$$b = c \qquad \text{identity}$$
Once again, we have arrived at our result, hence we have proved this statement for arbitrary fields. Notice we could approach this problem as if we were asked to prove it for real numbers: instead of dividing both sides by $a$, we multiplied by $a^{-1}$ to arrive at the result. End of proof
The two fields we have discussed so far have had an infinite number of elements, with predefined addition and multiplication operations that mirror their common usage. However, we may wonder what other fields exist, and in particular, what such a field would look like with a finite number of elements: a finite field. Since a field must have at least two elements, 0 and 1, let us attempt to construct a two-element field, called the binary field. Consider the field
$$F = \{0, 1\}$$
Unfortunately, we must explicitly define multiplication and addition in this field, since otherwise the ordinary operations would defy the closure field axiom, as
$$1 + 1 = 2 \notin F$$
For finite fields with a small number of elements, the most effective method of defining these operations involves writing addition and multiplication tables, providing outputs for all possible pairs of inputs for each operation.
Notice that multiplying by 0 always returns 0 (as we proved above) and multiplying by 1 gives back the same number. Similarly, adding 0 returns the same number by the identity property.
This only leaves one output to be decided: the sum $1 + 1$. In the next theorem we prove that this is equal to 0. To do this, we introduce a new type of proof, a proof by contradiction.
Theorem 13.
If $F = \{0, 1\}$ is the binary field, then $1 + 1 = 0$.
Proof by Contradiction.
By the closure property of fields, we know the sum $1 + 1$ must lie within $F$, and therefore it is either 0 or 1. Assume for the sake of contradiction that it is indeed 1, so that
$$1 + 1 = 1$$
By the inverse axiom, we know that 1 has an additive inverse $-1$ in the field. We now add this inverse to both sides of the equation.
$$(1 + 1) + (-1) = 1 + (-1) \implies 1 = 0$$
This is clearly false, since by definition a field must have distinct elements 0 and 1. Since this would be a logical consequence of letting $1 + 1 = 1$, that assumption must be false. Hence we are left only with the possibility that
$$1 + 1 = 0$$
End of proof
In a proof by contradiction, we assume the opposite of the statement we are trying to prove, and show how this leads to a result that is clearly false. This contradiction shows that our initial assumption must be false, and therefore its opposite is true. In this way, we indirectly prove our claim is true by showing it cannot be false. In the last proof, we demonstrated that if $1 + 1 = 1$ then $1 = 0$, which is clearly false. Therefore, we could conclude $1 + 1 \neq 1$, which meant $1 + 1 = 0$, what we wanted to show.
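Though the notes construct the binary field axiomatically, it is a well-known fact that arithmetic modulo 2 realizes it (addition becomes XOR, multiplication becomes AND). A quick sketch confirming the tables:

```python
# The binary field realized as arithmetic mod 2.
F2 = [0, 1]
add = lambda a, b: (a + b) % 2
mul = lambda a, b: (a * b) % 2

assert add(1, 1) == 0                        # the result proved above
assert all(add(a, 0) == a for a in F2)       # additive identity
assert all(mul(a, 1) == a for a in F2)       # multiplicative identity
print(add(1, 1))
```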
Moving on, we can tackle the construction of the three-element field $F = \{0, 1, 2\}$. Immediately, we write the addition and multiplication tables and fill out the simple rows and columns.
Completing the multiplication table is simple; since every non-zero element in the field must have a multiplicative inverse, in the case of 2, this must be 2 itself. In other words, every non-zero row or column in a field multiplication table must contain a single 1.
This leaves the addition table to be completed; to do this, we introduce a new theorem which helps complete the non-diagonal entries.
Theorem 14.
For a three-element field with $F = \{0, 1, 2\}$, the following equation must be true.
$$1 + 2 = 0$$
Proof by Contradiction.
Assume for the sake of contradiction that $1 + 2 \neq 0$. Then, by closure, the sum must be either 1 or 2. We will show either case results in a contradiction.
Case 1.
Suppose $1 + 2 = 1$. Since 1 is in the field, the inverses property tells us that an additive inverse $-1$ exists. Adding this additive inverse to both sides, we immediately see the contradiction.
$$(1 + 2) + (-1) = 1 + (-1) \implies 2 = 0$$
Since the elements of a field are distinct, this is a contradiction.
Case 2.
Suppose $1 + 2 = 2$. We approach this case in a very similar way, clearing the 2's from both sides to expose a contradiction.
$$(1 + 2) + (-2) = 2 + (-2) \implies 1 = 0$$
This, again, is a contradiction. Therefore, in the absence of any alternatives, we have proved that $1 + 2 = 0$. End of proof
Note in this proof, we applied a case-by-case approach to a proof by contradiction. This is analogous to a proof by cases where we prove the result in every case, except we actually show a contradiction in each case.
The remaining two spaces are rather easy to fill out if we remember the cancellation property that in a field,
$$a + b = a + c \implies b = c$$
Fixing $a$ by focusing on an individual row or column, this tells us that the entries for different inputs must be different. This is because, for example, if $1 + 1$ equaled $1 + 2$, cancellation would force $1 = 2$.
Interestingly, this means every row and column of the addition table must contain each of the field elements. Exploiting the analogous rule for multiplication, every non-zero row or column must have every field element appear exactly once. Armed with this knowledge, we can finally complete the operator definitions, showing a three-element field exists.
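As with the binary field, arithmetic modulo 3 is a standard realization of the three-element field; the sketch below (not from the notes) brute-forces the axioms over every combination of elements:

```python
from itertools import product

# The three-element field realized as arithmetic mod 3.
F = [0, 1, 2]
add = lambda a, b: (a + b) % 3
mul = lambda a, b: (a * b) % 3

for a, b, c in product(F, repeat=3):
    assert add(a, b) == add(b, a) and mul(a, b) == mul(b, a)      # commutativity
    assert add(add(a, b), c) == add(a, add(b, c))                 # + associativity
    assert mul(mul(a, b), c) == mul(a, mul(b, c))                 # * associativity
    assert mul(a, add(b, c)) == add(mul(a, b), mul(a, c))         # distributivity
assert all(any(add(a, b) == 0 for b in F) for a in F)             # additive inverses
assert all(any(mul(a, b) == 1 for b in F) for a in F if a != 0)   # mult. inverses
print(add(1, 2), mul(2, 2))
```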
1.6 Exercises
-
(a)
Prove that for two real numbers and , .
-
(b)
Given , determine the maximum value of using the arithmetic-geometric mean inequality.
-
(c)
For real numbers show that
-
(d)
When does equality hold for the triangle inequality? In other words, for what values of and does
-
(e)
Prove for any the following inequality holds true.
-
(f)
Prove the following for three real numbers , , and .
-
(g)
Show the image of the function is the set (0, 1], for
-
(h)
Let and be sets in the domain of . Prove the following relationship about the images of the function.
-
(i)
Let and be two bounded functions. Which of the following functions are also bounded?
-
(j)
If and are bounded, prove that the function is bounded.
-
(k)
Consider the four element field with operations and . Complete the addition and multiplication tables below, if possible, and determine the number of different fields which satisfy the axioms.
-
(l)
For each of the statements below either provide a proof or counterexample. (1) For any , we have , and (2) for any in .
Note 2 Language and Proofs
2.1 Statements and Connectives
The atomic unit of mathematical reasoning is the mathematical statement or proposition, a sentence that is either true or false. This requirement is called the principle of bivalence. For a statement, its truth value is whether it is true or false. Examples of statements are given below.
In our examples, the first statement is false while the other two are true. On the other hand “let them eat cake” is not a mathematical statement. These statements can be combined with logical connectives to express more complicated thoughts, which are listed below.
| Name | Usage | Meaning |
|---|---|---|
| Negation | $\neg p$ | not $p$ |
| Conjunction | $p \wedge q$ | $p$ and $q$ |
| Disjunction | $p \vee q$ | $p$ or $q$ |
| Conditional | $p \implies q$ | if $p$ then $q$ |
| Biconditional | $p \iff q$ | $p$ if and only if $q$ |
Similar to defining operators in a finite field, we provide rigorous definitions for these logical connectives by showing whether the compound statement is true or false for every possible input. Fortunately, there are very few possible inputs, since each statement can only be true or false. In composing a table, called a truth table, we let each row correspond to a unique input.
Notice this truth table completely defines negation, since it tells us how the statement behaves given every input. In this case, if a statement is true its negation is false, and if a statement is false its negation is true. In English, then, the negation of a statement would be its opposite. The truth tables for the remaining connectives are given below; notice each requires two inputs and thus has four possible input combinations.
These truth tables provide complete definitions for each of these operators, but fail to provide very much intuition.
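The row-per-input construction is mechanical, so it is easy to automate. A small Python sketch, not from the notes, that prints the truth table of any two-input connective:

```python
from itertools import product

def truth_table(connective):
    """One row per input: each (p, q) pair with the connective's output."""
    return [((p, q), connective(p, q)) for p, q in product([True, False], repeat=2)]

conjunction = lambda p, q: p and q
for (p, q), out in truth_table(conjunction):
    print(p, q, out)
```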
-
i.
Negation: As mentioned before, if $p$ is true its negation is false and vice versa. The simple interpretation of this is that $\neg p$ is the opposite of $p$.
-
ii.
Conjunction: $p \wedge q$ is true only when both $p$ and $q$ are true, which matches how we think about "and."
-
iii.
Disjunction: $p \vee q$ is true when either $p$, $q$, or both $p$ and $q$ are true. Another way to think about this is that $p \vee q$ is only false when both statements $p$ and $q$ are false. There is a slight difference between this logical "or" and the one in everyday usage; oftentimes, "or" in English means one or the other but not both, which would be called the exclusive-or in mathematics.
-
iv.
Conditional: This connective is essential in proofs, and thus is worth taking a closer look at.
In $p \implies q$, the first statement is called the hypothesis and the second statement the conclusion. The statement $p \implies q$, interpreted as either
"if $p$ then $q$" or
"$p$ implies $q$,"
means when $p$ (the hypothesis) is true, $q$ (the conclusion) must be true as well. On the other hand, when $p$ is false, it does not matter whether $q$ is true or false, and the statement is true in either case. Once again, the only time a conditional is false is when $p$ is true but $q$ is false.
-
v.
Biconditional: A biconditional is true when both $p$ and $q$ have the same truth value, and false when they do not. Therefore, if the biconditional is true, the statements $p$ and $q$ are logically interchangeable, since both are true or false together. For this reason, $p$ and $q$ can be said to be "logically equivalent." This concept can be expressed in two ways, as either
$$p \iff q \qquad \text{or} \qquad p \equiv q$$
where the second notation is traditionally used to indicate the two compound logical expressions are equivalent, especially when one side might itself contain the biconditional connective.
It turns out that from these logical connectives we can construct many expressions which are identical. We prove a few of the important logical equivalences.
Theorem 15 (De Morgan’s laws).
For two logical statements $p$ and $q$, the following equivalences are true.
$$\neg(p \wedge q) \equiv \neg p \vee \neg q \tag{2.1}$$
$$\neg(p \vee q) \equiv \neg p \wedge \neg q \tag{2.2}$$
Proof.
The general way to prove logical statements are equivalent is to construct truth tables and show both statements have identical outputs for the same inputs. In order to avoid repetitiveness, we will do this for the first statement and leave the proof of the second as an exercise.
| $p$ | $q$ | $p \wedge q$ | $\neg(p \wedge q)$ |
|---|---|---|---|
| T | T | T | F |
| T | F | F | T |
| F | T | F | T |
| F | F | F | T |
This gives us the truth table for the left-hand side of the equivalence. We follow the same process, slowly building up the right-hand side.
| P | Q | ¬P | ¬Q | (¬P) ∨ (¬Q) |
|---|---|---|---|---|
| T | T | F | F | F |
| T | F | F | T | T |
| F | T | T | F | T |
| F | F | T | T | T |
Since ¬(P ∧ Q) and (¬P) ∨ (¬Q) are true and false together given the same inputs (their final columns are identical), the two statements are logically equivalent.
End of proof
Theorem 16.
The following important logical equivalences are also true.
P → Q ≡ (¬P) ∨ Q (2.3)
P → Q ≡ (¬Q) → (¬P) (2.4)
¬(P → Q) ≡ P ∧ (¬Q) (2.5)
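These propositional equivalences can also be checked mechanically by enumerating every truth assignment, exactly as in the truth-table proof above; a minimal Python sketch (the helper names are my own):

```python
from itertools import product

# Brute-force check of a logical equivalence by comparing truth tables:
# two compound statements are equivalent exactly when they agree on
# every assignment of truth values to P and Q.
def equivalent(lhs, rhs):
    return all(lhs(p, q) == rhs(p, q)
               for p, q in product([True, False], repeat=2))

implies = lambda p, q: (not p) or q  # the conditional P -> Q

# De Morgan's laws (2.1) and (2.2)
assert equivalent(lambda p, q: not (p and q), lambda p, q: (not p) or (not q))
assert equivalent(lambda p, q: not (p or q), lambda p, q: (not p) and (not q))

# Contrapositive: P -> Q is equivalent to (not Q) -> (not P)
assert equivalent(implies, lambda p, q: implies(not q, not p))
```

The same `equivalent` helper can confirm or refute any candidate equivalence over two statements, since only four truth assignments exist.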
2.2 Predicates and Quantifiers
Even after the application of connectives, statements are rather limited and specific, and so to express classes of statements, predicates are used. A predicate with variables is an expression which is either true or false once its variables are supplied. Since predicates are not statements until their variables are defined, they are sometimes called open statements.
Above is an example of such a predicate; although it is currently neither true nor false, as soon as its variable is given a reasonable input, its truth value is defined.
Notice the truth value changes as a function of the input, though we need only consider inputs for which the predicate is well-defined. For example, in this case, the expression makes no sense for a real-number input, since even and odd are not defined for real numbers.
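To make the idea concrete, a predicate can be modeled as a function that returns a truth value only once its variable is supplied; a sketch with an illustrative predicate of my own choosing:

```python
# A predicate modeled as a function: an "open statement" whose truth
# value is defined only once its variable is supplied.
# (This particular predicate is an illustrative stand-in.)
def P(n: int) -> bool:
    """P(n): 'n is even' -- true or false only once n is given."""
    return n % 2 == 0

assert P(4) is True   # P(4) is a (true) statement
assert P(7) is False  # P(7) is a (false) statement
# P itself, unapplied, is neither true nor false -- it is an open statement.
```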
Having introduced this machinery, we can return to De Morgan's laws and explain their connection to the identically named rules for sets. For two sets A and B, consider the statements x ∈ A and x ∈ B.
From De Morgan's laws, we know the following logical equivalence is true:
¬(x ∈ A ∧ x ∈ B) ≡ (x ∉ A) ∨ (x ∉ B).
We will translate both sides into the language of set theory. The left-hand side says x ∈ (A ∩ B)ᶜ, while the right-hand side says x ∈ Aᶜ ∪ Bᶜ, so (A ∩ B)ᶜ = Aᶜ ∪ Bᶜ.
Similarly, with these defined statements, the second law yields (A ∪ B)ᶜ = Aᶜ ∩ Bᶜ.
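The set-theoretic De Morgan laws can be verified on any concrete finite universe; a sketch with arbitrarily chosen sets:

```python
# Checking the set-theoretic De Morgan laws on a concrete universe
# (the universe and sets here are arbitrary illustrative choices).
U = set(range(10))
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

def complement(S):
    return U - S

assert complement(A & B) == complement(A) | complement(B)  # (A ∩ B)^c = A^c ∪ B^c
assert complement(A | B) == complement(A) & complement(B)  # (A ∪ B)^c = A^c ∩ B^c
```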
So far we have only discussed whether statements are true for individual inputs, but often times we are interested in the behavior of a predicate over a set of inputs. To do this, we introduce two quantifiers.
Attaching either a universal quantifier (∀) or an existential quantifier (∃) to a predicate makes it a statement.
∀x ∈ S, P(x): "for all x in S, P(x) is true"
∃x ∈ S, P(x): "there exists an x in S such that P(x) is true"
In English these quantifiers can often be disguised, but are commonly used when talking about mathematical statements. A predicate with multiple variables can have multiple quantifiers, where order matters.
The first statement means that for any real input x, we can find a y-value which is more extreme than it. This is trivially true, since the output of the function at any finite input is finite; take for example
On the other hand, the second statement says there is some bound M such that every value in the image of the function lies between −M and M. Note this is identical to the definition of a bounded function we discussed earlier, only written in the form of a statement. Now consider the negation of this statement: to argue a function is not bounded is to show that for every candidate bound we can choose at least one point which exceeds it.
If we compare the definition of bounded with its negation we notice an interesting relationship, where, in addition to the predicate being negated, all universal quantifiers turn into existential ones and vice versa. This is formalized below.
Theorem 17.
When negating a quantified predicate, flip the quantifiers in addition to negating the predicate.
¬(∀x ∈ S, P(x)) ≡ ∃x ∈ S, ¬P(x) (2.6)
¬(∃x ∈ S, P(x)) ≡ ∀x ∈ S, ¬P(x) (2.7)
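Over a finite domain, the universal quantifier corresponds to Python's `all` and the existential to `any`, so the negation rules become checkable identities; a sketch (the helper name is my own):

```python
# Over a finite domain, "for all" is all(...) and "there exists" is any(...),
# so the quantifier-negation rules (2.6)/(2.7) can be checked directly.
def negation_rules_hold(domain, pred):
    domain = list(domain)
    forall_neg = (not all(pred(x) for x in domain)) == any(not pred(x) for x in domain)
    exists_neg = (not any(pred(x) for x in domain)) == all(not pred(x) for x in domain)
    return forall_neg and exists_neg

assert negation_rules_hold(range(-5, 6), lambda x: x * x > 3)
assert negation_rules_hold(range(0, 10), lambda x: x % 2 == 0)
```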
2.3 Proving Existential Statements
The simplest class of statements we need to prove are those with existential quantifiers, those of the form ∃x ∈ S, P(x).
Proving such a statement is rather straightforward and usually involves finding the x for which the claim is true. For example, we can prove the following statement, which is sufficient to show the irrational numbers are not closed under ordinary addition and therefore cannot be considered a field.
Theorem 18.
There exist two irrational numbers x and y for which x + y is rational.
Proof.
Expressed as a logical statement, we are interested in the statement
∃x, y ∈ ℝ∖ℚ such that x + y ∈ ℚ.
We will prove this existentially quantified statement by finding such a pair of irrationals x and y for which their sum is rational. Consider x = √2 and y = −√2; both are irrational, and x + y = 0, which is rational.
Since we have found at least one pair of elements for which this statement is true, we have proved that the sum of two irrational numbers can be rational. End of proof
Of course, explicitly finding an example which satisfies the statement may not be as easy. Sometimes it is possible to show such an example must exist without actually finding it in a type of proof called a non-constructive proof. The most famous example of such a proof is given below.
Theorem 19.
There exist two irrational numbers and for which is rational.
Proof by Cases.
This proof seems nearly impossible to attempt directly. The irrational numbers are a very diverse set and a clear starting point is elusive. Instead, we attempt a non-constructive proof with a trick.
Case 1. √2^√2 is irrational
Notice what happens when this number is raised to the power of √2: (√2^√2)^√2 = √2^(√2·√2) = (√2)² = 2.
Therefore if √2^√2 is irrational, then we have found two irrational numbers, √2^√2 and √2, for which their power is rational.
Case 2. √2^√2 is rational
It turns out that if this case is correct, then our proof is automatically done, since the square root of two, an irrational number, when raised to the power of itself, is rational.
Ultimately, we have shown that either √2^√2 or (√2^√2)^√2
must be rational and the result of raising one irrational number to the power of another. Although we do not know which case is correct, one of them must be true, and thus there must exist two irrational numbers whose power is rational. (In case you were still curious, it turns out the first case is correct.) End of proof
We will now introduce even and odd integers and use this as an opportunity to prove statements about this property. An integer n is even if n = 2k for some integer k, and odd if n = 2k + 1 for some integer k.
Notice that with this definition every integer is either even or odd but not both. The sets of even and odd numbers look as follows:
E = {…, −4, −2, 0, 2, 4, …}, O = {…, −3, −1, 1, 3, …},
where E denotes the even integers and O the odd integers. We will leave our discussion of parity, the property of being either even or odd, here for now.
2.4 Unary Proofs
Despite this discussion about proving existential quantifiers, we are primarily interested in proving or disproving statements with universal quantifiers. That is, we prefer more general results about all natural numbers, all integers, or all real numbers. To disprove a statement, we can prove its negation. For a universally quantified statement, these have the following forms.
Hence, proving such a statement false amounts to finding at least one example for which the result fails to hold. As we showed above, this can be done either explicitly or implicitly (non-constructively). In the scenario where we find such a case, we call it a counterexample. Since the negation of a universal quantifier is existential, only one counterexample is needed to prove the entire claim false.
It follows that we may now want to explore techniques to help prove these statements true. Of course, we usually deal with infinite sets and it is impossible to manually verify the predicate holds for all possible inputs. However, once we realize the majority of results fall within one of three classes of statements,
we can apply the tools of mathematical logic to discover more efficient methods to approach these proofs. But first, we provide examples of statements of these forms that we have already encountered.
| Type | Examples |
|---|---|
| for all real numbers , | |
| for all , | |
| for any finite set , the size of is | |
| for any statements and , | |
| if and then their arithmetic mean equals their geometric mean. | |
| if then a quadratic equation has at least one solution | |
| if is a field with , then |
We first begin our examination of proof techniques with statements of the form
Notice that these statements are actually of the form
where the hypothesis represents many of the definitions and assumptions we take for granted. For example, in proving the triangle inequality, we assumed that the absolute value function was defined in a particular way, along with a few axioms of the real numbers. However, since these hypotheses are not meaningful, in the sense that they do not provide any significant information apart from the generally accepted rules, it is acceptable to omit them. The two main methods of proving such a statement are a direct proof and a proof by contradiction. In the former, we derive the statement directly from statements which are true. An example of such a proof is given below.
Theorem 20.
For any real number x > 0 the following inequality is always true.
x + 1/x ≥ 2 (2.8)
Proof.
It turns out that this statement is rather useful and appears frequently in other places. It is also relatively easy to prove, since it is a special case of the arithmetic–geometric mean inequality.
Since both x and 1/x are greater than 0, we can apply the special case of the theorem which holds for positive numbers: (x + 1/x)/2 ≥ √(x · (1/x)) = 1, so x + 1/x ≥ 2.
End of proof
While a direct proof is rather straightforward, it can be difficult to see how to proceed when confronted with a statement. In this case, a proof by contradiction can be used: assume the negation of the statement and show that it must be wrong. Remember that either a statement or its negation must be true; therefore showing the negation to be false is equivalent to proving the statement true. An example where a direct proof might be difficult is given below.
Theorem 21.
For any list of real numbers, there must be at least one value which is as large as the average. Alternatively, this can be expressed as:
| (2.9) |
Proof by Contradiction.
Attempting a direct proof seems unwise; instead, we approach this with a proof by contradiction. Assume there exists a list of real numbers for which every number is less than the average. Therefore we would have
Since we have a bound on every single term, we can find a bound on the sum.
At this point, we have arrived at the conclusion that the sum of the numbers is less than itself. This is a contradiction, and thus such a list cannot exist. End of proof
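For any concrete finite list the theorem can be confirmed directly, since the maximum is always at least the average; a small sketch (the function name is my own):

```python
# The maximum of a list is always at least the average -- the
# computational face of Theorem 21 for a concrete list.
def has_element_at_least_average(xs):
    avg = sum(xs) / len(xs)
    return any(x >= avg for x in xs)

assert has_element_at_least_average([3.0, 1.0, 4.0, 1.0, 5.0])
assert has_element_at_least_average([2.0, 2.0, 2.0])  # all equal: each equals the average
```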
Notice in this case, the result we where trying to prove was actually doubly quantified. Formally, we would write
| for all lists of real numbers, there exists some element in the list for which… |
In proving this statement via contradiction, we need to be careful in taking the negation, remembering to swap quantifier types. Our negation is then
| there exists some list of real numbers for which all elements are… |
Notice this is precisely what we assumed during the proof, thus making it logically sound. Proofs by contradiction are also rather useful for claims of irrationality. Although rational numbers are well-defined and easy to work with, the irrationals are less so. Using a proof by contradiction allows us to work with the former class of numbers rather than the latter. An example of such a proof is below.
Theorem 22.
The square root of two is irrational.
Proof by Contradiction.
Above we used the fact that the square root of 2 is irrational, but we never proved it. We do so now using a proof by contradiction. Assume, then, that the square root of two is rational, so we can write √2 = a/b
for a and b in lowest terms. We emphasize this point, that the fraction is in lowest terms, since every rational number can be written in this way. At this point we square both sides to remove the square root: 2 = a²/b², so a² = 2b².
Notice that a² can be expressed as two times an integer, so it must be even. It follows that a must also be even, since if it were odd so would be its square. Therefore, for some k, we can write a = 2k, so 2b² = (2k)² = 4k², giving b² = 2k².
This means that b² is also even and so b must be as well. This, however, is a contradiction, since we assumed the fraction was in lowest terms and therefore the numerator and denominator cannot both be divisible by two. End of proof
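No finite search can prove irrationality, but a search does illustrate the claim; a sketch checking that no fraction with a bounded denominator squares to 2 (the helper name is my own):

```python
# Illustration only: no fraction a/b with b up to a bound squares to 2.
# A finite search cannot prove irrationality -- the proof above does;
# a*a == 2*b*b is the cleared-denominator form of (a/b)^2 == 2.
def fractions_squaring_to_two(limit):
    return [(a, b) for b in range(1, limit + 1)
                   for a in range(1, 2 * b + 1)
                   if a * a == 2 * b * b]

assert fractions_squaring_to_two(1000) == []
```

The range of numerators suffices because any a/b equal to √2 ≈ 1.414 would satisfy a ≤ 2b.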
2.5 Conditional Proofs
In the last chapter, we also mentioned two other classes of statements we are likely to prove.
Despite being slightly more complex, many of the techniques we see in this section are similar to those we explored earlier. We now tackle the second class of statements of the form . A summary of the three techniques we will use throughout the course of the section are below.
| Method | Approach | Statement |
|---|---|---|
| Direct | assume P, show Q | P → Q |
| Contrapositive | assume ¬Q, show ¬P | (¬Q) → (¬P) |
| Contradiction | assume P ∧ ¬Q, show contradiction | ¬(P ∧ ¬Q) |
Once again, the most straightforward approach is to attempt a direct proof. For an example, we can prove the following theorem rather simply, assuming the hypothesis and manipulating it to arrive at the conclusion.
Theorem 23.
The sum and product of two even numbers are both even.
Proof.
We proceed with a direct proof; to help see the hypothesis and conclusion we can write this out as a logical statement.
We're using shorthand here, where m, n ∈ E is a more compact way of writing m ∈ E and n ∈ E. Nonetheless, our assumptions and desired conclusion are now clear, and we can proceed. Assume m and n are even, so that for some integers j and k: m = 2j and n = 2k.
We will now express both the product and sum in terms of j and k: m + n = 2j + 2k = 2(j + k) and mn = (2j)(2k) = 2(2jk).
Since the sum and product can each be expressed as 2 times some integer, we can conclude they are both even. For the sake of completeness, we introduce a more general theorem. The proofs of its remaining parts are very similar and left as an exercise. End of proof
Theorem 24.
For integers m and n, the following conditions determine whether the sum m + n or product mn is even or odd.
-
i.
if m is even and n is even, m + n is even and mn is even
-
ii.
if m is even and n is odd, m + n is odd and mn is even
-
iii.
if m is odd and n is even, m + n is odd and mn is even
-
iv.
if m is odd and n is odd, m + n is even and mn is odd
This theorem can be summarized with the following two tables,

| + | E | O |
|---|---|---|
| E | E | O |
| O | O | E |

| × | E | O |
|---|---|---|
| E | E | E |
| O | E | O |

which are oddly reminiscent of the addition and multiplication tables in a binary field. In fact, if we constructed a field with two elements,
its operations would be defined in exactly this way, with the even class as the additive identity and the odd class as the multiplicative identity. For now, we leave this connection as an unlikely coincidence, but we will revisit it after tackling modular arithmetic.
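Identifying even with 0 and odd with 1, the two tables are exactly addition and multiplication mod 2, which can be checked directly; a sketch (the function name is my own):

```python
# Parity arithmetic: the even/odd sum and product tables are exactly
# addition and multiplication mod 2 (even -> 0, odd -> 1).
def parity_tables_agree(bound):
    return all(
        (m + n) % 2 == (m % 2 + n % 2) % 2 and
        (m * n) % 2 == (m % 2 * n % 2) % 2
        for m in range(-bound, bound + 1)
        for n in range(-bound, bound + 1)
    )

assert parity_tables_agree(10)
```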
Above is one of the logical equivalences we proved earlier, P → Q ≡ (¬Q) → (¬P); the right-hand side is called the contrapositive or contraposition. This fundamental equivalence tells us that to prove a conditional we can instead prove the contrapositive, by assuming the negation of the conclusion and deriving the negation of the hypothesis. The canonical example is the pair of statements:
Original. if it rains the streets are wet
Contrapositive. if the streets are dry then it did not rain
This is useful when the negations of the hypothesis and conclusions are better defined or easier to work with. This often occurs when we have statements with “not equal to” and in claims about parity.
Theorem 25.
If n² is even, then n must be even as well. Similarly, if n² is odd, then n must be odd. In other words, n² always has the same parity as n.
Proof by Contrapositive.
We will show the two parts of this theorem separately, and prove the first with the contrapositive. The second part is left as an exercise.
Part 1. if n² is even, then n is even
We can write out our statement formally, so we can see its hypothesis and conclusion more easily. This also makes it easy to see the contrapositive.
The contrapositive is substantially easier to prove; assume that n is odd, so that for some integer k, n = 2k + 1. We can then express the square in terms of k: n² = (2k + 1)² = 4k² + 4k + 1 = 2(2k² + 2k) + 1.
Since n² can be expressed as one more than an even number, it must also be odd. Therefore, we have proved our original statement. Notice we used this fact in the proof that the square root of two is irrational, though we only casually proved it there. End of proof
Theorem 26.
For any two non-negative numbers a and b, if a ≠ b then a² ≠ b².
Proof.
This result means that different non-negative numbers must have different squares. This fact was actually essential in many of our proofs, since it means that, given we are working with non-negative numbers, we are allowed to take the square root of both sides of an equation without affecting the equality. In other words, knowing each non-negative number has a unique square allows us to work backward and determine a number uniquely from its square. It turns out this property is not unique to the squaring function but rather applies to an entire class of functions; but for now we stick with proving this statement, or rather its contrapositive: if a² = b², then a = b.
The proof at this point is rather straightforward, and involves simplifying the hypothesis to reach the conclusion: a² = b² gives a² − b² = (a − b)(a + b) = 0, so a = b or a = −b; since a and b are non-negative, either case forces a = b.
End of proof
2.6 Conditional Statements and Contradiction
The final proof technique, as outlined earlier, is a proof by contradiction. Remember the only case in which a conditional is false is when the hypothesis is true and the conclusion is false. Therefore its negation is ¬(P → Q) ≡ P ∧ (¬Q).
Once again, in a proof by contradiction the idea is to assume the negation and show a contradiction. Since we are interested in proving a conditional, we assume the hypothesis and the negation of the conclusion, and show this yields a contradiction. We use this proof technique to revisit and strengthen a theorem we presented earlier.
Theorem 27.
For any prime number p, the nth root of p is irrational, where n is any integer greater than 1.
Proof.
Notice that this statement can be expressed as a conditional: if p is prime, then p^(1/n) is irrational.
Once again, proving something is irrational directly is difficult, hence it makes sense to approach this via a proof by contradiction. For the sake of contradiction, then, we assume the hypothesis and the negation of the conclusion: that p is prime and that the nth root of p is rational. By definition, this means p^(1/n) = a/b,
where a and b are co-prime integers. Having proved this same theorem in the special case where both p and n were 2, we can guess the proof here will be similar. Therefore we raise both sides of the equation to the power of n and rearrange: p bⁿ = aⁿ.
Clearly the left-hand side of this equation is divisible by p, so aⁿ must be as well. Since p is prime, this implies that a itself must be divisible by p. Hence, for some integer k, we can write a = pk, so p bⁿ = (pk)ⁿ = pⁿ kⁿ.
Dividing both sides by p, we reach the following equality, which immediately yields the contradiction: bⁿ = pⁿ⁻¹ kⁿ.
Now the right-hand side is divisible by p (since n > 1), so the left-hand side bⁿ is as well. But since p is prime, p dividing bⁿ means p also divides b. This is a contradiction, since it means both a and b are divisible by p, and therefore the fraction a/b
was not in lowest terms after all. Therefore, we have proved what we intended to. End of proof
Notice what we proved could be interpreted as showing the polynomial with integer coefficients xⁿ − p
has no rational roots. Naturally, we may be interested in the more general question of when polynomials have rational roots; so we now prove a more general result to this effect and see how the above theorem follows as a corollary.
Theorem 28 (Rational Root Theorem).
Let f be an arbitrary polynomial with integer coefficients, which can be written as f(x) = aₙxⁿ + aₙ₋₁xⁿ⁻¹ + ⋯ + a₁x + a₀.
If f has a rational root, meaning there exists some rational r such that f(r) = 0, then if r is expressed as r = p/q
in lowest terms, p must divide a₀, the constant term, and q must divide aₙ, the leading coefficient.
Proof.
This could be approached with a proof by contradiction, but instead we provide a direct proof. Assume that there exists a rational root p/q, expressed as a fraction in lowest terms. Then we can write out the evaluation of the polynomial at p/q, which by definition should equal 0: aₙ(p/q)ⁿ + aₙ₋₁(p/q)ⁿ⁻¹ + ⋯ + a₁(p/q) + a₀ = 0.
Once again, similar to the insight we used in the previous few proofs, we want to clear the fractions so we can work with integers. This can easily be done by multiplying both sides of the equation by qⁿ, yielding aₙpⁿ + aₙ₋₁pⁿ⁻¹q + ⋯ + a₁pqⁿ⁻¹ + a₀qⁿ = 0.
We then rearrange this equation in two ways, by isolating either the constant term or the leading term, so we can factor.
To reach the first equation we move the last term to the other side, and for the second equation, we move the leading term. Notice that both left-hand sides now have a common factor, so we can simplify even further: p(aₙpⁿ⁻¹ + aₙ₋₁pⁿ⁻²q + ⋯ + a₁qⁿ⁻¹) = −a₀qⁿ and q(aₙ₋₁pⁿ⁻¹ + ⋯ + a₀qⁿ⁻¹) = −aₙpⁿ.
In the first equation, we can see that the left-hand side is divisible by p, therefore so must be the right-hand side. Since p/q
is in lowest terms, clearly qⁿ is not divisible by p, hence this means that p divides a₀. By symmetry, the second equation shows that q must divide aₙ. This is exactly what we needed to show: if f has a rational root p/q, then p divides a₀ and q divides aₙ.
End of proof
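The theorem reduces rational root-finding to a finite search over candidate fractions; a sketch implementation (the helper names are my own, and a non-zero constant term is assumed):

```python
from fractions import Fraction

def divisors(n: int):
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def rational_roots(coeffs):
    """All rational roots of a_n x^n + ... + a_0, given coeffs = [a_n, ..., a_0].

    By the rational root theorem, any root p/q in lowest terms satisfies
    p | a_0 and q | a_n, so only finitely many candidates need checking.
    (Assumes a_0 != 0; a zero constant term would also admit the root 0.)
    """
    a_n, a_0 = coeffs[0], coeffs[-1]
    candidates = {
        sign * Fraction(p, q)
        for p in divisors(a_0)
        for q in divisors(a_n)
        for sign in (1, -1)
    }
    def f(x):  # Horner evaluation, exact over Fraction
        total = Fraction(0)
        for c in coeffs:
            total = total * x + c
        return total
    return sorted(r for r in candidates if f(r) == 0)

# x^2 - 2 has no rational roots, matching the irrationality of sqrt(2)
assert rational_roots([1, 0, -2]) == []
# 2x^2 - 3x + 1 = (2x - 1)(x - 1) has roots 1/2 and 1
assert rational_roots([2, -3, 1]) == [Fraction(1, 2), Fraction(1)]
```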
We can now offer a slicker proof of our original statement, using the theorem we just proved. Remember we wanted to prove that the nth root of p is irrational,
where n is any integer greater than 1 and p is a prime number. Above we used a long proof by contradiction to show this, but now we offer a simpler proof.
Proof.
Consider the following polynomial, for which the nth root of p is a root: f(x) = xⁿ − p.
It is sufficient to show that this polynomial has no rational roots, because that implies the nth root of p must be irrational. Notice that this is a polynomial with integer coefficients, hence we can apply the rational root theorem to conclude that if a/b
is a rational root of the polynomial, then b must divide 1, the leading coefficient, and a must divide p, the constant term. Therefore the set of all possible rational roots of this polynomial is {1, −1, p, −p}.
At this point, we simply try all four possibilities to conclude no such rational root can exist.
Notice this is technically a proof by contradiction; we assume that a rational root exists, then use the rational root theorem to conclude that this root must be one of 1, −1, p, or −p.
Showing that none of these are actually roots results in a contradiction, proving our original statement. End of proof
In some sense, the rational root theorem we proved was a more general result than showing xⁿ − p
has no rational roots. Proving the harder general case allowed us to revisit our original problem and more easily show it was true. In this case it simply amounted to evaluating the polynomial at four points.
We will revisit this technique of proving more general results and then looking back to apply them to more specific cases later. In the next chapter, we will see cases where this may be necessary; but finding more general principles is a virtue in itself. It turns out that the rational root theorem, once again, is a specific case of Gauss's lemma, but that remains out of the scope of these notes.
Note 3 Mathematical Induction
3.1 Sum and Product Notation
In the last note, we ended up working with polynomials with an arbitrary number of terms, relying on the following notation.
While this was perfectly rigorous, it relies on the reader being able to notice the pattern among terms in the summation. It may be less ambiguous to explicitly state this pattern, for which we introduce the following notation.
Using this notation, we would express a generic degree-n polynomial with coefficients a₀, …, aₙ as ∑ᵢ₌₀ⁿ aᵢxⁱ.
Notice that both summation and product are common computational processes in computer science which can be expressed as simple loops.
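Those computational processes might be sketched as follows (in Python rather than pseudo-code; the function names are my own):

```python
# Sum and product over a collection of terms: the loop form of
# the Sigma and Pi notation introduced above.
def summation(terms):
    total = 0
    for t in terms:
        total += t
    return total

def product(terms):
    result = 1
    for t in terms:
        result *= t
    return result

assert summation(i for i in range(1, 6)) == 15  # 1 + 2 + 3 + 4 + 5
assert product(i for i in range(1, 6)) == 120   # 5! = 120
```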
The following properties of summation and product extend naturally from their definitions, and should intuitively be true, though writing out a proof should not be too difficult.
There is an analogous equivalence for a product, though this time the constant we pull out gets raised to the power of the number of terms.
These summations (and products) appear frequently in mathematics, especially in calculus where the number of terms goes to infinity, so over the course of this chapter we develop tools to help simplify them into more useful formulas. As an example, consider the sum of the first n natural numbers.
Theorem 29.
For all natural numbers n, the following equation holds: 1 + 2 + 3 + ⋯ + n = n(n + 1)/2.
Proof.
We offer a direct proof of this fact using a trick; first denote the sum itself as S, so we have the expression S = 1 + 2 + ⋯ + n.
Now we sum the sequence with itself in reverse order, in essence adding the two equations S = 1 + 2 + ⋯ + n and S = n + (n − 1) + ⋯ + 1.
Notice that in the bottom sum, each of the n paired terms is (n + 1). Therefore we can simplify the expression even further to get
2S = n(n + 1) (3.1)
S = n(n + 1)/2 (3.2)
This is exactly what we needed to show, hence we have derived the formula. End of proof
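The closed form is easy to spot-check against the literal sum; a sketch (the function name is my own):

```python
# Spot-check of the closed form n(n+1)/2 against the literal sum.
def gauss_sum(n):
    return n * (n + 1) // 2

assert all(sum(range(1, n + 1)) == gauss_sum(n) for n in range(1, 200))
```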
Notice the simplified expression is more useful to us, since it more clearly expresses the relationship between n and the sum. More specifically, from this formula we can tell that the sum is actually a well-defined polynomial in n, a fact which is obfuscated by the summation. However, in the above proof we used a trick, and it is not clear how we would approach proving formulae like this more generally.
3.2 The Natural Numbers
While most of our proofs until now have been focused on the real numbers and integers, we now turn our attention to the natural numbers. To develop another fundamental proof technique for working with these numbers, we need a more precise definition of the natural numbers, which we currently refer to with the set ℕ = {1, 2, 3, …}.
To avoid the ambiguity of the ellipsis, we can define the natural numbers (ℕ) as the intersection of all subsets S of the real numbers with the following properties
-
i.
1 ∈ S
-
ii.
if x ∈ S, then s(x) ∈ S, where s is the successor function which represents s(x) = x + 1.
Notice this is consistent with our intuitive definition of the natural numbers. If S obeys these two properties, then we know 1 must be in S, along with all numbers in the sequence 1, s(1) = 2, s(2) = 3, ….
Furthermore, since the set {1, 2, 3, …} itself satisfies these conditions, we know that it is an example of one such set S. Therefore the intersection of all such sets is indeed what we expected,
Taking the intersection of all such sets is essentially equivalent to taking the “smallest” set which satisfies these properties; otherwise, notice that the real numbers themselves also satisfy both properties. With this definition, we have a more systematic way to understand the natural numbers which yields a new proof technique.
Theorem 30 (Principle of Induction).
Let P(n) be a predicate defined for natural numbers n. If the following two statements hold,
-
i.
P(1) is true
-
ii.
if P(n) is true, then P(n + 1) is true
then P(n) is true for all natural numbers n.
Proof.
We consider the set S = {n ∈ ℕ : P(n) is true} for which this predicate holds and show that it is equal to the set of natural numbers.
First notice that since P is defined for natural numbers only, we immediately know S ⊆ ℕ. Now we know that
-
i.
1 ∈ S because P(1) is true, and
-
ii.
if P(n) is true, then P(n + 1) is true, so if n ∈ S then n + 1 ∈ S.
Therefore we have shown that S is one of the sets used in the above definition of ℕ. Since ℕ is the intersection of all such sets, we know that it must be contained in S, so that ℕ ⊆ S.
Since we have shown S is both a subset and a superset of ℕ, this means S = ℕ. Since S represents the set of numbers for which P(n) is true, this means P(n) is true for all natural numbers n. End of proof
This means that we now have a proof technique which can be applied to statements which are true over the natural numbers, namely by showing that the statement holds for n = 1 and that, for all n, if it holds for n then it holds for n + 1. This can be precisely expressed with the following logical equivalence, which conveys the same idea.
3.3 Inductive Proofs
Through the above manipulation, we have discovered a fundamentally different way to prove universally quantified statements, specifically those which are true over the natural numbers, by exploiting the definition of the natural numbers. The logical equivalence
shows exactly what is required, but we can unwrap this. Proving a statement using this tool is called a proof by induction. Since the right-hand side of the expression is a conjunction, we need to show both parts are true, meaning a proof by induction has two separate parts.
-
i.
Base Case. Show P(1) is true.
-
ii.
Induction Step. Assume P(n) is true for an arbitrary natural number n, and use this to show P(n + 1) must be true.
The validity of this proof technique follows directly from the principle of mathematical induction. We can use this to approach the problem of proving equalities like the following systematically.
First, let us define the statement we are trying to prove more explicitly. If we denote the above equality by the predicate P(n), we want to show that if the free variable n is a natural number, then the statement is true, or ∀n ∈ ℕ, P(n).
Proof.
Since we have a statement we wish to prove over the natural numbers, we use a proof by induction. This involves showing the base case and then the induction step.
-
i.
Base Case. We want to show P(1) is true, which is easy to verify.
-
ii.
Induction Step. Assume P(n) is true for some natural number n, so that we have
We want to show that the statement P(n + 1) must follow; written out, this corresponds to
Notice the similarity in structure on the right-hand side of both equations, which suggests an obvious substitution. We prove P(n + 1) by starting with the right-hand side and deriving the left-hand side.
We are allowed to make the substitution in the second line because we assumed this about the sum of the first n natural numbers. We use this assumption to extend what we know and conclude the same thing about the first n + 1 natural numbers. Therefore, this concludes the proof; we have shown that P(n) is true for all natural numbers n.
End of proof
At this point, we should point out that nowhere do we assume what we are trying to prove. Instead, in the induction step we assume P(n) is true for one particular value of n and show P(n + 1). This assumption, that P(n) is true, is referred to as the induction hypothesis. Our first inductive proof was rather verbose, but we now offer a more streamlined example to prove something similar.
Theorem 31.
For all non-negative integers n, the summations below have the following closed-form solutions.
-
i.
∑ᵢ₌₀ⁿ i² = n(n + 1)(2n + 1)/6 (3.3)
-
ii.
∑ᵢ₌₀ⁿ i³ = (n(n + 1)/2)² (3.4)
Proof.
We only prove the first statement by induction; the proof for the second statement is very similar.
-
i.
Base Case. We want to confirm the statement is true for n = 0, where both sides of the equation equal 0.
-
ii.
Induction Step. Assume that the statement is true for some arbitrary number n, so that
We want to show that the statement is also true for n + 1. To show this, we write out the sum of the first n + 1 squares and then use the induction hypothesis to simplify.
This is exactly what we needed to show, hence this concludes the proof by induction.
End of proof
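Both closed forms can be spot-checked against the definitions; a sketch (the function names are my own):

```python
# Spot-checks of the closed forms for sums of squares and cubes.
def sum_squares(n):
    return n * (n + 1) * (2 * n + 1) // 6

def sum_cubes(n):
    return (n * (n + 1) // 2) ** 2

for n in range(0, 100):
    assert sum(i**2 for i in range(n + 1)) == sum_squares(n)
    assert sum(i**3 for i in range(n + 1)) == sum_cubes(n)
```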
3.4 Sequences and Recurrences
In some sense, induction treats the natural numbers as an ordered list of elements, starting with 1 and increasing in the order we write them out.
A natural generalization of this idea is to consider ordered lists with arbitrary elements, a notion we call a sequence, which can be represented as a₁, a₂, a₃, ….
The subscript n represents the term number, which gives the position of a particular element, or term, in the sequence. More formally, a sequence is defined as a function from the natural numbers, which maps term numbers to terms.
Since the natural numbers are ordered, constructing this function ties the order of the natural numbers to that of the terms in the sequence, imposing an order. Two kinds of sequences that appear frequently in mathematics are arithmetic sequences and geometric sequences. The former is a sequence such that the difference between consecutive terms is constant, while the latter is a sequence in which consecutive terms have the same ratio.
In the above sequences, we can specify an arithmetic sequence completely by its first term a₁ and the common difference d. Meanwhile, for a geometric sequence, the first term g₁ is needed along with the common ratio r. By inspection, we can see that the nth term of an arithmetic sequence is aₙ = a₁ + (n − 1)d,
while for a geometric sequence, we simply multiply the first term by the ratio n − 1 times, giving the formula gₙ = g₁rⁿ⁻¹.
Given the similarity in structure of sequences to the natural numbers, induction provides a natural way to prove things about these sequences. Consider the following theorem about the sum of first terms of arithmetic and geometric sequences.
Theorem 32.
Consider the arithmetic sequence a₁, a₂, a₃, … with common difference d, and the geometric sequence g₁, g₂, g₃, … with common ratio r.
The sums of the first n terms of each sequence have the following formulas, given r ≠ 1.
∑ᵢ₌₁ⁿ aᵢ = na₁ + dn(n − 1)/2 (3.5)
= n(a₁ + aₙ)/2 (3.6)
∑ᵢ₌₁ⁿ gᵢ = g₁(1 − rⁿ)/(1 − r) (3.7)
= g₁(rⁿ − 1)/(r − 1) (3.8)
Proof.
We prove both statements in order, using induction, starting with the sum of an arithmetic sequence.
-
i.
Base Case. This is trivial for n = 1, because we are simply summing a sequence with one term.
-
ii.
Induction Step. Assume this statement is true for the sum of the first n terms of an arithmetic sequence. We show that it also holds for the first n + 1 terms.
This proves the formula for the sum of the first n terms of an arithmetic sequence; we now approach a similar proof for the sum of the first n terms of a geometric sequence.
-
i.
Base Case. Once again, this is trivial for n = 1, since that involves summing a one-term sequence.
-
ii.
Induction Step. Assume this statement is true for the sum of the first n terms. We show that it also holds for the first n + 1 terms.
End of proof
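Both sum formulas can be spot-checked numerically with exact rational arithmetic; a sketch with arbitrarily chosen parameters (the function names are my own):

```python
from fractions import Fraction

# Closed forms for the sums of the first n terms of an arithmetic and a
# geometric sequence, spot-checked against term-by-term summation.
def arith_sum(a1, d, n):
    return n * a1 + d * n * (n - 1) / 2

def geom_sum(g1, r, n):
    return g1 * (1 - r**n) / (1 - r)  # valid for r != 1

a1, d = Fraction(3), Fraction(5)      # arithmetic: a_n = a1 + (n-1) d
g1, r = Fraction(2), Fraction(3, 2)   # geometric:  g_n = g1 * r**(n-1)

for n in range(1, 50):
    assert arith_sum(a1, d, n) == sum(a1 + (i - 1) * d for i in range(1, n + 1))
    assert geom_sum(g1, r, n) == sum(g1 * r ** (i - 1) for i in range(1, n + 1))
```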
Notice that the algebra involved in these proofs was much messier than for those we dealt with before, yet we forged ahead. One benefit of induction is that if the statement is true, then the algebra will almost certainly work out, so we know exactly how to simplify the expression. Now consider a different kind of sequence.
Theorem 33.
If we define a sequence recursively, so that
then, for all natural numbers $n$, this sequence is equivalent to the polynomial
| (3.9) |
Proof.
We provide an informal proof of this statement, based on our previous result; a more thorough proof would formalize this with induction. Consider the term for some arbitrary $n$; we begin simplifying the expression using the second rule.
Notice, however, that there is one additional case we need to check, namely the base case, but clearly the formula trivially holds there. End of proof
The sequence we encountered above is rather different from those we previously encountered. Such a sequence, defined by a recurrence relation, is characterized by two features:
i. Base cases. Explicit values of the sequence for a small number of terms; in this case, the first terms are given by definition.
ii. Recursive definition. A relationship between a value of the sequence and other values, typically those of smaller index. In the above example, for non-base cases, we were given
While this type of recurrence equation describes a perfectly well-defined sequence, a more explicit definition of each term is often desirable. This explicit formula is called the closed-form solution of the recurrence, and it usually relates each term directly to its index, rather than to other terms. However, the recursive definition is sometimes cleaner; we can convey the essence of arithmetic and geometric sequences with the following recurrence relations.
The closed forms of these are the $n$th-term expressions we began with, which are
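The equivalence between the recursive and closed-form descriptions can be illustrated with a short sketch. The recurrences $a_n = a_{n-1} + d$ and $g_n = r \cdot g_{n-1}$ used below are the standard ones, written in our own notation since the document's symbols were not preserved here.

```python
def arithmetic_recursive(a1, d, n):
    """n-th term computed purely from the recurrence a_n = a_{n-1} + d."""
    term = a1
    for _ in range(n - 1):
        term += d
    return term

def geometric_recursive(g1, r, n):
    """n-th term computed purely from the recurrence g_n = r * g_{n-1}."""
    term = g1
    for _ in range(n - 1):
        term *= r
    return term

# The recurrences agree with the closed forms a1 + (n-1)*d and g1 * r**(n-1).
for n in range(1, 10):
    assert arithmetic_recursive(2, 3, n) == 2 + (n - 1) * 3
    assert geometric_recursive(3, 2, n) == 3 * 2 ** (n - 1)
print("recurrences match the closed forms")
```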
3.5 Extending Induction
So far, the power of induction seems rather limited, since it applies only to statements of a very particular form, namely statements which are true over the natural numbers. However, it turns out that induction can be applied to prove statements which are true over an entire class of sets, including all non-negative integers, all even numbers, or all odd numbers, as explained by the following theorem.
Theorem 34.
Suppose we want to prove some predicate $P(x)$ for all $x \in S$, where $S$ is some set which can be expressed as an ordered list
\[ S = \{ s_1, s_2, s_3, \ldots \}. \]
A more precise condition is that $S$ be countable, but we return to this notion in a later chapter. Then it is sufficient to show
i. Base Case. $P(s_1)$ is true.
ii. Induction Step. If $P(s_n)$ is true for any natural number $n$, then $P(s_{n+1})$ must be true.
Proof.
We prove this by reduction, showing that this technique is exactly equivalent to induction. First, we define the following new predicate, which is related to $P$:
\[ Q(n) = P(s_n). \]
First notice that the elements of $S$ are indexed by the natural numbers, so proving $P(x)$ for all $x \in S$ is equivalent to proving $Q(n)$ for all $n$. At this point, we notice that we can apply induction on $n$ to prove $Q(n)$ for all natural numbers, which entails showing the following two things.
i. Base Case. $Q(1)$ is true.
ii. Induction Step. If $Q(n)$ is true for any natural number $n$, then $Q(n+1)$ must be true.
Now if we show that these both hold, then this proves $Q$ over the natural numbers, which is equivalent to showing $P(x)$ is true for all $x$ in $S$. All that is left is to translate these conditions back in terms of $P$.
i. Base Case. $P(s_1)$ is true.
ii. Induction Step. If $P(s_n)$ is true for any natural number $n$, then $P(s_{n+1})$ must be true.
End of proof
This shows that to prove some predicate over a set which can be expressed as an ordered list, we can: (1) show that it is true for the first element, and (2) show that if it is true for some element, then it is true for the next element.
This suggests the common “domino” analogy for understanding induction. To prove that a whole line of dominoes falls down, it is sufficient to show that the first one falls and that whenever a domino is knocked down, it knocks down the next one.
With this theorem, we can introduce some common patterns used in induction proofs to prove statements over different sets, and armed with these we can revisit and generalize some of our past results.
| Set | Base Case | Induction Step |
|---|---|---|
| Natural numbers $1, 2, 3, \ldots$ | $P(1)$ | $P(n) \implies P(n+1)$ |
| Non-negative integers $0, 1, 2, \ldots$ | $P(0)$ | $P(n) \implies P(n+1)$ |
| Even natural numbers $2, 4, 6, \ldots$ | $P(2)$ | $P(n) \implies P(n+2)$ |
| Odd natural numbers $1, 3, 5, \ldots$ | $P(1)$ | $P(n) \implies P(n+2)$ |
So, for example, to prove a statement is true for all odd natural numbers, we first show that it is true for 1, and that if it is true for some odd integer $n$, it is also true for $n + 2$. This is a direct application of the theorem we proved, where we define $s_n = 2n - 1$.
Now, with this technique, we can extend many of the inequalities we showed previously, and solve a more general recurrence.
Theorem 35 (Generalized Triangle Inequality).
For any collection of real numbers $x_1, x_2, \ldots, x_n$, the absolute value of their sum is bounded by the sum of their absolute values.
Proof.
We apply a proof by induction on $n$ over the set of natural numbers greater than 1. Our base case should then be $n = 2$.
i. Base Case. In the case where $n = 2$, this reduces to the triangle inequality we proved previously.
ii. Induction Step. Assume that this generalized triangle inequality holds for $n$ numbers, so that
We will now show that this also holds for $n + 1$ numbers; consider the absolute value of their sum.
In the last step, we made the substitution from the induction hypothesis, which completes the inductive step and thus the proof.
End of proof
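The generalized triangle inequality lends itself to a quick numerical spot-check. The sketch below samples random lists of real numbers and verifies the bound on each; this is evidence, not a proof, and the small tolerance only guards against floating-point rounding.

```python
import random

# Numerical spot-check of |x1 + ... + xn| <= |x1| + ... + |xn|.
random.seed(0)
for _ in range(1000):
    xs = [random.uniform(-10, 10) for _ in range(random.randint(2, 8))]
    assert abs(sum(xs)) <= sum(abs(x) for x in xs) + 1e-9
print("inequality held for all 1000 samples")
```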
We can now approach more general recurrence relations as well. A sequence of the form
is called a first-order linear recurrence relation, which is a well-defined sequence, given that the base case is defined. The “first-order” refers to the fact that each term refers only to one previous term, and the “linear” comes from the fact that each term is a linear function of the previous term, plus some constant.
Theorem 36 (First-Order Linear Recurrences).
The solution to a first-order linear recurrence relation defined as
has closed-form solution
Proof.
Unsurprisingly, we are going to prove this by induction, in a long and messy proof. However, we actually have two cases to consider, one where the common ratio is 1 and one where it is not. We will first prove the more general of the two results, the case where the ratio is not 1.
i. Base Case. This is easy to check, since it only involves the first term, which we are given.
ii. Induction Step. Assume that the closed form is correct for the $n$th term, so that
Now we show the formula also holds for the next term, namely the $(n+1)$th term. To do this, we apply the recursive definition of this term.
Now, unfortunately, we still have to deal with the case where the ratio is 1 separately. In this case, however, the recurrence is easier to work with.
i. Base Case. Once again, it is easy to verify that the closed-form solution works for just one term.
ii. Induction Step. Assume the formula holds for the $n$th term, and show that it also holds for the term after that.
End of proof
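Assuming the recurrence is written as $x_1$ given and $x_n = r \cdot x_{n-1} + c$ (our notation, since the document's symbols were not preserved), the closed form proved above can be checked against direct iteration in both the $r \neq 1$ and $r = 1$ cases:

```python
from fractions import Fraction

def recurrence(x1, r, c, n):
    """n-th term from the recurrence x_1 = x1, x_k = r * x_{k-1} + c."""
    x = Fraction(x1)
    for _ in range(n - 1):
        x = r * x + c
    return x

def closed_form(x1, r, c, n):
    """Closed form under the assumed notation, with r = 1 as a special case."""
    if r == 1:
        return Fraction(x1 + c * (n - 1))
    rn = Fraction(r) ** (n - 1)
    return rn * x1 + c * (rn - 1) / (r - 1)

for n in range(1, 12):
    assert recurrence(5, 3, 2, n) == closed_form(5, 3, 2, n)  # r != 1
    assert recurrence(5, 1, 2, n) == closed_form(5, 1, 2, n)  # r == 1
print("closed form matches the recurrence in both cases")
```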
Notice that this immediately proves the equivalence of the recurrence relation definitions of arithmetic and geometric sequences to the ones we originally used. If we take the ratio to be 1, then we get
which corresponds to arithmetic sequences, while if we take the constant to be 0, then we find
which corresponds exactly to geometric sequences. Of course, it may seem odd that we need a separate case at all to account for a ratio of 1, but it turns out that, in some sense, the general formula still holds. Rather than being thrown off by the denominator, we can instead take the limit as the ratio tends to 1.
The first limit should be easy to see; the second one is the difference quotient representing the derivative of the function
at 1. This is outside the scope of this chapter, but it goes to show that the two cases of this closed-form solution are not as disconnected as they may appear to be.
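We can also watch this limit numerically. Writing the ratio-dependent factor of the (assumed) closed form as $(r^{n-1} - 1)/(r - 1)$, the sketch below shows it approaching $n - 1$ as $r$ tends to 1, which is exactly the factor appearing in the $r = 1$ case:

```python
# As r -> 1, (r**(n-1) - 1) / (r - 1) tends to n - 1, so the r != 1
# closed form degenerates into the r = 1 one in the limit.
n = 6
for eps in (1e-1, 1e-3, 1e-6):
    r = 1 + eps
    factor = (r ** (n - 1) - 1) / (r - 1)
    print(round(factor, 4))  # approaches n - 1 = 5 as eps shrinks
```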
3.6 Strong Induction
The power of induction is that it allows us to make stronger assumptions in the process of proving our statement. Instead of proving the statement directly for an arbitrary element, in the induction step we assume $P(n)$ is true for some natural number $n$ and simply have to prove $P(n+1)$ to be true. This advantage comes from knowing something about the structure of the natural numbers themselves.
However, it turns out that we are allowed to assume something even stronger during the induction step.
Theorem 37 (Principle of Strong Induction).
Let $P(n)$ be a predicate defined for natural numbers $n$. If the following two statements hold,
i. $P(1)$ is true,
ii. if $P(1), P(2), \ldots, P(n)$ are true, then $P(n+1)$ is true,
then $P(n)$ is true for all natural numbers $n$.
Proof.
We prove this by showing that this is essentially equivalent to the principle of induction. First, we define the following predicate
\[ Q(n) = P(1) \land P(2) \land \cdots \land P(n), \]
so that $Q(n)$ is true if and only if $P(k)$ is true for all $k$ up to $n$. Notice that showing $Q(n)$ is true for all $n$ is equivalent to showing $P(n)$ is true for all natural numbers. Remember, the principle of mathematical induction tells us that to show this predicate is true for all $n$, all we need to show is the following.
i. $Q(1)$ is true,
ii. if $Q(n)$ is true, then $Q(n+1)$ is true.
Therefore, if we show these two statements hold, then $Q(n)$ must be true for all $n$ as well. First consider the base case; by definition, $Q(1)$ is the same as $P(1)$. Next, we consider the implication, which we express in terms of $P$.
Therefore, we have shown that to prove $Q(n)$ is true for all natural numbers $n$, and thus that $P(n)$ is true for all $n$, it suffices to show
i. $P(1)$ is true, and
ii. if $P(1), P(2), \ldots, P(n)$ are true, then $P(n+1)$ is true.
End of proof
This allows us to augment the amount of information we get “for free” during the induction step, which lets us prove statements that we otherwise might not be able to. While this unlocks the door to more complicated recurrence relations, we can also revisit a fact about the natural numbers that we have been relying on without proof.
Theorem 38 (Prime Factorization).
Any natural number $n$ can be expressed as a (possibly empty) product of primes, meaning
\[ n = p_1^{k_1} p_2^{k_2} \cdots p_m^{k_m} \]
for some set of primes $p_1, \ldots, p_m$ and corresponding powers $k_1, \ldots, k_m$. By convention, we say that the empty product is 1. In other words, every natural number has a prime factorization.
Proof.
We can prove this by strong induction on $n$, with a base case of $n = 1$.
i. Base case. As explained, the empty product is taken to be 1, so 1 is trivially a product of primes, namely of none at all.
ii. Induction Step. Assume that this holds for all numbers less than or equal to some $n$. We now prove it for $n + 1$ by casework. By definition, every number greater than or equal to 2 is either prime or composite. If $n + 1$ is prime, then we are immediately done, since we take
Otherwise, it is composite, which means, by definition, that it can be written as the product of two numbers less than itself, since it has a divisor not equal to 1 or itself. Therefore,
By our induction hypothesis, we know both factors can be expressed as products of primes. Since $n + 1$ is the product of these, it can clearly also be expressed as a product of primes.
End of proof
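The strong-induction argument above is effectively an algorithm: either the number is 1 (the empty product), or it is prime, or it splits as a product of two smaller numbers which we factor recursively. A direct Python transcription, as a sketch, using simple trial division to find a divisor:

```python
def prime_factorization(n):
    """Return a prime factorization of n >= 1 as a list of primes
    (empty for n = 1), mirroring the strong-induction argument."""
    if n == 1:
        return []                      # base case: the empty product
    for d in range(2, n):
        if n % d == 0:                 # n is composite: n = d * (n // d)
            return prime_factorization(d) + prime_factorization(n // d)
    return [n]                         # no proper divisor: n is prime

print(prime_factorization(60))  # [2, 2, 3, 5]
```

Trial division is slow for large inputs, but the point here is only that the existence proof doubles as a terminating procedure.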
It turns out, as you probably already know, that something even stronger is true: every natural number has a unique prime factorization. This theorem is called the fundamental theorem of arithmetic, which is currently outside of our grasp, but which we will prove later. As we will see, what we proved just now is an important building block in the proof of the fundamental theorem. A proof that follows a similar pattern, and is also foundational to computers today, is the following.
Theorem 39.
Every non-negative integer $n$ can be expressed in binary, so that it is a sum of distinct powers of 2. In other words, there exists a finite subset $S$ of the non-negative integers such that
\[ n = \sum_{k \in S} 2^k. \]
Once again, we use the convention that the empty sum is 0, which allows us to include 0 in this theorem.
Proof.
We will once again use strong induction.
i. Base case. Once again, we trivially take the empty sum to represent 0. The reason this convention makes sense is that we do the same in the decimal number system; while we can express
we can only express 0 by letting all the coefficients of powers of ten be 0. The same reasoning motivates our choice here.
ii. Induction step. Assume that all non-negative numbers less than or equal to $n$ can be expressed as a sum of distinct powers of 2. We will now show the same holds for $n + 1$. We have two cases to consider, one where this number is even and one where it is odd. In the former, we can write
Since $(n+1)/2$ must be strictly less than $n + 1$, from the induction hypothesis we know that it can be expressed as a sum of distinct powers of two. Clearly, when we multiply each of these by two (raising their exponents by one), we find a new representation. The second case to consider is when $n + 1$
is odd. This means that $n$ is even and, by the induction hypothesis, can be represented as a sum of distinct powers of 2. Since $n$ is even, this sum cannot include $2^0 = 1$, which means we can simply add 1 to find the desired representation.
End of proof
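The even/odd case analysis in this proof also translates directly into a recursive procedure. The function below is our own sketch; it returns the distinct powers of 2 summing to a given non-negative integer.

```python
def powers_of_two(n):
    """Express non-negative n as a sum of distinct powers of 2,
    following the even/odd cases of the induction step."""
    if n == 0:
        return []                                   # the empty sum
    if n % 2 == 0:
        # n = 2m: double every power in the representation of m
        return [2 * p for p in powers_of_two(n // 2)]
    # n odd: n - 1 is even, so its representation cannot contain 2**0 = 1
    return powers_of_two(n - 1) + [1]

print(powers_of_two(13))  # [8, 4, 1]
```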
Having shown some unique applications of this proof technique, we return, as promised, to more complicated recurrence relations. In particular, we can now prove closed-form solutions for recurrences beyond first order, in which each term is a function of more than just the previous term. In the extreme case, we could have a sequence for which each term is defined in terms of all of its predecessors.
However, more often we run into cases where each term is a function of two, three, or some other finite number of terms. A famous such sequence is the Fibonacci sequence,
in which the first two terms are 1 and each term after is the sum of the previous two. The first few terms of this sequence are
\[ 1, 1, 2, 3, 5, 8, 13, 21, \ldots \]
It turns out that there is indeed a closed-form solution to the Fibonacci sequence, albeit one which seems far from intuitive, which we show below.
Theorem 40.
The closed form of the Fibonacci sequence is given by the exact formula
\[ F_n = \frac{\varphi^n - \psi^n}{\sqrt{5}}, \]
where $\varphi$ and $\psi$ are the two roots of the quadratic $x^2 = x + 1$, or explicitly
\[ \varphi = \frac{1 + \sqrt{5}}{2}, \qquad \psi = \frac{1 - \sqrt{5}}{2}. \]
The number $\varphi$ appears in many other places and is often referred to as the golden ratio.
Proof.
It may not be immediately obvious that the closed-form solution holds; it seems rather miraculous that this formula, peppered with radicals, should somehow yield whole numbers, and moreover correspond to the Fibonacci sequence. Yet we will show just that.
i. Base Cases. The recursive part of the Fibonacci sequence definition only applies to terms past the first two, hence we will have to check the formula for those manually; these will be our base cases.
To show the formula also holds for the second term, we use a trick, namely noting that since $\varphi$ and $\psi$ are roots of $x^2 = x + 1$,
We can use this to simplify the exponents in the closed-form representation of the second term.
This matches the first two terms of the Fibonacci sequence, which means the base cases hold.
ii. Induction Step. Assume that the formula holds for all terms with index less than or equal to $n$; we want to show that it also holds for the next term. Specifically, we will use
We can now plug these into the recursive definition of the next term to show that the closed form also holds for it.
Now we can use the observation we made earlier, in reverse, to swiftly finish the proof. Namely, $x^2$ equals $x + 1$ when $x$ is $\varphi$ or $\psi$. Making this substitution, we immediately get what we want.
This is what we needed to show to prove our claim by induction. End of proof
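This Binet-style closed form can be spot-checked against the recursive definition. The sketch below uses floating-point arithmetic with rounding, which is adequate for small indices.

```python
from math import sqrt

phi = (1 + sqrt(5)) / 2   # golden ratio, a root of x**2 = x + 1
psi = (1 - sqrt(5)) / 2   # the other root

def fib_closed(n):
    """Closed-form Fibonacci term; rounding absorbs floating-point error."""
    return round((phi ** n - psi ** n) / sqrt(5))

# Compare against the recursive definition F_1 = F_2 = 1, F_n = F_{n-1} + F_{n-2}.
a, b = 1, 1
for n in range(1, 25):
    assert fib_closed(n) == a
    a, b = b, a + b
print("closed form matches the first 24 Fibonacci numbers")
```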
3.7 The Well-Ordering Principle
We can now show another equivalent fact about the natural numbers which is inextricably linked to induction. This is called the well-ordering principle, which states that every non-empty subset of the natural numbers has a smallest element.
Theorem 41 (Well-ordering Principle).
Let $S$ be a non-empty subset of the natural numbers, $S \subseteq \mathbb{N}$. Then $S$ must have a smallest element.
Proof.
We instead prove the contrapositive of this statement: if $S$ is a subset of the natural numbers which has no smallest element, then it is the empty set. Now consider the relative complement of $S$ in the natural numbers, which we denote
\[ T = \mathbb{N} \setminus S. \]
Notice that 1 is an element of $T$, since 1 cannot be an element of $S$, as otherwise it would be its least element. Now assume that $1, 2, \ldots, n$ are all elements of $T$,
which means that none of these numbers are elements of $S$. This implies that $n + 1$ cannot be in $S$, as otherwise it would be its smallest element. Therefore, we have shown that $n + 1$ is in $T$. In summary, we have
i. $1 \in T$, and
ii. if $1, 2, \ldots, n \in T$, then $n + 1 \in T$.
Hence, from the principle of strong induction, we know that $T$ contains all the natural numbers. However, since $T$ is the result of subtracting $S$ from the natural numbers, we also know that $S$ is a subset of $\mathbb{N} \setminus T$, which together implies
\[ S = \emptyset. \]
This completes the proof of the contrapositive, which means that if $S$ has no smallest element, then it must be the empty set. Equivalently, every non-empty subset of the natural numbers must have a smallest element. End of proof
In some sense, the well-ordering principle is the opposite of induction, which makes it a powerful tool for proof by contradiction over the natural numbers. If a predicate about the natural numbers is false for some of them, then the set of counter-examples
will be a non-empty subset of the natural numbers. It must be non-empty, because otherwise the predicate would be true for all natural numbers. By the well-ordering principle, this set must have a smallest element. A common pattern in a proof by contradiction is the following:
i. Assume that the proposition is false;
ii. By the well-ordering principle, there must be a smallest natural number for which the predicate is false; denote this $n_0$;
iii. Show that we can somehow derive a smaller number for which the proposition is also false, contradicting the original assumption that $n_0$ was the smallest such counter-example.
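The minimal-counterexample pattern above can be mimicked computationally: for a decidable predicate, we can search for the smallest failing natural number up to a bound. The helper name and the bounded search below are our own illustration, since a program cannot scan all natural numbers.

```python
def smallest_counterexample(predicate, limit):
    """Return the smallest n in 1..limit where predicate(n) fails,
    or None if no counter-example exists in that range. This mirrors
    the minimal counter-example guaranteed by well-ordering."""
    for n in range(1, limit + 1):
        if not predicate(n):
            return n
    return None

# A deliberately false claim: "every natural number is a perfect square".
print(smallest_counterexample(lambda n: round(n ** 0.5) ** 2 == n, 100))  # 2
```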