CMSC 27100 — Lecture 6

We were in the middle of proving

For all integers $m$ and $n$, there exist integers $a$ and $b$ such that $\gcd(m,n) = a \cdot m + b \cdot n$.

If $m = n = 0$, then any $a$ and $b$ work. So without loss of generality, we have non-zero $m$ or $n$. Consider the set $S = \{am + bn \mid a,b \in \mathbb Z\}$. Observe that $S$ contains a positive integer, since $|m|$ and $|n|$ are both in $S$ and at least one of them is positive. Let $d$ be the least positive integer in $S$ and let $a$ and $b$ be such that $d = am + bn$. First, we will show that $d$ is a common divisor of $m$ and $n$.

Suppose otherwise, that $d$ isn't a common divisor of $m$ and $n$. Without loss of generality, say $d \nmid m$. Then by the Division Theorem, there exist $q$ and $r$ such that $m = q \cdot d + r$ and $0 \leq r \lt d$; since $d \nmid m$, we have $r \neq 0$, and so $0 \lt r \lt d$. Then we have \begin{align*} r &= m - q \cdot d \\ &= m - q \cdot (am + bn) \\ &= (1-qa)m + (-bq)n \end{align*} This gives us $r \in S$. But since $0 \lt r \lt d$, this contradicts the fact that $d$ was the least positive integer in $S$. Therefore, $d$ must be a common divisor of $m$ and $n$.

We stopped here last time, so we will continue and show that $d$ is the greatest common divisor of $m$ and $n$.

Now, consider any common divisor $c$ of $m$ and $n$. Then there exist integers $s$ and $t$ such that $m = cs$ and $n = ct$. This gives us \begin{align*} d &= am + bn \\ &= acs + bct \\ &= c \cdot (as + bt) \end{align*} Therefore, $c \mid d$, and since $d \gt 0$, this means $c \leq d$. Thus, $d = \gcd(m,n)$.
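
As a quick sanity check of the statement: $\gcd(232, 72) = 8$, and indeed $8 = (-4) \cdot 232 + 13 \cdot 72$.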

Bézout's lemma is also called Bézout's identity. Confusingly, Bézout has another famous result named after him in algebraic geometry called Bézout's theorem.

Here, it's worth noting a subtle assumption that we made. Our assumption in the proof above was that $d$ was not a common divisor of $m$ and $n$. This is a proposition of the form $\neg p$. We then used this assumption to arrive at a contradiction. According to our rules, this means that we now have a proof of $\neg \neg p$.

Are $p$ and $\neg \neg p$ the same thing? In classical logic, they are. But consider the following interpretation of inference rules. We've said that an inference rule says that if we have proofs of the statements on the top, this allows us to construct a proof of the statement on the bottom. This is the intuitionist interpretation of logic.

If we think a few steps more, we see that in this interpretation, we can view this as a computation. An inference rule describes consuming proofs of a certain type in order to construct a proof of a certain type. Since the proofs are necessary ingredients for the computation and construction of a new proof, we really need to have such proofs in hand.
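
To make this computational reading concrete, here is a small illustration (a sketch of our own, with the names implies_elim and and_intro chosen for this example) using Python's type hints: read a proof of $p \rightarrow q$ as a function that transforms proofs of $p$ into proofs of $q$. Then eliminating an implication is literally function application, and introducing a conjunction is literally pairing.

from typing import Callable, Tuple, TypeVar

P = TypeVar("P")
Q = TypeVar("Q")

def implies_elim(pq: Callable[[P], Q], p: P) -> Q:
    # A proof of p -> q is a function from proofs of p to proofs of q,
    # so consuming it along with a proof of p is function application.
    return pq(p)

def and_intro(p: P, q: Q) -> Tuple[P, Q]:
    # A proof of p and q is a pair of proofs, so constructing it
    # from its ingredients is just pairing.
    return (p, q)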

So the question we need to think about is this: is having a proof of $\neg \neg p$ good enough to use as a proof of $p$? What we really have is a proof that it can't be the case that there's a proof of $\neg p$, where $\neg p$ in turn says that there's no proof of $p$. But we are still without a proof of $p$ itself.

Another way is to consider the question of the law of excluded middle, which is the statement $p \vee \neg p$. In classical logic, this is obviously true—in fact, it's an axiom. It must be the case that $p$ is either true or false. But the intuitionist interpretation of this statement is that we must have either a proof of $p$ or a proof of $\neg p$ in hand. This is not true in general: for a statement like the Goldbach conjecture, nobody currently has a proof of $p$ or a proof of $\neg p$.

So a proof system for classical logic is one that includes an inference rule for LEM, while intuitionists would reject it. \[\frac{}{p \vee \neg p} \mathrm{LEM}.\] Using LEM, one can derive the rule \[\frac{\neg \neg p}{p} \neg\neg e.\]
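
To sketch that derivation (our sketch, using the usual rules for $\vee$-elimination and ex falso): suppose we have a proof of $\neg \neg p$. LEM gives us $p \vee \neg p$, so we consider the two cases. In the case of $p$, we are done. In the case of $\neg p$, our proof of $\neg \neg p$ applied to $\neg p$ yields a contradiction, from which $p$ follows by ex falso. Either way, we obtain $p$.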

Computing the gcd

Bézout's lemma gives us a nice alternate way to represent the gcd of $m$ and $n$. But how does one compute the gcd? The obvious way to do this is to compute all divisors of $m$ and $n$, find the common divisors, and find the largest of those. The big problem with this approach is that factoring integers is a notoriously hard problem to solve efficiently. There are some heuristics that we can apply and some tricks that we can pull, but in general we do not have a fast algorithm for factoring numbers. This fact happens to be the backbone of many public-key cryptographic systems we use today.

However, if all we care about is the greatest common divisor and not any of the other divisors, then we will see a theorem that gets us there efficiently.

First, a bit of notation.

Let $n,d,q,r$ be integers that satisfy $n = d \cdot q + r$ and $0 \leq r \lt d$, as in the Division Theorem. We define the function $n \mathop{\mathbf{mod}} d = r$.

You may be more familiar with $\mathop{\mathbf{mod}}$ as the modulus operator % from Python. It's a handy way to refer to the remainder without having to say "$r$ as in the Division Theorem" every time. We can similarly define $n \mathop{\mathbf{div}} d = q$ (hence the name of the function, $\operatorname{divmod}$).
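
For instance, $232 = 3 \cdot 72 + 16$, so $232 \mathop{\mathbf{mod}} 72 = 16$ and $232 \mathop{\mathbf{div}} 72 = 3$; correspondingly, divmod(232, 72) in Python evaluates to (3, 16).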

If $d = \gcd(a,b)$, $b \gt 0$, and $r = a \mathop{\mathbf{mod}} b$, then $d = \gcd(b,r)$.

Let $a$ and $b \gt 0$ be arbitrary. Let $q$ be such that $a = qb + r$. Since $d = \gcd(a,b)$, we have $d \mid a$ and $d \mid b$, so $d \mid qb$. Then $d \mid (a - qb)$, and therefore, $d \mid r$. In other words, $d$ is a common divisor of $b$ and $r$. Now consider an arbitrary common divisor $c$ of $b$ and $r$. Since $c \mid b$ and $c \mid r$, we have $c \mid (qb+r) = a$, so $c$ is also a divisor of $a$. Since $c$ is a common divisor of $a$ and $b$, this means we must have $c \leq d = \gcd(a,b)$. Therefore, $\gcd(b,r) = d = \gcd(a,b)$.

This result gives us a way to compute the gcd: $$\gcd(a,b) = \begin{cases} a & \text{if $b = 0$,} \\ \gcd(b, a \mathop{\mathbf{mod}} b) & \text{otherwise,} \end{cases}$$

which becomes the following function fairly straightforwardly:

def euclid(a, b):
    if b == 0:
        return a
    else:
        return euclid(b, a % b)

This algorithm is called Euclid's algorithm, after the Greek mathematician Euclid, who describes it in the Elements around 300 BC. The Elements lays out much of the elementary geometry, algebra, and number theory that we learn in school, and Euclid's algorithm is considered to be one of the oldest known algorithms.

Suppose we want to compute the gcd of 232 and 72. We apply Euclid's algorithm as follows: \begin{align*} \gcd(232,72) &= \gcd(72,16) \\ &= \gcd(16,8) \\ &= \gcd(8,0) \\ &= 8 \end{align*}
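
Bézout's lemma promised us coefficients $a$ and $b$ with $\gcd(m,n) = am + bn$, and the same recursion can produce them by carrying coefficients back up through the recursive calls. Here is a minimal sketch (we'll call it extended_euclid; this extension is not derived in these notes):

def extended_euclid(a, b):
    # Returns (d, x, y) where d = gcd(a, b) and d == x*a + y*b.
    if b == 0:
        # gcd(a, 0) = a = 1*a + 0*b.
        return (a, 1, 0)
    d, x, y = extended_euclid(b, a % b)
    # Here d == x*b + y*(a % b) and a % b == a - (a // b)*b,
    # so d == y*a + (x - (a // b)*y)*b.
    return (d, y, x - (a // b) * y)

For the example above, extended_euclid(232, 72) evaluates to (8, -4, 13), and indeed $(-4) \cdot 232 + 13 \cdot 72 = 8$.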

Strong Induction

We briefly mentioned that the naive way of computing the gcd of $m$ and $n$ was inefficient. How do we measure the efficiency of algorithms? Informally, we can think of this as the number of steps that the algorithm needs to take, and we compare this to the size of the input of the algorithm. We would like the number of steps of our algorithm to grow slowly as we give it larger and larger inputs.

But what is the size of a number? Here, the size of the number is the size of its representation. This would be the number of digits it takes to write the number down. This means that the size of a number $n$ is $d = \log n$. Here, the exact base doesn't matter too much. But what this means then is that searching for divisors from 1 through $\sqrt n$ would require approximately $\sqrt n = 10^{\log \sqrt n} = 10^{d/2}$ steps, assuming your number was in decimal representation (swap in the appropriate base if desired). In other words, the number of steps we need to compute this grows exponentially as we increase the size of our number. Unfortunately, exponentially growing functions do not grow slowly.
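
To make that count concrete, here is one way the naive method from earlier might be written (a sketch assuming positive inputs; the names divisors and naive_gcd are our own). The loop in divisors runs about $\sqrt n$ times:

def divisors(n):
    # Trial division: every divisor d <= sqrt(n) pairs with the divisor
    # n // d >= sqrt(n), so checking candidates up to sqrt(n) finds all of them.
    result = set()
    d = 1
    while d * d <= n:
        if n % d == 0:
            result.add(d)
            result.add(n // d)
        d += 1
    return result

def naive_gcd(m, n):
    # Collect every divisor of each number, then take the largest in common.
    return max(divisors(m) & divisors(n))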

So does Euclid's algorithm do any better than this? How would we know? We'll take a brief excursion into algorithm analysis and learn about strong induction and some properties of the Fibonacci numbers along the way.

We define this famous sequence of numbers as follows.

The Fibonacci numbers are defined by

  1. $f_0 = 0$, $f_1 = 1$, and
  2. $f_n = f_{n-1} + f_{n-2}$ for $n \geq 2$.

This sequence is attributed to Fibonacci, an Italian mathematician from the 13th Century in what was then the Republic of Pisa, though its origins can be traced further back to a number of Indian mathematicians as early as the 3rd Century BC. The Fibonacci numbers are one of those things from math that spookily show up in all sorts of different places, not only in mathematics but in all sorts of natural phenomena. You may have been asked to compute the Fibonacci numbers as a programming exercise before.

def fib(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fib(n-1) + fib(n-2)

Since this definition is recursive, the Fibonacci numbers seem to lend themselves to inductive proofs quite naturally. However, there is one snag with the definition of the Fibonacci numbers, and it's that they're defined in terms of the previous two terms. If we think back to our template, this means that we have to make a stronger assumption: if, for some statement $P$, we want to show $P(k+1)$, not only do we need to assume $P(k)$, but $P(k-1)$ as well.

This is where we need to introduce the notion of strong induction. Strong induction is a reformulation of our induction axiom that gives us a "stronger" hypothesis. Recall that we had \[P(0) \wedge \forall k (P(k) \rightarrow P(k+1)) \rightarrow \forall n P(n).\] Now, we have "strengthened" our hypothesis by adding an additional term (note that we also need a second base case, since $P(k-1)$ only makes sense when $k \geq 1$): \[P(0) \wedge P(1) \wedge \forall k \geq 1 \, (P(k-1) \wedge P(k) \rightarrow P(k+1)) \rightarrow \forall n P(n).\] But what if we needed more terms? That's no problem either. \[P(0) \wedge \forall k (P(0) \wedge P(1) \wedge \cdots \wedge P(k-1) \wedge P(k) \rightarrow P(k+1)) \rightarrow \forall n P(n).\] Or more succinctly, \[P(0) \wedge \forall k (\forall j(j \leq k \rightarrow P(j)) \rightarrow P(k+1)) \rightarrow \forall n P(n).\] Why does this work? One of the things you can try to convince yourself of is that this is actually no different from ordinary induction. It's not hard to see that every proof by induction is also a proof by strong induction, but it takes a bit more reasoning to convince ourselves of the converse. (One way to see the converse: apply ordinary induction to the statement $Q(k) := \forall j (j \leq k \rightarrow P(j))$.)

Of course, if we think back to our computations on recursive definitions and functions, we remember that we saw a pattern: the computation for $f(n)$ always contained the computation for $f(n-1)$, which contained the computation for $f(n-2)$, which ...

So in fact, all strong induction does is make explicit a fact that was implicit: by the time we are proving $P(k+1)$, we have already established $P(k)$, and before that $P(k-1)$, and so on, all the way down to the $P(0)$ we proved right at the start. So everything from $P(0)$ through $P(k)$ is already available to us when we argue for $P(k+1)$.

This means we need to make a slight modification to our template for proof by induction in order to get a template for a proof by strong induction, namely to the inductive hypothesis. (We will work a small example after the template.)

  1. Prove the necessary base cases.
  2. Prove the inductive case.
    1. Choose an arbitrary natural number $k$.
    2. State your inductive hypothesis. Here, there are two broad kinds of inductive hypotheses that you can use for a proof by strong induction:
      1. If your proof of $P(k+1)$ needs a fixed number $i$ of previous terms, your inductive hypothesis would be to assume that \[P(k-i+1), P(k-i+2), \dots, P(k-1), P(k)\] hold. For example, in the case of the Fibonacci sequence, you will need $i = 2$, so you want to assume $P(k-1)$ and $P(k)$ hold. This assumes that $i$ is small enough that it makes sense to just list the terms. If it isn't, see the next variant.
      2. If your hypothesis is something like $P(0) \wedge P(1) \wedge \dots \wedge P(k-1) \wedge P(k)$, then your inductive hypothesis would be to assume that $P(j)$ holds for all natural numbers $j$ with $0 \leq j \leq k$. You can modify this to be similar to the above, replacing $0$ with $i$ if it's warranted.
    3. Use this assumption together with known facts to prove $P(k+1)$. You should clearly state when you use the inductive hypothesis.
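
As a quick illustration of this template, here is a small worked example: let $P(n)$ be the statement $f_n \lt 2^n$. For the base cases, $f_0 = 0 \lt 1 = 2^0$ and $f_1 = 1 \lt 2 = 2^1$. For the inductive case, choose an arbitrary natural number $k \geq 1$ and assume that $P(k-1)$ and $P(k)$ hold (the first kind of hypothesis, with $i = 2$). Then, using the inductive hypothesis, \[f_{k+1} = f_k + f_{k-1} \lt 2^k + 2^{k-1} \lt 2^k + 2^k = 2^{k+1},\] which proves $P(k+1)$.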