MPCS 50103 — Lecture 1

Welcome to MPCS 50103! Basic course information can be found on the home page of the course website.

What is this course about?

This course is about discrete mathematics, as opposed to, say, "continuous" mathematics, like calculus/analysis, or "discreet" mathematics, which you do very quietly and carefully. More specifically, discrete mathematics is the study of mathematics on discrete structures like integers, as opposed to continuous structures, like the real numbers.

Why should you take this course?

Because you have to in order to complete your degree.

I once had a department chair who argued that calculus shouldn't be a requirement for computer science majors. After confirming that he was actually quite serious about this, he explained that discrete mathematics is much more relevant to computer science and he felt that calculus was a significant deterrent to prospective computer scientists. Of course, not many of us agreed that calculus was really that bad, but it is true that computer science at its core is really a subfield of discrete mathematics. Historically speaking, we can think of computer science as an offshoot of mathematical logic that sort of spiralled out of control into what it is today.

Practically speaking, this course will give you the basic mathematical tools necessary for studying computer science. Not only will the content be useful, but the proof techniques that we study will show up in theoretical computer science courses later on.

Intellectually speaking, this course will introduce you to some ideas of mathematical proof that you may not have seen before, depending on your mathematical background. Typically, in North America, many students do not see the topics that we cover in discrete math until, well, they take a course like this.

What will we be covering in this course?

This course is a grab bag of fun discrete mathematical treats.

  1. Logic, proof, and set theory. While we won't go into mathematical logic or set theory in any real depth, they form the basis for the ideas and proof strategies that we'll use throughout the course. Most importantly, this part of the course is where many foundational ideas for the rest of the course are introduced.
  2. Number theory. Number theory is the study of the properties of numbers and in particular, the prime numbers.
  3. Combinatorics. Combinatorics is the study of how to count things. This sounds trivial until you have to start trying to figure out how many unique ways to arrange some particular structures there are.
  4. Probability theory. Probability is the study of quantifying the likelihood that events will occur. We focus on discrete probability, where events occur with countably many possible outcomes.
  5. Graph theory. Graph theory is the study of graphs. These are not graphs in the calculus or stats sense, but graphs as a set of vertices connected by edges. Graphs can represent relations among a set of objects.

Keep in mind that while this is an explicit list of content that we expect to cover, there is a lot more that we'll be covering implicitly, in the course of studying these topics. As already mentioned, a big part of this course is to develop your skills in mathematical proof.

Logic

Mathematical logic is the study of reasoning from the point of view of mathematics. On its own it's a rich and deep field of study. In the context of computer science, the development of mathematical logic in the early 20th century led directly to the notion of computability. For the purposes of this course, the basics of mathematical logic allow us to develop a language to express mathematical ideas and a framework in which to prove their validity.

For instance, you've probably run across particularly math-y statements like

For all integers $n \geq 5$, $2^n \gt n^2$.

This appears to be (mostly) natural language, but the statement itself is one of the things that we want to (and can) express using mathematical logic. Ultimately, we want to be able to make statements of this kind and to verify whether they're valid or not.

In the strictest sense, the language is really just a collection of symbols and some rules about how to arrange them. So, you're allowed to write something that looks like $(p \wedge q)$, but not something like $q \vee \wedge p$, and so on and so forth. The magic comes when we define meaning for these symbols.

Propositional logic

We will begin with propositional logic and expand from there later on. The basic element of propositional logic is the proposition.

Definition 1.1. A proposition is a statement that is either true or false.

All mathematical results will be something that's either true or false. Even something as simple as $2+2 = 4$ is a proposition: the statement asserts that the sum of 2 and 2 equals 4.

We can classify propositions into two broad types.

Definition 1.2. An atomic proposition is one that cannot be broken down into smaller propositions. A compound proposition is not atomic.

In mathematics, this sort of breaking down an object into parts is common. Very often, we'll be looking at complicated objects and structures and in order to understand these things better, we want to break them down into their simplest, indivisible parts. In this sense, a compound proposition is made up of more than one proposition, which are joined together by some connective.

In mathematical logic, we represent propositions symbolically by formulas. Formulas are made up of propositional variables, parentheses, and connectives. Propositional variables, usually lowercase roman letters like $p,q,r,\dots$, represent atomic propositions and connectives allow us to build compound propositions by joining one or more propositions together. Parentheses are used to denote the ordering that we should interpret the connectives in. We represent propositional formulas by lowercase Greek letters, $\alpha, \beta, \dots$. Hypothetically, we would be able to express every proposition that shows up in this course symbolically, but we won't go that far.

There are four basic logical connectives, called Boolean connectives, after George Boole who defined them in 1854. We will define and consider each one.

Definition 1.3. $\neg p$ is called negation and is pronounced "not $p$". $\neg p$ is true if and only if $p$ is false.

For instance, suppose $p$ is the proposition "flights to Tokyo are under 900 CAD", $q$ is "I can take two weeks off", and $r$ is "I can fly to Tokyo"; we will reuse these propositions throughout this section. Then $\neg q$ is the proposition "I cannot take two weeks off". Note that negation is a unary connective, in that it only applies to one formula, while all the other connectives are binary. Observe that because of this, we tend to express atomic propositions positively.

The definition for the unary connective $\neg$ with a propositional formula $\varphi$ is summarized by

| $\varphi$ | $(\neg \varphi)$ |
|---|---|
| $T$ | $F$ |
| $F$ | $T$ |

Definition 1.4. $p \wedge q$ is called conjunction, and is pronounced "$p$ and $q$". $p \wedge q$ is true if and only if both $p$ and $q$ are true.

This connective expresses the idea that both $p$ and $q$ are true. For instance, $q \wedge r$ is the proposition "I can take two weeks off and I can fly to Tokyo".

$p \vee q$ is called disjunction and is pronounced "$p$ or $q$". $p \vee q$ is true if and only if $p$ is true or $q$ is true or both.

For example, $q \vee r$ is the proposition "I can take two weeks off or I can fly to Tokyo". One tricky thing to note with English is that many times we use "or" to mean "exclusive or". For instance, when you are asked whether you prefer beef or chicken, the expectation is that you may only choose one and not both. This logical connective allows for both $p$ and $q$ to be true, which corresponds to something like "and/or" in English.

The binary connectives $\wedge$ and $\vee$ are summarized for propositional formulas $\varphi$ and $\psi$ by

| $\varphi$ | $\psi$ | $\varphi \wedge \psi$ | $\varphi \vee \psi$ |
|---|---|---|---|
| $T$ | $T$ | $T$ | $T$ |
| $T$ | $F$ | $F$ | $T$ |
| $F$ | $T$ | $F$ | $T$ |
| $F$ | $F$ | $F$ | $F$ |

Definition 1.5. $p \rightarrow q$ is called implication and is pronounced "if $p$, then $q$". $p \rightarrow q$ is true if and only if $q$ is true or $p$ is false.

For example, $p \rightarrow r$ would be the proposition "If flights to Tokyo are under 900 CAD, then I can fly to Tokyo". The direction in which you read this connective is very important. For instance, it may be the case that flights to Tokyo may not be under 900 CAD but I may still be able to fly. Here, $p$ is called the hypothesis and $q$ is called the conclusion.

$\rightarrow$ is summarized for propositional formulas $\varphi$ and $\psi$ by

| $\varphi$ | $\psi$ | $\varphi \rightarrow \psi$ |
|---|---|---|
| $T$ | $T$ | $T$ |
| $T$ | $F$ | $F$ |
| $F$ | $T$ | $T$ |
| $F$ | $F$ | $T$ |

Implication is probably less familiar than conjunction or disjunction but it's particularly important since it's heavily used in expressing mathematical statements. Again, unlike conjunction and disjunction, implication is not commutative, which means that $p \rightarrow q$ and $q \rightarrow p$ are not necessarily logically equivalent. Here, $q \rightarrow p$ is called the converse of $p \rightarrow q$.

One way to think about implications is as a claim, as we do in mathematics. Is the claim true? This depends on whether the hypothesis is true. If the hypothesis isn't true, then the claim can be considered vacuously true. This is sort of like when you begin a claim about all positive numbers that are less than 0: you can say anything about these numbers because they don't exist!

However, if the hypothesis is true, then the truth of our claim depends on whether the conclusion is also true. If it is, then all is right with the world. But if our hypothesis is true while our promised conclusion is false, then we have to conclude that we didn't know what we were talking about and that our claim was false after all.
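
If you'd like to experiment with these definitions, here is a minimal sketch in Python (my own illustration, not part of the course materials): each connective becomes a Boolean function, with implication encoded as $\neg p \vee q$, an equivalence we will verify shortly.

```python
# A sketch of the four Boolean connectives as Python functions.
from itertools import product

def NOT(p): return not p
def AND(p, q): return p and q
def OR(p, q): return p or q
def IMPLIES(p, q): return (not p) or q  # true unless p is true and q is false

# Print the truth table for the three binary connectives.
for p, q in product([True, False], repeat=2):
    print(p, q, AND(p, q), OR(p, q), IMPLIES(p, q))
```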

Using these connectives, we can construct fairly complicated propositions.

Example 1.6. Consider the proposition $\varphi = \neg( (p \wedge \neg q) \vee (q \wedge r) \vee \neg s)$. We can determine when this proposition is true based on whether the propositions $p,q,r,s$ are true. To do this, we can construct a truth table for $\varphi$.

| $p$ | $q$ | $r$ | $s$ | $p \wedge \neg q$ | $q \wedge r$ | $\neg s$ | $\varphi$ |
|---|---|---|---|---|---|---|---|
| $T$ | $T$ | $T$ | $T$ | $F$ | $T$ | $F$ | $F$ |
| $T$ | $T$ | $T$ | $F$ | $F$ | $T$ | $T$ | $F$ |
| $T$ | $T$ | $F$ | $T$ | $F$ | $F$ | $F$ | $T$ |
| $T$ | $T$ | $F$ | $F$ | $F$ | $F$ | $T$ | $F$ |
| $T$ | $F$ | $T$ | $T$ | $T$ | $F$ | $F$ | $F$ |
| $T$ | $F$ | $T$ | $F$ | $T$ | $F$ | $T$ | $F$ |
| $T$ | $F$ | $F$ | $T$ | $T$ | $F$ | $F$ | $F$ |
| $T$ | $F$ | $F$ | $F$ | $T$ | $F$ | $T$ | $F$ |
| $F$ | $T$ | $T$ | $T$ | $F$ | $T$ | $F$ | $F$ |
| $F$ | $T$ | $T$ | $F$ | $F$ | $T$ | $T$ | $F$ |
| $F$ | $T$ | $F$ | $T$ | $F$ | $F$ | $F$ | $T$ |
| $F$ | $T$ | $F$ | $F$ | $F$ | $F$ | $T$ | $F$ |
| $F$ | $F$ | $T$ | $T$ | $F$ | $F$ | $F$ | $T$ |
| $F$ | $F$ | $T$ | $F$ | $F$ | $F$ | $T$ | $F$ |
| $F$ | $F$ | $F$ | $T$ | $F$ | $F$ | $F$ | $T$ |
| $F$ | $F$ | $F$ | $F$ | $F$ | $F$ | $T$ | $F$ |
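
This bookkeeping is mechanical enough to hand to a computer. Here is a short sketch (again my own illustration) that enumerates all $2^4$ assignments and evaluates $\varphi$, reproducing the table above.

```python
# Enumerate all 16 truth assignments for p, q, r, s and evaluate
# phi = not((p and not q) or (q and r) or not s), as in Example 1.6.
from itertools import product

for p, q, r, s in product([True, False], repeat=4):
    phi = not ((p and not q) or (q and r) or (not s))
    print(p, q, r, s, phi)
```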

We can use this method to compare two propositional formulas.

Example 1.7. Consider the propositions $\neg p \vee q$ and $p \rightarrow q$.

| $p$ | $q$ | $\neg p \vee q$ | $p \rightarrow q$ |
|---|---|---|---|
| $T$ | $T$ | $T$ | $T$ |
| $T$ | $F$ | $F$ | $F$ |
| $F$ | $T$ | $T$ | $T$ |
| $F$ | $F$ | $T$ | $T$ |

This logical equivalence gives us another way to think about the implication connective. Later on, this will provide some insight into how we might go about proving claims of this form.

Definition 1.8. Two propositional formulas $\varphi$ and $\psi$ are logically equivalent if they have the same valuation (i.e. they have the same truth table).
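
Since the truth tables are finite, checking logical equivalence is a brute-force computation. A sketch (the helper `equivalent` is my own name, not standard): two formulas over $n$ variables are equivalent exactly when they agree on all $2^n$ assignments.

```python
# Brute-force logical equivalence: the formulas agree on every assignment.
from itertools import product

def equivalent(f, g, n):
    return all(f(*vs) == g(*vs) for vs in product([True, False], repeat=n))

# Example 1.7: "not p or q" vs "p -> q" (the implication is encoded
# without 'or' so the two formulas are syntactically different).
print(equivalent(lambda p, q: (not p) or q,
                 lambda p, q: q if p else True,
                 2))  # True
```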

Example 1.9. Consider the propositional formulas $\neg(p \vee q)$ and $\neg p \wedge \neg q$.

| $p$ | $q$ | $\neg(p \vee q)$ | $\neg p \wedge \neg q$ |
|---|---|---|---|
| $T$ | $T$ | $F$ | $F$ |
| $T$ | $F$ | $F$ | $F$ |
| $F$ | $T$ | $F$ | $F$ |
| $F$ | $F$ | $T$ | $T$ |

This logical equivalence is one of De Morgan's laws. The other one is symmetric to the one above: $$\neg(p \wedge q) \equiv \neg p \vee \neg q.$$

Here, we use $\equiv$ to denote equivalence rather than $=$. This is because $=$ is used to denote formulas that are exactly the same (i.e. they contain the same symbols in the same order). If we only want to say that the two formulas mean the same thing, we use $\equiv$.
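
Both of De Morgan's laws can be confirmed the same way; here is a quick self-contained check (illustration only):

```python
# Verify both De Morgan laws on all four assignments.
from itertools import product

for p, q in product([True, False], repeat=2):
    assert (not (p or q)) == ((not p) and (not q))
    assert (not (p and q)) == ((not p) or (not q))
print("both De Morgan laws hold on every assignment")
```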

Predicates

Now, one thing you might notice is that propositions as we've defined them don't quite let us say everything we may want to. For instance, if I wanted to talk about some property about integers, I can certainly define propositions for them. However, there's no way to relate the various propositions about integers to the fact that we're talking about the same kinds of objects. For that, we need to expand our language a bit further. The extension that we'll be talking about is called predicate or first-order logic.

First, we need to determine a domain of discourse, which is the set to which all of our objects belong. This can be something like the natural numbers $\mathbb N$ or the integers $\mathbb Z$ or matrices or functions or graphs and so on. There's also nothing stopping us from making our domain the set of all dogs, if what we really wanted to do was express statements about dogs, or other "real-world" types of objects. This occurs less in math but more in places like database theory.

This brings us to an important thing to keep in mind: statements that may be true in one domain will not necessarily be true in another domain. Obviously, statements that we make about integers will not necessarily hold true for dogs, but it is important to remember that the same could be said for integers vs. natural numbers.

Definition 1.10. Here are some useful domains/sets:

  - $\mathbb N = \{0, 1, 2, 3, \dots\}$, the natural numbers
  - $\mathbb Z = \{\dots, -2, -1, 0, 1, 2, \dots\}$, the integers
  - $\mathbb Q$, the rational numbers
  - $\mathbb R$, the real numbers

Note that in this course, we consider 0 to be a natural number. Not everyone agrees!

Then, we want to express properties and relationships about the objects in our domain. For this, we need to define predicates. Predicates have an arity associated with them. A $k$-ary predicate is a predicate with $k$ arguments. So a unary or 1-ary predicate would be expressing some property of a particular object, while predicates with arity greater than 1 can be thought of as expressing some relationship between a number of objects.

Example 1.11. If we take the students in this class to be our domain, then we can define some predicates like $M(x)$ for "$x$ is an MPCS student" or $B(x)$ for "$x$ is a Booth student". Now, we can use these predicates together with logical connectives to say things like $M(x) \wedge \neg B(x)$, which says "$x$ is an MPCS student and $x$ is not a Booth student". Then we can evaluate the truth of this statement for some particular $x$. Observe that without predicates, we'd have to define this statement as a separate proposition for each person we want to say something about.

We can also define a binary predicate, like $S(x,y)$ for "$x$ is sitting next to $y$". Then, we can fill in $x$ and $y$ and evaluate this statement. Again, without the use of predicates, we'd have to encode a separate proposition for each pair of students $x,y$.

Example 1.12. For a more mathematical example, if we define $E$ to be a unary predicate that expresses the evenness property on the domain of the natural numbers, then we can say that $E(x)$ means "$x$ is an even number". We would then have $E(12000)$ is true while $E(765)$ is false.

The less-than relation is another example of a predicate, although because it's pretty important, we've given it its own symbol ($\lt$) and usage. We can define a predicate $L(x,y)$ to mean "$x$ is less than $y$" or "$x \lt y$". Then we can say that $L(3,30)$ is true and $L(30,3)$ is false.
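
One way to internalize predicates is as Boolean-valued functions on the domain, which is exactly how you might code them up. A small sketch using the two examples above:

```python
# Predicates as Boolean-valued functions on a domain.
E = lambda x: x % 2 == 0  # E(x): "x is an even number"
L = lambda x, y: x < y    # L(x, y): "x is less than y"

print(E(12000), E(765))    # True False
print(L(3, 30), L(30, 3))  # True False
```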

In the previous examples, almost instinctively, we used both concrete objects and variables with our predicates. When we use specific objects with predicates, our statements are not all that different from propositions. However, variables are considered placeholders for objects in the domain and it's the use of variables, together with quantifiers, that gives us more expressive power than propositional logic.

Definition 1.13. The symbol $\forall$ is called the universal quantifier and is read "for all". The symbol $\exists$ is called the existential quantifier and is read "there exists".

Basically, we use $\forall$ when we want to make a statement about all objects in our domain ("every dog is a good boy") and we use $\exists$ when we want to say that there is some object out there in our domain that has a particular property ("there is a dog that can play Jenga").

Example 1.14. We can use predicates and quantifiers together with the logical connectives we introduced with propositional logic to say things like $\forall x (E(x) \rightarrow \neg E(x+1))$. If we take our domain to be $\mathbb Z$, this says that for every integer $x$, if $x$ is even, then $x+1$ is not even.

Example 1.15. Let's fix our domain to be $\mathbb N$. Recalling our less than predicate $L$ from above, we can consider the following proposition: $$\forall x \exists y L(x,y).$$ This reads "for every natural number $x$, there exists a natural number $y$ such that $x \lt y$". This statement turns out to be true: every natural number does have some other natural number that it's less than. Now, let's consider the following similar statement, where the order of the quantifiers is reversed: $$\exists y \forall x L(x,y).$$ Here, we're saying that there is a natural number $y$ for which every natural number $x$ is less than it. In other words, this says that there is a "biggest" natural number, which is obviously not true. As we can see, the order of the quantifiers matters.

Example 1.16. Remember that the domain is important. For instance, if we are working in $\mathbb N$ and we modify the above example slightly to $\forall x \exists y L(y,x)$, we have the statement that "for every natural number $x$, there is some natural number $y$ that is less than $x$". This statement is false if we consider $x = 0$. However, this statement is true if our domain is $\mathbb Z$ (for every integer $x$, there is some integer $y$ that is less than $x$).
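
Over a finite domain, Python's `all()` and `any()` act like the quantifiers. The $L(x,y)$ example above genuinely needs all of $\mathbb N$, so here is a variant predicate (hypothetical, not from the lecture) on a finite set of integers where swapping the quantifiers already changes the answer.

```python
# all() and any() as finite-domain "forall" and "exists".
D = range(-50, 51)
P = lambda x, y: x + y == 0

print(all(any(P(x, y) for y in D) for x in D))  # forall x exists y: True
print(any(all(P(x, y) for x in D) for y in D))  # exists y forall x: False
```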

Something we have just run into is the negation of a quantified statement. For propositional logic, there wasn't much to do, but with quantified statements, we can get some interesting insights.

Theorem 1.17. Let $\varphi$ be a predicate formula. Then, \begin{align*} \neg \forall x \varphi &\equiv \exists x \neg \varphi \\ \neg \exists x \varphi &\equiv \forall x \neg \varphi \end{align*}

We can consider this to be a generalized version of De Morgan's laws for quantifiers.

Let's take a moment to think about what this is saying.

Example 1.18. Taking our example $\forall x (E(x) \rightarrow \neg E(x+1))$, we can think about what the negation of this statement might be. Of course, we can begin by slapping on a negation: $$\neg \forall x (E(x) \rightarrow \neg E(x+1)).$$ Is there anything more we can do to clarify what this statement means though? We can make use of the negation of quantified statements that was just mentioned: $$\exists x \neg(E(x) \rightarrow \neg E(x+1)).$$ From here, we can make use of the propositional equivalences we saw earlier. Applying the implication equivalence gets us $$\exists x (\neg (\neg E(x) \vee \neg E(x+1))).$$ Now we can apply De Morgan's laws and eliminate the resulting double negations to get $$\exists x (E(x) \wedge E(x+1)).$$ This statement says that there exists an integer $x$ such that $x$ is even and $x+1$ is even.

The final statement we arrived at makes it much more clear what the negation of our original statement actually meant. Obviously, since it was so simple, we could have probably guessed what it meant without going through the formal process of applying logical equivalences. However, if you're faced with proving some proposition, it is helpful to try to methodically translate the statement to formal logic and consider what it is saying in this way to make it more clear what it is you need to prove.
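
On a finite domain we can watch Theorem 1.17 in action: `not all(...)` and `any(not ...)` always agree, and the simplified form from Example 1.18 agrees with both. A sketch over the first thousand naturals:

```python
# The negation of "forall x phi" agrees with "exists x not phi",
# where phi is E(x) -> not E(x+1), checked over 0..999.
D = range(1000)
E = lambda x: x % 2 == 0
phi = lambda x: (not E(x)) or (not E(x + 1))  # E(x) -> not E(x+1)

print(not all(phi(x) for x in D))         # False
print(any(not phi(x) for x in D))         # False: the same answer
print(any(E(x) and E(x + 1) for x in D))  # False: the simplified form
```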

Proof

A proof is a proof. What kind of a proof? It's a proof. A proof is a proof, and when you have a good proof, it's because it's proven.

By constructing truth tables, we can determine the truth values for any given formula based on the connectives in that formula. The big problem with this is that truth tables can get very large: if you have $n$ different propositional variables, you're looking at a truth table of size $2^n$. However, there is another presentation of the logical rules we've defined that avoids the use of truth tables, and the evaluation of the meaning of formulas, altogether. These are called inference rules, and these rules allow us to deduce propositions by transforming other propositions.

Example 1.19. The main inference rule for implication is called modus ponens and is defined by $$\frac{p \rightarrow q \quad p}{q}.$$

What does this rule mean? It's an argument of the form

  1. If the bus is late, then I will miss my transfer.
  2. The bus is late.
  3. Therefore, I will miss my transfer.

Here, it's easy to see that the statement "the bus is late" takes the place of $p$ and the statement "I will miss my transfer" takes the place of $q$. For instance, the following argument follows the same structure:

  1. If $m$ and $n$ are even numbers, then $m+n$ is an even number.
  2. $m$ and $n$ are even numbers.
  3. Therefore, $m+n$ is an even number.

We can define a whole bunch of these inference rules, and by using these rules, we can manipulate formulas to acquire other formulas. Inference rules define a proof system, which formalizes our intuitive notion of proof. One thing to notice is that the content of the argument has no bearing on whether it's valid or not. The only thing that matters is whether we have followed the rules to get from one statement to the next. This is not unlike how many people try to argue on the internet.

Example 1.20. Another simple rule is the simplification rule. This is actually two rules of the form $$\frac{p \wedge q}{p} \quad \frac{p \wedge q}{q}.$$ Intuitively, this says that if $p \wedge q$ is a valid proposition, then we can conclude $p$ (or $q$). Such an argument looks like this:

  1. The sky is cloudy and it is snowing outside.
  2. Therefore, it is snowing outside.

Then, we can combine rules. Consider the following argument.

  1. $p \rightarrow r$
  2. $p \wedge q$
  3. Therefore, $p$, by simplification (2)
  4. Therefore, $r$, by modus ponens (1,3)

A similar argument using natural language would look like,

  1. If it is snowing outside, then the bus is late.
  2. The sky is cloudy and it is snowing outside.
  3. Therefore, it is snowing outside.
  4. Therefore, the bus is late.

It seems very obvious that we can get $p$ from $p \wedge q$ and it seems superfluous that we would have to simplify the statement, but the nature of formal proof is that these things cannot be assumed and the rules need to be strictly followed. This is not unlike programming, and in fact, there are some fascinating and fundamental connections between formal proof and programs.
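
To make the analogy with programming concrete, here is a toy proof checker (entirely my own sketch, with made-up representations): formulas are nested tuples, and the checker blindly pattern-matches rule applications without ever interpreting the formulas, just as described above.

```python
# A toy checker for proofs built from modus ponens and simplification.
# Formulas are tuples like ('->', 'p', 'r') and ('and', 'p', 'q').
def check(premises, steps):
    known = list(premises)
    for rule, concl, refs in steps:
        cited = [known[i] for i in refs]
        if rule == "modus_ponens":
            imp, ant = cited
            assert imp == ("->", ant, concl), "bad modus ponens"
        elif rule == "simplification":
            (conj,) = cited
            assert conj[0] == "and" and concl in conj[1:], "bad simplification"
        known.append(concl)
    return known[-1]

# The argument from the text: from p -> r and p and q, conclude r.
premises = [("->", "p", "r"), ("and", "p", "q")]
steps = [("simplification", "p", [1]),   # from premise 2, get p
         ("modus_ponens", "r", [0, 2])]  # from premise 1 and p, get r
print(check(premises, steps))  # r
```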

From a less mathematical point of view, these rules can be considered forms of argumentation. These rules, together with a base set of axioms, define a proof system for us to work with. In this sense, argumentation and proof are really the same thing. We begin with a set of premises and try to use the rules to arrive at a conclusion. If we can do this, then we've formed a valid argument or proof.

This is a very mathematically formal notion of proof that we won't dive into much further. I mention it because it forms a basis from which we can explore the more informal notion of proof, which we will use to demonstrate the validity of mathematical claims. So rather than introducing the nitty-gritty of inference rules and proof systems, we will introduce proof techniques and briefly mention their more formal origins. But first, some terminology about the kinds of statements we want to prove.

Definition 1.21. We have already seen propositions, which are statements that are either true or false. We call particularly important propositions that have been proven to be true theorems. Less important propositions are called lemmas. We can think of lemmas as propositions that we don't really care about except in the service of proving more important theorems. A corollary is a proposition that follows immediately from a theorem.

Direct proof

The first method is called direct proof, which basically means following our hypothetical inference rules to reach some conclusion. How we do this will depend on the kind of statement we want to prove. There are a few broad kinds of statements.

Universal quantification

Universally quantified statements are of the form $\forall x P(x)$. In other words, it's a claim about every element in the domain. An example of this would be the following.

Proposition 1.22. For every real number $x$, $x^2 + 1 \geq 2x$.

First, observe that we can write this symbolically: taking the domain to be $\mathbb R$, the real numbers, the proposition can be written $\forall x \, (x^2 + 1 \geq 2x)$.

So how do we prove this statement? At first glance, it seems like we have a lot of work to do, because we have to show that this statement holds for every real number. One way we can attempt this is to plug in every real number and check if the statement holds. This is bad because there are uncountably many real numbers.

Instead, what we do is "pick" a generic representative and continue our argument with that representative. Since we just choose an arbitrary element with nothing special about it except that we know it's a real number, we can conclude that our statement must hold for any real number, because we could've picked any of them.

Proof. Let $c$ be an arbitrary real number. Then $c-1$ is also a real number and therefore, $(c-1)^2 \geq 0$. Expanding the left side, we have $c^2 - 2c + 1 \geq 0$. We add $2c$ to both sides to get $c^2 + 1 \geq 2c$. Since $c$ was arbitrary, we can conclude that $x^2 + 1 \geq 2x$ holds for every real number $x$. $\tag*{$\Box$}$

Generally, a proof of $\forall x P(x)$ along these lines looks like

  1. Pick a representative domain element, say $d$.
  2. State known facts.
  3. Conclude that $P(d)$ holds.
  4. Since $d$ was arbitrary, we can conclude that $\forall x P(x)$ holds.
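
A proof like this covers all uncountably many reals at once, which no amount of testing can do, but testing is still a useful sanity check before you invest in writing a proof. A sketch (evidence only, not a proof):

```python
# Spot-check Proposition 1.22 on random reals.
import random

for _ in range(10_000):
    x = random.uniform(-1e6, 1e6)
    assert x**2 + 1 >= 2 * x
print("no counterexample found in 10,000 samples")
```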

Existential quantification

Existentially quantified statements are of the form $\exists x P(x)$. In this case, the statement claims that there is at least one element for which $P(x)$ is satisfied. This means that in order to prove this statement, we need to find an element that works or somehow show that one is out there. Consider the following proposition.

Proposition 1.23. For some real number $x$, $x^2 + 4x + 4 = 0$.

One possible proof is as follows.

Proof. Let $x = -2$. Then $(-2)^2 + 4 \cdot (-2) + 4 = 8 - 8 = 0$. $\tag*{$\Box$}$

This is a perfectly valid proof, because we've shown that there's a real number that works: $-2$. And in fact, there are famous stories of mathematicians presenting such proofs at conferences. However, these proofs aren't very satisfying because they don't tell us how these answers came about. Something more insightful might be the following.

Proof. Consider $x^2 + 4x + 4 = 0$. We can factor this equation to obtain $(x+2)^2 = 0$. This means that $x+2 = 0$ and therefore, the only real number satisfying this equation is $x = -2$. $\tag*{$\Box$}$

This proof is more insightful and gives us an idea of how to approach similar problems. However, sometimes this isn't possible, and we don't have a clear way to find or construct a specific number. In the worst case, the best we can do is to show that there has to be an element out there that works, but without any idea of how to obtain it.

Generally, a proof of $\exists x P(x)$ will involve the following,

  1. State known facts.
  2. Show that $P(d)$ holds for some element $d$ in the domain.
  3. Since we showed that there's some $d$ that works, we can conclude that $\exists x P(x)$.
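
For an existential statement, code can sometimes find the witness for us. A sketch that searches the small integers for a root of the equation in Proposition 1.23 and then confirms it:

```python
# Search for a witness to x^2 + 4x + 4 = 0 among small integers.
witness = next(x for x in range(-10, 11) if x**2 + 4*x + 4 == 0)
print(witness)                     # -2
print(witness**2 + 4*witness + 4)  # 0
```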

Implication

Most theorems are statements that are implications, which are statements of the form $p \rightarrow q$. In order to prove such statements, we assume $p$ and then use that assumption to show that the conclusion $q$ is true.

Proposition 1.24. For all real numbers $x$, if $x \gt 4$, then $x^2 \gt 9$.

Proof. Let $x$ be an arbitrary real number and assume that $x \gt 4$. Then $x$ is positive. We multiply both sides of the inequality by $x$ to get $x^2 \gt 4x$. Multiplying the inequality by $4$ instead gives us $4x \gt 4^2 = 16$. Then combining these inequalities, we have $$x^2 \gt 4x \gt 16 \gt 9.$$ Therefore, $x^2 \gt 9$. $\tag*{$\Box$}$

From this, we can see that a proof of $p \rightarrow q$ involves

  1. Assume that $p$ holds.
  2. State known facts.
  3. Conclude that $q$ holds.

If and only if

An important kind of statement is the if and only if statement. When we say $p$ if and only if $q$, this means that both $p \rightarrow q$ and its converse are true. In other words, $p$ and $q$ can be considered logically equivalent. The way to prove such a statement then follows from this explanation: we have to obtain a proof for $p \rightarrow q$ and a proof for $q \rightarrow p$.

Counterexample

A proof by counterexample is really a disproof of a statement, particularly a universally quantified statement. Consider a statement $\forall x P(x)$. If we want to disprove this statement, we would be proving the statement $\neg \forall x P(x)$. However, recall that this is equivalent to $\exists x \neg P(x)$.

Claim 1.25. For every integer $n$, if $n$ is even, then $n^2$ is odd.

Proof. We will disprove the claim by exhibiting a counterexample. Consider $n = 2$, which is an even integer. Then $n^2 = 4$, which is not an odd integer. Therefore, the claim is false. $\tag*{$\Box$}$

Whether such proofs are easy depends on how easy it is to find a counterexample. There are many conjectures that have yet to be disproven because the claim holds for an enormous number of cases and the search for a counterexample has yielded nothing so far. However, numbers can get very large, so we can never rule out the possibility of finding a counterexample to these claims someday. This is just to say that finding a counterexample is not always going to be easy.
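
Searching for a counterexample is also something a computer is happy to do, at least over small cases. A sketch that disproves Claim 1.25 (it finds $n = 2$, the counterexample used above):

```python
# Search for an even n whose square is not odd.
for n in range(1, 100):
    if n % 2 == 0 and not (n**2 % 2 == 1):
        print(f"counterexample: n = {n}, n^2 = {n**2}")
        break
```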

Contrapositive

This proof technique makes use of the fact that $p \rightarrow q$ is logically equivalent to $\neg q \rightarrow \neg p$ (try constructing the truth table for yourself). Rather than assuming $p$ and constructing a proof for $q$, an alternative is to assume $\neg q$ and conclude $\neg p$. Such a proof suffices to show that $p \rightarrow q$ because of the logical equivalence of the two statements.

Consider the following proposition.

Proposition 1.26. For all real numbers $x$, if $x^5 - 3x^4 + 2x^3 - x^2 + 4x - 1 \geq 0$, then $x \geq 0$.

This may be off-putting because the hypothesis of this proposition is quite complicated. However, proving the contrapositive gives us an easier, if indirect, route.

Proof. Let $x$ be an arbitrary real number. We will prove the contrapositive,

If $x \lt 0$, then $x^5 - 3x^4 + 2x^3 - x^2 + 4x - 1 \lt 0$.
Note that we make use of the fact that $\neg (a \geq b)$ is equivalent to $a \lt b$.

So, we assume that $x \lt 0$. Then this means that $x^5 \lt 0$, $2x^3 \lt 0$, $4x \lt 0$, $-3x^4 \lt 0$, $-x^2 \lt 0$, and $-1 \lt 0$. Adding these together, we obtain $$x^5 - 3x^4 + 2x^3 - x^2 + 4x - 1 \lt 0.$$ Thus, the contrapositive holds, and therefore, the original implication holds. $\tag*{$\Box$}$

From this, our approach for proving $p \rightarrow q$ is the following,

  1. State the contrapositive $\neg q \rightarrow \neg p$.
  2. Assume that $\neg q$ holds.
  3. State known facts.
  4. Conclude that $\neg p$ holds.
  5. Since we have shown that $\neg q \rightarrow \neg p$ holds, we can conclude that $p \rightarrow q$ holds.
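
As with the direct proofs above, a quick numeric check of the contrapositive can build confidence before (or after) writing the proof. A sketch, sampling negative reals (evidence only):

```python
# Spot-check the contrapositive of Proposition 1.26: f(x) < 0 when x < 0.
import random

f = lambda x: x**5 - 3*x**4 + 2*x**3 - x**2 + 4*x - 1
for _ in range(10_000):
    x = -random.uniform(1e-9, 1e3)
    assert f(x) < 0
print("f(x) < 0 for all sampled x < 0")
```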

Contradiction

This proof technique is similar to the contrapositive, in that our proof is indirect. First, we need another idea from propositional logic.

Definition 1.27. Let $p$ be a propositional variable. Then the formula $p \wedge \neg p$ is called a contradiction.

If we consider the truth table for $p \wedge \neg p$, we will see that it is always false. Intuitively, what the statement $p \wedge \neg p$ says is that $p$ is true and $p$ is false, something that we know can't be correct.

The proof template for proving $p \rightarrow q$ looks like the following.

  1. Assume that $p$ holds.
  2. Assume, for contradiction, that $\neg q$ holds.
  3. State known facts.
  4. Reach a contradiction.
  5. Conclude that our assumption that $\neg q$ holds is false.
  6. Therefore, it must be the case that $q$ holds.

Proposition 1.28. For all real numbers $x,y$, if $x+y \lt 10$, then $x \lt 5$ or $y \lt 5$.

Proof. Let $x,y$ be arbitrary real numbers. First, we assume that $x+y \lt 10$. Then, we assume for the sake of contradiction that $x \geq 5$ and $y \geq 5$. Adding these two inequalities together gives us $x+y \geq 10$. But this contradicts our very first assumption, that $x+y \lt 10$. Therefore, our assumption that $x \geq 5$ and $y \geq 5$ could not have been true. Thus, $x \lt 5$ or $y \lt 5$. $\tag*{$\Box$}$
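
Once more, a mechanical spot check (evidence only, since $x$ and $y$ range over all reals): on a grid of small integers, no pair violates Proposition 1.28.

```python
# Exhaustively check Proposition 1.28 over a grid of small integers.
for x in range(-20, 21):
    for y in range(-20, 21):
        if x + y < 10:
            assert x < 5 or y < 5
print("no violations on the integer grid")
```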

Proofs and programs

Now, one of the big questions of mathematical logic back in the early 20th century was whether the two presentations we've seen are the same. With truth tables, we are concerned with semantics: what it means if we know something is true or false. With inference rules, we are concerned with syntax: what we may conclude, given a particular formula and how formulas are constructed.

This means that when we're considering logical statements, we can talk about truth or proof. This leads us to the following questions. If I can prove something, does that mean it is true? And if something is true, does that mean there has to be a proof for it?

Luckily for us, it turns out that the answer is yes (mostly)! This is a good thing. For propositional logic, this was shown to be true by Emil Post in 1920. Post turns out to be a fairly important logician with several important contributions to computability theory later on, along with people like Turing, Gödel, and Church.

It's not surprising then that this line of questioning turns out to lead us directly to the notion of computability. Since proof rules are really mechanical, the idea is that if we're able to express mathematical theorems formally, we should be able to build a machine that just crunches a bunch of rules to produce a proof for us. Since proof = truth, this means we can figure out whether a given theorem is true or not by sticking it in the machine.

This sounds suspiciously like a computer and we can think of logicians of this era as basically the earliest computer scientists (like modern computer scientists, they appear keen on disrupting the mathematics industry and putting mathematicians out of jobs by replacing them with machines). But the fact that we now have both extremely powerful computers in our pockets and open problems in mathematics is a sign that this original dream may have run into some complications.

We can continue down this line of thinking, which informs our understanding of programs. You see, when we write a program, there are really two things that we are trying to tie together. The first is what we want the program to do, or how it behaves. This is the semantics of the program. The second thing is the actual text of the program that the compiler reads and chews up. This is the syntax of the program.

Strangely enough, it seems that in both logic and programming, we have the same sort of separation between form and behaviour. If you delve deeper into mathematical logic, you'll happen upon a subfield called proof theory, where you can find what's called the Curry-Howard correspondence, which says, roughly, that proofs and programs are really the same thing.

So in this sense, understanding formal proofs and inference rules informs our approach to informal proofs, or "the proofs that you find in mathematics textbooks and papers". Unlike what most people believe, mathematicians are people and they are people who communicate primarily in natural (rather than formal) language. This means that part of proof-writing is understanding how to communicate your ideas and arguments to another person.

There's a famous open problem in number theory called the abc conjecture. Now, the thing is, there is a Japanese mathematician, Shinichi Mochizuki, who claims to have solved it and has produced three volumes' worth of material to do it. The problem is that he is famously withdrawn and not collegial at all, which means the number of people who actually understand the proof remains at 1. From this perspective, can we really say that he's proved it at all, if he can't convince anyone else that he's done it?