CMSC 27100 — Lecture 2

Predicate logic

Now, one thing you might notice is that propositions as we've defined them don't quite let us say everything we may want to. For instance, if I wanted to talk about some property about integers, I can certainly define propositions for them. However, there's no way to relate the various propositions about integers to the fact that we're talking about the same kinds of objects. For that, we need to expand our language a bit further. The extension that we'll be talking about is called predicate or first-order logic.

First, we need to determine a domain of discourse, which is the set to which all of our objects belong. This can be something like the natural numbers $\mathbb N$ or the integers $\mathbb Z$ or matrices or functions or graphs and so on. There's also nothing stopping us from making our domain the set of all dogs, if what we really wanted to do was express statements about dogs, or other "real-world" types of objects. This occurs less in math but more in places like database theory.

This brings us to an important thing to keep in mind: statements that may be true in one domain will not necessarily be true in another domain. Obviously, statements that we make about integers will not necessarily hold true for dogs, but it is important to remember that the same could be said for integers vs. natural numbers.

Definition 2.1. Here are some useful domains/sets: the natural numbers $\mathbb N = \{0, 1, 2, 3, \dots\}$, the integers $\mathbb Z = \{\dots, -2, -1, 0, 1, 2, \dots\}$, the rational numbers $\mathbb Q$, and the real numbers $\mathbb R$.

Note that in this course, we consider 0 to be a natural number. Not everyone agrees!

Then, we want to express properties of and relationships between the objects in our domain. For this, we need to define predicates. Every predicate has an arity associated with it: a $k$-ary predicate is a predicate with $k$ arguments. So a unary (1-ary) predicate expresses some property of a single object, while a predicate with arity greater than 1 can be thought of as expressing a relationship between several objects.

Example 2.2. For example, if we define $E$ to be a unary predicate that expresses the evenness property, then we can say that $E(x)$ means "$x$ is an even number". We would then have $E(12000)$ is true while $E(765)$ is false.

The less-than relation is another example of a predicate, although because it's pretty important, we've given it its own symbol and usage. We can define a predicate $L(x,y)$ to mean "$x$ is less than $y$" or "$x \lt y$". Then we can say that $L(3,30)$ is true and $L(30,3)$ is false.
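
If you like to experiment, predicates over the integers can be modeled informally as Boolean-valued functions. Here is a small Python sketch of $E$ and $L$ (the function definitions are just our own illustrative stand-ins, not part of the logic itself):

```python
def E(x: int) -> bool:
    """Unary predicate: 'x is an even number'."""
    return x % 2 == 0

def L(x: int, y: int) -> bool:
    """Binary predicate: 'x is less than y'."""
    return x < y

print(E(12000), E(765))    # True False
print(L(3, 30), L(30, 3))  # True False
```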

In the previous examples, almost instinctively, we used both concrete objects and variables with our predicates. When we use specific objects with predicates, our statements are not all that different from propositions. However, variables are considered placeholders for objects in the domain and it's the use of variables, together with quantifiers, that gives us more expressive power than propositional logic.

Definition 2.3. The symbol $\forall$ is called the universal quantifier and is read "for all". The symbol $\exists$ is called the existential quantifier and is read "there exists".

Basically, we use $\forall$ when we want to make a statement about all objects in our domain and we use $\exists$ when we want to say that there is some object out there in our domain that has a particular property.

Example 2.4. We can use predicates and quantifiers together with the logical connectives we introduced with propositional logic to say things like $\forall x (E(x) \rightarrow \neg E(x+1))$. If we take our domain to be $\mathbb Z$, this says that for every integer $x$, if $x$ is even, then $x+1$ is not even.
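
We obviously cannot check all of $\mathbb Z$ by computer, but we can at least sanity-check such a statement on a finite sample, using the fact that $p \rightarrow q$ is equivalent to $\neg p \vee q$. A minimal sketch (the sample range is an arbitrary choice):

```python
def E(x: int) -> bool:
    return x % 2 == 0

# Check E(x) -> not E(x+1), written as (not E(x)) or (not E(x+1)), over a finite sample.
sample = range(-1000, 1000)
print(all((not E(x)) or (not E(x + 1)) for x in sample))  # True
```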

Example 2.5. Let's fix our domain to be $\mathbb N$. Recalling our less than predicate $L$ from above, we can consider the following proposition: $$\forall x \exists y L(x,y).$$ This reads "for every natural number $x$, there exists a natural number $y$ such that $x \lt y$". This statement turns out to be true: every natural number does have some other natural number that it's less than. Now, let's consider the following similar statement, where the order of the quantifiers is reversed: $$\exists y \forall x L(x,y).$$ Here, we're saying that there is a natural number $y$ for which every other natural number $x$ is less than it. In other words, this says that there is a "biggest" natural number, which is obviously not true. As we can see, the order of the quantifiers matters.
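
On a computer, nested all(...) and any(...) calls mirror nested quantifiers, with the outermost quantifier becoming the outermost call. Truncating $\mathbb N$ to a finite slice would change the truth values of the two statements above (a finite slice has a largest element), so the sketch below deliberately uses a different, hand-picked predicate, $P(x,y)$ meaning "$x + y$ is even", for which the effect of swapping the quantifiers is visible even on a small finite domain:

```python
domain = range(10)

def P(x: int, y: int) -> bool:
    """Illustrative predicate: 'x + y is even'."""
    return (x + y) % 2 == 0

# For all x there exists y with P(x, y): pick y = x, so this is True.
print(all(any(P(x, y) for y in domain) for x in domain))  # True

# There exists y such that for all x, P(x, y): no single y matches the parity
# of every x, so this is False.  Swapping the quantifiers changed the answer.
print(any(all(P(x, y) for x in domain) for y in domain))  # False
```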

Example 2.6. Remember that the domain is important. For instance, if we are working in $\mathbb N$ and we modify the above example slightly to $\forall x \exists y L(y,x)$, we have the statement that "for every natural number $x$, there is some natural number $y$ that is less than $x$". This statement is false if we consider $x = 0$. However, this statement is true if our domain is $\mathbb Z$ (for every integer $x$, there is some integer $y$ that is less than $x$).
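
We can also see the counterexample $x = 0$ computationally. A finite slice of $\mathbb N$ is enough, since the failure at $0$ is genuine rather than an artifact of cutting the domain off (the cutoff below is arbitrary):

```python
def L(x: int, y: int) -> bool:
    return x < y

naturals = range(100)  # a finite slice of the natural numbers

# For all x there exists y with y < x: fails, because of x = 0.
print(all(any(L(y, x) for y in naturals) for x in naturals))  # False
# The culprit: no natural number is less than 0.
print(any(L(y, 0) for y in naturals))                         # False
```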

Something we have just run into is the negation of a quantified statement. We have the following logical equivalences:

Theorem 2.7. \begin{align*} \neg \forall x \varphi &\iff \exists x \neg \varphi \\ \neg \exists x \varphi &\iff \forall x \neg \varphi \\ \end{align*}

We can consider this to be a generalized version of De Morgan's laws for quantifiers.

Let's take a moment to think about what this is saying.

Example 2.8. Taking our example $\forall x (E(x) \rightarrow \neg E(x+1))$, we can think about what the negation of this statement might be. Of course, we can begin by slapping on a negation: $$\neg \forall x (E(x) \rightarrow \neg E(x+1)).$$ Is there anything more we can do to clarify what this statement means, though? We can make use of the negation of quantified statements that was just mentioned: $$\exists x \neg(E(x) \rightarrow \neg E(x+1)).$$ From here, we can make use of the propositional equivalences we saw earlier. Applying the implication equivalence gets us $$\exists x (\neg (\neg E(x) \vee \neg E(x+1))).$$ Now we can apply De Morgan's laws and eliminate the resulting double negations to get $$\exists x (E(x) \wedge E(x+1)).$$ This statement says that there exists an integer $x$ such that $x$ is even and $x+1$ is even.
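
As a quick sanity check, we can confirm on a finite slice of $\mathbb Z$ that the statement we started with and the simplified form we ended with agree (both come out false, as they should, since the original un-negated statement is true). A small sketch with an arbitrary sample range:

```python
def E(x: int) -> bool:
    return x % 2 == 0

sample = range(-100, 100)
negated_original = not all((not E(x)) or (not E(x + 1)) for x in sample)
simplified = any(E(x) and E(x + 1) for x in sample)
print(negated_original, simplified)  # False False -- the two forms agree
```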

The final statement we arrived at makes it much clearer what the negation of our original statement actually means. Of course, since the example was so simple, we could probably have guessed its meaning without going through the formal process of applying these equivalences. However, if you're faced with proving some proposition, it is helpful to methodically translate the statement into formal logic and work out what it is saying, so that it becomes clearer what it is you need to prove.

Speaking of proofs, we can carefully define new inference rules for predicate logic so that, together with our rules from propositional logic, we have a proof system for predicate logic too. Then we can ask the same question of whether our new proof system coincides with the semantic notion of truth in first-order logic. This is much less obvious than in propositional logic, where we simply needed to figure out whether a proposition was true or not. In first-order logic, we also have to think about the domain and how the predicates are defined.

In other words, in order for a first-order sentence to be "true", it has to be true for every single interpretation (domain, definitions of predicates, etc.). In propositional logic, each row of a truth table is considered an "interpretation" and we only had as many as $2^n$ of them for $n$ propositional variables; i.e. it was finite. For predicate logic, the number of potential interpretations is infinite. Luckily, Kurt Gödel proved in 1930 that there is a proof system that coincides with semantic truth for predicate logic. This is called the Completeness Theorem.

Now, just because Gödel showed that there's a proof out there for every "true" formula doesn't mean that we necessarily have a way of finding it. This is the question that Hilbert asked in 1928: is there a way to "algorithmically" figure out whether a first-order sentence is true or not (although he wouldn't have used quite the same terminology)? If the answer is yes, this process would basically give us a proof of the validity of that sentence. But the answer is no; to prove this, Alan Turing basically invented the notion of a computer (mathematically) in 1936.

You might wonder why predicate logic is called first-order logic and whether that means there is a second-order logic. Propositional logic does not allow for any quantification; first-order logic extends it by adding quantification over elements of the domain. Second-order logic extends first-order logic by adding quantification over predicates. And of course, you can guess that mathematicians have continued in this pattern since.

Set theory

This book, like almost every other modern mathematics book, develops its subject matter assuming a knowledge of elementary set theory. This assumption is often not justified.

Now that we've defined the language of first-order logic, we can move on to talking about set theory. Modern mathematics happens to be built on the idea that everything is a set. Everything. Functions? Sets. Numbers? Sets. Problems? Sets. Sets? Sets. So, it's important to familiarize ourselves with the basics of sets and how they work.

Definition 2.9. A set is an unordered collection of objects. If $S$ is a set and $a$ is a member of $S$, then we write $a \in S$. Similarly, if $a$ is not a member of $S$, then we write $a \not \in S$.

Here, objects is used loosely. Often, we'll just stick to numbers, but these can really be anything: books, trees, other sets, or anything else.

Note that since sets are unordered, this means that $\{1,2,3\}$ and $\{3,2,1\}$ are the same set. Furthermore, since an element is either a member or not a member of a set, each element occurs in a set at most once. This means that $\{2,1,3,2\}$ is really just $\{1,2,3\}$.
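
Python's built-in sets behave the same way, if you want to see this in action:

```python
print({1, 2, 3} == {3, 2, 1})      # True -- order doesn't matter
print({2, 1, 3, 2} == {1, 2, 3})   # True -- repeated elements collapse
```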

Definition 2.10. The set containing no elements is the empty set and is denoted by $\emptyset = \{\}$.

In light of the above remark about objects in sets, note that $\{\emptyset\}$ is not the same as $\emptyset = \{\}$. The set $\{\emptyset\}$ is actually the set containing the empty set, which means it is not empty; it contains one element: the empty set.
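
We can see the difference concretely with a short sketch (Python needs frozenset here, since only immutable sets can be elements of other sets):

```python
empty = frozenset()            # the empty set
contains_empty = {empty}       # the set containing the empty set

print(len(empty), len(contains_empty))  # 0 1
print(empty == contains_empty)          # False -- they are different sets
```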

There are two common ways to define sets. First, we can just list the elements of the set, like $\{1, 2, 3\}$. This is easy to do for finite sets. For infinite sets, we can do something like $\{1, 2, 3, \dots\}$ and usually, that's enough for us to understand that this is the set containing all natural numbers.

For more complicated definitions, we define sets using comprehensions, also called set-builder notation.

Example 2.11. Similar to what we did above, if we wanted to define the set of even natural numbers, we can write something like $\{2,4,6,\dots\}$. Using comprehensions, we can also write $$\{n \in \mathbb N \mid \exists m \in \mathbb N, n = 2 \cdot m \},$$ which says that we want the set of all natural numbers $n$ such that $n$ can be expressed as $2 \cdot m$ for some natural number $m$. That, of course, turns out to be the even natural numbers.

We can do more complicated things with this, since there aren't very many hard rules about what kinds of conditions we can or cannot place on our sets. For instance, we can define the set $$\{n \in \mathbb N \mid \exists m \in \mathbb N, (2 \cdot m = n) \wedge (\forall k \in \mathbb N, n \neq k^2) \}.$$ Sometimes, we don't even bother with that level of formality and just write something like $$\{n \in \mathbb N \mid \text{$n$ is even and not a square}\}.$$ This means that we have a lot of power in how we want to define our sets. Without any further restrictions on what we can say when defining sets, this puts us in the realm of naive set theory. This system was devised by Georg Cantor and, later, by Gottlob Frege as part of his quest to formalize a foundation for mathematics.
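
Set-builder notation maps almost directly onto set comprehensions in a language like Python, as long as we restrict ourselves to a finite slice of $\mathbb N$ (the cutoff below is an arbitrary choice):

```python
N = range(50)  # a finite stand-in for the natural numbers

# {n in N | exists m in N, n = 2*m}
evens = {n for n in N if any(n == 2 * m for m in N)}

# {n in N | (exists m in N, 2*m = n) and (for all k in N, n != k^2)}
evens_not_square = {n for n in N
                    if any(2 * m == n for m in N)
                    and all(n != k * k for k in N)}

print(sorted(evens)[:6])             # [0, 2, 4, 6, 8, 10]
print(sorted(evens_not_square)[:6])  # [2, 6, 8, 10, 12, 14]
```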

However, we can get into trouble if we're not careful about how we define sets. Bertrand Russell noticed this problem, leading to the following famous example: $$S = \{T \mid T \not \in T\}.$$ If we let something like this happen, then we get to conclude $S \in S \iff S \not \in S$, which is very bad if we want mathematics to be consistent (when Russell communicated this to Frege, Frege was quite spooked). This led to the development of axiomatic set theory. For our purposes, we won't need to delve into axiomatic set theory, but it is important to note that when we talk about set theory being the basis of all mathematics, what we really mean is axiomatic set theory.

Definition 2.12. Two sets $A$ and $B$ are equal if and only if they have the same elements. That is, $$A = B \iff (\forall x, x \in A \iff x \in B).$$

Definition 2.13. A set $S$ is a subset of $T$, written $S \subseteq T$, when every element of $S$ belongs to $T$. A set $S$ is a proper subset of a set $T$, written $S \subsetneq T$, when $S$ is a subset of $T$ and there exists an element of $T$ which does not belong to $S$.

You may notice that sometimes $\subset$ is used for proper subset. This works quite nicely alongside $\subseteq$ meaning subset. However, we'll avoid this particular notation because many other mathematicians use $\subset$ to mean a (not necessarily proper) subset. Instead, we will use $\subseteq$ and $\subsetneq$ to keep things clear.

From the above, we note that by definition, we have $S \subseteq S$, and if $S = T$, then $S \subseteq T$ and $T \subseteq S$. Conversely, if $S \subseteq T$ and $T \subseteq S$, then $S = T$. This gives us an alternate characterization of set equality.

Definition 2.14. The cardinality of a set $S$ is the number of elements in $S$ and is denoted $|S|$. If $S$ is finite, then this will be a natural number. So, the size of the set $\{1,2,4,8\}$ would be $|\{1,2,4,8\}| = 4$.
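
All of these notions have direct counterparts for finite sets in Python, which is handy for checking small examples (here <= is subset, < is proper subset, and len gives the cardinality):

```python
S = {1, 2, 4, 8}
T = {1, 2, 3, 4, 8}

print(S <= T, S < T)      # True True  -- S is a subset and a proper subset of T
print(S <= S, S < S)      # True False -- a set is a subset, but not a proper subset, of itself
print(S <= T and T <= S)  # False -- mutual inclusion would mean S = T
print(len(S))             # 4 -- the cardinality |{1, 2, 4, 8}|
```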

If $S$ is an infinite set, then things are a bit trickier. The cardinality of the natural numbers is defined to be $|\mathbb N| = \aleph_0$, while the cardinality of the real numbers is $|\mathbb R| = 2^{\aleph_0}$. Here, we reach into the Hebrew alphabet for $\aleph$ (aleph). The cardinalities of $\mathbb N$ and $\mathbb R$ are not the same, a fact famously proved by Cantor in 1891 with his diagonal argument. There are many infinite sets that have cardinality $\aleph_0$, such as the set of even natural numbers $\{n \mid \exists m \in \mathbb N, n = 2\cdot m\}$ or the set of rational numbers $\mathbb Q = \{\frac m n \mid m, n \in \mathbb Z, n \neq 0\}$. There are also many sets with cardinality $2^{\aleph_0}$. This brings us to a fun problem for you to think about in your spare time: are there any infinite sets that have cardinality strictly between $|\mathbb N| = \aleph_0$ and $|\mathbb R| = 2^{\aleph_0}$?

Definition 2.15. The power set of a set $A$ is the set containing all of the subsets of $A$, $$\mathcal P(A) = \{S \mid S \subseteq A\}.$$

Theorem 2.16. If $A$ is a finite set, then $|\mathcal P(A)| = 2^{|A|}$.

We will prove this later, once we learn about mathematical induction. However, because of this fact, you'll sometimes see the power set of a set $A$ denoted by $2^A$. This also gives you some insight about why it might be that $|\mathbb R| = 2^{\aleph_0}$.
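
For finite sets we can simply build the power set and confirm the count on small examples. A short sketch (the helper power_set is our own, built on itertools.combinations):

```python
from itertools import combinations

def power_set(A):
    """Return a list of all subsets of the finite set A."""
    elems = list(A)
    return [set(c) for r in range(len(elems) + 1) for c in combinations(elems, r)]

A = {1, 2, 3}
subsets = power_set(A)
print(len(subsets), 2 ** len(A))  # 8 8
```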

Recall that sets are unordered collections and don't contain multiples of elements. But what if order does matter? We need to define another structure to work with.

Definition 2.17. An $n$-tuple $(a_1, a_2, \dots, a_n)$ is an ordered collection that has $a_1$ as its first element, $a_2$ as its second element, $\dots$, and $a_n$ as its $n$th element. An ordered pair is a 2-tuple.

Observe that since tuples are ordered, we have $(a_1, \dots, a_n) = (b_1, \dots, b_n)$ if and only if $a_i = b_i$ for $i = 1, \dots, n$.

As a side note, we claimed earlier that everything in mathematics is just some kind of set, so how would we define tuples using sets? To keep things simple, we'll consider the definition of the pair first. We can define $(x,y)$ to be the set $\{\{x\}, \{x,y\}\}$. In this way, we can distinguish $x$ and $y$ and we have a way to determine the order in which they appear. This definition is due to Kuratowski in 1921. We can then generalize this definition for arity $n \gt 2$.
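
Here is a short sketch of Kuratowski's encoding in Python, using frozensets so that the resulting sets can themselves be elements of sets; it shows that the encoding really does distinguish $(1,2)$ from $(2,1)$ (the helper kpair is our own name):

```python
def kpair(x, y):
    """Kuratowski pair: (x, y) encoded as {{x}, {x, y}}."""
    return frozenset({frozenset({x}), frozenset({x, y})})

print(kpair(1, 2) == kpair(1, 2))  # True  -- the same pair
print(kpair(1, 2) == kpair(2, 1))  # False -- order is recoverable from the encoding
```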