CMSC 28000 — Lecture 22

Diagonalization

Now, we move on to the technique that we'll use to show that $A_{TM}$ is undecidable. The crux of the idea has to do with the number of languages that exist compared to the number of Turing machines there are.

This will take us back to some basic set theory. Specifically, we'll be dealing with the size of infnite sets. Georg Cantor was interested in the question of how to compare the sizes of two infinite sets. For instance, if we had the set of natural numbers and the set of all even numbers, what can we say about their sizes? Cantor's solution to this was to show that two sets have the same size if you could pair every element of one set with the other.

Recall the following definitions about functions.

Let $A$ and $B$ be two sets and consider a function $f:A \to B$. We say $f$ is one-to-one or injective if $f(a) \neq f(b)$ whenever $a \neq b$. We say $f$ is onto or surjective if for every $b \in B$, there exists $a \in A$ such that $f(a) = b$. A function that is both one-to-one and onto is a correspondence or bijection.

We say a set $A$ is countable if either it is finite or has the same size as $\mathbb N$.

Our intuitive understanding of infinity leads us to think that infinite sets are all the same size, since they all have infinitely many things in them. So we might expect that we can if we try hard enough, we can always come up with a correspondence. As it turns out, this is not the case and there are infinite sets that are larger than $\mathbb N$. Such a set is said to be uncountable.

The trick that Cantor uses, the diagonal argument, works similarly to the pumping lemma proofs we've seen. Suppose that $\mathbb R$ were countable. Then there's a bijection $f : \mathbb N \to \mathbb R$ and we can list all the elements. We can then use this hypothetical list to construct a real number $x \in \mathbb R$. However, by the way that the real number $x$ is constructed, it is actually impossible that it shows up in the list, which implies that $f$ could not have been a bijection after all.

We'll use the same idea for many of the proofs in the next few classes. We'll start by using it to show that unrecognizable languages exist. But first, here are some theorems about sets of strings.

For every alphabet $\Sigma$, the set of strings $\Sigma^*$ is countable.

The proof of this is not difficult: just take $f(n) = w$, where $w$ is the $n$th string in lexicographic order, for some order on $\Sigma$. In a sense, what we're really doing is we're really just counting in base-$|\Sigma|$.

But what about languages? The set of languages turns out to be uncountable. We'll adapt Cantor's diagonal argument for this proof.

For every alphabet $\Sigma$, the set of languages, $2^{\Sigma^*}$, is uncountable.

Suppose that $2^{\Sigma^*}$ is countable. We will make use of the countability of $\Sigma^*$. Since we assume that $2^{\Sigma^*}$ is countable, there exists a bijection $f : \Sigma^* \to 2^{\Sigma^*}$. In other words, we map each string over $\Sigma$ to a language, or a subset of $\Sigma^*$.

Now, define the language $D$ by \[D = \{w \in \Sigma^* \mid w \not \in f(w)\}.\] Since $D \subseteq \Sigma^*$ is a language and $f$ is a bijection, there exists a string $x \in \Sigma^*$ such that $f(x) = D$. So we can ask ourselves whether $x \in D$. We have \[x \in D \iff x \in f(x),\] since $D = f(x)$. But by definition of $D$, we have \[x \in D \iff x \not \in f(x).\] This gives us \[x \in f(x) \iff x \not \in f(x),\] or, in other words, \[x \in D \iff x \not \in D,\] which is a contradiction. Therefore, $2^{\Sigma^*}$ is not countable.

Why is this called a diagonalization argument? Consider the following table, where we have strings running down, and a list of languages running across. This table tells us, hypothetically, whether a particular string is in a particular language, which is something we can construct if we had $f$. Notice that $D$ is constructed by going along the diagonal: $x \in f(x)$ if and only if $x \in D$. The contradiction becomes clear when we look up the entry for $D$ on $w$, where $f(w) = D$.

$f(\varepsilon)$ $f(0)$ $f(1)$ $f(00)$ $f(01)$ $\cdots$
$\varepsilon$ ✔️️ ✔️️ ✔️️
$0$ ✔️️ ✔️️
$1$ ✔️️ ✔️️ ✔️️
$00$ ✔️️
$01$ ✔️️ ✔️️ ✔️️
$\vdots$

This proof might seem a bit discomforting because of the way that we defined $D$. And you are not alone—many (admittedly unserious) people are so uncomfortable with diagonalization arguments that they dispute that $|\mathbb R| \gt |\mathbb N|$. But even if you are uncomfortable with how we defined $D$, it has to exist! There has to be a language with the words that we picked, and if it's a language, it has to show up in somewhere in our bijection.

We can now use this fact to show that there are languages that are not recognizable. In other words, there are languages that have no Turing machine that recognizes them.

Some languages are not Turing-recognizable.

First, we need to show that the set of Turing machines is countable. Recall that the set of all string $\Sigma^*$ is countable. Then we observe that every Turing machine has a string encoding $\langle M \rangle$, for some fixed encoding. Now, not every string corresponds to a proper encoding of a Turing machine, but we can ignore or skip all of those. Thus, there are countably many Turing machines.

Now, suppose that every language is recognizable. Then there is a Turing machine that recognizes each of these languages. We can define a mapping $f : \Sigma^* \to 2^{\Sigma^*}$ that assigns each Turing machine to the language it recognizes. However, we have showed that the set of all languages is uncountable. Thus, $f$ cannot be a bijection, and in fact, there are strictly more languages than there are Turing machines. This means there must exist some language that is not recognized by a Turing machine.

Undecidability

This might be a bit surprising, but it is also not quite satisfying. After all, who cares if there's some crazy problem out there that's not solvable? What if we only care about "real" problems?

Just to refresh our memory, $A_{TM}$ is defined $$ A_{TM} = \{ \langle M,w \rangle \mid \text{$M$ is a TM and $M$ accepts $w$} \} $$ and we are ready to prove the following result.

$A_{TM}$ is undecidable.

We assume that $A_{TM}$ is decidable and end up showing that it isn't. Suppose we have a TM $H$ that decides $A_{TM}$. Then on $\langle M,w \rangle$, $H$ halts and accepts if $M$ accepts $w$ and it rejects if $M$ does not accept $w$.

We construct a new TM $D$ that takes as input the description of a Turing machine and operates as follows:

  1. On input $\langle M \rangle$, where $M$ is a TM, run $H$ on input $\langle M, \langle M \rangle \rangle$.
  2. If $H$ accepts, then reject; if $H$ rejects, then accept.

Let's consider the result of running each Turing machine on a string of encodings of Turing machines. Essentially, we have \[L(D) = \{\langle M \rangle \in \Sigma^* \mid \langle M \rangle \not \in L(M)\}.\]

Again, we can view this as a table. Suppose we have an enumeration of Turing machines $M_1, M_2, \dots$. Then, we have the list of Turing machine encodings running down and whether the $i$th machine accepts it running across. Now, let's consider the result of $D$ when run on a description of one of these machines: it is basically the opposite of whatever runs along the diagonal.

$M_1$ $M_2$ $M_3$ $M_4$ $M_5$ $\cdots$
$\langle M_1 \rangle$ ✔️️ ✔️️ ✔️️
$\langle M_2 \rangle$ ✔️️ ✔️️
$\langle M_3 \rangle$ ✔️️ ✔️️ ✔️️
$\langle M_4 \rangle$ ✔️️
$\langle M_5 \rangle$ ✔️️ ✔️️ ✔️️
$\vdots$

Now, let us consider the result of $D$ when run on the input string $\langle D \rangle$. We have \[\langle D \rangle \in L(D) \iff \langle D \rangle \not \in L(D).\] That is, $D$ must accept if $D$ does not accept $\langle D \rangle$ and it must reject if $D$ accepts $\langle D \rangle$, which is a contradiction. Thus, neither Turing machines $D$ or $H$ can exist and $A_{TM}$ is undecidable.

This is already a pretty big deal. Essentially, this result says that there's no algorithm that will tell us the outcome of a sufficiently complicated function on some arbitrary input. Since it's recognizable, the best we can do is to simulate it.

However, this result is already quite handy. We can make use of this to show that there are many other problems that aren't decidable either. To start, we'll consider a problem that you've already heard about: Turing's halting problem. The halting problem is defined \[ HALT_{TM} = \{ \langle M,w \rangle \mid \text{$M$ is a TM and $M$ halts on input $w$} \}.\]

$HALT_{TM}$ is undecidable.

Assume that there exists a Turing machine $H$ that decides $HALT_{TM}$. Then we can construct the following Turing machine to decide $A_{TM}$:

  1. On input $\langle M,w \rangle$, where $M$ is a TM and $w$ is a string, run TM $H$ on input $\langle M,w \rangle$.
  2. If $H$ rejects, reject.
  3. If $H$ accepts, simulate $M$ on $w$ until it halts.
  4. If $M$ accepts, then accept; if $M$ rejects, then reject.

So if $H$ decides $HALT_{TM}$, then we can decide $A_{TM}$ with the preceding TM. However, we know that $A_{TM}$ is undecidable, so this is a contradiction and $HALT_{TM}$ must be undecidable.

As it turns out, Turing's original proof of (what we now call) the halting problem is shown directly via a diagonal argument like the proof of the undecidability of $A_{TM}$. It's not too hard to see how we might be able to modify the proof of $A_{TM}$ to work for the halting problem.