CMSC 28000 — Lecture 24

Undecidable problems

We showed last class that $A_{TM}$ is undecidable. This result will be quite handy. We can make use of this to show that there are many other problems that aren't decidable either. To start, we'll consider a problem that you've already heard about: Turing's halting problem. The halting problem is defined \[ HALT_{TM} = \{ \langle M,w \rangle \mid \text{$M$ is a TM and $M$ halts on input $w$} \}.\]

$HALT_{TM}$ is undecidable.

Assume that there exists a Turing machine $H$ that decides $HALT_{TM}$. Then we can construct the following Turing machine to decide $A_{TM}$:

  1. On input $\langle M,w \rangle$, where $M$ is a TM and $w$ is a string, run TM $H$ on input $\langle M,w \rangle$.
  2. If $H$ rejects, reject.
  3. If $H$ accepts, simulate $M$ on $w$ until it halts.
  4. If $M$ accepts, then accept; if $M$ rejects, then reject.

So if $H$ decides $HALT_{TM}$, then we can decide $A_{TM}$ with the preceding TM. However, we know that $A_{TM}$ is undecidable, so this is a contradiction and $HALT_{TM}$ must be undecidable.

As it turns out, Turing's original proof of (what we now call) the halting problem is shown directly via a diagonal argument like the proof of the undecidability of $A_{TM}$. It's not too hard to see how we might be able to modify the proof of $A_{TM}$ to work for the halting problem.

Let's look at some more examples of undecidable languages. To show that these languages are undecidable, we will start using an idea that is fairly central to theoretical computer science: reducibility.

Essentially, this is the idea that you can be clever about using a solution to some other problem you've already solved. For instance, it is fairly labourious to show that a problem is undecidable by setting up a diagonalization argument every time. Instead, we want to be able to make use of results that we've already proved, which in this case would involve reducing a problem to some other problem that we know is already undecidable.

In fact, we already did this to show that the Halting problem was undecidable: instead of constructing a bespoke diagonalization proof, we made use of the fact that we already showed $A_{TM}$ was undecidable. We will continue to use this same strategy. We can go and ask the same questions of TMs as we've been asking about DFAs and CFGs. As you'd probably expect, they are all undecidable. Let's consider the following language \[ E_{TM} = \{ \langle M \rangle \mid \text{$M$ is a TM and $L(M) = \emptyset$} \}. \]

$E_{TM}$ is undecidable.

First, we describe the construction for a machine $M_w$ that is based on some Turing machine $M$ and some string $w$. It operates in the following way:

  1. On input $x$, if $x \neq w$, reject.
  2. If $x = w$, run $M$ on input $w$ and accept if $M$ accepts.

The machine $M_w$ rejects every word that is not $w$. Then the only word that can be accepted is $w$. If $M_w$ doesn't accept $w$, then $L(M_w) = \emptyset$. Thus, $M_w$ recognizes the empty language if and only if it doesn't accept $w$. We will use this construction in our proof.

Now, we assume that there exists a TM $R$ that decides $E_{TM}$ and construct a TM $S$ that decides $A_{TM}$ as follows:

  1. On input $\langle M,w \rangle$, where $M$ is an encoding of a TM and $w$ is an input string, construct $M_w$ as described.
  2. Run $R$ on $\langle M_w \rangle$
  3. If $R$ accepts, reject; if $R$ rejects, accept.

Thus, if $R$ decides $E_{TM}$, then $S$ decides $A_{TM}$. However, we know that $A_{TM}$ is undecidable, so $R$ cannot exist and $E_{TM}$ is undecidable.

This gives us a handy way to show that other decision problems for Turing machines are undecidable. Let's consider the problem of equality between two TMs, specified as \[EQ_{TM} = \{ \langle M_1, M_2 \rangle \mid \text{$M_1,M_2$ are TMs and $L(M_1) = L(M_2)$} \}.\]

$EQ_{TM}$ is undecidable.

This time, we reduce from $E_{TM}$. Suppose we have a TM $R$ that decides $EQ_{TM}$. Then we can construct the following TM to decide $E_{TM}$.

  1. On input $\langle M \rangle$, where $M$ is a TM, run $R$ on $\langle M, N \rangle$, where $N$ is a TM that rejects all inputs.
  2. If $R$ accepts, accept; if $R$ rejects, reject.

Here, $N$ rejects all input so $L(N) = \emptyset$. If we test whether $M$ and $N$ accept the same language, then we're basically testing whether $L(M) = \emptyset$ and now we have a way to decide $E_{TM}$.

Here's another kind of question we can ask of Turing machines. Since they are so powerful, we may want to ask whether or not the language of a TM can be recognized by a less powerful model of computation, like a DFA. Such a language is defined \[ REG_{TM} = \{ \langle M \rangle \mid \text{$M$ is a TM and $L(M)$ is a regular language} \}.\] As it turns out, this is not decidable either.

$REG_{TM}$ is undecidable.

Let $R$ be a TM that decides $REG_{TM}$. Construct a TM $S$ that does the following:

  1. On input $\langle M,w \rangle$, where $M$ is a TM and $w$ is a string, construct the TM $M'$:
    1. On input $x$, if $x$ has the form $0^n 1^n$, accept.
    2. If $x$ doesn't have this form, run $M$ on $w$ and accept if $M$ accepts $w$.
  2. Run $R$ on $\langle M' \rangle$.
  3. If $R$ accepts, accept; if $R$ rejects, reject.

How does this work? The TM $M'$ accepts all strings in $\{0^n 1^n \mid n \geq 0\}$, which we know is context-free. If $M$ doesn't accept $w$, then $L(M') = \{0^n 1^n \mid n \geq 0\}$. If $M$ accepts $w$, then $M'$ also accepts every other string and in this case $L(M') = \Sigma^*$, which is regular. In this way, if there's a way to figure out whether $M'$ recognizes a regular language, then we can come up with a way to decide $A_{TM}$.

At this point, we begin to see a possible pattern emerge: if we want to ask about whether a Turing machine accepts this or that set of strings, we can come up with an easy way to test for membership. Simply construct a Turing machine that collapses to one language or another. Perhaps we can generalize this idea...

Rice's Theorem

So far, it seems like every question we ask about Turing machines is undecidable. As it turns out, this is not just an educated hunch, but a theorem. First, we'll try to formalize the notion of a "property".

A property $P$ of a Turing machine is a semantic property if it depends on the language recognized by $M$ and not the syntactic structure of $M$. In other words, if $L(M_1) = L(M_2)$, then $M_1$ and $M_2$ have the same semantic properties.

A property $P$ can be expressed as a language consisting of exactly the encodings $\langle M \rangle$, where $M$ has property $P$. A semantic property $P$ is said to be non-trivial if there exists a TM $M_1$ such that $\langle M_1 \rangle \in P$ and a TM $M_2$ such that $\langle M_2 \rangle \not\in P$.

Then we have the following theorem.

All non-trivial semantic properties of Turing machines are undecidable.

In other words, for any given Turing machine $M$, we can't decide any properties about $L(M)$ except for properties that are true for either exactly all or exactly none of the languages recognized by Turing machines. The proof is surprisingly simple, since it follows the basic template that we've been using so far.

To show this, we assume that $P$ is a decidable language that satisfies some property and we let $R_P$ be a TM that decides $P$. We will reduce TM membership to property testing.

Let $T_\emptyset$ be a TM that always rejects, so $L(T_\emptyset) = \emptyset$. Without loss of generality, we assume that $\langle T_\emptyset \rangle \not\in P$, since otherwise, we can just use $\overline P$ instead. Since $P$ is non-trivial, there exists a TM $T$ with $\langle T \rangle \in P$.

We construct a Turing machine that decides $A_{TM}$ by being able to distinguish between $T_\emptyset$ and $T$.

  1. On input $\langle M,w \rangle$, construct the TM $M_w$:
    1. On input $x$, simulate $M$ on $w$. If it rejects, reject. If it accepts, go to the next step.
    2. Simulate $T$ on $x$. If it accepts, accept.
  2. Use $R_P$ to determine whether $\langle M_w \rangle \in P$. If YES, accept. If NO, reject.

So $M_w$ simulates $T$ if $M$ accepts $w$. Then $L(M_w) = L(T)$ if $M$ accepts $w$ and $L(M_w) = \emptyset$ otherwise. Thus, $\langle M_w \rangle \in P$ iff $M$ accepts $w$.

Rice's theorem was proved by Rice in 1951 (the result appeared in a journal later in 1953). It is interesting to note that a similar result was discussed by Turing in his original 1936 paper about undecidability.