CMSC 28000 — Lecture 26

Post's Correspondence Problem

Suppose we have a collection of dominoes. A domino is a tile that has a string on the top and a string on the bottom: $\binom{a}{ab}$. A collection of dominoes looks like $$ \left\{ \binom{b}{ca}, \binom{a}{ab}, \binom{ca}{a}, \binom{abc}{c} \right\}.$$ We want to find a match: a sequence of dominoes drawn from our collection, with repetition allowed, in which the concatenation of the top strings equals the concatenation of the bottom strings: $$ \binom{a}{ab} \binom{b}{ca} \binom{ca}{a} \binom{a}{ab} \binom{abc}{c}.$$ Reading across the tops and across the bottoms, both spell out $abcaaabc$.

An instance of the Post Correspondence Problem is a sequence of pairs of strings $$(u_1,v_1),(u_2,v_2),\dots,(u_k,v_k)$$ where $u_i, v_i \in \Sigma^*$ for $i = 1,\dots,k$. A solution of the above instance is a sequence $i_1,i_2,\dots,i_n \in \{1,\dots,k\}$, with $n \geq 1$, such that $$u_{i_1} u_{i_2} \cdots u_{i_n} = v_{i_1} v_{i_2} \cdots v_{i_n}.$$ The Post Correspondence Problem asks whether or not a given instance of PCP has a solution. We define the language of PCP instances that have a solution to be $$L_{PCP} = \{ \langle P \rangle \mid \text{$P$ is a PCP instance that has a solution} \}.$$ Then we have the following result.
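Before seeing why PCP is undecidable, it is instructive to try searching for solutions directly. The following is a brute-force sketch: since PCP is undecidable, no search can terminate on every instance, so this one caps the number of tiles. The key observation it uses is that a partial tile sequence is viable exactly when one side is a prefix of the other, so only the unmatched "overhang" needs to be tracked.

```python
from collections import deque

def pcp_solve(pairs, max_tiles=12):
    """Breadth-first search for a PCP match using at most max_tiles tiles.

    pairs is a list of (top, bottom) strings. The search state is the
    overhang: the suffix of the longer side not yet matched by the other,
    together with which side is ahead. Returns a 1-based index sequence
    forming a match, or None if none is found within the bound.
    """
    def extend(overhang, top_ahead, i):
        # Append tile i; the sequence stays viable only if one side
        # remains a prefix of the other.
        u, v = pairs[i]
        top = (overhang if top_ahead else "") + u
        bot = (overhang if not top_ahead else "") + v
        if top.startswith(bot):
            return top[len(bot):], True
        if bot.startswith(top):
            return bot[len(top):], False
        return None

    queue, seen = deque(), set()
    for i in range(len(pairs)):
        state = extend("", True, i)
        if state is not None and state not in seen:
            seen.add(state)
            queue.append((state, (i + 1,)))
    while queue:
        (overhang, top_ahead), seq = queue.popleft()
        if overhang == "":
            return list(seq)          # both sides spell the same string
        if len(seq) >= max_tiles:
            continue
        for i in range(len(pairs)):
            state = extend(overhang, top_ahead, i)
            if state is not None and state not in seen:
                seen.add(state)
                queue.append((state, seq + (i + 1,)))
    return None
```

On the domino collection above, this finds the match shown earlier: tiles $2, 1, 3, 2, 4$, both sides spelling $abcaaabc$.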

PCP is undecidable.

First, let's talk about how we do this. The idea is to reduce from $A_{TM}$: for any Turing machine $M$ and input string $w$, we can construct an instance of PCP whose matches correspond to accepting computation histories of $M$ on $w$. Thus, if we could decide whether a match exists, we could determine whether $M$ accepts $w$.

Suppose there is a TM $R$ that decides PCP. We construct a TM $S$ that decides $A_{TM}$. Let $M = (Q,\Sigma,\Gamma,\delta,q_0,q_A,q_R)$ be a Turing machine and $w$ an input string. We will have $S$ construct an instance $P$ of PCP that has a match if and only if $M$ accepts $w$. We begin by defining the pairs of a preliminary instance $P'$.

First, we show how to do this with a restricted variant of PCP in which the match is required to begin with a particular starting pair. Later, we will show how to remove this restriction. The starting pair is $$\binom{\#}{\#q_0 w_1 w_2 \cdots w_n \#}.$$ Note that the bottom contains the starting configuration of $M$ on $w$.

Next, for every $a,b \in \Gamma$ and every $q,q' \in Q$ where $q \neq q_R$, if $\delta(q,a) = (q',b,R)$, add the pair $\binom{qa}{bq'}$. These pairs handle transitions for the tape head moving to the right.

For every $a,b,c \in \Gamma$ and every $q,q' \in Q$ where $q \neq q_R$, if $\delta(q,a) = (q',b,L)$, add $\binom{cqa}{q'cb}$. These pairs handle transitions for the tape head moving to the left.

For every $a \in \Gamma$, add $\binom a a$. These pairs correspond to tape contents that aren't near the tape head.

Add $\binom \# \#$ and $\binom{\#}{\Box\#}$. These pairs handle the delimiters between the configurations.

For every $a \in \Gamma$, add $\binom{a q_A}{q_A}$ and $\binom{q_A a}{q_A}$. These pairs handle when the computation reaches the accept state. At that point, the top lags behind the bottom, so these tiles consume the remaining tape symbols and let the top catch up so that we can get the match that we want.

Finally, we add the pair $\binom{q_A\#\#}{\#}$ to finish.
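The tile-building rules above can be collected into a short sketch. This is an illustrative encoding, not the formal construction: it assumes single-character state names and tape symbols so plain string concatenation works, writes the blank $\Box$ as `_`, and takes $\delta$ as a Python dict.

```python
def build_tiles(Gamma, delta, q0, q_accept, q_reject, w):
    """Collect the pairs of the restricted instance P' for M and input w.

    Assumes one-character state names and tape symbols; delta maps
    (q, a) -> (q', b, 'L' or 'R'); the blank symbol is written '_'.
    """
    tiles = [("#", "#" + q0 + w + "#")]           # required starting pair
    for (q, a), (qp, b, d) in delta.items():
        if q == q_reject:
            continue
        if d == "R":
            tiles.append((q + a, b + qp))         # head moves right
        else:
            for c in Gamma:                       # head moves left
                tiles.append((c + q + a, qp + c + b))
    for a in Gamma:
        tiles.append((a, a))                      # copy symbols away from the head
    tiles.append(("#", "#"))                      # configuration delimiter
    tiles.append(("#", "_#"))                     # extend the tape with a blank
    for a in Gamma:
        tiles.append((a + q_accept, q_accept))    # let the top catch up
        tiles.append((q_accept + a, q_accept))
    tiles.append((q_accept + "##", "#"))          # finishing pair
    return tiles
```

For a one-transition machine with $\delta(q,0) = (p,1,R)$, this produces the starting pair $\binom{\#}{\#q0\#}$, the transition tile $\binom{q0}{1p}$, the copy tiles, and the accept-state tiles, exactly mirroring the rules listed above.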

Now, we've constructed an instance $P'$ of PCP under the restriction that the match must begin with the starting pair, so it doesn't quite work the way we want it to yet. Without that restriction, $P'$ has a match regardless of whether $M$ accepts $w$; for instance, the tile $\binom \# \#$ by itself is a match. So we have a bit more work to do.

First, let $w = w_1 w_2 \cdots w_n$ be some string of length $n$ and let $\star$ be a new symbol. Then we define the following strings: $$\star w = \star w_1 \star w_2 \star \cdots \star w_n, \qquad w \star = w_1 \star w_2 \star \cdots \star w_n \star, \qquad \star w \star = \star w_1 \star w_2 \star \cdots \star w_n \star.$$

Now, we convert our restricted PCP instance $P'$ that we constructed into a proper PCP instance $P$. If we have $$ P' = \left\{ \binom{u_1}{v_1}, \binom{u_2}{v_2}, \dots , \binom{u_k}{v_k} \right\},$$ where $\binom{u_1}{v_1} = \binom{\#}{\#q_0 w_1 w_2 \cdots w_n \#}$, then we define $P$ to be the set $$ \left\{ \binom{\star u_1}{\star v_1 \star}, \binom{\star u_1}{v_1 \star}, \binom{\star u_2}{v_2 \star}, \dots, \binom{\star u_k}{v_k \star}, \binom{\star \triangle}{\triangle} \right\}.$$

Then in the PCP instance $P$, the tile $\binom{\star u_1}{\star v_1 \star}$ is the only one that can be the first tile, since it is the only tile whose top and bottom begin with the same symbol, $\star$. The addition of the $\star$s doesn't affect the other tiles, since the $\star$s are simply interleaved into the original strings. The final tile is always going to be $\binom{\star \triangle}{\triangle}$, which takes care of the extra $\star$ left at the end.
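The $\star$-interleaving conversion is mechanical enough to write down directly. A sketch, using `*` for $\star$ and `@` for the end marker $\triangle$ (and assuming, as in the construction above, that every tile side is nonempty):

```python
def pre_star(u):
    """The string ⋆u: a ⋆ before every symbol of u."""
    return "".join("*" + ch for ch in u)

def to_unrestricted(tiles):
    """Convert a restricted PCP instance, whose tiles[0] must begin any
    match, into an ordinary instance via the ⋆-interleaving trick."""
    u1, v1 = tiles[0]
    new = [(pre_star(u1), pre_star(v1) + "*")]           # forced tile ⋆u1 / ⋆v1⋆
    for u, v in tiles:
        new.append((pre_star(u), pre_star(v)[1:] + "*")) # tile ⋆u / v⋆
    new.append(("*@", "@"))                              # absorbs the final ⋆
    return new
```

Note that only the new first tile has matching initial symbols on top and bottom, which is what forces every match to start there.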

Post's Correspondence Problem dates all the way back to 1946, due to Emil Post. While we've shown that it is undecidable via Turing machine simulation, Post's original proof of undecidability uses the string generation systems formulated by Post in 1943, which we've alluded to: the Post canonical systems. The languages generated by such systems are exactly the recognizable languages, so these systems are Turing-complete.

Of course, this isn't just a neat trick to show that there are undecidable problems that look deceptively easy. As with any undecidable problem, it can become another tool in our toolbox to show that a problem is undecidable. For instance, let's consider the problem of intersection emptiness for CFGs, $$ISE_{CFG} = \{ \langle G_1, G_2 \rangle \mid \text{$G_1, G_2$ are CFGs, $L(G_1) \cap L(G_2) = \emptyset$} \}.$$

$ISE_{CFG}$ is undecidable.

Suppose for contradiction that we can decide intersection emptiness for context-free languages. Let $\Sigma = \{a,b\}$ and let $I$ be an arbitrary PCP instance over $\Sigma$, $$(\alpha_1, \beta_1), \dots, (\alpha_k,\beta_k).$$ We define the following languages over the alphabet $\Sigma \cup \{c\} = \{a,b,c\}$: $$L_0 = \{ \alpha_{i_1} \alpha_{i_2} \cdots \alpha_{i_n} c (\beta_{i_1} \beta_{i_2} \cdots \beta_{i_n})^R \mid n \geq 1,\ i_1,\dots,i_n \in \{1,\dots,k\} \}, \qquad L_{mi} = \{ w c w^R \mid w \in \Sigma^* \}.$$

All of these languages are context-free. Then we observe that $L_0 \cap L_{mi} \neq \emptyset$ if and only if $I$ has a solution. Since $L_0$ and $L_{mi}$ are context-free, if we could decide emptiness of the intersection of context-free languages, we could also decide whether or not an arbitrary instance of PCP has a solution, a contradiction.
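A small sanity check of the key equivalence, assuming the standard choices $L_0 = \{\alpha_{i_1}\cdots\alpha_{i_n} c (\beta_{i_1}\cdots\beta_{i_n})^R \mid n \geq 1\}$ and the mirror language $L_{mi} = \{wcw^R\}$: an index sequence solves the PCP instance exactly when its $L_0$ word also lies in $L_{mi}$.

```python
def l0_word(alphas, betas, seq):
    """The L0 word for the 1-based index sequence seq:
    alpha_{i1}...alpha_{in}, then c, then beta_{i1}...beta_{in} reversed."""
    top = "".join(alphas[i - 1] for i in seq)
    bot = "".join(betas[i - 1] for i in seq)
    return top + "c" + bot[::-1]

def in_lmi(word):
    """Membership in the mirror language {w c w^R : w over {a,b}}."""
    if word.count("c") != 1:
        return False
    x, y = word.split("c")
    return x == y[::-1]
```

For the instance $(\alpha_1,\beta_1) = (ab, a)$, $(\alpha_2,\beta_2) = (b, bb)$, the sequence $1,2$ is a solution (both sides spell $abb$), and indeed its $L_0$ word $abb\,c\,bba$ is a palindrome around the center $c$.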

Reduction

Now we'll formalize the notion of reducibility and reduction. First, we'll need to revisit the notion of Turing machines and functions.

While our basic definition of a Turing machine is a language recognition device, we can modify it so that it becomes a computing device, which seems very relevant and practical. Instead of having accept and reject states, we define our machine to have a single halting state. Then we have the following notion.

A function $f: \Sigma^* \to \Sigma^*$ is a computable function if some Turing machine $M$, on every input $w$, halts with only $f(w)$ left on its tape.

As usual, there are multiple equivalent definitions for the specifics here. For instance, the choice of input alphabet is up to us, so if we want to compute, say, integer functions, then we can take our input and output to be unary or binary or something else. This means that a machine computing the function $f(n) = 2n$ can be implemented as a machine that takes $1^n$ as input and computes $1^{2n}$ in unary, or it could be something that does it in binary, like $(n)_2 \mapsto (n)_2 0 = (2n)_2$.
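Both representations of $f(n) = 2n$ amount to simple string transformations, which is exactly the kind of rewriting a Turing machine performs. A Python sketch of the two (ordinary code, not a TM):

```python
def double_unary(s):
    """f(n) = 2n on unary input: maps 1^n to 1^(2n)."""
    return s + s

def double_binary(s):
    """f(n) = 2n on binary input: append a 0, i.e., shift left."""
    return s + "0"
```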

This notion is useful for a few things. First, it gives a very obvious way for us to connect Turing machines with the concept of recursive functions. Secondly, it gives us a way to think about data transformation. Thirdly, we'll be using it in the formal definition of reducibility, and it shows up again for various definitions in complexity theory.

But is this machine equivalent to our basic model? Sure, it's not hard to see that we can designate something like an output tape for our output to appear on and then we just hit the accept state once we're done computing our function. Similarly, we can always "simulate" the notion of accept and reject states; after all, languages are just decision problems, which are just functions from $\Sigma^*$ to $\{0,1\}$, so we output a 1 for accept and output a 0 for reject.

For our current purposes, we are more interested in transforming various kinds of input strings into other input strings. For instance, we can take as input some description of a Turing machine $\langle M \rangle$ and transform it into some other machine $\langle M' \rangle$. This is something that we've been implicitly doing when we assume that various machines can be constructed.

A language $A$ is reducible to a language $B$, denoted $A \leq B$, if there is a computable function $f:\Sigma^* \to \Sigma^*$, where for every $w$, $$ w \in A \iff f(w) \in B.$$ The function $f$ is called the reduction from $A$ to $B$. The reduction is a formal way of specifying a transformation of an input string for a particular problem to an input string for some other problem.

Let's consider the proof that $HALT_{TM}$ is undecidable. To show an explicit reduction, we need to show that there exists a computable function $f$ that transforms strings $\langle M,w \rangle$ into strings $\langle M',w' \rangle$ where $$ \langle M,w \rangle \in A_{TM} \iff \langle M',w' \rangle \in HALT_{TM}.$$ To do this, we simply show that there is a TM that computes an appropriate function $f$:

  1. On input $\langle M,w \rangle$, construct the following machine $M'$:
    1. On input $x$, run $M$ on $x$.
    2. If $M$ accepts, accept; if $M$ rejects, enter a loop.
  2. Output $\langle M',w \rangle$.
$\tag*{$\Box$}$
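The construction above can be mimicked in ordinary code. Here Turing machines are modeled loosely as Python functions that return True (accept), return False (reject), or never return (loop); this is an analogy, not an encoding of descriptions $\langle M \rangle$, and the name `f` is just the reduction from the proof.

```python
def f(M, w):
    """The reduction f(<M, w>) = <M', w>: M' accepts x iff M accepts x,
    and loops forever whenever M rejects x."""
    def M_prime(x):
        if M(x):
            return True       # M accepts x: M' halts (and accepts)
        while True:           # M rejects x: M' deliberately loops
            pass
    return M_prime, w

# Usage: M' halts on w if and only if M accepts w.
```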

We can conclude that the proofs we've done so far are examples of reduction. However, there is one hitch. If we look closely at the proof that $E_{TM}$ is undecidable, we see that it deviates from our definition a bit: notice that our TM for $A_{TM}$ accepted whenever the decider for $E_{TM}$ rejected our machine. This means that what we actually showed was $A_{TM} \leq \overline{E_{TM}}$; that is, we reduced $A_{TM}$ to non-emptiness for Turing machines. This is fine, since it still implies that $E_{TM}$ is undecidable, but there are some consequences from what our reductions tell us. First, we have the following observations.

Let $A$ and $B$ be languages.

  1. If $A \leq B$ and $B$ is decidable, then $A$ is decidable.
  2. If $A \leq B$ and $A$ is undecidable, then $B$ is undecidable.
  3. If $A \leq B$ and $B$ is recognizable, then $A$ is recognizable.
  4. If $A \leq B$ and $A$ is unrecognizable, then $B$ is unrecognizable.

These are natural consequences of the reduction relation and tell us something about the relative hardness of the problems $A$ and $B$. Next, we observe that $E_{TM}$ is what we call a co-recognizable language.

We say that a language is co-recognizable if it is the complement of a recognizable language.

A language is decidable if and only if it is recognizable and co-recognizable.

If $A$ is decidable, then so is $\overline A$, and every decidable language is recognizable; hence $A$ and $\overline A$ are both recognizable.

If $A$ and $\overline A$ are recognizable, then there exist Turing machines $M_1$ and $M_2$ that recognize $A$ and $\overline A$, respectively. Then we can construct a Turing machine $M_3$ that on input $w$ operates as follows:

  1. Set $t = 1$.
  2. Run $M_1$ on $w$ for $t$ steps. If $M_1$ has accepted within $t$ steps, accept.
  3. Run $M_2$ on $w$ for $t$ steps. If $M_2$ has accepted within $t$ steps, reject.
  4. Set $t = t+1$ and go to step 2.

Since $A \cup \overline A = \Sigma^*$, every input $w$ belongs to one of the two languages, so one of $M_1$ or $M_2$ must accept $w$, which means that $M_3$ is guaranteed to halt. This means that $A$ is decidable.
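The alternating bounded simulation in the proof can be sketched directly. Here each recognizer is modeled as a bounded-simulation function `accepts_X(w, t)`, true exactly when the machine accepts $w$ within $t$ steps; this modeling is an assumption of the sketch, not part of the formal proof.

```python
from itertools import count

def decide(accepts_A, accepts_coA, w):
    """Decide membership in A by dovetailing recognizers for A and
    its complement, increasing the step bound t until one accepts."""
    for t in count(1):
        if accepts_A(w, t):
            return True       # w is in A
        if accepts_coA(w, t):
            return False      # w is in the complement of A
```

Because every $w$ lies in $A$ or $\overline A$, one of the two checks eventually succeeds, so the loop always terminates.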

Note that the TM $M_3$ that we constructed is a deterministic machine and uses a neat trick to avoid falling down a non-halting rabbit hole, which is the danger whenever you're dealing with recognizable languages. We simply bound the computation and systematically lengthen the bound until we get an answer, which we are guaranteed to do since every input must belong to one of the two languages. But if we used a nondeterministic machine, the construction would be even easier: just run the two machines in parallel until one accepts.

This result gives us a clear example of an unrecognizable language.

$\overline{A_{TM}}$ is not recognizable.

If $\overline{A_{TM}}$ were recognizable, then since $A_{TM}$ is recognizable, $A_{TM}$ would be decidable, a contradiction.

But from this, we can also show that $E_{TM}$ is unrecognizable. First, we establish that $A_{TM}$ does not reduce to $E_{TM}$.

$A_{TM}$ is not reducible to $E_{TM}$.

Suppose that we did have a reduction $f$ witnessing $A_{TM} \leq E_{TM}$. Then by definition the same function $f$ witnesses $\overline{A_{TM}} \leq \overline{E_{TM}}$. However, $\overline{E_{TM}}$ is recognizable, as we will show below, while $\overline{A_{TM}}$ is unrecognizable, contradicting the third observation above.

Now, note that this says nothing about whether or not $E_{TM}$ or $\overline{E_{TM}}$ is recognizable. It could be that one of them is recognizable (in which case the other isn't) or both could be unrecognizable. The answer turns out to be the former.

$\overline{E_{TM}}$ is recognizable.

Recall that $\overline{E_{TM}}$ is the language of Turing machines that accept at least one word. Therefore, all we need to do is enumerate every word and check whether $M$ accepts it. There is a slight problem with this: what if we encounter a word on which $M$ runs forever? Then we can never move on to the next word. So we need to be a bit more clever.

We define a Turing machine that does the following:

  1. On input $\langle M \rangle$,
  2. For $i \in \mathbb N$,
    1. For each $x \in \Sigma^{\leq i}$,
      1. Run $M$ on $x$ for up to $i$ steps.
      2. If $x$ is accepted by $M$ within $i$ steps, then accept. Otherwise, repeat the loop with the next string.
    2. If no strings of length up to $i$ are accepted within $i$ steps, increase $i$ by one.

If no word is accepted by $M$, this machine will run forever. If there is a word $w \in L(M)$, then $w$ has some finite length $n$ and is accepted within some finite number of steps $t$. This means that the machine will detect that $w$ is accepted once $i \geq \max(n, t)$.
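The dovetailing loop above can be sketched as follows. As before, the bounded simulation is modeled by an assumed function `accepts_within(x, i)`, true exactly when $M$ accepts $x$ within $i$ steps.

```python
from itertools import count, product

def find_accepted(accepts_within, alphabet="ab"):
    """Semi-decide non-emptiness: search for some string M accepts.
    Runs forever when L(M) is empty, as a recognizer is allowed to."""
    for i in count(1):
        # Try every string of length at most i for up to i steps.
        for n in range(i + 1):
            for tup in product(alphabet, repeat=n):
                x = "".join(tup)
                if accepts_within(x, i):
                    return x
```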

An alternative proof uses nondeterministic Turing machines: the NTM nondeterministically guesses a string $w$ and simulates $M$ on it. If $w$ is accepted, then this means at least one branch of computation on the NTM is an accepting branch and the word is accepted. Otherwise, none of the branches are accepting branches.

This clearly gives us the following.

$E_{TM}$ is unrecognizable.

Note that unrecognizability is even harder than undecidability. Recall that a language can be recognizable but undecidable: these are problems that can be solved algorithmically, as long as we're not worried about getting a negative answer. But an unrecognizable problem doesn't even have that.

What is the difference between the two? If we examine each of our problems, membership and emptiness, and look at their complements, non-membership and non-emptiness, we get a hint. Membership and non-emptiness are really existential search problems: a search for an accepting computation path or a search for a string that will be accepted. But emptiness and non-membership are universal search problems: a search to ensure no strings are in the language or that there are no possible accepting computation paths. There's a bit of an analogue with problems in NP and co-NP, where one involves an existential search for a certificate while the other involves a universal search for certificates.