CMSC 28000 — Lecture 6

The power of nondeterminism

To play around with nondeterminism a bit more, let's think back to one of the more exotic language operations I mentioned earlier, the proportional removal $\frac 1 2 L$. We'll prove something about $L \frac 1 2$, the language of the second halves of words from $L$, instead. Proportional removals were studied by Stearns and Hartmanis in the early 60s.

Let $A$ be an DFA. Then there is an NFA that recognizes the language $$L(A) \frac 1 2 = \{y \in \Sigma^* \mid xy \in L(A), |x| = |y|\}.$$

Let's think about the approach roughly first. We need to check if our input word $y$ is the latter half of some word in our language $L(A)$. This means that there's some state $p$ of our DFA $A$ from which we can reach a final state on $y$. If $xy \in L(A)$, there will be a path that looks something like \[q_0 \xrightarrow{x} p \xrightarrow{y} q \in F.\] However, we will also need to check that $y$ is the correct length (half of a word) by checking that there's a path of length $|y|$ from the initial state of $A$ to $p$.

It's not clear if this is even possible with a DFA. However, with an NFA, we can simply guess all the possibilities. First, we guess what $p$ could've been. Then, we run our machine in parallel, by starting to process $y$ from the state $p$ that we guessed, and by guessing a path starting from the initial state $q_0$. If we reach a final state on $y$ and find a path from $q_0$ to $p$ of the same length, then we accept.

Here's how we can imagine a machine does this: use triples of states $\langle p,q,r \rangle$. The machine guesses $p$ at the beginning and it doesn't change, so it remembers which one was guessed. The state $q$ is the state that the machine ends up in by guessing a prefix starting from the initial state $q_0$. The state $r$ is the state that the machine ends up in by processing $y$ and starting on $p$.

Let's define the construction formally.

Let $A = (Q,\Sigma,\delta,q_0,F)$ be a DFA. We will construct an NFA $A' = (Q',\Sigma,\delta',I',F')$, where

This construction formalizes our reasoning from before the start of the proof. What's left to do is to argue that this NFA accepts the correct language.

Suppose that our input string $w$ is accepted by the NFA. So $w \in L(A')$ and therefore $\delta'(I',w) \cap F' \neq \emptyset$. So there is some state, say $\langle p,q_0,p \rangle \in I'$ such that there is a string $x$ with $|x| = |w|$, $\delta(q_0,x) = p$, and $\delta(p,w) \in F$. But this means that $\delta(q_0,xw) \in F$, so $xw \in L(A)$. Since $|x| = |w|$, we have $w \in L(A)\frac 1 2$.

Now, suppose we have a string $w \in L(A) \frac 1 2$. By definition, this means that there is a string $x$ such that $xw \in L(A)$ and $|x| = |w|$. This means that there is a state $q$ in $A$ such that $\delta(q_0, x) = q$ and $\delta(q,w) = q' \in F$. Then since $\langle q,q_0,q \in I'$, we have $\langle q, \delta(q_0,x), \delta(q,w) \rangle = \langle q,q,q' \rangle \in \delta'(\langle q,q_0,q \rangle, w)$ and therefore $\delta'(\langle q,q_0,q \rangle, w) \cap F' \neq \emptyset$. So $w \in L(A')$.

The subset construction

So now, the obvious question to ask is: does nondeterminism really give us more power? That is, can NFAs recognize languages that DFAs can't? The example that we just constructed seems like something that would be really difficult to pull off with an NFA. And yet, it turns out the answer is no—DFAs are just as powerful as NFAs.

But how? We'll show that given any NFA $N$, we can construct a DFA $M$ that accepts the same language. The key observation is in examining what exactly is happening with our nondeterministic transition function. Remember that for each state and symbol, the transition function gives a set of states rather than a single state as a deterministic finite automaton.

So how many sets are there? If there are $n$ states, then there are $2^n$ possible subsets of $Q$. This is a large number which we've been trained as computer scientists to be alarmed at, but the important thing is that it is finite. This gives us a way to "simulate" the nondeterminism of an NFA using a DFA: just make each subset of $Q$ its own state!

This construction is called the subset construction and was again due to Rabin and Scott in 1959. An interesting question that one can consider is when the worst case of $2^n$ states comes up. Similar questions about the worst-case growth in the number of states of a DFA or NFA can be asked for other transformations or language operations in a line of research in automata theory called state complexity.

For instance, recall our NFA for $L \frac 1 2$. Since the states of our NFA is the set of 3-tuples of states of our original machine, if we started with an $n$-state DFA, we'd construct an NFA that has $n^3$ states. Taking the subset construction of this means that we could end up with $2^{n^3}$ states in an equivalent DFA. But Mike Domaratzki showed in 2001 that with a clever enough construction, a DFA recognizing $L \frac 1 2$ only requires at most $n e^{\Theta(\sqrt{n \log n})}$ states.

Let's define the subset construction formally.

Let $N = (Q,\Sigma,\delta,I,F)$ be an NFA. We will construct a DFA $\mathcal D(N) = (Q',\Sigma,\delta',q_0',F')$ such that $L(\mathcal D(N)) = L(N)$. We will define each component of $\mathcal D(N)$:

Let's try this with this example. Here's the NFA again.

And here is the transition table with all the subsets.

$a$ $b$
$\emptyset$ $\emptyset$ $\emptyset$
$\{q_0\}$ $\{q_0\}$ $\{q_0,q_1\}$
$\{q_1\}$ $\{q_2\}$ $\{q_2\}$
$\{q_2\}$ $\{q_3\}$ $\{q_3\}$
$\{q_3\}$ $\emptyset$ $\emptyset$
$\{q_0,q_1\}$ $\{q_0,q_2\}$ $\{q_0,q_1,q_2\}$
$\{q_0,q_2\}$ $\{q_0,q_3\}$ $\{q_0,q_1,q_3\}$
$\{q_0,q_3\}$ $\{q_0\}$ $\{q_0,q_1\}$
$\{q_1,q_2\}$ $\{q_2,q_3\}$ $\{q_2,q_3\}$
$\{q_1,q_3\}$ $\{q_2\}$ $\{q_2\}$
$\{q_2,q_3\}$ $\{q_3\}$ $\{q_3\}$
$\{q_0,q_1,q_2\}$ $\{q_0,q_2,q_3\}$ $\{q_0,q_1,q_2,q_3\}$
$\{q_0,q_1,q_3\}$ $\{q_0,q_2\}$ $\{q_0,q_1,q_2\}$
$\{q_0,q_2,q_3\}$ $\{q_0,q_3\}$ $\{q_0,q_1,q_3\}$
$\{q_1,q_2,q_3\}$ $\{q_2,q_3\}$ $\{q_2,q_3\}$
$\{q_0,q_1,q_2,q_3\}$ $\{q_0,q_2,q_3\}$ $\{q_0,q_1,q_2,q_3\}$

Note that all the subsets in red cannot be reached from the initial state $\{q_0\}$. Now, here is the state diagram for the DFA.

I've omitted all the subsets that can't be reached, but this is still a large DFA relative to the size of the NFA. In fact, one can show that the language $\{a,b\}^* b \{a,b\}^{n-1}$ can be recognized by an NFA with $n+1$ states but requires a DFA with $2^n$ states. (This is why it was interesting to see that an exercise from the first edition of Hopcroft and Ullman's book asked for a DFA that accepted this language for $n = 10$)

Now, we want to show that $\mathcal D(N)$ and $N$ recognize exactly the same language.

Let $N$ be an NFA. Then $L(N) = L(\mathcal D(N))$.

We want to show that $w \in L(N)$ if and only if $w \in L(\mathcal D(N))$. This means that by definition, $\delta(I,w) \cap F \neq \emptyset$ if and only if $\delta'(I,w) \cap F \neq \emptyset$. So, we will show that $\delta(q,w) = \delta'(\{q\},w)$. We will show this by induction on $w$.

First, we have $w = \varepsilon$. Then $\delta(q,\varepsilon) = \{q\} = \delta'(\{q\},\varepsilon)$.

Now, let $w = ua$ with $a \in \Sigma$ and string $u \in \Sigma^*$ and assume that $\delta(q,u) = \delta'(\{q\},u)$. Then we have \begin{align*} \delta'(\{q\},ua) &= \delta'(\delta'(\{q\},u),a) &\text{(by definition of $\delta'$)}\\ &= \delta'(\delta(q,u),a) &\text{(by inductive hypothesis)}\\ &= \bigcup_{p \in \delta(q,u)} \delta(p,a) &\text{(by definition of $\delta'$)}\\ &= \delta(q,ua). &\text{(by definition of $\delta$)} \end{align*} Thus, we have shown that $\delta(q,w) = \delta'(\{q\},w)$ for all $q \in Q$ and $w \in \Sigma^*$, and we can conclude from the definition of the final states that $L(N) = L(\mathcal D(N))$.

Of course, this only gives us half of what we need, although what remains is fairly simple. We can just state the following.

A language $L$ is accepted by a DFA if and only if $L$ is accepted by an NFA.

We have already shown the only if direction of this theorem in Theorem 6.4. What's left to show is the if direction, which involves taking a DFA and constructing an equivalent NFA. This should be fairly simple to define and prove correctness for, so we won't go through it.

From this we get the following.

The class of DFA languages is closed under concatenation.

Let $A$ and $B$ be DFAs. By Theorem 5.4, there exists an NFA that accepts the language $L(A)L(B)$. Then by Theorem 6.5, there must exist a DFA that accepts $L(A)L(B)$.

$\varepsilon$-transitions

Now, we'll consider another possible enhancement to the finite automaton: allowing transitions to other states via $\varepsilon$. This additional trick will be very convenient, but again, we have to ask the question: does this make finite automata more powerful? Some of you have already caught on when I introduced the NFA with multiple initial states and wondered if multiple initial states really makes a difference over a single initial state. That same intuition will serve us when we think about how $\varepsilon$-transitions actually work.

First, the formal definition.

An $\varepsilon$-NFA is a 5-tuple $A = (Q,\Sigma,\delta,I,F)$ as in the definition of the NFA, except that the transition function is now a function from a state and either a symbol or $\varepsilon$ to a set of states, $\delta : Q \times (\Sigma \cup \{\varepsilon\}) \to 2^Q$.

So the question now is, what is the result of a transition function that allows $\varepsilon$s? Let's consider the following example.

Consider the state diagram below. We can ask for each state, which other states we should be in if we "read $\varepsilon$"? In other words, if we don't read anything, where would we be?

What we would like is a way to say, for some set of states $P \subseteq Q$, "all the states that are reachable on one or more $\varepsilon$-transitions from a state $p \in P$". We'll define the $\varepsilon$-closure of $P$ to be exactly this.

The $\varepsilon$-closure of a state $q \in Q$ is denoted by $\mathcal E^*(q)$ and is defined inductively by

Then the $\varepsilon$-closure of a set $P \subseteq Q$, denoted $\mathcal E^*(P)$ is defined by $$\mathcal E^*(P) = \bigcup_{p \in P} \mathcal E^*(p).$$

This notion is mostly for the purposes of formal definitions: it allows us to say formally which states we are "in" without having read anything. We can then use this notion to define our extended transition function that allows for $\varepsilon$.

Let $\delta^* : Q \times \Sigma^* \to 2^Q$ be a function defined as follows,

In other words, our new extended transition function for the $\varepsilon$-NFA is just the one for the ordinary NFA, except that we need to ensure that we account for the $\varepsilon$-closure of the result.

You might observe that, as we still only need to consider at most $2^{|Q|}$ possible sets, that allowing $\varepsilon$-transitions do not give NFAs any additional power, and you would be absolutely correct. At this point, you can play the same game as with the NFA and perform the subset construction to acquire an equivalent DFA, as long as you make sure to account for the $\varepsilon$-closure.

If we have an $\varepsilon$-NFA $N = (Q,\Sigma,\delta,I,F)$, we can define a DFA $\mathcal D_\varepsilon(N) = (Q',\Sigma,\delta',q_0',F')$ in the same way as the subset construction above, but with two differences:

  1. $q_0' = \mathcal E^*(I)$,
  2. for a set of states $S \subseteq Q$ and symbol $a \in \Sigma$, the transition function $\delta'$ is defined $$\delta'(S,a) = \mathcal E^* \left(\bigcup_{p \in S} \delta(p,a)\right).$$

In other words, the subset construction for $\varepsilon$-NFAs is exactly the same as for NFAs, except that all the states of $\mathcal D_\varepsilon(N)$ are $\varepsilon$-closed subsets of $Q$. We can then use this to show the following.

A language $L$ is accepted by an $\varepsilon$-NFA if and only if $L$ is accepted by a DFA.

The proof is almost exactly the same as for ordinary NFAs so we won't be going through it here. This result establishes that DFAs, NFAs, and $\varepsilon$-NFAs are all equivalent models and that they all recognize the same class of languages. In other words, nondeterminism and $\varepsilon$-transitions do not give our basic finite automaton model any additional power.

For the most part, the distinction between an NFA and an $\varepsilon$-NFA is not that big, so it is usually not specified. You can assume that you can use $\varepsilon$-transitions in an NFA unless otherwise specified (but not DFAs).