CMSC 28000 — Lecture 5

The power of nondeterminism

So now, the obvious question to ask is: does nondeterminism really give us more power? That is, can NFAs recognize languages that DFAs can't? It turns out the answer is no.

But how? We'll show that given any NFA $N$, we can construct a DFA $M$ that accepts the same language. The key observation is in examining what exactly is happening with our nondeterministic transition function. Remember that for each state and symbol, the transition function gives a set of states rather than a single state as a deterministic finite automaton.

So how many sets are there? If there are $n$ states, then there are $2^n$ possible subsets of $Q$. This is a large number which we've been trained as computer scientists to be alarmed at, but the important thing is that it is finite. This gives us a way to "simulate" the nondeterminism of an NFA using a DFA: just make each subset of $Q$ its own state!

Let's try this with this example. Here's the NFA again.

And here is the transition table with all the subsets.

$a$ $b$
$\emptyset$ $\emptyset$ $\emptyset$
$\{q_0\}$ $\{q_0\}$ $\{q_0,q_1\}$
$\{q_1\}$ $\{q_2\}$ $\{q_2\}$
$\{q_2\}$ $\{q_3\}$ $\{q_3\}$
$\{q_3\}$ $\emptyset$ $\emptyset$
$\{q_0,q_1\}$ $\{q_0,q_2\}$ $\{q_0,q_1,q_2\}$
$\{q_0,q_2\}$ $\{q_0,q_3\}$ $\{q_0,q_1,q_3\}$
$\{q_0,q_3\}$ $\{q_0\}$ $\{q_0,q_1\}$
$\{q_1,q_2\}$ $\{q_2,q_3\}$ $\{q_2,q_3\}$
$\{q_1,q_3\}$ $\{q_2\}$ $\{q_2\}$
$\{q_2,q_3\}$ $\{q_3\}$ $\{q_3\}$
$\{q_0,q_1,q_2\}$ $\{q_0,q_2,q_3\}$ $\{q_0,q_1,q_2,q_3\}$
$\{q_0,q_1,q_3\}$ $\{q_0,q_2\}$ $\{q_0,q_1,q_2\}$
$\{q_0,q_2,q_3\}$ $\{q_0,q_3\}$ $\{q_0,q_1,q_3\}$
$\{q_1,q_2,q_3\}$ $\{q_2,q_3\}$ $\{q_2,q_3\}$
$\{q_0,q_1,q_2,q_3\}$ $\{q_0,q_2,q_3\}$ $\{q_0,q_1,q_2,q_3\}$

Note that all the subsets in red cannot be reached from the initial state $\{q_0\}$. Now, here is the state diagram for the DFA.

I've omitted all the subsets that can't be reached, but this is still a large DFA relative to the size of the NFA. In fact, one can show that the language $\{a,b\}^* b \{a,b\}^{n-1}$ can be recognized by an NFA with $n+1$ states but requires a DFA with $2^n$ states.

This construction is called the subset construction and was again due to Rabin and Scott in 1959. An interesting question that one can consider is when the worst case of $2^n$ states comes up. Similar questions about the worst-case growth in the number of states of a DFA or NFA can be asked for other transformations or language operations in a line of research in automata theory called state complexity.

For instance, recall our NFA for $L \frac 1 2$. Since the states of our NFA is the set of 3-tuples of states of our original machine, if we started with an $n$-state DFA, we'd construct an NFA that has $n^3$ states. Taking the subset construction of this means that we could end up with $2^{n^3}$ states in an equivalent DFA. It turns out if you come up with a clever enough construction, Mike Domaratzki showed in 2001 that a DFA recognizing $L \frac 1 2$ only requires at most $n e^{\Theta(\sqrt{n \log n})}$ states.

Now, let's define the subset construction formally.

Let $N = (Q,\Sigma,\delta,I,F)$ be an NFA. We will construct a DFA $\mathcal D(N) = (Q',\Sigma,\delta',q_0',F')$ such that $L(\mathcal D(N)) = L(N)$. We will define each component of $\mathcal D(N)$:

Now, we need to show that $\mathcal D(N)$ and $N$ recognize exactly the same language.

Let $N$ be an NFA. Then $L(N) = L(\mathcal D(N))$.

We want to show that $w \in L(N)$ if and only if $w \in L(\mathcal D(N))$. This means that by definition, $\hat \delta(I,w) \cap F \neq \emptyset$ if and only if $\hat \delta'(I,w) \cap F \neq \emptyset$. So, we will show that $\hat \delta(q,w) = \hat \delta'(\{q\},w)$. We will show this by induction on $|w|$.

First, we have $w = \varepsilon$. Then $\hat \delta(q,\varepsilon) = \{q\} = \hat \delta'(\{q\},\varepsilon)$.

Now, suppose for some string $u \in \Sigma^*$, we have $\hat \delta(q,u) = \hat \delta'(\{q\},u)$. Let $w = ua$ with $a \in \Sigma$. Then we have \begin{align*} \hat \delta'(\{q\},ua) &= \delta'(\hat\delta'(\{q\},u),a) &\text{(by definition of $\hat \delta'$)}\\ &= \delta'(\hat \delta(q,u),a) &\text{(by inductive hypothesis)}\\ &= \bigcup_{p \in \hat\delta(q,u)} \delta(p,a) &\text{(by definition of $\hat \delta'$)}\\ &= \hat \delta(q,ua). &\text{(by definition of $\hat \delta$)} \end{align*} Thus, we have shown that $\hat \delta(q,w) = \hat \delta'(\{q\},w)$ for all $q \in Q$ and $w \in \Sigma^*$, and we can conclude that $L(N) = L(\mathcal D(N))$.

Of course, this only gives us half of what we need, although what remains is fairly simple. We can just state the following.

A language $L$ is accepted by a DFA if and only if $L$ is accepted by an NFA.

We have already shown the only if direction of this theorem in Theorem 5.4. What's left to show is the if direction, which involves taking a DFA and constructing an equivalent NFA. This should be fairly simple to define and prove correctness for, so we won't go through it.

$\varepsilon$-transitions

Now, we'll consider another possible enhancement to the finite automaton: allowing transitions to other states via $\varepsilon$. This additional trick will be very convenient, but again, we have to ask the question: does this make finite automata more powerful? Some of you have already caught on when I introduced the NFA with multiple initial states and wondered if multiple initial states really makes a difference over a single initial state. That same intuition will serve us when we think about how $\varepsilon$-transitions actually work.

First, the formal definition.

An $\varepsilon$-NFA is a 5-tuple $A = (Q,\Sigma,\delta,q_0,F)$ as in the definition of the NFA, except that the transition function is now a function from a state and either a symbol or $\varepsilon$ to a set of states, $\delta : Q \times (\Sigma \cup \{\varepsilon\}) \to 2^Q$.

So the question now is, what is the result of a transition function that allows $\varepsilon$s? What we would like is a way to say, for some set of states $P \subseteq Q$, "all the states that are reachable on one or more $\varepsilon$-transitions from a state $p \in P$". We'll define the $\varepsilon$-closure of $P$ to be exactly this.

The $\varepsilon$-closure of a state $q \in Q$ is denoted by $\mathcal E(q)$ and is defined inductively by

Then the $\varepsilon$-closure of a set $P \subseteq Q$, denoted $\mathcal E(P)$ is defined by $$\mathcal E(P) = \bigcup_{p \in P} \mathcal E(p).$$

Now, we can define our new extended transition function.

Let $\hat \delta : Q \times \Sigma^* \to 2^Q$ be a function defined as follows,

In other words, our new extended transition function for the $\varepsilon$-NFA is just the one for the ordinary NFA, except that we need to ensure that we account for the $\varepsilon$-closure of the result.

You might observe that, as we still only need to consider at most $2^{|Q|}$ possible sets, that allowing $\varepsilon$-transitions do not give NFAs any additional power, and you would be absolutely correct. At this point, you can play the same game as with the NFA and perform the subset construction to acquire an equivalent DFA, as long as you make sure to account for the $\varepsilon$-closure.

If we have an $\varepsilon$-NFA $N = (Q,\Sigma,\delta,q_0,F)$, we can define a DFA $\mathcal D_\varepsilon(N) = (Q',\Sigma,\delta',q_0',F')$ in the same way as the subset construction above, but with two differences:

  1. $q_0' = \mathcal E(q_0)$,
  2. for a set of states $S \subseteq Q$ and symbol $a \in \Sigma$, the transition function $\delta'$ is defined $$\delta'(S,a) = \mathcal E \left(\bigcup_{p \in S} \delta(p,a)\right).$$

In other words, the subset construction for $\varepsilon$-NFAs is exactly the same as for NFAs, except that all the states of $\mathcal D_\varepsilon(N)$ are $\varepsilon$-closed subsets of $Q$. We can then use this to show the following.

A language $L$ is accepted by an $\varepsilon$-NFA if and only if $L$ is accepted by a DFA.

The proof is almost exactly the same as for ordinary NFAs so we won't be going through it here. This result establishes that DFAs, NFAs, and $\varepsilon$-NFAs are all equivalent models and that they all recognize the same class of languages. In other words, nondeterminism and $\varepsilon$-transitions do not give our basic finite automaton model any additional power.

Now, let's use this idea to build an $\varepsilon$-NFA for the concatenation of two languages. Recall that the big problem we had with concatenation of $L(A) L(B)$ for DFAs $A, B$ was that we had to "guess" where the word that belonged to $L(A)$ ended. With NFAs, we now have the ability to do just that.

However, there is the additional question of when exactly to guess. The ability to use $\varepsilon$-transitions simplifies this: if we add a $\varepsilon$-transition from each final state of $A$ to the initial state of $B$. Here is a schematic of our machine.

This way, whenever our computation on a word enters a final state of $A$, it "guesses" that it is time to start a computation on $B$. Let's formally define the construction.

The class of languages recognized by DFAs is closed under concatenation.

Let $L_1$ and $L_2$ be regular languages that are recognized by DFAs $A_1 = (Q_1, \Sigma, \delta_1, q_{0,1}, F_1)$ and $A_2 = (Q_2, \Sigma, \delta_2, q_{0,2}, F_2)$, respectively. We will construct an $\varepsilon$-NFA $A' = (Q',\Sigma, \delta', I', F')$ to recognize $L_1 L_2$. Here, we have

and the transition function $\delta':(Q_1 \cup Q_2) \times (\Sigma \cup \{\varepsilon\}) \rightarrow 2^{Q_1 \cup Q_2}$ is defined by

This construction hinges on the observation that we can only make it into the automaton $A_2$ if we enter a final state of $A_1$. Consider a word $w \in \Sigma^*$ that's accepted by $A'$. Then $\delta'(\{q_{0,a}\}, w) \cap F_2 \neq \emptyset$. So we can split the word up into two words $w = xy$. Since the only entry point into $A_2$ is $q_{0,2}$, we have that $\delta_2(q_{0,2},y) \in F_2$. But in order for $q_{0,2}$ to be reached, there had to be some word $x$ such that $\delta_1(q_{0,1},x) \in F_1$. This gives us our two words $x \in L(A_1)$ and $y \in L(A_2)$.

Since $\varepsilon$-NFAs can be transformed into an equivalent DFA, we can conclude that there is a DFA that recognizes $L(A_1) L(A_2)$.