CMSC 28000 — Lecture 5

For concatenation, conceptually, we can imagine that to recognize a word $uv$ with $u \in L_1$ and $v \in L_2$, we could check that $u$ is accepted by the DFA $A_1$ for $L_1$ and that $v$ is accepted by the DFA $A_2$ for $L_2$. Rather than running the two machines in parallel, we'd run them one after the other.

There is a problem with this approach: when we give a DFA a word $uv$, it doesn't actually know where $u$ ends and $v$ begins. An obvious idea to fix this would be to test whether the current state is a final state of $A_1$ and, if so, hand the rest of the input over to $A_2$, but this only gets us so far.

Consider the following scenario: we have a word $uvw$, where $u, uv \in L_1$ and $w \in L_2$ but $vw \not \in L_2$. In this case, our DFA could see that $u$ brings us to a final state of $A_1$ and start processing $vw$ on $A_2$, only to reject $vw$, and hence reject $uvw$, even though $uvw = (uv)w \in L_1 L_2$.

The big problem is that our transitions have to be uniquely determined, so once we commit to a choice, we can't go back and try something else. In other words, we have to know exactly where we're going. And so, we seem to be hitting our first roadblock.

A nondeterministic finite automaton (NFA) is a 5-tuple $A = (Q,\Sigma,\delta,I,F)$, where $Q$ is a finite set of states, $\Sigma$ is an alphabet, $\delta : Q \times \Sigma \to 2^Q$ is the transition function, $I \subseteq Q$ is a set of initial states, and $F \subseteq Q$ is a set of final states.

This looks basically the same as the DFA, except for one important difference: the transition function $\delta$ is now a function from a state and a symbol to a set of states. This explains why we don't require that $\delta$ is defined explicitly for all states and symbols. Since the image of $\delta$ consists of subsets of $Q$, we implicitly define "missing" transitions as going to the empty set $\emptyset$.

This also means that we will need to formally redefine acceptance of a word with nondeterminism. Again, we will extend our transition function to a function $\hat \delta: Q \times \Sigma^* \to 2^Q$. We define $\hat \delta(q,\varepsilon) = \{q\}$ and for $w = xa$ with $x \in \Sigma^*$ and $a \in \Sigma$, we have $$ \hat \delta(q,xa) = \bigcup_{p \in \hat \delta(q,x)} \delta(p,a).$$
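
To see the recurrence in action, here is a tiny worked example with a made-up NFA (not one from the lecture): suppose $\delta(q_0,a) = \{q_0,q_1\}$, $\delta(q_1,b) = \{q_2\}$, and every other transition is $\emptyset$. Since $\hat \delta(q_0,a) = \bigcup_{p \in \hat \delta(q_0,\varepsilon)} \delta(p,a) = \delta(q_0,a) = \{q_0,q_1\}$, we get $$\hat \delta(q_0,ab) = \bigcup_{p \in \hat \delta(q_0,a)} \delta(p,b) = \delta(q_0,b) \cup \delta(q_1,b) = \emptyset \cup \{q_2\} = \{q_2\}.$$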

However, just like how we eventually give up and abuse notation by using $\delta$ for $\hat \delta$, we also do this for subsets of states $P \subseteq Q$ and often write $$\delta(P,a) = \bigcup_{q \in P} \delta(q,a)$$ when the context is clear.

Let's take a moment to see how this definition fits with our intuition about how nondeterminism behaves. It is easy to see that by taking the union of these sets, we really do get all possible transitions. However, note that this definition also handles transitions that are missing: if $\delta(p,a) = \emptyset$, then for any set $S$, we have $S \cup \emptyset = S$, and so $\hat \delta$ remains well-defined.

Now, we can define acceptance under the nondeterministic model.

We say that an NFA $A$ accepts a word $w$ if $\hat \delta(i,w) \cap F \neq \emptyset$ for some initial state $i \in I$. Then the language of an NFA $A = (Q,\Sigma,\delta,I,F)$ is defined by $$ L(A) = \left\{w \in \Sigma^* \mid \hat \delta(I,w) \cap F \neq \emptyset \right\}.$$
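
To make these definitions concrete, here is a minimal Python sketch (mine, not part of the lecture) of simulating an NFA: the transition function is a dictionary from (state, symbol) pairs to sets of states, missing entries are treated as $\emptyset$, and acceptance checks whether $\hat \delta(I,w)$ meets $F$. The example NFA at the bottom is hypothetical.

```python
def step(delta, states, symbol):
    """Compute delta(P, a): the union of delta(q, a) over all q in P."""
    result = set()
    for q in states:
        result |= delta.get((q, symbol), set())  # a missing entry acts as the empty set
    return result

def delta_hat(delta, states, word):
    """Extended transition function: read the word starting from a set of states."""
    for symbol in word:
        states = step(delta, states, symbol)
    return states

def accepts(delta, initial, final, word):
    """The NFA accepts the word iff delta_hat(I, word) intersects F."""
    return bool(delta_hat(delta, set(initial), word) & set(final))

# A hypothetical NFA over {a, b} accepting exactly the words ending in "ab".
delta = {
    ("q0", "a"): {"q0", "q1"},
    ("q0", "b"): {"q0"},
    ("q1", "b"): {"q2"},
}
print(accepts(delta, {"q0"}, {"q2"}, "aab"))  # True
print(accepts(delta, {"q0"}, {"q2"}, "aba"))  # False
```

Here `step` is exactly the set-to-set transition $\delta(P,a)$ from above, and `delta_hat` just iterates it symbol by symbol.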

To play around with nondeterminism a bit, let's think back to one of the more exotic language operations I mentioned earlier, the proportional removal $\frac 1 2 L$. We'll prove something about $L \frac 1 2$, the language of the second halves of words from $L$, instead. Proportional removals were studied by Stearns and Hartmanis in the early 60s.

Let $A$ be a DFA. Then there is an NFA that recognizes the language $$L(A) \frac 1 2 = \{y \in \Sigma^* \mid xy \in L(A), |x| = |y|\}.$$

Let's think about the approach roughly first. We need to check if our input word $y$ is the latter half of some word in our language $L(A)$. This means that there's some state $p$ of our DFA $A$ from which we can reach a final state on $y$. We also need to check that $y$ has the correct length (half of a word), which we do by checking that there's a path of length $|y|$ from the initial state of $A$ to $p$.

It's not clear if this is even possible with a DFA. However, with an NFA, we can simply guess all the possibilities. First, we guess what $p$ could've been. Then, we run our machine in parallel, by starting to process $y$ from the state $p$ that we guessed, and by guessing a path starting from the initial state $q_0$. If we reach a final state on $y$ and find a path from $q_0$ to $p$ of the same length, then we accept.

Here's how we can imagine a machine doing this: we use triples of states $\langle p,q,r \rangle$. The machine guesses $p$ at the beginning and this component never changes, so the machine remembers which state was guessed. The state $q$ is the state that the machine ends up in by guessing a prefix starting from the initial state $q_0$. The state $r$ is the state that the machine ends up in by processing $y$ starting from $p$.

Let's define the construction formally.

Let $L$ be recognized by a DFA $A = (Q,\Sigma,\delta,q_0,F)$. We will construct an NFA $A' = (Q',\Sigma,\delta',I',F')$, where $Q' = Q \times Q \times Q$, $I' = \{\langle p, q_0, p \rangle \mid p \in Q\}$, $F' = \{\langle p, p, r \rangle \mid p \in Q, r \in F\}$, and for every $\langle p,q,r \rangle \in Q'$ and $a \in \Sigma$, $$\delta'(\langle p,q,r \rangle, a) = \{\langle p, \delta(q,b), \delta(r,a) \rangle \mid b \in \Sigma\}.$$

To prove this formally, we can show by induction on $|y|$ that $\langle p,q,r \rangle \in \delta'(I',y)$ if and only if there exists $x \in \Sigma^*$ with $|x| = |y|$ such that $\delta(q_0,x) = q$ and $\delta(p,y) = r$. Then we have \begin{align*} \delta'(I',y) \cap F' \neq \emptyset &\iff \exists p \in Q, \exists x \in \Sigma^*, |x| = |y|, \delta(q_0,x) = p, \delta(p,y) \in F \\ &\iff \exists x \in \Sigma^*, |x| = |y|, \delta(q_0,xy) \in F \\ &\iff \exists x \in \Sigma^*, |x| = |y|, xy \in L, \end{align*} which is exactly the condition for $y$ to belong to $L \frac 1 2$.
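
As a sanity check on this construction, here is a rough Python sketch of building $A'$ from a DFA given as a dictionary; the representation and names are my own choices, not part of the lecture.

```python
from itertools import product

def half_nfa(states, alphabet, delta, q0, final):
    """Build an NFA for L(A) 1/2 from a complete DFA A, following the
    triple-state construction above.  delta maps (state, symbol) pairs to a
    single state; the NFA is returned in the dictionary-of-sets format used
    in the earlier simulation sketch."""
    triples = set(product(states, repeat=3))
    # Guess p up front: start in <p, q0, p> for every p in Q.
    initial = {(p, q0, p) for p in states}
    # Accept when the guessed path from q0 really did reach p (q == p)
    # and reading y from p landed in a final state (r in F).
    accepting = {(p, p, r) for p in states for r in final}
    nfa_delta = {}
    for (p, q, r), a in product(triples, alphabet):
        # Advance r deterministically on the real input symbol a, and advance
        # q on a guessed symbol b -- one nondeterministic choice per b.
        nfa_delta[((p, q, r), a)] = {(p, delta[(q, b)], delta[(r, a)])
                                     for b in alphabet}
    return triples, initial, nfa_delta, accepting
```

Since the output uses the same dictionary-of-sets format as the earlier simulation sketch, its `accepts` function could be used to test membership in $L(A) \frac 1 2$ directly.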

 

The power of nondeterminism

So now, the obvious question to ask is: does nondeterminism really give us more power? That is, can NFAs recognize languages that DFAs can't? It turns out the answer is no.

But how? We'll show that given any NFA $N$, we can construct a DFA $M$ that accepts the same language. The key observation comes from examining what exactly is happening with our nondeterministic transition function: for each state and symbol, it gives a set of states rather than the single state that a deterministic finite automaton would give.

So how many sets are there? If there are $n$ states, then there are $2^n$ possible subsets of $Q$. This is a large number which we've been trained as computer scientists to be alarmed at, but the important thing is that it is finite. This gives us a way to "simulate" the nondeterminism of an NFA using a DFA: just make each subset of $Q$ its own state!
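
Concretely, here is a short Python sketch of this idea (illustrative only, not a construction from the lecture): each state of the new DFA is a frozenset of NFA states, and we build only the subsets reachable from the set of initial states.

```python
def subset_construction(alphabet, nfa_delta, initial, final):
    """Determinize an NFA given as a dict from (state, symbol) to sets of
    states.  Each DFA state is a frozenset of NFA states."""
    start = frozenset(initial)
    dfa_delta = {}
    dfa_final = set()
    seen = {start}
    worklist = [start]
    while worklist:
        subset = worklist.pop()
        if subset & set(final):
            dfa_final.add(subset)
        for a in alphabet:
            # delta(P, a) = union over q in P of delta(q, a)
            successor = frozenset().union(
                *(nfa_delta.get((q, a), set()) for q in subset)
            )
            dfa_delta[(subset, a)] = successor
            if successor not in seen:
                seen.add(successor)
                worklist.append(successor)
    return seen, start, dfa_delta, dfa_final
```

Only the subsets reachable from the initial set are actually constructed here; in the worst case that is still all $2^n$ of them.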

This construction is called the subset construction and was again due to Rabin and Scott in 1959. An interesting question that one can consider is when the worst case of $2^n$ states comes up. Similar questions about the worst-case growth in the number of states of a DFA or NFA can be asked for other transformations or language operations in a line of research in automata theory called state complexity.

For instance, recall our NFA for $L \frac 1 2$. Since the states of our NFA are 3-tuples of states of our original machine, if we started with an $n$-state DFA, we'd construct an NFA that has $n^3$ states. Applying the subset construction to this NFA means that we could end up with $2^{n^3}$ states in an equivalent DFA. It turns out that with a clever enough construction one can do much better: Mike Domaratzki showed in 2001 that a DFA recognizing $L \frac 1 2$ requires at most $n e^{\Theta(\sqrt{n \log n})}$ states.