CMSC 28000 — Lecture 4

Let's prove formally that the product construction works. First, we'll show that the transition function of the product automaton behaves componentwise, just as we'd expect.

For all states $p \in Q_1$ and $q \in Q_2$, and for all $w \in \Sigma^*$, $$\delta'(\langle p,q \rangle, w) = \langle \delta_1(p,w), \delta_2(q,w) \rangle.$$

We will prove this by induction on $|w|$.

For the base case, we have $w = \varepsilon$. Then $$\delta'(\langle p,q \rangle, \varepsilon) = \langle p,q \rangle = \langle \delta_1(p,\varepsilon), \delta_2(q, \varepsilon) \rangle.$$

Then for our inductive step, we assume that $\delta'(\langle p,q \rangle, x) = \langle \delta_1(p,x), \delta_2(q,x) \rangle$ for $x \in \Sigma^*$ and we consider $w = xa$ for $a \in \Sigma$. We have \begin{align*} \delta'(\langle p,q \rangle, xa) &= \delta'(\delta'(\langle p,q \rangle, x),a) \\ &= \delta'(\langle \delta_1(p,x), \delta_2(q,x) \rangle,a) \\ &= \langle \delta_1(\delta_1(p,x),a), \delta_2(\delta_2(q,x),a) \rangle \\ &= \langle \delta_1(p,xa), \delta_2(q,xa) \rangle \end{align*}

 

Now, we'll prove that our automaton recognizes the intersection, as intended.

$L(A') = L_1 \cap L_2$.

Let $w \in \Sigma^*$. Then, \begin{align*} w \in L(A') &\iff \delta'(q_0',w) \in F' \\ &\iff \delta'(\langle q_{0,1}, q_{0,2} \rangle, w) \in F_1 \times F_2 \\ &\iff \langle \delta_1(q_{0,1},w), \delta_2(q_{0,2},w) \rangle \in F_1 \times F_2 \\ &\iff \delta_1(q_{0,1},w) \in F_1 \text{ and } \delta_2(q_{0,2},w) \in F_2 \\ &\iff w \in L(A_1) \text{ and } w \in L(A_2) \\ &\iff w \in L_1 \text{ and } w \in L_2 \\ &\iff w \in L_1 \cap L_2 \end{align*}
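To make the construction concrete, here is a small Python sketch of the product construction. The encoding of a DFA as a tuple of plain dictionaries and sets, and the two example automata, are choices made purely for illustration, not anything fixed by the lecture.

```python
# A DFA is encoded as (states, alphabet, delta, start, finals),
# where delta maps (state, symbol) pairs to the next state.

def run_dfa(delta, start, word):
    """Extended transition function: the state reached from `start` on `word`."""
    state = start
    for symbol in word:
        state = delta[(state, symbol)]
    return state

def product_dfa(dfa1, dfa2, finals):
    """Product construction (assuming both DFAs share the same alphabet);
    `finals` decides which pairs of states are accepting."""
    states1, sigma, delta1, start1, _ = dfa1
    states2, _, delta2, start2, _ = dfa2
    states = {(p, q) for p in states1 for q in states2}
    delta = {((p, q), a): (delta1[(p, a)], delta2[(q, a)])
             for (p, q) in states for a in sigma}
    return states, sigma, delta, (start1, start2), finals

# Example: A1 accepts words with an even number of a's,
#          A2 accepts words ending in b (both over {a, b}).
A1 = ({0, 1}, {"a", "b"},
      {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1}, 0, {0})
A2 = ({"x", "y"}, {"a", "b"},
      {("x", "a"): "x", ("x", "b"): "y", ("y", "a"): "x", ("y", "b"): "y"},
      "x", {"y"})

# Intersection: accept exactly when both components are accepting (F1 x F2).
inter_finals = {(p, q) for p in A1[4] for q in A2[4]}
Qp, sigma, deltap, startp, Fp = product_dfa(A1, A2, inter_finals)

for w in ["aab", "ab", "baa", ""]:
    print(w, run_dfa(deltap, startp, w) in Fp)
# Only "aab" has an even number of a's and also ends in b, so only it prints True.
```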

 

The product construction is due to Rabin and Scott (1959). We will see this paper pop up again—a lot of fundamental results about finite automata come from here.

Now, let's consider union. If we think about our approach to intersection, we can see that it's not actually that different. Instead of accepting a word when both automata accept it, we can modify our set of final states so that a word is accepted when at least one of the automata accepts. Such a DFA follows the product construction from above, with the only difference being the set of final states, $$F' = (F_1 \times Q_2) \cup (Q_1 \times F_2).$$ This set consists of all pairs that contain a final state of $A_1$ or a final state of $A_2$.
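Continuing the sketch from above (reusing the illustrative product_dfa and run_dfa helpers and the example automata A1 and A2), the only change needed for union is the set of accepting pairs:

```python
# Union: accept when at least one component is accepting,
# i.e. F' = (F1 x Q2) union (Q1 x F2).
Q1, F1 = A1[0], A1[4]
Q2, F2 = A2[0], A2[4]
union_finals = {(p, q) for p in F1 for q in Q2} | {(p, q) for p in Q1 for q in F2}
Qu, sigma, deltau, startu, Fu = product_dfa(A1, A2, union_finals)

# "ab" has an odd number of a's but ends in b, so the union automaton accepts it.
print(run_dfa(deltau, startu, "ab") in Fu)   # True
```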

However, there is another easy but slightly indirect way: we can make use of the closure properties we've already proved.

The class of languages recognized by DFAs is closed under union.

Let $L_1$ and $L_2$ be languages recognized by DFAs. Then $\overline{L_1}$ and $\overline{L_2}$ are also recognized by DFAs, by Theorem 3.3. Since DFA languages are closed under intersection, by Theorem 4.2, $\overline{L_1} \cap \overline{L_2}$ is also a DFA language. But this language is $\overline{L_1 \cup L_2}$ by De Morgan's laws. And since DFA languages are closed under complementation, $L_1 \cup L_2$ must also be a DFA language.
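This indirect route can be sketched the same way: complement both DFAs by swapping final and non-final states, intersect them with the product construction, and complement the result. (This again reuses the illustrative helpers and example automata from the earlier sketch.)

```python
def complement_dfa(dfa):
    """Complement a complete DFA by swapping final and non-final states."""
    states, sigma, delta, start, finals = dfa
    return states, sigma, delta, start, states - finals

C1, C2 = complement_dfa(A1), complement_dfa(A2)
inter = product_dfa(C1, C2, {(p, q) for p in C1[4] for q in C2[4]})
union_via_demorgan = complement_dfa(inter)

Qd, _, deltad, startd, Fd = union_via_demorgan
print(run_dfa(deltad, startd, "ab") in Fd)   # True, matching the direct construction
```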

We can do the same kind of thing for other set operations, like set difference. But what about an entirely different kind of operation? We can define many more interesting string and language operations, like concatenation, and ask whether those operations preserve the property of being recognizable by DFAs.

Here are some examples of string and language operations.

For concatenation, conceptually, if we want to recognize a word $uv$, where $u \in L_1$ and $v \in L_2$, we could check that $u$ is accepted by the DFA $A_1$ for $L_1$ and that $v$ is accepted by the DFA $A_2$ for $L_2$. Then rather than running the two machines in parallel, we'd run them one after the other.

There is a problem with this approach: when we give a DFA a word $uv$, it doesn't actually know where $u$ ends and $v$ begins. An obvious idea to fix this would be to switch over to $A_2$ whenever we reach a final state of $A_1$, but this only gets us so far.

Consider the following scenario: we have a word $uvw$, where $u, uv \in L_1$ and $w \in L_2$ but $vw \not \in L_2$. In this case, our DFA could see that $u$ brings us to a final state of $A_1$ and start processing $vw$ on $A_2$, only to reject $vw$, even though $uvw$ belongs to $L_1 L_2$ via the split $(uv) \cdot w$.

The big problem is that our transitions have to be uniquely determined, so once we commit to a choice, we can't go back and try something else. In other words, we have to know exactly where we're going. And so, we seem to have hit our first roadblock.

Nondeterminism

As DFAs are currently defined (and so named), the transition function $\delta$ is deterministic. That is, the next state is uniquely determined by the current state and input symbol: there is exactly one state that the transition function points to. This keeps things simple for us and it accurately reflects how real computers work. However, it can feel like a stifling restriction, and there are points where it would be much easier to be able to go to multiple states.

Here, we'll relax the condition that the transition function is deterministic by introducing nondeterminism to finite automata. We call these machines nondeterministic finite automata (NFAs), for obvious reasons. Informally, this means that in an NFA, for each state and input symbol, we'll allow transitions to multiple states.

Conceptually, what does this mean? In essence, the NFA can "guess" which transition to take. While with a DFA we could represent a computation on an input word with a single path, with nondeterminism, there are multiple paths which we can represent as a tree, with branches forming at the points where there are choices to be made.

But if there are multiple possible paths, how do we define acceptance under this model? If there is at least one path which, after reading the input word, reaches an accepting state, then the NFA accepts the word. This fits with the idea that the NFA makes guesses whenever it has a choice; the existence of a path that reaches a final state means that there is a set of guesses that the NFA could make in order for the word to be accepted. Conversely, the NFA rejects $w$ if there are no paths which reach a final state upon reading $w$.

The following is an example of an NFA, which accepts the language $\{a,b\}^* b \{a,b\}^2$. Note that on $q_0$, there are two transitions on $b$: one to $q_0$ and one to $q_1$.

A nondeterministic finite automaton (NFA) is a 5-tuple $A = (Q,\Sigma,\delta,I,F)$, where $Q$ is a finite set of states, $\Sigma$ is an alphabet, $\delta : Q \times \Sigma \to 2^Q$ is the transition function, $I \subseteq Q$ is the set of initial states, and $F \subseteq Q$ is the set of final states.

This looks basically the same as the DFA, except for one important difference: the transition function $\delta$ is now a function from a state and a symbol to a set of states. This explains why we don't require that $\delta$ is defined explicitly for all states and symbols: since the image of $\delta$ consists of subsets of $Q$, we implicitly define "missing" transitions as going to the empty set $\emptyset$.

This also means that we will need to formally redefine acceptance of a word with nondeterminism. Again, we will extend our transition function to a function $\hat \delta: Q \times \Sigma^* \to 2^Q$. We define $\hat \delta(q,\varepsilon) = \{q\}$ and for $w = xa$ with $x \in \Sigma^*$ and $a \in \Sigma$, we have $$ \hat \delta(q,xa) = \bigcup_{p \in \hat \delta(q,x)} \delta(p,a).$$

However, just like how we eventually give up and abuse notation by using $\delta$ for $\hat \delta$, we also do this for subsets of states $P \subseteq Q$ and often write $$\delta(P,a) = \bigcup_{q \in P} \delta(q,a)$$ when the context is clear.

Let's take a moment to see how this definition fits with our intuition about how nondeterminism behaves. It is easy to see that by taking the union of these sets, we really do get all possible transitions. However, note that this definition also handles transitions that are missing: if $\delta(p,a) = \emptyset$, then for any set $S$, we have $S \cup \emptyset = S$, and so $\hat \delta$ remains well-defined.
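As a quick illustration, here is a Python sketch of this extended transition function. The dictionary encoding and the state names q0 through q3 are one possible rendering of the example NFA for $\{a,b\}^* b \{a,b\}^2$ from above, not notation fixed by the lecture.

```python
# delta maps (state, symbol) to a set of states; missing pairs mean the empty set.
delta = {
    ("q0", "a"): {"q0"},
    ("q0", "b"): {"q0", "q1"},           # the nondeterministic choice on b
    ("q1", "a"): {"q2"}, ("q1", "b"): {"q2"},
    ("q2", "a"): {"q3"}, ("q2", "b"): {"q3"},
}

def delta_hat(delta, states, word):
    """Extended transition function on a set of states."""
    current = set(states)
    for symbol in word:
        # Union of delta(p, symbol) over every p in the current set;
        # a state with no transition on this symbol contributes the empty set.
        current = set().union(*(delta.get((p, symbol), set()) for p in current))
    return current

print(delta_hat(delta, {"q0"}, "abba"))   # {'q0', 'q2', 'q3'} (in some order)
```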

Now, we can define acceptance under the nondeterministic model.

We say that an NFA $A$ accepts a word $w$ if $\hat \delta(i,w) \cap F \neq \emptyset$ for some initial state $i \in I$. Then the language of an NFA $A = (Q,\Sigma,\delta,I,F)$ is defined by $$ L(A) = \left\{w \in \Sigma^* \mid \hat \delta(I,w) \cap F \neq \emptyset \right\}.$$
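Acceptance is then just a check that $\hat \delta$ applied to the initial states meets $F$. This snippet reuses delta_hat and the example transitions from the previous sketch, with $I$ and $F$ chosen to match the assumed state names.

```python
I, F = {"q0"}, {"q3"}

def accepts(delta, I, F, word):
    """Accept iff some computation path from an initial state ends in a final state."""
    return bool(delta_hat(delta, I, word) & F)

print(accepts(delta, I, F, "abba"))   # True: the third-to-last symbol is b
print(accepts(delta, I, F, "baaa"))   # False: the third-to-last symbol is a
```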

To play around with nondeterminism a bit, let's think back to one of the more exotic language operations I mentioned earlier, the proportional removal $\frac 1 2 L$. We'll prove something about $L \frac 1 2$, the language of the second halves of words from $L$, instead. Proportional removals were studied by Stearns and Hartmanis in the early 60s.

Let $A$ be a DFA. Then there is an NFA that recognizes the language $$L(A) \frac 1 2 = \{y \in \Sigma^* \mid xy \in L(A), |x| = |y|\}.$$

Let's think about the approach roughly first. We need to check if our input word $y$ is the latter half of some word in our language $L(A)$. This means that there's some state $p$ of our DFA $A$ from which we can reach a final state on $y$. We also need to check that $y$ is the correct length (half of a word) by checking that there's a path of length $|y|$ from the initial state of $A$ to $p$.

It's not clear if this is even possible with a DFA. However, with an NFA, we can simply guess all the possibilities. First, we guess what $p$ could've been. Then, we run our machine in parallel, by starting to process $y$ from the state $p$ that we guessed, and by guessing a path starting from the initial state $q_0$. If we reach a final state on $y$ and find a path from $q_0$ to $p$ of the same length, then we accept.

Here's how we can imagine a machine does this: use triples of states $\langle p,q,r \rangle$. The machine guesses $p$ at the beginning and it doesn't change, so it remembers which one was guessed. The state $q$ is the state that the machine ends up in by guessing a prefix starting from the initial state $q_0$. The state $r$ is the state that the machine ends up in by processing $y$ and starting on $p$.

Let's define the construction formally.

Let $L$ be recognized by a DFA $A = (Q,\Sigma,\delta,q_0,F)$. We will construct an NFA $A' = (Q',\Sigma,\delta',I',F')$, where $Q' = Q \times Q \times Q$, $I' = \{\langle p, q_0, p \rangle \mid p \in Q\}$, $F' = \{\langle p, p, r \rangle \mid p \in Q, r \in F\}$, and for each $\langle p,q,r \rangle \in Q'$ and $a \in \Sigma$, $$\delta'(\langle p,q,r \rangle, a) = \{\langle p, \delta(q,b), \delta(r,a) \rangle \mid b \in \Sigma\}.$$

To prove this formally, we can show that $\langle p,q,r \rangle \in \delta'(I',y)$ if and only if there exists $x \in \Sigma^*$ with $|x| = |y|$ such that $\delta(q_0,x) = q$ and $\delta(p,y) = r$, by induction on $|y|$. Then we have \begin{align*} \delta'(I',y) \cap F' \neq \emptyset &\iff \exists x \in \Sigma^*, |x| = |y|, \delta(q_0,x) = p, \delta(p,y) \in F \\ &\iff \exists x \in \Sigma^*, |x| = |y|, \delta(q_0,xy) \in F \\ &\iff \exists x \in \Sigma^*, |x| = |y|, xy \in L \\ &\iff y \in L \frac 1 2. \end{align*}
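Here is a Python sketch of the whole construction, using the same dictionary-style encoding as the earlier sketches. The example DFA (for $a^* b$) and all names are illustrative choices; for this example, the second halves of words in $a^* b$ again form $a^* b$ (the second half of $a^{2m-1} b$ is $a^{m-1} b$), which the sketch checks on a few words.

```python
from itertools import product

def half_nfa(dfa):
    """NFA for L(A) 1/2 using triple states (p, q, r), following the construction above."""
    states, sigma, delta, q0, finals = dfa
    I = {(p, q0, p) for p in states}                    # guess the midpoint state p
    F = {(p, p, r) for p in states for r in finals}     # the guessed prefix really reached p
    ndelta = {}
    for p, q, r in product(states, repeat=3):
        for a in sigma:
            # r advances on the actual input symbol a; q advances on a guessed symbol b
            ndelta[((p, q, r), a)] = {(p, delta[(q, b)], delta[(r, a)]) for b in sigma}
    return I, F, ndelta

def nfa_accepts(ndelta, I, F, word):
    """Subset simulation of the constructed NFA."""
    current = set(I)
    for a in word:
        current = set().union(*(ndelta.get((s, a), set()) for s in current))
    return bool(current & F)

# Example DFA (my choice) for a*b over {a, b}, with an explicit dead state.
A = ({"s0", "s1", "dead"}, {"a", "b"},
     {("s0", "a"): "s0", ("s0", "b"): "s1",
      ("s1", "a"): "dead", ("s1", "b"): "dead",
      ("dead", "a"): "dead", ("dead", "b"): "dead"},
     "s0", {"s1"})

I, F, ndelta = half_nfa(A)
for y in ["b", "ab", "ba", ""]:
    # "b" is the second half of "ab" and "ab" of "aaab"; "ba" and "" have no such word.
    print(repr(y), nfa_accepts(ndelta, I, F, y))
```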