CMSC 28000 — Lecture 18

Now, we'll consider the other direction. We won't go through all the details this time.

Let $M$ be a pushdown automaton. Then there exists a context-free grammar $G$ such that $L(G) = N(M)$.

Let $M = (Q,\Sigma,\Gamma,\delta,q_0,Z_0)$ be a PDA that accepts by empty stack. We want to simulate the process of performing a computation on $M$ via a grammar $G$. Consider the first step of such a computation $$(q,au,X) \vdash (q',u,Y_1 \cdots Y_k)$$ where $q,q' \in Q$, $u \in \Sigma^*$, $a \in \Sigma \cup \{\varepsilon\}$, and $X, Y_1, \dots, Y_k \in \Gamma$.

In this computation, we follow the transition $(q',Y_1 \cdots Y_k) \in \delta(q,a,X)$. The computation then proceeds, and at some point we end up in a configuration $(q_1,u_1,Y_2 \cdots Y_k)$. We continue the computation, removing each $Y_i$ and consuming the input word until we end up in a configuration $(r,\varepsilon,\varepsilon)$ for some state $r \in Q$.

We will construct a CFG $G = (V,\Sigma,P,S)$ that generates $N(M)$. First, we define the set of variables $V$ by $$ V = \{S\} \cup \{A_{q,X,r} \mid q,r \in Q, X \in \Gamma\}.$$

The idea is that from $A_{q,X,r}$, one can construct a derivation $A_{q,X,r} \Rightarrow^* w$ for a word $w$ if and only if $(q,w,X) \vdash^* (r,\varepsilon,\varepsilon)$. In a sense, the variable $A_{q,X,r}$ represents a computation from $q$ to $r$ that consumes the input $w$ and has the net effect of popping $X$ from the stack. We then need to define our set of productions appropriately.
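As a sanity check on this invariant, here is a small Python sketch (not part of the construction itself) that searches for computations $(q,w,X\gamma) \vdash^* (r,\varepsilon,\varepsilon)$ directly. The dictionary encoding of $\delta$ and the depth bound are assumptions made for illustration.

```python
def reaches_empty(delta, q, w, stack, depth=10):
    """Return the set of states r with (q, w, stack) |-* (r, eps, eps).

    delta maps (q, a, X) to a set of moves (r, gamma), where a == ''
    denotes an eps-move and gamma is the tuple of symbols pushed.
    stack is a tuple with the top of the stack at index 0. The depth
    bound is a simplification to keep this sketch's search finite.
    """
    if not stack:
        # Empty stack: accept only if the input is fully consumed.
        return {q} if not w else set()
    if depth == 0:
        return set()
    X, rest = stack[0], stack[1:]
    results = set()
    # Input-consuming moves on the first symbol of w.
    if w:
        for (r, gamma) in delta.get((q, w[0], X), set()):
            results |= reaches_empty(delta, r, w[1:], gamma + rest, depth - 1)
    # eps-moves leave the input untouched.
    for (r, gamma) in delta.get((q, '', X), set()):
        results |= reaches_empty(delta, r, w, gamma + rest, depth - 1)
    return results
```

For instance, with a two-state PDA for $\{a^n b^n \mid n \geq 1\}$ accepting by empty stack, `reaches_empty` reports exactly the states reachable with both input and stack exhausted.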

Consider a transition $(r,Y_1 \cdots Y_k) \in \delta(q,a,X)$ as above. If $k = 0$, so that the transition is $(r,\varepsilon) \in \delta(q,a,X)$, then we add the production $$A_{q,X,r} \to a.$$ This represents the action of reading a symbol $a$, popping $X$ off the stack, and moving to state $r$.

If $k \geq 1$, then we add productions of the form $$A_{q,X,r_k} \to aA_{r,Y_1,r_1} A_{r_1,Y_2,r_2} \cdots A_{r_{k-1},Y_k,r_k}$$ for all possible choices of $r_1, \dots, r_k \in Q$. Be careful with the subscripts here: although the transition moves to the state $r$, the variable on the left-hand side of the production involves the state $r_k$; the state $r$ appears in the first variable on the right-hand side. This represents the action of reading a symbol $a$, popping $X$ from the stack, pushing $Y_1 \cdots Y_k$ onto the stack while moving to state $r$, and then guessing a sequence of states $r_1, \dots, r_k$ through which the machine passes as each $Y_i$ is eventually popped.

One might worry whether we can actually take all possible sequences of states, since it seems like there could be infinitely many of them. However, note that each of these productions is based on a single transition and the number $k$ of stack symbols that transition pushes onto the stack. Since $k$ is fixed, there are only $|Q|^k$ choices of $r_1, \dots, r_k$ to consider. Not all of the resulting variables will be useful; the useful ones are exactly those that admit a valid derivation.

Since we want to generate all words on which the PDA $M$ reaches a configuration with an empty stack, we add the productions $$S \to A_{q_0,Z_0,q}$$ for all states $q \in Q$, where $q_0$ is the start state. From this, we have $(q_0,w,Z_0) \vdash^* (q,\varepsilon,\varepsilon)$ for some $q \in Q$ if and only if $S \Rightarrow A_{q_0,Z_0,q} \Rightarrow^* w$.
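The construction can be carried out entirely mechanically. The following Python sketch builds the productions of $G$ from a PDA's transition function; the encoding of variables $A_{q,X,r}$ as tuples `('A', q, X, r)`, of $\delta$ as a dictionary, and of $\varepsilon$ as the empty string are assumptions made for illustration.

```python
from itertools import product

def pda_to_cfg(states, delta, q0, Z0):
    """Build the productions of G from a PDA accepting by empty stack.

    delta maps (q, a, X) to a set of moves (r, gamma), where gamma is
    the tuple of stack symbols pushed (an empty tuple means pop only)
    and a == '' denotes an eps-move. Productions are (lhs, body) pairs.
    """
    productions = []
    # Start productions: S -> A_{q0, Z0, q} for every state q.
    for q in states:
        productions.append(('S', [('A', q0, Z0, q)]))
    for (q, a, X), moves in delta.items():
        for (r, gamma) in moves:
            k = len(gamma)
            if k == 0:
                # Popping X while reading a: A_{q,X,r} -> a
                productions.append((('A', q, X, r), [a] if a else []))
            else:
                # One production for every choice of r_1, ..., r_k in Q.
                for rs in product(states, repeat=k):
                    lhs = ('A', q, X, rs[-1])
                    body = [a] if a else []
                    prev = r  # the state entered by the transition
                    for Y, ri in zip(gamma, rs):
                        body.append(('A', prev, Y, ri))
                        prev = ri
                    productions.append((lhs, body))
    return productions
```

For a one-state PDA with the single transition $(q,\varepsilon) \in \delta(q,a,Z_0)$, this yields exactly the two productions $S \to A_{q,Z_0,q}$ and $A_{q,Z_0,q} \to a$.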

Closure properties for context-free languages

We haven't yet discussed closure properties for context-free languages in depth. We will see that these are not quite as clear-cut as with regular languages. That is, regular languages happen to be quite robust under many operations; it is surprisingly hard to find a natural language operation under which the regular languages are not closed.

To prove closure properties, we apply the same methods as before: show that there's a representation for our language resulting from the operation. For context-free grammars, this typically means constructing a CFG from other CFGs.

The class of context-free languages is closed under the regular operations.

Let $G_1 = (V_1,\Sigma,P_1,S_1)$ and $G_2 = (V_2, \Sigma,P_2,S_2)$ be context-free grammars. We can construct new grammars for each regular operation.

For union, let $S$ be a new start variable and take $$G_\cup = (V_1 \cup V_2 \cup \{S\}, \Sigma, P_1 \cup P_2 \cup \{S \to S_1 \mid S_2\}, S),$$ where we assume $V_1 \cap V_2 = \emptyset$, renaming variables if necessary. For concatenation, we instead add the production $S \to S_1 S_2$, and for the Kleene star of $L(G_1)$, we take the productions $P_1 \cup \{S \to S S_1 \mid \varepsilon\}$.

To prove this formally, we would need to go through an inductive argument; this makes a nice exercise.
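These standard constructions (a fresh start variable $S$ with $S \to S_1 \mid S_2$ for union, $S \to S_1 S_2$ for concatenation, and $S \to S S_1 \mid \varepsilon$ for star) can be sketched in Python. The representation of a grammar as a pair of a start symbol and a production dictionary is an assumption made for illustration.

```python
def regular_ops(g1, g2):
    """Return grammars for union, concatenation, and star of L(g1).

    A grammar is a pair (start, prods), where prods maps each variable
    to a list of bodies (tuples of symbols). We assume the two grammars
    use disjoint variable names and that 'S' is fresh in both.
    """
    s1, p1 = g1
    s2, p2 = g2
    # Union: S -> S1 | S2
    union = ('S', {**p1, **p2, 'S': [(s1,), (s2,)]})
    # Concatenation: S -> S1 S2
    concat = ('S', {**p1, **p2, 'S': [(s1, s2)]})
    # Star of L(g1): S -> S S1 | eps  (the empty tuple is eps)
    star = ('S', {**p1, 'S': [('S', s1), ()]})
    return union, concat, star
```

Note that the disjointness assumption matters: if $V_1$ and $V_2$ shared a variable, merging the production dictionaries would silently conflate unrelated derivations.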

However, just as with regular languages, we now have multiple equivalent representations for context-free languages, and some things are easier to prove in one model than in the other. With pushdown automata in hand, having proved their equivalence with context-free grammars, we can now prove things about context-free languages that would be too cumbersome to do with grammars alone.

For instance, we just showed that CFLs are closed under union, concatenation, and star fairly easily with grammars. But what about, say, intersection? Here, we'll show that CFLs are closed under intersection with a regular language. This might seem strange at first, but it turns out to be quite handy.

Let $L \subseteq \Sigma^*$ be a context-free language and let $R \subseteq \Sigma^*$ be a regular language. Then $L \cap R$ is a context-free language.

Let $A = (Q_A,\Sigma,\Gamma,\delta_A,s_A,Z_0,F_A)$ be a PDA that accepts $L$ and let $B = (Q_B, \Sigma, \delta_B, s_B, F_B)$ be a DFA that recognizes $R$. We will construct a PDA $C = (Q_C, \Sigma, \Gamma, \delta_C, s_C, Z_0, F_C)$ that recognizes $L \cap R$. We take $Q_C = Q_A \times Q_B$, $s_C = \langle s_A, s_B \rangle$, and $F_C = F_A \times F_B$. The transition function is defined, for all $p \in Q_A$, $q \in Q_B$, $a \in \Sigma$, and $X \in \Gamma$, by $$\delta_C(\langle p,q \rangle, a, X) = \{(\langle p', \delta_B(q,a) \rangle, \gamma) \mid (p',\gamma) \in \delta_A(p,a,X)\}$$ and $$\delta_C(\langle p,q \rangle, \varepsilon, X) = \{(\langle p', q \rangle, \gamma) \mid (p',\gamma) \in \delta_A(p,\varepsilon,X)\}.$$

The PDA $C$ works by simultaneously simulating a computation of $A$ and a computation of $B$. The first component of each state of $C$ tracks the current state of $A$, with the stack used as normal, while the second component tracks the current state of $B$. In addition, when $A$ makes an $\varepsilon$-transition, the state of $B$ does not change.

It is not too difficult to see that in this way, $C$ can move simultaneously through $A$ and $B$. The set of accepting states of $C$ consists of the pairs of states from $F_A$ and $F_B$. In other words, a word $w$ accepted by $C$ must have a computation that ends in a state $\langle p,q \rangle$ with $p \in F_A$ and $q \in F_B$; that is, there is a computation for $w$ on $A$ that ends in a final state and a computation for $w$ on $B$ that ends in a final state. Thus, $C$ accepts $w$ if and only if $w$ is accepted by both $A$ and $B$, which is the case if and only if $w \in L \cap R$. Therefore, $C$ is a PDA that accepts $L \cap R$, and $L \cap R$ is a context-free language.
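The product construction can be sketched in Python, assuming $\delta_A$ and $\delta_B$ are given as dictionaries (a hypothetical encoding, with `''` standing for $\varepsilon$); only $\delta_C$ is computed here, since the state set, start state, and accepting states of $C$ are simply pairwise products.

```python
def product_delta(delta_A, delta_B, states_A, states_B, stack_syms, alphabet):
    """Build the transition function of the product PDA C.

    delta_A maps (p, a, X) to a set of moves (p', gamma); a == '' is eps.
    delta_B is a total DFA transition function mapping (q, a) to q'.
    States of C are pairs (p, q).
    """
    delta_C = {}
    for p in states_A:
        for q in states_B:
            for X in stack_syms:
                # Input-consuming moves: both machines advance together.
                for a in alphabet:
                    moves = delta_A.get((p, a, X), set())
                    if moves:
                        delta_C[((p, q), a, X)] = {
                            ((p2, delta_B[(q, a)]), gamma)
                            for (p2, gamma) in moves
                        }
                # eps-moves of A leave B's state unchanged.
                moves = delta_A.get((p, '', X), set())
                if moves:
                    delta_C[((p, q), '', X)] = {
                        ((p2, q), gamma) for (p2, gamma) in moves
                    }
    return delta_C
```

The key design point is visible in the two branches: on a real input symbol, the DFA component is driven by $\delta_B$, but on an $\varepsilon$-move the second component is copied through untouched, exactly as in the definition of $\delta_C$ above.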

Why would you want to take an intersection with a regular language? Recall that we can treat intersection with a language as a kind of filter. Regular languages happen to be quite handy for this; they're simple enough to describe, and a lot of common "patterns" can be expressed as regular languages.

But what about the intersection of two context-free languages? That poses more of a challenge. For one thing, our product construction approach fails: there's no obvious way to "share" a stack between the two machines. For now, we're stuck and will have to keep this question in the back of our minds.