CMSC 28000 — Lecture 11

Our excursion into non-regularity brings up another question concerning regular languages. We've seen that languages are non-regular because they require some way of "remembering" a non-finite number of things. This leads to how we might characterize this idea of finiteness more formally. This idea also brings us to the question of whether there's such a thing as a unique automaton for every regular language or whether there's a smallest such machine. This brings to mind the notion of equivalence.

A relation $R$ over a set $S$ is a subset $R \subseteq S \times S$ and we write $x \mathrel R y$ if $(x,y) \in R$. A relation $R$ is an equivalence relation if it satisfies the following:

An equivalence relation partitions $S$ into equivalence classes. An equivalence class with representative $x$ is $$[x] = \{y \in S \mid x \mathrel R y\}.$$ The index of a relation $R$ is the number of equivalence classes of $R$; $R$ is of finite index if it has finitely many equivalence clases. For two relations $R_1, R_2$ over $S$, we say $R_1$ is a refinement of, or refines $R_2$ if $R_1 \subseteq R_2$ (i.e. $\forall x,y \in S, x \mathrel{R_1} y \rightarrow x \mathrel{R_2} y$). If $R_1$ refines $R_2$, then each equivalence class of $R_2$ is a union of equivalence classes of $R_1$.

Let $A$ be a DFA. We define the relation $\sim_A$ for words $u,v \in \Sigma^*$ by $u \sim_A v$ if and only if $\delta(q_0,u) = \delta(q_0,v)$.

In other words, we can consider two words to be equivalent under $\sim_A$ if they reach the same state. This makes sense from the perspective of the automaton: all it knows is that this word has reached a particular state, but it doesn't know anything more about the word that got it there.

$\sim_A$ is reflexive, since $x \sim_A x$ if and only if $\delta(q_0,x) = \delta(q_0,x)$.
$\sim_A$ is symmetric: $$x \sim_A y \iff \delta(q_0,x) = \delta(q_0,y) \iff \delta(q_0,y) = \delta(q_0,x) \iff y \sim_A x.$$
$\sim_A$ is transitive: if $x \sim_A y$ and $y \sim_A z$, then $\delta(q_0,x) = \delta(q_0,y)$ and $\delta(q_0,y) = \delta(q_0,z)$, and therefore, $\delta(q_0,x) = \delta(q_0,z)$ and $x \sim_A z$.

Therefore, $\sim_A$ is an equivalence relation.

There is an equivalence class for each state of $A$, therefore the index of $\sim_A$ is at most $|Q|$ and is finite.

We say that a relation $\sim$ is right invariant if $\forall x,y,z, x \sim y \rightarrow xz \sim yz$.

Suppose $x \sim_A y$ and let $z \in \Sigma^*$. Then \begin{align*} \delta(q_0,xz) &= \delta(\delta(q_0,x), z) \\ &= \delta(\delta(q_0,y), z) \\ &= \delta(q_0,yz) \end{align*} Therefore, $xz \sim_A yz$ and therefore $\sim_A$ is right-invariant.

To show this, we show if $x \sim_A y$, then $x \in L \iff y \in L$. That is, if $x$ and $y$ are in the same equivalence class, they must both be in the language. This means that if $x \in L$, then the entire equivalence class $[x]$ is contained in $L$. Suppose $x \sim_A y$. Then $$x \in L \iff \delta(q_0,x) \in F \iff \delta(q_0,y) \in F \iff y \in L.$$

Let $L \subseteq \Sigma^*$ be a language. We define the relation $\equiv_L$ for words $x,y \in \Sigma^*$ by $x \equiv_L y$ if and only if $\forall z \in \Sigma^*, xz \in L \iff yz \in L$. $\equiv_L$ is the Myhill–Nerode equivalence relation.

Note that we can define $\equiv_L$ for any language, not just those that are regular. It's also important to note that the equivalence relation applies to all words, not just those in $L$.

Reflexivity: $xu \in L \iff xu \in L$ and therefore $x \equiv_L x$.
Symmetry: \begin{align*} x \equiv_L y &\iff \forall u \in \Sigma^*, (xu \in L \iff yu \in L) \\ &\iff \forall u \in \Sigma^*, (yu \in L \iff xu \in L) \\ &\iff y \equiv_L x \\ \end{align*}
Transitivity: Suppose $x \equiv_L y$ and $y \equiv_L z$. Then for all $u \in \Sigma^*$, $xu \in L \iff yu \in L$ and $yu \in L \iff zu \in L$. Thus, $xu \in L \iff zu \in L$ and therefore $x \equiv_L z$.

Suppose that $x \equiv_L y$. Then $xu \in L \iff yu \in L$ for all $u \in \Sigma^*$. Let $u = vz$. Then, $$x(vz) \in L \iff (xv)z \in L \iff (yv)z \in L \iff y(vz) \in L.$$

Let $L \subseteq \Sigma^*$, and let $\approx$ be any right-invariant equivalence relation on $\Sigma^*$ such that $L$ is the union of equivalence classes of $\approx$. Then $\approx$ refines $\equiv_L$.

Suppose $x \approx y$. Then $xu \approx yu$ for all $u \in \Sigma^*$. Since $L$ is the union of equivalence classes of $\approx$, we have $xu \in L$ if and only if $yu \in L$. Therefore, $x \equiv_L y$.

$(1) \implies (2)$: If $L$ is regular, then there exists a DFA $A$ that recognizes $L$. Then $L$ is the union of equivalence classes of $\sim_A$, which is a right invariant equivalence relation of finite index.

$(2) \implies (3)$: By Lemma 10.11, $\approx$ refines $\equiv_L$. Since $\approx$ is of finite index, and the index of $\equiv_L$ is at most the index of $\approx$, $\equiv_L$ must also be of finite index.

$(3) \implies (1)$: We define a DFA $A_{\equiv_L} = (Q,\Sigma,\delta,q_0,F)$ that recognizes $L$, with

$Q = \{[x] \mid x \in \Sigma^*\}$,
$q_0 = [\varepsilon]$,
$F = \{[x] \mid x \in L\}$,
$\delta([x],a) = [xa]$.

Here, the states of our DFA are exactly the equivalence classes of $\equiv_L$. Since $\equiv_L$ has finite index, this means the machine we defined has a finite number of states.

To show that $\delta$ is well-defined for our equivalence classes, suppose that $x \equiv_L y$. Since $\equiv_L$ is right-invariant, we have $xa \equiv_L ya$ for $a \in \Sigma$. Then, $$\delta([x],a) = [xa] = [ya] = \delta([y],a).$$

Then to show that $L(A_{\equiv_L}) = L$, we first see that by induction on $|x|$, we have $\delta([\varepsilon],x) = [x]$ for all $x \in \Sigma^*$. Then for $w \in \Sigma^*$, \begin{align*} w \in L(A_{\equiv_L}) & \iff \delta([\varepsilon],w) \in F \\ &\iff [w] \in F \\ &\iff w \in L \end{align*}

The Myhill–Nerode theorem is called as such because it was independently proved by John Myhill and Anil Nerode in the late 50s. Myhill was a professor at SUNY Buffalo. Nerode is currently a professor at Cornell, where he's been a faculty member for 60 years. More importantly, he proved this theorem while he was a PhD student here.

There are some significant consequences of the Myhill–Nerode theorem. The first is that it gives us another way to determine whether a language is regular or not.

Let $L = \{a^n b^n \mid n \in \mathbb N\}$. We claim that the equivalence classes of $\equiv_L$ are $$\{[\varepsilon], [a], [aa], \dots\} \cup \{[b],[ab],[aab],\dots\}.$$ We need to show that each of these classes are actually different and so we need to show that they are pairwise distinct. To show that two classes $[u]$ and $[v]$ are different, we show that there exists a word $w$ such that $uw \in L$ and $vw \not \in L$.

This shows that there are at least as many equivalence classes as we've described, but to fully characterize the language, we want to make sure that we're not missing any other possible classes. Consider a possible equivalence class defined by a word $w$. There are two possibilities.

It's clear from this that $L$ has infinitely many equivalence classes and so it can't be regular. However, if we could build an infinite automaton, the Myhill–Nerode equivalence classes would tell us how to do it (we will not do this).

CMSC 28000 — Lecture 11

The Myhill–Nerode Theorem