Our excursion into non-regularity brings up another question concerning regular languages. We've seen that languages are non-regular because they require some way of "remembering" a non-finite number of things. This leads to how we might characterize this idea of finiteness more formally. This idea also brings us to the question of whether there's such a thing as a unique automaton for every regular language or whether there's a smallest such machine. This brings to mind the notion of equivalence.
A relation $R$ over a set $S$ is a subset $R \subseteq S \times S$ and we write $x \mathrel R y$ if $(x,y) \in R$. A relation $R$ is an equivalence relation if it satisfies the following:
An equivalence relation partitions $S$ into equivalence classes. An equivalence class with representative $x$ is $$[x] = \{y \in S \mid x \mathrel R y\}.$$ The index of a relation $R$ is the number of equivalence classes of $R$; $R$ is of finite index if it has finitely many equivalence clases. For two relations $R_1, R_2$ over $S$, we say $R_1$ is a refinement of, or refines $R_2$ if $R_1 \subseteq R_2$ (i.e. $\forall x,y \in S, x \mathrel{R_1} y \rightarrow x \mathrel{R_2} y$). If $R_1$ refines $R_2$, then each equivalence class of $R_2$ is a union of equivalence classes of $R_1$.
Let $A$ be a DFA. We define the relation $\sim_A$ for words $u,v \in \Sigma^*$ by $u \sim_A v$ if and only if $\delta(q_0,u) = \delta(q_0,v)$.
In other words, we can consider two words to be equivalent under $\sim_A$ if they reach the same state. This makes sense from the perspective of the automaton: all it knows is that this word has reached a particular state, but it doesn't know anything more about the word that got it there.
The relation $\sim_A$ is an equivalence relation on $\Sigma^*$.
Therefore, $\sim_A$ is an equivalence relation.
The relation $\sim_A$ has finite index.
There is an equivalence class for each state of $A$, therefore the index of $\sim_A$ is at most $|Q|$ and is finite.
We say that a relation $\sim$ is right invariant if $\forall x,y,z, x \sim y \rightarrow xz \sim yz$.
The relation $\sim_A$ is right-invariant.
Suppose $x \sim_A y$ and let $z \in \Sigma^*$. Then \begin{align*} \delta(q_0,xz) &= \delta(\delta(q_0,x), z) \\ &= \delta(\delta(q_0,y), z) \\ &= \delta(q_0,yz) \end{align*} Therefore, $xz \sim_A yz$ and therefore $\sim_A$ is right-invariant.
The language $L(A)$ is the union of equivalence classes of $\sim_A$.
To show this, we show if $x \sim_A y$, then $x \in L \iff y \in L$. That is, if $x$ and $y$ are in the same equivalence class, they must both be in the language. This means that if $x \in L$, then the entire equivalence class $[x]$ is contained in $L$. Suppose $x \sim_A y$. Then $$x \in L \iff \delta(q_0,x) \in F \iff \delta(q_0,y) \in F \iff y \in L.$$
Let $L \subseteq \Sigma^*$ be a language. We define the relation $\equiv_L$ for words $x,y \in \Sigma^*$ by $x \equiv_L y$ if and only if $\forall z \in \Sigma^*, xz \in L \iff yz \in L$. $\equiv_L$ is the Myhill–Nerode equivalence relation.
Note that we can define $\equiv_L$ for any language, not just those that are regular. It's also important to note that the equivalence relation applies to all words, not just those in $L$.
$\equiv_L$ is an equivalence relation.
$\equiv_L$ is right-invariant.
Suppose that $x \equiv_L y$. Then $xu \in L \iff yu \in L$ for all $u \in \Sigma^*$. Let $u = vz$. Then, $$x(vz) \in L \iff (xv)z \in L \iff (yv)z \in L \iff y(vz) \in L.$$
Let $L \subseteq \Sigma^*$, and let $\approx$ be any right-invariant equivalence relation on $\Sigma^*$ such that $L$ is the union of equivalence classes of $\approx$. Then $\approx$ refines $\equiv_L$.
Suppose $x \approx y$. Then $xu \approx yu$ for all $u \in \Sigma^*$. Since $L$ is the union of equivalence classes of $\approx$, we have $xu \in L$ if and only if $yu \in L$. Therefore, $x \equiv_L y$.
Let $L \subseteq \Sigma^*$ be a language. The following are equivalent:
$(1) \implies (2)$: If $L$ is regular, then there exists a DFA $A$ that recognizes $L$. Then $L$ is the union of equivalence classes of $\sim_A$, which is a right invariant equivalence relation of finite index.
$(2) \implies (3)$: By Lemma 10.11, $\approx$ refines $\equiv_L$. Since $\approx$ is of finite index, and the index of $\equiv_L$ is at most the index of $\approx$, $\equiv_L$ must also be of finite index.
$(3) \implies (1)$: We define a DFA $A_{\equiv_L} = (Q,\Sigma,\delta,q_0,F)$ that recognizes $L$, with
Here, the states of our DFA are exactly the equivalence classes of $\equiv_L$. Since $\equiv_L$ has finite index, this means the machine we defined has a finite number of states.
To show that $\delta$ is well-defined for our equivalence classes, suppose that $x \equiv_L y$. Since $\equiv_L$ is right-invariant, we have $xa \equiv_L ya$ for $a \in \Sigma$. Then, $$\delta([x],a) = [xa] = [ya] = \delta([y],a).$$
Then to show that $L(A_{\equiv_L}) = L$, we first see that by induction on $|x|$, we have $\delta([\varepsilon],x) = [x]$ for all $x \in \Sigma^*$. Then for $w \in \Sigma^*$, \begin{align*} w \in L(A_{\equiv_L}) & \iff \delta([\varepsilon],w) \in F \\ &\iff [w] \in F \\ &\iff w \in L \end{align*}
The Myhill–Nerode theorem is called as such because it was independently proved by John Myhill and Anil Nerode in the late 50s. Myhill was a professor at SUNY Buffalo. Nerode is currently a professor at Cornell, where he's been a faculty member for 60 years. More importantly, he proved this theorem while he was a PhD student here.