CMSC 28000 — Lecture 2

Here is a property of reversal that we can prove: reversing the concatenation of two strings is the same as reversing each string first and then concatenating the reversals in the opposite order. This seems obvious, but it still deserves a proof.

$(xy)^R = y^R x^R$.

We will show this by induction on $x$.

For our base case, let $x = \varepsilon$. We have \begin{align*} (xy)^R &= (\varepsilon y)^R &\text{substitution}\\ &= y^R &\text{definition of concatenation} \\ &= y^R \varepsilon &\text{$\varepsilon$ is the identity for concatenation}\\ &= y^R \varepsilon^R &\text{definition of reversal}\\ &= y^R x^R & \text{substitution} \end{align*}

Now, for our induction step, let $x = az$ for some symbol $a \in \Sigma$ and word $z \in \Sigma^*$, and assume that $(zy)^R = y^R z^R$. Then we have \begin{align*} (xy)^R &= ((az)y)^R &\text{substitution}\\ &= (a(zy))^R &\text{definition of concatenation}\\ &= (zy)^R a &\text{definition of reversal}\\ &= (y^R z^R) a &\text{inductive hypothesis}\\ &= y^R (z^R a) &\text{associativity}\\ &= y^R (az)^R &\text{definition of reversal}\\ &= y^R x^R &\text{substitution} \end{align*}
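The identity we just proved is easy to sanity-check mechanically. Here is a small Python sketch that mirrors the recursive definition of reversal ($\varepsilon^R = \varepsilon$ and $(ax)^R = x^R a$) and checks the identity on a few sample strings; the particular test strings are arbitrary choices, not anything from the proof.

```python
def reverse(w):
    """Reversal following the recursive definition:
    epsilon^R = epsilon, and (a x)^R = x^R a."""
    if w == "":
        return ""
    return reverse(w[1:]) + w[0]

# Check (xy)^R = y^R x^R on a handful of strings.
for x in ["", "a", "ab", "abc"]:
    for y in ["", "b", "ba", "cab"]:
        assert reverse(x + y) == reverse(y) + reverse(x)
```

Of course, a finite check is no substitute for the induction above; it only illustrates what the statement says.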

 

Let's note a few things about this proof. It gets into more detail than you'd typically be required to give in this course, but the steps are here to illustrate some of the subtleties of working with strings. For instance, notice that we make use of the associativity of concatenation, and that in doing so we've implicitly treated the symbol $a$ as a string of length 1.

This proof was accomplished via induction directly on strings. We can do this because strings are recursive objects. But something you often see is induction that's more indirect, on the length of the string. In principle, this is really the same thing: our base case would be length 0, which corresponds to $\varepsilon$, and the inductive case, for strings $w$ with $|w| > 0$, would correspond to strings of the form $w = ax$ with $a \in \Sigma$ and $|x| = |w| - 1$.

Languages

As we've seen earlier, languages in the formal language theoretic sense are just sets of strings and we describe a language in the same way as for any other set.

Here are some examples of languages over the alphabet $\{a,b\}$: the finite language $\{a, ab, bba\}$, the set of all strings of even length, and the set of all strings containing $ab$ as a substring.

We can define the notion of a language a bit more formally, with respect to an alphabet $\Sigma$.

We denote by $\Sigma^*$ the set of all finite words over the alphabet $\Sigma$. A language $L$ over $\Sigma$ is a subset $L \subseteq \Sigma^*$.

In other words, we fix some alphabet $\Sigma$ and take $\Sigma^*$ to be our universe.

You might notice that the more curious definition is for $\Sigma^*$. If you recall our brief diversion about monoids, you might recognize that $\Sigma^*$ is really just the free monoid on $\Sigma$.

Since languages are sets, we can perform any set theoretic operations we're familiar with, like union and intersection.

For a language $L \subseteq \Sigma^*$, the complement of $L$ is $\overline L = \Sigma^* \setminus L$.

This also means there are a few things we have to be careful with. For example, there is the notion of the empty language $\emptyset$, which is the language with no strings in it. This is different from the language $\{\varepsilon\}$, which is the language containing the empty string.

We can also extend many operations on strings to operations on languages.

For two languages $L_1$ and $L_2$, we define the concatenation of $L_1$ and $L_2$ by $$ L_1 L_2 = \{uv \mid u \in L_1, v \in L_2 \}.$$

If $L_1 = \{\varepsilon, ab, abb, abc\}$ and $L_2 = \{ca, bac\}$, then \[L_1 L_2 = \{ca, bac, abca, abbac, abbca, abbbac, abcca, abcbac\}.\]

What exactly is the difference between $\emptyset$ and $\{\varepsilon\}$? Concatenation gives us one example of why this difference matters. Consider a language $L$ and the languages $L\emptyset$ and $L\{\varepsilon\}$. Since $\emptyset$ contains no strings, there is no string $v \in \emptyset$ to concatenate with, so $L\emptyset = \emptyset$. On the other hand, $\varepsilon$ is the identity for concatenation, so $L\{\varepsilon\} = L$.
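We can make both points concrete with a short Python sketch, representing languages as sets of strings and $\varepsilon$ as the empty string (the helper name `concat` is mine, not standard notation):

```python
def concat(L1, L2):
    """Concatenation of languages: L1 L2 = { uv : u in L1, v in L2 }."""
    return {u + v for u in L1 for v in L2}

# The worked example from above; "" plays the role of epsilon.
L1 = {"", "ab", "abb", "abc"}
L2 = {"ca", "bac"}
print(sorted(concat(L1, L2)))

# The difference between the empty language and {epsilon}:
L = {"a", "ab"}
print(concat(L, set()))   # L . emptyset = emptyset: nothing to pair with
print(concat(L, {""}))    # L . {epsilon} = L: epsilon is the identity
```

Running this reproduces the eight-string language $L_1 L_2$ from the example, the empty set, and $L$ itself.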

This allows us to define powers for languages.

For a language $L \subseteq \Sigma^*$, the $k$th power of $L$ is defined inductively by $L^0 = \{\varepsilon\}$ and $L^k = L^{k-1} L$ for $k \geq 1$.

This definition leads to an interesting operation.

The Kleene star or Kleene closure of $L$ is defined by $$ L^* = \bigcup_{i \geq 0} L^i.$$ We define the positive Kleene closure of $L$ by $L^+ = LL^*$. One can see that this is just the nonzero powers of $L$, so we can also think of this as $$ L^+ = \bigcup_{i \geq 1} L^i.$$
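Since $L^*$ is infinite whenever $L$ contains a nonempty string, we can't compute it outright, but we can compute any finite prefix of the union. Here is a sketch (the function names are my own) that builds $L^k$ from the inductive definition and then takes the union $L^0 \cup \cdots \cup L^n$ as a bounded approximation of $L^*$:

```python
def power(L, k):
    """L^0 = {epsilon}, L^k = L^{k-1} L."""
    result = {""}
    for _ in range(k):
        result = {u + v for u in result for v in L}
    return result

def star_up_to(L, n):
    """The union L^0 | L^1 | ... | L^n, a finite slice of L*."""
    out = set()
    for k in range(n + 1):
        out |= power(L, k)
    return out

L = {"ab", "bc"}
print(sorted(power(L, 2)))            # ['abab', 'abbc', 'bcab', 'bcbc']
print("ababbc" in star_up_to(L, 3))   # True: ababbc = ab . ab . bc
```

Note that $\varepsilon \in L^*$ for every language $L$, since $L^0 = \{\varepsilon\}$ is always part of the union.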

Let $\Sigma = \{a,b,c\}$. Then $L = \{ab, bc\}$ is a language over $\Sigma$. If we take the Kleene star of $L$, we get the language made up of combinations of $ab$ and $bc$, like $ababbcabbcbcab$.

One use of this is for constructing codes. In formal language theory, a code over an alphabet $\Sigma$ is a language $K \subseteq \Sigma^+$ such that if $w \in K^+$ then $w$ has a unique factorization into words of $K$ (here is a use of "factor" as "subword"). It's not hard to see that not every language is a code.
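One way to see that a particular language fails to be a code is to count factorizations. A standard dynamic program does this: let $d_i$ be the number of ways to factor the first $i$ symbols of $w$ into words of $K$. The sketch below (with a helper name of my choosing) shows that $\{a, b, ab\}$ is not a code, since $ab$ factors both as $a \cdot b$ and as $ab$, while $\{ab, bc\}$ is a code (neither word is a prefix of the other).

```python
def count_factorizations(w, K):
    """dp[i] = number of ways to factor w[:i] into words of K."""
    dp = [0] * (len(w) + 1)
    dp[0] = 1  # the empty string has exactly one (empty) factorization
    for i in range(1, len(w) + 1):
        for u in K:
            if i >= len(u) and w[i - len(u):i] == u:
                dp[i] += dp[i - len(u)]
    return dp[len(w)]

print(count_factorizations("ab", {"a", "b", "ab"}))  # 2: a.b and ab
print(count_factorizations("abbc", {"ab", "bc"}))    # 1: ab.bc only
```

A language $K$ is a code exactly when every $w \in K^+$ gets count 1 from this computation.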

Finite automata

Finite automata are arguably the simplest model of computation. Finite automata first appeared as far back as the 1940s in the context of early neural networks. The idea was that we wanted to represent an organism or robot of some kind responding to some stimulus, which we can also call events. Upon the occurrence of an event, the robot would change its internal state in response and there would be a finite number of possible such states. Today, the use of finite automata for modelling such systems is still around in a branch of control theory in engineering under the name discrete event systems. But all of this has nothing to do with how we're going to consider finite automata.

First, we'll introduce the idea of a "machine" as used in the theory of computation, which we'll be coming back to throughout the course. This idea originates from Turing in 1936, as what you may have heard of as a Turing machine. A machine has an input tape, divided into cells, with an input word written on it. The machine has a tape head that can scan the input tape, reading a symbol from each cell.

In the full Turing machine model, the machine can move the tape head left and right and the machine can write to the tape. Instead, we will consider a simplified model where the machine can only read off of the tape and it can only scan to the right—the tape head can never move to the left. This simplified model is what we call a finite automaton.

The "finite control" of a machine is a finite set of instructions that determine how the machine operates. This is often represented as a series of states. The machine keeps track of its state and the finite control contains rules for when the machine moves to another state, based on what it reads off of the input tape.

Upon reading the entire input, the machine accepts or rejects the input word based on the state that the machine is in. This means that we can think of the machine as describing a set of strings—those that are accepted by the machine. This is the language of the machine and we can think of such a machine as computing the problem that its language corresponds to.

We call these finite automata because they have only a finite number of states. And since the tape head only ever moves to the right, reading each symbol exactly once, it's not really necessary to think about the tape portion of the machine, at least for now.

We will give a formal definition for finite automata.

A deterministic finite automaton (DFA) is defined as a 5-tuple $A = (Q,\Sigma,\delta,q_0,F)$, where $Q$ is a finite set of states, $\Sigma$ is a finite alphabet, $\delta : Q \times \Sigma \to Q$ is the transition function, $q_0 \in Q$ is the initial state, and $F \subseteq Q$ is the set of accepting (or final) states.

Note that by our definition, $\delta$ must be defined for every state and input symbol. We call this a deterministic finite automaton because $\delta$ is a function: the next state is uniquely determined by the current state and input symbol.

We can represent finite automata by drawing the important part of the machine: the state diagram. We draw each state as a circle and label it as appropriate. Transitions are represented as arrows, leaving one state and entering another, labelled by the appropriate input symbol. We denote accepting states by a double circle and we denote the start state by an arrow that comes in from nowhere in particular.

Suppose this DFA is $A = (Q, \Sigma, \delta, q_0, F)$. We can list all the parts that we've specified in our formal definition: the state set is $Q = \{0,1,2,3\}$, the alphabet is $\Sigma = \{a,b,c\}$, the initial state is $q_0 = 0$, the set of accepting states is $F = \{3\}$,

and $\delta$ is given by

\[
\begin{array}{c|ccc}
\delta & a & b & c \\ \hline
0 & 0 & 0 & 1 \\
1 & 0 & 2 & 1 \\
2 & 3 & 0 & 1 \\
3 & 3 & 3 & 3
\end{array}
\]

What does this machine do? We take a string and read it symbol by symbol and trace where we go based on the character. We see that strings that contain the substring $cba$ will end up in state 3, which is an accepting state, while those that don't never reach state 3. In other words, this machine determines whether a string over the alphabet $\{a,b,c\}$ contains the substring $cba$.

For example, suppose we want to run this machine on the string $bccbcbab$. We can view a computation on this machine by \[0 \xrightarrow{b} 0 \xrightarrow{c} 1 \xrightarrow{c} 1 \xrightarrow{b} 2 \xrightarrow{c} 1 \xrightarrow{b} 2 \xrightarrow{a} 3 \xrightarrow{b} 3.\] Since the machine is in state 3 upon completion of the string, $bccbcbab$ is accepted by the machine.
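Tracing a DFA by hand like this is exactly the computation a few lines of code can perform. Here is a sketch that encodes the transition table above as a dictionary and runs the machine on an input word (the names `delta`, `START`, `ACCEPTING`, and `accepts` are my own choices):

```python
# The transition table from above: state -> {symbol: next state}.
delta = {
    0: {"a": 0, "b": 0, "c": 1},
    1: {"a": 0, "b": 2, "c": 1},
    2: {"a": 3, "b": 0, "c": 1},
    3: {"a": 3, "b": 3, "c": 3},
}
START = 0
ACCEPTING = {3}

def accepts(w):
    """Run the DFA on w and report whether it ends in an accepting state."""
    state = START
    for symbol in w:
        state = delta[state][symbol]
    return state in ACCEPTING

print(accepts("bccbcbab"))  # True: contains the substring cba
print(accepts("abcabc"))    # False: no occurrence of cba
```

Note how the code mirrors the formal definition: the loop applies $\delta$ once per input symbol, and acceptance depends only on whether the final state lies in $F$.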