CMSC 27100 — Lecture 11

Combinatorics on words

Strings are a fundamental object in computer science, not only in the programs we write but also in theoretical computer science. After all, everything that happens in a computer really just boils down to words of bits and the manipulation of those words. We tend to think of words as sequential objects, but words turn out to be another example of a recursively defined object.

Definition 11.1. Let $\Sigma$ be an alphabet. Then the set of strings, or words, over the alphabet $\Sigma$, denoted $\Sigma^*$, is defined recursively by

  1. $\varepsilon \in \Sigma^*$, where $\varepsilon$ denotes the empty word, and
  2. if $w \in \Sigma^*$ and $a \in \Sigma$, then $wa \in \Sigma^*$.

The concatenation of two words $u = a_1 a_2 \cdots a_m$ and $v = b_1 b_2 \cdots b_n$ is $u \cdot v = a_1 a_2 \cdots a_m b_1 b_2 \cdots b_n$. Usually, we simply write $uv = u \cdot v$. The length of a word $w$, denoted $|w|$, is the number of symbols in $w$.

Note that "string" and "word" are used interchangeably.

Example 11.2. Let $\Sigma = \{0,1\}$. Then $\Sigma^*$ is the set of all binary strings. This set contains the words $\{\varepsilon, 0, 1, 00, 01, 10, 11, \dots\}$. The concatenation of the words $000$ and $101$ is $000101$. The length of $000101$ is 6.

We can think of string concatenation in the same way as we do multiplication on numbers, with the very important difference that string concatenation is not commutative. This leads to natural analogues, like powers.

Definition 11.3. Let $w \in \Sigma^*$. Then for $k \in \mathbb N$, we define the $k$th power of $w$ by

  1. $w^0 = \varepsilon$,
  2. $w^k = w^{k-1} w$ if $k \gt 0$.
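
Before moving on, here is a minimal Python sketch of Definition 11.3 (the function name `power` is ours). Python strings stand in for words over $\Sigma$, with `""` playing the role of $\varepsilon$, `+` playing the role of concatenation, and `len` giving length.

    def power(w, k):
        """The kth power of w per Definition 11.3: w^0 = eps, w^k = w^(k-1) w."""
        if k == 0:
            return ""                    # w^0 is the empty word
        return power(w, k - 1) + w       # w^k = w^(k-1) · w

    assert power("ab", 3) == "ababab"
    assert power("0", 0) == ""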

Let's make use of this definition to prove something relatively simple about repetitions in words.

Lemma 11.4. For all words $u,v \in \Sigma^*$ and $n \in \mathbb N$, we have $(uv)^n u = u(vu)^n$.

Proof. We prove this by induction on $n$.

Base case. Let $n = 0$. Then we have $(uv)^0 u = u = u(vu)^0$.

Inductive case. Assume that for some $k \in \mathbb N$, $(uv)^k u = u(vu)^k$. Now, consider $n = k+1$. We have \begin{align*} (uv)^{k+1} u &= (uv)^k (uv) u &\text{by definition of power}\\ &= (uv)^k u (vu) &\text{by associativity of concatenation}\\ &= u(vu)^k (vu) & \text{by ind. hyp.} \\ &= u(vu)^{k+1} &\text{by definition of power} \end{align*} $\tag*{$\Box$}$
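
This identity is also easy to test mechanically. The following sketch (reusing the `power` function from above) exhaustively checks Lemma 11.4 over short binary words; of course, this is a sanity check, not a substitute for the proof.

    from itertools import product

    # All binary words of length at most 2, including the empty word.
    words = ["".join(p) for n in range(3) for p in product("01", repeat=n)]
    for u in words:
        for v in words:
            for n in range(4):
                # Lemma 11.4: (uv)^n u = u(vu)^n
                assert power(u + v, n) + u == u + power(v + u, n)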

The study of patterns and arrangements of words, called combinatorics on words, began relatively recently in the early 20th century. Having just talked a bunch about number theory, we'll look at some very similar kinds of questions from combinatorics on words. In 1962, Roger Lyndon and Marcel-Paul Schützenberger answered the question of when the words $a,b,c$ would satisfy the equation $$a^m = b^n c^p$$ where $m,n,p \geq 2$ are integers. We'll look at some of the results that they proved in order to show that the only solutions to the above equation are trivial; i.e. $a,b,c$ are all powers of a single word $w$.

The following is a very simple lemma proved by Levi in 1944, which says that if we can split up a word in two ways $uv = xy$, then we can rewrite our words with respect to some overlapping part in the middle.

Lemma 11.5 (Levi's lemma). Let $u,v,x,y \in \Sigma^*$ and suppose that $uv = xy$.

  1. If $|u| \geq |x|$, then there exists $t \in \Sigma^*$ such that $u = xt$ and $y = tv$.
  2. If $|u| \lt |x|$, then there exists non-empty $t \in \Sigma^*$ such that $x = ut$ and $v = ty$.

Proof. We will prove case (1). The second case is similar. Assume that $|u| \geq |x|$. We will prove this by structural induction on $x$.

Base case. Let $x = \varepsilon$. Then we have $uv = y$. We choose $t = u$. This gives us $u = \varepsilon \cdot t = xt$ and $y = uv = tv$.

Inductive case. Let $x = wa$, where $w \in \Sigma^*$ and $a \in \Sigma$. Then we have $uv = (wa)y = w(ay)$. Since $|u| \geq |x| \gt |w|$, the inductive hypothesis applies to the factorization $uv = w(ay)$: there exists a word $z \in \Sigma^*$ such that $u = wz$ and $ay = zv$.

We claim that $z = at$ for some word $t$; this $t$ will be the word we seek. Indeed, since $ay = zv$, if $z$ is nonempty then its first symbol is $a$, the last symbol of $x$. However, this decomposition cannot occur if $z = \varepsilon$, so we must first show that $z$ is nonempty.

Suppose $z = \varepsilon$. Then we have $u = w$. But since $x = wa$, this would mean that $|u| = |w| \lt |wa| = |x|$, which contradicts our original assumption. Thus, $z \neq \varepsilon$, and we may write $z = at$. We then have \begin{align*} u &= wz \\ &= wat &\text{since $z = at$} \\ &= xt &\text{since $x = wa$} \end{align*} and since $ay = zv = atv$, cancelling the leading $a$ gives $y = tv$, as desired. $\tag*{$\Box$}$
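
The proof of Levi's lemma is constructive: the overlap $t$ can be read off directly from the lengths. Here is a short Python sketch (the helper name `levi_split` is ours, using the same string conventions as the earlier `power` sketch):

    def levi_split(u, v, x, y):
        """Given uv = xy, return the overlap word t of Lemma 11.5."""
        assert u + v == x + y, "precondition: uv = xy"
        if len(u) >= len(x):
            t = u[len(x):]               # case 1: u = xt and y = tv
            assert x + t == u and t + v == y
        else:
            t = x[len(u):]               # case 2: x = ut and v = ty
            assert u + t == x and t + y == v
        return t

    # uv = xy = "ababa" with u = "aba", x = "ab": the overlap is t = "a".
    assert levi_split("aba", "ba", "ab", "aba") == "a"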

Making use of the previous two lemmas, we can start proving some theorems of Lyndon and Schützenberger. This first result gives a condition for when a word has a prefix that is the same as its suffix. If this is the case, then the word is made up of alternating repetitions.

Theorem 11.6. Let $x,y,z \in \Sigma^*$ be nonempty words. If $xy = yz$, then there exist a nonempty word $u \in \Sigma^*$, a word $v \in \Sigma^*$, and $n \in \mathbb N$ such that $x = uv$, $z = vu$, and $y = (uv)^n u = u(vu)^n$.

Proof. We will show this by induction on $|y|$.

Base case. Since $y$ is non-empty, our base case is $|y| = 1$, which means $y = a \in \Sigma$. Then $xa = az$. Observe that $x$ begins with $a$ and $z$ ends with $a$, so we can write $x = ax'$ and $z = z'a$. This gives us $ax'a = xa = az = az'a$ and therefore $x' = z'$. Taking $u = a$, $v = x'$, and $n = 0$, we get $x = uv$, $z = vu$, and $y = a = (uv)^0 u$.

Inductive case. Assume that for some $k \geq 1$, the result holds for all nonempty words $x,y,z$ with $|y| \leq k$: if $xy = yz$, then there exist a nonempty word $u \in \Sigma^*$, a word $v \in \Sigma^*$, and $n \geq 0$ such that $x = uv$, $z = vu$, and $y = (uv)^n u = u(vu)^n$. Now consider $|y| = k+1$. There are two cases.

  1. If $|x| \geq |y|$, we can apply Levi's lemma. That is, there exists a word $w \in \Sigma^*$ such that $x = yw$ and $z = wy$. Then let $u = y$, $v = w$, and $n = 0$.
  2. If $|x| \lt |y|$, again by Levi's lemma, there exists a nonempty word $w \in \Sigma^*$ such that $y = xw = wz$. In particular, $xw = wz$, the word $w$ is nonempty, and $|w| = |y| - |x| \leq k$, so our inductive hypothesis applies: there exist words $u$ and $v$ and $n \geq 0$ such that $x = uv$, $z = vu$, and $w = (uv)^n u = u(vu)^n$. Then $$y = xw = uv(uv)^nu = (uv)^{n+1} u.$$
$\tag*{$\Box$}$
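
The induction in Theorem 11.6 is also an algorithm: repeatedly strip a copy of $x$ from the front of $y$ until what remains is no longer than $x$. Here is a Python sketch of that procedure (the name `decompose` is ours; `power` is the earlier sketch):

    def decompose(x, y, z):
        """Given nonempty x, y, z with xy = yz, return (u, v, n) with
        x = uv, z = vu, and y = (uv)^n u, following Theorem 11.6."""
        assert x and y and z and x + y == y + z
        n = 0
        while len(x) < len(y):           # case 2: y = xw = wz; recurse on w
            y = y[len(x):]
            n += 1
        u, v = y, x[len(y):]             # case 1: x = yw, so u = y, v = w
        return u, v, n

    # x = "ab", y = "aba", z = "ba": xy = yz = "ababa".
    u, v, n = decompose("ab", "aba", "ba")
    assert (u + v, v + u, power(u + v, n) + u) == ("ab", "ba", "aba")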

We only got as far as here in class, but here are two more neat results that you may be interested in.

Since we can view concatenation as a non-commutative analogue of multiplication, it makes sense to ask under what conditions we have $xy = yx$. The following result is another of Lyndon and Schützenberger's, which says that if $xy = yx$, then both $x$ and $y$ are powers of the same word (in fact, the converse holds as well, but we'll only present this direction).

Theorem 11.7. Let $x,y \in \Sigma^*$ be non-empty words. If $xy = yx$, then there exist a word $z \in \Sigma^*$ and integers $k,\ell \gt 0$ such that $x = z^k$ and $y = z^\ell$.

Proof. We will show this by strong induction on $|xy|$.

Base case. Since both $x$ and $y$ are non-empty, we have $|xy| = 2$ as our base case. This means that $|x| = |y| = 1$ and therefore $x = y$. Then our result holds, via $z = x = y$ and $k = \ell = 1$.

Inductive case. Assume that the result holds for all $x$ and $y$ with $|xy| \lt n$ and consider when $|xy| = n$. Without loss of generality, we take $|x| \geq |y|$. By Levi's lemma, there exists a word $w \in \Sigma^*$ such that $x = wy = yw$. If $|w| = 0$, then we have $x = y$ and therefore our result holds via $z = x = y$ and $k = \ell = 1$.

If $|w| \gt 0$, then $w$ and $y$ are nonempty words satisfying $wy = yw$ with $|wy| = |x| \lt n$, so by the inductive hypothesis, there exists a word $z$ with integers $k, \ell \gt 0$ such that $w = z^k$ and $y = z^\ell$. From this, we have $x = wy = z^{k+\ell}$. $\tag*{$\Box$}$
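
Viewed computationally, the proof of Theorem 11.7 is a word analogue of Euclid's gcd algorithm: repeatedly replace the longer word by the "difference" $w$. A hypothetical Python sketch (the name `common_root` is ours):

    def common_root(x, y):
        """Given nonempty x, y with xy = yx, return a word z such that
        both x and y are powers of z, following Theorem 11.7."""
        assert x and y and x + y == y + x
        if x == y:
            return x                     # base case: z = x = y, k = l = 1
        if len(x) < len(y):
            x, y = y, x                  # without loss of generality |x| >= |y|
        w = x[:len(x) - len(y)]          # Levi's lemma: x = wy = yw, w nonempty
        return common_root(w, y)         # |wy| < |xy|, so the recursion terminates

    assert common_root("abab", "ababab") == "ab"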

The following result appears in the Lyndon-Schützenberger paper, and discusses the conditions under which the powers of two different words are actually powers of the same word. In essence, the question is: how much of the word do you need to check to be sure they're repeating the same thing?

Theorem 11.8. Let $x,y \in \Sigma^*$ be non-empty words. If for some integers $m,n \gt 0$, the words $x^m$ and $y^n$ have a common prefix of length $|x| + |y|$, then there exists a word $z \in \Sigma^*$ and integers $k,\ell \gt 0$ such that $x = z^k$ and $y = z^\ell$.

Proof. First, note that the hypothesis forces $m|x| \geq |x|+|y|$ and $n|y| \geq |x|+|y|$. Since $x^m$ and $y^n$ share a common prefix of length $|x|+|y|$, the words $yx^m$ and $y^{n+1}$ share a common prefix of length $|x| + 2|y|$; in particular, they share a common prefix of length $|x|+|y|$, namely the word $yx$, so $yx$ is a prefix of $y^{n+1}$. By a similar argument, the words $x^{m+1}$ and $xy^n$ share a common prefix of length $|x|+|y|$, namely the word $xy$, so $xy$ is a prefix of $x^{m+1}$. But $x^{m+1}$ and $y^{n+1}$ agree on their first $|x|+|y|$ symbols, since they extend $x^m$ and $y^n$, which do. Therefore $xy = yx$, and we can directly apply Theorem 11.7. $\tag*{$\Box$}$

This result is actually not optimal. It was shown by Nathan Fine and Herbert Wilf in 1965 that the common prefix can be as short as $|x| + |y| - \gcd(|x|,|y|)$, a result now known as the Fine-Wilf Theorem.
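
Both the $|x|+|y|$ bound and the Fine-Wilf refinement are easy to probe by brute force. The following sketch (the name `check_fine_wilf` is ours) checks, over all short binary words, that a common prefix of length $|x| + |y| - \gcd(|x|,|y|)$ already forces $xy = yx$, and hence, by Theorem 11.7, a common root; it also notes a pair witnessing that the bound is tight.

    from math import gcd
    from itertools import product

    def check_fine_wilf(max_len=4):
        """Brute-force the Fine-Wilf bound over short binary words."""
        for p in range(1, max_len + 1):
            for q in range(1, max_len + 1):
                bound = p + q - gcd(p, q)
                for xs in product("01", repeat=p):
                    for ys in product("01", repeat=q):
                        x, y = "".join(xs), "".join(ys)
                        # Compare prefixes of x^m and y^n of the bound's length.
                        if (x * bound)[:bound] == (y * bound)[:bound]:
                            assert x + y == y + x
        return True

    assert check_fine_wilf()

    # Tightness: "ab" and "aba" agree on 3 = 2 + 3 - gcd(2,3) - 1 symbols
    # ("aba"), yet "ab" + "aba" != "aba" + "ab".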