CMSC 27100 — Lecture 14

Basic counting rules

Here, we'll outline some counting rules based on simple constructions. First up are Cartesian products. Recall that tuples are the constituents of products of sets. In computer science, we often use tuples to tie pieces of information together; these are sometimes also called records or structs. In the theory of algebraic types, these are called product types.

Suppose that you've been invited to an elaborate dinner party, say for a wedding. Assuming you have no dietary restrictions, you are presented with a choice of four starters, three mains, and three desserts. How many different three-course meals can you construct? If we first choose one of the four starters, then for each such choice, we have another three choices of mains, and for each of those choices, we have a choice of three desserts. So this gives us $4 \times 3 \times 3 = 36$ different ways to construct our desired meal.
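To see this concretely, here is a small Python sketch (the menu items are made up for illustration) that builds every meal as a tuple:

    from itertools import product

    # Hypothetical menu: four starters, three mains, three desserts.
    starters = ["soup", "salad", "bruschetta", "dumplings"]
    mains = ["chicken", "salmon", "risotto"]
    desserts = ["cake", "sorbet", "fruit"]

    # itertools.product yields every (starter, main, dessert) tuple.
    meals = list(product(starters, mains, desserts))
    print(len(meals))  # 36 = 4 * 3 * 3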

We can think of each course as a set: $S$, $M$, $D$, and so what we're really doing is asking for the number of tuples that we can get out of these sets, where the first element of the tuple is a starter, the second is a main, and the third is a dessert. In other words, we want to know $|S \times M \times D|$. This gives us a general formula for the size of a Cartesian product.

For finite sets $A_1, A_2, \dots, A_n$, $$|A_1 \times A_2 \times \cdots \times A_n| = |A_1| \cdot |A_2| \cdots |A_n|.$$

Informally, the product rule describes the number of things you get when you're given several sets of things to choose from and you can pick one from each.

Let $\Sigma$ be an alphabet of size $k$. How many strings over $\Sigma$ of length $\ell$ are there? Here, we can think of a string in the C sense, as an array of characters; in other words, a tuple. This means that an $\ell$-length string is an $\ell$-tuple belonging to the set $\Sigma \times \Sigma \times \cdots \times \Sigma = \Sigma^\ell$. Applying the product rule, we get that there are $k^\ell$ strings of length $\ell$ over $\Sigma$.
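As a quick sanity check, we can enumerate all such tuples for a small alphabet in Python and compare the count against $k^\ell$:

    from itertools import product

    sigma = "abc"  # an alphabet of size k = 3
    ell = 4

    # Each element of product(sigma, repeat=ell) is an ell-tuple of
    # symbols, i.e. a string of length ell over sigma.
    strings = ["".join(t) for t in product(sigma, repeat=ell)]
    print(len(strings), len(sigma) ** ell)  # both print 81 = 3**4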

Suppose that I have a set $A = \{x_1, x_2, \dots, x_n\}$ with $n$ objects in it. We saw earlier the claim that a set of $n$ objects has exactly $2^n$ subsets. Let's consider the following approach. For each subset $S \subseteq A$, we define a string $w_S$ over the binary alphabet $\{0,1\}$ by $w_S = a_1 a_2 \cdots a_n$, where for $1 \leq i \leq n$, \[ a_i = \begin{cases} 0 & \text{if $x_i \not \in S$,} \\ 1 &\text{if $x_i \in S$.} \end{cases} \] In other words, we fix some ordering of the elements of $A$ and associate each one with a position in our string. The "bit" associated with item $x_i$ is 1 if $x_i \in S$ and 0 if $x_i \not \in S$.

For instance, $w_\emptyset = 00 \cdots 0 = 0^n$, while $w_A = 11\cdots 1 = 1^n$. How many strings of the form $w_S$ are there? Our prior discussion about the number of strings makes this easy: over the binary alphabet (i.e. two symbols), there are exactly $2^n$ strings of length $n$. The string $w_S$ is sometimes called the characteristic string of $S$.

This is an example of using a bijection to count something. Here, we make use of the fact that we know how to count binary strings quite easily. Then we set up a bijection between the subsets of $A$ and the binary strings of length $n$, where $n = |A|$.
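Here is a minimal sketch of the bijection in Python, assuming a fixed ordering of $A$'s elements (the element names are placeholders):

    # A fixed ordering of the elements of A.
    A = ["x1", "x2", "x3"]

    # Map a subset S of A to its characteristic string w_S.
    def characteristic_string(S):
        return "".join("1" if x in S else "0" for x in A)

    # The inverse map recovers the subset from its string, which is
    # what makes this a bijection.
    def subset_of(w):
        return {x for x, bit in zip(A, w) if bit == "1"}

    print(characteristic_string({"x1", "x3"}))  # 101
    print(sorted(subset_of("101")))             # ['x1', 'x3']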

Something else we can do with sets is join them together using union. This is a fairly common operation. In the theory of algebraic types, such unions are sometimes called union types, or sum types.

Consider the University of Chicago's CNetID password policy:

A standard password with a length of 12-18 characters can be used and must contain characters from three of the following categories: uppercase letters, lowercase letters, numbers, and special characters.

Here, we have four distinct categories of symbols: the set $U$ of 26 uppercase letters, the set $L$ of 26 lowercase letters, the set $N$ of 10 numbers, and the set $S$ of 26 special characters.

In other words, we want to make strings over the alphabet $U \cup L \cup N \cup S$. How many symbols do we have to make our password with? That is, what is $|U \cup L \cup N \cup S|$? Luckily, since these sets are disjoint, it's pretty simple: just add up their sizes. This gives us $26+26+10+26 = 88$ symbols with which we can make passwords. We can state this observation more generally.

For finite disjoint sets $A_1, A_2, \dots, A_n$, $$|A_1 \cup A_2 \cup \cdots \cup A_n| = |A_1| + |A_2| + \cdots + |A_n|.$$

Informally, the sum rule describes a situation where you're given disjoint sets of things to choose from, but you may only make one choice in total.

For simplicity, let's consider only passwords of length 12. Following our discussion above, we know that we can make $$88^{12} = 215,671,155,821,681,003,462,656$$ possible passwords of length 12. This is about $2.1567 \times 10^{23}$, or somewhere around a third of a mole's worth of particles. However, this doesn't take into account any of the restrictions on what kinds of symbols must be included.
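Python's arbitrary-precision integers make this figure easy to check:

    # Number of unrestricted length-12 passwords over 88 symbols.
    print(88 ** 12)  # 215671155821681003462656

    # Compared against Avogadro's number: about 0.36 of a mole.
    print(88 ** 12 / 6.022e23)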

Consider passwords from the sets $(U \cup L)^{12}$ and $(U \cup N)^{12}$. We seem to be able to apply the product and sum rules here easily, but there's a catch. The passwords from $U^{12}$ appear in both sets, so if we simply added their sizes, we would be double counting this quantity. Noting that $(U \cup L)^{12} \cap (U \cup N)^{12} = ((U \cup L) \cap (U \cup N))^{12} = U^{12}$, we get \begin{align*} |(U \cup L)^{12} \cup (U \cup N)^{12}| &= |(U \cup L)^{12}| + |(U \cup N)^{12}| - |(U \cup L)^{12} \cap (U \cup N)^{12}| \\ &= |(U \cup L)^{12}| + |(U \cup N)^{12}| - |U^{12}| \\ &= 52^{12} + 36^{12} - 26^{12} \\ &= 395,519,958,867,910,127,616 \end{align*}
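We can check the arithmetic directly, and we can also verify the inclusion-exclusion reasoning by brute force on a scaled-down version of the problem (two-symbol stand-ins for the categories and strings of length 2, so full enumeration is feasible):

    from itertools import product

    # The computation above, verbatim.
    print(52**12 + 36**12 - 26**12)  # 395519958867910127616

    # Scaled-down brute-force check with tiny categories.
    U, L, N = {"A", "B"}, {"a", "b"}, {"0", "1"}
    UL = {"".join(t) for t in product(sorted(U | L), repeat=2)}
    UN = {"".join(t) for t in product(sorted(U | N), repeat=2)}

    lhs = len(UL | UN)                      # count the union directly
    rhs = len(UL) + len(UN) - len(UL & UN)  # inclusion-exclusion
    print(lhs, rhs)  # both are 28 = 16 + 16 - 4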

Here, we made the careful observation that if we take the union of two arbitrary sets, we may have a non-empty intersection. In this case, we need to take care not to count the intersection twice, which the Sum Rule doesn't account for. This rule is called inclusion-exclusion, or the subtraction rule.

For all finite sets $A$ and $B$, $|A \cup B| = |A| + |B| - |A \cap B|$.

The reasoning here is simple: if the intersection of $A$ and $B$ is non-empty, then counting the elements of $A$ and counting the elements of $B$ separately will result in counting the elements of the intersection twice. This means we need to remove one count of each of those elements from our total.

Permutations

Informally, permutations are ordered arrangements of a set of objects. This seems similar to what we'd do with strings or tuples, but the difference is that rather than selecting one element from a particular set, we are arranging elements from the same set. Let's define this formally first.

A permutation of a set $A$ is a bijective function $\pi : A \to A$.

Consider a set $A = \{1,2,3\}$. We can define the function $\pi$ by $\pi(1) = 3$, $\pi(2) = 1$, and $\pi(3) = 2$. Note that since a permutation is a bijection, we require that each element of $A$ get mapped to a unique element of $A$.

A natural question is how many such functions exist. We can view such a function as filling in the blanks in $$\begin{pmatrix} 1 & 2 & 3 \\ \_ & \_ & \_ \end{pmatrix}.$$ For something of this size, we can easily enumerate all of the possibilities: $$\begin{matrix} (1,2,3) & (1,3,2) & (2,1,3) & (2,3,1) & (3,1,2) & (3,2,1) \end{matrix}$$ This suggests that we can view a permutation as an ordered arrangement of the elements of a set, and we can apply some of the counting rules we've already seen.

If $A$ is a set containing $n$ elements, then the number of permutations of $A$ is $n!$.

We can think of this as an application of the product rule: our first choice $a_1$ is from the set $A$ of size $n$, then our second choice $a_2$ is from $A \setminus \{a_1\}$ of size $n-1$, then our third is from $A \setminus \{a_1,a_2\}$ of size $n-2$, and so on.
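A quick check in Python, using itertools.permutations to generate the ordered arrangements:

    from itertools import permutations
    from math import factorial

    A = [1, 2, 3]
    # permutations(A) yields every ordered arrangement of A's elements,
    # exactly the six tuples enumerated above.
    print(list(permutations(A)))
    print(factorial(len(A)))  # 6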

Suppose you are an old-timey travelling salesperson, going around selling knives or encyclopedia sets or what-not. You have a list of $n$ cities you need to hit up and obviously you would like to minimize the cost of travelling to these cities. The problem you have to solve is: what is the minimal cost route that allows you to visit all of these cities once, including your trip back?

We can try to solve this problem by checking each possible route. How many routes are there? This is the same as asking in which order we would like to travel to each city and how many total choices we have. So for $n$ cities, this is just $n!$. This doesn't seem so bad if you've only got four cities to go to, since $4! = 24$. However, if you're a bit more ambitious and want to go to 8 cities, then you're looking at $8! = 40,320$ different routes to think about. This is not something you'd want to do by hand, but it is easily handled by a computer. Double the number of cities again, though, and we get $16! = 20,922,789,888,000$, which Wolfram|Alpha tells me is about the number of red blood cells in the human body, or about 70 times the number of stars in our galaxy. This quantity clearly grows very quickly: by Stirling's approximation, $n!$ is on the order of $\sqrt{2\pi n} \left( \frac n e \right)^n$.
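As a tiny concrete instance of this brute-force approach, here is a Python sketch for four cities with a made-up symmetric cost matrix; it checks all $n!$ orderings exactly as described (each tour gets counted more than once this way, but the minimum is unaffected):

    from itertools import permutations

    # Hypothetical symmetric costs: cost[i][j] is the cost of
    # travelling from city i to city j.
    cost = [[0, 10, 15, 20],
            [10, 0, 35, 25],
            [15, 35, 0, 30],
            [20, 25, 30, 0]]
    n = len(cost)

    # Try every ordering of the cities, including the trip back home.
    best = min(
        sum(cost[route[i]][route[(i + 1) % n]] for i in range(n))
        for route in permutations(range(n))
    )
    print(best)  # 80 for this particular matrix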

This problem is the Travelling Salesman Problem, one of the most famous problems for which we do not have an efficient algorithm. The problem dates back as far as the 1800s, although it wasn't formally stated in its current form until around the 1930s. One point of interest is that I said cost and not necessarily distance. If we assume that the costs are distances, then we impose a condition on our cost function: that the distances satisfy the triangle inequality, $d(x,y) \leq d(x,z) + d(z,y)$. This assumption makes the problem slightly (but not significantly) easier. However, not all cost functions behave this way, and we have a very relevant example of one: flight prices.

But what if we didn't want to go to every city, but instead wanted to go to some subset of them? How many ways are there to do this? We can think about this informally: we carry out the same process as before, but stop when we reach our desired number of choices. This leads to the following definition and result.

An $r$-permutation is a permutation of $r$ elements taken from a set of $n$ elements.

If $P(n,r)$ denotes the number of $r$-permutations of an $n$-element set, then $$P(n,r) = n \cdot (n-1) \cdots (n-r+1).$$

If $n$ is positive and $0 \leq r \leq n$, then $$P(n,r) = \frac{n!}{(n-r)!}.$$

Proof. Note that the total number of ways to order a set of $n$ elements is $n!$. We can construct any such ordering by first choosing an ordered arrangement of $r$ of the $n$ elements, which by definition can be done in $P(n,r)$ ways, and then ordering the remaining $n-r$ elements, which we can do in $(n-r)!$ ways. This gives us \begin{align*} n! &= P(n,r) \cdot (n-r)! \\ P(n,r) &= \frac{n!}{(n-r)!} \end{align*} $\tag*{$\Box$}$
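We can check that the product form, the factorial form, and direct enumeration all agree, say for $n = 6$ and $r = 3$:

    from itertools import permutations
    from math import factorial, prod

    n, r = 6, 3

    product_form = prod(range(n - r + 1, n + 1))       # n(n-1)...(n-r+1)
    factorial_form = factorial(n) // factorial(n - r)  # n!/(n-r)!
    enumerated = len(list(permutations(range(n), r)))  # direct count

    print(product_form, factorial_form, enumerated)    # all 120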

Note that we could have proved this algebraically; it's not too hard to manipulate the two factorials into our desired form. However, an algebraic proof doesn't really tell us much about how to obtain the value $P(n,r)$. The proof above is another example of a combinatorial proof: a proof that relies on arguments about the construction and enumeration of the objects in question.

Notice that we relate the quantity that we're computing with the process by which we're constructing the associated object. This offers more insight into how we're enumerating or constructing the object than a direct algebraic proof. This kind of information is useful, particularly for computer scientists who are often interested in using this information to literally enumerate or construct solutions.

We've also subtly introduced another counting rule here: the division rule. Since the division rule is really about process (i.e. the number of equivalent ways to do something), we'll frame the rule in terms of a function. Specifically, we are concerned with $k$-to-1 functions.

A $k$-to-1 function $f : A \to B$ is a function that maps exactly $k$ elements of $A$ to each element of $B$.

If $f : A \to B$ is a $k$-to-1 function, then $|A| = k \cdot |B|$.

As a result, we can conclude that $|B| = \frac{|A|}k$.

In our discussion of the number of $r$-permutations, we can consider the function that maps permutations of $n$ things to the permutations of $r$ things. One such function is the one that simply takes the first $r$ elements of each permutation. For example, a mapping from 6-permutations to 3-permutations would simply take the "front" of the permutation: \[(3,1,5,2,4,6) \to (3,1,5)\] We see then that for each 3-permutation, there are $6 = 3!$ possible 6-permutations that map to it: \[(3,1,5) \leftarrow \begin{cases} (3,1,5,2,4,6) \\ (3,1,5,2,6,4) \\ (3,1,5,4,2,6) \\ (3,1,5,4,6,2) \\ (3,1,5,6,2,4) \\ (3,1,5,6,4,2) \end{cases} \] So when framed via the Division Rule, we have $|A| = n!$, $|B| = P(n,r)$, and $k = (n-r)!$. That is, for each $r$-permutation of $n$ things, we can associate it with $(n-r)!$ different $n$-permutations.
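We can observe this $k$-to-1 structure directly by grouping the $6! = 720$ permutations by their fronts:

    from itertools import permutations
    from collections import defaultdict

    n, r = 6, 3
    groups = defaultdict(list)

    # Map each n-permutation to the r-permutation at its front.
    for p in permutations(range(1, n + 1)):
        groups[p[:r]].append(p)

    # Every 3-permutation has exactly (n-r)! = 6 preimages, and there
    # are P(6,3) = 120 of them: 720 = 6 * 120.
    print({len(v) for v in groups.values()}, len(groups))  # {6} 120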