CMSC 27200 — Lecture 2

Last class, we claimed that an algorithm is efficient if it runs in polynomial time.

There are a few basic objections to tying polynomial time to efficiency or practicality, but by now we have good responses to them. The most common objection is the example of an algorithm that runs in $O(n^{1000000})$ time versus an algorithm that runs in $O(1.000001^n)$ time.

Of course, it is a cop-out to say that in most practical cases this won't come up, but that's really the truth. By some strange coincidence, many of the algorithms we come up with tend to have small and reasonable constants, exponents, or bases, and for those that don't, it's often not long before further research leads us to better algorithms.

To step a bit outside of the realm of algorithms, you may have heard about the Church-Turing thesis, which says that all reasonable models of computation (like the Turing machine or the RAM) are equivalent in power. The extended Church-Turing thesis says that not only are all of these models equivalent, but that they can all simulate each other with only a polynomial increase in running time. This means that the idea of tying efficiency and polynomial time together is quite robust and doesn't really depend on a particular model of computation.

Here's a fun question to think about: what about algorithms that use polynomial space?

Asymptotics

So now we know that to measure the complexity of an algorithm, we count the number of steps it takes or how much memory it uses on a worst-case input, and this gives us a function of the size of the input. So if we want to compare the efficiency or complexity of algorithms, what we really end up doing is comparing these functions.

Rather than figuring out whether a function is always less than some other function, we are more interested in how the function grows as input grows larger. To do this, we have to turn to calculus/analysis and perform asymptotic analysis. By analyzing the time and space complexity of an algorithm asymptotically, we abstract away many of the small, unimportant details of implementation and get a better idea of whether the broader strategy of the algorithm is efficient.

Let $f:\mathbb N \to \mathbb R_+$ be a function. We say that $f(n)$ is $O(g(n))$ for some function $g:\mathbb N \to \mathbb R_+$ if there exist constants $c \gt 0$ and $n_0 \geq 0$ such that for all $n \geq n_0$, $f(n) \leq c \cdot g(n)$.

We should note that we are primarily interested in functions with positive inputs and outputs, so we will carry that assumption along with us.

Intuitively, what this definition says is that if $f(n)$ is $O(g(n))$, then there is some point $n_0$ beyond which $f(n)$ is at most $g(n)$ scaled by some constant factor $c$. This allows us to talk broadly about the growth of the functions $f$ and $g$ without having to worry about small values of $n$.

We claim that $f(n) = n^2 + 100n + 1$ is $O(n^2)$. Note that when $n \gt 1$, we have \[0 \leq n^2 + 100n + 1 \leq n^2 +100n^2 +n^2 = 102n^2.\] So we can take $c = 102$ and $n_0 = 1$ as our constants to show that $f(n)$ is $O(n^2)$.

In fact, we can make the coefficient of $n$ as large as we like, and we would still be able to show that $f(n)$ is $O(n^2)$. The math checks out: we can always choose a constant $c$ large enough to get over the initial 'hump', since the growth of the $n^2$ term eventually outpaces the $n$ term. This highlights the basic idea behind asymptotic analysis: it is about orders of growth.
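
To see this concretely, here is the same calculation with an arbitrary coefficient $A \gt 0$ in place of $100$: for all $n \geq 1$, \[0 \leq n^2 + An + 1 \leq n^2 + An^2 + n^2 = (A+2)n^2,\] so taking $c = A + 2$ and $n_0 = 1$ shows that $n^2 + An + 1$ is $O(n^2)$.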

We call this "big O" notation. Roughly speaking, big O gives us an upper bound on the growth of a function. This system of notation is due to the German analytic number theorists Bachmann and Landau in the late 19th century and is sometimes called Landau notation. Its use in the analysis of algorithms was popularized by Donald Knuth in the 1960s.

Note that a common thing to do when using big O notation is to write something like $f(n) = O(g(n))$. This is not strictly correct, since these two things aren't really equal. For instance, writing $O(g(n)) = f(n)$ looks strange, because $O(g(n))$ isn't a function! One can argue that it's really a class of functions, and we could write something like $f(n) \in O(g(n))$. But in the end, we put up with this notational abuse because, like many other poor notational choices in mathematics, it is too late.

Let $f:\mathbb N \to \mathbb R_+$ be a function. We say that $f(n)$ is $\Omega(g(n))$ for some function $g:\mathbb N \to \mathbb R_+$ if there exist constants $c \gt 0$ and $n_0 \geq 0$ such that for all $n \geq n_0$, $f(n) \geq c \cdot g(n)$.

Let $f:\mathbb N \to \mathbb R_+$ be a function. We say that $f(n)$ is $\Theta(g(n))$ for some function $g:\mathbb N \to \mathbb R_+$ if there exist constants $c_1,c_2 \gt 0$ and $n_0 \geq 0$ such that for all $n \geq n_0$, $c_1 \cdot g(n) \leq f(n) \leq c_2 \cdot g(n)$.
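
For example, the function $f(n) = n^2 + 100n + 1$ from before is also $\Theta(n^2)$: for all $n \geq 1$, \[n^2 \leq n^2 + 100n + 1 \leq 102n^2,\] so we can take $c_1 = 1$, $c_2 = 102$, and $n_0 = 1$. (The lower bound alone, with $c = 1$, shows that $f(n)$ is $\Omega(n^2)$.)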

Without any additional properties, we would have to go through a proof based on the definition in order to make any comparisons. However, there are a number of properties and identities, all quite easy to prove, that will make our lives considerably easier; we prove one of them below as an example.

Let $f:\mathbb N \to \mathbb R_+$ be a function.

  1. $f(n)$ is $O(f(n))$.
  2. If $f(n)$ is $O(g(n))$, then $cf(n)$ is $O(g(n))$ for all $c \gt 0$.
  3. If $f_1(n)$ is $O(g_1(n))$ and $f_2(n)$ is $O(g_2(n))$, then $f_1(n) f_2(n)$ is $O(g_1(n) g_2(n))$.
  4. If $f_1(n)$ is $O(g_1(n))$ and $f_2(n)$ is $O(g_2(n))$, then $f_1(n) + f_2(n)$ is $O(\max\{g_1(n), g_2(n)\})$.
  5. If $f(n)$ is $O(g(n))$ and $g(n)$ is $O(h(n))$, then $f(n)$ is $O(h(n))$.
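
As an example of how these proofs go, here is property 5. Suppose $f(n) \leq c_1 \cdot g(n)$ for all $n \geq n_1$ and $g(n) \leq c_2 \cdot h(n)$ for all $n \geq n_2$. Then for all $n \geq \max\{n_1, n_2\}$, \[f(n) \leq c_1 \cdot g(n) \leq c_1 c_2 \cdot h(n),\] so the constants $c = c_1 c_2$ and $n_0 = \max\{n_1, n_2\}$ witness that $f(n)$ is $O(h(n))$.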

Here are some useful things to know. You can try to prove them using the results from above.

Let $f:\mathbb N \to \mathbb R_+$ be a function.

  1. If $f(n) = a_0 + a_1 n + \cdots + a_k n^k$ with $a_k \gt 0$, then $f(n) = \Theta(n^k)$.
  2. If $f(n) = \log_a n$, then $f(n) = \Theta(\log_b n)$ for all $a,b \gt 1$.
  3. If $f(n) = \log_a n$, then $f(n) = O(n^k)$ for all $a \gt 1$ and $k \gt 0$.
  4. If $f(n) = n^k$, then $f(n) = O(r^n)$ for all $r \gt 1$ and $k \gt 0$.
  5. If $f(n) = n!$, then $f(n) = 2^{\Theta(n \log n)}$.

This laundry list of results gives us a rough idea of how functions grow.
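
For instance, item 2 above follows from the change-of-base formula for logarithms: for any $a, b \gt 1$, \[\log_a n = \frac{\log_b n}{\log_b a},\] so $\log_a n$ is exactly a constant multiple of $\log_b n$, and taking $c_1 = c_2 = \frac{1}{\log_b a}$ with $n_0 = 1$ satisfies the definition of $\Theta$. This is also why we usually don't bother writing the base of a logarithm inside asymptotic notation.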

Stable Matching

Let's jump right in and see what kinds of things we'll be doing for the rest of the quarter.

A common problem in many domains is matching. There are lots of such scenarios: matching interns to summer positions at companies, students to project partners, trainers to Pokémon, and so on.

If we have $n$ of one group and $n$ of the other group, then we just need a way to form $n$ disjoint pairs between the groups. However, there is an additional complication: people (and Pokémon) have preferences.

The scenario we want to avoid is the following: Alice gets matched with G◯◯gle but would rather go to Faceb◯◯k for the summer, while Bob got matched with Faceb◯◯k but would be much happier at G◯◯gle. Now, if G◯◯gle and Faceb◯◯k both had the same preferences as Alice and Bob, then arranging this switch is not so bad. But what happens if their preferences are in conflict? Furthermore, this scenario only involves two pairs; one can easily see how conflicting preferences among more people could unravel into a chain reaction of adjustments.

Define the problem

We can begin by abstracting the problem. These scenarios all have the same sorts of constraints: there are two groups of equal size, each member of one group ranks every member of the other group, and every member must end up matched with exactly one member of the other group.

For the sake of discussion, let's fix one of these scenarios. We'll let $U$ be the set of available positions and $V$ be the set of interns vying for those positions.

A matching $M$ is a set of pairs $(u,v)$ with $u \in U$ and $v \in V$ such that

  1. each $u \in U$ appears in at most one pair of $M$, and
  2. each $v \in V$ appears in at most one pair of $M$.

We say a matching is perfect if $|M| = |U| = |V| = n$.

Some of you may remember from discrete math that this sounds oddly similar to the idea of a complete matching. In fact, perfect matchings are complete matchings with the additional property that every vertex in both sets of the bipartition is matched (since the two sets have the same size).

The goal is to find a matching that will make everyone "happy". Of course, "happy" is not mathematically precise. Let's think about what makes an "unhappy" match.

A pair $(u,v)$ is unstable with respect to a matching $M$ if both

  1. position $u$ prefers intern $v$ to its assigned intern $v'$, where $(u,v') \in M$, and
  2. intern $v$ prefers position $u$ to its assigned position $u'$, where $(u',v) \in M$.

In other words, an unstable pair is a pair that's not in the matching whose members would abandon their assigned matches for each other. Then it makes sense to try to come up with a matching or assignment that doesn't exhibit any unstable pairs.

A matching $M$ is stable if there are no unstable pairs with respect to $M$.

That is, in a stable matching, there is no pair $(u,v)$ whose members would both rather abandon their assigned matches for each other. In that sense, the matching is stable.

The stable matching problem takes as input a set $U$ of size $n$ and a set $V$ of size $n$. Each $u \in U$ ranks elements of $V$ and each $v \in V$ ranks elements of $U$. The output is a stable matching $M$.

Here, we can view our input as a list of lists or a table. Suppose we have interns $A,B,C$ and employers $D,E,F$. Then their preferences may look something like

    interns' preferences              employers' preferences
        1st  2nd  3rd                     1st  2nd  3rd
    A    D    E    F                  D    B    A    C
    B    E    D    F                  E    A    B    C
    C    D    E    F                  F    A    B    C

Consider the matching $\{(A,D), (B,F),(C,E)\}$. We observe that $(B,D)$ forms an unstable pair with respect to this matching.
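
To make the definitions concrete, here is a minimal Python sketch of a stability check for this example. The representation (dictionaries of ranked lists) and the names intern_prefs, employer_prefs, and unstable_pairs are just illustrative choices, not anything fixed by the problem.

    # Preference lists from the table above, most preferred first.
    intern_prefs = {"A": ["D", "E", "F"],
                    "B": ["E", "D", "F"],
                    "C": ["D", "E", "F"]}
    employer_prefs = {"D": ["B", "A", "C"],
                      "E": ["A", "B", "C"],
                      "F": ["A", "B", "C"]}

    def prefers(prefs, x, new, current):
        """True if x ranks `new` strictly higher than `current`."""
        return prefs[x].index(new) < prefs[x].index(current)

    def unstable_pairs(matching):
        """Return all unstable (intern, employer) pairs with respect to a
        perfect matching, given as a dict mapping interns to employers."""
        assigned = {e: i for i, e in matching.items()}  # employer -> intern
        bad = []
        for i in intern_prefs:
            for e in employer_prefs:
                if matching[i] == e:
                    continue  # (i, e) is already in the matching
                if (prefers(intern_prefs, i, e, matching[i])
                        and prefers(employer_prefs, e, i, assigned[e])):
                    bad.append((i, e))
        return bad

    # The matching {(A,D), (B,F), (C,E)} from the example:
    print(unstable_pairs({"A": "D", "B": "F", "C": "E"}))
    # prints [('B', 'D'), ('B', 'E')] -- the pair (B, D) noted above, and (B, E) too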

It isn't clear that we can always come up with a stable matching. For example, consider the following scenario. Suppose instead of internships, we're in an introductory programming class and we want to pair up all the students for a pair programming exercise. Again, every student has a ranking of everyone else in the class.

        1st  2nd  3rd
    A    B    C    D
    B    C    A    D
    C    A    B    D
    D    A    B    C

In this example, we can never come up with a matching that is stable (try it yourself!).
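
If you'd rather let a computer do the checking, here is a small brute-force sketch in the same spirit as the earlier one (again, the representation is just an illustrative choice): it tries all three ways of splitting $A,B,C,D$ into two pairs and looks for an unstable pair in each.

    # Preference lists from the table above, most preferred first.
    prefs = {"A": ["B", "C", "D"],
             "B": ["C", "A", "D"],
             "C": ["A", "B", "D"],
             "D": ["A", "B", "C"]}

    def prefers(x, new, current):
        """True if x ranks `new` strictly higher than `current`."""
        return prefs[x].index(new) < prefs[x].index(current)

    # The three ways to split {A, B, C, D} into two pairs.
    pairings = [[("A", "B"), ("C", "D")],
                [("A", "C"), ("B", "D")],
                [("A", "D"), ("B", "C")]]

    for pairing in pairings:
        partner = {}
        for x, y in pairing:
            partner[x], partner[y] = y, x
        unstable = [(x, y) for x in prefs for y in prefs
                    if x < y and y != partner[x]
                    and prefers(x, y, partner[x]) and prefers(y, x, partner[y])]
        print(pairing, "unstable pairs:", unstable)
    # Every one of the three pairings has an unstable pair, so no stable
    # matching exists for this instance.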

This scenario differs slightly from the one we started out with: matchings are defined for graphs in general, not just bipartite graphs, and as we can see, for general graphs a stable matching does not need to exist. It turns out that the fact that we can define our matching between two well-defined groups is important!