MPCS 55001 — Lecture 1

Let's jump right in and see what kinds of things we'll be doing for the rest of the quarter.

A common problem in many domains is matching. There are lots of such scenarios:

If we have $n$ of one group and $n$ of the other group, then we just have to find a way to find $n$ distinct pairs between the groups. However, there is an additional complication: people (and Pokémon) have preferences.

The scenario we want to avoid is the following: Alice gets matched with G◯◯gle but would rather go to Faceb◯◯k for the summer while Bob got matched with Faceb◯◯k but would be much happier at G◯◯gle. Now, if both G◯◯gle and Faceb◯◯k both had the same preferences as Alice and Bob, then arranging this switch is not so bad. But what happens if their preferences are in conflict? Furthermore, this scenario only deals with two pairs, but one can easily see how such a situation could unravel and cause a chain reaction of adjustments if it involved conflicting preferences among more people.

Define the problem

We can begin by abstracting the problem. These scenarios all have the same sorts of constraints:

For the sake of discussion, let's fix one of these scenarios. We'll let $U$ be the set of available positions and $V$ is the set of interns vying for those positions.

A matching $M$ is a set of pairs $(u,v)$ with $u \in U$ and $v \in V$ such that

Some of you may remember from discrete math that this sounds oddly similar to the idea of a complete matching. In fact, perfect matchings are complete matchings, with the additional property that every vertex in both sets of the bipartition are matched (since they're both the same size).

The goal is to find a matching that will make everyone "happy". Of couse, happy is not mathematically precise. Let's think about what makes an "unhappy" match.

In other words, an unstable pair is a pair that's not in the matching which would abandon their assigned matches for each other. Then it makes sense to try to come up with a matching or assignment that doesn't exhibit any unstable pairs.

A matching $M$ is stable if it there are no unstable pairs with respect to $M$.

That is, in a stable matching, there are no pairs $(u,v)$ that would be willing to abandon their assigned match. In that sense, the matching is stable.

The stable matching problem takes as input a set $U$ of size $n$ and a set $V$ of size $n$. Each $u \in U$ ranks elements of $V$ and each $v \in V$ ranks elements of $U$. The output is a stable matching $M$.

The Gale–Shapley algorithm

The algorithm for finding stable matchings is due to Gale and Shapley in 1962. We can describe the algorithm as follows.

    \begin{algorithmic}
    \FUNCTION{stable-match}{preference lists for all $u \in U$ and $v \in V$}
        \STATE Initialize matching $M = \emptyset$
        \WHILE{there is an open position $u \in U$ that has not made an offer to every intern} 
            \STATE $v = $ the top intern on $u$'s list, who hasn't yet been made an offer by $u$
            \IF{$v$ is unmatched}
            \STATE add $(u,v)$ to $M$
            \ELIF{$v$ prefers $u$ to current position $u'$}
            \STATE replace $(u',v)$ with $(u,v)$ in $M$
            \ELSE
            \STATE $v$ rejects $u$
            \ENDIF
        \ENDWHILE
    \RETURN{$M$}
    \ENDFUNCTION
    \end{algorithmic}

This, like most of the algorithms we'll see in this class, is described in pseudocode. You'll be expected to present algorithms in pseudocode as well. As you can see, the standards are quite loose as to what constitutes "code", so we'll have more specific guidance on what we're looking for later.

For now, let's to try to understand what it is this algorithm does. All of the action happens inside the while loop, which is executed as long as some position $u$ is unfilled (unmatched with an intern). So let's consider an unfilled position $u$.

For $u$, we go down the preference list for $u$ and choose the first (i.e. the highest ranked) intern $v$ that $u$ has not offered a job to. There are three possibilities.

Analyzing the algorithm

There are a number of questions that this algorithm raises. First, how do we know this algorithm will terminate? It seems like it might be possible for interns to keep swapping indefinitely. Secondly, how do we know this will give us a correct stable matching? Again, since everyone's swapping all the time, how do we know we haven't missed someone?

We begin with the following observations about the status of each position $u$ and intern $v$:

The Gale–Shapley algorithm terminates after at most $n^2$ iterations of the while loop.

Consider the set $P(t)$, \[P(t) = \{(u,v) \mid \text{$u$ made an offer to $v$ by iteration $t$}\}.\] On each iteration, an open position $u$ gives out an offer to some intern $v$ which was not made before, in order of preference. This means that on each iteration, some position $u$ will proceed down its preference list until it runs out of interns to make offers to. This also means that on each iteration, there is a new $(u,v)$, so we have that $|P(t+1)| \gt |P(t)|$. But there are $|U||V| = n^2$ such pairs, so the algorithm must terminate after all such offers have been made.

Now, we'll work on showing that the algorithm gives us what we want: a stable perfect matching.

First, we note that a position makes an offer only if it isn't matched, which means that it can be matched to at most one intern. Similarly, an intern only ever accepts an offer for one position, and they only ever trade up.

We have to argue that every position $u \in U$ is matched to an intern and that every intern $v \in V$ is matched to a position.

First, we will show that every position $u \in U$ gets matched. Suppose, for contradiction, that the Gale–Shapley algorithm produces a matching $M$ with some unmatched position $u \in U$. Since the algorithm terminated, this means we exited the While loop and therefore, $u$ had made an offer to every intern $v \in V$ and remained unmatched.

In order for $u$ to remain unmatched, it must be the case that every $v \in V$ has already been matched. Since $M$ is a matching, each $v$ must be matched to a unique $u' \in U$. Since $|V| = n$, there must be $n$ positions matched to each intern. But there are only $n$ positions in total, while $u$ is unmatched, so this is a contradiction. Therefore, every $u \in U$ is matched.

Now, to see that every intern $v \in V$ must also be matched, we observe that since $M$ is a matching and every position $u \in U$ has been matched, by counting, we must have that all $n$ interns are also matched. Therefore, Gale–Shapley produces a perfect matching.

Suppose for contradiction that there is an unstable pair $(u,v)$ with respect to the matching $M$ produced by Gale–Shapley. In other words, there exist two pairs $(u,v')$ and $(u',v)$ such that $u$ prefers $v$ to $v'$ and $v$ prefers $u$ to $u'$.

Since $(u,v)$ is not in $M$, there are two scenarios that can have occurred: either $u$ did not make an offer to $v$ at all or $u$ made an offer to $v$ which was then later swapped.

If $u$ did not make an offer to $v$, this means that $u$ prefers $v'$ to $v$, since $u$ makes offers in decreasing order of preference. Therefore, $(u,v)$ is not unstable.

If $u$ did make an offer to $v$, this means that at some point $v$ either rejected the offer immediately or traded up later. That means that $v$ prefers $u'$ to $u$ and so $(u,v)$ is not unstable.

Thus, we have shown in both cases that $(u,v)$ is not actually unstable.

Putting all of the propositions we proved together, we have the following algorithm.

The Gale–Shapley algorithm will terminate and produce a stable matching for any instance.

Gale and Shapley were economists and game theorists. For this algorithm, they won the Nobel Prize in Economics. This happens to be the closest thing we have to a Nobel in computer science (because there's no Nobel Prize for Mathematics since Alfred Nobel didn't think mathematics was that important).

There's more!

As with many of the problems we want to solve and the algorithms that solve them, there are still a lot of questions that we can ask in terms of how we might be able to adapt our solution or some other properties of the particular solution we came up with.

For instance, does each instance ($U$,$V$, and their preferences) have a unique stable matching? And the natural follow-up to that question is whether Gale–Shapley gives us the same stable matching.

One of the things that complicates this question is that, with the way the algorithm is written, it's not clear that the algorithm will do the same thing. The condition of the while loop implicitly chooses some $u$ which is free, but it doesn't tell us which one to choose. You can imagine that the choice of $u$ in this step will change the outcome of the algorithm.

The surprising thing is that despite this, the algorithm is guaranteed to give us the same matching $M$, no matter how we choose $u$ at each iteration of the while loop. To see this, we need to analyze some properties of the matching $M$ a bit further.

We say an intern $v$ is a valid partner for position $u$ if there exists a stable matching that contains $(u,v)$. The best valid partner for a position $u$ is the intern $v$ for which $v$ is valid and no intern $v'$ who is ranked higher than $v$ by $u$ is a valid partner for $u$.

We'll show that Gale–Shapley always matches positions with their best valid intern.

Gale–Shapley produces a matching $M$ that matches every position $u$ to its best valid partner.

We'll show this by contradiction. Suppose there is an execution of the Gale–Shapley algorithm that produces a matching $M$ that contains a pair $(s,t)$ such that $t$ isn't the best valid partner for $s$.

Since positions give offers out in decreasing order of preference, this means that a position is rejected by a valid partner. Let $u$ be the first position that gets such a rejection, from its best valid intern $v$. Then $(u',v) \in M$ for some position $u'$.

Since $v$ is the best valid partner for $u$, there must be a stable matching $M'$ that contains $(u,v)$. Since $M'$ is a stable matching, we have $(u',v') \in M'$ for some other intern $v' \neq v$. This means that in the execution of the Gale–Shapley algorithm, $v$ chose $u'$ over $u$ and since interns only trade up, we can conclude that $v$ prefers $u'$ to $u$.

Then, we observe that when $v$ rejects $u$, $u'$ has not been rejected by any valid partner, since $v$ rejecting $u$ is the first such rejection. Since $v'$ is a valid partner for $u'$, this means that $u'$ must have made an offer to $v$ before $v'$. Thus, $u'$ prefers $v$ to $v'$.

But then $(u',v)$ form an unstable pair with respect to $M'$, which contradicts our assumption that $M'$ is stable. Therefore, $M$ matches every position to its best valid partner.

One consequence of this is that one group is always privileged in that their preferences will always be maximized at the expense of the other group. In other words, positions will always be matched with their highest ranked valid intern, while interns are always matched with their least preferred valid position.

Gale–Shapley produces a matching $M$ that matches every intern $v$ to its worst valid partner.

We'll show this by contradiction. Suppose that $M$ doesn't give every intern their worst valid position, and let $(u,v) \in M$ be such that $u$ is not $v$'s worst valid partner. Then there is a stable matching $M'$ that contains a pair $(u',v)$, where $v$ prefers $u'$ less than $u$. In other words, $v$ prefers $u$ to $u'$.

Then in $M'$, $u$ is paired with some other intern $v'$. But $M$ matches every position with its best valid partner, so $u$ prefers $v$ to $v'$. Then $(u,v)$ must be an unstable pair in $M'$, which is a contradiction.

Here, we have a striking example of how algorithms can encode bias if we're not careful. Notice that we kind of rolled into this outcome without really thinking about it. Of course, it wasn't clear in the midst of designing this algorithm how our choices would lead to this outcome. However, we were able to figure out that our solution has this property.

This points to the usefulness of algorithm analysis for evaluating properties of the algorithm beyond just correctness and efficiency. This is important as we rely more and more on algorithms to make decisions for us. We should rightly question how those decisions are made and be prepared to give an answer. Rigorously reasoning about algorithms gives us the ability to do so.

In the case of Gale–Shapley, the problem with the algorithm is clear: it privileges employers. This is generally a problem since employers already have tremendous power over employees. However, our analysis also gives us a way to fix this: simply swap the role of the proposers. This simple fix will then privilege the preferences of the students/interns, who tend to have less power in this scenario.

MPCS 55001 — Lecture 1

Stable Matching

Define the problem

The Gale–Shapley algorithm

Analyzing the algorithm

There's more!