MPCS 55001 — Lecture 6

Divide and conquer

Now, we'll move on to another common algorithm design technique: divide and conquer. Much like greedy algorithms, this design paradigm does pretty much what its name suggests: divide a large problem into smaller subproblems, solve those recursively, and then put the pieces back together, so to speak.

Sorting

The first problem that we will consider is sorting, something we are probably a bit familiar with now.

Problem 6.1. The sorting problem is: given a list $a_1, \dots, a_n$ of $n$ integers, produce a list $L$ that contains $a_1, \dots, a_n$ in ascending order.

Of course, we can sort more things than just integers, as long as the elements we choose are from some set that has a total order (like, say, strings, assuming your alphabet is ordered).

There are some obvious applications of sorting, and we've already seen some not-so-obvious ones, where sorting is a crucial step that lets us apply some of our other algorithm design strategies.

On the last problem set, you were asked to come up with an obscenely bad sorting algorithm. Some of you started with ideas that were simple or naive, but not necessarily horrible. Even that bit of thinking (i.e. hey, smaller elements should be at the beginning of the array!) brought you from $O(n \cdot n!)$ down to $O(n^2)$. But now that we're in the realm of "efficiency", we're not satisfied with that.

So how do we get to $O(n \log n)$? I've already briefly mentioned how we might do this with heaps, but this is a bit cumbersome. Do we really want to be constructing a heap every time we want to sort an array?

I got a very good question recently about what $O(\log n)$ looks like, intuitively. The answer is that it is basically what happens when you repeatedly divide a problem into subproblems (usually in half, but not necessarily). It is that idea that will guide us for the next few problems.
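
To make this concrete: if we start with a problem of size $n$ and halve it repeatedly, after $k$ levels of division the pieces have size $n/2^k$, so we reach pieces of size 1 when \[\frac{n}{2^k} = 1 \iff k = \log_2 n.\] In other words, the number of times we can halve $n$ before hitting the bottom is about $\log_2 n$.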

Mergesort

The idea behind Mergesort is pretty simple:

  1. Divide the array in half
  2. Sort the left half using Mergesort
  3. Sort the right half using Mergesort
  4. Merge the two halves

The only thing that's not clear is how merging works. Assume we have two sorted arrays that we want to merge into one big array containing all the elements of both. We need to examine the elements of each and arrange them in order. Luckily, the arrays are already sorted, so all we have to do is walk through both of them and repeatedly take the smaller of the two elements we're currently looking at as the next element of our new array.
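
As a minimal sketch of this merging step, here is how it might look in Python (the function name merge and the use of Python lists are just choices for illustration):

    def merge(b, c):
        """Merge two already-sorted lists b and c into a single sorted list."""
        merged = []
        i, j = 0, 0
        # Repeatedly take the smaller of the two front elements.
        while i < len(b) and j < len(c):
            if b[i] <= c[j]:
                merged.append(b[i])
                i += 1
            else:
                merged.append(c[j])
                j += 1
        # One list is exhausted; whatever remains of the other is already in order.
        merged.extend(b[i:])
        merged.extend(c[j:])
        return merged

Each element is appended exactly once, so merging two lists with $n$ elements in total takes $O(n)$ time and at most $n$ comparisons.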

Mergesort was developed in 1945 by John von Neumann, who, like many of the most eminent mathematicians and scientists of our time, has scientific contributions in more fields than I can remember or am aware of.

Example 6.2. Here, we are given an array $[3,1,5,2,8,7,6,4,9]$ and we sort recursively, splitting the array in half each time. The results of each sort are then merged back together.

The algorithm in pseudocode is essentially the algorithm we described, with the addition of the base case of when your array is just a single element.

    \begin{algorithm}
    \caption{Mergesort}
    \begin{algorithmic}
    \PROCEDURE{mergesort}{$A$}
        \STATE $n = $ size($A$)
        \IF{$n$ is 1}
            \RETURN{$A$}
        \ENDIF
        \STATE $m = \left\lfloor \frac n 2 \right\rfloor$
        \STATE $B = $ mergesort($A[1,\dots,m]$)
        \STATE $C = $ mergesort($A[m+1,\dots,n]$)
        \STATE $A' = $ merge($B,C$)
    \RETURN{$A'$}
    \ENDPROCEDURE
    \end{algorithmic}
    \end{algorithm}
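
A runnable version of this pseudocode might look like the following Python sketch. It assumes the merge function from the earlier sketch is in scope, and it uses 0-indexed slices rather than the 1-indexed notation above.

    def mergesort(a):
        """Sort the list a by recursively sorting each half and merging the results."""
        n = len(a)
        if n <= 1:                # base case: an array of one (or zero) elements is already sorted
            return a[:]
        m = n // 2                # split point, floor(n/2)
        b = mergesort(a[:m])      # sort the left half
        c = mergesort(a[m:])      # sort the right half
        return merge(b, c)        # merge() from the earlier sketch

For instance, mergesort([3, 1, 5, 2, 8, 7, 6, 4, 9]) returns [1, 2, 3, 4, 5, 6, 7, 8, 9], matching Example 6.2.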

The correctness of the algorithm is almost obvious from how it's been described, but we can outline a proof. First, you would need to prove that merge($B$,$C$) produces a sorted array containing all the elements of $B$ and $C$, assuming that $B$ and $C$ were already sorted. Then, you would prove that Mergesort is correct by an inductive argument, where the base case is when $A$ has size 1.

Analyzing Mergesort

This is our first recursive algorithm, and it's not immediately clear how to work out the running time. However, if we think about what the algorithm is really doing, the crucial operation we need to keep track of is comparisons. We'll use comparisons as a proxy for the running time (technically, there are a few more constants floating about if we wanted to be stricter about counting operations).

Suppose $T(n)$ is the number of comparisons that Mergesort performs on an array of size $n$. Then, we have the following pieces of information:

  1. An array of size 1 is already sorted, so no comparisons are needed: $T(1) = 0$.
  2. Mergesort makes two recursive calls, on subarrays of size $\lfloor n/2 \rfloor$ and $\lceil n/2 \rceil$.
  3. Merging two sorted arrays containing $n$ elements in total requires at most $n$ comparisons, since each comparison places at least one element into the merged array.

What can we do with this? Let's run through a simplified case first.

Example 6.4. Suppose $n$ is a power of 2, so $n = 2^k$ for some $k \geq 0$.

We can then express $T(n)$ as a recurrence, \[T(n) = \begin{cases} 0 & \text{if $n = 1$,} \\ 2 \cdot T\left(\frac n 2 \right) + n & \text{if $n \gt 1$.} \end{cases} \]

This is a relatively simple recurrence to solve and we have a few ways of looking at it. We can try to gain some insight by expanding our tree of subproblems.

We can then use this to come up with a guess for the solution, or if we prefer, we can try iterating the recurrence or computing the first few values. Here's what we get from iterating: \begin{align*} T(2^k) &= 2 \cdot T(2^{k-1}) + 2^k \\ 2 \cdot T(2^{k-1}) &= 4 \cdot T(2^{k-2}) + 2^k \\ & \vdots \\ 2^{k-1} T(2) &= 2^k \cdot T(1) + 2^k \end{align*} Adding these equations, the intermediate terms cancel, and since $T(1) = 0$ we are left with $T(2^k) = k \cdot 2^k$. We can then take this as our hypothesis and prove it formally by induction. Once we've proved it, things work out very nicely: $k = \log_2 2^k$ and $n = 2^k$, so $T(n) = O(n \log n)$.
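
If you want to sanity-check the hypothesis before doing the induction, a few lines of Python (with names of my own choosing) confirm it for small powers of 2:

    def T(n):
        """Evaluate the recurrence T(1) = 0, T(n) = 2*T(n/2) + n for n a power of 2."""
        return 0 if n == 1 else 2 * T(n // 2) + n

    for k in range(11):
        n = 2 ** k
        assert T(n) == k * n    # the closed form T(2^k) = k * 2^k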

Of course, this recurrence only covers values of $n$ that are exact powers of 2. What about all of those numbers that aren't so lucky? If we re-examine our algorithm for Mergesort, we notice that we might not get to break our array cleanly in half, in which case one subarray has size $\lfloor n/2 \rfloor$ and the other has size $\lceil n/2 \rceil$. We would need to rewrite our recurrence as follows: \[T(n) \leq \begin{cases} 0 & \text{if $n = 1$,} \\ T\left(\left\lfloor \frac n 2 \right\rfloor\right) + T\left(\left\lceil \frac n 2 \right\rceil \right) + n & \text{if $n \gt 1$.} \end{cases} \]

This recurrence is still sufficient to give us the following result.

Proposition 6.5. $T(n)$ is $O(n \log n)$.

How does one go about proving this? You play around with the recurrence, guess your way to the bound $T(n) \leq n \lceil \log_2 n \rceil$, and then prove that it holds by induction on $n$.
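
Again, the guess is easy to check numerically before attempting the induction. Here is a small Python check, taking the recurrence to hold with equality (the worst case):

    import math
    from functools import lru_cache

    @lru_cache(maxsize=None)
    def T(n):
        """Worst case of T(1) = 0, T(n) = T(floor(n/2)) + T(ceil(n/2)) + n."""
        if n == 1:
            return 0
        return T(n // 2) + T((n + 1) // 2) + n

    for n in range(1, 10000):
        assert T(n) <= n * math.ceil(math.log2(n))    # the bound T(n) <= n * ceil(log n)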

This was a bit of an ad-hoc refresher of how to deal with recurrences. However, if guessing and checking makes you uneasy, we will be seeing a more powerful result that lets us avoid this very soon.

Closest Points on a Plane

Before we do, let's look at another neat problem, one that turns out to be easier than it first appears (for computers, at least; it will still take a bit of work for us). First, an important definition.

Definition 6.6. The Euclidean distance between two points $p_1 = (x_1,y_1)$ and $p_2 = (x_2,y_2)$ is defined by \[d(p_1,p_2) = \sqrt{(x_1-x_2)^2 + (y_1-y_2)^2}.\]

Of course, one can define many different kinds of distances on many kinds of objects. So now we can ask the following question.

Problem 6.7. Given a list of $n$ points on a plane, find the closest pair of points.

This is a simply stated problem that people really do want to solve efficiently, since it's a very basic computational geometry problem with obvious applications in all sorts of graphics and computer vision problems. It was first studied by Shamos and Hoey in 1975, who gave the algorithm we're about to investigate.

We will assume that squaring and square roots are basic operations (although we don't really need to take square roots). There is a pretty obvious brute force algorithm: just compute and check the distance of every pair of points. Of course, if you have $n$ points, that's $O(n^2)$ things you have to check.
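
As a baseline, the brute-force algorithm is only a few lines; the helper names dist and closest_pair_brute_force below are my own.

    import math

    def dist(p, q):
        """Euclidean distance between points p = (x1, y1) and q = (x2, y2)."""
        return math.hypot(p[0] - q[0], p[1] - q[1])

    def closest_pair_brute_force(points):
        """Check all O(n^2) pairs and return (distance, pair) for the closest one."""
        best, best_pair = math.inf, None
        for i in range(len(points)):
            for j in range(i + 1, len(points)):
                d = dist(points[i], points[j])
                if d < best:
                    best, best_pair = d, (points[i], points[j])
        return best, best_pair

We'll reuse these helpers as the base case of the divide and conquer algorithm below.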

Designing the algorithm

So we want to try to use divide and conquer on this set of points, ideally splitting our problem up into subproblems of roughly equal size. It's not immediately clear how to do this, and it's tempting to try something complicated like splitting the plane into quadrants.

Luckily, we do not need to go that far. Something simpler we can do is to split our problem into two halves. First, we sort our points by their $x$-coordinate and then divide our points accordingly: picture a vertical line $L$ going down the middle of the space. Then we recursively find the closest pair in each half and take the closer of the two. Once we get down to at most three points, we can just check directly which pair is closest. Easy! But there is a big complication with this strategy:

What happens if the closest pair is actually right in the middle, with one point to the left and one point to the right?

Surprisingly, all is not lost. Let $\delta$ be the smaller of the two distances computed for the left closest pair and the right closest pair. If a pair that crosses the middle is going to beat the pairs we've already found, its two points must be less than $\delta$ apart, and in particular each of them must lie within $\delta$ of the line $L$!

What we can do is consider only those points that are at most $\delta$ away from the line $L$. Sort these by their $y$-coordinates. The next question is: how many comparisons do we have to make when considering these points? It seems like we could be looking at arbitrarily many such comparisons, but we can do a bit more narrowing down.

Lemma 6.8. Let $S = \{p_1, \dots, p_k\}$ be the set of points within distance $\delta$ of the line $L$, sorted in order of $y$-coordinate. Then, for any two points $p_i$ and $p_j$ with $|i - j| \gt 7$, we have $d(p_i,p_j) \gt \delta$.

Proof. Consider a rectangle $R$ of width $2\delta$ and height $\delta$ centered on $L$. We claim that this rectangle can contain at most 8 of our points. If it contained more, then two of the points would be forced onto the same side of $L$ and within distance less than $\delta$ of each other, contradicting the fact that no pair of points on the same side of $L$ is closer than $\delta$.

To see this, we divide $R$ into eight squares of side length $\frac \delta 2$, each lying entirely on one side of $L$. The farthest apart any two points in such a square can be is the length of its diagonal, $\frac{\delta}{\sqrt 2} \lt \delta$, so each square can contain at most one point, and more than 8 points in $R$ would force two into the same square. Finally, if $|i - j| \gt 7$ but $d(p_i,p_j) \leq \delta$, then the points $p_i, p_{i+1}, \dots, p_j$ (at least 9 of them) would all lie within such a rectangle, since their $y$-coordinates differ by at most $\delta$; this is impossible, which proves the lemma. $$\tag*{$\Box$}$$

Therefore, for any point within the $\delta$-region of $L$, it suffices to compare it with at most seven other points, since all other points are guaranteed to have distance greater than $\delta$. The important thing about this is we only need to compare it with a constant number of points (it is possible to improve the constant from seven by employing more complicated geometric arguments). So even if we end up with an unusually large number of points along this strip, we know that at worst, we need something on the order of $O(n)$ comparisons, which crucially doesn't take us out of the realm of $O(n \log n)$.
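
In code, this combining step might look like the following sketch, reusing the dist helper from the brute-force sketch above. Here strip is assumed to be the list of points within $\delta$ of $L$, already sorted by $y$-coordinate.

    def closest_in_strip(strip, delta):
        """Return (distance, pair) for the closest pair in the strip that beats delta, if any."""
        best, best_pair = delta, None
        for i, p in enumerate(strip):
            for q in strip[i + 1 : i + 8]:    # at most 7 successors, by Lemma 6.8
                d = dist(p, q)                # dist() from the earlier sketch
                if d < best:
                    best, best_pair = d, (p, q)
        return best, best_pair

If nothing in the strip beats $\delta$, the function returns best_pair = None and the pair found by the recursive calls stands.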

So a very general description of our algorithm is as follows:

  1. Sort our points by $x$-coordinate.
  2. Divide our points in half.
  3. Recursively find the closest pair on the left side.
  4. Recursively find the closest pair on the right side.
  5. Check the middle for possible closest pairs that cross the middle.
  6. Return the closest pair.

This is enough to prove that our algorithm is correct. Just like with Mergesort, this involves first proving that the combining step is correct (which we just did above) and then proving that the algorithm itself is correct via an inductive argument. In this case, the base case is when we determine the closest pair among at most 3 points directly.

Efficiency and Implementation

An efficient implementation almost follows from what we've just described. The only complication occurs in the combining phase, when we try to examine points along the middle strip. As we've described the algorithm, we would require sorting points by their $y$-coordinate each time we want to combine solutions. This leads to an $O(n \log n)$ step at each recursive call, so we need to avoid this.

It turns out the trick is in preprocessing and keeping two copies of the points: one sorted by $x$-coordinate and one sorted by $y$-coordinate. Once we do this, we can just extract the points we need in order by referring to these two lists. This then gets us the following algorithm.

    \begin{algorithm}
    \caption{Closest Pair}
    \begin{algorithmic}
    \PROCEDURE{closest-pair}{$S = \{p_1 = (x_1, y_1), \dots, p_n = (x_n,y_n)\}$}
        \STATE $S_x = $ $S$ sorted by $x$-coordinate
        \STATE $S_y = $ $S$ sorted by $y$-coordinate
        \RETURN{Closest-Pair-Rec($S_x,S_y$)}
    \ENDPROCEDURE

    \PROCEDURE{closest-pair-rec}{$S_x, S_y$}
        \STATE $n = $ size($S_x$)
        \IF{$n \leq 3$}
            \STATE Compute closest pair
            \RETURN{Closest pair}
        \ENDIF

        \STATE Construct $L_x,L_y,R_x,R_y$, the left and right halves of the point set, sorted by $x$- and $y$-coordinate
        \STATE $(\ell_1,\ell_2) = $ Closest-Pair-Rec($L_x,L_y$)
        \STATE $(r_1,r_2) = $ Closest-Pair-Rec($R_x,R_y$)

        \STATE $\delta = \min(d(\ell_1,\ell_2),d(r_1,r_2))$

        \STATE $L = \max_{(x,y) \in L_x} x$
        \STATE $M = \{(x,y) \in S_y \mid |x - L| \leq \delta\}$, sorted by $y$-coordinate

        \FORALL{$s \in M$}
            \STATE Compute $d(s,s')$ for the next 7 points $s' \in M$ following $s$
        \ENDFOR
        \STATE Let $s,s'$ be the pair of minimum distance computed among points of $M$
        
        \IF{$d(s,s') \lt \delta$}
            \RETURN{$(s,s')$}
        \ELIF{$d(\ell_1,\ell_2) \lt d(r_1,r_2)$}
            \RETURN{$(\ell_1,\ell_2)$}
        \ELSE
            \RETURN{$(r_1,r_2)$}
        \ENDIF

    \ENDPROCEDURE
    \end{algorithmic}
    \end{algorithm}
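
Putting the pieces together, here is a hedged Python sketch of the whole algorithm. It reuses dist, closest_pair_brute_force, and closest_in_strip from the earlier sketches, assumes the input points are distinct, and all of the names are my own rather than any standard library's.

    def closest_pair(points):
        """Closest pair of a list of (x, y) tuples, via divide and conquer."""
        sx = sorted(points)                        # sorted by x-coordinate (ties broken by y)
        sy = sorted(points, key=lambda p: p[1])    # sorted by y-coordinate
        return closest_pair_rec(sx, sy)

    def closest_pair_rec(sx, sy):
        n = len(sx)
        if n <= 3:
            return closest_pair_brute_force(sx)

        m = n // 2
        mid_x = sx[m - 1][0]                       # x-coordinate of the dividing line L

        # Split into left and right halves, preserving both sort orders.
        lx, rx = sx[:m], sx[m:]
        left_set = set(lx)                         # assumes distinct points
        ly = [p for p in sy if p in left_set]
        ry = [p for p in sy if p not in left_set]

        d_left, pair_left = closest_pair_rec(lx, ly)
        d_right, pair_right = closest_pair_rec(rx, ry)
        if d_left <= d_right:
            delta, best_pair = d_left, pair_left
        else:
            delta, best_pair = d_right, pair_right

        # Points within delta of L, in y-order, followed by the 7-neighbor scan.
        strip = [p for p in sy if abs(p[0] - mid_x) <= delta]
        d_strip, pair_strip = closest_in_strip(strip, delta)
        if pair_strip is not None:
            return d_strip, pair_strip
        return delta, best_pair

For example, closest_pair([(0, 0), (3, 4), (1, 1), (5, 2)]) returns a distance of $\sqrt 2$ with the pair $(0,0), (1,1)$.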

This algorithm is a bit more complicated than ones we've seen before, so this proof will be a bit more involved.

Theorem 6.10. Closest-Pair has running time $O(n \log n)$.

Proof. First, the sorting steps of lines 2 and 3 can be done in $O(n \log n)$ time. We then need to analyze the running time for Closest-Pair-Rec. Let $T(n)$ be the number of operations performed by Closest-Pair-Rec. When $n \leq 3$, we have $T(n) = O(1)$.

Now, consider when $n \gt 3$. Since $S_x$ and $S_y$ are already sorted, constructing $L_x, L_y, R_x, R_y$ takes $O(n)$ time. The two recursive calls contribute $T(\lfloor n/2 \rfloor)$ and $T(\lceil n/2 \rceil)$. Computing $\delta$, extracting the points of $M$ in $y$-order from $S_y$, and comparing each point of $M$ against the next 7 points each take $O(n)$ time.

Therefore, we have the recurrence \[T(n) = \begin{cases} O(1) & \text{if $n \leq 3$}, \\ T(\lfloor n/2 \rfloor) + T(\lceil n/2 \rceil) + O(n) & \text{if $n \gt 3$,} \end{cases}\] which gives $T(n) = O(n \log n)$.

Then the running time of Closest-Pair is $O(n \log n) + T(n) = O(n \log n)$. $$\tag*{$\Box$}$$

A formal, standalone proof would involve solving the recurrence for $T(n)$. Of course, this is essentially the same recurrence as for Mergesort, so that's not too hard.

Both Mergesort and Closest Pairs nicely follow the same template for divide and conquer algorithms:

  1. Identify a way to divide your problem into subproblems and base cases.
  2. Recursively solve each subproblem.
  3. Combine the solutions to each subproblem.

Proving correctness and efficiency then comes down to these steps. For efficiency, we solve the recurrence determined by the recursive calls and the combination step. For correctness, we prove that the combination step is correct and then argue inductively that the recursive steps are correct.

This is not directly related to divide and conquer algorithms, but it turns out that sorting a bunch of points is a solid first step for many computational geometry algorithms. For instance, the convex hull problem asks for the smallest convex polygon that encloses a set of points. This sounds complicated, but it can be done in $O(n \log n)$ time, no harder than sorting! And once you have the convex hull, finding the farthest pair of points is easy: the two farthest points must both be vertices of the hull, so you only need to check pairs of hull vertices.