MPCS 55001 — Lecture 6

Divide and conquer

Now, we'll move on to another common algorithm design technique: divide and conquer. Much like "greedy algorithms," this design paradigm does pretty much what its name suggests. The idea is to repeatedly divide a large problem into smaller subproblems, solve those recursively, and then put the pieces back together, so to speak.

Sorting

The first problem that we will consider is sorting, something we are probably all a bit familiar with by now.

The sorting problem is: given a list $a_1, \dots, a_n$ of $n$ integers, produce a list $L$ that contains $a_1, \dots, a_n$ in ascending order.

Of course, we can sort more things than just integers, as long as the elements we choose are from some set that has a total order (like, say, strings, assuming your alphabet is ordered).

There are some obvious applications of sorting, and we've already seen some not-so-obvious ones, where sorting is a crucial step that lets us apply some of our other algorithmic design strategies.

On the last problem set, you were asked to come up with an obscenely bad sorting algorithm. Some of you started with ideas that were simple or naive, but not necessarily horrible. Even that bit of thinking (i.e., hey, smaller elements should be at the beginning of the array!) took you from $O(n \cdot n!)$ down to $O(n^2)$. But now that we're in the realm of "efficiency," we're not satisfied with that.

So how do we get to $O(n \log n)$? I've already briefly mentioned how we might do this with heaps, but this is a bit cumbersome. Do we really want to be constructing a heap every time we want to sort an array?

I got a very good question recently about what $O(\log n)$ looks like, intuitively. The answer is that it is basically what happens when you repeatedly divide a problem into subproblems (usually in half, but not necessarily). It is that idea that will guide us for the next few problems.
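
As a quick illustration of that intuition, here is a small Python snippet (just an illustration, not anything we will need later) that counts how many times you can halve $n$ before reaching 1; the count is about $\log_2 n$.

    # Count how many times n can be halved before reaching 1.
    # This count is floor(log2(n)), which is why "keep dividing in half"
    # naturally leads to O(log n) behaviour.
    def halvings(n):
        count = 0
        while n > 1:
            n //= 2
            count += 1
        return count

    for n in [16, 1000, 10**6]:
        print(n, halvings(n))   # prints 4, 9, and 19, roughly log2(n)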

Mergesort

The idea behind Mergesort is pretty simple:

  1. Divide the array in half
  2. Sort the left half using Mergesort
  3. Sort the right half using Mergesort
  4. Merge the two halves

The only thing that's not clear is what merging does. Assume we have two sorted arrays that we want to merge into one big array containing all the elements of both. We need to examine the elements and arrange them in order. Luckily, the arrays are already sorted, so all we have to do is walk through both, repeatedly taking the smaller of the two elements we're currently looking at and placing it into our new array next. For example, to merge $[1,3,5]$ and $[2,4,6]$, we compare 1 and 2 and take 1, then compare 3 and 2 and take 2, then compare 3 and 4 and take 3, and so on.

Mergesort was developed in 1945 by John von Neumann, who, like many of the most eminent mathematicians and scientists of our time, has scientific contributions in more fields than I can remember or am aware of.

As an example, suppose we are given the array $[3,1,5,2,8,7,6,4,9]$ and we sort recursively, splitting the array (roughly) in half each time. The results of each recursive sort are then merged back together.

The algorithm in pseudocode is essentially the algorithm we described, with the addition of a base case for when the array is just a single element.

    \begin{algorithmic}
    \PROCEDURE{mergesort}{$A$}
        \STATE $n = $ size($A$)
        \IF{$n$ is 1}
            \RETURN{$A$}
        \ENDIF
        \STATE $m = \left\lfloor \frac n 2 \right\rfloor$
        \STATE $B = $ \CALL{mergesort}{$A[1,\dots,m]$}
        \STATE $C = $ \CALL{mergesort}{$A[m+1,\dots,n]$}
        \STATE $A' = $ \CALL{merge}{$B,C$}
    \RETURN{$A'$}
    \ENDPROCEDURE
    \end{algorithmic}

Analyzing Mergesort

Proving the correctness of a recursive algorithm seems daunting, but the structure lends itself quite naturally to proof by induction. In fact, the most difficult part of proving Mergesort correct is proving that merging is correct, since merge is typically not expressed recursively.

Mergesort produces a sorted array.

We will prove this by induction on $n$, the size of the array $A$. The base case is when $|A| = 1$, in which case mergesort returns an array containing the single element of $A$. This is clearly sorted in increasing order.

Then we consider $|A| \gt 1$. We assume that for all arrays of size less than $|A|$, mergesort will produce a sorted array. Thus, by our inductive hypothesis, mergesort correctly sorts the arrays $B = A[1..m]$ and $C = A[m+1..n]$, where $m = \lfloor n/2 \rfloor$, since $|B| = m \lt n$ and $|C| = n-m \lt n$.

Since both $B$ and $C$ are sorted, merging them produces a sorted array containing the elements of $A$, which is exactly what mergesort returns.

Simple! Except now we need to deal with merge. The concept is straightforward.

        \begin{algorithmic}
        \FUNCTION{merge}{$A,B$}
            \STATE let $n_1 = |A|, n_2 = |B|$
            \STATE set $A[n_1 + 1], B[n_2 + 1] = \infty$
            \STATE initialize array $A'$ of size $n = n_1 + n_2$
            \STATE set $i,j = 1$
            \FOR{$k$ from 1 to $n$}
                \IF{$A[i] \leq B[j]$}
                    \STATE $A'[k] = A[i]$
                    \STATE $i = i+1$
                \ELSE
                    \STATE $A'[k] = B[j]$
                    \STATE $j = j+1$
                \ENDIF
            \ENDFOR
        \RETURN{$A'$}
        \ENDFUNCTION
        \end{algorithmic}
        

Recall that we assume that $A$ and $B$ are sorted arrays of about size $\frac n 2$.

Merge produces an array $A'$ that contains the elements of $A$ and $B$ in increasing order.

We will prove, by induction on $k$, that after the $k$th iteration, $A'[1..k]$ is sorted and $A'[k] \leq A[\ell],B[m]$ for all $i \leq \ell \leq n_1$ and $j \leq m \leq n_2$, where $i$ and $j$ denote their values at the end of that iteration.

First consider $k = 1$. Then $A'[1]$ is assigned the smaller of $A[1]$ and $B[1]$, which are the smallest elements of $A$ and $B$, respectively. Then $A'$ contains one element and is clearly sorted and smaller than all remaining elements in $A$ and $B$.

Now consider $k \gt 1$. By our inductive hypothesis, $A'[1..k-1]$ is sorted and $A'[k-1] \leq A[\ell],B[m]$ for all $i \leq \ell \leq n_1$ and $j \leq m \leq n_2$. Assume that $A[i] \leq B[j]$---a similar argument holds for the other case. We set $A'[k] = A[i]$. By our inductive hypothesis, $A'[k-1] \leq A'[k]$, so $A'[1..k]$ is sorted. Since $A$ was already sorted, we know $A'[k] = A[i] \leq A[\ell]$ for $i+1 \leq \ell \leq n_1$, and since $A[i] \leq B[j]$ and $B$ is sorted, we have $A'[k] \leq B[m]$ for all $j \leq m \leq n_2$.

Since we've shown that this holds for all iterations $k$, it holds in particular after the $n$th and final iteration of the for loop: $A' = A'[1..n]$ contains all $n$ elements of $A$ and $B$ and is sorted.
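
Before moving on to the running time, here is a minimal runnable sketch of both procedures in Python. This is just an illustration following the pseudocode above: lists are 0-indexed, and float('inf') plays the role of the sentinel value $\infty$.

    # A direct translation of the pseudocode above (an illustration, not the
    # only way to write it). Lists are 0-indexed and float('inf') is the sentinel.

    def merge(a, b):
        """Merge two sorted lists into one sorted list."""
        n1, n2 = len(a), len(b)
        a = a + [float('inf')]   # sentinels, as in the pseudocode
        b = b + [float('inf')]
        out = []
        i = j = 0
        for _ in range(n1 + n2):
            if a[i] <= b[j]:
                out.append(a[i])
                i += 1
            else:
                out.append(b[j])
                j += 1
        return out

    def mergesort(a):
        """Sort a list by sorting each half recursively and merging the results."""
        if len(a) <= 1:
            return a
        m = len(a) // 2
        return merge(mergesort(a[:m]), mergesort(a[m:]))

    print(mergesort([3, 1, 5, 2, 8, 7, 6, 4, 9]))   # [1, 2, 3, 4, 5, 6, 7, 8, 9]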

Now, we need to consider the efficiency of the algorithm. This is our first recursive algorithm, and it's not immediately clear how to analyze its running time. However, if we think about what the algorithm is really doing, the crucial operation to keep track of is comparisons. We'll use comparisons as a proxy for the running time (technically, there are a few more constants floating about if we wanted to be stricter about counting operations).

Suppose $T(n)$ is the number of comparisons that Mergesort performs on an array of size $n$. Then, we have the following pieces of information:

  1. An array of size 1 requires no comparisons, so $T(1) = 0$.
  2. For $n \gt 1$, Mergesort makes two recursive calls on arrays of (roughly) half the size, and merging the two sorted halves takes at most $n$ comparisons.

What can we do with this? Let's run through a simplified case first.

Suppose $n$ is a power of 2, so $n = 2^k$ for some $k \geq 0$. We can then express $T(n)$ as a recurrence, \[T(n) = \begin{cases} 0 & \text{if $n = 1$,} \\ 2 \cdot T\left(\frac n 2 \right) + n & \text{if $n \gt 1$.} \end{cases} \]

This is a relatively simple recurrence to solve and we have a few ways of looking at it. We can try to gain some insight by expanding our tree of subproblems.

We can then use this to come up with a guess for the solution, or, if we prefer, we can try iterating the recurrence or computing the first few values. Here's what we get from iterating: \begin{align*} T(2^k) &= 2 \cdot T(2^{k-1}) + 2^k \\ 2 \cdot T(2^{k-1}) &= 4 \cdot T(2^{k-2}) + 2^k \\ & \vdots \\ 2^{k-1} \cdot T(2) &= 2^k \cdot T(1) + 2^k \end{align*} Adding these equations, the intermediate terms cancel and we are left with $T(2^k) = 2^k \cdot T(1) + k \cdot 2^k = k \cdot 2^k$, since $T(1) = 0$. We can then take this as our hypothesis and prove it formally by induction. This works out very nicely: since $n = 2^k$, we have $k = \log_2 n$, so $T(n) = n \log_2 n$, which is $O(n \log n)$.
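
For completeness, here is a sketch of how that induction might go. The base case is $k = 0$: $T(2^0) = T(1) = 0 = 0 \cdot 2^0$. For the inductive step, suppose $T(2^{k-1}) = (k-1) \cdot 2^{k-1}$ for some $k \geq 1$. Then \[T(2^k) = 2 \cdot T(2^{k-1}) + 2^k = 2(k-1) \cdot 2^{k-1} + 2^k = (k-1) \cdot 2^k + 2^k = k \cdot 2^k.\]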

Of course, this recurrence works when $n$ is an exact power of 2. What about all of those numbers that aren't so lucky? If we re-examine our algorithm for Mergesort, we notice that we might not get to break our array cleanly in half, in which case one subarray has size $\lfloor n/2 \rfloor$ and the other has size $\lceil n/2 \rceil$. We would need to rewrite our recurrence as follows: \[T(n) \leq \begin{cases} 0 & \text{if $n = 1$,} \\ T\left(\left\lfloor \frac n 2 \right\rfloor\right) + T\left(\left\lceil \frac n 2 \right\rceil \right) + n & \text{if $n \gt 1$.} \end{cases} \]

This recurrence is still sufficient to give us the following result.

$T(n)$ is $O(n \log n)$.

How does one go about proving this? You play around with the recurrence, guess your way to the bound $T(n) \leq n \lceil \log n \rceil$, and then prove that this bound holds by induction on $n$.
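
To give a flavour of the argument, here is a sketch of the inductive step (writing $\log$ for $\log_2$; the details are worth checking yourself). The key observation is that $\lceil n/2 \rceil \leq 2^{\lceil \log n \rceil - 1}$ for $n \geq 2$, so $\lceil \log \lceil n/2 \rceil \rceil \leq \lceil \log n \rceil - 1$, and the same bound holds for $\lfloor n/2 \rfloor$. Assuming the claim for all smaller arrays, \begin{align*} T(n) &\leq T\left(\left\lfloor \tfrac n 2 \right\rfloor\right) + T\left(\left\lceil \tfrac n 2 \right\rceil\right) + n \\ &\leq \left\lfloor \tfrac n 2 \right\rfloor \left(\lceil \log n \rceil - 1\right) + \left\lceil \tfrac n 2 \right\rceil \left(\lceil \log n \rceil - 1\right) + n \\ &= n \left(\lceil \log n \rceil - 1\right) + n = n \lceil \log n \rceil. \end{align*}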

This was a bit of an ad-hoc refresher of how to deal with recurrences. However, if guessing and checking makes you uneasy, we will be seeing a more powerful result that lets us avoid this very soon.

The Master Theorem

So far, the recurrences we've run into are not terribly complicated, so we could afford to be pretty ad hoc in how we've dealt with them. Eventually, we will run into situations where we need to divide our problem into more than two subproblems, which will result in more complicated recurrences. Fortunately, there is a reliable result that covers most of the types of recurrences we're likely to see.

In algorithms analysis, the form of recurrence we tend to see is the following.

A divide-and-conquer recurrence is a recurrence of the form \[T(n) = a T\left(\frac n b\right) + f(n)\] where $a$ is the number of subproblems, $b$ is the factor by which we decrease the size of the subproblem, and $f(n)$ is the cost of combining solutions to subproblems.

We can gain some intuition about how to work with this by looking at its associated recursion tree. For convenience, we assume that $n$ is a power of $b$. At each level, each problem breaks into $a$ subproblems. This means that on the $i$th level of the tree, we have a total of $a^i$ subproblems, each of size $\frac{n}{b^i}$, and in total there are $1 + \log_b n$ levels. Now, suppose also that $f(n) = n^c$ for some fixed constant $c$. Then the total work done on the $i$th level is $a^i \left( \frac{n}{b^i} \right)^c$.

Adding all of these up, we have \[T(n) = \sum_{i=0}^{\log_b n} a^i \left( \frac{n}{b^i} \right)^c = n^c \sum_{i=0}^{\log_b n} \left(\frac{a}{b^c}\right)^i.\]

We would like to know what the asymptotic behaviour of this recurrence is. This will depend on what $a,b,c$ are, which will determine which term of $T(n)$ dominates.

The question is how fast this sum grows. Since it is a geometric sum, that depends on the ratio $\frac{a}{b^c}$. Recall the following fact about a geometric sum with ratio $r \geq 0$: \[ \sum_{i=0}^{k} r^i = \begin{cases} \Theta(1) & \text{if $r \lt 1$,} \\ \Theta(k) & \text{if $r = 1$,} \\ \Theta(r^k) & \text{if $r \gt 1$.} \end{cases} \]

Taking $r = \frac{a}{b^c}$ and $k = \log_b n$ together with the fact that $\frac{a}{b^c} \lt 1$ if and only if $c \gt \log_b a$, we arrive at the following.

Let $a \geq 1, b \geq 2, c \geq 0$ and let $T:\mathbb N \to \mathbb R^+$ be defined by \[T(n) = aT\left(\frac n b\right) + \Theta(n^c), \] with $T(0) = 0$ and $T(1) = \Theta(1)$. Then, \[ T(n) = \begin{cases} \Theta(n^{\log_b a}) & \text{if $c \lt \log_b a$}, \\ \Theta(n^c \log n) & \text{if $c = \log_b a$}, \\ \Theta(n^c) & \text{if $c \gt \log_b a$}. \\ \end{cases} \]

This is called the Master Theorem in CLRS because it magically solves many of the divide-and-conquer recurrences we see and because most instructors don't really get into the proof. Observe that I skillfully avoided many details. The origins of this particular recurrence and its solutions are due to Bentley, Haken, and Saxe in 1980, partly intended for use in algorithms classes. If you are really interested in the entirety of the proof, then you should take a look at CLRS 4.6.

Recall our recurrence from Mergesort: $T(n) = 2T\left(\frac n 2\right) + kn$ for some constant $k$. Here, we have $a = 2$, $b = 2$, and $c = 1$. Since $c = 1 = \log_2 2$, we are in the second case, and $T(n)$ is $\Theta(n \log n)$.

Now consider something like $T(n) = 3T\left(\frac n 2\right) + kn$. Here, we have $c = 1 \lt \log_2 3 \approx 1.585$, so the first case gives $T(n) = \Theta(n^{\log_2 3}) = O(n^{1.585})$, a noticeable difference just from increasing the number of subproblems by one.
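
If you want to get a feel for these asymptotics numerically, here is a quick, informal check (my own illustration: the constant $k$ is set to 1, $T(1)$ is set to 1, and the recurrences are evaluated with integer halving). The printed ratios should settle toward constants, as the Master Theorem predicts.

    # Evaluate T(n) = a*T(n//2) + n with T(1) = 1 and compare against the
    # growth rates predicted by the Master Theorem. The ratios should
    # approach (roughly) constant values as n grows.
    import math
    from functools import lru_cache

    @lru_cache(maxsize=None)
    def T(n, a):
        if n <= 1:
            return 1
        return a * T(n // 2, a) + n

    for n in [2**10, 2**15, 2**20]:
        print(n,
              round(T(n, 2) / (n * math.log2(n)), 3),    # a = 2: compare to n log n
              round(T(n, 3) / n ** math.log2(3), 3))     # a = 3: compare to n^(log2 3)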