CMSC 27200 — Lecture 16

Knapsack

The knapsack problem was first formulated by Dantzig in 1957, who also gave a basic dynamic programming algorithm for it. The idea is that you have a knapsack that can hold a certain amount of weight. You would like to fill it with items such that the total value of the items you choose is maximized. Of course, we don't really care about filling literal knapsacks full of stuff; this just happens to be a useful metaphor.

Dantzig is well known for the Simplex algorithm for solving linear programming problems. He is also the source of the urban legend about the student who was late to class, copied down some equations, did them for homework, and inadvertently solved two unsolved problems: Dantzig was that (grad) student.

The 0-1 knapsack problem is the following: given a list of $n$ items, where item $i$ has value $v_i \gt 0$ and weight $w_i \gt 0$, and a knapsack weight limit $W$, choose a subset of items of total weight at most $W$ with maximum total value.

For this problem, we will assume that values and weights are positive integers.

Consider the following items.

    $i$     1   2   3   4   5
    $w_i$   1   3   4   7  11
    $v_i$   1   4   9  12  19

Suppose we can carry weight $12$. Then $\{1,5\}$ has weight 12 and value 20, while $\{1,3,4\}$ has weight 12 and value 22.
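Since the instance is tiny, we can sanity-check these numbers by brute force. The Python sketch below (not the algorithm of these notes, just exponential enumeration over all subsets) confirms that $\{1,3,4\}$ is in fact optimal for $W = 12$.

```python
from itertools import combinations

# Brute-force check of the example above: enumerate every subset of
# items and keep the most valuable one that fits in the knapsack.
weights = {1: 1, 2: 3, 3: 4, 4: 7, 5: 11}
values  = {1: 1, 2: 4, 3: 9, 4: 12, 5: 19}
W = 12

best_value, best_subset = 0, set()
for r in range(len(weights) + 1):
    for subset in combinations(weights, r):
        if sum(weights[i] for i in subset) <= W:
            total = sum(values[i] for i in subset)
            if total > best_value:
                best_value, best_subset = total, set(subset)

print(best_subset, best_value)  # {1, 3, 4} 22
```

This takes $2^n$ subset checks, which is exactly the cost we are trying to avoid with dynamic programming.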

The question, as with our greedy algorithms, is how we should choose items. We can look to greedy strategies for a bit of inspiration: do we choose based on weight? On value? On weight and value together?

We can think through similar choices. We want to maximize the value of a solution, but the question is which choices to make. Suppose we track only the optimal value of a solution over the items $1, \dots, i$. This seems like it should work, but there is a crucial piece missing: how do we ensure that we don't exceed our weight limit? Similarly, we could try to optimize over the available weight alone, but then how do we account for each item that is chosen?

It turns out the answer is to do both.

Let $OPT(i,w)$ denote the value of an optimal solution $\mathcal O_{i,w}$ for the knapsack problem on items $1, \dots, i$ subject to weight limit $w$.

In essence, what we do is to consider two different subproblems at the same time: the subproblem of which items are chosen out of $1, \dots, i$ and the subproblem of which items to choose with the given weight limit $w$. The subproblem of selecting a subset of items $1, \dots, i$ is clear, since it's similar to the scheduling problem we saw earlier.

But how do we consider subproblems on weight? Luckily, we made the helpful assumption that all weights are integers, so we have subproblems ranging over $0 \leq w \leq W$. This gives us a definition with two variables, \[OPT(i,w) = \max_{S \subseteq \{1,\dots,i\}} \sum_{j \in S} v_j \quad \text{subject to } \sum_{j \in S} w_j \leq w.\]

Under this formulation, we want to compute $OPT(n,W)$. So, now we need to figure out how to express $OPT(i,w)$ in terms of smaller subproblems. Let us consider what our choices are regarding $OPT(i,w)$ when considering whether $i \in \mathcal O_{i,w}$.

  1. If $i \not \in \mathcal O_{i,w}$, then $\mathcal O_{i,w}$ must contain only items chosen from $1, \dots, i-1$, and we have $OPT(i,w) = OPT(i-1,w)$.
  2. If $i \in \mathcal O_{i,w}$, then this means that $v_i$ is added to our solution and we must add the weight $w_i$, which decreases the allowed weight in any subproblem. So we have $OPT(i,w) = v_i + OPT(i-1,w-w_i)$.

We then take $OPT(i,w) = \max\{OPT(i-1,w), v_i + OPT(i-1,w-w_i)\}$. Note that the second option is only available when $w_i \leq w$; if $w_i \gt w$, item $i$ cannot fit, and only the first case applies.
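The two cases translate directly into a memoized recursion. The Python sketch below uses the example items from above (item indices are 1-based to match the text; the iterative, table-based version appears later in these notes):

```python
from functools import lru_cache

# A memoized-recursion sketch of the two cases, on the example items.
w = [None, 1, 3, 4, 7, 11]   # w[i] = weight of item i (1-indexed)
v = [None, 1, 4, 9, 12, 19]  # v[i] = value of item i (1-indexed)

@lru_cache(maxsize=None)
def opt(i, cap):
    if i == 0 or cap == 0:       # no items left or no capacity left
        return 0
    if w[i] > cap:               # item i cannot fit: only case 1 applies
        return opt(i - 1, cap)
    return max(opt(i - 1, cap),                 # case 1: leave item i out
               v[i] + opt(i - 1, cap - w[i]))   # case 2: take item i

print(opt(5, 12))  # 22
```

Memoization ensures each pair $(i, \mathit{cap})$ is computed once, so there are at most $(n+1)(W+1)$ distinct calls.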

This gives us the following recurrence, which we'll prove formally.

$OPT(i,w)$ satisfies the following recurrence. \[OPT(i,w) = \begin{cases} 0 & \text{if $i = 0$ or $w = 0$}, \\ OPT(i-1,w) & \text{if $w_i \gt w$}, \\ \max\{OPT(i-1,w), v_i + OPT(i-1,w-w_i)\} & \text{otherwise}. \end{cases} \]

We will prove this by induction on $i$ and $w$.

Our base case will be when $i = 0$ or $w = 0$. If $i = 0$, then there are no items to add, so $OPT(i,w) = 0$ for all $w$. If $w = 0$, then there is no space to add any items, since the weights of all items are strictly positive, so $OPT(i,w) = 0$ for all $i$.

Now, suppose $i, w \gt 0$. Assume for all $i' \lt i$ and $w' \leq w$ that $\mathcal O_{i',w'}$ is an optimal solution (i.e. one with maximum value) with value $OPT(i',w')$. Let $\mathcal O_{i,w}$ be an optimal solution on items $1, \dots, i$ with weight limit $w$ and value $OPT(i,w)$. Let $w_i$ be the weight of item $i$ and let $v_i$ be the value of item $i$.

First, we argue that \[OPT(i,w) \leq \max\{OPT(i-1,w), v_i + OPT(i-1,w-w_i)\}\] There are two cases, depending on whether or not $i$ is in $\mathcal O_{i,w}$. If $i \not\in \mathcal O_{i,w}$, then $\mathcal O_{i,w}$ uses only items $1, \dots, i-1$ and is feasible for weight limit $w$, so $OPT(i,w) \leq OPT(i-1,w)$. If $i \in \mathcal O_{i,w}$, then $\mathcal O_{i,w} \setminus \{i\}$ is feasible on items $1, \dots, i-1$ with weight limit $w - w_i$, so $OPT(i,w) - v_i \leq OPT(i-1,w-w_i)$.

In either case, $OPT(i,w)$ is bounded by one of the two quantities; since we take the maximum over the two cases, the inequality is proved.

Next, we argue that \[OPT(i,w) \geq \max\{OPT(i-1,w), v_i + OPT(i-1,w-w_i)\}\] There are two cases, but this time, we consider the values of the subproblems. Since $\mathcal O_{i-1,w}$ is also a feasible solution on items $1, \dots, i$ with weight limit $w$, we have $OPT(i,w) \geq OPT(i-1,w)$. And when $w_i \leq w$, adding item $i$ to $\mathcal O_{i-1,w-w_i}$ yields a feasible solution on items $1, \dots, i$ of total weight at most $w$ and value $v_i + OPT(i-1,w-w_i)$, so $OPT(i,w) \geq v_i + OPT(i-1,w-w_i)$.

Again, since $OPT(i,w)$ is at least each of the two quantities, it is at least their maximum, and the inequality is proved.

Together, both inequalities show that \[OPT(i,w) = \max\{OPT(i-1,w), v_i + OPT(i-1,w-w_i)\}\] when $w_i \leq w$; when $w_i \gt w$, only the first case can occur, and $OPT(i,w) = OPT(i-1,w)$.
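With the recurrence proved, we can also check it empirically. The Python sketch below (an illustration, not part of the formal argument) fills the table bottom-up and compares the result against brute-force enumeration on small random instances:

```python
from itertools import combinations
import random

# Brute force: best value over all feasible subsets (exponential).
def brute_force(ws, vs, cap):
    n = len(ws)
    return max(sum(vs[j] for j in s)
               for r in range(n + 1)
               for s in combinations(range(n), r)
               if sum(ws[j] for j in s) <= cap)

# Bottom-up evaluation of the recurrence just proved.
def dp(ws, vs, cap):
    n = len(ws)
    V = [[0] * (cap + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for w in range(cap + 1):
            if ws[i - 1] > w:
                V[i][w] = V[i - 1][w]
            else:
                V[i][w] = max(V[i - 1][w], vs[i - 1] + V[i - 1][w - ws[i - 1]])
    return V[n][cap]

random.seed(0)
for _ in range(100):
    n = random.randint(1, 8)
    ws = [random.randint(1, 10) for _ in range(n)]
    vs = [random.randint(1, 20) for _ in range(n)]
    cap = random.randint(0, 30)
    assert dp(ws, vs, cap) == brute_force(ws, vs, cap)
print("all instances agree")
```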

 

This recurrence leads to the following algorithm.

    \begin{algorithmic}
    \PROCEDURE{knapsack}{$W,I = \{(w_i,v_i) \mid 1 \leq i \leq n\}$}
    \FOR{$w$ from $0$ to $W$}
            \STATE $V[0,w] \gets 0$
        \ENDFOR
        \FOR{$i$ from $1$ to $n$}
            \FOR{$w$ from $0$ to $W$}
                \IF{$w_i \gt w$}
                    \STATE $V[i,w] \gets V[i-1,w]$
                \ELSE
                    \STATE $V[i,w] \gets \max\{V[i-1,w],v_i + V[i-1,w-w_i]\}$
                \ENDIF
            \ENDFOR
        \ENDFOR
        \RETURN{$V[n,W]$}
    \ENDPROCEDURE
    \end{algorithmic}
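A direct Python rendering of this pseudocode might look as follows (a sketch; the pseudocode's 1-indexed items become a 0-indexed list of weight-value pairs):

```python
def knapsack(W, items):
    """items: list of (weight, value) pairs; returns OPT(n, W)."""
    n = len(items)
    # V[i][w] = value of an optimal solution on items 1..i, weight limit w
    V = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        w_i, v_i = items[i - 1]
        for w in range(W + 1):
            if w_i > w:
                V[i][w] = V[i - 1][w]               # item i does not fit
            else:
                V[i][w] = max(V[i - 1][w],           # leave item i out
                              v_i + V[i - 1][w - w_i])  # take item i
    return V[n][W]

# The example items from above with weight limit 12:
print(knapsack(12, [(1, 1), (3, 4), (4, 9), (7, 12), (11, 19)]))  # 22
```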

Recall the example from above with weight limit $W = 12$. Running this algorithm constructs the following table $V$.

    $i$/$w$   0   1   2   3   4   5   6   7   8   9  10  11  12
    0         0   0   0   0   0   0   0   0   0   0   0   0   0
    1         0   1   1   1   1   1   1   1   1   1   1   1   1
    2         0   1   1   4   5   5   5   5   5   5   5   5   5
    3         0   1   1   4   9  10  10  13  14  14  14  14  14
    4         0   1   1   4   9  10  10  13  14  14  16  21  22
    5         0   1   1   4   9  10  10  13  14  14  16  21  22

Our solution, $OPT(n,W) = V[5,12] = 22$, is the entry in the bottom-right corner of the table. The table also gives us a way to figure out which items were chosen: starting from $(n, W)$, if $V[i,w] \gt V[i-1,w]$, then item $i$ was chosen, and we continue the trace from $(i-1, w-w_i)$; otherwise, we continue from $(i-1, w)$.
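This traceback can be sketched in Python (a sketch that rebuilds the table and then walks it from $(n, W)$ back down):

```python
def knapsack_items(W, items):
    """items: list of (weight, value) pairs; returns the chosen item numbers."""
    n = len(items)
    # Fill the table V as in the algorithm above.
    V = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        w_i, v_i = items[i - 1]
        for w in range(W + 1):
            V[i][w] = V[i - 1][w]
            if w_i <= w:
                V[i][w] = max(V[i][w], v_i + V[i - 1][w - w_i])
    # Trace back: if V[i][w] > V[i-1][w], item i was taken.
    chosen, w = [], W
    for i in range(n, 0, -1):
        if V[i][w] > V[i - 1][w]:
            chosen.append(i)
            w -= items[i - 1][0]   # taking item i uses up its weight
    return sorted(chosen)

print(knapsack_items(12, [(1, 1), (3, 4), (4, 9), (7, 12), (11, 19)]))  # [1, 3, 4]
```

Note that the traceback only walks one cell per item, so recovering the solution adds just $O(n)$ time on top of filling the table.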

knapsack computes $OPT(n,W)$.

We will show that $V[i,w] = OPT(i,w)$ for $i \leq n$ and $w \leq W$ by induction on $(i,w)$. First, we consider as our base cases $i = 0$ and $w = 0$. We have that $V[0,w] = 0 = OPT(0,w)$ for all $w \leq W$ and $V[i,0] = 0 = OPT(i,0)$ for all $i \leq n$.

Now consider $i,w \gt 0$ and assume for all $(i',w')$ computed before $(i,w)$ that $V[i',w'] = OPT(i',w')$. First, suppose $w_i \gt w$. Then by the first branch of the conditional, we have $V[i,w] = V[i-1,w]$. By our inductive hypothesis, $V[i-1,w] = OPT(i-1,w)$, so $V[i,w] = OPT(i-1,w)$, and since $OPT(i,w) = OPT(i-1,w)$ when item $i$ does not fit, $V[i,w] = OPT(i,w)$ as desired.

Then if $w_i \leq w$, by the else branch, we have \[V[i,w] = \max\{V[i-1,w], v_i + V[i-1,w-w_i]\}\] and by our inductive hypothesis, we have \[V[i,w] = \max\{OPT(i-1,w), v_i + OPT(i-1,w-w_i)\},\] so $V[i,w] = OPT(i,w)$ by the recurrence.

knapsack has running time $O(nW)$.

The work is done in the nested loops, and each computation within them takes $O(1)$ time. The inner loop runs $W + 1$ times for each of the $n$ iterations of the outer loop. This gives $O(nW)$ time in total.

Now, at this point, some of you may remember my earlier note that there is no known polynomial-time algorithm for knapsack. But here, we have an algorithm that runs in $O(nW)$ time! What's up with that?

The key here is to remember that the complexity of an algorithm is a function of the size of its input, so it's worth stepping back to ask what the size of this problem instance is. One part is obvious: the number of items we have. The other part is not so obvious: what is the size of $W$?

A common mistake, as in this case, is to take the value of $W$ as the size of $W$. But $W$ is the input, a number. The size of $W$ is the size of its representation, which is the number of bits of $W$ (or digits, if you prefer some other number system). Either way, for any representation that isn't unary, the size of $W$ is $O(\log W)$.

The issue then becomes clear: $W$ is not polynomial in $\log W$, so our $O(nW)$ bound is not polynomial in the size of the input. Algorithms whose running time is polynomial in the numeric value of the input but not in the size of the input are said to be pseudopolynomial.
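To make the gap concrete, here is a small Python illustration: as the bit length of $W$ grows linearly, the number of table columns, and hence the $O(nW)$ running time, grows exponentially.

```python
# The table has (n + 1) * (W + 1) entries, but the input encodes W in
# only about log2(W) bits: linear growth in the input size means
# exponential growth in the width of the table.
for k in [8, 16, 32]:
    W = 2 ** k - 1  # a k-bit weight limit
    print(f"{k:2d}-bit W: value {W}, table needs W + 1 = {W + 1} columns")
```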

Knapsack problems belong to a family of packing problems, in which we are given a set of items with various properties and are asked to choose a selection of the items, subject to constraints, while trying to maximize or minimize some quantity.

It becomes pretty clear that many of these problems are quite important, in that they can easily be repurposed for various computational and real-life applications. However, they also happen to share the same unfortunate downside: we don't have efficient algorithms for many of them. This is where we begin to see glimpses of intractability rearing its ugly head.