CMSC 27200 — Lecture 25

Note. We didn't cover the vast majority of this in class (most of it was spent on Q&A), but since I have it handy, you may find it interesting/helpful.

Hamiltonian Cycle

Now, we'll prove something about a simple, but important graph problem. The following notion was due to the Irish mathematician William Rowan Hamilton in the early 1800s.

Let $G$ be a graph. A cycle is a Hamiltonian cycle if every vertex of $G$ is in the cycle. A path is a Hamiltonian path if every vertex of $G$ is in the path.

The following graph contains a Hamiltonian cycle that is highlighted.

Hamiltonian cycles and paths have some very interesting applications. One is a connection to Gray codes, which are a binary code on $n$ digits that are defined such that the successor of a number only has a one digit difference from the current number.

The obvious question that we can ask about Hamiltonian cycles is whether our graph contains one.

The Hamiltonian Cycle problem is: Given a graph $G$, does $G$ contain a Hamiltonian cycle?

There is are directed and undirected flavours of this problem. Let's start with a simple reduction.

Directed Hamiltonian Cycle $\leq_P$ Undirected Hamiltonian Cycle.

We'll describe the reduction informally. The big question is how to transform a directed graph into an undirected graph while maintaining the "directedness" of the edges. What we can do is split a vertex $v$ into three: $v_{in}, v, v_{out}$.

We join $v_{in}$ and $v$ by an edge and $v$ and $v_{out}$ by an edge. Then for all edges $(u,v)$, we join $u_{out}$ and $v_{in}$. It remains to show that there is a Hamiltonian cycle in one graph if and only if there is one in the other.

Here's another very interesting related problem.

The Traveling Salesman problem is: Given a list of $n$ cities, and costs $c(i,j) \geq 0$ for travelling from city $i$ to city $j$ and an integer $k \geq 0$, is there a sequence $i_1, i_2, \dots, i_n$ such that \[\sum_{j=1}^n c(i_j, i_{j+1}) + c(i_n,i_1) \leq k?\]

Traveling Salesman is probably the most well-known example of an intractable problem: the idea is straightforward, it's easy to show that it's hard to solve, and it's easy to see why someone might possibly care about the problem.

The standard formulation of this problem doesn't involve graphs, but it's not hard to automatically interpret this as a problem on a complete graph of $n$ vertices. This view of the problem gives us an obvious reduction.

Hamiltonian Cycle $\leq_P$ Traveling Salesman.

We won't go through an entire formal proof, since the construction of the reduction is pretty simple. Let's describe how to take an instance of Hamiltonian Cycle, which is a directed graph $G$, and construct an instance of Traveling Salesman. An instance of Traveling Salesman is $(H,c,v)$, where $H$ is a complete graph with edge costs $c$, and $k \geq 0$.

If $G$ has $n$ vertices, we simply construct a complete graph on $n$ vertices and set $k = n$. Then we assign edge costs as follows: if $e$ is an edge of $G$, then $c(e) = 1$. Otherwise, $c(e) = 2$. Then all we need to do is show that the minimal cost Traveling Salesman tour with cost $n$ corresponds to the minimal cost Hamiltonian cycle if it exists, and there is no such tour if it doesn't, since to complete such a tour, we must include an edge with cost 2.

That was not so bad. Now, let's do something a bit more surprising: we'll show that 3-SAT reduces to Hamiltonian Cycle.

3-SAT $\leq_P$ Hamiltonian Cycle.

Let $\varphi$ be an instance of 3-SAT with $n$ variables and $k$ clauses. We will construct an instance of Hamiltonian Cycle. What we will do is try to construct a graph so that each possible assignment of variables corresponds to a Hamiltonian cycle. We then modify the graph by adding paths through clauses in such a way that we'll have a Hamiltonian cycle for those assignments that are satisfying.

The idea is that we need a way to tell whether we're setting a variable $x_i$ to True or False. So we create a series of vertices so that a path goes through the vertices from left to right if we set $x_i$ to True and we set it to False if the path goes the other way.

This gives us a way to define Hamiltonian paths through our graph that correspond to a particular assignment of variables. We then need to use this to indicate whether a clause has been satisfied by a particular assignment of variables it contains. We do this by hooking up vertices representing clauses to each of the paths for the corresponding variables.

If a literal $x_i$ is in a clause $C_j$, then the edges are such that $C_j$ is entered via the left from row $i$. Similarly, if the literal $\neg x_i$ is in a clause $C_k$, the edges are such that $C_k$ is entered via the right on row $i$.

These clause vertices are connected to the variable paths according to the literal. In order to space out these paths, we need $3k+3$ vertices in each variable path.

We need to argue that there is a satisfying assignment for $\varphi$ if and only if this graph $G$ we constructed has a Hamiltonian cycle.

Suppose $\varphi$ has a satisfying assignment. We can define our Hamiltonian cycle $\mathcal C$ by the following. If variable $x_i$ has been set to True, then the path moves through row $i$ from left to right; if $x_i$ is set to False, the path moves through row $i$ from right to left. Then for each clause $C_j$, there is at least one row for which the cycle can visit $C_j$. This only needs to happen once per clause, so the path only goes through $C_j$ once.

Now, suppose that $G$ has a Hamiltonian cycle. We need to recover an assignment for $\varphi$. For each row $i$, if the cycle goes from left to right, then $x_i$ is set to True; if it goes from right to left, we set $x_i$ to False. This assignment is satisfying because each clause vertex is visited by the Hamiltonian cycle. This vertex can only be visited by entering and exiting via vertices on the same row. This row corresponds to the literal that satisfies the clause. Since every clause is satisfied, $\varphi$ is satisfied.

It remains to show that this construction can be performed in polynomial time. We observe that the graph $G$ has $n(3k+3) + 2$ vertices, so the size of the graph is polynomial in the length of $\varphi$.

Subset Sum

Here's a problem involving numbers.

The Subset Sum problem is: Given a set of $n$ natural numbers $S = \{s_1, \dots, s_n\}$ and an integer $t \geq 0$, is there a subset $S'$ of $S$ such that the sum of elements of $S'$ is equal to $t$?

Consider the set $\{2, 45, 67, 84, 315, 346, 765, 1001\}$. If $t = 1192$, then a subset of numbers that adds up to $t$ is $\{45, 67, 315, 765\}$.

Where have we seen something like this before?

Subset Sum $\leq_P$ 0-1 Knapsack.

The reduction for this is pretty simple: Subset Sum is just Knapsack where the weight of each item is the same as its value. Then it's pretty clear that what you want to do is fill up your knapsack with items so that the weight of the items is as close to the weight limit $W$ as possible.

Is there something more exciting we can do with this problem? Yes! Let's relate it to one of the problems we've already seen. However, there doesn't seem to be a natural problem that we can construct a reduction from. However, we will show that we can get a reduction from 3-SAT.

3-SAT $\leq_P$ Subset Sum.

We will show how to transform an instance $\varphi$ of 3-SAT with $n$ variables and $k$ clauses into an instance $(S,t)$ of Subset Sum.

We will define $2n+2k$ integers, each with $n+k$ digits. This corresponds to two numbers per variable and two numbers per clause, where each digit corresponds to a variable or clause. For convention, we will use $n$ leftmost digits as digits for the variables and the $k$ rightmost digits as the digits for the clauses.

For each variable $x_i$, we construct the numbers $a_i$ and $a_i'$. For both numbers, the $i$th digit from the left is a 1. If $x_i$ is a literal in clause $j$, then the $n+j$th digit of $a_i$ is 1. If $\neg x_i$ is a literal in clause $j$, then the $n+j$th digit of $a_i'$ is 1. All other digits are 0.

For each clause $C_j$, we define the numbers $b_j$ and $b_j'$. The number $b_j$ has a 1 in the $n+j$th position and the number $b_j'$ has the number 2 in the $n+j$th position. All other digits are 0.

Then to define the target $t$, we define the number with 1 for every digit corresponding to a variable and a 4 for every digit corresponding to a clause.

For example, if we have $\varphi = (x_1 \vee x_2 \vee \neg x_3) \wedge (\neg x_1 \vee x_3 \vee x_4) \wedge (x_1 \vee \neg x_2 \vee x_4)$, then we end up with the following numbers,

$x_1$ $x_2$ $x_3$ $x_4$ $C_1$ $C_2$ $C_3$
$a_1$ 1 0 0 0 1 0 1 1000101
$a_1'$ 1 0 0 0 0 1 0 1000010
$a_2$ 0 1 0 0 1 0 0 100100
$a_2'$ 0 1 0 0 0 0 1 100001
$a_3$ 0 0 1 0 0 1 0 10010
$a_3'$ 0 0 1 0 1 0 0 10100
$a_4$ 0 0 0 1 0 1 1 1011
$a_4'$ 0 0 0 1 0 0 0 1000
$b_1$ 0 0 0 0 1 0 0 100
$b_1'$ 0 0 0 0 2 0 0 200
$b_2$ 0 0 0 0 0 1 0 10
$b_2'$ 0 0 0 0 0 2 0 20
$b_3$ 0 0 0 0 0 0 1 1
$b_3'$ 0 0 0 0 0 0 2 2
$t$ 1 1 1 1 4 4 4 1111444

One possible satisfying subset is highlighted.

We now need to show that we have a satisfying assignment for $\varphi$ if and only if there is a subset of these numbers that adds up to $t$.

Suppose we have a satisfying assignment. If $x_i$ is assigned to True, we add $a_i$ to our subset and if $a_i$ is assigned to False, we take $a_i'$. So each digit corresponding to $x_i$ sums up to 1. Since $\varphi$ is satisfiable, every digit corresponding to $C_j$ has a sum of at least 1.

Since we need each digit corresponding to $C_j$ to add up to 4, we add $b_j$ or $b_j'$ depending on how many literals of $C_j$ were set to True. If three literals were set to True, then this means that the digit sums up to 3 and we choose $b_j$. If two literals were set to True, the digit sums to 2 and we choose $b_j'$. If one literal was set to True, we choose both $b_j$ and $b_j'$.

So our chosen numbers sum up to $t$.

Now, suppose we have a subset $S'$ of numbers that sum up to $t$. We observe that each digit corresponding to a variable is 1, so we can only have one of $a_i$ or $a_i'$. If we have $a_i$, then $x_i$ is True and if $a_i'$ is in our set, then $x_i$ is False.

We claim that this is a satisfying assignment. To see why, we note that each digit corresponding to a clause is 4. Since the sum of $b_i$ and $b_i'$ is 3, this means there must be at least one 1 among the $a_i$'s chosen, which corresponds to the literal which satisfies the clause. Since every clause is satisfied, the formula is satisfied.

Finally, we have to show that our reduction is within polynomial time. Note that we defined the reduction on a per-digit basis. This is because for numerical problems, the size of the instance is measured in the size of the binary representation of the numbers. Since we were careful to define the digits of the Subset Sum instance, the size of the instance is quadratic in the size of the length of $\varphi$.