CMSC 27200 — Lecture 18

More reductions

Set Cover

One of the things that we've seen and will continue to see is that there are some surprising connections between problems that seem very different. For instance, we will consider the following problem.

Problem 18.1. The set cover problem is: Given a universe $U$ of $n$ elements, a collection of subsets $\mathcal X = \{X_1, \dots, X_m\}$ of $U$, and an integer $k \geq 0$, does there exist a collection of at most $k$ subsets $X_i$, $1 \leq i \leq m$, whose union is equal to $U$?

Example 18.2. Consider the universe $U = \{1,2,3,4,5,6,7,8,9\}$ and the sets

Clearly, the easy answer is that choosing every set is a set cover for $U$, so we really want the smallest set cover for $U$. Such a set cover is $X_1, X_4, X_5$.

This is a problem about sets, so at first glance this doesn't seem like it has anything to do with the graph problems that we've been looking at. But from its name, you can guess that set cover is a covering problem like vertex cover and we remember that graphs (like most things in mathematics) are made of sets, so maybe there is a connection after all.

Proposition 18.3. Vertex Cover $\leq_P$ Set Cover.

Proof. We are given an instance of Vertex Cover $(G,k)$ with $G = (V,E)$ and $k \geq 0$. We will show how to solve Vertex Cover by constructing an instance of Set Cover $(U,\mathcal X,k)$.

Let our universe $U$ be the set of edges $E$. Then for each vertex $v \in V$, we define the subset \[X_v = \{e \in E \mid \exists u \in V, e = (u,v) \}.\] In other words, $X_v$ is the subset of edges that are incident to $v$.

Now, we want to show that $G$ has a vertex cover of size $k$ if and only if $U$ contains a set cover by sets in $\mathcal X$ of size $k$.

Suppose that $C \subseteq V$ is a vertex cover of size $k$ in $G$. Then we choose $\mathcal Y = \{X_v \mid v \in C\}$ as our set cover. Since $C$ is a vertex cover, every edge in $G$ is adjacent to at least one vertex in $C$. Therefore, every edge is a member of some set in $\mathcal Y$. And by definition, $|\mathcal Y| = k$.

Now, suppose that $\mathcal Y \subseteq S$ is a set cover of size $k$ for $U$. Then we take $C = \{v \mid X_v \in \mathcal Y\}$ and claim that $C$ is a vertex cover. To see this, we note that since $\mathcal Y$ is a set cover, every element of $U$ is in some set in $\mathcal Y$. Therefore, every edge is a member of some set in $\mathcal Y$.

Then, we can solve Vertex Cover by constructing for an instance $(G,k)$ an instance $(U,\mathcal X,k)$ and running an algorithm for Set Cover on $(U,\mathcal X,k)$. $$\tag*{$\Box$}$$

In other words, we've shown that if we can solve this problem on sets, then we can solve a problem on graphs. But what if we wanted to solve Independent Set? It turns out that reduction is transitive.

Proposition 18.4. If $A \leq_P B$ and $B \leq_P C$, then $A \leq_P C$.

Proof. $$\tag*{$\Box$}$$

Satisfiability

So you might say, okay, it makes sense to reduce Set Cover to Vertex Cover because they're both about covers and you can kind of generalize things to graphs. But what if we considered something that wasn't about covering at all?

Here, we will need to recall some definitions from propositional logic.

Definition 18.5. A boolean variable is a variable that can be assigned a value from $\{0,1\}$ (or $\{T,F\}$). A literal is a boolean variable or its negation. A clause is a disjunction of literals. We say a boolean formula $\varphi$ is in conjunctive normal form (CNF) if $\varphi$ is a conjunction of clauses.

Example 18.6. Consider the formula \[\varphi = (\neg x_1 \vee x_2 \vee x_3) \wedge (x_1 \vee \neg x_2 \vee x_3) \wedge (\neg x_1 \vee x_2 \vee x_4).\] This formula is in conjunctive normal form and has four variables: $x_1,x_2,x_3,x_4$, but has six literals: $x_1, \neg x_1, x_2, \neg x_2, x_3, x_4$. It has three clauses: $(\neg x_1 \vee x_2 \vee x_3), (x_1 \vee \neg x_2 \vee x_3), (\neg x_1 \vee x_2 \vee x_4)$.

Recall that propositional formulas are logical statements that assert the truth of some statement. Usually this is mathematical in nature. We are interested in determining whether such a statement is ever true. In other words, is there a way for us to assign the variables in such a way that the formula becomes true? This is called the satisfiability problem.

Problem 18.7. The satisfiability (SAT) problem is: Given a propositional formula $\varphi$, does it have a satisfying truth assignment? The 3-SAT problem is: Given a propositional formula $\varphi$ in conjunctive normal form, where each clause contains exactly three literals, does it have a satisfying truth assignment?

Satisfiability is a basic problem in mathematical logic and has obvious applications in things like circuit design. However, as mentioned before, there are a number of mathematical statements that one can express in propositional logic, so being able to solve this would mean being able to determine the satisfiability of any statement that can be expressed as a propositional statement.

We will show something that is maybe a bit surprising.

Theorem 18.8. 3-SAT $\leq_P$ Independent Set.

Proof. Given an instance $\varphi$ of 3-SAT, where $\varphi$ has $k$ clauses, we will construct an instance $(G,k)$ of Independent set that has an independent set of size $k$ if and only if $\varphi$ is satisfiable.

Our construction is as follows. For each clause $C = x_i \vee x_j \vee x_k$ in $\varphi$, we construct a triangle. Then, we add an edge between each literal and its negations in the other clauses.

Consider the following graph, constructed by using the formula from Example 18.6. Note that each triangle corresponds to a clause and that the literals $x_1$ and $x_2$ have edges to their negations, $\neg x_1$ and $\neg x_2$.

We will show that $\varphi$ is satisfiable if and only if $G$ contains an independent set of size $k$, where $k$ is the number of clauses in $\varphi$.

First, suppose $\varphi$ is satisfiable and consider a satisfying assignment. This means that in every clause, at least one of the literals is true. Then we choose one of these literals from each clause and these form an independent set $S$.

To see that they do, it is clear that only one vertex can be chosen from each triangle. Then all that remains is that there are no edges joining two vertices in different triangles. But the only way this can happen is between two literals $x_i$ and $\neg x_i$, which can't be the case in a satisfying assignment. therefore, $S$ is an independent set.

Next, suppose that $G$ contains an independent set $S$ of size $k$. Since $S$ is independent, it must contain exactly one vertex from each triangle. We set the literals corresponding to these vertices to True, as long as they are consistent. Then this means every clause evaluates to true, so $\varphi$ is satisfied. $$\tag*{$\Box$}$$

With that we have the following relationships among the problems we've discussed so far: \[\mathrm{3SAT} \leq_P \mathrm{Independent\ Set} \leq_P \mathrm{Vertex\ Cover} \leq_P \mathrm{Set\ Cover}.\] This is a good start, but we don't seem to be any closer to figuring out whether these problems do or don't have efficient algorithms.