CMSC 27200 — Lecture 12

Shortest paths, revisited

Recall from Lecture 4 that we were interested in solving the shortest path problem, where we are given a digraph $G$ with edge weights, a source vertex $s$ and a destination vertex $t$. We saw how to solve this using a greedy approach.

Now, we are ready to revisit a problem that we were introduced to earlier that we couldn't solve: how to find the shortest path in a graph that contains negative edge weights. One of the things we saw in Problem Set 2 was that even tinkering with the edge weight function wasn't enough to guarantee that we could get Dijkstra's algorithm to be correct.

One of the things some of you observed was that if there's a negative cycle (i.e. a cycle for which the sum of its edge weights is negative) on a path from $s$ to $t$, then there's no shortest path from $s$ to $t$. The proof is not complicated: if there's a negative cycle, then we can traverse the negative cycle as many times as we like to decrease the cost of the path by an arbitrary amount.
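To make this concrete, here is a small Python sketch (the graph and its weights are made up for illustration): each extra trip around the negative cycle $b \to c \to b$, whose weight is $1 + (-3) = -2$, lowers the total weight of the walk.

```python
# A small digraph with a negative cycle b -> c -> b (weight 1 + (-3) = -2)
# lying on a walk from s to t. Weights are hypothetical.
w = {("s", "b"): 2, ("b", "c"): 1, ("c", "b"): -3, ("c", "t"): 4}

def walk_weight(walk):
    """Sum the edge weights along a walk given as a vertex sequence."""
    return sum(w[(u, v)] for u, v in zip(walk, walk[1:]))

print(walk_weight(["s", "b", "c", "t"]))            # 7
print(walk_weight(["s", "b", "c", "b", "c", "t"]))  # 5: one extra loop saves 2
```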

Now, we will consider a non-greedy approach (read: dynamic programming) to solving the problem that will allow us to use negative edge weights. Our algorithm will even be able to determine whether the graph contains a negative cycle, so we can figure out whether there is a shortest path.

Rather than computing the shortest path from a source $s$ to every other node, we consider the reverse version of the problem, where we compute the shortest path from every vertex to our destination $t$.

Problem 12.1. The single-destination shortest-paths problem is: Given a directed graph $G = (V,E)$ with an edge weight function $w: E \to \mathbb R$ and a destination vertex $t$, find a shortest path from $v$ to $t$ for every $v \in V$.

Because we plan to use dynamic programming, we need to determine what subproblem is appropriate to compute in the service of finding a shortest path. Here, we can take some inspiration from our greedy approach: recall that Dijkstra's algorithm kept an estimated weight of the path from the source $s$ to each vertex $v$.

There are a few similar subproblems that we could try.

To help clarify this decision, here is a useful result that is similar to something you should have seen in discrete math.

Proposition 12.2. Suppose $G$ has no negative cycles. Then there is a shortest path from $u$ to $v$ that does not repeat any vertices.

One consequence of this result is that any shortest path will have at most $n-1$ edges in it, since any path that has more edges will revisit some vertex.

We will define $OPT(i,v)$ to be the weight of a shortest path from $v$ to $t$ that uses at most $i$ edges. So if we want to compute the shortest path from $s$ to $t$, we would want to compute $OPT(n-1,s)$.

Let us consider a shortest path from a vertex $v$ to our destination $t$, which would have weight $OPT(i,v)$. How can we decompose this path? Since we want to express this in terms of some shorter path, it makes sense to consider the first edge in our path, say $(v,x)$, followed by the shortest path from that vertex $x$ to $t$. So we have the following possibilities.

  1. Our shortest path from $v$ to $t$ uses fewer than $i$ edges. In this case, our work is done: we just use the shortest path we already have, so we must have $OPT(i,v) = OPT(i-1,v)$.
  2. Our shortest path from $v$ to $t$ uses exactly $i$ edges. In this case, we want to express the weight of our path in terms of a path with fewer edges, and we know the cost of its first edge: the one leaving $v$. So we must have $OPT(i,v) \geq w(v,x) + OPT(i-1,x)$ for some vertex $x$ with $(v,x) \in E$. Conversely, since we can always follow any edge $(v,x)$ and then take an optimal path from $x$ to $t$, we have $OPT(i,v) \leq w(v,x) + OPT(i-1,x)$ for every such edge. This means each edge $(v,x)$ leaving $v$ gives a candidate path of weight $w(v,x) + OPT(i-1,x)$.

    Then the obvious choice is to choose the one with minimal weight, giving us \[OPT(i,v) = \min_{(v,x) \in E} \{w(v,x) + OPT(i-1,x)\}.\]

We then put this together to get the following.

Proposition 12.3. $OPT(i,v)$ satisfies the recurrence \[OPT(i,v) = \begin{cases} 0 & \text{if $i = 0$ and $v = t$,} \\ \infty & \text{if $i = 0$ and $v \neq t$,} \\ \min\left\{OPT(i-1,v),\; \min_{(v,x) \in E} \{w(v,x) + OPT(i-1,x)\}\right\} & \text{if $i \gt 0$.} \end{cases}\]

As usual, the recurrence leads us to a straightforward algorithm that fills out a table.

    \begin{algorithm}
    \caption{Compute shortest paths}
    \begin{algorithmic}
    \PROCEDURE{shortest-paths}{$G,t$}
        \FORALL{$v \in V$}
            \STATE $P[0,v] \gets \infty$
        \ENDFOR
        \STATE $P[0,t] \gets 0$
        \FOR{$i$ from $1$ to $n-1$}
            \FORALL{$v \in V$}
                \STATE $P[i,v] \gets P[i-1,v]$
                \FORALL{$(v,x) \in E$}
                    \STATE $P[i,v] \gets \min\{P[i,v], P[i-1,x]+w(v,x)\}$
                \ENDFOR
            \ENDFOR
        \ENDFOR
    \ENDPROCEDURE
    \end{algorithmic}
    \end{algorithm}
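As a sanity check, here is a minimal Python sketch of the table-filling algorithm above; the vertex labels $0,\dots,n-1$, the adjacency-list setup, and the example weights are our own implementation choices, not part of the pseudocode.

```python
from math import inf

def shortest_paths_table(n, edges, t):
    """P[i][v] = weight of a shortest v-t path using at most i edges.
    Vertices are 0..n-1; edges is a list of (u, v, weight) triples."""
    out_edges = [[] for _ in range(n)]          # edges leaving each vertex
    for u, v, wt in edges:
        out_edges[u].append((v, wt))
    P = [[inf] * n for _ in range(n)]
    P[0][t] = 0
    for i in range(1, n):
        for v in range(n):
            P[i][v] = P[i - 1][v]               # case 1: reuse a path with < i edges
            for x, wt in out_edges[v]:          # case 2: first hop (v, x), then x-t path
                P[i][v] = min(P[i][v], wt + P[i - 1][x])
    return [P[n - 1][v] for v in range(n)]

# A 4-vertex example with one negative edge (and no negative cycles):
print(shortest_paths_table(4, [(0, 1, 1), (1, 2, -2), (0, 2, 5), (2, 3, 1)], 3))
# [0, -1, 1, 0]
```

Note that the inner loop only scans the edges leaving $v$, so each edge is examined once per outer iteration, matching the $O(mn)$ analysis below.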

This algorithm is pretty straightforward. The tricky part is counting the number of iterations this algorithm makes. Just like with Dijkstra's algorithm, it's easy to conclude that the running time must be something like $O(mn^2)$ just by looking at the definition of the loops, but if we observe how the edges are chosen, we see that only edges that leave a particular vertex are examined at each iteration, so every edge is examined at most once. So the inner two loops together comprise a total of $O(m)$ iterations. In total, this gives us $O(mn)$ time.

On the other hand, the algorithm clearly requires $O(n^2)$ space for the table. This seems like a lot, especially compared to Dijkstra's algorithm. It turns out that we can do better by applying some of the same ideas.

Bellman-Ford

One observation we can make is that in the end, we really don't care how many edges a shortest path actually uses. In that sense, all we really need to do is keep track of the minimum weight of the path from each vertex. So instead of keeping track of a table of size $O(n^2)$, we can keep track of an array $d$ of estimated weights of shortest paths.

The problem with this is that this may not be enough information to reconstruct the actual path itself. We take a similar approach to Dijkstra's algorithm and keep track of the successors for each vertex as we update the estimated weights $d$. In doing so, we reduce our space requirements from a table of size $O(n^2)$ to two arrays of size $n$, which is $O(n)$.

This leads to the following algorithm attributed to Bellman, who we saw earlier as the inventor of dynamic programming, and Ford, who we will see again very shortly when we get into network flows. But even though the algorithm is called Bellman-Ford, they didn't publish together: Ford published the algorithm in 1956, while Bellman published it in 1958. Even this isn't the only instance: Moore also published the algorithm in 1957 and sometimes also gets attribution, while Shimbel published it in 1955 and does not.

    \begin{algorithm}
    \caption{Compute shortest paths (Bellman-Ford)}
    \begin{algorithmic}
    \PROCEDURE{bellman-ford}{$G,t$}
        \FORALL{$v \in V$}
            \STATE $d[v] \gets \infty$
            \STATE $s[v] \gets \varnothing$
        \ENDFOR
        \STATE $d[t] \gets 0$
        \FOR{$i$ from $1$ to $n-1$}
            \FORALL{$v \in V$}
                    \FORALL{$(v,x) \in E$}
                        \IF{$d[v] \gt d[x] + w(v,x)$}
                            \STATE $d[v] \gets d[x] + w(v,x)$
                            \STATE $s[v] \gets x$
                        \ENDIF
                    \ENDFOR
            \ENDFOR
        \ENDFOR
    \ENDPROCEDURE
    \end{algorithmic}
    \end{algorithm}
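Here is a minimal Python sketch of the pseudocode above, with `path_to_t` added as a hypothetical helper (not part of the pseudocode) for recovering a path from the successor array:

```python
from math import inf

def bellman_ford(n, edges, t):
    """Single-destination Bellman-Ford. Returns (d, s), where d[v] is the
    weight of a shortest v-t path and s[v] is v's successor on one such
    path (None for t and for vertices that cannot reach t).
    Assumes the graph has no negative cycles."""
    out_edges = [[] for _ in range(n)]
    for u, v, wt in edges:
        out_edges[u].append((v, wt))
    d = [inf] * n
    s = [None] * n
    d[t] = 0
    for _ in range(n - 1):                  # n - 1 passes suffice (Prop 12.2)
        for v in range(n):
            for x, wt in out_edges[v]:
                if d[x] + wt < d[v]:
                    d[v] = d[x] + wt        # shorter path found through x
                    s[v] = x
    return d, s

def path_to_t(v, s):
    """Recover the v-t path by following successor pointers."""
    path = [v]
    while s[path[-1]] is not None:
        path.append(s[path[-1]])
    return path

d, s = bellman_ford(4, [(0, 1, 1), (1, 2, -2), (0, 2, 5), (2, 3, 1)], 3)
print(d)                # [0, -1, 1, 0]
print(path_to_t(0, s))  # [0, 1, 2, 3]
```

Only the two arrays `d` and `s` persist across iterations, which is where the $O(n)$ space bound comes from.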

The running time of this algorithm is still $O(mn)$ by the same argument as before and the space requirements have clearly changed to $O(n)$. What is not as obvious is that $d$ ends up holding the weights of the shortest paths for each vertex and that $s$ correctly holds the successor for each vertex. First, we will show that Bellman-Ford correctly computes the weight of the shortest path.

Proposition 12.7. After the $i$th iteration of the outer loop of Bellman-Ford, the following properties hold for every vertex $v \in V$:

  1. If $d[v] \neq \infty$, then $d[v]$ is the weight of some path from $v$ to $t$,
  2. If there is a path from $v$ to $t$ with at most $i$ edges, then $d[v]$ is at most the weight of a shortest path from $v$ to $t$ with at most $i$ edges.

Proof. We will prove this by induction on $i$. First, we consider $i = 0$. We have $d[t] = 0$ and $d[v] = \infty$ for all $v \neq t$, since there are no paths from $v$ to $t$ with 0 edges.

Now consider $i \gt 0$ and consider an update to $d[v] = d[x] + w(v,x)$. By our inductive hypothesis, $d[x]$ is the weight of a path from $x$ to $t$, so $d[x] + w(v,x)$ is the weight of a path from $v$ to $t$. So (1) holds.

Now, consider the shortest path $P$ from $v$ to $t$ with at most $i$ edges. Let $x$ be the first vertex after $v$ in $P$. Then the subpath $P'$ from $x$ to $t$ is the shortest path from $x$ to $t$ with at most $i-1$ edges.

By our inductive hypothesis, $d[x]$ must be no more than the weight of $P'$ after $i-1$ iterations of the outer loop of Bellman-Ford. So $d[x] \leq w(P')$.

Then on the next iteration, we consider the edge $(v,x)$. We have \[d[v] \leq d[x] + w(v,x) \leq w(P') + w(v,x) = w(P).\] So after $i$ iterations, $d[v]$ is at most the weight of the shortest path from $v$ to $t$ with at most $i$ edges. So (2) holds. $$\tag*{$\Box$}$$

All that is left is to show that the path we get by following the successor pointers in $s$ is a shortest $v,t$ path. Unlike with Dijkstra's algorithm, we haven't shown that the successor pointers cannot form a cycle, so this is something that we need to consider. First, let us consider the successor subgraph.

Definition 12.8. Let $G_s = (V,E_s)$ be the subgraph defined by \[E_s = \{(v,s[v]) \mid v \in V, s[v] \neq \varnothing\}.\]

Lemma 12.9. If the successor graph $G_s$ contains a cycle, it must be a negative cycle.

Proof. First, we observe that if $s[v] = x$, then we have that $d[v] \geq d[x] + w(v,x)$. Now, consider a cycle $C$ on $v_1, v_2, \dots, v_k, v_1$ in $G_s$ and assume that $(v_k,v_1)$ is the last edge that belongs to the cycle that is added to $G_s$.

At the point that $v_1$ is assigned to $s[v_k]$ and added to $G_s$, we have the following inequalities: \begin{align*} d[v_1] & \geq d[v_2] + w(v_1,v_2) \\ d[v_2] & \geq d[v_3] + w(v_2,v_3) \\ & \vdots \\ d[v_{k-1}] & \geq d[v_k] + w(v_{k-1},v_k) \\ d[v_k] & \gt d[v_1] + w(v_k,v_1) \\ \end{align*} Adding these inequalities together and grouping the $d[i]$'s on one side and edge weights on the other gives us \begin{align*} d[v_1] + \cdots + d[v_k] - d[v_1] - \cdots - d[v_k] &\gt w(v_1,v_2) + \cdots + w(v_k,v_1) \\ 0 &\gt w(v_1,v_2) + \cdots + w(v_k,v_1) \\ 0 &\gt w(C) \\ \end{align*} $$\tag*{$\Box$}$$

So if our graph does not have any negative cycles, then we will not end up with a cycle in our successor graph. It remains to show that the paths that are obtained via the successor graph are actually the shortest paths from $v$ to $t$.

Theorem 12.10. Assuming no negative cycles, Bellman-Ford finds shortest $v,t$ paths for every vertex $v$ in $O(mn)$ time and $O(n)$ space.

Proof. Since there are no negative cycles, the successor graph $G_s$ cannot contain any cycles. Therefore, the successor graph must consist of paths to $t$. Consider a path $P$ on vertices $v = v_1, v_2, \dots, v_k = t$.

Once the algorithm terminates, if $s[v] = x$, we must have $d[v] = d[x] + w(v,x)$ for all vertices $v \in V$. This gives us the equations \begin{align*} d[v_1] & = d[v_2] + w(v_1,v_2) \\ d[v_2] & = d[v_3] + w(v_2,v_3) \\ & \vdots \\ d[v_{k-1}] & = d[v_k] + w(v_{k-1},v_k) \\ \end{align*} Since $v_1 = v$ and $v_k = t$, adding these equations together gives us \begin{align*} d[v] &= d[t] + w(v_1,v_2) + \cdots + w(v_{k-1},v_k) \\ &= 0 + w(v_1,v_2) + \cdots + w(v_{k-1},v_k) \\ &= w(P) \\ \end{align*} By Proposition 12.7, $d[v]$ is the weight of the shortest path from $v$ to $t$, so $P$ is the shortest path from $v$ to $t$. $$\tag*{$\Box$}$$

So Bellman-Ford is able to compute the shortest path, assuming there are no negative cycles. And that's fine, because we know that if a graph has a negative cycle, then it doesn't have a shortest path.

Detecting negative cycles

Of course, it seems like it shouldn't be too hard to go from what we've discussed to figuring out whether we really do have a negative cycle or not. After all, if we're able to do this, we can make lots of money, as in our previous discussion on the matter.

Problem 12.11. The negative cycle problem is: Given a directed graph $G = (V,E)$ with an edge weight function $w: E \to \mathbb R$, find a negative cycle, if one exists.

Recall our Bellman-Ford recurrence, where $OPT(i,v)$ was the weight of a shortest path from $v$ to $t$ with at most $i$ edges. We can make use of this recurrence to determine whether our graph contains a negative cycle.

If our graph does not contain any negative cycles, then every vertex that can reach $t$ has a shortest path to $t$, and that path contains at most $n-1$ edges. Therefore, in the Bellman-Ford algorithm, it is not necessary to run the algorithm for more than $n-1$ iterations. However, what if we did run it some more?

Lemma 12.12. Let $G = (V,E)$ be a graph. If $OPT(n,v) = OPT(n-1,v)$ for all vertices $v \in V$, then $G$ does not contain any negative cycles on a path to $t$.

Proof. If $OPT(n,v) = OPT(n-1,v)$ for all $v \in V$, then by the recurrence of Proposition 12.3, $OPT(i,v) = OPT(n-1,v)$ for all $i \geq n$, so the weights of paths from $v$ to $t$ are bounded below. A negative cycle on a path to $t$ would allow paths of arbitrarily small weight, so no such cycle can exist. $$\tag*{$\Box$}$$

Proposition 12.13. Let $G = (V,E)$ be a graph. If there exists a vertex $v \in V$ such that $OPT(n,v) \lt OPT(n-1,v)$, then any shortest path from $v$ to $t$ with at most $n$ edges contains a negative cycle.

Proof. Since $OPT(n,v) \lt OPT(n-1,v)$, the shortest path from $v$ to $t$ with at most $n$ edges must have exactly $n$ edges. Such a path visits $n+1$ vertices, so by the Pigeonhole principle, it must contain a repeated vertex, and therefore, it contains a cycle.

Now consider one of these cycles $C$ and delete it from our path. This produces a path from $v$ to $t$ with strictly fewer than $n$ edges, so its weight is at least $OPT(n-1,v)$, which is strictly greater than $OPT(n,v)$, the weight of our original path. The two paths differ only by $C$, so the weight of $C$ must be negative.

Thus, $C$ is a negative cycle. $$\tag*{$\Box$}$$

This gives us a way to detect whether our graph contains a negative cycle that occurs on one of the paths from a vertex to our destination. But what if we want to detect negative cycles in our entire graph, and not just on certain paths? One easy way to do this is to augment our graph by adding an extra vertex and a weight-0 edge joining every vertex to it.

The intuition behind this is simple. The zero-weight edges don't contribute to any negative cycles, and the shortest path from any vertex to the sink vertex should be via the zero-weight edge, if there are no negative cycles.

Theorem 12.14. Let $G = (V,E)$ be a directed graph. Then we can determine whether $G$ contains a negative cycle in $O(mn)$ time and $O(n^2)$ space.

Proof. Let $G' = (V \cup \{t\}, E \cup \{(v,t) \mid v \in V\})$ with edge weight function $w'$ defined by \[w'(e) = \begin{cases} w(e) & \text{if $e \in E$,} \\ 0 & \text{if $e = (v,t)$ for some $v \in V$.} \end{cases}\] We will show that $G'$ contains a negative cycle if and only if $G$ contains a negative cycle.

Suppose $G$ contains a negative cycle. Since every vertex in $G'$ has an edge to $t$, this negative cycle lies on a path to $t$ in $G'$.

Now suppose that $G'$ has a negative cycle on a path to $t$. Since $t$ has no outgoing edges, $t$ does not belong to the cycle. Therefore, the cycle is in $G$.

Now, by applying Lemma 12.12 and Proposition 12.13, we can run Algorithm 12.4 on $G'$ to determine whether or not $G$ contains a negative cycle. $$\tag*{$\Box$}$$

Our previous approach follows from using our original table-filling algorithm. This is a nice and neat proof that applies everything in a straightforward manner. But is there a way to use Bellman-Ford instead and get our space usage down to $O(n)$? The answer is yes.

Theorem 12.15. Let $G = (V,E)$ be a directed graph. Then we can determine whether $G$ contains a negative cycle in $O(mn)$ time and $O(n)$ space.

We won't prove this formally, but the algorithm roughly goes like this:

  1. Run Bellman-Ford on our augmented graph $G'$ for $n+1$ iterations (remember that since $G'$ has $n+1$ vertices, we only needed to run it for $n$ iterations to find shortest paths).
  2. Check if any $d[v]$ was updated on the $n+1$st iteration; if not, then the paths have converged and there's no negative cycle in the graph.
  3. If there is an update on the $n+1$st iteration, then there is a negative cycle, and we can recover the path by following the successors, starting with the vertex $v$ with $d[v]$ updated in the $n+1$st iteration.
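The steps above can be sketched in Python as follows. The function name and the cycle-recovery details are our own choices: the recovery walks $n+1$ successor steps from the last-updated vertex to be sure it has landed on the cycle before reading it off.

```python
from math import inf

def find_negative_cycle(n, edges):
    """Detect a negative cycle in a digraph with vertices 0..n-1 and
    edges given as (u, v, weight) triples. Returns the vertices of a
    negative cycle, or None if there is none."""
    t = n                                        # the augmenting sink vertex
    out_edges = [[] for _ in range(n + 1)]
    for u, v, wt in edges:
        out_edges[u].append((v, wt))
    for v in range(n):                           # weight-0 edges v -> t
        out_edges[v].append((t, 0))
    d = [inf] * (n + 1)
    s = [None] * (n + 1)
    d[t] = 0
    for _ in range(n + 1):                       # n passes, plus 1 extra to test
        updated = None
        for v in range(n + 1):
            for x, wt in out_edges[v]:
                if d[x] + wt < d[v]:
                    d[v] = d[x] + wt
                    s[v] = x
                    updated = v
        if updated is None:                      # converged: no negative cycle
            return None
    # An update on the extra pass means the successor graph has a cycle,
    # which by Lemma 12.9 is negative; n + 1 successor steps land on it.
    v = updated
    for _ in range(n + 1):
        v = s[v]
    cycle, u = [v], s[v]
    while u != v:
        cycle.append(u)
        u = s[u]
    return cycle

print(find_negative_cycle(3, [(0, 1, 1), (1, 2, -3), (2, 0, 1)]))  # the cycle 0,1,2
print(find_negative_cycle(3, [(0, 1, 1), (1, 2, 1)]))              # None
```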