MPCS 50103 — Lecture 14

Walks, paths, cycles

A natural impulse when presented with a bunch of circles joined by lines is to see if we can get from one circle to another by following the lines. More formally, when we do this, what we're doing is checking if an object is connected to another object. It's not surprising to learn that this is a fundamental property of graphs that we'd like to work with.

A walk of length $k$ is a sequence of $k+1$ vertices $v_0, v_1, \cdots, v_k$ such that $v_{i-1}$ and $v_i$ are adjacent for $1 \leq i \leq k$. This walk has length $k$ because we traverse $k$ edges. A path is a walk without repeated vertices. A cycle or circuit of length $k$ is a walk of length $k$ with $v_0 = v_k$ and no other repeated vertices.

Consider the following graphs.

In the first, the walk $a_0 a_1 a_4 a_2 a_1 a_3$ is highlighted in red. In the second, the path $b_0 b_4 b_2 b_3 b_1$ is highlighted in orange. In the third, the cycle $c_0 c_4 c_2 c_1 c_0$ is highlighted in yellow.

Note that the above definitions are not universal and you should be very careful when consulting other materials (e.g. sometimes they use path when we mean walk).

Before we proceed, here is a simple theorem about walks and paths, in the spirit of the idea that the fastest way between points $a$ and $b$ is a straight line.

Let $G$ be a graph and $u,v \in V$ be vertices. If there is a walk from $u$ to $v$, then there is a path from $u$ to $v$.

Consider a walk $u = v_0 v_1 \cdots v_k = v$ from $u$ to $v$ of minimum length $k$. Suppose that a vertex is repeated. In other words, there exist $i, j$ with $i \lt j$ such that $v_i = v_j$. Then we can create a shorter walk $$u = v_0 v_1 \cdots v_{i-1} v_i v_{j+1} v_{j+2} \cdots v_k = v,$$ since $(v_i,v_{j+1}) = (v_j,v_{j+1})$ is an edge. This contradicts the minimality of the length of our original walk. So there are no repeated vertices in our walk and therefore it is a path by definition.
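The argument above is essentially an algorithm: whenever a vertex repeats, cut out the portion of the walk between its two occurrences. Here is a minimal Python sketch of that idea; the function name and the list-of-vertices representation are just for illustration.

```python
def walk_to_path(walk):
    """Shorten a walk (a list of vertices) to a path with the same endpoints,
    mirroring the proof above: whenever a vertex repeats, cut out the portion
    of the walk between its two occurrences."""
    path = []    # shortened walk built so far
    index = {}   # vertex -> its position in path
    for v in walk:
        if v in index:
            # v was seen before: discard everything after its first occurrence.
            path = path[: index[v] + 1]
            index = {u: i for i, u in enumerate(path)}
        else:
            index[v] = len(path)
            path.append(v)
    return path
```

For example, the red walk from the first figure shortens as `walk_to_path(["a0", "a1", "a4", "a2", "a1", "a3"]) == ["a0", "a1", "a3"]`.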

This leads to a basic question: how do I know whether there's a walk from $u$ to $v$? Although it is probably overkill, especially if you're only interested in one particular pair of vertices, one way we can answer this is by making clever use of the adjacency matrix.

Let $G = (V,E)$ be a graph with $V = \{1,\dots,n\}$ and adjacency matrix $A$. The number of walks of length $\ell$ from $i$ to $j$ is the $(i,j)$th entry of $A^\ell$.

We will prove this by induction on $\ell$.

Base case. We have $\ell = 1$. Then $A^1 = A$ is just the adjacency matrix, and the $(i,j)$th entry is 1 exactly when $(i,j)$ is an edge, that is, exactly when there is a (single) walk of length 1 from $i$ to $j$.

Inductive case. Now assume for $k \in \mathbb N$ that the $(i,j)$th entry of $A^k$ is the number of walks of length $k$ from $i$ to $j$. Consider $\ell = k+1$.

Since $A^{k+1} = A^kA$, the $(i,j)$th entry $c_{i,j}$ of $A^{k+1}$ is $$c_{i,j} = b_{i,1} a_{1,j} + b_{i,2} a_{2,j} + \cdots + b_{i,n} a_{n,j},$$ where $b_{i,m}$ is the $(i,m)$th entry of $A^k$ and $a_{m,j}$ is the $(m,j)$th entry of $A$. By the inductive hypothesis, $b_{i,m}$ is the number of walks of length $k$ from $i$ to $m$.

Now, every walk of length $k+1$ from $i$ to $j$ consists of a walk of length $k$ from $i$ to some vertex $m$, followed by the edge $(m,j)$. Since $a_{m,j}$ is 1 if $(m,j)$ is an edge and 0 if not, the term $b_{i,m} a_{m,j}$ counts exactly the walks of length $k+1$ from $i$ to $j$ whose second-to-last vertex is $m$. Summing over all vertices $m$, we conclude that $c_{i,j}$ is the number of walks of length $k+1$ from $i$ to $j$.

Note that although it doesn't help in the proof above, the matrix $A^k$ counting $k$-length walks is also well-defined for $k = 0$, giving us $A^0 = I$, the identity matrix. This gives us all of the walks of length 0 from a vertex $i$ to $j$. Since the identity matrix has zeroes everywhere except for the diagonal entries, we can interpret the $(i,i)$th entry as the single walk that starts at vertex $i$ and travels along 0 edges, which leaves us at the vertex $i$ itself.

Consider the following graph.

If we take the vertices of this graph in order $a_0, a_1, a_2, a_3, a_4$, then we have the adjacency matrix $A$, defined by $$A = \begin{pmatrix} 0 & 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 & 0 \end{pmatrix}$$ Then we have $$A^2 = \begin{pmatrix} 2 & 1 & 1 & 1 & 2 \\ 1 & 4 & 2 & 0 & 1 \\ 1 & 2 & 3 & 1 & 1 \\ 1 & 0 & 1 & 1 & 1 \\ 2 & 1 & 1 & 1 & 2 \end{pmatrix}, A^3 = \begin{pmatrix} 2 & 6 & 5 & 1 & 2 \\ 6 & 4 & 6 & 4 & 6 \\ 5 & 6 & 4 & 2 & 5 \\ 1 & 4 & 2 & 0 & 1 \\ 2 & 6 & 5 & 1 & 2 \end{pmatrix}, A^4 = \begin{pmatrix} 11 & 10 & 10 & 6 & 11\\ 10 & 22 & 16 & 4 & 10\\ 10 & 16 & 16 & 6 & 10\\ 6 & 4 & 6 & 4 & 6\\ 11 & 10 & 10 & 6 & 11 \end{pmatrix}. $$
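If you want to check these computations, or experiment with higher powers, a few lines of Python with NumPy will do it; this is just the theorem above applied to the same vertex ordering $a_0, \dots, a_4$.

```python
import numpy as np

# Adjacency matrix of the example graph, vertices ordered a_0, ..., a_4.
A = np.array([
    [0, 1, 1, 0, 0],
    [1, 0, 1, 1, 1],
    [1, 1, 0, 0, 1],
    [0, 1, 0, 0, 0],
    [0, 1, 1, 0, 0],
])

# The (i, j) entry of A^k counts the walks of length k from vertex i to vertex j.
for k in (2, 3, 4):
    print(f"A^{k} =")
    print(np.linalg.matrix_power(A, k))
```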

Now, this gives us some nice information, but it's not really the best way to determine whether there are walks between two particular vertices, or between every pair of vertices. If you think about the cost of multiplying matrices together, just going wild is probably not the best approach. What are some better ways of doing this? This is something that you can look forward to thinking about in an algorithms class.

Eulerian and Hamiltonian paths

We can play around with the notions of paths and cycles a bit more. One of the first questions of this kind was asked by Euler concerning the bridges of Königsberg. The city was connected by a number of bridges over the river that ran through it. The question was: is there a way to walk through the city so that you cross every bridge exactly once?

If we reframe this question, we can see that it's a graph question, where the bridges act as edges. Of course, this is a multigraph which is not what we're interested in, but we can ask the same question of the simple graph version of this graph.

Let $G$ be a graph. An Eulerian tour is a closed walk that uses every edge of $G$ exactly once. An Eulerian path is a walk, not necessarily closed, that uses every edge of $G$ exactly once.

The following, due to Euler, gives us necessary and sufficient conditions for an Eulerian tour to exist in our graph.

A connected graph $G = (V,E)$ contains an Eulerian tour if and only if every vertex has even degree.

First, suppose $G$ contains an Eulerian tour. Each time the tour passes through a vertex, it traverses two edges incident to that vertex: one going in, and one going out. Since the tour uses every edge exactly once, the edges at each vertex in the interior of the tour are paired up in this way, and therefore the vertex has even degree. This leaves the starting vertex. However, the tour has to end at the starting vertex, so the first and last edges of the tour pair up as well, and that gives us even degree there too.

Now, for the other direction, we assume that every vertex of $G$ has even degree and choose a walk of maximal length that doesn't repeat any edge. This walk has to be closed: if not, the first and last vertices of the walk have an odd number of their incident edges traversed, so there must be an unused edge incident to the last vertex that we can add to our walk, contradicting maximality. Now suppose for contradiction that our walk doesn't contain every edge in $G$. Since $G$ is connected, we can choose a vertex on our walk that has an unused edge incident to it. Starting from that vertex and using only unused edges, we construct a new walk, and this must also be closed for the same reason as above (every vertex has even degree among the unused edges, since our original walk used an even number of edges at each vertex). We can then join the two closed walks together at the shared vertex to form a larger walk, which contradicts our assumption that our walk was maximal.

Euler's result also holds for multigraphs, which is good for us, since the original Königsberg bridges problem is really a question about a multigraph.
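The existence proof is also essentially an algorithm: peel off closed walks and splice them together. One standard way to implement this idea is Hierholzer's algorithm; below is a minimal Python sketch, assuming the graph is given as a list of undirected edges and is connected apart from isolated vertices. Because it works with an edge list, parallel edges (and hence multigraphs like Königsberg) are handled as well.

```python
from collections import defaultdict

def eulerian_tour(edges):
    """Return the vertices of a closed walk that uses every edge exactly once,
    or None if no such tour exists. Assumes the graph (a list of undirected
    edges, parallel edges allowed) is connected apart from isolated vertices."""
    if not edges:
        return []
    adj = defaultdict(list)
    for idx, (u, v) in enumerate(edges):
        adj[u].append((v, idx))
        adj[v].append((u, idx))

    # Euler's condition: every vertex must have even degree.
    if any(len(neighbors) % 2 for neighbors in adj.values()):
        return None

    used = [False] * len(edges)
    stack, tour = [edges[0][0]], []
    while stack:
        u = stack[-1]
        # Skip edges at u that were already traversed from the other endpoint.
        while adj[u] and used[adj[u][-1][1]]:
            adj[u].pop()
        if adj[u]:
            v, idx = adj[u].pop()
            used[idx] = True
            stack.append(v)           # extend the current walk along an unused edge
        else:
            tour.append(stack.pop())  # dead end: back up, recording the vertex
    return tour
```

For example, the triangle `[(0, 1), (1, 2), (2, 0)]` yields the tour `[0, 1, 2, 0]`, while the Königsberg multigraph (all four vertices have odd degree) yields `None`.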

However, 1736 was a long time ago and Königsberg has changed since. For one thing, it's now Kaliningrad, which is the part of the Russian Federation that's not connected to the rest of it. The bridges have changed too: there are only five bridges connecting the different parts of the city. Luckily for us, this is no longer a multigraph.

However, examining the graph, we see that the modern arrangement of the bridges of Kaliningrad still doesn't satisfy Euler's condition, and so we can conclude that there is still no Eulerian tour through the city's bridges.

We can ask the same kind of questions but for vertices instead of edges. This question is due to William Rowan Hamilton. Hamilton was an Irish mathematician who invented the icosian game, which involves trying to find what we'll describe as a Hamiltonian cycle along the edges of a dodecahedron.

Let $G$ be a graph. A cycle is a Hamiltonian cycle if every vertex of $G$ is in the cycle. A path is a Hamiltonian path if every vertex of $G$ is in the path.

The usual binary encoding of a number interprets each digit as a power of 2. Suppose we're interested in counting something, so we would proceed by counting using three bits in the following way: 000, 001, 010, 011, 100, 101, 110, 111. Generally, this is not a problem because we think of these abstractly. However, it's worth thinking about the physical implementation of such a system.

Here, the big question is when we go from something like 011 to 100: how do we guarantee that the system perfectly switches every bit at the same time? The answer is that we really can't. For physical systems, you can imagine that this is not feasible, but even for electronic systems, this is not always possible. This is also a big problem if the switching of bits is expensive, as in quantum systems.
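To see the problem concretely, we can measure how many bits change between consecutive values in ordinary binary counting; this is just the Hamming distance, computed here with a tiny illustrative Python snippet.

```python
def bit_flips(a, b):
    """Number of bit positions in which the integers a and b differ
    (their Hamming distance)."""
    return bin(a ^ b).count("1")

# Counting in ordinary binary, several bits can change at once:
print(bit_flips(3, 4))   # 3: 011 -> 100 flips all three bits
print(bit_flips(7, 8))   # 4: 0111 -> 1000 flips four bits
```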

To deal with this, we need an encoding of numbers in which consecutive values differ in exactly one bit. These are called Gray codes, after Frank Gray, who described them in the late 1940s.

Gray codes have a lot of interesting mathematical properties, but we'll focus on graph theory. We can ask whether we can construct an $n$-bit Gray code for every $n$. It seems obvious that we can get one for $n = 2$, and with some work, we can get $n = 3$. Is there a better way to do this than just trial and error?

Recall that $Q_n$ is the graph of $n$-bit strings which are joined by edges if they differ in exactly one position. This sounds suspiciously like a Gray code. Then our question becomes: is there a path through $Q_n$ that hits every vertex exactly once? If there is, this gives us our Gray code.

For every integer $n \geq 2$, the $n$-cube $Q_n$ has a Hamiltonian cycle.

We will prove this by induction on $n$.

Base case. We have $n = 2$. Then our graph looks like this.

There is clearly a Hamiltonian cycle in this graph.

Inductive case. Let $k \geq 2$ and assume that $Q_k$ contains a Hamiltonian cycle. Now, consider the graph $Q_{k+1}$.

We partition our vertex set into two sets, with one set consisting of strings that begin with 0 and the other consisting of strings that begin with 1: $$V_0 = \{0x \mid x \in \{0,1\}^k\}, \qquad V_1 = \{1x \mid x \in \{0,1\}^k\}.$$ We observe that the subgraph induced by each of these sets is isomorphic to the $k$-cube $Q_k$, since their vertex sets are really just binary strings of length $k$ with a fixed first bit. Then by our inductive hypothesis, both of these subgraphs have a Hamiltonian cycle. Let $C = v_1 v_2 \cdots v_{2^k} v_1$ be a Hamiltonian cycle for $Q_k$; in other words, $v_1, \dots, v_{2^k}$ are the strings of length $k$. Then we have the Hamiltonian cycles $C_0 = 0v_1, 0v_2, \dots, 0v_{2^k}, 0v_1$ in the subgraph induced by $V_0$ and $C_1 = 1v_1, 1v_2, \dots, 1v_{2^k}, 1v_1$ in the subgraph induced by $V_1$.

Then a Hamiltonian cycle for $Q_{k+1}$ is $$0v_1, 0v_2, \dots, 0v_{2^k}, 1v_{2^k}, 1v_{2^k-1}, \dots, 1v_2, 1v_1, 0v_1.$$ In other words, we remove the final edges of $C_0$ and $C_1$ so that they become paths. Then we add the edge $(0v_{2^k}, 1v_{2^k})$ to the cycle. This edge exists because these two strings differ only in the first position. We then add the edge $(1v_1, 0v_1)$, and again these two strings differ only in the first position. Our cycle is then traversed by following $C_0$, then crossing over to $C_1$ and traversing $C_1$ in reverse.

This proof actually shows us how to construct an $n$-bit Gray code. First, we take an $(n-1)$-bit Gray code and make two copies: one with a 0 prepended to every string and one with a 1 prepended to every string. Then we list the copy beginning with 0 in its original order, followed by the copy beginning with 1 in reverse order. It's because of this reflection that the Gray code is also called a reflected code.
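The reflection construction translates directly into code. Here is a minimal recursive Python sketch; note that it uses $n = 1$ as its base case, which also works even though the theorem above starts at $n = 2$.

```python
def gray_code(n):
    """Return the n-bit reflected Gray code as a list of bit strings.
    Consecutive strings (and the last and first) differ in exactly one bit,
    so the list traces a Hamiltonian cycle through Q_n."""
    if n == 1:
        return ["0", "1"]
    previous = gray_code(n - 1)
    # Prefix the (n-1)-bit code with 0, then its reversal with 1.
    return ["0" + s for s in previous] + ["1" + s for s in reversed(previous)]
```

For example, `gray_code(3)` is `['000', '001', '011', '010', '110', '111', '101', '100']`.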

Just like with Eulerian tours and paths, we can ask what conditions guarantee that a graph contains a Hamiltonian path or cycle. Unlike the Eulerian case, we know of no simple characterization of such graphs: the conditions we do know, like the two below, are sufficient but not necessary. The following results are known, but we will not prove them.

Let $G = (V,E)$ with $|V| \geq 3$. If $\deg(v) \geq \frac{|V|}2$ for all $v \in V$, then $G$ has a Hamiltonian cycle.

Let $G = (V,E)$ with $|V| \geq 3$. If for all pairs of nonadjacent vertices $u,v \in V$, $\deg(u) + \deg(v) \geq |V|$, then $G$ has a Hamiltonian cycle.

Basically, these two results say that as long as you have enough edges, there will be a Hamiltonian cycle. But, as mentioned, a graph that has a Hamiltonian cycle doesn't necessarily satisfy the conditions in these theorems.
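As a quick illustration, here is a small Python sketch that checks Ore's condition for a graph given as an adjacency dict (a representation chosen just for this example). If it returns True, the theorem guarantees a Hamiltonian cycle; if it returns False, we learn nothing.

```python
from itertools import combinations

def satisfies_ore(adj):
    """Check Ore's sufficient condition for a graph given as a dict mapping
    each vertex to the set of its neighbors: |V| >= 3 and every pair of
    nonadjacent vertices has degrees summing to at least |V|."""
    n = len(adj)
    if n < 3:
        return False
    return all(len(adj[u]) + len(adj[v]) >= n
               for u, v in combinations(adj, 2)
               if v not in adj[u])
```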

All of this hints that the difficulty of finding Eulerian and Hamiltonian cycles/paths might also be different. And this is true: we already saw that there is a (very) efficient way to decide whether an Eulerian tour exists, and the proof can be turned into an efficient algorithm for finding one. However, there isn't any known efficient algorithm for Hamiltonian cycles/paths; if there were, we could solve the Traveling Salesman Problem efficiently too. This might be a bit surprising, but it serves as a preview of an important point of theoretical computer science: very similar-looking problems can be quite different in terms of their difficulty.

Connectedness

What does it mean if there's a walk between two vertices? Practically speaking, it means that we can reach one from the other. This idea leads to the following definition.

A graph $G$ is connected if there is a walk between every pair of vertices.

The graphs that we've seen so far have mostly been connected. However, we're not guaranteed to work with connected graphs, especially in real-world applications. In fact, we may want to test the connectedness of a graph. For that, we'll need to talk about the parts of a graph that are connected.

A connected component of a graph $G = (V,E)$ is a connected subgraph $G'$ of $G$ that is not a proper subgraph of another connected subgraph of $G$.

Consider the following graph.

This graph has two connected components, the subgraphs induced by $a_0,\dots, a_5$ and by $a_6,\dots,a_9$. No other subgraph can be a connected component, since it would either not be connected or be a proper subgraph of one of the two connected components we identified.
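Finding the connected components is a natural first graph algorithm: start a search at any unvisited vertex and collect everything reachable from it, then repeat. Here is a minimal breadth-first-search sketch in Python, assuming the graph is given as an adjacency dict.

```python
from collections import deque

def connected_components(adj):
    """Return the connected components (as sets of vertices) of a graph given
    as a dict mapping each vertex to an iterable of its neighbors."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        # Breadth-first search from `start` collects exactly one component.
        component, queue = set(), deque([start])
        seen.add(start)
        while queue:
            u = queue.popleft()
            component.add(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        components.append(component)
    return components
```

A graph is connected exactly when this returns a single component.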

Another question related to connectivity that we can ask is how fragile the graph is. For instance, if we imagine some sort of network (computer, transportation, social, etc.), this is the same as asking where the points of failure are in our network. Which vertices or edges do we need to take out to disconnect our graph?

Let $G = (V,E)$ be a graph and consider a vertex $v \in V$. If removing $v$ and its incident edges increases the number of connected components, then $v$ is a cut vertex or articulation point. Similarly, consider an edge $e \in E$. If removing $e$ increases the number of connected components, then $e$ is a cut edge or bridge.

Consider the following graph $G$. There are two obvious bridges here: $(a_3,c_2)$ and $(b_2,c_1)$. And the vertices incident to the bridges are the cut vertices of the graph.
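The definition itself already gives a simple (if slow) way to find the bridges of a small simple graph: delete each edge in turn and see whether the number of components goes up. Here is a naive Python sketch along those lines; much faster methods exist, but this one is a direct translation of the definition.

```python
def bridges(vertices, edges):
    """Return the bridges of a simple graph, directly from the definition:
    an edge is a bridge if deleting it increases the number of connected
    components. Naive O(|E| * (|V| + |E|)) approach, fine for small graphs."""

    def num_components(edge_list):
        adj = {v: [] for v in vertices}
        for u, w in edge_list:
            adj[u].append(w)
            adj[w].append(u)
        seen, count = set(), 0
        for start in vertices:
            if start in seen:
                continue
            count += 1                # found a new component
            stack = [start]
            seen.add(start)
            while stack:              # depth-first search
                u = stack.pop()
                for w in adj[u]:
                    if w not in seen:
                        seen.add(w)
                        stack.append(w)
        return count

    baseline = num_components(edges)
    return [e for e in edges
            if num_components([f for f in edges if f != e]) > baseline]
```

The cut vertices can be found the same way, by deleting each vertex (and its incident edges) in turn and recounting the components.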

We would like to be able to identify which edges are bridges and what properties they have. The following lemma is one of those results that sounds incredibly obvious based on the definitions involved and our intuition.

Let $G$ be a connected graph. If $(u,v) \in E$ is a bridge, then $G - (u,v)$ (the subgraph of $G$ with only the edge $(u,v)$ deleted) has exactly two components and $u$ and $v$ are in different components.

Since $(u,v)$ is a bridge, we know that $G - (u,v)$ has at least two components. Let $X$ be the set of vertices that are in the same component as $u$. Choose some vertex $x \not \in X$. Then $x$ is not in the same component as $u$. Since $G$ was connected, $G$ contained a path between $u$ and $x$, but there is no such path in $G - (u,v)$. Thus, any path from $u$ to $x$ in $G$ must have contained the edge $(u,v)$, and so must have the form $u v v_2 v_3 \cdots v_{n-1} x$.

Then $v v_2 \cdots v_{n-1} x$ is a path in $G - (u,v)$ and therefore $v$ and $x$ are in the same component. But recall that the choice of $x \not \in X$ was arbitrary. This means that every vertex $x \not \in X$ is in the same component as $v$. Therefore, $G - (u,v)$ has exactly two components, and $u$ and $v$ are in different components.

We can use this to get the following important characterization of bridges which is maybe not quite as obvious.

Let $G$ be a graph. An edge $e$ of $G$ is a bridge if and only if it is not contained in any cycle of $G$.

For the forward direction, we assume that $e = (u,v)$ is a bridge. Suppose that $e$ is in a cycle, say $u v v_2 \cdots v_{n-1} u$. Now consider $G - e$. Note that $v v_2 \cdots v_{n-1} u$ is a path between $u$ and $v$, so $u$ and $v$ are in the same component. But this contradicts the lemma above. Therefore, $e$ is not in a cycle of $G$.

For the reverse direction, we assume that $e = (u,v)$ is not in a cycle. Suppose that $e$ is not a bridge. Then $u$ and $v$ are in the same component in the graph $G - e$. Therefore, there is a path between $u$ and $v$, say $u v_2 \cdots v_{n-1} v$. However, this means that $u v_2 \cdots v_{n-1} v u$ is a cycle in $G$ containing $e$, which is a contradiction. Therefore, $e$ is a bridge.