I enjoy thinking about encoding problems, where an object in one domain needs to be converted to another domain in such a way that the original object can easily be retrieved.

In most applications the encoding domain is strings, and the problem of coming up with the encoding is rarely difficult, as long as you’re careful. Usually a markup language like XML or JSON makes it particularly easy.

I’ve been wondering about less straightforward encodings, which started when I thought of the problem of encoding a directed graph in an undirected graph. Clearly the opposite problem is easy: to encode an undirected graph in a directed graph, simply choose a direction for each edge (this is apparently referred to as orienting the graph). Then the original undirected graph can be obtained by discarding the directions of the edges.

But directed graphs clearly contain more information than undirected graphs with the same number of vertices and edges, so any encoding method will have to generally increase the order of the graph. The best tactic I can think of is to encode the directed edges as vertices in the undirected graph. For example, consider the simplest non-trivial directed graph:

Obviously, we can’t encode this as an undirected graph with just two vertices — we wouldn’t know if there was just one directed edge being represented or two. And if the graph was larger and the two vertices were distinguishable, we wouldn’t know which vertex was the head and which was the tail. So we definitely need another vertex to represent the edge. One vertex doesn’t improve the situation, however, as we still only have one option:

So it looks like we need at least two extra vertices to fully represent each edge with its direction. As best I can tell, the simplest way to do it is like this:

We connect the two edge-vertices to each other, so we know they’re related. Then we connect one of them to the tail, and both of them to the head, so we can tell which is which. Here’s how it looks for a more complex graph:

There is an issue, however, with going in the opposite direction: how do we distinguish the edge vertices from the vertex vertices? The colors and shapes in the previous image make it pretty easy, but they’re not actually part of the graph. The undecorated graph looks like this:

It’s not so obvious how to get back to the original directed graph, or if a general method exists at all. The simplest way I’ve thought of to address this issue is to tack an additional vertex on to each vertex-vertex. Rather than explain that more clearly, I’ll illustrate it. Our first graph will look like this:

Essentially each vertex from the directed graph is being represented with two vertices, only one of which is connected to anything else. Since these second vertices will be the only vertices in the graph with degree one, they’re easy to identify. Here’s the larger graph:

Just for kicks, here’s another pair of graphs where the directed graph was randomly generated:

One question I haven’t settled is whether the extra vertices are strictly necessary. More formally, if we encode without them, are there any two digraphs that encode to the same graph? If I figure anything out, I’ll post it.