1
$\begingroup$

I'm working on a problem where I need to convert an undirected and unweighted graph with cycles into a tree while preserving the edge information (all the edges from the graph are preserved in the resulting tree).

For the resulting tree, the height and nodes duplication should be minimized.

Nodes duplication can happen because in order to preserve the edge information from the graph, we will need to break the cycles by removing the edge(s) and then make a copy of the node(s) then connect it in the tree, or in other scenarios that I haven't thought of.

I don't think minimum spanning tree algorithms will work because the algorithms will lose the edge information.



So far my heuristic is:

  1. For choosing the root node of the tree, pick the vertex with the highest betweenness centrality in the graph. Because that vertex likely serves as a bridge which many shortest paths flow.

  2. When breaking the cycles, remove the edge whose nodes have the least total sum of degrees.

This works for some examples I crafted.example



My question is, how can I verify the heuristic is indeed optimal and provide a formal proof? Or is there a better strategy?

Any insights, references, or alternative approaches would be greatly appreciated!

$\endgroup$

1 Answer 1

1
$\begingroup$

If the starting graph has $n$ vertices and $m$ edges, and we want to turn it into a tree with $m$ edges, that tree will have $m+1$ vertices; there is no choice in the matter. So there is no question of which choices will have more or less vertex duplication; they will all have the same total amount.

Now, as I understand the problem, starting from a graph $G$ you want to create a graph $G'$ such that:

  • each vertex of $G'$ is a copy of a vertex of $G$, but some vertices of $G$ can have multiple copies on $G'$;
  • for each edge $vw$ of $G$, we want an edge in $G'$ between some copy of $v$ and some copy of $w$;
  • $G'$ is a tree with the minimum possible height (after we choose some vertex of $G'$ to be the root).

Regarding the last requirement, there is a one-way relationship between distances in $G$ and distances in $G'$: if $v,w$ are vertices in $G$ and $v', w'$ are copies of $v$ and $w$ in $G'$, then the distance between $v'$ and $w'$ in $G'$ is at least the distance between $v$ and $w$ in $G$. This holds because, whatever the shortest path is between $v'$ and $w'$ in $G'$, we can follow the same edges in $G$ to get from $v$ to $w$.

As a result, if $G'$ is a rooted tree with root $v'$ (a copy of vertex $v$ in $G$) and height $h$, then every vertex of $G$ is within distance $h$ from $v$. So $h$ is at least the radius of $G$. In at least some cases, it is not possible to have $h$ equal the radius of $G$; for example, if $G$ is a complete graph on $3$ vertices, then $G$ has radius $1$; but it's impossible to create a tree $G'$ with height $1$ out of $G$.

Here is an algorithm that will make $G'$ have height at most one more than the radius of $G$. (So this is definitely within one of optimal, and sometimes the best we can do.)

  1. Let $v$ be any central vertex of $G$: the distance from $v$ to any other vertex is at most the radius of $G$. (Let $r$ be that radius.)
  2. As the first step to constructing $G'$, find a shortest-path tree starting at $v$. This is a spanning tree of $G$ such that the distances from $v$ to other vertices are the same in $G$ and in the spanning tree; it has height $r$.
  3. Now, for every edge $xy$ in $G$ that is not in this spanning tree, clone $x$ an additional time, and connect that copy of $x$ to the vertex $y$ in the spanning tree. (The choice between doing this to $x$ or $y$ is arbitrary.)

The resulting $G'$ has height at most $r+1$, because every vertex is either part of the spanning tree or within one step of the spanning tree, and every vertex on the spanning tree is within $r$ steps from $v$.

$\endgroup$
4
  • $\begingroup$ Thank you for the insight. However I am not necessarily trying to turn a graph with $n$ vertices and $m$ edges into a tree with $m$ edges. In my example if I pick vertex 2 as the root node for the tree, I will need to duplicate vertices 7, 6, 4 and 5 as I build the tree. That's not optimal and that's why I am looking for a strategy to pick the best vertex to minimize the duplication. $\endgroup$ Commented Aug 22, 2023 at 5:58
  • 1
    $\begingroup$ Why would you need to duplicate all four of those vertices? You could just duplicate vertex $4$, only: once as the child of $1$ and once as the child of $3$. $\endgroup$ Commented Aug 22, 2023 at 10:50
  • $\begingroup$ Sorry I probably should have mentioned this in the post. I will want to duplicate all four vertices because every vertex $v_{i}$ in my graph is actually a collection(sets) of small vertices $v'_{ij}$. And, if some $v'_{1u}$ and $v'_{2v}$ are connected, then $v_1$ and $v_2$ are connected. So in my example, if I only duplicate $4$ as the child of $3$, some small vertices in $3$ will lose access to the corresponding ones in $5$, which can only be reached through $4$ from $3$ originally. Sorry again for the confusion. $\endgroup$ Commented Aug 22, 2023 at 21:27
  • $\begingroup$ In that case, I think any solution to this problem will need access to the "small vertices" to function. Otherwise, there is no guarantee that any solution works - not even the one in your example, where it's possible that some small vertices in $3$ are not reachable from $4$ directly, but can only be reached via going around through $1$ and $2$. $\endgroup$ Commented Aug 22, 2023 at 22:29

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged .