71
\$\begingroup\$

After writing this answer, I was inspired to try to design a more elegant solution to the problem of printing binary trees. And what better tool with which to seek elegance than Clojure?

Overall design

The solution I ended up with involved creating, merging, and printing what I'm going to call sparse strings. A sparse string is simply a map from row/column pairs to substrings. These substrings must not contain newlines or overlap with each other.

So for instance, this multiline string

 baz
     foo
bar
              qux

could be represented by this sparse string:

{[3 -2] "foo"
 [4 -7] "bar"
 [2 -6] "baz"
 [5 7] "qux"}

A few things to note:

  1. Empty space is simply filled by regular spaces and newlines
  2. The first coordinate is the row, the second coordinate is the column
  3. The first row and column need not necessarily be zero

The problem of printing a binary tree then reduces to creating sparse strings from its left and right subtrees (as well as one for its value, which will go at the top), then shifting and merging those sparse strings together.

The only extra requirement when using sparse strings to represent trees is that the root (or the center of the root, if the root is wider than 1 character) must be at [0 0]. Consider the tree below:

    4
   / \
  2   5
 / \
1   3

The in-memory representation of this would be

{:value 4
 :left {:value 2
        :left {:value 1}
        :right {:value 3}}
 :right {:value 5}}

Textually, the tree would be represented as

{[0 0] "4"
 [2 -2] "2"
 [4 -4] "1"
 [3 -3] "/"
 [4 0] "1"
 [3 -1] "\\"
 [1 -1] "/"
 [2 2] "5"
 [1 1] "\\"}

I'll refer to this as a tree string. This way, when I combine multiple tree strings to create a bigger tree string, I can safely use [0 0] as an anchor point from which to shift subtrees.

Code

(defn end-col
  "Returns one plus the maximum column occupied by the sparse string entry x."
  [x]
  (let [[[_ col] s] x]
    (+ col (count s))))

(defn min-corner
  "Returns a vector of the minimum non-empty row and column in sparse-string."
  [sparse-string]
  (mapv #(apply min (map % (keys sparse-string)))
        [first second]))

(defn max-corner
  "Returns a vector of one plus the maximum non-empty row and column in
  sparse-string."
  [sparse-string]
  (mapv #(apply max (map % sparse-string))
        [(comp inc first key) end-col]))

(defn fill
  "Returns a string of newlines and spaces to fill the gap between entries x and
  y in a sparse string whose minimum corner is corner."
  [corner x y]
  (let [[_ min-col] corner
        [[prev-row _] _] x
        [[next-row next-col] _] y]
    (apply str (concat (repeat (- next-row prev-row) \newline)
                       (repeat (- next-col
                                  (if (== prev-row next-row)
                                    (end-col x)
                                    min-col))
                               \space)))))

(defn sparse-str
  "Converts sparse-string to a string."
  [sparse-string]
  (let [corner (min-corner sparse-string)]
    (->> sparse-string
         (sort-by key)
         (cons [corner ""])
         (partition 2 1)
         (map (fn [[x y]] (str (fill corner x y) (val y))))
         (apply str))))

(defn shift
  "Creates and returns a sparse string by adding offset to the position of each
  entry in sparse-string."
  [offset sparse-string]
  (into {} (map (fn [[pos s]]
                  [(mapv + pos offset) s])
                sparse-string)))

(defn vert-gap
  "Returns the minimum vertical gap that can be used in combining the left and
  right tree strings."
  [left right]
  (if (and left right)
    (max 1 (quot (- (second (max-corner left))
                    (second (min-corner right)))
                 2))
    1))

(def directions {:left - :right +})

(defn diagonal
  "Returns a diagonal sparse string with the top end located at corner."
  [direction corner length character]
  (let [[first-row first-col] corner]
    (into {} (map (fn [n]
                    [[(+ first-row n)
                      ((directions direction) first-col n)]
                     (str character)])
                  (range length)))))

(defn leg
  "Returns a sparse string from shifting tree-string according to direction,
  vert-gap, and value-height, merged with a diagonal strut."
  [direction tree-string vert-gap value-height]
  (merge (shift [(+ value-height vert-gap)
                 ((directions direction) (inc vert-gap))]
                tree-string)
         (diagonal direction
                   [value-height ((directions direction) 1)]
                   vert-gap
                   ({:left \/ :right \\} direction))))

(defn assemble
  "Assembles a complete tree string from the tree strings of a value, left
  subtree, and right subtree."
  [value left right]
  (if (or left right)
    (let [[value-height _] (max-corner value)
          vert-gap (vert-gap left right)]
      (merge value
             (when left
               (leg :left left vert-gap value-height))
             (when right
               (leg :right right vert-gap value-height))))
    value))

(defn tree-string
  "Creates and returns a tree string from node."
  [node]
  (let [{:keys [value left right]} node
        s (str value)]
    (apply assemble
           {[0 (- (quot (count s) 2))] s}
           (map #(when % (tree-string %))
                [left right]))))

Implementation notes

When printing a sparse string, one first needs to find the minimum occupied row and column in the sparse string. This naturally leads to the end-col, min-corner, and max-corner functions for calculating bounding boxes for sparse strings.

Now, how does one print a sparse string? This problem basically boils down to printing the whitespace to fill the gaps between sparse string entries. Once that's accomplished, the actual sparse-str implementation is quite straightforward.

Also, since I'm going to be combining sparse strings in addition to just creating and printing them, I'll need a function to shift the reference point of an existing sparse string.

All we need now is a way to convert from that logical representation of the tree to the textual one, which is the task of the tree-string function.

As you can see, the better part of the work is done in that assemble function. The first thing we need to do is decide how long we're going to make the struts that connect the left and right subtrees. To simplify things, we'll just always make them the same length, which means that the length of the struts, calculated by vert-gap, will equal to the final distance between the bottom of the value tree string and the tops of the left and right tree strings.

And of course, we'll need the diagonal function to create the struts themselves.

Now that we have all the rest of the pieces, it's just a matter of assembleing them together. The leg function is just a helper that combines diagonal and shift together into something that can then be merged with the root.

Example

In case you'd like to test this code yourself, here's a function for generating a random binary tree:

(defn rand-tree
  [weight depth]
  (into {:value (int (Math/pow 2 (rand (dec Integer/SIZE))))}
        (map #(when (and (pos? depth) (< (rand) weight))
                [% (rand-tree weight (dec depth))])
             [:left :right])))

So this...

(println (sparse-str (tree-string (rand-tree 3/4 3))))

... will print something like this:

          1369616891
              / \
             /   \
            /     \
           /       \
          /         \
      238883         2
        / \         /
       /   \       9
      /     \     / \
     /       \   1 2222
 2949729      6
   /         /
1836     5299294
\$\endgroup\$
7
  • \$\begingroup\$ Why are 2's kids on a different level than those of 238883? \$\endgroup\$
    – Josh
    Commented Mar 29, 2017 at 14:07
  • \$\begingroup\$ @Josh Because my code recursively builds the left and right subtrees into tree strings separately and then combines them into a tree string for the whole tree, so in that example the right subtree doesn't know the length of the struts in the left subtree because it doesn't know that the left subtree exists at all. \$\endgroup\$
    – Sam Estep
    Commented Mar 29, 2017 at 16:37
  • \$\begingroup\$ But why are the struts different lengths there?? \$\endgroup\$
    – Josh
    Commented Mar 30, 2017 at 6:54
  • \$\begingroup\$ @Josh Because you can't fit 2949729 and 6 on the same line with struts of length 1 using the centering function #(- (quot (count %) 2)) that you can find in tree-string. \$\endgroup\$
    – Sam Estep
    Commented Mar 30, 2017 at 11:27
  • 1
    \$\begingroup\$ @Josh In any case, if you think that my code can be improved, by all means please post an answer! That's why I posted it in this question in the first place. \$\endgroup\$
    – Sam Estep
    Commented Mar 30, 2017 at 22:27

1 Answer 1

17
\$\begingroup\$

First of all, in my opinion, both the algorithm is nice and interesting, and the code is really well-written, broken down into easily understandable functions! Well done!

The only addition that I can make are corner-cases (and their possible fixes) for some of the helper functions. However, please note, that from the main entry point, I did not find any way to trigger these corner cases, so, from a user perspective, the program works well even without the changes below.

Corner case # 1

(min-corner {})
(sparse-str {})

Both throw an exception, due to min in min-corner being called with zero arguments.

I suggest the following change, in order to handle this case:

(defn min-corner
   "Returns a vector of the minimum non-empty row and column in sparse-string."
   [sparse-string]
   (mapv #(apply min (let [args (map % (keys sparse-string))] (if (empty? args) [0 0] args)))
;  (mapv #(apply min (map % (keys sparse-string)))
         [first second]))

I.e., call min with a default value (in this case [0 0]), if the arguments are empty.

Corner case #2

(max-corner {})

Similar issue, like for min-corner, similar solution:

(defn max-corner 
   "Returns a vector of one plus the maximum non-empty row and column in
   sparse-string."
   [sparse-string]
   (mapv #(apply max (let [args (map % sparse-string)] (if (empty? args) [0 0] args)))
;  (mapv #(apply max (map % sparse-string))
         [(comp inc first key) end-col]))

Corner case #3

(sparse-str {[0 0] "aa" [0 1] "b"})

This will return aab. However, b is misplaced here, because it should be at position [0 1], and it is, instead, at position [0 2]. To be honest, I don't really know what would be the best solution here. I can imagine the following:

  1. Leave it as it is: all output is shown, some of it might be on the wrong place.
  2. Overwrite the the end of the first string with the second one (i.e. "ab").
  3. Overwrite the beginning of the second string with the first one (in this case: "aa").
  4. Throw an exception.

Again, starting from the main entry point, it is not possible to trigger this situation, because the user does not provide any positions. However, in case the helper functions were to be reused in a library, this is a question that should be addressed.

Corner case #4

(vert-gap {[0 0] "a"} {[4 10] "a"})

In case the right element has a bigger second coordinate than the left one, the result is always 1. I'm not sure if this is by design, but if not, then I'd suggest to make sure, that vertical gap is calculated correctly also in such cases, by taking the absolute value of the difference, instead of the difference itself:

(defn vert-gap
   "Returns the minimum vertical gap that can be used in combining the left and
   right tree strings."
   [left right]
   (if (and left right)
     (max 1 (quot (Math/abs (- (second (max-corner left))
                     (second (min-corner right)))
                  ) 2))
   1))

Corner case # 5

(diagonal :left [0 0] -2 \a)

With the current implementation, this will return an empty result: {}. Now, with the semantics that the name of the parameter length implies, this should be correct (maybe an exception could be thrown, but that is only detail). However, wouldn't it be nice, if the signum of length could control the direction, in which the first coordinate grows? Something like this:

(defn diagonal
   "Returns a diagonal sparse string with the top end located at corner."
   [direction corner length character]
   (let [[first-row first-col] corner
         _length (Math/abs length)
         op (if (< length 0) - +)]
     (into {} (map (fn [n]
                     [[(op first-row n)
                       ((directions direction) first-col n)]
                      (str character)])
                   (range _length)))))

In this way, the result of the above example will be: {[0 0] "a", [-1 -1] "a"}. Of course, it might be worth considering to rename the length parameter, e.g. to horizontal-displacement or something similar.

P.S.:

I setup a github repo for the above mentioned changes, and their corresponding test cases (along other tests):

\$\endgroup\$
5
  • \$\begingroup\$ Thanks for the awesome write-up! For Corner case #1 and Corner case #2, I think that it would be better for sparse-str to handle the case where its argument is empty (and just return "") and for min-corner and max-corner to specify in their docstring that their argument must be nonempty, similar to how it is part of the contract of min and max that they be passed a nonzero number of arguments. \$\endgroup\$
    – Sam Estep
    Commented Aug 11, 2017 at 19:14
  • \$\begingroup\$ For Corner case #3, I definitely agree that the current behavior is undesirable; the "dimensions" of the returned string should not depend on whether an overlap is present. Out of the possible solutions you present, I think the second one (overwrite the the end of the first string with the second one) is the best. \$\endgroup\$
    – Sam Estep
    Commented Aug 11, 2017 at 19:14
  • \$\begingroup\$ For Corner case #4, that's an interesting behavior, and on the one hand your solution might be desirable for robustness purposes, but on the other hand, in my definition of "tree string", I specified that the root must be present at [0 0], so {[4 10] "a"} does not qualify as a tree string, and vert-gap specifies that both of its arguments are tree strings. \$\endgroup\$
    – Sam Estep
    Commented Aug 11, 2017 at 19:14
  • \$\begingroup\$ For Corner case #5, I like your idea of adding semantics to the signum of length! Do you think that could possibly remove the need for the direction and character parameters as well? \$\endgroup\$
    – Sam Estep
    Commented Aug 11, 2017 at 19:14
  • \$\begingroup\$ @SamEstep: re #5: I don't think that we can pack those infos into length, because the signum of length is now the vertical direction, while direction is the horizontal one. Unless length is converted to some kind of structure, but that will still need all three pieces of information IMHO. \$\endgroup\$
    – Attilio
    Commented Aug 12, 2017 at 11:54

Not the answer you're looking for? Browse other questions tagged or ask your own question.