Enumerate binary trees

Question

Binary trees

A binary tree is a tree with nodes of three types:

terminal nodes, which have no children
unary nodes, which have one child each
binary nodes, which have two children each

We can represent them with the following grammar, given in BNF (Backus–Naur form):

<e> ::= 
      <terminal>   
    | <unary>
    | <binary>

<terminal> ::= 
    "0"

<unary> ::= 
    "(1" <e> ")"

<binary> ::= 
    "(2" <e> " " <e> ")"

In this grammar the nodes are given in preorder and each node is represented by a digit which is the number of children it has.

Motzkin numbers

Motzkin numbers (OEIS) (Wikipedia) have many interpretations, but one interpretation is that the nth Motzkin number is the number of distinct binary trees with n nodes. A table of Motzkin numbers starts

N          Motzkin number M(N)
1          1
2          1
3          2 
4          4 
5          9 
6         21 
7         51 
8        127 
    ...

e.g. M(5) is 9, and the nine distinct binary trees with 5 nodes are

1      (1 (1 (1 (1 0))))  
2      (1 (1 (2 0 0)))  
3      (1 (2 0 (1 0)))  
4      (1 (2 (1 0) 0))  
5      (2 0 (1 (1 0)))  
6      (2 0 (2 0 0))  
7      (2 (1 0) (1 0))  
8      (2 (1 (1 0)) 0)  
9      (2 (2 0 0) 0)

Task

Take a single positive integer n as input and output all of the distinct binary trees with n nodes.

Examples for n from 1 to 5 with parenthesis included for readability

0

(1 0)

(1 (1 0))
(2 0 0)

(1 (1 (1 0)))
(1 (2 0 0))
(2 0 (1 0))
(2 (1 0) 0)

(1 (1 (1 (1 0))))
(1 (1 (2 0 0)))
(1 (2 0 (1 0)))
(1 (2 (1 0) 0))
(2 0 (1 (1 0)))
(2 0 (2 0 0))
(2 (1 0) (1 0))
(2 (1 (1 0)) 0)
(2 (2 0 0) 0)

Input

The input will be one positive integer.

Output

The output should be an intelligible representation of the distinct binary trees with that many nodes. It is not compulsory to use the exact string given by the BNF grammar above: it is sufficient that the syntax used give an unambiguous representation of the trees. E.g. you could use [] instead of (), an extra level of brackets [[]] instead of [], outer parenthesis are present or missing, extra commas or no commas, extra spaces, parenthesis or no parenthesis, etc.

All of these are equivalent:

(1 (2 (1 0) 0))  
[1 [2 [1 0] 0]]  
1 2 1 0 0  
12100  
(1 [2 (1 0) 0])  
.:.--  
*%*55  
(- (+ (- 1) 1))
-+-11

Also a variation purposed by @xnor in a comment. Since there is a way to translate this to a format that can be understood it is acceptable.

[[[]][]]  is (2 (1 0) 0)

To make this easier to understand convert some of the [] to () like so

[([])()]

Now if you start with

[]

then insert a binary which needs two expressions you get

 [()()] which is 2

and then for the first () insert a unary which needs one expression you get

 [([])()] which is 21

but since [] or () with no inner bracketing can represent 0 which needs no more expressions you can interpret it as

Note that answers should work theoretically with infinite memory, but will obviously run out of memory for an implementation-dependent finite input.

Variations of output

BNF             xnor       Christian   Ben
b(t, b(t, t))   [{}{{}{}}] (0(00))     (1, -1, 1, -1)                         
b(t, u(u(t)))   [{}{(())}] (0((0)))    (1, -1, 0, 0)           
b(u(t), u(t))   [{()}{()}] ((0)(0))    (1, 0, -1, 0)                     
b(b(t, t), t)   [{{}{}}{}] ((00)0)     (1, 1, -1, -1)              
b(u(u(t)), t)   [{(())}{}] (((0))0)    (1, 0, 0, -1)                          
u(b(t, u(t)))   [({}{()})] ((0(0)))    (0, 1, -1, 0)                          
u(b(u(t), t))   [({()}{})] (((0)0))    (0, 1, 0, -1)                        
u(u(b(t, t)))   [(({}{}))] (((00)))    (0, 0, 1, -1)                          
u(u(u(u(t))))   [(((())))] ((((0))))   (0, 0, 0, 0)

A possible place to check for duplicate trees

One place to check for a duplicate is with M(5).
This one tree was generated twice for M(5) from M(4) trees

(2 (1 0) (1 0))

the first by adding a unary branch to

(2 (1 0) 0)

and second by adding a unary branch to

(2 0 (1 0))

Understanding BNF

BNF is composed of simple rules:

<symbol> ::= expression

where on the left is a symbol name surrounded by <>.
On the right is the expression for constructing the symbol. Some rules use other rules in the construction, e.g.

<e> ::= <terminal>

e can be a terminal

and some rules have characters that are used in constructing the symbol, e.g.

<terminal> ::= "0"

terminal is just the character zero.

Some rules have multiple ways of constructing them, e.g.

<e> ::= 
      <terminal>   
    | <unary>
    | <binary>

An e can be a <terminal> or a <unary> or a <binary>.

And some rules are a sequence of parts, e.g.

<unary> ::= "(1" <e> ")"

A unary is the characters (1 followed by what can be constructed for e followed by ).

You always start with the starting rule, which for this <e>.

Some simple examples:

The simplest sequence is just 0. So we start with the starting rule <e> and see that there are three choices:

  <terminal>   
| <unary>
| <binary>

so take the first one <terminal>. Now a terminal has no choices and is 0. So replace <terminal> with 0 in the <e> rule and you are done.

Then next one is (1 0). Start with <e> and use rule <unary> which has

"(1" <e> ")"

Now this needs an <e> so we go back to <e> and make a choice of one of the three, this time choosing, <terminal> which gives 0. Replacing 0 into (1 <e> ) gives (1 0), and this is replaced into <unary> so <e> is (1 0).

So, a binary tree? "a binary tree is a tree data structure in which each node has at most two children" — fəˈnɛtɪk, Commented Mar 14, 2017 at 14:19
Your description is that of a binary tree. Binary trees do not need to have 2 children. It just means they have at most 2 children. I guess unary-binary is just a more specific term which doesn't really mean anything different. — fəˈnɛtɪk, Commented Mar 14, 2017 at 14:22
Consider clarifying what "BNF" is for those of us who are not computer scientists — Luis Mendo, Commented Mar 14, 2017 at 14:46
@GuyCoder My point is, if someone sees "BNF" and doesn't know what that means they may be put off and stop reading. Perhaps using the name instead of the acronym and adding a link to the Wikipedia would suffice — Luis Mendo, Commented Mar 14, 2017 at 15:35
@mbomb007 Name changed. I should get a peer-pressure award for that. :) — Guy Coder, Commented Mar 14, 2017 at 16:43

Christian Sievers · Accepted Answer · 2017-03-14 17:09:56Z

13

Haskell, 68 bytes

t 0=[""]
t 1=["0"]
t n=['(':x++y++")"|k<-[1..n-1],x<-t k,y<-t$n-k-1]

Terminal nodes are represented by 0, unary and binary nodes by (e) resp. (ee), so the two three-node trees are given as (00) and ((0)).

Examples:

*Main> t 5
["(0(00))","(0((0)))","((0)(0))","((00)0)","(((0))0)","((0(0)))","(((0)0))","(((00)))","((((0))))"]
*Main> length $ t 8
127
*Main> length $ t 15
113634

answered Mar 14, 2017 at 17:09

Christian Sievers

7,1301 gold badge18 silver badges25 bronze badges

Add a comment |

Peter Taylor · Accepted Answer · 2017-03-14 16:55:10Z

5

CJam (37 bytes)

0aa{_2m*2\f+1Y$f+++:e__&}qi:A*{,A=},p

Online demo. Note that this is not very efficient, and you probably don't want to try calculating input 5 online.

Dissection to follow.

answered Mar 14, 2017 at 16:55

Peter Taylor

43.1k4 gold badges70 silver badges168 bronze badges

Add a comment |

Community · Accepted Answer · 2017-04-13 13:15:04Z

5

Pyth (24 21 19 bytes)

This is based on my Python 3 solution.

f!|sTf<sY0._T^}1_1t

It's my first time using Pyth so this is likely still golfable.

Example, output when input is 4:

[[1, 0, -1], [1, -1, 0], [0, 1, -1], [0, 0, 0]]

1 represents a binary node, 0 represents a unary node, and -1 represents a terminal node. There is an implied terminal node at the end of every tree.

Explanation:

f!|sTf<sY0._T^}1_1t
f                    filter
             ^    t  length n-1 lists of elements
              }1_1   from [1, 0, -1]
 !|                  for when both
   sT                sum of list is 0, and
     f    ._T        for each prefix of list,
      <sY0           sum of prefix is non-negative.

edited Apr 13, 2017 at 13:15

CommunityBot

1

answered Mar 14, 2017 at 21:11

Ben Frankel

3011 silver badge6 bronze badges

\$\begingroup\$ Of interest: Pyth source code \$\endgroup\$
– Guy Coder
Commented Mar 15, 2017 at 10:20

Add a comment |

Mitch Schwartz · Accepted Answer · 2017-05-05 13:10:21Z

brainfuck, 107 bytes

,>++>-[-[<-[<-[>>[[>+<-]<]>+>[[<+>>>>>+<<<<-]>]>>++>,++++>]>[<+>-[+>>]>[<->[.<<<
<<]+[->+]+>>>]]]]<[[,<]<]<]

Formatted:

,>++>-
[
  -
  [
    <-
    [
      <-
      [
        >>
        [[>+<-]<]
        >+>
        [[<+> >>>>+<<<<-]>]
        >>++>,++++>
      ]
      >
      [
        <+>-
        [
          +>>
        ]
        >
        [
          <->[.<<<<<]
          +[->+]
          +>>>
        ]
      ]
    ]
  ]
  <
  [
    [,<]
    <
  ]
  <
]

Try it online

Input is taken as a byte, and the tree 12100 is represented as \x01\x02\x03\x02: to convert back, translate tr/\x01\x02\x03/012/, reverse the string, and add a final 0. Trees are separated by \xfe. (The output can be made easier to read by e.g. changing the first - into -36 and the . into +47.-47, where -36 means a string of 36 - characters, etc.)

This approach makes use of the property (which Ben Frankel also used): considering the possible nodes as -1, 0, 1 and disregarding the final -1 node, a list represents a valid tree if and only if (1) all prefixes of the list have non-negative sum, and (2) the sum of the entire list equals 0. The first condition is maintained while generating intermediate nodes, so only the second condition needs to be checked at the end.

The tape is divided into 5-node cells,

i d x 0 0

where i is the index (descending from left to right), d is the partial sum, and x is the element.

Sketch of control flow:

take n and push initial node
while stack is non-empty:
    if rightmost node can be decremented:
        decrement rightmost node
        if there are less than n nodes:
            push new node
        else if valid tree:
            print
    else:
        backtrack (pop)

Note that sometimes a value is stored or initialized as one or two greater than the actual (conceptual) value and adjusted as needed.

Ben Frankel · Accepted Answer · 2017-03-14 19:15:50Z

3

Python 3 (138 134 128 121 119 bytes)

from itertools import*
lambda n:[any(sum(t[:k])<0for k in range(n))|sum(t)or print(t)for t in product(*[[-1,0,1]]*~-n)]

Example output, for n=5:

(0, 0, 0, 0)
(0, 0, 1, -1)
(0, 1, -1, 0)
(0, 1, 0, -1)
(1, -1, 0, 0)
(1, -1, 1, -1)
(1, 0, -1, 0)
(1, 0, 0, -1)
(1, 1, -1, -1)

1 represents a binary node, 0 represents a unary node, and -1 represents a terminal node. There is an implied terminal node at the end of every tree.

The program starts taking too long at around n=17.

edited Mar 14, 2017 at 19:15

answered Mar 14, 2017 at 18:12

Ben Frankel

3011 silver badge6 bronze badges

Add a comment |

Neil · Accepted Answer · 2017-12-27 11:15:28Z

3

JavaScript (Firefox 30-57), 79 bytes

f=(m,l=0)=>m?[for(n of[1,0,-1])if(l>n&l<=m+n)for(a of f(m-1,l-n))[...a,n]]:[[]]

Where -1 represents a terminal, 0 a unary node and 1 a binary node. Starts getting slow at m=14 on my PC. Recursively works back from the end of the tree.

The number of unaccounted-for nodes l is limited by the fact that there may only be 1 node left at the end.
The type of the next node n is limited by the need to have enough unaccounted-for nodes to be its children.

edited Dec 27, 2017 at 11:15

answered Mar 15, 2017 at 10:00

Neil

173k12 gold badges72 silver badges276 bronze badges

Add a comment |

Guy Coder · Accepted Answer · 2017-03-15 12:05:36Z

Prolog, 149 144 138 137 131 107 bytes

e(L,L)-->[0].

e([_|A],L)--> 
    [1],
    e(A,L).

e([_,_|A],L)--> 
    [2],
    e(A,B), 
    e(B,L).

e(M,E):-                   
    length([_|L],M),        
    e(L,[],E,[]).           

?- e(5,S).
S = [1, 1, 1, 1, 0] ;
S = [1, 1, 2, 0, 0] ;
S = [1, 2, 0, 1, 0] ;
S = [1, 2, 1, 0, 0] ;
S = [2, 0, 1, 1, 0] ;
S = [2, 0, 2, 0, 0] ;
S = [2, 1, 0, 1, 0] ;
S = [2, 1, 1, 0, 0] ;
S = [2, 2, 0, 0, 0].

And to count the solutions

e_count(N,Count) :-
    length([_|Ls], N),
    findall(., phrase(e(Ls,[]),E), Sols),
    length(Sols, Count).

?- e_count(N,Count).
N = Count, Count = 1 ;
N = 2, Count = 1 ;
N = 3, Count = 2 ;
N = Count, Count = 4 ;
N = 5, Count = 9 ;
N = 6, Count = 21 ;
N = 7, Count = 51 ;
N = 8, Count = 127 ;
N = 9, Count = 323 ;
N = 10, Count = 835 ;
N = 11, Count = 2188 ;
N = 12, Count = 5798 ;
N = 13, Count = 15511 ;
N = 14, Count = 41835 ;
N = 15, Count = 113634

Mitch Schwartz · Accepted Answer · 2017-04-09 02:35:46Z

1

Python, 71 bytes

f=lambda n:{(a+b,)for k in range(n)for a in f(k)for b in f(n+~k)}or[()]

This represents trees as nested tuples such as ((((),), ()),), which can be transformed into ((())()) by removing commas, spaces, and the outermost ().

An earlier 76-byte version:

f=lambda n:{'('+a+b+')'for k in range(n)for a in f(k)for b in f(n+~k)}or['']

edited Apr 9, 2017 at 2:35

answered Apr 8, 2017 at 21:09

Mitch Schwartz

5,8591 gold badge26 silver badges26 bronze badges

Add a comment |

Esolanging Fruit · Accepted Answer · 2017-05-05 02:09:06Z

1

CJam, 38 bytes

Uses a different approach that Peter Taylor's CJam answer.

3rim*{:(1\+[{1$+}*])\:(_:z#|!},

Output will be something like 1110120020102100. Each tree is a group of n digits (where n is the input number).

The basic idea is that we generate each possible string of digits 0, 1, and 2, and then filter only the ones that are well-formed trees.

answered May 5, 2017 at 2:09

Esolanging Fruit

15.5k4 gold badges48 silver badges92 bronze badges

Add a comment |

Stack Exchange Network

Enumerate binary trees

Binary trees

Motzkin numbers

Task

Input

Output

Variations of output

A possible place to check for duplicate trees

Understanding BNF

9 Answers 9

Haskell, 68 bytes

CJam (37 bytes)

Pyth (24 21 19 bytes)

brainfuck, 107 bytes

Python 3 (138 134 128 121 119 bytes)

JavaScript (Firefox 30-57), 79 bytes

Prolog, 149 144 138 137 131 107 bytes

Python, 71 bytes

CJam, 38 bytes

Not the answer you're looking for? Browse other questions tagged
code-golf
graph-theory
binary-tree
or ask your own question.

Hot Network Questions

Enumerate binary trees

Binary trees

Motzkin numbers

Task

Input

Output

Variations of output

A possible place to check for duplicate trees

Understanding BNF

9 Answers 9

Haskell, 68 bytes

CJam (37 bytes)

Pyth (24 21 19 bytes)

brainfuck, 107 bytes

Python 3 (138 134 128 121 119 bytes)

JavaScript (Firefox 30-57), 79 bytes

Prolog, 149 144 138 137 131 107 bytes

Python, 71 bytes

CJam, 38 bytes

Not the answer you're looking for? Browse other questions tagged code-golfgraph-theorybinary-tree or ask your own question.

Related

Hot Network Questions

Not the answer you're looking for? Browse other questions tagged
code-golf
graph-theory
binary-tree
or ask your own question.