3
$\begingroup$

I was inspired by this awesome puzzle. Here is an image of a word tree borrowed from there:

enter image description here

In a word tree every path from the root to the leaves must form a distinct word. The size of the tree is the number of distinct words it forms in this way. The size of this tree is 16 as it forms 16 distinct words: BLOOM, BLOOD, BLOWN, BLOWS, BLAND, BLANK, BLASÉ, BLAST, BROOK, BROOM, BROWN, BROWS, BRAID, BRAIN, BRASH, BRASS. Note that words that do not reach the leaves are not counted, so for example we do not count the words BLOW, BROW and BRA. You must use words from this dictionary. Every branch must split into exactly two (or zero) other branches above it. The two branches may use the same letter. The leaves can be at different distances from the root. So we can extend the BRAIN branch into BRAINY and BRAINS, thus increasing the size of this tree to 17. What is the size of the largest word tree possible? You may use a computer to find the answer.

$\endgroup$
8
  • 3
    $\begingroup$ This feels extremely derivative, with little thought towards the actual solving process. Which words are allowed? Can the two branches contain the same letter? Is there some puzzle element to this programming exercise? $\endgroup$
    – Bass
    Commented Jul 27, 2022 at 6:38
  • $\begingroup$ Thank you both. I have added a dictionary of words. The two branches can contain the same letter, provided that the final words are all distinct. $\endgroup$ Commented Jul 27, 2022 at 7:10
  • $\begingroup$ The puzzle element is how to solve this programming exercise that has a vast search space. I have some ideas, but I am looking forward to seeing your thoughts. Incremental solutions are acceptable. So currently we know that size 17 is possible. Can we get size 18? $\endgroup$ Commented Jul 27, 2022 at 7:17
  • $\begingroup$ I am also not ruling out manually constructed solutions, which would involve a puzzle element. $\endgroup$ Commented Jul 27, 2022 at 8:58
  • 1
    $\begingroup$ A trivial extension: blasts and blasty. Also, why is your dictionary slightly out of order? $\endgroup$
    – A username
    Commented Jul 27, 2022 at 10:29

1 Answer 1

7
+50
$\begingroup$

I wrote a computer progam for this problem. It found the following tree:

This tree with root S contains 67 words.

squabbed
squabber
squabbled
squabbles
squabbly
squabs
squalled
squallers
squallery
squally
squalm
squeaked
squeakers
squeakery
squeaks
squeald
squeals
squeel
squeezed
squeezes
squeezy
squadded
squadder
squaddy
squads
squawks
squawky
squaws
squirms
squirmy
squirts
squirty
squished
squishes
squishy
squiss
stagers
stagery
stagey
stagged
staggers
staggery
staggy
starken
starker
starkle
starkly
startled
startles
startly
starty
striped
stripes
stripy
strived
striven
strivy
strobed
strobes
strobic
strobilae
strobilar
strobils
stropped
stropper
stroppy
strops

Note that this is not quite in alphabetical order because SQ has two children of letter U, the first with grandchildren A&E and the second with A&I.

The program works relatively straightforwardly. First it puts all the words in a tree structure. Then it prunes all the branches that have a node with only one branch. Then it prunes away all the smallest surplus branches until there are no more than two branches at each node. There is the slight complication that a node with 4 branches can be split into two branches with the same letter, so that has to be done first before pruning away.

Here's my C# program:

  using System.Collections.Generic;
  using System.Linq;
  using System.Text;

  namespace TempProg
  {
     static class PSEWordTree
     {
        public class LetterNode
        {
           public char Letter;
           public List<LetterNode> Children = new List<LetterNode>();

           internal void Add(string word)
           {
              if( word==null || word.Length == 0) return;
              char firstLetter = word.First();
              LetterNode child = Children.FirstOrDefault(c => c.Letter == firstLetter);
              if (child == null)
              {
                 child = new LetterNode();
                 child.Letter = firstLetter;
                 Children.Add(child);
              }
              child.Add(word.Substring(1));
           }

           // Prune any nodes with only one child
           internal void Prune1()
           {
              for (int i = Children.Count - 1; i >= 0; i--)
              {
                 var c = Children[i];
                 c.Prune1();
                 if (c.Children.Count == 1 && Children.Count > 1) Children.RemoveAt(i);
              }
           }

           // Prune surplus branches
           internal void Prune2()
           {
              for( int i=Children.Count - 1; i>=0; i--)
              {
                 var c = Children[i];
                 if (c.Children.Count == 0) continue;

                 c.Prune2();
                 c.Children = c.Children.OrderByDescending(c2 => c2.NumLeaves()).ToList();
                 if (c.Children.Count > 4) c.Children.RemoveRange(4, c.Children.Count - 4);
                 if (c.Children.Count == 4)
                 {
                    var c2 = new LetterNode();
                    c2.Letter = c.Letter;
                    c2.Children = c.Children.GetRange(2, 2).ToList();
                    c.Children.RemoveRange(2, 2);
                    Children.Insert(i + 1, c2);
                 }
                 else if (c.Children.Count == 3)
                 {
                    c.Children.RemoveAt(2);
                 }
              }
           }

           internal int NumLeaves()
           {
              if (Children.Count == 0) return 1;
              return Children.Select(c => c.NumLeaves()).Sum();
           }

           internal List<string> GetWords()
           {
              if(Children.Count == 0)
              {
                 return new List<string> { Letter.ToString() };
              }
              var w = Children.OrderBy(c => c.Letter).SelectMany(c => c.GetWords()).Select(s => Letter + s);
              return w.ToList();
           }

           public override string ToString()
           {
              StringBuilder s = new StringBuilder();
              foreach(var w in GetWords())
              {
                 s.AppendLine(w);
              }
              return s.ToString();
           }

        }

        public static void Main()
        {
           var root = new LetterNode();

           foreach (string line in System.IO.File.ReadLines(@"E:\temp\words_alpha.txt"))
           {
              root.Add(line);
           }
           root.Prune1();
           root.Prune2();

           foreach (var c in root.Children)
           {
              System.Console.WriteLine("For letter {0} the tree has {1} words.", c.Letter, c.NumLeaves());
           }
           System.Console.WriteLine();

           foreach (var c in root.Children)
           {
              System.Console.WriteLine("For letter {0} the tree has {1} words.", c.Letter, c.NumLeaves());
              System.Console.WriteLine(c);
           }
        }

     }
  }
$\endgroup$
13
  • 1
    $\begingroup$ This is excellent work! Quite a large tree too. I assume this is optimal? Does the pruning order matter? $\endgroup$ Commented Jul 27, 2022 at 12:53
  • 1
    $\begingroup$ @DmitryKamenetsky This should be optimal, barring any errors or bugs. When a node has say 3 branches, you want to prune away the one with fewest words/leaves. The number of words can only be accurately determined if all the sub-branches have been pruned to form a valid word tree (i.e. a binary tree). So the pruning must be performed from the leaves towards the root. The order of pruning amongst branches at the same level does not matter. I did the pruning in two stages (single branches first, then surplus branches) but only because I was having trouble combining those in a single pass. $\endgroup$ Commented Jul 27, 2022 at 13:08
  • $\begingroup$ hang on I am not understanding something. Every word should have a partner word that has the same substring except the last letter. But the last two words in your list (stroppy and strops) don't have partners. Perhaps you are allowing some branches to have a single child? $\endgroup$ Commented Jul 27, 2022 at 13:57
  • $\begingroup$ @DmitryKamenetsky To take your example: "The leaves can be at different distances from the root. So we can extend the BRAIN branch into BRAINY and BRAINS". This causes BRAID to be unpartnered since BRAIN is replaced by longer words, i.e. BRAID is now partnered with BRAIN*. Similarly STROPS is partnered with STROPP*. $\endgroup$ Commented Jul 27, 2022 at 14:00
  • 1
    $\begingroup$ Does the program build a tree or a trie? XP $\endgroup$
    – melfnt
    Commented Jul 31, 2022 at 16:21

Not the answer you're looking for? Browse other questions tagged or ask your own question.