19
\$\begingroup\$

I was interested in seeing the answers to this (now defunct) question, but it has never been corrected/improved.

Given a set of 6-sided Boggle dice (configuration stolen from this question), determine in two minutes of processing time which board configuration will allow for the highest possible score. (i.e. which dice in which location with which side up allows for the largest scoring word pool?)


OBJECTIVE

  • Your code should run for no more than 2 minutes (120 seconds). At that time, it should automatically stop running and print the results.

  • Final challenge score will be the average Boggle score of 5 runs of the program.

    • In the event of a tie, the winner will be whichever algorithm found more words.
    • In the event there is still a tie, the winner will be whichever algorithm found more long (8+) words.

RULES/CONSTRAINTS

  • This is a code challenge; code length is irrelevant.

  • Please refer to this link for a word list (use ISPELL "english.0" list - the SCOWL list is missing some pretty common words).

    • This listing may be referred to/imported/read in your code any way you wish.
    • Only words matched by the regex ^([a-pr-z]|qu){3,16}$ will be counted. (Only lowercase letters, 3-16 characters, qu must be used as a unit.)
  • Words are formed by linking adjacent letters (horizontal, vertical, and diagonal) to spell words out in the correct order, without using a single die more than once in a single words.

    • Words must be 3 letters or longer; shorter words will earn no points.
    • Duplicated letters are acceptable, just not dice.
    • Words that span the edges/cross over from one side of the board to the other are not allowed.
  • The final Boggle (not challenge) score is the total of the point values for all the words that are found.

    • The point value assigned for each word is based on the length of the word. (see below)
    • Normal Boggle rules would deduct/discount words found by another player. Assume here no other players are involved and all found words count towards the total score.
    • However, words found more than once in the same grid should only be counted once.
  • Your function/program must FIND the optimal arrangement; simply hard-coding a predetermined list won't do.

  • Your output should be a 4x4 grid of your ideal game board, a list of all the found words for that board, and the Boggle score to match those words.


DIE CONFIGURATION

A  A  E  E  G  N
E  L  R  T  T  Y
A  O  O  T  T  W
A  B  B  J  O  O
E  H  R  T  V  W
C  I  M  O  T  U
D  I  S  T  T  Y
E  I  O  S  S  T
D  E  L  R  V  Y
A  C  H  O  P  S
H  I  M  N  Qu U
E  E  I  N  S  U
E  E  G  H  N  W
A  F  F  K  P  S
H  L  N  N  R  Z
D  E  I  L  R  X

STANDARD BOGGLE SCORING TABLE

Word length => Points
<= 2 - 0 pts
   3 - 1  
   4 - 1  
   5 - 2  
   6 - 3  
   7 - 5
>= 8 - 11 pts
*Words using the "Qu" die will count the full 2 letters for their word, not just the 1 die.

EXAMPLE OUTPUT

A  L  O  J  
V  U  T  S  
L  C  H  E  
G  K  R  X

CUT
THE
LUCK
HEX
....

140 points

If any further clarification is needed, please ask!

\$\endgroup\$
20
  • 2
    \$\begingroup\$ I would prefer to have a dictionary provided to standardize the objective. Note also that this is not a new idea, as a simple google search will reveal :) The highest score I've seen is 4527 (1414 total words), found here: ai.stanford.edu/~chuongdo/boggle/index.html \$\endgroup\$
    – mellamokb
    Commented Apr 26, 2012 at 20:22
  • 4
    \$\begingroup\$ Is the program required to terminate this century? \$\endgroup\$ Commented Apr 26, 2012 at 21:52
  • 1
    \$\begingroup\$ @GlitchMr In English, Q is typically only ever used with U. Boggle accounts for this by putting the two letters on the same die as one unit. \$\endgroup\$
    – Gaffi
    Commented Apr 28, 2012 at 13:31
  • 1
    \$\begingroup\$ The word list specification is unclear. Are you counting only those words listed in english.0 in lower case? (Standard word game rules exclude abbreviations/initialisms and proper nouns). \$\endgroup\$ Commented Jun 8, 2012 at 16:30
  • 1
    \$\begingroup\$ I was thinking regex ^([a-pr-z]|qu){3,16}$ (which would incorrectly exclude 3-letter words with qu, but there aren't any). \$\endgroup\$ Commented Jun 8, 2012 at 17:00

4 Answers 4

9
\$\begingroup\$

C, averaging 500+ 1500 1750 points

This is a relatively minor improvement over version 2 (see below for notes on previous versions). There are two parts. First: Instead of selecting boards randomly from the pool, the program now iterates over every board in the pool, using each one in turn before returning to the top of the pool and repeating. (Since the pool is being modified while this iteration occurs, there will still be boards that get chosen twice in a row, or worse, but this is not a serious concern.) The second change is that the program now tracks when the pool changes, and if the program goes for too long without improving the pool's contents, it determines that the search has "stalled", empties the pool, and starts over with a new search. It continues to do this until the two minutes are up.

I had initially thought that I would be employing some kind of heuristic search to get beyond the 1500-point range. @mellamokb's comment about a 4527-point board led me to assume that there was plenty of room for improvement. However, we are using a relatively small wordlist. The 4527-point board was scoring using YAWL, which is the most inclusive wordlist out there -- it's even bigger than the official US Scrabble wordlist. With this in mind, I re-examined the boards my program had found and noticed that there appeared to be a limited set of boards above 1700 or so. So for example, I had multiple runs that had discovered a board scoring 1726, but it was always the exact same board that was found (ignoring rotations and reflections).

As another test, I ran my program using YAWL as the dictionary, and it found the 4527-point board after about a dozen runs. Given this, I am hypothesizing that my program is already at the upper limit of the search space, and therefore the rewrite I was planning would introduce extra complexity for very little gain.

Here is my list of the five highest-scoring boards my program has found using the english.0 wordlist:

1735 :  D C L P  E I A E  R N T R  S E G S
1738 :  B E L S  R A D G  T I N E  S E R S
1747 :  D C L P  E I A E  N T R D  G S E R
1766 :  M P L S  S A I E  N T R N  D E S G
1772:   G R E P  T N A L  E S I T  D R E S

My belief is that the 1772 "grep board" (as I've taken to calling it), with 531 words, is the highest-scoring board possible with this wordlist. Over 50% of my program's two-minute runs end with this board. I've also left my program running overnight without it finding anything better. So if there is a higher-scoring board, it would likely have to have some aspect that defeats the program's search technique. A board in which every possible small change to the layout causes a huge drop in the total score, for example, might never be discovered by my program. My hunch is that such a board is very unlikely to exist.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include <time.h>

#define WORDLISTFILE "./english.0"

#define XSIZE 4
#define YSIZE 4
#define BOARDSIZE (XSIZE * YSIZE)
#define DIEFACES 6
#define WORDBUFSIZE 256
#define MAXPOOLSIZE 32
#define STALLPOINT 64
#define RUNTIME 120

/* Generate a random int from 0 to N-1.
 */
#define random(N)  ((int)(((double)(N) * rand()) / (RAND_MAX + 1.0)))

static char const dice[BOARDSIZE][DIEFACES] = {
    "aaeegn", "elrtty", "aoottw", "abbjoo",
    "ehrtvw", "cimotu", "distty", "eiosst",
    "delrvy", "achops", "himnqu", "eeinsu",
    "eeghnw", "affkps", "hlnnrz", "deilrx"
};

/* The dictionary is represented in memory as a tree. The tree is
 * represented by its arcs; the nodes are implicit. All of the arcs
 * emanating from a single node are stored as a linked list in
 * alphabetical order.
 */
typedef struct {
    int letter:8;   /* the letter this arc is labelled with */
    int arc:24;     /* the node this arc points to (i.e. its first arc) */
    int next:24;    /* the next sibling arc emanating from this node */
    int final:1;    /* true if this arc is the end of a valid word */
} treearc;

/* Each of the slots that make up the playing board is represented
 * by the die it contains.
 */
typedef struct {
    unsigned char die;      /* which die is in this slot */
    unsigned char face;     /* which face of the die is showing */
} slot;

/* The following information defines a game.
 */
typedef struct {
    slot board[BOARDSIZE];  /* the contents of the board */
    int score;              /* how many points the board is worth */
} game;

/* The wordlist is stored as a binary search tree.
 */
typedef struct {
    int item: 24;   /* the identifier of a word in the list */
    int left: 16;   /* the branch with smaller identifiers */
    int right: 16;  /* the branch with larger identifiers */
} listnode;

/* The dictionary.
 */
static treearc *dictionary;
static int heapalloc;
static int heapsize;

/* Every slot's immediate neighbors.
 */
static int neighbors[BOARDSIZE][9];

/* The wordlist, used while scoring a board.
 */
static listnode *wordlist;
static int listalloc;
static int listsize;
static int xcursor;

/* The game that is currently being examined.
 */
static game G;

/* The highest-scoring game seen so far.
 */
static game bestgame;

/* Variables to time the program and display stats.
 */
static time_t start;
static int boardcount;
static int allscores;

/* The pool contains the N highest-scoring games seen so far.
 */
static game pool[MAXPOOLSIZE];
static int poolsize;
static int cutoffscore;
static int stallcounter;

/* Some buffers shared by recursive functions.
 */
static char wordbuf[WORDBUFSIZE];
static char gridbuf[BOARDSIZE];

/*
 * The dictionary is stored as a tree. It is created during
 * initialization and remains unmodified afterwards. When moving
 * through the tree, the program tracks the arc that points to the
 * current node. (The first arc in the heap is a dummy that points to
 * the root node, which otherwise would have no arc.)
 */

static void initdictionary(void)
{
    heapalloc = 256;
    dictionary = malloc(256 * sizeof *dictionary);
    heapsize = 1;
    dictionary->arc = 0;
    dictionary->letter = 0;
    dictionary->next = 0;
    dictionary->final = 0;
}

static int addarc(int arc, char ch)
{
    int prev, a;

    prev = arc;
    a = dictionary[arc].arc;
    for (;;) {
        if (dictionary[a].letter == ch)
            return a;
        if (!dictionary[a].letter || dictionary[a].letter > ch)
            break;
        prev = a;
        a = dictionary[a].next;
    }
    if (heapsize >= heapalloc) {
        heapalloc *= 2;
        dictionary = realloc(dictionary, heapalloc * sizeof *dictionary);
    }
    a = heapsize++;
    dictionary[a].letter = ch;
    dictionary[a].final = 0;
    dictionary[a].arc = 0;
    if (prev == arc) {
        dictionary[a].next = dictionary[prev].arc;
        dictionary[prev].arc = a;
    } else {
        dictionary[a].next = dictionary[prev].next;
        dictionary[prev].next = a;
    }
    return a;
}

static int validateword(char *word)
{
    int i;

    for (i = 0 ; word[i] != '\0' && word[i] != '\n' ; ++i)
        if (word[i] < 'a' || word[i] > 'z')
            return 0;
    if (word[i] == '\n')
        word[i] = '\0';
    if (i < 3)
        return 0;
    for ( ; *word ; ++word, --i) {
        if (*word == 'q') {
            if (word[1] != 'u')
                return 0;
            memmove(word + 1, word + 2, --i);
        }
    }
    return 1;
}

static void createdictionary(char const *filename)
{
    FILE *fp;
    int arc, i;

    initdictionary();
    fp = fopen(filename, "r");
    while (fgets(wordbuf, sizeof wordbuf, fp)) {
        if (!validateword(wordbuf))
            continue;
        arc = 0;
        for (i = 0 ; wordbuf[i] ; ++i)
            arc = addarc(arc, wordbuf[i]);
        dictionary[arc].final = 1;
    }
    fclose(fp);
}

/*
 * The wordlist is stored as a binary search tree. It is only added
 * to, searched, and erased. Instead of storing the actual word, it
 * only retains the word's final arc in the dictionary. Thus, the
 * dictionary needs to be walked in order to print out the wordlist.
 */

static void initwordlist(void)
{
    listalloc = 16;
    wordlist = malloc(listalloc * sizeof *wordlist);
    listsize = 0;
}

static int iswordinlist(int word)
{
    int node, n;

    n = 0;
    for (;;) {
        node = n;
        if (wordlist[node].item == word)
            return 1;
        if (wordlist[node].item > word)
            n = wordlist[node].left;
        else
            n = wordlist[node].right;
        if (!n)
            return 0;
    }
}

static int insertword(int word)
{
    int node, n;

    if (!listsize) {
        wordlist->item = word;
        wordlist->left = 0;
        wordlist->right = 0;
        ++listsize;
        return 1;
    }

    n = 0;
    for (;;) {
        node = n;
        if (wordlist[node].item == word)
            return 0;
        if (wordlist[node].item > word)
            n = wordlist[node].left;
        else
            n = wordlist[node].right;
        if (!n)
            break;
    }

    if (listsize >= listalloc) {
        listalloc *= 2;
        wordlist = realloc(wordlist, listalloc * sizeof *wordlist);
    }
    n = listsize++;
    wordlist[n].item = word;
    wordlist[n].left = 0;
    wordlist[n].right = 0;
    if (wordlist[node].item > word)
        wordlist[node].left = n;
    else
        wordlist[node].right = n;
    return 1;
}

static void clearwordlist(void)
{
    listsize = 0;
    G.score = 0;
}


static void scoreword(char const *word)
{
    int const scoring[] = { 0, 0, 0, 1, 1, 2, 3, 5 };
    int n, u;

    for (n = u = 0 ; word[n] ; ++n)
        if (word[n] == 'q')
            ++u;
    n += u;
    G.score += n > 7 ? 11 : scoring[n];
}

static void addwordtolist(char const *word, int id)
{
    if (insertword(id))
        scoreword(word);
}

static void _printwords(int arc, int len)
{
    int a;

    while (arc) {
        a = len + 1;
        wordbuf[len] = dictionary[arc].letter;
        if (wordbuf[len] == 'q')
            wordbuf[a++] = 'u';
        if (dictionary[arc].final) {
            if (iswordinlist(arc)) {
                wordbuf[a] = '\0';
                if (xcursor == 4) {
                    printf("%s\n", wordbuf);
                    xcursor = 0;
                } else {
                    printf("%-16s", wordbuf);
                    ++xcursor;
                }
            }
        }
        _printwords(dictionary[arc].arc, a);
        arc = dictionary[arc].next;
    }
}

static void printwordlist(void)
{
    xcursor = 0;
    _printwords(1, 0);
    if (xcursor)
        putchar('\n');
}

/*
 * The board is stored as an array of oriented dice. To score a game,
 * the program looks at each slot on the board in turn, and tries to
 * find a path along the dictionary tree that matches the letters on
 * adjacent dice.
 */

static void initneighbors(void)
{
    int i, j, n;

    for (i = 0 ; i < BOARDSIZE ; ++i) {
        n = 0;
        for (j = 0 ; j < BOARDSIZE ; ++j)
            if (i != j && abs(i / XSIZE - j / XSIZE) <= 1
                       && abs(i % XSIZE - j % XSIZE) <= 1)
                neighbors[i][n++] = j;
        neighbors[i][n] = -1;
    }
}

static void printboard(void)
{
    int i;

    for (i = 0 ; i < BOARDSIZE ; ++i) {
        printf(" %c", toupper(dice[G.board[i].die][G.board[i].face]));
        if (i % XSIZE == XSIZE - 1)
            putchar('\n');
    }
}

static void _findwords(int pos, int arc, int len)
{
    int ch, i, p;

    for (;;) {
        ch = dictionary[arc].letter;
        if (ch == gridbuf[pos])
            break;
        if (ch > gridbuf[pos] || !dictionary[arc].next)
            return;
        arc = dictionary[arc].next;
    }
    wordbuf[len++] = ch;
    if (dictionary[arc].final) {
        wordbuf[len] = '\0';
        addwordtolist(wordbuf, arc);
    }
    gridbuf[pos] = '.';
    for (i = 0 ; (p = neighbors[pos][i]) >= 0 ; ++i)
        if (gridbuf[p] != '.')
            _findwords(p, dictionary[arc].arc, len);
    gridbuf[pos] = ch;
}

static void findwordsingrid(void)
{
    int i;

    clearwordlist();
    for (i = 0 ; i < BOARDSIZE ; ++i)
        gridbuf[i] = dice[G.board[i].die][G.board[i].face];
    for (i = 0 ; i < BOARDSIZE ; ++i)
        _findwords(i, 1, 0);
}

static void shuffleboard(void)
{
    int die[BOARDSIZE];
    int i, n;

    for (i = 0 ; i < BOARDSIZE ; ++i)
        die[i] = i;
    for (i = BOARDSIZE ; i-- ; ) {
        n = random(i);
        G.board[i].die = die[n];
        G.board[i].face = random(DIEFACES);
        die[n] = die[i];
    }
}

/*
 * The pool contains the N highest-scoring games found so far. (This
 * would typically be done using a priority queue, but it represents
 * far too little of the runtime. Brute force is just as good and
 * simpler.) Note that the pool will only ever contain one board with
 * a particular score: This is a cheap way to discourage the pool from
 * filling up with almost-identical high-scoring boards.
 */

static void addgametopool(void)
{
    int i;

    if (G.score < cutoffscore)
        return;
    for (i = 0 ; i < poolsize ; ++i) {
        if (G.score == pool[i].score) {
            pool[i] = G;
            return;
        }
        if (G.score > pool[i].score)
            break;
    }
    if (poolsize < MAXPOOLSIZE)
        ++poolsize;
    if (i < poolsize) {
        memmove(pool + i + 1, pool + i, (poolsize - i - 1) * sizeof *pool);
        pool[i] = G;
    }
    cutoffscore = pool[poolsize - 1].score;
    stallcounter = 0;
}

static void selectpoolmember(int n)
{
    G = pool[n];
}

static void emptypool(void)
{
    poolsize = 0;
    cutoffscore = 0;
    stallcounter = 0;
}

/*
 * The program examines as many boards as it can in the given time,
 * and retains the one with the highest score. If the program is out
 * of time, then it reports the best-seen game and immediately exits.
 */

static void report(void)
{
    findwordsingrid();
    printboard();
    printwordlist();
    printf("score = %d\n", G.score);
    fprintf(stderr, "// score: %d points (%d words)\n", G.score, listsize);
    fprintf(stderr, "// %d boards examined\n", boardcount);
    fprintf(stderr, "// avg score: %.1f\n", (double)allscores / boardcount);
    fprintf(stderr, "// runtime: %ld s\n", time(0) - start);
}

static void scoreboard(void)
{
    findwordsingrid();
    ++boardcount;
    allscores += G.score;
    addgametopool();
    if (bestgame.score < G.score) {
        bestgame = G;
        fprintf(stderr, "// %ld s: board %d scoring %d\n",
                time(0) - start, boardcount, G.score);
    }

    if (time(0) - start >= RUNTIME) {
        G = bestgame;
        report();
        exit(0);
    }
}

static void restartpool(void)
{
    emptypool();
    while (poolsize < MAXPOOLSIZE) {
        shuffleboard();
        scoreboard();
    }
}

/*
 * Making small modifications to a board.
 */

static void turndie(void)
{
    int i, j;

    i = random(BOARDSIZE);
    j = random(DIEFACES - 1) + 1;
    G.board[i].face = (G.board[i].face + j) % DIEFACES;
}

static void swapdice(void)
{
    slot t;
    int p, q;

    p = random(BOARDSIZE);
    q = random(BOARDSIZE - 1);
    if (q >= p)
        ++q;
    t = G.board[p];
    G.board[p] = G.board[q];
    G.board[q] = t;
}

/*
 *
 */

int main(void)
{
    int i;

    start = time(0);
    srand((unsigned int)start);

    createdictionary(WORDLISTFILE);
    initwordlist();
    initneighbors();

    restartpool();
    for (;;) {
        for (i = 0 ; i < poolsize ; ++i) {
            selectpoolmember(i);
            turndie();
            scoreboard();
            selectpoolmember(i);
            swapdice();
            scoreboard();
        }
        ++stallcounter;
        if (stallcounter >= STALLPOINT) {
            fprintf(stderr, "// stalled; restarting search\n");
            restartpool();
        }
    }

    return 0;
}

Notes for version 2 (June 9)

Here's one way to use the initial version of my code as a jumping-off point. The changes to this version consist of less than 100 lines, but tripled the average game score.

In this version, the program maintains a "pool" of candidates, consisting of the N highest-scoring boards that the program has generated so far. Every time a new board is generated, it is added to the pool and the lowest-scoring board in the pool is removed (which may very well be the board that was just added, if its score is lower than what's already there). The pool is initially filled with randomly generated boards, after which it keeps a constant size throughout the program's run.

The program's main loop consists of of selecting a random board from the pool and altering it, determining this new board's score and then putting it into the pool (if it scores well enough). In this fashion the program is continually refining high-scoring boards. The main activity is to make stepwise, incremental improvements, but the size of the pool also allows the program to find multi-step improvements that temporarily make a board's score worse before it can make it better.

Typically this program finds a good local maximum rather quickly, after which presumably any better maximum is too distant to be found. And so once again there's little point to running the program longer than 10 seconds. This might be improved by e.g. having the program detect this situation and start a fresh search with a new pool of candidates. However, this would net only a marginal increase. A proper heuristic search technique would likely be a better avenue of exploration.

(Side note: I saw that this version was generating about 5k boards/sec. Since the first version typically did 20k boards/sec, I was initially concerned. Upon profiling, however, I discovered that the extra time was spent managing the wordlist. In other words, it was entirely due to the program finding so many more words per board. In light of this, I considered changing the code to manage the wordlist, but given that this program is only using 10 of its allotted 120 seconds, such an optimization would be very premature.)

Notes for version 1 (June 2)

This is a very, very simple solution. All it does it generate random boards, and then after 10 seconds it outputs the one with the highest score. (I defaulted to 10 seconds because the extra 110 seconds permitted by the problem specification typically doesn't improve the final solution found enough to be worth waiting for.) So it's extremely dumb. However, it has all the infrastructure to make a good starting point for a more intelligent search, and if anyone wishes to make use of it before the deadline, I encourage them to do so.

The program starts by reading the dictionary into a tree structure. (The form isn't quite as optimized as it could be, but it's more than good enough for these purposes.) After some other basic initialization, it then begins generating boards and scoring them. The program examines about 20k boards per second on my machine, and after about 200k boards the random approach starts to run dry.

Since only one board is actually being evaluated at any given time, the scoring data is stored in global variables. This allows me to minimize the amount of constant data that has to be passed as arguments to the recursive functions. (I'm sure this will give some people hives, and to them I apologize.) The wordlist is stored as a binary search tree. Every word found has to be looked up in the wordlist, so that duplicate words aren't counted twice. The wordlist is only needed during the evaulation process, however, so it's discarded after the score is found. Thus, at the end of the program, the chosen board has to be scored all over again, so that the wordlist can be printed out.

Fun fact: The average score for a randomly-generated Boggle board, as scored by english.0, is 61.7 points.

\$\endgroup\$
7
  • \$\begingroup\$ Obviously, I need to improve my own efficiency. :-) \$\endgroup\$
    – Gaffi
    Commented Jun 2, 2012 at 6:12
  • \$\begingroup\$ My genetic approach gets about 700-800 points generating about 200k boards, so you're clearly doing something far better than me in the way you produce the next generation. \$\endgroup\$ Commented Jun 11, 2012 at 18:21
  • \$\begingroup\$ My own tree structure, having been implemented only for the master wordlist thus far, while it works and allows me to validate boards, it bogs down my system memory (actively lags to the point that it takes a considerable amount of time to force the process to terminate early). This is surely my own fault, but I'm working at it! Edit: Already fixed! ;-) \$\endgroup\$
    – Gaffi
    Commented Jun 11, 2012 at 21:28
  • \$\begingroup\$ @PeterTaylor I thought about trying a genetic algorithm, but I couldn't think of a plausible mechanism for combining two boards. How are you doing it? Are you selecting the parent randomly for each slot on the board? \$\endgroup\$
    – breadbox
    Commented Jun 12, 2012 at 18:26
  • \$\begingroup\$ I split the board state into the permutation of dice and the faces showing on the dice. For permutation crossover I use "order crossover 1" from cs.colostate.edu/~genitor/1995/permutations.pdf and for face crossover I do the obvious. But I have an idea for a totally different approach which I need to find the time to implement. \$\endgroup\$ Commented Jun 12, 2012 at 19:02
3
\$\begingroup\$

VBA (average currently ranging 80-110 points, unfinished)

Here's my working process, but it's far from the best possible; my absolute best score found on any board after many test runs is around 120. There still needs to be some better general cleanup and I'm sure there are more efficiencies to be gained in a number of places.

  • 2012.05.09:
    • Original posting
  • 2012.05.10 - 2012.05.18:
    • Improved the scoring algorithm
    • Improved the pathfinding logic
  • 2012.06.07 - 2012.06.12:
    • Reduced word limit to 6 from 8. Allows for more boards with smaller words. Appears to have made a slight improvement in average score. (10-15 or so boards checked per run vs. 1 to 2)
    • Following breadbox's suggestion, I've created a tree structure to house the word-list. This does speed up back-end checking of the words on a board significantly.
    • I played with changing the maximum word size (speed vs. score) and I haven't yet decided if 5 or 6 is a better option for me. 6 results in 100-120 total boards checked, while 5 results in 500-1000 (both of which are still far below the other examples provided so far).
    • Problem: After many successive runs, the process begins to slow, so there is still some memory to be managed.

This probably looks horrid to some of you, but as I said, WIP. I am very open to constructive criticism! Sorry for the very lengthy body...


Dice Class Module:

Option Explicit

Private Sides() As String

Sub NewDie(NewLetters As String)
    Sides = Split(NewLetters, ",")
End Sub

Property Get Side(i As Integer)
    Side = Sides(i)
End Property

Tree Class Module:

Option Explicit

Private zzroot As TreeNode


Sub AddtoTree(ByVal TreeWord As Variant)
Dim i As Integer
Dim TempNode As TreeNode

    Set TempNode = TraverseTree(TreeWord, zzroot)
    SetNode TreeWord, TempNode

End Sub

Private Function SetNode(ByVal Value As Variant, parent As TreeNode) As TreeNode
Dim ValChar As String
    If Len(Value) > 0 Then
        ValChar = Left(Value, 1)
        Select Case Asc(ValChar) - 96
            Case 1:
                Set parent.Node01 = AddNode(ValChar, parent.Node01)
                Set SetNode = parent.Node01
            Case 2:
                Set parent.Node02 = AddNode(ValChar, parent.Node02)
                Set SetNode = parent.Node02
            ' ... - Reduced to limit size of answer.
            Case 26:
                Set parent.Node26 = AddNode(ValChar, parent.Node26)
                Set SetNode = parent.Node26
            Case Else:
                Set SetNode = Nothing
        End Select

        Set SetNode = SetNode(Right(Value, Len(Value) - 1), SetNode)
    Else
        Set parent.Node27 = AddNode(True, parent.Node27)
        Set SetNode = parent.Node27
    End If
End Function

Function AddNode(ByVal Value As Variant, NewNode As TreeNode) As TreeNode
    If NewNode Is Nothing Then
        Set AddNode = New TreeNode
        AddNode.Value = Value
    Else
        Set AddNode = NewNode
    End If
End Function
Function TraverseTree(TreeWord As Variant, parent As TreeNode) As TreeNode
Dim Node As TreeNode
Dim ValChar As String
    If Len(TreeWord) > 0 Then
        ValChar = Left(TreeWord, 1)

        Select Case Asc(ValChar) - 96
            Case 1:
                Set Node = parent.Node01
            Case 2:
                Set Node = parent.Node02
            ' ... - Reduced to limit size of answer.
            Case 26:
                Set Node = parent.Node26
            Case Else:
                Set Node = Nothing
        End Select

        If Not Node Is Nothing Then
            Set TraverseTree = TraverseTree(Right(TreeWord, Len(TreeWord) - 1), Node)
            If Not TraverseTree Is Nothing Then
                Set TraverseTree = parent
            End If
        Else
            Set TraverseTree = parent
        End If
    Else
        If parent.Node27.Value Then
            Set TraverseTree = parent
        Else
            Set TraverseTree = Nothing
        End If
    End If
End Function

Function WordScore(TreeWord As Variant, Step As Integer, Optional parent As TreeNode = Nothing) As Integer
Dim Node As TreeNode
Dim ValChar As String
    If parent Is Nothing Then Set parent = zzroot
    If Len(TreeWord) > 0 Then
        ValChar = Left(TreeWord, 1)

        Select Case Asc(ValChar) - 96
            Case 1:
                Set Node = parent.Node01
            Case 2:
                Set Node = parent.Node02
            ' ... - Reduced to limit size of answer.
            Case 26:
                Set Node = parent.Node26
            Case Else:
                Set Node = Nothing
        End Select

        If Not Node Is Nothing Then
            WordScore = WordScore(Right(TreeWord, Len(TreeWord) - 1), Step + 1, Node)
        End If
    Else
        If parent.Node27 Is Nothing Then
            WordScore = 0
        Else
            WordScore = Step
        End If
    End If
End Function

Function ValidWord(TreeWord As Variant, Optional parent As TreeNode = Nothing) As Integer
Dim Node As TreeNode
Dim ValChar As String
    If parent Is Nothing Then Set parent = zzroot
    If Len(TreeWord) > 0 Then
        ValChar = Left(TreeWord, 1)

        Select Case Asc(ValChar) - 96
            Case 1:
                Set Node = parent.Node01
            Case 2:
                Set Node = parent.Node02
            ' ... - Reduced to limit size of answer.
            Case 26:
                Set Node = parent.Node26
            Case Else:
                Set Node = Nothing
        End Select

        If Not Node Is Nothing Then
            ValidWord = ValidWord(Right(TreeWord, Len(TreeWord) - 1), Node)
        Else
            ValidWord = False
        End If
    Else
        If parent.Node27 Is Nothing Then
            ValidWord = False
        Else
            ValidWord = True
        End If
    End If
End Function

Private Sub Class_Initialize()
    Set zzroot = New TreeNode
End Sub

Private Sub Class_Terminate()
    Set zzroot = Nothing
End Sub

TreeNode Class Module:

Option Explicit

Public Value As Variant
Public Node01 As TreeNode
Public Node02 As TreeNode
' ... - Reduced to limit size of answer.
Public Node26 As TreeNode
Public Node27 As TreeNode

Main Module:

Option Explicit

Const conAllSides As String = ";a,a,e,e,g,n;e,l,r,t,t,y;a,o,o,t,t,w;a,b,b,j,o,o;e,h,r,t,v,w;c,i,m,o,t,u;d,i,s,t,t,y;e,i,o,s,s,t;d,e,l,r,v,y;a,c,h,o,p,s;h,i,m,n,qu,u;e,e,i,n,s,u;e,e,g,h,n,w;a,f,f,k,p,s;h,l,n,n,r,z;d,e,i,l,r,x;"
Dim strBoard As String, strBoardTemp As String, strWords As String, strWordsTemp As String
Dim CheckWordSub As String
Dim iScore As Integer, iScoreTemp As Integer
Dim Board(1 To 4, 1 To 4) As Integer
Dim AllDice(1 To 16) As Dice
Dim AllWordsTree As Tree
Dim AllWords As Scripting.Dictionary
Dim CurWords As Scripting.Dictionary
Dim FullWords As Scripting.Dictionary
Dim JunkWords As Scripting.Dictionary
Dim WordPrefixes As Scripting.Dictionary
Dim StartTime As Date, StopTime As Date
Const MAX_LENGTH As Integer = 5
Dim Points(3 To 8) As Integer

Sub Boggle()
Dim DiceSetup() As String
Dim i As Integer, j As Integer, k As Integer

    StartTime = Now()

    strBoard = vbNullString
    strWords = vbNullString
    iScore = 0

    ReadWordsFileTree

    DiceSetup = Split(conAllSides, ";")

    For i = 1 To 16
        Set AllDice(i) = New Dice
        AllDice(i).NewDie "," & DiceSetup(i)
    Next i

    Do While WithinTimeLimit

        Shuffle

        strBoardTemp = vbNullString
        strWordsTemp = vbNullString
        iScoreTemp = 0

        FindWords

        If iScoreTemp > iScore Or iScore = 0 Then
            iScore = iScoreTemp
            k = 1
            For i = 1 To 4
                For j = 1 To 4
                    strBoardTemp = strBoardTemp & AllDice(k).Side(Board(j, i)) & "  "
                    k = k + 1
                Next j
                strBoardTemp = strBoardTemp & vbNewLine
            Next i
            strBoard = strBoardTemp
            strWords = strWordsTemp

        End If

    Loop

    Debug.Print strBoard
    Debug.Print strWords
    Debug.Print iScore & " points"

    Set AllWordsTree = Nothing
    Set AllWords = Nothing
    Set CurWords = Nothing
    Set FullWords = Nothing
    Set JunkWords = Nothing
    Set WordPrefixes = Nothing

End Sub

Sub ShuffleBoard()
Dim i As Integer

    For i = 1 To 16
        If Not WithinTimeLimit Then Exit Sub
        Board(Int((i - 1) / 4) + 1, 4 - (i Mod 4)) = Int(6 * Rnd() + 1)
    Next i

End Sub

Sub Shuffle()
Dim n As Long
Dim Temp As Variant
Dim j As Long

    Randomize
    ShuffleBoard
    For n = 1 To 16
        If Not WithinTimeLimit Then Exit Sub
        j = CLng(((16 - n) * Rnd) + n)
        If n <> j Then
            Set Temp = AllDice(n)
            Set AllDice(n) = AllDice(j)
            Set AllDice(j) = Temp
        End If
    Next n

    Set FullWords = New Scripting.Dictionary
    Set CurWords = New Scripting.Dictionary
    Set JunkWords = New Scripting.Dictionary

End Sub

Sub ReadWordsFileTree()
Dim FSO As New FileSystemObject
Dim FS
Dim strTemp As Variant
Dim iLength As Integer
Dim StartTime As Date

    StartTime = Now()
    Set AllWordsTree = New Tree
    Set FS = FSO.OpenTextFile("P:\Personal\english.txt")

    Points(3) = 1
    Points(4) = 1
    Points(5) = 2
    Points(6) = 3
    Points(7) = 5
    Points(8) = 11

    Do Until FS.AtEndOfStream
        strTemp = FS.ReadLine
        If strTemp = LCase(strTemp) Then
            iLength = Len(strTemp)
            iLength = IIf(iLength > 8, 8, iLength)
            If InStr(strTemp, "'") < 1 And iLength > 2 Then
                AllWordsTree.AddtoTree strTemp
            End If
        End If
    Loop
    FS.Close

End Sub

Function GetScoreTree() As Integer
Dim TempScore As Integer

    If Not WithinTimeLimit Then Exit Function

    GetScoreTree = 0

    TempScore = AllWordsTree.WordScore(CheckWordSub, 0)
    Select Case TempScore
        Case Is < 3:
            GetScoreTree = 0
        Case Is > 8:
            GetScoreTree = 11
        Case Else:
            GetScoreTree = Points(TempScore)
    End Select

End Function

Sub SubWords(CheckWord As String)
Dim CheckWordScore As Integer
Dim k As Integer, l As Integer

    For l = 0 To Len(CheckWord) - 3
        For k = 1 To Len(CheckWord) - l
            If Not WithinTimeLimit Then Exit Sub

            CheckWordSub = Mid(CheckWord, k, Len(CheckWord) - ((k + l) - 1))

            If Len(CheckWordSub) >= 3 And Not CurWords.Exists(CheckWordSub) Then
                CheckWordScore = GetScoreTree

                If CheckWordScore > 0 Then
                    CurWords.Add CheckWordSub, CheckWordSub
                    iScoreTemp = iScoreTemp + CheckWordScore
                    strWordsTemp = strWordsTemp & CheckWordSub & vbNewLine
                End If

                If Left(CheckWordSub, 1) = "q" Then
                    k = k + 1
                End If
            End If

        Next k
    Next l

End Sub

Sub FindWords()
Dim CheckWord As String
Dim strBoardLine(1 To 16) As String
Dim Used(1 To 16) As Boolean
Dim i As Integer, j As Integer, k As Integer, l As Integer, m As Integer, n As Integer
Dim StartSquare As Integer
Dim FullCheck As Variant

    n = 1
    For l = 1 To 4
        For m = 1 To 4
            If Not WithinTimeLimit Then Exit Sub
            strBoardLine(n) = AllDice(n).Side(Board(m, l))
            n = n + 1
        Next m
    Next l

    For i = 1 To 16
        For k = 1 To 16

            If Not WithinTimeLimit Then Exit Sub
            If k Mod 2 = 0 Then
                For j = 1 To 16
                    Used(j) = False
                Next j

                Used(i) = True
                MakeWords strBoardLine, Used, i, k / 2, strBoardLine(i)
            End If

        Next k
    Next i

    For Each FullCheck In FullWords.Items
        SubWords CStr(FullCheck)
    Next FullCheck

End Sub

Function MakeWords(BoardLine() As String, Used() As Boolean, _
    Start As Integer, _
    Direction As Integer, CurString As String) As String
Dim i As Integer, j As Integer, k As Integer, l As Integer

    j = 0

    Select Case Direction
        Case 1:
            k = Start - 5
        Case 2:
            k = Start - 4
        Case 3:
            k = Start - 3
        Case 4:
            k = Start - 1
        Case 5:
            k = Start + 1
        Case 6:
            k = Start + 3
        Case 7:
            k = Start + 4
        Case 8:
            k = Start + 5
    End Select

    If k >= 1 And k <= 16 Then
        If Not WithinTimeLimit Then Exit Function

        If Not Used(k) Then
            If ValidSquare(Start, k) Then
                If Not (JunkWords.Exists(CurString & BoardLine(k))) And Not FullWords.Exists(CurString & BoardLine(k)) Then
                    Used(k) = True
                    For l = 1 To MAX_LENGTH
                        If Not WithinTimeLimit Then Exit Function
                        MakeWords = CurString & BoardLine(k)
                        If Not (JunkWords.Exists(MakeWords)) Then
                            JunkWords.Add MakeWords, MakeWords
                        End If
                        If Len(MakeWords) = MAX_LENGTH And Not FullWords.Exists(MakeWords) Then
                            FullWords.Add MakeWords, MakeWords
                        ElseIf Len(MakeWords) < MAX_LENGTH Then
                            MakeWords BoardLine, Used, k, l, MakeWords
                        End If
                    Next l
                    Used(k) = False
                End If
            End If
        End If
    End If

    If Len(MakeWords) = MAX_LENGTH And Not FullWords.Exists(MakeWords) Then
        FullWords.Add MakeWords, MakeWords
        Debug.Print "FULL - " & MakeWords
    End If

End Function

Function ValidSquare(StartSquare As Integer, EndSquare As Integer) As Boolean
Dim sx As Integer, sy As Integer, ex As Integer, ey As Integer

    If Not WithinTimeLimit Then Exit Function

    sx = (StartSquare - 1) Mod 4 + 1
    ex = (EndSquare - 1) Mod 4 + 1

    sy = Int((StartSquare - 1) / 4 + 1)
    ey = Int((EndSquare - 1) / 4 + 1)

    ValidSquare = (sx - 1 <= ex And sx + 1 >= ex) And (sy - 1 <= ey And sy + 1 >= ey) And StartSquare <> EndSquare

End Function

Function WithinTimeLimit() As Boolean
    StopTime = Now()
    WithinTimeLimit = (Round(CDbl(((StopTime - StartTime) - Int(StopTime - StartTime)) * 86400), 0) < 120)
End Function
\$\endgroup\$
11
  • 2
    \$\begingroup\$ I haven't looked through the code, but 50 points is ridiculuously low. I've played randomly generated boards with scores over 1000 (using SOWPODS - the word list supplied may be less extensive). You might want to check for a sign error! \$\endgroup\$ Commented May 15, 2012 at 21:43
  • \$\begingroup\$ @PeterTaylor Thanks for the suggestion. I know that score's much too low, and I know that part of the problem lies in the fact that I can see obvious words being missed... \$\endgroup\$
    – Gaffi
    Commented May 15, 2012 at 22:24
  • \$\begingroup\$ @PeterTaylor Also, for the record, I'm continually posting my progress, rather than wait for my final product, since nobody else has taken a stab at it yet. I would like to keep the question somewhat alive until that happens. \$\endgroup\$
    – Gaffi
    Commented May 16, 2012 at 0:30
  • \$\begingroup\$ I should also note that this is not being run on the fastest machine out there, so that doesn't help, either. \$\endgroup\$
    – Gaffi
    Commented May 18, 2012 at 17:56
  • 1
    \$\begingroup\$ @Gaffi 10 seconds to compute the score? That's 9.999 seconds too long. You have to rethink your code. If you refuse to turn your wordlist into a tree, then at least do this: Make a list (hashtable, whatever) of all the two-letter prefixes. Then when you start to follow a path on the board, if the first two letters aren't in the list, skip that entire subtree of possible paths. Again, building the full tree is best, but the two-letter prefix list will help and is very cheap to make. \$\endgroup\$
    – breadbox
    Commented Jun 8, 2012 at 6:35
2
\$\begingroup\$

Quick look at size of the search space.

   16! => 20922789888000 Dice Permutations
(6^16) =>  2821109907456 Face Permutations
 59025489844657012604928000 Boggle Grids 

Reducing to exclude the repetition on each die.

              16! => 20922789888000 Dice Permutations
(4^4)*(5^6)*(6^5) => 31104000000 Unique Face Permutations
   650782456676352000000000 Boggle Grids 

@breadbox store the dictionary as an Hash Table O(1) checking.

EDIT

Best Board (I've witnessed so far)

L  E  A  N
S  E  T  M
T  S  B  D
I  E  G  O

Score: 830
Words: 229
SLEETIEST  MANTELETS
MANTEELS  MANTELET  MATELESS
MANTEEL  MANTELS  TESTEES  BETISES  OBTESTS  OBESEST
SLEETS  SLEEST  TESTIS  TESTES  TSETSE  MANTES  MANTEL  TESTAE  TESTEE
STEELS  STELES  BETELS  BESETS  BESITS  BETISE  BODGES  BESEES  EISELS
GESTES  GEISTS  OBTEST
LEANT  LEATS  LEETS  LEESE  LESES  LESTS  LESBO  ANTES  NATES  SLEET  SETAE
SEATS  STIES  STEEL  STETS  STEAN  STEAM  STELE  SELES  TAELS  TEELS  TESTS
TESTE  TELES  TETES  MATES  TESTA  TEATS  SEELS  SITES  BEETS  BETEL  BETES
BESET  BESTS  BESIT  BEATS  BODGE  BESEE  DOGES  EISEL  GESTS  GESTE  GESSE
GEITS  GEIST  OBESE
LEAN  LEAT  LEAM  LEET  LEES  LETS  LEST  LESS  EATS  EELS  ELSE  ETNA  ESES
ESTS  ESSE  ANTE  ANTS  ATES  AMBO  NATS  SLEE  SEEL  SETA  SETS  SESE  SEAN
SEAT  SEAM  SELE  STIE  STET  SEES  TAEL  TAES  TEEL  TEES  TEST  TEAM  TELE
TELS  TETS  TETE  MATE  MATS  MAES  TIES  TEAT  TEGS  SELS  SEGO  SITS  SITE
BEET  BEES  BETA  BETE  BETS  BEST  BEAN  BEAT  BEAM  BELS  BOGS  BEGO  BEGS
DOGE  DOGS  DOBS  GOBS  GEST  GEIT  GETS  OBES
LEA  LEE  LET  LES  EAN  EAT  EEL  ELS  ETA  EST  ESS  ANT  ATE  NAT  NAE  NAM
SEE  SET  SEA  SEL  TAN  TAE  TAM  TEE  TES  TEA  TEL  TET  MNA  MAN  MAT  MAE
TIE  TIS  TEG  SEG  SEI  SIT  BEE  BET  BEL  BOD  BOG  BEG  DOG  DOB  ITS  EGO
GOD  GOB  GET  OBS  OBE
EA  EE  EL  ET  ES  AN  AT  AE  AM  NA  ST  TA  TE  MA
TI  SI  BE  BO  DO  IT  IS  GO  OD  OB
\$\endgroup\$
5
  • \$\begingroup\$ Get me a machine with that much RAM and we'll talk. \$\endgroup\$
    – breadbox
    Commented Jun 2, 2012 at 18:28
  • \$\begingroup\$ You need to divide the dice permutations by 8 to account for the symmetries of the square. Also, how do you get (4^4)(5^6)(6^5)? I make it (4^3)(5^7)(6^6), for a total searc space of just over 2^79. \$\endgroup\$ Commented Jun 8, 2012 at 21:42
  • \$\begingroup\$ @Peter Taylor : You're right. I must have deleted one to many, when do the unique faces. I think we can agree there is 83 unique faces, (excluding repeats across die). Pick any 16 without repeats. '83 x 82 x 81 x 80 x 79 x 78 x 77 x 76 x 75 x 74 x 73 x 72 x 71 x 70 x 69 x 68' Approx: 1.082 x (10^30) ==> ~2^100 What ever it is, its a big number. \$\endgroup\$ Commented Jun 8, 2012 at 22:53
  • 2
    \$\begingroup\$ @AdamSpeight I originally assumed that your comment about storing the dictionary as a hashtable was just a joke, and so I basically ignored it. My apologies. A proper response would be: Actually, a hashtable is a lousy data structure for this problem. It can only answer the question "is X a valid word?", so you have to build all possible strings in order to find the words. A DAWG lets you ask "is X a prefix of any valid word? and if so, what letters can follow it?" This allows you to prune the search space to a tiny tiny fraction of its total size. \$\endgroup\$
    – breadbox
    Commented Jun 9, 2012 at 5:54
  • \$\begingroup\$ Hashtable is terrible as it prevents you culling word fragments that will never become complete words (dicttree.ceiling(fragment).startsWith(fragment)). Although any given boggle board has many millions of potential words, you can throw out a huge portion of them after 2-3 letters have been strung together. Tree traversal is slower than hashtable lookup, but the tree allows you to sidestep 99+ percent of the work through backtracking. \$\endgroup\$
    – Jim W
    Commented Sep 29, 2017 at 15:35
1
\$\begingroup\$

My entry is over here on Dream.In.Code ~30ms per a board search (on a 2 core machine, should be quicker with more cores)

\$\endgroup\$
3
  • \$\begingroup\$ Still looking into it, but your first link on that page is missing the : in http://. ;-) \$\endgroup\$
    – Gaffi
    Commented Jun 18, 2012 at 20:08
  • \$\begingroup\$ Very nice. I'm going to try to steal that for myself as a learning experience. .NET to VBA isn't too hard. \$\endgroup\$
    – Gaffi
    Commented Jun 18, 2012 at 20:48
  • \$\begingroup\$ Would you mind updating the answer to include you average score when running the ISPELL list (not SOWPODS)? That is part of the challenge, and I'm interested to see how your results compare to breadbox's. \$\endgroup\$
    – Gaffi
    Commented Jun 22, 2012 at 21:19

Not the answer you're looking for? Browse other questions tagged or ask your own question.