6

First time poster. I'm currently programming a hamming distance program using a dictionary.

So if we start with the word "cat" and we want to transform it into "dog", the sequence is cat-->cot-->dot-->dog, and "full" to "tree" is full-->fuel-->feel-->teel-->tyee-->tree. All intermediates must exist in a dictionary. I've implemented that part and it works quite well. My question is about the fringe cases. Some words have no transformations, such as dragon and cosmo. But that's fine, I implemented a quick check.

The real problem is there are lots of paths in which transformations are impossible and I can't seem to find a way to check whether they are impossible. For example, heist to flame is heist-->feist-->feast-->feaseX --> flame. Getting to the end point is impossible and an infinite loop occurs.

My possible options are that after around 20 transformations, I stop the calculation and throw an error, which is fine. Or, there might be way to detect a dead path before the program is ran, this is the best option, I'm not optimistic this is possible.

3

5 Answers 5

2

You can solve this with a standard breadth-first search, with the addition of also storing the path taken to a node whenever you enqueue it in the to search queue. BFS algorithms avoid loops by keeping a visited or discovered set. Because they use a queue instead of a stack, they are not normally implemented using recursion, unless the implementation uses immutable data structures.

If you reach a point in the algorithm where you have nothing remaining in the to search queue, you know there is no possible path. Note at this point, the visited set contains all the nodes reachable from the given starting point. You can take advantage of that to precalculate and store all the sets of connected words, by just skipping the part where you check if you reached the goal word. However, it's simpler and may often work out faster to just run the BFS.

2
  • Yup! thats exactly what I did
    – Siqi Liu
    Commented Jul 8, 2015 at 14:34
  • Although the way I did it was if all paths coverged on the same node, then the algorithm would end.
    – Siqi Liu
    Commented Jul 8, 2015 at 15:23
5

It seems that what you're really trying to do is figure out if two nodes in a graph are in the same connected component. Specifically the graph is an undirected one in which the words of your dictionary are nodes, and two nodes are connected to each other by a vertex when they differ by exactly one letter, and are the same length. The root of the problem is that this graph is not be totally connected. It may be a disconnected graph of mutually unreachable connected components. In fact, it seems from your description that any words with a different number of letters would have to be in separate components.

The italicized keywords above are too technical to explain completely in an answer here, but you will find them on Wikipeida, or in any computer science text that covers graph algorithms. The algorithms you should consider for solving your problem are reachability algorithms for disconnected, undirected graphs.

1

To solve the question you pose, you must understand complete search, and for that you have to understand recursive algorithms. This simply means that you must reduce e.g. the problem of reaching "tree" from "full" to the problem of reaching "free" from "full", and handle the intermediate data correctly.

1
  • 1
    Could you go into a bit more detail? I'm interested
    – Siqi Liu
    Commented Jul 7, 2015 at 18:08
0

I assume you are using recursion for this. Every transformation is a unit of work to be done, hopefully it is handled in one method, or group of methods.

When you jump to next word, keep track of all the words that you have already been at during this specific path and check the available options against the list of those words to exclude the ones that you already visited.

Example:

heist-(1)->feist-(2)->feast-(3)->feaseX -(4)-> flame

* Numbers signify the execution cycle of the same method. Simplified version for demonstration purposes.

public string[] previouslyUsedWords = new String[]();

public string transformNext(string currentWord) {
    var possibleOptions = _optionsProvider.GetOptions(currentWord);

    foreach(var word in possibleOptions) {
        if(!previouslyUsedWords.Contains(word)) {
            previouslyUsedWords.Add(word);
            return word;
        }
    }

    throw new Exception("No options left, this is a dead end!");
}

Combine this with a tree-traversal algorithm(some paths are dead ends and that's why you need to go back and try other paths) and you should accomplish this task in no time.

2
  • Actually if you try to do depth search, you don't get the shortest distance. Going breadth search might work, but how would I implement a tree-traversal algorithm? Binary trees don't work.
    – Siqi Liu
    Commented Jul 7, 2015 at 23:14
  • Execution path will rarely be shortest distance, however the output you provide should be. That's why I am saying to keep track of nodes you have visited to avoid loops. Also, keep track of nodes that successfully let you navigate closer to your target word. If you find dead end, go back and remove the success nodes from the second list up to the moment where you still have options to explore further. If no options are left, it's a dead end. Binary tree obviously doesn't work, as you have non-binary tree here.
    – Alexus
    Commented Jul 7, 2015 at 23:21
0

I think you should probably use a union-find algorithm to find all connected groups of n-letter words that can transform into each other. Preprocess you dictionary with it, and before you start searching see if the two end words are in the same connected group. The preprocessing will only take time proportional to the number of valid word-to-word transitions, so you could get away with doing it every time you load your dictionary. You would need to store the group id of every word, so that would take a little more space.

Not the answer you're looking for? Browse other questions tagged or ask your own question.