1
\$\begingroup\$

I am trying to determine the correctness of a helper method within my MaxHeap class.

The goal of this helper method is, given the index of any node in the Max Heap, sink it to the correct level in the heap so that all of its children are greater than the node, and its parent is greater (essentially sink it to the correct position to regain the key property of a Max Heap). Max-heap property we don't want to break: the value of each node is less than or equal to the value of its parent

I implemented this first without looking at solution code provided online/in a MOOC course. The solution code is very (too) compact and am having trouble determining if my solution is also correct.

My heap representation is an array, pq[], with empty index 0. Heap will be from position 1 to pq.length -1. for a node at position nodeIndex, left child will be at index 2k and right child at index 2k + 1.

enter image description here



private void sink(int nodeIndex){


        /*
         * children of nodeIndex are in positions 2k, and 2k+1. Here we iterate while the node we are trying to sink, nodeIndex, has a left child.
         *
         * Note: if node has a right child it WILL have a left child because we are dealing with a Complete tree, meaning leaf nodes are left leaning
         */
        while(2 * nodeIndex <= this.top){
            //initially, we have not checked if node we are sinking has a right child, so we say left child is the node with the greatest Key for now
            int indexOfMaxChild = 2 * nodeIndex;

            //simple arithmetic to see which indexis would contain left and right child (if right child exists) 
            int indexOfLeftChild = 2 * nodeIndex;
            int indexOfRightChild = indexOfLeftChild+ 1;

            //if nodeIndex has a right child
            if(indexOfRightChild <= this.top) {
                //helper method that returns the position/index of nodeIndex's children with the greatest key. If children 1's key is < than children 2's key, return the index of children 2, otherwise, return index of children 1 
                indexOfMaxChild = indexOfMaxKey(indexOfLeftChild, indexOfRightChild);
            }

            // if nodeIndex is less than child with greatest key, swap (sink by one level)
            if(isLessThan(nodeIndex, indexOfMaxChild))
                exchange(nodeIndex, indexOfMaxChild);
            else //if node we are trying to sink is at the correct level, no sinking needed
                break;
            // we update the index of the node we are sinking and repeat if it has a left child
            nodeIndex = indexOfMaxChild;
        }

    }

Solution code:

private void sink(int k) 
{
 while (2*k <= top)
 {
     int j = 2*k;
     if (j < top && less(j, j+1)) j++;
     if (!less(k, j)) break;
     exch(k, j);
     k = j;
     } 
}
\$\endgroup\$
0

1 Answer 1

1
\$\begingroup\$

Verifying correctness

To verify the correctness of a program, you must verify all the paths of execution, and all the relevant variations of data. It's good to write unit tests for them. Consider these cases (from simpler to complex):

  • the index is out of bounds
  • the index points to a leaf
  • the index points to a node having only a left or right child that is a leaf, and should or should not sink (6 combinations: node < left; node == left; node > left; and similarly for right)
  • the index points to a node having two children that are leafs, with left higher or lower than right, and should or should not sink (many combinations to cover all the ordering of values of node, left and right, including equality and inequality)

It's also good to check the differences between your code and the solution:

  • If you rename variables and functions, they come very close, and look clearly equivalent for the most part, with few exceptions
  • Assuming that isLessThan in your code and less in the solution are the same, then the if (isLessThan(...)) ... in your code is equivalent to the if (!less(...)) ... in the solution, with the branches of the conditional effectively flipped, which doesn't change the behavior.

The trickiest difference is this part:

if (indexOfRightChild <= this.top) {
    indexOfMaxChild = indexOfMaxKey(indexOfLeftChild, indexOfRightChild);
}

Compared with this in the solution:

if (j < top && less(j, j+1)) j++;

Where j and j+1 correspond to indexOfLeftChild and indexOfRightChild in your code. These snippets are equivalent because:

  • j < top is equivalent to j+1 <= top
  • indexOfMaxChild = indexOfMaxKey(indexOfLeftChild, indexOfRightChild) is equivalent to if (isLessThan(indexOfMaxChild, indexOfMaxChild+1)) indexOfMaxChild++ because:
    • indexOfMaxChild has the value of indexOfLeftChild
    • indexOfRightChild has the value of indexOfLeftChild+1
    • indexOfMaxKey will return either indexOfLeftChild or indexOfRightChild

Avoiding duplicated logic

The helper methods indexOfMaxKey and isLessThan must have some similar logic:

  • they both must check the indexes are within the correct bounds
  • they both compare the values of nodes when present

To understand your code, I have to understand both helper methods. The other solution achieves the same effect using the single helper method less, and this makes it a bit easier to understand, simply because there is smaller vocabulary and logic to keep in my mind.

It's a good practice to use helper methods in general, at the same time I recommend to avoid helper methods that are too similar to each other.

Confusing names and terms

I find it confusing that the name top means the last index in the array, in many ways:

  • "top" and "last" are generally at opposite ends of things...
  • Given a tree, when I hear "top", I think "root"

I would prefer this named last.


You have this in a code comment:

helper method that returns the position/index of nodeIndex's children with the greatest key. If children 1's key is < than children 2's key, return the index of children 2, otherwise, return index of children 1

Instead of "key", I recommend to use the term "value" here. Arrays have values, tree nodes have values. The term "key" is more commonly used in map-like structures, where you lookup values by keys.

Instead of writing "position/index", pick a term and use it consistently (and here I'd pick "index"). Any one consistently used term is better than having alternative terms for the same thing. It reduces mental burden, and helps prevent future confusions.

Avoid empty index 0

The description says that index 0 is empty. In that case sink(0) looks illegal. I think it would be better to avoid this additional complexity by using index 0 as the top of the heap.

\$\endgroup\$
0

Not the answer you're looking for? Browse other questions tagged or ask your own question.