11
$\begingroup$

From CLRS book for MAX-HEAPIFY procedure :

The children's subtrees each have size at most 2n/3 - the worst case occurs when the last row of the tree is exactly half full

I fail to see this intuition for the worst case scenario . Can some one explain possibly with a diagram . Thanks .

P.S: I know Big O notation and also found this answer here but still I have doubts.

$\endgroup$
1
  • 1
    $\begingroup$ This question is fine here, but it would also be very fine on Computer Science. Just saying. $\endgroup$
    – k.stm
    Commented Mar 6, 2016 at 1:43

5 Answers 5

30
$\begingroup$

Start with a heap $H$ with $n$ levels with all levels full. That's $2^{i - 1}$ nodes for each level $i$ for a total of $$|H| = 2^n - 1$$ nodes in the heap. Let $L$ denote the left sub-heap of the root and $R$ denote the right sub-heap. $L$ has a total of $$|L| = 2^{n - 1} - 1$$ nodes, as does $R$. Since a binary heap is a complete binary tree, then new nodes must be added such that after heapification, nodes fill up the last level from left to right. So, let's add nodes to $L$ so that a new level is filled and let's denote this modified sub-heap as $L'$ and the modified whole heap as $H'$. This addition will require $2^{n - 1}$ nodes, bringing the total number of nodes in $L'$ to $$ |L'| = (2^{n - 1} - 1) + 2^{n - 1} = 2\cdot 2^{n - 1} - 1$$ and the total number of nodes in the entire heap to $$ |H'| = (2^{n} - 1) + 2^{n - 1} = 2^n + 2^{n - 1} - 1 $$

The amount of space $L'$ takes up out of the whole heap $H'$ is given by $$ \frac{|L'|}{|H'|} = \frac{2\cdot 2^{n-1} - 1}{2^n + 2^{n - 1} - 1} = \frac{2\cdot 2^{n-1} - 1}{2\cdot 2^{n - 1} + 2^{n - 1} - 1} = \frac{2\cdot 2^{n-1} - 1}{3\cdot 2^{n - 1} - 1} $$

Taking the limit as $n \to \infty$, we get: $$ \lim_{n\to\infty} { \frac{|L'|}{|H'|} } = \lim_{n\to\infty} { \frac{2\cdot 2^{n-1} - 1}{3\cdot 2^{n - 1} - 1} } = \frac{2}{3} $$

Long story short, $L'$ and $R$ make up effectively the entire heap. $|L'|$ has twice as many elements as $R$ so it makes up $\frac{2}{3}$ of the heap while $R$ makes up the other $\frac{1}{3}$.

This $\frac{2}{3}$ of the heap corresponds to having the last level of the heap half full from left to right. This is the most the heap can get imbalanced; adding another node will either begin to rebalance the heap (by filling out the other, right, half of the last level) or break the heap's shape property of being a complete binary tree.

$\endgroup$
4
  • 2
    $\begingroup$ I understood that when we want the lat level of nodes to be half then left subtree will have 2n/3 nodes. But I did not understand that why make the last level half. If the last level is full , even then, the program will have to traverse full height of tree. $\endgroup$
    – V K
    Commented Jul 23, 2020 at 7:02
  • 1
    $\begingroup$ @VK Were you able to understand this finally ? I have the same question. $\endgroup$ Commented Aug 1, 2020 at 12:17
  • $\begingroup$ @DeveshLohumi No, but I was able to understand from this walkccc.github.io/CLRS/Chap06/6.4/#64-5-star link. $\endgroup$
    – V K
    Commented Aug 4, 2020 at 11:44
  • $\begingroup$ @VK I also realise that a better way this could have been said is that the last row of one of the sub-trees is full instead of saying that the last row is half full since when calculating the running time we are recursively analysing one of the sub-trees. $\endgroup$ Commented Aug 5, 2020 at 12:12
6
$\begingroup$

The worst case scenario will occur when the recursive function MAX-HEAPIFY will be called until a leaf is reached.So to make it reach to the leaf we can choose the value of nodes such that every time the parent node is less then its children eg. chose parent node value as $0$ and every other node as $1$.So the running time will be $\Theta(h)=\Theta(\lg n)$ (since MAX-HEAPIFY will be called $h$ number of times and each call takes $\Theta(1)$ time).

Now as the running time is analysed by number of nodes in the tree ($n$).We have to maximize the number of nodes in the subtree so that the height will also get increased.So, $$T(n) \leq T(2n/3) + \Theta(1)$$ So in the worst case if we want to maximize the nodes in any of the subtree then we will make the final row of that subtree as full as possible and other subtree's final row as empty as possible.And by simple mathematical calculation we can show that - maximum number of node in a subtree is bounded by $2n/3$

$\endgroup$
1
  • 1
    $\begingroup$ can you put the mathematical induction in your answer to make it complete ? I got the point , though. So +1. $\endgroup$
    – Geek
    Commented Aug 11, 2012 at 18:45
3
$\begingroup$

We first note that a complete binary tree of height $h$ has $2^{h+1}-1$ nodes. Note that the height of a tree is the maximum number of edges from the root to a leaf.

We see that the worst case of Max-Heapify occurs when we start at the root of a heap and recurse all the way to a leaf. We also see that Max-Heapify makes a choice of recursing to its left subtree or its right subtree. When Max-Heapify recurses it cuts the work down by some fraction of the work it had before. We will show that this fraction is at most $2/3$.

From inspection we see the worst ratio occurs when we are at the root node of a heap of height $h$ and the left subtree is a complete binary tree of height $h-1$ and the right subtree is a complete binary tree of height $h-2$. Notice this means the heap's last level is half full and the heap is of size $1+(2^h-1)+(2^{h-1}-1)$. If Max-Heapify chooses to recurse to its left subtree then the work has been cut by a factor of

\begin{align} \frac{\text{number of nodes in left subtree}}{\text{number of nodes in entire heap}} &= \frac{2^h-1}{1+(2^h-1)+(2^{h-1}-1)} \\ &= \frac{2^h-1}{2^{h-1}(2+1)-1} \\ &= \frac{2^h-1}{3\cdot2^{h-1}-1}. \end{align}

We now want to take the limit as $h\to\infty$ to get an upper bound on the ratio

$$ \lim_{h\to\infty}\frac{2^h-1}{3\cdot2^{h-1}-1} = \lim_{h\to\infty}\frac{2\cdot2^{h-1}-1}{3\cdot2^{h-1}-1} = \frac{2}{3}. $$

And so we see what the authors meant when they said "The children's subtrees each have size at most $2n/3$ - the worst case occurs when the last row of the tree is exactly half full".

$\endgroup$
1
  • $\begingroup$ I don't get why we're changing $n$ when we're doing the worst-case analysis. Shouldn't we be analysing the worst case arrangement of nodes given the input is of size $n$? For a fixed $n$, the arrangement of nodes at the lowest level is fixed so it is not in our hands to forcefully make it half full. Any clarification would be helpul. $\endgroup$
    – Jamāl
    Commented Jan 12, 2021 at 3:21
3
$\begingroup$

Clarifying the statement below in a simple and easy manner:-

"The children's sub-trees each have size at most 2n/3 - the worst case occurs when the last row of the tree is exactly half full".

Let us suppose we have a max-heap as given below :-

                                        1
                                 /             \
                          2                           3
                      /       \                    /     \
                   4             5               6         7
                 /   \         /   \ 
               8      9      10     11 

Here the bottom level is exactly half filled. If you count the number of nodes to the left of the top node (i.e. 1) then it is equal to 7 and number of nodes to its right are 3.

Let x be the number of nodes in left sub-tree at any time and y be the number of nodes in right sub-tree at any time.

Then we can say that when bottom level is half filled. Then nodes in the left sub-tree are as twice as the nodes in the right sub-tree. So we can say that in worst case we have ceil(2n/3) nodes in left sub-tree where 'n' is the total number of nodes in left and right-subtrees. As in the above example, left sub-tree has ceil(2*10/3) = 7 nodes in worst case.

Hence we have the relation :-

T(N) = T(2N/3) + O(1)

$\endgroup$
1
  • $\begingroup$ if we take n = total number of nodes = 11, then we won't need to use the ceil function (as not used by book), and it will become: (2 * 11)/3 = 7 nodes $\endgroup$ Commented Nov 19, 2020 at 5:53
2
$\begingroup$

Normally for a binary tree, $ n \approx n/2 + n/2$

But in a binary-heap the left subtree has one more level that is filled completely.

What does this mean for the left subtree size? Well, since, its a binary tree this is roughly the same as multiplying by 2 (recall number of nodes in a binary tree of size $n = 2^h -1 $).
So the left subtree has 2x as many nodes as the right.

Now, Let $a = $ size of left tree
$b$ = size of right tree

Then we can express the size of the heap as $$\begin{align}n &= (a + b) \\ &= (2b + b) \text{ (Since left has twice as many as right}) \\ &= 3b \\ &\implies b = \frac{n}{3}, a= \frac{2}{3}n \\ \end{align} $$

$\endgroup$
1
  • $\begingroup$ number of nodes in a binary tree of size n is (2 ^ (h+1) ) - 1 $\endgroup$ Commented Nov 19, 2020 at 5:57

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged .