Heaps

Binary Heaps

Hello people…! In this post I will talk about Binary Heaps, or, more casually called as Heaps. Binary Heaps are used to implement Priority Queues which are associated with Scheduling. Binary Heaps is a starting step to learn many advanced heap related Data Structures that are used in many algorithms. So let’s understand the need of the situation by taking up a scenario.

Let’s say that the CPU has to execute ‘n’ programs that take their own pre-defined time to get completed. Then one strategy to schedule them is the Shortest Remaining Processing Time (SRPT) first method. As the name suggests, we look at all the available programs, then pick up that program which takes the minimum time to get executed. After completing that program, then the next minimum is supposed to be picked up. So if we observe what operations are we doing on the data we have, they would be –

  • Finding the minimum of given elements.
  • Deleting the minimum element (after a program gets executed).
  • Inserting a new element (if a new program enters the scheduling).

So, we must choose such a Data Structure which takes the minimum to these operations so that we can come up with an efficient solution to the scenario. Binary Heap is one such Data Structure that does these operations in pretty fast time. Now, what exactly is the Binary Heap…? What does it look like…? The Binary Heap is actually a Binary Tree has two very important properties –

  • Structural Property – The value of every node in the Binary Tree should be less than or equal to the values of its children nodes.
  • Heap Property – The Binary Tree should be fully-filled up to the 2nd last level and the last level must be “left-filled”.

To understand these two properties let’s take up an example –

Heap Property in Binary Heaps
Heap Property in Binary Heaps

So, as we can see, the last level is what is called “left-filled”, i.e., all the nodes are to the left and all the empty spaces to the right. The heap property is also depicted. The parent is smaller than both of its children. Now, having introduced this structure to you. Can you think where the smallest element of the heap would be….? At the top of course…! This is due to the very definition of the Heap Property. If the parent has the lower number, then the ancestor of all the nodes should surely have the least number. But this does not say that the leaves at the last level have the greatest number. Think about it for a while looking at the picture and you will understand it.

Now, the Binary Heap is although depicted as a tree, it is implemented as an array…! Just like the Fenwick Tree. The Binary Tree structure of the allows this. Look at the sketch below –

Storing a Binary Heap
Storing a Binary Heap

As, you can see, the nodes of the heap are put into an array in the left-to-right order at each level. Why is this so…? This is the way we can access all the nodes in a heap. How…? If we take any element arr[i] –

  • Parent of Element at i → arr[ i2 ]
  • Left Child of Element at → arr[(2 ✕ i)]
  • Right Child of Element at → arr[(2 ✕ i) + 1]

Looking at the picture, we can verify the above statements. For example, take the node 9 (arr[6]) which is at level 3. Its parent is arr[ 62 ] which is arr[3], i.e., node 6 at level 2, as it is in the structure. The left child of arr[6] is arr[(2 ✕ 6)], which is, arr[12], node 20 and like this even the right child can be verified. The way to find the parent is actually floor( i2 ), but as both ‘i’ and 2 are integers the result comes as it is desired. Now, we must discuss the five important operations in a binary heap –

  • Getting minimum element
  • Inserting a new element
  • Heapify
  • Deleting a new Element
  • Build Heap

Minimum Element – As you can clearly see, the minimum element would be the topmost element of the heap which is the first element of the array. So this operation takes O(1) time.

Inserting an Element – Recall that the last level of the heap must be left-filled. So, where ever you add a node, the resulting structure should be such that the last level will have an extra element. We do this, by first adding the new element to the end, then to ensure that the Structural Property is satisfied we keep swapping the element with its parent until the parent node to be swapped is less than the inserted element. So, we kind of insert the element at the bottom and we “move up” the heap. This process is illustrated in the sketch below.

Inserting an element into a Binary Heap
Inserting an element into a Binary Heap

This is how we insert a node into a heap. In the worst case, we would have to travel all the way up which costs us (log n) swaps, which is the height of the tree. So, the insertion operation has a complexity of O(log n).

Heapify – The heapify operation is as the name suggests, converting some “non-heap” thing into a heap. Well, what this non-heap thing is that it is a parent node, where the left and the right children are heaps, but the structure as a whole isn’t a heap. How is that possible…? This parent node we are talking about, its value could be greater than one or both of its children. In such a case, the smaller valued child is swapped with the parent and this is carried on down the tree until the heap property holds true for the whole structure. To understand this, observe the sketch below.

Heapify in Binary Min Heap

The process of heapify is clearly shown in the sketch, how we keep swapping the child nodes till the heap property is satisfied for the whole structure. This too takes O(log n) time in the worst case, when we would have to travel the whole till the leaves. Where is this process used anyway…? It is used to build a heap, which is our next discussion.

Deleting an Element – Deleting an element is a little tricky. Structurally, when an element is deleted, the last level will lose the right-most node. So what we do is, we swap the last element and the node to be deleted, then remove the last element. Based on the newly swapped element, we either bubble it up (as we did it in Insertion) or we bubble it down (which is simply to use heapify on the swapped element). Why would we get these two cases? Let us take some examples to understand this better. Study the sketch below –

Deletion in Binary Min Heap

So this is the case where we would have to call heapify on the swapped node. As we can see, this case arises when the swapped element is greater than both its children (greater than its parent). Now, the sketch below depicts the other case where we would have to bubble it upwards like we did it in insertion –

binary-min-heap-delete-bubble-up

So this case arises when the swapped element is lesser than its parent. So, while deletion the conditions are –

if (swappedElement > parent) {
	heapify(swappedElement)
} else {
	bubble up swappedElement
}

Build Heap – Build heap is constructing the heap from a given set of randomly arranged nodes in a binary tree. The build heap process uses the heapify to construct the heap “down-to-up”. Now, if you remember the principle of the heapify process, we were able to do that only because the left-subtree and the right-subtree were heaps. Think of the second last level in a heap, the left and the right nodes are leaves, i.e., they have no children, so, they are heaps by themselves. Why is a leaf by itself a heap…? This is because the left child and the right child don’t exist, so we can claim that the parent node (the leaf), satisfies the heap property. And for a leaf, the structural property also holds good because the second-last level, (the level of the leaf node), is fully filled, and the last level (the level of the leaf’s children), is left-filled (actually there are no nodes to be left-filled). Well, most probably, you won’t understand this in the first reading, give it another couple of readings and observe the picture below.

Build Heap Step-by-Step
Build Heap Step-by-Step

I hope the process of the Build Heap is clear. At every level we use the heapify process to gradually build the heap “down-up”. So what is the asymptotic time complexity of Build Heap…? We use heapify for all the nodes except the nodes in the last level,

Nodes till the 2nd Last level = 12 ✕ (Total number of Nodes ⇒ N)
Complexity of the Heapify procedure = O(log N)
Complexity of Build Heap = O(N log N)

But we can argue that the Build Heap process will only take O(N). Now, look at the above sketch once again. If you feel the text is too small, open the image by right-clicking, its bigger than you think…! Now,

The (n + 1)4 nodes at level 3 took 1 swap ⇒ (n + 1)4 ✕ 1
The (n + 1)8 nodes at level 2 took 2 swaps ⇒ (n + 1)8 ✕ 2
The (n + 1)16 nodes at level 1 took 3 swaps ⇒ (n + 1)16 ✕ 3

As, you can see, we have a sort-of progression here. If we were to sum them up, taking (n + 1) common,

⇒ (N + 1) ✕ (14 + 28 + 316 + …)
⇒ (N + 1) ✕ Σ i2i
⇒ (N + 1) ✕ 2       { Σ i2i = 2 }
⇒ O(N)

Therefore, the Build Heap takes O(N) time. The principle behind this is that for most of the time Heapify works on smaller values of ‘N’. These are the most important operations of a Binary Heap. We have discussed the implementation of Binary Heaps using arrays and accessing the parent and child. Now, try to code the above-stated operations at least 3 times…! You have to try. There is no better way of knowing a Data Structure than trying to code it. After you have given your everything, if you succeed, you are brilliant…! If not, look at my code below and see for yourselves how close you were…!

    

One thing I would like to point out is that my heap is an array of structures. Just an integer would suffice, but I made it an array of structures so that it becomes a little generic. So, tomorrow, if you want to you this code as a Priority Queue, you needn’t change anything (or at most the names) in the code. Or, if you wanted to store more information about each node of the Binary Heap, you can simply add attributes to the struct.
Next, for my deletion operation, I swapped the last element and the element to be deleted and called the heapify procedure. This is because, the heapify procedure, checks if the Heap Property is violated and makes the necessary corrections. As the first element of the heap is the smallest, it can be accessed directly by heap[1]. Remember, the arrays in my code are 1-indexed, and all the loops and conditions go with it.
The code is pretty well tested. You can download the input file here.

There is one last topic to discuss before I conclude.

Max Heap – Whatever heaps we have been seeing till now are all Min Heaps. This is because the Heap Property is such that the minimum element is put on top. What if the Heap Property would just be the reverse of it…? Something like –

Heap Property – Every parent node must have a value greater than or equal to both of its child node.

In this case, by the Heap property, the maximum element would be put on the top and so will be for all the sub-trees or sub-heaps in the structure. This is a Max Heap. Actually, the very first heap that we used in our Build Heap sketch is a Max Heap. Have a close look at it and you will understand the difference. Max Heap differ from Min Heaps only by the Heap Property, due to which the other differences arise.

Whoo…! This has become a really long post…! Thanks a lot to stay with me till the end. I hope this helps you to get started with heaps. It may not be easy to code the Binary Heap, but that’s how you grow your skills in programming. You won’t learn anything if you sit for hours and write the Hello World program over-and-over again. You need to challenge yourself…! Feel free to comment your doubts..! Keep practicing…! Happy Coding…! 🙂

26 thoughts on “Binary Heaps

  1. While explaning build heap using images
    I think there is an invalid swap between (3) and (4) in the 3rd image at level 2.
    Anyways awesome tutorial as always 🙂

  2. Hi Vamsy,

    Can you tell some condition in which delete function holds true for while case present in else part.
    while (minHeap[index].value < minHeap[index / 2].value && index != 0)

    As it is minHeap i feel any swapped value from last leaf will always have a value greater than its parent.

    Regards,
    Shashi

  3. In BuildHeap procedure, I am not able to get the comment properly i.e ” Simply call heapify() for all nodes except the last one”, as according to me it should be “….for all levels except the last one”.
    Please do check it.
    Thanks for the post btw 🙂 😀

      1. Thank you so much for correcting it! 🙂
        One more thing, if I were to delete the node at index = 1, then in the DeleteVertex procedure, it will compare the value of node at index = 1 and node at index = 1/2 i.e 0, which is not present in the minHeap[] array, moreover the garbage value at index 0 can be smaller or greater than the swapped value at index 1, and if it turned out to be smaller, then the function will swap the nodes of these indices making garbage enter the heap.
        Please check this thing also.
        Thanks in advance 😀

          1. Thanks again 😀
            Please explain my doubt asked in Dijkstra’ Algorithm also ASAP, it would be highly helpful 🙂

  4. The delete section is badly messed up. It is simply false to say that you always sift up in the first step. The value from the end of the array might be greater than both the children of the deleted node. In that case, sifting down is required. It isn’t particularly similar to insertion.

  5. bro ur code has some minure bug in deleting part…take input 11 and d values as 5,7,9,100,99,11,13,200,101,119,120 nd delete 2 nd index (i.e 7) ….so aftr deletion d array shuld be ..5,99,9,100,119,11,13,200,101,120 but ur progrm is showing as 5,99,9,100,120,11,13,200,101,119…… according to me its because ur reducing the size twice in deletion part…

    1. Sorry for the late reply… But code does seem to give the expected output when I rtested it for the test case you gave me… But nontheless, the error you pointed out is correct… It has been rectified 🙂

    1. Lol…! 😛 … Thanks a lot..! 😀 … But, I can never be better than any professor… I wrote this post leisurely… On the other hand, I don’t think professors have the time to explain anything too descriptively… 😉

Leave a Reply