Binary Heaps

Implementing Priority Queues Effectively

Dec 03, 2024

Hey! Today, let’s discuss binary heaps and when they are helpful.

A binary heap is widely used for tasks that require quick access to the minimum or maximum element in a data structure. This data structure is a specialized version of a binary tree that follows two key properties:

Complete binary tree: All levels of the tree are fully filled except possibly the last, which is filled from left to right.
Heap property: It satisfies either:
- Min-Heap: Each parent node is smaller than its children.
- Max-Heap: Each parent node is larger than its children.

NOTE: To clarify, it’s not when we want fast access to the min and the max, it’s the min or the max. When we require the min, we use a min-heap; when we need the max, we use a max-heap.

Given its properties, a heap is the ideal data structure to implement priority queues1, where each element has an associated priority.

In most cases, a heap is backed by an array. If a node is at index i, we can calculate its relatives this way:

Left child is at index 2 * i + 1
Right child is at index 2 * i + 2
Parent is at index (i - 1) / 2

Here’s an example of a max-heap:

And its underlying array representation:

For example, 39 is at index 1:

Its left child is at index 2 * 1 + 1 = 3
Its right child is at index 2 * 1 + 2 = 4
Its parent is at index (1 - 1) / 2 = 0

A heap must support at least these operations:

isEmpty: Check whether the heap has no elements.
insert: Add an element, followed by a heapify up operation (we will mention the concept of heapify right after)
getMin (or getMax): Get the min (or max) element without removing it.
extractMin (or extractMax): Remove and return the min (or max) element, followed by a heapify down.

In terms of complexity, with n the number of elements in the heap:

insert: O(log n).
getMin (or getMax): O(1).
extractMin (or extractMax): O(log n).
Build a heap: O(n). While inserting elements individually seems to suggest an O(n * log n) cost, building a heap from an unsorted array can be done more efficiently using a heapify down strategy. This is an important property, that’s part of the beauty with heaps.

Whenever an element is inserted or removed, the heap must rearrange the elements in the array to satisfy the min or max heap property using the heapify process:

Heapify up (after insertion) compares the newly inserted element with its parent and swaps them if necessary.
Heapify down (after extraction or during heap construction) ensures that any element violating the heap property is swapped with its smallest (or largest, in a max-heap) child.

While it’s not essential to understand how heapify works, it's valuable to know it exists, as it explains why inserting or extracting elements is not O(1) but O(log n)—it requires running a heapify operation.

It’s quite frequent for developers to be confused about when to use heaps versus binary search trees (BST). Here are the benefits of a heap compared to a BST:

Structure initialization: As we discussed, building a binary heap is O(n); while it’s O(n * log n) for a BST.
Memory layout: A heap is implemented using an array, which has a better locality than a BST and is a more CPU cache-friendly data structure. Hence, a heap is often faster than a BST in practice.
Access time: Finding the min (the max) is O(1) with a heap and O(log n) with a BST.

NOTE: In a BST, smaller elements are consistently on the left, and larger ones are on the right. In a min-heap (max-heap), smaller (larger) elements are either on the left or on the right.

In conclusion, when the most frequent access pattern involves retrieving the minimum or maximum element in a dataset, we should consider using binary heaps. Their efficiency and properties make them ideal for priority queues and various algorithm implementations such as Dijkstra’s algorithm and heapsort.

Tomorrow, we will discuss one of the most important data structures: graphs.

A priority queue is an abstract data type. It’s the same when we use the concept of queue or stack, which are also abstract data types and are implemented using a concrete data type like an array.

Binary Heaps

Implementing Priority Queues Effectively

Comments