# Comp202: Principles of Object-Oriented Programming II Fall 2007 -- Lecture #37: Heaps and Heap Sort

### Heap Sort

• Heap Sort, like Selection Sort, is a hard-split, easy-join method.
• Think of Heap Sort as an improved (faster) version of Selection Sort.
• Specifically, split(), which finds the largest (or smallest) element in the subarray, is made to run in O(log n) steps instead of O(n) steps, where n is the subarray length.
• Since split() is performed n times, where n is the (overall) array length, Heap Sort takes O(n log n) steps.
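
For contrast, here is a minimal sketch of the Selection Sort style of split(), which scans the whole unsorted portion A[lo..hi] on every call. The class name `SelectionSketch` and the `sort()` driver loop are illustrative assumptions, not part of the course's ASorter framework.

```java
// Hypothetical stand-alone sketch; not the course's ASorter framework.
class SelectionSketch {
    // Linear-time split: scan A[lo..hi] for the largest element
    // and swap it into A[hi].  Returns the split index hi.
    static int split(int[] A, int lo, int hi) {
        int max = lo;
        for (int i = lo + 1; i <= hi; i++) {
            if (A[i] > A[max]) {
                max = i;
            }
        }
        int temp = A[hi];
        A[hi] = A[max];
        A[max] = temp;
        return hi;
    }

    // One O(n) split per element, so O(n^2) steps overall.
    static void sort(int[] A, int lo, int hi) {
        for (int end = hi; end > lo; end--) {
            split(A, lo, end);
        }
    }
}
```

Heap Sort keeps the same driver loop and replaces only the linear scan inside split().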

### How is split() sped up?

• The elements in the unsorted portion of the array are organized into a heap.  A heap is a data structure that is optimized for repeatedly finding and removing the largest (or smallest) element.

### What is a Heap?

• A "complete" tree is a minimum height tree with all the nodes on the lowest level in their left-most positions. That is, the tree is completely filled from the top with any extra elements strictly on one side (left) of the lowest level. Note that in a complete tree, there is at most a variation of 1 in path lengths from the root to the leaves.

• A heap is (conceptually) a binary tree that
1. is complete and
2. also exhibits the heap property:
• the root, if non-null, is the largest (or smallest) key in the tree, and
• its left and right subtrees are themselves heaps.
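
When the tree is stored level by level in A[lo..hi], completeness comes for free and only the ordering needs checking. The following is a minimal sketch of such a check, assuming the "smallest at the root" variant and the same child-index arithmetic that siftDown() uses in these notes. `HeapCheck` and `isMinHeap` are hypothetical names.

```java
// Hypothetical helper; assumes the heap is stored level by level in A[lo..hi].
class HeapCheck {
    // Returns true if every element of A[lo..hi] is no larger than
    // either of its children, i.e. the min-heap property holds.
    static boolean isMinHeap(int[] A, int lo, int hi) {
        for (int cur = lo; cur <= hi; cur++) {
            int left = 2 * cur + 1 - lo;  // index of left child of A[cur]
            int right = left + 1;         // index of right child of A[cur]
            if (left <= hi && A[left] < A[cur]) {
                return false;
            }
            if (right <= hi && A[right] < A[cur]) {
                return false;
            }
        }
        return true;
    }
}
```

For example, `{1, 3, 2, 9, 4}` passes the check, while `{9, 1, 2, 3, 4}` fails because the root is larger than its children.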

### Heap Sort: split()

```java
public int split(int[] A, int lo, int hi) {
    // Swap A[hi] and A[lo].
    int temp = A[hi];
    A[hi] = A[lo];
    A[lo] = temp;

    // Restore the heap property by "sifting down"
    // the element at A[lo].
    Heapifier.Singleton.siftDown(A, lo, lo, hi - 1);
    return hi;
}
```

### siftDown(): The Implementation

```java
public void siftDown(int[] A, int lo, int cur, int hi) {
    int dat = A[cur];             // hold on to data.
    int child = 2 * cur + 1 - lo; // index of left child of A[cur].
    boolean done = hi < child;    // no children: A[cur] is a leaf.

    while (!done) {
        if (child < hi && A[child + 1] < A[child]) {
            child++;
        } // child is the index of the smaller of the two children.
        if (A[child] < dat) {
            A[cur] = A[child];    // move the smaller child up.
            cur = child;
            child = 2 * cur + 1 - lo;
            done = hi < child;
        } else {                  // dat <= A[child].
            done = true;          // heap condition is satisfied.
        }
    }                             // location found for dat.
    A[cur] = dat;
}
```
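
The siftDown() above promotes the smaller child, so it maintains a heap with the smallest key at the root. As a sketch of the "largest at the root" variant mentioned earlier, the version below reverses both comparisons and otherwise follows the same steps. `MaxHeapSketch` and `siftDownMax` are hypothetical names, and the loop is written with `break` rather than a `done` flag purely as a stylistic choice.

```java
// Hypothetical max-heap variant of siftDown(); not from the course code.
class MaxHeapSketch {
    static void siftDownMax(int[] A, int lo, int cur, int hi) {
        int dat = A[cur];                 // hold on to data
        int child = 2 * cur + 1 - lo;     // index of left child of A[cur]
        while (child <= hi) {
            if (child < hi && A[child + 1] > A[child]) {
                child++;                  // child is the index of the larger child
            }
            if (A[child] <= dat) {
                break;                    // dat is at least as large as both children
            }
            A[cur] = A[child];            // promote the larger child
            cur = child;
            child = 2 * cur + 1 - lo;
        }
        A[cur] = dat;                     // location found for dat
    }
}
```

Sifting down the root of `{1, 9, 8, 3, 4}` with this variant yields `{9, 4, 8, 3, 1}`.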

### Initializing the Heap: HeapSorter()

In order to sort an array using the ordering capabilities of a heap, you must first transform the arbitrarily ordered data in the array into a heap.  This is called "heapifying" the array, and it can be accomplished by sifting down the elements.  Each sift-down costs at most O(log n) steps, so a simple bound on heapifying is O(n log n) time (a tighter analysis shows that sifting down from the bottom up is actually O(n)).  Either way, it only occurs once, so it has no impact on the overall O(n log n) complexity of the algorithm.

Note that we only really have to sift down half the array, i.e. half the "tree".  This is because a single-element array (tree) is already a heap, so we can bypass all the leaves and immediately start working on the layer right above the leaves.

```java
public class HeapSorter extends ASorter {
    public HeapSorter(int[] A, int lo, int hi) {
        for (int cur = (hi + lo + 1) / 2; cur >= lo; cur--) {
            Heapifier.Singleton.siftDown(A, lo, cur, hi);
        }
    }
    // etc. . .
}
```
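
To see the pieces working together, here is a self-contained sketch that heapifies A[lo..hi] and then calls split() once per element. Heapifier and ASorter are not shown in these notes, so the sketch inlines siftDown() as a static method; `HeapSortSketch` and `sort()` are hypothetical names. Because the siftDown() in these notes keeps the smallest key at the root and split() moves the root to A[hi], this particular combination leaves the array in descending order.

```java
// Hypothetical stand-alone sketch; the course code uses Heapifier and ASorter.
class HeapSortSketch {
    // Same logic as siftDown() in these notes (smallest key at the root).
    static void siftDown(int[] A, int lo, int cur, int hi) {
        int dat = A[cur];
        int child = 2 * cur + 1 - lo;
        boolean done = hi < child;
        while (!done) {
            if (child < hi && A[child + 1] < A[child]) {
                child++;
            }
            if (A[child] < dat) {
                A[cur] = A[child];
                cur = child;
                child = 2 * cur + 1 - lo;
                done = hi < child;
            } else {
                done = true;
            }
        }
        A[cur] = dat;
    }

    // Moves the root (smallest key) to A[hi], then restores the heap
    // in A[lo..hi-1].
    static int split(int[] A, int lo, int hi) {
        int temp = A[hi];
        A[hi] = A[lo];
        A[lo] = temp;
        siftDown(A, lo, lo, hi - 1);
        return hi;
    }

    static void sort(int[] A, int lo, int hi) {
        // Heapify: the leaves are already heaps, so start near the middle.
        for (int cur = (hi + lo + 1) / 2; cur >= lo; cur--) {
            siftDown(A, lo, cur, hi);
        }
        // One O(log n) split per element.
        for (int end = hi; end > lo; end--) {
            split(A, lo, end);
        }
    }
}
```

Calling `HeapSortSketch.sort(A, 0, 4)` on `{5, 2, 9, 1, 7}` produces `{9, 7, 5, 2, 1}`. Reversing the comparisons in siftDown() would give ascending order instead.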

### Inserting into an existing Heap: siftUp()

To insert a data element into an existing heap, we are forced to initially insert the element at the bottom of the tree, which is at the end of the array.  Since this may break the heap property, we need to "sift up" the data through the tree to find a spot for it where the overall heap property will be restored.  When sifting up, we are essentially taking a data element (starting with the one being inserted), comparing it to its parent, and then taking the largest (or smallest) of the pair, leaving the other in the parent's position.  The process repeats with the left-over data element and the next higher parent until the top of the heap is reached or the heap property already holds.

```java
/**
 * "Sifts" A[cur] up the array A to maintain the heap property.
 * @param A A[lo:cur-1] is a heap.
 * @param lo the low index of A.
 * @param cur lo <= cur <= the high index of A.
 */
public void siftUp(int[] A, int lo, int cur) {
    int dat = A[cur];
    int parent = (cur - lo - 1) / 2 + lo;  // index of parent.
    while (0 < (cur - lo) && dat < A[parent]) {
        A[cur] = A[parent];
        cur = parent;
        parent = (cur - lo - 1) / 2 + lo;
    }
    A[cur] = dat;
}
```
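
As a usage sketch, a heap can also be grown one element at a time by treating A[lo..cur-1] as the existing heap and A[cur] as the newly inserted element. The code below inlines siftUp() as a static method so that it runs on its own. `HeapInsertSketch` and `buildByInsertion` are hypothetical names, not part of the course code.

```java
// Hypothetical stand-alone sketch built around siftUp() from these notes.
class HeapInsertSketch {
    // Same logic as siftUp() in these notes (smallest key at the root).
    static void siftUp(int[] A, int lo, int cur) {
        int dat = A[cur];
        int parent = (cur - lo - 1) / 2 + lo;  // index of parent
        while (0 < (cur - lo) && dat < A[parent]) {
            A[cur] = A[parent];                // move the parent down
            cur = parent;
            parent = (cur - lo - 1) / 2 + lo;
        }
        A[cur] = dat;
    }

    // Inserts A[lo+1], A[lo+2], ..., A[hi] one at a time into the heap
    // that starts out as the single element A[lo].
    static void buildByInsertion(int[] A, int lo, int hi) {
        for (int cur = lo + 1; cur <= hi; cur++) {
            siftUp(A, lo, cur);
        }
    }
}
```

Applied to `{5, 2, 9, 1, 7}`, this yields the heap `{1, 2, 9, 5, 7}`, with the smallest key at the root.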

### Sample Code

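| Sort | Best-case Cost | Worst-case Cost |
|-----------|----------------|-----------------|
| Selection | O(n^2) | O(n^2) |
| Insertion | O(n) | O(n^2) |
| Heap | O(n log n) | O(n log n) |
| Merge | O(n log n) | O(n log n) |
| Quick | O(n log n) | O(n^2) |
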
• Selection sort performs the fewest swaps, O(n), in the worst case.
• Insertion sort is best if the array is nearly or already sorted.
• Heap sort performs a constant factor more comparisons than Merge sort.
• Merge sort requires extra storage proportional in size to the input.
• Quick sort typically (expected case) outperforms Heap and Merge sort because of its simplicity.

Last Revised Thursday, 03-Jun-2010 09:52:35 CDT

©2007 Stephen Wong and Dung Nguyen