Title: Heaps and the Heapsort
1Heaps and the Heapsort
- Heaps and priority queues
- Heap structure and position numbering
- Heap structure property
- Heap ordering property
- Removal of top priority node
- Inserting a new node into the heap
- The heap sort
- Source code for heap sort program
2Heaps and priority queues
- A heap is a data structure used to implement an
efficient priority queue. The idea is to make it
efficient to extract the element with the highest
priority, i.e. the next item in the queue to be
processed.
- We could use a sorted linked list, with O(1)
operations to remove the highest priority node
and O(N) to insert a node. Using a tree structure
makes both operations O(log2 N), which is faster
overall because insertion no longer costs O(N).
3Heap semantics
- The usage of the term heap to describe a tree
partially ordered from top to bottom is unrelated to
usage of the same term for the pool of memory available
for dynamic allocation, e.g. using malloc().
- It does relate to the winner of a competitive
process, e.g. a football league, as being "at the
top of the heap".
4Heap structure and position numbering 1
- A heap can be visualised as a binary tree in
which every layer is filled from the left. For
every layer to be full, the tree would have to
have a size exactly equal to 2^n - 1, where n is
the number of layers, e.g. a value for size in the
series 1, 3, 7, 15, 31, 63, 127, 255 etc.
- So to be practical enough to allow for any
particular size, a heap has every layer filled
except for the bottom layer which is filled from
the left.
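The relationship between heap size and layers can be sketched in C as follows. This is only an illustration of the 2^n - 1 series above; the function names full_heap_size, heap_layers and bottom_layer_count are made up for this example and do not come from the original notes.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Number of nodes in a tree whose 'layers' layers are all full: 2^layers - 1. */
size_t full_heap_size(unsigned layers)
{
    assert(layers < sizeof(size_t) * CHAR_BIT);  /* avoid shifting too far */
    return ((size_t)1 << layers) - 1;
}

/* Number of layers needed to hold 'size' nodes (0 for an empty heap). */
size_t heap_layers(size_t size)
{
    size_t layers = 0;
    while (size > 0) {      /* each halving moves up one layer */
        size /= 2;
        layers++;
    }
    return layers;
}

/* Number of nodes in the (possibly partial) bottom layer. */
size_t bottom_layer_count(size_t size)
{
    if (size == 0)
        return 0;
    /* Everything above the bottom layer is a full tree. */
    return size - full_heap_size((unsigned)(heap_layers(size) - 1));
}
```

For example, a heap of 10 nodes needs 4 layers: the top 3 layers hold 7 nodes and the bottom layer holds the remaining 3, filled from the left.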
5Heap structure and position numbering 2
6Heap structure and position numbering 3
In the above diagram nodes are labelled based on
position, and not their contents. Also note that
the left child of each node is numbered node × 2
and the right child is numbered node × 2 + 1. The
parent of every node is obtained using integer
division (throwing away the remainder), so that
node i's parent has position i/2.
Because this numbering system makes it very
easy to move between nodes and their children or
parents, a heap is commonly implemented as an
array with element 0 unused.
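The position arithmetic can be written as three small C helpers. This is a minimal sketch assuming the array layout described above (root at position 1, element 0 unused); the names heap_left, heap_right and heap_parent are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Position of the left child of the node at position i. */
size_t heap_left(size_t i)
{
    return 2 * i;
}

/* Position of the right child of the node at position i. */
size_t heap_right(size_t i)
{
    return 2 * i + 1;
}

/* Position of the parent of the node at position i.
   Integer division throws away the remainder, so both children
   of a node map back to the same parent. The root has no parent. */
size_t heap_parent(size_t i)
{
    assert(i > 1);
    return i / 2;
}
```

For instance, node 3 has children at positions 6 and 7, and both of those positions give 3 as their parent.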
7Heap structure property
- For a heap based on the above structure to be
maintained, every layer must be complete except
the bottom layer, which must be filled from the
left and have items removed from the right.
- In order to insert items elsewhere and remove
items from the top, localised rearrangements
along single branches will be made to restore this
structure.
8Heap ordering property 1
- A data structure with the shape described above
becomes useful if the data within it is organised so
that the key of every node is smaller than or equal
to the keys of its 2 (or sometimes one) children. A
child with a key smaller than its parent's would
violate this condition.
- When a heap is organised like this, it can be
useful as a priority queue, because the lowest
key will always be at the top of the heap and
easiest to remove. This is called a min heap.
This ordering property is reversed (a max heap)
if the highest key should always be removed first.
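One way to make the ordering property concrete is a function that checks it. The sketch below assumes integer keys stored in an array with element 0 unused; the name is_min_heap is hypothetical and not part of the original notes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if keys[1..size] satisfies the min heap ordering
   property, i.e. no child has a key smaller than its parent's.
   keys[0] is unused. */
bool is_min_heap(const int keys[], size_t size)
{
    assert(keys != NULL);
    /* Position 1 is the root; every other node has its parent at i / 2. */
    for (size_t i = 2; i <= size; i++) {
        if (keys[i] < keys[i / 2])
            return false;   /* child smaller than parent: violation */
    }
    return true;
}
```

Reversing the comparison to `keys[i] > keys[i / 2]` would check the max heap property instead.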
9Heap ordering property 2
10Removal of top priority node 1
- The rest of these notes assume a min heap will be
used.
- Removal of the top node creates a hole at the top
which is "bubbled" downwards by moving the smaller
of the values below it upwards, until the hole is in
a position where it can be filled with the rightmost
node from the bottom layer. This process restores
the heap ordering property.
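The removal procedure might be sketched in C as below. This is not the source code from the later slides; it assumes integer keys in an array with element 0 unused, and the name heap_delete_min is chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Removes and returns the smallest key of a min heap stored in
   keys[1..*size] (keys[0] unused). The heap must not be empty. */
int heap_delete_min(int keys[], size_t *size)
{
    assert(*size >= 1);
    int min = keys[1];             /* top of the heap */
    int last = keys[(*size)--];    /* rightmost node of the bottom layer */
    size_t hole = 1;               /* the hole starts at the top */

    while (2 * hole <= *size) {    /* while the hole has at least one child */
        size_t child = 2 * hole;
        /* Pick the smaller of the two children, if there are two. */
        if (child < *size && keys[child + 1] < keys[child])
            child++;
        if (keys[child] >= last)
            break;                 /* last can go here without violating order */
        keys[hole] = keys[child];  /* move the child up, the hole down */
        hole = child;
    }
    keys[hole] = last;
    return min;
}
```

Each pass of the loop moves the hole down one layer, so the work is bounded by the number of layers, which is where the O(log2 N) cost comes from.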
11Removal of top priority node 2
Figures 6.6-6.11 from "Data Structures and Algorithm
Analysis in C", 2e, M.A. Weiss.
12Removal of top priority node 3
13Removal of top priority node 4
14Inserting a node into the heap 1
- To insert a node into the heap, a hole is first
created at the next right position available
within the bottom layer. If the bottom layer is
full, a new layer is started from the left. If an
array is used to implement the heap, a size check
must first be performed to avoid array overflow.
- Parent values larger than the value to be inserted
are moved down into the hole, so the hole "bubbles
up" until it reaches a position where it can receive
the value to be inserted while maintaining the
heap ordering property.
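A minimal C sketch of the insertion steps follows, again assuming integer keys in an array with element 0 unused. The function name heap_insert and the capacity parameter (the largest usable array position) are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Inserts key into a min heap stored in keys[1..*size] (keys[0] unused).
   capacity is the largest position the array can hold.
   Returns false, leaving the heap unchanged, if the array is full. */
bool heap_insert(int keys[], size_t *size, size_t capacity, int key)
{
    assert(keys != NULL && size != NULL);
    if (*size >= capacity)
        return false;              /* size check to avoid array overflow */

    size_t hole = ++(*size);       /* next free position in the bottom layer */

    /* While the parent is larger than the new key, move the parent
       down into the hole so that the hole bubbles up. */
    while (hole > 1 && keys[hole / 2] > key) {
        keys[hole] = keys[hole / 2];
        hole /= 2;
    }
    keys[hole] = key;
    return true;
}
```

The new key is written only once, at the hole's final position, which avoids the repeated swaps a naive version would perform.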
15Inserting a node into the heap 2
16Inserting a node into the heap 3
17The heap sort
- Using a heap to sort data involves performing N
insertions followed by N delete min operations as
described above. Memory usage will depend upon
whether the data already exists in memory or
whether the data is on disk. Allocating the array
to be used to store the heap will be more
efficient if N, the number of records, can be
known in advance. Dynamic allocation of the array
will then be possible, and this is likely to be
preferable to preallocating the array.
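The outline above (N insertions followed by N delete min operations, with the heap array allocated dynamically once N is known) could look roughly like this in C. It is a sketch under the assumption that the data is already in memory as an array of int; it is not the program shown on the source code slides, and all function names are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Insert key into the min heap in heap[1..*size]; the caller
   guarantees that there is room for one more element. */
static void sort_heap_insert(int heap[], size_t *size, int key)
{
    size_t hole = ++(*size);
    while (hole > 1 && heap[hole / 2] > key) {
        heap[hole] = heap[hole / 2];   /* parent moves down */
        hole /= 2;
    }
    heap[hole] = key;
}

/* Remove and return the smallest key from heap[1..*size]. */
static int sort_heap_delete_min(int heap[], size_t *size)
{
    int min = heap[1];
    int last = heap[(*size)--];
    size_t hole = 1;
    while (2 * hole <= *size) {
        size_t child = 2 * hole;
        if (child < *size && heap[child + 1] < heap[child])
            child++;                   /* use the smaller child */
        if (heap[child] >= last)
            break;
        heap[hole] = heap[child];      /* child moves up */
        hole = child;
    }
    heap[hole] = last;
    return min;
}

/* Sorts data[0..n-1] into ascending order.
   Returns 0 on success, -1 if the heap array could not be allocated. */
int heap_sort(int data[], size_t n)
{
    assert(data != NULL || n == 0);
    /* n + 1 elements because position 0 of the heap array is unused. */
    int *heap = malloc((n + 1) * sizeof *heap);
    if (heap == NULL)
        return -1;

    size_t size = 0;
    for (size_t i = 0; i < n; i++)     /* N insertions */
        sort_heap_insert(heap, &size, data[i]);
    for (size_t i = 0; i < n; i++)     /* N delete min operations */
        data[i] = sort_heap_delete_min(heap, &size);

    free(heap);
    return 0;
}
```

Since each insertion and each delete min costs O(log2 N), the whole sort costs O(N log2 N), at the price of one extra array of N + 1 elements.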
18Source code 1
19Source code 2
20Source code 3
21Source code 4
22Source code 5
23Source code 6
24Source code 7