Title: Heaps and basic data structures
1Heaps and basic data structures
- David Kauchak
- cs161
- Summer 2009
2Administrative
- Homework 2 due date extended to Fri. 7/10 at 5pm
- Midterm
- 7/20 in class. Closed book, etc.
- Review sessions
- SCPD students
- Discussion board: thanks!
3Quicksort partitions the good vs. the bad
4Quicksort average case take 2
[Figure: Quicksort recursion tree with cost cn at the top level, with levels labeled "bad split" and "good 50/50 split"]
We absorb the bad partition. In general, we
can absorb any constant number of bad partitions
5Quicksort partitions the good vs. the bad
- For Quicksort to absorb the cost of bad partitions, as n grows, the proportion of bad to good partitions cannot grow
- Why?
- If, as we increase n, we proportionately increase the number of good and bad partitions, then there is still a constant number of bad partitions to be absorbed by a given good partition
- If, however, as we increase n the proportion of bad partitions increases, then we can no longer absorb the cost of the bad partitions, since that cost depends on n
6Decision-tree model
- Full binary tree representing the comparisons between elements made by a sorting algorithm
- Internal nodes contain the indices to be compared
- Leaves contain a complete permutation of the input
- Tracing a path from root to leaf gives the correct reordering/permutation for a given input
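[Figure: decision tree fragment. An internal node compares indices 1:3. The leaf 1,3,2 reorders the input 3, 12, 7 to 3, 7, 12, and the leaf 2,1,3 reorders the input 7, 3, 12 to 3, 7, 12]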
7Comparison-based sorting
- Sorted order is determined based only on a comparison between input elements
- A[i] < A[j]
- A[i] > A[j]
- A[i] ≤ A[j]
- A[i] ≥ A[j]
- A[i] = A[j]
- This is why most built-in sorting approaches only require you to define the comparison operator (e.g. compareTo in Java)
- Can we do better than O(n log n)?
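As a rough illustration of sorting driven only by a comparison operator, here is a minimal Python sketch. The names compare, compare_by_length and sort_with_comparator are made up for this example; functools.cmp_to_key is the standard-library adapter that lets sorted() use a compareTo-style function.

```python
from functools import cmp_to_key


def compare(a, b):
    """Return a negative, zero, or positive number, like Java's compareTo."""
    return (a > b) - (a < b)


def compare_by_length(a, b):
    """A second, hypothetical ordering: shorter strings come first."""
    return len(a) - len(b)


def sort_with_comparator(items, cmp=compare):
    # sorted() learns about the order only through calls to cmp
    return sorted(items, key=cmp_to_key(cmp))
```

Swapping in a different comparison function changes the order without touching the sorting code.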
8A decision tree model
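[Figure: decision tree for sorting three elements. An internal node i:j compares A[i] with A[j]; leaves are permutations of the input indices]
1:2
  ≤ 2:3
      ≤ 1,2,3
      > 1:3
          ≤ 1,3,2
          > 3,1,2
  > 1:3
      ≤ 2,1,3
      > 2:3
          ≤ 2,3,1
          > 3,2,1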
9A decision tree model
[Figure: the same decision tree, traced over several slides for the input 12, 7, 3]
- Is 12 ≤ 7 or is 12 > 7? Take the > branch
- Is 12 ≤ 3 or is 12 > 3? Take the > branch
- Is 7 ≤ 3 or is 7 > 3? Take the > branch
- The leaf reached is 3, 2, 1, which reorders 12, 7, 3 to 3, 7, 12
18A decision tree model
[Figure: the same decision tree, traced over several slides for the input 7, 12, 3; the path ends at a leaf that reorders the input to 3, 7, 12]
24How many leaves are in a decision tree?
- Leaves must have all possible permutations of the input
- Input of size n: n! leaves
- What if the decision tree model didn't?
- Some input would exist that didn't have a correct reordering
25A lower bound
- What is the worst-case number of comparisons for
a tree?
[Figure: the decision tree for three elements, as on the previous slides]
26A lower bound
- The longest path in the tree, i.e. the height
[Figure: the decision tree for three elements, as on the previous slides]
27A lower bound
- What is the maximum number of leaves a binary tree of height h can have?
- A complete binary tree has 2^h leaves
- The tree needs at least n! leaves, so 2^h ≥ n!
- h ≥ log(n!), since log is monotonically increasing
- log(n!) = Ω(n log n), from hw1
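To make the bound concrete, the smallest height h with 2^h ≥ n! can be computed directly. This is a small sketch; the function name min_worst_case_comparisons is made up for illustration, and the result is a lower bound on worst-case comparisons, not the cost of any particular algorithm.

```python
import math


def min_worst_case_comparisons(n):
    """Smallest h with 2**h >= n!, i.e. ceil(log2(n!)), using exact integers."""
    leaves_needed = math.factorial(n)
    # For x >= 1, (x - 1).bit_length() equals ceil(log2(x))
    return (leaves_needed - 1).bit_length()
```

For three elements this gives 3, which matches the height of the decision tree on the earlier slides.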
28Can we do better than O(n logn) for sorting?
- What if I told you the maximum value k that any number could take
- and k = O(n)
- In some situations (like above) we can sort in Θ(n)
- counting sort
- radix sort
- bucket sort
- Leverage additional knowledge about the data besides comparisons
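A minimal counting sort sketch, assuming the inputs are plain non-negative integers no larger than a known k. This simplified variant rebuilds the output from the counts, so it is not the stable version needed when keys carry extra data.

```python
def counting_sort(values, k):
    """Sort integers in the range 0..k without comparing elements to each other."""
    counts = [0] * (k + 1)
    for v in values:
        counts[v] += 1              # tally how often each value occurs
    result = []
    for value, count in enumerate(counts):
        result.extend([value] * count)
    return result
```

The work is Θ(n + k), which is Θ(n) when k = O(n).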
29Why don't we hear about these more?
- Constants can be large and running times therefore may be larger for modest input sizes
- Cache friendliness
- Memory (Quicksort sorts in place)
- Hardware considerations
30Data Structures
- What is a data structure?
- Way of storing data that facilitates particular operations
- Dynamic set operations: for a set S
- Search(S,k): Does k exist in S?
- Insert(S,k): Add k to S
- Delete(S,x): Given a pointer/reference, x, to an element, delete it from S
- Min(S): Return the smallest element of S
- Max(S): Return the largest element of S
31Array
- Sequential locations in memory in linear order
- Elements are accessed via index
- Cost of operations
- Search(S,k): O(n)
- Insert(S,k): Θ(n)
- InsertIndex(S,k): Θ(1)
- Delete(S,x): Θ(n)
- Min(S): Θ(n)
- Max(S): Θ(n)
32Array
- Uses?
- constant time access of particular indices
33Linked list
- Elements are arranged linearly.
- An element in the list points to the next element in the list
- Cost of operations
- Search(S,k): O(n)
- Insert(S,k): Θ(1)
- InsertIndex(S,k): Θ(1)
- Delete(S,x): O(n)
- Min(S): Θ(n)
- Max(S): Θ(n)
34Linked list
- Uses?
- constant time insertion at the cost of linear
time access
35Double linked list
- Elements are arranged linearly.
- An element in the list points to the next element and the previous element in the list
- What does the back link get us?
- Θ(1) deletion
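A small sketch of why the back link gives constant-time deletion: given a reference to a node, both neighbors are reachable without walking the list. The class and method names here are illustrative, not from the slides.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.prev = None
        self.next = None


class DoublyLinkedList:
    def __init__(self):
        self.head = None

    def insert(self, key):
        """Insert at the front in constant time and return the new node."""
        node = Node(key)
        node.next = self.head
        if self.head is not None:
            self.head.prev = node
        self.head = node
        return node

    def search(self, key):
        """Linear scan; returns the node holding key, or None."""
        node = self.head
        while node is not None and node.key != key:
            node = node.next
        return node

    def delete(self, node):
        """Unlink node in constant time using its prev and next links."""
        if node.prev is not None:
            node.prev.next = node.next
        else:
            self.head = node.next
        if node.next is not None:
            node.next.prev = node.prev
        node.prev = node.next = None

    def keys(self):
        out, node = [], self.head
        while node is not None:
            out.append(node.key)
            node = node.next
        return out
```

With only a next pointer, delete would first have to search for the predecessor, which costs O(n).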
36Stack
- LIFO
- Picture the stack of plates at a buffet
- Can implement with an array or a linked list
37Stack
top
- LIFO
- Picture the stack of plates at a buffet
- Can implement with an array or a linked list
push(1)
push(2)
push(3)
pop() returns 3
pop() returns 2
pop() returns 1
38Stack
- Empty: check if stack is empty
- Array: check if top is at index 0
- Linked list: check if top pointer is null
- Runtime: Θ(1)
39Stack
- Pop: removes the top element from the list
- check if empty; if so, underflow
- Array: return element at top and decrement top
- Linked list: return and remove at front of linked list
- Runtime: Θ(1)
- Push: add an element to the list
- Array: increment top and insert element. Must check for overflow!
- Linked list: insert element at front of linked list
- Runtime: Θ(1)
40Stack
- Array or linked list?
- Array: more memory efficient
- Linked list: don't have to worry about overflow
- Uses?
- runtime stack
- graph search algorithms (depth first search)
- syntactic parsing (e.g. compilers)
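The array version of the stack can be sketched as follows. This is one possible layout, assuming a fixed capacity and a top field that counts stored elements (so top is 0 when the stack is empty); the class name ArrayStack is made up for the example.

```python
class ArrayStack:
    def __init__(self, capacity):
        self.items = [None] * capacity
        self.top = 0  # number of stored elements, also the next free slot

    def empty(self):
        return self.top == 0

    def push(self, value):
        if self.top == len(self.items):
            raise OverflowError("stack overflow")
        self.items[self.top] = value
        self.top += 1

    def pop(self):
        if self.empty():
            raise IndexError("stack underflow")
        self.top -= 1
        return self.items[self.top]
```

Every operation touches a constant number of cells, which is where the Θ(1) bounds come from.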
41Queue
head
tail
- FIFO
- Picture a line at the grocery store
- Can implement with array or double linked list
Enqueue(1) Enqueue(2) Enqueue(3)
1 2 3
Dequeue() Dequeue() Dequeue()
42Queue
- Operations
- Empty: Θ(1)
- Enqueue: add element to end of queue, Θ(1)
- Dequeue: remove element from the front of the queue, Θ(1)
- Uses?
- scheduling
- graph traversal (breadth first search)
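One way to get Θ(1) Enqueue and Dequeue from an array is to treat it as circular. The sketch below assumes a fixed capacity and tracks the head index plus a size; the class name ArrayQueue and this bookkeeping choice are illustrative.

```python
class ArrayQueue:
    def __init__(self, capacity):
        self.items = [None] * capacity
        self.head = 0  # index of the front element
        self.size = 0  # number of stored elements

    def empty(self):
        return self.size == 0

    def enqueue(self, value):
        if self.size == len(self.items):
            raise OverflowError("queue is full")
        tail = (self.head + self.size) % len(self.items)  # wraps around
        self.items[tail] = value
        self.size += 1

    def dequeue(self):
        if self.empty():
            raise IndexError("queue is empty")
        value = self.items[self.head]
        self.head = (self.head + 1) % len(self.items)
        self.size -= 1
        return value
```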
43Binary heap
- A binary tree where the value of a parent is greater than or equal to the value of its children
- Additional restriction: all levels of the tree are complete except the last
- Max heap vs. min heap
44Binary heap - operations
- Maximum(S): return the largest element in the set
- ExtractMax(S): return and remove the largest element in the set
- Insert(S, val): insert val into the set
- IncreaseElement(S, x, val): increase the value of element x to val
- BuildHeap(A): build a heap from an array of elements
45Binary heap - pointers
parent ≥ child
all nodes in a heap are themselves heaps
complete tree
level does not indicate size
46Binary heap - array
47Binary heap - array
16 14 10 8 7 9 3 2 4 1
1 2 3 4 5 6 7 8 9 10
48Binary heap - array
16 14 10 8 7 9 3 2 4 1
1 2 3 4 5 6 7 8 9 10
Left child of A[3]?
49Binary heap - array
16 14 10 8 7 9 3 2 4 1
1 2 3 4 5 6 7 8 9 10
Left child of A[3]?
2 · 3 = 6
50Binary heap - array
16 14 10 8 7 9 3 2 4 1
1 2 3 4 5 6 7 8 9 10
Parent of A[8]?
51Binary heap - array
16 14 10 8 7 9 3 2 4 1
1 2 3 4 5 6 7 8 9 10
Parent of A[8]?
floor(8/2) = 4
52Binary heap - array
53Identify the valid heaps
15, 12, 3, 11, 10, 2, 1, 7, 8
20, 18, 10, 17, 16, 15, 9, 14, 13
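The index arithmetic for the array layout, and a check of the max-heap property, can be sketched as below. The sketch assumes the 1-based indexing used on the slides; the helper names are chosen for illustration.

```python
def parent(i):
    return i // 2


def left(i):
    return 2 * i


def right(i):
    return 2 * i + 1


def is_max_heap(values):
    """values[0] holds A[1]; check A[parent(i)] >= A[i] for every i > 1."""
    a = [None] + list(values)  # shift so that a[1] is the root
    return all(a[parent(i)] >= a[i] for i in range(2, len(a)))
```

Calling is_max_heap on each candidate array is a quick way to check an answer worked out by hand.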
54Heapify
- Assume left and right children are heaps, turn
current set into a valid heap
55Heapify
- Assume left and right children are heaps, turn
current set into a valid heap
56Heapify
- Assume left and right children are heaps, turn
current set into a valid heap
find out which is largest: current, left or right
57Heapify
- Assume left and right children are heaps, turn
current set into a valid heap
58Heapify
- Assume left and right children are heaps, turn
current set into a valid heap
if a child is larger, swap and recurse
59Heapify
16 3 10 8 7 9 5 2 4 1
1 2 3 4 5 6 7 8 9 10
3
60Heapify
16 3 10 8 7 9 5 2 4 1
1 2 3 4 5 6 7 8 9 10
3
61Heapify
16 8 10 3 7 9 5 2 4 1
1 2 3 4 5 6 7 8 9 10
8
62Heapify
16 8 10 3 7 9 5 2 4 1
1 2 3 4 5 6 7 8 9 10
8
63Heapify
16 8 10 4 7 9 5 2 3 1
1 2 3 4 5 6 7 8 9 10
8
64Heapify
16 8 10 4 7 9 5 2 3 1
1 2 3 4 5 6 7 8 9 10
8
65Heapify
16 8 10 4 7 9 5 2 3 1
1 2 3 4 5 6 7 8 9 10
8
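The Heapify steps traced above can be sketched in Python. This is a minimal version assuming a 1-based array (index 0 unused) and an explicit heap size n; the parameter names are illustrative.

```python
def heapify(a, i, n):
    """Sift a[i] down, assuming both subtrees of i are already heaps.

    a is 1-based (a[0] is unused) and n is the number of heap elements.
    """
    left, right = 2 * i, 2 * i + 1
    largest = i
    if left <= n and a[left] > a[largest]:
        largest = left
    if right <= n and a[right] > a[largest]:
        largest = right
    if largest != i:
        # a child is larger: swap and recurse into that subtree
        a[i], a[largest] = a[largest], a[i]
        heapify(a, largest, n)
```

Run on the example array at index 2, it swaps 3 with 8 and then 3 with 4, as in the slide sequence.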
66Correctness of Heapify
- Remember: both children are valid heaps
- Three cases
- Case 1: A[i] (current node) is the largest
- parent is greater than both children
- both children are heaps
- current node is a valid heap
67Correctness of heapify
- Case 2: left child is the largest
- When Heapify returns
- Left child is a valid heap
- Right child is unchanged and therefore a valid heap
- Current node is larger than both children, since we selected the largest node of current, left and right
- current node is a valid heap
- Case 3: right child is the largest
- similar to above
68Running time of Heapify
- What is the cost of each call to Heapify?
- Θ(1)
- How many calls are made to Heapify?
- O(height of the tree)
- What is the height of the tree?
- Complete binary tree, except for the last level: O(log n)
69Binary heap - operations
- Maximum(S): return the largest element in the set
- ExtractMax(S): return and remove the largest element in the set
- Insert(S, val): insert val into the set
- IncreaseElement(S, x, val): increase the value of element x to val
- BuildHeap(A): build a heap from an array of elements
70Maximum
- Return the largest element from the set
- Return A[1]
71ExtractMax
- Return and remove the largest element in the set
72ExtractMax
- Return and remove the largest element in the set
?
76ExtractMax
- Return and remove the largest element in the set
77ExtractMax
- Return and remove the largest element in the set
Heapify
78ExtractMax
- Return and remove the largest element in the set
79ExtractMax running time
- Constant amount of work plus one call to Heapify: O(log n)
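A sketch of ExtractMax consistent with the running time above: a constant amount of work, then one call to Heapify. It assumes the common approach of moving the last leaf to the root before calling Heapify, and a 1-based Python list with index 0 unused; both are assumptions of this example.

```python
def heapify(a, i, n):
    """Sift a[i] down in the 1-based heap a of size n."""
    left, right = 2 * i, 2 * i + 1
    largest = i
    if left <= n and a[left] > a[largest]:
        largest = left
    if right <= n and a[right] > a[largest]:
        largest = right
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        heapify(a, largest, n)


def extract_max(a):
    """Remove and return the largest element of the 1-based max heap a."""
    n = len(a) - 1
    if n < 1:
        raise IndexError("heap underflow")
    largest = a[1]
    a[1] = a[n]  # move the last leaf to the root
    a.pop()      # shrink the heap by one element
    if n > 1:
        heapify(a, 1, n - 1)
    return largest
```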
80IncreaseElement
- Increase the value of element x to val
15
81IncreaseElement
- Increase the value of element x to val
15
82IncreaseElement
- Increase the value of element x to val
15
8
83IncreaseElement
- Increase the value of element x to val
15
84IncreaseElement
- Increase the value of element x to val
14
85IncreaseElement
- Increase the value of element x to val
86Correctness of IncreaseElement
- Why is it ok to swap values with parent?
87Correctness of IncreaseElement
- Stop when heap property is satisfied
88Running time of IncreaseElement
- Follows a path from a node to the root
- Worst case O(height of the tree)
- O(log n)
89Insert
6
90Insert
6
91Insert
propagate value up
6
92Insert
93Running time of Insert
- Constant amount of work plus one call to IncreaseElement: O(log n)
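IncreaseElement and Insert can be sketched together, since Insert is a constant amount of work plus one call to IncreaseElement. The sketch assumes a 1-based Python list (index 0 unused) and uses negative infinity as the placeholder value for the new leaf, which is an assumption of this example.

```python
import math


def increase_element(a, i, val):
    """Raise a[i] to val, then swap with the parent until the heap property holds."""
    if val < a[i]:
        raise ValueError("new value is smaller than the current value")
    a[i] = val
    while i > 1 and a[i // 2] < a[i]:
        a[i], a[i // 2] = a[i // 2], a[i]
        i //= 2


def insert(a, val):
    """Add a new last leaf holding -infinity, then increase it to val."""
    a.append(-math.inf)
    increase_element(a, len(a) - 1, val)
```

Both functions follow a single path from a node toward the root, so the work is bounded by the height of the tree.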
94Building a heap
- Can we build a heap using the functions we have so far?
- Maximum(S)
- ExtractMax(S)
- Insert(S, val)
- IncreaseElement(S, x, val)
95Building a heap
96Running time of BuildHeap1
- n calls to Insert: O(n log n)
- Can we get a better bound?
97Building a heap take 2
- Start with n/2 simple heaps (the leaves)
- call Heapify on elements floor(n/2), floor(n/2)-1, floor(n/2)-2, ..., 1
- all children have larger indices
- building from the bottom up makes sure that all the children are heaps
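The bottom-up construction described above can be sketched as follows, assuming 1-based indexing internally and a plain Python list as input. The function name build_heap is illustrative.

```python
def heapify(a, i, n):
    """Sift a[i] down in the 1-based array a with heap size n."""
    left, right = 2 * i, 2 * i + 1
    largest = i
    if left <= n and a[left] > a[largest]:
        largest = left
    if right <= n and a[right] > a[largest]:
        largest = right
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        heapify(a, largest, n)


def build_heap(values):
    """Return a max-heap ordering of values, built bottom-up."""
    a = [None] + list(values)  # shift to 1-based indexing
    n = len(values)
    # Elements n//2 + 1 .. n are leaves, so they are already simple heaps
    for i in range(n // 2, 0, -1):
        heapify(a, i, n)
    return a[1:]
```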
98
4 1 3 2 16 9 10 14 8 7
1 2 3 4 5 6 7 8 9 10
99heapify
4 1 3 2 16 9 10 14 8 7
1 2 3 4 5 6 7 8 9 10
100heapify
4 1 3 2 16 9 10 14 8 7
1 2 3 4 5 6 7 8 9 10
101heapify
4 1 3 14 16 9 10 2 8 7
1 2 3 4 5 6 7 8 9 10
102heapify
4 1 3 14 16 9 10 2 8 7
1 2 3 4 5 6 7 8 9 10
103heapify
4 1 10 14 16 9 3 2 8 7
1 2 3 4 5 6 7 8 9 10
104heapify
4 1 10 14 16 9 3 2 8 7
1 2 3 4 5 6 7 8 9 10
105heapify
4 16 10 14 7 9 3 2 8 1
1 2 3 4 5 6 7 8 9 10
106heapify
4 16 10 14 7 9 3 2 8 1
1 2 3 4 5 6 7 8 9 10
107heapify
16 14 10 8 7 9 3 2 4 1
1 2 3 4 5 6 7 8 9 10
108Correctness of BuildHeap2
109Correctness of BuildHeap2
- Invariant: elements A[i+1..n] are all heaps
- Base case: i = floor(n/2). All elements i+1, i+2, ..., n are simple heaps
- Inductive case: We know i+1, i+2, ..., n are all heaps, therefore the call to Heapify(A,i) generates a heap at node i
- Termination?
110Running time of BuildHeap2
- n/2 calls to Heapify: O(n log n)
- Can we get a tighter bound?
111Running time of BuildHeap2
all nodes at the same level will have the same cost
How many nodes are at level d?
2^d
112Running time of BuildHeap2
?
113Nodes at height h
height h: ≤ ceil(n/2^(h+1)) nodes
h = 2: ≤ ceil(n/8) nodes
h = 1: ≤ ceil(n/4) nodes
h = 0: ≤ ceil(n/2) nodes
114Running time of BuildHeap2
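The sum for this slide did not survive extraction. The following is the standard derivation from the node counts per height, offered as a sketch rather than the slide's exact wording. It assumes that Heapify on a node of height h costs O(h).

```latex
\begin{align*}
T(n) &\le \sum_{h=0}^{\lfloor \log n \rfloor} \left\lceil \frac{n}{2^{h+1}} \right\rceil O(h)
      = O\!\left( n \sum_{h=0}^{\lfloor \log n \rfloor} \frac{h}{2^{h}} \right) \\
     &\le O\!\left( n \sum_{h=0}^{\infty} \frac{h}{2^{h}} \right)
      = O(2n) = O(n)
\end{align*}
```

The last step uses the fact that the infinite sum of h/2^h converges to 2.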
115BuildHeap1 vs. BuildHeap2
- Runtime
- BuildHeap1 is O(n log n) (n calls to Insert); BuildHeap2 is O(n)
- BuildHeap2 also makes fewer calls (only n/2 calls to Heapify)
- Memory
- Both O(n)
- BuildHeap1 requires an additional array, i.e. 2n memory
- Complexity/Ease of implementation
116Heap uses
- Heapsort
- Build a heap
- Call ExtractMax for all the elements
- O(n log n) running time
- Priority queues
- scheduling tasks: jobs, processes, network traffic
- A* search algorithm
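The Heapsort outline above (build a heap, then extract the max for all the elements) can be sketched as below. This version is one common in-place variant: each extracted maximum is swapped to the end of the array instead of being returned, which is an implementation choice of this example.

```python
def heapify(a, i, n):
    """Sift a[i] down in the 1-based array a with heap size n."""
    left, right = 2 * i, 2 * i + 1
    largest = i
    if left <= n and a[left] > a[largest]:
        largest = left
    if right <= n and a[right] > a[largest]:
        largest = right
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        heapify(a, largest, n)


def heapsort(values):
    """Return the values in ascending order using a max heap."""
    a = [None] + list(values)  # 1-based working copy
    n = len(values)
    for i in range(n // 2, 0, -1):   # build the heap bottom-up
        heapify(a, i, n)
    for end in range(n, 1, -1):
        a[1], a[end] = a[end], a[1]  # move the current max behind the heap
        heapify(a, 1, end - 1)       # restore the heap on the remaining prefix
    return a[1:]
```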
117Other heaps