Title: More quicksort
1 More quicksort
- Bad splits at the root are worse than bad splits farther down the tree. A bad split at the root costs n and produces two subarrays of sizes n-1 and 0. Partitioning the n-1 subarray then costs n-1 and produces subarrays of sizes (n-1)/2 - 1 and (n-1)/2.
- The running time of quicksort, even when alternating between good and bad splits, is O(n lg n), but with a slightly larger constant.
2 Quick Sort HW 3
- Homework 3 forces a test between reality (your homework) and our mathematical models.
- You should have graphs showing the correlation between O(n lg n) and O(n^2).
3 Randomized quicksort
- The next idea is an improvement for quicksort, because the partition procedure is where the breakdown occurs when the data is sorted.
- We have assumed that all permutations of the input data are equally likely, but that is not likely to hold in the real world.
- We could obtain good average-case performance over all inputs by explicitly permuting the input. However, we will use random sampling, meaning that the pivot will be chosen randomly from the subarray A[p..r].
4 Randomized Quicksort pseudocode
- RANDOMIZED-PARTITION(A,p,r)
-   i ← RANDOM(p,r)
-   exchange A[r] ↔ A[i]
-   return PARTITION(A,p,r)
- RANDOMIZED-QUICKSORT(A,p,r)
-   if p < r
-     then q ← RANDOMIZED-PARTITION(A,p,r)
-       RANDOMIZED-QUICKSORT(A,p,q-1)
-       RANDOMIZED-QUICKSORT(A,q+1,r)
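As a concrete sketch, the pseudocode above can be written in C roughly as follows. PARTITION here is the usual last-element-pivot partition; the function names and the inclusive bounds p..r are conventions chosen for this example, not mandated by the text.

```c
#include <stdlib.h>

static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* PARTITION(A,p,r): pivot on A[r], return the pivot's final index */
static int partition(int A[], int p, int r) {
    int x = A[r];
    int i = p - 1;
    for (int j = p; j < r; j++)
        if (A[j] <= x)
            swap(&A[++i], &A[j]);
    swap(&A[i + 1], &A[r]);
    return i + 1;
}

/* RANDOMIZED-PARTITION: pick the pivot uniformly from A[p..r] */
static int randomized_partition(int A[], int p, int r) {
    int i = p + rand() % (r - p + 1);
    swap(&A[r], &A[i]);
    return partition(A, p, r);
}

/* RANDOMIZED-QUICKSORT over the inclusive range A[p..r] */
void randomized_quicksort(int A[], int p, int r) {
    if (p < r) {
        int q = randomized_partition(A, p, r);
        randomized_quicksort(A, p, q - 1);
        randomized_quicksort(A, q + 1, r);
    }
}
```

Whatever sequence of pivots RANDOM produces, the output is sorted; only the running time varies.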
5 Exercises
- Why do we analyze the average-case performance of a randomized algorithm and not its worst-case performance?
- During the running of the procedure RANDOMIZED-QUICKSORT, how many calls are made to the random-number generator RANDOM in the worst case? Answer in terms of Θ-notation.
6 A randomized version of Quicksort
- The hiring problem (chapter 5)
- New office assistant needed
- Previous attempts at hiring were unsuccessful
- Employment agency hired
- One candidate per day is sent over
- We will interview the candidate and decide whether or not to hire
- It costs a small fee to interview the applicant
7 The Hiring Problem (cont)
- To hire an applicant is more costly because
- We must fire the current office assistant
- We must pay a large fee to the employment agency
- We are committed to having the best possible person in the job.
- Therefore, if we interview an applicant who is better qualified than the current office assistant, we will fire the current one and hire the new one.
- We are willing to pay this cost but want to know what the price of the strategy is.
8 The Hiring Problem (cont)
- Assume candidates are numbered 1 to n
- When we interview candidate i, we assume that we are able to determine if candidate i is the best candidate seen so far.
9 The Hiring Problem - pseudocode
- HIRE-ASSISTANT(n)
-   best ← 0    ▷ candidate 0 is a least-qualified dummy candidate
-   for i ← 1 to n
-     do interview candidate i
-       if candidate i is better than candidate best
-         then best ← i
-           hire candidate i
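A minimal C sketch of HIRE-ASSISTANT, assuming the result of "interviewing" candidate i can be summarized as a non-negative score (higher is better). The scores array and the return value (the number of hires) are conveniences invented for this example.

```c
/* Returns how many hires are made, i.e., how often a new best appears. */
int hire_assistant(const int scores[], int n) {
    int best_score = -1;   /* candidate 0: a least-qualified dummy */
    int hires = 0;
    for (int i = 0; i < n; i++) {      /* interview candidate i (cost c_i) */
        if (scores[i] > best_score) {  /* better than the current assistant */
            best_score = scores[i];
            hires++;                   /* fire the old one, hire i (cost c_h) */
        }
    }
    return hires;
}
```

Candidates arriving in increasing order of quality trigger a hire every time; decreasing order triggers exactly one.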
10 Worst-Case Analysis
- In the worst case, we actually hire every candidate that we interview
- This occurs when the candidates come in increasing order of quality
- Interviewing has a low cost c_i
- Hiring has a higher cost c_h
- Let m be the number of people hired
- Then the total cost is O(n·c_i + m·c_h)
11 Probabilistic Analysis
- We must make assumptions or have knowledge about the input distribution
- We assume that each of the n! permutations occurs with equal probability.
- Stated without proof (Lemma 5.2): if the candidates are presented in a random order, algorithm HIRE-ASSISTANT has a total hiring cost of O(c_h ln n)
12 Worst-Case analysis of quicksort
- We now prove the assertion that the worst-case running time is Θ(n^2)
- For an input of size n, q ranges from 0 to n-1 because PARTITION produces two subproblems with total size n-1.
- T(n) = max_{0≤q≤n-1} (T(q) + T(n-q-1)) + Θ(n)
- Guess that T(n) ≤ cn^2 for some constant c; then by substitution into the recurrence
13 Worst-Case analysis of quicksort (cont)
- T(n) ≤ max_{0≤q≤n-1} (cq^2 + c(n-q-1)^2) + Θ(n)
-      = c · max_{0≤q≤n-1} (q^2 + (n-q-1)^2) + Θ(n)
- We can use a little calculus here; remember that for a local minimum the second derivative must be > 0 when the first derivative is 0.
14 Worst-Case analysis of quicksort (cont)
- Show the math on an auxiliary sheet. Since the expression is convex in q, we have the maximum at the endpoints of the range.
- We then have
- max_{0≤q≤n-1} (q^2 + (n-q-1)^2) ≤ (n-1)^2 = n^2 - 2n + 1
- T(n) ≤ cn^2 - c(2n-1) + Θ(n)
-      ≤ cn^2 (for c chosen large enough)
- The worst-case running time is Θ(n^2)
15 Expected running time
- Using indicator random variables in the proof, we state that
- the expected running time of RANDOMIZED-QUICKSORT is Θ(n lg n)
16 HeapSort (Chapter 6)
- The (binary) heap data structure is a nearly complete binary tree.
- Each node of the tree corresponds to an element of the array.
- The tree is completely filled on all levels except possibly the lowest
- heap-size[A] is the number of elements of the heap stored in the array A.
17 Heapsort
[Figure: a max-heap with values 16, 14, 10, 8, 7, 9, 3, 2, 4, 1, shown both as a binary tree and as an array indexed 1 to 10]
18 Heapsort
- Parent(i)
-   return ⌊i/2⌋
- Left(i)
-   return 2i
- Right(i)
-   return 2i + 1
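In C these index computations are one-liners; with the slides' 1-based array convention, integer division already gives ⌊i/2⌋. The function names are choices for this sketch.

```c
/* Heap navigation with 1-based indices, as in the slides. */
static int heap_parent(int i) { return i / 2; }
static int heap_left(int i)   { return 2 * i; }
static int heap_right(int i)  { return 2 * i + 1; }
```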
19 Heapsort
- There are two kinds of binary heaps
- A max-heap has the property that for every node i other than the root, A[Parent(i)] ≥ A[i]
- A min-heap has the property that for every node i other than the root, A[Parent(i)] ≤ A[i]
- The height of a node in a heap is defined as the number of edges on the longest simple downward path from that node to a leaf, and the height of a heap is the height of its root.
20 Heapsort
- MAX-HEAPIFY runs in O(lg n) time and is used to maintain the max-heap property
- BUILD-MAX-HEAP runs in O(n) time and produces a max-heap from an unordered array
- HEAPSORT runs in O(n lg n) time and sorts an array in place
- MAX-HEAP-INSERT, HEAP-EXTRACT-MAX, HEAP-INCREASE-KEY, and HEAP-MAXIMUM run in O(lg n) time and allow the heap to be used as a priority queue
21 Heapsort
- MAX-HEAPIFY(A,i)    ▷ assumes the subtrees rooted at Left(i) and Right(i) are max-heaps
-   l ← Left(i)
-   r ← Right(i)
-   if l ≤ heap-size[A] and A[l] > A[i]
-     then largest ← l
-     else largest ← i
-   if r ≤ heap-size[A] and A[r] > A[largest]
-     then largest ← r
-   if largest ≠ i
-     then exchange A[i] ↔ A[largest]
-       MAX-HEAPIFY(A,largest)
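A C sketch of MAX-HEAPIFY. To keep the slides' 1-based indices, the heap lives in A[1..heap_size] and A[0] is unused; the bounds checks mirror the pseudocode exactly.

```c
/* Sift A[i] down until the subtree rooted at i is a max-heap.
   Assumes the subtrees rooted at 2i and 2i+1 already are max-heaps. */
void max_heapify(int A[], int heap_size, int i) {
    int l = 2 * i;
    int r = 2 * i + 1;
    int largest = i;
    if (l <= heap_size && A[l] > A[i])       largest = l;
    if (r <= heap_size && A[r] > A[largest]) largest = r;
    if (largest != i) {
        int t = A[i]; A[i] = A[largest]; A[largest] = t;  /* exchange */
        max_heapify(A, heap_size, largest);               /* continue below */
    }
}
```

Running it on the array of the textbook's MAX-HEAPIFY example, the out-of-place value 4 at node 2 sinks to a leaf.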
22 Heapsort (heap-size[A] = 10)
[Figure: the action of MAX-HEAPIFY(A,2) on the heap with values 16, 4, 10, 14, 7, 9, 3, 2, 8, 1; panels (a)-(c) show the violating value 4 being exchanged downward until the max-heap property is restored]
23 Heapsort
- The running time of MAX-HEAPIFY satisfies
- T(n) ≤ T(2n/3) + Θ(1)
- where Θ(1) describes the time to fix up the relationships among the elements A[i], A[Left(i)], and A[Right(i)], and T(2n/3) arises from the fact that the procedure must call MAX-HEAPIFY for the subtree rooted at one of the children of node i, and the children's subtrees each have size at most 2n/3. The worst case occurs when the last row of the tree is exactly half full.
- By the master theorem, T(n) = O(lg n)
24 Building a Heap
- We use MAX-HEAPIFY in a bottom-up way to convert an array A[1..n], where n = length[A], into a max-heap.
- Note that A[(⌊n/2⌋+1)..n] are all leaves
- BUILD-MAX-HEAP goes through the non-leaf nodes of the tree and runs MAX-HEAPIFY on each one.
25 BUILD-MAX-HEAP pseudo code
- BUILD-MAX-HEAP(A)
-   heap-size[A] ← length[A]
-   for i ← ⌊length[A]/2⌋ downto 1
-     do MAX-HEAPIFY(A,i)
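The loop above in C, with MAX-HEAPIFY repeated so the sketch is self-contained (1-based indices, A[0] unused):

```c
static void max_heapify(int A[], int heap_size, int i) {
    int l = 2 * i, r = 2 * i + 1, largest = i;
    if (l <= heap_size && A[l] > A[i])       largest = l;
    if (r <= heap_size && A[r] > A[largest]) largest = r;
    if (largest != i) {
        int t = A[i]; A[i] = A[largest]; A[largest] = t;
        max_heapify(A, heap_size, largest);
    }
}

/* Nodes n/2+1..n are leaves, so heapify from n/2 down to the root. */
void build_max_heap(int A[], int n) {
    for (int i = n / 2; i >= 1; i--)
        max_heapify(A, n, i);
}
```

After the call, every node dominates its children, so the maximum sits at A[1].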
26 BUILD-MAX-HEAP loop invariants
- Initialization: prior to the first iteration of the loop, i = ⌊n/2⌋. Each node ⌊n/2⌋+1, ⌊n/2⌋+2, ..., n is a leaf and thus the root of a trivial max-heap.
- Maintenance: the children of node i are numbered higher than i, so they are both roots of max-heaps.
- Termination: at termination, i = 0. By the loop invariant, each node 1, 2, ..., n is the root of a max-heap.
27 Building a Heap
[Figure: the operation of BUILD-MAX-HEAP on the array A = (4, 1, 3, 2, 16, 9, 10, 14, 8, 7); panels (a)-(f) show MAX-HEAPIFY being called on successively smaller node indices i until the max-heap 16, 14, 10, 8, 7, 9, 3, 2, 4, 1 results]
28 Heapsort BUILD-MAX-HEAP
- A loose bound on BUILD-MAX-HEAP is argued as follows: each call to MAX-HEAPIFY costs O(lg n) time, and there are O(n) such calls. Thus, the running time is O(n lg n)
- A tighter bound is derived in the text; the essential result is that we can build a max-heap from an unordered array in linear time, O(n).
29 The Heapsort algorithm
- Use BUILD-MAX-HEAP to build a max-heap on the input array A[1..n], where n = length[A].
- Since the maximum element of the array is stored at the root A[1], it is easy to put it in its correct final position by exchanging it with A[n].
- Now discard node n from the tree.
30 The Heapsort algorithm (cont)
- We then use MAX-HEAPIFY to restore the heap, now one element smaller than before, and repeat this process until we are down to a heap of size 2.
- This takes O(n lg n) time, since the call to BUILD-MAX-HEAP takes O(n) and each of the n-1 calls to MAX-HEAPIFY takes O(lg n)
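Put together, the whole algorithm is a few lines of C (again 1-based, with A[0] unused; heapsort_array is just a name chosen for this sketch):

```c
static void max_heapify(int A[], int heap_size, int i) {
    int l = 2 * i, r = 2 * i + 1, largest = i;
    if (l <= heap_size && A[l] > A[i])       largest = l;
    if (r <= heap_size && A[r] > A[largest]) largest = r;
    if (largest != i) {
        int t = A[i]; A[i] = A[largest]; A[largest] = t;
        max_heapify(A, heap_size, largest);
    }
}

static void build_max_heap(int A[], int n) {
    for (int i = n / 2; i >= 1; i--)
        max_heapify(A, n, i);
}

/* HEAPSORT: repeatedly move the max to the end and shrink the heap. */
void heapsort_array(int A[], int n) {
    build_max_heap(A, n);                    /* O(n) */
    for (int i = n; i >= 2; i--) {
        int t = A[1]; A[1] = A[i]; A[i] = t; /* exchange A[1] <-> A[i] */
        max_heapify(A, i - 1, 1);            /* O(lg n) per call */
    }
}
```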
31 Heapsort
[Figure, panels (a)-(d): the operation of HEAPSORT; after each exchange of A[1] with A[i], node i is discarded and the heap is rebuilt on the remaining elements]
32 Heapsort
[Figure, panels (e)-(h): continued operation of HEAPSORT, with the sorted suffix 10, 14, 16 growing at the end of the array]
33 Heapsort
[Figure, final panels (j) and (k): the last exchanges of HEAPSORT and the resulting fully sorted array 1, 2, 3, 4, 7, 8, 9, 10, 14, 16]
34 Heapsort Application: priority queue
- A priority queue is a data structure for maintaining a set S of elements, each with an associated value called a key. A max-priority queue supports the following:
- INSERT(S,x) inserts the element x into set S. We could write this as S ← S ∪ {x}.
- MAXIMUM(S) returns the element of S with the largest key.
- EXTRACT-MAX(S) removes and returns the element of S with the largest key.
- INCREASE-KEY(S,x,k) increases the value of element x's key to k, where k ≥ the current key value.
35 Heapsort Application: priority queue
- A common application is scheduling jobs on a shared computer. The queue keeps track of jobs to be performed and their relative priorities.
- A min-priority queue is the mirror image of the max-priority queue. It supports the operations INSERT, MINIMUM, EXTRACT-MIN, and DECREASE-KEY.
36 Heapsort Application: priority queue
- The textbook discusses the idea of a handle. The exact meaning depends upon the application, but it could be thought of as a function pointer to the piece of code that needs to be executed at a certain priority.
- The procedure HEAP-MAXIMUM implements the MAXIMUM operation in Θ(1) time.
- HEAP-MAXIMUM(A)
-   return A[1]
37 Heapsort Application: priority queue
- The procedure HEAP-EXTRACT-MAX implements the EXTRACT-MAX operation
- HEAP-EXTRACT-MAX(A)
-   if heap-size[A] < 1
-     then error "heap underflow"
-   max ← A[1]
-   A[1] ← A[heap-size[A]]
-   heap-size[A] ← heap-size[A] - 1
-   MAX-HEAPIFY(A,1)
-   return max
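In C, with the heap in A[1..*heap_size] and the size passed by pointer so the procedure can shrink it; underflow checking is left to the caller, where the pseudocode raises an error:

```c
static void max_heapify(int A[], int heap_size, int i) {
    int l = 2 * i, r = 2 * i + 1, largest = i;
    if (l <= heap_size && A[l] > A[i])       largest = l;
    if (r <= heap_size && A[r] > A[largest]) largest = r;
    if (largest != i) {
        int t = A[i]; A[i] = A[largest]; A[largest] = t;
        max_heapify(A, heap_size, largest);
    }
}

/* HEAP-EXTRACT-MAX: caller must ensure *heap_size >= 1. */
int heap_extract_max(int A[], int *heap_size) {
    int max = A[1];
    A[1] = A[*heap_size];          /* move the last leaf to the root */
    (*heap_size)--;
    max_heapify(A, *heap_size, 1); /* restore the max-heap property */
    return max;
}
```

Repeated calls yield the keys in decreasing order, which is exactly the heart of HEAPSORT.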
38 Heapsort Application: priority queue
- HEAP-EXTRACT-MAX is O(lg n), since it performs a constant amount of work on top of the O(lg n) time for MAX-HEAPIFY.
- HEAP-INCREASE-KEY implements the INCREASE-KEY operation.
39 Heapsort Application: priority queue
- HEAP-INCREASE-KEY(A,i,key)
-   if key < A[i]
-     then error "new key is smaller than current key"
-   A[i] ← key
-   while i > 1 and A[Parent(i)] < A[i]
-     do exchange A[i] ↔ A[Parent(i)]
-       i ← Parent(i)
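In C (1-based; the error branch simply returns, where the pseudocode would signal an error):

```c
/* HEAP-INCREASE-KEY: set A[i] to key (key must not be smaller than A[i]),
   then float the value up while it exceeds its parent. */
void heap_increase_key(int A[], int i, int key) {
    if (key < A[i])
        return;                        /* "new key is smaller than current key" */
    A[i] = key;
    while (i > 1 && A[i / 2] < A[i]) { /* Parent(i) = i/2 */
        int t = A[i]; A[i] = A[i / 2]; A[i / 2] = t;
        i /= 2;
    }
}
```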
40 HEAP-INCREASE-KEY
[Figure, panels (a)-(d): the operation of HEAP-INCREASE-KEY on the heap 16, 14, 10, 8, 7, 9, 3, 2, 4, 1; the key of node i = 9 is increased from 4 to 15, and the new key then floats up past 8 and 14 until the max-heap property is restored]
41 Sorting in Linear Time (chapter 8)
- All sorts introduced so far, namely quicksort, mergesort, and heapsort, achieve the O(n lg n) bound: quicksort on average, and mergesort and heapsort in the worst case.
- All of these sorts are comparison sorts: they work by comparing keys.
42 Lower Bounds for Sorting
- We look first at a decision tree model for insertion sort.
- We assume all input elements are distinct
- With the above assumption, comparisons like a_i = a_j are useless, so all comparisons will be assumed to have the form a_i ≤ a_j, since this form is equivalent to any of the possible comparisons that could be made.
43 Decision Trees
- A decision tree is a full binary tree that represents the comparisons between elements
- Control, data movement, and all other aspects of the algorithm are ignored
- Each internal node is annotated by i:j, where i and j are indices of the ith and jth elements of the input sequence.
- Leaf nodes are labeled with permutations ⟨π(1), π(2), ..., π(n)⟩. Since there are n! possible permutations of the input, there must be n! leaves.
44 Decision Tree - Insertion Sort (3 elements)
[Figure: the decision tree for insertion sort on 3 elements; internal nodes compare 1:2, 2:3, and 1:3 with ≤ and > branches, and the six leaves are the permutations ⟨1,2,3⟩, ⟨2,1,3⟩, ⟨3,2,1⟩, ⟨2,3,1⟩, ⟨3,1,2⟩, ⟨1,3,2⟩]
45 Lower Bounds (cont)
- The length of the longest path from the root of a decision tree to any of its reachable leaves represents the worst-case number of comparisons that the corresponding sorting algorithm performs.
- Therefore, the worst-case number of comparisons for a given comparison sort algorithm is the height of its decision tree
- A lower bound on the heights of all decision trees in which each permutation appears as a reachable leaf is a lower bound on the running time of any comparison sort algorithm.
46 Lower Bounds (cont)
- Theorem 8.1
- Any comparison sort algorithm requires Ω(n lg n) comparisons in the worst case.
- Proof: Consider a decision tree of height h with l reachable leaves corresponding to a comparison sort on n elements. We have n! possible permutations of the input, each of which appears as some leaf. Therefore n! ≤ l.
47 Lower Bounds (cont)
- Since a binary tree of height h has no more than 2^h leaves, we have
- n! ≤ l ≤ 2^h
- Thus n! ≤ 2^h, and taking logarithms of both sides of the inequality,
- lg(n!) ≤ h lg 2, so h ≥ lg(n!)
- By equation 3.18, lg(n!) = Θ(n lg n), and therefore
- h = Ω(n lg n)
48 Counting Sort
- This sort assumes that each of the n input elements is an integer in the range 0 to k for some integer k. When k = O(n), the sort runs in Θ(n) time.
- The basic idea is to determine, for each input element x, the number of elements less than x. Then we can place x directly into its position in the output array.
- There is a slight adjustment needed when elements are not distinct, because they cannot all go into the same location.
49 Counting Sort (cont)
- In the following pseudo code, assume that the input is in array A[1..n], and thus length[A] = n. Two other arrays are required: B[1..n] holds the sorted output, and C[0..k] provides temporary working storage.
50 Counting Sort pseudo code
- COUNTING-SORT(A,B,k)
-   for i ← 0 to k
-     do C[i] ← 0
-   for j ← 1 to length[A]
-     do C[A[j]] ← C[A[j]] + 1
-   ▷ C[i] now contains the number of elements equal to i
-   for i ← 1 to k
-     do C[i] ← C[i] + C[i-1]
-   ▷ C[i] now contains the number of elements ≤ i
-   for j ← length[A] downto 1
-     do B[C[A[j]]] ← A[j]
-       C[A[j]] ← C[A[j]] - 1
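A C sketch, using 0-based arrays (so B[0..n-1] rather than the text's B[1..n]); the loop structure and the backward final pass are exactly those of the pseudocode:

```c
#include <stdlib.h>

/* COUNTING-SORT: A[0..n-1] holds integers in 0..k; sorted output goes to B. */
void counting_sort(const int A[], int B[], int n, int k) {
    int *C = calloc(k + 1, sizeof *C);  /* C[i] = 0 for 0 <= i <= k */
    for (int j = 0; j < n; j++)
        C[A[j]]++;                      /* C[i] = count of elements equal to i */
    for (int i = 1; i <= k; i++)
        C[i] += C[i - 1];               /* C[i] = count of elements <= i */
    for (int j = n - 1; j >= 0; j--) {  /* backward pass keeps the sort stable */
        B[C[A[j]] - 1] = A[j];
        C[A[j]]--;
    }
    free(C);
}
```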
51 Counting Sort Operation
[Figure: the operation of COUNTING-SORT on A = (2, 5, 3, 0, 2, 3, 0, 3) with k = 5; panels (a)-(f) show the count array C after the counting and prefix-sum loops, and the output array B[1..8] being filled from the right]
52 Counting Sort Times
- Overall, the time is Θ(k + n): the first for loop takes Θ(k), the next for loop takes Θ(n), the third for loop takes Θ(k), and the last for loop takes Θ(n).
- Therefore, the overall time is Θ(n + k).
- It beats Ω(n lg n) because it does NO comparisons at all. Rather, the actual values of the elements are used to index into an array.
- It is also STABLE: numbers with the same value appear in the output array in the same order as in the input array. (This is only important when satellite data is carried around with the elements being sorted.)
53 Radix Sort
- In the old days of computers (ca. 1970) there used to be card-sorting machines which sorted punched cards.
- The cards were organized into 80 columns with 12 rows per column, such that a hole could be punched in one of 12 places in a particular column.
- The sorters could place a card into one of 12 output bins, depending upon which hole was punched.
54 Radix Sort (cont)
- An operator could then gather the cards bin by bin and place them such that the cards with the first hole punched are on top of the cards with the second hole punched, and so on.
- For decimal digits, only 10 of the holes are used, representing digits 0-9
- A d-digit number would occupy d columns
55 Radix Sort (cont)
- The sorting machines could only sort on one column at a time, so some algorithm was necessary to sort n cards on d-digit numbers.
- What about sorting on the most significant digit first?
- That requires each of the 10 stacks generated to be sorted separately, which means keeping track of each of these separate piles.
56 Radix Sort (cont)
- The scheme is to sort on the least significant digit first (counterintuitively)
- The cards are then recombined into a single deck, with the cards in the 0 bin preceding the cards in the 1 bin, and so on.
- We then re-sort the deck in its entirety, but based upon the next least significant digit
- After making our way to the most significant digit, the entire deck is sorted.
57 Radix Sort (cont)
- Only d passes are required through the deck of cards.
- The sort is stable, but the operator has to be careful not to change the order of the cards.
58 Radix Sort Figure 8.3
- input:           329 457 657 839 436 720 355
- after ls digit:  720 355 436 457 657 329 839
- after mid digit: 720 329 436 839 355 457 657
- after ms digit:  329 355 436 457 657 720 839
59 Radix Sort (typical computer)
- Our typical computer is a sequential random-access machine
- We can use radix sort to sort information keyed by multiple fields
- Dates have 3 integer fields: month, day, year
- We could do a stable radix sort by sorting 3 times: on day, then month, then year.
60 Radix Sort pseudo code
- RADIX-SORT(A,d)
-   for i ← 1 to d
-     do use a stable sort to sort array A on digit i
- Lemma 8.3: Given n d-digit numbers in which each digit can take on up to k possible values, RADIX-SORT correctly sorts these numbers in Θ(d(n+k)) time.
- When d is constant and k = O(n), radix sort runs in linear time.
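A C sketch of RADIX-SORT for non-negative integers, using counting sort as the stable per-digit sort (decimal digits here, least significant first; the names are choices for this example):

```c
#include <stdlib.h>

/* One stable counting-sort pass on the decimal digit selected by exp
   (exp = 1 for the ones digit, 10 for the tens digit, ...). */
static void counting_sort_by_digit(int A[], int n, int exp) {
    int *B = malloc(n * sizeof *B);
    int C[10] = {0};
    for (int j = 0; j < n; j++)
        C[(A[j] / exp) % 10]++;
    for (int i = 1; i < 10; i++)
        C[i] += C[i - 1];
    for (int j = n - 1; j >= 0; j--)    /* backward pass => stable */
        B[--C[(A[j] / exp) % 10]] = A[j];
    for (int j = 0; j < n; j++)
        A[j] = B[j];
    free(B);
}

/* RADIX-SORT(A,d): sort on digits 1 (least significant) through d. */
void radix_sort(int A[], int n, int d) {
    int exp = 1;
    for (int i = 1; i <= d; i++, exp *= 10)
        counting_sort_by_digit(A, n, exp);
}
```

Sorting the card-deck numbers from Figure 8.3 takes d = 3 passes.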
61 Radix-Sort
- Lemma 8.4: Given n b-bit numbers and any positive integer r ≤ b, RADIX-SORT correctly sorts these numbers in Θ((b/r)(n + 2^r)) time.
- Proof: For a value r ≤ b, view each key as having d = ⌈b/r⌉ digits of r bits each. Each digit is an integer in the range 0 to 2^r - 1, so counting sort can be used with k = 2^r - 1. (For example, view a 32-bit word as having 4 digits of 8 bits each, so that b = 32, r = 8, k = 2^r - 1 = 255, and d = b/r = 4.) Each pass of counting sort takes Θ(n + k) = Θ(n + 2^r), and there are d passes, for a total of Θ(d(n + 2^r)) = Θ((b/r)(n + 2^r))
62 Radix Sort Comments
- Lots of comments in the book
- The text makes the point that if, for given values of n and b, we choose r to minimize the expression, radix sort comes close to Θ(n), which appears to beat quicksort's Θ(n lg n). However, there is a big caveat: the constant factors hidden in the Θ notation differ. Quicksort often uses hardware caches more effectively than radix sort, and the version of radix sort which uses counting sort as the intermediate stable sort does not sort in place, while other sorts (like quicksort) may.
63 Review: Mid-Term Results
- Nine students took the exam
- Average score: 80.11
- Sorted scores: 66, 67, 71, 78, 85, 85, 86, 90, 93
64 Data Structures and sets
- Sets are fundamental to computer science
- Mathematical sets are unchanging
- Sets manipulated by algorithms can grow, shrink, or change over time.
- Such sets are called dynamic sets.
- Algorithms may need various operations on sets, such as
- insert
- delete
- test for membership
65 Dynamic Sets
- In a typical implementation, each element is represented by an object whose fields can be both examined and manipulated if we have a pointer to the object.
- Some dynamic sets assume that one of the object's fields is an identifying key field.
- If all the keys are distinct, we can think of the dynamic set as a set of key values
- The object may also contain satellite data.
66 Operations on dynamic sets
- Search(S,k): a query that, given a set S and a key value k, returns a pointer x to an element in S such that key[x] = k, or NIL if no such element belongs to S.
- Insert(S,x): a modifying operation that augments the set S with the element pointed to by x. We usually assume that any fields in element x needed by the set implementation have already been initialized.
67 Operations on Dynamic Sets (cont)
- Delete(S,x): a modifying operation that, given a pointer x to an element in the set S, removes x from S. (Note that x is a pointer to an element, not a key value.)
- Minimum(S): a query on a totally ordered set S that returns a pointer to the element of S with the smallest key
- Maximum(S): a query on a totally ordered set S that returns a pointer to the element of S with the largest key
68 Operations on Dynamic Sets (cont)
- Successor(S,x): a query that, given an element x whose key is from a totally ordered set S, returns a pointer to the next larger element in S, or NIL if x is the maximum element.
- Predecessor(S,x): a query that, given an element x whose key is from a totally ordered set S, returns a pointer to the next smaller element in S, or NIL if x is the minimum element.
69 Overview of Part III
- Chapter 10: stacks, queues, linked lists, and rooted trees
- Chapter 11: hash tables
- Chapter 12: binary search trees
- Chapter 13: red-black trees
- Chapter 14: augmenting data structures
70 Chapter 10: stacks and queues
- Stacks and queues are dynamic sets in which the element removed by the delete operation is pre-specified.
- In a stack, the element deleted from the set is the one most recently inserted: the stack implements a last-in, first-out, or LIFO, policy.
- In a queue, the element deleted is always the one that has been in the set for the longest time: the queue implements a first-in, first-out, or FIFO, policy.
71 Stacks
- The insert operation on a stack is often called a push, and the delete operation is often called a pop.
- These names are allusions to physical stacks, such as the spring-loaded stacks of plates found in cafeterias.
72 Stack pseudo code
- STACK-EMPTY(S)
-   if top[S] = 0
-     then return TRUE
-     else return FALSE
- PUSH(S,x)
-   top[S] ← top[S] + 1
-   S[top[S]] ← x
73 Stack pseudo code (cont)
- POP(S)
-   if STACK-EMPTY(S)
-     then error "underflow"
-     else top[S] ← top[S] - 1
-       return S[top[S] + 1]
- Note: the pseudo-code does nothing about stack overflow, just underflow.
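The same stack in C, as an array plus a top index (top == 0 means empty, matching the pseudocode). The capacity and names are choices for this sketch, and overflow is still unchecked, as the note says:

```c
#define STACK_MAX 100

typedef struct {
    int top;               /* index of the current top; 0 means empty */
    int entry[STACK_MAX];  /* entry[1..top] in use, entry[0] unused */
} Stack;

int stack_empty(const Stack *S) { return S->top == 0; }

void push(Stack *S, int x) {  /* no overflow check, as in the slides */
    S->top = S->top + 1;
    S->entry[S->top] = x;
}

int pop(Stack *S) {           /* caller must check stack_empty first */
    S->top = S->top - 1;
    return S->entry[S->top + 1];
}
```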
74 Stacks: Figure 10.1
[Figure: an array implementation of a stack S; pushing 17 and 3 moves top[S] from 4 to 6, and a pop moves it back to 5]
75 Queues
- The insert operation on a queue is called ENQUEUE (a.k.a. append or put)
- The delete operation on a queue is called DEQUEUE (a.k.a. serve or get)
- The FIFO property of a queue causes it to act like a line of students waiting in the registrar's office.
76 Queue pseudo code
- ENQUEUE(Q,x)
-   Q[tail[Q]] ← x
-   if tail[Q] = length[Q]
-     then tail[Q] ← 1
-     else tail[Q] ← tail[Q] + 1
- DEQUEUE(Q)
-   x ← Q[head[Q]]
-   if head[Q] = length[Q]
-     then head[Q] ← 1
-     else head[Q] ← head[Q] + 1
-   return x
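The same circular queue in C with 0-based indices, so the wrap-around becomes a modulus. Like the pseudocode, this sketch does not detect full or empty; a count field or a spare slot would be needed for that:

```c
#define QSIZE 12

typedef struct {
    int head;          /* index of the next element to dequeue */
    int tail;          /* index where the next element is enqueued */
    int entry[QSIZE];
} Queue;

void enqueue(Queue *Q, int x) {
    Q->entry[Q->tail] = x;
    Q->tail = (Q->tail + 1) % QSIZE;   /* wrap around */
}

int dequeue(Queue *Q) {
    int x = Q->entry[Q->head];
    Q->head = (Q->head + 1) % QSIZE;   /* wrap around */
    return x;
}
```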
77 Queues: Figure 10.2
[Figure: a queue implemented in an array Q[1..12]; (a) initially head[Q] = 7 and tail[Q] = 12 with elements 15, 6, 9, 8, 4; (b) after ENQUEUE(Q,17), ENQUEUE(Q,3), and ENQUEUE(Q,5), tail[Q] has wrapped around to 3; (c) after DEQUEUE(Q) returns 15, head[Q] = 8]
78 Queues: a practical example
- This next group of slides is taken from a practical example of a queue implementation which was used on multiple printers at Hewlett-Packard Company.
- The purpose of this set of slides is to illustrate some of the subtleties which are present when trying to implement a defect-free queue.
- This code had tens of thousands of hours of testing, and all of the printer products were deemed shippable. Nevertheless, there were reports of rare problems.
79 Queues: a practical example
- The context for this queue code is that it was intended to be a fast, local queue implemented between some task code and an interrupt service routine (isr).
- The task code executed at a 5 Hz rate, i.e., every 200 ms. It essentially executed at a high priority compared to other tasks, then slept for 200 ms and ran again. Each time it executed, it placed either nothing in the queue or 3 commands in the queue.
- The isr code executed asynchronously to the task code. It was called at each zero crossing of the power line: 120 Hz here in the US, or 100 Hz in countries that use 50 Hz power. Even though the isr ran at 120 Hz, it simply returned on 5 out of 6 interrupts and only pulled commands out of the queue on every 6th interrupt, making it run effectively at 20 Hz.
80 Queues: a practical example
- Given that the isr emptying the queue executed at 20 Hz and the task filling the queue executed at 5 Hz, the queue was empty most of the time.
- The three commands placed in the queue were really two commands and a dummy framing command which indicated to the isr the end of a frame of commands, i.e., the end of the execution of the task.
- The isr would read the queue until either (1) it was empty or (2) it encountered a framing command.
81 Queues: some more detail
- Here are some declarations and the code itself to illustrate the details
- #define MAXQUEUE 50
- typedef struct queue_tag
- {
-   int front;              /* front of queue */
-   int rear;               /* rear of queue */
-   int entry[MAXQUEUE];
- } QUEUETYPE;
- QUEUETYPE myQueue;
82 Queues: some more detail
- static void InitializeQueue(QUEUETYPE *q)
-   { q->front = 0; q->rear = -1; }
- static boolean_t QueueEmpty(QUEUETYPE *q)
-   { return (((q->rear + 1) % MAXQUEUE) == q->front); }
- static boolean_t QueueFull(QUEUETYPE *q)
-   { return (((q->rear + 2) % MAXQUEUE) == q->front); }
83 Queues: some more detail
- static void PutQueue(int item, QUEUETYPE *q)
- {
-   boolean_t result;
-   result = QueueFull(q);
-   if (result == FALSE)
-   {
-     q->rear = (q->rear + 1) % MAXQUEUE;
-     q->entry[q->rear] = item;
-   }
- }
84 Queues: some more detail
- void GetQueue(int *itemPtr, QUEUETYPE *q)
- {
-   boolean_t result;
-   result = QueueEmpty(q);  /* details of what to do if    */
-   if (!result)             /* empty left off purposefully */
-   {
-     *itemPtr = q->entry[q->front];
-     q->front = (q->front + 1) % MAXQUEUE;
-   }
- }
85 Queues: some more detail
- The context is that InitializeQueue() gets called once at startup. Thereafter, PutQueue() and, by extension, QueueFull() are called only by the task.
- GetQueue() and, by extension, QueueEmpty() are called only by the isr.
- These work over millions of calls, but every once in a while (could be tens of thousands of printed pages) something happens in which the code thinks the queue is suddenly full.
86 Queues: a practical example
- Are there any ideas of what might be wrong?
- Let's look at the next slide, maestro
87 Queues: a practical example
Assume an interrupt occurs in PutQueue() between the rear pointer move and the store. QueueFull() returned FALSE, we incremented the rear pointer as shown, and GetQueue() is now executed repeatedly until the queue is empty or a framing command is reached.
[Figure: the queue array with stored entries and slots marked x that have not yet been stored; markers show front, rear, and "rear before store but after increment"]
88 Queues: a practical example
Eventually in the isr we get down to the situation shown below, which is a problem if the rear pointer has been moved but the item has not been stored: GetQueue() will remove an incorrect item, and then, when the isr returns, the item will be stored, but now the queue appears to be full because the front pointer gets incremented one more time.
[Figure: the same queue array, with the front pointer now caught up to the "rear before store but after increment" position]
89 Queues: a practical example
- This defect is currently in shipping products, but it happens so rarely that it is not deemed necessary to roll to a new release.
- This is a pragmatic, business-based decision which balances the work (cost and risk as well) against fixing the known defect.
- What will probably happen is that the fix will be made in the code base but not rolled in until and unless the code needs to be rolled for some other reason.
90 Linked Lists
- A linked list is a data structure in which the objects are arranged in a linear order.
- Unlike an array, the order is determined not by array indices but by a pointer in each object.
- A doubly linked list L is made of objects with a key field and two pointer fields, next and prev.
- Given an element x in the list, next[x] points to its successor in the list, while prev[x] points to its predecessor.
91 Linked Lists (cont)
- If prev[x] is NIL, there is no predecessor, so x is the head of the list.
- If next[x] is NIL, there is no successor, so x is the tail of the list.
- An attribute head[L] points to the first element of the list. If this is NIL, the list is empty (a null list).
- There are singly linked lists (only next pointers) and doubly linked lists.
92 Linked Lists (cont)
- A sorted linked list has its keys in increasing or decreasing order.
93 Linked List search pseudo code
- LIST-SEARCH(L,k)
-   x ← head[L]
-   while x ≠ NIL and key[x] ≠ k
-     do x ← next[x]
-   return x
- LIST-SEARCH searches a list of n objects in Θ(n) time in the worst case.
94 Linked List Insert pseudo code
- LIST-INSERT(L,x)
-   next[x] ← head[L]
-   if head[L] ≠ NIL
-     then prev[head[L]] ← x
-   head[L] ← x
-   prev[x] ← NIL
- The running time is O(1), since this splices x onto the head of the list.
95 Linked List Delete pseudo code
- LIST-DELETE(L,x)
-   if prev[x] ≠ NIL
-     then next[prev[x]] ← next[x]
-     else head[L] ← next[x]
-   if next[x] ≠ NIL
-     then prev[next[x]] ← prev[x]
- LIST-DELETE runs in O(1) time because the element to be deleted has already been found and no searching is involved.
96 Linked List Sentinels
- The code for LIST-DELETE could be simpler if we could ignore the boundary conditions at the head and tail of the list.
- LIST-DELETE'(L,x)
-   next[prev[x]] ← next[x]
-   prev[next[x]] ← prev[x]   (note the symmetry!)
- A sentinel is a dummy object that allows us to simplify boundary conditions.
97 Linked List Sentinels (cont)
- The sentinel nil[L] lies between the head and the tail. The attributes head[L] and tail[L] are no longer needed.
- An empty list consists of just the sentinel.
- next[nil[L]] refers to the head
- prev[nil[L]] refers to the tail
- LIST-SEARCH and LIST-DELETE also become simpler.
98 List-Search pseudo code
- LIST-SEARCH'(L,k)
-   x ← next[nil[L]]
-   while x ≠ nil[L] and key[x] ≠ k
-     do x ← next[x]
-   return x
99 List-Insert pseudo code
- LIST-INSERT'(L,x)
-   next[x] ← next[nil[L]]
-   prev[next[nil[L]]] ← x
-   next[nil[L]] ← x
-   prev[x] ← nil[L]
- Sentinels rarely reduce the asymptotic time bounds of data structure operations, but they can and do reduce constant factors.
- However, if they can tighten the code in a loop, they may reduce the coefficient of n or n^2 in the running time.
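A C sketch of the sentinel version. The sentinel is a node whose next and prev point back to itself when the list is empty; note that the delete operation needs no boundary cases at all. The function names are conventions chosen for this example:

```c
#include <stdlib.h>

typedef struct Node {
    int key;
    struct Node *next;
    struct Node *prev;
} Node;

/* Create an empty list: the sentinel nil[L] points at itself both ways. */
Node *list_create(void) {
    Node *nil = malloc(sizeof *nil);
    nil->next = nil;
    nil->prev = nil;
    return nil;
}

/* LIST-INSERT': splice x in right after the sentinel (at the head). */
void list_insert(Node *nil, Node *x) {
    x->next = nil->next;
    nil->next->prev = x;
    nil->next = x;
    x->prev = nil;
}

/* LIST-SEARCH': returns NULL if k is not present. */
Node *list_search(Node *nil, int k) {
    Node *x = nil->next;
    while (x != nil && x->key != k)
        x = x->next;
    return x == nil ? NULL : x;
}

/* LIST-DELETE': no head/tail special cases, thanks to the sentinel. */
void list_delete(Node *x) {
    x->prev->next = x->next;
    x->next->prev = x->prev;
}
```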
100 Multiple-array representation of objects
- Some languages, such as Fortran, do not have pointers. There are still ways of implementing linked data structures without explicit pointer types.
- The idea is to use parallel arrays next, key, and prev, as shown in the next slide.
- Generally, we tend to use languages that do have pointer capability!