Title: More quicksort
1 More quicksort
- Bad splits at the root are worse than bad splits farther down the tree.
- A bad split at the root costs n and produces two subarrays of sizes n-1 and 0. Then partitioning the (n-1)-element subarray costs n-1 and produces subarrays of sizes (n-1)/2 - 1 and (n-1)/2.
- The running time of quicksort, even when alternating between good and bad splits, is O(n lg n), but with a slightly larger constant.
2 Quick Sort HW 3
- Homework 3 forces a test between reality (your homework) and our mathematical models.
- You should have graphs showing the correlation between O(n lg n) and O(n^2).
3 Randomized quicksort
- The next idea is an improvement for quicksort, because the partition procedure is where the breakdown occurs when the data is sorted.
- We have assumed that all permutations of the input data are equally likely, but that is not likely to hold in the real world.
- We could obtain good average-case performance over all inputs by explicitly permuting the input. However, we will use random sampling, meaning that the pivot will be chosen randomly from the subarray A[p..r].
4Randomized Quicksort  pseudocode
RANDOMIZED-PARTITION(A,p,r) i ?RANDOM(p,r) 
 exchange Ar Ai return 
Partition(A,p,r) 
- RANDOMIZED QUICKSORT(A,p,r) 
- if pltr 
-  then q ?RANDOMIZED-PARTITION(A,p,r) 
-  RANDOMIZED-QUICKSORT(A,p,q-1) 
-  RANDOMIZED-QUICKSORT(A,q1,r)
5 Exercises
- Why do we analyze the average-case performance of a randomized algorithm and not its worst-case performance?
- During the running of the procedure RANDOMIZED-QUICKSORT, how many calls are made to the random-number generator RANDOM in the worst case? Answer in terms of Θ-notation.
6 A randomized version of Quicksort
- The hiring problem (chapter 5) 
- New Office Assistant needed 
- Previous attempts at hiring were unsuccessful 
- Employment agency hired 
- One candidate/day being sent over 
- Will interview the candidate and decide to hire 
 or not
- It costs a small fee to interview the applicant 
7 The Hiring Problem (cont)
- To hire an applicant is more costly because:
- Must fire the current office assistant
- Must pay a large fee to the employment agency
- We are committed to having the best possible person in the job.
- Therefore, if we interview an applicant who is better qualified than the current office assistant, we will fire the current one and hire the new one.
- We are willing to pay this cost but want to know what the price of the strategy is.
8 The Hiring Problem (cont)
- Assume candidates are numbered 1 to n.
- When we interview candidate i, we assume that we are able to determine if candidate i is the best candidate seen so far.
9 The Hiring Problem - pseudocode
- HIRE-ASSISTANT(n)
-   best ← 0    ▷ candidate 0 is a least-qualified dummy candidate
-   for i ← 1 to n
-     do interview candidate i
-       if candidate i is better than candidate best
-         then best ← i
-           hire candidate i
10 Worst-Case Analysis
- In the worst case, we actually hire every candidate that we interview.
- This occurs when the candidates come in increasing order of quality.
- Interviewing has a low cost c_i.
- Hiring has a higher cost c_h.
- Let m be the number of people hired.
- Then the total cost is O(n·c_i + m·c_h).
11 Probabilistic Analysis
- Must make assumptions or have knowledge about the input distribution.
- We assume that each of the n! permutations occurs with equal probability.
- Stated without proof - Lemma 5.2: If the candidates are presented in a random order, algorithm HIRE-ASSISTANT has a total hiring cost of O(c_h ln n).
12 Worst-Case analysis of quicksort
- We now prove the assertion that the worst-case running time is Θ(n^2).
- For an input of size n, q ranges from 0 to n-1 because PARTITION produces two subproblems with total size n-1.
- T(n) = max_{0≤q≤n-1} (T(q) + T(n-q-1)) + Θ(n)
- Guess that T(n) ≤ cn^2 for some constant c; then we have, by substitution into the recurrence:
13 Worst-Case analysis of quicksort (cont)
- T(n) ≤ max_{0≤q≤n-1} (cq^2 + c(n-q-1)^2) + Θ(n)
-      = c · max_{0≤q≤n-1} (q^2 + (n-q-1)^2) + Θ(n)
- We can use a little calculus here; remember that for a local minimum the second derivative must be > 0 where the first derivative is 0, so q^2 + (n-q-1)^2 has an interior minimum and achieves its maximum at an endpoint of the range 0 ≤ q ≤ n-1.
14 Worst-Case analysis of quicksort (cont)
- Show the math on an auxiliary sheet. Since we have the maximum at both endpoints:
- max_{0≤q≤n-1} (q^2 + (n-q-1)^2) ≤ (n-1)^2 = n^2 - 2n + 1
- T(n) ≤ cn^2 - c(2n-1) + Θ(n)
-      ≤ cn^2    (choosing c large enough that c(2n-1) dominates the Θ(n) term)
- The worst-case running time is Θ(n^2).
15 Expected running time
- Using indicator random variables in the proof, we state that
- The expected running time of RANDOMIZED-QUICKSORT is Θ(n lg n).
16 HeapSort (Chapter 6)
- The (binary) heap data structure is a nearly complete binary tree.
- Each node of the tree is an element in the array.
- The tree is completely filled on all levels except possibly the lowest.
- Think of heap-size[A] as the number of elements of the heap stored in the array A.
17 Heapsort
[Figure: a max-heap viewed as a binary tree with 16 at the root, together with its array representation A = 16, 14, 10, 8, 7, 9, 3, 2, 4, 1; the number beside each node is its array index 1-10.]
18 Heapsort
- Parent(i)
-   return ⌊i/2⌋
- Left(i)
-   return 2i
- Right(i)
-   return 2i+1
19 Heapsort
- There are two kinds of binary heaps:
- A max-heap has the property that for every node i other than the root,
-   A[Parent(i)] ≥ A[i]
- A min-heap has the property that for every node i other than the root,
-   A[Parent(i)] ≤ A[i]
- The height of a node in a heap is defined as the number of edges on the longest simple downward path from that node to a leaf, and the height of a heap is the height of its root.
20 Heapsort
- MAX-HEAPIFY runs in O(lg n) time and is used to maintain the max-heap property.
- BUILD-MAX-HEAP runs in O(n) time and produces a max-heap from an unordered array.
- HEAPSORT runs in O(n lg n) time and sorts an array in place.
- MAX-HEAP-INSERT, HEAP-EXTRACT-MAX, HEAP-INCREASE-KEY, and HEAP-MAXIMUM run in O(lg n) time and allow the heap to be used as a priority queue.
21 Heapsort
- MAX-HEAPIFY(A,i)    ▷ assumes the subtrees rooted at Left(i) and Right(i) are max-heaps
-   l ← Left(i)
-   r ← Right(i)
-   if l ≤ heap-size[A] and A[l] > A[i]
-     then largest ← l
-     else largest ← i
-   if r ≤ heap-size[A] and A[r] > A[largest]
-     then largest ← r
-   if largest ≠ i
-     then exchange A[i] ↔ A[largest]
-       MAX-HEAPIFY(A,largest)
22 Heapsort (heap-size[A] = 10)
[Figure: the action of MAX-HEAPIFY(A,2) on the heap A = 16, 4, 10, 14, 7, 9, 3, 2, 8, 1; panels (a)-(c) show the value 4 exchanged first with 14 and then with 8 as it floats down to restore the max-heap property.]
23 Heapsort
- The running time of MAX-HEAPIFY satisfies
- T(n) ≤ T(2n/3) + Θ(1)
- where Θ(1) describes the time to fix up the relationship among the elements A[i], A[Left(i)], A[Right(i)], and T(2n/3) arises from the fact that the procedure must call MAX-HEAPIFY for the subtree rooted at one of the children of node i, and the children's subtrees each have size at most 2n/3. The worst case occurs when the last row of the tree is exactly half full.
- By the master theorem, T(n) = O(lg n).
24 Building a Heap
- We use MAX-HEAPIFY in a bottom-up way to convert an array A[1..n], where n = length[A], into a max-heap.
- Note that A[(⌊n/2⌋ + 1)..n] are all leaves.
- BUILD-MAX-HEAP goes through the non-leaf nodes of the tree and runs MAX-HEAPIFY on each one.
25 BUILD-MAX-HEAP pseudo code
- BUILD-MAX-HEAP(A)
-   heap-size[A] ← length[A]
-   for i ← ⌊length[A]/2⌋ downto 1
-     do MAX-HEAPIFY(A,i)
26 BUILD-MAX-HEAP loop invariants
- Initialization: prior to the first iteration of the loop, i = ⌊n/2⌋. Each node ⌊n/2⌋+1, ⌊n/2⌋+2, ..., n is a leaf and thus the root of a trivial max-heap.
- Maintenance: the children of node i are numbered higher than i. They are both roots of max-heaps, so MAX-HEAPIFY(A,i) makes node i the root of a max-heap as well.
- Termination: at termination, i = 0. By the loop invariant, each node 1, 2, ..., n is the root of a max-heap; in particular, node 1 is.
27 Heapsort
[Figure: the operation of BUILD-MAX-HEAP on the array A = 4, 1, 3, 2, 16, 9, 10, 14, 8, 7; panels (a)-(f) show the tree after each call MAX-HEAPIFY(A,i) for i = 5, 4, 3, 2, 1, ending with the max-heap 16, 14, 10, 8, 7, 9, 3, 2, 4, 1.]
28 Heapsort - BUILD-MAX-HEAP
- A loose bound on BUILD-MAX-HEAP is argued as follows: each call to MAX-HEAPIFY costs O(lg n) time and there are just O(n) of these calls. Thus, the running time is O(n lg n).
- A tighter bound is derived in the text, with the result that we can build a max-heap from an unordered array in linear time, O(n).
29 The Heapsort algorithm
- Use BUILD-MAX-HEAP to build a max-heap on the input array A[1..n], where n is just length[A].
- Since the maximum element in the array is stored at the root A[1], it is easy to put it in its correct final position by just exchanging it with A[n].
- Now discard node n from the tree.
30 The Heapsort algorithm (cont)
- Now we use MAX-HEAPIFY to rebuild a heap one smaller than before, and keep repeating this process until we are down to a heap of size 2.
- This takes O(n lg n) time, since the call to BUILD-MAX-HEAP takes O(n) and each of the n-1 calls to MAX-HEAPIFY takes O(lg n).
31 Heapsort
[Figure, spanning slides 31-33: the operation of HEAPSORT on the max-heap A = 16, 14, 10, 8, 7, 9, 3, 2, 4, 1; panels (a)-(k) show the root repeatedly exchanged with the last heap element, the heap size shrinking by one, and MAX-HEAPIFY restoring the heap, until the array is sorted as 1, 2, 3, 4, 7, 8, 9, 10, 14, 16.]
34 Heapsort Application - priority queue
- A priority queue is a data structure for maintaining a set S of elements, each with an associated value called a key. A max-priority queue supports the following:
- INSERT(S,x): inserts the element x into set S. We could write this as S ← S ∪ {x}.
- MAXIMUM(S): returns the element of S with the largest key.
- EXTRACT-MAX(S): removes and returns the element of S with the largest key.
- INCREASE-KEY(S,x,k): increases the value of element x's key to k, where k ≥ x's current key value.
35 Heapsort Application - priority queue
- A common application is scheduling jobs on a shared computer. The queue keeps track of jobs to be performed and their relative priorities.
- A min-priority queue is the mirror image of the max-priority queue. It supports the operations INSERT, MINIMUM, EXTRACT-MIN, and DECREASE-KEY.
36 Heapsort Application - priority queue
- The textbook discusses the idea of a handle. The exact meaning depends upon the application, but it could be thought of as a function pointer to the piece of code that needs to be executed at a certain priority.
- The procedure HEAP-MAXIMUM implements the MAXIMUM operation in Θ(1) time.
- HEAP-MAXIMUM(A)
-   return A[1]
37 Heapsort Application - priority queue
- The procedure HEAP-EXTRACT-MAX implements the EXTRACT-MAX operation:
- HEAP-EXTRACT-MAX(A)
-   if heap-size[A] < 1
-     then error "heap underflow"
-   max ← A[1]
-   A[1] ← A[heap-size[A]]
-   heap-size[A] ← heap-size[A] - 1
-   MAX-HEAPIFY(A,1)
-   return max
38 Heapsort Application - priority queue
- HEAP-EXTRACT-MAX is O(lg n), since it performs a constant amount of work on top of the O(lg n) time for MAX-HEAPIFY.
- HEAP-INCREASE-KEY implements the INCREASE-KEY operation.
39 Heapsort Application - priority queue
- HEAP-INCREASE-KEY(A,i,key)
-   if key < A[i]
-     then error "new key is smaller than current key"
-   A[i] ← key
-   while i > 1 and A[Parent(i)] < A[i]
-     do exchange A[i] ↔ A[Parent(i)]
-       i ← Parent(i)
40 HEAP-INCREASE-KEY
[Figure: the operation of HEAP-INCREASE-KEY on the heap A = 16, 14, 10, 8, 7, 9, 3, 2, 4, 1; panels (a)-(d) show the key of node i = 9 increased from 4 to 15, after which 15 is exchanged with its ancestors 8 and then 14 until the max-heap property is restored.]
41 Sorting in Linear Time (Chapter 8)
- All sorts introduced so far, namely quicksort, mergesort, and heapsort, achieve the O(n lg n) bound: quicksort on average, and mergesort and heapsort in the worst case.
- All of these sorts are comparison sorts; namely, they work by comparison of keys.
42 Lower Bounds for Sorting
- We look first at a decision tree model for insertion sort.
- We assume all input elements are distinct.
- With the above assumption, comparisons like a_i = a_j are useless, so all comparisons will be assumed to have the form a_i ≤ a_j, since these are equivalent to all of the possible comparisons that can be made.
43 Decision Trees
- A decision tree is a full binary tree that represents the comparisons between elements.
- Control, data movement, and all other aspects of the algorithm are ignored.
- Each internal node is annotated by i:j, where i and j refer to the ith and jth elements in the input sequence.
- Leaf nodes are denoted by permutations ⟨π(1), π(2), ..., π(n)⟩. Since there are n! possible permutations of an input, there must be at least n! leaves.
44 Decision Tree - Insertion Sort (3 elements)
[Figure: the decision tree for insertion sort on 3 elements. The root compares 1:2; its children compare 2:3 and 1:3; each edge is labeled ≤ or >; the six leaves are the permutations ⟨1,2,3⟩, ⟨2,1,3⟩, ⟨1,3,2⟩, ⟨3,1,2⟩, ⟨2,3,1⟩, ⟨3,2,1⟩.]
45 Lower Bounds (cont)
- The length of the longest path from the root of a 
 decision tree to any of its reachable leaves
 represents the worst-case number of comparisons
 that the corresponding sorting algorithm
 performs.
- Therefore, the worst-case number of comparisons 
 for a given comparison sort algorithm is the
 height of the decision tree
- A lower bound on the heights of all decision 
 trees in which each permutation appears as a
 reachable leaf is a lower bound on the running
 time of any comparison sort algorithm.
46 Lower Bounds (cont)
- Theorem 8.1:
- Any comparison sort algorithm requires Ω(n lg n) comparisons in the worst case.
- Proof: Consider a decision tree of height h with l reachable leaves corresponding to a comparison sort on n elements. We have n! possible permutations of the input, each of which appears as some leaf. Therefore n! ≤ l.
47 Lower Bounds (cont)
- Since a binary tree of height h has no more than 2^h leaves, we have
- n! ≤ l ≤ 2^h
- Thus n! ≤ 2^h, and if we take logarithms of both sides of the inequality:
- lg(n!) ≤ h lg 2 = h, so h ≥ lg(n!)
- By equation (3.18), lg(n!) = Θ(n lg n), therefore
- h = Ω(n lg n)
48 Counting Sort
- This sort assumes that each of the n input elements is an integer in the range 0 to k for some integer k. When k = O(n), the sort runs in Θ(n) time.
- The basic idea is to determine, for each input element x, the number of elements which are less than x. Then we can place the element directly into its position in the output array.
- There is a slight problem if elements are not distinct, because they can't go into the same location.
49 Counting Sort (cont)
- In the following pseudo code, assume that the input is in array A[1..n], and thus length[A] = n. Two other arrays are required: B[1..n] holds the sorted output, and C[0..k] provides temporary working storage.
50 Counting Sort pseudo code
- COUNTING-SORT(A,B,k)
-   for i ← 0 to k
-     do C[i] ← 0
-   for j ← 1 to length[A]
-     do C[A[j]] ← C[A[j]] + 1
-   ▷ C[i] now contains the number of elements equal to i
-   for i ← 1 to k
-     do C[i] ← C[i] + C[i-1]
-   ▷ C[i] now contains the number of elements ≤ i
-   for j ← length[A] downto 1
-     do B[C[A[j]]] ← A[j]
-       C[A[j]] ← C[A[j]] - 1
51 Counting Sort Operation
[Figure: the operation of COUNTING-SORT on A = 2, 5, 3, 0, 2, 3, 0, 3 with k = 5; panels (a)-(f) show the count array C after the counting and prefix-sum passes, and the elements of A placed back to front into the output array, yielding B = 0, 0, 2, 2, 3, 3, 3, 5.]
52 Counting Sort Times
- Overall, the time is Θ(k + n): the first for loop takes Θ(k), the next for loop takes Θ(n), the third for loop takes Θ(k), and the last for loop takes Θ(n).
- Therefore, the overall time is Θ(n + k).
- It beats the Ω(n lg n) lower bound because it does NO comparisons at all. Rather, the actual values of the elements are used to index into an array.
- It is also STABLE: numbers with the same value appear in the output array in the same order as in the input array. (This is only important when satellite data is carried around with the elements being sorted.)
53 Radix Sort
- In the old days of computers (ca 1970) there used 
 to be card sorting machines which sorted punched
 cards.
- The cards were organized into 80 columns, and 12 
 rows/column such that a hole could be punched
 into one of 12 places in a particular column.
- The sorters could place a card into one of 12 
 output bins, depending upon which hole is punched.
54 Radix Sort (cont)
- An operator could then gather the cards bin by bin and place them such that the cards with the first hole punched are on top of the cards with the second hole punched, etc.
- For decimal digits, only 10 of the holes are used, representing the digits 0-9.
- A d-digit number would occupy d columns.
55 Radix Sort (cont)
- The sorting machines could only sort on one 
 column at a time, so some algorithm was necessary
 to sort n cards with d-digit numbers.
- What about sorting on the most significant digit 
 first?
- This requires each of the 10 stacks generated to 
 be sorted separately, which requires keeping
 track of each of these separate piles.
56 Radix Sort (cont)
- The scheme is to sort on the least significant digit first (counterintuitively).
- The cards are then re-combined into a single deck, with the cards in the 0 bin preceding the cards in the 1 bin, etc.
- We then re-sort the deck in its entirety, but based upon the next least significant digit.
- After making our way to the most significant digit, the entire deck is sorted.
57 Radix Sort (cont)
- Only d passes are required through the deck of cards.
- The sort is stable, but the operator has to be careful about not changing the order of the cards.
58 Radix Sort - Figure 8.3
input:            329 457 657 839 436 720 355
after ls digit:   720 355 436 457 657 329 839
after mid digit:  720 329 436 839 355 457 657
after ms digit:   329 355 436 457 657 720 839
59 Radix Sort (typical computer)
- Our typical computer is a sequential random-access machine.
- We can use radix sort to sort information keyed by multiple fields.
- Dates have 3 integer fields: month, day, year.
- We could do a stable radix sort by sorting 3 times: first on day, then on month, and then on year.
60 Radix Sort Pseudo Code
- RADIX-SORT(A,d)
-   for i ← 1 to d
-     do use a stable sort to sort array A on digit i
- Lemma 8.3: Given n d-digit numbers in which each digit can take on up to k possible values, RADIX-SORT correctly sorts these numbers in Θ(d(n+k)) time.
- When d is constant and k = O(n), radix sort runs in linear time.
61 Radix-Sort
- Lemma 8.4: Given n b-bit numbers and any positive integer r ≤ b, RADIX-SORT correctly sorts these numbers in Θ((b/r)(n+2^r)) time.
- Proof: For a value r ≤ b, view each key as having d = ⌈b/r⌉ digits of r bits each. Each digit is an integer in the range 0 to 2^r - 1, so counting sort can be used with k = 2^r - 1. (For example, view a 32-bit word as having 4 8-bit digits, so that b = 32, r = 8, k = 2^r - 1 = 255, and d = b/r = 4.) Each pass of counting sort takes Θ(n+k) = Θ(n+2^r), and there are d passes, for a total of Θ(d(n+2^r)) = Θ((b/r)(n+2^r)).
62 Radix Sort Comments
- Lots of comments in the book.
- The text makes the point that for given values of n and b, if we choose r to minimize the expression, radix sort comes close to Θ(n), which appears to be better than Quicksort's Θ(n lg n). However, there is the big caveat that the constant factors hidden in the Θ-notation differ. Quicksort often uses hardware caches more effectively than radix sort, and the version of radix sort which uses counting sort as the intermediate stable sort does not sort in place, while others (like quicksort) do.
63 Review Mid-Term Results
- Nine students took the exam
- Average score: 80.11
- Sorted scores: 66, 67, 71, 78, 85, 85, 86, 90, 93
64 Data Structures and sets
- Sets are fundamental to computer science.
- Mathematical sets are unchanging.
- Sets manipulated by algorithms can grow, shrink, or change over time.
- Such sets are called dynamic sets.
- Algorithms may need various operations on sets, such as:
- insert
- delete
- test for membership
65 Dynamic Sets
- In a typical implementation, each element is represented by an object whose fields can be both examined and manipulated if we have a pointer to the object.
- Some dynamic sets assume that one of the object's fields is an identifying key field.
- If all the keys are distinct, think of the dynamic set as being a set of key values.
- The object may also contain satellite data.
66 Operations on dynamic sets
- Search(S,k): A query that, given a set S and a key value k, returns a pointer x to an element in S such that key[x] = k, or NIL if no such element belongs to S.
- Insert(S,x): A modifying operation that augments the set S with the element pointed to by x. We usually assume that any fields in element x needed by the set implementation have already been initialized.
67 Operations on Dynamic Sets (cont)
- Delete(S,x): A modifying operation that, given a pointer x to an element in the set S, removes x from S. (Note that x is a pointer to an element, not a key value.)
- Minimum(S): A query on a totally ordered set S that returns a pointer to the element of S with the smallest key.
- Maximum(S): A query on a totally ordered set S that returns a pointer to the element of S with the largest key.
68 Operations on Dynamic Sets (cont)
- Successor(S,x): A query that, given an element x whose key is from a totally ordered set S, returns a pointer to the next larger element in S, or NIL if x is the maximum element.
- Predecessor(S,x): A query that, given an element x whose key is from a totally ordered set S, returns a pointer to the next smaller element in S, or NIL if x is the minimum element.
69 Overview of Part III
- Chapter 10: stacks, queues, linked lists, and rooted trees
- Chapter 11: hash tables
- Chapter 12: binary search trees
- Chapter 13: red-black trees
- Chapter 14: augmenting data structures
70 Chapter 10 - stacks & queues
- Stacks and queues are dynamic sets in which the element removed from the set by the delete operation is pre-specified.
- In a stack, the element deleted from the set is the one most recently inserted; it implements a last-in, first-out, or LIFO, policy.
- In a queue, the element deleted is always the one that has been in the set for the longest time; it implements a first-in, first-out, or FIFO, policy.
71 Stacks
- The insert operation on a stack is often called a 
 push and the delete operation is often called a
 pop.
- These names are allusions to physical stacks, 
 such as the spring-loaded stacks of plates found
 in cafeterias.
72 Stack pseudo code
- STACK-EMPTY(S)
-   if top[S] = 0
-     then return TRUE
-     else return FALSE
- PUSH(S,x)
-   top[S] ← top[S] + 1
-   S[top[S]] ← x
73 Stack pseudo code (cont)
- POP(S)
-   if STACK-EMPTY(S)
-     then error "underflow"
-     else top[S] ← top[S] - 1
-       return S[top[S] + 1]
- Note: The pseudo-code does nothing about stack overflow, just underflow.
74 Stacks - Figure 10.1
[Figure: a stack S stored in an array; panels show top[S] = 4, then top[S] = 6 after PUSH(S,17) and PUSH(S,3), then top[S] = 5 after POP(S) returns 3 (the popped value remains in the array, but only elements up to top[S] are in the stack).]
75 Queues
- The insert operation on a queue is called ENQUEUE (a.k.a. an append or put).
- The delete operation on a queue is called DEQUEUE (a.k.a. a serve or get).
- The FIFO property of a queue causes it to act like a line of students waiting in the registrar's office.
76 Queue pseudo code
- ENQUEUE(Q,x)
-   Q[tail[Q]] ← x
-   if tail[Q] = length[Q]
-     then tail[Q] ← 1
-     else tail[Q] ← tail[Q] + 1
- DEQUEUE(Q)
-   x ← Q[head[Q]]
-   if head[Q] = length[Q]
-     then head[Q] ← 1
-     else head[Q] ← head[Q] + 1
-   return x
77 Queues - Figure 10.2
[Figure: a queue Q stored in a circular array of length 12. (a) head[Q] = 7, tail[Q] = 12, with elements 15, 6, 9, 8, 4 in Q[7..11]. (b) After ENQUEUE(Q,17), ENQUEUE(Q,3), ENQUEUE(Q,5), the tail wraps around: head[Q] = 7, tail[Q] = 3. (c) After DEQUEUE(Q) returns 15: head[Q] = 8, tail[Q] = 3.]
78 Queues - a practical example
- This next group of slides is taken from a practical example of a queue implementation which was used on multiple printers at Hewlett-Packard Company.
- The purpose of this set of slides is to illustrate some of the subtleties which are present when trying to implement a defect-free queue.
- This code had tens of thousands of hours of testing, and all of the printer products were deemed shippable. Nevertheless, there were reports of rare problems.
79 Queues - a practical example
- The context for this queue code is that it was intended to be a fast, local queue implemented between some task code and an interrupt service routine (isr).
- The task code executed at a 5 Hz rate, i.e., every 200 ms. It essentially executed at a high priority compared to other tasks, then slept for 200 ms and then ran again. Each time it executed, it placed either nothing in the queue or 3 commands in the queue.
- The isr code executed asynchronously to the task code. It was called at each zero crossing of the power line: 120 Hz here in the US, or 100 Hz in countries that use 50 Hz power. Even though the isr ran at 120 Hz, it simply returned on 5 out of 6 interrupts and only pulled commands out of the queue on each 6th interrupt, making it run effectively at 20 Hz.
80 Queues - a practical example
- Given that the isr emptying the queue executed at 20 Hz and the task filling the queue executed at 5 Hz, the queue was empty most of the time.
- The three commands placed in the queue were really two commands and a dummy framing command which indicated to the isr the end of a frame of commands, i.e., the end of the execution of the task.
- The isr would read the queue until either (1) empty or (2) encountering a framing command.
81 Queues - some more detail
- Here are some declarations and the code itself to illustrate the details:
- #define MAXQUEUE 50
- typedef struct queue_tag
- {
-   int front;              /* front of queue */
-   int rear;               /* rear of queue */
-   int entry[MAXQUEUE];
- } QUEUETYPE;
- QUEUETYPE myQueue;
82 Queues - some more detail
- static void InitializeQueue(QUEUETYPE *q)
-   { q->front = 0; q->rear = -1; }
- static boolean_t QueueEmpty(QUEUETYPE *q)
-   { return (((q->rear + 1) % MAXQUEUE) == q->front); }
- static boolean_t QueueFull(QUEUETYPE *q)
-   { return (((q->rear + 2) % MAXQUEUE) == q->front); }
83 Queues - some more detail
- static void PutQueue(int item, QUEUETYPE *q)
- {
-   boolean_t result;
-   result = QueueFull(q);
-   if (result == FALSE)
-   {
-     q->rear = (q->rear + 1) % MAXQUEUE;
-     q->entry[q->rear] = item;
-   }
- }
84 Queues - some more detail
- void GetQueue(int *itemPtr, QUEUETYPE *q)
- {
-   boolean_t result;
-   result = QueueEmpty(q);   /* details of what to do if */
-   if (!result)              /* empty left off purposefully */
-   {
-     *itemPtr = q->entry[q->front];
-     q->front = (q->front + 1) % MAXQUEUE;
-   }
- }
85 Queues - some more detail
- The context is that InitializeQueue() gets called once at startup. Thereafter, PutQueue() and, by extension, QueueFull() are called only by the task.
- GetQueue() and, by extension, QueueEmpty() are called only by the isr.
- These work over millions of calls, but every once in a while (it could be tens of thousands of printed pages) something happens in which the code thinks the queue is suddenly full.
86 Queues - a practical example
- Are there any ideas of what might be wrong?
- Let's look at the next slide, maestro.
87Queues  a practical example
Assume interrupt in PutQueue() between rear ptr 
move and store QueueFull() returns FALSE, we inc 
rear ptr as shown GetQueue() is now executed 
repeatedly until Queue empty or frame reached.
4
6
2
5
x
3
x
1
rear before store but after inc
front
rear 
88 Queues - a practical example
Eventually in the isr we get down to the situation shown below, which is a problem if the rear pointer has been moved but the item has not been stored: GetQueue() will remove an incorrect item, and then when the isr returns, the item will be stored, but now the queue appears to be full because the front pointer has been incremented one extra time.
[Diagram: the same circular array after the isr has drained it; front has advanced past the claimed-but-unfilled slot, so front and rear now satisfy the QueueFull() test even though no valid item was ever read from that slot.]
89 Queues - a practical example
- This defect is currently in shipping products but happens so rarely that it is not deemed necessary to roll to a new release.
- This is a pragmatic, business-based decision which balances the work (cost and risk as well) against fixing the known defect.
- What will probably happen is that the fix will be made in the code base but not rolled in until and unless the code needs to be rolled for some other reason.
90 Linked Lists
- A linked list is a data structure in which the objects are arranged in a linear order.
- Unlike an array, the order is not determined by array indices, but rather by a pointer in each object.
- Each element of a doubly linked list L is an object with a key field and two pointer fields, next and prev.
- Given an element x in the list, next[x] points to its successor in the list, while prev[x] points to its predecessor.
91 Linked Lists (cont)
- If prev[x] is NIL, there is no predecessor, so x is the head of the list.
- If next[x] is NIL, there is no successor, so x is the tail of the list.
- An attribute head[L] points to the first element of the list. If this is NIL, the list is empty, or a null list.
- There are singly linked lists (only next pointers) and doubly linked lists.
92 Linked Lists (cont)
- A sorted linked list has its keys in increasing or decreasing order.
93 Linked List search pseudo code
- LIST-SEARCH(L,k)
-   x ← head[L]
-   while x ≠ NIL and key[x] ≠ k
-     do x ← next[x]
-   return x
- LIST-SEARCH searches a list of n objects in Θ(n) time in the worst case.
94 Linked List Insert pseudo code
- LIST-INSERT(L,x)
-   next[x] ← head[L]
-   if head[L] ≠ NIL
-     then prev[head[L]] ← x
-   head[L] ← x
-   prev[x] ← NIL
- The running time is O(1), since this splices x onto the head of the list.
95 Linked List Delete pseudo code
- LIST-DELETE(L,x)
-   if prev[x] ≠ NIL
-     then next[prev[x]] ← next[x]
-     else head[L] ← next[x]
-   if next[x] ≠ NIL
-     then prev[next[x]] ← prev[x]
- LIST-DELETE runs in O(1) time because the element to be deleted has already been found, and no searching is included.
96 Linked List Sentinels
- The code for LIST-DELETE could be simpler if we could ignore the boundary conditions at the head and tail of the list:
- LIST-DELETE'(L,x)
-   next[prev[x]] ← next[x]
-   prev[next[x]] ← prev[x]    (note the symmetry!)
- A sentinel is a dummy object that allows us to simplify boundary conditions.
97 Linked List Sentinels (cont)
- The sentinel nil[L] appears between the head and the tail. The attributes head[L] and tail[L] are no longer needed.
- An empty list consists of just the sentinel.
- next[nil[L]] refers to the head.
- prev[nil[L]] refers to the tail.
- LIST-SEARCH and LIST-DELETE also become simpler.
98 List-Search pseudo code
- LIST-SEARCH'(L,k)
-   x ← next[nil[L]]
-   while x ≠ nil[L] and key[x] ≠ k
-     do x ← next[x]
-   return x
99 List-Insert pseudo code
- LIST-INSERT'(L,x)
-   next[x] ← next[nil[L]]
-   prev[next[nil[L]]] ← x
-   next[nil[L]] ← x
-   prev[x] ← nil[L]
- Sentinels rarely reduce the asymptotic time bounds of data structure operations, but they can and do reduce constant factors.
- In particular, if they tighten the code in a loop, they may reduce the coefficient of n or n^2 in the running time.
100 Multiple-array representation of objects
- Some languages, such as Fortran, do not have pointers. There are still ways of implementing linked data structures without explicit pointer types.
- The idea is to use parallel arrays next, key, and prev, as shown in the next slide.
- Generally, we tend to use languages that do have pointer capability!