Title: Sorting
1Sorting
- Suppose you wanted to write a computer game like Doom 4: The Caverns of Calvin
- How do you render those nice (lurid) pictures of Calvin College torture chambers, with hidden surfaces removed?
- Given a collection of polygons (points, tests, values), how do you sort them?
- My favorite sort
- What are your favorite sorts?
- Read 6.1-6.5, omit rest of chapter 6.
2Simple (and slow) algorithms
- Bubble sort
- Selection Sort
- Insertion Sort
- Which is best?
- Important factors: comparisons, data movement
3Sorting out Sorting
- A collection or file of items with keys
- Sorting may be on items or pointers
- Sorting may be internal or external
- Sorting may or may not be stable
- Simple algorithms
- easy to implement
- slow (on big sets of data)
- show the basic approaches, concepts
- May be used to improve fancier algorithms
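To make the "stable" bullet concrete, here is a minimal sketch. The `Record` type and the function names are hypothetical, chosen only for illustration; the sort itself uses the standard library's `std::stable_sort`, which keeps items with equal keys in their original order.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record type: we sort on `key` and carry `name` along.
struct Record {
    int key;
    std::string name;
};

// Compare on the key only, so records with equal keys are "ties".
bool byKey(const Record& a, const Record& b) { return a.key < b.key; }

// Returns a copy sorted by key; ties keep their input order (stability).
std::vector<Record> sortByKeyStable(std::vector<Record> items) {
    std::stable_sort(items.begin(), items.end(), byKey);
    assert(std::is_sorted(items.begin(), items.end(), byKey));
    return items;
}
```

Given the input (2,a) (1,b) (2,c) (1,d), a stable sort yields (1,b) (1,d) (2,a) (2,c). An unstable sort is free to swap b with d, or a with c.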
4Sorting Utilities
- We'd like our sorting algorithms to work with all data types
template <class Item>
void exch(Item &A, Item &B)
  { Item t = A; A = B; B = t; }
template <class Item>
void compexch(Item &A, Item &B)
  { if (B < A) exch(A, B); }
5Bubble Sort
- The first sort most students learn
- And the worst
template <class Item>
void bubble(Item a[], int l, int r)
  { for (int i = l; i < r; i++)
      for (int j = r; j > i; j--)
        compexch(a[j-1], a[j]);
  }
comparisons?
something like n^2/2
data movements?
something like n^2/2
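The n^2/2 estimates can be checked by instrumenting the loops. This is a sketch on `std::vector<int>` rather than the template version, and the `Counts` struct and function name are hypothetical. The two loops always perform n(n-1)/2 comparisons, and the number of exchanges equals the number of inversions in the input.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Tallies for this sketch (names are illustrative, not from the slides).
struct Counts {
    long comparisons = 0;
    long exchanges = 0;
};

// Bubble sort on a[l..r] with the same loop structure as bubble(),
// counting every comparison and every exchange.
Counts countingBubble(std::vector<int>& a, int l, int r) {
    assert(l >= 0 && r < static_cast<int>(a.size()));
    Counts c;
    for (int i = l; i < r; i++)
        for (int j = r; j > i; j--) {
            c.comparisons++;
            if (a[j] < a[j - 1]) {
                std::swap(a[j - 1], a[j]);
                c.exchanges++;
            }
        }
    return c;
}
```

On the reversed input 5 4 3 2 1 this reports 10 comparisons and 10 exchanges. On already sorted input it still makes 10 comparisons but no exchanges.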
6Selection Sort
- Find smallest element
- Exchange with first
- Recursively sort rest
template <class Item>
void selection(Item a[], int l, int r)
  { for (int i = l; i < r; i++)
      { int min = i;
        for (int j = i+1; j <= r; j++)
          if (a[j] < a[min]) min = j;
        exch(a[i], a[min]);
      }
  }
comparisons?
n^2/2
swaps?
n
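The same kind of instrumentation shows where selection sort differs: the comparison count is still quadratic, but there is only one exchange per position. This is a sketch on `std::vector<int>` with hypothetical names.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Tallies for this sketch (illustrative names).
struct SelectionCounts {
    long comparisons = 0;
    long exchanges = 0;
};

// Selection sort on a[l..r], counting comparisons and exchanges.
SelectionCounts countingSelection(std::vector<int>& a, int l, int r) {
    assert(l >= 0 && r < static_cast<int>(a.size()));
    SelectionCounts c;
    for (int i = l; i < r; i++) {
        int smallest = i;  // index of the smallest item seen in a[i..r]
        for (int j = i + 1; j <= r; j++) {
            c.comparisons++;
            if (a[j] < a[smallest]) smallest = j;
        }
        std::swap(a[i], a[smallest]);  // exactly one exchange per position
        c.exchanges++;
    }
    return c;
}
```

For n = 5 this makes 10 comparisons and 4 exchanges regardless of the input order, which is why selection sort suits items that are expensive to move.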
7Insertion Sort
- Like sorting cards
- Put next one in place
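template <class Item>
void insertion(Item a[], int l, int r)
  { int i;
    for (i = r; i > l; i--) compexch(a[i-1], a[i]);
    for (i = l+2; i <= r; i++)
      { int j = i; Item v = a[i];
        while (v < a[j-1])
          { a[j] = a[j-1]; j--; }
        a[j] = v;
      }
  }
comparisons?
n^2/4
data moves?
n^2/4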
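The n^2/4 figures are averages; the actual cost depends heavily on the input order. The sketch below is a plain insertion sort without the sentinel pass used on the slide (a simplification assumed here so the counting stays easy to follow), and the function name is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Plain insertion sort (no sentinel); returns the number of comparisons made.
long countingInsertion(std::vector<int>& a) {
    long comparisons = 0;
    for (std::size_t i = 1; i < a.size(); i++) {
        int v = a[i];
        std::size_t j = i;
        while (j > 0) {
            comparisons++;
            if (!(v < a[j - 1])) break;  // v is already in place
            a[j] = a[j - 1];             // shift the larger item right
            j--;
        }
        a[j] = v;
    }
    assert(std::is_sorted(a.begin(), a.end()));
    return comparisons;
}
```

Sorted input of size n costs only n-1 comparisons, while reversed input costs n(n-1)/2. This is why insertion sort is a good finishing pass on nearly sorted data.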
8Which one to use?
- Selection: few data movements
- Insertion: few comparisons
- Bubble: blows
- But all of these are Θ(n^2), which, as you know, is TERRIBLE for large n
- Can we do better than Θ(n^2)?
9Merge Sort
- The quintessential divide-and-conquer algorithm
- Divide the list in half
- Sort each half recursively
- Merge the results.
- Base case
- left as an exercise to the reader
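One way to fill in the steps above, including a base case, is the top-down sketch below. It works on `std::vector<int>` and merges through a temporary buffer. The function names are hypothetical, and this is not necessarily the version the course text uses.

```cpp
#include <cassert>
#include <vector>

// Merge the sorted runs a[l..m] and a[m+1..r] through a temporary buffer.
void mergeRuns(std::vector<int>& a, int l, int m, int r) {
    assert(l <= m && m < r);
    std::vector<int> tmp;
    tmp.reserve(r - l + 1);
    int i = l, j = m + 1;
    while (i <= m && j <= r)
        tmp.push_back(a[j] < a[i] ? a[j++] : a[i++]);  // ties take the left item
    while (i <= m) tmp.push_back(a[i++]);
    while (j <= r) tmp.push_back(a[j++]);
    for (int k = 0; k < static_cast<int>(tmp.size()); k++) a[l + k] = tmp[k];
}

// Sort a[l..r]: divide in half, sort each half recursively, merge.
void mergeSort(std::vector<int>& a, int l, int r) {
    if (r <= l) return;  // base case: zero or one item is already sorted
    int m = l + (r - l) / 2;
    mergeSort(a, l, m);
    mergeSort(a, m + 1, r);
    mergeRuns(a, l, m, r);
}
```

Because ties are taken from the left run first, this version is stable.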
10Merge Sort Analysis
- Recall runtime recurrence
- T(1) = 0; T(n) = 2T(n/2) + cn
- Θ(n log n) runtime in the worst case
- Much better than the simple sorts on big data files and easy to implement!
- Can implement in-place and bottom-up to avoid some data movement and recursion overhead
- Still, empirically, it's slower than Quicksort, which we'll study next.
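The recurrence can be unrolled directly. This sketch assumes n is a power of two, n = 2^k, to keep the arithmetic clean:

```latex
\begin{align*}
T(n) &= 2\,T(n/2) + cn \\
     &= 4\,T(n/4) + 2cn \\
     &= 8\,T(n/8) + 3cn \\
     &\;\;\vdots \\
     &= 2^{k}\,T(n/2^{k}) + k\,cn \\
     &= n\,T(1) + cn \lg n \qquad (k = \lg n) \\
     &= cn \lg n = \Theta(n \log n).
\end{align*}
```

Each of the lg n levels of recursion does cn work in total, and T(1) = 0 contributes nothing.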
11Quicksort
- Pick a pivot; pivot the list; sort halves recursively.
- The most widely used algorithm
- A heavily studied algorithm with many variations and improvements (it seems to invite tinkering)
- A carefully tuned quicksort is usually fastest (e.g. Unix's qsort standard library function)
- but not stable, and in some situations slooow
12Quicksort
template <class Item>
void qsort(Item a[], int l, int r)
  { if (r <= l) return;
    int i = partition(a, l, r);
    qsort(a, l, i-1);
    qsort(a, i+1, r);
  }
partition: pick an item as pivot, p (last item?); rearrange list into items smaller, equal, and greater than p
13Partitioning
template <class Item>
int partition(Item a[], int l, int r)
  {
    int i = l-1, j = r;
    Item v = a[r];
    for (;;)
      {
        while (a[++i] < v) ;
        while (v < a[--j])
          if (j == l) break;
        if (i >= j) break;
        exch(a[i], a[j]);
      }
    exch(a[i], a[r]);
    return i;
  }
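For experimenting, here is a self-contained sketch of the same partitioning scheme on `std::vector<int>`, with a driver that sorts by recursing on both sides. The names `partitionLast` and `quicksortSketch` are hypothetical; they avoid clashing with the C library's `qsort`.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Partition a[l..r] around the last item; returns the pivot's final index.
int partitionLast(std::vector<int>& a, int l, int r) {
    assert(l < r);
    int i = l - 1, j = r;
    int v = a[r];                        // pivot value
    for (;;) {
        while (a[++i] < v) {}            // scan right; stops at r at the latest
        while (v < a[--j])               // scan left, guarding the left end
            if (j == l) break;
        if (i >= j) break;               // pointers crossed: done
        std::swap(a[i], a[j]);
    }
    std::swap(a[i], a[r]);               // put the pivot in its final place
    return i;
}

// Sort a[l..r] by partitioning and recursing on both sides of the pivot.
void quicksortSketch(std::vector<int>& a, int l, int r) {
    if (r <= l) return;
    int i = partitionLast(a, l, r);
    quicksortSketch(a, l, i - 1);
    quicksortSketch(a, i + 1, r);
}
```

Tracing 3 1 2 with pivot 2: the scans stop at 3 and 1, which are exchanged to give 1 3 2; the pointers then cross at index 1, and the final exchange yields 1 2 3 with the pivot at index 1.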
14Quicksort Analysis
- What is the runtime for Quicksort?
- Recurrence relation?
- Worst case Θ(n^2)
- Best, average case Θ(n log n)
- When does the worst case arise?
- when the list is (nearly) sorted! oops
- Recursive algorithms also have lots of overhead. How to reduce the recursion overhead?
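The two cases correspond to two recurrences. As a sketch, assume partitioning n items costs cn and T(1) = 0. With a last-item pivot on sorted input, every partition splits off only the pivot:

```latex
\begin{align*}
\text{Worst case:}\quad T(n) &= T(n-1) + cn \\
  &= c\,(n + (n-1) + \cdots + 2) \\
  &= c\left(\frac{n(n+1)}{2} - 1\right) = \Theta(n^2). \\[6pt]
\text{Best case:}\quad T(n) &= 2\,T(n/2) + cn = \Theta(n \log n).
\end{align*}
```

The best case is the merge sort recurrence: the pivot lands in the middle every time.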
15Quick Hacks: Cutoff
- How to improve the recursion overhead?
- Don't sort lists of size below some small cutoff
- At the end, run a pass of insertion sort.
- In practice, this speeds up the algorithm
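A minimal sketch of the cutoff idea follows. The cutoff value 10 is an assumption for illustration, not a value taken from the slides, and the partition is a simple Lomuto-style loop swapped in for brevity rather than the partition shown earlier. All names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

const int kCutoff = 10;  // assumed cutoff size; tune empirically

// Quicksort that returns immediately on small subfiles, leaving them unsorted.
void quicksortWithCutoff(std::vector<int>& a, int l, int r) {
    if (r - l < kCutoff) return;
    // Lomuto-style partition around the last item.
    int v = a[r], i = l;
    for (int j = l; j < r; j++)
        if (a[j] < v) std::swap(a[i++], a[j]);
    std::swap(a[i], a[r]);
    quicksortWithCutoff(a, l, i - 1);
    quicksortWithCutoff(a, i + 1, r);
}

// One insertion sort pass over the whole array.
void insertionPass(std::vector<int>& a) {
    for (std::size_t i = 1; i < a.size(); i++) {
        int v = a[i];
        std::size_t j = i;
        while (j > 0 && v < a[j - 1]) { a[j] = a[j - 1]; j--; }
        a[j] = v;
    }
}

// Quicksort down to the cutoff, then finish with insertion sort.
void hybridSort(std::vector<int>& a) {
    quicksortWithCutoff(a, 0, static_cast<int>(a.size()) - 1);
    insertionPass(a);
    assert(std::is_sorted(a.begin(), a.end()));
}
```

After the quicksort phase every item sits inside a small unsorted block between pivots that are already in their final positions, so the insertion pass moves each item only a short distance.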
16Quick Hacks: Picking a Pivot
- How to prevent that nasty worst-case behavior?
- Be smarter about picking a pivot
- E.g. pick three random elements and take their median
- Again, this yields an improvement in empirical performance: the worst case is much more rare
- (what would have to happen to get the worst case?)
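A sketch of median-of-three pivot selection is below. It samples the first, middle, and last items instead of three random ones, a common deterministic variant that is assumed here to keep the example reproducible. The function name is hypothetical. It moves the median into the last slot, where a last-item partition will use it as the pivot.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Move the median of a[l], a[mid], a[r] into a[r] so that a partition
// that pivots on the last item picks it up.
void medianOfThreeToEnd(std::vector<int>& a, int l, int r) {
    assert(0 <= l && l <= r && r < static_cast<int>(a.size()));
    int m = l + (r - l) / 2;
    if (a[m] < a[l]) std::swap(a[l], a[m]);  // now a[l] <= a[m]
    if (a[r] < a[l]) std::swap(a[l], a[r]);  // now a[l] is the smallest of the three
    if (a[r] < a[m]) std::swap(a[m], a[r]);  // now a[l] <= a[m] <= a[r]
    std::swap(a[m], a[r]);                   // median goes to the pivot slot
}
```

On the sorted input 1 2 3 4 5 this leaves 3 in the last position, so the pivot is the true median and the sorted-input worst case disappears.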
17Median, Order Statistics
- Quicksort improvement idea: use the median as pivot
- Give me an algorithm to
- find the smallest element of a list
- find the 4th smallest element
- find the kth smallest element
- Algorithm idea: sort, then pick the middle element.
- Θ(n log n) worst, average case.
- This won't help for quicksort!
- Can we do better?
18Quicksort-based selection
- Pick a pivot; partition list. Let i be location of pivot.
- If i > k, search left part; if i < k, search right part
template <class Item>
void select(Item a[], int l, int r, int k)
  { if (r <= l) return;
    int i = partition(a, l, r);
    if (i > k) select(a, l, i-1, k);
    if (i < k) select(a, i+1, r, k);
  }
Worst-case runtime?
O(n^2)
Expected runtime?
O(n)
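The same idea can be written as a loop, since only one side is ever searched. The sketch below is an assumption-laden variant rather than the slide's code: it works on a copy of a `std::vector<int>`, uses the standard library's `std::partition` to split around the last item, and takes k as a 0-based rank. The function name is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Returns the item of rank k (0-based) in a, i.e. what a[k] would be if a were sorted.
int kthSmallest(std::vector<int> a, int k) {
    assert(0 <= k && k < static_cast<int>(a.size()));
    int l = 0, r = static_cast<int>(a.size()) - 1;
    while (l < r) {
        int pivot = a[r];
        // Items smaller than the pivot go to the front of a[l..r-1].
        auto mid = std::partition(a.begin() + l, a.begin() + r,
                                  [pivot](int x) { return x < pivot; });
        std::iter_swap(mid, a.begin() + r);  // pivot into its final position
        int i = static_cast<int>(mid - a.begin());
        if (i == k) break;                   // pivot is the answer
        if (i > k) r = i - 1;                // answer is in the left part
        else l = i + 1;                      // answer is in the right part
    }
    return a[k];
}
```

For the list 7 2 9 4 1, rank 0 is 1, rank 2 (the median) is 4, and rank 4 is 9.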
19Lower Bound on Sorting
- Do you think that there will always be improvements in sorting algorithms?
- better than Θ(n)?
- better than Θ(n log n)?
- how to prove that no comparison sort is better than Θ(n log n) in the worst case?
- consider all algorithms!?
- Few non-trivial lower bounds are known. Hard!
- But, we can say that the runtime for any comparison sort is Ω(n log n).
20Comparison sort lower bound
- How many comparisons are needed to sort?
- decision tree: each leaf a permutation; each node a comparison of two elements
- A sort of a particular list: a path from root to leaf.
- How many leaves?
- n!
- Shortest possible decision tree?
- Ω(log n!)
- Stirling's formula (p. 43): lg n! is about n lg n - n lg e + lg(sqrt(2 pi n))
- Ω(n log n)!
- There is no comparison sort better than Ω(n log n)
- (but are there other approaches to sorting?)
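The bound can be computed exactly for small n. A binary decision tree with n! leaves needs height at least ceil(lg n!), which is the smallest h with 2^h >= n!. The sketch below uses integer arithmetic only; the function name is hypothetical, and n is limited to 20 so that n! fits in 64 bits.

```cpp
#include <cassert>
#include <cstdint>

// Smallest h with 2^h >= n!, i.e. ceil(lg n!): the minimum worst-case
// number of comparisons for any comparison sort of n distinct items.
int minComparisons(int n) {
    assert(0 <= n && n <= 20);  // 20! still fits in an unsigned 64-bit integer
    std::uint64_t f = 1;
    for (int i = 2; i <= n; i++) f *= static_cast<std::uint64_t>(i);
    int h = 0;
    while ((std::uint64_t{1} << h) < f) h++;
    return h;
}
```

This gives 3 comparisons for n = 3, 5 for n = 4, 7 for n = 5, and 22 for n = 10, matching Stirling's estimate of about 21.8 for lg 10!.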