Title: Professor: Munehiro Fukuda
1. CSS342: Sorting Algorithms
- Professor Munehiro Fukuda
2. Why Do We Desperately Need Efficient Sorting Algorithms?
- Data must be sorted before we run the following programs:
  - Search algorithms such as binary search and interpolation search
  - Many computational geometry/graphics algorithms, such as the convex hull
- We always or frequently need to sort the following data:
  - Dictionary
  - White/yellow pages
  - Student grades
3. Topics
- Day 1: Lecture
  - Selection Sort: worst/average O(n^2)
  - Bubble Sort: worst/average O(n^2)
  - Insertion Sort: worst/average O(n^2)
  - Shell Sort: worst O(n^2), average O(n^(3/2))
  - Merge Sort: worst/average O(n log n)
  - Quick Sort: worst O(n^2), average O(n log n)
  - Radix Sort: worst/average O(n)
- Day 2: Lab Work
  - Partial Quick Sort
- Homework Assignment
  - Non-recursive Semi-In-Place Merge Sort
4. Selection Sort
O(n^2) sorting
[Figure: a five-element array (indices 0 to size-1) holding 10, 13, 14, 29, and 37, shown as the initial array and after the 1st, 2nd, 3rd, and 4th swaps. The per-step cell order was scrambled in extraction.]
- Scan items 0 to size-1, locate the largest item, and swap it with the rightmost item.
- Scan items 0 to size-2, locate the 2nd largest item, and swap it with the 2nd rightmost item.
- Scan items 0 to size-3, locate the 3rd largest item, and swap it with the 3rd rightmost item.
- Scan items 0 to size-4, locate the 4th largest item, and swap it with the 4th rightmost item.
- Scan items 0 to size-5, locate the 5th largest item, and swap it with the 5th rightmost item.
5. Selection Sort
O(n^2) sorting
template <class Object>
void selectionSort( vector<Object> &a ) {
  for ( int last = a.size( ) - 1; last >= 1; --last ) {
    int indexSoFar = 0;  // Index of the largest item found so far.
                         // Assume the 0th item is the largest at first.
    for ( int i = 1; i <= last; ++i )
      if ( a[i] > a[indexSoFar] )
        indexSoFar = i;
    // indexSoFar points to the largest item at this point.
    swap( a[indexSoFar], a[last] );
  }
}
[Figure: on each pass, indexSoFar marks the largest item in a[0..last] and that item is swapped with a[last]; last starts at a.size( ) - 1 and moves left by one per pass.]
6. Efficiency of Selection Sort
O(n^2) sorting
Pass starting from (N = 5)   Comparisons    Swaps
Initial array                N-1 (4)        1
After 1st swap               N-2 (3)        1
After 2nd swap               N-3 (2)        1
After 3rd swap               N-4 (1)        1
Total                        O(n(n-1)/2)    O(n-1)
Overall: O(n^2)
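The totals in the table can be checked by instrumenting the sort. The following is a minimal sketch, not the course's implementation: the names SortCounts and countingSelectionSort are hypothetical, and the vector is restricted to int for brevity.

```cpp
#include <utility>
#include <vector>

// Hypothetical tally of the two operations counted on this slide.
struct SortCounts {
    long comparisons = 0;
    long swaps = 0;
};

// Selection sort that also counts item comparisons and swaps.
SortCounts countingSelectionSort(std::vector<int>& a) {
    SortCounts counts;
    for (int last = static_cast<int>(a.size()) - 1; last >= 1; --last) {
        int indexSoFar = 0;                  // largest item found so far
        for (int i = 1; i <= last; ++i) {
            ++counts.comparisons;            // one comparison per scanned item
            if (a[i] > a[indexSoFar]) indexSoFar = i;
        }
        std::swap(a[indexSoFar], a[last]);   // exactly one swap per pass
        ++counts.swaps;
    }
    return counts;
}
```

For a five-item array this sketch performs 4 + 3 + 2 + 1 = 10 comparisons and 4 swaps regardless of the initial order, which matches n(n-1)/2 and n-1.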
7. Bubble Sort
O(n^2) sorting
[Figure: Passes 1 through 4 of bubble sort on a five-element array holding 10, 13, 14, 29, and 37; each pass compares adjacent pairs from left to right and swaps the pairs that are out of order. The per-step cell order was scrambled in extraction.]
8. Bubble Sort
O(n^2) sorting
#include <iostream>
#include <vector>
#include <string>
#include <utility>  // swap
using namespace std;

template <class Object>
void bubbleSort( vector<Object> &a ) {
  bool swapOccurred = true;  // true when swaps occur
  for ( int pass = 1; ( pass < a.size( ) ) && swapOccurred; ++pass ) {
    swapOccurred = false;    // swaps have not occurred at the beginning
    for ( int i = 0; i < a.size( ) - pass; ++i ) {
      // a bubble (i) goes from 0 to size - pass
      if ( a[i] > a[i + 1] ) {
        swap( a[i], a[i + 1] );
        swapOccurred = true;  // a swap has occurred
      }
    }
  }
}
9. Efficiency of Bubble Sort
O(n^2) sorting
[Figure: the adjacent-pair comparisons of Pass 1 and Pass 2 on the five-element array.]
Pass        Comparisons   Swaps (at most)
Pass 1      N-1           N-1
Pass 2      N-2           N-2
...
Last pass   1             1
Total       O(n^2)        O(n^2)
Overall: O(n^2)
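The counts above are worst-case figures; the swapOccurred flag in the bubble sort code lets the sort stop after one pass when the input is already sorted. The sketch below illustrates that difference under stated assumptions: BubbleCounts and countingBubbleSort are hypothetical names, and the element type is fixed to int.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical tally used only for this illustration.
struct BubbleCounts {
    long passes = 0;
    long comparisons = 0;
    long swaps = 0;
};

// Bubble sort with early exit, counting passes, comparisons, and swaps.
BubbleCounts countingBubbleSort(std::vector<int>& a) {
    BubbleCounts counts;
    bool swapOccurred = true;
    for (std::size_t pass = 1; pass < a.size() && swapOccurred; ++pass) {
        swapOccurred = false;
        ++counts.passes;
        for (std::size_t i = 0; i + pass < a.size(); ++i) {
            ++counts.comparisons;
            if (a[i] > a[i + 1]) {
                std::swap(a[i], a[i + 1]);
                ++counts.swaps;
                swapOccurred = true;
            }
        }
    }
    return counts;
}
```

On a reverse-sorted five-item array the sketch needs 4 passes, 10 comparisons, and 10 swaps; on an already sorted one it stops after a single pass of 4 comparisons.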
10. Insertion Sort
O(n^2) sorting
Each step copies the first unsorted item into unsortedTop, shifts the larger sorted items one slot to the right, and inserts unsortedTop into the gap.
Step                                          Array (sorted | unsorted)
Initial state                                 29 | 10 14 37 13
Copy 10, shift 29, insert 10                  10 29 | 14 37 13
Copy 14, shift 29, insert 14                  10 14 29 | 37 13
Copy 37, shift nothing                        10 14 29 37 | 13
Copy 13, shift 37, 29, and 14, insert 13      10 13 14 29 37
11. Insertion Sort
O(n^2) sorting
template <class Object>
void SortedList<Object>::insertionSort( ) {
  for ( int unsorted = 1; unsorted < array.size( ); ++unsorted ) {
    // Assume the 0th item is sorted. Unsorted items start from the 1st item.
    Object unsortedTop = array[unsorted];  // Copy the top of the unsorted group
    int i;
    for ( i = unsorted - 1; ( i >= 0 ) && ( array[i] > unsortedTop ); --i )
      // Upon a successful comparison, shift array[i] to the right
      array[i + 1] = array[i];
    // Insert the unsorted top into the sorted group
    array[i + 1] = unsortedTop;
  }
}
[Figure: unsortedTop (13) is copied out of the array, compared with the sorted items from right to left, the larger items (37, 29, 14) are each shifted from loc to loc+1, and 13 is inserted into the gap.]
12. Efficiency of Insertion Sort
O(n^2) sorting
Item inserted (N = 5)   Comparisons   Insertions   Shifts
1st unsorted item       1             1            1
2nd unsorted item       2             1            2
3rd unsorted item       3             1            3
(N-1)th unsorted item   N-1 (4)       1            N-1 (4)
Total                   O(n^2)        O(n)         O(n^2)
Overall: O(n^2)
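The per-item counts in this table describe the worst case, where every sorted item must be shifted. The sketch below counts the same three operations so the worst case can be compared with an already sorted input; InsertionCounts and countingInsertionSort are hypothetical names, and a free function over std::vector<int> stands in for the SortedList member function.

```cpp
#include <vector>

// Hypothetical tally of the three operations in the table.
struct InsertionCounts {
    long comparisons = 0;
    long insertions = 0;
    long shifts = 0;
};

// Insertion sort that counts item comparisons, insertions, and shifts.
InsertionCounts countingInsertionSort(std::vector<int>& a) {
    InsertionCounts counts;
    for (int unsorted = 1; unsorted < static_cast<int>(a.size()); ++unsorted) {
        const int unsortedTop = a[unsorted];   // copy the top of the unsorted group
        int i = unsorted - 1;
        while (i >= 0) {
            ++counts.comparisons;
            if (!(a[i] > unsortedTop)) break;  // found the insertion point
            a[i + 1] = a[i];                   // shift a[i] to the right
            ++counts.shifts;
            --i;
        }
        a[i + 1] = unsortedTop;                // insert into the sorted group
        ++counts.insertions;
    }
    return counts;
}
```

A reverse-sorted five-item array costs 10 comparisons and 10 shifts, while a sorted one costs 4 comparisons and no shifts; both need 4 insertions.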
13. ShellSort
O(n^(3/2)) sorting
[Figure: a 17-element array (indices 0 to 16) is sorted in three rounds, one per gap value; in each round, the items that are gap positions apart are sorted among themselves. The per-step cell values were scrambled in extraction.]
- gap = 17/2 = 8 (initially divided by 2)
- gap = 8/2.2 = 3 (the divisor 2.2 is chosen for practical performance)
- gap = 3/2.2 = 1
- The idea is to perform an insertion sort among items that are gap positions apart.
- This reduces the large amount of data movement.
14. ShellSort
O(n^(3/2)) sorting
template <class Comparable>
void shellsort( vector<Comparable> &a ) {
  for ( int gap = a.size( ) / 2; gap > 0;
        gap = ( gap == 2 ) ? 1 : int( gap / 2.2 ) )      // (1)
    for ( int i = gap; i < a.size( ); ++i ) {            // (2)
      Comparable tmp = a[i];                             // (3)
      int j = i;
      for ( ; j >= gap && tmp < a[j - gap]; j -= gap )   // (4)
        a[j] = a[j - gap];
      a[j] = tmp;                                        // (5)
    }
}
[Figure: walk-through on the 17-element array (indices 0 to 16; values as extracted: 81 20 38 87 85 15 75 41 58 28 95 17 35 12 96 11 94), assuming i = a.size( ) - 1 and gap = 16/2 = 8 at (1). At (3), tmp takes a[16]. At (4), shift a[16-8] if it is larger than tmp, then shift a[16-8*2] if it is larger than tmp. At (5), tmp is stored in the vacated slot.]
15. Efficiency of ShellSort
O(n^(3/2)) sorting
- Performance
  - Worst case: O(N^2)
  - Average case:
    - O(N^(3/2)) when dividing by 2
    - O(N^(5/4)) or O(N^(7/6)) when dividing by 2.2
- Proof
  - A long-standing open problem
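Because the running time depends on the increments, it can help to see which gaps the "divide by 2, then by 2.2" rule actually produces. The sketch below reproduces the loop header from the shellsort code as a standalone function; the name shellGaps is hypothetical.

```cpp
#include <vector>

// Gap sequence produced by the rule on the ShellSort slides: start at n/2,
// then divide by 2.2. A gap of 2 is forced to 1 because int(2 / 2.2) is 0,
// which would skip the final gap-1 pass (a plain insertion sort).
std::vector<int> shellGaps(int n) {
    std::vector<int> gaps;
    for (int gap = n / 2; gap > 0;
         gap = (gap == 2) ? 1 : static_cast<int>(gap / 2.2)) {
        gaps.push_back(gap);
    }
    return gaps;
}
```

For n = 17 this yields 8, 3, 1, the same three rounds as in the ShellSort figure.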
16. Sorting Algorithms
O(n log n) sorting
- O(n^2) (Shell's average case depends on the increment):
  - Selection Sort
  - Bubble Sort
  - Insertion Sort
  - Shell Sort
- O(n log n): use a recursive solution and take advantage of a tree's log(n) characteristics:
  - Merge Sort
  - Quick Sort
17. Mergesort (with an auxiliary temporary array)
O(n log n) sorting
Assuming that we already have two sorted arrays, how can we merge them into one sorted array?
First sorted array:  1 4 8 13 14 20 25
Second sorted array: 2 3 5 7 11 23
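One way to answer the question for the two arrays on this slide (1 4 8 13 14 20 25 and 2 3 5 7 11 23) is to repeatedly copy the smaller front item into a third array. This is a minimal sketch of that idea with a hypothetical function name, mergeSorted; it returns a new vector rather than merging within one array.

```cpp
#include <cstddef>
#include <vector>

// Merge two already sorted vectors into one sorted vector.
std::vector<int> mergeSorted(const std::vector<int>& left,
                             const std::vector<int>& right) {
    std::vector<int> merged;
    merged.reserve(left.size() + right.size());
    std::size_t i = 0, j = 0;
    // While both inputs have items, take the smaller front item.
    while (i < left.size() && j < right.size()) {
        if (left[i] < right[j]) merged.push_back(left[i++]);
        else                    merged.push_back(right[j++]);
    }
    // One input is exhausted; copy whatever remains of the other.
    while (i < left.size())  merged.push_back(left[i++]);
    while (j < right.size()) merged.push_back(right[j++]);
    return merged;
}
```

Each item is examined once, so merging n items in total takes O(n) time.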
18. Mergesort (with an auxiliary temporary array)
O(n log n) sorting
template <class Comparable>
void merge( vector<Comparable> &a, int first, int mid, int last ) {
  vector<Comparable> tempArray( a.size( ) );
  int first1 = first;    int last1 = mid;   // left sorted run
  int first2 = mid + 1;  int last2 = last;  // right sorted run
  int index = first1;
  // Copy the smaller front item until one run is exhausted.
  for ( ; ( first1 <= last1 ) && ( first2 <= last2 ); ++index ) {
    if ( a[first1] < a[first2] ) {
      tempArray[index] = a[first1];
      ++first1;
    } else {
      tempArray[index] = a[first2];
      ++first2;
    }
  }
  // Copy the remainder of whichever run still has items.
  for ( ; first1 <= last1; ++first1, ++index )
    tempArray[index] = a[first1];
  for ( ; first2 <= last2; ++first2, ++index )
    tempArray[index] = a[first2];
  // Copy the merged result back into the original array.
  for ( index = first; index <= last; ++index )
    a[index] = tempArray[index];
}
[Figure: theArray[first..mid] and theArray[mid+1..last] are each sorted; first1/last1 and first2/last2 bound the two runs, and the smaller of a[first1] and a[first2] is copied to tempArray[index].]
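19. Mergesort (from bottom to top: conquer)
O(n log n) sorting
[Figure: for the eight-element array 38 16 17 12 39 27 24 5, sorted runs are merged pairwise from the bottom of the recursion tree up to the root.]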
20. Mergesort (from top to bottom: divide)
O(n log n) sorting
[Figure: theArray (38 16 17 12 39 27 24 5) is split recursively at mid = (first + last) / 2 while first < last, until each piece holds a single item.]
21. Mergesort (final view)
O(n log n) sorting
template <class Comparable>
void mergesort( vector<Comparable> &a, int first, int last ) {
  if ( first < last ) {
    int mid = ( first + last ) / 2;
    mergesort( a, first, mid );      // sort the left half
    mergesort( a, mid + 1, last );   // sort the right half
    merge( a, first, mid, last );    // merge the two sorted halves
  }
}
22. Mergesort (Efficiency Analysis)
O(n log n) sorting
- At level x, # nodes in each pair = 2^x
- At level x, # major operations = n/2^x * (3 * 2^x - 1) = O(3n)
- # levels = log n, where n = # array elements (if n is a power of 2)
- # levels = log n + 1 if n is not a power of 2
- # operations = O(3n) * (log n + 1) = O(3n log n) = O(n log n)
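The step from the per-level count to O(3n) can be written out explicitly. The derivation below assumes that merging a pair of 2^x items costs at most 2^x - 1 comparisons plus 2^x copies into the temporary array and 2^x copies back, which is where the factor 3 * 2^x - 1 comes from.

```latex
\begin{align*}
\text{operations at level } x
  &= \frac{n}{2^x}\,\bigl(3 \cdot 2^x - 1\bigr)
   = 3n - \frac{n}{2^x} \le 3n, \\
\text{total operations}
  &\le 3n\,(\log_2 n + 1) = O(n \log n).
\end{align*}
```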
23. Quicksort (A partition about a pivot)
O(n log n) sorting
[Figure: from a collection of ten items (0, 13, 26, 31, 43, 57, 65, 75, 81, 92), select a pivot, then partition the remaining items into a group of smaller items and a group of larger items. The cell order was scrambled in extraction.]
24. Quicksort (Code overview)
O(n log n) sorting
template <class Comparable>
void quicksort( vector<Comparable> &a, int first, int last ) {
  int pivotIndex;  // after partition, pivotIndex points to a pivot
  if ( first < last ) {
    partition( a, first, last, pivotIndex );
    quicksort( a, first, pivotIndex - 1 );
    quicksort( a, pivotIndex + 1, last );
  }
}
25. Quicksort (Partitioning Algorithm)
O(n log n) sorting
Initial state: every item after the pivot is in the unknown region.
Repeat moving each element in the unknown region to S1 or S2 until the size of the unknown region reaches 0.
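26. Quicksort (Moving a new unknown into S1)
O(n log n) sorting
[Figure: the segment is laid out as p | S1 (< p) | S2 (>= p) | unknown (?), bounded by first, lastS1, firstUnknown, and last. When the new item at firstUnknown is < p, it is swapped with the first item of S2, and lastS1 and firstUnknown each advance by one.]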
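27. Quicksort (Moving a new unknown into S2)
O(n log n) sorting
[Figure: the segment is laid out as p | S1 (< p) | S2 (>= p) | unknown (?), bounded by first, lastS1, firstUnknown, and last. When the new item at firstUnknown is >= p, it already sits next to S2, so only firstUnknown advances by one.]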
28. Quicksort (Partitioning Code)
O(n log n) sorting
template <class Comparable>
void partition( vector<Comparable> &a, int first, int last, int &pivotIndex ) {
  // Choose a pivot and place it in a[first]
  choosePivot( a, first, last );
  Comparable pivot = a[first];
  int lastS1 = first;
  int firstUnknown = first + 1;
  for ( ; firstUnknown <= last; ++firstUnknown ) {
    if ( a[firstUnknown] < pivot ) {
      ++lastS1;
      swap( a[firstUnknown], a[lastS1] );
    }
    // else item from unknown belongs in S2
  }
  swap( a[first], a[lastS1] );  // place the pivot between S1 and S2
  pivotIndex = lastS1;
}
29. Quicksort (Example)
O(n log n) sorting
Step                                                  Array
Original array (pivot = 27)                           27 28 12 39 26 16
firstUnknown = 1 (points to 28); 28 belongs in S2     27 28 12 39 26 16
S1 is empty; 12 belongs in S1, so swap 28 and 12      27 12 28 39 26 16
39 belongs in S2                                      27 12 28 39 26 16
26 belongs in S1, so swap 28 and 26                   27 12 26 39 28 16
16 belongs in S1, so swap 39 and 16                   27 12 26 16 28 39
S1 and S2 are determined                              27 | 12 26 16 | 28 39
Place pivot between S1 and S2 (swap 27 and 16)        16 12 26 27 28 39
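The steps narrated in this example (pivot 27, then 28, 12, 39, 26, and 16 classified in turn) can be replayed in code. The sketch below assumes the pivot is already stored in a[first], so the choosePivot step is omitted; the function name partitionAboutFirst is hypothetical, and it returns the pivot index instead of using a reference parameter.

```cpp
#include <utility>
#include <vector>

// Partition a[first..last] about the pivot stored in a[first].
// Afterwards: a[first..p-1] < pivot, a[p] == pivot, a[p+1..last] >= pivot,
// where p is the returned index.
int partitionAboutFirst(std::vector<int>& a, int first, int last) {
    const int pivot = a[first];
    int lastS1 = first;                        // S1 is empty at the start
    for (int firstUnknown = first + 1; firstUnknown <= last; ++firstUnknown) {
        if (a[firstUnknown] < pivot) {         // item belongs in S1
            ++lastS1;
            std::swap(a[firstUnknown], a[lastS1]);
        }                                      // else it stays in S2
    }
    std::swap(a[first], a[lastS1]);            // pivot goes between S1 and S2
    return lastS1;
}
```

Calling it on 27 28 12 39 26 16 with first = 0 and last = 5 leaves 16 12 26 27 28 39 and returns 3, the pivot's final position.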
30. Quicksort (Efficiency Analysis)
O(n log n) sorting
- Worst case: if the pivot is the smallest item in the array segment, S1 will remain empty.
  - S2 decreases in size by only 1 at each recursive call.
  - Level 1 requires n-1 comparisons.
  - Level 2 requires n-2 comparisons.
  - Thus, (n-1) + (n-2) + ... + 2 + 1 = n(n-1)/2 = O(n^2)
  - Then, how can we select the best pivot?
- Average case: S1 and S2 contain about the same number of items.
  - log n or log n + 1 levels of recursion occur.
  - Each level requires n-k comparisons.
  - Thus, at most (n-1) * (log n + 1) = O(n log n)
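The worst-case sum can be observed directly by counting comparisons. The sketch below is an illustration under stated assumptions: it always uses the first item of the segment as the pivot (no choosePivot step), and the name quicksortComparisons is hypothetical.

```cpp
#include <utility>
#include <vector>

// Quicksort a[first..last] using a[first] as the pivot and return the
// number of item-to-pivot comparisons performed.
long quicksortComparisons(std::vector<int>& a, int first, int last) {
    if (first >= last) return 0;
    long comparisons = 0;
    const int pivot = a[first];
    int lastS1 = first;
    for (int firstUnknown = first + 1; firstUnknown <= last; ++firstUnknown) {
        ++comparisons;
        if (a[firstUnknown] < pivot) {
            ++lastS1;
            std::swap(a[firstUnknown], a[lastS1]);
        }
    }
    std::swap(a[first], a[lastS1]);            // place the pivot
    comparisons += quicksortComparisons(a, first, lastS1 - 1);
    comparisons += quicksortComparisons(a, lastS1 + 1, last);
    return comparisons;
}
```

With this pivot choice, an already sorted 10-item array makes every pivot the smallest item of its segment and costs 9 + 8 + ... + 1 = 45 comparisons, which is n(n-1)/2. An input that splits evenly, such as 4 2 6 1 3 5 7, costs far fewer comparisons than the 21 a sorted 7-item array would need.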
31. Mergesort versus Quicksort
O(n log n) sorting
- Then, why do we need Quicksort?
  - Reasons
    - Mergesort requires item-copying operations from the array a to the temporary array and vice versa.
    - A worst-case situation is not typical.
- Then, why do we need Mergesort?
  - Reason
    - If you sort a linked list, no item-copying operations are necessary.
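The linked-list point can be illustrated with the standard library: std::list::merge combines two sorted lists by relinking nodes rather than copying items. The sketch below uses a hypothetical wrapper name, mergeInto, and int items for brevity.

```cpp
#include <iterator>
#include <list>

// Merge the sorted list `right` into the sorted list `left`.
// std::list::merge transfers the nodes themselves, so no item is copied
// and `right` is empty afterwards.
void mergeInto(std::list<int>& left, std::list<int>& right) {
    left.merge(right);
}
```

Because nodes are relinked, an item keeps its address after the merge, whereas the array-based merge has to copy every item into the temporary array and back.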
32. Radix Sort (Algorithm Overview)
O(n) sorting
33. Radix Sort (Efficiency Analysis)
O(n) sorting
- Each grouping pass requires n shuffles.
- The number of grouping and combining steps equals the number of digits.
  - In the previous example, it is 4.
- Thus, for k-digit numbers, the performance is
  - k * n = O(n), where k is independent of n.
- Disadvantages
  - Need to compare digits in the same position rather than whole items.
  - Need to accommodate 10 groups for numbers.
  - Need to accommodate 27 groups for strings (alphabet + blank).
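The grouping and combining steps counted above can be sketched as a least-significant-digit radix sort. This is an assumption-laden illustration rather than the slide's algorithm verbatim: it handles non-negative integers only, the caller passes the digit count k as the parameter digits, and the function name radixSort is hypothetical.

```cpp
#include <array>
#include <vector>

// LSD radix sort for non-negative integers with at most `digits` decimal digits.
void radixSort(std::vector<int>& a, int digits) {
    long long divisor = 1;                          // selects the current digit
    for (int d = 0; d < digits; ++d, divisor *= 10) {
        std::array<std::vector<int>, 10> groups;    // one group per digit 0..9
        for (int x : a)                             // grouping: n shuffles
            groups[(x / divisor) % 10].push_back(x);
        a.clear();
        for (const auto& group : groups)            // combining, in digit order
            a.insert(a.end(), group.begin(), group.end());
    }
}
```

Each of the k passes moves all n items once, giving k * n moves in total, and items are never compared with each other as whole values.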
34. A Comparison of Sorting Algorithms
Algorithm        Worst case   Average case
Selection sort   n^2          n^2
Bubble sort      n^2          n^2
Insertion sort   n^2          n^2
Shell sort       n^2          n^(3/2), n^(5/4) (depends on increment)
Mergesort        n log n      n log n
Quicksort        n^2          n log n
Radix sort       n            n
Treesort         n^2          n log n   (studied in CSS343)
Heapsort         n log n      n log n   (studied in CSS343)
Question: do we really need to always use mergesort or quicksort?
35. Lab Work
- Partial Quicksort
  - Find the top k items
  - Find the bottom k items
  - Find the median
- Key Idea
  - Focus on only one side of the partition, either partition[first, pivot - 1] or partition[pivot, last], whichever fits the requirement: top k, bottom k, or middle.
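For comparison with a hand-written partial quicksort, the C++ standard library offers std::nth_element, which places the item that would be at a given position in sorted order and partitions the rest around it without fully sorting (typically with a quickselect-style algorithm, though the standard does not mandate one). The helper names medianOf and bottomK below are hypothetical, and this sketch is not the lab solution.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Median of a non-empty vector (lower median for even sizes).
// Takes a copy so the caller's data is left untouched.
int medianOf(std::vector<int> v) {
    auto mid = v.begin() + static_cast<std::ptrdiff_t>((v.size() - 1) / 2);
    std::nth_element(v.begin(), mid, v.end());
    return *mid;
}

// The k smallest items, in unspecified order.
std::vector<int> bottomK(std::vector<int> v, std::size_t k) {
    if (k >= v.size()) return v;
    std::nth_element(v.begin(), v.begin() + static_cast<std::ptrdiff_t>(k), v.end());
    v.resize(k);   // everything before position k is <= everything after it
    return v;
}
```

The top k items follow the same pattern with the comparison reversed, for example by passing std::greater<int>() as the comparator.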
36. Programming Assignment
- In-Place Sorting
  - Sort data items only in the original array.
  - Example: Quick Sort
  - Impractical for Merge Sort
- Non-Recursive, Semi-In-Place Merge Sort
  - Use a loop rather than recursion.
  - Use only one additional temporary array.
  - Move data from the original array to the temporary array, or vice versa, at each stage.