Title: Chapter 12 Sorting
1. Chapter 12: Sorting
- CS 260 Data Structures
- Indiana University Purdue University Fort Wayne
2. Note
- We temporarily skip ahead to Section 12.3
- Pages 633-643
3. Heapsort
- Heapsort is a superior O( n log(n) ) method
- Assume the array to sort is

    int[] a = new int[n];

- Overview
  - First convert an unsorted array to a heap
  - Then, iteratively, remove the root element, rebuilding the heap each time
  - The root element is always the largest remaining element
  - The elements, as removed, are in descending order
- Notation: Define a[i..j] to consist of the elements a[i], a[i+1], a[i+2], ..., a[j]
4. Heapsort method details

    public static void heapsort( int[] a, int n )

- 1. Note that a[0..0] is already a heap (only one element)
- 2. Turn a[0..(n-1)] into a heap by successively adding . . .

    a[1] to a[0..0]
    a[2] to a[0..1]
    a[3] to a[0..2]
    . . .
    a[n-1] to a[0..(n-2)]

- These steps could be written as a private helper method called makeHeap

    public static void makeHeap( int[] a, int n )
5. Heapsort method details
- 3. Iteratively remove the largest remaining element and rebuild the heap
- Note: The largest remaining element is only removed from the heap logically but remains physically in the array (in its new position)
- Note: Reheapification downward could be written as a private helper method called reheapifyDown

    for ( int i = n-1; i > 0; i-- )
    {
        Exchange a[0] with a[i]
        Perform reheapification downward on a[0..(i-1)]
    }

    public static void reheapifyDown( int[] a, int n )
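The outline on slides 4 and 5 can be sketched as one complete Java class. The method names heapsort, makeHeap, and reheapifyDown come from the slides, but the bodies below are one plausible implementation (makeHeap adds each element with an upward sweep; reheapifyDown pushes a[0] down), not necessarily the textbook's code.

```java
public class HeapsortDemo
{
    // Turn a[0..(n-1)] into a max-heap by successively adding
    // a[1], a[2], ..., a[n-1] (reheapification upward).
    public static void makeHeap( int[] a, int n )
    {
        for ( int i = 1; i < n; i++ )
        {
            int child = i;
            while ( child > 0 && a[child] > a[(child - 1) / 2] )
            {
                int parent = (child - 1) / 2;
                int t = a[child]; a[child] = a[parent]; a[parent] = t;
                child = parent;
            }
        }
    }

    // Push a[0] down until a[0..(n-1)] is again a max-heap.
    public static void reheapifyDown( int[] a, int n )
    {
        int i = 0;
        while ( 2*i + 1 < n )
        {
            int big = 2*i + 1;                        // left child
            if ( big + 1 < n && a[big + 1] > a[big] ) // right child larger?
                big++;
            if ( a[i] >= a[big] )
                break;                                // heap property restored
            int t = a[i]; a[i] = a[big]; a[big] = t;
            i = big;
        }
    }

    public static void heapsort( int[] a, int n )
    {
        makeHeap( a, n );
        for ( int i = n - 1; i > 0; i-- )
        {
            int t = a[0]; a[0] = a[i]; a[i] = t; // largest moves to position i
            reheapifyDown( a, i );               // rebuild heap on a[0..(i-1)]
        }
    }

    public static void main( String[] args )
    {
        int[] a = { 8, 5, 7, 2, 1 };
        heapsort( a, a.length );
        System.out.println( java.util.Arrays.toString( a ) ); // [1, 2, 5, 7, 8]
    }
}
```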
6. Heapsort example
[Diagram: heapsort trace on a five-element array a[0..4] containing the values 8, 5, 7, 2, 1 — makeHeap builds the heap with a series of swaps, then a[0] is repeatedly swapped with the last heap element and reheapified downward, yielding the final sorted array.]
7. Analysis of heapsort
- Since the heap is a complete binary tree, it is automatically balanced
- The depth of the tree is O( log(n) )
- Worst case analysis of heapsort is O( n log(n) )
  - Steps 1 and 2 have O( n log(n) ) performance
  - Step 3 has O( n log(n) ) performance
  - The steps form a sequence
- The worst case performance is also the best case performance and the average case performance
8. Quadratic sorting algorithms
Now, back to the beginning of Chapter 12 (Section 12.1, page 600)
- Quadratic sorting algorithms
  - Have inefficient worst case O( n^2 ) performance
  - Are easy to implement
  - O( n^2 ) performance doesn't matter for small arrays
9. Selection sort

    public static void selectionsort( int[] data, int first, int n )
    {
        Find the largest element. Swap it with the last.
        Find the next largest. Swap it with the next to last.
        Etc.
    }

- The sort range is data[first .. (first + n - 1)]
- A typical call is selectionsort( a, 0, n )
  - Here a is an array of n cells
- Analysis
  - Best case = worst case = average case = O( n^2 )
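The pseudocode above can be sketched as a concrete Java method with the slide's signature; the loop structure below is one straightforward way to realize it, not necessarily the textbook's version.

```java
public class SelectionSortDemo
{
    public static void selectionsort( int[] data, int first, int n )
    {
        // Repeatedly find the largest element of the unsorted range
        // and swap it into the last unsorted position.
        for ( int last = first + n - 1; last > first; last-- )
        {
            int big = first;
            for ( int i = first + 1; i <= last; i++ )
                if ( data[i] > data[big] )
                    big = i;
            int t = data[big]; data[big] = data[last]; data[last] = t;
        }
    }

    public static void main( String[] args )
    {
        int[] a = { 8, 5, 7, 2, 1 };
        selectionsort( a, 0, a.length );  // the typical call from the slide
        System.out.println( java.util.Arrays.toString( a ) ); // [1, 2, 5, 7, 8]
    }
}
```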
10. Insertion sort

    public static void insertionsort( int[] data, int first, int n )
    {
        Consider data[0..0] already sorted
        Insert data[1] into the proper position of data[0..1]
        Insert data[2] into the proper position of data[0..2]
        Etc.
    }

- Each insert operation places an additional element into a portion of the array that has already been sorted, as follows
[Diagram: a twelve-element array a[0..11] with a sorted front portion and the next element about to be inserted into its proper place.]
11. Insertion sort
- Analysis
  - Worst case = average case = O( n^2 )
  - Best case = O( n )
- The algorithm takes advantage of the situation when the array is already sorted
- This is a good method when . . .
  - a few updates need to be added from time to time so that the array remains sorted
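The insertion steps on slide 10 can be sketched as Java with the slide's signature; the shifting loop below is a common realization and an assumption on my part, not the textbook's listing. Note how a nearly sorted input makes the inner while loop exit almost immediately, which is where the O( n ) best case comes from.

```java
public class InsertionSortDemo
{
    public static void insertionsort( int[] data, int first, int n )
    {
        // data[first..first] is already sorted; insert each later element
        // into its proper position within the sorted front portion.
        for ( int i = first + 1; i < first + n; i++ )
        {
            int entry = data[i];
            int j = i;
            while ( j > first && data[j - 1] > entry )
            {
                data[j] = data[j - 1]; // shift larger elements right
                j--;
            }
            data[j] = entry;
        }
    }

    public static void main( String[] args )
    {
        int[] a = { 1, 2, 5, 7, 8, 3 };  // sorted except one new element
        insertionsort( a, 0, a.length );
        System.out.println( java.util.Arrays.toString( a ) ); // [1, 2, 3, 5, 7, 8]
    }
}
```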
12. Recursive O( n log(n) ) methods
- We will consider
- Mergesort
- Quicksort
13. Mergesort

    public static void mergesort( int[] data, int first, int n )
    {
        Divide the array in half.
        Recursively apply mergesort to each half.
        Merge the sorted halves into a sorted temporary array.
        Copy the temporary array back to the original array.
    }

[Diagram: a ten-element array a[0..9] split in half, with mergesort applied recursively to each half.]
14. Mergesort
- Recursive stopping case
  - When a subarray to be sorted consists of only one element
- During the merge process, when one of the halves becomes empty, simply copy the remainder of the remaining half to the end of the temporary array
15. Mergesort merge method

    private static void merge( int[] data, int first, int n1, int n2 )
    {
        int[] temp = new int[n1+n2];  // Allocate the temporary array
        int copied  = 0;  // Number of elements copied from data to temp
        int copied1 = 0;  // Number copied from the first half of data
        int copied2 = 0;  // Number copied from the second half of data
        int i;            // Array index to copy from temp back into data

        while ( ( copied1 < n1 ) && ( copied2 < n2 ) )
        {
            if ( data[first + copied1] < data[first + n1 + copied2] )
                temp[copied++] = data[first + (copied1++)];       // bad style !
            else
                temp[copied++] = data[first + n1 + (copied2++)];  // bad style !
        }

        while ( copied1 < n1 )
            temp[copied++] = data[first + (copied1++)];           // bad style !
        while ( copied2 < n2 )
            temp[copied++] = data[first + n1 + (copied2++)];      // bad style !

        for ( i = 0; i < n1+n2; i++ )
            data[first + i] = temp[i];
    }
16. Mergesort

    private static void mergesort( int[] data, int first, int n )
    {
        int n1;  // Size of the first half of the array
        int n2;  // Size of the second half of the array

        if ( n > 1 )
        {
            // Compute sizes of the two halves
            n1 = n / 2;
            n2 = n - n1;

            mergesort( data, first, n1 );       // Sort data[first] through data[first+n1-1]
            mergesort( data, first + n1, n2 );  // Sort data[first+n1] to the end

            // Merge the two sorted halves.
            merge( data, first, n1, n2 );
        }
    }
17. Mergesort analysis
- The usual technique of determining the big-O of a recursive method does not work here
  - There are only half as many elements in the merge phase within each successive recursive call
- Instead, look at the merge activity across an entire level at a time
  - The big-O of merge across each level is O(n)
  - There are O( log(n) ) levels
  - Ignore the actual number of recursive calls
18. Mergesort analysis
[Diagram: the mergesort recursion tree — O( log(n) ) levels, with all n elements merged at each level.]
- Worst case = average case = best case = O( n log(n) )
19. Mergesort analysis
- A disadvantage of mergesort when used with arrays is that a second temporary array is needed
  - This effectively cuts the size of the largest array that can be sorted in half
- Advantages
  - Works with linked lists without need for a temporary array !
  - Can be used to sort data in a huge disk file
    - A file much too large to fit in memory
    - Subdivide the file into pieces small enough to fit in memory
    - Sort the pieces
    - Merge the pieces together
20. Quicksort

    public static void quicksort( int[] data, int first, int n )
    {
        Partition the array in two parts such that
        (all elements in the left part) < (all elements in the right part)
        Recursively apply quicksort to each part
    }

- Quicksort works in a manner opposite to mergesort
  - The partition operation iterates through all the elements before the recursive calls rather than after
  - The partition operation does rough sorting ahead of time
21. Quicksort partition method

    public static int partition( int[] data, int first, int n )

- The partition method rearranges elements such that all the elements in the left part will be smaller than any of the elements in the right part
- A value called the pivot value will end up between the elements of the two parts
- This pivot value is
  - >= each element of the left part
  - <= each element of the right part
- The partition method returns the pivot index, giving the position of the pivot value
22. Quicksort partition method
- Start by choosing a pivot value to help with the partitioning process
- Ideally, the pivot value would be the median of the elements in the sort range
  - Then the two parts would be nearly equal in size
  - However, finding the median is an O(n) operation
  - This is too inefficient
- Instead, simply choose data[first] for the pivot
  - Later, we will improve on this guess
- Then use two indices to sweep through the data
  - u for up
  - d for down
23. Quicksort partition method
- Sweep through the data as follows
  - Move u up and d down until the values they refer to are out of order with respect to the pivot value
  - Then swap the u and d values and continue
  - Stop when u and d pass each other
- Finally, swap data[d] with the pivot value data[first]
- Return index d
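The sweep just described might be coded as follows. The signature matches the slide, but the exact index handling (u starting at first + 1, the loop guards) is an assumption; the textbook's version may differ in such details while following the same u/d scheme.

```java
public class PartitionDemo
{
    // Partition data[first..first+n-1] around the pivot data[first];
    // return the final index of the pivot value.
    public static int partition( int[] data, int first, int n )
    {
        int pivot = data[first];
        int u = first + 1;      // sweeps up
        int d = first + n - 1;  // sweeps down

        while ( u <= d )
        {
            while ( u <= d && data[u] <= pivot )  // move u up past small values
                u++;
            while ( data[d] > pivot )             // move d down past large values
                d--;                              // (stops at first: data[first] == pivot)
            if ( u < d )                          // out of order: swap and continue
            {
                int t = data[u]; data[u] = data[d]; data[d] = t;
                u++; d--;
            }
        }

        // u and d have passed each other: place the pivot between the parts
        data[first] = data[d];
        data[d] = pivot;
        return d;
    }

    public static void main( String[] args )
    {
        int[] a = { 7, 9, 2, 8, 1, 5 };
        int p = partition( a, 0, a.length );
        System.out.println( p + " " + java.util.Arrays.toString( a ) ); // 3 [1, 5, 2, 7, 8, 9]
    }
}
```

After the call, every element left of index p is <= the pivot 7 and every element to its right is greater.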
24. Quicksort partition example
[Diagram: partition trace — u sweeps up and d sweeps down, swapping out-of-order pairs, until the two indices pass each other; afterward every element left of the pivot (7) is < the pivot and every element to its right is > the pivot.]
25. Quicksort
- Once elements in both parts are sorted, the entire array is sorted

    public static void quicksort( int[] data, int first, int n )
    {
        int pivotIndex;  // Array index for the pivot element
        int n1;          // Number of elements before the pivot element
        int n2;          // Number of elements after the pivot element

        if ( n > 1 )
        {
            // Partition the array, and set the pivot index.
            pivotIndex = partition( data, first, n );

            // Compute the sizes of the two pieces.
            n1 = pivotIndex - first;
            n2 = n - n1 - 1;

            // Recursive calls will now sort the two pieces.
            quicksort( data, first, n1 );
            quicksort( data, pivotIndex + 1, n2 );
        }
    }
26. Quicksort analysis
- Best case = average case = O( n log(n) )
  - When the pivot occurs near the center each time
  - Number of levels = O( log(n) )
  - Number of probes within each level = O(n)
- Worst case = O( n^2 )
  - When the pivot occurs near an end most of the time
  - For example, when the array is already sorted
  - Number of levels is only limited by n
27. Quicksort
- There is a better way to choose the pivot value
- 1. Choose the median of the three values . . .
  - data[first]
  - data[first + n - 1]
  - data[first + n/2]
- 2. Swap the chosen value with data[first]
- 3. Continue as before
- This method is called the median of three
- Statistically, this gives a much better pivot value
  - Performance is much more likely to be O( n log(n) )
  - Even when the data is already sorted
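The three numbered steps above can be sketched as a small helper. The name medianOfThree and the comparison logic are my assumptions for illustration; the slides only specify which three values to inspect and that the chosen one is swapped into data[first].

```java
public class MedianOfThreeDemo
{
    // Choose the median of data[first], data[first + n/2], and
    // data[first + n - 1], and swap it into data[first] so the
    // usual partition method can still use data[first] as the pivot.
    public static void medianOfThree( int[] data, int first, int n )
    {
        int mid  = first + n / 2;
        int last = first + n - 1;

        int median;
        if ( ( data[first] <= data[mid] ) == ( data[mid] <= data[last] ) )
            median = mid;    // mid lies between the other two
        else if ( ( data[mid] <= data[first] ) == ( data[first] <= data[last] ) )
            median = first;  // first lies between the other two
        else
            median = last;

        int t = data[first]; data[first] = data[median]; data[median] = t;
    }

    public static void main( String[] args )
    {
        int[] a = { 1, 2, 3, 4, 5, 6, 7 };  // already sorted: worst case for data[first]
        medianOfThree( a, 0, a.length );
        System.out.println( a[0] );          // 4 -- the middle value becomes the pivot
    }
}
```

On already-sorted data, data[first] would be the minimum (the worst pivot), while the median of three yields the true median, keeping the partition balanced.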
28. Other improvements
- Both mergesort and quicksort encounter more and more overhead due to recursion when the subarrays get small
- Both can be improved as follows
  - When a subarray contains fewer than some number M of elements, use the insertion sort method on the subarray instead of making a recursive call
  - A typical value for M might be around 100
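The cutoff idea can be sketched for quicksort as follows. This is an assumption-laden illustration: M = 16 is chosen only to exercise both branches on a small test (the slides suggest around 100 in practice), and the insertionsort and partition helpers are local versions, not the textbook's listings.

```java
public class HybridSortDemo
{
    static final int M = 16;  // cutoff; the slides suggest around 100 in practice

    public static void hybridQuicksort( int[] data, int first, int n )
    {
        if ( n < M )
            insertionsort( data, first, n );  // small subarray: no recursion
        else
        {
            int pivotIndex = partition( data, first, n );
            int n1 = pivotIndex - first;
            int n2 = n - n1 - 1;
            hybridQuicksort( data, first, n1 );
            hybridQuicksort( data, pivotIndex + 1, n2 );
        }
    }

    static void insertionsort( int[] data, int first, int n )
    {
        for ( int i = first + 1; i < first + n; i++ )
        {
            int entry = data[i];
            int j = i;
            while ( j > first && data[j - 1] > entry )
            {
                data[j] = data[j - 1];
                j--;
            }
            data[j] = entry;
        }
    }

    static int partition( int[] data, int first, int n )
    {
        int pivot = data[first];
        int u = first + 1, d = first + n - 1;
        while ( u <= d )
        {
            while ( u <= d && data[u] <= pivot ) u++;
            while ( data[d] > pivot ) d--;
            if ( u < d )
            {
                int t = data[u]; data[u] = data[d]; data[d] = t;
                u++; d--;
            }
        }
        data[first] = data[d];
        data[d] = pivot;
        return d;
    }

    public static void main( String[] args )
    {
        java.util.Random rng = new java.util.Random( 42 );
        int[] a = new int[100];
        for ( int i = 0; i < a.length; i++ )
            a[i] = rng.nextInt( 1000 );
        hybridQuicksort( a, 0, a.length );

        boolean sorted = true;
        for ( int i = 1; i < a.length; i++ )
            if ( a[i - 1] > a[i] ) sorted = false;
        System.out.println( sorted ); // true
    }
}
```

The same cutoff works for mergesort: test n < M at the top of the recursive method and fall back to insertion sort there as well.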