Chapter 12 Sorting

Provided by: markt2

1
Chapter 12 Sorting
  • CS 260 Data Structures
  • Indiana University Purdue University Fort Wayne

2
Note
  • We temporarily skip ahead to Section 12.3
  • Pages 633-643

3
Heapsort
  • Heapsort is a superior O( n log(n) ) method
  • Assume the array to sort is
    int[] a = new int[n];
  • Overview
  • First convert an unsorted array to a heap
  • Then, iteratively, remove the root element,
    rebuilding the heap each time
  • The root element is always the largest remaining
    element
  • The elements, as removed, are in descending order
  • Notation: Define a[i..j] to consist of the
    elements a[i], a[i+1], a[i+2], ..., a[j]

4
Heapsort method details
public static void heapsort( int[] a, int n )
  • 1. Note that a[0..0] is already a heap (only
    one element)
  • 2. Turn a[0..(n-1)] into a heap by successively
    adding . . .
    a[1] to a[0..0]
    a[2] to a[0..1]
    a[3] to a[0..2]
    . . .
    a[n-1] to a[0..(n-2)]

These steps could be written as a private helper
method called makeHeap
public static void makeHeap( int[] a, int n )
5
Heapsort method details
  • 3. Iteratively remove the largest remaining
    element and rebuild the heap
  • Note: The largest remaining element is only
    removed from the heap logically but remains
    physically in the array (in its new position)
  • Note: Reheapification downward could be written
    as a private helper method called reheapifyDown

for ( int i = n-1; i > 0; i-- )
{
   Exchange a[0] with a[i]
   Perform reheapification downward on a[0..(i-1)]
}
public static void reheapifyDown( int[] a, int n )
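Steps 1 through 3 can be put together as a minimal sketch. The class name HeapsortSketch, the swap helper, and the bodies of makeHeap and reheapifyDown are assumptions written to match the signatures on the slides, not the course's own code.

```java
import java.util.Arrays;

class HeapsortSketch {
    // Sorts a[0..n-1] into ascending order.
    static void heapsort(int[] a, int n) {
        makeHeap(a, n);
        for (int i = n - 1; i > 0; i--) {
            swap(a, 0, i);        // largest remaining element moves to a[i]
            reheapifyDown(a, i);  // restore the heap on a[0..i-1]
        }
    }

    // Steps 1 and 2: add a[1], a[2], ..., a[n-1] to the heap one at a time.
    static void makeHeap(int[] a, int n) {
        for (int i = 1; i < n; i++) {
            int k = i;
            // Reheapify upward: swap with the parent while larger than it.
            while (k > 0 && a[k] > a[(k - 1) / 2]) {
                swap(a, k, (k - 1) / 2);
                k = (k - 1) / 2;
            }
        }
    }

    // Assumes a[0..n-1] is a heap except possibly at the root.
    static void reheapifyDown(int[] a, int n) {
        int k = 0;
        while (2 * k + 1 < n) {
            int child = 2 * k + 1;                        // left child
            if (child + 1 < n && a[child + 1] > a[child]) {
                child++;                                  // right child is larger
            }
            if (a[k] >= a[child]) {
                break;                                    // heap property holds
            }
            swap(a, k, child);
            k = child;
        }
    }

    static void swap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    public static void main(String[] args) {
        int[] a = {5, 8, 7, 2, 1};
        heapsort(a, a.length);
        System.out.println(Arrays.toString(a)); // prints [1, 2, 5, 7, 8]
    }
}
```

Because each removed root is parked at the end of the shrinking heap, the elements come off in descending order but the array ends up ascending.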
6
Heapsort example
[Figure: heapsort traced on a five-element array a[0..4]. makeHeap builds the heap; each later step swaps the root with the last element of the heap and reheapifies downward, ending with the final array.]
7
Analysis of heapsort
  • Since the heap is a complete binary tree, it is
    automatically balanced
  • The depth of the tree is O( log(n) )
  • Worst case analysis of heapsort is O( n log(n) )
  • Steps 1 and 2 have O( n log(n) ) performance
  • Step 3 has O( n log(n) ) performance
  • The steps run in sequence, so their costs add
    and the total is still O( n log(n) )
  • The worst case performance is also the best case
    performance and the average case performance

8
Quadratic sorting algorithms
Now, back to the beginning of Chapter 12
Section 12.1, page 600
  • Quadratic sorting algorithms
  • Have inefficient worst case O( n^2 ) performance
  • Are easy to implement
  • O( n^2 ) performance doesn't matter for small
    arrays

9
Selection sort
public static void selectionsort( int[] data, int first, int n )
{
   Find the largest element. Swap it with the last.
   Find the next largest. Swap it with the next to last.
   Etc.
}
  • The sort range is data[first .. (first + n - 1)]
  • A typical call is selectionsort( a, 0, n )
  • Here a is an array of n cells
  • Analysis
  • Best case = worst case = average case = O( n^2 )
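The pseudocode above could be filled in roughly as follows. This is a sketch under the slide's signature; the class name SelectionSortSketch and the local variable names are assumptions.

```java
import java.util.Arrays;

class SelectionSortSketch {
    // Sorts data[first .. first+n-1] into ascending order.
    static void selectionsort(int[] data, int first, int n) {
        // 'last' marks the end of the still-unsorted range.
        for (int last = first + n - 1; last > first; last--) {
            int big = first;  // index of the largest element seen so far
            for (int j = first + 1; j <= last; j++) {
                if (data[j] > data[big]) {
                    big = j;
                }
            }
            // Swap the largest element into the last unsorted position.
            int t = data[last];
            data[last] = data[big];
            data[big] = t;
        }
    }

    public static void main(String[] args) {
        int[] a = {4, 1, 3, 9, 2};
        selectionsort(a, 0, a.length);
        System.out.println(Arrays.toString(a)); // prints [1, 2, 3, 4, 9]
    }
}
```

The inner loop always scans the whole unsorted range, whatever the input order, which is why the best, worst, and average cases are all quadratic.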

10
Insertion sort
public static void insertionsort( int[] data, int first, int n )
{
   Consider data[0..0] already sorted
   Insert data[1] into the proper position of data[0..1]
   Insert data[2] into the proper position of data[0..2]
   Etc.
}
  • Each insert operation places an additional
    element into a portion of the array that has
    already been sorted as follows

[Figure: array a with indices 0 to 11; the leading portion is marked as sorted]
11
Insertion sort
  • Analysis
  • Worst case = average case = O( n^2 )
  • Best case = O( n )
  • The algorithm takes advantage of the situation
    when the array is already sorted
  • This is a good method when . . .
  • a few updates need to be added from time to time
    so that the array remains sorted
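A minimal sketch of insertion sort under the slide's signature follows. The class name InsertionSortSketch is an assumption, and the code generalizes the slide's data[0..0] starting point to data[first..first].

```java
import java.util.Arrays;

class InsertionSortSketch {
    // Sorts data[first .. first+n-1] into ascending order.
    static void insertionsort(int[] data, int first, int n) {
        // data[first..first] is already sorted; insert the rest one by one.
        for (int i = first + 1; i < first + n; i++) {
            int value = data[i];
            int j = i;
            // Shift larger elements of the sorted portion one cell right.
            while (j > first && data[j - 1] > value) {
                data[j] = data[j - 1];
                j--;
            }
            data[j] = value;
        }
    }

    public static void main(String[] args) {
        int[] a = {6, 2, 8, 1, 5};
        insertionsort(a, 0, a.length);
        System.out.println(Arrays.toString(a)); // prints [1, 2, 5, 6, 8]
    }
}
```

On an already sorted array the while loop exits immediately for every i, so only n - 1 comparisons are made, which is the O( n ) best case.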

12
Recursive O( n log(n) ) methods
  • We will consider
  • Mergesort
  • Quicksort

13
Mergesort
public static void mergesort( int[] data, int first, int n )
{
   Divide the array in half.
   Recursively apply mergesort to each half.
   Merge the sorted halves into a sorted temporary array
   Copy the temporary array back to the original array
}
[Figure: array with indices 0 to 9 divided into two halves, with mergesort applied to each half]
14
Mergesort
  • Recursive stopping case
  • When a subarray to be sorted consists of only one
    element
  • During the merge process, when one of the halves
    becomes empty, simply copy the remainder of the
    remaining half to the end of the temporary array

15
private static void merge( int[] data, int first, int n1, int n2 )
{
   int[] temp = new int[n1+n2]; // Allocate the temporary array
   int copied = 0;   // Number of elements copied from data to temp
   int copied1 = 0;  // Number copied from the first half of data
   int copied2 = 0;  // Number copied from the second half of data
   int i;            // Array index to copy from temp back into data

   while ( ( copied1 < n1 ) && ( copied2 < n2 ) )
   {
      if ( data[first + copied1] < data[first + n1 + copied2] )
         temp[copied++] = data[first + (copied1++)];      // bad style !
      else
         temp[copied++] = data[first + n1 + (copied2++)]; // bad style !
   }
   while ( copied1 < n1 )
      temp[copied++] = data[first + (copied1++)];         // bad style !
   while ( copied2 < n2 )
      temp[copied++] = data[first + n1 + (copied2++)];    // bad style !

   for ( i = 0; i < n1+n2; i++ )
      data[first + i] = temp[i];
}
16
Mergesort
private static void mergesort( int[] data, int first, int n )
{
   int n1; // Size of the first half of the array
   int n2; // Size of the second half of the array

   if ( n > 1 )
   {
      // Compute sizes of the two halves
      n1 = n / 2;
      n2 = n - n1;

      mergesort( data, first, n1 );      // Sort data[first] through data[first+n1-1]
      mergesort( data, first + n1, n2 ); // Sort data[first+n1] to the end

      // Merge the two sorted halves.
      merge( data, first, n1, n2 );
   }
}
17
Mergesort analysis
  • The usual technique of determining the big-O of a
    recursive method does not work here
  • There are only half as many elements in the merge
    phase within each successive recursive call
  • Instead look at the merge activity across an
    entire level at a time
  • The big-O of merge across each level is O(n)
  • There are O( log(n) ) levels
  • Ignore the actual number of recursive calls

18
Mergesort analysis
[Figure: recursion tree with log(n) levels, each level holding n elements in total]
  • Worst case = average case = best case = O( n
    log(n) )
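The level-by-level argument can be written out as a short sum. As simplifying assumptions, take n to be a power of two and let merging m elements cost at most c·m for some constant c (the symbols c and k are introduced here, not on the slides). Level k of the recursion holds 2^k subarrays of n/2^k elements each, so:

```latex
\[
\sum_{k=0}^{\log_2 n - 1} \underbrace{2^k}_{\text{subarrays}} \cdot
  \underbrace{c\,\frac{n}{2^k}}_{\text{merge cost each}}
  \;=\; \sum_{k=0}^{\log_2 n - 1} c\,n
  \;=\; c\,n\,\log_2 n
  \;=\; O(n \log n)
\]
```

Every level contributes the same c·n, and the split never depends on the data values, which is why the best, average, and worst cases coincide.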

19
Mergesort analysis
  • A disadvantage of mergesort when used with arrays
    is that a second temporary array is needed
  • This effectively cuts the size of the largest
    array that can be sorted in half
  • Advantages
  • Works with linked lists without need for a
    temporary array !
  • Can be used to sort data in a huge disk file
  • A file much too large to fit in memory
  • Subdivide the file into pieces small enough to
    fit in memory
  • Sort the pieces
  • Merge the pieces together
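The linked-list advantage can be illustrated with a minimal sketch. The Node class, the class name ListMergesortSketch, and the slow/fast pointer split are assumptions chosen for illustration; the slides do not show a list version.

```java
class ListMergesortSketch {
    // Hypothetical singly linked node holding an int.
    static final class Node {
        int value;
        Node next;
        Node(int value, Node next) {
            this.value = value;
            this.next = next;
        }
    }

    // Returns the head of the sorted list; relinks nodes, copies no elements.
    static Node mergesort(Node head) {
        if (head == null || head.next == null) {
            return head;  // stopping case: zero or one element
        }
        // Find the middle: 'fast' moves two steps for each step of 'slow'.
        Node slow = head;
        Node fast = head.next;
        while (fast != null && fast.next != null) {
            slow = slow.next;
            fast = fast.next.next;
        }
        Node second = slow.next;
        slow.next = null;  // cut the list into two halves
        return merge(mergesort(head), mergesort(second));
    }

    // Merges two sorted lists by relinking their nodes.
    static Node merge(Node a, Node b) {
        Node dummy = new Node(0, null);  // placeholder in front of the result
        Node tail = dummy;
        while (a != null && b != null) {
            if (a.value <= b.value) {
                tail.next = a;
                a = a.next;
            } else {
                tail.next = b;
                b = b.next;
            }
            tail = tail.next;
        }
        tail.next = (a != null) ? a : b;  // append the remaining half
        return dummy.next;
    }

    public static void main(String[] args) {
        Node head = new Node(3, new Node(1, new Node(2, null)));
        for (Node p = mergesort(head); p != null; p = p.next) {
            System.out.print(p.value + " ");  // prints 1 2 3
        }
        System.out.println();
    }
}
```

The merge step only changes next links, so no temporary array proportional to the list length is allocated.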

20
Quicksort
public static void quicksort( int[] data, int first, int n )
{
   Partition the array in two parts such that
   (all elements in the left part) < (all elements in the right part)
   Recursively apply quicksort to each part
}
  • Quicksort works in a manner opposite to mergesort
  • The partition operation iterates through all the
    elements before the recursive calls rather than
    after
  • The partition operation does rough sorting ahead
    of time

21
Quicksort partition method
public static int partition( int[] data, int first, int n )
  • The partition method rearranges elements such
    that all the elements in the left part will be
    smaller than any of the elements in the right
    part
  • A value called the pivot value will end up
    between the elements of the two parts
  • This pivot value is
  • > each element of the left part
  • < each element of the right part
  • Partition method returns the pivot index giving
    the position of the pivot value

22
Quicksort partition method
  • Start by choosing a pivot value to help with
    the partitioning process
  • Ideally, the pivot value would be the median of
    the elements in the sort range
  • Then the two parts would be nearly equal in size
  • However, finding the median is itself an O(n)
    operation
  • This adds too much overhead to every partition
  • Instead, simply choose data[first] for the
    pivot
  • Later, we will improve on this guess
  • Then use two indices to sweep through the data
  • u for up
  • d for down

23
Quicksort partition method
  • Sweep through the data as follows
  • Move u up and d down until the values they refer
    to are out of order with respect to the pivot
    value
  • Then swap the u and d values and continue
  • Stop when u and d pass each other
  • Finally, swap data[d] with the pivot value
    data[first]
  • Return index d
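The sweep described above might be coded along these lines. This is a sketch, not the course's own method: the class name PartitionSketch is an assumption, and treating elements equal to the pivot as belonging to the left part is a choice made here so that duplicates are handled.

```java
import java.util.Arrays;

class PartitionSketch {
    // Partitions data[first .. first+n-1] around the pivot data[first]
    // and returns the index where the pivot ends up. Assumes n >= 1.
    static int partition(int[] data, int first, int n) {
        int pivot = data[first];
        int u = first + 1;      // sweeps up, looking for a value > pivot
        int d = first + n - 1;  // sweeps down, looking for a value <= pivot
        while (u <= d) {
            while (u < first + n && data[u] <= pivot) {
                u++;
            }
            while (data[d] > pivot) {
                d--;            // stops at 'first' at the latest
            }
            if (u < d) {        // both values are out of order: swap them
                int t = data[u];
                data[u] = data[d];
                data[d] = t;
            }
        }
        // u and d have passed each other; put the pivot between the parts.
        data[first] = data[d];
        data[d] = pivot;
        return d;
    }

    public static void main(String[] args) {
        int[] data = {7, 9, 3, 8, 2, 10, 5};
        int p = partition(data, 0, data.length);
        System.out.println(p + " " + Arrays.toString(data));
        // prints 3 [2, 5, 3, 7, 8, 10, 9]
    }
}
```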

24
Quicksort partition example
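[Figure: five successive snapshots of the array during partitioning, each marking the pivot and the positions of u and d. In the final snapshot, all elements to the left are < pivot (7) and all elements to the right are > pivot (7).]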
25
Quicksort
  • Once elements in both parts are sorted, the
    entire array is sorted

public static void quicksort( int[] data, int first, int n )
{
   int pivotIndex; // Array index for the pivot element
   int n1;         // Number of elements before the pivot element
   int n2;         // Number of elements after the pivot element

   if ( n > 1 )
   {
      // Partition the array, and set the pivot index.
      pivotIndex = partition( data, first, n );

      // Compute the sizes of the two pieces.
      n1 = pivotIndex - first;
      n2 = n - n1 - 1;

      // Recursive calls will now sort the two pieces.
      quicksort( data, first, n1 );
      quicksort( data, pivotIndex + 1, n2 );
   }
}
26
Quicksort analysis
  • Best case = average case = O( n log(n) )
  • When the pivot lands near the center each time
  • Number of levels is O( log(n) )
  • Number of probes within each level is O(n)
  • Worst case = O( n^2 )
  • When the pivot lands near an end most of the time
  • For example, when the array is already sorted
    and data[first] is used as the pivot
  • Number of levels is only limited by n
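The worst case can be made concrete with a recurrence. As an assumption for illustration, let partitioning m elements cost c·m for some constant c, and suppose every partition leaves one part empty, as happens on sorted input with data[first] as the pivot. The symbols T, c, and k are introduced here and do not appear on the slides.

```latex
\[
T(1) = c, \qquad T(n) = T(n-1) + c\,n \quad (n > 1)
\]
\[
T(n) \;=\; c \sum_{k=1}^{n} k \;=\; c\,\frac{n(n+1)}{2} \;=\; O(n^2)
\]
```

There are n levels instead of log(n), and the work per level shrinks by only one element each time.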

27
Quicksort
  • There is a better way to choose the pivot value
  • 1. Choose the median of the three values . . .
  • data[first]
  • data[first + n - 1]
  • data[first + n/2]
  • 2. Swap the chosen value with data[first]
  • 3. Continue as before
  • This method is called the median of three
  • Statistically, this gives a much better pivot
    value
  • Performance is much more likely to be
    O( n log(n) )
  • Even when the data is already sorted
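Steps 1 and 2 could look like the sketch below. The method name medianOfThreeToFront and the class name MedianOfThreeSketch are hypothetical; after the call, a partition method that uses data[first] as the pivot can run unchanged.

```java
import java.util.Arrays;

class MedianOfThreeSketch {
    // Moves the median of data[first], data[first + n/2], and
    // data[first + n - 1] into data[first]. Assumes n >= 1.
    static void medianOfThreeToFront(int[] data, int first, int n) {
        int lo = first;
        int mid = first + n / 2;
        int hi = first + n - 1;
        int a = data[lo];
        int b = data[mid];
        int c = data[hi];

        int m;  // index of the median value
        if ((a <= b && b <= c) || (c <= b && b <= a)) {
            m = mid;
        } else if ((b <= a && a <= c) || (c <= a && a <= b)) {
            m = lo;
        } else {
            m = hi;
        }

        // Step 2: swap the chosen value with data[first].
        int t = data[first];
        data[first] = data[m];
        data[m] = t;
    }

    public static void main(String[] args) {
        int[] sorted = {1, 2, 3, 4, 5};
        medianOfThreeToFront(sorted, 0, sorted.length);
        System.out.println(Arrays.toString(sorted)); // prints [3, 2, 1, 4, 5]
    }
}
```

On already sorted input the middle element is chosen, so the two parts come out nearly equal instead of one being empty.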

28
Other improvements
  • Both mergesort and quicksort encounter more and
    more overhead due to recursion when the subarrays
    get small
  • Both can be improved as follows
  • When a subarray represents less than some number
    M of elements, use the insertion sort method on
    the subarray instead of making a recursive call
  • A typical value for M might be around 100
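The cutoff idea is sketched below for quicksort. The class name CutoffQuicksortSketch and the bodies of insertionsort and partition are assumptions for illustration; M is set to 100 only because that is the figure suggested above, and in practice it would be tuned by measurement.

```java
import java.util.Arrays;
import java.util.Random;

class CutoffQuicksortSketch {
    static final int M = 100;  // assumed cutoff; tune by measurement

    static void quicksort(int[] data, int first, int n) {
        if (n < M) {
            insertionsort(data, first, n);  // small subarray: no recursion
            return;
        }
        int pivotIndex = partition(data, first, n);
        int n1 = pivotIndex - first;
        int n2 = n - n1 - 1;
        quicksort(data, first, n1);
        quicksort(data, pivotIndex + 1, n2);
    }

    static void insertionsort(int[] data, int first, int n) {
        for (int i = first + 1; i < first + n; i++) {
            int value = data[i];
            int j = i;
            while (j > first && data[j - 1] > value) {
                data[j] = data[j - 1];
                j--;
            }
            data[j] = value;
        }
    }

    // Partitions around data[first]; returns the pivot's final index.
    static int partition(int[] data, int first, int n) {
        int pivot = data[first];
        int u = first + 1;
        int d = first + n - 1;
        while (u <= d) {
            while (u < first + n && data[u] <= pivot) u++;
            while (data[d] > pivot) d--;
            if (u < d) {
                int t = data[u];
                data[u] = data[d];
                data[d] = t;
            }
        }
        data[first] = data[d];
        data[d] = pivot;
        return d;
    }

    public static void main(String[] args) {
        int[] data = new Random(42).ints(1000, 0, 10000).toArray();
        int[] expected = data.clone();
        Arrays.sort(expected);
        quicksort(data, 0, data.length);
        System.out.println(Arrays.equals(data, expected)); // prints true
    }
}
```

The same test on n could be placed at the top of mergesort in place of its n > 1 check.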