Sorting - PowerPoint PPT Presentation

About This Presentation
Title:

Sorting

Description:

Radix sort. O(mN) where m is the number of columns to be ... The Radix Sort. This sort is based on the concept of sorting IBM's 80 column Hollerith cards. ... – PowerPoint PPT presentation

Slides: 53
Provided by: jimvand
Category:
Tags: radix | sorting


Transcript and Presenter's Notes

Title: Sorting


1
Chapter 7
  • Sorting

2
Objective
  • To understand the importance of sorting
  • Review known sorting techniques
  • Develop improved sorting techniques (from a
    computer complexity viewpoint)
  • Implement several sorting classes

3
Review of Sorting Techniques
  • Selection sort
  • Bubble sort
  • Insertion sort
  • All of the above are O(N^2)
  • Radix sort
  • O(mN), where m is the number of columns to be
    sorted

4
Selection Sort Steps
  • If the array is of length n, we need n-1 steps
  • If we are at a_i, we must find the smallest
    number between a_i and a_n
  • We need to exchange a_i with this smallest number
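
The steps above can be sketched in C++ as follows. This is a minimal illustration, not code from the slides: the name selection_sort, the use of std::vector, and the 0-based indexing are assumptions made for the example.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Illustrative selection sort (name and signature are assumptions).
// Sorts a into ascending order using 0-based indices.
template <class Etype>
void selection_sort(std::vector<Etype>& a)
{
    const std::size_t n = a.size();
    for (std::size_t i = 0; i + 1 < n; ++i)       // n-1 steps
    {
        std::size_t smallest = i;
        for (std::size_t j = i + 1; j < n; ++j)   // scan the rest of the array
            if (a[j] < a[smallest])
                smallest = j;
        if (smallest != i)
            std::swap(a[i], a[smallest]);         // exchange a[i] with the minimum
    }
}
```

Both loops together perform about n(n-1)/2 comparisons, which is where the O(N^2) bound comes from.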

5
  • The step by step process to sort the components
    of the array a into ascending order is

Compare a0 with a1 through a4 and swap
a0 with the smallest value (a4)
Next, compare a1 with the contents of a2
through a4. No change.
Next, compare a2 with the contents of a3
through a4 and swap a2 with the smallest of
these values.
Next, compare a3 with the contents of a4 and
swap a3 with the smallest of these values
6
Bubble Sort Steps
  • During the first pass through the array
  • Compare each pair of consecutive members
  • Interchange consecutive members when necessary
  • After the first pass, the nth element is in
    sequence
  • Repeat through the first n-1 elements
  • Repeat
  • In general, the kth pass puts the (n-k+1)th
    element in sequence
  • Notice that, during any pass, if no changes are
    made, we can stop (the array is in sequence).
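
A minimal sketch of these steps, including the early exit when a pass makes no changes. The name bubble_sort and the returned pass count are assumptions added for illustration; they are not part of the slides.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Illustrative bubble sort (name and return value are assumptions).
// Returns the number of passes actually performed.
template <class Etype>
std::size_t bubble_sort(std::vector<Etype>& a)
{
    const std::size_t n = a.size();
    std::size_t passes = 0;
    for (std::size_t k = 1; k < n; ++k)            // pass k
    {
        bool swapped = false;
        ++passes;
        // The last k-1 positions are already in sequence, so stop before them.
        for (std::size_t j = 0; j + k < n; ++j)
        {
            if (a[j + 1] < a[j])
            {
                std::swap(a[j], a[j + 1]);         // interchange consecutive members
                swapped = true;
            }
        }
        if (!swapped)                              // no changes: array is in sequence
            break;
    }
    return passes;
}
```

On an array that is already sorted, the function stops after a single pass.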

7
  • The step by step process using the bubble sort

After the first pass
After the second pass
After the third pass
After the fourth pass
8
Insertion Sort Steps
  • Starting with the first element, as you advance
    to the next element, insert in the correct order
  • As you insert, compare with the element before
    the new element and move downward
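
One way to write the insertion step is shown below. It is a sketch under assumptions: the name insertion_sort and the std::vector interface are illustrative, and larger elements are shifted one slot rather than swapped pairwise.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Illustrative insertion sort (name and signature are assumptions).
template <class Etype>
void insertion_sort(std::vector<Etype>& a)
{
    for (std::size_t p = 1; p < a.size(); ++p)
    {
        Etype tmp = std::move(a[p]);               // the new element to insert
        std::size_t j = p;
        // Compare with the element before the open slot and shift it
        // one position toward the end while it is larger than tmp.
        while (j > 0 && tmp < a[j - 1])
        {
            a[j] = std::move(a[j - 1]);
            --j;
        }
        a[j] = std::move(tmp);                     // drop tmp into its slot
    }
}
```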

9
  • The step by step process to sort by insertion

First step
Second step
Third step
Last step
10
Applet for Sorting
11
Computer Complexity
  • All of the above are O(N^2)
  • They can be classified as sorts that exchange
    adjacent elements
  • It can be shown that any sort in this class takes
    Ω(N^2) time on average

12
The Radix Sort
  • This sort is based on the concept of sorting
    IBM's 80-column Hollerith cards.
  • To sort on columns 1-5, sort on column 5 first and
    progress column by column until column 1.
  • Note that this takes 5 passes over the N cards,
    about 5N operations in total, i.e., this sort is
    O(N) for a fixed number of columns.
  • The cards sorted on each column must be kept in
    memory, in structures called bins
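
The column-by-column idea can be sketched for non-negative integers, treating each decimal digit as a card column. The function name radix_sort, the choice of ten bins, and the restriction to at most 9 decimal digits (so the divisor fits in a 32-bit unsigned) are assumptions for this example.

```cpp
#include <array>
#include <vector>

// Illustrative least-significant-digit radix sort (names are assumptions).
// Sorts values that have at most m decimal digits; assumes m <= 9.
void radix_sort(std::vector<unsigned>& a, unsigned m)
{
    unsigned divisor = 1;
    for (unsigned col = 0; col < m; ++col)         // one pass per column
    {
        std::array<std::vector<unsigned>, 10> bins; // one bin per digit value
        for (unsigned x : a)
            bins[(x / divisor) % 10].push_back(x);  // distribute by this digit
        a.clear();
        for (const auto& bin : bins)                // collect bins in order 0..9
            a.insert(a.end(), bin.begin(), bin.end());
        divisor *= 10;
    }
}
```

Each pass keeps the order established by earlier passes within a bin, which is why starting from the last column works. The total work is m passes over N items, matching the O(mN) bound.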

15
Sorting Indices
  • Maintaining a separate array of indices
  • Run time can be reduced by sorting the indices
  • Why?
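
One likely answer to the question above is that moving an index is cheaper than moving a large record, so the sort rearranges only small integers while the data stays in place. The sketch below illustrates this; the name sorted_indices and the use of std::sort are assumptions, not part of the slides.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Illustrative index sort (name is an assumption).
// Returns idx such that data[idx[0]] <= data[idx[1]] <= ...
// The data array itself is never modified.
template <class Etype>
std::vector<std::size_t> sorted_indices(const std::vector<Etype>& data)
{
    std::vector<std::size_t> idx(data.size());
    std::iota(idx.begin(), idx.end(), std::size_t{0});   // 0, 1, 2, ...
    std::sort(idx.begin(), idx.end(),
              [&data](std::size_t x, std::size_t y)
              { return data[x] < data[y]; });            // compare the records
    return idx;
}
```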

16
Heapsort
  • A heap is a natural structure to use for sorting
  • Build a binary heap
  • This takes O(N) time
  • Perform deleteMin N times
  • Place each deleted node in a second array
  • Each deleteMin takes O(log N)
  • Hence the total run time is O(N log N)
  • The drawback is that it doubles the memory
    allocation for the sort

17
Better Heapsort Algorithm
  • Build a max heap, so that the largest element is
    at the root
  • Interchange the root with the last element
  • Reduce the heap size by 1
  • Restore the heap property
  • Repeat this n times for the n elements in the
    array

18
Complexity
  • The complexity is O( n log n ), even in the worst
    case
  • If worst case behavior is critically important,
    this is a very good sort

19
Build Heap and Delete Max
  • The heap is a max heap
  • Remove 97
  • The last element, 31, tries to go to the root but
    filters down to its natural position
  • The 97 goes to where the 31 used to be but is not
    part of the heap
  • This process is continued until the data is
    sorted in ascending order
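
The same build-heap and delete-max process can be sketched with the standard library's heap functions. This is an illustration, not the slides' routine: std::make_heap builds a max heap, and each std::pop_heap call moves the current maximum to the end of the shrinking range and restores the heap property on the rest. The name heap_sort_std is an assumption.

```cpp
#include <algorithm>
#include <vector>

// Illustrative in-place heapsort using the standard heap functions
// (the name heap_sort_std is an assumption).
void heap_sort_std(std::vector<int>& a)
{
    std::make_heap(a.begin(), a.end());            // build a max heap
    for (auto end = a.end(); end != a.begin(); --end)
        std::pop_heap(a.begin(), end);             // max goes to end-1; the heap
                                                   // now covers [begin, end-1)
}
```

The loop is equivalent to a single call to std::sort_heap; it is written out here to mirror the steps on the slide.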

20
Heapsort Code - 1
    // Percolates A[ i ] down in a max heap stored in A[ 1 ] .. A[ N ].
    template <class Etype>
    void
    Perc_Down( Etype A[ ], unsigned int i, const unsigned int N )
    {
        unsigned int Child;
        Etype Tmp = A[ i ];
        for( ; i * 2 <= N; i = Child )
        {
            Child = i * 2;
            // Pick the larger of the two children.
            if( Child != N && A[ Child + 1 ] > A[ Child ] )
                Child++;
            if( Tmp < A[ Child ] )
                A[ i ] = A[ Child ];
            else
                break;
        }
        A[ i ] = Tmp;
    }

21
Heapsort Code - 2
    // Sorts A[ 1 ] .. A[ N ] into ascending order.
    template <class Etype>
    void
    Heap_Sort( Etype A[ ], const unsigned int N )
    {
        unsigned int i;   // Declared here so both loops can use it.
        for( i = N / 2; i > 0; i-- )   // Build_Heap.
            Perc_Down( A, i, N );
        for( i = N; i >= 2; i-- )
        {
            Swap( A[ 1 ], A[ i ] );   // Delete_Max.
            Perc_Down( A, ( unsigned int ) 1, i - 1 );
        }
    }

22
Applet for Heapsort
23
Homework
  • Build a max heap from the following data, then
    show the data pass by pass as the heapsort is
    performed: 77 22 84 34 35 75 21 46 88

24
Mergesort - 1
  • A mergesort is based on merging two sorted
    subsequences together to produce a sorted,
    combined subsequence
  • Divide and Conquer
  • The complexity is O(N log N), even in the worst
    case
  • The sizes of the subsequences grow from 1 to 2 to
    4 to 8 to 16 and so forth, so it takes log n
    rounds of merging, each of complexity O(n)
  • Requires 3 pointers, call them Actr, Bctr, Cctr,
    where we are merging A and B to form C
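
The three-pointer merge can be sketched as follows. The counter names Actr, Bctr and Cctr come from the slide; the function name merge_sorted and the std::vector interface are assumptions for the example.

```cpp
#include <cstddef>
#include <vector>

// Illustrative merge of two sorted sequences A and B into C
// (the function name is an assumption).
template <class Etype>
std::vector<Etype> merge_sorted(const std::vector<Etype>& A,
                                const std::vector<Etype>& B)
{
    std::vector<Etype> C(A.size() + B.size());
    std::size_t Actr = 0, Bctr = 0, Cctr = 0;
    while (Actr < A.size() && Bctr < B.size())
    {
        if (B[Bctr] < A[Actr])
            C[Cctr++] = B[Bctr++];                 // smaller element is in B
        else
            C[Cctr++] = A[Actr++];                 // ties are taken from A
    }
    while (Actr < A.size())                        // copy the rest of A
        C[Cctr++] = A[Actr++];
    while (Bctr < B.size())                        // copy the rest of B
        C[Cctr++] = B[Bctr++];
    return C;
}
```

Every element is copied exactly once, so one merge of n elements in total costs O(n).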

25
Mergesort 2
26
Mergesort -3
27
Mergesort Code - 1
    // Sorts A[ 1 ] .. A[ N ] into ascending order.
    template <class Etype>
    void
    Merge_Sort( Etype A[ ], const unsigned int N )
    {
        Etype *Tmp_Array = new Etype[ N + 1 ];
        unsigned int New_N = N;   // Non-constant, for m_sort.
        // Note: standard new throws std::bad_alloc on failure rather
        // than returning NULL, so the else branch is only reached
        // with older compilers.
        if( Tmp_Array != NULL )
        {
            M_Sort( A, Tmp_Array, ( unsigned int ) 1, New_N );
            delete [ ] Tmp_Array;
        }
        else
            Error( "No space for tmp array" );
    }

28
    template <class Etype>
    void
    M_Sort( Etype A[ ], Etype Tmp_Array[ ],
            unsigned int Left, unsigned int Right )
    {
        if( Left < Right )
        {
            unsigned int Center = ( Left + Right ) / 2;
            M_Sort( A, Tmp_Array, Left, Center );
            M_Sort( A, Tmp_Array, Center + 1, Right );
            Merge( A, Tmp_Array, Left, Center + 1, Right );
        }
    }

29
Mergesort Code - 2
    // Left_Pos = start of left half.
    // Right_Pos = start of right half.
    template <class Etype>
    void
    Merge( Etype A[ ], Etype Tmp_Array[ ], unsigned int Left_Pos,
           unsigned int Right_Pos, unsigned int Right_End )
    {
        unsigned int Left_End = Right_Pos - 1;
        unsigned int Tmp_Pos = Left_Pos;
        unsigned int Num_Elements = Right_End - Left_Pos + 1;
        // Main loop.
        while( Left_Pos <= Left_End && Right_Pos <= Right_End )
            if( A[ Left_Pos ] <= A[ Right_Pos ] )
                Tmp_Array[ Tmp_Pos++ ] = A[ Left_Pos++ ];
            else
                Tmp_Array[ Tmp_Pos++ ] = A[ Right_Pos++ ];
        while( Left_Pos <= Left_End )     // Copy rest of first half.
            Tmp_Array[ Tmp_Pos++ ] = A[ Left_Pos++ ];
        while( Right_Pos <= Right_End )   // Copy rest of second half.
            Tmp_Array[ Tmp_Pos++ ] = A[ Right_Pos++ ];
        // The slide is cut off here. The copy-back step below is a
        // reconstruction of what the routine still has to do.
        for( unsigned int i = 1; i <= Num_Elements; i++, Right_End-- )
            A[ Right_End ] = Tmp_Array[ Right_End ];
    }

30
Homework
  • Sort the following data using a mergesort. Show
    the results pass by pass: 77 22 84 34 35 75 21
    46 88

31
Analysis using Telescoping
  • Basic Recurrence
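
The equations for this slide did not survive in the transcript. A standard form of the mergesort recurrence and its solution by telescoping is sketched below, assuming N is a power of 2 and that merging N elements costs exactly N.

```latex
\begin{align*}
T(1) &= 1, \qquad T(N) = 2\,T(N/2) + N \\[4pt]
\frac{T(N)}{N} &= \frac{T(N/2)}{N/2} + 1 \\
\frac{T(N/2)}{N/2} &= \frac{T(N/4)}{N/4} + 1 \\
&\;\;\vdots \\
\frac{T(2)}{2} &= \frac{T(1)}{1} + 1 \\[4pt]
\text{Adding all } \log_2 N \text{ equations:}\quad
\frac{T(N)}{N} &= \frac{T(1)}{1} + \log_2 N \\
T(N) &= N \log_2 N + N = O(N \log N)
\end{align*}
```

The sum telescopes because every intermediate term appears once on each side and cancels.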

34
Quicksort: the basic approach
  • The basic idea
  • Partition the data into two sets, those elements
    greater than a pivot and those less than the pivot
  • The key insight is that no data will go from one
    partition to the other after the partitioning is
    finished
  • This is a divide and conquer problem where the
    data in the partitions can be sorted independently

35
The algorithm
36
The Big Picture
  • Some critical issues
  • How is the pivot selected
  • How can the number of recursive steps be reduced
  • How to avoid worst case behavior

37
Picking the Pivot
  • Some choices are
  • Pick the leftmost or rightmost element
  • Pick the center element
  • Pick the median of the leftmost, center, and
    rightmost elements
  • We will use the median of three approach
  • It has better performance since, statistically,
    the median of a subset of elements (three in this
    case) is more likely to be near the median of all
    the data (the optimal choice)

38
Algorithm
  • We can perform the sort in place using the
    following algorithm
  • Determine the pivot (using the median of the
    first, last and center elements)
  • Interchange the pivot with the last element
  • Use 2 pointers, i pointing to the first element
    and j pointing to the last element before the
    pivot
  • Move i to the right until a large number (a number
    greater than the pivot) is encountered
  • Move j to the left until a small number (a number
    less than the pivot) is encountered
  • Swap the elements at i and j
  • Continue until j is left of i
  • Then swap the element at i with the pivot
  • Example: use 8 1 4 9 6 3 5 2 7 0

39
Partitioning
  • Assume the pivot, 6, has been placed in the
    rightmost position
  • These pictures show a complete partitioning of
    the data
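
The pictures for this slide are not in the transcript. The partitioning step can be sketched in code instead, assuming the pivot has already been moved to the rightmost position. The function name partition_last, the 0-based indexing, and the bounds checks on i and j are assumptions for this example; the slides' own routine appears later and differs in detail.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Illustrative partition around the pivot stored in a[right]
// (name and interface are assumptions). Requires left < right.
// Returns the final index of the pivot.
std::size_t partition_last(std::vector<int>& a,
                           std::size_t left, std::size_t right)
{
    const int pivot = a[right];
    std::size_t i = left;
    std::size_t j = right - 1;
    for (;;)
    {
        while (i < right && a[i] < pivot) ++i;     // stop at a large element
        while (j > left && a[j] > pivot) --j;      // stop at a small element
        if (i < j)
        {
            std::swap(a[i], a[j]);
            ++i;
            --j;
        }
        else
            break;                                 // pointers have met or crossed
    }
    std::swap(a[i], a[right]);                     // put the pivot in place
    return i;
}
```

Starting from 8 1 4 9 0 3 5 2 7 6 (pivot 6 in the last position), the swaps give 2 1 4 9 0 3 5 8 7 6, then 2 1 4 5 0 3 9 8 7 6, and restoring the pivot gives 2 1 4 5 0 3 6 8 7 9 with the pivot at index 6.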

40
Driver for Quicksort
    // Sorts A[ 1 ] .. A[ N ]. Q_Sort leaves small partitions
    // unsorted; the final insertion sort finishes them.
    template <class Etype>
    void
    Quick_Sort( Etype A[ ], const unsigned int N )
    {
        const unsigned int One = 1;
        Q_Sort( A, One, N );
        Insertion_Sort( A, N );
    }

    template <class Etype>
    inline void
    Swap( Etype & A, Etype & B )
    {
        Etype Tmp;
        Tmp = A;
        A = B;
        B = Tmp;
    }

41
Median of Three
    template <class Etype>
    Etype
    Median3( Etype A[ ],
             const unsigned int Left, const unsigned int Right )
    {
        unsigned int Center = ( Left + Right ) / 2;
        if( A[ Left ] > A[ Center ] )
            Swap( A[ Left ], A[ Center ] );
        if( A[ Left ] > A[ Right ] )
            Swap( A[ Left ], A[ Right ] );
        if( A[ Center ] > A[ Right ] )
            Swap( A[ Center ], A[ Right ] );
        // Invariant: A[ Left ] <= A[ Center ] <= A[ Right ].
        // Now hide and return pivot.
        Swap( A[ Center ], A[ Right - 1 ] );
        return A[ Right - 1 ];
    }

42
Quicksort Routine
    // Cutoff is a constant defined elsewhere: partitions smaller
    // than Cutoff are left for the final insertion sort.
    template <class Etype>
    void
    Q_Sort( Etype A[ ],
            const unsigned int Left, const unsigned int Right )
    {
        if( Left + Cutoff <= Right )
        {
            Etype Pivot = Median3( A, Left, Right );
            unsigned int i = Left, j = Right - 1;
            for( ; ; )
            {
                while( A[ ++i ] < Pivot );
                while( A[ --j ] > Pivot );
                if( i < j )
                    Swap( A[ i ], A[ j ] );
                else
                    break;
            }
            Swap( A[ i ], A[ Right - 1 ] );   // Restore pivot.
            // The slide is cut off here. The recursive calls below
            // are a reconstruction of the remaining steps.
            Q_Sort( A, Left, i - 1 );
            Q_Sort( A, i + 1, Right );
        }
    }

43
Homework
  • Perform a quicksort on the following data. Show
    the results pass by pass: 77 22 84 34 35 75 21
    46 88

44
Small Arrays
  • For small arrays (N < 20), quicksort does not work
    as well as the O(N^2) sorts.
  • Use quicksort until partition reaches 20
  • Then use one of the other sorts

45
Analysis of Quicksort: Worst Case
  • General recurrence
  • Worst case recurrence
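
The equations for this slide are missing from the transcript. A standard formulation is sketched below, assuming partitioning N elements costs cN for some constant c, and that i is the number of elements in the smaller partition.

```latex
\begin{align*}
\text{General:}\quad T(N) &= T(i) + T(N - i - 1) + cN, \qquad T(0) = T(1) = 1 \\[4pt]
\text{Worst case } (i = 0):\quad T(N) &= T(N - 1) + cN \\
T(N) &= T(1) + c \sum_{k=2}^{N} k = O(N^2)
\end{align*}
```

The worst case arises when the pivot is always the smallest (or largest) element, so each partitioning step removes only the pivot.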

46
Analysis of Quicksort Best Case
  • Best case recurrence
  • Average case recurrence is more complex but
    results in O( n log n )
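
The best case recurrence itself is not in the transcript. A standard form is sketched below, assuming the pivot always splits the data into two halves, N is a power of 2, and partitioning costs cN.

```latex
\begin{align*}
T(N) &= 2\,T(N/2) + cN, \qquad T(1) = 1 \\
\frac{T(N)}{N} &= \frac{T(N/2)}{N/2} + c \\
\frac{T(N)}{N} &= \frac{T(1)}{1} + c \log_2 N \\
T(N) &= cN \log_2 N + N = O(N \log N)
\end{align*}
```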

47
Quick Select - general
  • The goal is to find the kth smallest element
  • A simple approach is to sort the data and take the
    element at index k-1
  • The complexity would be O(n log n)
  • An approach that runs in linear time on average
    can be obtained without sorting all the data by
    adapting the quicksort routine into a new routine,
    quickselect
  • Apply the routine recursively only to the
    partition containing the desired final position
  • When the partition gets small enough, use an
    insertion sort to avoid the cost of recursion
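
For comparison, the C++ standard library offers std::nth_element, which performs this kind of selection without fully sorting the data. The sketch below is a usage example, not the slides' routine: the name kth_smallest is an assumption, k is taken to be 1-based with 1 <= k <= data.size(), and the library's internal algorithm is implementation-defined.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Illustrative selection using std::nth_element (the function name
// is an assumption). Works on a copy so the caller's data is untouched.
// Requires 1 <= k <= data.size().
int kth_smallest(std::vector<int> data, std::size_t k)
{
    auto nth = data.begin() + static_cast<std::ptrdiff_t>(k - 1);
    // After this call, *nth is the element that would sit at index k-1
    // in sorted order; smaller elements precede it, larger ones follow.
    std::nth_element(data.begin(), nth, data.end());
    return *nth;
}
```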

48
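Quickselect Code
    // Places the kth smallest element of A[ Left ] .. A[ Right ]
    // in A[ k ]. Cutoff is a constant defined elsewhere.
    template <class Etype>
    void
    Q_Select( Etype A[ ], const unsigned int k,
              const unsigned int Left, const unsigned int Right )
    {
        if( Left + Cutoff <= Right )
        {
            Etype Pivot = Median3( A, Left, Right );
            unsigned int i = Left, j = Right - 1;
            for( ; ; )
            {
                while( A[ ++i ] < Pivot );
                while( A[ --j ] > Pivot );
                if( i < j )
                    Swap( A[ i ], A[ j ] );
                else
                    break;
            }
            Swap( A[ i ], A[ Right - 1 ] );   // Restore pivot.
            // The slide is cut off here. The rest is a reconstruction:
            // recurse only into the side that contains position k.
            if( k < i )
                Q_Select( A, k, Left, i - 1 );
            else if( k > i )
                Q_Select( A, k, i + 1, Right );
        }
        else
            // Assumes Insertion_Sort sorts positions 1 .. N of its argument.
            Insertion_Sort( A + Left - 1, Right - Left + 1 );
    }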

49
A Lower Bound on Complexity
  • Sorts based on comparison
  • We will prove that no sorting routine based on
    comparisons can do better than Ω( n log n )
    comparisons in the worst case
  • We have studied three sorts of this complexity
    class
  • Quicksort has the best average time behavior
  • Heapsort has the best worst case behavior
  • Mergesort is well suited for data stored
    sequentially in external files
  • Other sorts
  • Radix sort can have linear time behavior, but the
    multiplying constant may be high
  • Radix sort only applies to certain types of data

50
A Decision Tree
  • Three elements can be ordered in 3! = 6 ways; this
    decision tree finds which ordering is correct
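
The counting argument behind the decision tree can be sketched as follows. This is the standard argument, written out here because the theorem slides have no transcript; d denotes the depth of the tree, i.e. the worst case number of comparisons.

```latex
\begin{align*}
\text{A binary tree of depth } d \text{ has at most } 2^{d} \text{ leaves, and sorting }
N \text{ elements needs } N! \text{ leaves, so} \\
2^{d} \ge N! \;\;\Longrightarrow\;\; d \ge \lceil \log_2 (N!) \rceil \\[4pt]
\log_2 (N!) = \sum_{k=1}^{N} \log_2 k \;\ge\; \sum_{k=N/2}^{N} \log_2 k
\;\ge\; \frac{N}{2} \log_2 \frac{N}{2} = \Omega(N \log N) \\[4pt]
\text{For } N = 3:\quad d \ge \lceil \log_2 6 \rceil = 3
\end{align*}
```

So any comparison-based sort of three elements needs three comparisons in the worst case.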

51
Some Theorems - 1
52
Some Theorems - 2