2IL05 Data Structures 2IL06 Introduction to Algorithms

Slides: 38
Provided by: bettinas


1
2IL05 Data Structures / 2IL06 Introduction to Algorithms
  • Spring 2009, Lecture 5: QuickSort and Selection

2
QuickSort
One more sorting algorithm
3
Sorting algorithms
  • Input: a sequence of n numbers a_1, a_2, ..., a_n
  • Output: a permutation of the input such that a_i1 ≤ a_i2 ≤ ... ≤ a_in
  • Important properties of sorting algorithms:
    • running time: how fast is the algorithm in the worst case?
    • in place: only a constant number of input elements are ever stored outside the input array

4
Sorting algorithms

Algorithm        Worst-case running time    In place
InsertionSort    Θ(n²)                      yes
MergeSort        Θ(n log n)                 no
HeapSort         Θ(n log n)                 yes
QuickSort        Θ(n²)                      yes
5
QuickSort
  • Why QuickSort?
    • Expected running time Θ(n log n) (randomized QuickSort)
    • Constants hidden in Θ(n log n) are small
    • Using linear-time median finding to guarantee a good pivot gives worst-case Θ(n log n)

6
QuickSort
  • QuickSort is a divide-and-conquer algorithm
  • To sort the subarray A[p..r]:
    • Divide: Partition A[p..r] into two subarrays A[p..q-1] and A[q+1..r], such that each element in A[p..q-1] is ≤ A[q] and A[q] is < each element in A[q+1..r].
    • Conquer: Sort the two subarrays by recursive calls to QuickSort.
    • Combine: No work is needed to combine the subarrays, since they are sorted in place.
  • Divide using a procedure Partition, which returns q.

7
QuickSort
  • QuickSort(A, p, r)
        if p < r
            then q ← Partition(A, p, r)
                 QuickSort(A, p, q-1)
                 QuickSort(A, q+1, r)
  • Partition(A, p, r)
        x ← A[r]
        i ← p-1
        for j ← p to r-1
            do if A[j] ≤ x
                then i ← i+1
                     exchange A[i] ↔ A[j]
        exchange A[i+1] ↔ A[r]
        return i+1
  • Initial call: QuickSort(A, 1, n)
  • Partition always selects A[r] as the pivot (the element around which to partition)
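The pseudocode above can be sketched in Python roughly as follows. This is only an illustration: it assumes 0-based indexing (so the initial call covers indices 0 to n-1 rather than 1 to n), and the function names are chosen here rather than taken from the slides.

```python
def partition(a, p, r):
    """Partition a[p..r] (inclusive) around the pivot a[r]; return the pivot's final index."""
    x = a[r]                          # pivot: always the last element
    i = p - 1                         # a[p..i] holds the elements <= pivot
    for j in range(p, r):             # j runs over p..r-1
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]   # move the pivot between the two regions
    return i + 1


def quicksort(a, p=0, r=None):
    """Sort a[p..r] in place by recursive partitioning."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = partition(a, p, r)
        quicksort(a, p, q - 1)
        quicksort(a, q + 1, r)


data = [2, 8, 7, 1, 3, 5, 6, 4]
quicksort(data)
print(data)  # [1, 2, 3, 4, 5, 6, 7, 8]
```

Because the pivot ends up at index q and is excluded from both recursive calls, each call works on a strictly smaller subarray.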

8
Partition
  • As Partition executes, the array is partitioned into four regions (some may be empty); the fourth region, A[j..r-1], has not been examined yet
  • Loop invariant:
    1. all entries in A[p..i] are ≤ pivot
    2. all entries in A[i+1..j-1] are > pivot
    3. A[r] = pivot

10
Partition - Correctness
  • Loop invariant:
    1. all entries in A[p..i] are ≤ pivot
    2. all entries in A[i+1..j-1] are > pivot
    3. A[r] = pivot
  • Initialization: before the loop starts, all conditions are satisfied, since A[r] is the pivot and the two subarrays A[p..i] and A[i+1..j-1] are empty.
  • Maintenance: while the loop is running, if A[j] ≤ pivot, then A[j] and A[i+1] are swapped and then i and j are incremented ⇒ 1. and 2. hold. If A[j] > pivot, then only j is incremented ⇒ 1. and 2. hold. Condition 3. holds throughout, since A[r] is not touched inside the loop.
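One way to make the invariant concrete is to assert it at the start of every iteration. The sketch below assumes 0-based indexing and a hypothetical name `partition_checked`; the assertions are for illustration only and make each iteration cost linear time.

```python
def partition_checked(a, p, r):
    """Partition a[p..r] around a[r], asserting the loop invariant before each iteration."""
    x = a[r]
    i = p - 1
    for j in range(p, r):
        assert all(v <= x for v in a[p:i + 1])   # 1. a[p..i] <= pivot
        assert all(v > x for v in a[i + 1:j])    # 2. a[i+1..j-1] > pivot
        assert a[r] == x                         # 3. a[r] = pivot
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    # After the loop every element of a[p..r-1] has been classified.
    assert all(v <= x for v in a[p:i + 1])
    assert all(v > x for v in a[i + 1:r])
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1


values = [5, 3, 9, 1, 7, 3, 4]
q = partition_checked(values, 0, len(values) - 1)
print(q, values)
```

If any assertion failed, the invariant would be violated; on this input none does.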

11
Partition - Correctness
  • Termination: when the loop terminates, j = r, so all elements in A[p..r] are partitioned into one of three cases:
    • A[p..i] ≤ pivot, A[i+1..r-1] > pivot, and A[r] = pivot
  • Lines 7 and 8 of Partition (the final exchange and the return) move the pivot between the two subarrays and return its position
  • Running time: Θ(n) for an n-element subarray
12
QuickSort running time
  • Running time depends on partitioning of subarrays:
    • if they are balanced, then QuickSort is as fast as MergeSort
    • if they are unbalanced, then QuickSort can be as slow as InsertionSort
  • Worst case:
    • subarrays completely unbalanced: 0 elements in one, n-1 in the other
    • T(n) = T(n-1) + T(0) + Θ(n) = T(n-1) + Θ(n) = Θ(n²)
    • occurs, for example, when the input is an already sorted array
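The worst case can be observed by counting comparisons. The following sketch assumes the last-element pivot rule from the slides and uses an explicit stack instead of recursion, purely so that a sorted input does not hit Python's recursion limit; the name `quicksort_comparisons` is made up for this example.

```python
def quicksort_comparisons(values):
    """Sort a copy of `values` with last-element pivots; return the number of comparisons."""
    a = list(values)
    count = 0
    stack = [(0, len(a) - 1)]         # subarrays still to be sorted
    while stack:
        p, r = stack.pop()
        if p >= r:
            continue
        x, i = a[r], p - 1
        for j in range(p, r):
            count += 1                # one comparison of a[j] with the pivot
            if a[j] <= x:
                i += 1
                a[i], a[j] = a[j], a[i]
        a[i + 1], a[r] = a[r], a[i + 1]
        stack.append((p, i))          # left part  a[p..q-1], with q = i + 1
        stack.append((i + 2, r))      # right part a[q+1..r]
    return count


n = 200
print(quicksort_comparisons(range(n)), n * (n - 1) // 2)  # both are 19900
```

On a sorted input every Partition call leaves the pivot in the last position, so the comparison counts are (n-1) + (n-2) + ... + 1 = n(n-1)/2.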

13
QuickSort running time
  • Best case:
    • subarrays completely balanced: each has at most n/2 elements
    • T(n) = 2T(n/2) + Θ(n) = Θ(n log n)
  • Average?

14
QuickSort running time
  • Average running time is much closer to best case than to worst case.
  • Intuition:
    • imagine that Partition always produces a 9-to-1 split
    • T(n) = T(9n/10) + T(n/10) + Θ(n)

15
T(n) = T(9n/10) + T(n/10) + Θ(n)
  • Remember Section 4.2 (or Lecture 2)
  • log_10 n full levels, log_{10/9} n non-empty levels
  • base of log does not matter in asymptotic notation (as long as it is constant)
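The two level counts can be checked numerically. This small sketch follows the shortest branch (always the n/10 side) and the longest branch (always the 9n/10 side) of the recursion tree; it rounds subproblem sizes down to integers, which is an assumption made here for simplicity, and `levels` is a hypothetical helper name.

```python
def levels(n, num, den):
    """Count how often n can be scaled by num/den (rounding down) before it reaches 1 or less."""
    depth = 0
    while n > 1:
        n = n * num // den
        depth += 1
    return depth


n = 10**6
shallow = levels(n, 1, 10)   # always follow the n/10 branch: about log_10 n levels
deep = levels(n, 9, 10)      # always follow the 9n/10 branch: about log_{10/9} n levels
print(shallow, deep)
```

Both depths grow logarithmically in n; they differ only by a constant factor, which is why the base of the logarithm disappears in the asymptotic bound.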

16
QuickSort running time
  • Average running time is much closer to best case than to worst case.
  • Intuition:
    • imagine that Partition always produces a 9-to-1 split
    • T(n) = T(9n/10) + T(n/10) + Θ(n) = Θ(n log n)
  • Any split of constant proportionality yields a recursion tree of depth Θ(log n)
  • But splits will not always be of constant proportionality; there will be a mix of good and bad splits

17
QuickSort running time
  • Average running time is much closer to best case
    than to worst case.
  • More intuition:
    • mixing good and bad splits does not affect the asymptotic running time
    • assume levels alternate between best-case and worst-case splits
    • extra levels add only to the hidden constant; in both cases O(n log n)

18
Randomized QuickSort
  • pick the pivot at random
  • RandomizedPartition(A, p, r)
        i ← Random(p, r)
        exchange A[r] ↔ A[i]
        return Partition(A, p, r)
  • random pivot results in a reasonably balanced split on average ⇒ expected running time Θ(n log n)
  • see book for detailed analysis
  • alternative: use linear-time median finding to find a good pivot ⇒ worst-case running time Θ(n log n); price to pay: added complexity
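A possible Python rendering of the randomized variant is shown below. It assumes 0-based indexing and uses `random.randint`, which returns an integer uniformly from the inclusive range, as a stand-in for Random(p, r); the function names are illustrative.

```python
import random


def partition(a, p, r):
    """Partition a[p..r] around the pivot a[r]; return the pivot's final index."""
    x, i = a[r], p - 1
    for j in range(p, r):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1


def randomized_partition(a, p, r):
    """Swap a uniformly random element of a[p..r] into position r, then partition."""
    i = random.randint(p, r)          # both endpoints are included
    a[r], a[i] = a[i], a[r]
    return partition(a, p, r)


def randomized_quicksort(a, p=0, r=None):
    """Sort a[p..r] in place using random pivots."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = randomized_partition(a, p, r)
        randomized_quicksort(a, p, q - 1)
        randomized_quicksort(a, q + 1, r)


data = list(range(500))               # sorted input: the worst case for the fixed pivot rule
randomized_quicksort(data)
print(data == list(range(500)))       # True
```

With a random pivot, no fixed input (such as an already sorted array) triggers the quadratic behaviour every time; the bad case now depends on the random choices rather than on the input.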

19
Selection
Medians and Order Statistics
20
Definitions
  • ith order statistic: ith smallest of a set of n elements
  • minimum: 1st order statistic
  • maximum: nth order statistic
  • median: "halfway point"
    • n odd ⇒ unique median at i = (n+1)/2
    • n even ⇒ lower median at i = n/2, upper median at i = n/2 + 1
    • here: median means lower median
21
The selection problem
  • Input: a set A of n distinct numbers and a number i, with 1 ≤ i ≤ n.
  • Output: The element x ∈ A that is larger than exactly i-1 other elements in A. (The ith smallest element of A.)
  • Easy solution:
    • sort the input in Θ(n log n) time
    • return the ith element in the sorted array
  • This can be done faster

start with minimum and maximum
22
Minimum and maximum
  • Find the minimum with n-1 comparisons: examine each element in turn and keep track of the smallest one
  • Is this the best we can do? Yes:
    • Each element (except the minimum) must be compared to a smaller element at least once
  • Minimum(A, n)
        min ← A[1]
        for i ← 2 to n
            do if min > A[i]
                then min ← A[i]
        return min
  • Find the maximum by replacing > with <
23
Simultaneous minimum and maximum
  • Assume we need to find both the minimum and the maximum
  • Easy solution: find both separately
    • ⇒ 2n-2 comparisons ⇒ Θ(n) time
  • But only 3⌊n/2⌋ comparisons are needed:
    • maintain the minimum and maximum seen so far
    • don't compare elements to the minimum and maximum separately; process them in pairs
    • compare the elements of each pair to each other, then compare the larger to the maximum and the smaller to the minimum
    • ⇒ 3 comparisons for every 2 elements
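The pairing idea can be sketched as follows. The treatment of the first one or two elements (no comparison for odd n, one comparison for even n) is an assumption of this sketch, chosen so that the count stays within 3⌊n/2⌋; the function name and the returned comparison counter are for illustration only.

```python
def min_max(values):
    """Return (minimum, maximum, number of element comparisons) for a non-empty list."""
    n = len(values)
    comparisons = 0
    if n % 2 == 1:
        lo = hi = values[0]           # odd n: first element is both, no comparison needed
        start = 1
    else:
        comparisons += 1              # even n: one comparison orders the first pair
        if values[0] < values[1]:
            lo, hi = values[0], values[1]
        else:
            lo, hi = values[1], values[0]
        start = 2
    for k in range(start, n, 2):
        x, y = values[k], values[k + 1]
        comparisons += 3              # pair against each other, then one each vs. lo and hi
        small, large = (x, y) if x < y else (y, x)
        if small < lo:
            lo = small
        if large > hi:
            hi = large
    return lo, hi, comparisons


print(min_max([5, 1, 9, 3, 7, 2, 8]))  # (1, 9, 9): 9 = 3 * floor(7 / 2) comparisons
```

Finding both values separately on the same 7 elements would take 2n-2 = 12 comparisons.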

24
The selection problem
  • Theorem: The ith smallest element of A can be found in O(n) time in the worst case.
  • Idea:
    • partition the input array, recurse on one side of the split
    • guarantee a good split
    • use Partition with a designated pivot element
25
Selection in worst-case linear time
[Figure: example array 20 3 10 8 14 6 12 9 11 18 7 4 5 17 15 1 2 13, with i = 12]
  • Divide the n elements into groups of 5 ⇒ ⌈n/5⌉ groups

26
Selection in worst-case linear time
  1. Divide the n elements into groups of 5 ⇒ ⌈n/5⌉ groups
  2. Find the median of each of the ⌈n/5⌉ groups (sort each group of 5 elements in constant time and simply pick the median)
  3. Find the median x of the ⌈n/5⌉ medians recursively
  4. Partition the array around x

27
Selection in worst-case linear time

⇒ x is the kth element after partitioning
28
Selection in worst-case linear time
  5. Partitioning places x at position k, so x is the kth smallest element. If i = k, return x. If i < k, recursively find the ith smallest element on the low side. If i > k, recursively find the (i-k)th smallest element on the high side.
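The five steps can be put together in a short sketch. It is not in place: for readability it builds new lists for the low and high side instead of calling Partition on the array. It also assumes distinct input values and switches to plain sorting for 5 or fewer elements, whereas the analysis in the slides uses a larger constant cutoff; the name `select` is chosen here.

```python
def select(values, i):
    """Return the ith smallest (1-based) element of a list of distinct numbers."""
    if len(values) <= 5:
        return sorted(values)[i - 1]
    # Steps 1 and 2: split into groups of 5 and take the lower median of each group.
    groups = [sorted(values[k:k + 5]) for k in range(0, len(values), 5)]
    medians = [g[(len(g) - 1) // 2] for g in groups]
    # Step 3: find the median x of the medians recursively.
    x = select(medians, (len(medians) + 1) // 2)
    # Step 4: partition around x.
    low = [v for v in values if v < x]
    high = [v for v in values if v > x]
    k = len(low) + 1                  # x is the kth smallest element
    # Step 5: return x or recurse on one side only.
    if i == k:
        return x
    if i < k:
        return select(low, i)
    return select(high, i - k)


example = [20, 3, 10, 8, 14, 6, 12, 9, 11, 18, 7, 4, 5, 17, 15, 1, 2, 13]
print(select(example, 12))  # 12
```

Since x itself is excluded from both sides, each recursive call receives strictly fewer elements, so the recursion terminates.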

31
Analysis
  • How many elements are larger than x?
  • Half of the medians found in step 2 are ≥ x
  • The groups of these medians contain 3 elements each which are > x (discounting x's group and the last group)
  • ⇒ at least 3(⌈½⌈n/5⌉⌉ - 2) ≥ 3n/10 - 6 elements are > x

32
Analysis
  • Symmetrically, at least 3n/10 - 6 elements are < x
  • ⇒ the algorithm recurses on at most 7n/10 + 6 elements

33
Analysis
  1. Divide the n elements into groups of 5 ⇒ ⌈n/5⌉ groups: O(n)
  2. Find the median of each of the ⌈n/5⌉ groups (sort each group of 5 elements in constant time and simply pick the median): O(n)
  3. Find the median x of the ⌈n/5⌉ medians recursively: T(⌈n/5⌉)
  4. Partition the array around x: O(n)
  5. If i = k, return x. If i < k, recursively find the ith smallest element on the low side. If i > k, recursively find the (i-k)th smallest element on the high side: T(7n/10 + 6)
  • T(n) = O(1) for small n (< 140)
  • ⇒ T(n) ≤ T(⌈n/5⌉) + T(7n/10 + 6) + O(n) for n ≥ 140

34
Solving the recurrence
  • Solve by substitution
  • Inductive hypothesis: T(n) ≤ cn for some constant c and all n > 0
  • assume that c is large enough such that T(n) ≤ cn for all n < 140
  • pick constant a such that the O(n) term is ≤ an for all n > 0
  • T(n) ≤ c⌈n/5⌉ + c(7n/10 + 6) + an
         ≤ cn/5 + c + 7cn/10 + 6c + an
         = 9cn/10 + 7c + an
         = cn + (-cn/10 + 7c + an)
  • remains to show: -cn/10 + 7c + an ≤ 0

35
Solving the recurrence
  • remains to show: -cn/10 + 7c + an ≤ 0
  • -cn/10 + 7c + an ≤ 0
    ⇔ cn/10 - 7c ≥ an
    ⇔ cn - 70c ≥ 10an
    ⇔ c(n - 70) ≥ 10an
    ⇔ c ≥ 10a(n/(n-70))   (for n > 70)
  • n ≥ 140 ⇒ n/(n-70) ≤ 2
  • ⇒ 20a ≥ 10a(n/(n-70))
  • choose c ≥ 20a ⇒ T(n) = O(n)
  • Why 140? Any integer > 70 would have worked
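The bound can also be checked numerically. The sketch below evaluates the recurrence with equality, using a hypothetical constant a = 1 for the linear term, T(n) = 1 as a stand-in for the O(1) base case, and ⌊7n/10⌋ + 6 for the larger subproblem; these concrete values are assumptions made only for this check.

```python
from functools import lru_cache

A = 1          # hypothetical constant of the O(n) term
C = 20 * A     # the choice c >= 20a from the derivation


@lru_cache(maxsize=None)
def T(n):
    """T(n) = T(ceil(n/5)) + T(floor(7n/10) + 6) + A*n for n >= 140, and 1 otherwise."""
    if n < 140:
        return 1
    return T(-(-n // 5)) + T(7 * n // 10 + 6) + A * n


print(all(T(n) <= C * n for n in range(1, 20001)))  # True
```

The check confirms T(n) ≤ 20an on the tested range, in line with the substitution argument; it is an illustration, not a replacement for the proof.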
36
Selection
  • Theorem: The ith smallest element of A can be found in O(n) time in the worst case.
  • Does not require any assumptions on the input
  • Is not in conflict with the Ω(n log n) lower bound for sorting, since it does not use sorting
  • Randomized Selection: pick a pivot at random
  • Theorem: The ith smallest element of A can be found in O(n) expected time.
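Randomized selection can be sketched along the same lines as randomized QuickSort, except that only one side of the split is processed further. The version below is iterative, works on a copy of the input, assumes 0-based array indices with a 1-based rank i, and uses names chosen for this example.

```python
import random


def randomized_select(values, i):
    """Return the ith smallest (1-based) element of a non-empty list, using random pivots."""
    a = list(values)
    p, r = 0, len(a) - 1
    while p < r:
        s = random.randint(p, r)      # random pivot position, endpoints included
        a[r], a[s] = a[s], a[r]
        x, q = a[r], p - 1
        for j in range(p, r):         # partition a[p..r] around x
            if a[j] <= x:
                q += 1
                a[q], a[j] = a[j], a[q]
        q += 1
        a[q], a[r] = a[r], a[q]
        k = q - p + 1                 # rank of the pivot within a[p..r]
        if i == k:
            return a[q]
        if i < k:
            r = q - 1                 # continue on the low side
        else:
            p, i = q + 1, i - k       # continue on the high side with adjusted rank
    return a[p]


example = [20, 3, 10, 8, 14, 6, 12, 9, 11, 18, 7, 4, 5, 17, 15, 1, 2, 13]
print(randomized_select(example, 12))  # 12
```

Unlike the median-of-medians approach, the O(n) bound here holds only in expectation; an unlucky sequence of pivots can still take quadratic time.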

37
Tutorials this week
  • Small tutorials on Tuesday 3+4.
  • No Wednesday 7+8 big tutorial.
  • Small tutorial Friday 7+8.