DCO20105 Data structures and algorithms

1
DCO20105 Data structures and algorithms
  • Lecture 7: Big-O analysis and Sorting Algorithms
  • Big-O analysis of different ways of
    multiplication
  • Sorting algorithms: selection sort, bubble sort,
    insertion sort, radix sort, partition sort, merge
    sort
  • Comparison of different sorting algorithms
  • By Rossella Lau

2
Performance re-visit
  • For multiplication, we can use (at least) three
    different methods:
    • The traditional method taught in primary school
    • bFunction(m, n) in slide 9 of Lecture 6
    • funny(a, b) in slide 10 of Lecture 6

3
Performance analysis
4
Execution time vs memory
  • Traditional multiplication uses the fewest
    operations but requires the most memory, at
    O(log₁₀ n)
  • bFunction() requires no additional memory but
    spends a terrible amount of time, O(n), getting
    the result
  • funny() requires no additional memory and uses
    slightly more operations, at O(log₂ n)
  • The traditional way may use fewer operations, but
    it is hard to say whether it really outperforms
    funny(), since a memory load may not be faster
    than a shift operation

5
Ordering of data
  • To search for a record efficiently, records are
    stored in the order of their key values
  • A key is a field, or a combination of fields, of
    a record that uniquely identifies the record in
    a file
  • Usually, only the key values are kept in memory;
    the corresponding record is loaded into memory
    only when necessary
  • The key values are therefore usually sorted in a
    particular order to allow efficient searching

6
Classification of sorting methods
  • Comparison-based methods
    • Insertion sorts
    • Selection sorts
      • Heapsort (tree sorting), in a future lesson
    • Exchange sorts
      • Bubble sort
      • Quick sort
    • Merge sorts
  • Distribution methods
    • Radix sorting

7
Selection sort
  • Selection: choose the smallest element from a
    list and place it in the first position.
  • The process runs from the first element to the
    second-to-last element of the list; for each
    element, selection is applied to the sub-list
    starting at the element being processed.
  • See the Ford textbook slides 2-9 in Chapter 3
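
The selection process above can be sketched in C++ as follows. This is a minimal illustration, not the Ford textbook code; the function name selectionSort and the use of std::vector<int> are assumptions made for the example.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Selection sort sketch (hypothetical helper, not from the slides).
// For each position i from the first to the second-to-last element,
// find the smallest element in the sub-list [i, n) and swap it into i.
void selectionSort(std::vector<int>& v) {
    const std::size_t n = v.size();
    for (std::size_t i = 0; i + 1 < n; ++i) {
        std::size_t smallest = i;
        for (std::size_t j = i + 1; j < n; ++j) {
            if (v[j] < v[smallest]) {
                smallest = j;
            }
        }
        if (smallest != i) {
            std::swap(v[i], v[smallest]);
        }
    }
}
```

Each element is moved at most once per outer iteration, so the sketch performs at most n-1 exchanges even though it always makes O(n²) comparisons.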

8
Bubble sort
  • Pass through the array n-1 times, where n is the
    number of data items in the array
  • For each pass:
    • compare each element in the array with its
      successor
    • interchange the two elements if they are not in
      order
  • The algorithm
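
A minimal C++ sketch of the passes described on this slide, assuming the data is held in a std::vector<int>; the function name bubbleSort is chosen for illustration and is not taken from the original slide.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Basic bubble sort sketch (hypothetical helper).
// Makes n-1 passes; every pass compares each element with its
// successor and interchanges the two if they are out of order.
void bubbleSort(std::vector<int>& x) {
    const std::size_t n = x.size();
    for (std::size_t pass = 1; pass < n; ++pass) {
        for (std::size_t j = 0; j + 1 < n; ++j) {
            if (x[j] > x[j + 1]) {
                std::swap(x[j], x[j + 1]);
            }
        }
    }
}
```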

9
An example trace of bubble sort
Given data sequence: 25 57 48 37 12 92 86 33
The first pass (the sequence after each comparison):
  25 57 48 37 12 92 86 33    start of the pass
  25 57 48 37 12 92 86 33    25, 57: in order
  25 48 57 37 12 92 86 33    57, 48: exchanged
  25 48 37 57 12 92 86 33    57, 37: exchanged
  25 48 37 12 57 92 86 33    57, 12: exchanged
  25 48 37 12 57 92 86 33    57, 92: in order
  25 48 37 12 57 86 92 33    92, 86: exchanged
  25 48 37 12 57 86 33 92    92, 33: exchanged
Subsequent passes:
  Pass 2: 25 37 12 48 57 33 86 92
  Pass 3: 25 12 37 48 33 57 86 92
  Pass 4: 12 25 37 33 48 57 86 92
  Pass 5: 12 25 33 37 48 57 86 92
  Pass 6: 12 25 33 37 48 57 86 92
  Pass 7: 12 25 33 37 48 57 86 92
10
Improvement can be made
  • After pass i, the last i elements are in their
    proper positions: the first pass places the
    largest element at the end of the array, the
    second pass places the second largest element
    before the last element, and so on ⇒ at pass i,
    only x[0] to x[n-i-1] need to be compared with
    their successors
  • In the example trace, the array is already sorted
    after the fifth pass, so the sixth and seventh
    passes are redundant
  • Therefore, once a pass requires no exchange, the
    array is already sorted and the subsequent passes
    are redundant

11
The improved algorithm for bubble sort
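
A hedged C++ sketch of the two improvements from the previous slide: shrinking the compared range on each pass and stopping once a pass makes no exchange. The function name bubbleSortImproved and the returned pass count are illustrative assumptions, not the original slide's code.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Improved bubble sort sketch (hypothetical helper).
// Pass i compares only x[0] .. x[n-i-1] with their successors, and the
// loop stops as soon as a pass completes without any exchange.
// Returns the number of passes actually performed.
std::size_t bubbleSortImproved(std::vector<int>& x) {
    const std::size_t n = x.size();
    std::size_t passes = 0;
    bool exchanged = true;
    for (std::size_t pass = 1; pass < n && exchanged; ++pass) {
        exchanged = false;
        ++passes;
        for (std::size_t j = 0; j + pass < n; ++j) {
            if (x[j] > x[j + 1]) {
                std::swap(x[j], x[j + 1]);
                exchanged = true;
            }
        }
    }
    return passes;
}
```

Note that one extra pass without exchanges is still needed to detect that the array is sorted, which is part of the overhead of this version.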
12
Performance considerations of bubble sort
  • The first version requires (n-1) comparisons in
    each of (n-1) passes ⇒ the total number of
    comparisons is (n-1)² = n² - 2n + 1, i.e., O(n²)
  • The improved version requires (n-1) + (n-2) +
    ... + (n-k) comparisons for k (< n) passes ⇒ the
    total number of comparisons is (2kn - k² - k)/2.
    However, the average k is O(n), yielding an
    overall complexity of O(n²), and the overhead
    introduced (setting and checking the exchange
    flag) should also be considered
  • It requires only a little additional space

13
Insertion sort
  • Insert each item, one by one, into the portion of
    the sequence that has already been sorted.
  • It is similar to repeatedly picking up playing
    cards and inserting them into the proper position
    in a partial hand of cards

14
An example trace of insertion sort
Given data sequence: 25 57 48 37 12 92 86 33
The sequence after each insertion (the sorted part is to the left of the bar):
  25 | 57 48 37 12 92 86 33
  25 57 | 48 37 12 92 86 33
  25 48 57 | 37 12 92 86 33
  25 37 48 57 | 12 92 86 33
  12 25 37 48 57 | 92 86 33
  12 25 37 48 57 92 | 86 33
  12 25 37 48 57 86 92 | 33
  12 25 33 37 48 57 86 92
15
The algorithm of insertion sort
  • Checking i > 0 on every step is time consuming.
    Setting a sentinel at the beginning of the array
    prevents y from going beyond the array
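
A minimal C++ sketch of insertion sort, written with the explicit i > 0 check that the slide mentions rather than a sentinel. The function name insertionSort is an assumption; y follows the slide's name for the item being inserted.

```cpp
#include <cstddef>
#include <vector>

// Insertion sort sketch (hypothetical helper, no sentinel).
// y is the item being inserted; larger items in the sorted prefix are
// shifted one place right until the proper position for y is found.
void insertionSort(std::vector<int>& x) {
    for (std::size_t k = 1; k < x.size(); ++k) {
        const int y = x[k];
        std::size_t i = k;
        // The i > 0 test keeps the scan from running past the array start.
        while (i > 0 && x[i - 1] > y) {
            x[i] = x[i - 1];
            --i;
        }
        x[i] = y;
    }
}
```

With a sentinel (a value no larger than any key stored before the first element), the i > 0 test could be dropped because the comparison alone would stop the scan.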

16
Performance analysis of insertion sort
  • If the original sequence is already in order,
    only one comparison is made on each pass ⇒ O(n)
  • If the original sequence is in reverse order,
    each pass requires up to n comparisons ⇒ O(n²)
  • The complexity therefore ranges from O(n) to
    O(n²)
  • It requires little additional space

17
Quick sort
  • It is also called partition exchange sort
  • In each step, the original sequence is
    partitioned into 3 parts:
    a. all the items less than the partitioning
       element
    b. the partitioning element in its final
       position
    c. all the items greater than the partitioning
       element
  • The partitioning process continues in the left
    and right partitions

18
The partitioning in each step of quicksort
  • Pick one of the elements as the partitioning
    element, p; usually the first element of the
    sequence
  • Find the proper position for p while
    partitioning the sequence into 3 parts:
    a) employ two indexes, down and up
    b) down goes from left to right to find an
       element greater than p
    c) up goes from right to left to find an element
       less than p
    d) the elements found by down and up are
       exchanged
    e) repeat until up and down meet or pass each
       other
    f) up then points to the proper position of p
    g) exchange p with the element pointed to by up
19
An example trace of quicksort
First partition, p = 25 (positions of down and up follow the partitioning steps):
  25 57 48 37 12 92 86 33
  25 57 48 37 12 92 86 33    down at 25, up at 33
  25 57 48 37 12 92 86 33    down at 57, up at 33
  25 57 48 37 12 92 86 33    down at 57, up at 86
  25 57 48 37 12 92 86 33    down at 57, up at 92
  25 57 48 37 12 92 86 33    down at 57, up at 12
  25 12 48 37 57 92 86 33    57 and 12 exchanged
  25 12 48 37 57 92 86 33    down at 48, up at 57
  25 12 48 37 57 92 86 33    down at 48, up at 37
  25 12 48 37 57 92 86 33    down at 48, up at 48
  25 12 48 37 57 92 86 33    down at 48, up at 12
  (12) 25 (48 37 57 92 86 33)
Subsequent processes:
  12 25 (48 37 57 92 86 33)
  12 25 (48 37 33 92 86 57)
  12 25 (33 37) 48 (92 86 57)
  12 25 () 33 (37) 48 (57 86) 92 ()
  12 25 33 37 48 (57 86) 92
  12 25 33 37 48 () 57 (86) 92
  12 25 33 37 48 57 86 92
20
The algorithm for quicksort
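
A hedged C++ sketch of the partitioning steps with the down and up indexes, followed by the recursive sort. The names partitionRange and quickSort, the int index types, and the inclusive bounds lb and ub are assumptions for illustration; the first element is used as p, as the slides suggest.

```cpp
#include <utility>
#include <vector>

// Partition x[lb..ub] around p = x[lb] (hypothetical helper).
// Returns the final position of p.
int partitionRange(std::vector<int>& x, int lb, int ub) {
    const int p = x[lb];
    int down = lb;
    int up = ub;
    while (down < up) {
        // down moves right until it finds an element greater than p.
        while (down < ub && x[down] <= p) {
            ++down;
        }
        // up moves left until it finds an element not greater than p.
        while (x[up] > p) {
            --up;
        }
        if (down < up) {
            std::swap(x[down], x[up]);
        }
    }
    // up now marks the proper position of p.
    std::swap(x[lb], x[up]);
    return up;
}

// Sort x[lb..ub] by partitioning, then sorting both partitions.
void quickSort(std::vector<int>& x, int lb, int ub) {
    if (lb >= ub) {
        return;
    }
    const int j = partitionRange(x, lb, ub);
    quickSort(x, lb, j - 1);
    quickSort(x, j + 1, ub);
}
```

For a non-empty vector x, the whole sequence would be sorted with quickSort(x, 0, static_cast<int>(x.size()) - 1).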
21
Performance considerations of quicksort
  • Quicksort got its name because it quickly puts an
    element into its proper position, employing two
    indexes to speed up the partitioning process and
    to minimize exchanges
  • Each pass roughly halves the size of the
    partitions ⇒ the total number of comparisons is
    about O(n log₂ n)
  • It requires space for the recursive process, or
    a stack for an iterative process, of about
    O(log₂ n)

22
Merge
  • Merge means to combine two or more sorted
    sequences into another sorted sequence
  • The merging of two sequences, for example, is as
    follows

First sequence:  32 45 78 90 92
Second sequence: 25 30 52 88 98
The merged sequence after each step:
   1. 25
   2. 25 30
   3. 25 30 32
   4. 25 30 32 45
   5. 25 30 32 45 52
   6. 25 30 32 45 52 78
   7. 25 30 32 45 52 78 88
   8. 25 30 32 45 52 78 88 90
   9. 25 30 32 45 52 78 88 90 92
  10. 25 30 32 45 52 78 88 90 92 98
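
A minimal C++ sketch of merging two sorted sequences into a third, matching the trace above. The function name mergeSorted and the use of std::vector<int> are assumptions made for the example.

```cpp
#include <cstddef>
#include <vector>

// Merge two sorted sequences into a new sorted sequence
// (hypothetical helper). On each step the smaller of the two
// front elements is appended to the result.
std::vector<int> mergeSorted(const std::vector<int>& a,
                             const std::vector<int>& b) {
    std::vector<int> c;
    c.reserve(a.size() + b.size());
    std::size_t i = 0;
    std::size_t j = 0;
    while (i < a.size() && j < b.size()) {
        if (a[i] <= b[j]) {
            c.push_back(a[i++]);
        } else {
            c.push_back(b[j++]);
        }
    }
    // One input is exhausted; copy what remains of the other.
    while (i < a.size()) {
        c.push_back(a[i++]);
    }
    while (j < b.size()) {
        c.push_back(b[j++]);
    }
    return c;
}
```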
23
Merge sort
  • It employs the merging technique in the following
    way:
    1. Divide the sequence into n parts of one
       element each
    2. Merge adjacent parts, yielding n/2 parts
    3. Merge adjacent parts again, yielding n/4
       parts
    ......
  • The process goes on until the sequence becomes 1
    part
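
The steps above can be sketched as a bottom-up merge sort in C++. For brevity this sketch delegates the merging of adjacent parts to the standard std::inplace_merge, which is an assumption of the example and not necessarily how the course code merges; the function name mergeSort is also illustrative.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Bottom-up merge sort sketch (hypothetical helper).
// width is the current part size: 1, 2, 4, ... Each pass merges
// adjacent parts, halving the number of parts, until one part remains.
void mergeSort(std::vector<int>& x) {
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(x.size());
    for (std::ptrdiff_t width = 1; width < n; width *= 2) {
        for (std::ptrdiff_t lo = 0; lo + width < n; lo += 2 * width) {
            const std::ptrdiff_t mid = lo + width;
            const std::ptrdiff_t hi = std::min(lo + 2 * width, n);
            // Merge the sorted parts [lo, mid) and [mid, hi).
            std::inplace_merge(x.begin() + lo, x.begin() + mid,
                               x.begin() + hi);
        }
    }
}
```

The outer loop runs about log₂ n times, once per doubling of the part size.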

24
An example of merge sort
  8 parts: [25] [57] [48] [37] [12] [92] [86] [33]
  merge:   [25 57] [37 48] [12 92] [33 86]
  4 parts: [25 57] [37 48] [12 92] [33 86]
  merge:   [25 37 48 57] [12 33 86 92]
  2 parts: [25 37 48 57] [12 33 86 92]
  merge:   [12 25 33 37 48 57 86 92]
25
Performance considerations of merge sort
  • There are only log₂ n passes, yielding a
    complexity of O(n log n)
  • It never requires more than n log₂ n comparisons,
    while quicksort may require O(n²) in the worst
    case
  • However, it requires about twice as many
    assignment statements as quicksort
  • It also requires more additional space, about
    O(n), than quicksort's O(log₂ n)

26
Radix Sort
  • It is based on the values of the actual digits in
    the decimal positional representation of each
    key
  • Starting from the least significant digit and
    moving to the most significant digit:
    • define 10 vectors, one for each digit value,
      numbered v0 to v9 for digits 0 to 9
      respectively
    • scan the data sequence once and add each x[i]
      to the vector matching its current digit
    • form the new data sequence by removing elements
      from each vector, from the beginning, one by
      one until it is empty, going from v0 to v9
  • After the pass on the most significant digit,
    the new data sequence is the sorted sequence!
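
A hedged C++ sketch of the passes described above for non-negative decimal keys. The function name radixSort, the unsigned key type, and the use of the largest key to decide the number of passes are assumptions made for the example.

```cpp
#include <algorithm>
#include <array>
#include <vector>

// Least-significant-digit radix sort sketch (hypothetical helper)
// for non-negative integers, using ten vectors v[0] .. v[9].
void radixSort(std::vector<unsigned>& x) {
    if (x.empty()) {
        return;
    }
    const unsigned maxKey = *std::max_element(x.begin(), x.end());
    // place is 1, 10, 100, ...: one pass per decimal digit of maxKey.
    for (unsigned long long place = 1; maxKey / place > 0; place *= 10) {
        std::array<std::vector<unsigned>, 10> v;
        // Scan the sequence once, adding each key to its digit's vector.
        for (unsigned key : x) {
            v[(key / place) % 10].push_back(key);
        }
        // Rebuild the sequence by emptying v[0] .. v[9] in order.
        x.clear();
        for (const auto& bucket : v) {
            x.insert(x.end(), bucket.begin(), bucket.end());
        }
    }
}
```

Because each vector preserves arrival order, keys with equal digits keep the order established by the earlier passes, which is what makes the final sequence sorted.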

27
An example of radix sort
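Given data sequence: 25 57 48 37 12 92 86 33
Pass 1, by the ones digit:
  v2: 12 92
  v3: 33
  v5: 25
  v6: 86
  v7: 57 37
  v8: 48
  New sequence: 12 92 33 25 86 57 37 48
Pass 2, by the tens digit:
  v1: 12
  v2: 25
  v3: 33 37
  v4: 48
  v5: 57
  v8: 86
  v9: 92
  Sorted sequence: 12 25 33 37 48 57 86 92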
28
Performance considerations of radix sort
  • It does not require any comparison between data
    items
  • It requires as many passes as the number of
    digits, log₁₀ m
  • ⇒ O(n log₁₀ m) ⇒ O(n), treating log₁₀ m as a
    constant
  • It requires 10 times the memory used for the
    numbers
  • Radix sort seems to have the best performance;
    however, it is not popularly used because:
    • It consumes a terrible amount of memory
    • log₁₀ m depends on the number of digits
      (length) of a key and may not be treated as a
      small constant when the key is long

29
The real life sort for vector based data
  • Although quick sort is known to be the fastest in
    many cases, a library will not usually use quick
    sort directly as its sort method
  • Usually, a carefully designed library implements
    its sort method with a combination of quick sort
    and insertion sort
  • Quick sort divides partitions until a partition
    is about 8 to 16 elements in size; insertion
    sort is then applied to the partition, since
    such partitions are usually nearly sorted
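
A hedged C++ sketch of such a combination: partitions are split as in quick sort until they reach a cutoff size, then finished with insertion sort. The names hybridSort, insertionSortRange and kCutoff, and the cutoff value of 16 (taken from the upper end of the 8 to 16 range mentioned above), are assumptions; real libraries differ in their details.

```cpp
#include <utility>
#include <vector>

const int kCutoff = 16;  // assumed threshold, within the 8-16 range

// Insertion sort restricted to x[lb..ub] (hypothetical helper).
void insertionSortRange(std::vector<int>& x, int lb, int ub) {
    for (int k = lb + 1; k <= ub; ++k) {
        const int y = x[k];
        int i = k;
        while (i > lb && x[i - 1] > y) {
            x[i] = x[i - 1];
            --i;
        }
        x[i] = y;
    }
}

// Quick sort that hands small partitions to insertion sort.
void hybridSort(std::vector<int>& x, int lb, int ub) {
    if (ub - lb + 1 <= kCutoff) {
        insertionSortRange(x, lb, ub);
        return;
    }
    // Partition around the first element using down and up indexes.
    const int p = x[lb];
    int down = lb;
    int up = ub;
    while (down < up) {
        while (down < ub && x[down] <= p) {
            ++down;
        }
        while (x[up] > p) {
            --up;
        }
        if (down < up) {
            std::swap(x[down], x[up]);
        }
    }
    std::swap(x[lb], x[up]);
    hybridSort(x, lb, up - 1);
    hybridSort(x, up + 1, ub);
}
```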

30
The real life sort for non vector data
  • Quick sort requires a container with random
    access
  • A container such as a linked list does not
    support random access, so quick sort cannot be
    applied to it
  • Merge sort is preferred for such containers
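
A minimal C++ sketch of merge sort on a linked list, using only sequential access. It relies on the standard std::list members splice and merge; the function name mergeSortList is an assumption made for the example.

```cpp
#include <cstddef>
#include <iterator>
#include <list>

// Merge sort sketch for a linked list (hypothetical helper).
// The list is split in half by relinking nodes, each half is sorted
// recursively, and the two sorted halves are merged back together.
void mergeSortList(std::list<int>& lst) {
    if (lst.size() < 2) {
        return;
    }
    std::list<int> right;
    auto mid = std::next(lst.begin(),
                         static_cast<std::ptrdiff_t>(lst.size() / 2));
    // Move the second half into 'right' without copying elements.
    right.splice(right.begin(), lst, mid, lst.end());
    mergeSortList(lst);
    mergeSortList(right);
    // std::list::merge combines two sorted lists; 'right' becomes empty.
    lst.merge(right);
}
```

In practice std::list also provides its own sort() member function, so a hand-written version like this is mainly of teaching value.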

31
Sample timing of sort methods
  • Ford's prg15_2.cpp, d_sort.h
  • Timing for some sample runs: timeSort.out

32
Summary
  • Bubble sort and insertion sort have O(n²)
    complexity, but insertion sort is still
    preferred for short data sequences
  • Partition sort and merge sort have lower
    complexity, at O(n log n)
  • Radix sort appears to have O(n) complexity, but
    it consumes more memory and may depend on the key
    length
  • Often, the trade-off is space

33
Reference
  • Ford: 3.1, 4.4, 8.3, 15.1
  • Data Structures Using C and C++ by Yedidyah
    Langsam, Moshe J. Augenstein, and Aaron M.
    Tenenbaum: Chapter 6
  • Example programs: Ford prg15_2.cpp, d_sort.h
  • -- END --