Design and Analysis of Computer Algorithm, Lecture 5-3 (transcript)
1
Design and Analysis of Computer Algorithm
Lecture 5-3
  • Pradondet Nilagupta
  • Department of Computer Engineering

This lecture note has been modified from the lecture
note for 23250 by Prof. Francis Chin and CS332 by
David Luebke
2
Greedy Method (Cont.)
3
Disjoint-Set Union Problem
  • Want a data structure to support disjoint sets
  • Collection of disjoint sets S = {Si}, with Si ∩ Sj = ∅ for i ≠ j
  • Need to support the following operations
  • MakeSet(x): S = S ∪ {{x}}
  • Union(Si, Sj): S = S − {Si, Sj} ∪ {Si ∪ Sj}
  • FindSet(x): return Si ∈ S such that x ∈ Si
  • Before discussing implementation details, we look
    at example application MSTs

4
Kruskal's Algorithm
  • Kruskal()
  • T = ∅
  • for each v ∈ V
  • MakeSet(v)
  • sort E by increasing edge weight w
  • for each (u,v) ∈ E (in sorted order)
  • if FindSet(u) ≠ FindSet(v)
  • T = T ∪ {(u,v)}
  • Union(FindSet(u), FindSet(v))
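The pseudocode above can be sketched as runnable Python. This is an illustration, not the lecture's code: the graph and function names are invented, and the union-by-rank/path-compression details come from the standard disjoint-set forest, which the lecture has not yet covered.

```python
# Sketch of Kruskal's algorithm; edges are (weight, u, v) tuples.
def kruskal(vertices, edges):
    parent = {v: v for v in vertices}   # MakeSet(v) for each vertex
    rank = {v: 0 for v in vertices}

    def find_set(x):                    # FindSet with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):                    # union by rank; a, b are roots
        if rank[a] < rank[b]:
            a, b = b, a
        parent[b] = a
        if rank[a] == rank[b]:
            rank[a] += 1

    tree = []
    for w, u, v in sorted(edges):       # sort E by increasing weight
        ru, rv = find_set(u), find_set(v)
        if ru != rv:                    # u and v in different components
            tree.append((u, v, w))
            union(ru, rv)
    return tree
```

For example, on a triangle with edge weights 1, 2, 3, the algorithm keeps the two lightest edges.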

5-27
Kruskal's Algorithm
Run the algorithm
  • Kruskal()
  • T = ∅
  • for each v ∈ V
  • MakeSet(v)
  • sort E by increasing edge weight w
  • for each (u,v) ∈ E (in sorted order)
  • if FindSet(u) ≠ FindSet(v)
  • T = T ∪ {(u,v)}
  • Union(FindSet(u), FindSet(v))

[Slides 5-27 repeat this pseudocode over an example graph (figure not
reproduced in this transcript) with edge weights 1, 2, 5, 8, 9, 13,
14, 17, 19, 21, 25, highlighting one edge per slide as the algorithm
considers the edges in increasing weight order.]
28
Correctness of Kruskal's Algorithm
  • Sketch of a proof that this algorithm produces an
    MST
  • Assume the algorithm is wrong: the result is not
    an MST
  • Then the algorithm adds a wrong edge at some point
  • If it adds a wrong edge, there must be a
    lower-weight edge crossing the same cut
    (cut-and-paste argument)
  • But the algorithm chooses the lowest-weight edge
    at each step. Contradiction
  • Again, it is important to be comfortable with
    cut-and-paste arguments

29
Kruskal's Algorithm
What will affect the running time?
  • Kruskal()
  • T = ∅
  • for each v ∈ V
  • MakeSet(v)
  • sort E by increasing edge weight w
  • for each (u,v) ∈ E (in sorted order)
  • if FindSet(u) ≠ FindSet(v)
  • T = T ∪ {(u,v)}
  • Union(FindSet(u), FindSet(v))
30
Kruskal's Algorithm
What will affect the running time?
1 sort, O(V) MakeSet() calls, O(E) FindSet() calls,
O(V) Union() calls (Exactly how many Unions?)
  • Kruskal()
  • T = ∅
  • for each v ∈ V
  • MakeSet(v)
  • sort E by increasing edge weight w
  • for each (u,v) ∈ E (in sorted order)
  • if FindSet(u) ≠ FindSet(v)
  • T = T ∪ {(u,v)}
  • Union(FindSet(u), FindSet(v))
31
Kruskal's Algorithm Running Time
  • To summarize
  • Sort edges: O(E lg E)
  • O(V) MakeSet()s
  • O(E) FindSet()s
  • O(V) Union()s
  • Upshot
  • The best disjoint-set union algorithm makes the
    above 3 operations take O(E·α(E,V)), where α is
    almost constant
  • Overall, thus, O(E lg E); almost linear if we
    ignore the sorting

32
Disjoint Set Union
  • So how do we implement disjoint-set union?
  • Naïve implementation: use a linked list to
    represent each set
  • MakeSet(): ??? time
  • FindSet(): ??? time
  • Union(A,B): copy elements of A into B: ??? time

33
Disjoint Set Union
  • So how do we implement disjoint-set union?
  • Naïve implementation: use a linked list to
    represent each set
  • MakeSet(): O(1) time
  • FindSet(): O(1) time
  • Union(A,B): copy elements of A into B: O(|A|)
    time
  • How long can a single Union() take?
  • How long will n Union()s take?

34
Disjoint Set Union Analysis
  • Worst-case analysis: O(n^2) time for n Unions
  • Union(S1, S2): copy 1 element
  • Union(S2, S3): copy 2 elements
  • ...
  • Union(Sn-1, Sn): copy n-1 elements
  • Total: O(n^2)
  • Improvement: always copy the smaller set into the
    larger one
  • Why will this make things better?
  • What is the worst-case time of a single Union()?
  • But now n Unions take only O(n lg n) time!
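The "copy smaller into larger" rule can be sketched as follows (an illustration using Python sets as the set representation; all names here are invented). The point: each copied element lands in a set at least twice the size of its old one, so any element is copied at most lg n times, giving O(n lg n) total for n Unions.

```python
# Each element x has a leader[x] pointing at the set object containing it.
def make_sets(n):
    sets = [{i} for i in range(n)]               # n singleton sets
    return {i: sets[i] for i in range(n)}

def weighted_union(leader, a, b):
    sa, sb = leader[a], leader[b]
    if sa is sb:                                 # already in the same set
        return
    if len(sa) < len(sb):                        # always copy the smaller set
        sa, sb = sb, sa
    for x in sb:                                 # x moves to a set at least
        sa.add(x)                                # twice its old size
        leader[x] = sa

leader = make_sets(8)
for i in range(7):                               # merge everything together
    weighted_union(leader, 0, i + 1)
```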

35
Huffman Code
36
Data Compression
  • Motivation
  • Limited network bandwidth.
  • Limited disk space.
  • Huffman coding
  • Variable length coding
  • Shorter codes are used to encode characters that
    occur frequently.

37
Fixed-length code
  • Each symbol is encoded using the same number of
    bits.
  • C symbols
  • ⌈log2 C⌉ bits per symbol
  • Simple, easy to work with
  • Example: TEST
  • 100001011100

38
Example Fixed-length code
39
Code Trees
000 001 010 011 100 101 110
40
Code Trees
TASTE encodes as 11 01 00000 11 10, i.e.,
1101000001110
42
Code Trees
[Figure: a code tree with the characters P, A, E, T, sp, I, S, nl at
its nodes; not reproduced.]
Wrong: with characters at internal nodes, the bit string
1101000001110 decodes ambiguously, as
11 01 00 00 01 11 0 or as
11 01 00000 11 10
43
Code Trees
  • Characters are placed only at the leaves.
  • No character code is a prefix of another
    character code.
  • Prefix code
  • A sequence of bits can always be decoded
    unambiguously.
  • Full Trees
  • All nodes are leaves or have two children.
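A minimal sketch of unambiguous prefix-code decoding by walking a binary tree bit by bit; the three-symbol code table here is invented for illustration.

```python
code = {'a': '0', 'b': '10', 'c': '11'}    # a prefix code: chars at leaves only

def build_tree(code):
    root = {}
    for ch, bits in code.items():
        node = root
        for bit in bits[:-1]:              # walk/create interior nodes
            node = node.setdefault(bit, {})
        node[bits[-1]] = ch                # the last bit leads to a leaf
    return root

def decode(bits, root):
    out, node = [], root
    for bit in bits:
        node = node[bit]
        if isinstance(node, str):          # reached a leaf: emit, restart at root
            out.append(node)
            node = root
    return ''.join(out)
```

Because no codeword is a prefix of another, every walk ends at exactly one leaf, so the parse is unique.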

44
Problem Description
  • Input:
  • A list of symbols and their frequencies
  • Output:
  • A code tree with minimum total cost
  • Total cost: Σ_s ls · fs, where
  • ls = number of bits of the code for symbol s
  • fs = frequency of symbol s
  • Algorithm:
  • Huffman's algorithm

45
Huffman's algorithm
  • Maintain a forest of trees.
  • Weight of a tree = sum of the frequencies of its
    leaves
  • Initially, there are C single-node trees, one for
    each character.
  • Select two trees, T1 and T2, of smallest weights,
    breaking ties arbitrarily, and form a new tree
    with T1 and T2 as subtrees.
  • Repeat the previous step until there is only 1
    tree left. That tree is the optimal Huffman
    coding tree.

47
[Figure: an intermediate forest in the construction, with tree weights
41, 39, and 67; internal subtrees T1, T2, T3, T4; and leaves A, T, sp,
E, I, S, nl; not reproduced.]
49
Different Optimal Code Trees
50
Exercise (1/2)
areleeelaeelarrn..
51
Exercise (2/2)
52
Maintaining the Trees
  • Operations:
  • C initial inserts
  • 2(C-1) deletes and C-1 inserts
  • Maintain a sorted list using a linked list:
  • O(C) per insert, O(1) per delete
  • O(C^2) total running time
  • Use priority queues:
  • O(log C) per insert, O(log C) per delete
  • O(C log C) total running time

53
Code Table
  • How do we store the code table?
  • Use some kind of array
  • Problem: codes have different lengths.

54
Parenthesized Infix Expression
  • E → symb | (E, symb, E)
  • symb ∈ {a, b, c, d, e, o}
  • (((a,o,b),o,c),o,(d,o,e))

55
Prefix expression
  • E → symb | * E E
  • symb ∈ {a, d, m, k, f}
  • ***adm*kf
  • prefix structure: 000111011
  • leaves: admkf

56
Encoding the code tree
  • Let interior nodes contain the symbol *
  • Construct the prefix expression E:
  • ***adm*kf
  • Remove the *s from E:
  • admkf
  • In E, replace the *s with 0,
    and the other symbols with 1:
  • 000111011

57
Encoding the code tree
  • Tree structure:
  • 0 for interior nodes
  • 1 for leaf nodes
  • encoded as a prefix expression
  • 000111011
  • Information in the leaf nodes:
  • Array
  • All entries have the same size.
  • a d m k f
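The 0/1 preorder encoding above can be sketched in Python. The nested-tuple tree mirrors the slide's five-leaf example; this representation is an assumption for illustration.

```python
# Preorder walk: emit 0 for an interior node, 1 for a leaf,
# and collect the leaf symbols separately.
def encode_tree(node, bits, leaves):
    if isinstance(node, str):              # leaf: holds a symbol
        bits.append('1')
        leaves.append(node)
    else:                                  # interior node: (left, right)
        bits.append('0')
        encode_tree(node[0], bits, leaves)
        encode_tree(node[1], bits, leaves)

# The tree from the slide, with leaves a, d, m, k, f
tree = ((('a', 'd'), 'm'), ('k', 'f'))
bits, leaves = [], []
encode_tree(tree, bits, leaves)
```

Because the tree is full, the 0/1 string plus the leaf list is enough to rebuild it.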

58
More Example Huffman codes
  • If we use a variable number of bits per code,
    such that frequent characters use fewer bits and
    infrequent characters use more bits, we can
    decrease the space needed to store the same
    information. For example, consider the following
    sentence:
  • dead beef cafe deeded dad. dad faced a faded
    cab. dad acceded. dad be bad.
  • There are 12 a's, 4 b's, 5 c's, 19 d's, 12
    e's, 4 f's, 17 spaces, and 4 periods, for a total
    of 77 characters.

59
fixed-length code
  • If we use a fixed-length code like this
  • 000 (space)
  • 001 a
  • 010 b
  • 011 c
  • 100 d
  • 101 e
  • 110 f
  • 111 .

Then the sentence, which is of length 77,
consumes 77 × 3 = 231 bits.
60
variable length code
  • if we use a variable length code like this
  • 100 (space)
  • 110 a
  • 11110 b
  • 1110 c
  • 0 d
  • 1010 e
  • 11111 f
  • 1011 .

we can encode the text in 3·17 + 3·12 + 5·4 + 4·5 +
1·19 + 4·12 + 5·4 + 4·4 = 230 bits.
That's a savings of 1 bit.
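The two totals can be re-derived from the frequency and code tables above; a quick check in Python:

```python
# Frequencies and variable-length codes copied from the slides.
freq = {' ': 17, 'a': 12, 'b': 4, 'c': 5, 'd': 19, 'e': 12, 'f': 4, '.': 4}
var_code = {' ': '100', 'a': '110', 'b': '11110', 'c': '1110',
            'd': '0', 'e': '1010', 'f': '11111', '.': '1011'}

fixed_bits = sum(freq.values()) * 3     # 77 characters x 3 bits each
var_bits = sum(freq[ch] * len(bits) for ch, bits in var_code.items())
```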
61
Binary Code
  • Suppose that we have a large amount of text that
    we wish to store on a computer disk in an
    efficient way. The simplest way to do this is
    simply to assign a binary code to each character,
    and then store the binary codes consecutively in
    the computer memory.
  • The ASCII system, for example, uses a fixed 8-bit
    code to represent each character. Storing n
    characters as ASCII text requires 8n bits of
    memory.

62
Binary Code (cont.)
  • Suppose that we are storing only the 10 numeric
    characters 0, 1, ..., 9.

63
Non-random data
  • Consider the following data, which is taken from
    a Postscript file.

64
A good code
  • What would happen if we used the following code
    to store the data rather than the fixed length
    code?

65
Example
  • To store the string 0748901:
  • we would get 0000011101001000100100000001
  • using the fixed-length code, and
  • 10110001010000111011010 using the variable-length
    code.

66
Prefix codes
  • A prefix code is one in which no codeword is a
    prefix of any other codeword. Decoding such a
    code is done using a binary tree.

[Figure: a decoding tree for the digits; internal edges are labeled 0
and 1, and the digits appear at the leaves; not reproduced.]
67
Optimal trees
  • A tree representing an optimal code for a file is
    always a full binary tree, namely one where
    every node is either a leaf or has precisely two
    children.

68
Why does it work?
  • In order to show that Huffman's algorithm works,
    we must show that there can be no prefix codes
    that are better than the one produced by
    Huffman's algorithm.

69
Huffman Codes
  • Let T be the tree, C be the set of characters
    that comprise the alphabet, and f(c) be the
    frequency of character c. Since the number of
    bits is the same as the depth in the binary tree,
    we can express the sum in terms of dT(c), the
    depth of character c in the tree:
    B(T) = Σ_{c ∈ C} f(c) · dT(c)
  • This is the sum we want to minimize. We'll call
    it the cost, B(T), of the tree. Now we just need
    an algorithm that will build a tree with minimal
    cost.

70
Huffman Pseudocode
  • Huffman(C)
  • n = the size of C
  • insert all the elements of C into Q,
  • using the value of the node as the priority
  • for i in 1..n-1 do
  • z = a new tree node
  • x = Extract-Minimum(Q)
  • y = Extract-Minimum(Q)
  • left child of z = x
  • right child of z = y
  • f[z] = f[x] + f[y]
  • Insert(Q, z)
  • end for
  • return Extract-Minimum(Q) as the complete tree
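A runnable sketch of this pseudocode, using Python's heapq module as the priority queue Q. The insertion counter is only a tie-breaker so that tuples never compare the tree payloads; the codes() helper is an addition, not part of the slide's pseudocode.

```python
import heapq

def huffman(freq):
    # freq: dict symbol -> frequency; returns the root of the code tree.
    # Heap entries are (frequency, tie-breaker, subtree).
    heap = [(f, i, sym) for i, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        fx, _, x = heapq.heappop(heap)   # x = Extract-Minimum(Q)
        fy, _, y = heapq.heappop(heap)   # y = Extract-Minimum(Q)
        heapq.heappush(heap, (fx + fy, counter, (x, y)))  # f[z] = f[x] + f[y]
        counter += 1
    return heap[0][2]

def codes(tree, prefix=''):
    # Walk the tree: 0 for the left branch, 1 for the right branch.
    if not isinstance(tree, tuple):      # leaf: a symbol
        return {tree: prefix or '0'}     # '0' covers a one-symbol alphabet
    table = {}
    table.update(codes(tree[0], prefix + '0'))
    table.update(codes(tree[1], prefix + '1'))
    return table
```

With frequencies a:5, b:2, c:1, d:1, the most frequent symbol gets a 1-bit code and the two rarest get 3-bit codes.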

71
Proving Optimality
  • Greedy-choice property
  • building an optimal tree can begin by merging two
    lowest-frequency characters
  • Optimal-substructure property

[Figure: two subtrees labeled "Optimal"; not reproduced.]
72
Greedy-Choice Property (1/2)
  • Let x and y be two characters with the lowest
    frequencies.
  • Prove that there exists an optimal code tree in
    which x and y appear as sibling leaves of maximum
    depth.

73
Greedy-Choice Property (2/2)
[Figure: the exchange argument, not reproduced. Starting from an
optimal tree T, swap x with the deepest leaf b, then y with its
sibling c; neither swap increases the cost, so some optimal tree has x
and y as sibling leaves of maximum depth.]
74
Optimal-Substructure Property
[Figure: trees T and T', not reproduced. In T, the sibling leaves x
and y hang off an internal node; T' replaces them with a single leaf c
whose frequency is f(c) = f(x) + f(y).]
75
Conclusion
  • Greedy algorithms are
  • simple
  • easy to invent
  • easy to implement
  • efficient
  • They do not always yield optimal solutions
  • They require the greedy-choice and
    optimal-substructure properties for optimality