Randomized Algorithms

About This Presentation

Title:

Randomized Algorithms

Description:

Randomized algorithms: Use of probabilistic inequalities in analysis, amortized analysis, competitive analysis, and applications using examples. – PowerPoint PPT presentation


Slides: 182

Provided by: vudasrinivasarao

Category: Medicine, Science & Technology

Tags: algorithms


Transcript and Presenter's Notes

Title: Randomized Algorithms

1
ADVANCED ALGORITHM ANALYSIS
LECTURE 2: RANDOMIZED ALGORITHMS

Prof. Vuda Sreenivasarao
Bahir Dar University-ETHIOPIA

2
Randomized Algorithms

Objectives
1.Randomized Algorithms
1.1.Basic concept of Randomized algorithms with
example.
1.2.Probabilistic inequalities in analysis with
examples.
1.3.Amortized analysis with examples.
1.4.Competitive analysis with examples.

3
Deterministic Algorithms

Input
Output
Goal: to prove that the algorithm always solves the
problem correctly, and quickly (typically
the number of steps should be polynomial in the
size of the input).

Algorithm
4
A short list of categories

Algorithm types we will consider include
Simple recursive algorithms
Backtracking algorithms
Divide and conquer algorithms
Dynamic programming algorithms
Greedy algorithms
Branch and bound algorithms
Brute force algorithms
Randomized algorithms

5
Randomized algorithms

Randomization. Allow a fair coin flip in unit
time.
Why randomize? It can lead to the simplest, fastest, or
only known algorithm for a particular problem.
Examples: symmetry-breaking protocols, graph
algorithms, quicksort, hashing, load balancing,
Monte Carlo integration, cryptography.

6
Randomized algorithms

A randomized algorithm is just one that depends
on random numbers for its operation.
These are randomized algorithms
Using random numbers to help find a solution to a
problem.
Using random numbers to improve a solution to a
problem.
These are related topics
Getting or generating random numbers.
Generating random data for testing (or other)
purposes.

7
1.Randomized Algorithms

INPUT
OUTPUT
RANDOM NUMBERS
In addition to its input, the algorithm takes a source of
random numbers and makes random choices during
execution.
Randomized algorithms make decisions on rolls of the
dice.
Examples: Quicksort, Quickselect, and hash tables.

ALGORITHM
8
Why use randomness?

Avoid worst-case behavior: randomness can
(probabilistically) guarantee average-case
behavior.
Efficient approximate solutions to intractable
problems.

9
Making Decision
Flip a coin.
10
Making Decision
Flip a coin!
An algorithm that flips coins is called a
randomized algorithm.
11
Why Randomness?
Making decisions could be complicated.
A randomized algorithm is simpler.
Consider the minimum cut problem.
It can be solved by max flow.
Randomized algorithm?
Pick a random edge and contract it,
and repeat until two vertices are left.
12
Why Randomness?
Making good decisions could be expensive.
A randomized algorithm is faster.
Consider a sorting procedure.
Input: 5 9 13 8 11 6 7 10
Partition around the pivot 8: [5 6 7] 8 [9 10 11 13]
Picking an element in the middle makes the
procedure very efficient, but it is expensive
(i.e. linear time) to find such an element.
Picking a random element will do.
13
Why Randomness?
Making good decisions could be expensive.
A randomized algorithm is faster.

Minimum spanning trees:
A linear-time randomized algorithm is known,
but no linear-time deterministic
algorithm.
Primality testing:
A randomized polynomial-time algorithm was known,
but it took thirty years to find a
deterministic one.
Volume estimation of a convex body:
A randomized polynomial-time approximation
algorithm is known,
but no deterministic polynomial-time
approximation algorithm.

14
Why Randomness?
In many practical problems, we need to deal with
HUGE input, and don't even have time to read it
once. But can we still do something useful?
Sublinear algorithms: randomness is essential.

Fingerprinting: verifying equality of strings,
pattern matching.
The power of two choices: load balancing,
hashing.
Random walk: check connectivity in log-space.

15
Advantages of randomized algorithms

Simplicity.
Performance.
For many problems, a randomized algorithm is the
simplest, the fastest, or both.

16
Scope of Randomized Algorithms

Number-theoretic algorithms: primality testing
(Monte Carlo).
Data structures: sorting, order statistics,
searching, computational geometry.
Algebraic identities: polynomial and matrix
identity verification. Interactive proof systems.
Mathematical programming: faster algorithms for
linear programming. Rounding linear program
solutions to integer program solutions.

17
Scope of Randomized Algorithms

Graph algorithms: minimum spanning trees, shortest
paths, minimum cuts.
Counting and enumeration: matrix permanent.
Counting combinatorial structures.
Parallel and distributed computing: deadlock
avoidance, distributed consensus.
Probabilistic existence proofs: show that a
combinatorial object arises with nonzero
probability among objects drawn from a suitable
probability space.

18
Randomized algorithms

In a randomized algorithm (probabilistic
algorithm), we make some random choices.
Two types of randomized algorithms:
For an optimization problem, a randomized
algorithm gives an optimal solution. The average
case time-complexity is more important than the
worst case time-complexity.
For a decision problem, a randomized algorithm
may make mistakes. The probability of producing
wrong solutions is very small.

19
Types of Random Algorithms

Las Vegas
Guaranteed to produce correct answer, but running
time is probabilistic.
Monte Carlo
Running time bounded by input size, but answer
may be wrong.

20
Las Vegas or Monte Carlo?
21
Las Vegas

Always gives the true answer.
Running time is random.
Running time is bounded.
Quick sort is a Las Vegas algorithm.
A Las Vegas algorithm always produces the correct
answer; its running time is a random variable
whose expectation is bounded (say, by a
polynomial).

22
Monte Carlo

It may produce an incorrect answer!
We are able to bound its error probability.
By running it many times with independent random
choices, we can make the failure probability
arbitrarily small at the expense of running time.
A Monte Carlo algorithm runs for a fixed number
of steps and produces an answer that is correct
with probability ≥ 1/2.

23
RP Class ( randomized polynomial )

Bounded polynomial time in the worst case.
If the answer is Yes: Pr[return Yes] > ½.
If the answer is No: Pr[return Yes] = 0.
The constant ½ is not actually important.

24
PP Class ( probabilistic polynomial )

Bounded polynomial time in the worst case.
If the answer is Yes: Pr[return Yes] > ½.
If the answer is No: Pr[return Yes] < ½.
Unfortunately the definition is weak, because the
distance to ½ is important but is not considered.

25
Routing Problem

There are n computers.
Each computer has a packet.
Each packet has a destination D(i).
Packets cannot follow the same edge
simultaneously.
An oblivious algorithm is required.
For any deterministic oblivious algorithm on a
network of N nodes, each of out-degree d, there is
an instance of permutation routing requiring
Ω(√(N/d)) steps.

26
Routing Problem

Pick random intermediate destination.
Packet i first travels to the intermediate
destination and then to the final destination.
With probability at least 1 − (1/N), every packet
reaches its destination in 14n or fewer steps in
the hypercube Q_n.
The expected number of steps is at most 15n.

27
1.Randomized Algorithms

EXAMPLE: Expectation
X() flips a coin.
Heads: one second to execute.
Tails: three seconds.
Let X be the running time of one call to X().
With probability 0.5, X is 1.
With probability 0.5, X is 3.
Here the random variable is X.
Expected value of X: E[X] = 0.5 × 1 + 0.5 × 3 = 2 seconds
expected time.
Suppose we run
X();  // takes time X
X();  // takes time Y
Total running time is T = X + Y; here T is a random
variable.
What is the expected total time E[T]?
Linearity of expectation: E[X + Y] = E[X] + E[Y] = 2 + 2 = 4
seconds expected time.
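
The coin-flip example can be checked by brute force. The sketch below is only an illustration: the names OUTCOMES, expected_single and expected_total are made up here, and it assumes the two calls are independent fair coin flips. It enumerates every outcome sequence with exact fractions and confirms that the expected total is the sum of the individual expectations.

```python
from fractions import Fraction
from itertools import product

# Running time of one call to X(): 1 second on heads, 3 seconds on tails.
OUTCOMES = {1: Fraction(1, 2), 3: Fraction(1, 2)}

def expected_single():
    """E[X] = sum over outcomes of value * probability."""
    return sum(value * prob for value, prob in OUTCOMES.items())

def expected_total(calls):
    """E[T] for `calls` independent calls, by enumerating every outcome sequence."""
    total = Fraction(0)
    for seq in product(OUTCOMES.items(), repeat=calls):
        prob = Fraction(1)
        time = 0
        for value, p in seq:
            prob *= p
            time += value
        total += prob * time
    return total

print(expected_single())   # 2
print(expected_total(2))   # 4
```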

28
Min_Cut Problem

Definition:
The Min_Cut problem is to find the minimum edge set
C such that removing C disconnects the graph.
Traditional solution:
Max-flow. The maximum amount of flow is equal to
the capacity of a minimum cut.

29
Example of Min_Cut
a
b
e.g. Min_Cut 2
30
Intuition

Let graph G have n nodes and let the size of the min_cut be
k, that is |C| = k. Then the degree of each node is ≥ k,
so the total number of edges in G is ≥ nk/2.
Randomized Min_Cut
Input: a graph G(V, E), |V| = n
Output: min_cut C
Repeat: pick any edge uniformly at random,
collapse it and remove self-loops
Until |V| is down to 2.
The loop performs n − 2 contractions, so the running time is O(n) contractions.

31
Example of Randomized Min_Cut
min_cut 2
Or maybe
min_cut 4
32
Las Vegas VS Monte Carlo

Las Vegas Algorithm: It always produces the
correct answer, and the expected running time is
finite (e.g. randomized quick sort).
Monte Carlo Algorithm: It may produce an incorrect
answer, but with bounded error probability (e.g.
randomized min_cut).

33
Analysis

Probability that the first edge chosen is not in C:
Prob ≥ (kn/2 − k) / (kn/2)
= (n − 2) / n
Probability that the second edge chosen is not in C:
Prob ≥ (k(n − 1)/2 − k) / (k(n − 1)/2)
= (n − 3) / (n − 1)

min_cut
34
Analysis
Iteration    Probability of avoiding C
1            (n − 2) / n
2            (n − 3) / (n − 1)
3            (n − 4) / (n − 2)
4            (n − 5) / (n − 3)
...
n − 2        1 / 3
Prob. of outputting C: Pr ≥ (n − 2)/n × (n − 3)/(n − 1) × ... × 1/3 = 2 / (n(n − 1))

35
Analysis

The probability of getting a min_cut is at least
2 / (n(n − 1)).
This might look small, but it gets bigger after
repeating the algorithm. E.g. if the algorithm is
run twice, the probability of outputting C would
be
Pr ≥ 1 − (1 − 2/(n(n − 1)))^2
Let r be the number of times the algorithm is
run.
Total running time: O(nr)
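
The bound 2/(n(n − 1)) and the effect of repetition can be checked numerically. The following is a small sketch with made-up function names; it multiplies the per-iteration bounds exactly and then applies 1 − (1 − p)^r for r independent runs.

```python
from fractions import Fraction

def single_run_lower_bound(n):
    """Multiply the per-iteration bounds (m - 2)/m for m = n, n-1, ..., 3 nodes."""
    prob = Fraction(1)
    for remaining in range(n, 2, -1):      # number of nodes before each contraction
        prob *= Fraction(remaining - 2, remaining)
    return prob

def repeated_lower_bound(n, r):
    """Lower bound on finding the min cut in at least one of r independent runs."""
    p = single_run_lower_bound(n)
    return 1 - (1 - p) ** r

print(single_run_lower_bound(5))    # 1/10, i.e. 2/(5*4)
print(repeated_lower_bound(5, 2))   # 19/100
```

Because the single-run bound is about 2/n^2, on the order of n^2 repetitions are needed before the overall success probability becomes a constant.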

36
Internet Minimum Cut
June 1999 Internet graph, Bill Cheswick:
http://research.lumeta.com/ches/map/gallery/index.html
37
1.Randomized Algorithms

EXAMPLE: Hash Tables
A random hash code maps each possible key to a
randomly chosen bucket, but a key's random hash
code never changes.
This is a good model for how a good hash code will
perform.
Assume the hash table uses chaining, with no duplicate
keys.
Perform find(k). k hashes to bucket b. The cost of the
search is one birr, plus one birr for every entry in
bucket b whose key is not k.
Suppose there are n keys in the table besides k.
V1, V2, ..., Vn: random variables, one for each key Ki.
Vi = 1 if key Ki hashes to bucket b, zero
otherwise.

38
1.Randomized Algorithms

Cost of find(k) is T = 1 + V1 + V2 + ... + Vn.
Expected cost is E[T] = 1 + E[V1] + E[V2] + ... + E[Vn].
N buckets: each key has probability 1/N of
hashing to bucket b, so
E[Vi] = 1/N
E[T] = 1 + n/N.
If the load factor n/N is bounded by a constant c, E[T] is O(1).
Hash table operations take O(1) expected
amortized time.
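
The formula E[T] = 1 + n/N can be compared against a simulation. This is a rough sketch under the slide's model (every key lands in an independent, uniformly random bucket); the function names and the chosen sizes are illustrative assumptions, not part of the lecture.

```python
import random

def expected_find_cost(n, num_buckets):
    """E[T] = 1 + n/N, from linearity of expectation."""
    return 1 + n / num_buckets

def simulated_find_cost(n, num_buckets, trials, rng):
    """Average cost of find(k) when n other keys get independent uniform buckets."""
    total = 0
    for _ in range(trials):
        b = rng.randrange(num_buckets)   # bucket of the searched key k
        collisions = sum(1 for _ in range(n) if rng.randrange(num_buckets) == b)
        total += 1 + collisions          # one unit, plus one per other key in bucket b
    return total / trials

rng = random.Random(0)                   # fixed seed so the run is repeatable
print(expected_find_cost(50, 100))
print(simulated_find_cost(50, 100, 10000, rng))
```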

39
Contention Resolution in a Distributed System

Contention resolution. Given n processes P1, ...,
Pn, each competing for access to a shared
database. If two or more processes access the
database simultaneously, all processes are locked
out. Devise a protocol to ensure all processes get
through on a regular basis.
Restriction. Processes can't communicate.
Challenge. Need a symmetry-breaking
paradigm.

P1
P2
...
Pn
40
Contention Resolution Randomized Protocol

Protocol. Each process requests access to the
database at time t with probability p = 1/n.
Claim. Let S[i, t] = event that process i
succeeds in accessing the database at time t.
Then 1/(e · n) ≤ Pr[S(i, t)] ≤ 1/(2n).
Pf. By independence, Pr[S(i, t)] = p (1 − p)^(n−1):
process i requests access (probability p) and none of the
remaining n − 1 processes request access (probability (1 − p)^(n−1)).
Setting p = 1/n, the value that maximizes Pr[S(i, t)], we have
Pr[S(i, t)] = (1/n) (1 − 1/n)^(n−1), where the factor
(1 − 1/n)^(n−1) is between 1/e and 1/2. ∎
Useful facts from calculus. As n increases from
2, the function
(1 − 1/n)^n converges monotonically from 1/4 up
to 1/e;
(1 − 1/n)^(n−1) converges monotonically from 1/2
down to 1/e.
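
As a quick numerical check of the claim, the sketch below (the function name is an assumption made for illustration) evaluates Pr[S(i, t)] = p (1 − p)^(n−1) at p = 1/n and compares it with the two bounds 1/(e · n) and 1/(2n).

```python
import math

def success_probability(n):
    """Pr[S(i, t)] = p * (1 - p)^(n - 1) with p = 1/n."""
    p = 1.0 / n
    return p * (1.0 - p) ** (n - 1)

for n in (2, 10, 100):
    lower = 1.0 / (math.e * n)
    upper = 1.0 / (2 * n)
    print(n, lower, success_probability(n), upper)
```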
41
Contention Resolution Randomized Protocol

Claim. The probability that process i fails to
access the database in ⌈e · n⌉ rounds is at most 1/e.
After ⌈e · n⌉ ⌈c ln n⌉ rounds, the probability is at
most n^(−c).
Pf. Let F[i, t] = event that process i fails to
access the database in rounds 1 through t. By
independence and the previous claim, we have
Pr[F(i, t)] ≤ (1 − 1/(en))^t.
Choose t = ⌈e · n⌉: Pr[F(i, t)] ≤ (1 − 1/(en))^⌈en⌉ ≤ (1 − 1/(en))^(en) ≤ 1/e.
Choose t = ⌈e · n⌉ ⌈c ln n⌉: Pr[F(i, t)] ≤ (1/e)^(c ln n) = n^(−c).
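
The two choices of t can be checked numerically. This is a minimal sketch (the function name and the sample values of n and c are assumptions) that evaluates the upper bound (1 − 1/(en))^t on the failure probability.

```python
import math

def failure_bound(n, t):
    """Upper bound (1 - 1/(e n))^t on Pr[F(i, t)]."""
    return (1.0 - 1.0 / (math.e * n)) ** t

n, c = 100, 2
t1 = math.ceil(math.e * n)                 # enough rounds for failure probability <= 1/e
t2 = t1 * math.ceil(c * math.log(n))       # enough rounds for failure probability <= n^(-c)
print(failure_bound(n, t1), 1 / math.e)
print(failure_bound(n, t2), n ** (-c))
```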

42
1.Randomized Algorithms

EXAMPLE: Nuts and Bolts
Suppose we are given n nuts and n bolts of
different sizes.
Each nut matches exactly one bolt and vice versa.
The nuts and bolts are all almost exactly the
same size, so we can't tell if one bolt is bigger
than the other, or if one nut is bigger than the
other. If we try to match a nut with a bolt,
however, the nut will be either too big, too
small, or just right for the bolt.
Our task is to match each nut to its
corresponding bolt.

43
1.Randomized Algorithms

Suppose we want to find the nut that matches a
particular bolt.
The obvious algorithm (test every nut until we
find a match) requires exactly n − 1 tests in the
worst case.
We might have to check every nut except one; if
we get down to the last nut without finding a
match, we know that the last nut is the one we're
looking for.
Intuitively, in the average case, this
algorithm will look at approximately n/2 nuts.
But what exactly does "average case" mean?

44
Deterministic vs. Randomized Algorithms

Normally, when we talk about the running time of
an algorithm, we mean the worst-case running
time. This is the maximum, over all inputs of a
certain size, of the running time of that
algorithm on that input.
On extremely rare occasions, we will also be
interested in the best-case running time.
The average-case running time is best defined by
the expected value, over all inputs X of a
certain size, of the algorithm's running time for
X.

45
Randomized Algorithms

Two kinds of algorithms: deterministic and
randomized.
A deterministic algorithm is one that always
behaves the same way given the same input; the
input completely determines the sequence of
computations performed by the algorithm.
Randomized algorithms, on the other hand, base
their behavior not only on the input but also on
several random choices.
The same randomized algorithm, given the same
input multiple times, may perform different
computations in each invocation. This means,
among other things, that the running time of a
randomized algorithm on a given input is no
longer fixed, but is itself a random variable.

46
EXAMPLE Nuts and Bolts

Finding the nut that matches a given bolt: test the
nuts in an order chosen uniformly at random.
"Uniformly" is a technical term meaning that each
nut has exactly the same probability of being
chosen.
So if there are k nuts left to test, each one
will be chosen with probability 1/k.
Now what's the expected number of comparisons we
have to perform? Intuitively, it should be about
n/2, but let's formalize our intuition.

47
EXAMPLE Nuts and Bolts

Let T(n) denote the number of comparisons our
algorithm uses to find a match for a single bolt
out of n nuts.
We still have some simple base cases, T(1) = 0
and T(2) = 1, but when n > 2, T(n) is a random
variable.
T(n) is always between 1 and n − 1; its actual
value depends on our algorithm's random choices.
We are interested in the expected value or
expectation of T(n), which is defined as follows:
E[T(n)] = Σ_{k=1}^{n−1} k · Pr[T(n) = k]

48
EXAMPLE Nuts and Bolts

If the target nut is the kth nut tested, our
algorithm performs min{k, n − 1} comparisons.
In particular, if the target nut is the last nut
chosen, we don't actually test it. Because we
choose the next nut to test uniformly at random,
the target nut is equally likely, with probability
exactly 1/n, to be the first, second, third, or
kth nut tested, for any k. Thus
Pr[T(n) = k] = 1/n for k < n − 1, and Pr[T(n) = n − 1] = 2/n.
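
The expectation can be evaluated exactly from this description. The sketch below uses a made-up function name and assumes only what the slide states: the target nut is equally likely to be in any of the n positions of the random order, and finding it in position k costs min{k, n − 1} comparisons. Summing gives E[T(n)] = (n + 1)/2 − 1/n, which agrees with the "about n/2" intuition.

```python
from fractions import Fraction

def expected_comparisons(n):
    """E[T(n)] when the matching nut is equally likely to be in any of n positions."""
    total = Fraction(0)
    for position in range(1, n + 1):          # position of the target nut in the random order
        comparisons = min(position, n - 1)    # the last nut is never actually tested
        total += Fraction(comparisons, n)
    return total

print(expected_comparisons(2))    # 1
print(expected_comparisons(4))    # 9/4
```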

49
EXAMPLES

Contention Resolution.
Global Minimum Cut.
Linearity of Expectation.
MAX 3-SATISFIABILITY.
Universal Hashing.
Chernoff Bounds.
Load Balancing.
Randomized Divide-and-Conquer.
Queuing problems.

50
Randomized Algorithms Examples

Verifying Matrix Multiplication
Problem: Given three n × n matrices A, B, C, is AB =
C?
Deterministic algorithm
(A) Multiply A and B and check if the result is equal to C.
(B) Running time? O(n^3) by the straightforward
approach. O(n^2.37) with fast matrix multiplication
(complicated and impractical).
Randomized algorithm
(A) Pick a random n × 1 vector r.
(B) Return the answer of the equality A(Br) = Cr.
(C) Running time? O(n^2)!
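
A minimal sketch of the randomized check follows. The slide does not say how r is drawn or how often the test is repeated; here r is assumed to be a uniformly random 0/1 vector (the usual choice, for which a single trial misses a wrong product with probability at most 1/2), and the test is repeated a fixed number of times. The function names are illustrative.

```python
import random

def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector in O(n^2) time."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

def probably_equal(a, b, c, trials=30, rng=None):
    """Return False if AB != C was detected, True if every random trial passed."""
    rng = rng or random.Random()
    n = len(c)
    for _ in range(trials):
        r = [rng.randint(0, 1) for _ in range(n)]      # random 0/1 vector (assumption)
        if mat_vec(a, mat_vec(b, r)) != mat_vec(c, r):  # compare A(Br) with Cr
            return False
    return True

a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
print(probably_equal(a, b, [[19, 22], [43, 50]]))   # True: this is the real product
```

A True answer may be wrong with small probability; a False answer is always correct, so this is a Monte Carlo algorithm with one-sided error.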

51
Quicksort Vs. Randomized Quicksort

Quicksort
(A) Pick a pivot element from the array.
(B) Split the array into 3 subarrays: those smaller
than the pivot, those larger than the pivot, and the
pivot itself.
(C) Recursively sort the subarrays, and
concatenate them.
Randomized Quicksort
(A) Pick a pivot element uniformly at random from
the array.
(B) Split the array into 3 subarrays: those smaller
than the pivot, those larger than the pivot, and the
pivot itself.
(C) Recursively sort the subarrays, and
concatenate them.

52
Quicksort Vs. Randomized Quicksort

Quicksort can take O(n^2) time to sort an array of
size n.
Randomized Quicksort sorts a given array of
length n in O(n log n) expected time.
Randomization can NOT eliminate the worst case,
but it can make it less likely!
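
The three steps of Randomized Quicksort translate almost directly into code. This is a simple out-of-place sketch rather than the in-place PARTITION version; the function name is made up, and elements equal to the pivot are grouped with it so that duplicates are handled.

```python
import random

def randomized_quicksort(items, rng=None):
    """Sort a list by partitioning around a uniformly random pivot."""
    rng = rng or random.Random()
    if len(items) <= 1:
        return list(items)
    pivot = rng.choice(items)                      # (A) random pivot
    smaller = [x for x in items if x < pivot]      # (B) split around the pivot
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    # (C) sort the parts recursively and concatenate
    return randomized_quicksort(smaller, rng) + equal + randomized_quicksort(larger, rng)

print(randomized_quicksort([5, 9, 13, 8, 11, 6, 7, 10]))
# [5, 6, 7, 8, 9, 10, 11, 13]
```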

53
Examples
54
Examples

Although the Quicksort algorithm is very efficient in
practice, its worst-case running time is rather
slow. When sorting n elements, the number of
comparisons may be O(n^2).
The worst case happens if the sizes of the
subproblems are not balanced.
We prove that if the pivot is selected uniformly
at random, the expected number of comparisons of
this randomized version of Quicksort is bounded
by O(n log n).

55
Average-Case Analysis of Quicksort

Let X = total number of comparisons performed in
all calls to PARTITION.
The total work done over the entire execution of
Quicksort is
O(nc + X) = O(n + X)
Need to estimate E[X].

56
Review of Probabilities
57
Review of Probabilities
(discrete case)
58
Random Variables

Def. A (discrete) random variable X is a function
from a sample space S to the real numbers.
It associates a real number with each possible
outcome of an experiment.

X(j)
59
Random Variables
E.g. Toss a coin three times and
define X = number of heads.
60
Computing Probabilities Using Random Variables

61
Expectation

The expected value (expectation, mean) of a discrete
random variable X is
E[X] = Σ_x x · Pr{X = x}
It is the average over all possible values of the random
variable X.

62
Examples
Example: X = face of one fair die. E[X] = 1·1/6 +
2·1/6 + 3·1/6 + 4·1/6 + 5·1/6 + 6·1/6 = 3.5
Example
63
Indicator Random Variables

Given a sample space S and an event A, we define
the indicator random variable I{A} associated
with A:
I{A} = 1 if A occurs
I{A} = 0 if A does not occur
The expected value of an indicator random
variable X_A = I{A} is
E[X_A] = Pr{A}
Proof:
E[X_A] = E[I{A}]
= 1 · Pr{A} + 0 · Pr{not A}
= Pr{A}
64
Examples
65
Examples
66
Quick Sort
67
Quick Sort
Partition set into two using randomly chosen
pivot
68
Quick Sort
69
Quick Sort
70
min-cut Examples

An efficient algorithm cannot check all
partitions.
A sample execution of Algorithm on a graph with
5 nodes. Throughout the execution, the edges of
one min-cut of G are colored blue. At each step,
the red, dashed edge is contracted.

71
Examples

Algorithm Randomized Min-Cut(G = (V, E)):
while the graph has more than two nodes do
    choose an edge e = (u, v) uniformly at
    random
    contract e (the node that combines u and
    v inherits all the node labels of u and v)
    and remove self-loops; note that parallel edges
    are not removed
end while
return the cut defined by the labels of one of
the remaining nodes.
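
One possible way to code the contraction loop is sketched below. It is not taken from the slides: it assumes a connected graph on nodes 0..n−1 given as a list of edge pairs, keeps node labels in a simple union-find style array, and returns only the size of the cut found. Since a single run succeeds with small probability, the helper min_cut repeats it and keeps the smallest cut.

```python
import random

def contract_once(n, edges, rng):
    """One run of random contraction; returns the number of edges crossing the cut found."""
    label = list(range(n))             # label[v] leads to the super-node containing v

    def find(v):
        while label[v] != v:
            label[v] = label[label[v]]
            v = label[v]
        return v

    live = list(edges)                 # remaining edges; parallel edges are kept
    groups = n
    while groups > 2:
        u, v = rng.choice(live)        # uniform over the remaining edges
        label[find(u)] = find(v)       # contract the chosen edge
        groups -= 1
        live = [(a, b) for a, b in live if find(a) != find(b)]   # remove self-loops
    return len(live)

def min_cut(n, edges, runs, rng=None):
    """Best cut size over several independent runs (Monte Carlo)."""
    rng = rng or random.Random()
    return min(contract_once(n, edges, rng) for _ in range(runs))

# Two triangles joined by a single edge: the minimum cut has size 1.
graph = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(min_cut(6, graph, runs=200, rng=random.Random(0)))
```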

72
Probabilistic Analysis
73
Probabilistic analysis

Probabilistic analysis is the use of probability
in the analysis of problems.
Most commonly, we use probabilistic analysis to
analyze the running time of an algorithm.
Then we analyze our algorithm, computing an
expected running time.
The expectation is taken over the distribution of
the possible inputs. Thus we are, in effect,
averaging the running time over all possible
inputs.

74
Probability Measures

Random Variables Binomial and Geometric.
Useful Probabilistic Bounds and Inequalities.
A probability measure (Prob) is a mapping from a
set of events to the reals such that:
For any event A,
0 ≤ Prob(A) ≤ 1
Prob(all possible events) = 1
If A, B are mutually exclusive events, then
Prob(A ∪ B) = Prob(A) + Prob(B)

75
Conditional Probability

Define

76
Bayes Theorem

If A1, ..., An are mutually exclusive and contain
all events, then

77
Random Variable A (Over Real Numbers)

Density Function

78
Random Variable A (contd)

Prob Distribution Function

79
Random Variable A (contd)

If for Random Variables A,B
Then A upper bounds B and B lower bounds A

80
Expectation of Random Variable A

A is also called average of A and mean of A
?A

81
Variance of Random Variable A
82
Variance of Random Variable A (contd)
83
Discrete Random Variable A
84
Discrete Random Variable A (contd)
85
Discrete Random Variable A Over Nonnegative
Numbers

Expectation

86
Pair-Wise Independent Random Variables

A, B are independent if
Prob(A ∩ B) = Prob(A) × Prob(B)
Equivalent definition of independence

87
B is Binomial Variable with Parameters n , p
88
B is Binomial Variable with Parameters n , p
(contd)
89
Probabilistic Inequalities

For Random Variable A

90
Markov and Chebychev Probabilistic Inequalities

Markov Inequality (uses only mean)
Chebychev Inequality (uses mean and variance)
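
The formulas for this slide did not survive in the transcript. In their standard forms, Markov's inequality states Pr[X ≥ a] ≤ E[X]/a for a nonnegative random variable X and a > 0, and Chebyshev's inequality states Pr[|X − μ| ≥ t] ≤ σ²/t² for t > 0. The sketch below checks both on a fair die; the die and all names are chosen here for illustration.

```python
from fractions import Fraction

FACES = range(1, 7)
P = Fraction(1, 6)                                     # probability of each face

mean = sum(P * x for x in FACES)                       # 7/2
variance = sum(P * (x - mean) ** 2 for x in FACES)     # 35/12

def tail(a):
    """Exact Pr[X >= a]."""
    return sum(P for x in FACES if x >= a)

def deviation(t):
    """Exact Pr[|X - mean| >= t]."""
    return sum(P for x in FACES if abs(x - mean) >= t)

def markov_bound(a):
    """Markov: uses only the mean."""
    return mean / a

def chebyshev_bound(t):
    """Chebyshev: uses the mean and the variance."""
    return variance / Fraction(t) ** 2

print(tail(5), markov_bound(5))            # 1/3 versus the bound 7/10
print(deviation(2), chebyshev_bound(2))    # 1/3 versus the bound 35/48
```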

91
Gaussian Density Function
92
Normal Distribution

Bounds x gt 0

93
Sums of Independently Distributed Variables

Let Sn be the sum of n independently distributed
variables A1, ..., An
Each with mean and variance
So Sn has mean μ and variance σ²

94
Strong Law of Large Numbers Limiting to Normal
Distribution

The probability density function of to normal
distribution ?(x)
Hence
Prob

95
Strong Law of Large Numbers (contd)

So
Prob
(since 1 − Φ(x) ≤ φ(x)/x)

96
Moments of Random Variable A

nth Moments of Random Variable A
Moment generating function

97
Amortized analysis
98
Amortized analysis

After discussing algorithm design techniques
(dynamic programming and greedy algorithms) we
now return to data structures and discuss a new
analysis method: amortized analysis.
Until now we have seen a number of data
structures and analyzed the worst-case running
time of each individual operation.
Sometimes the cost of an operation varies widely,
so that the worst-case running time is not
really a good cost measure.

99
Amortized analysis

Similarly, sometimes the cost of every single
operation is not so important;
the total cost of a series of operations is more
important (e.g. when using a priority queue to sort).
We want to analyze the running time of one single
operation averaged over a sequence of operations.
Again, keep in mind: "average" is over a sequence
of operations, for any sequence,
not an average for some input distribution (as in
quick-sort),
not an average over random choices made by the algorithm
(as in skip-lists).

100
Amortized Analysis

Amortized analysis is a technique for analyzing
an algorithm's running time.
The average cost of a sequence of n operations on
a given Data Structure.

101
Applications of amortized analysis

Vectors/ tables
Disjoint sets
Priority queues
Heaps, Binomial heaps, Fibonacci heaps
splay trees
union-find.
Red black trees
Maximum flow
Dynamic arrays / hash tables

102
Difference between amortized and average cost

To do averages we need to use probability
For amortized analysis no such assumptions are
needed
We compute the average cost per operation for any
mix of n operations

103
Amortized Analysis

Consider not just one operation, but a sequence
of operations on a given data structure.
Average cost over a sequence of operations.
Probabilistic analysis:
Average-case running time: average over all
possible inputs for one algorithm (operation).
If using probability, it is called expected running
time.
Amortized analysis:
No involvement of probability.
Average performance on a sequence of operations,
even if some operations are expensive.
Guarantees the average performance of each operation
in the sequence in the worst case.

104
Amortized Analysis

In an amortized analysis, the time required to
perform a sequence of data structure operations
is averaged over all the operations performed.
Amortized analysis differs from average-case
analysis in that probability is not involved; an
amortized analysis guarantees the average
performance of each operation in the worst case.

105
Indirect Solution
10 yards / minute 100 yards / minute
(Source: Mark Allen Weiss, Data Structures and
Algorithm Analysis in Java.)
106
Indirect Solution
10 yards / minute 100 yards / minute
Geometric Series?
The easy solution is indirect. It takes a kitten
5 minutes to go 50 yards; how far can the mother
go in 5 minutes? 500 yards!
(Source: Mark Allen Weiss, Data Structures and
Algorithm Analysis in Java.)
107
Amortized Analysis

The worst-case running time is not always the
same as the worst possible average running time.
Example
Worst case-time is O(n)
Amortized worst-case is O(1)
This could be from a series of table inserts and
clears

(Source: Arup Guha, CS2 Notes, Summer 2007.)
108
Binomial Queue

Binomial Trees
B0 B1 B2
B3
B4

Each tree doubles the previous.
109
Binomial Queue

Binomial Queue
A queue of binomial trees. A forest.
Each tree is essentially a heap constrained to
the format of a binomial tree.
Example: B0, B2, B3
Insertion: create a B0 and merge.
Deletion: remove the minimum (the root) from tree Bk.
This leaves a queue of trees B0, B1, ..., Bk−1.
Merge this queue with the rest of the original
queue.

(Source: Mark Allen Weiss, Data Structures and
Algorithm Analysis in Java.)
110
Binomial Queue Example

The most important step is Merge.
Merge rules (for two binomial queues):
0 or 1 Bk trees → just leave merged
2 Bk trees → meld into 1 Bk+1 tree
3 Bk trees → meld two into 1 Bk+1, and leave the
third.
Insertion runtime:
M + 1 steps, where BM is the smallest tree not present.
Worst case is k + 2, when Bk+1 is the smallest tree
not present. How does k relate to the total
number of nodes in the tree?
k = lg n, thus the (non-amortized) worst-case time is
O(lg n).

111
Binomial Queue Example

MakeBinQ problem: Build a binomial queue of N
elements (like makeHeap). How long should this
take?
Insertion worst-case runtime:
Worst case O(lg n) for 1 insert → O(n lg n) for n
inserts, but we want O(n), like makeHeap.
Try amortized analysis directly:
Consider each linking step of the merge. The
1st, 3rd, 5th, etc. (odd) insertions require no linking
step because there will be no B0. So ½ of all
insertions require no linking; similarly, ¼
require 1 linking step.
We could continue down this path, but it will
become especially difficult for deletion (we
should learn an indirect analysis).

112
Binomial Queue Example

Indirect analysis (time = M + 1)
If no B0 → cost is 1 (M is 0)
Results in 1 B0 tree added to the forest
If B0 but no B1 → cost is 2 (M is 1)
Results in the same # of trees (new B1 but B0 is
gone)
When cost is 3 (M is 2)
Results in 1 less tree (new B2 but remove B0 and
B1)
... etc.
When cost is c (M is c − 1)
Results in an increase of 2 − c trees

113
Binomial Queue Example

Increase of 2 − c trees
How can we use this?
Ti = # of trees after the ith iteration
T0 = 0 = # of trees initially
Ci = cost of the ith iteration
Then, Ti = Ti−1 + (2 − Ci) ⇒
Ci + (Ti − Ti−1) = 2
This is only the ith iteration.

114
Binomial Queue Example

Ci + (Ti − Ti−1) = 2
To get all iterations:
C1 + (T1 − T0) = 2
C2 + (T2 − T1) = 2
...
Cn−1 + (Tn−1 − Tn−2) = 2
Cn + (Tn − Tn−1) = 2
Summing: Σ_{i=1}^{n} Ci + (Tn − T0) = 2n (T1 ... Tn−1 cancel out)

115
Binomial Queue Example

Σ_{i=1}^{n} Ci + (Tn − T0) = 2n
T0 = 0 and Tn is definitely not negative, so Tn −
T0 is not negative.
⇒ Σ_{i=1}^{n} Ci ≤ 2n

Thus, the total cost ≤ 2n ⇒ makeBinQ is
O(n). Since makeBinQ consists of O(n) inserts,
the amortized worst-case cost of each insert is
O(1).
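
The bookkeeping above can be simulated without building any trees. The sketch below relies on an observation that is not stated in the slides: the set of trees in a binomial queue with n elements mirrors the binary representation of n, so an insert behaves like a binary increment. The function name and return format are illustrative.

```python
def build_costs(n):
    """Simulate n inserts; return per-insert costs C_i and tree counts T_0..T_n."""
    present = 0                          # bit k is set when a B_k tree is in the forest
    costs, trees = [], [0]
    for _ in range(n):
        m = 0
        while present & (1 << m):        # M = index of the smallest tree not present
            m += 1
        costs.append(m + 1)              # an insertion takes M + 1 steps
        present += 1                     # the merge behaves like a binary increment
        trees.append(bin(present).count("1"))
    return costs, trees

costs, trees = build_costs(8)
print(costs)        # [1, 2, 1, 3, 1, 2, 1, 4]
print(sum(costs))   # 15, which is below 2 * 8
```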
116
Amortized Analysis

There are three common techniques used in
amortized analysis:
Aggregate method
Accounting method
Potential method

117
1. Aggregate method

We show that a sequence of n operations takes
worst-case time T(n) in total.
In the worst case, the average cost, or
amortized cost, per operation is therefore
T(n) / n.
In the aggregate method, all operations have the
same amortized cost.
The other two methods, the accounting method and
the potential function method, may assign
different amortized costs to different types of
operations.

118
An example push and pop

A sequence of operations: OP1, OP2, ..., OPm
OPi: several pops (from the stack) and
one push (onto the stack)
ti: time spent by OPi
The average time per operation: t_ave = (1/m) Σ_{i=1}^{m} ti

119

Example: a sequence of push and pop
p: pop, u: push

t_ave = (1 + 1 + 3 + 1 + 1 + 1 + 3 + 2)/8 = 13/8
= 1.625
120

Another example: a sequence of push and pop
p: pop, u: push

t_ave = (1 + 2 + 1 + 1 + 1 + 1 + 6 + 1)/8 = 14/8
= 1.75
121
Amortized Analysis

Example: Stack operations
As we know, a normal stack is a data structure
with operations
Push: Insert a new element at the top of the stack
Pop: Delete the top element from the stack
A stack can easily be implemented (using a linked
list) such that Push and Pop take O(1) time.
Consider the addition of another operation:
Multipop(k): Pop k elements off the stack.
Analysis of a sequence of n operations:
Naive bound: one Multipop can take O(n) time ⇒ O(n^2) running
time.
Amortized bound: the amortized running time of each operation is O(1)
⇒ O(n) running time.
Each element can be popped at most once each time
it is pushed.
The number of Pop operations (including the ones done
by Multipop) is bounded by n.
The total cost of n operations is O(n).
The amortized cost of one operation is O(n)/n = O(1).
Notice that no probability is involved.
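
The aggregate argument can be observed by counting elementary steps. The class below is a small sketch with hypothetical names: it charges one unit per element pushed or popped, so after any n operations the total cost should be at most 2n (at most n pushes, and each pushed element is popped at most once).

```python
class CountingStack:
    """Stack that counts every elementary push/pop step."""

    def __init__(self):
        self.items = []
        self.cost = 0                # total number of elementary steps so far

    def push(self, x):
        self.items.append(x)
        self.cost += 1

    def pop(self):
        self.cost += 1
        return self.items.pop()

    def multipop(self, k):
        """Pop up to k elements; costs one unit per element actually popped."""
        popped = []
        while self.items and len(popped) < k:
            popped.append(self.pop())
        return popped

s = CountingStack()
for i in range(100):
    s.push(i)                        # 100 operations, cost 100
s.multipop(1000)                     # 1 operation, cost 100
print(s.cost)                        # 200 for 101 operations
```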

122
The Accounting Method

Assign different charges to different operations.
Some are charged more than their actual cost, some
are charged less.
Amortized cost = the amount we charge.
When amortized cost > actual cost, store the
difference on specific objects in the data structure as credit.
Use the credit later to pay for operations whose
actual cost > amortized cost.
The credit must never go negative.

123
The Accounting Method
124
Stack with Multi-Pop Operations
125
Potential Method

Like the accounting method, but credit is stored with the
entire structure.
The accounting method stores credit with specific
objects.
The potential method stores potential in the data
structure as a whole.
Potential can be released to pay for future
operations.
It is the most flexible of the amortized analysis
methods.

126
Potential Method
127
Potential Method
128
Stack with Multi-Pop Operations
129
Skew heaps

meld: merge + swapping

Two skew heaps
Step 1 Merge the right paths. 5 right heavy nodes
130
Step 2 Swap the children along the right path.
No right heavy node
131
Amortized analysis of skew heaps

meld: merge + swapping
Operations on a skew heap:
find-min(h): find the min of a skew heap h.
insert(x, h): insert x into a skew heap h.
delete-min(h): delete the min from a skew heap h.
meld(h1, h2): meld two skew heaps h1 and h2.
The first three operations can be implemented
by melding.

132
Potential function of skew heaps

wt(x): # of descendants of node x, including x.
heavy node x: wt(x) > wt(p(x))/2, where
p(x) is the parent node of x.
light node: not a heavy node
potential function Φi: # of right heavy nodes of
the skew heap.

133

Any path in an n-node tree contains at most
⌊log2 n⌋ light nodes.

# light nodes ≤ ⌊log2 n⌋
# heavy = k3 ≤ ⌊log2 n⌋ possible heavy nodes
# of nodes: n

The number of right heavy nodes attached to the
left path is at most ⌊log2 n⌋.

134
Amortized time
# light ≤ ⌊log2 n1⌋, # heavy = k1
# light ≤ ⌊log2 n2⌋, # heavy = k2
135
136
AVL-trees
Height balance of node v: hb(v) = height of right
subtree − height of left subtree
137

Add a new node A.

Before insertion, hb(B) = hb(C) = hb(E) = 0 and hb(I) ≠ 0:
the first nonzero from the leaves.
138
Amortized analysis of AVL-trees

Consider a sequence of m insertions on an empty
AVL-tree.
T0: an empty AVL-tree.
Ti: the tree after the ith insertion.
Li: the length of the critical path involved in
the ith insertion.
X1: total # of balance factors changing from 0 to
+1 or −1 during these m insertions (rebalancing
cost)

139
Case 1 Absorption

The tree height is not increased, we need not
rebalance it.

Val(Ti) = Val(Ti−1) + (Li − 1)
140
Case 2 Rebalancing the tree
141
Case 2.1 single rotation

After a right rotation on the sub tree rooted at
A

Val(Ti) = Val(Ti−1) + (Li − 2)
142
Case 2.2 double rotation
143
Case 2.2 double rotation

After a left rotation on the sub tree rooted at B
and a right rotation on the sub tree rooted at A

Val(Ti) = Val(Ti−1) + (Li − 2)
144
Case 3 Height increase

Li is the height of the root.

Val(Ti) = Val(Ti−1) + Li
145
Amortized analysis of X1
146
Example Two-Stack System

Suppose there are two stacks called A and B,
manipulated by the following operations:
push(S, d): Push a datum d onto stack S.
Real cost = 1.
multi-pop(S, k): Removes min(k, |S|) elements
from stack S.
Real cost = min(k, |S|).
transfer(k): Repeatedly pop elements from stack A
and push them onto stack B, until either k
elements have been moved or A is empty.
Real cost = # of elements moved = min(k, |A|).

147
Design

B
A
148
1.Aggregate Method

For a sequence of n operations, at most n data
are pushed onto A or B. Therefore at most
n data are popped from A or B, and at most
n data are transferred from A to B. Thus the total
cost of the n operations is at most 3n, and each
operation can be charged an amortized cost of 3:

Operation         Real cost     Amortized cost
push(A, d)        1             3
push(B, d)        1             3
multi-pop(A, k)   min(k, |A|)   3
multi-pop(B, k)   min(k, |B|)   3
transfer(k)       min(k, |A|)   3
149
Illustration

B
A
150
2.Accounting Method

push(A, d): $3. This pays for the push, a
transfer, and a pop.
push(B, d): $2. This pays for the push and a
pop.
multi-pop(S, k): $0.
transfer(k): $0.
After any n operations you will have 2|A| + |B|
dollars in the bank.
Thus the bank account never goes negative.
Furthermore, the amortized cost of the n
operations is O(n) (more precisely, at most 3n).

151
Comparison
Operation         Real cost     Aggregate   Accounting   Potential
push(A, d)        1             3           3
push(B, d)        1             3           2
multi-pop(A, k)   min(k, |A|)   3           0
multi-pop(B, k)   min(k, |B|)   3           0
transfer(k)       min(k, |A|)   3           0

152
3.Potential Method

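Let Φ(A, B) = 2|A| + |B|. Then
ĉ(push(A, d)) = 1 + ΔΦ = 1 + 2 = 3
ĉ(push(B, d)) = 1 + ΔΦ = 1 + 1 = 2
ĉ(multi-pop(A, k)) = min(k, |A|) + ΔΦ = min(k, |A|) - 2 min(k, |A|) = -min(k, |A|)
ĉ(multi-pop(B, k)) = min(k, |B|) + ΔΦ = min(k, |B|) - min(k, |B|) = 0
ĉ(transfer(k)) = min(k, |A|) + ΔΦ = min(k, |A|) - min(k, |A|) = 0
Σĉ = Σc + ΔΦ, where ΔΦ is the net change in Φ over the whole sequence.
Σc = Σĉ - ΔΦ ≤ Σĉ ≤ 3n = O(n), since ΔΦ ≥ 0 when both stacks start empty.
If we can pick Φ so that Φ(D_i) ≥ Φ(D_0) for all i,
and Σĉ is easy to compute, then Σĉ/n
upper-bounds the average cost.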
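
The potential argument can be checked numerically. The sketch below is an illustration, not part of the original slides: it runs a random sequence of operations on the two-stack system, tracks only the sizes |A| and |B|, and computes each amortized cost as real cost plus the change in Φ = 2|A| + |B|. The operation mix, the range of k, and the seed are arbitrary assumptions.

```python
import random


def phi(a, b):
    # Potential from the slide: 2|A| + |B|.
    return 2 * a + b


def simulate(n_ops, seed=0):
    """Run n_ops random operations; return (total real, total amortized) cost."""
    rng = random.Random(seed)
    a = b = 0  # only the sizes |A| and |B| matter for the costs
    total_real = total_amortized = 0
    for _ in range(n_ops):
        before = phi(a, b)
        op = rng.choice(["pushA", "pushB", "popA", "popB", "transfer"])
        k = rng.randint(1, 5)
        if op == "pushA":
            a += 1
            real = 1
        elif op == "pushB":
            b += 1
            real = 1
        elif op == "popA":
            real = min(k, a)
            a -= real
        elif op == "popB":
            real = min(k, b)
            b -= real
        else:  # transfer: move min(k, |A|) elements from A to B
            real = min(k, a)
            a -= real
            b += real
        amortized = real + phi(a, b) - before
        assert amortized <= 3  # no operation has amortized cost above 3
        total_real += real
        total_amortized += amortized
    return total_real, total_amortized


real, amortized = simulate(1000)
print(real <= amortized <= 3 * 1000)  # True
```

Because both stacks start empty and Φ is never negative, the total real cost cannot exceed the total amortized cost, which in turn is at most 3n.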

153
Summary
Operation         Real cost     Aggregate   Accounting   Potential
push(A, d)        1             3           3            3
push(B, d)        1             3           2            2
multi-pop(A, k)   min(k, |A|)   3           0            -min(k, |A|)
multi-pop(B, k)   min(k, |B|)   3           0            0
transfer(k)       min(k, |A|)   3           0            0

154
Competitive analysis
155
On Bounds

Worst Case.
Average Case: Running time over some distribution
of input. (Quicksort)
Amortized Analysis: Worst-case bound on a sequence
of operations.
Competitive Analysis: Compare the cost of an
on-line algorithm with an optimal prescient (off-line)
algorithm on any sequence of requests.

156
Applications

Resource Allocation
Scheduling
Memory Management
Routing
Robot Motion Planning
Exploring an unknown terrain
Finding a destination
Computational Finance

157
Self-organizing lists

List L of n elements.
The operation ACCESS(x) costs rank_L(x) = distance of
x from the head of L.
L can be reordered by transposing adjacent
elements at a cost of 1.
Example:
Accessing the element with key 14 costs 4.
Transposing 3 and 50 costs 1.

158
On-line and off-line problems

Definition. A sequence S of operations is
provided one at a time. For each operation, an
on-line algorithm A must execute the operation
immediately without any knowledge of future
operations (e.g., Tetris).
An off-line algorithm may see the whole sequence
S in advance.
Goal: Minimize the total cost C_A(S).

159
Worst-case analysis of self-organizing lists

An adversary always accesses the tail (nth)
element of L. Then, for any on-line algorithm A,
we have
C_A(S) = Ω(|S| · n)
in the worst case.

160
Average-case analysis of self-organizing lists

Suppose that element x is accessed with
probability p(x). Then the expected cost of one access is
Σ_{x ∈ L} p(x) · rank_L(x),
which is minimized when L is sorted in decreasing
order with respect to p.
Heuristic: Keep a count of the number of times
each element is accessed, and maintain L in order
of decreasing count.
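
A small brute-force check makes the claim concrete: over all orderings of a three-element list, the expected access cost Σ p(x) · rank_L(x) is smallest when the elements are in decreasing order of probability. The probabilities below are made up for illustration, and expected_cost is a hypothetical helper name.

```python
from itertools import permutations


def expected_cost(order, p):
    # rank is 1-based: the head of the list has rank 1.
    return sum(p[x] * rank for rank, x in enumerate(order, start=1))


# Assumed access probabilities (illustrative only).
p = {"a": 0.5, "b": 0.3, "c": 0.2}

# Try every ordering of the list and keep the cheapest one.
best = min(permutations(p), key=lambda order: expected_cost(order, p))
print(best)  # ('a', 'b', 'c')
```

The frequency-count heuristic on the slide approximates this ordering when p is not known in advance.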

161
The move-to-front heuristic

Practice: Implementers discovered that the
move-to-front (MTF) heuristic empirically produces
good results.
IDEA: After accessing x, move x to the head of L
using transposes:
cost = 2 · rank_L(x).
The MTF heuristic responds well to locality in
the access sequence S.
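
The move-to-front rule can be sketched in a few lines of Python. The function below is an illustration with a hypothetical name (mtf_cost); it charges 2 · rank_L(x) per access as on the slide, which slightly overestimates the exact count of rank_L(x) for the access plus rank_L(x) - 1 transposes.

```python
def mtf_cost(initial, accesses):
    """Serve `accesses` with move-to-front; return (total cost, final list)."""
    lst = list(initial)
    total = 0
    for x in accesses:
        rank = lst.index(x) + 1           # 1-based distance from the head
        total += 2 * rank                 # slide's charge: access + transposes
        lst.insert(0, lst.pop(rank - 1))  # move x to the head of the list
    return total, lst


cost, final = mtf_cost([1, 2, 3, 4], [4, 4, 4, 2])
print(cost, final)  # 18 [2, 4, 1, 3]
```

The repeated accesses to 4 show the locality effect: the first access costs 8, and the next two cost only 2 each.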

162
Competitive analysis

Definition. An on-line algorithm A is
α-competitive if there exists a constant k such
that for any sequence S of operations,
C_A(S) ≤ α · C_OPT(S) + k,
where OPT is the optimal off-line algorithm
("God's algorithm").

163
MTF is O(1)-competitive

Theorem: MTF is 4-competitive for self-organizing
lists.

164
Potential function
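Define the potential function Φ: {L_i} → R by
Φ(L_i) = 2 · |{(x, y) : x precedes y in L_i and y precedes x in L_i*}|
= 2 · (number of inversions),
where L_i is MTF's list and L_i* is OPT's list after operation i.
Φ(L_i) = 2 · |{...}|
Φ(L_i) = 2 · |{(E,C), ...}|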
165
Potential function
Φ(L_i) = 2 · |{(E,C), (E,A), ...}|
Φ(L_i) = 2 · |{(E,C), (E,A), (E,D), ...}|
Φ(L_i) = 2 · |{(E,C), (E,A), (E,D), (E,B), ...}|
166
Potential function
Φ(L_i) = 2 · |{(E,C), (E,A), (E,D), (E,B), (D,B)}|
= 10.
Note that Φ(L_i) ≥ 0 for i = 0, 1, ..., and Φ(L_0) = 0
if MTF and OPT start with the same list. How much
does Φ change from 1 transpose? A transpose
creates or destroys exactly 1 inversion, so ΔΦ = ±2.
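
The inversion count behind the potential can be computed directly. In the sketch below, the two orderings are assumptions chosen so that they reproduce the inversion set listed on the slide, because the figure with the actual lists is not part of this transcript. The function name potential is hypothetical.

```python
from itertools import combinations


def potential(mtf_list, opt_list):
    """Return (2 * number of inversions, list of inverted pairs)."""
    pos = {x: i for i, x in enumerate(opt_list)}
    # combinations() yields pairs (x, y) with x before y in mtf_list;
    # the pair is an inversion when y comes before x in opt_list.
    inversions = [(x, y) for x, y in combinations(mtf_list, 2) if pos[y] < pos[x]]
    return 2 * len(inversions), inversions


# Assumed lists (not given in the transcript).
L_mtf = ["E", "C", "A", "D", "B"]
L_opt = ["C", "A", "B", "D", "E"]

value, pairs = potential(L_mtf, L_opt)
print(value)  # 10
print(pairs)  # [('E', 'C'), ('E', 'A'), ('E', 'D'), ('E', 'B'), ('D', 'B')]
```

Swapping two adjacent elements in either list changes the result by exactly 2, which matches the ±2 change per transpose.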
167
What happens on an access?
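Suppose that operation i accesses element x, and
define
A = {y : y precedes x in L_{i-1} and y precedes x in L_{i-1}*},
B = {y : y precedes x in L_{i-1} and y follows x in L_{i-1}*},
C = {y : y follows x in L_{i-1} and y precedes x in L_{i-1}*},
D = {y : y follows x in L_{i-1} and y follows x in L_{i-1}*},
r = rank of x in L_{i-1} = |A| + |B| + 1,
r* = rank of x in L_{i-1}* = |A| + |C| + 1.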
168
What happens on an access?
169
Amortized cost

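The amortized cost for the ith operation of MTF
with respect to Φ is
ĉ_i = c_i + Φ(L_i) - Φ(L_{i-1})
≤ 2r + 2(|A| - |B| + t_i)
= 2r + 2(|A| - (r - 1 - |A|) + t_i)
(since r = |A| + |B| + 1)
= 2r + 4|A| - 2r + 2 + 2t_i
= 4|A| + 2 + 2t_i
≤ 4(r* + t_i)
(since r* = |A| + |C| + 1 ≥ |A| + 1)
= 4c_i*,
where t_i is the number of transposes OPT performs in operation i
and c_i* = r* + t_i is OPT's cost for that operation.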

170
The grand finale
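
The body of this slide is not in the transcript. A sketch of the standard closing step, assuming the per-operation bound ĉ_i ≤ 4c_i* (with ĉ_i = c_i + Φ(L_i) - Φ(L_{i-1}) the amortized cost of MTF and c_i* the cost of OPT on operation i), Φ(L_0) = 0, and Φ ≥ 0, is to sum over the whole sequence so that the potential terms telescope:

```latex
\begin{align*}
C_{\mathrm{MTF}}(S) &= \sum_{i=1}^{|S|} c_i \\
  &= \sum_{i=1}^{|S|} \bigl(\hat{c}_i + \Phi(L_{i-1}) - \Phi(L_i)\bigr) \\
  &\le \sum_{i=1}^{|S|} 4\,c_i^{*} + \Phi(L_0) - \Phi(L_{|S|}) \\
  &\le 4\,C_{\mathrm{OPT}}(S),
  \qquad \text{since } \Phi(L_0) = 0 \text{ and } \Phi(L_{|S|}) \ge 0 .
\end{align*}
```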
171
Appendix

If we count transpositions that move x toward the
front as "free" (this models splicing x in and out of
L in constant time), then MTF is 2-competitive.
What if L_0 ≠ L_0*?
Then Φ(L_0) might be Θ(n^2) in the worst case.
Thus, C_MTF(S) ≤ 4 · C_OPT(S) + Θ(n^2), which is
still 4-competitive, since n^2 is constant as
|S| → ∞.

172
Assignments

1. Example on the Traveling Salesman Problem.
2. Show that RANDOMIZED-QUICKSORT's expected
running time is Ω(n log n).
3. Examples on competitive analysis.

173
(No Transcript)
174
(No Transcript)
175
Analysis

Worst-case computing times for two sorting
algorithms on random inputs.

N           1000   2000   3000   4000   5000   6000   7000   8000   9000   10000
Merge Sort  105.7  206.4  335.2  422.1  589.9  691.3  794.8  889.5  1067   1167
Quick Sort  41.6   97.1   158.6  244.9  397.8  383.8  497.3  568.9  616.2  738.1

Average-case computing times for two sorting
algorithms on random inputs.

N           1000   2000   3000   4000   5000   6000   7000   8000   9000   10000
Merge Sort  72.8   167.2  275.1  378.5  500.6  607.6  723.4  811.5  949.2  1073.6
Quick Sort  36.6   85.1   138.9  205.7  269.0  339.4  411.0  487.7  556.3  645.2

Comparison of Quick Sort and RQuick Sort on the
input a[i] = i, 1 ≤ i ≤ n; times are in
milliseconds.

N             1000   2000   3000   4000   5000
Quick Sort    195.5  759.2  1728   3165   4829
R Quick Sort  9.4    21.0   30.5   41.6   52.8
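
The gap in the last table can be reproduced in spirit by counting comparisons instead of milliseconds. The sketch below is an illustration, not the code used for the measurements above: it assumes a first-element pivot for the plain variant and a uniformly random pivot for the randomized one, and it uses an explicit stack so that the sorted input does not hit Python's recursion limit.

```python
import random


def quicksort_comparisons(a, randomized, seed=0):
    """Sort a copy of `a` and return the number of element comparisons."""
    rng = random.Random(seed)
    a = list(a)
    comparisons = 0
    stack = [(0, len(a) - 1)]  # explicit stack avoids deep recursion
    while stack:
        lo, hi = stack.pop()
        if lo >= hi:
            continue
        if randomized:
            r = rng.randint(lo, hi)  # random pivot, swapped to the front
            a[lo], a[r] = a[r], a[lo]
        pivot, i = a[lo], lo
        for j in range(lo + 1, hi + 1):  # partition around a[lo]
            comparisons += 1
            if a[j] < pivot:
                i += 1
                a[i], a[j] = a[j], a[i]
        a[lo], a[i] = a[i], a[lo]  # pivot lands in its final position i
        stack.append((lo, i - 1))
        stack.append((i + 1, hi))
    assert a == sorted(a)
    return comparisons


n = 1000
data = list(range(1, n + 1))  # the input a[i] = i
plain = quicksort_comparisons(data, randomized=False)
rand = quicksort_comparisons(data, randomized=True)
print(plain)         # 499500, i.e. n(n-1)/2
print(rand < plain)  # True
```

On already sorted input the first-element pivot always splits off an empty side, giving the quadratic count, while the random pivot avoids that pattern with high probability.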
176
Matrix multiplication
177
(No Transcript)
178
Pivot element