Title: Dynamic Programming
1 Dynamic Programming
2
- The basic idea is drawn from the intuition behind divide and conquer: implicitly explore the space of all possible solutions by decomposing the problem into a series of subproblems, and then building up correct solutions to larger and larger subproblems.
3
- The term "Dynamic Programming" comes from Control Theory, not computer science.
- "Programming" refers to the use of tables (arrays) to construct a solution.
- Used extensively in "Operations Research", taught in the Math dept.
4 The Main Idea
- In dynamic programming we usually reduce time by increasing the amount of space.
- We solve the problem by solving subproblems of increasing size and saving each optimal solution in a table (usually).
- The table is then used for finding the optimal solution to larger problems.
- Time is saved since each subproblem is solved only once.
5 When is Dynamic Programming used?
- Used for problems in which an optimal solution to the original problem can be found from optimal solutions to subproblems of the original problem.
- Usually, a recursive algorithm can solve the problem, but the algorithm computes the optimal solution to the same subproblem more than once and is therefore slow.
- The two examples below (Fibonacci and the binomial coefficient) have such a recursive algorithm.
- Dynamic programming reduces the time by computing the optimal solution of a subproblem only once and saving its value. The saved value is then used whenever the same subproblem needs to be solved.
6 Principle of Optimality (Optimal Substructure)
- The principle of optimality applies to a problem (not an algorithm).
- A large number of optimization problems satisfy this principle.
- Principle of optimality: given an optimal sequence of decisions or choices, each subsequence must also be optimal.
7 Principle of optimality - shortest path problem
- Problem: given a graph G and vertices s and t, find a shortest path in G from s to t.
- Theorem: a subpath P' (from s' to t') of a shortest path P is a shortest path from s' to t' in the subgraph G' induced by P. (Subpaths are paths that start or end at an intermediate vertex of P.)
- Proof: if P' were not a shortest path from s' to t' in G', we could substitute the subpath from s' to t' in P by the shortest path in G' from s' to t'. The result would be a shorter path from s to t than P. This contradicts our assumption that P is a shortest path from s to t.
8 Principle of optimality
- P = (a,b), (b,c), (c,d), (d,e); the subpath P' = (c,d), (d,e).
[Figure: a weighted graph G on vertices a..f; P is a shortest path from a to e, and G' is the subgraph induced by P.]
- P' must be a shortest path from c to e in G', otherwise P cannot be a shortest path from a to e in G.
9 Principle of optimality - MST problem (minimum spanning tree)
- Problem: given an undirected connected graph G, find a minimum spanning tree.
- Theorem: any subtree T' of an MST T of G is an MST of the subgraph G' of G induced by the vertices of the subtree.
- Proof: if T' is not an MST of G', we can substitute the edges of T' in T with the edges of an MST of G'. This would result in a lower-cost spanning tree, contradicting our assumption that T is an MST of G.
10 Principle of optimality
- T = (c,d), (d,f), (d,e), (a,b), (b,c); the subtree T' = (c,d), (d,f), (d,e).
[Figure: a weighted graph G; T is an MST of G, and G' is the subgraph induced by the vertices of T'.]
- T' must be an MST of G', otherwise T cannot be an MST of G.
11 A problem that does not satisfy the Principle of Optimality
- Problem: what is the longest simple route between cities A and B? (Simple: never visit the same spot twice.)
- The longest simple route (solid line) has city C as an intermediate city.
- It does not consist of the longest simple route from A to C plus the longest simple route from C to B.
[Figure: four cities A, B, C, D; the longest simple routes from A to C, from C to B, and from A to B are shown.]
12 Longest Common Subsequence (LCS)
13 Longest Common Subsequence (LCS)
- Problem: given sequences x[1..m] and y[1..n], find a longest common subsequence of both.
- Example: x = ABCBDAB and y = BDCABA
  - BCA is a common subsequence, and
  - BCBA and BDAB are two LCSs.
14 LCS
- Brute force solution
- Writing a recurrence equation
- The dynamic programming solution
- Application of algorithm
15 Brute force solution
- Solution: for every subsequence of x, check whether it is a subsequence of y.
- Analysis:
  - Each subsequence of x corresponds to a subset of the indices {1, 2, ..., m} of x, so there are 2^m subsequences of x.
  - Each check takes O(n) time, since we scan y for the first element, then scan for the second element, etc.
  - The worst-case running time is O(n·2^m).
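A minimal Python sketch of this brute-force approach (function names are mine, not from the slides); it enumerates subsequences of x from longest to shortest and returns the first one that is also a subsequence of y:

```python
from itertools import combinations

def is_subsequence(s, y):
    """Scan y once, looking for the characters of s in order -- O(n)."""
    it = iter(y)
    return all(ch in it for ch in s)

def lcs_brute_force(x, y):
    """Try subsequences of x longest-first; there are 2^m of them in total."""
    for length in range(len(x), -1, -1):
        for idx in combinations(range(len(x)), length):
            candidate = "".join(x[i] for i in idx)
            if is_subsequence(candidate, y):
                return candidate
    return ""
```

Even on the small slide example (m = 7) this checks up to 2^7 = 128 subsequences; the exponential term dominates for any realistic m.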
16 Writing the recurrence equation
- Let Xi denote the i-th prefix x[1..i] of x[1..m], and let X0 denote the empty prefix.
- We will first compute the length of an LCS of Xm and Yn, LenLCS(m, n), and then use information saved during the computation to find the actual subsequence.
- We need a recursive formula for computing LenLCS(i, j).
17 Writing the recurrence equation
- If Xi and Yj end with the same character (xi = yj), the LCS must include that character. If it did not, we could get a longer LCS by appending the common character.
- If Xi and Yj do not end with the same character, there are two possibilities:
  - either the LCS does not end with xi,
  - or it does not end with yj.
- Let Zk denote an LCS of Xi and Yj.
18 Xi and Yj end with xi = yj
Zk-1 is an LCS of Xi-1 and Yj-1, so LenLCS(i, j) = LenLCS(i-1, j-1) + 1
19 Xi and Yj end with xi ≠ yj
Either Zk is an LCS of Xi and Yj-1, or Zk is an LCS of Xi-1 and Yj, so
LenLCS(i, j) = max{ LenLCS(i, j-1), LenLCS(i-1, j) }
20 The recurrence equation
LenLCS(i, j) = 0                                     if i = 0 or j = 0
LenLCS(i, j) = LenLCS(i-1, j-1) + 1                  if i, j > 0 and xi = yj
LenLCS(i, j) = max{ LenLCS(i, j-1), LenLCS(i-1, j) } if i, j > 0 and xi ≠ yj
21 The dynamic programming solution
- Initialize the first row and the first column of the matrix LenLCS to 0.
- Calculate LenLCS(1, j) for j = 1, ..., n, then LenLCS(2, j) for j = 1, ..., n, etc.
- Also store in a table an arrow pointing to the array element that was used in the computation.
- It is easy to see that the computation is O(mn).
22 Example
To find an LCS, follow the arrows; for each diagonal arrow there is a member of the LCS.
[Figure: the filled table of LenLCS values with its arrows.]
23 LCS-Length(X, Y)
m ← length[X]
n ← length[Y]
for i ← 1 to m do c[i, 0] ← 0
for j ← 1 to n do c[0, j] ← 0
24 LCS-Length(X, Y) cont.
for i ← 1 to m do
  for j ← 1 to n do
    if xi = yj
      then c[i, j] ← c[i-1, j-1] + 1; b[i, j] ← "D"
    else if c[i-1, j] ≥ c[i, j-1]
      then c[i, j] ← c[i-1, j]; b[i, j] ← "U"
      else c[i, j] ← c[i, j-1]; b[i, j] ← "L"
return c and b
25 Questions
- How do we find the LCS?
- Do we need b?
- Do we need the whole c table (if we only want the longest length)?
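The LCS-Length algorithm and the arrow-following traceback can be sketched together in Python (a sketch; 0-based strings are mapped onto the 1-based c table):

```python
def lcs(x, y):
    """Return (length of an LCS of x and y, one such LCS)."""
    m, n = len(x), len(y)
    # c[i][j] = length of an LCS of x[:i] and y[:j]; row 0 and column 0 stay 0
    c = [[0] * (n + 1) for _ in range(m + 1)]
    b = [[""] * (n + 1) for _ in range(m + 1)]   # arrows: D, U, L
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
                b[i][j] = "D"
            elif c[i - 1][j] >= c[i][j - 1]:
                c[i][j] = c[i - 1][j]
                b[i][j] = "U"
            else:
                c[i][j] = c[i][j - 1]
                b[i][j] = "L"
    # Follow the arrows from (m, n); each diagonal arrow yields one LCS character
    out, i, j = [], m, n
    while i > 0 and j > 0:
        if b[i][j] == "D":
            out.append(x[i - 1]); i -= 1; j -= 1
        elif b[i][j] == "U":
            i -= 1
        else:
            j -= 1
    return c[m][n], "".join(reversed(out))
```

The traceback also answers the questions above: b is a convenience, since the arrows can be re-derived from c alone, and if only the length is needed, two rows of c suffice.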
26 Fibonacci's Series
27 Fibonacci's Series
- Definition: S0 = 0, S1 = 1, Sn = Sn-1 + Sn-2 for n > 1: 0, 1, 1, 2, 3, 5, 8, 13, 21, ...
- Applying the recursive definition we get:
  fib(n)
  1. if n < 2
  2.   return n
  3. else return (fib(n-1) + fib(n-2))
28 Analysis using Substitution of Recursive Fibonacci
- Let T(n) be the number of additions done by fib(n).
- T(n) = T(n-1) + T(n-2) + 1 for n > 2; T(2) = 1, T(1) = T(0) = 0
- T(n) ∈ O(2^n), T(n) ∈ Ω(2^(n/2))
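The recurrence for T(n) can be checked empirically with a version of fib that also counts its additions (a sketch; fib_count is my name, not from the slides):

```python
def fib_count(n):
    """Return (fib(n), number of additions performed by recursive fib)."""
    if n < 2:
        return n, 0
    a, ca = fib_count(n - 1)
    b, cb = fib_count(n - 2)
    return a + b, ca + cb + 1   # one addition combines the two results
```

For n = 5 this reports 7 additions, matching T(5) = T(4) + T(3) + 1 = 4 + 2 + 1.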
29 What does the Execution Tree look like?
Fib(5)
├── Fib(4)
│   ├── Fib(3)
│   │   ├── Fib(2)
│   │   │   ├── Fib(1)
│   │   │   └── Fib(0)
│   │   └── Fib(1)
│   └── Fib(2)
│       ├── Fib(1)
│       └── Fib(0)
└── Fib(3)
    ├── Fib(2)
    │   ├── Fib(1)
    │   └── Fib(0)
    └── Fib(1)
Fib(3) is computed twice, Fib(2) three times, and Fib(1) five times.
30 Dynamic Programming Solution for Fibonacci
- Builds a table with the first n Fibonacci numbers:
  fib(n)
  1. A[0] ← 0
  2. A[1] ← 1
  3. for i ← 2 to n
  4.   do A[i] ← A[i-1] + A[i-2]
  5. return A[n]
- Is there a recurrence equation?
- What is the run time?
- What are the space requirements?
- If we only need the nth number, can we save space?
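The table version above, and an answer to the space question, sketched in Python (a sketch under the assumption that only the nth number is needed):

```python
def fib_table(n):
    """Bottom-up table version -- O(n) time, O(n) space."""
    A = [0] * (n + 1)
    if n >= 1:
        A[1] = 1
    for i in range(2, n + 1):
        A[i] = A[i - 1] + A[i - 2]
    return A[n]

def fib_two_vars(n):
    """Only A[i-1] and A[i-2] are ever read, so two variables
    suffice -- O(n) time, O(1) space."""
    if n == 0:
        return 0
    prev, cur = 0, 1
    for _ in range(2, n + 1):
        prev, cur = cur, prev + cur
    return cur
```

Each table entry is computed once from the recurrence A[i] = A[i-1] + A[i-2], so the run time is Θ(n) additions.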
31 Binomial Coefficient
32 The Binomial Coefficient
C(n, k) = n! / (k!(n-k)!), the number of k-element subsets of an n-element set.
33 The recursive algorithm
binomialCoef(n, k)
1. if k = 0 or k = n
2.   then return 1
3. else return (binomialCoef(n-1, k-1) + binomialCoef(n-1, k))
34 The Call Tree
[Figure: the call tree of binomialCoef(n, k). The root C(n, k) calls C(n-1, k-1) and C(n-1, k); each of these calls two instances at level n-2, and so on, so subproblems such as C(n-2, k-1) and C(n-3, k-1) and C(n-3, k-2) are recomputed many times.]
35 Dynamic Solution
- Use a matrix B of n+1 rows and k+1 columns, where B[n, k] = C(n, k).
- Establish a recursive property, rewritten in terms of the matrix B:
  B[i, j] = B[i-1, j-1] + B[i-1, j]   for 0 < j < i
  B[i, j] = 1                         for j = 0 or j = i
- Solve all smaller instances of the problem in a bottom-up fashion by computing the rows of B in sequence, starting with the first row.
36 The B Matrix
      j: 0  1  2  3  4 ... k
i = 0    1
i = 1    1  1
i = 2    1  2  1
i = 3    1  3  3  1
i = 4    1  4  6  4  1
...
i = n
Each interior entry is B[i, j] = B[i-1, j-1] + B[i-1, j].
37 Compute B[4, 2]
- Row 0: B[0,0] = 1
- Row 1: B[1,0] = 1, B[1,1] = 1
- Row 2: B[2,0] = 1, B[2,1] = B[1,0] + B[1,1] = 2, B[2,2] = 1
- Row 3: B[3,0] = 1, B[3,1] = B[2,0] + B[2,1] = 3, B[3,2] = B[2,1] + B[2,2] = 3
- Row 4: B[4,0] = 1, B[4,1] = B[3,0] + B[3,1] = 4, B[4,2] = B[3,1] + B[3,2] = 6
38 Dynamic Program
bin(n, k)
1. for i ← 0 to n                  // every row
2.   for j ← 0 to minimum(i, k)
3.     if j = 0 or j = i           // column 0 or diagonal
4.       then B[i, j] ← 1
5.       else B[i, j] ← B[i-1, j-1] + B[i-1, j]
6. return B[n, k]
- What is the run time?
- How much space does it take?
- If we only need the last value, can we save space?
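A Python sketch of bin(n, k), plus a single-row variant addressing the space question (the row is updated right to left so that the old B[i-1, j-1] is still available when B[i, j] is computed):

```python
def binomial(n, k):
    """Bottom-up Pascal's triangle; only the first min(i, k) + 1
    entries of each row are ever needed."""
    B = [[0] * (k + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(min(i, k) + 1):
            if j == 0 or j == i:            # column 0 or diagonal
                B[i][j] = 1
            else:
                B[i][j] = B[i - 1][j - 1] + B[i - 1][j]
    return B[n][k]

def binomial_row(n, k):
    """Space-saving variant: keep one row of k+1 entries,
    updating right-to-left."""
    row = [0] * (k + 1)
    row[0] = 1
    for i in range(1, n + 1):
        for j in range(min(i, k), 0, -1):
            row[j] += row[j - 1]            # row[j-1] still holds B[i-1, j-1]
    return row[k]
```

Both run in Θ(nk) time; the second uses Θ(k) space instead of Θ(nk).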
39 Dynamic programming
- All values in column 0 are 1.
- All values in the first k+1 diagonal cells are 1.
- The conditions j ≠ i and 0 < j ≤ min{i, k} ensure we only compute B[i, j] for j < i, and only in the first k+1 columns.
- Elements above the diagonal (B[i, j] for j > i) are not computed, since C(i, j) is undefined for j > i.
40 Number of iterations
Rows 0..k contribute 1 + 2 + ... + (k+1) iterations and each of rows k+1..n contributes k+1 iterations, so the total is (k+1)(k+2)/2 + (n-k)(k+1) ∈ Θ(nk).
41 Floyd's Algorithm
42 All pairs shortest path
- The problem: find the shortest path between every pair of vertices of a graph (expensive using a brute-force approach).
- The graph may contain negative edges but no negative cycles.
- Representation: a weight matrix where W(i,j) = 0 if i = j, W(i,j) = ∞ if there is no edge between i and j, and W(i,j) = the weight of the edge otherwise.
- Note: we have shown that the principle of optimality applies to shortest path problems.
43 The weight matrix and the graph
[Figure: a directed graph on vertices v1..v5 with weighted edges, shown next to its weight (adjacency) matrix.]
44 The subproblems
- How can we define the shortest distance d[i,j] in terms of smaller problems?
- One way is to restrict the paths to only include intermediate vertices from a restricted subset.
- Initially, the subset is empty.
- Then, it is incrementally increased until it includes all the vertices.
45 The subproblems
- Let D(k)[i,j] = the weight of a shortest path from vi to vj using only vertices from {v1, v2, ..., vk} as intermediate vertices in the path.
- D(0) = W
- D(n) = D, which is the goal matrix.
- How do we compute D(k) from D(k-1)?
46 The Recursive Definition
- Case 1: a shortest path from vi to vj restricted to using only vertices from {v1, ..., vk} as intermediate vertices does not use vk. Then D(k)[i,j] = D(k-1)[i,j].
- Case 2: a shortest path from vi to vj restricted to using only vertices from {v1, ..., vk} as intermediate vertices does use vk. Then D(k)[i,j] = D(k-1)[i,k] + D(k-1)[k,j].
[Figure: a path from vi to vj through vk; each of the two segments is a shortest path using only intermediate vertices from {v1, ..., vk-1}.]
47 The recursive definition
- Since D(k)[i,j] = D(k-1)[i,j] or D(k)[i,j] = D(k-1)[i,k] + D(k-1)[k,j], we conclude:
  D(k)[i,j] = min{ D(k-1)[i,j], D(k-1)[i,k] + D(k-1)[k,j] }
48 The pointer array P
- Used to enable finding a shortest path.
- Initially the array contains 0.
- Each time a shorter path from i to j is found, the k that provided the minimum is saved (the highest-index node on the path from i to j).
- To print the intermediate nodes on the shortest path, a recursive procedure that prints the shortest paths from i to k and from k to j can be used.
49 Floyd's Algorithm Using n+1 D matrices
Floyd   // Computes the shortest distance between all pairs of nodes,
        // and saves P to enable finding shortest paths
1. D(0) ← W                // initialize D array to W
2. P ← 0                   // initialize P array to 0
3. for k ← 1 to n
4.   do for i ← 1 to n
5.     do for j ← 1 to n
6.       if (D(k-1)[i, j] > D(k-1)[i, k] + D(k-1)[k, j])
7.         then D(k)[i, j] ← D(k-1)[i, k] + D(k-1)[k, j]
8.              P[i, j] ← k
9.         else D(k)[i, j] ← D(k-1)[i, j]
50 Example
[Figure: the weight matrix W = D(0) of a 3-vertex example graph (entries without an edge are ∞) and the initial P matrix of all zeros.]
51 k = 1: Vertex 1 can be intermediate node
- D(1)[2,3] = min( D(0)[2,3], D(0)[2,1] + D(0)[1,3] ) = min(∞, 7) = 7
- D(1)[3,2] = min( D(0)[3,2], D(0)[3,1] + D(0)[1,2] ) = min(-3, ∞) = -3
[Figure: the matrices D(0), D(1), and P after this step.]
52 k = 2: Vertices 1, 2 can be intermediate
- D(2)[1,3] = min( D(1)[1,3], D(1)[1,2] + D(1)[2,3] ) = min(5, 4+7) = 5
- D(2)[3,1] = min( D(1)[3,1], D(1)[3,2] + D(1)[2,1] ) = min(∞, -3+2) = -1
[Figure: the matrices D(1), D(2), and P after this step.]
53 k = 3: Vertices 1, 2, 3 can be intermediate
- D(3)[1,2] = min( D(2)[1,2], D(2)[1,3] + D(2)[3,2] ) = min(4, 5+(-3)) = 2
- D(3)[2,1] = min( D(2)[2,1], D(2)[2,3] + D(2)[3,1] ) = min(2, 7+(-1)) = 2
[Figure: the matrices D(2), D(3), and P after this step.]
54 Floyd's Algorithm Using 2 D matrices
Floyd
1. D' ← W                  // initialize D' array to W
2. P ← 0                   // initialize P array to 0
3. for k ← 1 to n          // computing D from D'
4.   do for i ← 1 to n
5.     do for j ← 1 to n
6.       if (D'[i, j] > D'[i, k] + D'[k, j])
7.         then D[i, j] ← D'[i, k] + D'[k, j]
8.              P[i, j] ← k
9.         else D[i, j] ← D'[i, j]
10.  move D to D' (at the end of each iteration of k)
55 Can we use only one D matrix?
- D[i,j] depends only on elements in the kth column and kth row of the distance matrix.
- We will show that the kth row and the kth column of the distance matrix are unchanged when D(k) is computed.
- This means D can be calculated in place.
56 The main diagonal values
- Before we show that the kth row and column of D remain unchanged, we show that the main diagonal remains 0:
  D(k)[j,j] = min{ D(k-1)[j,j], D(k-1)[j,k] + D(k-1)[k,j] } = min{ 0, D(k-1)[j,k] + D(k-1)[k,j] } = 0
- Based on which assumption? (That the graph has no negative cycles.)
57 The kth column
- The kth column of D(k) is equal to the kth column of D(k-1).
- Intuitively true: a path from i to k will not become shorter by adding k to the allowed subset of intermediate vertices.
- For all i:
  D(k)[i,k] = min{ D(k-1)[i,k], D(k-1)[i,k] + D(k-1)[k,k] } = min{ D(k-1)[i,k], D(k-1)[i,k] + 0 } = D(k-1)[i,k]
58 The kth row
- The kth row of D(k) is equal to the kth row of D(k-1).
- For all j:
  D(k)[k,j] = min{ D(k-1)[k,j], D(k-1)[k,k] + D(k-1)[k,j] } = min{ D(k-1)[k,j], 0 + D(k-1)[k,j] } = D(k-1)[k,j]
59 Floyd's Algorithm using a single D
Floyd
1. D ← W                   // initialize D array to W
2. P ← 0                   // initialize P array to 0
3. for k ← 1 to n
4.   do for i ← 1 to n
5.     do for j ← 1 to n
6.       if (D[i, j] > D[i, k] + D[k, j])
7.         then D[i, j] ← D[i, k] + D[k, j]
8.              P[i, j] ← k
Time complexity: T(n) = n·n·n ∈ Θ(n^3)
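The single-matrix algorithm can be sketched in Python; vertices are numbered 1..n as in the slides, so row and column 0 of each matrix are unused. The 3-vertex weight matrix below is my reading of the worked example on slides 50-53, used only as a smoke test:

```python
INF = float("inf")

def floyd(W):
    """In-place Floyd on a copy of W.  P[i][j] is the highest-index
    intermediate vertex on the shortest i->j path (0 = none)."""
    n = len(W) - 1
    D = [row[:] for row in W]                    # work on a copy of W
    P = [[0] * (n + 1) for _ in range(n + 1)]
    for k in range(1, n + 1):
        for i in range(1, n + 1):
            for j in range(1, n + 1):
                if D[i][k] + D[k][j] < D[i][j]:
                    D[i][j] = D[i][k] + D[k][j]
                    P[i][j] = k
    return D, P
```

On the assumed example (edges 1→2 of weight 4, 1→3 of weight 5, 2→1 of weight 2, 3→2 of weight -3) this reproduces D(3)[1,2] = 2 and D(3)[2,1] = 2 from slide 53.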
60 Printing intermediate nodes on shortest path from q to r
path(index q, r)
  if (P[q, r] ≠ 0)
    path(q, P[q, r])
    println("v" + P[q, r])
    path(P[q, r], r)
    return
  else return             // no intermediate nodes
Before calling path, check that D[q, r] < ∞ and print node q; after the call to path, print node r.
61 The graph in the Floyd example
[Figure: the example graph, on vertices v1..v6, used for Floyd's algorithm.]
62 The final distance matrix D and P
[Figure: the final D matrix; the values in parentheses, e.g. 5(1), are the non-zero P values.]
63 The call tree for Path(1, 4)
Path(1, 4)
├── Path(1, 6)       // P(1, 6) = 0
├── print v6
└── Path(6, 4)
    ├── Path(6, 3)   // P(6, 3) = 0
    ├── print v3
    └── Path(3, 4)   // P(3, 4) = 0
The intermediate nodes on the shortest path from 1 to 4 are v6 and v3. The shortest path is v1, v6, v3, v4.
64 An alternative computation of P
- Initialization:
  - P[i, j] = i if there is an edge from i to j, to indicate that i is the last node before j on the shortest path from i to j.
  - P[i, j] is initialized to NIL if there is no edge from i to j.
- Update:
  - P[i, j] ← P[k, j] if D[i, j] > D[i, k] + D[k, j]. This guarantees P[i, j] contains the last node before j on the shortest path from i to j.
65 findPath(i, j)
if P[i, j] = NIL return "No path"
create empty stack S
push j on stack S
k ← j                       // k is the last node on the path
while P[i, k] ≠ i
  push P[i, k] on stack S   // push the node before k
  k ← P[i, k]               // the node before k is the new last node
push i on stack S
return S
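A Python sketch of findPath together with the alternative P computation of slide 64 (None plays the role of NIL; the 3-vertex test matrix is my reading of the earlier worked example):

```python
INF = float("inf")

def floyd_last_node(W):
    """Floyd with the slide-64 pointer array: P[i][j] is the last
    node before j on the shortest i->j path (None = NIL)."""
    n = len(W) - 1
    D = [row[:] for row in W]
    P = [[None] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            if i != j and D[i][j] != INF:
                P[i][j] = i                  # direct edge: i precedes j
    for k in range(1, n + 1):
        for i in range(1, n + 1):
            for j in range(1, n + 1):
                if D[i][k] + D[k][j] < D[i][j]:
                    D[i][j] = D[i][k] + D[k][j]
                    P[i][j] = P[k][j]        # last node before j comes from the k->j path
    return D, P

def find_path(P, i, j):
    """Walk the P pointers back from j to i, collecting nodes on a stack."""
    if P[i][j] is None:
        return None                          # no path
    stack = [j]
    k = j
    while P[i][k] != i:
        k = P[i][k]                          # the node before k
        stack.append(k)
    stack.append(i)
    return stack[::-1]                       # read off from i to j
```

Unlike the highest-intermediate-vertex P, this pointer array yields the path iteratively, with no recursion.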
66 Optimal Binary Search Tree
67 Optimal Binary Search Trees
- Problem:
  - Given a sequence K = k1 < k2 < ... < kn of n sorted keys, with a search probability pi for each key ki.
  - We want to build a binary search tree (BST) with minimum expected search cost.
- The actual cost is the number of items examined: for key ki, cost = depthT(ki) + 1, where depthT(ki) is the depth of ki in BST T.
- This minimizes the expected search time for the given probability distribution.
68 Optimal Binary Search Tree
Average number of comparisons:
  E[search cost in T] = Σ_{i=1..n} (depthT(ki) + 1) · pi
69 Optimal Binary Search Tree
Let T be a binary search tree that contains ki, ki+1, ..., kj for some 1 ≤ i ≤ j ≤ n. We define the cost (average search time) of T as
  cost(T) = Σ_{l=i..j} (depthT(kl) + 1) · pl
Recurrence relation: let TL and TR be the left and right subtrees of T. Every key in TL and TR is one level deeper in T than in the subtree itself, so
  cost(T) = cost(TL) + cost(TR) + Σ_{l=i..j} pl
70 Optimal Binary Search Tree
- Principle of optimality applied to optimal binary search trees:
- Let T be a binary search tree that has minimum cost among all trees containing keys ki, ki+1, ..., kj, and let km be the key at the root of T (so i ≤ m ≤ j). Then TL, the left subtree of T, is a binary search tree that has minimum cost among all trees containing keys ki, ..., km-1, and TR, the right subtree of T, is a binary search tree that has minimum cost among all trees containing keys km+1, ..., kj.
71 Expected Search Cost
Sum of probabilities is 1.
72 Example
- Consider 5 keys with these search probabilities: p1 = 0.25, p2 = 0.2, p3 = 0.05, p4 = 0.2, p5 = 0.3.
[Figure: a BST with k2 at the root, k1 as its left child, k4 as its right child, and k3, k5 as the children of k4.]

  i   depthT(ki)   depthT(ki)·pi
  1   1            0.25
  2   0            0
  3   2            0.1
  4   1            0.2
  5   2            0.6
                   sum = 1.15

Therefore, E[search cost] = 1.15 + 1 = 2.15.
73 Example
- p1 = 0.25, p2 = 0.2, p3 = 0.05, p4 = 0.2, p5 = 0.3.

  i   depthT(ki)   depthT(ki)·pi
  1   1            0.25
  2   0            0
  3   3            0.15
  4   2            0.4
  5   1            0.3
                   sum = 1.10

Therefore, E[search cost] = 1.10 + 1 = 2.10.
This tree turns out to be optimal for this set of keys.
74 Example
- Observations:
  - The optimal BST may not have the smallest height.
  - The optimal BST may not have the highest-probability key at the root.
- Build by exhaustive checking?
  - Construct each n-node BST.
  - For each, assign keys and compute the expected search cost.
  - But there are Ω(4^n / n^(3/2)) different BSTs with n nodes.
75 Optimal Substructure
- One of the keys in ki, ..., kj, say kr, where i ≤ r ≤ j, must be the root of an optimal subtree for these keys.
  - The left subtree of kr contains ki, ..., kr-1.
  - The right subtree of kr contains kr+1, ..., kj.
- To find an optimal BST:
  - Examine all candidate roots kr, for i ≤ r ≤ j.
  - Determine all optimal BSTs containing ki, ..., kr-1 and containing kr+1, ..., kj.
[Figure: kr at the root, with subtrees holding ki..kr-1 and kr+1..kj.]
76 Recursive Solution
- Find an optimal BST for ki, ..., kj, where i ≥ 1, j ≤ n, and j ≥ i-1. When j = i-1, the tree is empty.
- Define A[i, j] = expected search cost of an optimal BST for ki, ..., kj.
- If j = i-1, then A[i, j] = 0.
- If j ≥ i:
  - Select a root kr, for some i ≤ r ≤ j.
  - Recursively make optimal BSTs for ki, ..., kr-1 as the left subtree and for kr+1, ..., kj as the right subtree.
77 Computing an Optimal Solution
- For each subproblem (i, j), store:
  - the expected search cost in a table A[1..n+1, 0..n]; only entries A[i, j] with j ≥ i-1 are used;
  - Root[i, j] = the root of a subtree with keys ki, ..., kj, for 1 ≤ i ≤ j ≤ n.
- P[1..n+1, 0..n] holds sums of probabilities:
  - P[i, i-1] = 0 for 1 ≤ i ≤ n.
  - P[i, j] = P[i, j-1] + pj for 1 ≤ i ≤ j ≤ n.
- The recurrence is A[i, j] = min over i ≤ r ≤ j of (A[i, r-1] + A[r+1, j]) + P[i, j].
- The optimal solution is:
  - A[1, n] (minimum expected search time),
  - Root[1, n] = R (1 ≤ R ≤ n),
  - an optimal binary search tree built with root kR on keys k1, ..., kn.
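A bottom-up sketch in Python (for brevity the probability sums are recomputed with sum() instead of kept in the P table):

```python
INF = float("inf")

def optimal_bst(p):
    """Bottom-up optimal BST.  p[1..n] are the key probabilities
    (p[0] is a dummy).  A[i][j] = expected search cost for keys i..j,
    with A[i][i-1] = 0 for the empty tree.  Returns (A[1][n], root)."""
    n = len(p) - 1
    A = [[0.0] * (n + 2) for _ in range(n + 2)]
    root = [[0] * (n + 2) for _ in range(n + 2)]
    for length in range(1, n + 1):            # subproblem size
        for i in range(1, n - length + 2):
            j = i + length - 1
            psum = sum(p[i:j + 1])            # P[i, j]
            best, best_r = INF, i
            for r in range(i, j + 1):         # candidate roots
                cost = A[i][r - 1] + A[r + 1][j]
                if cost < best:
                    best, best_r = cost, r
            A[i][j] = best + psum
            root[i][j] = best_r
    return A[1][n], root
```

On the probabilities of slides 72-73 it returns expected cost 2.10, matching the tree that the slides identify as optimal. The three nested loops give Θ(n^3) time.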
78 Memoization
79 Idea
- Memoization is used for recursive code that solves subproblems more than once.
- The idea is to modify the recursive code and use a table to store the solution for every subproblem that was already solved.
- We can still use the recursive calls, with more efficiency.
80 Initialization
- Before you can apply the recursive code, you need to initialize the table to some "impossible value".
- For example, if the function can have only positive values, you can initialize the table to negative values.
81 Method
- The recursive code is changed to first check the table, to determine whether the current problem has already been solved.
- If the value in the table is still the initial value, you know that the current problem has not yet been solved, so you apply the modified recursive code and store the solution in the table.
- Otherwise, the stored value is the required solution; use or return it directly.
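In Python this check-the-table boilerplate can be sketched with functools.lru_cache, which stores each solved subproblem in a dictionary ("not present" plays the role of the impossible value):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    """The unmodified recursive definition; the cache supplies the table."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)
```

Each fib(i) is computed once, so the exponential recursion becomes linear; the hand-written table versions in the examples below make the same mechanism explicit.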
82 Example 1
- The following code is a memoized version of recursive Fibonacci:
  MemoizedFib(n)
    for i ← 0 to n
      A[i] ← -1          // initialize the array
    return LookupFib(n)

  LookupFib(n)           // compute f(n) if not yet done
    if A[n] = -1         // initial value
      if n < 2           // base case
        A[n] ← n         // store solution
      else               // solve and store general case
        A[n] ← LookupFib(n-1) + LookupFib(n-2)
    return A[n]          // return f(n) = A[n]
83 Example 2
- The following code is a memoized version of the recursive code for binomial coefficients:
  memoizedBC(n, k)
    for i ← 0 to n
      for j ← 0 to min(i, k)
        B[i, j] ← -1     // initialize array
    return binomialCoef(n, k)

  binomialCoef(n, k)
    if B[n, k] = -1      // B[n, k] not yet computed
      if k = 0 or k = n  // base case
        B[n, k] ← 1      // solve base case and store
      else               // solve and store general case
        B[n, k] ← binomialCoef(n-1, k-1) + binomialCoef(n-1, k)
    return B[n, k]
84 Example 3
- A memoized version of longest common subsequence that runs in O(mn) time. This version is slightly different, since the base cases are stored directly in the table during initialization.
  Memoized-LCS-Length(X, Y)
    m ← length[X]
    n ← length[Y]
    for i ← 1 to m do    // initialize column 0
      c[i, 0] ← 0
    for j ← 1 to n do    // initialize row 0
      c[0, j] ← 0
    for i ← 1 to m do    // initialize remaining matrix
      for j ← 1 to n do
        c[i, j] ← -1
    L-LCS-Length(X, Y, m, n)   // call with m and n
    return c and b
85 Example 3 (cont'd)
  L-LCS-Length(X, Y, i, j)
    if c[i, j] = -1
      if xi = yj
        L-LCS-Length(X, Y, i-1, j-1)
        c[i, j] ← c[i-1, j-1] + 1
        b[i, j] ← "D"
        return
      L-LCS-Length(X, Y, i-1, j)
      L-LCS-Length(X, Y, i, j-1)
      if c[i-1, j] ≥ c[i, j-1]
        c[i, j] ← c[i-1, j]
        b[i, j] ← "U"
      else
        c[i, j] ← c[i, j-1]
        b[i, j] ← "L"
    return
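The memoized LCS of Examples 84-85 can be sketched in Python (the b arrow table is omitted for brevity; only lengths are computed):

```python
def memoized_lcs_length(x, y):
    """Memoized LCS length: table entries start at -1 ('not yet
    computed'); row 0 and column 0 start at 0 (base cases)."""
    m, n = len(x), len(y)
    c = [[-1] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        c[i][0] = 0
    for j in range(n + 1):
        c[0][j] = 0

    def lookup(i, j):
        if c[i][j] == -1:                    # not yet computed
            if x[i - 1] == y[j - 1]:
                c[i][j] = lookup(i - 1, j - 1) + 1
            else:
                c[i][j] = max(lookup(i - 1, j), lookup(i, j - 1))
        return c[i][j]

    return lookup(m, n)
```

Each of the (m+1)(n+1) cells is filled at most once, so the running time is O(mn), matching the bottom-up version.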