Title: Lecture 2: Dynamic Programming
Content
- What is Dynamic Programming?
- Matrix Chain-Products
- Sequence Alignments
- Knapsack Problem
- All-Pairs Shortest Path Problem
- Traveling Salesman Problem
- Conclusion
What is Dynamic Programming?
- Dynamic Programming (DP) breaks the original problem into sub-problems of smaller size.
- The optimal solution of a bigger sub-problem is found through a recurrence formula that connects the optimal solutions of smaller sub-problems.
- Used when the solution to a problem may be viewed as the result of a sequence of decisions.
Properties for Problems Solved by DP
- Simple Subproblems
  - The original problem can be broken into smaller subproblems with the same structure.
- Optimal Substructure of the Problems
  - The solution to the problem must be a composition of subproblem solutions (the principle of optimality).
- Subproblem Overlap
  - Optimal solutions to otherwise unrelated subproblems can contain subproblems in common.
The Principle of Optimality
- The basic principle of dynamic programming
- Developed by Richard Bellman
- An optimal path has the property that whatever
the initial conditions and control variables
(choices) over some initial period, the control
(or decision variables) chosen over the remaining
period must be optimal for the remaining problem,
with the state resulting from the early decisions
taken to be the initial condition.
Example: Shortest Path Problem
[Figure: a path-finding example from Start to Goal; labels on the final frame: 25, 10, 28, 5, 40, 3.]
Recall: Greedy Method for Shortest Paths on a Multi-stage Graph
- Problem: find a shortest path from v0 to v3.
- Is the greedy solution optimal?
[Figure: multi-stage graph comparing the greedy path with the optimal path.]
Example: Dynamic Programming
Matrix Multiplication
- $C = A \times B$
- A is $d \times e$ and B is $e \times f$
- Computing C takes $O(d \cdot e \cdot f)$ time
Matrix Chain-Products
- Given a sequence of matrices $A_1, A_2, \ldots, A_n$, find the most efficient way to multiply them together.
- Facts
  - $A(BC) = (AB)C$
  - Different parenthesizations may need different numbers of operations.
- Example: A is $10 \times 30$, B is $30 \times 5$, C is $5 \times 60$
  - $(AB)C$: $(10 \cdot 30 \cdot 5) + (10 \cdot 5 \cdot 60) = 1500 + 3000 = 4500$ ops
  - $A(BC)$: $(30 \cdot 5 \cdot 60) + (10 \cdot 30 \cdot 60) = 9000 + 18000 = 27000$ ops
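
The arithmetic in the example above can be checked with a few lines of Python. This is only a sketch: the helper name `mult_cost` is an assumption introduced here for illustration and does not appear in the lecture.

```python
def mult_cost(rows, inner, cols):
    """Scalar multiplications needed for a (rows x inner) by (inner x cols) product."""
    return rows * inner * cols

# Dimensions from the example: A is 10 x 30, B is 30 x 5, C is 5 x 60.
cost_ab_c = mult_cost(10, 30, 5) + mult_cost(10, 5, 60)    # (AB) first, then (AB)C
cost_a_bc = mult_cost(30, 5, 60) + mult_cost(10, 30, 60)   # (BC) first, then A(BC)

print(cost_ab_c, cost_a_bc)
```

Both orders produce the same 10 x 60 result; only the number of scalar multiplications differs.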
Matrix Chain-Products
- Given a sequence of matrices $A_1, A_2, \ldots, A_n$, find the most efficient way to multiply them together.
- A Brute-force Approach
  - Try all possible ways to parenthesize $A = A_1 A_2 \cdots A_n$
  - Calculate the number of operations for each one
  - Pick the best one
- Time Complexity
  - The number of parenthesizations equals the number of binary trees with n leaves
  - $O(4^n)$
A Greedy Approach
- Idea 1
  - Repeatedly select the product that uses the most operations.
- Counter-example
  - A is $10 \times 5$, B is $5 \times 10$, C is $10 \times 5$, and D is $5 \times 10$
  - Greedy idea 1 gives $(AB)(CD)$, which takes $500 + 1000 + 500 = 2000$ ops
  - $A((BC)D)$ takes $500 + 250 + 250 = 1000$ ops
Another Greedy Approach
- Idea 2
  - Repeatedly select the product that uses the fewest operations.
- Counter-example
  - A is $101 \times 11$, B is $11 \times 9$, C is $9 \times 100$, and D is $100 \times 99$
  - Greedy idea 2 gives $A((BC)D)$, which takes $109989 + 9900 + 108900 = 228789$ ops
  - $(AB)(CD)$ takes $9999 + 89991 + 89100 = 189090$ ops
DP: Define Subproblem
- Subproblem $P_{ij}$ ($i \le j$): find the best way to multiply $A_i A_{i+1} \cdots A_j$.
- Original problem: $P_{1n}$.
- Suppose the number of operations for the optimal solution of $P_{ij}$ is $N_{ij}$. Then the number of operations for the optimal solution of the original problem $P_{1n}$ is $N_{1n}$.
- What is the relation between $N_{ij}$ ($P_{ij}$) and $N_{1n}$ ($P_{1n}$)?
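DP: Principle of Optimality
- Let $A_k$ be a $d_k \times d_{k+1}$ matrix. If the last multiplication splits the chain after $A_k$, the left product $A_i \cdots A_k$ is $d_i \times d_{k+1}$ and costs $N_{i,k}$, and the right product $A_{k+1} \cdots A_j$ is $d_{k+1} \times d_{j+1}$ and costs $N_{k+1,j}$.
- $N_{i,j} = \min_{i \le k < j} \{ N_{i,k} + N_{k+1,j} + d_i \, d_{k+1} \, d_{j+1} \}$, with $N_{i,i} = 0$.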
DP: Implementation
[Figure: animation of filling the table $N_{ij}$, with rows i = 1..n and columns j = 1..n.]
DP for Matrix Chain-Products
Algorithm matrixChain(S)
  Input: sequence S of n matrices to be multiplied
  Output: number of operations in an optimal parenthesization of S
  for i ← 1 to n                  // main diagonal terms are all zero
    N[i,i] ← 0
  for d ← 2 to n                  // for each diagonal do the following
    for i ← 1 to n − d + 1        // from top to bottom along the diagonal
      j ← i + d − 1
      N[i,j] ← infinity
      for k ← i to j − 1          // find the minimum
        N[i,j] ← min(N[i,j], N[i,k] + N[k+1,j] + d_i · d_{k+1} · d_{j+1})
Time Complexity
- matrixChain fills $O(n^2)$ table entries, and each entry takes $O(n)$ time to minimize over k, so the running time is $O(n^3)$.
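
As a rough illustration, the matrixChain pseudocode could be written in Python as follows. This is a sketch under stated assumptions, not the lecture's own code: the function name `matrix_chain` and the 0-based `dims` list (matrix i has size `dims[i]` x `dims[i + 1]`) are conventions chosen here.

```python
import math

def matrix_chain(dims):
    """Minimum number of scalar multiplications for a matrix chain.

    Assumed convention: matrix i (0-based) has size dims[i] x dims[i + 1],
    so a chain of n matrices is described by n + 1 numbers.
    """
    n = len(dims) - 1
    # N[i][j] = minimum cost of multiplying matrices i..j; the diagonal stays 0.
    N = [[0] * n for _ in range(n)]
    for d in range(2, n + 1):               # d = length of the sub-chain
        for i in range(n - d + 1):          # walk down the current diagonal
            j = i + d - 1
            N[i][j] = math.inf
            for k in range(i, j):           # try every position of the last split
                cost = N[i][k] + N[k + 1][j] + dims[i] * dims[k + 1] * dims[j + 1]
                N[i][j] = min(N[i][j], cost)
    return N[0][n - 1]

print(matrix_chain([10, 30, 5, 60]))
```

Like the pseudocode, the sketch returns only the operation count and does not record where the optimal splits occur.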
Exercises
- The matrixChain algorithm only computes the number of operations of an optimal parenthesization, but it doesn't report the optimal parenthesization scheme. Please modify the algorithm so that it can do so.
- Give an example with 5 matrices to illustrate your idea using a table.
Question
- Given two strings:
  - Are they similar?
  - What is their distance?
Example
X: applicable
Y: plausibly
How similar are they?
Can you give them a score?
Example
X: applica---ble
Y: -p-l--ausibly
Three cases:
- Matches (+1): 5
- Mismatches (−1): 1
- Insertions and deletions (indels) (−1): 7
Score $= 5 \times (+1) + 1 \times (-1) + 7 \times (-1) = -3$
Is the alignment optimal?
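
The score of the alignment above can be recomputed column by column. The function below is a minimal sketch; the name `alignment_score` and its keyword parameters are assumptions made for illustration, with default values taken from the three cases on the slide (+1, −1, −1).

```python
def alignment_score(x, y, match=1, mismatch=-1, indel=-1):
    """Score two already-aligned strings of equal length; '-' marks a gap."""
    if len(x) != len(y):
        raise ValueError("aligned strings must have the same length")
    score = 0
    for a, b in zip(x, y):
        if a == "-" or b == "-":
            score += indel        # insertion or deletion
        elif a == b:
            score += match
        else:
            score += mismatch
    return score

print(alignment_score("applica---ble", "-p-l--ausibly"))
```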
Sequence Alignment
- In bioinformatics, a sequence alignment is a way
of arranging the primary sequences of DNA, RNA,
or protein to identify regions of similarity that
may be a consequence of functional, structural,
or evolutionary relationships between the
sequences.
Global and Local Alignments
Global alignment:
  L G P S S K Q T G K G S - S R I W D N
  L N - I T K S A G K G A I M R L G D A
Local alignment:
  - - - - - - - T G K G - - - - - - - -
  - - - - - - - A G K G - - - - - - - -
- Global Alignment
  - Attempts to align the entire sequence.
  - Most useful when the sequences in the query set are similar and of roughly equal size.
  - Needleman-Wunsch algorithm (1970).
- Local Alignment
  - Attempts to align partial regions of sequences with a high level of similarity.
  - Smith-Waterman algorithm (1981).
Needleman-Wunsch Algorithm
- Find the best global alignment of any two sequences under a given substitution matrix.
- Maximize a similarity score, to give the maximum match.
- Maximum match: the largest number of residues of one sequence that can be matched with another, allowing for all possible gaps.
- Based on dynamic programming.
  - Involves an iterative matrix method of calculation.
Substitution Matrix
- In bioinformatics, a substitution matrix estimates the rate at which each possible residue in a sequence changes to each other residue over time.
- Substitution matrices are usually seen in the context of amino acid or DNA sequence alignment, where the similarity between sequences depends on the mutation rates as represented in the matrix.
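Substitution Matrix (DNA) without Gap Cost
[Figure: substitution matrix for DNA without gap cost.]
Substitution Matrix (DNA) with Gap Cost
[Figure: substitution matrix for DNA with gap cost.]
Substitution Matrix (3D-BLAST)
[Figure: substitution matrix used by 3D-BLAST.]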
DP: Define Subproblem
- Consider two strings, s of length n and t of length m. Let S be the substitution matrix.
- Subproblem: let $P_{ij}$ be the optimal alignment of the two prefixes $t_{1..i}$ and $s_{1..j}$, and let $M_{ij}$ be its matching score.
- Original problem: $P_{mn}$ (matching score $M_{mn}$).
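DP: Principle of Optimality
- An optimal alignment of $t_{1..i}$ and $s_{1..j}$ ends in one of three ways: $t_i$ aligned with $s_j$, $t_i$ aligned with a gap, or $s_j$ aligned with a gap.
- $M_{ij} = \max\{ M_{i-1,j-1} + S(t_i, s_j),\; M_{i-1,j} + S(t_i, -),\; M_{i,j-1} + S(-, s_j) \}$
- Base conditions: $M_{00} = 0$, $M_{i0} = M_{i-1,0} + S(t_i, -)$, $M_{0j} = M_{0,j-1} + S(-, s_j)$.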
Example
- Step 1. Create a scoring matrix
- Step 2. Make an empty table for $M_{ij}$
- Step 3. Initialize base conditions
- Step 4. Fill the table using the recurrence for $M_{ij}$
- Step 5. Trace back
Example: Step 3 (initialize base conditions)
- First row: $M_{0j} = 0, -2, -4, -6, -8, -10, -12, -14$ for j = 0..7
- First column: $M_{i0} = 0, -2, -4, -6, -8, -10, -12$ for i = 0..6
Example: Step 4 (fill the table)
Table $M_{ij}$ (rows i = 0..6, columns j = 0..7):
        j=0    1    2    3    4    5    6    7
i=0       0   -2   -4   -6   -8  -10  -12  -14
i=1      -2   -1    0   -2   -4   -6   -8  -10
i=2      -4   -3    1   -1   -3   -3   -5   -7
i=3      -6   -5   -1    0   -2   -1   -3   -5
i=4      -8   -4   -3    0    1   -1    1   -1
i=5     -10   -6   -5   -2    1    0    1    2
i=6     -12   -8   -7   -3    0    0    1    3
Example: Step 5 (trace back)
Resulting alignment:
s: G A T G G C A
t: G A - A T C C
[Figure: the filled table with the trace-back path highlighted, from $M_{6,7} = 3$ back to $M_{0,0} = 0$.]
Needleman-Wunsch Algorithm
- Step 1. Create a scoring matrix
- Step 2. Make an empty table for $M_{ij}$
- Step 3. Initialize base conditions
- Step 4. Fill the table using the recurrence for $M_{ij}$
- Step 5. Trace back
Trace back (Step 5), starting from i = m and j = n:
  s' ← "", t' ← ""
  while i ≥ 1 and j ≥ 1 do
    if M[i,j] = M[i−1,j−1] + S(t_i, s_j) then
      s' ← s_j s';  t' ← t_i t';  i ← i − 1;  j ← j − 1
    else if M[i,j] = M[i,j−1] + S(−, s_j) then
      s' ← s_j s';  t' ← gap t';  j ← j − 1
    else
      s' ← gap s';  t' ← t_i t';  i ← i − 1
  while i ≥ 1 do
    s' ← gap s';  t' ← t_i t';  i ← i − 1
  while j ≥ 1 do
    s' ← s_j s';  t' ← gap t';  j ← j − 1
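
The five steps can be put together in a short Python sketch. Several things here are assumptions rather than the lecture's own choices: the function name `needleman_wunsch`, the simplified scoring (a single `match`, `mismatch`, and `gap` value in place of a full substitution matrix), and the default values, of which only the gap penalty of −2 matches the table initialization in the example.

```python
def needleman_wunsch(s, t, match=1, mismatch=-1, gap=-2):
    """Global alignment of s and t; returns (score, aligned_s, aligned_t)."""
    n, m = len(s), len(t)
    # M[i][j] = best score for aligning t[:i] with s[:j].
    M = [[0] * (n + 1) for _ in range(m + 1)]
    for j in range(1, n + 1):               # base conditions: leading gaps
        M[0][j] = j * gap
    for i in range(1, m + 1):
        M[i][0] = i * gap
    for i in range(1, m + 1):               # fill the table row by row
        for j in range(1, n + 1):
            sub = match if t[i - 1] == s[j - 1] else mismatch
            M[i][j] = max(M[i - 1][j - 1] + sub,
                          M[i - 1][j] + gap,
                          M[i][j - 1] + gap)
    # Trace back from the bottom-right corner to the top-left corner.
    s_al, t_al = [], []
    i, j = m, n
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and t[i - 1] == s[j - 1] else mismatch
        if i > 0 and j > 0 and M[i][j] == M[i - 1][j - 1] + sub:
            s_al.append(s[j - 1]); t_al.append(t[i - 1]); i -= 1; j -= 1
        elif j > 0 and M[i][j] == M[i][j - 1] + gap:
            s_al.append(s[j - 1]); t_al.append("-"); j -= 1
        else:
            s_al.append("-"); t_al.append(t[i - 1]); i -= 1
    return M[m][n], "".join(reversed(s_al)), "".join(reversed(t_al))

print(needleman_wunsch("AC", "A"))
```

When several alignments share the optimal score, this sketch reports only one of them; which one depends on the order in which the three cases are tested during the trace back.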
Local Alignment Problem
- Given two strings $s = s_1 \ldots s_n$ and $t = t_1 \ldots t_m$,
- find substrings s' of s and t' of t whose similarity (optimal global alignment value) is maximum.
Example: Local Alignment
s: GTAGT CATCAT ATG TGACTGAC G
t: TC CATDOGCAT CC TGACTGAC A
[Figure: the best aligned subsequences are highlighted.]
Recursive Formulation
- Global Alignment (NeedlemanWunsch Algorithm)
- Local Alignment (Smith-Waterman Algorithm)
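
For comparison with the global case, here is a minimal Python sketch of the local (Smith-Waterman style) recursion. It differs from the global recursion in two ways: every cell is clamped at 0, and the trace back starts at the largest cell and stops at the first 0. The function name `smith_waterman` and the simplified `match`, `mismatch`, and `gap` scores are assumptions made for illustration, not values from the lecture.

```python
def smith_waterman(s, t, match=1, mismatch=-1, gap=-2):
    """Best local alignment of s and t; returns (score, aligned_s, aligned_t)."""
    n, m = len(s), len(t)
    # H[i][j] = best score of an alignment ending at t[i-1] and s[j-1], never below 0.
    H = [[0] * (n + 1) for _ in range(m + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = match if t[i - 1] == s[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + sub,
                          H[i - 1][j] + gap,
                          H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the best cell until a cell with score 0 is reached.
    i, j = best_pos
    s_al, t_al = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if t[i - 1] == s[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            s_al.append(s[j - 1]); t_al.append(t[i - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i][j - 1] + gap:
            s_al.append(s[j - 1]); t_al.append("-"); j -= 1
        else:
            s_al.append("-"); t_al.append(t[i - 1]); i -= 1
    return best, "".join(reversed(s_al)), "".join(reversed(t_al))

print(smith_waterman("XXACGTXX", "YYACGTYY"))
```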
Exercises
- Find the best locally aligned substrings for the following two DNA strings:
  - GAATTCAGTTA
  - GGATCGA
- You have to give the details.
- Hint: start from the left table.
Exercises
- What is the longest common subsequence (LCS) problem? How can LCS be solved using the dynamic programming technique?
Knapsack Problems
- Given some items, pack the knapsack to get the maximum total value. Each item has some weight and some benefit. The total weight that we can carry is no more than some fixed capacity.
- Fractional knapsack problem
  - Items are divisible: you can take any fraction of an item.
  - Solved with a greedy algorithm.
- 0-1 knapsack problem
  - Items are indivisible: you either take an item or not.
  - Solved with dynamic programming.
0-1 Knapsack Problem
- Given a knapsack with maximum capacity W, and a set S consisting of n items.
- Each item i has some weight $w_i$ and benefit value $b_i$ (all $w_i$ and W are integer values).
- Problem: how to pack the knapsack to achieve the maximum total value of packed items?
- Why is it called a 0-1 knapsack problem?
Example: 0-1 Knapsack Problem
- Which boxes should be chosen to maximize the amount of money while still keeping the overall weight under 15 kg?
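Example: 0-1 Knapsack Problem
- Objective function
- Unknowns or variables
- Constraints
Formulation: 0-1 Knapsack Problem
- Variables: $x_i \in \{0, 1\}$, where $x_i = 1$ if item i is packed and $x_i = 0$ otherwise.
- Objective: maximize $\sum_{i=1}^{n} b_i x_i$.
- Constraint: $\sum_{i=1}^{n} w_i x_i \le W$.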
0-1 Knapsack Problem: Brute-Force Approach
- Since there are n items, there are $2^n$ possible combinations of items.
- We go through all combinations and find the one with maximum value and with total weight less than or equal to W.
- Running time will be $O(2^n)$.
DP: Define Subproblem
- Suppose that items are labeled 1, ..., n.
- Define a subproblem, say $P_k$, as finding an optimal solution for the items in $S_k = \{1, 2, \ldots, k\}$.
- ⇒ The original problem is $P_n$.
- Is such a scheme workable?
- Does the principle of optimality hold?
A Counterexample
Items (weight, benefit), with knapsack capacity 20 kg:
1. 2 kg, 3
2. 3 kg, 4
3. 4 kg, 5
4. 5 kg, 8
5. 9 kg, 10
- $P_4$: items 1 to 4 weigh 14 kg in total, so the optimal solution takes all four (benefit 20).
- $P_5$: all five items weigh 23 kg, which exceeds the capacity. The optimal solution is {1, 3, 4, 5} (20 kg, benefit 26).
- The solution for $P_4$ is not part of the solution for $P_5$!
DP: Define Subproblem
- With $P_k$ defined only by the item set $S_k = \{1, 2, \ldots, k\}$, the scheme is not workable: as the counterexample shows, the principle of optimality does not hold.
DP: Define Subproblem (New Version)
- Suppose that items are labeled 1, ..., n.
- Define a subproblem, say $P_{k,w}$, as finding an optimal solution for the items in $S_k = \{1, 2, \ldots, k\}$ with total weight no more than w.
- ⇒ The original problem is $P_{n,W}$.
- Is such a scheme workable?
- Does the principle of optimality hold?
DP: Principle of Optimality
- Denote the benefit for the optimal solution of $P_{k,w}$ as $B_{k,w}$.
- If $w_k > w$, it is impossible to include the kth object: $B_{k,w} = B_{k-1,w}$.
- Otherwise there are two possible choices:
  - Do not include the kth object: $B_{k-1,w}$
  - Include the kth object: $B_{k-1,w-w_k} + b_k$
- $B_{k,w} = \max\{ B_{k-1,w},\; B_{k-1,w-w_k} + b_k \}$ when $w_k \le w$.
Example
Items (weight, benefit), with knapsack capacity 5 kg:
1. 2 kg, 3
2. 3 kg, 4
3. 4 kg, 5
4. 5 kg, 6
Step 1. Set up the table and initialize base conditions: $B_{0,w} = 0$ for all w and $B_{k,0} = 0$ for all k.
Step 2. Fill all table entries progressively.
        w=0    1    2    3    4    5
k=0       0    0    0    0    0    0
k=1       0    0    3    3    3    3
k=2       0    0    3    4    4    7
k=3       0    0    3    4    5    7
k=4       0    0    3    4    5    7
Step 3. Trace back. $B_{4,5} = B_{3,5} = B_{2,5} = 7$, so items 4 and 3 are not taken. $B_{2,5} = 7 \ne B_{1,5} = 3$, so item 2 is taken and 2 kg of capacity remains. $B_{1,2} = 3 \ne B_{0,2} = 0$, so item 1 is taken. The optimal packing is items 1 and 2, with total weight 5 kg and benefit 7.
Pseudo-Polynomial Time Algorithm
- The time complexity for 0-1 knapsack using DP is $O(Wn)$.
- Not a polynomial-time algorithm if W is large.
- This is a pseudo-polynomial time algorithm.
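
The $O(Wn)$ bound is visible in the two nested loops of a direct implementation. The sketch below is illustrative only: the function name `knapsack`, the 1-based item numbers in the returned list, and the trace-back loop are choices made here, not code from the lecture.

```python
def knapsack(weights, benefits, W):
    """0-1 knapsack by DP; returns (best_benefit, chosen item numbers, 1-based)."""
    n = len(weights)
    # B[k][w] = best benefit using items 1..k with total weight at most w.
    B = [[0] * (W + 1) for _ in range(n + 1)]
    for k in range(1, n + 1):                # n rows ...
        wk, bk = weights[k - 1], benefits[k - 1]
        for w in range(W + 1):               # ... times W + 1 columns
            B[k][w] = B[k - 1][w]            # choice 1: leave item k out
            if wk <= w:                      # choice 2: take item k if it fits
                B[k][w] = max(B[k][w], B[k - 1][w - wk] + bk)
    # Trace back: item k was taken exactly when row k differs from row k - 1.
    chosen, w = [], W
    for k in range(n, 0, -1):
        if B[k][w] != B[k - 1][w]:
            chosen.append(k)
            w -= weights[k - 1]
    return B[n][W], sorted(chosen)

print(knapsack([2, 3, 4, 5], [3, 4, 5, 6], 5))
```

The table has $(n + 1)(W + 1)$ entries, so the work grows with the numeric value of W rather than with the number of bits needed to write W down.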
All-Pairs Shortest Path Problem
- Given a weighted graph $G = (V, E)$, we want to determine the cost $d_{ij}$ of the shortest path between each pair of nodes in V.
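Floyd's Algorithm
- Let $d_{ij}^{(k)}$ be the minimum cost of a path from node i to node j, using only intermediate nodes in $V_k = \{v_1, \ldots, v_k\}$.
- A shortest such path either avoids $v_k$ or passes through it: $d_{ij}^{(k)} = \min\{ d_{ij}^{(k-1)},\; d_{ik}^{(k-1)} + d_{kj}^{(k-1)} \}$.
- The all-pairs shortest path problem is to find all paths with costs $d_{ij}^{(n)}$.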
Floyd's Algorithm
Input Parameter: D
Output Parameters: D, next
all_paths(D, next) {
  n = D.NumberOfRows
  // initialize next: if no intermediate
  // vertices are allowed, next[i][j] = j
  for i = 1 to n
    for j = 1 to n
      next[i][j] = j
  for k = 1 to n            // compute D(k)
    for i = 1 to n
      for j = 1 to n
        if (D[i][k] + D[k][j] < D[i][j]) {
          D[i][j] = D[i][k] + D[k][j]
          next[i][j] = next[i][k]
        }
}
Time complexity: $O(n^3)$
Floyd's Algorithm
Input Parameters: next, i, j
Output Parameters: None
print_path(next, i, j) {
  // if no intermediate vertices, just
  // print i and j and return
  if (j == next[i][j]) {
    print(i, j)
    return
  }
  // output i and then the path from the vertex
  // after i (next[i][j]) to j
  print(i)
  print_path(next, next[i][j], j)
}
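
The two routines above translate almost line for line into Python. The following is a sketch under assumptions made here: vertices are numbered from 0, a missing edge is represented by `math.inf`, the input matrix is copied rather than modified in place, and `path` returns a list of vertices instead of printing them.

```python
import math

def all_paths(D):
    """Floyd's algorithm; returns (dist, nxt) without modifying D."""
    n = len(D)
    dist = [row[:] for row in D]
    # With no intermediate vertices allowed, the vertex after i on the way to j is j.
    nxt = [[j for j in range(n)] for _ in range(n)]
    for k in range(n):                       # allow vertex k as an intermediate
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
                    nxt[i][j] = nxt[i][k]
    return dist, nxt

def path(nxt, i, j):
    """Vertices on the recorded path from i to j (assumes j is reachable from i)."""
    vertices = [i]
    while i != j:
        i = nxt[i][j]
        vertices.append(i)
    return vertices

INF = math.inf
D = [[0, 3, INF, 7],
     [8, 0, 2, INF],
     [5, INF, 0, 1],
     [2, INF, INF, 0]]
dist, nxt = all_paths(D)
print(dist[1][0], path(nxt, 1, 0))
```

The example graph `D` is made up for this illustration. For an unreachable pair, `dist[i][j]` stays infinite, and `path` should not be trusted for that pair.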
Traveling Salesman Problem (TSP)
- With n cities, how many feasible paths are there?
(1234) 18
(1243) 19
(1324) 23
(1342) 19
(1423) 23
(1432) 18
Subproblem Formulation for TSP
- $g(i, S)$: length of the shortest path from i to 1 visiting each city in S exactly once.
- Goal: $g(1, V - \{1\})$, the length of the optimal TSP tour.
- Choosing the city j visited right after i gives $g(i, S) = \min_{j \in S} \{ D_{ij} + g(j, S - \{j\}) \}$, with $g(i, \emptyset) = D_{i1}$.
Example
Goal: $g(1, V - \{1\})$
[Figure: worked example evaluating $g(i, S)$ recursively.]
DP: TSP Algorithm
Goal: $g(1, V - \{1\})$
Input Parameter: D
Output Parameter: P            // path
TSP(D) {
  n = Dim(D)
  for i = 1 to n
    g[i, ∅] = D[i, 1]
  for k = 1 to n − 2           // compute g for subproblems
    for all S ⊆ V − {1} with |S| = k
      for all i ∉ S ∪ {1}
        g[i, S] = min over j ∈ S of { D[i, j] + g[j, S − {j}] }
        P[i, S] = arg min over j ∈ S of { D[i, j] + g[j, S − {j}] }
  // compute the TSP tour
  g[1, V − {1}] = min over j ∈ V − {1} of { D[1, j] + g[j, V − {1, j}] }
  P[1, V − {1}] = arg min over j ∈ V − {1} of { D[1, j] + g[j, V − {1, j}] }
}
Time Complexity
- There are $O(n 2^n)$ subproblems $g(i, S)$, and each takes $O(n)$ time to minimize over j, so the TSP algorithm runs in $O(n^2 2^n)$ time. This is exponential in n.
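
A compact Python sketch of the same dynamic program is given below. The names and conventions are assumptions made for illustration: the function is called `tsp`, cities are numbered from 0 (so city 0 plays the role of city 1), subsets are stored as `frozenset` keys in a dictionary, and only the tour length is returned, without the path table P.

```python
from itertools import combinations

def tsp(D):
    """Length of an optimal tour for distance matrix D (needs at least 2 cities)."""
    n = len(D)
    others = range(1, n)
    # g[(i, S)] = shortest path from i back to city 0 visiting every city in S once.
    g = {(i, frozenset()): D[i][0] for i in others}
    for k in range(1, n - 1):                    # subset sizes 1 .. n - 2
        for subset in combinations(others, k):
            S = frozenset(subset)
            for i in others:
                if i in S:
                    continue
                g[(i, S)] = min(D[i][j] + g[(j, S - {j})] for j in S)
    full = frozenset(others)
    # Close the tour: leave city 0, visit all other cities, return to city 0.
    return min(D[0][j] + g[(j, full - {j})] for j in full)

D = [[0, 10, 15, 20],
     [5, 0, 9, 10],
     [6, 13, 0, 12],
     [8, 8, 9, 0]]
print(tsp(D))
```

The distance matrix `D` is a made-up 4-city instance, not the one from the lecture's example. The dictionary holds one entry per pair (i, S), so memory use also grows exponentially with n.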