Dynamic Programming - PowerPoint PPT Presentation

About This Presentation

Title:

Dynamic Programming

Slides: 19

Provided by: vard7

Learn more at: https://people.ohio.edu

Transcript and Presenter's Notes

1
Dynamic Programming

In this handout
A shortest path example
Deterministic Dynamic Programming
Inventory example
Resource allocation example

2
Dynamic Programming

Dynamic programming is a widely used mathematical
technique for solving problems that can be
divided into stages, with a decision required at
each stage.
The goal of dynamic programming is to find a
combination of decisions that optimizes some
quantity associated with the system, such as
total cost or total distance.

3
A typical example: Shortest Path

Ben plans to drive from NY to LA
He has friends in several cities along the way
After 1 day of driving he can reach Columbus,
Nashville, or Louisville
After 2 days of driving he can reach Kansas City,
Omaha, or Dallas
After 3 days of driving he can reach Denver or San
Antonio
After 4 days of driving he can reach Los Angeles
The actual mileages between cities are given in
the figure (next slide)
Where should Ben spend each night of the trip to
minimize the number of miles traveled?

4
Shortest Path network figure

[Figure: network of cities grouped by stage, with the mileage marked on each arc between consecutive stages.]
Stage 1: New York (1)
Stage 2: Columbus (2), Nashville (3), Louisville (4)
Stage 3: Kansas City (5), Omaha (6), Dallas (7)
Stage 4: Denver (8), San Antonio (9)
Stage 5: Los Angeles (10)

5
Shortest Path problem: Solution

The problem is solved recursively by working
backward through the network
Let $c_{ij}$ be the mileage between cities i and j
Let $f_t(i)$ be the length of the shortest path from
city i to LA (city i is in stage t)
Stage 4 computations are obvious, because each
stage 4 city has a single arc to LA
$f_4(8) = 1030$
$f_4(9) = 1390$

6
Stage 3 computations

Work backward one stage (to stage 3 cities) and
find the shortest path to LA from each stage 3
city.
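To determine $f_3(5)$, note that the shortest path
from city 5 to LA must be one of the following:
Path 1: Go from city 5 to city 8 and then take
the shortest path from city 8 to city 10.
Path 2: Go from city 5 to city 9 and then take
the shortest path from city 9 to city 10.
Hence $f_3(5) = \min\{c_{58} + f_4(8),\ c_{59} + f_4(9)\}$.
Similarly, $f_3(6)$ and $f_3(7)$ are each the smaller
of the two totals obtained by going through city 8
or through city 9.
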
7
Stage 2 computations

Work backward one stage (to stage 2 cities) and
find the shortest path to LA from each stage 2
city.
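
In the notation of slide 5, each stage 2 value is a minimum over the three stage 3 cities that can be reached next. The relation is written symbolically here, since the arc mileages are read off the figure:

```latex
f_2(i) = \min_{j \in \{5, 6, 7\}} \bigl\{ c_{ij} + f_3(j) \bigr\}, \qquad i = 2, 3, 4
```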

8
Stage 1 computations

Now we can find $f_1(1)$, and with it the shortest
path from NY to LA.
Tracing back through our calculations, the
shortest path is
1 → 2 → 5 → 8 → 10
that is,
NY → Columbus → Kansas City → Denver → LA
with total mileage 2870.
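
The backward computation of slides 5 to 8 can be sketched in Python. The arc table `ARCS` is an assumption: this transcript does not preserve which mileage belongs to which arc, so the assignment below follows the common textbook version of this example. The slides themselves state only the stage 4 values (1030 and 1390), the optimal path, and the total of 2870. All function and variable names are illustrative.

```python
# ASSUMED arc mileages c[i][j], with city numbers as in the figure.
# The placement of each mileage on an arc is an assumption, not slide content.
ARCS = {
    1: {2: 550, 3: 900, 4: 770},
    2: {5: 680, 6: 790, 7: 1050},
    3: {5: 580, 6: 760, 7: 660},
    4: {5: 510, 6: 700, 7: 830},
    5: {8: 610, 9: 790},
    6: {8: 540, 9: 940},
    7: {8: 790, 9: 270},
    8: {10: 1030},
    9: {10: 1390},
}
DESTINATION = 10  # Los Angeles


def shortest_paths_backward(arcs, destination):
    """Return (f, best_next).

    f[i] is the shortest distance from city i to the destination and
    best_next[i] is the next city on one such shortest path.
    """
    f = {destination: 0}
    best_next = {}
    # Every arc goes from a lower to a higher city number, so visiting
    # cities in descending order handles later stages first.
    for i in sorted(arcs, reverse=True):
        j_best = min(arcs[i], key=lambda j: arcs[i][j] + f[j])
        f[i] = arcs[i][j_best] + f[j_best]
        best_next[i] = j_best
    return f, best_next


def recover_path(best_next, start, destination):
    """Follow the stored decisions forward from start to destination."""
    path = [start]
    while path[-1] != destination:
        path.append(best_next[path[-1]])
    return path


f, best_next = shortest_paths_backward(ARCS, DESTINATION)
path = recover_path(best_next, 1, DESTINATION)
print(f[1], path)
```

With the assumed mileages, the sketch reproduces the path 1 → 2 → 5 → 8 → 10 and the total of 2870 given on the slide.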

9
General characteristics of Dynamic Programming

The problem structure is divided into stages
Each stage has a number of states associated with
it
Making decisions at one stage transforms one
state of the current stage into a state in the
next stage.
Given the current state, the optimal decision for
each of the remaining stages does not depend on
the previous states or decisions. This is known
as the principle of optimality for dynamic
programming.
The principle of optimality allows the problem to
be solved recursively, stage by stage.

10
Division into stages

The problem is divided into smaller subproblems,
each represented by a stage.
The stages can be defined in many different ways,
depending on the context of the problem.
If the problem is about the long-term development
of a system, then the stages naturally correspond
to time periods.
If the goal of the problem is to move some
objects from one location to another on a map,
then partitioning the map into several
geographical regions might be the natural
division into stages.
Generally, if accomplishing a certain task
can be considered a multi-step process, then
each stage can be defined as a step in the
process.

11
States

Each stage has a number of states associated with
it. Depending on what decisions are made in one
stage, the system might end up in different
states in the next stage.
If a geographical region corresponds to a stage
then the states associated with it could be some
particular locations (cities, warehouses, etc.)
in that region.
In other situations a state might correspond to
amounts of certain resources which are essential
for optimizing the system.

12
Decisions

Making decisions at one stage transforms one
state of the current stage into a state in the
next stage.
In a geographical example, it could be a decision
to go from one city to another.
In resource allocation problems, it might be a
decision to create or spend a certain amount of a
resource.
For example, in the shortest path problem three
different decisions are possible at the state
corresponding to Columbus. These decisions
correspond to the three arrows going from
Columbus to the three states (cities) of the next
stage: Kansas City, Omaha, and Dallas.

13
Optimal Policy and Principle of Optimality

The goal of the solution procedure is to find an
optimal policy for the overall problem, i.e., an
optimal policy decision at each stage for each of
the possible states.
Given the current state, the optimal decision for
each of the remaining stages does not depend on
the previous states or decisions. This is known
as the principle of optimality for dynamic
programming.
For example, in the geographical setting the
principle works as follows: the optimal route
from the current city to the final destination
does not depend on how we got to that city.
A system can be formulated as a dynamic
programming problem only if the principle of
optimality holds for it.

14
Recursive solution to the problem

The principle of optimality allows the problem to
be solved recursively, stage by stage.
The solution procedure first finds the optimal
policy for the last stage. The solution for the
last stage is normally trivial.
Then a recursive relationship is established
which identifies the optimal policy for stage t,
given that stage t+1 has already been solved.
When the recursive relationship is used, the
solution procedure starts at the end and moves
backward stage by stage until it finds the
optimal policy starting at the initial stage.
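
For the shortest path example, the recursive relationship described above takes the following form, using the $c_{ij}$ and $f_t(i)$ notation from slide 5. The symbol $S_{t+1}$ for the set of stage $t+1$ cities is introduced here for illustration only and does not appear in the slides.

```latex
f_t(i) = \min_{j \in S_{t+1}} \bigl\{ c_{ij} + f_{t+1}(j) \bigr\}, \qquad t = 3, 2, 1,
\qquad \text{with } f_4(i) = c_{i,10}.
```

The last-stage values $f_4(i)$ need no minimization, which is why the procedure starts there and moves backward.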

15
Solving Inventory Problems by DP

Main characteristics
Time is broken up into periods. The demands for
all periods are known in advance.
At the beginning of each period, the firm must
determine how many units should be produced.
Production and storage capacities are limited.
Each period's demand must be met on time from
inventory or current production.
During any period in which production takes
place, a fixed cost of production as well as a
variable per-unit cost is incurred.
The firm's goal is to minimize the total cost of
meeting the demands on time.

16
Inventory Problems: Example

Producing airplanes
3 production periods
No inventory at the beginning
Can produce at most 3 airplanes in each period
Can keep at most 2 airplanes in inventory
Set-up cost for each period is 10
Determine a production schedule to minimize the
total cost (the DP solution on the board).

Period      1  2  3
Demand      1  2  1
Unit cost   3  5  4

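
This example can be set up as a backward recursion over periods, with the inventory on hand at the start of a period as the state. The sketch below is one possible formulation, not the solution worked on the board. It assumes a holding cost of zero, because the slide gives none (the `holding_cost` parameter is hypothetical), and it assumes that inventory left over after period 3 has no value.

```python
from functools import lru_cache

DEMAND = [1, 2, 1]      # demand in periods 1..3 (from the slide)
UNIT_COST = [3, 5, 4]   # per-unit production cost in periods 1..3
SETUP_COST = 10         # fixed cost in any period with production
MAX_PRODUCTION = 3      # production capacity per period
MAX_INVENTORY = 2       # storage capacity at the end of a period


def solve_inventory(holding_cost=0):
    """Return (minimum total cost, production schedule).

    holding_cost is a HYPOTHETICAL per-unit charge on end-of-period
    inventory; the slide does not give one, so it defaults to 0.
    """
    periods = len(DEMAND)

    @lru_cache(maxsize=None)
    def f(t, inventory):
        # Minimum cost of meeting demand in periods t..end (0-based t),
        # given the inventory on hand at the start of period t.
        if t == periods:
            return 0, ()
        best_cost, best_plan = float("inf"), ()
        for x in range(MAX_PRODUCTION + 1):
            ending = inventory + x - DEMAND[t]
            if ending < 0 or ending > MAX_INVENTORY:
                continue  # demand not met on time, or storage exceeded
            cost = (SETUP_COST if x > 0 else 0) + UNIT_COST[t] * x
            cost += holding_cost * ending
            future_cost, future_plan = f(t + 1, ending)
            if cost + future_cost < best_cost:
                best_cost = cost + future_cost
                best_plan = (x,) + future_plan
        return best_cost, best_plan

    cost, plan = f(0, 0)  # no inventory at the beginning
    return cost, list(plan)


print(solve_inventory())
```

Under these assumptions the sketch gives a total cost of 33, producing 3, 0, and 1 airplanes in periods 1, 2, and 3. A nonzero holding cost could change both the cost and the schedule.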
17
Resource Allocation Problems

Limited resources must be allocated to different
activities
Each activity has a benefit value which is
variable and depends on the amount of the
resource assigned to the activity
The goal is to determine how to allocate the
resources to the activities such that the total
benefit is maximized

18
Resource Allocation Problems: Example

A college student has 6 days remaining before
final exams begin in his 4 courses
He wants to allocate the study time as effectively
as possible
Needs at least 1 day for each course and wants to
concentrate on just one course each day. So 1, 2,
or 3 days should be allocated to each course
He estimates that the alternative allocations for
each course would yield the number of grade
points shown in the following table
How many days should be allocated to each course?
(The DP solution on the board).
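
One way to set this up as a dynamic program is to treat each course as a stage, the number of days still unallocated as the state, and the number of days given to the current course as the decision. The sketch below follows that formulation. The grade-point table `POINTS` is entirely hypothetical, because the table from the slide is not reproduced in this transcript, so the numbers serve only to make the code runnable. The function name `allocate_days` is also illustrative.

```python
from functools import lru_cache

TOTAL_DAYS = 6
# HYPOTHETICAL grade points: POINTS[c][d - 1] is the estimate for course c
# when d days (1, 2, or 3) are allocated to it. These are placeholder
# values, not the values from the slide's table.
POINTS = [
    [3, 5, 6],
    [4, 5, 7],
    [2, 4, 7],
    [5, 6, 6],
]


def allocate_days(points, total_days):
    """Return (best total grade points, days allocated to each course)."""
    n = len(points)

    @lru_cache(maxsize=None)
    def f(course, days_left):
        # Best total for courses course..n-1 when days_left days remain.
        if course == n:
            # All days must be used; otherwise the allocation is invalid.
            return (0, ()) if days_left == 0 else (float("-inf"), ())
        best_total, best_plan = float("-inf"), ()
        courses_after = n - course - 1
        for d in range(1, len(points[course]) + 1):
            # Leave at least one day for each later course.
            if days_left - d < courses_after:
                break
            future_total, future_plan = f(course + 1, days_left - d)
            total = points[course][d - 1] + future_total
            if total > best_total:
                best_total, best_plan = total, (d,) + future_plan
        return best_total, best_plan

    total, plan = f(0, total_days)
    return total, list(plan)


total, plan = allocate_days(POINTS, TOTAL_DAYS)
print(total, plan)
```

Replacing `POINTS` with the actual table would give the allocation asked for on the slide, and the recursion itself does not depend on the placeholder values.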