Title: Chapter 1: The DP Algorithm
1. Chapter 1: The DP Algorithm
- To do:
  - sequential decision-making
  - state
  - random elements
  - discrete-time stochastic dynamic system
  - optimal control/decision problem
  - actions vs. strategies (information gathering, feedback)
- Illustrated via examples; later on, the general model will be described.
2. Example: Inventory Control Problem
[Figure: timeline of periods 0, 1, 2, ..., k-1, k, k+1, ..., N-1, N]
Quantity of a certain item, e.g. gas in a service station, oil in a refinery, cars in a dealership, spare parts in a maintenance facility, etc. The stock is checked at equally spaced points in time, e.g. every morning, at the end of each week, etc. At those times, a decision must be made as to what quantity of the item to order, so that demand over the present period is satisfactorily met (we will give this a quantitative meaning).
[Figure: the kth period, between check times k-1, k, and k+1; at each check time we check stock and place an order]
3. Example: Inventory Control Problem
- Stochastic difference equation:
  x_{k+1} = x_k + u_k - w_k
- x_k = stock at the beginning of the kth period
- u_k = quantity ordered at the beginning of the kth period; assume it is delivered during the kth period
- w_k = demand during the kth period; {w_k} is a stochastic process
- Assume real-valued variables.
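The difference equation above can be sketched in Python; the demand distribution and the numbers below are illustrative assumptions, not from the slides:

```python
import random

def simulate_inventory(x0, orders, demand_sampler):
    """Simulate the stochastic difference equation x_{k+1} = x_k + u_k - w_k.

    x0: initial stock; orders: [u_0, ..., u_{N-1}];
    demand_sampler: callable returning one demand sample w_k.
    Negative stock means backlogged demand.
    """
    x = x0
    trajectory = [x]
    for u in orders:
        w = demand_sampler()
        x = x + u - w  # system equation
        trajectory.append(x)
    return trajectory

# Assumed demand: uniform on {0, 1, 2}, for illustration only
random.seed(0)
print(simulate_inventory(x0=5, orders=[2, 2, 2],
                         demand_sampler=lambda: random.choice([0, 1, 2])))
```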
4. Example: Inventory Control Problem
- Negative stock is interpreted as excess demand, which is backlogged and filled as soon as possible.
- Cost of operation:
  - purchasing cost: c u_k (c = cost per unit)
  - H(x_{k+1}): penalty for holding and storage of extra quantity (x_{k+1} > 0), or for shortage (x_{k+1} < 0)
- Cost for period k: c u_k + H(x_k + u_k - w_k) = g(x_k, u_k, w_k), since x_k + u_k - w_k = x_{k+1}.
5. Example: Inventory Control Problem
Let, for example, H(x) = h x for x >= 0, or H(x) = -p x for x < 0, where h is the holding cost per unit and p the shortage penalty per unit (a typical piecewise-linear choice).
6. Example: Inventory Control Problem
Objective: to minimize, in some meaningful sense, the total cost of operation over a finite number of periods (a finite horizon). The total cost over N periods is
  sum over k = 0, ..., N-1 of ( c u_k + H(x_{k+1}) ).
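As a sketch, the per-period cost g and the N-period total can be computed as follows; the constants c, h, p and the piecewise-linear H are illustrative assumptions:

```python
def period_cost(x, u, w, c=2.0, h=1.0, p=3.0):
    """g(x_k, u_k, w_k) = c*u_k + H(x_{k+1}), with an assumed
    piecewise-linear H: h per unit held, p per unit backlogged."""
    x_next = x + u - w
    H = h * x_next if x_next >= 0 else -p * x_next
    return c * u + H

def total_cost(x0, orders, demands, c=2.0):
    """Total cost over N periods for one realized demand sequence."""
    x, total = x0, 0.0
    for u, w in zip(orders, demands):
        total += period_cost(x, u, w, c=c)
        x = x + u - w  # system equation
    return total

print(total_cost(x0=2, orders=[1, 1], demands=[1, 1]))
```

In the stochastic case discussed below, this total is a random quantity and is minimized in expectation.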
7. Example: Inventory Control Problem
- Two distinct situations can arise.
- Deterministic case: x_0 is perfectly known, and the demands are known in advance to the manager.
  - At k = 0, all future demands are known: w_0, w_1, ..., w_{N-1}.
  - => select all orders at once, so as to exactly meet the demand
  - => x_1 = x_2 = ... = x_{N-1} = 0
  - 0 = x_1 = x_0 + u_0 - w_0
  - => u_0 = w_0 - x_0 (assume x_0 <= w_0)
  - u_k = w_k, 1 <= k <= N-1
  - => a fixed order schedule
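The fixed order schedule derived above can be sketched as:

```python
def deterministic_schedule(x0, demands):
    """With all demands known in advance, order exactly enough so that
    x_1 = x_2 = ... = 0:  u_0 = w_0 - x_0, and u_k = w_k for k >= 1.
    Requires x_0 <= w_0 so that the first order is nonnegative."""
    assert x0 <= demands[0], "assumes x_0 <= w_0"
    return [demands[0] - x0] + list(demands[1:])

print(deterministic_schedule(x0=3, demands=[5, 2, 4]))  # [2, 2, 4]
```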
8. Example: Inventory Control Problem
- What we did was to select a set of fixed actions (numbers, i.e. a precomputed order schedule).
- Suppose instead that at the beginning of period k, w_k becomes known (a perfect forecast). Then we must gather information and make decisions sequentially.
- Strategy: a rule for making decisions based on information (here, the forecast) as it becomes available.
9. Stochastic Case
- Stochastic case: x_0 is perfectly known (we can generalize to the case when only its distribution is known), but {w_k} is a random process.
- Assume that the w_k are i.i.d. real-valued random variables with pdf f_w, independent of k.
- P_w = probability distribution or measure, i.e. P_w(B) is the probability that w_k takes a value in the set B.
10. Stochastic Case
- Note that the stock x_k is now a random variable.
- Alternatively, we can describe the evolution of the system in terms of a transition law:
  P(x_{k+1} in B | x_k = x, u_k = u) = P_w({w : x + u - w in B}),
which follows directly from the system equation x_{k+1} = x_k + u_k - w_k.
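For an integer-valued demand with a known pmf (an illustrative assumption; the slides allow real-valued demand), the transition law can be derived from the system equation like so:

```python
def transition_law(x, u, demand_pmf):
    """P(x_{k+1} = j | x_k = x, u_k = u) = P(w_k = x + u - j),
    derived from the system equation x_{k+1} = x_k + u_k - w_k."""
    law = {}
    for w, p in demand_pmf.items():
        j = x + u - w
        law[j] = law.get(j, 0.0) + p  # accumulate if two demands tie
    return law

pmf = {0: 0.25, 1: 0.5, 2: 0.25}  # assumed demand distribution
print(transition_law(x=1, u=1, demand_pmf=pmf))
```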
11. Stochastic Case
- The cost is now also a random quantity: minimize the expected cost.
- Actions: select all orders (numbers) at k = 0. Most likely not optimal (this reduces to a nonlinear programming problem).
- vs.
- Strategy: select a sequence of functions mu_0, ..., mu_{N-1} such that u_k = mu_k(x_k), where x_k is the information available at the kth period.
- A difficult problem! The optimization is over a function space.
12. Stochastic Dynamic Program
- Let pi = (mu_0, mu_1, ..., mu_{N-1}) be a control/decision strategy (policy, law).
- Pi = set of all admissible strategies (e.g. those with mu_k(x) >= 0).
Then, the stochastic DP problem is:
  minimize E[ sum over k = 0, ..., N-1 of g(x_k, mu_k(x_k), w_k) ]
  s.t. pi in Pi and the system equation.
If the problem is feasible, then there exists an optimal strategy pi*, i.e. one whose expected cost is no larger than that of any other admissible strategy.
13. Summary of the Problem
1. Discrete-time stochastic system:
   system equation: x_{k+1} = max(0, x_k + u_k - w_k)
   transition law: P(x_{k+1} in B | x_k, u_k), induced by the system equation
   Note: no backlogging (the stock is kept nonnegative).
14. Stochastic Dynamic Program
2. Stochastic element: {w_k}, assumed i.i.d. (for example); this will be generalized to w_k depending on x_k and u_k.
3. Control constraint: u_k >= 0; if there is a maximum capacity M, then 0 <= u_k <= M - x_k.
4. Additive cost: the total cost is a sum of per-period costs g(x_k, u_k, w_k).
15. Stochastic Dynamic Program
5. Optimization over admissible strategies.
We will see later on that this problem has a neat closed-form solution:
  mu_k(x) = T_k - x if x < T_k, and 0 otherwise,
for some threshold levels T_k (a base-stock policy).
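A minimal sketch of such a base-stock (threshold) policy:

```python
def base_stock_policy(x, T):
    """Base-stock policy: if stock x is below the threshold T_k,
    order up to T_k; otherwise order nothing."""
    return T - x if x < T else 0

print(base_stock_policy(3, 10))   # orders 7, bringing stock up to 10
print(base_stock_policy(12, 10))  # above threshold: order nothing, 0
```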
16. Role of Information: Actions vs. Strategies
Example: let a two-stage problem be given with initial state x_0 = 0, where w_0 is a random variable taking the values +1 and -1, each with probability 1/2.
17. Role of Information: Actions vs. Strategies
Problem A: choose actions (u_0, u_1) (open loop, a control schedule) to minimize the expected cost. Equivalently, with N = 2, minimize the two-stage cost subject to the system equation (*).
18. Role of Information: Actions vs. Strategies
Solution A:
Case (i):
19. Role of Information: Actions vs. Strategies
Case (ii):
20. Role of Information: Actions vs. Strategies
In this case u_1 can be anything; then choose u_0 appropriately.
No information gathering: we choose u_0 and u_1 at the start and do not take x_1 into consideration at the beginning of stage 1.
21. Role of Information: Actions vs. Strategies
Problem B: choose u_0 and u_1 sequentially, using the observed value of x_1.
Solution B: from (*), we select u_1 as a function of the observed x_1.
Sequential decision-making, feedback control. Thus, to take decision u_1, we wait until the outcome x_1 becomes available, and act accordingly.
22. Role of Information: Actions vs. Strategies
Note: information gathering doesn't always help.
Let the disturbance be deterministic (the deterministic case): then we do not gain anything by making decisions sequentially.
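The contrast between open-loop actions and feedback strategies can be illustrated numerically. The instance below is an assumed toy problem, not necessarily the slides' exact example: x_1 = u_0 + w_0 with w_0 = +1 or -1 equally likely, and cost E[(x_1 + u_1)^2].

```python
def open_loop_cost(u0, u1):
    """u_0 and u_1 are fixed numbers chosen before w_0 is observed."""
    return sum(0.5 * (u0 + w0 + u1) ** 2 for w0 in (-1, 1))

def closed_loop_cost(u0, policy):
    """u_1 = policy(x_1) is chosen after x_1 is observed (feedback)."""
    total = 0.0
    for w0 in (-1, 1):
        x1 = u0 + w0
        total += 0.5 * (x1 + policy(x1)) ** 2
    return total

print(open_loop_cost(0.0, 0.0))             # best open-loop cost: 1.0
print(closed_loop_cost(0.0, lambda x: -x))  # feedback achieves 0.0
```

No open-loop pair (u_0, u_1) can beat cost 1 here, while waiting for x_1 and playing u_1 = -x_1 drives the cost to zero; with a deterministic w_0 the two coincide.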
23. Discrete-Time Stochastic Dynamic System Model and Optimal Decision/Control Problem
1. Discrete-time stochastic dynamic system (t, k can index time or events):
   x_{k+1} = f_k(x_k, u_k, w_k), k = 0, 1, ..., N-1
   x_k in S_k = state space at time k
   u_k in C_k = control space
   w_k in D_k = disturbance space (countable)
Also, depending on the state of the system, there are constraints on the actions that can be taken:
   u_k in U_k(x_k), a nonempty subset of C_k.
24. Discrete-Time Stochastic Dynamic System Model and Optimal Decision/Control Problem
2. Stochastic disturbance {w_k}: each w_k has a probability measure (distribution) that may depend explicitly on the time, the current state, and the action, but not on the previous disturbances w_{k-1}, ..., w_0.
25. Discrete-Time Stochastic Dynamic System Model and Optimal Decision/Control Problem
3. Admissible control/decision laws (strategies, policies): pi = (mu_0, ..., mu_{N-1}), with mu_k(x_k) in U_k(x_k) for all x_k.
The information patterns define the admissible classes: feasible policies; Markov policies; deterministic vs. randomized policies.
26. Discrete-Time Stochastic Dynamic System Model and Optimal Decision/Control Problem
4. Finite-horizon optimal control/decision problem: given an initial state x_0 and cost functions g_k, k = 0, ..., N-1, find pi in Pi that minimizes the cost functional
   J_pi(x_0) = E[ sum over k = 0, ..., N-1 of g_k(x_k, mu_k(x_k), w_k) ],
subject to the system equation constraint x_{k+1} = f_k(x_k, mu_k(x_k), w_k), k = 0, ..., N-1.
27. Discrete-Time Stochastic Dynamic System Model and Optimal Decision/Control Problem
We say that pi* in Pi is optimal for the initial state x_0 if
   J_{pi*}(x_0) <= J_pi(x_0) for all pi in Pi.
Optimal N-stage cost (or value) function: J*(x_0) = inf over pi in Pi of J_pi(x_0).
Likewise, for a given epsilon > 0, pi is said to be epsilon-optimal if
   J_pi(x_0) <= J*(x_0) + epsilon.
28. Discrete-Time Stochastic Dynamic System Model and Optimal Decision/Control Problem
This stochastic optimal control problem is difficult! We are optimizing over strategies.
The dynamic programming algorithm will give us necessary and sufficient conditions to decompose this problem into a sequence of coupled minimization problems over actions, from which we will obtain the optimal strategy and the optimal cost.
DP is the only general approach to sequential decision-making under uncertainty.
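A minimal sketch of the backward DP recursion for a finite-horizon problem, applied to a small inventory instance; the state/action spaces, demand pmf, and cost constants below are illustrative assumptions:

```python
def dp_backward(states, actions, transition, stage_cost, terminal_cost, N):
    """Backward DP recursion:
        J_N(x) = g_N(x)
        J_k(x) = min over u in U(x) of E[ g(x, u, w) + J_{k+1}(f(x, u, w)) ]
    transition(x, u) returns a list of (prob, next_state, w) triples.
    Returns the value functions J_0..J_N and a Markov policy mu_0..mu_{N-1}.
    """
    J = [dict() for _ in range(N + 1)]
    mu = [dict() for _ in range(N)]
    for x in states:
        J[N][x] = terminal_cost(x)
    for k in range(N - 1, -1, -1):
        for x in states:
            best_u, best_val = None, float("inf")
            for u in actions(x):
                val = sum(p * (stage_cost(x, u, w) + J[k + 1][x2])
                          for p, x2, w in transition(x, u))
                if val < best_val:
                    best_u, best_val = u, val
            J[k][x] = best_val
            mu[k][x] = best_u
    return J, mu

# Tiny assumed inventory instance: capacity M = 3, no backlogging
# (x_{k+1} = max(0, x + u - w)), horizon N = 3.
pmf = {0: 0.25, 1: 0.5, 2: 0.25}
c, h, p = 1.0, 0.5, 3.0
M = 3

def inv_actions(x):
    return range(M - x + 1)  # capacity constraint: x + u <= M

def inv_transition(x, u):
    return [(q, max(0, x + u - w), w) for w, q in pmf.items()]

def inv_cost(x, u, w):
    leftover = x + u - w
    holding = h * leftover if leftover >= 0 else -p * leftover
    return c * u + holding

J, mu = dp_backward(range(M + 1), inv_actions, inv_transition, inv_cost,
                    terminal_cost=lambda x: 0.0, N=3)
print(J[0], mu[0])
```

Each stage is now a minimization over actions u for each state x, rather than one minimization over the function space of strategies.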
29. Alternative System Description
Given a dynamic description of a system via a system equation x_{k+1} = f_k(x_k, u_k, w_k), we can alternatively describe the system via a transition law.
30. Alternative System Description
Given x_k and u_k, x_{k+1} has the distribution
   P(x_{k+1} in B | x_k, u_k) = P(w_k in {w : f_k(x_k, u_k, w) in B}).
=> A system equation description is equivalent to a system transition law description.