Chapter 1: The DP Algorithm

Transcript and Presenter's Notes
1
Chapter 1: The DP Algorithm
  • To do:
  • sequential decision-making
  • state
  • random elements
  • discrete-time stochastic dynamic system
  • optimal control/decision problem
  • actions vs. strategy (information gathering, feedback)
  • These ideas are illustrated via examples; the general model is described later on.

2
Example: Inventory Control Problem

[Timeline figure: periods marked 0, 1, 2, ..., k-1, k, k+1, ..., N-1, N]
Consider the quantity of a certain item, e.g. gas in a service
station, oil in a refinery, cars in a dealership,
spare parts in a maintenance facility, etc. The
stock is checked at equally spaced points in
time, e.g. every morning, at the end of each
week, etc. At those times, a decision must be
made as to what quantity of the item to order, so
that demand over the present period is
satisfactorily met (we will give this a quantitative
meaning).
[Diagram: the kth period, between times k and k+1; at time k, check stock and place an order]
3
Example: Inventory Control Problem
  • Stochastic difference equation (simulated in the sketch below):
  • x_{k+1} = x_k + u_k - w_k
  • x_k = stock at the beginning of the kth period
  • u_k = quantity ordered at the beginning of the kth period; assume it is delivered during the kth period
  • w_k = demand during the kth period; {w_k} is a stochastic process
  • assume real-valued variables
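To make the dynamics concrete, here is a minimal Python sketch (not part of the original slides) that simulates the stock evolution under an arbitrary ordering rule; the initial stock, the constant-order rule, and the demand distribution are assumed purely for illustration.

    import random

    def simulate_inventory(x0, policy, demands):
        """Simulate x_{k+1} = x_k + u_k - w_k under a given ordering policy."""
        x, xs = x0, [x0]
        for k, w in enumerate(demands):
            u = policy(k, x)   # u_k: quantity ordered at the start of period k
            x = x + u - w      # stock at the start of period k+1 (may go negative)
            xs.append(x)
        return xs

    # Example: always order 5 units, with demand drawn uniformly from 0..9.
    demands = [random.randint(0, 9) for _ in range(8)]
    print(simulate_inventory(10, lambda k, x: 5, demands))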

4
Example: Inventory Control Problem
  • Negative stock is interpreted as excess demand, which is backlogged and filled as soon as possible.
  • Cost of operation:
  • purchasing cost c·u_k (c = cost per unit)
  • H(x_{k+1}) = penalty for holding and storage of extra quantity (x_{k+1} > 0), or for shortage (x_{k+1} < 0)
  • Cost for period k: c·u_k + H(x_k + u_k - w_k) = g(x_k, u_k, w_k)

[Figure: plot of the penalty H as a function of x_{k+1}]
5
Example: Inventory Control Problem

Let H(x) = h·max(0, x) + p·max(0, -x), with h the per-unit holding cost and p the per-unit shortage penalty, or, more generally, any convex function penalizing both excess stock and backlog.
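A minimal sketch of the per-period cost under the piecewise-linear penalty above; the parameter values c, h, and p are assumptions made only for this example.

    def H(x, h=1.0, p=3.0):
        """Assumed piecewise-linear penalty: h per unit held (x > 0),
        p per unit of backlogged shortage (x < 0)."""
        return h * max(0.0, x) + p * max(0.0, -x)

    def g(x, u, w, c=2.0):
        """Cost for period k: purchasing cost c*u plus the penalty H(x + u - w)."""
        return c * u + H(x + u - w)

    print(g(x=2.0, u=5.0, w=4.0))   # order 5 at cost 10, end with 3 units held: 13.0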
6
Example: Inventory Control Problem

Objective: to minimize, in some meaningful sense, the total cost of operation over a finite number of periods (a finite horizon):

total cost over N periods = Σ_{k=0}^{N-1} g(x_k, u_k, w_k)
7
Example: Inventory Control Problem
  • Two distinct situations can arise.
  • Deterministic case: x_0 is perfectly known, and the demands are known in advance to the manager.
  • at k = 0, all future demands w_0, w_1, ..., w_{N-1} are known
  • ⇒ select all orders at once, so as to exactly meet the demand
  • ⇒ x_1 = x_2 = ... = x_{N-1} = 0
  • 0 = x_1 = x_0 + u_0 - w_0
  • ⇒ u_0 = w_0 - x_0 (assume x_0 ≤ w_0, so that u_0 ≥ 0)
  • u_k = w_k, 1 ≤ k ≤ N-1
  • a fixed order schedule (verified in the sketch below)
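A quick sketch (assuming, as above, that all demands are known at k = 0) confirming that the fixed schedule drives every intermediate stock level to zero:

    def fixed_order_schedule(x0, demands):
        """Deterministic case: u_0 = w_0 - x_0 and u_k = w_k for k >= 1.
        Requires x0 <= demands[0] so that u_0 >= 0."""
        return [demands[0] - x0] + list(demands[1:])

    demands = [4, 7, 2, 5]
    u = fixed_order_schedule(3, demands)
    x = 3
    for uk, wk in zip(u, demands):
        x = x + uk - wk
        assert x == 0   # stock returns to zero every period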
8
Example: Inventory Control Problem
  • In the deterministic case, we select a set of fixed actions (numbers, i.e. a precomputed order schedule).
  • Now suppose instead that w_k becomes known only at the beginning of period k (a perfect one-period forecast). Then we must gather information and make decisions sequentially.
  • strategy: a rule for making decisions based on information as it becomes available
9
Stochastic Case
  • Stochastic case: x_0 is perfectly known (we can generalize to the case when only its distribution is known), but {w_k} is a random process.
  • Assume that the w_k are i.i.d., real-valued r.v.'s with pdf f_w (independent of k), i.e.

P_w = probability distribution (or measure) of w: P_w(A) is the probability that w takes a value in the set A,
P_w(A) = ∫_A f_w(w) dw
10
Stochastic Case
  • Note that the stock x_k is now a r.v.
  • Alternatively, we can describe the evolution of the system in terms of a transition law:
  • Prob(x_{k+1} ∈ B | x_k = x, u_k = u)
  • = Prob(x + u - w_k ∈ B)
  • = P_w({w : x + u - w ∈ B})
11
Stochastic Case
  • Also, the cost is a random quantity ⇒ minimize the expected cost E[Σ_{k=0}^{N-1} g(x_k, u_k, w_k)].
  • Action: select all orders (numbers) at k = 0. Most likely not optimal (reduces to a nonlinear programming problem).
  • vs.
  • Strategy: select a sequence of functions μ_0, μ_1, ..., μ_{N-1}
  • s.t. u_k = μ_k(x_k), where x_k is the information available at the kth period.

A difficult problem! The optimization is over a function space.
12
Stochastic Dynamic Program
  • Let π = (μ_0, μ_1, ..., μ_{N-1}) be a control/decision strategy (policy, law)
  • Π = set of all admissible strategies (e.g. those with μ_k(x) ≥ 0)

Then, the stochastic DP problem is:
minimize J_π(x_0) = E[Σ_{k=0}^{N-1} g(x_k, μ_k(x_k), w_k)]
s.t. π ∈ Π
If the problem is feasible, then there exists an optimal strategy π*, i.e.
J_{π*}(x_0) ≤ J_π(x_0) for all π ∈ Π.
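To make the objective J_π concrete, here is a hedged sketch (not from the slides) that estimates J_π(x_0) by Monte Carlo simulation for the inventory example; the demand distribution, cost parameters, and the two candidate policies are all assumptions for illustration.

    import random

    def g(x, u, w, c=2.0, h=1.0, p=3.0):
        """Period cost: purchase c*u plus assumed holding/shortage penalty."""
        y = x + u - w
        return c * u + h * max(0.0, y) + p * max(0.0, -y)

    def J_hat(x0, mu, N, num_runs=10000):
        """Monte Carlo estimate of J_pi(x0) = E[ sum_{k<N} g(x_k, mu_k(x_k), w_k) ]."""
        total = 0.0
        for _ in range(num_runs):
            x, cost = x0, 0.0
            for k in range(N):
                u = mu(k, x)                # u_k = mu_k(x_k)
                w = random.randint(0, 9)    # assumed i.i.d. demand
                cost += g(x, u, w)
                x = x + u - w               # system equation
            total += cost
        return total / num_runs

    # Open-loop constant orders vs. a simple feedback (order-up-to) rule:
    print(J_hat(5.0, lambda k, x: 5.0, N=10))
    print(J_hat(5.0, lambda k, x: max(0.0, 8.0 - x), N=10))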
13
Summary of the Problem

1- Discrete-time stochastic system
system equation: x_{k+1} = max(0, x_k + u_k - w_k)
transition law: Prob(x_{k+1} ∈ B | x_k = x, u_k = u)
Note: no backlogging (hence the max(0, ·) in the system equation)
14
Stochastic Dynamic Program

2- Stochastic element {w_k}, assumed i.i.d. in this example; will generalize to w_k depending on x_k and u_k.
3- Control constraint: u_k ≥ 0;
if there is a maximum storage capacity M,
then 0 ≤ u_k ≤ M - x_k.
4- Additive cost: E[Σ_{k=0}^{N-1} g(x_k, u_k, w_k)]
15
Stochastic Dynamic Program

5- Optimization over admissible strategies π ∈ Π.
We will see later on that this problem has a neat closed-form solution:
μ_k(x) = T_k - x if x < T_k, and μ_k(x) = 0 if x ≥ T_k,
for some threshold levels T_k (a base-stock policy).
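A minimal sketch of such a base-stock rule; the threshold values below are assumed for illustration, not the DP-optimal ones.

    def base_stock_policy(thresholds):
        """mu_k(x) = T_k - x if x < T_k, else 0 (order up to the threshold)."""
        def mu(k, x):
            return max(0.0, thresholds[k] - x)
        return mu

    mu = base_stock_policy([8.0, 8.0, 6.0, 4.0])
    print(mu(0, 3.0))   # stock 3 is below T_0 = 8, so order 5 units
    print(mu(3, 5.0))   # stock 5 is above T_3 = 4, so order nothing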
16
Role of Information: Actions vs. Strategies

Example: let a two-stage problem (N = 2) be given as
[two-stage system equations not recovered from the slide]
where w_0 is a random variable taking the values +1 and -1, each with a given probability.
17
Role of Information: Actions vs. Strategies

Problem A: Choose actions (u_0, u_1) in advance (open loop, a control schedule) to minimize the expected cost.
Equivalently, with N = 2:
minimize [objective not recovered from the slide]
s.t. (*)
18
Role of Information: Actions vs. Strategies

Solution A:
Case (i): [derivation not recovered from the slide]
19
Role of Information: Actions vs. Strategies

Case (ii): [derivation not recovered from the slide]
20
Role of Information: Actions vs. Strategies

[Expression not recovered] can be anything; we then choose the controls appropriately.
No information gathering: we choose (u_0, u_1) at the start and do not take into consideration x_1 at the beginning of stage 1.
21
Role of Information: Actions vs. Strategies

Problem B: Choose u_0 and u_1 sequentially, using the observed value of x_1.
Solution B: from (*), we select u_1 as a function of the observed x_1.
Sequential decision-making, feedback control: to make decision u_1, we wait until the outcome x_1 becomes available, and act accordingly.
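The slide's exact equations were not recovered, so the sketch below uses an assumed stand-in instance to illustrate the same point: a two-stage system x_1 = x_0 + u_0 + w_0, x_2 = x_1 + u_1 with x_0 = 0, cost x_2^2, and w_0 = +1 or -1 with probability 1/2 each. Open loop, the best achievable expected cost is 1; with feedback, the rule u_1 = -x_1 drives it to 0.

    def expected_cost_open_loop(u0, u1):
        """Open loop: (u_0, u_1) are fixed before w_0 is observed."""
        return sum(0.5 * (u0 + w0 + u1) ** 2 for w0 in (+1, -1))

    def expected_cost_feedback(u0, mu1):
        """Feedback: u_1 = mu1(x_1) is chosen after observing x_1."""
        return sum(0.5 * ((u0 + w0) + mu1(u0 + w0)) ** 2 for w0 in (+1, -1))

    # Searching over a grid of open-loop pairs never beats cost 1 ...
    print(min(expected_cost_open_loop(u0, u1)
              for u0 in range(-2, 3) for u1 in range(-2, 3)))   # -> 1.0
    # ... while the feedback strategy u_1 = -x_1 achieves cost 0.
    print(expected_cost_feedback(0, lambda x1: -x1))            # -> 0.0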
22
Role of Information: Actions vs. Strategies

Note: information gathering doesn't always help.
Let w_0 be deterministic (a known constant). Then an open-loop schedule does just as well: we do not gain anything by making decisions sequentially.
23
Discrete-Time Stochastic Dynamic System Model and Optimal Decision/Control Problem

1- Discrete-time stochastic dynamic system (t, k can be time or events):
x_{k+1} = f_k(x_k, u_k, w_k), k = 0, 1, ..., N-1
x_k ∈ S_k = state space at time k
u_k ∈ C_k = control space
w_k ∈ D_k = disturbance space (countable)
Also, depending on the state of the system, there are constraints on the actions that can be taken:
u_k ∈ U_k(x_k) ⊆ C_k, a non-empty subset.
24
Discrete-Time Stochastic Dynamic System Model and Optimal Decision/Control Problem

2- Stochastic disturbance {w_k}:
w_k ~ P_k(· | x_k, u_k), a probability measure (distribution) that may depend explicitly on the time, the current state, and the action, but not on the previous disturbances w_{k-1}, ..., w_0.
25
Discrete-Time Stochastic Dynamic System Model and Optimal Decision/Control Problem

3- Admissible control/decision laws (strategies, policies)
Define the information patterns!
Feasible policies π = (μ_0, ..., μ_{N-1}); Markov policies may be
- deterministic: u_k = μ_k(x_k)
- randomized: u_k drawn from a distribution that depends on x_k
and μ_k(x_k) ∈ U_k(x_k) for all x_k and k  (*)
π is admissible if (*) holds.
26
Discrete-Time Stochastic Dynamic System Model and Optimal Decision/Control Problem

4- Finite-horizon optimal control/decision problem: Given an initial state x_0 and cost functions g_k, k = 0, ..., N-1, find π ∈ Π that minimizes the cost functional
J_π(x_0) = E[Σ_{k=0}^{N-1} g_k(x_k, μ_k(x_k), w_k)]
subject to the system equation constraint
x_{k+1} = f_k(x_k, μ_k(x_k), w_k), k = 0, ..., N-1.
27
Discrete-Time Stochastic Dynamic System Model and Optimal Decision/Control Problem

We say that π* ∈ Π is optimal for the initial state x_0 if
J_{π*}(x_0) ≤ J_π(x_0) for all π ∈ Π.
Optimal N-stage cost (or value) function: J*(x_0) = min_{π ∈ Π} J_π(x_0).
Likewise, for a given ε > 0, π_ε is said to be ε-optimal if
J_{π_ε}(x_0) ≤ J*(x_0) + ε.
28
Discrete-Time Stochastic Dynamic System Model and Optimal Decision/Control Problem

This stochastic optimal control problem is difficult! We are optimizing over strategies.
The Dynamic Programming algorithm will give us necessary and sufficient conditions to decompose this problem into a sequence of coupled minimization problems over actions, from which we will obtain J* and an optimal policy π*.
DP is the only general approach for sequential decision-making under uncertainty.
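As a preview of the algorithm (a sketch under assumed problem data, not the slides' own formulation), here is the backward DP recursion for finite state, control, and disturbance spaces, applied to a small inventory instance; all parameters are illustrative.

    def dp_backward(S, U, W, P, f, g, N):
        """Backward recursion: J_N(x) = 0 and, for k = N-1, ..., 0,
        J_k(x) = min_{u in U(x)} sum_w P[w] * (g(x,u,w) + J_{k+1}(f(x,u,w))).
        Returns the value tables J and a policy table mu."""
        J = [{x: 0.0 for x in S} for _ in range(N + 1)]
        mu = [{} for _ in range(N)]
        for k in range(N - 1, -1, -1):
            for x in S:
                best_u, best_q = None, float("inf")
                for u in U(x):
                    q = sum(P[w] * (g(x, u, w) + J[k + 1][f(x, u, w)]) for w in W)
                    if q < best_q:
                        best_u, best_q = u, q
                J[k][x], mu[k][x] = best_q, best_u
        return J, mu

    # Tiny inventory instance: stock 0..4, capacity 4, demand 0..2, no backlogging.
    S = range(5)
    U = lambda x: range(5 - x)                    # orders cannot exceed capacity
    W, P = (0, 1, 2), {0: 0.3, 1: 0.4, 2: 0.3}
    f = lambda x, u, w: max(0, x + u - w)         # excess demand is lost
    g = lambda x, u, w: 2 * u + max(0, x + u - w) + 3 * max(0, w - x - u)
    J, mu = dp_backward(S, U, W, P, f, g, N=3)
    print(J[0][2], mu[0])                         # optimal cost and stage-0 policy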
29
Alternative System Description
Given a dynamic description of a system via a system equation
x_{k+1} = f_k(x_k, u_k, w_k),
we can alternatively describe the system via a transition law.
30
Alternative System Description
Given x_k and u_k, x_{k+1} has the distribution
P(x_{k+1} ∈ B | x_k, u_k) = P_k({w : f_k(x_k, u_k, w) ∈ B} | x_k, u_k)
⇒ system equation ⇔ system transition law
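A short sketch of this equivalence in one direction (with assumed illustrative data): starting from a system equation x' = f(x, u, w) and the disturbance distribution, we can tabulate the transition law by summing the probabilities of all disturbances that lead to the same next state.

    from collections import defaultdict

    def transition_law(f, W, P):
        """prob(x' | x, u) = sum of P[w] over all w with f(x, u, w) = x'."""
        def law(x, u):
            dist = defaultdict(float)
            for w in W:
                dist[f(x, u, w)] += P[w]
            return dict(dist)
        return law

    # Inventory example with no backlogging:
    f = lambda x, u, w: max(0, x + u - w)
    law = transition_law(f, W=(0, 1, 2), P={0: 0.3, 1: 0.4, 2: 0.3})
    print(law(1, 1))   # {2: 0.3, 1: 0.4, 0: 0.3}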