Markov Decision Processes (MDPs)

1
Markov Decision Processes (MDPs)
  • read Ch. 17.1-17.2
  • utility-based agents
  • goals encoded in a utility function U(s), i.e. U: S → ℝ
  • effects of actions encoded in a state transition
    function T: S × A → S
  • or T: S × A → pdf(S) for non-deterministic effects
  • rewards/costs encoded in a reward function R: S × A → ℝ
  • Markov property: the effects of actions depend only on the
    current state, not on the previous history
    (a toy encoding of these ingredients is sketched below)
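
Below is a minimal Python sketch of these ingredients for a toy two-state
problem; the names (states, actions, T, R, gamma) and the numbers are
illustrative assumptions, not taken from the slides.

    # Toy MDP: states, actions, non-deterministic transitions, rewards.
    states = ["s0", "s1"]
    actions = ["stay", "go"]
    gamma = 0.9  # discount factor (used on the next slide)

    # T: S x A -> pdf(S), stored as {(s, a): {s': probability}}
    T = {
        ("s0", "stay"): {"s0": 0.9, "s1": 0.1},
        ("s0", "go"):   {"s0": 0.2, "s1": 0.8},
        ("s1", "stay"): {"s1": 1.0},
        ("s1", "go"):   {"s0": 0.5, "s1": 0.5},
    }

    # R: S x A -> real-valued reward
    R = {
        ("s0", "stay"): 0.0,
        ("s0", "go"):  -1.0,
        ("s1", "stay"): 1.0,
        ("s1", "go"):   0.0,
    }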

2
  • the goal: maximize reward over time
  • long-term discounted reward: Σ_t γ^t r_t, with
    discount factor 0 ≤ γ < 1
  • discounting handles infinite horizons and encourages
    quicker achievement of reward
  • plans are encoded in policies
  • mappings from states to actions, π: S → A
  • how to compute the optimal policy π* that maximizes
    long-term discounted reward? (a small example of the
    discounted return follows)
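
As a small illustration of the objective, here is a sketch of the long-term
discounted reward for one sequence of rewards, plus a policy written as a
plain state-to-action mapping; both are illustrative, not from the slides.

    def discounted_return(rewards, gamma=0.9):
        """Long-term discounted reward: sum over t of gamma**t * r_t."""
        return sum((gamma ** t) * r for t, r in enumerate(rewards))

    # A policy pi: S -> A is just a mapping from states to actions.
    pi = {"s0": "go", "s1": "stay"}

    print(discounted_return([0.0, -1.0, 1.0, 1.0, 1.0]))  # return of one trajectory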

3
(No Transcript)
4
  • value function V^π(s): expected long-term reward from
    starting in state s and following policy π
  • derive the policy from V(s)
  • π(s) = argmax_{a ∈ A} E[ R(s,a) + γ V(T(s, a)) ]
  •      = argmax_{a ∈ A} Σ_{s'} P(s' | s, a) (R(s,a) + γ V(s'))
    for non-deterministic transitions
  • the optimal policy comes from the optimal value function:
    π*(s) = argmax_{a ∈ A} Σ_{s'} P(s' | s, a) V*(s')
    (a greedy-extraction sketch follows)
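
The greedy extraction of a policy from a value function can be sketched as
follows, assuming the toy MDP dictionaries (states, actions, T, R) from the
earlier sketch and some value table V; the function name is hypothetical.

    def greedy_policy(V, states, actions, T, R, gamma=0.9):
        """pi(s) = argmax_a sum_{s'} P(s'|s,a) * (R(s,a) + gamma * V(s'))"""
        pi = {}
        for s in states:
            def q(a):
                # expected one-step reward plus discounted value of successors
                return sum(p * (R[(s, a)] + gamma * V[sp])
                           for sp, p in T[(s, a)].items())
            pi[s] = max(actions, key=q)
        return pi

    # e.g. pi_star = greedy_policy(V_star, states, actions, T, R)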


5
Calculating V(s)
  • Bellman's equations
  • V(s) = max_{a ∈ A} [ R(s,a) + γ Σ_{s'} P(s' | s, a) V(s') ]
    (eqn 17.5)
  • method 1: linear programming
  • n coupled equations, one per state
  • v1 = max(v2, v3, v4, ...)
  • v2 = max(v1, v3, v4, ...)
  • v3 = max(v1, v2, v4, ...)
  • solve for v1, v2, v3, ... using the GNU Linear Programming
    Kit (GLPK), etc. (an LP sketch follows)
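
One way to realize the linear-programming route is the standard LP form of the
Bellman equations: minimize the sum of the values subject to
V(s) ≥ R(s,a) + γ Σ_{s'} P(s'|s,a) V(s') for every state-action pair. The
sketch below uses SciPy's linprog purely for illustration (GLPK or any LP
solver would do) and assumes the toy MDP defined earlier.

    import numpy as np
    from scipy.optimize import linprog

    n = len(states)
    idx = {s: i for i, s in enumerate(states)}

    # Constraints: gamma * sum_s' P(s'|s,a) V(s') - V(s) <= -R(s,a)  for all (s, a)
    A_ub, b_ub = [], []
    for s in states:
        for a in actions:
            row = np.zeros(n)
            row[idx[s]] -= 1.0
            for sp, p in T[(s, a)].items():
                row[idx[sp]] += gamma * p
            A_ub.append(row)
            b_ub.append(-R[(s, a)])

    # Objective: minimize sum_s V(s); values may be negative, so free bounds.
    res = linprog(c=np.ones(n), A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  bounds=[(None, None)] * n)
    V_star = dict(zip(states, res.x))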

6
  • method 2: Value Iteration
  • initialize V(s) = 0 for all states
  • iteratively update the value of each state from the
    values of its successor states
  • ...until convergence (sketched below)
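
A minimal value-iteration sketch, again over the assumed toy MDP dictionaries
from earlier (the function name and tolerance are illustrative):

    def value_iteration(states, actions, T, R, gamma=0.9, eps=1e-6):
        V = {s: 0.0 for s in states}      # initialize V(s) = 0 for all states
        while True:
            delta = 0.0
            for s in states:
                # Bellman update: V(s) <- max_a [ R(s,a) + gamma * sum_s' P(s'|s,a) V(s') ]
                best = max(
                    R[(s, a)] + gamma * sum(p * V[sp] for sp, p in T[(s, a)].items())
                    for a in actions
                )
                delta = max(delta, abs(best - V[s]))
                V[s] = best
            if delta < eps:               # ...until convergence
                return V

    # e.g. V_star = value_iteration(states, actions, T, R)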