Title: Making Simple Decisions
1. Making Simple Decisions
Some material borrowed from Jean-Claude Latombe and Daphne Koller, by way of Marie desJardins.
2. Topics
- Decision making under uncertainty
- Utility theory and rationality
- Expected utility
- Utility functions
- Multiattribute utility functions
- Preference structures
- Decision networks
- Value of information
3. Uncertain Outcomes of Actions
- Some actions may have uncertain outcomes
- Action: spend $10 to buy a lottery ticket that pays $1000 to the winner
- Outcomes: win, not-win
- Each outcome is associated with some merit (utility)
  - Win: gain $990
  - Not-win: lose $10
- There is a probability distribution associated with the outcomes of this action: (0.0001, 0.9999)
- Should I take this action?
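Plugging these numbers into the expected-utility formula (next slide) gives a quick check, assuming utility is linear in money:

EU(buy) = 0.0001 × 990 + 0.9999 × (−10) = 0.099 − 9.999 = −9.90

So under the MEU principle the agent should not buy the ticket.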
4. Expected Utility
- Random variable X with n values x1, …, xn and distribution (p1, …, pn)
- X is the outcome of performing action A (i.e., the state reached after A is taken)
- Function U of X
  - U is a mapping from states to numerical utilities (values)
- The expected utility of performing action A is
  EU(A) = Σ_{i=1..n} p(x_i | A) U(x_i)
  where U(x_i) is the utility of each outcome and p(x_i | A) is the probability of each outcome
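As a minimal sketch (not from the slides), the formula is a one-line dot product in Python; the function name is illustrative:

def expected_utility(outcome_probs, outcome_utils):
    """Expected utility of an action: sum of p(x_i | A) * U(x_i)."""
    return sum(p * u for p, u in zip(outcome_probs, outcome_utils))

# Lottery example from slide 3: win with p = 0.0001 (gain 990), else lose 10.
print(expected_utility([0.0001, 0.9999], [990, -10]))  # about -9.9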
5. One State/One Action Example
U(S0 | A1) = 100 × 0.2 + 50 × 0.7 + 70 × 0.1
           = 20 + 35 + 7
           = 62
6. One State/Two Actions Example
- U1(S0 | A1) = 62
- U2(S0 | A2) = 74
- U(S0) = max{U1(S0 | A1), U2(S0 | A2)} = 74
7. Introducing Action Costs
- U1(S0 | A1) = 62 − 5 = 57
- U2(S0 | A2) = 74 − 25 = 49
- U(S0) = max{U1(S0 | A1), U2(S0 | A2)} = 57
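A tiny sketch (names illustrative) of the same pick-the-best-action computation:

# Expected utilities from slide 6 and action costs from this slide.
eu   = {"A1": 62, "A2": 74}
cost = {"A1": 5,  "A2": 25}

def best_action(eu, cost):
    """Pick the action maximizing expected utility minus action cost."""
    net = {a: eu[a] - cost[a] for a in eu}
    best = max(net, key=net.get)
    return best, net[best]

print(best_action(eu, cost))  # -> ('A1', 57)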
8. MEU Principle
- Decision theory: a rational agent should choose the action that maximizes the agent's expected utility
- Maximizing expected utility (MEU) is a normative criterion for rational choice of actions
- Must have a complete model of:
- Actions
- States
- Utilities
- Even with a complete model, computing the MEU action exactly may be computationally intractable
9. Comparing outcomes
- Which is better?
  A: Being rich and sunbathing where it's warm
  B: Being rich and sunbathing where it's cool
  C: Being poor and sunbathing where it's warm
  D: Being poor and sunbathing where it's cool
- Multiattribute utility theory
  - A clearly dominates B: A > B. Also A > C, C > D, and A > D. What about B vs. C?
  - Simplest case: additive value function (just add the individual attribute utilities)
  - Others use weighted utilities, based on the relative importance of these attributes (a sketch follows below)
  - Learning the combined utility function (similar to a joint probability table)
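A minimal sketch of the additive and weighted cases; the attribute utilities and weights below are made-up illustrations, not values from the slides:

# Illustrative attribute utilities for the outcomes A-D above.
u_wealth  = {"rich": 10, "poor": 0}
u_climate = {"warm": 5,  "cool": 0}

def additive_value(wealth, climate):
    """Simplest case: add the individual attribute utilities."""
    return u_wealth[wealth] + u_climate[climate]

def weighted_value(wealth, climate, w_wealth=0.8, w_climate=0.2):
    """Weight each attribute by its relative importance."""
    return w_wealth * u_wealth[wealth] + w_climate * u_climate[climate]

# B vs. C (rich+cool vs. poor+warm) depends on the chosen utilities/weights:
print(additive_value("rich", "cool"), additive_value("poor", "warm"))  # 10 5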
10. Decision networks
- Extend Bayesian nets to handle actions and utilities
- a.k.a. influence diagrams
- Make use of Bayesian net inference
- Useful application: Value of Information
11. Decision network representation
- Chance nodes: random variables, as in Bayesian nets
- Decision nodes: actions that the decision maker can take
- Utility/value nodes: the utility of the outcome state
12. R&N example
(Figure: decision network example from Russell & Norvig; not reproduced here.)
13. Evaluating decision networks
- Set the evidence variables for the current state.
- For each possible value of the decision node (assume just one):
  - Set the decision node to that value.
  - Calculate the posterior probabilities for the parent nodes of the utility node, using BN inference.
  - Calculate the resulting utility for the action.
- Return the action with the highest utility (see the sketch below).
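A sketch of this procedure in Python, assuming a hypothetical bn_infer helper that performs the Bayesian-net inference step:

def evaluate_decision_network(decision_values, evidence, bn_infer, utility):
    """Return (best action, expected utility) for a single decision node.

    bn_infer(evidence) -> {outcome: probability} over the utility node's parents;
    utility(outcome)   -> numeric utility of that outcome.
    """
    best, best_eu = None, float("-inf")
    for action in decision_values:
        # Set the decision node to this value, keeping the evidence fixed.
        posterior = bn_infer({**evidence, "decision": action})
        eu = sum(p * utility(outcome) for outcome, p in posterior.items())
        if eu > best_eu:
            best, best_eu = action, eu
    return best, best_eu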
14. Exercise: Umbrella network

(Figure: decision network with decision node Umbrella (take / don't take), chance nodes Weather, Forecast, and Lug umbrella, and utility node Happiness.)

P(rain) = 0.4
P(lug | take) = 1.0, P(¬lug | ¬take) = 1.0

Forecast model p(f | w):
  f      | w       | p(f | w)
  sunny  | rain    | 0.3
  rainy  | rain    | 0.7
  sunny  | no rain | 0.8
  rainy  | no rain | 0.2

Utilities:
  U(lug, rain)   = −25
  U(lug, ¬rain)  = 0
  U(¬lug, rain)  = −100
  U(¬lug, ¬rain) = 100
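As a worked check before observing any forecast (only the prior P(rain) = 0.4 is used):

EU(take)  = 0.4 × U(lug, rain)  + 0.6 × U(lug, ¬rain)  = 0.4 × (−25)  + 0.6 × 0   = −10
EU(¬take) = 0.4 × U(¬lug, rain) + 0.6 × U(¬lug, ¬rain) = 0.4 × (−100) + 0.6 × 100 = 20

So with no forecast, the MEU choice is to leave the umbrella at home, with expected utility 20.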
15. Value of Perfect Information (VPI)
- How much is it worth to observe (with certainty) a random variable X?
- Suppose the agent's current knowledge is E. The value of the current best action α is
  EU(α | E) = max_A Σ_i U(Result_i(A)) p(Result_i(A) | E, Do(A))
- The value of the new best action after observing the value of X is
  EU(α_x | E, X = x) = max_A Σ_i U(Result_i(A)) p(Result_i(A) | E, X = x, Do(A))
- But we don't know the value of X yet, so we have to sum over its possible values
- The value of perfect information for X is therefore
  VPI(X) = (Σ_k p(x_k | E) EU(α_{x_k} | E, X = x_k)) − EU(α | E)
  where EU(α | E) is the expected utility of the best action if we don't know X (i.e., currently), EU(α_{x_k} | E, X = x_k) is the expected utility of the best action given that value of X, and p(x_k | E) is the probability of each value of X
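A sketch of this formula in Python, specialized to the umbrella network from the slide 14 exercise (helper names are illustrative; P(rain | forecast) comes from Bayes' rule on the forecast CPT):

P_RAIN = 0.4
U = {("lug", "rain"): -25, ("lug", "no rain"): 0,
     ("no lug", "rain"): -100, ("no lug", "no rain"): 100}
P_F_GIVEN_W = {("sunny", "rain"): 0.3, ("rainy", "rain"): 0.7,
               ("sunny", "no rain"): 0.8, ("rainy", "no rain"): 0.2}

def best_eu(p_rain):
    """Expected utility of the best action (taking implies lugging) given P(rain)."""
    eu_take = p_rain * U[("lug", "rain")] + (1 - p_rain) * U[("lug", "no rain")]
    eu_skip = p_rain * U[("no lug", "rain")] + (1 - p_rain) * U[("no lug", "no rain")]
    return max(eu_take, eu_skip)

def vpi_forecast():
    vpi = -best_eu(P_RAIN)  # subtract EU of the best action without the forecast
    for f in ("sunny", "rainy"):
        # p(f) = sum over w of p(f | w) p(w)
        p_f = sum(P_F_GIVEN_W[(f, w)] * (P_RAIN if w == "rain" else 1 - P_RAIN)
                  for w in ("rain", "no rain"))
        p_rain_given_f = P_F_GIVEN_W[(f, "rain")] * P_RAIN / p_f  # Bayes' rule
        vpi += p_f * best_eu(p_rain_given_f)
    return vpi

print(vpi_forecast())  # about 9.0 under these assumptions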
16. VPI exercise: Umbrella network
What's the value of knowing the weather forecast before leaving home?
(Same umbrella network, CPTs, and utilities as in the slide 14 exercise.)