Title: Imprecise probabilities in engineering design

1. Imprecise probabilities in engineering design
Scott Ferson, Applied Biomathematics (scott_at_ramas.com)
Workshop on Uncertainty Representation in Robust and Reliability-based Design
ASME DETC/CIE, Philadelphia, 10 September 2006
2. Imprecise probabilities (IP)
- Credal set (of possible probability measures)
- Relaxes the idea of a single probability measure
- Coherent upper and lower previsions
- de Finetti's notion of a fair price
- Generalizes probability and expectation
- Gambles
3. Three pillars of IP
- Behavioral definition of probability
- Can be operationalized
- Natural extension
- Linear programming to compute answers
- Rationality criteria
- Avoiding sure losses (Dutch books)
- Coherence (logical closure)

Avoiding sure loss (ASL) means you cannot be made into a money pump; inverted interval bounds on a probability would violate ASL. Coherence means fully recognizing the implications of your betting rates; P(A ∪ B) > P(A) + P(B), for disjoint A and B, would violate coherence.
4. Probability of an event
- Imagine a gamble that pays one dollar if an event occurs (but nothing otherwise)
- How much would you pay to buy this gamble?
- How much would you be willing to sell it for?
- Probability theory requires the same price for both
- By asserting the probability of the event, you agree to buy any such gamble offered for this amount or less, and to sell the same gamble for any amount greater than or equal to this fair price, and for every event!
- IP just says that, sometimes, your highest buying price might be smaller than your lowest selling price
5. Credal set
- Knowledge and judgments are used to define a set of possible probability measures M
- All distributions within bounds are possible
- Only distributions having a given shape
- Probability of an event is within some interval
- Event A is at least as probable as event B
- Nothing is known about the probability of C
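For a finite set of candidate probability measures, the lower and upper probabilities of an event are just minima and maxima over the credal set. A toy sketch (the three candidate measures below are invented for illustration):

```python
# Lower and upper probability of an event over a small credal set.
# Outcomes are indexed 0..2; each measure is a probability vector over them.
credal_set = [
    (0.2, 0.5, 0.3),   # hypothetical candidate measures
    (0.4, 0.4, 0.2),
    (0.1, 0.6, 0.3),
]

def lower_upper(event):
    """event is a set of outcome indices; returns (lower, upper) probability."""
    probs = [sum(p[i] for i in event) for p in credal_set]
    return min(probs), max(probs)

lo, hi = lower_upper({0, 1})
print(round(lo, 3), round(hi, 3))   # → 0.7 0.8
```

If the event's probability were precise, lo and hi would coincide; the gap between them is exactly the imprecision the credal set expresses.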
6. IP generalizes other approaches
- Probability theory
- Bayesian analysis
- Worst-case analysis, info-gap theory
- Possibility / necessity models
- Dempster-Shafer theory, belief / plausibility functions
- Probability intervals, probability bounds analysis
- Lower/upper mass/density functions
- Robust Bayes, Bayesian sensitivity analysis
- Random set models
- Coherent lower previsions

Related structures: de Finetti probability measures, credal sets, distributions with interval-valued parameters, contamination models, Choquet capacities, 2-monotone capacities
7. Assumptions
- Everyone makes assumptions
- But not all sets of assumptions are equal!
- Linear / Gaussian / Independent
- Monotonic / Unimodal / Known correlation sign
- Any function / Any distribution / Any dependence
- IP doesn't require unwarranted assumptions
- "Certainties lead to doubt; doubts lead to certainty"
8. Activities in engineering design
- Decision making
- Optimization
- Constraint propagation
- Convolutions
- Arithmetic
- Logic (event trees)
- Updating
- Validation
- Sensitivity analyses

[Slide annotations mark how frequently each activity arises: "often", "sometimes", "a lot"]
9. Convolutions (i.e., adding, multiplying, and-gating, or-gating, etc., for quantifying the reliability or risk associated with a design)
10. Probability boxes (p-boxes)
Interval bounds on a cumulative distribution function (CDF)

[Figure: a p-box; vertical axis "Cumulative probability" from 0 to 1, horizontal axis X from 0.0 to 3.0]
11. A few ways p-boxes arise
[Figure: CDF plots (vertical axis 0 to 1), including a precise distribution]
12. P-box arithmetic (and logic)
- All standard mathematical operations
- Arithmetic operations (+, −, ×, ÷, ^, min, max)
- Logical operations (and, or, not, if, etc.)
- Transformations (exp, ln, sin, tan, abs, sqrt, etc.)
- Other operations (envelope, mixture, etc.)
- Faster than Monte Carlo
- Guaranteed to bound the answer
- Optimal answers generally require LP
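A minimal sketch of how such bounds can be computed for the sum of two independent uncertain numbers: represent each p-box by equally weighted atoms discretizing its two bounding CDFs, and convolve the bounds. This is an illustrative discretization, not the outward-rounded algorithm production tools use.

```python
# Sketch: bounds on the CDF of Z = X + Y under independence.
# A p-box is a dict of two lists of equally weighted atoms:
# "hi" discretizes the upper (left) bounding CDF, "lo" the lower (right) one.
from itertools import product

def independent_sum(x, y):
    """Pairwise sums of atoms bound the sum's CDF (illustrative, not rigorous)."""
    return {
        "hi": sorted(a + b for a, b in product(x["hi"], y["hi"])),
        "lo": sorted(a + b for a, b in product(x["lo"], y["lo"])),
    }

def cdf_bounds(pbox, z):
    """Bounds on P(Z <= z): fraction of atoms at or below z in each bound."""
    lo = sum(v <= z for v in pbox["lo"]) / len(pbox["lo"])
    hi = sum(v <= z for v in pbox["hi"]) / len(pbox["hi"])
    return lo, hi

# X is only known to lie in [1, 2]; Y only in [3, 4] (interval p-boxes)
X = {"hi": [1.0, 1.0], "lo": [2.0, 2.0]}
Y = {"hi": [3.0, 3.0], "lo": [4.0, 4.0]}
Z = independent_sum(X, Y)
print(cdf_bounds(Z, 5.0))   # → (0.0, 1.0): P(Z <= 5) is only known to lie in [0, 1]
```

When both inputs are precise distributions the two bounds coincide and the result is the ordinary convolution, which is the "generalizes probability theory" behavior described later in the talk.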
13. Example
- Calculate A + B + C + D, with partial information:
- A's distribution is known, but not its parameters
- B's parameters are known, but not its shape
- C has a small empirical data set
- D is known to be a precise distribution
- Bounds assuming independence?
- Without any assumption about dependence?
14. Example (continued)
- A = lognormal, mean [0.5, 0.6], variance [0.001, 0.01]
- B: min = 0, max = 0.5, mode = 0.3
- C: sample data = 0.2, 0.5, 0.6, 0.7, 0.75, 0.8
- D = uniform(0, 1)

[Figure: four panels showing the p-boxes for A, B, C, and D as CDF bounds, each on roughly the unit interval]
15. A + B + C + D
[Figure: resulting p-box under independence; cumulative probability 0 to 1, horizontal axis 0.0 to 3.0]
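One common way p-boxes like A's arise is by enveloping the CDFs of a parametric family over its interval-valued parameters. A rough sketch of that idea (using a normal family rather than the slide's lognormal, and a coarse parameter grid, so this is an inner approximation rather than guaranteed bounds):

```python
# Envelope of a parametric CDF family over interval-valued parameters.
import math

def normal_cdf(x, mu, sigma):
    """Normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def envelope_cdf(x, mus, sigmas):
    """Pointwise min/max of the CDFs over a grid of parameter values."""
    vals = [normal_cdf(x, m, s) for m in mus for s in sigmas]
    return min(vals), max(vals)   # (lower CDF bound, upper CDF bound)

# hypothetical interval parameters: mean in [5, 6], sd in [1, 2]
mus = [5.0, 5.5, 6.0]
sigmas = [1.0, 1.5, 2.0]
lo, hi = envelope_cdf(5.5, mus, sigmas)
print(lo, hi)   # the CDF value at 5.5 is only pinned down to [lo, hi]
```

A rigorous p-box would use the exact parameter intervals (here monotonicity in the mean makes the grid endpoints sufficient) and round the bounds outward.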
16. Generalization of methods
- Marries interval analysis with probability theory
- When information is abundant, same as probability theory
- When inputs are only ranges, agrees with interval analysis
- Can't get these answers from Monte Carlo methods
- Fewer assumptions
- Not just different assumptions
- Distribution-free methods
- Rigorous results
- Automatically verified calculations
- Built-in quality assurance
17. Can uncertainty swamp the answer?
- Sure, if uncertainty is huge
- This should happen (it's not unhelpful)
- If you think the bounds are too wide, then put in whatever information is missing
- If there isn't any such information, do you want the results to mislead?
18. Decision making
19. Knight's dichotomy
- Decisions under risk
- The probabilities of various outcomes are known
- Maximize expected utility
- Not good for big unique decisions or when gambler's ruin is possible
- Decisions under uncertainty
- Probabilities of the outcomes are unknown
- Several strategies, depending on the analyst
20. Decisions under uncertainty
- Pareto (some strategy dominates in all scenarios)
- Maximin (largest minimum payoff)
- Maximax (largest maximum payoff)
- Hurwicz (largest average of min and max payoffs)
- Minimax regret (smallest maximum regret)
- Bayes-Laplace (maximum expected payoff assuming scenarios are equiprobable)
21. Decision making in IP
- State of the world is a random variable, X ∈ 𝒳
- Outcome (reward) of an action depends on X
- We identify an action a with its reward fa : 𝒳 → ℝ
- In principle, we'd like to choose the decision with the largest expected reward, but how do we do this?
- We explore how the decision changes for different probability measures in M, the set of possible ones
22. Comparing actions a and b
- Strictly preferred: a > b iff Ep(fa) > Ep(fb) for all p ∈ M
- Almost preferred: a ≥ b iff Ep(fa) ≥ Ep(fb) for all p ∈ M
- Indifferent: a ≡ b iff Ep(fa) = Ep(fb) for all p ∈ M
- Incomparable: a || b iff Ep(fa) < Ep(fb) and Eq(fa) > Eq(fb) for some p, q ∈ M
- where Ep(f) = Σx∈𝒳 p(x) f(x), and M is the set of possible probability distributions
23. E-admissibility
- Vary p in M and, assuming it is the correct probability measure, see which decision emerges as the one that maximizes expected utility
- The result is the set of all such decisions for all p ∈ M
24. Alternative: maximality
- Maximal decisions are undominated: for every action b, Ep(fa) ≥ Ep(fb) for some p ∈ M
- Actions cannot be linearly ordered, but only partially ordered
25. Another alternative: Γ-maximin
- We could take the decision that maximizes the worst-case expected reward
- Essentially a worst-case optimization
- Generalizes two criteria from traditional theory
- Maximize expected utility
- Maximin
26. Several IP decision criteria
- Γ-maximax
- Γ-maximin
- E-admissible
- maximal
- interval dominance
27. Example (due to Troffaes 2004)
- Suppose we are betting on a coin toss
- Only know probability of heads ∈ [0.28, 0.7]
- Want to decide among six available gambles
- 1: Pays 4 for heads, pays 0 for tails
- 2: Pays 0 for heads, pays 4 for tails
- 3: Pays 3 for heads, pays 2 for tails
- 4: Pays ½ for heads, pays 3 for tails
- 5: Pays 2.35 for heads, pays 2.35 for tails
- 6: Pays 4.1 for heads, pays −0.3 for tails

f1(H) = 4, f1(T) = 0; f2(H) = 0, f2(T) = 4; f3(H) = 3, f3(T) = 2; f4(H) = ½, f4(T) = 3; f5(H) = 2.35, f5(T) = 2.35; f6(H) = 4.1, f6(T) = −0.3
28. E-admissibility
- M is a one-dimensional space of probability measures
- p(H) < 2/5: 2
- p(H) = 2/5: 2, 3 (indifferent)
- 2/5 < p(H) < 2/3: 3
- p(H) = 2/3: 1, 3 (indifferent)
- 2/3 < p(H): 1
29. Criteria yield different answers
- Γ-maximax: 2
- Γ-maximin: 5
- E-admissible: 1, 2, 3
- maximal: 1, 2, 3, 5
- interval dominance: 1, 2, 3, 5, 6
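These answers can be checked numerically. Because the expected reward Ep(f) = p·f(H) + (1−p)·f(T) is linear in p = p(H), every expectation over p ∈ [0.28, 0.7] attains its extremes at the interval endpoints, which makes each criterion a few lines of code:

```python
# Verify the coin-toss example's answers for five IP decision criteria.
gambles = {1: (4, 0), 2: (0, 4), 3: (3, 2), 4: (0.5, 3),
           5: (2.35, 2.35), 6: (4.1, -0.3)}
p_lo, p_hi = 0.28, 0.7

def E(g, p):
    h, t = gambles[g]
    return p * h + (1 - p) * t

lower = {g: min(E(g, p_lo), E(g, p_hi)) for g in gambles}
upper = {g: max(E(g, p_lo), E(g, p_hi)) for g in gambles}

gamma_maximin = max(gambles, key=lambda g: lower[g])   # best worst case
gamma_maximax = max(gambles, key=lambda g: upper[g])   # best best case

# Interval dominance: discard g if some gamble's lower bound beats g's upper bound
interval_dom = sorted(g for g in gambles if upper[g] >= max(lower.values()))

# Maximality: g survives unless some b has E_p(f_b) > E_p(f_g) for ALL p
# (by linearity it suffices to check both endpoints)
def dominated(g):
    return any(E(b, p_lo) > E(g, p_lo) and E(b, p_hi) > E(g, p_hi)
               for b in gambles)
maximal = sorted(g for g in gambles if not dominated(g))

# E-admissibility: which gambles maximize E_p for some p in [0.28, 0.7]?
grid = [p_lo + k * (p_hi - p_lo) / 200 for k in range(201)]
e_admissible = sorted({max(gambles, key=lambda g: E(g, p)) for p in grid})

print(gamma_maximin, gamma_maximax, e_admissible, maximal, interval_dom)
# → 5 2 [1, 2, 3] [1, 2, 3, 5] [1, 2, 3, 5, 6]
```

The output reproduces the slide's table exactly, including gamble 6 surviving interval dominance but failing maximality (gamble 1 beats it for every p in the interval).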
30. So many answers
- Topic of current discussion and research
- Different criteria are useful in different settings
- The more precise the input, the tighter the outputs
- Γ criteria usually yield only one decision
- Γ criteria not good if many sequential decisions
- Some argue that E-admissibility is best overall
- Maximality is close to E-admissibility, but much easier to compute, especially for large problems
31. IP versus traditional approaches
- Decisions under IP allow indecision when your uncertainty entails it
- Bayes always produces a single decision (up to indifference), no matter how little information may be available
- IP unifies the two poles of Knight's division into a continuum
32. Comparison to Bayesian approach
- Axioms identical except IP doesn't use completeness
- Bayesian rationality implies not only avoidance of sure loss and coherence, but also the idea that an agent must agree to buy or sell any bet at one price
- Uncertainty about probability is meaningful, and it's operationalized as the difference between the max buying price and min selling price
- If you know all the probabilities (and utilities) perfectly, then IP reduces to Bayes
33. Why Bayes fares poorly
- Bayesian approaches don't distinguish ignorance from equiprobability
- Neuroimaging and clinical psychology show humans strongly distinguish uncertainty from risk
- Most humans regularly and strongly deviate from Bayes
- Hsu et al. (2005) reported that people with brain lesions at the site believed to handle uncertainty behave according to the Bayesian normative rules
- Bayesians are too sure of themselves (e.g., Clippy)
34. Robust Bayes
35. Derivation of Bayes' rule
- P(A ∩ B) = P(B) P(A|B) = P(A) P(B|A)
- P(A|B) = P(A) P(B|A) / P(B)
- The prevalence of a disease in the general population is 0.01%.
- If a diseased person is tested, there's a 99.9% chance the test is positive.
- If a healthy person is tested, there's a 99.99% chance the test is negative.
- If you test positive, what's the chance you have the disease?

Almost all doctors say 99% or greater, but the true answer is about 50%.
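The arithmetic behind the 50% answer is a direct application of the rule above (note that the prevalence must be read as 0.01%, i.e. 0.0001, for the slide's numbers to work out):

```python
# Posterior probability of disease given a positive test, via Bayes' rule.
prevalence = 0.0001      # P(disease) = 0.01%
sensitivity = 0.999      # P(positive | disease)
specificity = 0.9999     # P(negative | healthy)

# Total probability of a positive test: true positives plus false positives.
p_positive = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)

# Bayes' rule: P(disease | positive) = P(positive | disease) P(disease) / P(positive)
posterior = sensitivity * prevalence / p_positive
print(round(posterior, 3))   # → 0.5
```

The intuition: at this prevalence, the rare false positives among the huge healthy population are about as numerous as the true positives among the tiny diseased one.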
36. Bayes' rule on distributions
- posterior ∝ prior × likelihood

[Figure: prior and likelihood curves combining into a normalized posterior]
37. Two main problems
- Subjectivity required
- Beliefs needed for priors may be inconsistent with public policy/decision making
- Inadequate model of ignorance
- Doesn't distinguish between ignorance and equiprobability
38. Solution: study robustness
- Answer is robust if it doesn't depend sensitively on the assumptions and inputs
- Robust Bayes analysis, also called Bayesian sensitivity analysis, investigates this
39. Uncertainty about the prior
- class of prior distributions → class of posteriors

[Figure: a family of priors and a fixed likelihood yielding a family of posteriors]
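A toy illustration of the idea, using a class of conjugate Beta priors for a binomial proportion (the prior class and data below are invented): each prior in the class yields its own posterior, so a posterior summary such as the mean is only known to an interval.

```python
# Robust Bayes sketch: a class of Beta(a, b) priors for a binomial proportion.
# With k successes in n trials, the posterior is Beta(a + k, b + n - k),
# whose mean is (a + k) / (a + b + n).

k, n = 7, 10   # hypothetical data: 7 successes in 10 trials

# hypothetical prior class: all Beta(a, b) with a and b ranging over [1, 5]
prior_class = [(a, b) for a in (1, 2, 3, 4, 5) for b in (1, 2, 3, 4, 5)]

post_means = [(a + k) / (a + b + n) for a, b in prior_class]
print(min(post_means), max(post_means))   # → 0.5 0.75
```

A single prior would report one posterior mean; the class reports the interval [0.5, 0.75], and the answer is robust only if that interval is narrow enough for the decision at hand.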
40. Uncertainty about the likelihood
- class of likelihood functions → class of posteriors

[Figure: a family of likelihoods and a fixed prior yielding a family of posteriors]
41. Uncertainty about both
[Figure: classes of priors and likelihoods yielding a class of posteriors]
42. Uncertainty about decisions
- class of probability models → class of decisions
- class of utility functions → class of decisions
- If you end up with a single decision, great.
- If the class of decisions is large and diverse, then any conclusion should be rather tentative.
43. Bayesian dogma of ideal precision
- Robust Bayes is inconsistent with the Bayesian idea that uncertainty should be measured by a single additive probability measure and values should always be measured by a precise utility function.
- Some Bayesians justify it as a convenience
- Others suggest it accounts for uncertainty beyond probability theory
44. Sensitivity analysis
45. Sensitivity analysis with p-boxes
- Local sensitivity via derivatives
- Explored macroscopically over the uncertainty in the input
- Describes the ensemble of tangent slopes to the function over the range of uncertainty
46. [Figure: two panels, "Monotone function" and "Nonlinear function", each showing the range of input mapped through the function]
47. Sensitivity analysis of p-boxes
- Quantifies the reduction in uncertainty of a result when an input is pinched
- Pinching means hypothetically replacing an input by a less uncertain characterization
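A bare-bones version of the idea, with intervals standing in for p-boxes (the inputs are invented): pinch one input to a point value and see how much the output's uncertainty shrinks.

```python
# Pinching sketch with interval arithmetic on y = a + b.
def add(x, y):
    return (x[0] + y[0], x[1] + y[1])

def width(x):
    return x[1] - x[0]

a, b = (2.0, 8.0), (0.0, 4.0)   # hypothetical uncertain inputs
base = add(a, b)                 # (2.0, 12.0), width 10

a_pinched = (5.0, 5.0)           # hypothetically replace a by a point value
pinched = add(a_pinched, b)      # (5.0, 9.0), width 4

reduction = 100 * (1 - width(pinched) / width(base))
print(f"{reduction:.0f}% reduction in uncertainty from pinching a")   # → 60%
```

With real p-boxes the same bookkeeping is done on the breadth of the output p-box, input by input, to rank which inputs' uncertainty matters most.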
48. Pinching to a point value
[Figure: two panels of cumulative probability versus X (0 to 3); the p-box on the left is replaced by a single point value on the right]
49. Pinching to a (precise) distribution
[Figure: two panels of cumulative probability versus X (0 to 3); the p-box on the left is replaced by a precise distribution on the right]
50. Pinching to a zero-variance interval
[Figure: cumulative probability versus X (0 to 3); the p-box is replaced by a zero-variance interval]
- Assumes the value is constant, but unknown
- There's no analog of this in Monte Carlo
51. Using sensitivity analyses
- There is only one take-home message
- Shortlisting variables for treatment is bad
- Reduces dimensionality, but erases uncertainty
52. Validation
53. How the data come
[Figure: scatter of temperature (degrees Celsius, 200 to 400) against time (seconds, 600 to 1000)]
54. How we look at them
55. One suggestion for a metric
Area (or average horizontal distance) between the empirical distribution Sn and the predicted distribution

[Figure: probability (0 to 1) versus temperature (200 to 450), showing Sn and the prediction]
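The metric can be sketched directly: integrate the absolute gap between the step empirical CDF Sn and the predicted CDF over the range of interest. The data and the uniform(0, 1) prediction below are made up for illustration.

```python
# Area validation metric: integral of |Sn(x) - F(x)| dx over a grid.
def empirical_cdf(data):
    xs = sorted(data)
    def Sn(x):
        return sum(v <= x for v in xs) / len(xs)
    return Sn

def area_metric(data, F, lo, hi, steps=10000):
    """Numerically integrate |Sn(x) - F(x)| between lo and hi (midpoint rule)."""
    Sn = empirical_cdf(data)
    dx = (hi - lo) / steps
    return sum(abs(Sn(lo + (i + 0.5) * dx) - F(lo + (i + 0.5) * dx)) * dx
               for i in range(steps))

# hypothetical data compared against a uniform(0, 1) prediction
data = [0.2, 0.5, 0.6, 0.7, 0.75, 0.8]
F = lambda x: min(max(x, 0.0), 1.0)   # CDF of uniform(0, 1)
print(round(area_metric(data, F, 0.0, 1.0), 3))
```

A value of zero means the data sit exactly on the prediction; larger areas mean larger discrepancy, in the units of the physical quantity.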
56. Pooling data comparisons
- When data are to be compared against a single distribution, they're pooled into Sn
- When data are compared against different distributions, this isn't possible
- Conformance must be expressed on some universal scale
57. Universal scale
Examples (Risk Calc output):
- N(2, 0.6) = normal(range = [0.4545, 3.5455], mean = 2, var = 0.36)
- max(0.0001, exponential(1.7)): range = [0.0001, 9.00714], mean = [1.699999, 1.7001], var = [2.43, 2.89]
- mix(U(1,5), N(10,1)) × 2.3: range = [2.3, 28.9244], mean = 14.95, var = 70.9742

[Figure: three CDF panels (probability 0 to 1) for these quantities, on axes spanning roughly 1-1000 (log scale), 0-4, and 0-10]

- ui = Fi(xi), where the xi are the data and the Fi are their respective predictions
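The transformation ui = Fi(xi) (the probability integral transform) puts each datum onto the common [0, 1] scale of its own prediction, so data validated against different distributions become comparable. A small sketch with invented data and predictions:

```python
# Probability integral transform: map each datum through its predicted CDF.
import math

def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def expo_cdf(x, mean):
    return 1.0 - math.exp(-x / mean)

# hypothetical data, each with its own predicted distribution
pairs = [
    (2.1, lambda x: normal_cdf(x, 2.0, 0.6)),   # datum predicted by N(2, 0.6)
    (0.9, lambda x: expo_cdf(x, 1.7)),          # datum predicted by exponential(1.7)
]
u = [F(x) for x, F in pairs]   # all u_i now live on the same [0, 1] scale
print([round(v, 3) for v in u])
```

If every prediction were exactly right, the ui would behave like a uniform(0, 1) sample; systematic departures from uniformity signal model-data disagreement.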
58. Backtransforming to physical scale
[Figure: the common-scale values u are mapped through G⁻¹ back to a physical scale; probability axes 0 to 1]
59. Backtransforming to physical scale
- The distribution of G⁻¹(Fi(xi)) represents the empirical data (as Sn does) but in a common, transformed scale
- Could pick any of many scales, and each leads to a different value for the metric
- The likely distribution of interest is the one used for the validation statement
60. Epistemic uncertainty in predictions
Risk Calc session (the prediction is a p-box, the observation a point):
a = N([5,11], 1); b = 8.1; then b = 15: breadth(env(rightside(a), b)) = 4.023; then b = 11: breadth(env(rightside(a), b)) / 2 = 0.409

[Figure: three panels of probability versus value (0 to 20) with discrepancies d = 0, d ≈ 4, and d ≈ 0.4]

- In the left panel, the datum evidences no discrepancy at all
- In the middle, the discrepancy is relative to the edge of the prediction
- In the right, the discrepancy is even smaller
61. Epistemic uncertainty in both
Risk Calc session: the prediction a = N([6,7], 1) − 1 is compared against observations expressed as mixtures of intervals; breadth(env(rightside(a), b)) again quantifies the discrepancy.

[Figure: three panels of probability versus value (0 to 10) with discrepancies d = 0, d ≈ 0.05, and d ≈ 0.07; predictions in white, observations in blue]
62. Backcalculation
63. A typical problem
- How can we design a shielding system if we can't well specify the radiation distribution?
- Could plan for worst-case analysis
- Often wasteful
- Can't account for rare, even worse extremes
- Could pretend we know the distribution
- Unreasonable for new designs or environments
64. IP solution
- Natural compromise that can express both
- Gross uncertainty like intervals and worst cases
- Distributional information about tail risks
- Need to solve equations containing uncertain numbers
- Constraint propagation, or backcalculation
65. Can't just invert the equation
- Total ionizing dose = Radiation / Shielding
- Shielding = Radiation / Dose
- When Shielding is put back into the forward equation, the resulting dose is wider than planned
66. How come?
a = [2, 8]; b = [0, 4]; c = a × b = [0, 32]
bb = c / a = [0, 16]; cc = a × bb = [0, 128]
128 / 32 = 4

- Suppose dose should be less than 32, and radiation ranges between 50 and 200
- If we solved for shielding by division, we'd get a distribution ranging between << >>
- But if we put that answer back into the equation, Dose = Radiation / Shielding, we'd get a distribution with values as large as 128, which is four times larger than planned
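The interval blow-up on this slide can be reproduced with a few lines of naive interval arithmetic (the rules below are valid here only because all endpoints are nonnegative):

```python
# Naive interval arithmetic for nonnegative intervals.
def imul(x, y):
    return (x[0] * y[0], x[1] * y[1])

def idiv(x, y):
    return (x[0] / y[1], x[1] / y[0])   # requires y[0] > 0

a, b = (2.0, 8.0), (0.0, 4.0)
c = imul(a, b)          # forward: c = a * b = (0, 32)
bb = idiv(c, a)         # naive inversion: bb = c / a = (0, 16)
cc = imul(a, bb)        # plug back in: cc = a * bb = (0, 128)
print(c, bb, cc)        # → (0.0, 32.0) (0.0, 16.0) (0.0, 128.0)
```

The repeated appearance of the uncertain quantity a on both sides of the division is what inflates the result: inversion is not backcalculation.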
67. Backcalculation with p-boxes
- Suppose A + B = C, where
- A = normal(5, 1)
- C: 0 ≤ C, median ≤ 15, 90th %ile ≤ 35, max ≤ 50
68. Getting the answer
- The backcalculation algorithm basically reverses the forward convolution
- Not hard at all, but a little messy to show
- Any distribution totally inside B is sure to satisfy the constraint; it's a kernel

[Figure: the kernel B plotted as a p-box over roughly −10 to 50]
69. Check it by plugging it back in
70. When you know that… and you have estimates for… use this formula to find the unknown:
- A + B = C; estimates for A, C: B = backcalc(A, C); estimates for B, C: A = backcalc(B, C)
- A − B = C; estimates for A, C: B = −backcalc(A, C); estimates for B, C: A = backcalc(−B, C)
- A × B = C; estimates for A, C: B = factor(A, C); estimates for B, C: A = factor(B, C)
- A / B = C; estimates for A, C: B = 1/factor(A, C); estimates for B, C: A = factor(1/B, C)
- A ^ B = C; estimates for A, C: B = factor(log A, log C); estimates for B, C: A = exp(factor(B, log C))
- 2A = C: A = C / 2
- A² = C: A = sqrt(C)
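For plain intervals, the backcalculation for A + B = C has a simple closed form, which we can sketch and then verify by plugging the result back into the forward equation. This is a toy stand-in for the p-box version; the function name backcalc follows the slide's table.

```python
# Interval backcalculation for A + B = C: find B so that A + B reproduces C.
def iadd(x, y):
    return (x[0] + y[0], x[1] + y[1])

def backcalc(a, c):
    """Kernel solution B = [c.lo - a.lo, c.hi - a.hi] (needs width(C) >= width(A))."""
    return (c[0] - a[0], c[1] - a[1])

A = (2.0, 3.0)
C = (10.0, 20.0)
B = backcalc(A, C)
print(B, iadd(A, B))   # → (8.0, 17.0) (10.0, 20.0): plugging B back recovers C
# Naive inversion B = C - A = (7.0, 18.0) would instead give A + B = (9.0, 21.0),
# which is wider than the planned C.
```

Note the reversed subtraction pattern (lo − lo, hi − hi) versus ordinary interval subtraction (lo − hi, hi − lo): that reversal is exactly what makes the answer plug back in without inflating.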
71. Hard with probability distributions
- Inverting the equation doesn't work
- Available analytical algorithms are unstable for almost all problems
- Except in a few special cases, Monte Carlo simulation cannot compute backcalculations; trial-and-error methods are required
72. Precise distributions don't work
- Precise distributions can't express the target
- A specification for shielding giving a prescribed distribution of doses seems to say we want some doses to be high
- Any distribution to the left would be better
- A p-box on the dose target expresses this idea
73. Conclusions
74. New organization
- In the past, focus was on where uncertainty arose
- Parameters
- Drivers
- Model structure
- Today, focus is on the nature of uncertainty
- Ignorance (epistemic uncertainty)
- Variability (aleatory uncertainty)
- Vagueness (semantic uncertainty, fuzziness)
- Confusion, mistakes
75. Untenable assumptions
- Uncertainties are small
- Sources of variation are independent
- Uncertainties cancel each other out
- Linearized models are good enough
- Underlying physics is known and modeled
- Computations are inexpensive to make
76. Need ways to relax assumptions
- Possibly large uncertainties
- Non-independent, or unknown dependencies
- Uncertainties that may not cancel
- Arbitrary mathematical operations
- Model uncertainty
77. Good engineering
[Figure: a grid contrasting "Good engineering", "Dumb luck", "Honorable failure", and "Negligence"]
78. Take-home messages
- It seems antiscientific (or at least silly) to say you know more than you do
- Bayesian decision making always yields one answer, even if this is not really tenable
- IP tells you when you need to be careful and reserve judgment
79. References
- Cosmides, L., and J. Tooby. 1996. Are humans good intuitive statisticians after all? Rethinking some conclusions from the literature on judgment under uncertainty. Cognition 58: 1-73.
- Hsu, M., M. Bhatt, R. Adolphs, D. Tranel, and C.F. Camerer. 2005. Neural systems responding to degrees of uncertainty in human decision-making. Science 310: 1680-1683.
- Kmietowicz, Z.W., and A.D. Pearman. 1981. Decision Theory and Incomplete Knowledge. Gower, Hampshire, England.
- Knight, F.H. 1921. Risk, Uncertainty and Profit. L.S.E., London.
- Troffaes, M. 2004. Decision making with imprecise probabilities: a short review. The SIPTA Newsletter 2(1): 4-7.
- Walley, P. 1991. Statistical Reasoning with Imprecise Probabilities. Chapman and Hall, London.
80. Web-accessible reading
- http://maths.dur.ac.uk/dma31jm/durham-intro.pdf (Gert de Cooman's gentle introduction to imprecise probabilities)
- http://www.cs.cmu.edu/qbayes/Tutorial/quasi-bayesian.html (Fabio Cozman's introduction to imprecise probabilities)
- http://idsia.ch/zaffalon/events/school2004/school.htm (summer school on imprecise probabilities)
- http://www.sandia.gov/epistemic/Reports/SAND2002-4015.pdf (introduction to p-boxes and related structures)
- http://www.ramas.com/depend.zip (handling dependencies in uncertainty modeling)
- http://www.ramas.com/bayes.pdf (introduction to Bayesian and robust Bayesian methods in risk analysis)
- http://www.ramas.com/intstats.pdf
81. End