CS 570 Artificial Intelligence Chapter 13. Uncertainty (Probability)

1
CS 570 Artificial Intelligence
Chapter 13. Uncertainty (Probability)

Jahwan Kim
Dept. of CS, KAIST

2
A Riddle

There are three boxes, each containing two chips: the first has two
red chips, the second has two green chips, and
the third has one red and one green chip. We do
not know which box contains which chips. We take
one chip out of a box without looking inside, and
the chip is green. What are the chances,
theoretically speaking, that the other chip in
that box is also green?
One argument: there is only one box out of three that has two
green chips, so the probability is 1/3.
Another argument: we know the first chip was green. There are only
two boxes holding green chips, and only one of
them has two green chips. So the probability is
1/2.

3
Probability: Mathematical Definition

The mathematical definition of probability
(Kolmogorov) involves measure theory.
A probability space $(S, F, P)$ consists of
A set $S$.
A subset $F$ of the power set of $S$, closed under
complement and under countable intersection and union (a so-called
sigma-algebra), and containing the empty set
and $S$ itself. Elements of $F$ are called events.
A function $P$ from $F$ to the real numbers, satisfying
the following three axioms of Kolmogorov:
$P(E) \in [0, 1]$ for any event $E$.
$P(S) = 1$.
For any mutually disjoint events $E_1, E_2, \ldots$, $P\left(\bigcup_i E_i\right) = \sum_i P(E_i)$.
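The three axioms above can be checked mechanically on a small finite space. The following is a minimal sketch, assuming a fair six-sided die as the sample space (the die, and the names `pmf`, `P`, and `power_set`, are illustrative and not from the slides). Because the space is finite, only pairwise additivity of disjoint events is tested.

```python
from fractions import Fraction
from itertools import chain, combinations

# Assumed example: one roll of a fair six-sided die.
S = frozenset(range(1, 7))
pmf = {s: Fraction(1, 6) for s in S}

def P(event):
    """Probability of an event (a subset of S): sum over its outcomes."""
    return sum((pmf[s] for s in event), Fraction(0))

def power_set(s):
    """All subsets of s; for a finite S this plays the role of F."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))
    return [frozenset(c) for c in subsets]

F = power_set(S)

# Axiom 1: every event has probability in [0, 1].
axiom1 = all(0 <= P(E) <= 1 for E in F)
# Axiom 2: the whole space has probability 1.
axiom2 = P(S) == 1
# Axiom 3 (finite, pairwise form): additivity on disjoint events.
axiom3 = all(P(A | B) == P(A) + P(B)
             for A in F for B in F if not (A & B))
```

Exact `Fraction` arithmetic is used so that the equality checks are not affected by floating-point rounding.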

4
Probability: Random Variables

A random variable X is simply a function with
domain S.
Normally the range (co-domain) of a random
variable is either (i) a subset of $\mathbb{R}^n$ or (ii) a
discrete set. X is called continuous in case (i),
and discrete in case (ii).
When X is continuous and $n \ge 2$, X is also called a
random vector.
Recall the definitions of
Expectation of a random variable,
Mean and variance of a random variable,
(More generally, the n-th order moment of a
random variable)
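As a reminder of how these quantities are computed in the discrete case, here is a small sketch, assuming X is the value of a fair die roll (an illustrative choice; the helper names `expectation` and `moment` are hypothetical). It uses the standard definitions: the expectation of g(X) is the sum of g(x) P(X = x), the n-th moment is the expectation of X^n, and the variance is the expectation of (X - mean)^2.

```python
from fractions import Fraction

# Assumed example: probability mass function of a fair die roll.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def expectation(pmf, g=lambda x: x):
    """E[g(X)] for a discrete random variable with the given pmf."""
    return sum(g(x) * p for x, p in pmf.items())

def moment(pmf, n):
    """n-th order moment E[X^n]."""
    return expectation(pmf, lambda x: x ** n)

mean = expectation(pmf)
variance = expectation(pmf, lambda x: (x - mean) ** 2)
second_moment = moment(pmf, 2)
```

For this pmf the variance also equals `second_moment - mean ** 2`, which is the usual shortcut formula.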

5
Probability: Discrete and Continuous Probability

Normally, either
S is a finite set, and F is the power set of S.
The probability is called discrete.
P is determined by its values on singletons
(sometimes called atomic events), i.e., a
function on S. This function is called the
probability mass function.
Or S is (a subset of) $\mathbb{R}^n$, and F is the Borel
sigma-algebra (any open or closed set is an event).
The probability is called continuous.
P is usually given by $P(E) = \int_E f(x)\,dx$
for some function f; f is called the probability
density function.
It is convenient to treat the discrete and
continuous cases separately.

6
Probability: Examples

Uniform distribution
On a finite S, assign equal probability to
each element of S.
On a finite interval $[a, b]$, assign to each
subinterval the probability (its length)/(b-a).
Can you define uniform distribution for the real
line?
Gaussian (or Normal) distribution
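Both examples can be sketched numerically. The code below assumes the standard Gaussian density with mean `mu` and standard deviation `sigma`, which the transcript does not spell out, and uses a simple midpoint rule to approximate integrals. The function names are illustrative.

```python
import math

def uniform_interval_prob(c, d, a, b):
    """P([c, d]) under the uniform distribution on [a, b].

    Assumes a <= c <= d <= b.
    """
    return (d - c) / (b - a)

def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Standard Gaussian density: exp(-(x-mu)^2 / (2 sigma^2)) / (sigma sqrt(2 pi))."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def integrate(f, lo, hi, steps=10_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

# Total mass of the density should be close to 1.
total_mass = integrate(gaussian_pdf, -10.0, 10.0)
# Probability of falling within one standard deviation of the mean.
within_one_sigma = integrate(gaussian_pdf, -1.0, 1.0)
```

The integration range of plus or minus ten standard deviations is an arbitrary cutoff; the mass outside it is negligible for this check.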

7
Probability: Multivariate Cases

Consider a probability on $S_1 \times S_2$.
A new probability on $S_1$, called the marginal
probability, can be obtained by marginalization,
that is, by summing or integrating out $S_2$.
The original distribution on $S_1 \times S_2$ is called the
joint distribution.
On the other hand, given probabilities on $S_1$ and
$S_2$, we can define the product probability on $S_1 \times S_2$.
Is every probability on $S_1 \times S_2$ a product
probability?
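Marginalization and the product construction can be sketched on a tiny table. The joint distribution below is a made-up example (two perfectly correlated binary variables), and `marginal` and `product` are hypothetical helper names. It also gives one concrete case for the question just posed: this joint distribution is not the product of its own marginals.

```python
from fractions import Fraction

# Assumed example: a joint distribution on {0, 1} x {0, 1}
# in which the two coordinates always agree.
joint = {
    (0, 0): Fraction(1, 2), (0, 1): Fraction(0),
    (1, 0): Fraction(0),    (1, 1): Fraction(1, 2),
}

def marginal(joint, axis):
    """Sum out the other coordinate, keeping coordinate `axis`."""
    out = {}
    for outcome, p in joint.items():
        key = outcome[axis]
        out[key] = out.get(key, Fraction(0)) + p
    return out

def product(p1, p2):
    """Product probability built from two marginal distributions."""
    return {(a, b): pa * pb
            for a, pa in p1.items() for b, pb in p2.items()}

p1 = marginal(joint, 0)
p2 = marginal(joint, 1)

# The product of the marginals puts 1/4 on every pair,
# so it differs from the original joint distribution.
is_product = product(p1, p2) == joint
```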

8
Probability: Conditional Probability

After some observation, probabilities (not in the
mathematical sense but in the usual sense)
inevitably change. Conditional probability is the
corresponding rigorous mathematical concept.
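Conditional probability is defined by $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$, for $P(B) > 0$.
Equivalently, $P(A \cap B) = P(A \mid B)\,P(B)$ (product rule).
Bayes' rule follows: $P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$.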
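A short sketch of conditional probability and Bayes' rule, assuming a fair die and two illustrative events ("even" and "greater than 3") that are not part of the slides. It computes the conditional probability directly from the definition and again through Bayes' rule, so the two values can be compared.

```python
from fractions import Fraction

# Assumed example: one roll of a fair six-sided die.
S = set(range(1, 7))

def P(event):
    """Uniform probability of an event (a subset of S)."""
    return Fraction(len(event & S), len(S))

def P_given(A, B):
    """Conditional probability P(A | B) = P(A and B) / P(B); needs P(B) > 0."""
    return P(A & B) / P(B)

even = {2, 4, 6}
high = {4, 5, 6}

# Directly from the definition.
p_even_given_high = P_given(even, high)
# The same quantity via Bayes' rule.
p_via_bayes = P_given(high, even) * P(even) / P(high)
```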

9
Probability: Independence

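Two random variables X and Y are called independent
if $P(X \mid Y) = P(X)$.
Equivalently, $P(X, Y) = P(X)\,P(Y)$.
Chain rule: $P(X_1, \ldots, X_n) = \prod_{i=1}^{n} P(X_i \mid X_1, \ldots, X_{i-1})$.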

10
Probability: Independence

In many cases, an independence assumption works
quite well even though independence clearly does not
hold.
Usually independence assertions are based on
domain knowledge.
Naïve Bayes methods
Independence reduces complexity
Suppose S has n elements. A probability on S x S
in general requires $n^2$ values, while assuming
independence, a probability can be prescribed by
2n values.
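The size comparison above can be made concrete with a few lines. This sketch counts table entries only; the extension from two variables to `k` variables is an assumption added here for illustration and is not stated on the slide.

```python
def joint_table_size(n, k=2):
    """Entries needed for a general joint distribution over k variables,
    each taking n values (n**2 for the S x S case)."""
    return n ** k

def independent_table_size(n, k=2):
    """Entries needed when the k variables are assumed independent:
    one table of n values per variable (2n for the S x S case)."""
    return k * n

sizes_two_vars = (joint_table_size(10), independent_table_size(10))
sizes_ten_vars = (joint_table_size(10, k=10), independent_table_size(10, k=10))
```

With ten values per variable, two variables need 100 versus 20 entries, while ten variables need ten billion versus 100, which is why independence assumptions matter in practice.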

11
Handling Uncertainties

There are too many uncertainties in our world
Partial observability
Noisy sensors
Uncertainty in action outcome
Inherent immense complexity
Probability summarizes the effects of
Laziness
Theoretical ignorance
Practical ignorance
Probability measures degree of belief, not degree
of truth.
Not 80% true, but true in 80% of the cases.

12
Use of Probability

Decision Theory = Utility Theory + Probability Theory
Propositions in the pure logical setting are
replaced by propositions containing random
variables.

13
Examples

Answer to the Riddle: 2/3
See also http://www.stat.cmu.edu/~minka/papers/nuances.html
for interesting nuances of probability
theory.
Traffic and action example in the textbook
Cavity example in the textbook
Wumpus example in the textbook
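The riddle's answer of 2/3 can be verified by brute-force enumeration. This is a minimal sketch, assuming each box and each chip within a box is equally likely to be picked; the box labels and variable names are illustrative. It lists every (box, drawn chip) pair, keeps those where the drawn chip is green, and counts how often the remaining chip is green too.

```python
from fractions import Fraction

# The three boxes from the riddle; labels are illustrative.
boxes = {
    "RR": ("red", "red"),
    "GG": ("green", "green"),
    "RG": ("red", "green"),
}

green_draws = 0   # equally likely draws where the drawn chip is green
both_green = 0    # of those, draws where the other chip is green too

for chips in boxes.values():
    for i, chip in enumerate(chips):
        if chip == "green":
            green_draws += 1
            other = chips[1 - i]
            if other == "green":
                both_green += 1

answer = Fraction(both_green, green_draws)
```

Three of the six equally likely draws show a green chip, and two of those three come from the all-green box. Both tempting arguments count boxes rather than draws, which is where they go wrong.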