A Quick Overview of Probability

Slides: 52
Provided by: awm
Learn more at: http://www.cs.cmu.edu
1
A Quick Overview of Probability
  • Tom Mitchell
  • Machine Learning 10-601
  • Jan 21 2009
  • a significant amount of this material is pilfered
    from Andrew Moore's slides and William Cohen's
    slides
  • www.cs.cmu.edu/~awm/tutorials
  • http://www.cs.cmu.edu/~tom/10601_sp08/slides/probability-1-23-2008.ppt

2
The Problem of Induction
  • David Hume (1711-1776) pointed out:
  • (1) Empirically, induction seems to work.
  • (2) Statement (1) is itself an application of induction.
  • This stumped people for about 200 years.

3
A Second Problem of Induction
  • A black crow seems to support the hypothesis 'all
    crows are black'.
  • A pink highlighter supports the hypothesis 'all
    non-black things are non-crows'.
  • Thus, a pink highlighter supports the hypothesis
    'all crows are black'.

4
Probability Theory
  • Events
  • discrete random variables, continuous random
    variables, compound events
  • Axioms of probability
  • What defines a reasonable theory of uncertainty
  • Independent events
  • Conditional probabilities
  • Bayes' rule and beliefs
  • Joint probability distribution

5
Random Variables
  • Informally, A is a random variable if
  • A denotes something about which we are uncertain
  • perhaps the outcome of a randomized experiment
  • Examples
  • A = True if a randomly drawn person from our
    class is female
  • A = Hometown of a randomly drawn person from our
    class
  • A = True if two randomly drawn persons from our
    class have the same birthday
  • A = True if the 1,000,000,000,000th digit of pi
    is 7
  • Define P(A) as the fraction of possible worlds
    in which A is true
  • the set of possible worlds is called the sample
    space, S
  • A random variable A is a function defined over S
  • A: S -> {0,1}
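The possible-worlds definition above can be sketched in a few lines of Python. The roster is invented for illustration, and the sketch assumes every world (every student) is equally likely to be drawn; a Boolean random variable is then just a function from worlds to True/False.

```python
from fractions import Fraction

# Hypothetical sample space S: each "world" is one randomly drawn student.
# The roster below is made up purely for illustration.
S = [
    {"name": "Ann", "female": True,  "hometown": "Pittsburgh"},
    {"name": "Bob", "female": False, "hometown": "Boston"},
    {"name": "Cat", "female": True,  "hometown": "Boston"},
    {"name": "Dan", "female": False, "hometown": "Austin"},
]

def A(world):
    """Boolean random variable: True if the drawn person is female."""
    return world["female"]

def prob(rv, sample_space):
    """P(rv) = fraction of equally likely worlds in which rv is true."""
    favorable = sum(1 for w in sample_space if rv(w))
    return Fraction(favorable, len(sample_space))

print(prob(A, S))  # 1/2
```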

6
A little formalism
  • More formally, we have
  • a sample space S (e.g., set of students in our
    class)
  • aka the set of possible worlds
  • a random variable is a function defined over the
    sample space
  • Gender: S -> {m, f}
  • Weight: S -> Reals
  • an event is a subset of S
  • e.g., the subset of S for which Gender = f
  • e.g., the subset of S for which (Gender = m) AND
    (nationality = US)
  • we're often interested in probabilities of
    specific events
  • and specific events conditioned on other specific
    events
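As a rough illustration of the formalism, the sketch below models random variables as functions over a sample space and events as subsets of it. The five students and their attributes are hypothetical, and each student is assumed equally likely to be drawn.

```python
from fractions import Fraction

# Hypothetical sample space: a made-up class of five students.
S = {
    "s1": {"gender": "f", "weight": 58.0, "nationality": "US"},
    "s2": {"gender": "m", "weight": 81.5, "nationality": "US"},
    "s3": {"gender": "m", "weight": 70.2, "nationality": "IN"},
    "s4": {"gender": "f", "weight": 64.3, "nationality": "CN"},
    "s5": {"gender": "m", "weight": 77.0, "nationality": "US"},
}

# Random variables are functions defined over the sample space.
def gender(s):
    return S[s]["gender"]        # Gender: S -> {m, f}

def weight(s):
    return S[s]["weight"]        # Weight: S -> Reals

def nationality(s):
    return S[s]["nationality"]

# Events are subsets of S.
female = {s for s in S if gender(s) == "f"}
male_and_us = {s for s in S if gender(s) == "m" and nationality(s) == "US"}

def prob(event):
    """Probability of an event when every student is equally likely to be drawn."""
    return Fraction(len(event), len(S))

print(sorted(female), prob(female))            # ['s1', 's4'] 2/5
print(sorted(male_and_us), prob(male_and_us))  # ['s2', 's5'] 2/5
```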

7
Visualizing A


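Picture: a rectangle representing the sample space of all possible worlds; its area is 1.
A reddish oval inside it holds the worlds in which A is true; the rest holds the worlds in which A is false.
P(A) = area of the reddish oval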
8
The Axioms of Probability
  • 0 <= P(A) <= 1
  • P(True) = 1
  • P(False) = 0
  • P(A or B) = P(A) + P(B) - P(A and B)
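One way to get a feel for the axioms is to check them on a small finite sample space. The sketch below assumes a fair six-sided die (an example chosen here, not taken from the slides) and represents events as sets of outcomes.

```python
from fractions import Fraction

# Hypothetical sample space: the six faces of a fair die, equally likely.
S = frozenset(range(1, 7))

def P(event):
    """Probability of an event (a subset of S) under equally likely outcomes."""
    return Fraction(len(event & S), len(S))

A = {2, 4, 6}      # "roll is even"
B = {4, 5, 6}      # "roll is at least 4"

assert 0 <= P(A) <= 1
assert P(S) == 1               # P(True): the event containing every world
assert P(set()) == 0           # P(False): the event containing no world
assert P(A | B) == P(A) + P(B) - P(A & B)

print(P(A | B))  # 2/3
```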

9
(This is Andrew's joke)
The Axioms Of Probability
10
These Axioms are Not to be Trifled With
  • There have been many many other approaches to
    understanding uncertainty
  • Fuzzy Logic, three-valued logic, Dempster-Shafer,
    non-monotonic reasoning,
  • 25 years ago people in AI argued about these; now
    they mostly don't
  • Any scheme for combining uncertain information,
    uncertain beliefs, etc., really should obey
    these axioms
  • If you gamble based on uncertain beliefs, then
    you can be exploited by an opponent if and only if
    your uncertainty formalism violates the axioms
    (de Finetti 1931, the Dutch book argument)
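A minimal sketch of how such exploitation can work, under stated assumptions: the agent's beliefs below are invented and deliberately violate P(A) + P(~A) = 1, and the agent is assumed to treat each belief as a fair price for a ticket paying 1 unit if the event occurs. An opponent who sells both tickets wins in every world.

```python
from fractions import Fraction

# Hypothetical agent whose beliefs violate the axioms:
# P(A) + P(~A) should equal 1, but here they sum to 6/5.
belief = {"A": Fraction(3, 5), "not A": Fraction(3, 5)}

# Assumption: the agent treats belief[e] as a fair price for a ticket
# that pays 1 unit if event e occurs and 0 otherwise.
# The opponent sells the agent one ticket on A and one ticket on ~A.
tickets_bought = ["A", "not A"]
total_price = sum(belief[e] for e in tickets_bought)

def agent_net(world_has_A):
    """Agent's net winnings in a world where A is true or false."""
    payout = 0
    for e in tickets_bought:
        wins = world_has_A if e == "A" else not world_has_A
        payout += 1 if wins else 0
    return payout - total_price

for world_has_A in (True, False):
    print(world_has_A, agent_net(world_has_A))  # -1/5 in both worlds
```

Exactly one ticket pays out whichever way A turns out, so the agent collects 1 unit after paying 6/5: a guaranteed loss.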

11
Interpreting the axioms
  • 0 <= P(A) <= 1
  • P(True) = 1
  • P(False) = 0
  • P(A or B) = P(A) + P(B) - P(A and B)

The area of A can't get any smaller than 0.
And a zero area would mean no world could ever
have A true.
12
Interpreting the axioms
  • 0 <= P(A) <= 1
  • P(True) = 1
  • P(False) = 0
  • P(A or B) = P(A) + P(B) - P(A and B)

The area of A can't get any bigger than 1.
And an area of 1 would mean all worlds will have
A true.
13
Interpreting the axioms
  • 0 <= P(A) <= 1
  • P(True) = 1
  • P(False) = 0
  • P(A or B) = P(A) + P(B) - P(A and B)

14
Theorems from the Axioms
  • 0 <= P(A) <= 1, P(True) = 1, P(False) = 0
  • P(A or B) = P(A) + P(B) - P(A and B)
  • => P(not A) = P(~A) = 1 - P(A)

15
Theorems from the Axioms
  • 0 <= P(A) <= 1, P(True) = 1, P(False) = 0
  • P(A or B) = P(A) + P(B) - P(A and B)
  • => P(not A) = P(~A) = 1 - P(A)

P(A or ~A) = 1
P(A and ~A) = 0
P(A or ~A) = P(A) + P(~A) - P(A and ~A)
1 = P(A) + P(~A) - 0
16
Elementary Probability in Pictures
  • P(A) + P(~A) = 1

Picture: the sample space divided into two regions, A and ~A.
17
Another useful theorem
  • 0 <= P(A) <= 1, P(True) = 1, P(False) = 0
  • P(A or B) = P(A) + P(B) - P(A and B)
  • => P(A) = P(A ^ B) + P(A ^ ~B)

A = A and (B or ~B) = (A and B) or (A and ~B)
P(A) = P(A and B) + P(A and ~B) - P((A and B) and (A and ~B))
P(A) = P(A and B) + P(A and ~B) - P(A and A and B and ~B)
P(A) = P(A and B) + P(A and ~B) - 0, since B and ~B is False
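Both theorems, P(~A) = 1 - P(A) and P(A) = P(A ^ B) + P(A ^ ~B), can be checked numerically. The sketch below uses an invented distribution over the four truth assignments of (A, B); any non-negative weights summing to 1 should behave the same way.

```python
from fractions import Fraction

# Hypothetical distribution over four worlds, one per truth assignment (A, B).
# The weights are invented; any non-negative weights summing to 1 would do.
worlds = {
    (True, True): Fraction(1, 10),
    (True, False): Fraction(3, 10),
    (False, True): Fraction(2, 10),
    (False, False): Fraction(4, 10),
}

def P(pred):
    """Total weight of the worlds (a, b) where pred(a, b) holds."""
    return sum((p for (a, b), p in worlds.items() if pred(a, b)), Fraction(0))

p_A = P(lambda a, b: a)
p_not_A = P(lambda a, b: not a)
p_A_and_B = P(lambda a, b: a and b)
p_A_and_not_B = P(lambda a, b: a and not b)

assert p_not_A == 1 - p_A                  # P(~A) = 1 - P(A)
assert p_A == p_A_and_B + p_A_and_not_B    # P(A) = P(A ^ B) + P(A ^ ~B)
print(p_A, p_not_A)  # 2/5 3/5
```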
18
Elementary Probability in Pictures
  • P(A) = P(A ^ B) + P(A ^ ~B)

Picture: region A split into A ^ B (the part inside B) and A ^ ~B (the part inside ~B).
19
Multivalued Discrete Random Variables
  • Suppose A can take on more than 2 values
  • A is a random variable with arity k if it can
    take on exactly one value out of {v1, v2, ..., vk}
  • Thus
  • P(A=vi ^ A=vj) = 0 if i != j
  • P(A=v1 or A=v2 or ... or A=vk) = 1

20
Elementary Probability in Pictures
Picture: the sample space partitioned into regions A=1, A=2, A=3, A=4, A=5.
21
More about Multivalued Random Variables
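  • Using the axioms of probability
  • 0 <= P(A) <= 1, P(True) = 1, P(False) = 0
  • P(A or B) = P(A) + P(B) - P(A and B)
  • And assuming that A obeys
  • P(A=vi ^ A=vj) = 0 if i != j
  • P(A=v1 or A=v2 or ... or A=vk) = 1
  • It's easy to prove that
  • P(A=v1 or A=v2 or ... or A=vi) = sum_{j=1..i} P(A=vj)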

22
More about Multivalued Random Variables
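  • Using the axioms of probability and assuming that
    A obeys
  • P(A=vi ^ A=vj) = 0 if i != j
  • P(A=v1 or A=v2 or ... or A=vk) = 1
  • It's easy to prove that
  • P(A=v1 or A=v2 or ... or A=vi) = sum_{j=1..i} P(A=vj)
  • And thus we can prove
  • sum_{j=1..k} P(A=vj) = 1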
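A small sketch of what this means for a multivalued variable. The arity-4 variable and its probabilities are hypothetical; the point is that, because A takes exactly one value, the probability of "A is one of the first i values" is a running sum, and the full sum is 1.

```python
from fractions import Fraction
from itertools import accumulate

# Hypothetical random variable A with arity k = 4; the probabilities are invented.
values = ["v1", "v2", "v3", "v4"]
p = {"v1": Fraction(1, 2), "v2": Fraction(1, 4),
     "v3": Fraction(1, 8), "v4": Fraction(1, 8)}

# Because A takes exactly one value, the events A=vi are mutually exclusive,
# so P(A=v1 or ... or A=vi) is the running sum of the individual probabilities.
running = list(accumulate(p[v] for v in values))
for v, total in zip(values, running):
    print(f"P(A=v1 or ... or A={v}) = {total}")

assert running[-1] == 1   # the k probabilities sum to 1
```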

23
Definition of Conditional Probability
P(A|B) = P(A ^ B) / P(B)
Corollary: The Chain Rule
P(A ^ B) = P(A|B) P(B)
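The definition P(A|B) = P(A ^ B) / P(B) and the chain rule P(A ^ B) = P(A|B) P(B) translate directly into code. The sketch below assumes a fair six-sided die as the sample space; the events and helper names are illustrative.

```python
from fractions import Fraction

# Hypothetical sample space: one roll of a fair six-sided die.
S = set(range(1, 7))

def P(event):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(event), len(S))

def P_given(a, b):
    """P(A|B) = P(A ^ B) / P(B); undefined when P(B) = 0."""
    if P(b) == 0:
        raise ValueError("cannot condition on an event with probability 0")
    return P(a & b) / P(b)

A = {2, 4, 6}   # roll is even
B = {4, 5, 6}   # roll is at least 4

print(P_given(A, B))                      # 2/3
assert P(A & B) == P_given(A, B) * P(B)   # chain rule
```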
24
Conditional Probability in Pictures
Picture: P(B|A=2), shown on the sample space partitioned into regions A=1, A=2, A=3, A=4, A=5.
25
Independent Events
  • Definition: two events A and B are independent if
    Pr(A and B) = Pr(A) * Pr(B).
  • Intuition: the outcome of A has no effect on the
    outcome of B (and vice versa).
  • We need to assume the different rolls are
    independent to solve the problem.
  • You almost always need to assume independence of
    something to solve any learning problem.
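The product definition can be tested directly on a small example. The sketch below assumes two rolls of a fair die (a setup chosen here for illustration) and compares one independent pair of events with one dependent pair.

```python
from fractions import Fraction
from itertools import product

# Hypothetical experiment: two rolls of a fair die, all 36 ordered pairs equally likely.
S = set(product(range(1, 7), repeat=2))

def P(event):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(event), len(S))

first_is_six = {w for w in S if w[0] == 6}
second_is_six = {w for w in S if w[1] == 6}
sum_is_twelve = {w for w in S if w[0] + w[1] == 12}

def independent(a, b):
    """True if P(A and B) == P(A) * P(B)."""
    return P(a & b) == P(a) * P(b)

print(independent(first_is_six, second_is_six))  # True
print(independent(first_is_six, sum_is_twelve))  # False
```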

26
Picture: A independent of B
27
Bayes' rule: P(A|B) = P(B|A) P(A) / P(B)
P(A) is the prior; P(A|B) is the posterior
Bayes, Thomas (1763) An essay towards solving a
problem in the doctrine of chances. Philosophical
Transactions of the Royal Society of London,
53:370-418
"...by no means merely a curious speculation in the
doctrine of chances, but necessary to be solved
in order to a sure foundation for all our
reasonings concerning past facts, and what is
likely to be hereafter... necessary to be
considered by any that would give a clear account
of the strength of analogical or inductive
reasoning..."
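Bayes' rule, P(A|B) = P(B|A) P(A) / P(B), turns a prior P(A) into a posterior P(A|B). The sketch below is one way to compute it when P(B) is not given directly and is expanded as P(B|A) P(A) + P(B|~A) P(~A). The function name and all numbers are invented for illustration.

```python
def bayes_posterior(prior, likelihood, likelihood_given_not):
    """P(A|B) from P(A), P(B|A), and P(B|~A).

    P(B) is expanded as P(B|A) P(A) + P(B|~A) P(~A).
    """
    evidence = likelihood * prior + likelihood_given_not * (1 - prior)
    return likelihood * prior / evidence

# Invented numbers, for illustration only:
# A = "has the condition", B = "test is positive".
prior = 0.01               # P(A)
p_pos_given_a = 0.90       # P(B|A)
p_pos_given_not_a = 0.05   # P(B|~A)

posterior = bayes_posterior(prior, p_pos_given_a, p_pos_given_not_a)
print(round(posterior, 4))  # 0.1538
```

Even with a fairly accurate test, the small prior keeps the posterior well below one half in this made-up case.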
28
More General Forms of Bayes Rule
29
More General Forms of Bayes Rule
30
(No Transcript)
31
(No Transcript)
32
(No Transcript)
33
(No Transcript)
34
The Joint Distribution
Example: Boolean variables A, B, C
Recipe for making a joint distribution of M
variables
35
The Joint Distribution
Example: Boolean variables A, B, C
A B C
0 0 0
0 0 1
0 1 0
0 1 1
1 0 0
1 0 1
1 1 0
1 1 1
  • Recipe for making a joint distribution of M
    variables:
  • Make a truth table listing all combinations of
    values of your variables (if there are M Boolean
    variables then the table will have 2^M rows).

36
The Joint Distribution
Example: Boolean variables A, B, C
A B C Prob
0 0 0 0.30
0 0 1 0.05
0 1 0 0.10
0 1 1 0.05
1 0 0 0.05
1 0 1 0.10
1 1 0 0.25
1 1 1 0.10
  • Recipe for making a joint distribution of M
    variables:
  • Make a truth table listing all combinations of
    values of your variables (if there are M Boolean
    variables then the table will have 2^M rows).
  • For each combination of values, say how probable
    it is.

37
The Joint Distribution
Example: Boolean variables A, B, C
A B C Prob
0 0 0 0.30
0 0 1 0.05
0 1 0 0.10
0 1 1 0.05
1 0 0 0.05
1 0 1 0.10
1 1 0 0.25
1 1 1 0.10
  • Recipe for making a joint distribution of M
    variables:
  • Make a truth table listing all combinations of
    values of your variables (if there are M Boolean
    variables then the table will have 2^M rows).
  • For each combination of values, say how probable
    it is.
  • If you subscribe to the axioms of probability,
    those numbers must sum to 1.

Picture: Venn diagram of A, B, and C, with each of the eight regions labeled by its probability from the table.
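The three-step recipe can be sketched in Python using the probabilities from the A, B, C table on this slide. The dictionary representation is an illustrative choice, not something the slides prescribe.

```python
from itertools import product

variables = ("A", "B", "C")

# Step 1: one row per combination of values (2^M rows for M Boolean variables).
rows = list(product((0, 1), repeat=len(variables)))
assert len(rows) == 2 ** len(variables)

# Step 2: say how probable each row is (numbers taken from the table above).
probs = [0.30, 0.05, 0.10, 0.05, 0.05, 0.10, 0.25, 0.10]
joint = dict(zip(rows, probs))

# Step 3: the probabilities must sum to 1 (up to floating-point rounding).
assert abs(sum(joint.values()) - 1.0) < 1e-9

for row, p in joint.items():
    print(*row, f"{p:.2f}")
```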
38
Using the Joint
Once you have the JD (joint distribution) you can ask
for the probability of any logical expression
involving your attributes
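Concretely, the probability of an expression E is the sum of the probabilities of the rows that satisfy E. The sketch below applies this to the A, B, C table from the earlier slides; the `prob` helper and the three example expressions are illustrative.

```python
# Joint distribution over Boolean variables (A, B, C), copied from the example table.
joint = {
    (0, 0, 0): 0.30, (0, 0, 1): 0.05, (0, 1, 0): 0.10, (0, 1, 1): 0.05,
    (1, 0, 0): 0.05, (1, 0, 1): 0.10, (1, 1, 0): 0.25, (1, 1, 1): 0.10,
}

def prob(expr):
    """P(E) = sum of P(row) over the rows that satisfy the expression E."""
    return sum(p for (a, b, c), p in joint.items() if expr(a, b, c))

p_a = prob(lambda a, b, c: a == 1)
p_a_and_not_b = prob(lambda a, b, c: a == 1 and b == 0)
p_b_or_c = prob(lambda a, b, c: b == 1 or c == 1)

print(round(p_a, 2), round(p_a_and_not_b, 2), round(p_b_or_c, 2))  # 0.5 0.15 0.65
```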
39
Using the Joint
P(Poor ^ Male) = 0.4654
40
Using the Joint
P(Poor) = 0.7604
41
Inference with the Joint
42
Inference with the Joint
P(Male | Poor) = 0.4654 / 0.7604 = 0.612
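Inference with the joint is the same row-summing idea applied twice: P(E1 | E2) = P(E1 ^ E2) / P(E2). The table behind the Poor/Male figures is not in the transcript, so the sketch below uses the A, B, C table from the earlier slides and then repeats the slide's division as a check. The helper names are illustrative.

```python
# Joint distribution over Boolean variables (A, B, C), copied from the example table.
joint = {
    (0, 0, 0): 0.30, (0, 0, 1): 0.05, (0, 1, 0): 0.10, (0, 1, 1): 0.05,
    (1, 0, 0): 0.05, (1, 0, 1): 0.10, (1, 1, 0): 0.25, (1, 1, 1): 0.10,
}

def prob(expr):
    """P(E) = sum of P(row) over the rows that satisfy E."""
    return sum(p for (a, b, c), p in joint.items() if expr(a, b, c))

def cond_prob(e1, e2):
    """P(E1 | E2) = P(E1 ^ E2) / P(E2)."""
    p_e2 = prob(e2)
    if p_e2 == 0:
        raise ValueError("conditioning event has probability 0")
    return prob(lambda a, b, c: e1(a, b, c) and e2(a, b, c)) / p_e2

# P(A | B): rows with B = 1 total 0.50, of which A = 1 accounts for 0.35.
print(round(cond_prob(lambda a, b, c: a == 1, lambda a, b, c: b == 1), 2))  # 0.7

# The same ratio with the slide's figures for the Poor/Male example.
print(round(0.4654 / 0.7604, 3))  # 0.612
```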
43
(No Transcript)
44
(No Transcript)
45
(No Transcript)
46
(No Transcript)
47
(No Transcript)
48
(No Transcript)
49
(No Transcript)
50
(No Transcript)
51
Inference is a big deal
  • I've got this evidence. What's the chance that
    this conclusion is true?
  • I've got a sore neck: how likely am I to have
    meningitis?
  • I see my lights are out and it's 9pm. What's the
    chance my spouse is already asleep?