Title: Bits of probability
1 Bits of probability
2 Why?
We need a concept of probability to make
judgements about our hypotheses in the scientific
method. Is the data consistent with our
hypotheses?
3 What is probability?
- Relative frequency
  - If a process or an experiment is repeated a large number of times, n, and the characteristic E occurs m times, then the relative frequency m/n of E will be approximately equal to the probability of E.
  - P(E) ≈ m / n
- Personal probability
  - What is the probability of life on Mars?
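The relative-frequency definition can be tried out numerically. The sketch below is only an illustration: it assumes a fair six-sided die simulated with Python's pseudo-random generator, and the function name `relative_frequency` is made up for this example.

```python
import random

def relative_frequency(n_trials, seed=0):
    """Estimate P(die shows 6) as m/n over n_trials simulated rolls."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    m = sum(1 for _ in range(n_trials) if rng.randint(1, 6) == 6)
    return m / n_trials

# The estimate m/n should approach 1/6 as n grows.
for n in (100, 10_000, 100_000):
    print(n, relative_frequency(n))
```

With more trials the printed ratio should settle near 1/6, which is the sense in which m/n approximates P(E).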
4 Pictures
P(A) = (Area of A) / (Area of U); implicitly P(A) = P(A | U)
[Venn diagram: the event space U contains the region A; the rest of U is Not A]
5 Operation on event sets
- Union of 2 events (.OR.): probability of the union
- P(E1 or E2) = P(E1 ∪ E2)
[Venn diagram: events E1 and E2 inside U, with the union E1 ∪ E2 highlighted]
6 Operation on event sets
- Intersection of 2 events (.AND.): probability of the intersection
- P(E1 and E2) = P(E1 ∩ E2)
[Venn diagram: events E1 and E2 inside U, with the intersection E1 ∩ E2 highlighted]
7 Probability Properties
1. 0 ≤ P(Ei) ≤ 1: the probability of Ei is always a number between 0 and 1
2. Σi P(Ei) = 1: the sum over all the outcomes Ei ∈ U (the event space) is 1
3. Additivity: P(E1 ∪ E2) = ?
8 Probability Properties
1. 0 ≤ P(Ei) ≤ 1: the probability of Ei is always a number between 0 and 1
2. Σi P(Ei) = 1: the sum over all the outcomes Ei ∈ U (the event space) is 1
3. Additivity: P(E1 ∪ E2) = P(E1) + P(E2) - P(E1 ∩ E2)
9 Experiment 1: toss 2 dice; the result is the sum of the outcomes
P(sum is even or ≥ 7) = P(sum even) + P(sum ≥ 7) - P(sum = 8, 10, 12)
= 18/36 + 21/36 - 9/36
= 30/36
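The additivity rule for the two-dice example can be checked by listing all 36 equally likely outcomes. This is a minimal sketch; the helper name `prob` is chosen here for illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a predicate on the sum of the dice."""
    hits = sum(1 for a, b in outcomes if event(a + b))
    return Fraction(hits, len(outcomes))

p_even = prob(lambda s: s % 2 == 0)              # 18/36
p_ge7 = prob(lambda s: s >= 7)                   # 21/36
p_both = prob(lambda s: s % 2 == 0 and s >= 7)   # 9/36 (sums 8, 10, 12)
p_union = prob(lambda s: s % 2 == 0 or s >= 7)   # counted directly

# Additivity: P(E1 or E2) = P(E1) + P(E2) - P(E1 and E2)
print(p_union == p_even + p_ge7 - p_both)
```

Counting the union directly and applying the additivity formula give the same fraction, 30/36.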
10 Probability Additivity
If E1 and E2 are mutually exclusive, then P(E1 ∪ E2) = P(E1) + P(E2)
For instance, P(sum = 2 ∪ sum = 3) = 1/36 + 2/36 = 3/36
11 Experiment 2: joint probability of parents-children
Event: a pair of values, one for each variable (title = educational qualification)

Children title \ Parent title   primary   High school   degree
primary                          0.04      0.01          0.00
High school                      0.06      0.24          0.05
degree                           0.05      0.30          0.25

Marginal probability:
P(Pd) = P(parent title = degree) = ?
P(Cd) = P(child title = degree) = ?
12 Experiment 2: joint probability of parents-children
Event: a pair of values, one for each variable

Children title \ Parent title   primary   High school   degree   total
primary                          0.04      0.01          0.00     0.05
High school                      0.06      0.24          0.05     0.35
degree                           0.05      0.30          0.25     0.60
total                            0.15      0.55          0.30     1.00

Marginal probability:
P(Pd) = P(parent title = degree) = 0.30
P(Cd) = P(child title = degree) = 0.60
13 Experiment 2: joint probability of parents-children
Union probabilities (from the joint table and marginals above):
P(Pd ∪ Cd) = P((parent = degree) or (child = degree)) = ?
P(Pp ∪ Cp) = P((parent = primary) or (child = primary)) = ?
P(Pd ∪ Cp) = P((parent = degree) or (child = primary)) = ?
14 Experiment 2: joint probability of parents-children
Union probabilities (from the joint table and marginals above):
P(Pd ∪ Cd) = P((parent = degree) or (child = degree)) = 0.30 + 0.60 - 0.25 = 0.65
P(Pp ∪ Cp) = P((parent = primary) or (child = primary)) = 0.15 + 0.05 - 0.04 = 0.16
P(Pd ∪ Cp) = P((parent = degree) or (child = primary)) = 0.30 + 0.05 - 0.00 = 0.35
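The marginal and union probabilities can be recomputed from the joint table. The sketch below stores the table as a dictionary; the names `joint`, `p_parent`, `p_child` and `p_union` are illustrative choices, not part of the slides.

```python
levels = ("primary", "high school", "degree")

# joint[(child, parent)] = P(child title, parent title), copied from the table.
joint = {
    ("primary", "primary"): 0.04, ("primary", "high school"): 0.01,
    ("primary", "degree"): 0.00,
    ("high school", "primary"): 0.06, ("high school", "high school"): 0.24,
    ("high school", "degree"): 0.05,
    ("degree", "primary"): 0.05, ("degree", "high school"): 0.30,
    ("degree", "degree"): 0.25,
}

def p_parent(level):
    """Marginal P(parent title = level): sum over the column."""
    return sum(joint[(child, level)] for child in levels)

def p_child(level):
    """Marginal P(child title = level): sum over the row."""
    return sum(joint[(level, parent)] for parent in levels)

def p_union(child, parent):
    """P(child title = child OR parent title = parent) by additivity."""
    return p_child(child) + p_parent(parent) - joint[(child, parent)]

print(round(p_parent("degree"), 2), round(p_child("degree"), 2))
print(round(p_union("degree", "degree"), 2))
```

Marginals are row or column sums of the joint table, and each union reuses them together with a single joint entry.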
15 Conditional probability
- P(Cd | Pp) = P(child = degree given parent = primary) = ?
- P(Cd | Phs) = P(child = degree given parent = high school) = ?
- P(Cd | Pd) = P(child = degree given parent = degree) = ?
16 Conditional probability
- P(Cd | Pp) = P(child = degree given parent = primary) = 0.05/0.15 ≈ 0.33
- P(Cd | Phs) = P(child = degree given parent = high school) = 0.30/0.55 ≈ 0.55
- P(Cd | Pd) = P(child = degree given parent = degree) = 0.25/0.30 ≈ 0.83
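These conditional probabilities are joint entries divided by the parent marginal. A small sketch, with dictionary names invented for the example and values read from the joint table:

```python
# P(child = degree, parent = x): the "degree" row of the joint table.
joint_child_degree = {"primary": 0.05, "high school": 0.30, "degree": 0.25}
# Marginal P(parent = x): the "total" row of the joint table.
p_parent = {"primary": 0.15, "high school": 0.55, "degree": 0.30}

def p_child_degree_given(parent):
    """P(Cd | parent) = P(Cd, parent) / P(parent)."""
    return joint_child_degree[parent] / p_parent[parent]

for parent in p_parent:
    print(parent, round(p_child_degree_given(parent), 3))
```

The probability that the child holds a degree rises with the parent's qualification, which already suggests that the two variables are not independent.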
17 Conditional probability
Conditioning on an event implies that the new total event space is reduced to that event. This is why we divide by its probability: P(E1 | E2) = P(E1 ∩ E2) / P(E2).
Independent events: two outcomes E1 and E2 are independent when P(E1 | E2) = P(E1) and P(E2 | E1) = P(E2) both hold.
19 Independent events?
Two dice case
20 Computing the joint probability
P(E1 ∩ E2) = P(E1 | E2) P(E2)
- Hint
  - Assuming E2 is a certain event, we can compute P(E1 | E2).
  - Then we can relax this assumption by multiplying the result by P(E2).
  - The product is the joint probability (intersection) of the 2 events.
If E1 and E2 are independent, then P(E1 | E2) = P(E1), and this implies P(E1 ∩ E2) = P(E1) P(E2).
Note (for events with non-zero probability):
1) 2 mutually exclusive events cannot be independent
2) 2 independent events are not mutually exclusive
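For the two-dice case, the product rule for independent events and the note on mutually exclusive events can both be checked by enumeration. The events below are examples chosen for this sketch, not taken from the slides.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # (first die, second die)

def prob(event):
    """P(event) by counting equally likely outcomes."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def independent(e1, e2):
    """True when P(E1 and E2) equals P(E1) * P(E2)."""
    return prob(lambda o: e1(o) and e2(o)) == prob(e1) * prob(e2)

def first_is_six(o):
    return o[0] == 6

def second_is_even(o):
    return o[1] % 2 == 0

def sum_is_two(o):
    return sum(o) == 2

def sum_is_three(o):
    return sum(o) == 3

print(independent(first_is_six, second_is_even))  # events on different dice
print(independent(sum_is_two, sum_is_three))      # mutually exclusive events
```

The first pair satisfies the product rule. The second pair cannot: the intersection has probability 0 while both events have positive probability.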
21 Partition
If U = ∪i Bi and Bi ∩ Bj = ∅ for all i ≠ j, then {Bi} is a partition of U
22 Partition
If {Bi} is a partition of U, then P(A) = Σi P(A, Bi) = Σi P(A | Bi) P(Bi)
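As an illustration of the partition formula, take two dice, let A be "the sum is at least 10" and let Bi be "the first die shows i"; the six events Bi partition the event space. The event choice and helper names are assumptions made for this sketch.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # (first die, second die)

def prob(event, given=None):
    """P(event | given) by counting; given=None means no conditioning."""
    space = outcomes if given is None else [o for o in outcomes if given(o)]
    return Fraction(sum(1 for o in space if event(o)), len(space))

def sum_ge_10(o):
    return sum(o) >= 10

def first_die_is(i):
    return lambda o: o[0] == i

# Sum over the partition: P(A) = sum_i P(A | Bi) P(Bi)
total = sum(
    prob(sum_ge_10, given=first_die_is(i)) * prob(first_die_is(i))
    for i in range(1, 7)
)
print(total == prob(sum_ge_10))
```

Only B4, B5 and B6 contribute, and the weighted sum equals the probability of A counted directly.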
23 Bayes Rule
- Suppose that B1, B2, ..., Bk form a partition of S
- Suppose that Pr(Bi) > 0 and Pr(A) > 0. Then
  Pr(Bi | A) = Pr(A | Bi) Pr(Bi) / Σj Pr(A | Bj) Pr(Bj)
24 Bayes Theorem
Joint probability: P(X, Y) = P(X | Y) P(Y) = P(Y | X) P(X)
So
P(Y | X) = P(X | Y) P(Y) / P(X)
With evidence s and conclusion M:
P(M | s) = P(s | M) P(M) / P(s), where P(M) is the a priori probability
[Diagram relating evidence s and conclusion M through P(s | M) and P(M | s)]
25 Bayes rule Example
- A rare disease affects 1 out of 100,000 people.
- A test shows positive
  - with probability 0.99 when applied to an ill person, and
  - with probability 0.01 when applied to a healthy person.
- Your test result is positive.
- ARE YOU ILL?
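26 Bayes rule Example
P(+ | ill) = 0.99, P(+ | healthy) = 0.01, P(ill) = 10^-5
P(ill | +) = P(+ | ill) P(ill) / [P(+ | ill) P(ill) + P(+ | healthy) P(healthy)] ≈ 0.001
Happy End: more likely the test is incorrect!!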
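The numbers of the disease example can be put through Bayes' rule directly. A minimal sketch, with function and parameter names chosen for illustration:

```python
def posterior(prior, p_pos_given_ill, p_pos_given_healthy):
    """P(ill | positive test) from Bayes' rule with a two-event partition."""
    # Evidence: total probability of a positive test (ill or healthy).
    evidence = p_pos_given_ill * prior + p_pos_given_healthy * (1 - prior)
    return p_pos_given_ill * prior / evidence

p_ill_given_pos = posterior(prior=1e-5,
                            p_pos_given_ill=0.99,
                            p_pos_given_healthy=0.01)
print(p_ill_given_pos)
```

The posterior comes out at roughly one in a thousand: the false positives among the many healthy people far outnumber the true positives among the few ill ones.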
27 Is the pope an alien?
Since the probability P(Pope | Human) = 1/(6,000,000,000), does this imply that the Pope is not a human being?
Beck-Bornholdt HP, Dubben HH, Nature 381, 730 (1996)
THAT IS: if Human ⇒ Pope is RARE, is Pope ⇒ Human RARE?
Does (Human ⇒ Not Pope) imply (Pope ⇒ Not Human)?
28 P(Pope | Human) is not the same as P(Human | Pope)
but P(Alien) ≈ 0, so P(Human | Pope) ≈ 1.0
The pope is (probably) not an alien
S. Eddy and D. MacKay's answer
29 More examples of fallacious inference
- Since most sport accidents occur when playing soccer, Stern ran the headline SOCCER IS THE MOST DANGEROUS SPORT (without considering that soccer is probably the most common sport).
- Since a third of all fatal accidents in Germany occur in private homes, Die Welt ran the headline PRIVATE HOMES AS DANGER SPOTS (without considering that home is the place where people spend most of their time).
- Since most of the cars entering one-way streets in the wrong direction are driven by women, Bild ran the headline WOMEN MORE DISORIENTED DRIVERS (without considering whether the samples of men and women drivers had the same size).
From Kramer W, Gigerenzer G, Statistical Science 20:223-230 (2005)
30 33 Pirates (Zecchino d'Oro)
- 11 pirates wear a patch over one eye (sight problem)
- 11 pirates are lame in one leg (leg problem)
- 11 pirates cannot hear the trumpet (hearing problem)
- What is the probability of
  - Having all three injuries
  - Having 2 injuries
  - Having 1 injury
  - No injury
31 33 Pirates (Zecchino d'Oro)
- Suppose that the problems are independent
- P(I_i) = 1/3 (prob. of having injury i), P(NI_i) = 2/3 (prob. of not having injury i)
- Having all three injuries
  - P(S) P(L) P(H) = 1/3 · 1/3 · 1/3 = 1/27
- Having 2 injuries
  - P(i, j, not k) + P(j, k, not i) + P(k, i, not j) = 3 · 1/3 · 1/3 · 2/3 = 6/27
- Having 1 injury
  - P(i, not (j, k)) + P(j, not (i, k)) + P(k, not (i, j)) = 3 · 1/3 · 2/3 · 2/3 = 12/27
- No injury
  - P(not S) P(not L) P(not H) = 2/3 · 2/3 · 2/3 = 8/27
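Under the independence assumption, the distribution of the number of injuries can be checked by enumerating all eight present/absent patterns. In this sketch (function name invented for the example) each injury has probability 1/3; direct enumeration gives 1/27 for three injuries, 6/27 for exactly two, 12/27 for exactly one and 8/27 for none.

```python
from fractions import Fraction
from itertools import product

P_INJURY = Fraction(1, 3)  # probability of each single injury

def injury_count_distribution(n_injuries=3, p=P_INJURY):
    """Return {k: P(exactly k injuries)} assuming independent injuries."""
    dist = {k: Fraction(0) for k in range(n_injuries + 1)}
    for pattern in product((True, False), repeat=n_injuries):
        prob = Fraction(1)
        for has_injury in pattern:
            prob *= p if has_injury else 1 - p
        dist[sum(pattern)] += prob  # True counts as 1
    return dist

dist = injury_count_distribution()
for k, pk in sorted(dist.items()):
    print(k, pk)
```

The four probabilities sum to 1, a quick sanity check on any such calculation.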
32 Game: 1 car and 2 sheep
- Two sheep and a car are hidden behind three different doors
From The Curious Incident of the Dog in the
Night-Time by Mark Haddon
33 Game: 1 car and 2 sheep
Doors: 1 2 3
- The game: you select one door (e.g. 1)
- From the remaining two, one door with a sheep is shown to you (e.g. 2)
- You may change your door (selecting 3) or you can keep your first choice (1)
34 Game: 1 car and 2 sheep
Doors: 1 2 3
Question: are the 2 choices
1. Equivalent
2. Better to change opinion
3. Better to keep the first choice
35 Game: 1 car and 2 sheep
Suppose you select x (y and z are the alternatives).
P(x) = P(y) = P(z) = 1/3 (probability that the car is behind each door)
P(Sz) = probability of showing z
P(first) = 1/3
P(second) = P(y, Sz) + P(z, Sy) = P(Sz | y) P(y) + P(Sy | z) P(z) = 1 · 1/3 + 1 · 1/3 = 2/3
36 Game: 1 car and 2 sheep
Write a program to test it

firstOK = 0
secondOK = 0
for i = 1 to MaxIter
    doors = [0, 0, 0]
    put a 1 at a random position in doors
    first = one door selected at random
    shown = a position in doors != first whose value is 0
    second = the remaining position (!= first and != shown)
    if doors[first] = 1 then
        firstOK = firstOK + 1
    else if doors[second] = 1 then
        secondOK = secondOK + 1
    end if
end for
probfirst = firstOK / MaxIter
probsecond = secondOK / MaxIter
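One possible runnable version of the test program is sketched below in Python. It follows the same steps; the function name, the fixed seed and the default number of iterations are choices made for this example.

```python
import random

def simulate(max_iter=100_000, seed=0):
    """Estimate the winning probability of keeping vs. switching doors."""
    rng = random.Random(seed)
    first_ok = second_ok = 0
    for _ in range(max_iter):
        doors = [0, 0, 0]
        doors[rng.randrange(3)] = 1   # hide the car behind a random door
        first = rng.randrange(3)      # the player's initial choice
        # A sheep door, different from the player's door, is shown.
        shown = rng.choice(
            [d for d in range(3) if d != first and doors[d] == 0]
        )
        # The door the player would get by switching.
        second = next(d for d in range(3) if d not in (first, shown))
        if doors[first] == 1:
            first_ok += 1
        elif doors[second] == 1:
            second_ok += 1
    return first_ok / max_iter, second_ok / max_iter

prob_first, prob_second = simulate()
print(prob_first, prob_second)
```

The two estimates should come out close to 1/3 and 2/3, in agreement with the calculation for P(first) and P(second).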
37 Some useful measures: odds ratio and log-odds score
A measure of the relative influence of A and B is
odd(A,B) = P(A,B) / (P(A) P(B))
If A and B are independent, odd(A,B) = 1, so log(odd(A,B)) = 0.
Otherwise, log(odd(A,B)) >> 0 or << 0 indicates strong correlation.
Ex.: substitution matrices
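A log-odds score is straightforward to compute once the joint and marginal probabilities are known. The sketch below uses base-2 logarithms, which is an assumption (the slides do not fix the base), and takes its example values from the parents-children table and the two-dice case.

```python
import math

def log_odds(p_joint, p_a, p_b, base=2):
    """log of odd(A,B) = P(A,B) / (P(A) P(B)); 0 means independence."""
    return math.log(p_joint / (p_a * p_b), base)

# Independent events (two dice: first die is 6, second die is 6).
print(log_odds(1 / 36, 1 / 6, 1 / 6))
# Child primary and parent primary: joint 0.04, marginals 0.05 and 0.15.
print(log_odds(0.04, 0.05, 0.15))
# Child high school and parent degree: joint 0.05, marginals 0.35 and 0.30.
print(log_odds(0.05, 0.35, 0.30))
```

A positive score means the pair occurs more often than independence would predict, a negative score less often. Substitution matrices are built from scores of this kind.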
38 Probabilistic training of a parametric method
Generally speaking, a parametric model M aims to reproduce a set of known data.
[Diagram: model M with parameters T produces the modelled data, to be compared with the real data (D)]
How to compare them?
39 Maximum likelihood
D = data, M = model, T = model parameters
Choose the parameters T that maximize the likelihood P(D | M, T) of the data.
40 Example (coin-tossing)
Given N tosses of a coin (our data D), the outcomes are h heads and t tails (N = t + h).
ASSUME the model P(D | M) = p^h (1 - p)^t
Computing the maximum likelihood of P(D | M), we obtain that our estimate of p is
p = h / (h + t) = h / N
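The estimate can be derived by maximizing the logarithm of the likelihood, which has the same maximum as the likelihood itself. A short derivation, using only the symbols p, h, t and N defined above:

```latex
\begin{align}
\log P(D \mid M) &= h \log p + t \log(1 - p) \\
\frac{\partial}{\partial p} \log P(D \mid M) &= \frac{h}{p} - \frac{t}{1 - p} = 0 \\
h\,(1 - p) &= t\,p \quad\Longrightarrow\quad p = \frac{h}{h + t} = \frac{h}{N}
\end{align}
```

The second derivative, $-h/p^2 - t/(1-p)^2$, is negative, so this stationary point is a maximum.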
41 Example (Error measure)
Suppose you think that your data are affected by a Gaussian error, so that they are distributed according to
F(xi) = A exp(-(xi - m)^2 / (2 s^2)), with A = 1 / (sqrt(2π) s)
If your measures are independent, the data likelihood is
P(Data | model) = Πi F(xi)
Find m and s that maximize P(Data | model)
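For the Gaussian model the maximization has a known closed form: m is the sample mean and s is the square root of the mean squared deviation (dividing by N, not N - 1). The sketch below computes both and checks that nearby parameter values give a lower log-likelihood. The data values are hypothetical and the function names are made up for the example.

```python
import math

def log_likelihood(data, m, s):
    """log P(Data | model) for independent Gaussian measurements."""
    return sum(
        -math.log(math.sqrt(2 * math.pi) * s) - (x - m) ** 2 / (2 * s ** 2)
        for x in data
    )

def gaussian_mle(data):
    """Closed-form maximum-likelihood estimates of m and s."""
    n = len(data)
    m = sum(data) / n
    s = math.sqrt(sum((x - m) ** 2 for x in data) / n)
    return m, s

data = [4.9, 5.1, 5.0, 4.7, 5.3]  # hypothetical measurements
m_hat, s_hat = gaussian_mle(data)
best = log_likelihood(data, m_hat, s_hat)

# Moving away from the estimates should lower the log-likelihood.
print(best > log_likelihood(data, m_hat + 0.05, s_hat))
print(best > log_likelihood(data, m_hat, s_hat * 1.2))
```

Working with the log-likelihood turns the product over the measurements into a sum, which is easier to differentiate and more stable numerically.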