Title: Faculty of Social Sciences Induction Block: Maths
Slide 1: Faculty of Social Sciences Induction Block
Maths & Statistics Lecture 4
- Probability, Randomness and Different Types of Events
- Dr Gwilym Pryce
Slide 2: Plan
- 1. Introduction
- 2. Randomness and Probability
- 3. Complementary and Disjoint Events
- 4. When 2 events can occur together
- 5. Independent events
- 6. Contingent events
Slide 3: Birthday Prediction
- I predict that there are between one and two birthdays in the class this week
- How did I do this?
Slide 4
- A/ There is a 1 in 52 chance that it is your birthday this week
- 70 people in the room
- Expected number of birthdays this week = 70 × 1/52 = 1.346
- Q/ Does that mean that every class of 70 has 1.346 birthdays that week?
- A/ No. It means that in the long run (i.e. over lots of lecture classes of size 70) the average number of people with birthdays will be 1.346
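The expected-value arithmetic above can be checked with a short Python sketch. The class size of 70 and the 1-in-52 weekly chance come from the slide; the number of simulated classes and the random seed are illustrative assumptions, and the simulation assumes each person's birthday week is independent of everyone else's.

```python
import random

CLASS_SIZE = 70          # people in the room (from the slide)
P_BIRTHDAY = 1 / 52      # chance a given person's birthday falls this week

# Expected number of birthdays in one class this week.
expected = CLASS_SIZE * P_BIRTHDAY


def simulate_classes(n_classes, seed=0):
    """Return the number of birthdays in each of n_classes simulated classes.

    Assumption: birthdays are independent and equally likely in any week.
    """
    rng = random.Random(seed)
    return [
        sum(rng.random() < P_BIRTHDAY for _ in range(CLASS_SIZE))
        for _ in range(n_classes)
    ]


counts = simulate_classes(10_000)   # 10,000 classes is an arbitrary choice
long_run_average = sum(counts) / len(counts)

print(f"expected birthdays per class: {expected:.3f}")
print(f"average over {len(counts)} simulated classes: {long_run_average:.3f}")
```

No single simulated class can contain 1.346 birthdays (each count is a whole number); only the average across many classes approaches that figure.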
Slide 5: e.g. Birthdays in many classes
Slide 6: (No Transcript)
Slide 7
- It can take a long time for the long-run average to emerge
Slide 8: (No Transcript)
Slide 9: Coin tossing example
Slide 10: 2. Randomness and Probability
- A phenomenon is random if individual outcomes are uncertain but there is nonetheless a regular distribution of outcomes in a large number of repetitions.
Slide 11
- The probability of any outcome of a random variable is the proportion of times the outcome would occur in a very long series of repetitions.
- I.e. long-term relative frequency = number of times an event occurs in the long run divided by the number of repetitions.
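The idea of probability as a long-term relative frequency can be illustrated with a simulated fair coin. This is only a sketch: the seed and the number of tosses are arbitrary assumptions, and a pseudo-random generator stands in for a real coin.

```python
import random


def running_proportion_of_heads(n_tosses, seed=1):
    """Return the proportion of heads after each toss of a simulated fair coin."""
    rng = random.Random(seed)
    heads = 0
    proportions = []
    for toss in range(1, n_tosses + 1):
        heads += rng.random() < 0.5   # True counts as 1, False as 0
        proportions.append(heads / toss)
    return proportions


proportions = running_proportion_of_heads(100_000)
for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"after {n:>6} tosses: proportion of heads = {proportions[n - 1]:.4f}")
```

The early proportions can sit well away from 0.5; the relative frequency only settles near the probability as the number of repetitions grows.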
Slide 12: Probability of an event
- Probability that a stranger's birthday is this week = 1/52
- Probability that a flipped coin is heads = 1/2
- Probability of picking a red ball from a bag of 2 red and 8 blue is
- 0.2 or 20% or 1 in 5 chance or 4 to 1 against.
- NB probabilities always lie between 0 and 1.
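As a small worked check of the red-ball example, the sketch below expresses the same probability as a fraction, a decimal, a percentage and odds against. Odds against are computed here as P(not red) / P(red); the variable names are illustrative only.

```python
from fractions import Fraction

red, blue = 2, 8
p_red = Fraction(red, red + blue)     # 2/10 simplifies to 1/5
odds_against = (1 - p_red) / p_red    # (4/5) / (1/5) = 4

print(p_red)                          # fraction form
print(float(p_red))                   # decimal form
print(f"{float(p_red):.0%}")          # percentage form
print(f"{odds_against} to 1 against") # odds form
```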
Slide 13: 3. Probability of an event not occurring
- This is called the complement of an event
- It is calculated as
- 1 - probability of the event occurring
- P(not A) = 1 - P(A)
Slide 14: Disjoint Events
- Two events A and B are disjoint if they have no outcomes in common and so can never occur simultaneously
- E.g. You have one die and want to know the probability of rolling a 4 or a 6.
- Answer: If the die is fair, the chance of rolling a 4 is 1/6. The chance of rolling a 6 is also 1/6. The chance of rolling either a 4 or a 6 (can't have both at the same time) is 1/6 + 1/6 = 1/3.
- More generally, when A and B are disjoint (mutually exclusive),
- P(A or B) = P(A) + P(B)
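Both the complement rule and the addition rule for disjoint events can be verified by listing the faces of a fair die. The helper `prob` below is a hypothetical name introduced for this sketch, and it assumes all six faces are equally likely.

```python
from fractions import Fraction

DIE = {1, 2, 3, 4, 5, 6}   # sample space for one fair die


def prob(event):
    """Probability of an event (a set of faces) when all faces are equally likely."""
    return Fraction(len(event & DIE), len(DIE))


four, six = {4}, {6}

# Complement: P(not 4) = 1 - P(4)
p_not_four = prob(DIE - four)

# Disjoint events share no outcomes, so their probabilities simply add.
p_four_or_six = prob(four | six)

print(p_not_four)      # 5/6
print(p_four_or_six)   # 1/3
```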
Slide 15: Venn diagram of disjoint events
[Figure: Venn diagram showing events A and B]
Slide 16: 4. When 2 events can occur together
- Suppose that you now have a die and a coin
- What is the chance of getting a 4 and/or heads?
- A = Roll a 4
- B = Toss heads (!)
Slide 17
- Probability of rolling 4 = 1/6
- Probability of tossing heads = 1/2
- Probability of rolling 4 and/or heads
- = 1/6 + 1/2 = 4/6
- Or is it?
- Haven't we double counted the probability of getting a 4 and a heads?
- So we need to deduct this probability (1/6 × 1/2)
Slide 18: What we should really do is take away (1/6 × 1/2)
- Probability of rolling 4 and/or heads
- = 1/6 + 1/2 - (1/6 × 1/2)
- = 4/6 - 1/12
- = 7/12
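One way to confirm the 7/12 figure is to list all 12 equally likely (die, coin) outcomes and count those with a 4, a head, or both. The sketch assumes a fair die and a fair coin; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All 12 equally likely (die face, coin side) outcomes.
outcomes = list(product(range(1, 7), "HT"))

favourable = [(face, side) for face, side in outcomes if face == 4 or side == "H"]
p_four_or_heads = Fraction(len(favourable), len(outcomes))

# The same answer from the corrected calculation: 1/6 + 1/2 - (1/6 × 1/2)
p_formula = Fraction(1, 6) + Fraction(1, 2) - Fraction(1, 6) * Fraction(1, 2)

print(p_four_or_heads)   # 7/12
print(p_formula)         # 7/12
```

Counting outcomes directly never double counts (4, H), which is why it agrees with the formula only after the overlap is subtracted.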
Slide 19: To see this, suppose you only have a two-sided die (can only roll a one or a two) and a coin and want to know Pr(1 or heads or both)
- We know that Pr(1) = ½ and Pr(heads) = ½
- If we use the incorrect formula we get
- Pr(1 or heads or both) = ½ + ½ = 1
- Which is clearly incorrect
- There are 4 possible outcomes, 3 of which qualify as 1 or heads or both
- (1,H) (1,T) (2,H) (2,T)
- I.e. Pr(1 or heads or both) = ¾
- We have double counted Pr(both) = Pr(1,H), where
- Pr(1,H) = ½ × ½ = ¼
Slide 20
- The correct formula is therefore
- P(A or B) = P(A) + P(B) - P(A and B)
- Probability Notation
- A ∪ B = A union B = A or B occur
- A ∩ B = A intersection B = A and B occur
- So we can re-write the above as
- P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
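The union and intersection notation maps directly onto Python set operations (`|` and `&`). The sketch below checks the general addition rule on the two-sided die and coin example; `SPACE` and `prob` are hypothetical names, and equally likely outcomes are assumed.

```python
from fractions import Fraction
from itertools import product

# Sample space for a two-sided die and a coin: (1,H) (1,T) (2,H) (2,T).
SPACE = set(product((1, 2), "HT"))


def prob(event):
    """Probability of an event (a subset of SPACE) with equally likely outcomes."""
    return Fraction(len(event), len(SPACE))


A = {o for o in SPACE if o[0] == 1}      # roll a 1
B = {o for o in SPACE if o[1] == "H"}    # toss heads

lhs = prob(A | B)                        # P(A union B)
rhs = prob(A) + prob(B) - prob(A & B)    # P(A) + P(B) - P(A intersection B)

print(lhs, rhs)   # 3/4 3/4
```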
Slide 21: 5. Probability of two independent events occurring
- If knowing that one event occurs does not change the probability of another event, we say those two events are independent.
- And if A and B are independent, and we know the probability of each of them occurring, we can calculate the probability of them both occurring
Slide 22: Example: 2-sided die and coin, find Pr(1 and H)
- Answer: ½ × ½ = ¼
- Rule: P(A ∩ B) = P(A) × P(B)
Slide 23: e.g. Tossing one coin twice
- Suppose
- A = 1st toss is a head
- B = 2nd toss is a head
- What is the probability of A ∩ B?
- Answer: A and B are independent and are not disjoint. P(A) = 0.5 and P(B) = 0.5. P(A ∩ B) = 0.5 × 0.5 = 0.25.
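The two-toss example can be checked by enumeration. The sketch assumes a fair coin, so the four sequences are equally likely; it confirms that the multiplication rule holds and that the two events are not disjoint.

```python
from fractions import Fraction
from itertools import product

# Four equally likely results of tossing one coin twice: HH, HT, TH, TT.
SPACE = set(product("HT", repeat=2))


def prob(event):
    """Probability of an event (a subset of SPACE) with equally likely outcomes."""
    return Fraction(len(event), len(SPACE))


A = {o for o in SPACE if o[0] == "H"}   # 1st toss is a head
B = {o for o in SPACE if o[1] == "H"}   # 2nd toss is a head

p_both = prob(A & B)
print(p_both)                        # 1/4
print(p_both == prob(A) * prob(B))   # True: consistent with independence
print(len(A & B) > 0)                # True: A and B share an outcome, so not disjoint
```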
Slide 24: 6. Probability of two contingent events occurring
- If knowing that one event occurs does change the probability that the other occurs, then the two events are not independent and are said to be contingent upon each other
- If events are contingent then we can say that there is some kind of relationship between them
- So testing for contingency is one way of testing for a relationship
Slide 25: Example of contingent events
- There is a 70% chance that a child will go to university if its parents are middle class, but only a 10% chance if its parents are working class. Given that there is a 60% chance of a child's parents being working class, what are the chances that a child will be working class and go to university? What proportion of people at university will be from working class backgrounds?
Slide 26: (No Transcript)
Slide 27: 6% of all children are working class and end up going to University (0.6 × 0.1 = 0.06)
Slide 28: As a percentage of all children
                           Working class   Middle class
Go to University                 6%             28%
Do not go to University         54%             12%
Slide 29: % at Uni from WC parents?
- Of all children, only 34% end up at university (6% WC + 28% MC)
- I.e. 6 out of every 34 University students are from WC parents
- 6/34 = 17.65% of University students are WC
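The figures in this example follow from multiplying each class probability by the chance of university given that class. The sketch below reproduces them; it assumes working class and middle class are the only two categories (so P(MC) = 1 - P(WC)), and the dictionary layout is purely illustrative.

```python
from fractions import Fraction

p_wc = Fraction(60, 100)             # P(parents working class)
p_mc = 1 - p_wc                      # P(parents middle class), assuming two classes only
p_uni_given_wc = Fraction(10, 100)   # P(university | working class)
p_uni_given_mc = Fraction(70, 100)   # P(university | middle class)

# Joint probabilities: P(class and outcome) = P(class) × P(outcome | class)
joint = {
    ("WC", "uni"): p_wc * p_uni_given_wc,
    ("MC", "uni"): p_mc * p_uni_given_mc,
    ("WC", "no uni"): p_wc * (1 - p_uni_given_wc),
    ("MC", "no uni"): p_mc * (1 - p_uni_given_mc),
}

p_uni = joint[("WC", "uni")] + joint[("MC", "uni")]
share_wc_at_uni = joint[("WC", "uni")] / p_uni

for cell, p in joint.items():
    print(cell, f"{float(p):.0%}")
print(f"at university: {float(p_uni):.0%}")
print(f"university students from WC parents: {float(share_wc_at_uni):.2%}")
```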
Slide 30
- Probability theory states that
- if x and y are independent, then the probability of events x and y simultaneously occurring is simply equal to the product of the probabilities of the two events occurring
- if x and y are not independent, then
- Prob(x ∩ y) = Prob(x) × Prob(y given that x has occurred)
Slide 31: Test for independence
- We can use these two rules to test whether events are independent
- Does the distribution of observations across possible outcomes resemble the random distribution we would get if events were independent?
- I.e. if we assume independence and calculate the expected number of cases in each category, do these figures correspond fairly closely to the actual distribution of outcomes found in our data?
Slide 32: Example 1: Is there a relationship between social class and education? We might test this by looking at categories in our data of WC, MC, University, no University. Suppose we have 300 observations distributed as follows
                           Working class   Middle class
Go to University                18              84
Do not go to University        162              36
Slide 33
- To do the test for independence we need to compare expected with observed.
- How do we calculate ei, the expected number of observations in category i?
- I.e. the number of cases expected in i assuming that the variables are independent
- The formula for ei is the probability of an observation falling into category i multiplied simply by the total number of observations.
- I.e. no contingency
Slide 34
- So, if UNIY or UNIN and WC or MC are independent (i.e. assuming H0) then
- Prob(UNIY ∩ WC) = Prob(UNIY) × Prob(WC)
- so the expected number of cases for each of the four mutually exclusive categories are as follows
                           Working class           Middle class
Go to University           P(UNIY) × P(WC) × n     P(UNIY) × P(MC) × n
Do not go to University    P(UNIN) × P(WC) × n     P(UNIN) × P(MC) × n
Slide 35
- But how do we work out
- Prob(UNIY) and Prob(WC)
- which are needed to calculate Prob(UNIY ∩ WC)?
- Prob(UNIY ∩ WC) = Prob(UNIY) × Prob(WC)
- Answer: we assume independence and so estimate them from our data by simply dividing the total number in the given category by the total number of observations
- E.g. Prob(UNIY) = Total no. cases UNIY ÷ All observations
- = (18 + 84) / 300 = 0.34
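The estimate of Prob(UNIY), and of the other three marginal probabilities, can be sketched as follows. The observed counts are those given in the example; the function name `marginal` and the dictionary layout are illustrative assumptions.

```python
from fractions import Fraction

# Observed counts from the example (300 observations in total).
observed = {
    ("UNIY", "WC"): 18, ("UNIY", "MC"): 84,
    ("UNIN", "WC"): 162, ("UNIN", "MC"): 36,
}
n = sum(observed.values())


def marginal(label):
    """Share of all observations whose row or column label matches `label`."""
    total = sum(count for cell, count in observed.items() if label in cell)
    return Fraction(total, n)


for label in ("UNIY", "UNIN", "WC", "MC"):
    print(label, float(marginal(label)))
```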
Slide 36
                           Working class                             Middle class
Go to University           P(UNIY) × P(WC) × n                       P(UNIY) × P(MC) × n
                           = (no. at Uni / n) × (no. WC / n) × n     = (no. at Uni / n) × (no. MC / n) × n
Do not go to University    P(UNIN) × P(WC) × n                       P(UNIN) × P(MC) × n
                           = (no. not Uni / n) × (no. WC / n) × n    = (no. not Uni / n) × (no. MC / n) × n
Slide 37
                           Working class   Middle class   Total
Go to University                18              84         102
Do not go to University        162              36         198
Total                          180             120         300
Slide 38
                           Working class                             Middle class
Go to University           P(UNIY) × P(WC) × n                       P(UNIY) × P(MC) × n
                           = (102 / 300) × (180 / 300) × 300         = (102 / 300) × (120 / 300) × 300
Do not go to University    P(UNIN) × P(WC) × n                       P(UNIN) × P(MC) × n
                           = (198 / 300) × (180 / 300) × 300         = (198 / 300) × (120 / 300) × 300
Slide 39: Expected count in each category
                           Working class                             Middle class
Go to University           (102 / 300) × (180 / 300) × 300           (102 / 300) × (120 / 300) × 300
                           = 0.34 × 0.6 × 300 = 61.2                 = 0.34 × 0.4 × 300 = 40.8
Do not go to University    (198 / 300) × (180 / 300) × 300           (198 / 300) × (120 / 300) × 300
                           = 0.66 × 0.6 × 300 = 118.8                = 0.66 × 0.4 × 300 = 79.2
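The expected counts can be computed in one pass from the row and column totals, since P(row) × P(column) × n simplifies to row total × column total / n. The sketch below uses the observed counts from the example; the nested-list layout is an illustrative choice.

```python
observed = [[18, 84],    # go to university:        WC, MC
            [162, 36]]   # do not go to university: WC, MC

n = sum(sum(row) for row in observed)
row_totals = [sum(row) for row in observed]         # university, no university
col_totals = [sum(col) for col in zip(*observed)]   # WC, MC

# Expected count under independence:
# P(row) × P(column) × n = row total × column total / n
expected = [[r * c / n for c in col_totals] for r in row_totals]

for row in expected:
    print([round(e, 1) for e in row])
```

The expected counts keep the same row and column totals as the observed table; only the split within the table changes.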
Slide 40: So we have the actual count (i.e. from our data set)
                           Working class   Middle class
Go to University                18              84
Do not go to University        162              36
Slide 41: And the expected count (i.e. the numbers we'd expect if we assume class and education to be independent of each other)
                           Working class   Middle class
Go to University                61.2            40.8
Do not go to University        118.8            79.2
Slide 42: What does this table tell you?
                                            Working class   Middle class
Go to University           Actual count          18              84
                           Expected count        61.2            40.8
Do not go to University    Actual count         162              36
                           Expected count       118.8            79.2
Slide 43
- It tells you that if class and education were indeed independent of each other
- I.e. the outcome of one does not affect the chances of the outcome of the other
- Then you'd expect a lot more working class people in the data to have gone to university than actually recorded (61 people, rather than 18)
- Conversely, you'd expect far fewer middle class people to have gone to university (about half the number actually recorded).
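The size of the gap between actual and expected counts can be laid out cell by cell. This sketch only computes the differences; it does not decide whether they are large enough to reject independence. The counts are taken from the tables in this example, and the list layout is illustrative.

```python
labels = [("Go to University", "WC"), ("Go to University", "MC"),
          ("Do not go to University", "WC"), ("Do not go to University", "MC")]
observed = [18, 84, 162, 36]
expected = [61.2, 40.8, 118.8, 79.2]   # counts expected under independence

# Observed minus expected for each of the four cells.
differences = [round(o - e, 1) for o, e in zip(observed, expected)]

for (row, col), o, e, d in zip(labels, observed, expected, differences):
    print(f"{row:<24}{col:<4} observed {o:>4}  expected {e:>6}  difference {d:>+6}")

# Expected middle class university count as a share of the actual count.
print(round(expected[1] / observed[1], 2))
```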
Slide 44: But remember, all this is based on a sample, not the entire population
- Q/ Is this discrepancy due to sampling variation alone or does it indicate that we must reject the assumption of independence?