Title: Probability (cont.)
1 Probability (cont.)
2 Assigning Probabilities
- A probability is a value between 0 and 1 and is
written either as a fraction or as a proportion.
- For the complete set of distinct possible
outcomes of a random circumstance, the total of
the assigned probabilities must equal 1.
3 Complementary Events
One event is the complement of another event if
the two events do not contain any of the same
simple events and together they cover the entire
sample space. Notation: A^C represents the
complement of A.
Note: P(A) + P(A^C) = 1
Example: A Simple Lottery (cont.)
A = a player buying a single ticket wins.
A^C = the player does not win.
P(A) = 1/1000, so P(A^C) = 999/1000.
4 Classical Approach
- A mathematical index of the relative frequency or
likelihood of the occurrence of a specific event.
- Based on games of chance.
- The specific conditions of the game are known.
5 Estimating Probabilities from Observed
Categorical Data - Empirical Approach
Assuming data are representative, the probability
of a particular outcome is estimated to be the
relative frequency (proportion) with which that
outcome was observed.
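The empirical approach can be sketched in a few lines of Python. The observations below are hypothetical and serve only to illustrate the calculation; the function name `estimate_probabilities` is likewise an illustrative choice.

```python
from collections import Counter

def estimate_probabilities(observations):
    """Estimate each category's probability as its relative frequency."""
    counts = Counter(observations)
    total = len(observations)
    return {category: count / total for category, count in counts.items()}

# Hypothetical observed data: eye colour recorded for 10 people.
observed = ["brown", "blue", "brown", "green", "brown",
            "blue", "brown", "brown", "blue", "green"]

estimates = estimate_probabilities(observed)
print(estimates)
```

The estimated probabilities sum to 1, as required of any assignment of probabilities over the complete set of outcomes.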
6 Mutually Exclusive Events
Two events are mutually exclusive if they do not
contain any of the same simple events (outcomes).
Example: A Simple Lottery
A = all three digits are the same.
B = the first and last digits are different.
The events A and B are mutually exclusive.
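As a rough check of the lottery example, the sample space can be enumerated directly. This sketch assumes the lottery draws a three-digit number from 000 to 999, which matches the 1/1000 chance quoted for a single ticket.

```python
from itertools import product

# Assumed sample space: all three-digit numbers 000..999 as digit tuples.
sample_space = list(product(range(10), repeat=3))

# A: all three digits are the same; B: first and last digits differ.
event_a = {s for s in sample_space if s[0] == s[1] == s[2]}
event_b = {s for s in sample_space if s[0] != s[2]}

# Mutually exclusive means the two events share no outcomes.
mutually_exclusive = event_a.isdisjoint(event_b)

p_a = len(event_a) / len(sample_space)
p_b = len(event_b) / len(sample_space)
p_a_or_b = len(event_a | event_b) / len(sample_space)
print(mutually_exclusive, p_a, p_b, p_a_or_b)
```

Because the events do not overlap, the probability of "A or B" is simply the sum of the two individual probabilities.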
7 Independent and Dependent Events
- Two events are independent of each other if
knowing that one will occur (or has occurred)
does not change the probability that the other
occurs.
- Two events are dependent if knowing that one will
occur (or has occurred) changes the probability
that the other occurs.
8 Example: Independent Events
- Customers put a business card in a restaurant's glass
bowl.
- A drawing is held once a week for a free lunch.
- You and Vanessa put a card in for two consecutive weeks.
Event A = You win in week 1.
Event B = Vanessa wins in week 2.
- Events A and B refer to different random
circumstances and are independent.
9 Example: Dependent Events
Event A = Alicia is selected to answer Question 1.
Event B = Alicia is selected to answer Question 2.
Events A and B refer to different random
circumstances, but are A and B independent
events?
- P(A) = 1/50.
- If event A occurs, her name is no longer in the
bag, so P(B) = 0.
- If event A does not occur, there are 49 names in
the bag (including Alicia's name), so P(B) =
1/49.
Knowing whether A occurred changes P(B). Thus,
the events A and B are not independent.
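The reasoning in this example can be written out with exact fractions. The sketch assumes 50 names in the bag (implied by P(A) = 1/50) and that a drawn name is not returned. The unconditional P(B) is obtained with the law of total probability, a step the slide does not show.

```python
from fractions import Fraction

n_names = 50  # assumed bag size, implied by P(A) = 1/50

p_a = Fraction(1, n_names)                   # Alicia drawn for Question 1
p_b_given_a = Fraction(0, 1)                 # her name is already out of the bag
p_b_given_not_a = Fraction(1, n_names - 1)   # 49 names left, hers included

# Law of total probability: weight each case by how likely it is.
p_b = p_a * p_b_given_a + (1 - p_a) * p_b_given_not_a

# Independence would require P(B | A) to equal P(B).
independent = (p_b_given_a == p_b)
print(p_b, independent)
```

Both conditional probabilities differ from the unconditional P(B), which is what makes A and B dependent.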
10 Probability Calculations
- Some Useful Formulas to Keep in Mind (Or in Hand)
- ∪ = Union (or)
- ∩ = Intersection (and)
- General Formulas
- Adding (or)
- P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
- Non-mutually exclusive or overlapping outcomes.
- P(A ∪ B) = P(A) + P(B)
- Mutually exclusive outcomes.
11 Probability Calculations (cont.)
- General Formulas
- Multiplying (and/sequential events)
- P(A ∩ B) = P(A)P(B | A)
- Nonindependence: sampling without replacement
- P(A ∩ B) = P(A)P(B)
- Independence: sampling with replacement
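The addition and multiplication rules can be tried out on a small example. The card-deck scenario below is hypothetical (it does not appear in the slides) and assumes a standard 52-card deck.

```python
from fractions import Fraction

# Hypothetical example: a standard 52-card deck with 4 aces and 13 hearts.
p_ace = Fraction(4, 52)
p_heart = Fraction(13, 52)
p_ace_and_heart = Fraction(1, 52)   # the ace of hearts

# Addition rule for overlapping events: subtract the double-counted overlap.
p_ace_or_heart = p_ace + p_heart - p_ace_and_heart

# Multiplication rule: probability of drawing two aces in a row.
p_two_aces_without_replacement = p_ace * Fraction(3, 51)  # P(A) * P(B | A)
p_two_aces_with_replacement = p_ace * p_ace               # P(A) * P(B)

print(p_ace_or_heart, p_two_aces_without_replacement, p_two_aces_with_replacement)
```

Without replacement, the second draw depends on the first, so the conditional probability 3/51 replaces 4/52.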
12 Joint and Marginal Probabilities
- These probabilities refer to the proportion of an
event as a fraction of the total.
- P(30 to 64) = 62,689/103,870 = .60
- P(30 to 64 ∩ married) = 43,808/103,870 = .42
13 Unions and intersections
- P(A ∪ B) ≠ P(A) + P(B) because A and B do overlap.
- P(A ∪ B) = P(A) + P(B) - P(A ∩ B).
- A ∩ B is the intersection of A and B; it includes
everything that is in both A and B, and is
counted twice if we add P(A) and P(B).
14 P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
P(18 to 29 ∪ Married) = .21 + .57 - .07 = .71
15 Conditional Probability
- Consider two events A and B.
- What is the probability of A, given the
information that B occurred? P(A | B) = ?
- Example
- What is the probability that a woman is married
given that she is 18 to 29 years old?
16 Probability Problems
- P(Married | 18 to 29) = 7,842/22,512
17 Conditional probability and independence
- If we know that one event has occurred it may
change our view of the probability of another
event. Let A = rain today, B = rain tomorrow,
C = rain in 90 days' time.
- It is likely that knowledge that A has occurred
will change your view of the probability that B
will occur, but not of the probability that C
will occur.
- We write P(B | A) ≠ P(B), P(C | A) = P(C). P(B | A)
denotes the conditional probability of B, given
A.
- We say that A and C are independent, but A and B
are not.
- Note that for independent events P(A ∩ C) =
P(A)P(C).
18 Age and Marital Status
- P(M) = 59,920/103,870 = .57
- P(18 to 29) = 22,512/103,870 = .21
- P(M ∩ 18 to 29) = 7,842/103,870 = .07
- P(M ∪ 18 to 29) = .57 + .21 - .07 = .71
- P(M | 18 to 29) = 7,842/22,512 = .34
- P(M | 30 to 64) = 43,808/62,689 = .69
- Knowledge of the age changes P(M). Age and
marital status are not independent.
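The marginal, joint, union and conditional probabilities on this slide can be recomputed from the counts it quotes. This is a minimal sketch: the variable names are illustrative, and the independence check simply compares P(M | 18 to 29) with P(M).

```python
# Counts quoted on the slide.
total = 103_870
married = 59_920
age_18_29 = 22_512
married_and_18_29 = 7_842

p_m = married / total                       # marginal P(M)
p_young = age_18_29 / total                 # marginal P(18 to 29)
p_joint = married_and_18_29 / total         # joint P(M and 18 to 29)
p_union = p_m + p_young - p_joint           # addition rule
p_m_given_young = married_and_18_29 / age_18_29  # conditional P(M | 18 to 29)

# Independence would require P(M | 18 to 29) to equal P(M).
independent = abs(p_m_given_young - p_m) < 1e-9

print(f"P(M) = {p_m:.4f}, P(M | 18 to 29) = {p_m_given_young:.4f}")
print(f"P(M or 18 to 29) = {p_union:.4f}, independent: {independent}")
```

Working from unrounded values gives figures that can differ in the second decimal place from those obtained by combining the two-decimal values shown on the slide.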
19 Group Practice
20 Continuous variables
- A continuous random variable is one which can (in
theory) take any value in some range, for example
crop yield, maximum temperature, height, weight,
etc.
21 Probability distributions
- If we measure a random variable many times, we
can build up a distribution of the values it can
take.
- Imagine an underlying distribution of values
which we would get if it were possible to take
more and more measurements under the same
conditions.
- This gives the probability distribution for the
variable.
22 Continuous probability distributions
- Because continuous random variables can take all
values in a range, it is not possible to assign
probabilities to individual values.
- Instead we have a continuous curve, called a
probability density function, which allows us to
calculate the probability of a value falling within
any interval.
- This probability is calculated as the area under
the curve between the values of interest. The
total area under the curve must equal 1.
23 Normal (Gaussian) distributions
- Normal (also known as Gaussian) distributions are
by far the most commonly used family of
continuous distributions.
- They are bell-shaped and are indexed by two
parameters:
- The mean μ: the distribution is symmetric about
this value.
- The standard deviation σ: this determines the
spread of the distribution. Roughly 2/3 of the
distribution lies within 1 standard deviation of
the mean, and 95% within 2 standard deviations.
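The "2/3 within 1 standard deviation, 95% within 2" rule of thumb can be checked as an area under the normal curve. This sketch uses `NormalDist` from Python's standard `statistics` module; the helper name `prob_within` is an illustrative choice.

```python
from statistics import NormalDist

def prob_within(k, mu=0.0, sigma=1.0):
    """P(mu - k*sigma < X < mu + k*sigma) for a normal X."""
    dist = NormalDist(mu, sigma)
    # Area under the curve between the two limits.
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

within_1_sd = prob_within(1)   # roughly 0.68
within_2_sd = prob_within(2)   # roughly 0.95
print(round(within_1_sd, 4), round(within_2_sd, 4))
```

The result does not depend on the particular mean and standard deviation chosen, only on the number of standard deviations k.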
24 The probability of continuous variables
- IQ test
- Mean = 100 and sd = 15
- What is the probability of randomly selecting an
individual with a test score of 130 or greater?
- P(X < 95)?
- P(X > 112)?
- P(X < 95 or X > 112)?
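One way to work these IQ questions is with the normal cumulative distribution function. The sketch below takes the questions as P(X ≥ 130), P(X < 95) and P(X > 112); the directions of the last two inequalities are an assumption here. It uses `NormalDist` from Python's standard library.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)

p_130_or_more = 1 - iq.cdf(130)   # upper tail beyond 130, about 0.023
p_below_95 = iq.cdf(95)           # lower tail below 95, about 0.37
p_above_112 = 1 - iq.cdf(112)     # upper tail beyond 112, about 0.21

# The two tails cannot both occur, so the addition rule for
# mutually exclusive events applies.
p_either = p_below_95 + p_above_112

print(round(p_130_or_more, 4), round(p_below_95, 4),
      round(p_above_112, 4), round(p_either, 4))
```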
25 The probability of continuous variables (cont.)
- What is the probability of randomly selecting
three people with a test score greater than 112?
- Remember the multiplication rule for independent
events.
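A minimal sketch of this calculation, assuming the three people are selected independently and that scores are normal with mean 100 and sd 15 as on the previous slide:

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)

# Probability that one randomly selected person scores above 112.
p_above_112 = 1 - iq.cdf(112)

# Multiplication rule for independent events: P(A and B and C) = P(A)P(B)P(C).
p_three_above_112 = p_above_112 ** 3
print(round(p_three_above_112, 4))
```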
26 Introduction to Statistical Inference
27 Populations vs. Samples
- Population
- The complete set of individuals
- Characteristics are called parameters
- Sample
- A subset of the population
- Characteristics are called statistics.
- In most cases we cannot study all the members of
a population
29 Inferential Statistics
- Statistical Inference
- A series of procedures in which the data obtained
from samples are used to make statements about
some broader set of circumstances.
30 Two different types of procedures
- Estimating population parameters
- Point estimation
- Using a sample statistic to estimate a population
parameter.
- Interval estimation
- Estimation of the amount of variability in a
sample statistic when many samples are repeatedly
taken from a population.
- Hypothesis testing
- The comparison of sample results with a known or
hypothesized population parameter.
31 These procedures share a fundamental concept
- Sampling distribution
- A theoretical distribution of the possible values
of a sample statistic if an infinite number of
same-sized samples were taken from a population.
32 Example of the sampling distribution of a
discrete variable
33 Continuous Distributions
- Interval or ratio level data
- Weight, height, achievement, etc.
- JellyBlubbers!!!
34 Histogram of the Jellyblubber population
35 Repeated sampling of the Jellyblubber population
(n = 3)
36 Repeated sampling of the Jellyblubber population
(n = 5)
37 Repeated sampling of the Jellyblubber population
(n = 10)
38 Repeated sampling of the Jellyblubber population
(n = 40)
39 For more on this concept
- Visit
- http://www.ruf.rice.edu/~lane/stat_sim/sampling_dist/index.html
40 Central Limit Theorem
- Proposition 1
- The mean of the sampling distribution will equal
the mean of the population.
- Proposition 2
- The sampling distribution of means will be
approximately normal regardless of the shape of
the population, provided the sample size is large
enough.
- Proposition 3
- The standard deviation of the sampling distribution
(the standard error) equals the standard deviation
of the population divided by the square root of the
sample size. (see 11.5 in text)
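The three propositions can be explored with a small simulation. The population below is hypothetical (a skewed exponential distribution generated for illustration, not the Jellyblubber data), and samples are drawn with replacement at the sample sizes used in the repeated-sampling slides.

```python
import random
import statistics

random.seed(0)

# Hypothetical skewed population; not the Jellyblubber data.
population = [random.expovariate(1 / 10) for _ in range(20_000)]
pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)

def sample_means(n, repetitions=4_000):
    """Means of many random samples of size n drawn with replacement."""
    return [statistics.fmean(random.choices(population, k=n))
            for _ in range(repetitions)]

for n in (3, 5, 10, 40):
    means = sample_means(n)
    observed_se = statistics.stdev(means)   # spread of the sample means
    predicted_se = pop_sd / n ** 0.5        # Proposition 3
    print(n, round(statistics.fmean(means), 2),
          round(observed_se, 2), round(predicted_se, 2))
```

For each n, the mean of the sample means should sit close to the population mean, and the observed spread should track the predicted standard error. A histogram of `means` would show the shape becoming more bell-like as n grows.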
41 Application of the sampling distribution
- Sampling error
- The difference between the sample mean and the
population mean.
- Assumed to be due to random error.
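- From the Jellyblubber experience we know that a
sampling distribution of means will be approximately
normally distributed, with a mean equal to the
population mean and a standard deviation equal to
the standard error.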
42 Standard Error of the Mean and Confidence
Intervals
- We can estimate how much variability there is
among potential sample means by calculating the
standard error of the mean.
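A minimal sketch of the calculation, assuming the population standard deviation is unknown and is estimated by the sample standard deviation s, so that the standard error is s divided by the square root of n. The sample values are hypothetical.

```python
import math
import statistics

def standard_error_of_mean(sample):
    """Estimate the standard error of the mean as s / sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

# Hypothetical sample of five measurements.
sample = [4, 8, 9, 12, 17]
sem = standard_error_of_mean(sample)
print(round(sem, 2))
```

Larger samples shrink the standard error, which is why sample means from bigger samples cluster more tightly around the population mean.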
43 Confidence Intervals
- With our Jellyblubbers
- One random sample (n = 3)
- Mean = 9
- Therefore
- 68% CI = 9 ± 1(3.54)
- 95% CI = 9 ± 1.96(3.54)
- 99% CI = 9 ± 2.58(3.54)
44 Confidence Intervals
- With our Jellyblubbers
- One random sample (n = 30)
- Mean = 8.90
- Therefore
- 68% CI = 8.90 ± 1(1.11)
- 95% CI = 8.90 ± 1.96(1.11)
- 99% CI = 8.90 ± 2.58(1.11)
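The interval limits for both Jellyblubber samples can be worked out with a short helper. The means, standard errors and multipliers are those given on the slides; the function name `confidence_interval` is an illustrative choice.

```python
def confidence_interval(mean, standard_error, multiplier):
    """Return (lower, upper) = mean -/+ multiplier * standard_error."""
    margin = multiplier * standard_error
    return mean - margin, mean + margin

# Multipliers as used on the slides: 1 (68%), 1.96 (95%), 2.58 (99%).
multipliers = {"68%": 1.0, "95%": 1.96, "99%": 2.58}

for label, mean, se in [("n = 3", 9.0, 3.54), ("n = 30", 8.90, 1.11)]:
    for level, z in multipliers.items():
        lower, upper = confidence_interval(mean, se, z)
        print(f"{label}: {level} CI = ({lower:.2f}, {upper:.2f})")
```

The intervals for n = 30 are much narrower than those for n = 3 because the standard error falls as the sample size grows.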
45 Hypothesis Testing (see handout)
- State the research question.
- State the statistical hypothesis.
- Set decision rule.
- Calculate the test statistic.
- Decide if result is significant.
- Interpret result as it relates to your research
question.
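The steps above can be sketched with a one-sample z-test. The choice of test is an assumption, since the slide does not name one, and every number below is hypothetical, chosen only for illustration.

```python
import math
from statistics import NormalDist

# Hypothetical research question: does a sample differ from a population
# with mean 100 and sd 15? All numbers are made up for illustration.
pop_mean, pop_sd = 100, 15      # hypothesized population parameters
sample_mean, n = 106, 36        # hypothetical sample result
alpha = 0.05                    # decision rule: two-sided test at the 5% level

# Test statistic: distance of the sample mean from the hypothesized mean,
# measured in standard errors.
standard_error = pop_sd / math.sqrt(n)
z = (sample_mean - pop_mean) / standard_error

# Two-sided p-value from the standard normal distribution.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
significant = p_value < alpha

print(round(z, 2), round(p_value, 4), significant)
```

With these made-up numbers the result falls below the 5% threshold, so the null hypothesis would be rejected. The final step, interpreting that outcome in terms of the research question, remains a judgment the code cannot make.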