Correlations Revisited - PowerPoint PPT Presentation

About This Presentation
Title:

Correlations Revisited

Description:

The Probability of Lost Luggage '1 in 176 passengers on U.S. ... The proportion of passengers who lose their luggage is 1/176 or about 0.006 (6 out of 1000) ... – PowerPoint PPT presentation

Slides: 36
Provided by: scottm58
Learn more at: http://www.unm.edu

Transcript and Presenter's Notes

Title: Correlations Revisited


1
Correlations Revisited
2
Probability
  • "I think you're begging the question," said
    Haydock, "and I can see looming ahead one of those
    terrible exercises in probability where six men
    have white hats and six men have black hats and
    you have to work it out by mathematics how likely
    it is that the hats will get mixed up and in what
    proportion. If you start thinking about things
    like that, you would go round the bend. Let me
    assure you of that!"
  • Agatha Christie, The Mirror Crack'd

3
  • Misunderstanding of probability may be the
    greatest of all impediments to scientific
    literacy.
  • Stephen Jay Gould

4
The Personal Probability Interpretation
Personal probability of an event: the degree
to which a given individual believes the event
will happen. Sometimes the term subjective
probability is used because the degree of belief
may be different for each individual.
  • Restrictions on personal probabilities
  • Must fall between 0 and 1 (or between 0% and
    100%).
  • Must be coherent.

5
Probability Definitions and Relationships
Sample space: all the possible outcomes that can
occur.
Simple event: one outcome in the sample space, a
possible outcome of a random circumstance.
Event: a collection of one or more simple events
in the sample space, often written as A, B, C,
and so on.
6
Assigning Probabilities
  • A probability is a value between 0 and 1 and is
    written either as a fraction or as a proportion.
  • A probability simply is a number between 0 and 1
    that is assigned to a possible outcome of a
    random circumstance.
  • For the complete set of distinct possible
    outcomes of a random circumstance, the total of
    the assigned probabilities must equal 1.

7
Classical Approach
  • A mathematical index of the relative frequency or
    likelihood of the occurrence of a specific event.
  • Based on games of chance
  • The specific conditions of the game are known.

8
Determining the probability of an Outcome
(Classical)
A Simple Lottery: Choose a three-digit number
between 000 and 999. The player wins if his or her
three-digit number is chosen. Suppose the 1000
possible three-digit numbers (000, 001, 002, ...,
999) are equally likely. In the long run, a player
should win about 1 out of 1000 times. Probability
of winning = 0.001. This does not mean a player
will win exactly once in every thousand plays.
9
Example: Probability of Simple Events
Random circumstance: a three-digit winning
lottery number is selected.
Sample space: {000, 001, 002, 003, ..., 997, 998,
999}. There are 1000 simple events.
Probabilities for simple events: the probability
that any specific three-digit number is a winner
is 1/1000. Assume all three-digit numbers are
equally likely.
Event A = last digit is a 9 = {009, 019, ...,
999}. Since one out of ten numbers is in the set,
P(A) = 1/10.
Event B = three digits are all the same = {000,
111, 222, 333, 444, 555, 666, 777, 888, 999}.
Since event B contains 10 simple events,
P(B) = 10/1000 = 1/100.
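
The counts above can be checked by listing the sample space. This is a minimal sketch, not part of the original slides; the helper name probability is made up for illustration.

```python
from fractions import Fraction

# Sample space: the 1000 equally likely three-digit strings "000" .. "999".
sample_space = [f"{n:03d}" for n in range(1000)]

def probability(event):
    """Classical probability: favourable outcomes / total outcomes.

    `event` is a function that returns True for outcomes in the event.
    (Hypothetical helper, named here only for illustration.)
    """
    favourable = [s for s in sample_space if event(s)]
    return Fraction(len(favourable), len(sample_space))

# Event A: the last digit is a 9.
p_a = probability(lambda s: s[-1] == "9")

# Event B: all three digits are the same.
p_b = probability(lambda s: s[0] == s[1] == s[2])

print(p_a)  # 1/10
print(p_b)  # 1/100
```

Counting only works here because every simple event is assumed equally likely, as the slide states.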
10
Estimating Probabilities from Observed
Categorical Data - Empirical Approach
Assuming data are representative, the probability
of a particular outcome is estimated to be the
relative frequency (proportion) with which that
outcome was observed.
11
Methods of sampling
  • Simple random selection: every member of the
    population has an equal chance of being selected.
  • Systematic: every Xth person.
  • Stratified: random sampling by subgroup.
  • Why?
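
The three sampling methods can be sketched in a few lines. The population of 100 numbered people, the two strata, and all function names below are assumptions chosen for illustration; none of them come from the slides.

```python
import random

def simple_random_sample(population, n, rng):
    """Every member has an equal chance of being selected."""
    return rng.sample(population, n)

def systematic_sample(population, step, rng):
    """Pick a random starting point, then take every `step`-th member."""
    start = rng.randrange(step)
    return population[start::step]

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample separately within each subgroup."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

rng = random.Random(0)  # arbitrary seed so the run is repeatable

# Hypothetical population: people numbered 1..100, split into two subgroups.
population = list(range(1, 101))
strata = {"group_1": population[:50], "group_2": population[50:]}

srs = simple_random_sample(population, 10, rng)
sys_sample = systematic_sample(population, 10, rng)
strat = stratified_sample(strata, 5, rng)

print(srs)
print(sys_sample)
print(strat)
```

One possible answer to "Why?": a stratified sample guarantees that every subgroup is represented, which a simple random sample does not.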

12
Determining the probability of an Outcome
Empirical Approach
Observe the Relative Frequency of random
circumstances
The Probability of Lost Luggage: 1 in 176
passengers on U.S. airline carriers will
temporarily lose their luggage. This number is
based on data collected over the long run. So the
probability that a randomly selected passenger on
a U.S. carrier will temporarily lose luggage is
1/176 or about 0.006.
13
Proportions and Percentages as Probabilities
  • The proportion of passengers who lose their
    luggage is 1/176 or about 0.006 (6 out of 1000).
  • About 0.6% of passengers lose their luggage.
  • The probability that a randomly selected
    passenger will lose his/her luggage is about
    0.006.
  • The probability that you will lose your luggage
    is about 0.006.

The last statement is not exactly correct: your
probability depends on other factors (how late
you arrive at the airport, etc.).
14
Example: Probability of Male versus Female Births
  • Long-run relative frequency of males born in the
    United States is about 0.512 (512 boys born per
    1000 births)

The table provides results of a simulation: the
proportion is far from 0.512 over the first few
weeks, but in the long run it settles down around
0.512.
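
A simulation of this kind can be sketched as follows. This is not the simulation behind the slide's table; the seed, the number of births, and the function name are arbitrary choices made for illustration.

```python
import random

P_BOY = 0.512  # long-run relative frequency quoted on the slide

def running_proportion(n_births, rng):
    """Simulate births one at a time and record the proportion of boys so far."""
    boys = 0
    proportions = []
    for i in range(1, n_births + 1):
        if rng.random() < P_BOY:
            boys += 1
        proportions.append(boys / i)
    return proportions

rng = random.Random(42)  # arbitrary seed so the run is repeatable
props = running_proportion(100_000, rng)

# Early proportions can wander; later ones stay close to 0.512.
for n in (10, 100, 1_000, 10_000, 100_000):
    print(n, round(props[n - 1], 4))
```

The early rows vary a lot from run to run, while the last rows change very little, which is the long-run behaviour the slide describes.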
15
Nightlights and Myopia
Assuming these data are representative of a
larger population, what is the approximate
probability that someone from that population who
sleeps with a nightlight in early childhood will
develop some degree of myopia?
Note: 72 + 7 = 79 of the 232 nightlight users
developed some degree of myopia. So we estimate
the probability to be 79/232 ≈ 0.34.
16
Complementary Events
One event is the complement of another event if
the two events do not contain any of the same
simple events and together they cover the entire
sample space. Notation: A^C represents the
complement of A.
Note: P(A) + P(A^C) = 1
Example: A Simple Lottery (cont.)
A = player buying a single ticket wins.
A^C = player does not win.
P(A) = 1/1000, so P(A^C) = 999/1000.
17
Mutually Exclusive Events
Two events are mutually exclusive if they do not
contain any of the same simple events (outcomes).
Example: A Simple Lottery
A = all three digits are the same.
B = the first and last digits are different.
The events A and B are mutually exclusive.
18
Independent and Dependent Events
  • Two events are independent of each other if
    knowing that one will occur (or has occurred)
    does not change the probability that the other
    occurs.
  • Two events are dependent if knowing that one will
    occur (or has occurred) changes the probability
    that the other occurs.

19
Example: Independent Events
  • Customers put a business card in a restaurant
    glass bowl.
  • A drawing is held once a week for a free lunch.
  • You and Vanessa each put a card in the bowl in
    two consecutive weeks.

Event A = You win in week 1.
Event B = Vanessa wins in week 2.
  • Events A and B refer to different random
    circumstances and are independent.

20
Example: Dependent Events
Event A = Alicia is selected to answer Question
1.
Event B = Alicia is selected to answer
Question 2.
Events A and B refer to different random
circumstances, but are A and B independent
events?
  • P(A) = 1/50.
  • If event A occurs, her name is no longer in the
    bag, so P(B) = 0.
  • If event A does not occur, there are 49 names in
    the bag (including Alicia's name), so P(B) =
    1/49.

Knowing whether A occurred changes P(B). Thus,
the events A and B are not independent.
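
The numbers in this example can be combined with exact fractions. The sketch assumes 50 names in the bag and drawing without replacement, which is what P(A) = 1/50 and the 49 remaining names imply; the variable names are illustrative only.

```python
from fractions import Fraction

N = 50  # assumed number of names in the bag, implied by P(A) = 1/50

p_a = Fraction(1, N)                  # Alicia is drawn for Question 1
p_b_given_a = Fraction(0)             # her name is out of the bag
p_b_given_not_a = Fraction(1, N - 1)  # 49 names left, hers among them

# Overall probability of B, averaging over whether A occurred.
p_b = p_a * p_b_given_a + (1 - p_a) * p_b_given_not_a

print(p_b)                  # 1/50
print(p_b_given_a == p_b)   # False, so A and B are not independent
```

Before anything is drawn, Alicia's chance of answering Question 2 is the same 1/50, but once the outcome of A is known that chance becomes either 0 or 1/49.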
21
Joint and Marginal Probabilities
  • These probabilities refer to the proportion of an
    event as a fraction of the total.

22
Unions and intersections
  • P(A ∪ B) ≠ P(A) + P(B) because A and B do overlap.
  • P(A ∪ B) = P(A) + P(B) − P(A ∩ B).
  • A ∩ B is the intersection of A and B; it includes
    everything that is in both A and B, and is
    counted twice if we add P(A) and P(B).
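
As an illustration, the rule can be checked on the two lottery events from the earlier example (last digit is a 9; all three digits the same). Those events overlap in the single outcome 999. Reusing them here is a choice made for this sketch, not something the slide does.

```python
from fractions import Fraction

# Sample space of the simple lottery: "000" .. "999".
sample_space = {f"{n:03d}" for n in range(1000)}

A = {s for s in sample_space if s[-1] == "9"}          # last digit is a 9
B = {s for s in sample_space if s[0] == s[1] == s[2]}  # all digits the same

def p(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))

direct = p(A | B)                   # count the union directly
by_rule = p(A) + p(B) - p(A & B)    # add, then remove the double count

print(A & B)            # {'999'}
print(direct, by_rule)  # 109/1000 109/1000
```

Adding P(A) and P(B) alone would give 110/1000, because the outcome 999 is counted twice.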

24
Conditional Probability
  • Consider two events A and B.
  • What is the probability of A, given the
    information that B occurred? P(A | B) = ?
  • Example
  • What is the probability that a woman is married
    given that she is 18-29 years old?

25
Probability Problems
  • P(Married | 18-29) = 7,842/22,512 ≈ 0.348

26
Conditional probability and independence
  • If we know that one event has occurred it may
    change our view of the probability of another
    event. Let
  • A = rain today, B = rain tomorrow, C = rain
    in 90 days' time
  • It is likely that knowledge that A has occurred
    will change your view of the probability that B
    will occur, but not of the probability that C
    will occur.
  • We write P(B | A) ≠ P(B), P(C | A) = P(C).
    P(B | A) denotes the conditional probability of
    B, given A.
  • We say that A and C are independent, but A and B
    are not.
  • Note that for independent events P(A ∩ C) =
    P(A)P(C).

27
Conditional probability - tornado forecasting
  • Consider the classic data set on the next slide
    consisting of forecasts and observations of
    tornados (Finley, 1884).
  • Let
  • F = tornado forecast
  • T = tornado observed
  • Use the frequencies in the table to estimate
    probabilities; it's a large sample, so estimates
    should not be too bad.

28
Forecasts of tornados
29
Conditional probability - tornado forecasting
  • P(T) = 51/2803 = 0.0182
  • P(T ∩ F) = 28/2803
  • P(T | F) = 28/100 = 0.2800
  • P(T | F^C) = 23/2703 = 0.0085
  • Knowledge of the forecast changes P(T). F and T
    are not independent.
  • P(F | T) = 28/51 = 0.5490
  • P(T | F) and P(F | T) are often confused but are
    different quantities, and can take very different
    values.
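
These figures can be reproduced from a 2x2 table of counts. The table itself did not survive in this transcript, so the four cell counts below are inferred from the fractions quoted on the slide (for example 100 − 28 = 72 forecasts with no tornado) and should be treated as an assumption.

```python
from fractions import Fraction

# Cell counts inferred from the fractions on the slide, not copied from
# the original table: (forecast, observation) -> number of cases.
counts = {
    ("F", "T"): 28,             # tornado forecast and observed
    ("F", "not T"): 72,         # 100 forecasts in total, 28 verified
    ("not F", "T"): 23,         # tornado observed without a forecast
    ("not F", "not T"): 2680,   # 2703 non-forecasts, 23 with a tornado
}
total = sum(counts.values())    # 2803

p_t = Fraction(counts[("F", "T")] + counts[("not F", "T")], total)
p_f = Fraction(counts[("F", "T")] + counts[("F", "not T")], total)
p_t_and_f = Fraction(counts[("F", "T")], total)

# Conditional probability: P(X | Y) = P(X and Y) / P(Y).
p_t_given_f = p_t_and_f / p_f
p_f_given_t = p_t_and_f / p_t
p_t_given_not_f = Fraction(counts[("not F", "T")], total) / (1 - p_f)

print(round(float(p_t), 4))              # 0.0182
print(round(float(p_t_given_f), 4))      # 0.28
print(round(float(p_t_given_not_f), 4))  # 0.0085
print(round(float(p_f_given_t), 4))      # 0.549
```

P(T | F) and P(F | T) share the same numerator, P(T ∩ F), but are divided by different totals, which is why they can differ so much.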

30
Continuous and discrete random variables
  • A continuous random variable is one which can (in
    theory) take any value in some range, for example
    crop yield, maximum temperature.
  • A discrete variable has a countable set of
    values. They may be
  • counts, such as numbers of accidents
  • categories, such as much above average, above
    average, near average, below average, much below
    average
  • binary variables, such as dropout/no dropout

31
Probability distributions
  • If we measure a random variable many times, we
    can build up a distribution of the values it can
    take.
  • Imagine an underlying distribution of values
    which we would get if it was possible to take
    more and more measurements under the same
    conditions.
  • This gives the probability distribution for the
    variable.

32
Continuous probability distributions
  • Because continuous random variables can take all
    values in a range, it is not possible to assign
    probabilities to individual values.
  • Instead we have a continuous curve, called a
    probability density function, which allows us to
    calculate the probability of a value falling
    within any interval.
  • This probability is calculated as the area under
    the curve between the values of interest. The
    total area under the curve must equal 1.

33
Normal (Gaussian) distributions
  • Normal (also known as Gaussian) distributions are
    by far the most commonly used family of
    continuous distributions.
  • They are bell-shaped and are indexed by two
    parameters
  • The mean μ: the distribution is symmetric about
    this value.
  • The standard deviation σ: this determines the
    spread of the distribution. Roughly 2/3 of the
    distribution lies within 1 standard deviation of
    the mean, and 95% within 2 standard deviations.
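
The "roughly 2/3" and "95%" figures can be checked as areas under the normal curve. A minimal sketch using Python's standard library (statistics.NormalDist, available from Python 3.8); the function name prob_within is made up for illustration.

```python
from statistics import NormalDist

def prob_within(k, mu=0.0, sigma=1.0):
    """Area under the normal curve within k standard deviations of the mean."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

within_1 = prob_within(1)
within_2 = prob_within(2)

print(round(within_1, 4))  # 0.6827
print(round(within_2, 4))  # 0.9545
```

The result does not depend on the particular mean and standard deviation, only on how many standard deviations the interval spans.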

34
The probability of continuous variables
  • IQ test
  • Mean = 100 and SD = 15
  • What is the probability of randomly selecting an
    individual with a test score of 130 or greater?
  • P(X 95)?
  • P(X 112)?
  • P(X 95 or X 112)?

35
The probability of continuous variables (cont.)
  • What is the probability of randomly selecting
    three people with a test score greater than 112?
  • Remember the multiplication rule for independent
    events.
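
Two of the IQ questions can be worked as a sketch, assuming scores follow a normal distribution with mean 100 and standard deviation 15 and that the three people are selected independently. Because the distribution is treated as continuous, "130 or greater" and "greater than 130" have the same probability.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)

# P(X >= 130): area to the right of 130, which is 2 SDs above the mean.
p_130_or_more = 1 - iq.cdf(130)

# P(X > 112): area to the right of 112, which is 0.8 SDs above the mean.
p_above_112 = 1 - iq.cdf(112)

# Multiplication rule for independent events: all three people score above 112.
p_three_above_112 = p_above_112 ** 3

print(round(p_130_or_more, 4))      # 0.0228
print(round(p_above_112, 4))        # 0.2119
print(round(p_three_above_112, 4))  # 0.0095
```

The multiplication step is valid only under the independence assumption, for example when the three people are drawn at random from a large population.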