Title: Probability & Standard Error of the Mean
1Probability & Standard Error of the Mean
2Definition Review
- Population: all possible cases
- Parameters describe the population
- Sample: subset of cases drawn from the population
- Statistics describe the sample
Statistics Parameters
5Why Sample????
- Can afford it
- Can do it in reasonable time
- Can estimate the amount of error (uncertainty) in
statistics, allowing us to generalize (within
limits) to our population
6Even with True Random Selection
- Some error (inaccuracy) is associated with the
statistics (they will not precisely match the
parameters)
- sampling error: everybody is different
- The whole is measured only if ALL the parts are
measured.
7With unbiased sampling
- Know that the amount of error is reduced as the n
is increased
- statistics more closely approximate the
parameters
- Amount of error associated with statistics can be
evaluated
- estimate by how much our statistics may differ
from the parameters
8Sample size Rules of thumb
- Larger n the better
- law of diminishing returns
- e.g., going from n = 100 to 200 vs from 1500 to 1600
- cost and time constraints
- Less variability in population -> better estimate
in statistics
- reduce factors affecting variability
- control and standardization
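The diminishing-returns point can be illustrated numerically. This is only a sketch: it assumes the standard error formula SD / sqrt(n) covered later in these slides, and the population SD of 16 and the function name are hypothetical choices made for illustration.

```python
import math

def standard_error(sd, n):
    """Standard error of the mean: SD divided by the square root of n."""
    return sd / math.sqrt(n)

SD = 16  # hypothetical population SD, chosen only for illustration

# Adding 100 cases to a small sample helps far more than adding 100 to a large one.
gain_small = standard_error(SD, 100) - standard_error(SD, 200)
gain_large = standard_error(SD, 1500) - standard_error(SD, 1600)

print(f"n 100 -> 200:   SE drops by {gain_small:.3f}")
print(f"n 1500 -> 1600: SE drops by {gain_large:.3f}")
```

The same 100 extra cases buy much less precision once n is already large.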
9Human beings are terrible randomizers
10True Random sampling rare
- What population is the investigator interested
in???
- Getting a true random sample of any population is
difficult if not impossible
- subject refusal to participate
11Catch 22
- NEVER know our true population parameters, so we
are ALWAYS at risk of making an error in
generalization
12Probability
13Backbone of inferential stats
- Probability: the number of times some event is
likely to occur out of the total possible events
$p = \dfrac{\text{number of a particular event}}{\text{number of possible events}}$
14Backbone of inferential stats
- The classic flip a coin
- heads vs tails: each at 1/2 (50%)
- flip 8x: what possible events (outcomes)??
- flip it 8 million times: what probable
distribution of heads/tails?
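Both coin questions can be explored with a short script. This is a minimal sketch assuming a fair coin; the seed and the number of simulated flips are arbitrary choices (far fewer than 8 million, to keep the run short).

```python
import random
from collections import Counter
from itertools import product

# Enumerate every possible sequence of 8 flips (2**8 = 256 outcomes).
sequences = list(product("HT", repeat=8))
heads_counts = Counter(seq.count("H") for seq in sequences)

for k in range(9):
    p = heads_counts[k] / len(sequences)
    print(f"{k} heads: {heads_counts[k]:3d} sequences, p = {p:.4f}")

# Simulate many flips of a fair coin; the proportion of heads settles near 0.5.
rng = random.Random(0)  # fixed seed so the run is repeatable
n_flips = 100_000
heads = sum(rng.random() < 0.5 for _ in range(n_flips))
print(f"proportion of heads in {n_flips} flips: {heads / n_flips:.4f}")
```

With only 8 flips, results far from 4 heads are quite possible; with a very large number of flips, the proportion of heads stays close to one half.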
15Wayne Gretzky
16Wayne Gretzky probability
What is the probability that a geeky looking kid
from Brantford, Ontario, Canada would meet, much
less marry, a movie star?
17Wayne's famous quote
18Wayne Gretzky redux.
19Life with Probability
- life insurance rates
- obesity
- smoking
- car insurance rates
- age
- previous accidents
- driving demerits
- flood insurance
"All life depends on probabilities." - Voltaire (1756)
20The Ever-Changing Nature of %s
"Never go for a 50-50 ball unless you're 80-20
sure of winning it." - Ian Darke
The 50/50/90 Rule: whenever you have a 50/50 chance of
guessing at something, there's a 90% chance you
will guess wrong. - Menard's Philosophy
21How to Count Cards
We are going to show you how to count cards. Card
counting is not illegal. If caught counting cards
you will not be arrested. You will not be taken
into the back room and beaten unconscious, then
dragged to the desert and buried with the rest of
the casino cheaters. You will not get your
fingers cut off with a butcher knife by Michael
Corleone. However, if caught counting cards you
may be banned from playing at that casino. You
have to be smart about counting cards and don't
be too obvious. You do not want to be banned from
the casino that you are sleeping at. If you are
going to try your luck at counting cards we
suggest you go down the street to a different
casino in case you get caught. Use this
information at your own risk.
From gamblingandgaming_at_hotmail.com
22One of the most popular card counting systems
currently in use is the point count system, also
known as Hi-Low. This system is based on
assigning a point value of +1, 0, or -1 to every
card dealt to all players on the table, including
the dealer. Each card is assigned its own
specific point value. Aces and 10-point cards are
assigned a value of -1. Cards 7, 8, 9 each count
as 0. Cards 2, 3, 4, 5, and 6 each count as +1.
As the cards are dealt, the player mentally keeps
a running count of the cards exposed, and makes
wagering decisions based on the current count
total.
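The running count described above can be sketched in a few lines. The card labels, the dictionary name, and the sample hand are hypothetical, chosen only to illustrate the point values given in the text.

```python
# Hi-Low point values as described above: 2-6 count +1, 7-9 count 0,
# ten-point cards and aces count -1.
HI_LOW_VALUES = {
    "2": 1, "3": 1, "4": 1, "5": 1, "6": 1,
    "7": 0, "8": 0, "9": 0,
    "10": -1, "J": -1, "Q": -1, "K": -1, "A": -1,
}

def running_count(cards):
    """Return the Hi-Low running count for a sequence of exposed card ranks."""
    return sum(HI_LOW_VALUES[card] for card in cards)

exposed = ["5", "K", "2", "9", "A", "6", "3"]  # hypothetical cards seen so far
print(running_count(exposed))
```

A positive result means more low cards than high cards have been exposed, so the cards still to be dealt are relatively rich in tens and aces.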
23A higher plus count, i.e. a higher percentage of
ten-point cards and aces remaining to be dealt,
means that the advantage is with the player, who
should increase the wager. If the running count
is around zero, the deck or shoe is neutral and
neither the player nor the dealer has an
advantage. The higher the minus count, the
greater the disadvantage to the player, as a
higher than normal number of 'stiff' cards
remains to be dealt. In this case a player should
make the minimum wager or leave the table.
24As the dealing of the cards progresses, the
count becomes a more reliable guide, and the
size of the player's wager can be increased or
decreased accordingly: betting more, with a
better probability of winning, when the deck or
shoe is rich in face cards and aces, and betting
and losing less when the deck is rich in 'stiff'
cards. It is important to note that a player's
decision process (when to hit, stand, double
down, etc.) is still based on basic strategy.
Remember, you MUST learn basic strategy. However,
alterations in basic strategy play are sometimes
recommended based on the current card count.
25For example, if the running count is +2 or
greater and you have a hard 16 against a dealer's
up card of ten, you should stand, which is a
direct violation of basic strategy. But
considering that the deck or shoe is rich in face
cards, you are more likely to bust in this
situation, so you ignore basic strategy and
stand. Another example is to always take
insurance when the count is +3 or greater. For
the most part however, you should stick with
basic strategy and use the card count as an
indication of when to increase or decrease the
amount of your bet, as that is the whole strategy
behind card counting.
26Probability & the Normal Curve
- Normal Curve
- mathematical abstraction
- unimodal
- symmetrical (Mean = Mode = Md)
- Asymptotic (any score possible)
- a family of curves
- Means the same, SDs are different
- Means are different, SDs the same
- both Means & SDs are different
27Dice Roll Outcomes
Each die has six equally likely outcomes when
thrown: the numbers one through six. Two dice
thrown together have a total of 36 possible
outcomes: the six outcomes of one die combined
with the six outcomes of the other.
28Dice Roll Outcomes
Total  Combinations  Dice Combinations
2      one           1-1
3      two           1-2, 2-1
4      three         1-3, 3-1, 2-2
5      four          1-4, 4-1, 2-3, 3-2
6      five          1-5, 5-1, 2-4, 4-2, 3-3
7      six           1-6, 6-1, 2-5, 5-2, 3-4, 4-3
8      five          2-6, 6-2, 3-5, 5-3, 4-4
9      four          3-6, 6-3, 4-5, 5-4
10     three         4-6, 6-4, 5-5
11     two           5-6, 6-5
12     one           6-6
Notice how certain totals can be thrown in more
ways, and so are more probable on a random throw
of the two dice.
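The counts of combinations for each total can be checked by enumerating all 36 outcomes. This is a small sketch assuming two fair six-sided dice; the variable names are illustrative.

```python
from collections import Counter
from itertools import product

# All 36 ordered outcomes of two six-sided dice.
rolls = list(product(range(1, 7), repeat=2))
totals = Counter(a + b for a, b in rolls)

for total in range(2, 13):
    ways = totals[total]
    print(f"{total:2d}: {ways} combinations, p = {ways}/36 = {ways / 36:.3f}")
```

The probabilities rise to a peak at 7 and fall away symmetrically on either side.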
29Probability & the Normal Curve
- 99.7% of ALL cases within plus or minus 3
Standard Deviations
- Any score is possible
- but some more likely than others (which one?)
- Using the NC table
- Mean = 50
- SD = 7
- What is probability of getting a score > 64?
- one-tailed probability
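As a cross-check on the normal curve table, the one-tailed probability can be computed directly. This sketch assumes the scores are exactly normally distributed and uses Python's standard-library `statistics.NormalDist`.

```python
from statistics import NormalDist

scores = NormalDist(mu=50, sigma=7)

z = (64 - 50) / 7                 # 64 is 2 SDs above the mean
p_above_64 = 1 - scores.cdf(64)   # one-tailed: area in the upper tail only

print(f"z = {z:.2f}, P(score > 64) = {p_above_64:.4f}")
```

The result, a little over 2%, should agree with the table value for the area beyond z = 2.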
30Probability & the Normal Curve
- Using the NC table
- What is probability of getting a score that is
more than one SD above OR more than one SD below
the mean?
- two-tailed probability
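The two-tailed case adds the areas in both tails. This is a sketch under the same assumption of an exactly normal distribution, worked in z units so it applies to any mean and SD.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, SD 1, so scores are already z values

p_upper = 1 - standard_normal.cdf(1)   # more than one SD above the mean
p_lower = standard_normal.cdf(-1)      # more than one SD below the mean
p_two_tailed = p_upper + p_lower

print(f"P(|z| > 1) = {p_two_tailed:.4f}")
```

Because the curve is symmetrical, the two tails are equal, and together they hold roughly the 32% of cases that fall outside plus or minus one SD.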
31Defining probable or likely
- What risk are YOU willing to take?
- Fly to Europe for $1,000,000
- BUT
- 50% chance plane will crash
- 25% chance
- 1% chance
- .001% chance
- .000000001% chance
32Defining probable or likely
- In science, we accept as unlikely to have
occurred at random (by chance)
- 5% (0.05)
- 1% (0.01)
- 10% (0.10)
May be one-tailed or two-tailed
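Each of these probability levels corresponds to a cutoff z value on the normal curve, and the cutoff depends on whether the test is one-tailed or two-tailed. The sketch below assumes a normal distribution and uses `NormalDist.inv_cdf` from the standard library; the variable names are illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist()

for alpha in (0.10, 0.05, 0.01):
    z_one_tailed = standard_normal.inv_cdf(1 - alpha)       # all of alpha in one tail
    z_two_tailed = standard_normal.inv_cdf(1 - alpha / 2)   # alpha split across both tails
    print(f"alpha = {alpha:.2f}: one-tailed z = {z_one_tailed:.3f}, "
          f"two-tailed z = {z_two_tailed:.3f}")
```

The two-tailed cutoff for 0.05 is about 1.96.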
33Serious people take seriously probabilities, not
mere possibilities.
George Will, 11/2/2000
34Six monkeys fail to write Shakespeare (Pantagraph,
May 2003)
35Probability & the Normal Curve
- Any score is possible, but some more likely than
others
- Key to any problem in statistical inference is to
discover what sample values will occur in
repeated sampling and with what probability.
With what probability will a score arise by
chance that is as extreme as a certain value????
36Statistics Humour
A man who travels a lot was concerned about the
possibility of a bomb on board his plane. He
determined the probability of this, found it to
be low but not low enough for him. So now he
always travels with a bomb in his suitcase. He
reasons that the probability of two bombs
being on board would be infinitesimal.
37Sampling Distributions & Standard Error of the
Mean
40Recall
- With sampling, we EXPECT error in our statistics
- statistics not equal to parameters
- cause: random (chance) errors
- Unbiased sampling: no factors systematically
pushing estimate in a particular direction
- Larger sample: less error
41Central Limit Theorem
- Consider (conceptualize) a distribution of sample
means drawn from a distribution
- repeated sampling (calculating mean) from the
same population
- produces a distribution of sample means
42Central Limit Theorem
- A distribution of sample means drawn from a
distribution (the sampling distribution of means)
will be approximately a normal distribution
- class exercise: from a list of 51 state taxes, each
student creates 5 random samples of n = 6.
- Look at distribution in SPSS
- Mp = 32.7 cents, SD = 18.1 cents
45Central Limit Theorem
- Mean of distribution of sampling means equals
population mean if the n of means is large
- true even when population is skewed if sample is
large (n > 60)
- SD of the distribution of sampling means is the
Standard Error of the Mean
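These Central Limit Theorem claims can be checked by simulation. The sketch below is an illustration only: the skewed (exponential) population, its size, the sample size of 64, and the number of repeated samples are all hypothetical choices, and SD / sqrt(n) is the standard error formula used in the worked examples in these slides.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical skewed population: 10,000 exponentially distributed scores.
population = [rng.expovariate(1 / 30) for _ in range(10_000)]
pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)

n = 64             # size of each sample
n_samples = 5_000  # number of repeated samples

# Build the sampling distribution of the mean by repeated random sampling.
sample_means = [
    statistics.fmean(rng.sample(population, n)) for _ in range(n_samples)
]

print(f"population mean      = {pop_mean:.2f}")
print(f"mean of sample means = {statistics.fmean(sample_means):.2f}")
print(f"SD of sample means   = {statistics.stdev(sample_means):.2f}")
print(f"SDp / sqrt(n)        = {pop_sd / n ** 0.5:.2f}")
```

Even though the population is strongly skewed, the mean of the sample means lands close to the population mean, and their SD lands close to SDp / sqrt(n).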
46Take home lesson
- We have quantified the expected error (estimate
of uncertainty) associated with our sample mean
- Standard Error of the Mean
- SD of the distribution of sampling means
51Central Limit Theorem
52Typical procedure
- Sample, calculate mean & SD
- KNOW & RECOGNIZE that
- statistics are not exact estimates of the
parameters
- a larger n provides a less variable measure of
the mean
- sampling from a population with low variability
gives a more precise estimate of the mean
53Estimating Sample SEm
54Example Calculation
- Mean = 75
- SDp = 16
- n = 64
- SEm = ???
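The calculation uses the standard error formula, SD divided by the square root of n. Written out for this example (a worked sketch; the symbols SE_m and SD_p simply mirror the slide's SEm and SDp):

```latex
SE_m = \frac{SD_p}{\sqrt{n}} = \frac{16}{\sqrt{64}} = \frac{16}{8} = 2
```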
56Confidence Interval for the Mean
- Mean = 75
- SDp = 16
- n = 64
- SEm = 2
We are about 68% sure that population mean lies
between 73 and 77
[Figure: normal curve centered on the sample mean (75), with 73 and 77 marking the middle 68%]
57Confidence Interval for the Mean
- Mean = 75
- SDp = 16
- n = 64
- SEm = 2
73 and 77 are the lower and upper limits of the
68% confidence interval for the population mean
[Figure: normal curve centered on the sample mean (75), with 73 and 77 marking the middle 68%]
58Example Calculation
- Mean = 75
- SDp = 16
- n = 16
- SEm = ???
59Example Calculation
- Mean = 75
- SDp = 16
- n = 640
- SEm = ???
60Example Calculation
- Mean = 75
- SDp = 160
- n = 16
- SEm = ???
61Example Calculation
- Mean = 75
- SDp = 160
- n = 640
- SEm = ???
62Explain how SD and n affect the error inherent in
estimating the population mean
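The example calculations can be compared side by side with a small script. This is a sketch using the SD / sqrt(n) formula; the function name and the `examples` list are illustrative, with the values taken from the slides.

```python
import math

def standard_error(sd, n):
    """Standard error of the mean: SD / sqrt(n)."""
    return sd / math.sqrt(n)

# (SDp, n) pairs from the example slides; the mean (75) does not enter the formula.
examples = [(16, 64), (16, 16), (16, 640), (160, 16), (160, 640)]

results = {(sd, n): standard_error(sd, n) for sd, n in examples}
for (sd, n), se in results.items():
    print(f"SDp = {sd:3d}, n = {n:3d} -> SEm = {se:.2f}")
```

Multiplying the SD by 10 multiplies the error by 10, while multiplying n by 10 only divides it by about 3.16 (the square root of 10).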
63 95% Confidence Interval for the Mean
- Mean = 80
- SDp = 20
- n = 36
- SEm = ??
[Figure: normal curve centered on the sample mean (80), with the limits of the middle 95% left as ??]
64 95% Confidence Interval for the Mean
- Mean = 80
- SDp = 20
- n = 36
- SEm = 3.33
$1.96 \times 3.33 = 6.53$; Up $= 80 + 6.53$; Lo $= 80 - 6.53$
[Figure: middle 95% of the normal curve]
65 95% Confidence Interval for the Mean
- Mean = 80
- SDp = 20
- n = 36
- SEm = 3.33
73.47 and 86.53 are the lower and upper limits of
the 95% confidence interval for the population
mean
[Figure: normal curve centered on the sample mean (80), with 73.47 and 86.53 marking the middle 95%]
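The confidence interval steps can be collected into one small function. This is a sketch, not a definitive implementation: it assumes a z-based interval (normal curve, known SD), and the function name and parameters are hypothetical.

```python
import math
from statistics import NormalDist

def confidence_interval(mean, sd, n, level):
    """Return (lower, upper) limits of a z-based confidence interval for the mean."""
    se = sd / math.sqrt(n)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # e.g. about 1.96 for level = 0.95
    return mean - z * se, mean + z * se

lower, upper = confidence_interval(mean=80, sd=20, n=36, level=0.95)
print(f"95% CI: {lower:.2f} to {upper:.2f}")
```

For the 68% interval in the earlier example, the slides use plus or minus exactly one SEm, which corresponds to a level of roughly 0.683 in this function.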
66Key to any problem in statistical inference is to
discover what sample values will occur
in repeated sampling and with what probability.