Title: Statistical decision-making with means: z and t
1 Statistical decision-making with means: z and t
2 More statistical decision-making.
- Review of sign test.
- Sampling distribution of the mean
  - What is it?
  - Why is it normally distributed?
- Hypothesis testing with means and the z distribution when μ and σ are known.
- Hypothesis testing with means and t when μ is known and σ must be estimated.
Reminder: Exam 2 is April 2!
3 Course notes.
- Chapter 12
  - Logic of using the normal curve for hypothesis testing.
  - Hypothesis testing with one sample when μ and σ are known: z.
  - Skip pages 279-285.
- Chapter 13
  - Hypothesis testing with one sample when μ is known but σ is unknown: t.
  - Skip pages 305-308.
- Exam 2
  - will cover Chapters 8, 10, 12, and 13.
  - handed out on 4/02 and due back at the beginning of class on 4/09.
4 Sign test.
- The sign test can be used when you have two sets of scores that are paired because they were collected from the same person while he/she participated in two different conditions.
  - Known as repeated measures, within-subjects, or correlated measures designs.
- The goal of the sign test is to determine whether scores collected under one condition differ from those collected under the other condition.
  - Significant as opposed to random (non-significant) difference.
- Do any differences between conditions reflect chance variation (H0), or do the differences suggest that, under one of the conditions, people were acting like they were drawn from another population (HA)?
5 Example of sign test: nicotine and heart rate.
- Nicotine is a psychomotor stimulant, so it should increase heart rate.
- Nicotine is present in tobacco and in the tobacco smoke that smokers inhale.
- If tobacco smokers are taking in nicotine, their heart rate should increase during smoking as compared to before smoking.
- Heart rate fluctuates naturally over time (random variability).
  - Sometimes the heart beats faster
  - Sometimes the heart beats slower
- Tobacco smoking is expected to increase heart rate beyond this natural random variability.
6 One way to examine the effect of tobacco smoke on HR
- Recruit 32 smokers.
- Ask them to abstain from smoking for 8 hours.
- Measure HR before and during smoking.
  - Condition 1: Measure HR for 5 minutes before smoking (HRbefore)
  - Condition 2: Measure HR for 5 minutes during smoking (HRduring)
7 HA: Some questions to think about.
- If tobacco smoking does increase HR (systematic change):
  - Which should be greater, HRbefore or HRduring?
  - If you subtracted HRbefore from HRduring (HRduring − HRbefore), would you expect the resulting difference score to be positive (+) or negative (−)?
  - Across all subjects, would you expect more plusses or more minuses?
  - Is p(Plus) greater than, less than, or equal to 0.5?
8 H0: Some questions to think about.
- If tobacco smoking does not increase HR (random variation):
  - Which should be greater, HRbefore or HRduring?
  - If you subtracted HRbefore from HRduring (HRduring − HRbefore), would you expect the resulting difference score to be positive (+) or negative (−)?
  - Across all subjects, would you expect more plusses or more minuses?
  - Is p(Plus) greater than, less than, or equal to 0.5?
9 Decision-making steps
- 1. Define problem: Does tobacco smoking increase HR?
- 2. Define hypotheses with respect to sign of (HRduring − HRbefore)
  - H0: Tobacco smoking does not increase HR; p(Plus) ≤ 0.50
  - HA: Tobacco smoking increases HR; p(Plus) > 0.50
- 3. Define experiment: 32 smokers, measure HR before/during smoking.
- 4. Define statistic: P, the number of plusses observed with 32 subjects.
- 5. Define acceptable probability of Type I error: α ≤ .05.
- 6. Define value of statistic upon which your decision hinges: P = ??
11 Rejection region: these outcomes make H0 very difficult to believe; p(P ≥ 22) = 0.0249
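The rejection region above can be checked with a short calculation. The following is a minimal sketch, assuming the number of plusses follows a binomial distribution with N = 32 and p(Plus) = 0.5 when H0 is true; the function names `upper_tail` and `critical_value` are illustrative and not part of the course materials. The exact tail probability comes out near 0.0251, so the small gap from the slide's 0.0249 is presumably what you get from summing rounded table entries.

```python
from math import comb

def upper_tail(k: int, n: int, p: float = 0.5) -> float:
    """Exact probability of observing k or more plusses out of n."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def critical_value(n: int, alpha: float = 0.05) -> int:
    """Smallest count whose upper-tail probability does not exceed alpha."""
    for k in range(n + 1):
        if upper_tail(k, n) <= alpha:
            return k
    raise ValueError("no rejection region exists for this alpha")

n_subjects = 32
p_crit = critical_value(n_subjects)

# Critical number of plusses and the probability of landing in the rejection region
print(p_crit, round(upper_tail(p_crit, n_subjects), 4))
```

Lowering the cutoff by one (21 or more plusses) pushes the tail probability above .05, which is why the directional rejection region starts at 22.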
12 Decision-making steps
- 1. Define problem: Does tobacco smoking increase HR?
- 2. Define hypotheses with respect to sign of (HRduring − HRbefore)
  - H0: Tobacco smoking does not increase HR; p(Plus) ≤ 0.50
  - HA: Tobacco smoking increases HR; p(Plus) > 0.50
- 3. Define experiment: 32 smokers, measure HR before/during smoking.
- 4. Define statistic: P, the number of plusses observed with 32 subjects.
- 5. Define acceptable probability of Type I error: α ≤ .05.
- 6. Define value of statistic upon which decision hinges: Reject H0 if P ≥ 22
- 7. Perform experiment/collect data
- 8. Compare observed statistic to critical value.
- 9. Decide
- 10. Draw conclusion using at least one complete sentence
13 Decision-making steps
- 1. Define problem: Does tobacco smoking increase HR?
- 2. Define hypotheses with respect to sign of (HRduring − HRbefore)
  - H0: Tobacco smoking does not increase HR; p(Plus) ≤ 0.50
  - HA: Tobacco smoking increases HR; p(Plus) > 0.50
- 3. Define experiment: 32 smokers, measure HR before/during smoking.
- 4. Define statistic: P, the number of plusses observed with 32 subjects.
- 5. Define acceptable probability of Type I error: α ≤ .05.
- 6. Define value of statistic upon which decision hinges: Reject H0 if P ≥ 22
- 7. Perform experiment/collect data: P = 30
- 8. Compare observed statistic to critical value: Is 30 in the rejection region?
- 9. Decide: Reject H0
- 10. Draw conclusion using at least one complete sentence: Based on these results, tobacco smoking does increase heart rate.
14 Example of sign test: denicotinized cigs and HR.
- Normal tobacco smoke increases HR in abstinent smokers.
- Is it the nicotine or some other smoke constituent that causes this HR increase?
- What would happen if smokers smoke denicotinized tobacco cigarettes instead of normal tobacco cigarettes? Would heart rate increase, decrease, or stay the same?
15 One way to examine the effect of denic cigs on HR
- Recruit 32 smokers.
- Ask them to abstain from smoking for 8 hours.
- Measure HR before and during smoking of denicotinized tobacco cigarettes.
  - Condition 1: Measure HR for 5 minutes before smoking (HRbefore)
  - Condition 2: Measure HR for 5 minutes during smoking (HRduring)
- Hypothesize some change in HR, but unknown direction (hint: a non-directional hypothesis!).
16 Decision-making steps
- 1. Define problem: Does smoking denicotinized tobacco influence HR?
- 2. Define hypotheses with respect to sign of (HRduring − HRbefore)
  - H0: Denicotinized tobacco does not influence HR; p(Plus) = 0.50
  - HA: Denicotinized tobacco influences HR; p(Plus) ≠ 0.50
- 3. Define experiment: 32 smokers, measure HR before/during smoking.
- 4. Define statistic: P, the number of plusses observed with 32 subjects.
- 5. Define acceptable probability of Type I error: α ≤ .05.
- 6. Define value of statistic upon which your decision hinges: P = ??
18 Rejection regions
- Lower rejection region: these outcomes make H0 very difficult to believe; p(P ≤ 10) = 0.0249
- Upper rejection region: these outcomes make H0 very difficult to believe; p(P ≥ 22) = 0.0249
19 Decision-making steps
- 1. Define problem: Does smoking denicotinized tobacco influence HR?
- 2. Define hypotheses with respect to sign of (HRduring − HRbefore)
  - H0: Denicotinized tobacco does not influence HR; p(Plus) = 0.50
  - HA: Denicotinized tobacco does influence HR; p(Plus) ≠ 0.50
- 3. Define experiment: 32 smokers, measure HR before/during smoking.
- 4. Define statistic: P, the number of plusses observed with 32 subjects.
- 5. Define acceptable probability of Type I error: α ≤ .05.
- 6. Define value of statistic upon which decision hinges: Reject H0 if P ≤ 10 or P ≥ 22
- 7. Perform experiment/collect data
- 8. Compare observed statistic to critical value.
- 9. Decide
- 10. Draw conclusion using at least one complete sentence
20 Decision-making steps
- 1. Define problem: Does smoking denicotinized tobacco influence HR?
- 2. Define hypotheses with respect to sign of (HRduring − HRbefore)
  - H0: Denicotinized tobacco does not influence HR; p(Plus) = 0.50
  - HA: Denicotinized tobacco does influence HR; p(Plus) ≠ 0.50
- 3. Define experiment: 32 smokers, measure HR before/during smoking.
- 4. Define statistic: P, the number of plusses observed with 32 subjects.
- 5. Define acceptable probability of Type I error: α ≤ .05.
- 6. Define value of statistic upon which decision hinges: Reject H0 if P ≤ 10 or P ≥ 22
- 7. Perform experiment/collect data: P = 19
- 8. Compare observed statistic to critical value: Is 19 in the rejection region?
- 9. Decide: Fail to reject H0
- 10. Draw conclusion using at least one complete sentence: Based on these results, there is no evidence to support the idea that smoking denicotinized tobacco influences heart rate.
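The same non-directional decision can be reached with a p-value instead of a rejection region. This is a different route from the one on the slides, shown only as a cross-check: a minimal sketch that assumes a binomial null distribution with p(Plus) = 0.5 and doubles the smaller tail. The helper names `binom_pmf` and `two_sided_p` are hypothetical.

```python
from math import comb

def binom_pmf(k: int, n: int, p: float = 0.5) -> float:
    """Probability of exactly k plusses out of n."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def two_sided_p(observed: int, n: int) -> float:
    """Two-sided sign-test p-value under H0: p(Plus) = 0.5.

    The null distribution is symmetric, so the smaller tail is doubled.
    """
    upper = sum(binom_pmf(k, n) for k in range(observed, n + 1))
    lower = sum(binom_pmf(k, n) for k in range(0, observed + 1))
    return min(1.0, 2 * min(upper, lower))

p_value = two_sided_p(19, 32)   # 19 plusses among 32 smokers
decision = "reject H0" if p_value <= 0.05 else "fail to reject H0"
print(round(p_value, 3), decision)
```

With 19 plusses the p-value is far above .05, which agrees with the slide's decision to fail to reject H0.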
21 Hypothesis testing with means.
- So far we have covered hypotheses that test the probability of the occurrence of a particular event.
  - For example, with the sign test, we calculated difference scores and examined H0: p(Plus) ≤ 0.50 vs. HA: p(Plus) > 0.50.
- The sign test loses information: the actual value of the difference score.
  - A difference of 15 is treated the same as a difference of 1.
- Using the actual values of scores is better
  - Uses more information
  - Yields a more powerful (sensitive) statistical test.
  - Also allows us to test sample statistics (i.e., X̄) that we collect by comparing the statistic to a known population parameter (i.e., μ)
22 Consider this question
- Does cigarette abstinence decrease heart rate in experienced smokers to a level below that of a non-smoking adult?
- Can't use the sign test: that test requires that you have two scores from the same person, and a smoker cannot also be a non-smoker!
- However, if you know the population mean (μ = 75 bpm) and standard deviation (σ = 9.0) of heart rate for non-smoking adults, you could compare the heart rate of a sample of smokers who have abstained from smoking to those population values.
- The question becomes: Does the sample heart rate taken from abstaining smokers look like it was drawn from the population of non-smokers with μ = 75 and σ = 9.0, or does it look like it was drawn from a different population with a lower μ?
23 Decision-making steps
- 1. Define problem: Do smokers who have abstained from smoking have a lower heart rate than non-smokers (μ = 75, σ = 9.0)?
- 2. Define hypotheses with respect to known population mean, μ
  - H0: Abstaining does not lower HR
  - HA: Abstaining does lower HR
- 3. Define experiment: 32 smokers, measure HR after 8 hours non-smoking.
- 4. Define statistic: Mean of sample, X̄.
- 5. Define acceptable probability of Type I error: α ≤ .05.
- 6. Define value of statistic upon which decision hinges: What distribution do we use?
- 7. Perform experiment/collect data
- 8. Compare observed statistic to critical value.
- 9. Decide
- 10. Draw conclusion using at least one complete sentence
24 Big problem: need a probability distribution
- Need a probability distribution that we can use to determine our rejection region (given that H0 is true) and limit p(Type I error) ≤ .05
- Generally, need to use this distribution for all variables on all measurement scales, independent of the unit of measurement.
- Should be one that we can use to determine the probability of all possible events (like we did with the number of heads in the coin problems or number of plusses in the sign test).
- The unit normal distribution (z) may help. We know that:
  - z scores are independent of the unit of measurement
  - The z distribution can be used to determine probability IF we assume that the original variable has a normal distribution.
- Are all variables normally distributed? NO! NO! NO! NO! NO!
25 Why the normal distribution is an appropriate distribution for hypothesis testing.
- We CANNOT assume that all variables that we measure are normally distributed!
- However, there is one distribution that is (approximately) normally distributed, no matter what the shape of the parent distribution, provided your sample size (N) is large: the sampling distribution of the mean.
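A small simulation can make this claim concrete. The sketch below is illustrative only: it assumes an exponential population (strongly right-skewed) as the parent distribution, an arbitrary random seed, and 5000 replicate samples per sample size, none of which come from the course materials. It reports the skewness of the sample means, which should shrink toward 0 (the value for a symmetric, normal-looking distribution) as N grows.

```python
import random
import statistics

random.seed(1)  # arbitrary seed so the run is repeatable

def sample_means(n, replicates=5000):
    """Means of `replicates` random samples of size n from a skewed population."""
    return [statistics.fmean([random.expovariate(1.0) for _ in range(n)])
            for _ in range(replicates)]

def skewness(values):
    """Standardized third moment: 0 for a symmetric distribution."""
    center = statistics.fmean(values)
    spread = statistics.pstdev(values)
    return statistics.fmean([((v - center) / spread) ** 3 for v in values])

skew_by_n = {n: skewness(sample_means(n)) for n in (1, 5, 30)}
for n, skew in skew_by_n.items():
    print(f"N = {n:2d}: skewness of the sample means = {skew:.2f}")
```

With N = 1 the "sample means" are just the raw scores, so they keep the parent's skew; by N = 30 most of that skew should be gone.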
26 What is the sampling distribution of the mean?
- The sampling distribution of the mean gives all the values the mean of a sample of size N can take, along with the probability of getting each value if sampling is random from the null-hypothesis population.
- Imagine a population of scores: 2, 3, 4, 5, 6.
- Imagine that you sample from that population twice (with replacement)
  - Sample 1: 4
  - Sample 2: 3
- You take the two scores that you sampled and calculate their average (mean = 3.5).
- Do this for all possible samples of size N = 2: the sample means range from 2 to 6.
27 What is the sampling distribution of the mean?
- The sampling distribution of the mean gives all the values the mean of a sample of size N can take, along with the probability of getting each value if sampling is random from the null-hypothesis population.
- Regardless of the shape of the population of scores, the sampling distribution of the mean approaches a normal distribution as sample size (N) increases.
- The mean of the sampling distribution of the mean ($\mu_{\bar{X}}$) is the same as the population mean (μ).
- The standard deviation of the sampling distribution of the mean ($\sigma_{\bar{X}}$) is equal to the population standard deviation divided by the square root of the sample size ($\sigma_{\bar{X}} = \sigma/\sqrt{N}$). For this example, $1.0 = 1.41/\sqrt{2}$.
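For a population this small, the sampling distribution can be listed exhaustively rather than simulated. The sketch below assumes, as on the slides, the population 2, 3, 4, 5, 6 and samples of size N = 2 drawn with replacement, so that all 25 ordered samples are equally likely. Variable names are illustrative.

```python
from itertools import product
from math import sqrt
from statistics import fmean, pstdev

population = [2, 3, 4, 5, 6]
N = 2

# Every equally likely sample of size N drawn with replacement (5**2 = 25 samples).
means = [fmean(sample) for sample in product(population, repeat=N)]

mu = fmean(population)        # population mean
sigma = pstdev(population)    # population standard deviation

print(min(means), max(means))             # range of possible sample means
print(fmean(means), mu)                   # mean of the sampling distribution vs. mu
print(pstdev(means), sigma / sqrt(N))     # its standard deviation vs. sigma / sqrt(N)
```

Each printed pair should agree up to floating-point rounding, matching the two properties stated above.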
28 So
- If you want to use the z distribution to calculate probability, you need to be able to assume that the original population is normally distributed.
- With large Ns, the sampling distribution of the mean is normally distributed!
- In order to convert any score to a z score you need the score (X), the population mean (μ), and the population standard deviation (σ). For a sample mean, the corresponding quantities are:
  - score = X̄
  - $\mu_{\bar{X}} = \mu$
  - $\sigma_{\bar{X}} = \sigma/\sqrt{N}$
- Now back to our problem . . .
29 Decision-making steps
- 1. Define problem: Do smokers who have abstained from smoking have a lower heart rate than non-smokers (μ = 75, σ = 9.0)?
- 2. Define hypotheses with respect to known population mean, μ
  - H0: Abstaining does not lower HR; μ(abstaining smokers) ≥ 75
  - HA: Abstaining does lower HR; μ(abstaining smokers) < 75
- 3. Define experiment: 32 smokers, measure HR after 8 hours non-smoking.
- 4. Define statistic: z
- 5. Define acceptable probability of Type I error: α ≤ .05.
- 6. Define value of statistic upon which decision hinges
30 Decision-making steps
- 1. Define problem: Do smokers who have abstained from smoking have a lower heart rate than non-smokers (μ = 75, σ = 9.0)?
- 2. Define hypotheses with respect to known population mean, μ
  - H0: Abstaining does not lower HR; μ(abstaining smokers) ≥ 75
  - HA: Abstaining does lower HR; μ(abstaining smokers) < 75
- 3. Define experiment: 32 smokers, measure HR after 8 hours non-smoking.
- 4. Define statistic: z
- 5. Define acceptable probability of Type I error: α ≤ .05.
- 6. Define value of statistic upon which decision hinges
- 7. Perform experiment/collect data: X̄_obt = 71.04
- 8. Compare observed statistic to critical value. Need to convert X̄_obt to z_obt
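The conversion in step 8 can be sketched as follows, using the values given on the slides (μ = 75, σ = 9.0, N = 32, obtained mean 71.04). The critical value is an assumption on my part: the slides leave step 6 blank, so the sketch takes the lower-tail cutoff of the unit normal distribution for a directional test at α = .05.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 75.0, 9.0      # non-smoker population values from the slides
n, x_bar = 32, 71.04       # sample size and obtained sample mean
alpha = 0.05

standard_error = sigma / sqrt(n)          # SD of the sampling distribution of the mean
z_obt = (x_bar - mu) / standard_error
z_crit = NormalDist().inv_cdf(alpha)      # assumed: lower-tail cutoff for a directional test

print(round(z_obt, 2), round(z_crit, 3))
print("reject H0" if z_obt <= z_crit else "fail to reject H0")
```

Because HA predicts a lower mean, only the lower tail is used; a non-directional test would split α across both tails instead.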
31 Does smoking during pregnancy alter birth weight?
Most folks know what fetal alcohol syndrome is: a constellation of adverse events due to alcohol abuse during pregnancy. In 1985 Nieburg and colleagues coined the phrase "Fetal Tobacco Syndrome" to describe the adverse events that can occur when mothers smoke during pregnancy. One potential adverse event caused by tobacco smoking during pregnancy is an abnormal birth weight.
Is smoking during pregnancy associated with abnormal birth weight? The average (μ) birth weight of a baby born to a non-smoking mother is 3,300 grams (σ = 650). You sample the birth weight of 438 children born to mothers who smoked, on average, 14 cigarettes per day during their pregnancy. The sample mean weight (X̄) was 3,186 grams for the children born to these mothers who smoked.
32 Decision-making steps
- 1. Define problem: Is smoking during pregnancy associated with birth weight that differs from the population average (μ = 3,300, σ = 650)?
- 2. Define hypotheses with respect to known population mean, μ
  - H0:
  - HA:
- 3. Define experiment: Measure weight of 438 children born to smokers.
- 4. Define statistic: z
- 5. Define acceptable probability of Type I error: α ≤ .05.
- 6. Define value of statistic upon which decision hinges: z_crit
33 Decision-making steps
- 1. Define problem: Is smoking during pregnancy associated with birth weight that differs from the population average (μ = 3,300, σ = 650)?
- 2. Define hypotheses with respect to known population mean, μ
  - H0: Smoking is not associated w/ abnormal birth weight; μ(smoke) = 3,300
  - HA: Smoking is associated w/ abnormal birth weight; μ(smoke) ≠ 3,300
- 3. Define experiment: Measure weight of 438 children born to smokers.
- 4. Define statistic: z
- 5. Define acceptable probability of Type I error: α ≤ .05.
- 6. Define value of statistic upon which decision hinges: z_crit = ±1.96 (reject H0 if |z_obt| ≥ 1.96)
- 7. Perform experiment/collect data: X̄_obt = 3186 grams
  - $z_{obt} = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{N}}$
34 Decision-making steps
- 1. Define problem: Is smoking during pregnancy associated with birth weight that differs from the population average (μ = 3,300, σ = 650)?
- 2. Define hypotheses with respect to known population mean, μ
  - H0: Smoking is not associated w/ abnormal birth weight; μ(smoke) = 3,300
  - HA: Smoking is associated w/ abnormal birth weight; μ(smoke) ≠ 3,300
- 3. Define experiment: Measure weight of 438 children born to smokers.
- 4. Define statistic: z
- 5. Define acceptable probability of Type I error: α ≤ .05.
- 6. Define value of statistic upon which decision hinges: z_crit = ±1.96 (reject H0 if |z_obt| ≥ 1.96)
- 7. Perform experiment/collect data: X̄_obt = 3186 grams
- 8. Compare observed statistic to critical value.
- 9. Decide
- 10. Draw conclusion using at least one complete sentence: Based on these results . . .
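Steps 7 through 9 for the birth-weight example can be worked through numerically. This is a minimal sketch using the slide values (μ = 3,300, σ = 650, N = 438, obtained mean 3,186) and a non-directional test at α = .05; the function name `one_sample_z` is illustrative, and the two-sided p-value is an extra not shown on the slides.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z(x_bar: float, mu: float, sigma: float, n: int) -> float:
    """z statistic for a sample mean when the population mu and sigma are known."""
    return (x_bar - mu) / (sigma / sqrt(n))

alpha = 0.05
z_obt = one_sample_z(x_bar=3186, mu=3300, sigma=650, n=438)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)      # upper cutoff; the lower one is -z_crit
p_two_sided = 2 * NormalDist().cdf(-abs(z_obt))   # probability of a result at least this extreme

print(round(z_obt, 2), round(z_crit, 2), round(p_two_sided, 5))
print("reject H0" if abs(z_obt) >= z_crit else "fail to reject H0")
```

The obtained z falls well beyond −1.96, so under these assumptions the sample mean is in the rejection region.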
35 What to do if σ is unknown (a common occurrence)?
- Absolutely no problem.
- Exact same logic, but estimate σ with s.
- Use the t distribution table instead of the z distribution table to account for estimation.
  - t is actually a family of distributions, one for each number of degrees of freedom; each is bell-shaped like z but has heavier tails.
  - As N increases, t becomes more like z.
  - Need degrees of freedom (in this case, N − 1) so that you use the correct t distribution
  - Use degrees of freedom to find t_crit in the table.
36 Do humans follow an internal, 24-hour clock?
Biological theories often emphasize that humans have adapted to their physical environment. One such theory hypothesizes that people should spontaneously follow a 24-hour cycle of sleeping and waking even if they are not exposed to the usual pattern of sunlight. To test this notion, 8 paid volunteers were placed (individually) in a room in which there was no light from the outside and no clocks or other indicators of time. They could turn the lights on and off as they wished. After a month in the room, each individual tended to develop a steady light-dark cycle. Their cycles at the end of the study were as follows (in hours): 25, 27, 25, 23, 24, 25, 26, and 25.
37 Decision-making steps
- 1. Define problem: Do humans follow a 24-hour clock?
- 2. Define hypotheses with respect to known population mean, μ
  - H0: μ
  - HA: μ
- 3. Define experiment: Measure light/dark cycle of 8 people.
- 4. Define statistic: t
- 5. Define acceptable probability of Type I error: α ≤ .05.
- 6. Define value of statistic upon which decision hinges: t_crit
38 Decision-making steps
- 1. Define problem: Do humans follow a 24-hour clock?
- 2. Define hypotheses with respect to known population mean, μ
  - H0: μ = 24
  - HA: μ ≠ 24
- 3. Define experiment: Measure light/dark cycle of 8 people.
- 4. Define statistic: t
- 5. Define acceptable probability of Type I error: α ≤ .05.
- 6. Define value of statistic upon which decision hinges: t_crit(7)
- 7. Perform experiment/collect data: X̄_obt = 25 hours, s_obt = 1.20
  - $t_{obt} = \dfrac{\bar{X} - \mu}{s/\sqrt{N}}$
39 Decision-making steps
- 1. Define problem: Do humans follow a 24-hour clock?
- 2. Define hypotheses with respect to known population mean, μ
  - H0: μ = 24
  - HA: μ ≠ 24
- 3. Define experiment: Measure light/dark cycle of 8 people.
- 4. Define statistic: t
- 5. Define acceptable probability of Type I error: α ≤ .05.
- 6. Define value of statistic upon which decision hinges: t_crit(7) = ±2.365
- 7. Perform experiment/collect data: X̄_obt = 25 hours, s_obt = 1.20; t_obt = 2.357
- 8. Compare observed statistic (t_obt) to critical value.
- 9. Decide
- 10. Draw conclusion using at least one complete sentence: Based on these results . . .
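The t statistic for the light-dark cycle data can be recomputed from the raw scores. The sketch below uses the eight cycle lengths, H0: μ = 24, and the tabled critical value 2.365 given on the slides; the helper name `t_statistic` is illustrative. It computes t twice, once with s rounded to 1.20 as on the slide and once with s at full precision, because the result sits very close to the critical value.

```python
from math import sqrt
from statistics import fmean, stdev

cycles = [25, 27, 25, 23, 24, 25, 26, 25]   # hours, from the slides
mu_0 = 24                                    # value stated by H0
t_crit = 2.365                               # two-tailed, alpha = .05, df = 7 (table value from the slides)

n = len(cycles)
df = n - 1
x_bar = fmean(cycles)
s = stdev(cycles)                            # sample SD, dividing by N - 1

def t_statistic(mean: float, sd: float) -> float:
    """One-sample t: distance of the mean from mu_0 in estimated standard errors."""
    return (mean - mu_0) / (sd / sqrt(n))

t_rounded = t_statistic(x_bar, round(s, 2))  # s rounded to 1.20, as on the slide
t_full = t_statistic(x_bar, s)               # s kept at full precision

print(df, x_bar, round(s, 4))
print(round(t_rounded, 3), round(t_full, 3), t_crit)
```

The rounded version reproduces the slide's t_obt. The unrounded version is slightly larger, so this example is sensitive to how much rounding is carried through steps 7 to 9; it is worth stating which value of s was used when writing the conclusion.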
40 What have you learned?
- Hypothesis testing with coins
  - Logic of hypothesis testing.
  - How to determine rejection regions (directional and non-directional tests)
  - Probability of Type I and Type II errors
- The sign test, for use when you have two scores from the same person and want to tell if scores collected under one condition were different from those collected under another condition: no means required!
- Hypothesis testing with means
  - Logic of using the normal distribution.
  - The one-sample z test when μ and σ are known.
  - The one-sample t test when μ is known and σ must be estimated.