Title: Chapter 8: Inferences Based on a Single Sample: Tests of Hypotheses
Statistics
- Chapter 8: Inferences Based on a Single Sample: Tests of Hypotheses
Where We've Been
- Calculated point estimators of population parameters
- Used the sampling distribution of a statistic to assess the reliability of an estimate through a confidence interval
Where We're Going
- Test a specific value of a population parameter
- Measure the reliability of the test
8.1 The Elements of a Test of Hypotheses
Confidence Interval: Where on the number line do the data point us? (No prior idea about the value of the parameter.)
[Figure: number line with unknown locations marked "µ?"]
Hypothesis Test: Do the data point us to this particular value? (We have a value in mind from the outset.)
[Figure: number line with a single location marked "µ0?"]
- Null Hypothesis (H0)
  - This will be supported unless the data provide evidence that it is false
  - The status quo
- Alternative Hypothesis (Ha)
  - This will be supported if the data provide sufficient evidence that it is true
  - The research hypothesis
- If the observed value of the test statistic is reasonably likely when H0 is true, then H0 is not rejected.
- If the observed value of the test statistic is (very) unlikely when H0 is true, then H0 is rejected.
Note: Null hypotheses are either rejected, or else there is insufficient evidence to reject them. (That is, we don't accept null hypotheses.)
- Null hypothesis (H0): A theory about the values of one or more parameters
  - Ex.: H0: µ = µ0 (a specified value for µ)
- Alternative hypothesis (Ha): Contradicts the null hypothesis
  - Ex.: Ha: µ ≠ µ0
- Test statistic: The sample statistic to be used to test the hypothesis
- Rejection region: The values of the test statistic that lead to rejection of the null hypothesis
- Assumptions: Clear statements about any assumptions concerning the target population
- Experiment and calculation of test statistic: The appropriate calculation for the test based on the sample data
- Conclusion: Reject the null hypothesis (with possible Type I error) or do not reject it (with possible Type II error)
- Suppose a new interpretation of the rules by soccer referees is expected to increase the number of yellow cards per game. The average number of yellow cards per game had been 4. A sample of 121 matches produced an average of 4.7 yellow cards per game, with a standard deviation of 0.5 cards. At the 5% significance level, has there been a change in infractions called?
- H0: µ = 4
- Ha: µ ≠ 4
- Sample statistic: x̄ = 4.7
- α = .05
- Assume the sampling distribution is normal.
- Test statistic: z = (x̄ − µ0) / (s/√n) = (4.7 − 4) / (0.5/√121) = 15.4
- Conclusion: z.025 = 1.96. Since |z| > z.025, reject H0.
  - (That is, there do seem to be more yellow cards.)
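
The yellow-card calculation can be checked with a short Python sketch. It assumes a two-tailed test with the sample standard deviation standing in for σ; the helper name `large_sample_z_test` is hypothetical and not part of the original slides.

```python
from math import sqrt
from statistics import NormalDist


def large_sample_z_test(xbar, mu0, s, n, alpha=0.05):
    """Two-tailed large-sample z test for a population mean (illustrative helper)."""
    z = (xbar - mu0) / (s / sqrt(n))              # standardized distance from mu0
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 critical value
    return z, z_crit, abs(z) > z_crit


# Soccer example: mean had been 4; sample of 121 matches, mean 4.7, s = 0.5
z, z_crit, reject = large_sample_z_test(xbar=4.7, mu0=4, s=0.5, n=121)
print(f"z = {z:.1f}, critical value = {z_crit:.2f}, reject H0: {reject}")
```
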
8.2 Large-Sample Test of a Hypothesis about a Population Mean
The null hypothesis is usually stated as an equality (H0: µ = µ0), even though the alternative hypothesis can be either one-sided (µ < µ0 or µ > µ0) or two-sided (µ ≠ µ0).
Rejection Regions for Common Values of α
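
The critical values behind these rejection regions can be recomputed from the standard normal distribution. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper `critical_values` and the choice of α = .10, .05, .01 as the "common" levels are assumptions made for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def critical_values(alpha):
    """Return (z_alpha, z_alpha_over_2) for a given significance level."""
    z_one_tailed = std_normal.inv_cdf(1 - alpha)      # reject if z > z_alpha (or z < -z_alpha)
    z_two_tailed = std_normal.inv_cdf(1 - alpha / 2)  # reject if |z| > z_alpha/2
    return z_one_tailed, z_two_tailed


for alpha in (0.10, 0.05, 0.01):
    z_one, z_two = critical_values(alpha)
    print(f"alpha = {alpha:.2f}: one-tailed z = {z_one:.3f}, two-tailed z = {z_two:.3f}")
```
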
Two-tailed test
- H0: µ = µ0
- Ha: µ ≠ µ0
- Test statistic: z = (x̄ − µ0) / σ_x̄ ≈ (x̄ − µ0) / (s/√n)
- Rejection region: |z| > zα/2
One-tailed test
- H0: µ = µ0
- Ha: µ < µ0 (or µ > µ0)
- Test statistic: z = (x̄ − µ0) / σ_x̄ ≈ (x̄ − µ0) / (s/√n)
- Rejection region: z < −zα (or z > zα)
Conditions: 1) A random sample is selected from the target population. 2) The sample size n is large.
- The Economics of Education Review (Vol. 21, 2002) reported a mean salary for males with postgraduate degrees of 61,340, with an estimated standard error (s_x̄) equal to 2,185. We wish to test, at the α = .05 level, H0: µ = 60,000.
- H0: µ = 60,000
- Ha: µ ≠ 60,000
- Test statistic: z = (x̄ − µ0) / s_x̄ = (61,340 − 60,000) / 2,185 = .613
- Rejection region: |z| > z.025 = 1.96
- Conclusion: Since |z| = .613 < 1.96, do not reject H0.
8.3 Observed Significance Levels: p-Values
Suppose z = 2.12. P(z > 2.12) = .0170.
Reject H0 at the α = .05 level
Do not reject H0 at the α = .01 level
But it's pretty close, isn't it?
The observed significance level, or p-value, for a test is the probability, assuming the null hypothesis is true, of observing a test statistic at least as contradictory to H0 as the one actually observed (z). The lower this probability, the stronger the evidence against H0.
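- Let's go back to the Economics of Education Review report (x̄ = 61,340, s_x̄ = 2,185). This time we'll test H0: µ = 65,000.
- H0: µ = 65,000
- Ha: µ ≠ 65,000
- Test statistic: z = (61,340 − 65,000) / 2,185 = −1.675
- One-tail probability: P(x̄ ≤ 61,340 | H0) = P(z < −1.675) ≈ .047
- Because Ha is two-sided, the p-value is twice the one-tail probability: p-value = P(|z| > 1.675) ≈ .094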
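
As a rough check on this example, the one-tail and two-tail probabilities can be computed directly from the standard normal distribution. The helper `z_p_values` is a hypothetical name; small differences from table-based values come from rounding z before the table lookup.

```python
from statistics import NormalDist


def z_p_values(xbar, mu0, std_error):
    """Return z plus lower-tail, upper-tail, and two-tailed p-values."""
    z = (xbar - mu0) / std_error
    lower = NormalDist().cdf(z)        # P(Z <= z), for Ha: mu < mu0
    upper = 1 - lower                  # P(Z >= z), for Ha: mu > mu0
    two_tailed = 2 * min(lower, upper) # for Ha: mu != mu0
    return z, lower, upper, two_tailed


# Salary example with H0: mu = 65,000
z, p_lower, p_upper, p_two = z_p_values(xbar=61_340, mu0=65_000, std_error=2_185)
print(f"z = {z:.3f}, lower-tail p = {p_lower:.4f}, two-tailed p = {p_two:.4f}")
```
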
- Reporting test results
  - Choose the maximum tolerable value of α
  - If the p-value < α, reject H0
  - If the p-value ≥ α, do not reject H0
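Some stats packages will only report two-tailed p-values.
Converting a Two-Tailed p-Value to a One-Tailed p-Value
- If Ha is of the form > and z is positive, or Ha is of the form < and z is negative: p = (reported p-value) / 2
- If Ha is of the form > and z is negative, or Ha is of the form < and z is positive: p = 1 − (reported p-value) / 2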
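
The conversion can be sketched as a small helper. It assumes the usual rule: halve the reported two-tailed p-value when the test statistic falls on the side named by Ha, and otherwise use one minus that half. The function name and the "greater"/"less" labels are illustrative choices, not taken from any particular statistics package.

```python
def one_tailed_p(two_tailed_p, statistic, alternative):
    """Convert a reported two-tailed p-value to a one-tailed p-value.

    alternative: "greater" for Ha: parameter > value, "less" for Ha: parameter < value.
    """
    if alternative not in ("greater", "less"):
        raise ValueError("alternative must be 'greater' or 'less'")
    # Does the sign of the statistic agree with the direction of Ha?
    agrees = statistic > 0 if alternative == "greater" else statistic < 0
    half = two_tailed_p / 2
    return half if agrees else 1 - half


# With z = 2.12, a package reporting a two-tailed p-value of .034:
print(one_tailed_p(0.034, 2.12, "greater"))
print(one_tailed_p(0.034, 2.12, "less"))
```
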
8.4 Small-Sample Test of a Hypothesis about a Population Mean
If the sample is small and σ is unknown, testing hypotheses about µ requires the t-distribution instead of the z-distribution.
Two-tailed test
- H0: µ = µ0
- Ha: µ ≠ µ0
- Test statistic: t = (x̄ − µ0) / (s/√n)
- Rejection region: |t| > tα/2
One-tailed test
- H0: µ = µ0
- Ha: µ < µ0 (or µ > µ0)
- Test statistic: t = (x̄ − µ0) / (s/√n)
- Rejection region: t < −tα (or t > tα)
Conditions: 1) A random sample is selected from the target population. 2) The population from which the sample is selected is approximately normal. 3) The values of tα and tα/2 are based on (n − 1) degrees of freedom.
- Suppose copiers average 100,000 copies between paper jams. A salesman claims his are better, and offers to leave 5 units for testing. The average number of copies between jams is 100,987, with a standard deviation of 157. Does his claim seem believable?
- H0: µ = 100,000
- Ha: µ > 100,000
- Test statistic: t = (x̄ − µ0) / (s/√n) = (100,987 − 100,000) / (157/√5) = 14.06
- p-value: P(x̄ ≥ 100,987 | H0) = P(t > 14.06, df = 4) < .001
Reject the null hypothesis based on the very low
probability of seeing the observed results if the
null were true. So, the claim does seem plausible.
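
The copier test can be reproduced with the standard library alone. Since Python's `statistics` module has no t-distribution, this sketch approximates the upper-tail probability by integrating the t density with Simpson's rule; `t_pdf` and `t_upper_tail` are hypothetical helpers written for this illustration, and a statistics package would normally be used instead.

```python
from math import gamma, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=20_000):
    """Approximate P(T > t) for t >= 0 using Simpson's rule on [0, t]."""
    if t < 0:
        raise ValueError("this sketch only handles t >= 0")
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3  # the area to the right of 0 is 0.5


# Copier example: H0: mu = 100,000 versus Ha: mu > 100,000
xbar, mu0, s, n = 100_987, 100_000, 157, 5
t_stat = (xbar - mu0) / (s / sqrt(n))
p_value = t_upper_tail(t_stat, df=n - 1)
print(f"t = {t_stat:.2f}, df = {n - 1}, p-value below .001: {p_value < 0.001}")
```
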
8.5 Large-Sample Test of a Hypothesis about a Population Proportion
One-tailed test
- H0: p = p0
- Ha: p < p0 (or p > p0)
- Test statistic: z = (p̂ − p0) / σ_p̂
- Rejection region: z < −zα (or z > zα)
Two-tailed test
- H0: p = p0
- Ha: p ≠ p0
- Test statistic: z = (p̂ − p0) / σ_p̂
- Rejection region: |z| > zα/2
where p0 = hypothesized value of p, σ_p̂ = √(p0 q0 / n), and q0 = 1 − p0
Conditions: 1) A random sample is selected from a binomial population. 2) The sample size n is large (i.e., np0 and nq0 are both ≥ 15).
- Rope designed for use in the theatre must withstand unusual stresses. Assume a brand of 3 three-strand rope is expected to have a breaking strength of 1400 lbs. A vendor receives a shipment of rope and needs to (destructively) test it.
- The vendor will reject any shipment which cannot pass a 1% defect test (that's harsh, but so is falling scenery during an aria). 1500 sections of rope are tested, with 20 pieces failing the test. At the α = .01 level, should the shipment be rejected?
- H0: p = .01
- Ha: p > .01
- Rejection region: z > z.01 = 2.326
- Test statistic: p̂ = 20/1500 = .0133, so z = (p̂ − p0) / √(p0 q0 / n) = (.0133 − .01) / √((.01)(.99)/1500) = 1.30
There is insufficient evidence to reject the null
hypothesis based on the sample results.
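
A minimal sketch of the rope calculation, assuming an upper-tailed test of H0: p = .01 with 20 failures in 1500 sections. The function name `proportion_z_test` is hypothetical. The large-sample condition holds here because np0 = 15 and nq0 = 1485.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(x, n, p0, alpha):
    """Upper-tailed large-sample z test for a population proportion."""
    q0 = 1 - p0
    p_hat = x / n                              # sample proportion
    z = (p_hat - p0) / sqrt(p0 * q0 / n)       # standard error uses p0, not p_hat
    z_crit = NormalDist().inv_cdf(1 - alpha)   # upper-tail critical value z_alpha
    return p_hat, z, z_crit, z > z_crit


p_hat, z, z_crit, reject = proportion_z_test(x=20, n=1500, p0=0.01, alpha=0.01)
print(f"p-hat = {p_hat:.4f}, z = {z:.2f}, critical value = {z_crit:.3f}, reject H0: {reject}")
```
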
8.6 Calculating Type II Error Probabilities: More about β
- To calculate P(Type II), or β:
  - 1. Calculate the value(s) of x̄ that divide the "do not reject" region from the "reject" region(s).
    - Upper-tailed test: x̄0 = µ0 + zα σ_x̄
    - Lower-tailed test: x̄0 = µ0 − zα σ_x̄
    - Two-tailed test: x̄0,L = µ0 − zα/2 σ_x̄ and x̄0,U = µ0 + zα/2 σ_x̄
  - 2. Calculate the z-value of x̄0 assuming the alternative hypothesis mean µa is the true µ: z = (x̄0 − µa) / σ_x̄
  - The probability that z falls in the "do not reject" region under this alternative is β.
- The power of a test is the probability that the test will correctly lead to the rejection of the null hypothesis for a particular value of µ in the alternative hypothesis. The power of a test is calculated as (1 − β).
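
The β and power calculation can be sketched for a two-tailed z test as follows. The function `type_ii_error_two_tailed` is a hypothetical helper; the inputs reuse the salary figures from this chapter (µ0 = 60,000, standard error 2,185, α = .05) with an assumed true mean of 62,000.

```python
from statistics import NormalDist


def type_ii_error_two_tailed(mu0, mu_a, std_error, alpha):
    """Return (beta, power) for a two-tailed z test when the true mean is mu_a."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Step 1: boundaries of the "do not reject" region on the x-bar scale
    lower = mu0 - z_crit * std_error
    upper = mu0 + z_crit * std_error
    # Step 2: probability that x-bar lands between them when mu = mu_a
    alternative = NormalDist(mu=mu_a, sigma=std_error)
    beta = alternative.cdf(upper) - alternative.cdf(lower)
    return beta, 1 - beta


beta, power = type_ii_error_two_tailed(mu0=60_000, mu_a=62_000, std_error=2_185, alpha=0.05)
print(f"beta = {beta:.3f}, power = {power:.3f}")
```

Working on the x̄ scale with `NormalDist(mu=mu_a, sigma=std_error)` is equivalent to standardizing each boundary by hand and looking up the two z-values in a table.
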
- The Economics of Education Review (Vol. 21, 2002) reported a mean salary for males with postgraduate degrees of 61,340, with an estimated standard error (s_x̄) equal to 2,185. We wish to test, at the α = .05 level, H0: µ = 60,000.
- H0: µ = 60,000
- Ha: µ ≠ 60,000
- Test statistic: z = .613
- zα/2 = z.025 = 1.96
- We did not reject this null hypothesis earlier, but what if the true mean were 62,000?
- Boundaries of the "do not reject" region: x̄0,L = 60,000 − 1.96(2,185) = 55,717.4 and x̄0,U = 60,000 + 1.96(2,185) = 64,282.6
- z-values when µa = 62,000: zL = (55,717.4 − 62,000) / 2,185 = −2.88 and zU = (64,282.6 − 62,000) / 2,185 = 1.04
- β = P(−2.88 < z < 1.04) ≈ .85
The power of this test is approximately 1 − .85 = .15
- For fixed n and α, the value of β decreases and the power increases as the distance between µ0 and µa increases.
- For fixed n, µ0 and µa, the value of β increases and the power decreases as the value of α is decreased.
- For fixed α, µ0 and µa, the value of β decreases and the power increases as n is increased.
8.7 Tests of Hypotheses about a Population Variance
The chi-square distribution is really a family of distributions, depending on the number of degrees of freedom. But the population must be normally distributed for the hypothesis tests on σ² (or σ) to be reliable!
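One-Tailed Test
- H0: σ² = σ0²
- Ha: σ² < σ0² (or σ² > σ0²)
- Test statistic: χ² = (n − 1)s² / σ0²
- Rejection region: χ² < χ²(1−α) (or χ² > χ²(α)), with (n − 1) degrees of freedom
Two-Tailed Test
- H0: σ² = σ0²
- Ha: σ² ≠ σ0²
- Test statistic: χ² = (n − 1)s² / σ0²
- Rejection region: χ² < χ²(1−α/2) or χ² > χ²(α/2), with (n − 1) degrees of freedom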
- Conditions Required for a Valid Hypothesis Test for σ²
  - 1. A random sample is selected from the target population.
  - 2. The population from which the sample is selected is approximately normal.
- Earlier, we considered the average number of copies between jams for a brand of copiers. The salesman also claims his copiers are more predictable, in that the standard deviation of copies between jams is 125. In the sample of 5 copiers, the sample standard deviation was 157. Does his claim seem believable, at the α = .10 level?
- Two-Tailed Test
  - H0: σ² = 125²; Ha: σ² ≠ 125²
  - Test statistic: χ² = (n − 1)s² / σ0² = (4)(157²) / 125² = 6.31
  - Rejection criterion: χ² < χ²(.95) = 0.711 or χ² > χ²(.05) = 9.488, with 4 degrees of freedom
Do not reject the null hypothesis.
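
The variance test for the copier data can be sketched with the standard library. With n − 1 = 4 degrees of freedom the chi-square upper-tail probability has a closed form (valid for even degrees of freedom only), so the hypothetical helper `chi2_upper_tail_even_df` avoids any external package. Doubling the smaller tail to get a two-tailed p-value is a common convention assumed here, not something stated in the slides.

```python
from math import exp, factorial


def chi2_upper_tail_even_df(x, df):
    """P(X > x) for a chi-square variable with an even number of degrees of freedom."""
    if df <= 0 or df % 2:
        raise ValueError("closed form used here requires a positive, even df")
    half = x / 2
    return exp(-half) * sum(half ** k / factorial(k) for k in range(df // 2))


# Copier example: H0: sigma = 125 versus a two-sided alternative, alpha = .10
n, s, sigma0, alpha = 5, 157, 125, 0.10
chi2_stat = (n - 1) * s ** 2 / sigma0 ** 2
upper = chi2_upper_tail_even_df(chi2_stat, df=n - 1)
p_value = 2 * min(upper, 1 - upper)  # two-tailed p-value (assumed convention)
reject = p_value < alpha
print(f"chi-square = {chi2_stat:.2f}, p-value = {p_value:.3f}, reject H0: {reject}")
```
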