Title: Introduction to Formal Inference
Chapter 6
- Introduction to Formal Inference
- Part II: Hypothesis Tests for µ
5.5 Functions of Several Random Variables
- Let X1, X2, ..., Xn be iid N(5, 10) random variables (mean 5, variance 10). Then the sample mean satisfies $\bar{X} \sim N(5, 10/n)$.
- What is the probability of observing a sample mean that is between 4 and 6 in a sample of size n = 10?
5.5 Functions of Several Random Variables
- What is the probability of observing a sample mean that is between 4 and 6 in a sample of size n = 90?
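As a rough check on the two questions above, the probabilities can be computed directly from the N(5, 10/n) distribution of the sample mean. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `prob_mean_between` is made up for illustration, and the 10 in N(5, 10) is treated as a variance.

```python
from math import sqrt
from statistics import NormalDist


def prob_mean_between(lo, hi, mu, var, n):
    """Return P(lo < X-bar < hi) when X-bar ~ N(mu, var / n)."""
    # NormalDist expects a standard deviation, so convert var / n.
    xbar_dist = NormalDist(mu=mu, sigma=sqrt(var / n))
    return xbar_dist.cdf(hi) - xbar_dist.cdf(lo)


p10 = prob_mean_between(4, 6, mu=5, var=10, n=10)  # about 0.68
p90 = prob_mean_between(4, 6, mu=5, var=10, n=90)  # about 0.997
print(f"n = 10: {p10:.4f}")
print(f"n = 90: {p90:.4f}")
```

The n = 90 value differs from a table-based answer only in the last digit, because printed normal tables round each tail area.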
5.5 Functions of Several Random Variables
- If you had taken a sample of 90 people and found that their sample mean was less than 4 or greater than 6, what might you conclude?
- There was a 99.74% chance of the sample mean being between 4 and 6.
- Maybe I just got a REALLY rare sample of 90 people
- -OR-
- Maybe the population mean I started with doesn't reflect the population like I thought it did (the more likely scenario)
6.2 Significance (Hypothesis) Testing
- Motivating Example (continued)
- Recall: A simple random sample of n = 25 sections of ¼-inch wire rope yielded 15 tons as the sample average of the breaking strength. Also, we somehow know or believe that σ² = 36 (so σ = 6 tons).
- How would you answer a buyer if they ask, "Are you sure that the breaking strength isn't 10 tons?"
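6.2 Significance (Hypothesis) Testing
- Here's how we might convince them:
- Suppose the true mean breaking strength is 10 tons, i.e. µ = 10. Then the probability of obtaining an SRS with a mean of 15 or higher is $P(\bar{X} \ge 15 \mid \mu = 10) = P\left(Z \ge \frac{15 - 10}{6/\sqrt{25}}\right) = P(Z \ge 4.17) \approx 0$
- (It is actually 1 − 0.9999848 ≈ 1.52 × 10⁻⁵.)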
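The tail probability for the wire rope example can be reproduced in a few lines. This is a sketch only: it assumes σ = 6 (from σ² = 36) and uses the standard-library `statistics.NormalDist` in place of a printed normal table, so the result will differ slightly from a value based on z rounded to 4.17.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 10    # hypothesized mean breaking strength (tons)
xbar = 15   # observed sample mean (tons)
sigma = 6   # assumed known standard deviation, from sigma^2 = 36
n = 25      # sample size

# Standardize the observed mean under the assumption mu = 10.
z = (xbar - mu0) / (sigma / sqrt(n))

# Upper-tail area of the standard normal beyond z.
tail_prob = 1 - NormalDist().cdf(z)
print(f"z = {z:.2f}, P(X-bar >= 15 | mu = 10) = {tail_prob:.2e}")
```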
6.2 Significance (Hypothesis) Testing
- So we have in fact observed an event (i.e. a sample mean of 15 or higher) that is very rare if the assumption that µ = 10 is true. Therefore, this assumption is probably not plausible.
- Definition: Significance testing is the use of data in the quantitative assessment of the plausibility of a proposed value for a parameter.
6.2 Significance (Hypothesis) Testing
- Definition: The Null Hypothesis (null, H0) is a statement that the parameter equals some number (usually a number that we want to disprove, chosen based on history or experience).
- Note: "Null" refers to the fact that the hypothesis is a statement of "no difference".
- E.g. H0: µ = 0
- or H0: µ = 100 (which is equivalent to H0: µ − 100 = 0)
6.2 Significance (Hypothesis) Testing
- Definition: The Alternative Hypothesis (alternative, HA, H1) is a statement in opposition to the null hypothesis (it usually reflects what we really believe is true about the parameter compared to H0).
- Note: The form of the alternative determines whether the test is one-sided or two-sided.
- E.g. H0: µ = 0 vs HA: µ ≠ 0 is a two-sided test
- H0: µ = 100 vs HA: µ < 100 is a one-sided test
- H0: µ = 7 vs HA: µ > 7 is a one-sided test
6.2 Significance (Hypothesis) Testing
Determine the null and alternative hypotheses for each of the following.
1. According to the United States Department of Agriculture, the mean farm rent in Indiana was $89.00 per acre in 1995. A researcher for the USDA claims that the mean rent has decreased since then.
   H0: µ = 89 (i.e. µ0 = 89) vs HA: µ < 89
2. According to the United States Energy Information Administration, the mean expenditure for residential energy consumption was $1338 in 1997. An economist claims that the mean expenditure for residential energy is different today.
   H0: µ = 1338 (i.e. µ0 = 1338) vs HA: µ ≠ 1338
6.2 Significance (Hypothesis) Testing
- Definition: A test statistic is a formula that summarizes the data under the null hypothesis (i.e. assuming that H0 is true).
- E.g. for testing H0: µ = µ0 vs HA: µ ≠ µ0, we might use the sample mean, $\bar{X}$, since it is our estimate of µ, or the standardized version $Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$, where µ0 is the value of µ specified by H0.
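6.2 Significance (Hypothesis) Testing
- Definition: The Reference (or null) Distribution is the probability distribution of the test statistic under the null hypothesis.
- e.g. $\bar{X} \sim N(\mu_0, \sigma^2/n)$ for the test statistic $\bar{X}$
- $Z \sim N(0, 1)$ for the test statistic $Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$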
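6.2 Significance (Hypothesis) Testing
- Definition: A p-value is the (conditional) probability of observing a test statistic that is as extreme as or more extreme than what was actually observed, given that the null hypothesis is true.
- What "more extreme" means depends on HA.
- E.g. for testing H0: µ = µ0 vs HA: µ ≠ µ0 using $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$, the p-value is $P(|Z| \ge |z|)$, where $Z \sim N(0, 1)$ and z is the observed value of the test statistic.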
6.2 Significance (Hypothesis) Testing
- P-value illustration
- [Figure: standard normal curve with cutoffs at −z and z. The area below −z is "more extreme in the negative end"; the area above z is "more extreme in the positive end".]
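6.2 Significance (Hypothesis) Testing
- P-value calculations for three alternative hypotheses
- H0: µ = µ0 vs HA: µ ≠ µ0 (two-sided alternative)
- [Figure: normal curve with both tails shaded, below −|z| and above |z|]
- P-value = $P(|Z| > |z| \mid \mu = \mu_0)$
- = $2P(Z < -|z|)$
- = $2P(Z > |z|)$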
6.2 Significance (Hypothesis) Testing
- P-value calculations for three alternative hypotheses
- H0: µ = µ0 vs HA: µ > µ0 (one-sided alternative)
- [Figure: normal curve with the upper tail shaded, above z]
- Note: if HA: µ > µ0, then the test statistic had better be positive (i.e. we expect z > 0).
- P-value = $P(Z > z \mid \mu = \mu_0)$
6.2 Significance (Hypothesis) Testing
- P-value calculations for three alternative hypotheses
- H0: µ = µ0 vs HA: µ < µ0 (one-sided alternative)
- [Figure: normal curve with the lower tail shaded, below z]
- Note: if HA: µ < µ0, then the test statistic had better be negative (i.e. we expect z < 0).
- P-value = $P(Z < z \mid \mu = \mu_0)$
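The three p-value rules above fit in one small function. The sketch below is illustrative rather than a standard API: the function name `z_test_p_value` and the labels "two-sided", "greater" and "less" are assumptions chosen for this example, and the standard-library `statistics.NormalDist` stands in for the normal table.

```python
from statistics import NormalDist


def z_test_p_value(z, alternative):
    """P-value for an observed z statistic when Z ~ N(0, 1) under H0.

    alternative: "two-sided" (HA: mu != mu0),
                 "greater"   (HA: mu > mu0),
                 "less"      (HA: mu < mu0).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "two-sided":
        return 2 * std_normal.cdf(-abs(z))   # 2 P(Z < -|z|)
    if alternative == "greater":
        return 1 - std_normal.cdf(z)         # P(Z > z)
    if alternative == "less":
        return std_normal.cdf(z)             # P(Z < z)
    raise ValueError(f"unknown alternative: {alternative!r}")


for alt in ("two-sided", "greater", "less"):
    print(alt, round(z_test_p_value(1.5, alt), 4))
```

Taking the absolute value in the two-sided branch means the function gives the same answer whether the observed z is positive or negative.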
6.2 Significance (Hypothesis) Testing
- Example: Suppose we know that the standard deviation of the weight of a bag of M&Ms (labeled 10 oz) is 1.0 ounce. We want to test the hypothesis that the mean weight of all the M&M bags under consideration is 10 ounces. So we randomly sample 30 bags. The mean weight of the bags is 10.2 ounces.
- So we know:
- µ0 = 10 (we want to test whether this is true)
- σ = 1
- x̄ = 10.2
- n = 30
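6.2 Significance (Hypothesis) Testing
- State the null hypothesis and alternative hypothesis and calculate the observed value of the test statistic.
- H0: µ = 10 vs HA: µ ≠ 10 - or -
- H0: µ − 10 = 0 vs HA: µ − 10 ≠ 0
- $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{10.2 - 10}{1/\sqrt{30}} \approx 1.095$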
6.2 Significance (Hypothesis) Testing
- Calculate the probability of observing a value larger than the test statistic in magnitude, i.e. the p-value.
- We have a two-sided alternative, so
- P-value = P(|Z| > |z|), where Z ~ N(0,1)
- = 2P(Z < −|z|)
- = 2P(Z < −1.095) (round to 2 digits)
- = 2P(Z < −1.10)
- = 2(0.1357) = 0.2714
6.2 Significance (Hypothesis) Testing
- What would you be willing to conclude? Why?
- p-value = 0.2714
- There is a 27.14% chance of observing a value at least as extreme as the one we observed in our sample of 30 bags (if µ = 10).
- µ = 10 is fairly plausible.
6.2 Significance (Hypothesis) Testing
- Would it make a difference if the mean weight for the sample of 30 bags was 10.4 ounces instead?
- P-value = 2P(Z < −|z|)
- = 2P(Z < −2.19)
- = 2(0.0143) = 0.0286
- There is a 2.86% chance of observing a value at least as extreme as the one we observed (if µ = 10).
- µ = 10 is probably not a plausible value.
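The two M&M calculations (sample means of 10.2 and 10.4) can be checked numerically. This is a minimal sketch with a made-up helper name, `two_sided_z_test`; because it does not round z before looking up the tail area, the p-values it prints will differ in the third decimal place from the table-based 0.2714 and 0.0286.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_z_test(xbar, mu0, sigma, n):
    """Return (z, p-value) for H0: mu = mu0 vs HA: mu != mu0, sigma known."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    p_value = 2 * NormalDist().cdf(-abs(z))  # 2 P(Z < -|z|)
    return z, p_value


for xbar in (10.2, 10.4):
    z, p = two_sided_z_test(xbar, mu0=10, sigma=1.0, n=30)
    print(f"x-bar = {xbar}: z = {z:.3f}, p-value = {p:.4f}")
```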
6.2 Significance (Hypothesis) Testing
- This is the general process used for doing a hypothesis test.
- Recall from 10.1 that for large n, we can use the sample standard deviation in place of the population standard deviation (s instead of σ).
6.2 Significance (Hypothesis) Testing
- Five-Step Format for Significance Testing (NOT the same as in the book!)
- Step 1: State the null and alternative hypotheses
- Step 2: Compute the test statistic and state its distribution
- Step 3: Compute the p-value
- Step 4: Make a decision (compare the p-value to α)
- Step 5: State your conclusion
6.2 Significance (Hypothesis) Testing
- How do you make a decision?
- Researchers typically think of the p-value in terms of their level of significance (this gets hazy).
- In terms of hypothesis testing, we'll want to compare the p-value to α.
6.2 Significance (Hypothesis) Testing
- Example 2 (continued): repeat part (d), but
- Use n = 100
- p-value = 2P(Z < −4) ≈ 0
- Use a mean of 10.05 and n = 4,000
- p-value = 2P(Z < −3.16)
- = 0.0016
6.2 Significance (Hypothesis) Testing
- Note that with a very large sample, one can make any difference "significant" according to the chart. We must ask ourselves: is the difference meaningful?
- Rather than try to figure out a scale for the p-value, compare it to the level of significance, α.
- Reject H0 (and conclude µ = µ0 is not reasonable) if p-value < α.
- Fail to reject H0 (FTR, and conclude µ = µ0 is reasonable) otherwise.
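The decision rule, and the warning about very large samples, can be seen side by side in a short sketch. The function names `two_sided_p_value` and `decide` are invented for this illustration, and the default α = 0.05 is an assumption; the inputs reuse the M&M numbers (µ0 = 10, σ = 1).

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p_value(xbar, mu0, sigma, n):
    """P-value for H0: mu = mu0 vs HA: mu != mu0 with sigma known."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    return 2 * NormalDist().cdf(-abs(z))


def decide(p_value, alpha=0.05):
    """Reject H0 when the p-value is below alpha; otherwise fail to reject."""
    return "Reject H0" if p_value < alpha else "Fail to reject H0"


# Same hypothesized mean, different sample means and sample sizes.
for xbar, n in [(10.2, 30), (10.4, 100), (10.05, 4000)]:
    p = two_sided_p_value(xbar, mu0=10, sigma=1.0, n=n)
    print(f"x-bar = {xbar}, n = {n}: p-value = {p:.4f} -> {decide(p)}")
```

With n = 4,000, a difference of only 0.05 ounces is rejected at α = 0.05, which is why statistical significance alone does not say whether the difference is meaningful.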
6.2 Significance (Hypothesis) Testing
Action            | Reality: H0 True | Reality: H0 False
Reject H0         | Type I Error     | Correct decision
Fail to Reject H0 | Correct decision | Type II Error
The level of significance, α, is the probability of making a Type I error.
6.2 Significance (Hypothesis) Testing
- Definition: A Type I Error is rejecting H0 when H0 is true.
- The probability of making a Type I Error is the level of significance, typically denoted α. (The p-value is computed from the data and compared to α; it is not itself this probability.)
- We want this to be low, typically α = 0.01, 0.05 or 0.10 (similar to CIs).
- If no value is specified, use α = 0.05.
- Definition: A Type II Error is failing to reject H0 when H0 is false.
- Generally considered not as bad as a Type I Error (which is why we compare everything to α).
6.2 Significance (Hypothesis) Testing
- Criminal Justice Analogy for Understanding Type I vs. Type II Error
- In a criminal trial, we assume that the defendant is innocent until proven guilty. Therefore, we have H0: The defendant is innocent vs. HA: The defendant is guilty
- A Type I Error would occur if the defendant was innocent but found guilty.
- A Type II Error would occur if the defendant was guilty but was found not guilty.
6.2 Significance (Hypothesis) Testing
- Note that in the judicial system, we never say that a defendant was found "innocent". Instead, our system decides if they are proven guilty, and if not, we say that they are "not guilty".
- Similarly, statisticians say "fail to reject the null hypothesis" instead of saying "accept the null hypothesis".
6.2 Significance (Hypothesis) Testing
- Since we do not want an innocent defendant to be found guilty, strong evidence has to be presented to convict them. In the same way, we don't want to reject the null hypothesis unless there is strong evidence to reject it.
- However, the stronger the evidence we require for convicting a defendant, the more likely a guilty defendant will walk away after being declared not guilty. Similarly, as we lower the risk of making a Type I Error, we increase the risk of making a Type II Error.
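The trade-off described above can be made concrete for an upper-tailed z test, where H0 is rejected when z exceeds the critical value for the chosen α. The sketch below rests on assumptions that are not in the slides: a hypothetical true mean of 10.4, with µ0 = 10, σ = 1 and n = 30 borrowed from the M&M example purely for illustration, and a made-up function name `type_ii_error`.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(alpha, mu0, mu_true, sigma, n):
    """P(fail to reject H0) for H0: mu = mu0 vs HA: mu > mu0 when mu = mu_true."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)        # reject H0 when z > z_crit
    shift = (mu_true - mu0) / (sigma / sqrt(n))   # mean of z when mu = mu_true
    return std_normal.cdf(z_crit - shift)


# Hypothetical setting: the true mean is 10.4 rather than 10.
for alpha in (0.10, 0.05, 0.01):
    beta = type_ii_error(alpha, mu0=10, mu_true=10.4, sigma=1.0, n=30)
    print(f"alpha = {alpha:.2f} -> P(Type II Error) = {beta:.3f}")
```

As α shrinks, the critical value moves further into the tail, so a true difference of the same size is missed more often.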
6.2 Significance (Hypothesis) Testing
- Example (using the 5 steps): A researcher claims that the average age of a woman before she has her first child is greater than the 1990 mean age of 24.6 years, on the basis of data obtained from the National Vital Statistics Report, Vol. 48, No. 14. She obtains a simple random sample of 40 women who gave birth to their first child in 1999 and finds the sample mean age to be 27.1 years. Assume that the population standard deviation is 6.4 years. Test the researcher's claim, using the classical approach at the α = 0.05 level of significance.
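6.2 Significance (Hypothesis) Testing
- Step 1: H0: µ = 24.6 vs HA: µ > 24.6
- Step 2: $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{27.1 - 24.6}{6.4/\sqrt{40}} \approx 2.47$, where Z ~ N(0,1)
- Step 3: one-sided alternative, so
- p-value = P(Z > z) = P(Z > 2.47)
- = 1 − 0.9932 = 0.0068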
6.2 Significance (Hypothesis) Testing
- Step 4: p-value = 0.0068, α = 0.05
- p-value < α, so Reject H0
- Step 5: At the α = 0.05 level of significance, there is significant evidence to reject H0 and conclude that the average age of a woman when she has her first child has increased since 1990.
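The five steps for the first-child example translate almost line for line into code. This is a sketch under the example's stated assumptions (σ = 6.4 known, n = 40, α = 0.05); the variable names are illustrative, and the p-value will differ slightly from the table value because z is not rounded.

```python
from math import sqrt
from statistics import NormalDist

# Step 1: H0: mu = 24.6 vs HA: mu > 24.6
mu0, xbar, sigma, n, alpha = 24.6, 27.1, 6.4, 40, 0.05

# Step 2: test statistic; Z ~ N(0, 1) under H0
z = (xbar - mu0) / (sigma / sqrt(n))

# Step 3: upper-tailed p-value, P(Z > z)
p_value = 1 - NormalDist().cdf(z)

# Step 4: compare the p-value to alpha
decision = "Reject H0" if p_value < alpha else "Fail to reject H0"

# Step 5: report the result in context
print(f"z = {z:.2f}, p-value = {p_value:.4f}, alpha = {alpha}: {decision}")
```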
6.2 Significance (Hypothesis) Testing
- Example: The thickness of metal wires used in the manufacture of silicon wafers is assumed to be normally distributed with mean µ. To monitor the production process, the thickness of 40 wires is measured. The output is considered unacceptable if the mean differs from the target value of 10. The 40 measurements yield a sample mean of 10.2 and a sample standard deviation of 1.2. Conduct the appropriate statistical test and state its implications for the problem. Use an α = 0.1 level of significance.
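6.2 Significance (Hypothesis) Testing
- Step 1: H0: µ = 10 vs HA: µ ≠ 10
- Step 2: $z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{10.2 - 10}{1.2/\sqrt{40}} \approx 1.05$, where Z ~ N(0,1) (s is used in place of σ because n is large)
- Step 3: two-sided alternative, so
- p-value = P(|Z| > |z|) = 2P(Z < −1.05)
- = 2(0.1469) = 0.2938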
6.2 Significance (Hypothesis) Testing
- Step 4: p-value = 0.2938, α = 0.10
- p-value > α, so FTR H0
- Step 5: At the α = 0.10 level of significance, there is not enough evidence to reject H0 and conclude the output is unacceptable
- (i.e. the output is acceptable)
6.2 Significance (Hypothesis) Testing
- Question: Does hypothesis testing have anything in common with confidence intervals?
- Answer: YES!
- Suppose we are considering a confidence level of 95%. Then all values located within this confidence interval, $\left(\bar{x} - 1.96\,\frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\,\frac{\sigma}{\sqrt{n}}\right)$, are values that, when assumed to be the true mean value (think null hypothesis), will produce a p-value of 0.05 or greater.
6.2 Significance (Hypothesis) Testing
- In other words, H0: µ = µ0, where µ0 is some number in the interval above, implies p-value ≥ 0.05.
- So any value not in the interval is one for which the data provide enough evidence against the null hypothesis. If a certain value is in the interval, we will have little or no evidence against the null hypothesis.
6.2 Significance (Hypothesis) Testing
- Since repeated applications of forming (1−α)100% C.I.s result in the true mean being bracketed by the C.I. (1−α)100% of the time, we know that (α)100% of the time the C.I. will not capture the true mean. When the true mean is not captured by the C.I., we have evidence against H0, which will lead to rejecting the null hypothesis when the null hypothesis is in fact true (Type I Error).
- So, for a (1−α)100% confidence interval, α represents the probability of making a Type I Error.
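The "repeated applications" argument above can be illustrated with a small simulation: draw many samples from a population whose mean is known, form a z-interval from each, and count how often the interval captures the true mean. Everything in this sketch is hypothetical (µ = 10, σ = 1, n = 30, 2,000 repetitions, a fixed seed, and the function name `coverage`); it is meant only to show that the miss rate lands near α.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def coverage(mu, sigma, n, alpha, reps, seed=0):
    """Fraction of (1 - alpha) z-intervals that contain the true mean mu."""
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z_star * sigma / sqrt(n)   # sigma treated as known
    hits = 0
    for _ in range(reps):
        xbar = fmean([rng.gauss(mu, sigma) for _ in range(n)])
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / reps


rate = coverage(mu=10, sigma=1.0, n=30, alpha=0.05, reps=2000)
print(f"captured the true mean in {rate:.1%} of intervals")
print(f"missed it (a Type I Error for H0: mu = 10) in {1 - rate:.1%}")
```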
6.2 Significance (Hypothesis) Testing
- Example: Air bags were tested to determine the pressure present in the air bags 40 milliseconds after releasing the air bag. Suppose 50 bags were tested, the mean pressure is 6.5 psi and the standard deviation is 0.25 psi.
- Determine plausible (likely) values for the population mean given an 80% confidence level.
6.2 Significance (Hypothesis) Testing
- Is there clear evidence that the mean pressure for all air bags under consideration is not 6.5 psi?
- Since 6.5 is in the computed 80% confidence interval, at the α = 0.2 level of significance we would conclude it is a plausible value for the mean, so we would FTR H0: µ = 6.5.
- Is there clear evidence that the mean pressure for all air bags under consideration is not 6 psi?
- Since 6 is not in the confidence interval, at the α = 0.2 level of significance we would conclude it is not a plausible value for the mean, so we would Reject H0: µ = 6.
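The 80% interval for the air bag example, and the two plausibility checks above, can be computed as follows. This is a sketch that assumes a large-sample z-interval with the sample standard deviation 0.25 standing in for σ; the helper name `plausible` is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

xbar, s, n = 6.5, 0.25, 50   # sample mean (psi), sample sd (psi), sample size
conf_level = 0.80
alpha = 1 - conf_level

# Two-sided interval: alpha is split evenly between the two tails.
z_star = NormalDist().inv_cdf(1 - alpha / 2)
half_width = z_star * s / sqrt(n)
lower, upper = xbar - half_width, xbar + half_width
print(f"80% CI for the mean pressure: ({lower:.4f}, {upper:.4f})")


def plausible(mu0):
    """True when mu0 lies inside the interval, i.e. we would FTR H0: mu = mu0."""
    return lower <= mu0 <= upper


for mu0 in (6.5, 6.0):
    verdict = "FTR H0" if plausible(mu0) else "Reject H0"
    print(f"H0: mu = {mu0}: {verdict} at alpha = {alpha:.1f}")
```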
6.2 Significance (Hypothesis) Testing
- Going the opposite way: suppose we had performed the hypothesis test for H0: µ = 6.475 vs HA: µ ≠ 6.475 and we compute a p-value of 0.15.
- Would we conclude that 6.475 would be inside an 80% CI?
- No, because α = 0.2 > p-value = 0.15, so we would reject H0 and conclude 6.475 is not a plausible value for the mean, and thus it would not be inside an 80% CI.
- What about a 90% confidence interval?
- Yes, because α = 0.1 < p-value = 0.15, so we would FTR H0 and conclude 6.475 is a plausible value for the mean, so it would be contained in a 90% CI.
6.2 Significance (Hypothesis) Testing
- Note: you have to be careful what type of interval you are using with what type of alternative hypothesis.
- HA: µ ≠ µ0: α is split in two, so we would compare it to a two-sided confidence interval (upper and lower bounds).
- HA: µ < µ0 or HA: µ > µ0: α is NOT split in two, so we would compare it to a one-sided confidence bound (a lower bound for HA: µ > µ0, since we reject when µ0 falls below it, and an upper bound for HA: µ < µ0, since we reject when µ0 falls above it).