Title: Hypothesis
Chapter 8
An example
- Suppose that we want to compare the crime rate in
Detroit with the crime rate in the rest of the
country.
- Is there more or less crime in Detroit than the
national average?
Null Hypothesis
- The hypothesis that we put to the test is called
the null hypothesis, symbolized H0.
- The null hypothesis usually states the situation
in which there is no difference (the difference
is null) between populations.
Alternative Hypothesis
- The alternative hypothesis, symbolized Ha, is the
opposite of the null hypothesis.
- The alternative hypothesis is also called
the research hypothesis: the hunch that the
investigator wants to test.
Example
- Suppose that a new pesticide has been developed.
The manufacturer claims that it will kill more
than 60% of the ants it comes in contact with.
100 ants are obtained and each is put in a dish.
The chemical is applied and after 1 hour, 62 of
the ants have died.
- Is the pesticide really more than 60% effective
in killing ants? The data seem to suggest so,
since 62/100 = 62% is more than 60%, but there is
variability in the data and 62% may not be
significantly greater than 60%.
- H0 and Ha are opposites of each other, and they
are statements about parameter(s).
- For the ants example, if we let p be the
proportion of ants killed, we can write
H0: p = 0.6 and Ha: p > 0.6
Types of tests
- right-tailed test
- left-tailed test
- two-tailed test
Other Examples
- H0: p = 0.2 and Ha: p < 0.2
- H0: p = 0.7 and Ha: p ≠ 0.7
- H0: µ = 3 and Ha: µ < 3
- H0: µ = -1 and Ha: µ ≠ -1
- H0: µ = 0 and Ha: µ > 0
- These all involve parameters such as µ and p.
- Also, H0 always gets some form of equality
(=, ≤, or ≥).
What are H0 and Ha?
- Claim: The mean IQ score of statistics students
is more than 110.
- Claim: The percentage of people that watch 60
Minutes each night is at most 25%.
Formulate hypotheses
- A conjecture is made that the mean starting
salary for computer science graduates is $30,000
per year. The researcher believes that it is
greater than $30,000. Formulate the null and
alternative hypotheses to evaluate the claim.
Test statistic
- A test statistic is a statistic computed from the
sample and used to test the null hypothesis. The
choice of statistic depends on the parameter
being tested.
- Determine the sampling distribution of the test
statistic under the assumption that H0 is true.
Then it is possible to determine which values of
the test statistic are extreme enough to justify
rejecting the null hypothesis.
- The rejection region consists of those values of
the test statistic that will lead to the
rejection of the null hypothesis.
p-Value
- The p-value is the probability (computed assuming
H0 is true) of observing a value of
the statistic at least as extreme as that given
by the actual observed data. The smaller the
p-value, the stronger the evidence against the
null hypothesis.
- How it is computed depends on whether the test is
right-tailed, left-tailed, or two-tailed.
- Left-tailed test: p-value = P(Z < Zobs)
- Right-tailed test: p-value = P(Z > Zobs)
- Two-tailed test: p-value = 2P(Z > |Zobs|)
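The three p-value formulas can be sketched in a few lines of Python. This is only an illustration: the function name `z_p_value` and its `tail` argument are assumptions made for this example, and the standard normal CDF comes from the standard library's `statistics.NormalDist`.

```python
from statistics import NormalDist


def z_p_value(z_obs: float, tail: str) -> float:
    """Return the p-value of an observed Z statistic.

    tail is 'left', 'right', or 'two' (hypothetical labels for this sketch).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":
        return std_normal.cdf(z_obs)                      # P(Z < Zobs)
    if tail == "right":
        return 1.0 - std_normal.cdf(z_obs)                # P(Z > Zobs)
    if tail == "two":
        return 2.0 * (1.0 - std_normal.cdf(abs(z_obs)))   # 2P(Z > |Zobs|)
    raise ValueError("tail must be 'left', 'right', or 'two'")


if __name__ == "__main__":
    for tail in ("left", "right", "two"):
        print(tail, round(z_p_value(1.5, tail), 4))
```

Taking the absolute value in the two-tailed case makes the same formula work whether Zobs is positive or negative.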
P-value for right-tailed test
p-value = P(Z > Zobs)
[Figure: two standard normal curves with Zobs marked.
For both, the p-value is the area to the right of
the test statistic (TS). In one panel the p-value is
less than α; in the other it is greater than α.]
P-value for left-tailed test
p-value = P(Z < Zobs)
[Figure: two standard normal curves with Zobs marked.
For both, the p-value is the area to the left of
the test statistic (TS).]
P-value for Two-Sided Test (Zobs > 0)
Ha: p ≠ p0
p-value = 2P(Z > |Zobs|)
[Figure: two standard normal curves with Zobs > 0 and
an area of p-value/2 shaded in each tail. For both,
half of the p-value is the area to the right of Zobs.]
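P-value for Two-Sided Test (Zobs < 0)
Ha: p ≠ p0
p-value = 2P(Z > |Zobs|)
[Figure: two standard normal curves with the test
statistic (TS) marked, Zobs < 0, and an area of
p-value/2 shaded in each tail. For both, half of the
p-value is the area to the left of Zobs.]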
Example (486)
- The researcher believes that the mean per capita
income of the county residents is greater than
$15,000. Suppose the researcher selects a random
sample of 100 county residents and finds that
their average per capita income is $16,200.
Suppose we know σ = $4,000. Compute Zobs and the p-value.
H0: µ = 15000, Ha: µ > 15000
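One way to check the arithmetic for this example is the short Python sketch below. It assumes the usual one-sample Z statistic, Zobs = (sample mean - hypothesized mean) / (σ/√n), and a right-tailed p-value; the variable names are chosen for this illustration only.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 15_000    # hypothesized mean under H0
sigma = 4_000   # known population standard deviation
n = 100         # sample size
xbar = 16_200   # observed sample mean

# Observed Z statistic: (16200 - 15000) / (4000 / 10)
z_obs = (xbar - mu0) / (sigma / sqrt(n))

# Right-tailed test (Ha: mu > 15000), so p-value = P(Z > Zobs)
p_value = 1.0 - NormalDist().cdf(z_obs)

print(f"Zobs = {z_obs:.2f}, p-value = {p_value:.5f}")
```

The observed statistic is 3 standard errors above the hypothesized mean, so the p-value is roughly 0.001.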
Level of Significance α
- The size of the rejection region, called the
level of significance (denoted by α), determines
how small the p-value should be before we reject
the null hypothesis.
- If the p-value is less than α, we have
significant evidence to reject H0 in favor of Ha.
If the p-value is larger than α, we do not have
significant evidence to reject H0, so we
fail to reject H0.
Procedure for Testing Hypothesis
- Formulate the null hypothesis H0 and the alternative Ha.
- Decide on an appropriate test statistic.
- Determine the sampling distribution of the test
statistic under the assumption that the null
hypothesis is true.
- Set α. Calculate the p-value and determine
whether it is sufficiently small to reject the
null hypothesis.
- Interpret the results.
Example 8.4 (Page 489)
- The scores on a college placement exam in
mathematics are assumed to be normally
distributed with a mean of 70 and a standard
deviation of 18. The exam is given to a random
sample of 50 high school seniors who have been
admitted to college. Their average score on the
exam was 67. Is the evidence sufficient to
suggest that the population mean score is lower
than 70?
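The testing procedure can be walked through for this example in Python. The sketch assumes H0: µ = 70 against Ha: µ < 70 and a significance level of α = 0.05; the example itself does not state α, so that value is an assumption made for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Step 1: hypotheses. H0: mu = 70, Ha: mu < 70 (left-tailed).
mu0 = 70
# Step 2: test statistic. Z, because sigma is given.
sigma, n, xbar = 18, 50, 67
# Step 3: under H0, Z follows the standard normal distribution.
z_obs = (xbar - mu0) / (sigma / sqrt(n))
# Step 4: set alpha (assumed here) and compute the left-tailed p-value.
alpha = 0.05
p_value = NormalDist().cdf(z_obs)
# Step 5: interpret.
reject_h0 = p_value < alpha

print(f"Zobs = {z_obs:.3f}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

With these numbers the p-value is about 0.12, which is larger than 0.05, so under the assumed α the sample does not give significant evidence that the mean is below 70.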
Our decision
- will be either
- Reject H0: we feel that we have
sufficient evidence for Ha.
- Fail to Reject H0: we're not conceding to accept
H0 just yet.
Errors
- It's not guaranteed that we'll make the correct
decision. The only way to be sure would be to
know the true population parameter.
4 Possibilities
                      THE TRUTH
OUR DECISION          H0 is TRUE           H0 is FALSE
Fail to Reject H0     correct (no error)   error
Reject H0             error                correct (no error)
Types of Errors
- Type I Error: Rejecting H0 when in fact
H0 is true. The probability of committing a Type
I error is denoted by α.
- Type II Error: Failing to reject H0 when in fact
H0 is false. The probability of committing a Type
II error is denoted by β.
- On the previous page, which ones do these
correspond to?
- In the court system, we test
- H0: Innocent vs. Ha: Guilty
- What would be a Type I and a Type II error
here?
- α is used in making a decision.
- β is used in comparing tests.
- We preset α before collecting the data.
- Common values are 0.01, 0.05, 0.1.
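To make α and β concrete, the sketch below computes both for a right-tailed Z-test. All numbers are assumptions chosen for illustration: H0: µ = 15000, σ = 4000, n = 100, α = 0.05, and a hypothetical true mean of 16000 under which β is evaluated.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n = 15_000, 4_000, 100   # illustrative test setup
alpha = 0.05                         # preset significance level
mu1 = 16_000                         # hypothetical true mean (assumption)

se = sigma / sqrt(n)                 # standard error of the sample mean

# Reject H0 when the sample mean exceeds this critical value.
xbar_crit = mu0 + NormalDist().inv_cdf(1.0 - alpha) * se

# Type I error rate: P(reject H0 | true mean is mu0). Equals alpha by design.
type_i = 1.0 - NormalDist(mu0, se).cdf(xbar_crit)

# Type II error rate: P(fail to reject H0 | true mean is mu1).
beta = NormalDist(mu1, se).cdf(xbar_crit)

print(f"critical mean = {xbar_crit:.1f}, alpha = {type_i:.3f}, beta = {beta:.3f}")
```

β depends on which alternative value is actually true, which is why it is useful for comparing tests and not for making the reject or fail-to-reject decision.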
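One-Sided Tests (Right and Left)
[Figure: rejection regions of area α. For Ha: p > p0 the
region is in the right tail; for Ha: p < p0 it is in the
left tail.]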
Two-Sided Test
- All tests use the same test statistic.
- For all tests, reject H0 when the test statistic
is in the rejection region.
[Figure: for Ha: p ≠ p0, the rejection region is split
between the two tails, each of area α/2.]
8.2 Testing a Population Proportion (continued)
- We wish to test
- H0: p = 0.60 vs. Ha: p > 0.60
- Suppose that the drug company investigates 200
cases and finds that the new drug is effective in
134 cases. Is this enough to reject H0 and say
that p > 0.60?
- What is the p-value?
Large-Sample Test of a Population Proportion
- Assumption: n > 30
- Test statistic: Z = (p̂ - p0) / sqrt(p0(1 - p0)/n),
where p̂ is the sample proportion
- If Ha: p < 0.60, what is the p-value?
- If Ha: p ≠ 0.60, what is the p-value?
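The p-values for the drug example under all three alternatives can be computed with the sketch below. It assumes the standard large-sample statistic Z = (p̂ - p0) / sqrt(p0(1 - p0)/n); the variable names are chosen for this illustration.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.60          # hypothesized proportion under H0
n = 200            # number of cases investigated
successes = 134    # cases in which the drug was effective

p_hat = successes / n                           # sample proportion, 0.67
z_obs = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)  # standard error uses p0

std_normal = NormalDist()
p_right = 1.0 - std_normal.cdf(z_obs)               # Ha: p > 0.60
p_left = std_normal.cdf(z_obs)                      # Ha: p < 0.60
p_two = 2.0 * (1.0 - std_normal.cdf(abs(z_obs)))    # Ha: p != 0.60

print(f"Zobs = {z_obs:.3f}")
print(f"right-tailed {p_right:.4f}, left-tailed {p_left:.4f}, two-tailed {p_two:.4f}")
```

The same observed statistic gives a small p-value for the right-tailed alternative, a large one for the left-tailed alternative, and twice the right-tail area for the two-tailed alternative.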
Equivalence of Confidence Intervals and
Two-tailed Tests
- The null hypothesis H0: θ = θ0 versus the alternative
Ha: θ ≠ θ0 is rejected at an α level of
significance if and only if the hypothesized
value θ0 falls outside a (1-α)100% confidence interval for θ.
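The equivalence can be checked numerically for a mean with known σ, where the test and the interval use the same standard error. The sketch below is an illustration under assumed numbers (sample mean 16200, σ = 4000, n = 100, α = 0.05); the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def rejects_two_sided(xbar, mu0, sigma, n, alpha):
    """Two-tailed Z-test decision: True when the p-value is below alpha."""
    z_obs = (xbar - mu0) / (sigma / sqrt(n))
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z_obs)))
    return p_value < alpha


def outside_confidence_interval(xbar, mu0, sigma, n, alpha):
    """True when mu0 lies outside the (1 - alpha)100% interval for the mean."""
    margin = NormalDist().inv_cdf(1.0 - alpha / 2.0) * sigma / sqrt(n)
    return mu0 < xbar - margin or mu0 > xbar + margin


xbar, sigma, n, alpha = 16_200, 4_000, 100, 0.05
for mu0 in (15_000, 15_500, 16_000, 16_900, 17_000):
    test = rejects_two_sided(xbar, mu0, sigma, n, alpha)
    interval = outside_confidence_interval(xbar, mu0, sigma, n, alpha)
    print(mu0, "reject" if test else "fail to reject", "| outside CI:", interval)
```

For every hypothesized value in the loop, the test rejects exactly when that value falls outside the interval.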
8.3 Testing a Population Mean µ
- The same general principles apply as they
did for tests about p.
- We need a test statistic.
- We need to know the distribution of the test
statistic.
- Compute the p-value and compare it to α.
- If σ is known or n > 30, use the Z-test. The test
statistic is Z = (x̄ - µ0) / (σ/√n), with the sample
standard deviation s in place of σ when σ is unknown.
Use the Z (standard normal) distribution
to obtain p-values and
critical values.
- If σ is unknown, use the t-test. The test statistic
is t = (x̄ - µ0) / (s/√n). Use a t distribution with n-1
degrees of freedom to obtain
p-values and critical values.
- Whether you have a test about p or µ,
you always reject H0 if p-value < α.
Example 8.8
- To justify raising its rates, an insurance
company claims that the mean medical expense for
all middle-class families is at least $700 per
year. A survey of 100 randomly selected
middle-class families found that the mean
medical expense for the year was $670 and the
standard deviation was $140. Assuming that the
tails of the distribution of medical expenses are
not unusually long, is there any evidence that the
insurance company is misinformed?
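A possible way to work this example in Python is sketched below. It assumes H0: µ = 700 against Ha: µ < 700, uses the sample standard deviation in the Z statistic because n = 100 is large, and takes α = 0.05, which the example does not specify.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 700     # claimed mean expense (boundary of H0: mu >= 700)
n = 100       # families surveyed
xbar = 670    # sample mean expense
s = 140       # sample standard deviation, used in place of sigma
alpha = 0.05  # assumed significance level

z_obs = (xbar - mu0) / (s / sqrt(n))   # (670 - 700) / 14
p_value = NormalDist().cdf(z_obs)      # left-tailed: P(Z < Zobs)
reject_h0 = p_value < alpha

print(f"Zobs = {z_obs:.3f}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

Under these assumptions the p-value is about 0.016, so at the 0.05 level the data suggest the mean expense is below the company's claimed $700.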
Example 8.4
- UM president Mary Sue Coleman claims that the
percentage of UM students who like math is larger
than that of MSU, which is given by 0.6. MSU
president Lou Anna doesn't agree. So I take a
random sample of 80 from UM and find that only 35
students like math. Test Mary's claim.
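A minimal sketch of this test in Python follows. It reads the claim as Ha: p > 0.6 against H0: p = 0.6, where p is the proportion of UM students who like math, and assumes α = 0.05 since no level is given.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.6       # MSU proportion, taken as the hypothesized value
n = 80         # UM students sampled
likes = 35     # sampled UM students who like math
alpha = 0.05   # assumed significance level

p_hat = likes / n                               # 0.4375
z_obs = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)  # large-sample Z statistic
p_value = 1.0 - NormalDist().cdf(z_obs)         # right-tailed: P(Z > Zobs)
reject_h0 = p_value < alpha

print(f"p-hat = {p_hat}, Zobs = {z_obs:.3f}, p-value = {p_value:.4f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```

Because the sample proportion is below 0.6, Zobs is negative and the right-tailed p-value is close to 1, so the sample gives no support for the claim.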
End
Example 5.4
- UM president Mary Sue Coleman claims that the
percentage of UM students who like math is larger
than that of MSU students. MSU president Lou
Anna doesn't agree. So I take a sample of 100
from MSU and 80 from UM, and find that 70 and 49
students, respectively, like math. D? (nMSU 50 nUM
60)