Title: 6. Hypothesis Testing and the Comparison of 2 or More Populations
A) Introduction
- Estimating parameters of population → hypothesis testing on our model.
Confidence Intervals and Hypothesis Testing
- Confidence intervals → range that µ falls into.
- NOW: is µ > 0, or > 1, etc.?
- OR: is µ1 > µ2?
- Testing for specific values of µ.
- We have a confidence interval for Saskatchewan female wages by age.
- What could we test here?
- We will have a confidence interval for Bachelor's salaries in Saskatchewan.
- What could we test here?
- Others: gasoline prices? Stock market fluctuations?
B) Developing Null and Alternative Hypotheses
- Start with a testable hypothesis.
- Point of interest: do older women get paid more?
- Economic theory: is 0 < MPC < 1 and constant?
- Define its opposite.
- Older women's salaries are ≤ average.
- MPC is > 1.
- One is the Null hypothesis, the other is the Alternative hypothesis.
- Use sample data to test the Null hypothesis.
- What if it is not that simple to have 2 opposites?
Which is the Null?
- General rule: the hypothesis with the = sign, or the ≤ or the ≥ sign, is the Null.
- OR: the Null is something we assume is true unless contradicted by the sample.
1. Research hypotheses
- Testing an exception to the general rule, so it goes in the alternative.
- E.g., testing if older women's salaries (µ) > average.
- H0: µ ≤ µ(average)
- HA: µ > µ(average)
- Results will tell us either:
- If testing shows H0 cannot be rejected ("accepted") → implies that older women's salaries are not higher, but we cannot be sure.
- If testing shows H0 can be rejected → we can infer HA is true, µ > µ(average).
2. Testing the Validity of a Claim
- Assume claim is true until disproven.
- E.g., manufacturer's claim of weight/container.
- H0: µ ≥ 100 grams.
- HA: µ < 100 grams.
- Results will tell us either:
- If testing shows H0 cannot be rejected ("accepted") → manufacturer's claim not challenged.
- If testing shows H0 can be rejected → we can infer manufacturer is lying.
3. Testing in Decision-Making
- Here, if either too high or too low, need to do something.
- E.g., is class length 75 minutes?
- H0: µ = 75 minutes.
- HA: µ ≠ 75 minutes.
- If H0 not rejected ("accepted"), no change in behaviour.
- If H0 rejected → change behaviour.
C) Type I and Type II Errors
- Sample data → could have errors.
Type I and Type II Errors

                     Population Condition
Conclusion      H0 True              H0 False
Accept H0       Correct Decision     Type II Error
Reject H0       Type I Error         Correct Decision
False Positive or False Negative?
"U.K. police defend shoot-to-kill after mistake"
"Blair said Menezes had emerged from an apartment block in south London that had been under surveillance in connection with Thursday's attacks, and refused police orders to halt. Menezes had also been wearing an unseasonably heavy coat, further raising police suspicions." (MSNBC, July 24th.)
Other Type I and Type II Errors
- Sampling songs.
- Health tests.
- Pregnancy tests.
- Jury decisions.
Level of Significance
- Hypothesis testing is really designed to control the chance of a Type I error.
- Probability of Type I error = the level of significance.
- Selecting α (level of significance) → select probability of Type I error.
- What is the level of significance for jury trials?
- We do not control for Type II errors
- → except by our language of stating "do not reject H0".
Level of Significance cont'd
- α is picked by the researcher → normally 5%.
- α = 5% → when H0 is true, a Type I error happens only 5% of the time.
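The meaning of the significance level can be checked with a small simulation. This is only a sketch under assumed values: the population (mean 0, standard deviation 1), the sample size of 30, the number of trials and the seed are all illustrative choices, not figures from the lecture. Samples are drawn with the null exactly true, and the share of them that an upper-tail z-test rejects should land near 5%.

```python
import random
from statistics import NormalDist, fmean

def type_one_error_rate(alpha=0.05, n=30, trials=4000, seed=1):
    """Share of samples drawn with H0 true that an upper-tail z-test rejects."""
    rng = random.Random(seed)
    mu0, sigma = 0.0, 1.0                      # illustrative population; H0: mu <= mu0 holds
    z_crit = NormalDist().inv_cdf(1 - alpha)   # upper-tail critical value
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / (sigma / n ** 0.5)
        if z > z_crit:                         # rejecting a true null = Type I error
            rejections += 1
    return rejections / trials

rate = type_one_error_rate()
print(f"Share of true nulls rejected: {rate:.3f}")
```

The printed share varies with the seed and the number of trials, but it should stay close to the chosen α.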
D) The probability value (p-value) approach
- Develop null and alternative hypotheses.
- Select level of significance α.
- Collect data, calculate sample mean and test statistic.
- Use test statistic to calculate p-value.
- Compare: reject H0 if p-value < α.
- If H0 is rejected, the sample implies that the alternative (your research hypothesis) is true.
Hypothesis Testing: The Critical Value Approach
- Develop null and alternative hypotheses.
- Select level of significance α.
- Collect data, calculate sample mean and test statistic.
- Use α to determine critical value and rejection rule.
- Compare: if test statistic > critical value, reject H0.
- If H0 is rejected, the sample implies that the alternative (your research hypothesis) is true.
Hypothesis Testing cont'd
- This is essentially inverting our confidence interval.
- → Is x̄ more than 2 standard errors away from some benchmark?
E) Population Mean, σ Known, One-Tailed Test
- Same hypothesis → p-value method and critical value method.
- Example: a new employment program initiative has been introduced to reduce time spent being unemployed.
- Goal: 12 weeks or less unemployed.
- Population standard deviation believed to be 3.2 weeks.
- Sample of 40 unemployed workers, average time unemployed 13.25 weeks.
- Assuming a level of significance (α) of .05, is the program goal being met?
First (Common) Steps
- Hypothesis: H0: µ ≤ 12, HA: µ > 12.
- Clearly 13.25 > 12.
- This casts doubt on our program goal (the null), and whether we should continue it.
- Key: is it enough more, given sample size and standard deviation, or is it just a (small) random fluctuation?
Computing the Test Statistic
- Under our assumptions, use the standard normal.
- Use sample mean to calculate test statistic: z = (x̄ − µ0) / (σ/√n) = (13.25 − 12) / (3.2/√40) = 2.47.
- Is this z big enough to reject the null hypothesis?
- Next, go our two routes: calculate p-value OR z-critical.
Calculating the p-value
- Given the z-value, what is the corresponding probability?
- This is the probability of getting a sample mean of 13.25 or more purely by chance when the true mean is 12.
Calculating the p-value cont'd
- Find 2.47 on the Standard Normal Distribution tables.
Calculating the p-value cont'd
- .4932 is the probability of being between 0 and z = 2.47.
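Should We Reject the Hypothesis?
- The upper-tail area is 0.5 − .4932 = .0068, so the probability of getting a sample mean of 13.25 or more when the true mean is 12 is .0068, or less than 1 percent.
- Our significance level was 5%, so we reject the null.
- If the program were meeting its goal, a sample like this would turn up less than 1% of the time, so we conclude that the program has failed.
Rejection of the Null
[Figure: standard normal curve; the test value z = 2.47 lies beyond z.05, with an upper-tail area of 0.0068.]
- Sometimes we say "significant at the 0.68% level".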
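The p-value calculation for the employment program example can be sketched in a few lines. The function name `upper_tail_z_test` is a hypothetical helper introduced here; the inputs are the ones given in the example. Reading the table value .4932 for z = 2.47 gives an upper-tail area of 0.5 − .4932 = .0068, and the code should land close to that figure.

```python
from math import sqrt
from statistics import NormalDist

def upper_tail_z_test(sample_mean, mu0, sigma, n):
    """Return (z, p_value) for H0: mu <= mu0 against HA: mu > mu0, sigma known."""
    standard_error = sigma / sqrt(n)
    z = (sample_mean - mu0) / standard_error
    p_value = 1 - NormalDist().cdf(z)   # area to the right of z
    return z, p_value

# Employment program example: goal 12 weeks, sigma 3.2, n = 40, sample mean 13.25
z, p_value = upper_tail_z_test(sample_mean=13.25, mu0=12, sigma=3.2, n=40)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < 0.05 else "do not reject H0")
```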
Critical Value Approach
- This is an alternative you often see in textbooks or articles.
- Find the value of z.05, and compare it to the test value of z (2.47).
- From the tables, z.05 = 1.645.
- Because 2.47 > 1.645, reject H0.
Rejection
[Figure: standard normal curve with the critical value z.05 = 1.645 and the test value z = 2.47 marked beyond it.]
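The critical value read from the tables can also be obtained from the inverse of the standard normal distribution. A minimal sketch, using the test value 2.47 from the example and the standard library's `NormalDist` as a stand-in for the printed table:

```python
from statistics import NormalDist

alpha = 0.05
z_test = 2.47                                  # test statistic from the example
z_critical = NormalDist().inv_cdf(1 - alpha)   # value with 5% of the area to its right
reject = z_test > z_critical
print(f"z critical = {z_critical:.3f}, reject H0: {reject}")
```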
Excel
- Let's do this example in Excel.
- Look at Appendix 9.2 in text, especially Figure 9.8.
F) Population Mean, σ Known, Two-Tailed Test
- Null: µ = µ0 → Alternative is µ ≠ µ0.
- Must examine two areas of the distribution.
- Example:
- Price/earnings ratios for stocks.
- Theory: stable rate of P/E in market = 13.
- If P/E (market) < 13, you should invest in the stock market.
- If P/E (market) > 13, you should take your money out.
Estimate Steps
- Can we estimate if the population P/E is 13 or not?
- Common steps:
- Set hypothesis: H0: µ = 13, HA: µ ≠ 13.
- Select α = .05.
- Calculate standard error.
- Calculate z-value.
Calculating Test Statistic
- We have a sample of 50.
- x̄ = 12.1.
- Historical σ = 3.0456.
- z = (12.1 − 13) / (3.0456/√50) = −2.09.
p-value Approach
- Calculate the p-value.
- We will calculate for the lower tail → then make an adjustment for the upper tail.
p-value, Two-Tailed Test
[Figure: standard normal curve with both tails marked: p(z < −2.09) = ?? and p(z > 2.09) = ??.]
We can just calculate one value, and double it.
Calculating the p-value cont'd
- Find 2.09 on the Standard Normal Distribution tables.
p-value, Two-Tailed Test
[Figure: standard normal curve; the area between 0 and z = 2.09 is 0.4817, so p(z > 2.09) = 0.5 − 0.4817 = 0.0183, and by symmetry p(z < −2.09) is the same.]
Doubling the value, we find the p-value = 0.0366.
Should We Reject the Null Hypothesis?
- Yes!
- p-value = 0.0366 < α = 0.05.
- If the true P/E were the stable rate of 13, there would be only a 3.66% chance of random sampling giving a sample mean as far from 13 as 12.1.
Critical Value Approach
- Reject the null hypothesis if test z-value > critical value or if test z-value < −critical value.
- Two-tailed test: α = 0.05 → need critical value for α/2 = 0.025.
- The tables tell us that this is 1.96.
- Because −2.09 < −1.96, reject H0.
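Both routes for the two-tailed P/E example can be sketched together. The helper name `two_tailed_z_test` is hypothetical; the inputs (sample of 50, sample mean 12.1, historical σ of 3.0456, benchmark 13) are those given in the example.

```python
from math import sqrt
from statistics import NormalDist

def two_tailed_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Return (z, p_value, z_critical) for H0: mu = mu0 against HA: mu != mu0."""
    std_normal = NormalDist()
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    p_value = 2 * (1 - std_normal.cdf(abs(z)))       # one tail, then doubled
    z_critical = std_normal.inv_cdf(1 - alpha / 2)   # alpha split across both tails
    return z, p_value, z_critical

z, p_value, z_critical = two_tailed_z_test(12.1, 13, 3.0456, 50)
print(f"z = {z:.2f}, p-value = {p_value:.4f}, critical value = ±{z_critical:.2f}")
print("reject H0" if abs(z) > z_critical else "do not reject H0")
```

Taking the absolute value of z before looking up the tail area is what lets one calculation serve for both tails.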
G) Population Mean, σ Unknown
- σ unknown → must estimate it with our sample too → use t-distribution, n − 1 degrees of freedom.
One-tailed Test, p-value Approach
- Steps:
1. Set up hypothesis.
2. Decide on level of significance (α).
3. Collect data, calculate sample mean and test statistic.
4. Use test statistic and t-table/Excel to calculate p-value.
5. Compare: reject H0 if p-value < α.
Example: Highway Patrol
The RCMP periodically samples vehicle speeds at various locations on a particular roadway. The sample of vehicle speeds is used to test the hypothesis
H0: µ ≤ 65
The locations where H0 is rejected are deemed the best locations for radar traps.
Example: Highway Patrol
Outside Lumsden: a sample of 64 vehicles → average speed 65.5 mph → standard deviation 4.2 mph. Use α = .05 to test the hypothesis.
Common to Both Approaches
1. Determine the hypotheses.
H0: µ ≤ 65, Ha: µ > 65
2. Specify the level of significance.
α = .05
3. Compute the value of the test statistic.
t = (x̄ − µ0) / (s/√n) = (65.5 − 65) / (4.2/√64) = 0.9524
4. Estimate the p-value From t-Distribution Table
- Must interpolate the value of t = 0.9524, df = 63.
p-Value Approach
5. Determine whether to reject H0.
Because p-value > α = .05, we do NOT reject H0.
At the 5% significance level, the sample does not give enough evidence to conclude that the mean speed of vehicles outside Lumsden is greater than 65 mph.
Critical Value Approach
4. Determine the critical value and rejection rule.
For α = .05 and d.f. = 64 − 1 = 63, t.05 = 1.669.
Reject H0 if t > 1.669.
5. Determine whether to reject H0.
Because 0.9524 < 1.669, we do NOT reject H0.
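The test statistic for the Lumsden sample is quick to reproduce. In this sketch the helper name `one_sample_t_statistic` is hypothetical, and the critical value 1.669 is taken from the table as quoted in the lecture rather than computed, since the Python standard library has no t-distribution.

```python
from math import sqrt

def one_sample_t_statistic(sample_mean, mu0, s, n):
    """t statistic for a population mean when sigma is estimated by s."""
    return (sample_mean - mu0) / (s / sqrt(n))

# Lumsden sample: 64 vehicles, mean 65.5 mph, standard deviation 4.2 mph
t = one_sample_t_statistic(sample_mean=65.5, mu0=65, s=4.2, n=64)
t_critical = 1.669   # t.05 with 63 d.f., read from the table (not computed here)
reject = t > t_critical
print(f"t = {t:.4f}, reject H0: {reject}")
```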
H) Introduction: Comparing Population Differences
- Do men get paid more than women?
- 46,452 for men vs. 35,122 for women (bachelor's).
- Do more 100-level Economics courses help you in Econ 201?
- Has the crime rate risen?
- Are there more hurricanes recently?
Key point: the role of standard deviation
Comparing 2 Populations
- True population means: µ1 and µ2.
- Random sample of n1 → x̄1.
- Versus random sample of n2 → x̄2.
- Transform into problem: is µ1 − µ2 = 0?
- Assuming σ1 and σ2 known → use z-test.
- If unknown → estimate σ's from sample s's, and use t-test.
I) Confidence Intervals, 2 Means: σ's Unknown
- How important is an extra introductory course in determining your grade in Economics 201?
- Data:
- Natural experiment.
- 59 students.
- 43 had only one 10x course.
- 16 had two 10x courses.
- Final exam grades:
- One 10x: average = 61.69, s1 = 22.65, n1 = 43.
- Two 10x: average = 75.11, s2 = 12.80, n2 = 16.
Confidence Interval Estimation
- Point estimator: x̄1 − x̄2.
- Standard error of x̄1 − x̄2 is √(s1²/n1 + s2²/n2).
Confidence Interval cont'd
- Confidence interval of difference in means: x̄1 − x̄2 ± Margin of Error.
- Typically use α = 0.05.
- Margin of error: t(α/2) × √(s1²/n1 + s2²/n2).
- In this example df 47.36 ? round down to 47.
- 95 confidence interval ? t0.025.
- For 47 degrees of freedom, table says 2.012.
Confidence Interval cont'd
- We are 95% confident that students with only one 10x course scored between 3.775 and 22.725 lower than students with two 10x courses.
- Next step would be: why, how?
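The standard error, degrees of freedom and margin of error for this example can be reproduced as below. It is a sketch under stated assumptions: the degrees-of-freedom expression is the usual unequal-variance (Welch-Satterthwaite) formula, which is assumed here because it returns the 47.36 quoted in the lecture, and the value 2.012 is read from the t-table rather than computed. With the sample means exactly as listed, the point estimate is 61.69 − 75.11 = −13.42, so the endpoints printed here come out close to, but not identical with, the 3.775 and 22.725 quoted.

```python
from math import sqrt

def welch_standard_error_and_df(s1, n1, s2, n2):
    """Standard error of (mean1 - mean2) and approximate degrees of freedom."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    standard_error = sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return standard_error, df

mean1, s1, n1 = 61.69, 22.65, 43   # one 10x course
mean2, s2, n2 = 75.11, 12.80, 16   # two 10x courses

standard_error, df = welch_standard_error_and_df(s1, n1, s2, n2)
t_critical = 2.012                 # t0.025 for 47 d.f., read from the table
margin = t_critical * standard_error
difference = mean1 - mean2
lower, upper = difference - margin, difference + margin
print(f"df = {df:.2f}, margin of error = {margin:.3f}")
print(f"95% interval for mean1 - mean2: ({lower:.3f}, {upper:.3f})")
```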
J) Hypothesis Tests, 2 Means: σ's Unknown
- Two datasets → is the mean value of one larger than the other?
- Is it larger by a specific amount?
- µ1 vs. µ2 → µ1 − µ2 vs. D0.
- Often set D0 = 0 → is µ1 = µ2?
Example: Female vs. Male Salaries
- Saskatchewan 2001 Census data: only Bachelor's degrees, aged 21-64, work full-time, not in school.
- Men: x̄M = 46,452.48, sM = 36,260.1, nM = 557.
- Women: x̄W = 35,121.94, sW = 20,571.3, nW = 534.
- x̄M − x̄W = 11,330.54 → our point estimate.
- Is this an artifact of the sample, or do men make significantly more than women?
Hypothesis, Significance Level, Test Statistic
- We will now ONLY use the p-value approach, and NOT the critical value approach.
- Research hypothesis: men get paid more.
- H0: µM − µW ≤ 0, H1: µM − µW > 0.
- Select α = 0.05.
- Compute test t-statistic: t = (x̄M − x̄W − 0) / √(sM²/nM + sW²/nW).
4. a. Compute the Degrees of Freedom
- Can compute by hand, or get from Excel.
4. b. Computing the p-value
The p-value <<< 0.005.
5. Check the Hypothesis
- Since the p-value is <<< 0.05, we reject H0.
- We conclude that we can accept the alternative hypothesis that men get paid more than women at a very high level of confidence (greater than 99%).
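The salary comparison can be worked through from the summary figures. This is a sketch, not the Excel output used in the lecture: `two_sample_t` is a hypothetical helper, the degrees of freedom use the unequal-variance formula, and the p-value comes from a normal approximation to the t-distribution, an assumption that is only reasonable here because the degrees of freedom run into the hundreds.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_t(mean1, s1, n1, mean2, s2, n2, d0=0.0):
    """t statistic and approximate d.f. for H0: mu1 - mu2 <= d0, sigmas unknown."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    t = (mean1 - mean2 - d0) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Men first, women second, figures as given in the example
t, df = two_sample_t(46452.48, 36260.1, 557, 35121.94, 20571.3, 534)
approx_p_value = 1 - NormalDist().cdf(t)   # normal approximation, large d.f. only
print(f"t = {t:.2f}, df = {df:.0f}, approximate p-value = {approx_p_value:.2e}")
print("reject H0" if approx_p_value < 0.05 else "do not reject H0")
```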
Excel
Summary
- Hypothesis tests on comparing two populations.
- Convert to a comparison of the difference to a standard.
- More complex standard deviation and degrees of freedom.
- Same methodology as other hypothesis tests.
K) Statistical vs. Practical Significance
- Our tests: statistical significance.
- Real-world interest: practical significance.
- Men vs. women: the difference is statistically significant AND practically significant.
- 46,452.48 vs. 35,121.94.
- Saskatchewan, full-time, Bachelor's: women make only 75.6% of what men make, same average education level.
Source: Leader-Post, Oct. 31, 2008
L) Matched Samples
- Controlled experiment → match individuals in each group.
- Matched samples → each individual tries each method in turn.
- Variation between samples not a problem.
- Focus on difference data.
- Independent samples → the norm in economics.
- Regression analysis.
M) Introduction to ANOVA
- What if we want to compare 3 or more sample means (treatment means)?
- Example: total income, Saskatchewan females employed full-time and full-year, by age, 2003 (Source: see Oct. 8th lectures).
ANOVA's Hypotheses
N) Steps of ANOVA
1. Set up the Hypothesis Statements:
H0: µ1 = µ2 = µ3 = µ4 = … = µk
HA: Not all population means are equal
2. Collect your sample data:
Means: x̄1, x̄2, x̄3, x̄4, …, x̄k
Variances: s1², s2², s3², s4², …, sk²
Sample sizes: n1, n2, n3, n4, …, nk
3. Select level of significance → α = 0.05.
Steps of ANOVA Continued
4. Calculate the overall average.
5. Create our two estimates of σ².
Step 5 a) Estimating σ² via SSTR
- Between-treatments estimate of σ², or sum of squares due to treatments (SSTR).
- This compares each sample mean x̄j to the overall average, and constructs an estimate of σ² based on the assumption the Null Hypothesis is true.
Step 5 b) Estimating σ² via SSE
- Within-treatments estimate of σ², or the sum of squares due to error (SSE).
- This takes the weighted average of the sample variances sj² as an estimate of σ², and is a good estimate regardless of whether the Null is true.
Step 6 Testing The Null
- If Null true, both estimates should be similar, and the ratio of the between-treatments estimate to the within-treatments estimate ≈ 1.
- If ratio >>> 1 → reject the Null, accept the Alternative that there are multiple population distributions.
Steps of ANOVA
- Set up the Hypothesis Statement. (Null: all means are equal.)
- Collect the sample data.
- Select level of significance → α = 0.05.
- Calculate the overall average.
- a) Estimate σ² via sum of squares due to treatments (SSTR). b) Estimate σ² via sum of squares due to error (SSE).
- If Null true, both estimates should be similar, and the ratio of the between-treatments estimate to the within-treatments estimate ≈ 1.
MSTR and MSE
- MSTR = sum of squares due to treatments / numerator degrees of freedom
- = sum of squares due to treatments / (no. of treatments − 1)
- = SSTR / (k − 1)
- MSE = sum of squares due to error / denominator degrees of freedom
- = sum of squares due to error / (total no. of obs. − no. of treatments)
- = SSE / (nT − k)
F-test
- F-statistic = MSTR / MSE, with k − 1 numerator degrees of freedom (df1) and nT − k denominator degrees of freedom (df2).
- If H0 is true, MSTR ≈ MSE → F-statistic ≈ 1.
- Reject H0 if the p-value is < level of significance (α), or equivalently if the F-statistic is higher than the critical value from the table/Excel.
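The ANOVA steps can be collected into one short function that works from the group means, variances and sample sizes. This is a sketch only: the function name `one_way_anova_f` is hypothetical, and the three-treatment data in the usage lines are made-up illustrative numbers, not the Saskatchewan wage data.

```python
def one_way_anova_f(means, variances, sizes):
    """Return (F, df1, df2) for a one-way ANOVA from per-treatment summary data."""
    k = len(means)
    n_total = sum(sizes)
    # Overall average: sample means weighted by their sample sizes
    grand_mean = sum(n * m for n, m in zip(sizes, means)) / n_total
    # Between-treatments sum of squares (SSTR)
    sstr = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
    # Within-treatments sum of squares (SSE)
    sse = sum((n - 1) * v for n, v in zip(sizes, variances))
    mstr = sstr / (k - 1)
    mse = sse / (n_total - k)
    return mstr / mse, k - 1, n_total - k

# Hypothetical summary data for three treatments (illustration only)
f_stat, df1, df2 = one_way_anova_f(
    means=[1.0, 2.0, 3.0], variances=[1.0, 1.0, 1.0], sizes=[5, 5, 5]
)
print(f"F = {f_stat:.2f} with df1 = {df1}, df2 = {df2}")
```

The resulting F value would then be compared with the critical value for (df1, df2) from the table, or converted to a p-value in Excel.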
F-Distribution
[Figure: F-distribution curve starting at 0, with the F test value marked on the horizontal axis.]
O) Saskatchewan Female Wages Example
- Example: total income, Saskatchewan females employed full-time and full-year, by age, 2003 (Source: see Oct. 8th lectures).
Calculating the MSTR, MSE
Calculating F-Stat and p-value
- F test value = MSTR / MSE = 1148.5 / 443.8 = 2.59.
F-Table for df1 = 3 and df2 = 176
[Table annotation: at 176 degrees of freedom, F = 2.59 falls in this range.]
Clearly the p-value > 0.05 → do not reject the Null of one distribution.
Excel F-test formula
- FDIST(F-value, df1, df2) → yields value of .0544.
[Figure: F-distribution curve starting at 0, with F = 2.59 marked.]
P) Econometrics for Dummies
- Instead of ANOVA, economists tend to use regression analysis with dummy variables.
- Gives us the direction and size of the differences in mean values.
- But ANOVA is a useful first step.