Title: Hypothesis Testing
1Hypothesis Testing
- Introduction to Inductive Statistics
2Background Websites
- http://www.intuitor.com/statistics/CurveApplet.htm
- http://www.intuitor.com/statistics/T1T2Errors.html
3Terms
- Descriptive Statistics: what we've done so far
- Inductive Statistics: making decisions on the
basis of statistical evidence
- Hypothesis: the relationship or proposition you
wish to test, stated to affirm the relationship
or proposition
- Null Hypothesis: the negative form of the
proposition to be tested
4Some propositions
- Statistics is making decisions on the basis of
incomplete or imperfect information.
- A hypothesis or proposition can be refuted by one
observation but not proved by many.
- We thus proceed by determining the likelihood
that the null hypothesis can be rejected.
5The Dilemma
- There is always the possibility of making the
wrong decision
- Rejecting a true null hypothesis: Type 1 Error
- Failing to reject a false null hypothesis: Type 2
Error
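One way to see what a Type 1 error rate means is a small simulation: draw many samples from a population in which the null hypothesis is true and count how often a test at the 5% level rejects it anyway. The sketch below is illustrative only. The population (normal, mean 0, standard deviation 1), the sample size, the z test with a known standard deviation, and the function name are all assumptions, not part of the slides.

```python
import random
from statistics import NormalDist

def type1_error_rate(n_trials=10000, n=30, alpha=0.05, seed=1):
    """Fraction of trials in which a true null hypothesis (mean = 0) is rejected."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value
    rejections = 0
    for _ in range(n_trials):
        # ASSUMPTION: the null hypothesis is true by construction (mean 0, sd 1).
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = sample_mean / (1.0 / n ** 0.5)         # population sd is known to be 1
        if abs(z) > z_crit:
            rejections += 1                        # a Type 1 error
    return rejections / n_trials

rate = type1_error_rate()
print(f"Share of true null hypotheses rejected: {rate:.3f}")
```

The share should come out close to the chosen level of 0.05, which is the sense in which the level of probability controls the Type 1 error.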
6An Example of the Issue: A Jury's Decision Making
7An Example of the Issue: A Jury's Decision Making
8The Statistical Decision Making Framework
9The Statistical Decision Making Framework
10The Jury and the Researcher Compared
11Steps for Making a Decision
- Specify a hypothesis and the null hypothesis.
- Specify a level of probability which you will use
to decide whether to reject the null hypothesis.
- Specify the test statistic and the sampling
distribution you will use to make a decision.
- Calculate the statistics and compare to the
theoretical probability distribution, for
example, the t distribution:
http://www.uwm.edu/renlex/T.html
- Interpret the results.
12General Form of the Test for a Mean
- Z tests
- (Sample mean - population mean) / SE of sample
mean, or
- (Sample mean - population mean) / (s / √n)
- T tests
- (Sample mean - population mean) / SE of sample
mean, or
- (Sample mean - population mean) / (s / √(n - 1))
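The n - 1 in the t test denominator can be read as a bookkeeping choice. A minimal sketch, assuming that s in the t formula is the standard deviation computed with divisor n: in that case s / √(n - 1) equals the more common form in which s is computed with divisor n - 1 and then divided by √n. The data and the hypothesized mean below are made up for illustration, and the function name is hypothetical.

```python
from math import sqrt, isclose
from statistics import mean, pstdev, stdev

def mean_test_statistic(sample, mu0):
    """(sample mean - hypothesized population mean) / SE of the sample mean."""
    n = len(sample)
    se = stdev(sample) / sqrt(n)      # s computed with divisor n - 1
    return (mean(sample) - mu0) / se

# Hypothetical household sizes, invented for this illustration only.
sample = [4, 6, 5, 7, 3, 6, 5, 8, 4, 6]
mu0 = 5.0                             # hypothesized population mean (also invented)

n = len(sample)
se_a = stdev(sample) / sqrt(n)        # divisor n - 1 inside s, then / sqrt(n)
se_b = pstdev(sample) / sqrt(n - 1)   # divisor n inside s, then / sqrt(n - 1)
same_se = isclose(se_a, se_b)         # the two conventions give the same SE

t = mean_test_statistic(sample, mu0)
print(f"SE = {se_a:.3f}, conventions agree: {same_se}, t = {t:.3f}, df = {n - 1}")
```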
13General Form of a T Test
- t = (sample estimate - null hypothesis value) / SE
- Which simplifies to
- t = sample estimate / SE
- When the null hypothesis is that the value being
estimated is 0.
14T Distribution
15Example 1
- Hypothesis: There is a difference in the average
number of persons per household in the 18th and
the 14th wards.
- Null Hypothesis: There is no difference in the
average number of persons per household in the
18th and the 14th wards, or more specifically,
any difference we measure is a matter of the
particular sample we have.
16Example, cont.
- Level of probability: 95% confidence level, so
that we would wrongly reject a true null
hypothesis only 1 time in 20.
- Test statistic: Means and a T-Test of the
difference of two groups.
- t = (mean1 - mean2) / (SE of the difference
mean1 - mean2)
- Calculate the statistics.
17Results
Two-sample t test on PERSONS grouped by WARD

Group     N     Mean    SD
14        316   8.15    3.70
18        120   5.25    2.55

Separate Variance   t = 9.29   df = 310.5   Prob = 0.00
  Difference in Means = 2.90   95.00% CI = 2.29 to 3.52
Pooled Variance     t = 7.90   df = 434     Prob = 0.00
  Difference in Means = 2.90   95.00% CI = 2.18 to 3.62
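The two t values in this output can be recomputed from the group sizes, means, and standard deviations alone. The sketch below uses the standard textbook formulas for the separate-variance (Welch) and pooled-variance tests. The slides do not say how the software computed its output, so treating these formulas as equivalent is an assumption, and the function name is hypothetical. Because the printed means and standard deviations are rounded, the results match only approximately.

```python
from math import sqrt

def two_sample_t(n1, m1, s1, n2, m2, s2):
    """Return (separate-variance t, its df, pooled-variance t, its df)."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    diff = m1 - m2

    # Separate variances: SE adds the two squared standard errors.
    se_sep = sqrt(v1 + v2)
    df_sep = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

    # Pooled variance: one common variance estimate for both groups.
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    se_pool = sqrt(sp2 * (1 / n1 + 1 / n2))
    df_pool = n1 + n2 - 2

    return diff / se_sep, df_sep, diff / se_pool, df_pool

# Summary statistics for wards 14 and 18, copied from the output above.
t_sep, df_sep, t_pool, df_pool = two_sample_t(316, 8.15, 3.70, 120, 5.25, 2.55)
print(f"Separate variance: t = {t_sep:.2f}, df = {df_sep:.1f}")
print(f"Pooled variance:   t = {t_pool:.2f}, df = {df_pool}")
```

Both t values are far beyond the roughly 1.97 needed for significance at the 95% level with this many degrees of freedom, which is why the reported probability rounds to 0.00.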
18Results, Graphically Displayed
19Interpret the Results
- Let's look at the t distribution again:
http://www.uwm.edu/renlex/T.html and p. 135 of
text.
- We can reject the null hypothesis that the two
means are the same in the underlying population
(the unknown truth).
- We say that there is a statistically significant
difference between the average number of persons
in the two wards.
20Example 2
- Hypothesis: There is a difference in the average
number of persons per household in the 18th and
the 20th wards.
- Null Hypothesis: There is no difference in the
average number of persons per household in the
18th and the 20th wards, or more specifically,
the difference is a matter of the particular
sample we have.
21Results
TEST PERSONS WARD
Data for the following results were selected according to
(WARD <> 14) AND (ward <> 22)

Two-sample t test on PERSONS grouped by WARD

Group     N     Mean    SD
18        120   5.25    2.55
20        342   5.72    2.62

Separate Variance   t = -1.73   df = 213.5   Prob = 0.08
  Difference in Means = -0.47   95.00% CI = -1.01 to 0.07
Pooled Variance     t = -1.71   df = 460     Prob = 0.09
  Difference in Means = -0.47   95.00% CI = -1.02 to 0.07
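The confidence interval and the probability in this output carry the same decision: if the 95% interval for the difference includes 0, the two-tailed probability is above 0.05. The sketch below rebuilds the separate-variance interval from the printed summary statistics. It substitutes the normal distribution for the t distribution, which is an assumption that is reasonable only because the degrees of freedom exceed 200, and the inputs are rounded, so the figures agree with the output only approximately.

```python
from math import sqrt
from statistics import NormalDist

# Summary statistics copied from the output above (ward 18, then ward 20).
n1, m1, s1 = 120, 5.25, 2.55
n2, m2, s2 = 342, 5.72, 2.62

diff = m1 - m2
se = sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)   # separate-variance standard error
t = diff / se

# ASSUMPTION: with df above 200 the t distribution is close to the normal,
# so normal quantiles and tail areas stand in for the t distribution here.
z = NormalDist()
crit = z.inv_cdf(0.975)
lower, upper = diff - crit * se, diff + crit * se
p_two_tail = 2 * (1 - z.cdf(abs(t)))

reject_null = not (lower <= 0 <= upper)
print(f"t = {t:.2f}, approximate 95% CI = {lower:.2f} to {upper:.2f}")
print(f"approximate two-tailed probability = {p_two_tail:.2f}, reject null: {reject_null}")
```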
22Results, Graphically Displayed
23Interpret the Results
- We cannot reject the null hypothesis that the two
means are the same in the underlying population
(the unknown truth).
- We say that there is not a statistically
significant difference between the average number
of persons in the two wards.
24Additional T tests
- Tests whether regression coefficients and
correlation coefficients are statistically
significantly different from 0.
- Test of regression slope (b)
- t = b / SE(b)
- Test of correlation coefficients
- t = r / SE(r)
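In a regression with a single predictor, these two tests are the same test: the t for the slope and the t for the correlation coefficient are equal. The sketch below checks this on made-up data, using the standard formulas SE(b) = √(SSE / (n - 2) / Sxx) and SE(r) = √((1 - r²) / (n - 2)). The data and the function name are hypothetical and are not taken from the slides.

```python
from math import sqrt

def slope_and_correlation_t(x, y):
    """Return (t for the slope b, t for the correlation r) in a simple regression."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))

    b = sxy / sxx                          # regression slope
    a = my - b * mx                        # intercept
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    se_b = sqrt(sse / (n - 2) / sxx)       # SE(b)

    r = sxy / sqrt(sxx * syy)              # correlation coefficient
    se_r = sqrt((1 - r ** 2) / (n - 2))    # SE(r)
    return b / se_b, r / se_r

# Hypothetical data, invented for this illustration only.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 2.9, 3.2, 4.8, 5.1, 5.9, 7.4, 7.8]

t_b, t_r = slope_and_correlation_t(x, y)
print(f"t for slope = {t_b:.3f}, t for correlation = {t_r:.3f}")
```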
25Testing a Regression Coefficient: The Impact of
Year Built on Size, 14th Ward, 1905
Dep Var: SIZE   N: 101   Multiple R: 0.23   Squared multiple R: 0.05
Adjusted squared multiple R: 0.04   Standard error of estimate: 588.27

Effect     Coefficient   Std Error   Std Coef   Tolerance   t      P(2 Tail)
CONSTANT   785.26        128.85      0.00       .           6.09   0.00
YEAR       24.72         10.67       0.23       1.00        2.32   0.02

Effect     Coefficient   Lower 95%   Upper 95%
CONSTANT   785.26        529.59      1040.92
YEAR       24.72         3.56        45.88
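The t column in this output is each coefficient divided by its standard error, and the 95% limits are the coefficient plus or minus a critical value times that standard error. As a rough check, the sketch below recomputes t and backs the critical value out of the printed limits. All numbers are copied from the output, and the variable names are hypothetical.

```python
# Values copied from the printed regression output above.
coefficients = {
    # name: (coefficient, std error, lower 95%, upper 95%)
    "CONSTANT": (785.26, 128.85, 529.59, 1040.92),
    "YEAR": (24.72, 10.67, 3.56, 45.88),
}

results = {}
for name, (coef, se, lower, upper) in coefficients.items():
    t = coef / se                             # t = b / SE(b), null value 0
    implied_crit = (upper - lower) / 2 / se   # multiplier behind the 95% limits
    excludes_zero = lower > 0 or upper < 0    # same decision as P(2 Tail) < 0.05
    results[name] = (t, implied_crit, excludes_zero)
    print(f"{name}: t = {t:.2f}, implied critical value = {implied_crit:.2f}, "
          f"interval excludes 0: {excludes_zero}")
```

The implied critical value of about 1.98 is what a t distribution with 99 degrees of freedom (N - 2) gives at the 95% level. The interval for YEAR excludes 0, in line with its two-tailed probability of 0.02.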
26An Example of the Issue: A Jury's Decision Making