Title: Hypothesis tests for difference between means
2Hypothesis tests for difference between means
- Hypothesis testing
- Central limit theorem
- Sampling distribution of difference between means
- Null and alternative hypotheses
- Type I and type II errors
- Tests for difference between means
- two-sample t-test (equal variances)
- one sample t-test
- paired two-sample t-test
- two-sample z-test
- two-sample t-test (unequal variances)
- One-tailed and two-tailed tests
3Hypothesis Testing
- Where we formally test the objective of our scientific investigation
- As we would speak
- Male and female shrimps are different sizes
- In statistics speak
- H0: There is no difference in size between male and female shrimps
- H1: There is a difference in size between male and female shrimps
- More formally
- H0: μ1 = μ2
- H1: μ1 ≠ μ2 or μ1 - μ2 ≠ 0
4Test for Differences Between Means
Question: are these two means calculated from samples of the same population?
6Central Limit Theorem
- If the sample size is sufficiently large
- the sampling distribution of the means approximates the normal probability distribution
- the underlying population does not have to be normally distributed
- If the sample size is small
- the sampling distribution of the (standardized) means follows the t-distribution, but only if the population being sampled is normally distributed
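The large-sample claim can be checked with a small simulation. The sketch below assumes an exponential population (strongly skewed, with mean 1 and standard deviation 1); the sample size of 50, the 2000 repeats, and the function names are arbitrary illustrative choices. It draws repeated samples and compares the spread of the sample means with σ/√n.

```python
import math
import random

def sample_means(n, repeats, seed=0):
    """Means of `repeats` samples of size n from an exponential(1) population."""
    rng = random.Random(seed)
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(repeats)]

def mean_and_sd(values):
    """Mean and sample standard deviation (n - 1 denominator)."""
    m = sum(values) / len(values)
    var = sum((v - m) ** 2 for v in values) / (len(values) - 1)
    return m, math.sqrt(var)

n = 50
means = sample_means(n, repeats=2000)
centre, spread = mean_and_sd(means)
expected_spread = 1.0 / math.sqrt(n)  # sigma / sqrt(n), with sigma = 1

# Share of sample means within 1.96 standard errors of the population mean;
# for a normal sampling distribution this would be about 0.95.
within = sum(abs(m - 1.0) <= 1.96 * expected_spread for m in means) / len(means)
print(centre, spread, expected_spread, within)
```

Although the population is far from normal, the sample means should centre on 1 with a spread close to σ/√n, and roughly 95% of them should fall within 1.96 standard errors of the population mean.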
7Central Limit Theorem
- We can extend the Central Limit Theorem to the sampling distribution of other statistics as well
- including the sampling distribution of the difference between two means.
8Test for Differences Between Means
x̄1 - x̄2 is a statistic which has a sampling distribution
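9The sampling distribution of x̄1 - x̄2
(Figure: sampling distribution of x̄1 - x̄2, labelled with its centre μ1 - μ2 and its standard error σ(x̄1 - x̄2).)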
10The standard error of the difference between means
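A standard form of this standard error, assuming the equal-variance (pooled) case that matches degrees of freedom of n1 + n2 - 2, is sketched below. The symbols s_p (pooled standard deviation) and s with subscript x̄1 - x̄2 (standard error of the difference) are notation assumed here for illustration.

```latex
s_p^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2},
\qquad
s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)},
\qquad
t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}}
```

The test statistic t is the observed difference between the sample means expressed in units of its standard error.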
11The sampling distribution of x̄1 - x̄2
- if samples are from the same population, any deviations of x̄1 - x̄2 from zero are due to sampling
- the larger the absolute value of x̄1 - x̄2, the less likely the samples are to be drawn from the same population
- the sampling distribution of x̄1 - x̄2 (divided by its standard error) follows the t distribution
13Example two-sample test
n1 = 20, x̄1 = 16.67, s1 = 4.2
n2 = 20, x̄2 = 13.75, s2 = 5.1
14Example two-sample test
H0: μ1 = μ2
H1: μ1 - μ2 ≠ 0
n1 = 20, x̄1 = 16.67, s1 = 4.2
n2 = 20, x̄2 = 13.75, s2 = 5.1
17Example two-sample test
get critical value for t from the table
degrees of freedom = n1 + n2 - 2 = 20 + 20 - 2 = 38
significance (α) = 0.05
tcrit = 2.02
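The calculation behind this example can be sketched from the summary statistics alone. The code assumes the pooled (equal-variance) standard error, consistent with n1 + n2 - 2 degrees of freedom; the function name `pooled_t_from_summary` is illustrative, and the critical value is the table value quoted for this example rather than one computed in code.

```python
import math

def pooled_t_from_summary(n1, mean1, s1, n2, mean2, s2):
    """Return (t, df, standard error) for an equal-variance two-sample t-test."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df, se

t_calc, df, se = pooled_t_from_summary(20, 16.67, 4.2, 20, 13.75, 5.1)
t_crit = 2.02  # table value for df = 38, alpha = 0.05 (two-tailed)

reject_h0 = abs(t_calc) > t_crit
print(f"se = {se:.3f}, t = {t_calc:.2f}, df = {df}, reject H0: {reject_h0}")
# prints: se = 1.477, t = 1.98, df = 38, reject H0: False
```

Under these assumptions the calculated t of about 1.98 falls just short of the critical value 2.02, so H0 would not be rejected at the 0.05 level.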
19Hypothesis Testing for Sample Means
- State Null Hypothesis (H0)
- State Alternative Hypothesis (H1)
- Decide on Level of Significance
- Choose Test Distribution
- Define Rejection Regions
- State Decision Rule
- Calculations
- Make Statistical Decision
20Type I and Type II Errors
H0: μ1 = μ2
H1: μ1 ≠ μ2
21Probability of Type I and Type II Errors
- Probability of a Type I Error is the significance level (α)
- Probability of a Type II Error is unknown, but
- increases as significance level decreases
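The trade-off between the two error types can be illustrated with a simulation. This is a minimal sketch under stated assumptions: normal populations with standard deviation 1, n = 20 per sample, a true difference of 0.5 when H0 is false, and a pooled two-sample t statistic. The critical value 2.71 for α = 0.01 with 38 degrees of freedom is an assumed table value, and all function names are illustrative.

```python
import math
import random

def pooled_t(a, b):
    """Two-sample t statistic using the pooled (equal-variance) standard error."""
    n1, n2 = len(a), len(b)
    m1, m2 = sum(a) / n1, sum(b) / n2
    ss1 = sum((x - m1) ** 2 for x in a)
    ss2 = sum((x - m2) ** 2 for x in b)
    pooled_var = (ss1 + ss2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))

def rejection_rate(true_diff, t_crit, trials=2000, n=20, seed=1):
    """Fraction of simulated experiments in which |t| exceeds t_crit."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        a = [rng.gauss(true_diff, 1.0) for _ in range(n)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if abs(pooled_t(a, b)) > t_crit:
            rejections += 1
    return rejections / trials

T_CRIT_05 = 2.02  # table value, df = 38, alpha = 0.05 (two-tailed)
T_CRIT_01 = 2.71  # assumed table value, df = 38, alpha = 0.01 (two-tailed)

# H0 true: every rejection is a Type I error.
type1_rate = rejection_rate(true_diff=0.0, t_crit=T_CRIT_05)
# H0 false: every failure to reject is a Type II error.
type2_at_05 = 1 - rejection_rate(true_diff=0.5, t_crit=T_CRIT_05)
type2_at_01 = 1 - rejection_rate(true_diff=0.5, t_crit=T_CRIT_01)
print(type1_rate, type2_at_05, type2_at_01)
```

The Type I rate should land close to 0.05, and the Type II rate should be at least as high at α = 0.01 as at α = 0.05, because the same simulated samples face a stricter critical value.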
22One-sample t-test
- Test a sample mean against some known or suggested population mean
- H0: μ = reported value
- H1: μ ≠ reported value
23Example one-sample test
- Water authority mean nitrate value = 17.34 mg/l
- Sample: x̄ = 21.89, s = 3.04, n = 20
- Significance (α) = 0.05
25Example one-sample test
- tcrit = ?
- df = n - 1 = 20 - 1 = 19
- α = 0.05
- tcrit = 2.093
- tcalc > tcrit so we reject H0
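The calculated t for this example can be reproduced from the summary statistics with the usual one-sample form, t = (x̄ - μ0) / (s / √n), where μ0 is the reported value. The function name `one_sample_t` is illustrative, and the critical value is taken from the table rather than computed.

```python
import math

def one_sample_t(sample_mean, s, n, reference_mean):
    """Return (t, df) for a one-sample t-test from summary statistics."""
    se = s / math.sqrt(n)  # standard error of the mean
    return (sample_mean - reference_mean) / se, n - 1

t_calc, df = one_sample_t(sample_mean=21.89, s=3.04, n=20, reference_mean=17.34)
t_crit = 2.093  # table value for df = 19, alpha = 0.05 (two-tailed)

print(f"t = {t_calc:.2f}, df = {df}, reject H0: {abs(t_calc) > t_crit}")
# prints: t = 6.69, df = 19, reject H0: True
```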
26t-test for paired samples
- sometimes two samples are not independent
- then we test for difference between paired
measurements
27t-test for paired samples
d is the difference between each pair of measurements
d̄ is the mean of all the individual differences
sd is the best estimate of the standard deviation of d
degrees of freedom is n - 1
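With these quantities the paired statistic takes the usual form t = d̄ / (sd / √n), where n is the number of pairs. The sketch below applies it to hypothetical measurements; the data and the function name `paired_t` are illustrative assumptions, not values from the example material.

```python
import math

def paired_t(first, second):
    """Return (t, df) for a paired t-test on two equal-length lists."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    d = [a - b for a, b in zip(first, second)]  # difference for each pair
    n = len(d)
    d_bar = sum(d) / n                          # mean of the differences
    s_d = math.sqrt(sum((x - d_bar) ** 2 for x in d) / (n - 1))
    return d_bar / (s_d / math.sqrt(n)), n - 1

# Hypothetical paired measurements (illustrative only).
first = [10, 12, 9, 11, 13]
second = [9, 10, 6, 9, 11]
t_calc, df = paired_t(first, second)
print(f"t = {t_calc:.2f}, df = {df}")
# prints: t = 6.32, df = 4
```

Because the test works on the differences alone, it reduces to a one-sample t-test of d̄ against zero.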
28One-tailed and two-tailed tests
- Two-tailed test
- H0: μ1 = μ2 (or μ1 - μ2 = 0)
- H1: μ1 ≠ μ2 (or μ1 - μ2 ≠ 0)
- One-tailed test
- H0: μ1 = μ2 (or μ1 - μ2 = 0)
- H1: μ1 > μ2 (or μ1 - μ2 > 0)
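The choice of tails changes the decision rule and the critical value. The sketch below uses an illustrative test statistic of 1.98 with 38 degrees of freedom; the two-tailed critical value 2.02 is the table value for α = 0.05, while the one-tailed value 1.686 is an assumed table value for the same α. The function name `reject_h0` is hypothetical.

```python
def reject_h0(t_calc, t_crit, tails):
    """Two-tailed compares |t| with t_crit; one-tailed (H1: mu1 > mu2) compares t."""
    if tails == 2:
        return abs(t_calc) > t_crit
    if tails == 1:
        return t_calc > t_crit
    raise ValueError("tails must be 1 or 2")

t_calc = 1.98            # illustrative test statistic, df = 38
two_tailed_crit = 2.02   # table value, df = 38, alpha = 0.05, two-tailed
one_tailed_crit = 1.686  # assumed table value, df = 38, alpha = 0.05, one-tailed

two_tailed = reject_h0(t_calc, two_tailed_crit, tails=2)
one_tailed = reject_h0(t_calc, one_tailed_crit, tails=1)
print(two_tailed, one_tailed)
# prints: False True
```

The same statistic can lead to different decisions, because a one-tailed test places all of α in a single tail and so has a smaller critical value.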
29Z-test for Means
- For large samples (n > 30)
- But might as well use the t-test, as the two tests produce approximately the same results
- See notes for more details
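A common form of the two-sample z statistic for large samples, with the sample standard deviations standing in for the population values, is sketched below. The notation (x̄, s, n with subscripts 1 and 2) follows the two-sample example and is an assumption about how the statistic would be written here.

```latex
z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
```

For large n the t distribution is close to the standard normal, which is why the two tests give approximately the same results.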
30Key assumptions to t-test
- for small samples, the parent data from which the samples are drawn are normally distributed
- for large samples, the parent data can have any distribution
- the two samples come from distributions that may differ in their mean value, but not in the standard deviation (or variance), and
- the observations are random, and the samples are independent of each other (except for the paired test).