Title: MARE 250
1Hypothesis Testing II
MARE 250 Dr. Jason Turner
2To ASSUME is to make an
However...
Four assumptions for t-test hypothesis testing:
1. Random samples
2. Independent samples
3. Normal populations (or large samples)
4. Variances (std. dev.) are equal
3When do I do the what now?
Well, whenever I'm confused, I just check my
underwear. It holds the answer to all the
important questions. - Grandpa Simpson
- If all 4 assumptions are met: conduct a pooled t-test. You can pool the samples because the variances are assumed to be equal.
- If the samples are not independent: conduct a paired t-test.
- If the variances (std. dev.) are not equal: conduct a non-pooled t-test.
- If the data are not normal and the sample size is small: conduct a non-parametric alternative to the t-test (Mann-Whitney).
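The decision rules above can be sketched as a small Python helper. This is only an illustration: the function name `choose_two_sample_test`, its boolean arguments, and the order in which the assumptions are checked are assumptions made for this sketch, not something the slides specify.

```python
def choose_two_sample_test(independent, equal_variances, normal_or_large):
    """Suggest a two-sample test from which assumptions hold.

    Each argument is a boolean. Random sampling is taken for granted.
    The order of the checks is an assumption of this sketch.
    """
    if not independent:
        return "paired t-test"
    if not normal_or_large:
        return "Mann-Whitney test"
    if not equal_variances:
        return "non-pooled t-test"
    return "pooled t-test"


# All assumptions met, then each assumption failing in turn.
print(choose_two_sample_test(True, True, True))
print(choose_two_sample_test(False, True, True))
print(choose_two_sample_test(True, False, True))
print(choose_two_sample_test(True, True, False))
```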
4When to pool, when to not-pool
"We have a pool and a pond. The pond would be
good for you." - Ty Webb
Minitab runs both tests as a 2-sample t-test.
For the pooled test, check the box Assume Equal Variances.
For the non-pooled test, do not check the box.
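To see what the checkbox changes, here is a minimal Python sketch of the two calculations using only the standard library. The function name `two_sample_t` and the data values are made up for illustration, and the non-pooled branch uses the usual Welch degrees-of-freedom formula; statistical software may round that value.

```python
import math
from statistics import mean, variance


def two_sample_t(x, y, pooled):
    """Return (t statistic, degrees of freedom) for a two-sample t-test."""
    n1, n2 = len(x), len(y)
    v1, v2 = variance(x), variance(y)  # sample variances (n - 1 denominator)
    diff = mean(x) - mean(y)
    if pooled:
        # Pool the two variances, weighting each by its degrees of freedom.
        sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
        se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
        df = n1 + n2 - 2
    else:
        # Non-pooled (Welch): keep the two variances separate.
        se = math.sqrt(v1 / n1 + v2 / n2)
        df = (v1 / n1 + v2 / n2) ** 2 / (
            (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
        )
    return diff / se, df


# Hypothetical measurements from two independent samples.
x = [10, 12, 14, 16, 18]
y = [11, 12, 13, 14, 15]
print(two_sample_t(x, y, pooled=True))
print(two_sample_t(x, y, pooled=False))
```

With equal sample sizes the two t statistics coincide and only the degrees of freedom differ; with unequal sample sizes the statistics differ as well.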
5Assessing Equal Variances
Equality of variances can be checked by performing an F-test.
This is often not recommended: although the pooled
t-test is moderately robust to unequal variances,
the F-test is extremely non-robust to such
inequalities.
What?
The pooled t-test will still give you an accurate
result with some degree of unequal variance. The F-test
is much less forgiving of unequal variances than the pooled t-test.
6Who did the What Now
What?
7Assessing Equal Variances
The F-test and Levene's test are used to judge the equality of
variances. In both tests, the null hypothesis
(Ho) is that the population variances under
consideration (or equivalently, the population
standard deviations) are equal, and the
alternative hypothesis (Ha) is that the two
variances are not equal. The choice of test
depends on the properties of the data's distribution.
8What the F?
- Use the F-test when the data come from a normal distribution. The F-test is not robust to departures from normality.
- Use Levene's test when the data come from continuous, but not necessarily normal, distributions. Levene's test is less sensitive than the F-test, so use the F-test when your data are normal or nearly normal.
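As a rough illustration of what the two tests measure, the sketch below computes both test statistics by hand in Python. The function names and data are hypothetical. The Levene statistic here is centered on each sample's median, which is one common variant and is assumed for this sketch. Converting either statistic to a p-value requires the F distribution, which is not in the standard library, so that step is left out.

```python
from statistics import mean, median, variance


def f_ratio(x, y):
    """F statistic: ratio of the two sample variances."""
    return variance(x) / variance(y)


def levene_w(x, y):
    """Levene's statistic from absolute deviations about each sample median."""
    groups = [x, y]
    # Absolute deviation of every observation from its own group's median.
    z = [[abs(v - median(g)) for v in g] for g in groups]
    n_total = sum(len(g) for g in z)
    k = len(z)
    grand = mean([v for g in z for v in g])
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in z)
    within = sum((v - mean(g)) ** 2 for g in z for v in g)
    return (n_total - k) * between / ((k - 1) * within)


# Hypothetical measurements from two samples.
x = [10, 12, 14, 16, 18]
y = [11, 12, 13, 14, 15]
print(f_ratio(x, y))   # compared against F with (n1 - 1, n2 - 1) d.f.
print(levene_w(x, y))  # compared against F with (k - 1, N - k) d.f.
```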
9When the F?
MINITAB calculates and displays a test statistic
and p-value for both the F-test and Levene's test.
Ho: σ1 = σ2 (the 2 population variances are equal)
Ha: σ1 ≠ σ2 (the 2 variances are not equal)
High p-values (above α-level): Fail to Reject Null. This indicates no statistically
significant difference between the variances
(equality or homogeneity of variances).
Low p-values (below α-level): Reject Null. This indicates
a difference between the variances (inequality
of variances).
10How the F?
STAT > Basic Statistics > 2-Variances
Enter columns of data as before.
Under Options you can modify the α-level of the test (but why would you do that?).
Note that by default, MINITAB gives you the
results of both the F-test and Levene's test. You must
decide a priori which test you plan to use.
11Significance Level
The probability of making a TYPE I Error
(rejection of a true null hypothesis) is called
the significance level (α) of a hypothesis test.
TYPE II Error Probability (β): nonrejection of a false null hypothesis.
For a fixed sample size, the smaller we specify the
significance level (α), the larger will be the
probability (β) of not rejecting a false null
hypothesis.
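The trade-off between the two error probabilities can be illustrated numerically. The sketch below is a simplification: it uses a normal (z) approximation with a known common standard deviation instead of the exact t-based calculation, and the function name `type_ii_error` and the input values are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(alpha, difference, sigma, n):
    """Approximate beta for a two-sided, two-sample test with n per group.

    Normal approximation with a known common sigma (an assumption of
    this sketch, not the exact t-based value).
    """
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)
    shift = difference / (sigma * sqrt(2 / n))
    power = z.cdf(shift - crit) + z.cdf(-shift - crit)
    return 1 - power


# Same sample size and true difference, progressively stricter alpha.
for alpha in (0.10, 0.05, 0.01):
    beta = type_ii_error(alpha, difference=1.0, sigma=2.0, n=20)
    print(f"alpha = {alpha:.2f}  beta = {beta:.3f}")
```

Reading down the printed values, beta grows as alpha shrinks, which is the relationship stated above.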
12I have the POWER!!!
The power of a hypothesis test is the probability
of not making a TYPE II error, that is, of rejecting a false
null hypothesis and finding evidence to support the
alternative hypothesis.
POWER = 1 - β
Produce a power curve.
13We need more POWER!!!
For a fixed significance level, increasing the
sample size increases the power. Therefore, you
can run a test to determine if your sample size
HAS THE POWER!!!
By using a sufficiently large sample size, we can
obtain a hypothesis test with as much power as we
want.
14Power - the probability of being able to detect an effect of a given size
Sample size - the number of observations in each sample
Difference (effect) - the difference between µ for one population and µ for the other
15Increasing the power of the test
There are four factors that can increase the power of a two-sample t-test:
1. Larger effect size (difference) - The greater the real difference between µ for the two populations, the more likely it is that the sample means will also be different.
2. Higher α-level (the level of significance) - If you choose a higher value for α, you increase the probability of rejecting the null hypothesis, and thus the power of the test. (However, you also increase your chance of type I error.)
3. Less variability - When the standard deviation is smaller, smaller differences can be detected.
4. Larger sample sizes - The more observations there are in your samples, the more confident you can be that the sample means represent µ for the two populations. Thus, the test will be more sensitive to smaller differences.
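The four factors can be checked one at a time with a small calculation. This is a sketch under simplifying assumptions: it uses a normal approximation with a known common standard deviation rather than the exact t-based power, and the function name `approx_power` and all numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(difference, alpha, sigma, n):
    """Approximate power of a two-sided, two-sample test with n per group."""
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)
    shift = difference / (sigma * sqrt(2 / n))
    return z.cdf(shift - crit) + z.cdf(-shift - crit)


# Baseline settings, then change exactly one factor at a time.
base = dict(difference=1.0, alpha=0.05, sigma=2.0, n=20)
scenarios = {
    "baseline": base,
    "1. larger difference": {**base, "difference": 2.0},
    "2. higher alpha": {**base, "alpha": 0.10},
    "3. less variability": {**base, "sigma": 1.0},
    "4. larger samples": {**base, "n": 80},
}
for name, settings in scenarios.items():
    print(f"{name}: power = {approx_power(**settings):.3f}")
```

Each modified scenario prints a higher power than the baseline.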
16Increasing the power of the test
The most practical way to increase power is often to
increase the sample size. However, you can also
try to decrease the standard deviation by making
improvements in your process or measurement.
17Sample size
Increasing the size of your samples
increases the power of your test. You want enough
observations in your samples to achieve adequate
power, but not so many that you waste time and
money on unnecessary sampling. If you provide the
power that you want the test to have and the
difference you want it to be able to detect,
MINITAB will calculate how large your samples
must be.
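A back-of-the-envelope version of that sample-size calculation can be written in a few lines of Python. This sketch uses the standard normal-approximation formula n = 2((z_α/2 + z_power)σ / difference)² per group. It is not MINITAB's procedure, and an exact t-based calculation generally gives a slightly larger answer. The function name `samples_per_group` and its default values are assumptions made for illustration.

```python
from math import ceil
from statistics import NormalDist


def samples_per_group(difference, sigma, power=0.80, alpha=0.05):
    """Approximate n per group for a two-sided, two-sample comparison.

    Normal approximation with a known common sigma (an assumption of
    this sketch).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) * sigma / difference) ** 2)


# Hypothetical inputs: detect a difference of 1 unit when sigma is 2 units.
print(samples_per_group(difference=1.0, sigma=2.0))
# Halving the difference to detect roughly quadruples the required n.
print(samples_per_group(difference=0.5, sigma=2.0))
```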
18When to pair, when to not-pair
All I got's two fives! - Jean LaRose
Minitab runs this test directly as a paired t-test.
Used when there is a natural pairing of
the members of two populations.
Each pair consists of a member from one population and that
member's corresponding member in the other
population.
Uses the mean of the paired differences (which equals the difference between the two sample
means).
19When to pair, when to not-pair
All I got's two fives! - Jean LaRose
Paired t-test assumptions:
1. Random sample
2. Paired differences normally distributed (or large n)
3. Outliers can confound results
Tests whether the mean difference in the pairs is
significantly different from zero.
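The paired calculation can be sketched in Python as a one-sample t statistic on the within-pair differences. The function name `paired_t` and the before/after values are hypothetical and only illustrate the arithmetic.

```python
import math
from statistics import mean, stdev


def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for a paired t-test."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    # One difference per pair; the test is then a one-sample t-test on d.
    d = [a - b for a, b in zip(after, before)]
    n = len(d)
    t = mean(d) / (stdev(d) / math.sqrt(n))
    return t, n - 1


# Hypothetical weights of the same five fish before and after a new diet.
before = [10, 12, 9, 11, 13]
after = [12, 13, 11, 12, 16]
print(paired_t(before, after))
```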
20Paired Test - Example
For example, use a paired test if you are testing the effects of
some experimental treatment upon a
population, e.g. the effect of a new diet upon a
single sample of fish.
However, a paired test must
have equal sample sizes.
21When to parametric
Nonparametric procedures: statistical procedures
that require very few assumptions about the
underlying population. They are often used when
the data are not from a normal population.
22Non-Parametric
Non-parametric alternative to the t-test (Mann-Whitney):
1. Random sample
2. Does not require normally distributed data
3. Outliers do not confound results
Tests whether the difference between the two population medians is
significantly different from zero.
Non-parametric tests are used heavily in some disciplines,
although not typically in the natural sciences.
They are often the last resort when data are not
collected correctly, and they have low power.
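Because the Mann-Whitney test works on ranks rather than raw values, an extreme observation has no more influence than any other largest value. The sketch below computes the U statistics by hand. The function name `mann_whitney_u` and the data are hypothetical, ties receive the average of the ranks they span, and the p-value step is omitted.

```python
def mann_whitney_u(x, y):
    """Return (U for x, U for y), using average ranks for tied values."""
    combined = sorted(x + y)
    ranks = {}
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j] == combined[i]:
            j += 1
        # Positions i+1 .. j share this value; give each the average rank.
        ranks[combined[i]] = (i + 1 + j) / 2
        i = j
    n1, n2 = len(x), len(y)
    rank_sum_x = sum(ranks[v] for v in x)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    return u_x, n1 * n2 - u_x


# Hypothetical samples; the second call swaps in an extreme outlier.
x = [1.1, 2.3, 3.5, 4.0]
y = [2.9, 5.2, 6.1, 7.4, 8.0]
print(mann_whitney_u(x, y))
print(mann_whitney_u(x, [2.9, 5.2, 6.1, 7.4, 800.0]))
```

Both calls give the same pair of U values, since replacing the largest observation with an outlier leaves every rank unchanged.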
23Drawbacks of Nonparametric Tests
Nonparametric tests:
- Are less powerful than parametric tests. Thus, you are less likely to reject the null hypothesis when it is false.
- Often require you to modify the hypotheses. For example, most nonparametric tests concerning the population center are tests about the median rather than the mean. The test does not answer the same question as the corresponding parametric procedure.
When a choice exists and
you are reasonably certain that the assumptions
for the parametric procedure are satisfied, then
use the parametric procedure.