Title: Stat 112: Lecture 21 Notes
Stat 112 Lecture 21 Notes
- Model Building (Brief Discussion)
- Chapter 9.1: One-way Analysis of Variance.
- Homework 6 is due Friday, Dec. 1st.
- I will be e-mailing you tonight or tomorrow some
comments on your project ideas.
- I will have the quizzes graded by tomorrow's
office hours (Wed. 1:30-2:30); otherwise, I will
return them to you next Tuesday.
Model Building
- Among the potential explanatory variables, think
about which explanatory variables address the
question of interest.
- For each explanatory variable, investigate
whether a transformation is needed for it, either
because of curvature or crunching.
- Consider adding polynomial terms for each
variable if there is remaining curvature for the
variable (use the procedure of adding higher
orders as long as the highest order term has
p-value < 0.05).
- Consider interactions between the explanatory
variables, adding the interaction if the p-value
on the interaction term is < 0.05.
Analysis of Variance
- The goal of analysis of variance is to compare
the means of several (many) groups.
- Analysis of variance is regression with only
categorical variables.
- One-way analysis of variance: Groups are defined
by one categorical variable.
- Two-way analysis of variance: Groups are defined
by two categorical variables.
Milgram's Obedience Experiments
- Subjects were recruited to take part in an experiment
on memory and learning.
- The subject is the teacher.
The subject conducts a paired-associate
learning task with the student. The subject is
instructed by the experimenter to administer a
shock to the student each time he gives a wrong
response. Moreover, the subject is instructed
to move one level higher on the shock generator
each time the student gives a wrong answer.
The subject is also instructed to announce the
voltage level before administering a shock.
Four Experimental Conditions
- Remote-Feedback condition: Student is placed in a
room where he cannot be seen by the subject, nor
can his voice be heard; his answers flash
silently on a signal box. However, at 300 volts
the laboratory walls resound as he pounds in
protest. After 315 volts, no further answers
appear, and the pounding ceases.
- Voice-Feedback condition: Same as remote-feedback
condition except that vocal protests were
introduced that could be heard clearly through
the walls of the laboratory.
- Proximity: Same as the voice-feedback condition
except that the student was placed in the same room
as the subject, a few feet from the subject. Thus,
he was visible as well as audible.
- Touch-Proximity: Same as proximity condition
except that the student received a shock only when
his hand rested on a shock plate. At the
150-volt level, the student demanded to be let
free and refused to place his hand on the shock
plate. The experimenter ordered the subject to
force the victim's hand onto the plate.
Two Key Questions
- Is there any difference among the mean voltage
levels of the four conditions?
- If there are differences, what conditions
specifically are different?
Multiple Regression Model for Analysis of Variance
- To answer these questions, we can fit a multiple
regression model with voltage level as the
response and one categorical explanatory variable
(condition).
- We obtain a sample from each level of the
categorical variable (group) and are interested
in estimating the population means of the groups
based on these samples.
- Assumptions of multiple regression model for
one-way analysis of variance:
  - Linearity: automatically satisfied.
  - Constant variance: Check if spread within each
  group is the same.
  - Normality: Check if distribution within each
  group is normal.
  - Independence: Sample consists of independent
  observations.
Comparing the Groups
- The coefficient on Condition[Proximity], -26.25,
means that proximity is estimated to have a mean
that is 26.25 less than the mean of the means of
all the conditions.
- Sample mean of proximity group = (mean of the
means of all the conditions) - 26.25.
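The relationship between the coefficients and the group means can be sketched in a few lines of Python. This is only an illustration: the voltage values below are made up (they are not Milgram's data), and the sketch assumes the coding described above, where each coefficient is a group mean minus the mean of the group means.

```python
from statistics import mean

# Hypothetical voltage levels for illustration only (not Milgram's data).
groups = {
    "Remote": [450, 420, 390, 450],
    "Voice-Feedback": [360, 405, 330, 450],
    "Proximity": [300, 255, 345, 285],
    "Touch-Proximity": [270, 210, 300, 240],
}

# Sample mean of each group.
group_means = {name: mean(values) for name, values in groups.items()}

# Under this coding, the intercept is the mean of the group means.
intercept = mean(list(group_means.values()))

# Each coefficient is the group's sample mean minus the mean of the group means.
coefficients = {name: m - intercept for name, m in group_means.items()}

for name in groups:
    print(f"{name}: mean = {group_means[name]:.2f}, "
          f"coefficient = {coefficients[name]:+.2f}")
```

A negative coefficient, as for proximity in the notes, means that group's sample mean is below the mean of the group means. The coefficients always sum to zero.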
- Effect Test: tests the null hypothesis that the mean
in all four conditions is the same versus the
alternative hypothesis that at least two of the
conditions have different means.
- p-value of Effect Test < 0.0001. Strong evidence
that population means are not the same for all
four conditions.
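As a rough sketch of what the Effect Test computes, the F statistic compares the variation between group means with the variation within groups. The function name and the tiny data set below are hypothetical. Converting F into a p-value needs the F distribution, which is left out to keep the sketch free of third-party libraries.

```python
from statistics import mean

def one_way_f(groups):
    """Return (F statistic, between-group df, within-group df)."""
    all_values = [y for g in groups for y in g]
    grand_mean = mean(all_values)
    k = len(groups)       # number of groups
    n = len(all_values)   # total number of observations

    # Between-group sum of squares: group means versus the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: observations versus their own group mean.
    ss_within = sum((y - mean(g)) ** 2 for g in groups for y in g)

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical data: three groups of three observations.
samples = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_between, df_within = one_way_f(samples)
print(f"F = {f_stat:.2f} on {df_between} and {df_within} degrees of freedom")
```

A large F relative to the F distribution with these degrees of freedom corresponds to a small p-value for the Effect Test.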
JMP for One-way ANOVA
- One-way ANOVA can be carried out in JMP either
using Fit Model with a categorical explanatory
variable or Fit Y by X with the categorical
variable as the explanatory variable.
- After using the Fit Y by X command, click the red
triangle next to Oneway Analysis and then Display
Options, Boxplots to see side-by-side boxplots,
and click Means/Anova to see the means of the
different groups and the test of whether all
groups have the same mean. The p-value for this
test is labeled Prob > F in the ANOVA table.
Prob > F: p-value for the test that all groups have
the same mean. Same as the p-value for the Effect Test
in the Fit Model output.
Two Key Questions
- Is there any difference among the mean voltage
levels of the four conditions?
  - Yes, there is strong evidence of a
  difference. p-value of Effect Test < 0.0001.
- If there are differences, what conditions
specifically are different?
Testing whether each of the groups is different
- Naïve approach to deciding which groups have a mean
that is different from the average of the means
of all groups: Do a t-test for each group and look
for groups that have p-value < 0.05.
- Problem: Multiple comparisons.
Errors in Hypothesis Testing
                                        State of World
Decision Based on Data    Null Hypothesis True    Alternative Hypothesis True
Accept Null Hypothesis    Correct Decision        Type II error
Reject Null Hypothesis    Type I error            Correct Decision
When we do one hypothesis test and reject the null
hypothesis if the p-value < 0.05, then the probability
of making a Type I error when the null hypothesis
is true is 0.05. We protect against falsely
rejecting a null hypothesis by making the probability
of a Type I error small.
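A short simulation can make the 0.05 figure concrete. The sketch below is illustrative only. To stay within the Python standard library it uses a one-sample z-test with known standard deviation 1 rather than a t-test, and the seed, sample size and number of trials are arbitrary choices.

```python
import random
from statistics import NormalDist, mean

random.seed(112)  # arbitrary seed so the run is reproducible

def z_test_p_value(sample, sigma=1.0):
    """Two-sided p-value for H0: mean = 0 when the SD (sigma) is known."""
    z = mean(sample) / (sigma / len(sample) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))

n_trials = 10_000
rejections = 0
for _ in range(n_trials):
    # The null hypothesis is true: data come from a normal with mean 0.
    sample = [random.gauss(0, 1) for _ in range(10)]
    if z_test_p_value(sample) < 0.05:
        rejections += 1

type_i_rate = rejections / n_trials
print(f"Estimated Type I error rate: {type_i_rate:.3f}")
```

The estimated rate should land close to 0.05, since every rejection here is a false rejection.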
Multiple Comparisons Problem
- Compound uncertainty: When doing more than one
test, there is an increased chance of making a
mistake.
- If we do multiple hypothesis tests and use the
rule of rejecting the null hypothesis in each
test if the p-value is < 0.05, then if all the
null hypotheses are true, the probability of
falsely rejecting at least one null hypothesis is
> 0.05.
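To get a feel for how quickly this probability grows, here is a small sketch. It assumes the tests are independent, which the notes do not claim and which does not hold exactly for pairwise comparisons of groups, so treat the numbers as rough guides.

```python
def familywise_error_rate(k, alpha=0.05):
    """P(at least one false rejection) for k independent tests,
    each run at level alpha, when every null hypothesis is true."""
    return 1 - (1 - alpha) ** k

for k in (1, 5, 20, 190):
    print(f"{k:>3} tests: familywise rate = {familywise_error_rate(k):.3f}")
```

With a single test the rate is 0.05 by construction. With 190 tests (the number of pairs among 20 groups) a false rejection is almost certain under this independence assumption.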
Multiple Comparisons Simulation
- In multiplecomp.JMP, 20 groups are compared with
sample sizes of ten for each group.
- The observations for each group are simulated
from a standard normal distribution. Thus, in
fact, all of the groups have the same population mean.
- Number of pairs found to have significantly
different means using t-tests at level 0.05:
Iteration     1  2  3  4  5
# of Pairs
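A simulation in the same spirit can be sketched in Python. This is not the multiplecomp.JMP file. It assumes separate pooled two-sample t-tests for each pair (18 degrees of freedom, with the two-sided 0.05 critical value 2.101 taken from a t table), which may differ from the comparison JMP performs. The seed is arbitrary.

```python
import random
from itertools import combinations
from statistics import mean, variance

random.seed(21)  # arbitrary seed so the run is reproducible

N_GROUPS, N_PER_GROUP = 20, 10
# Two-sided 0.05 critical value of the t distribution with 18 df (t table).
T_CRIT_DF18 = 2.101

def pooled_t(a, b):
    """Pooled-variance two-sample t statistic for equal-sized samples."""
    n = len(a)
    pooled_var = (variance(a) + variance(b)) / 2
    return (mean(a) - mean(b)) / (pooled_var * 2 / n) ** 0.5

def count_significant_pairs():
    """Simulate 20 groups with identical population means and count the
    pairs that a naive t-test declares different."""
    groups = [[random.gauss(0, 1) for _ in range(N_PER_GROUP)]
              for _ in range(N_GROUPS)]
    return sum(abs(pooled_t(a, b)) > T_CRIT_DF18
               for a, b in combinations(groups, 2))

n_pairs = N_GROUPS * (N_GROUPS - 1) // 2
counts = [count_significant_pairs() for _ in range(5)]
print(f"Pairs compared per iteration: {n_pairs}")
print(f"Significant pairs in 5 iterations: {counts}")
```

Every pair flagged here is a false rejection, because all 20 groups share the same population mean.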
Multiple Comparison Simulation
- In multiplecomp.JMP, 20 groups are compared with
sample sizes of ten for each group.
- The observations for each group are simulated
from a standard normal distribution. Thus, in
fact, all of the groups have the same population mean.
- Number of groups found to have means different
from the average using t-tests and rejecting if
p-value < 0.05:
Iteration     1  2  3  4  5
# of Groups
Individual vs. Familywise Error Rate
- When several tests are considered simultaneously,
they constitute a family of tests.
- Individual Type I error rate: Probability for a
single test that the null hypothesis will be
rejected assuming that the null hypothesis is
true.
- Familywise Type I error rate: Probability for a
family of tests that at least one null hypothesis
will be rejected assuming that all of the null
hypotheses are true.
- When we consider a family of tests, we want to
make the familywise error rate small, say 0.05,
to protect against falsely rejecting a null
hypothesis.
Bonferroni Method
- General method for doing multiple comparisons for
any family of k tests.
- Denote the familywise type I error rate we want by
p, say p = 0.05.
- Compute p-values for each individual test:
p_1, ..., p_k.
- Reject the null hypothesis for the ith test if
p_i < p/k.
- Guarantees that the familywise type I error rate is
at most p.
- Why Bonferroni works: If we do k tests and all
null hypotheses are true, then using Bonferroni
with p = 0.05, we have probability 0.05/k of making
a Type I error on each test and expect to make
k(0.05/k) = 0.05 errors in total. The probability
of making at least one error cannot exceed the
expected number of errors, so it is at most 0.05.
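The Bonferroni rule is simple enough to sketch directly. The function name and the p-values below are hypothetical, chosen only to show the rule in action.

```python
def bonferroni_reject(p_values, familywise_rate=0.05):
    """Return a list of booleans: True where the null hypothesis is
    rejected under the Bonferroni rule p_i < familywise_rate / k."""
    k = len(p_values)
    threshold = familywise_rate / k
    return [p < threshold for p in p_values]

# Hypothetical p-values from a family of k = 4 tests.
p_values = [0.001, 0.012, 0.030, 0.200]
decisions = bonferroni_reject(p_values)
for p, reject in zip(p_values, decisions):
    print(f"p = {p:.3f}: {'reject' if reject else 'do not reject'}")
```

With four tests the per-test cutoff is 0.05/4 = 0.0125, so a p-value of 0.030 that would be rejected on its own is no longer rejected.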
Tukey's HSD
- Tukey's HSD is a method that is specifically
designed to control the familywise type I error
rate (at 0.05) for analysis of variance.
- After Fit Model, click the red triangle next to
the X variable and click LSMeans Tukey HSD.
Comparisons between groups that are in red are
groups for which the null hypothesis that the
group means are the same is rejected using the
Tukey HSD procedure, which controls the
familywise Type I error rate at 0.05. A
confidence interval for the difference in group
means that adjusts for multiple comparisons is
shown in the third and fourth lines.
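For reference, intervals of this kind are usually computed with the Tukey-Kramer formula. The notes do not spell it out, so the symbols below are assumptions introduced for illustration: k groups, N total observations, group sizes n_i and n_j, pooled standard deviation s_p (the root mean square error), and q the upper 0.05 quantile of the studentized range distribution.

```latex
\bar{y}_i - \bar{y}_j \;\pm\; \frac{q_{0.05;\,k,\,N-k}}{\sqrt{2}}\; s_p \sqrt{\frac{1}{n_i} + \frac{1}{n_j}}
```

Two group means are declared different when this interval does not contain zero.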
Assumptions in one-way ANOVA
- Assumptions needed for validity of one-way
analysis of variance p-values and CIs:
  - Linearity: automatically satisfied.
  - Constant variance: Spread within each group is
  the same.
  - Normality: Distribution within each group is
  normal.
  - Independence: Sample consists of independent
  observations.
Rule of thumb for checking constant variance
- Constant variance: Look at the standard deviations of
the different groups by using Fit Y by X and clicking
Means and Std Dev.
- Rule of Thumb: Check whether (highest group
standard deviation/lowest group standard
deviation) is greater than 2. If greater than 2,
then constant variance is not reasonable and a
transformation should be considered. If less
than 2, then constant variance is reasonable.
- (Highest group standard deviation/lowest group
standard deviation) = (131.874/63.640) = 2.07.
Thus, constant variance is not reasonable for
Milgram's data.
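The arithmetic behind the rule of thumb can be checked in a couple of lines. The function name is hypothetical. The two standard deviations are the highest and lowest group values quoted above.

```python
def sd_ratio(group_sds):
    """Ratio of the largest to the smallest group standard deviation."""
    return max(group_sds) / min(group_sds)

# Highest and lowest group standard deviations quoted in the notes.
ratio = sd_ratio([131.874, 63.640])
variance_concern = ratio > 2  # rule of thumb: a ratio above 2 is a concern

print(f"SD ratio = {ratio:.2f}; constant variance questionable: {variance_concern}")
```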
Transformations to correct for nonconstant
variance
- If the standard deviation is highest for groups
with high means, try transforming Y to log Y or
sqrt(Y). If the standard deviation is highest for groups
with low means, try transforming Y to Y^2.
- For Milgram's data, the SD is particularly low for
the group with the highest mean. Try transforming to
Y^2. To make the transformation, right click in a new
column, click New Column, then right click again in the
created column, click Formula and enter the
appropriate formula for the transformation.
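The effect of squaring on the spread can be sketched on made-up numbers. The two groups below are hypothetical (not Milgram's data) and were chosen so that the group with the higher mean has the smaller standard deviation.

```python
from statistics import stdev

def sd_ratio(groups):
    """Highest group standard deviation divided by the lowest."""
    sds = [stdev(g) for g in groups]
    return max(sds) / min(sds)

# Hypothetical data: the high-mean group has the smaller spread.
low_mean_group = [100, 200, 300, 400]
high_mean_group = [390, 420, 450, 450]
groups = [low_mean_group, high_mean_group]

# Apply the Y^2 transformation to every observation.
squared_groups = [[y ** 2 for y in g] for g in groups]

ratio_before = sd_ratio(groups)
ratio_after = sd_ratio(squared_groups)
print(f"SD ratio before squaring: {ratio_before:.2f}")
print(f"SD ratio after squaring:  {ratio_after:.2f}")
```

Squaring stretches large values more than small ones, which widens the spread of the high-mean group relative to the low-mean group and pulls the ratio down. Whether it falls below 2 depends on the data.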
Transformation of Milgram's data to Squared
Voltage Level
- Check of constant variance for transformed data:
(Highest group standard deviation/lowest group
standard deviation) = 1.63. Constant variance
assumption is reasonable for voltage squared.
- Analysis of variance tests are approximately
valid for the voltage squared data, so we reanalyze
the data using voltage squared.
Analysis using Voltage Squared
Strong evidence that the group mean voltage
squared levels are not all the same.
Strong evidence that remote has higher mean
voltage squared level than proximity and
touch-proximity and that voice-feedback has
higher mean voltage squared level than
touch-proximity, taking into account the multiple
comparisons.
Rule of Thumb for Checking Normality in ANOVA
- The normality assumption for ANOVA is that the
distribution in each group is normal. This can be
checked by looking at the boxplot, histogram and
normal quantile plot for each group.
- If there are more than 30 observations in each
group, then the normality assumption is not
important: ANOVA p-values and CIs will still be
approximately valid even for nonnormal data.
- If there are fewer than 30 observations per group,
then we can check normality by clicking Analyze,
Distribution and then putting the Y variable in
the Y, Columns box and the categorical variable
denoting the group in the By box. We can then
create normal quantile plots for each group and
check that for each group, the points in the
normal quantile plot are in the confidence bands.
If there is nonnormality, we can try to use a
transformation such as log Y and see if the
transformed data are approximately normally
distributed in each group.
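To show what a normal quantile plot is built from, here is a minimal sketch that pairs each sorted observation with the corresponding standard normal quantile. The plotting positions (i - 0.5)/n are one common convention and may not match what JMP uses. The sample is hypothetical, and the confidence bands are not computed.

```python
from statistics import NormalDist

def normal_quantile_points(sample):
    """Pair each sorted observation with a standard normal quantile.
    Uses plotting positions (i - 0.5)/n, an assumed convention."""
    n = len(sample)
    std_normal = NormalDist()
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, sorted(sample)))

# Hypothetical voltage levels for one group.
sample = [285, 300, 315, 330, 345, 360, 450]
points = normal_quantile_points(sample)
for quantile, value in points:
    print(f"{quantile:+.2f}  {value}")
```

If the group is roughly normal, plotting these pairs gives points close to a straight line. A curved pattern or a point far from the line suggests nonnormality.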
One-way Analysis of Variance: Steps in Analysis
- Check assumptions (constant variance, normality,
independence). If constant variance is violated,
try transformations.
- Use the effect test (commonly called the F-test)
to test whether all group means are the same.
- If the effect test finds that at least two group means
differ, use Tukey's HSD
procedure to investigate which groups are
different, taking into account the fact that multiple
comparisons are being done.