Title: Last Time: From descriptive to inference statistics
1 Last Time: From descriptive to inferential statistics
Estimation and Hypothesis Testing for Linear Regression
2 Example: 20 kindergarteners
PS = Popularity Score, SCS = Social Competence Score
Child  PS   SCS    Child  PS   SCS
1      2.5  15     11     2.1  11
2      1.9  11     12     2.9  17
3      1.8  13     13     1.6  9
4      1.9  9      14     2.5  19
5      2.7  16     15     1.7  12
6      2.2  14     16     1.7  8
7      2.0  16     17     2.3  15
8      2.0  15     18     1.5  10
9      2.8  14     19     2.4  18
10     2.7  20     20     1.5  8
3 Statistical Inference (for two variables)
Example: Children. X = Popularity, Y = Social competence
Goal: Explain the (linear) relationship between X and Y
4 Statistical Inference (for two variables)
5 Four assumptions about the error term
1. The error term is a normally distributed random variable
2. No matter what value X takes, the error has a mean of zero
3. The error has the same standard deviation for each value of X
4. For different values of X, the error terms are uncorrelated
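The four assumptions can be illustrated by simulating data from a straight-line model. This is only a sketch: the function name `simulate`, the parameter values, and the X grid are all made up for illustration and do not come from the slides.

```python
import random

def simulate(xs, beta0, beta1, sigma, seed=0):
    """Draw Y = beta0 + beta1 * x + error for each x.

    Errors are independent draws from a normal distribution with mean 0
    and the same standard deviation sigma at every x, which matches the
    four assumptions (independence implies uncorrelated errors).
    """
    rng = random.Random(seed)
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]

# Hypothetical parameter values and X grid, chosen only for the demo.
BETA0, BETA1, SIGMA = -0.5, 6.5, 2.0
xs = [1.5 + 0.1 * (i % 15) for i in range(20000)]
ys = simulate(xs, BETA0, BETA1, SIGMA)

# Recover the simulated errors and check their mean and spread.
errors = [y - (BETA0 + BETA1 * x) for x, y in zip(xs, ys)]
mean_error = sum(errors) / len(errors)
sd_error = (sum((e - mean_error) ** 2 for e in errors) / (len(errors) - 1)) ** 0.5
print("mean error:", round(mean_error, 3), "sd of errors:", round(sd_error, 3))
```

With many draws the average error should sit near zero and the spread near the chosen `SIGMA`.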
6 First Assumption: Error has a Normal Distribution
[Figure; axis: Error]
7 Second assumption: Average error is zero for each value of X
[Figure; axis: X]
8 Third assumption: Error has the same standard deviation for each value of X
[Figure; axes: X, Y; label: Error]
9 [Figure; axes: X, Y]
10 [Figure; axes: X, Y]
11 We will sample data to estimate the parameters. This leads to point estimates, confidence intervals, and hypothesis tests for each parameter, in addition to a general test of the model as a whole.
13 Parameter Estimates
Degrees of freedom: lose 1 df for X, lose 1 df for Y
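As a rough illustration, the least-squares estimates can be computed by hand for the kindergartener data from slide 2. The sketch assumes X = popularity score and Y = social competence score, as slide 3 appears to set up; the variable names (`ps`, `scs`, `b0`, `b1`, `s`) are ours, not the lecture's notation.

```python
# Kindergartener data from slide 2 (PS = popularity, SCS = social competence).
ps = [2.5, 1.9, 1.8, 1.9, 2.7, 2.2, 2.0, 2.0, 2.8, 2.7,
      2.1, 2.9, 1.6, 2.5, 1.7, 1.7, 2.3, 1.5, 2.4, 1.5]
scs = [15, 11, 13, 9, 16, 14, 16, 15, 14, 20,
       11, 17, 9, 19, 12, 8, 15, 10, 18, 8]

n = len(ps)
x_bar = sum(ps) / n
y_bar = sum(scs) / n

# Sums of squares and cross-products around the means.
sxx = sum((x - x_bar) ** 2 for x in ps)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ps, scs))

b1 = sxy / sxx            # slope estimate
b0 = y_bar - b1 * x_bar   # intercept estimate

# Residual standard deviation: divide by n - 2 because two
# parameters (intercept and slope) were estimated from the data.
residuals = [y - (b0 + b1 * x) for x, y in zip(ps, scs)]
s = (sum(e ** 2 for e in residuals) / (n - 2)) ** 0.5

print("intercept:", round(b0, 3), "slope:", round(b1, 3), "s:", round(s, 3))
```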
14 Recall: Point Estimates (Sample Statistics) are Random Variables
Sampling Distributions
15 Recall: Point Estimates (Sample Statistics) are Random Variables
Don't know!
Hypothesis Testing
16 Confidence Intervals for Intercept and Slope
Point estimate ± critical value × Std. dev. of point estimate
17 95% Confidence Intervals for Intercept and Slope
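A sketch of how the 95% intervals could be computed for the kindergartener data, assuming X = popularity and Y = social competence. The standard-error formulas are the usual textbook ones for simple linear regression, and the critical value 2.101 is read from a standard t table for 18 degrees of freedom; neither is taken from the slides.

```python
ps = [2.5, 1.9, 1.8, 1.9, 2.7, 2.2, 2.0, 2.0, 2.8, 2.7,
      2.1, 2.9, 1.6, 2.5, 1.7, 1.7, 2.3, 1.5, 2.4, 1.5]
scs = [15, 11, 13, 9, 16, 14, 16, 15, 14, 20,
       11, 17, 9, 19, 12, 8, 15, 10, 18, 8]

n = len(ps)
x_bar, y_bar = sum(ps) / n, sum(scs) / n
sxx = sum((x - x_bar) ** 2 for x in ps)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ps, scs))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(ps, scs))
s = (sse / (n - 2)) ** 0.5

# Assumption: t* for 95% confidence with n - 2 = 18 df, from a t table.
T_CRIT = 2.101

# Standard errors of the two point estimates.
se_b1 = s / sxx ** 0.5
se_b0 = s * (1 / n + x_bar ** 2 / sxx) ** 0.5

# point estimate +/- critical value * std. dev. of point estimate
ci_slope = (b1 - T_CRIT * se_b1, b1 + T_CRIT * se_b1)
ci_intercept = (b0 - T_CRIT * se_b0, b0 + T_CRIT * se_b0)

print("slope CI:", [round(v, 2) for v in ci_slope])
print("intercept CI:", [round(v, 2) for v in ci_intercept])
```

For these data the slope interval lies entirely above zero, while the intercept interval contains zero.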
18 Hypothesis Test on Slope
If the p-value of the standardized statistic is < α, then reject H0 and conclude that there is indeed a linear relationship.
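The test on the slope can be sketched for the kindergartener data as follows, again assuming X = popularity and Y = social competence. Instead of computing a p-value, the sketch compares |t| with the two-sided 5% critical value from a t table (2.101 for 18 df), which is equivalent to checking p < 0.05; α = 0.05 is our choice for the demo.

```python
ps = [2.5, 1.9, 1.8, 1.9, 2.7, 2.2, 2.0, 2.0, 2.8, 2.7,
      2.1, 2.9, 1.6, 2.5, 1.7, 1.7, 2.3, 1.5, 2.4, 1.5]
scs = [15, 11, 13, 9, 16, 14, 16, 15, 14, 20,
       11, 17, 9, 19, 12, 8, 15, 10, 18, 8]

n = len(ps)
x_bar, y_bar = sum(ps) / n, sum(scs) / n
sxx = sum((x - x_bar) ** 2 for x in ps)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ps, scs))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(ps, scs))
se_b1 = (sse / (n - 2) / sxx) ** 0.5

# Standardized statistic for H0: slope = 0.
t_stat = b1 / se_b1

# Assumption: two-sided critical value for alpha = 0.05, 18 df (t table).
T_CRIT = 2.101
reject_h0 = abs(t_stat) > T_CRIT

print("t =", round(t_stat, 2), "reject H0:", reject_h0)
```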
19 Hypothesis Test on Slope
20 Computer Output (Note: Different programs differ in style and content!)
p-value < .001
21 Analysis of Variance for Regression
Degrees of Freedom: DFT = n-1
Degrees of Freedom: DFM = 1
Degrees of Freedom: DFE = n-2
Mean Squares Total (MST)
Mean Squares Error (MSE)
Mean Squares Model (MSM)
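A minimal sketch of the ANOVA quantities for the kindergartener regression, assuming X = popularity and Y = social competence. The names `sst`, `ssm`, and `sse` for the total, model, and error sums of squares are our labels; each mean square is a sum of squares divided by its degrees of freedom.

```python
ps = [2.5, 1.9, 1.8, 1.9, 2.7, 2.2, 2.0, 2.0, 2.8, 2.7,
      2.1, 2.9, 1.6, 2.5, 1.7, 1.7, 2.3, 1.5, 2.4, 1.5]
scs = [15, 11, 13, 9, 16, 14, 16, 15, 14, 20,
       11, 17, 9, 19, 12, 8, 15, 10, 18, 8]

n = len(ps)
x_bar, y_bar = sum(ps) / n, sum(scs) / n
sxx = sum((x - x_bar) ** 2 for x in ps)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ps, scs))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * x for x in ps]

# Sums of squares: total, model, and error.
sst = sum((y - y_bar) ** 2 for y in scs)
ssm = sum((f - y_bar) ** 2 for f in fitted)
sse = sum((y - f) ** 2 for y, f in zip(scs, fitted))

# Degrees of freedom as listed on the slide.
dft, dfm, dfe = n - 1, 1, n - 2

mst, msm, mse = sst / dft, ssm / dfm, sse / dfe
f_stat = msm / mse

print("SST:", round(sst, 2), "SSM:", round(ssm, 2), "SSE:", round(sse, 2))
print("MSM:", round(msm, 2), "MSE:", round(mse, 2), "F:", round(f_stat, 2))
```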
22 Analysis of Variance Table
23 Hypothesis Testing and the ANOVA Table
Mean Squares Error (MSE)
It can be shown that the Null Hypothesis implies that MSM is also an unbiased estimator of the error variance σ²
24 Analysis of Variance Table
Table E: df in the numerator, df in the denominator
25 Analysis of Variance Table
p-value here < .001
26 Note
27 The Square of a t Random Variable with n-2 degrees of freedom is an F Random Variable with 1 degree of freedom in the numerator and with n-2 degrees of freedom in the denominator.
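This relationship can be checked numerically on the kindergartener data (assuming X = popularity and Y = social competence): the squared t statistic for the slope should match the F statistic from the ANOVA table up to rounding. The variable names are ours.

```python
ps = [2.5, 1.9, 1.8, 1.9, 2.7, 2.2, 2.0, 2.0, 2.8, 2.7,
      2.1, 2.9, 1.6, 2.5, 1.7, 1.7, 2.3, 1.5, 2.4, 1.5]
scs = [15, 11, 13, 9, 16, 14, 16, 15, 14, 20,
       11, 17, 9, 19, 12, 8, 15, 10, 18, 8]

n = len(ps)
x_bar, y_bar = sum(ps) / n, sum(scs) / n
sxx = sum((x - x_bar) ** 2 for x in ps)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ps, scs))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * x for x in ps]

ssm = sum((f - y_bar) ** 2 for f in fitted)
sse = sum((y - f) ** 2 for y, f in zip(scs, fitted))
mse = sse / (n - 2)

t_stat = b1 / (mse / sxx) ** 0.5   # t statistic for the slope, n - 2 df
f_stat = (ssm / 1) / mse           # F statistic, 1 and n - 2 df

print("t^2 =", round(t_stat ** 2, 4), "F =", round(f_stat, 4))
```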
28 Is there a linear relationship between random variables X and Y?
Hypothesis Test about Correlation
Question: Can we / can we not draw a line close to the data?
Answer: No, unless we provide sufficient evidence that we can.
29 lose 1 df for X, lose 1 df for Y
If the p-value of the standardized statistic is < α, then reject H0 and conclude that there is indeed a linear relationship.
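A sketch of the correlation test on the kindergartener data. It uses the usual textbook statistic t = r·sqrt(n-2)/sqrt(1-r²) with n-2 degrees of freedom, which is our assumption about what the slide showed. The critical value 2.101 (two-sided, α = 0.05, 18 df) comes from a standard t table.

```python
import math

ps = [2.5, 1.9, 1.8, 1.9, 2.7, 2.2, 2.0, 2.0, 2.8, 2.7,
      2.1, 2.9, 1.6, 2.5, 1.7, 1.7, 2.3, 1.5, 2.4, 1.5]
scs = [15, 11, 13, 9, 16, 14, 16, 15, 14, 20,
       11, 17, 9, 19, 12, 8, 15, 10, 18, 8]

n = len(ps)
x_bar, y_bar = sum(ps) / n, sum(scs) / n
sxx = sum((x - x_bar) ** 2 for x in ps)
syy = sum((y - y_bar) ** 2 for y in scs)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ps, scs))

# Sample correlation coefficient.
r = sxy / math.sqrt(sxx * syy)

# Standardized statistic for H0: population correlation = 0.
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

T_CRIT = 2.101  # assumption: t table value for 18 df, alpha = 0.05
reject_h0 = abs(t_stat) > T_CRIT

print("r =", round(r, 3), "t =", round(t_stat, 2), "reject H0:", reject_h0)
```

The statistic coincides with the t statistic for the regression slope, so both tests lead to the same decision.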
30 In our Example
Child  PS   SCS    Child  PS   SCS
1      2.5  15     11     2.1  11
2      1.9  11     12     2.9  17
3      1.8  13     13     1.6  9
4      1.9  9      14     2.5  19
5      2.7  16     15     1.7  12
6      2.2  14     16     1.7  8
7      2.0  16     17     2.3  15
8      2.0  15     18     1.5  10
9      2.8  14     19     2.4  18
10     2.7  20     20     1.5  8
31 Hmmmm: Last Time / This Time
32 One more thing
Percent of Variance explained by the model
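One common way to quantify this is the squared correlation r², which equals the model sum of squares divided by the total sum of squares. A sketch for the kindergartener data, with variable names of our own choosing:

```python
ps = [2.5, 1.9, 1.8, 1.9, 2.7, 2.2, 2.0, 2.0, 2.8, 2.7,
      2.1, 2.9, 1.6, 2.5, 1.7, 1.7, 2.3, 1.5, 2.4, 1.5]
scs = [15, 11, 13, 9, 16, 14, 16, 15, 14, 20,
       11, 17, 9, 19, 12, 8, 15, 10, 18, 8]

n = len(ps)
x_bar, y_bar = sum(ps) / n, sum(scs) / n
sxx = sum((x - x_bar) ** 2 for x in ps)
syy = sum((y - y_bar) ** 2 for y in scs)   # total sum of squares
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ps, scs))

ss_model = sxy ** 2 / sxx                  # sum of squares explained by the line
r_squared = ss_model / syy                 # same as the squared correlation

print("percent of variance explained:", round(100 * r_squared, 1))
```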
33 Skip Multiple Regression (I recommend just reading chapter 11 once)
Today: One-way Analysis of Variance
34 One-way analysis of variance
- Remember the two-independent-sample t-statistic?
- Test whether two population means are equal
- One-way analysis of variance is a strategy to test whether I independent groups have equal population means.
35 Example: Is memory for spoken materials affected by asynchrony between visual and auditory stimuli?
List of 50 spoken words
36 Example
- List of 50 spoken words
- 3 x 10 Subjects (split among I = 3 groups)
- Group 1 (Fast sound): Person in movie reads list, but sounds precede lip movement slightly
- Group 2 (Slow sound): Person in movie reads list, but sounds lag behind lip movement slightly
- Group 3 (Synchrony): Person in movie reads list with auditory and visual stimuli in synchrony
- Memory Task: Subjects are asked to recall as many items as possible.
37 Example: Wordlist Recall (out of 50 words)
Group 1  Group 2  Group 3
23       27       23
22       28       24
18       33       21
15       19       25
29       25       19
30       29       24
23       36       22
16       30       17
19       26       20
17       21       23
38 One-way Analysis of Variance: Model Assumptions
I Independent Groups
[Table; columns: Population, Data, Sample Size]
39 One-way Analysis of Variance: Model Assumptions
For the jth observation from group i we assume
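A common textbook way to write this assumption is sketched below. The symbols (x_ij, μ_i, ε_ij, σ, n_i) follow standard one-way ANOVA notation and are an assumption on our part, not taken from the slide.

```latex
x_{ij} = \mu_i + \varepsilon_{ij},
\qquad \varepsilon_{ij} \sim N(0, \sigma) \ \text{independently},
\qquad i = 1, \dots, I, \quad j = 1, \dots, n_i
```

Here μ_i is the population mean of group i, and the errors share one standard deviation σ across all groups.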
40 One-way Analysis of Variance
41 Terminology
- Factor: a categorical variable that distinguishes the groups.
- Level of the factor: refers to the different values that the categorical variable can take.
- One-way: refers to one factor.
42 Ingredients and preparation for the analysis
43 Ingredients and preparation for the analysis
44 Similar recipe as in Linear Regression!
deviation from grand mean = deviation between group mean and grand mean + deviation from group mean
Sum Squares Total (SST) = Sum Squares Groups (SSG, SS between) + Sum Squares Error (SSE, SS within)
45 Similar recipe as in Linear Regression!
Sum Squares Total (SST)
Sum Squares Error (SSE)
Sum Squares Groups (SSG)
Degrees of Freedom: DFT = N-1
Degrees of Freedom: DFG = I-1
Degrees of Freedom: DFE = N-I
Mean Squares Groups (MSG) = SSG / DFG
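The recipe can be sketched on the wordlist recall data from slide 37. The dictionary keys and variable names are ours; the sums of squares follow the three deviations named on slide 44 (from the grand mean, between group mean and grand mean, and from the group mean).

```python
# Wordlist recall data from slide 37 (words recalled out of 50).
groups = {
    "fast sound": [23, 22, 18, 15, 29, 30, 23, 16, 19, 17],
    "slow sound": [27, 28, 33, 19, 25, 29, 36, 30, 26, 21],
    "synchrony":  [23, 24, 21, 25, 19, 24, 22, 17, 20, 23],
}

all_values = [v for values in groups.values() for v in values]
N = len(all_values)          # total number of observations
I = len(groups)              # number of groups
grand_mean = sum(all_values) / N
group_means = {name: sum(v) / len(v) for name, v in groups.items()}

# SST: squared deviations from the grand mean.
sst = sum((v - grand_mean) ** 2 for v in all_values)
# SSG: squared deviations of group means from the grand mean,
# counted once per observation in the group.
ssg = sum(len(v) * (group_means[name] - grand_mean) ** 2
          for name, v in groups.items())
# SSE: squared deviations from each observation's own group mean.
sse = sum((x - group_means[name]) ** 2
          for name, v in groups.items() for x in v)

dft, dfg, dfe = N - 1, I - 1, N - I
msg, mse = ssg / dfg, sse / dfe

print("SST:", round(sst, 2), "SSG:", round(ssg, 2), "SSE:", round(sse, 2))
print("MSG:", round(msg, 2), "MSE:", round(mse, 2))
```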
47 Let's grind it out for our example
Large MSG leads to a significant F statistic. Reject the Null Hypothesis!
Conclusion: The population means are not identical across groups.
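A sketch of the F test for the wordlist recall data. The statistic is F = MSG / MSE, our reading of the standard one-way ANOVA recipe. The 5% critical value 3.35 for 2 and 27 degrees of freedom is taken from a standard F table, and α = 0.05 is our choice.

```python
groups = [
    [23, 22, 18, 15, 29, 30, 23, 16, 19, 17],   # Group 1 (fast sound)
    [27, 28, 33, 19, 25, 29, 36, 30, 26, 21],   # Group 2 (slow sound)
    [23, 24, 21, 25, 19, 24, 22, 17, 20, 23],   # Group 3 (synchrony)
]

N = sum(len(g) for g in groups)
I = len(groups)
grand_mean = sum(sum(g) for g in groups) / N
means = [sum(g) / len(g) for g in groups]

# Between-group and within-group sums of squares.
ssg = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

msg = ssg / (I - 1)
mse = sse / (N - I)
f_stat = msg / mse

# Assumption: F table value for alpha = 0.05 with 2 and 27 df.
F_CRIT = 3.35
reject_h0 = f_stat > F_CRIT

print("F =", round(f_stat, 2), "reject H0:", reject_h0)
```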
48 What if I = 2?
Remember: The Square of a t Random Variable with n-2 degrees of freedom is an F Random Variable with 1 degree of freedom in the numerator and with n-2 degrees of freedom in the denominator.
Thus, the one-way analysis of variance is a natural extension of the comparison of two means from independent samples (with equal population variances).
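The I = 2 case can be checked numerically by taking only Groups 1 and 2 from the wordlist example and comparing the pooled two-sample t statistic with the one-way ANOVA F statistic. Restricting to two groups is our choice for illustration, and the variable names are ours.

```python
g1 = [23, 22, 18, 15, 29, 30, 23, 16, 19, 17]   # Group 1 (fast sound)
g2 = [27, 28, 33, 19, 25, 29, 36, 30, 26, 21]   # Group 2 (slow sound)

n1, n2 = len(g1), len(g2)
m1, m2 = sum(g1) / n1, sum(g2) / n2
ss1 = sum((x - m1) ** 2 for x in g1)
ss2 = sum((x - m2) ** 2 for x in g2)

# Pooled two-sample t statistic (equal population variances assumed).
pooled_var = (ss1 + ss2) / (n1 + n2 - 2)
t_stat = (m1 - m2) / (pooled_var * (1 / n1 + 1 / n2)) ** 0.5

# One-way ANOVA with I = 2 groups.
grand_mean = (sum(g1) + sum(g2)) / (n1 + n2)
ssg = n1 * (m1 - grand_mean) ** 2 + n2 * (m2 - grand_mean) ** 2
msg = ssg / (2 - 1)
mse = (ss1 + ss2) / (n1 + n2 - 2)
f_stat = msg / mse

print("t^2 =", round(t_stat ** 2, 4), "F =", round(f_stat, 4))
```

The within-group mean square is exactly the pooled variance, which is why the two statistics agree.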
49 Robustness
- If the sample sizes are equal, then the assumption of equal variance (equal standard deviation) is not crucial.
- The CLT helps with violations of normality, i.e., as long as sample sizes are large, we do not need normality of the X variables.
50 Final Comment
- Analysis of Variance is used for a Comparison of Population Means