Linear Models II - PowerPoint PPT Presentation

Session 3, Damon Berridge
1
Linear Models II
Session 3
Damon Berridge
2
Two Level Random Intercept Models
The resulting model with one explanatory variable
x_ij is given by
\[ y_{ij} = \beta_{0j} + \beta_{1j} x_{ij} + e_{ij}. \]
For the level-2 model, the group-dependent
intercept can be split into an average intercept
and a group-dependent deviation,
\[ \beta_{0j} = \gamma_{00} + u_{0j}, \]
and the same fixed slope for each level-2 unit is
assumed,
\[ \beta_{1j} = \gamma_{10}. \]
  • The average intercept is γ00 and the
    regression coefficient for x_ij is γ10.
  • Substitution now leads to the model
\[ y_{ij} = \gamma_{00} + \gamma_{10} x_{ij} + u_{0j} + e_{ij}, \]
    with \( u_{0j} \sim N(0, \tau_0^2) \) and \( e_{ij} \sim N(0, \sigma^2) \).
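The combined model above can be simulated directly. The sketch below draws data from y_ij = γ00 + γ10·x_ij + u_0j + e_ij; all parameter values are illustrative assumptions, not estimates from these slides.

```python
import random
import statistics

# Minimal simulation sketch of the combined random intercept model
# y_ij = g00 + g10 * x_ij + u_0j + e_ij.  Parameter values below are
# illustrative assumptions only.
random.seed(42)

g00, g10 = 2.0, 0.5          # average intercept and fixed slope
tau0, sigma = 1.0, 0.8       # sd of u_0j (level 2) and e_ij (level 1)
n_groups, n_per_group = 200, 30

data = []                    # (group j, x_ij, y_ij) triples
for j in range(n_groups):
    u0j = random.gauss(0.0, tau0)      # group-dependent deviation
    for _ in range(n_per_group):
        x = random.gauss(0.0, 1.0)
        e = random.gauss(0.0, sigma)
        data.append((j, x, g00 + g10 * x + u0j + e))

ys = [y for _, _, y in data]
print(round(statistics.mean(ys), 2))   # close to g00 = 2.0
```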

3
The variance of y_ij conditional on the value of
x_ij is given by
\[ \operatorname{Var}(y_{ij} \mid x_{ij}) = \tau_0^2 + \sigma^2, \]
while the covariance between two different
level-1 units (i and i′, with i ≠ i′) in the
same level-2 unit is
\[ \operatorname{Cov}(y_{ij}, y_{i'j} \mid x) = \tau_0^2. \]
The fraction of residual variability that can be
attributed to level one is given by
\[ \sigma^2 / (\tau_0^2 + \sigma^2), \]
and for level two this fraction is
\[ \tau_0^2 / (\tau_0^2 + \sigma^2). \]
The correlation between two level-1 units in the
same level-2 unit is the residual
intraclass correlation coefficient,
\[ \rho = \frac{\tau_0^2}{\tau_0^2 + \sigma^2}. \]
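The variance partition above is a one-line computation. A sketch with illustrative variance components (not estimates from the slides):

```python
# Sketch: variance partition and residual intraclass correlation for
# a random intercept model.  The variance components are illustrative.
tau0_sq = 1.00   # level-2 (between-group) variance, tau_0^2
sigma_sq = 0.64  # level-1 (within-group) variance, sigma^2

total = tau0_sq + sigma_sq
frac_level1 = sigma_sq / total   # share of residual variability at level 1
frac_level2 = tau0_sq / total    # share at level 2
rho = tau0_sq / total            # residual intraclass correlation

print(round(frac_level1, 3), round(frac_level2, 3), round(rho, 3))
```

Note that the level-2 fraction and the intraclass correlation coincide in the random intercept model, which is why ρ is often read as "the share of variance between groups".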
4
  • An extension of this model allows for the
    introduction of level-2 predictors z_j.
  • Using the level-2 model
\[ \beta_{0j} = \gamma_{00} + \gamma_{01} z_j + u_{0j}, \]
the model becomes
\[ y_{ij} = (\gamma_{00} + \gamma_{01} z_j + u_{0j}) + \gamma_{10} x_{ij} + e_{ij}, \]
so that
\[ y_{ij} = \gamma_{00} + \gamma_{10} x_{ij} + \gamma_{01} z_j + u_{0j} + e_{ij}. \]
5
General Two Level Models Including Random
Intercepts
With p explanatory variables and the random
intercept \( \beta_{0j} = \gamma_{00} + u_{0j} \), the level-1 model is
\[ y_{ij} = \beta_{0j} + \sum_{h=1}^{p} \gamma_{h0} x_{hij} + e_{ij}, \]
so that
\[ y_{ij} = \gamma_{00} + \sum_{h=1}^{p} \gamma_{h0} x_{hij} + u_{0j} + e_{ij}. \]
6
Likelihood
Under the normality assumptions, the marginal
likelihood of the random intercept model is
\[ L(\gamma, \tau_0^2, \sigma^2) = \prod_{j} (2\pi)^{-n_j/2} \, |V_j|^{-1/2} \exp\left\{ -\tfrac{1}{2} (y_j - X_j \gamma)^{\top} V_j^{-1} (y_j - X_j \gamma) \right\}, \]
where
\[ V_j = \sigma^2 I_{n_j} + \tau_0^2 \mathbf{1}_{n_j} \mathbf{1}_{n_j}^{\top} \]
and \( \mathbf{1}_{n_j} \) denotes an \( n_j \)-vector of ones.
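Because V_j has the compound-symmetry form σ²I + τ0²11′, its determinant and inverse have closed forms (via the Sherman–Morrison identity), so one group's log-likelihood contribution can be evaluated without any matrix algebra. A sketch, with illustrative residuals and variance components:

```python
import math

# Sketch: marginal log-likelihood contribution of one level-2 unit in a
# random intercept model with V_j = sigma^2 * I + tau0^2 * 11'.
# |V_j| = (sigma^2)^(n-1) * (sigma^2 + n * tau0^2), and the quadratic
# form reduces to sums of the residuals (Sherman-Morrison).
def group_loglik(resid, sigma_sq, tau0_sq):
    n = len(resid)
    s = sum(resid)
    ssq = sum(r * r for r in resid)
    logdet = (n - 1) * math.log(sigma_sq) + math.log(sigma_sq + n * tau0_sq)
    quad = (ssq - tau0_sq * s * s / (sigma_sq + n * tau0_sq)) / sigma_sq
    return -0.5 * (n * math.log(2 * math.pi) + logdet + quad)

r = [0.3, -0.1, 0.4, 0.0]   # illustrative y_j - X_j * gamma for one group
print(round(group_loglik(r, sigma_sq=0.64, tau0_sq=1.0), 4))
```

Summing `group_loglik` over all level-2 units gives the full log-likelihood that the estimation maximises.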
7
Residuals
In a single-level model the usual estimate of the
single residual term is just the raw residual
\[ \hat{e}_i = y_i - \hat{y}_i. \]
In a multilevel model, however, there are several
residuals at different levels. In a random
intercept model, the level-2 residual u_0j can be
predicted by its posterior mean
\[ \hat{u}_{0j} = \frac{\tau_0^2}{\tau_0^2 + \sigma^2 / n_j} \, \bar{r}_j, \]
where \( \bar{r}_j \) is the mean of the raw residuals
\( y_{ij} - X_{ij}\hat{\gamma} \) in group j.
We can show that the shrinkage factor
\( \tau_0^2 / (\tau_0^2 + \sigma^2 / n_j) \) pulls the prediction
towards zero, more strongly for small groups.
  • Note that we can now estimate the level-1
    residuals simply by the formula
\[ \hat{e}_{ij} = y_{ij} - X_{ij}\hat{\gamma} - \hat{u}_{0j}. \]
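The two formulas above can be sketched in a few lines. The raw residuals and variance components below are illustrative numbers, not output from a fitted model:

```python
# Sketch: posterior-mean (shrunken) level-2 residual and the implied
# level-1 residuals for one group.  Inputs are illustrative.
def level2_residual(raw_resid, sigma_sq, tau0_sq):
    n = len(raw_resid)
    shrinkage = tau0_sq / (tau0_sq + sigma_sq / n)   # -> 1 as n grows
    rbar = sum(raw_resid) / n                        # group-mean raw residual
    return shrinkage * rbar                          # posterior mean of u_0j

raw = [1.2, 0.8, 1.1, 0.9]          # one group's raw residuals
u0j_hat = level2_residual(raw, sigma_sq=0.64, tau0_sq=1.0)
e_hat = [r - u0j_hat for r in raw]  # level-1 residuals e_ij
print(round(u0j_hat, 3))
```

With only four observations, the group-mean residual of 1.0 is shrunk noticeably towards zero; a large group would keep almost all of it.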

8
Checking Assumptions in Multilevel Models
  • Residual plots can be used to check model
    assumptions. There is one important difference
    from ordinary regression analysis: there is more
    than one residual. In fact, we have residuals for
    each random effect in the multilevel model.
    Consequently, many different residual plots can
    be made.
  • Most regression assumptions concern the
    residuals: the difference between the observed y
    and the y predicted by the regression line.
    These residuals are very useful for testing
    whether or not the multilevel model assumptions
    hold.
  • As in single level models we can use the
    estimated residuals to help check on the
    assumptions of the model. The two particular
    assumptions that can be studied readily are the
    assumption of Normality and that the variances in
    the model are constant.
  • To examine the assumption of linearity, for
    example, we can apply a residual plot against
    predicted values of the dependent variable using
    the fixed part of the multilevel regression model
    for the prediction.
  • To check the normality assumption of residuals we
    can use a normal probability plot.
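A normal probability plot pairs the sorted residuals with standard normal quantiles; rough linearity of the pairs supports the normality assumption. A plotting-free sketch of the computation, using illustrative residuals:

```python
import statistics

# Sketch: data for a normal probability (Q-Q) check of residuals.
# The residuals are illustrative numbers, not output of a fitted model.
resid = [0.5, -1.2, 0.3, 2.1, -0.4, 0.9, -0.7, 0.1]
n = len(resid)
nd = statistics.NormalDist()           # standard normal

pairs = [
    (nd.inv_cdf((i + 0.5) / n), r)     # (theoretical, observed) quantile
    for i, r in enumerate(sorted(resid))
]
for q, r in pairs:
    print(f"{q:6.3f}  {r:6.3f}")
```

Plotting the observed against the theoretical quantiles (for example with matplotlib) gives the usual Q-Q plot; systematic curvature indicates non-normal residuals.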

9
Linear Model Example C2
  • The data we use in this example are a sub-sample
    from the 1982 High School and Beyond Survey
    (Raudenbush and Bryk, 2002), and include information
    on 7,185 students nested within 160 schools (90
    public and 70 Catholic). Sample sizes vary from 14
    to 67 students per school.

Raudenbush, S.W., Bryk, A.S., 2002, Hierarchical
Linear Models, Thousand Oaks, CA: Sage.
Number of observations (rows): 7185. Number of
variables (columns): 15. The variables include the
following:
  • school: school identifier
  • student: student identifier
  • minority: 1 if student is from an ethnic minority, 0 otherwise
  • gender: 1 if student is female, 0 otherwise
  • ses: a standardized socio-economic status scale
    constructed from variables measuring parental
    education, occupation, and income
  • meanses: mean of the SES values for the students in this school
  • mathach: a measure of the student's mathematics achievement
  • size: school enrolment
  • sector: 1 if school is from the Catholic sector, 0 if public
  • pracad: proportion of students in the academic track
  • disclim: a scale measuring disciplinary climate
  • himnty: 1 if more than 40% minority enrolment, 0 if less than 40%
11
  • We will use the high school Math Achievement
    data mentioned above as an extensive example.
  • We think of our data as structured in two
    levels: students (level 1) nested within
    schools (level 2).
  • The outcome considered here is again the math
    achievement score y, modelled using a set of
    explanatory variables x.

At the student level,
\[ y_{ij} = \beta_{0j} + x_{ij}^{\top} \beta + e_{ij}. \]
At the school level,
\[ \beta_{0j} = \gamma_{00} + u_{0j}, \]
where \( u_{0j} \sim N(0, \tau_0^2) \) and \( e_{ij} \sim N(0, \sigma^2) \).
In the combined form, the model is
\[ y_{ij} = \gamma_{00} + x_{ij}^{\top} \beta + u_{0j} + e_{ij}. \]
12
Comparing Model Likelihoods
  • Each model that is fitted to the same set of
    data has a corresponding log-likelihood value
    that is calculated at the maximum likelihood
    estimates for that model.
  • The deviance test, or likelihood ratio test, is
    a quite general principle for statistical
    testing.
  • When parameters of a statistical model are
    estimated by the maximum likelihood (ML) method,
    the estimation also provides the likelihood,
    which can be transformed into the deviance
    defined as minus twice the natural logarithm of
    the likelihood.
  • In general, suppose that model one has t
    parameters, while model two is a subset of model
    one with only r of the t parameters, so that
    r < t. Model one will have a higher
    log-likelihood than model two. For large sample
    sizes, twice the difference between the two
    log-likelihoods behaves like a chi-square
    variable with t − r degrees of freedom. This can
    be used to test the null hypothesis that the
    t − r parameters that are not in both models are
    zero.
  • Computer printouts produce either the
    log-likelihoods (the log L values, which are
    negative) or −2 log L (which are positive).
    Differences between −2 log L values give the
    deviance test statistic, where
\[ D = -2 \log L_2 - (-2 \log L_1) = 2(\log L_1 - \log L_2). \]
13
  • For the regression models we are estimating, the
    homogeneous model (log-likelihood −23285.328 on
    7179 residual degrees of freedom), when compared
    with the random-effect model (log-likelihood
    −23166.634 on 7178 residual degrees of freedom),
    gives a χ² improvement of
    −2(−23285.328 − (−23166.634)) = 237.39 on 1 df,
    which is highly significant, justifying the
    extra scale parameter.
  • The estimates of the residual variance and
    the random intercept variance are much lower
    in this model than in the simple model with no
    explanatory variables.
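The deviance test quoted above can be reproduced from the two log-likelihoods on this slide; for 1 degree of freedom the chi-square survival function has the closed form erfc(√(x/2)), so no statistics library is needed:

```python
import math

# Sketch: likelihood ratio (deviance) test for the random intercept,
# using the two log-likelihoods quoted on the slide.
loglik_homogeneous = -23285.328   # model without the random intercept
loglik_random = -23166.634        # model with the random intercept

deviance_diff = -2 * (loglik_homogeneous - loglik_random)
# chi-square survival function with 1 df: P(X > x) = erfc(sqrt(x / 2))
p_value = math.erfc(math.sqrt(deviance_diff / 2))

print(round(deviance_diff, 2))   # 237.39
print(p_value < 1e-10)           # highly significant
```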

The residual intraclass correlation is estimated
by
\[ \hat{\rho} = \frac{\hat{\tau}_0^2}{\hat{\tau}_0^2 + \hat{\sigma}^2}. \]
  • In a model without the explanatory variables,
    this was 0.18.