Title: Statistical Analysis Overview I Session 2
1. Statistical Analysis Overview I, Session 2
- Peg Burchinal
- Frank Porter Graham Child Development Institute,
- University of North Carolina-Chapel Hill
2. Overview: Statistical analysis overview I-b
- Nesting and intraclass correlation
- Hierarchical Linear Models
  - 2-level models
  - 3-level models
3. Nesting
- Nesting implies violation of the linear model assumption of independence of observations
- Ignoring this dependency in the data results in inflated test statistics when observations are positively correlated
- CAN DRAW INCORRECT CONCLUSIONS
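The inflation described above can be illustrated with a small simulation. This is only a sketch under assumptions that are not in the slides: two arms with no true difference, 10 clusters of 20 children per arm, total variance 1, a naive two-sample t-test that treats every child as independent, and a normal critical value of 1.96. The function names are hypothetical.

```python
import math
import random
import statistics

def naive_t(group_a, group_b):
    """Two-sample t statistic that treats every child as independent."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * statistics.variance(group_a)
                  + (nb - 1) * statistics.variance(group_b)) / (na + nb - 2)
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / se

def simulate_arm(rng, n_clusters, cluster_size, icc):
    """Scores for one arm: total variance 1, a share `icc` between clusters."""
    scores = []
    for _ in range(n_clusters):
        cluster_effect = rng.gauss(0, math.sqrt(icc))
        for _ in range(cluster_size):
            scores.append(cluster_effect + rng.gauss(0, math.sqrt(1 - icc)))
    return scores

def false_positive_rate(icc, n_sims=400, n_clusters=10, cluster_size=20, seed=1):
    """Share of null simulations in which the naive test has |t| > 1.96."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        arm_a = simulate_arm(rng, n_clusters, cluster_size, icc)
        arm_b = simulate_arm(rng, n_clusters, cluster_size, icc)
        if abs(naive_t(arm_a, arm_b)) > 1.96:
            rejections += 1
    return rejections / n_sims

rate_independent = false_positive_rate(icc=0.0)  # observations truly independent
rate_clustered = false_positive_rate(icc=0.2)    # positively correlated within clusters
print(rate_independent, rate_clustered)
```

With independent observations the rejection rate should sit near the nominal 5%; with positively correlated observations the naive test should reject far more often even though no effect exists.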
4. Nesting and Design
- Educational data often collected in schools, classrooms, or special treatment groups
  - Lack of independence among individuals -> reduction in variability
  - Pre-existing similarities (i.e., students within the cluster are more similar than students who would be randomly selected)
  - Shared instructional environment (i.e., variability in instruction greater across classrooms than within a classroom)
- Educational treatments often assigned to schools or classrooms
  - Advantage: To avoid contamination, make study more acceptable (often simple random assignment not possible)
  - Disadvantage: Analysis must take dependencies or relatedness of responses within clusters into account
5. Intraclass Correlation (ICC)
- For models with clustering of individuals
- Cluster effect: proportion of variance in the outcomes that is between clusters (compares within-cluster variance to between-cluster variance)
- Example: clustering of children in classrooms. ICC describes the proportion of variance associated with differences between classrooms
6. Intraclass Correlation
- Intraclass correlation (ICC): measure of relatedness or dependence of clustered data
- Proportion of variance that is between clusters
- ICC or $\rho = \sigma^2_b / (\sigma^2_b + \sigma^2_w)$
- ICC = 0: no correlation among individuals within a cluster
- ICC = 1: all responses within the clusters are identical
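The ICC formula above reduces to a one-line calculation once the two variance components are known. A minimal sketch follows; the function name and the input values are illustrative, not taken from the slides.

```python
def intraclass_correlation(var_between, var_within):
    """ICC = between-cluster variance / (between-cluster + within-cluster variance)."""
    total = var_between + var_within
    if total <= 0:
        raise ValueError("total variance must be positive")
    return var_between / total

# Boundary cases from the slide, plus one intermediate (made-up) case
icc_none = intraclass_correlation(0.0, 2.0)   # no between-cluster variance
icc_full = intraclass_correlation(2.0, 0.0)   # no within-cluster variance
icc_mid = intraclass_correlation(1.0, 3.0)
print(icc_none, icc_full, icc_mid)
```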
7. Nesting, Design, and ICC
- Taking ICC into account results in less power for a given sample size: less independent information
- Design effect = $1 + \rho(m - 1)$; effective sample size = $mk / (1 + \rho(m - 1))$
- m: number of individuals per cluster
- k: number of clusters
- $\rho$: ICC
- Effective sample size is number of clusters (k) when ICC = 1 and is number of individuals (mk) when ICC = 0
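The limiting cases above can be checked numerically. This sketch assumes the usual convention in which 1 + ρ(m − 1) is the design effect and mk divided by it is the effective sample size. The function names and the example of 10 clusters of 20 are hypothetical.

```python
def design_effect(m, icc):
    """Variance inflation from clustering: 1 + icc * (m - 1)."""
    return 1 + icc * (m - 1)

def effective_sample_size(m, k, icc):
    """Number of independent observations that mk clustered observations are worth."""
    return m * k / design_effect(m, icc)

# 10 clusters of 20 individuals each (illustrative numbers)
n_eff_icc0 = effective_sample_size(m=20, k=10, icc=0.0)   # equals mk
n_eff_icc1 = effective_sample_size(m=20, k=10, icc=1.0)   # equals k
n_eff_icc10 = effective_sample_size(m=20, k=10, icc=0.1)
print(n_eff_icc0, n_eff_icc1, round(n_eff_icc10, 1))
```

Even a modest ICC of .10 cuts the 200 observations to roughly a third of that in effective terms, which is why power drops once the nesting is taken into account.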
8. ICC and Hierarchical Linear Models
- Hierarchical linear models (HLM) implicitly take nesting into account
- Clustering of data is explicitly specified by model
- ICC is considered when estimating standard errors, test statistics, and p-values
9. 2-level HLM
- One level of nesting
- Longitudinal: Repeated measures of individual over time
  - Typically: Random intercepts and slopes to describe individual patterns of change over time
- Clusters: Nesting of individuals within classes, families, therapy groups, etc.
  - Typically: Random intercept to describe cluster effect
10. 2-level HLM: Random-intercepts models
- Corresponds to one-way ANOVA with random effects (mixed model ANOVA)
- Example: Classrooms randomly assigned to treatment or control conditions
  - All study children within a classroom in the same condition
  - Post-treatment outcome per child (can use pre-treatment as covariate to increase power)
- Level 1: children in classroom
- Level 2: classroom
- ICC reflects the degree of similarity among students within the classroom.
11. 2-level HLM: Random Intercept Model
- Level 1: individual students within the classroom
- Unconditional model: $Y_{ij} = \beta_{0j} + r_{ij}$
- Conditional model: $Y_{ij} = \beta_{0j} + \beta_1 X_{ij} + r_{ij}$
- $Y_{ij}$: outcome for ith student in jth class
- $\beta_{0j}$: intercept (e.g., mean) for jth class
- $\beta_1$: coefficient for individual-level covariate, $X_{ij}$
- $r_{ij}$: random error term for ith student in jth class, $E(r_{ij}) = 0$, $\mathrm{var}(r_{ij}) = \sigma^2$
12. 2-level HLM: Random Intercept Model
- Level 2: Classrooms
- Unconditional model: $\beta_{0j} = \gamma_{00} + u_{0j}$
- Conditional model: $\beta_{0j} = \gamma_{00} + \gamma_{01} W_{j1} + \gamma_{02} W_{j2} + u_{0j}$
- $\beta_{0j}$: intercept (e.g., mean) for jth class
- $\gamma_{00}$: grand mean in population
- $\gamma_{01}$: treatment effect for $W_{j1}$, dummy variable indicating treatment status (-.5 if control, .5 if treatment)
- $\gamma_{02}$: coefficient for $W_{j2}$, class-level covariate
- $u_{0j}$: random effect associated with jth classroom, $E(u_{0j}) = 0$, $\mathrm{var}(u_{0j}) = \tau_{00}$
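13. 2-level HLM: Random Intercept Model
- Combined (unconditional)
  - $Y_{ij} = \gamma_{00} + u_{0j} + r_{ij}$
  - $Y_{ij} = \beta_{0j} + r_{ij}$
  - $\beta_{0j} = \gamma_{00} + u_{0j}$
- Combined (conditional)
  - $Y_{ij} = \gamma_{00} + \gamma_{01} W_{j1} + \gamma_{02} W_{j2} + \beta_1 X_{ij} + u_{0j} + r_{ij}$
  - $Y_{ij} = \beta_{0j} + \beta_1 X_{ij} + r_{ij}$
  - $\beta_{0j} = \gamma_{00} + \gamma_{01} W_{j1} + \gamma_{02} W_{j2} + u_{0j}$
- $\mathrm{Var}(Y_{ij}) = \mathrm{Var}(u_{0j} + r_{ij}) = \tau_{00} + \sigma^2$
- ICC: $\rho = \tau_{00} / (\tau_{00} + \sigma^2)$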
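To make the variance decomposition concrete, the unconditional combined model can be simulated and the ICC recovered from the data. This is a sketch, not the estimation method HLM software uses: it assumes equal class sizes and uses a simple one-way ANOVA (method-of-moments) estimator rather than maximum likelihood. All parameter values and function names are made up for illustration.

```python
import random
import statistics

def simulate_classes(n_classes, class_size, gamma00, tau00, sigma2, seed=0):
    """Draw Y_ij = gamma00 + u_0j + r_ij for every student i in every class j."""
    rng = random.Random(seed)
    classes = []
    for _ in range(n_classes):
        u0j = rng.gauss(0, tau00 ** 0.5)          # class-level random effect
        classes.append([gamma00 + u0j + rng.gauss(0, sigma2 ** 0.5)
                        for _ in range(class_size)])
    return classes

def anova_icc(classes):
    """Method-of-moments ICC for equal-sized classes (one-way random-effects ANOVA)."""
    k = len(classes)
    m = len(classes[0])
    class_means = [statistics.mean(c) for c in classes]
    grand_mean = statistics.mean(class_means)
    ms_between = m * sum((cm - grand_mean) ** 2 for cm in class_means) / (k - 1)
    ms_within = sum((y - cm) ** 2
                    for c, cm in zip(classes, class_means)
                    for y in c) / (k * (m - 1))
    tau00_hat = max((ms_between - ms_within) / m, 0.0)  # truncate at zero
    return tau00_hat / (tau00_hat + ms_within)

# True ICC = 1 / (1 + 4) = 0.2 under these made-up parameters
classes = simulate_classes(n_classes=200, class_size=20,
                           gamma00=5.0, tau00=1.0, sigma2=4.0)
estimated_icc = anova_icc(classes)
print(round(estimated_icc, 3))
```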
14. Example: 2-level HLM Random Intercepts
- Purdue Curriculum Study (Powell & Diamond)
- Onsite or Remote coaching
- 27 Head Start classes randomly assigned to onsite coaching and 25 to remote coaching
- Post-test scores on writing
  - Onsite: n = 196, M = 6.70, SD = 1.54
  - Remote: n = 171, M = 7.05, SD = 1.64
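15. Example: 2-level HLM Random Intercepts
- Level 1: $\mathrm{Writing}_{ij} = \beta_{0j} + \beta_1 \mathrm{WritingPre}_{ij} + r_{ij}$
  - $\beta_1 = .56$, se = .05, p < .001
  - $E(r_{ij}) = 0$, $\mathrm{var}(r_{ij}) = 1.67$
- Level 2: $\beta_{0j} = \gamma_{00} + \gamma_{01} \mathrm{Onsite}_j + u_{0j}$
  - $\gamma_{00}$ (intercept, remote group adjusted mean) = 3.74, se = .31
  - $\gamma_{01}$ (Onsite-Remote difference) = -.37, se = .17, p = .03
  - $E(u_{0j}) = 0$, $\mathrm{var}(u_{0j}) = .137$
- ICC = $\tau_{00} / (\tau_{00} + \sigma^2)$ = .137 / (.137 + 1.66) = .076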
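The reported figures can be reproduced with a few lines of arithmetic. This sketch takes the estimates exactly as printed on the slide. The p-value uses a normal approximation to the test statistic, which is an assumption here, since the slide does not say which reference distribution or degrees of freedom were used.

```python
import math

# Variance components as printed on the slide
tau00 = 0.137    # between-classroom variance
sigma2 = 1.66    # value used in the ICC line (the level-1 line reports 1.67)
icc = tau00 / (tau00 + sigma2)

# Onsite-Remote difference and its standard error
gamma01 = -0.37
se_gamma01 = 0.17
z = gamma01 / se_gamma01
# Two-sided p-value under a standard normal reference distribution (assumption)
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

print(round(icc, 3), round(z, 2), round(p_two_sided, 2))
```

Using 1.67 instead of 1.66 for the within-classroom variance gives the same ICC to three decimals.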
16. 2-level HLM: Longitudinal (random-slopes and intercepts models)
- Does NOT correspond to one-way ANOVA with random effects
- Example: Longitudinal assessment of children's literacy skills during Pre-K years
- Level 1: individual growth curve
- Level 2: group growth curve
17. Level 1: Longitudinal HLM
- Level 1: individual growth curve
- Unconditional model: $Y_{ij} = \beta_{0j} + \beta_{1j} \mathrm{Age}_{ij} + r_{ij}$
- Conditional model: $Y_{ij} = \beta_{0j} + \beta_{1j} \mathrm{Age}_{ij} + \beta_2 X_{ij} + r_{ij}$
- $Y_{ij}$: outcome for jth student on the ith occasion
- $\mathrm{Age}_{ij}$: age at assessment for jth student on the ith occasion
- $\beta_{0j}$: intercept for jth student
- $\beta_{1j}$: slope for Age for jth student
- $\beta_2$: coefficient for time-varying covariate, $X_{ij}$
- $r_{ij}$: random error term for jth student on the ith occasion, $E(r_{ij}) = 0$, $\mathrm{var}(r_{ij}) = \sigma^2$
18. Level 2: Longitudinal HLM
- Level 2: predicting individual trajectories
- Unconditional model:
  - $\beta_{0j} = \gamma_{00} + u_{0j}$
  - $\beta_{1j} = \gamma_{10} + u_{1j}$
- Conditional model:
  - $\beta_{0j} = \gamma_{00} + \gamma_{01} W_{j1} + \gamma_{02} W_{j2} + u_{0j}$
  - $\beta_{1j} = \gamma_{10} + \gamma_{11} W_{j1} + \gamma_{12} W_{j2} + u_{1j}$
- $\beta_{0j}$: intercept for jth student
- $\beta_{1j}$: slope for Age for jth student
- $\gamma_{00}$: intercept in population
- $\gamma_{10}$: slope in population
- $\gamma_{01}$: treatment effect on intercept for $W_{j1}$, student-level covariate
- $\gamma_{11}$: treatment effect on slope for $W_{j1}$, student-level covariate
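19. Level 2: Longitudinal HLM
- Level 2: predicting individual trajectories
- Unconditional model:
  - $\beta_{0j} = \gamma_{00} + u_{0j}$
  - $\beta_{1j} = \gamma_{10} + u_{1j}$
- Conditional model:
  - $\beta_{0j} = \gamma_{00} + \gamma_{01} W_{j1} + u_{0j}$
  - $\beta_{1j} = \gamma_{10} + \gamma_{11} W_{j1} + u_{1j}$
- $u_{0j}$: random effect for individual intercept
- $u_{1j}$: random effect for individual slope
- $E(u_{0j}) = 0$, $\mathrm{var}(u_{0j}) = \tau_{00}$
- $E(u_{1j}) = 0$, $\mathrm{var}(u_{1j}) = \tau_{11}$
- $\mathrm{cov}(u_{0j}, u_{1j}) = \tau_{10}$
- $\mathrm{var}\begin{pmatrix} u_{0j} \\ u_{1j} \end{pmatrix} = T = \begin{pmatrix} \tau_{00} & \tau_{01} \\ \tau_{10} & \tau_{11} \end{pmatrix}$
- Level 1 and level 2 error terms are independent: $\mathrm{cov}(r_{ij}, u_{0j}) = \mathrm{cov}(r_{ij}, u_{1j}) = 0$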
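Substituting the unconditional level 2 equations into the unconditional level 1 model gives the combined form and the variance it implies. The derivation below is a sketch that treats j as the student index (as the subscripts on the intercept and slope imply) and relies on the stated independence of the level 1 and level 2 error terms.

```latex
\begin{align*}
Y_{ij} &= \gamma_{00} + \gamma_{10}\,\mathrm{Age}_{ij}
          + u_{0j} + u_{1j}\,\mathrm{Age}_{ij} + r_{ij} \\
\mathrm{Var}(Y_{ij} \mid \mathrm{Age}_{ij})
       &= \tau_{00} + 2\,\mathrm{Age}_{ij}\,\tau_{10}
          + \mathrm{Age}_{ij}^{2}\,\tau_{11} + \sigma^{2}
\end{align*}
```

Unlike the random-intercept model, the implied variance changes with age, so a single ICC no longer summarizes the dependence.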
20. Example: Longitudinal HLM
- Purdue Curriculum Study (Powell & Diamond)
- Level 1: estimating individual growth curves for children in one treatment condition (Remote)
- Level 2: estimating population growth curves for Remote condition

Blending    Pre            Post            Follow-up
N           187            171             63
M (SD)      9.48 (5.34)    13.75 (4.57)    15.14 (4.60)
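21. Example
- Level 1: $\mathrm{Blending}_{ij} = \beta_{0j} + \beta_{1j} \mathrm{Age}_{ij} + r_{ij}$
  - estimated $\sigma^2 = 10.34$
- Level 2:
  - $\beta_{0j} = \gamma_{00} + \gamma_{01} W_{j1} + u_{0j}$
  - $\beta_{1j} = \gamma_{10} + u_{1j}$
- Estimated results
  - Intercept: $\gamma_{00} = 11.86$ (se = .48), $\tau_{00} = 10.03$
  - season: $\gamma_{01} = 2.43$ (se = .70)
  - Slope: $\gamma_{10} = 1.51$ (se = .60), $\tau_{11} = 4.24$, $\tau_{10} = -1.45$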
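The variance and covariance estimates above are easier to read as a correlation between individual intercepts and slopes. The sketch below uses the printed estimates and also evaluates the model-implied variance at Age = 0. It assumes Age = 0 is whatever centering point the analysis used, which the slide does not state. The variable and function names are illustrative.

```python
import math

# Estimates as printed on the slide
sigma2 = 10.34   # level-1 residual variance
tau00 = 10.03    # variance of individual intercepts
tau11 = 4.24     # variance of individual slopes
tau10 = -1.45    # covariance of intercepts and slopes

# Correlation between a child's intercept and slope
intercept_slope_corr = tau10 / math.sqrt(tau00 * tau11)

def implied_variance(age):
    """Var(Y) at a given age under the random intercept-and-slope model."""
    return tau00 + 2 * age * tau10 + age ** 2 * tau11 + sigma2

print(round(intercept_slope_corr, 3))
print(round(implied_variance(0), 2))
```

The negative correlation suggests that children with higher intercepts tended to grow somewhat more slowly, though the size of the relationship is modest.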
22. 3-level HLM
- 2 levels of nesting
- Examples
  - Longitudinal assessments of children in randomly assigned classrooms
    - Level 1: child level data
    - Level 2: child's growth curve
    - Level 3: classroom level data
  - Two levels of nesting such as children nested in classrooms that are nested in schools
    - Level 1: child level data
    - Level 2: classroom level data
    - Level 3: school level data
23. 3-level Model: Random Intercepts
- Children nested in classrooms, classrooms nested in schools
- Level 1 child-level model: $Y_{ijk} = \pi_{0jk} + e_{ijk}$
  - $Y_{ijk}$ is achievement of child i in class j in school k
  - $\pi_{0jk}$ is mean score of class j in school k
  - $e_{ijk}$ is random child effect
- Classroom level model: $\pi_{0jk} = \beta_{00k} + r_{0jk}$
  - $\beta_{00k}$ is mean score for school k
  - $r_{0jk}$ is random class effect
- School level model: $\beta_{00k} = \gamma_{000} + u_{00k}$
  - $\gamma_{000}$ is grand mean score
  - $u_{00k}$ is random school effect
24. 3-level Model: Random Intercepts
- Children nested in classrooms, classrooms nested in schools
- Level 1 child-level model: $Y_{ijk} = \pi_{0jk} + e_{ijk}$
  - $e_{ijk}$ is random child effect, $E(e_{ijk}) = 0$, $\mathrm{var}(e_{ijk}) = \sigma^2$
- Within classroom level model: $\pi_{0jk} = \beta_{00k} + r_{0jk}$
  - $r_{0jk}$ is random class effect, $E(r_{0jk}) = 0$, $\mathrm{var}(r_{0jk}) = \tau_\pi$
  - Assume variance among classes within school is the same for every school
- Between classroom (school): $\beta_{00k} = \gamma_{000} + \gamma_{01} \mathrm{trt}_k + u_{00k}$
  - $E(u_{00k}) = 0$, $\mathrm{var}(u_{00k}) = \tau_\beta$
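Substituting the school-level and classroom-level equations into the child-level model gives the combined three-level form. This is a sketch that writes the treatment indicator with a school subscript k (an assumption, since the slide shows it without a subscript) and assumes the three random effects are mutually independent.

```latex
\begin{align*}
Y_{ijk} &= \gamma_{000} + \gamma_{01}\,\mathrm{trt}_k + u_{00k} + r_{0jk} + e_{ijk} \\
\mathrm{Var}(Y_{ijk} \mid \mathrm{trt}_k) &= \tau_\beta + \tau_\pi + \sigma^{2}
\end{align*}
```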
25. Partitioning variance
- Proportion of variance within classroom
  - $\sigma^2 / (\sigma^2 + \tau_\pi + \tau_\beta)$
- Proportion of variance among classrooms within schools
  - $\tau_\pi / (\sigma^2 + \tau_\pi + \tau_\beta)$
- Proportion of variance among schools
  - $\tau_\beta / (\sigma^2 + \tau_\pi + \tau_\beta)$
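The three proportions share a denominator, so they can be computed together and must sum to one. A minimal sketch follows; the function name and the variance components (6, 3, and 1) are made up for illustration.

```python
def partition_variance(sigma2, tau_pi, tau_beta):
    """Split total variance into child, classroom, and school shares."""
    total = sigma2 + tau_pi + tau_beta
    if total <= 0:
        raise ValueError("total variance must be positive")
    return {
        "within_classroom": sigma2 / total,
        "classrooms_within_schools": tau_pi / total,
        "schools": tau_beta / total,
    }

# Hypothetical variance components
shares = partition_variance(sigma2=6.0, tau_pi=3.0, tau_beta=1.0)
for level, share in shares.items():
    print(level, round(share, 2))
```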
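26. 3-level HLM: level 2 longitudinal and level 3 random intercepts
- Typically treatment randomly assigned at classroom level, children followed longitudinally (e.g., Purdue Curriculum Study)
- (within child) Level 1: $Y_{ijk} = \pi_{0jk} + \pi_{1jk} \mathrm{Age}_{ijk} + e_{ijk}$
  - $E(e_{ijk}) = 0$, $\mathrm{var}(e_{ijk}) = \sigma^2$
- (between child) Level 2:
  - $\pi_{0jk} = \beta_{00k} + r_{0jk}$, $\pi_{1jk} = \beta_{10k} + r_{1jk}$
  - $E(r_{0jk}) = 0$, $\mathrm{var}(r_{0jk}) = \tau_{\pi 0}$; $E(r_{1jk}) = 0$, $\mathrm{var}(r_{1jk}) = \tau_{\pi 1}$
- (between classes) Level 3:
  - $\beta_{00k} = \gamma_{00} + u_{00k}$, $\beta_{10k} = \gamma_{10} + u_{10k}$
  - $E(u_{00k}) = 0$, $\mathrm{var}(u_{00k}) = \tau_{\beta 0}$; $E(u_{10k}) = 0$, $\mathrm{var}(u_{10k}) = \tau_{\beta 1}$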
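Substituting levels 3 and 2 into level 1 shows how each child's trajectory is built from a population curve plus classroom and child deviations. The combined form below is a sketch that writes the within-child error as e, matching the variance statement on the slide.

```latex
\begin{align*}
Y_{ijk} &= \underbrace{\gamma_{00} + \gamma_{10}\,\mathrm{Age}_{ijk}}_{\text{population curve}}
         + \underbrace{u_{00k} + u_{10k}\,\mathrm{Age}_{ijk}}_{\text{classroom deviation}}
         + \underbrace{r_{0jk} + r_{1jk}\,\mathrm{Age}_{ijk}}_{\text{child deviation}}
         + e_{ijk}
\end{align*}
```

A classroom-level treatment effect would enter the level 3 equations as an added predictor of the classroom intercept and slope.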
27. Example: Purdue Curriculum Study
- Level 1: individual growth curve
- Level 2: classroom growth curve
- Level 3: treatment differences in classroom growth curves

Writing            Pre            Post           Follow-up
Onsite   N         199            196            79
         M (SD)    5.98 (1.49)    6.70 (1.54)    6.92 (1.74)
Remote   N         187            171            63
         M (SD)    6.01 (1.55)    7.04 (1.64)    7.48 (1.62)
28. Purdue Curriculum Study
29. Threats
- Homogeneity of variance at each level
- Nonnormal data with heavy tails
- Bad data
- Differences in variability among groups
- Normality assumption
- Examine residuals
- Robust standard error (large n)
- Inferences with small samples
30. 3-level HLM: Longitudinal assessments of individuals in clustered settings