Structure Equation With Nonnormal Variables

Provided by: HYC
1
Structure Equation With Nonnormal Variables
  • Presented in DHPR, NHRI
  • 2004.5.

2
Major Source of Inappropriate Use of SEM
  • Failure to satisfy the scaling and normality
    assumptions
  • Many measurements are dichotomous or ordered
    categories, e.g. agree / no preference /
    disagree
  • Some are continuous but depart dramatically from
    normality, e.g. number of cigarettes smoked by
    females per day
  • In 1990, of 72 articles published in personality
    and psychology journals that used SEM, only 19%
    acknowledged the normality assumption, and less
    than 10% explicitly considered whether the
    assumption had been violated

3
Review of Normal Theory Estimation
  • Estimation minimizes the difference between each
    element in S and the corresponding element in
    Σ(θ)
  • S is the sample covariance matrix based on
    observed data
  • Σ(θ) is the covariance matrix implied by a
    set of parameters θ for the hypothesized model

4
Most Commonly Used Estimation Techniques
(maximize the function)
  • Maximum likelihood (ML)

5
  • Generalized Least Squares (GLS)
  • Normal function (used in ML)
  • Therefore, both ML and GLS are based on the
    normality assumption
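
For reference, the normal-theory discrepancy functions usually associated with these two estimators can be written as below. These are the standard textbook forms, given here as an assumption about what the slides displayed; p (the number of observed variables) and the F notation are not taken from the transcript. Maximizing the normal likelihood is equivalent to minimizing F_ML.

```latex
\begin{align}
F_{ML}(\theta)  &= \log\lvert\Sigma(\theta)\rvert
                 + \operatorname{tr}\!\left(S\,\Sigma(\theta)^{-1}\right)
                 - \log\lvert S\rvert - p \\
F_{GLS}(\theta) &= \tfrac{1}{2}\,\operatorname{tr}\!\left\{\left[\left(S - \Sigma(\theta)\right)W^{-1}\right]^{2}\right\},
                 \qquad W^{-1} = S^{-1} \text{ (most common choice)}
\end{align}
```

Both functions equal zero when Σ(θ) reproduces S exactly and grow as the implied matrix departs from the sample matrix.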

6
More on GLS
  • Minimize a weighted sum of squared differences
    between the elements of S and Σ(θ)
  • W⁻¹ is the weight matrix; the most common choice
    is S⁻¹
  • s_ij: sample covariance between x_i and x_j
  • The large-sample distribution of the elements of
    S is assumed to be multivariate normal
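
As a rough numerical sketch of the two normal-theory discrepancy functions, the snippet below evaluates the standard textbook ML and GLS forms for a given S and model-implied matrix. The function names `f_ml` and `f_gls` and the 2 × 2 example matrices are illustrative assumptions, not part of the presentation.

```python
import numpy as np

def f_ml(S, Sigma):
    """Normal-theory ML discrepancy: log|Sigma| + tr(S Sigma^-1) - log|S| - p."""
    p = S.shape[0]
    _, logdet_sigma = np.linalg.slogdet(Sigma)
    _, logdet_s = np.linalg.slogdet(S)
    return logdet_sigma + np.trace(S @ np.linalg.inv(Sigma)) - logdet_s - p

def f_gls(S, Sigma, W_inv=None):
    """GLS discrepancy: 0.5 * tr{[(S - Sigma) W^-1]^2}; W^-1 defaults to S^-1."""
    if W_inv is None:
        W_inv = np.linalg.inv(S)
    R = (S - Sigma) @ W_inv
    return 0.5 * np.trace(R @ R)

# Hypothetical sample covariance matrix and a model-implied matrix.
S = np.array([[2.0, 0.6],
              [0.6, 1.0]])
Sigma = np.array([[2.0, 0.3],
                  [0.3, 1.0]])

print("F_ML :", f_ml(S, Sigma))
print("F_GLS:", f_gls(S, Sigma))
```

In an actual SEM program these functions are minimized over θ; here Sigma is simply fixed to show how the discrepancy is computed.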

7
Problems
  • Assumptions: very large sample, continuous
    variables, multivariate normal distribution
  • Statistical properties?
  • Robustness of the estimators?
8
Effects and Detection
  • The observed variables do not have a multivariate
    normal distribution
  • The χ² goodness-of-fit test is not an accurate
    assessment of fit, rejecting too many (> 5%) true
    models
  • Tests of all parameter estimates are expected to
    be biased, yielding too many significant results
  • Categorical variables assumed continuous
  • Observed correlations are attenuated: the
    underlying correlation is stronger than the
    computed one

9
Studies on the Effect of Non-Normality
  • Olsson, Foss, Troye and Howell (Structural
    Equation Modeling, 2000)

[Figure: diagram spanning a theoretical domain and an empirical domain. Elements: reality; the true model (M_true, θ_true); the theoretical model (M_theory, θ_theory); population (true) states and covariance Σ; covariance implied by the theoretical model, Σ(θ); sample covariance S. Labelled links: theoretical fit, true empirical fit, empirical fit.]
10
  • Theoretical fit: the degree of isomorphism
    between the structure and parameter values of a
    theoretical model and those of the true model
    that generates the data.
  • Empirical fit: the discrepancy between the
    observed covariance structure and the one implied
    by a theoretical model.
  • True empirical fit: the correspondence between
    the population covariance matrix (Σ) and the
    covariance structure implied by the theoretical
    model (Σ(θ)).

11
Comparisons Among ML, GLS, and WLS
  • The performance of the three methods, in terms of
    empirical and theoretical fit, is differentially
    affected by sample size, specification error, and
    kurtosis.
  • ML is considerably less sensitive than the
    others to variation in sample size and kurtosis.
    Only its empirical fit is affected by
    specification error. In general, ML tends to be
    more stable and more accurate.
  • GLS requires well-specified models, but allows
    small sample sizes. Its appealing performance in
    terms of empirical fit can be misleading
  • WLS requires well-specified models as well as
    large sample sizes.

12
Detecting Departure From Normal
  • Skewness and kurtosis
  • Skewness; kurtosis (positive vs. negative)
  • SAS PROC UNIVARIATE
  • Univariate vs. multivariate
  • If univariate normality is violated for any
    variable, multivariate normality (of the joint
    distribution) cannot hold. The converse is not
    true: univariate normality of every variable does
    not guarantee multivariate normality.
  • Mardia's (1970) measures of multivariate skewness
    and kurtosis
  • Outliers
  • Checking errors, leverage statistics, etc.
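
The checks listed above can be sketched in NumPy as follows. This is a minimal illustration, assuming simple moment estimators for univariate skewness and excess kurtosis and the usual definitions of Mardia's multivariate skewness (b1p) and kurtosis (b2p); the function names and the simulated data are hypothetical, and the slides themselves point to SAS PROC UNIVARIATE for the univariate part.

```python
import numpy as np

def univariate_shape(X):
    """Column-wise skewness and excess kurtosis (moment estimators)."""
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize with ddof=0
    skew = (Z ** 3).mean(axis=0)
    excess_kurt = (Z ** 4).mean(axis=0) - 3.0
    return skew, excess_kurt

def mardia(X):
    """Mardia's multivariate skewness b1p and kurtosis b2p.

    Also returns p(p + 2), the value b2p approaches under multivariate normality.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / n                      # biased (ML) covariance estimate
    D = Xc @ np.linalg.inv(S) @ Xc.T       # n x n Mahalanobis cross-products
    b1p = (D ** 3).sum() / n ** 2
    b2p = (np.diag(D) ** 2).mean()
    return b1p, b2p, p * (p + 2)

# Simulated data: one roughly normal sample, one strongly skewed sample.
rng = np.random.default_rng(0)
normal_data = rng.standard_normal((2000, 3))
skewed_data = rng.exponential(size=(2000, 3))

for name, data in [("normal", normal_data), ("skewed", skewed_data)]:
    skew, kurt = univariate_shape(data)
    b1p, b2p, expected = mardia(data)
    print(name, "skewness:", skew.round(2), "excess kurtosis:", kurt.round(2))
    print(name, "Mardia b1p:", round(b1p, 2), "b2p:", round(b2p, 2),
          "normal-theory b2p:", expected)
```

Values of b1p near zero and b2p near p(p + 2) are consistent with multivariate normality; large departures in either suggest the remedies discussed in the following slides.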

13
Remedies for Multivariate Nonnormality
  • Alternative Estimation Techniques
  • Asymptotically Distribution Free Estimator (ADF)
  • Optimal weight matrix consisting of a combination
    of second- and fourth- order terms
  • It has many more elements than the normal theory
    GLS weight matrix (S-1)
  • Computationally demanding: e.g. with 15 measured
    variables, S has ½ × 15 × 16 = 120 unique
    elements, so the weight matrix has
    120 × 120 = 14,400 elements. Inverting the
    matrix can be difficult.
  • GLS only takes the diagonal of the matrix (120
    elements).
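
The element counts above, and the way second- and fourth-order terms enter the ADF weight matrix, can be illustrated with a short sketch. It assumes the commonly used sample estimate w_ij,kl = s_ijkl − s_ij s_kl; the function names and the simulated data are hypothetical, and a real ADF estimator would go on to invert this matrix.

```python
import numpy as np

def n_unique_cov_elements(p):
    """Number of non-redundant elements in a p x p covariance matrix."""
    return p * (p + 1) // 2

def adf_weight_matrix(X):
    """Sample ADF weight matrix with entries s_ijkl - s_ij * s_kl."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    rows, cols = np.tril_indices(p)        # index pairs (i, j) with i >= j
    prods = Xc[:, rows] * Xc[:, cols]      # n x p* matrix of cross-products
    s = prods.mean(axis=0)                 # second-order moments s_ij
    fourth = prods.T @ prods / n           # fourth-order moments s_ijkl
    return fourth - np.outer(s, s)

p_star = n_unique_cov_elements(15)
print(p_star, p_star ** 2)                 # sizes quoted on the slide

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 15))         # simulated data, 15 variables
W = adf_weight_matrix(X)
print(W.shape)
```

Because the matrix grows with the fourth power of the number of variables, both its estimation and its inversion require large samples, which is consistent with the sample-size advice given later in the deck.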

14
  • SCALED χ² statistic and standard errors (Satorra,
    1990)
  • Corrects or rescales the χ²
  • The χ² from ML or GLS is divided by a constant k,
    whose value is a function of the model-implied
    residual weight matrix, the multivariate
    kurtosis, and the degrees of freedom for the
    model.
  • k acts as a kurtosis adjustment to the χ²
  • It is available in the EQS program
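
In symbols, the rescaling described above is commonly written as follows. This is a sketch of the usual Satorra-Bentler form, stated as an assumption rather than taken from the slides: T is the ML or GLS test statistic, df the model degrees of freedom, Û the estimated residual weight matrix, and Γ̂ the estimated asymptotic covariance matrix of the sample covariances, which carries the fourth-order (kurtosis) information.

```latex
\begin{align}
T_{\text{scaled}} &= \frac{T}{k}, &
k &= \frac{\operatorname{tr}\!\left(\hat{U}\,\hat{\Gamma}\right)}{df}
\end{align}
```

When the data are multivariate normal, k is close to 1 and the scaled statistic is close to the unscaled one.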

15
Bootstrapping
  • Taking repeated samples from a population of
    interest
  • Calculate the parameter estimates of interest,
    resulting in an empirical sampling distribution
    of the estimates.
  • Repeated samples of the same size as the original
    sample are taken from the original sample with
    replacement.
  • For example, if the original sample consists of
    (1, 2, 3, 4), possible bootstrap samples are
    (1, 4, 1, 1), (2, 3, 1, 3), or (4, 2, 2, 4).
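
The resampling scheme above can be sketched as follows. This is a minimal illustration using the slide's toy sample (1, 2, 3, 4) and the sample mean as the statistic; the `bootstrap` helper, the number of resamples, and the seed are assumptions, and in an SEM application the statistic would be a fitted parameter estimate.

```python
import numpy as np

def bootstrap(data, statistic, n_boot=1000, seed=0):
    """Empirical sampling distribution of `statistic` under resampling.

    Each bootstrap sample has the same size as `data` and is drawn from
    it with replacement (rows are resampled if `data` is 2-D).
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    n = data.shape[0]
    estimates = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)   # n indices drawn with replacement
        estimates.append(statistic(data[idx]))
    return np.asarray(estimates)

sample = np.array([1, 2, 3, 4])
boot_means = bootstrap(sample, np.mean, n_boot=2000)

print("bootstrap mean of the estimates:", boot_means.mean())
print("bootstrap standard error:", boot_means.std(ddof=1))
```

The standard deviation of the bootstrap estimates serves as an empirical standard error that does not rely on the normality assumption.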

16
Re-expression of Variables
  • Item parcels: the sum or mean of several items
    that measure the same domain.
  • Potential complication in the interpretation of
    relationships and structure.
  • Use of too few measured variables as indicators
    of a domain yields less stringent tests of the
    proposed structure of confirmatory factor models
  • Identification problems are more likely to occur

17
  • Transformation of variables
  • Linear transformations (e.g. standardization)
    have no effect on either the distributions of
    variables or the results of simple structural
    equation models
  • Non-linear transformations potentially alter the
    distribution of the measured variables as well as
    the relationships among measured variables,
    potentially eliminating some forms of curvilinear
    effects and interactions between variables.

18
Selecting an Appropriate Transformation
  • Power function
  • Positively skewed: generally, raise the scores
    on the measured variable to a power less than
    1.0, e.g. log, square root, reciprocal
  • Negatively skewed: raise the raw scores to a
    power greater than 1.0.
  • Box-Cox transformation: when scatter plots show
    a possible non-linear relationship between pairs
    of variables.
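
The effect of power transformations on skewness can be checked numerically. The sketch below assumes the usual one-parameter Box-Cox family, (x^λ − 1)/λ with log x at λ = 0, applied to simulated positively skewed data; the function names and the lognormal example are illustrative only.

```python
import numpy as np

def skewness(x):
    """Moment estimator of skewness."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()
    return (z ** 3).mean()

def box_cox(x, lam):
    """Box-Cox power transformation for strictly positive x."""
    x = np.asarray(x, dtype=float)
    if np.any(x <= 0):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return np.log(x)
    return (x ** lam - 1.0) / lam

rng = np.random.default_rng(0)
x = rng.lognormal(mean=0.0, sigma=0.8, size=5000)   # positively skewed variable

# lam = 1 is a linear transformation, 0.5 the square root, 0 the log, -1 the reciprocal.
for lam in (1.0, 0.5, 0.0, -1.0):
    print("lambda =", lam, "skewness =", round(skewness(box_cox(x, lam)), 2))
```

With λ = 1 the skewness is unchanged, since the transformation is linear. Powers below 1 pull in the long right tail, and too strong a power (here the reciprocal) can overshoot and leave the variable negatively skewed, so the transformed data should always be re-examined.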

19
About the Transformation
  • Examine the univariate skewness and kurtosis of
    the transformed data
  • Examine the multivariate skewness and kurtosis of
    the transformed data using Mardia measures
  • y* ≠ y, so the covariance of the transformed y*
    should be computed, not that of the original y
  • Box-Cox transformation can result in considerable
    confusion in the interpretation of the variables.

20
Choice Among Remedies
  • In large samples (1000 to 5000), ADF and the
    SCALED χ² and standard errors perform well for
    continuous nonnormal data.
  • In medium samples (200 to 500), the choice
    depends on the degree of nonnormality.
  • In small samples, the SCALED χ² is suitable when
    nonnormality is not severe; problems arise as
    the data begin to depart from normality (e.g.
    skewness = 2, kurtosis = 7).
  • Variable re-expression is recommended.