Title: Overview
1. Overview
- Why we need multivariate statistics
- Types of designs
- Types of data
- Orthogonality
- Linear composites
- Matrices
- Types of multivariate analyses
2. The following says it all!
- Multivariate thinking is defined as a body of thought processes that illuminate interrelations between and within sets of variables
3. Why, oh why, do we need multivariate statistics?!
- Theoretical reasons
- The real world is multidimensional and multicausal
- i.e., multiple IVs (predictors) and DVs (outcomes)
- increases precision by using multiple measures of a construct
- increases completeness
- Statistical reasons
- Examine large data sets in a single analysis
- control Type I error rates (omnibus tests)
- generally increases statistical power (reduces error)
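The Type I error point can be made concrete with a little arithmetic. The sketch below assumes k independent tests, each run at alpha = .05; the independence assumption and the function name `familywise_error` are illustrative and not from the slides. Under that assumption, the chance of at least one false positive is 1 - (1 - alpha)^k, which is the inflation a single omnibus test is meant to avoid.

```python
def familywise_error(alpha: float, k: int) -> float:
    """Probability of at least one Type I error across k independent tests."""
    return 1.0 - (1.0 - alpha) ** k


# Separate univariate tests on 1, 3, 5, and 10 DVs (illustrative counts).
for k in (1, 3, 5, 10):
    rate = familywise_error(0.05, k)
    print(f"{k:2d} tests at alpha = .05 -> familywise rate {rate:.3f}")
```

With correlated DVs the inflation is smaller than this formula suggests, so it is best read as a rough upper bound.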
4. Experimental and Nonexperimental (Correlational) Research
- Experimental
- researcher controls levels of IVs
- i.e., systematically controls IV variance
- random assignment is used to reduce the influence of third variables
- Nonexperimental (correlational)
- researcher does not control variance of the IV
- causality is more difficult to establish
- IV-DV distinction may be arbitrary
5. Multivariate Statistics and Experimental and Nonexperimental Research
- Nonexperimental (correlational)
- multivariate techniques were first developed for this
- goal was to take correlation among measures into account
- reduce the number of measured variables
- reduce Type I errors
- Experimental
- IVs are typically orthogonal
- however, multiple DVs are typically measured, so...
- reduce Type I errors
6. Let's get definitional for a moment
- Types of data (i.e., how are variables measured)
- continuous (quantitative)
- scale scores
- discrete (categorical, qualitative)
- can be either ordered (e.g., income, age)
- or not (e.g., ethnicity)
- dichotomous (or binary)
- you tell me
7. Types of data continued
- Making a continuous variable discrete or binary
- generally not a good idea: loss of information
- median split, tertile split, quartile split
- Assuming Likert scales are continuous
- individual items are tough (assume continuous?)
- scale scores are typically OK
- The distribution of the data, not the measurement property of the values, is typically more important
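The information loss from a median split can be seen in a small simulation. This is a sketch under stated assumptions, not data from the course: two normally distributed variables with a population correlation of .50, a sample size of 10,000, and a fixed random seed are all made up for illustration. For normal data, a median split is expected to shrink the correlation to roughly 80% of its original size.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the sketch is repeatable
n = 10_000                       # hypothetical sample size

# Simulate x and y with a population correlation of .50.
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(scale=np.sqrt(1 - 0.5**2), size=n)

# Median split: recode x as 0 (at or below the median) or 1 (above it).
x_split = (x > np.median(x)).astype(float)

r_continuous = np.corrcoef(x, y)[0, 1]
r_split = np.corrcoef(x_split, y)[0, 1]

print(f"r with continuous x:  {r_continuous:.2f}")
print(f"r after median split: {r_split:.2f}")
```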
8. Orthogonality vs. Obliqueness
- Definition
- Nonassociation vs. association between variables (typically IVs)
- Old vs. New School
9. Linear Composites
- Typically form a linear combination of variables
- $Y = W_1 X_1 + W_2 X_2 + \text{error}$
- The weights are meaningful (indicate strength)
- Indicate the association between variables
- e.g., pattern matrix, structure matrix
- Algorithms maximize the size of these weights
- Maximum likelihood (ML)
- Expectation-maximization (EM), Full-Information ML, Restricted-Information ML
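A linear composite of two variables can be sketched in a few lines. The example below uses ordinary least squares rather than the ML or EM estimators named on the slide (with normally distributed errors, least squares and ML give the same weights for this simple model). The simulated data, the weights 0.6 and 0.3, and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)   # fixed seed so the sketch is repeatable
n = 500                          # hypothetical number of participants

X = rng.normal(size=(n, 2))      # columns play the roles of X1 and X2
true_w = np.array([0.6, 0.3])    # hypothetical W1 and W2
# Y = W1*X1 + W2*X2 + error
y = X @ true_w + rng.normal(scale=0.5, size=n)

# Estimate the weights by minimizing squared error.
w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# The linear composite is the weighted sum of the observed variables.
composite = X @ w_hat

# Correlation between the composite and Y (the multiple correlation).
multiple_R = np.corrcoef(composite, y)[0, 1]

print("estimated weights:", np.round(w_hat, 2))
print("multiple R:", round(multiple_R, 2))
```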
10. Data appropriate for multivariate statistics: Just a look-see at matrices
Participant  Depression  Coping  Alcohol Use
1            2           3       4
2            1           1       1
3            4           2       3
4            7           7       7
11. Just a look-see at matrices continued
- Correlation matrix (R)
- Square and symmetric
- Metric free
             Depression  Coping  Alcohol Use
Depression   1.0         .30     .15
Coping       .30         1.0     .50
Alcohol Use  .15         .50     1.0
12. Just a look-see at matrices continued
- Variance-Covariance (Σ)
- Nonstandardized values
- Square and symmetric
- Retains metric of the original variables
             Depression  Coping  Alcohol Use
Depression   4.21        1.62    2.52
Coping       1.62        4.02    1.38
Alcohol Use  2.52        1.38    4.34
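The contrast between "metric free" and "retains metric" can be checked directly. The sketch below assumes the four-participant data matrix shown earlier reads as rows (2, 3, 4), (1, 1, 1), (4, 2, 3), (7, 7, 7) for Depression, Coping, and Alcohol Use. The matrices it prints come from those four rows only and are not expected to match the illustrative R and variance-covariance matrices on the slides. Multiplying Depression by 10 is an arbitrary change of metric chosen for the demonstration.

```python
import numpy as np

# Rows are participants 1-4; columns are Depression, Coping, Alcohol Use.
data = np.array([
    [2, 3, 4],
    [1, 1, 1],
    [4, 2, 3],
    [7, 7, 7],
], dtype=float)

S = np.cov(data, rowvar=False)        # variance-covariance matrix
R = np.corrcoef(data, rowvar=False)   # correlation matrix

# Change the metric of Depression (e.g., a 10x longer scale).
rescaled = data.copy()
rescaled[:, 0] *= 10
S_rescaled = np.cov(rescaled, rowvar=False)
R_rescaled = np.corrcoef(rescaled, rowvar=False)

print("variance-covariance matrix:\n", np.round(S, 2))
print("correlation matrix:\n", np.round(R, 2))
print("covariances changed by rescaling:", not np.allclose(S, S_rescaled))
print("correlations unchanged by rescaling:", np.allclose(R, R_rescaled))
```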
13. Just a look-see at matrices continued
- Determinant
- Single number providing an index of the generalized variance in a matrix
- Ranges from 0 to 1 for a correlation matrix
- Tells us how much the variables in a matrix differ
- Values close to 0 indicate that variables are oblique
- Values close to 1 indicate that variables are orthogonal
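The determinant can be computed for the Depression, Coping, and Alcohol Use correlation matrix from the earlier slide and compared with two reference cases. The two comparison matrices (an identity matrix for perfectly orthogonal variables, and a matrix with all correlations set to .95 for nearly redundant variables) are illustrative assumptions and not from the slides.

```python
import numpy as np

# Correlation matrix from the slides (Depression, Coping, Alcohol Use).
R_slides = np.array([
    [1.00, 0.30, 0.15],
    [0.30, 1.00, 0.50],
    [0.15, 0.50, 1.00],
])

# Hypothetical reference cases for comparison.
R_orthogonal = np.eye(3)             # all correlations are 0
R_oblique = np.array([               # all correlations are .95
    [1.00, 0.95, 0.95],
    [0.95, 1.00, 0.95],
    [0.95, 0.95, 1.00],
])

det_slides = np.linalg.det(R_slides)
det_orthogonal = np.linalg.det(R_orthogonal)
det_oblique = np.linalg.det(R_oblique)

print(f"slides matrix:     {det_slides:.4f}")
print(f"orthogonal matrix: {det_orthogonal:.4f}")
print(f"highly oblique:    {det_oblique:.4f}")
```

The slides matrix falls between the two extremes, which matches its mix of small and moderate correlations.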
14. Research Questions and Multivariate Techniques
- Research question, type of data, and number of variables determine the statistic
- Five (+2) big ones
- Degree of relationship among variables
- Significance of group differences
- Prediction of group membership
- Structure
- Time course of events
- Nested data structures
- Profiles of people
15. Degree of relationship among variables
- Some form of correlation/regression or chi-square
- Bivariate r
- Multiple R (not multiple regression, which is predictive)
- Sequential R (hierarchical multiple regression)
- Canonical R (multiple IVs and DVs)
- Multiway frequency analysis (log-linear analysis)
- logit analysis if we want to predict the DV
- Path analysis (can be predictive)
- temporal relations among observed variables
- figure on next page
16. Path-Analytic Model
[Figure: path diagram with nodes Predictor (IV), Mediator, and Criterion (DV)]
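A path model with these three variables can be estimated with a pair of regressions. The sketch below assumes the usual mediation layout (Predictor to Mediator, Mediator to Criterion, plus a direct Predictor to Criterion path), which the extracted slide text does not confirm. The simulated data, the path values 0.5, 0.4, and 0.2, and the helper `ols_slopes` are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)   # fixed seed so the sketch is repeatable
n = 2_000                        # hypothetical sample size

predictor = rng.normal(size=n)
mediator = 0.5 * predictor + rng.normal(size=n)                      # path a
criterion = 0.4 * mediator + 0.2 * predictor + rng.normal(size=n)    # paths b, c'


def ols_slopes(y, *xs):
    """Least-squares slopes of y on xs; an intercept is fitted, then dropped."""
    design = np.column_stack([np.ones_like(y), *xs])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coefs[1:]


(a,) = ols_slopes(mediator, predictor)              # Predictor -> Mediator
b, c_direct = ols_slopes(criterion, mediator, predictor)
(c_total,) = ols_slopes(criterion, predictor)       # total effect
indirect = a * b                                    # effect through the mediator

print(f"a = {a:.2f}, b = {b:.2f}, direct = {c_direct:.2f}")
print(f"indirect = {indirect:.2f}, total = {c_total:.2f}")
```

In least-squares path models of this form, the total effect equals the direct effect plus the indirect effect, which gives a quick check on the estimates.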
17. Significance of Group Differences
- You've had some of these previously
- One-way ANOVA
- One-way ANCOVA
- Factorial AN(C)OVA
- Hotelling's T²
- One-way MAN(C)OVA
- Factorial MAN(C)OVA
- Add Repeated Measures to any of these
18. Prediction of Group Membership
- Predicting group membership (DV) from a set of variables (IVs)
- Types
- Discriminant function analysis
- IVs are continuous
- same as MANOVA but the variables have switched sides
- Logit analysis
- predictors (IVs) are discrete
- Logistic regression
- predictors are a mix of continuous and discrete
19. Structure
- What latent variable(s) underlie our observed variables?
- Types
- Principal Components Analysis (PCA)
- exploratory analysis for data reduction
- transforms correlations of observed variables into components
- (Exploratory) Factor Analysis (EFA)
- more theoretical
- also used for data reduction
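The step of transforming correlations into components can be sketched as an eigendecomposition of a correlation matrix. The example below reuses the Depression, Coping, and Alcohol Use correlation matrix from the earlier slide; it is a bare-bones PCA sketch without rotation or a rule for how many components to keep, and the variable names are illustrative.

```python
import numpy as np

# Correlation matrix from the slides (Depression, Coping, Alcohol Use).
R = np.array([
    [1.00, 0.30, 0.15],
    [0.30, 1.00, 0.50],
    [0.15, 0.50, 1.00],
])

# eigh is intended for symmetric matrices; it returns eigenvalues in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(R)

# Reorder so the first component explains the most variance.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Proportion of total variance explained by each component.
explained = eigenvalues / eigenvalues.sum()

# Component loadings: correlations between variables and components.
loadings = eigenvectors * np.sqrt(eigenvalues)

print("eigenvalues:", np.round(eigenvalues, 3))
print("proportion of variance:", np.round(explained, 3))
print("loadings:\n", np.round(loadings, 3))
```

Keeping all three components reproduces R exactly; data reduction comes from keeping only the first one or two.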
20. Structure continued
- Confirmatory Factor Analysis (CFA)
- a priori measurement model is tested
- direct relations between observed and latent variables are modeled
21. Structure continued
- Structural Equation Modeling (SEM)
- CFA plus a priori structural model is tested
- direct relations among latent variables are modeled
- see figure on next page, too big for here!
22. Structure continued: SEM Model
23. Structure continued
- Multiple Group (Multisample) CFA
- also called testing for invariance
- determines if the measurement model is equivalent
- across groups
- across items, scales, ...
- Multiple Group SEM
- determines if structural model is equivalent
- across groups!
24. Time Course of Events
- Two types
- Survival/Failure Analysis
- How long does it take for something to happen (e.g., diagnosis of schizophrenia)?
- Compare groups or determine variables associated with time
- Longitudinal Models!!!!!
- Autoregressive, Cross-Lagged Models
- Latent Growth Curve Modeling
25. Nested Data Structures
- Hierarchical linear modeling
- Examples include
- repeated observations nested within individuals
- individuals nested within groups
- groups nested within communities
- communities nested within cultures
- We cannot treat data units at the lowest level as independent
26. Creating typologies
- Latent class analysis
- Creating groups of individuals based on responses to binary variables
- Latent profile analysis
- Creating groups of individuals based on responses to continuous variables
27. Multivariate statistics are not perfect?
- Garbage in, garbage out!
- there is no substitute for valid, reliable measures
- More variables means more ambiguity, more output, more...
- classification indices, factor rotation, where does it end?!
- More variables means we need more people
- 10 people per variable when assumptions are met
- 20-50 per variable when assumptions are not met