Multivariate Normality presentation

Title: Multivariate Normality

1
Multivariate Normality
2
Underlying multivariate analyses is the
assumption of multivariate normality. This
assumption extends the idea of bivariate
normality to more than two dimensions. Under
bivariate normality, the conditional distribution
of one variable is normal at every value of the
other variable.
3
Bivariate normality can exist even when the
variables are strongly correlated.
4
In univariate statistics, the normality
assumption underlies significance testing. It is
with reference to sampling from some theoretical
distribution that we can make claims about the
likelihood of results occurring by chance or
under the null hypothesis. Similarly, the
establishment of confidence intervals depends on
distributional assumptions.
5
Many multivariate procedures rely on maximum
likelihood estimation. The importance of the
normality assumption is easy to demonstrate there
as well. In maximum likelihood, the parameter
estimates maximize the probability of the data.
The maximum likelihood estimates of the mean and
variance are the values that maximize the
following:
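The formula shown on this slide did not survive in the transcript. As a hedged reconstruction, assuming n independent observations x_1, ..., x_n drawn from a normal distribution with mean \mu and variance \sigma^2, the quantity being maximized would be the normal likelihood:

```latex
L(\mu, \sigma^2) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}}
  \exp\!\left( -\frac{(x_i - \mu)^2}{2\sigma^2} \right)
```

Under that assumption the maximizing values are the sample mean, \hat{\mu} = \bar{x}, and the variance computed with divisor n, \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2.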
6
Note the explicit assumption that the data are
normally distributed. If that assumption is in
error, then the normal probability density
function will not provide an optimal solution to
the problem.
7
Provided the data are normally distributed, the
maximum likelihood estimates for μ and σ make the
obtained data more likely than any other
parameter estimates. The estimation process also
produces standard errors, making hypothesis tests
possible as well. But the validity of these
hypothesis tests rests on the validity of the
normality assumption.
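The idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are simulated (they are not the presentation's data), and the function names normal_mle and normal_loglik are made up for this example.

```python
import numpy as np

def normal_mle(x):
    """Closed-form maximum likelihood estimates for a normal sample."""
    x = np.asarray(x, dtype=float)
    n = x.size
    mu_hat = x.mean()
    # The ML estimate of the variance divides by n, not n - 1.
    sigma_hat = np.sqrt(np.mean((x - mu_hat) ** 2))
    # Large-sample standard error of the estimated mean.
    se_mu = sigma_hat / np.sqrt(n)
    return mu_hat, sigma_hat, se_mu

def normal_loglik(x, mu, sigma):
    """Log-likelihood of the sample under a normal(mu, sigma^2) model."""
    x = np.asarray(x, dtype=float)
    return np.sum(-0.5 * np.log(2 * np.pi * sigma ** 2)
                  - (x - mu) ** 2 / (2 * sigma ** 2))

# Simulated sample standing in for real data (an assumption of this sketch).
rng = np.random.default_rng(0)
sample = rng.normal(loc=50.0, scale=10.0, size=200)
mu_hat, sigma_hat, se_mu = normal_mle(sample)
```

Evaluating normal_loglik at any other pair of values for the mean and standard deviation gives a smaller log-likelihood than at (mu_hat, sigma_hat), which is what "more likely than any other parameter estimates" means here.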
8
The approach can be extended to multivariate data
as well. We could seek the maximum likelihood
estimates for a bivariate normal distribution.
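The slide's formula is not in the transcript. For reference, the standard bivariate normal density, written here with assumed symbols \mu_x, \mu_y, \sigma_x, \sigma_y for the means and standard deviations and \rho for the correlation, is:

```latex
f(x, y) = \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-\rho^2}}
  \exp\!\left( -\frac{1}{2(1-\rho^2)} \left[
    \frac{(x-\mu_x)^2}{\sigma_x^2}
    - \frac{2\rho\,(x-\mu_x)(y-\mu_y)}{\sigma_x\sigma_y}
    + \frac{(y-\mu_y)^2}{\sigma_y^2}
  \right] \right)
```

The likelihood is the product of this density over the observed pairs, and the maximum likelihood estimates are the five parameter values that maximize it.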
9
Three bivariate normal distributions, varying only
in the value of r. The validity of estimates of r
relies on the validity of the assumption of
bivariate normality.
10
The maximum likelihood idea is easily extended to
more than two variables, and depending on the
multivariate problem, large numbers of parameters
may be estimated. Underlying the estimation,
however, is the assumption of multivariate
normality.
11
Assessing univariate normality and bivariate
normality is reasonably easy, in large part
because they can be inspected visually.
12

Assessing multivariate normality is a bit
trickier. When multivariate normality holds:
All marginal distributions will be normal.
All pairs of variables will be bivariate normal.
All linear combinations will be normal.
All pairs of linear combinations will be
bivariate normal.
Squared distances from the population centroid
will be chi-square distributed with k (k = number
of variables) degrees of freedom.

Violating any of these is a violation of
multivariate normality.
13
The example data come from a 4 x 4 design: 4
groups, each measured on 4 outcome measures.
14
(No Transcript)
15
(No Transcript)
16
(No Transcript)
17
All marginal distributions will be normal
18
(No Transcript)
19
(No Transcript)
20
(No Transcript)
21
(No Transcript)
22
(No Transcript)
23
(No Transcript)
24
(No Transcript)
25
All pairs of variables will be bivariate normal
26
Looks pretty bad so far, but what did we miss?
We forgot to remove the variability due to
groups. We need to examine the residuals.
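Removing the group variability amounts to subtracting each group's mean from its members' scores. A minimal sketch, assuming simulated data with 4 groups of 50 cases on 4 measures (the group sizes and all values are assumptions, and group_residuals is a name invented for this example):

```python
import numpy as np

def group_residuals(scores, groups):
    """Subtract each group's mean vector from the scores of its members."""
    scores = np.asarray(scores, dtype=float)
    groups = np.asarray(groups)
    residuals = np.empty_like(scores)
    for g in np.unique(groups):
        members = groups == g
        residuals[members] = scores[members] - scores[members].mean(axis=0)
    return residuals

# Simulated stand-in for a 4-group, 4-measure design (not the presentation's data).
rng = np.random.default_rng(1)
groups = np.repeat(np.arange(4), 50)
group_means = rng.normal(0.0, 3.0, size=(4, 4))
scores = group_means[groups] + rng.normal(0.0, 1.0, size=(200, 4))
residuals = group_residuals(scores, groups)
```

The normality checks are then run on residuals rather than on scores, so that differences among group means do not masquerade as non-normality.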
27
(No Transcript)
28
(No Transcript)
29
The consequences of forgetting about the group
variability can be considerable.
30
Original
Residuals
31
Original
Residuals
32
(No Transcript)
33
(No Transcript)
34
(No Transcript)
35
Original
RELIABILITY ANALYSIS - SCALE (ALPHA)
Reliability Coefficients
N of Cases = 200.0    N of Items = 4
Alpha = .5200

Residuals
RELIABILITY ANALYSIS - SCALE (ALPHA)
Reliability Coefficients
N of Cases = 200.0    N of Items = 4
Alpha = .7326
36
All marginal distributions will be normal
37
(No Transcript)
38
(No Transcript)
39
(No Transcript)
40
(No Transcript)
41
All pairs of variables will be bivariate normal
42
All linear combinations will be normal

It is not practical to test all linear
combinations; there are an infinite number of
them. But testing a small number of commonly
used linear combinations is important. The most
commonly tested are:
Sum of all measures
Pair-wise differences
Principal components
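The three kinds of linear combinations listed above can be built and screened as follows. This is a sketch under assumptions: the data are simulated, the Shapiro-Wilk test from SciPy is used as one possible univariate normality test (the presentation does not say which test was used), and linear_combinations is a hypothetical helper name.

```python
import itertools
import numpy as np
from scipy import stats

def linear_combinations(data):
    """Return a dict of commonly checked linear combinations of the columns."""
    data = np.asarray(data, dtype=float)
    k = data.shape[1]
    combos = {"sum": data.sum(axis=1)}
    # Pair-wise differences between measures.
    for i, j in itertools.combinations(range(k), 2):
        combos[f"diff_{i}_{j}"] = data[:, i] - data[:, j]
    # Principal component scores from the eigenvectors of the covariance matrix.
    centered = data - data.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(centered, rowvar=False))
    order = np.argsort(eigvals)[::-1]  # largest variance first
    pc_scores = centered @ eigvecs[:, order]
    for c in range(k):
        combos[f"pc_{c + 1}"] = pc_scores[:, c]
    return combos

# Simulated multivariate normal data standing in for the residuals.
rng = np.random.default_rng(2)
data = rng.multivariate_normal(np.zeros(4), np.eye(4) * 0.5 + 0.5, size=200)
combos = linear_combinations(data)
# Shapiro-Wilk p-value for each combination; small values flag non-normality.
p_values = {name: stats.shapiro(values)[1] for name, values in combos.items()}
```

With 4 measures this gives 11 combinations (1 sum, 6 differences, 4 components). Because several tests are run at once, some adjustment for multiple testing may be worth considering.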

43
(No Transcript)
44
(No Transcript)
45
(No Transcript)
46
(No Transcript)
47
(No Transcript)
48
(No Transcript)
49
(No Transcript)
50
(No Transcript)
51
(No Transcript)
52
(No Transcript)
53
(No Transcript)
54
(No Transcript)
55
All pairs of linear combinations will be
bivariate normal
Here too, the tests must be restricted, usually to
pairs of the linear combinations tested in the
previous step.
56
(No Transcript)
57
(No Transcript)
58
Squared distances from the population centroid
will be chi-square distributed with k (k = number
of variables) degrees of freedom
This requirement tests the normality of the
multivariate variance. The trick is to make sure
that group variability does not contaminate the
calculation of the Mahalanobis distances.
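One way to keep group variability out of the distances is to center each case on its own group's centroid and use the pooled within-group covariance matrix. The sketch below assumes simulated data and uses a Kolmogorov-Smirnov comparison as one possible check (the presentation does not specify one); squared_mahalanobis_within is a name invented here. Because the centroids and covariance are estimated rather than known, the chi-square reference is only approximate.

```python
import numpy as np
from scipy import stats

def squared_mahalanobis_within(scores, groups):
    """Squared Mahalanobis distances of group-centered scores,
    using the pooled within-group covariance matrix."""
    scores = np.asarray(scores, dtype=float)
    groups = np.asarray(groups)
    labels = np.unique(groups)
    residuals = np.empty_like(scores)
    for g in labels:
        members = groups == g
        residuals[members] = scores[members] - scores[members].mean(axis=0)
    n = scores.shape[0]
    pooled_cov = residuals.T @ residuals / (n - labels.size)
    inv_cov = np.linalg.inv(pooled_cov)
    # d_i^2 = r_i' S^{-1} r_i for each residual row r_i.
    return np.einsum("ij,jk,ik->i", residuals, inv_cov, residuals)

# Simulated 4-group, 4-measure data (an assumption of this sketch).
rng = np.random.default_rng(3)
groups = np.repeat(np.arange(4), 50)
scores = rng.normal(0.0, 3.0, size=(4, 4))[groups] + rng.normal(size=(200, 4))
d2 = squared_mahalanobis_within(scores, groups)
k = scores.shape[1]
# Compare the distances with the chi-square distribution on k degrees of freedom.
ks_stat, ks_p = stats.kstest(d2, "chi2", args=(k,))
```

A plot of the sorted distances against chi-square quantiles is a common visual companion to this kind of test.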
59
(No Transcript)
60
(No Transcript)
61
(No Transcript)
62
(No Transcript)
63
(No Transcript)
64
One additional test of multivariate normality can
be obtained from LISREL in the PRELIS
pre-processor. Multivariate measures of skew and
kurtosis, developed by Mardia, can be used as an
additional index of multivariate normality.
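For readers without PRELIS, Mardia's two coefficients can be computed directly. The sketch below follows the usual textbook definitions and large-sample test statistics; it is not a reproduction of PRELIS output, whose exact computations may differ, and the function name mardia and the simulated data are assumptions.

```python
import numpy as np
from scipy import stats

def mardia(data):
    """Mardia's multivariate skewness and kurtosis with large-sample tests."""
    data = np.asarray(data, dtype=float)
    n, p = data.shape
    centered = data - data.mean(axis=0)
    cov = centered.T @ centered / n  # covariance with divisor n
    # g[i, j] = (x_i - mean)' S^{-1} (x_j - mean)
    g = centered @ np.linalg.inv(cov) @ centered.T
    skew = np.sum(g ** 3) / n ** 2
    kurt = np.mean(np.diag(g) ** 2)
    # Skewness: n * b1 / 6 is referred to chi-square with p(p+1)(p+2)/6 df.
    skew_stat = n * skew / 6.0
    skew_df = p * (p + 1) * (p + 2) / 6.0
    skew_p = stats.chi2.sf(skew_stat, skew_df)
    # Kurtosis: standardized against its expected value p(p+2) under normality.
    kurt_z = (kurt - p * (p + 2)) / np.sqrt(8.0 * p * (p + 2) / n)
    kurt_p = 2.0 * stats.norm.sf(abs(kurt_z))
    return {"skew": skew, "skew_p": skew_p, "kurt": kurt, "kurt_p": kurt_p}

# Simulated multivariate normal data (not the presentation's data).
rng = np.random.default_rng(4)
data = rng.multivariate_normal(np.zeros(4), np.eye(4), size=500)
result = mardia(data)
```

With grouped data, the coefficients would be computed on the residuals rather than on the raw scores, for the same reason as in the earlier steps.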
65
(No Transcript)
66
(No Transcript)
67
(No Transcript)
68
(No Transcript)
69
(No Transcript)
70
(No Transcript)
71
(No Transcript)
72
(No Transcript)
73
(No Transcript)
74

Violations of multivariate normality can be
handled in multiple ways:
Transformations
Robust methods
Bootstrapping
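As an illustration of the last option, a percentile bootstrap builds a confidence interval by resampling cases instead of relying on a normal-theory formula. This sketch applies it to a correlation; the statistic, the simulated data, and the name bootstrap_corr_ci are all assumptions chosen for the example.

```python
import numpy as np

def bootstrap_corr_ci(x, y, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for a Pearson correlation."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    n = x.size
    boot_r = np.empty(n_boot)
    for b in range(n_boot):
        # Resample cases (pairs) with replacement.
        idx = rng.integers(0, n, size=n)
        boot_r[b] = np.corrcoef(x[idx], y[idx])[0, 1]
    lower, upper = np.quantile(boot_r, [alpha / 2, 1 - alpha / 2])
    return lower, upper

# Simulated, deliberately skewed data (not from the presentation).
rng = np.random.default_rng(5)
x = rng.exponential(size=200)
y = 0.5 * x + rng.exponential(size=200)
lower, upper = bootstrap_corr_ci(x, y)
```

The same resampling loop works for other statistics by swapping out the line that computes the correlation.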
