Summarizing Variation - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Summarizing Variation


1
Summarizing Variation
Michael C Neale PhD, Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University
2
Overview
  • Mean
  • Variance
  • Covariance
  • Not always necessary/desirable

3
Computing Mean
  • Formula: (Σ xi) / N
  • Can compute with (see the sketch below):
  • Pencil
  • Calculator
  • SAS
  • SPSS
  • Mx
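A minimal sketch of the mean formula above in plain Python (in place of SAS, SPSS or Mx); the scores are illustrative:

```python
# Mean as the sum of scores divided by the number of scores
x = [1.0, 2.0, 2.5, 4.5]      # illustrative scores
mean = sum(x) / len(x)        # (sum of xi) / N
print(mean)                   # 2.5
```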

4
One Coin toss
2 outcomes
[Bar chart: probability of each outcome, Heads and Tails, 0.5 each]
5
Two Coin toss
3 outcomes
[Bar chart: probability of each outcome: HH 0.25, HT/TH 0.5, TT 0.25]
6
Four Coin toss
5 outcomes
[Bar chart: probability of each outcome: HHHH, HHHT, HHTT, HTTT, TTTT (1/16, 4/16, 6/16, 4/16, 1/16)]
7
Ten Coin toss
11 outcomes
[Bar chart: probability of each outcome (number of heads) in ten tosses]
8
Pascal's Triangle
Frequency (rows of Pascal's triangle):
1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
1 5 10 10 5 1
1 6 15 20 15 6 1
1 7 21 35 35 21 7 1
Probability: divide each row by its total: 1/1, 1/2, 1/4, 1/8, 1/16, 1/32, 1/64, 1/128
Pascal's friend Chevalier de Mere 1654; Huygens 1657; Cardan 1501-1576
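A minimal sketch connecting Pascal's triangle to the coin-toss charts above: row n of the triangle gives the outcome frequencies, and dividing by 2^n gives the probabilities (Python, illustrative only):

```python
from math import comb

def coin_toss_distribution(n):
    """Probability of k heads in n fair coin tosses, k = 0..n."""
    return [comb(n, k) / 2 ** n for k in range(n + 1)]

print(coin_toss_distribution(2))   # [0.25, 0.5, 0.25] -> TT, HT/TH, HH
print(coin_toss_distribution(4))   # 5 outcomes with frequencies 1 4 6 4 1 over 16
```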
9
Fort Knox Toss

Infinite outcomes
[Chart: probability against the difference Heads − Tails; the distribution approaches the normal curve (Gauss 1827)]
10
Variance
  • Measure of Spread
  • Easily calculated
  • Individual differences

11
Average squared deviation
Normal distribution

[Figure: normal curve with a deviation di of score xi from the mean marked on the x-axis]
Variance = Σ di² / N
12
Measuring Variation
Weighs & Means
  • Absolute differences?
  • Squared differences?
  • Absolute cubed?
  • Squared squared?

13
Measuring Variation
Ways & Means
  • Squared differences

Fisher (1922): the squared-differences measure has minimum variance under the normal distribution
14
Covariance
  • Measure of association between two variables
  • Closely related to variance
  • Useful to partition variance

15
Deviations in two dimensions
[Scatterplot: data points plotted on x and y, each deviating from the mean of x and the mean of y]
16
Deviations in two dimensions
[Scatterplot: a single point's deviations dx and dy from the means of x and y]
17
Measuring Covariation
Area of a rectangle
  • A square, perimeter 4 (sides 1 × 1)
  • Area = 1 × 1 = 1
18
Measuring Covariation
Area of a rectangle
  • A skinny rectangle, perimeter 4 (sides .25 × 1.75)
  • Area = .25 × 1.75 = .4375
19
Measuring Covariation
Area of a rectangle
  • Points can contribute negatively
  • Area = −.25 × 1.75 = −.4375
20
Measuring Covariation
Covariance Formula
σxy = Σ (xi − x̄)(yi − ȳ) / (N − 1)
21
Correlation
  • Standardized covariance
  • Lies between -1 and 1

rxy = σxy / √(σx² σy²)
22
Summary
Formulae
x̄ = (Σ xi) / N
σx² = Σ (xi − x̄)² / (N − 1)
σxy = Σ (xi − x̄)(yi − ȳ) / (N − 1)
rxy = σxy / √(σx² σy²)
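A minimal sketch of the summary formulae in plain Python; the x and y values are illustrative, not data from the presentation:

```python
from math import sqrt

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.5, 3.5, 3.0, 5.0]
N = len(x)

mean_x, mean_y = sum(x) / N, sum(y) / N
var_x = sum((xi - mean_x) ** 2 for xi in x) / (N - 1)          # variance of x
var_y = sum((yi - mean_y) ** 2 for yi in y) / (N - 1)          # variance of y
cov_xy = sum((xi - mean_x) * (yi - mean_y)
             for xi, yi in zip(x, y)) / (N - 1)                # covariance
r_xy = cov_xy / sqrt(var_x * var_y)                            # correlation
print(mean_x, var_x, cov_xy, r_xy)
```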
23
Variance covariance matrix
Several variables
Var(X)    Cov(X,Y)  Cov(X,Z)
Cov(X,Y)  Var(Y)    Cov(Y,Z)
Cov(X,Z)  Cov(Y,Z)  Var(Z)
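A minimal sketch of the variance-covariance matrix for several variables using numpy; the rows of data are illustrative observations on X, Y and Z:

```python
import numpy as np

data = np.array([[1.0, 2.0, 0.5],
                 [2.0, 1.5, 1.0],
                 [3.0, 3.5, 1.5],
                 [4.0, 3.0, 2.5]])       # one row per observation
cov = np.cov(data, rowvar=False)         # 3x3: variances on the diagonal, covariances off it
print(cov)
```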
24
Conclusion
  • Means and covariances
  • Conceptual underpinning
  • Easy to compute
  • Can use raw data instead

25
Biometrical Model of QTL
[Line diagram: genotypic values −a, d and +a measured from the midpoint m]
26
Biometrical model for QTL
Diallelic locus A/a with p as frequency of a
27
Classical Twin Studies
Information and analysis
  • Summary statistics: rMZ, rDZ
  • Basic model: A + C + E
  • rMZ = A + C
  • rDZ = 0.5 A + C
  • Var = A + C + E
  • Solve the equations (see the sketch below)
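A minimal sketch of solving the three equations above for A, C and E; the twin correlations are illustrative values, not estimates from any study:

```python
def solve_ace(r_mz, r_dz, var=1.0):
    """Solve r_mz = A + C, r_dz = 0.5*A + C, var = A + C + E."""
    A = 2 * (r_mz - r_dz)
    C = r_mz - A          # equals 2*r_dz - r_mz
    E = var - r_mz        # equals var - A - C
    return A, C, E

print(solve_ace(r_mz=0.8, r_dz=0.5))   # hypothetical correlations -> approximately (0.6, 0.2, 0.2)
```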

28
Contributions to Variance
Single genetic locus
  • Additive QTL variance
  • VA = 2p(1 − p) [a − d(2p − 1)]²
  • Dominance QTL variance
  • VD = 4p²(1 − p)² d²
  • Total genetic variance due to the locus
  • VQ = VA + VD (see the sketch below)
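A minimal sketch of the single-locus variance components above, with p the frequency of allele a; the parameter values are illustrative:

```python
def qtl_variance(p, a, d):
    """Additive, dominance and total QTL variance for a diallelic locus."""
    va = 2 * p * (1 - p) * (a - d * (2 * p - 1)) ** 2   # VA
    vd = 4 * p ** 2 * (1 - p) ** 2 * d ** 2             # VD
    return va, vd, va + vd                              # VQ = VA + VD

print(qtl_variance(p=0.3, a=1.0, d=0.5))
```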

29
Origin of Expectations
Regression model
  • P = aA + cC + eE
  • Standardize A, C and E
  • VP = a² + c² + e²
  • Assumes A, C and E are independent

30
Path analysis
Elements of a path diagram
  • Two sorts of variable
  • Observed, in boxes
  • Latent, in circles
  • Two sorts of path
  • Causal (regression), one-headed
  • Correlational, two-headed

31
Rules of path analysis
  • Trace path chains between variables
  • Chains are traced backwards, then forwards, with one change of direction at a double-headed arrow
  • Predicted covariance due to a chain is the
    product of its paths
  • Predicted total covariance is sum of covariance
    due to all possible chains

32
ACE model
MZ twins reared together
33
ACE model
DZ twins reared together
34
ACE model
DZ twins reared apart
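A minimal sketch of the expected twin covariance matrices implied by the ACE path diagrams, applying the tracing rules above with standardized A, C and E; the path values a, c and e are illustrative:

```python
import numpy as np

def ace_expected(a, c, e, r_a, r_c=1.0):
    """Expected 2x2 twin covariance matrix.
    r_a: genetic correlation (1 MZ, 0.5 DZ); r_c: 1 reared together, 0 apart."""
    var = a ** 2 + c ** 2 + e ** 2          # within-twin chains through A, C and E
    cov = r_a * a ** 2 + r_c * c ** 2       # cross-twin chains through A and C
    return np.array([[var, cov], [cov, var]])

a, c, e = 0.7, 0.5, 0.5                         # hypothetical path coefficients
print(ace_expected(a, c, e, r_a=1.0))           # MZ reared together
print(ace_expected(a, c, e, r_a=0.5))           # DZ reared together
print(ace_expected(a, c, e, r_a=0.5, r_c=0.0))  # DZ reared apart
```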
35
Model fitting
  • Takes care of replicate statistics
  • Maximum likelihood estimates
  • Confidence intervals on parameters
  • Overall fit of model
  • Comparison of nested models

36
Fitting models to covariance matrices
  • MZ covariances
  • 3 statistics: V1, CMZ, V2
  • DZ covariances
  • 3 statistics: V1, CDZ, V2
  • Parameters: a, c, e
  • df = nstat − npar = 6 − 3 = 3 (see the fitting sketch below)
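A minimal sketch (outside Mx) of fitting the ACE model to MZ and DZ covariance matrices by normal-theory maximum likelihood, using numpy and scipy; the observed matrices and sample sizes are illustrative, not real data:

```python
import numpy as np
from scipy.optimize import minimize

S_mz = np.array([[1.0, 0.8], [0.8, 1.0]])   # hypothetical observed MZ covariance matrix
S_dz = np.array([[1.0, 0.5], [0.5, 1.0]])   # hypothetical observed DZ covariance matrix
n_mz, n_dz, p = 500, 500, 2                 # hypothetical sample sizes; p variables per group

def expected_cov(a, c, e, r_a):
    """Expected twin covariance matrix under ACE; r_a = 1 (MZ) or 0.5 (DZ)."""
    v = a ** 2 + c ** 2 + e ** 2
    cov = r_a * a ** 2 + c ** 2
    return np.array([[v, cov], [cov, v]])

def ml_discrepancy(params):
    """Normal-theory ML fit function summed over the MZ and DZ groups."""
    a, c, e = params
    f = 0.0
    for S, n, r_a in [(S_mz, n_mz, 1.0), (S_dz, n_dz, 0.5)]:
        Sigma = expected_cov(a, c, e, r_a)
        sign, logdet = np.linalg.slogdet(Sigma)
        if sign <= 0:
            return np.inf                   # reject non-positive-definite Sigma
        _, logdet_S = np.linalg.slogdet(S)
        f += (n - 1) * (logdet - logdet_S + np.trace(S @ np.linalg.inv(Sigma)) - p)
    return f

res = minimize(ml_discrepancy, x0=[0.6, 0.4, 0.5], method="Nelder-Mead")
a, c, e = res.x
print("a^2, c^2, e^2:", a ** 2, c ** 2, e ** 2)
print("approximate chi-square vs saturated model:", res.fun, "on df = 6 - 3 = 3")
```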

37
Model fitting to covariance matrices
  • Inherently compares fit to saturated model
  • Difference in fit between the A+C+E model and the A+E model gives a likelihood ratio test, with df equal to the difference in the number of parameters

38
Confidence intervals
  • Two basic forms
  • covariance matrix of parameters
  • likelihood curve
  • Likelihood-based CIs have some nice properties: squaring the confidence limits on a gives the confidence limits on a² (Meeker & Escobar 1995; Neale & Miller, Behav Genet, 1997)

39
Multivariate analysis
  • Comorbidity
  • Partition into relevant components
  • Explicit models
  • One disorder or two or three
  • Longitudinal data analysis
  • Partition into new/old
  • Explicit models
  • Markov
  • Growth curves

40
Cholesky Decomposition
Not a model
  • Provides a way to model covariance matrices
  • Always fits perfectly (see the sketch below)
  • Doesn't predict much else
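A minimal sketch of why a Cholesky decomposition always fits perfectly: any positive definite covariance matrix is reproduced exactly as Σ = L·L'; the matrix below is illustrative:

```python
import numpy as np

Sigma = np.array([[2.0, 0.8, 0.4],
                  [0.8, 1.5, 0.6],
                  [0.4, 0.6, 1.0]])       # any positive definite covariance matrix
L = np.linalg.cholesky(Sigma)             # lower-triangular factor
print(np.allclose(L @ L.T, Sigma))        # True: the matrix is reproduced exactly
```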

41
Perverse Universe
[Path diagram: latent factors A and E each load 0.7 on the phenotype P]
NOT!
42
Perverse Universe
[Path diagram: A and E each load 0.7 on X; on Y, one loads 0.7 and the other loads −0.7]
r(X,Y) = 0: a problem for almost any multivariate method (see the sketch below)
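A minimal sketch of the cancellation above, using the tracing rules: the A and E contributions to cov(X,Y) are equal and opposite, so they sum to zero:

```python
# Loadings from the diagram: 0.7 and 0.7 on X, 0.7 and -0.7 on Y
a_x, e_x = 0.7, 0.7
a_y, e_y = 0.7, -0.7
cov_xy = a_x * a_y + e_x * e_y   # 0.49 - 0.49
print(cov_xy)                    # 0.0, although X and Y share all of their causes
```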
43
Analysis of raw data
  • Awesome treatment of missing values
  • More flexible modeling
  • Moderator variables
  • Correction for ascertainment
  • Modeling of means
  • QTL analysis

44
Technicolor Likelihood Function
For raw data in Mx
ln Li = fi ln [ Σj=1..m wj g(xi, μij, Σij) ]
xi — vector of observed scores on n subjects
μij — vector of predicted means
Σij — matrix of predicted covariances
μij and Σij are functions of the model parameters
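A minimal sketch of the raw-data likelihood above, taking g to be the multivariate normal density (an assumption here); the data vector, weights, means and covariance matrices are all illustrative:

```python
import numpy as np
from scipy.stats import multivariate_normal

def case_loglik(x, weights, means, covs, f=1.0):
    """ln Li = f * ln( sum_j w_j * g(x, mu_j, Sigma_j) ) for one observed vector x."""
    mix = sum(w * multivariate_normal.pdf(x, mean=m, cov=S)
              for w, m, S in zip(weights, means, covs))
    return f * np.log(mix)

# Example in the spirit of the sib-pair mixture two slides below: weights are
# hypothetical IBD probabilities and the predicted covariance varies with rQ.
x = np.array([1.2, 0.4])                  # hypothetical sib-pair scores
weights = [0.2, 0.5, 0.3]                 # hypothetical p(IBD=2), p(IBD=1), p(IBD=0)
means = [np.zeros(2)] * 3
covs = [np.array([[1.0, 0.3 + 0.4 * rQ],
                  [0.3 + 0.4 * rQ, 1.0]]) for rQ in (1.0, 0.5, 0.0)]
print(case_loglik(x, weights, means, covs))
```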
45
Pihat Linkage Model for Siblings
Each sib pair i has a different COVARIANCE
46
Mixture distribution model
Each sib pair i has a different set of WEIGHTS
Three models: rQ = 1, rQ = .5, rQ = 0
weightj × likelihood under model j:
  p(IBD=2) × P(LDL1, LDL2 | rQ = 1)
  p(IBD=1) × P(LDL1, LDL2 | rQ = .5)
  p(IBD=0) × P(LDL1, LDL2 | rQ = 0)
Total likelihood: the weighted likelihoods are summed within a pair and multiplied across sib pairs
47
Conclusion
  • Model fitting has a number of advantages
  • Raw data can be analysed with greater flexibility
  • Not limited to continuous normally distributed
    variables

48
Conclusion II
  • Data analysis requires creative application of
    methods
  • Canned analyses are of limited use
  • Try to answer the question!