Title: Vine Copula
1Vine - Copula
- Presenter: Andrew Wey
- Advisor: Jong-Min Kim
2Outline
- Copulas
- Vines
- Example of Vine-Copula
- UNICORN
- Principal Component Analysis
- Conclusion
- Acknowledgements
3Copula Overview
- First introduced in 1959 by Abe Sklar.
- Has since played an important role in areas of
probability and statistics, especially in
dependence studies.
- Most easily viewed as connecting two univariate
marginal distributions to their joint distribution.
4Copula Definition
- A copula is a function $C$ from $I^2$ to $I$ (where $I$ is
the interval $[0,1]$) that possesses the following
properties:
- For all $u, v$ in $I$: $C(0,v) = C(u,0) = 0$, $C(u,1) = u$, and $C(1,v) = v$.
- For every $u_1, u_2, v_1, v_2$ in $I$ such that $u_1 \le u_2$
and $v_1 \le v_2$: $C(u_2,v_2) - C(u_2,v_1) - C(u_1,v_2) + C(u_1,v_1) \ge 0$.
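The two defining properties can be checked numerically on a grid. The following is a minimal sketch, not a proof: the function names and the grid resolution are illustrative assumptions, and passing the check on a finite grid is necessary but not sufficient for a function to be a copula.

```python
from itertools import product


def product_copula(u, v):
    """Independence copula C(u, v) = u * v."""
    return u * v


def is_copula_on_grid(C, steps=20, tol=1e-12):
    """Check the boundary and 2-increasing conditions on a uniform grid."""
    grid = [k / steps for k in range(steps + 1)]
    # Boundary conditions: C(0, v) = C(u, 0) = 0, C(u, 1) = u, C(1, v) = v.
    for t in grid:
        if abs(C(0.0, t)) > tol or abs(C(t, 0.0)) > tol:
            return False
        if abs(C(t, 1.0) - t) > tol or abs(C(1.0, t) - t) > tol:
            return False
    # 2-increasing: the C-volume of every grid cell is non-negative.
    # (Any grid rectangle is a union of cells, so checking cells suffices.)
    for i, j in product(range(steps), repeat=2):
        u1, u2, v1, v2 = grid[i], grid[i + 1], grid[j], grid[j + 1]
        volume = C(u2, v2) - C(u2, v1) - C(u1, v2) + C(u1, v1)
        if volume < -tol:
            return False
    return True


print(is_copula_on_grid(product_copula))                  # product copula
print(is_copula_on_grid(min))                             # C(u, v) = min(u, v)
print(is_copula_on_grid(lambda u, v: (u * v) ** 0.5))     # violates C(u, 1) = u
```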
5Sklar's Theorem
- Let $H$ be a bivariate distribution function with
marginal distribution functions $F$ and $G$. Then
there exists a copula $C$ such that
$H(x,y) = C(F(x),G(y))$. Conversely, for any univariate
distribution functions $F$ and $G$ and any copula $C$,
the function $H$ defined above is a bivariate
distribution function with marginal distributions
$F$ and $G$. Furthermore, if $F$ and $G$ are continuous,
$C$ is unique.
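The converse direction of the theorem can be illustrated by building a joint distribution function from two marginals and a copula. This is a sketch under stated assumptions: the exponential marginals, the Clayton copula, and the parameter values are illustrative choices and do not come from the presentation.

```python
import math


def exp_cdf(rate):
    """CDF of an exponential distribution with the given rate (assumed marginal)."""
    return lambda x: 1.0 - math.exp(-rate * x) if x > 0 else 0.0


def clayton_copula(theta):
    """Clayton copula with parameter theta > 0 (an illustrative choice)."""
    def C(u, v):
        if u <= 0.0 or v <= 0.0:
            return 0.0
        return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)
    return C


def joint_cdf(C, F, G):
    """Sklar's construction: H(x, y) = C(F(x), G(y))."""
    return lambda x, y: C(F(x), G(y))


F = exp_cdf(1.0)
G = exp_cdf(0.5)
H = joint_cdf(clayton_copula(2.0), F, G)

# Letting one argument grow very large recovers the other marginal.
print(H(1.0, 1e9), F(1.0))
print(H(1e9, 2.0), G(2.0))
```

Swapping in a different copula while keeping `F` and `G` fixed changes the dependence between the two variables but leaves both marginals unchanged.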
6Dependence Properties and Measures
- Properties or measures that are invariant under
strictly increasing transformations of the random
variables.
- Positive (and negative) quadrant dependence is an
example of a particular dependence property.
- Concordance: large values of $X$ tend to occur
with large values of $Y$. For two vectors $(X_i,Y_f)$ and $(X_h,Y_g)$,
the probabilities are as follows.
- Probability of Concordance
- $P[(X_i - X_h)(Y_f - Y_g) > 0]$
- Probability of Discordance
- $P[(X_i - X_h)(Y_f - Y_g) < 0]$
7Specific Dependence Measures
- Kendall's tau: the probability of concordance minus
the probability of discordance of the two vectors
$(X_1,Y_1)$ and $(X_2,Y_2)$.
- Spearman's rho: three times the difference between the probability of
concordance and the probability of discordance
of the two vectors $(X_1,Y_1)$ and $(X_2,Y_3)$, where $(X_1,Y_1)$, $(X_2,Y_2)$, and $(X_3,Y_3)$
are independent copies of $(X,Y)$.
- Most families of copulas can represent only a limited
range of dependence.
Rob and Big Discordant?
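A sample version of Kendall's tau follows directly from the concordance and discordance definitions: count the pairs of observations of each kind and take the difference as a fraction of all pairs. The sketch below assumes tied pairs count as neither concordant nor discordant; the function name and the data are illustrative.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Sample Kendall's tau: (concordant - discordant) / number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        s = (x1 - x2) * (y1 - y2)
        if s > 0:
            concordant += 1   # both coordinates move in the same direction
        elif s < 0:
            discordant += 1   # coordinates move in opposite directions
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


print(kendall_tau([1, 2, 3, 4], [10, 20, 30, 40]))   # perfectly concordant
print(kendall_tau([1, 2, 3, 4], [40, 30, 20, 10]))   # perfectly discordant
print(kendall_tau([1, 2, 3], [1, 3, 2]))             # two concordant, one discordant
```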
8Fréchet Bounds for Copula
- A bivariate distribution $H$ with marginals $F$ and $G$
must satisfy the following inequality:
- $\max(F(x) + G(y) - 1, 0) \le H(x,y) \le \min(F(x),G(y))$
- Applying the Fréchet bounds to copulas, we get the
following inequality for copulas:
- $\max(u + v - 1, 0) \le C(u,v) \le \min(u,v)$
9$C(u,v) = \max(u + v - 1, 0)$
$C(u,v) = \min(u,v)$
$C(u,v) = uv$
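The three functions on this slide are the lower Fréchet bound, the upper Fréchet bound, and the product copula. A quick numerical check that a candidate function lies between the two bounds could look like the sketch below; the function names and grid size are illustrative assumptions.

```python
def lower_bound(u, v):
    """Lower Fréchet bound max(u + v - 1, 0)."""
    return max(u + v - 1.0, 0.0)


def upper_bound(u, v):
    """Upper Fréchet bound min(u, v)."""
    return min(u, v)


def within_frechet_bounds(C, steps=50, tol=1e-12):
    """Check lower_bound <= C <= upper_bound at every point of a uniform grid."""
    grid = [k / steps for k in range(steps + 1)]
    return all(
        lower_bound(u, v) - tol <= C(u, v) <= upper_bound(u, v) + tol
        for u in grid
        for v in grid
    )


print(within_frechet_bounds(lambda u, v: u * v))   # product copula
print(within_frechet_bounds(lambda u, v: 1.0))     # constant function, not a copula
```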
10Vines!
- Layered acyclic trees.
- Provide a graphical representation of the
conditional specifications being made on a joint
distribution.
- The multivariate density is represented by
the product of the marginal densities and the pair-copula
densities attached to the edges of the vine.
- We are only interested in a subset of vines
called regular vines, of which we consider two special cases:
- D-vine
- Canonical vine
11Formal Definition of Vines
- $V = (T_1, \ldots, T_m)$.
- $T_1$ is a tree with nodes $N_1 = \{1, \ldots, n\}$ and a set of
edges denoted $E_1$.
- For $i = 2, \ldots, m$, $T_i$ is a tree with nodes $N_i$, which
form a subset of $N_1 \cup E_1 \cup E_2 \cup \cdots \cup E_{i-1}$, and edge
set $E_i$.
- A vine $V$ is a regular vine on $n$ elements if:
- $m = n$.
- $T_i$ is a connected tree with edge set $E_i$ and node
set $N_i = E_{i-1}$, with $|N_i| = n - (i - 1)$ for $i = 1, \ldots, n$,
where $|N_i|$ is the cardinality of the set $N_i$.
- The proximity condition holds: for $i = 2, \ldots, n - 1$,
if $a = \{a_1,a_2\}$ and $b = \{b_1,b_2\}$ are two nodes in
$N_i$ connected by an edge ($a_1,a_2,b_1,b_2 \in N_{i-1}$),
then $|a \cap b| = 1$.
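The regularity conditions translate into a short check on nested edge sets. The sketch below is illustrative only: it represents each edge as a `frozenset` of its two end nodes, so the nodes of each tree after the first are the edges of the tree before it, and it lists only the n - 1 trees that have edges (the final single-node tree is left implicit). The function names and this representation are assumptions.

```python
def is_tree(nodes, edges):
    """A graph is a tree iff it is connected and has len(nodes) - 1 edges."""
    if not nodes or len(edges) != len(nodes) - 1:
        return False
    seen, stack = set(), [next(iter(nodes))]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        for edge in edges:
            if node in edge:
                stack.extend(edge - {node})
    return seen == nodes


def is_regular_vine(n, edge_sets):
    """edge_sets[i] is the set of edges of tree T_(i+1), each a frozenset of two nodes."""
    if len(edge_sets) != n - 1:
        return False
    nodes = set(range(1, n + 1))
    for level, edges in enumerate(edge_sets, start=1):
        if any(len(edge) != 2 for edge in edges) or not is_tree(nodes, edges):
            return False
        # Proximity: two nodes joined in T_i (i >= 2) share exactly one element.
        if level >= 2 and any(len(a & b) != 1 for a, b in map(tuple, edges)):
            return False
        nodes = set(edges)  # node set of the next tree is this tree's edge set
    return True


# Four-element vine with base tree 1 - 2 - 3 - 4.
e12, e23, e34 = frozenset({1, 2}), frozenset({2, 3}), frozenset({3, 4})
t1 = {e12, e23, e34}
t2 = {frozenset({e12, e23}), frozenset({e23, e34})}
t3 = {frozenset(t2)}
print(is_regular_vine(4, [t1, t2, t3]))

# Joining edges 12 and 34 in the second tree violates the proximity condition.
t2_bad = {frozenset({e12, e34}), frozenset({e12, e23})}
print(is_regular_vine(4, [t1, t2_bad, {frozenset(t2_bad)}]))
```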
12D-Vine
- If each node in the base tree has a degree of at
most two, then the vine is a d-vine.
Base Tree: nodes 1, 2, 3, 4; edges 12, 23, 34
Second Tree: nodes 12, 23, 34; edges 13|2, 24|3
Third Tree: nodes 13|2, 24|3; edge 14|23
13D-Vine Joint Distribution
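For the four-variable D-vine with base tree 1 - 2 - 3 - 4, the standard pair-copula decomposition gives a joint density of the following form. The notation is an assumption made here for illustration: $f_k$ and $F_k$ are the marginal densities and distribution functions, $c$ denotes a pair-copula density, and $F(\cdot|\cdot)$ a conditional distribution function.

```latex
\begin{aligned}
f(x_1,x_2,x_3,x_4) ={}& \prod_{k=1}^{4} f_k(x_k) \\
&\times c_{12}\bigl(F_1(x_1),F_2(x_2)\bigr)\,
        c_{23}\bigl(F_2(x_2),F_3(x_3)\bigr)\,
        c_{34}\bigl(F_3(x_3),F_4(x_4)\bigr) \\
&\times c_{13|2}\bigl(F(x_1|x_2),F(x_3|x_2)\bigr)\,
        c_{24|3}\bigl(F(x_2|x_3),F(x_4|x_3)\bigr) \\
&\times c_{14|23}\bigl(F(x_1|x_2,x_3),F(x_4|x_2,x_3)\bigr)
\end{aligned}
```

Each copula factor corresponds to one edge of the D-vine: three edges in the base tree, two in the second tree, and one in the third.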
14Canonical Vine
- If each tree $T_i$ has a unique node of degree $n - i$,
then the vine is a canonical vine.
Base Tree: nodes 1, 2, 3, 4; edges 12, 13, 14
Second Tree: nodes 12, 13, 14; edges 23|1, 24|1
Third Tree: nodes 23|1, 24|1; edge 34|12
15Canonical Vine Joint Distribution
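For the four-variable canonical vine with variable 1 as the pole of the base tree and edge 12 as the pole of the second tree, the standard pair-copula decomposition gives a joint density of the following form. The notation is an assumption made here for illustration: $f_k$ and $F_k$ are the marginal densities and distribution functions, $c$ denotes a pair-copula density, and $F(\cdot|\cdot)$ a conditional distribution function.

```latex
\begin{aligned}
f(x_1,x_2,x_3,x_4) ={}& \prod_{k=1}^{4} f_k(x_k) \\
&\times c_{12}\bigl(F_1(x_1),F_2(x_2)\bigr)\,
        c_{13}\bigl(F_1(x_1),F_3(x_3)\bigr)\,
        c_{14}\bigl(F_1(x_1),F_4(x_4)\bigr) \\
&\times c_{23|1}\bigl(F(x_2|x_1),F(x_3|x_1)\bigr)\,
        c_{24|1}\bigl(F(x_2|x_1),F(x_4|x_1)\bigr) \\
&\times c_{34|12}\bigl(F(x_3|x_1,x_2),F(x_4|x_1,x_2)\bigr)
\end{aligned}
```

Every copula factor in a given tree shares that tree's pole, which is what distinguishes the canonical vine from the D-vine.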
16Analysis of Variation within Vine-Copula
- Two hypothetical situations in which data were
originally simulated (using UNICORN) from a
correlation matrix to form a baseline data set.
- We then used the baseline data set to determine
correlations for a D-vine and a canonical vine, and
simulated new data sets from the resulting D-vine and
canonical vine.
- Basic statistical inference and principal
component analysis were used to analyze the
variation within the D-vine and canonical vine.
17Simulation using UNICORN
[Flowchart: Correlation Matrix -> Baseline Data Set -> D-vine Specification and C-vine Specification -> D-vine Data Set and C-vine Data Set -> Analysis. Other stage labels: Determination of Correlations, Statistical Analysis.]
18UNICORN
- Uncertainty Analysis with Correlations
- Fairly intuitive and easy-to-use interface
- Used solely for simulating data sets in our
example
- Has interesting analysis tools, in particular
the cobweb plot
Cobweb Plot Example
19Quick Overview of Principal Component Analysis
- Typically used as a method for dimension
reduction
- Given a correlation/covariance matrix:
- Find the eigenvectors
- Find the eigenvalues
- The components of PCA are the eigenvectors ranked
from largest to smallest corresponding eigenvalue
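The steps above can be sketched in plain Python. This is an illustrative sketch, not the software used for the analysis: the eigen-decomposition uses the classical Jacobi rotation method for symmetric matrices, and the 3 x 3 correlation matrix is hypothetical.

```python
import math


def jacobi_eigen(matrix, max_sweeps=50, tol=1e-18):
    """Eigenvalues and eigenvectors (as columns) of a symmetric matrix."""
    n = len(matrix)
    a = [list(map(float, row)) for row in matrix]
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes the (p, q) entry.
                d = a[q][q] - a[p][p]
                sign = 1.0 if d >= 0 else -1.0
                theta = 0.5 * math.atan2(sign * 2.0 * a[p][q], sign * d)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q of A and V
                    a[k][p], a[k][q] = c * a[k][p] - s * a[k][q], s * a[k][p] + c * a[k][q]
                    v[k][p], v[k][q] = c * v[k][p] - s * v[k][q], s * v[k][p] + c * v[k][q]
                for k in range(n):  # rotate rows p and q of A
                    a[p][k], a[q][k] = c * a[p][k] - s * a[q][k], s * a[p][k] + c * a[q][k]
    return [a[i][i] for i in range(n)], v


def pca(corr):
    """Return (eigenvalue, eigenvector) pairs ranked by decreasing eigenvalue."""
    values, vectors = jacobi_eigen(corr)
    n = len(values)
    pairs = [(values[j], [vectors[i][j] for i in range(n)]) for j in range(n)]
    return sorted(pairs, key=lambda pair: pair[0], reverse=True)


# Hypothetical correlation matrix, for illustration only.
corr = [[1.0, 0.8, 0.3],
        [0.8, 1.0, 0.5],
        [0.3, 0.5, 1.0]]
components = pca(corr)
total = sum(value for value, _ in components)
for value, vector in components:
    # Share of total variance, followed by the component loadings.
    print(round(value / total, 3), [round(x, 3) for x in vector])
```

Dimension reduction then amounts to keeping only the leading components, those whose eigenvalues account for most of the total.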
20First Scenario: Light Bulb Lifetimes
- Four Variables
- Light 1
- Light 2
- Light 3
- Light 4
- C-vine Organization
- Base tree pole: Light 1
- Second tree pole: Light 2 given Light 1
21UNICORN Example
22Light Bulb Lifetimes Basic Statistics
23Light Bulb Lifetimes PCA Results
24Light Bulb Lifetimes PCA Biplots
25Second Scenario: Spending Analysis
- Four Variables
- Clothes
- Food
- Housing
- Entertainment
- D-vine Organization: Clothes, Food, Housing, Entertainment
- C-vine Organization
- Base tree pole: Housing
- Second tree pole: Clothes given Housing
26Spending Analysis Basic Statistics
27Spending Analysis PCA Results
28Spending Analysis PCA Biplots
29Conclusion
- Copulas: useful in dependence studies
- Vines: provide useful graphical interpretations
of the conditional specifications being made on a
multivariate distribution
- The examples gave mixed results when comparing
D-vines with canonical vines
30Acknowledgements
- Thanks to Jong-Min Kim, who has advised me through
this project.
- Special thanks to Peh Ng and Engin Sungur for
providing invaluable help and motivation
throughout my UMM career.
31References
- Roger B. Nelsen, An Introduction to Copulas, 2006
- Tim Bedford and Roger Cooke, Vines: A New
Graphical Model for Dependent Random Variables,
2002
- Dorota Kurowicka and Roger Cooke, Uncertainty
Analysis with High Dimensional Dependence
Modelling, 2006
- Roger B. Nelsen, Copulas, Characterization,
Correlation, and Counterexamples, 200
- Brian Everitt and Graham Dunn, Applied
Multivariate Data Analysis, Second Edition, 2001
- K. Aas, C. Czado, A. Frigessi, H. Bakken,
Pair-Copula Constructions of Multiple Dependence,
2006