Title: Learning with Tree-averaged Densities and Distributions
1. Learning with Tree-averaged Densities and Distributions
- Sergey Kirshner
- Alberta Ingenuity Centre for Machine Learning,
- Department of Computing Science,
- University of Alberta, Canada
NIPS 2007 Poster W12
December 5, 2007
2. Overview
- Want to fit density to complete multivariate data
- New density estimation model based on averaging over tree-dependence structures
- Distribution = univariate marginals + copula
- Bayesian averaging over tree-structured copulas
- Efficient parameter estimation for tree-averaged copulas
- Can solve problems with 10-30 dimensions
3. Most Popular Distribution
- Interpretable
- Closed under taking marginals
- Generalizes to multiple dimensions
- Models pairwise dependence
- Tractable
- 245 pages out of 691 from Continuous Multivariate
Distributions by Kotz, Balakrishnan, and Johnson
4. What If the Data Is NOT Gaussian?
5. Curse of Dimensionality
Bellman 57
$n^d$ cells
$V_{[-2,2]^d} = 0.9545^d$
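The two quantities on this slide can be tabulated with a short sketch. It assumes n bins per axis for the cell count and reads 0.9545 as the probability that a standard normal variable falls in [-2, 2]; the function names are illustrative and not from the talk.

```python
import math

def num_cells(n_bins: int, d: int) -> int:
    """Number of cells in a d-dimensional grid with n_bins bins per axis."""
    return n_bins ** d

def gaussian_mass_in_cube(d: int, half_width: float = 2.0) -> float:
    """Mass of a standard d-dimensional Gaussian (independent components)
    that falls inside the hypercube [-half_width, half_width]^d."""
    p_1d = math.erf(half_width / math.sqrt(2.0))  # P(|Z| <= half_width), Z ~ N(0, 1)
    return p_1d ** d

for d in (1, 2, 10, 30):
    print(d, num_cells(10, d), round(gaussian_mass_in_cube(d), 4))
```

The cell count grows exponentially with d while the mass inside the fixed cube shrinks, so a grid-based density estimate needs rapidly more data as the dimension increases.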
6. Avoiding the Curse, Step 1: Separating Univariate Marginals
univariate marginals: independent variables
multivariate dependence term: copula
7. Monotonic Transformation of the Variables
8. Copula
A copula C is a multivariate distribution (cdf) defined on the unit hypercube with uniform univariate marginals.
9. Sklar's Theorem
Sklar 59
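For reference, the theorem is commonly stated as below. The symbols F (joint cdf), F_i and p_i (marginal cdfs and densities), C (copula), and c (copula density) are notation assumed here, since the slide's formula did not survive extraction.

```latex
F(x_1,\dots,x_d) = C\bigl(F_1(x_1),\dots,F_d(x_d)\bigr),
\qquad
p(x_1,\dots,x_d) = c\bigl(F_1(x_1),\dots,F_d(x_d)\bigr)\,\prod_{i=1}^{d} p_i(x_i),
\qquad
c(u_1,\dots,u_d) = \frac{\partial^d C(u_1,\dots,u_d)}{\partial u_1 \cdots \partial u_d}.
```

When the marginals are continuous, the copula C is unique.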
10. Example: Bivariate Gaussian Copula
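A minimal sketch of the bivariate Gaussian copula density, obtained by dividing the bivariate normal density with correlation rho by the product of its standard normal marginals. The function name and the parameter name rho are assumptions for illustration.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for the inverse cdf

def gaussian_copula_density(u: float, v: float, rho: float) -> float:
    """Bivariate Gaussian copula density at (u, v) in (0, 1)^2, |rho| < 1."""
    x = _STD_NORMAL.inv_cdf(u)
    y = _STD_NORMAL.inv_cdf(v)
    one_minus_r2 = 1.0 - rho * rho
    exponent = -(rho * rho * (x * x + y * y) - 2.0 * rho * x * y) / (2.0 * one_minus_r2)
    return math.exp(exponent) / math.sqrt(one_minus_r2)

print(gaussian_copula_density(0.5, 0.5, 0.6))  # at the medians: 1 / sqrt(1 - rho^2)
print(gaussian_copula_density(0.3, 0.9, 0.0))  # rho = 0 gives the independence copula
```

With rho = 0 the density is 1 everywhere on the unit square, which is the independence case.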
11. Useful Properties of Copulas
- Preserves concordance between the variables
- Rank-based measure of dependence
- Preserves mutual information
- Can be viewed as a canonical form of a multivariate distribution for the purpose of estimating multivariate dependence
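The concordance property can be checked numerically. The sketch below assumes Kendall's tau as the rank-based measure and uses synthetic data; it applies strictly increasing transformations to each variable and compares tau before and after. The function name is illustrative.

```python
import math
import random

def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant pairs - discordant pairs) / number of pairs."""
    n = len(xs)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (xs[i] - xs[j]) * (ys[i] - ys[j])
            score += (s > 0) - (s < 0)
    return score / (n * (n - 1) / 2)

rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(200)]
y = [xi + rng.gauss(0.0, 1.0) for xi in x]

tau_raw = kendall_tau(x, y)
# exp and cubing are strictly increasing, so the ordering of every pair is kept.
tau_transformed = kendall_tau([math.exp(v) for v in x], [v ** 3 for v in y])
print(tau_raw, tau_transformed)
```

Because tau depends only on the ranks, it is a property of the copula and not of the marginals.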
12. Copula Density
13. Separating Univariate Marginals
- Fit univariate marginals (parametric or non-parametric)
- Replace data points with cdfs of the marginals
- Estimate copula density
Inference for the margins: Joe and Xu 96
Canonical maximum likelihood: Genest et al. 95
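The second step can be sketched for the non-parametric case. This is one common choice rather than necessarily the one used in the talk: each value is replaced by its rank divided by n + 1, a rescaled empirical cdf that keeps the result strictly inside (0, 1). The function name and the toy data are assumptions, and ties are assumed not to occur.

```python
def pseudo_observations(column):
    """Map a sample to (0, 1) through its rescaled empirical cdf: rank / (n + 1).
    Assumes continuous data, so no ties."""
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    u = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        u[idx] = rank / (n + 1)
    return u

# Toy data: 4 points in 2 dimensions (rows are data points).
data = [[2.3, 10.0], [0.1, 50.0], [5.7, 20.0], [1.2, 40.0]]
columns = list(zip(*data))
u_data = list(zip(*(pseudo_observations(col) for col in columns)))
print(u_data)
```

The transformed points have approximately uniform marginals, so a copula density can then be fitted to them directly.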
14. What Next?
- Aren't we back to square one?
- Still estimating multivariate density from data
- Not quite
- All marginals are fixed
- Lots of approaches for copulas
- Vast majority focus on bivariate case
- Design models that use only pairs of variables
15. Tree-Structured Densities
16. Tree-Structured Copulas
17. Chow-Liu Algorithm (for Copulas)
Pair    Bivariate copula    Edge weight
A1A2    c(a1,a2)            0.3126
A1A3    c(a1,a3)            0.0229
A1A4    c(a1,a4)            0.0172
A2A3    c(a2,a3)            0.0230
A2A4    c(a2,a4)            0.0183
A3A4    c(a3,a4)            0.2603
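The tree selection step can be sketched with Kruskal's algorithm for a maximum-weight spanning tree. The sketch assumes that the six numbers on the slide are the edge weights for the six variable pairs (in Chow-Liu these are typically estimated mutual informations); nodes 0 to 3 stand for A1 to A4, and the function name is illustrative.

```python
def max_spanning_tree(num_nodes, weights):
    """Kruskal's algorithm on a dict {(i, j): weight}; returns the chosen edges."""
    parent = list(range(num_nodes))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    tree = []
    # Visit edges from heaviest to lightest.
    for (i, j), _w in sorted(weights.items(), key=lambda kv: kv[1], reverse=True):
        ri, rj = find(i), find(j)
        if ri != rj:  # the edge joins two components, so no cycle is created
            parent[ri] = rj
            tree.append((i, j))
    return tree

weights = {
    (0, 1): 0.3126, (0, 2): 0.0229, (0, 3): 0.0172,
    (1, 2): 0.0230, (1, 3): 0.0183, (2, 3): 0.2603,
}
tree = max_spanning_tree(4, weights)
print(tree)
```

Under that reading of the numbers, the selected tree is the chain A1-A2, A2-A3, A3-A4.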
18. Distribution over Spanning Trees
Meila and Jaakkola 00, 06
$O(d^3)$!!!
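The cubic cost comes from the matrix-tree theorem: the sum over all spanning trees of the product of edge weights equals the determinant of the weighted graph Laplacian with one row and column removed. The sketch below assumes a symmetric weight matrix named beta (a name chosen here for illustration) and checks the determinant against brute-force enumeration on a small graph.

```python
from itertools import combinations

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting, O(n^3)."""
    a = [[float(v) for v in row] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

def tree_sum_by_determinant(beta):
    """Sum over spanning trees of the product of edge weights (matrix-tree theorem)."""
    d = len(beta)
    # Weighted Laplacian with the last row and column removed.
    lap = [[sum(beta[i][k] for k in range(d) if k != i) if i == j else -beta[i][j]
            for j in range(d - 1)] for i in range(d - 1)]
    return determinant(lap)

def _find(parent, a):
    while parent[a] != a:
        a = parent[a]
    return a

def tree_sum_by_enumeration(beta):
    """Same sum by brute force over all acyclic subsets of d - 1 edges."""
    d = len(beta)
    edges = list(combinations(range(d), 2))
    total = 0.0
    for subset in combinations(edges, d - 1):
        parent = list(range(d))
        prod = 1.0
        for i, j in subset:
            ri, rj = _find(parent, i), _find(parent, j)
            if ri == rj:  # cycle: not a spanning tree
                prod = 0.0
                break
            parent[ri] = rj
            prod *= beta[i][j]
        total += prod
    return total

beta = [[0.0, 1.0, 2.0, 3.0],
        [1.0, 0.0, 4.0, 5.0],
        [2.0, 4.0, 0.0, 6.0],
        [3.0, 5.0, 6.0, 0.0]]
print(tree_sum_by_determinant(beta), tree_sum_by_enumeration(beta))
```

With all weights equal to 1 the determinant counts the spanning trees of the complete graph, which is $d^{d-2}$ by Cayley's formula.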
19. Tree-Averaged Copula
- Can compute sum over all $d^{d-2}$ spanning trees
- Can be viewed as a mixture over many, many spanning trees
- Can use EM to estimate the parameters
- Even though there are $d^{d-2}$ mixture components!
20. EM for Tree-Averaged Copulas
Intractable!!!
- E-step: compute the posterior over spanning trees
- Can be done in $O(d^3)$ per data point
- M-step: update β and Θ
- Update of Θ is often linear in the number of points
- Gaussian copula: solving a cubic equation
- Update of β is essentially iterative scaling
- Can be done in $O(d^3)$ per iteration
21. Experiments: Log-Likelihood on Test Data
- UCI ML Repository MAGIC data set
- 12000 10-dimensional vectors
- 2000 examples in test sets
- Average over 10 partitions
22. Binary-Continuous Data
23. Summary
- Multivariate distribution = univariate marginals + copula
- Copula density estimation via tree-averaging
- Closed form
- Tractable parameter estimation algorithm in ML framework (EM)
- $O(Nd^3)$ per iteration
- Only bivariate distributions at each estimation
- Potentially avoiding the curse of dimensionality
- New model for multi-site rainfall amounts (POSTER W12)