Learning with Tree-averaged Densities and Distributions

1
Learning with Tree-averaged Densities and
Distributions
  • Sergey Kirshner
  • Alberta Ingenuity Centre for Machine Learning,
  • Department of Computing Science,
  • University of Alberta, Canada

NIPS 2007 Poster W12
December 5, 2007
2
Overview
  • Want to fit density to complete multivariate data
  • New density estimation model based on averaging
    over tree-dependence structures
  • Distribution = Univariate Marginals + Copula
  • Bayesian averaging over tree-structured copulas
  • Efficient parameter estimation for tree-averaged
    copulas
  • Can solve problems with 10-30 dimensions

3
Most Popular Distribution
  • Interpretable
  • Closed under taking marginals
  • Generalizes to multiple dimensions
  • Models pairwise dependence
  • Tractable
  • 245 pages out of 691 from Continuous Multivariate
    Distributions by Kotz, Balakrishnan, and Johnson

4
What If the Data Is NOT Gaussian?
5
Curse of Dimensionality
[Bellman 57]
$n^d$ cells
$V_{[-2,2]^d} = 0.9545^d$
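A quick numeric check of the two quantities on this slide can be sketched as follows. It assumes that 0.9545 is the probability that a standard normal variable falls in [-2, 2], so that the mass of a d-dimensional standard Gaussian inside the box [-2, 2]^d is that value raised to the power d. The function names are illustrative only.

```python
from statistics import NormalDist

def mass_in_box(d, half_width=2.0):
    """Probability that a d-dimensional standard Gaussian lies in [-h, h]^d."""
    z = NormalDist()
    p1 = z.cdf(half_width) - z.cdf(-half_width)  # about 0.9545 for h = 2
    return p1 ** d  # coordinates are independent, so the masses multiply

def histogram_cells(bins_per_dim, d):
    """Number of cells in a grid with bins_per_dim bins along each axis."""
    return bins_per_dim ** d

for d in (1, 10, 30):
    print(d, round(mass_in_box(d), 4), histogram_cells(10, d))
```

Both effects grow with d: the number of grid cells explodes while the share of the probability mass near the center shrinks.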
6
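Avoiding the Curse, Step 1: Separating Univariate
Marginals
$p(x_1, \dots, x_d) = \prod_{v=1}^{d} f_v(x_v) \cdot c(F_1(x_1), \dots, F_d(x_d))$
Product of univariate marginals $f_v$: independent variables
$c$: multivariate dependence term (copula)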
7
Monotonic Transformation of the Variables
8
Copula
Copula C is a multivariate distribution (cdf)
defined on a unit hypercube with uniform
univariate marginals
9
Sklar's Theorem
[Sklar 59]


10
Example: Bivariate Gaussian Copula
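The slide's figure is not in the transcript, so here is a minimal sketch of the standard bivariate Gaussian copula density: the bivariate normal density with correlation rho divided by the product of its two standard normal marginals, evaluated at the normal quantiles of u and v. The function name and parameter names are illustrative.

```python
import math
from statistics import NormalDist

def gaussian_copula_density(u, v, rho):
    """Bivariate Gaussian copula density at (u, v), for 0 < u, v < 1, |rho| < 1."""
    x = NormalDist().inv_cdf(u)  # normal quantile of u
    y = NormalDist().inv_cdf(v)  # normal quantile of v
    one_minus_r2 = 1.0 - rho * rho
    exponent = -(rho * rho * (x * x + y * y) - 2.0 * rho * x * y) / (2.0 * one_minus_r2)
    return math.exp(exponent) / math.sqrt(one_minus_r2)

# With rho = 0 the variables are independent and the density is 1 everywhere.
print(gaussian_copula_density(0.3, 0.7, 0.0))
# Positive rho puts more density where u and v are both small or both large.
print(gaussian_copula_density(0.9, 0.9, 0.6), gaussian_copula_density(0.9, 0.1, 0.6))
```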
11
Useful Properties of Copulas
  • Preserves concordance between the variables
  • Rank-based measure of dependence
  • Preserves mutual information
  • Can be viewed as a canonical form of a
    multivariate distribution for the purpose of the
    estimation of multivariate dependence
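The rank-based property can be illustrated with Kendall's tau, a concordance measure that depends only on the ordering of the samples. The sketch below uses made-up data and checks that strictly increasing transformations of each variable leave tau unchanged. It computes the simple tau-a variant and assumes no ties.

```python
import math

def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant pairs) / total pairs."""
    n = len(xs)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (xs[i] - xs[j]) * (ys[i] - ys[j])
            s += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant
    return s / (n * (n - 1) / 2)

# Made-up sample, for illustration only.
x = [0.1, 0.4, 0.35, 0.8, 1.2, 0.05]
y = [1.0, 2.5, 1.7, 2.4, 3.1, 0.2]

tau_raw = kendall_tau(x, y)
# exp and cubing are strictly increasing, so the ranks do not change.
tau_transformed = kendall_tau([math.exp(a) for a in x], [b ** 3 for b in y])
print(tau_raw, tau_transformed)
```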

12
Copula Density
13
Separating Univariate Marginals
  • Fit univariate marginals (parametric or
    non-parametric)
  • Replace each data point with the values of the marginal cdfs at that point
  • Estimate copula density

Inference for the margins [Joe and Xu 96]
Canonical maximum likelihood [Genest et al 95]
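The second step, replacing data with marginal cdf values, can be sketched with a non-parametric marginal: the empirical cdf based on ranks. Dividing by n + 1 rather than n is a common convention that keeps values strictly inside (0, 1); it is an assumption here, not something the slide specifies. The data and function name are made up, and ties are not handled.

```python
def pseudo_observations(column):
    """Map one variable's samples into (0, 1) via rank / (n + 1)."""
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])  # indices in increasing order
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (n + 1)
    return u

# Four made-up two-dimensional samples.
data = [[2.3, 10.0], [0.7, 30.0], [5.1, 20.0], [1.9, 40.0]]
columns = list(zip(*data))  # one tuple per variable
uniformized = list(zip(*(pseudo_observations(list(c)) for c in columns)))
print(uniformized)
```

The copula density would then be estimated from `uniformized` instead of from the raw data.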
14
What Next?
  • Aren't we back to square one?
  • Still estimating multivariate density from data
  • Not quite
  • All marginals are fixed
  • Lots of approaches for copulas
  • Vast majority focus on bivariate case
  • Design models that use only pairs of variables

15
Tree-Structured Densities
16
Tree-Structured Copulas
17
Chow-Liu Algorithm (for Copulas)
A1A2      A1A3      A1A4      A2A3      A2A4      A3A4
c(a1,a2)  c(a1,a3)  c(a1,a4)  c(a2,a3)  c(a2,a4)  c(a3,a4)
0.3126    0.0229    0.0172    0.0230    0.0183    0.2603
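The tree-selection step of Chow-Liu can be sketched as a maximum-weight spanning tree. The transcript does not label the six numbers on this slide; the sketch assumes they are the pairwise edge weights (for example, estimated mutual information for each pair of variables) and uses Kruskal's algorithm, which is one of several ways to find the tree.

```python
def max_spanning_tree(num_nodes, weights):
    """Kruskal's algorithm over edges sorted by decreasing weight.

    weights maps a pair (u, v) with u < v to a number; nodes are 1..num_nodes.
    """
    parent = {v: v for v in range(1, num_nodes + 1)}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    tree = []
    for (u, v), w in sorted(weights.items(), key=lambda kv: -kv[1]):
        ru, rv = find(u), find(v)
        if ru != rv:  # the edge joins two components, so no cycle is created
            parent[ru] = rv
            tree.append((u, v))
    return tree

# Pairwise values from the slide, assumed to be edge weights.
weights = {(1, 2): 0.3126, (1, 3): 0.0229, (1, 4): 0.0172,
           (2, 3): 0.0230, (2, 4): 0.0183, (3, 4): 0.2603}
tree = max_spanning_tree(4, weights)
print(tree)
```

Under that assumption the selected tree is the chain A1 - A2 - A3 - A4.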
18
Distribution over Spanning Trees
[Meila and Jaakkola 00, 06]
$O(d^3)$!!!
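The cubic cost comes from the matrix-tree theorem: the sum, over all spanning trees, of the product of edge weights equals a determinant of the weighted graph Laplacian with one row and column removed. The sketch below assumes a distribution over trees proportional to the product of symmetric edge weights, here called `beta`, and checks the determinant against brute-force enumeration on a small made-up example. All names are illustrative.

```python
from itertools import combinations

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting, O(d^3)."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

def tree_partition_function(beta):
    """Sum over spanning trees of the product of beta[u][v] over tree edges."""
    d = len(beta)
    lap = [[0.0] * d for _ in range(d)]
    for u in range(d):
        for v in range(d):
            if u != v:
                lap[u][v] = -beta[u][v]
                lap[u][u] += beta[u][v]
    reduced = [row[:-1] for row in lap[:-1]]  # drop last row and column
    return determinant(reduced)

def brute_force(beta):
    """Same sum by enumerating every acyclic set of d - 1 edges."""
    d = len(beta)
    total = 0.0
    for subset in combinations(combinations(range(d), 2), d - 1):
        parent = list(range(d))

        def find(v):
            while parent[v] != v:
                v = parent[v]
            return v

        weight, acyclic = 1.0, True
        for u, v in subset:
            ru, rv = find(u), find(v)
            if ru == rv:
                acyclic = False
                break
            parent[ru] = rv
            weight *= beta[u][v]
        if acyclic:
            total += weight
    return total

ones = [[1.0] * 4 for _ in range(4)]
beta = [[0.0, 0.5, 2.0, 1.0], [0.5, 0.0, 1.5, 0.3],
        [2.0, 1.5, 0.0, 0.7], [1.0, 0.3, 0.7, 0.0]]  # made-up symmetric weights
print(tree_partition_function(ones))  # unit weights count the spanning trees
print(tree_partition_function(beta), brute_force(beta))
```

With unit weights the determinant counts the spanning trees of the complete graph, which for four nodes is 4^2 = 16.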
19
Tree-Averaged Copula
  • Can compute sum over all $d^{d-2}$ spanning trees
  • Can be viewed as a mixture over many, many
    spanning trees
  • Can use EM to estimate the parameters
  • Even though there are $d^{d-2}$ mixture components!
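The mixture can be evaluated without listing its components. The sketch below assumes that each tree has probability proportional to the product of edge weights `beta[u][v]` and that a tree-structured copula density is the product of bivariate copula densities over the tree's edges. Under those assumptions the mixture is a ratio of two reduced-Laplacian determinants. The slide's formula is not in the transcript, so this is a reading of the idea, and all names and numbers are illustrative.

```python
def det(m):
    """Cofactor expansion; exponential cost, adequate only for a tiny demo.
    An O(d^3) elimination-based determinant would be used in practice."""
    if not m:
        return 1.0
    return sum((-1) ** j * m[0][j] * det([r[:j] + r[j + 1:] for r in m[1:]])
               for j in range(len(m)))

def reduced_laplacian_det(w):
    """Determinant of the weighted Laplacian of w without its last row/column."""
    d = len(w)
    lap = [[-w[u][v] if u != v else sum(w[u][k] for k in range(d) if k != u)
            for v in range(d)] for u in range(d)]
    return det([row[:-1] for row in lap[:-1]])

def tree_averaged_density(beta, pair_density):
    """Sum over trees T of P(T) times the product of pair_density over T's edges.

    pair_density[u][v] holds the bivariate copula density for the pair (u, v),
    already evaluated at one data point; diagonal entries are ignored.
    """
    d = len(beta)
    weighted = [[beta[u][v] * pair_density[u][v] for v in range(d)]
                for u in range(d)]
    return reduced_laplacian_det(weighted) / reduced_laplacian_det(beta)

# Made-up three-variable example: three spanning trees in total.
beta = [[0.0, 1.0, 2.0], [1.0, 0.0, 3.0], [2.0, 3.0, 0.0]]
c = [[0.0, 1.5, 0.8], [1.5, 0.0, 1.2], [0.8, 1.2, 0.0]]
print(tree_averaged_density(beta, c))
```

For three variables the result can be checked by hand against the explicit three-component mixture: (2 * 1.2 + 3 * 1.8 + 6 * 0.96) / 11.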

20
EM for Tree-Averaged Copulas
Intractable!!!
  • E-step: compute
  • Can be done in $O(d^3)$ per data point
  • M-step: update β and Θ
  • Update of Θ is often linear in the number of
    points
  • Gaussian copula: solving cubic equation
  • Update of β is essentially iterative scaling
  • Can be done in $O(d^3)$ per iteration

21
Experiments: Log-Likelihood on Test Data
UCI ML Repository MAGIC data set: 12000
10-dimensional vectors, 2000 examples in test
sets, average over 10 partitions
22
Binary-Continuous Data
23
Summary
  • Multivariate distribution = univariate marginals +
    copula
  • Copula density estimation via tree-averaging
  • Closed form
  • Tractable parameter estimation algorithm in ML
    framework (EM)
  • $O(Nd^3)$ per iteration
  • Only bivariate distributions at each estimation
  • Potentially avoiding the curse of dimensionality
  • New model for multi-site rainfall amounts (POSTER
    W12)