Nonparametric Factor Analysis with Beta Process Priors

1
Nonparametric Factor Analysis with Beta Process Priors
  • John Paisley and Lawrence Carin
  • Department of Electrical and Computer Engineering
  • Duke University, Durham, NC, USA

2
Outline
  • Introduction
  • Beta Process Review
  • Beta Process Factor Analysis Model (BP-FA)
  • Variational Inference for the BP-FA Model
  • Experiments
  • Conclusion

3
Introduction
  • Nonparametric Bayesian priors are useful for
    finding compact statistical representations of a
    data set without limiting the potential
    complexity of the model (e.g., the Dirichlet
    process)
  • Factor analysis (FA) modeling is a framework where
    nonparametric priors can be useful. FA models are
    used to separate the important covariance
    structure of a data set from idiosyncratic noise.
    The need to infer the number of factors, K,
    suggests the use of a nonparametric prior.
  • The beta process (two-parameter) is a
    nonparametric prior that allows K to grow without
    bound, while selecting a sparse subset of the
    columns of the loading matrix. This process is
    denoted H ~ BP(a, b, H0).
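
A minimal sketch of the factor analysis model these bullets refer to. The symbols are assumptions made for illustration: X is the D x N data matrix, Φ the D x K loading matrix, Z the K x N matrix of factor scores, and E the noise.

```latex
X = \Phi Z + E, \qquad
X \in \mathbb{R}^{D \times N},\;
\Phi \in \mathbb{R}^{D \times K},\;
Z \in \mathbb{R}^{K \times N},\;
E \in \mathbb{R}^{D \times N}
```

Under the common additional assumptions that each column of Z is N(0, I_K) and each noise column has covariance Σ_ε, every observation has covariance ΦΦ^T + Σ_ε. The K columns of Φ then carry the shared covariance structure, and Σ_ε carries the idiosyncratic noise.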

4
The Beta Process
  • Illustration: Define H0 to be the uniform
    distribution Uni(0,1).
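
The illustration can be approximated numerically. The sketch below assumes a K-atom finite approximation in which pi_k ~ Beta(a/K, b(K-1)/K) and the atom locations phi_k are drawn from H0 = Uni(0,1). This parameterization, the function name, and the values of K, a, and b are assumptions chosen for illustration, not taken from the slide.

```python
import numpy as np

def draw_finite_beta_process(K, a=1.0, b=1.0, seed=None):
    """Draw a K-atom approximation to a beta process with base Uni(0,1).

    Assumed parameterization (hypothetical, for illustration):
        pi_k ~ Beta(a/K, b(K-1)/K),  phi_k ~ Uni(0, 1),  with K >= 2.
    """
    rng = np.random.default_rng(seed)
    phi = rng.uniform(0.0, 1.0, size=K)             # atom locations from H0
    pi = rng.beta(a / K, b * (K - 1) / K, size=K)   # atom probabilities
    return phi, pi

phi, pi = draw_finite_beta_process(K=1000, a=1.0, b=1.0, seed=0)
# Most probabilities are close to zero, which is the source of sparsity.
print("atoms with pi_k > 0.01:", int((pi > 0.01).sum()), "of", pi.size)
print("sum of pi_k:", float(pi.sum()))
```

Plotting pi against phi as a stem plot gives a picture in the spirit of the slide: a handful of visible spikes on the unit interval and many negligible ones.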

7
Beta Process Properties
  • By integrating out each factor probability to
    obtain the Indian buffet process (Griffiths &
    Ghahramani, 2005), additional properties of the
    beta process can be derived.
  • Let zi be the binary vectors on the previous
    slide. In the limit as K → ∞, the number of ones
    in each zi follows a Poisson distribution.
  • For any set of N vectors, z1, ..., zN, if CN is
    the total number of unique locations where there
    is a one, then CN also follows a Poisson
    distribution.
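
These limiting properties can be checked numerically. Assuming the finite parameterization pi_k ~ Beta(a/K, b(K-1)/K) with z_i drawn as independent Bernoulli(pi_k) entries (an assumption, not stated on the slide), a standard calculation gives a Poisson(a/b) number of ones per vector and a Poisson(sum_{n=1}^{N} a/(b+n-1)) total number of unique locations as K grows. The sketch below compares Monte Carlo averages against those two means. The function names and parameter values are hypothetical.

```python
import numpy as np

def simulate_counts(N, K, a, b, trials, seed=0):
    """Average ones per vector and average unique active locations."""
    rng = np.random.default_rng(seed)
    per_vector = np.empty(trials)
    unique = np.empty(trials)
    for t in range(trials):
        pi = rng.beta(a / K, b * (K - 1) / K, size=K)
        Z = rng.random((N, K)) < pi           # rows are z_1, ..., z_N
        per_vector[t] = Z.sum(axis=1).mean()  # mean number of ones per z_i
        unique[t] = Z.any(axis=0).sum()       # C_N for this trial
    return per_vector.mean(), unique.mean()

def expected_unique(N, a, b):
    """Limiting mean of C_N under the assumed parameterization."""
    return sum(a / (b + n - 1) for n in range(1, N + 1))

m, u = simulate_counts(N=20, K=2000, a=2.0, b=1.0, trials=200, seed=0)
print("ones per vector:", m, "vs a/b =", 2.0 / 1.0)
print("unique locations:", u, "vs limit =", expected_unique(20, 2.0, 1.0))
```

With K large relative to N, both simulated averages should sit close to the limiting values. The small remaining gap comes from Monte Carlo error and the finite truncation.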

8
Beta Process Factor Analysis (BP-FA)
  • For the factor analysis model, we model the
    generation of X using a finite approximation to
    the beta process. This approximation allows for
    variational inference to be performed.
  • Below is a noiseless, unweighted example of a
    draw from the model, where we show the sparseness
    of the BP prior using the vector of probabilities
    generated in the previous slides. (Only 23 of the
    1000 factors are used in N = 100 samples.)

An illustration of the structure of the beta
process prior (noiseless, unweighted model)
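
A noiseless, unweighted draw of this kind can be sketched as X = ΦZ with a binary Z. The code below assumes Gaussian loadings, the finite parameterization pi_k ~ Beta(a/K, b(K-1)/K), and illustrative values for D, a, and b. None of these are given on the slide, so the number of used factors it reports will not in general match the figure.

```python
import numpy as np

def noiseless_unweighted_draw(D, N, K, a, b, seed=None):
    """Draw X = Phi @ Z with binary Z (no weights, no noise)."""
    rng = np.random.default_rng(seed)
    pi = rng.beta(a / K, b * (K - 1) / K, size=K)       # factor probabilities
    Z = (rng.random((K, N)) < pi[:, None]).astype(int)  # K x N binary matrix
    Phi = rng.standard_normal((D, K))                   # assumed Gaussian loadings
    X = Phi @ Z
    used = np.flatnonzero(Z.any(axis=1))                # factors used at least once
    return X, Phi, Z, used

X, Phi, Z, used = noiseless_unweighted_draw(D=10, N=100, K=1000, a=1.0, b=1.0, seed=0)
print("factors used:", used.size, "of", Z.shape[0])
```

Only the columns of Phi indexed by `used` contribute to X, so the other columns can be dropped without changing the data.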

9
Beta Process Factor Analysis (BP-FA)
  • The generative process is at right.
  • This approximation can be thought of as similar
    to using a finite Dirichlet prior for mixture
    modeling. The benefit of sparseness is still
    present (since K is set to a large number).
  • The mean and covariance under this truncation are
    given at right. We see that the model remains
    well-defined as K → ∞.
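
A hedged sketch of a generative process of this kind follows. It assumes x_n = Φ(z_n ∘ w_n) + ε_n with Gaussian loadings, weights, and noise, and pi_k ~ Beta(a/K, b(K-1)/K). The function names, the 1/sqrt(D) loading scale, and the noise levels are hypothetical choices, not the slide's specification. The helper `expected_active_factors` evaluates K·E[pi_k] = aK/(a + b(K-1)), which tends to a/b. This is one way to see why the truncated model stays well behaved as K grows.

```python
import numpy as np

def sample_bpfa(D, N, K, a=1.0, b=1.0, sigma_w=1.0, sigma_n=0.1, seed=None):
    """Sample X = Phi @ (Z * W) + E under assumed Gaussian components."""
    rng = np.random.default_rng(seed)
    pi = rng.beta(a / K, b * (K - 1) / K, size=K)   # factor probabilities
    Z = rng.random((K, N)) < pi[:, None]            # binary factor usage
    W = sigma_w * rng.standard_normal((K, N))       # factor weights
    Phi = rng.standard_normal((D, K)) / np.sqrt(D)  # loadings (assumed scale)
    E = sigma_n * rng.standard_normal((D, N))       # idiosyncratic noise
    X = Phi @ (Z * W) + E
    return X, Phi, Z, W

def expected_active_factors(K, a, b):
    """K * E[pi_k] under the assumed Beta(a/K, b(K-1)/K) prior."""
    return a * K / (a + b * (K - 1))

X, Phi, Z, W = sample_bpfa(D=25, N=250, K=100, seed=0)
for K in (10, 100, 10_000):
    print(K, expected_active_factors(K, a=2.0, b=1.0))
```

The printed values approach a/b = 2 as K increases, so setting K to a large number changes the prior on the number of active factors very little.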

10
Variational Inference for the BP-FA Model

11
Variational Inference for the BP-FA Model

12
Experiments: Toy Data
  • We ran three experiments: on toy data, on the
    MNIST digits dataset, and on the HGDP-CEPH cell
    line panel.
  • We generated N = 250 samples in D = 25
    dimensions. We sampled H with a, b = 1 and W = 1,
    and we set K = 100 for inference.
  • The essential structure was uncovered.
  • An issue was the splitting of factors (some
    columns in the loading matrix were similar)
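
One simple way to look for split factors is to flag pairs of loading-matrix columns that point in nearly the same direction. The sketch below uses absolute cosine similarity, so a column and its negation count as similar. The function name and the 0.95 threshold are assumptions for illustration, and this is a diagnostic, not part of the inference method described here.

```python
import numpy as np

def similar_column_pairs(Phi, threshold=0.95):
    """Return index pairs (i, j), i < j, of near-parallel columns of Phi."""
    norms = np.linalg.norm(Phi, axis=0)
    norms = np.where(norms == 0, 1.0, norms)   # leave all-zero columns as zeros
    U = Phi / norms                            # unit-norm columns
    C = np.abs(U.T @ U)                        # absolute cosine similarities
    i, j = np.triu_indices(Phi.shape[1], k=1)  # each unordered pair once
    mask = C[i, j] >= threshold
    return list(zip(i[mask].tolist(), j[mask].tolist()))

Phi = np.array([[1.0, 0.0, 1.0],
                [0.0, 1.0, 0.01]])
print(similar_column_pairs(Phi))
```

Flagged pairs are candidates for merging or for closer inspection. Whether they reflect a genuine split depends on how the corresponding rows of the factor usage matrix overlap.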

13
Experiments: MNIST Digits
  • We also trained on N = 2500 odd digits. We set
    a, b = 1 and K = 100.
  • At right, we see that a sparse subset of the
    factors was selected to represent the data.
  • We also mention that inference was fast, owing to
    the deterministic nature of the variational
    method (far fewer iterations are required than
    for MCMC sampling).

14
Experiments: HGDP-CEPH Cell Line Panel
  • A D = 377 dimensional genotype data set sampled
    from populations across the world.
  • Below, we show the reconstruction of the data
    set.
  • The noise in the data was significantly reduced,
    while the essential structure was preserved.

15
Conclusion and Future Work
  • We have presented a nonparametric model for
    performing factor analysis that uses the beta
    process prior.
  • A finite approximation to the beta process
    allowed for variational inference to be
    performed.
  • We are currently in the process of expanding
    these ideas in a full-length paper, where we will
    look more in depth at applications.
  • A stick-breaking construction of the beta process
    has recently been derived by the authors. Future
    work is needed to rigorously prove convergence
    properties (help is welcome!).

16
References