Important Ideas in Bayesian Models - PowerPoint PPT Presentation

Provided by: csCol6

1
Important Ideas in Bayesian Models
  • Generative models
  • Consideration of multiple models in parallel
      • infinite model spaces
  • Inference
      • prediction via model averaging
      • role of priors: diminishing role with evidence
      • explaining away
  • Learning
      • parameter vs. structure learning
      • key step of parameter learning is inference of
        latent variables
      • Bayesian Occam's razor: trade-off between model
        simplicity and fit to data (this class)

2
Important Technical Issues To Come
  • Approximate inference techniques
      • Markov chain Monte Carlo
      • particle filters
      • variational approximations
  • Special inference techniques
      • EM, forward-backward algorithm
      • Kalman filter update
  • Representing structured data
      • grammars
      • relational schemas (e.g., paper authors, topics)
      • multiple levels of abstraction
  • Nonparametric models
      • flexible models that grow in complexity as the
        data justifies

3
Ockham's Razor
Ockham: medieval philosopher and monk
Razor: tool for cutting (metaphorical)
  • If two hypotheses are equally consistent with the
    data, prefer the simpler one.
  • simplicity
      • can accommodate fewer observations
      • smoother
      • fewer parameters
      • restricts predictions more
      • e.g., 2nd vs. 4th order polynomial
      • e.g., small rectangle vs. large rectangle in
        Tenenbaum model
      • e.g., 2-bump vs. 10-bump mixture of Gaussians

4
Motivating Ockham's Razor
  • Aesthetic considerations
      • A theory with mathematical beauty is more likely
        to be right (or believed) than an ugly one, given
        that both fit the same data.
  • Past empirical success of the principle
  • Coherent inference, as embodied by Bayesian
    reasoning, automatically incorporates Ockham's
    razor
      • Two theories H1 and H2
        [Equation not preserved; its terms were labeled
        PRIORS and LIKELIHOODS]

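The comparison of two theories set up on this slide is usually written as a ratio of posteriors. The form below is a standard identity from Bayes' rule, offered as a likely reading of the slide under the assumption that its PRIORS and LIKELIHOODS labels annotated the two factors on the right; $D$ (the data) is a symbol introduced here.

```latex
\frac{P(H_1 \mid D)}{P(H_2 \mid D)}
  = \underbrace{\frac{P(H_1)}{P(H_2)}}_{\text{priors}}
    \times
    \underbrace{\frac{P(D \mid H_1)}{P(D \mid H_2)}}_{\text{likelihoods}}
```

The first factor is where a subjective preference for simpler theories can enter. The second factor penalizes theories that spread their probability over many possible data sets.
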
5
Ockham's Razor with Priors
  • Jeffreys (1921)
  • more complex hypotheses should have smaller
    priors
  • requires a numerical rule for assessing
    complexity
  • e.g., Vapnik-Chervonenkis (VC) dimension
  • e.g., Minimum Description Length (MDL)

6
Rissanen (1976): Minimum Description Length
  • Prefer models that can communicate the data in
    the smallest number of bits.
  • The preferred hypothesis H for explaining data D
    minimizes the sum of
      • (1) length of the description of the hypothesis
        (Ockham's razor)
      • (2) length of the description of the data with
        the help of the chosen theory

L = length
7
MDL and Bayes
  • $L$: some measure of length (complexity)
  • MDL: prefer the hypothesis that minimizes
    $L(H) + L(D \mid H)$
  • Bayes rule implies MDL principle
  • $P(H \mid D) = P(D \mid H)\, P(H) / P(D)$
  • $-\log P(H \mid D) = -\log P(D \mid H) - \log P(H) + \log P(D)$
    $= L(D \mid H) + L(H) + \text{const}$
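
The link between MDL and Bayes can be checked numerically. The sketch below assumes that "length" means Shannon code length, $-\log_2 P$, and uses a made-up two-hypothesis problem; the hypothesis names and all probabilities are illustrative assumptions, not values from the slides.

```python
import math

def code_length_bits(p):
    """Shannon code length, in bits, of an event with probability p."""
    return -math.log2(p)

# Hypothetical numbers, chosen only for illustration.
priors = {"H1": 0.2, "H2": 0.8}         # P(H)
likelihoods = {"H1": 0.5, "H2": 0.05}   # P(D | H)

# Total description length L(H) + L(D | H) for each hypothesis.
lengths = {h: code_length_bits(priors[h]) + code_length_bits(likelihoods[h])
           for h in priors}

# Unnormalized posterior P(D | H) P(H); P(D) is the same constant for every H.
scores = {h: likelihoods[h] * priors[h] for h in priors}

mdl_choice = min(lengths, key=lengths.get)   # shortest total description
map_choice = max(scores, key=scores.get)     # highest posterior

print(lengths)
print("MDL picks", mdl_choice, "and Bayes picks", map_choice)
```

Because the total length is the negative log of the unnormalized posterior, minimizing one is the same as maximizing the other, so the two choices agree.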

8
Subjective vs. Objective Priors
  • subjective or informative prior: specific,
    definite information about a random variable
  • objective or uninformative prior: vague, general
    information
      • e.g., uniform over some range
      • philosophical arguments for
          • maximum entropy
          • $1/(\theta(1-\theta))$ for $\theta \in [0,1]$

9
Moving Away from Subjective Priors
  • Coin flipping example
      • H1: coin has two heads
      • H2: coin has a head and a tail
      • Consider 5 flips producing HHHHH
      • H1 could produce only this sequence
      • H2 could produce HHHHH, but also HHHHT, HHHTH,
        ..., TTTTT
      • $P(\text{HHHHH} \mid H_1) = 1$,
        $P(\text{HHHHH} \mid H_2) = 1/32$
      • H1 is easier to reject based on observations
      • H2 pays the price of a lower likelihood because
        it can accommodate a greater range of
        observations
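
The coin example can be worked through in a few lines. This is a minimal sketch that assumes the H2 coin is fair (consistent with the 1/32 figure) and that both hypotheses start with equal prior probability; the equal prior is an assumption made here, not something the slide states.

```python
from fractions import Fraction

def likelihood(sequence, p_heads):
    """P(sequence | coin) for independent flips with P(heads) = p_heads."""
    prob = Fraction(1)
    for flip in sequence:
        prob *= p_heads if flip == "H" else 1 - p_heads
    return prob

data = "HHHHH"
lik_h1 = likelihood(data, Fraction(1))      # H1: two-headed coin
lik_h2 = likelihood(data, Fraction(1, 2))   # H2: fair coin (assumed)

# Assumption: equal priors, so the comparison rests on the likelihoods alone.
prior_h1 = prior_h2 = Fraction(1, 2)
evidence = lik_h1 * prior_h1 + lik_h2 * prior_h2
post_h1 = lik_h1 * prior_h1 / evidence

print("P(D | H1) =", lik_h1)
print("P(D | H2) =", lik_h2)
print("P(H1 | D) =", post_h1)
```

A single tail in the data would drive the H1 likelihood to zero, which is the sense in which the simpler hypothesis is easier to reject.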

10
Simple and Complex Hypotheses
[Figure not preserved; plot labels: H2, H1]
11
Bayes Factor
  • BIC is an approximation to the Bayes factor
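
For reference, the standard definitions behind this bullet are sketched below. The symbols are assumptions not taken from the slide: $\theta_i$ for the parameters of $H_i$, $\hat{\theta}_i$ for their maximum-likelihood estimate, $k_i$ for the number of free parameters, and $n$ for the number of data points.

```latex
B_{12} = \frac{P(D \mid H_1)}{P(D \mid H_2)},
\qquad
P(D \mid H_i) = \int P(D \mid \theta_i, H_i)\, P(\theta_i \mid H_i)\, d\theta_i

\mathrm{BIC}_i = k_i \ln n - 2 \ln P(D \mid \hat{\theta}_i, H_i),
\qquad
2 \ln B_{12} \approx \mathrm{BIC}_2 - \mathrm{BIC}_1
```

The approximation holds for large $n$. It replaces the integral over parameters with the best-fit likelihood and a penalty that grows with the number of parameters.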

12
Relativity Example
  • Explain deviation in Mercury's orbit with respect
    to prevailing theory
  • E: Einstein's theory
  • F: fudged Newtonian theory
  • $\alpha$: true deviation
  • $a$: observed deviation

13
Relativity Example (Continued)
  • Subjective Ockham's razor
      • result depends on one's belief about
        $P(\alpha \mid F)$, the prior on the true
        deviation under F
  • Objective Ockham's razor
      • for Mercury example, RHS is 15.04
  • Applies to generic situation

14
Simple and Complex Hypothesis Classes
  • E.g., 1st and 2nd order polynomials
  • Hypothesis class is parameterized by w
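
To compare hypothesis classes rather than single hypotheses, the parameters w are integrated out to give the evidence P(D | H). The sketch below approximates that integral on a grid for 1st- and 2nd-order polynomials. Everything specific in it is an assumption made for illustration: the five data points, Gaussian noise with known standard deviation 0.2, and a uniform prior on each coefficient over [-2, 2].

```python
import itertools
import math

def log_likelihood(w, xs, ys, sigma):
    """Gaussian log-likelihood of the data under the polynomial with coefficients w."""
    total = 0.0
    for x, y in zip(xs, ys):
        pred = sum(c * x ** i for i, c in enumerate(w))
        total += -0.5 * ((y - pred) / sigma) ** 2
        total -= math.log(sigma * math.sqrt(2 * math.pi))
    return total

def evidence(order, xs, ys, sigma=0.2, w_range=2.0, points=40):
    """Approximate P(D | H) = integral of P(D | w, H) P(w | H) dw on a midpoint grid.

    Assumes a uniform prior on each coefficient over [-w_range, w_range].
    """
    n_params = order + 1
    step = 2 * w_range / points
    grid = [-w_range + (i + 0.5) * step for i in range(points)]
    prior_density = (1.0 / (2 * w_range)) ** n_params
    total = 0.0
    for w in itertools.product(grid, repeat=n_params):
        total += math.exp(log_likelihood(w, xs, ys, sigma))
    return total * prior_density * step ** n_params

# Illustrative data lying close to the line y = x.
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ys = [-1.0, -0.6, 0.1, 0.4, 1.1]

ev1 = evidence(1, xs, ys)   # 1st-order class: w = (w0, w1)
ev2 = evidence(2, xs, ys)   # 2nd-order class: w = (w0, w1, w2)
print("evidence, 1st order:", ev1)
print("evidence, 2nd order:", ev2)
print("ratio:", ev1 / ev2)
```

On this data the 2nd-order class fits slightly better at its best setting of w, but it spreads its prior over an extra coefficient, so the 1st-order class ends up with the higher evidence.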
