Reduction of Variance - PowerPoint PPT Presentation

About This Presentation
Title:

Reduction of Variance

Description:

Parameterize as positive definite PDF: ... We need to sample only near the peak of the distribution: random walks. Intuitive explanation ... – PowerPoint PPT presentation

Slides: 13
Provided by: davidce5

Transcript and Presenter's Notes



1
Reduction of Variance
  • As we discussed earlier, in general, the
    statistical error goes as
    $\text{error} \propto \sqrt{\text{variance} / \text{computer time}}$.

DEFINE: Efficiency $\equiv 1/(vT)$, where v = error² of the mean and T =
total CPU time.
  • How can you make a simulation more efficient?
  • Write faster code.
  • Get a faster computer.
  • Work on reducing the variance.
  • Or all three.

We will talk about the third option today:
importance sampling and correlated sampling.
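
The efficiency definition can be checked numerically. The following is a minimal Python sketch, not taken from the slides; the function name `efficiency` and its arguments are illustrative assumptions. It computes 1/(vT) from a list of independent samples and a measured CPU time.

```python
def efficiency(samples, cpu_seconds):
    """Return 1/(v*T), where v is the squared standard error of the mean."""
    n = len(samples)
    mean = sum(samples) / n
    # Unbiased sample variance of the individual values.
    var = sum((s - mean) ** 2 for s in samples) / (n - 1)
    # Squared error of the mean for n independent samples.
    v = var / n
    return 1.0 / (v * cpu_seconds)
```

For independent samples, doubling the run length roughly halves v and doubles T, so the efficiency stays about the same. Lowering the variance per sample (or the time per sample) is what raises it.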
2
Importance Sampling
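  • Given the integral $I = \int dx\, f(x)$

How should we sample x to maximize the
efficiency?
Transform the integral: $I = \int dx\, p(x)\, \frac{f(x)}{p(x)} = \left\langle \frac{f}{p} \right\rangle_p$
Estimator: $f(x)/p(x)$, averaged over points x sampled from p(x)
Variance is: $v = \left\langle (f/p)^2 \right\rangle_p - I^2 = \int dx\, \frac{f(x)^2}{p(x)} - I^2$
Optimal sampling:
The mean value of the estimator, I, is independent of p(x),
but the variance v is not! Assume the CPU time per sample is
independent of p(x), and vary p(x) to minimize v.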
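
To make the statement concrete, here is a minimal Python sketch (not from the slides). It assumes a toy integrand f(x) = x² on [0, 1], whose exact integral is 1/3, and compares uniform sampling with the assumed trial density p(x) = 2x. The helper name `importance_estimate` is hypothetical.

```python
import random

def importance_estimate(f, p, sample_p, n, rng):
    """Average f(x)/p(x) over n points x drawn from p; return (mean, variance)."""
    weights = []
    for _ in range(n):
        x = sample_p(rng)
        weights.append(f(x) / p(x))
    mean = sum(weights) / n
    var = sum((w - mean) ** 2 for w in weights) / (n - 1)
    return mean, var

def f(x):
    return x * x  # toy integrand; exact integral on [0, 1] is 1/3

rng = random.Random(0)

# Uniform sampling: p(x) = 1 on [0, 1).
uniform = importance_estimate(f, lambda x: 1.0, lambda r: r.random(), 20000, rng)

# p(x) = 2x, sampled by inverse CDF x = sqrt(u) with u in (0, 1] so x is never 0.
linear = importance_estimate(
    f, lambda x: 2.0 * x, lambda r: (1.0 - r.random()) ** 0.5, 20000, rng
)
```

Both estimates should agree with 1/3 within their error bars, while the per-sample variance drops from about 4/45 (uniform) to about 1/72 (p = 2x), because 2x follows the shape of x² more closely than a constant does.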
3
Finding Optimal p(x) for Sampling
  • Parameterize as positive definite PDF

Solution: $p^*(x) = \frac{|f(x)|}{\int dx'\, |f(x')|}$
Estimator: $\frac{f(x)}{p^*(x)} = \mathrm{sign}(f(x)) \int dx'\, |f(x')|$
  • If f(x) is entirely positive or entirely negative, the
    estimator is constant: the zero-variance principle.
  • We can't sample the optimal p(x), because if we could, then
    we would have solved the problem analytically!
  • But the form of the optimal p(x) is a guide to lowering
    the variance.
  • Importance sampling is a general technique: it
    works in many dimensions.
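
The zero-variance principle can be seen in a small Python sketch (an illustration, not from the slides). It assumes the toy integrand f(x) = x² on [0, 1]. Writing down the normalized density 3x² already requires knowing the answer 1/3, which is exactly the objection raised above.

```python
import random

def f(x):
    return x * x          # toy integrand on [0, 1]; exact integral is 1/3

def p_opt(x):
    return 3.0 * x * x    # |f| / integral(|f|): uses the answer 1/3 in advance

rng = random.Random(0)
weights = []
for _ in range(1000):
    u = 1.0 - rng.random()            # u in (0, 1], so x is never 0
    x = u ** (1.0 / 3.0)              # inverse CDF of p_opt, since CDF(x) = x**3
    weights.append(f(x) / p_opt(x))   # same value for every sample

mean = sum(weights) / len(weights)
spread = max(weights) - min(weights)  # zero up to floating-point rounding
```

Every sample returns the same value, so a single sample already gives the integral. In practice one uses an approximate, normalizable p(x) with a similar shape.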

4
Example of importance sampling
Suppose f(x) was given by
Optimize a Gaussian p(x) with width parameter a.
The value is independent of a. The CPU time is not.
5
  • Figure panels: importance sampling functions; variance integrand

6
What are allowed values of a?
  • Clearly, for p(x) to exist: 0 < a
  • Marked values on the a axis: 0, 0.5, 0.6, 1
  • For a finite estimator: 1 < a
  • For finite variance: 0.5 < a
  • Obvious value: a = 1
  • Optimal value: a = 0.6
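
The thresholds 1 and 0.5 can be recovered with a short tail argument. The sketch below rests on two assumptions that the transcript does not state: that f(x) decays like $e^{-x^2/2}$ at large |x| (up to slowly varying prefactors), and that the Gaussian trial function has variance a.

```latex
\begin{align*}
p(x) &= \frac{1}{\sqrt{2\pi a}}\, e^{-x^2/(2a)}, \qquad a > 0 \\
\frac{f(x)}{p(x)} &\sim e^{-\frac{x^2}{2}\left(1 - \frac{1}{a}\right)}
  \quad\Rightarrow\quad \text{bounded as } |x| \to \infty \text{ if } a > 1 \\
\frac{f(x)^2}{p(x)} &\sim e^{-x^2\left(1 - \frac{1}{2a}\right)}
  \quad\Rightarrow\quad \text{integrable if } a > \tfrac{1}{2}
\end{align*}
```

Under these assumptions the variance integral $\int dx\, f^2/p$ diverges for a ≤ 1/2, even though the estimator's mean is still correct. The optimal a = 0.6 quoted above depends on the full form of f(x) and cannot be derived from the tails alone.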

7
What does infinite variance look like?
  • Spikes
  • Long tails on the distributions

Near optimal
Why (visually)?
8
General Approach to Importance Sampling
  • Basic idea of importance sampling is to sample
    more in regions where function is large.
  • Find a convenient approximation to f(x).
  • Do not under-sample -- could cause infinite
    variance.
  • Over-sampling -- loss in efficiency but not
    infinite variance.
  • Always derive analytically conditions for finite
    variance.
  • To debug: test that the estimated value is
    independent of the importance sampling function.
  • Sign problem: zero variance is not possible for
    an oscillatory integral.
  • Monte Carlo can add but not subtract.
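
The sign problem can be illustrated with a minimal Python sketch (not from the slides). It assumes the toy integrand f(x) = x - 0.5 on [0, 1], whose integral is 0, sampled from the normalized |f|, which is p(x) = 4|x - 0.5|. The sampler name `sample_abs_linear` is hypothetical.

```python
import random

def sample_abs_linear(rng):
    """Draw x in [0, 1] from p(x) = 4*|x - 0.5|."""
    # Distance from the midpoint has density 8*d on (0, 0.5]: d = 0.5*sqrt(u).
    d = 0.5 * (1.0 - rng.random()) ** 0.5
    # Each side of the midpoint carries half of the probability.
    return 0.5 + d if rng.random() < 0.5 else 0.5 - d

def f(x):
    return x - 0.5            # changes sign at x = 0.5; exact integral is 0

def p(x):
    return 4.0 * abs(x - 0.5)  # normalized |f|

rng = random.Random(1)
weights = []
for _ in range(10000):
    x = sample_abs_linear(rng)
    weights.append(f(x) / p(x))   # always +0.25 or -0.25

mean = sum(weights) / len(weights)
var = sum((w - mean) ** 2 for w in weights) / (len(weights) - 1)
```

Even with this best possible p(x), each sample is ±0.25 and the variance stays near 0.0625. The positive and negative contributions cancel only on average, which is the sense in which Monte Carlo can add but not subtract.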

9
Correlated Sampling
  • Suppose we want a function of two integrals,
    G(F1, F2),
  • where the integrals are $F_k = \int dx\, f_k(x)$.
  • Use the same p(x) and the same random numbers to do both
    integrals.
  • What is the optimal p(x)?
  • It is a weighted average of the distributions for
    F1 and F2.
  • Consider G = F1/F2 (like the Boltzmann distribution),
    then
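
The benefit of reusing random numbers can be seen in a short Python sketch (an illustration, not from the slides). The integrands exp(-x²) and exp(-1.1 x²) on [0, 1], the uniform p(x), and the helper names are all assumptions chosen so that the two integrands are strongly correlated.

```python
import math
import random

def f1(x):
    return math.exp(-x * x)

def f2(x):
    return math.exp(-1.1 * x * x)

def ratio_estimate(n, rng, correlated):
    """Estimate G = F1/F2 on [0, 1] using uniform p(x)."""
    xs1 = [rng.random() for _ in range(n)]
    # Correlated sampling reuses the same points for both integrals.
    xs2 = xs1 if correlated else [rng.random() for _ in range(n)]
    return sum(f1(x) for x in xs1) / sum(f2(x) for x in xs2)

def spread(values):
    """Sample standard deviation of a list of estimates."""
    m = sum(values) / len(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

rng = random.Random(2)
corr = [ratio_estimate(200, rng, True) for _ in range(200)]
indep = [ratio_estimate(200, rng, False) for _ in range(200)]
```

With shared points the fluctuations of numerator and denominator largely cancel in the ratio, so `spread(corr)` should come out well below `spread(indep)`. The gain depends on how similar the two integrands are; for anticorrelated integrands, sharing points can hurt.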

10
Sampling Boltzmann distribution
  • Suppose we want to calculate a whole set of
    averages
  • Optimal sampling is

variable
constant
  • We need to sample this only. Avoid
    undersampling.
  • The Boltzmann distribution is very highly peaked.

11
Independent Sampling for exp(-V/kT)?
  • Try hit-or-miss MC to get $Z = \int dR\, \exp(-V(R)/kT)$.
  • Sample R uniformly in (0, L): $P(R) = \Omega^{-N}$, with Ω the box volume
  • What is the variance of the free energy and how
    does it depend on the number of particles?

O(N) and positive!
  • Blows up exponentially fast at large N as F is
    extensive!
  • The number of sample points must grow
    exponentially in N, just like a grid based method.

12
Intuitive explanation
  • Throw N points in a box of area A.
  • Say the probability of no overlap is q.
  • Throw 2N points in a box of area 2A.
  • The probability of no overlap is $q^2$.
  • Throw mN points in a box of area mA.
  • The probability of no overlap is $q^m$.
  • The probability of getting a success is $p = \exp(m \ln q)$.
    Success is defined as a reasonable sample of
    a configuration.
  • This is a general argument. We need to sample
    only near the peak of the distribution: random
    walks.
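
The scaling argument is easy to tabulate. The following Python sketch is illustrative only; the function names are assumptions. It evaluates p = exp(m ln q) and the expected number of independent trials per success, 1/p.

```python
import math

def success_probability(q, m):
    """Probability of no overlap for m*N points in area m*A: exp(m ln q) = q**m."""
    return math.exp(m * math.log(q))

def samples_needed(q, m):
    """Expected number of independent trials per successful configuration."""
    return 1.0 / success_probability(q, m)
```

For example, with q = 0.5 for the small box, a system ten times larger needs on the order of a thousand trials per success, and one a hundred times larger needs about 10^30. This is why independent sampling fails for large N.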