Introduction to Bootstrap Estimation

Provided by: GOOD48

1
Introduction to Bootstrap Estimation
2
What is the Bootstrap Method?
  • Given a sample with n observations, quantify the
    uncertainty in any parameter estimate (e.g.,
    mean, variance, percentile value, etc.)
  • All Bootstrap methods involve generating
    hypothetical samples from the original sample
  • Each hypothetical sample, called a Bootstrap
    Sample, represents a potential set of
    observations that COULD be obtained, if we
    resampled the population
  • Use Monte Carlo simulation to generate MANY
    Bootstrap Samples (B > 5000)

3
Parametric vs. Distribution Free
  • Parametric Bootstrap: select and fit a probability
    distribution to the sample data, then generate B
    bootstrap samples from this distribution.
  • Non-parametric (Distribution-Free) Bootstrap:
    generate B bootstrap samples directly from the
    sample data by randomly sampling with replacement.
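
The two approaches can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's implementation: the five observations are the example sample used later in the deck, the lognormal family for the parametric case is an assumption, and the helper names are hypothetical.

```python
import math
import random
import statistics

data = [23, 28, 30, 50, 61]   # example sample from the deck
rng = random.Random(0)        # fixed seed so the sketch is repeatable

def nonparametric_sample(data, rng):
    """Resample the observations with replacement (same size as the data)."""
    return rng.choices(data, k=len(data))

def parametric_sample(data, rng):
    """Fit a lognormal (assumed family) to the data, then draw n values from it."""
    logs = [math.log(x) for x in data]
    mu, sigma = statistics.mean(logs), statistics.stdev(logs)
    return [rng.lognormvariate(mu, sigma) for _ in data]

np_sample = nonparametric_sample(data, rng)
p_sample = parametric_sample(data, rng)
print(np_sample)
print(p_sample)
```

The non-parametric sample can only contain values that were observed, while the parametric sample can contain any value the fitted distribution allows.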

4
Distribution-Free Approaches
  • Bootstrap methods:
  • involve resampling from the original data set
    many times, tracking the parameter estimates for
    each iteration, and developing a probability
    distribution for uncertainty (PDFu)
  • are statistically sound and do not require
    assumptions about the probability distribution
    for variability (PDFv)
  • suffer from the same limitations as other methods
    if the sample is not representative

5
Parameter Uncertainty
Parameter Estimates for each 1-D MCA Simulation
Soil Ingestion ~ Lognormal(m, s)
m ~ Uniform(min, max)
s ~ Triangular(min, mode, max)
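
A rough sketch of how this setup might be simulated: an outer loop draws one (m, s) pair from the uncertainty distributions, and each pair drives one 1-D simulation of the lognormal variable. The numeric bounds below are hypothetical placeholders, and treating m and s as the arithmetic mean and standard deviation of the lognormal is an assumption; the slide does not state the parameterization.

```python
import math
import random

rng = random.Random(1)

# Hypothetical bounds for the uncertain parameters (illustrative only).
M_MIN, M_MAX = 40.0, 60.0                 # m ~ Uniform(min, max)
S_MIN, S_MODE, S_MAX = 10.0, 20.0, 40.0   # s ~ Triangular(min, mode, max)

def lognormal_draws(m, s, size, rng):
    """Draw from a lognormal with arithmetic mean m and std dev s (assumed)."""
    sigma2 = math.log(1.0 + (s / m) ** 2)
    mu = math.log(m) - 0.5 * sigma2
    return [rng.lognormvariate(mu, math.sqrt(sigma2)) for _ in range(size)]

outer_runs, inner_draws = 100, 1000
simulations = []
for _ in range(outer_runs):
    m = rng.uniform(M_MIN, M_MAX)
    # random.triangular takes its arguments in the order (low, high, mode)
    s = rng.triangular(S_MIN, S_MAX, S_MODE)
    simulations.append((m, s, lognormal_draws(m, s, inner_draws, rng)))

print(len(simulations), "one-dimensional simulations generated")
```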
6
Example
  • Original Sample (n = 5)

23, 28, 30, 50, 61
  • Bootstrap Samples (B = 5000)

Sample 1:    28, 50, 30, 23, 23
Sample 2:    30, 50, 50, 61, 28
Sample 3:    61, 23, 30, 23, 28
. . .
Sample 5000: 28, 50, 30, 61, 30
7
Uncertainty in the Mean
  • Original Sample (n = 5)

23, 28, 30, 50, 61
Sample mean = 38.4
  • Bootstrap Samples (B = 5000)

Sample 1:    28, 50, 30, 23, 23
Sample 2:    30, 50, 50, 61, 28
Sample 3:    61, 23, 30, 23, 28
. . .
Sample 5000: 28, 50, 30, 61, 30
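
The resampling step for the mean can be sketched as follows. This is a minimal example using Python's standard library; the seed and variable names are arbitrary choices, and the individual resamples will not match the ones listed on the slide.

```python
import random
import statistics

data = [23, 28, 30, 50, 61]
B = 5000
rng = random.Random(42)   # fixed seed for repeatability

# One bootstrap estimate of the mean per resample (drawn with replacement).
boot_means = [statistics.mean(rng.choices(data, k=len(data))) for _ in range(B)]

print("original sample mean:", statistics.mean(data))
print("mean of bootstrap means:", round(statistics.mean(boot_means), 2))
print("range of bootstrap means:", min(boot_means), "to", max(boot_means))
```

The spread of `boot_means` is what the later confidence-interval methods summarize in different ways.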
8
Confidence Intervals
  • CIs for parameter estimates are calculated using
    statistics from the original sample and the
    bootstrap samples
  • Many different methods are available
  • complexity vs. accuracy
  • Difference in results will depend on sample size
    (n) and skewness of original data

9
Example
If $x_1, x_2, \ldots, x_n$ is an independent sample from a
normal distribution, $X \sim N(\mu, \sigma)$, and $m$, $s$ are
the sample mean and standard deviation, then
normal distribution theory says that
$\frac{m - \mu}{s / \sqrt{n}} \sim t_{n-1}$
and that
$\frac{(n - 1)\, s^2}{\sigma^2} \sim \chi^2_{n-1}$
10
Normal Distribution Assumption
  • Does not require Monte Carlo sampling
  • Select significance level (e.g., α = 0.05 for a 95%
    CI)

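$[\mathrm{LCL}, \mathrm{UCL}] = m \pm t_{1-\alpha/2,\, n-1} \, \frac{s}{\sqrt{n}}$
where $t_{1-\alpha/2,\, n-1}$ is the $(1 - \alpha/2)$ quantile of Student's $t$
distribution with $n - 1$ degrees of freedom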
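
As a worked illustration, the usual normal-theory interval for the mean (sample mean plus or minus a Student's t quantile times s divided by the square root of n) can be computed for the five-observation example sample. This is a sketch under the normality assumption; the critical value is hard-coded from standard t tables because Python's standard library has no t distribution.

```python
import math
import statistics

data = [23, 28, 30, 50, 61]
n = len(data)
m = statistics.mean(data)
s = statistics.stdev(data)     # sample standard deviation (n - 1 divisor)

# Two-sided 95% critical value of Student's t with n - 1 = 4 degrees of
# freedom, taken from standard tables (hard-coded assumption).
T_CRIT = 2.776

half_width = T_CRIT * s / math.sqrt(n)
lcl, ucl = m - half_width, m + half_width
print(f"95% CI for the mean: [{lcl:.1f}, {ucl:.1f}]")
```

No Monte Carlo sampling is involved: the interval follows directly from m, s, and n.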
11
Bootstrap Methods
  • Percentile Bootstrap
  • Standard Bootstrap
  • Bootstrap-t (Pivotal Bootstrap)
  • Bias-corrected (BCa) Bootstrap

References:
  • Hall (1988)
  • Efron and Tibshirani (1993)
  • U.S. EPA (1997) or Singh et al. (1997)
12
Percentile Bootstrap
  • Easy! Just calculate the parameter for each
    bootstrap sample and select α (e.g., 0.05).
  • LCL = (α/2)th percentile.
  • UCL = (1 − α/2)th percentile.
  • Use Excel's PERCENTILE function, e.g.,
    PERCENTILE(bootstrap data array, 0.025)
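
Outside of a spreadsheet, the same calculation might look like the sketch below. The `percentile` helper is a hypothetical stand-in that interpolates linearly between order statistics; it is intended to behave like a spreadsheet percentile function but is not guaranteed to match one exactly.

```python
import random
import statistics

data = [23, 28, 30, 50, 61]
B, alpha = 5000, 0.05
rng = random.Random(7)

boot_means = sorted(
    statistics.mean(rng.choices(data, k=len(data))) for _ in range(B)
)

def percentile(sorted_xs, p):
    """Linear interpolation between order statistics, with p in [0, 1]."""
    pos = p * (len(sorted_xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_xs) - 1)
    return sorted_xs[lo] + (pos - lo) * (sorted_xs[hi] - sorted_xs[lo])

lcl = percentile(boot_means, alpha / 2)
ucl = percentile(boot_means, 1 - alpha / 2)
print(f"percentile bootstrap 95% CI: [{lcl:.1f}, {ucl:.1f}]")
```

Because the limits are percentiles of resampled means, this interval always stays within the range of the observed data.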

13
Standard Bootstrap
  • Obtain B bootstrap estimates of the parameter θ
  • Calculate the standard error of θ based on the
    standard deviation of the B bootstrap estimates
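
A minimal sketch of these two steps for the mean of the example sample. Pairing the bootstrap standard error with a standard normal quantile is an assumption here (it is the common textbook form); the slide's own interval formula is not reproduced in the transcript.

```python
import random
from statistics import NormalDist, mean, stdev

data = [23, 28, 30, 50, 61]
B, alpha = 5000, 0.05
rng = random.Random(5)

theta_hat = mean(data)
boot = [mean(rng.choices(data, k=len(data))) for _ in range(B)]

# Standard error of theta = standard deviation of the B bootstrap estimates.
se_boot = stdev(boot)

# Assumed form of the interval: theta_hat +/- z * SE, with z from N(0, 1).
z = NormalDist().inv_cdf(1 - alpha / 2)
lcl, ucl = theta_hat - z * se_boot, theta_hat + z * se_boot
print(f"standard bootstrap 95% CI: [{lcl:.1f}, {ucl:.1f}]")
```

The resulting interval is symmetric about the original estimate, unlike the percentile-based intervals.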

14
Bootstrap-t
  • Same as Standard Bootstrap, except obtain the
    t-statistic from the bootstrap samples.
  • For each bootstrap sample, calculate t_b
  • Calculate the (α/2)th and (1 − α/2)th
    percentiles of t_b.
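
The procedure can be sketched for the mean as follows. Several details are assumptions rather than the slide's definitions: t_b is taken as (bootstrap mean minus original mean) divided by the bootstrap sample's own standard error, that standard error is approximated by s divided by the square root of n, and resamples with zero spread are skipped.

```python
import math
import random
from statistics import mean, stdev

data = [23, 28, 30, 50, 61]
n, B, alpha = len(data), 5000, 0.05
rng = random.Random(9)

theta_hat = mean(data)
se_hat = stdev(data) / math.sqrt(n)     # SE of the original estimate

t_stats = []
for _ in range(B):
    sample = rng.choices(data, k=n)
    se_b = stdev(sample) / math.sqrt(n)
    if se_b > 0:                        # skip resamples with identical values
        t_stats.append((mean(sample) - theta_hat) / se_b)
t_stats.sort()

def percentile(sorted_xs, p):
    """Linear interpolation between order statistics, with p in [0, 1]."""
    pos = p * (len(sorted_xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_xs) - 1)
    return sorted_xs[lo] + (pos - lo) * (sorted_xs[hi] - sorted_xs[lo])

t_lo = percentile(t_stats, alpha / 2)
t_hi = percentile(t_stats, 1 - alpha / 2)

# The upper t percentile sets the lower limit, and vice versa.
lcl = theta_hat - t_hi * se_hat
ucl = theta_hat - t_lo * se_hat
print(f"bootstrap-t 95% CI: [{lcl:.1f}, {ucl:.1f}]")
```

With a small, right-skewed sample the lower t percentile tends to be large in magnitude, which pushes the upper limit well above the original estimate.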

15
Calculating SE_b for Bootstrap-t
  • Normal Approximation Rule (Large Sample)
  • Nested Bootstrap
  • For each bootstrap sample (b), run j = 1000
    bootstrap simulations to derive 1000 parameter
    estimates
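
A sketch of the nested approach for the mean: each outer bootstrap sample is itself resampled, and the standard deviation of the inner estimates serves as that sample's standard error. The outer and inner counts are deliberately much smaller than the numbers on the slides so the example runs quickly, and the helper names are hypothetical.

```python
import random
from statistics import stdev

data = [23, 28, 30, 50, 61]
B, J, alpha = 200, 100, 0.05    # reduced counts, for speed only
rng = random.Random(11)

def resample(xs):
    return rng.choices(xs, k=len(xs))

def avg(xs):
    return sum(xs) / len(xs)

def nested_se(sample):
    """Standard error for one sample: std dev of J inner bootstrap means."""
    return stdev([avg(resample(sample)) for _ in range(J)])

theta_hat = avg(data)
se_hat = nested_se(data)

t_stats = []
for _ in range(B):
    sample = resample(data)
    se_b = nested_se(sample)
    if se_b > 0:                # resamples of identical values give SE_b = 0
        t_stats.append((avg(sample) - theta_hat) / se_b)
t_stats.sort()

def percentile(sorted_xs, p):
    """Linear interpolation between order statistics, with p in [0, 1]."""
    pos = p * (len(sorted_xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_xs) - 1)
    return sorted_xs[lo] + (pos - lo) * (sorted_xs[hi] - sorted_xs[lo])

lcl = theta_hat - percentile(t_stats, 1 - alpha / 2) * se_hat
ucl = theta_hat - percentile(t_stats, alpha / 2) * se_hat
print(f"nested bootstrap-t 95% CI: [{lcl:.1f}, {ucl:.1f}]")
```

The cost grows as B times J resamples, which is why the nested version is usually reserved for statistics without a convenient standard-error formula.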

16
BCa Bootstrap
  • See Appendix or Efron and Tibshirani (1993)
  • Accounts for skewness in the bootstrap sample
    means and the rate of change of the standard error

17
How do the Methods Compare?
  • Confidence Intervals will differ depending on the
    approach that is used.
  • In general, as n decreases and skewness
    increases:
  • Bootstrap-t > BCa > percentile > standard
  • [LCL, UCL] can exceed the min and max from the
    observed data
  • CI for mean from the Standard bootstrap ≈ CI for
    mean assuming X ~ Normal(μ, σ)

18
So when should you use the Bootstrap estimate,
and which approach is best?
  • Use of Lognormal or Normal PDFs is weakly
    supported
  • Data are poorly fit by continuous distributions
    (e.g., censored, mixed)
  • Analytical solution is messy (a simple alternative
    is a parametric bootstrap)

19
So when should you use the Bootstrap estimate,
and which approach is best?
  • As with other approaches for quantifying
    parameter uncertainty, confidence in parameter
    estimates improves with better data quality and
    increased sample size
  • Bootstrap is not a substitute for a weak or
    non-representative sample
  • Choosing the best bootstrap approach remains an
    exercise in judgment. Extent of differences in
    coverage of CIs may be a useful contribution to
    sensitivity analysis