Bootstrapping in regular graphs - PowerPoint PPT Presentation

Transcript and Presenter's Notes
1
Bootstrapping in regular graphs
  • Gesine Reinert, Oxford
  • With Susan Holmes, Stanford

2
What is the bootstrap?
  • Efron (1979), Bickel and Freedman (1981), Singh
    (1981)
  • Resampling procedure, used to construct
    confidence intervals and calculate standard
    errors for statistics

3
The bootstrap procedure
  • Have random sample of size n, say
  • Draw M observations out of the n, with
    replacement
  • Calculate the statistic of interest for this
    sample of size M
  • Repeat many times
  • Use the standard deviation of the statistic
    across these samples to estimate its standard
    error

4
Example: median
  • Suppose we would like to estimate the median of a
    population from a sample of size n
  • Sample M = n observations with replacement from
    the observed data
  • Take the median of this simulated data set
  • Repeat these steps B times, giving B simulated
    medians
  • These medians are approximately draws from the
    sampling distribution of the median of n
    observations
  • Calculate their standard deviation to estimate
    the standard error of the median
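The steps above can be sketched in Python; the exponential sample, the seed, and B = 2000 are illustrative assumptions, not from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

# A hypothetical sample of size n (illustrative data only)
sample = rng.exponential(scale=2.0, size=100)
n = len(sample)

B = 2000  # number of bootstrap replicates
medians = np.empty(B)
for b in range(B):
    # Draw M = n observations with replacement from the observed data
    resample = rng.choice(sample, size=n, replace=True)
    medians[b] = np.median(resample)

# The standard deviation of the simulated medians estimates
# the standard error of the sample median
se_median = medians.std(ddof=1)
```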

5
When does the bootstrap work?
  • The underlying idea is that of Russian dolls:
    the bootstrap samples should relate to the
    original sample just as the original sample
    relates to the unknown population
  • (Count the freckles on the faces of Russian
    dolls)

6
Empirical measures
  • Each observation can be represented by a point
    mass in space
  • The average of these point masses is called the
    empirical measure: a random quantity taking
    values in the set of measures

7
Limits of empirical measures
  • This empirical measure will converge to a limit
    if the conditions are right, just as in the law
    of large numbers
  • Just like for real-valued random quantities, for
    independent identically distributed observations
    an approximation by a Gaussian measure holds
  • We say that the bootstrap works when the
    bootstrap empirical measure can be approximated
    by a Gaussian measure centred around the true
    measure

8
Conditions for validity?
  • The theoretical arguments proving that the
    bootstrap works rely on large independent samples
  • But for dependent observations the standard
    deviation would be estimated wrongly
  • For time series, the blockwise bootstrap
    (Künsch (1989), Carlstein et al. (1998)) samples
    a whole block of observations and uses the block
    to approximate the standard deviation

9
Dependency graphs
  • For random variables we can construct a graph
    with the random variables as the vertices
  • Two vertices are linked by an edge if and only if
    the corresponding random variables are dependent
  • The set of all neighbours of a vertex is then the
    set of all random variables which are dependent
    on the vertex random variable
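A small worked example (an illustration assumed here, not from the slides): for a moving-average-style sequence X_i = e_i + e_{i+1} built from i.i.d. e's, X_i and X_j are dependent exactly when |i - j| <= 1, so the dependency graph links each vertex to its immediate neighbours:

```python
import numpy as np

rng = np.random.default_rng(1)

n = 10
eps = rng.standard_normal(n + 1)
# MA(1)-style sequence: X_i = eps_i + eps_{i+1}; X_i and X_j share an
# underlying eps term (hence are dependent) exactly when |i - j| <= 1
x = eps[:-1] + eps[1:]

# Dependency graph as an edge set: vertices i < j are linked iff dependent
edges = {(i, j) for i in range(n) for j in range(i + 1, n) if j - i == 1}

# Neighbourhood of a vertex = all random variables dependent on it
neighbours = {i: {j for j in range(n) if abs(i - j) == 1} for i in range(n)}
```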

10
Bootstrapping in such graphs
  • To capture the dependence structure, we bootstrap
    not isolated vertices but whole neighbourhoods of
    dependence together with the vertex
  • Have to weight and re-scale observations

11
Regular graph
  • If all dependency neighbourhoods have the same
    size, i.e. every vertex has the same degree, then
    we have a regular graph
  • If the dependency neighbourhoods are small, then
    the bootstrap works (we have a numerical bound)

12
Re-weighting
  • If the graph is not only regular but all
    pairwise intersections of dependency
    neighbourhoods also have the same size, g say,
    then adjust the variance estimate by multiplying
    by M and dividing by (n - g), where M is the size
    of the bootstrap sample and n is the original
    sample size
  • The same weights apply if the intersections are
    all empty (g = 0)
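The rescaling step can be sketched as a small helper; the function name is an assumption, and this shows only the variance correction, not the full neighbourhood-resampling procedure from the talk:

```python
import numpy as np

def corrected_variance(boot_stats, M, n, g):
    """Rescale a bootstrap variance estimate for a regular dependency
    graph whose pairwise neighbourhood intersections all have size g:
    multiply by M (bootstrap sample size), divide by (n - g)."""
    naive_var = np.var(boot_stats, ddof=1)
    return naive_var * M / (n - g)
```

For example, with illustrative numbers, bootstrap statistics [1.0, 2.0, 3.0] with M = 50, n = 100 and empty intersections (g = 0) give a corrected variance of 1.0 * 50 / 100 = 0.5.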

13
Weights in K-nearest neighbour graphs
  • Place vertices on a circle and connect each
    vertex to its k nearest neighbours to the left
    and to the right, so each vertex has degree 2k
  • Have to multiply the variance estimator by M and
    divide by (n - 2k)
  • But also have to weight covariance part
    differently, depending on the size of dependency
    neighbourhood overlaps
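The circular k-nearest-neighbour construction can be sketched as follows (the function name and the values n = 12, k = 2 are illustrative assumptions):

```python
import numpy as np

def circular_knn_neighbours(n, k):
    """Vertices 0..n-1 on a circle; each vertex is joined to its k
    nearest neighbours on each side, so every vertex has degree 2k."""
    return {i: sorted({(i + d) % n for d in range(-k, k + 1) if d != 0})
            for i in range(n)}

nbrs = circular_knn_neighbours(n=12, k=2)
degrees = {i: len(v) for i, v in nbrs.items()}
# For this graph the variance estimator is multiplied by M
# and divided by (n - 2k)
```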

14
Example: Bucky ball
15
Weighted network
  • For each edge, simulate an independent standard
    normal
  • Fix a random orientation of the edge
  • For each vertex, add the normals on edges going
    into the vertex and subtract the normals on edges
    going out of it
  • Sampling distribution for the variance?
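A minimal sketch of this simulation, assuming a cycle graph as a stand-in for the Bucky ball (whose adjacency is not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(2)

n = 20
# Edges of a cycle graph, used here in place of the Bucky ball
edges = [(i, (i + 1) % n) for i in range(n)]

# One i.i.d. standard normal per edge, with a random orientation
z = rng.standard_normal(len(edges))
orient = rng.choice([-1, 1], size=len(edges))

# Vertex value: add the normals on edges oriented into the vertex,
# subtract those oriented out of it
vertex = np.zeros(n)
for (u, v), zi, s in zip(edges, z, orient):
    # s = +1: edge points u -> v (into v, out of u); s = -1: reversed
    vertex[v] += s * zi
    vertex[u] -= s * zi

# Sample variance of the vertex values
sample_var = vertex.var(ddof=1)
```

Each edge contributes +z to one endpoint and -z to the other, so the vertex values always sum to zero; their variance is the quantity whose sampling distribution the slide asks about.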

16
Realisation
17
Dependency bucky graph
18
Variances
19
Numerical values
20
Summary
  • Dependency graph bootstrapping (bootstrapping
    from graphs in which edges indicate dependence)
    works when the graph is (reasonably) regular,
    provided that the variance estimates are
    multiplied by the correction factor
  • Independent bootstrapping may lead to wrong
    standard error estimates

21
Reference
  • S. Holmes and G. Reinert. Stein's method for the
    bootstrap. In Stein's Method: Expository
    Lectures and Applications. P. Diaconis and S.
    Holmes, eds, IMS, Hayward, 2004.
  • http://www.stats.ox.ac.uk/reinert/papers/steinbootstrap.pdf