Transcript and Presenter's Notes

Title: Bootstraps


1
Lecture 6
  • Bootstraps
  • Maximum Likelihood Methods

2
Bootstrapping
  • A way to generate empirical probability distributions
  • Very handy for making estimates of uncertainty

3
100 realizations of a normal distribution p(y) with ȳ = 50, σ_y = 100
4
What is the distribution of ȳ_est = (1/N) Σᵢ yᵢ ?
5
We know this should be a Normal distribution with expectation ȳ = 50 and √variance σ_y/√N = 10
[Figure: p(y) vs. y, and the much narrower p(ȳ_est) vs. ȳ_est]
6
Here's an empirical way of determining the distribution, called bootstrapping
7
random integers in the range 1 to N:   4   3   7  11   4   1   9  …  6
N original data:                      y₁  y₂  y₃  y₄  y₅  y₆  y₇  …  y_N
N resampled data:                     y′₁ y′₂ y′₃ y′₄ y′₅ y′₆ y′₇ …  y′_N
                                     = y₄  y₃  y₇  y₁₁  y₄  y₁  y₉  …  y₆
Compute the estimate ȳ_est = (1/N) Σᵢ y′ᵢ
Now repeat a gazillion times and examine the
resulting distribution of estimates
8
Note that we are doing random sampling with replacement of the original dataset y to create a new dataset y′
Note that the same datum, yᵢ, may appear several times in the new dataset, y′
9
pot of an infinite number of y's with distribution p(y)
cup of N y's drawn from the pot
Does a cup drawn from the pot capture the statistical behavior of what's in the pot?
10
Take 1 cup from the pot with distribution p(y), duplicate the cup an infinite number of times, and pour into a new pot; the new pot has distribution ≈ p(y)
More or less the same thing in the 2 pots?
11
Random sampling is easy to code in MATLAB:

yprime = y(unidrnd(N,N,1));   % resampled data
% unidrnd(N,N,1) is a vector of N random integers between 1 and N
% y is the original data; yprime is the resampled data
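As a sketch of the whole procedure, here is the full bootstrap loop for the sample mean (my own elaboration of the snippet above; it assumes the data vector y and its length N are already defined, and uses randi, the base-MATLAB equivalent of unidrnd):

% bootstrap the distribution of the sample mean, yest = (1/N) sum y'_i
Nr = 1e5;                        % number of bootstrap realizations
yest = zeros(Nr,1);
for k = 1:Nr
    yprime  = y(randi(N,N,1));   % resample N data with replacement
    yest(k) = mean(yprime);      % estimate from this resampled dataset
end
histogram(yest)                  % empirical distribution of the estimate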
12
The theoretical and bootstrap results match pretty well!
[Figure: theoretical distribution vs. bootstrap with 10⁵ realizations]
13
Obviously, bootstrapping is of limited utility when we know the theoretical distribution (as in the previous example)
14
but it can be very useful when we don't. For example, what's the distribution of σ_y^est, where (σ_y^est)² = (1/(N−1)) Σᵢ (yᵢ − ȳ_est)² and ȳ_est = (1/N) Σᵢ yᵢ? (Yes, I know a statistician would know it follows Student's t-distribution …)
15
To do the bootstrap we calculate
ȳ_est = (1/N) Σᵢ y′ᵢ
(σ_y^est)² = (1/(N−1)) Σᵢ (y′ᵢ − ȳ_est)²
σ_y^est = √((σ_y^est)²)
many times, say 10⁵ times
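A minimal MATLAB sketch of this bootstrap (again assuming y and N are in memory; std applies the 1/(N−1) normalization written above):

Nr = 1e5;                         % number of bootstrap realizations
syest = zeros(Nr,1);
for k = 1:Nr
    yprime   = y(randi(N,N,1));   % resample with replacement
    syest(k) = std(yprime);       % sqrt( (1/(N-1)) sum (y'_i - yest)^2 )
end
histogram(syest)                  % empirical distribution p(syest)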
16
Here's the bootstrap result
[Figure: p(σ_y^est) vs. σ_y^est, bootstrap with 10⁵ realizations; the true value σ_y^true is marked]
I numerically calculate an expected value of 92.8 and a √variance of 6.2. Note that the distribution is not quite centered about the true value of 100. This is random variation: the original N = 100 data are not quite representative of an infinite ensemble of normally-distributed values
17
So we would be justified saying
σ_y ≈ 92.6 ± 12.4
that is, ±2×6.2, the 95% confidence interval
18
The Maximum Likelihood Method
  • A way to fit parameterized probability distributions to data
  • Very handy when you have good reason to believe the data follow a particular distribution

19
Likelihood Function, L
  • The logarithm of the probable-ness of a given dataset

20
  • N data y are all drawn from the same distribution p(y)
  • The probable-ness of a single measurement yᵢ is p(yᵢ)
  • So the probable-ness of the whole dataset is p(y₁) × p(y₂) × … × p(y_N) = Πᵢ p(yᵢ)
  • L = ln Πᵢ p(yᵢ) = Σᵢ ln p(yᵢ)

21
  • Now imagine that the distribution p(y) is known up to a vector m of unknown parameters
  • Write p(y; m), with the semicolon as a reminder that it's not a joint probability
  • Then L is a function of m:
  • L(m) = Σᵢ ln p(yᵢ; m)
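As an illustration (my own sketch, not from the slides), L(m) can be coded directly from this definition; here for the normal distribution of the next slide, with parameter vector m = [ȳ, σ]:

% L(m) = sum_i ln p(y_i; m) for a normal distribution
% m(1) = mean, m(2) = standard deviation
L = @(m,y) sum( -0.5*log(2*pi) - log(m(2)) - 0.5*(y - m(1)).^2 / m(2)^2 );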

22
  • The Principle of Maximum Likelihood
  • Choose m so that it maximizes L(m):
  • ∂L/∂mᵢ = 0
  • The dataset that was in fact observed is the most probable one that could have been observed

23
Example: normal distribution of unknown mean ȳ and variance σ²
  • p(yᵢ) = (2π)^(−1/2) σ^(−1) exp[ −½ σ^(−2) (yᵢ − ȳ)² ]
  • L = Σᵢ ln p(yᵢ) = −½ N ln(2π) − N ln(σ) − ½ σ^(−2) Σᵢ (yᵢ − ȳ)²
  • ∂L/∂ȳ = 0 = σ^(−2) Σᵢ (yᵢ − ȳ)
  • ∂L/∂σ = 0 = −N σ^(−1) + σ^(−3) Σᵢ (yᵢ − ȳ)²

The N's arise because the sum runs from 1 to N
24
Solving for ȳ and σ
  • 0 = σ^(−2) Σᵢ (yᵢ − ȳ)  ⟹  ȳ = N^(−1) Σᵢ yᵢ
  • 0 = −N σ^(−1) + σ^(−3) Σᵢ (yᵢ − ȳ)²  ⟹  σ² = N^(−1) Σᵢ (yᵢ − ȳ)²

25
Interpreting the results
  • ȳ = N^(−1) Σᵢ yᵢ
  • σ² = N^(−1) Σᵢ (yᵢ − ȳ)²

The sample mean is the maximum likelihood estimate of the expected value of the normal distribution
The sample variance (more-or-less) is the maximum likelihood estimate of the variance of the normal distribution
(hence the issue of N vs. N−1 in the formula)
26
Example: 100 data drawn from a normal distribution with true ȳ = 50, σ = 100
27
[Figure: likelihood surface L(ȳ, σ) over the (ȳ, σ) plane; maximum at ȳ ≈ 62, σ ≈ 107]
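A surface like this can be computed by brute force; here is a sketch (the grid limits and spacing are my choices) that evaluates L on a grid of trial (ȳ, σ) values and reports the maximum, assuming y and N from the previous slide:

ybars  = linspace(0,100,201);    % trial means
sigmas = linspace(50,200,201);   % trial standard deviations
L = zeros(length(ybars),length(sigmas));
for i = 1:length(ybars)
    for j = 1:length(sigmas)
        L(i,j) = -0.5*N*log(2*pi) - N*log(sigmas(j)) ...
                 - 0.5*sum((y - ybars(i)).^2)/sigmas(j)^2;
    end
end
[~,imax] = max(L(:));                     % linear index of the maximum
[i,j] = ind2sub(size(L),imax);
fprintf('max at ybar = %g, sigma = %g\n', ybars(i), sigmas(j))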
28
Another example: the exponential distribution
  • p(yᵢ) = ½ σ^(−1) exp[ −σ^(−1) |yᵢ − ȳ| ]
  • Check normalization: use z = yᵢ − ȳ
  • ∫ p(yᵢ) dyᵢ = ½ σ^(−1) ∫_{−∞}^{∞} exp[ −σ^(−1) |yᵢ − ȳ| ] dyᵢ
  •            = ½ σ^(−1) · 2 ∫₀^∞ exp[ −σ^(−1) z ] dz
  •            = σ^(−1) [ (−σ) exp(−σ^(−1) z) ]₀^∞ = 1

Is the parameter ȳ really the expectation? Is the parameter σ really the √variance?
29
Is ȳ the expectation?
  • E(yᵢ) = ∫_{−∞}^{∞} yᵢ ½ σ^(−1) exp[ −σ^(−1) |yᵢ − ȳ| ] dyᵢ
  • use z = yᵢ − ȳ
  • E(yᵢ) = ½ σ^(−1) ∫_{−∞}^{∞} (z + ȳ) exp[ −σ^(−1) |z| ] dz
  •       = ½ σ^(−1) · 2 ȳ ∫₀^∞ exp[ −σ^(−1) z ] dz
  •       = [ −ȳ exp(−σ^(−1) z) ]₀^∞ = ȳ

(the z exp(−σ^(−1)|z|) term is an odd function times an even function, so its integral is zero)
YES!
30
Is σ the √variance?
  • var(yᵢ) = ∫_{−∞}^{∞} (yᵢ − ȳ)² ½ σ^(−1) exp[ −σ^(−1) |yᵢ − ȳ| ] dyᵢ
  • use z = σ^(−1) (yᵢ − ȳ)
  • var(yᵢ) = ½ σ^(−1) ∫_{−∞}^{∞} σ² z² exp[ −|z| ] σ dz
  •         = σ² ∫₀^∞ z² exp(−z) dz
  •         = 2σ² ≠ σ²

The CRC Math Handbook gives this integral as equal to 2
Not quite! (σ matches the √variance only up to a factor of √2)
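The handbook integral is easy to confirm numerically in MATLAB:

integral(@(z) z.^2 .* exp(-z), 0, Inf)   % returns 2.0000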
31
Maximum likelihood estimate
  • L = N ln(½) − N ln(σ) − σ^(−1) Σᵢ |yᵢ − ȳ|
  • ∂L/∂ȳ = 0 = σ^(−1) Σᵢ sgn(yᵢ − ȳ)
  • ∂L/∂σ = 0 = −N σ^(−1) + σ^(−2) Σᵢ |yᵢ − ȳ|
  • ȳ is such that Σᵢ sgn(yᵢ − ȳ) = 0

[Figure: |x| and its derivative d|x|/dx = sgn(x), which is +1 for x > 0 and −1 for x < 0]
The sum is zero when half the yᵢ's are bigger than ȳ and half of them smaller:
ȳ is the median of the yᵢ's
32
Once ȳ is known, then
  • ∂L/∂σ = 0 = −N σ^(−1) + σ^(−2) Σᵢ |yᵢ − ȳ|
  • σ = N^(−1) Σᵢ |yᵢ − ȳ| with ȳ = median(y)
  • Note that when N is even, ȳ is not unique,
  • but can be anything between the two middle values in a sorted list of yᵢ's
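In MATLAB the two maximum likelihood estimates are one line each (assuming the data vector y):

ybar  = median(y);             % ML estimate of the expected value
sigma = mean(abs(y - ybar));   % sigma = (1/N) sum_i |y_i - ybar|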

33
Comparison
  • Normal distribution:
  • the best estimate of the expected value is the sample mean
  • Exponential distribution:
  • the best estimate of the expected value is the sample median

34
Comparison
  • Normal distribution
  • short-tailed
  • outliers extremely uncommon
  • the estimate of the expected value should be chosen to make outliers have as small a deviation as possible
  • Exponential distribution
  • relatively long-tailed
  • outliers relatively common
  • the estimate of the expected value should ignore the actual value of outliers

[Figure: data yᵢ with one outlier; the mean is pulled toward the outlier while the median is not]
35
Another important distribution
  • Gutenberg-Richter distribution
  • (e.g. earthquake magnitudes)
  • For earthquakes greater than some threshold magnitude m₀, the probability that the earthquake will have a magnitude greater than m is
  • P(m) = 10^(−b(m−m₀))
  • or P(m) = exp[ −log(10) b (m−m₀) ]
  • = exp[ −β (m−m₀) ] with β = log(10) · b
36
  • This is a cumulative distribution, thus the probability that the magnitude is greater than m₀ is unity:
  • P(m₀) = exp[ −β (m₀−m₀) ] = exp(0) = 1
  • The probability density function is (minus) its derivative:
  • p(m) = β exp[ −β (m−m₀) ]

37
  • Maximum likelihood estimate of β:
  • L(β) = N ln(β) − β Σᵢ (mᵢ − m₀)
  • ∂L/∂β = 0 = N/β − Σᵢ (mᵢ − m₀)
  • β = N / Σᵢ (mᵢ − m₀)
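A quick sketch of this estimator, tested on synthetic magnitudes drawn by inverse-transform sampling (the values m₀ = 4 and b = 1 are arbitrary choices of mine):

m0 = 4.0;  btrue = 1.0;  beta_true = log(10)*btrue;
N  = 1e4;
m  = m0 - log(rand(N,1))/beta_true;   % exponential magnitudes above m0
beta = N / sum(m - m0);               % ML estimate: beta = N / sum (m_i - m0)
b    = beta / log(10)                 % recovered b-value, should be near 1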

38
Originally, Gutenberg and Richter made a mistake by estimating the slope, b, using least-squares and not the maximum likelihood formula
[Figure: log₁₀ P(m) vs. magnitude m, with a least-squares fit line of slope −b]
39
Yet another important distribution
  • Fisher distribution on a sphere
  • (e.g. paleomagnetic directions)
  • Given unit vectors xᵢ that scatter around some mean direction x̄, the probability distribution for the angle θ between xᵢ and x̄ (that is, cos(θ) = xᵢ · x̄) is
  • p(θ) = [ κ / (2 sinh(κ)) ] sin(θ) exp[ κ cos(θ) ]

κ is called the precision parameter
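As a numerical sanity check (my addition), p(θ) should integrate to one over 0 ≤ θ ≤ π for any κ:

k = 5;   % an arbitrary precision parameter
integral(@(t) k/(2*sinh(k)) * sin(t).*exp(k*cos(t)), 0, pi)   % returns 1.0000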
40
Rationale for the functional form
  • p(θ) ∝ exp[ κ cos(θ) ]

For θ close to zero, cos(θ) ≈ 1 − ½θ², so
p(θ) ∝ exp[ κ cos(θ) ] ≈ exp(κ) exp(−½κθ²)

which is a Gaussian in θ
41
I'll let you figure out the maximum likelihood estimate of the central direction, x̄, and the precision parameter, κ