730 Lecture 20 - PowerPoint PPT Presentation

Slides: 23
Provided by: Ala757

Transcript and Presenter's Notes



1
730 Lecture 20
Today's lecture
Computer Intensive Methods
2
Sampling distributions
Population distribution F
Sampling distribution Fs
3
The basic idea (iid case)
F + S = Fs
Population distribution + statistic = sampling distribution
4
Examples
5
Sampling distributions: the alternatives
  • Derived from theory (if we can), or
  • Can use simulation! E.g.
  • Simulate X1,...,Xn from Expo(θ)
  • Compute s1 = sample mean
  • Repeat N = 10,000 times
  • Get sample s1,...,sN from sampling distribution
  • Display graphically, calculate std dev etc.

6
R code
n <- 10
N <- 10000
theta <- 5
s <- numeric(N)
for (i in 1:N) {
  # generate sample from population distribution
  x <- rgamma(n, 1) * theta
  # calculate statistic
  s[i] <- mean(x)
}
sqrt(var(s))
[1] 1.560461
(Correct value is 5/sqrt(10) = 1.5811)
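The quoted correct value follows from the variance of a sample mean of iid observations. A short derivation, assuming (as in the simulation) that the Expo(θ) population has mean θ and variance θ², with θ = 5 and n = 10:

```latex
\operatorname{Var}(\bar X) = \frac{\operatorname{Var}(X_1)}{n} = \frac{\theta^2}{n},
\qquad
\operatorname{sd}(\bar X) = \frac{\theta}{\sqrt{n}} = \frac{5}{\sqrt{10}} \approx 1.5811
```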
7
R code- graphs
par(mfrow = c(1, 2))
hist(s, breaks = 50)
alphas <- ((1:N) - 0.5) / N
gamma.quantiles <- theta * qgamma(alphas, n) / n
plot(sort(s), gamma.quantiles, xlab = "order statistics")
abline(0, 1)
8
Graphs
9
The big problem.
  • In practice we don't know F!
  • What can we do?
  • Estimate F!
  • How?

10
Estimating F
  • Two methods
  • Non-parametric: use the empirical distribution
    function (EDF)
  • Parametric: assume the form of F is known but F
    depends on unknown parameters.

11
Method 1: Non-parametric method
  • Estimate F by the EDF $F_n(x)$
  • $F_n(x)$ = proportion of the sample that is $\le x$
  • $\max_x |F_n(x) - F(x)| \to 0$ in probability
  • $\sqrt{n}\,(F_n(x) - F(x)) \to N(0,\ F(x)(1 - F(x)))$ in distribution
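
The EDF and its convergence to F can be illustrated with a small R sketch. It assumes a simulated N(0,1) population so that the true df is available through pnorm(); the helper name max_gap and the seed are illustrative choices, not part of the lecture.

```r
# Sketch: EDF of a simulated N(0,1) sample, compared with the true df.
set.seed(1)                       # arbitrary seed, for reproducibility only

# Largest gap between the EDF and the true df. The maximum is attained
# at a data point, either just before or just after the jump there.
max_gap <- function(x) {
  n  <- length(x)
  xs <- sort(x)
  Fx <- pnorm(xs)                 # true df at the ordered data
  max(pmax(abs((1:n) / n - Fx), abs((0:(n - 1)) / n - Fx)))
}

x50 <- rnorm(50)
Fn  <- ecdf(x50)                  # ecdf() returns the EDF as a function
Fn(0)                             # proportion of the sample that is <= 0
mean(x50 <= 0)                    # the same proportion, computed directly

# The maximum gap tends to shrink as the sample size grows
gaps <- sapply(c(50, 500, 5000), function(n) max_gap(rnorm(n)))
gaps
```

Calling plot(Fn) draws the step function, with a jump of 1/n at each data value.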

12
Empirical distribution function (cont)
EDF jumps up 1/n at each data value, e.g. for n = 3
13
EDF example
EDF of a N(0,1) sample of 50
14
Sampling from the EDF
  • The EDF of a sample x1,...,xn is the df of a
    discrete distribution that has probability mass
    1/n at each data point of the sample.
  • Thus, to draw a sample of size N from this
    distribution we draw a random sample of size N
    with replacement from x1,...,xn
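
A minimal sketch of drawing from the EDF in R. The five data values are hypothetical, chosen only to make the 1/n masses easy to see.

```r
# Sketch: sampling from the EDF is sampling with replacement from the data.
set.seed(2)                        # arbitrary seed
x <- c(2.1, 3.4, 0.7, 5.9, 1.2)    # hypothetical data, n = 5
N <- 10000

draws <- sample(x, N, replace = TRUE)

# Every draw is one of the original data points, and each point
# turns up with relative frequency close to 1/n = 0.2
freq <- table(draws) / N
freq
```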

15
Method 2: the parametric method
  • If we assume that the df of the population is
    F(x, θ) where F is known but θ is not, estimate F
    by F(x, θ̂), where θ̂ is an estimate of θ.
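
For an exponential population this could look as follows in R. This is a sketch under two assumptions: the parameter is the mean of the distribution, and it is estimated by the sample mean. The names theta.hat and F.hat are illustrative. rexp(n, rate = 1/theta) draws from the same distribution as rgamma(n, 1) * theta.

```r
# Sketch: parametric estimate of F for an exponential population.
set.seed(3)                          # arbitrary seed
theta <- 5                           # true mean, unknown in practice
x <- rexp(10, rate = 1 / theta)      # stands in for the observed data

theta.hat <- mean(x)                 # estimate of the mean parameter
F.hat <- function(t) pexp(t, rate = 1 / theta.hat)   # estimated df

F.hat(5)                             # estimated P(X <= 5)

# Drawing a new sample of size 10 from the estimated F
new.sample <- rexp(10, rate = 1 / theta.hat)
```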

16
The bootstrap
  • To estimate the standard error of a statistic S:
  • Estimate the population df.
  • Draw a random sample of size n from the estimated
    F and calculate S from the sample.
  • Repeat N times, get s1,...,sN
  • Calculate the std dev of the N values s1,...,sN
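
The steps above can be sketched as a small R function. This version uses the EDF as the estimate of F (the non-parametric choice); the function name boot.se and its arguments are illustrative, not taken from the lecture.

```r
# Sketch: non-parametric bootstrap standard error of a statistic.
# x: the data; statistic: a function of a sample; N: number of replicates.
boot.se <- function(x, statistic, N = 1000) {
  n <- length(x)
  s <- numeric(N)
  for (i in 1:N) {
    # draw a sample of size n from the EDF and compute the statistic
    s[i] <- statistic(sample(x, n, replace = TRUE))
  }
  sd(s)                              # std dev of the N bootstrap values
}

set.seed(4)                          # arbitrary seed
x <- rgamma(10, 1) * 5               # exponential sample with mean 5
se.mean <- boot.se(x, mean)
se.mean
sd(x) / sqrt(10)                     # usual formula, for comparison
```

For the sample mean the bootstrap value should land near the usual formula, scaled by sqrt((n-1)/n) because the EDF has variance (n-1)/n times the sample variance.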

17
The bootstrap (cont)
18
Example
  • Suppose we want to estimate the standard error of
    the sample variance. The population distribution
    is exponential and n = 10.

19
R code
n <- 10
N <- 1000
theta <- 5   # theta is the true value
# generate a sample
x <- rgamma(n, 1) * theta
# now do non-parametric bootstrap (use EDF)
s <- numeric(N)
for (i in 1:N) {
  bootstrap.sample <- sample(x, n, replace = TRUE)
  s[i] <- var(bootstrap.sample)
}
sqrt(var(s))
[1] 22.19461
20
R code (cont)
# now do parametric bootstrap (use exponential with estimated mean)
xbar <- mean(x)
s <- numeric(N)
for (i in 1:N) {
  bootstrap.sample <- rgamma(n, 1) * xbar
  s[i] <- var(bootstrap.sample)
}
sqrt(var(s))
[1] 35.26925
21
Theory
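Using tedious algebra, one can show that
$\mathrm{Var}(s^2) = \frac{1}{n}\left(\mu_4 - \frac{n-3}{n-1}\,\mu_2^2\right)$,
where $\mu_2$ and $\mu_4$ are the second and fourth central moments of the population.
For the exponential, $\mu_4 = 9\theta^4$ and $\mu_2 = \theta^2$. Thus
$\mathrm{Var}(s^2) = \frac{\theta^4}{n}\left(9 - \frac{n-3}{n-1}\right) = \frac{(8n-6)\,\theta^4}{n(n-1)}$.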
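
As a numerical check, the standard result Var(s²) = (μ4 − (n−3)/(n−1) · μ2²)/n can be evaluated directly in R with the exponential moments plugged in. The function name exact.se.var is an illustrative choice.

```r
# Sketch: exact standard error of the sample variance for an
# exponential population with mean theta and sample size n.
exact.se.var <- function(theta, n) {
  mu2 <- theta^2                     # second central moment
  mu4 <- 9 * theta^4                 # fourth central moment
  sqrt((mu4 - (n - 3) / (n - 1) * mu2^2) / n)
}

se.exact <- exact.se.var(5, 10)
se.exact                             # about 22.669
25 * sqrt(74 / 90)                   # the same value in closed form
```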
22
Results
  • For θ = 5, n = 10, the exact standard error is
    25 × sqrt(74/90) = 22.66912
  • The nonparametric bootstrap did very well
    (22.19461)
  • The parametric bootstrap was not very good
    (35.26925)
  • Any ideas why?