A Resampling Study of NASS Survey MPPS Sampling Strategy

1
A Resampling Study of NASS Survey MPPS Sampling
Strategy
  • By Stanley Weng
  • National Agricultural Statistics Service
  • U.S. Department of Agriculture

2
INTRODUCTION
  • MPPS: Multivariate Probability Proportional to Size
  • Addresses multiple, often competing, purposes
    (multiple targets) of a survey
  • Used for the NASS Crops Survey (CS) and other surveys since 1999

3
MPPS
  • Technically, the sample was selected using a Poisson method. Each
    farm i had its own probability of selection,
    formed from the item-level selection probabilities


4
MPPS
  • The item m selection probability is determined by
    • auxiliary data, with the assumption that the
      variance is proportional to (a power of) the
      auxiliary variable value
    • optimal allocation
    • a desired item-level sample size
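
The slide's own formula did not survive in this transcript, so the following is only a minimal sketch of how item-level probabilities could be combined into one farm-level probability and used in a Poisson draw. It assumes one formulation often described for MPPS designs: each item probability is proportional to a power of the auxiliary value, scaled by the desired item sample size, and the farm probability is the largest item probability capped at 1. The function names, the power of 0.75, and the data are all illustrative assumptions, not taken from the presentation.

```python
import random

def mpps_probabilities(aux, targets, power=0.75):
    """Farm-level selection probabilities from item-level ones (assumed form).

    aux[m][i]  : auxiliary value of farm i for item m (hypothetical data)
    targets[m] : desired sample size for item m
    power      : exponent applied to the auxiliary value (assumed, not from the slides)
    """
    n_farms = len(aux[0])
    probs = [0.0] * n_farms
    for x_m, n_m in zip(aux, targets):
        sizes = [x ** power for x in x_m]
        total = sum(sizes)
        for i, s in enumerate(sizes):
            # Item-level probability; keep the largest one seen across items.
            probs[i] = max(probs[i], n_m * s / total)
    # A probability cannot exceed 1.
    return [min(1.0, p) for p in probs]

def poisson_sample(probs, rng):
    """Poisson sampling: each unit is included independently with its own probability."""
    return [i for i, p in enumerate(probs) if rng.random() < p]

aux = [
    [100.0, 50.0, 0.0, 10.0],  # hypothetical item 1 auxiliary values
    [0.0, 20.0, 80.0, 5.0],    # hypothetical item 2 auxiliary values
]
targets = [2, 2]
probs = mpps_probabilities(aux, targets)
sample = poisson_sample(probs, random.Random(0))
```

Because inclusion is decided independently for each farm, the realized sample size is random; only its expectation (the sum of the probabilities) is controlled.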

5
MPPS
  • Development and application of the MPPS strategy
    at NASS
  • Amrhein, Hicks and Kott (1996)
  • Amrhein and Bailey (1998)
  • Bailey and Kott (1997)
  • Hicks, Amrhein and Kott (1996)
  • Kott, Amrhein and Hicks (1998).

6
A COMPARISON STUDY
  • This study was designed to compare MPPS with the
    previously used (stratified) simple random
    sampling (SRS) strategy

7
THIS STUDY
  • Explored the resampling approach to reveal the
    statistical characteristics/behavior of NASS Ag
    survey data
  • Raised issues for further investigation to
    improve our understanding and practice of NASS Ag
    survey sampling/estimation

8
RESAMPLING
  • Population bootstrap
  • Base sample: June Crop Survey MPPS samples
  • Pseudo population: composed of replicates of base
    sample elements, according to the (integerized)
    weight of each element
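
As a rough illustration of the pseudo population described above, the sketch below replicates each base sample element according to its integerized weight. The slide does not say how weights were integerized, so simple rounding is an assumption here, and the function name and data are hypothetical.

```python
def build_pseudo_population(values, weights):
    """Replicate each base-sample value according to its integerized weight."""
    pseudo = []
    for y, w in zip(values, weights):
        copies = int(round(w))  # rounding is an assumed integerization rule
        pseudo.extend([y] * copies)
    return pseudo

base_values = [120.0, 45.0, 300.0]   # hypothetical reported acres
base_weights = [2.4, 10.6, 1.0]      # hypothetical sampling weights
pseudo = build_pseudo_population(base_values, base_weights)
```

With these inputs the three elements are replicated 2, 11, and 1 times, giving a pseudo population of 14 units.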

9
RESAMPLING
  • Resamples: independent samples, drawn from the
    pseudo population by the Poisson and SRS sampling
    strategies, respectively

10
RESAMPLING
  • Resample totals, one set for the Poisson strategy and one for the SRS strategy

11
RESAMPLING
  • Resampling variance estimate for the sample total
    estimate
  • Bootstrap statistic
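
The formulas on these slides were lost in the transcript. A minimal sketch of the general idea follows: draw many resamples from a pseudo population under each strategy, estimate the total from each resample, and take the variance of those totals as the resampling variance estimate. The estimators (expansion estimator for SRS, Horvitz-Thompson for Poisson), the size measure, and the data are assumptions chosen for illustration and are not the study's actual setup.

```python
import random
import statistics

def srs_total(pop, n, rng):
    """Expansion estimate of the total from a simple random sample without replacement."""
    sample = rng.sample(pop, n)
    return len(pop) / n * sum(sample)

def poisson_total(pop, probs, rng):
    """Horvitz-Thompson estimate of the total from a Poisson sample."""
    return sum(y / p for y, p in zip(pop, probs) if rng.random() < p)

def resampling_variance(totals):
    """Variance of the resample totals (divisor B - 1)."""
    return statistics.variance(totals)

rng = random.Random(42)
pop = [float(v) for v in range(1, 201)]      # hypothetical pseudo population
n = 40                                       # SRS size and expected Poisson size
sizes = [y ** 0.5 for y in pop]              # assumed size measure
size_sum = sum(sizes)
probs = [min(1.0, n * s / size_sum) for s in sizes]

B = 1000                                     # number of resamples, as on the later slide
srs_totals = [srs_total(pop, n, rng) for _ in range(B)]
poi_totals = [poisson_total(pop, probs, rng) for _ in range(B)]
v_srs = resampling_variance(srs_totals)
v_poi = resampling_variance(poi_totals)
```

Which strategy shows the smaller variance in a toy setup like this depends on the size measure and the data, so nothing should be read into the comparison here.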

12
DATA
  • The crop component of the 2004 and 2005 June QAS,
    for all 42 participating states
  • Certainty elements were removed from the sample
    to avoid unnecessary complication


13
RESAMPLING VAR ESTIMATES
  • Based on 1000 resamples
  • Naive comparison
    • Log-log plot
    • Resampling variance estimate vs. sample
      total, across crops, for each state
    • Overlay: Poisson vs. SRS

14
Naive Comparison
  • General linear trend
  • (Assumption: the variance is proportional to a
    power of the total)
  • For the majority of crops, the SRS variance appeared
    greater than the Poisson variance (but often not
    appreciably)
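
If the variance is proportional to a power of the total, then log(variance) is linear in log(total), and the slope of that line is the power. The sketch below fits that line by ordinary least squares. The function name and the synthetic data are assumptions for illustration only.

```python
import math

def loglog_fit(totals, variances):
    """Least-squares fit of log(variance) = intercept + slope * log(total)."""
    xs = [math.log(t) for t in totals]
    ys = [math.log(v) for v in variances]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept

# Synthetic example: variance = 2 * total^1.5, so the fitted slope should be 1.5.
totals = [1e3, 1e4, 1e5, 1e6]
variances = [2.0 * t ** 1.5 for t in totals]
slope, intercept = loglog_fit(totals, variances)
```

On real resampling output, crops lying far from the fitted line would be the candidates for the outlier checks discussed later in the deck.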

15
[Figure: Log-log plot of resampling variance estimate vs. total across crops, CA. Overlay: Poisson vs. SRS. Point labels: pot, srg, saf, sun, dwh, bar, ctp, oat, ohy, ctu, wwh, ric, crn, alf]

16
Validity of the Comparison
  • Additional information is needed to justify the comparison
  • The quality of the resampling variance estimate
    depends on the statistical quality of the
    resample totals, which also provides evidence
    for the appropriateness of the sampling strategy
  • Among various aspects, the most important is
    NORMALITY

17
Normality
  • Q-Q plot of resample totals
  • Demonstration: CA
    • Most crops: good shape of Q-Q plot
      (Corn, Potatoes)
    • Exception: Other Hay
  • Evidence that Poisson was better than SRS
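
The Q-Q plots themselves are not in the transcript. As a sketch of the underlying check, the code below pairs the ordered, standardized resample totals with standard normal quantiles and summarizes straightness by the correlation of those pairs. The plotting position (k - 0.5)/n and the use of a correlation summary are assumptions for illustration; the presentation relied on visual inspection.

```python
import random
from statistics import NormalDist, mean, stdev

def qq_points(values):
    """Pairs of (theoretical normal quantile, standardized ordered value)."""
    ordered = sorted(values)
    n = len(ordered)
    mu, sigma = mean(ordered), stdev(ordered)
    std_normal = NormalDist()
    points = []
    for k, v in enumerate(ordered, start=1):
        p = (k - 0.5) / n  # assumed plotting position
        points.append((std_normal.inv_cdf(p), (v - mu) / sigma))
    return points

def qq_correlation(points):
    """Pearson correlation of the Q-Q pairs; values near 1 suggest normality."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(1)
normal_like = [rng.gauss(5000.0, 300.0) for _ in range(1000)]  # hypothetical totals
skewed = [rng.expovariate(1.0 / 5000.0) for _ in range(1000)]  # hypothetical skewed totals
corr_normal = qq_correlation(qq_points(normal_like))
corr_skewed = qq_correlation(qq_points(skewed))
```

A near-straight Q-Q plot corresponds to a correlation close to 1, while a skewed set of totals bends away from the line and gives a visibly lower value.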

24
Outliers on the log-log plot
  • Located far apart from the general trend
  • The two sampling strategies gave appreciably
    different estimates
  • Demonstration:
    • CA: Other Hay
    • MT: Potatoes
  • Evidence that SRS was better

25
[Figure: Log-log plot of resampling variance estimate vs. total across crops, MT. Overlay: Poisson vs. SRS. Point labels: mus, sun, can, pot, saf, fla, crn, oat, ohy, dwh, bar, alf, wwh, swh]

28
FINITE SAMPLE RESAMPLING
  • Complexities due to the special features of survey
    sampling
    • Nonindependence arising in sampling without
      replacement
    • Other complexities of finite population
      structure introduced by designs and estimators

29
FINITE SAMPLE RESAMPLING
  • Effects of discreteness
    (Davison and Hinkley, 1997, Section 2.3.2)
    • Discrete empirical distribution
    • In particular, in finite population sampling, the
      pseudo population is formed by replicates of
      sample elements

30
FINITE SAMPLE RESAMPLING
  • Issues with this study
    • Comparable sample size: addressed by size adjustment
    • Impact of the base sample: not clear

31
Impact of Base Sample
  • For finite population resampling, the general
    guideline is:
    • The resampling population mimics the original
      population, and
    • The resamples mimic the base sample: they are
      drawn from the resampling population by a design
      identical to the one by which the base sample
      was originally drawn
  • (Särndal et al., 1992, Ch. 11)

32
AT ISSUE
  • How should the resampling technique be
    modified to accommodate the finite
    sampling situation?

33
AT ISSUE
  • In the literature, most reported finite
    sample resampling studies used
    (stratified) SRS, which bears the most
    similarity to infinite population
    independent random sampling, the
    standard setting on which the resampling
    technique is based

34
SUMMARY
  • An approach: resampling, with analysis of the resamples
    using statistical graphical and diagnostic
    techniques, to reveal the statistical characteristics/behavior
    of NASS Ag survey data

35
SUMMARY
  • Sampling strategy comparison
    • Poisson seemed to be preferable to stratified
      simple random sampling
    • A national comparison table of the two
      strategies across crops and states is to be
      produced for a comprehensive picture, with likely
      causal factors identified

36
FURTHER INVESTIGATION
  • To develop statistical understanding,
    the resampling setting of this study and
    other statistical information techniques
    will be further explored

37
FURTHER INVESTIGATION
  • Behavior of Studentized bootstrap statistics
  • Statistical function
    (Booth, Butler, and Hall, 1994;
    Davison and Hinkley, 1997)
  • Examine different survey data
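
For orientation, a Studentized bootstrap statistic divides the resampling error of an estimate by that resample's own standard error estimate. The sketch below does this for the total under SRS without replacement, using the textbook variance estimator with a finite population correction. The choice of estimator, the function name, and the data are assumptions for illustration, not the study's planned method.

```python
import math
import random
import statistics

def studentized_total(sample, pop_size, reference_total):
    """(estimated total - reference total) / estimated standard error, for SRS."""
    n = len(sample)
    est = pop_size * statistics.mean(sample)
    # Textbook SRS variance estimate with finite population correction.
    var_est = pop_size ** 2 * (1 - n / pop_size) * statistics.variance(sample) / n
    return (est - reference_total) / math.sqrt(var_est)

rng = random.Random(7)
pseudo_pop = [rng.uniform(10.0, 500.0) for _ in range(400)]  # hypothetical pseudo population
pseudo_total = sum(pseudo_pop)
t_stats = [
    studentized_total(rng.sample(pseudo_pop, 50), len(pseudo_pop), pseudo_total)
    for _ in range(500)
]
```

Examining the distribution of `t_stats` (for example with a Q-Q plot) is one way to study how such statistics behave under a given design.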

38
THANK YOU
39
ALF Alfalfa All Harvested Acres
BAR Barley All Planted Acres
CAN Canola All Planted Acres
CRN Corn Planted Acres
CTP Pima Cotton Planted Acres
CTU Upland Cotton Planted Acres
DEB Dry Beans Planted Acres
DWH Durum Wheat Planted Acres
FLA Flaxseed Planted Acres
MUS Mustard All Planted Acres
OAT Oats All Planted Acres
OHY Other Hay Harvested Acres
PNT Peanuts All Planted Acres
POT Potatoes All Planted Acres
RIC Rice All Grain Planted Acres
RYE Rye All Planted Acres
SAF Safflower All Planted Acres
SGB Sugarcane All Planted Acres
SOY Soybeans All Planted Acres
SPT Sweet potatoes Planted Acres
SRG Sorghum All Planted Acres
SUG Sugarcane For Sugar Harvested Acres
SUN Sunflowers All Planted Acres
SWH Spring Wheat Irr Planted Acres
WWH Winter Wheat All Planted Acres