Systematic Sampling - PowerPoint PPT Presentation

Slides: 18
Provided by: barbara177

Transcript and Presenter's Notes

1
Systematic Sampling: Basic ideas
1. Systematic sampling is commonly used as an alternative to
simple random sampling.
2. It is easier to apply and less prone to error than simple
random sampling.
3. The cost of sampling is lower with systematic sampling than
with simple random sampling.
4. Its process can be easily checked, whereas the process is
difficult to check with simple random sampling.
2
  • Selection procedures
  •   For 1-in-k systematic sampling, k is
    determined by N/n. The selection probability is
    1/k or n/N.
  •   When N = nk:
  • 1. Select a random number (i) between 1 and k
    to determine the starting point.
  • 2. Select every k-th unit starting with i.
  • 3. Under this procedure there are k possible
    samples and each sample consists of n units.
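
The three steps above can be sketched in Python. This is a minimal illustration, assuming the frame is an in-memory list and that N = nk exactly; the function names `systematic_sample` and `all_systematic_samples` are hypothetical and do not come from the presentation.

```python
import random


def systematic_sample(frame, k, rng=random):
    """Draw one 1-in-k systematic sample from a list (assumes N = n * k)."""
    start = rng.randint(1, k)  # step 1: random start i in 1..k (inclusive)
    # Step 2: 1-based positions i, i + k, i + 2k, ... are the 0-based slice below.
    return frame[start - 1::k]


def all_systematic_samples(frame, k):
    """Step 3: enumerate the k possible samples, one per starting point."""
    return [frame[i::k] for i in range(k)]


frame = list(range(1, 13))                  # illustrative frame, N = 12
samples = all_systematic_samples(frame, 4)  # k = 4, so n = 3
for s in samples:
    print(s)
```

Every unit appears in exactly one of the k possible samples, which is why the selection probability is 1/k.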

3
  • 4. When N ≠ nk, the estimate is biased, but the
    bias is negligible when n > 50.

4
  • Circular systematic sampling when N ≠ nk
  • 1. Select a random number (i) between 1 and N
    to determine the starting point.
  • 2. Select every k-th unit starting with i,
    assuming the frame is circular (the end
    connected to the beginning).

5
3. Under this procedure there are N possible
samples. For example, when N = 10, n = 3, and k = 3,
the ten possible samples are as follows (any element
can be a starting point):
 1. (1, 4, 7)
 2. (2, 5, 8)
 3. (3, 6, 9)
 4. (4, 7, 10)
 5. (5, 8, 1)
 6. (6, 9, 2)
 7. (7, 10, 3)
 8. (8, 1, 4)
 9. (9, 2, 5)
10. (10, 3, 6)
6
  • Note that each element is represented in 3 out of
    10 samples. The sampling fraction is 3/10 or
    n/N.
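
The circular procedure and the N = 10, n = 3, k = 3 example can be reproduced with a short Python sketch. The function name `circular_systematic_sample` is hypothetical, and units are assumed to be labelled 1 to N.

```python
def circular_systematic_sample(N, n, k, start):
    """Select n units, every k-th from `start`, wrapping past N back to 1."""
    # Work with 0-based positions modulo N, then shift back to labels 1..N.
    return tuple((start - 1 + j * k) % N + 1 for j in range(n))


N, n, k = 10, 3, 3
samples = [circular_systematic_sample(N, n, k, s) for s in range(1, N + 1)]

# Count how many of the N possible samples contain each unit.
counts = {u: sum(u in s for s in samples) for u in range(1, N + 1)}

for s in samples:
    print(s)
print(counts)
```

Each unit is counted in n of the N possible samples, matching the sampling fraction n/N stated on the slide.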

7
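Variance of systematic sample mean
By definition, the variance of the systematic sample
mean is
$\operatorname{Var}(\bar{y}_{sy}) = \frac{1}{k}\sum_{i=1}^{k}(\bar{y}_i - \bar{Y})^2$,
where $\bar{y}_i$ is the mean of the i-th possible
systematic sample and $\bar{Y}$ is the population mean.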
8
  • Consider a population with N = 9, consisting of
    1, 2, 3, 4, 5, 6, 7, 8, and 9. If a 1-in-3
    systematic sample is taken, the following three
    samples are possible:
  • Sample 1: 1, 4, 7 (mean = 4)
  • Sample 2: 2, 5, 8 (mean = 5)
  • Sample 3: 3, 6, 9 (mean = 6)
  • By definition, the variance of the systematic
    sample mean is
    $[(4-5)^2 + (5-5)^2 + (6-5)^2]/3 = 2/3$.

9
  • If we take a simple random sample of n = 3 from
    the above population, the variance of the sample
    mean is $(60/9)(1/3)(6/8) = 5/3$, applying the
    formula in Box 3.2.
    ($\bar{Y} = 5$ and
    $\sigma^2 = (16+9+4+1+0+1+4+9+16)/9 = 60/9$.)
  • The above results ($2/3 < 5/3$) suggest that
    systematic sampling from an ordered frame
    (sorted by magnitude of value) performs better
    than simple random sampling.
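
The comparison between the two designs can be checked with exact arithmetic. The following Python sketch is only an illustration: the variable names are assumptions, and the simple random sampling line uses the without-replacement formula $(\sigma^2/n)(N-n)/(N-1)$, which matches the factors (60/9)(1/3)(6/8) quoted above.

```python
from fractions import Fraction

population = [1, 2, 3, 4, 5, 6, 7, 8, 9]
N, n, k = 9, 3, 3

pop_mean = Fraction(sum(population), N)
# Population variance with divisor N.
sigma2 = sum((Fraction(y) - pop_mean) ** 2 for y in population) / N

# Means of the k possible 1-in-k systematic samples.
sample_means = [Fraction(sum(population[i::k]), n) for i in range(k)]

# Variance of the systematic sample mean, by definition.
var_sys = sum((m - pop_mean) ** 2 for m in sample_means) / k

# Variance of the mean under simple random sampling without replacement.
var_srs = sigma2 / n * Fraction(N - n, N - 1)

print(var_sys, var_srs)
```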

10
  • The intra-class correlation coefficient from
    the above three possible systematic samples can
    be calculated, applying Formula 4.5 in Box 4.2
    (products containing the zero factor 5 - 5 are
    omitted):
  •   $\rho = 2[(1-5)(4-5) + (1-5)(7-5) + (4-5)(7-5) + (2-5)(8-5) + (3-5)(6-5) + (3-5)(9-5) + (6-5)(9-5)] / [9(2)(60/9)] = -42/120 = -21/60$
  • ρ can also be obtained by computing a correlation
    coefficient between all possible pairs of units
    within the same systematic sample
    (order considered): (1, 4), (4, 1), (1, 7), (7,
    1), (4, 7), (7, 4), (2, 5), (5, 2), (2, 8), (8,
    2), (5, 8), (8, 5), (3, 6), (6, 3), (3, 9), (9,
    3), (6, 9), and (9, 6).
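
Both routes to the intra-class correlation coefficient can be verified numerically. The sketch below is an illustration, not the textbook's own code; it assumes the population mean and variance are computed with divisor N, and the variable names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

samples = [(1, 4, 7), (2, 5, 8), (3, 6, 9)]
units = [u for s in samples for u in s]
N, n = len(units), len(samples[0])

mean = Fraction(sum(units), N)
sigma2 = sum((u - mean) ** 2 for u in units) / N

# All ordered pairs of distinct units within the same systematic sample.
pairs = [p for s in samples for p in permutations(s, 2)]

# Route 1: sum of cross-products over N(n-1) times the population variance.
# Ordered pairs count each unordered pair twice, which supplies the factor 2.
cross = sum((a - mean) * (b - mean) for a, b in pairs)
rho_formula = cross / (N * (n - 1) * sigma2)

# Route 2: Pearson correlation over the ordered pairs. The set of pairs is
# symmetric, so both coordinates share the same mean and variance.
m = len(pairs)
x_mean = Fraction(sum(a for a, _ in pairs), m)
cov = sum((a - x_mean) * (b - x_mean) for a, b in pairs) / m
var_x = sum((a - x_mean) ** 2 for a, _ in pairs) / m
rho_pearson = cov / var_x

print(rho_formula, rho_pearson)
```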

11
  • Using ρ (equation 4.3 in Box 4.2 on page 91),
    the variance of the systematic sample mean is
    $\operatorname{Var} = (60/9)(1/3)[1 + (-21/60)(2)] = (20/9)(18/60) = 2/3$
    (same as above).
  • We generally consider that the units in the
    systematic sample are heterogeneous when ρ is
    negative and homogeneous when ρ is positive.
    The minimum and maximum values of ρ are -1/(n-1)
    and 1.
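
The lower limit of the intra-class correlation follows from the fact that a variance cannot be negative. A short derivation, writing $\sigma^2$ for the population variance (notation assumed here, matching the 60/9 used in the example):

```latex
\operatorname{Var}(\bar{y}_{sy})
  = \frac{\sigma^2}{n}\,\bigl[1 + (n-1)\rho\bigr] \ge 0
\quad\Longrightarrow\quad
\rho \ge -\frac{1}{n-1}
```

The upper limit of 1 is reached when all units within each possible sample are identical, so that all the variation lies between samples.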

12
  • It is easy to verify that when ρ = -1/(N-1) the
    variance of the systematic sample mean is the same
    as the variance of the mean of a simple random
    sample of the same size. When ρ < -1/(N-1),
    systematic sampling will perform better than
    simple random sampling.
  • This suggests that without-replacement
    sampling produces a more heterogeneous sample and
    hence gives a smaller sampling variance than
    with-replacement sampling.
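
The equality can be verified by substituting the threshold value of the intra-class correlation into the variance formula. The notation $\sigma^2$ for the population variance is an assumption carried over from the worked example.

```latex
\frac{\sigma^2}{n}\left[1 + (n-1)\left(-\frac{1}{N-1}\right)\right]
  = \frac{\sigma^2}{n}\cdot\frac{(N-1)-(n-1)}{N-1}
  = \frac{\sigma^2}{n}\cdot\frac{N-n}{N-1}
```

The right-hand side is the variance of the mean under simple random sampling without replacement. Setting the correlation to 0 instead gives $\sigma^2/n$, the with-replacement variance, which is larger whenever n > 1.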

13
  • We need to understand under what circumstances
    we get homogeneous or heterogeneous sampling
    units in a systematic sample. This boils down to
    asking in what kind of sampling frame we can keep
    ρ low in a systematic sample.
  • The following three situations can be
    considered:
  •   1. The list population is in random order, e.g.
    an alphabetical listing of names. This ensures
    that the sampling units in a systematic sample
    are effectively randomly selected. In other
    words, a systematic sample from a randomly
    ordered frame is equivalent to a simple random
    sample.

14
2. The list population is ordered or stratified
with respect to the variable under
consideration. In this situation, systematic
sampling avoids samples containing too many large
or too many small units. As seen above, a
systematic sample selected from an ordered frame
will yield a smaller variance than a simple random
sample.
3. The list population contains periodic
variation. In this situation, the systematic
sample will tend to be homogeneous, and hence the
sampling variance will increase. Systematic
sampling is not a good choice when dealing with a
frame that contains periodic variation;
alternative sampling strategies should be
considered.
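
The effect of frame order can be illustrated by enumerating every possible systematic sample for an ordered frame and for a periodic one. This is a toy sketch under stated assumptions: the two nine-unit frames and the function names `var_systematic` and `var_srs` are invented for illustration, and N is assumed to be an exact multiple of k.

```python
from fractions import Fraction


def var_systematic(frame, k):
    """Variance of the sample mean over all k possible 1-in-k samples."""
    N = len(frame)
    n = N // k
    mu = Fraction(sum(frame), N)
    means = [Fraction(sum(frame[i::k]), n) for i in range(k)]
    return sum((m - mu) ** 2 for m in means) / k


def var_srs(frame, n):
    """Variance of the mean under simple random sampling without replacement."""
    N = len(frame)
    mu = Fraction(sum(frame), N)
    sigma2 = sum((Fraction(y) - mu) ** 2 for y in frame) / N
    return sigma2 / n * Fraction(N - n, N - 1)


ordered = [1, 2, 3, 4, 5, 6, 7, 8, 9]   # sorted by magnitude
periodic = [1, 2, 3, 1, 2, 3, 1, 2, 3]  # cycle length equals k = 3

for name, frame in [("ordered", ordered), ("periodic", periodic)]:
    print(name, var_systematic(frame, 3), var_srs(frame, 3))
```

In the periodic frame every systematic sample contains three identical values, so the sample is fully homogeneous and the systematic design does worse than simple random sampling; in the ordered frame it does better.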
15
  • Note that the estimating formulas for
    systematic sampling in Box 4.1 are the same as those
    in Box 3.1 for simple random sampling. The
    formulas in Box 4.1 can only be used in the
    first situations described above.

16
  • Repeated systematic sampling
  • In general, it is not possible to estimate
    sampling variance from a single systematic sample.
  •   Repeated (replicated) systematic sampling
    makes it possible to estimate sampling
    variance from sample data.
  • Repeated systematic sampling offers other
    practical advantages.
  •   A repeated systematic sample design is equivalent
    to a single-stage cluster sample design (with
    equal-sized clusters).
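
A repeated systematic sample can be sketched as several systematic samples with distinct random starts, each treated as one cluster. The code below is a minimal illustration and not the procedure from Section 4.7: the function names are hypothetical, N is assumed to be an exact multiple of the interval, and the variance estimator is the usual between-replicate estimator with a finite population correction, which is an assumption rather than something stated in the slides.

```python
import random
from statistics import mean, variance


def repeated_systematic_sample(frame, interval, m, rng=random):
    """Draw m systematic samples, each 1-in-`interval`, with distinct starts."""
    starts = rng.sample(range(interval), m)  # m distinct 0-based starts
    return [frame[s::interval] for s in starts]


def estimate_mean_and_variance(replicates, interval):
    """Estimate the population mean and the variance of that estimate."""
    rep_means = [mean(r) for r in replicates]
    m = len(rep_means)
    # Assumed correction: m of the `interval` possible clusters were selected.
    fpc = 1 - m / interval
    # statistics.variance uses divisor m - 1 (between-replicate variance).
    return mean(rep_means), fpc * variance(rep_means) / m


frame = list(range(1, 101))  # illustrative frame, N = 100
replicates = repeated_systematic_sample(frame, 20, 4, random.Random(1))
estimate, var_hat = estimate_mean_and_variance(replicates, 20)
print(estimate, var_hat)
```

Because each replicate is a complete cluster of equal size, the replicate means can be analyzed exactly as cluster means would be in a single-stage cluster design.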

17
  • An illustrative example is shown in Section 4.7
    (pages 101-110).
  •   Contrary to the comments on page 108, STATA can
    analyze data for a repeated systematic sample
    using a data file consisting of the original
    sample records (without aggregating over each of
    the sample clusters). In addition to the STATA
    statements on page 106, use svyset psu cluster
    to get the same results shown on page 107.
  •   The deff shown on page 107 is misleading. The
    actual deff is 0.79, suggesting that repeated
    systematic sampling performed better than
    comparable simple random sampling in this
    example. The intra-class correlation in days lost
    is 0.151.