Title: Systematic Sampling
Basic ideas
1. Systematic sampling is commonly used as an alternative to simple random sampling.
2. It is easier to apply and less prone to error than simple random sampling.
3. Sampling costs less with systematic sampling than with simple random sampling.
4. Its process can be easily checked, whereas the process of simple random sampling is difficult to check.
Selection procedures
- For 1-in-k systematic sampling, k is determined by N/n. The selection probability is 1/k, or n/N.
- When N = nk:
1. Select a random number i between 1 and k to determine the starting point.
2. Select every k-th unit, starting with unit i.
3. Under this procedure there are k possible samples, and each sample consists of n units.
4. When N ≠ nk, the sample mean is biased, but the bias is negligible when n > 50.
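The selection steps for the N = nk case can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `systematic_sample` and `all_systematic_samples` are made up here, and the frame is assumed to be an ordinary list whose length is an exact multiple of k.

```python
import random

def systematic_sample(frame, k, rng=random):
    """Draw one 1-in-k systematic sample from a list-like frame.

    Assumes len(frame) is a multiple of k (the N = nk case).
    """
    start = rng.randint(1, k)      # random start i with 1 <= i <= k
    return frame[start - 1::k]     # units i, i+k, i+2k, ... (0-based slice)

def all_systematic_samples(frame, k):
    """Enumerate the k possible 1-in-k systematic samples."""
    return [frame[i::k] for i in range(k)]

frame = list(range(1, 10))         # hypothetical frame with N = 9 units
samples = all_systematic_samples(frame, 3)
one_draw = systematic_sample(frame, 3)
```

With N = 9 and k = 3 there are exactly three possible samples of n = 3 units each, and any single draw is one of them.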
Circular systematic sampling when N ≠ nk
1. Select a random number i between 1 and N to determine the starting point.
2. Select every k-th unit starting with i, treating the frame as circular (the end connected to the beginning).
3. Under this procedure there are N possible samples. For example, when N = 10, n = 3, and k = 3, the ten possible samples are (any element can be a starting point):
1. (1, 4, 7), 2. (2, 5, 8), 3. (3, 6, 9),
4. (4, 7, 10), 5. (5, 8, 1),
6. (6, 9, 2), 7. (7, 10, 3), 8. (8, 1, 4),
9. (9, 2, 5), 10. (10, 3, 6)
- Note that each element is represented in 3 out of 10 samples. The sampling fraction is 3/10, or n/N.
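The circular procedure can be reproduced with modular arithmetic. The sketch below assumes units are labelled 1 to N; the function name `circular_systematic_sample` is illustrative only.

```python
def circular_systematic_sample(N, n, k, start):
    """Return n unit labels (1..N), taking every k-th unit from `start`
    and wrapping around the end of the frame."""
    return tuple((start - 1 + j * k) % N + 1 for j in range(n))

N, n, k = 10, 3, 3
# One sample for each of the N possible starting points.
samples = [circular_systematic_sample(N, n, k, s) for s in range(1, N + 1)]

# Number of samples in which each unit appears.
counts = {u: sum(u in s for s in samples) for u in range(1, N + 1)}
```

Each unit appears in 3 of the 10 samples, which matches the sampling fraction n/N = 3/10.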
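Variance of systematic sample mean
By definition, the variance of the systematic sample mean is
$$\mathrm{Var}(\bar{y}_{sy}) = \frac{1}{k}\sum_{i=1}^{k}\left(\bar{y}_i - \bar{Y}\right)^2,$$
where $\bar{y}_i$ is the mean of the i-th of the k possible systematic samples and $\bar{Y}$ is the population mean.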
- Consider a population with N = 9, consisting of 1, 2, 3, 4, 5, 6, 7, 8, and 9. If a 1-in-3 systematic sample is taken, the following three samples are possible:
- Sample 1: 1, 4, 7 (mean = 4)
- Sample 2: 2, 5, 8 (mean = 5)
- Sample 3: 3, 6, 9 (mean = 6)
- By definition, the variance of the systematic sample mean is $[(4-5)^2 + (5-5)^2 + (6-5)^2]/3 = 2/3$.
-
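The variance for this example can be checked by enumerating the three possible samples. The sketch uses exact fractions to avoid rounding; the variable names are illustrative.

```python
from fractions import Fraction

population = list(range(1, 10))    # 1, 2, ..., 9
k = 3

# The k possible 1-in-k systematic samples.
samples = [population[i::k] for i in range(k)]

pop_mean = Fraction(sum(population), len(population))
sample_means = [Fraction(sum(s), len(s)) for s in samples]

# Average squared deviation of the k sample means from the population mean.
var_sy = sum((m - pop_mean) ** 2 for m in sample_means) / k
```

The three sample means are 4, 5 and 6, and `var_sy` evaluates to 2/3.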
- If we take a simple random sample of n = 3 from the above population, the variance of the sample mean is (60/9)(1/3)(6/8) = 5/3, applying the formula in Box 3.2. (The population mean is 5 and the population variance is (16 + 9 + 4 + 1 + 0 + 1 + 4 + 9 + 16)/9 = 60/9.)
- The above results (2/3 < 5/3) suggest that systematic sampling from an ordered frame (sorted by magnitude of value) performs better than simple random sampling.
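The simple random sampling figure can be confirmed in two ways: from the finite-population formula (population variance over n, times (N - n)/(N - 1)), and by brute-force enumeration of all 84 possible samples of size 3. This is a sketch with illustrative variable names; the formula is written out from the numbers in the text rather than copied from Box 3.2.

```python
from fractions import Fraction
from itertools import combinations

population = list(range(1, 10))
N, n = len(population), 3

pop_mean = Fraction(sum(population), N)
sigma2 = sum((y - pop_mean) ** 2 for y in population) / N    # 60/9

# Variance of the sample mean under SRS without replacement.
var_formula = sigma2 / n * Fraction(N - n, N - 1)

# Same quantity by enumerating every possible sample of size n.
means = [Fraction(sum(s), n) for s in combinations(population, n)]
var_enum = sum((m - pop_mean) ** 2 for m in means) / len(means)
```

Both `var_formula` and `var_enum` come out to 5/3, compared with 2/3 for the systematic design on the same ordered frame.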
- The intra-class correlation coefficient $\delta$ for the above three possible systematic samples can be calculated by applying Formula 4.5 in Box 4.2 (terms containing the factor (5-5) are zero and are omitted):
$$\delta = \frac{2\,[(1-5)(4-5) + (1-5)(7-5) + (4-5)(7-5) + (2-5)(8-5) + (3-5)(6-5) + (3-5)(9-5) + (6-5)(9-5)]}{9(2)(60/9)} = -\frac{21}{60}$$
- $\delta$ can also be obtained by computing the correlation coefficient between all possible pairs of units within the same systematic sample (order considered): (1, 4), (4, 1), (1, 7), (7, 1), (4, 7), (7, 4), (2, 5), (5, 2), (2, 8), (8, 2), (5, 8), (8, 5), (3, 6), (6, 3), (3, 9), (9, 3), (6, 9), and (9, 6).
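The pairwise route to the intra-class correlation can be sketched as follows. Because every unit appears equally often in each position of the ordered pairs, both coordinates have the population mean and variance, so the correlation reduces to the average cross-product divided by the population variance. The variable names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

samples = [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
units = [y for s in samples for y in s]
N = len(units)

mean = Fraction(sum(units), N)
sigma2 = sum((y - mean) ** 2 for y in units) / N             # 60/9

# All ordered pairs of distinct units within the same systematic sample.
pairs = [p for s in samples for p in permutations(s, 2)]

# Covariance across the pairs, then the correlation coefficient.
cov = sum((a - mean) * (b - mean) for a, b in pairs) / len(pairs)
icc = cov / sigma2
```

There are 18 ordered pairs, and `icc` equals -21/60 (that is, -0.35).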
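- Using the intra-class correlation $\delta$ (equation 4.3 in Box 4.2 on page 91), the variance of the systematic sample mean is
$$\mathrm{Var}(\bar{y}_{sy}) = \frac{\sigma^2}{n}\left[1 + \delta(n-1)\right] = (60/9)(1/3)\left[1 + (-21/60)(2)\right] = (20/9)(18/60) = 2/3$$
(same as above).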
- We generally consider the units in a systematic sample to be heterogeneous when the intra-class correlation $\delta$ is negative and homogeneous when $\delta$ is positive. The minimum and maximum values of $\delta$ are $-1/(n-1)$ and 1.
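The effect of the intra-class correlation on the variance can be explored numerically. The sketch below assumes the variance expression implied by the worked numbers, (population variance / n) x [1 + correlation x (n - 1)], and takes the lower bound as -1/(n - 1), the value at which the variance reaches zero. The function name is hypothetical.

```python
from fractions import Fraction

def var_systematic_mean(sigma2, n, icc):
    """Variance of the systematic sample mean given the population
    variance, the sample size, and the intra-class correlation."""
    return sigma2 / n * (1 + icc * (n - 1))

sigma2 = Fraction(60, 9)    # population variance from the example
n = 3

var_example = var_systematic_mean(sigma2, n, Fraction(-21, 60))
var_at_min = var_systematic_mean(sigma2, n, Fraction(-1, n - 1))
var_at_max = var_systematic_mean(sigma2, n, Fraction(1))
```

For the example, `var_example` is 2/3. At the lower bound the variance is 0 (every possible sample has the same mean), and at the upper bound it equals the full population variance (each sample behaves like a single observation).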
- It is easy to verify that when the intra-class correlation $\delta = -1/(N-1)$, the variance of the systematic sample mean is the same as the variance of the mean of a simple random sample of the same size. When $\delta < -1/(N-1)$, systematic sampling will perform better than simple random sampling.
- This suggests that without-replacement sampling produces a more heterogeneous sample, and hence a smaller sampling variance, than with-replacement sampling.
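The equivalence at an intra-class correlation of $-1/(N-1)$ can be checked by substitution. The derivation below is a sketch that assumes the variance expression used in the worked example; the symbols $\sigma^2$ (population variance with divisor N), $\delta$ (intra-class correlation) and $\bar{y}_{sy}$ (systematic sample mean) are notation chosen here for illustration.

```latex
\begin{align*}
\mathrm{Var}(\bar{y}_{sy})
  &= \frac{\sigma^2}{n}\left[1 + \delta\,(n-1)\right] \\
  &= \frac{\sigma^2}{n}\left[1 - \frac{n-1}{N-1}\right]
     \qquad \text{with } \delta = -\frac{1}{N-1} \\
  &= \frac{\sigma^2}{n}\cdot\frac{N-n}{N-1}
\end{align*}
```

The last line is the variance of the mean of a simple random sample of size n drawn without replacement. With N = 9 and n = 3 it gives (60/9)(1/3)(6/8) = 5/3, the value obtained earlier.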
- We need to understand under what circumstances we get homogeneous or heterogeneous sampling units in a systematic sample. This boils down to which kinds of sampling frame keep the intra-class correlation low in a systematic sample.
- The following three situations can be considered:
1. The list population is in random order with respect to the variable under consideration, e.g. an alphabetical listing of names. This assures that the sampling units in a systematic sample will be randomly selected. In other words, a systematic sample from a randomly ordered frame would be equivalent to a simple random sample.
2. The list population is ordered or stratified with respect to the variable under consideration. In this situation, the systematic sample avoids selecting samples that contain too many large or too many small units. As seen above, a systematic sample selected from an ordered frame yields a smaller variance than a simple random sample.
3. The list population contains periodic variation. In this situation, the systematic sample tends to be homogeneous, and hence the sampling variance increases. Systematic sampling is not a good choice when dealing with a frame that contains periodic variation; alternative sampling strategies need to be considered.
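The contrast between an ordered frame and a periodic frame can be made concrete with a small made-up population. In the sketch below the data (three values repeated three times) are hypothetical, the function names are illustrative, and the frame length is assumed to be a multiple of k.

```python
from fractions import Fraction

def var_systematic_mean(frame, k):
    """Exact variance of the 1-in-k systematic sample mean (N = nk)."""
    pop_mean = Fraction(sum(frame), len(frame))
    means = [Fraction(sum(frame[i::k]), len(frame[i::k])) for i in range(k)]
    return sum((m - pop_mean) ** 2 for m in means) / k

def var_srs_mean(frame, n):
    """Variance of the mean of a simple random sample of size n."""
    N = len(frame)
    pop_mean = Fraction(sum(frame), N)
    sigma2 = sum((y - pop_mean) ** 2 for y in frame) / N
    return sigma2 / n * Fraction(N - n, N - 1)

periodic = [1, 2, 3] * 3      # hypothetical frame whose period equals k
ordered = sorted(periodic)    # same values, sorted by magnitude
k = 3

var_periodic = var_systematic_mean(periodic, k)   # every sample is homogeneous
var_ordered = var_systematic_mean(ordered, k)     # every sample spans the range
var_srs = var_srs_mean(periodic, 3)
```

For these values the systematic variance is 2/3 on the periodic frame, 0 on the ordered frame, and the simple random sampling variance is 1/6, so the periodic frame is the only case where systematic sampling does worse.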
- Note that the estimating formulas for systematic sampling in Box 4.1 are the same as those in Box 3.1 for simple random sampling. The
formulas in Box 4.1 can only be used in the
first situations described above.
Repeated systematic sampling
- In general, it is not possible to estimate the sampling variance from a single systematic sample.
- Repeated (replicated) systematic sampling makes it possible to estimate the sampling variance from the sample data.
- Repeated systematic sampling offers other practical advantages.
- A repeated systematic sample design is equivalent to a single-stage cluster sample design (with clusters of equal size).
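One common way to use replicates is to treat each systematic sample as a cluster and estimate the variance from the spread of the replicate means. The sketch below is an assumption-laden illustration rather than the procedure of any particular textbook: the frame is simulated, the function names are made up, the m random starts are drawn without replacement from the first m x k positions, and the finite population correction is ignored.

```python
import random
from statistics import mean, variance

def repeated_systematic_sample(frame, k, m, rng):
    """Draw m replicate systematic samples, each 1-in-(m*k), so that the
    overall sampling fraction is still 1/k."""
    step = m * k
    starts = rng.sample(range(step), m)     # m distinct random starts
    return [frame[s::step] for s in starts]

def estimate_mean_and_variance(replicates):
    """Estimate the population mean and the variance of that estimate
    from the replicate means (no finite population correction)."""
    rep_means = [mean(r) for r in replicates]
    m = len(rep_means)
    return mean(rep_means), variance(rep_means) / m

rng = random.Random(0)
frame = [rng.gauss(50, 10) for _ in range(1200)]    # hypothetical frame
replicates = repeated_systematic_sample(frame, k=20, m=6, rng=rng)
est_mean, est_var = estimate_mean_and_variance(replicates)
```

Here six replicates of ten units each give the same total sample size (60) as a single 1-in-20 systematic sample, while still allowing a variance estimate.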
- An illustrative example is shown in Section 4.7 (pages 101-110).
- Contrary to the comments on page 108, STATA can analyze data for a repeated systematic sample using a data file consisting of the original sample records (without aggregating over each of the sample clusters). In addition to the STATA statements on page 106, use svyset psu cluster to get the same results shown on page 107.
- The deff shown on page 107 is misleading. The actual deff is 0.79, suggesting that repeated systematic sampling performed better than comparable simple random sampling in this example. The intra-class correlation in days lost is 0.151.