Quantitative Methods Topic 4: Sampling

Provided by: shelleyg

1
Quantitative Methods Topic 4: Sampling

2
Outline
  • Populations and samples
  • Sampling Frames
  • Representativeness
  • Probability and non-probability samples
  • Example sampling methods

3
Reading on Sampling
  • IIEP Module 3
  • Kenneth N Ross
  • TIMSS 2003 technical report sampling chapter
  • Pierre Foy and Marc Joncas

4
Populations and samples
  • Population: e.g. all students at a school
  • Sample: a small N selected to represent the
    population

5
Why sampling
  • Study of a part rather than the whole population
  • Advantages:
    • Reduced cost
    • Generalisations about the population
    • Estimates of population characteristics

6
Population and Units of Analysis
  • Defining the population
    • Without a definition, we don't have a context
      for the results
  • In an educational survey, the population will be
    defined by the units of analysis, which may be:
    • the student (e.g. studies of attainment)
    • the teacher (e.g. studies of teaching practice)
    • the school (e.g. studies of school environment)
  • Each unit of analysis may require a different
    sampling strategy.

7
Populations
  • Desired: the population for which the results
    are ideally required
  • Defined: the population which is actually studied
  • Excluded: the elements that are excluded from the
    desired target population in order to form the
    defined target population

8
Sampling Frames
  • A listing of the elements in a population.
  • E.g., a school's enrolment records can readily
    provide a sampling frame for the population of
    students: all students are listed, and each
    student is listed only once.

9
Representativeness
  • A sample is considered representative if the
    percentage frequency distributions of certain
    characteristics within the sample data are
    similar to those within the whole population
  • The population characteristics selected for
    comparisons are called marker variables
  • In education, common marker variables are sex,
    age, SES, school types, location, school size,
    ethnicity.

10
Sample types
  • Probability
    • Each member of the defined target population
      has a known and non-zero chance of being
      selected into the sample
    • Allows estimating the values of population
      parameters from sample statistics
    • Allows testing statistical hypotheses about the
      population from samples
  • Non-probability
    • It is not possible to determine whether a
      non-probability sample is likely to provide
      very accurate or inaccurate estimates of
      population parameters.

11
Types of non-probability samples
  • Judgement sampling
    • Based on the researcher's judgement
  • Convenience sampling
    • Subjects or elements of a sample are selected
      based on their accessibility to the researcher
  • Quota sampling
    • Numbers of elements (subjects) are drawn from
      various target population strata in proportion
      to the size of these strata
    • Little or no control over the procedures used
      to select elements within these strata
  • There is no way of checking the accuracy of
    estimates

12
Types of probability samples
  • Simple random sampling
  • Systematic sampling
  • Stratified sampling
  • Multistage Cluster Sampling

13
Simple Random Sampling
  • There is a single sampling frame or list of names
  • A sample is selected from the list in a single
    operation
  • e.g. list of students in a faculty used to select
    a sample for course evaluation

14
Golden Rule of Simple Random Sampling
  • Each member of the population shall have an equal
    chance of selection.
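
The rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the frame of 500 student IDs and the helper name simple_random_sample are hypothetical and are not taken from VNsample.sav.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements from the sampling frame, each with equal chance."""
    rng = random.Random(seed)          # a seed makes the draw repeatable
    return rng.sample(list(frame), n)  # selection without replacement

# Hypothetical frame: student IDs 1..500 (illustrative only).
student_ids = range(1, 501)
sample = simple_random_sample(student_ids, 50, seed=1)
print(len(sample), len(set(sample)))   # sample size and number of distinct IDs
```

Changing the seed and re-running shows how different samples of the same size vary.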

15
Class activity 1: Using SPSS to take a random
sample
  • The data file VNsample.sav contains the IDs of
    all students of a district. Draw a simple random
    sample of 10% of the population as follows:
  • Click DATA -> SELECT CASES -> RANDOM SAMPLE ->
    SAMPLE -> APPROXIMATELY 10% -> CONTINUE
  • Click on "Copy selected cases to a new data set".
    In the box, type the new data set name. Click OK.

16
Class activity 1
  • Examine the new data set to see how many students
    were randomly selected.
  • Calculate frequencies of girls and boys. Compare
    with the main sample.
  • Calculate mean mathematics achievement (variable
    pma500). Compare with the results of the main
    sample.
  • Repeat the selection three times to draw
    different samples and check how results vary.

17
Systematic Random Sampling Example
  • To draw a systematic random sample of size 16
    from our list of Metropolitan schools (160
    schools), ordered by school number, we would:
    • Calculate the sampling interval (160/16 = 10)
    • Draw a random number between 1 and 10 (say it
      is 7)
  • The sample will then consist of the following 16
    schools from the list: the 7th, the (7 + 10 =
    17)th, the (7 + 2 × 10 = 27)th and so on to the
    (7 + 15 × 10 = 157)th
  • Note that the number of different samples that
    can be drawn by systematic sampling is typically
    quite small (10 in this example)
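
The selection in this example can be sketched as follows. The function name systematic_sample is hypothetical, and the sketch assumes the list length is an exact multiple of the sample size, as it is here (160 and 16).

```python
import random

def systematic_sample(frame, n, start=None, seed=None):
    """Random start, fixed sampling interval selection from an ordered list."""
    interval = len(frame) // n                 # sampling interval, e.g. 160 // 16 = 10
    if start is None:
        start = random.Random(seed).randint(1, interval)  # 1-based random start
    # Take the start-th element, then every interval-th element after it.
    return [frame[start - 1 + k * interval] for k in range(n)]

schools = list(range(1, 161))                  # school numbers 1..160, as on the slide
picked = systematic_sample(schools, 16, start=7)
print(picked)                                  # 7, 17, 27, ..., 157
```

Because the random start can take only 10 values, only 10 distinct samples are possible, which is the point made in the last bullet.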

18
Systematic Random Sampling (Random Start, Sampling
Interval: RSSI)
  • Work out sampling interval.
  • Select a random start.
  • Every qth element in the register is selected
    from the random start (q is the sampling
    interval)
  • May be more efficient than Simple Random
    Sampling, e.g. when there is a systematic
    relation between the population order and the
    response variable(s) (i.e. give estimates of
    greater precision than a SRS of the same size)
  • May result in a biased sample if there is a
    pattern in the list.

19
The list of schools that we have been working
with is largely, but not completely, arranged in
alphabetical order of the school name.
It is unlikely that, here, the order would be
related to the N of students studying Psychology.
Hence, it is unlikely that the precision of the
sample would be superior to that obtained from a
simple random sample.
The list could, however, be sorted by a variable
that we might expect to be related to the N of
Psychology students (e.g. school or Year 12
cohort size).
In this case, we would expect the precision of
estimates from the sample to improve for any
characteristic related to N of Psychology
students.

20
Class Activity 2: Draw RSSI sample
  • Open METROSCHOOLS.SAV in SPSS.
  • DATA EDITOR window -> DATA -> SORT CASES: put
    yr12en into the "Sort by" box, then click OK.
  • In the DATA EDITOR window, examine the variable
    yr12en across a few cases and comment.
  • Draw an RSSI 10% sample.
  • Does the sample represent the full range of
    school sizes (Year 12 enrolment)? Why?

21
Class Activity 2: Dangers of RSSI
  • A. British electoral registers are lists of
    street addresses in street number order. Even
    numbers are on one side of the street, odd
    numbers on the other.
  • With RSSI and an even sampling interval (e.g. 20,
    22 or 24), how many sides of any street will you
    sample?
  • B. You are using RSSI to draw a sample from a
    list of club members, in alphabetical order, with
    many married couples listed in male, female
    order. Any problems?
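
Part A can be explored with a small simulation. This is a sketch under assumptions: a single hypothetical street with house numbers 1 to 200 listed in number order, and a hypothetical helper rssi_positions.

```python
def rssi_positions(list_length, interval, start):
    """1-based list positions selected by RSSI with the given start and interval."""
    return list(range(start, list_length + 1, interval))

# Hypothetical street: house numbers 1..200 listed in street number order.
house_numbers = list(range(1, 201))
for start in (7, 12):
    chosen = [house_numbers[p - 1] for p in rssi_positions(200, 20, start)]
    sides = {"odd" if h % 2 else "even" for h in chosen}
    print(start, sorted(sides))   # which side(s) of the street were sampled
```

Trying an odd interval (e.g. 21) in place of 20 makes a useful comparison.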

22
Stratified Sampling (1)
  • The target population is divided into
    non-overlapping sub-populations called strata
  • Sampling is performed independently within strata

23
Types of stratified sampling
  • Proportionate
    • The within-stratum sample size is calculated
      such that it is proportional to the size of the
      sub-population
  • Disproportionate
    • Uses different sampling fractions within the
      various strata
    • Is used in order to ensure that the accuracy of
      sample estimates obtained for stratum
      parameters is sufficiently high to be able to
      make meaningful comparisons between strata
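
Proportionate allocation is simple arithmetic, sketched below. The stratum sizes and the function name proportionate_allocation are hypothetical, chosen only to give round numbers.

```python
def proportionate_allocation(stratum_sizes, n):
    """Allocate a total sample of n across strata in proportion to stratum size."""
    total = sum(stratum_sizes.values())
    # Rounding can make the allocations sum to slightly more or less than n.
    return {name: round(n * size / total) for name, size in stratum_sizes.items()}

# Hypothetical numbers of schools per system (illustrative only).
strata = {"government": 1200, "catholic": 500, "private": 300}
allocation = proportionate_allocation(strata, 200)
print(allocation)   # every stratum has the same sampling fraction, 200/2000 = 10%
```

A disproportionate design would instead fix the within-stratum sample sizes directly, for example taking more private schools than the 10% fraction gives.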

24
Analysis of Disproportionate Stratified Sample
  • Weighting is required to analyze the full sample.
  • Weighting is not required to analyze strata
    separately
  • Post-stratification can be used to weight a
    sample to known population characteristics after
    selection and/or after the data are collected,
    e.g. post-stratify by age or ethnic background
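
A minimal sketch of why weighting matters for the full sample, assuming the usual design weight N_h / n_h (stratum population size over stratum sample size). The data and the function name weighted_mean are hypothetical.

```python
def weighted_mean(strata):
    """Estimate the population mean from a disproportionate stratified sample.

    Each stratum supplies its population size N_h and its sampled values;
    every sampled unit gets the design weight N_h / n_h.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for pop_size, values in strata:
        weight = pop_size / len(values)
        weighted_sum += weight * sum(values)
        weight_total += weight * len(values)
    return weighted_sum / weight_total

# Hypothetical data: a large stratum sampled lightly, a small one sampled heavily.
strata = [
    (900, [50, 52, 48, 50]),                  # N_h = 900, n_h = 4, mean 50
    (100, [70, 72, 68, 70, 70, 70, 70, 70]),  # N_h = 100, n_h = 8, mean 70
]
unweighted = sum(sum(v) for _, v in strata) / sum(len(v) for _, v in strata)
print(round(weighted_mean(strata), 2), round(unweighted, 2))   # 52.0 63.33
```

The unweighted mean is pulled towards the over-sampled small stratum; the weighted mean matches 0.9 × 50 + 0.1 × 70 = 52. Within a single stratum all weights are equal, so they cancel.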

25
Stratified Sampling (2)
  • Provides increased accuracy in sample estimates
    without leading to increases in costs
  • Can guarantee representation of small
    sub-populations in the sample
  • Many population frames are readily divided into
    sub-populations, e.g. into States and Systems
    (government, Catholic, private) in national
    education surveys, or into States and residential
    location (rural/urban) in health or employment
    surveys
  • In some studies stratification is used for
    reasons other than obtaining gains in accuracy:
    • Strata may be formed in order to employ
      different sample designs within strata.
    • Subpopulations defined by the strata are
      designated as separate domains of study.

26
Multi-Stage Cluster Sampling
  • Used where there are naturally formed groups of
    population elements (e.g. schools, households,
    community health centres etc.)
  • Frequently used when a full population frame is
    not available (e.g. all students in all
    Government schools in Australia, all patients
    seen by medical staff in all community health
    centres in Victoria)
  • Used in face-to-face interview studies when the
    sample is geographically dispersed and the costs
    of travel would otherwise be prohibitive.
  • Enables the researcher to gather data from within
    the sampled clusters only, and thus lowers the
    cost of a survey.

27
An Example of Cluster Sampling
  • Primary sampling stage (Stage 1): select a number
    of schools in Victoria (this can be done using
    simple random sampling, or stratified sampling of
    Government, Roman Catholic, and Private schools
    in Victoria)
  • Stage 2: a sample of Year 12 students is then
    drawn from the enrolment records of each of the
    sampled schools
  • Note that a full list of all Year 12 students in
    all Victorian schools is not needed. All that is
    required is a list of students for each of the
    sampled schools
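
The two stages can be sketched as below, using simple random sampling at both stages. The frame of 40 schools with 30 students each and the function name two_stage_sample are hypothetical.

```python
import random

def two_stage_sample(schools, n_schools, n_students, seed=None):
    """Stage 1: sample schools. Stage 2: sample students within each sampled school."""
    rng = random.Random(seed)
    sampled_schools = rng.sample(sorted(schools), n_schools)
    sample = {}
    for school in sampled_schools:
        roll = schools[school]   # an enrolment list is needed only for sampled schools
        sample[school] = rng.sample(roll, min(n_students, len(roll)))
    return sample

# Hypothetical frame: 40 schools, each with 30 Year 12 students.
schools = {
    f"school_{i:02d}": [f"s{i:02d}_{j:02d}" for j in range(30)] for i in range(40)
}
sample = two_stage_sample(schools, n_schools=8, n_students=10, seed=42)
print(len(sample), sum(len(v) for v in sample.values()))   # schools, students
```

In a real survey the dictionary of all rolls would not exist up front; rolls would be requested only from the schools drawn at stage 1.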

28
At least TWO stages in multi-stage (MS) sampling
  • Stage One: select primary sampling units, e.g.
    schools, electorates, local authority areas,
    community health centres
  • Stage Two: select secondary sampling units, e.g.
    pupils within schools, patients within community
    health centres
  • Note that each stage is a separate sampling
    operation, and these operations need not be
    uniform:
    • may use stratification
    • may use RSSI sampling

29
Accuracy of estimates from samples
  • The degree of accuracy of sample estimates may be
    judged by the difference between the sample
    estimates and the value of the population
    parameters
  • In most situations, the values of population
    parameters are unknown.
  • It is possible to estimate the probable accuracy
    of the obtained sample through a knowledge of the
    behaviour of estimates derived from all possible
    samples.

30
A simple example
  • Students are given two packs of cards, combined
    into one deck. They are asked to guess the
    proportion of cards that are red.
  • Students then draw a sample of 10 cards, and use
    it to make the estimate. This is repeated several
    times to draw several samples of 10.
  • Similarly, students draw repeated samples of 50.
  • Results are graphed to form two sampling
    distributions one for samples of 10, the other
    for samples of 50. (example on next slide).
  • Which sample size, 10 or 50, will give the more
    accurate result?

31
Class Activity 3 Sampling Task
  • Use the EXCEL procedure as for Week 5 for
    simulating drawing of cards.

32
Sampling Distribution: estimates of percent of red
cards from several samples
  • Number in each sample: 10
  • Number of samples drawn: 25
  • Estimate from first sample: 50
  • Average for all samples: 47.6
[Figure: histogram of the sample estimates.
Horizontal axis: percent of red cards in sample
(0 to 100). Vertical axis: 0 to 10.]

33
Sampling Distribution: estimates of percent of red
cards from several samples
  • Number in each sample: 50
  • Number of samples drawn: 25
  • Estimate from first sample: 50
  • Average for all samples: 52.8
[Figure: histogram of the sample estimates.
Horizontal axis: percent of red cards in sample
(0 to 100). Vertical axis: 0 to 10.]

34
Difference in the distributions
  • Samples of 10
    • have a higher Standard Deviation (SD)
    • have a more dispersed distribution
    • the estimates from individual samples vary
      greatly
  • Samples of 50
    • have a lower SD
    • have a less dispersed distribution
    • the estimates from individual samples vary less
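
The card exercise can be simulated to compare the two sampling distributions. This sketch assumes two standard 52-card packs without jokers (so exactly half the cards are red) and draws 1000 samples of each size rather than the 25 drawn in class; the function name red_percentages is hypothetical.

```python
import random
import statistics

def red_percentages(sample_size, n_samples, seed=None):
    """Percent of red cards in each of n_samples draws from a two-pack deck."""
    rng = random.Random(seed)
    deck = ["red"] * 52 + ["black"] * 52       # two packs combined
    results = []
    for _ in range(n_samples):
        hand = rng.sample(deck, sample_size)   # draw without replacement
        results.append(100 * hand.count("red") / sample_size)
    return results

for size in (10, 50):
    estimates = red_percentages(size, n_samples=1000, seed=0)
    # Mean and SD of the simulated sampling distribution for this sample size.
    print(size, round(statistics.mean(estimates), 1),
          round(statistics.pstdev(estimates), 1))
```

Both means should sit near 50, while the SD for samples of 10 should come out clearly larger than for samples of 50.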

35
Factors affecting accuracy of estimates from
samples
  • Sample size, as seen from the above example.
  • Other factors, such as sampling design:
    • Stratified and systematic sampling may increase
      accuracy.
    • Cluster sampling may reduce accuracy.

36
Cluster sampling in education
  • Schools are selected first.
  • Then, students are selected from the selected
    schools.
  • If there is a large intraclass correlation,
    precision of estimates will be reduced.

37
Intra-class correlation
  • In the context of schools/students: the degree
    to which students are similar within schools.
  • Large intra-class correlation
    • Schools are highly tracked. High-ability
      students are in the same schools. Low-ability
      students are in other schools.
  • Low intra-class correlation
    • The range of abilities of students is about the
      same in all schools.
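
One common way to put a number on this is the one-way ANOVA estimator, sketched below for equal cluster sizes. The slides do not say which estimator they have in mind, so this choice is an assumption, and the two tiny data sets are hypothetical extremes.

```python
from statistics import mean

def intraclass_correlation(clusters):
    """One-way ANOVA estimate of the intra-class correlation (equal cluster sizes)."""
    k = len(clusters[0])                       # students per school
    if any(len(c) != k for c in clusters):
        raise ValueError("this sketch assumes equal cluster sizes")
    grand_mean = mean([x for c in clusters for x in c])
    cluster_means = [mean(c) for c in clusters]
    # Between-school and within-school mean squares.
    msb = k * sum((m - grand_mean) ** 2 for m in cluster_means) / (len(clusters) - 1)
    msw = sum(
        (x - m) ** 2 for c, m in zip(clusters, cluster_means) for x in c
    ) / (len(clusters) * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

tracked = [[10, 10, 10], [20, 20, 20], [30, 30, 30]]   # students identical within schools
mixed = [[10, 20, 30], [10, 20, 30], [10, 20, 30]]     # same ability range in every school
print(intraclass_correlation(tracked))   # 1.0
print(intraclass_correlation(mixed))     # -0.5
```

The negative value for the mixed case is the lower bound of this estimator, -1/(k - 1), reached when school means are identical; real data usually fall between 0 and 1.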

38
Effect of cluster sampling
  • If intra-class correlation is high, then we need
    to select more schools to get the variations of
    student abilities.
  • In the extreme case, if all students within each
    school have the same ability, then sampling all
    students from one school is equivalent to
    sampling just one student.
  • Our estimate from the sample will be quite
    imprecise as compared with the population
    parameter. (loss of sampling efficiency)

39
(Sampling) Design Effect
  • Defined as the loss of efficiency from using a
    sample design other than simple random sampling
  • If n1 students from the actual design (e.g. a
    cluster sample) are required to achieve the same
    precision as n2 students from a simple random
    sample, then the design effect is n1/n2.

40
Table of intra-class correlation - 1
  • Table 1.1 in Reading IIEPModule3.pdf
  • For example, if we sample 20 students from each
    school (cluster size of 20), and the intra-class
    correlation is around 0.2, then the design effect
    is 4.8
  • This means that, if cluster sampling is used, we
    need a sample size 4.8 times larger than the
    sample size for a simple random sample.
  • In Australia, the intra-class correlation is
    around 0.2, or a little higher.
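
The figure of 4.8 is consistent with the widely used approximation deff = 1 + (b - 1) × rho, where b is the cluster size and rho the intra-class correlation: 1 + 19 × 0.2 = 4.8. The sketch below assumes that formula; the simple random sample size of 400 is a hypothetical value for illustration.

```python
def design_effect(cluster_size, icc):
    """Approximate design effect for equal-sized clusters: 1 + (b - 1) * rho."""
    return 1 + (cluster_size - 1) * icc

def cluster_sample_size(srs_size, cluster_size, icc):
    """Students needed under cluster sampling to match an SRS of srs_size."""
    return srs_size * design_effect(cluster_size, icc)

print(round(design_effect(20, 0.2), 1))            # 4.8, the slide's example
print(round(cluster_sample_size(400, 20, 0.2)))    # 1920 students, i.e. 96 schools of 20
```

Taking fewer students per school lowers the design effect: with 10 students per school and the same correlation it falls to 2.8.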

41
Table of design effects
  • See document DesignEffectPISA2003.xls
  • (data from PISA 2003 technical report)

42
Computation of sampling error
  • More complicated when multi-stage cluster
    sampling is used.
  • Can be estimated once the intra-class correlation
    is known (say, from previous studies)