Title: Sampling, Reliability and Validity
1. Sampling, Reliability and Validity
2. Multiple Indicators/Composite Measures
- Many concepts cannot be measured with a single indicator.
- Hence, multiple questions are sometimes used to measure a concept.
- An index is a composite measure developed to represent different components of a concept.
3. Index of Delinquency Questions from the Richmond Youth Project Survey
- 67) Have you ever taken little things (worth less than $2) that did not belong to you?
- 68) Have you ever taken things of some value (between $2 and $50) that did not belong to you?
- 69) Have you ever taken things of large value (worth over $50) that did not belong to you?
- 70) Have you ever taken a car for a ride without the owner's permission?
- 71) Have you ever banged up something that did not belong to you on purpose?
- 72) Not counting fights you may have had with a brother or sister, have you ever beaten up on anyone or hurt anyone on purpose?
- Answers
- A) No, never B) More than a year ago C) During the last year D) During the last year and more than a year ago
- Scoring: Add up the subject's scores (A=0, B=1, C=2, D=3); a scoring sketch follows this list.
- 0-6 low, 7-12 medium, 13-18 high
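A minimal Python sketch of how such a composite score could be computed. The letter-to-point mapping and the low/medium/high cut-offs follow the slide above; the function names and the example answers are illustrative.

```python
# Scoring sketch for the six-item delinquency index above.
# Point values and cut-offs come from the slide; names and data are illustrative.

ANSWER_POINTS = {"A": 0, "B": 1, "C": 2, "D": 3}

def delinquency_score(answers):
    """Sum the points for a subject's answers to items 67-72."""
    return sum(ANSWER_POINTS[a] for a in answers)

def delinquency_category(score):
    """Collapse the 0-18 score into the low/medium/high bands."""
    if score <= 6:
        return "low"
    if score <= 12:
        return "medium"
    return "high"

subject_answers = ["A", "B", "B", "C", "D", "A"]   # hypothetical responses to items 67-72
score = delinquency_score(subject_answers)
print(score, delinquency_category(score))          # -> 7 medium
```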
4. Sampling
- A sample is a selection of elements from a population.
- Samples are used because one often cannot survey the entire population.
- Sampling methods may be probability or non-probability.
5. Probability versus Non-Probability Sampling Methods
- Probability Methods
- Are used when the odds of selection of population elements are known.
- Methods include
- Simple Random
- Systematic Random
- Stratified Random
- Cluster
- Non-Probability Methods
- Are used when the odds of selection of population elements are not known.
- Methods include
- Availability/Convenience
- Quota
- Purposive
- Snowball
6. Non-Probability Sampling Methods
- The obvious disadvantage of non-probability sampling is that, because the probability that a person will be chosen is not known, the investigator cannot claim that the sample is representative of the larger population. This greatly limits generalizability (to whom the research findings can be applied).
7. So Why Use Non-Probability Sampling Methods?
- Cost
- Time
- Available Resources
- The total population of many groups is unknown.
- This type of sample may be adequate if
- the researcher has no need to generalize his/her findings, or
- it is a test of the questions, reliability, or validity of a questionnaire.
8. Availability/Convenience Samples
- Also called haphazard or accidental sampling.
- Elements are selected for the sample because they are available or easy to find.
- Captive audiences, like sociology classes, are often used.
- Drawbacks
- The sample is not very representative.
- There is a tremendous amount of response bias.
9. Quota Sampling
- Is the non-probability equivalent of stratified sampling.
- Subjects are recruited because they match a requirement or quota. For example, one might want to represent racial categories in proportion to how they appear in the larger population (see the sketch after this list).
- Because of response bias, false homogeneity may occur.
- It is useful to know the characteristics of the general population; this way you can test whether your sample is missing a characteristic that is relevant.
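A minimal sketch of the quota-filling logic in Python: respondents are accepted as they become available until each category's quota is full. The quota targets, category labels, and respondent records are all hypothetical.

```python
# Quota sampling sketch: accept available respondents only while their category's quota is open.
quotas = {"White": 30, "Black": 6, "Other": 4}   # hypothetical targets per category
sample = []

def try_recruit(respondent):
    """Recruit the respondent if the quota for their category is not yet filled."""
    category = respondent["race"]
    if quotas.get(category, 0) > 0:
        sample.append(respondent)
        quotas[category] -= 1
        return True
    return False

# Respondents are taken as they happen to be available (a non-probability selection).
for person in [{"id": 1, "race": "White"}, {"id": 2, "race": "Black"}, {"id": 3, "race": "White"}]:
    try_recruit(person)

print(len(sample), quotas)   # -> 3 {'White': 28, 'Black': 5, 'Other': 4}
```

Note that nothing in this procedure is random, which is why quota samples remain non-probability samples.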
10. Purposive Sampling
- Sample elements are selected for a purpose, usually because of the unique position of the sample element. Only respondents who meet the purpose of the study are chosen.
- For example, if you are interested in altruism, you might only consider people who give to charity.
- Theoretical Sampling- a growing theoretical interest guides the selection of sample cases. The researcher selects cases based on the new insights they may provide.
11. Snowball Sampling
- To draw a snowball sample, you identify one member of the population; this person then connects you to other members of the population (a small sketch follows this list).
- This method is used when one has a hard-to-reach but interconnected population.
- Researchers who are interested in networks use snowball sampling.
- This is a frequently employed method for in-depth interviews.
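A minimal sketch, assuming a hypothetical referral network, of how snowball recruitment can be traced from a single seed member until a target number of participants is reached.

```python
from collections import deque

# Snowball sampling sketch: start from one seed member and follow referrals
# until the target sample size is reached. The referral network is hypothetical.
referrals = {
    "seed": ["p1", "p2"],   # who each participant says they can put you in touch with
    "p1": ["p3"],
    "p2": ["p3", "p4"],
    "p3": [],
    "p4": ["p5"],
}

def snowball(seed, target_size):
    recruited, queue, seen = [], deque([seed]), {seed}
    while queue and len(recruited) < target_size:
        person = queue.popleft()
        recruited.append(person)
        for contact in referrals.get(person, []):
            if contact not in seen:          # avoid re-contacting the same person
                seen.add(contact)
                queue.append(contact)
    return recruited

print(snowball("seed", 4))   # -> ['seed', 'p1', 'p2', 'p3']
```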
12. Probability Sampling Methods
- Sampling methods in which the probability of selection of elements is known and is not zero.
- These methods use a random method of selecting elements, and therefore have no systematic bias- nothing but chance affects which elements end up in the sample.
- In order to perform a probability sample, one must have a sampling frame- a list of all the elements from which the sample will be drawn.
13. Are Probability Methods Error Free?
- Knowing the odds of selection of each element does not eliminate errors due to chance.
- Error is affected by sample size and the homogeneity of the population.
- 1. The larger the sample, the more confidence we can have in the sample's representativeness.
- 2. The more homogeneous the population, the more confidence we can have in the representativeness of a sample of any particular size.
- 3. The fraction of the total population that a sample contains does not affect the degree of confidence we can have in the sample's representativeness, unless that fraction is large (more than 2%).
14. Sampling Vocabulary
- Elements- the sampling element is the unit of analysis or case in the population. It can be a person, a group, an organization, etc., that is being measured.
- Populations- the pool of all available elements is the population.
- The target population refers to the specific pool of cases that the researcher wants to study.
- Sampling frame- the sampling frame is the list of elements of the population from which the sample will be drawn. Having a good sampling frame is crucial.
- Population parameters- any true characteristic of a population is a parameter.
15. Randomness and Probability Sampling
- Random- refers to a process that generates a mathematically random result; that is, the selection process operates in a truly random manner.
- Each element has an equal chance of selection.
- Random samples are most likely to yield a sample that truly represents the population (is generalizable).
- Random sampling lets a researcher statistically estimate the sampling error.
- Sampling error is the deviation between sample results and a population parameter due to random processes.
16. Simple Random Sample
- Elements must be identified from the sampling frame with a procedure that generates numbers or otherwise identifies cases strictly on the basis of chance.
- Typically each subject is assigned a number and these numbers are drawn at random.
- In simple random sampling, there is no replacement of subjects after they are drawn out of the sampling frame (a sketch of such a draw follows this list).
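A minimal sketch of drawing a simple random sample without replacement from a numbered sampling frame, using Python's standard library. The frame size and sample size are hypothetical.

```python
import random

# Simple random sample sketch: every numbered case in the frame has an equal
# chance of selection, and no case can be drawn twice.
sampling_frame = list(range(1, 2001))           # hypothetical frame: cases numbered 1-2000
random.seed(42)                                  # fixed seed only so the example is repeatable

sample = random.sample(sampling_frame, k=200)    # random.sample draws without replacement
print(len(sample), sorted(sample)[:5])
```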
17. Central Limit Theorem
- Central Limit Theorem- if you take repeated random samples from the same sampling frame, then as the number of samples increases toward infinity, the pattern of the sample results around the population parameter becomes more predictable.
- With a huge number of random samples, the sampling distribution forms a normal curve, and the midpoint of the curve approaches the population parameter as the number of samples increases.
- We can use our knowledge of the central limit theorem to construct confidence intervals, a range within which we are confident that the population parameter lies (a small simulation follows this list).
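A small simulation of the idea above, under assumed data: repeated random samples from one population produce a sampling distribution of means that centers on the population mean, and its spread can be used to form a rough confidence range.

```python
import random
import statistics

# Central limit theorem sketch: many random samples from one population give a
# sampling distribution of means clustered around the population mean.
# The population and sample sizes here are arbitrary illustrations.
random.seed(1)
population = [random.randint(18, 90) for _ in range(10_000)]   # e.g., ages in a population
true_mean = statistics.mean(population)

sample_means = [
    statistics.mean(random.sample(population, k=100))          # one random sample of 100
    for _ in range(2_000)                                       # repeated many times
]

# The midpoint of the sampling distribution sits close to the population parameter.
print(round(true_mean, 2), round(statistics.mean(sample_means), 2))

# A rough 95% confidence-style range taken from the simulated sampling distribution.
sample_means.sort()
print(sample_means[50], sample_means[1950])   # 2.5th and 97.5th percentiles of 2,000 means
```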
18. Systematic Random Sampling
- In a systematic sample it is assumed that all the individuals in the sampling frame are randomly listed, and the researcher needs to choose a sample of 1/k of the population. Every kth element is selected, and since all elements are randomly distributed, the sample is random.
- Three steps of systematic random sampling (see the sketch after this list)-
- 1. Sampling interval- the total population is divided by the number of cases required for the sample; this number is the sampling interval. E.g., you have 2,000 people in your sampling frame and need 200 good surveys; you decide that in order to get 200 you need to draw 250. So 2000/250 = 8, and you take every eighth person.
- 2. Random draw for first case- select a random number between 1 and the sampling interval (here, 1-8), e.g. by drawing lots. That is the first case.
- 3. Take every kth case. If the sampling interval is not a whole number, vary the interval size systematically.
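A minimal sketch of the three steps, reusing the hypothetical numbers from the slide (a frame of 2,000 people and a draw of 250 cases).

```python
import random

# Systematic random sampling sketch following the three steps above.
sampling_frame = list(range(1, 2001))          # hypothetical list of 2,000 people
cases_needed = 250

interval = len(sampling_frame) // cases_needed  # step 1: sampling interval (2000 / 250 = 8)
random.seed(7)
start = random.randint(1, interval)             # step 2: random draw for the first case

# Step 3: take every kth case, counting from the random start.
sample = sampling_frame[start - 1::interval]

print(interval, start, len(sample), sample[:5])
```

Here the interval divides evenly; when it does not, the interval has to be varied systematically, as noted in step 3.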
19. Stratified Random Sampling
- Stratified random sampling represents elements in specific proportions.
- All of the elements in the population (sampling frame) are separated into groups based on some characteristic or set of characteristics.
- Each group is called a stratum (the groups are strata).
- Elements are sampled randomly from each stratum. This can ensure that you get the right proportion of elements in your sample (a sketch follows this list).
- Proportionate stratified random sampling- each stratum is sampled exactly in proportion to its size in the population. So if the sample is of all Americans and 12% of Americans are African-American, then 12% of your sample comes from that stratum.
- Disproportionate stratified random sampling- the proportion of each stratum is intentionally varied from what it is in the population.
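A minimal sketch of proportionate stratified random sampling under a hypothetical frame: elements are grouped into strata and a simple random sample is drawn from each stratum in proportion to its share of the population.

```python
import random

# Proportionate stratified sampling sketch: sample from each stratum in
# proportion to its share of the frame. The frame below is hypothetical.
random.seed(3)
frame = (
    [{"id": i, "race": "African-American"} for i in range(120)]
    + [{"id": i, "race": "Other"} for i in range(120, 1000)]
)
sample_size = 100

# Group the frame into strata by the stratifying characteristic.
strata = {}
for person in frame:
    strata.setdefault(person["race"], []).append(person)

sample = []
for name, members in strata.items():
    share = len(members) / len(frame)                   # stratum's share of the population
    n = round(sample_size * share)                      # proportionate allocation
    sample.extend(random.sample(members, n))            # simple random draw within the stratum

print({name: len(members) for name, members in strata.items()})
print(len(sample))   # -> 100 (12 African-American + 88 Other)
```

For disproportionate stratified sampling, the allocation line would simply use chosen counts per stratum instead of the proportional share.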
20. Why Sample Disproportionately?
- A group might be so small that, without oversampling, its proportion of the total sample would be too small for any meaningful statistics to be calculated.
- You may want equal numbers of elements in each group, rather than the proportions represented in the population.
21. Cluster Sampling
- A cluster is a naturally occurring, mixed aggregate of elements of the population, with each element appearing in one and only one cluster. For example: schools, blocks, clubs, political parties.
- In a cluster sample, clusters are selected at random and a simple random sample is then drawn within each selected cluster (a sketch follows this list).
- Cluster sampling is useful when there is no sampling frame available or the cost of developing one is too high.
- The main advantages are savings in time and money and access to populations that might not be sampled any other way.
- The main disadvantage is that, because it is not a completely random sample, it might not be representative of the population.
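A minimal sketch of two-stage cluster sampling with hypothetical school rosters: clusters are drawn at random, then students are randomly sampled within each selected cluster.

```python
import random

# Cluster sampling sketch: randomly pick clusters (schools), then draw a simple
# random sample of elements within each chosen cluster. The data are hypothetical.
random.seed(11)
schools = {                      # clusters: every student belongs to exactly one school
    "School A": [f"A-{i}" for i in range(400)],
    "School B": [f"B-{i}" for i in range(250)],
    "School C": [f"C-{i}" for i in range(600)],
    "School D": [f"D-{i}" for i in range(150)],
}

chosen_clusters = random.sample(list(schools), k=2)       # stage 1: random clusters
sample = []
for school in chosen_clusters:
    sample.extend(random.sample(schools[school], k=50))   # stage 2: random students within each

print(chosen_clusters, len(sample))
```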
22. Hidden Populations
- Some populations are hidden or difficult to locate, for example, functioning drug addicts.
- It is usually impossible to use probability sampling with these populations.
23. Rules of Thumb for Sample Size
- There is no hard and fast rule for the minimum number of people one needs in a sample. A minimum of at least 30 is needed to do statistics, and around 100 for any kind of real work.
- Ultimately it depends on what you want to study. It is important to include extras in your sampling frame because of retention issues.
- A researcher's decision about the best sample size depends on three things:
- 1) the degree of accuracy required
- 2) the degree of variability or diversity in the population
- 3) the number of different variables examined simultaneously in the data analysis.
24. Determining Sample Size
- There are several considerations that determine sample size (see the sketch after this list):
- The less sampling error desired in the sample statistics, the larger the sample size must be.
- Samples of more homogeneous populations can be smaller than samples of more heterogeneous populations.
- If analysis is limited to descriptive variables, a smaller sample is possible than when complex analysis of sub-groups is planned.
- If a researcher expects to find a strong relationship, a smaller sample will be needed to detect it than if weaker relationships are expected.
- Equal increases in sample size produce more of an increase in accuracy for small samples than for large ones.
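The slide states the considerations qualitatively; as an illustration only (not from the slides), the standard formula for estimating a proportion, n = z²·p(1−p)/e², shows how the desired margin of error and the population's variability drive the required sample size.

```python
import math

# Illustrative sample-size sketch (not from the slides): n = z^2 * p(1-p) / e^2
# for estimating a proportion, where e is the desired margin of error and p(1-p)
# reflects population variability. All values below are assumptions.

def sample_size_for_proportion(margin_of_error, p=0.5, z=1.96):
    """Sample size needed to estimate a proportion within +/- margin_of_error at 95% confidence."""
    return math.ceil(z ** 2 * p * (1 - p) / margin_of_error ** 2)

print(sample_size_for_proportion(0.05))          # +/- 5 points, maximum variability -> 385
print(sample_size_for_proportion(0.03))          # less error desired -> larger sample (1068)
print(sample_size_for_proportion(0.05, p=0.1))   # more homogeneous population -> smaller sample (139)
```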
25. Reliability
- Is a measure of consistency.
- A measure is reliable if the measurement does not change when the concept being measured remains constant in value.
- For example, height: if you use a measuring tape to measure your height, you expect to get similar results each time.
26. Two Principles of Reliability
- Stability- the principle that a reliable measure should not change from one application to the next.
- Applying the same measure to similar subject populations should yield similar results.
- Equivalence- the principle that all the items that make up a measuring instrument should be consistent with one another.
- Scoring high on one item should mean that one scores similarly on related items.
27. Indications of Unreliability
- Test-Retest Reliability
- Inter-item Reliability (Internal Consistency)
- Alternate-Forms Reliability
- Inter-observer Reliability
28. Test-Retest Reliability
- Tests if a measure is consistent across time.
- For example, a test or survey can be administered and then administered again a month later. Barring an event that would have some bearing on the results, one can expect similar results (a small check is sketched after this list).
- When ratings by an observer, rather than ratings by the subject, are being assessed at two points in time, test-retest reliability is termed intra-observer or intra-rater reliability.
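A minimal sketch of a test-retest check: correlate the scores from the two administrations. The scores are made up, and `statistics.correlation` requires Python 3.10 or later.

```python
from statistics import correlation   # available in Python 3.10+

# Test-retest sketch: Pearson correlation between two administrations of the same scale.
time_1 = [12, 18, 7, 15, 22, 9, 14, 20]   # scores at the first administration (hypothetical)
time_2 = [13, 17, 8, 14, 21, 10, 15, 19]  # same subjects one month later (hypothetical)

r = correlation(time_1, time_2)
print(round(r, 3))   # values near 1 suggest the measure is stable over time
```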
29. Inter-Item Reliability (Internal Consistency)
- When researchers use multiple items to measure a single concept, these items should be consistent with one another.
- Cronbach's alpha is a statistic commonly used to measure inter-item reliability (a sketch of the calculation follows).
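A minimal sketch of the Cronbach's alpha calculation, α = k/(k−1)·(1 − Σ item variances / variance of total scores), on a small hypothetical item-by-subject matrix.

```python
from statistics import pvariance

# Cronbach's alpha sketch: alpha = k/(k-1) * (1 - sum of item variances / variance of totals).
# The response data below are hypothetical.

def cronbach_alpha(item_scores):
    """item_scores: list of items, each a list of scores across the same subjects."""
    k = len(item_scores)
    totals = [sum(scores) for scores in zip(*item_scores)]   # each subject's total score
    item_var_sum = sum(pvariance(scores) for scores in item_scores)
    return k / (k - 1) * (1 - item_var_sum / pvariance(totals))

# Four items answered by six subjects (rows are items, columns are subjects).
items = [
    [3, 4, 2, 5, 4, 3],
    [3, 5, 2, 4, 4, 2],
    [2, 4, 3, 5, 5, 3],
    [3, 4, 2, 5, 3, 3],
]
print(round(cronbach_alpha(items), 2))   # closer to 1 means more internally consistent
```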
30. Alternate-Forms Reliability
- When subjects' answers to slightly different versions of survey questions are compared, alternate-forms reliability is being tested.
- A researcher may reverse the order of the response choices or modify question wording in minor ways, and then administer the two forms of the test to subjects. If the two sets of responses are not too different, alternate-forms reliability is established.
31. Inter-Observer Reliability
- When researchers use more than one observer to rate the same people, events, or places, inter-observer reliability is their goal.
- If results are similar, we can have more confidence that the ratings reflect the phenomenon being assessed rather than the orientations of the observers.
32. Validity
- Validity asks: are you measuring what you think you are measuring? Put another way, does your measure accurately measure the variable that it is intended to measure?
- There are four types of validity:
- Face Validity
- Content Validity
- Criterion Validity
- Construct Validity
33. Face Validity
- Refers to confidence gained from careful inspection of the concept to see if it is appropriate "on its face."
- Every measure should be inspected for face validity.
- Face validity alone does not provide convincing evidence of measurement validity.
34. Content Validity
- Establishes that the measure covers the full range of the concept's meaning.
- To determine the range of meaning, the researcher may solicit the opinions of experts and review literature that identifies the different aspects, or dimensions, of the concept.
35. Criterion Validity
- Is established when the scores obtained on one measure can be accurately compared to those obtained with a more direct or already validated measure of the same phenomenon.
- For example, one can compare self-reports of alcohol consumption to a blood, breath, or urine test.
- The criterion that researchers select can be measured either at the same time as the variable to be validated or after that time.
- Concurrent validity- exists when a measure yields scores that are closely related to scores on a criterion measured at the same time.
- Predictive validity- is the ability of a measure to predict scores on a criterion measured in the future.
36. Construct Validity
- Shows that a measure is valid by demonstrating that it is related to a variety of other measures specified in a theory.
- Construct validity is used when no clear criterion exists for validation purposes.
- Two other approaches to construct validity:
- Convergent validity- is achieved when one measure of a concept is associated with different types of measures of the same concept.
- Discriminant validity- scores on the measure to be validated are compared to scores on measures of different but related concepts; these associations should be weaker than those with measures of the same concept.
37. Ways to Improve Validity and Reliability
- Engage potential respondents in group discussions about the questions to be included on the survey.
- Conduct cognitive interviews- test questions and clarify respondents' cognition and what they meant by their answers.
- Audiotape test interviews during the pretest phase of a survey; review the tapes and code them to identify problems in question wording or delivery.
- Pre-test final surveys.
38. Understanding Reliability and Validity
- Reliability is necessary for validity; however, reliability does not guarantee validity.
- You may consistently measure something you are not intending to measure.