Sampling Racial and Ethnic Minorities

1
Sampling Racial and Ethnic Minorities
  • William D. Kalsbeek
  • Director, Survey Research Unit
  • Professor, Department of Biostatistics
  • University of North Carolina
  • June 15, 2000

2
Acknowledgements
  • Ms. Gayle Shimokura
  • For significant contributions to this
    presentation through her meticulous background
    research.
  • CDC/National Center for Health Statistics
    (Contract No. UR6/CCU417428-01)
  • For funding support for this presentation
  • UNC-CH's Center for Health Statistics Research
  • http://www.sph.unc.edu/chsr

3
Race/Ethnic Minorities (% of Population, March
2000 CPS)
  • Hispanics (11.7%)
    • Settled (95%)
    • Mobile (5%)
  • African-American (12.8%)
    • Settled (99.9%)
    • Mobile (0.1%)
  • Asian-American (4.0%)
  • Native-American (0.9%)

4
Overview
  • Some basics on probability sampling
  • Problems in sampling rare population subgroups
  • A review of some existing remedies
  • Note that a reference list is available

5
Context: Sampling Race/Ethnic Minorities
[Diagram: <-------------------- General Population -------------------->, with labels "Targeted", "With Oversampling", and "Ethnic Minority"]
  • As the population subgroup of interest in a
    specially targeted study (targeted sampling)
  • As a key subgroup in a general population study
    (oversampling)

6
Probability vs. Nonprobability Sampling?
  • Probability sampling
  • Random sampling methods used
  • Each member of the target population has a
    known, nonzero selection probability
  • Nonprobability sampling in exceptional
    circumstances
  • Judgment used
  • Requires models to analyze
  • Probability sampling is generally preferred

7
Sampling Frames and Linkage
  • Sampling Frame: List(s) used to select a
    probability sample
  • EXAMPLE: List of patients to sample health care
    users
  • Usefulness of a frame is tied to the linkage
    that exists between entries on the list and the
    population being sampled

8
Sample Weights
  • A number for each member of the sample
  • Reflecting the inverse of the selection
    probability for the sample member
  • May be adjusted for sample imbalance due to
  • Nonresponse
  • Incomplete frame coverage
  • Other selection problems
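
As a minimal sketch of the weighting steps above: each base weight is the inverse of the selection probability, and respondent weights are then scaled up within a weighting class to make up for nonresponse. The dictionary keys (`prob`, `cell`, `responded`) and the weighting-class adjustment are illustrative assumptions, not something the slides specify.

```python
from collections import defaultdict


def base_weight(selection_prob):
    """Base weight: the inverse of a unit's selection probability."""
    if not 0.0 < selection_prob <= 1.0:
        raise ValueError("selection probability must be in (0, 1]")
    return 1.0 / selection_prob


def nonresponse_adjusted_weights(sample):
    """Return one weight per sample member (0.0 for nonrespondents).

    Each sample member is a dict with hypothetical keys:
      'prob'      - selection probability
      'cell'      - weighting class used for the adjustment
      'responded' - True if the member completed the survey
    Within each cell, respondent weights are scaled up so that they
    sum to the base-weight total of everyone selected in that cell.
    """
    selected_total = defaultdict(float)
    respondent_total = defaultdict(float)
    for member in sample:
        w = base_weight(member["prob"])
        selected_total[member["cell"]] += w
        if member["responded"]:
            respondent_total[member["cell"]] += w

    weights = []
    for member in sample:
        if member["responded"]:
            cell = member["cell"]
            factor = selected_total[cell] / respondent_total[cell]
            weights.append(base_weight(member["prob"]) * factor)
        else:
            weights.append(0.0)
    return weights


# Made-up sample: cell A loses one of its two members to nonresponse.
sample = [
    {"prob": 0.1, "cell": "A", "responded": True},
    {"prob": 0.1, "cell": "A", "responded": False},
    {"prob": 0.5, "cell": "B", "responded": True},
    {"prob": 0.5, "cell": "B", "responded": True},
]
weights = nonresponse_adjusted_weights(sample)
```

In this made-up sample the lone respondent in cell A carries the weight of both selected members, so the adjusted weights still sum to the base-weight total.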

9
What are the Statistical Goals of Probability
Sampling?
  • Validity
  • The ability to produce estimates without bias
    tied to sampling
  • Achieved if all population members have some
    known chance to be chosen in the sample
  • Efficiency
  • Tied to precision of estimates
  • Achieved if the right sampling tools are used
  • Greater efficiency costs more (cost-efficiency)

10
What Selection Tools Might be Used to Sample
Race/Ethnic Minorities?
  • Stratified sampling
  • Separate sampling within each of a number of
    population groupings (strata)
  • Screening for the targeted minority group
  • Identify subgroup members in initial sample of
    the full population

11
Stratified Sampling
  • Population divided into H subgroups called
    strata
  • Separate probability sample in each stratum
  • Combine estimates from each stratum to produce
    the estimate for the whole population
  • Vs. Stratified Analysis
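
The "combine estimates from each stratum" step can be sketched as a population-share-weighted average of stratum sample means. This assumes simple random sampling within each stratum and known stratum population sizes; the function name and the numbers are illustrative only.

```python
def stratified_mean(strata):
    """Estimate a population mean from a stratified sample.

    strata: list of (N_h, sample_values) pairs, where N_h is the
    population size of stratum h and sample_values are the observations
    from a simple random sample drawn within that stratum.
    """
    population_size = sum(N_h for N_h, _ in strata)
    estimate = 0.0
    for N_h, values in strata:
        stratum_mean = sum(values) / len(values)
        estimate += (N_h / population_size) * stratum_mean
    return estimate


# Made-up data: a large stratum sampled lightly, a small one sampled heavily.
strata = [(900, [1, 2, 3]), (100, [10, 12])]
estimate = stratified_mean(strata)
pooled = sum([1, 2, 3, 10, 12]) / 5  # ignores the design, for contrast
```

Here the small stratum is sampled at a much higher rate, so the unweighted pooled mean overstates its influence while the stratified estimate does not.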

12
Stratified Sampling Used When
  • Wish to improve the efficiency of population-wide
    estimates
  • AND/OR
  • Wish to control the sample size for estimates of
    important population subgroups
  • Subgroups must be isolatable to some degree by
    the strata

13
Stratum Allocation Options
  • C_h = Average cost of adding another respondent
    to the sample in the h-th stratum
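
The slide's table of allocation options did not survive in this transcript. As a hedged sketch, the function below implements three standard textbook rules that use the per-respondent cost C_h: proportionate (n_h proportional to N_h), Neyman (proportional to N_h S_h), and cost-optimum (proportional to N_h S_h / sqrt(C_h)). Whether these are exactly the options the slide listed is an assumption, as are the function and parameter names.

```python
from math import sqrt


def allocate(n, N, S=None, C=None):
    """Split a total sample of size n across strata.

    N : stratum population sizes N_h
    S : stratum standard deviations S_h (omit for proportionate allocation)
    C : per-respondent costs C_h (omit when costs are equal)
    The allocation is n_h proportional to N_h * S_h / sqrt(C_h).
    Returns unrounded stratum sample sizes.
    """
    H = len(N)
    S = S or [1.0] * H
    C = C or [1.0] * H
    size = [N[h] * S[h] / sqrt(C[h]) for h in range(H)]
    total = sum(size)
    return [n * s / total for s in size]


proportionate = allocate(100, [900, 100])
neyman = allocate(100, [900, 100], S=[1.0, 9.0])
cost_optimum = allocate(100, [900, 100], S=[1.0, 9.0], C=[1.0, 4.0])
```

With these made-up inputs, the more variable stratum gains sample under Neyman allocation and loses some of it again once its higher per-respondent cost is taken into account.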

14
Stratum Allocation Options
15
Screening for a Targeted Population Subgroup
  • Sampling in two phases
  • Goal is to locate members of the population
    subgroup
  • Usually done by telephone or face-to-face in
    general population surveys
  • Process
  • Select an initial sample
  • Administer a relatively short interview
  • To determine membership in the targeted subgroup
  • Retain all target subgroup members and (perhaps)
    a random portion of the rest
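
A rough sketch of the retention step in the process above, assuming the screening interview has already classified each unit. The names (`screen_and_subsample`, `keep_rate`) are hypothetical; the weight of each retained non-member is divided by the retention rate so the two-phase design is still reflected in the weights.

```python
import random


def screen_and_subsample(initial_sample, is_member, keep_rate, rng):
    """Second phase of a screening design.

    initial_sample : list of (unit, base_weight) pairs from phase one
    is_member      : function returning True for target-subgroup members
    keep_rate      : probability of retaining a non-member (0 to 1)
    rng            : a random.Random instance
    Returns the retained (unit, weight) pairs.
    """
    retained = []
    for unit, weight in initial_sample:
        if is_member(unit):
            # Every target-subgroup member is kept with its phase-one weight.
            retained.append((unit, weight))
        elif rng.random() < keep_rate:
            # Non-members are subsampled, so their weights are inflated.
            retained.append((unit, weight / keep_rate))
    return retained


# Made-up phase-one sample: 200 units, 1 in 10 in the target subgroup.
initial_sample = [({"id": i, "minority": i % 10 == 0}, 50.0) for i in range(200)]
retained = screen_and_subsample(
    initial_sample, lambda u: u["minority"], keep_rate=0.25, rng=random.Random(0)
)
```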

16
What May Lead to Problems in Sampling Race/Ethnic
Minorities?
  • Incomplete Frame(s)
  • A sizable portion of the population not linked to
    entries on the list(s) used for sampling
  • Rarity
  • They usually comprise a relatively small
    percentage of the target population

17
What May Lead to Problems in Sampling Race/Ethnic
Minorities?
  • Mobility
  • Some of them move around a lot, thus creating a
    more dynamic than static linkage between the
    frame and sampled population
  • Dispersion
  • They are somewhat scattered geographically
  • May have some pockets with relatively high
    concentrations

18
(No Transcript)
19
Some Remedies
  • Targeted Sampling
  • Multiple Frame Methods
  • Linkage Exploitation Methods
  • Network/multiplicity sampling
  • Snowball sampling
  • Adaptive cluster sampling
  • Time and Space Sampling
  • Oversampling
  • Disproportionate Stratified Sampling with
    Screening

20
Multiple Frame Methods: Selection Approaches
  • Premise
  • Frame options taken alone may be inadequate or
    too costly to use,
  • BUT
  • Choosing the sample jointly from multiple frames
    may
  • Produce better coverage of the targeted
    population and
  • Be more cost-effective
  • Dual-Frame Designs --- Two frames

21
Multiple Frames
[Diagram labels: Frame A, Frame B, Frame C]
22
Multiple Frame Methods: EXAMPLE
  • Sampling Native Americans
  • Two frames
  • List of tribal rolls
  • Less complete
  • Less expensive to locate Native Americans
  • Area household frame from a list of residential
    dwellings in a sample of block groups
    (neighborhoods)
  • More complete
  • More expensive because of the need to screen
  • Most cost-effective mix?

23
Multiple Frame Methods: Estimation Approaches
  • Work by Hartley (1962), Choudry (1989), and
    Skinner and Rao (1996)
  • Special Requirements
  • Identify/eliminate overlap prior to sampling
  • OR
  • Require knowledge of membership in intersection
    groups for analysis adjustments

24
Multiple Frame Methods: Estimation Approaches
  • Eliminate frame duplication and treat as a
    stratified sample
  • OR
  • Select with duplication present and either
  • Combine estimates for intersection groups
  • OR
  • Determine frame membership for sample respondents
    and weight accordingly
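
One way to "combine estimates for intersection groups" is the dual-frame form associated with Hartley (1962): units found only in one frame keep their full weight, while units in the overlap are counted from both samples with mixing factors theta and 1 - theta. The sketch below assumes overlap membership is known for every sampled unit and uses a fixed, hypothetical theta rather than an optimized one.

```python
def hartley_dual_frame_total(sample_a, sample_b, theta=0.5):
    """Dual-frame estimate of a population total.

    sample_a, sample_b : lists of (y, weight, in_overlap) tuples, where
        weight is the unit's sampling weight within its own frame and
        in_overlap is True if the unit also appears on the other frame.
    theta : share of the overlap-domain estimate taken from frame A;
        the remaining 1 - theta is taken from frame B.
    """
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must be between 0 and 1")
    total = 0.0
    for y, weight, in_overlap in sample_a:
        total += weight * y * (theta if in_overlap else 1.0)
    for y, weight, in_overlap in sample_b:
        total += weight * y * ((1.0 - theta) if in_overlap else 1.0)
    return total


# Made-up counts (y = 1 per person). Frame A is a list frame whose members
# all appear on the more complete area frame B as well.
sample_a = [(1, 10.0, True), (1, 10.0, True)]
sample_b = [(1, 50.0, True), (1, 50.0, False)]
estimate = hartley_dual_frame_total(sample_a, sample_b, theta=0.5)
```

Setting theta to 0 ignores frame A for the overlap domain, and setting it to 1 ignores frame B there; choosing theta well is part of the added analysis complexity of multiple-frame designs.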

25
Multiple Frame Methods: Implications for Sampling
Race/Ethnic Minorities
  • Advantages
  • Improved sample coverage over using a single list
  • Potential cost savings if cost of frame use
    differs among frames
  • Disadvantages
  • Higher design/selection/analysis complexity
    relative to single frame use
  • Challenge in finding the most cost-effective mix
    of sample sizes for frames

26
Linkage Exploitation Methods: Selection Approaches
  • Premise
  • Population members with a rare attribute can
    often identify others with the same attribute
  • Various adaptations
  • Based on the notion of multiplicity in frames
  • Differ according to how multiplicity is utilized

27
Multiplicity
[Diagram labels: Frame Listing, Population Member]
28
Linkage Exploitation Methods: Various Adaptations
  • Network/multiplicity sampling
  • Network --- social/spatial/organizational linkage
    among members of the targeted subgroup
  • EXAMPLES: relatives, friends, co-workers,
    co-habitants, organization co-members, etc.
  • Linkages may be
  • Asymmetric
  • Complex
  • EXAMPLE: friends

29
Linkage Exploitation Methods: Various Adaptations
  • Network/multiplicity sampling
  • Sampling Process
  • Choose an initial sample of the targeted subgroup
  • Sample members interviewed and asked to nominate
    other members of their network who are members of
    the targeted subgroup
  • Interview those nominated and have them nominate
    others in like manner
  • Selection probability directly tied to size of
    network

30
Linkage Exploitation Methods: Various Adaptations
  • Snowball sampling
  • Network sampling but with multiple phases of
    nomination
  • Snowballing may be best used to construct frames
    to sample rare populations
  • Continue waves of nomination until list expansion
    ceases
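
Frame construction by snowballing can be sketched as repeated waves of nomination that stop once a wave adds no new names. The nomination table below is invented for illustration, and `nominate` is a hypothetical stand-in for the interview step.

```python
def snowball_frame(seeds, nominate):
    """Build a frame by waves of nomination.

    seeds    : initial list of known members
    nominate : function mapping a member to the members it names
    Returns (frame, waves), where waves counts the nomination waves
    that added at least one new member.
    """
    frame = set(seeds)
    wave = list(seeds)
    waves = 0
    while wave:
        next_wave = []
        for member in wave:
            for nominee in nominate(member):
                if nominee not in frame:
                    frame.add(nominee)
                    next_wave.append(nominee)
        if next_wave:
            waves += 1
        wave = next_wave
    return frame, waves


# Hypothetical nominations; E names A, but nobody names E.
nominations = {"A": ["B", "C"], "B": ["A"], "C": ["D"], "D": [], "E": ["A"]}
frame, waves = snowball_frame(["A"], lambda m: nominations.get(m, []))
```

In this toy network E is never reached because no one nominates E, a small illustration of how asymmetric linkages can leave gaps in a snowballed frame.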

31
Linkage Exploitation Methods: Various Adaptations
  • Adaptive cluster sampling
  • Exploits the tendency for members of some
    targeted subgroups to cluster together
  • Original motivation from ecology and geology
  • Sampling Process
  • Select a random sample of the population
  • Where one identifies members of the targeted
    subgroup, sample others in the neighborhood
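
The selection step of adaptive cluster sampling can be sketched on a grid of areas: draw an initial random sample of cells, and whenever a sampled cell contains target-subgroup members, add its four adjacent cells and repeat. The grid, the neighborhood definition, and the "at least one member" trigger are all assumptions made for illustration; estimation from such a sample needs specially adapted estimators, which are not shown here.

```python
import random


def adaptive_cluster_sample(counts, n_initial, rng):
    """Adaptive cluster selection on a grid.

    counts    : dict mapping (row, col) to the number of target-subgroup
                members found in that cell
    n_initial : size of the initial simple random sample of cells
    rng       : a random.Random instance
    Returns (initial_cells, all_sampled_cells).
    """
    initial = rng.sample(sorted(counts), n_initial)
    sampled = set(initial)
    to_visit = list(initial)
    while to_visit:
        row, col = to_visit.pop()
        if counts[(row, col)] == 0:
            continue  # no members found here, so do not expand
        neighbors = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
        for cell in neighbors:
            if cell in counts and cell not in sampled:
                sampled.add(cell)
                to_visit.append(cell)
    return initial, sampled


# Made-up 6 x 6 grid with one small pocket of subgroup members.
counts = {(r, c): 0 for r in range(6) for c in range(6)}
counts.update({(2, 2): 5, (2, 3): 3, (3, 2): 4})
initial, sampled = adaptive_cluster_sample(counts, n_initial=6, rng=random.Random(1))
```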

32
Linkage Exploitation Methods: EXAMPLE
  • Snowballing a sampling frame of prenatal care
    providers
  • Study of recent female immigrants from Central
    and South America
  • Process
  • Contact OB-GYNs in private practices and public
    clinics
  • Those providing prenatal care to immigrants
    nominate others doing the same
  • Continue iteratively until no new providers
    are discovered

33
Linkage Exploitation Methods: Estimation
  • Major contributors: Sirken (network), Goodman
    (snowball), and Thompson (adaptive)
  • Approaches
  • Weighted multiplicity estimation (Sirken)
  • Rao-Blackwellization to improve estimator
    efficiency (Thompson)
  • Special requirements
  • Network membership information
  • Multiplicity counts
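
A minimal sketch of weighted multiplicity estimation, assuming reporting units (for example, households) are selected with equal probability and each reported person comes with a multiplicity count, i.e. the number of reporting units linked to that person. Each report is down-weighted by that count so people with many linkages are not overcounted. The function name and data are illustrative.

```python
def multiplicity_estimate(reports, selection_prob):
    """Estimate the number of people in a rare subgroup.

    reports        : one list per sampled reporting unit; each entry is
                     the multiplicity count of a person that unit reported
    selection_prob : common selection probability of the reporting units
    """
    weighted_count = 0.0
    for unit_reports in reports:
        for multiplicity in unit_reports:
            if multiplicity < 1:
                raise ValueError("multiplicity counts must be at least 1")
            weighted_count += 1.0 / multiplicity
    return weighted_count / selection_prob


# Check with a complete enumeration (selection_prob = 1): three people,
# linked to 1, 2, and 2 reporting units respectively.
census_reports = [[1, 2], [2, 2], [2]]
census_total = multiplicity_estimate(census_reports, selection_prob=1.0)

# A made-up 1-in-10 sample of reporting units.
sample_total = multiplicity_estimate([[1, 2], [], [2, 4]], selection_prob=0.1)
```

When every reporting unit is enumerated, the weighted count returns exactly the number of people, which is the property that makes the multiplicity counts a requirement for this kind of estimation.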

34
Linkage Exploitation Methods: Implications for
Sampling Race/Ethnic Minorities
  • Advantages
  • Greater operational efficiency in locating
    members of the target population
  • Find a hotspot then sample nearby
  • Disadvantages
  • Difficult to determine selection probabilities
    for weights
  • Asymmetric linkages (A nominates B, but not vice
    versa)
  • Valid probability samples?

35
Time and Space Sampling: Selection Approach
  • Premise
  • Portions of ethnic subpopulations are relatively
    mobile (e.g., migrant farm workers, homeless)
  • Sampling a chunk of time
  • Linkage between members of the target subgroup
    and the frame is dynamic over time
  • Those moving more frequently have a greater chance
    of selection
  • Sample space and time to address this potential
    for bias

36
Time and Space Sampling: EXAMPLE
  • Sampling migrant seasonal farm workers
  • Process
  • Spatial dimension: sample migrant housing
    locations
  • On farms
  • In other residential housing areas
  • Time dimension: sample time periods during the
    data collection period
  • Three consecutive days
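
The two dimensions in this example can be sketched as a frame of location-by-time units, here every pairing of a housing location with a block of three consecutive days, from which an equal-probability sample is drawn. The location names, the 12-day collection period, and the use of non-overlapping blocks are assumptions for illustration.

```python
import random
from itertools import product


def time_space_frame(locations, n_days, block_len=3):
    """List every (location, block of consecutive days) pair."""
    starts = range(0, n_days - block_len + 1, block_len)
    return [
        (location, tuple(range(start, start + block_len)))
        for location, start in product(locations, starts)
    ]


def time_space_sample(frame, n_units, rng):
    """Equal-probability sample of location-by-time units, without replacement."""
    return rng.sample(frame, n_units)


locations = ["farm housing 1", "farm housing 2", "trailer park", "motel"]
frame = time_space_frame(locations, n_days=12)
units = time_space_sample(frame, n_units=5, rng=random.Random(2000))
selection_prob = len(units) / len(frame)  # same for every unit in the frame
```

A worker who could be found in several of these units has a higher chance of being picked up, which is why a multiplicity count per sample member matters at the estimation stage.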

37
Time and Space Sampling: Estimation
  • Contributors: Kalsbeek (1988), Kalton (1991)
  • Approaches
  • Multiplicity estimators similar to those used in
    network samples
  • Special Requirements
  • Need multiplicity count for each sample member?
  • Sampling scheme compromise needed between
  • Statistical precision of estimates
  • Operational effectiveness

38
Time and Space Sampling: Implications for
Sampling Race/Ethnic Minorities
  • Advantages
  • Deals with the fluidity of frame-population
    linkage in mobile populations
  • Provides a framework for finding a cost-efficient
    solution
  • Disadvantages
  • Added complexity to selection, data gathering,
    and analysis of sample

39
Disproportionate Stratified Sampling with
Screening: Selection Approach
  • Premise
  • Concentrations of the targeted subgroup vary in
    the population
  • Sample strata with higher concentrations more
    heavily
  • Result: larger sample size for the target
    subgroup relative to a proportionate sample

40
(No Transcript)
41
DSS with Screening: EXAMPLE
  • Oversampling African-Americans
  • A simple process
  • Stratify the population
  • By relatively high and low concentrations of
    African-Americans
  • High concentration areas in the South and large
    cities
  • Sample with relatively higher rates in the high
    concentration stratum

42
DSS with Screening: Estimation
  • Approaches
  • Weighted estimate to account for sample
    disproportionality
  • Effect of variable weights is to lower precision
    of some population estimates
  • Special Requirements
  • Establishing the most cost-efficient overall and
    stratum-specific sampling rates

43
DSS with Screening: Implications for Sampling
Race/Ethnic Minorities
  • Advantages
  • Increased sample size for the targeted subgroups
  • Are there target subgroup non-members in the
    (oversampled) high concentration strata?
  • Disadvantages
  • Loss in precision on overall population estimates

44
A Two-Stratum Model for Effects of Oversampling
  • Setting
  • Oversampling a minority group
  • 10% of the population
  • Two sampling strata
  • One with higher minority (to oversample)
  • One with lower minority (to undersample)
  • Two alternative sets of strata
  • Nearly Pure --- strata are virtually all members
    or all non-members
  • Less Pure --- strata are mostly members or mostly
    non-members

45
Nearly Pure Strata
[Diagram labels: Oversampled Stratum, Undersampled Stratum, TARGET POPULATION]
46
Less Pure Strata
[Diagram labels: Oversampled Stratum, Undersampled Stratum, TARGET POPULATION]
47
A Two-Stratum Model for Effects of Oversampling
  • Assumptions
  • Simple random sampling in each stratum
  • Stratum unit variances are equal
  • Other minor simplifying conditions

48
A Two-Stratum Model for Effects of Oversampling
  • Sample Sizes (Relative to Proportionate)
  • Minority_Nom = Nominal Sample Size for Minority
  • Observed increase in size of minority sample
  • Due to oversampling of the predominantly minority
    stratum
  • Minority_Eff = Effective Sample Size for Minority
  • Adjusted size of minority sample
  • Considering the (downward) effect of variable
    sample weights on statistical quality of
    estimates
  • Overall_Eff = Effective Size of Overall Sample
  • Adjusted size of overall sample
  • Considering the (downward) effect of variable
    sample weights on statistical quality of
    estimates
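
The three relative sample sizes can be sketched for a two-stratum design with a fixed total sample. The slides do not give their formulas, so this sketch assumes the common effective-sample-size approximation (sum of weights squared, divided by the sum of squared weights), simple random sampling within strata, and equal unit variances; the parameter names and the stratum compositions are hypothetical.

```python
def oversampling_effects(W1, p1, p2, k):
    """Sample sizes relative to a proportionate sample of the same total size.

    W1 : population share of stratum 1 (the oversampled stratum)
    p1 : minority proportion in stratum 1
    p2 : minority proportion in stratum 2
    k  : ratio of sampling rates, stratum 1 to stratum 2 (k = 1 is proportionate)
    """
    W2 = 1.0 - W1
    minority_share = W1 * p1 + W2 * p2
    # Shares of the (fixed) total sample that fall in each stratum.
    n1 = W1 * k / (W1 * k + W2)
    n2 = W2 / (W1 * k + W2)
    # Relative sample weights: inverse of the relative sampling rates.
    w1, w2 = 1.0 / k, 1.0

    def effective(a1, a2):
        """Effective size of a1 units with weight w1 and a2 units with weight w2."""
        return (a1 * w1 + a2 * w2) ** 2 / (a1 * w1 ** 2 + a2 * w2 ** 2)

    m1, m2 = n1 * p1, n2 * p2  # minority sample shares by stratum
    return {
        "Minority_Nom": (m1 + m2) / minority_share,
        "Minority_Eff": effective(m1, m2) / minority_share,
        "Overall_Eff": effective(n1, n2),
    }


# Hypothetical strata, both giving a 10% minority population overall.
nearly_pure = oversampling_effects(W1=0.10, p1=0.95, p2=0.005 / 0.9, k=4)
less_pure = oversampling_effects(W1=0.20, p1=0.40, p2=0.025, k=4)
```

Under these assumptions, oversampling raises the nominal minority sample size more than its effective size and lowers the effective size of the overall sample, with a smaller minority gain when the strata are less pure.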

49
Effects of Oversampling: Nearly Pure Strata
50
Effects of Oversampling: Less Pure Strata
51
Summary
  • Sampling rare ethnic groups is possible
  • BUT
  • Accomplishing it effectively is likely to be
  • Complex (dealing with multiplicity, dealing with
    multiple frames, resolving statistical-operational
    dilemmas)
  • Costly (screening, stratification)
  • Adverse effect on overall population estimates
    (if oversampling done)
  • Loss of sampling validity? (snowball sampling)

52
A Case-Study in Oversampling Blacks and
Mexican-Americans
  • The Third National Health and Nutrition
    Examination Survey (NHANES III)

53
Cluster Sampling
  • Random selection applied to one or more levels of
    a population hierarchy
  • Sampling Stage: Level of hierarchy at which
    sampling is done
  • Jargon
  • PSU: Primary Sampling Unit, what is sampled in
    the first selection stage
  • SSU: Secondary Sampling Unit, what is sampled
    in the second stage

54
Population Hierarchies
55
Population Hierarchies
  • EXAMPLE: African-American residents of the US
    non-institutionalized household population

Resident > Household > Block Group > Census Tract
> Minor Civil Division > County > State > US
56
NHANES III Overview
  • National health survey
  • U.S. civilian noninstitutionalized population
  • Stratified multi-stage sample design
  • Detailed profile and predictors of health status
  • Data gathering timeline
  • 1988-94
  • Data collected by
  • Face-to-face interviews in the home
  • Detailed examination at mobile sites

57
NHANES III Target Population
  • U.S. residents
  • Two months and older
  • Including those living in Alaska and Hawaii
  • Civilians only
  • Excludes housing on military bases
  • Noninstitutionalized population only
  • Excludes some residents of hospitals, nursing
    homes, prisons, and other comparable institutions
  • Eligibility determined as of the time of interview

58
NHANES III in General
  • Key minority domains
  • Black (non-Hispanic)
  • Mexican American
  • Children 2 months to 5 years
  • The Elderly > 60 years

59
(No Transcript)
60
Stratification to Oversample Key Minority Domains
Applied at
  • The PSU level
  • Race/ethnicity or income indicator
  • The segment level
  • Density of Mexican-Americans
  • The household level
  • Race/ethnicity
  • The (sample) person level
  • Age

61
Oversampling of Key Minority Domains
  • Implementation accomplished by
  • Disproportionate allocation favoring key minority
    domains
  • Using a weighted measure of size

62
Stratification to Oversample Key Minority
Domains in NHANES III
63
Stratification to Oversample Key Minority
Domains in NHANES III
  • Oversampling implies more widely variable
    selection probabilities and sample weights
  • Effect of variable weights is to increase
    variances of estimates
  • One model: increased variance by a factor of
    (formula not transcribed)
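
A widely used model for the variance increase due to variable weights is Kish's approximation, shown here as a sketch; whether it is the model this slide had in mind is an assumption. With sample weights w_i for the n sample members and cv(w) their coefficient of variation, the multiplicative increase in variance is approximately:

```latex
\[
  1 + \mathrm{cv}^2(w)
  \;=\;
  \frac{n \sum_{i=1}^{n} w_i^{2}}{\left( \sum_{i=1}^{n} w_i \right)^{2}}
\]
```

The factor equals 1 when all weights are equal and grows as the weights become more variable.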

64
Stratification to Oversample Key Minority Domains
in NHANES III
  • EXAMPLE
  • Effect of variable sample weights on total
    population estimates using data from the
    MEC-examined NHANES III sample