Title: EVALUATION OF THE NSRCG SCHOOL SAMPLE
1. EVALUATION OF THE NSRCG SCHOOL SAMPLE
- Donsig Jang and Xiaojing Lin
- Third International Conference on Establishment Surveys
- Montreal, Canada, June 21, 2007
2. Outline
- Sampling options on repeated establishment surveys
- Reasons to keep the same sample in establishment surveys
- Issues in keeping the same sample
- Example: NSRCG school sample
- Summary
- Recommendation for 2008 NSRCG School Sample
3. Sampling options on repeated establishment surveys
- Keep the same sample over time with supplemental samples for births
  - Efficient change estimates BUT
    - Response burden
    - Inefficient cross-sectional estimates
- An independent sample in each survey round
- Sample coordination to maximize overlaps between samples
  - Rotation samples (Sigman and Monsour 1995)
  - Permanent random number technique (Ohlsson 1995, 2001)
  - Keyfitz procedure (Keyfitz 1951)
4. Reasons to keep the same sample in establishment surveys
- Difficulty in identifying the point of contact
- Costly efforts in gaining participation
- Gathering the information often requires a nontrivial process; previous survey participation would help
5. Issues in keeping the same sample
- Can it be a representative sample of the current cross-sectional population?
  - Depends on how dynamic the population is over time
    - coverage issues: births vs. deaths
    - sample efficiency: distributional changes
- Alternatives
  - Independent sample from the most up-to-date sampling frame
  - Coordination of samples
    - E.g., Keyfitz procedure to maximize the overlap between the current and previous samples
6. National Survey of Recent College Graduates (NSRCG)
- Repeated every two or three years
- Collects education, demographic, and employment information from recent college graduates (bachelor's and master's) majoring in science, engineering, and health fields
- Two-stage sample design
  - 1st stage: select schools and obtain the list of graduates from selected schools
  - 2nd stage: select graduates from the list provided by schools
- NSF-sponsored survey
7. NSRCG: List collection from schools
- Identify point of contact (usually institutional coordinator)
- Gather the list of graduates with key sampling and locating information, including
  - degree award dates
  - degree level
  - field of major
  - race/ethnicity
  - gender
  - date of birth
  - SSN
  - student ID
  - mailing addresses, including parents' addresses
  - phone numbers (land line, cell)
  - emails, etc.
8. NSRCG: List collection from schools (continued)
- Need a good understanding of the information requested and the file format
- Time-consuming and costly efforts
  - different schools have different issues
- A crucial part of the quality of the survey
  - strive for an almost perfect cooperation rate (99%)
- Out of 300 schools,
  - only four final refusals in 2003
  - only five refusals in 2006
9. NSRCG: School sample selection
- For the 1995, 1997, 1999, and 2001 surveys
  - 275 schools initially selected in 1995 and kept, with 5 supplemental samples added over three survey rounds (to account for frame coverage)
- A new sample of 300 schools selected in 2003
  - To reflect rapid changes in S&E populations in the 1990s
  - Health field added to the survey as an eligible field of study
10. NSRCG: School sample selection (continued)
- Probability proportional to size (PPS) with composite size measure
  - Composite size measures calculated to achieve equal weights within each of the NSRCG analytic domains, constructed from a combination of
    - degree year, degree level, field of major, race/ethnicity, and gender
- Population dynamics
  - new schools (birth), closed (death), no S&E graduates (temporarily ineligible), etc.
    - coverage issue
  - distributions of schools changed (in terms of composite size measures)
    - potential factor affecting the sample efficiency
11. 2003 NSRCG school sample
In both 2001 and 2003 NSRCG: 170 (57%)
Only in 2003 NSRCG: 130 (43%)
Total: 300
Excessive efforts (time and resources) to achieve a 99% response rate (4 schools refused)
12. Distribution of list submission dates in 2003 NSRCG
[Figure: distribution of list submission dates; axis: days]
13. School sample after 2003 NSRCG: 2006 NSRCG
2003 frame based on AY2001 IPEDS counts
2006 frame based on AY2003 and AY2004 IPEDS counts
14. Graduate counts dropped from and added to the population
15. Graduate counts dropped from and added to the population
16. 2006 NSRCG School Sample
- No significant change in the population
- Kept the same school sample without any supplemental sample
17. Distribution of list submission dates in 2006 NSRCG
[Figure: distribution of list submission dates; axis: days]
18. 2008 NSRCG?
- Evaluate the current sampling strategy (keeping the same sample) by doing
  - frame evaluation
  - comparisons with other sampling schemes
    - Independent PPS
    - Keyfitz procedure
19. 2008 NSRCG
Frame evaluation
2003 frame based on AY2001 IPEDS counts
2008 frame based on AY2006 IPEDS counts
20. Graduate counts dropped from and added to the population
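The frame comparison behind counts like these can be sketched in a few lines. This is a minimal illustration, assuming each frame is a mapping from a school identifier to its count of eligible graduates; the function name `frame_changes`, the school IDs, and the counts are hypothetical, not IPEDS data.

```python
from typing import Dict, Tuple

def frame_changes(old: Dict[str, int], new: Dict[str, int]) -> Dict[str, Tuple[int, int]]:
    """Return (number of schools, graduate count) added to and dropped from the frame."""
    # Births: schools with eligible graduates now but none on the old frame.
    births = [s for s, c in new.items() if c > 0 and old.get(s, 0) == 0]
    # Deaths: schools with eligible graduates before but none on the new frame
    # (closed, or temporarily without eligible graduates).
    deaths = [s for s, c in old.items() if c > 0 and new.get(s, 0) == 0]
    return {
        "added": (len(births), sum(new[s] for s in births)),
        "dropped": (len(deaths), sum(old[s] for s in deaths)),
    }

# Hypothetical frames (school ID -> eligible graduate count).
frame_2003 = {"A": 120, "B": 40, "C": 75}
frame_2008 = {"A": 130, "C": 0, "D": 60}
changes = frame_changes(frame_2003, frame_2008)
print(changes)  # {'added': (1, 60), 'dropped': (2, 115)}
```

Comparing the dropped and added graduate counts with the frame total gives a rough sense of whether a supplemental sample of births is enough to restore coverage.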
21. Sample Evaluation
- Three sample selection methods considered
  - Keep the 2003 school sample with a supplemental sample of size 4
  - Independent PPS with composite size measures based on updated frame information
  - Keyfitz procedure
22. PPS sample selection procedure
Define the size measure of school $i$ as
$$S_i = \sum_{d} \frac{m_d}{M_d}\, M_{id},$$
where
- $m_d$ is the sample size of domain $d$,
- $M_d$ is the population size of domain $d$,
- $M_{id}$ is the population size of domain $d$ in school $i$, and
- domain $d$ is constructed from a combination of graduate year, degree level, field of major, race/ethnicity, and gender.
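A minimal sketch of the size-measure calculation, assuming the usual composite form S_i = sum over d of (m_d / M_d) * M_id. The school names, domain labels, counts, and target sample sizes below are hypothetical illustration values, not NSRCG data, and `composite_size` is a name made up for this example.

```python
from typing import Dict

def composite_size(school_counts: Dict[str, Dict[str, float]],
                   domain_sample: Dict[str, float]) -> Dict[str, float]:
    """Return S_i = sum_d (m_d / M_d) * M_id for every school i."""
    # M_d: population size of domain d, summed over all schools on the frame.
    domain_pop: Dict[str, float] = {}
    for counts in school_counts.values():
        for d, m_id in counts.items():
            domain_pop[d] = domain_pop.get(d, 0.0) + m_id
    # f_d = m_d / M_d: overall sampling rate wanted in domain d.
    rate = {d: domain_sample[d] / domain_pop[d] for d in domain_pop}
    return {i: sum(rate[d] * m_id for d, m_id in counts.items())
            for i, counts in school_counts.items()}

# Hypothetical frame: graduate counts M_id by school and domain.
frame = {
    "school_A": {"bach_eng": 400, "mast_eng": 100},
    "school_B": {"bach_eng": 100, "mast_eng": 100},
    "school_C": {"bach_eng": 500, "mast_eng": 0},
}
targets = {"bach_eng": 50, "mast_eng": 20}  # hypothetical m_d

sizes = composite_size(frame, targets)
total = sum(sizes.values())

# PPS inclusion probabilities for a sample of n schools (assumes none exceeds 1).
n_schools = 2
incl_prob = {i: n_schools * s / total for i, s in sizes.items()}
print(sizes, incl_prob)
```

In this toy frame the size measures sum to the total target sample size (50 + 20 = 70), which is a convenient check on the calculation.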
23. PPS sample selection procedure
- School $i$ selected with probability $\pi_i$ proportional to size $S_i$
  - Achieves equal weights within each domain $d$
- Distributional changes in the NSRCG graduate populations would cause weight variation within domains
- Independent PPS with up-to-date frame data is desirable if weight variation is severe
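One way to see why this design equalizes weights, and why an outdated size measure breaks that property, is the short derivation below. It is a sketch under assumptions not stated on the slides: the size measure is $S_i = \sum_d f_d M_{id}$ with $f_d = m_d / M_d$, $n$ schools are selected, every selected school contributes the same number of graduates $m^{*} = m/n$, and no $\pi_i$ exceeds 1.

```latex
\begin{align*}
\pi_i &= \frac{n S_i}{S}, \qquad S = \sum_i S_i = \sum_d m_d = m = n\,m^{*},\\
m^{*}_{id} &= m^{*}\,\frac{f_d M_{id}}{S_i}
  \quad\text{(graduates taken from domain $d$ in school $i$)},\\
\Pr(\text{graduate selected}) &= \pi_i \cdot \frac{m^{*}_{id}}{M_{id}}
  = \frac{n S_i}{S}\cdot\frac{m^{*} f_d}{S_i} = f_d,\\
w_{id} &= \frac{1}{f_d} = \frac{M_d}{m_d}
  \quad\text{(the same in every school)}.
\end{align*}
```

If school $i$ was instead selected with an outdated probability $\pi_i^{\text{old}} = n S_i^{\text{old}} / S^{\text{old}}$ while the within-school allocation uses current counts, the same steps give

```latex
\[
w_{id} = \frac{M_d}{m_d}\cdot\frac{S_i / S}{S_i^{\text{old}} / S^{\text{old}}},
\]
```

so weights within a domain vary across schools by the ratio of current to outdated relative size.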
24. Keyfitz procedure
- Maximizes the overlap between two samples
- The first sample (2003 NSRCG) was selected with PPS
- The second-sample inclusion probability depends on
  - updated size measures
  - the first-sample inclusion probability
  - the actual sample realization in the first sample
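The three inputs listed above can be seen in the classic single-draw form of the Keyfitz (1951) procedure, sketched here for one unit per stratum. The multi-unit PPS variant used for the NSRCG evaluation is not described on the slides, so this is an illustration of the idea rather than the authors' implementation; the function name `keyfitz_reselect` and the probabilities are hypothetical.

```python
import random
from typing import Dict, Optional

def keyfitz_reselect(old_p: Dict[str, float], new_p: Dict[str, float],
                     selected: str, rng: Optional[random.Random] = None) -> str:
    """One-unit Keyfitz update: return the unit to use in the second sample."""
    if rng is None:
        rng = random.Random()
    p_old = old_p[selected]
    p_new = new_p.get(selected, 0.0)  # a closed school has probability 0
    # Probability did not decrease: always retain the selected unit.
    if p_new >= p_old:
        return selected
    # Probability decreased: retain with probability p_new / p_old.
    if rng.random() < p_new / p_old:
        return selected
    # Otherwise replace it with a unit whose probability increased,
    # drawn with probability proportional to the size of the increase.
    gains = {u: q - old_p.get(u, 0.0) for u, q in new_p.items()
             if q > old_p.get(u, 0.0)}
    units = list(gains)
    return rng.choices(units, weights=[gains[u] for u in units], k=1)[0]

# Hypothetical selection probabilities for one stratum of three schools.
old_p = {"A": 0.5, "B": 0.3, "C": 0.2}
new_p = {"A": 0.4, "B": 0.35, "C": 0.25}

rng = random.Random(2008)
reps = 20000
counts = {u: 0 for u in new_p}
kept = 0
for _ in range(reps):
    first = rng.choices(list(old_p), weights=list(old_p.values()), k=1)[0]
    second = keyfitz_reselect(old_p, new_p, first, rng)
    counts[second] += 1
    kept += first == second
freq = {u: c / reps for u, c in counts.items()}
overlap = kept / reps
print(freq, overlap)
```

In this toy stratum the second-round frequencies should approach the updated probabilities, and the retention rate should approach the sum of min(old, new) = 0.9, against 0.355 (the sum of old times new) for an independent redraw.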
25. Simulation of sampling procedures
- Generate 1,000 independent school samples for each of the following options
  - Keep the same school sample with a supplemental sample of size 4 from the newly eligible schools (births)
  - Independent PPS sampling using measures of size (MOS) calculated from the 2008 NSRCG frame
  - Keyfitz procedure
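A simulation of this kind can be sketched for the independent-PPS option as follows. The selection method (systematic PPS from a randomly ordered list), the frame of 40 schools, the measures of size, and the summary statistic (overlap with a previously selected sample) are all assumptions made for illustration; the slides do not say which PPS algorithm or evaluation statistics were used.

```python
import random
from typing import Dict, List

def pps_systematic(sizes: Dict[str, float], n: int, rng: random.Random) -> List[str]:
    """Draw n units by systematic PPS from a randomly ordered list.

    Assumes no size exceeds the sampling interval (no certainty units).
    """
    units = list(sizes)
    rng.shuffle(units)
    step = sum(sizes[u] for u in units) / n   # sampling interval
    start = rng.random() * step               # random start in [0, step)
    picked: List[str] = []
    cum = 0.0
    for u in units:
        cum += sizes[u]
        # Select the unit for each selection point that falls in its size range.
        while len(picked) < n and start + len(picked) * step < cum:
            picked.append(u)
    return picked

rng = random.Random(2008)
# Hypothetical frame: 40 schools with an old and an updated measure of size.
old_mos = {f"school_{i:02d}": rng.uniform(80, 120) for i in range(40)}
new_mos = {s: v * rng.uniform(0.8, 1.2) for s, v in old_mos.items()}

n_schools, n_reps = 10, 1000
# Stand-in for the sample selected in the earlier round.
previous = set(pps_systematic(old_mos, n_schools, rng))

# 1,000 independent PPS samples drawn with the updated measures of size.
samples = [pps_systematic(new_mos, n_schools, rng) for _ in range(n_reps)]
mean_overlap = sum(len(previous & set(s)) for s in samples) / n_reps
print(f"mean overlap with previous sample: {mean_overlap:.2f} of {n_schools}")
```

The same loop could hold the other two options and a weight-variation statistic in place of the overlap count; only the function that produces each replicate sample would change.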
28. Summary
- Keeping the same sample is a cost-effective option
  - Concern about statistical inefficiency due to the dynamic nature of the population
  - Frame coverage corrected by supplemental sample
- Evaluate the NSRCG school sample
  - Empirical frame evaluation
  - Samples simulated based on two methods
- Distributional changes (in terms of composite size measure) would make the final sample inefficient
  - Weight variation within planned domains
  - Over- or underestimation of graduates in some domains
29. Recommendation
- Keep the same school sample with a supplemental sample of size 4 for the 2008 NSRCG