Censuses and Surveys: Still Useful for the Common Good




1
Censuses and Surveys: Still Useful for the
Common Good?
  • Henry E. Brady
  • Professor of Political Science and Public Policy
  • Director, Survey Research Center and UC DATA
  • University of California, Berkeley

2
Uses of Census and Survey Data
  • Two Examples of Their Usefulness
  • Historical Question: Was Howard County, Maryland
    a slave county?
  • Current Policy Question: Immigrants and welfare
    programs in California
  • Methods
  • Mapping Data
  • Linking Data
  • Across administrative datasets
  • Across administrative and survey datasets

3
Example Number One: Was Howard County, Maryland a
Slave County?
  • Source: Henry E. Brady (UC Berkeley)

4
Was Howard County, Maryland a Slave County?
  • Method: Consider historical census materials by
    county across states and over time
  • Collect data
  • Map it
  • Visualize it
  • Data Source: University of Virginia Library,
    Historical Census Browser, Geospatial and
    Statistical Data Center,
    http://fisher.lib.virginia.edu/collections/stats/histcensus
  • Further Question: How long did this legacy
    matter for politics?
  • Data Source: ICPSR County Level Returns matched
    to Census Data. http://www.icpsr.umich.edu/
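
The collect, map, and visualize steps above start from county-level counts. A minimal sketch of the first computation, assuming each census record carries a total population and an enslaved population. The record layout, the `CountyCensus` name, the 10 percent cutoff, and the sample figures are illustrative assumptions, not values from the Historical Census Browser:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CountyCensus:
    """One county-year record (hypothetical layout)."""
    state: str
    county: str
    year: int
    total_population: int
    enslaved_population: int


def enslaved_share(record: CountyCensus) -> float:
    """Fraction of the county's population recorded as enslaved."""
    if record.total_population <= 0:
        raise ValueError("total_population must be positive")
    return record.enslaved_population / record.total_population


def classify(record: CountyCensus, cutoff: float = 0.10) -> str:
    """Label a county for mapping; the cutoff is an arbitrary illustration."""
    if enslaved_share(record) >= cutoff:
        return "slave county"
    return "non-slave county"


# Placeholder numbers, invented for the example.
example = CountyCensus("State X", "County A", 1850, 1000, 250)
print(enslaved_share(example), classify(example))
```

The labels produced this way are what would be shaded on the county map; a different cutoff would move the empirical line of demarcation.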

5
Map of Maryland Counties
(Map label: District of Columbia)
6
(No Transcript)
7
West Virginia and Virginia
(Map labels: West Virginia; Virginia; line where Virginia and West Virginia separated)
8
Mason-Dixon Line
9
Southern Pennsylvania, Maryland, Delaware, and
Virginia: Slavery in 1850
(Map labels: Pennsylvania; Delaware; Maryland; Virginia; empirical line of demarcation between slave and non-slave counties)
10
What was the Legacy? Getting Voting Data
  • Go to ICPSR
  • Search for data using search engine
  • Download data in SPSS format
  • Add Census data by pasting and hand entry
  • Analyze data using statistical package
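
The pasting and hand-entry step above could also be done as a keyed merge on county. A minimal sketch, assuming both files have been read into lists of dictionaries with `state` and `county` fields. The field names and the `merge_by_county` helper are hypothetical, and the rows are placeholders:

```python
def county_key(row):
    """Normalize state and county names so trivial differences still match."""
    return (row["state"].strip().upper(), row["county"].strip().upper())


def merge_by_county(returns, census):
    """Attach census fields to each election-return row with a matching county."""
    census_index = {county_key(row): row for row in census}
    merged, unmatched = [], []
    for row in returns:
        match = census_index.get(county_key(row))
        if match is None:
            unmatched.append(county_key(row))  # left for manual review
        else:
            merged.append({**match, **row})    # return fields win on name clashes
    return merged, unmatched


# Placeholder rows, invented for the example.
returns = [
    {"state": "State X", "county": "County A ", "votes_cast": 1200},
    {"state": "State X", "county": "County C", "votes_cast": 800},
]
census = [{"state": "state x", "county": "county a", "total_population": 9000}]

merged, unmatched = merge_by_county(returns, census)
print(len(merged), unmatched)
```

Keeping the unmatched keys makes it easy to see which counties still need hand checking, for example because of boundary or name changes.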

11
1864
(Map labels: Howard; Southern Counties)
12
1876
(Map label: Howard)
13
Example Number Two: What is the Experience of
Immigrants with Welfare Programs?
  • Source: Henry E. Brady (UCB)
    and Jon Stiles (UCB)

14
Immigrants and Welfare
  • Question: What is the experience of immigrants
    with welfare programs?
  • Problem: Very few datasets have both
  • Immigration status (native, naturalized citizen,
    non-citizen)
  • Immigrant welfare and job experience over time

15
Census Survey Data Can Provide
  • Nativity: Whether native or non-native, plus date
    of entry to the US and citizenship status for
    non-natives
  • SES and Demographics -- Household composition,
    education, sources of income, race/ethnicity,
    marital status, etc.
  • Cross-Sectional Population Samples -- Description
    of both program participants and non-participants
    at a point in time.

16
Administrative Data Can Provide
  • Program Participation Over Time: Medi-Cal
    Eligibility Data System (MEDS)
  • Monthly record of eligibility for welfare
    programs, 1988-2002
  • Programmatic basis for eligibility
  • Work History Over Time: Employment Development
    Department (EDD) - Base Wage files
  • Quarterly earnings as reported for UI/DI coverage
    from 1991 to 1999
  • Identifies number of employers, total covered
    earnings

17
Census Surveys with Program Participation by
Nativity, 1990-02
(Figure: CPS and SIPP samples by sampling year, 1990-2002)
Samples are drawn each year, with household and
personal characteristics measured at sampling.
The CPS follows sampled housing units for 4 months
in each of 2 consecutive years, while the SIPP
follows households for 2.5 years with interviews
every 4 months.
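
The CPS design described above (4 months in each of 2 consecutive years) implies a fixed interview calendar for each housing unit. A minimal sketch, assuming the second block of interviews starts exactly 12 months after the first. The function name is hypothetical:

```python
def cps_interview_months(first_year: int, first_month: int):
    """Return (year, month) pairs for 4 months in sample, then the same 4 months a year later."""
    if not 1 <= first_month <= 12:
        raise ValueError("first_month must be between 1 and 12")
    offsets = list(range(0, 4)) + list(range(12, 16))
    schedule = []
    for offset in offsets:
        index = (first_month - 1) + offset  # months elapsed since January of first_year
        schedule.append((first_year + index // 12, index % 12 + 1))
    return schedule


print(cps_interview_months(1995, 3))
```

A unit first interviewed in March 1995 would be interviewed March through June of 1995 and again March through June of 1996.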
18
California Administrative Data: Program
Participation by Year (MEDS)
(Figure: MEDS data by survey sampling year and year of MEDS coverage, 1990-2002)
Medi-Cal eligibility and program participation are
identified monthly for samples following the
initial survey interview for
the year of sampling.
19
Year of MEDS coverage
(Figure: years of MEDS coverage for each survey sampling year, 1990-2002; the 1990 sample has 13 years of coverage and each later sample has one year fewer)
MEDS data cover the year of sampling and each
subsequent year through 2002, so individuals in
each panel can potentially be tracked in the MEDS
data for up to 13 years after initial sampling.
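
The coverage window above is simple arithmetic: a panel sampled in a given year is covered from that year through 2002. A short sketch, with a hypothetical function name:

```python
FIRST_SAMPLE_YEAR = 1990
LAST_MEDS_YEAR = 2002


def meds_years_covered(sample_year: int) -> int:
    """Number of calendar years of MEDS data available for a panel sampled in sample_year."""
    if not FIRST_SAMPLE_YEAR <= sample_year <= LAST_MEDS_YEAR:
        raise ValueError("sample_year must fall between 1990 and 2002")
    return LAST_MEDS_YEAR - sample_year + 1


for year in (1990, 1996, 2002):
    print(year, meds_years_covered(year))
```

The 1990 panel gets the full 13 years, while the 2002 panel has only its sampling year.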
20
California Administrative Data: Wages by Year
from UI Base Wage File
(Figure: EDD data, years of UI wage coverage for each survey sampling year, 1990-2002)
Earnings in UI-covered employment are
identified for each quarter from mid-1991 through
1999.
21
Census Survey Data
  • CPS: Requests SSN for all persons aged 15+ in the
    household
  • SIPP: Attempts to obtain SSN for all in the
    household
State Administrative Data
  • MEDS: Obtains SSN for (almost) all Medi-Cal
    eligible persons
  • EDD: Obtains SSN and wages from employers of
    UI/DI covered employees
  • CES and LEHD assign Protected Identification
    Keys (PIKs) based on SSN
  • CES provides crosswalk between PIKs and
    publicly available identifiers for CPS and SIPP
  • CES provides anonymized MEDS and EDD Base Wage
    records identified only by PIK
  • Survey records and administrative records are
    merged using the PIK to create a matched file
Final Linked File
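
The linkage flow above can be sketched end to end. In the sketch below a keyed hash stands in for PIK assignment; that is an assumption made for illustration, not a description of how the Census Bureau actually assigns PIKs. All function names, field layouts, the key, and the records (which use invalid placeholder SSNs) are hypothetical:

```python
import hashlib
import hmac

SECRET_KEY = b"hypothetical-linkage-key"  # placeholder; a real key would be protected


def assign_pik(ssn: str, key: bytes = SECRET_KEY) -> str:
    """Derive a stable pseudonymous key from an SSN (stand-in for PIK assignment)."""
    digits = "".join(ch for ch in ssn if ch.isdigit())
    if len(digits) != 9:
        raise ValueError("expected a 9-digit SSN")
    return hmac.new(key, digits.encode("ascii"), hashlib.sha256).hexdigest()[:16]


def anonymize(records, key: bytes = SECRET_KEY):
    """Replace the 'ssn' field of each record with a 'pik' field."""
    result = []
    for record in records:
        clean = {k: v for k, v in record.items() if k != "ssn"}
        clean["pik"] = assign_pik(record["ssn"], key)
        result.append(clean)
    return result


def group_by_pik(records):
    """Index records by their 'pik' value."""
    grouped = {}
    for record in records:
        grouped.setdefault(record["pik"], []).append(record)
    return grouped


def link(survey, meds, wages):
    """Merge anonymized survey, MEDS, and wage records on the shared 'pik'."""
    meds_by_pik, wages_by_pik = group_by_pik(meds), group_by_pik(wages)
    return [
        {**person,
         "meds": meds_by_pik.get(person["pik"], []),
         "wages": wages_by_pik.get(person["pik"], [])}
        for person in survey
    ]


survey = anonymize([{"ssn": "000-00-0001", "nativity": "native"},
                    {"ssn": "000-00-0002", "nativity": "non-citizen"}])
meds = anonymize([{"ssn": "000000001", "month": "1995-01", "eligible": True}])
wages = anonymize([{"ssn": "000-00-0002", "quarter": "1995Q1", "earnings": 3000}])
linked = link(survey, meds, wages)
print(linked)
```

The point of the design is that the analyst only ever sees the derived key: the SSN is dropped before the three sources are joined.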
22
(Figure: coverage of EDD data and MEDS data by year)
Matched Survey, MEDS, and UI Earnings data cover
pre- and post-Welfare Reform periods, and weak
and strong economies.
23
Basic Problems
  • Highly confidential data
  • Requires Census Research Data Center
  • Non-public data available to researchers
    on a strictly controlled basis
  • State Data Must be Matched by Census
  • But these are expensive to run

24
Two Big Findings in California
  • Non-Citizen Elderly Immigrants on Welfare:
    Non-citizen (but legal) immigrants are more likely
    to eventually end up on welfare for the elderly
    (SSI/SSP), especially if they came at an older age
    (probably because of less Social Security based
    work).
  • Non-Citizen Immigrant Women on Welfare:
    Non-citizen (but legal) immigrant women in
    two-parent families are less likely to get off
    welfare (probably because of fewer skills, less
    language competency, perhaps cultural factors).

25
Percent of Adults on SSI/SSP at Some Point, Among
Adults in Surveys Who Were (or Became) 65 or
Older During 1990-2002
SSI/SSP: Welfare Program for Elderly Poor or
Disabled
(Chart categories: Entire Population; Native; Naturalized; Non-Citizen; Non-Citizens and Naturalized)
26
Aid and Employment in Years after Sampling: Women
Initially in 2-Parent AFDC/TANF Cases (total
percentage declines over time as we lose track of
people)
(Chart panels: Native Women; Non-Citizen Women. Series: Working; Welfare and Work; Still on Welfare. Horizontal axis: Years Since Initial Sampling)
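
The parenthetical in the slide title above follows from the choice of denominator: if shares are computed against the initial cohort, people who can no longer be tracked drop out of every category, so the total falls below 100 percent. A minimal sketch with invented counts. The function name is hypothetical:

```python
from collections import Counter


def shares_of_initial_cohort(observed_statuses, cohort_size):
    """Percent of the initial cohort in each status; untracked people count in no status."""
    if cohort_size <= 0:
        raise ValueError("cohort_size must be positive")
    if len(observed_statuses) > cohort_size:
        raise ValueError("cannot observe more people than the initial cohort")
    counts = Counter(observed_statuses)
    return {status: 100.0 * n / cohort_size for status, n in counts.items()}


# Invented example: 10 women sampled, 8 still tracked three years later.
year_3 = ["Working"] * 4 + ["Welfare and Work"] * 2 + ["Still on Welfare"] * 2
shares = shares_of_initial_cohort(year_3, cohort_size=10)
print(shares, sum(shares.values()))
```

Dividing by the number still tracked instead would keep the total at 100 percent, but would hide how much of the cohort has been lost.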
27
Thinking about Census and Survey Data
28
Three Dimensions of Data: Quantity and Quality for
Each
  • Number of Variables and Item Quality
  • Number of Cases and Representativeness
  • Length of Time and Panel Integrity
29
Ideal Data Set
  • Variables
  • As many variables as possible
  • High item quality
  • Cases
  • As many cases as possible
  • Highly representative (e.g., random sample)
  • Time
  • As long a period of time as possible
  • Continuous observation, no panel mortality

30
Surveys: Rich in Variables, For Short Time
Periods, Not Many Cases
(Diagram axes: Variables; Cases; Time. Region shown: Most Survey Data)
31
Administrative Data: Weak in Variables, Rich in
Cases, Rich in Time if Linked Over Time
(Diagram axes: Variables; Cases; Time. Region shown: Linked Administrative Data)
32
Problems with Surveys and Censuses
  • Designing/Implementing Good Sample Frames
  • Telephone: cell phones, no phones, etc.
  • Internet: choosing random sample,
    self-selection
  • Responses Hard to Get
  • Interview Response Rates Declining
  • Item non-responses problematic (e.g., income,
    race)
  • Costs High: In-person and Telephone Expensive
  • In-person: about $500 to $1,500/interview
  • Telephone: about $50 to $150/interview
  • Internet: about $5 to $50/interview
  • Confidentiality Concerns with Collected Data
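
The per-interview figures above scale linearly with sample size. A quick sketch for comparing modes, assuming the figures are in U.S. dollars and ignoring fixed costs. The function name and the 1,000-interview example are illustrative:

```python
# Low and high cost per completed interview, taken from the slide (assumed USD).
COST_PER_INTERVIEW = {
    "in-person": (500, 1500),
    "telephone": (50, 150),
    "internet": (5, 50),
}


def fieldwork_cost_range(mode: str, interviews: int):
    """Return (low, high) total interviewing cost for a given mode and sample size."""
    if interviews < 0:
        raise ValueError("interviews must be non-negative")
    low, high = COST_PER_INTERVIEW[mode]
    return low * interviews, high * interviews


for mode in COST_PER_INTERVIEW:
    print(mode, fieldwork_cost_range(mode, 1000))
```

At 1,000 interviews the in-person range is ten times the telephone range, which is most of the appeal of Internet interviewing.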

33
Internet Surveys as Solution?
  • Virtues: Inexpensive way to collect data, but it
    requires e-mail addresses, hence hard to get
    random samples
  • Three Methods
  • Self-selected samples
  • Starting with a random sample and giving them
    computers
  • Very expensive initially
  • Hard to maintain random sample because of panel
    mortality
  • Matching Method
  • File of e-mail addresses: Collects large numbers
    of e-mail addresses and personal information from
    those willing to be interviewed on the web.
  • File enumerating Americans: Chooses random
    samples from a file (like a phone book)
    constructed by a commercial firm, which contains a
    nearly universal list of Americans and some
    demographic and SES information on each one of
    them.
  • Matched Sample: Interviews the nearest match in
    its e-mail address file to those in its random
    samples.
  • Is this Representative Enough? Still not sure
    but
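
The matching method above amounts to a nearest-neighbor search: for each person drawn at random from the enumeration file, pick the closest available respondent in the e-mail file. A minimal sketch, assuming a simple distance over age, sex, and education and greedy matching without replacement. The distance function, field names, and records are all invented for illustration and are not any firm's actual procedure:

```python
def distance(target, candidate):
    """Crude dissimilarity: age gap in decades plus one point per categorical mismatch."""
    return (abs(target["age"] - candidate["age"]) / 10
            + (target["sex"] != candidate["sex"])
            + (target["education"] != candidate["education"]))


def matched_sample(targets, panel):
    """For each randomly sampled target, take the nearest unused panel member."""
    available = list(panel)
    chosen = []
    for target in targets:
        if not available:
            break  # panel exhausted; remaining targets go unmatched
        best = min(available, key=lambda member: distance(target, member))
        chosen.append(best)
        available.remove(best)
    return chosen


# Invented records: targets come from the enumeration file, panel from the e-mail file.
targets = [{"age": 30, "sex": "F", "education": "college"},
           {"age": 60, "sex": "M", "education": "hs"}]
panel = [{"id": 1, "age": 62, "sex": "M", "education": "hs"},
         {"id": 2, "age": 29, "sex": "F", "education": "college"},
         {"id": 3, "age": 45, "sex": "F", "education": "hs"}]

print([member["id"] for member in matched_sample(targets, panel)])
```

Whether the result is representative depends on how well the matching variables capture the differences between people who volunteer for web panels and people who do not.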

34
Administrative Data as Solution?
  • Virtues: Inexpensive way to collect data, but it
    requires linking of data over time and across
    various datasets using fallible identifiers
  • Problems
  • Mixed quality data: Excellent for data related
    to administrative purpose, often poor for all
    other
  • Confidentiality concerns and problems
  • Incomplete coverage
  • Change in computer systems over time
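
Linking with fallible identifiers usually means tolerating small discrepancies between records. A minimal sketch using Python's standard `difflib` to compare names, assuming records carry a name and a date of birth. The 0.85 threshold, the field names, and the records are arbitrary illustrations:

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so formatting differences do not matter."""
    return " ".join(name.lower().split())


def name_similarity(a: str, b: str) -> float:
    """Similarity ratio between 0 and 1 for two names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def probable_match(rec_a, rec_b, threshold: float = 0.85) -> bool:
    """Require an exact date of birth and a sufficiently similar name."""
    return (rec_a["dob"] == rec_b["dob"]
            and name_similarity(rec_a["name"], rec_b["name"]) >= threshold)


# Invented records for illustration.
a = {"name": "Jon  Smith", "dob": "1950-02-01"}
b = {"name": "john smith", "dob": "1950-02-01"}
c = {"name": "John Smith", "dob": "1951-02-01"}

print(probable_match(a, b), probable_match(a, c))
```

A looser threshold links more true pairs but also more false ones, which is one reason linked administrative data need careful quality checks.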

35
Linked Social Services Data in American
States--1999
36
Conclusions
  • Exciting New Possibilities
  • Internet Interviewing
  • Administrative Data
  • With Some Real Problems
  • Representativeness
  • Confidentiality
  • Linking