Introduction to CSSCR Archive and Campus Data

Slides: 49
Provided by: ning
1
Introduction to CSSCR Archive and Campus Data
2
Topics
  • Major Sources of CSSCR Data Archive
  • Finding Data Sets at CSSCR
  • Other Data Resources at CSSCR
  • Introduction to Decennial Censuses and American
    Community Survey

3
CSSCR Archive
  • The Center for Social Science Computation and
    Research (CSSCR) maintains a large electronic
    data archive related to social science research.
  • Data sets are available through a web viewer, a
    network server or CDROM.

4
Major Sources of CSSCR Data Archive
  • Inter-University Consortium for Political and
    Social Research (ICPSR)
  • US Census Bureau
  • Bureau of Labor Statistics
  • Washington State Data Center
  • IASSIST (International Association for Social
    Science Information Services and Technology)

5
Major Sources of CSSCR Data Archive
  • Inter-University Consortium for Political and
    Social Research (ICPSR) http://www.icpsr.umich.edu
  • Membership-based organization founded in 1962.
  • Provides access to the world's largest archive
    of computerized social science data.
  • Offers training facilities for the study of
    quantitative social analysis techniques (e.g. the
    ICPSR Summer Program in Quantitative Methods of
    Social Research).

6
Major Sources of CSSCR Data Archive
  • US Census Bureau http://www.census.gov
  • 1990, 2000 Decennial Census of Population and
    Housing
  • Summary Tape File/Summary File (STF/SF)
  • Public Use Microdata Sample (PUMS)
  • American Community Survey (ACS)

7
Major Sources of CSSCR Data Archive
  • Bureau of Labor Statistics www.bls.gov/nls
  • National Longitudinal Survey of Youth 1979 and 1997
    Public-use File (CDs are available at CSSCR, or
    freely downloadable from the BLS website)
  • National Longitudinal Survey of Youth 1979 and 1997
    Geocode data (confidential data)
  • Provides geographic variables for the data file
  • To protect the confidentiality of respondents,
    an agreement letter must be signed with BLS.

8
Major Sources of CSSCR Data Archive
  • Sources of Economic Data
  • Economagic Economic Time Series Page
  • http://www.economagic.com/
  • Provides web browsing of U.S. business,
    economic and trade information
  • DRI_WEFA Basic Economics Database
  • Datastream

9
DRI_WEFA Basic Economics Database
  • A macroeconomic database that contains about
    7000 monthly, quarterly and annual time series,
    dating back to 1946 where available and ending
    with the latest available observations.
  • Includes financial data, construction and housing
    data, industrial statistics, population counts
    and estimates, foreign trade and interest rates.
  • Accessible through EViews in the CSSCR lab. A
    reference book is available in Room 601E.

10
Datastream Database
  • Provides access to various global economic and
    financial databases (e.g. national government
    and OECD series, International Monetary Fund
    series, equities, bond indices, interest and
    exchange rates, company account definitions, etc.).
  • At CSSCR, Datastream is available only through
    the Archivist in Room 601E.

11
Major Sources of CSSCR Data Archive
  • Washington State Data Center
  • http://www.ofm.wa.gov
  • WA State Vital Statistics
  • WA State Population Projections
  • WA State Population Surveys
  • Pregnancy and Abortion Data

12
Other Sources of CSSCR Data Archive
  • Data Access via DataFerrett
    http://dataferrett.census.gov
  • Current Population Survey www.beta.ipums.org/cps/
  • Survey of Income and Program Participation
    www.cdc.gov/nchs/hus.htm
  • National Health Interview Survey
  • National Hospital Ambulatory Medical Care Survey
  • iPOLL databank at The Roper Center for Public
    Opinion Research is available through the UW library
  • http://roperweb.ropercenter.uconn.edu/cgi-bin/hsrun.exe/Roperweb/iPOLL/iPOLL.htxstartHS_iPOLL_LoginSetup

13
Finding Data Sets at CSSCR
  • Web Site
  • CDROM Log
  • Codebook
  • All these materials are available in Condon
    611 or on the CSSCR web site

14
Finding Data Sets through CSSCR web viewers
  • A complete list of data sets at CSSCR is
    available on the CSSCR Web page.
  • Most online data sets at CSSCR can be accessed
    through a web browser.
  • The CSSCR archive website address is
  • http://julius.csscr.washington.edu

15
Finding Data Sets through CSSCR web viewers
  • The data sets on the CSSCR homepage are divided
    into several categories
  • ICPSR data
  • CDROM data
  • Census 2000
  • ACS
  • Census 2010
  • Clicking on one of these five icons will bring
    you to the ICPSR Resource, the CDROM list, or
    the Census 2000 or ACS Washington data

16
Finding Data Sets through CSSCR web viewers
  • In the ICPSR Resource, click on Archive Browser,
    which lets you search the data to get the files
    you want. Under each title, information such as
    data source, codename, abstract and storage
    medium is displayed.

17
Types of File
  • Codebooks and Documentation
  • Dataset codebook: <file name>.cod
  • Data dictionary: <file name>.dic or <file
    name>.doc
  • File description: <file name>.des
  • Frequency listing: <file name>.fre
  • Dataset errata: <file name>.err

18
Types of File
  • Data Files
  • ASCII file: <filename>.dat
  • SPSS system file: <filename>.sav or <filename>.svf
  • SPSS portable file: <filename>.por or
    <filename>.exp
  • SPSS data definition statements: <filename>.spss
  • SAS data file: <filename>.sas7bdat
  • SAS catalog file: <filename>.sas7bcat
  • SAS transport file: <filename>.xpt
  • SAS data definition statements: <filename>.sas
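
As a rough illustration of the naming conventions on the two "Types of File" slides, the following Python sketch looks up a file's type from its extension. The extension-to-description mapping is taken from the lists above; the function name `describe_archive_file` and the sample file names are hypothetical and are not part of any CSSCR tool.

```python
from pathlib import Path

# Extension-to-description mapping taken from the "Types of File" slides.
FILE_TYPES = {
    ".cod": "dataset codebook",
    ".dic": "data dictionary",
    ".doc": "data dictionary",
    ".des": "file description",
    ".fre": "frequency listing",
    ".err": "dataset errata",
    ".dat": "ASCII data file",
    ".sav": "SPSS system file",
    ".svf": "SPSS system file",
    ".por": "SPSS portable file",
    ".exp": "SPSS portable file",
    ".spss": "SPSS data definition statements",
    ".sas7bdat": "SAS data file",
    ".sas7bcat": "SAS catalog file",
    ".xpt": "SAS transport file",
    ".sas": "SAS data definition statements",
}


def describe_archive_file(filename: str) -> str:
    """Return a description of the file type based on its extension."""
    suffix = Path(filename).suffix.lower()  # e.g. ".POR" becomes ".por"
    return FILE_TYPES.get(suffix, "unknown file type")


if __name__ == "__main__":
    # Hypothetical file names, used only to exercise the lookup.
    for name in ("study01.cod", "study01.dat", "study01.sas7bdat"):
        print(name, "->", describe_archive_file(name))
```

A lookup like this can help decide which statistical package to open a downloaded file with before reading its codebook.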

19
Seattle Data Viewer
  • A neighborhood information system.
  • Provides access to a comprehensive set of
    information about the city infrastructure and
    environment.
  • Allows users to organize and print data and maps
    of the city.
  • Accessible in the CSSCR lab through
  • P:\Data\Seattle_Data_viewer.

20
Seattle Data Viewer
  • Neighborhood statistics are grouped into the
    following units
  • Base map
  • Crimes and public safety
  • Housing, health, education and civic
    locations
  • Land use, value and zoning
  • Landscape and environmental features
  • Municipal and district boundaries
  • Park, recreation and open space
  • Population and demographics
  • Streets and transportation
  • Utilities

21
Introduction to Decennial Censuses
  • Decennial Census of Population and Housing
  • Summary Tape File/Summary File (STF/SF)
  • Public Use Microdata Sample (PUMS)

22
Introduction to Decennial Censuses
  • What is a Summary Tape File/Summary File (STF/SF)?
  • The basic unit of analysis is a specific
    geographic area.
  • Contains counts of persons or housing units in
    particular categories.
  • Also called tabulated summary statistics.

23
Example of STF/SF
                                           Black or African  American Indian and
Geography   Total population  White alone  American alone    Alaska Native alone  Asian alone
Alabama     4442558           3153627      1144330           23283                38444
Alaska      641724            443874       22103             91013                28838
Arizona     5829839           4440804      180769            275321               129197
Arkansas    2701431           2135069      414260            18481                25249
California  35278768          21491336     2163530           253774               4365548
Colorado    4562244           3809054      165729            40063                117506
Washington  6146338           4988017      202286            88363                405030
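
Because each row of a summary file is a count for one geographic area, typical analysis consists of arithmetic across columns. A minimal Python sketch, assuming the counts above have been typed into a dictionary (the variable and function names are hypothetical), computes each group's share of a state's total population:

```python
# Counts taken from the example table above (two states shown for brevity).
summary = {
    "California": {
        "total": 35278768,
        "white_alone": 21491336,
        "black_alone": 2163530,
        "aian_alone": 253774,
        "asian_alone": 4365548,
    },
    "Washington": {
        "total": 6146338,
        "white_alone": 4988017,
        "black_alone": 202286,
        "aian_alone": 88363,
        "asian_alone": 405030,
    },
}


def group_shares(row: dict) -> dict:
    """Return each group's share of the area's total population."""
    total = row["total"]
    return {
        group: count / total
        for group, count in row.items()
        if group != "total"
    }


if __name__ == "__main__":
    for state, row in summary.items():
        shares = group_shares(row)
        formatted = ", ".join(f"{g}: {s:.1%}" for g, s in shares.items())
        print(f"{state}: {formatted}")
```

The shares do not sum to 100 percent because the example table lists only four of the race categories.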
24
Introduction to Decennial Censuses
  • The Types of STF/SF
  • STF/SF 1 and 2 present tabulated data from the
    Census short-form (100%) questionnaire.
  • STF/SF 3 and 4 present cross-tabulations of
    information from the long-form (sample)
    questionnaire.
  • Tables in STF/SF 2 and 4 are iterated for many
    detailed racial groups, as well as American
    Indian and Alaska Native tribes. In SF4, many
    data are also tabulated by detailed ancestry
    groups.

25
Introduction to Decennial Censuses
  • 2000 Census short-form questionnaire
  • full population
  • six questions
  • Household relationship
  • Sex
  • Age
  • Hispanic or Latino origin
  • Race
  • Tenure (whether the home is owned or rented)

26
Introduction to Decennial Censuses
  • 2000 Census long-form questionnaire
  • a sample that includes 15.8-17% of the full population
  • divided into two parts
  • Population
  • social and economic characteristics (14 areas)
  • Housing
  • physical and financial characteristics (11 areas)

27
Introduction to Decennial Censuses
  • In 1980 and 1990 census data
  • The letters A, B, C and D indicate different levels
    of geographic area
  • A - block groups; B - blocks, zip codes
  • C - place, county; D - Congressional district
  • In 2000 census data
  • P - person; H - housing unit
  • PCT - households
  • HCT - occupied housing unit

28
Introduction to Decennial Censuses
  • What is a Public Use Microdata Sample (PUMS)?
  • The basic unit of analysis is a housing unit or
    the persons who live in it, with identifiers (such
    as addresses, names, etc.) removed to protect
    individual confidentiality.
  • It is a stratified sample of the population,
    created by subsampling the full census
    sample that received census long-form
    questionnaires

29
Example of PUMS
Person ID  Age  Gender  Education Level
00001      34   F       College
00002      21   M       High School
00003      14   M       Middle School
00004      67   F       High School
00005      54   F       High School
00006      26   M       College
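
Microdata differ from summary files in that the user builds the tabulations. A minimal Python sketch, using the six example records above (the field names and the `tabulate` helper are hypothetical), shows how person-level records can be counted into a summary-style table:

```python
from collections import Counter

# Person-level records taken from the example table above.
records = [
    {"id": "00001", "age": 34, "gender": "F", "education": "College"},
    {"id": "00002", "age": 21, "gender": "M", "education": "High School"},
    {"id": "00003", "age": 14, "gender": "M", "education": "Middle School"},
    {"id": "00004", "age": 67, "gender": "F", "education": "High School"},
    {"id": "00005", "age": 54, "gender": "F", "education": "High School"},
    {"id": "00006", "age": 26, "gender": "M", "education": "College"},
]


def tabulate(rows, field):
    """Count how many records fall in each category of one field."""
    return Counter(row[field] for row in rows)


if __name__ == "__main__":
    # One-way tabulation over all persons.
    print(tabulate(records, "education"))
    # A custom tabulation that a pre-built summary table might not offer:
    # education level among persons aged 18 or older.
    adults = [row for row in records if row["age"] >= 18]
    print(tabulate(adults, "education"))
```

A real PUMS file also carries sampling weights, which would have to be summed instead of counting records; this sketch ignores weights.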
30
Introduction to Decennial Censuses
  • The Types of PUMS
  • 5-percent sample file (PUMS-A file)
  • 1-percent sample file (PUMS-B file)

31
Introduction to Decennial Censuses
  • 5-percent sample file (PUMS-A file)
  • provides the user records for over 14 million
    people and over 5 million housing units
  • Each PUMA (Public Use Microdata Area) must meet
    a minimum population threshold of 100,000 (the
    PUMA minimum)
  • This sample has been produced only since 1980

32
Introduction to Decennial Censuses
  • 1-percent sample file (PUMS-B file)
  • Provides a fuller range of detailed
    characteristics
  • Provides the user records for over 2.8 million
    people and over 1 million housing units
  • Each super-PUMA meets a minimum population of
    400,000 and is composed of one or more PUMAs
    delineated on the 5-percent PUMS files
  • Samples are available from the 1960 through
    current censuses
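
The population thresholds on the two preceding slides can be expressed as a simple check. The Python sketch below is only an illustration: the thresholds (100,000 for a PUMA, 400,000 for a super-PUMA) come from the slides, while the function name and the PUMA populations are hypothetical.

```python
# Minimum populations stated on the slides.
PUMA_MIN_POPULATION = 100_000        # areas on the 5-percent file
SUPER_PUMA_MIN_POPULATION = 400_000  # areas on the 1-percent file


def can_form_super_puma(puma_populations):
    """Check whether a group of PUMAs could be combined into a super-PUMA.

    Every component must itself meet the PUMA minimum, and the combined
    population must meet the super-PUMA minimum.
    """
    if not puma_populations:
        return False
    all_valid_pumas = all(p >= PUMA_MIN_POPULATION for p in puma_populations)
    combined = sum(puma_populations)
    return all_valid_pumas and combined >= SUPER_PUMA_MIN_POPULATION


if __name__ == "__main__":
    # Hypothetical PUMA populations, for illustration only.
    print(can_form_super_puma([150_000, 130_000, 125_000]))  # combined 405,000
    print(can_form_super_puma([150_000, 130_000]))           # combined 280,000
```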

33
Introduction to Decennial Censuses
  • Integrated Public Use Microdata Series (IPUMS)
    http://www.ipums.umn.edu/
  • Consists of thirty-eight high-precision samples
    of the American population drawn from fifteen
    federal censuses (1850-2000) and from the
    American Community Surveys of 2000-2006
  • Is particularly useful for historical research
    because the data are comparable across time

34
What is the American Community Survey (ACS)?
  • Is a large, continuous demographic survey
  • Produces annual and multi-year estimates of the
    characteristics of the population and housing
  • Will replace the 2010 census long form by
    collecting detailed information throughout the
    decade
  • The short form will remain in the 2010 decennial
    census

35
ACS Program Schedule
  • Testing and development 1994-2004
  • Full implementation began in 2005
  • Group Quarters data collection began in 2006

36
Full Implementation
  • Annual national sample of approximately 3 million
    addresses in every county and American Indian and
    Alaska Native area in the United States
  • Provide profiles every year for communities of
    65,000 or more
  • Provide 3-year estimates for communities of
    20,000 or more population
  • Provide 5-year cumulations for all communities,
    the lowest geographic level could be block group

37
ACS Data Release Schedule
Before the 2004 ACS, the population threshold was
250,000
38
ACS file types
  • ACS Summary File (ACS SF)
  • Public Use Microdata Sample (5% PUMS)

39
Comparing ACS with the Decennial Census long form
questionnaires
  • Sample rate/size and design
  • Data collection
  • Residence rules and reference periods

40
Sample Rate/Size and Design Comparison
  • Census sample estimates are based on about 18
    million housing units; ACS 5-year estimates are
    based on about 11 million housing units, and
    1-year estimates on about 3 million housing units
  • ACS samples every year and spreads the sample over
    12 months; the census samples once a decade and
    uses the entire sample at the same time

41
Data Collection Comparison
  • ACS nonresponse follow-up uses computer-assisted
    telephone and computer-assisted personal
    interviews; past censuses have used only paper
    questionnaires
  • ACS data are collected only from household members;
    census data are often collected from neighbors

42
Residence Rules Comparison
  • ACS uses a two-month rule
  • A person is a resident of an address if he or she
  • Lives there year round
  • Lives there more than 2 months but not year round
  • Is living there now with no other place to live
  • Is away now for 2 months or less
  • A person is not a resident of an address if he or she
  • Lives there 2 months or less and has another
    residence
  • Is away now for more than 2 months
  • The decennial census is based on the concept of
    usual residence
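
The two-month rule listed above can be read as a small decision procedure. The Python sketch below is one possible encoding, not the Census Bureau's own logic: the function name, its parameters and the order in which the conditions are checked are assumptions made for illustration.

```python
def is_acs_resident(stay_months: float,
                    has_other_residence: bool,
                    away_months: float = 0.0) -> bool:
    """Apply the ACS two-month rule as summarized on the slide.

    stay_months: length of the person's stay at the address
        (use 12 for "year round").
    has_other_residence: whether the person has another place to live.
    away_months: how long the person has currently been away.
    """
    # Away now for more than 2 months: not a resident.
    if away_months > 2:
        return False
    # Living there with no other place to live: resident.
    if not has_other_residence:
        return True
    # Has another residence: resident only if the stay exceeds 2 months
    # (a year-round stay also satisfies this).
    return stay_months > 2


if __name__ == "__main__":
    print(is_acs_resident(12, has_other_residence=False))  # year round
    print(is_acs_resident(1, has_other_residence=True))    # short stay
```

Under the decennial census concept of usual residence, the outcome would depend on where the person lives most of the time and not on a fixed two-month cutoff.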

43
Reference Periods Comparison
  • ACS uses the interview date as the single
    reference point, or as the end of a reference
    period, for all data collection
  • Examples
  • Income
  • ACS asks for income for the previous 12 months
  • Decennial census income data refer to the
    previous calendar year (census day is April 1)
  • School enrollment
  • ACS asks if a person attended school during the
    last three months
  • Census 2000 asks if a person attended school any
    time since April 1
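
To make the income example concrete, the following Python sketch computes the two reference periods. It assumes that "the previous 12 months" means the 12 months ending on the interview date; the function names and the sample dates are hypothetical.

```python
from datetime import date
from typing import Tuple


def acs_income_period(interview: date) -> Tuple[date, date]:
    """Return the 12-month period ending on the ACS interview date."""
    try:
        start = interview.replace(year=interview.year - 1)
    except ValueError:
        # Interview on Feb 29: the previous year has no Feb 29.
        start = interview.replace(year=interview.year - 1, day=28)
    return start, interview


def census_income_period(census_year: int) -> Tuple[date, date]:
    """Return the calendar year preceding the census year."""
    return date(census_year - 1, 1, 1), date(census_year - 1, 12, 31)


if __name__ == "__main__":
    # Two hypothetical ACS interviews in the same survey year cover
    # different income periods, unlike the fixed census period.
    print(acs_income_period(date(2006, 2, 10)))
    print(acs_income_period(date(2006, 11, 20)))
    print(census_income_period(2000))
```

Because ACS interviews are spread over the year, the income periods of different respondents overlap only partly, which is one reason ACS and census income figures are not directly comparable.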

44
Comparison Conclusion
  • ACS estimates have higher sampling error than
    census long-form estimates; however, the error is
    shown as 90% confidence limits or margins of error
    in every table
  • ACS has a higher level of overall response and
    individual item response, so there is less chance
    of nonresponse bias, which means lower potential
    nonsampling error
  • ACS is a better way to collect this wide-ranging
    information than was the decennial census because
    the distribution of the data over the collection
    time frame is more meaningful
  • Comparing ACS Data to Other Sources
  • http://www.census.gov/acs/www/UseData/compACS.htm
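
A published 90% margin of error can be turned into a confidence interval or a standard error with simple arithmetic. The Python sketch below assumes the usual normal approximation, in which a 90% margin of error equals 1.645 standard errors; the estimate and margin used in the example are hypothetical and not taken from any ACS table.

```python
# Standard normal critical value for a two-sided 90% interval.
Z_90 = 1.645


def confidence_interval(estimate: float, moe_90: float):
    """Return the lower and upper 90% confidence limits."""
    return estimate - moe_90, estimate + moe_90


def standard_error(moe_90: float) -> float:
    """Convert a 90% margin of error into a standard error."""
    return moe_90 / Z_90


if __name__ == "__main__":
    # Hypothetical estimate and margin of error, for illustration only.
    estimate, moe = 50_000, 1_645
    low, high = confidence_interval(estimate, moe)
    print(f"90% interval: {low} to {high}")
    print(f"standard error: {standard_error(moe):.1f}")
```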

45
Available ACS data
  • 2005 single-year ACS provides household
    population only for areas with populations of
    65,000 or more
  • 2006 single-year ACS provides household
    population and group quarters population for
    areas with populations of 65,000 or more
  • 2007 single-year ACS provides household
    population and group quarters population for
    areas with populations of 65,000 or more by the
    end of September
  • 2007 three-year ACS provides household population
    and/or group quarters population for areas with
    populations of 20,000 or more by the end of
    December
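
The population thresholds above determine which ACS products exist for a given area. A minimal Python sketch, assuming the thresholds stated in the slides (65,000 for single-year data, 20,000 for three-year data, and five-year data for all communities); the function name and the sample populations are hypothetical:

```python
# Population thresholds as stated in the slides.
ONE_YEAR_MIN = 65_000
THREE_YEAR_MIN = 20_000


def available_acs_estimates(population: int) -> list:
    """List the ACS estimate types published for an area of this size."""
    estimates = []
    if population >= ONE_YEAR_MIN:
        estimates.append("1-year")
    if population >= THREE_YEAR_MIN:
        estimates.append("3-year")
    # Five-year estimates are described as covering all communities.
    estimates.append("5-year")
    return estimates


if __name__ == "__main__":
    # Hypothetical area populations, for illustration only.
    for pop in (500_000, 30_000, 5_000):
        print(pop, available_acs_estimates(pop))
```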

46
Available Census Data at CSSCR
  • 1980 census data
  • STF1, STF3 (raw data)
  • 1990 census data
  • STF1, STF2, STF3, STF4, 1% PUMS, 5% PUMS
  • 2000 census data
  • SF1, SF2, SF3, SF4, 1% PUMS, 5% PUMS
  • 2005 ACS
  • ACS SF, 5% PUMS
  • 2006 ACS
  • ACS SF, 5% PUMS

47
Census CDs (GeoLytics)
  • CensusCDMap (run ocensus3.bat to access)
  • US 1990 Census, 1990 Estimates, 2004
    projections, Consumer Expenditures, Time series
    and Maps.
  • CensusCD Blocks (run occdblock.bat to access)
  • Demographic data and boundaries for 7
    million blocks from STF 1B and PL94-171 files.
  • CensusCD 1980 (run occd1980.bat to access)
  • US 1980 Census data from STF 1 and STF 3.
  • StreetCD98 (run otiger.bat to access)
  • Over 100 layers of map data from TIGER 98.
  • All are available in the lab

48
Census CDs (GeoLytics)
  • Census CD 1970
  • Census CD 1980 Long Form in 2000 Areas
  • Census CD 1980
  • Census CD 1990 blocks
  • Census CD 1990 Long Form
  • Census CD 1990-2000
  • Census CD 2000 blocks
  • Census 2000 Redistricting
  • Census CD SF1 Blocks
  • NCDB Neighborhood Change Database
  • Available in Room 601C