Statistical confidentiality and privacy. 2. Case study: IPUMS-International (www.ipums.org/international). Robert McCaa, Minnesota Population Center, rmccaa@umn.edu

1
Statistical confidentiality and privacy. 2. Case study: IPUMS-International
www.ipums.org/international
Robert McCaa, Minnesota Population Center, rmccaa@umn.edu
"Inadequate use of microdata has high costs" --Len Cook (2003, Registrar General, ONS)
2
MPC: largest provider of integrated microdata to trusted, non-commercial researchers
International (census)
History (19th c.)
USA (census)
GIS
Employment
Health
Time-Use
3
IPUMS-Global (first 10 years)
dark green: integrated and disseminating (44 countries, 130 censuses, 279 million person records)
green: to be integrated (35 countries, 90 censuses, 150 million)
See Inventory handout
Inventory: IPUMS confidentiality protocols used
Mollweide projection
4
Outline: IPUMS statistical confidentiality methods
  • IPUMS: a restricted-access, web-based microdata dissemination system
  • IPUMS: the trusted user/institution approach
  • A. Legal Disclosure Controls
  • B. Administrative Disclosure Controls
  • C. Technical Disclosure Controls
  • Example: Saint Lucia, 1991
  • IPUMS Assessments (2007)
  • UN-ECE Case Study
  • Trewin on-site evaluation

5
1. IPUMS-International: Goals
  • Inventory census microdata and documentation, world-wide
  • Recover and preserve at-risk microdata
  • Integrate census microdata and documentation
  • Disseminate, without cost, extracts of samples to bona fide researchers worldwide, regardless of country of birth, citizenship or residence.
  • Sustained funding 1999-2015: 6 grants of 5 years' duration
  • National Science Foundation (USA): 3 successive grants
  • National Institutes of Health (USA): Latin America, Europe, Eur-Asia

6
IPUMS-International: a restricted-access, web-based microdata extraction system
  • Researcher licensed to access microdata: 1/3 rejected
  • NO public access, source files, or complete datasets
  • Licensed researcher selects:
  • Countries,
  • Censuses,
  • Cases/sub-populations,
  • Variables, and sample densities
  • Extract engine queues request, generates extract
  • Password protected to make and retrieve extracts
  • Researcher retrieves extract via web with SSL 128-bit encryption and analyzes using own wares (soft/hard/wet)
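
The request flow on this slide (a licensed researcher selects countries, censuses, variables and a sample density; the engine queues the request) can be illustrated with a minimal Python sketch. All names here (`ExtractRequest`, `ExtractQueue`, the field names) are hypothetical and chosen for illustration; the slides do not describe the actual IPUMS extract engine.

```python
from collections import deque
from dataclasses import dataclass
from typing import List


@dataclass
class ExtractRequest:
    """One researcher's selection (hypothetical structure)."""
    user: str
    countries: List[str]
    censuses: List[int]
    variables: List[str]
    sample_density: float = 1.0  # fraction of the available sample


class ExtractQueue:
    """Accepts requests from licensed users only and serves them in order."""

    def __init__(self, licensed_users):
        self.licensed_users = set(licensed_users)
        self.pending = deque()

    def submit(self, request):
        # Unlicensed users never reach the queue.
        if request.user not in self.licensed_users:
            raise PermissionError("user is not licensed")
        if not 0 < request.sample_density <= 1:
            raise ValueError("sample_density must be in (0, 1]")
        self.pending.append(request)
        return len(self.pending)  # position in the queue

    def next_request(self):
        # First in, first out; None when nothing is waiting.
        return self.pending.popleft() if self.pending else None
```

The point of the sketch is only that the license check happens before a request is queued, so no extract is ever generated for an unlicensed user.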

7
6 steps using www.ipums.org/international
See 10 tips handout
8
IPUMS-International: world's largest disseminator of integrated microdata to trusted, non-commercial researchers
  • 1999: Founded by Steven Ruggles and Bob McCaa
  • restrict access to trusted users, and apply corresponding confidentiality techniques
  • 2002: 1st release of integrated samples for 7 countries; >200 users in first year
  • Big success! 80 countries signed; 70 entrusted microdata to IPUMS: datasets for more than 250 censuses, >180 entire datasets
  • 2006

9
IPUMS-International: world's largest disseminator of integrated microdata to trusted, non-commercial researchers
  • 1999: Founded
  • 2006: 3rd release
  • data for 20 countries, samples for 63 censuses,
  • 185 million person records,
  • >1,000 users
  • 2010: 7th release
  • data for 50 countries, samples for 160 censuses
  • 300 million person records
  • >4,000 users
  • Note: data extracts are provided only to licensed users.

10
2. IPUMS-International: the trusted-user/institution approach to disseminating integrated, anonymized microdata extracts
Disclosure Controls
A. Legal: Memorandum with NSI
B. Administrative: License with researchers
C. Technical: Sample, data modifications
11
3 kinds of confidentiality protections
  • Legal: dissemination agreement between University of Minnesota and each National Statistical Institute
  • Uniform 11-point Memorandum of Understanding regarding ownership, use, authorization, restrictions, confidentiality, security, publication, violations, sharing, arbitration, and order of precedence
  • Administrative: conditional-use license between the University of Minnesota and each researcher
  • Permission to use restricted-access microdata, 3 criteria: research need, research competence, and agreement to abide by the conditions of the use license
  • Technical: data protection measures
  • Specific to each country /

12
A. NSI with U of Minnesota
13
A. NSI with U. of Minnesota
14
3 kinds of confidentiality protections
  • Legal: dissemination agreement between University of Minnesota and each National Statistical Institute
  • Uniform 11-point Memorandum of Understanding regarding ownership, use, authorization, restrictions, confidentiality, security, publication, violations, sharing, arbitration, and order of precedence
  • Administrative: conditional-use license between the University of Minnesota and each researcher
  • Permission to use restricted-access microdata, 3 criteria: research need, research competence, and agreement to abide by the conditions of the use license
  • Technical: data protection measures
  • Specific to each country /

15
LICENSE
IPUMSi
B. License with researchers: restricted-access, web-based system
  • Legally binding license agreement
  • forces a would-be intruder to violate the law, for which they can be fined and/or jailed
  • Researcher's institution sanctioned
  • protects privacy and confidentiality
  • assures proper use
  • Access limited to
  • Bona fide researchers (credentials)
  • with a demonstrated scientific need
  • who agree to abide by license restrictions
  • Confidentiality
  • No redistribution
  • Safely secured
  • Alleging that a person has been identified is prohibited

16
LICENSE
IPUMSi
B. License with researchers: restricted-access, web-based system
  • Legally binding license agreement
  • forces would-be snoopers to violate the law
  • protects privacy and confidentiality
  • assures proper use
  • Access limited to
  • Bona fide researchers (credentialed)
  • with demonstrated scientific need
  • who agree to abide by license restrictions
  • Confidentiality
  • No redistribution, no commercial use
  • Data safely secured
  • Alleging that a person can be or has been identified is a violation

19
Apply for Access
24
Must click acceptance of each restriction to gain
access.
25
License is for 1 year, renewable.
End of application
26
C. 9 Technical Disclosure Controls (Thorogood, 1999)
  1. Restrict access to samples
  2. Limit geographical detail
  3. Recode sparse categories
  4. Truncate top and bottom codes
  5. Construct age from birthdate, if necessary
  6. Suppress date of birth, precise place of birth
  7. Migration timing/place not identified in detail
  8. Identify place of residence by major civil division (pop. >20k, 60k, 100k, 250k, 1 million; i.e., national convention)
  9. Suppress any sensitive variable requested by NSI
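
Controls 5 and 6 work together: age is derived from the birthdate, and the birthdate itself is then dropped from the released record. A minimal Python sketch follows. The field names `birthdate` and `age` and the census reference date used in the usage note are assumptions for illustration, not details taken from the slides.

```python
from datetime import date


def age_at_census(birth: date, census_day: date) -> int:
    """Completed years of age on the census reference day."""
    had_birthday = (census_day.month, census_day.day) >= (birth.month, birth.day)
    return census_day.year - birth.year - (0 if had_birthday else 1)


def replace_birthdate_with_age(record: dict, census_day: date) -> dict:
    """Return a copy of the record with 'birthdate' removed and 'age' added."""
    released = {k: v for k, v in record.items() if k != "birthdate"}
    released["age"] = age_at_census(record["birthdate"], census_day)
    return released
```

For example, with an arbitrary reference day of `date(1991, 5, 12)`, a record holding `birthdate=date(1950, 6, 15)` would be released with `age` 40 and no birthdate field.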

27
C. Technical Disclosure Controls. Example: Saint Lucia, 1991 Census
  • Restrict access to samples: 10% (13,405 persons)
  • Limit geographical detail (n<2,000): suppress region, district, town, settlement, enumeration district, school identification; retain urban-rural
  • Recode sparse categories (n<25) -> other.
  • Type of dwelling: suppress townhouse, barracks
  • Land occupation: suppress sharecrop
  • Type of ownership: suppress squatted, leased
  • Type of roof: suppress 5 categories
  • Wall material: suppress 5 categories
  • Water supply: suppress pubwell
  • Type of lighting: suppress gas
  • Ethnic origin: suppress Chinese, Portuguese, Syrian-Lebanese
  • Religion: suppress 6 categories
  • School, work mode of transport: bicycle
  • Type of school: technical institute, university
  • Number of hours worked last week: 5 hour groups. , 70
  • Pay period: suppress quarterly, annually
  • Occupation, industry, training code: reduce from 4 digits to 1
28
C. Technical Disclosure Controls. Example: Saint Lucia, 1991
  • Top-bottom code
  • Number of rooms: 10
  • Number of bedrooms: 7
  • Number of radios: 4
  • Number of tvs: 3
  • Number of videos: 2
  • Number of emigrants in dwelling: 2
  • Age: 81
  • Age at first child: <14
  • Age at first union: <14, 41
  • Age at last child: <14, 45
  • Number of school subjects: <3, >7
  • Income categories: 8
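
Top and bottom coding collapses extreme values into the nearest allowed limit, so that unusually old persons or unusually large dwellings cannot be singled out. A minimal Python sketch follows. How each limit on the slide maps to a stored value is an assumption here: the sketch treats "Age 81" as a cap of 81 and represents "<14" by the value 13.

```python
def top_bottom_code(value, bottom=None, top=None):
    """Clamp value into [bottom, top]; a limit of None means no limit on that side."""
    if bottom is not None and value < bottom:
        return bottom
    if top is not None and value > top:
        return top
    return value


# Hypothetical limits loosely following the slide: (bottom, top) per variable.
LIMITS = {
    "age": (None, 81),
    "rooms": (None, 10),
    "age_first_union": (13, 41),  # 13 stands in for the "<14" category
}


def code_record(record):
    """Apply the limits to every variable that has one; leave the rest unchanged."""
    return {
        name: top_bottom_code(value, *LIMITS[name]) if name in LIMITS else value
        for name, value in record.items()
    }
```

Under these assumptions, a record with age 95, 14 rooms and a first union at age 11 is released as age 81, 10 rooms and 13.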

29
C. Technical Disclosure Controls. Example: Saint Lucia, 1991
  • Suppress
  • date of birth, precise place of birth, type of work wanted
  • Migration timing/place not identified in detail
  • Country last lived: suppress 37 categories
  • Year of immigration: <1948
  • Identify place of residence by major civil division (pop. >20k, 60k, 100k, 250k, 1 million; i.e., national convention)
  • all suppressed
  • Suppress any sensitive variable requested by NSI
  • none (as yet)
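
Variable suppression and the population threshold for place of residence can be sketched in a few lines of Python. The variable names, the placeholder label and the default threshold of 20,000 (the smallest value listed on the slide) are assumptions for illustration; the threshold actually applied follows each country's national convention.

```python
def suppress_variables(record, suppressed):
    """Drop the named variables from a record before release."""
    return {k: v for k, v in record.items() if k not in suppressed}


def mask_small_place(place, population, min_population=20_000, masked="suppressed"):
    """Release a place name only if its population exceeds the threshold."""
    return place if population > min_population else masked
```

A hypothetical division of 12,000 inhabitants would be released as "suppressed", which matches the "all suppressed" outcome reported for Saint Lucia when every civil division falls under the chosen threshold.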

30
3. Assessments
A. Why was IPUMS cited as good practice by the UN-ECE (2007, Annex 23, pp. 98-103)?
http://www.unece.org/stats/documents/tfcm.htm
31
UN-ECE Good practices (see annex 23)
  1. High level of confidence and transparency between
    the researchers (users) and the national
    statistical institutes
  2. The data are anonymized by highly efficient
    technical means
  3. The conditions of use are well defined
  4. Good use is assured by both juridical and
    administrative mechanisms to prevent violations
  5. Sanctions for misuse are clearly spelled out
  6. Sanctions are imposed not only against those who
    misuse the data but also against their
    institutions

32
See Trewin Report handout
B. The Trewin Report
"The security of the computing environment used by IPUMS-International is first class and appears to be of the standard of the best statistical offices."
--Dennis Trewin, former Australian Statistician, past President, International Statistical Institute, chair, UN-ECE Committee on Managing Statistical Confidentiality and Microdata Access (CES 2007)
33
Statistical confidentiality and security: see the on-site review by Dennis Trewin
www.hist.umn.edu/rmccaa/ipums-global (click Trewin Report)
  • An outsider's view from inside IPUMS-International
  • The best practice for an international repository of microdata
  • The security of IPUMS is first class: the standard of the best national statistical offices
  • in full compliance with the principles and recommendations of the ECE

34
IPUMS-International: strengths
  1. Uniform legal authorization with national statistical authorities
  2. Access restricted to academics with need who agree to abide by stringent confidentiality protections. Sanctions against individual and institution: denial of access to all microdata for the entire institution
  3. Strong technical methods of microdata anonymization
  4. Experienced integration teams
  5. Proven web-based access management system
  6. High producer and user satisfaction
  7. Sustainable: MPC, NSF, NIH

35
Join us at the 58th ISI, Dublin, Aug 21-26, 2011
http://www.isi2001.ie
  • IPUMS Workshop, Aug 19-20.
  • Microdata sessions.
  • IPUMS Funding for delegates from developing
    countries.
  • IPUMS booth
  • Participate in ISI sessions.
  • Network with stat offices, international
    agencies, etc.

36
Thank you!
More: www.hist.umn.edu/rmccaa/ipums-global
see Durban workshop (2009): Microdata recovery, Jamaica report
Lisbon workshop (2007): Saint Lucia report
Contact: rmccaa@umn.edu
this ppt is also available at ipums-global (See Port of Spain workshop)