HIPAA and its Implications on Epidemiological Research Using Large Databases

1
HIPAA and its Implications on Epidemiological
Research Using Large Databases
  • K. Arnold Chan, MD, ScD
  • Harvard School of Public Health
  • Channing Laboratory,
  • Brigham and Women's Hospital
  • and Harvard Medical School

2
Brief outline of this presentation
  • Using large linked automated data for public
    health research
  • Data development processes to ensure
    HIPAA-compliance
  • Examples
  • Some thoughts

3
Two types of data for public health research
  • Primary data
  • Prospectively collected
  • Well-designed data collection tool
  • Informed consent
  • Secondary data
  • Data originally collected for other purposes
  • May be proprietary
  • Privacy and confidentiality (particularly
    important if no prior authorization)
  • Different data systems

4
Large linked healthcare databases
  • Health insurance claims data
  • Medicaid
  • Medicare
  • Managed Care Organizations (MCO)
  • Automated medical records
  • Hospital / Clinic IT systems
  • Availability of written records
  • Need to contact patients / individuals?

5
Public health research within MCOs
  • Harvard Community Health Plan (subsequently
    became Harvard Pilgrim HealthCare)
  • Kaiser Permanente (several states)
  • Group Health Cooperative (Seattle area)
  • Others
  • HMO Research Network
  • 10 MCOs across the U.S.

6
Public health research within MCOs
  • Different types of MCOs
  • Group model
  • Staff model
  • Different relationship with hospitals
  • Implications on data access
  • MCOs with research programs
  • Separate research departments
  • Full-time investigators and support staff

7
Data elements in the MCO data
  • Demographic information
  • Membership
  • Start date, termination date, benefit plan, ...
  • Office visits
  • Type of visit, diagnosis(es), special procedures
  • Special examinations
  • Radiology, Laboratory examinations
  • Hospitalizations
  • Drug dispensings
  • Linkable by a unique ID

8
HIPAA and Research with Databases
  • Authorization from individual research subjects
    not feasible
  • Individual authorization may be waived by
    Institutional Review Board or Privacy Board
  • Minimal Risk
  • Data reported in aggregate fashion
  • No single-case report
  • Minimum necessary principle
  • De-identification

9
HIPAA and Research with Databases
  • Single MCO studies
  • Investigators and research staff are MCO
    employees
  • Multiple-MCO studies
  • May involve transfer of data across MCOs or to
    a Data Center
  • Other types of studies not covered in this
    presentation
  • e.g. Generate a de-identified dataset for public
    or commercial use

10
HIPAA and data development
  • Do not move individual level data unless
    absolutely necessary
  • Generate summary tables at each study site
  • Combine the tables for final report
  • Smalley et al. Contraindicated use of cisapride:
    the impact of an FDA regulatory action. JAMA
    2000;284:3036-9.
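
The site-level approach on this slide can be sketched as follows. This is a minimal illustration, not the procedure used in the cited study: the record layout, the stratum labels, and the functions `summarize_site` and `combine_tables` are all hypothetical.

```python
from collections import Counter

def summarize_site(records):
    """Run inside one study site: count subjects per stratum.

    `records` is an iterable of (age_group, exposed) tuples. Only the
    resulting counts leave the site, never the individual rows.
    """
    return Counter(records)

def combine_tables(site_tables):
    """Run at the data center: add the per-site count tables together."""
    total = Counter()
    for table in site_tables:
        total.update(table)  # Counter.update adds counts key by key
    return total

# Hypothetical individual-level data held at two sites.
site_a = [("0-17", True), ("0-17", False), ("18-64", True)]
site_b = [("0-17", True), ("18-64", True), ("18-64", False)]

combined = combine_tables([summarize_site(site_a), summarize_site(site_b)])
for stratum, count in sorted(combined.items()):
    print(stratum, count)
```

Because each site ships only counts, the data center never sees a row that corresponds to one person.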

12
HIPAA and data development
  • Randomly generated Study ID to replace True ID
  • Crosswalk between the two stored at secured
    location
  • Destroy the crosswalk after successful linkage of
    data and quality check
  • Implications for storage and back-up
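
A minimal sketch of the Study ID replacement described above. The field names (`true_id`, `study_id`), the sample rows, and the helper functions are assumptions made for illustration; the slides do not specify how the IDs were generated.

```python
import secrets

def build_crosswalk(true_ids):
    """Map each True ID to a unique, randomly generated Study ID."""
    crosswalk = {}
    used = set()
    for true_id in true_ids:
        if true_id in crosswalk:          # keep one Study ID per person
            continue
        study_id = secrets.token_hex(8)
        while study_id in used:           # regenerate on the rare collision
            study_id = secrets.token_hex(8)
        used.add(study_id)
        crosswalk[true_id] = study_id
    return crosswalk

def replace_ids(rows, crosswalk):
    """Return copies of the rows with 'true_id' swapped for 'study_id'."""
    out = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != "true_id"}
        new_row["study_id"] = crosswalk[row["true_id"]]
        out.append(new_row)
    return out

# Hypothetical source files that share the True ID.
membership = [{"true_id": "M001", "plan": "A"}, {"true_id": "M002", "plan": "B"}]
dispensings = [{"true_id": "M001", "drug": "digoxin"}]

crosswalk = build_crosswalk(row["true_id"] for row in membership)
membership_out = replace_ids(membership, crosswalk)
dispensings_out = replace_ids(dispensings, crosswalk)

# After linkage and quality checks succeed, the crosswalk is discarded.
crosswalk.clear()
```

Clearing the in-memory dictionary only stands in for the real step: the stored crosswalk file and any backup copies of it would also have to be destroyed, which is the storage and back-up implication the slide points to.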

13
HIPAA and data development
  • Roll-up / transform variables
  • Age --> Age groups
  • National Drug Code --> Drug or Group of drugs
  • ICD-9 diagnosis code --> Disease
  • e.g. A man born on Dec 10, 1934 with diagnosis
    code xxx.yy received drug 55555-333-22
  • 65-70 y/o m with Heart Failure received Digoxin
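
The roll-up can be sketched as a set of lookups plus an age-banding step. The lookup tables below contain only the placeholder codes from the slide; the reference date, the 5-year band labels, and the function names are assumptions made for this example.

```python
from datetime import date

# Placeholder codes taken from the slide, not real NDC or ICD-9 values.
NDC_TO_DRUG = {"55555-333-22": "Digoxin"}
ICD9_TO_DISEASE = {"xxx.yy": "Heart Failure"}

def age_on(birth_date, ref_date):
    """Completed years of age on the reference date."""
    years = ref_date.year - birth_date.year
    if (ref_date.month, ref_date.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years

def age_group(age, width=5):
    """Label such as '65-69' for the band that contains `age`."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def roll_up(record, ref_date):
    """Replace exact values with coarser categories; drop the birth date."""
    return {
        "age_group": age_group(age_on(record["birth_date"], ref_date)),
        "sex": record["sex"],
        "disease": ICD9_TO_DISEASE[record["icd9"]],
        "drug": NDC_TO_DRUG[record["ndc"]],
    }

raw = {"birth_date": date(1934, 12, 10), "sex": "M",
       "icd9": "xxx.yy", "ndc": "55555-333-22"}
rolled = roll_up(raw, ref_date=date(2003, 1, 1))  # assumed reference date
print(rolled)
```

The output record keeps what the analysis needs (age band, sex, disease, drug) while the exact birth date and raw codes stay at the study site.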

14
HIPAA and data development
  • Preserve temporal sequence of events
  • but disguise the real dates
  • e.g. Drug use during pregnancy study
  • 29 year-old received 55555-333-22 on Nov 25, 1999
    and delivered a baby on Dec 10, 1999
  • -->
  • 26-30 year-old mother delivered in 1999, baby
    exposed to amoxicillin at -15 days
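
One way to keep the sequence of events while hiding real dates is to express every event as a day offset from an index date (here, the delivery). The sketch below uses the two dates from the slide; the function and field names are hypothetical. Plain calendar subtraction gives -15 days for these two dates.

```python
from datetime import date

def to_relative_days(event_date, index_date):
    """Days from the index date to the event (negative = before the index)."""
    return (event_date - index_date).days

delivery = date(1999, 12, 10)    # index date, kept only at the study site
dispensing = date(1999, 11, 25)

offset = to_relative_days(dispensing, delivery)

# The shared record carries only the delivery year and the offset.
record = {"delivery_year": delivery.year, "exposure_day": offset}
print(record)
```

Offsets preserve both the order of events and the intervals between them, which is what a drug-exposure-in-pregnancy analysis depends on.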

15
HIPAA and data development
  • Only extract information relevant to the study
  • e.g. A study of osteoporosis does not require
    information on subjects' mental health status
  • Co-morbid conditions may be relevant
  • Use proxy measures to describe level of
    comorbidity
  • Charlson's Index (based on concomitant diagnoses)
  • Chronic Disease Score (based on co-medications)

16
HIPAA and data development
  • Geocoding
  • Describe socioeconomic status of study subjects
    based on census tract data
  • Send out (Study ID, address) to a geocoding firm
  • (Study ID, X1, X2, X3) returned
  • X1 education level
  • X2 income level
  • X3 race/ethnicity information
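
The exchange with the geocoding firm can be sketched as two small steps: build an outgoing file that holds nothing but (Study ID, address), then attach the returned variables by Study ID. All field names, sample values, and function names below are hypothetical.

```python
def outgoing_file(subjects):
    """(Study ID, address) pairs only; no clinical data goes to the vendor."""
    return [(s["study_id"], s["address"]) for s in subjects]

def merge_returned(analytic_rows, returned):
    """Attach the returned census-tract variables to rows by Study ID."""
    by_id = {sid: (x1, x2, x3) for sid, x1, x2, x3 in returned}
    merged = []
    for row in analytic_rows:
        x1, x2, x3 = by_id.get(row["study_id"], (None, None, None))
        merged.append({**row, "x1_education": x1,
                       "x2_income": x2, "x3_race_ethnicity": x3})
    return merged

# Hypothetical inputs.
subjects = [{"study_id": "S1", "address": "1 Example St"},
            {"study_id": "S2", "address": "2 Example Ave"}]
analytic_rows = [{"study_id": "S1", "drug": "digoxin"},
                 {"study_id": "S2", "drug": "atenolol"}]

to_vendor = outgoing_file(subjects)
# Made-up tract-level values standing in for what the firm sends back.
returned = [("S1", 0.82, 41000, 0.15), ("S2", 0.64, 29500, 0.40)]

analytic_with_ses = merge_returned(analytic_rows, returned)
print(analytic_with_ses[0])
```

The vendor sees addresses but no health data, and the analytic file gains tract-level measures without ever holding an address.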

17
An example
  • Finkelstein et al. Decreasing Antibiotic Use
    Among US Children: The Impact of Changing
    Diagnosis Patterns. Pediatrics 2003;112:620-7.
  • Data elements involved
  • Date of birth, gender
  • Membership
  • Drug dispensings
  • Diagnoses in close proximity to antibiotics
    dispensings
  • Data from nine MCOs

18
Finkelstein et al. Pediatric antibiotics use study
  • Data development at each MCO
  • Extract antibiotics use information
  • Extract diagnosis of interest (infections)
  • Use date of birth, gender, and membership data to
    calculate person-time of interest
  • Refined, aggregate data forwarded to the Data
    Center
  • Rate of antibiotics use
  • Number of antibiotic uses / 1,000 person-years
  • for each age-gender group
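
The rate calculation each MCO runs before forwarding aggregate data can be sketched as below. This is a simplified illustration, not the study's code: each member is assigned one fixed age group (a real analysis would split person-time as a child ages), and the field names, sample data, and 365.25-day year are assumptions.

```python
from collections import defaultdict
from datetime import date

def person_years(start, end):
    """Membership time in years, using a 365.25-day year."""
    return (end - start).days / 365.25

def rates_per_1000(members, dispensings):
    """Antibiotic dispensings per 1,000 person-years by (age group, gender)."""
    py = defaultdict(float)
    events = defaultdict(int)
    group_of = {}
    for m in members:
        group = (m["age_group"], m["gender"])
        group_of[m["id"]] = group
        py[group] += person_years(m["start"], m["end"])
    for member_id in dispensings:         # one entry per dispensing
        events[group_of[member_id]] += 1
    return {g: 1000 * events[g] / py[g] for g in py}

# Hypothetical membership and dispensing data at one site.
members = [
    {"id": "S1", "age_group": "3-5", "gender": "F",
     "start": date(2000, 1, 1), "end": date(2001, 1, 1)},
    {"id": "S2", "age_group": "3-5", "gender": "F",
     "start": date(2000, 1, 1), "end": date(2000, 7, 1)},
]
dispensings = ["S1", "S1", "S2"]

rates = rates_per_1000(members, dispensings)
for group, rate in rates.items():
    print(group, round(rate, 1))
```

Only the per-group rates (or the underlying counts and person-years) would be forwarded to the Data Center.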

19
HIPAA and data development
  • Individual identification is needed for certain
    types of research
  • Obtain medical records
  • Contact patient to conduct interview and/or
    request specimen
  • Linkage with external data
  • Cancer registry
  • National Death Index

20
HIPAA and data development
  • The process
  • Data extraction, transformation, reduction, and
    de-identification carried out at each MCO
  • Governed by State laws and local HIPAA-compliant
    Standard Operating Procedures
  • Principle of Limited Dataset / Minimum necessary
  • The goal
  • Highly processed and de-identified data available
    for concatenation across study sites and complex
    analyses

21
k-anonymity and large datasets
  • The goal
  • A de-identified dataset at a certain level of
    individual anonymity
  • A 43 year-old man with hypertension, diabetes,
    and anxiety, taking atenolol, rosiglitazone, and
    lorazepam
  • vs.
  • A man 40-45 taking a beta-blocker and a
    thiazolidinedione
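
To make the contrast concrete, the sketch below measures k as the size of the smallest group of records that share the same quasi-identifier values, before and after generalization. The three sample records, the drug-class table, and the 5-year band labels are assumptions for illustration; drugs missing from the class table (lorazepam here) are suppressed.

```python
from collections import Counter

DRUG_CLASS = {
    "atenolol": "beta-blocker",
    "metoprolol": "beta-blocker",
    "rosiglitazone": "thiazolidinedione",
    "pioglitazone": "thiazolidinedione",
}

def generalize(row, width=5):
    """Replace exact age with a band and drug names with drug classes."""
    low = (row["age"] // width) * width
    classes = sorted({DRUG_CLASS[d] for d in row["drugs"] if d in DRUG_CLASS})
    return {"sex": row["sex"],
            "age_group": f"{low}-{low + width - 1}",
            "drug_classes": tuple(classes)}

def k_anonymity(rows, quasi_identifiers):
    """Size of the smallest group sharing one quasi-identifier combination."""
    counts = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(counts.values())

# Hypothetical detailed records.
detailed = [
    {"sex": "M", "age": 43, "drugs": ("atenolol", "lorazepam", "rosiglitazone")},
    {"sex": "M", "age": 41, "drugs": ("atenolol", "rosiglitazone")},
    {"sex": "M", "age": 44, "drugs": ("metoprolol", "pioglitazone")},
]
generalized = [generalize(row) for row in detailed]

k_detailed = k_anonymity(detailed, ("sex", "age", "drugs"))
k_generalized = k_anonymity(generalized, ("sex", "age_group", "drug_classes"))
print(k_detailed, k_generalized)
```

In the detailed file every record is unique, while after generalization all three records fall into one group, so no record can be singled out by these attributes alone.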

22
HIPAA, Data Storage and Access
  • Implications on Data Backup Plans
  • Data need to be destroyed after the report is
    published
  • Data only used to support pre-defined analyses
  • Ancillary analyses are possible after IRB review
    and approval

23
Epidemiology studies using large databases
  • In the old days ...
  • Give me all the data, do what I say ...
  • What if the investigator / reviewer wants to do
    THIS analysis?
  • Use existing datasets to test new hypotheses
  • Good research practice
  • Define necessary data elements according to
    research protocol
  • Pre-defined analytic plan

24
Epidemiology studies using large databases
  • Keys to protection of human subjects
  • Competent, responsible investigators and staff
  • IRB review and oversight
  • Data development guidelines
  • e.g. Good Epidemiology Practice
  • Information technology
  • Some reasonable rules/guidelines are better than
    no guidelines