Title: Data Privacy and Epidemiologic Research
1. Data Privacy and Epidemiologic Research
- Harry Guess
- Merck Research Laboratories
© Merck & Co., Inc.
2. Outline
- Epidemiologic and health services research
- Contrast with clinical research
- Human subjects protection in health services research
- Identifiable data and protection of privacy
- Public health importance of studies using identifiable medical data
- Conclusions
- Where can mathematical research help?
3. Health Services Research (HSR)
- "Health services research is a multidisciplinary field of inquiry, both basic and applied, that examines the use, costs, quality, accessibility, delivery, organization, financing, and outcomes of health care services to increase knowledge and understanding of the structure, processes, and effects of health services for individuals and populations."
Protecting Data Privacy in Health Services Research. Committee on the Role of Institutional Review Boards in Health Services Research Data Privacy Protection (Editors). Division of Health Care Services. National Academy of Sciences Press. Washington, DC. 2000. ISBN 0-309-07187-9
4. Much Health Services Research Makes Use of Medical Records Previously Collected for Other Purposes
5. Typical Information Available in Claims Databases
PHYSICIAN CLAIMS
- Member identifier
- Provider identifier
- Date of service
- ICD-9-CM codes or CPT-4 procedure codes
OUTPATIENT PHARMACY CLAIMS
- Member identifier
- Pharmacy identifier
- NDC code
- Generic code
- Drug strength
- Dosage form
- Quantity dispensed
- Days supply
- Prescribing physician ID
- Date filled
MEMBERSHIP DATA
- Member identifier
- Date of birth
- Gender
- Date of enrollment
- Date of disenrollment
- Benefit plan number
HOSPITAL CLAIMS
- Member identifier
- Provider identifier
- Date of admission
- Date of discharge
- DRG code
- ICD-9-CM codes
- Length of stay
6. Record Linkage can create a longitudinal record of events for each patient
- Using the member identification number and the dates of events (e.g., prescriptions, medical diagnoses), the records can be linked over time to produce a longitudinal record of events for each patient
  - Patient 1234 02/26/01 - prescription for Drug A
  - Patient 1234 03/15/01 - prescription for Drug A
  - Patient 1234 04/10/01 - prescriptions for Drug A, Drug B
  - Patient 1234 04/20/01 - hospitalization for acute renal failure
- Putting together each patient's longitudinal sequence of events yields a record of a cohort of many patients, each of whom has been followed individually over time.
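The linkage step described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the event tuples, the second member number, and the function name `build_longitudinal_records` are hypothetical, and real claims files would need cleaning and matching logic that is not shown here.

```python
from collections import defaultdict
from datetime import date

# Hypothetical claim events: (member identifier, date of event, description).
# Patient 1234 reuses the example from the slide; patient 5678 is invented.
events = [
    ("1234", date(2001, 4, 20), "hospitalization for acute renal failure"),
    ("1234", date(2001, 2, 26), "prescription for Drug A"),
    ("5678", date(2001, 3, 1), "prescription for Drug B"),
    ("1234", date(2001, 4, 10), "prescriptions for Drug A, Drug B"),
    ("1234", date(2001, 3, 15), "prescription for Drug A"),
]


def build_longitudinal_records(events):
    """Group events by member identifier and order each group by date."""
    records = defaultdict(list)
    for member_id, event_date, description in events:
        records[member_id].append((event_date, description))
    for history in records.values():
        history.sort(key=lambda item: item[0])
    return dict(records)


cohort = build_longitudinal_records(events)
for event_date, description in cohort["1234"]:
    print("Patient 1234", event_date.strftime("%m/%d/%y"), "-", description)
```

Each key of `cohort` is one patient, and each value is that patient's sequence of events in time order.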
7. [Figure: individual follow-up timelines for Patients A through F, with an X marking an event on two of the timelines]
8.
- Once longitudinal records have been created, personal identifiers such as member numbers or social security numbers are generally replaced by encrypted numbers or by sequential numbers (0001 for the first patient, 0002 for the second, etc.)
- Such data are not fully anonymous as long as someone (e.g., the managed care organization) holds the key whereby the actual member numbers can be re-identified
- In epidemiologic research and in pharmaceutical clinical trials the patient identifiers maintained for data analyses are typically
  - study site identifiers
  - sequential patient numbers
- The links between these and actual patient identifiers are almost always held by the investigators (for clinical trials) or managed care organizations (for epidemiologic studies)
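As a rough illustration of the sequential-number approach described above, the following Python sketch replaces member identifiers with study numbers and returns the linking key separately. The member numbers, the record layout, and the function name `pseudonymize` are all hypothetical. The point is only that the data remain re-identifiable by whoever holds the key.

```python
def pseudonymize(records):
    """Replace member identifiers with sequential study numbers.

    Returns the relabeled records and the linking key. In practice the key
    would be held separately by the data holder, not by the analysts.
    """
    key = {}  # actual member identifier -> sequential study number
    relabeled = []
    for member_id, payload in records:
        if member_id not in key:
            key[member_id] = f"{len(key) + 1:04d}"  # 0001, 0002, ...
        relabeled.append((key[member_id], payload))
    return relabeled, key


# Hypothetical input: (member identifier, event description).
records = [
    ("M-88213", "prescription for Drug A"),
    ("M-10477", "prescription for Drug B"),
    ("M-88213", "hospitalization"),
]
study_data, linking_key = pseudonymize(records)
print(study_data)
```

Anyone holding `linking_key` can invert the mapping, which is why such data are not fully anonymous.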
9. Example: Health Care Databases in Saskatchewan
- All residents have full coverage of medical care services and drugs listed in the provincial formulary.
- Approximately 1 million current residents.
- Six separate provincial databases can be linked by an individual's unique Health Services Number (HSN)
  - Eligible population registry
  - Prescription drug data (1975-1987, 1989-present)
  - Hospital services data
  - Physician services data
  - Cancer registry and vital statistics
- Reference: Downey W, et al. Health Databases in Saskatchewan. Ch. 20, pp. 325-345. In: Pharmacoepidemiology. 3rd Ed. B. Strom (Ed).
10. Clinical and Health Services Research: Two different paradigms
- Clinical Research
  - Typically prospective
  - Typically involves at most a few thousand patients, except in the most expensive of studies
  - Research risks may involve possibility of physical harm (e.g., from adverse reactions)
  - Study-specific informed consent is easily obtained as part of the investigator-patient interaction
- Health Services Research
  - Often retrospective
  - Typically involves analyses of medical records that have been collected previously and for other purposes from many thousands or millions of patients
  - Research risks have to do with potential harm from release of health information
  - Study-specific informed consent is often impossible to obtain without either invalidating the study or making it prohibitively expensive
11. Another difference between clinical and health services research
- Much health services research is intended to improve the quality of medical care
  - Hence the boundary between health care quality assurance and research is often not as sharply defined in health services research as in clinical research
- "Research means a systematic investigation, including research development, testing, and evaluation, designed to develop or contribute to generalizable knowledge." (45 CFR 164.501)
- "...if there is any element of research in an activity, that activity should undergo review for the protection of human subjects" (The Belmont Report)
12. Human Subjects Protection in Health Services Research
- Ethical principles are the same as for clinical research
- Respect for persons
  - Underlies the requirement for oversight of research by an Institutional Review Board (IRB) / Privacy Board / Ethics Committee
  - Requires patient informed consent, with certain well-defined exceptions
- Beneficence
  - Requires that risks of the research be reasonable in relation to possible benefits and that any risks to research subjects be minimized
- Justice
  - Requires fairness in sharing risks and benefits of research
The Belmont Report: http://ohrp.osophs.dhhs.gov/humansubjects/guidance/belmont.htm
13. Waiving Informed Consent: The Belmont Report
- "A special problem of consent arises where informing subjects of some pertinent aspect of the research is likely to impair the validity of the research. ..."
- "In all cases of research involving incomplete disclosure, such research is justified only if it is clear that
  - (1) incomplete disclosure is truly necessary to accomplish the goals of the research,
  - (2) there are no undisclosed risks to subjects that are more than minimal, and
  - (3) there is an adequate plan for debriefing subjects, when appropriate, and for dissemination of research results to them."
- "Care should be taken to distinguish cases in which disclosure would destroy or invalidate the research from cases in which disclosure would simply inconvenience the investigator."
14. Conditions in The Common Rule under which an IRB may alter or waive the requirement of obtaining informed consent for human subjects research
- An Institutional Review Board (IRB) may alter or waive the requirements to obtain informed consent if it finds and documents that
  - (1) The research involves no more than minimal risk to the subjects;
  - (2) The waiver or alteration will not adversely affect the rights and welfare of the subjects;
  - (3) The research could not practicably be carried out without the waiver or alteration; and
  - (4) Whenever appropriate, the subjects will be provided with additional pertinent information after participation.
- 45 CFR 46.116(d)
15. HIPAA
- 45 CFR 160-164: Standards for Privacy of Individually Identifiable Health Information
- "This regulation is the second final regulation to be issued in the package of rules mandated under title II subtitle F sections 261-264 of the Health Insurance Portability and Accountability Act of 1996 (HIPAA), Public Law 104-191, titled Administrative Simplification."
- CFR = U.S. Code of Federal Regulations
16. Privacy Protections for Human Research Participants
- The HIPAA Privacy Prohibition
  - The HIPAA Privacy Standards generally state that a covered entity may not use or disclose protected health information (PHI), except as permitted or required by the regulation.
- 45 CFR 164.502
17. HIPAA: Some Definitions
- Covered Entity
  - Health plan; health care clearinghouse; health care provider that transmits any health information in electronic form.
- Protected Health Information (PHI)
  - Individually Identifiable Health Information (IIHI) in the possession of a Covered Entity, whether transmitted or maintained through electronic media, in hard copy, or by other means
- Individually Identifiable Health Information (IIHI)
  - Information about the physical or mental health of an individual that is created or received by a covered entity and that identifies or can reasonably be used to identify the individual
18. HIPAA Privacy Protections for Human Research Participants
19. HIPAA: Effect on Researchers
- HIPAA applies to covered entities, not covered individuals.
- Covered entities include healthcare providers that transmit PHI in electronic form.
- Thus, a researcher who is not a healthcare provider or other covered entity, and who receives PHI from a covered entity, is not directly subject to the HIPAA regulation.
- Indirect control on how a researcher can access data from a covered entity is provided by HIPAA in several ways, including
  - through requiring that the covered entity obtain authorization from each patient whose data will be used for anything other than treatment, payment, or healthcare operations
  - or through detailed criteria that an IRB or Privacy Board must find to have been met in order for a covered entity to release PHI for a research study without specific authorization from each patient
20. Conditions in HIPAA Regulations under which an IRB or Privacy Board may waive or alter informed consent requirements for releasing identifiable health information
- (A) The use or disclosure of protected health information involves no more than minimal risk to the individuals;
- (B) The alteration or waiver will not adversely affect the privacy rights and the welfare of the individuals;
- (C) The research could not practicably be conducted without the alteration or waiver;
- (D) The research could not practicably be conducted without access to and use of the protected health information;
- Continued
21. HIPAA: Conditions for waiving consent (continued)
- (E) The privacy risks to individuals whose protected health information is to be used or disclosed are reasonable in relation to the anticipated benefits, if any, to the individuals, and the importance of the knowledge that may reasonably be expected to result from the research;
- (F) There is an adequate plan to protect the identifiers from improper use and disclosure;
- (G) There is an adequate plan to destroy the identifiers at the earliest opportunity consistent with conduct of the research, unless there is a health or research justification for retaining the identifiers, or such retention is otherwise required by law; and
- (H) There are adequate written assurances that the protected health information will not be reused or disclosed to any other person or entity, except as required by law, for authorized oversight of the research project, or for other research for which the use or disclosure of protected health information would be permitted by this subpart.
- 45 CFR 164.512(i)
22. De-Identification of health information: What are the requirements in HIPAA?
23. To de-identify medical data under HIPAA, one must either remove all of the following 18 identifiers or provide an expert statistical determination that the risk of identifying an individual would be very small
- Names
- All geographic subdivisions smaller than a State (some complex exceptions)
- All elements of dates (except year) for dates ... including birth date, admission date, discharge date, date of death, and all ages over 89
- Telephone numbers
- Fax numbers
- Electronic mail addresses
- Social security numbers
- Medical record numbers
- Health plan beneficiary numbers
- Account numbers
- Certificate/license numbers
- Vehicle identifiers and serial numbers, including license plate numbers
- Device identifiers and serial numbers
- Web Universal Resource Locators (URLs)
- Internet Protocol (IP) address numbers
- Biometric identifiers, including finger and voice prints
- Full face photographic images and any comparable images; and
- Any other unique identifying number, characteristic, or code
45 CFR 164.514
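To make the identifier-removal route concrete, here is a minimal Python sketch that drops a few of the listed identifier types, keeps only the year of any date, and aggregates ages over 89. It is an illustration, not a compliance tool. The field names and the function `strip_identifiers` are hypothetical, only some of the 18 categories are covered, and the "complex exceptions" for geography and the handling of birth years that reveal an age over 89 are not addressed.

```python
from datetime import date

# Hypothetical field names; a real data set would need its own mapping
# covering all 18 identifier categories.
DIRECT_IDENTIFIER_FIELDS = {
    "name", "telephone", "fax", "email", "ssn",
    "medical_record_number", "health_plan_number", "account_number",
}
SUB_STATE_GEOGRAPHY_FIELDS = {"street_address", "city", "county", "zip_code"}


def strip_identifiers(record):
    """Return a copy of the record with some listed identifiers removed or coarsened."""
    cleaned = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIER_FIELDS or field in SUB_STATE_GEOGRAPHY_FIELDS:
            continue  # drop the field entirely
        if isinstance(value, date):
            cleaned[field + "_year"] = value.year  # keep only the year
        elif field == "age" and value > 89:
            cleaned[field] = "90 or older"  # aggregate ages over 89
        else:
            cleaned[field] = value
    return cleaned


record = {
    "name": "Pat Example",
    "ssn": "000-00-0000",
    "zip_code": "02138",
    "state": "MA",
    "age": 93,
    "admission_date": date(2001, 4, 20),
    "diagnosis": "acute renal failure",
}
cleaned = strip_identifiers(record)
print(cleaned)
```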
24. Expert statistical determination that the risk of identifying an individual is very small
- A person with appropriate knowledge of and experience with generally accepted statistical and scientific principles and methods for rendering information not individually identifiable:
  - (i) Applying such principles and methods, determines that the risk is very small that the information could be used, alone or in combination with other reasonably available information, by an anticipated recipient to identify an individual who is a subject of the information; and
  - (ii) Documents the methods and results of the analysis that justify such determination.
- 45 CFR 164.514
25. Why such caution?
26. Use of external public information in combination with databases can reveal patient names and addresses
- Public information: e.g., voter registration rolls linking names and addresses with DOB, gender, ZIP Code
- Database information: DOB, gender, ZIP Code
- Result: identification of the patient names with the medical information in the database
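The linkage described on this slide amounts to a join on the shared fields. The Python sketch below shows the idea with entirely invented records. The names, addresses, diagnoses, and the function `link_on_quasi_identifiers` are hypothetical. A database row is attached to a name only when exactly one person in the public list shares its DOB, gender, and ZIP Code.

```python
from collections import defaultdict

# Invented public list (e.g., a voter roll) and invented medical database.
voter_roll = [
    {"name": "Alice Example", "address": "1 Main St", "dob": "1950-01-02", "gender": "F", "zip": "02138"},
    {"name": "Bob Example", "address": "2 Elm St", "dob": "1950-01-02", "gender": "M", "zip": "02138"},
    {"name": "Carol Example", "address": "3 Oak St", "dob": "1962-07-09", "gender": "F", "zip": "02139"},
    {"name": "Dana Example", "address": "4 Pine St", "dob": "1962-07-09", "gender": "F", "zip": "02139"},
]
medical_db = [
    {"dob": "1950-01-02", "gender": "F", "zip": "02138", "diagnosis": "asthma"},
    {"dob": "1962-07-09", "gender": "F", "zip": "02139", "diagnosis": "renal disease"},
]

SHARED_FIELDS = ("dob", "gender", "zip")


def link_on_quasi_identifiers(public_list, database):
    """Return (name, diagnosis) pairs for database rows matching exactly one person."""
    index = defaultdict(list)
    for person in public_list:
        index[tuple(person[f] for f in SHARED_FIELDS)].append(person)
    matches = []
    for row in database:
        candidates = index.get(tuple(row[f] for f in SHARED_FIELDS), [])
        if len(candidates) == 1:  # a unique match re-identifies the row
            matches.append((candidates[0]["name"], row["diagnosis"]))
    return matches


matches = link_on_quasi_identifiers(voter_roll, medical_db)
print(matches)
```

In this toy data the first database row is re-identified because only one voter shares its three fields, while the second is not, because two voters share them.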
27. Using public information to uniquely identify people: Cambridge, Massachusetts Voting List, 54,805 voters
Percent of voters whose names and addresses were uniquely identified by the data elements:
- Birth date alone: 12%
- Birth date and gender: 29%
- Birth date and 5-digit ZIP Code: 69%
- Birth date and full postal code: 97%
Sweeney L. Journal of Law, Medicine & Ethics. 1997; 25:98-110
These results are not surprising. This is about what one would expect from the rough approximation $(1 - 1/k)^{N-1} \approx \exp(-N/k)$, where $k$ is the number of available dates of birth in a voter cohort and $N$ = 54,805. The other numbers look plausible as well, by similar reasoning.
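The approximation can be checked numerically. If each of N people independently takes one of k equally likely values, the chance that a given person shares a value with none of the other N - 1 people is (1 - 1/k)^(N-1), which is close to exp(-N/k) when k is large. The sketch below assumes, purely for illustration, that birth dates are spread uniformly over about 70 years. That value of k is not given in the slides, and real birth dates are not uniform.

```python
import math


def fraction_unique(n_people, k_values):
    """Exact and approximate probability that a given person's value is unique."""
    exact = (1 - 1 / k_values) ** (n_people - 1)
    approx = math.exp(-n_people / k_values)
    return exact, approx


N = 54_805    # voters on the Cambridge list, from the slide
k = 365 * 70  # ASSUMPTION: about 70 years of equally likely birth dates

exact, approx = fraction_unique(N, k)
print(f"exact  = {exact:.3f}")
print(f"approx = {approx:.3f}")
```

Under this assumed k, both expressions come out a little under 0.12, in line with the 12% reported for birth date alone. Adding gender or ZIP Code multiplies the number of possible values k, which pushes the unique fraction upward.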
28. Even more stringent forms of data privacy protection than HIPAA are favored by some
29.
- Some members of the academic medical community believe that identifiable medical records should generally not be used for public health research or quality assurance without specific informed consent and that these objectives can be accomplished sufficiently well by use of fully anonymous data
- "In addition, we must abandon the use of identifiable medical records for quality assurance, detection of fraud, and public health, with narrowly defined exceptions. In most cases, mathematical algorithms applied to anonymous data perform these functions more effectively, more quickly, more accurately, and more cheaply."
Welch CA. NEJM August 2001; 345:371-2
30. Some states have required individual patient informed consent for all medical records research
- In 1996, Minnesota enacted a law that placed stringent consent requirements on the use of patient data for research.
- Records created since January 1, 1997 could not be used for research without the patient's written authorization.
31. Effect of Minnesota legislation requiring specific informed consent on response rate for a medical records study
- STUDY DESIGN: Seizures associated with a pain medication, part of FDA post-marketing surveillance to evaluate adverse events with approved drugs.
- DATA COLLECTION: Informed consent for Minnesota plan members consisted of (1) a letter from the health plan medical director, (2) a second mailing to non-respondents, and (3) a follow-up telephone call to non-respondents.
- PRINCIPAL FINDING: very low participation rates
  - 19% (26/140) of health plan members in Minnesota, where informed consent was required, returned a signed consent form
  - In 5 other states, where patient informed consent was not required, health care providers granted access to patient medical records for 93% (123/132) of the members.
- CONCLUSION: Legislation requiring study-specific consent was associated with low participation and increased time to completion. Efforts to protect privacy may conflict with ability to produce valid research to safeguard and improve public health.
- McCarthy DB, et al. Medical records and privacy: empirical effects of legislation. Health Serv Res 1999; 34:417-25.
32.
- The Minnesota law has subsequently been amended to permit use of records where the patient does not respond to 2 requests for authorization mailed to the patient's last known address.
- At Mayo Clinic, that change decreased the percentage of patient records that the patient consent requirement made unavailable for studies from 20.7 percent to 3.2 percent. Mayo Clinic researchers remain concerned that variations in the rate of refusal among different patient groups, for example, young versus old, may tend to skew the results obtained from these data [1, 2].
- [1] MEDICAL PRIVACY REGULATION. GAO Report 01-584. April 2001.
- [2] S. J. Jacobsen and others, Potential Effect of Authorization Bias on Medical Record Research, Mayo Clinic Proceedings, Vol. 74, No. 3 (April 1999), p. 333
33. Informed Consent for all Medical Records Research using identifiable patient data is required by the October 2000 Declaration of Helsinki
- Paragraph 1
  - "...Medical research involving human subjects includes research on identifiable human material or identifiable data."
- Paragraph 22
  - "In any research on human beings, each potential subject must be adequately informed of the aims, methods, sources of funding, any possible conflicts of interest, institutional affiliations of the researcher, the anticipated benefits and potential risks of the study and the discomfort it may entail. The subject should be informed of the right to abstain from participation in the study or to withdraw consent to participate at any time without reprisal. After ensuring that the subject has understood the information, the physician should then obtain the subject's freely-given informed consent, preferably in writing. If the consent cannot be obtained in writing, the non-written consent must be formally documented and witnessed."
- These paragraphs imply that informed consent must be obtained for all research involving identifiable medical records. The Declaration provides no exceptions, unlike the Belmont Report, the Common Rule, and the HIPAA regulations
34. The new Declaration of Helsinki requirement of informed consent for all research using identifiable human data has prompted criticism
- "Strict application of the declaration's principles would make a wide range of clinical, biological, and epidemiological research impractical or invalid."
  - Sir Richard Doll. BMJ 2001; 323:1421-1422
- He gave 2 examples of epidemiological research which he judged would have been impossible to conduct validly if individual informed consent had been required:
  - His 1957 study in 14,000 patients documenting that radiation therapy increased the risk of subsequent cancer
  - A recently published epidemiological study showing that neither induced nor spontaneous abortion increased breast cancer risks
35. Medical records databases are essential for addressing many important research questions in public health
36. Effectiveness of Influenza Vaccine in the Elderly
- Retrospective cohort study using administrative database [1]
- Setting: Minnesota HMO; 25,000 persons aged 65 years or older
- Results:
  - About 50% reduction in influenza and pneumonia hospitalizations
  - Almost 50% reduction in all-cause mortality
  - Direct cost saving of $117 per person vaccinated
- Conclusion: Influenza vaccination of the elderly reduces both mortality and costs
- Comment
  - To have required individual informed consent for this study would have made it prohibitively expensive, and could have impaired validity because of selection bias
- [1] NEJM 1994; 331:778-84
37. Evaluating Effects of Reimbursement Rule Changes on Health Care Utilization and Costs
- Example
  - Effect of legislatively-mandated Medicaid drug-payment limits on admissions to hospitals and nursing homes [1]
- Methods
  - Matched cohort study of Medicaid claims in two states for high-risk elderly patients before, during, and after the limit
- Conclusion
  - "Limiting reimbursement for effective drugs puts frail, low-income, elderly patients at increased risk of institutionalization in nursing homes and may increase Medicaid costs."
- Comment
  - The study required Medicaid claims data from 2 states. If individual informed consent had been required, the study would have been impossible to conduct
- [1] Soumerai SB, et al. Effects of Medicaid drug-payment limits on admission to hospitals and nursing homes. NEJM 1991; 325:1072-7.
38. Why medical records studies are important in public health
- To monitor the health of populations and to detect emerging disease problems, e.g., trends and patterns in asthma, renal disease, coronary heart disease, cancer
- To identify populations at high risk for disease and to identify factors that are either potentially harmful or helpful
- To determine the effectiveness of health interventions as they are used in clinical practice, e.g., monitoring effects of new vaccines
- To quantify prognosis, e.g., survival statistics for various stages and grades of cancer and for cardiovascular disease
- To assess usefulness of diagnostic tests and screening programs, e.g., colon cancer screening, mammography for breast cancer screening
Melton LJ. The threat to medical records research. NEJM 1997; 337:1466-1470
39. Conclusions
- Epidemiologic research using identifiable medical records has played a vital role in advancing public health and medical knowledge
- There is every indication that it should be able to play as great a role or a greater one in the future, especially as larger records linkage systems are put into place
- A requirement for individual study-specific informed consent for each medical records study, as advocated by some, would make much health services and epidemiologic research either invalid or so expensive as to be impossible
40. How can mathematical research help?
- Improving methodology for protecting patient privacy while permitting use of large data sets for epidemiologic research and surveillance
- Defining acceptable professional standards for what needs to be done to certify that releasing a given data set to a given recipient poses a very small risk of identifying any individuals