Title: HIRU Health Information Research Unit for Wales
2HIRU: Health Information Research Unit for Wales
3
- Centre for Health Information, Research and Evaluation (CHIRAL)
- Institute of Life Sciences
- School of Medicine
- Swansea University
4Centre for Health Information, Research and Evaluation
- CHIRAL's research areas
- Health Services Research
- Clinical Epidemiology and Diabetes
- Social and Epidemiological Psychiatry
- Primary care
- Injury and Environment
- Health Information Management
- Mathematical Modelling
- Clinical Research Unit
- Qualitative Research
5Cross cutting themes
- Patient-based trials
- Trials and evaluations of complex interventions
- Cohort studies
- Advanced methodologies (outcomes, qualitative, modelling, etc.)
- Use of routine data for research
6Development of a health and environment information research platform: Health Information Research Unit
7Simple questions which cannot be easily answered
in Wales
- What is the prevalence of any chronic disease in any area of Wales?
- Is disease x increasing or decreasing in Wales?
- What amount of money/resource is spent on different disease groups in Wales?
- Is the health of a population being adversely affected by living closer to
- If care is redesigned in a particular way what will be the likely impact on primary/secondary care and different populations?
8More questions which cannot be easily answered in Wales
- What is the impact of poverty/socioeconomic status/deprivation on the demand for health services?
- Is this practice's higher/lower referral rate/prescribing etc. due to a different disease/illness burden?
- What additional workload is needed to meet national service frameworks (NSFs)/changes in contract etc.?
- How many patients in Wales would benefit from a new treatment supported by NICE and how much would this cost?
- How does the physical environment influence health?
9HIRU Background and methods
- Vast amounts of electronically-stored information held in:
- the NHS
- local government
- central government
- other organisations
- . . . much of it held at an individual level
- These data, were they all to be available for simultaneous analysis, offer enormous potential to conduct and support research
10The challenge
- Data collected for wide variety of purposes
- Generally only used for original purpose (if at all!)
- Data held in large number of organisations
- Held in variety of formats, variation in ontologies
- Known and unknown data quality issues
- Data extraction and transportation difficult
- Data volumes enormous
- Data protection / confidentiality hurdle
- Intelligent (thoughtful) analysis essential
11The future looks good . . .
- National Programmes of NHS IT (Connecting for Health and Informing Healthcare in Wales) offer potentially excellent datasets for research.
- Current priority is delivering clinical benefits, although secondary uses of data are being considered.
- Still some time until research uses of data will be fully realised
- Full breadth of non-NHS data sources unlikely to be in scope in near future
12Moving forward . . .
- Health Information Research Unit (HIRU) funded by a research centre grant from the Wales Office of Research and Development (WORD)
- Three-year grant in the first instance (long-term research infrastructure development)
- Staff appointed Sept 06
- Formally launched by Minister of Health and Social Care, Welsh Assembly Government, Nov 06
13HIRU built on a good pedigree of work in Swansea
- Clinical systems development and implementation
- RCP iLab project
- Public Health analysis
- Using routine data for primary research
- Blue C supercomputer at School of Medicine,
Swansea University
14Computing infrastructure
- Blue C supercomputer, one of the fastest computers in Europe dedicated to Life Science research
- Strategic partnership with IBM (through School of Medicine's Institute of Life Sciences initiative)
- Advanced software toolset (database, data mining, GIS)
- Specialist support through Anix
15Blue C Computer
16HIRU programme aims
- Develop new methodologies for accessing and combining routine data in ways which do not breach data confidentiality rules and regulations, but which still permit the use of data for a wide range of research purposes.
- Explore how to use routinely collected and other data to support large scale multi-site intervention and cohort studies and policy relevant research.
- Develop innovative analyses of large and combined datasets
- Develop methods for data capture to common standards and definitions in multiple and remote locations.
17Focus on Wales
- Small country: 3 million population
- Accessible leadership
- Desire to be smart
- 22 Local Health Boards (PCT equivalents) coterminous with Local Authorities, 14 NHS Trusts
- Wide variation in health from near best to worst
- National Programme for IT (Informing Healthcare) working incrementally
18Progress through partnership
- Health Solutions Wales (HSW)
- Welsh Assembly Government Information Services Division (CHIP Programme PCIMT)
- Informing Health Care
- National Public Health Service Wales
- CRC Cymru organisations and other research groups
- And particularly . . .
- NHS and Local Authority organisations
19Our approach
- Individual confidentiality is at the core of our approach (DAPP)
- No identifiable confidential information is passed to anyone involved
- The project is about information linkage across domains and organisations
- We act responsibly with the data. HIRU is not a data mart. Analysis is strictly protocol-driven. Disclosure control rules are applied to ensure confidentiality.
- We use data for health-related research and to provide useful information back to partner organisations that provide data
- Information used for knowledge, not for performance management!
20HIRU methodology (example)
[Process diagram. Labels: Construct ALF; Validate; HIRU (Blue C); Health Solutions Wales; Data Provider; Anonymisation process; Recombine; Encrypt and load; Operational system]
21Matching and Anonymisation Process
22Matching
- Each record matched to a definitive population register for Wales: the NHS Administrative Register, carefully maintained by the BSCs and HSW
- Exact match against date of birth, forename and surname
- Lexicon techniques: common variations of forename, such as "Liz" and "Beth" for "Elizabeth"
- Then, if no match, use:
- Soundex: alternate phonetic spellings of the forename or surname, using Soundex functions.
- Fuzzy matching: are there possible matches of either date of birth, address or name? Allowances for possible data entry errors. Possible matches are scored and weighted against frequencies of occurrence within the Welsh population. Matches are currently accepted if the probability is above 90%.
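The staged matching described on this slide can be sketched roughly as follows. This is an illustrative sketch only, not the HSW implementation: the `LEXICON` table, the record field names and the `match` function are hypothetical, and the fuzzy stage uses a plain string-similarity average from Python's `difflib` rather than the population-frequency weighting the slide mentions (an assumption made to keep the example short).

```python
from difflib import SequenceMatcher

# Hypothetical lexicon of forename variants (illustrative only).
LEXICON = {"liz": "elizabeth", "beth": "elizabeth"}

_SOUNDEX_GROUPS = (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                   ("L", "4"), ("MN", "5"), ("R", "6"))
_SOUNDEX_CODES = {ch: digit for letters, digit in _SOUNDEX_GROUPS for ch in letters}


def soundex(name):
    """Standard four-character Soundex code for a name."""
    letters = [ch for ch in name.upper() if ch.isalpha()]
    if not letters:
        return ""
    code = letters[0]
    prev = _SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters that share a code
        digit = _SOUNDEX_CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        prev = digit  # vowels reset prev, so repeated codes across a vowel count
    return (code + "000")[:4]


def canonical(forename):
    """Map a forename variant to its lexicon form (lower-cased)."""
    key = forename.strip().lower()
    return LEXICON.get(key, key)


def similarity(a, b):
    """Case-insensitive similarity ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match(person, register, threshold=0.90):
    """Try exact, lexicon, Soundex, then fuzzy matching; return (row, stage)."""
    f, s, d = person["forename"], person["surname"], person["dob"]
    for row in register:  # stage 1: exact match on all three fields
        if (row["dob"], row["forename"].lower(), row["surname"].lower()) == (d, f.lower(), s.lower()):
            return row, "exact"
    for row in register:  # stage 2: forename variants via the lexicon
        if (row["dob"] == d and row["surname"].lower() == s.lower()
                and canonical(row["forename"]) == canonical(f)):
            return row, "lexicon"
    for row in register:  # stage 3: phonetic match on both names
        if (row["dob"] == d
                and soundex(row["forename"]) == soundex(canonical(f))
                and soundex(row["surname"]) == soundex(s)):
            return row, "soundex"
    best, best_score = None, 0.0
    for row in register:  # stage 4: unweighted fuzzy score (assumption)
        score = (similarity(row["forename"], canonical(f))
                 + similarity(row["surname"], s)
                 + similarity(row["dob"], d)) / 3
        if score > best_score:
            best, best_score = row, score
    if best is not None and best_score > threshold:
        return best, "fuzzy"
    return None, None
```

A call such as `match({"forename": "Liz", "surname": "Jones", "dob": "1950-03-02"}, register)` returns the matched register row together with the stage that produced it, which makes it easy to audit how many links came from each stage.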
23Anonymisation
- NHS number encrypted using highly secure 256 bit encryption algorithm
- Unique surrogate value assigned against the encrypted value to become the Anonymous Linking Field (ALF)
- Encryption key held within HSW and only known by them
- Personal details deleted before the file is returned to HIRU as File 3
- Second encryption of ALF occurs while data is flowed to HIRU (key known only to HIRU DBA) to create the ALF-E
- Under development: R-ALFs (anonymised households)
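The two-stage flow on this slide might look something like the sketch below. The slide does not name the 256-bit algorithm, so HMAC-SHA256 (a keyed hash, not encryption) is swapped in purely for illustration; the class name, field names and keys are all hypothetical.

```python
import hashlib
import hmac
import itertools

# Fields treated as identifying in this sketch (an assumption).
IDENTIFYING_FIELDS = {"nhs_number", "forename", "surname", "dob", "address"}


class TrustedThirdParty:
    """Stand-in for the HSW step: holds the key and issues surrogate ALFs."""

    def __init__(self, key):
        self._key = key                      # known only to the third party
        self._surrogates = {}                # keyed token -> surrogate ALF
        self._counter = itertools.count(1)

    def alf_for(self, nhs_number):
        # Keyed transformation of the NHS number (HMAC-SHA256 as a stand-in).
        token = hmac.new(self._key, nhs_number.encode(), hashlib.sha256).hexdigest()
        if token not in self._surrogates:
            self._surrogates[token] = next(self._counter)
        return self._surrogates[token]

    def anonymise(self, record):
        # Drop personal details and attach the ALF before returning the record.
        out = {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}
        out["alf"] = self.alf_for(record["nhs_number"])
        return out


def second_stage(alf, key):
    """Second keyed transformation on receipt, producing an ALF-E."""
    return hmac.new(key, str(alf).encode(), hashlib.sha256).hexdigest()
```

Because the same NHS number always maps to the same ALF, records from different datasets can still be linked after the personal details have been removed, while neither party alone holds both keys.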
24Geocoding
- Each address / postcode replaced by a geocode
- Lower Super Output Area (LSOA): standard small area geography of c. 1,500 people
- 1896 LSOAs in Wales
25Data warehouse design
- Built in DB2 Data Warehouse Edition
- On IBM P Series supercomputer running AIX
- Supports the entire lifecycle of the data HIRU receives, from:
- loading of external data
- encryption and second level anonymisation
- base storage
- historisation of facts and dimensions
- data cleansing and
- presentation of the data to users
27SAIL Pilot Project
- SAIL: Secure Anonymous Information Linkage
- LHB/LA (PCT) area as early adopter pilot (pop. 227k)
- Developing and refining the methods
- Testing quality
- Research question generation
- Sharing the benefits
- Reporting the learning
- Planning for scaling-up
28SAIL Data Bank Pilot (Swansea)
- Access and load a range of datasets
- Inpatient and day case admissions
- Outpatient appointments
- A&E data
- Social Services data (older people, mental health, learning disability, children)
- Births and deaths
- GP morbidity and prescriptions (35 of 36 practices)
- Child health database
- Pathology results from NHS Trust
- NHS Direct Wales 0845 call centre contacts
- Many more in the pipeline . . .
29Volumetrics (Swansea pilot)
- With national data for inpatients and outpatients and local (Swansea) data for primary care and social services, data volumes are large
- Information on 2.42 million people
- Detailed SAIL information on 227k people
- Success rate for ALFing NHS data c. 99%
- Success rate for social services data currently 85% (using tight matching criteria)
- Number of health events recorded: 34,700,000
- Number of records in the databank: 105 million and rising!
30Patient Journey Analysis- Health and Social Care
31Post-pilot developments (in progress)
- Data extraction utility for all 497 GP practices (weekly extracts)
- Automated incremental feeds from national datasets (monthly)
- National data transportation fabric (for NHS and non-NHS) with full file handling, checking, receipting etc.
- Fully automate processes for matching, anonymisation and database loading
- Support national clinical data collections for research
- Extend depth and breadth of health data suppliers to cover all Wales
- Focus on non-health suppliers (social care, education and housing)
- Building on pathology data to integrate imaging and biomedical testing data
- Part of Phase 1 of ONIX with NCRI Informatics Initiative (live meta-data index)
32Data transportation fabric
33Building clinical (secondary care) datasets
- Working with Wales-wide groupings of clinical specialities
- Agree a basic national research dataset
- Developed a generic data architecture to support data collection, for all conditions
- Make available web tool for data capture and local analysis
- Anonymised version fed into SAIL Data Bank
- Starting with stroke, ankylosing spondylitis, and soon cancer
34Individual-level data acquisition strategy
A&E Attendances
Social Services
Other national datasets
Clinical data collections
Child Health
GP Out of Hours
Education attainment
Pathology
Radiology
NHS Direct
Housing
Cancer
Screening
Anomalies
Out patients
Inpatients
GP Data
Geographical coverage
Ecological / environmental data
35Some datasets individual and ecological
- Clinical databases
- Cancer
- Screening (multiple conditions)
- Congenital Anomalies
- Myocardial Infarction
- Diabetes
- Stroke
- Arthropathies
- Etc.
- Ecological datasets (many are GIS)
- Census
- Ordnance Survey - Mastermap
- Social Housing
- Transport
- Environmental Health
- Planning
- Leisure
- Government departments and agencies
- Individual level - health
- Population (NHSAR)
- Inpatients -PEDW (HES)
- Births
- Deaths
- Outpatients
- A&E
- GP Data
- Laboratory systems
- Child Health Database Wales
- NHS Direct Wales
- Radiology- Reporting and Imaging
- Individual level non health
- Social Services
- Educational Attainment
- Housing
36Assuring confidentiality
- Data Anonymisation Process approved by wide range of Information Governance officials and bodies (Caldicott etc.)
- Large scale and very rich data (all anonymised)
- Subsets of data not sent externally
- Access controls necessary
- Research protocol driven access
- Analysis plans crafted to ensure no disclosure (avoiding low cell counts)
- Secure analysis laboratory with secure workstations
- Portals in development with no data transfer
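Avoiding low cell counts is usually implemented as small-cell suppression when tabulating results. The sketch below shows the idea; the threshold of 5 and the function name are assumptions for illustration (the slide gives no figure), and it does not attempt secondary suppression of cells that could be derived from totals.

```python
from collections import Counter


def tabulate_with_suppression(values, threshold=5):
    """Count occurrences and mask any cell smaller than the threshold.

    Suppressed cells are returned as None so that a reader cannot tell
    whether the true count was 1, 2, 3 or 4. The threshold of 5 is an
    assumed value, not one stated in the source.
    """
    counts = Counter(values)
    return {
        category: (n if n >= threshold else None)
        for category, n in counts.items()
    }
```

For example, tabulating diagnoses by small area with this function would publish a count of 12 as it stands but replace a count of 3 with a suppressed marker.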
37Types of research supported by data linkage
- Demonstrating and explaining variation in health service utilisation with a view to developing testable hypotheses
- Improving capacity for (and efficiency of) clinical trials
- Improving drug and device safety (long term follow-up)
- Population health studies and evaluation of policy initiatives
- Hybrid cohort studies using traditional recruitment and e-approaches
- Health system and biomedical modelling
38Thank you
40Clinical trials: linked data
- Quicker and more accurate feasibility studies, searching multiple databases for inclusion/exclusion criteria
- Development of methods for electronic enhanced recruitment (EER) of participants
- Testing efficacy of EER: an RCT within an RCT
- Provision of long term outcomes
41Identifying suitable trial participants: Site screening
Sample population (n = 6474)
- Excluded (n = 6409)
- Does not have type 2 diabetes (n = 6222)
- Is <18 years (n = 0)
- Not receiving 1 or 2 OADs for 2 months (n = 171)
- The subject's most recent HbA1c is not 7.5 and <11 (if on mono therapy) or 7 and <10 (if on combo therapy) and do they have a BMI <40 kg/m2 (n = 6)
- Smoked within the last 6 months (n = 3)
- The subject has clinically significant active pulmonary disease (excluding asthma) or cardiovascular disease (unstable angina within the last 6 months or MI within the last 12 months and/or heart failure NYHA class I to IV) (n = 6)
- The subject has proliferative retinopathy or maculopathy requiring acute treatment (n = 0)
- The subject has uncontrolled / untreated hypertension ( 180/100 mmHg) (n = 1)
- The subject has had treatment with systemic steroids within the past 2 months prior to screening (n = 0)
Potential participants identified (n = 65)
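As a quick arithmetic check, the exclusion counts on this slide can be tallied against the totals. The short labels used as dictionary keys are paraphrases introduced here for readability; the numbers are those on the slide.

```python
# Exclusion counts as listed on the slide (labels abbreviated for this sketch).
exclusions = {
    "no type 2 diabetes": 6222,
    "under 18 years": 0,
    "OAD criterion not met": 171,
    "HbA1c / BMI criterion not met": 6,
    "smoked within last 6 months": 3,
    "pulmonary or cardiovascular disease": 6,
    "retinopathy or maculopathy": 0,
    "uncontrolled hypertension": 1,
    "recent systemic steroids": 0,
}

total_screened = 6474
excluded = sum(exclusions.values())      # total of all exclusion reasons
remaining = total_screened - excluded    # potential participants

print(f"Excluded: {excluded}, potential participants: {remaining}")
```

The individual reasons sum to the stated 6409 exclusions, leaving 65 potential participants, which suggests each person was counted under a single exclusion reason.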
42An RCT within an RCT
- Test the hypothesis that electronic enhanced recruitment (EER) works
- More patients entered
- Patients recruited more quickly
- Cost effective
- Cluster RCT design
- Research practices randomised into intervention (I) and control (C) groups
- EER instituted in I Group
43Example of HSR using linked HIRU data (with Cannings, Butler, Dunstan; Cardiff)
- Are patients who are not prescribed an antibiotic for an acute Respiratory Tract Infection (RTI) at a higher risk of developing a complication than those who are prescribed an antibiotic?
- Cohort: Patients diagnosed with a first episode of an acute RTI in 24 general practices
- Exposure: Antibiotics prescribed at first presentation of RTI
- Outcome: Complications diagnosed in Primary and Secondary Care
46Can routinely collected, electronically stored
data be used for health technology assessment by
randomised controlled trial?
- We repeated the analysis of four exemplar RCTs using data extracted from local electronic data systems
- Studies were small, multicentre HTAs in South Wales addressing four different technologies (open access to outpatients; investigation of sleep apnoea; autologous blood transfusion; surgery for incontinence)
- Funded by the HTA Programme
47Summary of exemplar studies
48Data sources
- Central returns: PEDW
- Hospital information systems: PAS, Pathology, Radiology
- Clinical system: GeneCIS (symptoms, signs, diagnoses, interventions)
49Conclusions
- Routinely collected data can support RCTs if clinically rich, and held in electronic form
- Patient and professional preference would still need to be collected
- Data availability, validity and standardisation must be improved
- Costs would be less, and larger trials could be run
- Williams JG et al. The value of routine data in health technology assessment: can randomised trials rely on existing electronic data? Health Technology Assessment 2003; vol 7, no 6
- Cohen et al. Estimating the marginal value of better research output: Designed vs routine data in randomised controlled trials. Health Economics 2003;12:959-74
50Can outcomes be monitored using clinical data as
a proxy?
- Yes, if clinical data is captured in structured form in sufficient detail
- Symptoms and signs can be used as proxy measures for generic and disease specific HRQL measures
- Assessment of the usefulness and cost in routine practice needed
- Hutchings H, Cheung WY, Williams JG et al. International Journal of Health Technology Assessment 2005
51Longitudinal Tracking: Patient Journey Analysis
- Being able to track anonymised individuals across multiple datasets longitudinally has huge potential benefits
- Understanding complex NHS and SS care packages
- Economic modelling
- Effects of interventions in one sector on others
- More comprehensive and long term follow up
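Tracking an anonymised individual across datasets amounts to grouping events by the linking field and ordering them in time. A minimal sketch, assuming each dataset is a list of rows carrying a hypothetical `alf`, `date` and `event` field (these names are not from the source):

```python
from collections import defaultdict
from datetime import date


def build_journeys(datasets):
    """Merge events from several datasets into one timeline per ALF.

    `datasets` maps a source name (e.g. "GP", "inpatient") to a list of
    rows; each row is a dict with "alf", "date" and "event" keys.
    """
    journeys = defaultdict(list)
    for source, rows in datasets.items():
        for row in rows:
            journeys[row["alf"]].append((row["date"], source, row["event"]))
    for events in journeys.values():
        events.sort()  # chronological order, ties broken by source name
    return dict(journeys)


example = {
    "inpatient": [{"alf": 1, "date": date(2005, 3, 10), "event": "emergency admission"}],
    "GP": [{"alf": 1, "date": date(2005, 3, 1), "event": "consultation"},
           {"alf": 2, "date": date(2005, 4, 2), "event": "prescription"}],
    "social services": [{"alf": 1, "date": date(2005, 3, 20), "event": "home care package"}],
}
journeys = build_journeys(example)
```

Here `journeys[1]` lists the GP consultation, the admission and then the home care package in date order, which is the kind of cross-sector view the slide describes.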
52Population health studies
- Electronic cohort studies (eCohort)
- Suitable when exposures and outcomes are routinely collected
- Electronically enhanced cohorts: a hybrid between a traditional cohort and an eCohort, with reduced costs
- Case series, case control studies, etc.
- Evaluation of policy initiatives: interrupted time series
53Future Research plans
- Work collaboratively with thematic research networks in Wales
- Develop/support major research platforms
- Public Health Centre of Excellence
- MRC Centre for Ageing Research
- NIHR Injury Prevention
- Wellcome Trust LADA Cohort
- WORD/RCs: Environments for Healthy Living, a Family Cohort Study on the impact of the changing social, physical and technological environment on health
- Cost effectiveness of conducting national clinical audits using routine data (Health Foundation)
54I could go on and on . . . But I won't! Thank you!
55Demonstrating and explaining variation in health service utilisation: examples using record linkage
- Radiology rates in A&E
- Hospital admissions from ankle fractures
- Co-morbidity adjustment for outcomes
- MI admissions
- Length of stay: stroke admissions
58 2-way linked data: admission rates for ankle fractures by hospital
59Helping to understand trends in emergency
attendances and admissions
62Emergency Hospital Admission rates for Swansea
(LSOA)
Demand on NHS for Emergency Admissions for
2004-2005
63 Wider area comparisons LOS variability
64Outcome Measurement
- WAG is moving to measuring/commissioning on outcomes
- Important that measures used are robust and fair
- HIRU is assisting with methodological developments in outcome measurement (SLIM project)
- Reconstructing QoL measures from routine data
65Co-morbidity adjustment for outcomes
- Co-morbidity affects outcomes
- Are co-morbidities reliably recorded on hospital discharge data?
- What is the variation in frequency of co-morbidity recording by hospitals?
- What is the frequency of COPD in subsequent admissions where listed as a co-morbid factor in primary admission with an MI?
67Frequency of co-morbidity recording in hospital
discharges
- Variation in recording between trusts: no logical reasons for this, a data quality issue
- 80% likelihood of subsequent mention of COPD
- Could be improved by primary/secondary care data linkage
- import co-morbidities from primary care
- severity and diagnostic validation from pathology
68Majority Use NHS Trusts in Wales: Admissions and Outpatients
69An example of collaborative operational research
- Clinicians/NHS would like to know the likely impact of bowel screening on workload
- Collaboration between Screening Services / WCISU / HIRU
- Plan to link anonymised cancer registry data with utilisation datasets in primary and secondary care
- Determine service utilisation by stage of cancer
- Model likely impact on service workload
- Plan services to anticipate needs
70 3-way linkage: Stroke Survival Rates
71Mapping and GIS applications
- The physical and built environment influences health
- An area where local government can strongly influence health
- An area where there is tremendous opportunity for collaborative work and research between health and local government
- Development of mapping, GIS and increasing numbers of ecological datasets offers huge potential
72The Built Environment and Health
- How does the design of the built environment, planning and land use policies influence health through:
- Providing equitable access to services
- Supporting physical activity through walking and cycling
- Providing safe environments for pedestrians
- Providing safe and exciting play areas for children
- And many more.
73Mastermap topographic layer
74Mastermap premise types from address layer
75Example of traffic calming distribution
76Social equity in the provision of traffic calming
77Primary School Entry Children Posterior mean
relative risks for overweight children
Non-Spatial Model
78Percentage of overweight children by fifth of deprivation, LSOA, WIMD 2005, n = 13,416
79Collaborative research with City and County of
Swansea
- Development and testing of automatic pedestrian activity devices
- Studies in pedestrian safety
- Evaluating interventions aimed at increasing physical activity
- Use of GIS datasets to support research into environmental influences on health
- Playground distribution
- Others.
82Distance to Play Areas in Swansea
(m)
2.3km
83Example of research question using linked data
- Are patients who are not prescribed an antibiotic for an acute Respiratory Tract Infection (RTI) at a higher risk of developing a complication than those who are prescribed an antibiotic?
- Cohort: Patients diagnosed with a first episode of an acute RTI in 24 general practices within Swansea LHB, in 2005.
- Exposure: Antibiotics prescribed at first presentation of RTI
- Outcome: Complications (quinsy, pneumonia etc.) diagnosed in Primary and Secondary Care
84Linkage
Hospital data