Case Study for Clinical Relevancy: Asthma

Slides: 42
Provided by: nancyb70
Learn more at: https://www.i2b2.org
Transcript and Presenter's Notes

1
Case Study for Clinical Relevancy: Asthma
Scott T. Weiss, M.D., M.S.
Professor of Medicine, Harvard Medical School
Director, Center for Genomic Medicine
Director, Program in Bioinformatics
Associate Director, Channing Laboratory
Brigham and Women's Hospital, Boston, MA
2
Outline
  • Context: focus on process and data
  • Overview of Asthma DBP
  • Smoking as an example of the data issues
  • Predicting COPD in those with asthma
  • Predicting asthma exacerbations
  • Genetic prediction of asthma exacerbations:
    current status
  • DNA collection
  • Lessons Learned
  • Conclusions

3
Context
  • Channing Lab: extensive genetics and
    pharmacogenetics resources focused on airways
    diseases
  • Faculty with clinical, epidemiology, genetic, and
    bioinformatics training and experience
  • Multidisciplinary research with a collaborative
    track record
  • Good i2b2 driver from bench to clinic
  • Strong focus and direction for Cores

4
Broad Goals of Channing Program in Predictive
Medicine
  • Genetic variation → clinical practice
  • → Disease risk (asthma diagnosis)
  • → Natural history (exacerbations)
  • → Individual response to medication
    (pharmacogenetics)
  • Develop predictive tests (genetic and nongenetic)
    in Channing populations
  • Validate these tests in Partners asthma cohort
    (PAC) at least as proof of concept

5
I2B2 Airways DBP Overview
6
Before we start
  • Numerous important covariates
  • e.g. age, tobacco, comorbidities, medications
  • Adjust outcomes for covariates
  • Some (e.g. age, gender, Dx, encounter) readily
    available
  • Obtained through Core 4
  • Others require substantial effort, e.g.
    medications, tobacco use, comorbid conditions
  • Collaboration - NLP experts in Core 1

7
Phenotypes from text
  • Extract specific data items
  • Medication
  • Smoking status
  • Diagnoses (Co-morbidity)
  • Extract findings to assist with case selection
  • Extract findings to assist with clinical
    predictions

8
Smoking Status - Examples
Smoker
Non-Smoker
Past Smoker
???
Hard to pick
Hard to pick
9
Smoking - Text Processing
Manually classified
10
Smoking Status
Preliminary results
  • Raw sample: 20,000 reports
  • Feature extraction: >3000
  • Feature selection: 25 - 1000
  • Gold standard sample: 2,800 cases
  • Correct classification rate: 46 - 81% (compared to
    gold standard)
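
The correct classification rate above is the fraction of reports whose predicted smoking status matches the gold-standard label. A minimal sketch of that pipeline shape, assuming bag-of-words token counts as features and frequency-based selection; the slides do not say which features or classifier were actually used, and all function names here are hypothetical.

```python
import re
from collections import Counter


def extract_features(report):
    """Bag-of-words token counts for one report (an assumed feature type)."""
    return Counter(re.findall(r"[a-z]+", report.lower()))


def select_features(feature_counts, k):
    """Keep the k most frequent tokens across all reports (assumed criterion)."""
    total = Counter()
    for counts in feature_counts:
        total.update(counts)
    return {token for token, _ in total.most_common(k)}


def classification_rate(predicted, gold):
    """Fraction of predicted labels that agree with the gold standard."""
    if len(predicted) != len(gold) or not gold:
        raise ValueError("predicted and gold must be non-empty and equal length")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)
```

Varying `k` in `select_features` corresponds to the 25 - 1000 feature range on the slide; the rate would be recomputed for each setting.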

11
Smoking Status
Preliminary results
Baseline performance
Increasing and combining features should improve
performance
12
Data Extraction
Data Mining Pipeline
13
Asthma Preceding COPD
  • Significant overlap of asthma and COPD diagnoses
  • Common denominator: smoking
  • Asthma is known to precede and predict the
    development of COPD independent of smoking
  • Could we develop a multivariate clinical
    predictor that would predict which asthmatics
    would get COPD?

14
Study Design
  • Source: Partners Healthcare Research Patient
    Data Repository (RPDR).
  • RPDR: clinical repository (MGH, BWH, etc.) for
    researchers.
  • Training: 9349 asthmatics (843 COPD, 8506
    controls), first encounter 1988-1998.
  • Test: a future set of 992 asthmatics (46 COPD,
    946 controls), first encounter 1999-2002.
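
The design above is a temporal split: the model is trained on patients first seen in an earlier window and tested on patients first seen later. A minimal sketch, assuming each patient record carries a first-encounter year and a COPD flag; the `Patient` class and its field names are hypothetical and not taken from the RPDR schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    patient_id: str            # hypothetical identifier
    first_encounter_year: int
    has_copd: bool


def temporal_split(patients, train_years=(1988, 1998), test_years=(1999, 2002)):
    """Split by first-encounter year; both year ranges are inclusive."""
    train = [p for p in patients
             if train_years[0] <= p.first_encounter_year <= train_years[1]]
    test = [p for p in patients
            if test_years[0] <= p.first_encounter_year <= test_years[1]]
    return train, test


def case_control_counts(patients):
    """Return (COPD cases, controls) for a set of asthmatics."""
    cases = sum(p.has_copd for p in patients)
    return cases, len(patients) - cases
```

Applied to the cohort on this slide, `case_control_counts` would be expected to give (843, 8506) for training and (46, 946) for test.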

15
Data Collection
  • Criteria: patients observed for at least 5
    years, at least 18 at the first encounter, and
    race, sex, height, weight, and smoking available.
  • Comorbidities: International Classification of
    Diseases, 9th Revision (ICD-9) codes as admission
    diagnosis or ER primary diagnosis (104)
  • COPD: ICD-9 code for chronic bronchitis,
    emphysema, or chronic airways obstruction, not
    otherwise specified.
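
The inclusion criteria and COPD definition above can be sketched as two predicates. This assumes the three COPD diagnoses correspond to ICD-9 categories 491 (chronic bronchitis), 492 (emphysema), and 496 (chronic airway obstruction); the slides name the diagnoses but not the codes, and the record field names are hypothetical.

```python
# Assumed ICD-9 categories for the three diagnoses named on the slide.
COPD_ICD9_PREFIXES = ("491", "492", "496")

# Covariates that must be present for a patient to be included.
REQUIRED_FIELDS = ("race", "sex", "height", "weight", "smoking")


def is_eligible(record):
    """Apply the inclusion criteria to one patient record (a dict)."""
    observed_long_enough = record.get("years_observed", 0) >= 5
    adult_at_first_encounter = record.get("age_at_first_encounter", 0) >= 18
    covariates_present = all(record.get(f) is not None for f in REQUIRED_FIELDS)
    return observed_long_enough and adult_at_first_encounter and covariates_present


def has_copd(icd9_codes):
    """True if any code falls in one of the assumed COPD categories."""
    return any(code.startswith(COPD_ICD9_PREFIXES) for code in icd9_codes)
```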

16
Analysis
  • Model: A Bayesian network was generated from the
    training set of 9349 asthmatics (843 COPD, 8506
    controls) first encountered between 1988 and 1998,
    using 104 comorbidities plus race, gender, age,
    and smoking.
  • Results: The risk of COPD is modulated by
    gender, race, smoking history, and 14
    comorbidities: viral and chlamydial infections,
    diabetes mellitus, volume depletion, acute
    myocardial infarction, intermediate coronary
    syndrome, cardiac dysrhythmias, heart failure,
    acute upper respiratory infections, acute
    bronchitis and bronchiolitis, pneumonia, early or
    threatened labor, normal delivery, shortness of
    breath, respiratory distress.

17
Network Model
18
Validation
  • Propagation: a Bayesian network can compute the
    probability distribution of any variable given an
    instance of some or all of the other variables.
  • Test data: a future set of 992 asthmatics (46
    COPD, 946 controls), first encounter
    1999-2002.
  • Prediction: for each patient, predict the
    probability of COPD given the other elements in
    the network (comorbidities and demographics).
  • Validation: compare the predicted with the
    observed COPD status.
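
Propagation, as described above, means computing the probability of one variable given observed values of some others. A minimal sketch using inference by enumeration on a toy three-node network (smoking → COPD → pneumonia). The structure and every probability below are invented for illustration; they are not the network or parameters from this study, which used far more variables.

```python
from itertools import product

# Hypothetical conditional probability tables (illustrative values only).
P_SMOKING = {True: 0.3, False: 0.7}
P_COPD_GIVEN_SMOKING = {True: 0.2, False: 0.05}    # P(copd=True | smoking)
P_PNEUMONIA_GIVEN_COPD = {True: 0.4, False: 0.1}   # P(pneumonia=True | copd)


def joint(smoking, copd, pneumonia):
    """Joint probability of one full assignment under the toy network."""
    p = P_SMOKING[smoking]
    p_copd = P_COPD_GIVEN_SMOKING[smoking]
    p *= p_copd if copd else 1 - p_copd
    p_pneumonia = P_PNEUMONIA_GIVEN_COPD[copd]
    p *= p_pneumonia if pneumonia else 1 - p_pneumonia
    return p


def posterior_copd(evidence):
    """P(copd=True | evidence), summing over all unobserved variables."""
    numerator = 0.0
    denominator = 0.0
    for smoking, copd, pneumonia in product([True, False], repeat=3):
        assignment = {"smoking": smoking, "copd": copd, "pneumonia": pneumonia}
        if any(assignment[name] != value for name, value in evidence.items()):
            continue  # inconsistent with the observed evidence
        p = joint(smoking, copd, pneumonia)
        denominator += p
        if copd:
            numerator += p
    return numerator / denominator
```

For example, `posterior_copd({"smoking": True, "pneumonia": True})` conditions on two observed variables at once. Enumeration is only practical for tiny networks; a network with over a hundred comorbidities would need a dedicated inference algorithm.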

19
Predictive Validation
20
One variable at a time
21
Asthma Exacerbations
  • Asthma attacks involve worsening of asthma
    symptoms including bronchoconstriction and
    inflammatory response
  • Major cause of morbidity and mortality in asthma
  • 11.7 million Americans have an exacerbation every
    year (3.9 million children)
  • In US children, exacerbations are the third
    leading cause of hospitalizations (198,000
    occurrences per year)
  • Cost of asthma exacerbations: $4 billion in the
    US, $20 million at Partners

22

23

24

25
RPDR Exacerbation Prediction

26
Genetic Prediction of Asthma Exacerbation
  • Objective
      • Predict asthma exacerbation from genetic data
  • Subjects
      • 290 CAMP participants
      • Not on steroids
      • Followed for 10 years
      • Have genetic data available
  • Phenotype
      • Case: reported overnight hospitalization(s)
        (n = 83)
      • Control: no overnight hospitalizations or ER
        visits (n = 207)
  • Genotype
      • 2443 SNPs from 349 candidate genes
      • In Hardy-Weinberg equilibrium among controls
      • Minor allele frequency > 0.05
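
The two genotype filters above can be sketched from genotype counts at a single biallelic SNP. This assumes a chi-square goodness-of-fit test for Hardy-Weinberg equilibrium at a 0.05 significance level; the slides do not state which test or threshold was used. In this study the counts passed to the equilibrium check would come from controls only.

```python
# Chi-square critical value for 1 degree of freedom at alpha = 0.05 (assumed cutoff).
CHI2_CRITICAL_1DF_05 = 3.841


def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Frequency of the rarer allele, from counts of genotypes AA, AB, BB."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    freq_a = (2 * n_aa + n_ab) / total_alleles
    return min(freq_a, 1 - freq_a)


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic comparing observed counts with HWE expectations."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip genotype classes with zero expected count (monomorphic SNPs).
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def passes_qc(n_aa, n_ab, n_bb, maf_threshold=0.05):
    """True if the SNP clears both the MAF and the HWE filter."""
    return (minor_allele_frequency(n_aa, n_ab, n_bb) > maf_threshold
            and hwe_chi_square(n_aa, n_ab, n_bb) < CHI2_CRITICAL_1DF_05)
```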

27
Exacerbation Model
132 of 2443 SNPs in 55 of 349 genes predict
exacerbation
28
Validation
  • Method: prediction on fitted values
  • Result: area under the ROC curve (AUROC) is 0.97
  • AUROC measures accuracy as a trade-off between
    sensitivity and specificity

AUROC        Rating
0.5 - 0.6    Fail
0.6 - 0.7    Poor
0.7 - 0.8    Fair
0.8 - 0.9    Good
0.9 - 1.0    Excellent
AUROC = 0.97
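
AUROC equals the probability that a randomly chosen case receives a higher predicted score than a randomly chosen control, with ties counted as half. A minimal sketch of that pairwise computation, plus a lookup for the rating bands in the table; treating each band's lower bound as inclusive and scores below 0.5 as "Fail" are assumptions, since the table does not say.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise case/control comparisons."""
    cases = [s for s, y in zip(scores, labels) if y]
    controls = [s for s, y in zip(scores, labels) if not y]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5  # ties count as half
    return wins / (len(cases) * len(controls))


def rating(auc):
    """Map an AUROC to the rating bands (lower bounds assumed inclusive)."""
    bands = [(0.9, "Excellent"), (0.8, "Good"), (0.7, "Fair"), (0.6, "Poor")]
    for lower_bound, label in bands:
        if auc >= lower_bound:
            return label
    return "Fail"
```

The pairwise loop is quadratic in sample size, which is fine for a few hundred subjects; a rank-based formula gives the same value more efficiently on large data.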
29
Cross-Validation
  • Method: 20-fold cross-validation to test
    robustness
  • (1) Data are split into 20 groups
  • (2) One group is held out as the independent set
    and the remaining 19 are used to quantify the model
  • (3) Step (2) is repeated until each group has
    served as the independent set
  • Result: AUROC is 0.84 (good)

AUROC = 0.84
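
The cross-validation procedure above can be sketched as follows. Every subject gets a predicted score from a model that never saw that subject during fitting, which is why the resulting AUROC is typically lower than one computed on fitted values. The `fit` and `predict` callables are hypothetical placeholders for whatever model is being evaluated, and the random shuffling with a fixed seed is an assumption.

```python
import random


def k_fold_indices(n_samples, k=20, seed=0):
    """Shuffle sample indices and deal them into k roughly equal groups."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validated_scores(samples, labels, fit, predict, k=20, seed=0):
    """Return one out-of-fold prediction per sample.

    fit(train_samples, train_labels) -> model and predict(model, sample) -> score
    are supplied by the caller (hypothetical interfaces).
    """
    scores = [None] * len(samples)
    for held_out in k_fold_indices(len(samples), k, seed):
        held_out_set = set(held_out)
        train_x = [x for i, x in enumerate(samples) if i not in held_out_set]
        train_y = [y for i, y in enumerate(labels) if i not in held_out_set]
        model = fit(train_x, train_y)
        for i in held_out:
            scores[i] = predict(model, samples[i])
    return scores
```

With 290 subjects and k = 20, each held-out group has 14 or 15 subjects.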
30
Partners Asthma DNA collection 1
  • Recruit Partners asthma patients
  • Partners Asthma Center, NWH, MGH
  • High quality spirometric phenotyping
  • Blood for DNA extraction and storage
  • Children and adults
  • High cost (>$1000/subject)
  • Low intensity: in 6 months, only 100 subjects
    recruited
  • Doctors and patients need education

31
Partners Asthma DNA collection 2
  • Recruit Partners asthma cohort patients
  • Leverage CRIMSON blood samples
  • Leverage data mart for phenotype data
  • Blood for DNA extraction and storage
  • Children and adults, cases and controls
  • Low cost (<$30/subject)
  • High intensity: in 9 months, >3000 subjects recruited

32
Figure 1: Data Flow for Asthma DBP
Channing: sends ADMPN to RPDR
RPDR: converts ADMPN to MRN, sends to pathology
Pathology (Crimson): MRN → Crimson ID; ADMPN sent
back to Channing with sample for DNA extraction
Figure 1 Legend: Deidentified data file analyzed
by Channing; subjects for DNA collection selected.
File sent to RPDR, converted back to MRN, and sent
to Crimson. Samples identified and given Crimson
ID; ADMPN and sample sent back to Channing.
33
Recruitment for DBP from Crimson at BWH: Asthma
Cases by Utilization and Race
34
Recruitment for DBP from Crimson at BWH: Asthma
Cases and Controls by Race
35
Summary of Samples to 04/07/08
36
Lessons learned 1
  • Get what you ask for
  • Regular meetings, regular meetings
  • Negotiate your demands
  • Tools are not enough
  • Leverage your peers
  • Recruiting patients is hard work
  • IRB is hard work

37
Lessons learned 2
  • You can never have enough statistics or
    bioinformatics
  • Genotyping and its technologies are secondary
  • The RPDR data are dirty!
  • Listen to Shawn
  • Be flexible

38
Summary: Airways disease as a driver for i2b2
  • Typical complex disease challenge
  • Big impact on health care system
  • Potential for large clinical impact
  • Core 1: Extracting phenotypes from free text;
    statistical models
  • Core 2: Viewer for CRC
  • Core 4: Data provisioning

39
Conclusions
  • The stronger the existing program, the more
    successful the I2B2 collaboration
  • Communication is key
  • Fit the question to the data, not the other way
    around
  • Data access will be an issue for the future

40
Collaborators (and what they did)
  • Scott, Zak, John, and Susanne: money, project
    management, IRB, and big picture
  • Ross: Channing bioinformatics, file structures,
    geek-to-geek translation with the cores, beta
    testing, 850 collection, IRB, links to other
    genetic bioinformatics tools and projects
  • Shawn and Vivian: asthma and control data mart
  • Anne, LJ, James: nongenetic predictors in CAMP
  • Marco and Blanca: nongenetic predictors in PAC
  • Marco and Blanca: genetic predictors in CAMP
  • Marco and Blanca: genetic predictors in PAC
  • Lynn: Crimson

41
Acknowledgments
  • Ross Lazarus, Susanne Churchill
  • Blanca E. Himes, Anne Fuhlbrigge
  • Marco F. Ramoni, LJ Wei
  • Isaac Kohane, James Sigornivitch
  • Shawn Murphy, Lynn Bry