Title: Improving Access to Clinical Data Locked in Narrative Reports: An Informatics Approach
1Improving Access to Clinical Data Locked in
Narrative Reports: An Informatics Approach
Division of Biomedical Informatics, University of
California, San Diego
2Overview
- The promise of natural language processing (NLP)
- Challenges of developing NLP in the clinical domain
- Challenges in applying NLP in the clinical domain
- Improving access to text through NLP resources
3The promise of NLP
- Vast growing amounts of clinical text
- Rich in information
- Patient care
- Evaluation/QC
- Comparative effectiveness research
- Epidemiology
- Locked in free text
- Natural language processing can help unlock that information
- Encouraging NLP success stories
4The promise of NLP
Results ... higher sensitivity and lower
specificity compared with patient safety
indicators based on discharge coding.
- NLP captures
- Renal failure
- Pulmonary embolism
- Deep vein thrombosis
- Sepsis
- Pneumonia
- Myocardial infarction
The promise of natural language processing ...
may be closer than ever.
5- Other promising NLP accomplishments ...
- Smoking status (Savova, Hazlehurst)
- Peripheral arterial disease (Pathak)
- Medication extraction (Uzuner)
- Pneumonia (Chapman)
- Colonoscopy quality metrics (Harkema)
- Breast cancer recurrence (Carrell)
- Colorectal cancer screening behavior (Denny)
- Rheumatoid arthritis (Zeng)
6Overview
- The promise of natural language processing (NLP)
- Challenges of developing NLP in the clinical domain
- Challenges in applying NLP in the clinical domain
- Improving access to text through NLP resources
7NLP Success
- "Fresh off its butt-kicking performance on
Jeopardy!, IBM's supercomputer 'Watson' has
enrolled in medical school at Columbia
University." New York Daily News, February 18,
2011
- "IBM's computer could very well herald a whole
new era in medicine." ComputerWorld, February 17,
2011
Dr. Watson??
8Clinical NLP Since 1960s
- Why has clinical NLP had little impact on
clinical care?
9Barriers to Development
- Sharing clinical data difficult
- Have not had shared datasets for development and evaluation
- Modules trained on general English not sufficient
- Insufficient common conventions and standards for annotations
- Data sets are unique to a lab
- Not easily interchangeable
10- Limited collaboration
- Clinical NLP applications are silos and black boxes
- Have not had open source applications
- Reproducibility is formidable
- Open source release not always sufficient
- Software engineering quality not always great
- Mechanisms for reproducing results are sparse
11Overview
- The promise of natural language processing (NLP)
- Challenges of developing NLP in the clinical domain
- Challenges in applying NLP in the clinical domain
- Improving access to text through NLP resources
12Security and Privacy Concerns
- Clinical texts have many patient identifiers
- 18 HIPAA identifiers
- Names
- Addresses
- Items not regulated by HIPAA
- tight end for the Steelers
- Unique cases
- 50s-year-old woman who is pregnant
- Sensitive information
- HIV status
Institutions are reluctant to share data
13- Lack of user-centered development and scalability
- Perceived cost of applying NLP outweighs the
perceived benefit (Len D'Avolio)
14Overview
- The promise of natural language processing (NLP)
- Challenges of developing NLP in the clinical domain
- Challenges in applying NLP in the clinical domain
- Improving access to text through NLP resources
16Access to Resources for Developing NLP Algorithms
17Resources for NLP Developers
Knowledge Bases
Domain Schema Ontology
Modifier Ontology
Clinical Data
Annotations
Modifiers of clinical elements
Linguistic representation of clinical elements
Annotation Environment
Disease: colon cancer; Experiencer: family;
Negation: no; Historical: yes
Patient denies a family history of colon cancer
Evaluation
Melissa Tharp
18Schema Ontology Elements
19Schema Ontology Relationships
20Modifier Ontology
- Modifiers are important for interpreting text
- Chest radiograph confirms pneumonia
- Family history of pneumonia
- No evidence of pneumonia
Affirmation/negation, Uncertainty, Experiencer,
Historical/Recent, Severity
Allowable modifiers for each clinical element
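The three pneumonia sentences above can be used to sketch how modifiers change the interpretation of a clinical element. The following is a minimal illustration, not the modifier ontology or any existing tool: the trigger phrases in TRIGGERS and the function assign_modifiers are hypothetical, and the rule (a trigger anywhere before the concept applies to it) is a deliberate simplification.

```python
import re

# Hypothetical trigger phrases chosen for illustration; a real modifier
# ontology would list many more linguistic expressions per modifier type.
TRIGGERS = {
    "negation": [r"\bno evidence of\b", r"\bdenies\b", r"\bno\b"],
    "experiencer": [r"\bfamily history of\b", r"\bmother\b", r"\bfather\b"],
    "historical": [r"\bhistory of\b"],
}


def assign_modifiers(sentence, concept):
    """Return modifier values for `concept`, or None if it is not mentioned.

    Simplifying assumption: any trigger found in the text before the
    concept mention is taken to modify that concept.
    """
    text = sentence.lower()
    position = text.find(concept.lower())
    if position == -1:
        return None
    prefix = text[:position]
    found = {
        name: any(re.search(pattern, prefix) for pattern in patterns)
        for name, patterns in TRIGGERS.items()
    }
    return {
        "concept": concept,
        "negation": "yes" if found["negation"] else "no",
        "experiencer": "family" if found["experiencer"] else "patient",
        "historical": "yes" if found["historical"] else "no",
    }


for sentence in (
    "Chest radiograph confirms pneumonia",
    "Family history of pneumonia",
    "No evidence of pneumonia",
):
    print(sentence, "->", assign_modifiers(sentence, "pneumonia"))
```

The same concept string yields three different interpretations, which is why the allowable modifiers need to be defined for each clinical element.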
21Modifier Ontology
Types of modifiers
Linguistic expressions
Actions
Translations
22Schema Ontology Imports Modifier Ontology
- Medications
- Type
- Dose
- Frequency
- Route
- Diagnosis
- Negation
- Uncertainty
- Severity
- History
- Experiencer
Consistent with other models: clinical element
models, cTAKES type system, common model
23Domain Ontology for NLP
- Instance of schema ontology
- Clinical elements from a particular domain
24Synonyms, Misspellings, Regular Expressions
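One way to picture how synonyms, misspellings, and regular expressions could be combined for a single domain concept is sketched below. The lexicon entry PNEUMONIA and its variants are made up for illustration and are not taken from the domain ontology described in the talk.

```python
import re

# Hypothetical lexicon entry; the variants are illustrative only.
PNEUMONIA = {
    "synonyms": ["pneumonia", "pna", "lung infection"],
    "misspellings": ["pnuemonia", "pneumonai"],
    "regexes": [r"pneumon\w*"],
}


def compile_entry(entry):
    """Build one case-insensitive pattern from all variants of a concept."""
    literals = [re.escape(term) for term in entry["synonyms"] + entry["misspellings"]]
    alternatives = "|".join(literals + entry["regexes"])
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)


def find_mentions(text, entry):
    """Return every surface form in `text` that matches the concept entry."""
    return [match.group(0) for match in compile_entry(entry).finditer(text)]


print(find_mentions("CXR shows pnuemonia. PNA likely; pneumonitis not excluded.", PNEUMONIA))
```

The broad regular expression also catches "pneumonitis", a different condition, which illustrates why each kind of variant is worth reviewing before it is added to a lexicon.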
25Resources for NLP Experts
Schemas
- Lack of shareable data is a barrier
- University of Pittsburgh Repository
- 111,045 reports of 9 types
- 600 users
- No longer available
- MT Samples
- 2,300 reports from MTSamples.com
- De-identified
Clinical Data
Annotations
Annotation Environment
Evaluation
26Resources for NLP Experts
Schemas
- AMIA NLP Working Group
- ShARe - Sharing Annotated Resources
- 5R01GM090187 Chapman, Savova, Elhadad
- 600 clinical notes from MIMIC II repository
- Annotate disorders and modifiers
- Anatomic location
- Map to SNOMED codes
- CLEF Shared Task 2013 and 2014
- https://sites.google.com/site/shareclefehealth/
Clinical Data
Annotations
Annotation Environment
Evaluation
B South, D Mowery, S Velupillai, L Christensen, S
Meystre
27Resources for NLP Experts
Schemas
- Distributed annotation in secure environment
Annotator Registry
Clinical Data
eHOST
Annotation Admin
Annotations
Annotation Environment
Web application iDASH cloud
Client app
Evaluation
VA, SHARP, and NIGMS: S DuVall, B South, B
Adams, G Savova, N Elhadad, H Hochheiser
28Annotator Registry
- Annotators
- Enlist for annotation
- Certify for annotation tasks
- Personal health information
- Part-of-speech tagging
- UMLS mapping
- Set pay rate
- NLP Admins
- Search for annotators
- http://nlp-ecosystem.ucsd.edu/annotators
291. Assign annotators to a task
302. Create a Schema
313. Assign users and set time expectations
324. Keep track of progress
33Resources for NLP Experts
Schemas
- Distributed annotation in secure environment
Annotator Registry
Clinical Data
eHOST
Annotation Admin
Annotations
Annotation Environment
Web application iDASH cloud
Client app
Evaluation
34Resources for NLP Experts
Schemas
- Compare output of NLP annotators
- NLP system vs human annotation
- View annotations
- Calculate outcome measures
- Drill down to all levels of annotation
- Perform error analysis
Clinical Data
Annotations
Annotation Environment
Evaluation
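Comparing an NLP system with human annotation and calculating outcome measures can be sketched as set operations over annotations. This is a minimal example under stated assumptions: each annotation is represented as a hypothetical (document, start, end, label) tuple, only exact matches count as agreement, and compare_annotations is an illustrative name, not part of the evaluation tool shown in the talk.

```python
def compare_annotations(reference, system):
    """Score system annotations against reference (human) annotations.

    Both arguments are iterables of hashable annotations; only exact
    matches are counted as true positives.
    """
    reference, system = set(reference), set(system)
    true_positives = reference & system
    false_positives = system - reference
    false_negatives = reference - system
    precision = len(true_positives) / len(system) if system else 0.0
    recall = len(true_positives) / len(reference) if reference else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        # Disagreements are returned so they can be reviewed one by one.
        "false_positives": sorted(false_positives),
        "false_negatives": sorted(false_negatives),
    }


human = [("doc1", 10, 19, "pneumonia"), ("doc1", 40, 45, "fever"), ("doc2", 5, 10, "cough")]
nlp = [("doc1", 10, 19, "pneumonia"), ("doc2", 5, 10, "cough"), ("doc2", 30, 38, "dyspnea")]
print(compare_annotations(human, nlp))
```

Returning the false positives and false negatives alongside the scores is what makes drill-down and error analysis possible.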
35Select Classifications to View
Document annotations
Outcome Measures for Selected Annotations
Report List
Attributes for Selected Annotation
Relationships for Selected Annotation
VA and ONC SHARP: Christensen, Murphy, Frabetti,
Rodriguez, Savova
36Access to Information in Text
37User's Concepts: Cough, Dyspnea, Infiltrate on
CXR, Wheezing, Fever, Cervical Lymphadenopathy
Controlled Vocabs: Dry cough, Productive cough,
Cough, Hacking cough, Bloody cough
Which concepts?
38User's Concepts: Cough, Dyspnea, Infiltrate on
CXR, Wheezing, Fever, Cervical Lymphadenopathy
Attribute-values: Temp 38.0C, Low-grade
temperature
What values?
39Efficient Access to Information in the Patient
Chart
Family history of colon cancer
Knowledge Author
Schema Builder
Chart Review Interface
Disease: colon cancer; Experiencer: family;
Negation: no; Historical: yes
NLP Schema
Domain Ontology
40Knowledge Author
- Front end interface for users
- Back end
- Schema ontology
- Modifier ontology
- Output
- Domain ontology
- Schema for NLP system
B Scuba, F Fana, Liqin Wang, Mingyuan Zhang, Y
Liu, M Kong, F Drews
41Questions Discussion
African American Adult
42Ibuprofen
43Ibuprofen p.o.
44No family history of colon cancer
Linguistic modifiers
45Calls Voogo synonym tool
46Access Information in Patient Chart
- Navigate patient data more efficiently
- Point chart reviewer to ambiguous and contradictory information
- Reduce bias
Knowledge Author
Chart Review Interfaces
47Access Information in Patient Chart
Knowledge Author
NLP
Viz
Subjects, Diagnoses Findings, Anatomical Locations
EMR
Chart Review Interfaces
Feedback improve models
Population Patient Document Expression
User Identifies Patients Meeting Criteria
Interactive Search and Review of Clinical Records
with Multi-layered Semantic Annotation NLM
1R01LM010964-01. Chapman, Wiebe, Hwa.
48Population View
49Patient View
50Access to NLP Tools and Interfaces
51Access to NLP Tools
v3NLP (Zeng, Divita), pyConText (Chapman), RapTat
(Matheny, Gobbell)
NLP Workbench
Classifier Workbench
NLP Platform
KB
Annotations
Visualization Workbench
Mix Match
Edit
Correct
User
52TextVect
User
Select NLP Features
X N-grams
X UMLS Concepts
Part-of-speech tags
X Negation
Select Representation
Binary
X Count
tf-idf
NLP Workbench
Classifier Workbench
TextVect
Visualization Workbench
Feature Selection Algorithms
NLP Tools
Yes No No
Training Set
Yes 1 0 0 0 1 1 1
No 0 0 1 1 0 0 0
No 0 0 0 1 0 1 0
A Kumar, C Elkan, S Abdelrahman
https://github.com/abhishek-kumar/TextVect
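The three representation choices on this slide (binary, count, tf-idf) can be illustrated with a small standard-library sketch over unigram features. This is not TextVect's code or API: the function vectorize is hypothetical, tokenization is a plain whitespace split, and the idf formula log(N / df) is one common variant chosen here as an assumption.

```python
import math
from collections import Counter


def vectorize(documents, representation="count"):
    """Turn documents into feature vectors over a shared unigram vocabulary."""
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({token for doc in tokenized for token in doc})
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each token.
    doc_freq = Counter(token for doc in tokenized for token in set(doc))
    rows = []
    for doc in tokenized:
        counts = Counter(doc)
        if representation == "binary":
            row = [1 if counts[token] else 0 for token in vocabulary]
        elif representation == "count":
            row = [counts[token] for token in vocabulary]
        elif representation == "tfidf":
            row = [counts[token] * math.log(n_docs / doc_freq[token]) for token in vocabulary]
        else:
            raise ValueError(f"unknown representation: {representation}")
        rows.append(row)
    return vocabulary, rows


reports = ["cough cough fever", "no fever", "no cough"]
for name in ("binary", "count", "tfidf"):
    print(name, vectorize(reports, name))
```

Each row plays the role of one line in the training set matrix, to which a class label (Yes or No) would be attached before a classifier is trained.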
53Evaluation of TextVect
54i2b2 dataset: Micro-F-Measure
Baseline 0.71
Average 0.91
Best 0.97
TextVect 0.95
CMC dataset: Micro-F-Measure
Average 0.77
Best 0.89
TextVect 0.82
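For readers unfamiliar with the metric reported above, a micro-averaged F-measure pools true positives, false positives, and false negatives across all labels and documents before computing precision and recall. The sketch below shows that calculation; the function name and the example labels are hypothetical and are not drawn from the i2b2 or CMC data.

```python
def micro_f_measure(gold, predicted):
    """Micro-averaged F-measure for multi-label document classification.

    `gold` and `predicted` are parallel lists holding one set of labels
    per document. Counts are pooled over all documents and labels.
    """
    tp = fp = fn = 0
    for gold_labels, predicted_labels in zip(gold, predicted):
        gold_labels, predicted_labels = set(gold_labels), set(predicted_labels)
        tp += len(gold_labels & predicted_labels)
        fp += len(predicted_labels - gold_labels)
        fn += len(gold_labels - predicted_labels)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for three documents.
gold = [{"asthma", "obesity"}, {"cad"}, set()]
predicted = [{"asthma"}, {"cad", "gout"}, set()]
print(micro_f_measure(gold, predicted))
```

Because counts are pooled, frequent labels influence the micro-average more than rare ones, which is worth keeping in mind when comparing systems on skewed datasets.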
55Access to Visualizations of NLP Output
NLP Workbench
Classifier Workbench
Visualization workbench
NLP System
Visualization Workbench
Annotations
56Timeline View
Jianlin Shi, T Wang, E Shenvi, R El-Kareh, M
Tharp, R Reeves
57Access to Understanding
58Access to Understanding Clinical Notes
- Chief Complaint
- Hypoxic respiratory failure
- Major Surgical or Invasive Procedure
- Intubation.
- History of Present Illness
- 81 yo man w/ho CAD, COP, PVD, AAA xfered from OSH
for mngmt resp failure. Pt was found @ home by
EMS followign c/o 05-29 "crushing",
nonradiating SSCP. Pt diaphoretic during
transport. Sat 84-->94 on NRB. Given ASA, NT,
nebs en route to OSH where started on BIPAP and
eventually intubated. BP on arrival 240/140 so
started on NTG drip titrated up until BP fell to
90/58 resulting in IVF, dopamine. Given 80 IV
lasix. First set enzymes negative and BNP 1700.
Pt xferred for further management.
- Definitions
- Medical terms
- Acronyms/abbreviations
- Pictures
- Internet sites
- Biomedical literature
- Normal range checking
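As one illustration of the "Acronyms/abbreviations" support listed above, a note could be annotated with expansions from a glossary. This is a minimal sketch, not the tool shown in the talk: GLOSSARY and expand_abbreviations are hypothetical, the glossary covers only a few abbreviations from the example note, and it assumes each abbreviation has a single meaning, which is often untrue in clinical text.

```python
import re

# Small illustrative glossary; a real one would be far larger and
# would need context to choose among multiple possible expansions.
GLOSSARY = {
    "CAD": "coronary artery disease",
    "PVD": "peripheral vascular disease",
    "AAA": "abdominal aortic aneurysm",
    "OSH": "outside hospital",
    "EMS": "emergency medical services",
    "BNP": "B-type natriuretic peptide",
}


def expand_abbreviations(text, glossary=GLOSSARY):
    """Append the glossary expansion after each known all-caps abbreviation."""

    def replace(match):
        token = match.group(0)
        expansion = glossary.get(token)
        # Unknown abbreviations are left untouched rather than guessed.
        return f"{token} ({expansion})" if expansion else token

    return re.sub(r"\b[A-Z]{2,}\b", replace, text)


print(expand_abbreviations("man w/ CAD, PVD xfered from OSH. Sat 94 on NRB."))
```

Leaving unknown tokens such as NRB unchanged keeps the output faithful to the original note when the glossary has no entry.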
60Conclusion
- Collaborations for NLP improve ability to
- Create potentially useful resources and tools
- Provide access to
- Resources for NLP development
- Information in reports
- NLP and visualization tools
- Major challenge is applying NLP
- Future need
- More integration with other tools
- More coordination
61Acknowledgments
BLU Lab
Collaborators
- Lee Christensen
- Melissa Tharp
- Mike Conway
- Danielle Mowery
- Bill Scuba
- Milan Kovacevich
- Dieter Hillert
- Samir Abdelrahman
- Leah Willis
- Bob Angell
- Sumithra Velupillai
- Maria Kvist
- Maria Skeppstedt
- Aron Henriksson
- Brian Chapman
- David Carrell
- Sascha Dublin
- Zia Agha
- Stephane Meystre
- Scott DuVall
- Jianlin Shi
- Harry Hochheiser
- Jan Wiebe
- Rebecca Hwa
- Guergana Savova
- Noemie Elhadad
- Michael Matheny
- Rob El-Kareh
- Ruth Reeves
- Qing Zeng
- Guy Divita
- Frank Drews
62Questions and Discussion
- wendy.chapman@utah.edu