Improving Access to Clinical Data Locked in Narrative Reports: An Informatics Approach

Title: Natural Language Processing for Biosurveillance Author: Wendy Chapman Last modified by: Wendy Chapman Created Date: 5/6/2004 2:21:06 PM Document presentation ... – PowerPoint PPT presentation

1
Improving Access to Clinical Data Locked in
Narrative Reports: An Informatics Approach
  • Wendy W. Chapman, PhD

Division of Biomedical Informatics, University of
California, San Diego
2
Overview
  • The promise of natural language processing (NLP)
  • Challenges of developing NLP in the clinical
    domain
  • Challenges in applying NLP in the clinical domain
  • Improving access to text through NLP resources

3
The promise of NLP
  • Vast and growing amounts of clinical text
  • Rich in information
  • Patient care
  • Evaluation/QC
  • Comparative effectiveness research
  • Epidemiology
  • Locked in free text
  • Natural language processing can help unlock that
    information
  • Encouraging NLP success stories

4
The promise of NLP
  • Murff (2011) JAMA

Results ... higher sensitivity and lower
specificity compared with patient safety
indicators based on discharge coding.
  • NLP captures
  • Renal failure
  • Pulmonary embolism
  • Deep vein thrombosis
  • Sepsis
  • Pneumonia
  • Myocardial infarction

The promise of natural language processing ...
may be closer than ever.
5
  • Other promising NLP accomplishments ...
  • Smoking status (Savova, Hazlehurst)
  • Peripheral arterial disease (Pathak)
  • Medication extraction (Uzuner)
  • Pneumonia (Chapman)
  • Colonoscopy quality metrics (Harkema)
  • Breast cancer recurrence (Carrell)
  • Colorectal cancer screening behavior (Denny)
  • Rheumatoid arthritis (Zeng)

6
Overview
  • The promise of natural language processing (NLP)
  • Challenges of developing NLP in the clinical
    domain
  • Challenges in applying NLP in the clinical domain
  • Improving access to text through NLP resources

7
NLP Success
  • "Fresh off its butt-kicking performance on
    Jeopardy!, IBM's supercomputer 'Watson' has
    enrolled in medical school at Columbia
    University." New York Daily News, February 18,
    2011

"IBM's computer could very well herald a whole
new era in medicine." ComputerWorld, February 17,
2011
Dr. Watson??
8
Clinical NLP Since 1960s
  • Why has clinical NLP had little impact on
    clinical care?

9
Barriers to Development
  • Sharing clinical data difficult
  • Have not had shared datasets for development and
    evaluation
  • Modules trained on general English not sufficient
  • Insufficient common conventions and standards for
    annotations
  • Data sets are unique to a lab
  • Not easily interchangeable

10
  • Limited collaboration
  • Clinical NLP applications are silos and black boxes
  • Have not had open source applications
  • Reproducibility is a formidable challenge
  • Open source release not always sufficient
  • Software engineering quality not always great
  • Mechanisms for reproducing results are sparse

11
Overview
  • The promise of natural language processing (NLP)
  • Challenges of developing NLP in the clinical
    domain
  • Challenges in applying NLP in the clinical domain
  • Improving access to text through NLP resources

12
Security & Privacy Concerns
  • Clinical texts have many patient identifiers
  • 18 HIPAA identifiers
  • Names
  • Addresses
  • Items not regulated by HIPAA
  • tight end for the Steelers
  • Unique cases
  • 50-year-old woman who is pregnant
  • Sensitive information
  • HIV status

Institutions are reluctant to share data
13
  • Lack of user-centered development and scalability
  • Perceived cost of applying NLP outweighs the
    perceived benefit (Len D'Avolio)

14
Overview
  • The promise of natural language processing (NLP)
  • Challenges of developing NLP in the clinical
    domain
  • Challenges in applying NLP in the clinical domain
  • Improving access to text through NLP resources

15
(No Transcript)
16
Access to Resources for Developing NLP Algorithms
17
Resources for NLP Developers
Knowledge Bases
Domain Schema Ontology
Modifier Ontology
Clinical Data
Annotations
Modifiers of clinical elements
Linguistic representation of clinical elements
Annotation Environment
Disease: colon cancer; Experiencer: family; Negation: no; Historical: yes
Patient denies a family history of colon cancer
Evaluation
Melissa Tharp
18
Schema Ontology Elements
19
Schema Ontology Relationships
20
Modifier Ontology
  • Modifiers are important for interpreting text
  • Chest radiograph confirms pneumonia
  • Family history of pneumonia
  • No evidence of pneumonia

Affirmation/negation, Uncertainty, Experiencer, Historical/Recent, Severity
Allowable modifiers: for each clinical element
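To make the role of modifiers concrete, here is a minimal sketch of how trigger phrases to the left of a concept could assign negation and experiencer values to the three pneumonia sentences on this slide. The trigger list and the function name `modifiers_for` are hypothetical and chosen for illustration; this is a simplified trigger-matching heuristic, not the algorithm of any specific tool named in the talk.

```python
import re

# Hypothetical trigger phrases (an assumption for illustration);
# a real lexicon would be far larger and scope-aware.
TRIGGERS = {
    "negation": [r"\bno evidence of\b", r"\bdenies\b", r"\bno\b"],
    "experiencer_family": [r"\bfamily history of\b"],
}

def modifiers_for(sentence: str, concept: str) -> dict:
    """Assign modifier values to `concept` from triggers that precede it."""
    text = sentence.lower()
    position = text.find(concept.lower())
    if position < 0:
        raise ValueError(f"{concept!r} not found in sentence")
    # Only the text to the left of the concept is searched for triggers.
    left_context = text[:position]
    negated = any(re.search(p, left_context) for p in TRIGGERS["negation"])
    family = any(re.search(p, left_context) for p in TRIGGERS["experiencer_family"])
    return {
        "negation": "negated" if negated else "affirmed",
        "experiencer": "family" if family else "patient",
    }

for sentence in (
    "Chest radiograph confirms pneumonia",
    "Family history of pneumonia",
    "No evidence of pneumonia",
):
    print(sentence, "->", modifiers_for(sentence, "pneumonia"))
```

The same concept string yields three different interpretations, which is why the modifier values have to be stored alongside each clinical element.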
21
Modifier Ontology
Types of modifiers
Linguistic expressions
Actions
Translations
22
Schema Ontology Imports Modifier Ontology
  • Medications
  • Type
  • Dose
  • Frequency
  • Route
  • Diagnosis
  • Negation
  • Uncertainty
  • Severity
  • History
  • Experiencer

Consistent with other models: clinical element
models, cTAKES type system, Common model
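One way to picture a schema that lists allowable modifiers per clinical element is a simple lookup table. The sketch below copies the element types and modifier names from this slide; the dictionary layout and the `validate` helper are assumptions made for illustration and do not reflect how the ontology itself is encoded.

```python
# Allowable modifiers per clinical element type, as listed on the slide.
SCHEMA = {
    "Medications": {"Type", "Dose", "Frequency", "Route"},
    "Diagnosis": {"Negation", "Uncertainty", "Severity", "History", "Experiencer"},
}

def validate(element_type: str, modifiers: dict) -> list:
    """Return the modifier names the schema does not allow for this type."""
    allowed = SCHEMA[element_type]
    return sorted(name for name in modifiers if name not in allowed)

# A diagnosis annotation carrying a medication-only modifier is flagged.
print(validate("Diagnosis", {"Negation": "no", "Dose": "200 mg"}))
print(validate("Medications", {"Dose": "200 mg", "Route": "p.o."}))
```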
23
Domain Ontology for NLP
  • Instance of schema ontology
  • Clinical elements from a particular domain

24
Synonyms, misspellings, regular expressions
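The three kinds of linguistic variants named on this slide can be combined into a single matcher for a clinical element. The following is a rough sketch under stated assumptions: the `ClinicalElement` class, the `find_mentions` helper, and every variant listed for pneumonia are hypothetical examples, not entries from the actual domain ontology.

```python
import re
from dataclasses import dataclass, field

@dataclass
class ClinicalElement:
    name: str
    synonyms: list = field(default_factory=list)
    misspellings: list = field(default_factory=list)
    patterns: list = field(default_factory=list)  # raw regular expressions

    def matcher(self) -> re.Pattern:
        literals = [self.name, *self.synonyms, *self.misspellings]
        # Longest literal first so longer variants win over shorter ones.
        literals.sort(key=len, reverse=True)
        # Regular expressions are tried before the escaped literal variants.
        alternatives = list(self.patterns) + [re.escape(t) for t in literals]
        return re.compile(r"\b(?:" + "|".join(alternatives) + r")\b", re.IGNORECASE)

def find_mentions(element: ClinicalElement, text: str) -> list:
    """Return every surface form of the element found in the text."""
    return [m.group(0) for m in element.matcher().finditer(text)]

# Hypothetical variants, chosen only to illustrate the three categories.
pneumonia = ClinicalElement(
    name="pneumonia",
    synonyms=["lung infection"],
    misspellings=["pnuemonia", "pneumonea"],
    patterns=[r"pneumon(?:ia|ic) (?:process|infiltrate)"],
)

print(find_mentions(pneumonia, "CXR shows pnuemonia; prior lung infection noted."))
print(find_mentions(pneumonia, "Pneumonic process in RLL"))
```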
25
Resources for NLP Experts
Schemas
  • Lack of shareable data is a barrier
  • University of Pittsburgh Repository
  • 111,045 reports of 9 types
  • 600 users
  • No longer available
  • MT Samples
  • 2,300 reports from MTSamples.com
  • De-identified

Clinical Data
Annotations
Annotation Environment
Evaluation
26
Resources for NLP Experts
Schemas
  • AMIA NLP Working Group
  • ShARe - Sharing Annotated Resources
  • 5R01GM090187 Chapman, Savova, Elhadad
  • 600 clinical notes from MIMIC II repository
  • Annotate disorders and modifiers
  • Anatomic location
  • Map to SNOMED codes
  • CLEF Shared Task 2013 and 2014
  • https://sites.google.com/site/shareclefehealth/

Clinical Data
Annotations
Annotation Environment
Evaluation
B South, D Mowery, S Velupillai, L Christensen, S Meystre
27
Resources for NLP Experts
Schemas
  • Distributed annotation in secure environment

Annotator Registry
Clinical Data
eHOST
Annotation Admin
Annotations
Annotation Environment
Web application iDASH cloud
Client app
Evaluation
VA, SHARP, and NIGMS: S DuVall, B South, B Adams, G Savova, N Elhadad, H Hochheiser
28
Annotator Registry
  • Annotators
  • Enlist for annotation
  • Certify for annotation tasks
  • Personal health information
  • Part-of-speech tagging
  • UMLS mapping
  • Set pay rate
  • NLP Admins
  • Search for annotators
  • http://nlp-ecosystem.ucsd.edu/annotators

29
1. Assign annotators to a task
30
2. Create a Schema
31
3. Assign users and set time expectations
32
4. Keep track of progress
33
Resources for NLP Experts
Schemas
  • Distributed annotation in secure environment

Annotator Registry
Clinical Data
eHOST
Annotation Admin
Annotations
Annotation Environment
Web application iDASH cloud
Client app
Evaluation
34
Resources for NLP Experts
Schemas
  • Compare output of NLP annotators
  • NLP system vs human annotation
  • View annotations
  • Calculate outcome measures
  • Drill down to all levels of annotation
  • Perform error analysis

Clinical Data
Annotations
Annotation Environment
Evaluation
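Comparing an NLP system against human annotation and then drilling down into the errors can be sketched as set operations over annotations. The example assumes exact-match comparison of `(document, start, end, label)` tuples; the `compare` function and the sample annotations are made up for illustration and are not the evaluation tool's actual interface or data.

```python
def compare(reference: set, system: set) -> dict:
    """Score system annotations against reference annotations (exact match)."""
    true_positives = reference & system
    false_positives = system - reference
    false_negatives = reference - system
    tp, fp, fn = len(true_positives), len(false_positives), len(false_negatives)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        # Sorted error lists support drill-down and error analysis.
        "false_positives": sorted(false_positives),
        "false_negatives": sorted(false_negatives),
    }

# Made-up annotations: (document id, start offset, end offset, label).
human = {("d1", 0, 9, "Disorder"), ("d1", 20, 25, "Disorder"), ("d2", 5, 12, "Disorder")}
nlp = {("d1", 0, 9, "Disorder"), ("d2", 5, 12, "Disorder"), ("d2", 30, 34, "Disorder")}

result = compare(human, nlp)
print(result["precision"], result["recall"], result["f1"])
print("missed by the system:", result["false_negatives"])
```

Exact matching is the strictest choice; a partial-overlap criterion would count more system annotations as correct.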
35
Select Classifications to View
Document annotations
Outcome Measures for Selected Annotations
Report List
Attributes for Selected Annotation
Relationships for Selected Annotation
VA and ONC SHARP: Christensen, Murphy, Frabetti, Rodriguez, Savova
36
Access to Information in Text
37
Users' Concepts: Cough, Dyspnea, Infiltrate on CXR, Wheezing, Fever, Cervical Lymphadenopathy
Controlled Vocabs: Dry cough, Productive cough, Cough, Hacking cough, Bloody cough
Which concepts?
38
Users' Concepts: Cough, Dyspnea, Infiltrate on CXR, Wheezing, Fever, Cervical Lymphadenopathy
Attribute-values: Temp 38.0C, Low-grade temperature
What values?
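The two attribute values on this slide show why "What values?" is hard: one is numeric and one is qualitative. The sketch below extracts only the numeric form; the regular expression and the `extract_temperature` function are assumptions for illustration, and a qualitative phrase such as "Low-grade temperature" deliberately yields no value.

```python
import re

# Matches phrases like "Temp 38.0C" or "temperature of 101.2 F" (assumed formats).
TEMPERATURE = re.compile(
    r"\btemp(?:erature)?\s*(?:of\s*)?(\d{2,3}(?:\.\d)?)\s*°?\s*([CF])\b",
    re.IGNORECASE,
)

def extract_temperature(text: str):
    """Return a numeric temperature attribute, or None if no number is stated."""
    match = TEMPERATURE.search(text)
    if match is None:
        return None
    return {
        "attribute": "temperature",
        "value": float(match.group(1)),
        "unit": match.group(2).upper(),
    }

print(extract_temperature("Temp 38.0C"))
print(extract_temperature("Low-grade temperature"))
```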
39
Efficient Access to Information in the Patient
Chart
Family history of colon cancer
Knowledge Author
Schema Builder
Chart Review Interface
Disease: colon cancer; Experiencer: family; Negation: no; Historical: yes
NLP Schema
Domain Ontology
40
Knowledge Author
  • Front end interface for users
  • Back end
  • Schema ontology
  • Modifier ontology
  • Output
  • Domain ontology
  • Schema for NLP system

B Scuba, F Fana, Liqin Wang, Mingyuan Zhang, Y Liu, M Kong, F Drews
41
Questions & Discussion
African American Adult
  • wwchapman@ucsd.edu

42
Ibuprofen
43
Ibuprofen p.o.
44
No family history of colon cancer
Linguistic modifiers
45
Calls Voogo synonym tool
46
Access Information in Patient Chart
  • Navigate patient data more efficiently
  • Point chart reviewer to ambiguous and
    contradictory information
  • Reduce bias

Knowledge Author
Chart Review Interfaces
47
Access Information in Patient Chart
Knowledge Author
NLP
Viz
Subjects, Diagnoses Findings, Anatomical Locations
EMR
Chart Review Interfaces
Feedback improve models
Population Patient Document Expression
User Identifies Patients Meeting Criteria
Interactive Search and Review of Clinical Records
with Multi-layered Semantic Annotation. NLM
1R01LM010964-01. Chapman, Wiebe, Hwa.
48
Population View
49
Patient View
50
Access to NLP Tools and Interfaces
51
Access to NLP Tools
v3NLP (Zeng, Divita), pyConText (Chapman), RapTat (Matheny, Gobbell)
NLP Workbench
Classifier Workbench
NLP Platform
KB
Annotations
Visualization Workbench
Mix & Match
Edit
Correct
User
  • Interact
  • Customize

52
TextVect
User
Select NLP Features
[X] N-grams
[X] UMLS Concepts
[ ] Part-of-speech tags
[X] Negation
Select Representation
[ ] Binary
[X] Count
[ ] tf-idf
NLP Workbench
Classifier Workbench
TextVect
Visualization Workbench
Feature Selection Algorithms
NLP Tools
Yes No No
Training Set
Yes 1 0 0 0 1 1 1
No 0 0 1 1 0 0 0
No 0 0 0 1 0 1 0
A Kumar, C Elkan, S Abdelrahman https://github.com/abhishek-kumar/TextVect
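As a rough illustration of turning text into feature vectors, the sketch below builds word n-gram features with either a count or a binary representation, two of the options shown on this slide. It is a plain-Python approximation written for this transcript, not TextVect's own code or interface; the function names and sample sentences are assumptions, and UMLS concepts, negation features, and tf-idf weighting are left out.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return the word n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def featurize(document, max_n=2):
    """Count every 1..max_n word n-gram in one document."""
    tokens = document.lower().split()
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(tokens, n))
    return counts

def vectorize(documents, representation="count", max_n=2):
    """Return (vocabulary, matrix) using a 'count' or 'binary' representation."""
    if representation not in ("count", "binary"):
        raise ValueError("representation must be 'count' or 'binary'")
    per_document = [featurize(doc, max_n) for doc in documents]
    vocabulary = sorted(set().union(*per_document))
    matrix = []
    for counts in per_document:
        row = [counts[term] for term in vocabulary]
        if representation == "binary":
            row = [1 if value else 0 for value in row]
        matrix.append(row)
    return vocabulary, matrix

# Made-up sentences; unigrams only, to keep the output small.
vocabulary, matrix = vectorize(["no cough no fever", "cough"], max_n=1)
print(vocabulary)
print(matrix)
```

Each row of the matrix can then be paired with a document label and passed to a classifier, as in the training set pictured on the slide.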
53
Evaluation of TextVect
54
i2b2 dataset (Micro-F-Measure)
Baseline: 0.71
Average: 0.91
Best: 0.97
TextVect: 0.95
CMC dataset (Micro-F-Measure)
Average: 0.77
Best: 0.89
TextVect: 0.82
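For readers unfamiliar with the metric, micro-averaging pools true positives, false positives, and false negatives across all classes before computing a single F-measure. The sketch below shows the standard calculation; the function name and the per-class counts are made up for illustration and are unrelated to the i2b2 or CMC results above.

```python
def micro_f_measure(per_class_counts: dict) -> float:
    """per_class_counts maps a class label to (tp, fp, fn) counts."""
    tp = sum(counts[0] for counts in per_class_counts.values())
    fp = sum(counts[1] for counts in per_class_counts.values())
    fn = sum(counts[2] for counts in per_class_counts.values())
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts for two classes: pooled tp=9, fp=3, fn=4.
score = micro_f_measure({"class_a": (8, 2, 1), "class_b": (1, 1, 3)})
print(round(score, 2))
```

Because counts are pooled, frequent classes dominate the score, unlike macro-averaging, which weights every class equally.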
55
Access to Visualizations of NLP Output
NLP Workbench
Classifier Workbench
Visualization workbench
NLP System
Visualization Workbench
Annotations
56
Timeline View
Jianlin Shi, T Wang, E Shenvi, R El-Kareh, M Tharp, R Reeves
57
Access to Understanding
58
Access to Understanding Clinical Notes
  • Chief Complaint
  • Hypoxic respiratory failure
  • Major Surgical or Invasive Procedure
  • Intubation.
  • History of Present Illness
  • 81 yo man w/ho CAD, COP, PVD, AAA xfered from OSH
    for mngmt resp failure. Pt was found @ home by
    EMS followign c/o 05-29 "crushing",
    nonradiating SSCP. Pt diaphoretic during
    transport. Sat 84-->94 on NRB. Given ASA, NT,
    nebs en route to OSH where started on BIPAP and
    eventually intubated. BP on arrival 240/140 so
    started on NTG drip titrated up until BP fell to
    90/58 resulting in IVF, dopamine. Given 80 IV
    lasix. First set enzymes negative and BNP 1700.
    Pt xferred for further management.
  • Definitions
  • Medical terms
  • Acronyms/abbreviations
  • Pictures
  • Internet sites
  • Biomedical literature
  • Normal range checking
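One of the aids listed above, acronym and abbreviation support, can be sketched as a glossary lookup that annotates a note in place. The glossary entries below are common expansions supplied as assumptions for illustration; they do not come from the talk, and a real tool would need context to disambiguate abbreviations with several meanings.

```python
import re

# Assumed expansions for a few abbreviations in the sample note.
GLOSSARY = {
    "EMS": "emergency medical services",
    "NRB": "non-rebreather mask",
    "ASA": "aspirin",
    "NTG": "nitroglycerin",
    "OSH": "outside hospital",
    "IVF": "intravenous fluids",
}

def annotate(text: str, glossary: dict = GLOSSARY) -> str:
    """Append the expansion in brackets after each known abbreviation."""
    # Longest keys first so longer abbreviations are not split by shorter ones.
    keys = sorted(glossary, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, keys)) + r")\b")
    return pattern.sub(lambda m: f"{m.group(1)} [{glossary[m.group(1)]}]", text)

print(annotate("Given ASA, NT, nebs en route to OSH"))
```

Matching is case-sensitive and whole-word only, so unknown tokens such as "NT" pass through unchanged.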

59
(No Transcript)
60
Conclusion
  • Collaborations for NLP improve ability to
  • Create potentially useful resources and tools
  • Provide access to
  • Resources for NLP development
  • Information in reports
  • NLP and visualization tools
  • Major challenge is applying NLP
  • Future need
  • More integration with other tools
  • More coordination

61
Acknowledgments
BLU Lab
Collaborators
  • Lee Christensen
  • Melissa Tharp
  • Mike Conway
  • Danielle Mowery
  • Bill Scuba
  • Milan Kovacevich
  • Dieter Hillert
  • Samir Abdelrahman
  • Leah Willis
  • Bob Angell
  • Sumithra Velupillai
  • Maria Kvist
  • Maria Skeppstedt
  • Aron Henriksson
  • Brian Chapman
  • David Carrell
  • Sascha Dublin
  • Zia Agha
  • Stephane Meystre
  • Scott DuVall
  • Jianlin Shi
  • Harry Hochheiser
  • Jan Wiebe
  • Rebecca Hwa
  • Guergana Savova
  • Noemie Elhadad
  • Michael Matheny
  • Rob El-Kareh
  • Ruth Reeves
  • Qing Zeng
  • Guy Divita
  • Frank Drews

62
Questions & Discussion
  • wendy.chapman@utah.edu