Information Extraction From Medical Records - PowerPoint PPT Presentation

About This Presentation
Title:

Information Extraction From Medical Records

Description:

Training a computer to recognize commonly used reporting phraseology will organize extraction better, with more precise, concise outputs.

Slides: 20
Provided by: Goo7414
Learn more at: http://cms.uhd.edu

Transcript and Presenter's Notes

Title: Information Extraction From Medical Records


1
Information Extraction From Medical Records
  • by Alexander Barsky

2
Current Methodology
  • A broad assessment of the patient appears at the
    beginning of the chart, with references to more
    specific areas. Specific divisions follow the
    broad assessment. Records are listed in
    chronological order of activity.

3
Chart Example

4
Problem
  • A patient's medical chart is highly detailed and
    complex. Any attempt to quickly locate specific
    information is likely to end in frustration.

5
Example

6
Solution
  • Create a system that extracts the wanted
    information based on a predefined set of
    parameters.
  • Example: "Hormonal imbalance during puberty".
    Retrieve all references to hormonal imbalances,
    but only those recorded within a specific time
    period in the medical chart.
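
The idea of extracting by predefined parameters can be illustrated with a minimal Python sketch. It is not the system built in this presentation: the ChartEntry structure, the find_references helper, and the sample notes are all assumptions made up for illustration, and plain substring matching stands in for real extraction rules.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ChartEntry:
    """One dated note in a patient's chart (hypothetical structure)."""
    recorded_on: date
    text: str


def find_references(entries, phrase, start, end):
    """Return entries mentioning `phrase` whose date lies in [start, end]."""
    needle = phrase.lower()
    return [
        e for e in entries
        if start <= e.recorded_on <= end and needle in e.text.lower()
    ]


# Made-up sample data for illustration only.
chart = [
    ChartEntry(date(2001, 3, 14), "Routine check-up, no concerns."),
    ChartEntry(date(2004, 6, 2), "Labs suggest a hormonal imbalance."),
    ChartEntry(date(2012, 9, 30), "Follow-up on hormonal imbalance."),
]

hits = find_references(
    chart, "hormonal imbalance", date(2003, 1, 1), date(2008, 12, 31)
)
for entry in hits:
    print(entry.recorded_on.isoformat(), entry.text)
```

Here the two parameters are the phrase and the date window; only the 2004 note satisfies both.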

7
Tools at Our Disposal
  • JAPE: Java Annotation Patterns Engine.
  •     Use: pattern matching and semantic
    extraction.
  • GATE: General Architecture for Text Engineering.
  •     Use: information extraction, document
    annotation, and XML output.
  • C: Visual C WinForms.
  •     Use: medium for conversion between XML and
    .csv file formats.

8
Solution Methodology
  • 1. Create corpus of documents in GATE.
  • 2. Introduce rules for information extraction.
  • 3. Annotate documents in corpus.
  • 4. Output annotated documents in XML.
  • 5. Strip file of unnecessary elements and convert
    to .csv.

9
(No Transcript)

10
ANNIE
  • A Nearly-New Information Extraction System
  • Tokeniser - splits text into simple tokens.
  • Gazetteer - identifies entity names contained in
    lists.
  • Sentence Splitter - splits text into sentences
    based on lists.
  • Part-of-Speech Tagger - tags each token with its
    part of speech (POS).
  • Coreference Matcher - identifies relationships
    between previously defined entities.
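
To make three of these roles concrete, here is a toy Python sketch of a tokeniser, a sentence splitter, and a gazetteer lookup. It is only an illustration of the ideas, not GATE's implementation: the function names, the GAZETTEER contents, and the sample note are assumptions, and the real components are considerably more sophisticated.

```python
import re

# Hypothetical gazetteer: entity type -> known phrases (lowercase).
GAZETTEER = {
    "Condition": {"hormonal imbalance", "asthma"},
    "Medication": {"ibuprofen"},
}


def tokenise(text):
    """Split text into word tokens and single punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text)


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def gazetteer_lookup(text):
    """Return (entity_type, phrase) pairs for each listed phrase found."""
    lowered = text.lower()
    found = []
    for entity_type, phrases in GAZETTEER.items():
        for phrase in sorted(phrases):
            if phrase in lowered:
                found.append((entity_type, phrase))
    return found


note = "Patient reports asthma. Labs suggest a hormonal imbalance."
print(tokenise("Patient reports asthma."))
print(split_sentences(note))
print(gazetteer_lookup(note))
```

Each function works alone, but the lookup only becomes useful once its hits can be tied back to tokens and sentences, which is why the components are normally run together as a pipeline.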

11
Success in information extraction depends on
integrating most, if not all, ANNIE components.

12
JAPE: Key to Extraction

13
JAPE Example

14
XML Output

15
Problem: Too much unorganized information.
Solution: XSLT to the rescue!
  • XSLT - Extensible Stylesheet Language
    Transformations
  • Add specific rules to separate needed from
    unnecessary information.

16
XSLT Example
  • Find all the nodes within <Lookup>. Add the
    string between the tags.
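
The same selection step can be sketched in Python with the standard library's xml.etree.ElementTree. This is an analogue for illustration, not the stylesheet used in the presentation. Only the <Lookup> tag name comes from the slide; the sample document, its other element names, and the majorType attribute values are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotated output; only the <Lookup> tag name is from the slide.
SAMPLE_XML = """<document>
  <Sentence>Labs suggest a <Lookup majorType="condition">hormonal imbalance</Lookup>.</Sentence>
  <Sentence>Prescribed <Lookup majorType="medication">ibuprofen</Lookup> for pain.</Sentence>
</document>"""


def lookup_strings(xml_text):
    """Return the text found between every <Lookup> ... </Lookup> pair."""
    root = ET.fromstring(xml_text)
    return ["".join(node.itertext()).strip() for node in root.iter("Lookup")]


print(lookup_strings(SAMPLE_XML))
```

Everything outside the Lookup elements is dropped, which is the "separate needed from unnecessary" step in miniature.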

17
CSV File Type: Comma-Separated Values. Used to
present information in tabular form. Useful for
analyzing large amounts of data in an
easy-to-understand format. Excel is the program
most commonly used to open it.
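
As a rough sketch of the final conversion step, the Python standard library's csv module can write extracted values as rows. The presentation itself uses a Visual C WinForms program for this; the rows_to_csv helper, the column names, and the sample rows below are assumptions for illustration.

```python
import csv
import io


def rows_to_csv(rows, header):
    """Serialise rows to CSV text; the csv module quotes embedded commas."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up extraction results for illustration.
extracted = [
    ("2004-06-02", "condition", "hormonal imbalance"),
    ("2004-06-02", "note", "fatigue, weight change"),
]

csv_text = rows_to_csv(extracted, ("date", "type", "text"))
print(csv_text)
```

Using a CSV writer rather than joining strings by hand matters here because free-text medical notes often contain commas, which must be quoted to keep the columns aligned.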

18
Potential Problem
  • Regardless of how well the ANNIE tools are
    utilized and how well the JAPE rules are defined,
    recall will never reach 100 percent.
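
For reference, recall and its companion measure, precision, can be computed as in the short Python sketch below. The precision_recall helper and the sample sets are made up for illustration; no results from the actual system are implied.

```python
def precision_recall(extracted, relevant):
    """Compute precision and recall for a set of extracted items.

    precision = correct extractions / everything extracted
    recall    = correct extractions / everything that should be found
    """
    extracted, relevant = set(extracted), set(relevant)
    correct = len(extracted & relevant)
    precision = correct / len(extracted) if extracted else 0.0
    recall = correct / len(relevant) if relevant else 0.0
    return precision, recall


# Made-up example: 4 items extracted, 3 of them correct, 5 truly present.
found = {"a", "b", "c", "x"}
truth = {"a", "b", "c", "d", "e"}
p, r = precision_recall(found, truth)
print(f"precision={p:.2f} recall={r:.2f}")
```

In this example the rules miss two of the five items that are truly present, so recall is below 1 even though most of what was extracted is correct.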

19
Solution: Machine Learning
  • Machine learning is our best chance to increase
    the precision of output results. Training a
    computer to recognize commonly used reporting
    phraseology will organize extraction better, with
    more precise, concise outputs. Luckily for us,
    GATE includes plugins for machine learning.