Title: Automated Retrieval and Generation of Brain CT Radiology Reports
1Automated Retrieval and Generation of Brain CT
Radiology Reports
2Outline
- Background
- Motivation
- Research Work
- Conclusion
3Background
- Computed Tomography (CT) is used to examine
abnormalities of the human brain arising from various
causes
- The result of each brain CT examination consists
of
- A set of CT scan images
- A report written by a radiologist
4Abnormalities
- Head traumas
- epidural hemorrhage (EDH)
- acute subdural hemorrhage (SDH_Acute)
- chronic subdural hemorrhage (SDH_Chronic)
- intracerebral hemorrhage (ICH)
- intraventricular hemorrhage (IVH)
- subarachnoid hemorrhage (SAH)
- Fractures
- Edemas
- Others
- Midline shift
- Etc.
5Background
Normal
EDH
6Background
SDH_Acute, SDH_Chronic, Midline Shift
ICH
7Background
Report: Unenhanced axial CT head was obtained. No
previous study is available for comparison.
There is acute subdural haemorrhage overlying
the left convexity midline falx, which measures
up to a maximum of 1.4 cm in thickness.
Subarachnoid haemorrhage is seen in the sulci at
the left fronto-temporal lobe, bilateral Sylvian
fissure cistern and the basal cistern.
Intraventricular extension of haemorrhage with
blood seen in all four ventricles is noted. There
is intraparenchymal haemorrhage in the bilateral
frontal lobes raising the suspicion of
haemorrhagic contusion. There is considerable
mass effect with midline shift to the right,
generalised effacement of cerebral sulci and
compression of the left lateral ventricle.
Prominence of the right temporal horn is
suspicious for a hydrocephalus. No skull vault
fracture is seen in the CT scan.
8Background
Comments: Acute left fronto-temporal-parietal
subdural haematoma with bifrontal parenchymal
haematoma and bilateral subarachnoid haemorrhage
with intraventricular extension. Associated mass
effect with midline shift to the right,
compression of the left lateral ventricle and
generalised effacement of cerebral sulci.
Hydrocephalus with right ventricle dilated.
9Motivation
- Radiology reports contain rich information that
is not used in many medical database systems
- The proposed system aims to
- Provide convenient search functions for radiology
reports and images
- Help doctors, radiologists, and medical
informaticians gather the information needed for
their research
- Give radiologists references against which to compare
results
- Facilitate education for researchers,
junior doctors, and medical students
- Integrate medical records from various sources
- Provide a platform for the medical community to
exchange information and knowledge
10Two Research Directions
- Automated Retrieval and Generation of Brain CT
Radiology Reports
- Content-based Retrieval of CT Scan Brain Images
11Related Work
- Information Extraction from Radiology Reports
- Automatic Generation of Medical Reports
- Free Text Assisted Medical Image Retrieval
12Research Work
- Information Extraction from Radiology Reports
- Automatic Generation of Medical Reports
- Free Text Assisted Medical Image Retrieval
13Related Work
- MedLEE: Medical Language Extraction and Encoding
System
- RADA: RADiology Analysis Tool
- Statistical Natural Language Processor for
Medical Reports
14MedLEE Medical Language Extraction and Encoding
System
15RADA Radiology Analysis Tool
16Statistical Natural Language Processor for
Medical Reports
17Statistical Natural Language Processor for
Medical Reports
An example of structured representation output
18Challenges
- Negations
- Insufficient understanding of the text
- Ungrammatical writing styles
- Large vocabulary
- Assumed knowledge between writer and reader
19Related Work
- Information Extraction from Radiology Reports
- Automatic Generation of Medical Reports
- Free Text Assisted Medical Image Retrieval
20Automatic Generation of Medical Reports
- Most existing systems for automatic medical report
generation use one of the following template-filling
approaches
- Structured Data Entry
- Mail Merge
- Canned Text
21Challenges
- NLG
- NLG is not yet mature enough for medical
document generation
- No system based on NLG principles is yet in
routine use that generates medical reports with
fluent, concise, and readable text
- The challenges of NLG in the general domain also exist in
the medical domain
- Systems that automatically generate medical
reports from medical images are still lacking
22Related Work
- Information Extraction from Radiology Reports
- Automatic Generation of Medical Reports
- Free Text Assisted Medical Image Retrieval
23Free Text Assisted Medical Image Retrieval
- NeuRadIR: Web-Based Neuroradiological Information
Retrieval System
- Information Retrieval on MR Brain Images and
Radiology Reports
24NeuRadIR
25MRI Brain Image and Report Retrieval
26Challenges
- Complexity of the system, as the system
- Consists of many functional components
- Needs knowledge from various research areas
27Research Areas
- Information Extraction from Brain CT Radiology
Reports
- Automatic Generation of Brain CT Radiology
Reports
- Radiology Reports Assisted Brain CT Images
Retrieval
29Information Extraction from Brain CT Radiology
Reports
- Our major task in this research area is to
extract structured medical findings from
free-text brain CT radiology reports
30Input Output
Input: An extra-dural haematoma overlying the right
frontal lobe is seen measuring 1.2 cm in
thickness.
Output:
Finding: haematoma
type: extradural
location: overlying
brain_part: lobe
orientation: right
orientation: frontal
thickness: 1.2 cm
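The structured output above could be held in a small data structure. The following is a minimal sketch only: the `Finding` class, its `add` and `render` methods, and the choice of a list of (attribute, value) pairs are assumptions made for illustration, not the project's actual representation. A list is used rather than a dictionary because an attribute such as orientation can occur more than once.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    """One structured finding with its modifiers (hypothetical schema)."""
    name: str
    # (attribute, value) pairs; a list keeps repeated attributes and order.
    modifiers: list = field(default_factory=list)

    def add(self, attribute, value):
        self.modifiers.append((attribute, value))

    def render(self):
        lines = [f"Finding: {self.name}"]
        lines += [f"{attribute}: {value}" for attribute, value in self.modifiers]
        return "\n".join(lines)


# Values taken from the Input/Output example on this slide.
finding = Finding("haematoma")
for attribute, value in [
    ("type", "extradural"),
    ("location", "overlying"),
    ("brain_part", "lobe"),
    ("orientation", "right"),
    ("orientation", "frontal"),
    ("thickness", "1.2 cm"),
]:
    finding.add(attribute, value)

print(finding.render())
```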
31System Architecture
- The system will have these components
- Document Chunker
- Parser
- Term Mapper
- Finding Extractor
- Report Constructor
32Document Chunker
- Decomposes the radiology report into three
sections
- Reasons for examination
- Detailed description of observations and findings
- Comments or conclusion
- We will focus on the second and third sections, as
they contain the medical findings
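A minimal sketch of such a chunker is given below. It assumes, purely for illustration, that each section starts on a line beginning with one of the headings "Reason:", "Report:" or "Comments:"; real reports may mark sections differently. The function name `chunk_report`, the section keys, and the sample text (including the made-up reason line) are hypothetical.

```python
import re

# Assumed heading convention; not confirmed for the actual report corpus.
HEADING = re.compile(r"^(Reason|Report|Comments):\s*(.*)$")
SECTION_KEYS = {"Reason": "reason", "Report": "findings", "Comments": "comments"}


def chunk_report(text):
    """Split a report into sections keyed by 'reason', 'findings', 'comments'."""
    sections = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        match = HEADING.match(line)
        if match:
            # A heading line opens a new section.
            current = SECTION_KEYS[match.group(1)]
            sections[current] = match.group(2)
        elif current is not None:
            # Continuation lines are appended to the open section.
            sections[current] = (sections[current] + " " + line).strip()
    return sections


sample = """Reason: Head injury after a fall.
Report: Unenhanced axial CT head was obtained.
No skull vault fracture is seen in the CT scan.
Comments: Acute subdural haematoma."""

sections = chunk_report(sample)
# Only the second and third sections are passed on for finding extraction.
to_process = [sections["findings"], sections["comments"]]
```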
33Parser
- Parses each sentence of a report and outputs a
typed dependency tree
- Parser output example (each word is prefixed with its
grammatical relation to its parent word)
null: seen
nsubjpass: hematoma
det: An
amod: extra-dural
partmod: overlying
dobj: lobe
det: the
amod: right
amod: frontal
auxpass: is
partmod: measuring
dobj: cm
num: 1.2
prep-in: thickness
34Term Mapper
- Maps words to the standard forms specified in our
medical knowledge sources (the Unified Medical
Language System (UMLS) and other radiology
thesauri)
- Reduces spelling variations
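The mapping step can be pictured as a lookup from surface variants to one standard form. The sketch below uses a tiny hand-written table as a stand-in for the knowledge sources; it does not query UMLS, and the table entries, the `STANDARD_FORMS` name, and the `map_terms` function are illustrative assumptions.

```python
# Toy variant table standing in for UMLS or a radiology thesaurus (assumption).
STANDARD_FORMS = {
    "extra-dural": "extradural",
    "hematoma": "haematoma",
    "hemorrhage": "haemorrhage",
}


def map_terms(sentence):
    """Return the sentence's words with known variants replaced by standard forms."""
    mapped = []
    for token in sentence.split():
        word = token.strip(".,;:")
        # Lookup is case-insensitive; unknown words pass through unchanged.
        mapped.append(STANDARD_FORMS.get(word.lower(), word))
    return mapped


tokens = map_terms("An extra-dural hematoma overlying the right frontal lobe.")
print(tokens)
```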
35Finding Extractor
- Applies semantic rules, derived from the
semantic features of the words, to translate the
typed dependency relationships into logical
relationships between findings and modifiers
(finding attributes)
- Merges the same finding from different sentences
into one finding
- Removes redundant findings
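As a rough illustration of the first step, the sketch below walks a set of dependency triples starting at the finding word and turns every descendant with a known semantic class into a modifier. The semantic lexicon, the set of relations that are followed, the hand-written triples, and the `extract_finding` function are all assumptions made for this example; the actual rules are not specified here.

```python
# Hypothetical semantic lexicon: word -> semantic class.
SEMANTIC_CLASS = {
    "haematoma": "finding",
    "extradural": "type",
    "overlying": "location",
    "lobe": "brain_part",
    "right": "orientation",
    "frontal": "orientation",
}

# Assumed set of dependency relations that may carry a modifier.
MODIFIER_RELATIONS = {"amod", "partmod", "dobj"}


def extract_finding(triples):
    """Map (relation, head, dependent) triples to (finding, modifiers)."""
    children = {}
    for relation, head, dependent in triples:
        if relation in MODIFIER_RELATIONS:
            children.setdefault(head, []).append(dependent)
    words = [w for _, head, dep in triples for w in (head, dep)]
    findings = [w for w in words if SEMANTIC_CLASS.get(w) == "finding"]
    if not findings:
        return None
    root = findings[0]
    modifiers = []
    # Depth-first walk below the finding, preserving left-to-right order.
    stack = list(reversed(children.get(root, [])))
    while stack:
        word = stack.pop()
        label = SEMANTIC_CLASS.get(word)
        if label is not None and label != "finding":
            modifiers.append((label, word))
        stack.extend(reversed(children.get(word, [])))
    return root, modifiers


# Hand-written triples for part of the example sentence (after term mapping).
triples = [
    ("nsubjpass", "seen", "haematoma"),
    ("amod", "haematoma", "extradural"),
    ("partmod", "haematoma", "overlying"),
    ("dobj", "overlying", "lobe"),
    ("amod", "lobe", "right"),
    ("amod", "lobe", "frontal"),
]
result = extract_finding(triples)
print(result)
```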
36Report Constructor
- Constructs a structured report from the
findings, modifiers, and logical
relationships produced by the finding extractor
37Research Areas
- Information Extraction from Brain CT Radiology
Reports
- Automatic Generation of Brain CT Radiology
Reports
- Radiology Reports Assisted Brain CT Images
Retrieval
38Automatic Generation of Brain CT Radiology
Reports
- A traditional approach based on a typical NLG
system
- Content determination
- Discourse planning
- Sentence aggregation
- Lexicalization
- Referring expression generation
- Linguistic realization
39Content Determination
- Creates a set of messages from the features
extracted from the new brain CT images
- Doctors use the size, shape, and location of the
potential hemorrhage region to determine head
trauma types
- The system uses similar features for content
determination: area, major axis length, minor
axis length, eccentricity, solidity, extent,
adjacency to skull, adjacency to background
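Some of these shape features can be computed directly from a binary mask of a segmented region. The sketch below covers only area, extent (region area divided by bounding-box area), and a moment-based eccentricity; the formulas are the common textbook definitions and are an assumption, since the exact definitions used by the system are not given. The function name `region_features` and the toy mask are hypothetical.

```python
import math


def region_features(mask):
    """Compute area, extent and eccentricity of the nonzero pixels in a 2D mask."""
    pixels = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    area = len(pixels)
    if area == 0:
        raise ValueError("mask contains no region pixels")
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    # Extent: fraction of the bounding box covered by the region.
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    extent = area / (height * width)
    # Second central moments of the pixel coordinates.
    mean_r = sum(rows) / area
    mean_c = sum(cols) / area
    var_r = sum((r - mean_r) ** 2 for r in rows) / area
    var_c = sum((c - mean_c) ** 2 for c in cols) / area
    cov = sum((r - mean_r) * (c - mean_c) for r, c in pixels) / area
    # Eigenvalues of the 2x2 covariance matrix give the principal spreads.
    half_trace = (var_r + var_c) / 2
    spread = math.sqrt(((var_r - var_c) / 2) ** 2 + cov ** 2)
    largest = half_trace + spread
    smallest = max(half_trace - spread, 0.0)
    eccentricity = math.sqrt(1 - smallest / largest) if largest > 0 else 0.0
    return {"area": area, "extent": extent, "eccentricity": eccentricity}


# Toy mask: 1 marks pixels of a candidate hemorrhage region.
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
features = region_features(mask)
print(features)
```

An elongated region yields an eccentricity close to 1 and a compact, round one a value close to 0, which is one way the shape cue doctors rely on can be expressed numerically.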
40Content Determination
Image Segmentation
Features Extraction
41Discourse Planning
- Uses Rhetorical Structure Theory (RST) to
organize the text based on relationships that
hold between parts of the text
42Sentence Aggregation
- Groups messages together into sentences and
paragraphs
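One simple way to picture aggregation is to group all messages about the same finding and emit one sentence per group. The sketch below does only that; the (finding, attribute, value) message format, the fixed sentence template, and the `aggregate` function are assumptions for illustration and ignore the discourse plan.

```python
def aggregate(messages):
    """Group (finding, attribute, value) messages by finding, one sentence each."""
    grouped = {}
    for finding, attribute, value in messages:
        # dict preserves insertion order, so findings keep their first-seen order.
        grouped.setdefault(finding, []).append(f"{attribute} {value}")
    sentences = []
    for finding, details in grouped.items():
        sentences.append(f"There is {finding} ({'; '.join(details)}).")
    return sentences


# Message values borrowed from the sample report earlier in the deck.
messages = [
    ("acute subdural haemorrhage", "location", "left convexity"),
    ("midline shift", "direction", "right"),
    ("acute subdural haemorrhage", "thickness", "1.4 cm"),
]
sentences = aggregate(messages)
print("\n".join(sentences))
```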
44Lexicalization
- Decides which specific words and phrases should
be chosen to express the domain concepts and
relations that appear in the messages
- Uses hardcoded words and phrases to
standardize the language of radiology
reporting
- Uses the NLG system to generate radiology reports in
various writing styles to cater to different user
groups (at a later stage of our project)
45Final Steps
- Referring Expression Generation
- Linguistic Realization
46A Machine Learning Approach
- Based on the concept of statistical machine
translation
- Image and report are two representations of the
same medical condition
- In a sense, image and text are two different
languages
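Under this analogy, the standard noisy-channel formulation of statistical machine translation could be carried over as follows. This is a sketch of the general SMT idea, not a statement of the model used in this project; the symbols i (features of the source image) and r (a candidate report) are introduced here for illustration.

```latex
\hat{r} \;=\; \arg\max_{r} P(r \mid i) \;=\; \arg\max_{r} P(i \mid r)\, P(r)
```

Here P(i | r) plays the role of the translation model between the two "languages", and P(r) plays the role of a language model over reports.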
47Statistical Machine Translation
48Syntax Tree Based SMT
49Report Generation based on SMT concepts
50Research Areas
- Information Extraction from Brain CT Radiology
Reports
- Automatic Generation of Brain CT Radiology
Reports
- Radiology Reports Assisted Brain CT Images
Retrieval
51Radiology Reports Assisted Brain CT Images
Retrieval
52Radiology Reports Assisted Brain CT Images
Retrieval
53Project Status
- Project Funding Sources
- University Research Grant
- Ministry of Education Academic Research Grant
- Project Collaborators
- School of Computing, NUS
- National Neuroscience Institute
- Institute for Infocomm Research
- Project Phases
- Phase I: Pilot Study (Feb 2007 to April 2008)
- Phase II: R&D (April 2008 to Mar 2011)