Title: Automating Concept Representation in the Biomedical Literature: Preliminary Results of Incorporating MetaMap into the Telemakus System
1Automating Concept Representation in the Biomedical Literature: Preliminary Results of Incorporating MetaMap into the Telemakus System
Debra Revere, Telemakus Project
Dept. of Medical Education and Biomedical Informatics, School of Medicine, University of Washington
NLM Mentor: Alan Aronson
July 2003
2Overview of Talk
- Introduction and background
- Current Telemakus system
- Demo
- Incorporating MetaMap into Telemakus: Results
- Issues
- Future directions
3Intro: Context of the Problem
- Information explosion
- Specialization can create barriers
- Need for information retrieval tools that provide
answers rather than lists of documents
4Issue: Answers, not Lists
5Our Questions
- Is there a format more conducive to rapid review of retrieved citations that also presents an accurate representation of the research methods and findings?
- How do retrieved citations relate to one another?
- Can research literature be mined to identify connections not previously noted?
6Telemakus Components
- Database elements extraction
- Research concept and relationship extraction
- Research report schema
- Visual exploration interface
7Research Methods, Materials and Data Extraction
8Research Concept and Relationship Extraction
9Concept Identification and Relationships
Figure Heading: The relationship between insulin infusion rate (IIR) and visceral fat (VF). Points presenting the 4 ad libitum (AL), 18 AL, and 18 caloric restricted (CR) rats.
Extracted Research Concepts and Relationships:
- insulin infusion rate / visceral fat
- insulin infusion rate / ad libitum
- insulin infusion rate / caloric restriction
- visceral fat / ad libitum
- visceral fat / caloric restriction
10Research Report Schema
- Based on schema theory
  - we understand the world in terms of prototypical patterns (scripts, schemas, narratives) in which are embedded a vast array of relationships, concepts, and vocabulary words.
- Research Report Schema
  - represents research environment, methods and outcomes
  - capitalizes on standardization of research report format (abstract, intro, materials and methods, results, discussion)
  - includes standard bibliographic info, research design and methods, research findings derived from data tables and figures
- The Research Report Schema serves as a surrogate
  - for the research report
  - to facilitate searching and rapid review of retrieved documents
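As a rough illustration of the surrogate idea, the schema can be pictured as a record with three groups of fields: bibliographic info, research design and methods, and findings derived from tables and figures. The sketch below is not the Telemakus data model; the class names, field names, and the placeholder bibliographic values are all assumptions made for illustration. Only the figure caption and concept pairs come from the slides.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Finding:
    """A finding derived from one data table or figure (hypothetical structure)."""
    source: str                      # e.g. "Figure" or "Table" label
    caption: str                     # the table or figure heading
    concept_pairs: List[Tuple[str, str]] = field(default_factory=list)


@dataclass
class ResearchReportSchema:
    """Surrogate for one research report (field names are assumptions)."""
    title: str
    authors: List[str]
    year: int
    design_and_methods: List[str] = field(default_factory=list)
    findings: List[Finding] = field(default_factory=list)

    def concepts(self) -> List[str]:
        """Return every concept mentioned in any finding, sorted and de-duplicated."""
        found = set()
        for finding in self.findings:
            for left, right in finding.concept_pairs:
                found.update((left, right))
        return sorted(found)


# Placeholder bibliographic values; the caption and pairs are from the slide.
record = ResearchReportSchema(
    title="Placeholder report title",
    authors=["Placeholder Author"],
    year=2000,
    design_and_methods=["ad libitum vs. caloric restricted rats"],
    findings=[
        Finding(
            source="Figure",
            caption="The relationship between insulin infusion rate (IIR) "
                    "and visceral fat (VF).",
            concept_pairs=[
                ("insulin infusion rate", "visceral fat"),
                ("insulin infusion rate", "caloric restriction"),
            ],
        )
    ],
)
print(record.concepts())
```

Keeping findings tied to the table or figure they came from mirrors the slide's point that findings are "derived from data tables and figures", which makes it possible to show the reader where each concept pair originated.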
11Research Report Schema
12Research Report Schema
13Visual Exploration Interface
- Concept Maps
  - used to show inter-relationships between concepts extracted from a body of domain documents
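One way to think about a concept map over a body of documents is as an index from each concept pair to the documents that report it. The following is a minimal sketch under that assumption, not a description of how Telemakus builds its maps. The document ids are hypothetical, and the "life span" pair in the second document is invented purely to show two documents sharing a link.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Iterable, Set, Tuple

Pair = Tuple[str, str]


def build_concept_map(
    pairs_by_document: Dict[str, Iterable[Pair]]
) -> Dict[FrozenSet[str], Set[str]]:
    """Map each undirected concept pair to the set of documents reporting it."""
    concept_map: Dict[FrozenSet[str], Set[str]] = defaultdict(set)
    for doc_id, pairs in pairs_by_document.items():
        for left, right in pairs:
            # frozenset makes (a, b) and (b, a) the same link
            concept_map[frozenset((left, right))].add(doc_id)
    return dict(concept_map)


# Hypothetical document ids; doc-2's content is illustrative only.
pairs_by_document = {
    "doc-1": [
        ("insulin infusion rate", "visceral fat"),
        ("visceral fat", "caloric restriction"),
    ],
    "doc-2": [
        ("caloric restriction", "visceral fat"),
        ("caloric restriction", "life span"),
    ],
}

concept_map = build_concept_map(pairs_by_document)
for link, docs in concept_map.items():
    print(sorted(link), "reported in", sorted(docs))
```

A link backed by more than one document is a natural candidate for emphasis in a visual interface, since it connects reports that a plain citation list would show separately.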
14Visual Exploration Interface
15Document Processing and Database Building
- Fetcher
- Extractor
- CrossCheck
16System Architecture
17Putting It All Together: Demo
Research Report Schema Visualization
Concept Representation Concept
Relationships
18Problem: How to automate concept identification?
19Trigger Terms ? Fact vs Finding
20Experiment: Remove Specific Semantic Types
utterance('00000000.tx.1',"Effect of aging on growth hormone-induced growth hormone receptor and Janus-activated kinase 2 phosphorylation").
(map(-1000,ev(-1000,'C0205414','Effect','Effective',effect,qlco,1,1,1,1,0,yes,no))).
(map(-1000,ev(-1000,'C0001811','Ageing','Aging',ageing,orgf,tmco,1,1,1,1,0,yes,no))).
(map(-836,ev(-904,'C0034839','Growth Hormone Receptor','Receptors, Somatotropin',growth,hormone,receptor,aapp,rcpt,1,2,1,2,0,6,6,3,3,0,yes,no),
  ev(-632,'C0205263','Induced','Induced',induced,ftcn,3,3,1,1,0,no,no))).
(map(-868,ev(-722,'C0169661','Janus kinase 2','Janus kinase 2',janus,kinase,'2',aapp,enzy,1,1,1,1,0,3,3,2,2,0,4,4,3,3,0,no,no),
  ev(-604,'C0879526',activate,activate,activate,ftcn,2,2,1,1,1,no,no),
  ev(-804,'C0031715','Phosphorylation','Phosphorylation',phosphorylation,npop,5,5,1,1,0,yes,no))).
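The experiment named on this slide, removing specific semantic types, can be sketched as a filter over the candidate concepts in the output above. The candidates below are transcribed from the slide; reading `qlco`, `orgf`, `tmco`, `aapp`, `rcpt`, `ftcn`, `enzy`, and `npop` as each candidate's semantic types is an interpretation of the flattened output. The slide does not say which types were removed or how candidates with several types were handled, so the exclusion list and the "drop only if every type is excluded" rule are assumptions for illustration.

```python
from typing import Iterable, List, NamedTuple, Set


class Candidate(NamedTuple):
    cui: str                    # UMLS concept identifier
    name: str                   # concept name as shown in the output
    semantic_types: List[str]   # semantic type abbreviations


# Transcribed from the MetaMap output shown on the slide.
CANDIDATES = [
    Candidate("C0205414", "Effect", ["qlco"]),
    Candidate("C0001811", "Ageing", ["orgf", "tmco"]),
    Candidate("C0034839", "Growth Hormone Receptor", ["aapp", "rcpt"]),
    Candidate("C0205263", "Induced", ["ftcn"]),
    Candidate("C0169661", "Janus kinase 2", ["aapp", "enzy"]),
    Candidate("C0879526", "activate", ["ftcn"]),
    Candidate("C0031715", "Phosphorylation", ["npop"]),
]

# ASSUMPTION: illustrative exclusion list, not the one used in the experiment.
EXCLUDED_TYPES = {"qlco", "ftcn", "tmco"}


def remove_semantic_types(
    candidates: Iterable[Candidate], excluded: Set[str]
) -> List[Candidate]:
    """Keep candidates that have at least one semantic type outside `excluded`.

    ASSUMPTION: a candidate is dropped only when all of its types are excluded.
    """
    return [c for c in candidates if not set(c.semantic_types) <= excluded]


kept = remove_semantic_types(CANDIDATES, EXCLUDED_TYPES)
for candidate in kept:
    print(candidate.cui, candidate.name, candidate.semantic_types)
```

With this illustrative list, generic terms such as "Effect", "Induced", and "activate" are dropped while the domain concepts remain. A stricter rule (drop a candidate if any of its types is excluded) would also remove "Ageing" here; which rule suits the data is an empirical question.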
21Results
- Recall: 44.81%
- Precision (normal MetaMap processing): 15.46%
- Precision (MetaMap with STs removed): 33.41%
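For readers who want the definitions behind these figures, the sketch below computes precision and recall for a set of system-identified concepts against a reference set. It assumes the reference set is the manually extracted concepts, which the slide does not state explicitly. The two toy sets are illustrative and are not the study data.

```python
from typing import Iterable, Tuple


def precision_recall(system: Iterable[str], reference: Iterable[str]) -> Tuple[float, float]:
    """Return (precision, recall) of `system` concepts against `reference` concepts."""
    system_set, reference_set = set(system), set(reference)
    true_positives = len(system_set & reference_set)
    # precision: share of system concepts that are in the reference set
    precision = true_positives / len(system_set) if system_set else 0.0
    # recall: share of reference concepts that the system found
    recall = true_positives / len(reference_set) if reference_set else 0.0
    return precision, recall


# Toy data for illustration only.
reference = {"insulin infusion rate", "visceral fat", "caloric restriction", "ad libitum"}
system = {"insulin infusion rate", "visceral fat", "rats", "points", "relationship"}

precision, recall = precision_recall(system, reference)
print(f"precision={precision:.2%} recall={recall:.2%}")
```

Dropping spurious system concepts shrinks the denominator of precision without touching the reference set, which is the general mechanism by which filtering can raise precision.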
22Future Work
- Continuing refinement of MetaMap processing to improve recall and precision
- Investigate using other NLM tools to automate relationship analysis
- Tackle issues re:
  - performance of system in real world
  - does system actually support researchers learning something not known before or, even more novel, not previously reported in the literature?
23Summary
- The Telemakus system is unique in combining document surrogates with interactive concept maps of linked relationships across groups of research reports.
- Telemakus formalizes representation of the research methods and results of scientific reports, thus offering a potential strategy to enhance the scientific discovery process.
- Scalability is an issue: automating concept and relationship analysis is essential for Telemakus to expand beyond this specific domain.
- MetaMap shows promise as a means of addressing the concept analysis problem; other NLM tools need to be explored as potential means for addressing the scalability issue.
24TRY TELEMAKUS
Telemakus: http://www.telemakus.net/
Telemakus is funded, in part, by the Ellison Medical Foundation: http://www.ellisonfoundation.org/
Telemakus is a component of the Ellison-funded Science of Aging project in partnership with AAAS and Highwire Press at Stanford University: http://sageke.sciencemag.org/
25Project Team
- Sherrilynne Fuller, Principal Investigator
- Debra Revere, Research Coordinator
- Paul F. Bugni, Lead Software Engineer
- David J. Owens, Programmer
- Lisa Tisch, Information Analyst
- Heather L. Fuller, Information Analyst
- Lucas Reber, Systems Administrator
- Wendy Kramer, Fiscal Specialist
- George M. Martin, Chair, Scientific Advisory Committee, Ellison Medical Foundation
26THANK YOU!!