1
Toward a Medical Semantic Web
  • Gilberto Fragoso
  • Enterprise Vocabulary Services
  • NCI Center for Bioinformatics and Information
    Technology
  • The Semantic Web meets the Deep Web (SWDW'08)
  • July 23, 2008

2
Topics
  • Background
  • Mission
  • NCI EVS main products
  • NCIT and BGT production cycle in brief
  • Tooling, Support Challenges
  • Infrastructure
  • Editing Support GUI (NCIEditTab)
  • Classification Performance
  • Explanation Facility
  • Semantic MediaWiki
  • BiomedGT

3
EVS Mission and Products
  • Services and resources that address NCI's needs
    for controlled vocabulary. See
    http://ncicb.nci.nih.gov/NCICB/infrastructure/cacore_overview/vocabulary
  • Clinical, translational, and basic research have
    overlapping but specialized terminology needs.
  • EVS integrates different conceptual frameworks
  • Creates terminological and taxonomic conventions
    across systems
  • Provides common terminology for annotation and
    coding
  • Forms semantic component for Cancer
    Bioinformatics Grid (caGRID)
  • Vocabulary Products
  • NCI Thesaurus: an ontology-like terminology
  • NCI Metathesaurus: maps vocabularies
  • External vocabularies served: MedDRA, HL7,
    NDF-RT, LOINC, GO, Zebrafish, ...
  • BiomedGT (new)

4
NCI Thesaurus
  • Reference Terminology for NCI, caBIG, Partners
  • A Federal Standard Terminology in some areas
  • Broad coverage of the cancer research and
    clinical domain including prevention and
    treatment trials
  • Neoplastic and other Diseases
  • Findings and Abnormalities
  • Anatomy, Tissues, Subcellular Structures
  • Agents, Drugs, Chemicals
  • Genes, Gene Products, Biological Processes
  • Animal Models: Mouse, other
  • Research techniques and management, apparatus,
    clinical and lab, radiology, imagery

5
NCI Thesaurus
  • Published Monthly
  • Public domain, open content license
  • 70,000 Concepts hierarchically organized into
    domains
  • Description-logic based
  • Concept History
  • Available on-line and by download (OWL, Ontylog
    XML, flat files)
  • Accessed through caCORE 3.2 (in deprecation) with
    Apelon DTS backend, and through caCORE 4.0 and
    LexBIG server

http://ncicb.nci.nih.gov/download/evsportal.jsp
6
Biomedical Grid Terminology (BiomedGT) - New
  • Goals
  • Open, publicly accessible, collaboratively
    developed terminology for translational research
  • Concept orientation
  • DL-based, supports reasoning by end users
  • Federated sub-ontologies
  • Content maintained by experts in the relevant
    research communities.
  • Edited in Protégé; content to be added in
    multiple ways, one of which is through a
    Semantic MediaWiki

7
NCIT Production Environment
(Workflow diagram. Labels, in extraction order:
Conflict Detection and Resolution; Test Environment;
Classification; Release Candidate; Workflow Manager
(Work Manager Client, DB Schema: Master Baseline);
History; Baseline; Terminology Server; Work Lists;
Change sets; Publishable History; History Processing
and Validation; Individual Editor (Workflow Client,
Editing Application, DB Schema: Current NCI Baseline);
Editing History; Migration to Production; Individual
Baseline; Classification.)
Classification is performed on the client.
8
BGT and NCIT in OWL
  • Advantages of OWL
  • W3C Recommendation for Web
  • Non-proprietary, semantics are published
  • Disadvantages
  • Nascent technology, some features for vocabulary
    development not yet there
  • Tool support for vocabulary development
  • Editors
  • Classifiers and Classification Services

9
Challenges
  • Editing environment
  • Collaboration with Stanford on Protégé/OWL,
    database backend, support for imports, client
    server
  • Dedicated GUI support (NCIEditTab)
  • Classification Challenges, Clark & Parsia
  • Performance of existing classifiers, runtime
    classification
  • More expressive DL explanation facility
  • Access of one classification run to all editors
    in client-server environment

10
Current BGT Production Environment (and future NCIT)
(Workflow diagram. Labels, in extraction order:
Workflow Manager; Prompt and Classification done in
server; Test Environment; Publish; Release Candidate;
History; Terminology Server; Conflict Detection and
Resolution, and Classification; Baseline; Editing
History; Individual Editor; Editing Application;
Migration to Production; Wiki Collaborators (specific
to BGT).)
Classification is desirable from the client in
client-server mode.
11
Dedicated GUI for Protégé
12
(No Transcript)
13
Challenges
  • Editing environment
  • Collaboration with Stanford on Protégé/OWL,
    database backend, support for imports, client
    server
  • Dedicated GUI support (NCIEditTab)
  • Classification Challenges, Clark & Parsia
  • Performance of existing classifiers, runtime
    classification
  • More expressive DL explanation facility
  • Access of one classification run to all editors
    in client-server environment

14
Reasoner Engineering
  • Maturing the Pellet reasoner
  • Case Study: NCIt Classification Services
  • Prior to initial work: non-terminating
  • Improving resource efficiency: 9 hours
  • Algorithmic optimizations: 5 minutes
  • Incremental updates: seconds
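
The gap between a full classification run and an incremental update can be illustrated with a toy model. The sketch below is not Pellet's algorithm: it treats the terminology as a plain, acyclic is-a hierarchy and compares recomputing every ancestor set with patching only the concepts affected by one new assertion. The function names and concept names are hypothetical.

```python
from typing import Dict, Set


def full_closure(parents: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Recompute the ancestor set of every concept from scratch.

    Assumes the asserted hierarchy is acyclic.
    """
    closure: Dict[str, Set[str]] = {}

    def visit(concept: str) -> Set[str]:
        if concept not in closure:
            closure[concept] = set()
            for parent in parents.get(concept, ()):
                closure[concept] |= {parent} | visit(parent)
        return closure[concept]

    for concept in parents:
        visit(concept)
    return closure


def add_edge_incremental(closure: Dict[str, Set[str]], child: str, parent: str) -> None:
    """Patch an existing closure after asserting `child is-a parent`.

    Only `child` and its descendants gain new ancestors; every other
    concept is left untouched.
    """
    new_ancestors = {parent} | closure.get(parent, set())
    closure.setdefault(parent, set())
    closure.setdefault(child, set())
    for concept, ancestors in closure.items():
        if concept == child or child in ancestors:
            ancestors |= new_ancestors


# Hypothetical four-concept hierarchy.
parents = {"B": {"A"}, "C": {"B"}, "D": set()}
closure = full_closure(parents)
add_edge_incremental(closure, "A", "D")  # new assertion: A is-a D
```

A real DL classifier also has to handle defined concepts, role restrictions, and retractions, so this only conveys why touching a small affected subset can be far cheaper than a full run.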

15
Explaining NCIt
  • Goal: Improve the efficiency of editors by
    identifying problems and causes
  • Solution: automatic analysis services
    (explanation, debugging, repair)
  • Based on mature, formal KR
  • Explanations legible to editors, not just
    logicians
  • Increase editor confidence in toolchain
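
As a rough illustration of what an editor-legible explanation might contain, the sketch below returns the chain of asserted is-a statements that justifies an inferred subsumption. It is a toy over a plain hierarchy, not the explanation service described on this slide, and the function name and concept names are hypothetical.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple


def explain_subsumption(
    asserted: Dict[str, List[str]], sub: str, sup: str
) -> Optional[List[Tuple[str, str]]]:
    """Return a shortest chain of asserted (child, parent) edges
    leading from `sub` to `sup`, or None if `sup` is not an ancestor."""
    queue = deque([(sub, [])])
    seen = {sub}
    while queue:
        node, path = queue.popleft()
        if node == sup:
            return path
        for parent in asserted.get(node, []):
            if parent not in seen:
                seen.add(parent)
                queue.append((parent, path + [(node, parent)]))
    return None


# Hypothetical fragment of a disease hierarchy.
asserted = {
    "Lung_Carcinoma": ["Carcinoma", "Lung_Neoplasm"],
    "Carcinoma": ["Neoplasm"],
    "Lung_Neoplasm": ["Neoplasm"],
    "Neoplasm": ["Disease"],
}
steps = explain_subsumption(asserted, "Lung_Carcinoma", "Disease")
for child, parent in steps:
    print(f"{child} is-a {parent}")
```

Explanations over a description logic have to account for axioms other than simple is-a links, so a production facility would return sets of axioms rather than a single path.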

16
Explanation
17
Genesis of BiomedGT
NCI Thesaurus Evaluation 2006-2007
  • Goals
  • Review and report on OBO criteria and relevant
    ISO standards for semantic quality, consistency,
    and federation of terminologies.
  • Review content and structure for compliance
  • Document examples of how compliance would be
    achieved

18
Genesis of BiomedGT
Among the Recommendations
  • 1) Unravel the vocabulary
  • Partition into
  • Words and their definitions (Lexicon /
    Dictionary)
  • Categorization and navigational nodes (Thesauri)
  • Ontology
  • Identify external resources and
  • Include ability to reference general upper level
    ontologies
  • Named relationships with other external resources
  • 2) Enable collaboration: create an environment
    where SMEs can collaborate and discuss

19
Unraveling the Vocabulary
Traditional Hierarchical System
(C. Chute, Mayo Clinic)
20
Unraveling the Vocabulary
in BGT
owl:Thing
BFO
BGT Thesaurus Nodes
BGT Word Nodes
BioTop
BGT Ontology Nodes
21
Unraveling the Vocabulary
in BGT
owl:Thing
BFO
BGT Thesaurus Nodes
BGT Word Nodes
BioTop
Sparse Trees, Populated by the DL Classifier
BGT Ontology Nodes
Shallow Trees
22
Reuse of Resources
owl:Thing
BFO
BGT Thesaurus Nodes
  • 1) External Namespaces, Reference (GO, ChEBI, JAX)
  • 2) External Namespaces, Modeled (CTCAE, NPO?)

BioTop
BGT Ontology Nodes
  • 3) Internal Namespaces, Collaboratively Modeled
    (NPO?)
23
Enable Collaboration
Semantic MediaWiki
  • MediaWiki extension
  • Focus is on capturing wiki (particularly
    Wikipedia) content in a formal, computational
    fashion
  • Berlin Category:City
  • → Berlin rdf:type City
  • City Category:Geographical Feature
  • → City rdfs:subClassOf Geographical_Feature
  • Berlin hasPopulation 80,000,000
  • → <property:HasPopulation rdf:datatype="xmls:double">80000000</property:HasPopulation>
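
The three mappings on this slide can be sketched as a small function that turns a wiki annotation into a subject-predicate-object triple. This is a minimal sketch, not Semantic MediaWiki's actual export code: the `name:value` annotation syntax, the function name, and the `page_is_category` flag are assumptions made for illustration.

```python
from typing import Tuple

Triple = Tuple[str, str, str]


def annotation_to_triple(
    page: str, annotation: str, page_is_category: bool = False
) -> Triple:
    """Map one annotation on `page` to a (subject, predicate, object) triple.

    Assumed input syntax: "Category:<name>" or "<property>:<value>".
    """
    name, _, value = annotation.partition(":")
    name, value = name.strip(), value.strip()
    if name == "Category":
        # A category on an ordinary page states class membership;
        # a category on a category page states a subclass relation.
        predicate = "rdfs:subClassOf" if page_is_category else "rdf:type"
        return (page, predicate, value.replace(" ", "_"))
    # Any other annotation is treated as a property with a literal value.
    predicate = f"property:{name[0].upper()}{name[1:]}"
    return (page, predicate, value.replace(",", ""))


# The slide's own examples, written in the assumed syntax.
t1 = annotation_to_triple("Berlin", "Category:City")
t2 = annotation_to_triple("City", "Category:Geographical Feature", page_is_category=True)
t3 = annotation_to_triple("Berlin", "hasPopulation:80,000,000")
```

The triples come out as `rdf:type`, `rdfs:subClassOf`, and a `property:HasPopulation` statement, matching the three arrows on the slide. Datatype handling is left out.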

24
(No Transcript)
25
Category Drill-Down
26
Ajax-Based Search
27
Node Display
28
Workflow Propose Changes
29
Workflow Propose Changes
30
Workflow Propose Changes
31
Workflow Propose Changes
32
BiomedGT Collab Cycle
Using Semantic MediaWiki and NCI Protégé
33
Acknowledgements
  • Classification and Explanation: Michael Smith,
    Michael Grove, Evren Sirin (Clark & Parsia, LLC)
  • Protégé Infrastructure: Timothy Redmond,
    Tania Tudorache (Stanford Medical Informatics)
  • Editing Plug-in, Workflow Plug-in: Bob Dionne
    (Dionne Assoc), David Yee (Northrop Grumman)
  • Semantic MediaWiki: Harold Solbrig, Russ Hamm
    (Apelon, Inc), Guoquian Jiang, Deepak Sharma,
    Sridar Dwarkanath (Mayo Clinic), Wilberto Garcia
    (Northrop Grumman)
  • EVS Group: Sherri de Coronado, Frank Hartel,
    Larry Wright, Margaret Haber, Gilberto Fragoso
    (NCI CBIIT)