Title: Terminology and the Semantic MediaWiki (Ecoterm IV, Vienna)
1Terminology Curation with the Semantic MediaWiki
- Harold Solbrig
- Informatics Architect
- Apelon, Inc.
3The Primary Task
- Evaluate the roles, categories and organization
of the National Cancer Institute (NCI)'s Cancer
Thesaurus with respect to
  - Upper Level Ontological Principles
  - ISO TC37 related principles
- As with ontology construction, it was understood
by all parties that this was a process, not a
goal.
4Approach
- Gather appropriate upper level ontologies (BFO,
Dolce, Top Bio, UMLS Semantic Net and OBO
Relations Ontology) into a single, readily
referenced format
- Load NCI Thesaurus into same format
- Multiple parties review, annotate, recommend and
categorize
- Publish, analyze and evaluate results
5Solution
- By using the Semantic MediaWiki (SMW), we were
able to accomplish all of the goals in a (very)
reasonable period of time
6Discussion
- We also discovered that, with some extensions,
the SMW could be useful for publishing,
annotating and cross-referencing other
terminological (and other...) resources.
7Questions?
8Wikis
- Community developed
- Collaborative
- Organic to the very core
- Primary focus (to date) is human consumption
- Traceable, provenance automatically recorded,
differences, undo and redo.
9MediaWiki
- http://en.wikipedia.org/wiki/Wiki
- Base for Wikipedia and many others
- Key characteristics
- Web based editing
- Page links
- Categories
- Templates
10MediaWiki
- Fully documented using (surprise!) MediaWiki
- Rich mechanisms for discussion, curation, export,
etc.
12Common constructs
- [[Train Transport]]: hyperlink to page named
Train_Transport
- ''Italic'', '''Bold'''
- * Bullet point
- [http://www.w3c.org/ The W3C]: hyperlink
- and much more
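As a rough illustration of the constructs above, the following Python sketch builds the corresponding wikitext strings. The helper names (page_name, internal_link, external_link, italic, bold, bullet) are hypothetical and exist only for this example; the markup they emit is ordinary MediaWiki syntax. Page-name handling is simplified to the space-to-underscore rule mentioned on the slide.

```python
def page_name(title: str) -> str:
    """Turn a display title into the underscore form used in page URLs."""
    return title.strip().replace(" ", "_")


def internal_link(title: str) -> str:
    """Wikitext for a link to another page in the same wiki."""
    return f"[[{title}]]"


def external_link(url: str, label: str) -> str:
    """Wikitext for a labelled link to an external URL."""
    return f"[{url} {label}]"


def italic(text: str) -> str:
    """Wikitext for italic text (two apostrophes on each side)."""
    return f"''{text}''"


def bold(text: str) -> str:
    """Wikitext for bold text (three apostrophes on each side)."""
    return f"'''{text}'''"


def bullet(text: str) -> str:
    """Wikitext for a first-level bullet point."""
    return f"* {text}"


if __name__ == "__main__":
    print(internal_link("Train Transport"), "links to", page_name("Train Transport"))
    print(italic("Italic"), bold("Bold"))
    print(bullet("Bullet point"))
    print(external_link("http://www.w3c.org/", "The W3C"))
```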
13Templates
14Templates
15Sample Template
Extension call
Parameter
16Semantic MediaWiki
17Semantic MediaWiki
- 3 Key extensions to MediaWiki
- Categories: Class
  - PageA in CategoryX → pageA rdf:type categoryX
  - CategoryY in CategoryX → categoryY rdfs:subClassOf categoryX
- Links: Role
  - PageA links to PageB → PageA hasPart PageB
- Attributes: DataProperty
  - population := 32,154,773
  - Includes datatypes
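The mapping above can be sketched in a few lines of Python. This is only an illustration of the idea, not SMW's export code: the function names and the plain-string triple representation are assumptions made for the example, and the attribute handling covers integers only.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]


def category_triples(page: str, categories: List[str]) -> List[Triple]:
    """Map category membership to RDF.

    An ordinary page in a category becomes an instance (rdf:type);
    a category page in a category becomes a subclass (rdfs:subClassOf).
    """
    is_category = page.startswith("Category:")
    predicate = "rdfs:subClassOf" if is_category else "rdf:type"
    return [(page, predicate, f"Category:{c}") for c in categories]


def relation_triple(page: str, relation: str, target: str) -> Triple:
    """Map a typed link between two pages to a role (object property)."""
    return (page, relation, target)


def attribute_triple(page: str, attribute: str, raw_value: str) -> Triple:
    """Map an attribute to a data property with a typed literal.

    Simplifying assumption: only integers written with thousands
    separators are recognised; everything else is kept as a plain string.
    """
    digits = raw_value.replace(",", "")
    if digits.isdigit():
        literal = f'"{digits}"^^xsd:integer'
    else:
        literal = f'"{raw_value}"'
    return (page, attribute, literal)


if __name__ == "__main__":
    for triple in (
        category_triples("PageA", ["CategoryX"])
        + category_triples("Category:CategoryY", ["CategoryX"])
        + [relation_triple("PageA", "hasPart", "PageB")]
        + [attribute_triple("PageA", "population", "32,154,773")]
    ):
        print(" ".join(triple))
```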
18Categories and Relations
19Attributes
20Semantic Rendering
RDF (!)
Relation
Attribute Value
Type (or superClass)
21Thesaurus Content
22Templates?
Gene_Product_Is_Biomarker_Type The role is
used to designate the type of Kind
Category:NCI_Kind Semantic Type
NCI_Semantic_Type::Category:SN_Conceptual_Entity
Conceptual Entity
Brittle, not readily changed
23Templates?
OntylogDescription|ns=NCI|text=The role is
used to designate Kind|ns=NCI|target=Kind
ResourceRef|name=Semantic_Type|ns=NCI|target=Conceptual_Entity|targetns=SN
Can readily be updated via template
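The contrast between the two "Templates?" slides can be sketched as follows. The template name ResourceRef and its parameter names come from the slide; the delimiters are standard MediaWiki template syntax, while the expansion produced by expand_resource_ref is a hypothetical stand-in for whatever the real template emits. The point of the sketch is only that the page stores the short call and the layout is defined once.

```python
from typing import Dict


def template_call(template: str, params: Dict[str, str]) -> str:
    """Serialize a template call with named parameters as wikitext."""
    parts = [template] + [f"{key}={value}" for key, value in params.items()]
    return "{{" + "|".join(parts) + "}}"


def expand_resource_ref(name: str, ns: str, target: str, targetns: str) -> str:
    """Hypothetical expansion of a ResourceRef-style template.

    The layout lives in this one function, so changing it here changes
    the rendering of every page that uses the template.
    """
    heading = name.replace("_", " ")
    label = target.replace("_", " ")
    return f"{heading}: [[{ns}_{name}::Category:{targetns}_{target}|{label}]]"


if __name__ == "__main__":
    params = {
        "name": "Semantic_Type",
        "ns": "NCI",
        "target": "Conceptual_Entity",
        "targetns": "SN",
    }
    print(template_call("ResourceRef", params))  # what the page stores
    print(expand_resource_ref(**params))         # what the template could emit
```

Hard-coding the expanded form on every page is what makes the first approach brittle; with the call form, only the template definition has to change.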
24Link to another NCI comment
Link to external Ontology
Categorization in external Ontology
Commentary
25Computed
26How is it Working?
27What can we do to improve it
28Terminology
- Centrally curated
- Central to the practice of medicine
- Insurance and reporting
- Regulatory
- Research
- Clinical Practice
- Information Sharing
- ICD-9, CPT-4, SNOMED,
29Clinical Terminology
- Quality and content is important
- Needs central vetting, integration, QA
- Central model doesn't scale
- Need input from (many) experts
- Need visible, active feedback loop
30Terminology Workflow 1995
[Diagram: Curation, Controlled Terminology, Lists and Tables, Distribution, Books and PDF; steps numbered (1) to (4)]
31Terminology Workflow 1995
[Diagram: Curation, Lists and Tables, Controlled Terminology B, Distribution, Books and PDF; steps numbered (1) to (3)]
32Terminology Workflow 2008
[Diagram: Curation, Controlled Terminology, Common Distribution Model, Distribution, Online Services; steps numbered (1) to (5)]
33Terminology Workflow 2008
[Diagram: Curation, Controlled Terminology, Controlled Terminology B, Common Distribution Model, Distribution, Online Services; steps numbered (1) to (5)]
34Common Distribution Model
- LexGrid
- (a little bit of) OWL
- NCI Thesaurus and SNOMED CT
- Still requires LexGrid-like additions
- Pushing the envelope
- UMLS RRF
- Although underspecified as a model
35Online Services
- OMG Terminology Query Services
- Not heavily used
- Perceived (incorrectly) as CORBA specific
- Perceived as too complex
- Object oriented and stateful
- ANSI Common Terminology Services
- Being adopted
- Necessary but not sufficient
- Stateless
- CTS-2
- Co-development beginning w/ HL7 and OMG
36Online Services
- LexBIG
- LexGrid for the Bio Informatics Grid
- Robust query specification
- Meets many end-user (developers) requirements
- Not simple to implement; it actually adds value
- Not a standard, but will be used to guide CTS-2
37Workflow and Feedback
[Diagram: Curation, Controlled Terminology, Common Distribution Model, Distribution, Online Services; steps numbered (1) to (5)]
38The Feedback Component
Curation
39The Feedback Component
[Diagram: Curation, Common Distribution Model, Distribution, Online Services, Semantic MediaWiki, Annotations and Change Requests, Community Review, Version Staging]
40Issues and Next Steps
- (1) SHARED Semantics
- Definition
- Synonym
- References
- DLSome
- DLAll
-
- ISO 12620 anyone?
41Issues and Next Steps
- (2) Figure out namespaces
- NCI:Activity, AgroVoc:Fish,
- NCI_Activity, AgroVoc_Fish
- ???
- (2a) Identifiers (Activity vs. C12345)
- (2b) Versions
- (2c) URIs (vs. URLs)
- Internal
- External
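The naming questions above can be made concrete with a small sketch. Everything here is an assumption for illustration: the base URL is a placeholder, the two separator styles are just the candidates listed on the slide, and C12345 is the slide's own placeholder code, not a real identifier lookup.

```python
from urllib.parse import quote

# Hypothetical wiki base URL, used only for this example.
BASE_URL = "http://example.org/wiki/"


def page_title(namespace: str, local_id: str, style: str = "underscore") -> str:
    """Build a page title from a terminology namespace and a local identifier.

    style="colon" gives NCI:Activity; style="underscore" gives NCI_Activity.
    The local identifier may be a name (Activity) or a code (C12345).
    """
    if style == "colon":
        return f"{namespace}:{local_id}"
    if style == "underscore":
        return f"{namespace}_{local_id}"
    raise ValueError(f"unknown style: {style}")


def page_url(title: str) -> str:
    """Build the URL of a wiki page: spaces become underscores, the rest is percent-encoded."""
    return BASE_URL + quote(title.replace(" ", "_"), safe=":_")


if __name__ == "__main__":
    for local_id in ("Activity", "C12345"):
        for style in ("colon", "underscore"):
            title = page_title("NCI", local_id, style)
            print(title, page_url(title))
```

Whichever convention is chosen, keeping it in one function like page_title makes it cheap to change later, which matters while the namespace question is still open.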
42Certification and Sanctioning
- Who can edit?
- Who can validate?
- Who selects updates?
- (see http://en.citizendium.org/wiki/Main_Page)
43Automatic Export
- Selecting sets of updates
- Formatting update recommendations for target
curators, etc.
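One way to picture the two export steps is the sketch below. The ChangeRequest record, its fields, and the tab-separated report layout are all hypothetical; the slide does not say how updates are stored or what format curators expect.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class ChangeRequest:
    concept: str      # wiki page the request is attached to
    terminology: str  # terminology whose curators should receive it
    text: str         # the recommended change, in free text
    approved: bool    # whether community review accepted it


def select_updates(requests: Iterable[ChangeRequest], terminology: str) -> List[ChangeRequest]:
    """Select the approved requests addressed to one target terminology."""
    return [r for r in requests if r.approved and r.terminology == terminology]


def format_report(requests: Iterable[ChangeRequest]) -> str:
    """Format the selected requests as one tab-separated line per request."""
    return "\n".join(f"{r.terminology}\t{r.concept}\t{r.text}" for r in requests)


if __name__ == "__main__":
    # Made-up sample data for the example.
    pending = [
        ChangeRequest("Activity", "NCI", "Add synonym", True),
        ChangeRequest("Fish", "AgroVoc", "Revise definition", True),
        ChangeRequest("Kind", "NCI", "Rename", False),
    ]
    print(format_report(select_updates(pending, "NCI")))
```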
44Synchronization
- Changes implemented in terminology
- Update wiki pages
- Say what changed
- What changes are incorporated by value? By
reference?
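The "say what changed" step can be sketched as a comparison of two terminology versions. The representation (a mapping from concept code to preferred name) and the sample codes are assumptions for the example; a real terminology carries far more than a name per concept.

```python
from typing import Dict, List


def summarize_changes(old: Dict[str, str], new: Dict[str, str]) -> List[str]:
    """Describe the differences between two versions of a terminology.

    Each version maps a concept code to its preferred name. The result
    lists added, removed and renamed concepts in code order.
    """
    lines = []
    for code in sorted(set(old) | set(new)):
        if code not in old:
            lines.append(f"added {code}: {new[code]}")
        elif code not in new:
            lines.append(f"removed {code}: {old[code]}")
        elif old[code] != new[code]:
            lines.append(f"changed {code}: {old[code]} -> {new[code]}")
    return lines


if __name__ == "__main__":
    # Made-up sample versions for the example.
    version_1 = {"C1": "Activity", "C2": "Fish"}
    version_2 = {"C1": "Activity", "C2": "Fishes", "C3": "Kind"}
    for line in summarize_changes(version_1, version_2):
        print(line)
```

A summary like this could be posted to the affected wiki pages after each terminology release, so reviewers see which of their annotations were taken up.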
45Approach and Responsible Parties
- Shared Semantics
- Core set based on LexGrid and OWL
- Post on WIKI and link on SMW site
- Assigned to Apelon, Mayo, NCI, ???
- Extend to OBO, SKOS (?), XMDR
- Connections to ISO 12620
46Time Frame and Assignments
- URIs, namespaces, naming
  - UK NCR (CancerGrid) looking at unAPI and
servers
  - (Hopefully) can provide URI resolver svc.
- Short term: use templates / extensions
47Content
- SNOMED-CT, ICD-9-CM, many, many others are
already available via Apelon DTS Services
- Available soon
  - FMA, HL7 Version 3 Terminology, OBO Foundry (GO,
PATO, etc.) as time permits
  - Others as needed (and funded)
48What we've got to date
- Apelon DTS Server Extension
  - Includes both defined and classified view (!)
  - Export in RESTful XML (currently Apelon, soon to
be LexGrid)
  - XMDR Export Format
- Protégé (Native and OWL 3.2) prototype
  - Done by Mayo
  - Both import and export
  - Still needs templates
49Questions?
- This time for real
- Note: SMW will be made externally available (w/
simple password) once we get contract-specific
info cleaned up (NCI will probably publish
shortly). Contact hsolbrig_at_apelon.com for
access.