Title: Scale and Context: Issues in Ontologies to link Health- and Bio-Informatics
1 Scale and Context: Issues in Ontologies to link Health- and Bio-Informatics
- Alan Rector, Jeremy Rogers, Angus Roberts, Chris Wroe
- Bio and Health Informatics Forum / Medical Informatics Group, Department of Computer Science, University of Manchester
- rector_at_cs.man.ac.uk
- www.cs.man.ac.uk/mig
- img.man.ac.uk
- www.clinical-escience.org
- www.opengalen.org
2 Organisation of Talk
- Informal presentation, motivation, examples
- Intro to logic based ontologies
- How to use logic based ontologies to represent scales and context
- Making context modular: normalisation
- Recurrent distinctions
- and tests for those distinctions
- Making logic based ontologies usable
- Views and Intermediate Representations
- Summary
3 Example Problems of Context
- Classification by multiple axes
- e.g. Molecular action, physiologic, and pathological effects
- Chloride transport / Cystic fibrosis
- Biological Scope
- e.g. Normal/Abnormal, Human/Mouse
- Conceptual view
- e.g. the Digital Anatomist Foundational Model of organs vs Clinical convention
- Is the pericardium a part of the heart?
4 Basic Approach
- Separate information into independent modules
- Normalise the ontology
- The truth, the whole truth, and nothing but the truth
- Add explicit contextual information
- Don't distort the structure
- Add context to it explicitly
5 Why use Logic-based Ontologies?
because Knowledge is Fractal!
Requirements are Diverse
Coherence without Uniformity!
6 Logic-based Ontologies: Conceptual Lego
gene
protein
cell
expression
chronic
acute
bacterial
deletion
polymorphism
ischaemic
7 Logic-based Ontologies: Conceptual Lego
"SNPolymorphism of CFTRGene causing Defect in MembraneTransport of ChlorideIon causing Increase in Viscosity of Mucus in CysticFibrosis"
"Hand which is anatomically normal"
8 Logic based ontologies
- A formalisation of semantic nets, frame systems, and object hierarchies via KL-ONE and KRL
- "is-kind-of" means "implies" (logical subsumption)
- "Dog is a kind of wolf" means "All dogs are wolves"
- Modern examples: DAML+OIL (/OWL?)
- Older variants: LOOM, CLASSIC, BACK, GRAIL, K-REP, ...
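The reading of "is-kind-of" as implication can be sketched in a few lines of Python. This is only a toy extensional model (one fixed set of made-up individuals per class); a description logic reasoner decides subsumption from the definitions themselves rather than from a list of instances.

```python
# Toy model: each class is the set of individuals that belong to it.
# Class contents and individual names are invented for illustration.
extension = {
    "Animal": {"rex", "fido", "akela", "tom"},
    "Wolf": {"rex", "fido", "akela"},
    "Dog": {"rex", "fido"},
}


def is_kind_of(sub: str, sup: str) -> bool:
    """True if every individual in `sub` is also in `sup` (sub implies sup)."""
    return extension[sub] <= extension[sup]


# "Dog is a kind of wolf" holds here because all dogs are wolves.
print(is_kind_of("Dog", "Wolf"))
# The converse fails: there is a wolf that is not a dog.
print(is_kind_of("Wolf", "Dog"))
```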
9 Logic Based Ontologies: The basics
Validating (constraining cross products)
Primitives
Descriptions
Definitions
Reasoning
Thing
red partOf Heart
red partOf Heart
(feature pathological)
10 Bridging Bio and Health Informatics
- Define concepts with pieces from different scales and disciplines and then combine them
- "Polymorphism which causes defect which causes disease"
- Use concepts which make context explicit
- "Hand which is anatomically normal" → has five fingers
- "Normal human prostate" → has three lobes
- Use different subproperties for different contexts
- Abnormalities of clinical parts of the heart
11 Bridging Scales with Ontologies
Species
Genes
Function
Disease
12 Use composition to express context
- Normal and abnormal
- Hand → isSubdivisionOf some UpperExtremity
- Hand AnatomicallyNormal → hasSubdivision exactly-5 fingers
- Homologies and Orthologies
- Thumb of Hand of Human → hasFeature Opposable
- Thumb of Hand of NonHumanPrimate → hasFeature Opposable
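A minimal Python sketch of the idea on this slide: statements are attached to a composed concept, and apply only to concepts at least as specific as it. The `Concept` class, the `which` method, and the `Polydactylous` qualifier are hypothetical names introduced for illustration; this is not how a description logic reasoner is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """A base concept refined by zero or more context qualifiers."""
    base: str
    qualifiers: frozenset = frozenset()

    def which(self, qualifier: str) -> "Concept":
        # Composition: add one more qualifier, leaving the original untouched.
        return Concept(self.base, self.qualifiers | {qualifier})


# Each statement is attached to the most general concept for which it holds.
AXIOMS = [
    (Concept("Hand"), "isSubdivisionOf some UpperExtremity"),
    (Concept("Hand").which("AnatomicallyNormal"), "hasSubdivision exactly-5 fingers"),
]


def implied(concept: Concept) -> list:
    """Statements inherited by `concept` from every more general concept."""
    return [
        statement
        for general, statement in AXIOMS
        if general.base == concept.base and general.qualifiers <= concept.qualifiers
    ]


normal_hand = Concept("Hand").which("AnatomicallyNormal")
abnormal_hand = Concept("Hand").which("Polydactylous")  # hypothetical qualifier
print(implied(normal_hand))
print(implied(abnormal_hand))
```

The five-finger statement is never asserted of `Hand` in general, so the abnormal hand does not contradict it; the context lives in the composed concept rather than in the structure.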
13 More detailed example
Body
15 Represent context and views by variant properties
is_structurally_part_of
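One way to picture variant properties is as a small property hierarchy in which each view is a sub-property of a general part-of relation. In the sketch below only `is_structurally_part_of` comes from the slide; `is_part_of`, `is_clinically_part_of`, and the two example facts are assumptions made for illustration, echoing the earlier question of whether the pericardium is part of the heart.

```python
# Property hierarchy: sub-property -> super-property.
# Only is_structurally_part_of appears on the slide; the other names are hypothetical.
SUPER_PROPERTY = {
    "is_structurally_part_of": "is_part_of",
    "is_clinically_part_of": "is_part_of",
}

# Illustrative assertions, each made in exactly one view.
FACTS = [
    ("LeftVentricle", "is_structurally_part_of", "Heart"),
    ("Pericardium", "is_clinically_part_of", "Heart"),
]


def specialises(prop, ancestor):
    """True if `prop` is `ancestor` or one of its sub-properties."""
    while prop is not None:
        if prop == ancestor:
            return True
        prop = SUPER_PROPERTY.get(prop)
    return False


def parts_of(whole, view="is_part_of"):
    """Parts of `whole` that are visible through the property `view`."""
    return sorted(s for s, p, o in FACTS if o == whole and specialises(p, view))


print(parts_of("Heart"))                             # general view: both facts
print(parts_of("Heart", "is_structurally_part_of"))  # structural view only
```

Queries phrased with the general property see every view, while a query phrased with one variant sees only that context, so neither view has to distort the other.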
16 What we want to avoid: combinatorial explosions
- The "Exploding Bicycle": From "phrase book" to "dictionary and grammar"
- 1980 - ICD-9 (E826): 8
- 1990 - READ-2 (T30..): 81
- 1995 - READ-3: 87
- 1996 - ICD-10 (V10-19 Australian): 587
- V31.22 Occupant of three-wheeled motor vehicle injured in collision with pedal cycle, person on outside of vehicle, nontraffic accident, while working for income
- and meanwhile elsewhere in ICD-10
- W65.40 Drowning and submersion while in bath-tub, street and highway, while engaged in sports activity
- X35.44 Victim of volcanic eruption, street and highway, while resting, sleeping, eating or engaging in other vital activities
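The arithmetic behind the explosion can be sketched as follows. The axes and their values are invented for illustration (they are not the real ICD or READ axes); the point is only that a "phrase book" grows with the product of the axis sizes, while a "dictionary and grammar" grows with their sum.

```python
from math import prod

# Hypothetical axes of description; values are illustrative, not taken from ICD or READ.
AXES = {
    "victim_vehicle": ["pedal cycle", "three-wheeled motor vehicle", "car"],
    "other_party": ["pedal cycle", "car", "fixed object"],
    "victim_role": ["driver", "passenger", "person on outside of vehicle"],
    "traffic_status": ["traffic accident", "nontraffic accident"],
    "activity": ["working for income", "sports activity", "leisure", "resting"],
}


def enumerated_size(axes):
    """Entries needed if every combination gets its own pre-coordinated code."""
    return prod(len(values) for values in axes.values())


def compositional_size(axes):
    """Primitive terms needed if combinations are composed on demand."""
    return sum(len(values) for values in axes.values())


print(enumerated_size(AXES), compositional_size(AXES))
```

Adding one more value to any axis adds a single term to the compositional vocabulary but multiplies the size of the enumerated list.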
17 The Cost 1: Normalising (untangling) Ontologies
18 The Cost 1: Normalising (untangling) Ontologies
Making each meaning explicit and separate
PhysSubstance
  Protein
    ProteinHormone
      Insulin
    Enzyme
  Steroid
    SteroidHormone
  Hormone
    ProteinHormone
      Insulin
    SteroidHormone
  Catalyst
    Enzyme
build it all by combining simple trees
Hormone = Substance & playsRole-HormoneRole
ProteinHormone = Protein & playsRole-HormoneRole
SteroidHormone = Steroid & playsRole-HormoneRole
Catalyst = Substance & playsRole-CatalystRole
Insulin → playsRole-HormoneRole
Enzyme ?? Protein & playsRole-CatalystRole
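The untangling described here can be sketched as a toy classifier in Python: primitives sit in simple single-parent trees, definitions combine a base with a role, and the multiple classification is computed rather than asserted. The data structures and function names are hypothetical, roles are matched exactly rather than by subsumption, and Enzyme is treated as fully defined purely for the sake of the example.

```python
# Primitive trees: every primitive has exactly one primitive parent.
PARENT = {
    "Protein": "Substance",
    "Steroid": "Substance",
    "Insulin": "Protein",
    "HormoneRole": "Role",
    "CatalystRole": "Role",
}

# Roles asserted directly on primitives.
PLAYS_ROLE = {"Insulin": {"HormoneRole"}}

# Defined concepts: name -> (base, role played).
DEFINED = {
    "Hormone": ("Substance", "HormoneRole"),
    "ProteinHormone": ("Protein", "HormoneRole"),
    "SteroidHormone": ("Steroid", "HormoneRole"),
    "Catalyst": ("Substance", "CatalystRole"),
    "Enzyme": ("Protein", "CatalystRole"),  # assumption: treated as a definition
}


def ancestors(primitive):
    """The primitive itself followed by its chain of primitive parents."""
    chain = [primitive]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def describe(name):
    """Return (base primitive, set of roles) for a primitive or defined concept."""
    base, roles = (name, set())
    if name in DEFINED:
        base, role = DEFINED[name]
        roles = {role}
    for primitive in ancestors(base):
        roles |= PLAYS_ROLE.get(primitive, set())  # roles are inherited
    return base, roles


def inferred_kinds(name):
    """Defined concepts that subsume `name`, computed rather than asserted."""
    base, roles = describe(name)
    return sorted(
        defined
        for defined, (d_base, d_role) in DEFINED.items()
        if defined != name and d_base in ancestors(base) and d_role in roles
    )


print(inferred_kinds("Insulin"))
print(inferred_kinds("Enzyme"))
```

Insulin is asserted only as a Protein that plays a hormone role; its place under both Hormone and ProteinHormone falls out of the definitions.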
19 Normalisation: Building ontologies from orthogonal trees
- Each tree is homogeneous and based on subsumption
- One principle: one of function, structure, cause, ...
- Every primitive has exactly 1 primitive parent
- All multiple classification done by the logic
- All self-standing primitives disjoint
20 The Cost 2: Clean Distinctions and Tests
- Repeating patterns within levels
- Structures vs Substances
- Flavours of part-whole
- Part-whole vs containment, connection, branching
- Process/Event vs Thing (Perdurant vs Endurant)
- Repeating patterns across levels
- Multiples at one level act as substances at the next
- Substances span levels; structures are specific to a level
21 Repeating Patterns within each level
- Structures vs Substances (Discrete vs Mass)
- Structures are made of substances
- Organs are made of tissue
- Parts and portions
- Structures have parts and subdivisions
- Substances have portions
- Portions can have proportions and concentrations
22 Tests
- Structures (Discrete)
- Can you count it? Is one part different from another? Is it made of something(s)?
- Books, organs, ideas, individual cells, organisations, ...
- Substance (Mass)
- Are all bits the same? Can something be made of it? Can you talk about "A piece of it"? "A lump of it"? "A stream of it"?
- Water, sodium, tissue, blood, ...
23 Repeating Patterns within each level
- Part-whole vs containment
- Parthood is organisational
- The wall is part of the cell
- The cornea is part of the eye
- Containment is physical
- The inclusion is contained in the cell
- The marrow is contained in the bone
- Often occur together
- Nucleus is a part of and contained in the cell
- The retina is part of and contained in the eye
24 Tests
- Parts
- If I take the part away, is the whole incomplete?
- If the part is damaged, is the whole damaged?
- If I do something to the part, do I do something to the whole?
- Containment
- Is the contained thing inside the container?
- Is the relationship spatial/physical? (or temporal?)
25 Repeating Patterns bridging levels
- Multiples of structures at one level behave as substances at the next
- Blood is made of, in part, a multiple of red cells
- Tissue is made of, in part, a multiple of cells
- A rash is a multiple of spots
- Polyposis is a multiple of polyps
- A flock is a multiple of birds
- Multiples are not Sets
- Not defined by members
- Membership can change (intensional rather than extensional)
- Action on the singleton is not action on the multiple
- Action on the whole is (usually) action on the singletons
- If I treat a spot, I do not treat the rash
- If I treat the rash, I treat the spots
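The asymmetry between a multiple and its singletons can be illustrated with a small Python sketch. The `Spot` and `Rash` classes are hypothetical, and the downward propagation is written as unconditional even though the slide says only that it usually holds.

```python
from dataclasses import dataclass, field


@dataclass
class Spot:
    treated: bool = False


@dataclass
class Rash:
    """A multiple of spots: the same rash persists while its membership changes."""
    spots: list = field(default_factory=list)
    treated: bool = False

    def treat(self):
        # Action on the multiple is (here, always) action on its singletons.
        self.treated = True
        for spot in self.spots:
            spot.treated = True


first, second = Spot(), Spot()
rash = Rash([first, second])

first.treated = True       # treating one spot...
print(rash.treated)        # ...does not treat the rash

rash.spots.append(Spot())  # membership changes, the rash is still the same object
rash.treat()               # treating the rash...
print(all(s.treated for s in rash.spots))  # ...treats every current spot
```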
26 Tests
- Multiples
- Name for the singleton: grain, cell, bird?
- Singletons are countable?
- Multiple is measurable rather than countable?
- Odd to say part-of: "This cell is part of the Arm"?
27 But make it simple
- Intermediate representations and views
- OWL Detailed Schema is the Assembler Language
- FaCT/SHIQ/... is the machine code
- Almost no one writes in assembler
- let alone machine code
- Separate terms and concepts
- Language/labels from concepts
28 Layered Architecture
Protégé OilEd-II ?
DL
29 Example: An Intermediate Representation for Surgery
- "Open fixation of a fracture of the neck of the left femur"
- MAIN fixing
- ACTS_ON fracture
- HAS_LOCATION neck of long bone
- IS_PART_OF femur
- HAS_LATERALITY left
- HAS_APPROACH open
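To show how an intermediate representation like the one above can be mechanically turned into a nested "which" expression, here is a simplified Python sketch. The nesting of the links is an assumption (the slide's indentation is not preserved), and the renderer keeps the intermediate link names as they are; it does not reproduce the real expansion rules that map them onto the formal properties.

```python
# Each node is (head, [(LINK, child_node), ...]).
# The nesting below is an assumed reading of the intermediate representation.
ir = ("fixing", [
    ("ACTS_ON", ("fracture", [
        ("HAS_LOCATION", ("neck of long bone", [
            ("IS_PART_OF", ("femur", [
                ("HAS_LATERALITY", ("left", [])),
            ])),
        ])),
    ])),
    ("HAS_APPROACH", ("open", [])),
])


def render(node):
    """Render a node as a parenthesised 'head which LINK child ...' expression."""
    head, links = node
    if not links:
        return head
    criteria = " ".join(f"{link} {render(child)}" for link, child in links)
    return f"({head} which {criteria})"


print(render(ir))
```

Authors work with the short, flat form, and software does the expansion; a real expansion would also substitute the formal concept and property names.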
30 The formal assembler version
(SurgicalProcess which
  isMainlyCharacterisedBy (performance which
    isEnactmentOf (SurgicalFixing which
      hasSpecificSubprocess (SurgicalAccessing which
        hasSurgicalOpenClosedness (SurgicalOpenClosedness which
          hasAbsoluteState surgicallyOpen))
      actsSpecificallyOn (PathologicalBodyStructure which <
        involves Bone
        hasUniqueAssociatedProcess FracturingProcess
        hasSpecificLocation (Collum which
          isSpecificSolidDivisionOf (Femur which
            hasLeftRightSelector leftSelection))>))))
31 Result
- Training time: 3 months → 3 days
- Productivity: 25/day → 100/day
- Central reconciliation: 50 → 10
- Local cycle time: 3 months → <1 week
- Dependencies: High → Low
- Author satisfaction: Low → High
- Disputes: Frequent → Rare
- Repeatability: Low → High
Even Pre Web!
32 Navigation vs Retrieval/Reference
Access terminology vs Reference terminology
- Access follows model of use
- e.g. MeSH, MEDCin
- Hierarchy is "what is needed next to hand"
- People find easy; Software hard
- Retrieval follows model of meaning
- Logic based ontologies
- Hierarchy means is-kind-of / subsumption
- People may find odd; Software is easy
- Need Both - visualisations of both
- The logic based structure isn't enough
- Views and intermediate representations
33 What's in a View / Intermediate Representation?
Language
linguistic generation, search
User Oriented Structures
semantic transformations, Filters
Explicit Context in Ontology "Assembler"
34 Summary: Let the logic engine do the work
- Logic based ontologies can bridge granularities and represent context explicitly
- And manage the potential combinatorial explosions
- To do so:
- Views and Interface: usable, flexible, easy to learn
- Entry, Navigation, Use are different
- Structure: explicit, modular, Normalised
- Conception: clean, testable distinctions
- Tools and Architecture: layered, comprehensive
- The logic is the assembly language
36 Some Healthcare Terminologies
37 Some Healthcare Terminologies
- ICD 9/10
- Traditional paper thesauri
- -CM versions essential for billing (and AM)
- CPT: Current Procedural Terminology
- Simple list
- Clinical Terms (Read Codes) V2
- Simple hierarchy
- Still dominant in UK general practice
- SNOMED-CT
- At least logic assisted
- Political questions
- NCI Cancer Ontology
- Logic based in parts; work in progress
38 Others
- Standards Related
- LOINC: laboratory data
- Increasingly structured; logic assisted aspirations
- HL7 Vocabulary TC
- Specialised vocabularies; Inspiration for OHT
- Links to RxNorm
- SNOMED DICOM Microglossary (SDM)
- Image related information; not related to SNOMED
- Open Source
- OpenGALEN Common Reference Model
- Logic based, multilingual; a resource rather than a terminology
- Basis of UK Drug Ontology
- Open Health Terminology
- Watch this space
- Focusing on UMLS
- Likely to be at least logic assisted
39 Special Purpose
- Anatomy
- Digital Anatomist Foundational Model of Anatomy (FMA)
- Principled frame based representation
- Superb reference point for structural anatomy
- Needs functional and clinical supplements
- http://sig.biostr.washington.edu/projects/da/
- Drugs
- RxNorm and VA projects
- See Steve Brown and Stuart Nelson
- UK Primary Care Drug Dictionary, UKCPRS (Secondary Care), Drug Ontology (OpenGALEN based)
- MEDDRA, FDA, Proprietary, ...
40 Unified Medical Language System (UMLS)
- Common reference point and link to MeSH Terms and literature
- De facto standard for universal identifiers
- Concept Unique Identifiers (CUIs)
- Lexical Unique Identifiers (LUIs)
- String Unique Identifiers (SUIs)
- Valuable in itself
- Huge resource for mining and restructuring
- Udo Hahn and Stefan Schulz
- CoMMeT: Conceptual Model of Medical Terminology
- http://www.coling.uni-freiburg.de/pub/schulz/commet/
- Alexa McCray is speaking next