Title: Pax Terminologica
1Biomedical Ontologies: The State of the Art
Barry Smith and Werner Ceusters
MIE, Sarajevo, August 30
2Part 1 (Barry Smith): Ontologies are Representations of What is General in Reality
Part 2 (Werner Ceusters): Referent Tracking: Pinning Ontologies to Instances in Reality
3Uses of ontology in PubMed abstracts
4By far the most successful: GO (Gene Ontology)
5You're interested in which genes control heart
muscle development: 17,536 results
6Microarray data shows changed expression
of thousands of genes.
How will you spot the patterns?
7You're interested in which of your hospital's
patient data is relevant to understanding how
genes control heart muscle development
8Lab / pathology data; EHR data; Clinical trial data; Family history data;
Medical imaging; Microarray data; Model organism data; Flow cytometry;
Mass spec; Genotype / SNP data
How will you spot the patterns? How will you find the data you need?
9One strategy for bringing order into this huge
conglomeration of data is through the use of
Common Data Elements
- Discipline-specific (cancer, NIAID, ...)
- Do not solve the problems of balkanization (data silos)
- Do not evolve gracefully as knowledge advances
- Support data cumulation, but do not readily support data integration and computation
10An ontology is not a terminology
- Existing term lists and CDEs
- built to serve specific data-processing
- in ad hoc ways
- Ontologies
- designed from the start to ensure integrability and reusability of data
- by incorporating a common logical structure
11How does the Gene Ontology work?
with thanks to Jane Lomax, Gene Ontology
Consortium
12GO provides a controlled system of
representations for use in annotating data
- multi-species, multi-disciplinary, open source
- contributing to the cumulativity of scientific results obtained by distinct research communities
- compare use of kilograms, meters, seconds in formulating experimental results
16GO provides answers to three types of questions
- for each gene product
- in what parts of the cell has it been identified?
- exercising what types of molecular functions?
- with what types of biological processes?
- when is a particular gene product involved
- in the course of normal development?
- in the process leading to abnormality?
- with what functions is the gene product
associated in other biological processes?
17Some pain-related terms in GO
- GO:0048265 response to pain
- GO:0019233 sensory perception of pain
- GO:0048266 behavioral response to pain
- GO:0019234 sensory perception of fast pain
- GO:0019235 sensory perception of slow pain
- GO:0051930 regulation of sensory perception of pain
- GO:0050967 detection of electrical stimulus during sensory perception of pain
- GO:0050968 detection of chemical stimulus involved in sensory perception of pain
- GO:0050966 detection of mechanical stimulus involved in sensory perception of pain
18GO:0050968 detection of chemical stimulus involved in sensory perception of pain
19GO provides a tool for algorithmic reasoning
20Hierarchical view representing relations between
represented types
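The kind of algorithmic reasoning that a hierarchy of types supports can be sketched in a few lines. The following is a minimal illustration only: the `IS_A` edges are assumptions made for the example (they are not an export of the real GO), and the gene names are hypothetical.

```python
# Toy is_a graph over a few of the pain-related GO terms listed earlier.
# ASSUMPTION: these edges are illustrative, not taken from a GO release.
IS_A = {
    "GO:0019234": ["GO:0019233"],  # fast pain -> sensory perception of pain
    "GO:0019235": ["GO:0019233"],  # slow pain -> sensory perception of pain
    "GO:0048266": ["GO:0048265"],  # behavioral response -> response to pain
}


def ancestors(term, is_a=IS_A):
    """Return every term reachable from `term` by following is_a edges."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in is_a.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def annotated_to(gene_annotations, term, is_a=IS_A):
    """Genes annotated to `term` directly or through a more specific term."""
    return {
        gene
        for gene, terms in gene_annotations.items()
        if any(t == term or term in ancestors(t, is_a) for t in terms)
    }


# Hypothetical annotations: geneA is annotated only to the specific term.
annotations = {"geneA": ["GO:0019234"], "geneB": ["GO:0048266"]}
pain_perception_genes = annotated_to(annotations, "GO:0019233")
```

A query for the general term also retrieves genes annotated only to its more specific subtypes, which is the practical payoff of the hierarchical view.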
21GO allows a new kind of biological research,
based on analysis and comparison of the massive
quantities of annotations linking GO terms to
gene products
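One widely used way to analyze such annotation sets is term enrichment: asking whether a GO term is attached to more genes in a study set than chance would predict. The sketch below uses a one-sided hypergeometric test; it is offered as a common technique, not as the method of any study cited in these slides, and all counts in the usage lines are invented for illustration.

```python
from math import comb


def enrichment_pvalue(population, annotated, sample, hits):
    """One-sided hypergeometric p-value, P(X >= hits).

    population: number of genes in the background set
    annotated:  background genes annotated to the GO term
    sample:     number of genes in the study set
    hits:       study-set genes annotated to the GO term
    """
    upper = min(annotated, sample)
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, sample - i)
        for i in range(hits, upper + 1)
    )
    return favourable / comb(population, sample)


# Invented toy numbers: 10 background genes, 5 carry the term,
# and all 5 genes in the study set carry it too.
p_all_hits = enrichment_pvalue(10, 5, 5, 5)
# Requiring zero or more hits is always satisfied.
p_trivial = enrichment_pvalue(10, 5, 5, 0)
```

In practice the test is repeated for every term and the p-values are corrected for multiple testing; that step is omitted here.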
22One standard method
- Sjöblom T, et al. analyzed 13,023 genes in 11 breast and 11 colorectal cancers
- using functional information captured by GO for given gene product types
- identified 189 as being mutated at significant frequency and thus as providing targets for diagnostic and therapeutic intervention.
- Science. 2006 Oct 13;314(5797):268-74.
23Uses of GO in studies of
- Biomedical discovery acceleration, with applications to craniofacial development. PMID 19325874
- Persistent changes in spinal cord gene expression after recovery from inflammatory hyperalgesia: a preliminary study on pain memory. PMID 18366630
- Spinal cord transcriptional profile analysis reveals protein trafficking and RNA processing as prominent processes regulated by tactile allodynia. PMID 17069981
- Immune system involvement in abdominal aortic aneurysms. PMID 17634102
24 100 million invested in literature curation using GO
- over 11 million annotations relating gene products described in the UniProt, Ensembl and other databases to terms in the GO
- experimental results reported in 52,000 scientific journal articles manually annotated by expert biologists using GO
- ontologies provide the basis for capturing biological theories in computable form
25GO is amazingly successful in overcoming problems
of balkanization
- but it covers only generic biological entities of three sorts
- cellular components
- molecular functions
- biological processes
- and it does not provide representations of diseases, symptoms, ...
26Extending the GO methodology to other domains of
biology and medicine
27The Open Biomedical Ontologies (OBO) Foundry
GRANULARITY / RELATION TO TIME | CONTINUANT, INDEPENDENT | CONTINUANT, INDEPENDENT | CONTINUANT, DEPENDENT | CONTINUANT, DEPENDENT | OCCURRENT
ORGAN AND ORGANISM | Organism (NCBI Taxonomy) | Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO) | Phenotypic Quality (PaTO) | Biological Process (GO)
CELL AND CELLULAR COMPONENT | Cell (CL) | Cellular Component (FMA, GO) | Cellular Function (GO) | Phenotypic Quality (PaTO) | Biological Process (GO)
MOLECULE | Molecule (ChEBI, SO, RnaO, PrO) | Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Function (GO) | Molecular Process (GO)
28Ontology | Scope | URL | Custodians
Cell Ontology (CL) | cell types from prokaryotes to mammals | obo.sourceforge.net/cgi-bin/detail.cgi?cell | Jonathan Bard, Michael Ashburner, Oliver Hofman
Chemical Entities of Biological Interest (ChEBI) | molecular entities | ebi.ac.uk/chebi | Paula Dematos, Rafael Alcantara
Common Anatomy Reference Ontology (CARO) | anatomical structures in human and model organisms | (under development) | Melissa Haendel, Terry Hayamizu, Cornelius Rosse, David Sutherland
Foundational Model of Anatomy (FMA) | structure of the human body | fma.biostr.washington.edu | JLV Mejino Jr., Cornelius Rosse
Functional Genomics Investigation Ontology (FuGO) | design, protocol, data instrumentation, and analysis | fugo.sf.net | FuGO Working Group
Gene Ontology (GO) | cellular components, molecular functions, biological processes | www.geneontology.org | Gene Ontology Consortium
Phenotypic Quality Ontology (PaTO) | qualities of anatomical structures | obo.sourceforge.net/cgi-bin/detail.cgi?attribute_and_value | Michael Ashburner, Suzanna Lewis, Georgios Gkoutos
Protein Ontology (PrO) | protein types and modifications | (under development) | Protein Ontology Consortium
Relation Ontology (RO) | relations | obo.sf.net/relationship | Barry Smith, Chris Mungall
RNA Ontology (RnaO) | three-dimensional RNA structures | (under development) | RNA Ontology Consortium
Sequence Ontology (SO) | properties and features of nucleic sequences | song.sf.net | Karen Eilbeck
29OBO Foundry
- recognized by NIH as framework to address mandates for re-usability of data collected through Federally funded research
- see NIH PAR-07-425: Data Ontologies for Biomedical Research (R01)
30The OBO Foundry
- Initial Candidate Members
- GO Gene Ontology
- CL Cell Ontology
- SO Sequence Ontology
- ChEBI Chemical Ontology
- PATO Phenotype (Quality) Ontology
- FMA Foundational Model of Anatomy
- ChEBI Chemical Entities of Biological Interest
- CARO Common Anatomy Reference Ontology
- PRO Protein Ontology
31The OBO Foundry
- Under development
- Disease Ontology
- Infectious Disease Ontology
- Mammalian Phenotype Ontology
- Plant Trait Ontology
- Environment Ontology
- Ontology for Biomedical Investigations
- Behavior Ontology
- RNA Ontology
- RO Relation Ontology
33OBO Foundry is organized in terms of Basic Formal
Ontology
- Each Foundry ontology can be seen as an
extension of a single upper level ontology (BFO)
34Basic Formal Ontology (BFO)
- Continuant
  - Independent Continuant
  - Dependent Continuant
- Occurrent (Process, Event)
http://ifomis.uni-saarland.de/bfo/
35Fundamental Dichotomy
- Continuants preserve their identity through change
- vs.
- Occurrents (aka processes)
- have temporal parts
- unfold themselves in successive phases
- exist only in their phases
- have all their parts of necessity
36Ontology and Referent Tracking
types:
- Continuant
  - Independent Continuant (thing)
  - Dependent Continuant (quality)
- Occurrent (process, event)
instances
37Rationale of OBO Foundry coverage (homesteading principle)
GRANULARITY | CONTINUANT, INDEPENDENT | CONTINUANT, INDEPENDENT | CONTINUANT, DEPENDENT | CONTINUANT, DEPENDENT | OCCURRENT
ORGAN AND ORGANISM | Organism (NCBI Taxonomy) | Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO) | Phenotypic Quality (PaTO) | Organism-Level Process (GO)
CELL AND CELLULAR COMPONENT | Cell (CL) | Cellular Component (FMA, GO) | Cellular Function (GO) | Phenotypic Quality (PaTO) | Cellular Process (GO)
MOLECULE | Molecule (ChEBI, SO, RNAO, PRO) | Molecule (ChEBI, SO, RNAO, PRO) | Molecular Function (GO) | Molecular Function (GO) | Molecular Process (GO)
38GRANULARITY / RELATION TO TIME | CONTINUANT, INDEPENDENT | CONTINUANT, INDEPENDENT | CONTINUANT, INDEPENDENT | CONTINUANT, DEPENDENT | CONTINUANT, DEPENDENT | OCCURRENT
COMPLEX OF ORGANISMS | Family, Community, Deme, Population | Family, Community, Deme, Population | ENVIRONMENT | Organ Function (FMP, CPRO) | Population Phenotype | Population Process
ORGAN AND ORGANISM | Organism (NCBI Taxonomy) | Anatomical Entity (FMA, CARO) | ENVIRONMENT | Organ Function (FMP, CPRO) | Phenotypic Quality (PaTO) | Biological Process (GO)
CELL AND CELLULAR COMPONENT | Cell (CL) | Cell Component (FMA, GO) | ENVIRONMENT | Cellular Function (GO) | Phenotypic Quality (PaTO) | Biological Process (GO)
MOLECULE | Molecule (ChEBI, SO, RnaO, PrO) | Molecule (ChEBI, SO, RnaO, PrO) | ENVIRONMENT | Molecular Function (GO) | Molecular Function (GO) | Molecular Process (GO)
39The Gene Ontology (GO)
- Continuant
  - Independent Continuant
    - cell component
  - Dependent Continuant
    - molecular function
- Occurrent
  - biological process
Kumar A., Smith B, Borgelt C. Dependence relationships between Gene Ontology terms based on TIGR gene product annotations. CompuTerm 2004, 31-38.
Bada M, Hunter L. Enrichment of OBO Ontologies. J Biomed Inform. 2006 Jul 26
40Users of BFO
- GO / OBO Foundry
- NCI BiomedGT
- SNOMED CT
- ACGT Clinical Genomics Trials on Cancer Master Ontology / Formbuilder (Case Report Forms for Cancer Clinical Trials)
- Ontology for Risks Against Patient Safety (RAPS) (EU)
41Users of BFO
- MediCognos / Microsoft Healthvault
- Cleveland Clinic Semantic Database in Cardiothoracic Surgery
- Major Histocompatibility Complex (MHC) Ontology (NIAID)
- Neuroscience Information Framework Standard (NIFSTD)
42IDO: Infectious Disease Ontology
- MITRE, Mount Sinai, UT Southwestern: Influenza
- IMBB/VectorBase: Vector-borne diseases (A. gambiae, A. aegypti, I. scapularis, C. pipiens, P. humanus)
- Colorado State University: Dengue Fever
- Duke University: Tuberculosis, Staph. aureus
- Case Western Reserve: Infective Endocarditis
- University of Michigan: Brucellosis
43Users of BFO
- Interdisciplinary Prostate Ontology (IPO)
- Nanoparticle Ontology (NPO): Ontology for Cancer Nanotechnology Research
- Neural Electromagnetic Ontologies (NEMO): Ontology-based Tools for Representation and Integration of Event-related Brain Potentials
- Ontology for General Medical Science
44depends_on
- Continuant
  - Independent Continuant (thing)
  - Dependent Continuant (quality)
- Occurrent (process, event)
quality depends on bearer
45Specifically dependent continuants
- the quality of whiteness of this cheese
- your role as lecturer
- the disposition of this patient to experience diarrhea
46depends_on
- Continuant
  - Independent Continuant (thing)
  - Dependent Continuant (quality)
- Occurrent (process)
temperature depends on bearer
47Realizable dependent continuants
- plan
- function
- role
- disposition
- capability
- tendency
48Their realizations
- execution
- expression
- exercise
- realization
- application
- course
occurrents
49Continuant
- Independent Continuant
- Dependent Continuant
  - Non-realizable Dependent Continuant (quality)
  - Realizable Dependent Continuant (function, role, disposition)
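The continuant branch of the hierarchy can be written down as a small parent map and queried mechanically. This is a minimal sketch of the category tree as it appears on these slides; the dictionary layout and the function names `is_a` and `path_to_root` are choices made for the illustration, not part of BFO itself.

```python
# Parent map for the continuant branch as drawn on the slides.
# ASSUMPTION: the encoding as a dict and the helper names are illustrative.
PARENT = {
    "Independent Continuant": "Continuant",
    "Dependent Continuant": "Continuant",
    "Non-realizable Dependent Continuant": "Dependent Continuant",
    "Realizable Dependent Continuant": "Dependent Continuant",
    "quality": "Non-realizable Dependent Continuant",
    "function": "Realizable Dependent Continuant",
    "role": "Realizable Dependent Continuant",
    "disposition": "Realizable Dependent Continuant",
}


def path_to_root(category):
    """List the category followed by each of its ancestors, root last."""
    path = [category]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def is_a(category, ancestor):
    """True if `ancestor` is the category itself or lies above it."""
    return ancestor in path_to_root(category)


role_path = path_to_root("role")
```

With this encoding, `is_a("disposition", "Continuant")` holds while `is_a("quality", "Realizable Dependent Continuant")` does not, mirroring the split between realizable and non-realizable dependent continuants.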
50realization depends_on disposition
- Continuant
  - Independent Continuant (bearer)
  - Dependent Continuant (disposition)
- Occurrent
  - Process of realization
51Dependence
- a is dependent on b =def. a is necessarily such
that if b ceases to exist then a ceases to exist
52Specifically Dependent Continuants
- if any bearer ceases to exist, then the quality or function ceases to exist
- examples: the color of my skin; the function of my heart to pump blood; my weight
- Specifically Dependent Continuant
  - Quality, Pattern
  - Realizable Dependent Continuant
53Generically Dependent Continuants
- if one bearer ceases to exist, then the entity can survive, because there are other bearers (copyability)
- examples: the pdf file on my laptop; the DNA (sequence) in this chromosome
- Generically Dependent Continuant
  - Information Object
  - Gene Sequence
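The contrast between specific and generic dependence can be made concrete with a toy existence check. This is a sketch under simplifying assumptions: bearers are plain strings, existence is a set-membership test, and the "backup drive" bearer is a hypothetical addition to the slide's pdf example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DependentContinuant:
    name: str
    bearers: frozenset      # names of the entities it depends on
    generic: bool = False   # True for generically dependent continuants


def still_exists(entity, existing_bearers):
    """Apply the dependence rule to the bearers that currently exist.

    Specific dependence: every bearer must still exist.
    Generic dependence: at least one bearer must still exist.
    """
    remaining = entity.bearers & set(existing_bearers)
    if entity.generic:
        return len(remaining) > 0
    return remaining == entity.bearers


skin_color = DependentContinuant("the color of my skin", frozenset({"my skin"}))
# ASSUMPTION: the backup drive is a hypothetical second bearer of the file.
pdf_file = DependentContinuant(
    "the pdf file", frozenset({"my laptop", "backup drive"}), generic=True
)

pdf_survives_laptop_loss = still_exists(pdf_file, {"backup drive"})
color_survives_skin_loss = still_exists(skin_color, set())
```

The pdf file survives the loss of the laptop because a copy has another bearer, whereas the color ceases to exist with the skin that bears it.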
54Four distinct classificatory tasks
1. of people (patients, carriers, ...)
2. of diseases (cases, instances, problems, ...)
3. of courses of disease (symptoms, treatments)
4. of representations (records, observations, data, diagnoses)
- ICD confuses 1. and 2.
- HL7, most standard terminologies, confuse 2. and 4.
55Four distinct BFO categories
- person (patient, carrier, ...): independent continuant
- disease (case, instance, problem, ...): specifically dependent continuant
- course of disease (symptom, treatment): occurrent
- representation (record, datum, diagnosis): generically dependent continuant
56Four distinct BFO categories
- people (patients, carriers, ...): independent continuants
- diseases (cases, instances, problems, ...): dispositions
- courses of disease (symptoms, treatments): realizations of dispositions
- representations (records, data, diagnoses): generically dependent continuants
57Big Picture (with thanks to Richard Scheuermann)
58A disease is a disposition rooted in a physical disorder in the organism and realized in pathological processes.
- etiological process produces disorder
- disorder bears disposition
- disposition realized_in pathological process
- pathological process produces abnormal bodily features
- abnormal bodily features recognized_as signs & symptoms
- signs & symptoms used_in interpretive process
- interpretive process produces diagnosis
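The big-picture diagram can be read as a chain of subject, relation, object triples. The sketch below encodes that chain; the ordering of the links is inferred from the worked disease examples later in the deck, and the function name `follow` and the list layout are assumptions made for illustration.

```python
# The disease model as (subject, relation, object) triples.
# ASSUMPTION: link order is inferred from the worked examples in the deck.
DISEASE_MODEL = [
    ("etiological process", "produces", "disorder"),
    ("disorder", "bears", "disposition"),
    ("disposition", "realized_in", "pathological process"),
    ("pathological process", "produces", "abnormal bodily features"),
    ("abnormal bodily features", "recognized_as", "signs and symptoms"),
    ("signs and symptoms", "used_in", "interpretive process"),
    ("interpretive process", "produces", "diagnosis"),
]


def follow(start, triples=DISEASE_MODEL):
    """Walk the chain from `start`, returning the entities visited in order."""
    next_entity = {subject: obj for subject, _relation, obj in triples}
    path = [start]
    while path[-1] in next_entity:
        path.append(next_entity[path[-1]])
    return path


full_chain = follow("etiological process")
from_disposition = follow("disposition")
```

Walking from the etiological process ends at the diagnosis, which makes explicit that the diagnosis is a product of interpretation several steps removed from the disorder itself.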
59Elucidation of Primitive Terms
- bodily feature - an abbreviation for a physical component, a bodily quality, or a bodily process.
- disposition - an attribute describing the propensity to initiate certain specific sorts of processes when certain conditions are satisfied.
- clinically abnormal - some bodily feature that
- (1) is not part of the life plan for an organism of the relevant type (unlike aging or pregnancy),
- (2) is causally linked to an elevated risk either of pain or other feelings of illness, or of death or dysfunction, and
- (3) is such that the elevated risk exceeds a certain threshold level.
- Compare: baldness
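The three-part test for clinical abnormality reads naturally as a conjunction of conditions. The following sketch encodes it as a predicate; the numeric risk scale, the threshold value, and the example feature values are all invented for illustration and carry no clinical meaning.

```python
from dataclasses import dataclass


@dataclass
class BodilyFeature:
    name: str
    part_of_life_plan: bool  # criterion (1), negated
    elevated_risk: float     # criterion (2); hypothetical 0..1 scale


# ASSUMPTION: an arbitrary threshold, chosen only to make the example run.
RISK_THRESHOLD = 0.1


def is_clinically_abnormal(feature, threshold=RISK_THRESHOLD):
    """Apply criteria (1)-(3) from the elucidation above."""
    not_in_life_plan = not feature.part_of_life_plan      # (1)
    risk_is_elevated = feature.elevated_risk > 0           # (2)
    risk_exceeds_threshold = feature.elevated_risk > threshold  # (3)
    return not_in_life_plan and risk_is_elevated and risk_exceeds_threshold


# Invented example values.
pregnancy = BodilyFeature("pregnancy", part_of_life_plan=True, elevated_risk=0.3)
baldness = BodilyFeature("baldness", part_of_life_plan=False, elevated_risk=0.0)
necrotic_liver = BodilyFeature("necrotic liver", part_of_life_plan=False, elevated_risk=0.8)
```

Under these assumed values only the necrotic liver counts as clinically abnormal: pregnancy fails criterion (1) and baldness fails criterion (2).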
60Definitions - Foundational Terms
- Disorder =def. A causally linked combination of physical components that is clinically abnormal.
- Pathological Process =def. A bodily process that is a manifestation of a disorder and is clinically abnormal.
- Disease =def. A disposition (i) to undergo pathological processes that (ii) exists in an organism because of one or more disorders in that organism.
61Dispositions and Predispositions
- All diseases are dispositions; not all dispositions are diseases.
- A predisposition is a disposition.
- Predisposition to Disease of Type X =def. A disposition in an organism that constitutes an increased risk of the organism's subsequently developing the disease X.
- HNPCC is caused by a disorder (mutation) in a DNA mismatch repair gene that disposes to the acquisition of additional mutations from defective DNA repair processes, and thus is a predisposition to the development of colon cancer.
62Definitions - Clinical Evaluation Terms
- Sign =def. A bodily feature of a patient that is observed in a physical examination and is deemed by the clinician to be of clinical significance. (Objectively observable features)
- Symptom =def. An experienced bodily feature of a patient that is observed by and observable only by the patient and is of the type that can be hypothesized by a patient to be a realization of a disease. (A restricted family of phenomena including pain, nausea, anger, drowsiness, which are of their nature experienced in the first person)
- Symptoms are subjective. But this does not mean that there is no objective fact of the matter whether a given symptom exists.
63Cirrhosis - environmental exposure
- Etiological process - phenobarbital-induced hepatic cell death
- produces
- Disorder - necrotic liver
- bears
- Disposition (disease) - cirrhosis
- realized_in
- Pathological process - abnormal tissue repair with cell proliferation and fibrosis that exceed a certain threshold; hypoxia-induced cell death
- produces
- Abnormal bodily features
- recognized_as
- Symptoms - fatigue, anorexia
- Signs - jaundice, splenomegaly
- Symptoms & Signs
- used_in
- Interpretive process
- produces
- Hypothesis - rule out cirrhosis
- suggests
- Laboratory tests
- produces
- Test results - elevated liver enzymes in serum
- used_in
- Interpretive process
- produces
- Result - diagnosis that patient X has a disorder
that bears the disease cirrhosis
64Influenza - infectious
- Etiological process - infection of airway epithelial cells with influenza virus
- produces
- Disorder - viable cells with influenza virus
- bears
- Disposition (disease) - flu
- realized_in
- Pathological process - acute inflammation
- produces
- Abnormal bodily features
- recognized_as
- Symptoms - weakness, dizziness
- Signs - fever
- Symptoms & Signs
- used_in
- Interpretive process
- produces
- Hypothesis - rule out influenza
- suggests
- Laboratory tests
- produces
- Test results - elevated serum antibody titers
- used_in
- Interpretive process
- produces
- Result - diagnosis that patient X has a disorder
that bears the disease flu
65Huntington's Disease - genetic
- Etiological process - inheritance of >39 CAG repeats in the HTT gene
- produces
- Disorder - chromosome 4 with abnormal mHTT
- bears
- Disposition (disease) - Huntington's disease
- realized_in
- Pathological process - accumulation of mHTT protein fragments, abnormal transcription regulation, neuronal cell death in striatum
- produces
- Abnormal bodily features
- recognized_as
- Symptoms - anxiety, depression
- Signs - difficulties in speaking and swallowing
- Symptoms & Signs
- used_in
- Interpretive process
- produces
- Hypothesis - rule out Huntington's
- suggests
- Laboratory tests
- produces
- Test results - molecular detection of the HTT gene with >39 CAG repeats
- used_in
- Interpretive process
- produces
- Result - diagnosis that patient X has a disorder that bears the disease Huntington's disease
66HNPCC - genetic pre-disposition
- Etiological process - inheritance of a mutant mismatch repair gene
- produces
- Disorder - chromosome 3 with abnormal hMLH1
- bears
- Disposition (disease) - Lynch syndrome
- realized_in
- Pathological process - abnormal repair of DNA mismatches
- produces
- Disorder - mutations in proto-oncogenes and tumor suppressor genes with microsatellite repeats (e.g. TGF-beta R2)
- bears
- Disposition (disease) - non-polyposis colon cancer
- realized_in
- Symptoms (including pain)
67Definition - Etiology
- Etiological Process =def. A process in an organism that leads to a subsequent disorder.
- Examples: toxic chemical exposure resulting in a mutation in the genomic DNA of a cell; infection of a human with a pathogenic virus; inheritance of two defective copies of a metabolic gene
- The etiological process creates the physical basis of that disposition to pathological processes which is the disease.
68Definitions - Diagnosis
- Clinical Picture =def. A representation of a clinical phenotype that is inferred from the combination of laboratory, image and clinical findings about a given patient.
- Diagnosis =def. A representation of a conclusion of an interpretive process that has as input a clinical picture of a given patient and as output an assertion to the effect that the patient has a disease of such and such a type.
69Definitions - Qualities
- Manifestation of a Disease =def. A bodily feature of a patient that is (a) a deviation from clinical normality that exists in virtue of the realization of a disease and (b) is observable.
- Observability includes observable through elicitation of response or through the use of special instruments.
- Preclinical Manifestation of a Disease =def. A manifestation of a disease that exists prior to its becoming detectable in a clinical history taking or physical examination.
- Clinical Manifestation of a Disease =def. A manifestation of a disease that is detectable in a clinical history taking or physical examination.
- Phenotype =def. A (combination of) bodily feature(s) of an organism determined by the interaction of its genetic make-up and environment.
- Clinical Phenotype =def. A clinically abnormal phenotype.