Title: Formal Ontology and Information Systems
1Formal Ontology and Information Systems
- Barry Smith
- http://ifomis.de
2 - Institute for Formal Ontology and Medical
Information Science (IFOMIS)
- Faculty of Medicine
- University of Leipzig
- http://ifomis.de
3The Idea
- Computational medical research
- will transform the discipline of medicine
- but only if communication problems can be solved
4Database standardization
- is desperately needed in medicine
- to enable the huge amounts of data
- resulting from trials by different groups
- to be fused together
5How make one system out of all of this?
- How resolve incompatibilities?
- ONTOLOGY: the solution of first resort
- (compare kicking a television set)
- But what does ontology mean?
- Current answer: a collection of terms and definitions satisfying constraints of description logic (application ontology)
6Description logic
- a decidable logic (thus much weaker than first-order predicate logic) for manipulating hierarchies of concepts (general terms)
7Enterprise Ontology
- A Sale is an agreement between two Legal-Entities for the exchange of a Product for a Sale-Price.
- A Strategy is a Plan to Achieve a high-level Purpose.
- A Market is all Sales and Potential Sales within a scope of interest.
8Gene Ontology
- Molecular Function Ontology: tasks performed by individual gene products
- examples: transcription factor, DNA helicase
- Biological Process Ontology: broad biological goals accomplished by ordered assemblies of molecular functions
- examples: mitosis, purine metabolism
- Cellular Component Ontology: subcellular structures, locations, and macromolecular complexes
- examples: nucleus, telomere
9Example from Molecular Function Ontology
- hormone GO:0005179
  - digestive hormone GO:0046659
  - peptide hormone GO:0005180
    - adrenocorticotropin GO:0017043
    - glycopeptide hormone GO:0005181
      - follicle-stimulating hormone GO:0016913
- IS A
10as tree (joined by is a links)
- hormone
  - digestive hormone
  - peptide hormone
    - adrenocorticotropin
    - glycopeptide hormone
      - follicle-stimulating hormone
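The is-a links in the tree above can be sketched as a child-to-parent map with a transitive subsumption check. This is a minimal illustration only, not how the Gene Ontology is actually stored or queried. The names `IS_A`, `ancestors` and `is_a` are hypothetical, and the nesting is an assumption read off the slide's tree.

```python
# Hypothetical sketch: child -> parent map, with nesting assumed from the slide's tree.
IS_A = {
    "digestive hormone": "hormone",
    "peptide hormone": "hormone",
    "adrenocorticotropin": "peptide hormone",
    "glycopeptide hormone": "peptide hormone",
    "follicle-stimulating hormone": "glycopeptide hormone",
}


def ancestors(term):
    """Return every term reached from `term` by following is-a links upward."""
    chain = []
    while term in IS_A:
        term = IS_A[term]
        chain.append(term)
    return chain


def is_a(child, parent):
    """True if `parent` subsumes `child`; is-a is treated as transitive."""
    return parent in ancestors(child)


print(ancestors("follicle-stimulating hormone"))
print(is_a("adrenocorticotropin", "hormone"))
```

A hierarchy of this kind answers subsumption questions about general terms, but it says nothing about individual instances or about how terms from a second database relate to these ones.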
11Problem: There exist multiple databases
- genomic
- cellular
- structural
- phenotypic
- and even for each specific type of information,
e.g. DNA sequence data, there exist several
databases of different scope and organisation
12What is a gene?
- GDB: a gene is a DNA fragment that can be transcribed and translated into a protein
- Genbank: a gene is a DNA region of biological interest with a name and that carries a genetic trait or phenotype
- (from Schulze-Kremer)
13What is blood?
- Unified Medical Language System (UMLS)
- blood is a tissue
- Systematized Nomenclature of Medicine (SNOMED)
- blood is a fluid
14Another Example: Statements of Accounts
- Company financial statements may be prepared under either the (US) GAAP or the (European) IASC standards
- These allocate cost items to different categories depending on the laws of the countries involved.
15Job
- to develop an algorithm for the automatic conversion of income statements and balance sheets between the two systems.
- Not even this relatively simple problem has been satisfactorily resolved
- why not?
16The World Wide Web
- Vast amount of heterogeneous data sources
- Needs dramatically better support at the level of metadata
- The ability to query and integrate across different conceptual systems
- The currently preferred answer is The Semantic Web, based on description logic
- will not work
- How tag blood?
17Application ontology
- cannot solve the problems of database integration
- There can be no mechanical solution to the
problems of data fusion in a domain like medicine
18Applications ontology
- grew out of work in AI and in knowledge representation
- Ontologies are applications running in real time
19Applications ontology
- ontologies are inside the computer
- thus subject to severe constraints on expressive power
- (effectively the expressive power of description logic)
20Applications ontology cannot solve the
data-fusion problem
- because of its roots in knowledge mining
21different conceptual systems
22need not interconnect at all
23because of the limits of knowledge mining
24we cannot make incompatible concept-systems
interconnect
just by looking at concepts, or knowledge:
we need some tertium quid
25Applications ontology
- has its philosophical roots in Quine's doctrine of ontological commitment and in the internal metaphysics of Carnap/Putnam
- Roughly, for an applications ontology the world and the semantic model are one and the same
- What exists = what the system says exists
26The Problem for the Quinean
- If an ontology is the set of ontological
commitments of a theory, how can we cope with
questions pertaining to the relations between the
objects to which different theories are committed?
27theories, semantic models, need not interconnect
at all
28What is needed
- is some sort of wider common framework which is sufficiently rich and nuanced to allow concept systems deriving from different sources to be hand-calibrated
29What is needed
- is not an applications ontology
- but
- a reference ontology
- (something like old-fashioned metaphysics)
30Reference Ontology
- grew out of logic and analytic metaphysics
- An ontology is a theory of the relevant domain of entities
- Ontology is outside the computer
- seeks maximal expressiveness and adequacy to reality
- willing to sacrifice computational tractability for the sake of representational adequacy
31Belnap
- it is a good thing logicians were around before computer scientists
- if computer scientists had got there first, then we wouldn't have numbers
- because arithmetic is undecidable
32It is a good thing
- Aristotelian metaphysics was around before description logic,
- because otherwise we would have only hierarchies of concepts/universals/classes and no individual instances
33Reference Ontology
- a theory of the tertium quid
- called
- reality
- needed to hand-calibrate database/terminology systems
34Methodology
- Get ontology right first
- (realism; descriptive adequacy; rather powerful logic)
- solve tractability problems later
35The Reference Ontology Community
- IFOMIS (Leipzig)
- Laboratory for Applied Ontology (Trento/Rome, Turin)
- Foundational Ontology Project (Leeds)
- Ontology Works (Baltimore)
- Ontek Corporation (Buffalo/Leeds)
- LandC (Belgium/Philadelphia)
- (CYC?)
36Domains of Current Work in Reference Ontology
- IFOMIS (Leipzig): Medicine
- Laboratory for Applied Ontology
- Trento/Rome: Ontology of Cognition/Language
- Turin: Law
- Foundational Ontology Project (Leeds): Space, Physics
- Ontology Works (Baltimore): Genetics, Molecular Biology
- Ontek Corporation (Buffalo/Leeds): Biological Systematics
- LandC (Belgium/Philadelphia): Medical NLP
- (? CYC: Everything ?)
37Some Historical Background on Reference Ontology
38Recall
- GDB: a gene is a DNA fragment that can be transcribed and translated into a protein
- Genbank: a gene is a DNA region of biological interest with a name and that carries a genetic trait or phenotype
- (from Schulze-Kremer)
39Ontology
- Note that terms like "fragment", "region", "name", "carry", "trait", "type"
- along with terms like "part", "whole", "function", "substance", "inhere"
- are ontological terms in the sense of traditional (philosophical) ontology
40Aristotle
First ontologist
41First ontology (from Porphyry's Commentary on
Aristotle's Categories)
42Linnaean Ontology
43Formal Ontology
- term coined by Edmund Husserl
- the theory of those ontological structures
- such as part-whole, universal-particular
- which apply to all domains whatsoever
44Edmund Husserl
45Husserl outlines a new method of constituent
ontology
- to study a domain ontologically
- is to establish the parts of the domain
- and the interrelations between them
- especially the dependence relations
46Logical Investigations 1900/01
- Aristotelian theory of universals and particulars
- theory of part and whole
- theory of ontological dependence
- the theory of boundaries and fusion
47Formal Ontology
- contrasted with material or regional ontologies
- (compare relation between pure and applied mathematics)
- Husserl's idea:
- If we can build a good formal ontology, this
should save time and effort in building reference
ontologies for each successive domain
48Basic Formal Ontology
49Basic Formal Ontology
- Aristotelian theory of universals and instances
- theory of part and whole
- theory of ontological dependence
- theory of boundary, continuity and contact
- theory of states, powers, qualities, roles (SPQR-entities)
- theory of processes
- theory of environments/niches/contexts and
spatial and spatio-temporal regions
50BFO
- not just a system of categories
- but a formal theory
- with definitions, axioms, theorems
- designed to provide the resources for reference ontologies for specific domains
- the latter should be of sufficient richness that terminological incompatibilities can be resolved intelligently rather than by brute force
51Three types of reference ontology
- 1) formal ontology: framework for rigorous definition of the highly general concepts, such as object, event, whole, part, employed in every domain
- 2) domain ontology: a top-level system with a few highly general concepts; applies formal ontology to a particular domain, such as genetics or medicine
- 3) terminology-based ontology: a very large system embracing many concepts and inter-concept relations
52MedO: medical domain ontology
- including sub-ontologies
- cell ontology
- drug ontology
- protein ontology
- gene ontology
53other sub-ontologies
- anatomical ontology
- epidemiological ontology
- disease ontology
- therapy ontology
- pathology ontology
- the whole designed to give structure to the medical domain
- (currently medical education comparable to stamp-collecting)
54MedO
- and its various sub-ontologies will inherit the
definitions and axioms of BFO but will add new
definitions and axioms of their own
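The inheritance relation described here can be sketched as a chain of ontologies, each of which sees its parent's definitions and axioms and adds its own. This is a minimal illustration under stated assumptions: the `Ontology` class, its field names and all entries are hypothetical placeholders, not actual BFO or MedO content.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Ontology:
    """Hypothetical container: a named ontology that extends an optional parent."""
    name: str
    parent: Optional["Ontology"] = None
    definitions: dict = field(default_factory=dict)
    axioms: list = field(default_factory=list)

    def all_definitions(self):
        """Inherited definitions plus local ones; local entries may add, not replace."""
        merged = dict(self.parent.all_definitions()) if self.parent else {}
        for term, text in self.definitions.items():
            if term in merged:
                raise ValueError(f"{self.name} redefines inherited term {term!r}")
            merged[term] = text
        return merged

    def all_axioms(self):
        """Inherited axioms first, then the axioms this ontology adds."""
        inherited = self.parent.all_axioms() if self.parent else []
        return inherited + self.axioms


# Placeholder content only, for illustration.
bfo = Ontology("BFO", definitions={"part": "<placeholder>"},
               axioms=["<placeholder axiom about part and whole>"])
medo = Ontology("MedO", parent=bfo, definitions={"disease": "<placeholder>"},
                axioms=["<placeholder axiom about disease>"])
cell = Ontology("cell ontology", parent=medo, definitions={"cell": "<placeholder>"})

print(sorted(cell.all_definitions()))
```

Rejecting redefinitions is one possible design choice; it keeps the general categories fixed while the sub-ontologies grow.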
55Granularity
- cell ontology
- drug ontology
- protein ontology
- gene ontology
- imply that we need also a theory of granularity
56Ontology
- like cartography
- must work with maps at different scales
- How fit these maps (conceptual grids) together into a single system?
- IFOMIS is developing a theory of granular partitions designed to provide a framework within which different maps/views of the same reality can be combined together
58Part Two
- Reference Ontology
- and Situated Computing
59Shimon Edelman's Riddle of Representation
- two humans, a monkey, and a robot are looking at a piece of cheese
- what is common to the representational processes in their visual systems?
60Answer
The cheese, of course
61Rodney Brooks
- Intelligence without Representation
- The world itself is our model
- opposition between the Engineering view and the
SMPA View
62SMPA model
- Sense, Model, Plan, Act
- the agent first senses its environment through sensors
- then uses this data to build a model of the world
- then produces a plan to achieve goals
- then acts on this plan
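The four steps above can be sketched as one pass through a sense, model, plan, act cycle. This is a toy illustration only: the one-dimensional environment, the dictionary keys and all function names are hypothetical and are not taken from Brooks.

```python
def sense(environment):
    """Sense: read the sensors (here simply a copy of the environment dict)."""
    return dict(environment)


def build_model(readings):
    """Model: build an internal world model from the sensor readings."""
    return {"distance_to_goal": readings["goal_position"] - readings["agent_position"]}


def plan(model):
    """Plan: list the unit steps that reach the goal according to the model."""
    distance = model["distance_to_goal"]
    step = 1 if distance > 0 else -1
    return [step] * abs(distance)


def act(environment, actions):
    """Act: carry out the planned steps in the environment."""
    for step in actions:
        environment["agent_position"] += step


def smpa_cycle(environment):
    """Run sense, model, plan and act once, strictly in that order."""
    readings = sense(environment)
    model = build_model(readings)
    actions = plan(model)
    act(environment, actions)
    return actions


env = {"agent_position": 0, "goal_position": 3}
print(smpa_cycle(env), env)
```

Note that the plan is computed entirely from the internal model; the world is consulted only once, at the sensing step.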
63Proposal
- SMPA belongs to the same methodological universe as Applications Ontology
- If we want to build an intelligent agent within this framework, there need to be representations of the domain within which the agent acts which are inside the computer
64Engineering Approach
- The system embodies a number of distinct layers of activity (compare faculties of the mind)
- These layers operate independently and connect directly to the environment outside the system
- Each layer operates as a complete system that copes in real time with a changing environment
- Layers evolve through interaction with the environment (artificial insects/vehicles)
65Brooks' Engineering Approach
- lends very little weight to the role of representations or models
- At the same time it insists that AI should use the world in all its complexity in producing systems that react directly to the world
- An ontology appropriate for this approach would have to include within its purview both the world and the system,
- thus be essentially richer than the system alone
66An intelligent system
- must be situated
- it is situatedness which gives the processes within each layer meaning
- meaning exists precisely in the relation to the world,
- the world serves also to unify the different layers together and to make them compatible
67Organisms, especially humans,
- fix their beliefs not only in their heads but in
their worlds, as they attune themselves
differently to different parts of the world as a
result of their experience. And they pull the
same trick with their memories, not only by
rearranging their parsing of the world (their
understanding of what they see), but by marking
it.
- They place traces out there which changes what
they will be confronted with the next time it
comes around. Thus they don't have to carry their
memories with them.
- Intelligence without Representation
68Andy Clark, Being There
- humans can accomplish much without building detailed, internal models
- they rely on external scaffolding: maps, models, tools, landmarks, buildings, language, culture
- we act so as to simplify cognitive tasks by "leaning on" the structures in our environment.
69Compare the Ecological Psychology of J. J. Gibson
- To understand human cognition we should study the moving, acting human person as it exists in its real-world environment
- and taking account of how it has evolved into this real-world environment
70For Gibson
- we are like (multi-layered) tuning forks tuned to the environment which surrounds us,
- and for us human beings this is a social environment which includes
- traces of prior actions in the form of records and representations
71Gibsonian Ecological View of Information Systems
- To understand information systems we should study the hardware as it exists embedded in its real-world environment
- and taking account of the environment for which it was designed and built
- Information systems are like tuning forks: they resonate in tune to their surrounding environments, e.g. through their biological and chemical sensors
72So what is the ontology of blood?
73We cannot solve this problem just by looking at
concepts (by engaging in further acts of
knowledge mining)
74concept systems may be simply incommensurable
75the problem can only be solved in
Brooksian/Gibsonian fashion
by taking the world itself into account
76By looking not at concepts, representations,
- and their semantic models
- but rather at organisms acting in the world
- and standing at different levels in a range of
different sorts of relations to the world
77We then recognize
- that the same object can be apprehended at different levels of granularity
- at the perceptual level blood is a liquid (?)
- at the cellular level blood is a tissue
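One way to picture this is to index each classification by the level of granularity at which it is made, so that the two statements about blood stop competing. This is a minimal sketch: the table layout and the names `CLASSIFICATION`, `classify` and `views` are hypothetical, and the entries repeat only what the slide says (including its tentative "liquid").

```python
# (object, granularity level) -> category, as stated on the slide.
# The slide itself marks "liquid" with a question mark.
CLASSIFICATION = {
    ("blood", "perceptual"): "liquid",
    ("blood", "cellular"): "tissue",
}


def classify(obj, level):
    """Return the category assigned to `obj` at the given level of granularity."""
    return CLASSIFICATION[(obj, level)]


def views(obj):
    """Collect every level-specific view of the same object."""
    return {level: category
            for (name, level), category in CLASSIFICATION.items()
            if name == obj}


print(views("blood"))
```

The key carries the level explicitly, so "liquid" and "tissue" are two views of one object rather than two rival answers to a single question.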
78This implies a view of ontology
- not as a theory of concepts
- but as a theory of reality
- But how is this possible?
- How can we get beyond our concepts?
- answer: ontology must be maximally opportunistic
- it must relate not to beliefs, concepts,
syntactic strings but to the world itself
79Maximally opportunistic
- means
- look at concepts and beliefs critically
- and always in the context of a wider view which includes independent ways to access the objects themselves
- at different levels of granularity
- and taking account of tacit knowledge of those
features of reality of which the domain experts
are not consciously aware
80Maximally opportunistic
- means
- look not at what the expert says
- but at what the expert does
- Experts have expertise: knowing how
- Ontologists can have windows on reality, by focusing on categories, and can extract some form of knowing that
- Gibsonianism: experts don't know what the ontologist knows
81Ontology must be maximally opportunistic
- This means
- don't just look at beliefs
- look at the objects themselves
- from every possible direction,
- formal and informal
- scientific and non-scientific
82Maximally opportunistic
- means
- look at the same objects at different levels of
granularity
83Second step: select out the good
conceptualizations
- these have a reasonable chance of being integrated together into a single ontological system
- based on tested principles
- robust
- conform to natural science
84Ontology
- like cartography
- must work with maps at different scales
85 Medical ontologies
- at different levels of granularity
- cell ontology
- drug ontology
- protein ontology
- gene ontology
- anatomical ontology
- epidemiological ontology
- Rigidly hierarchical, modular organization with many things which can go wrong
86There are many compatible map-like partitions
- many maps at different scales,
- all transparent to the reality beyond
87Partitions should be cuts through reality
- a good medical ontology should NOT be compatible with the conceptualization of disease as caused by evil spirits and demons and cured by golems
88Three main sorts of partitions
- 1. substances and their parts
- 2. qualities/functions/roles
- 3. processes
- in addition
- spatial regions/niches
- spatio-temporal regions
- AS UNIVERSALS, AS PARTICULARS
89 1. Substances and their parts
- Patterned parts (carved out by fiat)
- chess board
- football pitch
- Broca's Region
- nervous system
90 2. Functions
- function of a screwdriver
- tied to processes
- generalized four-dimensional shapes (carved out by fiat)
- contextual dependence
- function of the heart
- function of the circulatory system
91Generalized 4-dimensional shapes
- as UNIVERSALS
- as PARTICULARS
92Once we understand functions
- we can also understand malfunctions
- broken screwdriver
- defective heart
93Application to Bodily Systems
- Immune system, digestive system
- are complex substances
- paradigm: skeleton
- carved out by fiat from the whole organism in terms of their functions
- engaging in specific types of processes
94Multi-layered systems
- How one system can use another system to exercise its function
- Drug transport system uses circulatory system
- (Layered Mereotopology
- of substances
- of processes)
95Part 3
96Testing the BFO/MedO approach
- within a software environment for NLP of unstructured patient records
- collaborating with
- Language and Computing nv (www.landc.be)
97LC
- LinKBase: world's largest terminology-based ontology
- incorporating UMLS, SNOMED, etc.
- LinKFactory: suite for developing and managing large terminology-based ontologies
98LinKBase
- BFO and MedO designed to add depth, and so also reasoning capacity
- by tagging LinKBase terms with corresponding BFO/MedO categories
- ???