Title: Understanding Real-world Ontologies
1 Understanding Real-world Ontologies
2 Outline
- Analysis of real-world ontologies
- The (simplified) GALEN ontology.
- The National Cancer Institute (NCI) Thesaurus.
- The TAMBIS ontology.
- Advanced issues and design patterns
- Qualified versus unqualified number restrictions.
- Transitive propagation of properties.
- Nominals and pseudo-nominals.
3 Analysis of Real-world Ontologies
4 GALEN
- Ontology about medical terms and surgical procedures.
- Constructed in the 1990s within the OpenGALEN project.
- Main applications
- Integration of clinical records, and
- decision support.
- GALEN
- is very large (35,000 concepts),
- is fairly expressive (SHIF description logic),
- has not yet been classified by any DL reasoner.
- In this tutorial we use a smaller version, which
- is still large (3000 concepts),
- is about as expressive as full GALEN,
- was first classified by the FaCT system.
5 GALEN: The Ontology at a Glance
- Size
- 3000 classes
- 500 object properties
- no individuals or datatypes
- Expressivity
- 350 General Concept Inclusion Axioms (GCIs).
- Concept constructors
- Conjunction (intersectionOf)
- Existential restrictions (someValuesFrom)
- 150 functional properties
- 26 transitive properties
6 GALEN: The (Unclassified) Hierarchies
- The class hierarchy
- Number of subsumption relations: 1978
- Maximum depth of the tree: 13
- No multiple inheritance
- Browse through it!
- The property hierarchy
- 4 properties with multiple inheritance
- Browse through it!
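The three statistics quoted for the class hierarchy can be computed from any told hierarchy stored as a mapping from a class to its asserted superclasses. The following is a minimal Python sketch, not the tool used in the tutorial: the subclass links in `told_parents` are invented for illustration, the root is assumed to have depth 0, and the hierarchy is assumed to be acyclic.

```python
from functools import lru_cache

# Toy told hierarchy: class -> directly asserted superclasses.
# The subclass links below are invented for illustration; they are not
# read from the GALEN file.
told_parents = {
    "TopCategory": (),
    "DomainCategory": ("TopCategory",),
    "Pathology": ("DomainCategory",),
    "LungPathology": ("Pathology",),
    "Process": ("DomainCategory",),
}


def hierarchy_stats(parents):
    """Return subsumption count, maximum depth and multiple-inheritance count."""

    @lru_cache(maxsize=None)
    def depth(cls):
        supers = parents[cls]
        # Roots have depth 0 (an assumption); otherwise take the longest
        # path to a root.
        return 0 if not supers else 1 + max(depth(s) for s in supers)

    return {
        "subsumptions": sum(len(s) for s in parents.values()),
        "max_depth": max(depth(c) for c in parents),
        "multiple_inheritance": sum(1 for s in parents.values() if len(s) > 1),
    }


stats = hierarchy_stats(told_parents)
print(stats)
```

Running the same function on the hierarchy produced by a reasoner would give the inferred counterparts of these numbers.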
7 GALEN: Concept definitions and GCIs
- Concept definition
- Axiom of the form A ≡ C, with
- A a concept name
- C a (possibly complex) concept
- A definition assigns a name A to a complex concept C
- Some examples
- LungPathology ≡ Pathology ⊓ ∃locativeAttribute.Lung
- RenalTransplant ≡ Transplanting ⊓ ∃actsOn.Kidney
8 GALEN: Concept definitions and GCIs
- Inclusion axioms
- Axioms of the form A ⊑ C
- A is a concept name
- C is a possibly complex concept
- Represent an incomplete (partial) definition
- Examples
- XRayMachine ⊑ ImagingDevice
- Candida ⊑ ∃hasFunction.AerobicMetabolicProcess
- In GALEN, some of these can be very complex
- Check out the definitions of Knee Joint and Kidney!
9 GALEN: Concept definitions and GCIs
- General Concept Inclusion Axioms (GCIs)
- Axioms of the form C ⊑ D
- C and D can both be complex
- May describe general (background) knowledge about the ontology
- Examples
- Secretion ⊓ ∃actsSpecificallyOn.Leucocidin ⊑ ∃isFunctionOf.StraphilococcusAureus
- ∃actsOn.Glucose ⊓ Transport ⊓ ∃carriesFrom.Blood ⊑ ∃carriesTo.Cell
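To make the difference between a definition (A ≡ C) and an inclusion (A ⊑ C) concrete, the sketch below evaluates concepts built from conjunction and existential restrictions over one small, hand-made interpretation. It is only an illustration: the tuple encoding, the function names and the individuals (`p1`, `p2`, `lung1`) are assumptions of this example, and checking a single interpretation is not the same as the entailment checking a DL reasoner performs.

```python
# A concept is a class name (str), ("and", C, D) or ("some", role, C).
def extension(concept, classes, roles):
    """Set of elements that belong to `concept` in the given interpretation."""
    if isinstance(concept, str):
        return classes.get(concept, set())
    if concept[0] == "and":
        return extension(concept[1], classes, roles) & extension(concept[2], classes, roles)
    if concept[0] == "some":
        _, role, filler = concept
        targets = extension(filler, classes, roles)
        return {x for (x, y) in roles.get(role, set()) if y in targets}
    raise ValueError(f"unknown constructor: {concept[0]!r}")


def is_subsumed(c, d, classes, roles):
    """C ⊑ D holds in this interpretation."""
    return extension(c, classes, roles) <= extension(d, classes, roles)


def is_equivalent(c, d, classes, roles):
    """C ≡ D holds in this interpretation."""
    return extension(c, classes, roles) == extension(d, classes, roles)


# Hypothetical toy interpretation (individual names are made up).
classes = {
    "Pathology": {"p1", "p2"},
    "Lung": {"lung1"},
    "LungPathology": {"p1"},
}
roles = {"locativeAttribute": {("p1", "lung1")}}

# Right-hand side of: LungPathology ≡ Pathology ⊓ ∃locativeAttribute.Lung
rhs = ("and", "Pathology", ("some", "locativeAttribute", "Lung"))
definition_holds = is_equivalent("LungPathology", rhs, classes, roles)
inclusion_holds = is_subsumed("LungPathology", "Pathology", classes, roles)
```

If `p2` were also given a `locativeAttribute` link to `lung1` without being added to `LungPathology`, the inclusion LungPathology ⊑ Pathology ⊓ ∃locativeAttribute.Lung would still hold in the interpretation, but the definition would not.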
10 Classifying GALEN
- Ontology statistics (revisited)
- Number of class subsumption relations: 6729
- 1978 of which are told; the remaining 4751 are inferred
- Maximum depth of the class tree: 15
- As opposed to 13 in the case of the unclassified tree
- Classes with multiple inheritance: 408
- All multiple inheritance relations have been inferred!
- This was intended in the design of GALEN
- Maximum depth of the property tree: 9
- No change with respect to the told tree
- Properties with multiple inheritance: 4
- Again, no change with respect to the told tree
- Reasoning is mostly performed on classes and not on properties
11 Modeling Choices
- The upper part
- Composed of the domain-independent concepts and roles.
- Examples
- TopCategory, DomainCategory, GeneralisedStructure
- Shallowly defined (mostly a taxonomy)
- The domain specific part
- Examples
- Plant, LungPathology,
- Richly defined
- Much more than just a taxonomy!
12 Inferred Knowledge
- A trivial subsumption
- Why is PathologicalCondition a subclass of DomainCategory?
- Simply look at the definition of PathologicalCondition!
- Another example
- Why is PathologicalBehavior a subclass of PathologicalCondition?
- Look at the definitions of both classes
- Notice that Behavior is a subclass of DomainCategory
- A non-trivial subsumption
- Why are Achalasia Processes Pathological Body Processes?
- Try!
- If you don't succeed, use the pinpointing explanation service
13 Classifying GALEN
- Simple and multiple inheritance
- Focus, for example, on PathologicalBodyProcess
- Navigate to its super-classes
- Fly the mother ship and see what is going on!
14 The NCI Ontology
- Huge biomedical ontology describing the cancer domain
- Maintained by about a dozen domain experts
- Contains information about
- genes,
- diseases,
- drugs,
- research institutions,
- All with a cancer-centric focus
- Download it!
- http://www.mindswap.org/2003/CancerOntology
15 NCI: The Ontology at a Glance
- Size
- 30,000 classes
- 70 object properties
- no individuals or datatypes
- Expressivity
- Concept constructors
- Conjunction (intersectionOf)
- Existential restrictions (someValuesFrom)
- Axioms
- Definitions (no GCIs)
- Domain and range of properties
16 NCI: The (Unclassified) Hierarchies
- The class hierarchy
- Number of subsumption relations: 103,232
- Maximum depth of the tree: 19
- Classes with multiple inheritance: 4636
- Browse through it!
- The property hierarchy
- No properties with multiple inheritance
- Browse through it!
17 Axioms in NCI
- Examples
- Cancer_Gene ⊑ Gene ⊓ ∃hasFunction.Tumoregenesis
- Alzheimer_Disease ⊑ Dementia
- Domain(anatomic_Structure_has_Location) = Anatomy_Kind
- Range(technique_hasPurpose) = Clinical_Or_Research_Activity_Kind
18 The NCI Kinds
- Upper concepts representing the sub-domains of NCI
- Examples
- Anatomy.
- Biological processes.
- Chemicals and drugs.
- Organisms
- Properties relating the Kinds
19 NCI
- Partitioning and crop-circles view of the partitioning
- Here, we give an intuition about the different sub-domains in NCI: which ones are central and which ones are side domains
20 NCI and GALEN
- The domains of NCI and GALEN overlap. Both ontologies define concepts such as
- Anatomical parts: bone, tissue, etc.
- Diseases
- Organisms
- Example
- Check out how Femur is defined in NCI and GALEN
- Discuss the different modeling decisions and focus of interest
21 Tambis
- TAMBIS is a medical ontology constructed during the early days of the Web.
- The intended application was integrated access to information in a set of databases.
- The OWL version was generated from the old format using a script.
22 Tambis: The Ontology at a Glance
- Size
- 400 classes
- 100 object properties
- no individuals or datatypes
- Expressivity
- No General Concept Inclusion Axioms.
- Concept constructors
- Conjunction (intersectionOf)
- Disjunction (unionOf)
- Existential restrictions (someValuesFrom)
- Universal restriction (allValuesFrom)
- Cardinality restrictions
- Axioms
- Definitions (complete and partial)
- Transitive, functional, symmetric and inverse properties
23 Tambis: The (unclassified) hierarchies
- Subclass relationships: 226
- No multiple inheritance
- Maximum depth of class tree: 6
- Maximum depth of property tree: 2
24 Tambis: Example Axioms
- Tambis uses cardinality restrictions profusely
- See definition of anion
- Use of disjunction
- See definition of atom
- Use of universal restrictions
- See definition of book-title
- Use of complex nested restrictions
- See definition of complement-dna
- See definition of gene
- Disjointness axioms
- See definitions of metal, non-metal and metalloid
25 Tambis: Classification
- Subclass relationships: 600
- compared to 226
- Classes with multiple inheritance: 19
- compared to none
- Maximum depth of class tree: 7
- compared to 6
- Maximum depth of property tree: 2
- 144 unsatisfiable concepts!
26 Tambis: Unsatisfiable concepts
- More than a third of the concepts in Tambis (144 of about 400) are unsatisfiable
- The explanations are non-trivial
- Check out protein-structure and macromolecular-part!
- Distinguishing root and derived unsatisfiable classes
- Derived unsatisfiable classes are unsatisfiable because they depend on another unsatisfiable concept.
- definition of Enzyme,
- definition of Binding-site
- Root unsatisfiable classes contain an inherent contradiction
- definition of Metal,
- definition of Non-metal,
- definition of Metalloid
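The root/derived distinction can be approximated structurally: an unsatisfiable class is treated as derived if its definition mentions another unsatisfiable class, and as root otherwise. The Python sketch below is a simplified heuristic under that assumption, not the procedure used by any particular debugging tool. The `depends_on` links, including the satisfiable class `Protein`, are invented for illustration and are not read from the Tambis file.

```python
def split_root_and_derived(unsatisfiable, depends_on):
    """Split unsatisfiable classes into (root, derived).

    depends_on maps a class to the classes mentioned in its definition.
    A class is 'derived' if it mentions some other unsatisfiable class.
    """
    unsat = set(unsatisfiable)
    derived = {
        cls
        for cls in unsat
        if any(dep in unsat for dep in depends_on.get(cls, ()) if dep != cls)
    }
    return unsat - derived, derived


unsat = {"Metal", "Non-metal", "Metalloid", "Enzyme", "Binding-site"}

# Hypothetical dependency links, made up for this example.
depends_on = {
    "Enzyme": {"Protein", "Metal"},
    "Binding-site": {"Enzyme"},
}

roots, derived = split_root_and_derived(unsat, depends_on)
print(sorted(roots), sorted(derived))
```

Repairing the root classes first is the natural order, since a derived class may become satisfiable once the class it depends on is fixed. The heuristic breaks down when unsatisfiable classes refer to each other, in which case it would label all of them as derived.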
27 Tambis: Repair
28 Advanced Issues and Design Patterns
29 Qualified Number Restrictions (QCRs)
- Existential restrictions in OWL DL are qualified
- Person ⊓ ∃hasChild.Male
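- Cardinality restrictions can only be qualified with ⊤ (owl:Thing), that is, they are unqualified
- Person ⊓ (≥ n hasChild) is expressible, whereas Person ⊓ (≥ n hasChild.Male) is not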
- The lack of QCRs has been identified as a major limitation of OWL, especially in biomedical applications
- A quadruped is an animal with exactly four parts that are legs
- A medical oversight committee is a committee which consists of at least five members, of which two are medical doctors, one is a manager and two are members of the public.
30 Qualified Cardinality Restrictions
- Can be approximated using property inclusion and property range.
- Quadruped ≡ Animal ⊓ (= 4 hasLeg)
- hasLeg ⊑ hasPart
- Range(hasLeg) = Leg
31 Qualified Cardinality Restrictions
- This approximation is unsound in general
- MedicalCommittee ≡ Committee ⊓ (≥ 3 hasMember) ⊓ (≤ 1 hasMember.MD) ⊓ (≤ 1 hasMember.¬MD)
- Approximated by
- MedicalCommittee ≡ (≥ 3 hasMember) ⊓ (≤ 1 hasMDMember) ⊓ (≤ 1 hasNotMDMember)
- hasMDMember ⊑ hasMember
- hasNotMDMember ⊑ hasMember
- Range(hasMDMember) = MD
- Range(hasNotMDMember) = ¬MD
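One way to see the unsoundness is to compare models directly. The Python sketch below assumes the number restrictions on this slide read ≥ 3 hasMember, ≤ 1 hasMember.MD and ≤ 1 hasMember.¬MD; these comparison symbols are an assumption of the sketch. Under that reading the original concept has no model, because every member is either an MD or not, so there can be at most two members. The approximation does have a model, because a member may be related by hasMember without going through either sub-property. The individuals `ann`, `bob` and `eve` are made up.

```python
from itertools import product


def satisfies_original(is_md):
    """is_md holds one boolean per committee member (True = medical doctor)."""
    n_md = sum(is_md)
    n_not_md = len(is_md) - n_md
    return len(is_md) >= 3 and n_md <= 1 and n_not_md <= 1


def satisfies_approximation(members, md, md_members, not_md_members):
    """Check the approximated definition together with its property axioms."""
    axioms_hold = (
        md_members <= members          # hasMDMember ⊑ hasMember
        and not_md_members <= members  # hasNotMDMember ⊑ hasMember
        and md_members <= md           # Range(hasMDMember) = MD
        and not (not_md_members & md)  # Range(hasNotMDMember) = ¬MD
    )
    return (
        axioms_hold
        and len(members) >= 3
        and len(md_members) <= 1
        and len(not_md_members) <= 1
    )


# Bounded search (committees of up to 6 members): no assignment of
# MD / non-MD status satisfies the original definition.
original_satisfiable = any(
    satisfies_original(assignment)
    for size in range(7)
    for assignment in product([True, False], repeat=size)
)

# The approximation has a model: eve is a member through hasMember only.
approximation_satisfiable = satisfies_approximation(
    members={"ann", "bob", "eve"},
    md={"ann", "eve"},
    md_members={"ann"},
    not_md_members={"bob"},
)

print(original_satisfiable, approximation_satisfiable)
```

The bounded search is only a sanity check; the counting argument in the lead-in is what rules out larger committees.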
32 Transitive Propagation of Properties
- In OWL, we can express transitive propagation of a property
- If Paris is located in France and France is located in Europe, then Paris is located in Europe.
- If the hand is a part of the arm and the arm is part of the human body, then the hand is a part of the human body.
- In OWL, however, we cannot express transitive propagation of a property along a different property
- If an ulcer is located in the gastric mucosa and the gastric mucosa is a part of the stomach, then the ulcer is located in the stomach.
- If a burn is located in the foot and the foot is part of the leg, then the burn is located in the leg.
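The intended inference can be written as a rule, located_in(x, y) and part_of(y, z) imply located_in(x, z), next to ordinary transitivity of part_of. The following Python sketch applies both rules to a set of facts until nothing new is derived. It works outside OWL, on plain tuples, and only illustrates the inference the slide says cannot be stated as an OWL axiom; the function name and the fact encoding are assumptions of this example.

```python
def propagate(located_in, part_of):
    """Saturate the facts under two rules:

    part_of(a, b) and part_of(b, c)    -> part_of(a, c)     (transitivity)
    located_in(x, y) and part_of(y, z) -> located_in(x, z)  (propagation)
    """
    located_in, part_of = set(located_in), set(part_of)
    while True:
        new_part = {(a, c) for (a, b) in part_of for (b2, c) in part_of if b == b2}
        new_loc = {(x, z) for (x, y) in located_in for (y2, z) in part_of if y == y2}
        if new_part <= part_of and new_loc <= located_in:
            return located_in, part_of
        part_of |= new_part
        located_in |= new_loc


located_in, part_of = propagate(
    located_in={("ulcer", "gastric_mucosa"), ("burn", "foot")},
    part_of={
        ("gastric_mucosa", "stomach"),
        ("foot", "leg"),
        ("hand", "arm"),
        ("arm", "human_body"),
    },
)
print(sorted(located_in))
```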
33 Transitive Propagation of Properties
- Various patterns that approximate transitive propagation have been proposed and used in ontologies.
- Use of the property hierarchy and transitivity
- Part_Of ⊑ Located_In
- Transitive(Part_Of)
- This pattern may lead to undesired results, since part-whole relations do not always imply location
- The orange peel is part of the orange, but is it located in the orange?
34 Nominals in OWL-DL
- Define concepts in terms of individuals.
- Two constructs in OWL
- owl:oneOf, owl:hasValue
- owl:oneOf: Enumeration of individuals.
- WineColor ≡ {red, white, rose}
- {red, white, rose} ≡ {red} ⊔ {white} ⊔ {rose}
- owl:hasValue: Value restrictions.
- RedWine ≡ ∃hasColor.{red}
- RockFan ⊑ ∃hasIdol.{elvis}
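Read set-theoretically, an enumeration is just a finite set of individuals and a value restriction selects everything related to one fixed individual. A small Python sketch of both readings follows; the wines `rioja` and `chablis` and the `has_color` table are invented for illustration.

```python
# owl:oneOf: the class is exactly the listed individuals.
wine_color = {"red", "white", "rose"}

# {red, white, rose} ≡ {red} ⊔ {white} ⊔ {rose}: union of singleton classes.
union_of_singletons = {"red"} | {"white"} | {"rose"}

# Hypothetical hasColor assertions (wine -> colour individual).
has_color = {"rioja": "red", "chablis": "white"}

# owl:hasValue: RedWine ≡ ∃hasColor.{red}
red_wine = {wine for wine, color in has_color.items() if color == "red"}

print(wine_color == union_of_singletons, red_wine)
```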
35 Nominals and Pseudo-nominals
- Reasoners traditionally do not support nominals (only ABoxes)
- Not enough implementation experience.
- Believed to be hard.
- Decision procedure for SHON in 2001!
- Example: Wine ontology
- Used in the OWL guide to demonstrate OWL.
- Large number of nominals used.
- No reasoner (even an incomplete one) could reason with it! Only Pellet (very recently)
36 Faking Nominals
- Pseudo-nominals: approximation to nominals
- With nominals
- SpanishWine ≡ Wine ⊓ ∃producedIn.{spain}
- FrenchWine ≡ Wine ⊓ ∃producedIn.{france}
- With pseudo-nominals (unsound!!)
- SpanishWine ≡ Wine ⊓ ∃producedIn.Spain
- FrenchWine ≡ Wine ⊓ ∃producedIn.France
- France ⊓ Spain ⊑ ⊥
37 Pseudo-nominals: unsoundness
- Suppose we define the concept of a wine that is produced in at least three different countries
- Wine ⊓ (≥ 3 producedIn.Country)
- Suppose I have only two countries in my ontology
- Country ≡ {Spain, France}
- My concept is then unsatisfiable.
- Suppose we now use pseudo-nominals and treat Spain and France as disjoint atomic concepts. Then, our concept is satisfiable.
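The argument is a counting one: an at-least-3 restriction needs three distinct fillers, a nominal class with two individuals can never supply more than two, and an atomic class can have any number of instances. A minimal Python sketch of that count follows; the helper `at_least` and the element names `spain_a`, `spain_b` and `france_a` are hypothetical.

```python
def at_least(n, fillers):
    """(≥ n R.C) can hold for a subject only if C offers n distinct fillers."""
    return len(set(fillers)) >= n


# Nominals: Country ≡ {Spain, France} has at most these two elements
# in every model.
with_nominals = at_least(3, {"Spain", "France"})

# Pseudo-nominals: Spain and France are disjoint atomic classes, so a model
# may place several elements in each of them.
spain = {"spain_a", "spain_b"}
france = {"france_a"}
assert not (spain & france)  # disjointness: France ⊓ Spain ⊑ ⊥
with_pseudo_nominals = at_least(3, spain | france)

print(with_nominals, with_pseudo_nominals)
```

The second result is the unsound one: the pseudo-nominal encoding admits a wine produced in three "countries" even though the intended domain contains only two.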