Title: Introduction to Ontologies
1 Introduction to Ontologies
- Introduction
- Motivation
- Ontologies compared to other stuff
- Example first part
- Upper ontology specifics
- Example second part
- Discussion
2 Ground Rules
- Practical goal of providing basic grounding in knowledge engineering
- There are many alternatives; making progress requires making choices and getting to work
3 Definitions
- An ontology is a shared conceptualization of a domain
- An ontology is a set of definitions in a formal language for terms describing the world
4 Motivation
- select EMPDAT from PERSTAB where POS = mgmnt
- What does it mean?
- PERSTAB is a table which lists employee data
- What's an employee? How is an employee different from a contractor? What if I want data on both?
- Even if this information is available in English, a machine has to read it
5 Motivation (2)
- "Parenthood is a more general relationship than motherhood."
- "Mary is the mother of Bill."
- "Who are Bill's parents?"
- "Mary is the parent of Bill."
- That fact is not stated anywhere, but can be derived by a DAML application. (DAML: DARPA Agent Markup Language, for the Semantic Web.)
Example from "Why Use DAML?" <http://www.daml.org/2002/04/why.html>
6 Motivation (2) continued
- More formally stated, given the statements
- (motherOf subProperty parentOf)
- (Mary motherOf Bill)
- when stated in DAML, allow you to conclude
- (Mary parentOf Bill)
- Java code or a stored procedure could do this sort of inference for facts in XML or SQL
- But the DAML spec itself says the conclusion is true
- In contrast, different Java code could reach a different conclusion
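The subProperty inference on this slide can be sketched in a few lines of Python. This is an illustration only, not DAML: the triple representation and the function name infer_superproperties are assumptions made for the example. It is also exactly the kind of ad hoc application code the slide contrasts with DAML, where the rule belongs to the language semantics rather than to any one program.

```python
# Facts are (subject, property, object) triples; names mirror the slide.
SUB_PROPERTY = {"motherOf": "parentOf"}  # (motherOf subProperty parentOf)


def infer_superproperties(facts, sub_property):
    """Return the facts plus every triple entailed by the subproperty axioms."""
    inferred = set(facts)
    frontier = list(facts)
    while frontier:
        subj, prop, obj = frontier.pop()
        parent = sub_property.get(prop)
        if parent is not None:
            new_fact = (subj, parent, obj)
            if new_fact not in inferred:
                inferred.add(new_fact)
                frontier.append(new_fact)  # follow chains of subproperties
    return inferred


facts = {("Mary", "motherOf", "Bill")}
closure = infer_superproperties(facts, SUB_PROPERTY)
```

After running, closure holds both the stated fact and the derived (Mary parentOf Bill).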
7 Motivation (2) continued
- (Mary motherOf Bill)
- (parentOf inverse childOf)
- (Bill childOf ?X)
- ?X = Mary
- The semantics of inverse is part of the DAML spec
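A minimal Python sketch of how the query (Bill childOf ?X) could be answered. It assumes the subProperty statement from the previous slide is also available, since the answer depends on first deriving (Mary parentOf Bill). The functions closure and query are hypothetical helpers written for this example, not part of any DAML toolkit.

```python
SUB_PROPERTY = {"motherOf": "parentOf"}                     # from the previous slide
INVERSE = {"parentOf": "childOf", "childOf": "parentOf"}    # (parentOf inverse childOf)


def closure(facts, sub_property, inverse):
    """Apply the subproperty and inverse rules until no new triples appear."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for subj, prop, obj in list(known):
            derived = []
            if prop in sub_property:
                derived.append((subj, sub_property[prop], obj))
            if prop in inverse:
                derived.append((obj, inverse[prop], subj))
            for fact in derived:
                if fact not in known:
                    known.add(fact)
                    changed = True
    return known


def query(known, subj, prop):
    """Return every ?X such that (subj prop ?X) is known."""
    return sorted(obj for s, p, obj in known if s == subj and p == prop)


known = closure({("Mary", "motherOf", "Bill")}, SUB_PROPERTY, INVERSE)
answers = query(known, "Bill", "childOf")
```

Here answers binds ?X to Mary, matching the slide.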
8 Language Formality and Expressiveness
[Figure: languages plotted by formality and expressiveness. Languages shown: Human Language, Cyc, F-Logic, KIF, OWL, DAML, SQL, XML. Region labels: Human Consumption, Machine Processing, Machine Inference.]
Note: OWL = Web Ontology Language. It is more expressive than XML and RDF, and is the revision of the DAML+OIL web ontology language.
9 Content Formality and Size
[Figure: ontology content plotted by size and formality. Resources shown: Cyc, WordNet, SUMO+domain, SUMO, UMLS, Yahoo!, DOLCE. Category labels: Taxonomy, Lexicons, Formal Ontology.]
Notes:
- WordNet: online lexical database of synonyms, concepts, and relations.
- UMLS: NLM's Unified Medical Language System, for knowledge representation and retrieval.
- Cyc: commonsense reasoning engine.
- SUMO: Suggested Upper Merged Ontology; IEEE standards effort; uses KIF. Dictionary.
- DOLCE: Descriptive Ontology for Linguistic and Cognitive Engineering.
10 Everything is not a Nail
- Ontology is not always the right tool for the job
- Face recognition, vehicle control systems, etc. are not the right applications for ontology
11 Many Ways to Use Ontology
- As an information engineering tool
  - Create a database schema
  - Map the schema to an upper ontology
  - Use the ontology as a set of reminders for additional information that should be included
- As more formal comments
  - Define an ontology that is used to create a DB or OO system
  - Use a theorem prover at design time to check for inconsistencies
- For taxonomic reasoning
  - Do limited run-time inference in Prolog, a description logic, or even Java
- For first order logical inference
  - Full-blown use of all the axioms at run time
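As a rough illustration of the "taxonomic reasoning" use, the sketch below answers class-membership questions by walking subclass links. The class names Physical, Object, Process, and SelfConnectedObject come from the SUMO slides in this deck; the direct parent assigned to each class and the individual Ferry1 are assumptions made for the example.

```python
# Each class maps to its direct superclass (a simplified, single-parent taxonomy).
SUBCLASS_OF = {
    "Object": "Physical",
    "Process": "Physical",
    "SelfConnectedObject": "Object",
}
# Hypothetical individual, used only to exercise the functions.
INSTANCE_OF = {"Ferry1": "SelfConnectedObject"}


def superclasses(cls, subclass_of):
    """Return all superclasses of cls, nearest first."""
    chain = []
    while cls in subclass_of:
        cls = subclass_of[cls]
        chain.append(cls)
    return chain


def is_instance(individual, cls, instance_of, subclass_of):
    """True if the individual belongs to cls directly or through subclass links."""
    direct = instance_of.get(individual)
    if direct is None:
        return False
    return cls == direct or cls in superclasses(direct, subclass_of)


ancestors = superclasses("SelfConnectedObject", SUBCLASS_OF)
ferry_is_physical = is_instance("Ferry1", "Physical", INSTANCE_OF, SUBCLASS_OF)
ferry_is_process = is_instance("Ferry1", "Process", INSTANCE_OF, SUBCLASS_OF)
```

This is the limited, run-time style of inference the slide mentions; it uses only the subclass links and none of the other axioms.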
12 Upper Ontology
- An attempt to capture the most general and
reusable terms and definitions
13 Motivation
- Ontologies may have different names for the same things
  - type: a relation between a class and an instance
  - instance: a relation between a class and an instance
  - isa: a relation between a class and an instance
  - ...
- Ontologies may have the same name for different things, and no corresponding terms
  - before: a relation between two time points
  - before: a relation between two time intervals
- Either use the same upper ontology, or at least map to a common upper ontology
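One way to picture "map to a common upper ontology" is a lookup table from each ontology's local term to a shared upper term. In this sketch the ontology names (OntologyA, OntologyB, OntologyC) and the upper-level names beforeTimePoint and beforeTimeInterval are hypothetical, chosen only to mirror the two cases on the slide.

```python
# (local ontology, local term) -> term in the shared upper ontology.
# All ontology names and the two "before..." upper terms are hypothetical.
MAPPINGS = {
    ("OntologyA", "type"): "instance",
    ("OntologyB", "instance"): "instance",
    ("OntologyC", "isa"): "instance",
    ("OntologyA", "before"): "beforeTimePoint",
    ("OntologyB", "before"): "beforeTimeInterval",
}


def same_meaning(term_a, term_b, mappings):
    """Two local terms agree only if both map to the same upper term."""
    upper_a = mappings.get(term_a)
    upper_b = mappings.get(term_b)
    return upper_a is not None and upper_a == upper_b


different_names_same_thing = same_meaning(
    ("OntologyA", "type"), ("OntologyC", "isa"), MAPPINGS
)
same_name_different_things = same_meaning(
    ("OntologyA", "before"), ("OntologyB", "before"), MAPPINGS
)
```

The comparison is made on the upper-ontology term, never on the local name, so both kinds of mismatch from the slide are handled the same way.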
14 Formal Upper Ontologies
15 Simple Methodology
- Extract nouns and verbs from a source text
- Find classes in SUMO for the nouns and verbs. (SUMO: Suggested Upper Merged Ontology.)
- Record a mapping as being either equal, subsuming, or instance.
- Type a single word that relates to the UBL term in the "SUMO term" or "English Word" text areas in the SUMO browser. (UBL: Universal Business Language.)
- Create a subclass of SUMO if it's a subsuming mapping
- Add properties to the subclass
  - reusing SUMO properties
  - extending SUMO properties by creating a subrelation of an existing property
- Add English definition to the class
- Define constraints that express how the subclass is more specific than the superclass
- Express the classes and properties in KIF and begin creating axioms, based on the English definitions created previously
16 Exercise
- Walk through the process of creating some ontology content from a source text
- Learn a general methodology
- Get practical familiarity with KIF and SUMO
17 First Exercise (1)
- Seven Turkish nationals of Chechen origin
hijacked a Russia-bound Panamanian ferry in
Trabzon. The hijackers initially threatened to
kill all Russians on board unless Chechen
separatists being held in Dagestan, Russia, were
released. On 19 January 1998, the hijackers
surrendered to Turkish authorities outside the
entrance to the Bosporus. The passengers were
unharmed.
- Identify items that need formalization: start with nouns and verbs
18 First Exercise (2)
- Seven Turkish nationals of Chechen origin
hijacked a Russia-bound Panamanian ferry in
Trabzon. The hijackers initially threatened to
kill all Russians on board unless Chechen
separatists being held in Dagestan, Russia, were
released. On 19 January 1998, the hijackers
surrendered to Turkish authorities outside the
entrance to the Bosporus. The passengers were
unharmed.
- Now create terms that correspond to the nouns and verbs
- Remove redundancy
- Are there any background notions that are not explicit in the text?
19 First Exercise (3)
- Seven Turkish nationals of Chechen origin
hijacked a Russia-bound Panamanian ferry in
Trabzon. The hijackers initially threatened to
kill all Russians on board unless Chechen
separatists being held in Dagestan, Russia, were
released. On 19 January 1998, the hijackers
surrendered to Turkish authorities outside the
entrance to the Bosporus. The passengers were
unharmed.
- Turkey, Chechnya, Nationality, Hijacking, Threatening, Killing, Releasing, Holding, Dagestan, Russia, Separatist, Entrance, Bosporus, Unharmed, Panama, Trabzon, Authority, Outside, boundFor, Ferry, onBoard
20 SUMO Overview
- Understanding what's in the upper ontology, in order to use it effectively
21 High Level Distinctions
- The first fundamental distinction is that between Physical (things which have a position in space/time) and Abstract (things which don't)
- Physical, Abstract
22 High Level Distinctions
- Partition of Physical into Objects and Processes
- Physical
  - Object
  - Process
23 Objects
- Object
- SelfConnectedObject
- Substance
- CorpuscularObject
- Region
- Collection
- ...
24 Processes
IntentionalProcess, IntentionalPsychologicalProcess, RecreationOrExercise, OrganizationalProcess, Guiding, Keeping, Maintaining, Repairing, Poking, ContentDevelopment, Making, Searching, SocialInteraction, Maneuver,
Motion, BodyMotion, DirectionChange, Transfer, Transportation, Radiating,
DualObjectProcess, Substituting, Transaction, Comparing, Attaching, Detaching, Combining, Separating,
InternalChange, BiologicalProcess, QuantityChange, Damaging, ChemicalProcess, SurfaceChange, Creation, StateChange, ShapeChange
25 Abstract
- SetOrClass
- Relation
- Proposition
- Quantity
- Number
- PhysicalQuantity
- Attribute
- Graph
- GraphElement
- ...
26 A Little Bit of Logic
- Instance: GeorgeBush, Iraq, BobsRightBigToe
- Class: Human, Nation
- Relation: WWI before WWII, Bill childOf Mary
- => (read as "implies"): if X then Y
- and: X and Y are true
- or: X or Y (or both) are true
- not: not X, the opposite of the truth of X
- exists ?X: there exists something about which the following is true
27 A Little Structural Ontology
- (instance GeorgeBush Human): GeorgeBush is an instance of the class of humans
- (exists (?X) (parent ?X GeorgeBush)): there exists something of which George Bush is the parent
- (instance parent BinaryPredicate): the relation of parent is a binary relation
- (domain parent 1 Organism): the first argument to the parent relation must be an instance of the class Organism
- (domain parent 2 Organism): similarly for the second argument
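The domain statements above act like type constraints on a relation's arguments. The following Python sketch checks a statement against them. It assumes, for illustration, that Human is treated as a direct subclass of Organism and that Iraq is an instance of Nation; the individual Child1 and the helper names are hypothetical.

```python
DOMAINS = {"parent": {1: "Organism", 2: "Organism"}}  # (domain parent N Organism)
SUBCLASS_OF = {"Human": "Organism"}  # simplifying assumption, not stated on the slide
INSTANCE_OF = {
    "GeorgeBush": "Human",
    "Child1": "Human",   # hypothetical individual
    "Iraq": "Nation",    # assumed from the instance and class examples
}


def is_a(individual, cls):
    """True if the individual's class is cls or a subclass of it."""
    current = INSTANCE_OF.get(individual)
    while current is not None:
        if current == cls:
            return True
        current = SUBCLASS_OF.get(current)
    return False


def domain_violations(statement):
    """Return (argument position, required class) pairs that the statement violates."""
    relation, *args = statement
    problems = []
    for position, required in DOMAINS.get(relation, {}).items():
        if position > len(args) or not is_a(args[position - 1], required):
            problems.append((position, required))
    return problems


ok = domain_violations(("parent", "Child1", "GeorgeBush"))
bad = domain_violations(("parent", "Iraq", "GeorgeBush"))
```

A well-typed statement yields an empty list; putting a Nation in the first argument of parent is reported as a violation of the Organism constraint.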
28 Linking to SUMO Terms
- Nation, Confining, Committing, SocialRole, TransportationDevice, Killing, Near, Injuring, citizen, (not), (exists)
- Terms from the exercise (may or may not be the same as SUMO terms): Turkey, Chechnya, Nationality, Hijacking, Threatening, Killing, Releasing, Holding, Dagestan, Russia, Separatist, Entrance, Bosporus, Unharmed, Panama, Trabzon, Authority, Outside, boundFor, Ferry, onBoard
- Use the terms in the first bullet to define the terms in the second bullet
- Use Nation to state (instance Turkey Nation)
29 Formalization
(exists (?TURK ...)
  (and
    (citizen ?TURK Turkey)
    ...))
30 Formalization
(exists (?TURK ?FERRY ...)
  (and
    (citizen ?TURK Turkey)
    (instance ?FERRY FerryBoat)
    ...))
31 Formalization
(exists (?TURK ?FERRY ?HIJACK)
  (and
    (citizen ?TURK Turkey)
    (instance ?FERRY FerryBoat)
    (instance ?HIJACK Hijacking)
    (agent ?HIJACK ?TURK)
    (patient ?HIJACK ?FERRY)
    (earlier ?HIJACK
      (DayFn 19
        (MonthFn January
          (YearFn 1998))))))
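To see what the existential statement asserts, it can help to check it against a small set of ground facts. The Python sketch below treats each conjunct as a pattern and searches for bindings of ?TURK, ?FERRY, and ?HIJACK that satisfy all of them. The individuals Hijacker1, Ferry1, and Hijack1 are hypothetical, and the matcher is a toy conjunctive query, not a KIF theorem prover.

```python
# The date term from the slide, kept as a nested ground term.
DATE = ("DayFn", 19, ("MonthFn", "January", ("YearFn", 1998)))

# Hypothetical ground facts describing one hijacking.
FACTS = [
    ("citizen", "Hijacker1", "Turkey"),
    ("instance", "Ferry1", "FerryBoat"),
    ("instance", "Hijack1", "Hijacking"),
    ("agent", "Hijack1", "Hijacker1"),
    ("patient", "Hijack1", "Ferry1"),
    ("earlier", "Hijack1", DATE),
]

# The conjuncts of the formalization, in the order given on the slide.
PATTERNS = [
    ("citizen", "?TURK", "Turkey"),
    ("instance", "?FERRY", "FerryBoat"),
    ("instance", "?HIJACK", "Hijacking"),
    ("agent", "?HIJACK", "?TURK"),
    ("patient", "?HIJACK", "?FERRY"),
    ("earlier", "?HIJACK", DATE),
]


def match(pattern, fact, bindings):
    """Extend bindings so that pattern equals fact, or return None."""
    if len(pattern) != len(fact):
        return None
    result = dict(bindings)
    for p, f in zip(pattern, fact):
        if isinstance(p, str) and p.startswith("?"):
            if result.setdefault(p, f) != f:  # variable already bound elsewhere
                return None
        elif p != f:
            return None
    return result


def solve(patterns, facts, bindings=None):
    """Yield every variable binding that satisfies all patterns."""
    bindings = {} if bindings is None else bindings
    if not patterns:
        yield bindings
        return
    first, rest = patterns[0], patterns[1:]
    for fact in facts:
        extended = match(first, fact, bindings)
        if extended is not None:
            yield from solve(rest, facts, extended)


solutions = list(solve(PATTERNS, FACTS))
```

The existential statement holds for these facts exactly when solutions is non-empty.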
32 Ontologies Working Group Goals
- Collect controlled vocabularies for sample descriptions.
- Define microarray concepts and their relationships.
- Provide bridge to ontologies from other knowledge domains.
33 What is an Ontology?
An ontology is a specification of a conceptualization that is designed for reuse across multiple applications and implementations. ... a specification of a conceptualization is a written, formal description of a set of concepts and relationships in a domain of interest.
Peter Karp (2000) Bioinformatics 16:269
34 Ontologies in Gene Expression Databases
- Controlled vocabulary
  - Define relationships through hierarchy (e.g., taxonomy)
- Schema
  - Concepts as objects or relational tables
  - Attributes and data types provide specification
  - Relationships specified through subclassing (objects) or foreign keys (relational tables)
- Knowledge representation
  - Link to other domains (gene sequence annotation, gene and protein roles, pathways)
  - Facilitate data exchange by mapping common concepts
35 Anatomy Hierarchy
36 Experiment Tables
Experiment
37 MAML DTD -> UML mapping: Array platform (MAML: MicroArray Markup Language)
38 Other Domains
- Gene descriptions (Gene Ontology)
  - Molecular function
  - Biological process
  - Subcellular localization
- Cellular and biochemical pathways (EcoCyc)
- Literature (MeSH)
- Phenotypes
- Others
Requires common set of terms (semantic mapping) or shared usage of identifiers (e.g. GenBank accessions)
39 www.mged.org
40 Example Sample Descriptions
41 MIAME Ontology
Define MIAME (Minimum Information About a Microarray Experiment) concepts and their relationships, incorporating MAML. The goal is to generate a document that will provide a clear and common understanding of what should be reported and how. The tables are a draft to form the basis for such a document. Located at Ontology Working Group home page.