Title: Foundations of the Semantic Web: Ontology Engineering
1 Foundations of the Semantic Web: Ontology Engineering
- Building Ontologies 1
- Alan Rector and colleagues
2 Goals for this module
- Be able to implement an ontology representation in OWL-DL
  - Be able to elicit a conceptualisation
  - Be able to formulate an ontology representation
  - Be able to implement the ontology representation in OWL-DL
    - Or be able to say you can't
  - To understand the limits of OWL-DL ontologies
- Be ready to apply ontology representations in any of several use cases
  - In one week, we can't build the applications, but to build an ontology is only a means to building applications
  - Without applications ontologies are pointless
3 Ontologies and Ontology Representations
- "Ontology": a word borrowed from philosophy
  - But we are necessarily building logical systems
- "Physical symbol systems"
  - Simon, H. A. (1969, 1981). The Sciences of the Artificial, MIT Press
- "Concepts" and "Ontologies"/"conceptualisations" in their original sense are psychosocial phenomena
  - We don't really understand them
- "Concept representations" and "Ontology representations" are engineering artefacts
  - At best approximations of our real concepts and conceptualisations (ontologies)
  - And we don't even quite understand what we are approximating
4 Ontologies and Ontology Representations (cont)
- Most of the time we will just say "concept" and "ontology", but whenever anybody starts getting religious, remember:
  - It is only a representation!
  - We are doing engineering, not philosophy, although philosophy is an important guide
- There is no one way!
  - But there are consequences to different ways
  - and there are wrong ways
  - and better or worse ways for a given purpose
- The test of an engineering artefact is whether it is fit for purpose
  - Ontology representations are engineering artefacts
5 Approach
- Design patterns
  - Analogous to Java design patterns
  - Standard ways to do things
  - Someday they will be supported by tools, but today you have to do it yourself
- Elephant traps
  - Common errors and misconceptions
  - Especially those that seem to work at first
- Foundations of knowledge representation
  - 200 to 2000 years of experience you need not repeat
- Common dilemmas and tradeoffs
  - Things for which we don't have a perfect answer
6 Why it's hard (1)
- Clash of intuitions
  - Subject Matter Experts motivated by custom and practice
    - Prototypes and Generalities
  - Logicians motivated by logic and computational tractability
    - Definitions and Universals
- Transparency and predictability vs Rigour and Completeness
- Neophytes (you?) caught in the muddled middle
7 Why it's hard (2)
- Conflation of Models
  - Meaning: Correctness of classification and retrieval
  - Retrieval: Task of discovery, search, or finding
  - Use: Task of data entry, decision support, ...
  - Acquisition: Task of capturing knowledge
  - Quality assurance: Criteria for whether it is correct
8 Beware
- DLs/OWL are not all of Knowledge Representation
- Knowledge Representation is not all of the Semantic Web
- The Semantic Web is not all of Knowledge Management
- The field is still full of controversies
- This module is to teach you about implementation in OWL-DL
9 Steps in developing an Ontology
- Establish the purpose
  - Without purpose, no scope, requirements, evaluation, ...
- Informal/Semiformal knowledge elicitation
  - Collect the terms
  - Organise terms informally
  - Paraphrase and clarify terms to produce informal concept definitions
  - Diagram informally
- Refine requirements and tests
10 Steps in implementing an Ontology
- Implementation
  - Develop normalised schema and skeleton
  - Implement prototype recording intention
    - Keep track of what you meant to do so you can compare with what happens
    - Implementing logic-based ontologies is programming
  - Scale up a bit
    - Check performance
  - Populate
    - Possibly with help of text mining and language technology
- Evaluate and quality assure
  - Against
  - Include tests for evolution and change management
- Monitor use and evolve
  - Process not product!
11 If this were three modules
- Knowledge elicitation and analysis
  - A quick overview
- Implementation
  - A solid introduction
- Evolution, ontology alignment, and management
  - Left for another module
  - But a major motivation for the methods taught in this module
    - Normalisation and documentation of intentions
12 Vocabulary
- Class ≈ Concept ≈ Category ≈ Type
- Instance ≈ Individual
- Property ≈ Slot ≈ Relation ≈ Relation type ≈ Attribute ≈ Semantic link type ≈ Role
  - but be careful about "role"
    - Means "property" in DL-speak
    - Means "role played" in most ontologies
      - E.g. doctor_role, student_role
13 Related ideas
- Object Oriented Programming
  - Java, C++, Smalltalk, etc.
  - But OO programming is not knowledge representation
- Object Oriented Design (DB world)
  - But data models are not ontologies either
  - Although UML is often a good starting point
  - Additional a-logical issues
    - Difference between attributes and relations
    - Issues of life cycle and handling of aggregation
  - Notion of an instance
  - Implicitly closed world
- Frame based systems, Semantic Nets, Traditional AI
  - Where it all started, but real differences
- RDF(S), Topic Maps and other node-and-arc symbolisms
  - What's in a link?
  - The battles in standards committees continue
14 Maintaining large Ontologies: Conceptual Lego
[Figure: composed concepts, e.g. "SNPolymorphism of CFTRGene causing Defect in MembraneTransport of ChlorideIon causing Increase in Viscosity of Mucus in CysticFibrosis" and "Hand which is anatomically normal"; labels: OpenGALEN, OWL]
15 The benefits: Modularisation; Bridging Scales and context with Ontologies
[Figure labels: Species, Genes, Protein, Function, Disease]
16 Logic Based Ontologies: Individuals, Classes and Properties
[Figure labels: Student, Ed_inst; an Individual, Class, property]
17 Logic Based Ontologies: restrictions
18 Existential Restrictions: The basic unit of ontology development
- class (Student partial Person restriction attends someValuesFrom Ed_inst)
- "All students attend some educational institution."
- ∀x. Student(x) → ∃y. Ed_inst(y) ∧ attends(x,y)
- Student ⊑ ∃ attends . Ed_inst
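The first-order reading of the existential restriction can be sketched as a check against one small, finite interpretation. This is only an illustration: the individuals (john, mary, manchester_univ, music_soc) are made up, and checking a single interpretation is not what an OWL reasoner does. A reasoner working under the open world assumption would instead conclude that mary attends some institution that has not been named.

```python
# Toy finite interpretation; every individual name here is hypothetical.
student = {"john", "mary"}
ed_inst = {"manchester_univ"}
attends = {
    ("john", "manchester_univ"),
    ("john", "music_soc"),
    ("mary", "music_soc"),
}


def satisfies_some_values_from(members, prop, filler):
    """Return the members with at least one `prop` link into `filler`."""
    return {x for x in members if any((x, y) in prop for y in filler)}


# Students for which "attends someValuesFrom Ed_inst" holds in this interpretation.
ok = satisfies_some_values_from(student, attends, ed_inst)
# Students for which this interpretation fails to be a model of the axiom.
violations = student - ok

print(sorted(ok))
print(sorted(violations))
```

In this interpretation only john has an attends link to an Ed_inst, so the interpretation as a whole is not a model of the Student axiom.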
19 Summary: Concepts in OWL (The T-Box)
- Statements about classes/concepts/types
  - Summarise statements about all individuals in a given class
  - Interpreted as defining and describing concepts
- Existential Graphs
  - The simplest default form
20 Elephant Trap 1: Restrictions using allValuesFrom instead of someValuesFrom
universal: ∀; existential: ∃
- When in doubt, use someValuesFrom
  - The existential graph outlines the ontology representation
- Use allValuesFrom only when you are sure what you are doing
  - Using allValuesFrom wrongly often works for a time
  - but comes back to bite you later
- See supplementary exercise "Existentials and Universals: All and Only"
  - http://www.cs.man.ac.uk/rector/CS646/Existentials-and-Universals.ppt
  - http://www.cs.man.ac.uk/rector/CS646/existentials-and-universals.daml
21 Definitions
[Figure labels: Student attends someValuesFrom CS_Module; CS_student; CS_Module; M0_CS646; John; attends; Lindsey_music_soc. Caption: CS_student is defined]
22 Definitions and Primitives (1)
- CS_student is defined
  - The definition is sufficient to recognise a CS student.
  - The words tell us what it means
    - A computer science student not taking computer science doesn't make sense
- Class (CS_student complete Student restriction attends someValuesFrom CS_Module)
- ∀x. CS_student(x) ↔ Student(x) ∧ ∃y. CS_Module(y) ∧ attends(x,y)
- CS_student ≡ Student ⊓ ∃ attends . CS_Module
23 Definitions and Primitives (2)
- Person is a primitive class: undefined parent of Student
- Class (Person partial Being restriction hasPart cardinality 2 Leg)
- Having two legs is an interesting fact about all persons, but it doesn't define person
24 Definitions and Primitives (3)
- In this representation, we have also represented Student as primitive (marked by keyword "partial")
  - although some would argue we could have defined the concept
- What should be primitive?
  - No absolute rules
  - Must start from some primitives someplace
  - Often evolve from primitives to definitions
    - E.g. we might decide to evolve the ontology to define a student as a person who attends an educational institution
    - class (Student sameClass-as Person and restriction attends someValuesFrom Ed_inst)
25 Definitions and Primitives: Summary
- Primitive classes
  - Only necessary conditions
  - i.e. use only → (⊑)
  - OWL: class(... partial ...) or subclass-of
- Defined classes
  - Necessary and jointly sufficient conditions
  - i.e. use ↔ (≡)
  - OWL: class(... complete ...) or equivalent-classes()
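The practical difference between "partial" and "complete" can be sketched as follows. This is a hand-written toy inference over made-up individuals (john, ann, cs646), not a description of how a DL classifier works internally. It only shows the direction in which each kind of axiom can be used: a primitive class lets us infer consequences of membership, while a defined class also lets us recognise members.

```python
# Hypothetical asserted facts.
asserted = {
    "john": {"Student"},
    "ann": {"Person"},       # attends a CS module but was never asserted to be a Student
    "cs646": {"CS_Module"},
}
attends = {"john": {"cs646"}, "ann": {"cs646"}}


def attends_some(x, cls):
    """True if x attends at least one individual asserted to be in cls."""
    return any(cls in asserted.get(y, set()) for y in attends.get(x, set()))


def infer_classes(x):
    classes = set(asserted.get(x, set()))
    # Primitive (partial): Student -> Person. Necessary condition, used left to right only.
    if "Student" in classes:
        classes.add("Person")
    # Defined (complete): CS_student <-> Student and attends some CS_Module.
    # The sufficient direction lets us recognise members.
    if "Student" in classes and attends_some(x, "CS_Module"):
        classes.add("CS_student")
    return classes


print(sorted(infer_classes("john")))
print(sorted(infer_classes("ann")))
```

ann is a Person attending a CS module, yet she is not recognised as a Student, because Student is primitive here and carries no sufficient conditions.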
26 Elephant Trap 2: Forgetting to set "complete" instead of "partial"
- In the text or XML version
- in OilEd
27 Let's start at the beginning
- You now have all you need to implement simple existential ontologies, so let's go back to the beginning
- The goal: build an ontology for the CS department
28 Purpose and Scope
- Describe CS department to allow
  - annotation of pages on semantic web
- If this works
  - Express rules for equal opportunities for personnel and rules about the format of modules and courses
29 What concepts do we need?
CS646
Lecture
Department
Student
Computer_Science_Department
Professor
Faculty_of_science
Handout
Course
Resource_centre
Seat
Lecturer
Exam
Slides
Lab
Computer
Alan_Rector
30 How to group them
[Figure: the terms arranged in informal groups: Handout, Slides, Computer, Lab, Lecture, Faculty_of_science, Student, Exam, Professor, Department, Lecturer, Computer_Science_Department, Alan_Rector, Course, Module, Resource_centre, CS646, Lab (Lab appears twice)]
31 Make some sense of groups
[Figure: the same terms as on slide 30]
32 Disambiguate
[Figure: Handout, Slides, Computer, Lab_session, Lecture, Student, Exam, Faculty_of_science, Professor, Department, Lecturer, VUM_Computer_Science_Department, Alan_Rector, Course, Module, Resource_centre, CS646, Lab_room]
33 Try it again and again
34 Abstract and Label
[Figure: the terms of slide 32 with group labels added: ?Things?, Educational_activities, People, University_divisions, Teaching_units]
35 Link them up
[Figure: terms and group labels as on slide 34, with the links attend and teach added]
36 More Links
[Figure: the same terms and group labels as on slide 34]
37 Fill in and ladder up
[Figure: terms and group labels as on slide 34, with Educational_activities relabelled Teaching_activities, the label study added, and further terms filled in under People: caucasian, Black, English, girl, man, boy, elderly, woman, child, adult]
38 Build from the middle out: Pick some links and taxonomies
[Figure: taxonomies Teaching_unit (Module, Course); Teaching_activities (Lecture, Lab_session, Exam); Academic_staff (Professor, Lecturer); links: includes, teaches, belongs_to, is_taught_by, gives, is_given_by]
- QUESTIONS
  - Is every Teaching_unit taught by a member of academic staff? vice versa?
  - Is every Teaching_unit given by a member of academic_staff? vice versa?
  - Does every Teaching_activity belong to a Teaching_unit? vice versa?
39 Part of the Simplest OilEd Version
40 Some general rules
- Several kinds of modules
  - For now we will consider ACS_modules and Third_yr_modules
- All ACS modules include labs and exams
- Third year modules include lectures and exams
- Form the defined class of modules that include exams? modules that include lectures? modules that include labs?
41 Before Classification
[Screenshot annotations:]
- Click here to classify
- "complete" button set: makes this a defined class
- Must have clicked here to start reasoner
- note inherited restriction
42 After Classification
[Screenshot annotations:]
- green symbol: never to be clicked
- red symbol: always click before saving
- Classifier has inferred which kinds of modules are subsumed by defined (pink) classes
43 Another Question
- All ACS modules are short-fat modules
  - What is short-fat?
  - What is the opposite of short-fat?
    - Long-thin
  - Nothing can be both short-fat and long-thin
    - They are disjoint
- Let's call this a Format and make it a kind of ValueType
  - Modules are independent or "self-standing" concepts
  - ValueTypes are dependent or "refining" or "non-self-standing" concepts
    - Usually get a property to go with them
      - "unique": can have only a single value
      - Sometimes called "functional"
44 Axioms for University Rules
- class (ACS_module restriction hasFormat someValuesFrom Short_fat)
- class (Third_year_module restriction hasFormat someValuesFrom Long_thin)
- class (Short_fat_module complete (Module hasFormat someValuesFrom Short_fat))
45 It's already getting complicated
46 But if we clear out the defined bits, it is just a set of trees
Trees of primitives are the basic modules of a normalised ontology
47 Check to see if we are right
- It should not be possible to have
  - a course which is both short_fat and long_thin
  - a short_fat third_year_course
  - a long_thin ACS_course
- So define them and see if they turn red (are unsatisfiable)
  - Convenient to name them "Probe_..."
48 Probes_ (correctly) unsatisfiable
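The probe idea can be sketched with a small brute-force search. This is an illustration under stated assumptions, not how OilEd or a DL reasoner works: it enumerates tiny interpretations of a single module, assumes hasFormat is functional and that Short_fat and Long_thin are disjoint (as the preceding slides intend), and reports a probe as unsatisfiable when no enumerated interpretation fits. All function names are made up for this sketch.

```python
from itertools import product

FORMATS = ("Short_fat", "Long_thin")
# Each hasFormat value may belong to any subset of the format classes.
LABELINGS = [
    frozenset(f for f, keep in zip(FORMATS, bits) if keep)
    for bits in product([False, True], repeat=len(FORMATS))
]


def candidate_values(max_values=2):
    """All ways one module can have 0..max_values hasFormat values."""
    for n in range(max_values + 1):
        yield from product(LABELINGS, repeat=n)


def has_some(values, cls):
    return any(cls in v for v in values)


def axioms_hold(values, is_acs, is_third_year):
    functional = len(values) <= 1                       # hasFormat is functional
    disjoint = all(len(v) <= 1 for v in values)         # Short_fat, Long_thin disjoint
    acs_rule = (not is_acs) or has_some(values, "Short_fat")
    y3_rule = (not is_third_year) or has_some(values, "Long_thin")
    return functional and disjoint and acs_rule and y3_rule


def satisfiable(probe):
    """True if some enumerated interpretation satisfies the axioms and the probe."""
    return any(
        axioms_hold(v, a, t) and probe(v, a, t)
        for v in candidate_values()
        for a in (False, True)
        for t in (False, True)
    )


probe_short_fat_third_year = lambda v, a, t: t and has_some(v, "Short_fat")
probe_long_thin_acs = lambda v, a, t: a and has_some(v, "Long_thin")
probe_short_fat_acs = lambda v, a, t: a and has_some(v, "Short_fat")

print(satisfiable(probe_short_fat_third_year))
print(satisfiable(probe_long_thin_acs))
print(satisfiable(probe_short_fat_acs))
```

The first two probes correspond to classes that "turn red"; the third is a sanity check that the axioms themselves are not contradictory.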
49 Apply the same techniques to a more complicated tangle
- University policy requires that all departments have both women and men as academic staff
  - So we need to sort out the tangle of kinds of people
50 People from our first analysis, plus a few extra the University is worried about for Equal Opportunities compliance
People
caucasian
Black
Student
Nonblack_lecturer
English
girl
Black_lecturer
Professor
man
boy
Lecturer
Young_lecturer
elderly
woman
Alan_Rector
child
adult
Woman_lecturer
Woman_professor
51 We've hardly started and already tangled and wrong (see Black_woman_Professor)
52 Normalisation (Untangling)
- Build the ontology out of simple trees of primitives (Taxonomies)
  - No primitive has more than one primitive parent
  - All multiple classification done by classifier
- (NB only works for Domain ontology
  - but that's the part we are interested in most of the time)
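The rule "no primitive has more than one primitive parent" is easy to check mechanically. The sketch below assumes the asserted hierarchy is available as a simple mapping from each primitive class to its asserted primitive parents; the mapping and the function name `tangled` are hypothetical, with class names borrowed from the People slide.

```python
# Hypothetical asserted hierarchy: primitive class -> asserted primitive parents.
primitive_parents = {
    "Person": [],
    "Ac_staff": ["Person"],
    "Professor": ["Ac_staff"],
    "Lecturer": ["Ac_staff"],
    "Woman_professor": ["Professor", "Woman"],  # tangled: two primitive parents
    "Black_lecturer": ["Lecturer", "Black"],    # tangled: two primitive parents
}


def tangled(parents):
    """Return primitives that violate the one-primitive-parent rule."""
    return sorted(cls for cls, ps in parents.items() if len(set(ps)) > 1)


print(tangled(primitive_parents))
```

Each class reported here is a candidate for becoming a defined class, so that the classifier, rather than the author, places it under several parents.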
53 Kinds of Taxonomies (1)
- Disjoint: nothing can be of both kinds
  - As far as possible, all primitive taxonomies should be disjoint.
  - When not disjoint, make artificially disjoint and use a multi-valued (non-functional) property
    - In example: make ethnicity disjoint; having an ethnicity potentially multivalued
- Covering
  - Exhausts all possibilities
  - True for qualitative formal/legal partitions (fiat partitions)
  - Almost never true of natural kinds
54 Kinds of Taxonomies (2)
- Intrinsic
  - Characteristics inherent in the individual described
    - Opposite is accidental
  - Don't change (or do so only in odd circumstances)
  - Sex is intrinsic
  - Academic status is accidental
  - Having a fever, or not, is accidental
- Temporal
  - Value depends on time
    - child-adult
    - Academic rank
55 Untangle: identify characteristics of taxonomies
- Person: natural kind; self-standing: y
- Sex (Male, Female): property has_sex, functional: y; disjoint: y; covering: y; intrinsic: y; temporal: n; self-standing: n
- Emp_Role (Ac_staff, Professor, Lecturer): property has_emp_role, functional: y (University rule); disjoint: y; covering: n; intrinsic: n; temporal: y; self-standing: y
- Age_group (Adult, Child): property has_age_group, functional: y; disjoint: y; covering: y (view); intrinsic: y; temporal: y; self-standing: n
- Ethnicity (admin) (White, Non_white, Black, Indian, West_indian, Oriental): property has_ethnicity, functional: ?y (admin y); disjoint: y (fiat); covering: n; intrinsic: y; temporal: n; self-standing: n
56 Ontologies are Dependent on Views
- Decisions on previous slide depend on administrative rules as well as "the world"
- "fiat": a rule
  - Used by Barry Smith to cover both
    - Administrative distinctions
    - Arbitrary linguistic distinctions
      - e.g. child and adult
57 Disjoint Axiom in OilEd
58 A disjoint covering axiom in OilEd
- "female, male covers Sex_value_type": Sex_value_type → female ∨ male
- Sex_value_type subclass_of (female or male): Sex_value_type ⊑ female ⊔ male
- BUT female ⊑ Sex_value_type and male ⊑ Sex_value_type
- therefore equivalent_classes(Sex_value_type (female ⊔ male))
- "disjoint" box ticked
59 Revised Properties
- functional (aka unique/single-valued)
60 Disjointness axiom
61 Negation (Complement)
- If my intention really is negation, then it is probably better to represent it that way
  - Rather than with a disjointness axiom
- Non_white_ethnicity ≡ Ethnicity_Value_Type ⊓ ¬White_ethnicity
62 Result of definition by relative negation in OilEd
63 Person before classification: Mostly flat, could be flatter (Black_professor has deliberately been left in original form)
64 Person After Classification (The tangle is all in the pink (defined) classes)
65 The tangle is pink (defined)
66 The Primitives (yellow) form clean modular Trees
67 Exercise: Finish Untangling Child-Adult
- File is Teaching-5-people-untangled.daml
68 Elephant Trap 3: OWL is Open World; All and Only; And and Or
- So far, all knowledge bases have used only
  - someValuesFrom: ∃, existential quantifiers
  - intersection-of: ⊓, ∧, "and"
    - Almost entirely implicit in the software
- Open World
  - Everything is possible until we say it is impossible
  - Negation as unsatisfiability
- Closed world
  - Nothing is true unless we say it is
  - Negation as failure
  - Almost all other software uses closed world reasoning
    - Databases, logic programming, query languages, ...
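The contrast between the two assumptions can be sketched with a single question, "does this module have no labs?", asked of the same recorded facts. The data and both function names are hypothetical; the open-world function returns None to stand for "cannot be concluded", which is a simplification of what a reasoner reports.

```python
# Hypothetical recorded facts: (module, activity) pairs and the known labs.
includes = {("m_y3_1", "Lec3"), ("m_y3_1", "Exam2")}
labs = {"Lab1", "Lab3"}


def closed_world_has_no_lab(module):
    """Negation as failure: no recorded lab means the module has no lab."""
    return not any((module, lab) in includes for lab in labs)


def open_world_has_no_lab(module, closure_asserted):
    """Open world: a missing fact proves nothing.

    Returns False if a lab is recorded, True only if an explicit closure
    statement has been made, and None (unknown) otherwise.
    """
    if any((module, lab) in includes for lab in labs):
        return False
    return True if closure_asserted else None


print(closed_world_has_no_lab("m_y3_1"))
print(open_world_has_no_lab("m_y3_1", closure_asserted=False))
print(open_world_has_no_lab("m_y3_1", closure_asserted=True))
```

A database query behaves like the first function; an OWL classifier behaves like the second, which is why something extra has to be said before a module can be classified as lab-free.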
69 Lab-free courses
- Define a module without labs
  - Easiest way is to define a module with labs and then negate it
  - class(Module_with_labs complete restriction includes someValuesFrom Lab)
    - Already in the KB
  - class(Module_without_labs complete (not Module_with_labs))
- Module_without_labs ought to subsume Third_year_module
  - Because third year modules have no labs
70 Try 1 doesn't work? Why? What have we not told the reasoner?
71 Try 1: Situation
[Figure: (every) Module includes allValuesFrom Teaching_activity; modules m_ac_1, m_ac_2, m_y3_1, m_y3_2 linked by includes to teaching activities Lab1, Lab3, Lab5, Lec2, Lec3, Lec4, Exam2, Exam4]
Note: not necessarily every module includes some teaching activity
72 Try 2
- No disjointness axioms between lectures, labs and exams.
  - Classes in OWL are non-disjoint by default
  - Must add disjointness axioms explicitly
  - all-disjoint(Lecture Lab Exam)
73 Try 2: Add disjointness
[Figure: same modules and teaching activities as on slide 71; (every) Module includes allValuesFrom (only) Teaching_activity]
Labs are separate from lectures and exams
74 Try 2: no change
- What the reasoner knows
  - Third_year_module restriction includes someValuesFrom Exam
  - Third_year_module restriction includes someValuesFrom Lecture
  - disjoint(Lecture, Exam, Lab)
- Would it be consistent with these axioms to have a Third_year_module with Labs?
  - Try it and see if it turns red
75 Try 2 plus probe
- Satisfiable
  - Does not turn red
[Screenshot annotation: Probe not red]
76 ... but it would still be consistent to add
[Figure: same modules and teaching activities as on slide 73]
77 Try 3: What else to add
- OWL is Open World
  - The definitions mean "at least this much and anything else"
- Must add explicit closure restrictions
  - Must say either
    - Third year modules may have only lectures and exams
      - Third_year_module restriction (includes allValuesFrom (Lecture or Exam))
      - Why have we said "or" rather than "and"?
    - Third year modules may not have labs
      - Third_year_module subclass-of not (restriction includes someValuesFrom Lab)
      - Why is the "not" outside the parentheses?
- Are these definitions equivalent?
  - Let's find out
(typo in handout)
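Why both the disjointness axiom and the closure restriction are needed can be sketched by brute force. The code below is an illustration, not a reasoner: it enumerates small interpretations of one third-year module, where each included activity may belong to any subset of Lecture, Lab and Exam, and asks whether a "third-year module with a lab" probe can be satisfied. The bound of three activities and all function names are assumptions of the sketch.

```python
from itertools import product

KINDS = ("Lecture", "Lab", "Exam")
# An activity may belong to any subset of the three classes.
LABELINGS = [
    frozenset(k for k, keep in zip(KINDS, bits) if keep)
    for bits in product([False, True], repeat=len(KINDS))
]


def interpretations(max_activities=3):
    """All ways one module can include 0..max_activities activities."""
    for n in range(max_activities + 1):
        yield from product(LABELINGS, repeat=n)


def third_year_axioms(acts, disjoint, closure):
    ok = any("Exam" in a for a in acts) and any("Lecture" in a for a in acts)
    if disjoint:   # all-disjoint(Lecture Lab Exam)
        ok = ok and all(len(a) <= 1 for a in acts)
    if closure:    # includes allValuesFrom (Lecture or Exam)
        ok = ok and all("Lecture" in a or "Exam" in a for a in acts)
    return ok


def lab_probe_satisfiable(disjoint, closure):
    """Can a third-year module include some Lab under the chosen axioms?"""
    return any(
        third_year_axioms(acts, disjoint, closure) and any("Lab" in a for a in acts)
        for acts in interpretations()
    )


for disjoint in (False, True):
    for closure in (False, True):
        print(disjoint, closure, lab_probe_satisfiable(disjoint, closure))
```

Only the combination of disjointness and closure makes the probe unsatisfiable. With closure alone, an activity that is both a Lecture and a Lab still satisfies "only lectures or exams".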
78 Add closure axiom
[Figure: same modules and teaching activities as on slide 71; one includes link marked "Ruled out by closure axiom"]
79 Both work, but try1 and try2 are not equivalent: why?
80 Several reasons
- Primitive rather than defined
  - "partial" rather than "complete" (Elephant trap 2)
  - Make them defined: no change
- There could be other sorts of teaching activities
  - Make teaching activities cover Lab, Exam, Lecture
    - Claims these are the ONLY teaching activities
    - Axiom: Teaching_activity subclass-of (Lab or Exam or Lecture)
    - Already know Lab, Exam, Lecture are subclasses of Teaching_activity
    - So the covering is complete.
- Modules could include other things than teaching activities
  - Module restriction includes allValuesFrom Teaching_activity
81 Then they are equivalent
[Screenshot annotation: symbols show equivalence]
82 NB: In this case, this is probably false
- There could be many more kinds of teaching activity
- Therefore, best to remove the covering axiom
  - Teaching_activity is a self-standing concept
  - Therefore do not close
- Example used for demonstration of effect of covering axioms
83 There are many ways to represent the same thing
- Consider if
  - Every person has some sex
    - Person → hasSex someValuesFrom Sex_VT
  - male, female are kinds of sex
    - male → Sex_VT; female → Sex_VT
  - male, female covers sex, i.e. are the only kinds of sex
    - Sex_VT → (male ∨ female); Sex_VT ⊑ (male ⊔ female)
- Or consider (in this case where hasSex is functional)
  - Woman_1 ≡ Person ⊓ hasSex someValuesFrom female
  - Woman_2 ≡ Person ⊓ hasSex allValuesFrom female
  - Woman_3 ≡ Person ⊓ hasSex someValuesFrom (not male)
  - Woman_4 ≡ Person ⊓ hasSex allValuesFrom (not male)
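That the four definitions coincide only because of the global axioms can be sketched by enumeration. The code assumes, beyond what this slide states, that male and female are disjoint (the "disjoint" box of the earlier covering slide) and that every hasSex value is a Sex_VT. It compares the sets of hasSex value assignments each definition admits, with and without the global axioms; all function names are made up.

```python
from itertools import product

SEXES = ("male", "female")
# A hasSex value may belong to any subset of {male, female}.
LABELINGS = [
    frozenset(s for s, keep in zip(SEXES, bits) if keep)
    for bits in product([False, True], repeat=len(SEXES))
]


def value_sets(max_values=2):
    """All ways one person can have 0..max_values hasSex values."""
    for n in range(max_values + 1):
        yield from product(LABELINGS, repeat=n)


def global_axioms(values):
    some_sex = len(values) >= 1                    # Person -> hasSex some Sex_VT
    functional = len(values) <= 1                  # hasSex is functional
    covering = all(len(v) >= 1 for v in values)    # Sex_VT -> male or female
    disjoint = all(len(v) <= 1 for v in values)    # assumed: male, female disjoint
    return some_sex and functional and covering and disjoint


DEFINITIONS = {
    "Woman_1": lambda vs: any("female" in v for v in vs),
    "Woman_2": lambda vs: all("female" in v for v in vs),
    "Woman_3": lambda vs: any("male" not in v for v in vs),
    "Woman_4": lambda vs: all("male" not in v for v in vs),
}


def extensions(require_axioms):
    """Value assignments admitted by each definition."""
    return {
        name: {
            vs for vs in value_sets()
            if (global_axioms(vs) or not require_axioms) and test(vs)
        }
        for name, test in DEFINITIONS.items()
    }


with_axioms = extensions(require_axioms=True)
without_axioms = extensions(require_axioms=False)
print(len({frozenset(e) for e in with_axioms.values()}))     # distinct extensions
print(len({frozenset(e) for e in without_axioms.values()}))
```

With the axioms, all four definitions admit exactly the same assignments. Without them they come apart, for example a person with no recorded sex satisfies the allValuesFrom forms but not the someValuesFrom forms.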
84 Which to choose: Style
- Choose the form least dependent on global axioms
  - If possible use someValuesFrom
- Choose the simpler expression
  - Avoid negation except where needed
- Do not try to show how clever you are
  - Makes for unreadable, hard to maintain ontologies
  - Just as clever code makes for unreadable, hard to maintain programs
- Make all primitives disjoint trees
  - Join them with definitions
  - "Conceptual Lego"
85 Elephant Trap 4: Domain and Range Constraints are Restrictions. Other ways to say "only": Domain and Range Constraints
- In OWL, domain and range constraints are equivalent to allValuesFrom restrictions
  - teaches domain Academic_staff range Module
  - Thing → teaches allValuesFrom Module; Thing → isTaughtBy allValuesFrom Academic_staff
- Violating domain/range constraints can
  - Cause the thing to be classified (coerced) to the specified class
    - This is often the wrong behaviour
  - If other axioms make this inconsistent, then the class in question will be unsatisfiable
    - And possibly much of the rest of the ontology
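The coercion behaviour can be sketched in a few lines. This is a toy forward inference over individuals, whereas the slides are about classes in a classifier. The Play class, the disjointness between Module and Play, and the asserted facts are hypothetical additions made purely for illustration.

```python
# From the slide: teaches domain Academic_staff range Module.
DOMAIN = {"teaches": "Academic_staff"}
RANGE = {"teaches": "Module"}
# Hypothetical disjointness axiom used to expose the clash.
DISJOINT = [frozenset({"Module", "Play"})]


def infer_types(asserted_types, triples):
    """Apply domain and range constraints as inference rules, not as checks."""
    types = {ind: set(ts) for ind, ts in asserted_types.items()}
    for subj, prop, obj in triples:
        if prop in DOMAIN:
            types.setdefault(subj, set()).add(DOMAIN[prop])
        if prop in RANGE:
            types.setdefault(obj, set()).add(RANGE[prop])
    return types


def clashes(types):
    """Individuals that end up in two classes declared disjoint."""
    return sorted(ind for ind, ts in types.items() if any(pair <= ts for pair in DISJOINT))


asserted = {"Hamlet": {"Play"}}
triples = [("Alan_Rector", "teaches", "Hamlet")]

types = infer_types(asserted, triples)
print(sorted(types["Hamlet"]))
print(clashes(types))
```

Nothing is rejected when the triple is added: Hamlet is silently coerced into Module. The mistake only surfaces because of the disjointness axiom, which is the argument for making primitives disjoint.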
86 Domain / Range Constraints
- Reclassification / Coercion
  - Bad style
  - Hard to debug
  - Likely to lead to problems in interpretation later
    - One person thinks "Hamlet" represents Module about the play Hamlet, the other that "Hamlet" represents the play Hamlet
    - One person thinks "Red" is a nick-name for somebody, the other a colour.
87 Example of reclassification likely to lead to confusion. Also likely to de-normalise ontology by giving one primitive more than one primitive parent
88 If Ontology Properly Normalised, All (domain) primitives are Disjoint
- Easy to do by making all primitive branches of each tree disjoint.
  - Although OilEd/OWL often returns these as huge lists of pairwise disjoint axioms after saving
    - a "round-tripping" problem to be fixed in the next generation of tools
- If this is done, then most domain/range errors result in unsatisfiable (inconsistent) models.
89 But don't get much help finding error, and not doubly classified in OilEd
90 Domain and Range Constraints
- Despite problems, domain/range constraints are important to avoid errors
  - e.g. has_academic_rank someValuesFrom Professor instead of has_academic_rank someValuesFrom Professor_rank
- However, likely to make much of ontology turn red
  - Look for base cause
  - Make small incremental changes
- Arguments about how to implement in future.
91 Elephant Trap 5: Confusing kind-of and part-of
- Many systems deal in parts and wholes
  - Anatomy
  - Computer aided design
    - Airbus UK, Boeing, etc
  - Stock control
  - Organisational charts
  - Document maps
    - XML is essentially about documents and their parts
  - Web pages
92 Library systems often mix the two (and others)
- Librarians / thesauri speak of
  - "is broader than" / "is narrower than"
  - Precisely to avoid implying which logical relation applies
- Formal representations must make the distinction
- Typical thesaurus
  - Computer
    - Microcomputer (isKindOf)
      - Mother_board (isPartOf)
        - CPU (isPartOf)
          - Pentium (isKindOf)
            - Transistor (isPartOf)
93 Logic based systems must unscramble
- Kinds
  - Computer > MicroComputer
  - Parts of Computer > CPU > Pentium
  - Circuit Components > Transistor
- Parts
  - Microcomputer > Mother_board > CPU > Transistor
94 ... but note we can rebuild with "Conceptual Lego"
- Restrictions
  - restriction isPartOf someValuesFrom Computer
  - restriction isPartOf someValuesFrom Microcomputer
  - restriction isPartOf someValuesFrom Mother_board
  - restriction isPartOf someValuesFrom CPU
  - restriction isPartOf someValuesFrom Pentium
  - restriction isPartOf someValuesFrom Transistor
- The restrictions reconstruct the original hierarchy
  - but make context explicit
  - Would not work for isColourOf or isDesignedBy
- The new ontology is re-usable
  - The old was not.
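Keeping kind-of and part-of in separate structures, and then recombining them, can be sketched as below. The two dictionaries follow the unscrambled slide, with underscores added to the class names. The function names and the treatment of isPartOf as transitive are assumptions of this sketch, and the traversal is a hand-written stand-in for what a classifier would infer from the restrictions.

```python
# Kind-of taxonomy (child -> parent), following the "Kinds" list.
IS_KIND_OF = {
    "Microcomputer": "Computer",
    "CPU": "Parts_of_Computer",
    "Pentium": "CPU",
    "Transistor": "Circuit_component",
}
# Part-of links (part -> whole), following the "Parts" list.
IS_PART_OF = {
    "Mother_board": "Microcomputer",
    "CPU": "Mother_board",
    "Transistor": "CPU",
}


def kinds_of(cls):
    """cls followed by all of its kind-of ancestors."""
    chain = [cls]
    while chain[-1] in IS_KIND_OF:
        chain.append(IS_KIND_OF[chain[-1]])
    return chain


def wholes_of(cls):
    """Classes W such that every cls isPartOf some W (isPartOf assumed transitive)."""
    found = set()
    frontier = list(kinds_of(cls))   # parts are inherited from kind-of ancestors
    while frontier:
        current = frontier.pop()
        if current in IS_PART_OF:
            for whole in kinds_of(IS_PART_OF[current]):
                if whole not in found:
                    found.add(whole)
                    frontier.append(whole)
    return found


print(sorted(wholes_of("Pentium")))
print(sorted(wholes_of("Transistor")))
print(sorted(wholes_of("Computer")))
```

Pentium is found to be part of a Computer without any direct link between them: it is a kind of CPU, a CPU is part of a Mother_board, and so on. The original thesaurus-style hierarchy is recovered, but each step now says which relation it relies on.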
95 Part-whole relations
- A major field in their own right
  - Mereology and Mereotopology
- Subject of a major part of later lectures on patterns