Title: Next Generation Knowledge Management applying semantic web technology
1 Next Generation Knowledge Management applying semantic web technology
- John Davies
- Manager, Next Generation Web Research
2 Overview
- Introduction to the Semantic Web
- XML, RDF, OWL
- Ontologies
- Semantic Web Knowledge Management
- SEKT project
- Research technology
- Applications
- Exploitation
3 Limitations of the Web today
- Machine-to-human, not machine-to-machine
4 The Semantic Web
- Tim Berners-Lee
- "an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation"
- An open platform allowing information to be shared and processed
- adding context and structure
5 Scientific American, May 2001
6 Where we are today: the Syntactic Web
Hendler & Miller 02
7 i.e. the Syntactic Web is
- A place where
- computers do the presentation (easy) and
- people do the linking and interpreting (hard).
- Why not get computers to do more of the hard work?
Goble 03
8 Hard Work using the Syntactic Web
- Complex queries involving background knowledge
- Find information about "animals that use sonar but are not bats, dolphins or whales"
- Locating information in data repositories
- Travel enquiries
- Prices of goods and services
- Results of human genome experiments
- Delegating complex tasks to web agents
- "Book me a holiday next weekend somewhere warm, not too far away, and where they speak French or English"
Horrocks 03
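The sonar query on this slide needs background knowledge that a keyword search does not have. A minimal Python sketch, assuming a tiny hand-written knowledge base: the animal names, the `KIND_OF` links and the `USES_SONAR` set are illustrative assumptions, not data from the talk.

```python
# Hypothetical background knowledge: each animal maps to its broader kind.
KIND_OF = {
    "pipistrelle": "bat",
    "bottlenose dolphin": "dolphin",
    "sperm whale": "whale",
    "oilbird": "bird",
    "shrew": "insectivore",
}

# Hypothetical facts about which animals use sonar (echolocation).
USES_SONAR = {"pipistrelle", "bottlenose dolphin", "sperm whale", "oilbird", "shrew"}


def sonar_users_excluding(excluded_kinds):
    """Return sonar-using animals whose kind is not in excluded_kinds."""
    return sorted(
        animal for animal in USES_SONAR
        if KIND_OF.get(animal) not in excluded_kinds
    )


result = sonar_users_excluding({"bat", "dolphin", "whale"})
print(result)  # ['oilbird', 'shrew']
```

The exclusion works on the kind of each animal, so a page that never mentions the word "bat" is still filtered out if it describes a pipistrelle.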
9 XML is a first step
- Semantic markup
- HTML → layout
- use bold font
- Insert an image here
- XML → content
- this part of the document is the product price
- this document describes a telecommunications service
10 XML
<play>
  <title>The Life and Death of King John</title>
  <DramatisPersonae>
    <persona>The Earl of PEMBROKE</persona>
    <persona>The Earl of ESSEX</persona>
    ...
  </DramatisPersonae>
  <Stagedir>SCENE England, the Court.</Stagedir>
  <act>Act 1
    <scene>Scene I.
      <speech>
        <speaker>John</speaker>
        <line>Now, Chatillon, what would France with us?</line>
      </speech>
      ...
    </scene>
  </act>
</play>
11 XML is a first step
- Semantic markup
- HTML → layout
- XML → content
- Metadata (with limitations)
- within documents, not across documents
- prescriptive, not descriptive
- No commitment on vocabulary and modelling primitives
<vehicle>
  <car>ford
    <engine>xyz123-4</engine>
    <model>mondeo</model>
  </car>
</vehicle>
- RDF and ontologies are the next step
12 XML limitations for semantic markup
- XML per se makes no commitment on
- Domain specific ontological vocabulary
- Which words shall we use to describe a given set of concepts?
- Ontological modelling primitives
- How can we combine these concepts, e.g. car is a-kind-of (subclass-of) vehicle
- ⇒ requires pre-arranged agreement on vocab and primitives
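The vocabulary problem can be made concrete with two XML documents that describe the same car. This is a minimal sketch using Python's standard `xml.etree.ElementTree`; the tag names in `DOC_B` and the `AGREED_SYNONYMS` mapping are hypothetical, chosen only to show the mismatch.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents describing the same car with different tag names.
DOC_A = "<vehicle><car>ford<model>mondeo</model></car></vehicle>"
DOC_B = "<automobile><make>ford</make><type>mondeo</type></automobile>"


def find_models(xml_text, tag="model"):
    """Return the text of every element with the given tag name."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter(tag)]


models_a = find_models(DOC_A)
# Nothing is found here: the parser cannot know that 'type' means 'model'.
models_b = find_models(DOC_B)

# A pre-arranged mapping (the agreement the slide refers to) restores the link.
AGREED_SYNONYMS = {"model": ["model", "type"]}
models_b_mapped = [
    m for tag in AGREED_SYNONYMS["model"] for m in find_models(DOC_B, tag)
]

print(models_a, models_b, models_b_mapped)
```

Both documents are well-formed XML, yet only an agreement made outside XML lets a program treat them as statements about the same thing.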
13 What are Ontologies?
- Ontologies provide a shared and common understanding of a domain (medicine, finance, ...)
- a shared specification of a conceptualisation
- A simple example - Yahoo
- Business & Economy > Finance > Banking
- for WWW, defined using RDF(S) and OWL
14 Taxonomies
[Figure: taxonomy tree. Animals > Vertebrates (Reptiles, Mammals), Invertebrates (Insects, Arachnids), ..]
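A taxonomy like this one reduces to child-to-parent links plus a walk up the tree. A minimal Python sketch; the slide layout does not survive in this text, so the grouping of Reptiles and Mammals under Vertebrates and of Insects and Arachnids under Invertebrates is assumed from standard biology, and the function names are illustrative.

```python
# Child -> parent links for the taxonomy on this slide (grouping assumed).
PARENT = {
    "Vertebrates": "Animals",
    "Invertebrates": "Animals",
    "Reptiles": "Vertebrates",
    "Mammals": "Vertebrates",
    "Insects": "Invertebrates",
    "Arachnids": "Invertebrates",
}


def ancestors(concept):
    """Return every broader concept, nearest first."""
    chain = []
    while concept in PARENT:
        concept = PARENT[concept]
        chain.append(concept)
    return chain


def is_a(concept, ancestor):
    """True if concept equals ancestor or sits below it in the tree."""
    return concept == ancestor or ancestor in ancestors(concept)


print(ancestors("Mammals"))            # ['Vertebrates', 'Animals']
print(is_a("Insects", "Vertebrates"))  # False
```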
15 Ontology of People and their Roles
Employee
Expert
Analyst
Manager
Programme Mgr
Project Mgr
16 Structure of an Ontology
- Ontologies typically have two distinct components
- Names for important concepts and relationships in the domain
- Elephant is a concept whose members are a kind of animal
- Herbivore is a concept whose members are exactly those animals who eat only plants or parts of plants
- Background knowledge/constraints on the domain
- Adult_Elephants weigh at least 2,000 kg
- No individual can be both a Herbivore and a Carnivore
Horrocks 03
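The two constraints on this slide can be checked mechanically once they are written down as data. A minimal Python sketch, assuming a hand-rolled representation: the `Individual` class, the `DISJOINT` and `MIN_WEIGHT_KG` tables, the class name `Adult_Elephant` and the sample individuals are all illustrative, not part of any ontology language.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set

# Background knowledge from the slide, written as plain data (assumed encoding).
DISJOINT = [("Herbivore", "Carnivore")]
MIN_WEIGHT_KG = {"Adult_Elephant": 2000}


@dataclass
class Individual:
    name: str
    classes: Set[str] = field(default_factory=set)
    weight_kg: Optional[float] = None


def violations(ind: Individual) -> List[str]:
    """List every constraint that the individual breaks."""
    problems = []
    for a, b in DISJOINT:
        if a in ind.classes and b in ind.classes:
            problems.append(f"{ind.name}: cannot be both {a} and {b}")
    for cls, minimum in MIN_WEIGHT_KG.items():
        if cls in ind.classes and ind.weight_kg is not None and ind.weight_kg < minimum:
            problems.append(f"{ind.name}: {cls} must weigh at least {minimum} kg")
    return problems


light = Individual("light", {"Adult_Elephant", "Herbivore"}, weight_kg=1500)
odd = Individual("odd", {"Herbivore", "Carnivore"})
print(violations(light))
print(violations(odd))
```

A description logic reasoner does considerably more than this (it derives consequences as well as detecting clashes), so the sketch only shows what kind of statement a constraint is.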
17 Why develop an ontology?
- To define web resources more precisely and make them more amenable to machine processing
- To make domain assumptions explicit
- Easier to change domain assumptions
- Easier to understand and update legacy data
- To separate domain knowledge from operational knowledge
- Re-use domain and operational knowledge separately
- A community reference for applications
- To share a consistent understanding of what information means
18 Types of Ontologies
Guarino, 98
Top-level ontologies: describe very general concepts like space, time, event, which are independent of a particular problem or domain. It seems reasonable to have unified top-level ontologies for large communities of users.
Domain ontologies: describe the vocabulary related to a generic domain by specializing the concepts introduced in the top-level ontology.
Task ontologies: describe the vocabulary related to a generic task or activity by specializing the top-level ontologies.
Application ontologies: these are the most specific ontologies. Concepts in application ontologies often correspond to roles played by domain entities while performing a certain activity.
19 Ontologies - Some Examples
- General purpose ontologies
- The Upper Cyc Ontology, http://www.cyc.com/cyc-2-1/index.html
- IEEE Standard Upper Ontology, http://suo.ieee.org/
- Domain and application-specific ontologies
- RDF Site Summary (RSS), http://groups.yahoo.com/group/rss-dev/files/schema.rdf
- Dublin Core, http://dublincore.org/
- UMLS, http://www.nlm.nih.gov/research/umls/
- Open Biological Ontologies, http://obo.sourceforge.net/
- FOAF, www.foaf.org
- Ontologies in a wider sense
- Agrovoc, http://www.fao.org/agrovoc/
- UNSPSC, http://eccma.org/unspsc/
- DAML.org library, http://www.daml.org/
20 RDF and RDF-S
- W3C standards
- RDF-S defines the ontology
- classes and their properties and relationships
- what concepts do we want to reason about and how are they related
- there are authors, and authors write books
- RDF defines the instances of these classes and their properties
- Mark Twain is an author
- Mark Twain wrote Adventures of Tom Sawyer
- Adventures of Tom Sawyer is a book
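The split between schema and instances can be sketched with plain Python data. This is an illustration only: the `SCHEMA` dictionary and the `conforms` check are assumptions of this sketch, and treating domain and range as validation rules is a simplification, since in RDF-S they license inferences about types and do not reject data.

```python
# Schema level (what RDF-S would express): classes and a property signature.
SCHEMA = {
    "classes": {"Author", "Book"},
    "properties": {"wrote": {"domain": "Author", "range": "Book"}},
}

# Instance level (what RDF would express): (subject, predicate, object) triples.
TRIPLES = [
    ("Mark Twain", "type", "Author"),
    ("Adventures of Tom Sawyer", "type", "Book"),
    ("Mark Twain", "wrote", "Adventures of Tom Sawyer"),
]


def types_of(resource):
    """Classes that the instance data assigns to a resource."""
    return {o for s, p, o in TRIPLES if s == resource and p == "type"}


def conforms(triple):
    """Check one triple against the schema (simplified, see lead-in)."""
    s, p, o = triple
    if p == "type":
        return o in SCHEMA["classes"]
    sig = SCHEMA["properties"].get(p)
    return (
        sig is not None
        and sig["domain"] in types_of(s)
        and sig["range"] in types_of(o)
    )


all_ok = all(conforms(t) for t in TRIPLES)
print(all_ok)  # True
```

Reversing the last triple, so that the book "wrote" the author, fails the check because the subject is not typed as an Author.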
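21 An example RDF Schema
Annotation of WWW resources and semantic links
[Figure: schema (RDFS) layer with classes Writer, Book and FamousWriter (subClassOf Writer) and property hasWritten (domain Writer, range Book); data (RDF) layer with instances /twain.com/mark (DoB 25/12/68) and books.com/ISBN00010475, linked by hasWritten and connected to the schema by type.]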
22 RDF
hasName (http://www.famouswriters.org/twain/mark, "Mark Twain")
hasWritten (http://www.famouswriters.org/twain/mark, http://www.books.org/ISBN00001047582)
title (http://www.books.org/ISBN00001047582, "The Adventures of Tom Sawyer")
XML version:
<rdf:Description rdf:about="http://www.famouswriters.org/twain/mark">
  <s:hasName>Mark Twain</s:hasName>
  <s:hasWritten rdf:resource="http://www.books.org/ISBN0001047"/>
</rdf:Description>
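The XML version carries the same statements as the triples above it, which a short script can show. A minimal sketch using Python's standard `xml.etree.ElementTree`; the `rdf` namespace URI is the W3C one, while the URI bound to the `s` prefix is an assumption because the slide does not give it. The function handles only this flat shape, and a real application would use a full RDF parser.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
S_NS = "http://example.org/schema#"  # assumed: the slide does not define 's'

RDF_XML = f"""
<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:s="{S_NS}">
  <rdf:Description rdf:about="http://www.famouswriters.org/twain/mark">
    <s:hasName>Mark Twain</s:hasName>
    <s:hasWritten rdf:resource="http://www.books.org/ISBN0001047"/>
  </rdf:Description>
</rdf:RDF>
"""


def triples(rdf_xml):
    """Extract (subject, predicate, object) from flat rdf:Description blocks."""
    root = ET.fromstring(rdf_xml)
    out = []
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            predicate = prop.tag.split("}", 1)[1]  # local name without namespace
            # Object is either a linked resource or the literal text content.
            obj = prop.get(f"{{{RDF_NS}}}resource") or (prop.text or "").strip()
            out.append((subject, predicate, obj))
    return out


extracted = triples(RDF_XML)
for t in extracted:
    print(t)
```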
23 Conclusions about RDF(S)
- Next step up from plain XML
- (small) ontological commitment to modeling primitives
- possible to define vocabulary
- However
- no precisely described meaning
- no inference model
24 Ontology and Logic
- Reasoning over ontologies
- Inferencing capabilities
- X is author of Y ⇒ Y is written by X
- X is supplier to Y; Y is supplier to Z ⇒ X and Z are part of the same supply chain
- Cars are a kind of vehicle; Vehicles have 2 or more wheels ⇒ Cars have 2 or more wheels
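The three inferences on this slide can be sketched as rules applied repeatedly until nothing new appears (naive forward chaining). The predicate names (`authorOf`, `writtenBy`, `supplierTo`, `sameSupplyChainAs`, `kindOf`, `minWheels`) and the sample facts are assumptions of this sketch, not terms from any standard vocabulary.

```python
# Properties passed down kindOf links (an assumption of this sketch).
INHERITED = {"minWheels", "kindOf"}


def infer(facts):
    """Return the facts plus everything the three rules derive from them."""
    known = set(facts)
    while True:
        new = set()
        for s, p, o in known:
            if p == "authorOf":
                # X is author of Y => Y is written by X
                new.add((o, "writtenBy", s))
            elif p == "supplierTo":
                # X supplies Y and Y supplies Z => X and Z share a supply chain
                for s2, p2, o2 in known:
                    if p2 == "supplierTo" and s2 == o:
                        new.add((s, "sameSupplyChainAs", o2))
            elif p == "kindOf":
                # A subclass inherits selected properties of its superclass
                for s2, p2, o2 in known:
                    if s2 == o and p2 in INHERITED:
                        new.add((s, p2, o2))
        if new <= known:
            return known
        known |= new


FACTS = {
    ("twain", "authorOf", "tom_sawyer"),
    ("A", "supplierTo", "B"),
    ("B", "supplierTo", "C"),
    ("car", "kindOf", "vehicle"),
    ("vehicle", "minWheels", 2),
}
closure = infer(FACTS)
print(sorted(closure - FACTS, key=str))
```

The loop stops because the set of possible facts over these names is finite, so each pass either adds something or ends the search.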
25 Web Ontology Language Requirements
- Desirable features identified for Web Ontology Language
- Extends existing Web standards
- Such as XML, RDF, RDFS
- Easy to understand and use
- Should be based on familiar KR idioms
- Formally specified
- Of adequate expressive power
- Possible to provide automated reasoning support
26 OWL Language
- OWL is based on Description Logics knowledge representation formalism
- OWL (DL) benefits from many years of DL research
- Well defined semantics
- Formal properties well understood (complexity, decidability)
- Known reasoning algorithms
- Implemented systems (highly optimised)
- Three species of OWL
- OWL Full: maximum expressivity, undecidable
- OWL DL: based on SHIQ DL (more precisely, it corresponds to SHOIN(D)), decidable
- OWL Lite: subset of OWL DL, most efficient reasoning
27 Semantic Web Layers
Entailment of the Implicit
Explicit Semantics
Relational Distributed Data
Data Exchange
28 Why OWL?
- OWL: Web Ontology Language
- Owl's superior intelligence is known throughout the Hundred Acre Wood, as are his talents for Writing, Spelling, other Educated and Special tasks.
- "My spelling is Wobbly. It's good spelling, but it Wobbles, and the letters get in the wrong places."
29 Semantic Web Knowledge Management
30 Business Motivation: Knowledge Management
- Corporate workers are overwhelmed with information
- from intranets, emails, external newslines, DMSs, ...
- but may still lack the information they require
- They need information
- filtered by semantics, not just keywords
- tailored to their interests and their task context
- in a form appropriate to their current physical context
- mobile phone, PDA, blackberry, laptop, ...
- aggregated from heterogeneous data sources
31 SEKT
- addressing the semantic knowledge technology research, development and exploitation agenda
- developing Next Generation Knowledge Management (NGKM)
- 6th Framework IP project
- start date 1/1/2004
- 36 months, €12.5m
- www.sekt-project.com
32 The inSEKTs
Vrije Universiteit Amsterdam
Siemens BS
Empolis
University of Sheffield
Universität Karlsruhe
BT
Ontoprise
Kea-pro
Universität Innsbruck
iSOCO
Sirma AI
Universitat Autònoma de Barcelona
Jozef Stefan Institute
33 Semantic Web KM
- Making WWW information machine processable
- annotation via ontologies and metadata
- offers prospect of enhanced knowledge management
- better knowledge access and sharing
- heterogeneous information sources, proactive knowledge delivery, seamless knowledge access
- significant research and technology challenges are outstanding
34 SEKT - The Goal
- To deliver next generation semantic knowledge technology through
- Foundational research
- (Semi-)automatic ontology generation and population
- Human Language Technology and Knowledge Discovery
- Ontology management (mediation, evolution, inferencing)
- Innovative technology development
- A suite of knowledge access tools
- Open source ontology middleware platform
- Validated by 3 case studies and benchmarking/usability activities
- Supported by a methodology
35 Major RTD challenges
- Improve automation of ontology and metadata generation
- Research and develop techniques for ontology management and evolution
- Develop highly-scalable solutions
- Research sound inferencing despite inconsistent models
- Develop semantic knowledge access tools
- Develop methodology for deployment
36 Key outcomes
- technological progress through development of leading edge, integrated semantically-enabled KM software tools
- scientific progress through foundational research
- creation of awareness via dissemination, training
- showcases - 3 case study applications
- exploitation via open source, freeware and proprietary software
37 Key outcomes
- building the European Research Area through collaboration with related IP and NoE projects in this area for a coordinated impact strategy
- SEKT, DIP, KnowledgeWeb: SDK cluster
- http://sdk.semanticweb.org
- European Semantic Web Symposium
- http://www.esws2004.org/
- Conference series established
- ESWC05: Crete, May 2005
- 260 attendees
38 Annotation is a key issue
- How do we handle legacy knowledge?
- automating metadata extraction
- using human language technology
- significant research and technology challenges are outstanding
- creating and managing ontologies is an overhead
- semi-automatic generation of ontologies
- using knowledge discovery
- semi-automatic maintenance and evolution of ontologies
- plus ontology merging and mapping
- needs a multi-disciplinary approach
39 Multidisciplinary approach
KD/HLT
Management and evolution
KD/HLT
- Need to determine appropriate technology mix
- Semi-automatic
40 The semantic desktop
- context-aware tools for access to semantically-annotated knowledge
- tools: search, browse, visualise, summarise, share, infer
- integrated into day-to-day business processes
- automatic knowledge delivery based on current context
- activity, location, device, interests
- support multiple end-user devices (RDF-based)
- also support for on-the-fly metadata creation
- metadata creation as a side-effect of data creation
41 Semantic Annotation
42 Semantic Browsing
43 Semantic Browsing
44 Semantic Search
45 Semantic Search Results
46 Semantic Search, Referring Documents
47 Semantic Search, Referring Documents
48 Real-life applications
helping newly-appointed judges
helping IT consultants
a corporate digital library
- Use/refinement of SEKT methodology
- Usability, business benefits and benchmarking
49 Exploitation
- Exploitation key roles
- Systems integrator
- Several software vendors
- Sector-specific organisations
- Key outputs
- integrated suite of software components
- open source/freeware environment for semantic knowledge applications
- SEKT brand
50 Exploitation - Target markets
- Direct exploitation e.g. enterprise search
- growing at 10% p.a., driven by taxonomies
- Horizontal integration
- iComms - integrated communications
- portals and content management
- CRM, eLearning, helpdesk, sales support
- Vertical markets
- tailoring functionality and ontologies
- legal, life sciences, consultancy
51 Project Overview
52 Summary
- Semantic Web
- machine-processable web-based data
- making the computer a device for computation again!
- Application of semantic web to Knowledge Management
- Research challenges remain
- Starting to deploy real applications
53 Acknowledgements
- York Sure, University of Karlsruhe
- Sean Bechhofer, University of Manchester
54 Thank you
www.sekt-project.com
john.nj.davies_at_bt.com