Title: Building semantic applications
1Building semantic applications
ACAI05/SEKT05 ADVANCED COURSE ON KNOWLEDGE
TECHNOLOGIES
- Paul Warren
- BT
- paul.w.warren_at_bt.com
2Introductions: myself, and SEKT
- Paul Warren
- Next Generation Web Research, BT
- http://www.bt.com
- http://www.btplc.com/Innovation
- SEKT: http://www.sekt-project.com
- SEmantic Knowledge Technologies
- machine learning for ontology creation
- HLT for metadata extraction
- managing and reasoning with ontologies
- Motivated by corporate knowledge needs
- looking forward to the Semantic Web
3- Motivation
- The need for semantics
- Acquiring and using semantics
- Integrating information
- SEKT Applications
- Ontology engineering in SEKT
- Applications
- Challenges
4Motivation
- Knowledge management
- Knowledge worker productivity is the biggest challenge facing organisations
- 40% of the U.S. workforce are knowledge workers
- Peter Drucker
- Information integration
- Heterogeneous data sources
- across or within organisations
- sensor networks
5The need for semantics
- Corporate workers are overwhelmed with information
- from intranets, emails, external newslines
- but may still lack the information they require
- They need information identified
- by semantics, not just keywords
- precise and complete
- by their interests and their task context
- defined semantically
6Higher precision, greater recall
the need for semantics
- Precision
- Find me information about Washington the man, not the state or city
- Find me information about a company called X which operates in industry Y
- Recall
- Ask for information about George W Bush and be given documents on the President
7Precision in searching
the need for semantics
8Interests and context
the need for semantics
- Need information about Jaguar?
- interested in cars, the natural world, South America
- with a context defined by current activities
- Not just about searching
- interests and context to share information
- and to push information to user
- plus many integrated applications
9Too much relevant information
the need for semantics
- They may even have too much relevant information
- Need to
- aggregate from disparate sources
- remove duplication
- present meaningfully
- classified
- summarised
10In the right form
the need for semantics
- Depending on physical context
- mobile phone, PDA, blackberry
- With appropriate visualisation
- relations between documents and concepts
- And expressed in natural language
- where this aids understanding
- multilingual
- Integrated
- into the desktop applications
- Seamless
- proactive, not reactive
11Visualisation knowledge
the need for semantics
Key: white = concepts; orange = projects; lighter shading = clusters of projects
12The goal
the need for semantics
- Finding and sharing knowledge through its semantics
- for improved precision and recall
- for the user's interests and current context
- Presenting information
- visually
- in natural language
- Extracting information
- in a meaningful way, without duplication
- Displaying all relevant information
- from the document and the knowledgebase
13Acquiring and using semantics
- Some manually generated
- for high value applications
- e.g. life sciences
- Most (semi-)automatically generated
- machine learning / statistical techniques
- HLT: ontology-based information extraction
- from context
14Context
acquiring and using semantics
- What is known about the author?
- use his interests to disambiguate
- What is attached to an object
- import an object, import its metadata
- Where is it?
- position in folder structure
- Provenance
- attachment from email
15Ontology modelling
acquiring and using semantics
[Figure: ontology diagram with the properties "sells to", "employee size" and "operates in"]
16Understanding ontologies
acquiring and using semantics
On the left is a hierarchical classification of companies. This distinguishes between private and public companies, and between EU and non-EU companies. Note that, unlike in a taxonomy, a class may have more than one superclass. For example, "companies on the New York Stock Exchange" is a subclass both of "non-EU companies" and of "public companies". The classes are made up of instances, in this case individual companies, which are not shown here. Instances of a class are, of course, also instances of its superclasses, so any instance of "companies on the London Stock Exchange" (e.g. BT) is also an instance of "public companies", "EU companies" and "companies".

On the right is shown another part of the ontology, this time concerned with classifying industries. All classes in an ontology are related by a chain of superclasses to a class "Thing", which contains all instances in the ontology.

As well as classes and instances, an ontology contains properties. Properties, shown by arrows in the diagram, are defined on a given class and are of two kinds. One kind relates the instances of the class to some literal value; an example is the property "employee size", which could be used to describe how many employees a company has. The other kind relates instances of one defined class to instances of another, or the same, defined class. The property "operates in" relates companies to industries, whilst "sells to" relates companies to one another. The properties shown here apply to all the subclasses of "company" (since instances of the subclasses are also instances of "company"); we could also have defined additional properties specific to any of the subclasses.

Ontologies have formally defined semantics. This means that computers can reason about the constructs in an ontology. Computer scientists, mathematicians and logicians have developed a great deal of formal theory to understand how to do this most effectively and efficiently. Recently this has resulted in the standardisation by the W3C of the ontology language OWL (http://www.w3.org/2004/OWL/). OWL exists in a variety of species, which correspond to varying degrees of implementational and computational difficulty.
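The inheritance rule described in these notes (an instance of a class is also an instance of every superclass, and a class may have several superclasses) can be illustrated with a minimal Python sketch. The class identifiers below are shorthand invented for this example; the subclass links for the London Stock Exchange class are assumed from the BT example in the notes, and a real system would use an OWL reasoner rather than this hand-written traversal.

```python
# Toy ontology: each class maps to its direct superclasses.
# Identifiers are illustrative shorthand for the classes named in the notes;
# "Thing" is the root class that contains all instances.
SUPERCLASSES = {
    "Company": {"Thing"},
    "PublicCompany": {"Company"},
    "PrivateCompany": {"Company"},
    "EUCompany": {"Company"},
    "NonEUCompany": {"Company"},
    # A class may have more than one superclass.
    "LSECompany": {"PublicCompany", "EUCompany"},
    "NYSECompany": {"PublicCompany", "NonEUCompany"},
}

# Direct class membership of instances (BT is the example from the notes).
INSTANCE_OF = {"BT": {"LSECompany"}}


def ancestors(cls):
    """Return every class reachable from cls by following superclass links."""
    seen = set()
    stack = [cls]
    while stack:
        for parent in SUPERCLASSES.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def classes_of(instance):
    """Return the direct classes of an instance plus all their superclasses."""
    result = set()
    for cls in INSTANCE_OF.get(instance, ()):
        result.add(cls)
        result |= ancestors(cls)
    return result


if __name__ == "__main__":
    print(sorted(classes_of("BT")))
```

Storing only direct superclass links and computing the closure on demand keeps the toy data small; it is a simplification of what a reasoner does, not a substitute for one.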
17Metadata
acquiring and using semantics
- Describing
- documents, sub-documents, pages
- author, creation date, topic(s), related to,
- entities within documents
- classes: people, companies, roles
- relations: CEO of
- building a knowledgebase
18Accessing a knowledgebase
acquiring and using semantics
19The knowledgebase
acquiring and using semantics
20Ontology-based information extraction
acquiring and using semantics
- Ryanair announced yesterday that it will make
Shannon its next European base, expanding its
route network to 14 in an investment worth around
180m. The airline says it will deliver 1.3
million passengers in the first year of the
agreement, rising to two million by the fifth
year.
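As a rough illustration of what ontology-based information extraction does with a passage like this, the Python sketch below tags known entity names with ontology classes. It is a toy gazetteer lookup, not the SEKT extraction pipeline: the gazetteer entries and the class name "Location" are assumptions made for the example.

```python
import re

# Hypothetical gazetteer: surface strings mapped to ontology classes.
# The entries and class names are assumptions for illustration only.
GAZETTEER = {
    "Ryanair": "Airline",
    "Shannon": "Location",
}

TEXT = ("Ryanair announced yesterday that it will make Shannon "
        "its next European base.")


def annotate(text, gazetteer):
    """Return (start, end, surface, ontology_class) for each gazetteer hit."""
    annotations = []
    for surface, cls in gazetteer.items():
        pattern = r"\b" + re.escape(surface) + r"\b"
        for match in re.finditer(pattern, text):
            annotations.append((match.start(), match.end(), surface, cls))
    # Sort by position so annotations follow document order.
    return sorted(annotations)


if __name__ == "__main__":
    for start, end, surface, cls in annotate(TEXT, GAZETTEER):
        print(f"{surface} [{start}:{end}] -> {cls}")
```

Each annotation keeps its character offsets, so it can be stored as metadata alongside the document and linked to the corresponding entry in a knowledgebase.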
21Information integration
- Motivated by
- Incompatible legacy systems
- Mergers and acquisitions
- Rapidly forming virtual organisations and supply chains
- Sensor networks
- Goal
- Merging information from heterogeneous unstructured (text) sources
- with structured information
22Mapping ontologies
information integration
- Semi-automatic techniques
- based on similarities of name, structure
- or even sound (for 4)
- e.g. PROMPT suite plug-ins for Protégé
- Semantic mapping: set based
- equality (=), mismatch (⊥)
- more general (⊇), more specific (⊆)
- overlap (∩)
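The set-based relations listed above can be sketched in Python by treating each concept as the set of instances it covers. This is an illustrative simplification: real mapping tools work from names, structure and logical reasoning rather than from complete instance sets, and the function name and sample data are invented for this example.

```python
def semantic_relation(a, b):
    """Classify the set-based relation of concept a to concept b.

    Each concept is represented by the set of instances it covers
    (an assumption made for this sketch).
    """
    a, b = set(a), set(b)
    if a == b:
        return "equality"
    if a > b:   # a is a proper superset of b
        return "more general"
    if a < b:   # a is a proper subset of b
        return "more specific"
    if a & b:   # some, but not all, instances are shared
        return "overlap"
    return "mismatch"


if __name__ == "__main__":
    airlines = {"Ryanair", "Lufthansa"}
    companies = {"Ryanair", "Lufthansa", "BT"}
    telecoms = {"BT", "Telefonica"}
    charities = {"Oxfam"}

    print(semantic_relation(airlines, companies))   # more specific
    print(semantic_relation(companies, airlines))   # more general
    print(semantic_relation(companies, telecoms))   # overlap
    print(semantic_relation(airlines, charities))   # mismatch
```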
23Applications in SEKT
24Intelligent content management
SEKT applications
BT digital library
- Currently
- Two major document databases
- million articles abstracts plus some full text
- Text-based and some attribute-based querying, e.g. author, date
- information spaces defined by queries
25Improving and extending
SEKT applications intelligent content management
- Better precision and recall
- in searching, alerting, sharing
- Automatic document annotation
- extending the knowledgebase
- clicking through to the knowledgebase
- An extended document corpus
- focussed crawling from Web and intranet
- Automatic classification
- extending and improving manual approach
- Browsing related documents
- Driven by interests and context
- learned from users' behaviour
26SEKT architecture
SEKT applications
creating and amending concepts, instances
annotating and correcting annotations
27Knowledge management
SEKT applications
building on Siemens knowledgemotion
sharing and reusing knowledge across a global
team
28Improving knowledge sharing
SEKT applications knowledge management
- Sharing
- Elements
- presentations, lessons learned
- Solutions
- application module, graphical interface
- Project approaches
- methodologies, models
- Pre-packaged projects
- with direct sales impact
29Intelligent decision support
SEKT applications
a database of frequently asked questions using
semantic distance to identify questions and
answers
with justification drawn from comprehensive legal
databases
combining formal and informal knowledge
30Semantic distance
SEKT applications intelligent decision support
- Semantic distance is based on weighted path length between concepts
- Path length is based on navigation from one concept to another through any relation available
- Is-a
- Part-of
- Follows
- Actor
Source: iSOCO
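A weighted path length of this kind can be computed with a standard shortest-path search. The Python sketch below uses Dijkstra's algorithm over a small concept graph; the relation weights, the example concepts and the choice to traverse relations in both directions are all assumptions made for illustration, since the slide does not give iSOCO's actual values or method.

```python
import heapq

# Hypothetical weight per relation type; the slide gives no actual values.
RELATION_WEIGHTS = {"is-a": 1.0, "part-of": 2.0, "follows": 3.0, "actor": 3.0}

# Hypothetical concept graph as (concept, relation, concept) triples.
TRIPLES = [
    ("appeal", "is-a", "proceeding"),
    ("hearing", "part-of", "proceeding"),
    ("sentence", "follows", "hearing"),
    ("judge", "actor", "hearing"),
]


def build_graph(triples, weights):
    """Build an adjacency list; relations are navigable in both directions."""
    graph = {}
    for source, relation, target in triples:
        cost = weights[relation]
        graph.setdefault(source, []).append((target, cost))
        graph.setdefault(target, []).append((source, cost))
    return graph


def semantic_distance(graph, start, goal):
    """Return the lowest total relation weight on any path (Dijkstra)."""
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == goal:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, cost in graph.get(node, ()):
            candidate = dist + cost
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return float("inf")  # no path between the two concepts


if __name__ == "__main__":
    graph = build_graph(TRIPLES, RELATION_WEIGHTS)
    print(semantic_distance(graph, "appeal", "hearing"))
```

Giving is-a links a lower weight than the other relations makes concepts in the same part of the hierarchy count as closer; that weighting is a design choice of this sketch, not something stated on the slide.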
31Better decisions
SEKT applications intelligent decision support
- Using
- Ontology of Professional Legal Knowledge
- developed with DILIGENT methodology
- Rulings
- a variety of legal databases
- Mapping between models of PLK and rulings
32OPLK classes identified
SEKT applications intelligent decision support
33SEKT applications intelligent decision support
Intuitive ontological subdomains
PROCEEDINGS
34Using factorial analysis
SEKT applications intelligent decision support
35Ontological subdomains
SEKT applications intelligent decision support
36Architecture of Iuriservice
SEKT applications intelligent decision support
37Ontology engineering in SEKT
- PROTON: PROTo ONtology
- 250 classes, 100 properties
- domain independent
- compliance with popular standards
- good coverage of concrete entities
- people, organisations, numbers
- OWL Lite
38Person class
Ontology engineering in SEKT
- Subclass of Agent
- Superclass of Man and Woman
- hasPosition
- Person -> JobPosition
- hasProfession
- Person -> Profession
- hasRelative, isBossOf
- Person -> Person
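The domain and range of each Person property can be written down as data and used to sanity-check statements. The Python sketch below is a simplification: it treats domain and range as validation constraints, whereas in OWL they are axioms from which class membership is inferred. The helper names are invented for this example and only the classes and properties named on this slide are included.

```python
# Property -> (domain, range), as listed on the slide.
PROPERTIES = {
    "hasPosition": ("Person", "JobPosition"),
    "hasProfession": ("Person", "Profession"),
    "hasRelative": ("Person", "Person"),
    "isBossOf": ("Person", "Person"),
}

# Direct superclass links from the slide:
# Man and Woman are subclasses of Person, Person is a subclass of Agent.
SUPERCLASS = {"Man": "Person", "Woman": "Person", "Person": "Agent"}


def is_a(cls, target):
    """Return True if cls is target or a (transitive) subclass of it."""
    while cls is not None:
        if cls == target:
            return True
        cls = SUPERCLASS.get(cls)
    return False


def is_valid(subject_class, prop, object_class):
    """Check a statement's subject and object classes against domain and range."""
    domain, range_ = PROPERTIES[prop]
    return is_a(subject_class, domain) and is_a(object_class, range_)


if __name__ == "__main__":
    print(is_valid("Woman", "isBossOf", "Man"))             # True
    print(is_valid("Agent", "hasPosition", "JobPosition"))  # False
```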
39Property and class hierarchies
Ontology engineering in SEKT
hasRelative
Agent
Group
Organization
Charity
Commercial Organization
Company
Airline
Bank
Insurance Company
Media Company
40Profiles in the Digital Library
Ontology engineering in SEKT
41Topics
Ontology engineering in SEKT
- UserProfile isCurrentlyInterestedIn Topic
- InspecRecord hasSubject Topic
- Topics are instances
- of the class Topic
- Compare taxonomic approach
- Avoids classes as property values
- OWL Full
42Classes as property values
Ontology engineering in SEKT
Source: Representing Classes as Property Values, Natasha Noy, W3C
43Diligent
Ontology engineering in SEKT
- DIstributed, Loosely-controlled and evolvInG Engineering of oNTologies
- Motivated by the need to develop shared ontologies for sharing knowledge
- Ex-post analysis in biology domain
- Based on Rhetorical Structure Theory
- seeks to explain the coherence of texts
- identifies relations
- elaboration, evaluation, justification, contrast, alternative, example, counter example, background knowledge, motivation, summary, solutionhood, restatement, purpose, condition, preparation, circumstance, result, enablement, list
- DILIGENT uses a subset of these
44Distributed and loosely controlled
Ontology engineering in SEKT
- The steps
- build: domain experts, users
- local adaptation: users
- analysis and revision: board
- local update: users
45Diligent Wiki
Ontology engineering in SEKT
46More applications
applications
- Portals
- building on content management
- Knowledge discovery
- Business intelligence
- Inter-enterprise cooperation
- overcoming heterogeneity
- Semantic desktop
- Communication
- Collaboration
- Semantic Grid
47Knowledge discovery
applications
- Extracting information from heterogeneous sources
- knowing your customer
- national security
- e.g. Semagix: http://www.semagix.com
- Sentiment analysis
- IBM's WebFountain™
- http://www.almaden.ibm.com/webfountain
- Intelliseek
- http://www.intelliseek.com
48Business intelligence
applications
- Text-driven business intelligence
- e.g. ClearForest
- http://www.clearforest.com
- Identifying trends and patterns
- Merging with structured data from databases
49The semantic desktop
applications
- Personal information management
- Desktop data as web resources
- Interoperable applications through common (RDF-based) data standards
- Items are first class objects
- Gnowsis - http://www.gnowsis.org
- Haystack - http://haystack.lcs.mit.edu
- Fenfire - http://fenfire.org
50Extensible and interoperable
applications
app3, e.g. diary
context
mapping
Ontology and knowledgebase OWL
reasoning, ontology management and evolution
text mining
app1, e.g. diary
app2, e.g. idea management
51Keeping the context
applications
- When a file is emailed, context is lost
- creation, classification
- and more is lost when the received file is stored
- sender, email thread
- Use this context to create metadata to enhance, e.g., search
52Communication
applications
- Using information extraction to detect linkages
- between personal databases
- onto intranet or Web
53Collaboration
applications
- Plus using semantics
- to find the right partners, e.g. in project set-up
- to create the right context for a conference
- agenda, minutes, documents
54Semantic Grid
applications
Source: http://www.semanticgrid.org
- Definitions
- flexible, secure, coordinated resource sharing (David de Roure)
- see also Wikipedia: http://en.wikipedia.org/wiki/Semantic_grid
55Grid services and resources
applications the semantic grid
- Semantic description for, e.g.
- resource discovery
- matchmaking
- negotiation
- composition
- monitoring
- Must be stateful; compare current web services
56Semantic grid - challenges
applications the semantic grid
- Automated virtual organisations
- their formation and management
- Service negotiation and contracts
- Security, trust and provenance
- Self organisation
- David de Roure
- University of Southampton
57State-of-the-art
applications
- Text mining well developed
- Semagix, Intelliseek, ClearForest
- point solutions
- Standardisation currently mostly at XML level
- Little use yet of
- context
- OWL
- reasoning
58Challenges
- What do users really want?
- how not to overwhelm them?
- alerts, hyperlinks
- Differentiate between users?
- novice, sophisticate
- varying at different times
- What kind of user interfaces?
- to make use of all the metadata
59Bibliography - 1
- The semantic desktop
- Sauermann, L., The Gnowsis Semantic Desktop for Information Integration, at the IOA Workshop of the ISWC2005 Conference
- Decker, S., Frank, M., The Networked Semantic Desktop, in WWW2004 Workshop: Application Design, Development and Implementation Issues in the Semantic Web
- Chirita, P.A. et al., Activity Based Metadata for Semantic Desktop Search, in The Semantic Web: Research and Applications, Springer, May / June 2005, pp. 439-454
- The semantic grid
- De Roure, D., Jennings, N., Shadbolt, N., The Semantic Grid: Past, Present and Future, in Proceedings of the IEEE, Vol. 93, No. 3, March 2005, pp. 669-681
60Bibliography - 2
- Semantic annotation
- Kiryakov, A., et al., Semantic Annotation, Indexing and Retrieval, Journal of Web Semantics, Vol. 2, December 2004, pp. 49-79
- http://www.ontotext.com/publications/SemAIR_SWJ.pdf
- Information integration
- Bouquet, P., Serafini, L., Zanobini, S., Semantic Coordination: A new approach and an application, in Proceedings of ISWC 2003
- Giunchiglia, F., and Shvaiko, P., Semantic Matching, in The Knowledge Engineering Review, 18(3):265-280, 2004
61Bibliography - 3
- Ontology engineering
- Noy, N., Representing Classes as Property Values on the Semantic Web, W3C Working Group, April 2005
- http://www.w3.org/TR/2005/NOTE-swbp-classes-as-values-20050405/
- Tempich, C., Pinto, S., Sure, Y., Staab, S., An Argumentation Ontology for DIstributed, Loosely-controlled and evolvInG Engineering of oNTologies (DILIGENT), ESWC2005, pp. 241-256
- Legal case study
- Benjamins, R., The Semantic Web Legal Application, iSOCO, May 2005
- http://bibo.incubadora.fapesp.br/portal/CursoSemanticWeb/Iuriservice.ppt
62Bibliography - 4
- General
- Introducing Semantic Technologies and the Vision of the Semantic Web, Semantic Interoperability Community of Practice (US)
- http://colab.cim3.net/file/work/SICoP/WhitePaper/SICoP.WhitePaper.Module1.v5.4.kf.021605.doc
- Evaluation and Market Report (WonderWeb project), Top Quadrant
- http://wonderweb.semanticweb.org/deliverables/documents/D25.pdf