Title: Integrating Language Understanding agents into the Semantic Web
1. Integrating Language Understanding Agents into the Semantic Web
- Akshay Java, Tim Finin, Sergei Nirenburg
11/04/2005
2. Outline
- Motivation: Language Understanding Agents
- Ontological Semantics
- Bridging the Knowledge Gap
- Preliminary Evaluation
- SemNews: An Application Testbed
- Conclusion
- Q&A
3. Motivation
- Intelligent agents need knowledge and information.
- The majority of content on the web remains in NL text.
- The SW can benefit NLP tools in their language understanding task.
[Figure: NLP tools extract facts from natural language on the WWW (web of documents: text, images, audio, video) and express them in RDF/OWL on the Semantic Web (web of data: ontologies, instances, triples; structured information).]
4. Motivation
5. Ontological Semantics
OntoSem is a natural language processing system that processes text and converts it into facts. It is supported by a constructed world model encoded in a rich ontology.
6. Ontological Semantics
Static Knowledge Resources
- Grammar: Ecology, Morphology, Syntax
- Ontology and Fact Repository
- Lexicon and Onomasticon
7. Mapping OntoSem to web based KR
- The OntoSem ontology is a frame-based representation
  - ONTOLOGY ::= CONCEPT+
  - CONCEPT ::= ROOT | OBJECT-OR-EVENT | PROPERTY
  - SLOT ::= PROPERTY + FACET + FILLER
- Translating the OntoSem ontology means mapping its semantics into a corresponding OWL representation.
- OntoSem's supporting fact repositories are also mapped to OWL.
- The text meaning representation of the sentences is now converted to OWL.
8. Mapping OntoSem to web based KR
[Figure: OntoSem, backed by its Lexicon, Ontology and Fact Repository, turns NL text into TMRs; OntoSem2OWL produces the OWL Ontology and the TMRs in OWL.]
9. Mapping Rules for Classes
- OntoSem LISP version
    (make-frame patent
      (definition
        (value (common "the exclusive right to make, use or sell an invention, which is granted to the inventor")))
      (is-a
        (value (common intangible-asset legal-right))))
- OWL version
    <owl:Class rdf:about="&ontosem;patent">
      <rdfs:subClassOf>
        <owl:Class rdf:about="&ontosem;intangible-asset">
        </owl:Class>
      </rdfs:subClassOf>
      <rdfs:subClassOf>
        <owl:Class rdf:about="&ontosem;legal-right">
        </owl:Class>
      </rdfs:subClassOf>
    </owl:Class>
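The class rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the frame is taken to be already parsed into a name and a list of is-a parents, the function name `frame_to_owl_class` is hypothetical, and the base URI is assumed from the resource URIs shown on the reasoning slide. The actual translator described in this talk was built with the Jena API.

```python
import xml.etree.ElementTree as ET

# Standard RDF, RDFS and OWL namespace URIs.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
# Assumed base URI for OntoSem concepts (illustrative only).
ONTOSEM = "http://ontosem.org/"

for prefix, uri in (("rdf", RDF), ("rdfs", RDFS), ("owl", OWL)):
    ET.register_namespace(prefix, uri)


def frame_to_owl_class(name, parents, base=ONTOSEM):
    """Build an owl:Class element with one rdfs:subClassOf per is-a parent."""
    cls = ET.Element(f"{{{OWL}}}Class", {f"{{{RDF}}}about": base + name})
    for parent in parents:
        sub = ET.SubElement(cls, f"{{{RDFS}}}subClassOf")
        ET.SubElement(sub, f"{{{OWL}}}Class", {f"{{{RDF}}}about": base + parent})
    return cls


# The "patent" frame from the slide: two is-a parents.
patent = frame_to_owl_class("patent", ["intangible-asset", "legal-right"])
print(ET.tostring(patent, encoding="unicode"))
```

The definition slot is left out of this sketch; it could plausibly be carried as an rdfs:comment, but the slides do not say how it is handled.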
10. Mapping Rules for Properties
- Properties can be
  - Object property: owl:ObjectProperty
  - Datatype property: owl:DatatypeProperty
- Property hierarchy is defined by rdfs:subPropertyOf
- Domain maps to rdfs:domain
- Range maps to rdfs:range
- Restrictions are handled using owl:Restriction
- Numeric datatypes are handled using XSD
11. Mapping Rules for Properties
    (make-frame controls
      (domain
        (sem (common physical-event physical-object social-event social-role)))
      (range
        (sem (common actualize artifact natural-object social-role)))
      (is-a (value (common relation)))
      (inverse (value (common controlled-by)))
      (definition
        (value (common "A relation which relates concepts to what they can control"))))
12. Mapping Rules for Properties
    <owl:ObjectProperty rdf:ID="controls">
      <rdfs:domain>
        <owl:Class>
          <owl:unionOf rdf:parseType="Collection">
            <owl:Class rdf:about="physical-event"/>
            <owl:Class rdf:about="physical-object"/>
            <owl:Class rdf:about="social-event"/>
            <owl:Class rdf:about="social-role"/>
          </owl:unionOf>
        </owl:Class>
      </rdfs:domain>
      <rdfs:range>
        <owl:Class>
          <owl:unionOf rdf:parseType="Collection">
            <owl:Class rdf:about="actualize"/>
            <owl:Class rdf:about="artifact"/>
            <owl:Class rdf:about="natural-object"/>
            <owl:Class rdf:about="social-role"/>
          </owl:unionOf>
        </owl:Class>
      </rdfs:range>
    </owl:ObjectProperty>
Callouts on the slide tie the frame parts (make-frame, (domain, (range, (is-a and (inverse to the corresponding OWL elements.
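The property rules from the previous slides (domain and range as unions of classes, is-a as rdfs:subPropertyOf, inverse as owl:inverseOf) can be sketched as follows. This is a minimal illustration, not the OntoSem2OWL code: the dictionary layout of the parsed frame, the helper names `union_of` and `frame_to_object_property`, and the "#" prefix on local references are all assumptions.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"

for prefix, uri in (("rdf", RDF), ("rdfs", RDFS), ("owl", OWL)):
    ET.register_namespace(prefix, uri)


def q(ns, local):
    """Return the ElementTree qualified name for ns + local."""
    return f"{{{ns}}}{local}"


def union_of(parent, class_names):
    """Append an anonymous owl:Class whose owl:unionOf lists class_names."""
    anon = ET.SubElement(parent, q(OWL, "Class"))
    union = ET.SubElement(anon, q(OWL, "unionOf"), {q(RDF, "parseType"): "Collection"})
    for name in class_names:
        # Assumption: local names are referenced as "#name".
        ET.SubElement(union, q(OWL, "Class"), {q(RDF, "about"): "#" + name})


def frame_to_object_property(frame):
    """Map a parsed property frame (hypothetical dict layout) to owl:ObjectProperty."""
    prop = ET.Element(q(OWL, "ObjectProperty"), {q(RDF, "ID"): frame["name"]})
    union_of(ET.SubElement(prop, q(RDFS, "domain")), frame["domain"])
    union_of(ET.SubElement(prop, q(RDFS, "range")), frame["range"])
    for parent in frame.get("is-a", []):
        ET.SubElement(prop, q(RDFS, "subPropertyOf"), {q(RDF, "resource"): "#" + parent})
    for inverse in frame.get("inverse", []):
        ET.SubElement(prop, q(OWL, "inverseOf"), {q(RDF, "resource"): "#" + inverse})
    return prop


# The "controls" frame from the slides.
controls = frame_to_object_property({
    "name": "controls",
    "domain": ["physical-event", "physical-object", "social-event", "social-role"],
    "range": ["actualize", "artifact", "natural-object", "social-role"],
    "is-a": ["relation"],
    "inverse": ["controlled-by"],
})
print(ET.tostring(controls, encoding="unicode"))
```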
13. Mapping Rules for Facets
- Facets are a way of restricting the fillers that can be used for a particular slot
- SEM and VALUE
  - Mapped using owl:Restriction on a particular property.
- RELAXABLE-TO
  - Added to the classes present in the owl:Restriction, with this information recorded in an annotation.
- DEFAULT
  - No clear way to represent non-monotonic reasoning and closed-world assumptions in the Semantic Web.
- DEFAULT-MEASURE
  - Similar to the DEFAULT facet; not handled.
  - DEFAULT and DEFAULT-MEASURE are used relatively infrequently.
- NOT
  - The NOT facet can be handled using owl:disjointWith.
- INV
  - Need not be handled, since the inverse slot is already mapped to owl:inverseOf.
14. Mapping Rules
Property-related constructs

#   Case                    Frequency   Mapped using
1   domain                  617         rdfs:domain
2   domain with not facet   16          owl:disjointWith
3   range                   406         rdfs:range
4   range with not facet    5           owl:disjointWith
5   inverse                 260         owl:inverseOf
15. Mapping Rules
Facet-related constructs

#   Case              Frequency   Mapped using
1   value             18217       owl:Restriction
2   sem               5686        owl:Restriction
3   relaxable-to      95          annotation
4   default           350         Not handled
5   default-measure   612         Not handled
6   not               134         owl:disjointWith
7   inv               1941        Not required
16. Translating TMR2OWL
- Translating TMRs involves instantiation of concepts mapped in OWL.
- Example
    (COME-1740
      (TIME
        (VALUE (COMMON (FIND-ANCHOR-TIME))))
      (DESTINATION
        (VALUE (COMMON CITY-1740)))
      (AGENT (VALUE (COMMON POLITICIAN-1740)))
      (ROOT-WORDS (VALUE (COMMON (ARRIVE))))
      (WORD-NUM (VALUE (COMMON 2)))
      (INSTANCE-OF (VALUE (COMMON COME))))

    <ontosem:come rdf:about="COME-1740">
      <ontosem:destination rdf:resource="CITY-1740"/>
      <ontosem:agent rdf:resource="POLITICIAN-1740"/>
    </ontosem:come>
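The instantiation step can be sketched as below. In the example above only the slots whose fillers are other TMR instances (DESTINATION, AGENT) appear in the OWL output, so this sketch assumes that rule: it keeps fillers that look like instance identifiers (NAME-number) and skips the rest. The function name `tmr_to_owl`, the dictionary layout and the namespace URI are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Assumed namespace for OntoSem concepts and properties (illustrative only).
ONTOSEM = "http://ontosem.org/"

ET.register_namespace("rdf", RDF)
ET.register_namespace("ontosem", ONTOSEM)

# Assumption: instance identifiers look like CONCEPT-NAME-<digits>.
INSTANCE_ID = re.compile(r"^[A-Z][A-Z-]*-\d+$")


def tmr_to_owl(instance_id, slots):
    """Create an individual typed by INSTANCE-OF, linked to other instances."""
    concept = slots["INSTANCE-OF"].lower()
    node = ET.Element(f"{{{ONTOSEM}}}{concept}", {f"{{{RDF}}}about": instance_id})
    for slot, filler in slots.items():
        # Keep only slots that point at another TMR instance.
        if isinstance(filler, str) and INSTANCE_ID.match(filler):
            ET.SubElement(
                node,
                f"{{{ONTOSEM}}}{slot.lower()}",
                {f"{{{RDF}}}resource": filler},
            )
    return node


# The COME-1740 frame from the slide, as a plain dict.
come = tmr_to_owl("COME-1740", {
    "TIME": ["FIND-ANCHOR-TIME"],
    "DESTINATION": "CITY-1740",
    "AGENT": "POLITICIAN-1740",
    "ROOT-WORDS": ["ARRIVE"],
    "WORD-NUM": 2,
    "INSTANCE-OF": "COME",
})
print(ET.tostring(come, encoding="unicode"))
```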
17. Evaluation
Built an ontology translation tool using the Jena API.
- Total triples generated: 102189 (including bnodes)
- Time to build the model: 10-40 sec
- Time to do RDFS inference: 10 sec
- Time to do OWL Micro: 40 sec
- Time to do OWL Full: ????
- DL expressivity: ELUIH
  - EL: conjunction and full existential quantification
  - U: union
  - H: role hierarchy
  - I: role inverse
Tools: Swoop, Pellet, WonderWeb
http://w3c.org/RDF/Validator/
After translation
- Total number of classes: 7747 (defined 7747, imported 0)
- Total number of datatype properties: 0 (defined 0, imported 0)
- Total number of object properties: 604 (defined 604, imported 0)
- Total number of annotation properties: 1 (defined 1, imported 0)
- Total number of individuals: 0 (defined 0, imported 0)
NOTE: This is using no restrictions.
OWL Full
18. Evaluation
- Syntactic correctness: checked using OWL/RDF validators.
- Semantic validation: full semantic validation, even for subsets of OWL, is difficult.
- Meaning preservation: some subset of the native representation features, such as DEFAULTS, modality and case roles, may be underrepresented or not handled.
- Feature minimization: complex features could be difficult for reasoners to handle, hence we can perform the translation at each of the levels OWL Lite, OWL DL and OWL Full.
- Translation complexity: OntoSem is an extensive and large ontology (8000 concepts). Translation itself is done syntactically, but in general translation might require reasoning, which could be an issue.
19. Reasoning Capabilities
Finding transitive closures (RDFS reasoning)
    Buildfile: build.xml
    init:
    compile:
    dist:
      [jar] Building jar: /home/aks1/software/eclipse/workspace/ontojena/dist/lib/ontojena.jar
    run:
      [java] MODEL OK
      [java] Resource: http://ontosem.org/fire-engine
      [java] - (http://ontosem.org/fire-engine rdfs:subClassOf http://ontosem.org/fire-engine)
      [java] - (http://ontosem.org/fire-engine rdfs:subClassOf http://ontosem.org/all)
      [java] - (http://ontosem.org/fire-engine rdfs:subClassOf http://ontosem.org/physical-object)
      [java] - (http://ontosem.org/fire-engine rdfs:subClassOf http://ontosem.org/inanimate)
      [java] - (http://ontosem.org/fire-engine rdfs:subClassOf http://ontosem.org/wheeled-vehicle)
      [java] - (http://ontosem.org/fire-engine rdfs:subClassOf http://ontosem.org/engine-propelled-vehicle)
      [java] - (http://ontosem.org/fire-engine rdfs:subClassOf http://ontosem.org/wheeled-engine-vehicle)
[Figure: class hierarchy with inferred triples, showing vehicle, land-vehicle, engine-propelled-vehicle, wheeled-vehicle, wheeled-engine-vehicle, truck and fire-engine.]
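What the RDFS reasoner computes here is the transitive (and reflexive) closure of rdfs:subClassOf. A minimal sketch in plain Python, without Jena, is given below. The direct edges in `DIRECT` are a hypothetical arrangement of the class names from the figure, since the exact shape of the hierarchy is not recoverable from the slide, and `superclasses` is an illustrative helper name.

```python
from collections import deque


def superclasses(direct, start):
    """Return every class reachable from start via subClassOf edges.

    The result includes start itself, because rdfs:subClassOf is reflexive
    under RDFS entailment (the slide's output also lists
    fire-engine rdfs:subClassOf fire-engine).
    """
    seen = {start}
    queue = deque([start])
    while queue:
        cls = queue.popleft()
        for parent in direct.get(cls, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


# Hypothetical direct subClassOf edges (child -> parents); assumed layout.
DIRECT = {
    "fire-engine": ["truck"],
    "truck": ["wheeled-engine-vehicle"],
    "wheeled-engine-vehicle": ["wheeled-vehicle", "engine-propelled-vehicle"],
    "wheeled-vehicle": ["land-vehicle"],
    "engine-propelled-vehicle": ["vehicle"],
    "land-vehicle": ["vehicle"],
}

for parent in sorted(superclasses(DIRECT, "fire-engine")):
    print(f"(fire-engine rdfs:subClassOf {parent})")
```

A breadth-first walk is enough for a single query class; a reasoner materialises the same closure for every class at once.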
20. An Application Testbed: SemNews
- SemNews: semantically search and browse news
- Aggregators collect the RSS news descriptions from various sources.
- The sentences are processed by OntoSem and converted into Text Meaning Representations (TMRs).
- Provides intelligent agents with the latest news in a machine-readable format.
http://semnews.umbc.edu
21. http://semnews.umbc.edu
22. Agent-understandable news
Provides RDF version of the news.
23. Semanticizing RSS
View structured representation of the RSS news
story.
Future versions would enable editing the facts
and provide provenance information
24. News stories are ontologically linked
Find news stories by browsing through the OntoSem
ontology.
25. Tracking Named Entities
Find stories about a specific named entity.
26. Browsing Facts
The fact repository explorer for the named entity Mexico shows that it has a nationality-of relation with CITIZEN-235.
The fact repository explorer for the instance CITIZEN-235 shows that the citizen is an agent of an ESCAPE-EVENT.
27. Querying the semanticized RSS
RDQL Queries
Provides structured querying over text converted
into RDF representation.
28. Semantic Alerts
Alerts can be specified as ontological concepts, keywords or RDQL queries. Subscribe to the results of structured queries.
29. Conclusions
- Integrating language processing agents into the SW would let them publish SW annotations and documents that capture the text's meaning.
- Migrating from a native, non-web-based representation to a SW representation may be lossy, but is still useful for many applications.
- The SemNews application testbed demonstrates some scenarios that can benefit from language understanding agents.
30. Q&A
- Thank you.
- http://ebiquity.umbc.edu
- http://semnews.umbc.edu
31. References
- Software Used
  - [1] OntoSem: http://ilit.umbc.edu/
  - [2] RDF Validation Service: http://w3c.org/RDF/Validator
  - [3] Jena Toolkit: http://jena.sourceforge.net/
  - [4] Swoop Ontology Viewer: http://www.mindswap.org/2004/SWOOP/
  - [5] Pellet OWL DL Reasoner: http://www.mindswap.org/2003/pellet/
  - [6] WonderWeb OWL Validator: http://phoebus.cs.man.ac.uk:9999/OWL/Validator
- Papers
  - [1] Sergei Nirenburg and Victor Raskin, Ontological Semantics, Formal Ontology and Ambiguity
  - [2] Sergei Nirenburg and Victor Raskin, Ontological Semantics, MIT Press, Forthcoming
  - [3] Sergei Nirenburg, Ontological Semantics Overview, Presentation, CLSP JHU, Spring 2003
  - [4] Marjorie McShane, Sergei Nirenburg, Stephen Beale, Margalit Zabludowski, The Cross Lingual Reuse and Extension of Knowledge Resources in Ontological Semantics
  - [5] P.J. Beltran-Ferruz, P.A. Gonzalez-Calero, P. Gervas, Converting Mikrokosmos Frames into Description Logics
  - [6] Sergei Nirenburg, Ontology Tutorial, ILIT UMBC
- Mailing Lists
  - [1] Jena developers: jena-dev@yahoogroups.com
  - [2] Pellet users: pellet-users@lists.mindswap.org
32. Backup slides
33. Static Knowledge Sources
- Ontology: 8000 concepts
  - Avg. 16 properties each
- English lexicon: 45000 entries
- Spanish lexicon: 40000 entries
- Chinese lexicon: 3000 entries
- Fact repository: 20000 facts
Sergei Nirenburg, Ontological Semantics Overview, Presentation, CLSP JHU, Spring 2003
34. Text Meaning Representation (TMR)
35. Text Meaning Representation (TMR)
He asked the UN to authorize the war.

    REQUEST-ACTION-69
      AGENT             HUMAN-72
      THEME             ACCEPT-70
      BENEFICIARY       ORGANIZATION-71
      SOURCE-ROOT-WORD  ask
      TIME              (< (FIND-ANCHOR-TIME))
    ACCEPT-70
      THEME             WAR-73
      THEME-OF          REQUEST-ACTION-69
      SOURCE-ROOT-WORD  authorize
    ORGANIZATION-71
      HAS-NAME          United-Nations
      BENEFICIARY-OF    REQUEST-ACTION-69
      SOURCE-ROOT-WORD  UN
    HUMAN-72
      HAS-NAME          Colin Powell
      AGENT-OF          REQUEST-ACTION-69
      SOURCE-ROOT-WORD  he   (reference resolution has been carried out)
    WAR-73
      THEME-OF          ACCEPT-70
      SOURCE-ROOT-WORD  war

Example from Marjorie McShane, Sergei Nirenburg, Stephen Beale, Margalit Zabludowski, The Cross Lingual Reuse and Extension of Knowledge Resources in Ontological Semantics
36. The OntoSem Ontology
[Figure: view of the OntoSem ontology with the PROPERTY, FACET and FILLER parts of a slot labelled.]