Title: Text Understanding Agents and the Semantic Web
1. Text Understanding Agents and the Semantic Web
- Akshay Java, Tim Finin, Sergei Nirenburg
01/04/2005
2. Outline
- Motivation: Language Understanding Agents
- Ontological Semantics
- Bridging the Knowledge Gap
- Preliminary Evaluation
- SemNews: An Application Testbed
- Conclusion
- Q&A
3. Motivation
- Intelligent agents need knowledge and information.
- Most Web content is natural language (NL) text.
- The Semantic Web (SW) can benefit NLP tools in their language understanding tasks.
[Figure: the WWW (web of documents: text, images, audio, video; natural language) linked through NLP tools and facts from NL to the Semantic Web (web of data: ontologies, instances, triples; RDF/OWL; structured information).]
4. Motivation
5. Ontological Semantics
OntoSem is a natural language processing system that processes text and converts it into facts. It is supported by a constructed world model encoded in a rich ontology.
6. Ontological Semantics
7. Static Knowledge Sources
- Ontology: 8000 concepts, avg. 16 properties each
- Lexicons
- English: 45000 entries
- Spanish: 40000 entries
- Chinese: 3000 entries
- Fact repository: 20000 facts
- Onomasticon: NNNNN names
8. The OntoSem Ontology
[Figure labels: PROPERTY, FACET, FILLER]
ONTOLOGY ::= CONCEPT+
CONCEPT ::= ROOT | OBJECT-OR-EVENT | PROPERTY
SLOT ::= PROPERTY + FACET + FILLER
9. Text Meaning Representation (TMR)
- Word sense disambiguated
- A persistent fact stored in the fact repository (FR)
- Semantic dependency established
10. Text Meaning Representation (TMR)
He asked the UN to authorize the war.

REQUEST-ACTION-69
    AGENT             HUMAN-72
    THEME             ACCEPT-70
    BENEFICIARY       ORGANIZATION-71
    SOURCE-ROOT-WORD  ask
    TIME              (< (FIND-ANCHOR-TIME))
ACCEPT-70
    THEME             WAR-73
    THEME-OF          REQUEST-ACTION-69
    SOURCE-ROOT-WORD  authorize
ORGANIZATION-71
    HAS-NAME          United-Nations
    BENEFICIARY-OF    REQUEST-ACTION-69
    SOURCE-ROOT-WORD  UN
HUMAN-72
    HAS-NAME          Colin Powell
    AGENT-OF          REQUEST-ACTION-69
    SOURCE-ROOT-WORD  he    ; reference resolution has been carried out
WAR-73
    THEME-OF          ACCEPT-70
    SOURCE-ROOT-WORD  war
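A TMR like this one is a set of frame instances whose slots point at other instances or at literal values. As a rough illustration only (this is not OntoSem's internal data structure), the sketch below holds a few of those frames in a Python dict and flattens them into subject-slot-filler triples. The names `tmr` and `to_triples` are hypothetical.

```python
# Hypothetical in-memory form of part of the TMR for
# "He asked the UN to authorize the war."
tmr = {
    "REQUEST-ACTION-69": {
        "AGENT": "HUMAN-72",
        "THEME": "ACCEPT-70",
        "BENEFICIARY": "ORGANIZATION-71",
        "SOURCE-ROOT-WORD": "ask",
    },
    "ACCEPT-70": {
        "THEME": "WAR-73",
        "SOURCE-ROOT-WORD": "authorize",
    },
    "HUMAN-72": {
        "HAS-NAME": "Colin Powell",
        "SOURCE-ROOT-WORD": "he",
    },
}


def to_triples(frames):
    """Flatten {instance: {slot: filler}} into (instance, slot, filler) triples."""
    return [
        (instance, slot, filler)
        for instance, slots in frames.items()
        for slot, filler in slots.items()
    ]


triples = to_triples(tmr)
for triple in triples:
    print(triple)
```

Once a TMR is in triple form, each instance, slot, and filler can be given a URI and serialized as RDF.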
11. Mapping OntoSem to web-based KR
[Figure: OntoSem, backed by the fact repository, lexicon, and ontology, turns NL text into TMRs; OntoSem2OWL produces the OWL ontology and TMRs in OWL.]
12. Mapping Rules for Classes
OntoSem LISP version:
    (make-frame patent
      (definition
        (value (common "the exclusive right to make, use or sell an invention, which is granted to the inventor")))
      (is-a
        (value (common intangible-asset legal-right))))
OWL version:
    <owl:Class rdf:about="ontosem:patent">
      <rdfs:subClassOf>
        <owl:Class rdf:about="ontosem:intangible-asset">
        </owl:Class>
      </rdfs:subClassOf>
      <rdfs:subClassOf>
        <owl:Class rdf:about="ontosem:legal-right">
        </owl:Class>
      </rdfs:subClassOf>
    </owl:Class>
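The class rule can be sketched in a few lines: each parent in the frame's is-a slot becomes one rdfs:subClassOf element. The sketch below uses only Python's standard library rather than the Jena API the authors used. The `frame_to_owl_class` helper is hypothetical, and the `http://ontosem.org/` base URI is assumed from the reasoning output on the backup slides.

```python
import xml.etree.ElementTree as ET

OWL = "http://www.w3.org/2002/07/owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
ONTOSEM = "http://ontosem.org/"  # assumed base URI, for illustration only

for prefix, uri in (("owl", OWL), ("rdf", RDF), ("rdfs", RDFS)):
    ET.register_namespace(prefix, uri)


def frame_to_owl_class(name, parents):
    """Build an owl:Class element with one rdfs:subClassOf per is-a parent."""
    cls = ET.Element(f"{{{OWL}}}Class", {f"{{{RDF}}}about": ONTOSEM + name})
    for parent in parents:
        sub = ET.SubElement(cls, f"{{{RDFS}}}subClassOf")
        ET.SubElement(sub, f"{{{OWL}}}Class", {f"{{{RDF}}}about": ONTOSEM + parent})
    return cls


patent = frame_to_owl_class("patent", ["intangible-asset", "legal-right"])
xml_text = ET.tostring(patent, encoding="unicode")
print(xml_text)
```

The frame's definition slot is not mapped here, because the slide does not show how it is carried over.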
13. Mapping Rules for Properties
- Properties can be:
- Object properties: owl:ObjectProperty
- Datatype properties: owl:DatatypeProperty
- The property hierarchy is defined by rdfs:subPropertyOf
- Domain maps to rdfs:domain
- Range maps to rdfs:range
- Restrictions are handled using owl:Restriction
- Numeric datatypes are handled using XSD
14. Mapping Rules for Properties
    (make-frame controls
      (domain
        (sem (common physical-event physical-object social-event social-role)))
      (range
        (sem (common actualize artifact natural-object social-role)))
      (is-a (value (common relation)))
      (inverse (value (common controlled-by)))
      (definition
        (value (common "A relation which relates concepts to what they can control"))))
15. Mapping Rules for Properties
    <owl:ObjectProperty rdf:ID="controls">
      <rdfs:domain>
        <owl:Class>
          <owl:unionOf rdf:parseType="Collection">
            <owl:Class rdf:about="physical-event"/>
            <owl:Class rdf:about="physical-object"/>
            <owl:Class rdf:about="social-event"/>
            <owl:Class rdf:about="social-role"/>
          </owl:unionOf>
        </owl:Class>
      </rdfs:domain>
      <rdfs:range>
        <owl:Class>
          <owl:unionOf rdf:parseType="Collection">
            <owl:Class rdf:about="actualize"/>
            <owl:Class rdf:about="artifact"/>
            <owl:Class rdf:about="natural-object"/>
            <owl:Class rdf:about="social-role"/>
          </owl:unionOf>
        </owl:Class>
      </rdfs:range>
      <!-- the is-a and inverse mappings are cut off in this extract -->
    </owl:ObjectProperty>
[Slide callouts tie the OWL back to the frame slots: (make-frame, (domain, (range, (is-a, (inverse]
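As a minimal sketch of the property rule, the code below builds the same kind of owl:ObjectProperty from a frame held as a plain dict, wrapping multiple domain or range classes in an owl:unionOf collection. It uses Python's standard library, not Jena. The helper names and the dict layout are hypothetical, and emitting owl:inverseOf for the inverse slot is an assumption based on the facet slide's note on INV.

```python
import xml.etree.ElementTree as ET

OWL = "http://www.w3.org/2002/07/owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

for prefix, uri in (("owl", OWL), ("rdf", RDF), ("rdfs", RDFS)):
    ET.register_namespace(prefix, uri)


def union_of(parent, tag, class_names):
    """Append <tag><owl:Class><owl:unionOf>...</owl:unionOf></owl:Class></tag>."""
    holder = ET.SubElement(parent, tag)
    anon = ET.SubElement(holder, f"{{{OWL}}}Class")
    union = ET.SubElement(
        anon, f"{{{OWL}}}unionOf", {f"{{{RDF}}}parseType": "Collection"}
    )
    for name in class_names:
        ET.SubElement(union, f"{{{OWL}}}Class", {f"{{{RDF}}}about": name})


def frame_to_object_property(frame):
    """Map a property frame (a plain dict here) to an owl:ObjectProperty."""
    prop = ET.Element(f"{{{OWL}}}ObjectProperty", {f"{{{RDF}}}ID": frame["name"]})
    union_of(prop, f"{{{RDFS}}}domain", frame["domain"])
    union_of(prop, f"{{{RDFS}}}range", frame["range"])
    # Assumption: the inverse slot becomes owl:inverseOf.
    ET.SubElement(
        prop, f"{{{OWL}}}inverseOf", {f"{{{RDF}}}resource": "#" + frame["inverse"]}
    )
    return prop


controls = {
    "name": "controls",
    "domain": ["physical-event", "physical-object", "social-event", "social-role"],
    "range": ["actualize", "artifact", "natural-object", "social-role"],
    "inverse": "controlled-by",
}
prop = frame_to_object_property(controls)
xml_text = ET.tostring(prop, encoding="unicode")
print(xml_text)
```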
16. Mapping Rules for Facets
- Facets are a way of restricting the fillers that can be used for a particular slot.
- SEM and VALUE: mapped using owl:Restriction on a particular property.
- RELAXABLE-TO: add this to the classes present in the owl:Restriction and record this information in an annotation.
- DEFAULT: there is no clear way to represent non-monotonic reasoning and closed-world assumptions in the Semantic Web.
- DEFAULT-MEASURE: similar to the DEFAULT facet; not handled.
- DEFAULT and DEFAULT-MEASURE are used relatively infrequently.
- NOT: can be handled using owl:disjointWith.
- INV: need not be handled, since the inverse slot is already mapped to owl:inverseOf.
17. Evaluation
Built an ontology translation tool using the Jena API.
- Total triples generated: 102189 (including bnodes)
- Time to build the model: 10-40 sec
- Time to do RDFS inference: 10 sec
- Time to do OWL Micro: 40 sec
- Time to do OWL Full: ????
DL expressivity: ELUIH
- EL: conjunction and full existential quantification
- U: union
- H: role hierarchy
- I: role inverse
Tools: Swoop, Pellet, WonderWeb, http://w3c.org/RDF/Validator/
After translation:
- Total number of classes: 7747 (defined 7747, imported 0)
- Total number of datatype properties: 0 (defined 0, imported 0)
- Total number of object properties: 604 (defined 604, imported 0)
- Total number of annotation properties: 1 (defined 1, imported 0)
- Total number of individuals: 0 (defined 0, imported 0)
NOTE: This is using no restrictions.
OWL Full
18. Evaluation
- Syntactic correctness: checked using OWL/RDF validators.
- Semantic validation: full semantic validation, even for subsets of OWL, is difficult.
- Meaning preservation: some native representation features, such as DEFAULTS, modality, and case roles, may be underrepresented or not handled.
- Feature minimization: complex features could be difficult for reasoners to handle, so the translation can be performed at each of the levels OWL Lite, OWL DL, and OWL Full.
- Translation complexity: OntoSem is an extensive and large ontology (8000 concepts). The translation itself is done syntactically, but in general translation might require reasoning, which could be an issue.
19. An Application Testbed: SemNews
- Semantically search and browse news.
- Aggregators collect RSS news descriptions from various sources.
- The sentences are processed by OntoSem and converted into TMRs.
- Provides intelligent agents with the latest news in a machine-readable format.
- http://semnews.umbc.edu/
20. http://semnews.umbc.edu
21. Agent-understandable news
Provides an RDF version of the news.
22. Semanticizing RSS
View a structured representation of the RSS news story.
Future versions would enable editing the facts and provide provenance information.
23. News stories are ontologically linked
Find news stories by browsing through the OntoSem ontology.
24. Tracking Named Entities
Find stories on a specific named entity.
25. Browsing Facts
The fact repository explorer for the named entity Mexico shows that it has a nationality-of relation with CITIZEN-235.
The fact repository explorer for the instance CITIZEN-235 shows that the citizen is an agent of an ESCAPE-EVENT.
26. Querying the semanticized RSS
RDQL queries
Provides structured querying over text represented in RDF.
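RDQL expresses a query as triple patterns containing variables. The sketch below is not RDQL and does not use Jena; it only illustrates the triple-pattern idea in plain Python. The `match` helper is hypothetical, and the triples are hand-written for illustration, loosely modeled on the Mexico / CITIZEN-235 fact-repository example, not actual SemNews output.

```python
# Hand-written triples for illustration; not actual SemNews output.
triples = [
    ("Mexico", "nationality-of", "CITIZEN-235"),
    ("CITIZEN-235", "agent-of", "ESCAPE-EVENT-1"),
    ("CITIZEN-235", "instance-of", "CITIZEN"),
]


def match(pattern, store):
    """Return one {variable: value} binding per triple that fits the pattern.

    Pattern terms starting with "?" are variables; all others must match exactly.
    """
    results = []
    for triple in store:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A repeated variable must bind to the same value each time.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            results.append(binding)
    return results


# "What is CITIZEN-235 the agent of?"
print(match(("CITIZEN-235", "agent-of", "?event"), triples))
```

A real query engine would also join several patterns on shared variables; this sketch handles a single pattern only.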
27. Semantic Alerts
Alerts can be specified as ontological concepts, keywords, or RDQL queries. Subscribe to the results of structured queries.
28. Beyond keyword search
- Conceptually searching for content: find all news stories that have something to do with a place and a terrorist activity.
- Context-based querying: find all events in which George Bush was the speaker.
- Reporting facts: find all politicians who traveled to Asia.
- Knowledge sharing: populating instances by mapping FOAF and DC to the OntoSem ontology.
29. Current work
- Enron email corpus
- Profiles in terror
30. Conclusions
- Integrating language processing agents into the SW would let them publish SW annotations and documents that capture the text's meaning.
- Migrating from a native, non-web-based representation to a SW representation may be lossy but is still useful for many applications.
- The SemNews application testbed demonstrates some scenarios that can benefit from language understanding agents.
31. For More Information
- SemNews application: http://semnews.umbc.edu/
- OntoSem NLP system: http://ilit.umbc.edu/
- UMBC ebiquity research group: http://ebiquity.umbc.edu/
- This presentation: http://ebiquity.umbc.edu/paper/html/id/260/
32. References
- Software Used
- 1. OntoSem, http://ilit.umbc.edu/
- 2. RDF Validation Service, http://w3c.org/RDF/Validator
- 3. Jena Toolkit, http://jena.sourceforge.net/
- 4. Swoop Ontology Viewer, http://www.mindswap.org/2004/SWOOP/
- 5. Pellet OWL DL Reasoner, http://www.mindswap.org/2003/pellet/
- 6. WonderWeb OWL Validator, http://phoebus.cs.man.ac.uk:9999/OWL/Validator
- Papers
- 1. Sergei Nirenburg and Victor Raskin, Ontological Semantics, Formal Ontology and Ambiguity
- 2. Sergei Nirenburg and Victor Raskin, Ontological Semantics, MIT Press, forthcoming
- 3. Sergei Nirenburg, Ontological Semantics Overview, presentation, CLSP JHU, Spring 2003
- 4. Marjorie McShane, Sergei Nirenburg, Stephen Beale, Margalit Zabludowski, The Cross Lingual Reuse and Extension of Knowledge Resources in Ontological Semantics
- 5. P.J. Beltran-Ferruz, P.A. Gonzalez-Calero, P. Gervas, Converting Mikrokosmos Frames into Description Logics
- 6. Sergei Nirenburg, Ontology Tutorial, ILIT UMBC
- Mailing Lists
- 1. Jena developers: jena-dev_at_yahoogroups.com
- 2. Pellet users: pellet-users_at_lists.mindswap.org
33. Backup slides
34. Reasoning Capabilities
Finding transitive closures (RDFS reasoning)
    Buildfile: build.xml
    init:
    compile:
    dist:
        [jar] Building jar: /home/aks1/software/eclipse/workspace/ontojena/dist/lib/ontojena.jar
    run:
        [java] MODEL OK
        [java] Resource http://ontosem.org/fire-engine
        [java]  - (http://ontosem.org/fire-engine rdfs:subClassOf http://ontosem.org/fire-engine)
        [java]  - (http://ontosem.org/fire-engine rdfs:subClassOf http://ontosem.org/all)
        [java]  - (http://ontosem.org/fire-engine rdfs:subClassOf http://ontosem.org/physical-object)
        [java]  - (http://ontosem.org/fire-engine rdfs:subClassOf http://ontosem.org/inanimate)
        [java]  - (http://ontosem.org/fire-engine rdfs:subClassOf http://ontosem.org/wheeled-vehicle)
        [java]  - (http://ontosem.org/fire-engine rdfs:subClassOf http://ontosem.org/engine-propelled-vehicle)
        [java]  - (http://ontosem.org/fire-engine rdfs:subClassOf http://ontosem.org/wheeled-engine-vehicle)
[Figure: class hierarchy with inferred triples; nodes: vehicle, land-vehicle, engine-propelled-vehicle, wheeled-vehicle, wheeled-engine-vehicle, truck, fire-engine.]
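The inferred triples on this slide come from following rdfs:subClassOf edges transitively. The sketch below shows that closure in plain Python rather than through Jena's RDFS reasoner. The `superclasses` helper is hypothetical, and the direct edges are assumed from the node names in the hierarchy figure, so the real ontology may link these classes differently.

```python
# Direct subclass edges (child -> parents). The exact edges are assumed from
# the node names on the slide, for illustration only.
direct_parents = {
    "fire-engine": ["truck"],
    "truck": ["wheeled-engine-vehicle"],
    "wheeled-engine-vehicle": ["wheeled-vehicle", "engine-propelled-vehicle"],
    "wheeled-vehicle": ["land-vehicle"],
    "engine-propelled-vehicle": ["land-vehicle"],
    "land-vehicle": ["vehicle"],
}


def superclasses(cls, edges):
    """Return every class reachable from cls by following subclass edges."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in edges.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


inferred = superclasses("fire-engine", direct_parents)
for parent in sorted(inferred):
    print(f"(fire-engine rdfs:subClassOf {parent})")
```

Unlike the reasoner output above, this sketch does not emit the reflexive triple (fire-engine rdfs:subClassOf fire-engine).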
35. Mapping Rules
Property-related constructs
36. Mapping Rules
Facet-related constructs