1
Text Understanding Agents and the Semantic Web
  • Akshay Java, Tim Finin, Sergei Nirenburg
    01/04/2005

2
Outline
  • Motivation: Language Understanding Agents
  • Ontological Semantics
  • Bridging the Knowledge Gap
  • Preliminary Evaluation
  • SemNews: An Application Testbed
  • Conclusion
  • Q&A

3
Motivation
  • Intelligent agents need knowledge and
    information.
  • Most Web content is NL text.
  • The Semantic Web (SW) can in turn help NLP tools
    with their language understanding tasks.

[Diagram: the WWW (web of documents: natural-language text, images, audio, video) and the Semantic Web (web of data: RDF/OWL ontologies, instances, triples), connected by NLP tools. Arrow labels: "Facts from NL", "structured information".]
4
Motivation
5
Ontological Semantics
OntoSem is a natural language processing system
that processes text and converts it into facts.
It is supported by a constructed world model
encoded in a rich ontology.
6
Ontological Semantics
7
Static Knowledge Sources
  • Ontology: 8000 concepts, avg. 16 properties each
  • Lexicons: English 45000 entries, Spanish 40000
    entries, Chinese 3000 entries
  • Fact repository: 20000 facts
  • Onomasticon: NNNNN names

8
The OntoSem Ontology
[Diagram: structure of the OntoSem ontology, relating ONTOLOGY, CONCEPT, ROOT, OBJECT-OR-EVENT, PROPERTY, SLOT, FACET and FILLER.]
9
Text Meaning Representation (TMR)
  • Word sense addressed and disambiguated
  • A persistent fact stored in the FR
  • Semantic dependency established
10
Text Meaning Representation (TMR)
REQUEST-ACTION-69
    AGENT             HUMAN-72
    THEME             ACCEPT-70
    BENEFICIARY       ORGANIZATION-71
    SOURCE-ROOT-WORD  ask
    TIME              (< (FIND-ANCHOR-TIME))
ACCEPT-70
    THEME             WAR-73
    THEME-OF          REQUEST-ACTION-69
    SOURCE-ROOT-WORD  authorize
ORGANIZATION-71
    HAS-NAME          United-Nations
    BENEFICIARY-OF    REQUEST-ACTION-69
    SOURCE-ROOT-WORD  UN
HUMAN-72
    HAS-NAME          Colin Powell
    AGENT-OF          REQUEST-ACTION-69
    SOURCE-ROOT-WORD  he   (reference resolution has been carried out)
WAR-73
    THEME-OF          ACCEPT-70
    SOURCE-ROOT-WORD  war
Input sentence: He asked the UN to authorize the war.
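
The TMR on this slide can be read as a small graph of frames linked by slots. The sketch below is one possible encoding, not OntoSem's own format: the dictionary layout and the helper `missing_inverses` are hypothetical, and the TIME slot is left out for brevity. It checks that every relation between two instances has the matching "-OF" inverse link.

```python
# Hypothetical encoding of the TMR: instance name -> {slot: filler}.
TMR = {
    "REQUEST-ACTION-69": {"AGENT": "HUMAN-72", "THEME": "ACCEPT-70",
                          "BENEFICIARY": "ORGANIZATION-71",
                          "SOURCE-ROOT-WORD": "ask"},
    "ACCEPT-70": {"THEME": "WAR-73", "THEME-OF": "REQUEST-ACTION-69",
                  "SOURCE-ROOT-WORD": "authorize"},
    "ORGANIZATION-71": {"HAS-NAME": "United-Nations",
                        "BENEFICIARY-OF": "REQUEST-ACTION-69",
                        "SOURCE-ROOT-WORD": "UN"},
    "HUMAN-72": {"HAS-NAME": "Colin Powell",
                 "AGENT-OF": "REQUEST-ACTION-69",
                 "SOURCE-ROOT-WORD": "he"},
    "WAR-73": {"THEME-OF": "ACCEPT-70", "SOURCE-ROOT-WORD": "war"},
}


def missing_inverses(tmr):
    """Return (frame, slot, filler) links whose "-OF" inverse is absent."""
    missing = []
    for frame, slots in tmr.items():
        for slot, filler in slots.items():
            # Only links between TMR instances have an inverse to check.
            if filler in tmr and not slot.endswith("-OF"):
                if tmr[filler].get(slot + "-OF") != frame:
                    missing.append((frame, slot, filler))
    return missing


print(missing_inverses(TMR))  # prints [] because every link is mirrored
```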
11
Mapping OntoSem to web based KR
[Diagram: components NL Text, OntoSem, Lexicon, Ontology, Fact Repository and TMR on the OntoSem side; OntoSem2OWL maps these to an OWL Ontology and TMRs in OWL.]
12
Mapping Rules for Classes
  • OntoSem LISP version:
      (make-frame patent
        (definition
          (value (common "the exclusive right to make, use or sell an invention, which is granted to the inventor")))
        (is-a
          (value (common intangible-asset legal-right))))
  • OWL version:
      <owl:Class rdf:about="&ontosem;patent">
        <rdfs:subClassOf>
          <owl:Class rdf:about="&ontosem;intangible-asset">
          </owl:Class>
        </rdfs:subClassOf>
        <rdfs:subClassOf>
          <owl:Class rdf:about="&ontosem;legal-right">
          </owl:Class>
        </rdfs:subClassOf>
      </owl:Class>
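
A minimal sketch of the class rule, assuming each is-a filler becomes one rdfs:subClassOf element. This is not the OntoSem2OWL tool itself, which was built on Jena; the function `class_to_owl` is hypothetical, and the namespace is taken from the URIs shown in the backup slide output.

```python
import xml.etree.ElementTree as ET

OWL = "http://www.w3.org/2002/07/owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
ONTOSEM = "http://ontosem.org/#"  # assumed namespace for concept names

for prefix, uri in (("owl", OWL), ("rdf", RDF), ("rdfs", RDFS)):
    ET.register_namespace(prefix, uri)


def class_to_owl(name, parents):
    """Build an owl:Class with one rdfs:subClassOf per is-a filler."""
    cls = ET.Element(f"{{{OWL}}}Class", {f"{{{RDF}}}about": ONTOSEM + name})
    for parent in parents:
        sub = ET.SubElement(cls, f"{{{RDFS}}}subClassOf")
        ET.SubElement(sub, f"{{{OWL}}}Class",
                      {f"{{{RDF}}}about": ONTOSEM + parent})
    return cls


patent = class_to_owl("patent", ["intangible-asset", "legal-right"])
print(ET.tostring(patent, encoding="unicode"))
```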

13
Mapping Rules for Properties
  • Properties can be an ObjectProperty
    (owl:ObjectProperty) or a Datatype Property
    (owl:DatatypeProperty)
  • Property hierarchy is defined by
    rdfs:subPropertyOf
  • Domain maps to rdfs:domain
  • Range maps to rdfs:range
  • Restrictions are handled using owl:Restriction
  • Numeric datatypes are handled using XSD

14
Mapping Rules for Properties
(make-frame controls
  (domain
    (sem (common physical-event physical-object social-event social-role)))
  (range
    (sem (common actualize artifact natural-object social-role)))
  (is-a (value (common relation)))
  (inverse (value (common controlled-by)))
  (definition
    (value (common "A relation which relates concepts to what they can control"))))
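
Before any mapping rule can be applied, the frame has to be read into a data structure. The sketch below is an assumption about how that might be done, not the authors' parser: `parse_sexpr` and `fillers` are hypothetical helpers, and the definition slot is omitted from the sample text for brevity.

```python
import re


def parse_sexpr(text):
    """Parse one parenthesised expression into nested Python lists."""
    tokens = re.findall(r'"[^"]*"|[()]|[^\s()]+', text)
    pos = 0

    def read():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while tokens[pos] != ")":
                items.append(read())
            pos += 1  # skip the closing parenthesis
            return items
        return tok.strip('"')

    return read()


def fillers(frame, slot):
    """Return a slot's fillers, dropping the facet name and "common"."""
    for entry in frame[2:]:
        if entry[0] == slot:
            facet = entry[1]      # e.g. ["sem", ["common", ...]]
            return facet[1][1:]   # skip the leading "common"
    return []


FRAME = """
(make-frame controls
  (domain (sem (common physical-event physical-object social-event social-role)))
  (range (sem (common actualize artifact natural-object social-role)))
  (is-a (value (common relation)))
  (inverse (value (common controlled-by))))
"""
frame = parse_sexpr(FRAME)
print(frame[1], fillers(frame, "domain"), fillers(frame, "inverse"))
```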

15
Mapping Rules for Properties
<owl:ObjectProperty rdf:ID="controls">
  <rdfs:domain>
    <owl:Class>
      <owl:unionOf rdf:parseType="Collection">
        <owl:Class rdf:about="physical-event"/>
        <owl:Class rdf:about="physical-object"/>
        <owl:Class rdf:about="social-event"/>
        <owl:Class rdf:about="social-role"/>
      </owl:unionOf>
    </owl:Class>
  </rdfs:domain>
  <rdfs:range>
    <owl:Class>
      <owl:unionOf rdf:parseType="Collection">
        <owl:Class rdf:about="actualize"/>
        <owl:Class rdf:about="artifact"/>
        <owl:Class rdf:about="natural-object"/>
        <owl:Class rdf:about="social-role"/>
      </owl:unionOf>

[Slide callouts link the OWL elements to the frame's (make-frame, (domain, (range, (is-a and (inverse slots.]
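
A sketch of the property rule under stated assumptions: domain and range fillers become an owl:unionOf collection, as on this slide. The OWL shown here stops before the is-a and inverse slots, so emitting them as rdfs:subPropertyOf and owl:inverseOf is an assumption drawn from the mapping rules, and the "#" prefix on resource names is also assumed. `union_of` and `property_to_owl` are hypothetical names.

```python
import xml.etree.ElementTree as ET

OWL = "http://www.w3.org/2002/07/owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
for prefix, uri in (("owl", OWL), ("rdf", RDF), ("rdfs", RDFS)):
    ET.register_namespace(prefix, uri)


def union_of(parent, tag, class_names):
    """Attach tag > owl:Class > owl:unionOf > one owl:Class per name."""
    holder = ET.SubElement(parent, tag)
    anon = ET.SubElement(holder, f"{{{OWL}}}Class")
    union = ET.SubElement(anon, f"{{{OWL}}}unionOf",
                          {f"{{{RDF}}}parseType": "Collection"})
    for name in class_names:
        # The "#" prefix is an assumption, not shown on the slide.
        ET.SubElement(union, f"{{{OWL}}}Class", {f"{{{RDF}}}about": "#" + name})


def property_to_owl(name, domain, range_, parent, inverse):
    """Build an owl:ObjectProperty from the frame's four slots."""
    prop = ET.Element(f"{{{OWL}}}ObjectProperty", {f"{{{RDF}}}ID": name})
    union_of(prop, f"{{{RDFS}}}domain", domain)
    union_of(prop, f"{{{RDFS}}}range", range_)
    # Assumed output for the is-a and inverse slots.
    ET.SubElement(prop, f"{{{RDFS}}}subPropertyOf",
                  {f"{{{RDF}}}resource": "#" + parent})
    ET.SubElement(prop, f"{{{OWL}}}inverseOf",
                  {f"{{{RDF}}}resource": "#" + inverse})
    return prop


controls = property_to_owl(
    "controls",
    ["physical-event", "physical-object", "social-event", "social-role"],
    ["actualize", "artifact", "natural-object", "social-role"],
    "relation", "controlled-by")
print(ET.tostring(controls, encoding="unicode"))
```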
16
Mapping Rules for Facets
  • Facets are a way of restricting the fillers that
    can be used for a particular slot.
  • SEM and VALUE: mapped using owl:Restriction on a
    particular property.
  • RELAXABLE-TO: added to the classes present in the
    owl:Restriction, with this information recorded
    in the annotation.
  • DEFAULT: there is no clear way to represent
    non-monotonic reasoning and closed-world
    assumptions in the Semantic Web.
  • DEFAULT-MEASURE: similar to the DEFAULT facet;
    not handled.
  • DEFAULT and DEFAULT-MEASURE are used relatively
    infrequently.
  • NOT: can be handled using owl:disjointWith.
  • INV: need not be handled, since the inverse slot
    is already mapped to owl:inverseOf.
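
The facet rules amount to a dispatch on the facet name. The sketch below only records which action each facet would trigger and which facets are left untranslated; it does not build OWL. The action labels, the function `plan_slot` and the sample DEFAULT filler are hypothetical.

```python
# Facet -> handling, transcribed from the bullets on this slide.
HANDLED = {
    "SEM": "restriction",
    "VALUE": "restriction",
    "RELAXABLE-TO": "restriction+annotation",
    "NOT": "disjoint",
}
SKIPPED = {"DEFAULT", "DEFAULT-MEASURE", "INV"}


def plan_slot(prop, facets):
    """Split a slot's facets into OWL actions and untranslated facets."""
    actions, dropped = [], []
    for facet, fillers in facets.items():
        if facet in HANDLED:
            actions.append((HANDLED[facet], prop, tuple(fillers)))
        elif facet in SKIPPED:
            dropped.append(facet)
        else:
            raise ValueError(f"unknown facet: {facet}")
    return actions, dropped


# Hypothetical slot: the DEFAULT filler is invented for illustration.
actions, dropped = plan_slot(
    "controls",
    {"SEM": ["physical-event", "physical-object"], "DEFAULT": ["social-role"]})
print(actions, dropped)
```

Listing the dropped facets makes the meaning lost in translation visible for each slot.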

17
Evaluation
Built ontology translation tool using the Jena API.
  • Total triples generated: 102189 (including
    bnodes)
  • Time to build the model: 10-40 sec
  • Time to do RDFS inference: 10 sec
  • Time to do OWL Micro: 40 sec
  • Time to do OWL Full: ????
  • DL expressivity: ELUIH (EL - Conjunction and Full
    Existential Quantification, U - Union, H - Role
    Hierarchy, I - Role Inverse)
Tools used: Swoop, Pellet, Wonderweb,
http://w3c.org/RDF/Validator/
After translation:
  • Total number of classes: 7747 (defined 7747,
    imported 0)
  • Total number of datatype properties: 0 (defined
    0, imported 0)
  • Total number of object properties: 604 (defined
    604, imported 0)
  • Total number of annotation properties: 1
    (defined 1, imported 0)
  • Total number of individuals: 0 (defined 0,
    imported 0)
NOTE: This is using no restrictions.
OWL Full
18
Evaluation
  • Syntactic Correctness: checked using OWL/RDF
    validators.
  • Semantic Validation: full semantic validation,
    even for subsets of OWL, is difficult.
  • Meaning Preservation: some native representation
    features, such as DEFAULTS, modality and case
    roles, may be underrepresented or not handled.
  • Feature Minimization: complex features could be
    difficult for reasoners to handle, so the
    translation can be performed at each of the
    levels OWL Lite, OWL DL and OWL Full.
  • Translation Complexity: OntoSem is an extensive
    and large ontology (8000 concepts). The
    translation itself is done syntactically, but in
    general translation might require reasoning,
    which could be an issue.

19
An Application Testbed: SemNews
  • Semantically search and browse news.
  • Aggregators collect the RSS news descriptions
    from various sources.
  • The sentences are processed by OntoSem and
    converted into TMRs.
  • Provides intelligent agents with the latest news
    in a machine-readable format.
  • http://semnews.umbc.edu/

20
http://semnews.umbc.edu
21
Agent understandable news
Provides RDF version of the news.
22
Semanticizing RSS
View a structured representation of the RSS news
story.
Future versions would enable editing the facts
and provide provenance information.
23
News stories are ontologically linked
Find news stories by browsing through the OntoSem
ontology.
24
Tracking Named Entities
Find stories on a specific named entity.
25
Browsing Facts
  • The fact repository explorer for the named entity
    Mexico shows that it has a relation
    nationality-of with CITIZEN-235.
  • The fact repository explorer for the instance
    CITIZEN-235 shows that the citizen is an agent of
    ESCAPE-EVENT.
26
Querying the semanticized RSS
RDQL Queries
Provides structured querying over text
represented in RDF.
27
Semantic Alerts
Alerts can be specified as ontological concepts,
keywords or RDQL queries. Users can subscribe to
the results of structured queries.
28
Beyond keyword search
  • Conceptually searching for content: find all
    news stories that have something to do with a
    place and a terrorist activity.
  • Context-based querying: find all events in which
    George Bush was the speaker.
  • Reporting facts: find all politicians who
    traveled to Asia.
  • Knowledge sharing: populating instances by
    mapping FOAF and DC to the OntoSem ontology.
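
Queries like these reduce to matching triple patterns with shared variables, which is what an RDQL query expresses over RDF. The sketch below illustrates the idea in plain Python rather than RDQL or Jena. The triples, instance names and property names are hypothetical, not SemNews data.

```python
def match(triples, pattern):
    """Yield variable bindings for one (s, p, o) pattern; "?x" is a variable."""
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield binding


def query(triples, patterns):
    """Join several patterns on their shared variables."""
    results = [{}]
    for pattern in patterns:
        joined = []
        for bound in results:
            # Substitute variables bound by earlier patterns.
            concrete = tuple(bound.get(term, term) for term in pattern)
            for binding in match(triples, concrete):
                joined.append({**bound, **binding})
        results = joined
    return results


# Hypothetical facts in the style of the TMR slots.
TRIPLES = [
    ("SPEECH-ACT-12", "agent", "HUMAN-5"),
    ("HUMAN-5", "has-name", "George Bush"),
    ("SPEECH-ACT-13", "agent", "HUMAN-6"),
    ("HUMAN-6", "has-name", "Tony Blair"),
]

# "Find all events in which George Bush was the speaker."
hits = query(TRIPLES, [("?event", "agent", "?who"),
                       ("?who", "has-name", "George Bush")])
print([hit["?event"] for hit in hits])  # prints ['SPEECH-ACT-12']
```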

29
Current work
  • Enron email corpus
  • Profiles in terror

30
Conclusions
  • Integrating language processing agents into the
    SW lets them publish SW annotations and documents
    that capture a text's meaning.
  • Migrating from a native, non-web-based
    representation to an SW representation may be
    lossy but is still useful for many
    applications.
  • The SemNews application testbed demonstrates some
    scenarios that can benefit from language
    understanding agents.

31
For More Information
  • SemNews application: http://semnews.umbc.edu/
  • OntoSem NLP system: http://ilit.umbc.edu/
  • UMBC ebiquity research group:
    http://ebiquity.umbc.edu/
  • This presentation:
    http://ebiquity.umbc.edu/paper/html/id/260/

32
References
  • Software Used
  • [1] OntoSem, http://ilit.umbc.edu/
  • [2] RDF Validation service,
    http://w3c.org/RDF/Validator
  • [3] Jena Toolkit, http://jena.sourceforge.net/
  • [4] Swoop Ontology Viewer,
    http://www.mindswap.org/2004/SWOOP/
  • [5] Pellet OWL DL Reasoner,
    http://www.mindswap.org/2003/pellet/
  • [6] Wonder Web OWL Validator,
    http://phoebus.cs.man.ac.uk:9999/OWL/Validator
  • Papers
  • [1] Sergei Nirenburg and Victor Raskin,
    Ontological Semantics, Formal Ontology and
    Ambiguity
  • [2] Sergei Nirenburg and Victor Raskin,
    Ontological Semantics, MIT Press, Forthcoming
  • [3] Sergei Nirenburg, Ontological Semantics
    Overview, Presentation CLSP JHU, Spring 2003
  • [4] Marjorie McShane, Sergei Nirenburg, Stephen
    Beale, Margalit Zabludowski, The Cross Lingual
    Reuse and Extension of Knowledge Resources in
    Ontological Semantics
  • [5] P.J. Beltran-Ferruz, P.A. Gonzalez-Calero, P.
    Gervas, Converting Mikrokosmos frames into
    Description Logics
  • [6] Sergei Nirenburg, Ontology Tutorial, ILIT
    UMBC
  • Mailing Lists
  • [1] Jena Developers, jena-dev_at_yahoogroups.com
  • [2] Pellet users,
    pellet-users_at_lists.mindswap.org

33
Backup slides
34
Reasoning Capabilities
Finding Transitive Closures (RDFS reasoning)
Buildfile: build.xml
init:
compile:
dist:
    [jar] Building jar: /home/aks1/software/eclipse/workspace/ontojena/dist/lib/ontojena.jar
run:
    [java] MODEL OK
    [java] Resource: http://ontosem.org/#fire-engine
    [java] - (http://ontosem.org/#fire-engine rdfs:subClassOf http://ontosem.org/#fire-engine)
    [java] - (http://ontosem.org/#fire-engine rdfs:subClassOf http://ontosem.org/#all)
    [java] - (http://ontosem.org/#fire-engine rdfs:subClassOf http://ontosem.org/#physical-object)
    [java] - (http://ontosem.org/#fire-engine rdfs:subClassOf http://ontosem.org/#inanimate)
    [java] - (http://ontosem.org/#fire-engine rdfs:subClassOf http://ontosem.org/#wheeled-vehicle)
    [java] - (http://ontosem.org/#fire-engine rdfs:subClassOf http://ontosem.org/#engine-propelled-vehicle)
    [java] - (http://ontosem.org/#fire-engine rdfs:subClassOf http://ontosem.org/#wheeled-engine-vehicle)

[Diagram: class hierarchy with nodes vehicle, land-vehicle, engine-propelled-vehicle, wheeled-vehicle, wheeled-engine-vehicle, truck and fire-engine, annotated "Inferred Triples".]
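
The inferred triples come from following rdfs:subClassOf links transitively. The sketch below shows that closure in plain Python, not through the Jena reasoner that produced the output above. The class names come from the diagram, but which class is the direct parent of which is an assumption, and `superclasses` is a hypothetical helper.

```python
# Direct rdfs:subClassOf links. The parent chosen for each class is assumed;
# only the class names are taken from the slide.
DIRECT = {
    "fire-engine": ["truck"],
    "truck": ["wheeled-engine-vehicle"],
    "wheeled-engine-vehicle": ["wheeled-vehicle", "engine-propelled-vehicle"],
    "wheeled-vehicle": ["land-vehicle"],
    "engine-propelled-vehicle": ["vehicle"],
    "land-vehicle": ["vehicle"],
}


def superclasses(cls, direct):
    """Return every class reachable from cls via subClassOf, cls included."""
    seen, stack = {cls}, [cls]
    while stack:
        for parent in direct.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


for parent in sorted(superclasses("fire-engine", DIRECT)):
    print(f"(fire-engine rdfs:subClassOf {parent})")
```

The class itself is part of the result because rdfs:subClassOf is reflexive, which matches the first inferred triple in the output above.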
35
Mapping Rules
Property Related Constructs
36
Mapping Rules
Facet related constructs