Ginseng: A Guided Input Natural Language Search Engine for Querying Ontologies
Abraham Bernstein, Christoph Kiefer
1
Ginseng: A Guided Input Natural Language Search
Engine for Querying Ontologies
  • Making the Semantic Web Accessible to the Casual
    User

2
Problems of the Semantic Web
  • Computational complexity
  • Expressiveness of DL
  • Markup prisoner's dilemma
  • Formal logic is not simple
  • Only 1% of Web users use Boolean logic (Spink et
    al. 2001)
  • Spoerri (1993) showed that people aren't at ease
    constructing Boolean queries: even MIT students
    constructed wrong queries

How can we bridge the gap between the
(description) logic-based Semantic Web and
real-world users, who are at the very least ill
at ease with, and oftentimes unable to use,
formal logic concepts?
3
Possible Answers
  • Query by example
  • But Semantic Web data is in NF2 (not in 1NF): a
    resource can hold several values for the same
    property, so tabular examples don't map cleanly
  • Graphical Query Languages
  • Are also formal (but maybe more intuitive)
  • Natural Language (Androutsopoulos et al. 1995)
  • High computational complexity
  • Difficult to adapt to new domains
  • Offers no guidance
  • Usually processes only a subset of English while
    suggesting full comprehension

4
Ginseng: Demo
5
Mooney Geo Ontology
  • 19 classes, 30 properties, 1350 instances

6
Ginseng User Guidance
7
Ginseng: Architecture
(Architecture diagram; main components:)
  • Simple question grammar
  • Full grammar
  • Grammar compiler
  • Incremental parser
  • Jena SPARQL engine
  • Jena ontology model
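A minimal sketch of the last two stages (the incremental parser's query executed by the Jena SPARQL engine over the Jena ontology model), assuming the Apache Jena API; the file name, namespace, and query string are illustrative, not Ginseng's actual code:

    import org.apache.jena.ontology.OntModel;
    import org.apache.jena.query.Query;
    import org.apache.jena.query.QueryExecution;
    import org.apache.jena.query.QueryExecutionFactory;
    import org.apache.jena.query.QueryFactory;
    import org.apache.jena.query.ResultSet;
    import org.apache.jena.rdf.model.ModelFactory;

    public class GinsengPipelineSketch {
        public static void main(String[] args) {
            // Jena ontology model: load the geography ontology.
            OntModel model = ModelFactory.createOntologyModel();
            model.read("file:geo.owl"); // hypothetical file name

            // Query as the incremental parser might emit it for
            // "which state borders Mississippi?" (URIs are assumptions).
            String queryString =
                "PREFIX geo: <http://www.mooney.net/geo#> "
                + "SELECT ?state WHERE { "
                + "  ?state a geo:state . "
                + "  ?state geo:borders geo:mississippiState . }";

            // Jena SPARQL engine: run the query and print each binding.
            Query query = QueryFactory.create(queryString);
            try (QueryExecution qexec = QueryExecutionFactory.create(query, model)) {
                ResultSet results = qexec.execSelect();
                while (results.hasNext()) {
                    System.out.println(results.next().get("state"));
                }
            }
        }
    }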
8
Ginseng: Grammar
  • (1) <START> <OQ> → SELECT <<OQ>>
    WHERE (<<OQ>>)
  • (2a) <OQ> which <subject> <verb> →
    <<subject>> (<<subject1>> <<verb>>)
  • (2b) <OQ> what <subject> <verb> →
    <<subject>> (<<subject1>> <<subject>>)
    (<<subject1>> <<verb>>)
  • (3) <subject> state → ?state
    <RDF:type> <geo:state> (type=<geo:state>)
  • (4) <verb> borders <object> →
    <geo:borders> <<object>> (domain=<geo:state>,
    range=<geo:state>)
  • (5) <object> New York City → ?newyorkcity
    <geo:newYorkCity> (type=<geo:city>,
    <geo:capital>)
  • (7) <object> Mississippi → ?mississippi
    <geo:mississippiState> (type=<geo:state>)
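Read together, the rules assemble a query. A worked sketch for "which state borders Mississippi?", composed from rules (1), (2a), (3), (4), and (7) above (the intermediate notation is illustrative):

    which state borders Mississippi?
      (2a) <OQ>      → which <subject> <verb>
      (3)  <subject> → state            ⇒ ?state, (?state <RDF:type> <geo:state>)
      (4)  <verb>    → borders <object> ⇒ (?state <geo:borders> <<object>>)
      (7)  <object>  → Mississippi      ⇒ <geo:mississippiState>
      (1)  ⇒ SELECT ?state
             WHERE (?state <RDF:type> <geo:state>)
                   (?state <geo:borders> <geo:mississippiState>)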

9
Ginseng: Grammar Details I
  • Static Grammar
  • 120 mostly empirically constructed rules
  • E.g., rules (1), (2a), (2b)
  • Dynamic Grammar
  • Parses the ontology and turns its major elements
    into grammar rules
  • Class state:
    <NCc> state → ?state <http://www.mooney.net/geo#state>
    <NCc> states → ?state <http://www.mooney.net/geo#state>
  • State instance:
    <NI> West Virginia → ?westvirginia
    <http://www.mooney.net/geo#westVirginia>
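A sketch of how the grammar compiler's dynamic rule generation could look with the Apache Jena API (the pluralization and rule syntax are assumptions modeled on the examples above):

    import org.apache.jena.ontology.OntClass;
    import org.apache.jena.ontology.OntModel;
    import org.apache.jena.rdf.model.ModelFactory;
    import org.apache.jena.util.iterator.ExtendedIterator;

    public class GrammarCompilerSketch {
        public static void main(String[] args) {
            OntModel model = ModelFactory.createOntologyModel();
            model.read("file:geo.owl"); // hypothetical file name

            // For each named class, emit noun rules that map the class
            // label (singular and a naive plural) to a typed variable.
            ExtendedIterator<OntClass> classes = model.listNamedClasses();
            while (classes.hasNext()) {
                OntClass cls = classes.next();
                String label = cls.getLabel(null); // rdfs:label if present
                if (label == null) label = cls.getLocalName();
                String var = label.toLowerCase().replace(" ", "");
                System.out.printf("<NCc> %s → ?%s <%s>%n", label, var, cls.getURI());
                System.out.printf("<NCc> %ss → ?%s <%s>%n", label, var, cls.getURI());
            }
        }
    }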

10
Ginseng: Grammar Details II
  • Properties:
    <owl:ObjectProperty rdf:ID="isMountainOf">
      <rdfs:domain rdf:resource="#Mountain"/>
      <rdfs:range rdf:resource="#State"/>
      <ginseng:ignore rdf:value="id text"/>
      <ginseng:phrase rdf:value="lies in"/>
      ...
    </owl:ObjectProperty>
  • ⇒ generated rules:
    <VCU> lies in <Obj> → <geo:isMountainOf> <<Obj>>
    <VCU> lies in the state of <Obj> →
      <geo:isMountainOf> <<Obj>>
    <VIU> is in <Obj> → <geo:isMountainOf> <<Obj>>
  • I.e., the ginseng:phrase annotation contributes
    "lies in" as a verbalization of the property,
    while ginseng:ignore keeps the raw identifier
    text out of the guided vocabulary

11
Evaluation
  • Dataset: NL benchmark (Tang and Mooney 2001)
  • 3 domains (US geography, jobs, restaurants)
  • 1770 questions (generated by undergraduates and
    a web interface)
  • Setup
  • Query reformulations with 30 randomly chosen
    questions
  • Precision/recall
  • SUS questionnaire (Brooke 1996)
  • 30 subjects (each used a pair of tools)

12
Evaluation Results
  • Compared with SQL, Ginseng was
  • faster (t-test, p = 1.8%)
  • rated as better integrated (t-test, p = 3.5%)
  • rated as easier to learn (t-test, p = 0.97%)
  • higher in overall SUS score (but not
    significantly)
  • Retrieval performance
  • Parsed 67.1% of appropriate queries (40% of all
    queries)
  • Precision: 92.8%
  • Recall: 98.4%

13
Limitations
  • Learning cost
  • Subject pool
  • Size
  • Population
  • Benchmark with other tools
  • PRECISE achieved comparable results (Popescu et
    al. 2003)
  • NLP vs. Ginseng