Generating Rewrite Rules by Browsing RDF Data

1
Generating Rewrite Rules by Browsing RDF Data
  • Ora Lassila, Nokia Research Center, Cambridge,
    MA 02142, USA
  • Presented by Neha Majithia

2
Overview
  • Introduction
  • OINK
  • WILBUR
  • A Path Query Language
  • Rewriting path queries
  • Rules
  • Reasoning
  • Generating Rewrite Rules
  • Query by Browsing
  • Query using Virtual Properties
  • Related Work
  • Conclusions and Further Development

3
Introduction
  • The Semantic Web encompasses the application of
    knowledge representation in the context of the
    World Wide Web.
  • The emphasis is on machine-based interpretation
    of data and automatic processing.
  • This paper takes a different approach: the
    Semantic Web can be visualized and presented to
    human users for browsing, exploration, querying,
    etc.
  • The emergence of tools like Piggy Bank showed
    the benefits of end-user access to the Semantic
    Web.

4
  • OINK is used to browse RDF data. RDF data,
    structurally a graph, can be presented to the
    user as hypertext.
  • The path the user takes through the graph can
    be used as a template for finding other items
    of interest.
  • The user's path is interpreted as a query in a
    path query language. Realistic queries can be
    derived from the user's navigational paths.
  • Automatic query generation is better than forcing
    the user to learn a query language.
  • No user or usability studies of the automatic
    query generation mechanism have been conducted
    yet.

5
OINK
  • OINK stands for Open Integration of Networked
    Knowledge.
  • OINK is a Semantic Web browser: it allows users
    to access RDF data via browsing.
  • Web-browser-based user interface.
  • Queries data in an RDF triple store and allows
    the data to be viewed in an intuitive way.
  • Allows simple integration of data.
  • OINK is built on top of WILBUR, is implemented in
    Common Lisp, and loads almost 3,000 triples/sec.
  • Has no knowledge of schemata; it knows only the
    RDF metamodel.
  • OINK allows users to build queries in the
    WILBURQL path query language merely by browsing
    the data.
  • Each node in the RDF graph is a web page.

7
WILBUR
  • Nokia Research Center's open-source toolkit for
    RDF, written in Common Lisp.
  • Generic Semantic Web toolkit (reasoning,
    querying, etc.).
  • Offers an API for manipulating RDF data.
  • Has a path-based query language, WILBURQL.
  • WILBUR allows manipulation of RDF graphs.
  • Currently the system is being used as a debugging
    tool when building Semantic Web applications.
  • Users define rewrite rules: rules used to add
    virtual RDF properties and to augment queries by
    rewriting them before the actual query engine
    processes them.
  • The RDF storage system uses query rewriting to
    implement an RDF reasoner; users see the
    underlying stored RDF documents as a single
    graph.

8
A Path Query Language
  • Queries are expressed as path patterns that match
    paths between root nodes and the members of the
    result set.
  • The query engine is invoked via the access
    function A(v, q, G), where v is a root node
    (i.e., a starting node of the query), q is a
    query expression, and G is the underlying graph
    database.
  • Path expressions are either atomic or complex.
  • Atomic path expression: either an RDF property
    or a special token known to the query engine.
  • Complex path expression: has the form
    op(e1, ..., en), where op is an operator and
    e1, ..., en are themselves path expressions.

9
  • The operator op is one of the following, known
    to the query engine:
  • Sequence (concatenation): seq(e1, ..., en)
    matches a sequence of n steps in the graph,
    consisting of subexpressions e1, ..., en.
  • Disjunction (alternation): or(e1, ..., en)
    matches any one of subexpressions e1, ..., en.
    The subexpressions are matched in the order
    they are specified.
  • Repetition (closure): rep(e) matches the
    transitive closure of subexpression e.

10
  • Inverse: inv(e) is satisfied when the path
    defined by the subexpression e is matched in
    the reverse direction.
  • Default value: the expression value(r) inserts
    the value r into the result set of the query.
    This can be used to specify default values,
    using the pattern or(q, value(r)). This query
    returns at least one value, r, in its result
    set, irrespective of q.
  • Restriction: restrict(e) matches the value e.
    For example, the query expression seq(e1,
    restrict(e2), e3) matches whatever seq(e1, e3)
    would match, but only if the node after the
    step e1 equals e2.
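
The operators described so far can be illustrated with a small evaluator. This is only a sketch in Python, not WILBUR's implementation (which is written in Common Lisp): the names access and _step, the tuple encoding of expressions, and the sample data are all assumptions made for illustration. It also assumes that rep(e) allows zero or more steps, that value(r) contributes r only when the current node set is non-empty, and that inv is applied to atomic properties only. Results are plain sets, so the matching order of or is not preserved.

```python
def access(v, q, graph):
    """Sketch of A(v, q, G): nodes reachable from root v via expression q.

    graph is a set of (subject, predicate, object) triples. An expression is
    either a property name (str) or a tuple (operator, arg1, ..., argn).
    """
    return _step({v}, q, graph)


def _step(nodes, q, graph):
    if isinstance(q, str):  # atomic expression: follow property q forwards
        return {o for (s, p, o) in graph if p == q and s in nodes}
    op, *args = q
    if op == "seq":  # each subexpression continues from the previous result
        for e in args:
            nodes = _step(nodes, e, graph)
        return nodes
    if op == "or":  # union of all alternatives
        result = set()
        for e in args:
            result |= _step(nodes, e, graph)
        return result
    if op == "rep":  # closure; zero steps allowed (assumption)
        seen = set(nodes)
        frontier = set(nodes)
        while frontier:
            frontier = _step(frontier, args[0], graph) - seen
            seen |= frontier
        return seen
    if op == "inv":  # follow an atomic property backwards
        if not isinstance(args[0], str):
            raise NotImplementedError("sketch inverts atomic properties only")
        return {s for (s, p, o) in graph if p == args[0] and o in nodes}
    if op == "value":  # inject a constant into the result set
        return {args[0]} if nodes else set()
    if op == "restrict":  # keep only the node equal to the given value
        return nodes & {args[0]}
    raise ValueError(f"unknown operator: {op}")


# Hypothetical sample data: two feeds, two home pages, one title.
GRAPH = {
    ("feed1", "rdf:type", "rss:channel"),
    ("feed2", "rdf:type", "rss:channel"),
    ("home1", "rdfs:seeAlso", "feed1"),
    ("home2", "rdfs:seeAlso", "feed2"),
    ("home1", "dc:title", "Blog One"),
}

with_default = access("home2", ("or", "dc:title", ("value", "untitled")), GRAPH)
only_feed1 = access(
    "rss:channel",
    ("seq", ("inv", "rdf:type"), ("restrict", "feed1"), ("inv", "rdfs:seeAlso")),
    GRAPH,
)
```

Here with_default holds the default because home2 has no title, and only_feed1 holds just the home page that points at feed1.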

11
  • Traversal to property nodes: two special tokens
    can be used in place of any property in a query
    expression.
  • predicate-of-subject allows traversal through a
    node representing a reified statement in RDF,
    even if this node does not actually exist in
    the graph, from the subject of the statement to
    the node naming its predicate.
  • Similarly, predicate-of-object allows traversal
    from the object of the statement.

12
Rewriting Path Queries
  • WILBUR's RDF(S) reasoner is implemented by
    rewriting access queries. Rewriting recursively
    substitutes every occurrence of an RDF property
    that has a semantic theory (rdf:type,
    rdfs:subClassOf and rdfs:subPropertyOf) with a
    more complex query expression; these
    substitutions loosely resemble backward-chaining
    rules.
  • Rewritten queries create a view into the
    underlying graph database that contains the
    RDF(S) closure of the original graph.

13
Using Rules to Rewrite Path Queries
  • Rules take the general form p → q,
  • where p is an atomic expression of the query
    language (i.e., something that could name an RDF
    property), and
  • q is an arbitrarily complex query expression
    that does not contain p.
  • The rule engine applies the rewrite rules until
    no rule applies.
  • The resulting query is passed to the query
    engine, which computes and retrieves the result
    set.

14
  • Norewrite
  • An expression q may contain subexpressions of
    the form norewrite(r), where r is an atomic
    expression (and may be p itself).
  • The rule engine does not try to rewrite such a
    subexpression.
  • The query engine ignores the wrapper, so that
    norewrite(p) ≡ p.
  • We can augment an RDF property p with the rule
    p → or(norewrite(p), q).
  • We can give p the default value r with the rule
    p → or(norewrite(p), value(r)).
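
A rule engine of this kind can be sketched in a few lines of Python. This is an illustration under stated assumptions, not WILBUR's code: the function name rewrite, the dict that maps each p to its q, and the tuple encoding of expressions are all hypothetical. The sketch strips the norewrite wrapper during rewriting instead of leaving it for the query engine to ignore, which gives the same final query because a shielded subexpression is never visited again. It relies on rule bodies mentioning p only inside norewrite; mutually recursive rules would not terminate.

```python
def rewrite(q, rules):
    """Apply rewrite rules of the form p -> q until no rule applies.

    rules maps an atomic expression (a property name) to its replacement.
    Expressions are property names (str) or tuples (operator, arg1, ...).
    """
    if isinstance(q, str):
        if q in rules:
            # The replacement may itself contain rewritable properties.
            return rewrite(rules[q], rules)
        return q
    op, *args = q
    if op == "norewrite":
        # Shielded from the rules; the wrapper itself carries no meaning.
        return args[0]
    if op in ("value", "restrict"):
        # Arguments are node values, not properties: leave untouched.
        return q
    return (op, *[rewrite(e, rules) for e in args])


# Hypothetical rule: give dc:title a default value of "untitled".
rules = {"dc:title": ("or", ("norewrite", "dc:title"), ("value", "untitled"))}

query = ("seq", ("inv", "rdfs:seeAlso"), "dc:title")
rewritten = rewrite(query, rules)
```

After rewriting, the dc:title step in query has been replaced by or(dc:title, value(untitled)), while the inv(rdfs:seeAlso) step is unchanged.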

15
Reasoning as Path Query Rewriting
  • Norewrite can be used to state that all
    instances of a particular RDF class are also
    instances of every superclass of that class:
  • t → seq(norewrite(t), rep(s))
  • where t = rdf:type and s = rdfs:subClassOf.
  • The subclass rule would be
  • s → rep(norewrite(s))
  • The full form of the rule is
  • t → or(seq(norewrite(t), s),
           seq(predicate-of-object, rdfs:range, s),
           seq(predicate-of-subject, rdfs:domain, s),
           value(rdfs:Resource))
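
As a rough illustration of what the first rule, seq(norewrite(t), rep(s)), computes, the Python sketch below collects the types of a node directly from a set of triples. The function name types_of and the sample data are hypothetical, and rep is assumed to allow zero steps so that directly asserted types are included. The domain, range and rdfs:Resource branches of the full rule are not covered.

```python
def types_of(node, triples):
    """Direct rdf:type values of node plus all their superclasses."""
    # norewrite(t): the directly asserted types, taken as stored.
    result = {o for (s, p, o) in triples if s == node and p == "rdf:type"}
    # rep(s): follow rdfs:subClassOf until no new class is found.
    frontier = set(result)
    while frontier:
        frontier = {
            o for (s, p, o) in triples
            if p == "rdfs:subClassOf" and s in frontier
        } - result
        result |= frontier
    return result


# Hypothetical data: a blog feed class below rss:channel.
TRIPLES = {
    ("feed1", "rdf:type", "ex:BlogFeed"),
    ("ex:BlogFeed", "rdfs:subClassOf", "rss:channel"),
    ("rss:channel", "rdfs:subClassOf", "ex:Feed"),
}

feed_types = types_of("feed1", TRIPLES)
```

With this data, feed_types contains the asserted class and both of its superclasses.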

16
  • With a few exceptions, computation of the entire
    RDF closure can be defined by these rules.
  • In the current implementation of WILBUR,
    subproperties are handled by a special rewrite
    rule that cannot be specified in the simple
    rule language.
  • For a property p0 with subproperties p1, ...,
    pn, this rewrite rule would be
  • p0 → or(p0, p1, ..., pn)
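
One way such a rule could be built from the data is sketched below in Python. The function name subproperty_rule and the sample triples are hypothetical, and the sketch assumes that indirect subproperties (subproperties of subproperties) are included; the slides do not say how WILBUR collects them.

```python
def subproperty_rule(p0, triples):
    """Build the expression or(p0, p1, ..., pn) for the subproperties of p0."""
    seen = {p0}
    frontier = [p0]
    while frontier:
        current = frontier.pop()
        for (s, p, o) in triples:
            if p == "rdfs:subPropertyOf" and o == current and s not in seen:
                seen.add(s)
                frontier.append(s)
    # Sort the subproperties so the generated rule is deterministic.
    return ("or", p0, *sorted(seen - {p0}))


# Hypothetical schema: ex:handle < ex:nick < ex:name.
SCHEMA = {
    ("ex:nick", "rdfs:subPropertyOf", "ex:name"),
    ("ex:handle", "rdfs:subPropertyOf", "ex:nick"),
    ("ex:age", "rdfs:subPropertyOf", "ex:attr"),
}

name_rule = subproperty_rule("ex:name", SCHEMA)
```

A query for ex:name would then also match values stored under ex:nick and ex:handle.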

17
Generating Rewrite Rules
  • The system allows user interaction with RDF
    data.
  • OINK has been developed as a platform for
    building customized solutions for browsing
    complex data.
  • The basis for enabling user-defined rules is
    automatic query generation in OINK.
  • Example: OINK visualizing rss:channel from the
    RSS schema, the class used to represent
    syndicated news feeds such as blogs.
  • OINK shows both inbound and outbound edges, as
    it relies only on the RDF metamodel.
  • Everything on the page is clickable and can be
    navigated to.

19
Query by Browsing
  • Automatic query generation.
  • The user navigates through the RDF data using
    OINK.
  • The path taken provides the basis of a query
    expressed in WILBURQL.
  • The steps taken to automatically generate a query
    are as follows:
  • 1. From the rss:channel instance, we navigate to
    the home page URL of the blog in question (home
    pages are associated with the feed via the
    rdfs:seeAlso property).
  • 2. The home page URL is associated with a
    dc:title property, giving the human-readable
    title of the blog.
  • 3. Invoking a query based on the path we have
    taken from the rss:channel class node, we get
    the list of the titles of all the blogs
    currently known to the system. The
    corresponding query is
  • seq(inv(rdf:type),
        inv(rdfs:seeAlso),
        dc:title)
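
The step from a navigation path to a query can be sketched as follows. This is a guess at the general idea in Python, not OINK's code: the names path_to_query and render, and the representation of a browsing step as a (property, direction) pair, are assumptions. An edge followed against its direction becomes an inv step.

```python
def path_to_query(steps):
    """Turn a browsing path into a path query expression.

    steps is a list of (property, direction) pairs, where direction is
    "out" for an edge followed forwards and "in" for one followed backwards.
    """
    if not steps:
        raise ValueError("a query needs at least one navigation step")
    exprs = [prop if direction == "out" else ("inv", prop)
             for prop, direction in steps]
    return exprs[0] if len(exprs) == 1 else ("seq", *exprs)


def render(q):
    """Print an expression in the op(e1, ..., en) notation used above."""
    if isinstance(q, str):
        return q
    op, *args = q
    return f"{op}({', '.join(render(a) for a in args)})"


# The walk described above, starting from the rss:channel class node.
walk = [("rdf:type", "in"), ("rdfs:seeAlso", "in"), ("dc:title", "out")]
query_text = render(path_to_query(walk))
```

For this walk, query_text is seq(inv(rdf:type), inv(rdfs:seeAlso), dc:title), the same query as the one shown above.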

21
Using Virtual Properties
  • Rewrite rules allow the values of new properties
    to be computed rather than stored in the
    underlying database.
  • For example, define a new property for blog RSS
    profiles, e.g. ex:title, with the following
    rule:
  • ex:title → seq(inv(rdfs:seeAlso),
                   dc:title)
  • WILBUR provides a class wilbur:PathRewriteRule
    as a subclass of rdf:Property.
  • Virtual properties can have rdfs:domain
    definitions: given a virtual property p with
    domain d, OINK uses p whenever visualizing an
    instance of the class d.
  • Work is currently under way on a user interface,
    as part of OINK, that allows users to add the
    domain.
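
The domain-based selection can be sketched in Python as below. Everything here is hypothetical: the VirtualProperty record, the applicable function, and the choice of rss:channel as the domain of ex:title are assumptions for illustration, and only directly asserted types are checked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VirtualProperty:
    name: str         # the new property, e.g. ex:title
    expansion: tuple  # the query expression it rewrites to
    domain: str       # class whose instances should show the property


def applicable(instance, triples, virtual_properties):
    """Names of virtual properties whose domain is a type of the instance."""
    types = {o for (s, p, o) in triples if s == instance and p == "rdf:type"}
    return [vp.name for vp in virtual_properties if vp.domain in types]


# Hypothetical setup: ex:title is shown on every rss:channel instance.
EX_TITLE = VirtualProperty(
    name="ex:title",
    expansion=("seq", ("inv", "rdfs:seeAlso"), "dc:title"),
    domain="rss:channel",
)
DATA = {
    ("feed1", "rdf:type", "rss:channel"),
    ("home1", "rdfs:seeAlso", "feed1"),
    ("home1", "dc:title", "Blog One"),
}

shown_for_feed = applicable("feed1", DATA, [EX_TITLE])
shown_for_home = applicable("home1", DATA, [EX_TITLE])
```

In this setup the feed page would display ex:title, while the home page, which has no asserted type, would not.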

22
Related Work
  • Rewriting has been used as a general
    computational vehicle, e.g. for transforming and
    optimizing programming languages.
  • It has been used as a customization mechanism
    and in performing service discovery and
    composition.
  • It has been used in query systems for
    semi-structured data, providing augmented or
    federated views of databases.
  • The rewrite rules here resemble macro
    expansion: they do not contain any variables or
    other types of patterns that require matching.

23
Conclusion
  • WILBURQL is used because it is naturally suited
    to dealing with paths.
  • The simple nature of the language makes
    rewriting easy.
  • Certain other features of the underlying WILBUR
    toolkit, such as the reasoner, already use
    WILBURQL for their implementation.
  • Further Work
  • The query generation mechanism is work in
    progress (no formal user testing had been
    conducted at the time of writing).
  • Further customization features are planned,
    taking the system towards a platform for
    developing browsers/viewers for complex data.