SWSE: Semantic Web Search Engine

SWSE is a web search application that provides
answers before links, allowing casual users to find
the exact information they desire with minimal
effort. Current search engines do not exploit
available structured data, and mainly index flat
text documents. They do not allow for complex
queries to be posed, such as "give me all
co-authors of Tim Berners-Lee" or "show me all of
the pictures of people I know". SWSE exploits
existing and emerging structured data to index
and provide search and browsing over a large
corpus. Currently the corpus consists of data
retrieved and converted to RDF from large
repositories (such as DBLP, CiteSeer, DMOZ, IMDb,
SwissProt, Wikipedia, etc.) and also from the web
(such as HTML, XML, e.g. RSS and podcasts, and
RDF, e.g. FOAF, DOAP, SIOC, etc.). Currently we
index over 90 GB of raw data, amounting to about
700M quads or statements. Data is cleansed,
converted and merged through use of object
consolidation. SWSE utilises YARS2 (Yet Another
RDF Store) to index the data retrieved. YARS2 is
a scalable, distributed store for RDF, and offers
keyword query and complex graph-based query
functionality through an HTTP interface. The SWSE
user interface boasts a compact and intuitive
user interaction model (modelled to exploit
users' experience in using current web search
interfaces) to allow casual users to find the exact
information they require with minimal effort.
Users begin by specifying a keyword query, and
from there can incrementally build complex
queries using guided exploration provided by the
nodebrowser compass which ensures results at
each step. Results are initially serialised in a
ranked results listing. Each result can then be
clicked to retrieve its details. Alternatively,
users can browse the data graph to explore
social networks and, more generally, traverse the
information through available relationships
between resources.
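
The object consolidation step mentioned above can be illustrated with a small sketch. The poster does not say how SWSE decides that two instances are equivalent, so the rule used here is an assumption: two subjects are merged when they share a value for a property treated as identifying (foaf:mbox in the example). The names Quad and consolidate, and the example identifiers, are hypothetical.

```python
from typing import NamedTuple


class Quad(NamedTuple):
    subject: str
    predicate: str
    obj: str
    context: str  # source the statement was retrieved from


def consolidate(quads, identifying_predicates):
    """Rewrite identifiers so that equivalent instances share one canonical id.

    Assumption: subjects with the same value for an identifying predicate
    denote the same instance. This is an illustrative rule, not SWSE's.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the lexicographically smaller id as the canonical one.
            if rb < ra:
                ra, rb = rb, ra
            parent[rb] = ra

    first_seen = {}  # (predicate, value) -> first subject seen with it
    for q in quads:
        if q.predicate in identifying_predicates:
            key = (q.predicate, q.obj)
            if key in first_seen:
                union(first_seen[key], q.subject)
            else:
                first_seen[key] = q.subject

    # The context is left untouched, so each statement stays traceable.
    return [Quad(find(q.subject), q.predicate, find(q.obj), q.context)
            for q in quads]


example = [
    Quad("ex:a", "foaf:mbox", "mailto:tim@example.org", "http://example.org/dblp"),
    Quad("ex:a", "foaf:name", "Tim", "http://example.org/dblp"),
    Quad("ex:b", "foaf:mbox", "mailto:tim@example.org", "http://example.org/foaf"),
    Quad("ex:c", "foaf:knows", "ex:b", "http://example.org/foaf"),
]
merged = consolidate(example, {"foaf:mbox"})
```

After the call, statements about ex:b from the second source are attached to ex:a, while the fourth element of each quad still records where the statement came from.
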
  • Data retrieval via download of large repositories
    or crawling the web
  • Data converted to RDF using XSLT or specialised
    conversion code, with ontologies created to
    represent new schema
  • Object consolidation performed on dataset to
    merge data from different sources on equivalent
    instances
  • Data stored in a distributed YARS2 installation
    on 16 servers
  • YARS2 stores RDF in quads, an augmentation of
    N-Triples adding context to make sources of data
    traceable.
  • YARS2 uses Lucene to support keyword queries and
    stores data on-disk in sorted, compressed,
    blocked files to support complex queries.
  • YARS2 uses an in-memory sparse referencing index
    to accelerate data access from on-disk files.
  • Queries can be posed via an HTTP interface. YARS2
    supports a subset of SPARQL querying.
  • The SWSE user interface takes user queries and
    converts them to queries answerable by YARS2.
  • The UI then retrieves the results and serialises
    them to RDF/XML.
  • An XSLT stylesheet is provided to the client to
    transform the RDF/XML results to an HTML
    serialisation.
  • Initially, a user begins with a keyword query.
  • A user can click on any result to get a details
    view, giving all the available info for a result.
  • Users can also restrict the results by the type of
    result they are looking for, e.g. Person, Movie,
    Document, Publication, etc.
  • Once results have been restricted to a particular
    type, or if the details view has been selected, a
    list of available inlinking and outlinking
    relationships is offered for navigation of the
    results graph.
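
The list above mentions sorted, compressed, blocked files with an in-memory sparse index on top. A minimal sketch of that general idea follows. It is not YARS2's actual file layout: the class name SparseIndex, the JSON plus zlib block encoding, and the block size are all assumptions made for illustration, and the "on-disk" blocks are simply held as compressed byte strings.

```python
import bisect
import json
import zlib


class SparseIndex:
    """Sorted quads split into compressed blocks, with one in-memory key per block."""

    def __init__(self, sorted_quads, block_size=3):
        self.first_keys = []  # in memory: the first quad of each block
        self.blocks = []      # stands in for compressed blocks on disk
        for start in range(0, len(sorted_quads), block_size):
            block = [list(q) for q in sorted_quads[start:start + block_size]]
            self.first_keys.append(tuple(block[0]))
            self.blocks.append(zlib.compress(json.dumps(block).encode("utf-8")))

    def _read_block(self, i):
        # Decompress and decode a single block only when it is needed.
        return [tuple(q) for q in json.loads(zlib.decompress(self.blocks[i]))]

    def lookup_prefix(self, prefix):
        """Return all quads whose leading terms equal `prefix`."""
        prefix = tuple(prefix)
        n = len(prefix)
        # Matches may begin in the block just before the first block whose
        # first key is >= prefix, so start one block earlier.
        start = max(bisect.bisect_left(self.first_keys, prefix) - 1, 0)
        results = []
        for i in range(start, len(self.blocks)):
            for quad in self._read_block(i):
                head = quad[:n]
                if head == prefix:
                    results.append(quad)
                elif head > prefix:
                    return results  # sorted order: no later quad can match
        return results


quads = sorted([
    ("ex:a", "p", "1", "c1"),
    ("ex:a", "p", "2", "c1"),
    ("ex:b", "p", "1", "c2"),
    ("ex:b", "q", "1", "c2"),
    ("ex:b", "q", "2", "c1"),
    ("ex:c", "p", "1", "c3"),
    ("ex:d", "p", "1", "c3"),
])
index = SparseIndex(quads)
about_b = index.lookup_prefix(("ex:b",))
```

Only the first key of every block is kept in memory, so the index stays small relative to the data, and a lookup decompresses just the few blocks that can contain matches.
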

Search and explore today's Semantic Web at
http://swse.deri.org/
DERI Galway: Andreas Harth, Aidan Hogan, Jürgen
Umbrich
P 353 91 495006 andreas.harth_at_deri.org
aidan.hogan_at_deri.org