The Niagara Project - PowerPoint PPT Presentation

1
The Niagara Project
  • "I have avoided networking like the plague. I am
    terrified of getting [a connection] because it's
    like drinking from Niagara Falls."
  • - Arthur C. Clarke

2
Who is working on Niagara?
  • Professors: DeWitt and Naughton at UW, Maier at OGI
  • Students: Lots of them!
  • See http://www.cs.wisc.edu/niagara

3
Goals of the Niagara Project
  • In broadest terms, to:
  • improve the precision of Internet searching
  • allow queries over the whole Internet (the FROM
    clause)
  • work over streams as well as static files
  • monitor the Internet for changes
  • Not finished yet...

4
Current status
  • Completed three Java prototypes:
  • A text-in-context XML search engine.
  • An XML-QL query engine.
  • An XML-QL trigger engine.
  • Doing the same thing (again) in C, maybe with
    Quilt as query language.
  • Finding (solving?) interesting research problems
    along the way...

5
Text-in-Context XML SE
  • Rather than ask:
  • "What are all the documents that contain the
    string 'Montreal'?"
  • We can ask:
  • "What are all the documents that contain ship
    departure information for a ship whose name is
    'Montreal'?"

6
How it works
  • Locate documents by crawling the web or using
    explicit input from the user.
  • Build a local index on these docs that supports
    fast evaluation of Search Engine Query Language
    (SEQL) queries.
  • Return URLs of documents that satisfy SEQL
    queries.
  • Two uses: stand-alone, or as part of XML-QL
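The difference between a plain string search and a text-in-context search can be sketched as follows. This is only an illustration: the URLs, the documents, the element names, and the index layout are all assumptions, and the slides do not describe how the real SEQL index is built.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical documents standing in for crawled XML pages.
DOCS = {
    "http://example.org/ports.xml":
        "<departures><ship><name>Montreal</name>"
        "<date>2000-05-01</date></ship></departures>",
    "http://example.org/cities.xml":
        "<cities><city><name>Montreal</name></city></cities>",
}

def build_index(docs):
    """Map (enclosing tag, element tag, text) to the set of URLs containing it."""
    index = defaultdict(set)
    for url, text in docs.items():
        root = ET.fromstring(text)
        for parent in root.iter():
            for child in parent:
                if child.text and child.text.strip():
                    key = (parent.tag, child.tag, child.text.strip())
                    index[key].add(url)
    return index

def search(index, parent_tag, tag, value):
    """Return URLs where <tag> with the given text appears directly inside <parent_tag>."""
    return sorted(index.get((parent_tag, tag, value), set()))

INDEX = build_index(DOCS)

# Plain string search: every document mentioning "Montreal".
plain_hits = sorted(url for url, text in DOCS.items() if "Montreal" in text)

# Text-in-context search: only documents where a ship is named "Montreal".
ship_hits = search(INDEX, "ship", "name", "Montreal")
```

Here `plain_hits` contains both URLs, while `ship_hits` contains only the ship document, which is the gain in precision the slide is describing.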

7
XML-QL Query Engine
  • Evaluates queries expressed in XML-QL.
  • Result is XML
  • Different from the search engine:
  • Instead of asking "Find all files with ship
    departure events where the ship's name is
    Montreal,"
  • we can ask "What is a list of departure dates
    for ships named Montreal?"

8
Ex: Fragment of XML file...
Electrical Engineering

Robertson
Pedro
6988086 Robertson.Pedro@foo.edu 660 lty

9
XML-QL Query...
WHERE
  "Electrical Engineering"
  v2
  v3
  content_as v4
CONSTRUCT
  v4

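As a rough illustration of what evaluating such a query involves (match a pattern in the document, bind values, construct an XML result), here is a minimal Python sketch. It is not XML-QL and not the Niagara engine. The element names are borrowed from the search-engine query on slide 11, and the document, the second department, and the `result`/`person` output elements are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; only "Electrical Engineering", "Robertson" and
# "Pedro" come from the slides. Everything else is made up.
DOC = """
<university>
  <department>
    <deptname>Electrical Engineering</deptname>
    <faculty>
      <name><lastname>Robertson</lastname><firstname>Pedro</firstname></name>
    </faculty>
  </department>
  <department>
    <deptname>History</deptname>
    <faculty>
      <name><lastname>Doe</lastname><firstname>Jane</firstname></name>
    </faculty>
  </department>
</university>
"""

def faculty_names(xml_text, dept="Electrical Engineering"):
    """Build a <result> element listing faculty names for one department."""
    root = ET.fromstring(xml_text)
    result = ET.Element("result")
    for department in root.iter("department"):
        # WHERE-like step: keep only departments with the requested name.
        if department.findtext("deptname") != dept:
            continue
        for name in department.findall("faculty/name"):
            # CONSTRUCT-like step: emit one element per matched binding.
            person = ET.SubElement(result, "person")
            ET.SubElement(person, "lastname").text = name.findtext("lastname")
            ET.SubElement(person, "firstname").text = name.findtext("firstname")
    return result

answer = ET.tostring(faculty_names(DOC), encoding="unicode")
```

The point of the sketch is that the answer is itself XML built from the matched values, whereas the search engine only returns URLs.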
10
Important Question
  • Which documents should be consulted to answer an
    XML-QL query? We support three approaches:
  • explicitly listed documents (in foo.xml)
  • documents conforming to DTD (conforms to
    some_dtd.xml)
  • documents that satisfy search engine predicates
    extracted from query

11
Example of third approach
  • Given the previous XML-QL query finding first and
    last names of EE faculty members, the system will
    extract this Search Engine query:

department CONTAINS (deptname IS "Electrical Engineering"
    AND faculty CONTAINS name CONTAINS (lastname AND firstname))

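One way to picture the extraction step is as a walk over the element pattern in the query's WHERE clause. The sketch below assumes the pattern is already available as a small tree of `(tag, literal, children)` tuples. That representation and the `to_seql` function are hypothetical and are not taken from Niagara.

```python
def to_seql(node):
    """Turn a (tag, literal, children) pattern into a SEQL-style predicate string."""
    tag, literal, children = node
    if literal is not None:
        # Element whose text must equal a constant.
        return f'{tag} IS "{literal}"'
    if not children:
        # Element that only has to be present.
        return tag
    parts = [to_seql(child) for child in children]
    if len(parts) == 1:
        return f"{tag} CONTAINS {parts[0]}"
    return f"{tag} CONTAINS ({' AND '.join(parts)})"

# Pattern for "first and last names of EE faculty members".
PATTERN = (
    "department", None, [
        ("deptname", "Electrical Engineering", []),
        ("faculty", None, [
            ("name", None, [
                ("lastname", None, []),
                ("firstname", None, []),
            ]),
        ]),
    ],
)

seql = to_seql(PATTERN)
```

For this pattern, `seql` is the same predicate as the one shown on the slide. Variables bound in the query contribute only an "element must exist" condition, while constants become `IS` tests.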
12
Control Flow for Typical Query
  • So the full flow of a typical XML-QL query:
  • user submits XML-QL query
  • system extracts SEQL query from XML-QL, passes it
    to search engine
  • search engine evaluates SEQL query, returns list
    of URLs to XML-QL query engine
  • XML-QL engine fetches documents from URL list,
    evaluates query
  • Answer returned to the user.
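The steps above can be sketched as a small pipeline. Everything here is a stand-in: the in-memory `WEB` dictionary replaces real fetching, the "search engine" only checks that required tags are present, and the query is a plain dictionary rather than XML-QL. All of these names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory "web": URL -> XML text.
WEB = {
    "http://example.org/ee.xml":
        "<department><deptname>Electrical Engineering</deptname>"
        "<faculty><name><lastname>Robertson</lastname>"
        "<firstname>Pedro</firstname></name></faculty></department>",
    "http://example.org/ships.xml":
        "<ship><name>Montreal</name></ship>",
}

def extract_predicate(query):
    """Step 2: derive a (much simplified) search-engine predicate from the query."""
    return set(query["tags"])

def search_engine(required_tags):
    """Step 3: return URLs of documents that contain every required tag."""
    hits = []
    for url, text in WEB.items():
        tags = {element.tag for element in ET.fromstring(text).iter()}
        if required_tags <= tags:
            hits.append(url)
    return sorted(hits)

def evaluate(query, urls):
    """Step 4: fetch each candidate document and evaluate the full query."""
    rows = []
    for url in urls:
        root = ET.fromstring(WEB[url])
        for dept in root.iter("department"):
            if dept.findtext("deptname") == query["deptname"]:
                for name in dept.iter("name"):
                    rows.append((name.findtext("firstname"),
                                 name.findtext("lastname")))
    return rows

def run(query):
    """Steps 1 to 5: accept a query, narrow the documents, evaluate, answer."""
    urls = search_engine(extract_predicate(query))
    return urls, evaluate(query, urls)

QUERY = {
    "deptname": "Electrical Engineering",
    "tags": ["department", "deptname", "faculty",
             "name", "lastname", "firstname"],
}
urls, rows = run(QUERY)
```

The search engine acts as a cheap filter: only the documents it returns are fetched and handed to the more expensive query evaluation.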

13
XML-QL Trigger Engine
  • Goal:
  • allow users to define triggers on XML files
    using XML-QL predicates.
  • Scale to huge numbers of triggers by exploiting
    commonality among sets of triggers.
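The slides do not say how commonality is exploited. One simple form of it, shown here purely as an assumption, is to group triggers that share an identical predicate so that the predicate is checked once per update no matter how many users registered it. The `TriggerIndex` class and the user names are hypothetical.

```python
from collections import defaultdict

class TriggerIndex:
    """Group triggers by predicate so each distinct predicate is checked once."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # (tag, value) -> [user, ...]
        self.evaluations = 0                   # predicate checks performed

    def register(self, user, tag, value):
        """Register a trigger that fires when <tag> has the given value."""
        self._subscribers[(tag, value)].append(user)

    def on_update(self, facts):
        """facts: set of (tag, value) pairs found in the changed document."""
        notified = []
        for predicate, users in self._subscribers.items():
            self.evaluations += 1
            if predicate in facts:
                notified.extend(users)
        return sorted(notified)

triggers = TriggerIndex()
triggers.register("ann", "deptname", "Electrical Engineering")
triggers.register("bob", "deptname", "Electrical Engineering")
triggers.register("cy", "deptname", "History")

fired = triggers.on_update({("deptname", "Electrical Engineering")})
```

Three triggers are registered but only two distinct predicates are evaluated, so the work per update grows with the number of distinct predicates rather than with the number of users.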

14
Some research topics...
  • Semantics and impl. of queries over streams?
  • Use RDBMS for anything at all?
  • How smart should the search engine be?
  • Can you use caching anywhere?
  • Query optimization: plan space, stats?
  • How should you index (cached?) XML?
  • What do you do with queryable sources?
  • How do you handle huge numbers of triggers?
  • Performance, performance, performance.

15
A Petabyte in your Pocket
  • David DeWitt,
  • Dave Maier at OGI,
  • Jeff Naughton

16
Title of NSF ITR Project
  • What does it mean?
  • Goal is to have, available from a PDA, your
    evolving and customized view of all the on-line
    digital data that exists anywhere.
  • Goal is not to develop holographic memory
    technology or DNA-based storage units.

17
What the PetDB is
  • An example of what can be done with new software
    infrastructure termed Net Data Managers (NDMs).
  • NDMs:
  • focus on data movement as well as storage
  • store and query data of arbitrary types without a
    schema having been defined
  • execute queries and triggers over tens of
    thousands of information sites

18
Connection with Niagara...
  • Niagara is a very early prototype of a simple
    NDM.
  • Project goals:
  • continue developing Niagara and working on
    research problems that arise
  • prototype a simple NDM application using Niagara
    to see if we are on the right track

19
For more information...
  • Web site: http://www.cs.wisc.edu/niagara
  • Talk to me or any other Niagara project member...