RGMA : a new API - PowerPoint PPT Presentation

Provided by: Werne67

Transcript and Presenter's Notes

1
R-GMA: a new API
Andy Cooke / Heriot-Watt University <ceeawc_at_macs.hw.ac.uk>
2
A new improved API

Why did we need to improve R-GMA's API?
The old API had flaws:
- Local buffers, so can't guarantee not to lose tuples.
- Static queries: execute() didn't make sense.
- Databases couldn't be cleaned.
My work since October:
- Started implementing some of the new API.
- Lots of servlet refactoring (at last!).
- Two new producers: LatestProducer and ResilientStreamProducer.

3
(No Transcript)
4
API Features: APIBase

disconnect() / reconnect()
- For APIs that are used infrequently: their machines can now be switched off!
setAutoInsertTimeStampEnabled()
- Every tuple must have a timestamp, but what does it mean: the time the tuple was inserted, or the time the measurement was made?
setTerminationInterval() and showSignOfLife()
- The API must send heart-beats to its servlet in order to stay alive, or to stay registered (GRRP protocols).
setTupleChecking()
- In case we don't trust that the tuples are correct!
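
The heart-beat idea behind setTerminationInterval() and showSignOfLife() can be sketched roughly as below. This is a minimal Python illustration, not R-GMA's implementation: the class HeartbeatRegistry and all of its method names are hypothetical, and the current time is passed in explicitly so the behaviour is easy to follow.

```python
class HeartbeatRegistry:
    """Hypothetical servlet-side table of API instances and their heart-beats."""

    def __init__(self):
        # instance id -> (termination interval in seconds, time of last heart-beat)
        self._instances = {}

    def register(self, instance_id, termination_interval, now):
        self._instances[instance_id] = (termination_interval, now)

    def show_sign_of_life(self, instance_id, now):
        """Record a heart-beat; returns False if the instance is unknown."""
        if instance_id not in self._instances:
            return False
        interval, _ = self._instances[instance_id]
        self._instances[instance_id] = (interval, now)
        return True

    def expire(self, now):
        """Drop every instance whose last heart-beat is older than its interval."""
        dead = [i for i, (interval, last) in self._instances.items()
                if now - last > interval]
        for i in dead:
            del self._instances[i]
        return sorted(dead)

    def is_alive(self, instance_id):
        return instance_id in self._instances


registry = HeartbeatRegistry()
registry.register("producer-1", termination_interval=30, now=0)
registry.register("producer-2", termination_interval=30, now=0)
registry.show_sign_of_life("producer-1", now=25)   # producer-1 keeps itself alive
expired = registry.expire(now=40)                  # producer-2 has been silent for 40 s
```

An instance that was disconnected on purpose would simply stop sending heart-beats and be dropped once its interval has passed.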

5
API Features: Declarable, Insertable, Cleanable

Declarable: declareTable() / undeclareTable()
- Declarables are publishers that register their views.
- Note: it's possible (soon!) for declarables to introduce new tables to the schema.
Insertable: insert()
- Insertables are stream publishers that insert streams of tuples.
- These tuples take the form INSERT INTO cpuLoad VALUES (...)
- A vector of tuples may now be inserted in one go: if the method returns, then R-GMA's servlet received them safely!
Cleanable: declareTable(..., cleanUpPredicate, cleanUpInterval)
- Cleanables are stream publishers connected to servlets that store tuples locally in a database: DatabaseProducers (history producers) and LatestProducers.
- The servlet starts a thread that cleans the database periodically according to the policy.
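
A rough sketch of the batch insert and the clean-up policy, under stated assumptions: it uses Python's built-in sqlite3 module in place of the servlet's database, the cpuLoad columns (host, value, ts) are invented for illustration, and the helper names are hypothetical. Only a single clean-up pass is shown; the servlet would repeat it every cleanUpInterval on its own thread.

```python
import sqlite3


def insert_tuples(conn, rows):
    """Insert a batch in one transaction: either all rows are stored or none."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany("INSERT INTO cpuLoad VALUES (?, ?, ?)", rows)


def clean_up(conn, predicate, params=()):
    """One clean-up pass: delete the rows matching the clean-up predicate."""
    with conn:
        cur = conn.execute(f"DELETE FROM cpuLoad WHERE {predicate}", params)
    return cur.rowcount


conn = sqlite3.connect(":memory:")
# Hypothetical schema: the slides do not list cpuLoad's columns.
conn.execute("CREATE TABLE cpuLoad (host TEXT, value REAL, ts INTEGER)")

insert_tuples(conn, [("a", 0.5, 100), ("b", 0.7, 160), ("a", 0.9, 200)])

# Clean-up predicate: discard tuples older than timestamp 150.
removed = clean_up(conn, "ts < ?", (150,))
remaining = conn.execute("SELECT host, ts FROM cpuLoad ORDER BY ts").fetchall()
```

If insert_tuples() returns without raising, the whole vector has been committed, which mirrors the "if the method returns, the servlet received them" guarantee.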

6
API Features: Archivers and Consumers

Consumer
- Three query types are supported: history, continuous and latest snapshot.
- Answers are returned as a stream.
Archiver
- An Archiver is a republisher that poses a continuous query and publishes the answer.
- Archivers can now be constructed with an Insertable:
  StreamProducer, for answering stream queries
  DatabaseProducer, for answering history queries
  LatestProducer, for answering latest queries
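
To make the difference between the three query types concrete, here is a small in-memory sketch. It rests on assumptions: TupleStore and its methods are hypothetical names, a continuous query is modelled as a callback that sees tuples inserted after it was posed, and "latest" is taken to mean the most recent tuple per key.

```python
class TupleStore:
    """Hypothetical in-memory stand-in for a producer's tuples."""

    def __init__(self, key_field):
        self.key_field = key_field
        self.history = []        # every tuple ever inserted, in arrival order
        self.subscribers = []    # callbacks registered by continuous queries

    def insert(self, tup):
        self.history.append(tup)
        for callback in self.subscribers:
            callback(tup)        # push the new tuple to each continuous query

    def history_query(self, predicate):
        return [t for t in self.history if predicate(t)]

    def latest_query(self, predicate):
        latest = {}
        for t in self.history:   # later tuples overwrite earlier ones per key
            latest[t[self.key_field]] = t
        return [t for t in latest.values() if predicate(t)]

    def continuous_query(self, predicate, sink):
        def deliver(t):
            if predicate(t):
                sink.append(t)
        self.subscribers.append(deliver)


store = TupleStore("host")
streamed = []
store.insert({"host": "a", "value": 0.5, "ts": 1})
store.continuous_query(lambda t: t["value"] > 0.6, streamed)
store.insert({"host": "b", "value": 0.7, "ts": 2})
store.insert({"host": "a", "value": 0.9, "ts": 3})

history = store.history_query(lambda t: t["host"] == "a")
latest = store.latest_query(lambda t: True)
```

An Archiver in this picture would be a continuous query whose sink is itself an Insertable.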

7
New Producer: LatestProducer

- a stream producer that supports latest snapshot queries
- offers up-to-date values for each primary key (previously, R-GMA tables had no primary keys).
Implementation
- When declareTable() is called, the servlet creates a new MySQL database containing that table.
- When a tuple is received, the servlet first tries to update the table. If this fails, the tuple is inserted.
- Snapshot queries are simply passed on to the database.
Pros / Cons
+ Possibility of processing join queries locally.
+ Can handle huge numbers of tuples (servlets probably couldn't).
- The tuples aren't available to continuous or history queries!
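
The update-then-insert step can be sketched as follows. This is an illustration only: it uses Python's built-in sqlite3 module instead of MySQL so that it runs without a server, the cpuLoad columns are invented, and store_latest() is a hypothetical helper name.

```python
import sqlite3


def store_latest(conn, host, value, ts):
    """Try UPDATE first; fall back to INSERT when no row has this primary key."""
    with conn:  # one transaction per tuple
        cur = conn.execute(
            "UPDATE cpuLoad SET value = ?, ts = ? WHERE host = ?",
            (value, ts, host))
        if cur.rowcount == 0:  # nothing updated, so the key is new
            conn.execute(
                "INSERT INTO cpuLoad (host, value, ts) VALUES (?, ?, ?)",
                (host, value, ts))


conn = sqlite3.connect(":memory:")
# Hypothetical schema with host as the primary key.
conn.execute(
    "CREATE TABLE cpuLoad (host TEXT PRIMARY KEY, value REAL, ts INTEGER)")

store_latest(conn, "a", 0.5, 100)
store_latest(conn, "b", 0.7, 110)
store_latest(conn, "a", 0.9, 120)   # overwrites the earlier tuple for host "a"

# A latest snapshot query is passed straight to the database.
snapshot = conn.execute(
    "SELECT host, value, ts FROM cpuLoad ORDER BY host").fetchall()
```

Because each key holds exactly one row, the earlier tuple for host "a" is gone after the third call, which is why such a table cannot serve history queries.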

8
New Producer: ResilientStreamProducer

- a StreamProducer that can answer continuous queries, and is resilient to crashes.
- The servlet keeps a log of changes made to a producer's state (by serializing a Command object).
- Periodically (when?!) snapshots are taken, and the InstanceTracker (a hash map of producers kept by the servlet) is serialized and stored on disc.
Recovering from Failure
- When the servlet restarts, the last snapshot of the InstanceTracker is retrieved, and the logged state changes are re-applied.
- Then the registry is consulted, and any producers that had timed out are re-registered.
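
The snapshot-plus-log recovery scheme can be sketched as below. The assumptions are considerable: Command carries a single invented kind of state change (declaring a table), JSON strings held in memory stand in for Java serialization to disc, and the Servlet class and its methods are hypothetical names. The registry re-registration step is not modelled.

```python
import json


class Command:
    """Hypothetical state change: a producer declares a table."""

    def __init__(self, producer_id, table):
        self.producer_id = producer_id
        self.table = table

    def apply(self, tracker):
        tracker.setdefault(self.producer_id, []).append(self.table)

    def serialize(self):
        return json.dumps({"producer_id": self.producer_id, "table": self.table})

    @staticmethod
    def deserialize(raw):
        return Command(**json.loads(raw))


class Servlet:
    def __init__(self):
        self.tracker = {}                        # stand-in for the InstanceTracker
        self.snapshot = json.dumps(self.tracker)
        self.log = []                            # commands since the last snapshot

    def execute(self, command):
        self.log.append(command.serialize())     # log first, then apply
        command.apply(self.tracker)

    def take_snapshot(self):
        self.snapshot = json.dumps(self.tracker)
        self.log.clear()                         # older commands are now redundant

    @classmethod
    def recover(cls, snapshot, log):
        servlet = cls()
        servlet.tracker = json.loads(snapshot)   # restore the last snapshot
        servlet.snapshot = snapshot
        servlet.log = list(log)
        for raw in log:                          # re-apply logged state changes
            Command.deserialize(raw).apply(servlet.tracker)
        return servlet


servlet = Servlet()
servlet.execute(Command("p1", "cpuLoad"))
servlet.take_snapshot()
servlet.execute(Command("p2", "cpuLoad"))
servlet.execute(Command("p1", "memUsage"))

# Simulated crash: only the snapshot and the log survive.
recovered = Servlet.recover(servlet.snapshot, servlet.log)
```

How often to snapshot is a trade-off: frequent snapshots shorten the log that must be replayed, at the cost of more serialization work.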

9
New Producer: lossless streaming (to do)
A protocol for lossless streaming between servlets.
(1) Producer servlet crashes.
(2) Servlet recovers.
(2) Producer re-registers (maybe).
(3) Now tuples are never discarded.
(4) Consumer told of producer (how?).
(4) Consumer requests streaming.
(5) Producer recognises consumer, so sends all tuples since last tuple sent.
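
Steps (3) to (5) can be sketched as a producer that keeps every tuple and remembers, per consumer, how far it has streamed. This is one possible reading of the protocol, which the slides mark as still to do. LosslessProducer and its methods are hypothetical names, and the sketch assumes the producer's state survives the crash.

```python
class LosslessProducer:
    """Hypothetical producer that resumes each consumer where it left off."""

    def __init__(self):
        self.tuples = []       # step (3): tuples are kept, never discarded
        self.last_sent = {}    # consumer id -> number of tuples already sent

    def insert(self, tup):
        self.tuples.append(tup)

    def stream_to(self, consumer_id):
        """Steps (4)-(5): send every tuple this consumer has not yet been sent."""
        start = self.last_sent.get(consumer_id, 0)   # unknown consumers start at 0
        batch = self.tuples[start:]
        self.last_sent[consumer_id] = len(self.tuples)
        return batch


producer = LosslessProducer()
producer.insert("t1")
producer.insert("t2")
first = producer.stream_to("c1")    # consumer c1 connects for the first time

producer.insert("t3")               # arrives while c1 is disconnected
producer.insert("t4")
second = producer.stream_to("c1")   # c1 is recognised and gets only the gap
third = producer.stream_to("c1")    # nothing new since the last request
```

Keeping every tuple forever is only workable alongside some clean-up policy, so a real protocol would also need a way to discard tuples that all consumers have received.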
10
Summary, then what next?

Lots and lots and lots of coding!
- Still have to finish the Consumer API.
- Still have to finish the C and C++ APIs.
- Not quite finished the ResilientStreamProducer (my job).
- Tests are failing just now, and are hard to write for the API.
Demo for EU review very soon
- But nervousness about whether R-GMA is robust enough and fast enough!! (900 tuples in 30 sec. required)
- Talk of focus on robustness, not new functionality!
But what next for us? (new functionality)
- Joins? How?
- Query optimisation using views?
(... which is what we're funded to research.)