Scalable%20Trigger%20Processing - PowerPoint PPT Presentation

About This Presentation
Title:

Scalable%20Trigger%20Processing

Description:

Triggers for stock ticker notification. Create trigger T1 from stock ... stock.ticker = const1 and stock.value const2. Expression Signature: E2: stock.ticker ... – PowerPoint PPT presentation

Number of Views:38
Avg rating:3.0/5.0
Slides: 36
Provided by: Shy88
Learn more at: http://web.cs.wpi.edu
Category:

less

Transcript and Presenter's Notes

Title: Scalable%20Trigger%20Processing


1
Scalable Trigger Processing
  • Discussion of publication by
  • Eric N. Hanson et al
  • Int Conf Data Engineering 1999
  • CS561

2
Motivation
  • Triggers popular for
  • Integrity constraint checking
  • Alerting, logging, etc.
  • Commercial database systems
  • Limited triggering capabilities
  • 1 trigger/update-type on table or at best 100.
  • But Current technology doesnt scale well
  • And, internet and web-based applications may need
    millions of triggers.

3
An Example Trigger
  • Example stock ticker notification
  • Stock holding 100IBM
  • Query Inform an agent whenever the price of the
    stock holding crosses 10,000
  • Create Trigger stock-watch
  • from quotes q
  • on update(q.price)
  • when q.nameIBM and 100q.price gt 10,000
  • do raise event ThresholdCrossed(100q.price).
  • Note We may need 1,000 or millions of such
    triggers
  • Web interface may allow users to create such
    triggers

4
What Next?
  • Problem description
  • TriggerMan system architecture
  • Predicate index
  • Trigger processing

5
Problem Definition
  • Given Relational DB, Trigger statements, Data
    Stream
  • Find Triggers corresponding to each stream item
  • Objective Scalable trigger processing system
  • Assumptions
  • Number of distinct structures of trigger
    expressions is relatively small
  • All trigger expression structures small enough to
    fit in main memory

6
The Problem, once more.
  • Requires millions of triggers (on huge data).
  • Steps for trigger processing
  • Event monitoring
  • Condition evaluation
  • Executing triggered action
  • Response time for database operations critical !

7
Related Work

  • Parallel Processing
  • Gupt89,Hell98

Indexing
ECA Model (not scalable)
AI Forg82,Mira87 (smaller rule set)
Range Predicates, Marking-Based Hans96b,
Ston90 (large memory, complicated storage)
8
Overall Driving Idea
  • If large number of triggers are created, then
    many have the same format.
  • Triggers share same expression signature except
    that parameters substituted.
  • Group predicates from trigger conditions based on
    expression signatures into equivalence classes
  • Store them in efficient main memory data
    structures

9
TriggerMan System
10
Components
  • TriggerMan Datablade (lives inside Informix)
  • Data Sources
  • Local/remote tables/streams must capture
    updates and transmit to TriggerMan (place in a
    queue)
  • TriggerMan Client applications
  • Create /drop triggers, etc.
  • TriggerMan Driver
  • Periodically involve TmanTest() fn to perform
    condition testing and action execution.
  • TriggerMan console
  • Direct user interaction interface for trigger
    creation, system shutdown, etc.

11
TriggerMan Syntax
  • Trigger syntax
  • create trigger lttriggerNamegt in setName
  • optionalFlags
  • from fromList
  • on eventSpec
  • when condition
  • group by attributeList
  • having groupCondition
  • do action

12
Example Salary Increases
  • Update Freds salary when Bobs salary is updated
  • create trigger updateFred
  • from emp
  • on update (emp.salary)
  • when emp.name Bob
  • do execSQL update emp set salaryNEW.emp.salary
    where emp.nameFred

13
Example Real Estate Database
  • If new house added which is in neighborhood that
    salesperson Iris reprensents then notify her
  • House (hno,address,price,nno,spno)
  • Salesperson (spno,name,phone)
  • Represents (spno,nno)
  • Neighborhood (nno,name,location)
  • create trigger IrisHouseAlert
  • on insert to house
  • from salesperson s, house h, represents r
  • when s.name Iris and s.spnor.spno and
    r.nnoh.nno
  • do raise event NewHouseInIrisNeighborhood(h.hno,
    h.address)

14
Trigger Condition Structure
  • Expression signature
  • Expression signature consists of
  • Data source ID
  • Operation code, e.g. insert, delete, etc.
  • Generalized Expression (parameterized)

FROM Data src emp ON Event update WHEN
boolean exp.

Emp.name
CONSTANT
15
Condition structure (contd)
  • Steps to obtain canonical representation of WHEN
    clause
  • Translate expression to CNF
  • Group each conjunct by data source they refer to
  • Selection Predicate will be of form
  • (C11 OR C12 OR ..) AND ... AND (Ck1 OR ),
  • where each Cij refers to same tuple
    variable.
  • Each conjunct refers to zero, one, or more data
    sources
  • Group conjuncts by set of sources they refer to
  • If one data source, then selection predicate
  • If two data sources, then JOIN predicate

16
Triggers for stock ticker notification
  • Create trigger T1 from stock
  • when stock.ticker GOOG and stock.value lt
    500
  • do notify_person(P1)
  • Create trigger T2 from stock
  • when stock.ticker MSFT and stock.value lt
    30
  • do notify_person(P2)
  • Create trigger T3 from stock
  • when stock.ticker ORCL and stock.value lt
    20
  • do notify_person(P3)
  • Create trigger T4 from stock
  • when stock.ticker GOOG
  • do notify_person(P4)

17
Expression Signature
  • Idea Common structures in condition of triggers
  • Expression Signature
  • E1 stock.ticker const1 and stock.value lt
    const2
  • Expression Signature
  • E2 stock.ticker const3
  • Expression signature defines equivalence class of
    all instantiations of expression with different
    constants

T1 stock.ticker GOOG and stock.value lt
500 T2 stock.ticker MSFT and stock.value lt
30 T3 stock.ticker ORCL and stock.value lt 20
T4 stock.ticker GOOG
18
What to do now
  • Only a few distinct expression signatures, build
    data structures to represent them explicitly (in
    memory)
  • Create constant tables that store all different
    constants, and link them to their expression
    signature

19
Main Structures
  • A-treat Network
  • Network for trigger condition testing
  • For a trigger to fire, all conditions must be
    true
  • Expression Signature
  • Common structure in a trigger
  • E1 stock.ticker const1 and stock.value lt
    const2
  • Constant Tables
  • Constants for each expression signature

20
A-Treat Network to represent a trigger
  • For each trigger condition
  • stock.ticker const1 and stock.value lt const2

21
Condition Testing
  • A-Treat network is a discrimination network for
    trigger condition testing.
  • For a predicate to be satisfied, all its
    conjuncts should be true.
  • This is checked using A-Treat network.

22
A-Treat network (Hanson 1992)
  • Define rule SalesClerk
  • If emp.salgt30,000
  • And emp.dnodept.dno
  • And dept.namesales
  • And emp.jnojob.jno
  • And job.titleclerk
  • Then Action

23
Expression Signature Table
Ex. ID Data Source Signature Description Constant Table Number of Constants Constant Organization
E1 stock const_e1 2 Main Memory
E2 stock const_e2 1 Main memory

E1 stock.ticker const1 and stock.value lt
const2 E2 stock.ticker const3
24
Constant Tables
  • Tables of constants in trigger conditions

Const_e1
Ex. ID Trigger ID Constant 1 Constant 2 Next Node Rest
E1 T1 GOOG 500 Node 2
E1 T2 MSFT 30 Node 2
E1 T3 ORCL 20 Node 2
Const_e2
Ex. ID Trigger ID Constant 1 Next Node Rest
E2 T4 GOOG Null
T1 stock.ticker GOOG and stock.value lt
500 T2 stock.ticker MSFT and stock.value lt
30 T3 stock.ticker ORCL and stock.value lt 20
T4 stock.ticker GOOG
25
Tables
  • Primary tables
  • trigger_set (tsID, name, comments, creation_date,
    isEnabled)
  • Trigger (triggerID, tsID, name, comments,
    trigger_text, creation_date, isEnabled, )
  • Trigger cache in main memory for recently
    accessed triggers.

26
Predicate Index
  • Tables
  • expression_signature(sigID, dataSrcID,
    signatureDesc, constTableName, constantSetSize,
    constantSetOrganization)
  • const_tableN(exprID, triggerID, nextNetworkNode,
    const1, constK, restOfPredicate)
  • Root of predicate index linked to data source
    predicate indices
  • Each data source contains an expression signature
    list
  • Each expression signature links to its constant
    table.
  • Index expressions on most selective conjunct
    (rest on fly).

27
Predicate Index
hash(src-ID)
Goal Given an update, identify all predicates
that match it.
28
Processing Trigger Definition
  • Parse the trigger and validate it
  • Convert the when clause to conjunctive normal
    form
  • Group the conjuncts by the distinct sets of tuple
    variables they refer to
  • Form a trigger condition graph, that is,
    undirected graph with node for each tuple
    variable and edge for join predicates.
  • Build A-Treat network

29
Processing trigger definition (2)
  • For each selection predicate
  • If predicate with same signature not seen before
  • Add signature of predicate to list
  • And, add signature to expression_signature table
  • If signature has a constant placeholder in it,
    create a constant table for the signature.
  • Add constants
  • Else if predicate has constants, add a row to the
    constant table for the expression

30
Alternate Organizations
  • Storage for the expression signatures
    equivalence class
  • Main memory lists
  • Main memory index
  • Non-indexed database table
  • Indexed database table
  • For each expression signature, choose a
    structure depending on number of triggers.

efficiency
Scalability
31
Processing update descriptors
  • On getting an update descriptor (token)
  • (data src ID, operator code, old/new tuple)
  • Locate data source predicate index from root of
    predicate index.
  • For each expression signature, find constant
    matching the token using index.
  • Check additional predicate clauses against the
    token.
  • When all predicate clauses of a trigger have
    matched, pin the trigger in main memory
  • Bring in A-treat network representing that
    trigger to process aremaining part of trigger,
    like join, etc.
  • If trigger condition is satisfied, execute action.

32
Processing an Update
Update Stock (tickerGOOG, value495)
Root
Index of stock.tickerconst1
Other source Predicate index
E1
E2
E1 stock.ticker const1 and stock.value lt const2
Trigger ID Constant 1 Constant 2 Next Node
T1 GOOG 500 Node 2
T2 MSFT 30 Node 2
T3 ORCL 20 Node 2
const_e1
const_e2
const_e1
33
Some optimizations
  • For I 1 to N
  • Create trigger Ti from R when R.a100 do
  • List of trigger ids for R.a100
  • Constant sets and trigger Ids for equality
    conditions stored as
  • Lists
  • Clustered constant index
  • All entries of const1,..constk stored together
  • Enables fast retrieval of triggers ids together

34
Concurrency
  • Better scalability even on single processor

35
Opportunities of Concurrency
  • Tasks can be
  • Process a token to check matching rule
  • Run a rule action
  • Process a token against a set of conditions
  • Process a token to run a set of rule actions
    triggered by token.

36
Concurrency
  • Identified elements that can be parallelized
  • Token-level
  • Multiple tokens processed in parallel
  • Condition-level
  • Multiple selection conditions tested concurrently
  • Rule-action-level
  • Multiple rule actions fired at the same time
  • Data-level
  • Set of data values in the network processed in
    parallel

37
Concurrency
  • N NUM_CPUS TMAN_CONCURRENCY_LEVEL
  • N driver processes
  • Each driver process calls TmanTest() after T time
  • Balance switching overhead and avoid long
    executions

38
TmanTest()
  • while(total execution time of this invocation of
    TmanTest lt THRESHOLD and work is left in the task
    queue)
  • Get a task from the task queue and execute it.
  • Yield the processor so other Informix tasks can
    use it
  • if task queue is empty
  • return TASK_QUEUE_EMPTY
  • return TASKS_REMAINING

39
Concurrency
  • For k1 to M
  • Create trigger T_K
  • from R
  • when R.company "IBM"
  • do raise event
  • notify_user("userK", . )
  • Partition trigger sets into N sets.

40
Conclusion Overall Key Points
  • If a large number of triggers are created, many
    of them have almost the same format
  • Group triggers with same structure together into
    expression signature equivalence classes
  • Number of distinct signatures is small enough to
    fit into main memory (index)
  • Develop a selection predicate index structures
  • Architecture to build a scalable trigger system.
Write a Comment
User Comments (0)
About PowerShow.com