Semijoin Reduction in Query Processors

1
Semijoin Reduction in Query Processors
  • Stocker, Kossmann, Braumandl, Kemper
  • Integrating Semi-Join-Reducers into
    State-of-the-Art Query Processors ICDE 2001

2
Introduction
  • Semijoin operator B SJ A: keeps only the tuples of B
    that have a join partner in A
  • Reduces the size of B, which can reduce the cost of
    other operations
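
The effect of the reducer can be illustrated with a minimal Python sketch. The row layout (lists of dicts), the column name k, and the helper names semijoin and join are illustrative assumptions, not taken from the slides.

```python
def semijoin(b_rows, a_rows, key):
    """B SJ A: keep the rows of B that have at least one join partner in A."""
    a_keys = {row[key] for row in a_rows}
    return [row for row in b_rows if row[key] in a_keys]


def join(left_rows, right_rows, key):
    """Simple hash join on a single column; merges matching rows."""
    index = {}
    for row in right_rows:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left_rows for r in index.get(l[key], [])]


# Hypothetical example data.
A = [{"k": 1, "a": "x"}, {"k": 2, "a": "y"}]
B = [{"k": 1, "b": 10}, {"k": 3, "b": 30}, {"k": 4, "b": 40}]

reduced_B = semijoin(B, A, "k")  # only the B row with k == 1 survives
```

Joining reduced_B with A gives the same result as joining B with A; the semijoin only shrinks the input of the later join.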

3
Agenda
  • Usefulness of semijoin reducers
  • Integration of semijoin reducers in a dynamic
    programming optimizer
  • Performance experiments
  • Variants for complex queries
  • Discussion

4
Distributed System
  • Reducing inter-site communication is the traditional
    application of semijoins
  • In a symmetric system, semijoins are useful when
    the (redundant) semijoins are cheap and cause a
    significant reduction of intermediate results
  • In a centralized system, they can help reduce
    disk I/O
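
A back-of-the-envelope sketch of the communication argument, in Python. All numbers, the function names, and the cost model (bytes shipped only, no CPU or disk cost) are assumptions made for illustration; they do not come from the paper.

```python
def bytes_shipped_plain(b_rows, b_row_bytes):
    """Ship all of B to the site that holds A."""
    return b_rows * b_row_bytes


def bytes_shipped_with_semijoin(a_keys, key_bytes, b_rows, b_row_bytes,
                                match_percent):
    """Ship A's join keys to B's site, then ship only the matching B rows."""
    ship_keys = a_keys * key_bytes
    surviving_rows = b_rows * match_percent // 100
    return ship_keys + surviving_rows * b_row_bytes


# Hypothetical sizes: 1M rows of 100 bytes in B, 10k distinct 8-byte keys in A,
# and 5% of B's rows have a join partner in A.
plain = bytes_shipped_plain(1_000_000, 100)
reduced = bytes_shipped_with_semijoin(10_000, 8, 1_000_000, 100, 5)
```

Under these assumed numbers the reduced plan ships far less data; if almost every row of B matches, the extra key shipment makes the semijoin a net loss.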

5
Client Server System
  • Semijoins lead to load-balancing effects
  • Clients can communicate with servers; servers
    cannot communicate with each other
  • If C can be reduced, response time and
    communication cost are reduced

[Figure: two client-side query plans over A(1), B(1), A(2), and C(2); one uses a semijoin (SJ)]
6
Search Space
  • Join predicates can be applied by a full join or
    by a semijoin
  • Semijoin: a generalized selection on A
  • Controlled application of semijoins
  • Avoid plans like ((A SJ B) SJ B)
  • Redundant plans have no new predicates applied at
    a semijoin
  • Allow only join operations at a node that apply
    predicates not yet applied in its subtree
  • Predicate space is three times as large as with
    joins alone
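
The redundancy rule can be sketched in Python as follows. The plan encoding (nested tuples), the predicate strings, and the function names are assumptions for illustration. The check is one simple reading of the rule: a semijoin is treated as redundant when every predicate it applies has already been applied in the subtree it reduces.

```python
def applied_predicates(plan):
    """Collect every predicate applied anywhere in a plan tree.

    A plan is either a table name (str) or a tuple
    (op, left, right, predicates) with op in {"JOIN", "SJ"}.
    """
    if isinstance(plan, str):
        return frozenset()
    _op, left, right, preds = plan
    return (frozenset(preds)
            | applied_predicates(left)
            | applied_predicates(right))


def is_redundant_semijoin(left, preds):
    """True if 'left SJ ...' applies no predicate that is new to 'left'."""
    return frozenset(preds) <= applied_predicates(left)


a_sj_b = ("SJ", "A", "B", ["A.x = B.x"])

# ((A SJ B) SJ B) applies A.x = B.x a second time, so it is rejected.
rejected = is_redundant_semijoin(a_sj_b, ["A.x = B.x"])
# (A SJ B) SJ C applies a new predicate, so it is allowed.
allowed = not is_redundant_semijoin(a_sj_b, ["A.y = C.y"])
```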

7
Dynamic Programming
  • Used in most commercial optimizers
  • Bottom-up optimization technique
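
A minimal sketch of bottom-up dynamic programming for join ordering, in Python. The cost model (sum of intermediate result sizes), the cardinalities, the selectivities, and all names are assumptions chosen to keep the example small; a real optimizer uses a far richer cost model and plan space.

```python
from fractions import Fraction
from itertools import combinations


def dp_join_order(cards, sel):
    """cards: {table: cardinality}; sel: {frozenset({a, b}): selectivity}.

    Returns (cost, cardinality, plan) for joining all tables, where
    cost is the sum of the sizes of all intermediate join results.
    """
    tables = sorted(cards)
    best = {frozenset([t]): (0, cards[t], t) for t in tables}
    for size in range(2, len(tables) + 1):
        for subset in combinations(tables, size):
            s = frozenset(subset)
            for k in range(1, size):
                for left in combinations(subset, k):
                    o = frozenset(left)
                    rest = s - o
                    lcost, lcard, lplan = best[o]
                    rcost, rcard, rplan = best[rest]
                    # Missing predicate means cross product (selectivity 1).
                    factor = Fraction(1)
                    for a in o:
                        for b in rest:
                            factor *= sel.get(frozenset((a, b)), 1)
                    card = lcard * rcard * factor
                    cost = lcost + rcost + card
                    # Keep only the cheapest plan per subset (pruning).
                    if s not in best or cost < best[s][0]:
                        best[s] = (cost, card, (lplan, rplan))
    return best[frozenset(tables)]


# Hypothetical chain query A - B - C.
cards = {"A": 1000, "B": 100, "C": 10}
sel = {
    frozenset(("A", "B")): Fraction(1, 100),
    frozenset(("B", "C")): Fraction(1, 10),
}
cost, card, plan = dp_join_order(cards, sel)
```

With these assumed statistics the sketch joins B and C first, because that intermediate result is the smallest, and adds A last.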

8
Dynamic Programming
9
Access Root Algorithm
  • Implements the conventional approach to semijoins
  • Semijoins used to reduce base tables
  • Very easily integrated with existing systems
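
A deliberately simplified Python sketch of the idea: before join enumeration starts, each base table gets extra access plans in which it is reduced by a semijoin. The sketch only generates single-step reducers from direct neighbours in the join graph, which is an assumption of this example; the function name and string encoding of plans are also hypothetical.

```python
def access_plans(tables, join_edges):
    """Return, per base table, a plain scan plus semijoin-reduced variants."""
    plans = {}
    for t in tables:
        variants = [t]  # the unreduced table access
        for a, b in join_edges:
            if t == a:
                variants.append(f"({t} SJ {b})")
            elif t == b:
                variants.append(f"({t} SJ {a})")
        plans[t] = variants
    return plans


# Hypothetical chain query A - B - C.
plans = access_plans(["A", "B", "C"], [("A", "B"), ("B", "C")])
```

The resulting variants can be handed to an unchanged dynamic programming enumerator as alternative access plans, which is why this approach fits easily into an existing optimizer.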

10
Access Root
11
Extension needed
  • In Access Root, every table appears at most once
    in a reducing plan
  • It misses plans like (A SJ B) SJ (C SJ B)
  • Intermediate join results are usually large and
    benefit most from reduction
  • This motivates the Join Root Algorithm

12
Join Root Algorithm
  • Complete search space of non-redundant semijoin
    plans
  • Semijoins are applied at all query plan levels
  • Plans with multiple occurrences of tables
  • Semijoin and join generation integrated into one
    phase

13
Join Root Algorithm
  • Steps
  • Generate the initial set of base-table access
    plans and store them for every base table i
  • As in the dynamic programming algorithm, optimize
    all subsets of size i and use them for size i+1
  • Consider a subset S of size i
  • Consider a proper subset O of S. In the standard
    dynamic programming algorithm we perform

[Figure: plans for S built by joining O with S-O]
14
Join Root Algorithm
  • In the join root algorithm we perform
  • Note that these plans are stored as plans for S

[Figure: plan trees over P, O-P, O, and S-O, including a semijoin (SJ) on O]
15
Join Root Algorithm
  • However since semijoin is only a reducer, the
    plan should be a plan for O
  • Thus plans are rearranged into their actual
    semantic categories
  • After that plans are pruned as before
  • After the rearrangement, several plans that
    contain semijoins may be incomplete
  • These are completed by applying joins using a
    fixpoint iteration scheme
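
The rearrangement step can be sketched in Python. The plan encoding (nested tuples with "JOIN" and "SJ" operators) and the function names are assumptions for illustration; the fixpoint completion step is not shown.

```python
def result_tables(plan):
    """Tables whose tuples appear in a plan's result.

    A semijoin only reduces its left input, so its result
    belongs to the tables of the left input alone.
    """
    if isinstance(plan, str):
        return frozenset([plan])
    op, left, right = plan
    if op == "SJ":
        return result_tables(left)
    return result_tables(left) | result_tables(right)


def rearrange(plans_by_subset):
    """Regroup plans by the set of tables they actually produce."""
    regrouped = {}
    for plans in plans_by_subset.values():
        for plan in plans:
            regrouped.setdefault(result_tables(plan), []).append(plan)
    return regrouped


# Both plans were generated while processing S = {A, B}.
generated = {
    frozenset(["A", "B"]): [("JOIN", "A", "B"), ("SJ", "A", "B")],
}
regrouped = rearrange(generated)
```

After regrouping, the semijoin plan sits with the plans for {A} and competes with the other access plans for A during pruning, while the join plan stays with {A, B}.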

16
Qualitative Aspects
  • Join graph topology: a very important factor;
    more predicates make join root advantageous
  • Allocation schema: matters mainly for distributed
    systems with replication
  • Query complexity: the running time of join root
    suffers with a large number of relations
  • Network topology: lower available bandwidth and
    network restrictions (client-server system)
    increase the relative benefits

17
Running Time
  • Choice of algorithm depends on query graph
18
Quality of Plans
  • 5-way join query in a client-server environment
    with two servers

Avg SC         Classic dyn   Access root   Join root
Chain          1.65          1.0           1.0
Star           4.18          1.0           1.0
Bone           5.86          2.22          1.0

Avg SC         Classic dyn   Access root   Join root
Bone (sym)     1.64          1.37          1.0
Bone (Cl-Se)   5.86          2.22          1.0
19
Heuristics
  • Best k variants
  • Use only base tables as reducers
  • Limit the number of fixpoint iterations
  • These heuristics improve running time
    considerably without affecting plan quality in
    most cases
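
One plausible reading of the "best k variants" heuristic is to keep only the k cheapest plan variants per table subset instead of all of them. The Python sketch below assumes that reading; the function name, the (cost, plan) pair layout, and the example costs are hypothetical.

```python
import heapq


def keep_best_k(plans, k):
    """plans: list of (estimated_cost, plan) pairs; keep the k cheapest."""
    return heapq.nsmallest(k, plans, key=lambda pair: pair[0])


# Hypothetical variants for one table subset.
variants = [(5, "p1"), (2, "p2"), (9, "p3"), (3, "p4")]
kept = keep_best_k(variants, 2)
```

A smaller k shrinks the number of plans carried into later iterations, which trades some search completeness for optimization time.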