Title: Revisiting Pipelined Parallelism in Multi-Join Query Processing
1. Revisiting Pipelined Parallelism in Multi-Join Query Processing
- Bin Liu and Elke A. Rundensteiner
- Worcester Polytechnic Institute
- (binliu, rundenst)@cs.wpi.edu
- http://www.davis.wpi.edu/dsrg
2. Multi-Join Queries
- Data Integration over Distributed Data Sources
- i.e., Extract Transform Load (ETL) Services
[Diagram: distributed data sources feeding data warehouses through persistent storage]
- (1) High I/O costs given large intermediate results
- (2) Disk access is undesirable since this is a one-time process
3. Applying Parallelism
- Processed in main memory of a PC cluster
- Make use of aggregated resources (main memory, CPU)
[Diagram: query processing spread over a cluster of machines connected by a network]
4. Three Types of Parallelism
- Pipelined: operators are composed into a producer-consumer relationship (see the sketch below)
- Independent: independent operators run simultaneously on distinct machines
- Partitioned: a single operator is replicated and run on multiple machines
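To make the producer-consumer form concrete, here is a minimal sketch of a two-stage pipeline in Python; threads stand in for cluster nodes, and all relation and attribute names are illustrative rather than taken from the paper.

```python
# Minimal sketch of pipelined parallelism: an upstream join probes its hash
# table and streams matches through a queue to a downstream join, so both
# operators run concurrently. Threads stand in for cluster nodes.
import threading
import queue

def producer_join(tuples, hash_table, out_q):
    """Probe `hash_table` with `tuples` and stream matches downstream."""
    for t in tuples:
        for match in hash_table.get(t["key"], []):
            out_q.put({**t, **match})
    out_q.put(None)  # end-of-stream marker

def consumer_join(in_q, hash_table, results):
    """Consume intermediate tuples as they arrive and probe the next table."""
    while (t := in_q.get()) is not None:
        for match in hash_table.get(t["key2"], []):
            results.append({**t, **match})

if __name__ == "__main__":
    r1 = {1: [{"a": "x"}], 2: [{"a": "y"}]}   # build table for join 1
    r2 = {"x": [{"b": 10}]}                    # build table for join 2
    r0 = [{"key": 1, "key2": "x"}, {"key": 2, "key2": "z"}]  # probing relation
    q, out = queue.Queue(), []
    threads = [threading.Thread(target=producer_join, args=(r0, r1, q)),
               threading.Thread(target=consumer_join, args=(q, r2, out))]
    for th in threads: th.start()
    for th in threads: th.join()
    print(out)
```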
5. Basics of Hash Join
- Two-phase hash join [SD89, LTS90]
- Demonstrated high performance
- Potential for a high degree of parallelism
6. Partitioned Hash Join
- Partition the inputs (hash tables) across processors
- Have each processing node run in parallel (see the sketch below)
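A minimal single-process sketch of a partitioned two-phase hash join follows; list partitions stand in for processing nodes, and the relation and attribute names are assumptions made for the example.

```python
# Sketch of a partitioned two-phase hash join: both inputs are hash-partitioned
# on the join key (one partition per processing node), then each partition
# builds a hash table on one input and probes it with the other. Partitions are
# plain lists here; on a cluster each would live on a different machine.
def partition(relation, key, k):
    parts = [[] for _ in range(k)]
    for t in relation:
        parts[hash(t[key]) % k].append(t)
    return parts

def hash_join(build, probe, bkey, pkey):
    table = {}
    for t in build:                      # phase 1: build
        table.setdefault(t[bkey], []).append(t)
    out = []
    for t in probe:                      # phase 2: probe
        for match in table.get(t[pkey], []):
            out.append({**match, **t})
    return out

def partitioned_hash_join(r, s, rkey, skey, k=4):
    r_parts, s_parts = partition(r, rkey, k), partition(s, skey, k)
    out = []
    for i in range(k):                   # in reality, one node per partition
        out += hash_join(r_parts[i], s_parts[i], rkey, skey)
    return out

if __name__ == "__main__":
    R = [{"id": i, "a": i * 10} for i in range(5)]
    S = [{"rid": i % 3, "b": i} for i in range(6)]
    print(partitioned_hash_join(R, S, "id", "rid"))
```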
7. Left-Deep Tree [SD90]
[Figure: example join graph over relations R1-R9]
8. Right-Deep Tree [SD90]
[Figure: example join graph over relations R1-R9]
9. Tradeoffs Between Left-Deep and Right-Deep Trees
- Right-Deep
- Pro: Good potential for pipelined parallelism
- Pro: Intermediate results exist only as a stream
- Pro: Size of building relations can be predicted accurately
- Con: Large memory consumption
- Left-Deep
- Pro: Less memory consumption
- Con: Less pipelined parallelism
10. State-of-the-Art Solutions
- Implicit assumption: prefer maximal pipelined parallelism!
11. State-of-the-Art Solutions
- What about memory-constrained environments?
- Strategy:
- Break the tree into several pieces, and
- Process one piece at a time (as a pipeline)
- E.g., Static Right-Deep [SD90],
- ZigZag [ZZBS93],
- Segmented Right-Deep [CLYY92]
12. Pipelined Execution
- Redirection cost: the intermediate results generated may need to be partitioned to a different machine.
- Optimal degree of parallelism: it may not be necessary to partition R2 over a large number of machines if it only has 1,000 tuples.
13. Pipelined Cost Model
- Compute an n-way join over k machines
- Probing relation R0; building relations R1, R2, ..., Rn
- Ii represents the intermediate result after joining with Ri
- Total work W = Wb + Wp; total processing time T = Tb + Tp
14. Breaking Pipelined Parallelism
- To break long pipelines and introduce independent parallelism
- Large number of small pipelines
- High interdependence between pipelined segments
- i.e., P1 -> P2, P3 -> P4, P2 -> P4, ...
15. Segmented Bushy Tree
- Basic Idea
- Compose large pipelined segments
- Run pipelined segments independently
- Compose a bushy tree with minimal interdependency
- Goal: balance pipelined and independent parallelism
16. Composing a Segmented Bushy Tree
- Input: a connected join graph G with n nodes; a number m specifying the maximum number of nodes per subgraph.
- Output: a segmented bushy tree with at least n/m subtrees.
- completed = false
- WHILE (!completed)
- Choose the node V with the largest cardinality that has not yet been grouped as the probing relation
- Enumerate all subgraphs starting from V with at most m nodes
- Choose the best subgraph; mark its nodes as selected in the original join graph
- IF there is no connected subgraph K of G over unselected nodes with K.size() > 2
- completed = true
- Compose the segmented bushy tree from all groups (a runnable sketch of this loop follows below)
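A runnable reading of the grouping loop above, with two simplifications flagged in the comments: "best subgraph" is reduced to "largest total cardinality" (the paper uses a cost-driven choice), and the loop simply groups every node rather than reproducing the slide's exact termination test.

```python
# Sketch of the segmented-bushy-tree grouping loop. The join graph is an
# adjacency dict, cardinalities a dict of node -> size. "Best subgraph" is
# simplified to "largest total cardinality", a placeholder for the paper's
# cost-driven choice; the loop groups every node instead of using the slide's
# early-termination condition.
def connected_subgraphs_from(v, graph, allowed, m):
    """Enumerate connected subgraphs containing v, within `allowed`, size <= m."""
    results, frontier = [], [frozenset([v])]
    seen = set(frontier)
    while frontier:
        sub = frontier.pop()
        results.append(sub)
        if len(sub) == m:
            continue
        for u in sub:
            for w in graph[u]:
                if w in allowed and w not in sub:
                    nxt = sub | {w}
                    if nxt not in seen:
                        seen.add(nxt)
                        frontier.append(nxt)
    return results

def compose_groups(graph, card, m):
    ungrouped, groups = set(graph), []
    while ungrouped:
        v = max(ungrouped, key=lambda x: card[x])      # largest ungrouped relation
        cands = connected_subgraphs_from(v, graph, ungrouped, m)
        best = max(cands, key=lambda s: sum(card[x] for x in s))  # placeholder choice
        groups.append(set(best))
        ungrouped -= best
    return groups   # each group becomes one pipelined segment of the bushy tree

if __name__ == "__main__":
    graph = {"R0": ["R1"], "R1": ["R0", "R2"], "R2": ["R1", "R3"], "R3": ["R2"]}
    card = {"R0": 100_000, "R1": 10_000, "R2": 50_000, "R3": 5_000}
    print(compose_groups(graph, card, m=2))
```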
17. Example
- Candidate subgraphs enumerated from R7: (1) R7, R8, R9, R6; (2) R7, R9, R6, R8; (3) R7, R4, R8, R5; ...
- Candidate subgraphs enumerated from R1: (1) R1, R0, R2, R3; (2) R1, R2, R0, R3; (3) R1, R2, R3, R4; ...
18. Example Segmented Bushy Tree
19. Machine Allocation
- Based on the building relation sizes of each segment
- Nb: total amount of building work
- ki: number of machines allocated to pipelined segment i (see the sketch below)
[Diagram: allocation of machine counts k1, k2, k3 to pipelined segments over relations R0-R9, with intermediate results I1 and I2 flowing between segments; Nb is the total building work]
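The slide gives only the inputs of the allocation step (building-relation sizes per segment, total building work Nb, machine counts ki); the sketch below assumes a proportional split of the available machines, which is an illustration rather than the paper's exact rule.

```python
# Hedged sketch of machine allocation across pipelined segments. Allocating
# machines proportionally to each segment's share of the total building work
# Nb is an assumption made here for illustration.
def allocate_machines(segment_build_sizes, k_total):
    N_b = sum(sum(seg) for seg in segment_build_sizes)   # total building work
    # proportional share, at least one machine per segment
    alloc = [max(1, round(k_total * sum(seg) / N_b)) for seg in segment_build_sizes]
    # adjust for rounding so the allocations sum to k_total
    while sum(alloc) > k_total:
        alloc[alloc.index(max(alloc))] -= 1
    while sum(alloc) < k_total:
        alloc[alloc.index(min(alloc))] += 1
    return alloc

# Example: three segments with these building-relation cardinalities,
# 10 machines available.
print(allocate_machines([[10_000, 20_000], [50_000], [5_000, 15_000]], k_total=10))
```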
20. Insufficient Main Memory
- Break the query into parts based on main memory availability
- Compose a segmented bushy tree for each part (see the sketch below)
[Diagram: example join graph over relations R0-R19 broken into memory-sized parts]
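A hedged sketch of the memory-driven split: relations are greedily packed into parts whose building relations fit an assumed aggregate memory budget, and each part would then get its own segmented bushy tree. The greedy, order-based cut and the budget figure are illustrative assumptions, not the paper's algorithm.

```python
# Sketch of breaking a query by main-memory availability: keep adding
# relations to the current part until its building relations would exceed the
# cluster's aggregate memory, then start a new part. Each part is then turned
# into its own segmented bushy tree (see the grouping sketch above).
def split_by_memory(relations, bytes_per_tuple, memory_budget):
    """relations: list of (name, cardinality); memory_budget in bytes."""
    parts, current, used = [], [], 0
    for name, card in relations:
        size = card * bytes_per_tuple
        if current and used + size > memory_budget:
            parts.append(current)
            current, used = [], 0
        current.append(name)
        used += size
    if current:
        parts.append(current)
    return parts

# Example: 40-byte tuples (as in the experiments), 4 MB of aggregate memory.
rels = [("R%d" % i, 20_000) for i in range(20)]
print(split_by_memory(rels, bytes_per_tuple=40, memory_budget=4_000_000))
```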
21. Experimental Setup
- 10-machine cluster
- Each machine has two 2.4GHz Xeon CPUs and 2GB memory
- Connected by a gigabit Ethernet switch
[Setup diagram: an application machine and a controller (PIII 800MHz PCs, 256MB memory each), an Oracle 8i data source, and the 10-machine cluster]
22. Experimental Setup (cont.)
- Generated data sets with integer join values
- Around 40 bytes per tuple
- Randomly generated join queries
- Acyclic join graphs with 8, 12, or 16 nodes
- Each node represents one join relation
- Each edge represents one join condition
- Average join ratio is 1
- Cardinality of each relation ranges from 1K to 100K tuples
- Up to 600MB per query
23. Pipelined vs. Segmented (I)
24. Pipelined vs. Segmented (II)
25. Insufficient Main Memory
26. Related Work
- [SD90] Tradeoffs in processing complex join queries via hashing in multiprocessor database machines. VLDB 1990.
- [CLYY92] Using segmented right-deep trees for the execution of pipelined hash joins. VLDB 1992.
- [MLD94] Parallel hash-based join algorithms for a shared-everything environment. TKDE 1994.
- [MD97] Data placement in shared-nothing parallel database systems. VLDB 1997.
- [WFA95] Parallel evaluation of multi-join queries. SIGMOD 1995.
- [HCY94] On parallel execution of multiple pipelined hash joins. SIGMOD 1994.
- [DNSS92] Practical skew handling in parallel joins. VLDB 1992.
- [SHCF03] Flux: an adaptive partitioning operator for continuous query systems. ICDE 2003.
27. Conclusions
- Observation: maximal pipelined hash join processing
- Redirection costs? Optimal degree of parallelism?
- Hypothesis: worthwhile to incorporate independent parallelism into processing
- Use both forms, i.e., run several shorter pipelines in parallel
- Solution: segmented bushy tree processing
- Heuristic and cost-driven algorithms developed
- Validation: extensive experimental studies
- Achieve around 50% improvement over pure pipelined processing