Title: CURE for Cubes: Cubing Using a ROLAP Engine
VLDB 2006
- Konstantinos Morfonios
- Yannis Ioannidis
University of Athens
Introduction
Execution Plan
External Partitioning
Storage Format
Experimental Evaluation
Conclusions
Introduction
SELECT region, SUM(revenue) FROM SALES
WHERE month = 'September' GROUP BY region
Introduction
SELECT A, B, C, SUM(M) FROM R GROUP BY A, B, C
SELECT A, B, SUM(M) FROM R GROUP BY A, B
SELECT SUM(M) FROM R
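The queries above are three of the group-bys that make up the cube of R over A, B, C. As a minimal sketch (the helper names `cube_group_bys` and `to_sql` are hypothetical, not from the talk), the full set of 2^3 = 8 group-bys can be enumerated like this:

```python
from itertools import combinations

def cube_group_bys(dimensions):
    """Return every subset of the dimensions, most detailed first."""
    subsets = []
    for size in range(len(dimensions), -1, -1):
        subsets.extend(combinations(dimensions, size))
    return subsets

def to_sql(group_by, measure="M", table="R"):
    """Render one group-by as SQL (table and column names follow the slide)."""
    if not group_by:
        return f"SELECT SUM({measure}) FROM {table}"
    cols = ", ".join(group_by)
    return f"SELECT {cols}, SUM({measure}) FROM {table} GROUP BY {cols}"

queries = [to_sql(g) for g in cube_group_bys(["A", "B", "C"])]
for q in queries:
    print(q)
```

The first and last printed queries correspond to the first and last queries on the slide.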
Introduction
- Problems
- Construction algorithm
- Storage scheme
- Focusing on ROLAP techniques (MVs)
- Stressed to limits?
- Complete solution?
Unclear (not finished with efficient storage)
Unclear (not focused on hierarchies)
Introduction
Challenges of hierarchies
Efficient execution plan
- Small domains in the higher levels of dimension hierarchies
New partitioning algorithm
- Number of tuples increases
Novel storage scheme
Execution Plan
- Extend BUC (Bottom-Up-Cube) [BR99]
- Efficient pipelining
- Cheap identification of some kinds of redundancy
- Inherent support for iceberg cubes and holistic functions
- Existing BUC-based methods: BU-BST [WLFY02] and QC-Tables [LPH02]
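For readers unfamiliar with BUC, the following is a minimal in-memory sketch of the general bottom-up idea: aggregate the current partition, then refine it one dimension at a time, skipping partitions below a support threshold (the iceberg condition). It is not CURE's extended plan; the function name `buc`, the `min_support` parameter, and the toy rows are assumptions made for illustration.

```python
from collections import defaultdict

def buc(rows, dims, min_support=1, start=0, prefix=None, out=None):
    """Aggregate the current partition, then recurse on finer partitions."""
    if prefix is None:
        prefix = {}
    if out is None:
        out = {}
    # Emit COUNT and SUM(M) for the current group; "*" marks an absent dimension.
    key = tuple(prefix.get(d, "*") for d in dims)
    out[key] = (len(rows), sum(r["M"] for r in rows))
    for i in range(start, len(dims)):
        d = dims[i]
        parts = defaultdict(list)
        for r in rows:
            parts[r[d]].append(r)
        for value, part in parts.items():
            # Iceberg pruning: a small partition cannot yield qualifying groups.
            if len(part) >= min_support:
                buc(part, dims, min_support, i + 1, {**prefix, d: value}, out)
    return out

# Toy data, not taken from the talk.
rows = [
    {"A": "a1", "B": "b1", "C": "c1", "M": 10},
    {"A": "a1", "B": "b2", "C": "c1", "M": 20},
    {"A": "a2", "B": "b1", "C": "c2", "M": 5},
]
cube = buc(rows, ["A", "B", "C"])
iceberg = buc(rows, ["A", "B", "C"], min_support=2)
```

With `min_support=2`, whole subtrees of the recursion are skipped as soon as a partition holds fewer than two rows, which is the kind of early pruning that BUC-based methods rely on.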
Execution Plan
Dimensions: A, B, C
[Figure: execution plan over the lattice nodes ABC, AB, AC, BC, A, B, C, ∅]
Execution Plan
Dimensions: A0 → A1 → A2, B0 → B1, C0
Execution Plan
Dimensions: A0, A1, A2, B0, B1, C0
Height: 3
Execution Plan
Dimensions: A0 → A1 → A2, B0 → B1, C0
Height: 6
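The slides do not say how the two heights are derived. One reading, offered here only as an assumption, is that a plan which adds one dimension per step is 3 levels tall (one per dimension), while a plan that moves one hierarchy level per step is 6 levels tall (one per hierarchy level). The short sketch below computes both numbers, plus the number of lattice nodes, for the example hierarchies; the variable names are hypothetical.

```python
from math import prod

# Hierarchy levels per dimension, most detailed level first (from the slide).
hierarchies = {"A": ["A0", "A1", "A2"], "B": ["B0", "B1"], "C": ["C0"]}

# Each lattice node picks one level per dimension or leaves the dimension out.
num_nodes = prod(len(levels) + 1 for levels in hierarchies.values())

# One step per dimension: height equals the number of dimensions.
height_by_dimension = len(hierarchies)

# One step per hierarchy level: height equals the total number of levels.
height_by_level = sum(len(levels) for levels in hierarchies.values())

print(num_nodes, height_by_dimension, height_by_level)
```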
Execution Plan
- Important properties of BUC-based cubing
- Recursive calls at higher levels tend to be cheaper
- The benefit of pruning recursion early at some node N increases with the number of ancestors of N in the execution plan
- Advantage of taller execution plans
[Figure: plan fragment with nodes ABC, AC, AB, A]
Execution Plan
CURE's Plan
External Partitioning
[Figure: relation R partitioned to fit in Memory; the final build is labeled "Sound"]
External Partitioning
- For sound partitioning: biggest partition ≤ M
- In flat datasets this holds in general
- In hierarchical datasets
External Partitioning
R = 500 GB, M = 1 GB
R/M = 500
A0 (50,000) → A1 (500) → A2 (5)
A2B0C0 is |A0|/|A2| times smaller than R: A2B0C0 = 50 MB
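The arithmetic behind this example can be checked with a short sketch. It assumes values are uniformly distributed within each level (so splitting R on a level with k values gives partitions of size R/k) and uses decimal units (1 GB = 1,000 MB) to match the slide's 50 MB figure. The helper `partition_size_mb` and the variable names are hypothetical, and the sketch only reproduces the numbers, not CURE's partitioning algorithm.

```python
relation_mb = 500 * 1000   # R = 500 GB
memory_mb = 1 * 1000       # M = 1 GB
cardinality = {"A0": 50_000, "A1": 500, "A2": 5}  # values per level, from the slide

def partition_size_mb(level):
    """Average partition size if R is split on one level (uniform assumption)."""
    return relation_mb / cardinality[level]

# Levels whose partitions fit in memory, i.e. biggest partition <= M.
sound_levels = [lvl for lvl in cardinality if partition_size_mb(lvl) <= memory_mb]

# Slide's estimate: rolling A0 up to A2 shrinks R by a factor of |A0| / |A2|.
shrink = cardinality["A0"] / cardinality["A2"]
a2b0c0_mb = relation_mb / shrink

print(sound_levels, a2b0c0_mb)
```

Under these assumptions, splitting on A2 gives 100 GB partitions that do not fit in memory, splitting on A1 gives partitions of exactly 1 GB, and the aggregated node A2B0C0 is small enough to fit in memory on its own.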
Storage Format
- Two types of redundancy
- Dimensional Redundancy (DR)
- Aggregational Redundancy (AR)
Storage Format
Example with a flat cube, only for simplicity
Storage Format
[Figure: CUBE with DR vs. CUBE without DR]
Storage Format
Classify tuples according to AR into
- Common Aggregate Tuples (CATs)
[Figure: CUBE with DR vs. CUBE without DR]
Storage Format
- Purpose of the previous example
- Explanation of different types of redundancy
- Not construction algorithm
- Constructing an uncompressed cube and then compressing it would be inefficient
- Instead, CURE classifies tuples during construction itself (details in the paper)
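The slides name the two kinds of redundancy without defining them. As a rough illustration only, assuming aggregational redundancy refers to cube tuples in different group-bys that carry the same aggregate because they summarize exactly the same fact rows, the sketch below finds such tuples in a toy flat cube. It builds the full cube first, which the slide above calls inefficient, so it illustrates the concept and is not CURE's classification procedure. All names and data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

facts = [  # (row_id, A, B, M): toy data, not from the paper
    (1, "a1", "b1", 10),
    (2, "a1", "b2", 20),
    (3, "a2", "b3", 5),
]
dims = {"A": 1, "B": 2}  # dimension name -> column position in a fact row

# Map each cube tuple (group-by, values) to the set of fact rows it aggregates.
sources = defaultdict(set)
for size in range(len(dims) + 1):
    for group_by in combinations(dims, size):
        for row in facts:
            values = tuple(row[dims[d]] for d in group_by)
            sources[(group_by, values)].add(row[0])

# Cube tuples built from the same rows necessarily share their aggregates.
by_rows = defaultdict(list)
for cube_tuple, row_ids in sources.items():
    by_rows[frozenset(row_ids)].append(cube_tuple)

shared = {rows: tuples for rows, tuples in by_rows.items() if len(tuples) > 1}
```

In this toy data, the tuples (A = a2), (B = b3) and (A = a2, B = b3) all summarize fact row 3 alone, so their aggregate would only need to be stored once.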
Experimental Evaluation
- Hierarchical datasets: APB-1
- Product: Code (6,500) → Class (435) → Group (215) → Family (54) → Line (11) → Division (3)
- Customer: Store (640) → Retailer (71)
- Time: Month (17) → Quarter (6) → Year (2)
- Channel: Base (9)
- Flat datasets: CovType, Sep85L, Synthetic
Experimental Evaluation
- Two versions of CURE
- CURE
- CURE
Conclusions
- Main contribution: CURE
- Efficient execution plan
- New partitioning algorithm
- Novel storage scheme
- Main advantages of CURE
- Efficient construction of complete cubes over large datasets with arbitrary hierarchies
- Cube compression
- Optimization opportunities for queries and updates
- Easy implementation
Current and Future Work
- Study of indexing for queries and updates
- Comparison with the most prominent MOLAP and Tree-based techniques
Questions?
Thank you!
Storage Format
[Figure sequence: Memory Image and Disk Image]