Title: MOLAP: MultiDimensional OnLine Analytical Processing
1- MOLAPMulti-Dimensional On-Line Analytical
Processing
2The source The Case for Relational OLAP,
MicroStartegy Inc.
Atomicity (Gigabytes)
1000
x
Promotion Analysis DSS
Retail Merchant DSS
x
100
ROLAP
Banking Profit DSS
x
x
Insurance Policy DSS
10
Bank Credit Scoring DSS
x
x
Financials DSS
x
Utility Task DSS
MOLAP
Dimensionality
10
100
1000
OLAP/ROLAP
3The MOLAP Cube
Fact table view
Multi-dimensional cube
dimensions 2
43-D Cube
Multi-dimensional cube
Fact table view
day 2
day 1
dimensions 3
5Example
roll-up to region
Dimensions Time, Product, Store Attributes Pro
duct (descr, price, ) Store Hierarchies P
roduct ? Brand ? Day ? Week ? Quarter Store ?
Region ? Country
S3
Store
S2
roll-up to brand
S1
10 34 56 32 12 56
Juice Milk Coke Cream Soap Bread
Product
roll-up to week
M T W Th F S S
Time
56 units of bread sold in S1 on M
6Cube Aggregation Roll-up
Example computing sums
day 2
. . .
day 1
day1 day2
p1p2
129
s1s2s3
7Cube Operators for Roll-up
day 2
. . .
day 1
sale(s1,,)
129
sale(s2,p2,)
sale(,,)
8Extended Cube
day 2
sale(,p2,)
day 1
9Aggregation
Base data
1st Dimension
Initial aggregation in the 1st dimension
Fig. 1
10Aggregation
N-1 dimensional slice
1st Dimension
Completed aggregation in dimension 1
Fig. 2
11Aggregation
Case 2
Dim.3
Case 2
Dim.2
Case 1
Case 1
Dim.1
a. Directed aggregation in dimension 2, cases 1
and 2.
b. Directed aggregation in dimension 3, cases 1
and 2.
Fig. 3b
Fig. 3a
12Indexing
Fig. 4
13Index
Every single measure has an index INDk. Each row
of measures is being packed separately in records
. The entire slice is organized in a DATA FILE. A
DIRECTORY FILE keeps track of data file.
Z
XYZ - numeric values of time dim., product
dim., geography dim. e.g. 0360 044 03 Slice
0360 holds all the combinations of higher
dimensions
Y
X - 1st Dimension
Fig. 4
14Indexing
P1
P0
.
DATA BLOCK
.
Data
INDn1
Data
INDn2
DATA FILE
.
n
IND0
INDn
DIRECTORY FILE
Start XYZ
End XYZ
F_loc
Size
D_path
In-file location
Data file location
Fig. 5
15An example of indexing MDD sparse atomic data
D1 Store 100 D2 Product 1,000 D3 day 10,000 Each
store sells on average 5,000 different products
throughout the entire period (sparsity 0.05)
Geog. Z
Product Y
Days weeks months
quot. years
Time X
XYZ, e.g. 0360 044 03
Directory files
16- Aggregation performance
- D3 Stores 100
- D2 Products 1000
- D1 Days 10,000
- Its given that on average, each store sells
5,000 different products during the entire
period. - Sparsity
- Cube size 100 1,000 10,000 1,000,000,000
- Actual data 100 5,000 500,000
- Sparsity 0.05
- Data
- Record size R 82 Bytes
- Block factor bfr 1024/82 12
- Number of data blocks b 41,667
- Index (Time)
17- Single Retrieval store, product, day
- Binary search for index log2 1338 11 disk
accesses - one data access
- total 12 disk accesses
- 12 10msec 120 msec 0.12 sec
- Monthly aggregation store, product, month
- 1. Access to the first day at a product 12
accesses - 2. Once you are there, move serially through all
the product data to aggregate month by month - 41,667 blocks indexed by 1388 blocks are 41,667
1388 43,055 accesses - 3. 43,055 x 10msec 43sec
- Category aggregation store, category, day
(lets assume 100products/category) - Access all successive data blocks, each access 2
10msec 20msec - Total 41,667 20msec 83sec
18Operators
- ??????? Roll- up ????? ?? ??? ?????? ?? ???????,
?"? ??????? ????? ???????? ?? ?????.????? ????
???????? ????? ???? ????? ?? City (???), ???? ???
???? ????? ?? State (?????) ???? ?????? ???????
(????? ?? ?????? ?? ?? ????? ?????? ??????). - ??????? Drill- down ????? ?? ??? ?????? ??
??????? ??? ????? ???? ??????? ????? ????
?????.????? ???? ???????? ????? ???? ????? ??
Region, ???? ??? ???? ????? ?? State. - ??????? Push ???? ??? ????? ???? ???? ????? ??
??? ?????? ???? ????????? ?? ??? ????? ?? ???
???.????? ???? ?????? ??????? ???? ?????
??????? ?"? ???, ???? (???? 1) ???? (???? 3). ?"?
??????? ?? ????? ?????? "?????" ?? ???? ?????
??????? ???? ???? ????? ???? ??.?????? ?????
???? ????? ?????? lt??????, 2789gt, lt??????,
1744gt, ..., ???' - ??????? Pull ?????? ????? ?????? ?-Push ?????
???? ??? ?"? "?????" ??? ?????.??????? ??????,
???? ????? ?? ??? ????? ????????. ??? ???? ????
???, ???? ???, ????? ????? ??????? , ????? ??????
???? ?????? ???? ????? ????????.
19Operators
- ??????? Slice "????" ???? ??? ???????, ??
????-?????? ?????? ????? ???? ??????? ??? ??????
?-Slice. ????? ?? ?????? "?????" ???? ?????? ???,
??? ????? ???? ?? ?????? ??? ?? ???????.?????
????? ?? ??????? ?? ???? ???? "Model". ?? ????
Slice Model ????? Model ???? ???????? ??????? ???
?? ??? ?????? ???? ?????? ???? ?? ?????? (??????
???????? ????). - ??????? Dice (?????/?????) ????? ?? ???? ?????
????? ?? ?????? ?? ???? ????? ????? ?????? ??
?????? ?? ????? ???????? ?"? ????????. ????? ??
?????? "?????" ?? ???? ?? ??????? ?? ???? ??????
?? ???? ??? ???????? ?"? ?? ??????
????????.????? ????? ?? ??????? ?????? ?????
????? ?????? ?? ?????? Year ???
lt1990,1991,1992,1993,1994,1995,1996,1997,1998gt. - ??????? Dice Year lt1990,1991,1992gt ?????
????? ??? ?????? ?? ?????? ?????? ?????? ?????
?????? ?????? ?-lt1998,...,1993gt. - ??????? Select ??? ?????? ?? ????? ?-Dice. ?????
?? ????? ?????? ?????? ?? ???? ?????? ???? ??????
?? ????? ?????? ??????.????? ????? ?? ??????
?????? ???? ???????? Dice. ???????? Select ?????
?? ???? ???? ?????? ?? ??????? ????? ??????
?????? ?? ????? ?????? ??????. ??? ?? ????? ??
???????? 1990,1991,1992 Select Year ????
????? ????? ?? ?????? ??????? ?????
(1990,1991,1992).
20Aggregation Using Hierarchies
store
day 2
day 1
region
country
(store s1 in Region A stores s2, s3 in Region B)
21Slicing
PRODUCT p1
day 2
s1
s2
s3
day 1
44
4
day 1
day 2
12
50
TIME day 1
22Slicing Pivoting
(Pivoting rotation of a two-dimensional table.
Column and row are being changed)
23Twelve rules for evaluating OLAP products
(E.F.Codd)
- 1.Multidimensional Conceptual View
- 2.Transparency
- 3.Accessibility
- 4.Consistent Reporting Performance
- 5.Client-Server Architecture
- 6.Generic Dimensionality
- 7.Dynamic Sparse Matrix Handling
- 8.Multi-User Support
- 9.Unrestricted Cross-dimensional Operations
- 10.Intuitive Data Manipulation
- 11.Flexible Reporting
- 12.Unlimited Dimensions and Aggregation Levels
24Query Analysis Tools
- Query Building
- Report Writers (comparisons, growth, graphs,)
- Spreadsheet Systems
- Web Interfaces
- Data Mining