Chapter 4: Dimensions, Hierarchies, Operations, Modeling

About This Presentation
Title:

Chapter 4: Dimensions, Hierarchies, Operations, Modeling

Description:

Computation with binary Tree. 2. 2. 2. 2. 2. 2. 2. 2. 1. 1. 1. 1. 1. 1. 1. 1 ... daily sales during Christmas season. vacation bookings for skiing on fasching ... – PowerPoint PPT presentation

Number of Views:66
Avg rating:3.0/5.0
Slides: 37
Provided by: Bay81

less

Transcript and Presenter's Notes

Title: Chapter 4: Dimensions, Hierarchies, Operations, Modeling


1
Chapter 4 Dimensions, Hierarchies, Operations,
Modeling
2
Chapter 4.1 Hierarchical Dimensions
Def Hierarchical Dimensions are composite keys
with an order on the key attributes. Prefixes are
allowed as keys. Ex dimension Time ( Year,
Month, Day) legal keys are (Year) or (Year,
Month) or (Year, Month, Day) Def Basic facts
are values in cells with full foreign keys
3
Aggregations, Summaries
Def Aggregations are facts in cells with
partial keys. These facts are derived by
aggregation functions. In a cube with derived
facts the aggregation function must be specified.
Ex Sales on a monthly basis Sales (Year,
Month) S Sales (Year, Month, Days) Aggregation
Functions count, sum, avg, min, max, ...
4
Note on Aggregations
  • Aggregations may be stored explicitely in the
    cube, but then they should be secured by
    integrity constraints
  • Aggregations may be virtual and must be computed
    on demand when needed
  • i.e., classical tradeoff between storage space,
    performance, flexibility

5
Relational Modeling
Expand and complete partial key by ALL (Year,
Month, ALL) (ALL, Month, ALL) (ALL, ALL,
ALL) to obtain simple and complete relational
keys via special symbol ALL Question SQL to
compute complete cube with all aggregations from
base-cube?
6
Hierarchy Example
7
Chapter 4.2 OLAP Operations
Def Roll-up computes higher aggregations from
lower aggregations or base facts according to
hierarchies Ex for base facts (Year, Month, Day)
there are 3 hierarchical roll-up
functions Roll-up (Year, Month, ALL) Roll-up
(Year, ALL, ALL) Roll-up (ALL, ALL, ALL) which
are supported in general (canonical roll-ups)
8
Additional Roll-ups (ALL, Month, ALL)
etc. therefore 23 -1 aggregations or in
general 2m -1 aggregations for m hierarchy
levels Note see later chapters for the support
of arbitrary aggregations Note for m dimensions
with h1, h2, ...hm hierarchy levels there are
different aggregations for a given aggregation
function.
9
Size of base cube 2-dim example Dim1 (4, 5)
cardinality of the dimension levels Dim2 (6, 7,
2) (4 5) ( 6 7 2) 1680 Size of base
cube
42
20
84
10
Size of hierarchically aggregated Cube
4 - 6 7 2 336
4 5 6 7 - 840
- - 6 7 2 84
4 - 6 7 - 168
4 5 6 - - 120
- - 6 7 - 42
4 - 6 - - 24
4 5 - - - 20
- - 6 - - 6
4 - - - - 4
- - - - - 1
Number of cells per aggregation function 1645
11
Size of completely aggregated cube
4 5 6 7 2
0 0 0 0 0
0
0

0 0 0 0
0
0

0 0 0 0

0 0 0 0

12 7 14
24
24 x 6 144
168
5 x 168 840 840 1686 x 168
10084 x 1008 40325 x 1008 4032 1008
5040


12
Computation with binary Tree
4
5
1
20
4
6
1
1
6
24
120
20
4
1
1
1
7
1
7
7
7
20
168
24
28
4
140
840
120
2
1
2
2
2
1
1
1
1
1
2
2
1
2
2
1
4
8
20
28
56
24
48
40
168
336
140
120
1680
840
240
280
13
Size of the Cube
Lemma Given a data cube with m dimensions with
h1, ..., hm hierarchy levels resp. Let the
hierarchy levels of dimension i have Then
the base cube has and the cube
with all aggregations has
14
Size of the Cube (2)
  • The aggregated cube is larger than the base cube
    by the
  • factor

15
Size of the hierarchically aggregated Cube For a
hierarchy i with hi levels and there are
hierarchical aggregation possibilities ,
i.e. Lemma A hierarchically completely
aggregated data cube has
16
Ex (4 5) (6 7 2) size of the
hierarchically aggregated cube plus base cube
(1 4 20) (1 6 42 84) 25 133
3325 Ex (4 5) (6 7 2) ( 8 3) size of
base cube 40,320 hierarchically aggregated cube
plus base (1 4 20) (1 6 42 84)
(1 8 24) 3325 33 109,725
17
Ex (4 5) (6 7 2) ( 8 3) (5 9) size
of base cube 1 814,400 hierarchically aggregated
cube plus base 109,725 (1 5 45) 5
595,975
18
Additional comments on aggregations 1. In
addition to the size of the complete cube there
is a factor of 5 for the various aggregation
functions, e.g. sum, avg, min, max, count, ... 2.
So far we did not consider general restrictions,
e.g. all Saturdays in March or vacation
months July and August, which cross bounds of
hierarchy levels
Interactive query formulation results in an
unlimited number of aggregations
Optimization restrictions corresponding to
hierarchy levels shoud be pushed down, since they
lead to query boxes
19
Note See later chapters for multidimensional
indexes and MHC techniques and optimization of
ROLAP-algebra to support hierarchical canonical
aggregations like Roll-up (Year, Month,
ALL) Roll-up (Year, ALL, ALL) Roll-up (ALL,
ALL, ALL) but not Roll-up ( ALL, Month, ALL)
20
Optimization Problem Non-hierarchical
aggregation, e.g. March for all years decompose
into union of several restrictions, e.g. S Sales
(Year, Month, Day) where Month March
and (Year 1996 or Year 1997 or Year
1998) see later for translation into ROLAP
expression and transformations for optimization
21
Multiple Hierarchies e.g. the time
hierarchy Aggregation for month e.g. by
covering QB of weeks and postfiltering
22
Navigation Operations Drill Down first show
single result for aggregated value, e.g. sales
per day, then show hourly values for days with
very high or very low sales in order to plan
working hours for sales people better Other
Examples daily sales during Christmas
season vacation bookings for skiing on fasching
23
Roll-up Compute Aggregations
24
Slicing
Selection of a smaller data cube or even
reduction of a multidimensional datacube to fewer
dimensions by a point restriction in some
dimension (becomes pivot element)
25
Dicing (würfeln) rotate result, to show another
view, e.g. exchanging rows and columns Slice
management precomputing and caching of several
slices for later or special use, e.g. for a
special sales person
26
Chapter 4.3 Modeling Methodology
Purpose analysis of business processes,
characteristic facts (Kennzahlen) for managers to
support decisions (DSS) Steps of Decision
Process 1. Which business processes to model and
analyze? 2. What are the measures, where do they
come from? 3. Which degree of details, e.g.
minutes like in SAP? Which precision is required
for OLAP? 4. Common properties of measures to
determine dimensions? Brand, Time, geogr. Region,
Productgroup? Dependencies between levels of
hierarchies?
27
  • 5. Attributes of dimensions, e.g. of products
  • screen size of TV computers
  • cc and PS for cars
  • focal length for camera
  • Problem how common are properties to dimensions?
    Non common properties cannot be modeled by levels
    of dimensions, are called features at GfK (up to
    50), they are numbered, their meaning dependent
    on a specific dimension element, e.g.
  • TV screen size color audio system
  • Car transmission cc PS cyl ...

28
  • 6. Constant or changing attributes of dimensions?
    E.g.
  • New models of car makers
  • new powersource electrical, hydrogen, solar
  • attributes are rather stable, but still should be
    planned ahead! (mergers like Daimler-Crysler)
  • 7. Sparsity one hypercube or several, i.e.
    multicube model? Influences storage requirements,
    query formulation and performance, cannot be
    hidden easily from user, maybe by views?

29
  • 8. Caching and management of aggregates?

Time
Optimal Number of aggregates
30
Chapter 4.4 Comparison of OLAP Architectures
  • MOLAP Multidimensional OLAP
  • ROLAP Relational OLAP
  • 3. HOLAP Hybrid OLAP

31
MOLAP Architecture
32
MDDBMS in ANSI-X3-Sparc
33
Logical components of a MDDBMS
34
ROLAP Architecture
35
HOLAP Architecture
36
  • Reasons for MOLAP
  • performance
  • write access
  • Data Marts
  • functional power
  • Reasons for ROLAP
  • scalability
  • flexible precomputations, partial aggregates
  • parallelism
  • DB-mamagement and ACID
Write a Comment
User Comments (0)