Data Warehousing

About This Presentation

Title:

Data Warehousing

Description:

1997-2001: Bachelor Student in Karlsruhe ... Lenovo T61. 2. Computers. Latitude ... 1. D4.group. D4.product. D4.id. 1. 3. 3. 2. 2. 1. 2. 1 ... – PowerPoint PPT presentation

Number of Views:61

Avg rating:3.0/5.0

Slides: 105

Provided by: system53

Category:

more less

Transcript and Presenter's Notes

Title: Data Warehousing

1
Data Warehousing

Donald Kossmann
Carsten Binnig
http//systems.ethz.ch

2
All about me

What does Google say?

What is the Truth? ?
1997-2001 Bachelor Student in Karlsruhe
2001-2004 Software Developer at Research Center
in Karlsruhe
2002-2004 Master Student in Karlsruhe
2004-2007 PhD Student in Heidelberg
From 2008 Senior Researcher at ETH Zurich
Working with SAP on main-memory column-stores for
DWH
gt 50 in Walldorf (Germany)

3
Organization

Please raise your hands if yes ? !!!
Exercises Who of you will attend the exercise
group at
11-12
12-13
Who of you knows the following basics?
Relational algebra
Query processing and Query Optimization
(join-ordering, query plans, iterator-model )
Index-structures (single-attribute, composite- or
multidimensional-indexes, )

4
Query Processing and Indexing
5
Some Literature (1)

1 Chee Yong Chan, Yannis E. Ioannidis Bitmap
Index Design and Evaluation. SIGMOD Conference
1998 355-366
2 Patrick E. O'Neil, Goetz Graefe Multi-Table
Joins Through Bitmapped Join Indices. SIGMOD
Record 24(3) 8-11 (1995)
3 George P. Copeland, Setrag Khoshafian A
Decomposition Storage Model. SIGMOD Conference
1985 268-279
4 Stavros Harizopoulos, Velen Liang, Daniel J.
Abadi, Samuel Madden Performance Tradeoffs in
Read-Optimized Databases. VLDB 2006 487-498
5 Daniel J. Abadi, Samuel Madden, Nabil Hachem
Column-stores vs. row-stores how different are
they really? SIGMOD Conference 2008 967-980
6 Nicolas Bruno Teaching an Old Elephant New
Tricks. CIDR 2009
7 Philippe Cudré-Mauroux, Eugene Wu, Samuel
Madden The Case for RodentStore An Adaptive,
Declarative Storage System. CIDR 2009
8 Ravishankar Ramamurthy, David J. DeWitt, Qi
Su A Case for Fractured Mirrors. VLDB 2002
430-441

6
Some Literature (2)

9 Anastassia Ailamaki, David J. DeWitt, Mark D.
Hill, Marios Skounakis Weaving Relations for
Cache Performance. VLDB 2001 169-180
10 Daniel J. Abadi, Samuel Madden, Miguel
Ferreira Integrating compression and execution
in column-oriented database systems. SIGMOD 2006
671-682
11 Jun Rao, Kenneth A. Ross Cache Conscious
Indexing for Decision-Support in Main Memory.
VLDB 1999 78-89
12 Jun Rao, Kenneth A. Ross Making B-Trees
Cache Conscious in Main Memory. SIGMOD Conference
2000 475-486
13 Wolfgang Lehner Datenbanktechnologie für
Data-Warehouse-Systeme. Konzepte und Methoden
(Book). Dpunkt
14 George Eadon, Eugene Inseok Chong, Shrikanth
Shankar, Ananth Raghavan, Jagannathan Srinivasan,
Souripriya Das Supporting table partitioning by
reference in oracle. SIGMOD Conference 2008
1111-1122
15 Surajit Chaudhuri, Umeshwar Dayal An
Overview of Data Warehousing and OLAP Technology.
SIGMOD Record 26(1) 65-74(1997)
16 A. Silberschatz, H.F. Korth, S. Sudarshan
"Database System Concepts" (Fifth Edition),
McGraw-Hill. (Chapters 18, 20-22)
17 Namik Hrle Disruptive Technology Trends In
Information Management. ETH Zurich Computer
Science Colloquium, March 9, 2009

7
Motivation

OLAP Online Analytical Processing
Use Case Analyze data by aggregating measures
over multiple dimensions (drill-down, roll-up)
Example Sales of Amazon
What is the total price of all products
in product group Computer
sold in quarter 04/2008
in country Switzerland
grouped by region?

Fact table
Dimensions
8
OLAP vs. OLTP
9
Star Schema (Example)
Dimension D3 Region
Dimension D2 Time
1
1
Fact Table F Sales
n
n
n
n
1
1
Dimension D1 Payment
Dimension D4 Product
10
Star Schema (Example)
1
1
Ref. to dimensions
Measure(s)
n
n
n
n
1
1
11
Star Query

Query Pattern

Example

SELECT ltdimensionsgt, ltaggregation-function(measur
e)gt FROM F, D1, D2, ...., Dn WHERE
ltjoin-conditionsgt AND ltfilter-conditionsgt GROUP
BY ltdimensionsgt
SELECT D3.region, SUM(F.price) FROM F, D2, D3,
D4 WHERE F.d2 D2.idAND F.d3 D3.id AND F.d4
D4.id AND D2.quarter 4 AND D2.year 2008 AND
D3.country'Switzerland' AND D4.group'Computer' G
ROUP BY D3.region
12
Query Processing in a Nutshell
SELECT SUM(F.price), D3.region FROM F, D3 WHERE
F.d3 D3.id AND D3.country'Switzerland' GROUP
BY D3.region
Declarative Query
Scanner /Parser
Canonical Relational Algebra Expression (Tree)
Optimizer
Optimal Execution Plan (Code)
Execution Engine
Query Result
13
Query Optimization in a Nutshell
Canonical Relational Algebra Expression (Tree)
Query Rewrite
Optimizer
Cost-based Opt.
Optimal Execution Plan (Code)

Query Rewrite (Logical Optimization)
Use rules and heuristics to rewrite relational
algebra expression (e.g. pushdown of projection,
pushdown of group-by, )
Cost-based Optimization (Physical Optimization)
Use a cost model to find optimal plan (e.g.,
determine concrete join-algorithm) with lowest
costs ( of disk pages that must be read)

14
Agenda

Query Processing (over Star Schema)
Implementation Techniques in RDBMS (ROLAP)
Materialized Views
Horizontal-Partitioning
Bitmap-Indexes
Join-Indexes
Optimizations for Memory Hierarchy
Vertical-Partitioning / DSM
Compression
CSS-/CSB-Tree
Collaborative Scans

15
Agenda

Query Processing (over Star Schema)
Implementation Techniques in RDBMS (ROLAP)
Materialized Views
Horizontal-Partitioning
Bitmap-Indexes
Join-Indexes
Optimizations for Memory Hierarchy
Vertical-Partitioning / DSM
Compression
CSS-/CSB-Tree
Collaborative Scans

16
(1) Naïve Join-Strategy
SELECT SUM(F.price), D3.region FROM F, D2, D3, D4
WHERE F.D2 D2.id AND F.d3 D3.id AND F.d4
D4.id AND D2.quarter4 AND D1.year2008 AND
D3.country'Switzerland' AND D4.group'Computer' G
ROUP BY D3.region

Idea Iteratively join dimension tables to fact
table (OLTP-style)
Optimizer should reorder joins and start joining
those dimensions first that have highest
selectivity for fact table F
The first join could use a single attribute index
on the dimension key of fact table F

17
Example Naïve Join-Strategy
1
1
2
2
2
3
1
3
18
Problems with Naïve Join-Strategy

Fact table is usually much bigger than dimension
tables
If the first dimension does not effectively
reduce the tuples that need to be read from disk
for table F then
Using an index scan on F (which might produce
random disk accesses) might be slower than a full
table scan on F
A full table scan of F might also be too
expensive for big a F table
Example Fact table F has size 100GB on disk and
300MB/s can be read from disk (e.g., SATA 3)
The total time to read the whole fact table from
disk is 333s which might not be acceptable for
online reporting

19
(2) Cross Product Plan

Idea First create cross product for all
dimensions and then join result with fact table
using all dimension keys
Works well, if all dimensions are small
(otherwise this is very exp.)
Final join effectively reduces data from F and
could use compound index on all dimensions keys
in table F

20
Example Cross Product Plan
1
1
4
1
2
2
2
1
21
(3) Semi-Join Plan

Idea Restrict tuples that are read from F before
joining dimensions
A pre-selection of RIDs is created using
semi-joins of dimensions with F gt
single-attribute indexes on dimension keys in F
can be leveraged
Single RID lists are intersected (e.g., using
hashing)
Read tuples from F that are in the resulting
RID-list

22
Example Semi-Join Plan
1
1
1
2
1
1
1,2,3
0,1,2
0,1,3
23
What have we learned so far?

Different strategies for executing a star-join
Naïve (including join-reordering)
Cross-product plan
Semi-join plan
Question How to implement these plans
efficiently?

24
What are possible solutions?

General optimizations
Materialized Views (logical access path)
Horizontal-partitioning (logical access path)
Optimizations for the scan / selection
operations
Bitmap-Indexes (physical access path)
Optimizations for the join / semi-join
operations
Join-Indexes (physical access path)

25
Agenda

Query Processing (over Star Schema)
Implementation Techniques in RDBMS (ROLAP)
Materialized Views
Horizontal-Partitioning
Bitmap-Indexes
Join-Indexes
Optimizations for Memory Hierarchy
Vertical-Partitioning / DSM
Compression
CSS-/CSB-Tree
Collaborative Scans

26
Materialized Views (see 13)

Idea Pre-compute aggregated query results that
can efficiently answer other queries over star
schema
Compared to normal views materialized views are
stored on disk
Query optimizer chooses to scan materialized
views instead of tables to speed-up query
processing (e.g., by avoiding joins)
Trade-off speed-up of query processing ? costs
of keeping and maintaining materialized views

27
Example Materialized View
Query
Materialized View (M)
SELECT D3.region, SUM(F.price) as
totalsales2 FROM F, D2, D3, D4 WHERE F.d2
D2.idAND F.d3 D3.id AND F.d4 D4.id AND
D2.quarter 4 AND D2.year 2008 AND D4.group
'Computer' GROUP BY D3.region
SELECT D2.quarter D3.region, D4.group, SUM(F.price
) as totalsales FROM F, D2, D3, D4 WHERE F.d2
D2.idAND F.d3 D3.id AND F.d4 D4.id AND
D2.year 2008 GROUP BY D2.quarter, D3.region,
D4.group
SELECT D3.region, SUM(totalsales) as totalsales2
FROM M WHERE D2.quarter 4 AND D4.group
'Computer' GROUP BY D3.region
28
Materialized Views Questions

Which materialized view(s) can be used to answer
a query?
How to maintain a materialized view if the base
tables are updated?
Which materialized views should be created? gt
Workload
How to rewrite a query using a materialized view?
gt Optimizer

29
Derivability

Question Can a materialized view M be used to
answer a query Q?
Sufficient conditions (see 13)
Selection predicates PQ of Q are subsumed in
selection predicates PM of M
Group-by attributes GQ of Q are subsumed in
group-by attributes GM of M
Aggregation functions FQ of Q are compatible to
aggregation functions FM of M
All tables RM referenced by M must also be
referenced by Q (and be joined using the same
join predicate as joins are not lossless)
gt View M can be used to answer Query Q

30
Derivability Selection Predicates

Problem
In propositional logic, subsumption of predicates
is undecidable for two arbitrary predicates
Solution
Restrict predicates to use only simple predicates
(i.e., a op c a is an attribute, op is a comp.
operator and c is a constant)
Translate predicate into DNF
Example
Selection predicate PQ year2008 AND quarter4
View predicate PM year2008
gt PQ is subsumed in PM

31
Derivability Group-by Lattice

quarter
region
group
quarter, region
quarter, group
region, group
quarter, region, group

Arrows indicate which queries / views can be
derived from a view
Number of views for n functional independent
dimension attributes 2n
For functional dependent attributes only a
polynomial number of views (e.g. for country,
region, city we only need 3 views grouping by
country, region, or city as the func. dep.
county-gtregion-gtcity hold
gt Not all views may be materialized Select a
good subset!

32
Example Group-by Lattice

quarter
region
group
quarter, region
quarter, group
region, group
M2
M1
quarter, region, group

Assume we create two materialized views (M1, M2)
using the group-by attributes quarter, region
and region, group
Queries that group-by quarter, region or
can be derived from M1
Queries that group-by region, group or can
be derived from M2

33
Derivability Aggregation Functions

Classification aggregation functions
Additive F(X1?X2) F(F(X1), F(X2)) and F-1
exists gt SUM(a1, a2, a3) SUM( SUM(a1, a2),
SUM(a3) ) and SUM(a3) SUM(a1, a2, a3) - SUM(a1,
a2)
Semi-additive Same but F-1() does not exists gt
MIN(a1, a2, a3) MIN( MIN(a1, a2), MIN(a3) )
Additive-computable F(x) F(F1(X), F2(X), ,
Fi(X) ) where F1, F2,, Fi are (semi-)additive
functions gt AVG SUM / CNT
All these classes of aggregation functions can be
used by a view to answer other queries
Other aggregation functions (e.g., median) do not!

34
Example Aggregation Functions
Query
Materialized View (M)
SELECT MIN(F.price) FROM F, D2, D3, D4 WHERE F.d2
D2.idAND F.d3 D3.id AND F.d4 D4.id AND
D2.quarter 4 AND D2.year 2008 AND D3.region
'Zurich'
SELECT D2.quarter D3.region, D4.group, MIN(F.price
) as minprice FROM F, D2, D3, D4 WHERE F.d2
D2.idAND F.d3 D3.id AND F.d4 D4.id AND
D2.year 2008 GROUP BY D2.quarter, D3.region,
D4.group
SELECT MIN(F.price) FROM M WHERE D2.quarter
4 AND D3.region 'Zurich'
35
Derivability Set of Relations

All tables RM referenced by M must also be
referenced by Q and be joined using the same join
predicate
Problem Joins are not lossless if they are not
equi-joins along a foreign-key relationship
If lossless joins are used then not all tables of
M need to be referred in Q

36
Example Set of Relations

Materialized view M holds all suppliers that are
also customers (indicated by same primary key
value)
Query Q asks for all suppliers in region Zurich

Supplier
Customer
Materialized View (M)
Query (Q)
SELECT COUNT() FROM Supplier s, Customer c WHERE
s.suppkey c.custkey
SELECT COUNT() FROM Supplier s WHERE
s.region'Zurich'
gt Question Is Q allowed to use M ?
37
View Maintenance

Question How to maintain a materialized view if
new data is loaded into the DWH?
Different alternatives
Update strategy Rebuild vs. incremental-update
Freshness Immediate vs. on commit vs. deferred

38
View Maintenance Update

Rebuild
Rebuild view(s) completely from base tables or
from other views when data changes
Multi-Query-Optimization
Different views might be based on the same base
tables
Rebuild each view individually or combine rebuild
(e.g., first create an intermediate view M3 in
order to update view M1 and M2)?
Incremental-Update
Compute a view delta by looking at the updates
only (and not at the base tables) and merge delta
with view
Views must satisfy certain properties (e.g., only
additive aggregation functions to support insert
and delete operations)
gt Question (for you) What about semi-additive
agg. functions?

39
Example Rebuild

Two materialized views (M1 and M2) exist that
contain
Total sales of 2008 in Switzerland (SUM)
M1 groups-by quarter, region
M2 groups-by region, group
Rebuild M1 and M2 gt First build a view M3 that
Aggregates the sales of 2008 in Switzerland and
Groups-by quarter, region, group

Example (Views M3 and M1)

View M3
View M1
40
Example Incremental-update

Materialized view M1 contains
Total sales of 2008 in Switzerland (SUM)
M1 groups-by quarter, region
New data for quarter 4 is loaded into fact table
F
Reversals of products
New sales of the day
Incremental update of M1
Derive a delta on the granularity of the view
Merge delta with view

Example (View M1 and Delta)

View M1
Delta
41
Example Incremental-update

Materialized view M1 contains
Total sales of 2008 in Switzerland (SUM)
M1 groups-by quarter, region
New data for quarter 4 is loaded into fact table
F
Reversals of products
New sales of the day
Incremental update of M1
Derive a delta on the granularity of the view
Merge delta with view

Example (View M1 and Delta)

Merged View M1 Delta
42
Agenda

Query Processing (over Star Schema)
Implementation Techniques in RDBMS (ROLAP)
Materialized Views
Horizontal-Partitioning
Bitmap-Indexes
Join-Indexes
Optimizations for Memory Hierarchy
Vertical-Partitioning / DSM
Compression
CSS-/CSB-Tree
Collaborative Scans

43
Horizontal-Partitioning (see 12)

Idea Split a table into n disjoint parts with
the same schema gt Union of all parts equals
original table
Benefits
Skip certain partitions based on query predicates
when executing a filter operation (e.g., year
2008) gt pruning
Distribute partitions (to different
machines/cores) gt Parallelize certain query
operations (e.g., parallel hash joins)
Add / remove huge data sets (logically
corresponding together) in one operation
Partitioning Schemes Range, Hash, Round-Robin

44
Horizontal-Partitioning By-Range

Idea Partition a table by the value-ranges of a
certain attribute (called partitioning attribute)
of that table
A tuple belongs to a partition if the value of
the partitioning attribute falls into a certain
range of values
Problems Skew in data gt partitions with
different sizes
Example Fact table F contains an attribute year
Create a partition for each relevant year F2006,
F2007, F2008, F2009
A query filtering by F.year may be restricted to
read the corresponding partitions only
Example A query filtering by F.year 2008 only
has to read partition F2008

45
Horizontal-Partitioning By-Hash

Idea Partition a table by the hash-buckets of a
certain attribute (called partitioning attribute)
The partition a tuple belongs to is determined by
computing partition hash(attribute value)
number of partitions
Good hash-function Avoids skew in partition
sizes
Problems Range-queries need to read more
partitions
Example Fact table F contains an attribute year
Create partitions F0, F1, F2, F3 and use hash
function F.year 4
A query filtering by F.year may be restricted to
read the corresponding partitions only
Example A query filtering by F.year 2008 only
has to read partition F0 (2008 4 0)

46
Horizontal-Partitioning By-Reference

Idea Equi-partition tables that are connected by
a parent-child relationship (i.e., a 1m
relationship)
Example Partition the fact table and dimension
table by the same attribute (e.g., D2.year)
Using range-partitioning, the attribute D2.year
has to be replicated into the fact table gt
updates of dimensions are critical
Instead use reference (i.e., foreign-key
relationship) from fact-table to dimension table
in order to partition fact table by D2.year
New in Oracle 11gR1 (see 14)

47
Agenda

Query Processing (over Star Schema)
Implementation Techniques in RDBMS (ROLAP)
Materialized Views
Horizontal-Partitioning
Bitmap-Indexes
Join-Indexes
Optimizations for Memory Hierarchy
Vertical-Partitioning / DSM
Compression
CSS-/CSB-Tree
Collaborative Scans

48
Indexing Basics

Idea Find tuples of a table that have a given
value for a certain attribute (without executing
a table scan)
Example Index on attribute year (logical view)

Table D2
Index D2.year -gt RID
gt Typical Index Implementation B-tree (single
attribute)
49
Example B-tree Index
Logical view on index
Nodes hold separators and pointers to children
B-tree index
Leaves hold keys andpointers to RID lists

31,33,
28,29,30,32
gt How does a lookup for year 2008 work? And
for 2006 year 2008?
50
OLAP Indexing

OLAP allows different indexing techniques than
OLTP
Update performance of indexes does not matter!
Data is loaded in bulks (e.g., once a day)
Afterwards, read-mostly data access
(multidimensional value-based access not
key-based access)
gt Indexes should be optimized for reads (not for
updates)
Multidimensional indexes are important (e.g., to
support queries with filters like D2.year2008
and D2.quarter4)
Main memory resident indexes increase read
performance (i.e., compression and
cache-optimizations are important)
gt Number / Size of indexes is only limited by
Available disk space and time to load new data /
build indexes

51
Bitmap-Indexes Motivation

Problem Traditional indexes have problems to
support multiple dimensions
B-tree with (composite key)
Concatenate values of different attributes and
index this value
gt Problem The major attribute determines the
index organization
R-tree (multidimensional index)
Nodes hold n-dimensional overlapping regions
gt Problem Too many overlapping regions make
lookup expensive
UB-tree (multidimensional index)
Map a n-dimensional point to a z-value gt use
Btree
gt Problem Combinatorial explosion of possible
z-values

52
Example R-Tree

Example R-Tree for two attributes D2.year and
D2.quarter

Logical view (quarter, year) -gt RID
D2.quarter
Q4
Q3
Q2
Q1
2005
2006
D2.year
2007
2008
53
Example R-Tree

Example R-Tree for two attributes D2.year and
D2.quarter

R-tree index (quarter, year) -gt RID
D2.quarter
Q4
Q3
Q2

D2.quarter
Q1
19
18
22
23
2005
2006
D2.year
2007
2008
gt How does a lookup for (Q2, 2006) work here?
54
Bitmap-Indexes Idea (see 1)

Idea A bitmap-index stores a bit-vector Bv for
each distinct value v of a single attribute (e.g,
2 values for gender)
Each bit-vector has the size of the table (of
tuples in table)
The bit at position n of one bit-vector is set to
1 if the tuple with RIDn has the value
represented by that bit-vector
Bitmap indexes can efficiently support
multidimensional queries
Use one bitmap-index per attribute
Boolean operations to combine bitmap-indexes
(AND, OR, NOT) are efficient because 32/64 bits
can be combined in 1 CPU cycle

55
Example Bitmap-Index
Index D4.brand -gt RID
Bitmap Index D4.brand
Bitmap Index D4.group
Index D4.group -gt RID
56
Example Bitmap-Index

Query Sales in group Computers for brands
Dell, Lenovo)

SELECT SUM(F.price) FROM F, D4 WHERE F.D4
D4.id AND D4.group 'Computer' AND (D4.brand
'Dell' OR D4.brand 'Lenovo')
Bitmap Index D4.brand

In order to find all relevant RIDs of dimension
D4 for that query
Calculate B BCom ? ( BDell ? BLen )
B indicates RIDs that need to be read

Bitmap Index D4.group
B 110000 ? (100100 ? 010000)
110000
gt RIDs 0,1 of D4 need to be read
57
Bitmap-Indexes Problems

Bitmap-indexes for attributes with large domains
consume a lot of space
Example Assume that the attribute D4.brand has
1000 distinct values and the D4 table holds 1m
products
gt Space needed by bitmap-index for attribute
D4.brand 1bn bits 125m bytes ? 120Mb or 125
bytes per tuple (which is a lot)
For range-queries over attributes with large
domains many bit-vectors need to be read from
disk
Example Assume that we want to select all tuples
of table D2 (time) where 1999 year 2008
gt 10 bit-vectors need to be read for attribute
year

58
Decomposed Bitmap-Indexes

Idea express all attribute values v as a linear
combination v c1v1 cnvn (c1cn are
constants)
Create a single bitmap-index for each variable
v1vn
gt Reduce space overhead for large domains
Example Assume that we want to index an
attribute a with values 0999
A standard bitmap-index would need 1000
bit-vectors
Decomposition v v3100 v210 v11 (where
vi 09 )
We create 3 bitmap-indexes each with 10 different
bit-vectors for the values 09 gt we need 30
bit-vectors instead of 1000
However, a point-query (e.g. a578) needs to read
all three bit-vectors (v1, v2, v3) and AND-them

59
Example Decomposed Bitmap-Index
60
Example Decomposed Bitmap-Index
61
Example Decomposed Bitmap-Index
62
Example Decomposed Bitmap-Index
63
Example Decomposed Bitmap-Index

Query

a576510071061

RIDs

Bv3,5 ?Bv2,7 ?Bv1,6 00100
gt RID 3, ...
64
Range-encoded Bitmap-Indexes

Idea The bit at position n of a bit-vector Bv is
set to 1 if the value of the tuple at RIDn is
less equal v
We do not need a bit-vector for the max. value of
a domain (this is a tautology all tuples hold
values less equal max. value)
gt Reduce number of bit-vectors to be read for
range queries
Example Assume that we want to index the
attribute D2.year with values 1999 2009
We need bit-vectors for the value 19992008
B1999, , B2008
The range-query 2001 year 2004 needs to read
the two bit-vectors (B2000, B2004) and calculate
B2004 AND (NOT B2000)

65
Example Range-encoded Bitmap-Indexes

Query

2001 D2.year 2004

RIDs

B2004 ? NOT B2000 01010
gt How to process a point query like
?
D2.year2004
66
Bitmap-Indexes Compression

Many entries of a bit map may be zero (if the
attribute uses a lot of different values)
Therefore compression techniques may be applied
gt At query time bitmaps need to be decompressed
Compression Techniques
Decomposed Bitmap-Indexes
Run length encoding (RLE)
Approximate bitmaps ( e.g., similar to
bloom-filters)

67
Approximate Bitmap-Indexes

Idea For table of size n (i.e., having n rows),
we can use a bitmap-index that uses bit-vectors
of size m ?? n
Apply hash function to determine the index i into
the bit-vector for a given rid i hash(rid)
(e.g., i rid m)
Standard Attribute a has value v at pos. rid ?
Bvrid 1
Approx. Attribute a has value v at pos. rid ?
Bvhash(rid) 1
This means
Bvhash(rid) 1 Attribute a could have value
v at position rid
Bvhash(rid) 0 Attribute a does not have
value v at position rid
Approx. bitmaps may return a superset of
potential RIDs
False positives have to be post-filtered using
the predicate

68
Example Approximate Bitmap-Indexes

Example Table D2 has 10 rows that are mapped to
bit-vectors of size 4 by using rid 4 as a hash
function
For a query with D2.year2008, we know
B200811 gt tuples at pos. 5, 9, 13, might have
value 2008
gt Not all hash-functions have an
inverse-functions -gt use probing

For rid9 with D2.year2008, we set B200894
B20081 1
69
Agenda

Query Processing (over Star Schema)
Implementation Techniques in RDBMS (ROLAP)
Materialized Views
Horizontal-Partitioning
Bitmap-Indexes
Join-Indexes
Optimizations for Memory Hierarchy
Vertical-Partitioning / DSM
Compression
CSS-/CSB-Tree
Collaborative Scans

70
Join-Indexes (see 2)

Special case of a materialized view using RIDs
For two tables R and S a join index may have
three different types
Type 1 Join value -gt (R.RID, S.RID) i.e.,
all rows of R and S that have the specified join
value
Type 2 R.RID -gt S.RID, i.e., all rows of S
that join with row R.RID
Type 3 R.attribute -gt S.RID, i.e., all rows of
S that join with those rows in R having the
specified attribute value (not the join key)

71
Example Join-Index (Type 12)
D3 Region
F Sales
n
1

Type 1 Join val.-gt(D3.RID, F.RID)
1 -gt (0, 0, 5)
2 -gt (1, 2, 3)
3 -gt (2, 1, 4)
4 -gt (3, 6)

Type 2 D3.RID -gt F.RID
0 -gt 0, 5
1 -gt 2, 3
2 -gt 1, 4
3 -gt 6
gt Can be derived from Type 1

72
Example Join-Index (Type 3)
D3 Region
F Sales
n
1

Type 2 D3.RID -gt F.RID
0 -gt 0, 5
1 -gt 2, 3
2 -gt 1, 4
3 -gt 6

Type 3 D3.region -gt F.RID
Zurich -gt 0, 5
Basel -gt 2, 3
Luzern -gt 1, 4
Bavaria -gt 6
gt Can be derived from Type 2

gt How to create a type 3 index from type 2 (if
the attribute values are not distinct)?
73
Example Join-Index (Type 3)
D3 Region
F Sales
n
1

Type 2 D3.RID -gt F.RID
0 -gt 0, 5
1 -gt 2, 3
2 -gt 1, 4
3 -gt 6

Type 3 D3.country -gt F.RID
Switzerland -gt 0, 1, 2, 3, 4, 5
Germany -gt 6
gt Can be derived from join-index (type 2) by
Joining index with D3 on RID
Merging F.RID lists with same values

74
Join-Index Implementation (1)

Question How to represent RID lists?
As compact lists (Similar to inverted lists in
IR)
Uncompressed
value -gt int,int,int, terminator or
value -gt number of RIDS, int,int,int
Compressed (using some compression scheme)
value -gt int, diff to previous value, diff to
previous value, term.
As bitmaps
Works well if total number of RIDs is not too
large and the cardinality of the indexed
attribute is low

75
Join-Index Implementation (2)

Question Which RID-list representation should be
used?
Depending on the cardinality of an attribute
RID-list representations may be switched
Attribute value frequent (i.e., one value is used
in many tuples) bitmaps may be a good choice
Attribute value rare (i.e., one value is used
only in some tuples) RID lists may be better
A clever index implementation may switch between
the two representations at any time
Therefore, an ordered RID list can be regarded as
a way to compress a bitmap (and vice-versa)

76
Example Bitmap Join-Index (Type 2)
D3 Region
F Sales
n
1

Type 2 D3.RID -gt F.RID
0 -gt 0, 5
1 -gt 2, 3
2 -gt 1, 4
3 -gt 6

Bitmap (Type 2)

of bit-vectors D3
Bit-vector length F

77
Example Bitmap Join-Index (Type 3)
D3 Region
F Sales
n
1

Type 3 D3.country -gt F.RID
Switzerland -gt 0, 1, 2, 3, 4, 5
Germany -gt 6
gt Can be derived from Type 2 which maps D3.RID
-gt F.RID

Bitmap (Type 3)

Bitmap (Type 2)
B0 ? B1 ? B2 BSwitzerland
78
Star-Join using Bitmap Indexes

Semi-Join plan Read only those tuples from F
that contribute to the result by pre-selecting
the relevant RIDs using semi-joins
Execution Plan
Selections Use bitmap-indexes
Semi-join Use join-index (type 2)
Alternatively Plan
Selection and Semi-JoinUse join-index (type 3)
Intersection of RIDs
AND all bit-vectorsreturned from semi-joins
Read tuples from F
Use index-nested-loop join on RID

79
Agenda

Query Processing (over Star Schema)
Implementation Techniques in RDBMS (ROLAP)
Materialized Views
Horizontal-Partitioning
Bitmap-Indexes
Join-Indexes
Optimizations for Memory Hierarchy
Vertical-Partitioning / DSM
Compression
CSS-/CSB-Tree
Collaborative Scans

80
Wait-Time in DBMS
from 17

Problem Even in a well-tuned DBMS wait time can
dominate! Why?

81
Memory Hierarchy
CapacityAccess Time
Control Granularity
Bytes1-5ns
Register
Compiler 1-8 Bytes
K-MBytes2-20ns
Cache Controller 8-128 Bytes (Cache Line)
Cache lt-gt Main- Memory Gap
M-GBytes50-100ns
Operating System 1-16 Kbytes(Disk Block, Page)
Main-Memory lt-gt HD Gap
GBytesmsec
Operator Files (Mbytes)
TBytessec
Operator Files (Mbytes)
TBytessec-min
82
and things are getting worse

HD capacity vs. HD access time

CPU clock rate vs.
Main memory access time

Improvement
Improvement
10000
100
1000
100
10
10
1
1
1970
1980
1990
2000
1970
1980
1990
2000
gt We can store more and more data while
accessing the data relatively gets slower
83
Wait-Time in DBMS due to I/O
from 17

Wait time of CPU processes in current DBMS due to
I/O
Bring data from hard disk (rotating) into main
memory gt disk seek
Bring data from main memory to CPU caches gt
cache misses
gt Goal Increase temporal/spatial locality of
data in memory hierarchy

84
N-ary Storage Model in DBMS (NSM)
NSM (Disk Page)
D3 Region
Header
1
2
Zurich
Switzerland
3
Basel
Switzerland
4
Luzern
Switzerland
5
Bavaria
Germany
Baden-W.
Germany
o
o
o
o
o

In a DBMS data is stored clustered by pages on a
disk
Page Layout N-ary Storage Model / Slotted Page
All tuples are stored sequentially
All attributes of a tuple are stored together
gt Question What happens in the memory hierarchy?

85
NSM in Memory Hierarchy
SELECT id FROM D3 WHERE country'Switzerland'
CPU Cache
Disk
Main Memory
Cache Line 1
Cache Line 2
Cache Line 3
Cache Line 4

() NSM optimized for full-tuple access (as in
OLTP)
(-) Partial-tuple access hurts at all memory
levels

86
Vertical Partitioning

Idea Divide a table into multiple tables that
contain fewer columns (each stored on disk using
the NSM)
Cluster attributes into one table that are
accessed together
Unique key column is replicated in all tables for
joining
When scanning a table only relevant data must be
read from disk into main memory and then into CPU
caches

D3 Region
D3 Region (vertically partitioned)
87
Decomposed Storage Model (DSM)

Idea Extreme Case of vertical partitioning (see
3)
Divide a table into a set of two-column tables
(RID, attribute) gt one table for each attribute
Alternatively divide a table into one-column
tables storing the RID implicitly (array-like
representation)
Store resulting tables on separate disk pages

D3 Region
D3 Region (decomposed)
88
Decomposed Storage Model (DSM)
D3 Region (decomposed)
Disk Pages
Header
Header
Header
1
2
3
Zurich
Switzerland
5
4
Basel
Switzerland
Luzern
Switzerland
Bavaria
Germany
Baden-W.
Germany

() Very efficient when only a few attributes of
a table are accessed
(-) Full-tuple access needs to read data from
different disk pages

89
Column-Stores

Specialized stores that use DSM to layout data on
disk
Used to speed up OLAP (i.e., read-only) workloads
Questions What is different to traditional
row-oriented stores (as todays DBMS) that use
vertical partitioning?
Late materialization columns that are read from
disk are joined together into rows as late as
possible in a query plan gt avoid reading data
that is not relevant for the query result
Vectorized Query Processing Multiple values from
a column are passed as a block from one operator
to the next rather than using per-tuple iterators
gt inefficient for column-stores
Lightweight compression Reduce I/O to read data
from disk and enable query processing on
compressed data (e.g, RLE)

90
Column-Stores vs. Row-Stores (1)

Recent Study (see 4) Scanning a Table (NSM vs.
DSM)
Tuple width 150 bytes, 16 attributes, 9.5GB
table
Read 10 data of table using a filter operation
and vary the number of selected attributes (i.e.,
selected bytes per tuple)

91
Column-Stores vs. Row-Stores (2)

Two other studies 5, 6 Simulating a
column-store using a row-store
Both use Star Schema Benchmark for performance
evaluation
5 compares a column-store to another row-store
using
Vertically-partitioned tables
Tables which have one index per attribute using
index-only plans
Materialized views (as optimal case)
6 compares scan of relevant attributes (i.e.,
optimal column-store) to
A vertical partitioned row-store using a special
logical schema (explicit RLE)
(Generalized) Materialized views
Fight between column-store and row-store industry
5 A row-store can not reach the performance of
a column-store for read-mostly workloads because
column-stores use additional optimizations
6 We can teach a row-store the same tricks

92
Other Storage Models

PAX (Partition Attributes Across) (see 9)
Store all attributes of a tuple together in one
disk page
Inside one disk page store tuples partitioned
column-wise
Fractured-mirrors (see 8)
Keep both representations row-store and column
store
Depending on type of query pick appropriate store
Rodent-store (see 7)
Declaratively specify storage layout A table
uses different storage layout for different
attributes (row-wise, column-wise, )
Generate code for data access optimized for that
layout

93
Agenda

Query Processing (over Star Schema)
Implementation Techniques in RDBMS (ROLAP)
Materialized Views
Horizontal-Partitioning
Bitmap-Indexes
Join-Indexes
Optimizations for Memory Hierarchy
Vertical-Partitioning / DSM
Compression
CSS-/CSB-Tree
Collaborative Scans

94
Compression Query Processing

Heavyweight compression schemes
Data is stored highly compressed on disk
When data is read from disk it is decompressed
immediately
Afterwards query processing operations (i.e.,
filters, joins, ) execute on decompressed data
gt Improve I/O performance (i.e., load data from
disk into main memory) but high CPU-costs
Lightweight compression schemes
Data is sub-optimally compressed on disk
When data is read from disk it is not
decompressed immediately
Many query processing operations (i.e., filters,
joins, ) can be executed on compressed data
Optimally, data is only decompressed when the
final query result is built
gt Improve I/O performance and query execution
performance

95
Lightweight Compression Schemes (in
Column-Stores)

Run-length-encoding
Compresses runs of the same value in a column to
a compact singular representation (value, run
length)
Example
Store lt3, 3, 3, 3, 10, 10, 4, 4gt as lt(3,4),
(10,2), (4,2)gt
Aggregation (SUM) 3410242 instead of
3333101044
Dictionary-encoding
Replace long values (e.g., strings) with smaller
codes
Example
Replace values female and male of column
gender with 0 and 1
A column ltmale, male, male, female,
female, male, female, malegt can be stored
as lt1,1,1,0,0,1,0,1gt
Filter gendermale (string compare) is
rewritten as gender1 (integer compare)
Many other compression schemes (see 10)

96
Agenda

Query Processing (over Star Schema)
Implementation Techniques in RDBMS (ROLAP)
Materialized Views
Horizontal-Partitioning
Bitmap-Indexes
Join-Indexes
Optimizations for Memory Hierarchy
Vertical-Partitioning / DSM
Compression
CSS-/CSB-Tree
Collaborative Scans

97
Optimizing for CPU caches

Memory references do not have uniform costs
anymore
Gap between cache access and main memory access
gtImprove cache behavior (increase temporal and
spatial cache locality of data and instructions)
Cache conscious data structures and algorithms
Exploit cache with detailed knowledge about cache
sizes
Specialized approach
Cache oblivious data structures and algorithms
Exploit cache without detailed knowledge about
cache sizes
General approach

98
CSS-Tree (see 11)

Cache-Sensitive-Search-Tree Building a
cache-conscious directory over a sorted array of
elements
Improve on B trees
Store only separators in nodes and no pointers
gt More separators per node (higher fan-out,
lower height)
All nodes are stored in one contiguous array
gt Identify array offsets of child nodes by
arithmetic operations
Node size should not exceed cache line
gt Minimize cache misses when loading a node
Supports very efficient read operations und bulk
loads but updates are very expensive gt rebuild
tree

99
Example CSS-Tree (1)

Logical representation (Keys per node m3)

Physical Representation(contiguous array)

Internal nodes
Leaf nodes (not stored)
Sorted Array
E.g., leaf 13 element 22 (sorted array)
Sorted Array
Node 2 (values)
Child Node
9
10
11
12
100
Example CSS-Tree (2)

Lookup value v on node n
Binary search on node -gt key at pos. k
Compute offset of child node into array c
n(m1) (k1)
If c gt array of internal nodes then offset into
sorted array of leaves must be calculated (see
11)
Example
Lookup value 253 on node n2 -gt k1
Next child node offset -gt c 24 2 10

CSS-Tree (m3)

Logical
Physical
101
CSB-Tree (see 12)

Similar to B-tree
Store all children of one node in one contiguous
array
Store only one pointer for all children of a node
Lookup more efficient than for B-tree gt more
keys per node
Compared to CSS-tree
Lookup is little less efficient
Updates are much cheaper
Lookup value v on node n
Binary search on node -gt key k
Child node offset c k(m1)

CSB-Tree (m3)

Lookup value 25 on root -gt Key k 1
Child node offset-gt c 1 3 3

102
Agenda

Query Processing (over Star Schema)
Implementation Techniques in RDBMS (ROLAP)
Materialized Views
Horizontal-Partitioning
Bitmap-Indexes
Join-Indexes
Optimizations for Memory Hierarchy
Vertical-Partitioning / DSM
Compression
CSS-/CSB-Tree
Collaborative Scans

103
Collaborative Scans

Idea Share scan cursors among queries that have
to scan the same table
Example
Q1 SELECT SUM(F.price) FROM F, D3 WHERE
F.d3D3.id AND D3.country 'Germany'
Q2SELECT AVG(F.price), D3.region FROM F, D3
WHERE F.d3D3.id AND D3.country 'Germany'
GROUP BY D3.region
gt Q1 and Q2 can share the same scan cursors for
F and D3

104
Summary