Title: Topics 9-10: Database optimization for modern machines
1. Topics 9-10: Database optimization for modern machines
- Computer architecture has been changing over the years, so we have to change our programs, too!
- Most database operators, indexes, buffering techniques and storage schemes were developed and optimized over 20 years ago, so we need to re-tune them for the modern reality.
2. Moore's Law: Dictionary Term
- http://info.astrian.net/jargon/terms/m/moore_s_law.html
- Moore's Law /morz law/ prov. The observation that the logic density of silicon integrated circuits has closely followed the curve (bits per square inch) = 2^(t - 1962), where t is time in years; that is, the amount of information storable on a given amount of silicon has roughly doubled every year since the technology was invented. This relation, first uttered in 1964 by semiconductor engineer Gordon Moore (who co-founded Intel four years later), held until the late 1970s, at which point the doubling period slowed to 18 months. The doubling period remained at that value through the time of writing (late 1999). Moore's Law is apparently self-fulfilling. The implication is that somebody, somewhere is going to be able to build a better chip than you if you rest on your laurels, so you'd better start pushing hard on the problem.
3. Features of Modern Machines and Future Trends
- CPU speed, memory bandwidth, and disk bandwidth follow Moore's Law. On the other hand, memory latency improves by only 1% per year.
- Memories are becoming larger, and slower relative to memory bandwidth and CPU speed.
- Most of the time the processor of your PC is idle, waiting for data to be fetched from memory.
- The role of main memory caches is becoming very important in overall performance.
4. Features of Modern Machines and Future Trends (cont'd)
- Disk bandwidth also follows Moore's Law.
- On the other hand, disk seek time improves by only 10% per year.
- Thus a random disk access today may cost as much as 20-50 sequential accesses.
5. Features of Modern Machines and Future Trends (cont'd)
- Another characteristic of modern machines is that their processors (AMD Athlon, Intel P4) have parallel processing units able to process multiple (e.g., 5-9) independent instructions at the same time.
6. The new bottleneck: Memory Access
- Memory is now hierarchical, with two levels of caching.
  (Figure: the CPU and the L1 cache sit on the CPU die; below them the L2 cache, then main memory. Transfer units between the levels: L1 cache-line, L2 cache-line, memory page.)
- Memory latency: the time needed to transfer 1 byte from the main memory to the L2 cache.
- Cache (L2) miss: the requested data is not in the cache and needs to be fetched from main memory.
- Cache-line: the transfer unit from main memory to the cache (e.g., an L2 cache-line is 128 bytes).
7. Example of Memory Latency Effects in Databases
- Employee(id: 4 bytes, name: 20 bytes, age: 4 bytes, gender: 1 byte)
- Disk/memory representation (29 bytes per tuple):
  12000 Mike Chan 30 M | 12305 George Best 27 M | ...
8. Example of Memory Latency Effects in Databases (cont'd)
- Example query: how many girls work for the company?
- SELECT COUNT(*)
- FROM EMPLOYEE
- WHERE gender = 'F'
- What is the cost of this query, assuming that the relation is in main memory?
- We have to access the gender values, compare them to 'F' and add 1 (or 0).
9. Example of Memory Latency Effects in Databases (cont'd)
- Accessing a specific address from memory also loads the information around it into the cache-line (in parallel over a wide bus).
- Example: accessing the gender of the first tuple brings a whole cache-line (128 bytes) into the cache:
  12000 Mike Chan 30 M | 12305 George Best 27 M | ... (29 bytes per tuple)
10. Example of Memory Latency Effects in Databases (cont'd)
- Thus a cache-line at a time is accessed from main memory.
- If the requested information is already in the cache (e.g., the gender of the second tuple), main memory is not accessed.
- Otherwise, a cache miss occurs.
11. Effect of cache misses
- The stride.c program demonstrates how cache misses may dominate the query processing cost.
- http://www.csis.hku.hk/nikos/courses/CSIS7101/stride.c
- Sketch of the inner loop (buffer is a large, randomly initialized memory buffer):
    sum = 0;
    ilimit = 200000;   /* tentative */
    for (i = 0; i < ilimit; i += stride)
        sum += buffer[i];
    printf("sum=%d\n", sum);
- (Illustrated for stride = 5.)
12. Effect of cache misses (cont'd)
13. Re-designing the DBMS
- We need to redesign the DBMS in order to face the new bottleneck: memory access.
- New storage schemes are proposed to minimize memory and disk accesses.
- Query processing techniques are redesigned to take the memory-latency effects into consideration.
- Algorithms are changed to take advantage of the intra-parallelism of instruction execution.
- New instruction types are used (e.g., SIMD instructions).
14. Problems in optimizing the DBMS
- Programming languages do not have control over the replacement policy of memory caches; this is determined by the hardware.
- As a result, we cannot apply buffer management techniques for memory cache management.
- Also, the simple instructions generated by programming languages cannot control the parallel instruction execution capabilities of the machine.
- In many cases, we re-write the programs, trying to fool the cache replacement policy and the instruction execution, in order to make them more efficient.
15. The N-ary storage model
- The N-ary storage model (NSM) stores the information for each tuple as a contiguous sequence of bytes.
- Disk/memory representation (29 bytes per tuple):
  12000 Mike Chan 30 M | 12305 George Best 27 M | ...
16. A Decomposition Storage Model (DSM)
- The table is vertically decomposed, and the information for each attribute is stored sequentially:
  a1: 12000 Mike Chan | 12305 George Best | ...
  a2: 12000 30 | 12305 27 | ...
  a3: 12000 M | 12305 M | ...
17. Properties of DSM
- The relation is decomposed into many binary tables, one for each attribute.
- The attributes of a binary table are a surrogate id and the attribute from the original relation.
- The surrogate id is necessary to bring the information for a tuple back together, by joining the binary tables.
- Example: print the information for the tuple with id = 12305:
  NSM: SELECT * FROM EMPLOYEE WHERE id = 12305
  DSM: SELECT id, name, age, gender FROM a1, a2, a3 WHERE a1.id = 12305 AND a2.id = a1.id AND a3.id = a1.id
18. Properties of DSM (cont'd)
- Advantages of DSM:
  - If the relation has many attributes, but queries involve only a few attributes, then:
    - much less information is read from disk than in NSM;
    - cache misses are fewer than in NSM, because the stride is smaller.
  - Projection queries are very fast.
- Disadvantages of DSM:
  - If a tuple needs to be reconstructed, this requires many joins.
  - The size of the decomposed relation is larger than the original, due to the replication of the surrogate key.
19. A Decomposition Model with Unary tables (Monet)
- The binary tables are reduced to unary tables: the surrogate ids are not materialized (e.g., the gender column is stored simply as M, F, ..., with no surrogates).
- Given the surrogate sid of a tuple, we can compute its attribute value v by
  v = *(table_address + (sid - first_surrogate) * size)
20. Partition Attributes Across (PAX)
- Attempts to combine the advantages of both NSM and DSM, while avoiding their disadvantages.
- The data are split into pages, like in NSM, but the organization within each page is like DSM. Thus:
  - Cache misses are minimized, because the information for a specific attribute is stored compactly in the page.
  - Record reconstruction cost is low, since the tuples are stored like in NSM on disk, but vertically partitioned within each page, where the reconstruction cost is minimal.
21. How do pages look in each scheme?
- NSM: each page has a PAGE HEADER followed by complete tuples stored one after the other (e.g., 12000 Mike Chan 30 M, 12305 George Best 27 M, ...; a second page holds 54647 John Kit 33 M, 94765 Flora Ho 27 F, ...).
- DSM: many tables, each with its own pages; every page holds surrogate-value pairs for a single attribute (e.g., 12000 Mike Chan, 12305 George Best, ... in the name table's pages; 12000 30, 12305 27, ... in the age table's pages).
- PAX: each page has a PAGE HEADER followed by one minipage per attribute (e.g., ids 12000, 12305, ...; names Mike Chan, George Best, ...; ages 30, 27, ...).
22. Comparison between PAX and NSM
- SELECT AVG(age)
- FROM EMPLOYEE
- WHERE id < 20000
- NSM: accessing the age of a tuple loads a whole cache-line of tuple data into the cache (e.g., 12000 Mike Chan 30 M ...), most of which is irrelevant to the query.
- PAX: only the relevant data are fetched to the cache (the id minipage 12000, 12305, ... and the age minipage 30, 27, ...).
23. Comparison between PAX and DSM
- SELECT AVG(age)
- FROM EMPLOYEE
- WHERE name LIKE 'M%'
- DSM: the name pages are scanned first (12000 Mike Chan, 12305 George, ...); the qualifying ids (e.g., 12000, ...) then have to be joined with the pages of the age table to compute the average.
- PAX: the name and age minipages are in the same page, so the join is avoided.
24. Is PAX always better than DSM?
- NO. If the relation has many attributes (e.g., 10) and the query involves only a few (e.g., 2), then the join may be more beneficial than reading whole tuples into memory.
- Example: R(a1, a2, a3, ..., a10)
- Each attribute is 4 bytes long.
- DSM: D1(id,a1), D2(id,a2), ..., D10(id,a10).
- Query:
- SELECT avg(a2)
- FROM R
- WHERE a1 = x
- Query selectivity is very high (1%); the qualifying ids can fit in memory.
- PAX has to read the whole table: 40 bytes x |R|.
- DSM has to read D1, apply the query, and create an intermediate table X with the qualifying ids (in memory). Then it has to read D2 to get the qualifying a2 values that join with X. So the total number of bytes DSM reads from disk is 16 bytes x |R|.
25. Summary
- In modern computer architectures the bottleneck is memory access; in many cases the processor is waiting for data to be fetched from memory.
- Memory access is no longer "random": when we access a location in memory, the data around it are loaded into a fast memory chip (the cache). Access locality is very important.
- Database operators, storage and buffering schemes, and indexes are being optimized to reduce cache misses.
26References
- George P. Copeland, Setrag Khoshafian, A
Decomposition Storage Model. SIGMOD, 1985. - Peter A. Boncz, Stefan Manegold, Martin L.
Kersten, Database Architecture Optimized for the
New Bottleneck Memory Access. VLDB, 1999. - Anastassia Ailamaki, David J. DeWitt, Mark D.
Hill, Marios Skounakis Weaving Relations for
Cache Performance, VLDB, 2001. - P. A. Boncz. Monet A Next-Generation DBMS Kernel
For Query-Intensive Applications. PhD
dissertation. Universiteit van Amsterdam, May
2002.