Erhan Erdin - PowerPoint PPT Presentation

About This Presentation

Title:

Erhan Erdin

Description:

Computer Architecture Support for Database Applications. Erhan Erdinç Pehlivan. Outline: Introduction, Methodology of the Experiment, Analysis of OLTP workloads, Analysis ...

Slides: 46

Provided by: T08

Transcript and Presenter's Notes

1

Computer Architecture Support for Database
Applications

Erhan Erdinç Pehlivan

2
Outline

Introduction
Methodology of the Experiment
Analysis of OLTP workloads
Analysis of DSS workloads
Conclusion

3
Introduction

Today, database workloads alone motivate the sale
of vast quantities of symmetric multiprocessor
(SMP) machines.

4
Introduction

Unfortunately, because of several challenges,
commercial applications are often ignored in
favor of technical benchmarks such as SPEC
(Standard Performance Evaluation Corporation).
Reasons:
Complex standardized benchmarks.
Large hardware requirements for full-scale runs.
Numerous configuration parameters.
Lack of useful proprietary information.

5
What is SMP?

A method of work management that treats all
processors equally.
Threads can run concurrently on any
available processor.
Improves the total throughput of the system.
Requires applications that can take advantage of
multi-threaded parallelism.

6
SMP ARCHITECTURE
7
SMP (Continued)

Advantages of SMP
High performance
Simplicity to program
Easier load balancing
Disadvantages of SMP
Low availability
Low scalability

8
Database Workloads

OLTP (Online Transaction Processing)
Ex: airline reservation systems
DSS (Decision Support Systems)
Ex: data warehouse systems

9
Characteristics of OLTP and DSS

OLTP
Uses short, moderately complex queries that read
and/or modify a relatively small portion of the
overall database.
Has a high degree of multiprogramming.
DSS
Uses typically long-running, moderately to very
complex queries that scan large portions of the
database in a read-mostly fashion.
Has a multiprogramming level that is typically
much lower than that of OLTP systems.

10
Motivation

Since SPEC evaluations do not hold for DBMSs, the
architectural behavior of two standard database
workloads will be investigated in terms of:
cycles per instruction (CPI) decomposition,
cache miss rates,
branch behavior,
superscalar issue and retire,
out-of-order execution.
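
As a rough illustration of what a CPI decomposition computes, the sketch below divides total cycles and per-source stall cycles by the instruction count and treats the remainder as computation. The function name, the stall source names, and all numbers are hypothetical and are not taken from the measurements in this presentation. It also assumes stall cycles from different sources do not overlap, which an out-of-order processor does not guarantee.

```python
def cpi_breakdown(cycles, instructions, stall_cycles):
    """Split overall CPI into a computation part and per-source stall parts.

    cycles, instructions: totals over the measurement interval.
    stall_cycles: dict mapping a stall source name to its cycle count.
    """
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    total_cpi = cycles / instructions
    parts = {name: count / instructions for name, count in stall_cycles.items()}
    # Whatever is not attributed to a stall source is treated as computation.
    parts["computation"] = total_cpi - sum(parts.values())
    return total_cpi, parts


# Made-up counter values, for illustration only (not measured results).
total, parts = cpi_breakdown(
    cycles=3_000_000,
    instructions=1_000_000,
    stall_cycles={"instruction_fetch": 900_000, "resource": 600_000, "branch": 300_000},
)
```

The components in `parts` sum to `total` by construction, so the breakdown shows where cycles go but cannot by itself reveal overlapped stalls.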

11
Methodology Experimental Platform

A commodity four-processor Intel-based SMP
server running Windows NT is chosen.

12
(No Transcript)
13
I/O System Configurations (OLTP)
14
I/O System Configurations (DSS)
15
Software Architecture (OLTP)

Transaction Processing Council's TPC-C benchmark

16
Software Architecture (OLTP)
17
Software Architecture (DSS)

Transaction Processing Council's TPC-D benchmark
Models the activity of a wholesale supplier doing
complex business analysis.
Analyses: pricing and promotions, market share
study, shipping management, supply and demand
management, profit and revenue management, and
customer satisfaction study.
17 read-only queries and 2 update queries.

18
Software Architecture (DSS)
19
Pentium Pro Processor Architecture
20
Potential sources of stalls

misses to the L1 instruction cache
a branch misprediction
the instruction mix of the workload
the out-of-order execution engine

21
Measurement Methodology

Windows NT Performance Monitor
Pentium Pro hardware counters
An Intel tool called emon

22
Analysis of OLTP Workloads

OLTP performs short, moderately complex transactions.
Small, random I/O operations.
A large number of concurrent users and a high degree
of multiprogramming.
The database implements locking and logging.
The combination of these tasks results in:
A large instruction working set
A larger data footprint

23
Experimental Results CPI
24
Experimental Results Memory System Behavior

How do OLTP cache miss rates vary with L2 cache
size?
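
To make the question concrete, the following sketch estimates the miss rate at several cache sizes by replaying one reference trace through a simulated cache. It is an assumption-laden toy: the cache is fully associative with LRU replacement (real L2 caches are set associative), the trace is synthetic, and the function name `lru_miss_rate` and all sizes are invented for illustration.

```python
import random
from collections import OrderedDict


def lru_miss_rate(trace, capacity_lines):
    """Miss rate of a fully associative LRU cache holding capacity_lines lines."""
    cache = OrderedDict()
    misses = 0
    for line in trace:
        if line in cache:
            cache.move_to_end(line)  # mark as most recently used
        else:
            misses += 1
            cache[line] = True
            if len(cache) > capacity_lines:
                cache.popitem(last=False)  # evict the least recently used line
    return misses / len(trace)


# Synthetic trace (an assumption, not a database trace): 90% of references
# go to a small hot set, the rest to a large cold set.
rng = random.Random(0)
trace = [
    rng.randrange(256) if rng.random() < 0.9 else 256 + rng.randrange(100_000)
    for _ in range(50_000)
]

# Miss rate for each simulated cache size, measured in cache lines.
miss_rates = {size: lru_miss_rate(trace, size) for size in (64, 256, 1024, 4096)}
```

With LRU and full associativity, a larger cache always holds a superset of what a smaller one holds, so the miss rate cannot rise as the size grows. How quickly it falls depends on the working set of the trace.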

25
Experimental Results Memory System

What effects do larger caches have on OLTP
throughput and stall cycles?

26

Experimental Results Processor Issues
How useful is superscalar issue and retire for
OLTP?
27
Experimental Results Processor Issues

How effective is branch prediction for OLTP?
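
For readers unfamiliar with how prediction accuracy is obtained, here is a minimal sketch of a classic 2-bit saturating counter predictor applied to a sequence of branch outcomes. This is a textbook scheme chosen for brevity, not the Pentium Pro's own prediction mechanism, and the function name and the two sample sequences are hypothetical.

```python
def two_bit_accuracy(outcomes, initial_state=1):
    """Fraction of branches predicted correctly by one 2-bit saturating counter.

    outcomes: iterable of booleans, True meaning the branch was taken.
    States 0-1 predict not taken, states 2-3 predict taken.
    """
    state = initial_state
    correct = 0
    total = 0
    for taken in outcomes:
        prediction = state >= 2
        correct += prediction == taken
        total += 1
        # Move toward 3 on taken, toward 0 on not taken, saturating at both ends.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return correct / total if total else 0.0


# A loop branch taken 9 times and then falling through, repeated 100 times.
loop_branch = ([True] * 9 + [False]) * 100
# A branch that alternates on every execution.
alternating = [True, False] * 500
```

The loop branch is predicted well because only the loop exit surprises the counter, while the alternating branch started from state 1 is mispredicted every time. Data-dependent branches in database code tend to fall somewhere between these two extremes.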

28
Experimental Results Processor Issues

Is out-of-order execution successful at hiding
stalls for OLTP?

29
Experimental Results Multiprocessor Scaling
Issues

How well does OLTP performance scale as the
number of processors increases?
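
Scaling is usually summarized as speedup and efficiency relative to the one-processor run. The sketch below shows that arithmetic. The function name and the throughput figures are hypothetical placeholders, not results from this study.

```python
def scaling_report(throughput_by_cpus):
    """Return {cpus: (speedup, efficiency)} relative to the one-processor run."""
    base = throughput_by_cpus[1]
    report = {}
    for cpus, throughput in sorted(throughput_by_cpus.items()):
        speedup = throughput / base
        # Efficiency of 1.0 would mean perfectly linear scaling.
        report[cpus] = (speedup, speedup / cpus)
    return report


# Hypothetical throughput figures (transactions per minute), not measured results.
example = scaling_report({1: 1000.0, 2: 1900.0, 4: 3400.0})
```

An efficiency below 1.0 at higher processor counts points to shared-resource contention, such as the memory system, as a limit on scaling.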

30
Experimental Results Multiprocessor Scaling
Issues

How do OLTP CPI components change as the number
of processors is scaled?

31
Experimental Results Multiprocessor Scaling
Issues

How prevalent are cache misses to dirty data in
other processors' caches for OLTP?

32
Experimental Results Multiprocessor Scaling
Issues

Is the four-state (MESI) invalidation-based cache
coherence protocol worthwhile for OLTP?
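
The difference between the two protocols can be sketched by counting bus transactions for a single cache line. In MESI, a line read while no other cache holds it enters the Exclusive state and can later be written without a bus transaction. In three-state MSI, the same write needs an upgrade on the bus. The simulator below is a simplified illustration under those assumptions: the function name and the traces are hypothetical, and evictions and write-back traffic are ignored.

```python
def count_bus_transactions(trace, num_cpus, use_exclusive_state):
    """Count bus transactions for accesses to ONE cache line.

    trace: list of (cpu, op) pairs with op "R" or "W".
    use_exclusive_state: True models MESI, False models three-state MSI.
    """
    state = ["I"] * num_cpus
    bus = 0
    for cpu, op in trace:
        others = [c for c in range(num_cpus) if c != cpu]
        if op == "R":
            if state[cpu] == "I":
                bus += 1  # a read miss goes to the bus
                shared = any(state[c] != "I" for c in others)
                for c in others:
                    if state[c] in ("M", "E"):
                        state[c] = "S"  # previous owner now shares the line
                state[cpu] = "S" if shared or not use_exclusive_state else "E"
        else:  # write
            if state[cpu] in ("I", "S"):
                bus += 1  # read-for-ownership or upgrade invalidates other copies
                for c in others:
                    state[c] = "I"
            # A write hit in E or M needs no bus transaction.
            state[cpu] = "M"
    return bus


# Private data: CPU 0 reads a line nobody else holds, then writes it.
private = [(0, "R"), (0, "W")]
# Shared data: CPU 1 reads first, so CPU 0's copy starts in S under both protocols.
shared = [(1, "R"), (0, "R"), (0, "W")]
```

On the `private` trace MESI saves one bus transaction compared with MSI, while on the `shared` trace the two protocols behave identically. Whether the Exclusive state pays off therefore depends on how often a workload reads private lines and then writes them.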

33
Experimental Results Multiprocessor Scaling
Issues

How does OLTP memory system performance scale
with increasing cache sizes and increasing
processor count?

34
Analysis of Decision Support Workloads

DSS queries are typically long-running and
moderately to very complex.
They scan large portions of the database in a
read-mostly fashion.
Large sequential disk I/O read operations.
The multiprogramming level in DSS systems is
typically lower than that of OLTP systems.

35
DSS Workload
36

Experimental Results Memory System Behavior

How do DSS cache miss rates vary with L2 cache
size?

37
Experimental Results Memory System Behavior

What impact do larger L2 caches have on DSS
database performance and stall cycles?

38
Experimental Results Memory System Behavior

How prevalent are cache misses to dirty data in
other processors' caches in DSS?

39
Experimental Results Memory System Behavior

Is the four-state (MESI) invalidation-based cache
coherence protocol worthwhile for DSS?

40
Experimental Results Memory System Behavior

How does DSS memory system performance scale with
increasing cache sizes?

41
Experimental Results Processor Issues

How useful is superscalar issue and retire for
DSS?

DSS behaves like OLTP in this respect.
42
Experimental Results Processor Issues

How effective is branch prediction for DSS?

43
Experimental Results Processor Issues

Is out-of-order execution successful at hiding
stalls for DSS?

44
Conclusions for OLTP

Out-of-order execution is only somewhat effective
for this database workload.
Increased superscalar width for the out-of-order
engine may be helpful.
Innovation is needed in branch prediction algorithms
and hardware structures to better support
database workloads.
Caches are effective at reducing the processor
traffic to memory.
A three-state (MSI) cache coherence protocol would
be better.
The amount of time when the memory system is
unavailable decreases with larger caches and
increases with the number of processors.

45
Conclusions for DSS

Out-of-order execution provides potentially more
benefit for DSS than for OLTP.
DSS performance is less sensitive to L2 cache
size than OLTP performance.
Existing branch prediction schemes are more
effective for this workload.
Increasing the micro-operation retire width in
the Pentium Pro's out-of-order RISC core may
provide performance improvements.
Dirty misses are less prevalent for DSS than
for OLTP.