1
Implementation of TPC-H and TPC-C toolkits
In Class Presentation
  • Kun Gao
  • Ippokratis Pandis

December 7th, 2005
15-823 Hot Topics in DB Systems
2
Introduction 1
  • This paper is NOT
  • An ideas paper
  • A survey
  • An improvement on an existing algorithm or technique
  • This paper presents
  • A required implementation

3
Introduction 2
  • DBs are almost ubiquitous
  • The performance of a DB depends on many factors
  • Standardized DB workloads needed by
  • DB Systems Researchers
  • Networking
  • Operating Systems
  • Storage Systems
  • Commercial Vendors
  • DBMS companies
  • End users
  • You name it

4
Introduction 3
  • However, no open-source, publicly available
    toolkits for
  • TPC-H
  • TPC-C
  • Systems people need TPC-H/TPC-C toolkits that
  • Are easily deployable
  • Support many commercial DBMSs
  • Are open source
  • Are easy to tune
  • Require no Ph.D. in DBs to use

5
Outline
  • Motivation
  • TPC-H
  • TPC-C
  • Results
  • Conclusions

6
TPC-H Description
  • TPC-H is a DSS benchmark
  • Metric
  • TPC-H Composite Query-per-Hour Performance
    (QphH@Size)
  • Price/Performance metric ($/QphH@Size)
  • Consists of 21 Queries
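
In the TPC-H specification, QphH@Size is the geometric mean of the Power@Size and Throughput@Size results, and the price/performance metric divides the total system price by QphH@Size. A minimal sketch follows; the function names and input numbers are illustrative assumptions, not measurements from this presentation.

```python
import math


def qphh(power_at_size: float, throughput_at_size: float) -> float:
    """Composite metric: geometric mean of the power and throughput results."""
    return math.sqrt(power_at_size * throughput_at_size)


def price_performance(total_system_price: float, qphh_at_size: float) -> float:
    """Price/performance: system price in dollars per QphH@Size."""
    return total_system_price / qphh_at_size


if __name__ == "__main__":
    # Made-up inputs, used only to show the arithmetic.
    composite = qphh(100.0, 400.0)
    print(composite, price_performance(10_000.0, composite))
```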

7
TPC-H Characteristics
  • The main characteristics of TPC-H queries
  • Have a high level of complexity
  • Many operators and selectivity constraints
  • Use a variety of accesses
  • Have an ad-hoc nature
  • Examine and access a large percentage of the
    populated data and tables
  • Are all different from each other

8
TPC-H Schema
  • Business analysis classes
  • Pricing and promotions
  • Supply and demand management
  • Profit and revenue management
  • Customer satisfaction study
  • Market share study, and
  • Shipping management
  • SF1 → 1GB DB
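
The scale factor (SF) sets the database size. In the TPC-H specification most table cardinalities grow linearly with SF, while nation and region stay fixed. A rough sketch of that scaling is below; the names are illustrative, and the lineitem count is approximate because the specification does not fix it exactly.

```python
# Base cardinalities at SF = 1, per the TPC-H specification.
# lineitem is approximate (about 6 million rows at SF = 1).
BASE_ROWS = {
    "supplier": 10_000,
    "part": 200_000,
    "partsupp": 800_000,
    "customer": 150_000,
    "orders": 1_500_000,
    "lineitem": 6_000_000,
}
FIXED_ROWS = {"nation": 25, "region": 5}


def estimated_rows(scale_factor: int) -> dict:
    """Estimate the row count of every table for a given scale factor."""
    rows = {table: base * scale_factor for table, base in BASE_ROWS.items()}
    rows.update(FIXED_ROWS)  # these two tables do not scale
    return rows


if __name__ == "__main__":
    for table, count in estimated_rows(10).items():
        print(f"{table:10s} {count:>12,d}")
```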

9
TPC-H Implementation
  • Currently supported
  • IBM DB2
  • Oracle
  • QPipe
  • Easy to use
  • Install the DB
  • Populate the DB
  • Run the queries
  • Post-process the results (Some scripts provided)
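
The populate / run / post-process steps can be sketched as follows. This is not the toolkit itself: it uses Python's built-in sqlite3 module as a stand-in DBMS, and the three-column table, the sample rows, and the query are simplified assumptions chosen only to show the shape of the workflow.

```python
import sqlite3
import time


def populate(conn, rows):
    """Create a tiny lineitem-like table and load it (stand-in for DBGEN output)."""
    conn.execute(
        "CREATE TABLE lineitem ("
        "l_returnflag TEXT, l_quantity REAL, l_extendedprice REAL)"
    )
    conn.executemany("INSERT INTO lineitem VALUES (?, ?, ?)", rows)
    conn.commit()


def run_query(conn, sql):
    """Run one query and return (result rows, elapsed seconds)."""
    start = time.perf_counter()
    result = conn.execute(sql).fetchall()
    return result, time.perf_counter() - start


# A simplified aggregate over the hypothetical table above.
QUERY = (
    "SELECT l_returnflag, SUM(l_quantity), SUM(l_extendedprice) "
    "FROM lineitem GROUP BY l_returnflag ORDER BY l_returnflag"
)

conn = sqlite3.connect(":memory:")
populate(conn, [("A", 10.0, 100.0), ("N", 5.0, 50.0), ("A", 2.0, 20.0)])
result, elapsed = run_query(conn, QUERY)

# Post-processing here is just a printed summary line.
print(f"rows={len(result)} elapsed={elapsed:.6f}s")
```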

10
TPC-H Structure
  • <tools> All the scripts are here
  • Important file <tpcd.setup> Tuning Parameters
  • <samples> Setup files for various platforms
  • Set tablespaces/bufferpool/
  • <appendix> - DBGEN/QGEN scripts for TPC-H v1.
  • <appendix.v2> - DBGEN/QGEN script for TPC-H v2.
  • <driver> - Batch programs for running queries
  • <auditexe/acid> - ACID test scripts.
  • <auditexe/ver> - Verification scripts
  • <queries> - Queries template directories

11
Outline
  • Motivation
  • TPC-H
  • TPC-C
  • Results
  • Conclusions

12
TPC-C Description
  • TPC-C is an OLTP benchmark
  • High volume of read-only and update-intensive
    transactions
  • The main characteristics of TPC-C
  • Simultaneous multiple transactions of all types
  • Multiple on-line terminal sessions
  • Significant need for efficient disk I/O
  • Required ACID
  • Non-uniform distribution of data access through
    primary and secondary keys

13
TPC-C Description
  • TPC-C consists of a mix of simple operations
  • New Order (45%)
  • Customer makes purchases
  • Dominant TPC-C operation
  • Payment (45%)
  • Updates customer payment records
  • Delivery, Order Status, Stock-Level (10%)
  • Misc. bookkeeping, etc.
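
A driver can reproduce this mix by drawing each transaction type with the listed weights. The sketch below uses the rounded percentages from this slide; splitting the last 10% evenly across Delivery, Order Status, and Stock-Level is an assumption made for illustration, and all names are hypothetical.

```python
import random
from collections import Counter

# Rounded weights taken from the slide; "Other" groups the remaining three types.
MIX = {"New Order": 45, "Payment": 45, "Other": 10}
OTHER = ["Delivery", "Order Status", "Stock-Level"]


def next_transaction(rng: random.Random) -> str:
    """Pick the next transaction type according to the weighted mix."""
    kind = rng.choices(list(MIX), weights=list(MIX.values()), k=1)[0]
    if kind == "Other":
        # Assumption: the three minor types are equally likely.
        return rng.choice(OTHER)
    return kind


def sample_mix(n: int, seed: int = 0) -> Counter:
    """Draw n transactions and count how often each type was chosen."""
    rng = random.Random(seed)
    return Counter(next_transaction(rng) for _ in range(n))


if __name__ == "__main__":
    for name, count in sample_mix(10_000).most_common():
        print(f"{name:13s} {count}")
```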

14
TPC-C Schema
  • Performance metric: tpmC → transactions per minute
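
In the TPC-C specification, tpmC counts completed New Order transactions per minute of measured run time. A small sketch of the arithmetic, with hypothetical names and made-up numbers:

```python
def tpmc(new_orders_completed: int, elapsed_seconds: float) -> float:
    """New Order transactions completed per minute of measurement interval."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return new_orders_completed / (elapsed_seconds / 60.0)


if __name__ == "__main__":
    # Illustrative run: 9000 New Order transactions in a 10-minute interval.
    print(tpmc(9000, 600.0))
```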

15
TPC-C Implementation
  • Currently supported
  • DB2
  • Oracle
  • QPipe
  • Implemented as Stored Procedures
  • Usage/Structure similar to TPC-H

16
Outline
  • Motivation
  • TPC-H
  • TPC-C
  • Results
  • Conclusions

17
Setup - Hardware
  • We used two machines
  • Enceladus
  • SMP-processor
  • Two Intel P4 Xeon Processors
  • 4GB RAM
  • 8K Trace Cache, 20K L1D, 512KB L2 per processor
  • Crete
  • Multi-core IBM Power5
  • 2 dies, 2 cores per die, 2 threads per core = 8
    procs
  • 8GB RAM
  • 3 Levels of Caches
  • 64K L1I, 32K L1D, 2MB L2 per processor, 36MB L3

18
Setup - Software
  • We used a commercial DB (IBM DB2)
  • Enceladus
  • Ver. 8.1.0
  • 600MB bufferpool
  • Crete
  • Ver. 8.2.0
  • 800MB bufferpool
  • Also installed Oracle/QPipe, but had no time for
    tests

19
SMP- vs. Multi-
20
MultiCore CPI Per Query
21
MultiCore Memory Misses
22
Outline
  • Motivation
  • TPC-H
  • TPC-C
  • Results
  • Conclusions

23
Conclusions
  • Need for TPC-H/TPC-C toolkits that are
  • Publicly available/Open source
  • User friendly
  • Easy to use
  • Intuitive
  • Usable without expertise in DBs
  • Implemented these toolkits
  • Presented some preliminary results
  • Intuition: DBs do not scale well to MultiCore

24
Future Work
  • Add support for more DBs
  • Run a large-scale study
  • Post-processing
  • CPI breakdowns
  • More platforms/OSs

25
Thank you!!
Any Questions??
More info
http://www.cs.cmu.edu/~ipandis/tpc/
kgao@cs.cmu.edu, ipandis@cs.cmu.edu