Title: The Impact of Disk I/O on Multiprocessor Checkpoint/Recovery
1. The Impact of Disk I/O on Multiprocessor Checkpoint/Recovery
- CS 736 Project Presentation
- Friday, December 13, 2002
- Jarrod Lewis
- Lixin Su
2. Overview: Motivation
- SafetyNet checkpoint/recovery mechanism
- Fault-tolerant computing
- Maintains consistent checkpoints of system state
- Currently a hardware-only availability solution
- Output Commit Problem
- Takes time to validate a checkpoint
- Only validated, fault-free data can be communicated outside the sphere of recovery
- Can't roll back the outside world!
- I/O writes cannot be committed in the normal way
3. Overview: Approach
- Addressing the output commit problem for disk I/O
- Standard solution is to delay/buffer writes
- Performance impact?
- Buffering mechanism?
- Evaluate with commercial workloads
- Hardware traces
- IBM RS/6000 server (AIX 4.3.3)
- Simulation
- Simics full system multiprocessor simulator
- DiskSim disk simulator
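The "delay/buffer writes" solution above can be illustrated with a minimal sketch. It is not the SafetyNet implementation: the class name, the tagging of each write with a checkpoint number, and the dictionary standing in for the disk are all assumptions made for illustration. Writes are held until the checkpoint interval that issued them is validated, and are discarded if the system recovers to an earlier checkpoint.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class PendingWrite:
    checkpoint: int  # checkpoint interval in which the write was issued (assumed tagging)
    block: int       # target disk block (hypothetical addressing)
    data: bytes


class OutputCommitBuffer:
    """Holds disk writes until the checkpoint that issued them is validated."""

    def __init__(self):
        self.pending = deque()  # buffered writes, in issue order
        self.disk = {}          # stand-in for the real disk: block -> data

    def write(self, checkpoint, block, data):
        # Buffer instead of committing: the data is not yet known to be fault-free.
        self.pending.append(PendingWrite(checkpoint, block, data))

    def validate(self, checkpoint):
        # Writes issued at or before the validated checkpoint may now
        # leave the sphere of recovery. Returns the number released.
        released = 0
        while self.pending and self.pending[0].checkpoint <= checkpoint:
            w = self.pending.popleft()
            self.disk[w.block] = w.data
            released += 1
        return released

    def recover(self, checkpoint):
        # Roll back to `checkpoint`: drop writes issued after it, since
        # re-execution will issue them again. Returns the number dropped.
        before = len(self.pending)
        self.pending = deque(w for w in self.pending if w.checkpoint <= checkpoint)
        return before - len(self.pending)
```

In this sketch nothing reaches the disk until `validate` is called, so a rollback never has to undo a committed write. The cost is that each write's completion is delayed by the validation time.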
4. Overview: Summary
- Performance impact of disk write latency
- Increased time to disk write completion (0.04 ms → 4 ms)
- Can have significant impact on performance!
- Highly dependent on I/O characteristics
- Up to 5.8x slowdown for 4 ms delay in TPC-C
- Delaying/buffering mechanism
- Implemented buffer at disk controller
- Modest buffer size (10s of KB) needed to support buffering
- Distance (time) between disk writes is large (10s of ms)
5. Outline
- Overview
- Motivation and Approach
- Performance Impact
- Buffering Mechanism
- Conclusions
6. Motivation
- SafetyNet checkpoint/recovery mechanism
- Globally consistent checkpoints
- Processors, memories, coherence permissions
- Recovers to a pre-fault checkpoint and re-executes
- Checkpoint validation
- Determines which checkpoint is recovery point
- Determines when to interact with I/O devices
- Output Commit Problem
- Increasing checkpoint validation time
- (+) Reduces logging overhead (overwrites)
- (+) Tolerates longer-latency faults
- (-) Longer output commit delays
7. Approach
- Standard solution: delay I/O writes
- How does this affect performance? (Jarrod)
- Evaluate in Simics full system simulator
- Intercept and delay timing of disk writes
- Evaluate microbenchmark, commercial workloads
- Is delaying feasible? (Lixin)
- Collect disk traces on IBM RS/6000 server
- Evaluate traces with DiskSim simulator
- I/O characteristics
- Data buffering requirements
9. Experiment I: Performance Impact
- Simics multiprocessor simulator
- Functional simulator only
- Assumes each instruction takes 1 cycle to execute
- I/O can have different timing
- Access to source for devices (DMA, SCSI, etc)
- Intercepting disk writes
- Add fixed delay to each write
- Delays disk content update, processor notification
- Observe impact on execution time
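To see why a fixed per-write delay can matter so much, consider a deliberately simple model. It assumes every write is synchronous (the issuer waits for completion) and that computation and I/O do not overlap. The function names and the example workload numbers are hypothetical; only the 0.04 ms baseline and 4 ms delayed completion times come from the summary slide.

```python
def execution_time_ms(compute_ms, n_sync_writes, write_latency_ms, added_delay_ms=0.0):
    """Execution time if the issuer waits for every write to complete."""
    return compute_ms + n_sync_writes * (write_latency_ms + added_delay_ms)


def slowdown(compute_ms, n_sync_writes, write_latency_ms, added_delay_ms):
    """Ratio of delayed to undelayed execution time under the same model."""
    base = execution_time_ms(compute_ms, n_sync_writes, write_latency_ms)
    delayed = execution_time_ms(compute_ms, n_sync_writes, write_latency_ms, added_delay_ms)
    return delayed / base


if __name__ == "__main__":
    # Hypothetical workload: 100 ms of computation and 50 synchronous writes.
    # Completion time per write grows from 0.04 ms to 4 ms (3.96 ms added).
    print(f"slowdown: {slowdown(100.0, 50, 0.04, 3.96):.2f}x")
```

Under this model the slowdown grows with the number of writes the workload must wait on, which is consistent with the observation that the impact depends strongly on I/O characteristics. A workload that overlaps writes with computation would be less affected than this model suggests.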
10-16. Simics/sun4u System Overview
1. Issue DMA read (disk write)
2. DMA controller issues request to SCSI controller
NOTE: This is the point where a disk write will be delayed
3. Transfer data from RAM onto disk
4. SCSI controller notifies DMA controller when write is complete
5. DMA controller interrupts the processor to notify it that the write is done
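The five steps above, with the delay injected at step 2, can be traced with a short sketch. It is not Simics device code: the function name and its parameters are hypothetical, and it assumes that only the injected delay and the data transfer take time, with every other step instantaneous.

```python
def disk_write_timeline(issue_ms, transfer_ms, added_delay_ms=0.0):
    """Return (time_ms, event) pairs for one disk write, in order."""
    t = issue_ms
    events = [
        (t, "1: DMA read issued (disk write)"),
        (t, "2: DMA controller issues request to SCSI controller"),
    ]
    t += added_delay_ms  # interception point: the write is held here
    t += transfer_ms     # data moves from RAM onto disk
    events.append((t, "3: data transferred from RAM onto disk"))
    events.append((t, "4: SCSI controller notifies DMA controller"))
    events.append((t, "5: DMA controller interrupts the processor"))
    return events


if __name__ == "__main__":
    for time_ms, event in disk_write_timeline(0.0, 0.04, added_delay_ms=4.0):
        print(f"{time_ms:6.2f} ms  {event}")
```

Because the delay sits before step 3, it postpones both the disk content update and the processor notification, matching the behavior described for the experiment.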
17. Performance Impact of Delayed Disk Writes
18. Rate of Writing Data to Disks
20. Trace Collection
- Commercial Workloads
- multi-user, multi-tier, multi-threaded, multi-client
- SPECmail2001, SPECweb99, TPC-b
- IBM RS/6000 server
- running AIX 4.3.3
21. Workloads: SPECmail
- Benchmarking mail server performance
- Write intensive, small writes
- Running Configuration
- 5 machines, 200 users, running for one and a half days
22. SPECweb and TPC-b
- SPECweb99
- benchmarking HTTP server
- I/O intensive
- dynamic GET, POST, etc.
- TPC-b
- benchmarking online banking database
- Not I/O intensive, but the first one I got running
- Multiple banks, user accounts, threads
23. DiskSim 2.0
- Disk simulator from CMU
- Includes device drivers, buses, controllers, etc.
- Request queuing, block caching, etc
- Implemented a write buffer at controller level
- Important simulation parameters
- 11 ms seek time, request scheduling/collapsing, block caching, etc.
- Compared with IBM UltraStar 2 disk series
24. Write Interval Analysis
- Average Write Interval
- SPECmail: 13 ms, SPECweb: 151 ms, TPC-b: 3511 ms
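These intervals give a rough feel for buffer occupancy. By Little's law, the average number of writes held in a buffer equals the arrival rate (one write per average interval) times the time each write is held. The sketch below applies this to the measured intervals. The 4 ms hold time is the delay quoted in the summary slide, the 8 KB write size is a hypothetical value, and the function names are illustrative. Averages hide bursts, so this is a lower bound on what a real buffer must absorb.

```python
# Average write intervals from the trace analysis, in milliseconds.
AVG_WRITE_INTERVAL_MS = {"SPECmail": 13.0, "SPECweb": 151.0, "TPC-b": 3511.0}


def avg_buffered_writes(delay_ms, interval_ms):
    """Little's law: arrival rate (1 / interval) times residence time (delay)."""
    return delay_ms / interval_ms


def avg_buffered_bytes(delay_ms, interval_ms, write_bytes):
    """Average buffer occupancy in bytes for a fixed (assumed) write size."""
    return avg_buffered_writes(delay_ms, interval_ms) * write_bytes


if __name__ == "__main__":
    DELAY_MS = 4.0      # commit delay, taken from the summary slide
    WRITE_BYTES = 8192  # hypothetical write size
    for name, interval in AVG_WRITE_INTERVAL_MS.items():
        n = avg_buffered_writes(DELAY_MS, interval)
        b = avg_buffered_bytes(DELAY_MS, interval, WRITE_BYTES)
        print(f"{name:8s} {n:.4f} writes, {b:.0f} bytes on average")
```

Even for SPECmail, the most write-intensive trace, fewer than one write is outstanding on average under these assumptions.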
25. Buffer Size Sensitivity
- Factors that affect TTF (time to fill)
- I/O write intensity
- Write buffer size
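How the two factors combine can be shown with a back-of-the-envelope estimate of TTF. The sketch assumes an initially empty buffer, fixed-size writes arriving at a constant interval, and no draining while the buffer fills. The function name and the example buffer and write sizes are hypothetical; the 13 ms interval is the SPECmail average from the trace analysis.

```python
def time_to_fill_ms(buffer_bytes, write_bytes, write_interval_ms):
    """Time until an initially empty buffer can accept no further write.

    Assumes fixed-size writes at a constant interval and no draining.
    """
    writes_that_fit = buffer_bytes // write_bytes
    return writes_that_fit * write_interval_ms


if __name__ == "__main__":
    # Hypothetical 32 KB buffer and 4 KB writes, one write every 13 ms.
    print(time_to_fill_ms(32 * 1024, 4 * 1024, 13.0), "ms")
```

Under these assumptions TTF scales linearly with buffer size and with the write interval, so a less write-intensive workload or a larger buffer both lengthen the time before the buffer fills.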
27. Conclusions
- Delaying disk writes does affect performance
- Performance degrades rapidly for larger delays
- Multiprocessor system (multiple disks) more sensitive
- Practical to implement a buffer at disk controller level
- Must be SafetyNet-aware
- For the current constraints of SafetyNet, only a modest amount of buffer space is needed
28. Future Work
- Buffering mechanism at SafetyNet
- Buffer size and hardware complexity
- Mechanisms of I/O interception
- Develop solutions other than just delaying I/O
- e.g., logging