1
Hippodrome: Automatic Global Storage Adaptation
Execute Application
Migrate to Configuration
Analyze Workload
Design New Configuration
  • Eric Anderson, Mustafa Uysal, Michael Hobbs,
    Guillermo Alvarez, Mahesh Kallahalla, Kim Keeton,
    Arif Merchant, Erik Riedel, Susan Spence, Ram
    Swaminathan, Simon Towers, Alistair Veitch, John
    Wilkes (HP Labs Storage Systems Program)

2
Hippodrome: Why?
  • Computer systems very complex
  • System administrators very expensive
  • Let the computer handle it
  • Optimize the system for the workload as it
    changes
  • Determine when to add/remove hardware
  • Two parts to the talk
  • Description of a framework for managing a large I/O-centric system
  • Experimental results showing when it works and when it doesn't

3
Hippodrome: Lessons
  • Global system adaptation is possible using the four parts of the loop
  • Solver: finds a new "optimal" configuration
  • Models: predict the performance of a configuration
  • Analysis: generates a summary of a workload
  • Migration: moves the current configuration to the new one
  • "Goodness" dependent on accuracy of models
  • Rate of adaptation dependent on "over-commit"
    available in the system
  • A gradually increasing workload can always be
    "good" if enough headroom exists

4
Hippodrome: Our System
  • Targeted at applications running on large storage
    systems
  • Solver chooses an appropriate configuration for the array and a mapping of application-level storage units onto the array
  • Experiments use synthetic applications for ease
    of understanding "good" behaviour
  • Applications run on an N-class server and access
    an HP FC-60 disk array via switched fibre channel

5
Hippodrome: Four Parts Needed for Adaptation
Execute Application
Migrate to Configuration
Analyze Workload
Design New Configuration
  • Analysis: generates a summary of a workload
  • Models: predict the performance of a configuration
  • Solver: finds a new "optimal" configuration
  • Migration: moves the current configuration to the new one
  • Solver and Models are both part of the "Design New Configuration" step

6
Hippodrome: Analysis, Models, Solver, Migration
  • Trace the I/Os generated and run them through analysis tools to create a "workload" file
  • Two parts are generated from analysis
  • "Stores": a logically contiguous, fixed-size block of storage, usually implemented as a logical volume
  • "Streams": an access pattern to a particular store, currently defined by average request rate, average request size, run count, on/off time, and overlap fraction
  • In our experiments, some additional per-stream values are also calculated to ease understanding the behaviour of the system
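As an illustration, the simplest per-stream values could be computed from a raw trace like this (a simplified sketch: the real analysis also derives run count, on/off time, and overlap fraction):

```python
from dataclasses import dataclass

@dataclass
class StreamSummary:
    avg_request_rate: float   # requests per second
    avg_request_size: float   # bytes per request

def summarize(request_sizes, duration_s):
    """Reduce a raw I/O trace (one entry per request, in bytes)
    to the two simplest per-stream values the analysis emits."""
    n = len(request_sizes)
    return StreamSummary(
        avg_request_rate=n / duration_s,
        avg_request_size=sum(request_sizes) / n if n else 0.0,
    )
```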

7
Hippodrome: Analysis, Models, Solver, Migration
  • Two inputs to the models
  • Device configuration: Logical Units (LUNs) with disk type, number of disks, RAID level, and stripe size; the array controller associated with each LUN
  • Workload configuration: the list of stores on each LUN, and therefore the streams accessing that LUN and using the associated controller
  • Output is the utilization of each component (disk, controller, SCSI bus, etc.)
  • In our experiments, the models are calibrated to a 6-disk R5 LUN for 4 KB and 256 KB random I/Os at an accuracy above 98%, as the general models are still being developed
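A toy version of such a model, with per-LUN capacity taken from the 625 random 4 KB reads/s figure quoted later in the talk (the per-controller capacity here is an assumed illustrative number, not a measured one):

```python
LUN_IOPS = 625          # calibrated: 6-disk R5 LUN, random 4 KB reads
CONTROLLER_IOPS = 2500  # assumed illustrative controller capacity

def predict_utilizations(placement):
    """placement maps a LUN id to (controller id, [stream rates in req/s]).
    Returns predicted utilization (1.0 == 100%) per LUN and per controller."""
    lun_util, ctrl_load = {}, {}
    for lun, (ctrl, rates) in placement.items():
        load = sum(rates)                       # aggregate rate on this LUN
        lun_util[lun] = load / LUN_IOPS
        ctrl_load[ctrl] = ctrl_load.get(ctrl, 0.0) + load
    ctrl_util = {c: load / CONTROLLER_IOPS for c, load in ctrl_load.items()}
    return lun_util, ctrl_util
```

A configuration would be rejected when any returned utilization reaches 1.0.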

8
Hippodrome: Analysis, Models, Solver, Migration
  • Two inputs to the solver
  • The workload (streams and stores)
  • A description of "valid" configurations (what devices to use, what RAID levels to use, etc.)
  • Output of the solver is a configuration
  • Array descriptions (LUNs, disks, controllers, etc.)
  • The mapping of stores onto LUNs
  • The solver uses the models to predict whether a configuration is valid (i.e. no component is over 100% utilized)
  • In our experiments, the solver is pinned to 6-disk R5 LUNs to match the models and to eliminate the need to migrate between RAID types
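The real solver performs sophisticated optimization; as a sketch of the idea, here is a first-fit-decreasing assignment that uses a single-resource model as the validity check (the 625 req/s figure is from the talk; the packing heuristic is illustrative, not Hippodrome's actual algorithm):

```python
LUN_IOPS = 625  # calibrated 6-disk R5 LUN capacity (random 4 KB reads)

def solve(stream_rates, headroom=0.0):
    """Assign streams (given by request rate in req/s) to 6-disk R5 LUNs
    with first-fit decreasing, rejecting any placement the model predicts
    would push a LUN past 100% utilization minus headroom.
    Returns the aggregate rate placed on each LUN."""
    limit = LUN_IOPS * (1.0 - headroom)
    luns = []                                  # aggregate rate per LUN
    for rate in sorted(stream_rates, reverse=True):
        for i, load in enumerate(luns):
            if load + rate <= limit:           # model says this LUN still fits
                luns[i] += rate
                break
        else:
            luns.append(rate)                  # open a new LUN
    return luns
```

Four 500 req/s streams need four LUNs (no pair fits under 625), while three 300 req/s streams fit on two; adding headroom shrinks what counts as "valid".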

9
Hippodrome: Analysis, Models, Solver, Migration
  • Takes as input the new "desired" configuration
  • Migrates the system to the new configuration, preserving the data and access to the data during the migration
  • In our experiments, the synthetic application does not care about the data, so we simply destroy the old configuration and create the new one to perform a "migration"

10
Hippodrome: Experimental Overview
Execute Application
Migrate to Configuration
Analyze Workload
Design New Configuration
  • Each experiment is a series of iterations around the loop; each iteration is called a "step"
  • Each step provides three values
  • Deviation from target rate ("goodness" metric 1)
  • Average I/O response time ("goodness" metric 2)
  • Number of LUNs used
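The two "goodness" metrics can be computed per step roughly as follows (a sketch; how the measurements are aggregated in the actual experiments may differ):

```python
def goodness(target_rate, achieved_rate, response_times_s):
    """Per-step metrics: deviation from the target request rate
    (req/s; 0 means the step hit its target) and the average I/O
    response time in seconds."""
    deviation = target_rate - achieved_rate
    avg_response = sum(response_times_s) / len(response_times_s)
    return deviation, avg_response
```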

11
Experiment Grouping
  • Multiple variants of each "application"
  • constant-1: streams always on, I/O rate constant
  • constant-2: stream groups anti-correlated, I/O rate constant when active
  • scaling-1: one store running as fast as possible
  • scaling-2: like constant-1, but streams are enabled in different steps; once enabled, a stream stays on
  • scaling-3: like constant-1, but stream I/O rate increases as the step number increases
  • All experiments show global adaptation is possible

12
Hippodrome: Experiments Demonstrate Lessons
  • "Goodness" dependent on accuracy of models
  • constant-1, constant-2: we show how to "break" the loop
  • Rate of adaptation dependent on "over-commit" available in the system
  • constant-1, constant-2: we show how fast the system converges
  • A gradually increasing workload can always be "good" if enough headroom exists
  • scaling-2, scaling-3: we show that the application always runs at its target rate

13
Hippodrome: Experimental Hardware/Software
  • Array for experiments is an HP FC-60
  • 2 controllers, 6 trays
  • 1 Ultra SCSI bus per tray (40 MB/s)
  • 4 Seagate 18 GB, 10k RPM disks used per tray (24 total)
  • 4 six-disk R5 LUNs at 16 KB stripe size
  • 1 LUN can do 625 random 4 KB reads/second
  • Host for experiments is an HP N-Class
  • 1 × 440 MHz CPU, 1 GB memory, HP-UX 11.00
  • 2 × 100 MB/s fibre channel cards used
  • Locally developed synthetic application (Buttress)
  • Host and array connected through a Brocade switch

14
Hippodrome: Common Experiment Parameters
  • We vary the number of stores, number of streams, and target request rate
  • Some parameters usually the same
  • Phasing: all streams on at the same time
  • Store capacity: 256 MB
  • Max. I/Os outstanding per stream: 4
  • Headroom: 0%
  • Some parameters constant for all experiments
  • Request type: 4 KB read
  • Request offset: uniformly random across the store, aligned to a 1 KB boundary
  • Run count: 1 (no sequentiality in requests)
  • Arrival process: open, Poisson
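Those common parameters translate into a concrete request generator. A sketch, with a plain Python PRNG standing in for the Buttress application:

```python
import random

STORE_BYTES = 256 << 20   # 256 MB store capacity
REQ_BYTES = 4096          # 4 KB reads

def gen_requests(rate_hz, duration_s, seed=0):
    """Open Poisson arrival process of 4 KB reads, offsets uniformly
    random across the store and aligned to a 1 KB boundary."""
    rng = random.Random(seed)
    t, requests = 0.0, []
    while True:
        t += rng.expovariate(rate_hz)   # exponential gaps => Poisson arrivals
        if t >= duration_s:
            break
        offset = rng.randrange(0, STORE_BYTES - REQ_BYTES + 1, 1024)
        requests.append((t, offset, REQ_BYTES))
    return requests
```

Because the process is open, arrivals keep coming regardless of how fast the device completes earlier requests, which is what lets an under-provisioned configuration saturate.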

15
Hippodrome: Constant-1 Experiments
  • Important result is the shape of the graphs
  • Deviation from target rate converges to 0
  • Response time gets (much) better
  • LUNs used (in the end) match the required request rate
  • Comments
  • Variants 0-3 have a total request rate of 2000 (4 LUNs)
  • Variants 4-6 experiment with filling a LUN to start
  • Variants 5, 6 differ only in the headroom

16
Constant-1: Deviation from Target Rate
  • Variants 0-5 converge to a 95% CI of 0
  • Variant 4 converged even though the LUN was full at the start
  • Variant 5 converged because of the 10% headroom
  • Variant 6 never converges; the models predict the LUN is only 95% utilized

17
Constant-1: Response Time
  • Response times get an order of magnitude better
  • Variant 6 stays at the bad (0.15 second) average
    response time

18
Constant-1: Number of LUNs
  • Lines offset slightly so different variants can be seen
  • Goes up by 1 LUN each step; can't over-commit a device to 200%
  • Variants 4, 5 have a total request rate < 3 × 625, so only use 3 LUNs
  • Variant 6 stays at 1 LUN, as would be predicted by the other results

19
Hippodrome: Constant Workload Review
  • Given a constant workload, the loop converges to
    the "correct" system in most cases
  • "Goodness" dependent on accuracy of models
  • We "break" the loop either through not enough
    headroom or bad models
  • Rate of adaptation dependent on "over-commit"
    available in the system
  • In general, it increases by 1 LUN per iteration
  • With a workload with idle time, it converges
    faster
  • Now look at workloads that change

20
Hippodrome: Scaling-2 Experiments
  • Scaling-2 is intended to simulate adding additional weeks in a data warehouse, additional file systems, etc.
  • We turn on streams as the step number increases
  • Store capacity: 64 MB; max. outstanding: 4
  • Comments
  • Always "correct": the rate of increase is small enough
  • Response time shows the points where we added work
  • LUNs increase as necessary

21
Scaling-2: Deviation from Target Rate
  • Error bars are the same size as before; the scale is much smaller
  • Amazingly, always within the 95% confidence interval of correct
  • Slightly above 0 deviation because of measurement methodology

22
Scaling-2: Response Time
  • Scale is much smaller than for constant workloads (max. of 0.055 s vs. 1 s)
  • Now we can see when we add work and when we remain constant
  • Height of peaks shows how close to 100% utilization the previous step was
  • Slight trend upward: more total I/Os and more capacity actively used

23
Scaling-2: Response Time (Variant 0 only)
  • Now we can see when we add work and when we remain constant
  • Height of peaks shows how close to 100% utilization the previous step was

24
Scaling-2: Number of LUNs
  • Gradual increase in LUNs
  • Exact switch point is dependent on the specific increase pattern
  • Changes are close together as the increase patterns are similar

25
Hippodrome: Scaling Workload Review
  • Handled an order-of-magnitude increase in workload without serious slowdowns
  • Number of LUNs up by a factor of 4
  • Could see the points of additional work in the response time jumping and then settling
  • Question: what other scaling-up patterns are useful?
  • One other group planned: different streams scaling at different rates

26
Hippodrome: Future Work
  • Shifting workloads (transaction processing in the
    day, decision support at night)
  • Cyclic workloads (system is told about the
    different shift positions)
  • More complete models, migration of actual data
  • More complex synthetic workloads
  • Simple "application" (TPC-B?)
  • Complex application (Retail Data Warehouse)
  • Support for global bounds on system size/cost

27
Hippodrome: Four Parts Needed for Adaptation
Execute Application
Migrate to Configuration
Analyze Workload
Design New Configuration
  • Analysis: generates a summary of a workload
  • Models: predict the performance of a configuration
  • Solver: finds a new "optimal" configuration
  • Migration: moves the current configuration to the new one
  • Solver and Models are both part of the "Design New Configuration" step

28
Hippodrome: Lessons
  • Global system adaptation is possible using the four parts of the loop
  • Solver: finds a new "optimal" configuration
  • Models: predict the performance of a configuration
  • Analysis: generates a summary of a workload
  • Migration: moves the current configuration to the new one
  • "Goodness" dependent on accuracy of models
  • Rate of adaptation dependent on "over-commit"
    available in the system
  • A gradually increasing workload can always be
    "good" if enough headroom exists

29
Hippodrome: Automatic Global Storage Adaptation
  • Questions?
  • Joint work with Eric Anderson, Mustafa Uysal,
    Michael Hobbs, Guillermo Alvarez, Mahesh
    Kallahalla, Kim Keeton, Arif Merchant, Erik
    Riedel, Susan Spence, Ram Swaminathan, Simon
    Towers, Alistair Veitch, John Wilkes (HP Labs
    Storage Systems Program)

30
Hippodrome: Constant-2 Experiments
  • Phasing is a very important workload property
  • Divide streams into groups (1..n); group start times are offset, then a constant on/off pattern
  • Max. outstanding per stream: 32; target rate: 40
  • Comments
  • Variant 1 shows faster adaptation because of idle time
  • Variants 4/5 show what happens when the analysis step is wrong
  • Scaling groups proved uninteresting

31
Constant-2: Deviation from Target Rate
  • All except variant 4 converge; 4 appears the same as 5 in the analysis, but the two groups overlap in 4 and are anti-correlated in 5
  • Variant 1 converges faster than the others; this is because the idle time between groups running allows the system to drain requests

32
Constant-2: Response Time
  • Similar results to previous slide
  • Variant 4 does not get to a good response time
  • Variant 1 converges faster than others.

33
Constant-2: Number of LUNs
  • Now we see why variant 1 converges faster: it gets to 4 LUNs in only 2 steps rather than three; this is because of the idle time
  • Otherwise, behaviour is the same as for constant-1, which is to be expected as, in the aggregate, constant-2 is the same as constant-1

34
Hippodrome: Scaling-1 Experiments
  • Scaling-1 is intended to simulate something like a disk copy that runs as fast as the disks will go (it's disk-bound, not CPU-bound)
  • Worked for 3 iterations of the loop (even striped the store across multiple LUNs), then wanted 5 LUNs, which were not available
  • Future work: handling a global bound on the size of the storage system (for example, you can't spend more than $100,000)

35
Hippodrome: Scaling-3 Experiments
  • Scaling-3 is intended to simulate adding work over a constant data set (e.g. more queries to a DB)
  • We increase the target request rate as the step number increases
  • Store capacity: 64 MB; max. outstanding: 4; max. RR: 36
  • Comments
  • Always "correct": the rate of increase is small enough
  • Response time shows the points where we added work
  • LUNs increase as necessary
  • Initial deviations are noise due to the low request rate

36
Scaling-3: Deviation from Target Rate
  • Ignore the graph before about step 4: request rates are too low, and the analysis sees bursts and calculates rates over them
  • Always supports the target request rate

37
Scaling-3: Response Time
  • Variant 1 shows the up-down pattern of a changing then stabilizing workload
  • Always doing pretty well; big drops for variant 0 as the LUN count increased

38
Scaling-3: Number of LUNs
  • Increases gradually; the exact switch-over point is dependent on variant specifics