1
Hippodrome: Automatic Global Storage Adaptation
Execute Application
Migrate to Configuration
Analyze Workload
Design New Configuration
  • Eric Anderson, Mustafa Uysal, Michael Hobbs,
    Guillermo Alvarez, Mahesh Kallahalla, Kim Keeton,
    Arif Merchant, Erik Riedel, Susan Spence, Ram
    Swaminathan, Simon Towers, Alistair Veitch, John
    Wilkes (HP Labs Storage Systems Program)

2
Hippodrome: Why?
  • Computer systems very complex
  • System administrators very expensive
  • Let the computer handle it
  • Optimize the system for the workload as it
    changes
  • Determine when to add/remove hardware
  • Two parts to the talk
  • Description of a framework for managing a large I/O-centric system
  • Experimental results showing when it works and when it doesn't

3
Hippodrome: Lessons
  • Global system adaptation is possible using the four parts of the loop
  • Solver: finds a new "optimal" configuration
  • Models: predict the performance of a configuration
  • Analysis: generates a summary of a workload
  • Migration: moves the current configuration to the new one
  • "Goodness" dependent on accuracy of models
  • Rate of adaptation dependent on "over-commit"
    available in the system
  • A gradually increasing workload can always be
    "good" if enough headroom exists

4
Hippodrome: Our System
  • Targeted at applications running on large storage
    systems
  • Solver chooses an appropriate configuration for the array and a mapping of application-level storage units onto the array
  • Experiments use synthetic applications for ease
    of understanding "good" behaviour
  • Applications run on an N-class server and access
    an HP FC-60 disk array via switched fibre channel

5
Hippodrome: Four Parts Needed for Adaptation
Execute Application
Migrate to Configuration
Analyze Workload
Design New Configuration
  • Analysis: generates a summary of a workload
  • Models: predict the performance of a configuration
  • Solver: finds a new "optimal" configuration
  • Migration: moves the current configuration to the new one
  • Solver and Models are both part of the "Design New Configuration" step

6
Hippodrome: Analysis, Models, Solver, Migration
  • Trace the I/Os generated and run them through analysis tools to create a "workload" file
  • Two parts are generated from analysis
  • "Stores": a logically contiguous, fixed-size block of storage, usually implemented as a logical volume
  • "Streams": an access pattern to a particular store, currently defined by average request rate, average request size, run count, on/off time, and overlap fraction
  • In our experiments, some additional per-stream values are also calculated to ease understanding the behaviour of the system
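As an illustration, the simplest per-stream values could be computed from a raw trace like this (a simplified sketch: the real analysis also derives run count, on/off time, and overlap fraction):

```python
from dataclasses import dataclass

@dataclass
class StreamSummary:
    avg_request_rate: float   # requests per second
    avg_request_size: float   # bytes per request

def summarize(request_sizes, duration_s):
    """Reduce a raw I/O trace (one entry per request, in bytes)
    to the two simplest per-stream values the analysis emits."""
    n = len(request_sizes)
    return StreamSummary(
        avg_request_rate=n / duration_s,
        avg_request_size=sum(request_sizes) / n if n else 0.0,
    )
```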

7
Hippodrome: Analysis, Models, Solver, Migration
  • Two inputs to the models
  • Device configuration: Logical Units (LUNs) with disk type, number of disks, RAID level, and stripe size; the array controller associated with each LUN
  • Workload configuration: the list of stores on each LUN, and therefore the streams accessing that LUN and using the associated controller
  • Output is the utilization of each component (disk, controller, SCSI bus, etc.)
  • In our experiments, the models are calibrated to a 6-disk R5 LUN for 4 KB and 256 KB random I/Os at an accuracy above 98%, as the general models are still being developed
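A toy version of such a model, with per-LUN capacity taken from the 625 random 4 KB reads/s figure quoted later in the talk (the per-controller capacity here is an assumed illustrative number, not a measured one):

```python
LUN_IOPS = 625          # calibrated: 6-disk R5 LUN, random 4 KB reads
CONTROLLER_IOPS = 2500  # assumed illustrative controller capacity

def predict_utilizations(placement):
    """placement maps a LUN id to (controller id, [stream rates in req/s]).
    Returns predicted utilization (1.0 == 100%) per LUN and per controller."""
    lun_util, ctrl_load = {}, {}
    for lun, (ctrl, rates) in placement.items():
        load = sum(rates)                       # aggregate rate on this LUN
        lun_util[lun] = load / LUN_IOPS
        ctrl_load[ctrl] = ctrl_load.get(ctrl, 0.0) + load
    ctrl_util = {c: load / CONTROLLER_IOPS for c, load in ctrl_load.items()}
    return lun_util, ctrl_util
```

A configuration would be rejected when any returned utilization reaches 1.0.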

8
Hippodrome: Analysis, Models, Solver, Migration
  • Two inputs to the solver
  • The workload (streams and stores)
  • A description of "valid" configurations (what devices to use, what RAID levels to use, etc.)
  • Output of the solver is a configuration
  • Array descriptions (LUNs, disks, controllers, etc.)
  • The mapping of stores onto LUNs
  • The solver uses the models to predict whether a configuration is valid (i.e. no component is over 100% utilized)
  • In our experiments, the solver is pinned to 6-disk R5 LUNs to match the models and to eliminate the need to migrate between RAID types
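The real solver performs sophisticated optimization; as a sketch of the idea, here is a first-fit-decreasing assignment that uses a single-resource model as the validity check (the 625 req/s figure is from the talk; the packing heuristic is illustrative, not Hippodrome's actual algorithm):

```python
LUN_IOPS = 625  # calibrated 6-disk R5 LUN capacity (random 4 KB reads)

def solve(stream_rates, headroom=0.0):
    """Assign streams (given by request rate in req/s) to 6-disk R5 LUNs
    with first-fit decreasing, rejecting any placement the model predicts
    would push a LUN past 100% utilization minus headroom.
    Returns the aggregate rate placed on each LUN."""
    limit = LUN_IOPS * (1.0 - headroom)
    luns = []                                  # aggregate rate per LUN
    for rate in sorted(stream_rates, reverse=True):
        for i, load in enumerate(luns):
            if load + rate <= limit:           # model says this LUN still fits
                luns[i] += rate
                break
        else:
            luns.append(rate)                  # open a new LUN
    return luns
```

Four 500 req/s streams need four LUNs (no pair fits under 625), while three 300 req/s streams fit on two; adding headroom shrinks what counts as "valid".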

9
Hippodrome: Analysis, Models, Solver, Migration
  • Takes as input the new "desired" configuration
  • Migrates the system to the new configuration, preserving the data and access to the data during the migration
  • In our experiments, the synthetic application does not care about the data, so we simply destroy the old configuration and create the new one to perform a "migration"

10
Hippodrome: Experimental Overview
Execute Application
Migrate to Configuration
Analyze Workload
Design New Configuration
  • Each experiment is a series of iterations around the loop; each iteration is called a "step"
  • Each step provides three values
  • Deviation from target rate ("goodness" metric 1)
  • Average I/O response time ("goodness" metric 2)
  • Number of LUNs used
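The two "goodness" metrics can be computed per step roughly as follows (a sketch; how the measurements are aggregated in the actual experiments may differ):

```python
def goodness(target_rate, achieved_rate, response_times_s):
    """Per-step metrics: deviation from the target request rate
    (req/s; 0 means the step hit its target) and the average I/O
    response time in seconds."""
    deviation = target_rate - achieved_rate
    avg_response = sum(response_times_s) / len(response_times_s)
    return deviation, avg_response
```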

11
Experiment Grouping
  • Multiple variants of each "application"
  • constant-1: streams always on, I/O rate constant
  • constant-2: stream groups anti-correlated, I/O rate constant when active
  • scaling-1: one store running as fast as possible
  • scaling-2: like constant-1, but streams are enabled in different steps; once enabled, a stream stays on
  • scaling-3: like constant-1, but stream I/O rate increases as the step number increases
  • All experiments show global adaptation is possible

12
Hippodrome: Experiments Demonstrate Lessons
  • "Goodness" dependent on accuracy of models
  • constant-1, constant-2: we show how to "break" the loop
  • Rate of adaptation dependent on "over-commit" available in the system
  • constant-1, constant-2: we show how fast the system converges
  • A gradually increasing workload can always be "good" if enough headroom exists
  • scaling-2, scaling-3: we show that the application always runs at its target rate

13
Hippodrome: Experimental Hardware/Software
  • Array for experiments is an HP FC-60
  • 2 controllers, 6 trays
  • 1 Ultra SCSI bus per tray (40 MB/s)
  • 4 Seagate 18 GB, 10k RPM disks used per tray (24 total)
  • 4 six-disk R5 LUNs at 16 KB stripe size
  • 1 LUN can do 625 random 4 KB reads/second
  • Host for experiments is an HP N-Class
  • 1 × 440 MHz CPU, 1 GB memory, HP-UX 11.00
  • 2 × 100 MB/s fibre channel cards used
  • Locally developed synthetic application (Buttress)
  • Host and array connected through a Brocade switch

14
Hippodrome: Common Experiment Parameters
  • We vary the number of stores, number of streams, and target request rate
  • Some parameters usually the same
  • Phasing: all streams on at the same time
  • Store capacity: 256 MB
  • Max. I/Os outstanding per stream: 4
  • Headroom: 0%
  • Some parameters constant for all experiments
  • Request type: 4 KB read
  • Request offset: uniformly random across the store, aligned to a 1 KB boundary
  • Run count: 1 (no sequentiality in requests)
  • Arrival process: open, Poisson
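Those common parameters translate into a concrete request generator. A sketch, with a plain Python PRNG standing in for the Buttress application:

```python
import random

STORE_BYTES = 256 << 20   # 256 MB store capacity
REQ_BYTES = 4096          # 4 KB reads

def gen_requests(rate_hz, duration_s, seed=0):
    """Open Poisson arrival process of 4 KB reads, offsets uniformly
    random across the store and aligned to a 1 KB boundary."""
    rng = random.Random(seed)
    t, requests = 0.0, []
    while True:
        t += rng.expovariate(rate_hz)   # exponential gaps => Poisson arrivals
        if t >= duration_s:
            break
        offset = rng.randrange(0, STORE_BYTES - REQ_BYTES + 1, 1024)
        requests.append((t, offset, REQ_BYTES))
    return requests
```

Because the process is open, arrivals keep coming regardless of how fast the device completes earlier requests, which is what lets an under-provisioned configuration saturate.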

15
Hippodrome: Constant-1 Experiments
  • Important result is the shape of the graphs
  • Deviation from target rate converges to 0
  • Response time gets (much) better
  • LUNs used (in the end) match the required request rate
  • Comments
  • Variants 0-3 have a total request rate of 2000 (4 LUNs)
  • Variants 4-6 experiment with filling a LUN to start
  • Variants 5, 6 differ only in the headroom

16
Constant-1: Deviation from Target Rate
  • Variants 0-5 converge to a 95% CI of 0
  • Variant 4 converged even though the LUN was full at the start
  • Variant 5 converged because of the 10% headroom
  • Variant 6 never converges; the models predict the LUN is only 95% utilized

17
Constant-1: Response Time
  • Response times get an order of magnitude better
  • Variant 6 stays at the bad (0.15 second) average
    response time

18
Constant-1: Number of LUNs
  • Lines offset slightly so different variants can be seen
  • Goes up by 1 LUN each step; can't over-commit a device to 200%
  • Variants 4, 5 have a total request rate < 3 × 625, so only use 3 LUNs
  • Variant 6 stays at 1 LUN, as would be predicted by the other results

19
Hippodrome: Constant Workload Review
  • Given a constant workload, the loop converges to
    the "correct" system in most cases
  • "Goodness" dependent on accuracy of models
  • We "break" the loop either through not enough
    headroom or bad models
  • Rate of adaptation dependent on "over-commit"
    available in the system
  • In general, it increases by 1 LUN per iteration
  • With a workload with idle time, it converges
    faster
  • Now look at workloads that change

20
Hippodrome: Scaling-2 Experiments
  • Scaling-2 is intended to simulate adding additional weeks in a data warehouse, additional file systems, etc.
  • We turn on streams as the step number increases
  • Store capacity: 64 MB; max. outstanding: 4
  • Comments
  • Always "correct": the rate of increase is small enough
  • Response time shows the points where we added work
  • LUNs increase as necessary

21
Scaling-2: Deviation from Target Rate
  • Error bars are the same size as before; the scale is much smaller
  • Amazingly, always within the 95% confidence interval of correct
  • Slightly above 0 deviation because of measurement methodology

22
Scaling-2: Response Time
  • Scale is much smaller than for constant workloads (max. of 0.055 s vs. 1 s)
  • Now we can see when we add work and when we remain constant
  • Height of peaks shows how close to 100% utilization the previous step was
  • Slight trend upward: more total I/Os and more capacity actively used

23
Scaling-2: Response Time (Variant 0 only)
  • Now we can see when we add work and when we remain constant
  • Height of peaks shows how close to 100% utilization the previous step was

24
Scaling-2: Number of LUNs
  • Gradual increase in LUNs
  • Exact switch point is dependent on the specific increase pattern
  • Changes are close together as the increase patterns are similar

25
Hippodrome: Scaling Workload Review
  • Handled an order-of-magnitude increase in workload without serious slowdowns
  • Number of LUNs up by a factor of 4
  • Could see the points of additional work in the response time jumping and then settling
  • Question: what other scaling-up patterns are useful?
  • One other group planned: different streams scaling at different rates

26
Hippodrome: Future Work
  • Shifting workloads (transaction processing in the
    day, decision support at night)
  • Cyclic workloads (system is told about the
    different shift positions)
  • More complete models, migration of actual data
  • More complex synthetic workloads
  • Simple "application" (TPC-B?)
  • Complex application (Retail Data Warehouse)
  • Support for global bounds on system size/cost

27
Hippodrome: Four Parts Needed for Adaptation
Execute Application
Migrate to Configuration
Analyze Workload
Design New Configuration
  • Analysis: generates a summary of a workload
  • Models: predict the performance of a configuration
  • Solver: finds a new "optimal" configuration
  • Migration: moves the current configuration to the new one
  • Solver and Models are both part of the "Design New Configuration" step

28
Hippodrome: Lessons
  • Global system adaptation is possible using the four parts of the loop
  • Solver: finds a new "optimal" configuration
  • Models: predict the performance of a configuration
  • Analysis: generates a summary of a workload
  • Migration: moves the current configuration to the new one
  • "Goodness" dependent on accuracy of models
  • Rate of adaptation dependent on "over-commit"
    available in the system
  • A gradually increasing workload can always be
    "good" if enough headroom exists

29
Hippodrome: Automatic Global Storage Adaptation
  • Questions?
  • Joint work with Eric Anderson, Mustafa Uysal,
    Michael Hobbs, Guillermo Alvarez, Mahesh
    Kallahalla, Kim Keeton, Arif Merchant, Erik
    Riedel, Susan Spence, Ram Swaminathan, Simon
    Towers, Alistair Veitch, John Wilkes (HP Labs
    Storage Systems Program)

30
Hippodrome: Constant-2 Experiments
  • Phasing is a very important workload property
  • Divide streams into groups (1..n); group start times are offset, then a constant on/off pattern
  • Max. outstanding per stream: 32; target rate: 40
  • Comments
  • Variant 1 shows faster adaptation because of idle time
  • Variants 4/5 show what happens when the analysis step is wrong
  • Scaling groups proved uninteresting

31
Constant-2: Deviation from Target Rate
  • All except variant 4 converge; 4 appears the same as 5 in the analysis, but the two groups overlap in 4 and are anti-correlated in 5
  • Variant 1 converges faster than the others; this is because the idle time between groups running allows the system to drain requests

32
Constant-2: Response Time
  • Similar results to previous slide
  • Variant 4 does not get to a good response time
  • Variant 1 converges faster than others.

33
Constant-2: Number of LUNs
  • Now we see why variant 1 converges faster: it gets to 4 LUNs in only 2 steps rather than three; this is because of the idle time
  • Otherwise, behaviour is the same as for constant-1, which is to be expected as, in the aggregate, constant-2 is the same as constant-1

34
Hippodrome: Scaling-1 Experiments
  • Scaling-1 is intended to simulate something like a disk copy that runs as fast as the disks will go (it's disk-bound, not CPU-bound)
  • Worked for 3 iterations of the loop (even striped the store across multiple LUNs), then wanted 5 LUNs, which were not available
  • Future work: handling a global bound on the size of the storage system (for example, you can't spend more than $100,000)

35
Hippodrome: Scaling-3 Experiments
  • Scaling-3 is intended to simulate adding work over a constant data set (e.g. more queries to a DB)
  • We increase the target request rate as the step number increases
  • Store capacity: 64 MB; max. outstanding: 4; max. RR: 36
  • Comments
  • Always "correct": the rate of increase is small enough
  • Response time shows the points where we added work
  • LUNs increase as necessary
  • Initial deviations are noise due to the low request rate

36
Scaling-3: Deviation from Target Rate
  • Ignore the graph before about step 4: request rates are too low, and the analysis sees bursts and calculates rates over them
  • Always supports the target request rate

37
Scaling-3: Response Time
  • Variant 1 shows the up-down pattern of a changing then stabilizing workload
  • Always doing pretty well; big drops for variant 0 as the LUN count increased

38
Scaling-3: Number of LUNs
  • Increases gradually; the exact switch-over point is dependent on variant specifics