Fair Share Scheduling - PowerPoint PPT Presentation

Provided by: ethanb8
Learn more at: https://www.cs.umb.edu
Transcript and Presenter's Notes

Title: Fair Share Scheduling


1
Fair Share Scheduling
Ethan Bolker
Mathematics & Computer Science, UMass Boston
eb_at_cs.umb.edu www.cs.umb.edu/eb
Queens University, March 23, 2001
2
References
  • www.bmc.com/patrol/fairshare
  • www.cs.umb.edu/eb/goalmode

3
Coming Attractions
  • Queueing theory primer
  • Fair share semantics
  • Priority scheduling conservation laws
  • Predicting response times from shares
    • analytic formula
    • experimental validation
    • applet simulation
  • Implementation geometry

4
Transaction Workload
  • Stream of jobs visiting a server
    (ATM, time-shared CPU, printer, …)
  • Jobs queue when server is busy
  • Input
    • Arrival rate λ jobs/sec
    • Service demand s sec/job
  • Performance metrics
    • server utilization u = λs (must be ≤ 1)
    • response time r = ??? sec/job (average)
    • degradation d = r/s

5
Response time computations
  • r, d measure queueing delay
  • r ≥ s (d ≥ 1), unless parallel processing
    possible
  • Randomness really matters
  • r = s (d = 1) if arrivals scheduled (best case,
    no waiting)
  • r ≫ s for bulk arrivals (worst case, maximum
    delays)
  • Theorem. If arrivals are Poisson and service is
    exponentially distributed (M/M/1) then
    d = 1/(1 - u), r = s/(1 - u)
  • Think: virtual server with speed 1 - u
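
The M/M/1 formulas on this slide can be tried out numerically. The following is a minimal sketch, not part of the original talk; the function name `mm1_metrics` and its return format are illustrative choices.

```python
def mm1_metrics(arrival_rate: float, service_demand: float) -> dict:
    """Utilization, degradation and mean response time for an M/M/1 queue.

    arrival_rate is in jobs/sec, service_demand in sec/job.
    """
    u = arrival_rate * service_demand      # u = lambda * s
    if not 0 <= u < 1:
        raise ValueError("need 0 <= u < 1 for a stable queue")
    d = 1.0 / (1.0 - u)                    # d = 1/(1 - u)
    r = service_demand * d                 # r = s/(1 - u)
    return {"u": u, "d": d, "r": r}


if __name__ == "__main__":
    # Degradation grows sharply as utilization approaches 1.
    for lam in (0.5, 0.9, 0.95):
        m = mm1_metrics(lam, 1.0)
        print(f"u={m['u']:.2f}  d={m['d']:.1f}  r={m['r']:.1f} sec")
```

Calling it with increasing arrival rates and a fixed service demand makes the nonlinearity visible: the step from u = 0.9 to u = 0.95 doubles the response time.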

6
M/M/1
  • Essential nonlinearity often counterintuitive
  • at u = 95%, average degradation is 1/(1 - 0.95) =
    20,
  • but 1 customer in 20 has no wait at all (5% idle
    time)
  • A useful guide even when hypotheses fail
  • accurate enough (± 30%) for real computer systems
  • d depends only on u: many small jobs have same
    impact as few large jobs
  • faster system ⇒ smaller s ⇒ smaller u;
    r = s/(1 - u) ⇒ double win: less
    service, less wait
  • waiting costly, server cheap (telephones): want u
    ≈ 0
  • server costly (doctors): want u ≈ 1, but scheduled

7
Scheduling for Performance
  • Customers want good response times
  • Decreasing u is expensive
  • High end Unix offerings from HP, IBM, Sun offer
    fair share scheduling packages that allow an
    administrator to allocate scarce resources (CPU,
    processes, bandwidth) among workloads
  • How do these packages behave?
  • Model as a black box, independent of internals
  • Limit study to CPU shares on a uniprocessor

8
Multiple Job Streams
  • Multiple workloads, utilizations u1, u2, …
  • U = Σ ui < 1
  • If no workload prioritization
    then all degradations are equal:
    di = 1/(1 - U)
  • Share allocations are de facto prioritizations
  • Study degradation vector V = (d1, d2, …)

9
Share Semantics
  • Suppose workload w has CPU share fw
  • Normalize shares so that Σw fw = 1
  • w gets fraction fw of CPU time slices when at
    least one of its jobs is ready for service
  • Can it use more if competing workloads idle?
  • No: think share = cap
  • Yes: think share = guarantee

10
Shares As Caps
  • Good for accounting (sell fraction of web server)
  • Available now from IBM, HP, soon from Sun
  • Straightforward (boring) - workloads are isolated
  • Each runs on a virtual processor with speed f

                   dedicated system    share f
  utilization      u                   u/f (need f > u!)
  response time    r                   r(1 - u)/(f - u)
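
The capped-share response time follows from applying the M/M/1 formula to a virtual processor of speed f. A small sketch under that assumption (the function name `capped_response_time` is made up for illustration):

```python
def capped_response_time(s: float, u: float, f: float) -> float:
    """Mean response time for a workload confined to CPU share f.

    On a virtual processor of speed f the service time becomes s/f and
    the utilization u/f, so the M/M/1 formula gives (s/f)/(1 - u/f),
    which simplifies to s/(f - u).
    """
    if not 0 < f <= 1:
        raise ValueError("share f must lie in (0, 1]")
    if u >= f:
        raise ValueError("need f > u, otherwise the capped workload saturates")
    return s / (f - u)


if __name__ == "__main__":
    s, u = 1.0, 0.3
    dedicated = s / (1 - u)                 # r on a dedicated system
    for f in (1.0, 0.5, 0.35):
        r_cap = capped_response_time(s, u, f)
        # Same value as r(1 - u)/(f - u) with r the dedicated response time.
        print(f"f={f:.2f}  r={r_cap:.2f}  check={dedicated * (1 - u) / (f - u):.2f}")
```

As the cap f shrinks toward u, the response time grows without bound even though the physical CPU may be mostly idle.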
11
Shares As Guarantees
  • Good for performance & economy
    (use otherwise idle resources)
  • Shares make a difference only when there are
    multiple workloads
  • Large share resembles high priority;
    share may be less than utilization
  • Workload interaction is subtle, often
    unintuitive, hard to explain

12
Modeling
13
Modeling
  • Real system
    • Complex, dynamic, frequent state changes
    • Hard to tease out cause and effect
  • Model
    • Static snapshot, deals in averages and
      probabilities
    • Fast enlightening answers to what-if questions
  • Abstraction helps you understand real system
  • Start with a study of priority scheduling

14
Priority Scheduling
  • Priority state: order workloads by priority (ties
    OK)
  • two workloads, 3 states: 12, 21, and 1 and 2 tied
  • three workloads, 13 states:
    • all three ranked, e.g. 123 (6 = 3! of these ordered states),
    • two tied above the third (3 of these),
    • one above the other two tied (3 of these),
    • all three tied (1 state with no priorities)
  • n wkls, f(n) states, n! ordered (simplex lock
    combos)
  • p(s) = prob(state s) = fraction of time in
    state s
  • V(s) = degradation vector when state s
    (measure this, or compute it using queueing
    theory)
  • V = Σs p(s) V(s) (time avg is convex combination)
  • Achievable region is convex hull of vectors V(s)
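
The state counts 3 and 13 can be reproduced by counting rankings with ties. The sketch below assumes f(n) is the number of such rankings; the recurrence picks the group tied for top priority and then ranks the rest. The name `priority_states` is illustrative.

```python
from functools import lru_cache
from math import comb, factorial


@lru_cache(maxsize=None)
def priority_states(n: int) -> int:
    """Number of ways to rank n workloads by priority when ties are allowed."""
    if n == 0:
        return 1
    # Choose the k workloads tied for top priority, then rank the other n - k.
    return sum(comb(n, k) * priority_states(n - k) for k in range(1, n + 1))


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, priority_states(n), "states,", factorial(n), "fully ordered")
```

For n = 4 this gives 75, one more than the 74 faces quoted later for the four-workload permutahedron, presumably because the fully tied state corresponds to the whole polytope rather than to a proper face.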

15
Two workloads
[Figure: axes d1 and d2 with the line d1 = d2. Marked points: V(12) (wkl 1 high prio), V(21), and the no-priorities point. The achievable region is marked.]
16
Two workloads
[Figure: axes d1 and d2 with the line d1 = d2. Marked points: V(12) (wkl 1 high prio), V(21), and the no-priorities point. Note: 0.5 V(12) + 0.5 V(21) ≠ V(no priorities).]
17
Two workloads
[Figure: axes d1 and d2 with the line d1 = d2. Marked points: V(12) (wkl 1 high prio), V(21), and the no-priorities point. Note: u1 < u2 ⇒ wkl 2 effect on wkl 1 large.]
18
Conservation
  • No Free Lunch Theorem. Weighted average
    degradation is constant, independent of priority
    scheduling scheme:
    Σi (ui/U) di = 1/(1 - U)
  • Provable from some hypotheses
  • Observable in some real systems
  • Sometimes false: shortest job first minimizes
    average response time (printer queues,
    supermarket express checkout lines)

19
Conservation
  • For any proper set A of workloads:
  • Imagine giving those workloads top priority.
  • Then can pretend other wkls don't exist. In that
    case
    Σi∈A (ui/U(A)) di = 1/(1 - U(A))
  • When wkls in A have lower priorities they have
    higher degradations, so in general
    Σi∈A (ui/U(A)) di ≥ 1/(1 - U(A))
  • These 2^n - 2 linear inequalities determine the
    convex achievable region R
  • R is a permutahedron: only n! vertices
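
The subset inequalities can be checked numerically at every vertex of R. The sketch below rests on an assumption the slides do not state: that a workload with total utilization `above` ahead of it in priority order has degradation 1/((1 - above)(1 - above - u)), which is the natural n-workload extension of the two-workload vertex formula used in this talk. Function names are illustrative.

```python
from itertools import combinations, permutations


def priority_degradations(order, u):
    """Degradations when `order` lists workloads from highest to lowest priority.

    ASSUMPTION: extends the two-workload vertex formula to n workloads.
    """
    d = [0.0] * len(u)
    above = 0.0                       # utilization of higher-priority workloads
    for w in order:
        d[w] = 1.0 / ((1.0 - above) * (1.0 - above - u[w]))
        above += u[w]
    return d


def check_conservation(u, tol=1e-9):
    """True if every priority order satisfies the conservation equality
    and all 2^n - 2 proper-subset inequalities."""
    n, U = len(u), sum(u)
    for order in permutations(range(n)):
        d = priority_degradations(order, u)
        if abs(sum(u[i] / U * d[i] for i in range(n)) - 1.0 / (1.0 - U)) > tol:
            return False
        for size in range(1, n):
            for A in combinations(range(n), size):
                UA = sum(u[i] for i in A)
                lhs = sum(u[i] / UA * d[i] for i in A)
                if lhs < 1.0 / (1.0 - UA) - tol:
                    return False
    return True


if __name__ == "__main__":
    print(check_conservation([0.2, 0.3, 0.1]))
```

The loop visits n! orders and 2^n - 2 subsets per order, so it is only meant for small n.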

20
Two Workloads
[Figure: axes d1 (workload 1 degradation) and d2 (workload 2 degradation). Conservation law: (d1, d2) lies on the line (u1/U) d1 + (u2/U) d2 = 1/(1 - U).]
21
Two Workloads
[Figure: axes d1 (workload 1 degradation) and d2 (workload 2 degradation). Constraint resulting from workload 1: d1 ≥ 1/(1 - u1).]
22
Two Workloads
Workload 1 runs at high priority: V(1,2) = (1/(1 - u1), 1/((1 - u1)(1 - U)))
[Figure: axes d1 (workload 1 degradation) and d2 (workload 2 degradation). Vertex V(1,2) lies on the constraint resulting from workload 1, d1 ≥ 1/(1 - u1).]
23
Two Workloads
[Figure: axes d1 (workload 1 degradation) and d2 (workload 2 degradation). Line (u1/U) d1 + (u2/U) d2 = 1/(1 - U); constraint d2 ≥ 1/(1 - u2); vertex V(2,1).]
24
Two Workloads
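V(1,2) = (1/(1 - u1), 1/((1 - u1)(1 - U)))
[Figure: axes d1 (workload 1 degradation) and d2 (workload 2 degradation). The achievable region R is the part of the line (u1/U) d1 + (u2/U) d2 = 1/(1 - U) between vertices V(1,2) and V(2,1), bounded by d1 ≥ 1/(1 - u1) and d2 ≥ 1/(1 - u2).]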
25
Three Workloads
  • Degradation vector (d1, d2, d3) lies on plane
    u1 d1 + u2 d2 + u3 d3 = C
  • We know a constraint for each workload w:
    uw dw ≥ Cw
  • Conservation applies to each pair of wkls as
    well: u1 d1 + u2 d2 ≥ C12
  • Achievable region has one vertex for each
    priority ordering of workloads: 3! = 6 in all
  • Hence its name: the permutahedron

26
Three Workload Permutahedron
3! = 6 vertices (priority orders), 2^3 - 2 = 6
edges (conservation constraints)
u1 d1 + u2 d2 + u3 d3 = C
[Figure: the permutahedron drawn against axes d1, d2, d3, with vertices V(1,2,3) and V(2,1,3) labeled.]
27
Experimental evidence
28
Four workload permutahedron
4! = 24 vertices (ordered states), 2^4 - 2 = 14
facets (proper subsets) (conservation
constraints), 74 faces (states)
Simplicial geometry and transportation
polytopes, Trans. Amer. Math. Soc. 217 (1976) 138.
29
Map shares to degradations- two workloads -
  • Suppose f1, f2 > 0, f1 + f2 = 1
  • Model: System operates in state
    • 12 with probability f1
    • 21 with probability f2
    • (independent of who is on queue)
  • Average degradation vector
    V = f1 V(12) + f2 V(21)

30
Predict Degradations From Shares (Two Workloads)
  • Reasonable modeling assumption: f1 = 1, f2 = 0
    means workload 1 runs at high priority
  • For arbitrary shares, workload priority order is
    • (1,2) with probability f1
    • (2,1) with probability f2
    (probability = fraction of time)
  • Compute average workload degradation:
    d1 = f1 × (wkl 1 degradation at high priority)
       + f2 × (wkl 1 degradation at low priority)
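
The two-workload prediction can be written out directly by combining the share weights with the vertex formula V(1,2) = (1/(1 - u1), 1/((1 - u1)(1 - U))) from the conservation slides and its mirror image V(2,1). This is a sketch only; `predict_two_workloads` is a hypothetical name, and the applet referenced in the talk may compute things differently.

```python
def predict_two_workloads(u1, u2, f1, f2):
    """Predicted degradations (d1, d2) for utilizations u1, u2 and shares f1, f2."""
    total = f1 + f2
    f1, f2 = f1 / total, f2 / total              # normalize so f1 + f2 = 1
    U = u1 + u2
    if U >= 1:
        raise ValueError("total utilization must be below 1")
    # V(1,2): workload 1 at high priority, workload 2 at low priority.
    hi1, lo2 = 1 / (1 - u1), 1 / ((1 - u1) * (1 - U))
    # V(2,1): workload 2 at high priority, workload 1 at low priority.
    hi2, lo1 = 1 / (1 - u2), 1 / ((1 - u2) * (1 - U))
    d1 = f1 * hi1 + f2 * lo1
    d2 = f1 * lo2 + f2 * hi2
    return d1, d2


if __name__ == "__main__":
    u1, u2 = 0.2, 0.6
    for f1 in (1.0, 0.7, 0.3, 0.0):
        d1, d2 = predict_two_workloads(u1, u2, f1, 1.0 - f1)
        weighted = (u1 * d1 + u2 * d2) / (u1 + u2)   # stays at 1/(1 - U)
        print(f"f1={f1:.1f}  d1={d1:.2f}  d2={d2:.2f}  weighted avg={weighted:.2f}")
```

Whatever the shares, the utilization-weighted average printed in the last column stays at 1/(1 - U), which is the conservation law at work.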

31
Model validation
32
Model validation
33
Map shares to degradations- three (n) workloads -
  • prob(123) = f1 f2 f3 / ((f1 + f2 + f3)(f2 + f3)(f3))
  • Theorem: These n! probabilities sum to 1
    • interesting identity generalizing adding
      fractions
    • prove by induction, or by coupon collecting
  • V = Σ over ordered states s of prob(s) V(s)
  • O(n!), Ω(n!), good enough for n ≤ 9 (12)
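
The theorem that the n! probabilities sum to 1 is easy to check numerically. The sketch below implements the prob(123) formula for an arbitrary order, assuming strictly positive shares; `order_probability` is an illustrative name.

```python
from itertools import permutations


def order_probability(order, shares):
    """Probability that workloads are ranked in `order` (highest priority first).

    Each factor is a workload's share divided by the total share of the
    workloads not yet placed. Shares must be strictly positive.
    """
    p, tail = 1.0, 0.0
    # Walk from the lowest priority up so `tail` is the running denominator.
    for w in reversed(order):
        tail += shares[w]
        p *= shares[w] / tail
    return p


if __name__ == "__main__":
    shares = [5.0, 3.0, 2.0, 1.0]            # need not be normalized
    total = sum(order_probability(s, shares)
                for s in permutations(range(len(shares))))
    print(f"sum over {len(shares)}! orders = {total:.12f}")
```

Because only ratios of shares appear, the shares do not have to be normalized before calling the function.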

34
Model validation
35
Model validation
36
The Fair Share Applet
  • Screen captures on next slides are from
    www.bmc.com/patrol/fairshare
  • Experiment with what if fair share modeling
  • Watch a simulation
  • Random virtual job generator for the simulation
    is the same one used to generate random real jobs
    for our benchmark studies

37
Three Transaction Workloads
  • Three workloads, each with utilization
    0.32 jobs/second × 1.0
    seconds/job = 0.32 = 32%
  • CPU 96% busy, so average (conserved) response
    time is 1.0/(1 - 0.96) = 25 seconds
  • Individual workload average response times depend
    on shares
38
Three Transaction Workloads
  • Normalized f3 = 0.20 means 20% of the time
    workload 3 (development) would be dispatched at
    highest priority
  • During that time, workload priority order is
    (3,1,2) for 32/80 of the time, (3,2,1) for 48/80
  • Probability(priority order is 312) =
    0.20 × (32/80) = 0.08
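
The whole three-workload prediction can be reproduced in a few lines. Two things in this sketch are assumptions rather than statements from the slides: the shares f1 = 0.32 and f2 = 0.48 are inferred from the fractions 32/80 and 48/80 above, and the per-order degradations use the n-workload extension of the two-workload vertex formula. All function names are illustrative.

```python
from itertools import permutations


def order_probability(order, shares):
    """Probability of priority order `order` (highest first) given positive shares."""
    p, tail = 1.0, 0.0
    for w in reversed(order):
        tail += shares[w]
        p *= shares[w] / tail
    return p


def priority_degradations(order, u):
    """ASSUMED extension of the two-workload priority formula to n workloads."""
    d, above = [0.0] * len(u), 0.0
    for w in order:
        d[w] = 1.0 / ((1.0 - above) * (1.0 - above - u[w]))
        above += u[w]
    return d


def predict_degradations(u, shares):
    """V = sum over ordered states s of prob(s) V(s)."""
    V = [0.0] * len(u)
    for order in permutations(range(len(u))):
        p = order_probability(order, shares)
        for i, di in enumerate(priority_degradations(order, u)):
            V[i] += p * di
    return V


if __name__ == "__main__":
    u = [0.32, 0.32, 0.32]          # each workload: 0.32 jobs/sec x 1.0 sec/job
    shares = [0.32, 0.48, 0.20]     # f1, f2 inferred from 32/80 and 48/80
    V = predict_degradations(u, shares)
    # Service demand is 1.0 sec/job, so degradation equals response time in seconds.
    for w, r in enumerate(V, start=1):
        print(f"workload {w}: predicted response time {r:.1f} sec")
    print(f"throughput-weighted average: {sum(V) / len(V):.1f} sec")
```

With equal throughputs the weighted average is the plain mean of the three response times, and it comes out at the conserved 25 seconds regardless of how the shares are set.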

39
Three Transaction Workloads
  • Formulas on previous slide
  • Average predicted response time weighted by
    throughput 25 seconds (as expected)
  • Hard to understand intuitively
  • Software helps

40
Three Transaction Workloads
note change from 32%
41
Simulation
42
When the Model Fails
  • Real CPU uses round robin scheduling to deliver
    time slices
  • Short jobs never wait for long jobs to complete
  • That resembles shortest job first, so response
    time conservation law fails
  • At high utilization, simulation shows smaller
    response times than predicted by model
  • Response time conservation law yields
    conservative predictions

43
Scaling Degradation Predictions
  • V = Σ over ordered states s of prob(s) V(s)
  • Each s is a permutation of (1, 2, …, n)
  • Think of it as a vector in n-space
  • Those n! vectors lie on a sphere
  • For n large they are pretty densely packed
  • Think of prob(s) as a discrete approximation to a
    probability distribution on the sphere
  • V is an integral

44
Monte Carlo
  • loop sampleSize times
    • choose a permutation s at random from the
      distribution determined by the
      shares
    • compute degradation vector V(s)
    • accumulate V += V(s) / sampleSize
      (sampling already weights s by prob(s))
  • sampleSize = 40000 works well, independent of n!
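
A runnable version of this loop might look as follows. It is a sketch under stated assumptions: permutations are drawn by repeatedly picking the next workload with probability proportional to its share among those not yet placed (which matches the product formula for prob(s)), V(s) uses the assumed n-workload extension of the two-workload priority formula, and the estimate is the plain sample average. The names and the `seed` parameter are illustrative.

```python
import random


def sample_order(shares, rng):
    """Draw a priority order with probability prob(s) determined by the shares."""
    remaining = list(range(len(shares)))
    order = []
    while remaining:
        weights = [shares[i] for i in remaining]      # must be positive
        w = rng.choices(remaining, weights=weights)[0]
        order.append(w)
        remaining.remove(w)
    return order


def priority_degradations(order, u):
    """ASSUMED extension of the two-workload priority formula to n workloads."""
    d, above = [0.0] * len(u), 0.0
    for w in order:
        d[w] = 1.0 / ((1.0 - above) * (1.0 - above - u[w]))
        above += u[w]
    return d


def monte_carlo_degradations(u, shares, sample_size=40000, seed=0):
    """Estimate V as the average of V(s) over sampled priority orders."""
    rng = random.Random(seed)
    total = [0.0] * len(u)
    for _ in range(sample_size):
        d = priority_degradations(sample_order(shares, rng), u)
        for i, di in enumerate(d):
            total[i] += di
    return [t / sample_size for t in total]


if __name__ == "__main__":
    u = [0.05] * 12                               # 12 workloads, 12! orders
    shares = [float(k) for k in range(1, 13)]
    print([round(v, 2) for v in monte_carlo_degradations(u, shares)])
```

The cost is proportional to sampleSize times n, so twelve workloads pose no difficulty even though enumerating 12! orders would.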

45
Map shares to degradations(geometry)
  • Interpret shares as barycentric coordinates in
    the (n - 1)-simplex
  • Study the geometry of the map from the simplex to
    the (n - 1)-dimensional permutahedron
  • Easy when n = 2: each is a line segment and map is
    linear

46
Mapping a triangle to a hexagon
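[Figure: map M from the share triangle (labels f1 = 0, f1 = 1, f3 = 0, f3 = 1) to the hexagon with vertices 312, 132, 123, 213, 231, 321; annotations mark wkl 1 high priority and wkl 1 low priority.]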
47
Mapping a triangle to a hexagon
[Figure: triangle-to-hexagon map; labels f1 = 0, f1 = 1, 23.]
48
Mapping a triangle to a hexagon