1
Design and Implementation of a Generic Resource-Sharing Virtual-Time Dispatcher
  • Tal Ben-Nun, School of Engineering & CS, Hebrew University
  • Yoav Etsion, CS Dept., Barcelona Supercomputing Center
  • Dror Feitelson, School of Engineering & CS, Hebrew University

Supported by the Israel Science Foundation, grant no. 28/09
6
Design and Implementation of a Generic Resource-Sharing Virtual-Time Dispatcher
  • Goal is to control the share of resources, not to optimize performance (important in virtualization)
  • Same module used for diverse resources
  • Mechanism used: dispatch the most deserving client at each instant
  • Selection of the deserving client uses a virtual-time formalism
  • Implemented and measured in Linux
7
Motivation
  • Context: VMM for server consolidation
  • Multiple legacy servers share a physical platform
  • Improved utilization and easier maintenance
  • Flexibility in allocating resources to virtual machines
  • Virtual machines typically run a single application (appliances)

8
Motivation
  • Assumed goal: enforce a predefined allocation of resources to different virtual machines (fair-share scheduling)
  • Based on importance / SLA
  • Can change with time or due to external events
  • Problem: what is 30% of the resources when there are many different resources and diverse requirements?

9
Global Scheduling
  • Fair share is usually applied to a single resource
  • But what if this resource is not a bottleneck?
  • Global scheduling idea:
    • Identify the system bottleneck resource
    • Apply fair-share scheduling on this resource
    • This induces appropriate allocations on the other resources
  • This paper: how to apply fair-share scheduling to any resource in the system

10
Previous Work I: Virtual Time
  • Accounting is inversely proportional to
    allocation
  • Schedule the client that is farthest behind

11
Previous Work II: Traffic Shaping
  • Leaky bucket
    • Variable requests
    • Constant-rate transmission
    • Bucket represents a buffer
  • Token bucket (see the sketch below)
    • Variable requests
    • Constant allocations
    • Bucket represents stored capacity
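To make the token-bucket scheme above concrete, here is a minimal user-space C sketch; the struct and function names are illustrative and not taken from the presentation.

```c
/* Minimal user-space sketch of a token bucket (illustrative names only). */
struct token_bucket {
    double tokens;   /* currently stored capacity */
    double capacity; /* maximum bucket size */
    double rate;     /* tokens added per second: the constant allocation */
};

/* Tokens accumulate at a constant rate, capped at the bucket capacity. */
static void tb_refill(struct token_bucket *tb, double elapsed_sec)
{
    tb->tokens += tb->rate * elapsed_sec;
    if (tb->tokens > tb->capacity)
        tb->tokens = tb->capacity;
}

/* A variable-sized request is admitted only if enough capacity is stored. */
static int tb_try_consume(struct token_bucket *tb, double request)
{
    if (tb->tokens < request)
        return 0; /* shape: delay or drop the request */
    tb->tokens -= request;
    return 1;
}
```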

12
Putting Them Together: RSVT
  • Resource sharing: all clients make progress continuously
    • Generalization of processor sharing
  • Each job has its ideal resource-sharing progress
    • This is considered to be the allocation a_i
    • Grows at a constant rate
  • Each job has its actual consumption c_i
    • Grows only when the job runs
  • Scheduling priority is the difference p_i = a_i - c_i (see the sketch below)
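A minimal user-space C sketch of the priority rule and the "most deserving client" selection described on this slide; the struct and function names are assumptions for illustration, not the module's actual interface.

```c
#include <stddef.h>

/* One record per client, as on the slide: a_i and c_i. */
struct rsvt_client {
    double alloc;    /* a_i: ideal resource-sharing progress */
    double consumed; /* c_i: actual consumption              */
};

/* Scheduling priority p_i = a_i - c_i: how far the client lags behind
 * its ideal share. */
static double rsvt_priority(const struct rsvt_client *c)
{
    return c->alloc - c->consumed;
}

/* Dispatch the most deserving client: the one with the largest p_i. */
static struct rsvt_client *rsvt_pick(struct rsvt_client *clients, size_t n)
{
    struct rsvt_client *best = NULL;
    for (size_t i = 0; i < n; i++)
        if (!best || rsvt_priority(&clients[i]) > rsvt_priority(best))
            best = &clients[i];
    return best;
}
```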

13
Example
  • Three clients
  • Allocations of roughly 50%, 30%, and 20%
  • Consumption always occurs in resource time

[Figure: consumed resource time vs. wallclock time for the three clients]
14
Bookkeeping
  • The set of active jobs is A
  • The relative allocation of job i is r_i
  • During an interval T, job k has run
  • Update allocations and consumptions (update rules reconstructed below)
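The update formulas on this slide did not survive the transcript; the following is a plausible reconstruction from the definitions above (active set A, relative allocations r_i, interval T in which job k ran), not the authors' exact notation.

```latex
% Reconstructed update rules (assumed form, consistent with the slide's definitions)
\begin{align*}
  a_i &\leftarrow a_i + \frac{r_i}{\sum_{j \in A} r_j}\, T
      && \text{for every } i \in A \quad \text{(update allocations)}\\
  c_k &\leftarrow c_k + T
      && \text{(update consumption of the job that ran)}
\end{align*}
```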

15
The Active Set
  • Active jobs (the set A) are those that can use
    the resource now
  • Allocations are relative to the active set
  • The active set may change
  • New job arrives
  • Job terminates
  • Job stops using resource temporarily
  • Job resumes use of resource

16
Grace Period
  • Intermittent activity (process data / send a packet) should retain its allocation even when momentarily inactive
  • Thus a_i continues to grow during a grace period after the client becomes inactive
  • The grace period reflects a notion of continuity
  • Sub-second time scale

17
Rebirth
  • Resumption after very long inactive periods should be treated as a new arrival
  • Due to the grace period, a job that becomes inactive accrues extra allocation
  • Forget this extra allocation after a rebirth period (set a_i = c_i)
  • Two orders of magnitude larger than the grace period

18
Implementation
  • Kernel module with generic functionality (interface sketched below)
    • Create / destroy module
    • Create / destroy client
    • Make request / set active / set inactive
    • Make allocations
    • Dispatch
    • Check-in (note resource usage)
  • Glue code for specific subsystems
    • Currently networking and CPU
    • Plan to add disk I/O
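The presentation does not give the module's actual C interface, so the header below is only a hypothetical sketch of the operations listed above; every name and signature here is an assumption.

```c
/* Hypothetical sketch of the generic operations listed on this slide;
 * the real kernel module's names and signatures are not shown in the
 * presentation, so everything here is illustrative. */
#ifndef RSVT_H
#define RSVT_H

struct rsvt_module;  /* one instance per managed resource */
struct rsvt_client;  /* one instance per VM / process / flow */

/* Create / destroy a dispatcher instance for one resource. */
struct rsvt_module *rsvt_create(void);
void rsvt_destroy(struct rsvt_module *m);

/* Create / destroy a client with relative allocation r_i. */
struct rsvt_client *rsvt_client_create(struct rsvt_module *m, double rel_alloc);
void rsvt_client_destroy(struct rsvt_client *c);

/* Client state changes driven by the glue code. */
void rsvt_make_request(struct rsvt_client *c);
void rsvt_set_active(struct rsvt_client *c);
void rsvt_set_inactive(struct rsvt_client *c);

/* Periodic accounting: grow allocations of the active clients. */
void rsvt_make_allocations(struct rsvt_module *m, unsigned long long usable_us);

/* Pick the most deserving client to run / transmit next. */
struct rsvt_client *rsvt_dispatch(struct rsvt_module *m);

/* Check-in: report how much resource time the dispatched client used. */
void rsvt_check_in(struct rsvt_client *c, unsigned long long used_us);

#endif /* RSVT_H */
```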

19
Networking Glue Code
  • Use the Linux QoS framework: create an RSVT queueing discipline

[Diagram: network stack, App -> TCP -> IP -> RSVT queueing discipline (Linux QoS) -> NIC]
20
Networking Glue Code
  • Non-RSVT traffic has priority (e.g. NFS traffic) and is counted as dead time

[Diagram: for each packet from App -> TCP -> IP, check whether it is RSVT traffic; if not, send it immediately; if yes, enqueue it, then select a packet and send it to the NIC]
21
CPU Scheduling Glue Code
  • Use the Linux modular scheduling core
  • Add an RSVT scheduling policy
  • The RSVT module essentially replaces the policy's runqueue
  • Initial implementation only for uniprocessors
  • CFS and possibly other policies also exist and have higher priority
  • When they run, this is considered dead time

22
Timer Interrupts
  • Linux employs timer interrupts (250 Hz)
  • Allocations are done at these times (see the sketch below)
  • Translate time into microseconds
  • Subtract known dead time (unavailable to us)
  • Divide among active clients according to relative allocations
  • Bound divergence of allocation from consumption
  • Also handles the grace period (mark as inactive)
  • Also handles rebirth (set a_i = c_i)
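A user-space C sketch of the per-tick allocation step just described; the constant names, field names, and the specific divergence bound are assumptions for illustration (grace-period and rebirth handling are omitted).

```c
#include <stddef.h>

struct rsvt_client {
    double alloc;      /* a_i */
    double consumed;   /* c_i */
    double rel_alloc;  /* r_i */
    int    active;
};

#define MAX_DIVERGENCE_US 1000000.0  /* assumed bound on a_i - c_i */

/* Per-tick update: divide the usable part of the tick (tick minus known
 * dead time) among the active clients in proportion to their relative
 * allocations, and bound how far allocation may run ahead of consumption. */
static void rsvt_tick(struct rsvt_client *clients, size_t n,
                      double tick_us, double dead_us)
{
    double usable = tick_us - dead_us;
    double total_rel = 0.0;
    size_t i;

    for (i = 0; i < n; i++)
        if (clients[i].active)
            total_rel += clients[i].rel_alloc;
    if (total_rel <= 0.0 || usable <= 0.0)
        return;

    for (i = 0; i < n; i++) {
        struct rsvt_client *c = &clients[i];
        if (!c->active)
            continue;
        c->alloc += usable * c->rel_alloc / total_rel;
        if (c->alloc - c->consumed > MAX_DIVERGENCE_US)
            c->alloc = c->consumed + MAX_DIVERGENCE_US;
    }
}
```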

23
Multi-Queue
  • At dispatch, need to find the client with the highest priority
  • But priorities change at different rates
  • Solution: allow only a limited, discrete set of relative priorities
  • Each priority has a separate queue
  • Maintain all clients in each queue in priority order
  • Only need to check the first in each queue to find the maximum (sketched below)
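A minimal C sketch of this dispatch step, assuming one sorted queue per discrete relative priority; NUM_RATES and the linked-list layout are illustrative assumptions.

```c
#include <stddef.h>

#define NUM_RATES 8   /* assumed size of the discrete set of relative priorities */

struct rsvt_client {
    double alloc, consumed;    /* a_i, c_i */
    struct rsvt_client *next;  /* next client in the same queue */
};

/* queues[r] is kept sorted so its head has the highest p_i = a_i - c_i
 * among the clients in that queue; dispatch only inspects the heads. */
static struct rsvt_client *rsvt_dispatch(struct rsvt_client *queues[NUM_RATES])
{
    struct rsvt_client *best = NULL;
    for (int r = 0; r < NUM_RATES; r++) {
        struct rsvt_client *head = queues[r];
        if (head && (!best ||
                     head->alloc - head->consumed > best->alloc - best->consumed))
            best = head;
    }
    return best;
}
```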

24
Experiment: Basic Allocations

  rate   bandwidth
  1      30.89 ± 0.05
  2      61.41 ± 0.02
25
Experiment: Basic Allocations

  rate   bandwidth
  1      15.69 ± 0.11
  2      30.81 ± 0.03
  3      46.10 ± 0.03
26
Experiment: Active Set
27
Experiment: Grace Period
28
Experiment: Rebirth
29
Experiment: Throttling
  • Two competing MPlayer instances
  • The one with the higher allocation does not need all of it
  • Allocation tracks consumption

30
Conclusions
  • Demonstrated a generic virtual-time-based resource-sharing dispatcher
  • Need to complete the implementation
    • Support for I/O scheduling
    • More details, e.g. SMP support
  • Building block of the global scheduling vision