Title: Virtuoso: Distributed Computing Using Virtual Machines
1. Virtuoso: Distributed Computing Using Virtual Machines
- Peter A. Dinda
- Prescience Lab
- Department of Computer Science
- Northwestern University
- http://plab.cs.northwestern.edu
2. People and Acknowledgements
- Students
- Ashish Gupta, Ananth Sundararaj, Bin Lin, Alex Shoykhet, Jack Lange, Dong Lu, Jason Skicewicz, Brian Cornell
- Collaborators
- In-Vigo project at University of Florida
- Renato Figueiredo, Jose Fortes
- Funders/Gifts
- NSF through several awards, VMware
3. Outline
- Motivation and context
- Virtuoso model
- Virtual networking
- Its central importance
- Application traffic load measurement and topology inference
- Understanding user comfort with resource borrowing
- User-centric resource control
- Related work
- Conclusions
4. How do we deliver arbitrary amounts of computational power to ordinary people?
5. Distributed and Parallel Computing
- How do we deliver arbitrary amounts of computational power to ordinary people?
Interactive Applications
7. (Diagram: computing resources at Northwestern and on the Distributed Optical Testbed)
IBM xSeries virtual cluster (64 CPUs), 1 TB RAID
Interactivity Environment Cluster, CAVE (90 CPUs), 8 TB RAID
2 Distributed Optical Testbed Clusters: IBM xSeries (14-28 CPUs), 1 TB RAID
DOT clusters with optical connectivity: IBM xSeries (14-28 CPUs), 1 TB RAID; Argonne, U.Chicago, IIT, NCSA, others
Nortel Optera Metro Edge Optical Router
Distributed Optical Testbed (DOT) Private Optical Network
Northwestern
8. Grid Computing
- "Flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources"
- I. Foster, C. Kesselman, S. Tuecke, The Anatomy of the Grid: Enabling Scalable Virtual Organizations, International J. Supercomputer Applications, 15(3), 2001
- Globus, Condor/G, Avaki, EU DataGrid SW,
9. Complexity from User's Perspective
- Process or job model
- Lots of complex state: connections, special shared libraries, licenses, file descriptors
- Operating system specificity
- Perhaps even version-specific
- Symbolic supercomputer example
- Need to buy into some Grid API
- Install and learn potentially complex Grid software
10. Users already know how to deal with this complexity at another level
11. Complexity from Resource Owner's Perspective
- Install and learn potentially complex Grid software
- Deal with local accounts and privileges
- Associated with global accounts or certificates
- Protection/Isolation
- Support users with different OS, library, license, etc., needs.
12. Virtual Machines
- Language-oriented VMs
- Abstract interpreted machine, JIT compiler, large library
- Examples: UCSD p-system, Java VM, .NET VM
- Application-oriented VMs
- Redirect library calls to appropriate place
- Examples: Entropia VM
- Virtual servers
- Kernel makes it appear that a group of processes are running on a separate instance of the kernel, or run OS at user level on top of itself
- Examples: Ensim, Virtuozzo, UML, VServer, FreeVSD
- Microkernels designed to host OSes
- Xeno VM
- Virtual machine monitors (VMMs)
- Raw machine is the abstraction
- VM represented by a single image
- Examples: IBM's VM, VMware, Virtual PC/Server, Plex/86, SIMICS, Hypervisor, DesQView/TaskView, VM/386
13. VMware GSX VM
14. Isn't It Going to Be Too Slow?
Application                        Resource              ExecTime (10^3 s)  Overhead
SpecHPC Seismic (serial, medium)   Physical              16.4               N/A
SpecHPC Seismic (serial, medium)   VM, local             16.6               1.2%
SpecHPC Seismic (serial, medium)   VM, Grid virtual FS   16.8               2.0%
SpecHPC Climate (serial, medium)   Physical              9.31               N/A
SpecHPC Climate (serial, medium)   VM, local             9.68               4.0%
SpecHPC Climate (serial, medium)   VM, Grid virtual FS   9.70               4.2%
Small relative virtualization overhead; compute-intensive
Relative overheads < 5%
Experimental setup:
physical: dual Pentium III 933 MHz, 512 MB memory, RedHat 7.1, 30 GB disk
virtual: VMware Workstation 3.0a, 128 MB memory, 2 GB virtual disk, RedHat 2.0
NFS-based grid virtual file system between UFL (client) and NWU (server)
15. Isn't It Going To Be Too Slow?
Synthetic benchmark: exponential arrivals of compute-bound tasks, background load provided by playback of traces from PSC
Relative overheads < 10%
16. Isn't It Going To Be Too Slow?
- Virtualized NICs have very similar bandwidth, slightly higher latencies
- J. Sugerman, G. Venkitachalam, B-H Lim, Virtualizing I/O Devices on VMware Workstation's Hosted Virtual Machine Monitor, USENIX 2001
- Disk-intensive workloads (kernel build, web service): 30% slowdown
- S. King, G. Dunlap, P. Chen, OS support for Virtual Machines, USENIX 2003
- However: may not scale with faster NIC or disk
17. Won't Migration Be Too Slow?
- Appears daunting
- Memory + disk!
- Nonetheless
- Stanford Collective: 20 minutes at DSL speeds!
- Sapuntzakis, et al., OSDI 2002, very deep work
- Wide variety of techniques
- Intel/CMU ISR: 2.5-30 seconds from distributed file system at LAN speeds
- Our work: 2-400 seconds with rsync on LAN (Shoykhet)
- Current project: versioning file system (Cornell, Patel)
18. Virtuoso
- Approach: lower level of abstraction
- Raw machines connected to user's network
- Mechanism: virtual machine monitors
- Our focus: middleware support to hide complexity
- Ordering, instantiation, migration of machines
- Virtual networking and remote devices
- Connectivity to remote files, machines
- Information services
- Monitoring and prediction
- Resource control
19. The Virtuoso Model
- User orders raw machine(s)
- Specifies hardware and performance
- Basic software installation available
- OS, libraries, licenses, etc.
- Virtuoso creates raw image and returns reference
- Image contains disk, memory, configuration, etc.
- User powers up machine
- Virtuoso chooses provider
- Information service
- Virtuoso migrates image to provider
- Efficient network transfer
- rsync, demand paging, versioned filesystems
20. User Configuring a New VM
21. The Virtuoso Model
- Provider instantiates machine
- Virtual networking ties machine back to user's home network
- Remote device support makes user's desktop's devices available on remote VM
- Remote display support gives user the console of the machine (VNC)
- Resource control to give user expected performance
- User goes to his network admin to get address, routing for his new machine
- User customizes machine
- Feeds in CDs, floppies, ftp, up2date, etc.
22. VM Running with Browser Console Display
23. The Virtuoso Model
- User uses machine
- Shutdown, hibernate, power-off, throw away
- Virtuoso continuously monitors and adapts
- Virtual network as a monitoring platform
- Various mechanisms, all invisible to user
- Migrating the machine
- Routing traffic between machines
- Virtual network topology
- Predictive scheduling versus reservations
- Various goals
- Price
- Interactivity
- Direct User Feedback
R. Figueiredo, P. Dinda, J. Fortes, A Case For
Grid Computing on Virtual Machines, ICDCS 2003
24. Context
Virtualized Audio
Interactive HPC Exemplar Application
A Framework for Distributed Computing Using Virtual Machines
Virtuoso
User Comfort
Measuring Human Comfort
RTSA/Maestro
Achieving Human Comfort
Measuring, Inferring, and Predicting Dynamic Resource and Application Behavior
Clairvoyance
URGIS
Representing and Querying the Computing Environment as a Whole
26. Why Virtual Networking? (with Sundararaj)
- A machine is suddenly plugged into your network. What happens?
- Does it get an IP address?
- Is it a routeable address?
- Does firewall let its traffic through?
- To any port?
How do we make virtual machine hostile environments as friendly as the user's LAN?
27. A Layer 2 Virtual Network (VLAN) for the User's Virtual Machines
- Why Layer 2?
- Protocol agnostic
- Mobility
- Simple to understand
- Ubiquity of Ethernet on end-systems
- What about scaling?
- Number of VMs limited
- Hierarchical routing possible because MAC addresses can be assigned hierarchically
A. Sundararaj, P. Dinda, Towards Virtual Networks for Virtual Machine Grid Computing, USENIX VM 2004
28. A Simple Layer 2 Virtual Network
Client
Server
VM monitor
SSH
Remote VM
Virtual NIC
Physical NIC
Physical NIC
Hostile Remote Network
Friendly Local Network
30. A Simple Layer 2 Virtual Network
Client
Server
UDP, TCP, TCP/SSL, or SSH tunnel
VM monitor
vnetd
vnetd
Remote VM
Virtual NIC
Physical NIC
Physical NIC
Hostile Remote Network
Friendly Local Network
31. More Details
VM
Host Only Network
ethz
eth0
ethy
ethx
eth0
vmnet0
Client LAN
VNET
VNET
IP Network
Ethernet Packet Injected Directly into VM interface
Host
Proxy
Client
Ethernet Packet Tunneled over TCP/SSL Connection
Ethernet Packet Captured by Promiscuous Packet Filter
VNET 0.9 available from http://virtuoso.cs.northwestern.edu
32. Initial Performance Results (LAN)
Faster than NAT approach
Lots of room for improvement
This version you can download and use right now
33. An Overlay Network
- Vnetds and connections form an overlay network for routing traffic among virtual machines and the user's home network
- Links can be added or removed on demand
- Forwarding rules can be added or removed on demand
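A minimal sketch of how one overlay daemon might hold links and forwarding rules that change on demand. This is not VNET's actual interface: the class `OverlayNode`, its methods, the default-peer fallback, and the example MAC addresses are all hypothetical, chosen only to illustrate per-destination forwarding at Layer 2.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set

BROADCAST = "ff:ff:ff:ff:ff:ff"


@dataclass
class OverlayNode:
    """Hypothetical model of one overlay daemon (not the real vnetd API)."""
    name: str
    links: Set[str] = field(default_factory=set)        # connected peer names
    rules: Dict[str, str] = field(default_factory=dict)  # dest MAC -> peer name
    default_peer: Optional[str] = None                   # e.g. the hub of a star

    def add_link(self, peer: str) -> None:
        self.links.add(peer)

    def remove_link(self, peer: str) -> None:
        self.links.discard(peer)
        # Rules that pointed at the removed link are no longer usable.
        self.rules = {mac: p for mac, p in self.rules.items() if p != peer}
        if self.default_peer == peer:
            self.default_peer = None

    def add_rule(self, dest_mac: str, peer: str) -> None:
        if peer not in self.links:
            raise ValueError(f"no link to {peer}")
        self.rules[dest_mac.lower()] = peer

    def remove_rule(self, dest_mac: str) -> None:
        self.rules.pop(dest_mac.lower(), None)

    def next_hops(self, dest_mac: str) -> List[str]:
        """Peers an Ethernet frame for dest_mac should be sent to."""
        dest = dest_mac.lower()
        if dest == BROADCAST:
            return sorted(self.links)          # flood broadcasts on every link
        if dest in self.rules:
            return [self.rules[dest]]          # explicit forwarding rule
        return [self.default_peer] if self.default_peer else []
```

In this sketch a node falls back to its default peer when a rule or link is removed, which mirrors the idea that a star topology through the user's network is always available while better links come and go.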
34. Bootstrapping the Virtual Network
VM
Vnetd
- Star topology always possible
- Connecting from client must have been possible
- Better topology may be possible
- Depends on security at each site
- Topology may change
- Virtual machines can migrate
43. Application communication topology and traffic load; application processor load
VM Layer
- Vnetd layer can collect all this information as a side effect of packet transfers
- and invisibly act
- VM Migration
- Topology change
- Routing change
- Reservation
Vnetd Layer
Network bandwidth and latency; sometimes topology
Physical Layer
45. Application Traffic Load Measurement and Topology Inference (With Gupta)
- Parallel and distributed applications display particular communication patterns on particular topologies
- Intensity of communication can also vary from node to node or time to time.
- Combined representation: Traffic Load Matrix
- VNET already sees every packet sent or received by a VM
- Can we use this information to compute a global traffic load matrix?
- Can we eliminate irrelevant communication from matrix to get at application topology?
46. Overall Steps
- Low level inter-VM traffic monitoring within VNET
- Compute rows and columns of traffic matrix for local VMs
- Reduction to a global inter-VM traffic load matrix
- Matrix denoising to determine application topology
- Offline to online
47. Traffic Monitoring and Reduction
VM
Host Only Network
Ethernet packet format: SRC | DEST | TYPE | DATA (size)
VMTrafficMatrix[SRC][DEST] += size
Each VM on the host contributes a row and column to the VM traffic matrix
Global reduction to find overall matrix, broadcast back to VNETs
Each VNET daemon has a view of the global network load
ethz
eth0
vmnet0
VNET
Host
Packets observed here
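The monitoring and reduction steps on this slide can be sketched as follows. This is an illustration under stated assumptions, not the VNET implementation: the function names are hypothetical, frames are assumed to be raw Ethernet II frames without a VLAN tag, and the global reduction is shown as a simple element-wise sum of the local matrices.

```python
import struct
from collections import defaultdict


def parse_ethernet(frame: bytes):
    """Return (src, dest, ethertype, data_size) for a raw Ethernet II frame.

    On the wire the destination address comes first, then the source.
    """
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    dest, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return src.hex(":"), dest.hex(":"), ethertype, len(frame) - 14


def observe(matrix, frame: bytes) -> None:
    """Local step: add the frame's data size to matrix[(src, dest)]."""
    src, dest, _ethertype, size = parse_ethernet(frame)
    matrix[(src, dest)] += size


def reduce_global(local_matrices):
    """Global step: sum the per-host matrices into one traffic load matrix."""
    total = defaultdict(int)
    for local in local_matrices:
        for pair, size in local.items():
            total[pair] += size
    return dict(total)
```

Each host would keep one such local matrix for the VMs it runs; the summed result is what would then be broadcast back so that every daemon sees the global load.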
48. Denoising The Matrix
- Throw away irrelevant communication
- ARPs, DNS, ssh, etc.
- Find maximum entry, a
- Eliminate all entries below alpha * a
- Very simple, but seems to work very well for BSP parallel applications
- Remains to be seen how general it is
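The thresholding rule above is small enough to state directly in code. The sketch below assumes the traffic matrix is a dictionary keyed by (source, destination) pairs; the default `alpha = 0.1` is only a placeholder, since the slides do not give a value.

```python
def denoise(matrix, alpha=0.1):
    """Keep only entries that are at least alpha times the maximum entry.

    matrix maps (src, dest) -> bytes transferred. alpha is a tunable
    fraction; the default here is an assumed placeholder value.
    """
    if not matrix:
        return {}
    a = max(matrix.values())      # maximum entry, "a" on the slide
    cutoff = alpha * a
    return {pair: size for pair, size in matrix.items() if size >= cutoff}


def topology(matrix, alpha=0.1):
    """The inferred application topology: the edges that survive denoising."""
    return sorted(denoise(matrix, alpha))
```

For example, with entries of 1000 and 900 bytes between two application VMs and a 5 byte DNS exchange, `alpha = 0.1` gives a cutoff of 100 and only the two application edges remain.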
49. Offline Results: Synthetic Benchmark
50. NAS IS Benchmark
51. NAS IS Benchmark
52. Online Challenges
- When to start? When to stop?
- Traffic matrix may not be stationary!
- Synchronized monitoring
- All must start and stop together
53. When To Start? When to Stop?
Reactive Mechanisms
Proactive Mechanisms
Start when traffic rate exceeds threshold
Stop when traffic rate exceeds a second threshold
Non-uniform discrete event sampling
Provide support for queries by external agent
Keep multiple copies of the matrix, one for each resolution (1s, 2s, 4s, etc.)
What is the Traffic Matrix from the last time there was at least one high rate source?
What is the Traffic Matrix for the last n seconds?
54. Overheads (100 Mbit LAN)
- Essentially zero latency impact
- 4.2% throughput reduction versus VNET
A. Gupta, P. Dinda, Inferring the Topology and Traffic Load of Parallel Programs Running In a Virtual Machine Environment, In Submission.
55. Online: NAS IS on 4 VMs
57. Why Understand User Comfort With Resource Borrowing? (With Gupta, Lin)
- Provider supports both interactive and batch VMs
- Provider controls resources
- WFQ (Ensim)
- Priority (our nascent work)
- Periodic real-time schedule (our plans)
- How to use control to provide good interactive performance cheaply?
58. Why Understand User Comfort With Resource Borrowing?
- Interactive user specifies peak resource demand for his VM
- What level of resource borrowing is he willing to tolerate?
- Similar question in SETI@home style distributed parallel computing
59. Understanding User Comfort System
- Windows-based distributed system for measuring user comfort with resource borrowing
- Borrowing: degree of contention
- CPU bandwidth
- Disk bandwidth
- Memory pages
- 1.0 contention for CPU: user's tasks run half as fast
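One way to read the CPU contention figure is as the number of equally weighted competing compute-bound tasks, so that a user's task gets a 1 / (1 + contention) share of the processor. That model is an assumption made here for illustration; the slide confirms only the single data point that a contention of 1.0 halves the speed. The function name is hypothetical.

```python
def relative_speed(contention: float) -> float:
    """Fraction of full speed under the assumed fair-share model.

    Assumes contention c acts like c equal-priority competing tasks,
    giving the user's task a 1 / (1 + c) share of the CPU.
    """
    if contention < 0:
        raise ValueError("contention must be non-negative")
    return 1.0 / (1.0 + contention)
```

Under this reading, no contention leaves the task at full speed and a contention of 1.0 gives 0.5, matching the slide.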
60. Understanding User Comfort System
Local Result Store
Global Result Store
Registration (Machine info)
Hot Sync (Result Post)
Hot Sync (Testcase Request)
Hot Sync (Testcases)
Local Testcase Store
Global Testcase Store
Client
Server
http://comfort.cs.northwestern.edu
61. Controlled Study
- 30 people, 90 minutes each
- 4 application tasks
- Word, Powerpoint, IE, Quake
- Ramp, step, and blank testcases
- {CPU, Disk, Memory} x {Step, Ramp} + 2 blanks
62. A. Gupta, B. Lin, P. Dinda, Measuring and Understanding User Comfort with Resource Borrowing, HPDC 2004.
63. Insights
- Users surprisingly tolerant, particularly for disk and memory borrowing
- Context is critical, user self-classification much less so
- Frog-in-the-pot only occasionally true
64. Using User Feedback Directly
- Discomfort feedback as congestion indication a la TCP Reno
- Rate -> VM priority
- Adaptive gain control for congestion avoidance phase
- Target: maintain stable time between feedback events
- Somewhat promising, but very initial results
B. Lin, P. Dinda, D. Lu, User-driven Scheduling Of Interactive Virtual Machines, In Submission.
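To make the TCP Reno analogy concrete, here is a minimal additive-increase, multiplicative-decrease sketch in which a discomfort event plays the role of a congestion signal. It is not the scheduler from the cited work: the class name, the notion of a single borrowing level in [0, 1], and the step and backoff constants are assumptions, and the adaptive gain control mentioned on the slide is left out.

```python
class BorrowController:
    """Hypothetical AIMD controller driven by user discomfort feedback."""

    def __init__(self, level=0.0, step=0.05, backoff=0.5, max_level=1.0):
        self.level = level          # how aggressively resources are borrowed
        self.step = step            # additive increase per quiet interval
        self.backoff = backoff      # multiplicative decrease on discomfort
        self.max_level = max_level

    def tick(self, discomfort: bool) -> float:
        """Advance one control interval and return the new borrowing level."""
        if discomfort:
            # Discomfort is treated like a loss event: back off sharply.
            self.level *= self.backoff
        else:
            # No complaint: probe for more borrowing, as in congestion avoidance.
            self.level = min(self.max_level, self.level + self.step)
        return self.level
```

The returned level would then be mapped to a VM priority; how that mapping is done is outside what the slides specify.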
66. Related Work
- Collective / Capsule Computing (Stanford)
- VMM, Migration/caching, Hierarchical image files
- Denali (U. Washington)
- Highly scalable VMMs (1000s of VMMs per node)
- CoVirt (U. Michigan)
- Xenoserver (Cambridge)
- SODA (Purdue)
- Virtual Server, fast deployment of services
- Internet Suspend/Resume (Intel Labs Pittsburgh / CMU)
- Ensim
- Virtual Server, widely used for web site hosting
- WFQ-based resource control released into open-source Linux kernel
- Virtuozzo (SWSoft)
- Ensim competitor
- Available VMMs: IBM's VM, VMware, Virtual PC/Server, Plex/86, SIMICS, Hypervisor, DesQView/TaskView, VM/386
67. Conclusions and Status
- Virtual machines on virtual networks as the abstraction for distributed computing
- Virtual network as a fundamental layer for measurement and adaptation
- Virtuoso prototype running on our cluster
- 1st generation VNET released. 2nd generation in progress, versioning file system released
68. For More Information
- Prescience Lab
- http://plab.cs.northwestern.edu
- Virtuoso
- http://virtuoso.cs.northwestern.edu
- Join our user comfort study!
- http://comfort.cs.northwestern.edu
69. Papers
- R. Figueiredo, P. Dinda, J. Fortes, A Case For Grid Computing on Virtual Machines, ICDCS 2003.
- A. Gupta, B. Lin, P. Dinda, Understanding User Comfort With Resource Borrowing, HPDC 2004.
- A. Sundararaj, P. Dinda, Towards Virtual Networks for Virtual Machine Grid Computing, USENIX VM 2004.
- B. Cornell, P. Dinda, F. Bustamante, Wayback: A User-level Versioning File System For Linux, USENIX 2004.
- A. Sundararaj, P. Dinda, Exploring Inference-based Monitoring of Virtual Machine Resources, In Submission.
- A. Gupta, P. Dinda, Inferring the Topology and Traffic Load of Parallel Programs Running In a Virtual Machine Environment, In Submission.
- B. Lin, P. Dinda, User-driven Scheduling of Interactive Virtual Machines, In Submission.
70. Migration (With Shoykhet)
71. Migration (With Shoykhet)
72. Resource Control
- Owner has an interest in controlling how much and when compute time is given to a virtual machine
- Our approach: a language for expressing these constraints, and compilation to real-time schedules, proportional share, etc.
- Very early stages. Trying to avoid kernel modifications.
73. FrontPage
74. Provider Registering A Machine
75. Provider Machine List
76. User Configuring a New VM
77. Options For Registered VM
78. Registered VM Configuration
79. User Selects Physical Machine
80. VM Running with Browser Console Display
81. Options for a Suspended Machine
82. Choosing A Machine To Migrate To
83. Specifics of This Talk
Virtualized Audio
- Virtuoso Overview
- Virtual Networking
- Application Traffic Characterization and Topology Inference
- Understanding User Comfort With Resource Borrowing
- User-centric Resource Control
Virtuoso
User Comfort
RTSA/Maestro
Clairvoyance
URGIS