Xen and the Art of Virtualization (SOSP 2003)

Transcript and Presenter's Notes

1
Xen and the Art of Virtualization (SOSP 2003)
  • Paul Barham, Boris Dragovic, Keir Fraser, Steven
    Hand, Tim Harris, Alex Ho, Rolf Neugebauer, Ian
    Pratt, Andrew Warfield
  • University of Cambridge Computer Laboratory

2
Virtualization Overview
  • Single OS image: Virtuozzo, VServers, Zones
    • Group user processes into resource containers
    • Hard to get strong isolation
  • Full virtualization: VMware, VirtualPC, QEMU
    • Run multiple unmodified guest OSes
    • Hard to efficiently virtualize x86
  • Para-virtualization: UML, Xen
    • Run multiple guest OSes ported to a special arch
    • The Xen/x86 arch is very close to normal x86

3
X86 CPU Virtualization
  • Xen runs in ring 0 (most privileged)
  • Rings 1/2 for the guest OS, ring 3 for user-space
  • GPF if the guest attempts to use a privileged instruction
  • Xen lives in the top 64MB of the linear address space
  • Segmentation used to protect Xen, as switching page
    tables is too slow on standard x86
  • Hypercalls jump to Xen in ring 0 (see the sketch below)
  • Guest OS may install a fast trap handler
    • Direct user-space to guest-OS system calls
  • MMU virtualization: shadow vs. direct mode
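
A minimal sketch of what a hypercall boils down to for a 32-bit
paravirtualized guest, assuming the classic Xen/x86 convention (software
interrupt 0x82, hypercall number in EAX, arguments in EBX/ECX/EDX); the
wrapper below is illustrative, not copied from the real Xen headers:

    /* Illustrative hypercall wrapper for a 32-bit paravirtualized guest.
     * Assumes the classic Xen/x86 convention: trap via int 0x82, hypercall
     * number in EAX, up to three arguments in EBX/ECX/EDX, result in EAX.
     * Not the real Xen public headers. */
    static inline long hypercall3(long nr, long a1, long a2, long a3)
    {
        long ret;
        __asm__ __volatile__ (
            "int $0x82"            /* deliberate trap into Xen (ring 0) */
            : "=a" (ret)           /* result returned in EAX */
            : "a" (nr), "b" (a1), "c" (a2), "d" (a3)
            : "memory");
        return ret;
    }

The point of the explicit trap is that the guest asks Xen for privileged
work up front, instead of relying on the hypervisor to catch and emulate
faulting instructions.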

4
MMU Virtualization
  • Critical for performance, challenging to make
    fast, especially SMP
  • Xen supports 3 MMU virtualization modes:
    1. Direct page tables
    2. Shadow page tables
    3. Hardware-assisted paging
  • OS paravirtualization is compulsory for 1, optional
    (and very beneficial) for 2 & 3

5
MMU Virtualization Direct-Mode
6
MMU Virtualization Shadow-Mode
7
Para-Virtualizing the MMU
  • Guest OSes allocate and manage their own PTs
    • Hypercall to change the PT base
  • Xen must validate PT updates before use
    • Allows incremental updates; avoids revalidation
  • Validation rules applied to each PTE:
    • Guest may only map pages it owns
    • Page-table pages may only be mapped read-only
  • Xen traps PTE updates and emulates them, or unhooks the
    PTE page for bulk updates (sketched below)
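
One way a paravirtualized guest can amortize this cost is to queue PTE
writes and submit them in a single validated batch, modeled loosely on
Xen's batched MMU-update hypercall; the struct layout, queue size and the
xen_mmu_update() wrapper here are illustrative assumptions, not the real
interface:

    /* Sketch of batching page-table updates so the hypervisor validates
     * them in one trap rather than trapping on every PTE write. */
    #include <stdint.h>
    #include <stddef.h>

    struct mmu_req {
        uint64_t ptr;   /* machine address of the PTE to modify */
        uint64_t val;   /* new PTE contents (validated by the hypervisor) */
    };

    #define MMU_QUEUE_LEN 128
    static struct mmu_req queue[MMU_QUEUE_LEN];
    static size_t queued;

    /* Placeholder for the real hypercall wrapper. */
    extern long xen_mmu_update(struct mmu_req *reqs, size_t count);

    /* Flush all pending updates in a single hypercall. */
    static void pte_flush_queue(void)
    {
        if (queued) {
            xen_mmu_update(queue, queued);  /* Xen validates each entry */
            queued = 0;
        }
    }

    /* Queue one PTE update; flush when the batch is full. */
    static void pte_set(uint64_t pte_maddr, uint64_t new_val)
    {
        queue[queued].ptr = pte_maddr;
        queue[queued].val = new_val;
        if (++queued == MMU_QUEUE_LEN)
            pte_flush_queue();
    }

Batching matters because each trap into Xen has a fixed cost; queueing
updates keeps PT-heavy operations such as fork and exec fast.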

8
MMU Micro-Benchmarks
[Bar chart: Page fault (µs) and Process fork (µs), relative scale 0.0-1.1]
lmbench results on Linux (L), Xen (X), VMware Workstation (V), and UML (U)
9
I/O Architecture
  • Xen IO-Spaces delegate protected access to specified
    h/w devices to guest OSes
    • Virtual PCI configuration space
    • Virtual interrupts
  • Devices are virtualised and exported to other VMs
    via Device Channels
    • Safe asynchronous shared-memory transport (sketched below)
    • Backend drivers export to frontend drivers
    • Net: use normal bridging, routing, iptables
    • Block: export any block device, e.g. sda4, loop0, vg3
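
A minimal single-producer/single-consumer sketch of the kind of
shared-memory ring a device channel is built on; the struct and field
names are made up for illustration (not the real Xen ring macros), and the
event-channel notification that would wake the other side is omitted:

    /* Frontend produces requests, backend consumes them; both sides only
     * share a page containing the ring and two index counters. */
    #include <stdint.h>
    #include <stdbool.h>

    #define RING_SIZE 64                     /* must be a power of two */

    struct request { uint64_t sector; uint32_t op; uint32_t gref; };

    struct ring {
        volatile uint32_t req_prod;          /* written by frontend */
        volatile uint32_t req_cons;          /* written by backend  */
        struct request reqs[RING_SIZE];      /* lives in a shared page */
    };

    /* Frontend: enqueue a request if there is space. */
    static bool ring_put(struct ring *r, struct request req)
    {
        if (r->req_prod - r->req_cons == RING_SIZE)
            return false;                    /* ring full */
        r->reqs[r->req_prod % RING_SIZE] = req;
        __sync_synchronize();                /* publish data before index */
        r->req_prod++;
        return true;
    }

    /* Backend: dequeue the next request if one is pending. */
    static bool ring_get(struct ring *r, struct request *out)
    {
        if (r->req_cons == r->req_prod)
            return false;                    /* ring empty */
        __sync_synchronize();                /* read index before data */
        *out = r->reqs[r->req_cons % RING_SIZE];
        r->req_cons++;
        return true;
    }

Keeping the transport to shared pages plus index counters is what makes it
asynchronous and safe to expose between domains.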

10
Device Channel Interface
11
System Performance
[Bar chart: SPEC INT2000 (score), Linux build time (s), OSDB-OLTP (tup/s),
SPEC WEB99 (score), relative scale 0.0-1.1]
Benchmark suite running on Linux (L), Xen (X), VMware Workstation (V), and UML (U)
12
TCP results
[Bar chart: Tx and Rx bandwidth at MTU 1500 and MTU 500 (Mbps), relative scale 0.0-1.1]
TCP bandwidth on Linux (L), Xen (X), VMware Workstation (V), and UML (U)
13
Xen 3.0 Architecture
[Architecture diagram: VM0 runs Device Manager and Control s/w on GuestOS
(XenLinux) with native device drivers and back-end drivers; VM1 and VM2 run
unmodified user software on GuestOS (XenLinux) with front-end device drivers;
VM3 runs unmodified user software on an unmodified GuestOS (WinXP), enabled
by VT-x, with front-end device drivers. The Xen Virtual Machine Monitor
provides Virtual CPU, Virtual MMU, Event Channel, Control IF and Safe HW IF;
it supports SMP, AGP/ACPI/PCI and 32/64-bit. Hardware: SMP, MMU, physical
memory, Ethernet, SCSI/IDE.]
14
Live Migration of Virtual Machines
  • Christopher Clark, Keir Fraser, Steven Hand,
    Jacob Gorm Hansen, Eric Jul, Christian Limpach,
    Ian Pratt, Andrew Warfield
  • University of Cambridge Computer Laboratory

15
Motivation
  • VM relocation enables
    • High availability
    • Machine maintenance
    • Load balancing
    • Statistical multiplexing gain

16
Assumptions
  • Networked storage
    • NAS: NFS, CIFS
    • SAN: Fibre Channel
    • iSCSI, network block device
    • DRBD (network RAID)
  • Good connectivity
    • Common L2 network
    • L3 re-routing

17
Strategy
18
Strategy 2
Stage 0 (pre-migration): VM active on host A; destination host
  selected (block devices mirrored)
Stage 1 (reservation): initialize container on target host
Stage 2 (iterative pre-copy): copy dirty pages in successive rounds
Stage 3 (stop-and-copy): suspend VM on host A; redirect network
  traffic; synch remaining state
Stage 4 (commitment): activate on host B; VM state on host A released
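
The stages above reduce to a simple control loop: keep pre-copying while
the dirty set shrinks, then stop and copy the remainder. A sketch under the
assumption of placeholder helpers (enable_dirty_logging, get_and_reset_dirty,
send_page, suspend_vm, send_cpu_and_device_state) that stand in for the real
Xen control interface:

    #include <stddef.h>

    extern void   enable_dirty_logging(void);
    extern size_t get_and_reset_dirty(unsigned long *pfns, size_t max);
    extern void   send_page(unsigned long pfn);
    extern void   suspend_vm(void);
    extern void   send_cpu_and_device_state(void);

    #define MAX_ROUNDS      30      /* give up iterating eventually */
    #define STOP_THRESHOLD  64      /* pages: small enough to stop-and-copy */

    void migrate(unsigned long *all_pfns, size_t nr_pages,
                 unsigned long *dirty, size_t dirty_cap)
    {
        enable_dirty_logging();

        /* Round 1: push the whole memory image while the VM keeps running. */
        for (size_t i = 0; i < nr_pages; i++)
            send_page(all_pfns[i]);

        /* Stage 2: resend only what was dirtied during the previous round. */
        size_t n = get_and_reset_dirty(dirty, dirty_cap);
        for (int round = 0; round < MAX_ROUNDS && n > STOP_THRESHOLD; round++) {
            for (size_t i = 0; i < n; i++)
                send_page(dirty[i]);
            n = get_and_reset_dirty(dirty, dirty_cap);
        }

        /* Stage 3: stop-and-copy the (hopefully small) remainder. */
        suspend_vm();
        n = get_and_reset_dirty(dirty, dirty_cap);
        for (size_t i = 0; i < n; i++)
            send_page(dirty[i]);
        send_cpu_and_device_state();
    }

The MAX_ROUNDS and STOP_THRESHOLD cut-offs are illustrative; the real
trade-off is between total migration time (more rounds) and down-time
(a larger final copy).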
19
Pre-Copy Migration Round 1 (diagram, animated over slides 19-23)
24
Pre-Copy Migration Round 2 (diagram, animated over slides 24-28)
29
Pre-Copy Migration Final (diagram)
30
Writable Working Set
  • Set of pages written to by the OS/application
  • Pages that are dirtied must be re-sent (see the bitmap sketch below)
  • Hot pages
    • E.g. process stacks
    • Top of the free page list (works like a stack)
    • Buffer cache
    • Network receive / disk buffers
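
A minimal sketch of the dirty-page bookkeeping that identifies the writable
working set each round; the bitmap layout and the mark_dirty/harvest_dirty
names are illustrative assumptions, not Xen's actual log-dirty interface:

    #include <stdint.h>
    #include <stddef.h>
    #include <string.h>

    #define NR_PAGES  (1u << 18)             /* e.g. 1 GiB of 4 KiB pages */
    static uint64_t dirty_bitmap[NR_PAGES / 64];

    /* Called from the write-fault path: record that pfn was written. */
    static void mark_dirty(unsigned long pfn)
    {
        dirty_bitmap[pfn / 64] |= 1ull << (pfn % 64);
    }

    /* Harvest and clear the bitmap at the end of a round; returns the
     * number of pages to re-send (the writable working set). */
    static size_t harvest_dirty(unsigned long *out, size_t max)
    {
        size_t n = 0;
        for (unsigned long pfn = 0; pfn < NR_PAGES && n < max; pfn++) {
            if (dirty_bitmap[pfn / 64] & (1ull << (pfn % 64)))
                out[n++] = pfn;
        }
        memset(dirty_bitmap, 0, sizeof dirty_bitmap);
        return n;
    }

The pages harvested here are exactly the ones the pre-copy loop has to
send again in the next round.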

31
Page Dirtying Rate
  • Dirtying rate determines VM down-time (rough estimate below)
  • Shorter iterations → less dirtying → shorter iterations
  • Stop and copy the final pages
  • Application phase changes create spikes
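
A back-of-the-envelope sketch of why the dirtying rate matters, using
made-up numbers rather than measurements from the paper: whatever is still
dirty at suspension has to cross the wire during the down-time window.

    #include <stdio.h>

    int main(void)
    {
        double page_size   = 4096.0;          /* bytes */
        double dirty_pages = 5000.0;          /* dirty set at suspension */
        double link_bw     = 1e9 / 8.0;       /* 1 Gbit/s in bytes/s */

        double downtime = dirty_pages * page_size / link_bw;
        printf("stop-and-copy downtime ~= %.0f ms\n", downtime * 1000.0);
        /* ~164 ms here; a hotter writable working set or a slower link
         * pushes this up, which is why the dirtying rate matters. */
        return 0;
    }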

32
Thanks / The End