Virtual Machines Background - PowerPoint PPT Presentation

Slides: 70
Provided by: Ken667

Transcript and Presenter's Notes


1
Virtual Machines Background
  • Adapted from Silberschatz

2
Virtual Machines
  • A virtual machine takes the layered approach to
    its logical conclusion. It treats hardware and
    the operating system kernel as though they were
    all hardware.
  • A virtual machine provides an interface identical
    to the underlying bare hardware.
  • For example, the operating system creates the
    illusion of multiple processes, each executing on
    its own processor with its own (virtual) memory.

3
Virtual Machines (Cont.)
  • The resources of the physical computer are shared
    to create the virtual machines.
  • CPU scheduling can create the appearance that
    users have their own processor.
  • Spooling and a file system can provide virtual
    card readers and virtual line printers.
  • A normal user time-sharing terminal serves as the
    virtual machine operator's console.

4
System Models
  • [Figure: non-virtual machine vs. virtual machine]

5
Advantages/Disadvantages of Virtual Machines
  • The virtual-machine concept provides complete
    protection of system resources since each virtual
    machine is isolated from all other virtual
    machines. What might be bad about this?
  • This isolation, however, permits no direct
    sharing of resources.
  • A virtual-machine system is a perfect vehicle for
    operating-systems research and development.
    System development is done on the virtual
    machine, instead of on a physical machine and so
    does not disrupt normal system operation.
  • The virtual machine concept is difficult to
    implement due to the effort required to provide
    an exact duplicate of the underlying machine.

6
Java Virtual Machine
  • Compiled Java programs are platform-neutral
    bytecodes executed by a Java Virtual Machine
    (JVM).
  • JVM consists of
      - class loader
      - class verifier
      - runtime interpreter
  • Just-In-Time (JIT) compilers increase performance
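
As a rough illustration of what a runtime interpreter does with bytecodes, here is a minimal stack-machine sketch. The opcode names (PUSH, ADD, MUL) and the function `run` are invented for this example; they are not real JVM bytecodes, and class loading and verification are left out.

```python
# Toy stack-machine interpreter. Opcodes are hypothetical, not real JVM bytecodes.
def run(program):
    """Execute a list of (opcode, operand) pairs and return the top of the stack."""
    stack = []
    for op, arg in program:
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack.pop()

# Encodes (2 + 3) * 4 for the toy machine.
program = [("PUSH", 2), ("PUSH", 3), ("ADD", None), ("PUSH", 4), ("MUL", None)]
result = run(program)
```

The same program list could run unchanged on any host that has an implementation of `run`, which is the platform-neutrality point of the slide.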

7
Java Virtual Machine
8
An Overview of Virtual Machine Architectures
  • Smith and Nair

9
Definitions
  • Instruction Set Architecture (ISA)
  • Precise specification of the interface between
    hardware and software
  • Application Binary Interface (ABI)
  • Defines how an application can work with a
    platform at the binary level. (Contrast with
    API.)
  • Includes user ISA, system call interface, etc.
  • Suppose an ABI is changed.
  • Recompile?
  • Source changes?

10
Virtualization
  • [Figure: layered diagram. Labels: Application, OS,
    Virtual ISA, VMM, Virtual Machine, ISA, Hardware,
    Guest, Host.]
  • VMM also known as hypervisor.

11
Virtual Machine Uses
  • Emulation
  • One ISA can be used to emulate another.
  • Provides cross-platform portability.
  • Optimization
  • Emulators can optimize as they emulate.
  • Also can optimize same ISA to same ISA.
  • Replication
  • A single physical machine can be replicated,
    providing isolation between the VMs.
  • Composition
  • Two virtual machines can be composed, combining
    the functionality of each.

12
Process vs. System
  • Meaning of machine depends on perspective.
  • To a process, the machine is the system calls,
    libraries, etc.
  • Already abstract.
  • The entire system also runs on a machine.
  • Includes ISA, actual devices, etc.
  • Other kinds of machines?
  • As there are two perspectives, there are two
    kinds of virtual machines: process and system.
  • Process virtual machine can support an individual
    process.
  • System virtual machine can run a complete OS plus
    environment.

13
Process vs. System
  • [Figure: process VM and system VM stacks. Labels:
    Native App, W32 App, Java Prog, Windows, Java VM,
    VMM, Linux, x86.]
  • Examples?

14
Process VMs
  • Multiprogramming
  • A process has the illusion of having the whole
    machine to itself.
  • Emulation
  • Interpreted. (Define.)
  • Translated. (Define.)
  • What are relative merits?
  • Dynamic optimizers
  • Especially useful with some kind of
    profile-directed translation.
  • High Level Language VMs
  • High-level language is compiled to an
    intermediate language.
  • VM then runs the intermediate language.
  • Example is Java. Interpreted or translated?
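
To make the interpreted vs. translated distinction concrete, the sketch below runs the same guest block both ways. The two-instruction guest "ISA" (ADDI, MULI) and the names `interpret`, `translate`, and `translation_cache` are assumptions made up for this example. The interpreter decodes every instruction on every run; the translator generates host code once and reuses it from a cache.

```python
# Hypothetical guest ISA: ("ADDI", n) adds n to an accumulator, ("MULI", n) multiplies it.
def interpret(block, acc):
    """Decode and execute each guest instruction every time the block runs."""
    for op, n in block:
        if op == "ADDI":
            acc += n
        elif op == "MULI":
            acc *= n
        else:
            raise ValueError(op)
    return acc

translation_cache = {}

def translate(block):
    """Translate a guest block once into a host (Python) function and cache it."""
    key = tuple(block)
    if key not in translation_cache:
        lines = ["def translated(acc):"]
        for op, n in block:
            if op == "ADDI":
                lines.append(f"    acc += {n!r}")
            elif op == "MULI":
                lines.append(f"    acc *= {n!r}")
            else:
                raise ValueError(op)
        lines.append("    return acc")
        namespace = {}
        exec("\n".join(lines), namespace)   # "code generation" for the host
        translation_cache[key] = namespace["translated"]
    return translation_cache[key]

block = [("ADDI", 5), ("MULI", 3), ("ADDI", 1)]
fast = translate(block)
```

Translation pays a one-time cost and then avoids per-instruction decoding, which is why it tends to win for blocks that execute many times.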

15
System VMs
  • Same ISA
      - Classic (Define. Pros/cons?)
          - VMM built directly on top of hardware.
          - Most efficient, but requires wiping the slate
            clean.
          - Requires device drivers in the VMM.
      - Hosted (Define. Pros/cons?)
          - VMM built on top of existing OS.
          - Most convenient.
          - Device drivers supplied by host OS; VMM uses
            facilities provided by host OS.
  • Different ISA
      - Whole System VMs: Emulation
          - ISA not the same, must emulate everything.
      - Co-Designed VMs: Optimization
          - Hardware designed to support VMs.
          - Provides a clean design for virtualization.
          - Can be significantly more efficient.

16
Virtualization
  • The state of a machine must be maintained.
  • Physical machine: latches, flip-flops, etc.
  • Virtual machine: combination of physical machine
    and state emulated in software using RAM, etc.
  • At certain points in execution, such as a trap,
    the state of the machine must be materialized.
  • Not trivial due to complex hardware techniques
    used to provide high performance.
  • This ability to materialize the state is termed
    preciseness.
  • Three aspects of virtualization
      - State: registers and memory
      - Instructions: may involve emulation
      - State materialization: when exceptions occur

17
Process VMs Virtualization
  • Multiprogramming
      - State: mapped 1:1
      - Instructions: native
      - State materialization: provided by hardware
  • Dynamic translation
      - State: registers mapped to host registers as
        available (overflow to memory); memory mapped to
        host memory
      - Instructions: emulated
      - State materialization: provided by VM software
  • HLL VMs
      - State: mapped to host resources as available
      - Instructions: emulated, JIT compiled

18
System VMs Virtualization
  • Classic VMs
      - State: mapped 1:1, except for privileged registers
      - Instructions: native, except trapping for
        privileged instructions
      - State materialization: provided by hardware
  • Whole System VMs
      - State: mapped to available memory, not 1:1
      - Instructions: emulated
      - State materialization: provided by VM software
  • Co-Designed VMs
      - State: mapped 1:1
      - Instructions: block-level translated
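
The classic-VM row (native execution, except trapping for privileged instructions) can be sketched as a trap-and-emulate loop. This is a toy model under stated assumptions: the instruction names (ADD, SET_PT_BASE, DISABLE_INTR), the exception `PrivilegedTrap`, and the class `VMM` are all hypothetical, and a Python exception stands in for the hardware trap.

```python
# Toy model: unprivileged ops run "natively"; privileged ops trap to the VMM.
class PrivilegedTrap(Exception):
    def __init__(self, op, arg):
        super().__init__(op)
        self.op, self.arg = op, arg

PRIVILEGED = {"SET_PT_BASE", "DISABLE_INTR"}   # hypothetical privileged ops

def execute_deprivileged(instr, regs):
    """Run one guest instruction in user mode; privileged ones raise a trap."""
    op, arg = instr
    if op in PRIVILEGED:
        raise PrivilegedTrap(op, arg)
    if op == "ADD":
        regs["r0"] += arg
    else:
        raise ValueError(op)

class VMM:
    def __init__(self):
        # Virtual copies of the privileged state, held by the VMM for the guest.
        self.virtual = {"pt_base": 0, "intr_enabled": True}
        self.traps = 0

    def emulate(self, trap):
        """Apply the privileged instruction to the virtual state only."""
        self.traps += 1
        if trap.op == "SET_PT_BASE":
            self.virtual["pt_base"] = trap.arg
        elif trap.op == "DISABLE_INTR":
            self.virtual["intr_enabled"] = False

    def run(self, guest_code):
        regs = {"r0": 0}
        for instr in guest_code:
            try:
                execute_deprivileged(instr, regs)
            except PrivilegedTrap as trap:
                self.emulate(trap)
        return regs

vmm = VMM()
regs = vmm.run([("ADD", 2), ("SET_PT_BASE", 0x1000), ("ADD", 3), ("DISABLE_INTR", None)])
```

The scheme depends on every privileged instruction actually trapping when run deprivileged; an instruction that fails silently instead would never reach `emulate`.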

19
Taxonomy
  • Process
      - Same ISA: multiprogramming, dynamic optimization
      - Different ISA: dynamic translators, HLL VMs
  • System
      - Same ISA: classic OS VMs (IBM), hosted VMs
      - Different ISA: whole-system VMs, co-designed VMs

20
Key Ideas
  • VMs can support an individual process only, or
    can support a whole OS.
  • Can construct a useful taxonomy based on
  • process or system
  • same ISA or different ISA

21
Virtualizing I/O Devices on VMware Workstation's
Hosted VMM
22
Virtualizing the PC Platform
  • Several hurdles
  • Non-virtualizable processor
  • Some privileged instructions fail silently. (Why
    is this a problem?) (What's the solution?)
  • PC hardware diversity
  • Why is this problematic for a classic VM?
  • Pre-existing PC software
  • Must stay compatible
  • To address these, VMware uses a hosted VM. (Not a
    classic VM.)

23
Two Worlds
  • VMApp runs in the host, using the VMDriver host
    kernel component to establish the VMM.
  • CPU is thus executing in either the host world or
    the virtual world, using VMDriver to switch
    worlds.
  • World switches are expensive, since user and
    system state must be switched.

24
Architecture
  • [Figure: architecture diagram. Labels: VMApp, Host
    Kernel, VMDriver, VMNet.]

25
Virtualizing the NIC
  • I/O port operations by guest OS must be
    intercepted by VMM.
  • Must then be processed in the VMM (to maintain
    the virtual state).
  • Or executed in the host world. (When must it do
    what?)
  • Send operations start as a sequence of ops to
    virtual I/O ports.
  • Upon finalization of the send, the VMApp issues a
    host OS syscall to the VMNet driver, which passes
    it on to the real NIC.
  • Finally requires raising a virtual IRQ to signal
    completion.
  • Receive operations operate in reverse.
  • VMApp executes select() syscall on possible
    sources.
  • Reads packet, forwards it to VMM which raises a
    virtual IRQ.

26
Details
  • Send
      - Guest OS out to I/O port
      - Trap to VMDriver
      - Pass to VMApp
      - Syscall to VMNet
      - Pass to actual NIC driver
  • Receive
      - Hardware IRQ
      - Actual NIC delivers to VMNet driver
      - VMNet driver causes VMApp to return from select()
      - VMApp copies packet to VM memory
      - VMApp asks VMM to raise virtual IRQ
      - Guest OS performs port operations to read data
      - Trap to VMDriver
      - VMApp returns from ioctl() to raise IRQ

27
Reducing Network Virtualization Overheads
  • Handling I/O ports in the VMM
  • Many accesses don't involve actual I/O.
  • Let the VMM maintain the state, avoiding a world
    switch.
  • Send combining
  • If data rate is high, queue up packets, send them
    in a group.
  • IRQ notification
  • Use shared memory bitmap rather than requiring
    VMApp to call select() when an IRQ is received on
    the host system.
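
Send combining can be sketched as a small queue that hands packets to the host world in groups, so one world switch covers several packets. The class `SendCombiner`, the `transmit` callback, and the `max_batch` threshold are illustrative assumptions, not VMware's actual interface or tuning.

```python
class SendCombiner:
    """Queue outgoing packets and hand them to the host in groups."""
    def __init__(self, transmit, max_batch=4):
        self.transmit = transmit      # callable standing in for the host-world send
        self.max_batch = max_batch    # assumed threshold, not a VMware value
        self.queue = []
        self.world_switches = 0

    def send(self, packet):
        self.queue.append(packet)
        if len(self.queue) >= self.max_batch:
            self.flush()

    def flush(self):
        if not self.queue:
            return
        self.world_switches += 1      # one switch covers the whole group
        self.transmit(list(self.queue))
        self.queue.clear()

sent_groups = []
combiner = SendCombiner(sent_groups.append, max_batch=4)
for i in range(10):
    combiner.send(f"pkt{i}")
combiner.flush()   # drain the remainder, e.g. on a timer or when the rate drops
```

Ten packets cost three world switches here instead of ten. The price is added latency for packets that wait in the queue, which is why the slide limits the technique to high data rates.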

28
Performance Enhancements
  • Reducing CPU virtualization overhead
  • Find operations to the interrupt controller that
    have memory semantics and replace with MOV
    operation, which does not require intervention by
    the VMM.
  • Apparently requires dynamic binary translation.
  • Modifying the guest OS
  • Eliminate idle task page table switching, which
    is not necessary, since the idle task pages are
    mapped in every process page table.
  • Run idle task with page table of last process.
  • What would happen if the idle task had a bug and
    wrote to some random addresses?

29
Performance Enhancements
  • Creating a custom virtual device
  • Virtualizing a real device is somewhat
    inefficient, since the interface to these devices
    is optimized for real devices, not virtual
    devices.
  • Designing a custom virtual device can reduce
    expensive operations.
  • Disadvantage is that one must write a new device
    driver in guest OS for this virtual device.
  • Modifying the host OS
  • VMNet driver allocates kernel memory sk_buff,
    then copies from VMApp to sk_buff.
  • Can eliminate copy by using memory from VM
    physical memory.
  • Bypassing the host OS
  • VMM uses own drivers, rather than going through
    the host OS. (Note that going through the host OS
    is using a kind of process VM provided by the
    host OS.)
  • Disadvantage is that you have to write your own
    VMM driver for every supported real device.

30
Summary
  • Main goal is to develop some understanding of the
    issues of hosted system VM performance.

31
Question
  • If privileged instructions are overwritten with a
    brk instruction, how does the VMM know what
    instruction used to be there?
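
One possible answer, borrowed from how debuggers plant breakpoints, is a side table that records the original opcode for every patched address. The sketch below assumes single-byte opcodes; the class `Patcher`, the `BRK` value, and the byte values in `code` are placeholders chosen for the example, not a description of any particular VMM.

```python
BRK = 0xCC  # placeholder trap opcode for this sketch

class Patcher:
    def __init__(self, memory):
        self.memory = memory          # bytearray standing in for guest code
        self.saved = {}               # address -> original opcode byte

    def patch(self, addr):
        """Overwrite one opcode with BRK, remembering what was there."""
        if addr not in self.saved:    # never save an already-patched byte
            self.saved[addr] = self.memory[addr]
            self.memory[addr] = BRK

    def original_at(self, addr):
        """Called from the trap handler to recover the displaced opcode."""
        return self.saved[addr]

code = bytearray([0x90, 0xFA, 0x90, 0xFB])
patcher = Patcher(code)
patcher.patch(1)
patcher.patch(3)
patcher.patch(1)   # patching twice must not lose the original byte
```

When the trap fires, the handler looks up the faulting address with `original_at` and emulates the displaced instruction against the virtual state.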

32
Xen and the Art of Virtualization
  • A (bad) play on Zen and the Art of Motorcycle
    Maintenance

33
Motivation
  • Server farm scenario
  • Multiple applications installed on machines.
  • Different customers.
  • (What's admission control?)
  • Current approaches
  • Allow users to install and run apps
  • Configuration interaction between apps (like
    versions of Java jars, shared libraries, etc.)
    can lead to compatibility problems requiring
    time-consuming system administration to solve.
  • Behavior of one app can impact performance of
    another. Need performance isolation.
  • One approach is QoS.
  • Extend OS to provide QoS to apps.
  • (What's the difference between QoS and real-time?
    QoS and perf. isolation?)

34
Use VMs
  • Instead use multiple VMs, one VM per app.
  • Each app can configure the entire OS exactly how
    it requires.
  • Relatively easier to implement algorithms at the
    VM level to isolate the performance behavior of
    different apps.
  • Requirements for successful partitioning
  • Isolation (Does VMware provide this?)
  • Accommodate heterogeneity
  • Good performance
  • To avoid performance penalties of VMs like
    VMware, use paravirtualization.

35
Design Principles
  • Support for unmodified binaries is essential.
  • Must virtualize all features required by
    existing ABIs.
  • Support for full multi-app OSs is important. (Not
    just process VMs.)
  • Complex configurations may have multiple
    processes and should be configured within a
    single VM.
  • Paravirtualization is necessary to obtain high
    performance and strong resource isolation.
  • For example, virtualizing page tables can result
    in many expensive traps.
  • Even on ISAs designed for virtualization,
    completely hiding the virtualization from guest
    OS risks correctness and performance.
  • For example, the VM should know real time (and
    not just virtual time) to handle things like
    timeouts.
  • Contrast with Denali security model.
  • Separate namespaces.
  • Xen uses hypervisor.

36
The VM Interface Overview
  • Memory management
  • Paging
  • Xen is mapped into the top 64 MB of every address
    space, avoiding a TLB flush on hypervisor
    transitions.
  • Guest OSs update actual hardware page tables
    through Xen, which improves performance. (But
    makes them aware of virtualization.)
  • Segmentation
  • Cannot install fully privileged segment
    descriptors.

37
The VM Interface Overview (contd.)
  • CPU
  • Protection
  • Guest OS must run at lower privilege. Since rings
    1 and 2 are seldom used, run guest OS in ring 1.
  • Exceptions
  • Guest OSs must register handlers with Xen.
    Generally identical to original.
  • Safety is ensured by making sure the handler
    doesn't execute in ring 0.
  • System Calls
  • Fast handlers may be registered to avoid going
    through ring 0. Instead go from ring 3 to ring 1.
  • Does this change the ABI?
  • Page Fault
  • Page fault handler must be modified, since the
    faulting address is in a privileged register.
  • Technique is for Xen to write the address to a
    location in the stack frame.
  • Device I/O
  • Network, Disk, etc.
  • All replaced with special, buffer-based event
    mechanism.

38
Porting
  • XP directly accessed PTEs, Linux used macros.
    (Why is this significant?)

39
Control and Management
  • Separation of policy from mechanism
  • Microkernel-like design
  • Basic control mechanism provided by hypervisor
    through a control interface
  • Policies implemented by a special distinguished
    guest OS instance (domain).
  • Scheduling parameters, phys mem allocations,
    domain creation/destruction, create/delete
    virtual network interfaces and block devices

40
Architecture
41
Details
42
Hypercalls and Events
  • Hypercalls
  • From domain to Xen
  • Explicit calls into the hypervisor by the guest
    OS. Used by guest OS for things like updating
    hardware page tables.
  • Events
  • From Xen to domain
  • Bitmask, and handler
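
A minimal sketch of the "bitmask, and handler" idea: the hypervisor sets a bit per pending event, and the guest later runs the handler registered for each set bit. The class `EventChannel` and its method names are invented for this example and are not Xen's actual interface.

```python
class EventChannel:
    """Pending events kept as a bitmask; the guest registers one handler per bit."""
    def __init__(self):
        self.pending = 0
        self.handlers = {}

    def register(self, bit, handler):
        """Guest side: associate a handler with an event bit."""
        self.handlers[bit] = handler

    def post(self, bit):
        """Hypervisor side: mark an event pending (repeat posts coalesce)."""
        self.pending |= 1 << bit

    def dispatch(self):
        """Guest side: run handlers for all pending bits, lowest bit first."""
        handled = []
        pending, self.pending = self.pending, 0
        bit = 0
        while pending:
            if pending & 1:
                self.handlers[bit]()
                handled.append(bit)
            pending >>= 1
            bit += 1
        return handled

log = []
channel = EventChannel()
channel.register(0, lambda: log.append("net"))
channel.register(3, lambda: log.append("disk"))
channel.post(3)
channel.post(0)
channel.post(3)    # already pending, so this does not queue a second event
handled = channel.dispatch()
```

Hypercalls go the other way: the guest calls into the hypervisor synchronously, while events are delivered asynchronously and only say that something is pending.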

43
Data Transfer
  • Presence of hypervisor is another layer, so
    imperative to minimize overhead.
  • For resource accountability
  • Minimize work to demultiplex data
  • Or, figure out as quickly as possible which
    domain it goes to.
  • Memory committed to I/O comes from relevant
    domains
  • Minimize cross-talk

44
I/O Rings
  • Buffers separate. How is pointer shared? How does
    reordering work? NBS.
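
As a starting point for these questions, here is a sketch of a descriptor ring with separate request and response producer/consumer indices. It assumes that descriptors only reference buffers allocated elsewhere and that each request carries an ID echoed in its response, which is what lets responses come back out of order. The class `IORing`, its field names, and the buffer references are assumptions for illustration.

```python
class IORing:
    """Fixed-size descriptor ring shared by a guest and the hypervisor."""
    def __init__(self, size=8):
        self.size = size
        self.slots = [None] * size
        self.req_prod = 0   # advanced by the guest
        self.req_cons = 0   # advanced by the hypervisor
        self.rsp_prod = 0   # advanced by the hypervisor
        self.rsp_cons = 0   # advanced by the guest

    def put_request(self, req_id, buffer_ref):
        # Slots between rsp_cons and req_prod are still in use.
        if self.req_prod - self.rsp_cons >= self.size:
            raise BufferError("ring full")
        self.slots[self.req_prod % self.size] = ("req", req_id, buffer_ref)
        self.req_prod += 1

    def take_request(self):
        if self.req_cons == self.req_prod:
            return None
        _, req_id, buffer_ref = self.slots[self.req_cons % self.size]
        self.req_cons += 1
        return req_id, buffer_ref

    def put_response(self, req_id, status):
        # A response may only reuse the slot of an already-consumed request.
        if self.rsp_prod >= self.req_cons:
            raise RuntimeError("no consumed request slot available")
        self.slots[self.rsp_prod % self.size] = ("rsp", req_id, status)
        self.rsp_prod += 1

    def take_response(self):
        if self.rsp_cons == self.rsp_prod:
            return None
        _, req_id, status = self.slots[self.rsp_cons % self.size]
        self.rsp_cons += 1
        return req_id, status

ring = IORing(size=4)
ring.put_request(0, "buf-a")
ring.put_request(1, "buf-b")
first = ring.take_request()
second = ring.take_request()
ring.put_response(second[0], "ok")   # completed out of order
ring.put_response(first[0], "ok")
responses = [ring.take_response(), ring.take_response()]
```

Only the four indices and the descriptors are shared; the data buffers stay separate, and the guest matches each response to its request by ID rather than by position.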

45
CPU Scheduling
  • CPU Scheduling
  • BVT
  • Work-conserving
  • Latency vs. throughput
  • When would you want non-work-conserving?
  • Fast-dispatch (borrowing)

46
Time and Timers
  • Time and timers
  • Guest OSs made aware of real time, virtual time,
    and wall-clock time.
  • Real time: nanoseconds since boot; can be
    frequency-locked to an external time source.
  • Virtual time advances only when the guest OS is
    executing. Used for scheduling by the guest OS.
  • Wall-clock time? An offset from real time. (When
    would it ever be adjusted?)
  • Xen-provided timers are used by guest OS.
  • Solves one efficiency problem with VMware
    Workstation.
  • Guest XP causes host to perform poorly, because
    the host must constantly deliver timer interrupts
    to XP to do things like smooth transition
    animations (like minimizing a window, etc.).
    Forcing the guest to use a Xen-provided timer
    would eliminate the need to virtualize these
    timer interrupts.
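
The three notions of time can be sketched with a small per-domain clock. The class `DomainClock` and its fields are hypothetical names for this example; the only behavior taken from the slide is that virtual time advances only while the guest runs and that wall-clock time is an offset from real time.

```python
class DomainClock:
    """Tracks the three notions of time for one guest (all in nanoseconds)."""
    def __init__(self, wall_offset_ns):
        self.real_ns = 0                      # nanoseconds since machine boot
        self.virtual_ns = 0                   # advances only while this guest runs
        self.wall_offset_ns = wall_offset_ns  # adjustable offset from real time

    def tick(self, elapsed_ns, guest_running):
        self.real_ns += elapsed_ns
        if guest_running:
            self.virtual_ns += elapsed_ns

    def wall_clock_ns(self):
        return self.real_ns + self.wall_offset_ns

clock = DomainClock(wall_offset_ns=1_000_000)
clock.tick(300, guest_running=True)
clock.tick(700, guest_running=False)   # another domain was scheduled
clock.tick(200, guest_running=True)
```

A guest that used `virtual_ns` for network timeouts would see less time pass than its peers do, which is the correctness argument for exposing real time to the guest.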

47
Virtual Address Translation
  • Virtual address translation
  • Handled by Xen, batched updates.
  • Must be validated by Xen.
  • Type and ref count associated with each frame
  • Type is used to aid validation
  • For example, a page table frame needs to be
    validated once, but not afterwards.
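
A rough sketch of how a per-frame type and reference count could support validation. It assumes two frame types ("PT" for page table, "RW" for writable) and the rule that a frame cannot change type while references to it exist. The class `FrameTable` and the `validations` counter are invented for illustration.

```python
class FrameTable:
    """Per-frame type and reference count, used to validate guest requests."""
    def __init__(self, nframes):
        self.ftype = [None] * nframes     # None, "PT" (page table) or "RW" (writable)
        self.refs = [0] * nframes
        self.validations = 0

    def get(self, frame, wanted):
        """Take a reference to `frame` as type `wanted`, validating if needed."""
        if self.refs[frame] > 0 and self.ftype[frame] != wanted:
            raise PermissionError(f"frame {frame} is in use as {self.ftype[frame]}")
        if self.ftype[frame] != wanted:
            if wanted == "PT":
                self.validations += 1     # stand-in for checking every entry
            self.ftype[frame] = wanted
        self.refs[frame] += 1

    def put(self, frame):
        """Drop one reference; the type may change once the count reaches zero."""
        self.refs[frame] -= 1

table = FrameTable(4)
table.get(2, "PT")      # first use as a page table: validated
table.get(2, "PT")      # second use: type already matches, no revalidation
try:
    table.get(2, "RW")  # a writable mapping of a live page table is refused
    rejected = False
except PermissionError:
    rejected = True
```

Because no writable reference can coexist with the page-table type, the contents cannot change behind the hypervisor's back, so the one-time validation stays good.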

48
Physical Memory
  • Physical memory
  • Reserved for each guest OS instance at time of
    creation.
  • Provides strong isolation.
  • But no sharing. What would be advantage of
    sharing?
  • OS may use an additional table to give the
    illusion of physical memory.
  • Might need to know hardware for optimizing
    placement.

49
Network
  • VIFs
  • Two I/O rings
  • Zero-copy

50
Disks
  • VBDs (Domain0 has direct access.)
  • Disk scheduling
  • Guest doesn't know the real layout
  • Xen does some reordering
  • (A bit of a violation of policy/mechanism.)
  • Scheduling is RR of batched requests, then
    elevator.
  • Also may have reorder barriers.
  • (How well does this provide isolation?)
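
"RR of batched requests, then elevator" can be sketched as follows: take a batch from each domain in turn, then order the combined set by block number. The function `schedule_round`, the `batch_size` parameter, and the domain names are assumptions for this example; reorder barriers are not modeled.

```python
from collections import deque

def schedule_round(domain_queues, batch_size):
    """Take up to batch_size requests from each domain in turn, then sort the
    combined set by block number (a single ascending elevator pass)."""
    picked = []
    for dom, queue in domain_queues.items():
        for _ in range(min(batch_size, len(queue))):
            picked.append((queue.popleft(), dom))
    picked.sort()
    return picked

queues = {"dom1": deque([90, 10, 50]), "dom2": deque([40, 20])}
first = schedule_round(queues, batch_size=2)
second = schedule_round(queues, batch_size=2)
```

The batch limit bounds how much one domain can submit per round, while the sort recovers some seek efficiency that the guests cannot provide because they do not know the real layout.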

51
Performance
52
Relative Performance
  • Compared Linux, XenoLinux, VMware 3.2, and UML.
  • Tests with others could not be published.
  • Tests
  • SPEC INT2000
  • Linux build
  • Native Linux: 7% of CPU time is system time.
  • Open Source Database Benchmark (OSDB) Information
    Retrieval (IR)
  • OSDB On-Line Transaction Processing
  • dbench
  • File system benchmark
  • SPEC WEB99
  • App level for Web servers (Apache)

53
Performance
54
Performance
55
Operating System BMs
  • What does SMP stand for?
  • Why might SMP be slower?
  • Why are the highlighted ones slower?
  • Why is signal handling faster for Xen?

56
Operating System BMs
  • Needs hypercall.
  • Why do more processes need more time?
  • Why is the difference less significant with a
    bigger working set?

57
Operating System BMs
  • mmap and PF require two transitions. (Why?)

58
Operating System BMs
  • Zero-copy

59
Concurrent VMs
  • Run on 2-CPU SMP
  • Apache improves only 28% over UP.
  • Xen improves 9% over UP.
  • Why slightly better sometimes?

60
PostgreSQL
  • Scores running multiple PostgreSQL instances on
    native Linux are 25-35% lower. Possibly due to SMP
    scalability plus poor use of block cache.
  • Weights seem to have an effect in the Info Retr
    case, but no effect in OLTP case due to lots of
    sync writes. Why sync writes?

61
Performance Isolation
  • Only 4% and 2% below earlier results.
  • Does this make sense?

62
Scalability of VMs
  • SPEC INT2000
  • Native Linux identifies as compute bound, and
    uses 50 ms time slice. (Why does this matter?)

63
Future Work
  • Universal buffer cache with COW
  • How might this be used?
  • Last chance page cache (LPC)
  • Of non-zero length only when machine memory is
    undersubscribed.
  • Clean, evicted pages are added to the LPC.
  • On a page fault, check the LPC.
  • (Why only clean pages?)
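
The last-chance page cache idea can be sketched as a small cache fed by evictions and consulted on faults. The class `LastChancePageCache`, its oldest-first replacement policy, and the `read_from_disk` callback are assumptions for this example; this was listed as future work, so no actual implementation is implied.

```python
from collections import OrderedDict

class LastChancePageCache:
    def __init__(self, spare_frames):
        self.capacity = spare_frames      # zero when machine memory is fully subscribed
        self.pages = OrderedDict()        # page id -> contents, oldest first

    def on_evict(self, page_id, contents, dirty):
        """Keep a clean evicted page if there is spare memory to hold it."""
        if dirty or self.capacity == 0:
            return
        self.pages[page_id] = contents
        self.pages.move_to_end(page_id)
        while len(self.pages) > self.capacity:
            self.pages.popitem(last=False)   # drop the oldest entry

    def on_fault(self, page_id, read_from_disk):
        """Satisfy a fault from the cache if possible, else go to disk."""
        if page_id in self.pages:
            return self.pages.pop(page_id)
        return read_from_disk(page_id)

disk_reads = []

def read_from_disk(page_id):
    disk_reads.append(page_id)
    return f"disk:{page_id}"

lpc = LastChancePageCache(spare_frames=2)
lpc.on_evict("a", "A", dirty=False)
lpc.on_evict("b", "B", dirty=True)     # dirty pages are not cached
hit = lpc.on_fault("a", read_from_disk)
miss = lpc.on_fault("b", read_from_disk)
```

A cached page can be dropped at any moment without losing data only if an identical copy already exists on disk, which hints at the answer to the question about clean pages.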

64
Key Ideas
  • A virtual ISA (paravirtualization) is better.
  • Better performance
  • Allows VMs to be isolated from one another. One
    VM can't cause the other to thrash, for instance.
  • Allows up to 100 OS instances
  • Making the guest OS aware of virtualization
    improves correctness and performance
  • Control and management of Xen itself is done from
    a guest OS, via a special interface.
  • Cherry picking?
  • Generally speaking, people always choose tests to
    show their work in best light.
  • May be hard to tell in a complex situation.

65
Microkernels Meet Recursive Virtual Machines
  • Ford et al.

66
Decomposition
  • Microkernels decompose functionality horizontally
    (mainly).
  • Monolithic services separated horizontally.
  • Moved up one layer.
  • Stackable VMMs decompose functionality
    vertically.
  • Each layer supplies some functionality.

67
Fluke
  • Uses a nested process architecture.
  • Each process provides a VM to its children,
    possibly with additional functionality.
  • Different from usual parent-child in that
    children are completely contained within and
    visible to parent.
  • This is necessary for the parent to be a VM to
    its children.
  • Two APIs
      - Low-level kernel API to microkernel for basic
        manipulation
      - High-level protocols to handle:
          - Parent Interface
          - Process
          - MemPool
          - FileSystem
  • Nested VMs interact directly with microkernel for
    the low-level API, but interact with the parent
    VM for high-level protocols.
  • Parent VM will use interposition to add
    additional functionality. This is how the
    stacking works.

69
Key Ideas
  • Implement a microkernel that allows process
    virtual machines to be stacked.
  • Each virtual machine is a user-level server.
  • Stacking occurs through process nesting.
  • Use pass-through to avoid exponential behavior.
  • Mainly interesting for the ideas, performance is
    relatively poor, but may be improvable.