Title: Palacios and Kitten: New High Performance Operating Systems For Scalable Virtualized and Native Supercomputing
1Palacios and Kitten: New High Performance Operating Systems For Scalable Virtualized and Native Supercomputing
- John R. Lange and Kevin Pedretti
- Trammell Hudson, Peter Dinda,
- Zheng Cui, Lei Xia,
- Patrick Bridges, Andy Gocke,
- Steven Jaconette,
- Mike Levenhagen and Ron Brightwell
- Northwestern University
- Sandia National Labs
- University of New Mexico
2Summary
- Palacios
- First VMM for scalable HPC
- Open Source and available
- Kitten
- First open source Lightweight Kernel for High Performance Computing (HPC)
- Open Source and available
- Proved HPC virtualization is effective at scale
- Performance within 5% of native
- Largest scale study of virtualization
3What is a virtual machine?
- Run an OS as an application
- Run multiple OS environments on a single machine
- Start, stop, pause
- Can easily move entire OS environments
[Figure: a native stack (Application on OS on Hardware) beside a virtualized stack (Applications on Guest OSes on a Host OS/VMM on Hardware). Labels: Page Tables, CPU state, Hardware; the VMM emulates hardware for each guest.]
4What are VMMs currently used for?
- Server Consolidation
- Fault tolerance
- Legacy application support
- Debugging
- Isolation
- Virtual appliances
- Failover and disaster recovery
- Market size
- 2007: 5.5 billion
- 2011: 11.7 billion
7.58 Billion
16.70 Billion
5High Performance Computing (HPC)
- Large scale simulations to solve Big Problems
6Virtualization in HPC
- Fault tolerance
- RedStorm MTBI target: 50 hours
- RedStorm min TTR: 30 minutes to 1 hour
- Broader usage
- Allow applications to select best OS
- Only if it doesn't degrade performance
- Tightly coupled parallel applications
- Very large scale
A.B. Nagarajan, F. Mueller, C. Engelmann, and S.L. Scott, "Proactive Fault Tolerance for HPC with Xen Virtualization," ICS 2007
7Palacios VMM
- OS-independent embeddable virtual machine monitor
- Developed at Northwestern and University of New Mexico
- Open source and freely available
- Downloaded over 1000 times as of July 2009
- Users
- Kitten: Lightweight supercomputing OS from Sandia National Labs
- MINIX 3
- Modified Linux versions
- Successfully used on supercomputers, clusters
(Infiniband and Ethernet), and servers
http://www.v3vee.org/palacios
8Palacios as an HPC VMM
- Minimalist interface
- Suitable for an LWK
- Compile and runtime configurability
- Create a VMM tailored to specific environments
- Low noise
- Contiguous memory pre-allocation
- Passthrough resources and resource partitioning
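One consequence of contiguous memory pre-allocation can be sketched in a few lines: if a guest's memory is a single contiguous host region, translating a guest physical address to a host physical address reduces to a bounds check plus a constant offset. The class, field names, and addresses below are hypothetical illustrations, not Palacios's actual interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContiguousGuestMemory:
    """Hypothetical model of a guest backed by one contiguous host region."""

    host_base: int  # host physical address where the guest's memory starts
    size: int       # size of the pre-allocated region in bytes

    def guest_to_host(self, guest_paddr: int) -> int:
        """Translate a guest physical address: bounds check, then add an offset."""
        if not 0 <= guest_paddr < self.size:
            raise ValueError(f"guest address {guest_paddr:#x} is outside guest memory")
        return self.host_base + guest_paddr


# Made-up layout: a 1 GiB guest placed at host physical address 4 GiB.
mem = ContiguousGuestMemory(host_base=4 << 30, size=1 << 30)
print(hex(mem.guest_to_host(0x1000)))
```

Because the mapping never changes after allocation, the lookup has no per-page state to maintain, which fits the low-noise goal listed on this slide.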
9Lightweight Kernel Timeline
- 1991: Sandia/UNM OS (SUNMOS), nCube-2
- 1991: Linux 0.02
- 1993: SUNMOS ported to Intel Paragon (1800 nodes)
- 1993: SUNMOS experience used to design Puma; first implementation of Portals communication architecture
- 1994: Linux 1.0
- 1995: Puma ported to ASCI Red (4700 nodes); renamed Cougar, productized by Intel
- 1997: Stripped down Linux used on Cplant (2000 nodes); difficult to port Puma to COTS Alpha server; included Portals API
- 2002: Cougar ported to ASC Red Storm (13000 nodes); renamed Catamount, productized by Cray; host and NIC-based Portals implementations
- 2004: IBM develops LWK (CNK) for BG/L/P (106000 nodes)
- 2005: IBM and ETI develop LWK (C64) for Cyclops64 (160 cores/die)
10Kitten An Open Source LWK
- Better match for user expectations
- Provides mostly Linux-compatible user environment
- Including threading
- Supports unmodified compiler toolchains and ELF executables
- Better match for vendor expectations
- Modern code-base with familiar Linux-like organization
- Drop-in compatible with Linux
- Infiniband support
- End-goal is deployment on future capability system
http://software.sandia.gov/trac/kitten
11Complexity
- Scalable HPC performance requires minimal
overhead
Component Lines of code
Kitten 33,000
Palacios 28,000
Total 61,000
Xen: 580k lines (50k-80k core)
KVM: 50k-60k lines, plus kernel dependencies (??) and user-level devices (180k)
12HPC Performance Evaluation
- Virtualization is very useful for HPC, but
- Only if it doesn't hurt performance
- Virtualized RedStorm with Palacios
- Evaluated with Sandia's system evaluation benchmarks
17th fastest supercomputer: Cray XT3, 38208 cores, 3500 sq ft, 2.5 MegaWatts, 90 million
13Virtualized performance (Catamount)
Within 5%
Scalable
HPCCG: conjugate gradient solver
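The overhead figures on these slides compare a virtualized run against a native run of the same benchmark. A minimal sketch of that calculation follows, assuming overhead is defined as the relative increase in runtime; the function names and the timings are made up for illustration and are not measurements from the study.

```python
def overhead_percent(native_seconds: float, virtualized_seconds: float) -> float:
    """Relative slowdown of the virtualized run, as a percentage of native runtime."""
    if native_seconds <= 0:
        raise ValueError("native runtime must be positive")
    return 100.0 * (virtualized_seconds - native_seconds) / native_seconds


def within_budget(native_seconds: float, virtualized_seconds: float,
                  budget_percent: float = 5.0) -> bool:
    """True if the virtualized run stays within the given overhead budget."""
    return overhead_percent(native_seconds, virtualized_seconds) <= budget_percent


# Illustrative timings only: 100 s native, 104 s virtualized.
print(overhead_percent(100.0, 104.0))  # 4.0
print(within_budget(100.0, 104.0))     # True
```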
14Comparison of Operating Systems
Shadow Paging
Catamount
Compute Node Linux
HPCCG: conjugate gradient solver
15Comparison of Operating Systems
Catamount
Compute Node Linux
CTH: multi-material, large deformation, strong shockwave simulation
16Large Scale Study
- Evaluation on full RedStorm system
- 12 hours of dedicated system time on full machine
- Largest virtualization performance scaling study to date
- Measured performance at exponentially increasing scales
- Up to 4096 nodes
- Publicity
- New York Times
- Slashdot
- HPCWire
- Communications of the ACM
- PC World
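The "exponentially increasing scales" of the study can be sketched as a list of node counts. The slide gives only the upper bound of 4096 nodes; the starting point of 1 node and the doubling factor below are assumptions made for illustration.

```python
def scaling_points(max_nodes: int, start: int = 1, factor: int = 2) -> list[int]:
    """Node counts that grow geometrically from `start` up to `max_nodes`."""
    if start < 1 or factor < 2:
        raise ValueError("start must be >= 1 and factor must be >= 2")
    points = []
    nodes = start
    while nodes <= max_nodes:
        points.append(nodes)
        nodes *= factor
    return points


# Assumed doubling schedule ending at the 4096 nodes named on the slide.
print(scaling_points(4096))
```

With doubling, 13 runs cover everything from a single node to 4096 nodes, which is one way a scaling study could fit into a fixed block of dedicated machine time.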
17Scalability at Large Scale (Catamount)
Within 3%
Scalable
CTH: multi-material, large deformation, strong shockwave simulation
18Commodity Systems
- Kitten and Palacios fully support commodity systems
- Infiniband clusters
- Ethernet servers
- Generic PC hardware
- Palacios embeddable in many OSes
- Kitten
- MINIX 3
- Linux
- GeekOS
19Infiniband on Commodity Linux
(Linux guest on IB cluster)
2 node Infiniband Ping Pong bandwidth measurement
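A ping-pong test bounces a message between two nodes and derives bandwidth from the elapsed time. The sketch below assumes the common convention that one-way time is half the round-trip time; the function name and the sample values are hypothetical, and the slide does not say how the benchmark used here computes its result.

```python
def pingpong_bandwidth(message_bytes: int, round_trips: int,
                       elapsed_seconds: float) -> float:
    """One-way bandwidth in bytes per second from a timed ping-pong loop."""
    if message_bytes <= 0 or round_trips <= 0 or elapsed_seconds <= 0:
        raise ValueError("all arguments must be positive")
    # Each round trip carries the message once in each direction.
    one_way_seconds = elapsed_seconds / (2 * round_trips)
    return message_bytes / one_way_seconds


# Made-up numbers: 1 MiB messages, 1000 round trips, 2 seconds in total.
bandwidth = pingpong_bandwidth(1 << 20, 1000, 2.0)
print(f"{bandwidth / 1e6:.1f} MB/s")
```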
20Summary
- Virtualization can scale
- Near native performance for optimized VMM/guest (within 5%)
- VMM needs to know about guest internals
- Should modify behavior for each guest environment
- Example: Paging method to use depends on guest
- Black Box inference is not desirable in HPC environment
- Unacceptable performance overhead
- Convergence time
- Mistakes have large consequences
- Need guest cooperation
- Guest and VMM relationship should be symbiotic
- Paper forthcoming (4096 scaling results and
techniques)
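The idea of guest cooperation can be sketched as the guest passing an explicit hint that the VMM acts on, instead of the VMM inferring guest behavior from the outside. Everything in this sketch is hypothetical: the enum, the function, and the hint strings are illustrative and are not Palacios's interface. Shadow paging appears on the comparison slides; nested paging is named here only as the usual hardware-assisted alternative, and which mode suits which guest is not stated on these slides.

```python
from enum import Enum
from typing import Optional


class PagingMode(Enum):
    SHADOW = "shadow"
    NESTED = "nested"


def choose_paging_mode(guest_hint: Optional[str],
                       default: PagingMode = PagingMode.SHADOW) -> PagingMode:
    """Use the paging mode the guest asks for; fall back to a default without a hint."""
    if guest_hint is None:
        return default
    try:
        return PagingMode(guest_hint.lower())
    except ValueError:
        raise ValueError(f"unknown paging mode hint: {guest_hint!r}") from None


print(choose_paging_mode("nested"))  # guest supplied a hint
print(choose_paging_mode(None))      # no hint, so the default is used
```

An explicit hint avoids the convergence time and the cost of wrong guesses that the slide attributes to black box inference.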
21Future Work
- Continue exploring virtualization in HPC
- NU, UNM and SNL collaboration
- Granted 5 million hours on Jaguar
- Current fastest supercomputer in the world
Oak Ridge National Labs: Cray XT5, 224,256 cores, 4352 sq. ft, 6.95 MegaWatts, 104 million
22Conclusion
- Palacios and Kitten
- Two open source tools for HPC
- Proved virtualization of HPC systems can scale
- Contributions Welcome!!
- http://www.v3vee.org
- http://software.sandia.gov/trac/kitten