Title: Cloud Computing with Nimbus
1 Cloud Computing with Nimbus
- April 2009
- Kate Keahey
- (keahey_at_mcs.anl.gov)
- University of Chicago
- Argonne National Laboratory
2 Cloud Computing
- SaaS: Software-as-a-Service
- PaaS: Platform-as-a-Service
- IaaS: Infrastructure-as-a-Service
- Elasticity: computing on demand
- Capital expense and operational expense
3 Cloud Computing for Science
- Environment
- Resource control
4 Workspaces
- Dynamically provisioned environments
- Environment control
- Resource control
- Implementations
- Via leasing hardware platforms: reimaging, configuration management, dynamic accounts
- Via virtualization: VM deployment
- Isolation
5 A Brief History of Nimbus
[Figure: timeline spanning 2003 to 2009 (2006 marked), with these events:]
- First STAR production run on EC2
- Xen released
- EC2 goes online
- Nimbus Cloud comes online
- Research on agreement-based services
- First WSRF Workspace Service release
- Support for EC2 interfaces
- EC2 gateway available
- Context Broker release
6 Nimbus Goals
- Allow providers to build clouds
- Private clouds (privacy, expense considerations)
- Workspace Service: open source EC2 implementation
- Allow users to use cloud computing
- Do whatever it takes to enable scientists to use IaaS
- Context Broker: turnkey virtual clusters
- IaaS Gateway: interoperability
- Allow developers to experiment with Nimbus
- For research or usability/performance improvements
- Community extensions and contributions
7 The Workspace Service
[Figure: the VWS Service in front of a pool of nodes]
8 The Workspace Service
[Figure: the VWS Service in front of a pool of nodes]
- The workspace service publishes information about each workspace.
- Users can find out information about their workspace (e.g., what IP the workspace was bound to).
- Users can interact directly with their workspaces the same way they would with a physical machine.
9 Workspace Service Interfaces and Clients
- Web Services based
- Web Service Resource Framework (WSRF)
- WS state management (WS-Notification)
- Elastic Compute Cloud (EC2)
- Compatible with EC2 clients
- Supported: ec2-describe-images, ec2-run-instances, ec2-describe-instances, ec2-terminate-instances, ec2-reboot-instances, ec2-add-keypair, ec2-delete-keypair
- Unsupported: availability zones, security groups, elastic IP assignment, REST
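The supported and unsupported lists above can be captured as a small lookup, which is handy when checking whether an existing EC2 script is likely to run against this interface. This is only a sketch: the command names are copied from the slide, while `split_by_support` and the two set names are hypothetical helpers and not part of Nimbus or of the EC2 tools.

```python
# Command and feature lists copied from the slide.
# The helper below is a hypothetical illustration, not a Nimbus API.
SUPPORTED_EC2_COMMANDS = frozenset({
    "ec2-describe-images",
    "ec2-run-instances",
    "ec2-describe-instances",
    "ec2-terminate-instances",
    "ec2-reboot-instances",
    "ec2-add-keypair",
    "ec2-delete-keypair",
})

UNSUPPORTED_EC2_FEATURES = frozenset({
    "availability zones",
    "security groups",
    "elastic IP assignment",
    "REST",
})


def split_by_support(commands):
    """Partition commands into (supported, not listed as supported), keeping order."""
    supported = [c for c in commands if c in SUPPORTED_EC2_COMMANDS]
    unlisted = [c for c in commands if c not in SUPPORTED_EC2_COMMANDS]
    return supported, unlisted


if __name__ == "__main__":
    # ec2-allocate-address relates to elastic IPs, which the slide lists as unsupported.
    ok, unlisted = split_by_support(["ec2-run-instances", "ec2-allocate-address"])
    print("supported:", ok)
    print("not listed as supported:", unlisted)
```

A command that falls in the second list is not necessarily broken; it simply does not appear among the operations the slide names as supported.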
10 Workspace Service Security
- GSI authentication and authorization
- PKI-based
- VOMS, Shibboleth (via GridShib), custom PDPs
- Secure access to VMs
- EC2 key generation, or keys accessed from .ssh
- Validating images and image data
- Extensions from Vienna University of Technology
- Paper: Descher et al., Retaining Data Control in Infrastructure Clouds, ARES (the International Dependability Conference), 2009.
11 Workspace Service Networking
- Network configuration
- External: public IPs or private IPs (via VPN)
- Internal: private network via a local cluster network
- Each VM can specify multiple NICs mixing private and public networks (WSRF only)
- E.g., cluster worker nodes on a private network, headnode on both public and private networks
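The headnode/worker example above can be modelled as a small data structure. The sketch below is a toy model under stated assumptions: `VirtualNode`, `build_cluster`, and `publicly_reachable` are hypothetical names, and the network labels "public" and "private" are placeholders rather than actual workspace service network names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VirtualNode:
    """A VM with one NIC per listed network (hypothetical model, not a Nimbus type)."""
    name: str
    networks: List[str]


def build_cluster(num_workers):
    """Headnode on both networks, workers on the private network only."""
    head = VirtualNode("headnode", ["public", "private"])
    workers = [
        VirtualNode(f"worker{i}", ["private"])
        for i in range(1, num_workers + 1)
    ]
    return [head] + workers


def publicly_reachable(nodes):
    """Names of the nodes that have a NIC on the public network."""
    return [n.name for n in nodes if "public" in n.networks]


if __name__ == "__main__":
    cluster = build_cluster(3)
    for node in cluster:
        print(node.name, node.networks)
    print("publicly reachable:", publicly_reachable(cluster))
```

In this layout only the headnode is exposed externally, while every node shares the private network for cluster-internal traffic.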
12 Workspace Components
[Figure: component diagram showing the workspace client, WSRF and EC2 interfaces, workspace service, workspace resource manager, workspace pilot, workspace control, and the OpenNebula Project]
- See papers at http://workspace.globus.org/papers/index.html
- Simple Leases with Workspace Pilot (EuroPar08)
- Combining Batch Execution and Leasing Using Virtual Machines (HPDC08)
13 Cloud Capabilities
[Figure: component diagram showing the cloud client, workspace client, WSRF and EC2 interfaces, workspace service, storage service, workspace resource manager, workspace pilot, and workspace control]
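14 The IaaS Gateway
[Figure: the previous component diagram (cloud client, workspace client, WSRF and EC2 interfaces, workspace service, storage service, workspace resource manager, workspace pilot, workspace control) extended with an IaaS gateway connecting to EC2 and potentially other providers]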
15 Cloud Computing Ecosystem
[Figure: layered ecosystem diagram]
- Appliance Providers: marketplaces, commercial providers, Virtual Organizations
- Appliance management software
- Deployment Orchestrator
- User Environments
- VMM/DataCenter/IaaS
16 Turnkey Virtual Clusters
[Figure: three nodes, each starting with its own pair (IP1/HK1, IP2/HK2, IP3/HK3); through the Context Broker every node ends up holding all three pairs, and MPI runs across the cluster]
- Turnkey, tightly-coupled cluster
- Shared trust/security context
- Shared configuration/context information
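The exchange suggested by the IP1/HK1, IP2/HK2, IP3/HK3 labels on this slide can be illustrated with a toy model: each node reports its own address and key, and once all expected nodes have reported, every node can retrieve the full set. This is a minimal sketch under stated assumptions, not the Context Broker's actual protocol or API. `ToyContextBroker`, `render_hosts`, and `render_known_hosts` are hypothetical names, "HK" is assumed here to stand for an SSH host key, and the addresses and key strings are placeholders.

```python
class ToyContextBroker:
    """In-memory stand-in for a context broker (hypothetical, for illustration only)."""

    def __init__(self, expected_nodes):
        self.expected_nodes = expected_nodes
        self.reports = {}  # hostname -> (ip, host_key)

    def report(self, hostname, ip, host_key):
        """A node announces its own IP address and host key."""
        self.reports[hostname] = (ip, host_key)

    def is_complete(self):
        """True once every expected node has reported."""
        return len(self.reports) >= self.expected_nodes

    def shared_context(self):
        """Return the full context; fail if some nodes are still missing."""
        if not self.is_complete():
            raise RuntimeError("not all nodes have reported yet")
        return dict(self.reports)


def render_hosts(context):
    """Lines in /etc/hosts style: '<ip> <hostname>'."""
    return "\n".join(
        f"{ip} {host}" for host, (ip, _key) in sorted(context.items())
    )


def render_known_hosts(context):
    """Lines in known_hosts style: '<hostname>,<ip> <key>'."""
    return "\n".join(
        f"{host},{ip} {key}" for host, (ip, key) in sorted(context.items())
    )


if __name__ == "__main__":
    broker = ToyContextBroker(expected_nodes=3)
    broker.report("node1", "10.0.0.1", "HK1")
    broker.report("node2", "10.0.0.2", "HK2")
    broker.report("node3", "10.0.0.3", "HK3")
    context = broker.shared_context()
    print(render_hosts(context))
    print(render_known_hosts(context))
```

After the exchange, each node has the same view of the cluster, which corresponds to the shared trust and shared configuration context listed above.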
17 Context Broker Goals
- Can work with every appliance
- Appliance schema, can be implemented in terms of many configuration systems
- Can work with every cloud provider
- Simple and minimal conditions on generic context delivery
- Can work across multiple cloud providers, in a distributed environment
18 Context Broker Status
- Releases
- In alpha since 08/07, first release 06/08, update 01/09
- Used to contextualize clusters composed of 100s of virtual nodes for multiple apps
- Contextualized images on workspace marketplace
- Working with rPath to make contextualization easier for the user
- Discussing OVF extensions
- Paper: Keahey and Freeman, Contextualization: Providing One-Click Virtual Clusters, eScience 2008
19 End of Nimbus Tour
[Figure: full component diagram showing the context client, workspace client, cloud client, WSRF and EC2 interfaces, workspace service, storage service, workspace resource manager, workspace pilot, workspace control, context broker, and an IaaS gateway connecting to EC2 and potentially other providers]
20 Science Clouds
- Goals
- Enable scientific projects to experiment with IaaS clouds
- Evolve software in response to the needs of scientific projects
- A laboratory for exploration of cloud interoperability issues
- Participants
- University of Chicago (since 03/08, 16 nodes), University of Florida (05/08, 16-32 nodes, access via VPN), Masaryk University, Brno, Czech Republic (08/08), Wispy at Purdue (09/08)
- In progress: Grid5K, Vrije, others
- Using EC2 for large runs
- Simple governance model, access given to any scientific project
- http://workspace.globus.org/clouds
21 Who Runs on Nimbus at UC?
- 100 DNs; projects ranging across science, CS, education, build and test
22 STAR
- STAR: a nuclear physics experiment that studies fundamental properties of nuclear matter
- Computations require complex and consistently configured environments
- Requirements
- A virtual OSG STAR cluster: OSG headnode (gridmapfiles, host certificates, NFS, Torque), worker nodes (SL4, STAR)
- From Science Clouds to EC2 runs
- One-click virtual cluster deployment: Context Broker
- Producing just-in-time results for Quark Matter conference: http://www.isgtw.org/?pid=1001735
- Work by Jerome Lauret, Doug Olson, Leve Hajdu, Lidia Didenko at BNL
23 ALICE HEP Experiment at CERN
- Collaboration with CERNVM project
- HPCwire article
24 Sky Computing
[Figure: sites at U of Florida, U of Chicago, and Purdue, each connected through a ViNE router]
25 Sky Computing
[Figure: a Hadoop cloud spanning U of Florida, U of Chicago, and Purdue]
- Papers
- Sky Computing, by K. Keahey, A. Matsunaga, M. Tsugawa, J. Fortes. Submitted to IEEE Internet Computing.
- CloudBLAST: Combining MapReduce and Virtualization on Distributed Resources for Bioinformatics Applications, by A. Matsunaga, M. Tsugawa and J. Fortes. eScience 2008.
26 Open Source IaaS Implementations
- OpenNebula
- Open source datacenter implementation
- University of Madrid, I. Llorente team, 03/2008
- Eucalyptus
- Open source implementation of EC2
- UCSB, R. Wolski team, 06/2008
- Cloud-enabled Nimrod-G
- Open source implementation of EC2
- Monash University, MeSsAGE Lab, 01/2009
- Industry efforts
- openQRM, Enomalism
27 Friends and Family
- Committers: Kate Keahey and Tim Freeman (ANL/UC), Ian Gable (UVIC)
- A lot of help from the community, see http://workspace.globus.org/people.html
- Collaborations
- Cumulus: S3 implementation (Globus team)
- EBS: IU project
- Appliance management: rPath, Bcfg2 project, CohesiveFT
- Virtual network overlays: University of Florida
- Security (research): Vienna University of Technology
28 IaaS Clouds vs Grids
- Grid computing
- Assumption: site retains control over resources
- Remote interfaces to local site mechanisms
- Tradeoff: difficult to provide the right environments and control, but easy to deploy
- Cloud computing
- Assumption: a user gets a lease on a remote resource that it gets to control
- Enabled by virtualization (Xen)
- Tradeoff: enables a larger class of applications, but hard to deploy
- Raises issues, e.g., site licenses? Configuration support?
- Towards sky computing
- I can now trust a remote resource: I configured it myself
- Cloud computing and virtual networks
- Local distributed environment
29 Parting Thoughts
- Science-driven cloud computing
- Importance of open source
- Drive requirements into the infrastructure, customize
- Drive the development of standards
- Cloud computing for the user
- Combine with what we have (grid computing)
- Explore new potential
- Future directions
- Creating the ecosystem, working out the issues, e.g. licensing, appliance support
- Interoperability and standards
- Service Levels