Title: StarCluster
1. StarCluster
(http://web.mit.edu/starcluster)
HPC on Amazon's Elastic Compute Cloud
Justin Riley (jtriley_at_mit.edu)
Software Tools for Academics and Researchers
Office of Educational Innovation and Technology
Massachusetts Institute of Technology
77 Massachusetts Ave., Cambridge, MA 02139
2. Outline
- About STAR
- Overview of Amazon Web Services (AWS)
- Elastic Compute Cloud (EC2) Hardware
- Motivations Behind StarCluster
- About StarCluster
- StarCluster Features
- StarCluster Advantages
- StarCluster Live Demo
- EC2 Performance
- Materials Science Research Case Study
3. About STAR
What's your biggest problem bringing your
research into the classroom?
4. Overview of Amazon Web Services (AWS)
- Elastic Compute Cloud (EC2) Features
- Amazon EC2 allows you to dynamically allocate and terminate Linux virtual machines with a variety of hardware configurations
- Pay only for what you use (i.e. machine hours and data transfer)
- Ability to capture software configurations into Amazon Machine Images (AMIs) for later use
- AMIs can be used to launch multiple machines with identical software configurations
5. Overview of Amazon Web Services (AWS)
- Elastic Block Storage (EBS) Features
- EBS volumes are highly available, highly reliable volumes that can be attached to a running Amazon EC2 machine and are exposed as standard block devices
- Allows you to create point-in-time snapshots of your data
- Pay per month based on allocation as well as per 1 million I/O requests ($0.10/GB allocated/month and $0.10/million I/O requests)
- 1GB-1TB limit per EBS volume
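The EBS pricing rule above is simple arithmetic, so a minimal Python sketch may help when budgeting a volume. The rates are the ones quoted on this slide and may not match current AWS pricing; the helper name `ebs_monthly_cost` and the choice to treat 1TB as 1024GB are assumptions made for illustration only.

```python
# Rates as quoted on the slide (USD); treat them as historical figures.
EBS_RATE_PER_GB_MONTH = 0.10      # per GB allocated, per month
EBS_RATE_PER_MILLION_IO = 0.10    # per 1 million I/O requests

# Size limit per volume from the slide (assumption: 1TB taken as 1024GB).
EBS_MIN_GB = 1
EBS_MAX_GB = 1024


def ebs_monthly_cost(allocated_gb, io_requests):
    """Estimate the monthly cost in USD of one EBS volume.

    Hypothetical helper: it only combines the two charges named on the
    slide (allocation and I/O requests) and ignores snapshot storage.
    """
    if not EBS_MIN_GB <= allocated_gb <= EBS_MAX_GB:
        raise ValueError("an EBS volume must be between 1GB and 1TB")
    if io_requests < 0:
        raise ValueError("io_requests must be non-negative")
    storage_cost = allocated_gb * EBS_RATE_PER_GB_MONTH
    io_cost = (io_requests / 1e6) * EBS_RATE_PER_MILLION_IO
    return round(storage_cost + io_cost, 2)


if __name__ == "__main__":
    # Example: a 100GB volume that serves 5 million I/O requests in a month.
    print(ebs_monthly_cost(100, 5000000))
```

Note that the charge is based on the allocated size, not on how much of the volume is actually filled.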
6. Elastic Compute Cloud Hardware
One EC2 Compute Unit provides the equivalent CPU
capacity of a 1.0-1.2 GHz 2007 Opteron or 2007
Xeon processor.
Standard Instances
Instance     Arch   CPU                        RAM    Storage  I/O Performance  Cost/hr
Small        32bit  1.0-1.2GHz                 1.7GB  160GB    Moderate         $0.10/hr
Large        64bit  2.0-2.4GHz dual-core       7.5GB  860GB    High             $0.40/hr
Extra Large  64bit  2.0-2.4GHz quad-core       15GB   1.690TB  High             $0.80/hr

High CPU Instances
Instance     Arch   CPU                        RAM    Storage  I/O Performance  Cost/hr
Medium       32bit  2.5-3.0GHz dual-core       1.7GB  350GB    Moderate         $0.20/hr
Extra Large  64bit  2.5-3.0GHz quad-core(ht)   7GB    1.690TB  High             $0.80/hr
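Since EC2 is billed per machine hour, the hourly rates in these tables translate directly into a cluster budget. The following is a rough Python sketch of that calculation. The dictionary keys (`small`, `highcpu-medium`, and so on) and the function name `cluster_cost` are hypothetical labels chosen for this example, not AWS or StarCluster identifiers, and the estimate covers machine hours only (no data transfer or EBS charges).

```python
# Hourly rates (USD) copied from the instance tables on this slide.
# The keys are hypothetical labels used only in this sketch.
HOURLY_RATE = {
    "small": 0.10,
    "large": 0.40,
    "xlarge": 0.80,
    "highcpu-medium": 0.20,
    "highcpu-xlarge": 0.80,
}


def cluster_cost(instance, nodes, hours):
    """Estimate the machine-hour cost in USD of an n-node cluster.

    Assumes every node uses the same instance type and runs for the
    same number of hours; data transfer and storage are not included.
    """
    if instance not in HOURLY_RATE:
        raise KeyError("unknown instance label: %s" % instance)
    if nodes < 1 or hours < 0:
        raise ValueError("nodes must be >= 1 and hours must be >= 0")
    return round(HOURLY_RATE[instance] * nodes * hours, 2)


if __name__ == "__main__":
    # Example: a 4-node cluster of Small instances used for 10 hours,
    # and a 9-node cluster of Large instances kept up for one day.
    print(cluster_cost("small", 4, 10))
    print(cluster_cost("large", 9, 24))
```

Because the cost scales with hours of use, shutting a cluster down between sessions matters more for the final bill than the choice between neighboring instance types.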
7. Motivations Behind StarCluster
StarHPC - an on-demand compute cluster for
parallel programming with both OpenMP and OpenMPI
technologies. It provides a virtual desktop
environment, hosted on EC2, configured with all
the necessary tools for programming in
OpenMP/OpenMPI.
http://web.mit.edu/star/hpc
StarMolsim - a web application used to run
materials modeling research software. It enables
the user to run various simulations on a
distributed compute cluster and retrieve the
results, all from a web browser.
http://web.mit.edu/star/molsim
8. HPC in the Classroom
9. StarHPC
Use case: students have direct access to an HPC
cluster to actively develop parallel programs
using the Message Passing Interface (MPI)
SSH/VNC
StarHPC was used for 2 weeks in an Independent
Activities Period (IAP) course for parallel
programming using OpenMP and OpenMPI.
Result: Creating a 4-node cluster for two weeks
came out to about $25 per student using Amazon
EC2.
10. StarMolsim
Use case: students log in to a web application as
a proxy to the computing resources. The web
application handles communicating with the
cluster to submit jobs, retrieve the results, etc.
User
Web Server hosting GenePattern from the Broad
Institute of MIT and Harvard
Result: Amazon EC2 was used to replace a
traditional 9-node HPC cluster for an entire
semester. The cost for using the 9-node EC2
cluster for the semester was around
$3,000-$4,000.
11. About StarCluster
StarCluster is a utility for creating and
managing general purpose compute clusters hosted
on Amazon's Elastic Compute Cloud (EC2).
StarCluster makes it easy for a user to create
their own compute cluster on EC2 and pay only for
what they use.
- StarCluster Dependencies
- Registered and fully configured EC2 account.
- Python 2.4
- Paramiko library for Python
- Software included in the virtual machine
- OpenMPI
- NFS'd /home directory
- Sun Grid Engine
- Scipy/Numpy/IPython
- Compilers for installing your own custom software
- Ubuntu Linux OS with apt-get for installing additional OS software
http://web.mit.edu/stardev/cluster
12. StarCluster Features
- Simple configuration with sensible defaults
- One command to create and configure an n-node cluster on EC2
- Utilizes Amazon's Elastic Block Storage to store and snapshot your applications and data
- Easily recreate identical working environments
- 32bit/64bit Ubuntu 9.04 public AMIs
13. StarCluster Features
- Automatic Configuration of
- Sun Grid Engine with Parallel Environment (PE)
- OpenMPI with SGE PE Support
- NFS shares (e.g. /home and /opt)
- Passwordless SSH
- 147GB local scratch space on /scratch for each node
14. StarCluster Advantages
- Portable: launch a cluster from virtually anywhere!
- Supplements existing resources when needed
- Easily store your applications and data in the cloud via EBS. Simply upload your applications/data to /home and your data will be available each time you launch StarCluster.
- Easy to install additional OS software. Just launch the AMI, use the package manager to install additional software, and rebundle the AMI to create your own customized version of StarCluster.
- Easily package the results of computational experiments for reproducible research
15. StarCluster Live Demo
16. EC2 Performance
Summary: Message-passing workloads perform extremely
poorly in comparison to local HPC resources.
Embarrassingly parallel workloads fare much better,
but still underperform compared to local HPC resources.
Walker, E. (2008, October). Benchmarking Amazon
EC2 for high-performance scientific computing.
Retrieved from http://www.usenix.org/publications/login/2008-10/openpdfs/walker.pdf
17. Materials Science Case Study
54 relaxation calculations, 25 and 32 atoms
(C,N,O,H), standard convergence criterion
(Espresso-4.0.5 with MKL 10 and gfortran/gcc
4.3.3)
Worst 58 Best 69
Worst 53 Best 57
18. Getting Started with StarCluster
Point your web browser to http://web.mit.edu/starcluster
19. AWS Funding Opportunities...
The AWS in Education program offers:
- Teaching Grants for educators using AWS in courses (plus access to selected course content resources)
- Research Grants for academic researchers using AWS in their work
- Project Grants for student organizations pursuing entrepreneurial endeavors
- Tutorials for students that want to use AWS for self-directed learning
- Solutions for university administrators looking to use cloud computing to be more efficient and cost-effective in the university's IT infrastructure
Learn more about AWS in Education programs:
http://aws.amazon.com/education/
20. Acknowledgements
- Professor Buehler (MIT)
- Professor Marzari (MIT)
- Constantinos Evangelinos (MIT)
- Nicolas Poilvert
- Nicolas Pinto (MIT)
- Amazon Web Services
21. Thanks for coming! Any questions?