StarCluster - PowerPoint PPT Presentation

About This Presentation
Title:

StarCluster

Description:

HPC on Amazon's Elastic Compute Cloud Justin Riley (jtriley_at_mit.edu) Software Tools for Academics and Researchers Office of Educational Innovation and Technology – PowerPoint PPT presentation

Slides: 22
Provided by: starMitE
Learn more at: http://star.mit.edu

Transcript and Presenter's Notes

1
StarCluster
(http://web.mit.edu/starcluster)
HPC on Amazon's Elastic Compute Cloud
Justin Riley (jtriley_at_mit.edu)
Software Tools for Academics and Researchers
Office of Educational Innovation and Technology
Massachusetts Institute of Technology
77 Massachusetts Ave., Cambridge, MA 02139
2
Outline
  • About STAR
  • Overview of Amazon Web Services (AWS)
  • Elastic Compute Cloud (EC2) Hardware
  • Motivations Behind StarCluster
  • About StarCluster
  • StarCluster Features
  • StarCluster Advantages
  • StarCluster Live Demo
  • EC2 Performance
  • Materials Science Research Case Study

3
About STAR
What's your biggest problem bringing your
research into the classroom?
4
Overview of Amazon Web Services (AWS)
  • Elastic Compute Cloud (EC2) Features
  • Amazon EC2 allows you to dynamically allocate
    and terminate Linux virtual machines with a
    variety of hardware configurations
  • Pay only for what you use (i.e., machine hours
    and data transfer)
  • Capture software configurations into Amazon
    Machine Images (AMIs) for later use
  • AMIs can be used to launch multiple machines
    with identical software configurations

5
Overview of Amazon Web Services (AWS)
  • Elastic Block Storage (EBS) Features
  • EBS volumes are highly available, highly
    reliable volumes that can be attached to a
    running Amazon EC2 machine and are exposed as
    standard block devices
  • Allows you to create point-in-time snapshots of
    your data.
  • Pay per month based on allocation as well as per
    1 million I/O requests ($0.10/GB allocated/month
    and $0.10/million I/O requests)
  • 1 GB to 1 TB size limit per EBS volume
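
The EBS pricing above can be turned into a quick monthly estimate. The following is a minimal sketch, assuming the slide's figures are US dollars and that 1 TB means 1024 GB; the function and constant names are illustrative and not part of any AWS or StarCluster API.

```python
# Prices quoted on the slide, assumed to be USD; treat them as a
# historical snapshot rather than current AWS pricing.
PRICE_PER_GB_MONTH = 0.10
PRICE_PER_MILLION_IO = 0.10

# Per-volume size limit from the slide (1 GB to 1 TB).
# Assumption: 1 TB is taken as 1024 GB.
MIN_GB, MAX_GB = 1, 1024


def ebs_monthly_cost(allocated_gb, io_requests):
    """Estimate the monthly cost of one EBS volume.

    allocated_gb: size of the volume in GB (billed whether used or not)
    io_requests:  number of I/O requests made during the month
    """
    if not MIN_GB <= allocated_gb <= MAX_GB:
        raise ValueError("volume size must be between 1 GB and 1 TB")
    storage_cost = allocated_gb * PRICE_PER_GB_MONTH
    io_cost = io_requests / 1_000_000 * PRICE_PER_MILLION_IO
    return round(storage_cost + io_cost, 2)


# A 100 GB volume serving 50 million I/O requests in a month.
example_cost = ebs_monthly_cost(100, 50_000_000)
```

Under these assumptions the example works out to 10.00 for storage plus 5.00 for I/O. Note that storage is billed on the allocated size, so an oversized volume costs more even when it is mostly empty.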

6
Elastic Compute Cloud Hardware
One EC2 Compute Unit provides the equivalent CPU
capacity of a 1.0-1.2 GHz 2007 Opteron or 2007
Xeon processor.
Standard Instances
Instance      Arch    CPU                        RAM     Storage   I/O Performance   Cost/hr
Small         32bit   1.0-1.2GHz                 1.7GB   160GB     Moderate          $0.10/hr
Large         64bit   2.0-2.4GHz dual-core       7.5GB   860GB     High              $0.40/hr
Extra Large   64bit   2.0-2.4GHz quad-core       15GB    1.690TB   High              $0.80/hr

High CPU Instances
Instance      Arch    CPU                        RAM     Storage   I/O Performance   Cost/hr
Medium        32bit   2.5-3.0GHz dual-core       1.7GB   350GB     Moderate          $0.20/hr
Extra Large   64bit   2.5-3.0GHz quad-core(ht)   7GB     1.690TB   High              $0.80/hr
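
Because EC2 bills per machine hour, the hourly prices in the tables above are enough to estimate what a cluster run costs. The sketch below assumes the Cost/hr column is in US dollars and ignores data transfer and EBS charges; the dictionary keys are labels invented for this example, not AWS instance-type identifiers.

```python
# Hourly prices from the tables above, assumed to be USD.
# The keys are hypothetical labels chosen for this sketch.
HOURLY_PRICE = {
    "small": 0.10,
    "large": 0.40,
    "xlarge": 0.80,
    "highcpu-medium": 0.20,
    "highcpu-xlarge": 0.80,
}


def cluster_cost(instance, nodes, hours):
    """Estimate the compute cost of running `nodes` machines for `hours`.

    Only machine hours are counted; data transfer and storage are ignored.
    """
    if nodes < 1 or hours < 0:
        raise ValueError("nodes must be >= 1 and hours must be >= 0")
    return round(HOURLY_PRICE[instance] * nodes * hours, 2)


# Four Small nodes for a 10-hour working session.
session_cost = cluster_cost("small", 4, 10)

# Nine Large nodes left running for a full day.
day_cost = cluster_cost("large", 9, 24)
```

The estimate scales linearly with both node count and hours, which is why shutting a cluster down between sessions matters more than the choice of instance size for short-lived teaching workloads.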
7
Motivations Behind StarCluster
StarHPC - an on-demand compute cluster for
parallel programming with both OpenMP and OpenMPI
technologies. It provides a virtual desktop
environment, hosted on EC2, configured with all
the necessary tools for programming in
OpenMP/OpenMPI. http://web.mit.edu/star/hpc

StarMolsim - a web application used to run
materials modeling research software. It enables
the user to run various simulations on a
distributed compute cluster and retrieve the
results, all from a web browser.
http://web.mit.edu/star/molsim
8
HPC in the Classroom
9
StarHPC
Use case: students have direct access to an HPC
cluster to actively develop parallel programs
using the Message Passing Interface (MPI).
SSH/VNC
StarHPC was used for 2 weeks in an Independent
Activities Period (IAP) course for parallel
programming using OpenMP and OpenMPI.
Result: creating a 4-node cluster for two weeks
came out to about $25 per student using Amazon
EC2.
10
StarMolsim
Use case: students log in to a web application as
a proxy to the computing resources. The web
application handles communicating with the
cluster to submit jobs, retrieve the results, etc.
User
Web Server hosting GenePattern from the Broad
Institute of MIT and Harvard
Result: Amazon EC2 was used to replace a
traditional 9-node HPC cluster for an entire
semester. The cost for using the 9-node EC2
cluster for the semester was around
$3,000-4,000.
11
About StarCluster
StarCluster is a utility for creating and
managing general purpose compute clusters hosted
on Amazon's Elastic Compute Cloud (EC2).
StarCluster makes it easy for a user to create
their own compute cluster on EC2 and pay only for
what they use.
  • StarCluster Dependencies
  • Registered and fully configured EC2 account.
  • Python 2.4
  • Paramiko library for Python
  • Software included in the virtual machine
  • OpenMPI
  • NFS'd /home directory
  • Sun Grid Engine
  • Scipy/Numpy/IPython
  • Compilers for installing your own custom
    software
  • Ubuntu Linux OS with apt-get for installing
    additional OS software

http://web.mit.edu/stardev/cluster
12
StarCluster Features
  • Simple configuration with sensible defaults
  • One command to create and configure an n-node
    cluster on EC2
  • Utilizes Amazon's Elastic Block Storage to store
    and snapshot your applications and data
  • Easily recreate identical working environments
  • 32bit/64bit Ubuntu 9.04 public AMIs

13
StarCluster Features
  • Automatic configuration of:
  • Sun Grid Engine with Parallel Environment (PE)
  • OpenMPI with SGE PE Support
  • NFS shares (e.g. /home and /opt)
  • Passwordless SSH
  • 147GB local scratch space on /scratch for each
    node
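
With Sun Grid Engine and an OpenMPI parallel environment already configured, an MPI job is typically described by a short job script. The sketch below generates such a script as text. The `#$ -N`, `#$ -cwd`, and `#$ -pe` directives and the `$NSLOTS` variable are standard SGE features, but the parallel environment name `orte`, the job name, and the program path are assumptions made for illustration; check the name of the PE configured on your own cluster.

```python
def sge_mpi_job_script(job_name, slots, command, pe_name="orte"):
    """Return the text of an SGE job script that runs an MPI program.

    job_name: name shown in the queue (SGE -N option)
    slots:    number of slots requested from the parallel environment
    command:  the MPI program to launch, e.g. "./hello_mpi"
    pe_name:  parallel environment name; "orte" is an assumption here
    """
    if slots < 1:
        raise ValueError("slots must be at least 1")
    lines = [
        "#!/bin/bash",
        f"#$ -N {job_name}",          # job name
        "#$ -cwd",                    # run from the submission directory
        f"#$ -pe {pe_name} {slots}",  # request slots from the PE
        # SGE sets NSLOTS to the number of slots actually granted.
        f"mpirun -np $NSLOTS {command}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical 4-slot job running a program from the shared /home.
script = sge_mpi_job_script("hello", 4, "./hello_mpi")
```

A script like this would normally be saved to the NFS-shared /home directory and submitted with `qsub`, so that every node sees the same program and working directory.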

14
StarCluster Advantages
  • Portable: launch a cluster from virtually
    anywhere!
  • Supplements existing resources when needed
  • Easily store your applications and data in the
    cloud via EBS. Simply upload your
    applications/data to /home and they will be
    available each time you launch StarCluster.
  • Easy to install additional OS software. Just
    launch the AMI, use the package manager to
    install additional software, and rebundle the AMI
    to create your own customized version of
    StarCluster.
  • Easily package the results of computational
    experiments for reproducible research

15
StarCluster Live Demo
16
EC2 Performance
Summary: message-passing performance is extremely
poor in comparison to local HPC resources.
Embarrassingly parallel workloads do much better
but still underperform local HPC resources.
Walker, E. (2008, October). Benchmarking Amazon
EC2 for high-performance scientific computing.
Retrieved from
http://www.usenix.org/publications/login/2008-10/openpdfs/walker.pdf
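
To make the distinction concrete: an embarrassingly parallel workload is one whose tasks never communicate, so slow interconnects between nodes matter far less than they do for message passing. The following is a minimal sketch of such a workload (a Monte Carlo estimate of pi), not a benchmark from the cited study; all names are illustrative.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def count_hits(task):
    """Count random points that land inside the unit quarter circle.

    `task` is a (seed, samples) pair. Each task owns its random
    generator, so no task reads or writes another task's state.
    """
    seed, samples = task
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(tasks, mapper=map):
    """Estimate pi from independent tasks using any map-like callable."""
    hits = sum(mapper(count_hits, tasks))
    total = sum(samples for _, samples in tasks)
    return 4.0 * hits / total


tasks = [(seed, 20000) for seed in range(8)]

# Run the tasks one after another...
serial = estimate_pi(tasks)

# ...or hand the same tasks to a pool of workers. The only
# communication is returning one integer per task at the end.
with ThreadPoolExecutor(max_workers=4) as pool:
    pooled = estimate_pi(tasks, mapper=pool.map)
```

Because the tasks are independent, the serial and pooled runs produce the same estimate, and the `mapper` argument could equally be a process pool or a scheduler that farms tasks out to cluster nodes. A message-passing code, by contrast, exchanges data between workers during the computation, which is where network latency hurts.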
17
Materials Science Case Study
54 relaxation calculations, 25 and 32 atoms
(C, N, O, H), standard convergence criterion
(Espresso-4.0.5 with MKL 10 and gfortran/gcc
4.3.3)
Worst 58, Best 69
Worst 53, Best 57
18
Getting Started with StarCluster
Point your web browser to
http://web.mit.edu/starcluster
19
AWS Funding Opportunities...
The AWS in Education program offers:
  • Teaching Grants for educators using AWS in
    courses (plus access to selected course content
    resources)
  • Research Grants for academic researchers using
    AWS in their work
  • Project Grants for student organizations
    pursuing entrepreneurial endeavors
  • Tutorials for students who want to use AWS for
    self-directed learning
  • Solutions for university administrators looking
    to use cloud computing to be more efficient and
    cost-effective in the university's IT
    infrastructure

Learn more about AWS in Education programs:
http://aws.amazon.com/education/
20
Acknowledgements
  • Professor Buehler (MIT)
  • Professor Marzari (MIT)
  • Constantinos Evangelinos (MIT)
  • Nicolas Poilvert
  • Nicolas Pinto (MIT)
  • Amazon Web Services

21
Thanks for coming! Any questions?