Title: Real Science at the Petascale
1. Real Science at the Petascale
- Radhika S. Saksena (1), Bruce Boghosian (2), Luis Fazendeiro (1), Owain A. Kenway, Steven Manos (1), Marco Mazzeo (1), S. Kashif Sadiq (1), James L. Suter (1), David Wright (1) and Peter V. Coveney (1)
- 1. Centre for Computational Science, UCL, UK
- 2. Tufts University, Boston, USA
2. Contents
- New era of petascale resources
- Scientific applications at petascale
- Unstable periodic orbits in turbulence
- Liquid crystalline rheology
- Clay-polymer nanocomposites
- HIV drug resistance
- Patient specific haemodynamics
- Conclusions
3. New era of petascale machines
- Ranger (TACC): NSF-funded Sun cluster
- 0.58 petaflops theoretical peak: 10 times HECToR (59 Tflops) and bigger than all other TeraGrid resources combined
- Linpack speed 0.31 petaflops, 123 TB memory
- Architecture: 82 racks; 1 rack = 4 chassis; 1 chassis = 12 nodes; 1 node = Sun Blade x6420 (four quad-core AMD Opteron processors, i.e. 16 cores per node)
- 3,936 nodes, 62,976 cores
- Intrepid (ALCF): DOE-funded Blue Gene/P
- 0.56 petaflops theoretical peak
- 163,840 cores, 80 TB memory
- Linpack speed 0.45 petaflops
- Fastest machine available for open science and third overall (1)
1. http://www.top500.org/lists/2008/06
4New era of petascale machines
- US firmly committed to path to petascale (and
beyond) - NSF Ranger (5 years, 59 million award)
- University of Tennessee, to build system with
just under 1PF - peak performance (65 million, 5-year
project)1 - Blue Waters will come online in 2011 at NCSA
(208 grant), using - IBM technology to deliver peak 10 Pflops
performance - ( 200K cores, 10PB of disk)
- 1. http//www.nsf.gov/news/news_summ.jsp?cntn_id1
09850
5. New era of petascale machines
- We wish to do new science at this scale, not just make incremental advances
- Applications that scale linearly up to tens of thousands of cores (large system sizes, many time steps): capability computing at the petascale
- High throughput for intermediate-scale applications (in the 128 to 512 core range)?
6. Intercontinental HPC grid environment
[Diagram: UK NGS, US TeraGrid and DEISA resources, including HECToR, NCSA, SDSC, PSC, TACC (Ranger) and ANL (Intrepid), linked by lightpaths and accessed through the AHE]
- Massive data transfers
- Advanced reservation / co-scheduling
- Emergency / pre-emptive access
7. Lightpaths: dedicated 1 Gb UK/US network
- JANET Lightpath is a centrally managed service which supports large research projects on the JANET network by providing end-to-end connectivity, from 100s of Mb up to whole fibre wavelengths (10 Gb)
- Typical usage:
- Dedicated 1 Gb network to connect to national and international HPC infrastructure
- Shifting TB datasets between the UK and US (a rough transfer-time estimate follows this slide)
- Real-time visualisation
- Interactive computational steering
- Cross-site MPI runs (e.g. between NGS2 Manchester and NGS2 Oxford)
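As a quick sanity check on the data-shifting bullet, the sketch below estimates how long a terabyte-scale dataset takes to move over the lightpath, assuming an idealised 1 Gb/s line rate with no protocol overhead or contention (real transfers will be somewhat slower):

```python
# Idealised transfer time for a terabyte-scale dataset over a 1 Gb/s lightpath.
# Assumes the full line rate is achieved, with no protocol overhead or contention.
dataset_bytes = 1 * 10**12          # 1 TB dataset
link_bits_per_second = 1 * 10**9    # dedicated 1 Gb/s lightpath

seconds = dataset_bytes * 8 / link_bits_per_second
print(f"~{seconds / 3600:.1f} hours per TB at line rate")   # about 2.2 hours
```

At roughly 2.2 hours per terabyte under these idealised assumptions, multi-terabyte result sets can realistically be moved between sites overnight on the dedicated link.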
8. Advanced reservations
- Plan in advance to have access to the resources: the process of reserving multiple resources for use by a single application
- HARC (1) - Highly Available Resource Co-Allocator
- GUR (2) - Grid Universal Remote
- Can reserve the resources for the same time:
- Distributed MPIg/MPICH-G2 jobs
- Distributed visualization
- Booking equipment (e.g. visualization facilities)
- Or for some coordinated set of times:
- Computational workflows
- Urgent computing and pre-emptive access (SPRUCE)
1. http://www.realitygrid.org/middleware.shtml#HARC
2. http://www.ncsa.uiuc.edu/UserInfo/Resources/Hardware/TGIA64LinuxCluster/Doc/coschedule.html
9. Advanced reservations
- Also available via the HARC API, which can easily be built into Java applications
- Deployed on a number of systems:
- LONI (ducky, bluedawg, zeke, neptune IBM p5 clusters)
- TeraGrid (NCSA and SDSC IA64 clusters, Lonestar, Ranger(?))
- HPCx
- North West Grid (UK)
- UK National Grid Service (NGS): Manchester, Oxford, Leeds
10. Application Hosting Environment
- Middleware which simplifies access to distributed resources and manages workflows
- Wrestling with middleware can't be a limiting step for scientists: the AHE hides the complexities of the grid from the end user
- Applications are stateful Web services
- An application can consist of a coupled model, parameter sweep, steerable application, or a single executable
11. HYPO4D (1) (Hydrodynamic periodic orbits in 4D)
- Scientific goal: to identify and characterize periodic orbits in turbulent fluid flow (from which time averages can be computed exactly)
- Uses the lattice-Boltzmann method, which is highly scalable (linear scaling up to at least 33K cores on Intrepid and close to linear up to 65K); a minimal illustration of the method follows this slide
[Figure: scaling plots on (a) Ranger and (b) Intrepid/Surveyor (Blue Gene/P)]
1. L. Fazendeiro et al., A novel computational approach to turbulence, AHM08
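For readers unfamiliar with the method, the sketch below is a minimal single-phase D2Q9 BGK lattice-Boltzmann loop in NumPy. It is purely illustrative: HYPO4D (and LB3D, later in this talk) are 3D/4D, multi-component, MPI-parallel codes, and the grid size, relaxation time and initial condition here are toy assumptions, not values taken from those codes.

```python
# Minimal single-phase D2Q9 BGK lattice-Boltzmann loop (illustration only).
import numpy as np

nx, ny, tau = 64, 64, 0.8

# D2Q9 velocity set and weights
c = np.array([[0, 0], [1, 0], [0, 1], [-1, 0], [0, -1],
              [1, 1], [-1, 1], [-1, -1], [1, -1]])
w = np.array([4/9] + [1/9]*4 + [1/36]*4)

def equilibrium(rho, ux, uy):
    """Second-order Maxwell-Boltzmann equilibrium for D2Q9."""
    cu = c[:, 0, None, None]*ux + c[:, 1, None, None]*uy
    usq = ux**2 + uy**2
    return w[:, None, None] * rho * (1 + 3*cu + 4.5*cu**2 - 1.5*usq)

# Initial condition: unit density with a small decaying shear wave.
y = np.arange(ny)[None, :] * np.ones((nx, 1))
ux0 = 0.05 * np.sin(2*np.pi*y/ny)
f = equilibrium(np.ones((nx, ny)), ux0, np.zeros((nx, ny)))

for step in range(200):
    # Streaming: shift each population along its lattice velocity (periodic box).
    for i in range(9):
        f[i] = np.roll(np.roll(f[i], c[i, 0], axis=0), c[i, 1], axis=1)
    # Macroscopic moments.
    rho = f.sum(axis=0)
    ux = (f * c[:, 0, None, None]).sum(axis=0) / rho
    uy = (f * c[:, 1, None, None]).sum(axis=0) / rho
    # BGK collision: relax toward the local equilibrium.
    f -= (f - equilibrium(rho, ux, uy)) / tau

print("max |ux| after 200 steps:", np.abs(ux).max())   # shear wave decays viscously
```

Because every step touches only a site and its immediate neighbours, scaling this pattern to tens of thousands of cores amounts to decomposing the lattice across MPI ranks and exchanging thin halo layers each step, which is why the method scales so well on Ranger and Intrepid.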
12. HYPO4D (1) (Hydrodynamic periodic orbits in 4D)
- Novel approach to turbulence studies: it efficiently parallelizes both time and space
- The algorithm is extremely memory-intensive: full spacetime trajectories are numerically relaxed to a nearby minimum (an unstable periodic orbit); a toy sketch of this relaxation idea follows this slide
- Ranger is an ideal resource for this work (123 TB of RAM)
- During the early-user period, millions of time steps for different systems were simulated and then compared for similarities: 9 TB of data
1. L. Fazendeiro et al., A novel computational approach to turbulence, AHM08
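To make the "relax a whole spacetime trajectory" idea concrete, the sketch below applies the same variational principle to the Lorenz system rather than a lattice-Boltzmann flow: a closed loop of states and an unknown period are adjusted together so that the loop comes as close as possible to satisfying the equations of motion. Everything here (the toy system, the loop resolution N, the initial guess) is an illustrative assumption, not HYPO4D's actual algorithm or code.

```python
# Toy sketch of relaxing a whole closed trajectory toward a periodic orbit.
import numpy as np
from scipy.optimize import least_squares

def lorenz(x, sigma=10.0, rho=28.0, beta=8.0/3.0):
    return np.array([sigma*(x[1] - x[0]),
                     x[0]*(rho - x[2]) - x[1],
                     x[0]*x[1] - beta*x[2]])

N = 60   # points along the candidate loop (toy resolution)

def residual(params):
    T = params[-1]                       # unknown period
    loop = params[:-1].reshape(N, 3)
    dt = T / N
    res = []
    for i in range(N):
        # Centred finite difference around the loop (periodic in "loop time").
        dxdt = (loop[(i + 1) % N] - loop[(i - 1) % N]) / (2*dt)
        res.append(dxdt - lorenz(loop[i]))
    return np.concatenate(res)

# Crude initial guess: a circle near the attractor and a guessed period.
theta = np.linspace(0, 2*np.pi, N, endpoint=False)
guess = np.column_stack([10*np.cos(theta), 10*np.sin(theta), 25 + 5*np.sin(theta)])
x0 = np.append(guess.ravel(), 1.5)

lower = np.full(x0.size, -np.inf)
lower[-1] = 0.5                          # keep the period positive
sol = least_squares(residual, x0, bounds=(lower, np.inf))
print("periodicity residual:", np.linalg.norm(sol.fun), "period:", sol.x[-1])
```

With a crude guess like this the relaxation may settle on a fixed point or a short orbit; in practice useful seeds come from near-recurrences found by comparing stored time steps, as described above. For a turbulent lattice-Boltzmann flow the unknown is the entire four-dimensional spacetime field, which is what makes the method so memory-hungry and Ranger's 123 TB of RAM so attractive.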
13. LB3D (1)
- LB3D: a three-dimensional lattice-Boltzmann solver for multi-component fluid dynamics, in particular amphiphilic systems
- Mature code, 9 years in development; it has been used extensively on the US TeraGrid, UK NGS, HECToR and HPCx machines
- The largest model simulated to date is 2048³ lattice sites, which needs Ranger (see the memory estimate after this slide)
1. R. S. Saksena et al., Petascale lattice-Boltzmann simulations of amphiphilic liquid crystals, AHM08
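A rough memory estimate shows why a 2048³ system needs a machine of Ranger's size. The numbers below are assumptions for illustration (a D3Q19 velocity set, three fluid components, double precision, no halo or I/O buffers) rather than LB3D's actual data layout, but the conclusion does not change much either way.

```python
# Back-of-envelope memory estimate for a 2048^3 multi-component LB run.
sites = 2048**3
velocities = 19          # assumed D3Q19 lattice
components = 3           # e.g. oil, water and amphiphile
bytes_per_value = 8      # double precision

distribution_bytes = sites * velocities * components * bytes_per_value
print(f"distribution functions alone: {distribution_bytes / 2**40:.1f} TiB")  # ~3.6 TiB
```

Many lattice-Boltzmann codes also keep a second copy of the distributions during streaming, and halo layers and I/O buffers add more on top, so a 2048³ multi-component run sits comfortably beyond a departmental cluster but well within Ranger's 123 TB.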
14. Cubic Phase Rheology Results (1)
- Recent results include the tracking of large-time-scale defect dynamics on 1024³-lattice-site systems, only possible on Ranger due to the sustained core count and disk storage requirements
- Regions of high stress magnitude are localized in the vicinity of defects
[Figure: a 256³-lattice-site gyroidal system with multiple domains]
1. R. S. Saksena et al., Petascale lattice-Boltzmann simulations of amphiphilic liquid crystals, AHM08
15. LAMMPS (1)
- Fully-atomistic simulations of clay-polymer nanocomposites on Ranger
- More than 85 million atoms simulated
- Clay mineral studies with 3 million atoms: 2-3 orders of magnitude larger than any previous study
- Prospects: to include the edges of the clay (rather than periodic boundaries) and run realistic-sized models of at least 100 million atoms (2 weeks wall clock, using 4096 cores)
1. J. Suter et al., Grid-Enabled Large-Scale Molecular Dynamics of Clay Nano-materials, AHM08
16. HIV-1 drug resistance (1)
- Goal: to study the effect of anti-retroviral inhibitors (targeting proteins in the HIV lifecycle, such as the viral protease and reverse-transcriptase enzymes)
- High-end computational power to confer clinical decision support
- On Ranger, up to 100 replicas (configurations) were simulated for the first time, in some cases going to 100 ns (a schematic of this ensemble approach follows this slide)
- 3.5 TB of trajectory and free energy analysis
[Figure: binding energy differences compared with experimental results for wildtype and MDR proteases with the inhibitors LPV and RTV, using 10 ns trajectories]
- 6 microseconds of simulation in four weeks
- AHE-orchestrated workflows
1. K. Sadiq et al., Rapid, Accurate and Automated Binding Free Energy Calculations of Ligand-Bound HIV Enzymes for Clinical Decision Support using HPC and Grid Resources, AHM08
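The schematic below illustrates the ensemble idea behind these numbers: many independent replica trajectories, a per-snapshot binding energy estimate, and an ensemble mean with a bootstrap error bar. The function snapshot_energy is a hypothetical placeholder (in the real workflow an MM-PBSA-style calculation would be run on stored trajectory frames), and the replica counts and energies are toy values, not the study's results.

```python
# Schematic of the ensemble approach to binding free energies (not the authors' pipeline).
import numpy as np

rng = np.random.default_rng(0)

def snapshot_energy(replica, frame):
    # Hypothetical placeholder: a real workflow would post-process the stored MD frame here.
    return -45.0 + rng.normal(scale=3.0)

n_replicas, n_frames = 50, 200
per_replica = np.array([
    np.mean([snapshot_energy(r, f) for f in range(n_frames)])
    for r in range(n_replicas)
])

mean = per_replica.mean()
boot = [rng.choice(per_replica, n_replicas, replace=True).mean() for _ in range(1000)]
print(f"binding energy ~ {mean:.1f} +/- {np.std(boot):.1f} (toy units)")
```

The appeal of this approach on a machine like Ranger is that the replicas are independent jobs of modest size, which is also why the slide mentions AHE-orchestrated workflows rather than a single monolithic run.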
17. GENIUS project (1)
- Grid Enabled Neurosurgical Imaging Using Simulation (GENIUS)
- Scientific goal: to perform real-time, patient-specific medical simulation
- Combines blood flow simulation with clinical data
- Fitting the computational time scale to the clinical time scale:
- Capture the clinical workflow
- Get results which will influence clinical decisions: in 1 day? 1 week? GENIUS aims for 15 to 30 minutes
1. S. Manos et al., Surgical Treatment for Neurovascular Pathologies Using Patient-specific Whole Cerebral Blood Flow Simulation, AHM08
18. GENIUS project (1)
- Blood flow is simulated using the lattice-Boltzmann method (HemeLB)
- A parallel ray tracer performs real-time in situ visualization
- Sub-frames are rendered on each MPI rank and composited before being sent over the network to a (lightweight) viewing client (a minimal sketch of this compositing step follows this slide)
- Adding volume rendering cuts down the scalability of the fluid solver because of the global communications it requires
- Even so, datasets are rendered at more than 30 frames per second at 1024² pixel resolution
1. S. Manos et al., Surgical Treatment for Neurovascular Pathologies Using Patient-specific Whole Cerebral Blood Flow Simulation, AHM08
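The sketch below shows the general sub-frame compositing pattern described above, using mpi4py and a nearest-depth rule: each rank produces a colour and depth image for its own subdomain, and rank 0 merges them before the frame would be streamed to the viewer. It is not HemeLB's code; the image size, the random stand-in for the ray tracer, and the gather-to-root strategy are assumptions made for the illustration.

```python
# Minimal sketch of per-rank sub-frame rendering plus depth compositing (not HemeLB's code).
# Run with e.g.:  mpirun -n 4 python composite_sketch.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

H = W = 256                                   # toy frame resolution
rng = np.random.default_rng(rank)

# Stand-in for the local ray tracer: colour + depth of this rank's subdomain.
colour = rng.random((H, W, 3))
depth = rng.random((H, W)) + rank             # pretend geometry at varying depths

frames = comm.gather((colour, depth), root=0)

if rank == 0:
    final = np.zeros((H, W, 3))
    zbuf = np.full((H, W), np.inf)
    for c, d in frames:
        nearer = d < zbuf                     # keep whichever pixel is closest to the camera
        final[nearer] = c[nearer]
        zbuf[nearer] = d[nearer]
    # 'final' is the composited frame that would be sent to the lightweight client.
    print("composited frame:", final.shape)
```

Gathering every sub-frame to a single rank is the simplest possible compositing strategy and would itself become a bottleneck at scale; the point of the sketch is only the division of labour between per-rank rendering, compositing, and a lightweight remote viewer, which is also why adding volume rendering (and its global communication) eats into the fluid solver's scalability.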
19. Conclusions
- A wide range of scientific research activities were presented that make effective use of the new petascale resources available in the USA
- These demonstrate the emergence of new science that is not possible without access to this scale of resource
- Some existing techniques still hold, however, such as MPI: as several of these applications have shown, codes can scale linearly up to at least tens of thousands of cores
- Future prospects: we are well placed to move onto the next machines coming online in the US and Japan
20. Acknowledgements
JANET / David Salmon, NGS staff, TeraGrid staff, Simon Clifford (CCS), Jay Boisseau (TACC), Lucas Wilson (TACC), Pete Beckman (ANL), Ramesh Balakrishnan (ANL), Brian Toonen (ANL), Prof. Nicholas Karonis (ANL)