1
TeraGyroid
  • HPC Applications ready for UKLight

Stephen Pickles <stephen.pickles@man.ac.uk>
http://www.realitygrid.org
http://www.realitygrid.org/TeraGyroid.html
UKLight Town Meeting, NeSC, Edinburgh, 9/9/2004
2
The TeraGyroid Project
  • Funded by EPSRC (UK) & NSF (USA) to join the UK
    e-Science Grid and US TeraGrid
  • application from RealityGrid, a UK e-Science
    Pilot Project
  • 3-month project including work exhibited at SC03
    and SC Global, Nov 2003
  • thumbs up from TeraGrid mid-September, funding
    from EPSRC approved later
  • Main objective was to deliver high-impact science
    that would not have been possible without the
    combined resources of the US and UK grids
  • Study of defect dynamics in liquid crystalline
    surfactant systems using lattice-Boltzmann
    methods
  • featured the world's largest lattice-Boltzmann
    simulation
  • 1024³-cell simulation of the gyroid phase demands
    terascale computing (a rough size estimate
    follows below)
  • hence TeraGyroid
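As a rough size estimate (my arithmetic, using only figures quoted on these
slides: the 1024³ lattice here and the 0.5 TB checkpoint mentioned on the
Strategy slide):

$$1024^3 \approx 1.07 \times 10^{9}\ \text{cells}, \qquad
\frac{0.5\ \mathrm{TB}}{1.07 \times 10^{9}\ \text{cells}} \approx 470\ \text{bytes per cell},$$

i.e. on the order of several dozen double-precision values per lattice site,
which is why a single run of this size demands terascale compute and storage.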

3
Networking
[Diagram: two HPC engines, a visualization engine and storage, linked by flows
of checkpoint files, steering control and status, visualization data, and
compressed video.]
4
LB3D: 3-dimensional Lattice-Boltzmann simulations
  • LB3D code is written in Fortran90 and
    parallelized using MPI (an illustrative
    halo-exchange sketch follows after this list)
  • Scales linearly on all available resources
    (Lemieux, HPCx, CSAR, Linux/Itanium II clusters)
  • Data produced during a single run ranges from
    hundreds of gigabytes to terabytes
  • Simulations require supercomputers
  • High-end visualization hardware (e.g. SGI Onyx,
    dedicated viz clusters) and parallel rendering
    software (e.g. VTK) needed for data analysis
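No LB3D source appears on these slides; purely as an illustration of the MPI
pattern such a code relies on, here is a minimal sketch in Python (mpi4py and
NumPy rather than LB3D's Fortran90) of a one-dimensional slab decomposition of
a 3-D lattice with a one-cell halo exchange. The array shapes, field contents
and file name are stand-ins, not LB3D's.

```python
# Minimal halo-exchange sketch for a 1-D slab decomposition of a 3-D lattice.
# Illustrative only, not LB3D. Run with: mpiexec -n 4 python halo_demo.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

nz_local, ny, nx = 32, 64, 64              # local slab plus full y/x extents
# One ghost plane at each end of the z axis (indices 0 and -1).
field = np.zeros((nz_local + 2, ny, nx), dtype=np.float64)
field[1:-1] = rank                          # dummy payload: interior filled with the rank id

up = (rank + 1) % size                      # periodic neighbours along z
down = (rank - 1) % size

# Send the top interior plane up and receive the bottom ghost plane from below,
# then the reverse, so each rank sees its neighbours' boundary planes.
comm.Sendrecv(field[-2].copy(), dest=up, sendtag=0,
              recvbuf=field[0], source=down, recvtag=0)
comm.Sendrecv(field[1].copy(), dest=down, sendtag=1,
              recvbuf=field[-1], source=up, recvtag=1)

# A local collision/streaming step could now use field[0] and field[-1].
if rank == 0:
    print("ghost planes hold ranks:", field[0, 0, 0], field[-1, 0, 0])
```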

3D datasets showing snapshots from a simulation
of spinodal decomposition: a binary mixture of
water and oil phase-separates. Blue areas denote
high water densities and red visualizes the
interface between the two fluids.
5
Computational Steering of Lattice-Boltzmann
Simulations
  • LB3D instrumented for steering using the
    RealityGrid steering library (a minimal sketch
    of the steering loop follows after this list).
  • Malleable checkpoint/restart functionality allows
    rewinding of simulations and run-time job
    migration across architectures.
  • Steering reduces storage requirements because the
    user can adapt data dumping frequencies.
  • CPU time is saved because users need not wait for
    jobs to finish once they can see that nothing
    relevant is happening.
  • Instead of task farming, parameter searches are
    accelerated by steering through parameter space.
  • Analysis time is significantly reduced because
    less irrelevant data is produced.
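The RealityGrid steering API itself is not shown on these slides; the
following is a self-contained Python sketch of the general pattern only
(advance, poll for commands, adapt output frequency, checkpoint or stop),
with trivial stand-in helpers rather than the real library calls.

```python
# Illustrative steering loop: the pattern only, not the RealityGrid API.
# All helpers here are trivial stand-ins so the sketch runs on its own.
import collections

Command = collections.namedtuple("Command", "name value")

def advance_one_timestep(state):           # stand-in for the LB3D update
    return state + 1

def poll_steering_commands(step):          # stand-in for commands from the middle tier
    if step == 500:
        return [Command("set_dump_frequency", 50)]
    if step == 900:
        return [Command("stop", None)]
    return []

def run_steered(state=0, max_steps=100_000, dump_every=100):
    outputs = []
    for step in range(1, max_steps + 1):
        state = advance_one_timestep(state)
        for cmd in poll_steering_commands(step):
            if cmd.name == "set_dump_frequency":
                dump_every = int(cmd.value)   # user adapts the output rate at run time
            elif cmd.name == "checkpoint":
                pass                          # a malleable checkpoint would be written here
            elif cmd.name == "stop":
                return state, outputs         # user halts a run that is going nowhere
        if step % dump_every == 0:
            outputs.append(step)              # only the data the user still wants is dumped
    return state, outputs

if __name__ == "__main__":
    final_state, dumps = run_steered()
    print("stopped at step", final_state, "with", len(dumps), "dumps")
```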

Applied to the study of the gyroid mesophase of
amphiphilic liquid crystals at unprecedented
space and time scales
6
Parameter space exploration
[Figure: snapshots from a steered exploration of parameter space. Panel
captions: "Cubic micellar phase, high surfactant density gradient"; "Cubic
micellar phase, low surfactant density gradient"; "Initial condition: random
water/surfactant mixture"; "Self-assembly starts"; "Lamellar phase: surfactant
bilayers between water layers"; "Rewind and restart from checkpoint".]
7
Strategy
  • Aim: use federated resources of the US TeraGrid
    and UK e-Science Grid to accelerate the
    scientific process
  • Rapidly map out parameter space using a large
    number of independent small (128³) simulations
  • use job cloning and migration to exploit
    available resources and save equilibration time
  • Monitor their behaviour using on-line
    visualization
  • Hence identify parameters for high-resolution
    simulations on HPCx and Lemieux
  • 1024³ on Lemieux (PSC) takes 0.5 TB to
    checkpoint!
  • create initial conditions by stacking smaller
    simulations with periodic boundary conditions
    (see the tiling sketch after this list)
  • Selected 128³ simulations were used for
    long-time studies
  • All simulations monitored and steered by a
    geographically distributed team of computational
    scientists
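As an illustration of the stacking idea only (LB3D's actual checkpoint format
is not described on these slides, so the array layout below is hypothetical),
a periodic small box can be tiled into a larger initial condition with NumPy;
the periodic boundaries make the copies match at the seams.

```python
# Sketch: build a larger initial condition by periodically tiling a small,
# equilibrated box. The array layout is hypothetical, not LB3D's format.
import numpy as np

# Stand-in for an equilibrated periodic box (the project used 128³ boxes;
# a 16³ toy box keeps this demo small).
small = np.random.default_rng(0).random((16, 16, 16))

# Because the small run used periodic boundary conditions, copies placed side
# by side match at the seams; 8 x 8 x 8 copies of 128³ would give 1024³.
large = np.tile(small, (8, 8, 8))
assert large.shape == (128, 128, 128)

# Seam check: the tiled field is periodic across every copy boundary.
assert np.allclose(large[15, :, :], large[15 + 16 * 3, :, :])
print("tiled field:", large.shape, f"{large.nbytes / 1e6:.1f} MB")
```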

8
The Architecture of Steering
[Architecture diagram: an OGSI middle tier sits between the running
simulations and multiple clients (Qt/C++, .NET on PocketPC, GridSphere
Portlet in Java); remote visualization is delivered through SGI VizServer,
Chromium, and/or streamed to Access Grid. A generic client-side sketch
follows the list below.]
  • Computations run at HPCx, CSAR, SDSC, PSC and
    NCSA
  • Visualizations run at Manchester, UCL, Argonne,
    NCSA, Phoenix
  • Scientists in 4 sites steer calculations,
    collaborating via Access Grid
  • Visualizations viewed remotely
  • Grid services run anywhere
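The OGSI service interfaces themselves are not reproduced on these slides;
this is only a generic, self-contained sketch of the idea that several
different client front-ends steer a simulation through one shared middle-tier
interface. The class and method names are invented for illustration.

```python
# Generic sketch: many thin clients, one steering middle tier.
# Invented names; the real system used OGSI grid services, not this class.

class SteeringMiddleTier:
    """Stand-in for the middle tier brokering between clients and a running job."""

    def __init__(self):
        self.status = {"step": 0, "dump_every": 1000}   # published by the simulation
        self.pending = []                               # commands queued for the simulation

    def get_status(self):
        """What any client (Qt GUI, PocketPC, portlet) polls to monitor the run."""
        return dict(self.status)

    def send_command(self, name, value=None):
        """What any client calls to steer; the simulation drains this queue."""
        self.pending.append((name, value))


tier = SteeringMiddleTier()
tier.send_command("set_dump_frequency", 250)   # e.g. issued from a desktop client
tier.send_command("checkpoint")                # e.g. issued from a handheld client
print(tier.get_status(), tier.pending)
```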

9
SC Global 03 Demonstration
10
TeraGyroid Testbed
[Network diagram: UK sites (Manchester, Daresbury, UCL) and US sites (ANL,
NCSA, SDSC, PSC, Caltech) plus the SC03 show floor in Phoenix, connected
across the Atlantic via Starlight (Chicago) and Netherlight (Amsterdam).
Link annotations: 10 Gbps; BT provision; 2 x 1 Gbps; MB-NG; SJ4; production
network. Legend: visualization; computation; Access Grid node; service
registry; network PoP; dual-homed system.]
11
Trans-Atlantic Network
  • Collaborators
  • Manchester Computing
  • Daresbury Laboratory Networking Group
  • MB-NG and UKERNA
  • UCL Computing Service
  • BT
  • SurfNET (NL)
  • Starlight (US)
  • Internet-2 (US)

12
TeraGyroid Hardware Infrastructure
  • Computation (using more than 6000 processors)
    including
  • HPCx (Daresbury), 1280 procs IBM Power4 Regatta,
    6.6 Tflops peak, 1.024 TB
  • Lemieux (PSC), 3000 procs HP/Compaq, 3TB memory,
    6 Tflops peak
  • TeraGrid Itanium2 cluster (NCSA), 256 procs, 1.3
    Tflops peak
  • TeraGrid Itanium2 cluster (SDSC), 256 procs, 1.3
    Tflops peak
  • Green (CSAR), SGI Origin 3800, 512 procs, 0.512
    TB memory (shared)
  • Newton (CSAR), SGI Altix 3700, 256 Itanium 2
    procs, 384GB memory (shared)
  • Visualization
  • Bezier (Manchester), SGI Onyx 300, 6 x IR3,
    32 procs
  • Dirac (UCL), SGI Onyx 2, 2 x IR3, 16 procs
  • SGI loan machine, Phoenix, SGI Onyx, 1 x IR4,
    1 x IR3, commissioned on site
  • TeraGrid Visualization Cluster (ANL), Intel Xeon
  • SGI Onyx (NCSA)
  • Service Registry
  • Frik (Manchester), Sony Playstation2
  • Storage
  • 20 TB of science data generated in project
  • 2 TB moved to long term storage for on-going
    analysis - Atlas Petabyte Storage System (RAL)
  • Access Grid nodes at Boston University, UCL,
    Manchester, Martlesham, Phoenix (4)

13
Network lessons
  • Less than three weeks to debug networks
  • applications people and network people nodded
    wisely but didn't understand each other
  • middleware such as GridFTP is infrastructure to
    applications folk, but an application to network
    folk
  • rapprochement necessary for success
  • Grid middleware not designed with dual-homed
    systems in mind
  • HPCx, CSAR (Green) and Bezier are busy production
    systems
  • had to be dual-homed on SJ4 and MB-NG
  • great care needed with routing
  • complication: we needed to drive everything from
    laptops that couldn't see the MB-NG network
  • Many other problems encountered
  • but nothing that can't be fixed once and for all
    given persistent infrastructure

14
Measured Transatlantic Bandwidths during SC03
15
TeraGyroid Summary
  • Real computational science...
  • Gyroid mesophase of amphiphilic liquid crystals
  • Unprecedented space and time scales
  • investigating phenomena previously out of reach
  • ...on real Grids...
  • enabled by high-bandwidth networks
  • ...to reduce time to insight

[Images: dislocations; interfacial surfactant density.]
16
TeraGyroid Collaborating Organisations
  • Our thanks to hundreds of individuals at...
  • Argonne National Laboratory (ANL)
  • Boston University
  • BT
  • BT Exact
  • Caltech
  • CSC
  • Computing Services for Academic Research (CSAR)
  • CCLRC Daresbury Laboratory
  • Department of Trade and Industry (DTI)
  • Edinburgh Parallel Computing Centre
  • Engineering and Physical Sciences Research
    Council (EPSRC)
  • Forschungzentrum Juelich
  • HLRS (Stuttgart)
  • HPCx
  • IBM
  • Imperial College London
  • National Center for Supercomputer Applications
    (NCSA)

17
The TeraGyroid Experiment
  • S. M. Pickles¹, R. J. Blake², B. M. Boghosian³,
    J. M. Brooke¹, J. Chin⁴, P. E. L. Clarke⁵,
    P. V. Coveney⁴, N. González-Segredo⁴,
    R. Haines¹, J. Harting⁴, M. Harvey⁴,
    M. A. S. Jones¹, M. Mc Keown¹, R. L. Pinning¹,
    A. R. Porter¹, K. Roy¹, and M. Riding¹
  • ¹ Manchester Computing, University of Manchester
  • ² CLRC Daresbury Laboratory, Daresbury
  • ³ Tufts University, Massachusetts
  • ⁴ Centre for Computational Science, University
    College London
  • ⁵ Department of Physics & Astronomy, University
    College London

http://www.realitygrid.org
http://www.realitygrid.org/TeraGyroid.html
18
New Application at AHM2004
Exact calculation of peptide-protein binding
energies by steered thermodynamic integration
using high-performance computing grids.
  • Philip Fowler, Peter Coveney, Shantenu Jha and
    Shunzhou Wan
  • UK e-Science All Hands Meeting
  • 31 August - 3 September 2004

19
Why are we studying this system?
  • Measuring binding energies is vital for, e.g.,
    designing new drugs.
  • Calculating a peptide-protein binding energy can
    take weeks to months.
  • We have developed a grid-based method to
    accelerate this process.

To compute ΔΔG_bind during the AHM 2004 conference,
i.e. in less than 48 hours, using the federated
resources of the UK National Grid Service and the
US TeraGrid.
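For context, the standard thermodynamic integration relations behind these
slides (not written out on the slides themselves): each leg of the
thermodynamic cycle integrates the ensemble average of the λ-derivative of
the Hamiltonian, and the relative binding free energy is the difference
between the bound and free legs,

$$\Delta G = \int_0^1 \left\langle \frac{\partial H(\lambda)}{\partial \lambda} \right\rangle_{\lambda} d\lambda,
\qquad
\Delta\Delta G_{\mathrm{bind}} = \Delta G_{\mathrm{bound}} - \Delta G_{\mathrm{free}}.$$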
20
Thermodynamic Integration on Computational Grids
Use steering to launch, spawn and terminate λ-jobs

[Workflow diagram (λ vs. time): starting from a single conformation, seed
successive simulations at λ = 0.1, 0.2, 0.3, ..., 0.9 (10 sims, each 2 ns);
run each independent job on the Grid; check for convergence; then combine
the results and calculate the integral (a toy sketch of this last step
follows).]
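A toy, self-contained illustration of the "combine and calculate integral"
step (synthetic data and an invented layout; the project's actual analysis
scripts are not shown on these slides):

```python
# Toy sketch of the final TI step: average dH/dlambda in each lambda window,
# discard a burn-in, then integrate over lambda by the trapezoidal rule.
import numpy as np

rng = np.random.default_rng(1)
lambdas = np.linspace(0.1, 0.9, 9)                 # one window per lambda-job

# Synthetic per-window time series of dH/dlambda (kcal/mol); in practice these
# would be parsed from each lambda-job's MD output files.
dhdl = {lam: 5.0 * (lam - 0.5) + rng.normal(0.0, 0.5, size=2000)
        for lam in lambdas}

burn_in = 500                                       # crude equilibration cut-off
means = np.array([dhdl[lam][burn_in:].mean() for lam in lambdas])

# Trapezoidal quadrature of <dH/dlambda> over lambda gives Delta G for one leg;
# repeating for the bound and free legs gives DeltaDelta G_bind.
delta_g = np.sum(0.5 * (means[1:] + means[:-1]) * np.diff(lambdas))
print(f"Delta G for this leg ~ {delta_g:.2f} kcal/mol")
```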
21
[Figure: checkpointing; steering and control; monitoring.]
22
We successfully ran many simulations
  • This is the first time we have completed an
    entire calculation.
  • Insight gained will help us improve the
    throughput.
  • The simulations were started at 5 pm on Tuesday
    and the data were collated at 10 am on Thursday.
  • 26 simulations were run
  • At 4.30pm on Wednesday, we had nine simulations
    in progress (140 processors)
  • 1x TG-SDSC, 3x TG-NCSA, 3x NGS-Oxford, 1x
    NGS-Leeds, 1x NGS-RAL
  • We simulated over 6.8 ns of classical molecular
    dynamics in this time

23
Very preliminary results
ΔΔG (kcal/mol):
  Experiment: -1.0 ± 0.3
  Quick-and-dirty analysis (as at 41 hours): -9 to -12

We expect our value to improve with further
analysis around the endpoints.
24
Conclusions
  • We can harness today's grids to accelerate
    high-end computational science
  • On-line visualization and job migration require
    high-bandwidth networks
  • Need persistent network infrastructure
  • else set-up costs are too high
  • QoS: would like the ability to reserve bandwidth
  • and processors, graphics pipes, AG rooms, virtual
    venues, nodops... (but that's another story)
  • Hence our interest in UKLight