Supercomputer Platforms and Its Applications (presentation transcript)
1
Supercomputer Platforms and Its Applications
Dr. George Chiu, IBM T.J. Watson Research Center
2
Plasma Science: International Challenges
  • Microturbulence Transport
  • What causes plasma transport?
  • Macroscopic Stability
  • What limits the pressure in plasmas?
  • Wave-particle Interactions
  • How do particles and plasma waves interact?
  • Plasma-wall Interactions
  • How can high-temperature plasma and material
    surfaces co-exist?

3
2007-2008 Deep Computing Roadmap Summary
(roadmap chart spanning 1H07, 2H07, 1H08 and 2H08)
  • System p servers: p5 560Q, P6H, PHV8, HV4, p6 IH,
    p6 blades (PL4/ML16, PL4/ML32), 11S0, 11S2
  • System p software: JS21 IB AIX solution (CSM
    1.6/RSCT 2.4.7, GPFS 3.1, LoadLeveler 3.4.1,
    PESSL 3.3, PE 4.3.1), PERCS system design
    analysis; initial p6 support for SMPs and
    Ethernet (GPFS 3.2 filesystem management, CSM
    1.7); p6 IH/blades IB solutions on AIX 5.3 and
    SLES 10, initial AIX 6.1 support for SMPs and
    Ethernet; p6 IH/blades IB solutions on AIX 6.1
    (CSM 1.7.0.x, GPFS 3.3, LoadLeveler 3.5, PE 5.1,
    ESSL 4.4, PESSL 3.3)
  • System x servers: x3455 DC, x3455 QC (Barcelona),
    x3550 QC, x3550 Harpertown/Greencreek refresh,
    x3850 QC, x3755 QC, iDPX Stoakley planar, iDPX
    Thurley planar, LS blades -> Barcelona QC, HS21,
    LS21, LS41
  • System x software: CSM 1.6/RSCT 2.4.7; GPFS 3.2
    support for System x/1350, RHEL 5 support, CSM
    RHEL 5 support; CSM 1.7 for System x/1350; GPFS
    3.3 and CSM 1.7.0.x support for System x/1350
  • Workstations: M50 R1, M60 R1, M60, Z30 R1, Z40
    R1, Z40; APro elimination impacts DCV
  • Blue Gene: Blue Gene/L (EOL), BG/P LA, BG/P 1st
    petaflop (dependent on BG client demand)
  • Blue Gene software: BlueGene/P support with GPFS
    3.2, CSM 1.7, LoadLeveler 3.4.2, ESSL 4.3.1
  • Cell BE: QS20, QS21 prototype, QS21, QS22, system
    accept; SDK 2.1, 3.0, 4.0, 5.0
  • System Storage: DDN OEM agreement, DCS9550,
    DS4800 with EXP100 attach, DS4800 follow-on,
    DS4700 for HPC
  • Server systems legend: specific and exclusive;
    repurposed (neither specific nor exclusive);
    specific but not exclusive
Source: IBM Deep Computing Strategy, 7.18.07
4
IBM HPC roadmap
Power 7
Power 6
Power 5
Clusters and Blades
5
IBM HPC conceptual roadmap: POWER
  • The POWER series is IBM's mainstream computing
    offering
  • Market is about 60% commercial and 40% technical
  • Product line value proposition
  • General purpose computing engine
  • Robustness, security and reliability fitting
    mission-critical requirements
  • Standard programming model and interfaces
  • Performance leadership with competitive
    performance/price value
  • Robust integration with industry standards
    (hardware and software)
  • Current status
  • POWER6 announced
  • POWER7 is underway

Power 7
Power 6
Power 5
6
ASC Purple
  • 100 TF machine based on POWER5
  • 1,500 8-way POWER5 nodes
  • Federation (HPS), 12K CPUs
  • (1,500 nodes, 2-plane fat-tree topology, 2x2
    GB/s links)
  • Communication libraries: < 5 µs latency, 1.8
    GB/s unidirectional (a simple latency/bandwidth
    model is sketched below)
  • GPFS: 122 GB/s
  • Supports NIF (National Ignition Facility)
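The latency and bandwidth figures above translate into message times via the usual latency/bandwidth (alpha-beta) model; a minimal sketch, assuming the slide's 5 µs and 1.8 GB/s values and purely illustrative message sizes:

```python
# Minimal alpha-beta message-time model: T(n) = alpha + n / beta.
# alpha (latency) and beta (bandwidth) come from the slide;
# the message sizes below are illustrative only.

ALPHA = 5e-6   # latency in seconds (< 5 us per the slide)
BETA = 1.8e9   # unidirectional bandwidth in bytes/s (1.8 GB/s)

def transfer_time(nbytes: float) -> float:
    """Estimated time to send one message of nbytes over the link."""
    return ALPHA + nbytes / BETA

for size in (1e3, 1e6, 1e9):   # 1 KB, 1 MB, 1 GB messages
    print(f"{size:>12.0f} bytes: {transfer_time(size) * 1e6:12.1f} us")
```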

7
POWER Server Roadmap
(process technology by generation: POWER4 at 180 nm
in 2001, POWER4+ at 130 nm in 2002-03, POWER5 at
130 nm in 2004, POWER5+ at 90 nm in 2005-06, POWER6
at 65 nm in 2007)
  • POWER4: chip multiprocessing, distributed switch,
    shared L2, dynamic LPARs (16)
  • POWER4+: reduced size, lower power, larger L2,
    more LPARs (32)
  • POWER5: simultaneous multi-threading,
    sub-processor partitioning, dynamic firmware
    updates, enhanced scalability and parallelism,
    high throughput performance, enhanced memory
    subsystem
  • POWER6: ultra high frequency, very large L2,
    robust error recovery, high ST and HPC
    performance, high throughput performance, more
    LPARs (1024), enhanced memory subsystem
  • Across generations: autonomic computing
    enhancements, advanced system features (switch)
Planned to be offered by IBM. All statements about
IBM's future direction and intent are subject to
change or withdrawal without notice and represent
goals and objectives only.

8
MareNostrum at a Glance
Challenge
  • Deliver world-class deep-computing and
    e-Science services with an attractive
    cost/performance ratio
  • Enable collaboration among leading scientific
    teams in the areas of biology, chemistry,
    medicine, earth sciences and physics
Innovation
  • Efficient integration of commercially available
    commodity components
  • Modular and scalable open cluster architecture:
    computing, storage, networking, software,
    management, applications
  • Diskless capability: improves node reliability,
    reducing installation and maintenance costs
  • Record cluster density and power efficiency
  • Leading price/performance and TCO in High
    Performance Computing
System: IBM e1350 capability Linux cluster platform
comprising 42 IBM eServer p615 servers, 2,560 IBM
eServer BladeCenter JS21 servers and IBM
TotalStorage hardware; 94 TF DP (64-bit), 186 TF SP
(32-bit), 376 Tops (8-bit), 20 TB RAM, 370 TB disk,
Linux 2.6, 120 m², 750 kW; #1 in Europe, #9 in the
TOP500
9
IBM HPC conceptual roadmap: Blue Gene
  • Blue Gene focuses on ultra-scalability
  • Blue Gene works best for applications that
  • scale naturally to 100s, 1,000s or 100,000s of
    processors,
  • tolerate a relatively small amount of memory per
    processor.
  • For these applications, Blue Gene offers
  • Best-of-breed performance/price value.
  • Lowest operating costs, through a small footprint
    and low power per unit of performance.

Power 7
Power 6
Power 5
10
BlueGene/P
Packaging hierarchy (the peak-performance roll-up is
sketched below):
  • Chip: 4 processors, 13.6 GF/s, 8 MB EDRAM
  • Compute card: 1 chip, 20 DRAMs, 13.6 GF/s, 2.0 GB
    DDR2 (4.0 GB is an option)
  • Node card: 32 compute cards, 435 GF/s, 64 GB
  • Rack: 32 node cards, 13.9 TF/s, 2 TB
  • System: 72 racks (72x32x32 torus, cabled 8x8x16),
    1 PF/s, 144 TB
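The per-level peak numbers follow from a simple roll-up of the per-chip rate; a minimal sketch, assuming an 850 MHz clock and 4 flops/cycle per core (values consistent with, but not stated on, the slide):

```python
# Roll up BlueGene/P peak performance through the packaging hierarchy.
# Counts come from the slide; the 850 MHz clock and 4 flops/cycle per
# core are assumptions that reproduce the quoted 13.6 GF/s per chip.

cores_per_chip  = 4
flops_per_cycle = 4        # assumed (e.g. dual FPU with fused multiply-add)
clock_hz        = 850e6    # assumed

chip_gflops      = cores_per_chip * flops_per_cycle * clock_hz / 1e9  # 13.6
node_card_gflops = 32 * chip_gflops                                   # ~435
rack_tflops      = 32 * node_card_gflops / 1e3                        # ~13.9
system_pflops    = 72 * rack_tflops / 1e3                             # ~1.0

print(f"chip:      {chip_gflops:.1f} GF/s")
print(f"node card: {node_card_gflops:.1f} GF/s")
print(f"rack:      {rack_tflops:.1f} TF/s")
print(f"system:    {system_pflops:.2f} PF/s")
```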
11
(No Transcript)
12
HPC Challenge Benchmarks
13
System Power Efficiency
Gflops/Watt
14
Failures per Month at 100 TFlops (20 BG/L racks):
unparalleled reliability
Results of a survey conducted by Argonne National
Lab on 10 clusters ranging from 1.2 to 365 TFlops
(peak), excluding the storage subsystem, management
nodes, SAN network equipment, and software outages
15
Classical MD: ddcMD (2005 Gordon Bell Prize
Winner)
  • Scalable, general purpose code for performing
    classical molecular dynamics (MD) simulations
    using highly accurate MGPT potentials (a minimal
    sketch of the underlying MD loop follows below)
  • MGPT semi-empirical potentials, based on a
    rigorous expansion of many-body terms in the
    total energy, are needed to quantitatively
    investigate the dynamic behavior of d-shell and
    f-shell metals.

524-million-atom simulations on 64K nodes
achieved 101.5 TF/s sustained. Superb strong and
weak scaling for the full machine ("very impressive
machine", says the PI).
Visualization of important scientific findings
already achieved on BG/L: molten Ta at 5000 K
demonstrates solidification during isothermal
compression to 250 GPa.
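The MGPT potentials themselves are far more elaborate, but the classical MD loop they plug into is the standard velocity-Verlet cycle; a minimal sketch, using a Lennard-Jones pair potential as a stand-in for MGPT and illustrative sizes throughout:

```python
import numpy as np

# Minimal classical MD time-stepping loop (velocity Verlet) with a
# Lennard-Jones pair potential standing in for the MGPT potentials
# used by ddcMD. Units, particle count and step size are illustrative.

def lj_forces(pos, eps=1.0, sigma=1.0):
    """Pairwise Lennard-Jones forces on a small set of particles."""
    n = len(pos)
    forces = np.zeros_like(pos)
    for i in range(n):
        for j in range(i + 1, n):
            r = pos[i] - pos[j]
            d2 = np.dot(r, r)
            inv6 = (sigma ** 2 / d2) ** 3
            f = 24 * eps * (2 * inv6 ** 2 - inv6) / d2 * r
            forces[i] += f
            forces[j] -= f
    return forces

def velocity_verlet(pos, vel, mass=1.0, dt=1e-3, steps=100):
    """Advance positions and velocities with the velocity-Verlet integrator."""
    f = lj_forces(pos)
    for _ in range(steps):
        vel += 0.5 * dt * f / mass
        pos += dt * vel
        f = lj_forces(pos)
        vel += 0.5 * dt * f / mass
    return pos, vel

grid = np.arange(3) * 1.5
pos = np.array([[x, y, z] for x in grid for y in grid for z in grid],
               dtype=float)           # 27 atoms on a small cubic lattice
vel = np.zeros_like(pos)
pos, vel = velocity_verlet(pos, vel)
```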
16
Qbox: First-Principles Molecular Dynamics
Francois Gygi (UC Davis); Erik Draeger, Martin
Schulz, Bronis de Supinski (LLNL); Franz Franchetti
(Carnegie Mellon); John Gunnels, Vernon Austel, Jim
Sexton (IBM)
  • Treats electrons quantum mechanically
  • Treats nuclei classically
  • Developed at LLNL
  • BG support provided by IBM
  • Simulated 1,000 Mo atoms with 12,000 electrons
  • Achieves 207.3 Teraflops sustained
    (56.8% of peak).

Qbox simulation of the transition from a
molecular solid (top) to a quantum liquid
(bottom) that is expected to occur in hydrogen
under high pressure.
17
(No Transcript)
18
Compute Power of the Gyrokinetic Toroidal Code:
number of particles (in millions) moved 1 step in
1 second
(chart comparing BG/L at Livermore, Cray XT3/XT4,
BG/L Optimal, and BG/L)
19
Compute Power of the Gyrokinetic Toroidal Code:
number of particles (in millions) moved 1 step in
1 second. BlueGene can reach 150 billion
particles in 2008, >1 trillion in 2011. POWER6
can reach 1 billion particles in 2008, >0.3
trillion in 2011.
(chart comparing BG/P at 3.5 PF, P6 at 300 TF,
BG/L at Livermore, IBM Power, BG/L Optimal,
Cray XT3/XT4, and BG/L)
20
Rechenzentrum Garching at BG Watson: GENE
Strong scaling of GENE v11 for a problem size of
300-500 GB, with measurement points for 1k, 2k,
4k, 8k and 16k processors normalized to 1k
processors. Quasi-linear scaling has been
observed, with a parallel efficiency of 95% on 8k
processors and of 89% on 16k processors (the
efficiency calculation is sketched below). By
Hermann Lederer, Reinhard Tisma and Frank
Jenko, RZG and IPP, March 21-22, 2007.
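The quoted efficiencies follow the usual strong-scaling definition normalized to the 1k-processor baseline; a minimal sketch, with placeholder timings rather than the published GENE measurements:

```python
# Strong-scaling parallel efficiency normalized to a baseline run:
#   E(N) = (T_base * N_base) / (T(N) * N)
# The timings below are placeholders, not the published GENE numbers.

def parallel_efficiency(n_base, t_base, n, t):
    return (t_base * n_base) / (t * n)

baseline = (1024, 100.0)   # (processors, wall time in arbitrary units)
runs = {2048: 51.0, 4096: 26.5, 8192: 13.2, 16384: 7.0}

for n, t in runs.items():
    e = parallel_efficiency(*baseline, n, t)
    print(f"{n:6d} procs: efficiency {e:.0%}")
```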
21
Current HPC Systems Characteristics
22
Summary
  • IBM is deeply involved in ITER applications through
    its collaborations
  • Princeton Plasma Physics Laboratory
  • Max-Planck-Institut für Plasmaphysik /
    Rechenzentrum Garching
  • Barcelona Supercomputing Center
  • Oak Ridge National Laboratory
  • IBM is also involved in laser-plasma fusion
    through its collaborations
  • Lawrence Livermore National Laboratory
  • Forschungszentrum Jülich
  • IBM offers multiple platforms to address ITER
    needs
  • POWER: high memory capacity per node, moderate
    interprocessor bandwidth, moderate scalability;
    capability and capacity machine
  • Blue Gene: low power, low memory capacity per
    node, high interprocessor bandwidth, highest
    scalability; capability and capacity
    applications
  • X Series and white-box clusters: moderate memory
    capacity per node, low interprocessor bandwidth,
    limited to moderate scalability; mostly a capacity
    machine.

23
  • Backup

24
What BG brings to Core Turbulence Transport
  • Benchmark case CYCLONE
  • GENE: < 1 day on 64 procs; a few hours on 1,024
    procs (BG/L)
  • GYSELA: 2.5 days on 64 procs
  • ORB5: < 1 day on 64 procs; a few hours on 1,024
    procs (BG/L)
  • Similar ITER-size benchmark
  • GENE: ½ day on 6K procs (BG/L)
  • GYSELA: 10 days on 1,024 procs
  • ORB5: ½ day on 16K procs (BG/L); 1 week on 256
    procs of a PC cluster

Courtesy of José Mª Cela, Director of Applications,
BSC
25
The Gyrokinetic Toroidal Code (GTC)
  • Description
  • Particle-in-cell (PIC) code (the basic PIC cycle
    is sketched below)
  • Developed by Zhihong Lin (now at UC Irvine)
  • Non-linear gyrokinetic simulation of
    microturbulence [Lee, 1983]
  • Particle-electric field interaction treated
    self-consistently
  • Uses magnetic field line following coordinates
    (ψ, θ, ζ)
  • Guiding center Hamiltonian [White and Chance,
    1984]
  • Non-spectral Poisson solver [Lin and Lee, 1995]
  • Low numerical noise algorithm (δf method)
  • Full torus (global) simulation
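GTC's gyrokinetic δf formulation is far more involved, but the particle-in-cell cycle it builds on (deposit charge to a grid, solve for the field, gather the field back to the particles, push the particles) can be sketched in one dimension as follows. This sketch uses an FFT-based Poisson solve for brevity, whereas GTC itself uses a non-spectral solver, and all sizes and normalizations are illustrative:

```python
import numpy as np

# Minimal 1D electrostatic particle-in-cell cycle, illustrating the
# deposit -> field solve -> gather -> push structure that gyrokinetic
# PIC codes such as GTC build on. All normalizations are illustrative.

NG, NP, L, DT = 64, 10000, 2 * np.pi, 0.1
dx = L / NG

rng = np.random.default_rng(1)
x = rng.uniform(0, L, NP)         # particle positions
v = rng.normal(0, 1, NP)          # particle velocities

def deposit(x):
    """Nearest-grid-point charge deposition (unit charge per particle)."""
    idx = (x / dx).astype(int) % NG
    rho = np.bincount(idx, minlength=NG).astype(float)
    return rho / rho.mean() - 1.0  # subtract neutralizing background

def solve_field(rho):
    """Periodic Poisson solve via FFT: -phi'' = rho, E = -phi'."""
    k = 2 * np.pi * np.fft.fftfreq(NG, d=dx)
    rho_k = np.fft.fft(rho)
    phi_k = np.zeros_like(rho_k)
    phi_k[1:] = rho_k[1:] / k[1:] ** 2
    return np.real(np.fft.ifft(-1j * k * phi_k))

for step in range(100):
    rho = deposit(x)
    E = solve_field(rho)
    idx = (x / dx).astype(int) % NG
    v += DT * E[idx]               # gather field at particle cells, accelerate
    x = (x + DT * v) % L           # advance positions with periodic wrap
```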

26
BlueGene Key Applications - Major Scientific
Advances (percentages are sustained fractions of
peak performance)
  • Qbox (DFT), LLNL, 56.5%, 2006 Gordon Bell Award,
    64 racks; CPMD, IBM, 30%, highest scaling 64
    racks
  • ddcMD (classical MD), LLNL, 27.6%, 2005 Gordon
    Bell Award, 64 racks; MDCASK, LLNL, highest
    scaling 64 racks; SPaSM, LANL, highest scaling
    64 racks; LAMMPS, SNL, highest scaling 16 racks;
    Blue Matter, IBM, highest scaling 16 racks;
    Rosetta, UW, highest scaling 20 racks; AMBER,
    8 racks
  • Quantum Chromodynamics, IBM, 30%, 2006 Gordon
    Bell Special Award, 64 racks; QCD at KEK, 10
    racks
  • sPPM (CFD), LLNL, 18%, highest scaling 64 racks;
    Miranda, LLNL, highest scaling 64 racks; Raptor,
    LLNL, highest scaling 64 racks; DNS, highest
    scaling 16 racks; PETSc FUN3D, ANL, 14.2%; NEK5
    (thermal hydraulics), ANL, 22%
  • ParaDiS (dislocation dynamics), LLNL, highest
    scaling 64 racks
  • GFMC (nuclear physics), ANL, 16%
  • WRF (weather), NCAR, 14%, highest scaling 64
    racks; POP (oceanography), highest scaling 16
    racks
  • HOMME (climate), NCAR, 12%, highest scaling 32
    racks
  • GTC (plasma physics), PPPL, highest scaling 16
    racks; ORB5, RZG, highest scaling 8 racks; GENE,
    RZG, 12.5%, highest scaling 16 racks
  • Flash (Type Ia supernova), highest scaling 32
    racks; Cactus (general relativity), highest
    scaling 16 racks
  • AWM (earthquake), highest scaling 20 racks
27
Science: Theory, Experiment, Simulation