Title: Brief presentation of the Earth Simulator Center
Hardware configuration
- Highly parallel vector supercomputer of the distributed-memory type
- 640 processor nodes (PNs)
- Each PN contains
- 8 vector-type arithmetic processors (APs)
- 16 GB main memory
- Remote control and I/O parts
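The aggregate capacity implied by this configuration can be sanity-checked with a little arithmetic. The 8 Gflops peak per AP is an assumption taken from the Earth Simulator's published specifications, not from the slide itself:

```python
# Back-of-envelope check of the Earth Simulator's aggregate capacity.
nodes = 640          # processor nodes (PNs), from the slide
aps_per_node = 8     # vector arithmetic processors (APs) per PN
gflops_per_ap = 8.0  # assumed peak Gflops per AP (published ES spec)
mem_per_node_gb = 16 # main memory per PN, from the slide

peak_tflops = nodes * aps_per_node * gflops_per_ap / 1000
total_mem_tb = nodes * mem_per_node_gb / 1024
print(peak_tflops, total_mem_tb)  # → 40.96 10.0
```

This matches the machine's widely quoted ~40 Tflops peak and 10 TB of total main memory.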
Arithmetic processor (figure)
Processor node (figures)
Interconnection network (figures)
Earth Simulator Research and Development Center (building: 65 m × 50 m)
Software
- OS
- NEC's UNIX-based OS SUPER-UX
- Programming model
- Supported languages
- Fortran90, C, C++ (modified for ES)

Programming model      hybrid                   flat
Inter-PN               HPF/MPI                  HPF/MPI
Intra-PN               Microtasking/OpenMP      HPF/MPI
AP                     Automatic vectorization  Automatic vectorization
Earth Simulator Center
First results from the Earth Simulator
- Resolution ≈ 300 km
- Resolution ≈ 120 km
- Resolution ≈ 20 km
- Resolution ≈ 10 km
(four visualization slides at successively finer resolution)
First results from the Earth Simulator
- Resolution: 0.1° × 0.1° (≈ 10 km)
- Initial condition: Levitus data (1982)
- Computer resources: 175 nodes, elapsed time ≈ 8,100 hours
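The slide's claim that a 0.1° grid corresponds to roughly 10 km can be verified from the Earth's circumference; the 6,371 km mean radius is a standard value, not taken from the slide:

```python
import math

earth_radius_km = 6371.0                             # mean Earth radius
km_per_degree = 2 * math.pi * earth_radius_km / 360  # ≈ 111 km of arc per degree
grid_spacing_km = 0.1 * km_per_degree                # one 0.1-degree grid cell
print(round(grid_spacing_km, 1))  # → 11.1
```

About 11 km at the equator, consistent with the quoted ≈10 km (east-west spacing shrinks toward the poles).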
First results from the Earth Simulator (figure)
Terascale Cluster: System X
- Virginia Tech, Apple, Mellanox, Cisco, and Liebert
- 2003. 3. 16
- Daewoo Lee
Terascale Cluster System X
- A groundbreaking supercomputer cluster with industrial assistance: Apple, Mellanox, Cisco, and Liebert
- $5.2 million for hardware
- 10,280 / 17,600 GFlops of performance with 1,100 nodes
(3rd ranked on the TOP500 supercomputer site)
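The 17,600 GFlops peak figure is consistent with the node count if we assume the original System X clock of 2.0 GHz and 4 floating-point operations per cycle for the PowerPC 970 (two fused multiply-add units); both per-CPU numbers are assumptions, not stated on the slide:

```python
nodes = 1100            # from the slide
cpus_per_node = 2       # dual PowerPC 970 per node
clock_ghz = 2.0         # assumed original System X clock speed
flops_per_cycle = 4     # assumed: two FMA units, each 1 multiply + 1 add

rpeak_gflops = nodes * cpus_per_node * clock_ghz * flops_per_cycle
rmax_gflops = 10280     # measured figure from the slide
print(rpeak_gflops, round(rmax_gflops / rpeak_gflops, 2))  # → 17600.0 0.58
```

So the cluster sustained roughly 58% of peak on the benchmark.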
Goals
- Dual usage mode (90% of computational cycles devoted to production use)
Hardware Architecture
- Node: Apple G5 platform, dual IBM PowerPC 970 (64-bit CPU)
- Primary communication: InfiniBand by Mellanox (20 Gbps full duplex, fat-tree topology)
- Secondary communication: Gigabit Ethernet by Cisco
- Cooling system by Liebert
Software
- Mac OS X (FreeBSD based)
- MPI-2 (MPICH2)
- Supports C/C++/Fortran compilation
- Déjà vu transparent fault-tolerance system
- Maintains system stability by migrating a failed application to another node transparently, keeping the application intact
Reference
- Terascale Cluster web site
- http://computing.vt.edu/research_computing/terascale
4th fastest supercomputer: Tungsten
PAK, EUNJI
4th: NCSA Tungsten
- Top500.org
- National Center for Supercomputing Applications (NCSA)
- University of Illinois at Urbana-Champaign
Tungsten Architecture 1/3
- Xeon 3.0 GHz Dell cluster
- 2,560 processors
- 3 GB memory/node
- Peak performance: 15.36 TF
- TOP500 list debut: #4 (9.819 TF, November 2003)
- Currently the 4th fastest supercomputer in the world
Tungsten Architecture 2/3 (figure)
Tungsten Architecture 3/3
- 1,450 nodes
- Dell PowerEdge 1750 server
- Intel Xeon 3.06 GHz, peak performance 6.12 GFLOPS
- 1,280 compute nodes, 104 I/O nodes
- Parallel I/O
- 11.1 gigabytes per second (GB/s) of I/O throughput
- Complements the cluster's 9.8 TFLOPS of computational capability
- 104-node I/O sub-cluster with more than 120 TB
- Node local 73 GB, shared 122 TB
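Dividing the aggregate throughput across the I/O sub-cluster gives a rough per-node figure; this is simple arithmetic on the slide's numbers, not a measured value:

```python
io_nodes = 104             # I/O sub-cluster size, from the slide
aggregate_gb_per_s = 11.1  # total parallel I/O throughput, from the slide

per_node_mb_per_s = aggregate_gb_per_s / io_nodes * 1000
print(round(per_node_mb_per_s))  # → 107
```

Roughly 107 MB/s per I/O node, a plausible figure for a 2003-era server striping across local SCSI disks.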
Applications on Tungsten 1/3
- PAPI and PerfSuite
- PAPI: portable interface to hardware performance counters
- PerfSuite: set of tools for performance analysis on Linux platforms
Applications on Tungsten 2/3 (figure)
Applications on Tungsten 3/3
- CHARMM (Harvard version)
- Chemistry at Harvard Macromolecular Mechanics
- General-purpose molecular mechanics, molecular dynamics, and vibrational analysis package
- Amber 7.0
- A set of molecular mechanical force fields for the simulation of biomolecules
- Package of molecular simulation programs
MPP2 Supercomputer: the world's largest Itanium2 cluster
- Molecular Science Computing Facility
- Pacific Northwest National Laboratory
- 2004. 3. 16
- Presentation: Kim SangWon
Contents
- MPP2 Supercomputer Overview
- Configuration
- HP rx2600 (Longs Peak) Node
- QsNet ELAN Interconnect Network
- System/Application Software
- File System
- Future Plan
MPP2 Overview
- MPP2
- The High-Performance Computing System-2
- At the Molecular Science Computing Facility in the William R. Wiley Environmental Molecular Sciences Laboratory at Pacific Northwest National Laboratory
- The fifth-fastest supercomputer in the world in the November 2003 TOP500 list
MPP2 Overview
- System name: mpp2
- Linux supercomputer cluster
- 11.8 (8.633) Teraflops
- 6.8 Terabytes of memory
- Purpose: production
- Platform: HP Integrity rx2600, dual Itanium2 1.5 GHz
- Nodes: 980 (processors: 1,960)
- ¾ megawatt of power
- 220 tons of air conditioning
- 4,000 sq. ft.
- Cost: $24.5 million (estimated)
- UPS, generator
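The quoted 11.8 Tflops peak follows from the processor count and clock if we assume the Itanium2's 4 floating-point operations per cycle (two fused multiply-add units), an assumption not stated on the slide:

```python
processors = 1960    # from the slide (980 dual-CPU nodes)
clock_ghz = 1.5      # Itanium2 clock, from the slide
flops_per_cycle = 4  # assumed: two FMA units per Itanium2 core

peak_tflops = processors * clock_ghz * flops_per_cycle / 1000
print(round(peak_tflops, 2))  # → 11.76
```

About 11.76 Tflops, matching the slide's 11.8; the 8.633 in parentheses is the measured benchmark figure.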
Configuration (Phase 2b)
- Operational September 2003
- 1,900 next-generation Itanium processors
- 11.4 TF, 6.8 TB memory
- 1,856 Madison batch CPUs in 928 compute nodes
- Interconnect: Elan3 (Elan4 not yet operational)
- Lustre, SAN / 53 TB
- 2 system management nodes
- 4 login nodes with 4 Gb-Enet
HP rx2600 Longs Peak Node Architecture
- Each node has
- 2 Intel Itanium2 processors (1.5 GHz)
- 6.4 GB/s system bus
- 8.5 GB/s memory bus
- 12 GB of RAM
- 1 1000T connection
- 1 100T connection
- 1 serial connection
- 2 Elan3 connections
(node diagram: Elan3 adapters on PCI-X2 (1 GB/s), 2 × SCSI-160 disks)
QsNet ELAN Interconnect Network
- High bandwidth, ultra-low latency, and scalability
- 900 MB/s user-space to user-space bandwidth
- 1,024 nodes in a standard QsNet configuration, rising to 4,096 in QsNetII systems
- Optimized libraries for common distributed-memory programming models exploit the full capabilities of the base hardware
Software on MPP2 (1/2)
- System Software
- Operating system: Red Hat Linux 7.2 Advanced Server
- NWLinux tailored to IA64 clusters (2.4.18 kernel with various patches)
- Cluster management: Resource Management System (RMS) by Quadrics
- A single point of interface to the system for resource management
- Monitoring, fault diagnosis, data collection, allocating CPUs, parallel job execution
- Job Management Software
- LSF (Load Sharing Facility) batch scheduler
- QBank: controls and manages CPU resources allocated to projects or users
- Compiler Software
- C/C++ (ecc), F77/F90/F95 (efc), G++
- Code Development
- Etnus TotalView
- A parallel and multithreaded application debugger
- Vampir
- The GUI-driven frontend used to visualize the profile data of a program run
- gdb
Software on MPP2 (2/2)
- Application Software
- Quantum Chemistry Codes
- GAMESS (The General Atomic and Molecular Electronic Structure System)
- Performs a variety of ab initio molecular orbital (MO) calculations
- MOLPRO
- An advanced ab initio quantum chemistry software package
- NWChem
- Computational chemistry software developed by EMSL
- ADF (Amsterdam Density Functional) 2000
- Software for first-principles electronic structure calculations via Density Functional Theory (DFT)
- General Molecular Modeling Software
- Amber
- Unstructured Mesh Modeling Codes
- NWGrid (grid generator)
- Hybrid mesh generation, mesh optimization, and dynamic mesh maintenance
- NWPhys (unstructured mesh solvers)
- A 3D, full-physics, first-principles, time-domain, free-Lagrange code for parallel processing using hybrid grids
File System on MPP2
- Four file systems are available on the cluster
- Local filesystem (/scratch)
- On each of the compute nodes
- Non-persistent storage area provided to a parallel job running on that node
- NFS filesystem (/home)
- Where user home directories and files are located
- Uses RAID-5 for reliability
- Lustre global filesystem (/dtemp)
- Designed for the world's largest high-performance compute clusters
- Aggregate write rate of 3.2 GB/s
- Restart files and files needed for post-analysis
- Long-term global scratch space
- AFS filesystem (/msrc)
- On the front-end (non-compute) nodes
Future Plan
- MPP2 will be upgraded with the faster Quadrics QsNetII interconnect in early 2004
- 928 compute nodes, 1,856 Madison batch CPUs
- Elan4 interconnect, Lustre, SAN / 53 TB
- 4 login nodes with 4 Gb-Enet, 2 system management nodes
Bluesky Supercomputer
- Top 500 Supercomputers
- CS610 Parallel Processing
- Donghyouk Lim
- (Dept. of Computer Science, KAIST)
Contents
- Introduction
- National Center for Atmospheric Research
- Scientific Computing Division
- Hardware
- Software
- Recommendations for usage
- Related Link
Introduction
- Bluesky
- 13th-fastest supercomputer in the world
- Clustered Symmetric Multi-Processing (SMP) system
- 1,600 IBM POWER4 processors
- Peak of 8.7 TFLOPS
National Center for Atmospheric Research
- Established in 1960
- Located in Boulder, Colorado
- Research areas
- Earth system
- Climate change
- Changes in atmospheric composition
Scientific Computing Division
- Research on high-performance supercomputing
- Computing resources
- bluesky (IBM Cluster 1600 running AIX): 13th place
- blackforest (IBM SP RS/6000 running AIX): 80th place
- Chinook complex: Chinook (SGI Origin3800 running IRIX) and Chinook (SGI Origin2100 running IRIX)
Hardware
- Processor
- 1,600 POWER4 processors, 1.3 GHz
- Each can perform up to 4 FP operations per cycle
- Peak of 8.7 TFLOPS
- Memory
- 2 GB memory per processor
- Memory on a node is shared between the processors on that node
- Memory Caches
- L1 cache: 64 KB I-cache, 32 KB D-cache, direct mapped
- L2 cache: per pair of processors, 1.44 MB, 8-way set associative
- L3 cache: 32 MB, 512-byte cache line, 8-way set associative
Hardware
- Computing Nodes
- 8-way processor nodes: 76
- 32-way processor nodes: 25
- 32-processor nodes for running interactive jobs: 4
- Separate nodes for user logins
- System support nodes
- 12 nodes dedicated to the General Parallel File System (GPFS)
- Four nodes dedicated to HiPPI communications to the Mass Storage System
- Two master nodes dedicated to controlling LoadLeveler operations
- One dedicated system monitoring node
- One dedicated test node for system administration, upgrades, and testing
Hardware
- Storage
- RAID disk storage capacity: 31.0 TB total
- Each user application can access 120 GB of temporary space
- Interconnect fabric
- SP Switch2 (Colony switch)
- Two full-duplex network paths to increase throughput
- Bandwidth: 1.0 GB per second bidirectional
- Worst-case latency: 2.5 microseconds
- HiPPI (High-Performance Parallel Interface) to the Mass Storage System
- Gigabit Ethernet network
Software
- Operating system: AIX (IBM-proprietary UNIX)
- Compilers: Fortran (95/90/77), C, C++
- Batch subsystem: LoadLeveler
- Manages serial and parallel jobs over a cluster of servers
- File system: General Parallel File System (GPFS)
- System information commands: spinfo for general information, lslpp for information about libraries
Related Links
- NCAR: http://www.ncar.ucar.edu/ncar/
- SCD: http://www.scd.ucar.edu/
- Bluesky: http://www.scd.ucar.edu/computers/bluesky/
- IBM p690: http://www-903.ibm.com/kr/eserver/pseries/highend/p690.html
About Cray X1
- Kim, SooYoung (sykim_at_camars.kaist.ac.kr)
- (Dept of Computer Science, KAIST)
Features (1/2)
- Contributing areas
- Weather and climate prediction, aerospace engineering, automotive design, and a wide variety of other applications important in government and academic research
- Army High Performance Computing Research Center (AHPCRC), Boeing, Ford, Warsaw Univ., U.S. Government, Department of Energy's Oak Ridge National Laboratory (ORNL)
- Operating system: UNICOS/mp, derived from UNICOS and UNICOS/mk
- True single system image (SSI)
- Scheduling algorithms for parallel applications
- Accelerated application mode and migration
- Variable processor utilization: each CPU has four internal processors
- Together as a closely coupled, multistreaming processor (MSP)
- Individually as four single-streaming processors (SSPs)
- Flexible system partitioning
Features (2/2)
- Scalable system architecture
- Distributed shared memory (DSM)
- Scalable cache coherence protocol
- Scalable address translation
- Parallel programming models
- Shared-memory parallel models
- Traditional distributed-memory parallel models: MPI and SHMEM
- Up-and-coming global distributed-memory parallel models: Unified Parallel C (UPC)
- Programming environments
- Fortran compiler, C and C++ compilers
- High-performance scientific library (LibSci), language support libraries, system libraries
- Etnus TotalView debugger, CrayPat (Cray Performance Analysis Tool)
Node Architecture
Figure 1. Node, containing four MSPs
System Configuration Examples

Cabinets   CPUs    Memory            Peak Performance
1 (AC)     16      64–256 GB         204.8 Gflops
1          64      256–1024 GB       819.0 Gflops
4          256     1024–4096 GB      3.3 Tflops
8          512     2048–8192 GB      6.6 Tflops
16         1024    4096–16384 GB     13.1 Tflops
32         2048    8192–32768 GB     26.2 Tflops
64         4096    16384–65536 GB    52.4 Tflops
Technical Data (1/2)

Technical specifications
- Peak performance: 52.4 Tflops in a 64-cabinet configuration
- Architecture: Scalable vector MPP with SMP nodes

Processing element
- Processor: Cray custom-design vector CPU; 16 vector floating-point operations/clock cycle; 32- and 64-bit IEEE arithmetic
- Memory size: 16 to 64 GB per node
- Data error protection: SECDED
- Vector clock speed: 800 MHz
- Peak performance: 12.8 Gflops per CPU
- Peak memory bandwidth: 34.1 GB/sec per CPU
- Peak cache bandwidth: 76.8 GB/sec per CPU
- Packaging: 4 CPUs per node; up to 4 nodes per AC cabinet, up to 4 interconnected cabinets; up to 16 nodes per LC cabinet, up to 64 interconnected cabinets
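The per-CPU and full-system peaks in this table are mutually consistent, as a quick check on the table's own numbers shows:

```python
clock_ghz = 0.8       # 800 MHz vector clock, from the table
flops_per_cycle = 16  # 16 vector FP operations per clock, from the table
max_cpus = 4096       # 64 LC cabinets x 16 nodes x 4 CPUs

per_cpu_gflops = clock_ghz * flops_per_cycle
system_peak_tflops = per_cpu_gflops * max_cpus / 1000
print(per_cpu_gflops, round(system_peak_tflops, 1))  # → 12.8 52.4
```

This recovers both the 12.8 Gflops per CPU and the 52.4 Tflops maximum configuration.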
Technical Data (2/2)

Memory
- Technology: RDRAM with 204 GB/sec peak bandwidth per node
- Architecture: Cache coherent, physically distributed, globally addressable
- Total system memory size: 32 GB to 64 TB

Interconnect network
- Topology: Modified 2D torus
- Peak global bandwidth: 400 GB/sec for a 64-CPU Liquid Cooled (LC) system

I/O
- I/O system port channels: 4 per node
- Peak I/O bandwidth: 1.2 GB/sec per channel