Title: Supercomputers 2
1 Supercomputers 2
- With acknowledgement to Igor Zacharov and Wolfgang Mertz, SGI European Headquarters
2 Classification of Computers
- MIMD
  - Multiprocessors: single address space, shared memory
    - UMA: central memory
      - PVP (Cray T90)
      - SMP (Intel SHV, SUN E10000, DEC 8400, SGI Power Challenge, IBM R60, etc.)
    - NUMA: distributed memory
      - COMA (KSR-1, DDM)
      - CC-NUMA (SGI Origin2000, SN1 (SGI3000), Cray T3E, HP Exemplar, Sequent NUMA-Q, Data General)
      - NCC-NUMA (Cray T3D, IBM SP3)
  - Multicomputers: multiple address spaces
    - NORMA: no-remote memory access
      - Cluster (IBM SP2, DEC TruCluster, Microsoft Wolfpack, Beowulf, etc.): loosely coupled, multiple OS
      - MPP (Intel TFLOPS, TM-5): tightly coupled, single OS
Abbreviations:
- MIMD: Multiple Instructions, Multiple Data
- PVP: Parallel Vector Processor
- UMA: Uniform Memory Access
- SMP: Symmetric Multi-Processor
- NUMA: Non-Uniform Memory Access
- COMA: Cache Only Memory Architecture
- NORMA: No-Remote Memory Access
- CC-NUMA: Cache-Coherent NUMA
- MPP: Massively Parallel Processor
- NCC-NUMA: Non-Cache Coherent NUMA
3 Design Space of Competing Computer Architectures
4 Structure of an SMP System (1)
- Does NOT scale due to bus saturation
- Bus is a very complex component
- High memory latency due to the complexity
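The bus-saturation point above can be illustrated with a rough model in which every processor demands a fixed memory bandwidth and all processors share one bus. This is a sketch under simplifying assumptions: the function names and the bandwidth figures are hypothetical, chosen only for illustration, and do not describe any specific machine.

```python
def bus_throughput(n_cpus, demand_per_cpu, bus_bandwidth):
    """Aggregate memory throughput on a single shared bus.

    Simplifying assumption: throughput grows linearly with the CPU
    count until the bus is saturated, then stays flat.
    All arguments use the same bandwidth unit (e.g. MB/s).
    """
    return min(n_cpus * demand_per_cpu, bus_bandwidth)


def speedup(n_cpus, demand_per_cpu, bus_bandwidth):
    """Speedup over one CPU for a purely memory-bound workload."""
    single = bus_throughput(1, demand_per_cpu, bus_bandwidth)
    return bus_throughput(n_cpus, demand_per_cpu, bus_bandwidth) / single


if __name__ == "__main__":
    # Hypothetical figures: each CPU demands 400 MB/s, the bus delivers 1600 MB/s.
    for n in (1, 2, 4, 8, 16):
        print(n, "CPUs -> speedup", speedup(n, 400, 1600))
```

In this model, adding processors beyond the saturation point (here, four) yields no further speedup, which is the sense in which a bus-based SMP does not scale.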
5 Structure of an SMP System (2): Central Crossbar
- Scales very well
- Crossbar is a very complex component
- High memory latency due to the complexity
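One way to see why a crossbar is a complex component: a full crossbar connecting n input ports to m output ports needs n × m crosspoint switches, so a square crossbar grows quadratically with the port count. The sketch below only counts crosspoints; the function name is hypothetical and the port counts are illustrative, not taken from any SGI system.

```python
def crosspoints(n_inputs, n_outputs=None):
    """Number of crosspoint switches in a full n_inputs x n_outputs crossbar.

    If n_outputs is omitted, a square crossbar is assumed.
    """
    if n_outputs is None:
        n_outputs = n_inputs
    return n_inputs * n_outputs


if __name__ == "__main__":
    # Doubling the port count quadruples the number of crosspoints.
    for ports in (4, 8, 16, 32):
        print(ports, "ports ->", crosspoints(ports), "crosspoints")
```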
6 Structure of an SMP System (3): Origin SGI NUMA Architecture
7 Systems are built from Modules
8 New High-End Products: Origin 3000 Servers, Onyx 3 Systems
- SGI Origin 3200, SGI Onyx 3200
- SGI Origin 3400, SGI Onyx 3400
- SGI Origin 3800, SGI Onyx 3800
9 SGI 3800 System (16-512p)
[Figure: 128p system topology, with rack layouts for the minimum (16p) system and the 128p system. Labeled components: R-Brick (8-port router), C-Brick, I-Brick, P-, I-, or X-Brick, Power Bay.]
10 ASCI Blue Mountain: Los Alamos National Laboratory
- Origin 2000 with 3 Tflops peak
- 1 Tflop application performance
- 48 systems with 128 CPUs each = 6144 CPUs
- 1536 Gbyte memory
- 76 Tbyte disk space
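The figures above can be cross-checked with a few lines of arithmetic. The per-CPU values computed here are derived from the rounded numbers on the slide, not stated in the source, and the conversion 1 Gbyte = 1024 Mbyte is an assumption.

```python
# Figures as given on the slide.
SYSTEMS = 48
CPUS_PER_SYSTEM = 128
PEAK_TFLOPS = 3.0
APP_TFLOPS = 1.0
MEMORY_GBYTE = 1536

# Total CPU count across all systems.
total_cpus = SYSTEMS * CPUS_PER_SYSTEM

# Derived per-CPU figures (assumption: 1 Tflops = 1e6 Mflops, 1 Gbyte = 1024 Mbyte).
peak_mflops_per_cpu = PEAK_TFLOPS * 1e6 / total_cpus
memory_mbyte_per_cpu = MEMORY_GBYTE * 1024 / total_cpus

# Fraction of peak performance reached by applications.
app_fraction_of_peak = APP_TFLOPS / PEAK_TFLOPS

if __name__ == "__main__":
    print("total CPUs:", total_cpus)
    print("peak Mflops per CPU:", round(peak_mflops_per_cpu, 2))
    print("memory Mbyte per CPU:", memory_mbyte_per_cpu)
    print("application fraction of peak:", round(app_fraction_of_peak, 3))
```

Under these assumptions the machine offers roughly 488 Mflops peak and 256 Mbyte of memory per CPU, with applications reaching about one third of peak.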
11 Memory hierarchy
12 NUMAflex: Flexible Configuration