Title: SOS7 Machines Already Operational: NSF's Terascale Computing System
1 SOS7 Machines Already Operational: NSF's
Terascale Computing System
- SOS-7 March 4-6, 2003
- Mike Levine, PSC
2 Outline
- Overview of TCS, the US NSF's Terascale Computing System.
- Answering 3 questions:
- Is your machine living up to performance expectations?
- What is the MTBI?
- What is the primary complaint, if any, from users?
- See also the PSC web pages and Rolf's info.
3 Q1: Performance
- Computational and communications performance is very good!
- Alpha processors and ES45 servers: very good
- Quadrics bandwidth and latency: very good
- 74% of peak on Linpack; >76% on LSMS
- More work needed on disk I/O.
- This has been a very easy port for most users.
- Easier than some Cray-to-Cray upgrades.
4 Q2: MTBI (Monthly Average)
- Compare with the theoretical prediction of 12 hrs (see the sketch below).
- Expect further improvement (fixing systematic problems).
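As a rough illustration of how a monthly MTBI figure like this is derived (the interrupt count below is an assumed, illustrative number, not the actual PSC log):

```python
# Minimal sketch: MTBI = wall-clock hours in the period divided by the
# number of unscheduled interrupts in that period.
def mtbi_hours(period_hours: float, unscheduled_interrupts: int) -> float:
    return period_hours / unscheduled_interrupts

# Illustrative only: 60 interrupts in a 30-day month would give the
# 12-hour theoretical prediction quoted above.
print(mtbi_hours(30 * 24, 60))  # -> 12.0
```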
5 Time Lost to Unscheduled Events
- Purple: nodes requiring cleanup
- Worst case is 3%
6 Q3: Complaints
- #1: "I need more time" (not a complaint about performance)
- Actual usage >80% of wall clock
- Some structural improvements still in progress.
- Not a whole lot more is possible!
- Work needed on:
- Rogue OS activity (recall Prof. Kale's comment)
- MPI global reduction libraries (ditto; see the timing sketch below)
- System debugging and fragility
- I/O performance
- We have delayed full disk deployment to avoid data-corruption instabilities.
- Node cleanup: we detect and hold out problem nodes until staff clean them.
- All in all, the users have been VERY pleased (ditto).
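The sensitivity of global reductions to rogue OS activity can be illustrated with a small timing probe: a collective finishes only when the slowest rank arrives, so stray OS work on any one node stretches the tail latency for the whole machine. A hedged sketch using mpi4py purely for illustration (the TCS itself ran vendor MPI on Tru64, not this tooling):

```python
# Sketch: measure jitter in small global reductions. Long tails in these
# timings are a signature of background ("rogue") OS activity on some node.
from mpi4py import MPI

comm = MPI.COMM_WORLD
samples = []
for _ in range(1000):
    t0 = MPI.Wtime()
    comm.allreduce(1.0, op=MPI.SUM)   # tiny global sum across all ranks
    samples.append(MPI.Wtime() - t0)

if comm.Get_rank() == 0:
    samples.sort()
    median = samples[len(samples) // 2]
    p99 = samples[int(len(samples) * 0.99)]
    print(f"median {median * 1e6:.1f} us, 99th percentile {p99 * 1e6:.1f} us")
```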
7 Full Machine Job
- This system is capable of doing big science
8 TCS (Terascale Computing System) and ETF
- Sponsored by the U.S. National Science Foundation
- Serving the very high end of US academic computational science and engineering
- Designed to be used, as a whole, on single problems (recall the full machine job)
- Full range of scientific and engineering applications
- Compaq AlphaServer SC hardware and software technology
- TCS-1 in general production since April 2002
- #6 in Top 500 (largest open facility in the world, Nov 2001)
- Integrated into the PACI program (Partnerships for Advanced Computational Infrastructure)
- DTF project to build and integrate multiple systems: NCSA, SDSC, Caltech, Argonne; multi-lambda, transcontinental interconnect
- ETF, aka TeraGrid (Extensible Terascale Facility), integrating TCS with DTF, forming
- A heterogeneous, extensible scientific/engineering cyberinfrastructure Grid
9 Infrastructure: PSC TCS machine room (@ Westinghouse)
- Did not require a new building, just a pipe and wire upgrade; not maxed out.
- 8K ft²
- Using 2.5K ft²
- Existing room
- (16 yrs old)
10 Floor Layout: Full System Physical Structure
- Geometrical constraints invariant between the US and Japan
11 Terascale Computing System: Compute Nodes
- 750 ES45 4-CPU servers
- +13 inline spares
- (+2 login nodes)
- 4 EV68s / node
- 1 GHz, 2 Gf each → 6 Tf total (aggregates checked in the sketch below)
- 4 GB memory / node → 3.0 TB total
- 3 × 18.2 GB disk / node → 41 TB total
- System
- User temporary
- Fast snapshots
- 90 GB/s aggregate
- Tru64 Unix
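The aggregate figures on this slide follow directly from the per-node numbers; a quick illustrative check (Python, assuming the per-CPU peak is 2 Gflop/s at 1 GHz as stated above):

```python
# Aggregate compute-node figures implied by the per-node numbers above.
nodes = 750
cpus_per_node = 4
gf_per_cpu = 2.0                  # 1 GHz EV68, ~2 Gflop/s peak each
gb_mem_per_node = 4
disks_per_node, gb_per_disk = 3, 18.2

print(nodes * cpus_per_node * gf_per_cpu / 1000, "Tf peak")               # 6.0
print(nodes * gb_mem_per_node / 1000, "TB memory")                        # 3.0
print(round(nodes * disks_per_node * gb_per_disk / 1000), "TB node disk") # 41
```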
12 ES45 Nodes
- 5 nodes per cabinet
- 3 local disks / node
13 Terascale Computing System: Quadrics Network
- 2 rails
- Higher bandwidth (250 MB/s/rail)
- Lower latency: 2.5 µs put latency
- 1 NIC / node / rail
- Federated switch (per rail)
- Fat-tree (bisection bandwidth ~0.2 TB/s; see the sketch below)
- User virtual-memory mapped
- Hardware retry
- Heterogeneous (Alpha Tru64/Linux, Intel Linux)
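The quoted bisection bandwidth is consistent with the per-rail link rate; a rough illustrative check, assuming a full-bisection fat tree (half the nodes streaming across the network cut at once):

```python
# Rough bisection-bandwidth check for the two-rail, full-bisection fat tree.
nodes = 750
rails = 2
mb_per_s_per_link = 250          # per rail, per node

bisection_gb_s = (nodes / 2) * rails * mb_per_s_per_link / 1000
print(bisection_gb_s, "GB/s")    # ~187.5 GB/s, i.e. roughly 0.2 TB/s
```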
14 Central Switch Assembly
- 20 cabinets in center
- Minimizes the maximum internode distance
- 3 out of 4 rows shown
- 21st LL switch, outside (not shown)
15 Quadrics wiring overhead (view towards ceiling)
16 Terascale Computing System: Management and Control
- Quadrics switch control
- Internal SBC Ethernet
- Insight Manager on PCs
- Dedicated systems
- Cluster/node monitoring and control
- RMS database
- Ethernet
- Serial Link
(Diagram: control LAN to the compute nodes.)
17 Terascale Computing System: Interactive Nodes
- 2 dedicated ES45s
- +8 on compute nodes
- Shared-function nodes
- User access
- Gigabit Ethernet to WAN
- Quadrics connected
- /usr indexed store (ISMS)
(Diagram: LAN, compute nodes, /usr, WAN/LAN.)
18 Terascale Computing System: File Servers
- 64, on compute nodes
- 0.47 TB/server → 30 TB total
- 500 MB/s → 32 GB/s total
- Temporary user storage
- Direct I/O
- /tmp
- Each server has
- 24 disks on
- 8 SCSI chains on
- 4 controllers
- to sustain full drive bandwidth (see the sketch below).
(Diagram: LAN, compute nodes, file servers, /tmp, /usr, WAN/LAN.)
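The aggregate /tmp numbers are just the per-server figures scaled by the 64 servers; an illustrative check:

```python
# Aggregate /tmp capacity and bandwidth from the per-server figures above.
servers = 64
tb_per_server = 0.47
mb_s_per_server = 500

print(round(servers * tb_per_server), "TB total")       # ~30 TB
print(servers * mb_s_per_server / 1000, "GB/s total")   # 32 GB/s

# Per server: 24 disks over 8 SCSI chains on 4 controllers, i.e. 3 disks per
# chain and 2 chains per controller, so each drive can stream at full
# bandwidth without saturating its chain.
```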
19 Terascale Computing System: Summary
- 750 ES45 compute nodes
- 3000 EV68 CPUs @ 1 GHz
- 6 Tf
- 3.0 TB memory
- 41 TB node disk, 90 GB/s
- Multi-rail fat-tree network
- Redundant monitoring/control
- WAN/LAN accessible
- File servers: 30 TB, 32 GB/s
- Buffer disk store, 150 TB
- Parallel visualization
- Mass store, 1 TB/hr, >1 PB
- ETF coupled (heterogeneous)
20 Terascale Computing System: Visualization
- Intel/Linux
- Newest software
- 16 nodes
- Parallel rendering
- HW/SW compositing
- Quadrics connected
- Image output → Web pages
(Diagram: TCS at 340 GB/s over 1520 Quadrics links, connecting to Application Gateways at 4.5 GB/s (20 links), Viz at 3.6 GB/s (16 links), and Buffer Disk at 3.6 GB/s (16 links); WAN coupled.)
21 Terascale Computing System: Buffer Disk and HSM
- Quadrics coupled (225 MB/s/link)
- Intermediate between TCS and HSM
- Independently managed
- Private transport from TCS
(Diagram: TCS at 340 GB/s over 1520 Quadrics links, connecting to Application Gateways at 4.5 GB/s (20 links), Viz at 3.6 GB/s (16 links), and Buffer Disk at 3.6 GB/s (16 links); >360 MB/s to tape; HSM - LSCi; WAN/LAN to SDSC.)
22 Terascale Computing System: Application Gateways
- Quadrics coupled (225 MB/s/link; see the link-budget sketch below)
- Coupled to the ETF backbone by multi-GigE @ 30 Gb/s
(Diagram: TCS at 340 GB/s over 1520 Quadrics links, connecting to Application Gateways at 4.5 GB/s (20 links), Viz at 3.6 GB/s (16 links), and Buffer Disk at 3.6 GB/s (16 links); multi-GigE to the ETF backbone at 30 Gb/s.)
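The per-connection bandwidths in the diagram follow from the 225 MB/s per-link Quadrics rate times the link counts; an illustrative check:

```python
# Link-budget check: Quadrics bandwidth = links x 225 MB/s per link.
mb_per_link = 225
for links in (16, 20, 1520):
    print(links, "links ->", links * mb_per_link / 1000, "GB/s")
# 16 -> 3.6 GB/s, 20 -> 4.5 GB/s, 1520 -> 342 GB/s (~340 GB/s in the diagram)
```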
23 The Front Row
- Yes, those are Pittsburgh sports colors.