Title: NUG Training 10/3/2005
1. NUG Training 10/3/2005
- Logistics
- Morning only: coffee and snacks
- Additional drinks $0.50 in refrigerator in small kitchen area; can easily go out to get coffee during 15-minute breaks
- Parking garage vouchers at reception desk on second floor
- Lunch
- On your own, but can go out in groups
2. Today's Presentations
- Jacquard Introduction
- Jacquard Nodes and CPUs
- High Speed Interconnect and MVAPICH
- Compiling
- Running Jobs
- Software overview
- Hands-on
- Machine room tour
3. Overview of Jacquard
Richard Gerber, NERSC User Services, RAGerber@lbl.gov
NERSC Users Group, October 3, 2005, Oakland, CA
4. Presentation Overview
- Cluster overview
- Connecting
- Nodes and processors
- Node interconnect
- Disks and file systems
- Compilers
- Operating system
- Message passing interface
- Batch system and queues
- Benchmarks and application performance
5. Status
Jacquard has been experiencing node failures. While this problem is being worked on, we are making Jacquard available to users in a degraded mode. About 200 computational nodes are available, one login node, and about half of the storage nodes that support the GPFS file system. Expect lower than usual I/O performance. Because we may still experience some instability, users will not be charged until Jacquard is returned to full production.
6. Introduction to Jacquard
- Named in honor of inventor Joseph Marie Jacquard, whose loom was the first machine to use punch cards to control a sequence of operations.
- Jacquard is a 640-CPU Opteron cluster running a Linux operating system.
- Integrated, delivered, and supported by Linux Networx.
- Jacquard has 320 dual-processor nodes available for scientific calculations. (Not dual-core processors.)
- The nodes are interconnected with a high-speed InfiniBand network.
- Global shared file storage is provided by a GPFS file system.
7. Jacquard
- http://www.nersc.gov/nusers/resources/jacquard/
8. Jacquard Characteristics
Processor type: Opteron, 2.2 GHz
Processor theoretical peak: 4.4 GFlops/sec
Processors per node: 2
Number of application nodes/processors: 320 / 640
System theoretical peak (computational nodes): 2.8 TFlops/sec
Physical memory per node (usable): 6 (3-5) GBytes
Number of spare application nodes: 4
Number of login nodes: 4
Node interconnect: InfiniBand
Global shared disk: GPFS, 30 TBytes usable
Batch system: PBS Pro
9. Jacquard's Role
- Jacquard is meant for codes that do not scale well on Seaborg.
- Hope to relieve the Seaborg backlog.
- Typical jobs are expected to be in the concurrency range of 16-64 nodes.
- Applications typically run at 4X Seaborg speed. Jobs that cannot scale to large parallel concurrency should benefit from the faster CPUs.
10. Connecting to Jacquard
- Interactive shell access is via SSH.
- ssh -l login_name jacquard.nersc.gov
- Four login nodes for compiling and launching parallel jobs. Parallel jobs do not run on the login nodes.
- Globus file transfer utilities can be used.
- Outbound network services are open (e.g., ftp).
- Use hsi for interfacing with HPSS mass storage (see the sketch below).
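A minimal command-line sketch of a typical session; the username and file names below are placeholders, not accounts or files from this presentation.

  # Log in to a Jacquard login node with your NERSC username
  ssh -l my_username jacquard.nersc.gov

  # From a login node, archive a file to HPSS and retrieve it later with hsi
  hsi put results.tar
  hsi get results.tar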
11. Nodes and processors
- Each Jacquard node has 2 processors that share 6 GB of memory. The OS, network, and GPFS use 1 (?) GB of that.
- Each processor is a 2.2 GHz AMD Opteron.
- Processor theoretical peak: 4.4 GFlops/sec
- The Opteron is an advanced 64-bit processor that is becoming widely used in HPC.
12. Node Interconnect
- Nodes are connected by an InfiniBand high-speed network from Mellanox.
- Adapters and switches from Mellanox
- Low latency: 7 µs vs. 25 µs on Seaborg
- Bandwidth: 2X Seaborg
- Fat tree topology
13. Disks and file systems
- Home, scratch, and project directories are in a global file system from IBM, GPFS.
- The SCRATCH environment variable is defined to contain the path to a user's personal scratch space (see the sketch below).
- 30 TBytes total usable disk
- 5 GByte space, 15,000 inode quota in HOME per user
- 50 GByte space, 50,000 inode quota in SCRATCH per user
- SCRATCH gives better performance, but may be purged if space is needed
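A minimal sketch of working out of scratch space; the directory and file names are illustrative.

  # Run large jobs from your scratch directory rather than $HOME
  cd $SCRATCH
  mkdir myrun && cd myrun
  cp $HOME/mycode/a.out .

  # Copy results you want to keep back to $HOME or HPSS, since scratch may be purged
  cp results.dat $HOME/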
14. Project directories
- Project directories are coming (some are already here).
- Designed to facilitate group sharing of code and data (see the sketch below).
- Can be repo- or arbitrary group-based
- /home/projects/group
- For sharing group code
- /scratch/projects/group
- For sharing group data and binaries
- Quotas TBD
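A minimal sketch of sharing a binary through a project directory, assuming a hypothetical project group named "myproj"; actual group and directory names depend on your allocation.

  # Copy a binary into the shared project space and make it group-accessible
  cp mycode.x /scratch/projects/myproj/
  chgrp myproj /scratch/projects/myproj/mycode.x
  chmod g+rx /scratch/projects/myproj/mycode.x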
15. Compilers
- High performance Fortran/C/C++ compilers from PathScale.
- Fortran compiler: pathf90
- C/C++ compilers: pathcc, pathCC
- MPI compiler scripts use the PathScale compilers underneath and have all the MPI -I, -L, and -l options already defined (see the sketch below)
- mpif90
- mpicc
- mpicxx
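A minimal compile sketch; the source file names and optimization flags are illustrative.

  # Serial builds with the PathScale compilers
  pathf90 -O3 -o mycode.x mycode.f90
  pathcc -O3 -o mytool mytool.c

  # MPI builds via the wrapper scripts, which supply the MPI -I/-L/-l options
  mpif90 -O3 -o mympi.x mympi.f90
  mpicc -O3 -o mympic.x mympic.c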
16. Operating system
- Jacquard is running Novell SUSE Linux Enterprise Server 9.
- Has all the usual Linux tools and utilities (gcc, GNU utilities, etc.)
- It was the first enterprise-ready Linux for Opteron.
- Novell (indirectly) provides support and product lifetime assurances (5 yrs).
17. Message passing interface
- The MPI implementation is known as MVAPICH.
- Based on MPICH from Argonne with additions and modifications from LBNL for InfiniBand. Developed and supported ultimately by the Mellanox/Ohio State group.
- Provides standard MPI and MPI-IO functionality.
18. Batch system
- The batch scheduler is PBS Pro from Altair.
- Scripts are not much different from LoadLeveler (@ -> PBS directives); a sample script sketch follows this list.
- Queues for interactive, debug, premium charge, regular charge, and low charge.
- Configured to run jobs using 1-128 nodes (1-256 CPUs).
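A minimal PBS script sketch, assuming a hypothetical executable name and the debug queue; check queue names, resource syntax, and the MPI launch command against the Jacquard documentation.

  #PBS -q debug
  #PBS -l nodes=4:ppn=2
  #PBS -l walltime=00:30:00
  #PBS -N myjob

  # Run from the directory the job was submitted from
  cd $PBS_O_WORKDIR

  # Launch 8 MPI tasks (4 nodes x 2 CPUs per node)
  mpirun -np 8 ./mympi.x

Submit the script with qsub and monitor it with qstat.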
19. Performance and benchmarks
- Applications run at about 4x Seaborg speed; some more, some less
- NAS Parallel Benchmarks (64-way) are 3.5-7 times Seaborg
- Three applications the author has examined (-O3, out of the box):
- CAM 3.0 (climate): 3.5x Seaborg
- GTC (fusion): 4.1x Seaborg
- Paratec (materials): 2.9x Seaborg
20. User Experiences
- Positives
- Shorter wait in the queues
- Linux: many codes already run under Linux
- Good performance for 16-48 node jobs; some codes scale better than on Seaborg
- Opteron is fast
21. User Experiences
- Negatives
- The Fortran compiler is not common, so there are some porting issues.
- Small disk quotas.
- Unstable at times.
- Job launch doesn't work well (can't pass environment variables).
- Charge factor.
- Big-endian I/O.
22. Today's Presentations
- Jacquard Introduction
- Jacquard Nodes and CPUs
- High Speed Interconnect and MVAPICH
- Compiling
- Running Jobs
- Software overview
- Hands-on
- Machine room tour
23. Hands On
- We have a special queue, "blah", with 64 nodes reserved.
- You may work on your own code.
- Try building and running the test code (see the sketch below):
- Copy to your directory and untar: /scratch/scratchdirs/ragerber/NUG.tar
- 3 NPB parallel benchmarks: ft, mg, sp
- Configure in config/make.def
- make ft CLASS=C NPROCS=16
- Sample PBS scripts in run/
- Try new MPI version, opt levels, -g, IPM
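A minimal sketch of the hands-on steps above; the tar file path comes from the slide, while the unpacked directory name and the sample script name are illustrative.

  # Copy and unpack the exercise tar file into your scratch space
  cd $SCRATCH
  cp /scratch/scratchdirs/ragerber/NUG.tar .
  tar xf NUG.tar
  cd NUG

  # Adjust config/make.def if needed, then build one benchmark
  make ft CLASS=C NPROCS=16

  # Submit one of the sample PBS scripts from run/
  qsub run/ft.pbs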