1
Introduction to the NERSC HPCF
NERSC User Services
  • Hardware, Software, Usage
  • Mass Storage
  • Access and Connectivity

2
Hardware, part 1
  • Cray Parallel Vector Processor (PVP) Systems
  • 96 CPUs, shared-memory parallelism (Cray tasking,
    OpenMP)
  • J90SE clock is 100 MHz; peak performance is 200
    Mflops/CPU (125 actual)
  • SV1 clock is 300 MHz; peak performance is 1200
    Mflops/CPU (300 actual)
  • J90SE and SV1 are not binary compatible
  • Cray T3E MPP System
  • mcurie
  • 692 PEs (644 application, 33 command, 15 OS); 256
    MB/PE
  • PE clock is 450 MHz; peak performance is 900
    Mflops/PE (100 actual)

3
Hardware, part 2
  • IBM SP MPP System
  • gseaborg, Phase 1
  • 304 nodes (608 CPUs): 256 (512) compute, 8 (16)
    login, 16 (32) GPFS, 8 (16) network, 16 (32)
    service; 1 GB/node
  • Node clock is 200 MHz; peak performance is
    800 Mflops per CPU (200 actual)
  • Phase 2 will be bigger and faster
  • Visualization Server
  • escher, an SGI Onyx 2
  • 8 CPUs, 5 GB RAM, 2 graphics pipes
  • CPU clock is 195 MHz; 2 simultaneous video
    streams
  • Math Server
  • newton, a Sun UltraSPARC-II
  • 1 CPU, 512 MB RAM
  • CPU clock is 248 MHz

4
Hardware, part 3
  • Parallel Distributed Systems Facility (PDSF)
  • High Energy Physics facility for detector
    simulation and data analysis
  • Multiple clustered systems: Intel Linux PCs, Sun
    Solaris workstations
  • Energy Sciences Network (ESNet)
  • Major component of the Internet ATM Backbone
  • Specializing in information retrieval,
    infrastructure, and group collaboration
  • High Performance Storage System (HPSS)
  • Multiple libraries, hierarchical disk and tape
    archive systems
  • High speed transfers to NERSC systems
  • Accessible from outside NERSC
  • Multiple user interface utilities
  • Directories for individual users and project
    groups

5
PVP File Systems, part 1
  • HOME
  • permanent (but not archival)
  • 5 GB quota, regular backups, file migration
  • local to killeen, NFS-mounted on seymour and
    batch systems
  • poor performance for batch jobs
  • /u/repo/u10101
  • /Un/u10101
  • /u/ccc/u10101
  • /U0/u10101

6
PVP File Systems, part 2
  • TMPDIR
  • temporary (created/destroyed each session)
  • no quota (but NQS limits of 10-40 GB)
  • no backups, no migration
  • local to each machine
  • high-performance RAID arrays
  • system manages this for you
  • A.K.A. BIG
  • /tmp
  • location of TMPDIR
  • 14-day lifetime
  • A.K.A. /big
  • you manage this for yourself

7
PVP Environment, part 1
  • Unicos
  • Shells
  • Supported
  • sh
  • csh
  • ksh (same as sh)
  • Unsupported
  • tcsh (get it by module load tcsh)
  • bash (get it by module load tools)

8
PVP Environment, part 2
  • Modules
  • Found on many Unix systems
  • Sets any or all of: environment variables,
    aliases, executable search paths, man search
    paths, header-file include paths, library load
    paths
  • Exercise care when modifying startup files!
  • Cray's PrgEnv is modules-driven
  • Provided startup files are critical!
  • Add to .ext files, don't clobber originals
  • Append to paths, don't set them, and only if
    necessary
  • If you mess up, no compilers, etc.
  • Useful commands
  • module list
  • module avail
  • module load modfile
  • module display modfile
  • module help modfile
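
  A minimal csh session illustrating these commands (the tcsh module name
  comes from the previous slide; other module names vary by system):

    module list              # show the modules currently loaded
    module avail             # show every module available on this system
    module load tcsh         # load an unsupported shell, as described earlier
    module display tcsh      # show what the module changes in the environment
    module help tcsh         # show the module's own help text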

9
PVP Environment, part 3
  • Programming
  • Fortran 90 - f90
  • C/C++ - cc, CC
  • Assembler - as
  • Use the compiler (f90, cc, CC) for linking as well
  • f90 file naming conventions
  • filename.f - fixed-form Fortran 77 code
  • filename.F - fixed-form Fortran 77 code, run
    preprocessor first
  • filename.f90 - free-form Fortran 90 code
  • filename.F90 - free-form Fortran 90 code, run
    preprocessor first
  • Multiprocessing (a.k.a. multitasking,
    multithreading)
  • setenv NCPUS 4 (csh)
  • export NCPUS=4 (ksh)
  • a.out alone gives "Command not found." - the
    current directory is not in the search path
  • ./a.out (Note: no parallelism is specified on the
    command line; NCPUS controls it)
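
  A minimal compile-and-run sketch for a shared-memory (autotasked) job; the
  source file name prog.f90 and the choice of 4 CPUs are illustrative:

    f90 prog.f90              # compile and link; produces a.out
    setenv NCPUS 4            # csh: allow up to 4 CPUs at run time
    ./a.out                   # run; parallelism comes from NCPUS, not the
                              #   command line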

10
PVP Environment, part 4a
  • Execution modes
  • Interactive serial
  • 10 hours on killeen and seymour
  • 80 MW max memory
  • Interactive parallel
  • No guarantee of real-time concurrency
  • Batch queues
  • killeen, seymour, franklin, bhaskara
  • To see them: qstat -b
  • Queues shuffled at night, and sometimes during
    the day
  • Subject to change

11
PVP Environment, part 4b
  • Batch
  • User creates shell script (e.g., myscript)
  • Submits to NQE with cqsub myscript
  • Returns NQE task id (e.g., t1234)
  • NQE selects machine and forwards to NQS
  • Job remains pending (NPend) until resources
    available
  • NQS runs the job
  • Assigns an NQS request id (e.g., 5678.bhaskara)
  • Runs the job in the appropriate batch queue
  • Job log returned upon completion
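
  A minimal sketch of this workflow; the script contents and file names are
  illustrative, and any NQS resource directives are omitted here:

    #!/bin/csh
    # myscript - a simple batch script
    cd $TMPDIR                # run in the high-performance temporary space
    cp $HOME/a.out .          # stage the executable from HOME
    ./a.out > run.log         # run the code

    # From an interactive session:
    #   cqsub myscript        # submit to NQE; prints a task id such as t1234
    #   qstat -b              # list the batch queues (as on the earlier slide)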

12
PVP Environment, part 5
  • Libraries
  • Mathematics
  • nag, imsl, slatec, lsode, harwell, etc.
  • Graphics
  • ncar, gnuplot, etc.
  • I/O
  • HDF, netCDF, etc.
  • Applications
  • Amber, Ansys, Basis, Gamess, Gaussian, Nastran,
    etc.

13
PVP Environment, part 6
  • Tools
  • ja - job accounting
  • hpm - Hardware Performance Monitor
  • prof - Execution time profiler viewer
  • flowtrace/flowview - Execution time profiler
    viewer
  • atexpert - Autotasking performance predictor
  • f90 - Compiler feedback
  • totalview - Debugger (visual and line-oriented)

14
T3E File Systems, part 1
  • HOME
  • permanent (but not archival)
  • 2 GB quota, regular backups, file migration
  • poor performance for batch jobs
  • /u/repo/u10101
  • /Un/u10101
  • /u/ccc/u10101
  • /U0/u10101

15
T3E File Systems, part 2
  • TMPDIR
  • temporary (created/destroyed each session)
  • 75 GB quota (but NQS limits of 4-32 GB)
  • no backups, no migration
  • high-performance RAID arrays
  • system manages this for you
  • Can be used for parallel files
  • /tmp
  • location of TMPDIR
  • 14-day lifetime
  • A.K.A. /big
  • you manage this for yourself

16
T3E Environment, part 1
  • UNICOS/mk
  • Shells: sh/ksh, csh, tcsh
  • Supported
  • sh
  • csh
  • ksh (same as sh)
  • Unsupported
  • tcsh (get it by module load tcsh)
  • bash (get it by module load tools)

17
T3E Environment, part 2
  • Modules - manages user environment
  • Paths, Environment variables, Aliases, same as on
    PVP systems
  • Cray's PrgEnv is modules-driven
  • Provided startup files are critical!
  • Add to .ext files, don't clobber originals
  • Append to paths, don't set them, and only if
    necessary
  • If you mess up, no compilers, etc.
  • Useful commands
  • module list
  • module avail
  • module load modfile
  • module display modfile
  • module help modfile

18
T3E Environment, part 3a
  • Programming
  • Fortran 90 - f90
  • C/C++ - cc, CC
  • Assembler - cam
  • Use the compiler (f90, cc, CC) for linking as well
  • Same naming conventions as on the PVP systems
  • PGHPF - Portland Group HPF
  • KCC - Kuck and Associates C++
  • Get it via module load KCC
  • Multiprocessing
  • Execution in Single-Program, Multiple-Data (SPMD)
    mode
  • In Fortran 90, C, and C++, all processors execute
    the same program

19
T3E Environment, part 3b
  • Executables - Malleable or Fixed
  • specified at compilation and/or execution
  • f90 -Xnpes ... (e.g., -X64) creates a fixed
    executable
  • Always runs on the same number of (application)
    processors
  • Type ./a.out to run
  • f90 -Xm ..., or omitting the -X option, creates a
    malleable executable
  • ./a.out will run on a command PE
  • mpprun -n npes ./a.out runs on npes application
    PEs
  • Executing code can ask for
  • Process id (from zero up)
  • MPI_COMM_RANK(...)
  • Total number of PEs
  • MPI_COMM_SIZE(...)
  • PE or Process/Task ID used to establish
    master/slave identities, controlling execution
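
  A minimal sketch of the two build-and-run styles described above; the
  source file name and PE counts are illustrative:

    # Fixed executable: always runs on exactly 64 application PEs
    f90 -X64 -o prog_fixed prog.f90
    ./prog_fixed

    # Malleable executable: PE count chosen at run time
    f90 -o prog prog.f90
    ./prog                    # runs on a command PE
    mpprun -n 16 ./prog       # runs on 16 application PEs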

20
T3E Environment, part 4a
  • Execution modes
  • Interactive serial
  • < 60 minutes on one command PE, 20 MW max memory
  • Interactive parallel
  • < 30 minutes on < 64 processors, 29 MW memory per
    PE
  • Batch queues
  • To see them: qstat -b
  • Queues shuffled at night
  • Subject to change

21
T3E Environment, part 4b
  • (Old, obsolete) Example of T3E management and
    queue scheduling

22
T3E Environment, part 5
  • Math and graphics libraries, and application
    codes, are similar to those on the PVP systems
  • Libraries are needed for communication
  • MPI (Message-Passing Interface)
  • PVM (Parallel Virtual Machine)
  • SHMEM (SHared MEMory; non-portable)
  • BLACS (Basic Linear Algebra Communication
    Subprograms)
  • ScaLAPACK (SCAlable parts of LAPACK)
  • LIBSCI (including parallel FFTs), NAG, IMSL
  • I/O libraries
  • Cray's FFIO
  • NetCDF (NETwork Common Data Format)
  • HDF (Hierarchical Data Format)

23
T3E Environment, part 6
  • Tools
  • Apprentice - finds performance problems and
    inefficiencies
  • PAT - Performance analysis tool
  • TAU - ACTS tuning and analysis utility
  • Vampir - commercial trace generation and viewing
    utility
  • Totalview - multiprocessing-aware debugger
  • F90 - compiler feedback

24
SP File Systems, part 1
  • AIX is a Virtual Memory operating system
  • Each node has its own disks, with OS image, swap
    and paging spaces, and scratch partitions
  • Two types of user-accessible file systems
  • Large, globally accessible parallel file system,
    called GPFS
  • Smaller node-local partitions

25
SP File Systems, part 2
  • Environment variables identify directories
  • HOME - your personal home directory
  • Located in GPFS, so globally available to all
    jobs
  • Home directories are not currently backed up!
  • Quotas: 4 GB and 5,000 inodes
  • SCRATCH - one of your temporary spaces
  • Located in GPFS
  • Very large - 3.5 TB
  • Transient - purged after session or job
    termination
  • TMPDIR - another of your temporary spaces
  • Local to a node
  • Small - only 1 GB
  • Not particularly fast
  • Transient - purged on termination of creating
    session or batch job

26
SP File Systems, part 3
  • Directly-specified directory paths can also be
    used
  • /scratch - temporary space
  • Located in GPFS
  • Very large
  • Not purged at job termination
  • Subject to immediate purge
  • Quotas: 100 GB and 6,000 inodes
  • Your SCRATCH directory is set up in
    /scratch/tmpdirs/nodename/tmpdir.number
  • where number is system-generated
  • /scratch/username - user-created temporary
    space
  • Located in GPFS
  • Large, fast, encouraged usage
  • Not purged at job termination
  • Subject to purge after 7 days, or as needed
  • Quotas: 100 GB and 6,000 inodes

27
SP File Systems, part 4
  • /scr - temporary space
  • Local to a node
  • Small - only 1 GB
  • Your session-local TMPDIR is set up in
    /scr/tmpdir.number
  • where number is system-generated
  • Not user-accessible, except for TMPDIR
  • /tmp - System-owned temporary space
  • Local to a node
  • Very small - 65 MB
  • Intended for use by utilities such as vi for
    their temporary files
  • Dangerous - DO NOT USE!
  • If filled up, it can cause the node to crash!

28
SP Environment, part 1
  • IBM's AIX - a true virtual memory kernel
  • Not a single system image, as on the T3E
  • Local implementation of module system
  • No modules load by default
  • Default shell is csh
  • Shell startup files (e.g., .login, .cshrc, etc.)
    are links - DON'T delete them!
  • Customize extension files (e.g., .cshrc.ext), not
    startup files

29
SP Environment, part 2
  • SP Idiosyncrasies
  • All nodes have unique identities; different
    logins may put you on different nodes
  • Must change password, shell, etc. on gsadmin node
  • No incoming FTP allowed
  • xterms should not originate on the SP
  • Different sessions may be connected to different
    nodes
  • High speed I/O is done differently from the T3E
  • Processors are faster, but communication is
    slower, than on the T3E
  • PFTP is faster than native FTP
  • SSH access methods differ, slightly

30
SP Environment, part 3a
  • Programming in Fortran
  • Fortran - Fortran 77, Fortran 90, and Fortran 95
  • Multiple "versions" of the XLF compiler
  • xlf, xlf90 for ordinary serial code
  • xlf_r, xlf90_r for multithreaded code (shared
    memory parallelism)
  • mpxlf90, mpxlf90_r for MPI-based parallel code
  • Currently, must specify a separate temporary
    directory for Fortran 90 modules:
    xlf90 -qmoddir=$TMPDIR -I$TMPDIR modulesource.F
    source.F
  • IBM's HPF (xlhpf) is also available
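
  A few illustrative invocations of the compilers listed above; file names
  are placeholders, and the -qsmp=omp OpenMP flag is an assumption not shown
  on this slide:

    xlf90 -o serial prog.f               # ordinary serial Fortran 90
    xlf90_r -qsmp=omp -o smp prog.f      # multithreaded (OpenMP) build
    mpxlf90 -o mpiprog prog.f            # MPI-based parallel build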

31
SP Environment, part 3b
  • Programming in C and C++
  • C and C++ languages supported by IBM
  • Multiple "versions" of the XLC compiler
  • cc, xlc for ordinary serial C code
  • xlC for ordinary serial C++ code
  • cc_r, xlc_r for multithreaded C code (shared
    memory parallelism)
  • xlC_r for multithreaded C++ code (shared memory
    parallelism)
  • mpcc for MPI-based parallel C code
  • mpCC for MPI-based parallel C++ code
  • Kuck and Associates' KCC is also available in its
    own module

32
SP Environment, part 4a
  • Execution
  • Many ways to run codes
  • serial, parallel
  • shared-memory parallel, message-based parallel,
    hybrid
  • interactive, batch
  • Serial execution is easy
  • ./a.out <input_file >output_file
  • Parallel execution - SPMD Mode, as with T3E
  • Uses POE, a supra-OS resource manager
  • Uses Loadleveler to schedule execution
  • There is some overlap in options specifiable to
    POE and LoadLeveler
  • You can use one or both processors on each node
  • environment variables and batch options control
    this

33
SP Environment, part 4b
  • Shared memory parallel execution
  • Within a node, only
  • OpenMP, Posix Threads, IBM SMP directives
  • Message-based parallel execution
  • Across nodes and within a node
  • MPI, PVM, LAPI, SHMEM (planned)
  • Hybrid parallel execution
  • Threading and message passing
  • Most likely to succeed: OpenMP and MPI
  • Currently, MPI understands inter- vs. intra-node
    communication, and sends intra-node messages
    efficiently
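
  A sketch of a hybrid build and run, assuming that OpenMP threads are
  controlled with OMP_NUM_THREADS and that the IBM OpenMP compile flag is
  -qsmp=omp (neither appears on this slide, so verify both locally):

    mpxlf90_r -qsmp=omp -o hybrid prog.f   # thread-safe MPI compiler + OpenMP
    setenv OMP_NUM_THREADS 2               # one thread per CPU on a 2-CPU node
    poe ./hybrid -procs 8                  # 8 MPI tasks via POE (next slides)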

34
SP Environment, part 4c
  • Interactive execution
  • Interactive jobs run on login nodes or compute
    nodes
  • currently, there are 8 login nodes
  • Serial execution is easy
  • ./a.out <input_file >output_file
  • Parallel execution involves POE
  • poe ./a.out -procs 4 <input_file >output_file
  • Interactive parallel jobs may be rejected due to
    resource scarcity; there is no queueing
  • By default, parallel interactive jobs use both
    processors on each node
  • Batch execution
  • Batch jobs run on the compute nodes
  • By default, parallel batch jobs use both
    processors on each node
  • you will be charged for both, even if you
    override this
  • Use the LoadLeveler utility set to submit,
    monitor, cancel, etc. (see the sketch below)
  • requires a script specifying resource usage
    details, execution parameters, etc.
  • Several job classes for charging and resource
    limits: premium, regular, low
  • two job types - serial and parallel
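
  A minimal LoadLeveler script sketch; the keywords follow standard
  LoadLeveler usage and the class names come from this slide, but the exact
  keywords, limits, and defaults at NERSC may differ:

    #!/bin/csh
    #@ job_type         = parallel
    #@ class            = regular
    #@ node             = 2
    #@ tasks_per_node   = 2          # use both processors on each node
    #@ wall_clock_limit = 00:30:00
    #@ output           = job.out
    #@ error            = job.err
    #@ queue
    ./a.out

    # Submit and monitor with the LoadLeveler utilities:
    #   llsubmit myjob.ll
    #   llq
    #   llcancel <job_id>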

35
SP Environment, part 4d
  • SP Batch Queues and Resource Limits
  • Limits
  • 3 jobs running
  • 10 jobs considered for scheduling (idle)
  • 30 jobs submitted

36
SP Environment, part 5
  • Libraries and Other Software
  • Java, Assembler
  • Aztec, PETSc, ScaLAPACK
  • Emacs
  • Gaussian 98, NWChem
  • GNU Utilities
  • HDF, netCDF
  • IMSL, NAG, LAPACK
  • MASS, ESSL, PESSL
  • NCAR Graphics
  • TCL/TK

37
SP Environment, part 6
  • Tools
  • VT - visualization tool for trace visualization
    and performance monitoring
  • Xprofiler - graphical code structure and
    execution time monitoring
  • Totalview - multiprocessing-aware debugger
  • Other Debugging Tools
  • Totalview - available in its own MODULE
  • adb - general purpose debugger
  • dbx - symbolic debugger for C, C++, Pascal, and
    FORTRAN programs
  • pdbx - based on dbx, with functionality for
    parallel programming
  • TAU - ACTS tuning and analysis utility - planned!
  • Vampir - commercial trace generation and viewing
    utility - future!
  • KAP Suite - future?
  • PAPI - future?

38
HPSS Mass Storage
  • HPSS
  • Hierarchical, flexible, powerful,
    performance-oriented
  • Multiple user interfaces allow easy, flexible
    storage management
  • Two distinct physical library systems
  • May be logically merged in future software
    release
  • Accessible from any system from inside or outside
    NERSC
  • hpss.nersc.gov, archive.nersc.gov (from outside
    NERSC)
  • hpss, archive (from inside NERSC)
  • Accessible via several utilities
  • HSI, PFTP, FTP
  • Can be accessed interactively or from batch jobs
  • Compatible with system maintenance utilities
    (sleepers)
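
  An illustrative transfer session; file names are placeholders, and the
  one-line HSI command form shown here should be checked against NERSC's
  HSI documentation:

    hsi "put bigfile.dat"        # store a local file in your HPSS directory
    hsi "get bigfile.dat"        # retrieve it again
    pftp hpss.nersc.gov          # parallel FTP, for high-speed transfers
    ftp archive.nersc.gov        # plain FTP, e.g. from outside NERSC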

39
HPSS Mass Storage
  • HPSS
  • Allocated and accounted, just like CPU resources
  • Storage Resource Units (SRUs)
  • Open ended - you get charged, but not cut off, if
    you exceed your allocation
  • Project spaces available, for easy group
    collaboration
  • Used for system backups and user archives
  • hpss used for both purposes
  • archive is for user use only
  • Has modern access control
  • DCE allows automatic authentication
  • Special DCE accounts needed
  • Not uniformly accessible from all NERSC systems
  • Problems with PFTP on the SP system
  • Modern secure access methods are problematic
  • ftp tunneling doesn't work (yet)

40
Accessing NERSC
  • NERSC recognizes two connection contexts
  • Interaction (working on a computer)
  • File transfer
  • Use of SSH is required for interaction (telnet,
    rlogin are prohibited)
  • SSH is (mostly) standardized and widely available
  • Most Unix and Linux systems come with it
  • Commercial (and some freeware) versions available
    for Windows, Macs, etc.
  • SSH allows telnet-like terminal sessions, but
    protects account name and password with
    encryption
  • simple and transparent to set up and use
  • Can look and act like rlogin
  • SSH can forward xterm connections
  • sets up a special DISPLAY environment variable
  • encrypts the entire session, in both directions
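
  For example, assuming an OpenSSH-style client on your workstation (the
  user name and fully qualified host name are placeholders):

    ssh u10101@mcurie.nersc.gov      # encrypted terminal session, rlogin-style
    ssh -X u10101@mcurie.nersc.gov   # also forward X11; DISPLAY is set for you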

41
Accessing NERSC
  • SSH is encouraged for file transfers
  • SSH contains scp, which acts like rcp
  • scp encrypts login info and all transferred data
  • SSH also allows secure control connections
    through tunneling or forwarding
  • Here's how tunneling is done (see the sketch at
    the end of this slide)
  • Set up a terminal connection to a remote host
    with port forwarding enabled
  • This specifies a port on your workstation that
    ssh will forward to another host
  • FTP to the forwarded port - looks like you are
    ftping to your own workstation
  • Control connection (login process) is forwarded
    encrypted
  • Data connections proceed as any ftp transfer
    would, unencrypted
  • Ongoing SSH issues being investigated by NERSC
    staff
  • Not all firewalls allow ftp tunneling without
    passive mode
  • HPSS won't accept tunneled ftp connections
  • Workstation platform affects tunneling method
  • Methods differ slightly on the SP
  • New options, must use xterm forwarding, no ftp
    tunneling...
  • Different platforms accept different ciphers
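
  A sketch of the tunneling recipe above, assuming an OpenSSH-style client;
  the local port number 2121 and the host name are arbitrary examples:

    # On your workstation: forward local port 2121 to the remote FTP port
    ssh -L 2121:mcurie.nersc.gov:21 u10101@mcurie.nersc.gov

    # In another window: the control connection (login) goes through the
    # encrypted tunnel; data connections are unencrypted, as noted above
    ftp localhost 2121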

42
Information Sources - NERSC Web Pages
43
Information Sources - On-Line Lecture Materials