1
How to use the System
  • SSCK Workshop: Introduction to the HP XC6000
    Cluster, Karlsruhe, March 9-11, 2005
  • Hartmut Häfner, SSCK, Universität Karlsruhe (TH)
  • haefner@rz.uni-karlsruhe.de

2
Interactive Login
3
Available Services (1/2)
Diagram (labels only): HWW-Firewall, ssh (scp), passive ftp, XC1
  • No print manager
  • No exported file system

4
Available Services (2/2)
  • Login to the HP XC6000 cluster:
    ssh <user-id>@hwwxc1.hww.de
  • or, within Universität Karlsruhe:
    ssh <user-id>@xc1.rz.uni-karlsruhe.de
  • SSH2 from RZ-administrated workstations:
    ssh2 -p 22 <user-id>@hwwxc1.hww.de
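
A minimal sketch of picking the login host, assuming a POSIX shell. The user ID `ab1234` and the `INSIDE_UNI` switch are hypothetical placeholders; the host names are the ones listed on this slide. The command is only printed, not executed.

```shell
# Hypothetical placeholders: replace with your own user ID and location.
USER_ID="ab1234"
INSIDE_UNI="no"    # "yes" when connecting from within the university network

# Pick the login host named on the slide for each case.
if [ "$INSIDE_UNI" = "yes" ]; then
    LOGIN_HOST="xc1.rz.uni-karlsruhe.de"
else
    LOGIN_HOST="hwwxc1.hww.de"
fi

# Print the resulting command instead of running it.
LOGIN_CMD="ssh ${USER_ID}@${LOGIN_HOST}"
echo "$LOGIN_CMD"
```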

5
File Systems (1/2)
Diagram (labels only): node-local TMP file systems; Quadrics QsNet II
(single rail); FC network; HOME and WORK; 10 TB
6
File Systems (2/2)
  • global: all nodes access the parallel file
    system HP SFS, based on Lustre
  • local: each node has its own file system
  • permanent: files are stored permanently
  • temporary: files are removed at the end of the
    job or session

7
Moving Files (HP XC ↔ Workstations)
  • Either by the command scp or by passive ftp:
    scp <user-id>@ws.institute.uni-karlsruhe.de:mydata $HOME
    ftp ws.institute.uni-karlsruhe.de
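
The scp transfer can be sketched in both directions as follows. The user ID is a hypothetical placeholder, and the push direction is an addition for illustration that is not shown on the slide. Both commands are only printed.

```shell
USER_ID="ab1234"                           # hypothetical user ID
WS_HOST="ws.institute.uni-karlsruhe.de"    # workstation name from the slide
FILE="mydata"

# Workstation -> cluster: run on the cluster, pull the file into $HOME.
PULL_CMD="scp ${USER_ID}@${WS_HOST}:${FILE} \$HOME"
# Cluster -> workstation: run on the cluster, push the file back (assumed use case).
PUSH_CMD="scp \$HOME/${FILE} ${USER_ID}@${WS_HOST}:"

echo "$PULL_CMD"
echo "$PUSH_CMD"
```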

8
Module Concept
  • module is a user interface to the Modules
    package.
  • Typically, modulefiles instruct the module command
    to set or alter environment variables like PATH,
    MANPATH, etc.
  • Syntax: module [switches] [sub-command]
    [modulefile|path|directory]
  • Important switches are:
  • --force, -f: Force active dependency resolution.
    Modules named by a prereq command inside a
    modulefile are then loaded automatically.
  • --verbose, -v: Enable verbose messages during
    module command execution.
  • Further switches control the amount of output of
    the module command.

9
Modules (1/2)
  • module help [modulefile...]: Print the usage of
    each sub-command. If an argument is given, print
    the module-specific help information for the
    modulefile.
  • module add|load modulefile [modulefile...]: Load
    modulefile into the shell environment.
  • module unload|rm modulefile [modulefile...]: Remove
    modulefile from the shell environment.
  • module switch|swap modulefile1 modulefile2: Switch
    loaded modulefile1 with modulefile2.
  • module display|show modulefile [modulefile...]:
    Display information about the modulefile.
  • module list: List loaded modules.
  • module avail [path...]: List all available
    modulefiles in the current MODULEPATH.
  • module purge: Unload all loaded modulefiles.
  • Further commands add directories to MODULEPATH
    and add/remove modulefiles to/from the
    shell-dependent startup files.
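
A typical sequence of these sub-commands might look like the sketch below. The `run` helper is a hypothetical dry-run wrapper that prints each command unless `DRY_RUN=0` is set; the modulefile name `intel-compilers/7.1` is taken from the compiler slides.

```shell
# Dry-run helper: prints each command unless DRY_RUN=0 is set on the cluster.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

run module avail                      # what is installed?
run module add intel-compilers/7.1    # load a modulefile
run module list                       # confirm it is loaded
run module rm intel-compilers/7.1     # unload it again
```

On the cluster, `DRY_RUN=0` would execute the commands in the current shell, which matters because module changes the environment of the shell it runs in.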

10
Modules (2/2)
11
Modulefiles containing modifications to the
environment
  • modulefile is a file containing Tcl code
    extensions for the Modules package.
  • modulefile contains the changes to a user's
    environment needed to access an application.
  • modulefiles can also be used to implement site
    policies regarding the access and use of
    applications.
  • modulefiles also hide the notion of different
    types of shells. From the modulefile writer's
    perspective, this means one set of information
    will take care of every type of shell.
  • Change the default module environment by inserting
    module add <modulefile> in the setup file
    .bash_profile.
  • Add your own modulefiles by extending the
    MODULEPATH environment variable.
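
The last two points could be sketched as lines for `.bash_profile`. The directory `$HOME/modulefiles` is a hypothetical location for private modulefiles, and the module name is only an example taken from the compiler slides.

```shell
# Hypothetical directory holding private modulefiles.
MY_MODULES="$HOME/modulefiles"

# Prepend it to MODULEPATH once, without creating duplicates.
case ":${MODULEPATH:-}:" in
    *":$MY_MODULES:"*) ;;   # already present, nothing to do
    *) MODULEPATH="$MY_MODULES${MODULEPATH:+:$MODULEPATH}" ;;
esac
export MODULEPATH

# Load a default module at login, but only where the module command exists.
if command -v module >/dev/null 2>&1; then
    module add intel-compilers/7.1
fi
```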

12
Compilers (1/4)
  • Fortran: 2 Intel compilers (ifort in V8.1 and efc
    in V7.1), NAG compiler (f95), GNU
    compiler (g77, Fortran 77 only)
  • C/C++: 2 Intel compilers (icc in V8.1 and ecc in
    V7.1), GNU compiler (gcc)
  • General options: -c, -I<path>, -g,
    -O[0,1,2,3], -L<path>,
    -l<library>, -o <name>
  • NAG Fortran compiler: best choice to check the
    Fortran90/95 conformity of your program
  • Important specific options of the NAG Fortran
    compiler:
  • -Ounsafe: performs possibly unsafe optimizations
  • -dusty: allows the compilation of legacy
    software (errors become warnings)
  • -ieee=full|nonstd|stop: enables/disables all
    IEEE and deallocation facilities
  • -C: compiles code with all possible runtime
    checks
  • -mtrace: traces memory allocation and
    deallocation
  • -gline: compiles code to generate a traceback in
    case of runtime errors
  • -gc: enables automatic garbage collection of the
    executable
  • -thread_safe: compiles code for safe execution in
    a multi-threaded environment
  • -static: prevents linking with shared libraries

13
Compilers (2/4)
  • Intel Fortran suffix names
  • NAG Fortran suffix names

14
Compilers (3/4)
  • Change the compiler with a simple module command
    (by default the Intel compiler in version 8.1 is
    used): module add|load intel-compilers/7.1
  • Using different compilers:
  • don't use explicit compiler names
  • use the FC environment variable for the Fortran
    compiler
  • use the CC environment variable for the C/C++
    compiler

15
Compilers (4/4)
  • Compiling Fortran90/95 source code with the Intel
    compiler: ifort -c -O3 my_prog.f90
  • Compiling Fortran90/95 source code with an
    arbitrary Fortran compiler: $FC -c -O3
    my_prog.f90
  • Compiling C source code with the Intel compiler:
    icc -c -O3 my_prog.c
  • Compiling C++ source code with the compiler named
    in CC: $CC -c -O3 my_prog.C
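
A small sketch of compiler-independent compile lines, assuming FC and CC are set by the loaded compiler module. The fallback to `ifort` and `icc` when the variables are unset is an assumption made here for illustration; the commands are only printed.

```shell
# Fall back to the Intel compiler names from the slides if FC/CC are unset.
FC="${FC:-ifort}"
CC="${CC:-icc}"
OPT="-O3"

# Build the compile lines from the variables instead of hard-coded names.
F_COMPILE="$FC -c $OPT my_prog.f90"
C_COMPILE="$CC -c $OPT my_prog.c"
echo "$F_COMPILE"
echo "$C_COMPILE"
```

Switching compilers then only requires loading a different module; the build lines stay unchanged.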

16
Linking
  • Special compiler scripts to (compile and) link
    MPI programs (the scripts don't work together
    with the GNU compilers):
  • mpicc: (compile and) link C programs
  • mpicc.mpich: (compile and) link C programs in
    MPICH compatibility mode
  • mpiCC: (compile and) link C++ programs
  • mpiCC.mpich: (compile and) link C++ programs in
    MPICH compatibility mode
  • mpif77 or mpif90: (compile and) link Fortran
    programs. If MPICH compatibility mode is required,
    call mpif77.mpich or mpif90.mpich
  • Example for Fortran90/95 object code with the Intel
    compiler: mpif90 -o my_prog my_prog.o sub1.o
    sub2.o
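
The two-step build behind the example can be sketched as a dry run. The source file names are assumed from the object names on the slide, and `-c -O3` is borrowed from the compiler slides; nothing is compiled here, the commands are only printed.

```shell
# Source names assumed from the object names on the slide.
SOURCES="my_prog.f90 sub1.f90 sub2.f90"
OBJECTS=""

# Step 1: compile each source file to an object file.
for src in $SOURCES; do
    echo "mpif90 -c -O3 $src"
    OBJECTS="$OBJECTS ${src%.f90}.o"
done

# Step 2: link all object files into the executable.
LINK_CMD="mpif90 -o my_prog$OBJECTS"
echo "$LINK_CMD"
```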

17
Benchmarks
Measurements of Itanium2 (1.5 GHz) on the HP XC6000
cluster.
What is remarkable? The dot product runs very
slowly. The scatter of the performance rates when
the data reside in the L2 cache is very high (up
to 40 percent).
18
Benchmarks: Ping Pong within a node
Neighbor send/receive speed test
---------------------------------
--- Multiple simple Ping/Pong ---
---------------------------------
Clock overhead is 0.1736E-07 secs per snd/rcv.
    bytes       ms       MB/s
        0    0.001      0.000
        4    0.001      4.590
        8    0.001      7.875
       16    0.001     15.526
       32    0.001     34.528
       64    0.001     73.807
      128    0.001    127.790
      256    0.001    209.114
      512    0.001    436.936
     1024    0.002    674.397
     2048    0.007    308.211
     4096    0.007    550.674
     8192    0.010    834.013
    16384    0.014   1181.921
    32768    0.022   1507.639
    65536    0.036   1835.203
   131072    0.071   1854.967
   262144    0.126   2074.492
   524288    0.254   2060.727
  1048576    0.502   2089.745
Neighbor send/receive speed test
---------------------------------
--- Multiple double Ping/Pong ---
---------------------------------
Clock overhead is 0.2670E-08 secs per snd/rcv.
    bytes       ms       MB/s
        0    0.003      0.000
        4    0.004      1.131
        8    0.003      2.381
       16    0.003      4.744
       32    0.004      8.936
       64    0.003     19.438
      128    0.003     37.425
      256    0.004     65.514
      512    0.004    134.188
     1024    0.004    253.168
     2048    0.006    343.425
     4096    0.008    541.139
     8192    0.011    729.931
    16384    0.018    914.383
    32768    0.033   1002.130
    65536    0.064   1021.981
   131072    0.124   1055.018
   262144    0.233   1127.460
   524288    0.486   1078.485
  1048576    0.911   1151.049
19
Benchmarks: Ping Pong between nodes
Neighbor send/receive speed test
---------------------------------
--- Multiple simple Ping/Pong ---
---------------------------------
Clock overhead is 0.1736E-07 secs per snd/rcv.
    bytes       ms       MB/s
        0    0.003      0.000
        4    0.003      1.441
        8    0.003      2.905
       16    0.003      5.828
       32    0.003     11.605
       64    0.004     16.514
      128    0.004     30.021
      256    0.006     45.949
      512    0.006     87.778
     1024    0.006    161.227
     2048    0.008    271.353
     4096    0.010    408.196
     8192    0.015    546.295
    16384    0.025    659.058
    32768    0.045    735.468
    65536    0.084    781.339
   131072    0.164    797.490
   262144    0.320    818.153
   524288    0.660    794.346
  1048576    1.266    828.447
Neighbor send/receive speed test
---------------------------------
--- Multiple double Ping/Pong ---
---------------------------------
Clock overhead is 0.2666E-08 secs per snd/rcv.
    bytes       ms       MB/s
        0    0.009      0.000
        4    0.009      0.443
        8    0.009      0.899
       16    0.009      1.739
       32    0.009      3.497
       64    0.010      6.495
      128    0.010     12.508
      256    0.012     22.125
      512    0.012     43.344
     1024    0.013     80.759
     2048    0.016    129.800
     4096    0.021    197.767
     8192    0.031    267.897
    16384    0.050    329.511
    32768    0.089    367.656
    65536    0.172    381.144
   131072    0.334    392.161
   262144    0.673    389.346
   524288    1.313    399.440
  1048576    2.816    372.366
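
The MB/s column appears to be the message size divided by the measured time, with 1 MB taken as 10^6 bytes. This is inferred from the numbers, not stated in the output. A quick check for the largest simple Ping/Pong message between nodes:

```shell
BYTES=1048576    # message size from the last table row
MS=1.266         # time in milliseconds, as printed (rounded)

# bytes / (ms * 1000) gives 10^6 bytes per second.
BW=$(awk -v b="$BYTES" -v ms="$MS" 'BEGIN { printf "%.1f", b / (ms * 1000) }')
echo "$BW MB/s"
```

This yields 828.3 MB/s against the reported 828.447; the small difference is expected because the printed time is rounded to three decimals.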
20
Benchmarks: Overlap for short messages between
nodes
Neighbor send/receive overlap test
----------------------------------
------ Short messages ---------
----------------------------------
The used message length during computation is ... 10
the used vectorlength during computation is . . . 10
all times in seconds, >>ol_fac in percent!!!
Bal_fac  Rep_fac_comm  Rep_fac_comp  T_comm  T_comp  T_all  T_ol  ol_fac
      1        103548      12267030    1.03    1.03   1.90  0.16    15.6
      3        103548      12267030    1.02    3.08   3.94  0.17    16.6
The used message length during computation is ... 100
the used vectorlength during computation is . . . 100
all times in seconds, >>ol_fac in percent!!!
Bal_fac  Rep_fac_comm  Rep_fac_comp  T_comm  T_comp  T_all  T_ol  ol_fac
      1         69722       5725738    1.03    1.03   1.72  0.34    32.9
      3         69722       5725738    1.03    3.09   3.78  0.35    33.8
The used message length during computation is ... 1000
the used vectorlength during computation is . . . 1000
all times in seconds, >>ol_fac in percent!!!
Bal_fac  Rep_fac_comm  Rep_fac_comp  T_comm  T_comp  T_all  T_ol  ol_fac
      1         28496        979641    1.03    1.03   1.42  0.64    62.1
      3         28496        979641    1.03    3.09   3.49  0.63    61.4
21
Benchmarks: Overlap for long messages between
nodes
Neighbor send/receive overlap test
----------------------------------
------ Long messages ----------
----------------------------------
The used message length during computation is ... 10000
the used vectorlength during computation is . . . 10000
all times in seconds, >>ol_fac in percent!!!
Bal_fac  Rep_fac_comm  Rep_fac_comp  T_comm  T_comp  T_all  T_ol  ol_fac
      1          4670         68699    1.03    1.03   1.13  0.92    89.7
      3          4670         68699    1.03    3.00   3.23  0.80    78.0
The used message length during computation is ... 100000
the used vectorlength during computation is . . . 100000
all times in seconds, >>ol_fac in percent!!!
Bal_fac  Rep_fac_comm  Rep_fac_comp  T_comm  T_comp  T_all  T_ol  ol_fac
      1           503          6101    1.06    1.05   1.13  0.98    92.2
      3           503          6101    1.06    3.13   3.19  1.00    94.0
The used message length during computation is ... 1000000
the used vectorlength during computation is . . . 1000000
all times in seconds, >>ol_fac in percent!!!
Bal_fac  Rep_fac_comm  Rep_fac_comp  T_comm  T_comp  T_all  T_ol  ol_fac
      1            49           101    1.05    1.06   1.30  0.82    78.0
      3            49           101    1.05    3.18   3.35  0.88    83.7
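
The overlap columns appear consistent with T_ol = T_comm + T_comp - T_all and ol_fac = 100 * T_ol / T_comm. These relations are inferred from the printed numbers and are not stated in the benchmark output, so treat them as an assumption. A check against the short-message row with length 1000 and Bal_fac 1:

```shell
T_COMM=1.03; T_COMP=1.03; T_ALL=1.42    # values from the benchmark row

# Assumed relations: overlapped time and its share of the communication time.
OL=$(awk -v c="$T_COMM" -v p="$T_COMP" -v a="$T_ALL" 'BEGIN {
    t_ol = c + p - a
    printf "T_ol = %.2f s, ol_fac = %.1f %%", t_ol, 100 * t_ol / c
}')
echo "$OL"
```

For this row the result matches the printed T_ol of 0.64 and ol_fac of 62.1; other rows agree only up to the rounding of the printed times.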
22
Debugging with DDT
  • Commands: module add ddt; ddt hello

23
HP MPI Execution of Parallel Programs
  • The syntax to start a parallel application
    interactively is:
    mpirun [mpirun_options] <program>
    or
    mpirun [mpirun_options] -f <appfile>
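
The appfile variant might be prepared as sketched below. The host names `node1` and `node2`, the program name and the process counts are hypothetical, and the per-line layout (`-h host -np count program`) reflects the HP MPI appfile format as commonly documented; it should be checked against the mpirun man page on the system. The mpirun command itself is only printed.

```shell
# Write a small appfile to a temporary location (hypothetical hosts and counts).
APPFILE="${TMPDIR:-/tmp}/my_appfile.$$"
cat > "$APPFILE" <<'EOF'
-h node1 -np 2 ./my_prog
-h node2 -np 2 ./my_prog
EOF

# Print the launch command instead of running it.
RUN_CMD="mpirun -f $APPFILE"
echo "$RUN_CMD"
```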

24
HP MPI Environment Variables
  • Many environment variables

25
Numerical Libraries
  • HP XC Mathematical LIBrary (MLIB)
  • Intel Mathematical Kernel Library (MKL)
  • NAG Libraries (non-commercial users)
  • LINear SOLver package (LINSOL)

26
Well Established Open Source Libraries
  • BLAS
  • BLAS1,2,3 included in HP XC MLIB and Intel MKL
  • LAPACK
  • included in HP XC MLIB and Intel MKL
  • contains many functions for the solution of
    linear systems and eigenvalue problems for dense
    and banded matrices
  • ScaLAPACK
  • included in HP XC MLIB
  • contains above mentioned functions for parallel
    computers
  • Metis
  • included in HP XC MLIB
  • contains a special implementation of the graph
    partitioning and matrix reordering library

27
HP XC MLIB (1/2)
  • Functions from several areas: linear equations,
    least squares, eigenvalue problems, singular
    value decomposition, vector and matrix
    computations, convolutions and Fourier transforms
  • Four components: VECLIB, LAPACK, ScaLAPACK and
    SuperLU_DIST
  • VECLIB includes all BLAS1,2,3 and sparse BLAS
    subroutines, sparse linear equation solvers,
    sparse eigenvalue and eigenvector solvers, FFTs,
    correlation and convolution subprograms, random
    number generators and METIS V4.0.1
  • Load before use: module add hp-mlib/7.1 for Intel
    compiler V7.1 and module add hp-mlib for Intel
    compiler V8.1

28
HP XC MLIB (2/2)
  • Appropriate options at link time:
  • VECLIB: $FC -L$MLIBPATH -lveclib -openmp -o
    myprog myprog.f90
  • LAPACK: $FC -L$MLIBPATH -llapack -openmp -o
    myprog myprog.f90
  • ScaLAPACK: mpif90 -L$MLIBPATH -lscalapack -openmp
    -o myprog myprog.f90
  • SuperLU_DIST: mpif90 -L$MLIBPATH -lsuperlu_dist
    -openmp -o myprog myprog.f90
  • More details: http://www.rz.uni-karlsruhe.de/ssc/hpxc-mlib
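
The pattern behind these link lines can be sketched as a small dry-run helper: serial components use the Fortran compiler, parallel components the MPI wrapper. The sketch assumes that FC and MLIBPATH are environment variables provided by the loaded modules and that the options carry the usual leading dashes; the function name `mlib_link_cmd` is hypothetical.

```shell
# Map an MLIB component to its link command (dry run; nothing is compiled).
mlib_link_cmd() {
    case "$1" in
        veclib|lapack)          driver='$FC'    ;;  # serial components
        scalapack|superlu_dist) driver='mpif90' ;;  # parallel components
        *) echo "unknown component: $1" >&2; return 1 ;;
    esac
    # Variables stay unexpanded so the printed line can be pasted on the cluster.
    echo "$driver -L\$MLIBPATH -l$1 -openmp -o myprog myprog.f90"
}

mlib_link_cmd lapack
mlib_link_cmd scalapack
```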

29
Intel MKL (1/2)
  • Many components
  • BLAS,
  • Sparse BLAS,
  • LAPACK,
  • direct sparse solver PARDISO,
  • Vector Mathematical Library (VML) for core
    mathematical functions on vector arguments,
  • Vector Statistical Library (VSL) for generating
    vectors of pseudorandom numbers,
  • general Discrete Fourier Transform functions
    (DFT) and
  • a subset of FFTs
  • Load before use: module add mkl

30
Intel MKL (2/2)
  • Appropriate options at link time:
  • BLAS, FFT, VML, VSL etc.: $FC -L$MKLPATH
    -lmkl_ipf -lguide -lpthread -o myprog myprog.f90
  • LAPACK: $FC -L$MKLPATH -lmkl_lapack -lmkl_ipf
    -lguide -lpthread -o myprog myprog.f90
  • PARDISO: mpif90 -L$MKLPATH -lmkl_solver -lmkl_ipf
    -lguide -lpthread -o myprog myprog.f90
  • More details: http://www.rz.uni-karlsruhe.de/ssc/hpxc-mkl

31
NAG Libraries
  • NAG Fortran, NAG Fortran90 and NAG C libraries:
    only for non-commercial customers
  • Load before use: module add naglib/7.1; module add
    mkl/7.1 for Intel compiler V7.1 and module add
    naglib; module add mkl for Intel compiler V8.1
  • Appropriate options at compile and link time:
  • NAG Fortran Library: $FC myprog.f90
    -I$NAGLIBPATH/interface_blocks -L$NAGLIBPATH \
    -lnag-mkl -L$MKLPATH -lmkl_lapack
    -lmkl_ipf -lguide -lpthread
  • NAG Fortran90 Library: $FC myprog.f90
    -I$NAGLIBPATH/nag_mod_dir -L$NAGLIBPATH \
    -lnagfl90-noblas -L$MKLPATH
    -lmkl_lapack -lmkl_ipf -lguide -lpthread
  • NAG C Library: $CC myprog.c -I$NAGLIBPATH/include
    -L$NAGLIBPATH/nagc
  • More details: http://www.rz.uni-karlsruhe.de/ssc/hpxc-nag

32
LINSOL
  • LINSOL is a program package to solve large sparse
    linear systems
  • many iterative solvers
  • several polyalgorithms
  • (I)LU direct solvers as preconditioners
  • optimized for workstations (cache reuse),
    vector computers and parallel computers (MPI)
  • supporting 7 different storage patterns for
    sparse matrices (automatic optimization to the
    architecture of the computer)
  • Load before use: module add linsol
  • Appropriate options at compile and link time:
    mpif90 -L$LINSOLPATH -llinsol -lMPI myprog.o
    (running an MPI job)
    $FC -L$LINSOLPATH -llinsol -lnocomm myprog.o
    (running a serial job)
  • More details: http://www.rz.uni-karlsruhe.de/produkte/linsol