Introduction to Using the Origin2000 (hecate): things to know...

Slides: 21
Provided by: fis7
Learn more at: https://w3.pppl.gov

Transcript and Presenter's Notes

1
Introduction to Using the Origin-2000
(hecate): things to know...
  • Stéphane Ethier
  • CPPG talk series
  • July 6, 1999

2
Hecate
  • 64-processor SGI Origin 2000
  • 16 Gbytes of physical memory
  • Shared memory but ...
  • Non-Uniform Memory Architecture (NUMA)

3
Logging in
  • Always use secure shell (ssh) to connect (works
    like rlogin but creates an encrypted session):
  • ssh -l username hecate.princeton.edu
  • Answer yes if you see:
  • Host key not found from the list of known
    hosts.
  • Are you sure you want to continue connecting
    (yes/no)?
  • (it creates an entry in the file
    ~/.ssh/known_hosts)
  • Enter your password as you would normally do
  • To use ssh-agent with RSA authentication, see
  • http://w3.pppl.gov/info/pppl-unix/ssh_with_RSA_authentication.html

4
First thing you see
  • Last login: Wed Jun 30 14:56:08 1999 from
    taurus.pppl.gov
  • You have mail.
  • --------------------------------------------------
  • Dedicated time:
  • 99/06/19 00:01 - 99/06/19 08:00 jps (cs)
  • --------------------------------------------------
  • To view the dedicated time again:
  • cat /etc/motd
  • To ask for dedicated time, send a message to
    sisyfolks@astro.princeton.edu 2 days in advance.
    You get the whole machine to yourself, but
    only at night.
  • Use the same format as above

5
Environment
  • Default shell: /usr/peyton/bin/tcsh
  • Default path: /usr/peyton/tex/bin:.:/usr/bsd:
    /usr/sbin:/usr/bin:/usr/bin/X11:
    /opt/totalview/bin:/u/$USER/bin/mipssgiirix6:
    /u/$USER/bin/common:/u/$USER/bin:/usr/peyton/bin:
    /usr/peyton/bin/X11:/usr/p-ton/bin/X11:/usr/p-ton/bin:
    /usr/princeton/bin:/usr/princeton/bin/X11
  • PPPL users should add to their ~/.cshrc:
  • setenv NCARG_ROOT /usr/pppl
  • setenv GRAPHCAP X11
  • setenv MANPATH ${MANPATH}:${NCARG_ROOT}/man
  • setenv PATH ${PATH}:${NCARG_ROOT}/bin:/usr/afsws/bin
  • ln -s /afs/pppl.gov/u/$USER /u/$USER/afs
  • alias lpr_pppl "/usr/pppl/bin/lpr
    -Ptheoryne@tern.pppl.gov \!*"
  • use lpr_pppl filename to print at PPPL

6
To transfer files
  • secure copy: scp
  • scp filename orion.pppl.gov:code/src
  • scp username@orion.pppl.gov:tmp/c\*.f
    ~/code
  • hecate_talk.txt | 7 KB | 7.1 kB/s |
    ETA: 00:00:00 | 100%
  • -> Same restrictions as ssh
  • NFS mounted filesystem (user must have same uid
    and gid to have same permissions)
  • /usr/pppl/work (hecate) <-> /work (PPPL)
  • AFS (Andrew File System): /afs/pppl.gov/u
  • Use klog to obtain an AFS token from the
    Authentication Server (you have to enter your AFS
    password) and gain write permission in your AFS
    directory (you can also use ssh token forwarding,
    port 1515)
  • see http://w3.pppl.gov/info/pppl-unix/AFS.html
  • (http://w3.pppl.gov/info/pppl-unix/ssh_with_RSA_authentication.html)

7
Scratch disks
  • Use the scratch disks to run jobs with large
    output files
  • You have to ask for a directory. You cannot
    create it yourself. Nice but NO BACKUPS!
  • hecate33 df -kl
    Filesystem         Type     kbytes        use     avail  %use  Mounted on
    /dev/root          xfs     4188256    2776528   1411728    67  /
    /dev/xlv/xlv11     xfs    17775472    2051364  15724108    12  /scr11
    /dev/xlv/xlv8      xfs     8884712    8822820     61892   100  /scr5
    /dev/xlv/xlv5      xfs     8880616    8742460    138156    99  /scr2
    /dev/xlv/xlv10     xfs    17775472    5967228  11808244    34  /scr10
    /dev/xlv/xlv3      xfs     4788712    1893716   2894996    40  /export/home
    /dev/dsk/dks4d1s0  xfs     4175968    2138104   2037864    52  /original
    /dev/xlv/xlv6      xfs     8884712    7091596   1793116    80  /scr3
    /dev/xlv/xlv7      xfs     8884712    8166236    718476    92  /scr4
    /dev/xlv/xlv4      xfs     8880616    7754328   1126288    88  /scr1
    /dev/xlv/xlv9      xfs    35550816   17554000  17996816    50  /scr9
    /dev/dsk/dks4d2s7  xfs     8884712    7071876   1812836    80  /scr8
    /dev/xlv/xlv1      xfs    71105632   62363740   8741892    88  /scr6
    /dev/xlv/xlv0      xfs   106672736  104936992   1735744    99  /scr0
    /dev/xlv/xlv2      xfs   142231648   64785880  77445768    46  /scr7
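A minimal sketch of how the df -kl listing above could be filtered to find a scratch disk with enough room before asking for a directory. It assumes the column order shown on this slide (avail in column 5, mount point in column 7) and that scratch mount points start with /scr. The helper name pick_scratch and the 10000000-kbyte threshold are illustrative, not part of the system.

```shell
#!/bin/sh
# Sketch: list scratch filesystems with at least a given number of kbytes free.
# Assumed column layout (as on the slide):
#   Filesystem Type kbytes use avail %use Mounted-on
# pick_scratch is a hypothetical helper name, not a system command.

pick_scratch() {
    min_free_kb=$1
    # Read df -kl style output on stdin and print "mountpoint avail_kb",
    # largest free space first, for mount points that start with /scr.
    awk -v min="$min_free_kb" '
        $7 ~ /^\/scr/ && $5 + 0 >= min + 0 { print $7, $5 }
    ' | sort -k2,2nr
}

# On hecate one would run:  df -kl | pick_scratch 10000000
# Here a few rows copied from the listing above stand in for live output.
sample='Filesystem Type kbytes use avail %use Mounted on
/dev/root xfs 4188256 2776528 1411728 67 /
/dev/xlv/xlv11 xfs 17775472 2051364 15724108 12 /scr11
/dev/xlv/xlv8 xfs 8884712 8822820 61892 100 /scr5
/dev/xlv/xlv2 xfs 142231648 64785880 77445768 46 /scr7'

result=$(printf '%s\n' "$sample" | pick_scratch 10000000)
printf '%s\n' "$result"
```

With the sample rows, only /scr7 and /scr11 pass the threshold; /scr5 is nearly full and / is not a scratch disk.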

8
Compilers
  • MIPSpro 7.2.1: f90, f77, cc, CC
  • To see the compilers' default options:
  • f90 -show_defaults
  • cat /etc/compiler.defaults
  • (-DEFAULT:abi=n32:isa=mips4:proc=r10k)
  • To access the MIPSpro 7.3 compilers (beta
    version):
  • source /opt/modules/modules/init/tcsh
  • module load modules
  • module load MIPSpro.73
  • module list (to see which modules are
    loaded)
  • Currently Loaded Modulefiles:
  • 1) modules 2) MIPSpro.73
  • To return to MIPSpro 7.2.1 (check with f90
    -version):
  • module unload MIPSpro.73
  • module load MIPSpro.721

9
Do I need -64 ???
  • When you compile with -n32, the chip executes
    in 64-bit mode and the software restricts
    addresses to 32 bits
  • Compile with -n32 when you want:
  • Smaller executables than with -64
  • Fewer data cache misses and less memory paging
    than with -64
  • Compile with -64 if your program:
  • Requires more than 2 gigabytes of address space
  • Will overflow a 32-bit long integer (for C)
  • Type sizes in bits:
    C Type         -n32   -64   Fortran Type
    -------------  ----  ----   ------------
    char              8     8   character
    short int        16    16   integer*2
    int              32    32   integer
    long int         32    64   none
    long long int    64    64   integer*8
    pointer          32    64   pointer
    float            32    32   real
    double           64    64   real*8
    long double      64   128   real*16
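A rough worked check of the "more than 2 gigabytes" rule, as a sketch only. It assumes the limit is 2^31 bytes and counts a single array; a real program also needs address space for code, stack and other data, so the answer is a lower bound. The helper name abi_for_array is hypothetical.

```shell
#!/bin/sh
# Sketch: does one large array alone push the program past 2 gigabytes?
# Assumption: the limit is taken as 2^31 bytes and only one array is counted.
# abi_for_array is a hypothetical helper name.

abi_for_array() {
    n_elements=$1      # number of array elements
    bytes_each=$2      # 8 for real*8 / double, 4 for real / float
    # awk works in double precision, which is exact for integers
    # far beyond the sizes of interest here.
    awk -v n="$n_elements" -v b="$bytes_each" 'BEGIN {
        limit = 2147483648            # 2^31 bytes
        if (n * b > limit) print "-64"; else print "-n32"
    }'
}

abi_small=$(abi_for_array 100000000 8)   # 1e8 real*8 elements = 0.8e9 bytes
abi_large=$(abi_for_array 400000000 8)   # 4e8 real*8 elements = 3.2e9 bytes
echo "1e8 real*8 elements: $abi_small"
echo "4e8 real*8 elements: $abi_large"
```

The first case fits under the limit, so -n32 remains an option; the second exceeds it and needs -64.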

10
Directory search for ld linker
  • Default search order
  • -n32 -> /usr/lib32, /lib32
  • -64 -> /usr/lib64, /lib64
  • -o32 -> /usr/lib, /lib
  • That is as far as it goes: for some libraries
    you have to know whether you are using -n32 or -64.
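The default search order above can be written down as a small lookup, sketched here. The helper name lib_dirs_for_abi is hypothetical, and the sketch only covers the default directories listed on this slide, not any directories added with -L.

```shell
#!/bin/sh
# Sketch: map an ABI flag to the default ld search directories listed above.
# lib_dirs_for_abi is a hypothetical helper name.

lib_dirs_for_abi() {
    case "$1" in
        -n32) echo "/usr/lib32 /lib32" ;;
        -64)  echo "/usr/lib64 /lib64" ;;
        -o32) echo "/usr/lib /lib" ;;
        *)    echo "unknown ABI flag: $1" >&2; return 1 ;;
    esac
}

# Example: show where the linker would look by default for each ABI.
for abi in -n32 -64 -o32; do
    echo "$abi: $(lib_dirs_for_abi "$abi")"
done
```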

11
Libraries of interest
  • Mathematical libraries
  • NAG
  • link with -lnag
  • use naghelp on hecate or at PPPL
  • SCSL (SGI/Cray Scientific Library)
  • SGI's own optimized BLAS, LAPACK, etc.
  • man scsl
  • man name_of_individual_routine
  • I/O format library
  • NETCDF
  • Graphic library
  • NCAR Graphics

12
Debuggers
  • Compile program with -g for all debuggers
  • dbx -> comes with the system, so always the latest
    version
  • (see man dbx)
  • cvd -> WorkShop Pro debugger (SGI-style
    graphics)
  • (see http://techpubs.sgi.com/library -> Books ->
    Developer -> Developer Magic: Debugger User's
    Guide)
  • totalview -> Beta version only (GUI oriented)
  • (see /opt/totalview/docs/User_Guide.pdf (or .ps)
    on hecate)
  • All of them can more or less debug multi-process
    programs

13
Linking multi-language programs
  • Compile object files from the source files of
    each language separately by using the -c option
  • cc -c more.c rest.c
  • f77 -c main.f
  • Use the compiler associated with the language of
    the main program to link the objects
  • f77 main.o more.o rest.o
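The two steps above can be sketched as a dry-run script that only prints the commands it would issue. It assumes just .c and .f sources and the cc and f77 drivers named on this slide; build_cmds is a hypothetical helper name. It issues one compile per file, which is equivalent to the single cc -c call shown above.

```shell
#!/bin/sh
# Sketch (dry run): print the compile and link commands for a mixed
# C/Fortran program. The first argument is the source file holding the
# main program; its language decides which compiler does the link.
# build_cmds is a hypothetical helper name.

build_cmds() {
    main_src=$1
    shift
    objs=""
    for src in "$main_src" "$@"; do
        case "$src" in
            *.c) echo "cc -c $src" ;;
            *.f) echo "f77 -c $src" ;;
            *)   echo "unsupported source file: $src" >&2; return 1 ;;
        esac
        objs="$objs ${src%.*}.o"      # main.f -> main.o, and so on
    done
    case "$main_src" in
        *.c) linker=cc ;;
        *.f) linker=f77 ;;
    esac
    echo "$linker$objs"
}

cmds=$(build_cmds main.f more.c rest.c)
printf '%s\n' "$cmds"
```

For the file names used on the slide, the last line printed is the same link command: f77 main.o more.o rest.o.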

14
Running a job...
  • To run a 1-cpu job on a fixed processor:
  • runon cpu command
  • To run a multi-processor job on chosen
    processors:
  • dplace -place pfile -mustrun command
  • Where pfile contains something like:
  • # physical placement_file for 2 specific memories and 4 threads
  • memories 2 in topology physical near \
  • /hw/module/3/slot/n1/node \
  • /hw/module/3/slot/n2/node
  • threads 4
  • distribute threads across memories

15
Multi-processor jobs
  • MPI programs (link with -lmpi):
  • setenv MPI_DSM_OFF
  • mpirun -np nproc dplace -place pfile
    -mustrun a.out < /dev/null
  • OpenMP or automatically parallelized code (link
    with -lmp):
  • setenv MP_SET_NUMTHREADS nproc
  • dplace -place pfile a.out > stdout.out
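As a sketch, the two recipes above can be kept in one helper that prints the tcsh command lines for a given job type, so they can be pasted into a job script. Nothing is executed; job_cmds is a hypothetical helper name, and pfile and a.out are the placeholders used on this slide.

```shell
#!/bin/sh
# Sketch (dry run): print the tcsh command lines from this slide for an
# MPI job or an OpenMP / auto-parallelized job.
# job_cmds is a hypothetical helper name.

job_cmds() {
    kind=$1 nproc=$2 pfile=$3 prog=$4
    case "$kind" in
        mpi)
            echo "setenv MPI_DSM_OFF"
            echo "mpirun -np $nproc dplace -place $pfile -mustrun $prog < /dev/null"
            ;;
        omp)
            echo "setenv MP_SET_NUMTHREADS $nproc"
            echo "dplace -place $pfile $prog > stdout.out"
            ;;
        *)
            echo "usage: job_cmds mpi|omp nproc pfile program" >&2
            return 1
            ;;
    esac
}

mpi_lines=$(job_cmds mpi 4 pfile a.out)
omp_lines=$(job_cmds omp 4 pfile a.out)
printf '%s\n' "$mpi_lines" "$omp_lines"
```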

16
Monitoring system activity: top
  • top or top -U username
    IRIX64 hecate 6.5 IP27   load averages: 47.97 48.17 44.95   10:24:21
    165 processes: 111 sleeping, 1 zombie, 53 running
    64 CPUs: 18.3% idle, 81.1% usr, 0.4% ker, 0.2% wait, 0.0% xbrk, 0.1% intr
    Memory: 16G max, 15G avail, 9274M free, 8693M swap, 8687M free swap
       PID   PGRP USERNAME PRI  SIZE   RES STATE    TIME WCPU   CPU COMMAND
    189228 190712 cen       20 1232M  452M run/8   21:20 13.5 99.52 lcdm192
    188986 191186 gnedin    20 1491M 1179M run/34  25:28 13.5 99.52 slh
    189817 190712 cen       20 1232M  452M run/51  23:54 13.5 99.51 lcdm192
    189832 190712 cen       20 1232M  452M run/53  22:56 13.5 99.51 lcdm192
    189826 190712 cen       20 1232M  452M run/26  26:09 13.5 99.51 lcdm192
    189809 190712 cen       20 1232M  452M run/30  25:12 13.5 99.51 lcdm192
    190299 190299 bode      20   92M   69M run/3   93:48 13.5 99.51 corrv.x
    189217 190712 cen       20 1232M  452M run/52  22:24 13.5 99.50 lcdm192
    189759 190712 cen       20 1232M  452M run/49  24:18 13.5 99.50 lcdm192
    189796 190712 cen       20 1232M  452M run/48  24:39 13.5 99.49 lcdm192
    188282 190712 cen       20 1232M  452M run/22  26:41 13.5 99.48 lcdm192
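To see at a glance who is using the processors, a captured copy of the top listing can be summarized per user. This is a sketch that assumes the column order shown above (USERNAME in column 3, STATE in column 7) and counts only rows whose state starts with run/; cpus_by_user is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: count running processes per user from saved top output.
# Assumes USERNAME is column 3 and STATE (e.g. run/8) is column 7,
# as in the listing above. cpus_by_user is a hypothetical helper name.

cpus_by_user() {
    awk '$7 ~ /^run\// { n[$3]++ }
         END { for (u in n) print n[u], u }' |
    sort -k1,1nr -k2,2           # busiest user first, ties by name
}

# Four rows copied from the listing above stand in for live output.
sample='189228 190712 cen 20 1232M 452M run/8 21:20 13.5 99.52 lcdm192
188986 191186 gnedin 20 1491M 1179M run/34 25:28 13.5 99.52 slh
189817 190712 cen 20 1232M 452M run/51 23:54 13.5 99.51 lcdm192
190299 190299 bode 20 92M 69M run/3 93:48 13.5 99.51 corrv.x'

usage=$(printf '%s\n' "$sample" | cpus_by_user)
printf '%s\n' "$usage"
```

For the four sample rows this reports two running processes for cen and one each for bode and gnedin.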

17
Monitoring tools: osview
  • Osview 2.1 : One Second Average   hecate 11:17:12 07/06/99 #1 int=5s
    Load Average             readch      1.2M        pgallocs    4.4K
      1 Min     52.839       writech     1.1M        Scheduler
      5 Min     52.900       iget           0          runq         0
      15 Min    52.119       System Memory             swapq        0
    CPU Usage                  Phys     15.8G          switch    2.3K
      user       85.71          kernel 535.3M          kswitch   2.3K
      sys         5.56           heap  292.1M          preempt    150
      intr        0.00           mbufs  27.0M        Wait Ratio
      gfxc        0.00           stream  8.6M          IO         0.0
      gfxf        0.00           ptbl   10.6M          Swap       0.0
      sxbrk       0.00          fs ctl 327.8M          Physio     0.0
      idle        8.73          fs data  3.7G
    System Activity              delwri 46.9M
      syscall     3.9K          free     9.2G
      read         300           data    2.5G
      write        300           empty   6.6G
      fork           0          userdata 2.1G
      exec           0          reserved    0

18
Monitoring tools: cpu
  • hecate cpu
    ------------------------ CPU Usage ------------------------
    MODULE          n1     n2     n3     n4
    1 ( 0- 7)      O H    H O    F F    F F
    2 ( 8-15)      O F    O F    O O    O O
    3 (32-39)      O O    O O    O O    O O
    4 (40-47)      O O    O O    O O    O O
    5 (16-23)      F O    O O    O O    O O
    6 (24-31)      O O    O O    O O    O O
    7 (48-55)      O O    O O    O O    O O
    8 (56-63)      O O    O O    O O    O O
  • F = free, O = occupied, H = half occupied
  • (100 - average_idle_time), includes I/O
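A sketch for counting the free processors in saved output of the cpu tool. It assumes each module row starts with the module number and ends with eight status letters, one per processor, as in the listing above; free_cpus is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: count processors marked F (free) in saved output of the cpu tool.
# Assumes module rows begin with a number and end with 8 status letters.
# free_cpus is a hypothetical helper name.

free_cpus() {
    awk '$1 ~ /^[0-9]+$/ {
        # the status letters are the last 8 fields of a module row
        for (i = NF - 7; i <= NF; i++) if ($i == "F") free++
    } END { print free + 0 }'
}

# Three module rows and the legend, copied from the listing above.
sample='MODULE n1 n2 n3 n4
1 ( 0- 7) O H H O F F F F
2 ( 8-15) O F O F O O O O
5 (16-23) F O O O O O O O
F free O Occupied H half occupied'

nfree=$(printf '%s\n' "$sample" | free_cpus)
echo "free processors: $nfree"
```

The header and legend lines are skipped because their first field is not a number.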

19
Monitoring tools: pscpu
  • MODULE 1                      MODULE 5
     0  -                         16  -
     1  ssh1      root            17  lcdm192   cen
     2  ps        ethier          18  lcdm192   cen
     3  corrv.x   bode            19  lcdm192   cen
     4  -                         20  lcdm192   cen
     5  -                         21  lcdm192   cen
     6  -                         22  lcdm192   cen
     7  lcdm192   cen             23  lcdm192   cen
  • MODULE 2                      MODULE 6
     8  lcdm192   cen             24  lcdm192   cen
     9  ssh1      root            25  lcdm192   cen
    10  ssh1      root            26  lcdm192   cen
    11  -                         27  lcdm192   cen
    12  mrc       jbreslau        28  lcdm192   cen
    13  mrc       jbreslau        29  lcdm192   cen
    14  mrc       jbreslau        30  lcdm192   cen
    15  mrc       jbreslau        31  lcdm192   cen
  • MODULE 3                      MODULE 7

20
Web pages
  • http://w3.pppl.gov/xtang/hecate/hecate.html
  • http://astro.princeton.edu/ognedin/supercomputer/
  • http://techpubs.sgi.com/