Transcript and Presenter's Notes

Title: Effective Use of NERSC File Systems


1
Effective Use of NERSC File Systems
  • Thomas M. DeBoni
  • NERSC/USG

2
Effective Use of NERSC File Systems
  • Contents
  • Home Directories
  • Scratch Space
  • Mass Storage
  • Networked File Systems
  • Resource Conservation
  • Examples
  • Details
  • See also
  • http://home.nersc.gov/training/tutorials/file.management.html

3
Home Directories
  • Your private portion of the file systems on a
    computer
  • Default current working directory when you log
    in
  • Known as the system environment variable $HOME
  • Usage limited in bytes
  • Max size is 2 GB on T3E, 5 GB on J-90s
  • Warnings issued at 75-90% usage
  • There's not enough space for everybody to use
    this much at the same time, so migration
    sometimes happens
  • Usage limited in inodes
  • An inode is a file or directory
  • Max number is 6000 on T3E, unsettled on J-90s
  • Warnings issued at 75-90% usage

4
Home Directories, cont.
  • $HOME
  • is routinely backed up
  • is shared among all J-90s
  • is NOT the fastest file system available
  • should be used for development, debugging, pre-
    and post-processing, and other administrative
    tasks
  • should NOT be routinely used by large jobs
    requiring high performance
  • Startup files (.cshrc, .login, etc.) may change
    your working directory on login
  • Remove all references to $WRK from these files

5
Home Directories, cont.
  • Files can be migrated to backing store
  • Largest and oldest files first
  • De-migrate them with the dmget command before use
    (see the sketch after this listing), or
  • Let them be automatically de-migrated when
    referenced, but with an unknown delay
  • Example listing

    killeen 257 ls -al
    total 64
    drwx------  2 u10101  zzz   4096 Sep 21 11:11 .
    drwxr-xr-x  5 u10101  zzz   4096 Sep 21 11:11 ..
    mrw-------  1 u10101  zzz   2414 Sep 21 11:11 decomp.job.log
    mrw-------  1 u10101  zzz   2712 Sep 21 11:11 decomp.job.out
    -rw-------  1 u10101  zzz   2381 Sep 21 11:11 decomp.job2.log
    -rw-------  1 u10101  zzz  11490 Sep 21 11:11 decomp.job2.out

An 'm' in the first column means the file has been migrated
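  • A minimal sketch of de-migrating the two 'm' files above before a
    job reads them; the dmget invocation simply names the files to be
    recalled (check the dmget man page on your system for details):

    # recall migrated files from backing store before the job needs them
    dmget decomp.job.log decomp.job.out
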
6
Home Directories, cont.
  • A word about quotas
  • Use the quota command to view them
    mcurie 154 quota
    File system /u5
    User deboni, Id 9950
                 Aggregate blocks (512 bytes)      Inodes
    User Quota   3906240  (19.2%)                  3500  (17.2%)
    Warning      3515616  (21.4%)                  2975  (20.2%)
    Usage         751416                            602

  • User Quota is the maximum usage allowed
  • Warning is the level at which a warning will be
    issued
  • Usage is your current usage
  • The parenthesized figures show current usage as a
    percentage of the maximum (19.2% of blocks, 17.2%
    of inodes) and of the warning level (21.4% and
    20.2%)

7
Scratch Space
  • Also known as temporary storage or working
    storage
  • A pool of fast RAID drives
  • The fastest file system available
  • Unique to each batch system
  • Not backed up
  • Usage limits are larger than for $HOME
  • 75 GB and 5000-6000 inodes on T3E, unsettled on
    J-90s
  • Should be used for large files and high
    performance jobs
  • This is transient space, and persistence will
    vary with usage and demand

8
Scratch Space, Cont.
  • System environment variable $TMPDIR
  • Created for each session or batch job
  • Randomly named, so always use $TMPDIR to refer to
    it
  • Deleted at the end of the session or job
  • Use $TMPDIR if you want the OS to manage your
    scratch space usage for you (a minimal sketch
    follows below).
  • E.g., on the J-90s, you can't log on to batch
    machines, and each has its own scratch space, so
    you can't get at it directly, as you can on the
    T3E.
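
  • A minimal batch-script sketch using $TMPDIR (the executable and
    file names are hypothetical):

    #!/bin/csh -f
    # $TMPDIR is created for this job and removed when the job ends
    cd $TMPDIR
    # stage the executable and input into fast scratch space
    cp $HOME/a.out $HOME/foo.input .
    # run, then copy the results back to $HOME before the job exits
    ./a.out < foo.input > foo.output
    cp foo.output $HOME/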

9
Scratch Space, cont.
  • /tmp or /usr/tmp
  • Create directories there for yourself (see the
    sketch after this list)
  • Watch out for name collisions with other users'
    directories
  • Delete files and directories as you finish with
    them
  • This space will be scavenged depending on demand;
    largest and oldest files are usually deleted
    first
  • It should be safe for 7 to 14 days
  • You must manage this scratch space for yourself
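
  • A small sketch of managing your own /usr/tmp area (the directory
    names are hypothetical; keying them to $USER helps avoid collisions
    with other users' directories):

    # create a private working directory for this run
    mkdir /usr/tmp/$USER
    mkdir /usr/tmp/$USER/run1
    cd /usr/tmp/$USER/run1
    # ... run the job ...
    # clean up as soon as you are finished; this space is scavenged
    cd
    rm -r /usr/tmp/$USER/run1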

10
Scratch Space, cont.
  • Pre-staging files to scratch space is a good
    idea, but...
  • You don't know when your batch job will run, so
    it may not work when batch queues are heavily
    loaded
  • Staging files in a batch script is a good idea,
    but...
  • It idles your processor ensemble and uses up
    serial time
  • So, do it both ways
    #!/bin/csh -f
    # Change to scratch directory
    cd /tmp/mydir
    # Check for presence of pre-staged files
    if (-e foo.input) then
      echo "input file prestaged"
    else
      echo "fetching input file"
      hsi "get foo.input"
    endif

11
Scratch Space, cont.
  • What about intermediate I/O? Example to follow
  • What about I/O from parallel programs?
  • This is a deep topic
  • Beware of interleaving of multiple outputs to a
    single file
  • J-90 codes typically use a small number of files
    at a time
  • T3E codes may use hundreds of files at a time
  • Special mechanisms exist to manage parallel files
    and I/O
  • General rule: the more you do, the faster it
    should be
  • See these sources for further info
  • http://home.nersc.gov/software/prgenv/opt/binary.html
  • http://home.nersc.gov/training/tutorials/T3E/IO/
  • http://www.cray.com/swpubs/
  • (See especially "CRAY T3E Fortran Optimization
    Guide", SG-2518 3.0)

12
Mass Storage
  • NERSC provides the High Performance Storage
    System (HPSS)
  • A modern, flexible, hierarchical system
  • Optimized for large files and fast transfers
  • Built on multiple disk farms and tape libraries
  • Used for system backups, file migration, and user
    archives
  • Has multiple user interface utilities
  • HSI - powerful, flexible, convenient utility,
    from SDSC and NERSC
  • pftp - parallel ftp, locally customized, fastest
    for large files
  • ftp - traditional version, available everywhere
  • The proper place to save large, important,
    long-lived files (e.g. raw output and restart
    data)
  • Requires a separate storage account (DCE
    account), but can automatically authenticate
    after initial setup

13
Networked File Systems
  • Networked or distributed file systems are
    intended to decouple a file's physical location
    from its logical location
  • Can be very convenient, but also dangerous
  • There are three of interest
  • NFS - Developed at Sun and has become a standard
    in workstation environments
  • Used as little as possible at NERSC, due to
    security and performance concerns
  • AFS - More modern, global in scope, with pretty
    good security
  • Used at NERSC via the gateway system
    dano.nersc.gov
  • Use AFS with care - it can ruin performance
  • DFS - A coming standard that NERSC is evolving
    toward; it also has good security and will be
    global in scope

14
Resource Conservation
  • Critical resources are expensive and rare
  • They are shared among (competed for by) all users
  • Four critical resources related to file systems
    use
  • Storage space - the actual files and bytes of
    data
  • File system entries - inodes; one per file or
    directory
  • Bandwidth - bits per second, in transfers between
    devices
  • Time - servers, I/O devices, and CPU cycles
  • NERSC meters (charges for) all these types
  • Resource conservation must be engineered in
    (don't depend on luck)

15
Bandwidth Conservation
  • Design parallel I/O carefully
  • Human readable I/O probably should be done by a
    single (master) process(or) to/from a single file
  • Binary I/O may be done by one process(or) or by
    many, as required
  • Binary data may occupy many files which match
    problem decomposition for parallel execution
  • Limits exist on the number of files that can be
    open at any one time
  • Flushing larger buffers is usually a better idea
    than flushing smaller ones more often.
  • For further info, see
  • Cray publication "Application Programmer's I/O
    Guide"
  • NERSC web doc http://home.nersc.gov/training/tutorials/T3E/IO/
  • Man page for the assign command (a hedged sketch
    follows below)
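
  • For example, a sketch of asking the I/O library to buffer a Fortran
    data file more aggressively before running; the file name is
    hypothetical, and the -b option (library buffer size in blocks)
    should be verified against the assign man page on your system:

    # a larger library buffer means fewer, larger physical writes
    assign -b 256 f:restart.dat
    ./a.out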

16
Bandwidth Conservation, cont.
  • Transfer files carefully
  • Session setup is not free - move files to/from
    mass storage in as few sessions or commands as
    possible; don't run pftp in a loop
  • Use multiple-file transfer commands, such as
    mget, when possible
  • Use the appropriate utility for the job
  • Meta-data operations do not involve actual file
    access
  • Renaming files or directories or moving files
    around within HPSS
  • Changing file or directory permissions in HPSS
  • Use hsi for these sorts of operations, for
    efficiency (see the one-line example below)
  • For further info see
  • NERSC web doc http://home.nersc.gov/hardware/storage/hpss.html
  • NERSC web doc http://home.nersc.gov/hardware/storage/hsi.html
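
  • For example, a rename and a permission change batched into a single
    hsi session (the directory names are hypothetical; the quoted,
    semicolon-separated form is the one-line syntax described later in
    these slides):

    # meta-data only: no file data moves, so one short session suffices
    hsi "mv bigjobs/job.old bigjobs/job.done ; chmod 640 bigjobs/job.done/*"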

17
Bandwidth and Time Conservation
  • Use the fastest utility available
  • Use pftp when moving within the NERSC domain
  • Use multiple-file transfer commands, such as
    mget, when possible
  • Use ftp when moving files into or out of the
    NERSC domain
  • Use the fastest devices and networks available
  • This is a deep area and oversimplified here ...
  • Don't make a fast machine wait on a slow one
  • Pre-stage and de-migrate files needed by batch
    jobs into fast storage space (see the sketch
    after this list)
  • Sometimes a multiple-step process is better
  • First, move files from outside NERSC onto a NERSC
    computer (a workstation)
  • Then, move files from the NERSC computer to the
    destination device
  • Avoid networked file systems
  • For further info see
  • NERSC web doc http://home.nersc.gov/hardware/storage/hpss.html
  • Man pages for ftp, pftp, and hsi
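
  • A small pre-staging sketch to run before submitting a batch job
    (the file and directory names are hypothetical):

    # recall the input from backing store so the batch job never waits on it
    dmget $HOME/bigjob.input
    # then copy it into fast scratch space ahead of time
    cp $HOME/bigjob.input /tmp/mydir/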

18
Bandwidth and Storage Conservation
  • Shrink files, if appropriate
  • If the file contains redundant or unimportant
    data
  • Such as white space in formatted output
  • Use Unix commands compress and gzip
  • Combine files into archives
  • If the files are small, transferring them
    individually may involve more setup time than
    transfer time
  • Use Unix commands tar, ar, and cpio
  • For more info, see
  • Man pages for all the above commands

19
Example 1 - batch pftp multiple file access with
a here-doc
    #!/bin/csh
    ...
    # First, copy the source from the submitting directory and compile it.
    pftp -i -v archive << EOF
    cd my_HPSS_directory
    mget data myprog
    quit
    EOF
    ja
    ./myprog < data > outfile
    ja -cst
    # Save the output file in HPSS.
    pftp -i -v archive << EOF
    cd my_HPSS_directory
    mput outfile restart
    quit
    EOF

(The << EOF ... EOF blocks above are here-documents.)
20
Example 2 - batch hsi multiple file access
    #!/bin/csh
    ...
    # First, copy the source from the submitting directory and compile it.
    hsi archive "cd my_HPSS_directory ; mget data myprog"
    ja
    ./myprog < data > outfile
    ja -cst
    # Save the output file in HPSS.
    hsi archive "cd my_HPSS_directory ; mput outfile restart"
    exit

21
Example 3 - Minimizing parallel cpu idling during
pftp I/O
    #!/bin/csh -f
    # preliminary job steps, including fetching executables and input files
    . . .
    # parallel code execution with mpprun
    set i = 1
    mpprun -n 128 a.out < bigjob.input > bigjob.output
    # Intermediate file movement, to save output file to mass storage
    mv bigjob.output bigjob.output$i
    mv bigjob.restart bigjob.restart$i
    # Generate a separate serial job to do the actual I/O
    echo "pftp -i -v archive <<EOF\
    mkdir bigjobs/job.06.15.99\
    cd bigjobs/job.06.15.99 \
    mput bigjob.output$i bigjob.restart$i \
    ls\
    quit\
    EOF" | qsub -q serial

22
Example 4 - Minimizing parallel cpu idling during
HSI I/O
    #!/bin/csh -f
    # preliminary job steps, including fetching executables and input files
    . . .
    # parallel code execution with mpprun
    set i = 1
    mpprun -n 128 a.out < bigjob.input > bigjob.output
    # Intermediate file movement, to save output file to mass storage
    mv bigjob.output bigjob.output$i
    mv bigjob.restart bigjob.restart$i
    # Generate a separate serial job to do the actual I/O
    echo "hsi archive 'mkdir bigjobs/job.06.15.99 ; cd bigjobs/job.06.15.99 ; mput bigjob.output$i bigjob.restart$i ; ls'" | qsub -q serial
    . . .
    # further parallel code execution, perhaps through shell script looping
    @ i = $i + 1
    . . .

23
Example 5 - Minimizing usage with tar and compress
    mcurie 181 ls -al STD* bigjob*
    -rw-r--r--  1 deboni  mpccc      0 Dec 23 12:26 STDIN.e48938
    -rw-r--r--  1 deboni  mpccc      0 Dec 23 12:27 STDIN.e48939
    -rw-r--r--  1 deboni  mpccc   1126 Dec 23 12:26 STDIN.l48938
    -rw-r--r--  1 deboni  mpccc   1126 Dec 23 12:27 STDIN.l48939
    -rw-r--r--  1 deboni  mpccc   7181 Dec 23 12:26 STDIN.o48938
    -rw-r--r--  1 deboni  mpccc   7181 Dec 23 12:27 STDIN.o48939
    -rw-------  1 deboni  mpccc    486 Feb  4 11:24 bigjob.output
    -rw-------  1 deboni  mpccc    486 Feb  4 12:10 bigjob.output1
    -rw-------  1 deboni  mpccc    486 Feb  4 12:10 bigjob.output2
    -rw-------  1 deboni  mpccc    972 Feb  4 11:24 bigjob.restart
    -rw-------  1 deboni  mpccc    972 Feb  4 12:10 bigjob.restart1
    -rw-------  1 deboni  mpccc    972 Feb  4 12:10 bigjob.restart2
    -------------------- total space 20988 bytes and 12 inodes
    mcurie 182 tar cf batch.tar STD* bigjob*
    mcurie 183 ls -al batch.tar
    -rw-------  1 deboni  mpccc  65536 Feb  5 09:15 batch.tar
    mcurie 184 compress batch.tar
    mcurie 185 ls -al batch.tar*

24
Example 6 - Minimizing usage with cpio
    cd $HOME
    /bin/find . -type f -size -15000c -atime +90 ! \
      \( -type m -o -type M \) -print > hitlist
    vi hitlist
    cat hitlist | cpio -co > myfiles.cpio
    cat hitlist | xargs rm -f

  • Here's what the above commands (NOT a shell
    script!) do
  • 1) First, cd to your home directory and generate
    a list of eligible files
  • 2) The find command will find regular files
    smaller than 15000 characters that have not been
    accessed in 90 days and are not migrated.
  • 3) Use "vi" to examine the list and delete items
    from it that you do not want removed.
  • 4) The fourth line will create the cpio file
    archive.
  • 5) The fifth line will remove all the files now
    stored in the archive.

25
Details - some useful ftp and pftp commands
    FTP Command        Meaning or action                        PFTP Variant
    get <rf> <lf>      retrieve a file                          pget
    put <lf> <rf>      store a file                             pput
    mget <f> <f> ...   retrieve multiple files                  mpget
    mput <f> <f> ...   store multiple files                     mpput
    del <f>            delete a file
    mdel <f> <f> ...   delete multiple files
    mkdir <d>          create a remote directory
    rmdir <d>          delete a remote directory
    cd <d>             change to remote directory
    lcd <d>            change to local directory
    ls, dir            list files in directory
    ldir               list files in local directory
    !<cmd>             perform <cmd> locally, outside ftp/pftp

    <f> file name, <lf> local file name, <rf> remote file name

  • Caveats
  • Be aware of where your actions will take place
  • Watch out for name collisions
  • (A short interactive sketch using these commands
    follows below.)
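
  • A brief interactive pftp sketch using the commands above (the
    directory and file names are hypothetical); cd acts on the remote
    HPSS side, lcd on the local machine:

    pftp archive
    cd my_HPSS_directory
    lcd /tmp/mydir
    mget run1.out run1.restart
    quit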

26
Details - HSI commands
  • HPSS File and Directory Commands

    get, mget, recv            - Copy file(s) from HPSS to a local directory
    cget                       - Copy file from HPSS to a local directory
                                 if not already there
    put, mput, replace,
      save, store, send        - Copy local file(s) to HPSS
    cput                       - Copy local file to HPSS if it doesn't
                                 already exist there
    cp, copy                   - Copy file within HPSS
    mv, move, rename           - Rename/relocate an HPSS file
    delete, mdelete, erase, rm - Remove a file from HPSS
    ls, list                   - List directory

27
Details - HSI commands, cont.
  • HPSS File and Directory Commands, cont.

    find               - Traverse a directory tree looking for a file
    mkdir, md, add     - Create an HPSS directory
    rmdir, rd, remove  - Delete an HPSS directory
    pwd                - Print current directory
    cd, cdls           - Change current directory

  • Local File and Directory Commands

    lcd, lcdls         - Change local directory
    lls                - List local directory
    lpwd               - Print current local directory
    !                  - Issue shell command

28
Details - HSI commands, cont.
  • File Administrative Information

    chmod   - Change permissions of file or directory
    umask   - Set file creation permission mask

  • Miscellaneous HSI commands

    help             - Display help file
    quit, exit, end  - Terminate HSI
    in               - Read commands from a local file
    out              - Write HSI output to a local file
    log              - Write commands and responses to a log file
    prompt           - Toggles prompting for mget, mput, mdelete

29
Details - HSI commands, cont.
  • HSI can accept input several different ways
  • From a command session, consisting of multiple
    lines and ending with an explicit termination
    command
  • From a single-line command, with semicolons (;)
    separating commands

    hsi "mkdir foo ; cd foo ; put data_file"

  • From a command file (an example command file
    appears below)

    hsi in command_file

  • HSI can read from standard input and write to
    standard output

    tar cvf - . | hsi put - : datadir.tar
    hsi get - : datadir.tar | tar xvf -

  • Wildcards are supported, but quoting must be used
    in one-line commands to prevent shell
    interpretation.

    hsi "cd foo ; mget data*"
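
  • For example, command_file might contain one HSI command per line
    (the names are hypothetical, mirroring Example 4):

    mkdir bigjobs/job.06.15.99
    cd bigjobs/job.06.15.99
    mput bigjob.output1 bigjob.restart1
    ls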

30
Details - HSI commands, cont.
  • WARNING: For 'get' and 'put' operations, HSI uses
    a different syntax than ftp: a colon (:) is used
    to separate local and remote file names.

    put local_file : hpss_file
    get local_file : hpss_file

  • Recursive operations are allowed for the
    following commands:

    cget, chgrp, chmod, chown, cput, delete, get, ls,
    mdelete, mget, mput, put, rm

  • Special commands exist for setting up variables
    whose values are directories, commands, and
    command-sets.
  • The complete HSI manual is online at
    http://home.nersc.gov/hardware/storage/hsi.html

31
Details - Tasks that HSI Simplifies
  • Accessing segmented CFS files
  • CFS handled files larger than 400 MB by splitting
    them into smaller subfiles and storing the
    subfiles. HSI is the only utility that can read
    and rejoin segmented CFS files to reproduce their
    original state. The procedure for handling such
    files is quite simple: simply read the first of
    the segmented subfiles from the archive storage
    system.
  • Renaming/moving or copying an entire subdirectory

    <mv/cp> path1 path2    renames/copies path1 to path2

  • Changing the permissions of several files at
    once

    chmod <perms> <files>  changes the permissions of all <files> to
                           <perms>; the file specifications may include
                           wildcards; the permissions may be given as
                           octal numbers or via symbolic designators
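
  • Two hedged one-line sketches of these operations (the paths and
    file names are hypothetical); the first renames a subdirectory, the
    second uses a wildcard and a symbolic mode:

    hsi "mv bigjobs/job.06.15.99 bigjobs/job.archive"
    hsi "chmod g+r bigjobs/job.archive/bigjob.output*"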

32
Details - Getting Access to AFS Directories
  • Don't do this in batch jobs!

    killeen 210 telnet dano.nersc.gov
    Trying 128.55.200.40...
    Connected to dano.nersc.gov.
    Escape character is '^]'.
    Hello killeen.nersc.gov.

    > WARNING: Unauthorized access to this computer system is      <
    > prohibited, and is subject to criminal and civil penalties.  <
    > Use constitutes consent to security testing and monitoring.  <

    UNIX(r) System V Release 4.0 (dano)
    login: u10101
    Password:

33
Details - Getting Access to AFS Directories, cont.

    AFS gateway user interface

    used to enable AFS access on J90's and T3E
    (select "enable attached hosts" (1) before exiting)

    1) enable attached hosts (knfs)
    2) disable attached hosts (unlog)
    3) list tokens (tokens)
    4) authenticate to another cell (klog)
    5) help
    0) exit (logoff)

    enter command(0-5) 3
    Tokens held by the Cache Manager

34
Details - Getting Access to AFS Directories, cont.
    enter command(0-5) 0
    Connection closed by foreign host.
    killeen 213 pwd
    /U3/u10101
    killeen 214 cd /afs/nersc.gov
    killeen 215 pwd
    /afs/nersc.gov

  • Option 4 is used to attach other cells,
    regardless of location, but you must have a login
    and password to use in the klog process.

35
Details - Dealing With Your DCE Account
  • DCE is a modern authentication methodology that
    will likely evolve into general use at NERSC
  • Right now, it merely controls access to HPSS
  • DCE accounts and login/password info must be
    obtained from NERSC Support staff
  • Initial login is necessary to change from initial
    password, and to set up future automatic
    authentication
  • It has been occasionally necessary for a few
    users to re-initialize their accounts
  • Both procedures are easy
  • DCE is currently most reliable on
    killeen.nersc.gov

36
Details - Dealing With Your DCE Account, cont.
  • Initial Setup: Once you have your initial DCE
    login and password, change it with the following
    procedure, on any NERSC mainframe

    dce_login
    Enter Principal Name: <HPSS_user_name>
    Enter Password: <current_or_temporary_HPSS/DCE_password>
    chpass -p
    Changing registry password for HPSS_user_name
    New password: <new_HPSS/DCE_passwd>
    Re-enter new password: <new_HPSS/DCE_passwd>
    Enter your previous password for verification:
      <current_or_temporary_HPSS/DCE_password>
    kdestroy
    exit
  • You will need to log in to HPSS only on your next
    use, and thereafter you will be automatically
    authenticated.

37
Details - Dealing With Your DCE Account, cont.
  • If you should get the following message from
    HPSS...

    mcurie 224 hsi hpss
    credential user mismatch
    use -l option to generate a new cred file
    DCE Principal:

  • ...it means automatic authentication has failed.
  • You must authenticate manually, until you
    re-initialize authentication:

    mcurie 232 hsi -l hpss
    DCE Principal: u10101
    Password:
    -----------------------------------------------------------
    NERSC HPSS USER SYSTEM(hpss)
    -----------------------------------------------------------
    V1.5 Username: u10101  UID: 0123
    ? quit

  • Subsequent usage should not require full login.
  • In rare and unusual situations, do "rm .hsipw"
    and then repeat the above.