Title: Getting Started with HPC
1. Getting Started with HPC on Iceberg
Michael Griffiths, Corporate Information and Computing Services, The University of Sheffield
Email: m.griffiths@sheffield.ac.uk
2. Outline
- e-Science
- Review of hardware and software
- Accessing
- Managing Jobs
- Building Applications
- Single processor
- Shared memory multiprocessor tasks
- Parallel tasks
- Resources
- Getting Help
3. e-Science
- More science relies on computational experiments
- More large, geographically disparate, collaborative projects
- More need to share/lease resources
- Compute power, datasets, instruments, visualization
4. e-Science Requirements
- Simple and secure access to remote resources across administrative domains
- Minimally disruptive to local administration policies and users
- Large set of resources used by a single computation
- Adapt to non-static configuration of resources
5. Types of Grids
- Cluster Grid
  - Beowulf clusters
- Enterprise Grid, Campus Grid, Intra-Grid
  - Departmental clusters, servers and PC networks
- Cloud, Utility Grid
  - Access resources over the internet on demand
- Global Grid, Inter-Grid
  - White Rose Grid, National Grid Service, Particle Physics Data Grid
6. Three Applications of Grid Computing
- Compute grids
- Data grids
- Collaborative grids
7. Grid Types: Data Grid
- A computing network that stores large volumes of data across the network
- Heterogeneous data sources
8. Particle Physics Work at Sheffield University
The Large Hadron Collider (LHC) is under construction at CERN in Geneva. When it commences operation in 2007, it will be the world's highest-energy collider. Sheffield, a key member of the ATLAS collaboration building one of the two general-purpose detectors on the LHC ring, will be analysing the data generated by these experiments on its HPC cluster. The main motivations for building the LHC and ATLAS are finding the Higgs boson and finding evidence for supersymmetry, believed to be the next great discovery and layer in our understanding of the universe.
9. Grid Types: Collaborative
- Internet videoconferencing
- Collaborative Visualisation
10. EGEE
- The EGEE project brings together experts from over 27 countries
- Builds on recent advances in Grid technology
- Developing a service Grid infrastructure in Europe, available to scientists 24 hours a day
11. Available Grid Services
- Access Grid
- White Rose Grid
- Grid research
- HPC Service
- National Grid Service
- Compute Grid
- Data Grid (SRB)
- National HPC Services
- HPCx and CSAR (part of NGS)
- Portal Services
12. UK major Grid e-Science Centres
13. Review Hardware 1
- AMD-based cluster supplied by Sun Microsystems
- Processors: 252
- Cores: 624
- Performance: 435 GFLOPs
- Main memory: 2.296 TB
- Filestore: 8 TB
- Temporary disk space: 18 TB
- Physical size: 4 racks
- Power usage: 36 kW
14. Review Hardware 2
- Older V20 and V40 servers for the GridPP community
- Dual headnode
  - Node 1: login node
  - Node 2: cluster services (including the SGE master); behaves as failover node
- 435 cores for general use
  - 96 Sun X2200 nodes, each with 4 cores and 16 GB of RAM
  - 23 "Hawley" nodes, each with 8 cores and 32 GB of RAM
- Comparing L2 cache
  - AMD Opteron: 1 MB
  - UltraSPARC III Cu (Titania): 8 MB
15. Review Hardware 3
Inside an X2200 unit.
16. Review Hardware 4
- Two main interconnect types: gigabit Ethernet (commodity) and Infiniband (more specialist)
- Gigabit: supported as standard; good for job farms and small to mid-size systems
- Infiniband: high-end solution for large parallel applications; has become the de facto standard for clusters (4 Gb/s)
17. Infiniband specifications
- High data rates of up to 1880 MB/s
- ConnectX IB HCA card, single port, 16 Gb/s InfiniBand
- Low latency of around 1 microsecond (Gigabit Ethernet is of the order of 100 microseconds)
- SilverStorm 24-port InfiniBand DDR switch
18. Iceberg cluster overview
19. Review Hardware 5
- 64-bit vs 32-bit
- Mainly useful for programs requiring large memory; available on the bigmem nodes
- Greater floating-point accuracy
- Future-proof: 32-bit systems are becoming obsolete in HPC
20. White Rose Grid
YHMAN Network
21. Review Software 1
Ganglia
Portland, GNU
Sun Grid Engine v6
Redhat 64bit Scientific Linux
MPICH
Opteron
22. Review Software 2
- Maths and Statistical
  - Matlab 2009a, Scilab 5
  - R 2.0.1
- Engineering and Finite Element
  - Fluent 6.2.16, 6.1.25 and 6.1.22, also Gambit, Fidap and TGrid
  - Ansys v90
  - Abaqus
  - CFX 5.7.1
  - DYNA 91a
- Visualisation
  - IDL 6.1
  - OpenDX
23. Review Software 3
- Development
  - MPI: mvapich2, openmpi
    - mvapich2: Hawley nodes
    - OpenMPI: Hawley nodes and using GigE
  - OpenMP
  - NAG 20
  - ACML
- Grid
  - Globus 2.4.3 (via gpt 3.0)
  - SRB s-client tools to follow
24. Accessing 1: Registration
- Registration details at http://www.shef.ac.uk/wrgrid/register
- WRG users complete the form at http://www.wrgrid.group.shef.ac.uk/cms/WRG_form_newuser.pdf
- e-Science certificate registration is optional
25. Accessing 2: Logging in
- ssh client (a minimal command-line example is given below)
  - putty, SSH Secure Shell Client (from http://www.shef.ac.uk/wrgrid/trainingresources/SSHSecureShellClient-3.2.9.exe)
- X-Windows
  - Exceed 3D (just start Exceed and log in using the ssh client)
  - Cygwin
- Note when using the SSH Secure Shell Client
  - From the menu select Edit -> Settings
  - Select Connection -> Tunneling
  - Tick "Tunnel X11 connections"
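A minimal sketch of the command-line route (the hostname iceberg.shef.ac.uk and the username are assumptions; use the details given at registration):

  # Connect to the Iceberg head node with X11 forwarding enabled
  ssh -X your_username@iceberg.shef.ac.uk

With X11 forwarding enabled, graphical tools started on the cluster can display on your desktop.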
26. Accessing 3: Linux vs Solaris
- For end users, things are much the same
- RedHat Enterprise 5 (Scientific Linux)
- BASH is the default shell (use the up and down keys for history, type history, use tab for auto-completion)
- Setting aliases and environment variables for BASH looks like
  - export environment_var=setting
27. Accessing 4: Login Environment
- Paths and environment variables have been set up (change things with care)
- BASH, CSH and TCSH are set up by default; more exotic shells may need additional variables for things to work correctly
- Install any e-Science certificates in your .globus directory
28. Resources 1: Filestore
- Two areas of filestore are available on Iceberg
- A permanent, secure, backed-up area in your home directory: /home/username
- A data directory: /data/username
  - Not backed up to tape
  - Data is mirrored on the storage server
29. Resources 2: Scratch area
- Temporary data storage on local compute nodes
- I/O is much faster than the NFS-mounted /home and /data
- Data is not visible to other worker nodes and is not backed up
- Create a directory named after your username in /scratch on a worker and work from this directory (see the sketch below)
- The data in the /scratch area is deleted periodically when the worker is not accessed by any processes or jobs
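A minimal sketch of this pattern inside a job or qsh session (the program name, filenames and the copy-back step are illustrative assumptions):

  # Work in node-local scratch space, then copy results back to /data
  mkdir -p /scratch/$USER/myjob
  cd /scratch/$USER/myjob
  cp /data/$USER/input.dat .            # stage input from the shared area
  ./my_program input.dat > output.dat   # hypothetical program and files
  cp output.dat /data/$USER/            # copy results back before the job ends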
30. Resources 3: Storage Allocations
- Storage allocations for each area are as follows
- Your home directory has a filestore quota of 5 GB, but you can request additional space
- If you change directory to /data you will see a directory labelled with your username
- In /data you can store 50 GB of files; you can request additional space
31. Resources 4: Important Notes
- The data area is not backed up
- Check your quota regularly; if you go over quota the account will become frozen and you will need to contact iceberg-admins
- Check your quota using the quota command
- If you exceed your quota, remove files using the RM command (note the upper case)
32. Resources 5: Transferring Data
- Command-line tools such as scp, sftp (see the examples below)
- Use sftp tools such as WinSCP for Windows
  - http://winscp.net/eng/index.php
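Hedged examples of copying files between a desktop machine and Iceberg (hostname, username and paths are assumptions):

  # Copy a single file to your home directory on Iceberg
  scp results.tar.gz your_username@iceberg.shef.ac.uk:~/

  # Copy a whole directory to your /data area
  scp -r my_project/ your_username@iceberg.shef.ac.uk:/data/your_username/

  # Start an interactive transfer session
  sftp your_username@iceberg.shef.ac.uk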
33. Running programs on Iceberg
- Iceberg is the gateway to the cluster of worker nodes and the only machine where direct logging in is allowed.
- Iceberg's main purpose is to allow access to the worker nodes, NOT to run CPU-intensive programs.
- All CPU-intensive computations must be performed on the worker nodes. This is achieved with the qsh command for interactive jobs and the qsub command for batch jobs.
- Once you log into Iceberg, taking advantage of the power of a worker node for interactive work is done simply by typing qsh and working in the new shell window that is opened. This apparently trivial task has in fact queried all the worker nodes for you and started a session on the least loaded worker in the cluster.
- The next set of slides assumes that you are already working on one of the worker nodes (qsh session).
34. Managing Jobs 1: Sun Grid Engine Overview
- Resource management system, job scheduler, batch system
- Can schedule serial and parallel jobs
- Serial jobs run in individual host queues
- Parallel jobs must include a parallel environment request (-pe <pe_name> N)
35. Job scheduling on the cluster
[Diagram: the SGE master node schedules jobs onto queues (Queue-A, Queue-B, Queue-C) spread across the SGE worker nodes]
- Scheduling takes account of: queues, policies, priorities, share/tickets, resources, users/projects
37. Managing Jobs 2: Job Scheduling
- Job schedulers work predominantly with batch jobs, which require no user input or intervention once started
- The installation here also supports interactive use via qsh
38. Managing Jobs 3: Working with SGE jobs
- There are a number of commands for querying and modifying the status of a job running or queued by SGE
  - qsub (submit a job to SGE)
  - qstat (query job status)
  - qdel (delete a job)
39. Managing Jobs 4: Submitting Serial Jobs
- Create a submit script (example.sh):

  #!/bin/sh
  # Scalar benchmark
  echo "This code is running on" `/bin/hostname`
  /bin/date

- The job is submitted to SGE using the qsub command:
  qsub example.sh
40. Managing Jobs 5: Options Used with SGE

Option | Effect
-l h_rt=hh:mm:ss | The wall clock time. This parameter must be specified; failure to include it will result in the error message "Error: no suitable queues".
-l mem=memory | Sets the memory limit, e.g. -l mem=10G
-l h_vmem=memory | Sets the limit of virtual memory required (for parallel jobs, per processor).
-help | Prints a list of options.
-pe ompigige np | Specifies the parallel environment to be handled by the Score system. np is the number of nodes to be used by the parallel job. Note this is always one more than needed, as one process must be started on the master node, which, although it does not carry out any computation, is necessary to control the job.
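Putting a few of these options together, an illustrative submission (the resource values and script name are assumptions) might be:

  # Request 8 hours of wall clock time and 10 GB of memory for a serial job
  qsub -l h_rt=08:00:00 -l mem=10G example.sh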
41. Managing Jobs 6: Options Used with SGE

Option | Effect
-cwd | Execute the job from the current working directory; output files are sent to the directory from which the job was submitted, not to the user's home directory.
-m be | Send mail to the owner at the beginning and at the end of the job.
-S shell | Use the specified shell to interpret the script rather than the C shell (default).
-masterq iceberg.q | Specifies the name of the master scheduler as the master node (iceberg).
-V | Export all environment variables to all spawned processes.
42. Managing Jobs 7: qsub
- qsub arguments:
  qsub -o outputfile -j y -cwd ./submit.sh
- OR in the submit script:

  #!/bin/bash
  #$ -o outputfile
  #$ -j y
  #$ -cwd
  /home/horace/my_app
43. Managing Jobs 8: Interactive Use
- Interactive, but with a dedicated resource
- qsh
- Then use as your desktop machine (see the sketch below)
  - Fluent, Matlab
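A hedged sketch of an interactive session (the application invocation is an illustrative assumption; plain qsh is sufficient in the simple case):

  # Start an interactive session on the least loaded worker node
  qsh
  # In the new shell window that opens on the worker, run the application, e.g.
  matlab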
44. Managing Jobs 9: Deleting Jobs with qdel
- Individual job:
  qdel 151
  gertrude has registered the job 151 for deletion
- List of jobs:
  qdel 151 152 153
- All jobs running under a given username:
  qdel -u <username>
45. Managing Jobs 9: Monitoring Jobs with qstat
- To list the status and node properties of all nodes:
  qstat (add -f to get a full listing)
- Information about a user's own jobs and queues is provided by the qstat -u username command, e.g.
  qstat -u fred
- Monitor a job and show memory usage:
  qstat -f -j jobid | grep usage
46. Managing Jobs 10: qstat Example

  job-ID  prior    name        user      state  submit/start at      queue                           slots  ja-task-ID
  --------------------------------------------------------------------------------------------------------------------
  206951  0.51000  INTERACTIV  bo1mrl    r      07/05/2005 09:30:20  bigmem.q@comp58.iceberg.shef.a      1
  206933  0.51000  do_batch4   pc1mdh    r      07/04/2005 16:28:20  long.q@comp04.iceberg.shef.ac.      1
  206700  0.51000  l100-100.m  mb1nam    r      07/04/2005 13:30:14  long.q@comp05.iceberg.shef.ac.      1
  206698  0.51000  l50-100.ma  mb1nam    r      07/04/2005 13:29:44  long.q@comp12.iceberg.shef.ac.      1
  206697  0.51000  l24-200.ma  mb1nam    r      07/04/2005 13:29:29  long.q@comp17.iceberg.shef.ac.      1
  206943  0.51000  do_batch1   pc1mdh    r      07/04/2005 17:49:45  long.q@comp20.iceberg.shef.ac.      1
  206701  0.51000  l100-200.m  mb1nam    r      07/04/2005 13:30:44  long.q@comp22.iceberg.shef.ac.      1
  206705  0.51000  l100-100sp  mb1nam    r      07/04/2005 13:42:07  long.q@comp28.iceberg.shef.ac.      1
  206699  0.51000  l50-200.ma  mb1nam    r      07/04/2005 13:29:59  long.q@comp30.iceberg.shef.ac.      1
  206632  0.56764  job_optim2  mep02wsw  r      07/03/2005 22:55:30  parallel.q@comp43.iceberg.shef     18
  206600  0.61000  mrbayes.sh  bo1nsh    r      07/02/2005 11:22:19  parallel.q@comp51.iceberg.shef     24
  206911  0.51918  fluent      cpp02cg   r      07/04/2005 14:19:06  parallel.q@comp52.iceberg.shef      4
  206954  0.51000  INTERACTIV  php04awb  r      07/05/2005 10:06:17  short.q@comp01.iceberg.shef.ac      1
47. Managing Jobs 11: Monitoring Job Output
- The following is an example of submitting an SGE job and checking the output produced:
  qsub -pe mpich 8 myjob.sh
  job <131> submitted
  qstat -f (is the job running?)
  tail -f myjob.sh.o131
48. Managing Jobs 12: SGE Job Output
- When a job is queued it is allocated a job number. Once it starts to run, output sent to standard output and standard error is spooled to files called:
  <script>.o<jobid>
  <script>.e<jobid>
49. Managing Jobs 13: Reasons for Job Failures
- SGE cannot find the binary file specified in the job script
- Required input files are missing from the startup directory
- An environment variable is not set (LM_LICENSE_FILE etc.)
- Hardware failure (e.g. MPI ch_p4 or ch_gm errors)
50. Managing Jobs 14: SGE Job Arrays
- Add to the qsub command or script file (with #$ at the beginning of the line):
  -t 1-10
- Would create 10 tasks from one job
- Each task has SGE_TASK_ID set in its environment (see the sketch below)
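A hedged sketch of a complete job-array script (the program name, input/output file naming and time request are illustrative assumptions):

  #!/bin/sh
  #$ -cwd
  #$ -l h_rt=01:00:00
  #$ -t 1-10
  # Each of the 10 tasks processes its own input file, selected by SGE_TASK_ID
  ./my_program input.$SGE_TASK_ID.dat > output.$SGE_TASK_ID.dat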
51. Specifying The Memory Requirements of a Job
- Policies that apply to queues
  - Default memory requirement for each job is 4 GB
  - Jobs will be killed if memory use exceeds the amount requested
- Determine the memory requirements for a job as follows:
  qstat -f -j jobid | grep mem
- The reported figures indicate:
  - the currently used memory (vmem)
  - the maximum memory needed since startup (maxvmem)
  - cumulative memory usage in memory*seconds (mem)
- When you run the job next, use the reported value of vmem to specify the memory requirement (see the example below)
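For example (illustrative job id and values), if maxvmem was reported as roughly 6 GB, the next submission might request a little more than that:

  # Check memory usage of a running job (replace 206951 with your own job id)
  qstat -f -j 206951 | grep mem

  # Resubmit requesting 7 GB, slightly above the observed maximum
  qsub -l h_rt=08:00:00 -l mem=7G myjob.sh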
52. Managing Jobs 15: SGE Parallel Environments
- Parallel environments on Iceberg
- ompigige
- openmp
- openmpi-ib
- mvapich2-ib
- See later
53. Managing Jobs 16: Job Queues on Iceberg

Queue name | Job size limit | System specification
short.q | 8 CPU hours |
long.q | 168 CPU hours | 2 GB memory per CPU
parallel.q | 168 CPU hours | Jobs requiring multiple CPUs (V40s)
openmp.q | 168 CPU hours | Shared memory jobs using OpenMP
parallelx22.q | 168 CPU hours | Jobs requiring multiple CPUs (X2200s)
54. Managing Jobs 17: Interactive Computing
- Software that runs interactively should not be run on the head node.
- Instead you must run interactive jobs on an execution node (see the qsh command described earlier).
- The time limit for interactive work is 8 hours.
- Interactive work on the head node will be killed off.
55. Checkpointing Jobs
- Simplest method for checkpointing:
  - Ensure that applications save their configurations at regular intervals so that jobs may be restarted (if necessary) from these configuration files.
- Using the BLCR checkpointing environment:
  - BLCR commands
  - Using a BLCR checkpoint with an SGE job
56. Checkpointing Using BLCR
- BLCR commands: cr_run, cr_checkpoint, cr_restart (a combined sketch follows below)
- Run the code:
  cr_run ./executable
- To checkpoint a process with process id PID:
  cr_checkpoint -f checkpoint.file PID
- To restart the process from a checkpoint:
  cr_restart checkpoint.file
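A hedged end-to-end sketch of these commands in a single session (the executable name and checkpoint filename are illustrative assumptions):

  # Start the program under BLCR control and remember its process id
  cr_run ./my_program &
  PID=$!

  # Later, write a checkpoint of the running process to a file
  cr_checkpoint -f my_program.ckpt $PID

  # After a failure or requeue, resume from the saved checkpoint
  cr_restart my_program.ckpt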
57. Using BLCR Checkpoint with An SGE Job
- A checkpoint environment called BLCR has been set up; it is accessible using the test queue cstest.q.
- An example of a checkpointing job script would look something like:

  #$ -l h_cpu=168:00:00
  #$ -q cstest.q
  #$ -c hh:mm:ss
  #$ -ckpt blcr
  cr_run ./executable >> output.file

- The -c hh:mm:ss option tells SGE to checkpoint at the specified time interval.
- The -c sx option tells SGE to checkpoint if the queue is suspended, or if the execution daemon is killed.
58. Tutorials
- On Iceberg, copy the contents of the tutorial directory to your user area into a directory named sge:
  cp -rp /usr/local/courses/sge sge
  cd sge
- In this directory the file readme.txt contains all the instructions necessary to perform the exercises.
59. Building Applications 1: Overview
- The operating system on Iceberg provides full facilities for scientific code development, compilation and execution of programs.
- The development environment includes:
  - debugging tools provided by the Portland suite,
  - the Eclipse IDE.
60. Building Applications 2: Compilers
- C and Fortran programs may be compiled using the GNU or Portland Group compilers. Invoking these compilers is summarised in the following table:

Language | GNU compiler | Portland compiler
C | gcc | pgcc
C++ | g++ | pgCC
FORTRAN 77 | g77 | pgf77
FORTRAN 90/95 | | pgf90
61. Building Applications 3: Compilers
- All of these commands take the filename containing the code to be compiled as one argument, followed by numerous options.
- Example:
  pgcc myhelloworld.c -o hello
- Details of these options may be found through the UNIX man facility.
- To find details about the Portland f90 compiler use:
  man pgf90
62. Building Applications 4: Compiler Options

Option | Effect
-c | Compile, do not link.
-o exefile | Specifies a name for the resulting executable.
-g | Produce debugging information (no optimization).
-Mbounds | Check arrays for out-of-bounds access.
-fast | Full optimisation with function unrolling and code reordering.
63. Building Applications 5: Compiler Options

Option | Effect
-Mvect=sse2 | Turn on streaming SIMD extensions (SSE) and SSE2 instructions. SSE2 instructions operate on 64-bit floating point data.
-Mvect=prefetch | Generate prefetch instructions.
-tp k8-64 | Specify the target processor type as an Opteron processor running a 64-bit system.
-g77libs | Link-time option allowing object files generated by g77 to be linked into programs (n.b. may cause problems with parallel libraries).
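As an illustrative combination of the flags from the last two tables (source and executable names are assumptions):

  # Optimised 64-bit Opteron build with SSE2 vectorisation
  pgf90 -fast -Mvect=sse2 -tp k8-64 -o mycode mycode.f90

  # Debug build with array bounds checking
  pgf90 -g -Mbounds -o mycode_debug mycode.f90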
64. Building Applications 6: Sequential Fortran
- Assuming the Fortran 77 program source code is contained in the file mycode.f, compile using the Portland Group compiler by typing:
  pgf77 mycode.f
- In this case the code will be output into the file a.out. To run this code, issue ./a.out at the UNIX prompt.
- To add some optimization when using the Portland Group compiler, the -fast flag may be used. Also, -o may be used to specify the name of the compiled executable, i.e.
  pgf77 -o mycode -fast mycode.f
- The resulting executable will have the name mycode and will have been optimized by the compiler.
65. Building Applications 7: Sequential C
- Assuming the program source code is contained in the file mycode.c, compile using the Portland C compiler by typing:
  pgcc -o mycode mycode.c
- In this case the executable will be output into the file mycode, which can be run by typing its name at the command prompt: ./mycode
66. Memory Issues
- Programs using <2 GB require no modification
- Large memory is associated with the heap or data memory segment; if this exceeds 2 GB use the following compiler flags:
- C/C++ compilers:
  pgcc -mcmodel=medium
- Fortran compilers:
  pgf77/pgf90/pgf95 -mcmodel=medium
  g77 -mcmodel=medium
67. Setting available memory using ulimit
- ulimit provides control over available resources for processes
- ulimit -a : report all available resource limits
- ulimit -s XXXXX : set maximum stack size
- Sometimes it is necessary to set the hard limit, e.g.
  ulimit -sH XXXXXX
68. Useful Links for Memory Issues
- 64-bit programming memory issues:
  http://www.ualberta.ca/CNS/RESEARCH/LinuxClusters/64-bit.html
- Understanding Memory:
  http://www.ualberta.ca/CNS/RESEARCH/LinuxClusters/mem.html
69. Building Applications 8: Debugging
- The Portland Group debugger is a symbolic debugger for Fortran, C and C++ programs.
- It allows the control of program execution using breakpoints and single stepping, and enables the state of a program to be checked by examination of variables and memory locations.
70. Building Applications 9: Debugging
- The PGDBG debugger is invoked using the pgdbg command as follows:
  pgdbg arguments program arg1 arg2 ... argn
- arguments may be any of the pgdbg command-line arguments.
- program is the name of the target program being debugged.
- arg1, arg2, ... argn are the arguments to the program.
- To get help from pgdbg use:
  pgdbg -help
71. Building Applications 10: Debugging
- The PGDBG GUI is invoked by default using the command pgdbg.
- Note that in order to use the debugging tools, applications must be compiled with the -g switch, thus enabling the generation of symbolic debugger information.
72. Building Applications 11: Profiling
- The PGPROF profiler enables the profiling of single-process, multi-process MPI or SMP OpenMP programs, or programs compiled with the -Mconcur option.
- The generated profiling information enables the identification of the portions of the application that will benefit most from performance tuning.
- Profiling generally involves three stages:
  - compilation
  - execution
  - analysis (using the profiler)
73. Building Applications 12: Profiling
- To use profiling it is necessary to compile your program with the options indicated in the table below (an illustrative workflow follows the table):

Option | Effect
-Mprof=func | Insert calls to produce function-level pgprof output.
-Mprof=lines | Insert calls to produce line-level pgprof output.
-Mprof=mpi | Link in an MPI profile library that intercepts MPI calls to record message sizes and count message sends and receives, e.g. -Mprof=mpi,func.
-pg | Enable sample-based profiling.
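An illustrative three-stage profiling workflow using these flags (the program name is an assumption):

  # 1. Compile with function-level profiling instrumentation
  pgcc -fast -Mprof=func -o mycode mycode.c

  # 2. Run the program; this writes a pgprof.out data file
  ./mycode

  # 3. Analyse the results with the PGPROF profiler
  pgprof pgprof.out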
74. Building Applications 13: Profiling
- The PG profiler is executed using the command:
  pgprof options datafile
- datafile is a pgprof.out file generated from the program execution.
75. Shared Memory Applications 1: OpenMP
- Source code containing OpenMP compiler directives can be compiled on symmetric multiprocessor nodes
- On Iceberg:
  - 2 x dual core (AMD Opteron)
  - 2 x quad core (AMD Shanghai)
76. Shared Memory Applications 2: Compiling OpenMP
- SMP source code is compiled using the PGI compilers with the -mp option.
- To compile C, C++, Fortran77 or Fortran90 code, use the -mp flag as follows:
  pgf77 [compiler options] -mp filename
  pgf90 [compiler options] -mp filename
  pgcc [compiler options] -mp filename
  pgCC [compiler options] -mp filename
77. Shared Memory Applications 3: Simple OpenMP Makefile
- A simple OpenMP makefile:

  # C compiler and options
  CC = pgcc -fast -mp
  LIB = -lm

  # Object files
  OBJ = main.o \
        another.o

  # Compile
  myapp: $(OBJ)
          $(CC) -o $@ $(OBJ) $(LIB)
  .c.o:
          $(CC) -c $<

  # Clean out object files and the executable.
  clean:
          rm *.o myapp
78. Shared Memory Applications 4: Specifying the Required Number of Threads
- The number of parallel execution threads at execution time is controlled by setting the environment variable OMP_NUM_THREADS to the appropriate value.
- For the csh or tcsh shell this is set using:
  setenv OMP_NUM_THREADS 2
- For the sh or bash shell use:
  export OMP_NUM_THREADS=2
79. Shared Memory Applications 5: Starting an OpenMP Interactive Shell
- To start an interactive shell with NPROC processors enter:
  qsh -pe openmp NPROC -v OMP_NUM_THREADS=NPROC
- Note:
  - although the number of processors required is specified with the -pe option,
  - it is still necessary to ensure that the OMP_NUM_THREADS environment variable is set to the correct value.
80. Shared Memory Applications 6: Submitting an OpenMP Job to Sun Grid Engine
- The job is submitted to a special parallel environment that ensures the job occupies the required number of slots.
- Using the SGE command qsub, the openmp parallel environment is requested using the -pe option as follows:
  qsub -pe openmp 2 -v OMP_NUM_THREADS=2 myjobfile.sh
- The following job script, job.sh, is submitted using qsub job.sh, where job.sh is:

  #!/bin/sh
  #$ -cwd
  #$ -pe openmp 4
  #$ -v OMP_NUM_THREADS=4
  ./executable
81. Parallel Programming with MPI 1: Introduction
- Iceberg is designed with the aim of running MPI (Message Passing Interface) parallel jobs; the Sun Grid Engine is able to handle MPI jobs.
- In a message passing parallel program each process executes the same binary code but executes a different path through the code: this is SPMD (single program multiple data) execution.
- Iceberg uses the openmpi-ib and mvapich2-ib implementations provided over Infiniband (Quadrics/ConnectX), using the IB fast interconnect at 16 Gigabits/second.
82. Parallel Programming with MPI 2: Hello MPI World!

  #include <mpi.h>
  #include <stdio.h>

  int main(int argc, char **argv)
  {
      int rank;  /* my rank in MPI_COMM_WORLD */
      int size;  /* size of MPI_COMM_WORLD */

      /* Always initialise MPI by this call before using any MPI functions. */
      MPI_Init(&argc, &argv);
      /* Find out how many processes are taking part in the computations. */
      MPI_Comm_size(MPI_COMM_WORLD, &size);
      /* Get the rank of the current process */
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);

      if (rank == 0)
          printf("Hello MPI world from C!\n");
      printf("There are %d processes in my world, and I have rank %d\n", size, rank);

      MPI_Finalize();
  }
83. Parallel Programming with MPI 3: Output from Hello MPI World!
- When run on 4 processors the MPI Hello World program produces the following output:
  Hello MPI world from C!
  There are 4 processes in my world, and I have rank 2
  There are 4 processes in my world, and I have rank 0
  There are 4 processes in my world, and I have rank 3
  There are 4 processes in my world, and I have rank 1
84. Parallel Programming with MPI 4: Compiling MPI Applications Using Myrinet on V40s
- To compile C, C++, Fortran77 or Fortran90 MPI code using the Portland compiler, type:
  mpif77 [compiler options] filename
  mpif90 [compiler options] filename
  mpicc [compiler options] filename
  mpiCC [compiler options] filename
85. Parallel Programming with MPI 4: Compiling MPI Applications Using Gigabit Ethernet on X2200s
- To compile C, C++, Fortran77 or Fortran90 MPI code using the Portland compiler with OpenMPI, type:
  export MPI_HOME=/usr/local/packages5/openmpi-pgi/bin
  $MPI_HOME/mpif77 [compiler options] filename
  $MPI_HOME/mpif90 [compiler options] filename
  $MPI_HOME/mpicc [compiler options] filename
  $MPI_HOME/mpiCC [compiler options] filename
86. Parallel Programming with MPI 5: Simple Makefile for MPI
- MPI makefile for the intrompi examples:

  .SUFFIXES: .f90 .f .o

  # Comment out one of these lines using a # to select either the OpenMPI or mpich-gm compiler
  # MPI_HOME = /usr/local/mpich-gm2_PGI
  MPI_HOME = /usr/local/packages5/openmpi-pgi
  MPI_INCLUDE = $(MPI_HOME)/include

  # C compiler and options
  CC = $(MPI_HOME)/bin/mpicc
  CLINKER = $(CC)
  COPTFLAGS = -O -fast
  F90 = $(MPI_HOME)/bin/mpif90
  FLINKER = $(F90)
  FOPTFLAGS = -O3 -fast
  LINKER = $(CLINKER)
  OPTFLAGS = $(COPTFLAGS)

  # Object files
  OBJ = ex1.o \
        another.o
87. Parallel Programming with MPI 6: Submitting an MPI Job to Sun Grid Engine
- To submit an MPI job to Sun Grid Engine, use the openmpi-ib parallel environment; this ensures that the job occupies the required number of slots.
- Using the SGE command qsub, the openmpi-ib parallel environment is requested using the -pe option as follows:
  qsub -pe openmpi-ib 4 myjobfile.sh
88. Parallel Programming with MPI 7: Sun Grid Engine MPI Job Script
- The following job script, job.sh, is submitted using:
  qsub job.sh
- job.sh is:

  #!/bin/sh
  #$ -cwd
  #$ -pe openmpi-ib 4
  #$ -q parallel.q
  # SGE_HOME is used to locate the SGE MPI execution script
  #$ -v SGE_HOME=/usr/local/sge6_2
  /usr/mpi/pgi/openmpi-1.2.8/bin/mpirun ./mpiexecutable
89. Parallel Programming with MPI 9: Sun Grid Engine MPI Job Script
- Using this executable directly, the job is submitted using qsub in the same way but the script file job.sh is:

  #!/bin/sh
  #$ -cwd
  #$ -pe mvapich2-ib 4
  #$ -q parallel.q
  # MPIR_HOME is taken from the submitting environment
  #$ -v MPIR_HOME=/usr/mpi/pgi/mvapich2-1.2p1
  $MPIR_HOME/bin/mpirun_rsh -rsh -np 4 -hostfile $TMPDIR/machines ./mpiexecutable
90. Parallel Programming with MPI 10: Sun Grid Engine OpenMPI Job Script
- Using this executable directly, the job is submitted using qsub in the same way but the script file job.sh is:

  #!/bin/sh
  #$ -cwd
  #$ -pe ompigige 4
  #$ -q parallelx22.q
  # MPIR_HOME is taken from the submitting environment
  #$ -v MPIR_HOME=/usr/local/packages5/openmpi-pgi
  $MPIR_HOME/bin/mpirun -np 4 -machinefile $TMPDIR/machines ./mpiexecutable
91. Parallel Programming with MPI 10: Extra Notes
- The number of slots required and the parallel environment must be specified using -pe openmpi-ib NSLOTS
- The correct SGE queue set up for parallel jobs must be specified using -q parallel.q
- The job must be executed using the correct PGI/Intel/GNU implementation of mpirun. Note also:
  - The number of processors is specified using -np NSLOTS
  - Specify the location of the machine file used for your parallel job; this will be located in a temporary area on the node that SGE submits the job to.
92. Parallel Programming with MPI 10: Pros and Cons
- The downside to message passing codes is that they are harder to write than scalar or shared memory codes.
- The system bus on a modern CPU can pass in excess of 4 Gbit/s between the memory and the CPU.
- Fast Ethernet between PCs may only pass up to 200 Mbit/s between machines over a single Ethernet cable, and this can be a potential bottleneck when passing data between compute nodes.
- The solution to this problem for a high performance cluster such as Iceberg is to use a high performance network, such as the 16 Gbit/s interconnect provided by Infiniband.
- The availability of such high performance networking makes a scalable parallel machine possible.
93. Supported Parallel Applications on Iceberg
- Abaqus
- Fluent
- Matlab
- For information see the documentation at http://www.wrgrid.group.shef.ac.uk/forum/
- See the Iceberg users' forum for Research Computing
94. Getting help
- Web site: http://www.shef.ac.uk/wrgrid/
- Documentation: http://www.shef.ac.uk/wrgrid/documents
- Training (also uses the learning management system): http://www.shef.ac.uk/wrgrid/training
- Forum: http://www.wrgrid.group.shef.ac.uk/forum/index.php
- Contacts: http://www.shef.ac.uk/wrgrid/contacts.html
95. Tutorials
- On Iceberg, copy the contents of the tutorial directory to your user area into a directory named sge:
  cp -rp /usr/local/courses/sge sge
  cd sge
- In this directory the file readme.txt contains all the instructions necessary to perform the exercises.