Title: On evaluating GPFS
1 On evaluating GPFS
Research work that has been done at HLRS
by Alejandro Calderon
2 On evaluating GPFS
- Short description
- Metadata evaluation
  - fdtree
- Bandwidth evaluation
  - Bonnie
  - Iozone
  - IODD
  - IOP
3 GPFS description
http://www.ncsa.uiuc.edu/UserInfo/Data/filesystems/index.html
- General Parallel File System (GPFS) is a parallel file system package developed by IBM.
- History
  - Originally developed for IBM's AIX operating system, then ported to Linux systems.
- Features
  - Appears to work just like a traditional UNIX file system from the user application level.
  - Provides additional functionality and enhanced performance when accessed via parallel interfaces such as MPI-IO.
  - High performance is obtained by striping data across multiple nodes and disks.
  - Striping is performed automatically at the block level; therefore, all files larger than the designated block size will be striped.
  - Can be deployed in NSD or SAN configurations.
  - Clusters hosting a GPFS file system can allow other clusters at different geographical locations to mount that file system.
4 GPFS (Simple NSD Configuration)
5 GPFS evaluation (metadata)
- fdtree
  - Used for testing the metadata performance of a file system (a sketch of this kind of workload follows this list)
  - Creates several directories and files across several levels
- Used on
  - Computers
    - noco-xyz
  - Storage systems
    - Local, GPFS
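fdtree itself is a shell script; the following is only a minimal C sketch of the kind of workload it generates (nested directories, each holding a few small files). The depth, fan-out, file count and the /gpfs path are illustrative assumptions, not the benchmark's actual parameters.

/*
 * Minimal sketch of an fdtree-like metadata workload (not the real fdtree
 * script): create a tree of directories, each containing a few small files.
 * Depth, fan-out, file count and the target path are placeholders.
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

#define DIRS_PER_LEVEL  5      /* like -d */
#define FILES_PER_DIR   3      /* like -f */
#define LEVELS          3      /* like -l */
#define FILE_BYTES      4096   /* small files stress metadata, not bandwidth */

static void make_tree(const char *base, int level)
{
    char path[4096], buf[FILE_BYTES];

    memset(buf, 'x', sizeof(buf));

    /* create the small files of this directory */
    for (int f = 0; f < FILES_PER_DIR; f++) {
        snprintf(path, sizeof(path), "%s/file_%d", base, f);
        int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
        if (fd >= 0) {
            write(fd, buf, sizeof(buf));
            close(fd);
        }
    }

    if (level == 0)
        return;

    /* recurse into the subdirectories of this level */
    for (int d = 0; d < DIRS_PER_LEVEL; d++) {
        snprintf(path, sizeof(path), "%s/dir_%d", base, d);
        if (mkdir(path, 0755) == 0)
            make_tree(path, level - 1);
    }
}

int main(void)
{
    const char *base = "/gpfs/scratch/fdtree_like";   /* placeholder path */

    mkdir(base, 0755);
    make_tree(base, LEVELS);
    return 0;
}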
6 fdtree: local, NFS, GPFS
7 fdtree on GPFS (Scenario 1)
ssh x,... fdtree.bash -f 3 -d 5 -o /gpfs...
[Diagram: each node x runs processes P1 .. Pm, each working in its own subtree]
- Scenario 1
  - several nodes,
  - several processes per node,
  - different subtrees,
  - many small files
8 fdtree on GPFS (Scenario 1)
9 fdtree on GPFS (Scenario 2)
ssh x,... fdtree.bash -l 1 -d 1 -f 1000 -s 500 -o /gpfs...
[Diagram: each node x runs one process (P1 .. Px), all working in the same subtree]
- Scenario 2
  - several nodes,
  - one process per node,
  - same subtree,
  - many small files
10 fdtree on GPFS (Scenario 2)
11 Metadata cache on GPFS client
- Working in a GPFS directory with 894 entries
- ls -als needs to fetch each file's attributes from the GPFS metadata server
- Within a couple of seconds, the contents of the cache seem to disappear (a small timing sketch follows the transcript below)
hpc13782 noco186.nec 304 time ls -als | wc -l
894
real    0m0.466s
user    0m0.010s
sys     0m0.052s

hpc13782 noco186.nec 305 time ls -als | wc -l
894
real    0m0.222s
user    0m0.011s
sys     0m0.064s

hpc13782 noco186.nec 306 time ls -als | wc -l
894
real    0m0.033s
user    0m0.009s
sys     0m0.025s

hpc13782 noco186.nec 307 time ls -als | wc -l
894
real    0m0.034s
user    0m0.010s
sys     0m0.024s
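The cache effect can also be observed programmatically. The sketch below is not part of the original evaluation; it simply times repeated stat() passes over a directory, which is roughly the work `ls -als` does per entry, with a short pause between passes. The directory path is a placeholder.

/*
 * Sketch (not from the original study): time several stat() passes over a
 * directory to observe the effect of the GPFS client metadata cache.
 */
#include <dirent.h>
#include <stdio.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

static double stat_pass(const char *dir)
{
    struct timespec t0, t1;
    struct dirent *de;
    struct stat st;
    char path[4096];
    DIR *dp = opendir(dir);

    if (!dp)
        return -1.0;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    while ((de = readdir(dp)) != NULL) {
        snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);
        stat(path, &st);                  /* forces an attribute lookup */
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    closedir(dp);

    return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}

int main(void)
{
    const char *dir = "/gpfs/scratch/manyfiles";      /* placeholder */

    for (int i = 0; i < 4; i++) {
        printf("pass %d: %.3f s\n", i, stat_pass(dir));
        sleep(2);    /* wait a couple of seconds between passes */
    }
    return 0;
}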
12 fdtree results
- Main conclusions
  - Contention at directory level
  - If two or more processes from a parallel application need to write data, make sure each one uses a different subdirectory of the GPFS workspace (see the sketch below)
  - Better results than NFS (but lower than the local file system)
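A minimal MPI sketch of the subdirectory recommendation (the workspace path and file names are hypothetical): each rank creates and uses its own subdirectory, so concurrent file creations never touch the same parent directory.

/*
 * Sketch of the recommendation above (hypothetical paths): each MPI rank
 * works inside its own subdirectory of the GPFS workspace, so concurrent
 * file creations never hit the same parent directory.
 */
#include <fcntl.h>
#include <mpi.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    int rank;
    char dir[256], file[512];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* one private subdirectory per rank under the shared workspace */
    snprintf(dir, sizeof(dir), "/gpfs/workspace/rank_%d", rank);
    mkdir(dir, 0755);

    /* all of this rank's files go below its own subdirectory */
    snprintf(file, sizeof(file), "%s/output.dat", dir);
    int fd = open(file, O_CREAT | O_WRONLY | O_TRUNC, 0644);
    if (fd >= 0) {
        const char msg[] = "per-rank data\n";
        write(fd, msg, sizeof(msg) - 1);
        close(fd);
    }

    MPI_Finalize();
    return 0;
}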
13 GPFS performance (bandwidth)
- Bonnie
  - Read and write a 2 GB file
  - Write, rewrite and read
- Used on
  - Computers
    - Cacau1
    - Noco075
  - Storage systems
    - GPFS
14 Bonnie on GPFS: write, re-write
GPFS over NFS
15 Bonnie on GPFS: read
GPFS over NFS
16 GPFS performance (bandwidth)
- Iozone
  - Write and read with several file sizes and access sizes (see the sketch after this list)
  - Write and read bandwidth
- Used on
  - Computers
    - Noco075
  - Storage systems
    - GPFS
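Iozone's command line is not reproduced in the slides; the sketch below only illustrates the idea behind the test: write a file of fixed size with several access (record) sizes and report the bandwidth obtained for each. The file size, record sizes and path are placeholder values.

/*
 * Sketch of the Iozone-style measurement (not Iozone itself): write a file
 * of fixed size with several record sizes and report the write bandwidth
 * for each.
 */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

#define FILE_SIZE (256L * 1024 * 1024)   /* 256 MB per run (placeholder) */

static double write_run(const char *path, size_t record)
{
    struct timespec t0, t1;
    char *buf = malloc(record);
    int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);

    if (!buf || fd < 0)
        return -1.0;
    memset(buf, 'a', record);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long written = 0; written < FILE_SIZE; written += record)
        write(fd, buf, record);
    fsync(fd);                            /* include flush time in the run */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    close(fd);
    free(buf);
    double s = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    return (FILE_SIZE / (1024.0 * 1024.0)) / s;       /* MB/s */
}

int main(void)
{
    size_t records[] = { 64 * 1024, 128 * 1024, 256 * 1024, 1024 * 1024 };

    for (unsigned i = 0; i < sizeof(records) / sizeof(records[0]); i++)
        printf("record %6zu KB: %.1f MB/s\n", records[i] / 1024,
               write_run("/gpfs/scratch/iozone_like.dat", records[i]));
    return 0;
}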
17 Iozone on GPFS: write
18 Iozone on GPFS: read
19 GPFS evaluation (bandwidth)
- IODD
  - Evaluation of disk performance using several nodes (disk and networking)
  - A dd-like command that can be run from MPI
- Used on
  - 2 and 4 nodes; 4, 8, 16, and 32 processes (1, 2, 3, and 4 per node) that write a file of 1, 2, 4, 8, 16, and 32 GB
  - Using both the POSIX interface and the MPI-IO interface (a sketch of both interfaces follows the next slide)
20 How IODD works
[Diagram: on each node x, processes P1, P2, ..., Pm each write a sequence of blocks a, b, .., n]
- node x: 2, 4 nodes
- process m: 4, 8, 16, and 32 processes (1, 2, 3, 4 per node)
- file size n: 1, 2, 4, 8, 16 and 32 GB
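IODD itself is not publicly documented in these slides, so the following is only a guess at the access pattern described above: each MPI process writes its own sequence of blocks, once through the POSIX interface (pwrite) and once through MPI-IO (MPI_File_write_at), so the two APIs can be compared. Block size, block count, file layout and paths are assumptions.

/*
 * Sketch of a dd-like parallel write in the spirit of IODD (the real tool is
 * not shown in the slides): each MPI process writes its own blocks, once
 * through POSIX and once through MPI-IO.
 */
#include <fcntl.h>
#include <mpi.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK   (1024 * 1024)   /* 1 MB per write */
#define NBLOCKS 64              /* 64 MB per process (placeholder) */

int main(int argc, char **argv)
{
    int rank;
    char *buf = malloc(BLOCK);

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    memset(buf, 'a' + rank % 26, BLOCK);

    /* POSIX variant: pwrite() at this rank's offsets in the shared file */
    int fd = open("/gpfs/scratch/iodd_posix.dat", O_CREAT | O_WRONLY, 0644);
    for (int b = 0; b < NBLOCKS; b++) {
        off_t off = ((off_t)rank * NBLOCKS + b) * BLOCK;
        pwrite(fd, buf, BLOCK, off);
    }
    close(fd);

    /* MPI-IO variant: the same pattern through MPI_File_write_at() */
    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "/gpfs/scratch/iodd_mpiio.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    for (int b = 0; b < NBLOCKS; b++) {
        MPI_Offset off = ((MPI_Offset)rank * NBLOCKS + b) * BLOCK;
        MPI_File_write_at(fh, off, buf, BLOCK, MPI_BYTE, MPI_STATUS_IGNORE);
    }
    MPI_File_close(&fh);

    free(buf);
    MPI_Finalize();
    return 0;
}

Compiled with mpicc and launched with mpirun, this reproduces the "m processes per node, blocks a .. n per process" layout of the diagram, under the stated assumptions.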
21 IODD on 2 nodes (MPI-IO)
22 IODD on 4 nodes (MPI-IO)
23 Differences when using different APIs
GPFS (2 nodes, MPI-IO)
GPFS (2 nodes, POSIX)
24 IODD on 2 GB (MPI-IO, directory)
25 IODD on 2 GB (MPI-IO, ? directory)
26 IODD results
- Main conclusions
  - The bandwidth decreases with the number of processes per node
  - Beware of multithreaded applications with medium-to-high I/O bandwidth requirements per thread
  - It is very important to use MPI-IO, because this API lets users get more bandwidth
  - The bandwidth also decreases with more than 4 nodes
  - With large files, metadata management does not seem to be the main bottleneck
27 GPFS evaluation (bandwidth)
- IOP
  - Measures the bandwidth obtained by writing and reading in parallel from several processes
  - The file size is divided by the number of processes, so each process works on an independent part of the file
- Used on
  - GPFS through MPI-IO (ROMIO on Open MPI)
  - Two nodes writing a 2 GB file in parallel
    - On independent files (non-shared)
    - On the same file (shared); both patterns are sketched after the next slide
28 How IOP works
[Diagram: file per process (non-shared): each process P1 .. Pm writes its own file of blocks a, b, .., x;
segmented access (shared): each process writes its blocks a, b, .., x into its own contiguous segment of a single file of size n]
- 2 nodes
- m: 2 processes (1 per node)
- n: 2 GB file size
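A minimal MPI-IO sketch of the two access patterns above (file names and sizes are assumptions, not IOP's actual parameters): in the non-shared case each process opens its own file on MPI_COMM_SELF; in the shared, segmented case all processes open one file and each writes its own contiguous segment.

/*
 * Sketch of the two IOP access patterns (the real IOP tool is not shown in
 * the slides): file-per-process (non-shared) versus segmented access to a
 * single shared file.
 */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define SEGMENT (64L * 1024 * 1024)   /* bytes written by each process */
#define CHUNK   (1024 * 1024)         /* size of each individual write */

static void write_chunks(MPI_File fh, MPI_Offset base, char *buf)
{
    for (MPI_Offset off = 0; off < SEGMENT; off += CHUNK)
        MPI_File_write_at(fh, base + off, buf, CHUNK, MPI_BYTE,
                          MPI_STATUS_IGNORE);
}

int main(int argc, char **argv)
{
    int rank;
    char name[256], *buf = malloc(CHUNK);
    MPI_File fh;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    memset(buf, 'a' + rank % 26, CHUNK);

    /* Non-shared: each process writes its own independent file. */
    snprintf(name, sizeof(name), "/gpfs/scratch/iop_rank%d.dat", rank);
    MPI_File_open(MPI_COMM_SELF, name,
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    write_chunks(fh, 0, buf);
    MPI_File_close(&fh);

    /* Shared, segmented: one file, each process owns a contiguous segment. */
    MPI_File_open(MPI_COMM_WORLD, "/gpfs/scratch/iop_shared.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    write_chunks(fh, (MPI_Offset)rank * SEGMENT, buf);
    MPI_File_close(&fh);

    free(buf);
    MPI_Finalize();
    return 0;
}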
29 IOP: differences when using shared/non-shared
30 IOP: differences when using shared/non-shared
31 GPFS writing in non-shared files vs. GPFS writing in a shared file
32 GPFS writing in a shared file: the 128 KB magic number
33 IOP results
- Main conclusions
  - If several processes write to the same file, even in independent areas, the performance decreases
  - With several independent files, results are similar across tests; with a shared file they are more irregular
  - A magic number of 128 KB appears: it seems that at that point the internal algorithm changes and the bandwidth increases