High-Performance Parallel I/O Libraries - PowerPoint PPT Presentation

Slides: 9
Provided by: ArieSh
Transcript and Presenter's Notes


1
High-Performance Parallel I/O Libraries
  • (PI) Alok Choudhary, (Co-I) Wei-Keng Liao
  • Northwestern University
  • In Collaboration with the SEA Group
  • (Group Leader, Rob Ross, ANL)

2
Parallel NetCDF
  • NetCDF defines
  • A portable file format
  • A set of APIs for file access
  • Parallel netCDF
  • New APIs for parallel access
  • Maintaining the same file format
  • Tasks
  • Built on top of MPI for portability and high
    performance
  • Support C and Fortran interfaces
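
The idea of adding parallel access while keeping the file format unchanged can be sketched without MPI. The example below is only an illustration, not the PnetCDF API: the function names, the row-block partitioning, and the loop that stands in for MPI ranks are all assumptions. Each simulated rank writes its block of rows at its own byte offset, and the resulting file is byte-identical to one produced by a single writer.

```python
import os
import struct
import tempfile


def serial_write(path, rows):
    """One writer stores the whole 2D array row by row as big-endian float32."""
    with open(path, "wb") as f:
        for row in rows:
            f.write(struct.pack(f">{len(row)}f", *row))


def partitioned_write(path, rows, nprocs):
    """Simulated ranks each write a contiguous block of rows at their own offset.

    Assumption: the row count divides evenly among the ranks.
    """
    nrows, ncols = len(rows), len(rows[0])
    assert nrows % nprocs == 0
    per_rank = nrows // nprocs
    row_bytes = ncols * 4
    # Pre-size the shared file so any rank can write first.
    with open(path, "wb") as f:
        f.truncate(nrows * row_bytes)
    # Visit ranks in reverse order to show that write order does not matter.
    for rank in reversed(range(nprocs)):
        first = rank * per_rank
        with open(path, "r+b") as f:
            f.seek(first * row_bytes)
            for row in rows[first:first + per_rank]:
                f.write(struct.pack(f">{ncols}f", *row))


def compare_layouts(nrows, ncols, nprocs):
    """Return True if the serial and partitioned files are byte-identical."""
    rows = [[float(i * ncols + j) for j in range(ncols)] for i in range(nrows)]
    with tempfile.TemporaryDirectory() as tmp:
        a = os.path.join(tmp, "serial.bin")
        b = os.path.join(tmp, "partitioned.bin")
        serial_write(a, rows)
        partitioned_write(b, rows, nprocs)
        with open(a, "rb") as fa, open(b, "rb") as fb:
            return fa.read() == fb.read()


if __name__ == "__main__":
    print(compare_layouts(nrows=8, ncols=5, nprocs=4))
```

In a real MPI program the per-rank loop would be replaced by processes writing concurrently; the offset arithmetic is the part this sketch is meant to show.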

3
Parallel NetCDF - status
  • Version 1.0.1 was released on Dec. 7, 2005
  • Web page receives 200 page views a day
  • Supported platforms
  • Linux Cluster, IBM SP, BG/L, SGI Origin, Cray X,
    NEC SX
  • Two sets of parallel APIs
  • High-level APIs (mimicking the serial netCDF
    APIs)
  • Flexible APIs (to utilize MPI derived datatypes)
  • Support for large files (> 2 GB)
  • Test suites
  • Self-test codes ported from the Unidata netCDF
    package to validate against single-process
    results
  • New data analysis APIs
  • Basic statistical functions
  • min, max, mean, median, variance, deviation
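
One way such statistics could be computed across processes is to reduce small per-process summaries instead of gathering the data. The sketch below is an assumption about how this might be done, not the library's implementation: the names are hypothetical, the chunks stand in for per-process data, and population variance is assumed. The median is left out because it cannot be merged from fixed-size partial results in this way.

```python
import math
from functools import reduce


def local_summary(values):
    """Per-process partial result: count, min, max, mean, and sum of squared deviations."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values)
    return {"n": n, "min": min(values), "max": max(values), "mean": mean, "m2": m2}


def merge(a, b):
    """Combine two partial results using the pairwise update for mean and variance."""
    n = a["n"] + b["n"]
    delta = b["mean"] - a["mean"]
    return {
        "n": n,
        "min": min(a["min"], b["min"]),
        "max": max(a["max"], b["max"]),
        "mean": a["mean"] + delta * b["n"] / n,
        "m2": a["m2"] + b["m2"] + delta * delta * a["n"] * b["n"] / n,
    }


def global_stats(chunks):
    """Reduce the partial results of all chunks, as a reduction across ranks would."""
    total = reduce(merge, (local_summary(c) for c in chunks))
    variance = total["m2"] / total["n"]  # population variance (an assumption)
    return {
        "min": total["min"],
        "max": total["max"],
        "mean": total["mean"],
        "variance": variance,
        "deviation": math.sqrt(variance),
    }


if __name__ == "__main__":
    print(global_stats([[1.0, 2.0, 3.0], [4.0, 5.0], [6.0, 7.0, 8.0, 9.0]]))
```

Because `merge` is associative, the same summaries could be combined in any reduction order.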

4
Illustrative PnetCDF Users
  • FLASH: astrophysical thermonuclear application
    from the ASCI/Alliances center at the University
    of Chicago
  • ACTM: atmospheric chemical transport model, LLNL
  • WRF: Weather Research and Forecast modeling
    system, NCAR
  • WRF-ROMS: regional ocean model system I/O module
    from the Scientific Data Technologies group, NCSA
  • ASPECT: data understanding infrastructure, ORNL
  • pVTK: parallel visualization toolkit, ORNL
  • PETSc: portable, extensible toolkit for
    scientific computation, ANL
  • PRISM: PRogram for Integrated Earth System
    Modeling, users from C&C Research Laboratories,
    NEC Europe Ltd.
  • ESMF: Earth System Modeling Framework, National
    Center for Atmospheric Research
  • CMAQ: Community Multiscale Air Quality code I/O
    module, SNL
  • More

5
PnetCDF Future Work
  • Non-blocking I/O
  • Built on top of non-blocking MPI-IO
  • Improve data type conversion
  • Type conversion while packing non-contiguous
    buffers
  • Data analysis APIs
  • Statistical functions
  • Histogram functions
  • Range query: regional sum, min, max, mean, ...
  • Data transformation: DFT, FFT
  • Collaboration with application users
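
The item on type conversion while packing non-contiguous buffers can be read as fusing two passes into one. The sketch below takes that reading as an assumption: the interior of a grid with ghost cells (a non-contiguous region in memory) is copied into a contiguous buffer and converted to big-endian 32-bit floats in the same loop, rather than gathered first and converted afterwards. The function names and the ghost-cell layout are hypothetical.

```python
import struct


def pack_two_pass(grid, ghost):
    """Baseline: gather the interior into a temporary list, then convert it."""
    rows = range(ghost, len(grid) - ghost)
    cols = range(ghost, len(grid[0]) - ghost)
    interior = [grid[i][j] for i in rows for j in cols]   # pass 1: pack
    return struct.pack(f">{len(interior)}f", *interior)   # pass 2: convert


def pack_fused(grid, ghost):
    """Convert each value as it is packed, with no intermediate copy."""
    rows = range(ghost, len(grid) - ghost)
    cols = range(ghost, len(grid[0]) - ghost)
    out = bytearray(4 * len(rows) * len(cols))
    pos = 0
    for i in rows:
        for j in cols:
            # Native Python float (double) -> big-endian float32 in the output buffer.
            struct.pack_into(">f", out, pos, grid[i][j])
            pos += 4
    return bytes(out)


if __name__ == "__main__":
    # 4x4 grid with one layer of ghost cells; the interior is the 2x2 center.
    grid = [[float(10 * i + j) for j in range(4)] for i in range(4)]
    print(struct.unpack(">4f", pack_fused(grid, ghost=1)))
```

Both functions produce the same bytes; the fused version avoids the temporary buffer.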

6
MPI-IO Caching
  • Client-side file caching
  • Reduces client-server communication costs
  • Enables write-behind to better utilize network
    bandwidth
  • Avoids file system locking overhead by aligning
    I/O with file block size (or stripe size)
  • Prototype in ROMIO
  • Collaborative caching among the group of MPI
    processes
  • A complete caching subsystem in MPI library
  • Data consistency and cache coherence control
  • Distributed file locking
  • Memory management for data caching, eviction, and
    migration
  • Applicable for both MPI collective and
    independent I/O
  • Two implementations
  • Creating an I/O thread in each MPI process
  • Using the MPI RMA (remote memory access) facility
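
The two central ideas on this slide, block-aligned I/O and write-behind, can be sketched in a few lines. This is a minimal single-process illustration under stated assumptions, not the ROMIO prototype: the class and function names are hypothetical, a `bytearray` stands in for the file, and the round-robin page ownership in `owner_of` is an assumed policy that the slides do not specify.

```python
def split_into_pages(offset, length, page_size):
    """Split a byte range into (page_id, start_in_page, nbytes) pieces.

    No piece crosses a page boundary, so every access is aligned to a page.
    """
    pieces = []
    end = offset + length
    while offset < end:
        page_id = offset // page_size
        start = offset % page_size
        nbytes = min(page_size - start, end - offset)
        pieces.append((page_id, start, nbytes))
        offset += nbytes
    return pieces


def owner_of(page_id, nprocs):
    """Assumed policy: pages are assigned to ranks round-robin."""
    return page_id % nprocs


class WriteBehindCache:
    """Buffers writes in whole pages and defers them until flush().

    Assumption: len(backing) is a multiple of page_size and covers all writes.
    """

    def __init__(self, page_size, backing):
        self.page_size = page_size
        self.backing = backing      # bytearray standing in for the file
        self.dirty = {}             # page_id -> cached page contents
        self.flushes = 0            # number of page-sized writes issued

    def write(self, offset, data):
        pos = 0
        for page_id, start, nbytes in split_into_pages(offset, len(data), self.page_size):
            page = self.dirty.get(page_id)
            if page is None:
                base = page_id * self.page_size
                page = bytearray(self.backing[base:base + self.page_size])
                self.dirty[page_id] = page
            page[start:start + nbytes] = data[pos:pos + nbytes]
            pos += nbytes

    def flush(self):
        """Write each dirty page back once, as a single aligned request."""
        for page_id in sorted(self.dirty):
            base = page_id * self.page_size
            self.backing[base:base + self.page_size] = self.dirty[page_id]
            self.flushes += 1
        self.dirty.clear()


if __name__ == "__main__":
    f = bytearray(16)
    cache = WriteBehindCache(page_size=4, backing=f)
    cache.write(2, b"abcdef")
    cache.write(8, b"gh")
    cache.flush()
    print(bytes(f), cache.flushes)
```

Several small writes to the same page cost one aligned write at flush time, which is where the saving in client-server traffic would come from.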

7
FLASH - I/O Benchmark
  • The I/O kernel of the FLASH application, a
    block-structured adaptive mesh hydrodynamics code
  • Each process writes 80 cubes
  • I/O through HDF5
  • Write-only operations
  • The improvement is due to write-behind

np    16x16x16    32x32x32
16    1.15 GB     9.13 GB
32    2.30 GB     18.26 GB
64    4.60 GB     36.53 GB

8
BTIO Benchmark
  • Block tri-diagonal array partitioning
  • 40 MPI collective writes followed by 40
    collective reads