1
Independent Study Midterm Progress Report
  • Adam Jacobs
  • October 31, 2005

2
Outline
  • Motivation
  • Original TRL5 Plans
  • FEMPI Software Architecture
  • Implemented Functionality
  • Upcoming Functions

3
Motivations
  • MPI functionality required for HPC space
    applications
  • De-facto standard parallel programming model in
    HPC
  • Fault-tolerant extensions for HPEC space systems
  • MPI is inherently fault-intolerant (an original
    design choice)
  • Existing HPC tools for MPI and fault-tolerant MPI
  • Good basis for ideas, API standards, etc.
  • Not readily amenable to HPEC platforms
  • Focus on lightweight fault-tolerant MPI for HPEC
    (FEMPI: Fault-Tolerant Embedded Message Passing
    Interface)
  • Leverage prior work throughout HPC community
  • Leverage prior work at UF on HPC with MPI

4
Original TRL-5 Plans
  • Focus first on basic MPI and FT-MPI functionality
  • TRL-5 baseline: 12 functions (4 setup, 3
    point-to-point, 5 collective)
  • Adding additional datatype functions
  • Focus on blocking and synchronous communications
    (dominant mode in MPI applications)
  • Explore and develop FT extensions to baseline
    functions using one or more FT modes
  • Baseline implementation approach: extensions to
    Self-Reliant and FTMS functions
  • Goal is both fault tolerance and performance;
    where they conflict, fault tolerance comes first

5
Baseline MPI Functions for TRL5
Reference: J. Kohout and A. George, "A
High-Performance Communication Service for
Parallel Computing on Distributed DSP Systems,"
Parallel Computing, Vol. 29, No. 7, July 2003,
pp. 851-878.
6
Additional Functions
  • Datatype functions are needed in order to
    implement LU decomposition using the HPL
    benchmark
  • These functions do not transfer data over the
    network
  • Incorporating FT consists of checkpointing the
    datatype information (sketched below)
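To make the datatype point concrete, here is a minimal sketch in standard MPI terms (not FEMPI source): it builds a column-block type of the kind an HPL panel uses with MPI_Type_vector, and records the construction parameters in a small struct so that checkpointing the datatype reduces to saving those parameters. The struct and helper names are assumptions for illustration.

    #include <mpi.h>

    /* Hypothetical checkpoint record for a derived datatype: since
     * datatype calls move no data over the network, saving the
     * construction parameters is enough to rebuild the type after a
     * recovery.  Base type assumed to be MPI_DOUBLE. */
    struct dtype_ckpt {
        int count;          /* number of blocks (columns)      */
        int blocklength;    /* elements per block (rows)       */
        int stride;         /* leading dimension of the matrix */
    };

    /* Build a column-block type for a rows x cols panel stored
     * column-major with leading dimension ld, and record how it was
     * built for later checkpointing. */
    static MPI_Datatype make_panel_type(int rows, int cols, int ld,
                                        struct dtype_ckpt *ck)
    {
        MPI_Datatype panel;
        MPI_Type_vector(cols, rows, ld, MPI_DOUBLE, &panel);
        MPI_Type_commit(&panel);

        ck->count = cols;
        ck->blocklength = rows;
        ck->stride = ld;
        return panel;
    }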

7
Software Architecture
  • Low-level communication is provided through FEMPI
    using Self-Reliant's DMS
  • Heartbeating via SR and a process notification
    extension to the SRP enables FEMPI fault
    detection
  • Application and FEMPI checkpointing make use of
    existing checkpointing libraries; checkpoint
    communication uses DMS
  • MPI Restore process on the System Controller is
    in charge of recovery decisions based on
    application policies (see the sketch below)
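As a purely illustrative sketch of "recovery decisions based on application policies" (the policy names and function below are assumptions, not FEMPI's actual API), MPI Restore could map a failure notification to an action roughly as follows:

    /* Hypothetical policy table; FEMPI's actual policies may differ. */
    enum recovery_policy { POLICY_ABORT, POLICY_IGNORE, POLICY_RESPAWN };
    enum recovery_action { ACTION_ABORT_JOB,
                           ACTION_CONTINUE_DEGRADED,
                           ACTION_RESTART_FROM_CHECKPOINT };

    /* Map a failed rank and the application's policy to an action. */
    static enum recovery_action decide_recovery(enum recovery_policy p,
                                                int failed_rank)
    {
        (void)failed_rank;  /* a richer policy could use the rank */
        switch (p) {
        case POLICY_IGNORE:  return ACTION_CONTINUE_DEGRADED;
        case POLICY_RESPAWN: return ACTION_RESTART_FROM_CHECKPOINT;
        default:             return ACTION_ABORT_JOB;
        }
    }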

8
FEMPI DMS Interactions
  • Connection Setup
    • Performed during MPI_Init
    • Each node registers as subscriber and publisher
      and then waits for a message indicating that all
      other nodes have reached the same point
  • 4-Transaction Messaging Scheme
    • Uses DMS publish/subscribe calls
    • Currently using the advanced API
    • Only the send/receive nodes access the
      channel/family/type
    • Use separate types for each receiver
    • Keep channels from different applications
      separate
    • Additionally, separate types for header and data
      (see the naming sketch after this list)
  • Close Connection
    • Close the connections generated by the MPI call
    • Free memory associated with SR
    • Performed during MPI_Finalize
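A minimal sketch of one way the separation described above could be encoded (the naming scheme and helpers are assumptions for illustration, not the FEMPI implementation): one DMS channel per application, and a distinct message type per receiver, with header and data kept apart.

    #include <stdio.h>

    #define KIND_HEADER 0
    #define KIND_DATA   1

    /* One channel per application keeps different jobs separate. */
    static void channel_name(char *buf, size_t len, int job_id)
    {
        snprintf(buf, len, "fempi_app_%d", job_id);
    }

    /* One message type per (receiver rank, header-or-data) pair, so
     * only the intended receiver subscribes to a given type. */
    static int message_type(int receiver_rank, int kind)
    {
        return receiver_rank * 2 + kind;
    }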

9
4-Transaction Messaging
  • Four-transaction messaging is currently used to
    implement MPI_Send and MPI_Recv (a sender-side
    sketch follows this list)
  • Transmit Header Information
    • Contains information about the size of the main
      message and the message datatype
  • Acknowledge Header
    • During MPI_Send, an error handler is called if
      the acknowledgement is not received within a
      specified time period
    • During MPI_Recv, an error handler is called if
      the header is not received within a specified
      time period
  • Transmit Main Data
    • Sent using srDmsPublishMsg
  • Acknowledge Data
    • Similar to the header acknowledgement
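The sender side of the four transactions can be sketched as below; publish_msg() and wait_for_ack() are hypothetical stand-ins (FEMPI publishes the data with srDmsPublishMsg, whose exact signature is not shown here), and the stubs exist only so the sketch compiles.

    #include <stddef.h>

    struct msg_header {
        size_t size;     /* byte count of the main message      */
        int datatype;    /* encoded MPI datatype of the payload */
    };

    /* Stubs standing in for the DMS publish call and the wait for an
     * acknowledgement; real versions would go through SelfReliant. */
    static int publish_msg(int dest, int kind, const void *buf, size_t len)
    { (void)dest; (void)kind; (void)buf; (void)len; return 0; }
    static int wait_for_ack(int dest, int kind, int timeout_ms)
    { (void)dest; (void)kind; (void)timeout_ms; return 0; }

    /* Sender side of the four transactions used by MPI_Send. */
    static int fempi_send_sketch(int dest, const void *buf, size_t len,
                                 int datatype, int timeout_ms)
    {
        struct msg_header hdr = { len, datatype };

        /* 1. Transmit header information (size and datatype). */
        publish_msg(dest, 0 /* header */, &hdr, sizeof hdr);

        /* 2. Acknowledge header: on timeout the caller invokes the
         *    FEMPI error handler. */
        if (wait_for_ack(dest, 0, timeout_ms) != 0)
            return -1;

        /* 3. Transmit main data (srDmsPublishMsg in FEMPI). */
        publish_msg(dest, 1 /* data */, buf, len);

        /* 4. Acknowledge data, same timeout handling as the header. */
        if (wait_for_ack(dest, 1, timeout_ms) != 0)
            return -1;

        return 0;
    }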

10
Implementation Issues
  • Race conditions / deadlock must be avoided
  • Currently, MPI_Recv will only accept messages
    from the node specified in the function call
    • Otherwise, the message is discarded and the
      sender must re-transmit
    • Another implementation could store the message
      in a queue for retrieval at a later time, for
      better performance
  • Incoming messages trigger callback functions
    through SelfReliant
    • The callback runs in a separate thread, so
      coordination between the threads must be
      considered
    • The callback function should be generic so that
      many functions can use the same callback (see
      the sketch below)
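One way to handle both the thread coordination and the generic-callback requirement (an illustrative sketch, not FEMPI code; the queue and function names are assumptions) is to have the SR-invoked callback do nothing but enqueue the message under a lock, while the receive path waits for a message from its expected source:

    #include <pthread.h>
    #include <stdlib.h>
    #include <string.h>

    struct incoming {
        int source;
        void *data;
        size_t len;
        struct incoming *next;
    };

    static struct incoming *queue_head;
    static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  queue_cond = PTHREAD_COND_INITIALIZER;

    /* Generic callback: any FEMPI call can reuse it because it only
     * copies the message and queues it; it runs in the SR thread. */
    static void on_message(int source, const void *buf, size_t len)
    {
        struct incoming *m = malloc(sizeof *m);
        m->source = source;
        m->len = len;
        m->data = malloc(len);
        memcpy(m->data, buf, len);

        pthread_mutex_lock(&queue_lock);
        m->next = queue_head;
        queue_head = m;
        pthread_cond_broadcast(&queue_cond);
        pthread_mutex_unlock(&queue_lock);
    }

    /* Receive path (application thread): block until a message from
     * the requested source is available, as MPI_Recv requires. */
    static struct incoming *take_from(int source)
    {
        pthread_mutex_lock(&queue_lock);
        for (;;) {
            struct incoming **pp = &queue_head;
            while (*pp && (*pp)->source != source)
                pp = &(*pp)->next;
            if (*pp) {
                struct incoming *m = *pp;
                *pp = m->next;
                pthread_mutex_unlock(&queue_lock);
                return m;   /* caller frees m->data and m */
            }
            pthread_cond_wait(&queue_cond, &queue_lock);
        }
    }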

11
Remaining Functions
12
Timeline for FEMPI
  • September: Design phase
    • Design of the basic MPI calls, including MPI_Init,
      MPI_Comm_rank, MPI_Comm_size, MPI_Send, MPI_Recv,
      and MPI_Finalize
  • October-November
    • Develop the basic MPI calls listed above
      (October 28)
    • Status: basic calls are ready to test; in addition,
      MPI_Barrier has been developed and MPI_Sendrecv is
      in progress
    • Test the calls on x86, followed by the original
      testbed (November 4)
    • Demonstrate the above functions with sample
      applications (November 11)
    • Design and develop a few other MPI calls, including
      MPI_Sendrecv, MPI_Barrier, and MPI_Bcast
      (November 20)
    • Test the calls on x86, followed by the original
      testbed (November 25)

13
Timeline for FEMPI
  • December
    • Test this version of FEMPI with some real
      applications, including FFTW (December 20)
  • January-February
    • Design and develop the remaining baseline MPI
      calls, including MPI_Scatter, MPI_Gather, and
      MPI_Allgather (January 27)
    • Test the calls on x86, followed by the original
      testbed (February 3)
    • Demonstrate the baseline version of FEMPI with
      complex applications (February 24)
  • March
    • Study the performance of FEMPI (March 24)