The component architecture of Open MPI: enabling Third-party collective algorithms

1
The component architecture of Open MPI: enabling Third-party collective algorithms
  • Aug 27, 2005
  • Sogang University
  • Distributed Computing Communication Laboratory
  • Eunseok Kim

2
Outline
  • Introduction
  • Adding new algorithms to an MPI implementation
  • Common Interface Approach
  • Component-based Approach
  • What is Open MPI?
  • Design goals
  • Architecture
  • Collective component in Open MPI
  • Example Collective Components
  • Conclusion

3
Introduction
  • Challenges
  • Trend: an increasing number of processors
  • Clusters: a widely used architecture
  • Scalability issues
  • Process control
  • Resource exhaustion
  • Latency awareness and management
  • Optimized collectives
  • Fault tolerance
  • Network transmission errors
  • Process fault tolerance
  • These challenges must be solved by developing new algorithms
  • But this is hard to do

4
Adding new algorithms to an MPI implementation
  • Common Interface Approach
  • Component-based Approach

5
Common Interface Approach
  • Use the MPI profiling layer
  • Allowing third-party libraries
  • Without access to source code
  • Automatically use new routine without
    modification
  • But allows only one version of overloaded
    function
  • Linker semantic
  • Edit an Existing MPI Implementation
  • Needs source code and license
  • Unmodified MPI application
  • But hard to modify
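
A minimal sketch of the profiling-layer trick referenced above, assuming only the standard PMPI_ entry points that every conforming MPI implementation provides:

    #include <mpi.h>

    /* A third-party library supplies its own MPI_Barrier; existing applications
       pick it up at link time without source changes, and the library can fall
       back to the underlying implementation through the PMPI_ profiling entry
       point.  Only one such interposed MPI_Barrier can be linked at a time. */
    int MPI_Barrier(MPI_Comm comm)
    {
        /* ... run a replacement barrier algorithm here, or simply ... */
        return PMPI_Barrier(comm);   /* delegate to the real implementation */
    }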

6
Common Interface Approach
  • Create a new MPI implementation
  • Complete control over the entire MPI implementation
  • Ex) PACX-MPI enables MPI use in a meta-computing
    environment
  • But extremely hard to do
  • Use alternate function names
  • The simplest way
  • Ex) New_Barrier instead of MPI_Barrier
  • Application must be
  • Modified
  • Can be mitigated by preprocessor macros (see the sketch
    after this list)
  • Recompiled
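
A minimal sketch of the alternate-function-name approach, where New_Barrier is the hypothetical third-party routine from the example above; the macro spares source edits at every call site, but the application must still be recompiled:

    #include <mpi.h>

    int New_Barrier(MPI_Comm comm);            /* hypothetical third-party collective */

    /* after this point, unmodified calls to MPI_Barrier use the new routine */
    #define MPI_Barrier(comm) New_Barrier(comm)

    void exchange_phase(MPI_Comm comm)
    {
        MPI_Barrier(comm);                     /* expands to New_Barrier(comm) */
    }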

7
Component-based Approach
  • Component
  • A set of top-level routines
  • The component-based approach solves many of the problems
    that arise with the common interface approach
  • Open MPI takes this approach

8
What is Open MPI?
  • A production-quality MPI implementation
  • Makes it easy to develop new algorithms
  • Enables run-time composition of independent software components

9
Design goals
  • Full MPI-2 standard conformance
  • High performance
  • Fault tolerance (optional)
  • Thread safety and concurrency (MPI_THREAD_MULTIPLE);
    see the sketch after this list
  • Based on a component architecture
  • Flexible run-time environment
  • Portable
  • Maintainable
  • Production quality
  • A single library supports all networks
  • Support for multiple networks
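
A small illustration of the MPI_THREAD_MULTIPLE goal above, using only standard MPI-2 calls: the application requests full multi-threaded support at startup and checks the level actually provided.

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int provided;

        /* ask for the highest thread-support level defined by MPI-2 */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
        if (provided < MPI_THREAD_MULTIPLE)
            fprintf(stderr, "full thread support not available (level %d)\n", provided);

        /* ... with MPI_THREAD_MULTIPLE, any thread may make MPI calls ... */

        MPI_Finalize();
        return 0;
    }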

10
MPI implementation overview
[Layer diagram: User application → MPI API → MPI implementation internals]
11
Architecture of Open MPI
[Layer diagram: User application → MPI API → MPI Component Architecture (MCA)]
12
Architecture of Open MPI
  • MCA (MPI Component Architecture)
  • The backbone component
  • Provides management services
  • Passes parameters to components
  • Finds and invokes components
  • Component frameworks
  • Each major functional area has a corresponding
    back-end component framework
  • Which manages modules
  • Discovers, loads, uses, and unloads modules on demand
  • Modules
  • Self-contained software units

13
Component frameworks
  • Point-to-point Transport Layer (PTL)
  • Allows the use of multiple networks
  • Ex) TCP/IP, Myrinet, etc.
  • Point-to-point Management Layer (PML)
  • Message fragmentation, scheduling, and
    re-assembly services
  • Collective Communication (COLL)
  • Process Topology (TOPO)
  • May benefit from topology-awareness
  • Reduction Operation
  • Parallel I/O

14
Advantages of Component architecture
  • Multiple components can be active within a single MPI process
  • Ex) the PTL framework can use several network device drivers at once
  • Provides a convenient way to use third-party software
  • Provides a fine-grained, run-time,
    user-controlled component selection mechanism

15
Collective Components
  • A component paired with a communicator
  • Becomes a module
  • Top-level MPI collective functions
  • Reduced to thin wrappers
  • Error checking of parameters
  • One coll module
  • Assigned to each communicator
  • Ex) MPI_BCAST
  • Simply checks the passed parameters
  • Invokes the back-end broadcast function
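
An illustrative sketch of this thin-wrapper dispatch (the type and field names here are assumptions for illustration, not the actual Open MPI source):

    #include <mpi.h>

    struct my_comm;                   /* forward declaration of a hypothetical communicator */

    /* a coll module is a bundle of collective routines selected for one communicator */
    struct coll_module {
        int (*bcast)(void *buf, int count, MPI_Datatype type, int root,
                     struct my_comm *comm);
        /* ... barrier, reduce, allgather, ... */
    };

    /* hypothetical internal view of a communicator: it holds its selected module */
    struct my_comm {
        struct coll_module *coll;
        /* ... group, context id, ... */
    };

    /* the top-level collective is a thin wrapper: validate, then dispatch */
    int my_bcast(struct my_comm *comm, void *buf, int count,
                 MPI_Datatype type, int root)
    {
        if (count < 0 || root < 0)
            return MPI_ERR_ARG;       /* parameter error checking */
        return comm->coll->bcast(buf, count, type, root, comm);
    }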

16
Implementation models
  • Layered over point-to-point
  • Utilizes MPI point-to-point functions
  • Ex) MPI_SEND, MPI_RECV, etc.
  • Lets the author concentrate on the core algorithm
    (see the sketch after this list)
  • Alternate communication channels
  • Ex) Myrinet, UDP multicast, etc.
  • Hierarchical coll component
  • basic component
  • Basic implementations of all collective
    operations
  • More complex model (bridge)
  • Uses a hierarchy of coll modules under a
    single, top-level MPI collective
  • Allows each network to utilize its own optimized
    coll component
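
A minimal sketch of a collective layered over point-to-point, as referenced above: a linear fan-out broadcast written entirely with standard MPI point-to-point calls (a real component would typically use a tree or pipelining, but the structure is the same):

    #include <mpi.h>

    int linear_bcast(void *buf, int count, MPI_Datatype type,
                     int root, MPI_Comm comm)
    {
        int rank, size, peer, err;

        MPI_Comm_rank(comm, &rank);
        MPI_Comm_size(comm, &size);

        if (rank == root) {
            /* the root sends the buffer to every other rank, one by one */
            for (peer = 0; peer < size; ++peer) {
                if (peer == root)
                    continue;
                err = MPI_Send(buf, count, type, peer, 0, comm);
                if (err != MPI_SUCCESS)
                    return err;
            }
            return MPI_SUCCESS;
        }

        /* everyone else receives directly from the root */
        return MPI_Recv(buf, count, type, root, 0, comm, MPI_STATUS_IGNORE);
    }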

17
Example of hierarchical coll component
18
Component / Module Lifecycle
  • Component
  • Open: per-process initialization
  • Selection: per-scope; determines whether the component wants to be used
  • Close: per-process finalization
  • Module
  • Initialization: if the component is selected
  • Normal usage / checkpoint
  • Finalization: per-scope cleanup

19
Coll components lifecycle
  • Selection
  • The coll framework queries each available coll
    component
  • Factors such as the run-time environment or
    topology are considered
  • The best-suited component is chosen (see the sketch
    after this list)
  • Initialization
  • Receives the target communicator as a parameter
  • After setup, a module holding local state for the
    target communicator is returned
  • Potential run-time optimizations
  • Pre-computation
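
An illustrative sketch of such a selection pass; the query-and-priority mechanism and all names below are assumptions for illustration, not the actual framework code:

    struct my_comm;                    /* hypothetical communicator type */
    struct coll_module;                /* hypothetical per-communicator module */

    struct coll_component {
        /* returns NULL if the component cannot serve this communicator,
           otherwise a module plus a priority reflecting how well suited it is */
        struct coll_module *(*comm_query)(struct my_comm *comm, int *priority);
    };

    struct coll_module *select_coll_module(struct coll_component **comps,
                                           int ncomps, struct my_comm *comm)
    {
        struct coll_module *best = NULL, *m;
        int best_pri = -1, pri, i;

        for (i = 0; i < ncomps; ++i) {
            /* each component inspects the run-time environment / topology */
            m = comps[i]->comm_query(comm, &pri);
            if (m != NULL && pri > best_pri) {
                best = m;
                best_pri = pri;
            }
        }
        return best;   /* later initialized with communicator-local state */
    }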

20
Coll components lifecycle
  • Checkpoint / Restart
  • For coll modules layered on top of point-to-point
    functionality
  • The point-to-point modules perform it
  • Optional
  • Normal usage
  • The module's collective routines are invoked
  • Whenever a collective function is called on the
    communicator
  • Finalization
  • Occurs when the communicator is destroyed

21
Component / Module Interfaces
  • Emphasis on simplicity
  • Main groups of interface functions
  • One-time (per process) initialization
  • Ex) determine threading characteristics during
    MPI_INIT
  • Per-scope query
  • Per-scope initialization
  • Normal usage and checkpoint / restart
    functionality
  • Per-scope finalization
  • One-time (per process) finalization
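
An illustrative sketch of how these groups of interface functions could look as structs of function pointers; all names and signatures here are assumptions for illustration, and the actual Open MPI headers differ:

    struct my_comm;                       /* hypothetical communicator type */

    struct coll_module {
        /* per-scope initialization: set up communicator-local state */
        int (*init)(struct my_comm *comm);
        /* normal usage: the collective routines themselves */
        int (*bcast)(void *buf, int count, int root, struct my_comm *comm);
        /* checkpoint / restart hooks (optional) */
        int (*checkpoint)(struct my_comm *comm);
        /* per-scope finalization: release communicator-local state */
        int (*finalize)(struct my_comm *comm);
    };

    struct coll_component {
        /* one-time, per-process initialization (e.g. during MPI_INIT,
           where threading characteristics can be determined) */
        int (*open)(void);
        /* per-scope query: can this component serve the given communicator? */
        struct coll_module *(*comm_query)(struct my_comm *comm, int *priority);
        /* one-time, per-process finalization */
        int (*close)(void);
    };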

22
Example of the coll component interface
23
Example of the coll module interface
24
Example Components
  • The basic component
  • A full set of intra- and inter-communicator
    collectives
  • The smp component
  • Maximizes bandwidth conservation across multiple
    levels of network latency
  • Ex) MagPIe (communication of uniprocessors across
    a WAN)
  • Segments communicators into groups
  • Communicates with other groups through
    representatives (see the sketch after this list)
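
A hedged sketch of the representative-based idea described above (not the original slide's pseudocode): one representative per group receives the data over the slow inter-group network, then re-broadcasts it inside its own group; leader_comm and local_comm are assumed sub-communicators created by the caller.

    #include <mpi.h>

    int hierarchical_bcast(void *buf, int count, MPI_Datatype type,
                           MPI_Comm leader_comm,  /* representatives only, or MPI_COMM_NULL */
                           MPI_Comm local_comm)   /* all ranks of one group */
    {
        int err = MPI_SUCCESS;

        /* step 1: inter-group broadcast among the representatives
           (rank 0 of leader_comm is assumed to be the overall root) */
        if (leader_comm != MPI_COMM_NULL)
            err = MPI_Bcast(buf, count, type, 0, leader_comm);
        if (err != MPI_SUCCESS)
            return err;

        /* step 2: intra-group broadcast from each representative,
           assumed to be rank 0 of its local_comm */
        return MPI_Bcast(buf, count, type, 0, local_comm);
    }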

25
Broadcast Scenario
26
Pseudocode for the Scenario
27
Conclusion
  • Third-party researchers can develop and test new
    algorithms easily
  • Through a standard component architecture
  • There is already a good architecture in place
  • So, should we now optimize collective operations?