1
The Future of MPI
  • William Gropp, Argonne National Laboratory,
    www.mcs.anl.gov/gropp

2
The Success of MPI
  • Applications
  • Most recent Gordon Bell prize winners use MPI
  • Libraries
  • Growing collection of powerful software
    components
  • Tools
  • Performance tracing (Vampir, Jumpshot, etc.)
  • Debugging (Totalview, etc.)
  • Results
  • Papers: http://www.mcs.anl.gov/mpi/papers
  • Clusters
  • Ubiquitous parallel computing

3
Why Was MPI Successful?
  • It addresses all of the following issues:
  • Portability
  • Performance
  • Simplicity and Symmetry
  • Modularity
  • Composability
  • Completeness

4
Portability and Performance
  • Portability does not require a lowest common
    denominator approach
  • Good design allows the use of special,
    performance enhancing features without requiring
    hardware support
  • MPI's nonblocking message-passing semantics
    allow but do not require zero-copy data
    transfers
  • By the way, the right term is greatest common
    denominator
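
A minimal sketch (not from the slides) of the nonblocking idiom in C: the sender promises not to touch the buffer between MPI_Isend and MPI_Wait, which permits, but does not require, the implementation to transfer directly from user memory (zero-copy).

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        double buf[1000];
        MPI_Request req;
        if (rank == 0) {
            for (int i = 0; i < 1000; i++) buf[i] = i;
            /* The implementation may send straight from buf, since buf
               must not be modified until MPI_Wait returns. */
            MPI_Isend(buf, 1000, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD, &req);
            /* ... computation that does not touch buf may overlap ... */
            MPI_Wait(&req, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Irecv(buf, 1000, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &req);
            MPI_Wait(&req, MPI_STATUS_IGNORE);
        }
        MPI_Finalize();
        return 0;
    }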

5
Simplicity and Symmetry
  • MPI is organized around a small number of
    concepts
  • The number of routines is not a good measure of
    complexity
  • Fortran
  • Large number of intrinsic functions
  • C and Java runtimes are large
  • Development Frameworks
  • Hundreds to thousands of methods
  • This doesn't bother millions of programmers

6
Measuring Complexity
  • Complexity should be measured in the number of
    concepts, not functions or size of the manual
  • MPI is organized around a few powerful concepts
  • Point-to-point message passing
  • Datatypes
  • Blocking and nonblocking buffer handling
  • Communication contexts and process groups

7
Elegance of Design
  • MPI often uses one concept to solve multiple
    problems
  • Example: Datatypes
  • Describe noncontiguous data transfers, necessary
    for performance
  • Describe data formats, necessary for
    heterogeneous systems
  • Proof of elegance
  • Datatypes are exactly what is needed for
    high-performance I/O, added in MPI-2.
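
One illustration (mine, not the slide's): MPI_Type_vector describes a strided, noncontiguous selection once, and the same committed datatype then works in sends, receives, and (since MPI-2) file I/O. The array, counts, and destination below are arbitrary, and MPI is assumed already initialized.

    double a[1000];          /* source array (illustrative) */
    MPI_Datatype strided;

    /* Describe every 4th element: 250 blocks of length 1, stride 4. */
    MPI_Type_vector(250, 1, 4, MPI_DOUBLE, &strided);
    MPI_Type_commit(&strided);

    /* One send moves all 250 noncontiguous doubles; the implementation
       may pack them or drive scatter/gather hardware directly. */
    MPI_Send(a, 1, strided, 1 /* dest */, 0 /* tag */, MPI_COMM_WORLD);
    MPI_Type_free(&strided);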

8
Parallel I/O
  • Collective model provides high I/O performance
  • Matches applications' most general view:
    objects distributed among processes
  • MPI Datatypes extend I/O model to noncontiguous
    data in both memory and file
  • Unix readv/writev apply only to memory
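
A sketch of the collective I/O model (not on the slide), assuming filetype is a committed MPI datatype describing this process's possibly noncontiguous portion of the shared file, and local holds nlocal doubles:

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "data.out",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* The view makes each process see only its own slice of the file. */
    MPI_File_set_view(fh, 0, MPI_DOUBLE, filetype, "native",
                      MPI_INFO_NULL);

    /* Collective write: the implementation can merge the pieces from
       all processes into a few large, well-formed I/O requests. */
    MPI_File_write_all(fh, local, nlocal, MPI_DOUBLE, MPI_STATUS_IGNORE);
    MPI_File_close(&fh);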

9
Parallel I/O Performance with MPI-IO
[Charts: I/O bandwidth for structured mesh I/O and
unstructured grid I/O using MPI-IO; POSIX I/O too
slow to show]
10
No One Is Perfect
  • Groups and group manipulation
  • MPI provides many routines for creating and
    manipulating groups (e.g., MPI_Group_intersection,
    MPI_Comm_group, MPI_Comm_create)
  • None of these is needed in MPI-1;
    MPI_Comm_split should be used instead (see the
    sketch after this list)
  • But groups are needed in MPI-2 for scalable
    remote memory synchronization, another example
    of a powerful concept having multiple uses
  • Cancel of sends
  • Difficult to implement correctly
  • Little benefit to virtually all applications
  • Semantics don't even match what users often
    want (stop the message even if it has started)
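
For example, a sketch (mine) of the MPI_Comm_split alternative: processes passing the same color end up together in a new communicator, with no explicit group objects needed.

    int rank, color;
    MPI_Comm row_comm;

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    color = rank / 4;   /* illustrative: rows of 4 processes each */

    /* All processes with the same color form one new communicator;
       rank serves as the ordering key within each row. */
    MPI_Comm_split(MPI_COMM_WORLD, color, rank, &row_comm);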

11
Modularity
  • Modern algorithms are hierarchical
  • Do not assume that all operations involve all or
    only one process
  • Provide tools that don't limit the user
  • Modern software is built from components
  • MPI designed to support libraries
  • Many applications have no explicit MPI calls;
    all MPI is contained within well-designed
    libraries

12
Composability
  • Environments are built from components
  • Compilers, libraries, runtime systems
  • MPI designed to play well with others
  • MPI exploits the newest advances in compilers
  • without ever talking to compiler writers
  • OpenMP is an example
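
A minimal hybrid sketch (my illustration): OpenMP parallelizes the node-local loop while MPI combines results across processes; neither needs to know about the other. MPI_THREAD_FUNNELED is assumed sufficient because only the main thread calls MPI.

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int provided;
        /* Only the main thread makes MPI calls, so FUNNELED suffices. */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);

        double sum = 0.0, total = 0.0;
        #pragma omp parallel for reduction(+:sum)
        for (int i = 1; i <= 1000000; i++)
            sum += 1.0 / (double)i;     /* node-local work via OpenMP */

        /* MPI combines the per-process partial sums. */
        MPI_Allreduce(&sum, &total, 1, MPI_DOUBLE, MPI_SUM,
                      MPI_COMM_WORLD);

        MPI_Finalize();
        return 0;
    }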

13
Completeness
  • MPI provides a complete parallel programming
    model and avoids simplifications that limit the
    model
  • Contrast: models that require that
    synchronization occur only collectively, for
    all processes or tasks
  • Make sure that the functionality is there when
    the user needs it
  • Don't force the user to start over with a new
    programming model when a new feature is needed

14
Is Ease of Use the Overriding Goal?
  • MPI is often described as the assembly
    language of parallel programming
  • C and Fortran have been described as portable
    assembly languages
  • Ease of use is important. But completeness is
    more important.
  • Don't force users to switch to a different
    approach as their application evolves

15
Lessons From MPI
  • A general programming model for high-performance
    technical computing must address many issues to
    succeed
  • Even that is not enough. It also needs:
  • Good design
  • Buy-in by the community
  • Effective implementations
  • MPI achieved these through an Open Standards
    Process

16
An Open and Balanced Process
  • Balanced representation from
  • Users
  • What users want and need
  • Including correctness
  • Implementers (Vendors)
  • What can be provided
  • Many MPI features determined by implementation
    needs
  • Researchers
  • Directions and Futures
  • MPI planned for interoperation with OpenMP
    before OpenMP was conceived
  • Support for libraries strongly influenced by
    research

17
Where Next?
  • Improving MPI
  • Simplifying and enhancing the expression of MPI
    programs
  • Improving MPI Implementations
  • Performance
  • Performance
  • Performance
  • New Directions
  • What can displace (or complement) MPI?
    (Yesterday's panel presentation on the
    programming models project and tomorrow's
    panel on the future of supercomputing)

18
Improving MPI
  • Simpler interfaces
  • Use compiler or precompiler techniques to support
    simpler, integrated syntax
  • Fortran 95 arrays, datatypes in C/C++
  • Eliminate function calls
  • Use program analysis and transformation to inline
    operations
  • More tools for correctness and performance
    debugging
  • MPI profiling interface is a good start (see
    the sketch after this list)
  • Debugger interface used by Totalview is an
    example of tool development
  • Effort to provide a common interface to internal
    performance data, such as idle time waiting for a
    message
  • Changes to MPI
  • E.g., MPI-2 RMA lacks a read-modify-write
    operation
  • But don't hold your breath
  • These require research and experimentation before
    they are ready for a standardization process
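
For instance, the profiling interface works by name shifting: every MPI_Xxx routine is also available as PMPI_Xxx, so a tool can interpose its own version of a routine without touching application source. A minimal sketch (the counter is my illustration):

    #include <mpi.h>

    static int send_count = 0;   /* tool-side bookkeeping */

    /* Linked ahead of the MPI library, this wrapper records each
       call and then forwards to the real implementation. */
    int MPI_Send(const void *buf, int count, MPI_Datatype datatype,
                 int dest, int tag, MPI_Comm comm)
    {
        send_count++;
        return PMPI_Send(buf, count, datatype, dest, tag, comm);
    }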

19
Improving MPI Implementations
  • Faster Point-to-point
  • Some current implementations make unnecessary
    copies
  • Collective operations
  • Better algorithms exist
  • SMP optimizations
  • Scatter-gather broadcast, reduce, etc.
  • Optimizing for new hardware
  • RDMA networks
  • NIC-enabled remote atomic operations
  • Wide area networks
  • Optimizations for high latency
  • Speculative sends
  • Quality of service extensions (through MPI
    attributes)
  • Massive scaling
  • Many implementations optimize internal buffers
    for modest numbers of processes
  • Some MPI routines (e.g., MPI_Graph_create) do not
    have scalable definitions

20
More Improvements for MPI Implementations
  • Reduce latency
  • Automatic techniques to compress code paths
  • Closer match to hardware capabilities
  • Improve RMA
  • Many current implementations are at best
    functional
  • Parallel I/O, particularly for clusters
  • Communication aggregation
  • Reliability in the presence of faults
  • Fault tolerance
  • Exploit MPI Intercommunicators to generalize the
    two-party model
  • Thread safe and efficient implementations
  • Lock-free design
  • Software engineering for a common MPI
    implementation source tree
  • Many groups working on improved MPI
    implementations
  • MPICH-2 is an all-new and efficient
    implementation
  • Includes many of these ideas
  • Designed, as MPICH was, to encourage others to
    experiment and extend MPI

21
What's New in MPICH2
  • Beta-test version available for groups that
    expect to perform research on MPI implementations
    with MPICH2
  • Version 0.92 released last Friday
  • Contains
  • All of MPI-1, MPI-I/O, service functions from
    MPI-2, and active-target RMA (see the RMA
    sketch after this list)
  • C, C++, and Fortran 77 bindings
  • Example devices for TCP, Infiniband, shared
    memory
  • Documentation
  • Passes extensive correctness tests
  • Intel test suite (as corrected), a good unit
    test suite
  • MPICH test suite, an adequate system test suite
  • Notre Dame C++ tests, based on the IBM C++
    test suite
  • Passes more tests than MPICH1 ☺
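
A sketch of the active-target RMA model mentioned above (my example): fences bracket the data movement, and every process in the window's group participates in the synchronization. MPI is assumed already initialized.

    double local = 3.14, target_mem[1];
    MPI_Win win;
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Each process exposes one double through the window. */
    MPI_Win_create(target_mem, sizeof(double), sizeof(double),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    MPI_Win_fence(0, win);              /* open the access epoch */
    if (rank == 0)
        MPI_Put(&local, 1, MPI_DOUBLE, 1 /* target */, 0, 1,
                MPI_DOUBLE, win);
    MPI_Win_fence(0, win);              /* put now visible at target */

    MPI_Win_free(&win);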

22
MPICH2 Research
  • The all-new implementation is our vehicle for
    research in:
  • Thread safety and efficiency (e.g., avoid thread
    locks)
  • Optimized MPI datatypes
  • Optimized Remote Memory Access (RMA)
  • High Scalability (64K MPI processes and more)
  • Exploiting Remote Direct Memory Access (RDMA)
    capable networks
  • All of MPI-2, including dynamic process
    management, parallel I/O, RMA
  • Usability and Robustness
  • Software engineering techniques that automate and
    simplify creating and maintaining a solid,
    user-friendly implementation
  • Allow extensive runtime error checking but do not
    require it
  • Integrated performance debugging
  • Clean interfaces to other system components such
    as scalable process managers

23
Some Target Platforms
  • Clusters (TCP, UDP, Infiniband, Myrinet,
    Proprietary Interconnects, ...)
  • Clusters of SMPs
  • Grids (UDP, TCP, Globus I/O, ...)
  • Cray Red Storm
  • BlueGene/x
  • 64K processors, 64K address spaces
  • ANL/IBM developing MPI for BG/L
  • QCDoC
  • Cray X1 (at least I/O)
  • Other systems

24
(Logical) Structure of MPICH-2
[Block diagram: MPICH-2 layered over three internal
interfaces. PMI (process management): MPD, remshell,
fork, bproc, and Windows and Unix (Python) managers.
ADIO (I/O): PVFS, NFS, XFS, HFS, SFS, and other
parallel file systems. ADI-3 (communication) over a
channel interface: TCP, shmem, Infiniband, Portals,
Myrinet and other NICs, BG/L, and a multi-method
device. Components are marked existing, in progress,
for others, or for vendors.]
25
Conclusions
  • The Future of MPI is Bright!
  • Higher-performance implementations
  • More libraries and applications
  • Better tools for developing and tuning MPI
    programs
  • Leverage of complementary technologies
  • Full MPI-2 implementations will become common
  • Several already exist; many Earth Simulator
    applications use MPI RMA