MPJ: The second generation - PowerPoint PPT Presentation

1
MPJ: The second generation MPI for Java
  • Aamir Shafi
  • 26th April, 2005
  • Distributed Systems Group
  • http://dsg.port.ac.uk

2
People
  • Aamir Shafi
  • Bryan Carpenter
  • Open Middleware Infrastructure Institute (OMII)
  • Mark Baker

3
Presentation outline
  • Introduction
  • Design and implementation of MPJ
  • The runtime infrastructure
  • Implementation issues
  • Conclusion

4
Introduction
  • MPI was introduced in June 1994 as a standard
    message passing API for parallel scientific
    computing.
  • Language bindings for C, C++, and Fortran
  • Java Grande Message Passing Workgroup defined
    Java bindings in 1998
  • Previous efforts follow two approaches
  • JNI approach
  • Pure Java approach
  • Remote Method Invocation (RMI)
  • Sockets

5
Introduction: Pure Java approach
  • RMI
  • Meant for client server applications
  • Java Sockets
  • Java New I/O package
  • Adds non-blocking I/O to the Java language
  • Direct Buffers
  • Allocated in the native OS memory and the JVM
    attempts to provide faster I/O
  • Communication performance
  • Comparison of Java NIO and C Netpipe drivers
  • Java performs similarly to C on Fast Ethernet
  • A very naïve comparison
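
The two NIO features named on this slide, non-blocking channels and direct buffers, can be sketched in a few lines. This is a minimal illustration over a loopback TCP connection, not niodev code; the class name NioLoopback and the method roundTrip are made up for the example.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of MPJ.
final class NioLoopback {
    /** Sends text over a loopback connection and returns what the receiver read. */
    static String roundTrip(String text) throws IOException {
        byte[] payload = text.getBytes(StandardCharsets.UTF_8);
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress("127.0.0.1", 0)); // port 0: any free port
            try (SocketChannel sender = SocketChannel.open(server.getLocalAddress());
                 SocketChannel receiver = server.accept();
                 Selector selector = Selector.open()) {
                // Direct buffers are allocated outside the Java heap.
                ByteBuffer out = ByteBuffer.allocateDirect(payload.length);
                out.put(payload).flip();
                while (out.hasRemaining()) {
                    sender.write(out);
                }
                // Non-blocking receive: read() returns immediately, so a
                // Selector is used to wait until data is actually available.
                receiver.configureBlocking(false);
                receiver.register(selector, SelectionKey.OP_READ);
                ByteBuffer in = ByteBuffer.allocateDirect(payload.length);
                while (in.hasRemaining()) {
                    selector.select();
                    selector.selectedKeys().clear();
                    if (receiver.read(in) < 0) {
                        throw new IOException("connection closed early");
                    }
                }
                in.flip();
                byte[] got = new byte[in.remaining()];
                in.get(got);
                return new String(got, StandardCharsets.UTF_8);
            }
        }
    }
}
```

A real device would keep one selector thread for many channels; the single-connection loop here only shows the API shape.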

6
  1. The latency is 250 microseconds
  2. After 1k, the latency starts increasing due to
    fragmentation of packets
  3. Netpipe is a single-threaded simple benchmark

7
  1. Max throughput is 90 Mbps
  2. It would be great if MPJ, with all its
    complexities, could reach 80 Mbps

8
Introduction: JNI approach
  • Importance of JNI cannot be ignored
  • Where Java fails, JNI makes it work
  • HPC communication hardware has continued to
    advance
  • Network latency has been reduced to a couple of
    microseconds
  • Pure Java looks like an impractical solution
  • In the presence of Myrinet, no application
    developer/user would opt for Fast Ethernet
  • Cons
  • Not in keeping with the Java philosophy of write
    once, run anywhere

9
Introduction
  • For Java messaging
  • There is no one size fits all approach
  • Portability and high performance are often
    contradictory requirements
  • Portability: pure Java
  • High performance: JNI
  • The choice between portability and high
    performance should best be left to application
    developers
  • The challenging issue is how to manage these
    contradictory requirements
  • How to provide a flexible mechanism to help
    applications swap communication protocols?

10
Presentation outline
  • Introduction
  • Design and implementation
  • The runtime infrastructure
  • Implementation issues
  • Conclusion

11
Design
  • Aims
  • Support swapping various communication devices
  • Two device levels
  • The MPJ Device level (mpjdev)
  • Separates native MPI device from all other
    devices
  • native MPI device is a special case
  • Possible to cut through and make use of native
    implementation of advanced MPI features
  • The xdev Device level (xdev)
  • gmdev: xdev based on the GM 2.x comms library
  • niodev: xdev based on the Java NIO API
  • smpdev: xdev based on the Threads API
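
The idea of swapping communication devices by name can be sketched as a small registry of factories. The slides do not show the real mpjdev or xdev interfaces, so CommDevice, DeviceRegistry, and StubDevice below are hypothetical names used only to illustrate the pluggable pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical minimal device contract; the real xdev API is not shown in the slides. */
interface CommDevice {
    String name();
    void send(int destRank, byte[] message);
}

/** Maps a device name such as "niodev" to a factory that creates it. */
final class DeviceRegistry {
    private static final Map<String, Supplier<CommDevice>> FACTORIES =
            new ConcurrentHashMap<>();

    static void register(String name, Supplier<CommDevice> factory) {
        FACTORIES.put(name, factory);
    }

    /** Creates the device the user asked for, or fails if it is unknown. */
    static CommDevice newInstance(String name) {
        Supplier<CommDevice> factory = FACTORIES.get(name);
        if (factory == null) {
            throw new IllegalArgumentException("Unknown device: " + name);
        }
        return factory.get();
    }
}

/** Stand-in device that only records which ranks it was asked to send to. */
final class StubDevice implements CommDevice {
    private final String name;
    final List<Integer> destinations = new ArrayList<>();

    StubDevice(String name) { this.name = name; }

    @Override public String name() { return name; }

    @Override public void send(int destRank, byte[] message) {
        destinations.add(destRank);
    }
}
```

With this shape, the upper layers depend only on the interface, and the device name given at start-up decides which implementation is instantiated.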

12
MPJ design
13
Implementation
  • Point to point communications
  • Collective communications
  • Groups, communicators, and contexts
  • Derived datatypes
  • Vector, Indexed, Contiguous, and Struct
  • Explicit packing and unpacking
  • Process Topologies
  • Cartesian
  • Graph
  • Possible to cut through to the native MPI
    implementation
  • As of today, three methods (Dims_create, Cancel,
    and Wtick) are left unimplemented
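
To make the Vector datatype and explicit packing concrete, here is a minimal sketch of packing a strided selection of doubles into a contiguous buffer, using the usual MPI vector parameters (count blocks of blockLength elements, with block starts stride elements apart). VectorPacker is a hypothetical helper, not the MPJ implementation.

```java
import java.nio.ByteBuffer;

// Hypothetical helper illustrating vector-style packing; not MPJ code.
final class VectorPacker {
    /** Copies count blocks of blockLength doubles, block starts stride apart, into one buffer. */
    static ByteBuffer pack(double[] src, int count, int blockLength, int stride) {
        ByteBuffer buf = ByteBuffer.allocate(count * blockLength * Double.BYTES);
        for (int block = 0; block < count; block++) {
            for (int i = 0; i < blockLength; i++) {
                buf.putDouble(src[block * stride + i]);
            }
        }
        buf.flip(); // ready for reading or sending
        return buf;
    }

    /** Inverse of pack: scatters the buffer contents back into strided positions. */
    static void unpack(ByteBuffer buf, double[] dst, int count, int blockLength, int stride) {
        for (int block = 0; block < count; block++) {
            for (int i = 0; i < blockLength; i++) {
                dst[block * stride + i] = buf.getDouble();
            }
        }
    }
}
```

For an array of eight doubles, pack(src, 2, 2, 4) selects the elements at indices 0, 1, 4, and 5.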

14
Presentation outline
  • Introduction
  • Design and implementation
  • The runtime infrastructure
  • Implementation issues
  • Conclusion

15
The runtime infrastructure
  • All MPI libraries face the task of bootstrapping
    MPI processes over network computers
  • RSH/SSH based scripts are the most common
  • LAM/MPI daemons and runtime system work on
    UNIX-based OSes
  • No version of LAM for Windows
  • MPICH has recently introduced SMPD (Super Multi
    Purpose Daemon)
  • According to docs
  • Works on Linux and Windows
  • Difficult (if not impossible) to interface with
    Java

16
Runtime: MPJDaemon and MPJStarter modules
  • Consists of two modules
  • The daemon that runs on compute nodes (MPJDaemon)
  • The starter module that runs on head nodes
    (MPJStarter)
  • Installing MPJDaemon on compute nodes
  • RSH/SSH based scripts can easily install daemon
    on UNIX based OSes
  • Could be installed as services (/etc/init.d)
  • Two files are required to install as a service on
    Windows

17
Runtime: MPJDaemon on UNIX-based OSes
  • MPJ_HOME/bin/mpjdaemon is an rc shell script that
    starts and stops the daemon
  • Installation as an app
  • cd MPJ_HOME/bin
  • ./mpjdaemon start
  • Could use RSH/SSH script to install on whole UNIX
    cluster
  • Installation as a service
  • cp MPJ_HOME/bin/mpjdaemon /etc/init.d
  • Adding to the default runtime
  • rc-update add mpjdaemon default (Gentoo Linux)
  • /etc/init.d/mpjdaemon start/stop/status

18
Runtime: MPJDaemon on Windows
  • cd MPJ_HOME/bin
  • InstallMPJDaemon-NT.bat
  • This bat file installs the daemon as a service

19
Runtime: MPJDaemon as services
  • Apache Commons Daemon
  • The source bundle does not even compile
  • The project is no longer active
  • Spent a week trying to make it work on Windows
  • Gave up!
  • Java Service Wrapper
  • Simple and does what it says
  • Support for almost all platforms (wherever you
    can run Java)
  • Distributed under MIT License
  • Redistribute without any restrictions

20
Runtime: JMX M&M
  • Claims monitoring and management of Java apps
  • Start Java app with following switch
  • -Dcom.sun.management.jmxremote
  • Run jconsole
  • Possible to connect to remote and local JVMs
  • Useful if the application is an MBean
  • Application attributes can be read and set
    remotely
  • Possibility
  • MPJDaemon could be operated remotely

21
JMX M&M: Connection GUI
22
JMX M&M: Connection summary
23
JMX M&M: JVM memory
24
JMX M&M: JVM threads
25
JMX M&M: JVM info
26
Runtime: Dynamic class loading (1)
  • The application (parallel program) and MPJ
    library is dynamically loaded into the daemon
    JVM
  • No need to copy jar files
  • No shared file system assumption
  • MPJStarter starts the light-weight HTTP server
    (Jetty), which serves the jar file containing
    parallel program

27
Runtime: Dynamic class loading (2)
  • For example, HiMPJ.java is a parallel program
  • Requires mpj.jar to compile and run
  • Bundle it into a jar file, specifying a manifest
    file with a Class-Path attribute pointing to
    mpj.jar
  • Write the manifest file:
  • Manifest-Version: 1.0
  • Main-Class: HiMPJ
  • Class-Path: mpj.jar
  • jar cfm himpj.jar manifest HiMPJ.class
  • Copy it to MPJ_HOME/lib directory
  • Executing MPJStarter
  • cd MPJ_HOME/bin
  • starter.sh/bat 2 himpj.jar ../lib xdev
    niodev
  • JarClassLoader will load himpj.jar and mpj.jar
    into the daemon's JVM
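
The loading step can be approximated with the JDK's own URLClassLoader, which accepts HTTP URLs to jar files and follows the Class-Path entry in a jar's manifest. This is a sketch of the general technique, not the JarClassLoader used by MPJ; JarLaunchSketch and StubProgram are hypothetical names.

```java
import java.lang.reflect.Method;
import java.net.URI;
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical sketch of URL-based loading; MPJ's JarClassLoader may differ.
final class JarLaunchSketch {
    /** Builds a loader that fetches classes from the given jar URLs on demand. */
    static URLClassLoader loaderFor(String... jarUrls) throws Exception {
        URL[] urls = new URL[jarUrls.length];
        for (int i = 0; i < jarUrls.length; i++) {
            urls[i] = URI.create(jarUrls[i]).toURL();
        }
        return new URLClassLoader(urls, JarLaunchSketch.class.getClassLoader());
    }

    /** Loads className through loader and calls its public static main(String[]). */
    static void invokeMain(ClassLoader loader, String className, String[] args)
            throws Exception {
        Class<?> program = Class.forName(className, true, loader);
        Method main = program.getMethod("main", String[].class);
        main.setAccessible(true); // the program class itself need not be public
        main.invoke(null, (Object) args);
    }
}

/** Stand-in for a parallel program such as HiMPJ; it only records its arguments. */
final class StubProgram {
    static volatile String[] lastArgs;

    public static void main(String[] args) {
        lastArgs = args;
    }
}
```

On a daemon, the call would look like invokeMain(loaderFor(jarUrl), "HiMPJ", args), where jarUrl points at the jar served by the starter's HTTP server; the exact URL layout is not given in the slides.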

28
Presentation outline
  • Introduction
  • Design and implementation
  • The runtime infrastructure
  • Implementation issues
  • Conclusion

29
Issue 1: Shared memory device
  • Based on Java Threads API
  • Each thread is an MPI process
  • Communicates with other threads by sending
    messages
  • All threads run in the same JVM
  • Cannot have static variables in the parallel
    program
  • Static variables within the MPJ library require
    synchronized access
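
The thread-per-process model can be sketched with one mailbox queue per rank. This is only an illustration of the idea, assuming a simple ring exchange; ThreadRanks is a hypothetical class and smpdev itself may be structured quite differently. All per-process state lives in locals and captured variables, since statics would be shared by every rank in the JVM.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch of "each thread is an MPI process"; not smpdev code.
final class ThreadRanks {
    /** Each rank sends its id to the next rank in a ring and records what it receives. */
    static int[] ringExchange(int nProcs) throws InterruptedException {
        List<BlockingQueue<Integer>> inbox = new ArrayList<>();
        for (int r = 0; r < nProcs; r++) {
            inbox.add(new LinkedBlockingQueue<>());
        }
        int[] received = new int[nProcs];
        Thread[] procs = new Thread[nProcs];
        for (int r = 0; r < nProcs; r++) {
            final int rank = r;
            procs[r] = new Thread(() -> {
                try {
                    inbox.get((rank + 1) % nProcs).put(rank); // "send" to the right neighbour
                    received[rank] = inbox.get(rank).take();  // "recv" from the left neighbour
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, "rank-" + r);
            procs[r].start();
        }
        for (Thread t : procs) {
            t.join(); // join makes each thread's write to received[] visible here
        }
        return received;
    }
}
```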

30
Issue 2: Synchronization problems with threads in
smpdev
  • Each MPJDaemon is assigned a number of processes
    to be executed
  • In case of smpdev, all processes run on the same
    machine
  • MPJDaemon loads the parallel program
  • JarClassLoader.loadClass(parallelProgramName)
  • Once loaded, the program is started as follows
  • JarClassLoader.invokeClass(pClass, args)

31
Issue 2: Synchronization problems with threads in
smpdev
  • For example, MPJStarter requests MPJDaemons to
    start 2 processes (threads)
  • MPJDaemon starts two threads, each of which first
    loads and then starts the program
  • Processes (threads) started in this way do not
    share static variables and cannot synchronize
  • In order to share static variables and sync them,
    the class should be loaded just once, and
    executed N times
  • It was implemented in this way because niodev
    requires the exact opposite behaviour: no
    sharing of static variables
  • Currently, the user specifies which device should
    be used
  • In case of niodev, the loading is done twice
  • In case of smpdev, the loading is done only once
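
The effect of loading once versus loading per process can be reproduced with plain class loaders: a class defined by two different loaders gets two independent sets of static variables. The sketch below assumes Java 9 or later and that compiled classes are readable as resources; CountingProgram, IsolatingLoader, and LoadingPolicy are hypothetical names, not MPJ classes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.lang.reflect.Method;

/** Stand-in for a parallel program that keeps static state. */
final class CountingProgram {
    private static int launches;

    public static synchronized int launch() {
        return ++launches;
    }
}

/** Defines one named class itself instead of delegating, so each loader gets its own copy. */
final class IsolatingLoader extends ClassLoader {
    private final String isolated;

    IsolatingLoader(ClassLoader parent, String isolated) {
        super(parent);
        this.isolated = isolated;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        if (!name.equals(isolated)) {
            return super.loadClass(name, resolve); // everything else is shared
        }
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                String path = name.replace('.', '/') + ".class";
                try (InputStream in = getParent().getResourceAsStream(path)) {
                    if (in == null) {
                        throw new ClassNotFoundException(name);
                    }
                    byte[] bytes = in.readAllBytes();
                    c = defineClass(name, bytes, 0, bytes.length);
                } catch (IOException e) {
                    throw new ClassNotFoundException(name, e);
                }
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }
}

final class LoadingPolicy {
    /** Returns the counter value each simulated process observes when it starts. */
    static int[] run(int nProcs, boolean loadOnce) throws Exception {
        ClassLoader parent = LoadingPolicy.class.getClassLoader();
        String program = CountingProgram.class.getName();
        Class<?> shared = new IsolatingLoader(parent, program).loadClass(program);
        int[] seen = new int[nProcs];
        for (int rank = 0; rank < nProcs; rank++) {
            // smpdev-style: reuse one class; niodev-style: a fresh copy per process.
            Class<?> c = loadOnce
                    ? shared
                    : new IsolatingLoader(parent, program).loadClass(program);
            Method launch = c.getMethod("launch");
            launch.setAccessible(true);
            seen[rank] = (Integer) launch.invoke(null);
        }
        return seen;
    }
}
```

With loadOnce set, two processes see 1 and then 2 because they share the counter; with a fresh loader per process, both see 1.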

32
Issue 3: Cygwin
  • If running MPJ on Cygwin,
  • chmod o+w $MPJ_HOME/logs
  • chmod a+x $MPJ_HOME/lib/*.dll
  • Is MPJDaemon a Windows service, or a Linux
    service on Cygwin?

33
(Future) Issue 4: Specifying multiple devices
  • Currently, only one device can be specified
  • Either niodev or smpdev will be selected as the
    primary comms device
  • But for SMP clusters, it would be ideal
  • To use smpdev on an SMP node
  • Use niodev/gmdev for internode comms

34
(Future) Issue 5: Starting MPJ with native MPI
device
  • mpiJava/native MPI device uses mpirun to
    bootstrap MPI processes
  • To bring it in line with other devices, native
    MPI device will have to be started by MPJ runtime
    infrastructure

35
Issue 6: Multiple users running MPJDaemons at the
same time
  • Install daemons as an app
  • Agree on the port numbers

36
Presentation outline
  • Introduction
  • Design and Implementation
  • The runtime infrastructure
  • Implementation Issues
  • Conclusion

37
Summary
  • The key issue for Java messaging is not the
    debate between the pure Java and JNI approaches
  • But providing a flexible mechanism to swap
    various comms protocols
  • MPJ has a pluggable architecture
  • We are implementing niodev, gmdev, smpdev,
    and native MPI device
  • MPJ runtime infrastructure allows bootstrapping
    MPI processes across various platforms
  • MPJDaemons can be installed as native OS services

38
Conclusions
  • We are slowly but surely moving towards the first
    release of MPJ, the next generation of MPI for
    Java
  • Current Status
  • Unit Testing
  • MPJ follows the same API as mpiJava
  • The parallel applications built on top of mpiJava
    will work with MPJ
  • There are some differences in the API
  • Bsend, and explicit packing/unpacking -- see
    release docs for more details
  • Arguably, the first MPI library for Java that
    implements real messaging stuff in pure Java

39
Questions?