Title: mpjdev, the low-level MPJ device
1mpjdev, the low-level MPJ device
- Aamir Shafi
- aamir.shafi_at_port.ac.uk
- Distributed Systems Group
- 12th March, 2004
2Sequence
- Introduction
- History of MPJ
- Overview of java.nio package
- mpjdev implementation
- Introduction
- Packing/Unpacking at the device level
- Communication Protocols
- Eager Send
- Rendezvous Protocol
- Quick Start Guide to mpjdev
- What's next
3Introduction
- Parallel Programming
- Shared Memory
- Message Passing
- Message Passing
- MPI
- PVM
- Has Java got what it takes to be a parallel programming language?
4Message Passing in Java (MPJ)
- Specification Document
- mpiJava 1.2.5
- Uses native MPI
5Pure Java MPI-like libraries
- mpiJava 1.2.5
- Uses native MPI
- Best candidate to be developed to MPJ
- http://www.hpjava.org/mpiJava.html
- DSG's project MPJ (http://dsg.port.ac.uk/projects/mpj/) is tightly coupled to this effort
- JMPI
- Drawback: based on RMI
- http://euler.ecs.umass.edu/jmpi/
- MPJava
- Based on java.nio
- Results encouraging
- No source code
- DOGMA: dead
- MPP
- Basic library
- http://www.mi.uib.no/bjornoh/mtj/mpp/
7New I/O overview
- Buffer classes,
- Conversion among basic data types
- Direct and Indirect Buffer
- Scalable Applications
- No more one thread per client
- One thread, many clients
- Multiplexed synchronous I/O using selectors
- SocketChannel: new abstraction
- Pipe: one-way communication
8Buffer classes in java.nio
- TBuffer classes, T being each basic data type except boolean
- Any basic data type can be copied onto the ByteBuffer (the basic primitive)
- A buffer can be,
- Direct, allocateDirect(int)
- Indirect, allocate(int)
9
a. ByteBuffer buffer = ByteBuffer.allocate(8);
b. buffer.putInt(4);
c. buffer.flip();
10
a. ByteBuffer buffer = ByteBuffer.allocate(8);
b. ByteBuffer buffer = ByteBuffer.allocateDirect(8);
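The two buffer slides can be tried out with a short program. This is a minimal sketch using only the standard java.nio.ByteBuffer API; the class and method names (BufferBasics, roundTrip) are made up for illustration and are not part of mpjdev.

```java
import java.nio.ByteBuffer;

final class BufferBasics {
    // Writes one int into the buffer, flips it, and reads the value back.
    static int roundTrip(ByteBuffer buffer, int value) {
        buffer.clear();        // position = 0, limit = capacity
        buffer.putInt(value);  // position advances by 4 bytes
        buffer.flip();         // limit = 4, position = 0: ready for reading
        return buffer.getInt();
    }

    public static void main(String[] args) {
        ByteBuffer indirect = ByteBuffer.allocate(8);      // backed by a byte[] on the Java heap
        ByteBuffer direct = ByteBuffer.allocateDirect(8);  // memory outside the Java heap

        System.out.println(roundTrip(indirect, 4));
        System.out.println(roundTrip(direct, 4));
        System.out.println(indirect.isDirect() + " " + direct.isDirect());
    }
}
```

Both kinds of buffer expose the same API; the difference is only where the bytes live, which matters when the buffer is handed to a channel for I/O.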
11Selector
- A Selector provides connecting, accepting, reading and writing facilities to the SocketChannel
- OP_WRITE, OP_READ, OP_ACCEPT, OP_CONNECT
- Depends on native OS facilities
- Selection performs differently on Windows and Linux
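To make the "one thread, many clients" idea concrete, here is a minimal sketch that uses one selector to accept a loopback connection and then read one int from it, all on a single thread. It uses only standard java.nio calls; the class name SelectorSketch, the method name and the use of 127.0.0.1 with an ephemeral port are choices made for this example, not anything taken from mpjdev.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

final class SelectorSketch {
    // Sends one int over a loopback connection and receives it through a selector.
    static int sendAndReceive(int value) {
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.configureBlocking(false);
            server.socket().bind(new InetSocketAddress("127.0.0.1", 0)); // port 0: any free port
            server.register(selector, SelectionKey.OP_ACCEPT);
            int port = server.socket().getLocalPort();

            try (SocketChannel client =
                     SocketChannel.open(new InetSocketAddress("127.0.0.1", port))) {
                ByteBuffer out = ByteBuffer.allocate(4);
                out.putInt(value);
                out.flip();
                while (out.hasRemaining()) {
                    client.write(out);
                }

                ByteBuffer in = ByteBuffer.allocate(4);
                while (in.hasRemaining()) {
                    selector.select(); // blocks until some registered channel is ready
                    Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                    while (keys.hasNext()) {
                        SelectionKey key = keys.next();
                        keys.remove();
                        if (key.isAcceptable()) {
                            SocketChannel peer = server.accept();
                            if (peer != null) {
                                peer.configureBlocking(false);
                                peer.register(selector, SelectionKey.OP_READ);
                            }
                        } else if (key.isReadable()) {
                            SocketChannel peer = (SocketChannel) key.channel();
                            if (peer.read(in) < 0) {
                                throw new EOFException("peer closed early");
                            }
                            if (!in.hasRemaining()) {
                                peer.close();
                            }
                        }
                    }
                }
                in.flip();
                return in.getInt();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sendAndReceive(42));
    }
}
```

The same loop scales to many peers: every accepted channel is registered with the same selector, and the readiness set says which ones to service.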
12Sequence of events for selectors
13Taming the NIO circus
- "Taming the NIO circus" thread
- http://forum.java.sun.com/thread.jsp?forum=4&thread=459338&start=0&range=15&hilite=false&q=
- OutOfMemoryError
- http://forum.java.sun.com/thread.jsp?thread=433702&forum=4&message=2136979
- Selectors taking 100% CPU
- http://forum.java.sun.com/thread.jsp?forum=11&thread=494967
- http://forum.java.sun.com/thread.jsp?forum=4&thread=494194
- J2SE 1.5.0 beta solves many problems
15mpjdev Introduction
- What is a device?
- Sockets
- Similar to ADI in MPICH
- Meant for library developers (MPJ), not application programmers
- jGMA can use mpjdev
16mpjdev Introduction
- Single JVM Implementation
- Processes are threads in the single JVM
- Native implementation
- Uses native MPI
- LAPI Implementation
- Eager-send
- Rendezvous
- Buffer packing/unpacking are taken from these implementations
17mpjdev functions
- Comm methods
- void send(Buffer, int dest, int tag)
- Req isend(Buffer, int dest, int tag)
- Req irecv(Buffer, int src, int tag)
- For src, ANY_SRC
- For tag, ANY_TAG
- Status recv(Buffer, int src, int tag)
- Req methods
- Status Wait()
- Status Wait(Req)
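The wildcards ANY_SRC and ANY_TAG affect how a posted receive is matched against an incoming message. The sketch below shows one plausible matching rule; the numeric values of the two constants and the class name Matching are assumptions for illustration, since the slides do not give them.

```java
final class Matching {
    static final int ANY_SRC = -1; // hypothetical value, not taken from mpjdev
    static final int ANY_TAG = -2; // hypothetical value, not taken from mpjdev

    // True when a receive posted with (src, tag) accepts a message sent by msgSrc with msgTag.
    static boolean matches(int src, int tag, int msgSrc, int msgTag) {
        boolean srcOk = (src == ANY_SRC) || (src == msgSrc);
        boolean tagOk = (tag == ANY_TAG) || (tag == msgTag);
        return srcOk && tagOk;
    }

    public static void main(String[] args) {
        System.out.println(matches(0, 5, 0, 5));             // exact match
        System.out.println(matches(ANY_SRC, 5, 3, 5));       // any sender, fixed tag
        System.out.println(matches(0, ANY_TAG, 1, 5));       // wrong sender
    }
}
```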
18mpjdev Connectivity
19Communication Protocols: Eager Send
[Figure: Sender and Receiver, each with a selector (Sel); steps 1 to 6; huge memory at Receiver]
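The "huge memory at receiver" remark follows from how an eager protocol is usually understood: the sender transmits the data immediately, so the receiver must buffer anything that arrives before a matching recv is posted. The sketch below models only that receiver-side buffering in memory. It is an illustration under that general assumption, not the mpjdev code, and every name in it is hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;

final class EagerReceiverSketch {
    private static final class Arrived {
        final int tag;
        final byte[] data;
        Arrived(int tag, byte[] data) { this.tag = tag; this.data = data; }
    }

    // Messages that arrived before any matching recv was posted.
    private final Deque<Arrived> unexpected = new ArrayDeque<>();

    // Data arrives eagerly: the whole payload is copied and kept until someone asks for it.
    void onArrival(int tag, byte[] data) {
        unexpected.addLast(new Arrived(tag, data.clone()));
    }

    // Returns the oldest buffered payload with this tag, or null if none has arrived yet.
    byte[] recv(int tag) {
        for (Iterator<Arrived> it = unexpected.iterator(); it.hasNext(); ) {
            Arrived a = it.next();
            if (a.tag == tag) {
                it.remove();
                return a.data;
            }
        }
        return null;
    }

    // Memory currently held for messages nobody has received yet.
    int bufferedBytes() {
        int total = 0;
        for (Arrived a : unexpected) total += a.data.length;
        return total;
    }

    public static void main(String[] args) {
        EagerReceiverSketch r = new EagerReceiverSketch();
        r.onArrival(1, new byte[1024]);
        r.onArrival(2, new byte[2048]);
        System.out.println(r.bufferedBytes());   // memory grows with every early message
        System.out.println(r.recv(2).length);
        System.out.println(r.bufferedBytes());
    }
}
```

With many senders and large messages the buffered total can grow without bound, which is the motivation for a handshake-based protocol for big transfers.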
20Rendezvous Protocol
Sender
- Step 1: Sender posts send(); the Req is stored in the queue
- Step 2: Sender sends the control message asking whether a matching recv is posted
- Step 3: Sender receives the response confirming there is a matching recv
- Step 4: Sender sends the actual data
Receiver
- Step 1: Checks whether step 2 has already posted a recv; if yes, initiates step 3, else posts a request in the queue
- Step 2: Checks whether step 1 has posted a recv request; if yes, initiates step 3, else posts a request
- NOTE: Steps 1 and 2 need synchronization
- Step 3: Writes back to the sender that the receiver is ready to receive
- Step 4: Receives the data
[Figure: Sender and Receiver, each with a selector (Sel); Sender Queue and Recv Queue]
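The receiver's steps 1 and 2 race with each other: the control message and the application's recv can arrive in either order, and whichever comes second should trigger step 3. A minimal sketch of that bookkeeping follows, assuming matching on tag only and using invented names throughout; the real device presumably also matches on source and stores request objects rather than bare tags.

```java
import java.util.ArrayList;
import java.util.List;

final class RendezvousReceiverSketch {
    private final List<Integer> arrivedRequests = new ArrayList<>(); // control messages with no recv yet
    private final List<Integer> postedRecvs = new ArrayList<>();     // recvs with no control message yet

    // Receiver step 1: a control message for this tag arrived from the sender.
    // Returns true if a matching recv was already posted, so step 3 (reply "ready") can start.
    synchronized boolean onControlMessage(int tag) {
        if (postedRecvs.remove(Integer.valueOf(tag))) {
            return true;
        }
        arrivedRequests.add(tag);
        return false;
    }

    // Receiver step 2: the application posted a recv for this tag.
    // Returns true if the sender's control message is already here, so step 3 can start.
    synchronized boolean onRecvPosted(int tag) {
        if (arrivedRequests.remove(Integer.valueOf(tag))) {
            return true;
        }
        postedRecvs.add(tag);
        return false;
    }

    public static void main(String[] args) {
        RendezvousReceiverSketch r = new RendezvousReceiverSketch();
        System.out.println(r.onControlMessage(7)); // sender asked first: nothing to match yet
        System.out.println(r.onRecvPosted(7));     // recv arrives second: ready to reply
    }
}
```

Both methods are synchronized because, as the slide notes, the two steps run on different threads (the selector thread and the caller of recv) and touch the same queues.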
21Packing/Unpacking of Buffers
- write(t[] source, int offset, int length)
- gather(t[] source, int numEls, int offs, int[] indexes)
- strGather(t[] source, int srcOff, int rank, int exts, int strs, int[] indexes)
- read(t[] dest, int dstOff, int numEls)
- scatter(t[] dest, int numEls, int offs, int[] indexes)
- strScatter(t[] dest, int dstOff, int rank, int exts, int strs, int[] indexes)
- Java objects, same concepts
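As an illustration of what gather and scatter typically mean, the sketch below copies elements selected by an index array into a packed array and back. The semantics shown (element i comes from position indexes[offs + i]) are an assumption based on the parameter names, and the code works on plain int arrays rather than the mpjdev Buffer class.

```java
final class GatherScatterSketch {
    // Packs source[indexes[offs]], ..., source[indexes[offs + numEls - 1]] into a new array.
    static int[] gather(int[] source, int numEls, int offs, int[] indexes) {
        int[] packed = new int[numEls];
        for (int i = 0; i < numEls; i++) {
            packed[i] = source[indexes[offs + i]];
        }
        return packed;
    }

    // Inverse operation: writes packed[i] to dest[indexes[offs + i]].
    static void scatter(int[] packed, int[] dest, int numEls, int offs, int[] indexes) {
        for (int i = 0; i < numEls; i++) {
            dest[indexes[offs + i]] = packed[i];
        }
    }

    public static void main(String[] args) {
        int[] source = {10, 20, 30, 40};
        int[] indexes = {3, 1};
        int[] packed = gather(source, 2, 0, indexes);
        int[] dest = new int[4];
        scatter(packed, dest, 2, 0, indexes);
        System.out.println(java.util.Arrays.toString(packed));
        System.out.println(java.util.Arrays.toString(dest));
    }
}
```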
22Message Format
- Primary header (eight bytes),
- First four bytes: nothing
- Last four bytes: size of message
- Primary payload,
- Each primitive data type is written as a section,
- SECTION_HEADER (8 bytes)
- First byte: type of message (int, float, etc.)
- Last four bytes: number of elements in this section
- SECTION_DATA
- The actual data itself
- Size should be a multiple of 8 bytes
- Secondary header (eight bytes)
- First four bytes: nothing
- Last four bytes: size of Java object
- Secondary payload
- Actual Java object in bytes
- Can a Java object be written as a section in the primary payload?
23Message Format, Single Section
int[] intArray = new int[2];
intArray[0] = 1;
intArray[1] = 2;
WriteBuffer wBuffer = new WriteBuffer(24);
wBuffer.write(intArray, 0, 2);
wBuffer.pack();
PRIMARY_HEADER (8 bytes): X X X X | Size (24)
SECTION_HEADER (8 bytes): int (T) | X X X | 2 (NoEls)
SECTION_DATA (8 bytes): intArray[0] | intArray[1]
24Message Format, Multiple Section
int[] intArray = new int[2];
long[] longArray = new long[2];
intArray[0] = 1;
intArray[1] = 2;
longArray[0] = 1L;
longArray[1] = 2L;
WriteBuffer wBuffer = new WriteBuffer(48);
wBuffer.write(intArray, 0, 2);
wBuffer.write(longArray, 0, 2);
wBuffer.pack();
PRIMARY_HEADER (8 bytes): X X X X | Size (48)
SECTION_HEADER (8 bytes): int | X X X | 2 (NoEls)
SECTION_DATA (8 bytes): intArray[0] | intArray[1]
S_HEADER (8 bytes): L | X X X | 2
S_DATA (16 bytes): longArray[0] | longArray[1]
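The byte layout on these two slides can be reproduced with a plain ByteBuffer. This is a sketch of the layout only, not the WriteBuffer implementation: the type codes INT and LONG are hypothetical values, and it assumes the size field counts the whole buffer including the primary header, as the single-section slide suggests.

```java
import java.nio.ByteBuffer;

final class MessageLayoutSketch {
    static final byte INT = 1;  // hypothetical type code
    static final byte LONG = 2; // hypothetical type code

    // Section header: 1 type byte, 3 unused bytes, 4-byte element count.
    static void putSectionHeader(ByteBuffer b, byte type, int numEls) {
        b.put(type);
        b.put(new byte[3]);
        b.putInt(numEls);
    }

    // Packs one int section followed by one long section.
    static ByteBuffer pack(int[] ints, long[] longs) {
        int intBytes = (ints.length * 4 + 7) / 8 * 8; // section data padded to a multiple of 8
        int size = 8 + 8 + intBytes + 8 + longs.length * 8;
        ByteBuffer b = ByteBuffer.allocate(size);     // zero-filled, so padding bytes are 0
        b.putInt(0);                                  // primary header: first four bytes unused
        b.putInt(size);                               // primary header: size of message
        putSectionHeader(b, INT, ints.length);
        for (int v : ints) b.putInt(v);
        b.position(8 + 8 + intBytes);                 // skip any padding after the int data
        putSectionHeader(b, LONG, longs.length);
        for (long v : longs) b.putLong(v);
        b.flip();
        return b;
    }

    public static void main(String[] args) {
        ByteBuffer b = pack(new int[] {1, 2}, new long[] {1L, 2L});
        System.out.println(b.limit());    // total bytes: 8 + 8 + 8 + 8 + 16
        System.out.println(b.getInt(4));  // size field in the primary header
        System.out.println(b.getInt(12)); // element count of the int section
    }
}
```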
25Control Messages
- Used in Rendezvous Protocol
- Format
- int: rank of destination
- int: rank of source
- int: message length
- int: message tag
- Should be as small as possible
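Four ints make a 16-byte control message. The sketch below encodes and decodes one with a ByteBuffer; the field order follows the list above, but whether mpjdev writes them in that order on the wire is an assumption, and the class name is invented for the example.

```java
import java.nio.ByteBuffer;

final class ControlMessageSketch {
    static final int BYTES = 16; // four 4-byte ints

    final int dest;
    final int src;
    final int length;
    final int tag;

    ControlMessageSketch(int dest, int src, int length, int tag) {
        this.dest = dest;
        this.src = src;
        this.length = length;
        this.tag = tag;
    }

    // Writes the four fields into a fresh buffer that is ready to be sent.
    ByteBuffer encode() {
        ByteBuffer b = ByteBuffer.allocate(BYTES);
        b.putInt(dest).putInt(src).putInt(length).putInt(tag);
        b.flip();
        return b;
    }

    // Reads the four fields back in the same order they were written.
    static ControlMessageSketch decode(ByteBuffer b) {
        int dest = b.getInt();
        int src = b.getInt();
        int length = b.getInt();
        int tag = b.getInt();
        return new ControlMessageSketch(dest, src, length, tag);
    }

    public static void main(String[] args) {
        ByteBuffer wire = new ControlMessageSketch(1, 0, 4096, 99).encode();
        System.out.println(wire.remaining());
        ControlMessageSketch m = decode(wire);
        System.out.println(m.dest + " " + m.src + " " + m.length + " " + m.tag);
    }
}
```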
27Directory Structure
28Compiling the source files
- Dependencies,
- J2SE1.5.0-beta
- Apache Ant (1.6.1), if you want to compile from the source
- Otherwise, mpjdev.jar is in the mpjdev/lib folder
- You need to add mpjdev.jar to your CLASSPATH
29Running Examples - Config
- Configuration
- mpjdev/conf/mpjdev.conf
- Total Number of Processes
- For each Process
- FULLY_QUALIFIED_NAME_at_PORT_at_RANK
- Sorry, IP addresses won't work at present
- I never had to write an IP address
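A per-process entry of the form NAME, PORT, RANK could be parsed as sketched below. The separator is passed in as a parameter because this transcript shows it only as "_at_"; the host name, port and class name in the example are made up, and this is not the parser mpjdev itself uses.

```java
import java.util.regex.Pattern;

final class ProcessEntrySketch {
    final String host;
    final int port;
    final int rank;

    ProcessEntrySketch(String host, int port, int rank) {
        this.host = host;
        this.port = port;
        this.rank = rank;
    }

    // Splits one configuration line into host name, port and rank.
    static ProcessEntrySketch parse(String line, String separator) {
        String[] parts = line.trim().split(Pattern.quote(separator));
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected NAME, PORT, RANK but got: " + line);
        }
        return new ProcessEntrySketch(parts[0], Integer.parseInt(parts[1]), Integer.parseInt(parts[2]));
    }

    public static void main(String[] args) {
        // Hypothetical entry; the host name and port are invented for the example.
        ProcessEntrySketch e = parse("node1.example.org_at_20000_at_0", "_at_");
        System.out.println(e.host + " " + e.port + " " + e.rank);
    }
}
```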
30Running examples
- cd mpjdev/test
- javac -classpath ../lib/mpjdev.jar:. BufferTest3.java
- java -classpath ../lib/mpjdev.jar:. BufferTest3 0
- (0 is the rank)
- Tracking problems
- Java version: is it 1.5?
- Are you using an IP address?
- Email me
31Writing your own programs
- As an example,
- Two processes
- Initialize the device
- CommImpl.init()
- Get your ID,
- CommImpl.id()
- One process packs and sends data,
- CommImpl.send(Buffer, int dest, int tag)
- The other process receives and unpacks data,
- CommImpl.recv(Buffer, int src, int tag)
- Finalize,
- CommImpl.finish()
34A Few experiments
- No tests yet
- Bottlenecks,
- Transfer of huge messages seems inefficient
- Queues: java.util.Vector is not good enough
- Would like to see how many processes a selector can handle
- Potentially thousands
- Need benchmarks that keep Java issues and bottlenecks in view, and to test them
- Write our own
35Future
- Badly need a runtime
- Shell scripts starting jobs using ssh or rsh?
- MPJ,
- Group communications
- I thought I would have completed the whole of MPJ by now
- Looking for my PhD project in the same domain
36Suggestions