Title: AgentTeamwork: Mobile-Agent-Based Middleware for Distributed Job Coordination
1AgentTeamwork: Mobile-Agent-Based Middleware for Distributed Job Coordination
- Munehiro Fukuda
- Computing Software Systems, University of Washington, Bothell
- Funded by the NSF Middleware Initiative
2Outline
- Introduction
- Execution Model
- System Design
- Performance Evaluation
- Related Work
- Conclusions
31. Introduction
- Why Grid Computing
- Background
- Objective
- Project Overview
4Why Grid Computing
- Textbooks say
- Only 30% CPU utilization
- Only episodic job requirements
- Anyone and anywhere like a power grid
- Many research prototypes and commercial products
- Globus, Condor, Legion (Avaki), NetSolve, Ninf, Entropia PCGrid, Sun Grid Engine, etc.
- Then, have you ever used them?
- Probably not so many of you.
- What is a big hurdle?
- You don't need it anyway? Or is it something else?
5Background: Most Grid Systems
- Functional viewpoints
- Centralized resource/job management
- Two drawbacks
- A powerful central server essential to manage all slave computing nodes
- Applications based on master-slave or parameter-sweep model
- Our motivation
- Decentralized job distribution, coordination, and fault tolerance
- Applications based on a variety of communication models
- Practical viewpoints
- Systems dedicated to large institutions/companies
- Two drawbacks
- A lot of installation work required under the root account
- Groups of individual computer owners not targeted
- Our motivation
- Easy participation in grid computing and easy installation
6Background: How to Pursue Our Motivation
- Use of mobile agents
- We are experts in mobile agents.
- Most mobile agents
- An execution model previously highlighted as a prospective infrastructure of distributed systems
- No more than an alternative approach to centralized grid middleware implementation
- Our initial goal
- Decentralized middleware design with mobile agents
7Objective
- A mobile agent execution platform fitted to grid computing
- Allowing an agent to identify which MPI rank to handle and which agent to send a job snapshot to
- A fault-tolerant inter-process communication
- Recovering lost messages
- Allowing over-gateway connections
- Agent-collaborative algorithms for job coordination
- Allocating computing nodes in a distributed manner
- Implementing decentralized snapshot maintenance and job recovery
8Project Overview
- Funded by NSF Middleware Initiative
- Sponsored by University of Washington
- In collaboration with Ehime University
- With a team of UWB undergraduates
92. Execution Model
- System Overview
- Execution Layer
- Programming Interface
10System Overview
[Figure: system overview. Labels: User A's process, User B's process, TCP communication, snapshot methods, GridTcp, user program wrapper, sentinel agents, commander agents, resource agents, bookkeeper agents]
11Execution Layer
Java user applications
mpiJava API
GridTcp
Java socket
User program wrapper
Commander, resource, sentinel, and bookkeeper agents
UWAgents mobile agent execution platform
Operating systems
12Programming Interface
public class MyApplication {
    public GridIpEntry ipEntry;            // used by the GridTcp socket library
    public int funcId;                     // used by the user program wrapper
    public GridTcp tcp;                    // the GridTcp error-recoverable socket
    public int nprocess;                   // # processors
    public int myRank;                     // processor id (or MPI rank)

    public int func_0( String[] args ) {   // constructor
        MPJ.Init( args, ipEntry, tcp );    // invoke mpiJava-A
        .....                              // more statements to be inserted
        return 1;                          // calls func_1( )
    }

    public int func_1( ) {                 // called from func_0
        if ( MPJ.COMM_WORLD.Rank( ) == 0 )
            MPJ.COMM_WORLD.Send( ... );
        else
            MPJ.COMM_WORLD.Recv( ... );
        .....                              // more statements to be inserted
        return 2;                          // calls func_2( )
    }
}
133. System Design
- Mobile Agents
- Job Coordination
- Distribution
- Monitoring
- Resumption and migration
- Programming Support
- Language preprocessing
- Communication check-pointing
14UWAgents: Concept of Agent Domain
- Agent domain created for each submission from the Unix shell
- The number of children each agent can spawn is given upon the initial submission
- No name server
- Messages forwarded through an agent tree
- A user job scheduled as a thread, using suspend/resume
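The "no name server" and "messages forwarded through an agent tree" points can be illustrated with a minimal sketch. It assumes that each agent knows only its parent, so a message climbs from the sender to the closest common ancestor and then descends to the receiver. The class `AgentTreeSketch` and its methods are hypothetical names for illustration, not the UWAgents API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of tree-based forwarding; not the UWAgents API.
final class AgentTreeSketch {
    // child id -> parent id; the root of the domain has no entry
    private final Map<Integer, Integer> parentOf = new HashMap<>();

    void spawn(int parentId, int childId) {
        parentOf.put(childId, parentId);
    }

    // Agents visited when climbing from id to the root, inclusive.
    private List<Integer> pathToRoot(int id) {
        List<Integer> path = new ArrayList<>();
        for (Integer cur = id; cur != null; cur = parentOf.get(cur)) {
            path.add(cur);
        }
        return path;
    }

    // Agents a message visits on its way from src to dst.
    List<Integer> route(int src, int dst) {
        List<Integer> up = pathToRoot(src);
        List<Integer> down = new ArrayList<>();
        Integer cur = dst;
        // Climb from dst until we meet the sender's path to the root.
        while (cur != null && !up.contains(cur)) {
            down.add(cur);
            cur = parentOf.get(cur);
        }
        if (cur == null) {
            throw new IllegalArgumentException("no common ancestor");
        }
        List<Integer> hops = new ArrayList<>(up.subList(0, up.indexOf(cur) + 1));
        Collections.reverse(down);
        hops.addAll(down);
        return hops;
    }

    public static void main(String[] args) {
        AgentTreeSketch domain = new AgentTreeSketch();
        domain.spawn(0, 1);
        domain.spawn(0, 2);
        domain.spawn(2, 8);
        domain.spawn(2, 9);
        System.out.println(domain.route(8, 9)); // [8, 2, 9]
        System.out.println(domain.route(8, 1)); // [8, 2, 0, 1]
    }
}
```

Because every hop is a parent or child link, no agent needs a global directory. The cost is that traffic between distant agents passes through their common ancestors.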
15UWAgents: Over-Gateway Migration
16Job Distribution
[Figure: agents created upon a job submission; id = agent id, rank = MPI rank. Other labels: job submission, XML query, eXist, spawn]
Commander id 0
Resource id 1
Sensor id 4, Sensor id 5
Sentinel id 2 rank 0
Sentinel id 8 rank 1, Sentinel id 9 rank 2, Sentinel id 10 rank 3, Sentinel id 11 rank 4
Sentinel id 32 rank 5, Sentinel id 33 rank 6, Sentinel id 34 rank 7
Bookkeeper id 3 rank 0
Bookkeeper id 12 rank 1, Bookkeeper id 13 rank 2, Bookkeeper id 14 rank 3, Bookkeeper id 15 rank 4
Bookkeeper id 48 rank 5, Bookkeeper id 49 rank 6, Bookkeeper id 50 rank 7
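The agent ids in this figure are consistent with a simple rule: a child's id is its parent's id times 4 plus a small offset (commander 0 has 1 to 3, resource 1 has 4 and 5, sentinel 2 has 8 to 11, bookkeeper 3 has 12 to 15, sentinel 8 has 32 to 34, bookkeeper 12 has 48 to 50). The sketch below encodes that rule. The rule and the limit of 4 children per agent are inferred from the ids shown, not stated on the slide, and the class `AgentIdSketch` is a hypothetical name.

```java
// Hypothetical illustration; the id rule is inferred from the figure, not taken from UWAgents.
final class AgentIdSketch {
    // Assumed value; the slides say the limit is given upon the initial submission.
    static final int MAX_CHILDREN = 4;

    // Id of the k-th child (k = 0 .. MAX_CHILDREN - 1) of the given parent.
    static int childId(int parentId, int k) {
        if (k < 0 || k >= MAX_CHILDREN) {
            throw new IllegalArgumentException("child index out of range: " + k);
        }
        if (parentId == 0 && k == 0) {
            // 0 * MAX_CHILDREN + 0 would collide with the root's own id.
            throw new IllegalArgumentException("the root cannot use child index 0");
        }
        return parentId * MAX_CHILDREN + k;
    }

    // Id of the parent, computed locally without any name server.
    static int parentId(int id) {
        if (id <= 0) {
            throw new IllegalArgumentException("the root has no parent");
        }
        return id / MAX_CHILDREN;
    }

    // Number of hops between the agent and the root (id 0).
    static int depth(int id) {
        int d = 0;
        while (id != 0) {
            id /= MAX_CHILDREN;
            d++;
        }
        return d;
    }

    public static void main(String[] args) {
        System.out.println(childId(2, 0));  // 8, the first child of sentinel id 2
        System.out.println(parentId(34));   // 8
        System.out.println(parentId(50));   // 12
        System.out.println(depth(34));      // 3
    }
}
```

With ids of this form, an agent can work out where another agent sits in the tree from its id alone, which fits the absence of a name server.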
17Resource Allocation
[Figure: resource allocation. Labels: job submission (total nodes x multiplier), Commander id 0, Resource id 1, eXist, a list of available nodes, spawn]
An XML query: CPU architecture, OS, memory, disk, total nodes, multiplier
Case 1 (total nodes = 2, multiplier = 1.5): Sentinel id 2 rank 0, Sentinel id 8 rank 1, Bookkeeper id 2 rank 0, Bookkeeper id 12 rank 5, future use (x1)
Case 2 (total nodes = 2, multiplier = 3): Sentinel id 2 rank 0, Sentinel id 8 rank 1, Bookkeeper id 2 rank 0, Bookkeeper id 12 rank 5, future use (x2)
18Resource Monitoring
[Figure: resource monitoring. Labels: a resource request, Commander id 0, Resource id 1, eXist, a list of available nodes, spawn]
- Current restrictions
- Minimum interval: 3 secs
- Static distribution of sensor agents
- Future extensions
- Sensor migration
- Use of NWS at each site
19Job Resumption by a Parent Sentinel
[Figure: job resumption by a parent sentinel. Labels: Sentinel id 2 rank 0; MPI connections; Sentinel id 8 rank 1; Sentinel id 9 rank 2; Sentinel id 10 rank 3; Sentinel id 11 rank 4; Bookkeeper id 15 rank 4]
20Job Resumption by a Child Sentinel
[Figure: job resumption by a child sentinel. Labels: Commander id 0; New; Sentinel id 2 rank 0; Bookkeeper id 3 rank 0; Resource id 1; Sentinel id 8 rank 1; Bookkeeper id 12 rank 1]
21User Program Wrapper
Source code:
    statement_1; statement_2; statement_3;
    check_point( );
    statement_4; statement_5; statement_6;
    check_point( );
    statement_7; statement_8; statement_9;
    check_point( );
Preprocessed into:
    func_0( ) { statement_1; statement_2; statement_3; return 1; }
    func_1( ) { statement_4; statement_5; statement_6; return 2; }
    func_2( ) { statement_7; statement_8; statement_9; return -2; }
User program wrapper:
    int fid = 1;
    while ( fid != -2 ) {
        switch ( func_id ) {
        case 0: fid = func_0( ); break;
        case 1: fid = func_1( ); break;
        case 2: fid = func_2( ); break;
        }
        check_point( );   // save this object, including func_id, into a file
    }
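The wrapper loop on this slide can be sketched as runnable Java. This is a minimal sketch under stated assumptions: it folds the slide's `fid` and `func_id` into a single `funcId` field, serializes into a byte array rather than a file, and uses a hypothetical class `WrapperSketch` with a `log` list standing in for user state. None of these names come from AgentTeamwork.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the wrapper's dispatch-and-checkpoint loop.
final class WrapperSketch implements Serializable {
    private static final long serialVersionUID = 1L;
    static final int DONE = -2;                 // same end marker as on the slide

    int funcId = 0;                             // next function to run; part of every snapshot
    final List<String> log = new ArrayList<>(); // stands in for the user program's state

    int func_0() { log.add("statement_1..3"); return 1; }
    int func_1() { log.add("statement_4..6"); return 2; }
    int func_2() { log.add("statement_7..9"); return DONE; }

    // Save this object, including funcId (assumption: a byte array replaces the file).
    byte[] checkPoint() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(this);
        }
        return bytes.toByteArray();
    }

    static WrapperSketch restore(byte[] snapshot) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(snapshot))) {
            return (WrapperSketch) in.readObject();
        }
    }

    // Run at most maxSteps functions; returns the latest snapshot, or null if none was taken.
    byte[] run(int maxSteps) throws IOException {
        byte[] snapshot = null;
        for (int i = 0; i < maxSteps && funcId != DONE; i++) {
            switch (funcId) {
                case 0: funcId = func_0(); break;
                case 1: funcId = func_1(); break;
                case 2: funcId = func_2(); break;
                default: throw new IllegalStateException("unknown funcId " + funcId);
            }
            snapshot = checkPoint();
        }
        return snapshot;
    }

    public static void main(String[] args) throws Exception {
        WrapperSketch job = new WrapperSketch();
        byte[] snapshot = job.run(2);              // stop after two functions, as if the node failed
        WrapperSketch resumed = restore(snapshot); // e.g. on another node
        resumed.run(Integer.MAX_VALUE);            // continues with func_2 only
        System.out.println(resumed.log);
    }
}
```

Since the snapshot is taken after `funcId` has been advanced, a restored job re-enters the loop at the next function instead of repeating the one that already finished.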
22Preprocessor and Drawbacks
Source code:
    statement_1; statement_2; statement_3;
    check_point( );
    while (...) {
        statement_4;
        if (...) {
            statement_5;
            check_point( );
            statement_6;
        } else
            statement_7;
        statement_8;
    }
    check_point( );
Preprocessed code:
    int func_0( ) { statement_1; statement_2; statement_3; return 1; }
    int func_1( ) {   // before check_point( ) in if-clause
        while (...) {
            statement_4;
            if (...) { statement_5; return 2; }
            else statement_7;
            statement_8;
        }
    }
    int func_2( ) {   // after check_point( ) in if-clause
        statement_6; statement_8;
        while (...) {
            statement_4;
            if (...) { statement_5; return 2; }
            else statement_7;
            statement_8;
        }
    }
- No recursions
- Useless source line numbers indicated upon errors
- Explicit snapshot points still needed
23GridTcp: Check-Pointed Connection
[Figure: a user program inside the user program wrapper, connected over TCP through GridTcp outgoing, backup, and incoming queues, with snapshot maintenance and a rank-to-IP table (rank 1: n1.uwb.edu, rank 2: n2.uwb.edu); nodes shown: n1.uwb.edu, n2.uwb.edu, n3.uwb.edu]
- Outgoing packets saved in a backup queue
- All packets serialized in a backup file at every checkpoint
- Upon a migration
- Packets de-serialized from a backup file
- Backup packets restored in outgoing queue
- IP table updated
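The backup-queue steps listed on this slide can be sketched as follows. This is a minimal sketch, not GridTcp's code: the class `BackupQueueSketch` and its methods are hypothetical, a byte array stands in for the backup file, and the choice of rank 1 as the migrating process is an assumption. The slide does not say when backup packets are discarded, so the sketch never discards them.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; class and method names are not GridTcp's.
final class BackupQueueSketch {
    private final Deque<byte[]> outgoing = new ArrayDeque<>();
    private final ArrayList<byte[]> backup = new ArrayList<>();
    private final Map<Integer, String> ipTable = new HashMap<>(); // rank -> host

    void setHost(int rank, String host) { ipTable.put(rank, host); }
    String hostOf(int rank) { return ipTable.get(rank); }
    int pending() { return outgoing.size(); }

    // Queue a packet for transmission and keep a copy in the backup queue.
    void send(byte[] packet) {
        outgoing.add(packet);
        backup.add(packet.clone());
    }

    // Hand the next packet to the network; null when nothing is pending.
    byte[] transmitNext() { return outgoing.poll(); }

    // Check-pointing: serialize the backup queue (a byte array stands in for the backup file).
    byte[] checkpoint() throws IOException {
        ByteArrayOutputStream file = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(file)) {
            out.writeObject(backup);
        }
        return file.toByteArray();
    }

    // Migration: de-serialize the backup, refill the outgoing queue, update the IP table.
    @SuppressWarnings("unchecked")
    static BackupQueueSketch resume(byte[] file, Map<Integer, String> newIpTable)
            throws IOException, ClassNotFoundException {
        BackupQueueSketch channel = new BackupQueueSketch();
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(file))) {
            channel.backup.addAll((ArrayList<byte[]>) in.readObject());
        }
        channel.outgoing.addAll(channel.backup);
        channel.ipTable.putAll(newIpTable);
        return channel;
    }

    public static void main(String[] args) throws Exception {
        BackupQueueSketch onN1 = new BackupQueueSketch();
        onN1.send("m1".getBytes());
        onN1.send("m2".getBytes());
        onN1.transmitNext();                 // m1 leaves the queue but may still be lost
        byte[] file = onN1.checkpoint();

        Map<Integer, String> table = new HashMap<>();
        table.put(1, "n3.uwb.edu");          // assumed: rank 1 resumes on n3
        table.put(2, "n2.uwb.edu");
        BackupQueueSketch onN3 = resume(file, table);
        System.out.println(onN3.pending() + " packets to resend; rank 1 at " + onN3.hostOf(1));
    }
}
```

In this sketch every backed-up packet is resent after a migration, so a real receiver would have to recognize and drop duplicates; how GridTcp handles that is not described on the slide.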
24GridTcp: Over-Gateway Connection
[Figure: four nodes, mnode0 (rank 0), medusa.uwb.edu (rank 1), uw1-320.uwb.edu (rank 2), and uw1-320-00 (rank 3), each running a user program inside a user program wrapper with its own routing table]
rank dest gateway
0 mnode0 -
1 medusa -
2 uw1-320 medusa
3 uw1-320-00 medusa
rank dest gateway
0 mnode0 -
1 medusa -
2 uw1-320 -
3 uw1-320-00 uw1-320
rank dest gateway
0 mnode0 medusa
1 medusa -
2 uw1-320 -
3 uw1-320-00 -
rank dest gateway
0 mnode0 uw1-320
1 medusa uw1-320
2 uw1-320 -
3 uw1-320-00 -
- RIP-like connection
- Restriction: each node name must be unique.
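The four routing tables on this slide can be walked hop by hop, in the RIP-like manner the slide mentions. The sketch below assumes that the tables belong, in slide order, to mnode0, medusa, uw1-320, and uw1-320-00; that assignment is an inference from the entries, and the class `GatewayRouteSketch` is a hypothetical name. A gateway of "-" is taken to mean a direct connection.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; table ownership is inferred, not stated on the slide.
final class GatewayRouteSketch {
    // node -> (destination -> gateway), where "-" means a direct connection
    static final Map<String, Map<String, String>> TABLES = new HashMap<>();

    static void row(String node, String dest, String gateway) {
        TABLES.computeIfAbsent(node, k -> new HashMap<>()).put(dest, gateway);
    }

    static {
        // First table on the slide, assumed to be mnode0's.
        row("mnode0", "mnode0", "-");      row("mnode0", "medusa", "-");
        row("mnode0", "uw1-320", "medusa"); row("mnode0", "uw1-320-00", "medusa");
        // Second table, assumed to be medusa's.
        row("medusa", "mnode0", "-");      row("medusa", "medusa", "-");
        row("medusa", "uw1-320", "-");     row("medusa", "uw1-320-00", "uw1-320");
        // Third table, assumed to be uw1-320's.
        row("uw1-320", "mnode0", "medusa"); row("uw1-320", "medusa", "-");
        row("uw1-320", "uw1-320", "-");     row("uw1-320", "uw1-320-00", "-");
        // Fourth table, assumed to be uw1-320-00's.
        row("uw1-320-00", "mnode0", "uw1-320"); row("uw1-320-00", "medusa", "uw1-320");
        row("uw1-320-00", "uw1-320", "-");      row("uw1-320-00", "uw1-320-00", "-");
    }

    // Nodes a packet visits from src to dest, following each node's own table.
    static List<String> path(String src, String dest) {
        List<String> hops = new ArrayList<>();
        String cur = src;
        hops.add(cur);
        while (!cur.equals(dest)) {
            String gateway = TABLES.get(cur).get(dest);
            cur = gateway.equals("-") ? dest : gateway;
            hops.add(cur);
            if (hops.size() > TABLES.size()) {
                throw new IllegalStateException("routing loop");
            }
        }
        return hops;
    }

    public static void main(String[] args) {
        System.out.println(path("mnode0", "uw1-320-00"));
        System.out.println(path("uw1-320-00", "mnode0"));
    }
}
```

Each node stores only the next gateway per destination, so a connection from rank 0 to rank 3 is relayed through medusa and uw1-320 without either endpoint knowing the full path. The lookup by node name also shows why the slide requires unique node names.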
25MPJ Package
MPJ: Init( ), Rank( ), Size( ), and Finalize( )
Communicator: all communication functions, Send( ), Recv( ), Gather( ), Reduce( ), etc.
JavaComm: mpiJava-S, uses Java sockets and server sockets
GridComm: mpiJava-A, uses GridTcp sockets
DataType: MPJ.INT, MPJ.LONG, etc.
MPJMessage: getStatus( ), getMessage( ), etc.
Op: Operate( )
etc: other utilities
- InputStream for each rank
- OutputStream for each rank
- Use a permanent 64K buffer for serialization
- Emulate collective communication by sending the same data to each OutputStream, which deteriorates performance
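The emulation of collective communication described on this slide can be sketched as a broadcast that serializes the data once and writes the same bytes to every rank's OutputStream. This is a minimal sketch with hypothetical names (`BroadcastSketch`, `bcast`, `recv`); it does not use the MPJ package, and in-memory streams stand in for sockets.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of an emulated broadcast; not the MPJ package.
final class BroadcastSketch {
    // Serialize once, then send the same bytes to each rank's stream in turn.
    static void bcast(int[] data, List<? extends OutputStream> perRank) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(data.length);
            for (int v : data) {
                out.writeInt(v);
            }
        }
        byte[] bytes = buffer.toByteArray();
        for (OutputStream rank : perRank) {
            rank.write(bytes);   // one full copy per destination
            rank.flush();
        }
    }

    // Read one broadcast message from a rank's InputStream.
    static int[] recv(InputStream in) throws IOException {
        DataInputStream din = new DataInputStream(in);
        int[] data = new int[din.readInt()];
        for (int i = 0; i < data.length; i++) {
            data[i] = din.readInt();
        }
        return data;
    }

    public static void main(String[] args) throws IOException {
        List<ByteArrayOutputStream> streams = new ArrayList<>();
        for (int rank = 1; rank <= 3; rank++) {
            streams.add(new ByteArrayOutputStream());
        }
        bcast(new int[] {1, 2, 3}, streams);
        for (ByteArrayOutputStream s : streams) {
            System.out.println(Arrays.toString(recv(new ByteArrayInputStream(s.toByteArray()))));
        }
    }
}
```

The sending rank writes one full copy per destination, so its cost grows linearly with the number of ranks, which is one likely reason for the performance penalty the slide notes.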
26MPI Job Execution
UWPlace (UWAgent Execution Platform)
274. Performance Evaluation
- Evaluation Environment
- An 8-node Myrinet-2000 cluster: 2.8GHz Pentium4-Xeon w/ 512MB
- A 24-node Giga-Ethernet cluster: 3.4GHz Pentium4-Xeon w/ 512MB
- Computation Granularity
- Java Grande MPJ Benchmark
- Process Resumption Overhead
28MPJ.Send and Recv Performance
29Computational Granularity 1
30Computational Granularity 2
31Computational Granularity 3
32Performance Evaluation - Series
33Performance Evaluation - RayTracer
34Performance Evaluation - MolDyn
35Overhead of Job Resumption
365. Related Work
- From the viewpoints of
- System Architecture
- Mobile Agents
- Fault Tolerance
37System Architecture
Systems | Architectural basis
Globus | A toolkit
Condor | Process migration
Ninf, NetSolve | RPC
Legion (Avaki) | OO
Catalina, J-SEAL2, AgentTeamwork | Mobile agents
- Difference from Catalina/J-SEAL2
- They are not fully implemented.
- They are based on a master-slave model.
38Mobile Agents
Mobile agents | Naming | Cascading termination | Job scheduling | Security
IBM Aglets | AgletFinder traces all agents | Needs to retract one by one | Schedules jobs with Baglets | Java byte-code verification
Voyager | RPC-based system-unique agent IDs | Needs to be implemented at a user level | Launches an independent user process | CORBA security service
D'Agents | Unpredictable agent IDs | Needs to be implemented at a user level | Launches an independent user process | A currency-based model
Ara (obsolete) | Unpredictable agent IDs | Calls ara_kill to kill all agents | Launches an independent user process | An allowance model
UWAgent | Agent domain | Waits for all descendants' termination | Schedules jobs with Java thread functions | Agent-to-agent security w/ agent domain
39Fault Tolerance
Systems | Libraries | Data recovery | Communication recovery
Legion (Avaki) | FT-MPI | Variables passed to MPI_FT_save( ) | N/A
Condor | MW Library | All master data | Master-worker communication
Dome | Dome_env | Objects declared as dXXX<type> | N/A
AgentTeamwork | GridTcp | All serializable class data | All in-transit messages
406. Conclusions
- Project Summary
- Next Two Years
41Project summary
- Our focus
- A decentralized job execution and fault-tolerant environment
- Applications not restricted to the master-slave or parameter-sweeping model
- Applications
- 40,000 doubles x 10,000 floating-point operations
- Moderate data transfer combined with massive/collective communication
- At least three times larger than its computational granularity
- Current status
- UWAgent completed
- Agent behavioral design: basic job deployment/resumption implemented
- User program wrapper completed except for the security feature
- GridTcp/mpiJava in testing
- Preprocessor in design
42Next Two Years
- Application support
- Preprocessor implementation
- Efficient input/output file transfer
- Security enhancement in remote execution
- GUI improvement
- Agent algorithms
- Over-gateway application deployment
- Dynamic resource monitoring
- Priority-based agent migration
- Performance evaluation
- Dissemination
43Questions?