Title: Denis.Caromel@inria.fr
1 A Strong Programming Model Bridging Distributed and Multi-Core Computing
- Denis.Caromel@inria.fr
- Background: INRIA, Univ. Nice, OASIS Team
- Programming: Parallel Programming Models
- Asynchronous Active Objects, Futures, Typed Groups
- High-Level Abstractions (OO SPMD, Components, Skeletons)
- Optimizing
- Deploying
2 1. Background and Team at INRIA
3 OASIS Team INRIA
- A joint team between INRIA, Nice Univ., and CNRS
- Participation in EU projects
- CoreGrid, EchoGrid, Bionets, SOA4ALL
- GridCOMP (Scientific Coordinator)
- ProActive 4.0.1: Distributed and Parallel - From Multi-Cores to Enterprise and Science GRIDs
- Computer Science and Control
- 8 Centers all over France
- Workforce: 3,800
- Strong in standardization committees
- IETF, W3C, ETSI, ...
- Strong Industrial Partnerships
- Foster company foundation
- 90 startups so far
  - Ilog (Nasdaq, Euronext)
  - ActiveEon
4 ActiveEon: Startup Company Born of INRIA
- Co-developing, providing support for Open Source ProActive Parallel Suite
- Worldwide customers (EU, Boston USA, etc.)
5 OASIS Team Composition (35)
- Researchers (5)
- D. Caromel (UNSA, Det. INRIA)
- E. Madelaine (INRIA)
- F. Baude (UNSA)
- F. Huet (UNSA)
- L. Henrio (CNRS)
- PhDs (11)
- Antonio Cansado (INRIA, Conicyt)
- Brian Amedro (SCS-Agos)
- Cristian Ruz (INRIA, Conicyt)
- Elton Mathias (INRIA-Cordi)
- Imen Filali (SCS-Agos / FP7 SOA4All)
- Marcela Rivera (INRIA, Conicyt)
- Muhammad Khan (STIC-Asia)
- Paul Naoumenko (INRIA/Région PACA)
- Viet Dung Doan (FP6 Bionets)
- Virginie Contes (SOA4ALL)
- Guilherme Pezzi (AGOS, CIFRE SCP)
- Visitors, Interns
- PostDoc (1): Regis Gascon (INRIA)
- Engineers (10): Elaine Isnard (AGOS), Fabien Viale (ANR OMD2, Renault), Franca Perrina (AGOS), Germain Sigety (INRIA), Yu Feng (ETSI, FP6 EchoGrid), Bastien Sauvan (ADT Galaxy), Florin-Alexandru Bratu (INRIA CPER), Igor Smirnov (Microsoft), Fabrice Fontenoy (AGOS), Open position (Thales)
- Trainees (2): Etienne Vallette d'Osia (Master 2 ISI), Laurent Vanni (Master 2 ISI)
- Assistants (2): Patricia Maleyran (INRIA), Sandra Devauchelle (I3S)
6 ProActive Contributors
7 ProActive Parallel Suite: Architecture
8 (No Transcript)
9 ProActive Parallel Suite
10 ProActive Parallel Suite
11 2. Programming Models for Parallel Distributed
12 ProActive Parallel Suite
13 ProActive Parallel Suite
14 Distributed and Parallel Active Objects
15 ProActive Active Objects

A ag = newActive("A", ..., VirtualNode);
V v1 = ag.foo(param);
V v2 = ag.bar(param);
...
v1.bar(); // Wait-By-Necessity

Wait-By-Necessity is a Dataflow Synchronization
[Figure: a JVM hosting the active object ag (Java object, request queue, serving thread) and the caller's side with proxies and future objects v1 and v2 of type V; requests flow through the proxy into the request queue.]
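The active-object pattern above can be sketched with plain `java.util.concurrent` (this is an illustrative sketch, not the ProActive API; the class and method names are made up): one thread serves the object's request queue, calls return futures immediately, and the caller blocks only when it touches the result (wait-by-necessity).

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch of an active object: a single-threaded executor plays the role of
// the object's body serving its request queue one request at a time.
class ActiveA {
    private final ExecutorService body = Executors.newSingleThreadExecutor();

    // The call returns at once; the result is a future (a transparent proxy
    // in ProActive, an explicit CompletableFuture here).
    CompletableFuture<Integer> foo(int param) {
        return CompletableFuture.supplyAsync(() -> param * 2, body);
    }

    void shutdown() { body.shutdown(); }
}
```

The caller only blocks at the point where the future's value is actually needed, which is exactly the dataflow synchronization the slide describes.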
16 First-Class Futures Update
17 Wait-By-Necessity: First-Class Futures
Futures are Global Single-Assignment Variables
[Figure: activities a, b, and c; the future V is passed as a parameter in the call c.gee(V).]
18 Wait-By-Necessity: Eager Forward-Based
An AO forwarding a future will have to forward its value.
[Figure: activities a, b, and c; after c.gee(V), the value of V is forwarded along the chain of activities holding the future.]
19 Wait-By-Necessity: Eager Message-Based
An AO receiving a future sends a message.
[Figure: activities a, b, and c exchanging the future V passed in c.gee(V).]
20 Standard System at Runtime: No Sharing
- NoC: Network On Chip
- Proofs of Determinism
21 (2) ASP: Asynchronous Sequential Processes
- ASP => Confluence and Determinacy
- Future updates can occur at any time
- Execution characterized by the order of request senders
- Determinacy of programs communicating over trees
- A strong guide for implementation:
- Fault-Tolerance and checkpointing, Model-Checking, ...
22 No Sharing Even for Multi-Cores: Related Talks at PDP 2009
- SS6 Session, today at 16:00
- Impact of the Memory Hierarchy on Shared Memory Architectures in Multicore Programming Models
  - Rosa M. Badia, Josep M. Perez, Eduard Ayguade, and Jesus Labarta
- Realities of Multi-Core CPU Chips and Memory Contention
  - David P. Barker
23 TYPED ASYNCHRONOUS GROUPS
24 Creating AO and Groups

A ag = newActiveGroup("A", ..., VirtualNode);
V v = ag.foo(param);
...
v.bar(); // Wait-by-necessity

Group, Type, and Asynchrony are crucial for Composition
[Figure: a typed group proxy in a JVM standing for a set of Java or active objects.]
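The key idea of a typed group, keeping the member interface while a call fans out to all members, can be sketched with a JDK dynamic proxy (an illustrative sketch, not the ProActive API; `Acc`, `AccImpl`, and `Groups` are made-up names):

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.List;

// A small member interface and implementation for the sketch.
interface Acc {
    void add(int x);
    int total();
}

class AccImpl implements Acc {
    private int t = 0;
    public void add(int x) { t += x; }
    public int total() { return t; }
}

class Groups {
    // Build a "typed group": a proxy implementing the member interface
    // that broadcasts every method call to all members.
    @SuppressWarnings("unchecked")
    static <T> T newGroup(Class<T> itf, List<? extends T> members) {
        InvocationHandler h = (proxy, method, args) -> {
            Object last = null;
            for (T m : members) last = method.invoke(m, args); // broadcast
            return last; // a real typed group would return a group of futures
        };
        return (T) Proxy.newProxyInstance(itf.getClassLoader(),
                new Class<?>[] { itf }, h);
    }
}
```

Because the group has the same static type as a member, group calls compose with ordinary code, which is the point the slide makes about Group, Type, and Asynchrony.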
25 Broadcast and Scatter
- Broadcast is the default behavior
- Use a group as parameter; scattering depends on rankings

ag.bar(cg);                    // broadcast cg
ProActive.setScatterGroup(cg);
ag.bar(cg);                    // scatter cg
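The broadcast/scatter distinction can be shown in a few lines of plain Java (a sketch under made-up names, not the ProActive API): in broadcast mode every member receives the whole parameter group, in scatter mode member i receives only element i of it.

```java
import java.util.List;
import java.util.function.BiConsumer;

// Each member is modeled as a callback taking (rank, received parameters).
class Dispatch {
    // Broadcast: every member receives the whole parameter group cg.
    static <T> void broadcast(List<BiConsumer<Integer, List<T>>> members, List<T> cg) {
        for (int i = 0; i < members.size(); i++)
            members.get(i).accept(i, cg);
    }

    // Scatter: member i receives only element i (ranking = list order here).
    static <T> void scatter(List<BiConsumer<Integer, List<T>>> members, List<T> cg) {
        for (int i = 0; i < members.size(); i++)
            members.get(i).accept(i, List.of(cg.get(i)));
    }
}
```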
26 Static Dispatch Group
[Figure: ag.bar(cg) with static dispatch: the elements of cg are assigned to the JVMs in a fixed order, regardless of which worker is slowest or fastest.]
27 Dynamic Dispatch Group
[Figure: ag.bar(cg) with dynamic dispatch: the elements of cg are handed out on demand, so the fastest JVM processes more elements than the slowest.]
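Dynamic dispatch can be sketched with a shared work queue (again a plain-Java illustration, not the ProActive implementation): workers pull parameters on demand, so a fast worker naturally processes more elements than a slow one, whereas static dispatch would fix the element-to-worker mapping up front.

```java
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

class DynamicDispatch {
    // Returns how many elements each worker processed.
    static <T> int[] run(List<T> params, int nWorkers) throws InterruptedException {
        Queue<T> queue = new ConcurrentLinkedQueue<>(params);
        int[] processed = new int[nWorkers];
        Thread[] workers = new Thread[nWorkers];
        for (int i = 0; i < nWorkers; i++) {
            final int id = i;
            workers[i] = new Thread(() -> {
                // Pull on demand: each worker takes the next available element.
                while (queue.poll() != null) processed[id]++;
            });
            workers[i].start();
        }
        for (Thread w : workers) w.join();
        return processed;
    }
}
```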
28 Handling Group Failures (2)

V vg = ag.foo(param);
Group groupV = PAG.getGroup(vg);
el = groupV.getExceptionList();
...
vg.gee();
29 Abstractions for Parallelism: The Right Tool to Execute the Task
30 Object-Oriented SPMD
31 OO SPMD

A ag = newSPMDGroup("A", ..., VirtualNode);

// In each member
myGroup.barrier("2D");                             // Global Barrier
myGroup.barrier("vertical");                       // Any Barrier
myGroup.barrier("north", "south", "east", "west");

Still, not based on raw messages, but on Typed Method Calls => Components
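A global SPMD barrier can be sketched with `java.util.concurrent` (an illustrative sketch of the concept; ProActive's `myGroup.barrier(...)` plays this role across JVMs rather than threads): every member runs the same program and no member enters the next phase until all have finished the current one.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

class SpmdSketch {
    // Each member records which phase it reached; the barrier guarantees
    // phase 2 starts only after every member completed phase 1.
    static int[] run(int members) throws InterruptedException {
        int[] phase = new int[members];
        CyclicBarrier barrier = new CyclicBarrier(members);
        Thread[] ts = new Thread[members];
        for (int i = 0; i < members; i++) {
            final int rank = i;
            ts[i] = new Thread(() -> {
                try {
                    phase[rank] = 1;  // local computation step
                    barrier.await();  // global barrier: wait for all members
                    phase[rank] = 2;  // next step, entered by all together
                } catch (InterruptedException | BrokenBarrierException e) {
                    Thread.currentThread().interrupt();
                }
            });
            ts[i].start();
        }
        for (Thread t : ts) t.join();
        return phase;
    }
}
```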
32 Object-Oriented SPMD: Single Program Multiple Data
- Motivation
- Use Enterprise technology (Java, Eclipse, etc.) for Parallel Computing
- Able to express in Java MPI's Collective Communications:
  - broadcast, reduce
  - scatter, allscatter
  - gather, allgather
- Together with Barriers, Topologies, ...
33 MPI Communication Primitives
- For some (historical) reasons, MPI has many comm. primitives:
  - MPI_Send (standard), MPI_Recv (receive)
  - MPI_Ssend (synchronous), MPI_Irecv (immediate)
  - MPI_Bsend (buffered), (any) source, (any) tag, ...
  - MPI_Rsend (ready)
  - MPI_Isend (immediate, async/future)
  - MPI_Ibsend, ...
- I'd rather put the burden on the implementation, not the programmers!
- How to do an adaptive implementation in that context?
- Not talking about:
  - the combinatorics that occur between send and receive
  - the semantic problems that occur in distributed implementations
34 Application Semantics rather than Low-Level Architecture-Based Optimization
- MPI
  - MPI_Send, MPI_Recv, MPI_Ssend, MPI_Irecv, MPI_Bsend, MPI_Rsend, MPI_Isend, MPI_Ibsend
- What we propose
  - High-level information from the Application Programmer
  - (Experimented on 3D ElectroMagnetism, and NAS Benchmarks)
- Examples
  - ro.foo( ForgetOnSend(params) )
  - ActiveObject.exchange(params)
- Optimizations for Both
  - Distributed
  - Multi-Core
35 NAS Parallel Benchmarks
- Designed by NASA to evaluate benefits of high performance systems
- Strongly based on CFD
- 5 benchmarks (kernels) to test different aspects of a system
- 2 categories or focus variations:
  - communication intensive
  - computation intensive
36 Communication Intensive: CG Kernel (Conjugate Gradient)
- 12,000 calls/node
- 570 MB sent/node
- 1 min 32 s
- 65 comms/WT
- Floating point operations
- Eigenvalue computation
- High number of unstructured communications
[Figures: data density distribution; message density distribution]
37 Communication Intensive: CG Kernel (Conjugate Gradient)
=> Comparable Performances
38 Communication Intensive: MG Kernel (Multi Grid)
- 600 calls/node
- 45 MB sent
- 1 min 32 s
- 80 comms
- Floating point operations
- Solving a Poisson problem
- Structured communications
[Figures: data density distribution; message density distribution]
39 Communication Intensive: MG Kernel (Multi Grid)
Problem with high-rate communications: 2D <-> 3D matrix access
40 Computation Intensive: EP Kernel (Embarrassingly Parallel)
- Random number generation
- Almost no communications
=> This is Java!!!
41 Related Talk at PDP 2009
- T4 Session, today at 14:00
- NPB-MPJ: NAS Parallel Benchmarks Implementation for Message Passing in Java
  - Damián A. Mallón, Guillermo L. Taboada, Juan Touriño, Ramón Doallo
  - Univ. Coruña, Spain
42 Parallel Components
43 GridCOMP Partners
44 Objects to Distributed Components
- IoC: Inversion of Control (set in XML)
- Truly Distributed Components
[Figure: an example component instance of type A returning V, built from a typed group of Java or active objects in a JVM.]
45 GCM
- Scopes and Objectives
  - Grid Codes that Compose and Deploy
  - No programming, No Scripting, No Pain
- Innovation
  - Abstract Deployment
  - Composite Components
  - Multicast and GatherCast
[Figure: MultiCast (one-to-many) and GatherCast (many-to-one) interfaces.]
46 Optimizing MxN Operations
2 composites can be involved in the gather-multicast.
47 Related Talk at PDP 2009
- T1 Session, Yesterday at 11:30
- Towards Hierarchical Management of Autonomic Components: a Case Study
  - Marco Aldinucci, Marco Danelutto, and Peter Kilpatrick
48 Skeletons
49 Algorithmic Skeletons for Parallelism
- High-Level Programming Model [Cole89]
- Hides the complexity of parallel/distributed programming
- Exploits nestable parallelism patterns
[Figure: a BLAST skeleton program assembled from parallelism patterns. Task patterns: farm, pipe, seq, if, for, while, divide & conquer (dc). Data patterns: map, fork.]
50 Algorithmic Skeletons for Parallelism

public boolean condition(BlastParams param) {
    File file = param.dbFile;
    return file.length() > param.maxDBSize;
}

- High-Level Programming Model [Cole89]
- Hides the complexity of parallel/distributed programming
- Exploits nestable parallelism patterns
[Figure: the same BLAST skeleton program and parallelism patterns as slide 49.]
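The simplest of these patterns, the farm, can be sketched in a few lines of plain Java (an illustrative sketch, not the ProActive skeleton API): the programmer supplies only the sequential worker function, and the pattern hides all thread management.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Minimal "farm" skeleton: apply the same worker to every task in parallel.
class Farm<I, O> {
    private final Function<I, O> worker;

    Farm(Function<I, O> worker) { this.worker = worker; }

    // Results keep the input order; the thread pool is the hidden machinery.
    List<O> apply(List<I> inputs, int parallelism) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        try {
            List<Future<O>> futures = new ArrayList<>();
            for (I in : inputs) futures.add(pool.submit(() -> worker.apply(in)));
            List<O> out = new ArrayList<>();
            for (Future<O> f : futures) out.add(f.get());
            return out;
        } finally {
            pool.shutdown();
        }
    }
}
```

Because a skeleton is just a higher-order function, farms can be nested inside pipes, while-loops, or divide & conquer, which is exactly the nestability the slide highlights.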
51 3. Optimizing
52 Programming / Optimizing: Monitoring, Debugging, Optimizing
53 (No Transcript)
54 Optimizing User Interface
55 IC2D
56 ChartIt
57 Pies for Analysis and Optimization
58 Video 1: IC2D Optimizing (Monitoring, Debugging, Optimizing)
59 (No Transcript)
60 4. Deploying and Scheduling
61 (No Transcript)
62 Deploying
63 Deploy on Various Kinds of Infrastructures
64 GCM Standardization: Fractal-Based Grid Component Model
- 4 Standards:
  - 1. GCM Interoperability Deployment
  - 2. GCM Application Description
  - 3. GCM Fractal ADL
  - 4. GCM Management API
- Overall, the standardization is supported by industrials:
  - BT, FT-Orange, Nokia-Siemens, Telefonica,
  - NEC, Alcatel-Lucent, Huawei
65 Protocols and Scheduler in GCM Deployment Standard
- Protocols
  - rsh, ssh
  - oarsh, gsissh
- Schedulers, and Grids
  - GroupSSH, GroupRSH, GroupOARSH
  - ARC (NorduGrid), CGSP China Grid, EGEE gLite,
  - Fura/InnerGrid (GridSystem Inc.)
  - GLOBUS, GridBus
  - IBM LoadLeveler, LSF, Microsoft CCS (Windows HPC Server 2008)
  - Sun Grid Engine, OAR, PBS / Torque, PRUN
- Soon available in stable release:
  - Java EE
  - Amazon EC2
66 Abstract Deployment Model
- Problem
  - Difficulties and lack of flexibility in deployment
  - Avoid scripting for configuration, getting nodes, connecting, ...
- A key principle: Virtual Node (VN)
  - Abstracts away from the source code: machine names, creation/connection protocols, lookup and registry protocols
  - Interfaces with various protocols and infrastructures:
    - Cluster: LSF, PBS, SGE, OAR and PRUN (custom protocols)
    - Intranet P2P, LAN: intranet protocols, rsh, rlogin, ssh
    - Grid: Globus, Web services, ssh, gsissh
2009
67 Resource Virtualization
[Figure: a Deployment Descriptor maps the Application (VN, Nodes, Creation, Connections) onto the Infrastructure (Acquisition).]
Runtime structured entities: 1 VN --> n Nodes in m JVMs on k Hosts
68 Resource Virtualization
[Figure: a GCM XML Deployment Descriptor maps an Application's virtual nodes VN1 and VN2 onto concrete nodes.]
69 Virtualizing Resources
[Figure: the same Application (VN1, VN2) mapped onto a different set of nodes, with no change to the application.]
70 Multiple Deployments
[Figure: the same application deployed on one host, a local grid, distributed grids, or over the Internet.]
71 Scheduling Mode
72 Programming with Flows of Tasks
- Program an application as an ordered set of tasks
- Logical flow: task executions are orchestrated
- Data flow: results are forwarded from ancestor tasks to children as parameters
- The task is the smallest execution unit
- Two types of tasks:
  - Standard Java
  - Native, i.e. any third-party application (binary, scripts, etc.)
[Figure: Task 1(input 1) and Task 2(input 2) produce res1 and res2, which are passed to Task 3(res1, res2).]
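The three-task flow described above can be sketched with plain `CompletableFuture` (an illustrative sketch, not the ProActive Scheduler API; the task bodies are made up): task 1 and task 2 run independently, and task 3 starts only once both results are available, receiving them as parameters.

```java
import java.util.concurrent.CompletableFuture;

class TaskFlow {
    static int run(int input1, int input2) {
        CompletableFuture<Integer> task1 =
            CompletableFuture.supplyAsync(() -> input1 + 1);  // Task 1(input 1)
        CompletableFuture<Integer> task2 =
            CompletableFuture.supplyAsync(() -> input2 * 2);  // Task 2(input 2)
        // Task 3(res1, res2): the data flow forwards both results as parameters.
        return task1.thenCombine(task2, Integer::sum).join();
    }
}
```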
73 Defining and Running Jobs with ProActive
- A workflow application is a job:
  - a set of tasks which can be executed according to a dependency tree
- Relies on the ProActive Scheduler
  - Java or XML interface
  - Dynamic job creation in Java
  - Static description in XML
- Task failures are handled by the ProActive Scheduler
  - A task can be automatically re-started or not (with a user-defined bound)
  - Dependent tasks can be aborted or not
  - The finished job contains the cause exceptions as results, if any
74 Scheduler / Resource Manager Overview
- Multi-platform Graphical Client (RCP)
- File-based or LDAP authentication
- Static Workflow Job Scheduling, Native and Java tasks, Retry on Error, Priority Policy, Configuration Scripts, ...
- Dynamic and Static node sources, Resource Selection by script, Monitoring and Control GUI, ...
- ProActive Deployment capabilities: Desktops, Clusters, ProActive P2P, ...
75 Scheduler User Interface
76 Scheduler User Interface
77 Video 2: Scheduling - Scheduler and Resource Manager
See the video at http://proactive.inria.fr/userfiles/media/videos/Scheduler_RM_Short.mpg
78 (No Transcript)
79 Summary
80 Current Open Source Tools
Acceleration Toolkit: Concurrency, Parallelism, Distribution
81 Conclusion
- Summary
  - Programming: OO, Asynchrony, First-Class Futures, No Sharing
  - Higher-level Abstractions (SPMD, Skeletons, ...)
  - Composing: Hierarchical Components
  - Optimizing: IC2D Eclipse GUI
  - Deploying: ssh, Globus, LSF, PBS, ..., WS
- Applications
  - 3D Electromagnetism: SPMD on 300 machines at once
  - Groups of over 1000!
  - Record: 4,000 Nodes
Our next target: 10,000-Core Applications
82 Conclusion: Why Does It Scale?
- Thanks to a few key features:
  - Connection-less, RMI+JMS unified
  - Messages rather than long-living interactions
83 Conclusion: Why Does It Compose?
- Thanks to a few key features:
  - Because it Scales: asynchrony!
  - Because it is Typed: RMI with interfaces!
  - First-Class Futures: no unstructured Call Backs and Ports
84 Perspectives for Parallelism and Distribution
- A need for several, coherent, Programming Models for different applications:
  - Actors (Functional Parallelism): Active Objects, Futures
  - OO SPMD: optimizations away from low-level optimizations
  - Parallel Component Codes and Synchronizations
  - MultiCast and GatherCast: Capturing // Behavior at Interfaces!
  - Adaptive Parallel Skeletons
  - Event Processing (Reactive Systems)
- Efficient Implementations are needed to prove Ideas!
- Proofs of Programming Model Properties: Needed for Scalability!
- Our Community never had a greater Future!
- Thank You for your attention!
85 (No Transcript)
86 Extra Material
- Grid SOA
- J2EE Grid
- Amazon EC2: ProActive Image and Deployment
87 Perspective: SOA Grid
88 SOA Integration: Web Services, BPEL, Workflow
89 Active Objects as Web Services
- Why?
  - Access Active Objects from any language
- How?
  - HTTP Server
  - SOAP Engine (Axis)
- Usage:

ProActive.exposeAsWebService()
ProActive.unExposeAsWebService()

[Figure: a Web Service client calling an Active Object hosted in a JVM.]
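The idea behind `exposeAsWebService()`, making one object's method callable over HTTP from any language, can be sketched with the JDK's built-in server (an illustrative sketch, not ProActive's Axis/SOAP stack; `WebExpose` is a made-up name):

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.function.Supplier;

class WebExpose {
    // Expose one no-argument method of an object at the given HTTP path.
    static HttpServer expose(String path, Supplier<String> method) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(0), 0);
        server.createContext(path, exchange -> {
            // Each HTTP request triggers one invocation of the method.
            byte[] body = method.get().getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) { os.write(body); }
        });
        server.start();
        return server; // server.getAddress().getPort() gives the bound port
    }
}
```

A real SOAP engine additionally handles typed marshalling of parameters and results; the sketch only shows the expose-over-HTTP principle.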
90 ProActive Services Workflows: Principles
3 Kinds of Parallel Services:
- 3. Domain-Specific Parallel Services (e.g. Monte Carlo Pricing)
- 2. Typical Parallel Computing Services (Parameter Sweeping, DC, ...)
- 1. Basic Job Scheduling Services (parallel execution on the Grid)
91 3 Kinds of Parallel Services
- 3. Domain-Specific Parallel Services: providing business functionalities executed in parallel
- 2. Parallelization Services: typical parallel computing patterns (Parameter Sweeping, DC, ...)
- 1. Job Scheduling Service: schedule and run jobs in parallel on the Grid
[Figure: Operational Services and Parallel Services layered on top of the Grid.]
92 A Sample Pattern: Parameter Sweeping
[Figure: a process uses the Parameter Sweeping Service, customized with an Exec logic X; inputs I1, I2, ..., In are mapped to outputs O1, O2, ..., On. All the running instances of the Exec logic X are executed on the grid as a whole.]
93 Video: SOA Integration - Web Services, BPEL, Workflow
94 6. J2EE Integration
- Florin Alexandru Bratu
- OASIS Team, INRIA
95 J2EE Integration with Parallelism, Grids and Clouds
- Performing Grid and Cloud Computing From / In Application Servers:
- Delegating heavy computations outside J2EE Applications
- Using Deployed J2EE Nodes as Computational Resources
96 ProActive J2EE Integration (1)
1. Delegating heavy computations outside J2EE Applications
97 ProActive J2EE Integration (2)
2. Using Deployed J2EE Nodes as Computational Resources
- Objective
  - Being able to deploy active objects inside the JVMs of application servers
- Implementation
  - Based on a Sun standard: Java Connector Architecture (JSR 112)
  - Deployment module: resource adapter (RAR)
  - Works with all J2EE-compliant Application Servers
98 Integration (2)
99 Grids and Clouds: Amazon EC2 Deployment
100 Big Picture: Clouds
101 Clouds: ProActive Amazon EC2 Deployment
- Principles and Achievements
  - ProActive Amazon Images (AMI) on EC2
  - So far up to 128 EC2 Instances
  - (Indeed the maximum on the EC2 platform; ready to try 4,000 AMIs)
  - Seamless Deployment: no application change, no scripting, no pain
  - Opens the road to "In-house Enterprise Cluster and Grid" scale-out on EC2
102 ProActive Deployment on Amazon EC2: Video
103 (No Transcript)
104 P2P: Programming Models on Overlay Networks
105 (No Transcript)
106 (No Transcript)