Title: Data Parallel CORBA Joint Initial Submission
1 Data Parallel CORBA Joint Initial Submission
2 Submitters
- Mercury Computer Systems, Inc.: Jim Kulp, jek_at_mc.com
- MPI Software Technology, Inc.: Tony Skjellum, tony_at_mpi-softtech.com
- Objective Interface Systems, Inc.: Bill Beckwith, bill.beckwith_at_ois.com
- Supported by:
  - Los Alamos National Laboratory: David Forslund, dwf_at_lanl.gov
3 Agenda
- RFP Overview
- Programming Model Overview
- Interoperability and Scalability Support
- Summary of IDL
- Remaining work and issues
4 Existing Parallel Programming Systems and APIs
- Higher level than sockets
- Lower level than distributed objects
- Lots of help for data partitioning and distribution
- Lots of help for collective, SIMD processing
5 Existing Parallel Systems: not far from Distributed Systems
- Frequently use the same hardware and system resources
- Multiple computers connected by a network
- Applications have parts running on different computers
6 Islands in need of bridges
- CORBA apps can't easily have parallel parts
- Parallel apps can't easily incorporate or integrate with DOC (distributed object computing)
- Two different technologies, tools and disciplines to learn and buy
- Both are deficient for the other
- Parallel processing needs to leverage DOC technology (CORBA)
7 RFP Technical Goals
- Allow CORBA applications to use data-parallel processing without changing methodologies.
- Allow parallel applications to easily integrate with CORBA environments.
- Allow parallel applications to benefit from DOC.
- Add parallelism and data distribution to the transparencies offered by CORBA.
8 RFP Business Goals
- Add another class of applications to the market for CORBA products (as with RT and FT CORBA).
- Enable CORBA to penetrate new markets.
- Enable CORBA to solve more of the problem.
9 RFP Requirements
- Propose optional extensions
- Define interfaces for creation and management of parallel objects
- Define how operations are parallelized at both client and server
- Define how server data partitioning constraints are specified and communicated
10 RFP Parallel Objects
[Figure: a parallel object made up of parts A, B, C, D, ...; labels: data partitioning; creation, determination, management]
11 More RFP Requirements
- Define how actual partitioning is communicated
- Be as expressive as the Data Re-org specification
- Define interactions between parallel and non-parallel codes
- Define interactions between parallel objects
- Optional: define interoperability with normal ORBs
12 Agenda
- RFP Overview
- Programming Model Overview
- Interoperability and Scalability Support
- Summary of IDL
- Remaining work and issues
13 Programming Model Overview
- Concepts
- Clients using parallel objects
- Applications creating parallel objects
- Applications implementing parallel objects
14 ORB distinction
- Parallel ORBs: ORBs that support the extensions defined in this submission
- Non-parallel ORBs: standard ORBs that do not support these extensions
15 Object distinction
- Singular objects: normal CORBA objects
- Parallel objects: objects consisting of multiple parts working together in parallel to process invocations
- Part objects: the parts of a parallel object
- Not visible to clients, visible to client ORBs
- An invocation on a parallel object is turned into a set of invocations on part objects by the client's parallel ORB
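The last bullet can be illustrated with a minimal in-process sketch. It assumes a plain list argument and a simple block split. `PartStub`, its `scale` operation and `invoke_parallel` are hypothetical names invented for this illustration; they are not interfaces from the submission, and a real parallel ORB would issue remote requests rather than local calls.

```python
from concurrent.futures import ThreadPoolExecutor


class PartStub:
    """Hypothetical stand-in for a part object (not from the submission)."""

    def __init__(self, name):
        self.name = name

    def scale(self, chunk, factor):
        # Each part processes only its own slice of the argument data.
        return [x * factor for x in chunk]


def invoke_parallel(parts, data, factor):
    """Turn one invocation on the parallel object into one invocation per part."""
    n = len(parts)
    # Simple block split: the first (len(data) % n) parts get one extra element.
    base, extra = divmod(len(data), n)
    chunks, start = [], 0
    for i in range(n):
        stop = start + base + (1 if i < extra else 0)
        chunks.append(data[start:stop])
        start = stop
    # Issue the part invocations concurrently, then recombine in part order.
    with ThreadPoolExecutor(max_workers=n) as pool:
        results = list(
            pool.map(lambda pair: pair[0].scale(pair[1], factor), zip(parts, chunks))
        )
    return [x for chunk in results for x in chunk]
```

The caller sees a single call and a single result; the split and the recombination stay hidden, which is the transparency the slide describes.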
16 Client distinction
- Parallel clients: clients running in the context of a method of a part object servant
- Singular clients: normal non-parallel clients, whether running on a parallel ORB or not
17 Application distinction
- Client application: makes invocations on CORBA objects
- Server application: where servants are created and execute
- Part server application: where part servants are created and execute
- The code is different: the author is a part servant implementer and knows it
18 Data Partitioning
- How data entities (arguments to an operation on a parallel object) are divided up to be distributed or combined during invocations
- Commonly needed features, mostly as defined by the Data Reorganization Effort (www.data-re.org)
- Based largely on Cartesian 1D/2D/3D data
- Implementations have constraints; clients have real data
19 Data Partitioning Features
- Block partitioning: data is divided up into regular chunks
- Overlap: chunks have some edge data in common
- Modular/minimum constraints: pieces need to be at least a minimum size or a multiple of a given size
- Dimension reordering: can be done implicitly during invocation (e.g. row-major to column-major)
20 Data Reorganization Specification: Partitioning inputs
- Number of dimensions: 3
- For each dimension:
  - dimension sizes: 4 6 2
  - partition_type: BLOCK
  - left overlap: 0 0 0
  - right overlap: 0 0 0
  - minimum: 0 3 0
  - multiple: 0 3 0
- Number of parts: 4
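The per-dimension inputs above (size, overlaps, minimum, multiple) can be sketched as a one-dimensional block partitioner. This is an illustrative sketch only, not the algorithm of the Data Reorganization specification. `block_partition` is a hypothetical name. Treating 0 as "no constraint", requiring the size to be a multiple of `multiple`, and requiring every part to be non-empty are all assumptions of the sketch.

```python
def block_partition(size, parts, left_overlap=0, right_overlap=0, minimum=0, multiple=0):
    """Split one dimension of extent `size` into `parts` contiguous blocks.

    Returns one half-open (start, stop) range per part, overlap included.
    A value of 0 for `minimum` or `multiple` means "no constraint" (assumed).
    """
    unit = multiple if multiple > 0 else 1
    if size % unit != 0:
        raise ValueError("size is not a multiple of the required block unit")
    units = size // unit
    # Spread whole units as evenly as possible; earlier parts get the remainder.
    base, extra = divmod(units, parts)
    ranges, start = [], 0
    for i in range(parts):
        length = (base + (1 if i < extra else 0)) * unit
        # This sketch also requires every part to receive some data.
        if length < max(minimum, 1):
            raise ValueError("constraints cannot be met with this many parts")
        stop = start + length
        # Extend the owned block by the overlap, clipped to the data bounds.
        ranges.append((max(0, start - left_overlap), min(size, stop + right_overlap)))
        start = stop
    return ranges
```

For example, `block_partition(6, 2, minimum=3, multiple=3)` splits the second dimension of the slide's data into two blocks of three elements. Splitting that dimension over two parts is an assumption made for the example; how the slide distributes its four parts across the three dimensions is not stated here.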
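21 Partition Over Four Parts: Partitioning Output
[Figure: the 4 x 6 x 2 data block, with dimensions labelled A, B and C, divided over four parts p0, p1, p2 and p3, showing the data in each part]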
22 Request Distribution
- How a client request is transformed into part requests, e.g.
  - One request per part, in parallel
  - Multiple requests per part, load balanced
- Different parallel implementations require different patterns of request distribution.
- Partitioned (and combined) data gets distributed in requests to parts
23 Request distribution types
- All: send one request to all parts
- Cyclic: keep rotating requests in order among the parts until done
- Random: pick parts at random until done
- Least busy: pick the least busy part (as known by the client ORB)
- Not busy: pick a part that is not busy (as known by the client ORB)
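The five distribution types can be compared side by side in a small sketch. `choose_parts`, the policy strings and the `busy` bookkeeping dictionary are hypothetical, chosen for this illustration. In particular, how a client ORB learns how busy a part is lies outside what the slide says, so the sketch simply takes that knowledge as an input.

```python
import random
from itertools import cycle


def choose_parts(policy, parts, busy, n_requests, rng=None):
    """Return the part chosen for each request under a given distribution type.

    `busy` maps part name -> outstanding requests as known to the client ORB
    (a hypothetical bookkeeping structure for this sketch).
    """
    rng = rng or random.Random()
    if policy == "all":
        # One request to every part; n_requests is ignored.
        return list(parts)
    if policy == "cyclic":
        rotation = cycle(parts)
        return [next(rotation) for _ in range(n_requests)]
    if policy == "random":
        return [rng.choice(parts) for _ in range(n_requests)]
    if policy not in ("least_busy", "not_busy"):
        raise ValueError(f"unknown policy: {policy}")
    load = dict(busy)  # local copy so the caller's view is not mutated
    chosen = []
    for _ in range(n_requests):
        if policy == "least_busy":
            part = min(parts, key=lambda p: load[p])
        else:
            idle = [p for p in parts if load[p] == 0]
            if not idle:
                break  # no idle part: a real ORB would wait; this sketch stops
            part = idle[0]
        load[part] += 1
        chosen.append(part)
    return chosen
```

The difference between the last two types shows up when every part is occupied: "least busy" still selects a part, while "not busy" has nothing to select.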
24 Parallel Behavior
- A packaging of both data partitioning and request distribution
- Describes the implementation-defined (not interface-defined) requirements on the client of a parallel object implementation (the parts)
- Expressed as a runtime data structure, for each operation
25 Collective Invocations
- A special type of invocation made only by parallel clients: "all together now"
- Manifestly different in the source code
- The invocation is made collectively among the parts as a client
- Parallel clients can also make normal, non-collective invocations on singular or parallel objects
- An important case: collective to self
26 Collective invocations
[Figure: a simple client and a simple server object shown alongside a parallel object with parts A, B, C, D, ..., which acts as both parallel object and parallel client]
27 Programming Model Overview
- Concepts
- Clients using parallel objects
- Applications creating parallel objects
- Applications implementing parallel objects (part servers)
28 Clients using parallel objects
- No new interfaces or semantics visible to singular clients using parallel objects
- The parallel object exists in its references and in the existence of its parts
- There is no required singular object anywhere for a parallel object
- Some optional extra hidden objects for interoperability (described in a later section)
- Collective invocation is new and special, only for authors of parts acting as parallel clients
29 Clients using Parallel Objects
[Figure: Client 1 on a parallel ORB and Client 2 on a non-parallel ORB (through the proxy for Parallel Object A) invoke operation X on Parallel Object A. During execution of operation X, the parts of Parallel Object A perform a collective invocation of operation Y on Parallel Object B. Data reorganization for X is as specified by Object A; data reorganization for Y is as specified by Object A (client) and Object B (server).]
30 Applications creating parallel objects
- No intelligent configuration or resource management for parallel objects
- Creation of parallel objects requires explicit configuration:
  - Number of parts
  - Location of parts
  - Implementation type of parts
  - Properties
31 Parallel Object Factory (POF)
- The object/interface used by applications to create parallel objects
- Implemented by the parallel ORB (infrastructure-provided)
- Two patterns of parallel object creation are supported: top-down and bottom-up
32 Top-down parallel object creation
- The creator of a parallel object must (via the POF) create all the parts
- Simple and centralized
- Has scalability and startup latency problems: central and serial
33 Top-down parallel object creation
34 Bottom-up parallel object creation
- The creation of parts is distributed and parallel
- These creations are communicated to a POF
- The POF then creates the parallel object (reference) from the pre-existing parts
- Part creation is parallelized and can happen in advance of parallel object creation
35 Bottom-up parallel object creation
36 Locations
- Location is defined similarly to that of the Fault Tolerant specification
- Named and known to the factory, but just a string for creators
- Available locations are provided to POFs, either via ORB configuration or by the application
- In the bottom-up pattern, POFs are the rendezvous point between creation of parts and the formation of references for parallel objects
37 POF operations
- set_locations: informs the POF of locations, with properties (or via ORB configuration)
- get_locations: returns a sequence of known locations, with properties
- create_parallel_object: takes a type name and a sequence of locations and properties describing where and how parts should be created (top-down)
- create_parallel_object_by_name: takes a type name and a rendezvous name and returns an object consisting of the parts pre-registered with the POF under the rendezvous name (bottom-up)
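To make the two creation patterns concrete, here is a minimal in-memory mock of the four operations named above. The operation names come from the slide; every signature, return value and data layout is an assumption, since detailed IDL is deferred to a later submission. `register_part` is a hypothetical helper that stands in for whatever mechanism part servers use to pre-register parts.

```python
class ParallelObjectFactory:
    """In-memory mock of the four POF operations named on the slide.

    Signatures and return values are assumptions made for illustration.
    """

    def __init__(self):
        self._locations = {}   # location name -> properties
        self._rendezvous = {}  # rendezvous name -> pre-registered parts

    def set_locations(self, locations):
        self._locations.update(locations)

    def get_locations(self):
        return sorted(self._locations.items())

    def create_parallel_object(self, type_name, placements):
        """Top-down: the factory creates one part per (location, properties)."""
        parts = []
        for location, properties in placements:
            if location not in self._locations:
                raise KeyError(f"unknown location: {location}")
            parts.append(
                {"type": type_name, "location": location, "properties": properties}
            )
        return {"type": type_name, "parts": parts}

    def register_part(self, rendezvous_name, part):
        """Hypothetical helper: a part server announces a part it created."""
        self._rendezvous.setdefault(rendezvous_name, []).append(part)

    def create_parallel_object_by_name(self, type_name, rendezvous_name):
        """Bottom-up: assemble the object from parts registered in advance."""
        parts = self._rendezvous.get(rendezvous_name, [])
        if not parts:
            raise LookupError(f"no parts registered under {rendezvous_name!r}")
        return {"type": type_name, "parts": list(parts)}
```

In the top-down call the factory does all the creation work serially, whereas in the bottom-up call it only gathers parts that already exist, which matches the scalability trade-off described on the earlier slides.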
38 Programming Model Overview
- Concepts
- Clients using parallel objects
- Applications creating parallel objects
- Applications implementing parallel objects
39 Part Servers: Applications implementing parallel objects
- Implementing parallel objects is really implementing parts
- This involves special parallel application techniques and skills
- Requires decomposition of the functionality for computation and data parallelism
- Data locality is a key challenge
- Exploit multiple execution contexts (threads, processes, computers, processors)
40 Part server implementation tasks
- Implement the part servant: operations specialized to operate in parallel on a subset of the argument data
- Specify the ParallelBehavior of the part:
  - constraints on data partitioning and request distribution
  - runtime instructions to client parallel ORBs
- Possibly perform collective invocations
- Create parts (bottom-up) or register implementation type (top-down)
41 Creating parts: what happens in part servers?
- Parallel Part Adapter (PPA), conceptually
- Parallel Part Factory (PPF)
- Specifying ParallelBehavior
42 Parallel Part Adapter (concept, not IDL)
- A POA with extended functionality
- Creation of references to parts must specify ParallelBehavior
- We put this into ParallelCORBACurrent
  - As opposed to new POA interfaces
  - As opposed to new policy state
- Reference creation for parts uses the ParallelBehavior attribute of ParallelCORBACurrent
43 Parallel Part Factory (PPF)
- Analogous to the location-specific GenericFactory in Fault Tolerant CORBA
- Remotely accessed by the Parallel Object Factory for the top-down creation pattern
- Maintains known implementation types
- Advertises its implementation type repertoire to the Parallel Object Factory (for top-down)
- The interface to the PPF is defined; TBD whether the ORB or the part server should implement it
44 PPF operations
- Creation of a part object (from POF, top-down)
- Deletion of a part object (from POF, top-down)
- Upon PPF creation
  - Register with POF ("I'm out here!")
- Register implementation type
  - Local, from part server application, for top-down
- Register part object
  - Local, from part server application, for bottom-up
45 Specifying Parallel Behavior
- Descriptions of the implementation for each operation
- What the part objects will do and what clients performing invocations must do
- A series of operation descriptors, for non-default behavior
- Operation descriptor: request distribution plus data partitioning
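One way to picture ParallelBehavior as a runtime data structure is sketched below. The class and field names (`DimensionConstraint`, `OperationDescriptor`, `for_operation`, and so on) are hypothetical, and so is the choice of "all" as the default request distribution; the slides only say that descriptors are listed for operations with non-default behavior.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass(frozen=True)
class DimensionConstraint:
    """Per-dimension partitioning constraints (field names are assumptions)."""
    partition_type: str = "BLOCK"
    left_overlap: int = 0
    right_overlap: int = 0
    minimum: int = 0
    multiple: int = 0


@dataclass(frozen=True)
class OperationDescriptor:
    """Request distribution plus data partitioning for one operation."""
    request_distribution: str = "all"  # assumed default
    # Argument name -> one constraint per dimension of that argument.
    data_partitioning: Dict[str, Tuple[DimensionConstraint, ...]] = field(
        default_factory=dict
    )


@dataclass
class ParallelBehavior:
    # Only operations with non-default behavior need an entry.
    descriptors: Dict[str, OperationDescriptor] = field(default_factory=dict)
    default: OperationDescriptor = field(default_factory=OperationDescriptor)

    def for_operation(self, name):
        """Look up the descriptor a client ORB should apply to an operation."""
        return self.descriptors.get(name, self.default)
```

A client parallel ORB would consult `for_operation` before splitting arguments and issuing part requests; operations without an entry fall back to the default descriptor.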
46 Data partitioning: what to do with the arguments?
- An expression of constraints, without knowing the actual data sizes
- Used for exception, input, output and return values
- Ways to resolve returned arguments:
  - Combine the values according to partitioning
  - Compare all values and raise an exception if they differ
  - Take any returned/output value
  - Take any exception if it happens
  - Take any value if available even if there are exceptions
- Define the useful ones!
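The resolution options listed above can be sketched as a single merge function over per-part outcomes. The `resolve` function, the strategy names and the `("ok", value)` / `("error", exc)` encoding are hypothetical. "Combine" is shown only for a one-dimensional block partitioning, where combining reduces to concatenation in part order.

```python
def resolve(results, strategy):
    """Merge per-part outcomes into one client-visible outcome.

    `results` lists, in part order, ("ok", value) or ("error", exc) pairs.
    """
    if not results:
        raise ValueError("no part results to resolve")
    values = [v for kind, v in results if kind == "ok"]
    errors = [e for kind, e in results if kind == "error"]
    if strategy == "any_value_despite_errors":
        # Take any value if available, even if some parts raised.
        if values:
            return values[0]
        raise errors[0]
    if errors:
        # For the remaining strategies, any exception is reported.
        raise errors[0]
    if strategy == "combine":
        # Concatenate in part order; assumes a 1-D block partitioning.
        return [x for chunk in values for x in chunk]
    if strategy == "compare":
        if any(v != values[0] for v in values[1:]):
            raise ValueError("parts returned different values")
        return values[0]
    if strategy == "any":
        return values[0]
    raise ValueError(f"unknown strategy: {strategy}")
```

Which of these strategies belong in the specification is left open by the slide; the sketch only shows that they differ in how values and exceptions from the parts are weighed against each other.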
47 Implementing part operations
- The part interface is the part version of the IDL-defined interface
- Rules to generate the part interface from the IDL are different
- Implementers of parallel parts use different skeletons
- Extra partitioning information to be passed to the parts with the request:
  - the size of the original client-side data
  - the amount of data provided to the part
  - the position of the data in the whole
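The three pieces of extra partitioning information can be pictured as a small record handed to the part alongside its data. `PartitionInfo` and its field names are hypothetical; the slides do not define how this information is represented in the part skeletons.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PartitionInfo:
    whole_shape: Tuple[int, ...]  # size of the original client-side data
    part_shape: Tuple[int, ...]   # amount of data provided to this part
    offset: Tuple[int, ...]       # position of this part's data in the whole

    def __post_init__(self):
        if not (len(self.whole_shape) == len(self.part_shape) == len(self.offset)):
            raise ValueError("all three tuples must have the same number of dimensions")
        for whole, part, off in zip(self.whole_shape, self.part_shape, self.offset):
            if off < 0 or part < 0 or off + part > whole:
                raise ValueError("part data does not fit inside the whole")

    def to_global(self, local_index):
        """Map an index within the part's data to an index in the whole."""
        return tuple(o + i for o, i in zip(self.offset, local_index))
```

With this information a part servant can, for example, compute the global coordinates of each element it holds, which it needs for any algorithm whose result depends on position in the original data.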
48 Performing collective invocations
- How parts can collectively make invocations on singular or parallel objects
- How parts can operate on themselves
- We propose to use the convention used by AMI (operation prefixes)
- Collective invocations have extra client-side metadata, like part skeletons do
- The special case of collective invocation on self is interesting and important
49 Agenda
- RFP Overview
- Programming Model Overview
- Interoperability and Scalability Support
- Summary of IDL
- Remaining work and issues
50 Concepts for Interoperability and Scalability
- Parallel (object) Realization
- Parallel Proxy
- Parallel Agent
51 Parallel (object) Realization
- Run-time ParallelBehavior and profiles of the parts of a parallel object (how to talk to the parts)
- Not available to the client at compile time or in any Interface Repository
- A description of implementation, not interface
- Only available after the parallel object exists and its implementation (parts) is fixed
- Created by the POF, used by the client parallel ORB
- A data structure
52 Parallel Proxy
- A bridge object used by clients on non-parallel ORBs to make requests on parallel objects
- Optionally created by the POF when creating parallel objects; does not live at the client
- Only in the specification as a side effect of parallel object creation, and in IORs, with no defined IDL
- Automatically generated and/or supplied by the parallel ORB (infrastructure-provided)
- Converts a singular invocation into a parallel invocation
53 Parallel Agent
- An object used by parallel ORBs (as clients), not used by applications
- Used to obtain the ParallelRealization (PR) of a parallel object when the PR is not in the IOR
- Optionally created or referenced by the POF during parallel object creation; infrastructure-provided
- Evident in the specification by an IOR profile and IDL for parallel ORB interoperability
- Parallel agents exist for scalability (parallel objects with many parts)
54 Interoperability requirements
- Support interoperability between parallel and non-parallel ORBs
- Support interoperability between different parallel ORBs
- No requirements on common ORBs between any of POF, parts, singular clients and servers
- Parts of an object use the same ORB
55 Requirements for References to Parallel Objects
- Allow clients to use references on non-parallel ORBs
- Allow references to contain complete ParallelRealization information for maximum performance at modest scale
- Allow client ORBs to use references to retrieve ParallelRealization at runtime for maximum scaling
56 IOR profiles for parallel objects
- Parallel-Proxy Profile (not new): the normal IIOP profile
- Parallel-Realization Profile
- Parallel-Agent Profile
- Parallel-Behavior Profile
57 Parallel-Proxy Profile
- Not new
- In the parallel object context, it refers to the parallel proxy object
- The normal profile recognized and usable by non-parallel ORBs
- Present only if the POF was asked to make a parallel proxy
58 Parallel-Agent Profile
- Known only to parallel ORBs
- References an object that will supply information about the parallel object
- Useful for scalability when the Parallel-Realization profile is not appropriate (next)
59 Parallel-Realization Profile
- Known only to parallel ORBs
- A complete ParallelRealization embedded in the IOR
- Does not require any further traffic to start using the part objects
- Not highly scalable, but delivers the best performance for small-scale parallelism
- Not needed if a Parallel-Agent profile is present
- Contains ParallelBehavior and part profiles
60 Parallel-Behavior Profile
- Used in references to part objects, not parallel objects
- Used to communicate part implementation information to the POF
- Embeds the ParallelBehavior data structure in the part IOR
61 Parallel Object References
- Must contain either a Parallel-Realization profile or a Parallel-Agent profile; can have both
- May also optionally contain a Parallel-Proxy profile (the standard IIOP profile)
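The profile rules above can be written as a small check. The profile labels, `is_valid_parallel_reference` and `profile_for` are hypothetical names. The preference order a parallel ORB applies (realization first, then agent, then proxy as a fallback) is an assumption of this sketch and is not stated in the slides.

```python
PROXY = "parallel-proxy"  # the standard IIOP profile
REALIZATION = "parallel-realization"
AGENT = "parallel-agent"


def is_valid_parallel_reference(profiles):
    """A parallel object reference needs a realization or an agent profile."""
    profiles = set(profiles)
    return REALIZATION in profiles or AGENT in profiles


def profile_for(profiles, orb_is_parallel):
    """Pick the profile a client ORB would use, or None if it cannot proceed.

    The preference order for parallel ORBs is an assumption of this sketch.
    """
    profiles = set(profiles)
    if orb_is_parallel:
        for candidate in (REALIZATION, AGENT):
            if candidate in profiles:
                return candidate
    # Non-parallel ORBs only understand the standard profile.
    return PROXY if PROXY in profiles else None
```

A reference without a proxy profile is still valid under these rules, but a client on a non-parallel ORB has nothing it can use in it, which is why the proxy is requested from the POF when such clients are expected.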
62 Parallel Object References
63 Part Object References
- Must contain a standard profile
- Must contain a Parallel-Behavior profile
- The interface type is not the same as the interface to the parallel object
- It is derived from the part version of the interface
64 Agenda
- RFP Overview
- Programming Model Overview
- Interoperability and Scalability Support
- Summary of IDL
- Remaining work and issues
65 IDL Summary
- Data structures
- Parallel Realization
- Parallel Behavior
- IOR Profiles
- Parallel Agent
- Parallel Realization
- Parallel Behavior
- Interfaces
- Special skeletons for parts from app IDL
- Special stubs for collective invocations
- Parallel Object Factory
- Parallel Part Factory
- ParallelCORBACurrent attribute
66 Agenda
- RFP Overview
- Programming Model Overview
- Interoperability and Scalability Support
- Summary of IDL
- Remaining work and issues
67 Issues for following submissions 1
- Detailed IDL
- Possibility of data that is distributed in the state of parts but usable as arguments
- Possibility of adding the extra control provided by collective operations to singular clients?
- Ensure reflexive collective invocations do not have to copy data
- Do UML/MOF diagram
68 Issues for following submissions 2
- Usage cases, especially typical data flow pipelines
- Possibility of persistent/zero-copy requests, which may be required for adequate performance
- Fault Tolerance relationship in detail
- Interceptor-based implementation possibilities
69 Issues for following submissions 3
- POF semantics for turning constraints into realization. POF analogies to POA?
- Possibility of special requests to terminate cyclic or load-balanced operations
- Review creation scenarios for bottlenecks
- Possibility of optional (XML-based?) metadata for compile-time Parallel Behavior?
- Any POF/PPF issues vs. components?