Title: Component Computing meets Shared-Resource Metacomputing
1. Component Computing meets Shared-Resource Metacomputing
- Dawid Kurzyniec
- Emory University, Atlanta, GA, USA
- http://harness2.org/
2. Credits and Acknowledgements
- Distributed Computing Laboratory, Emory University
  - Vaidy Sunderam
  - Magda Slawinska, David DeWolfs, Maciej Malawski, Pawel Jurczyk, Dirk Gorissen, Jarek Slawinski, Piotr Wendykier, Dawid Kurzyniec
- Collaborators
  - Oak Ridge National Laboratory (A. Geist, C. Engelmann, J. Kohl)
  - University of Tennessee (J. Dongarra, G. Fagg, E. Gabriel)
- Sponsors
  - U.S. Department of Energy
  - National Science Foundation
  - Emory University
3. Motivations for component computing
- Why do we want software components?
  - Software is expensive and complex
  - We should reuse it as much as possible
- The idea is almost as old as computing itself
  - Douglas McIlroy, "Mass Produced Software Components", NATO Conference on Software Engineering, Garmisch, Germany, 1968
  - "Software production (...) would be enormously helped by the availability of spectra of high quality routines, quite as mechanical design is abetted by the existence of families of structural shapes, screws or resistors"
- Apply traditional engineering practices
  - Build software just like bridges or railroads, by assembling pre-fabricated pieces
  - Grouping and gradual composition is how humans have always tackled complexity
4. An even closer look at the origins reveals...
- Some component ideas can be traced back to the 1940s
- Consider linker technology
  - In 1947, primitive loaders could combine and relocate program routines from separate tapes into a single program
  - By the early 1960s, these had evolved into full-fledged linkage editors
- Component-based scientific computing (mobile components) existed in the 1960s
5. So what's the big fuss about?
- If the idea is so old and obvious...
  - How come we still have component workshops in 2006?
  - Has the software reuse problem not been solved yet?
  - Who's responsible?
6. Components everywhere
- Successful component manifestations are pervasive
  - To the point that we don't even notice them anymore
  - The true measure of success for a technology is that it becomes invisible
- Example: McIlroy's original postulates from 1968
  - "We have to begin thinking small"
  - Numerical approximation routines, input-output conversion, 2D and 3D geometry, text processing, storage management
  - Indeed, "screws and resistors"
  - Their functionality is now universally provided by standard libraries
7. Another example: UNIX pipes
- Decomposing programs into sub-tasks
  - The mechanism was invented by D. McIlroy, 1972
  - "One of the most widely admired contributions of Unix to the culture of operating systems" (D. Ritchie, 1979)
- McIlroy's Unix philosophy
  - Write programs that do one thing and do it well
  - Write programs to work together
  - Write programs to handle text streams, "(...) a universal interface"
- A forerunner of component workflows
  - Data-flow-oriented programming
  - Components connected via universal interfaces
  - Stitched together using scripting languages
8. The mainstream of the software industry
- Gradual refinement of concepts and techniques
- Object-Oriented Programming
  - In the end, no significant contribution to reuse, but...
  - Introduced interfaces, abstraction, encapsulation
- Simple components (COM, JavaBeans)
  - Easier composition: no hard-core programming required
- Distributed Objects (CORBA, DCOM, RMI)
  - Middleware services, Interface Definition Languages
- Distributed Components (EJB, COM+, CCM)
  - Application servers, separation of concerns
  - Notion of a component container
- Loose coupling: XML, SOAP, Messaging
  - Transition from programming to integrating
  - The best software is the software you don't have to write
9. 2006: Lessons learned
- Components: the success story
  - They brought about the most remarkable technologies in today's mainstream use
- Yet, the successes did not come easy
  - Systematically developing high-quality reusable software faces numerous technical and non-technical impediments
  - It took much trial and error to get where we are today
- Components: the never-ending story
  - Allow us to assemble bigger and bigger pieces
  - More sophisticated and powerful software
  - Creating new, even bigger opportunities
  - We no longer have to "think small"
[Figure: evolution timeline, from Linker Modules through Standard Libraries, Objects, UNIX Pipes, and Components to Distributed Objects and Distributed Components]
10. Research Trends 1: Distributed Workflows
- Data-flow approach to composition
  - Influenced by the UNIX pipes paradigm
- Distributed component environments
  - Typically, large volumes of data transfer
- Challenges
  - Assembling workflows dynamically
  - Matching semantic requirements with component interface descriptions
[Figure: the same evolution timeline, now extended with Distributed Workflows]
11. Research Trends 2: HPC Components
- Motivation
  - Increasing popularity of interdisciplinary HPC applications
  - Modern applications require coupling of separate simulation codes, which currently exist as complex monolithic frameworks
- Distinct application requirements
  - 1. Performance
  - 2. Performance
  - 3. Performance: parallel communication
  - 4. Support for large-volume (GB) and/or very frequent (kHz) data transfers (i.e., performance)
[Figure: the same evolution timeline, now extended with HPC Components alongside Distributed Workflows]
12. Towards Distributed Scientific Components
- Main motivation: resource sharing
  - Combine pools of resources from different administrative domains to run large, multi-domain HPC applications
- How to:
  - balance interoperability with performance?
  - deploy applications on shared resources?
  - embrace legacy codes?
  - program aggregated resources effectively?
  - encourage sharing by making it accessible for clients and providers?
  - enable scalability?
13. The Harness II Project
- Theme
  - Exploring new capabilities for distributed scientific computing
- Goals
  - Cooperative resource sharing
  - Dynamic application deployment and composability
  - Flexible communication layer
  - Ease of use, maintenance, and programming
14. Harness II
- Aggregation for concurrent high-performance computing
  - Equivalent to a Distributed Virtual Machine (DVM), but only on the client side
- Hosting layer
  - Collection of containers
  - Flexible, lightweight middleware
- DVM components responsible for
  - (Co)allocation and brokering
  - Naming and discovery
  - Failures, migration, and persistence
15. H2O Middleware Abstraction
- Providers own resources
  - They independently make them available over the network
- Clients discover, locate, and utilize resources
- Resource sharing occurs between a single provider and a single client
  - Relationships may be tailored as appropriate
  - Including identity formats, resource allocation, and compensation agreements
- Clients can themselves be providers
  - Cascading pairwise relationships may be formed
16. H2O Component Platform
- Resources provided as services
  - Service: an active software component exposing the functionality of the resource
  - May represent added value
  - Runs within a provider's container (execution context)
- Dynamic deployment
  - By any authorized party: provider, client, or reseller
  - Provider and deployers specify access policies
  - Based on client identity and code signatures
  - Support for temporal restrictions
- Decoupling
  - Providers from deployers
  - Providers from each other
[Figure: in H2O, a deployer «creates» a component inside a container on the provider host, and the client looks it up and uses it; contrasted with the traditional model, where only the provider deploys]
17. Example usage scenarios
- Resource: computational service
  - Reseller deploys a software component into the provider's container
  - Reseller notifies the client about the offered computational service
  - Client utilizes the service
- Resource: raw CPU power
  - Client gathers application components
  - Client deploys components into providers' containers
  - Client executes a distributed application utilizing the providers' CPU power
- Resource: legacy application
  - Provider deploys the service
  - Provider stores information about the service in a registry
  - Client discovers the service
  - Client accesses the legacy application through the service
18. Model and Implementation
- H2O nomenclature
  - container = kernel
  - component = pluglet
- Simple component-oriented model
  - Java- and C-based implementations
- Pluglet: a remotely accessible object
  - Must implement the Pluglet interface; may implement the Suspendible interface
  - These are used by the kernel to signal/trigger pluglet state changes
- Model
  - Implement (or wrap) a service as a pluglet to be deployed on kernel(s)

[Figure: clients access a pluglet through its functional interfaces (e.g. StockQuote) and through the Pluglet and Suspendible interfaces]

    interface StockQuote {
        double getStockQuote();
    }

    interface Pluglet {
        void init(ExecutionContext cxt);
        void start();
        void stop();
        void destroy();
    }

    interface Suspendible {
        void suspend();
        void resume();
    }
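
To make the model concrete, below is a minimal sketch of a pluglet that exposes StockQuote as its functional interface. It is illustrative only: the lifecycle bodies and the returned value are placeholders, not code from the H2O distribution.

    // Illustrative pluglet: implements the kernel-facing lifecycle
    // interface plus the client-facing functional interface.
    public class StockQuotePluglet implements Pluglet, StockQuote {
        private volatile boolean running;

        public void init(ExecutionContext cxt) {
            // read deployment parameters, acquire resources
        }
        public void start()   { running = true; }
        public void stop()    { running = false; }
        public void destroy() { /* release resources */ }

        // Functional interface, invoked remotely by clients (via RMIX)
        public double getStockQuote() {
            if (!running) throw new IllegalStateException("pluglet not started");
            return 42.0; // placeholder quote
        }
    }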
19. Resource Discovery
- Support for the Java Naming and Directory Interface (JNDI)
  - A common API to access diverse back-end services
- Two new JNDI provider implementations
  - JNDI-HDNS: fault-tolerant, persistent, distributed
    - Information replicated across multiple nodes
    - Load balancing: each node can handle read requests
    - Configurable model of synchrony (based on JGroups stacks)
    - Can recover state after node failures and/or network partitions
  - JNDI-Jini: a JNDI front-end to the Jini Lookup Service
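
Because both providers sit behind the standard JNDI API, client code looks the same regardless of the back end. A sketch follows; the factory class name and provider URL are invented for illustration and are not the actual JNDI-HDNS identifiers.

    import java.util.Hashtable;
    import javax.naming.Context;
    import javax.naming.InitialContext;
    import javax.naming.NamingException;

    public class DiscoveryExample {
        public static void main(String[] args) throws NamingException {
            Hashtable<String, String> env = new Hashtable<>();
            // Hypothetical factory and URL; the real JNDI-HDNS names differ
            env.put(Context.INITIAL_CONTEXT_FACTORY,
                    "org.example.hdns.HdnsContextFactory");
            env.put(Context.PROVIDER_URL, "hdns://naming.example.org:7777");

            Context ctx = new InitialContext(env);
            // Look up a previously registered service reference by name
            Object serviceRef = ctx.lookup("services/StockQuote");
            ctx.close();
        }
    }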
20. Resource Discovery (cont.)
- Example: a JNDI-based hierarchical naming service
  - DNS redirects clients to nearby HDNS nodes
  - HDNS forwards requests to department-level naming services
- Motivation: lookups prevail at the meta-level, but updates must be propagated faster than in DNS
21. Accessing Component Services
- Rely on standard component communication paradigms
  - Request-response
  - Asynchronous events
- Tailor them to the H2O requirements
  - Stateful service access must be supported
  - Efficient vs. interoperable protocols
  - Asynchronous access for compute-intensive services
  - Semantics of cancellation and error handling
22. RMIX communication layer
- Extensible RMI framework
- Client and provider APIs
  - Uniform access to communication capabilities
  - Supplied by pluggable provider implementations
- Multiple invocation protocols
  - JRMPX, ONC-RPC, SOAP
- Stackable transports
  - SSL, tunneling, compression, JXTA
  - Transparent to the application
- Extended semantics
  - Asynchronous calls, invocation interceptors, ...
23. RMIX in H2O
- Pluglets can use familiar RMI semantics
- Interoperability
  - Pluglets can communicate with Web Services and with RPC clients and servers
  - Allows tailoring the protocol stack as appropriate

[Figure: an H2O pluglet reachable over multiple protocols (RPC, IIOP, JRMP, SOAP, ...)]
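
Since pluglets keep familiar RMI semantics, a remote service interface can be written once in the standard java.rmi style and then carried over whichever protocol stack is selected; a minimal sketch, assuming the usual RMI conventions:

    import java.rmi.Remote;
    import java.rmi.RemoteException;

    // Written once in plain RMI style; under RMIX the same interface can
    // be exported over JRMPX, ONC-RPC, or SOAP without source changes.
    public interface StockQuote extends Remote {
        double getStockQuote() throws RemoteException;
    }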
24. RMIX protocol stacks
- Interoperability
  - SOAP
- Connectivity
  - JXTA, transport tunnels
- Security
  - SSL, JXTA groups
- High performance
  - ARPC, custom transports (Myrinet, Quadrics)
- Protocol negotiation
[Figure: protocol stacks matched to scenarios; security and interoperability between a pluglet and clients/servers across the Internet and firewalls, efficiency on the local network, efficiency between pluglets co-located in a Harness kernel, and connectivity through firewalls]
25. Asynchronous RMIX
    AsyncHello hello = (AsyncHello) Naming.lookup(...);

    Future f = hello.asyncHello();
    ...
    result = f.get();
    ...
    f = hello.cbasyncHello(new Callback() {
        public void completed() { ... }
        public void failed() { ... }
    });
    ...
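
The snippet shows the two asynchronous styles side by side: a future-based call, where the client proceeds and later blocks in f.get() to retrieve the result, and a callback-based variant, where completion or failure is signaled to the supplied Callback object instead.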
26. Asynchronous RMIX (cont.)
- Cancellation
- Execution order
- Exception handling
- Parameter marshaling
27. H2O and events
- REVENTS library
  - Asynchronous remote events
  - Publisher-subscriber model with a hierarchical topic list
  - Event = metadata + payload (like a JMS message)
  - Focused subscriptions: topics + filters
  - Filter language based on the SQL expression syntax
- RMIX and REVENTS in H2O
  - H2O pluglets can
    - Implement remote methods
    - Publish events (e.g. in response to a remote call or some internal activity)
  - Pluglet clients can
    - Invoke methods on pluglets
    - Subscribe to events fired by pluglets
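
A rough sketch of the publish/subscribe pattern just described. The slide does not show the actual REVENTS API, so every type and method name below is invented for illustration.

    // Hypothetical REVENTS-style API -- all names invented for illustration
    interface Event { Object get(String key); }
    interface EventListener { void onEvent(Event e); }
    interface EventBroker {
        void publish(String topic, java.util.Map<String, Object> payload);
        void subscribe(String topicPattern, String sqlFilter, EventListener l);
    }

    class EventsSketch {
        static void demo(EventBroker broker) {
            // Publisher: hierarchical topic, metadata + payload
            broker.publish("stocks/nyse/IBM",
                           java.util.Map.of("price", 84.5, "volume", 10000));

            // Subscriber: topic pattern narrowed by an SQL-style filter
            broker.subscribe("stocks/nyse/*", "price > 80 AND volume >= 5000",
                             e -> System.out.println("quote: " + e.get("price")));
        }
    }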
28. H2O and P2P
- JXTA transport in RMIX
  - Enables communication between components behind firewalls or NATs
  - Transparent to the application
- Future work: JXTA-JNDI
  - Exploit decentralized P2P resource discovery
- Motivations
  - Ad-hoc collaborations
  - Self-organizing, scalable resource-sharing networks
29. Programming models: H2O and CCA
- CCA: a component standard for High Performance Computing
  - Uses and provides ports, described in SIDL
  - Support for scientific data types (complex numbers, data arrays)
- Existing CCA frameworks
  - CCAFFEINE: tightly coupled; support for Babel; MPI support
  - XCAT: loosely coupled, Globus-compatible, Java-based
  - DCA: MPI-based, MxN problems
  - SCIRun2: metacomponent model
  - LegionCCA: based on the Legion metacomputing system
30. MOCCA: CCA implementation in H2O
- H2O dynamic deployment -> runtime, remote component (un)loading
- H2O security -> multiple components may run without interfering
  - Each component runs as a separate pluglet
  - Loaded by a different classloader, so multiple versions may co-exist
- RMIX communication -> efficiency, multiprotocol interoperability
- MOCCA_Light: a pure-Java implementation (no SIDL)
31. Remote Port Call
32. Performance: Small Data Packets
- Factors
  - SOAP header overhead in XCAT
  - Connection pools in RMIX
33. Large Data Packets
- Encoding (binary vs. base64)
- CPU saturation on Gigabit LAN (serialization)
- Variance caused by Java garbage collection
34. Support for Babel Components
[Figure: Babel interoperability call path between the user side and the provider side; steps: 1. getPort, 2. create Babel port proxy, 3. IOR call from the native CCA component to a Java stub, 4. RMIX call between component pluglets, 5. call into the MOCCA Babel services (Java), 6. IOR call into the native Babel CCA component]
- Currently MOCCA_Light: a pure-Java framework
- Approach
  - Use Java bindings to Babelized components
  - Automatically generate wrapping code
- Issues
  - Babel remote bindings
  - Remote references
  - CCA package hierarchy
35. Use Case 2: H2O and FT-MPI
- Overall scheme
  - H2O framework installed on computational nodes or on cluster front-ends
  - A pluglet handles startup, event notification, and node discovery
  - FT-MPI native communication (also MPICH)
- Major value added
  - FT-MPI need not be installed anywhere on the computing nodes
    - It is staged just-in-time before program execution
  - Likewise, application binaries and data need not be present on the computing nodes
    - The system must be able to stage them in a secure manner
36. Staging the FT-MPI runtime with H2O
- FT-MPI runtime library and daemons
  - Staged from a repository (e.g. a Web server) to the computational node upon the user's request
  - Automatic platform type detection: the appropriate binary files are downloaded from the repository as needed
- Allows users to run fault-tolerant MPI programs on machines where FT-MPI is not pre-installed
  - No login account is needed to do so: H2O credentials are used instead
37. Launching FT-MPI applications with H2O
- Staging applications from a network repository
  - Uses a URL code base to refer to a remotely stored application
  - The platform-specific binary is transparently uploaded to a computational node upon client request
- Separation of roles
  - The application developer bundles the application and puts it into a repository
  - The end user launches the application, unaware of the heterogeneity
38. Security
- Challenge: applications staged from remote repositories
  - The computational node needs to ensure that the application code will not compromise or abuse the provider's system
- Approach
  - Code authentication based on digital signatures
  - User authentication via H2O mechanisms
  - A security policy, supplied by the FT-MPI deployer, specifies which code sources and/or users are authorized (see the sketch below)
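
As an illustration only, such a policy could be expressed in standard Java policy syntax as below; the slide does not show H2O's actual policy format, and the repository URL and signer alias are invented.

    // Illustration in standard Java policy syntax; H2O's real policy
    // format may differ. Grants limited rights only to code that comes
    // from a trusted repository and is signed by a trusted deployer key.
    grant signedBy "ftmpiDeployer",
          codeBase "https://repo.example.org/ftmpi/-" {
        permission java.io.FilePermission "/scratch/ftmpi/-", "read,write";
        permission java.net.SocketPermission "*:1024-", "connect,accept";
    };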
39. Interconnecting heterogeneous clusters
- Private, non-routable networks
  - Communication proxies on the cluster front-ends route data streams
  - Local (intra-cluster) channels are not affected
  - Nodes use virtual addresses at the IP level, resolved by the proxy
40. Initial experimental results
- Proxied connection versus direct connection
  - The standard FT-MPI throughput benchmark was used
  - Within a Gigabit-Ethernet cluster, proxies retain 65% of the direct throughput
41. Summary and Status
- Harness II Project
  - Exploring new capabilities for next-generation scientific computing
  - A collaboration between ORNL, UTK, and Emory
- Status
  - H2O: a reconfigurable, secure, and scalable component framework
    - Dynamic deployment across collections of shared resources
  - RMIX and REVENTS: a flexible, multi-protocol communication substrate
    - Sync and async remote method calls; distributed events
    - JXTA transport layer for global P2P connectivity
  - Programming models
    - MOCCA: distributed CCA applications in a shared environment
    - H2O-FTMPI (in progress): dynamic deployment of MPI codes
- Software and more info
  - Visit us at http://harness2.org/
42. Closing Remarks
- How to make systematic reuse work for you (criteria excerpted from the article by D. C. Schmidt)
- Attractive resource magnets
  - Component repositories
  - Open Source model
  - Close feedback loops for bug fixing
- Competitive market
  - Discouraging re-invention, promoting re-use
- Iterative development
  - Good components and frameworks require time to design, implement, optimize, validate, apply, maintain, and enhance
  - Build re-usable assets incrementally
  - Maintain close feedback between middleware and application developers
- Keep the faith
  - Impediments will arise
  - But designing for re-use will pay off in the long run
43. Common Component Architecture (CCA)
- Vision
  - "Rather than a handful of hero programmers creating a monolithic executable, many developers and domain experts can collaboratively contribute components, which can then be assembled and reused in production scientific simulations"
- HPC focus
  - Efficient bindings for components in the same address space
    - Need latency close to that of a local procedure call
    - Still need dynamic (un)binding and reconfiguration
  - Support for scientific data types and languages
    - Must be easy to componentize legacy Fortran codes
  - Parallelism
    - Allow componentization of parallel software
  - Integrate with existing HPC environments
44. Modern definitions
- What is a software component?
  - "A system element offering a predefined service and able to communicate with other components" (Wikipedia)
  - "An encapsulated software module defined by its public interfaces, that follows a strict set of behavior rules defined by the component architecture" (D. Gannon, HPDC 2001)
- How do I recognize one?
  - Criteria due to C. Szyperski and D. Messerschmitt:
    - Multiple-use
    - Non-context-specific
    - Composable with other components
    - Encapsulated, i.e., non-investigable through its interfaces
    - A unit of independent deployment and versioning