Title: Distributed Analysis: Motivation
2. Distributed Analysis Motivation
This is the view of an analysis application provider.
- why do we want distributed data analysis?
- move processing close to the data
- for example, ntuple analysis
- job description: kB
- the data itself: MB, GB, TB ...
- rather than downloading gigabytes of data, let the remote server do the job
- do it in parallel - faster
- clusters of cheap PCs
3. Computing Models
- desktop computing
- personal computing resource
- may lack CPU, high-speed access to networked databases, ...
- "mainframe" computing
- shared supercomputer in a LAN
- expensive and may have scalability problems
- cluster computing
- a collection of nodes in a LAN
- complex and harder to manage
- grid computing
- a WAN collection of computing elements
- even more complex
4. Cluster Computing at CERN
- batch data analysis
- e.g. lxbatch currently in production
- workload management system (e.g. LSF)
- automatic scheduling and load-balancing
- batch jobs take hours or days to complete
- interactive data analysis
- currently desktop; will have to be distributed for LHC
- tried in the past for ntuple analysis
- PIAF (Parallel Interactive Analysis Facility)
- running copies of PAW on behalf of the user.
- 8 nodes and tight coupling with the application layer (PAW)
- semi-interactive analysis becomes more important: minutes ... hours
6. Topology of I/O-Intensive Applications
- ntuple analysis is mostly I/O-intensive rather than CPU-intensive
- fast DB access from the cluster
- slow network from user to cluster
- very small amount of data exchanged between the tasks in comparison to the "input" data
7. Parallel Ntuple Analysis
- data driven
- all workers perform same task (similar to SPMD)
- synchronization is quite simple (independent workers)
- master/worker model (see the sketch below)
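A minimal illustration of this data-driven master/worker pattern, written in Python. It is not DIANE code: all names (project_rows, read_ntuple_rows, the "px" column) are hypothetical. It only shows what the slides describe: every worker runs the same task on a disjoint chunk of ntuple rows, the workers are independent, and the master merges the partial results at the end.

    # Hypothetical sketch of data-driven master/worker ntuple analysis.
    # Each worker runs the same task on a disjoint row range (SPMD-like);
    # workers are independent, so synchronization reduces to a final merge.
    from multiprocessing import Pool

    NBINS, LO, HI = 50, 0.0, 1.0

    def read_ntuple_rows(first, last):
        """Placeholder for the experiment-specific I/O layer."""
        return ({"px": (i % 100) / 100.0} for i in range(first, last))

    def project_rows(row_range):
        """Worker task: fill a partial histogram from one chunk of rows."""
        first, last = row_range
        hist = [0] * NBINS
        for row in read_ntuple_rows(first, last):   # I/O-bound: read close to the data
            x = row["px"]                           # hypothetical column name
            if LO <= x < HI:
                hist[int((x - LO) / (HI - LO) * NBINS)] += 1
        return hist

    if __name__ == "__main__":
        total_rows, n_workers = 37000, 6
        step = total_rows // n_workers
        chunks = [(i, min(i + step, total_rows)) for i in range(0, total_rows, step)]
        with Pool(n_workers) as pool:                  # master dispatches tasks ...
            partials = pool.map(project_rows, chunks)  # ... all workers run the same code
        final = [sum(bins) for bins in zip(*partials)] # master merges partial histograms
        print(sum(final), "entries projected")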
10. Master/Worker Model
- applications share the same computation model
- so also share a big part of the framework code
- but have different non-functional requirements
11. What is DIANE?
- R&D project in IT/API
- semi-interactive parallel analysis for LHC
- middleware technology evaluation and choice
- CORBA, MPI, Condor, LSF...
- also see how to integrate API products with GRID
- prototyping (focus on ntuple analysis)
- time scale and resources
- Jan 2001 start (< 1 FTE)
- June 2002: running prototype exists
- sample Ntuple analysis with Anaphe
- event-level parallel Geant4 simulation
12. What is DIANE?
- framework for parallel cluster computation
- application-oriented
- master-worker model common in HEP applications
- application-independent
- apps dynamically loaded in a plugin style (see the sketch below)
- callbacks to applications via abstract interfaces
- component-based
- subsystems and services packaged into component libraries
- core architecture uses CORBA and CCM (CORBA Component Model)
- integration layer between applications and the GRID
- environment and deployment tools
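A sketch (not the actual DIANE API) of what "plugin-style" application loading through an abstract interface can look like in Python. The interface name IApplication, its methods, and the module/class names are invented for illustration; the point is only that the framework calls back into applications it does not know at compile time.

    # Hypothetical sketch: the framework only knows an abstract application
    # interface; concrete analysis applications are loaded dynamically by name.
    import importlib
    from abc import ABC, abstractmethod

    class IApplication(ABC):
        """Abstract interface the framework calls back into."""
        @abstractmethod
        def split(self, job_spec):   # master side: produce independent tasks
            ...
        @abstractmethod
        def run_task(self, task):    # worker side: execute one task
            ...
        @abstractmethod
        def merge(self, results):    # master side: combine partial results
            ...

    def load_application(module_name, class_name):
        """Plugin-style loading: the application is chosen at run time."""
        module = importlib.import_module(module_name)  # e.g. "ntuple_analysis"
        app_class = getattr(module, class_name)
        if not issubclass(app_class, IApplication):
            raise TypeError(f"{class_name} does not implement IApplication")
        return app_class()

    # Framework-side pseudo-usage (names are illustrative only):
    # app = load_application("ntuple_analysis", "NtupleProjection")
    # tasks = app.split(job_spec)
    # results = [app.run_task(t) for t in tasks]   # in reality dispatched to workers
    # final = app.merge(results)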
13. What DIANE is not
- DIANE is not
- a replacement for a GRID and its services
- a hardwired analysis toolkit
14. DIANE and GRID
- DIANE as a GRID computing element
- ...via a gateway that understands Grid/JDL
- ... Grid/JDL must be able to describe parallel jobs/tasks
- DIANE as a user of (low-level) Grid services
- ...authentication, security, load balancing...
- and profit from existing 3rd-party implementations
- the Python environment is a rapid prototyping platform
- and may provide a convenient connection between DIANE and the Globus Toolkit via the pyGlobus API
15. Architecture Overview
- layering: abstract middleware interfaces and components
- plugin-style application loading
16. Client-Side DIANE
- thin client / lightweight XML job description protocol
- just create a well-formed job description in XML (see the sketch below)
- send it and read the results back as XML data messages
- connection scenarios
- standalone clients: C++, Python client apps
- explicit connection from a shell prompt
- flexibility and choice of command-line tools
- clients integrated into an analysis framework, e.g. Lizard/Python
- hidden connection behind the scenes
- Web access: Java-CORBA binding, SOAP (?)
- universal and easy access
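The slides do not show the actual DIANE XML schema, so the following is a hedged sketch of what a "well-formed job description in XML" and the corresponding result parsing might look like on the thin client, using only the Python standard library. All element and attribute names are invented.

    # Hypothetical sketch of a thin client building an XML job description
    # and parsing the XML result message; tag names are invented, not DIANE's.
    import xml.etree.ElementTree as ET

    def make_job_description(ntuple_file, column, nbins, lo, hi):
        job = ET.Element("job", application="ntuple-projection")
        ET.SubElement(job, "input", ntuple=ntuple_file)
        ET.SubElement(job, "projection", column=column,
                      bins=str(nbins), low=str(lo), high=str(hi))
        return ET.tostring(job, encoding="unicode")  # a few hundred bytes, as on slide 26

    def parse_result(xml_text):
        result = ET.fromstring(xml_text)
        return [int(b.text) for b in result.findall("./histogram/bin")]

    # The description stays small (kB) while the data stays on the server (MB..GB):
    print(make_job_description("paw.ntuple", "px", 50, 0.0, 1.0))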
17. Data Exchange Protocol (1)
- XDR concept in C++
- Specify the data format
- Type and order of data fields
- Data messages
- Sender and receiver agree on the format
- Message is sent as an opaque object (any)
- C++ type may be different on each side
- Interfaces with flexible data types
- E.g. store list of identifiers (unknown type)
18. Data Exchange Protocol (2)

    class A : public DXPDataObject {
    public:
      DXPString name;                                // predefined fundamental types
      DXPLong index;
      DXPSequenceDataObject<DXPplain_Double> ratio;
      B b;                                           // nested complex object
      A(DXPDataObject* parent)
        : DXPDataObject(parent), name(this), index(this), ratio(this), b(this) {}
    };
19. Data Exchange Protocol (3)
- External streaming supported, e.g.
- Serialize as a CORBA byte sequence
- Serialize to XML (ASCII string)
- Visitor pattern: new formats are easy to add
- Handles
- Opaque objects (any)
- Typed objects: safe casts
    DXPTypedDataObject<A> a1, a2;   // explicit format
    DXPAnyDataObject x = a1;        // opaque object
    a2 = x;
    if (a2.isValid()) { /* "cast successful" */ }
20. Server-Side Architecture
- CORBA Component Model (CCM)
- pluggable components and services
- make a truly component-based system at the core architecture level
- common interface to the service components
- difficult due to the different nature of the service implementations
- example: load-balancing service
- Condor - process migration
- LSF - black-box load balancing
- custom PULL implementation - active load balancing
- but first results show that it is feasible
21. DIANE and CORBA
- CORBA
- industry standard (mature and tested)
- scalable (we need 1000s of nodes and processes)
- language and platform independent (IDL)
- C, C++, Java, Python, ...
- many implementations commercial and open source
- directly supports OO, abstract interfaces
- CORBA facilities
- naming service, trading service etc.
- CORBA Component Model
- supports component programming (evolution of OO)
22. Component Technology
- components are not classes!
- components are deployment units
- they live in libraries, object files and binaries
- they interact with the external world only via an abstract interface
- total separation from the underlying implementation
- classes are source code organization units
- they exist on different design levels and support different semantics
- utility classes (e.g. STL vectors or smart pointers)
- mathematical classes (e.g. HepMatrix)
- complex domain classes (e.g. FMLFitter)
- but a class may implement a component
- OO alone often fails to deliver reuse; component technology might help (hopefully)
24. Server-Side DIANE
25. Server-Side DIANE
26. CORBA and XML in Practice
- interoperability (shown in the prototype ntuple application)
- cross-release (thank you, XML!)
- client running Lizard/Anaphe 3.6.6
- server running 4.0.0-pre1
- cross-language (thank you, CORBA!)
- Python CORBA client (30 lines; a sketch follows below)
- C++ CORBA server
- compact XML data messages
- 500 bytes of XML to the server, 22 kB of XML back from the server
- a factor of 10^6 less than the original data (30 MB ntuple)
- thin client: no need to run Lizard on the client side, as an alternative use-case scenario
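For orientation, here is a hedged sketch of what a ~30-line Python CORBA client could look like using omniORBpy. The DIANE IDL module, interface, and operation names (DIANE, JobManager, submit) and the IOR-file convention are assumptions, not the actual DIANE interface.

    # Hypothetical Python CORBA client sketch (omniORBpy).
    # Assumes IDL stubs were generated with: omniidl -bpython diane.idl
    import sys
    from omniORB import CORBA
    import DIANE                      # hypothetical module generated from the IDL

    def main():
        orb = CORBA.ORB_init(sys.argv, CORBA.ORB_ID)
        # Assume the master publishes its object reference as a stringified IOR in a file.
        ior = open("master.ior").read().strip()
        obj = orb.string_to_object(ior)
        master = obj._narrow(DIANE.JobManager)   # hypothetical interface name
        if master is None:
            sys.exit("object is not a DIANE.JobManager")
        # Send the XML job description and read the XML result back (cf. slide 16).
        xml_job = open("job.xml").read()
        xml_result = master.submit(xml_job)      # hypothetical operation
        print(xml_result)
        orb.destroy()

    if __name__ == "__main__":
        main()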
27. Load Balancing Service
- Black-box (e.g. LSF)
- limited control -> submit jobs (black box)
- job queues with CPU limits
- automatic load balancing and scheduling (task creation and dispatch)
- prototype deployed (10s of workers)
- Explicit PULL LB (see the sketch below)
- custom daemons
- more control -> explicit creation of tasks
- load-balancing callbacks into the specific application
- prototype custom PULL load balancing (10s of workers)
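A minimal sketch of the explicit PULL idea, assuming nothing about DIANE's internals: the master keeps a task queue and each worker pulls a new task only when it becomes idle, so faster nodes naturally take more tasks. The helper names are invented.

    # Hypothetical sketch of PULL-style load balancing: idle workers pull the
    # next task from the master's queue, so load balances itself across
    # heterogeneous nodes.
    import multiprocessing as mp

    def do_task(task):
        return sum(range(task * 10000))      # stand-in for a real analysis task

    def worker(task_queue, result_queue):
        while True:
            task = task_queue.get()
            if task is None:                 # sentinel: no more work
                break
            result_queue.put((task, do_task(task)))

    if __name__ == "__main__":
        tasks, n_workers = list(range(20)), 4
        task_q, result_q = mp.Queue(), mp.Queue()
        for t in tasks:
            task_q.put(t)
        for _ in range(n_workers):
            task_q.put(None)                 # one sentinel per worker
        procs = [mp.Process(target=worker, args=(task_q, result_q))
                 for _ in range(n_workers)]
        for p in procs:
            p.start()
        results = dict(result_q.get() for _ in tasks)
        for p in procs:
            p.join()
        print(len(results), "tasks completed")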
28. Dedicated Interactive Cluster (1)
- Daemons per node
- Dynamic process allocation
29. Dedicated Interactive Cluster (2)
- Daemons per user per node
- Thread pools, per-user policies
30. Error Recovery Service
- The mechanisms
- daemon control layer
- make sure that the core framework processes are alive
- periodical ping needs to be hierarchical to be scalable
- worker sandbox (see the sketch below)
- protect from seg-faults in the user applications
- memory corruption
- exceptions
- signals
- based on standard Unix mechanisms: child processes and signals
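A sketch of the Unix child-process idea behind the worker sandbox, with invented function names: the user task runs in a forked child so that a crash (e.g. a segmentation fault) kills only the child, and the parent classifies the outcome from the exit status.

    # Hypothetical sketch of a worker sandbox using standard Unix mechanisms:
    # run the (possibly crashing) user code in a child process and inspect how
    # the child terminated, instead of letting a seg-fault kill the worker daemon.
    import os, signal

    def run_sandboxed(user_task):
        pid = os.fork()
        if pid == 0:                          # child: execute the user application code
            try:
                user_task()
                os._exit(0)
            except Exception:
                os._exit(1)                   # ordinary exceptions -> non-zero exit code
        _, status = os.waitpid(pid, 0)        # parent: wait and classify the outcome
        if os.WIFSIGNALED(status):
            sig = os.WTERMSIG(status)
            return f"task killed by signal {sig} ({signal.Signals(sig).name})"
        return f"task exited with code {os.WEXITSTATUS(status)}"

    def buggy_task():
        os.kill(os.getpid(), signal.SIGSEGV)  # simulate a memory-corruption crash

    if __name__ == "__main__":
        print(run_sandboxed(buggy_task))      # the parent survives and reports the crash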
32. Other Services
- Interactive data analysis
- connection-oriented vs connectionless
- monitoring and fault recovery
- User environment replication
- do not rely on the common filesystem (e.g. AFS)
- distribution of application code
- binary exchange possible for homogeneous clusters
- distribution of local setup data
- configuration files, etc
- binary dependencies (shared libraries etc)
33. Optimization
- Optimizing distributed I/O access to data
- clustering of the data in the DB on a per-task basis
- depends on the experiment-specific I/O solution
- Load balancing
- the framework does not directly address low-level issues
- ...but the design must be LB-aware
- partition the initial data set and assign data chunks to tasks (see the sketch below)
- how big should the chunks be?
- static/adaptive algorithm?
- push vs pull model for dispatching tasks
- etc.
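A small sketch of the partitioning question raised above: splitting the initial row range into chunks of a chosen size before assigning them to tasks. The chunk-size trade-off (dispatch overhead vs. load-balancing granularity) is exactly the open question on the slide; the helper below is only illustrative.

    # Hypothetical sketch: partition an ntuple row range into fixed-size chunks.
    # Small chunks balance load better but add per-task dispatch overhead;
    # large chunks do the opposite - the slide leaves the optimum open.
    def partition(total_rows, chunk_size):
        return [(first, min(first + chunk_size, total_rows))
                for first in range(0, total_rows, chunk_size)]

    # 37K rows (slide 41) split into ~6 chunks of 6200 rows, or 37 chunks of 1000:
    print(partition(37000, 6200))
    print(len(partition(37000, 1000)), "tasks")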
34. Further Evolution
- expect full integration and collaboration with LCG according to their schedule
- software evolution and policy
- distributed technology (CORBA, RMI, DCOM, sockets, ...)
- persistency technology (LCG RTAGs -> ODBMS, RDBMS, RIO)
- programming/scripting languages (C++, Java, Python, ...)
- evolution of GRID technologies and services
- Globus
- LCG, DataGrid, CrossGrid (interactive apps)
- ...
35. Limitations
- Model limited to Master/Worker
- More complex synchronization patterns
- some particular CPU-intensive applications require fine-grained synchronization between workers
- this is NOT provided by the framework and must be achieved by other means (e.g. MPI)
- Intra-cluster scope, NOT a global metacomputer
- Grid-enabled gateway to enter the Grid universe
- otherwise the framework is independent, thanks to abstract interfaces
36. Similar Projects in HEP
- PIAF (history)
- using PAW
- TOP-C
- G4 examples for parallelism at event-level
- BlueOx
- Java
- using JAS for analysis
- some room for commonality via AIDA
- PROOF
- based on ROOT
37. Summary
- first prototype ready and working
- proof of concept for up to 50 workers
- scaling to 1000 workers still needs to be checked
- initial deployment
- integration with Lizard analysis tool
- Geant 4 simulation
- active R&D in component architecture
- relation to LCG to be established
38. That's about it
- cern.ch/moscicki/work
- cern.ch/anaphe
- aida.freehep.org
39. Facade for End-User Analysis
- 3 groups of user roles
- developers of distributed analysis applications
- brand new applications e.g. simulation
- advanced users with custom ntuple analysis code
- similar to Lizard Analyzer
- execute a custom algorithm on the parallel ntuple scan
- interactive users
- do the standard projections
- just specify the histogram and ntuple to project (see the sketch below)
- user-friendly means
- show only the relevant details
- hide the complexity of the underlying system
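A hedged sketch of what the interactive-user facade could look like: one call that names the ntuple, the column and the histogram binning, with everything else hidden behind it. The function and parameter names are invented and the two helpers are placeholders, not the Lizard/DIANE API.

    # Hypothetical facade for the interactive user: "just specify the histogram
    # and the ntuple to project". The distributed machinery (XML job description,
    # submission, merging) hides behind one call; the helpers below are
    # placeholders standing in for the client-side sketch on slide 16.
    def _submit(job_description):
        raise NotImplementedError("placeholder for the XML/CORBA client call")

    def _to_histogram(result_message):
        raise NotImplementedError("placeholder for parsing the XML result message")

    def project(ntuple_path, column, nbins=50, lo=0.0, hi=1.0):
        """Facade: show only the relevant details, hide the underlying system."""
        job = {"ntuple": ntuple_path, "column": column,
               "bins": nbins, "low": lo, "high": hi}
        return _to_histogram(_submit(job))

    # Interactive use (illustrative): hist = project("paw.ntuple", "px", nbins=100)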
40. Facade for End-User Analysis
41. Ntuple Projection Example
- example of semi-interactive analysis
- data: 30 MB HBOOK ntuple / 37K rows / 160 columns
- time: minutes ... hours
- timings
- desktop (400 MHz, 128 MB RAM): ca. 4 minutes
- standalone lxplus (800 MHz, SMP, 512 MB RAM): ca. 45 sec
- 6 lxplus workers: ca. 18 sec
- why do 6 workers take 18 sec rather than 45/6 ≈ 7.5 sec? (see the estimate below)
- the job is small, so a big fraction of the time is compilation and DLL loading rather than computation
- pre-installing the application would improve the speed
- caveat: example run on AFS and public machines
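A back-of-envelope reading of those timings, using a simple fixed-overhead model (the model is an assumption, not something stated on the slide): with a serial time of about 45 s and 6 workers, perfect scaling would give 7.5 s, so the observed 18 s corresponds to roughly 10 s of per-job startup cost (compilation, DLL loading), consistent with the explanation above.

    # Rough fixed-overhead estimate (assumed model: t_parallel ~ overhead + t_serial / n)
    t_serial, t_parallel, n_workers = 45.0, 18.0, 6
    overhead = t_parallel - t_serial / n_workers
    print(f"estimated startup overhead ~ {overhead:.1f} s per job")  # ~ 10.5 s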