Title: Toward Third Generation Desktop Grids: Private Virtual Cluster
1 Toward Third Generation Desktop Grids (Private Virtual Cluster)
- Ala Rezmerita, Franck Cappello
- INRIA
- Grand-Large.lri.fr
2 Agenda
- Basic Concepts of DGrids
- First and second generation Desktop Grids
- Third generation concept
- PVC
- Early evaluation
- Conclusion
3 Basic Concepts of Desktop Grids
- Bag of tasks, master-worker, divide and conquer applications
- Batch schedulers (clusters)
- Virtual machines (Java, .NET)
- Standard OS (Windows, Linux, Mac OS X)
- Cycle stealing (Desktop PCs)
- Condor (single administration domain)
4 First Generation DG
- Single application / Single user
- SETI@home (1998)
- Search for Extra-Terrestrial Intelligence
- 33.79 Teraflop/s (12.3 Teraflop/s for the ASCI White!), 2003
- DECRYPTHON
- Protein Sequence comparison
- RSA-155 (1996?)
- Breaking encryption keys
- COSM
5 First Gen DG Architecture
Centralized architecture
Monolithic architecture
[Diagram labels: User/Admin interface; Client application (params/results); Coordinator / Resource discovery; PC: Application, Scheduler, Parameters, Task, Data, Net, Results, OS, Sandbox, Protocols; Firewall/NAT]
6 Second Generation of DG
- Multi-applications / Multi-users platforms
- BOINC (2002?)
- SETI@home, Genome@home, XtremLab
- XTREMWEB (2001)
- XtremWeb-CH, GTRS, XW-HEP, etc.
- Platform (ActiveCluster), United Devices, Entropia, etc.
- Alchemi (.NET based)
7 Second Gen DG Architecture
Centralized architecture (split tasks/data management, inter-node communication)
Monolithic architecture
[Diagram labels: User/Admin interface; Client application (params/results); Coordinator / Scheduler (tasks); Data manager; Scheduler (tasks); PC: Application, Scheduler, Parameters, Task, Data, Net, Results, OS, Sandbox, Protocols; Firewall/NAT]
8 What we have learned from history
- Rigid architecture (open source: yes, modularity: no)
- Dedicated job scheduler
- Dedicated data management/file system
- Dedicated connection protocols
- Dedicated transport protocols
- Centralized architecture
- No direct communication
- Almost no security
- Restricted application domain (essentially High
Throughput Computing)
9 Third Generation Concept
- Modular architecture
- Pluggable / selectable job scheduler
- Pluggable / selectable data management/file system
- Pluggable / selectable connection protocols
- Pluggable / selectable transport protocols
- Decentralized architecture
- Direct communications
- Strong security
- Unlimited application domain (restrictions
imposed only by the platform performance)
[Diagram labels: PC (x3); User/Admin interface; Applications; Schedulers; Task; Data; Net; OS; Sandbox; Protocols]
10 3rd Gen Dgrid design
[Diagram: layered stack on each PC (x3), with a User/Admin interface]
- Applications: application binary codes
- Scheduler/runtime: Condor, MPI
- Data management: LUSTRE / BitTorrent
- OS: Xen / Virtual PC
- IP virtualization (virtualization at the IP level!)
- Connectivity / Security
- Network
11 PVC (Private Virtual Cluster)
- A generic framework that dynamically turns a set of resources belonging to different administration domains into a cluster
- Connectivity/Security
- Dynamically connects firewall-protected nodes without breaking the security rules of the local domains
- Compatibility
- Creates an execution environment for existing cluster applications and tools (schedulers, file systems, etc.)
12 PVC Architecture
- Broker
- Collects connection requests
- Forwards data between peers
- Realizes the virtualization
- Helps in the security negotiation
- Peer's modules
- Coordination
- Communication interposition
- Network virtualization
- Security
- Connectivity
13 Virtualization
- Virtual network on top of the real one
- Uses a virtual network interface
- Provides virtual to real IP address translation
- Features an IP range and DNS
- The proposed solution respects the IP standards
- Virtual IP: class E range of IP addresses (240.0.0.1 to 255.255.255.254)
- Virtual names: use of a virtual domain name (.pvc)
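The address translation described on this slide can be illustrated with a minimal Python sketch. It assumes a simple in-memory table; the class `VirtualAddressTable`, its methods, and the example host name and real address are hypothetical and are not taken from the PVC implementation. Only the class E range and the `.pvc` suffix come from the slide.

```python
import ipaddress

# Class E block (240.0.0.0/4), used on the slide for virtual addresses.
PVC_RANGE = ipaddress.ip_network("240.0.0.0/4")


def is_virtual(address):
    """Return True if the address belongs to the virtual (class E) range."""
    return ipaddress.ip_address(address) in PVC_RANGE


class VirtualAddressTable:
    """Hypothetical table mapping virtual IPs and .pvc names to real IPs."""

    def __init__(self):
        self._real_by_virtual = {}
        self._virtual_by_name = {}
        # hosts() lazily yields 240.0.0.1 ... 255.255.255.254
        self._free = PVC_RANGE.hosts()

    def register(self, host, real_ip):
        """Give a host the next free virtual IP and a name under .pvc."""
        virtual = next(self._free)
        self._real_by_virtual[virtual] = ipaddress.ip_address(real_ip)
        self._virtual_by_name[host + ".pvc"] = virtual
        return virtual

    def resolve(self, name):
        """Virtual DNS lookup: .pvc name to virtual IP."""
        return self._virtual_by_name[name]

    def to_real(self, virtual_ip):
        """Virtual to real IP address translation."""
        return self._real_by_virtual[ipaddress.ip_address(virtual_ip)]
```

For example, `VirtualAddressTable().register("node1", "192.168.1.10")` hands out `240.0.0.1`, the first usable address of the range. Applications then see only the virtual address or the `node1.pvc` name.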
14 Interposition
- Catch the application's communications and transform them for transparent tunneling
- Interposition techniques
- LibC overload (IP masquerading, modified kernel)
- Tun / Tap (IP-over-IP encapsulation)
- Netfilter (IP masquerading, kernel level)
- Any other
15 Interposition techniques
[Flowchart: path of an application packet down to the standard network interface]
- ld_preload? Yes: (1) user space, socket interface, LibC: security check, connection, IP masquerading
- ld_preload? No: LibC, then kernel route selection; 240.X.X.X traffic goes to the interposition modules
- (2) Tun/Tap: security check, connection, encapsulation
- (3) Netfilter: security challenge, connection, IP masquerading
- Group check
- Std network interface
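One possible reading of the decision flow on this slide is sketched below in Python. The function `select_path` and the `Path` enum are hypothetical names introduced for illustration. The rule that traffic to non-virtual addresses bypasses interposition entirely is an assumption, not something the slide states.

```python
import ipaddress
from enum import Enum

PVC_RANGE = ipaddress.ip_network("240.0.0.0/4")  # the 240.X.X.X virtual range


class Path(Enum):
    LIBC = "LibC overload: security check, connection, IP masquerading"
    TUN_TAP = "Tun/Tap: security check, connection, encapsulation"
    NETFILTER = "Netfilter: security challenge, connection, IP masquerading"
    STANDARD = "standard network interface, no interposition"


def select_path(dest_ip, libc_preloaded, kernel_module=Path.TUN_TAP):
    """Pick the interposition path for one outgoing packet (hypothetical helper)."""
    if ipaddress.ip_address(dest_ip) not in PVC_RANGE:
        # Assumption: traffic to real addresses is left untouched.
        return Path.STANDARD
    if libc_preloaded:
        # The application was started with the overloaded LibC (ld_preload).
        return Path.LIBC
    # Otherwise the kernel routes virtual traffic to the configured module.
    return kernel_module
```

With these assumptions, `select_path("240.0.0.5", libc_preloaded=False)` selects the Tun/Tap path, while a packet for `192.168.0.1` goes straight to the standard interface.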
16 Connectivity
- Goal: direct connections between the peers
- Firewall/NAT traversal techniques
- UPnP: firewall configuration technique
- UDP/TCP hole punching: used by online gaming and voice over IP
- Traversing TCP: novel technique
- Any other
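As an illustration of the UDP hole punching idea, the sketch below runs two peers on the loopback interface. It is a simplification under stated assumptions: no NAT is involved on loopback, a plain dictionary stands in for the broker that tells each peer the other's endpoint, and the helper names (`open_peer`, `punch`, `demo`) are hypothetical.

```python
import socket


def open_peer():
    """Create a UDP socket bound to an ephemeral loopback port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", 0))
    sock.settimeout(2.0)
    return sock


def punch(sock, remote_addr, attempts=3):
    """Send a few datagrams toward the remote endpoint.

    Behind a real NAT, these outgoing packets create the mapping that
    later lets the other peer's packets in.
    """
    for _ in range(attempts):
        sock.sendto(b"punch", remote_addr)


def demo():
    a, b = open_peer(), open_peer()
    try:
        # Stand-in for the broker: it knows both endpoints and shares them.
        endpoints = {"a": a.getsockname(), "b": b.getsockname()}
        # Both peers send toward each other at (roughly) the same time.
        punch(a, endpoints["b"])
        punch(b, endpoints["a"])
        data_at_b, src_b = b.recvfrom(1024)
        data_at_a, src_a = a.recvfrom(1024)
        return (data_at_a, src_a == endpoints["b"],
                data_at_b, src_b == endpoints["a"])
    finally:
        a.close()
        b.close()
```

After the exchange, each peer holds a datagram that came directly from the other, which is the "direct connection" the slide sets as the goal.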
17 Security
- Fulfil the security policy of every local domain
- Enforce a cross-domain security policy
- Master peer in every virtual cluster
- Implements global security policy
- Registers new hosts
- PVC peer must
- Check the target peer's membership of the same virtual cluster
- After the connection establishment, authenticate the target peer's identity
[Diagram labels: B; Master (M): PkM, PKC; PKM; Put PKC / Get PKM; C; S. Legend: PK = public key, Pk = private key]
18 Security
The security protocol is based on a double asymmetric key mechanism
- (1)/(2) Membership of the same virtual cluster
- (3)/(4)/(5) Mutual authentication
- The PVC security protocol ensures that
- Only hosts of the same virtual cluster are connected
- Only trusted connections become visible to the application
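The slides do not spell out the messages of the protocol, so the following Python sketch is only one plausible reading: a master registry answers the membership question, and a signed random challenge provides authentication in each direction. Everything here is an assumption made for illustration, including the names `Master` and `authenticate` and the use of textbook RSA with tiny primes, which is not secure and only keeps the example dependency-free.

```python
import secrets


def make_keypair(p, q, e=17):
    """Textbook RSA from two small primes: toy sizes, no padding, NOT secure."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse (Python 3.8+)
    return (n, e), (n, d)  # (public key, private key)


def sign(private_key, value):
    n, d = private_key
    return pow(value, d, n)


def verify(public_key, value, signature):
    n, e = public_key
    return pow(signature, e, n) == value % n


class Master:
    """Hypothetical master peer: registers hosts and serves public keys."""

    def __init__(self):
        self._members = {}

    def register(self, name, public_key):
        self._members[name] = public_key

    def lookup(self, name):
        return self._members.get(name)  # None if not a member


def authenticate(master, claimed_name, prove):
    """Check membership, then challenge the peer to sign a fresh nonce."""
    public_key = master.lookup(claimed_name)
    if public_key is None:
        return False  # not in the same virtual cluster
    nonce = secrets.randbelow(public_key[0] - 2) + 2
    return verify(public_key, nonce, prove(nonce))


# Two registered peers authenticate each other (mutual authentication).
master = Master()
pk_c, sk_c = make_keypair(61, 53)
pk_s, sk_s = make_keypair(89, 97)
master.register("C", pk_c)
master.register("S", pk_s)
mutual_ok = (authenticate(master, "S", lambda n: sign(sk_s, n))
             and authenticate(master, "C", lambda n: sign(sk_c, n)))
```

In this sketch a host that was never registered with the master fails at the membership step, before any challenge is issued.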
19 Performance Evaluation
- PVC objectives: minimal overhead for communications
- Network performance with/without PVC
- Connection establishment
- Execution of real applications without modification in MPI
- NAS benchmarks
- MPIPOV program
- Scientific application DOT
- Bag of Tasks typical setting (BLAST on DNA database)
- Condor flocking strategies
- Broadcast protocol
- Spot Checking / Replication voting
- Evaluation platforms
- Grid5000 (Grid eXplorer Cluster)
- DSL-Lab (PC@home, connected on the public ADSL network)
20 Communication perf.
Connection overhead
Direct Communication overhead
21 Connection overhead
- Performed on the DSL-Lab platform using a specific test suite
Reasonable overhead in the context of P2P applications
22 Bandwidth overhead
- Technique: LibC overload
- Performed with Netperf on a local PC cluster with three different Ethernet networks: 1 Gbps, 100 Mbps and 10 Mbps
[Chart residue: 717 (5), 715 (5), 720 (5)]
23 Communication overhead
[Chart residue: results per technique (Tun / Tap, Netfilter)]
24 MPI applications
- Applications
- NAS benchmarks class A (EP, FT, CG and BT)
- DOT
- MPIPOV
- Results for NAS EP
- Measured acceleration is almost linear
- Overhead lower than 5%
- Other experiments: loss of performance due to the ADSL network
25 Typical configuration for Bag of tasks (SETI@home like)
[Diagram: layered stack on each PC (x3), using PVC]
- Application: BLAST (DNA)
- Result certification
- Scheduler/runtime: Condor
- Data management: BitTorrent
- OS
- PVC: Connectivity / Virtualization
- Network
26 Broadcast protocol?
- Question: BitTorrent instead of the Condor transport protocol?
- BLAST application
- DNA database (3.2 GB)
- 64 nodes
Condor: exec. time grows proportionally with the number of jobs
Condor + BitTorrent: exec. time stays almost constant
27 Distribution of job management?
- Question: how many job managers for a set of nodes?
- Condor Flocking
- 64 sequences of 10 BLAST jobs (between 40 and 160 seconds, with an average of 70 seconds)
28 Result certification?
- Question: how to detect bad results?
- Spot Checking (blacklisting)
- Replication voting
- Not implemented in Condor
- 20 lines of script for both
- Test: 70 machines
- 10% saboteurs (randomly chosen)
- How many jobs are required to detect the 7 saboteurs?
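The two certification techniques named on this slide can be sketched in a few lines of Python. This is not the script used in the experiment. It is a minimal illustration assuming that a saboteur always returns a wrong result and that spot-check jobs are chosen at random with a fixed rate. The function names and the `spot_check_rate` parameter are hypothetical.

```python
import random
from collections import Counter


def majority_vote(results):
    """Replication voting: return the strict-majority result, else None."""
    value, count = Counter(results).most_common(1)[0]
    return value if count > len(results) / 2 else None


def jobs_until_all_caught(n_machines, n_saboteurs, spot_check_rate, seed=0):
    """Spot checking: count jobs issued until every saboteur is blacklisted."""
    rng = random.Random(seed)
    saboteurs = set(rng.sample(range(n_machines), n_saboteurs))
    blacklist = set()
    jobs = 0
    while blacklist != saboteurs:
        jobs += 1
        # Blacklisted machines no longer receive jobs.
        worker = rng.choice([m for m in range(n_machines) if m not in blacklist])
        is_spot_check = rng.random() < spot_check_rate
        # Assumption: a saboteur always fails a job whose result is known.
        if is_spot_check and worker in saboteurs:
            blacklist.add(worker)
    return jobs
```

Calling `jobs_until_all_caught(70, 7, spot_check_rate=0.1)` gives one simulated answer to the question on the slide for the 70-machine setting. The value depends on the seed and on the assumed spot-check rate, so it should not be read as a measured result.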
29 Applications
- Computational/Data Desktop Grids
- BOINC/XtremWeb-like applications
- User-selected scheduler (Condor, Torque, OAR, etc.)
- Communication between workers (MPI, distributed file systems, distributed archive, etc.)
- Instant Grid
- Connecting resources without effort: family Grid, Inter-School Grids, etc.
- Sharing resources and content. Example: Apple TV synchronized with a remote iTunes
- Wall-traversing ("passe-muraille") runtime
- OpenMPI
- Extended Clusters
- Run applications and manage resources beyond the limits of admin domains
30 Conclusion
- Third Generation Desktop Grids (2007)
- Break the rigidity, again!
- Let users choose and run their favourite environments (engineers may help)
- PVC: connectivity, security, compatibility
- Dynamically establishes virtual clusters
- Modular, extensible architecture
- Features properties required for 3rd Gen. Desktop Grids
- Security model
- Use of applications without any modification
- With minimal communication overhead
- Ongoing work (on Grid5000)
- Test the scalability and fault tolerance of cluster tools in the Dgrid context
- Test more applications
- Test and improve the scalability of the security system
31 Questions?
32 Condor flocking with PVC
- Question: may we use several schedulers?
- Use of a synthetic job that consumes resources (1 sec.)
- Sequence of 1000 submissions
- Submits the synthetic jobs from the same host to the Condor pool
- Future work: make Condor Flocking fault tolerant.