Title: JICOS: A Java-Centric Network Computing Service
1 JICOS: A Java-Centric Network Computing Service
- Peter Cappello, Christopher James Coakley
- Computer Science
- University of California, Santa Barbara
2 API Goals
- Application program is oblivious to:
- Number of processors
- Processor topology
- Inter-process communication
- Faulty compute servers
3 API
[Figure: task tree rooted at f(4); its descendants are f(3), f(2), f(1), and f(0) tasks.]
4 API
- DAC
- Common environment
- Input object
- Shared object
[Figure: task tree rooted at f(4); its descendants are f(3), f(2), f(1), and f(0) tasks.]
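The task tree on this slide has the shape of the recursion f(n) = f(n-1) + f(n-2). A minimal sketch of how such a divide-and-conquer computation could be expressed as tasks follows. It assumes f is the Fibonacci function with f(0) = 0 and f(1) = 1, which the slide does not state. The names `DacTask`, `FibTask`, and `DacRunner` are hypothetical and are not the JICOS API. The runner is a sequential stand-in for the runtime.

```java
import java.util.ArrayList;
import java.util.List;

class DacDemo {
    public static void main(String[] args) {
        DacRunner runner = new DacRunner();
        int result = runner.run(new FibTask(4));
        System.out.println("f(4) = " + result + ", tasks = " + runner.tasksExecuted);
    }
}

// Hypothetical task abstraction (not the JICOS API): a task is either atomic
// and solved directly, or it decomposes into subtasks whose results are composed.
abstract class DacTask {
    abstract boolean isAtomic();
    abstract int solve();                       // called only on atomic tasks
    abstract List<DacTask> decompose();         // called only on non-atomic tasks
    abstract int compose(List<Integer> parts);  // combines the subtask results
}

// Assumed example: f(n) = f(n-1) + f(n-2), with f(0) = 0 and f(1) = 1.
class FibTask extends DacTask {
    private final int n;

    FibTask(int n) { this.n = n; }

    boolean isAtomic() { return n < 2; }

    int solve() { return n; }

    List<DacTask> decompose() {
        List<DacTask> subtasks = new ArrayList<>();
        subtasks.add(new FibTask(n - 1));
        subtasks.add(new FibTask(n - 2));
        return subtasks;
    }

    int compose(List<Integer> parts) { return parts.get(0) + parts.get(1); }
}

// Sequential stand-in for the runtime. The task classes above never mention
// processors, topology, or communication.
class DacRunner {
    int tasksExecuted = 0;  // counts one per node of the task tree

    int run(DacTask task) {
        tasksExecuted++;
        if (task.isAtomic()) {
            return task.solve();
        }
        List<Integer> parts = new ArrayList<>();
        for (DacTask subtask : task.decompose()) {
            parts.add(run(subtask));
        }
        return task.compose(parts);
    }
}
```

Under these assumptions, running f(4) visits nine tree nodes, the same number of f(...) nodes as in the figure. Because the task code says nothing about where it runs, a runtime is free to place subtasks on any available processor.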
5 Architectural Goals
- Scalable
- Heterogeneous processors & OS
- Mobile code
- Support adaptively parallel computation
- Tolerate faulty compute servers
- Reduce or hide communication latency
6 Architecture
[Figure: one server (S) and twelve hosts (H).]
7 Architecture
[Figure: nodes labeled M and C; the link between them is labeled login, setComputation, getResult, logout.]
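The four calls named on this slide suggest a session lifecycle. The sketch below illustrates one plausible call order. Only the method names come from the slide. The interface `ComputeService`, its parameter and return types, and the in-process stand-in `LocalComputeService` are assumptions made for illustration and are not the JICOS API.

```java
import java.util.function.Supplier;

class ClientSessionDemo {
    public static void main(String[] args) {
        ComputeService service = new LocalComputeService();
        service.login();
        service.setComputation(() -> 6 * 7);  // placeholder computation
        Object result = service.getResult();
        service.logout();
        System.out.println("result = " + result);
    }
}

// Hypothetical interface: method names are from the slide, signatures are assumed.
interface ComputeService {
    void login();
    void setComputation(Supplier<?> computation);
    Object getResult();
    void logout();
}

// In-process stand-in that only enforces the call order; a real service
// would hand the computation to remote machines instead.
class LocalComputeService implements ComputeService {
    private boolean loggedIn = false;
    private Supplier<?> computation;

    public void login() { loggedIn = true; }

    public void setComputation(Supplier<?> computation) {
        requireLogin();
        this.computation = computation;
    }

    public Object getResult() {
        requireLogin();
        if (computation == null) {
            throw new IllegalStateException("no computation set");
        }
        return computation.get();
    }

    public void logout() {
        loggedIn = false;
        computation = null;
    }

    private void requireLogin() {
        if (!loggedIn) {
            throw new IllegalStateException("login first");
        }
    }
}
```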
8 Hiding Communication Latency: Task Caching
[Figure: task tree rooted at f(4); its descendants are f(3), f(2), f(1), and f(0) tasks.]
9 Hiding Communication Latency: Task Pre-fetching
[Figure: task tree rooted at f(4); its descendants are f(3), f(2), f(1), and f(0) tasks.]
10 Hiding Communication Latency: Execute Task on Server
[Figure: task tree rooted at f(4); its descendants are f(3), f(2), f(1), and f(0) tasks.]
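As a rough illustration of the "execute task on server" idea, the sketch below shows a dispatcher that runs a task in place when the application marks it for server execution, and queues it for hosts otherwise. The likely motivation, which is an assumption here, is that very short tasks (such as the sub-millisecond MinSolution tasks reported in the performance experiments) cost more to ship to a host than to run. All class and field names are hypothetical and are not the JICOS API.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ExecuteOnServerDemo {
    public static void main(String[] args) {
        TaskServerSketch server = new TaskServerSketch();
        server.submit(new SketchTask("branch-and-bound", false));
        server.submit(new SketchTask("min-solution", true));
        System.out.println("ran on server: " + server.ranOnServer);
        System.out.println("queued for hosts: " + server.hostQueue.size());
    }
}

// Hypothetical task carrying one application-set directive.
class SketchTask {
    final String name;
    final boolean executeOnServer;  // assumed flag, set by the application

    SketchTask(String name, boolean executeOnServer) {
        this.name = name;
        this.executeOnServer = executeOnServer;
    }
}

// Hypothetical dispatcher: honors the directive at submission time.
class TaskServerSketch {
    final Deque<SketchTask> hostQueue = new ArrayDeque<>();
    int ranOnServer = 0;

    void submit(SketchTask task) {
        if (task.executeOnServer) {
            // Stand-in for running the task on the server itself,
            // which avoids a round trip to a host.
            ranOnServer++;
        } else {
            hostQueue.addLast(task);  // waits for a host to request work
        }
    }
}
```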
11 Tolerating Faulty Hosts
- Transactions kill performance
13 Tolerating Faulty Hosts
[Figure: two proxies, each paired with a host (H) and holding that host's TASKS.]
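One way to read the proxy figure is that the server keeps, per host, a record of the tasks handed to that host, so that a host failure costs only those tasks. The sketch below follows that reading. It is an assumption rather than a description of the JICOS implementation, and the names `HostProxySketch` and `ServerSketch` are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class ProxyDemo {
    public static void main(String[] args) {
        ServerSketch server = new ServerSketch();
        server.ready.add("t1");
        server.ready.add("t2");
        server.ready.add("t3");

        HostProxySketch proxyA = new HostProxySketch();
        server.assignNext(proxyA);   // host A receives t1
        server.assignNext(proxyA);   // host A receives t2
        proxyA.completed("t1");      // t1 finishes before the failure
        server.hostFailed(proxyA);   // t2 returns to the ready queue

        System.out.println("ready after failure: " + server.ready);
    }
}

// Hypothetical server-side proxy: remembers the tasks its host has received
// but not yet completed.
class HostProxySketch {
    private final List<String> outstanding = new ArrayList<>();

    void assigned(String task) { outstanding.add(task); }

    void completed(String task) { outstanding.remove(task); }

    // Returns the unfinished tasks and forgets them.
    List<String> drain() {
        List<String> lost = new ArrayList<>(outstanding);
        outstanding.clear();
        return lost;
    }
}

// Hypothetical server: on host failure, only that host's unfinished tasks
// are put back in the ready queue.
class ServerSketch {
    final Deque<String> ready = new ArrayDeque<>();

    String assignNext(HostProxySketch proxy) {
        String task = ready.pollFirst();
        if (task != null) {
            proxy.assigned(task);
        }
        return task;
    }

    void hostFailed(HostProxySketch proxy) {
        ready.addAll(proxy.drain());
    }
}
```

In this sketch, recovery is limited to re-queuing the failed host's outstanding tasks, and no global transaction or rollback is involved.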
14 Performance Experiments
- Problem: 200-city TSP
- 61,295 BranchAndBound tasks (2.05 s)
- 30,647 MinSolution tasks (< 1 ms)
- 120-processor experiments use 3 processor types (the CX journal paper derives the formula)
16 Fault Tolerance Experiments
- Problem: 200-city TSP
- Killed p processors after 1,500 s, for p = 2, 6, 20, 24, 26, 30
- Overhead: actual time / ideal time
[Figure: configuration labeled S, 32 processors.]
18 Task Server Overhead
[Figure: machines labeled S/H, S, and H.]
- 22 processors, 11 hosts, 1 server, 12 machines. Time: 3114.8 s
- 22 processors, 11 hosts, 1 server, 11 machines. Time: 3115.1 s
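The two run times above differ by 0.3 s. As a quick check, assuming the 11-machine run is the one in which the server shares a machine with a host, the relative cost of that sharing is:

```latex
\frac{3115.1\,\mathrm{s} - 3114.8\,\mathrm{s}}{3114.8\,\mathrm{s}}
  = \frac{0.3}{3114.8}
  \approx 9.6 \times 10^{-5}
  \approx 0.01\%
```

A difference this small may be within run-to-run variation, which the slide does not report.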
19 Conclusions
- API: Cilk-like, with a common task environment
- Architecture: a network of servers, each serving many hosts
- Supports adaptive parallelism
- Efficiently tolerates faulty hosts
- Excellent speedups:
- 2 processors (1 host): 9 hours and 32 minutes
- 120 processors: < 12 minutes (96.66% of ideal)
- 3 application-controlled latency-hiding directives
- Small server overhead: run a host on the server
20 THANK YOU!
- URL: cs.ucsb.edu/projects/jicos
- Download:
- System
- Source
- Tutorial
22 A Distributed Computing Taxonomy
[Figure: taxonomy tree; branches are labeled with the criteria below or their negation (NOT).]
- Application fixes processor topology a priori
- Adaptively parallel
- Tolerates faulty compute servers
- Divide & Conquer API
23 Ancestry
[Figure: ancestry diagram of the systems below.]
- Cilk
- Linda
- Atlas
- Satin
- Javelin
- CX
- Piranha
- JICOS