Title: Locality Sensitive Distributed Computing Exercise Set 2
1. Locality Sensitive Distributed Computing, Exercise Set 2
David Peleg, Weizmann Institute
2. Basic partition construction algorithm
Simple distributed implementation of Algorithm BasicPart:
a single thread of computation (a single locus of activity at any given moment).
3. Basic partition construction algorithm
Components:
- ClusterCons: procedure for constructing a cluster around a chosen center v
- NextCtr: procedure for selecting the next center v around which to grow a cluster
- RepEdge: procedure for selecting a representative inter-cluster edge between any two adjacent clusters
4. Cluster construction procedure ClusterCons
Goal: invoked at a center v, construct the cluster and a BFS tree (rooted at v) spanning it.
Tool: a variant of Dijkstra's BFS algorithm.
5. Recall: Dijkstra's BFS algorithm
(figure: the tree constructed so far and the exploration of phase p+1)
6. Main changes to Algorithm DistDijk
1. Ignoring covered vertices: the global BFS algorithm sends exploration messages to all neighbors except those known to be in the tree; the new variant also ignores vertices known to belong to previously constructed clusters.
2. Bounding depth: the BFS tree is grown to a limited depth, adding new layers tentatively, based on the halting condition |Γ(S)| < n^{1/k}·|S|.
7. Distributed Implementation
Before deciding to expand the tree T by adding a newly discovered layer L:
- Count the vertices in L by a convergecast process.
- Leaf w ∈ T: set Zw = number of its new children in L.
- Internal vertex: add up the counts received from its children and upcast the sum.
8. Distributed Implementation
- Root: compare the final count Zv to the total number of vertices in T (known from the previous phase).
- If the ratio is ≥ n^{1/k}, broadcast the next Pulse message (confirm the new layer and start the next phase).
- Otherwise, broadcast a Reject message (reject the new layer and complete the current cluster).
The final broadcast step has 2 more goals:
- mark the cluster by a unique name (e.g., the ID of the root),
- inform all vertices of the new cluster name.
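A minimal sequential sketch of this count-and-decide step (a plain Python simulation, not a message-passing program); the names count_layer and decide_expansion and the dict-based tree representation are illustrative assumptions, not from the slides.

    def count_layer(root, children, new_children):
        """Convergecast: a leaf reports Zw = number of its newly discovered
        children; an internal vertex adds its subtrees' counts and upcasts."""
        def upcast(v):
            total = len(new_children.get(v, []))   # Zw: new children of v in L
            for c in children.get(v, []):          # internal vertex: sum the
                total += upcast(c)                 # counts upcast by its children
            return total
        return upcast(root)

    def decide_expansion(tree_size, layer_count, n, k):
        """Root's decision, read here as the halting condition of slide 6
        with Gamma(T) = T u L: confirm while |T| + |L| >= n^(1/k) * |T|."""
        if tree_size + layer_count >= (n ** (1.0 / k)) * tree_size:
            return "Pulse"    # confirm the new layer and start the next phase
        return "Reject"       # reject the layer and complete the current cluster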
9. Distributed Implementation (cont)
This information is used to define cluster borders: once a cluster is complete, each vertex in it informs all its neighbors of its new residence.
⇒ Nodes of the cluster under construction know which neighbors already belong to existing clusters.
10. Center selection procedure NextCtr
Fact: the algorithm's center of activity is always located at the currently constructed cluster C.
Idea: select as the center of the next cluster some vertex v adjacent to C (v taken from the rejected layer).
Implementation: via a convergecast process (a leaf picks an arbitrary neighbor from the rejected layer and upcasts it to its parent; an internal node upcasts an arbitrary candidate).
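A small sketch of this candidate convergecast, assuming each vertex knows its tree children and (if it is a leaf) some arbitrary neighbor in the rejected layer, or None; pick_next_center and its dict arguments are hypothetical names.

    def pick_next_center(root, children, rejected_neighbor):
        """Upcast one arbitrary candidate center: a leaf proposes a rejected
        neighbor (or None); an internal vertex forwards any non-None candidate."""
        def upcast(v):
            candidates = [rejected_neighbor.get(v)]
            candidates += [upcast(c) for c in children.get(v, [])]
            return next((x for x in candidates if x is not None), None)
        return upcast(root)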
11. Center selection procedure (NextCtr)
Problem: what if the rejected layer is empty?
(It might still be that the entire process is not yet complete: there may be some yet-unclustered nodes elsewhere in G.)
(figure: clusters grown around r0)
12. Center selection procedure (NextCtr)
Solution: traverse the graph, using the cluster construction procedure within a global search procedure.
(figure: clusters grown around r0)
13. Distributed Implementation
- Use a DFS algorithm for traversing the tree of constructed clusters.
- Start at the originator vertex r0 and invoke ClusterCons to construct the first cluster.
- Whenever the rejected layer is nonempty, choose one rejected vertex as the next cluster center.
- Each cluster center marks a parent cluster in the cluster DFS tree, namely, the cluster from which it was selected.
14. Distributed Implementation (cont)
DFS algorithm (cont):
- Once the search cannot progress forward (the rejected layer is empty), the DFS backtracks to the previous cluster and looks for a new center among its neighboring nodes.
- If no such neighbors are available, the DFS process continues backtracking on the cluster DFS tree.
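A sequential sketch of this cluster-level DFS, assuming cluster_cons(center, covered) grows a cluster around center (ignoring covered vertices) and returns its vertex set, and neighbors(v) returns the neighbors of v in G; all names are illustrative and the message-passing details are omitted.

    def basic_part_dfs(r0, cluster_cons, neighbors):
        covered = set()
        clusters = []
        path = []                               # indices of clusters on the DFS path

        center = r0
        while center is not None:
            c = cluster_cons(center, covered)   # grow the next cluster
            covered |= c
            clusters.append(c)
            path.append(len(clusters) - 1)      # its parent cluster is path[-2]

            center = None
            while path and center is None:
                rejected = {u for v in clusters[path[-1]]
                            for u in neighbors(v) if u not in covered}
                if rejected:
                    center = rejected.pop()     # forward step: next cluster center
                else:
                    path.pop()                  # rejected layer empty: backtrack
        return clusters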
15. Inter-cluster edge selection RepEdge
Goal: select one representative inter-cluster edge between every two adjacent clusters C and C'.
E(C,C') = the set of edges connecting C and C' (known to their endpoints in C, as C vertices know the cluster residence of each neighbor).
16. Inter-cluster edge selection RepEdge
⇒ A representative edge can be selected by a convergecast process over all edges of E(C,C').
Requirement: C and C' must select the same edge.
Solution: use a unique ordering of the edges and pick the minimum edge of E(C,C').
Q: How can a unique edge order be defined from unique vertex IDs?
17. Inter-cluster edge selection (RepEdge)
E.g., define the ID-weight of an edge e = (v,w), where ID(v) < ID(w), as the pair ⟨ID(v), ID(w)⟩, and order ID-weights lexicographically.
This ensures distinct weights and allows a consistent selection of inter-cluster edges.
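A minimal sketch of this ID-weight ordering, assuming vertex names are their IDs; the helpers id_weight and representative_edge are illustrative.

    def id_weight(edge):
        v, w = edge
        return (min(v, w), max(v, w))   # the pair <ID(v), ID(w)> with ID(v) < ID(w)

    def representative_edge(candidates):
        """Both C and C' pick the minimum-ID-weight edge of E(C, C'), so they
        agree on the representative because the order is total and global."""
        return min(candidates, key=id_weight)

    # e.g. representative_edge([(7, 3), (2, 9), (3, 5)]) == (2, 9)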
18. Inter-cluster edge selection (RepEdge)
Problem: cluster C must carry out the selection process for every adjacent cluster C' individually.
Solution:
- Inform each C vertex of the identities of all clusters adjacent to C, by a convergecast and broadcast.
- Pipeline the individual selection processes.
19. Analysis
Let (C1, C2, ..., Cp) be the clusters constructed by the algorithm.
For cluster Ci:
- Ei = edges with at least one endpoint in Ci
- ni = |Ci|, mi = |Ei|, ri = Rad(Ci)
20. Analysis (cont)
ClusterCons: the depth-bounded Dijkstra procedure constructs Ci and its BFS tree in O(ri^2) time and O(ni·ri + mi) messages.
⇒ Time(ClusterCons) = Σi O(ri^2) ≤ Σi O(ri·k) ≤ k·Σi O(ni) = O(kn)
Q: Prove an O(n) bound.
21. Analysis (cont)
Constructing Ci and its BFS tree costs O(ri^2) time and O(ni·ri + mi) messages.
⇒ Comm(ClusterCons) = Σi O(ni·ri + mi)
Each edge occurs in at most 2 distinct sets Ei, and Σi ni·ri ≤ k·Σi ni = kn, hence Comm(ClusterCons) = O(nk + |E|).
22. Analysis (NextCtr)
The DFS process on the cluster tree is more expensive than a plain DFS:
a DFS step, i.e., visiting a cluster Ci and deciding the next step, requires O(ri) time and O(ni) communication.
23. Analysis (NextCtr)
- The DFS visits the clusters of the cluster tree O(p) times in total.
- The entire DFS process (not counting the invocations of Procedure ClusterCons) requires:
- Time(NextCtr) = O(p·k) = O(nk)
- Comm(NextCtr) = O(p·n) = O(n^2)
24. Analysis (RepEdge)
si = number of neighboring clusters surrounding Ci.
Convergecasting the ID of one neighboring cluster C' in Ci costs O(ri) time and O(ni) messages.
For all si neighboring clusters: O(si + ri) time (by pipelining) and O(si·ni) messages.
25. Analysis (RepEdge)
The pipelined inter-cluster edge selection is similar.
As si ≤ n, we get:
Time(RepEdge) = maxi O(si + ri) = O(n)
Comm(RepEdge) = Σi O(si·ni) = O(n^2)
26. Analysis
Thm: Distributed Algorithm BasicPart requires Time = O(nk) and Comm = O(n^2).
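A short tally of the three components from slides 20-25, assuming |E| ≤ n^2; this only restates the bounds already derived above.

    \begin{align*}
    \mathrm{Time}(\mathrm{BasicPart}) &= \mathrm{Time}(\mathrm{ClusterCons}) + \mathrm{Time}(\mathrm{NextCtr}) + \mathrm{Time}(\mathrm{RepEdge})\\
      &= O(nk) + O(nk) + O(n) = O(nk),\\
    \mathrm{Comm}(\mathrm{BasicPart}) &= O(nk + |E|) + O(n^2) + O(n^2) = O(n^2).
    \end{align*}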
27. Sparse spanners
Example: the m-dimensional hypercube Hm = (Vm, Em), where Vm = {0,1}^m and Em = {(x,y) : x and y differ in exactly one bit}.
|Vm| = 2^m, |Em| = m·2^{m-1}, diameter m.
Ex: Prove that for every m ≥ 0, the m-cube has a 3-spanner with at most 7·2^m edges.
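A small sketch that builds Hm as defined above and checks |Vm| = 2^m and |Em| = m·2^{m-1}; it does not construct the 3-spanner asked for in the exercise, and the function name hypercube is illustrative.

    from itertools import product

    def hypercube(m):
        V = [''.join(bits) for bits in product('01', repeat=m)]
        E = {(x, y) for x in V for y in V
             if x < y and sum(a != b for a, b in zip(x, y)) == 1}
        return V, E

    for m in range(1, 6):
        V, E = hypercube(m)
        assert len(V) == 2 ** m and len(E) == m * 2 ** (m - 1)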
28. Regional Matchings
A locality-sensitive tool for distributed match-making.
29. Distributed match-making
A paradigm for establishing a client-server connection in a distributed system (via specified rendezvous locations in the network).
- Ads of server v are written in the locations Write(v).
- Client u reads ads in the locations Read(u).
30. Regional Matchings
Requirement: the read and write sets must intersect:
for every v,u ∈ V, Write(v) ∩ Read(u) ≠ ∅,
so that client u must find an ad of server v.
31. Regional Matchings (cont)
Distance considerations are taken into account: client u must find an ad of server v only if they are sufficiently close.
l-regional matching: a collection of read and write sets RW = { Read(v), Write(v) : v ∈ V } s.t. for every v,u ∈ V,
dist(u,v) ≤ l ⇒ Write(v) ∩ Read(u) ≠ ∅
32. Regional Matchings (cont)
Degree parameters:
Dwrite(RW) = max_{v∈V} |Write(v)|
Dread(RW) = max_{v∈V} |Read(v)|
33. Regional Matchings (cont)
Radius parameters:
Strwrite(RW) = max_{u,v∈V} { dist(u,v) : u ∈ Write(v) } / l
Strread(RW) = max_{u,v∈V} { dist(u,v) : u ∈ Read(v) } / l
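A sketch of the definitions on slides 31-33 as a checker, assuming dist(u, v) is available and RW is given as two dicts, Read and Write, mapping each vertex to its set of rendezvous vertices; the function names are illustrative.

    def is_regional_matching(V, Read, Write, dist, l):
        """dist(u, v) <= l must imply that Write(v) and Read(u) intersect."""
        return all(Write[v] & Read[u]
                   for u in V for v in V if dist(u, v) <= l)

    def matching_parameters(V, Read, Write, dist, l):
        """Degree and radius parameters of the regional matching RW."""
        d_read = max(len(Read[v]) for v in V)
        d_write = max(len(Write[v]) for v in V)
        str_read = max(dist(u, v) for v in V for u in Read[v]) / l
        str_write = max(dist(u, v) for v in V for u in Write[v]) / l
        return d_read, d_write, str_read, str_write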
34. Regional matching construction
Given a graph G and k,l ≥ 1, construct a regional matching RW_{l,k}:
- Set S = the l-neighborhood cover { Γl(v) : v ∈ V }.
35. Regional matching construction
- Build a coarsening cover T as in the Max-Deg-Cover Theorem.
36. Regional matching construction
- Select a center vertex r0(T) in each cluster T ∈ T.
37. Regional matching construction
- Select for every v a cluster Tv ∈ T s.t. Γl(v) ⊆ Tv.
(figure: Γl(v) contained in the cluster Tv = T1)
38. Regional matching construction
Set:
- Read(v) = { r0(T) : v ∈ T }
- Write(v) = { r0(Tv) }
(figure example: Read(v) = {r1, r2, r3}, Write(v) = {r1})
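A sketch of this construction, assuming the coarsening cover is given as a list of vertex sets (cover), center[i] is the chosen center r0 of cover[i], and Tv[v] is the index of a cluster containing Γl(v); only the assembly of the read and write sets is shown, and all names are illustrative.

    def build_regional_matching(V, cover, center, Tv):
        # Read(v): centers of all clusters containing v; Write(v): center of Tv
        Read = {v: {center[i] for i, T in enumerate(cover) if v in T} for v in V}
        Write = {v: {center[Tv[v]]} for v in V}
        return Read, Write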
39. Analysis
Claim: the resulting RW_{l,k} is an l-regional matching.
Proof: Consider u,v such that dist(u,v) ≤ l.
Let Tv be the cluster s.t. Write(v) = { r0(Tv) }.
40. Analysis (cont)
By definition, u ∈ Γl(v). Also Γl(v) ⊆ Tv
⇒ u ∈ Tv ⇒ r0(Tv) ∈ Read(u) ⇒ Read(u) ∩ Write(v) ≠ ∅
41. Analysis (cont)
Thm: For every graph G(V,E,w) and l,k ≥ 1, there is an l-regional matching RW_{l,k} with
- Dread(RW_{l,k}) ≤ 2k·n^{1/k}
- Dwrite(RW_{l,k}) = 1
- Strread(RW_{l,k}) ≤ 2k+1
- Strwrite(RW_{l,k}) ≤ 2k+1
42. Analysis (cont)
Taking k = log n we get:
Corollary: For every graph G(V,E,w) and l ≥ 1, there is an l-regional matching RW_l with
- Dread(RW_l) = O(log n)
- Dwrite(RW_l) = 1
- Strread(RW_l) = O(log n)
- Strwrite(RW_l) = O(log n)