What cannot be computed locally!
  • by Fabian Kuhn, Thomas Moscibroda, and Roger
    Wattenhofer
  • (PODC 2004)
  • Presenter Yongwook Choi

Basic Problems
  • Minimum vertex cover (MVC)
  • A vertex cover for a graph $G = (V, E)$ is a
    subset of nodes $V' \subseteq V$ such that, for
    each edge, at least one of the end-points belongs
    to $V'$.
  • Minimum dominating set (MDS)
  • A dominating set S is a subset of the nodes of a
    graph G such that all nodes of G are either in S
    or they have a neighbor in S.
  • Maximal matching (MM)
  • A maximal matching of a graph G is a maximal set
    of edges which do not share common end-points.
  • Maximal independent set (MIS)
  • A maximal independent set of a graph G is a
    maximal set of non-adjacent nodes.
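
The four definitions above can be turned into small checkers. This is a minimal sketch, assuming a graph is given as a node list plus a list of edge pairs; the function names are illustrative and do not come from the paper.

```python
def is_vertex_cover(edges, cover):
    """Every edge has at least one end-point in `cover`."""
    return all(u in cover or v in cover for u, v in edges)

def is_dominating_set(nodes, edges, dom):
    """Every node is in `dom` or has a neighbor in `dom` (dom must be a set)."""
    neighbors = {v: set() for v in nodes}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    return all(v in dom or bool(neighbors[v] & dom) for v in nodes)

def is_maximal_matching(edges, matching):
    """Matched edges share no end-point, and no further edge can be added."""
    matched = [x for e in matching for x in e]
    if len(matched) != len(set(matched)):
        return False
    return all(u in matched or v in matched for u, v in edges)

def is_maximal_independent_set(nodes, edges, ind):
    """No two chosen nodes are adjacent, and every other node has a chosen neighbor."""
    if any(u in ind and v in ind for u, v in edges):
        return False
    return is_dominating_set(nodes, edges, ind)

# Example graph: the path 0 - 1 - 2 - 3
nodes = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3)]
```

On this path, `{1, 2}` is both a vertex cover and a dominating set, `[(1, 2)]` is a maximal (but not maximum) matching, and `{0, 2}` is a maximal independent set.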

Main Results
  • Lower bounds for the distributed approximation of
    MVC and MDS.
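  • In k communication rounds, MVC and MDS can only
    be approximated by factors $\Omega(n^{c/k^2}/k)$
    and $\Omega(\Delta^{1/k}/k)$.
  • The number of rounds required in order to achieve
    a constant or even only polylogarithmic
    approximation ratio is at least
    $\Omega(\sqrt{\log n/\log\log n})$ and
    $\Omega(\log\Delta/\log\log\Delta)$.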
  • The same time lower bounds for MM and MIS can be
    shown by a simple reduction.
  • Previously the best known lower bound was
    $\Omega(\log^* n)$.

Main Results
  • Related work
  • Kuhn and Wattenhofer, 2003
  • In k rounds, MVC can be approximated by a factor
  • For a polylogarithmic approximation ratio,
  • $k = O(\log\Delta/\log\log\Delta)$
  • The lower bound is tight.
  • For a constant approximation ratio,
  • $k = O(\log\Delta)$
  • The lower bound is tight up to a factor of
    $O(\log\log\Delta)$.

  • Classic message passing model
  • Undirected graph $G = (V, E)$
  • V: processors
  • E: communication channels
  • Each node can send an arbitrarily long message to
    each of its neighbors
  • Local computations are for free
  • Initially, nodes have no knowledge about the
    network graph
  • Each node only knows its own unique identifier

  • In k rounds, a node v may collect the IDs and
    interconnections of all nodes up to distance k
    from v
  • $T_{v,k}$: topology seen by v after k rounds
  • $L(T_{v,k})$: labelling (or assignment of IDs) of
    $T_{v,k}$
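
What a node can learn in k rounds can be sketched with a breadth-first search. This is an illustration only: the name `k_hop_view` is hypothetical, and leaving out edges between two nodes that are both at distance exactly k is an assumption about the precise definition of the view.

```python
from collections import deque

def k_hop_view(adj, v, k):
    """Return (nodes, edges) that v can have collected after k rounds.

    adj maps each node ID to an iterable of neighbor IDs.
    """
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        if dist[u] == k:
            continue  # information from beyond distance k has not arrived yet
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    # Assumption: an edge is visible only if one end-point is at distance < k.
    edges = {frozenset((u, w)) for u in dist if dist[u] < k for w in adj[u]}
    return set(dist), edges

# Example: the path 0 - 1 - 2 - 3 - 4
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
view_nodes, view_edges = k_hop_view(path, 2, 1)
```

A deterministic k-local algorithm is then any function of this view together with the IDs of the nodes in it.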

  • The best algorithm
  • Deterministic algorithms
  • Collect its k-neighborhood
  • Make a decision based on $(T_{v,k}, L(T_{v,k}))$
  • Randomized algorithms
  • Also depend on the randomness

  • This model is the strongest possible model when
    proving lower bounds for local computations
  • Focuses entirely on the locality
  • Abstracts away other issues
  • Need for small messages
  • Fast local computations
  • The lower bounds are a true consequence of
    locality

Outline of the Proof
  • Construct a graph $G_k = (V, E)$ for each positive
    integer k.
  • $G_k$ contains a bipartite subgraph S with node
    sets $C_0$ and $C_1$.
  • The goal is to construct $G_k$ in such a way that
    all nodes in S see the same topology within
    distance k.
  • In a globally optimal solution
  • All edges of S may be covered by nodes in $C_1$.
  • In a local algorithm
  • The decision of whether a node joins the vertex
    cover depends only on its local view.
  • Because adjacent nodes in S see the same
    topology, every algorithm adds a large portion of
    the nodes in $C_0$ to its vertex cover.

The Cluster Tree
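  • The nodes of graph $G_k$ can be grouped into
    disjoint sets (clusters) which are linked to each
    other as bipartite graphs.
  • Cluster tree $CT_k = (C, A)$
  • C: set of clusters
  • A: doubly labelled arcs
  • e.g. $L(C_0, C_1) = (d_0, d_1)$: each node in
    $C_0$ has $d_0$ neighbors in $C_1$, and each node
    in $C_1$ has $d_1$ neighbors in $C_0$.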

The Cluster Tree
  • $CT_k$ is recursively defined
  • Given $CT_{k-1}$, we obtain $CT_k$ in two steps.
  • For each inner-cluster $C_i$, add a new
    leaf-cluster $C'_i$ with
    $L(C_i, C'_i) = (d_k, d_{k+1})$
  • For each leaf-cluster $C_i$ with
    $L(C_{i'}, C_i) = (d_p, d_{p+1})$, add k new
    leaf-clusters $C'_j$ with
    $L(C_i, C'_j) = (d_j, d_{j+1})$ for
    $j = 0, \dots, k$, $j \ne p+1$
  • Level of a cluster
  • Distance to $C_0$
  • $CT_k$ defines just the number of neighbors for
    each node in each cluster.
  • $CT_k$ does not specify the actual adjacencies.
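
The two-step recursion can be sketched in a few lines. The slides do not state the base case, so the tree used for k = 1 below (four clusters with arcs labelled (d_0,d_1), (d_1,d_2), (d_0,d_1)) is an assumption, as are the names `cluster_tree` and `level`. An arc label (d_p, d_{p+1}) is stored as the single index p.

```python
def cluster_tree(k):
    """Return (parent, label) describing the cluster tree for a given k >= 1.

    parent[c] is the parent of cluster c (cluster 0 is the root C_0);
    label[c] = p means the arc (parent[c], c) is labelled (d_p, d_{p+1}).
    """
    # Assumed base case for k = 1: arcs (C0,C1), (C0,C2), (C1,C3).
    parent = {1: 0, 2: 0, 3: 1}
    label = {1: 0, 2: 1, 3: 0}
    for step in range(2, k + 1):
        clusters = [0] + list(parent)
        inner = set(parent.values())
        leaves = [c for c in clusters if c not in inner]
        next_id = len(clusters)
        # Step 1: each inner cluster gets one new leaf labelled (d_step, d_{step+1}).
        for c in sorted(inner):
            parent[next_id], label[next_id] = c, step
            next_id += 1
        # Step 2: each old leaf reached via (d_p, d_{p+1}) gets one new leaf
        # for every j in 0..step except j = p + 1.
        for c in leaves:
            for j in range(step + 1):
                if j != label[c] + 1:
                    parent[next_id], label[next_id] = c, j
                    next_id += 1
    return parent, label

def level(parent, c):
    """Level of a cluster: its distance to the root C_0."""
    depth = 0
    while c != 0:
        c = parent[c]
        depth += 1
    return depth
```

Under this base case, every inner cluster of the resulting tree has exactly one neighboring cluster for each of the degrees d_0, ..., d_k, which is what makes the views of neighboring clusters hard to tell apart.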

The Lower Bound Graph
  • Later, we will prove that the topologies seen by
    nodes in $C_0$ and $C_1$ are identical.
  • The proof is greatly simplified if each node's
    topology is a tree (rather than a general graph)
    because we do not need to worry about cycles.
  • $g(G)$: the girth of a graph G
  • Length of the shortest cycle in G
  • We want to construct $G_k$ with girth at least
    $2k+1$.
  • Then all nodes see a tree in k communication
    rounds.

The Graph Family D(r,q)
  • Proposed in "Explicit construction of graphs with
    an arbitrary large girth and of large size"
  • by Lazebnik and Ustimenko
  • For a given integer $r \ge 1$ and a prime power q,
  • D(r,q) defines a bipartite graph with $2q^r$ nodes
    and girth $g(D(r,q)) \ge r+5$.
  • The nodes in the two partitions P and L are
    labelled by the r-vectors over the finite field
    $\mathbb{F}_q$



The Graph Family D(r,q)
  • Lemma 3.2
  • For all $(p) \in P$ and $l_1 \in \mathbb{F}_q$,
    there is exactly one $[l] \in L$ such that $l_1$
    is the first coordinate of $[l]$ and that (p) and
    $[l]$ are connected by an edge in D(r,q).
  • Proof
  • r-1 equations define a linear system for the r-1
    unknown coordinates of $[l]$.
  • The matrix corresponding to this system is a
    lower triangular matrix with non-zero elements in
    the diagonal.
  • The solution of this system is therefore unique.

The Construction of Gk
  • Construct an instance $G'_k$ of the cluster tree
  • $G'_k$ may have the minimum possible girth 4.
  • $G'_k$ is a bipartite graph with odd-level
    clusters in one set ($V_1$) and even-level
    clusters in the other set ($V_2$).
  • m: the number of nodes in the larger of the two
    partitions of $G'_k$.
  • Choose the smallest prime power $q \ge m$.
  • In both partitions of $G'_k$, uniquely label all
    nodes with elements of $\mathbb{F}_q$.

The Construction of Gk
  • Set $r = 2k-4$.
  • Construct D(r,q)
  • D(r,q) is a bipartite graph with $2q^r$ nodes and
    girth at least $2k+1$.
  • $G_k$ is constructed as a subgraph of D(r,q).
  • For each node with label $p_1$ (in $V_1$) in
    $G'_k$, put the $q^{r-1}$ nodes
    $(p) = (p_1, \dots)$ in $G_k$.
  • Put an edge between $(p) = (p_1, \dots)$ and
    $[l] = [l_1, \dots]$ if both of the following
    hold.
  • (p) and $[l]$ are connected in D(r,q).
  • $p_1$ in $V_1$ and $l_1$ in $V_2$ are connected in
    $G'_k$.
  • Remove all the nodes without incident edges.

The Properties of Gk
  • Lemma 3.3 first part
  • $G_k$ has at most $2mq^{2k-5}$ nodes and girth at
    least $2k+1$.
  • Proof
  • For each node in $G'_k$, $q^{r-1} = q^{2k-5}$
    nodes are created.
  • The number of nodes in $G'_k$ is at most 2m.
  • $G_k$ is a subgraph of D(r,q).
  • $g(D(r,q)) \ge 2k+1$.

The Properties of Gk
  • Lemma 3.3 second part
  • $G_k$ is a cluster tree with the same degrees
    $d_i$ as in $G'_k$.
  • Proof
  • In $G'_k$, consider two neighboring clusters
    $C'_1$ in $V_1$ and $C'_2$ in $V_2$.
  • In $G_k$, the clusters $C_1$ and $C_2$ consist of
    all nodes (p) and $[l]$ which have their first
    coordinates equal to the labels of the nodes in
    $C'_1$ and $C'_2$, respectively.
  • If each node in $C'_1$ has d neighbors in $C'_2$,
    then nodes in $C_1$ have d neighbors in $C_2$ by
    Lemma 3.2.

Equality of Views
  • In $G_k$, two adjacent nodes in clusters $C_0$ and
    $C_1$ see exactly the same topology (same tree)
    within distance k.

Equality of Views
  • We can derive the lower bounds on the
    approximation ratio of k-local MVC algorithms.
  • OPT: optimal solution for MVC
  • ALG: solution computed by any algorithm
  • Main observation
  • Adjacent nodes in $C_0$ and $C_1$ have the same
    view.
  • Every algorithm treats the nodes the same way.
  • ALG contains a significant portion of the nodes
    in $C_0$.
  • OPT need not contain any node in $C_0$.

  • Lemma 3.13
  • Let ALG be the solution of any k-local MVC
    algorithm.
  • When applied to $G_k$, ALG contains at least half
    of the nodes of $C_0$ in the worst case (in
    expectation for randomized algorithms).
  • Proof for deterministic algorithms
  • The decision whether a given node enters the
    vertex cover depends solely on its topology and
    the labelling.
  • Assume that the labelling is chosen uniformly at
    random.
  • Let $v_0$ and $v_1$ be two adjacent nodes from
    $C_0$ and $C_1$, respectively.
  • Let $p_i$ denote the probability that $v_i$ ends
    up in VC when algorithm A operates on the randomly
    chosen labelling.
  • $p_0 = p_1$ because they see the same topology and
    the same distribution on the labelling.

  • Proof for deterministic algorithms (cont.)
  • For a feasible solution, at least one of the two
    nodes must be in VC.
  • This implies $p_0 + p_1 \ge 1$ and $p_0 \ge 1/2$.
  • At least half of the nodes in $C_0$ are chosen by
    algorithm A on average, by the linearity of
    expectation.
  • Therefore, for every deterministic algorithm,
    there is at least one labelling for which at
    least half of the nodes of $C_0$ are in VC.
  • Proof for randomized algorithms
  • Use Yao's minimax principle
  • The expected number of nodes chosen by a
    randomized algorithm cannot be smaller than the
    expected number of nodes chosen by an optimal
    deterministic algorithm for an arbitrarily chosen
    distribution on the labels.
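
The averaging step in the deterministic part of the proof can be written out as a short derivation. This is a sketch in the notation of the slide, writing $n_0 = |C_0|$ and using that every node of $C_0$ has a neighbor in $C_1$ with the same view:

```latex
\begin{align*}
p_0 = p_1,\quad p_0 + p_1 \ge 1
  \;&\Longrightarrow\; p_0 \ge \tfrac{1}{2},\\
\mathbb{E}\bigl[\,|ALG \cap C_0|\,\bigr]
  = \sum_{v \in C_0} \Pr[v \in ALG]
  \;&\ge\; \frac{n_0}{2}.
\end{align*}
```

Since the expectation over random labellings is at least $n_0/2$, at least one labelling must reach that value.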

  • $|ALG| \ge n_0/2$, where $n_0$ is the number of
    nodes in $C_0$
  • $|OPT| \le n - n_0$
  • This is because the nodes of cluster $C_0$ are not
    necessary to obtain a feasible vertex cover.
  • We define $d_i := d^i$ ($i = 0, \dots, k+1$) for
    some value d.
  • Lemma 3.14

  • Proof
  • The number of nodes per cluster decreases for
    each additional level by a factor d.
  • A cluster on level $\ell$ contains $n_0/d^\ell$
    nodes.
  • Each cluster has at most k+1 neighboring clusters
    on a higher level.
  • $n_\ell$: number of nodes on level $\ell$
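
The proof steps above combine into a geometric series. The following is a sketch of the counting argument, not a quotation of the lemma (whose exact statement is not preserved on the slide); it assumes $k+1 < d$ and writes $n$ for the total number of nodes and $n_0$ for the size of $C_0$:

```latex
\begin{align*}
n_\ell &\le (k+1)^{\ell} \cdot \frac{n_0}{d^{\ell}},\\
n = \sum_{\ell=0}^{k+1} n_\ell
  &\le n_0 \sum_{\ell \ge 0} \Bigl(\frac{k+1}{d}\Bigr)^{\ell}
   = n_0 \cdot \frac{d}{d-(k+1)}.
\end{align*}
```

For $k+1 \le d/2$ the last factor is at most 2, so $n \le 2n_0$.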

  • Relationship between d and $n_0$
  • $n'_0$: size of $C_0$ in $G'_k$ ($n'$: number of
    nodes of $G'_k$)
  • Set $n'_0 = d^{2k+1}$ so that there is no conflict
    during the construction of $G'_k$.
  • If we assume that $k+1 \le d/2$, we have
    $n' \le 2n'_0$ by Lemma 3.14.
  • Applying the construction of $G_k$, we get
    $n_0 \le n'_0 q^{2k-5}$, where q is the smallest
    prime power such that $q \ge n'$.
  • $q < 2n' \le 4n'_0$
  • Finally, we get
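
Chaining the inequalities of this slide gives a bound of the following shape. This is a reconstruction under stated assumptions, not the slide's own formula: $n'_0$ denotes the size of $C_0$ in the initial low-girth instance, $n_0$ its size in the final graph, $n'_0 = d^{2k+1}$ is assumed, and $k \ge 3$ so that the exponent $2k-5$ is positive:

```latex
\begin{align*}
n_0 \;\le\; n'_0\, q^{2k-5}
    \;<\; n'_0\,(4 n'_0)^{2k-5}
    \;=\; 4^{2k-5}\,(n'_0)^{2k-4}
    \;=\; 4^{2k-5}\, d^{(2k+1)(2k-4)}
    \;\le\; 4^{2k}\, d^{4k^2}.
\end{align*}
```

The last step only uses $(2k+1)(2k-4) = 4k^2 - 6k - 4 \le 4k^2$.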

  • Theorem 3.15
  • There are graphs G, such that in k communication
    rounds, every distributed algorithm for the MVC
    problem on G has approximation ratios at least
    $\Omega(n^{c/k^2}/k)$ and
    $\Omega(\Delta^{1/k}/k)$
  • for some constant $c \ge 1/4$, where n denotes the
    number of nodes and $\Delta$ denotes the highest
    degree in G.
  • Proof

  • Theorem 3.16
  • In order to obtain a polylogarithmic or even
    constant approximation ratio, every distributed
    algorithm for the MVC problem requires at least
    $\Omega(\sqrt{\log n/\log\log n})$ and
    $\Omega(\log\Delta/\log\log\Delta)$
  • communication rounds.

  • Proof

  • Set $k = \beta\sqrt{\log n/\log\log n}$ and apply
    Theorem 3.15.
  • For every polylogarithmic term a(n), there is a
    constant $\beta$ such that the approximation ratio
    is at least a(n).
  • We can show the second lower bound similarly.
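
The substitution behind the first bound can be spelled out. This is a sketch, assuming the first bound of Theorem 3.15 has the form $\Omega(n^{c/k^2}/k)$, with logarithms to base 2 and $n \ge 4$:

```latex
\begin{align*}
n^{c/k^2} &= 2^{(c/\beta^2)\log\log n} = (\log n)^{c/\beta^2}
  \qquad \text{for } k = \beta\sqrt{\log n/\log\log n},\\
\frac{n^{c/k^2}}{k}
  &= \frac{(\log n)^{c/\beta^2}\,\sqrt{\log\log n}}{\beta\sqrt{\log n}}
  \;\ge\; \frac{1}{\beta}\,(\log n)^{c/\beta^2 - 1/2}.
\end{align*}
```

For a target ratio $(\log n)^{a}$, any $\beta$ with $c/\beta^2 - 1/2 > a$ suffices, so no algorithm with that few rounds can guarantee a polylogarithmic ratio.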

Lower Bounds by Reductions
  • Using the lower bound for MVC, we can obtain
    lower bounds for several other graph problems.
  • Time lower bounds
  • For the construction of maximal matchings (MM)
    and maximal independent sets (MIS)
  • For the approximation of minimum dominating set
  • Use reductions

Lower Bounds for MM and MIS
  • Theorem 4.1
  • There are graphs G on which every distributed,
    possibly randomized algorithm requires time
    $\Omega(\sqrt{\log n/\log\log n})$ and
    $\Omega(\log\Delta/\log\log\Delta)$
  • to compute a maximal matching.
  • The same lower bounds hold for the construction
    of maximal independent sets.
  • Proof for MM
  • The set of all end-points of the edges of an MM
    forms a 2-approximation for MVC.
  • This means that computing an MM achieves a
    constant approximation ratio for MVC.
  • Thus, the lower bound directly follows from
    Theorem 3.16.
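
The reduction from MVC to MM can be checked on small graphs. The sketch below uses a sequential greedy matching purely for illustration (it is not a distributed algorithm and not taken from the paper), plus a brute-force optimum for comparison; all function names are hypothetical.

```python
from itertools import combinations

def greedy_maximal_matching(edges):
    """Scan the edges once and keep every edge whose end-points are still free."""
    matched, matching = set(), []
    for u, v in edges:
        if u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching

def cover_from_matching(matching):
    """All end-points of the matched edges; a vertex cover if the matching is maximal."""
    return {x for e in matching for x in e}

def min_vertex_cover_size(nodes, edges):
    """Brute-force optimum, only feasible for tiny graphs."""
    for size in range(len(nodes) + 1):
        for cand in combinations(nodes, size):
            chosen = set(cand)
            if all(u in chosen or v in chosen for u, v in edges):
                return size

# Example: the path 0 - 1 - 2 - 3
nodes = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3)]
cover = cover_from_matching(greedy_maximal_matching(edges))
```

The factor 2 holds because any vertex cover must contain at least one end-point of every matched edge, and matched edges are disjoint.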

Lower Bounds for MM and MIS
  • Proof for MIS
  • Consider the line graph $L(G_k)$ of $G_k$.
  • L(G) line graph of G
  • Nodes of L(G) are the edges of G.
  • Two nodes in L(G) are connected if two
    corresponding edges in G are incident.
  • The MM problem on G is equivalent to the MIS
    problem on L(G).
  • k rounds on L(G) can be simulated in $k + O(1)$
    rounds on G.
  • The number of nodes in $L(G_k)$ is less than
    $n^2/2$.
  • The maximum degree of $L(G_k)$ is less than
    $2\Delta$.
  • Therefore, the lower bounds have the same order.
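
The equivalence between MM on G and MIS on L(G) can be illustrated on a small example. This is a minimal sketch with hypothetical function names; the greedy MIS is sequential and only serves to produce some maximal independent set.

```python
from itertools import combinations

def line_graph(edges):
    """Nodes of L(G) are the edges of G; two are adjacent if they share an end-point."""
    nodes = [frozenset(e) for e in edges]
    adj = {e: set() for e in nodes}
    for e, f in combinations(nodes, 2):
        if e & f:
            adj[e].add(f)
            adj[f].add(e)
    return adj

def greedy_mis(adj):
    """Pick nodes in dictionary order whenever no neighbor has been picked yet."""
    chosen = set()
    for v in adj:
        if not adj[v] & chosen:
            chosen.add(v)
    return chosen

# Example: the path 0 - 1 - 2 - 3
edges = [(0, 1), (1, 2), (2, 3)]
mis = greedy_mis(line_graph(edges))
```

Here the independent set in L(G) consists of the edges {0,1} and {2,3}, which is exactly a maximal matching of the path.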

Lower Bounds for MDS
  • Theorem 4.2
  • There are graphs G, such that in k communication
    rounds, every distributed algorithm for the
    minimum dominating set problem on G has
    approximation ratios at least
    $\Omega(n^{c/k^2}/k)$ and
    $\Omega(\Delta^{1/k}/k)$
  • for some constant c, where n and $\Delta$ denote
    the number of nodes and the highest degree in G,
    respectively.
Lower Bounds for MDS
  • Proof
  • We show that every MVC instance can be seen as an
    MDS instance with the same locality.
  • $G' = (V', E')$: graph for the MVC problem
  • Construct $G = (V, E)$ for the MDS problem
  • k communication rounds on one of the two graphs
    can be simulated by $k + O(1)$ rounds on the other
  • MDS on G is exactly as hard as MVC on $G'$.
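
The slide does not spell out the construction, so the sketch below assumes a standard one: keep the MVC graph and add one extra node per edge, adjacent to both end-points of that edge. It further assumes the MVC graph has no isolated nodes. Function names are hypothetical, and the brute-force searches only work on tiny graphs.

```python
from itertools import combinations

def mvc_to_mds_graph(nodes, edges):
    """Assumed reduction: original graph plus one node per edge, joined to its end-points."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
        edge_node = ("edge", u, v)
        adj[edge_node] = {u, v}
        adj[u].add(edge_node)
        adj[v].add(edge_node)
    return adj

def min_vertex_cover_size(nodes, edges):
    """Smallest number of nodes touching every edge (brute force)."""
    for size in range(len(nodes) + 1):
        for cand in combinations(nodes, size):
            chosen = set(cand)
            if all(u in chosen or v in chosen for u, v in edges):
                return size

def min_dominating_set_size(adj):
    """Smallest number of nodes dominating every node of adj (brute force)."""
    for size in range(len(adj) + 1):
        for cand in combinations(adj, size):
            chosen = set(cand)
            if all(v in chosen or adj[v] & chosen for v in adj):
                return size

# Example: the cycle on five nodes
nodes = [0, 1, 2, 3, 4]
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
mds_graph = mvc_to_mds_graph(nodes, edges)
```

On the 5-cycle both optima equal 3. Each new node is at distance one from its edge's end-points, which is why simulating one graph on the other costs only a constant number of extra rounds.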

Lower Bounds for MDS
  • Corollary 4.3
  • To obtain a polylogarithmic or constant
    approximation ratio for minimum dominating set,
    there are graphs on which every distributed
    algorithm needs time
    $\Omega(\sqrt{\log n/\log\log n})$ and
    $\Omega(\log\Delta/\log\log\Delta)$
  • Proof
  • The same as the proof for theorem 3.16

Questions or Comments?
