Lower Bounds for Property Testing - PowerPoint PPT Presentation

About This Presentation
Title:

Lower Bounds for Property Testing

Description:

Lower Bounds for Property Testing Luca Trevisan U C Berkeley – PowerPoint PPT presentation

Slides: 36
Provided by: LucaTr7

Transcript and Presenter's Notes

Title: Lower Bounds for Property Testing


1
Lower Bounds for Property Testing
  • Luca Trevisan
  • U C Berkeley

2
Sub-linear Time Algorithms
  • Want to design algorithms that run in less than
    linear time
  • cannot read entire input
  • must be probabilistic and approximate
  • For optimization problems
  • compute numerical apx of optimum cost (and
    implicit representation of apx solution?)
  • For decision problems
  • what is approximation?

3
Graph Property Testing [GGR]
  • Testing a property P with accuracy ε
  • Given graph G that has property P
  • accept with probability > 3/4
  • Given graph G that is ε-far from property P
  • accept with probability < 1/4
  • ε-far: must change an ε fraction of the representation
    of G to get property P
  • Intuition: input (not output) is approximate

4
Different Representations
  • G is represented as adjacency matrix
  • ε-far: must add/remove εn^2 edges
  • G has max degree d and is represented using
    adjacency lists
  • ε-far: must add/remove εdn edges
  • (Some extra subtleties in bounded-degree case)

5
Purpose of This Talk
  • Discuss algorithms and lower bounds for
  • Sub-linear time property testing for some basic
    graph properties
  • Sub-linear time approximation algorithms for some
    basic optimization problems
  • (well, mostly discuss lower bounds)

6
Motivations
  • Large data sets
  • web, Wal-Mart, Amazon, phone calls, . . .
  • linear time can still be infeasible
  • Fine print: most research on property testing
    focuses on problems having no connection to
    applications with large data sets
  • Goal for theory research
  • Develop general algorithmic techniques (like
    dynamic programming, local search, for P)
  • Develop general techniques for impossibility
    results (like NP-completeness)

7
Property Testing and Approximation in Adjacency
Matrix Representation

8
Bipartiteness Algorithm [GGR, AK]
  • Testing bipartiteness of a given graph G
  • Pick (1/ε)·polylog(1/ε) vertices, and check if
    they induce a bipartite graph; if so accept,
    otherwise reject
  • If G is bipartite then alg accepts with prob 1
  • If G is ε-far from bipartite, then whp algorithm
    discovers an odd cycle (non-trivial to prove)
  • Running time O((1/ε^2)·polylog(1/ε))

9
Lower Bounds [BT]
  • Ω(1/ε^1.5) for adaptive algorithms
  • Ω(1/ε^2) for non-adaptive algorithms
  • The bounds apply to the query complexity of the
    algorithm (and hence also to its running time)

10
Proof for one-sided error case
  • Pick a random graph with edge-probability 3ε
  • whp it is ε-far from bipartite
  • Consider view of (possibly adaptive) algorithm
    that makes q queries and finds odd cycle w.h.p.
  • sees Θ(εq) edges and O(ε^2 q^2) pairs of connected
    vertices
  • a cycle can be discovered only by querying two
    vertices in same connected component
  • it takes Ω(1/ε) such attempts
  • q = Ω(1/ε^1.5)
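
The counting behind the last three bullets can be spelled out as below. The symbol A (the number of queries whose two endpoints already lie in the same connected component of the edges seen so far) is introduced here for illustration, and hidden constants are not tracked.

```latex
\begin{align*}
A &= O(\varepsilon^{2} q^{2})
  && \text{(at most one useful query per connected pair)} \\
A &= \Omega(1/\varepsilon)
  && \text{(each such query hits an edge with probability } 3\varepsilon\text{)} \\
\varepsilon^{2} q^{2} &= \Omega(1/\varepsilon)
  \;\Longrightarrow\; q = \Omega\bigl(\varepsilon^{-1.5}\bigr)
\end{align*}
```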

11
One-sided error non-adaptive
  • Pick a random graph with edge-probability 3ε
  • Consider view of non-adaptive algorithm that
    makes q queries
  • Same as
  • Start with q-edge graph
  • Independently delete each edge with prob 1-ε
  • If q = o(1/ε^2) then view is a forest w.p. 1-o(1)
  • Proof: there are at most O(q^(t/2)) cycles of length
    t
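
One way to finish the proof sketch is a first-moment calculation. It assumes the explicit bound (2q)^{t/2} on the number of length-t cycles in a q-edge graph (a standard closed-walk bound, supplied here rather than taken from the slide) and uses the slide's survival probability ε for each queried edge.

```latex
\begin{align*}
\mathbb{E}[\#\text{cycles in the view}]
  &\le \sum_{t \ge 3} (2q)^{t/2}\,\varepsilon^{t}
   = \sum_{t \ge 3} \bigl(2\varepsilon^{2} q\bigr)^{t/2} \\
  &= O\bigl((\varepsilon^{2} q)^{3/2}\bigr) = o(1)
   \quad\text{when } q = o(1/\varepsilon^{2}).
\end{align*}
```

By Markov's inequality the view then contains no cycle, that is, it is a forest, with probability 1-o(1).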

12
Two-Sided Error
  • Two distributions
  • Gfar: random graph with edge probability 3ε
  • Gbip: first pick a random partition, then each edge
    crossing the partition exists with prob 6ε
  • Distributions indistinguishable by
  • Non-adaptive algorithms of query complexity
    o(1/ε^2)
  • Adaptive algorithms of query complexity o(1/ε^1.5)
  • Both tight for these distributions
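
The two distributions can be sampled as in the sketch below. The function names and the adjacency-matrix output format are assumptions made for illustration; only the edge probabilities 3ε and 6ε come from the slide.

```python
import random

def sample_g_far(n, eps, rng=random):
    """Adjacency matrix of a random graph with edge probability 3*eps."""
    adj = [[0] * n for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < 3 * eps:
                adj[u][v] = adj[v][u] = 1
    return adj

def sample_g_bip(n, eps, rng=random):
    """Random bipartition; each crossing pair is an edge with probability 6*eps."""
    side = [rng.randrange(2) for _ in range(n)]
    adj = [[0] * n for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if side[u] != side[v] and rng.random() < 6 * eps:
                adj[u][v] = adj[v][u] = 1
    return adj, side
```

A fixed pair of vertices is split by the random partition with probability 1/2, so it is an edge with probability 3ε under both distributions; a single query therefore carries no information about which distribution produced the graph.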

13
Generality/Lessons
  • Possible lesson: try a random graph as a possible
    distribution of hard instances far from having
    the property
  • Not good for triangle-freeness, a property whose
    complexity is possibly the most interesting open
    question in the adjacency matrix model.

14
Triangle-free Graphs
  • Want to distinguish triangle-free graphs from
    graphs where one needs to remove εn^2 edges to break
    all triangles
  • Solvable in time super-exponential in 1/ε
  • Polynomial in 1/ε is impossible [Alon]
  • 2^poly(1/ε) possible?
  • Simplest special case of more general (and
    important) question

15
Sublinear Time Approximation
  • Max CUT and other graph problems can be
    approximated within (1±ε) in graphs with at least
    αn^2 edges in time 2^poly(1/(εα)) [GGR]
  • Max 3SAT can be approximated within (1±ε) in
    instances with at least αn^3 clauses in time
    2^poly(1/(εα)), and similar results hold for other
    satisfiability problems [AFKK]
  • Lower bounds?

16
Property Testing and Approximation in Adjacency
List Representation

17
Bipartiteness [GR]
  • Testing bipartiteness
  • Repeat polylog n times
  • Start at a random vertex and perform sqrt(n) random
    walks of length polylog n; if two of them combine
    to form an odd cycle, reject; otherwise accept
  • Analysis
  • in a graph where you need to remove a constant
    fraction of edges to make it bipartite, algorithm
    finds odd cycle

18
Matching Lower Bound [GR]
  • Define two distributions of graphs
  • Gfar: a random Hamiltonian circuit, plus a random
    matching (whp 1/100-far from bipartite)
  • Gbip: a random Hamiltonian circuit, plus a random
    matching conditioned on making the graph
    bipartite
  • Gfar and Gbip are indistinguishable to algorithms
    of query complexity o(sqrt(n)).
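
The two distributions might be sampled as sketched below, assuming n is even and ignoring the possibility that a matching edge duplicates a circuit edge. The function names and the edge-list output are illustrative choices. For Gbip the sketch uses the fact that an even Hamiltonian circuit has a unique 2-colouring, so conditioning on bipartiteness amounts to matching even positions to odd positions.

```python
import random

def hamiltonian_cycle(n, rng):
    """Random cyclic order of the vertices and the corresponding n edges."""
    order = list(range(n))
    rng.shuffle(order)
    edges = [(order[i], order[(i + 1) % n]) for i in range(n)]
    return order, edges

def sample_g_far(n, rng=random):
    """Random Hamiltonian circuit plus a uniformly random perfect matching (n even)."""
    _, cycle = hamiltonian_cycle(n, rng)
    verts = list(range(n))
    rng.shuffle(verts)
    matching = [(verts[i], verts[i + 1]) for i in range(0, n, 2)]
    return cycle + matching

def sample_g_bip(n, rng=random):
    """Same, but the matching only pairs vertices on opposite sides of the circuit."""
    order, cycle = hamiltonian_cycle(n, rng)
    even, odd = order[0::2], order[1::2]     # the circuit's unique 2-colouring
    rng.shuffle(odd)
    return cycle + list(zip(even, odd)), set(even)
```

Both samplers return 3-regular (multi)graphs on the same vertex set, which is why local exploration alone has a hard time telling them apart.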

19
Approximation Algorithms
  • Minimum spanning tree
  • given a connected weighted graph of degree d with
    weights in range 1,…,w, can approximate MST
    weight within (1±ε) in time about
    O(dw/ε^2) [Chazelle, Rubinfeld, T]
  • Max SAT
  • Given a CNF where every variable occurs at most d
    times, can approximate Max SAT optimum within
    .618, presumably also 2/3, in O(d)
    time. Hopefully will get 3/4-δ

20
Testing 3-Colorability
  • NP-hard in adjacency list representation
  • Only for small enough ε
  • Can find 3-coloring good for 80% of the edges in
    a 3-colorable graph using SDP
  • NP-hard to find 3-coloring good for 98% (?)
    of the edges
  • Gives non-tight, and conditional lower bound for
    query complexity

21
Other Problems
  • Query complexity of following problems is
    equivalent to query complexity of testing 3col
  • Testing satisfiability of 3SAT instance
  • Every variable occurs in O(1) clauses, adjacency
    list representation
  • Approximating max cut, vertex cover, independent
    set, . . ., in bounded-degree graphs
  • Approximating Max SAT, Max 2SAT, . . .
  • Lower bound of sqrt(n) for all problems
  • Reduction from bipartiteness

22
Tight Lower Bound [BOT]
  • For one-sided error algorithms
  • Ω(n) query complexity to distinguish
    3-colorable graphs from graphs that are
    (1/3-δ)-far
  • Lower bound applies to testing problems that are
    solvable in polynomial time
  • For two-sided error algorithms
  • For some ε, Ω(n) query complexity to distinguish
    3-colorable graphs from graphs that are ε-far.

23
Using Reductions. . .
  • Unconditionally, algorithms running in time o(n)
    cannot
  • Approximate Max 3SAT better than 7/8
  • Approximate Max Cut in bounded-degree graphs
    better than 16/17
  • . . .
  • [Håstad 97] proved the above problems are NP-hard

24
The 3-Coloring Lower Bound
  • Consider first one-sided error algorithms
  • It's enough to find a graph G that is
    (1/3-δ)-far from 3-colorable, but every subgraph of
    size < αn is 3-colorable
  • (for every δ there is an α such that . . .)
  • Then an algorithm of query complexity < αn either
    accepts G (which is wrong) or rejects some
    3-colorable graph (which means the algorithm does
    not have one-sided error)

25
The Graph
  • Pick a graph of degree O(1/δ^2) at random (pick that
    many random matchings)
  • Then it is (1/3-δ)-far whp
  • But, for some α, whp, every subgraph induced by
    k < αn vertices contains < 1.5k edges
  • In a minimal non-3-colorable graph, every vertex
    has degree at least 3
  • Every subgraph induced by < αn vertices is
    3-colorable
  • [Erdős]
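
The step from the edge-count bound to 3-colorability of small subgraphs can be written out as a short contradiction. Here H is a hypothetical minimal non-3-colorable subgraph of the random graph G on k < αn vertices; the name H is introduced only for this derivation.

```latex
\begin{align*}
\deg_H(v) \ge 3 \ \ \forall v \in V(H)
  &\;\Longrightarrow\; |E(H)| = \tfrac12 \sum_{v \in V(H)} \deg_H(v) \ge 1.5\,k, \\
\text{but whp}\quad |E(H)| &\le |E(G[V(H)])| < 1.5\,k .
\end{align*}
```

The two bounds are incompatible, so no such H exists and every subgraph on fewer than αn vertices is 3-colorable.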

26
Derandomization
  • For constants d, ε, α, and for every sufficiently
    large n, we can explicitly construct a graph
  • on n vertices,
  • max degree d,
  • ε-far from 3-colorable,
  • such that every subset of αn vertices induces a
    3-colorable subgraph.

27
Two-Sided Error Algorithms
  • Need to define two distributions of graphs Gcol
    and Gfar such that
  • Graphs in Gcol are (almost) always 3-colorable
  • Graphs in Gfar are (almost) always far from
    3-colorable
  • To an algorithm of bounded query complexity, Gcol
    and Gfar look (almost) the same

28
Main Step
  • Define two distributions Dsat and Dfar of
    instances of E3LIN-2 (systems over GF(2) with 3
    variables per equation)
  • Systems in Dsat are always satisfiable
  • Systems in Dfar are (almost) always (1/2-δ)-far
    from satisfiable
  • To an algorithm of bounded query complexity, Dsat
    and Dfar look the same
  • We get Gcol and Gfar using a reduction
    from approximate E3LIN-2 to approximate 3-coloring

29
E3LIN-2
  • X1 + X3 + X10 = 0 mod 2
  • X2 + X3 + X4 = 1 mod 2
  • X1 + X2 + X9 = 0 mod 2
  • . . .

30
Main Building Block
  • We show that for every c there is an α such that
    there exists a left-hand side with
  • n variables, cn equations, 3 variables per
    equation, every variable occurs in 3c equations
  • every αn equations are linearly independent
  • Pick the left-hand side at random
  • repeat 3c times: pick at random a set of n/3
    disjoint triples of variables
  • Explicit construction?
  • Need strong unique-neighbor expanders
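
The random construction of the left-hand side can be sketched as follows. The function name and the representation of an equation as a triple of variable indices are assumptions made for illustration; n is assumed divisible by 3, as the slide's "n/3 disjoint triples" implies.

```python
import random

def random_left_hand_side(n, c, rng=random):
    """Return 3c * (n/3) = c*n equations, each a triple of variable indices."""
    if n % 3 != 0:
        raise ValueError("n must be divisible by 3")
    equations = []
    for _ in range(3 * c):
        perm = list(range(n))
        rng.shuffle(perm)
        # Consecutive blocks of a random permutation give n/3 disjoint triples.
        equations += [tuple(perm[i:i + 3]) for i in range(0, n, 3)]
    return equations
```

Each of the 3c rounds uses every variable exactly once, which gives the stated counts: cn equations in total and 3c occurrences per variable.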

31
Distributions
  • The left-hand side is always as before
  • In Dsat, we pick a random assignment to the
    variables, and set right-hand side consistently
  • always satisfiable
  • In Dfar, we pick the right-hand side uniformly at
    random
  • With high probability, (1/2 - O(1/sqrt(c)))-far
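
Given a fixed left-hand side, the two right-hand-side distributions can be sampled as in this sketch. The function names and the encoding of equations as index triples are hypothetical; `fraction_satisfied` is a helper added here only to check an assignment.

```python
import random

def sample_d_sat(equations, n, rng=random):
    """Right-hand side consistent with a hidden uniformly random assignment."""
    x = [rng.randrange(2) for _ in range(n)]
    rhs = [x[i] ^ x[j] ^ x[k] for (i, j, k) in equations]
    return rhs, x

def sample_d_far(equations, rng=random):
    """Right-hand side chosen uniformly at random, one bit per equation."""
    return [rng.randrange(2) for _ in equations]

def fraction_satisfied(equations, rhs, x):
    """Fraction of equations x_i + x_j + x_k = b (mod 2) that x satisfies."""
    ok = sum((x[i] ^ x[j] ^ x[k]) == b for (i, j, k), b in zip(equations, rhs))
    return ok / len(equations)
```

For a Dsat sample the hidden assignment satisfies every equation, so `fraction_satisfied` returns 1.0 on it.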

32
Indistinguishability
  • Two distributions differ only in right-hand side
  • In Dfar uniformly distributed
  • In Dsat, αn-wise independent
  • Linear independence implies statistical
    independence
  • Look the same to algorithm that sees fewer than αn
    equations
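
The claim that linear independence implies statistical independence can be checked by brute force on tiny systems. The sketch below is illustrative: equations are encoded as triples of variable indices (and as bitmasks for the rank computation), and both function names are assumptions.

```python
from collections import Counter
from itertools import product

def gf2_rank(rows):
    """Rank over GF(2) of equations given as bitmasks of their variables."""
    basis = {}                       # leading bit -> reduced row
    for row in rows:
        while row:
            lead = row.bit_length() - 1
            if lead not in basis:
                basis[lead] = row
                break
            row ^= basis[lead]       # eliminate the leading bit
    return len(basis)

def rhs_distribution(equations, n):
    """Exact distribution of the Dsat right-hand side over all 2^n assignments."""
    counts = Counter()
    for x in product((0, 1), repeat=n):
        counts[tuple(x[i] ^ x[j] ^ x[k] for (i, j, k) in equations)] += 1
    return counts
```

When the queried equations have full rank, every right-hand-side vector appears equally often, exactly as under Dfar; when they are dependent, some vectors never appear and the two distributions can be told apart.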

33
Conclusion of the Argument
  • No algorithm of query complexity o(n) can
    distinguish satisfiable instances of E3LIN-2 from
    instances that are (1/2-δ)-far from satisfiable
  • For some ε, no algorithm of query complexity o(n)
    can distinguish 3-colorable graphs from graphs
    that are ε-far from 3-colorable.
  • No algorithm of query complexity o(n) can
    approximate Max 3SAT better than 7/8 . . .

34
Generality/Lessons
  • Reductions are useful and extend results to
    several problems
  • In adjacency matrix (dense graph) setting,
    several and general algorithms. Few and ad-hoc
    lower bounds
  • In adjacency list (sparse graph) setting, vice
    versa.

35
Open Questions
  • Show that distinguishing 3-colorable graphs from
    (1/3-δ)-far graphs requires query complexity Ω(n)
  • we can only prove it for one-sided error
  • Show that approximating Max SAT better than ¾ and
    Max CUT better than ½ requires query complexity
    Ω(n)
  • we only know Ω(sqrt(n)), implicit in [GR]
  • would explain why we need SDP