Lower Bounds for Property Testing

Luca Trevisan, U.C. Berkeley. Joint work with Andrej Bogdanov and Kenji Obata.

Slides: 37

1
Lower Bounds for Property Testing
  • Luca Trevisan
  • U.C. Berkeley
  • Joint work with Andrej Bogdanov and Kenji Obata

2
Sub-linear Time Algorithms
  • Want to design algorithms that run in less than
    linear time (and so cannot read the entire input).
  • Such algorithms must be probabilistic and approximate
  • For optimization problems
  • Compute a numerical approximation of the optimum cost
    (and an implicit representation of an approximate solution?)
  • For decision problems
  • What does approximation mean for a decision problem?

3
(Graph) Property Testing
  • Testing a property P with accuracy ε in the adjacency
    matrix representation
  • Given a graph G that has property P, accept with
    probability > 3/4
  • Given a graph G that is ε-far from property P,
    accept with probability < 1/4
  • ε-far: must change an ε fraction of the adjacency
    matrix to get property P
    (add/remove > εn² edges)

4
Example [GGR, AK]
  • Testing bipartiteness of a given graph G
  • Pick (1/ε)·polylog(1/ε) vertices and check whether
    they induce a bipartite graph; if so accept,
    otherwise reject
  • If G is bipartite then the algorithm accepts with probability 1
  • If G is ε-far from bipartite, then whp the algorithm
    discovers an odd cycle (non-trivial to prove)
  • Running time O((1/ε²)·polylog(1/ε))
  • We will discuss a matching lower bound if time
    allows

5
Paleontologist's approach

6
Bounded Degree Graphs
  • Testing a property P with accuracy ε in the adjacency
    lists representation (graphs of maximum degree d)
  • Given a graph G that has property P, accept with
    probability > 3/4
  • Given a graph G that is ε-far from property P,
    accept with probability < 1/4
  • ε-far: must change an ε fraction of the adjacency
    lists entries to get property P
    (add/remove > εdn edges)

7
Bipartiteness [GR]
  • Testing bipartiteness
  • Repeat polylog n times:
  • Start at a random vertex and perform sqrt(n) random
    walks of length polylog n; if two of them combine
    to form an odd cycle, reject
  • If no repetition rejects, accept
  • Analysis
  • In a graph where a constant fraction of the edges must
    be removed to make it bipartite, whp the algorithm
    finds an odd cycle

8
Matching Lower Bound [GR]
  • Define two distributions of graphs
  • Gfar: a random Hamiltonian circuit plus a random
    matching (whp 1/100-far from bipartite)
  • Gbip: a random Hamiltonian circuit plus a random
    matching, conditioned on the graph being
    bipartite
  • Gfar and Gbip are indistinguishable to algorithms
    of query complexity o(sqrt(n)).

9
Sub-linear Time Approximation
  • Minimum spanning tree
  • given a connected weighted graph of degree d with
    weights in the range {1, …, w}, can approximate the MST
    weight within (1+ε) in time about
    O(dw/ε²) [Chazelle, Rubinfeld, T]
  • Max SAT
  • Given a CNF where every variable occurs at most d
    times, can approximate the Max SAT optimum within
    .618, presumably also 2/3, in O(d) time [work in
    progress, hopefully will get 3/4 - δ]

10
Sublinear Time Approximation
  • Problems restricted to dense instances
  • Max CUT and other graph problems can be
    approximated within (1+ε) in graphs with at least
    αn² edges in time 2^poly(1/(εα)) [GGR]
  • Max 3SAT can be approximated within (1+ε) in
    instances with at least αn³ clauses in time
    2^poly(1/(εα)), and similar results hold for other
    satisfiability problems [AFKK]

11
General Goals
  • When looking for polynomial-time algorithms
  • Several algorithmic techniques of general
    applicability
  • A general technique to prove impossibility
    (NP-completeness)
  • For sublinear-time algorithms
  • General algorithmic techniques?
  • Impossibility results?

12
Testing 3-Colorability
  • Easy in adjacency matrix representation
  • NP-hard in adjacency list representation
  • Only for small enough ε
  • Can find a 3-coloring good for 80% of the edges in
    a 3-colorable graph using SDP
  • NP-hard to find a 3-coloring good for a 98% (?)
    fraction of the edges
  • Non-tight, and conditional lower bound for query
    complexity

13
Other problems
  • The query complexity of the following problems is
    equivalent to the query complexity of testing 3-colorability
  • Testing satisfiability of a 3SAT instance
  • Every variable occurs in O(1) clauses, adjacency
    list representation
  • Approximating max cut, vertex cover, independent
    set, . . ., in bounded-degree graphs
  • Approximating Max SAT, Max 2SAT, . . .
  • Lower bound of Ω(sqrt(n)) for all problems
  • Nothing better except with complexity assumptions

14
Our Results
  • For one-sided error algorithms
  • Ω(n) query complexity to distinguish 3-colorable
    graphs from graphs that are (1/3 - δ)-far
  • Lower bound applies to testing problems that are
    solvable in polynomial time
  • For two-sided error algorithms
  • For some ε, Ω(n) query complexity to distinguish
    3-colorable graphs from graphs that are ε-far.

15
Additional Results
  • Unconditionally, algorithms running in time o(n)
    cannot
  • Approximate Max 3SAT better than 7/8
  • Approximate Max Cut in bounded-degree graphs
    better than 16/17
  • . . .
  • [Håstad 97] proved that the above problems are NP-hard

16
The 3-Coloring Lower Bound
  • Consider first one-sided error algorithms
  • It's enough to find a graph G that is (1/3 -
    δ)-far from 3-colorable, but every subgraph of
    size < αn is 3-colorable
  • (for every δ there is an α such that . . .)
  • Then an algorithm of query complexity < αn either
    accepts G (which is wrong) or rejects some
    3-colorable graph (which means the algorithm does
    not have one-sided error)

17
The Graph
  • Pick a graph of degree O(1/δ²) at random (take the
    union of that many random matchings)
  • Then it is (1/3 - δ)-far whp
  • But, for some α, whp, every subgraph induced by k
    < αn vertices contains < 1.5k edges
  • In a minimal non-3-colorable graph, every vertex
    has degree at least 3
  • So every subgraph induced by < αn vertices is
    3-colorable
  • [Erdős]
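
The step from "fewer than 1.5k edges on every k vertices" to "3-colorable" can be made concrete: such a subgraph has average degree below 3, so it always contains a vertex of degree at most 2, which can be peeled off and colored last. The sketch below illustrates that peeling argument; the function name and the dict-of-sets graph representation are assumptions made for the example.

```python
def three_color_sparse(adj):
    """Greedy 3-coloring by repeatedly peeling a vertex of degree <= 2.

    `adj` maps each vertex to the set of its neighbors (no self-loops).
    Returns a dict vertex -> color in {0, 1, 2}, or None if peeling gets
    stuck because every remaining vertex has degree >= 3.
    """
    remaining = {v: set(nb) for v, nb in adj.items()}
    order = []
    while remaining:
        v = next((u for u, nb in remaining.items() if len(nb) <= 2), None)
        if v is None:
            return None
        for u in remaining[v]:
            remaining[u].discard(v)
        del remaining[v]
        order.append(v)
    color = {}
    # Color in reverse peeling order: each vertex then has at most two
    # already-colored neighbors, so one of the three colors is free.
    for v in reversed(order):
        used = {color[u] for u in adj[v] if u in color}
        color[v] = min({0, 1, 2} - used)
    return color
```

Returning None does not prove that the graph fails to be 3-colorable; it only means the sparsity condition from the slide does not hold for the remaining vertices.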

18
Explicit Construction
  • Can the previous construction be derandomized?
  • For constants d, ε, α, and for every sufficiently large
    n, we can explicitly construct a graph on n
    vertices, max degree d, ε-far from 3-colorable,
    and such that every subset of αn vertices induces
    a 3-colorable subgraph.

19
Explicit Construction
  • We construct a 3SAT formula such that for
    constants k, ε, α
  • Every variable occurs k times
  • No assignment satisfies more than a 1 - ε fraction
    of the clauses
  • Every α fraction of the clauses is satisfiable
  • Then we use a (slightly new) reduction from 3SAT to
    3-Coloring

20
The Formula
  • Fix a degree-d expander graph G = (V,E) such that
    for every cut (S, V-S) at least min{|S|, |V-S|}
    edges cross the cut (d = 14 is enough)
  • Have two variables x_uv and x_vu for each edge
    (u,v)
  • For every vertex v have the (3SAT equivalent of)
    the constraint
  • Σ_u x_uv = 1 + Σ_w x_vw

21
Structure of the Analysis
  • Impossible to satisfy more than a fraction
    1 - 1/(d+1) of the constraints
  • Can always satisfy any set of fewer than half of the constraints
  • define an auxiliary network
  • show that the auxiliary network has no small cut
    because of expansion
  • then there is a large flow
  • use the large flow to find an assignment for the subset of
    constraints
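
The upper bound on the fraction of satisfiable constraints can be checked with a short counting argument. The derivation below is a sketch under assumptions that are not spelled out on the slide: the variables take values in {0,1}, the constraint at v is read as "flow into v minus flow out of v equals 1", and S denotes the set of vertices whose constraints are satisfied.

```latex
% Assumptions (not stated explicitly on the slide): x_{uv} \in \{0,1\}, and the
% constraint at v reads \sum_u x_{uv} - \sum_w x_{vw} = 1.
% S is the set of vertices whose constraints are satisfied.
\begin{align*}
|S| &= \sum_{v \in S} \Big( \sum_{u} x_{uv} - \sum_{w} x_{vw} \Big)
     && \text{(each satisfied constraint contributes 1)} \\
    &= \sum_{\substack{\{u,v\} \in E \\ u \notin S,\ v \in S}} \big( x_{uv} - x_{vu} \big)
     && \text{(terms for edges inside $S$ cancel)} \\
    &\le d\,|V - S|
     && \text{(at most $d\,|V-S|$ edges leave $V-S$)} .
\end{align*}
% Hence |S| \le d(|V| - |S|), that is, |S| \le \frac{d}{d+1}|V| = \Big(1 - \frac{1}{d+1}\Big)|V|.
```

Under this reading, at least a 1/(d+1) fraction of the constraints is violated by every assignment.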

22
Flow Argument
  • Want to satisfy constraints corresponding to
    vertices in C, with |C| < |V|/2
  • Construct a flow network with a new source s, a sink t
    obtained by collapsing V-C, and the vertices in C
  • [Figure: flow network on s, C, V-C and t]

23
Flow Argument
  • [Figure: a cut of the flow network, with labels s, t, A, C-A,
    "A edges" and "C-A edges"]
  • Every cut has size at least |C|
  • There is a 0/1 flow of cost at least |C|
  • Interpreted as an assignment, it satisfies all
    constraints in C

24
Two-Sided Error Algorithms
  • Need to define two distributions of graphs Gcol
    and Gfar such that
  • Graphs in Gcol are (almost) always 3-colorable
  • Graphs in Gfar are (almost) always far from
    3-colorable
  • To an algorithm of bounded query complexity, Gcol
    and Gfar look (almost) the same

25
Main Step
  • Define two distributions Dsat and Dfar of
    instances of E3LIN-2 (linear systems over GF(2) with 3
    variables per equation)
  • Systems in Dsat are always satisfiable
  • Systems in Dfar are (almost) always (1/2 - δ)-far
    from satisfiable
  • To an algorithm of bounded query complexity, Dsat
    and Dfar look the same
  • We get Gcol and Gfar using a reduction
    from approximate E3LIN-2 to approximate
    3-coloring

26
E3LIN-2
  • X1 + X3 + X10 = 0 mod 2
  • X2 + X3 + X4 = 1 mod 2
  • X1 + X2 + X9 = 0 mod 2
  • . . .

27
Main Building Block
  • We show that for every c there is an α such that
    there exists a left-hand side with
  • n variables, cn equations, 3 variables per
    equation, every variable occurs in 3c equations
  • every αn equations are linearly independent
  • Pick the left-hand side at random
  • repeat 3c times: pick at random a set of n/3
    disjoint triples of variables
  • Explicit construction?

28
Distributions
  • The left-hand side is always as before
  • In Dsat, we pick a random assignment to the
    variables, and set right-hand side consistently
  • always satisfiable
  • In Dfar, we pick the right-hand side uniformly at
    random
  • With high probability, (1/2 - O(1/sqrt(c)))-far
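
The two distributions can be sampled as in the following sketch. The function names, the list-of-triples encoding of the left-hand side and the `fraction_satisfied` helper are assumptions introduced for illustration; the sketch also assumes that c is an integer and that n is divisible by 3, and it does not check the linear-independence property, which the slides only claim to hold for a suitable left-hand side.

```python
import random


def random_left_hand_side(n, c, rng=random):
    """3c rounds; each round splits the n variables into n/3 disjoint triples."""
    assert n % 3 == 0
    equations = []
    for _ in range(3 * c):
        perm = list(range(n))
        rng.shuffle(perm)
        equations.extend(tuple(perm[i:i + 3]) for i in range(0, n, 3))
    return equations  # c*n triples; every variable occurs in exactly 3c


def sample_dsat(equations, n, rng=random):
    """Plant a random assignment and set the right-hand side consistently."""
    x = [rng.randrange(2) for _ in range(n)]
    rhs = [x[i] ^ x[j] ^ x[k] for i, j, k in equations]
    return rhs, x


def sample_dfar(equations, rng=random):
    """Uniformly random right-hand side, independent of any assignment."""
    return [rng.randrange(2) for _ in equations]


def fraction_satisfied(equations, rhs, x):
    """Fraction of equations x_i + x_j + x_k = b (mod 2) satisfied by x."""
    ok = sum((x[i] ^ x[j] ^ x[k]) == b for (i, j, k), b in zip(equations, rhs))
    return ok / len(equations)
```

For a Dsat sample the planted assignment satisfies every equation, while for a Dfar sample any fixed assignment satisfies each equation with probability 1/2.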

29
Indistinguishability
  • The two distributions differ only in the right-hand side
  • In Dfar it is uniformly distributed
  • In Dsat it is αn-wise independent
  • Linear independence implies statistical
    independence
  • They look the same to an algorithm that sees fewer than αn
    equations
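
The claim that linear independence implies statistical independence can be checked by brute force on tiny systems. The helper below is a hypothetical illustration: it enumerates every assignment x in {0,1}^n and counts how often each right-hand side vector occurs for a given set of equations, each given as a tuple of variable indices.

```python
from collections import Counter
from itertools import product


def rhs_distribution(rows, n):
    """Count right-hand side vectors as x ranges over all of {0,1}^n.

    `rows` is a list of equations, each a tuple of variable indices whose
    values are summed mod 2.
    """
    counts = Counter()
    for x in product((0, 1), repeat=n):
        counts[tuple(sum(x[i] for i in row) % 2 for row in rows)] += 1
    return counts
```

For three linearly independent equations such as (0,1,2), (1,2,3), (2,3,4) on 5 variables, all 8 right-hand sides occur equally often, so the right-hand side looks uniform, exactly as in Dfar. For the dependent set (0,1,2), (0,3,4), (1,3,5), (2,4,5), in which every variable occurs twice, only right-hand sides of even parity occur.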

30
Conclusion of the Argument
  • No algorithm of query complexity o(n) can
    distinguish satisfiable instances of E3LIN-2 from
    instances that are (1/2 - δ)-far from satisfiable
  • For some ε, no algorithm of query complexity o(n)
    can distinguish 3-colorable graphs from graphs
    that are ε-far from 3-colorable.
  • No algorithm of query complexity o(n) can
    approximate Max 3SAT better than 7/8 . . .

31
Open Questions
  • Show that distinguishing 3-colorable graphs from
    (1/3 - δ)-far graphs requires query complexity Ω(n)
  • we can only prove it for one-sided error
  • Show that approximating Max SAT better than ¾ and
    Max CUT better than ½ requires query complexity
    Ω(n)
  • we only know Ω(sqrt(n)), implicit in [GR]
  • would explain why we need SDP

32
Back to Dense Graphs
  • Recall the Alon-Krivelevich bipartiteness test for
    the adjacency matrix representation
  • pick (1/ε)·polylog(1/ε) vertices and look at the
    induced subgraph
  • if an odd cycle is seen, reject; otherwise accept
  • Running time (1/ε²)·polylog(1/ε)
  • We prove
  • Ω(1/ε²) for non-adaptive algorithms
  • Ω(1/ε^1.5) for adaptive algorithms

33
Two Distributions
  • Gfar: every edge exists with probability ε
  • whp it is ε/3-far from bipartite
  • Gbip: pick a random partition, then every edge
    that crosses the partition exists with
    probability 2ε
  • Thm 1: they look the same to non-adaptive algorithms
    making o(1/ε²) queries
  • Thm 2: they look the same to adaptive algorithms making
    o(1/ε^1.5) queries
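
A small sampler for the two dense-graph distributions is sketched below. The function names and the edge-set representation (pairs (u, v) with u < v) are assumptions made for the example. Note that a single queried pair is an edge with probability ε under both distributions, since in Gbip the pair crosses the random partition with probability 1/2 and is then present with probability 2ε; the distributions differ only in correlations between pairs.

```python
import random


def sample_gfar(n, eps, rng=random):
    """Every pair {u, v} is an edge independently with probability eps."""
    return {(u, v) for u in range(n) for v in range(u + 1, n)
            if rng.random() < eps}


def sample_gbip(n, eps, rng=random):
    """Random partition; each crossing pair is an edge with probability 2*eps."""
    side = [rng.randrange(2) for _ in range(n)]
    edges = {(u, v) for u in range(n) for v in range(u + 1, n)
             if side[u] != side[v] and rng.random() < 2 * eps}
    return edges, side
```

Every graph drawn from `sample_gbip` is bipartite by construction, with `side` as a witness.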

34
Proof of a Weaker Statement
  • Thm 1 (weaker): a non-adaptive algorithm making
    q = o(1/ε²) queries in Gfar is unlikely to see an
    odd cycle
  • Proof
  • a non-adaptive algorithm asks about some subgraph
    with q edges.
  • There are at most about q^(t/2) cycles of length t,
    and each one exists with probability ε^t, so the expected
    number is about ε^t·q^(t/2), exponentially small in t.
  • Summing over all t, it's still unlikely that
    there is a cycle
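
The final summation can be written out as a geometric series. The computation below is a sketch that takes the slide's rough count of cycles at face value and ignores constant factors; the symbol ρ is introduced here for convenience.

```latex
% Assumptions: \rho = \varepsilon\sqrt{q} < 1, and the number of cycles of
% length t among the q queried pairs is taken to be about q^{t/2}.
\begin{align*}
\Pr[\text{some cycle is seen}]
  \;\le\; \sum_{t \ge 3} q^{t/2}\,\varepsilon^{t}
  \;=\; \sum_{t \ge 3} \rho^{t}
  \;=\; \frac{\rho^{3}}{1-\rho} .
\end{align*}
% If q = o(1/\varepsilon^{2}) then \rho = o(1), so the bound tends to 0.
```

The first inequality is a union bound over all candidate cycles, each of which is present in Gfar with probability ε^t.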

35
Proof of a Weaker Statement
  • Thm 2 (weaker): an adaptive algorithm making
    q = o(1/ε^1.5) queries in Gfar is unlikely to see an
    odd cycle
  • Proof
  • the algorithm sees an edge only once in every 1/ε
    queries
  • the algorithm sees a cycle only after querying a
    pair that it already sees as connected
  • It takes 1/ε^0.5 edges to have 1/ε pairs of
    connected vertices
  • It takes 1/ε^1.5 queries to have that many edges

36
Some more open questions
  • In the adjacency matrix representation, most
    interesting problems are solvable in constant (in ε)
    time
  • For some problems (e.g. testing triangle-freeness) the
    analysis uses Szemerédi's regularity lemma, and the
    constant is hyper-exponential in ε
  • Lower bound (1/ε)^(log 1/ε), and only for
    one-sided error
  • Alternative analysis / stronger lower bounds?