1
An Introduction to Property Testing
  • Andy Drucker
  • 1/07

2
Property Testing: The Art of Uninformed
Decisions
  • This talk developed largely from a survey by
    Eldar Fischer [F97] -- fun reading.

3
The Age-Old Question
  • Accuracy or speed?
  • Work hard or cut corners?
  • In CS: heuristics and approximation algorithms
    vs. exact algorithms for NP-hard problems.
  • Both well-explored. But...
  • Constant time is the new polynomial time.
  • So, are there any corners left to cut?

8
The Setting
  • Given a set of n inputs x and a property P,
    determine if P(x) holds.
  • Generally takes at least n steps (for
    sequential, nonadaptive algorithms).
  • Possibly infeasible if we are doing internet or
    genome analysis, program checking, PCPs, ...

11
First Compromise
  • What if we only want a check that rejects any x
    such that ¬P(x), with probability > 2/3? Can we
    do better?
  • Intuitively, we must expect to look at almost
    all input bits if we hope to reject x that are
    only one bit away from satisfying P.
  • So, no.

14
Second Compromise
  • This kind of failure is universal. So, we must
    scale our hopes back.
  • The problem: those almost-correct instances are
    too hard.
  • The solution: assume they never occur!

17
Second Compromise
  • Only worry about instances y that either satisfy
    P or are at edit distance ≥ cn from every
    satisfying instance. (We say y is c-bad.)
  • Justifying this assumption is application-specific:
    the excluded middle might not arise, or it might
    just be less important.

19
Model Decisions
  • Adaptive or non-adaptive queries?
  • Adaptivity can be dispensed with at the cost of
    (exponentially many) more queries.
  • One-sided or two-sided error?
  • Error probability can be diminished by repeated
    trials.
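The error-reduction step mentioned above can be sketched as a generic wrapper; the name `amplify` and the majority-vote scheme are illustrative, not from the talk:

```python
def amplify(test, reps=15):
    """Boost a two-sided-error tester: run it `reps` times independently
    and take the majority vote. By a Chernoff bound, the error
    probability drops exponentially in `reps`."""
    def boosted(*args):
        votes = sum(bool(test(*args)) for _ in range(reps))
        return 2 * votes > reps
    return boosted
```

For one-sided error, accepting only when every run accepts already suffices.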

23
The Trivial Case: Statistical Sampling
  • Let P(x) = 1 iff x is all-zeroes.
  • Then y is c-bad if and only if y has more than
    cn ones.
  • Algorithm: sample O(1/c) random, independently
    chosen bits of y, accepting iff all bits come up
    0.
  • If y is c-bad, a 1 will appear with probability
    ≥ 2/3.
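A sketch of this sampling tester in Python (representing y as a list of bits is my choice):

```python
import random

def all_zeroes_test(y, c):
    """One-sided tester for 'y is all-zeroes'.
    Samples ~2/c independent random positions; accepts iff all are 0.
    If y is c-bad (> cn ones), each sample hits a 1 with probability > c,
    so all k samples miss with probability < (1-c)^(2/c) < e^-2 < 1/3."""
    k = max(1, int(2 / c))
    n = len(y)
    return all(y[random.randrange(n)] == 0 for _ in range(k))
```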

28
Sort-Checking [EKKRV98]
  • Given a list L of n numbers, let P(L) be the
    property that L is in nondecreasing order. How
    to test for P with few queries?
  • (Now queries are to numbers, not bits.)

29
Sort-Checking [EKKRV98]
  • First try: Pick k random entries of L, check that
    their contents are in nondecreasing order.
  • Correct on sorted lists.
  • Suppose L is c-bad: what k will suffice to reject
    L with probability ≥ 2/3?

32
Sort-Checking (cont'd)
  • Uh-oh, what about
    L = (2, 1, 4, 3, 6, 5, ..., 2n, 2n - 1)?
  • It's ½-bad, yet we need k ≈ sqrt(n) to succeed.
  • Modify the algorithm to test adjacent pairs? But
    this algorithm, too, has its blind spots.
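The two blind spots can be made concrete; these small example lists are illustrative:

```python
# Blind spot of the random-subset tester: every adjacent pair swapped.
# A k-subset detects a violation only if it contains both halves of some
# pair, which by a birthday argument needs k on the order of sqrt(n).
n = 8
pairs_swapped = [x for i in range(1, n + 1) for x in (2 * i, 2 * i - 1)]

# Blind spot of an adjacent-pairs tester: two sorted blocks out of order.
# Only one adjacent pair is inverted, yet the list is 1/2-bad.
two_blocks = list(range(n + 1, 2 * n + 1)) + list(range(1, n + 1))
inversions = sum(two_blocks[i] > two_blocks[i + 1]
                 for i in range(len(two_blocks) - 1))
```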

35
An O((1/c) log n) solution [EKKRV98]
  • Place the entries on a binary tree by an in-order
    traversal.
  • Repeat O(1/c) times: pick a random i ≤ n, and
    check that L[i] is sorted with respect to its
    path from the root.
  • (Each such check must query the whole path, O(log
    n) entries.)
  • Algorithm is non-adaptive with one-sided error.
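One way to realize the path-check is to walk the implicit balanced BST whose in-order traversal is the list; this is a sketch of that idea, not necessarily [EKKRV98]'s exact formulation:

```python
import random

def path_check(L, i):
    """Walk the implicit balanced BST (in-order = list order) from the
    root down to index i, checking L[i] against each ancestor's value."""
    lo, hi = 0, len(L) - 1
    while lo <= hi:
        mid = (lo + hi) // 2          # current tree node
        if mid == i:
            return True
        if mid < i:                   # i lies in the right subtree:
            if L[mid] > L[i]:         # ancestor's value must be <= L[i]
                return False
            lo = mid + 1
        else:                         # i lies in the left subtree:
            if L[mid] < L[i]:         # ancestor's value must be >= L[i]
                return False
            hi = mid - 1
    return True

def sort_check(L, c):
    """Accepts every sorted list; rejects c-bad lists w.p. >= 2/3
    using O(1/c) path-checks of O(log n) queries each."""
    for _ in range(max(1, int(2 / c))):
        if not path_check(L, random.randrange(len(L))):
            return False
    return True
```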

38
Sort-Checking Analysis
  • If L is sorted, each check will succeed.
  • What if L is c-bad? Equivalently, what if L
    contains no nondecreasing subsequence of length
    (1 - c)n?

40
Sort-Checking Analysis
  • It turns out that a contrapositive analysis works
    more easily.
  • That is, suppose most such path-checks for L
    succeed; we argue that L must be close to a
    sorted list L' which we will define.

42
Sort-Checking Analysis (cont'd)
  • Let S be the set of indices for which the
    path-check succeeds.
  • If a path-check of a randomly chosen element
    succeeds with probability > (1 - c), then
  • |S| > (1 - c)n.
  • We claim that L, restricted to S, is in
    nondecreasing order!
  • (For i < j in S, the two root-to-leaf paths
    diverge at a common ancestor m with i ≤ m ≤ j,
    and the two successful checks give
    L[i] ≤ L[m] ≤ L[j].)

45
Sort-Checking Analysis (cont'd)
  • Then, by correcting entries not in S to agree
    with the order of S, we get a sorted list L' at
    edit distance < cn from L.
  • So L cannot be c-bad.

47
Sort-Checking Analysis (cont'd)
  • Thus, if L is c-bad, it fails each path-check
    with probability > c, and O(1/c) path-checks
    expose it with probability ≥ 2/3.
  • This proves correctness of the (non-adaptive,
    one-sided error) algorithm.
  • [EKKRV98] also shows this is essentially optimal.

50
First Moral
  • We saw that it can take insight to discover and
    analyze the right "local signature" for the
    global property of failing to satisfy P.

51
Second Moral
  • This, and many other property-testing algorithms,
    work because they implicitly define a correction
    mechanism for property P.
  • For an algebraic example:

53
Linearity Testing [BLR93]
  • Given a function f : {0,1}^n → {0,1}.
  • We want to differentiate, probabilistically and
    in few queries, between the case where f is
    linear
  • (i.e., f(x⊕y) = f(x) + f(y) (mod 2) for all x, y),
  • and the case where f is c-far from any linear
    function.

56
Linearity Testing [BLR93]
  • How about the naïve test: pick x, y at random,
    and check that
  • f(x⊕y) = f(x) + f(y) (mod 2)?
  • Previous sorting example warns us not to assume
    this is effective.
  • Are there "pseudo-linear" functions out there?
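This naïve test can be sketched directly, representing points of {0,1}^n as n-bit integers (the encoding and the helper `parity` are my choices):

```python
import random

def parity(x):
    """A genuinely linear function: the inner product <a, x> mod 2
    for a fixed vector a."""
    a = 0b1011  # arbitrary fixed vector defining the linear map
    return bin(x & a).count("1") % 2

def blr_test(f, n, trials):
    """Reject iff f(x ^ y) != f(x) + f(y) (mod 2) on some random pair."""
    for _ in range(trials):
        x = random.getrandbits(n)
        y = random.getrandbits(n)
        if f(x ^ y) != (f(x) + f(y)) % 2:
            return False
    return True
```

A linear f always passes; whether a few random trials suffice to catch far-from-linear functions is exactly what the following analysis settles.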

59
Linearity Test - Analysis
  • If f is linear, it always passes the test.
  • Now suppose f passes the test with probability >
    1 - d, where d < 1/12;
  • we define a linear function g that is 2d-close to
    f.
  • So, if f is 2d-bad, it fails the test with
    probability > d, and O(1/d) iterations of the
    test suffice to reject f with probability ≥ 2/3.

63
Linearity Test - Analysis
  • Define g(x) = majority_r( f(x⊕r) ⊕ f(r) ),
  • over a random choice of vector r.
  • f passes the test with probability at most
  • 1 - t/2, where t is the fraction of entries
    where g and f differ.
  • 1 - t/2 > 1 - d implies t < 2d,
  • so f, g are 2d-close, as claimed.
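This majority definition of g is also the correction mechanism promised in the Second Moral: g can be evaluated anywhere by sampling. A sketch; the sample count 101 is arbitrary:

```python
import random

def self_correct(f, n, x, samples=101):
    """Estimate g(x) = majority over random r of f(x ^ r) ^ f(r).
    If f is d-close to linear for small d, this returns the value of
    the nearby linear function with high probability."""
    ones = 0
    for _ in range(samples):
        r = random.getrandbits(n)
        ones += f(x ^ r) ^ f(r)
    return 1 if 2 * ones > samples else 0
```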

66
Linearity Test - Analysis
  • Now we must show g is linear.
  • For c < 1, let G_(1-c) =
  • { x : with probability > 1-c over r,
  • f(x) = f(x⊕r) ⊕ f(r) }.
  • Let t_c = 1 - |G_(1-c)| / 2^n.

68
Linearity Test - Analysis
  • Reasoning as before, we have t_c < d / c.
  • Thus, t_(1/6) < 6d < 1/2.
  • Then, given any x, there must exist a z such that
    z, x⊕z are both in G_(5/6).

71
Linearity Test - Analysis
  • Now, what is Prob[ g(x) = f(x⊕r) ⊕ f(r) ]?
  • (How resoundingly is the majority vote
    decided for an arbitrary x?)
  • It's the same as
  • Prob[ g(x) = f(x ⊕ (z⊕r)) ⊕ f(z⊕r) ],
  • since for fixed z, z⊕r is uniformly
    distributed if r is.

74
Linearity Test - Analysis
  • Now f(x ⊕ (z⊕r)) ⊕ f(z⊕r) =
  • [ f((x⊕z) ⊕ r) ⊕ f(r) ] ⊕ [ f(z⊕r) ⊕ f(r) ].
  • Since x⊕z, z are in G_(5/6), with probability
    greater than 1 - 2(1/6) = 2/3, this expression
    equals g(x⊕z) ⊕ g(z).
  • So every x's majority vote is decided by a
  • > 2/3 majority.

77
Linearity Test - Analysis
  • Finally we show g(x⊕y) = g(x) ⊕ g(y) for all x,
    y.
  • Choosing a random r, f(x⊕y⊕r) ⊕ f(r) = g(x⊕y)
    with probability > 2/3.
  • Also, f(x ⊕ (y⊕r)) ⊕ f(y⊕r) = g(x), and
  • f(y⊕r) ⊕ f(r) = g(y), each with probability >
    2/3.
  • Then with probability > 1 - 3(1/3) = 0, all 3
    occur. Adding, we get g(x⊕y) = g(x) ⊕ g(y).
    QED.

81
Testing Graph Properties
  • A graph property should be invariant under
    vertex permutations.
  • Two query models: i) adjacency-matrix queries,
    ii) neighborhood queries.
  • ii) is appropriate for sparse graphs.

84
Testing Graph Properties
  • For hereditary graph properties, most common
    testing algorithms simply check random subgraphs
    for the property.
  • E.g., to test if a graph is triangle-free, check
    a small random subgraph for triangles.
  • Obvious algorithms, but often require very
    sophisticated analysis.
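The triangle-freeness tester can be sketched as follows; note that the sample size the analysis actually requires (via the regularity lemma) is enormous as a function of 1/c, so the parameter `s` here is purely illustrative:

```python
import random
from itertools import combinations

def has_triangle(adj, verts):
    """Check the subgraph induced on `verts` (adjacency matrix `adj`)."""
    return any(adj[u][v] and adj[v][w] and adj[u][w]
               for u, v, w in combinations(verts, 3))

def triangle_free_test(adj, s):
    """One-sided tester: sample s random vertices and look for a
    triangle among them. Triangle-free graphs always pass; finding a
    triangle is conclusive grounds for rejection."""
    verts = random.sample(range(len(adj)), s)
    return not has_triangle(adj, verts)
```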

87
Testing Graph Properties (cont'd)
  • Efficiently testable properties in model i)
    include:
  • Bipartiteness
  • 3-colorability
  • In fact [AS05], every property that's monotone
    in the entries of the adjacency matrix!
  • A combinatorial characterization of the testable
    graph properties is known [AFNS06].

92
Lower Bounds via Yao's Principle
  • A q-query probabilistic algorithm A testing for
    property P can be viewed as a randomized choice
    among q-query deterministic algorithms T_i.
  • We will look at the 2-sided error model.

94
Lower Bounds via Yao's Principle
  • For any distribution D on inputs, the probability
    that A accepts its input is a weighted average of
    the acceptance probabilities of the T_i.

95
Lower Bounds via Yao's Principle
  • Suppose we can find (D_Y, D_N), two distributions
    on inputs, such that:
  • i) x from D_Y all satisfy property P;
  • ii) x from D_N are all c-bad;
  • iii) D_Y and D_N are statistically 1/3-close on
    any fixed q entries.

99
Lower Bounds via Yao's Principle
  • Then, given a non-adaptive deterministic q-query
    algorithm T_i, the statistical distance between
    T_i(D_Y) and T_i(D_N) is at most 1/3, so the
    same holds for our randomized algorithm A!
  • Thus A cannot simultaneously accept all
    P-satisfying instances with prob. > 2/3 and
    accept all c-bad instances with prob. < 1/3.

101
Example: Graph Isomorphism
  • Let P(G_1, G_2) be the property that G_1 and G_2
    are isomorphic.
  • Let D_Y be distributed as (G, pi(G)), where G is
    a random graph and pi a random permutation.
  • Let D_N be distributed as (G_1, G_2), where G_1,
    G_2 are independent random graphs.
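Both distributions are easy to sample; a sketch, assuming "random graph" means the Erdos-Renyi model G(n, 1/2):

```python
import random

def random_graph(n, p=0.5):
    """Erdos-Renyi random graph as a symmetric adjacency matrix."""
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            adj[i][j] = adj[j][i] = int(random.random() < p)
    return adj

def permuted(adj, pi):
    """Relabel vertices of `adj` by the permutation `pi`."""
    n = len(adj)
    return [[adj[pi[i]][pi[j]] for j in range(n)] for i in range(n)]

def sample_D_Y(n):
    """Yes-instances: (G, pi(G)) are isomorphic by construction."""
    G = random_graph(n)
    pi = list(range(n))
    random.shuffle(pi)
    return G, permuted(G, pi)

def sample_D_N(n):
    """No-instances: two independent random graphs, which are almost
    always far from isomorphic for large n."""
    return random_graph(n), random_graph(n)
```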

104
Example: Graph Isomorphism
  • Briefly: (G_1, G_2) is almost always far from
    satisfying P, because for any fixed permutation
    pi, the adjacency matrices of G_1 and pi(G_2) are
    too unlikely to be similar.
  • D_Y "looks like" D_N as long as we don't query
    both a lhs vertex of G and its rhs counterpart in
    pi(G), since pi is unknown.
  • This approach proves a sqrt(n) query lower bound.

107
Concluding Thoughts
  • Property Testing:
  • Revitalizes the study of familiar properties
  • Leads to simply stated, intuitive, yet
    surprisingly tough conjectures

109
Concluding Thoughts
  • Property Testing:
  • Contains hidden layers of algorithmic ingenuity
  • Brilliantly meets its own lowered standards

111
Acknowledgments
  • Thanks for Listening!
  • Thanks to Russell Impagliazzo and Kirill
    Levchenko for their help.

112
References
  • [AS05] N. Alon and A. Shapira. Every Monotone Graph
    Property is Testable. STOC 2005.
  • [AFNS06] N. Alon, E. Fischer, I. Newman, and A.
    Shapira. A combinatorial characterization of the
    testable graph properties: it's all about
    regularity. STOC 2006.
  • [BLR93] M. Blum, M. Luby, and R. Rubinfeld.
    Self-testing/correcting with applications to
    numerical problems. Journal of Computer and System
    Sciences, 47(3):549-595, 1993. (linearity test)
  • [EKKRV98] F. Ergun, S. Kannan, S. R. Kumar, R.
    Rubinfeld, and M. Viswanathan. Spot-checkers.
    STOC 1998. (sort-checking)
  • [F97] E. Fischer. The Art of Uninformed Decisions.
    Bulletin of the EATCS 75:97-126 (2001). (survey on
    property testing)