Title: Approximating The Permanent
Slide 1: Approximating The Permanent
Seminar in Complexity, 04/06/2001
Slide 2: Topics
- Description of the Markov chain
- Analysis of its mixing time
Slide 3: Definitions
- Let G = (V1, V2, E) be a bipartite graph on n + n vertices.
- Let M denote the set of perfect matchings in G.
- Let M(y, z) denote the set of near-perfect matchings with holes only at y and z.
Slide 4: |M(u,v)|/|M| Exponentially Large
- Observe the following bipartite graph, with distinguished vertices u and v (figure omitted).
- It has only one perfect matching...
Slide 5: |M(u,v)|/|M| Exponentially Large
- But it has two near-perfect matchings with holes at u and v (figure omitted).
Slide 6: |M(u,v)|/|M| Exponentially Large
- Concatenating another hexagon:
  - adds a constant number of vertices,
  - but doubles the number of near-perfect matchings,
  - while the number of perfect matchings remains 1.
Thus we can force the ratio |M(u,v)|/|M| to be exponentially large.
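The definitions above can be checked by brute force on small graphs. A minimal sketch (the slide's hexagon-chain figure is not reproduced here, so a single 6-cycle stands in for it; all function names are ours):

```python
def all_matchings(edges):
    """Enumerate every matching (set of pairwise disjoint edges)."""
    out = []
    def extend(chosen, rest, used):
        out.append(frozenset(chosen))
        for i, (a, b) in enumerate(rest):
            if a not in used and b not in used:
                extend(chosen + [(a, b)], rest[i + 1:], used | {a, b})
    extend([], list(edges), set())
    return out

def perfect_and_near(vertices, edges, u, v):
    """(#perfect matchings, #near-perfect matchings with holes exactly at u and v)."""
    vs = set(vertices)
    perfect = near = 0
    for m in all_matchings(edges):
        covered = {x for e in m for x in e}
        if covered == vs:
            perfect += 1
        elif covered == vs - {u, v}:
            near += 1
    return perfect, near

# One "hexagon": the 6-cycle, with u = 0 and v = 3 as an antipodal hole pair.
hexagon = [(i, (i + 1) % 6) for i in range(6)]
print(perfect_and_near(range(6), hexagon, 0, 3))  # (2, 1)
```

On the 6-cycle the counts are still balanced; the point of the slide's chained construction is that each extra hexagon doubles the near-perfect count while the perfect count stays fixed.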
Slide 7: The Breakthrough
- Jerrum, Sinclair, and Vigoda (2000) introduced an additional weight factor.
- Any hole pattern (including the one with no holes) is equally likely in the stationary distribution π.
- In particular, π will assign Θ(1/n²) weight to the perfect matchings.
Slide 8: Edge Weights
- For each edge (y, z) ∈ E, we introduce a positive activity λ(y, z).
- For a matching M, λ(M) = Π_{(i,j) ∈ M} λ(i, j).
- For a set of matchings S, λ(S) = Σ_{M ∈ S} λ(M).
- We will work with the complete bipartite graph on n + n vertices:
  - λ(e) = 1 for all e ∈ E,
  - λ(e) → 0 for all e ∉ E.
Slide 9: The Stationary Distribution
- The desired distribution π over Ω satisfies π(M) ∝ Λ(M), where
    Λ(M) = λ(M)·w(u, v)   if M ∈ M(u, v),
    Λ(M) = λ(M)           if M ∈ M.
- Here w : V1 × V2 → ℝ⁺ is the weight function, to be specified shortly.
Slide 10: The Markov Chain
1. Choose an edge e = (u, v) uniformly at random.
2. (i) If M ∈ M and e ∈ M, let M′ = M \ e;
   (ii) if M ∈ M(u, v), let M′ = M ∪ e;
   (iii) if M ∈ M(u, z) where z ≠ v, and (y, v) ∈ M, let M′ = M ∪ e \ (y, v);
   (iv) if M ∈ M(y, v) where y ≠ u, and (u, z) ∈ M, let M′ = M ∪ e \ (u, z).
3. Metropolis rule: with probability min{1, Λ(M′)/Λ(M)}, go to M′; otherwise, stay at M.
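The four move types and the Metropolis filter can be rendered directly in code. This is a minimal, unoptimized sketch under assumptions from the slides (complete bipartite graph on n + n vertices; matchings stored as sets of (left, right) pairs; `lam` holds the activities and `w` the hole-pattern weights; all names are ours):

```python
import random

def holes(matching, n):
    """Unmatched (left, right) vertices of a matching on K_{n,n}."""
    left = set(range(n)) - {u for u, v in matching}
    right = set(range(n)) - {v for u, v in matching}
    return left, right

def Lambda(matching, n, lam, w):
    """Lambda(M): lambda(M), times w(u, v) if M has holes at u and v."""
    val = 1.0
    for e in matching:
        val *= lam[e]
    hl, hr = holes(matching, n)
    if hl:
        val *= w[(min(hl), min(hr))]
    return val

def step(M, n, lam, w, rng):
    """One transition of the chain: moves (i)-(iv) plus the Metropolis filter."""
    if rng.random() < 0.5:                      # self-loop probability 1/2
        return M
    u, v = rng.randrange(n), rng.randrange(n)   # edge e = (u, v) u.a.r.
    hl, hr = holes(M, n)
    new = set(M)
    if not hl and (u, v) in M:                  # (i) M perfect, e in M
        new.remove((u, v))
    elif hl == {u} and hr == {v}:               # (ii) holes exactly at u and v
        new.add((u, v))
    elif hl == {u} and hr != {v}:               # (iii) holes at u, z with z != v
        y = next(a for a, b in M if b == v)
        new.add((u, v)); new.remove((y, v))
    elif hr == {v} and hl != {u}:               # (iv) holes at y, v with y != u
        z = next(b for a, b in M if a == u)
        new.add((u, v)); new.remove((u, z))
    else:
        return M
    new = frozenset(new)
    if rng.random() < min(1.0, Lambda(new, n, lam, w) / Lambda(M, n, lam, w)):
        return new                              # Metropolis acceptance
    return M

# Smoke test on K_{2,2} with lambda = w = 1: the chain should only ever
# visit perfect matchings (2 edges) and near-perfect matchings (1 edge).
rng = random.Random(42)
n = 2
lam = {(u, v): 1.0 for u in range(n) for v in range(n)}
w = dict(lam)
M = frozenset({(0, 0), (1, 1)})
seen = set()
for _ in range(2000):
    M = step(M, n, lam, w, rng)
    seen.add(M)
print(all(len(s) in (1, 2) for s in seen))  # True
```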
Slide 11: The Markov Chain (cont.)
- Finally, we add a self-loop probability of ½ to every state.
- This ensures the MC is aperiodic.
- We also have irreducibility.
What about the stationary distribution?
Slide 12: Detailed Balance
- Consider two adjacent matchings M and M′ with Λ(M′) ≤ Λ(M), so that P(M, M′) > 0: the move M → M′ is proposed with probability 1/2m and accepted with probability Λ(M′)/Λ(M), while the reverse move is always accepted.
- ⇒ Λ(M)P(M, M′) = Λ(M′)/2m = Λ(M′)P(M′, M).
- Hence the chain is reversible with stationary distribution π ∝ Λ; we write Q(M, M′) = π(M)P(M, M′) for this symmetric quantity.
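A quick numeric sanity check of the detailed-balance identity for the Metropolis rule (the values of Λ(M), Λ(M′), and m below are arbitrary and made up):

```python
import math

m = 9                          # number of edges (arbitrary)
Lam = {"M": 2.5, "Mp": 0.7}    # made-up state weights Lambda(M), Lambda(M')

def P(a, b):
    """P(a -> b): propose one of the m edges (overall probability 1/2,
    because of the self-loop), then accept with the Metropolis probability."""
    return (1.0 / (2 * m)) * min(1.0, Lam[b] / Lam[a])

forward = Lam["M"] * P("M", "Mp")    # Lambda(M)  P(M, M')
backward = Lam["Mp"] * P("Mp", "M")  # Lambda(M') P(M', M)
print(math.isclose(forward, backward))  # True
```

Both flows equal Λ(M′)/2m, which is exactly the symmetric quantity Q(M, M′) up to normalization.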
Slide 13: The Ideal Weight
- Recall that π(M) ∝ Λ(M), where Λ(M) = λ(M)·w(u, v) for M ∈ M(u, v), and Λ(M) = λ(M) for M ∈ M.
- Ideally, we would take w = w*, where
    w*(u, v) = λ(M) / λ(M(u, v)).
- With this choice, Λ(M(u, v)) = w*(u, v)·λ(M(u, v)) = λ(M), so every hole pattern carries the same total weight.
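For the complete bipartite graph with all activities equal to 1 — the starting point of the annealing schedule described later — the ideal weights are easy to compute by hand: λ(M) = n! and λ(M(u, v)) = (n−1)!, so w*(u, v) = n for every pair. A brute-force confirmation (function names are ours):

```python
from itertools import permutations

def lam_perfect(n):
    """lambda(M) on K_{n,n} with all activities 1: the number of perfect matchings."""
    return sum(1 for _ in permutations(range(n)))

def lam_near(n, u, v):
    """lambda(M(u, v)): matchings of the left side minus u onto the right side minus v."""
    others = [j for j in range(n) if j != v]
    return sum(1 for _ in permutations(others))

n = 4
assert all(lam_perfect(n) / lam_near(n, u, v) == n
           for u in range(n) for v in range(n))
print(lam_perfect(n), lam_near(n, 0, 0))  # 24 6
```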
Slide 14: The Concession
- We will content ourselves with weights w satisfying
    w*(u, v)/2 ≤ w(u, v) ≤ 2·w*(u, v)   for all (u, v).
- This perturbation will reduce the relative weight of perfect and near-perfect matchings by at most a constant factor (4).
Slide 15: The Mixing Time Theorem
- Assuming the weight function w satisfies the above inequality for all (y, z) ∈ V1 × V2, the mixing time of the MC is bounded above by
    τ(δ) = O(m⁶n⁸(n log n + log δ⁻¹)),
  provided the initial state is a perfect matching of maximum activity.
Slide 16: Edge Weights Revisited
- We will work with the complete bipartite graph on n + n vertices.
- Think of non-edges e ∉ E as having a very small activity of 1/n!.
- The combined weight of all invalid matchings is then at most 1.
- We begin with activities λ whose ideal weights w* are easy to compute, and progress towards our target activities:
    initial: λ(e) = 1 for all e   →   target: λ(e) = 1 for e ∈ E, λ(e) = 1/n! for e ∉ E.
Slide 17: Step I
- We assume that at the beginning of the phase, w(u, v) approximates w*(u, v) within ratio 2 for all (u, v).
- Before updating an activity, we will find for each (u, v) a better approximation, one that is within ratio c for some 1 < c < 2.
- For this purpose we use the identity
    w*(u, v) = w(u, v) · π(M) / π(M(u, v)).
Slide 18: Step I (cont.)
- The mixing time theorem allows us to sample, in polynomial time, from a distribution that is within variation distance δ of π.
- We choose δ = c₁/n², take O(n² log η⁻¹) samples, and use sample averages to estimate the ratios π(M)/π(M(u, v)).
- Using a few Chernoff bounds, we have, with probability 1 − (n² + 1)η, an approximation within ratio c to all of the w*(u, v).
  (Here c₁ > 0 is a sufficiently small constant.)
Slide 19: Step I (conclusion)
- Taking c = 6/5 and using O(n² log η⁻¹) samples, we obtain refined estimates w′(u, v) satisfying
    (5/6)·w*(u, v) ≤ w′(u, v) ≤ (6/5)·w*(u, v).
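A sketch of the refinement step, with the stationary probabilities in the identity replaced by sample frequencies (the sample counts below are made up for illustration; the function name is ours):

```python
def refine(w_old, samples):
    """Refine weight estimates from a list of sampled hole patterns.

    `samples` contains None for a perfect matching, or the hole pair (u, v).
    Uses the identity  w*(u, v) = w(u, v) * pi(M) / pi(M(u, v)),
    with probabilities replaced by sample frequencies."""
    n_perfect = samples.count(None)
    refined = {}
    for (u, v), w in w_old.items():
        n_uv = samples.count((u, v))
        refined[(u, v)] = w * n_perfect / n_uv
    return refined

# Hypothetical counts: 50 perfect-matching samples, 40 with holes at (0, 0).
samples = [None] * 50 + [(0, 0)] * 40
print(refine({(0, 0): 2.0}, samples))  # {(0, 0): 2.5}
```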
Slide 20: Step II
- We update the activity of an edge e:
    λ(e) ← λ(e)·exp(−1/2).
- The ideal weight function w* then changes by at most a factor of exp(1/2).
- Since (6/5)·exp(1/2) ≈ 1.978 < 2, our estimates w′ after Step I approximate w* within ratio 2 for the new activities.
Slide 21: Step II (cont.)
- We use the above procedure repeatedly to reduce the initial activities (λ(e) = 1 for all e) to the target activities (λ(e) = 1 for e ∈ E, λ(e) = 1/n! for e ∉ E).
- This requires O(n² · n log n) phases.
- Each phase requires O(n² log η⁻¹) samples.
- Each sample requires O(n²¹ log n) simulation steps (mixing time theorem).
- Overall time: O(n²⁶ log² n · log η⁻¹).
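The phase count can be sanity-checked numerically: each of the up-to-n² non-edges must be driven from activity 1 down to 1/n!, losing a factor exp(1/2) per phase, i.e. about 2 ln n! phases per edge. A small sketch under these assumptions (the function name is ours):

```python
from math import ceil, factorial, log

def phases(n):
    """Total annealing phases: n^2 edges, each needing ceil(2 ln n!)
    multiplications by exp(-1/2) to go from activity 1 down to 1/n!."""
    per_edge = ceil(2 * log(factorial(n)))
    return n * n * per_edge

# O(n^2 * n log n) phases overall, as claimed on the slide.
for n in (5, 10, 20):
    assert phases(n) <= 3 * n**3 * log(n)
print(phases(10))  # 3100
```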
Slide 22: The η Error
- We need to set η so that the overall failure probability is strictly less than ε; say, at most ε/2.
- The probability that some phase fails is at most O(n³ log n · n²η).
- We will therefore take η = c₂·ε / (n⁵ log n).
Slide 23: Time Complexity
Slide 24: Conductance
- The conductance of a reversible MC is defined as
    Φ = min_{S ⊆ Ω, 0 < π(S) ≤ 1/2} Φ(S),   where Φ(S) = Q(S, S̄) / (π(S)·π(S̄))
  and Q(S, S̄) = Σ_{M ∈ S, M′ ∈ S̄} Q(M, M′).
- Theorem: For an ergodic, reversible Markov chain with self-loop probabilities P(x, x) ≥ ½ for all states x ∈ Ω,
    τ_x(δ) ≤ (2/Φ²)·(log π(x)⁻¹ + log δ⁻¹).
Slide 25: Canonical Paths
- We define canonical paths γ_{I,F} from all I ∈ Ω to all F ∈ M.
- Denote Γ = {γ_{I,F} : (I, F) ∈ Ω × M}.
- Certain transitions on a canonical path will be deemed chargeable.
- For each transition t, denote cp(t) = {(I, F) : γ_{I,F} contains t as a chargeable transition}.
Slide 26: I ⊕ F
- If I ∈ M, then I ⊕ F consists of a collection of alternating cycles.
- If I ∈ M(y, z), then I ⊕ F consists of a collection of alternating cycles together with a single alternating path from y to z.
Slide 27: Type A Path
- Assume I ∈ M.
- A cycle v_0 → v_1 → … → v_{2k} = v_0, where w.l.o.g. the edge (v_0, v_1) belongs to I, is unwound by
  (i) removing the edge (v_0, v_1),
  (ii) successively, for each 1 ≤ i ≤ k−1, exchanging the edge (v_{2i}, v_{2i+1}) with (v_{2i-1}, v_{2i}),
  (iii) adding the edge (v_{2k-1}, v_{2k}).
- All these transitions are deemed chargeable.
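The unwinding procedure can be written out directly. A sketch, representing matchings as sets of undirected edges (frozensets); all names are ours:

```python
def unwind_cycle(I, cycle):
    """Unwind one alternating cycle v_0, v_1, ..., v_{2k-1} (with v_{2k} = v_0),
    where (v_0, v_1) lies in the current matching I, following steps (i)-(iii).
    Returns the sequence of states visited along the canonical path."""
    v = lambda j: cycle[j % len(cycle)]
    e = lambda a, b: frozenset((a, b))
    k = len(cycle) // 2
    M = {frozenset(x) for x in I}
    states = [frozenset(M)]
    M.remove(e(v(0), v(1)))                  # (i) remove (v_0, v_1)
    states.append(frozenset(M))
    for i in range(1, k):                    # (ii) exchange (v_2i, v_{2i+1})
        M.remove(e(v(2 * i), v(2 * i + 1)))  #      with (v_{2i-1}, v_2i)
        M.add(e(v(2 * i - 1), v(2 * i)))
        states.append(frozenset(M))
    M.add(e(v(2 * k - 1), v(2 * k)))         # (iii) add (v_{2k-1}, v_{2k})
    states.append(frozenset(M))
    return states

# Unwinding the 6-cycle 0-1-2-3-4-5 starting from I = {01, 23, 45}:
I = [(0, 1), (2, 3), (4, 5)]
states = unwind_cycle(I, [0, 1, 2, 3, 4, 5])
print(states[-1] == {frozenset(p) for p in [(1, 2), (3, 4), (5, 0)]})  # True
```

Every intermediate state is a near-perfect matching with one hole fixed at v_0 and the other hole sliding around the cycle, which is what makes each transition a legal move of the chain.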
Slide 28: Type A Path Illustrated (figure omitted)
Slide 29: Type B Path
- Assume I ∈ M(y, z).
- The alternating path y = v_0 → … → v_{2k+1} = z is unwound by
  (i) successively, for each 1 ≤ i ≤ k, exchanging the edge (v_{2i-1}, v_{2i}) with (v_{2i-2}, v_{2i-1}), and
  (ii) adding the edge (v_{2k}, v_{2k+1}).
- Here, only the above transitions are deemed chargeable.
Slide 30: Type B Path Illustrated (figure omitted)
Slide 31: Congestion
- We define a notion of congestion of Γ:
    ϱ(Γ) = max_t (1/Q(t)) · Σ_{(I,F) ∈ cp(t)} π(I)·π(F),
  where the maximum is over transitions t of the chain.
- Lemma I: Assuming the weight w approximates w* within ratio 2, then ϱ(Γ) ≤ 16m.
Slide 32: Lemma II
- Let u, y ∈ V1 and v, z ∈ V2. Then:
  (i) λ(u, v)·λ(M(u, v)) ≤ λ(M), for all vertices u, v;
  (ii) λ(u, v)·λ(M(u, z))·λ(M(y, v)) ≤ λ(M)·λ(M(y, z)), for all distinct vertices u, y and v, z.
- For (ii), observe that for M_{u,z} ∈ M(u, z) and M_{y,v} ∈ M(y, v), the symmetric difference M_{u,z} ⊕ M_{y,v} ⊕ {(u, v)} decomposes into a collection of cycles together with an odd-length path joining y and z.
Slide 33: Corollary III
- Let u, y ∈ V1 and v, z ∈ V2. Then:
  (i) w*(u, v) ≥ λ(u, v), for all vertices u, v;
  (ii) w*(u, z)·w*(y, v) ≥ λ(u, v)·w*(y, z), for all distinct vertices u, y and v, z;
  (iii) w*(u, z)·w*(y, v) ≥ λ(u, v)·λ(y, z), for all distinct vertices u, y and v, z.
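These inequalities hold for arbitrary positive activities, so they can be spot-checked by exhaustive enumeration on a small complete bipartite graph with random activities (the graph is K_{3,3}; all names are ours):

```python
import random
from itertools import permutations

random.seed(1)
n = 3
lam = {(u, v): random.uniform(0.1, 2.0) for u in range(n) for v in range(n)}

def lam_of(pairs):
    p = 1.0
    for e in pairs:
        p *= lam[e]
    return p

def lam_M():
    """lambda(M): sum of activity-products over perfect matchings of K_{3,3}."""
    return sum(lam_of(list(zip(range(n), s))) for s in permutations(range(n)))

def lam_hole(u, v):
    """lambda(M(u, v)): sum over matchings of left minus {u} onto right minus {v}."""
    left = [i for i in range(n) if i != u]
    right = [j for j in range(n) if j != v]
    return sum(lam_of(list(zip(left, s))) for s in permutations(right))

w_star = {(u, v): lam_M() / lam_hole(u, v) for u in range(n) for v in range(n)}

eps = 1e-6  # slack for floating-point rounding
for u in range(n):
    for v in range(n):
        assert w_star[(u, v)] >= lam[(u, v)] - eps                    # (i)
        for y in range(n):
            for z in range(n):
                if y != u and z != v:
                    lhs = w_star[(u, z)] * w_star[(y, v)]
                    assert lhs >= lam[(u, v)] * w_star[(y, z)] - eps  # (ii)
                    assert lhs >= lam[(u, v)] * lam[(y, z)] - eps     # (iii)
print("Corollary III holds on a random K_{3,3}")
```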
Slide 34: Proof of Lemma I
- For any transition t = (M, M′) and any pair of states (I, F) ∈ cp(t), we will define an encoding η_t(I, F) ∈ Ω such that η_t : cp(t) → Ω is an injection, and
    π(I)·π(F) ≤ 8·min{π(M), π(M′)}·π(η_t(I, F)) = 16m·Q(t)·π(η_t(I, F))
  (recall Q(t) = π(M)P(M, M′) = min{π(M), π(M′)}/2m).
- Summing over (I, F) ∈ cp(t), and using the injectivity of η_t, we get
    Σ_{(I,F) ∈ cp(t)} π(I)·π(F) ≤ 16m·Q(t),
  i.e. ϱ(Γ) ≤ 16m.
Slide 35: The Injection η_t
- For a transition t = (M, M′) which is involved in stage (ii) of unwinding a cycle, the encoding is
    η_t(I, F) = I ⊕ F ⊕ (M ∪ M′) \ {(v_0, v_1)}.
- Otherwise, the encoding is
    η_t(I, F) = I ⊕ F ⊕ (M ∪ M′).
Slide 36: From Congestion to Conductance
- Corollary IV: Assuming the weight function w approximates w* within ratio 2 for all (y, z) ∈ V1 × V2, then
    Φ ≥ 1/(100·ϱ³·n⁴) ≥ 1/(10⁶·m³·n⁴).
- Proof:
  - Set α = 1/(10·ϱ·n²).
  - Let (S, S̄) be a partition of the state space.
Slide 37: Case I
- Suppose π(S ∩ M)/π(S) ≥ α and π(S̄ ∩ M)/π(S̄) ≥ α.
- Just looking at canonical paths of type A, we have a total flow of
    π(S ∩ M)·π(S̄ ∩ M) ≥ α²·π(S)·π(S̄)
  across the cut.
- Thus ϱ·Q(S, S̄) ≥ α²·π(S)·π(S̄), and
- Φ(S) = Q(S, S̄)/(π(S)·π(S̄)) ≥ α²/ϱ = 1/(100·ϱ³·n⁴)   (recall α = 1/(10·ϱ·n²)).
Slide 38: Case II
- Otherwise, w.l.o.g. π(S ∩ M)/π(S) < α.
- Note the following estimates:
  - π(M) ≥ 1/(4(n² + 1)) ≥ 1/(5n²),
  - π(S ∩ M) < α·π(S) ≤ α,
  - π(S \ M) = π(S) − π(S ∩ M) > (1 − α)·π(S),
  - Q(S \ M, S ∩ M) ≤ π(S ∩ M) < α·π(S).
Slide 39: Case II (cont.)
- Consider the cut (S \ M, Ω \ (S \ M)).
- The weight of canonical paths crossing this cut (all of them chargeable where they cross it) is at least
    π(S \ M)·π(M) ≥ (1 − α)·π(S)/(5n²) ≥ π(S)/(6n²)   (since α ≤ 1/10).
- Hence ϱ·Q(S \ M, Ω \ (S \ M)) ≥ π(S)/(6n²), and subtracting the flow into S ∩ M,
    Q(S, S̄) ≥ π(S)/(6·ϱ·n²) − α·π(S) ≥ π(S)/(15·ϱ·n²).
- Φ(S) = Q(S, S̄)/(π(S)·π(S̄)) ≥ 1/(15·ϱ·n²) ≥ 1/(100·ϱ³·n⁴).
Slide 40: Summing It Up
- Starting from an initial state X_0 of maximum activity guarantees π(X_0) ≥ 1/n!, and hence log(π(X_0)⁻¹) = O(n log n).
- We showed Φ ≥ 1/(100·ϱ³·n⁴), and hence Φ⁻¹ = O(ϱ³·n⁴) = O(m³·n⁴).
- Thus, according to the conductance theorem,
    τ_{X_0}(δ) = O(m⁶n⁸(n log n + log δ⁻¹)).