Title: Why almost all satisfiable k-CNF formulas are easy
1. Why almost all satisfiable k-CNF formulas are easy
Joint work with A. Coja-Oghlan and M. Krivelevich
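2. SAT: Basic Notions
- 3CNF form
- $F = (x_1 \vee x_2 \vee x_5) \wedge (x_3 \vee x_4 \vee x_1) \wedge (x_1 \vee x_2 \vee x_6) \wedge \dots$
- Truth values of the literals under an assignment $\psi$: $(F \vee F \vee T) \wedge (T \vee T \vee T) \wedge (T \vee F \vee T) \wedge \dots$
- $x_5$ supports the first clause w.r.t. $\psi$: it is the only true literal in that clause
Goal: an algorithm that produces an optimal result, is efficient, and works for all inputs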
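
The notion of support on this slide can be made concrete with a short Python sketch. The signed-integer encoding, the function names, and the particular signs and assignment below are assumptions chosen only to reproduce the T/F pattern shown on the slide; they are not taken from the talk.

```python
from typing import Dict, List, Optional

Clause = List[int]  # signed literals: +i means x_i, -i means NOT x_i


def literal_value(lit: int, assignment: Dict[int, bool]) -> bool:
    """Truth value of a signed literal under the assignment."""
    value = assignment[abs(lit)]
    return value if lit > 0 else not value


def supporter(clause: Clause, assignment: Dict[int, bool]) -> Optional[int]:
    """Return the variable that supports the clause, or None.

    A variable supports a clause when its literal is the only true one.
    """
    true_lits = [lit for lit in clause if literal_value(lit, assignment)]
    return abs(true_lits[0]) if len(true_lits) == 1 else None


# Hypothetical signs and assignment (assumptions, not from the slide).
formula = [[1, 2, 5], [3, 4, -1], [-1, 2, 6]]
psi = {1: False, 2: False, 3: True, 4: True, 5: True, 6: True}
supporters = [supporter(clause, psi) for clause in formula]
```

Only the first clause has a supporter here, because the other two contain more than one true literal.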
3. SAT: Some Background
- Finding a satisfying assignment is NP-hard [Cook71]
- No approximation for MAX-SAT with factor better than 7/8, unless P = NP [Hastad01]
- How to proceed?
- Hardness results only show that there exist hard instances
- The heuristic approach relaxes the universality requirement
- Typical instance?
- One possibility: random models
Heuristic: a polynomial-time algorithm that produces optimal results on typical instances
4. Random 3SAT
- Random 3SAT
- Fix $m, n$
- Pick $m$ clauses uniformly at random (over the $n$ variables)
- Threshold: there exists a constant $d$ such that [Fri99]
- $m/n > d$: most 3CNFs are not satisfiable (proved for $m/n \ge 4.506$)
- $m/n < d$: most 3CNFs are satisfiable (proved for $m/n \le 3.52$)
- Near-threshold 3CNFs are apparently hard for many SAT heuristics
- Possible reason: complicated structure of the solution space (clustering)
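
A minimal Python sketch of the random model above, for experimenting with the satisfiability threshold on toy sizes. It assumes each clause is drawn independently with three distinct variables and uniformly random signs; the function names are hypothetical, and the brute-force check is only viable for very small n.

```python
import itertools
import random


def random_3cnf(n, m, rng):
    """Sample m clauses u.a.r.: 3 distinct variables, each with a random sign."""
    formula = []
    for _ in range(m):
        variables = rng.sample(range(1, n + 1), 3)
        formula.append([v if rng.random() < 0.5 else -v for v in variables])
    return formula


def is_satisfiable(formula, n):
    """Brute-force check over all 2^n assignments (tiny n only)."""
    for bits in itertools.product([False, True], repeat=n):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in formula):
            return True
    return False


def sat_fraction(n, ratio, trials, seed=0):
    """Fraction of sampled formulas with m = ratio * n clauses that are satisfiable."""
    rng = random.Random(seed)
    m = int(ratio * n)
    hits = sum(is_satisfiable(random_3cnf(n, m, rng), n) for _ in range(trials))
    return hits / trials
```

Calling `sat_fraction` with a small n over a range of ratios gives a rough picture of the transition, although at such sizes it is much less sharp than the asymptotic statement.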
5. Near-Threshold Clustering Phenomenon
- Conjectured solution space of random k-SAT just below the threshold
- (part of this picture was rigorously proved for $k \ge 8$ [AR06, MMZ05])
- All assignments within a cluster are close
- A linear number of variables are frozen
- Every two clusters are far from each other
- Exponentially many clusters
6. Our Result
- Rigorously characterize the structure of the solution space of random 3SAT, with $m/n$ some constant above the threshold
- Single cluster of satisfying assignments
- Size of the cluster is exponential in $n$
- $(1 - e^{-\Theta(m/n)})n$ variables are frozen
7. Our Results
Theorem: There exists a deterministic polynomial-time algorithm that finds a satisfying assignment for almost all satisfiable 3CNF formulas with $m/n > C$, $C$ a sufficiently large constant
- Rigorously complements results for the very sparse case
- When clustering is simple, the problem is easy
- When clustering is complicated, the problem is harder (?)
- Improves on the exponential-time algorithm for uniform satisfiable 3CNFs in this regime (the only one known so far, [Chen03])
Almost all k-CNF formulas are easy!
8. The Planted Distribution
- Planted 3SAT distribution with parameters $m, n$
- Fix an assignment $\varphi$
- Pick u.a.r. $m$ clauses out of all clauses that are satisfied by $\varphi$
- Planted 3SAT was analyzed in several papers
- [Fla03] shows a spectral algorithm for solving sparse instances
- Ben-Sasson et al. for $m/n = \Omega(\log n)$ (planted and uniform coincide)
- Planted models are also fashionable for graph coloring, max clique, max independent set, min bisection
- Planted models are more approachable: clauses are practically independent
- Open question: how does the planted model compare with the uniform?
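
The planted sampling procedure described above can be sketched in a few lines of Python. This is an illustration under stated assumptions: clauses are drawn independently (with replacement) by rejection sampling, each clause uses three distinct variables, and `planted_3cnf` is a hypothetical name.

```python
import random


def planted_3cnf(n, m, phi, rng):
    """Sample m clauses u.a.r. among the 3-clauses satisfied by phi.

    Rejection sampling: draw a uniform clause and keep it only if phi
    satisfies it, which happens with probability 7/8.
    """
    formula = []
    while len(formula) < m:
        variables = rng.sample(range(1, n + 1), 3)
        clause = [v if rng.random() < 0.5 else -v for v in variables]
        if any(phi[abs(l)] == (l > 0) for l in clause):
            formula.append(clause)
    return formula


# Usage: plant a random assignment, then sample a formula around it.
rng = random.Random(0)
n, m = 50, 500
phi = {v: rng.random() < 0.5 for v in range(1, n + 1)}
formula = planted_3cnf(n, m, phi, rng)
```

Because each accepted clause is drawn independently of the others, the clauses of a planted instance are independent given the planted assignment, which is what makes this model easier to analyze than the uniform one.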
9. Our Result
- We show that the planted and uniform distributions share many structural properties (they are "close")
- In particular, they have the same solution-space structure
- This justifies the somewhat unnatural use of planted-solution models
- Flaxman's algorithm [Fla03] works for the uniform distribution as well
10. SAT and Message Passing
- [FMV06]: Warning Propagation was shown to solve planted 3SAT instances with $m/n > C$, $C$ some sufficiently large constant
- Our work implies WP works in the uniform setting as well
- Reinforces the following thesis:
- When clustering is complicated $\Rightarrow$ formulas are hard $\Rightarrow$ sophisticated algorithms needed: Survey Propagation
- When clustering is simple $\Rightarrow$ formulas are easy $\Rightarrow$ naïve algorithms work: Warning Propagation
11. Clustering: Proof Technique
- Recall: uniform distribution over satisfiable 3CNFs with $m$ clauses
- Why is it more difficult than the planted distribution?
- Clauses are not independent
- For starters, consider the planted 3SAT distribution
- $m/n$ a sufficiently large constant
- Every variable is expected to support $3m/(7n)$ clauses w.r.t. the planted assignment
- $\Pr[x \text{ supports } C] = \Pr[x \text{ supports } C \mid x \text{ appears in } C] \cdot \Pr[x \text{ appears in } C]$
Fact 1: whp there is no subformula $H$ on $h$ variables s.t. $h < n/100$ and there are at least $hm/(10n)$ clauses containing two variables from $H$
Fact 2: whp there are no two satisfying assignments at distance greater than $n/100$
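
The expected support of 3m/(7n) on this slide follows from a short calculation. The sketch below assumes each planted clause is uniform over the clauses on three distinct variables that the planted assignment satisfies: x is one of the three variables with probability 3/n, and of the 7 satisfied sign patterns exactly one makes x's literal the only true one.

```latex
\begin{align*}
\Pr[x \text{ appears in } C] &= \frac{3}{n}, \\
\Pr[x \text{ supports } C \mid x \text{ appears in } C] &= \frac{1}{7}, \\
\mathbb{E}\bigl[\#\{C : x \text{ supports } C\}\bigr]
  &= m \cdot \frac{1}{7} \cdot \frac{3}{n} = \frac{3m}{7n}.
\end{align*}
```

The last line uses linearity of expectation over the m clauses, so it does not require the clauses to be independent.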
12. Clustering: Proof Technique
Claim: suppose that every variable has the expected support and Facts 1 and 2 hold; then $F$ is uniquely satisfiable
- Proof: suppose not.
- Let $\varphi$ be the planted assignment and $\psi$ some other satisfying assignment
- Take $x$ s.t. $\psi(x) \ne \varphi(x)$; $x$ supports $3m/(7n)$ clauses w.r.t. $\varphi$
- Consider such a clause $(T \vee F \vee F)$: flipping $x$ falsifies its only true literal, so $\psi$ must also flip another variable of the clause
- Define $H = \{x : \psi(x) \ne \varphi(x)\}$, $h = |H| < n/100$ (Fact 2)
- There exist $3hm/(7n)$ clauses containing two variables from $H$
- This contradicts Fact 1.
13. Clustering: Proof Technique
- This picture is whp the case when $m/n > C \log n$
- When $m/n = O(1)$: whp not the case (some variables have 0 support)
Definition: Given a 3CNF $F$ and a satisfying assignment $\psi$, a set $C$ is called a core of $F$ if $\forall x \in C$, $x$ supports at least $m/(4n)$ clauses in $F[C]$
- Claim: For $F$ in the planted distribution, $m/n$ a sufficiently large constant, there exists a core $C$ s.t.
- $|V(C)| > (1 - e^{-\Theta(m/n)})n$
- $C$ is frozen in $F$
Corollary: one-cluster structure
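
The core definition on this slide suggests a simple peeling procedure: repeatedly discard variables whose support inside the current induced subformula falls below the threshold. The Python sketch below illustrates that idea and is not the construction used in the proof. It assumes that the induced subformula on a variable set consists of the clauses whose three variables all lie in that set, and `find_core` is a hypothetical name.

```python
def find_core(formula, assignment, n, threshold):
    """Largest variable set in which every variable supports at least
    `threshold` clauses of the subformula induced by the set."""
    core = set(range(1, n + 1))
    while True:
        support = {x: 0 for x in core}
        for clause in formula:
            if not all(abs(l) in core for l in clause):
                continue  # clause is not part of the induced subformula
            true_vars = [abs(l) for l in clause if assignment[abs(l)] == (l > 0)]
            if len(true_vars) == 1:
                support[true_vars[0]] += 1
        weak = {x for x in core if support[x] < threshold}
        if not weak:
            return core
        core -= weak


# Toy example: x4 supports nothing, so it is peeled away.
toy = [[1, -2, -3], [2, -1, -3], [3, -1, -2], [4, 1, 2]]
all_true = {1: True, 2: True, 3: True, 4: True}
core = find_core(toy, all_true, 4, threshold=1)
```

Peeling returns the largest set meeting the condition, because enlarging the set can only add clauses and therefore never lowers a variable's support. With the slide's parameters, the threshold would be m/(4n).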
14. Moving to the Uniform Case
- $A$: a bad structural property (in our case, no big core)
- $\mu$: expected number of satisfying assignments of a planted 3CNF
Claim: $\Pr_{\text{uniform}}[A] < \mu \cdot \Pr_{\text{planted}}[A]$
Claim: $\Pr_{\text{uniform}}[\text{no big core}] < \mu \cdot \Pr_{\text{planted}}[\text{no big core}] < \mu e^{-cn}$
Claim: $\mu < e^{c'n}$, $c' < c$
Corollary: $\Pr_{\text{uniform}}[\text{no big core}] = o(1)$
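
The corollary follows by chaining the three claims. In the sketch below, μ denotes the expected number of satisfying assignments of a planted 3CNF. Reading the exponents as $e^{-cn}$ and $e^{c'n}$ with constants $c' < c$ is an assumption, since the exponents and primes are not legible in the slide text.

```latex
\begin{align*}
\Pr_{\text{uniform}}[\text{no big core}]
  &< \mu \cdot \Pr_{\text{planted}}[\text{no big core}] \\
  &< \mu \, e^{-cn} \\
  &< e^{c'n} \, e^{-cn} = e^{-(c - c')n} \xrightarrow[n \to \infty]{} 0 .
\end{align*}
```

The argument works only because the bad event is exponentially unlikely in the planted model, with a rate c that beats the exponential growth rate c' of μ.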
15. Further Research
[Figure: structure of the solution space as a function of $m/n$]