Title: Combining Component Caching and Clause Learning for Effective Model Counting
1Combining Component Caching and Clause Learning
for Effective Model Counting
- Tian Sang
- University of Washington
- Fahiem Bacchus (U Toronto), Paul Beame (UW),
- Henry Kautz (UW), Toniann Pitassi (U Toronto)
2Why #SAT?
- Prototypical #P-complete problem
- Natural encoding for counting problems
- Test-set size
- CMOS power consumption
- Can encode probabilistic inference
3Generality
- SAT (NP-complete)
- #SAT (#P-complete)
- Bayesian networks (#P-complete)
- Bounded-alternation quantified Boolean formulas
- Quantified Boolean formulas (PSPACE-complete)
- Stochastic SAT (PSPACE-complete)
4Our Approach
- Good old Davis-Putnam-Logemann-Loveland (DPLL)
- Clause learning (nogood caching)
- Bounded component analysis
- Formula caching
5DPLL with Clause Learning
DPLL(F)
    while there exists a unit clause (y) ∈ F
        F ← F|y
    if F is empty, report satisfiable and halt
    if F contains the empty clause
        add a conflict clause C to F
        return false
    choose a literal x
    return DPLL(F|x) ∨ DPLL(F|¬x)
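The recursion on this slide can be sketched in Python. This is a minimal sketch, assuming clauses are frozensets of nonzero integers (negative means negated); the names `condition` and `dpll` are illustrative, not taken from the talk or from zChaff. The conflict-clause step is left out, so this is plain DPLL with unit propagation rather than a clause-learning solver.

```python
def condition(clauses, lit):
    """Return F|lit: drop satisfied clauses, remove the falsified literal."""
    out = []
    for c in clauses:
        if lit in c:
            continue              # clause is satisfied by lit
        out.append(c - {-lit})    # -lit is now false, so delete it
    return out


def dpll(clauses):
    """Return True iff the CNF (iterable of frozensets of ints) is satisfiable."""
    clauses = list(clauses)
    # Unit propagation: while a unit clause (y) exists, set F <- F|y.
    while True:
        unit = next((c for c in clauses if len(c) == 1), None)
        if unit is None:
            break
        (y,) = unit
        clauses = condition(clauses, y)
    if not clauses:
        return True               # F is empty: satisfiable
    if any(len(c) == 0 for c in clauses):
        return False              # empty clause: a learner would add C here
    x = next(iter(clauses[0]))    # choose a literal (no heuristic)
    return dpll(condition(clauses, x)) or dpll(condition(clauses, -x))
```

A real clause-learning solver would analyze the conflict at the `return False` point and add a new clause to the formula before backtracking.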
6Conflict Graph
Known clauses: (p ∨ q ∨ a), (¬a ∨ ¬b ∨ ¬t), (t ∨ ¬x1), (t ∨ ¬x2), (t ∨ ¬x3), (x1 ∨ x2 ∨ x3 ∨ y), (x2 ∨ ¬y)
Current decisions: p ← false, q ← false, b ← true
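To see why these decisions lead to a conflict, the propagation can be replayed in a few lines of Python. This is a sketch under stated assumptions: the integer numbering of the variables and the helper `propagate` are illustrative choices, and the code only detects the conflict; it does not build the conflict graph or derive a learned clause.

```python
# Variable numbering is an illustrative choice; a negative number means negated.
p, q, a, b, t, x1, x2, x3, y = range(1, 10)

KNOWN_CLAUSES = [
    (p, q, a),
    (-a, -b, -t),
    (t, -x1),
    (t, -x2),
    (t, -x3),
    (x1, x2, x3, y),
    (x2, -y),
]


def propagate(clauses, decisions):
    """Unit-propagate from the decisions.

    Returns (true_literals, conflict), where conflict is the first clause
    found with every literal false, or None if no such clause is found.
    """
    assigned = set(decisions)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(lit in assigned for lit in clause):
                continue                      # clause already satisfied
            free = [lit for lit in clause if -lit not in assigned]
            if not free:
                return assigned, clause       # all literals false: conflict
            if len(free) == 1:
                assigned.add(free[0])         # unit clause: literal is implied
                changed = True
    return assigned, None


decisions = [-p, -q, b]
assignment, conflict = propagate(KNOWN_CLAUSES, decisions)
implied = assignment - set(decisions)
```

With this scan order, propagation implies a, ¬t, ¬x1, ¬x2, ¬x3 and y, after which every literal of (x2 ∨ ¬y) is false. Because the three decisions alone force the conflict, the clause (p ∨ q ∨ ¬b) is one valid (if weak) conflict clause; solvers typically derive a shorter one from the conflict graph.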
7Component Analysis
- Can use DPLL to count models
- Just don't stop when the first satisfying assignment is found
- If the formula breaks into separate components (no shared variables), can count each separately and multiply the results
- #SAT(C1 ∧ C2) = #SAT(C1) × #SAT(C2)
- (Bayardo & Schrag 1996)
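The product rule can be checked on a small formula. The sketch below assumes clauses are tuples of nonzero integers (negative means negated); `components` and `count_models` are illustrative helpers, and the counting is brute force, used here only to confirm the identity rather than to suggest how a real counter works.

```python
from itertools import product


def components(clauses):
    """Group clauses into components that share no variables."""
    groups = []                          # list of (variable set, clause list)
    for clause in clauses:
        vars_ = {abs(lit) for lit in clause}
        merged = [clause]
        rest = []
        for gvars, gclauses in groups:
            if gvars & vars_:            # shares a variable: merge the groups
                vars_ |= gvars
                merged += gclauses
            else:
                rest.append((gvars, gclauses))
        groups = rest + [(vars_, merged)]
    return [gclauses for _, gclauses in groups]


def count_models(clauses):
    """Brute-force model count over the variables occurring in the clauses."""
    variables = sorted({abs(lit) for c in clauses for lit in c})
    total = 0
    for bits in product([False, True], repeat=len(variables)):
        value = dict(zip(variables, bits))
        if all(any(value[abs(lit)] == (lit > 0) for lit in c) for c in clauses):
            total += 1
    return total


# (x1 v x2) ^ (~x2 v x3) shares no variable with (x4 v x5).
F = [(1, 2), (-2, 3), (4, 5)]
parts = components(F)
product_of_counts = 1
for part in parts:
    product_of_counts *= count_models(part)
```

Here the two components have 4 and 3 models, and the whole formula has 4 × 3 = 12.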
8Formula Caching
- New idea: cache the number of models of residual formulas at each node
- Bacchus, Dalmao & Pitassi 2003
- Beame, Impagliazzo, Pitassi & Segerlind 2003
- Matches time/space tradeoffs of best known exact
probabilistic inference algorithms
9#SAT with Component Caching
- #SAT(F)   // Returns fraction of all truth
-           // assignments that satisfy F
-   a = 1
-   for each G ∈ to_components(F)
-     if (G == ∅) m = 1
-     else if (∅ ∈ G) m = 0
-     else if (in_cache(G)) m = cache_value(G)
-     else select v ∈ G
-       m = ½ #SAT(G|v) + ½ #SAT(G|¬v)
-       insert_cache(G, m)
-     if (m == 0) return 0
-     a = a × m
-   return a
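The pseudocode above translates fairly directly into Python. The following is a minimal sketch, assuming a formula is a frozenset of clauses and each clause is a frozenset of nonzero integers; the cache is an ordinary dict with no size bound, variable selection is arbitrary, and there is no clause learning. The function names mirror the slide but the implementation details are assumptions.

```python
from fractions import Fraction


def to_components(clauses):
    """Split a set of clauses into variable-disjoint components."""
    groups = []                          # list of (variable set, clause set)
    for clause in clauses:
        vars_ = {abs(lit) for lit in clause}
        merged = {clause}
        rest = []
        for gvars, gclauses in groups:
            if gvars & vars_:
                vars_ |= gvars
                merged |= gclauses
            else:
                rest.append((gvars, gclauses))
        groups = rest + [(vars_, merged)]
    return [frozenset(gclauses) for _, gclauses in groups]


def condition(clauses, lit):
    """G|lit: drop satisfied clauses and delete the falsified literal."""
    return frozenset(c - {-lit} for c in clauses if lit not in c)


def sat_fraction(formula, cache):
    """Fraction of all truth assignments that satisfy the formula."""
    a = Fraction(1)
    for g in to_components(formula):
        if frozenset() in g:
            m = Fraction(0)                       # G contains the empty clause
        elif g in cache:
            m = cache[g]                          # cache hit
        else:
            v = abs(next(iter(next(iter(g)))))    # select some variable of G
            m = (sat_fraction(condition(g, v), cache)
                 + sat_fraction(condition(g, -v), cache)) / 2
            cache[g] = m
        if m == 0:
            return Fraction(0)
        a *= m
    return a


F = frozenset({frozenset({1, 2}), frozenset({-2, 3}), frozenset({4, 5})})
cache = {}
fraction = sat_fraction(F, cache)
model_count = fraction * 2 ** 5                   # 5 variables in F
```

Working with fractions means variables that drop out of a residual formula need no bookkeeping; multiplying the result by 2^n for n variables recovers the model count.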
10Putting it All Together
- Goal: combine
- Clause learning
- Component analysis
- Formula caching
- to create a practical #SAT algorithm
- Not quite as straightforward as it looks!
11Issue 1: How Much to Cache?
- Everything
- Infeasible: often > 10,000,000 nodes
- Only sub-formulas on current branch
- Linear space
- Similar to recursive conditioning
- Darwiche 2002
- Can we do better?
12Age versus Cumulative Hits
age = time elapsed since the entry was cached
13Efficient Cache Management
- Age-bounded caching
- Separate-chaining hash table
- Lazy deletion of entries older than K when searching chains
- Constant amortized time
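The three ingredients on this slide can be sketched as a small Python class. This is an illustration under stated assumptions, not the implementation from the talk: the class name `AgeBoundedCache`, the parameter `max_age` (playing the role of K), and the choice to measure age in number of inserts are all hypothetical.

```python
class AgeBoundedCache:
    """Separate-chaining hash table that lazily drops entries older than max_age."""

    def __init__(self, num_buckets=1024, max_age=1000):
        self.buckets = [[] for _ in range(num_buckets)]
        self.max_age = max_age
        self.clock = 0                    # logical time: one tick per insert

    def _index(self, key):
        return hash(key) % len(self.buckets)

    def _prune(self, chain):
        """Lazy deletion: drop over-age entries while walking one chain."""
        return [e for e in chain if self.clock - e[2] <= self.max_age]

    def insert(self, key, value):
        self.clock += 1
        i = self._index(key)
        chain = self._prune(self.buckets[i])
        chain = [e for e in chain if e[0] != key]   # replace any older entry
        chain.append((key, value, self.clock))
        self.buckets[i] = chain

    def lookup(self, key):
        i = self._index(key)
        chain = self._prune(self.buckets[i])
        self.buckets[i] = chain
        for k, value, _ in chain:
            if k == key:
                return value
        return None                       # miss (or entry aged out)
```

Old entries are never hunted down globally; they are discarded only when the chain they sit in is searched anyway, which is what keeps the extra cost small.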
14Issue 2: Interaction of Component Analysis & Clause Learning
- As clause learning progresses, formula becomes huge
- 1,000 clauses → 1,000,000 learned clauses
- Finding connected components becomes too costly
- Components using learned clauses unlikely to
reoccur!
15Bounded Component Analysis
- Use only clauses derived from the original formula for
- Component analysis
- Keys for cached entries
- Use all the learned clauses for unit propagation
- Can this possibly be sound?
Almost!
16Safety Theorem
- Given
- original formula F
- learned clauses G
- partial assignment π
- F|π is satisfiable
- Ai is a component of F|π
- σ satisfies Ai
- Then σ can be extended to satisfy G|π
- It is safe to use learned clauses for unit propagation for SAT sub-formulas
[Diagram: F|π split into components A1, A2, A3, shown alongside G|π]
17UNSAT Sub-formulas
- But if F|π is unsatisfiable, all bets are off...
- Without component caching, there is still no problem because the final value is 0 in any case
- With component caching, could cause incorrect values to be cached
- Solution
- Flush siblings (& their descendants) of UNSAT components from cache
18Safe Caching + Clause Learning Implementation
- ...
-     else if (∅ ∈ G) m = 0
-       add a conflict clause
- ...
-     if (m == 0)
-       flush_cache( siblings(G) )
-       if (G is not last child of F) flush_cache(G)
-       return 0
-     a = a × m
- ...
19Evaluation
- Implementation based on zChaff
- (Moskewicz, Madigan, Zhao, Zhang, Malik 2001)
- Benchmarks
- Random formulas
- Pebbling graph formulas
- Circuit synthesis
- Logistics planning
20Random 3-SAT, 75 Variables
sat/unsat threshold
21Random 3-SAT Results
75V, R=1.0
75V, R=1.4
75V, R=1.6
75V, R=2.0
22Results: Pebbling Formulas
X means time-out after 12 hours
23Summary
- A practical exact model-counting algorithm can be built by the careful combination of
- Bounded component analysis
- Component caching
- Clause learning
- Outperforms the best previous algorithm by orders
of magnitude
24What's Next?
- Better heuristics
- component ordering
- variable branching
- Incremental component analysis
- Currently consumes 10-50% of run time!
- Applications to Bayesian networks
- Compiler for discrete BN to weighted #SAT
- Direct BN implementation
- Applications to other #P problems
- Testing, model-based diagnosis,
25Questions?
26Results: Planning Formulas
X means time-out after 12 hours
27Results: Circuit Synthesis
X means time-out after 12 hours
28Bayesian Nets to Weighted Counting
- Introduce new vars so all internal vars are
deterministic
A
B
30Bayesian Nets to Weighted Counting
- Weight of a model is product of variable weights
- Weight of a formula is sum of weights of its
models
31Bayesian Nets to Weighted Counting
- Let F be the formula defining all internal variables
- Pr(query) = weight(F ∧ query)
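A tiny worked example may help make the weighted-counting definitions concrete. The sketch below assumes a two-node network A → B with made-up probabilities; the new variables P and Q (standing for "B holds given A" and "B holds given not A"), the encoding of B as a function of A, P and Q, and the convention that the defined variable B carries weight 1 are all assumptions made for illustration. Counting is brute force.

```python
from itertools import product

# Hypothetical numbers for a two-node network A -> B (not from the slides).
P_A = 0.3               # Pr(A)
P_B_GIVEN_A = 0.9       # Pr(B | A), carried by the new variable P
P_B_GIVEN_NOT_A = 0.2   # Pr(B | not A), carried by the new variable Q

chance_weight = {"A": P_A, "P": P_B_GIVEN_A, "Q": P_B_GIVEN_NOT_A}
NAMES = ["A", "P", "Q", "B"]


def defines_b(m):
    """F: B is a deterministic function of A and the new variables P, Q."""
    return m["B"] == ((m["A"] and m["P"]) or (not m["A"] and m["Q"]))


def model_weight(m):
    """Product of variable weights; the defined variable B contributes 1."""
    w = 1.0
    for name, prob in chance_weight.items():
        w *= prob if m[name] else 1.0 - prob
    return w


def weight(formula):
    """Sum of the weights of all models of the formula."""
    total = 0.0
    for bits in product([False, True], repeat=len(NAMES)):
        m = dict(zip(NAMES, bits))
        if formula(m):
            total += model_weight(m)
    return total


# Pr(B) = weight(F ^ B)
pr_b = weight(lambda m: defines_b(m) and m["B"])
```

With these numbers, `pr_b` matches the direct calculation 0.3 × 0.9 + 0.7 × 0.2 = 0.41, and `weight(defines_b)` sums to 1 because every setting of A, P, Q determines B.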
32Bayesian Nets to Counting
- Unweighted counting is the case where all non-defined variables have weight 0.5
- Introduce sets of variables to define other probabilities to desired accuracy
- In practice: just modify the #SAT algorithm to weighted #SAT