Title: Non-clausal Reasoning
1Non-clausal Reasoning
- Fahiem Bacchus, Christian Thiffault, Toronto
- Toby Walsh, UCC Uppsala
- (soon UNSW, NICTA, Uppsala)
2Every morning
- I read the plaque on the wall of this house
- Dedicated to the memory of George Boole
- Professor of Mathematics at Queen's College (now University College Cork)
3George Boole (1815-1864)
- Boolean algebra
- The Mathematical Analysis of Logic, Cambridge, 1847
- The Calculus of Logic, Cambridge and Dublin Mathematical Journal, 1848
- Reduce propositional logic to algebraic manipulations
Crater on the moon named after him!
5How do we automate reasoning with propositional
formulae?
6Propositional SATisfiability
- Rapid progress being made
- 10 years ago, < 50 vars
- Today, > 1000 vars
- Algorithmic advances
- Learning
- Watched literals
- ..
- Heuristic advances
- VSIDS branching
7Propositional SATisfiability
- Efficient implementations
- Chaff, Berkmin, Forklift,
- SAT competition has new winner almost every year
- Practical applications
- Hardware verification
- Planning
8SAT folklore
- Need to solve in CNF
- Everything is a clause
- Efficient reasoning
- Optimize code with simple data structures
- Effective reasoning
- Conversion into CNF does not hinder unit
propagation
9Overturning SAT folklore
- Deciding arbitrary Boolean formulae
- Without converting into CNF
- Efficient reasoning
- Raw speed as good as optimized CNF solvers
- Effective reasoning
- More inference than unit propagation
- Exploit structure
- More exotic gates,
Similar ideas being explored in ATPG
10Davis Putnam procedure
- DPLL(S)
- if S empty then SAT
- if S contains the empty clause then UNSAT
- if S contains unit, l then DPLL(S ∪ {l})
- else choose literal, l
- if DPLL(S ∪ {l}) then SAT
- else DPLL(S ∪ {-l})
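The procedure above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of any particular solver: the integer encoding of literals (`-x` negates `x`), the helper `simplify`, and the naive branching choice are all assumptions made for the example.

```python
from typing import Dict, FrozenSet, List, Optional

Clause = FrozenSet[int]  # a literal is a non-zero int; -x is the negation of x


def simplify(clauses: List[Clause], lit: int) -> List[Clause]:
    """Assume `lit` is true: drop satisfied clauses, remove -lit from the rest."""
    return [c - {-lit} for c in clauses if lit not in c]


def dpll(clauses: List[Clause],
         model: Optional[Dict[int, bool]] = None) -> Optional[Dict[int, bool]]:
    """Return a (possibly partial) satisfying assignment, or None if UNSAT."""
    model = dict(model or {})
    if not clauses:                          # S empty: SAT
        return model
    if any(len(c) == 0 for c in clauses):    # S contains the empty clause: UNSAT
        return None
    unit = next((c for c in clauses if len(c) == 1), None)
    if unit is not None:                     # S contains a unit clause: forced literal
        (lit,) = unit
        model[abs(lit)] = lit > 0
        return dpll(simplify(clauses, lit), model)
    lit = next(iter(clauses[0]))             # naive choice of branching literal
    for choice in (lit, -lit):               # try l first, then -l
        result = dpll(simplify(clauses, choice),
                      {**model, abs(choice): choice > 0})
        if result is not None:
            return result
    return None
```

For instance, with a=1, b=2, c=3, d=4, e=5, g=6, calling `dpll` on the clauses (a)(-a, b, c)(-b)(a, d, e)(-c, d, g) forces a true, b false and c true by unit clauses before any branching happens.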
11Unit Propagation
- If the formula has a unit clause then the literal in that clause must be true
- Set the literal to true and reduce the formula.
- Unit propagation is the most commonly used type of constraint propagation
- One of the most important parts of current SAT solvers
12Unit Propagation
(a)(-a, b, c)(-b)(a, d, e)(-c, d, g)
- a = true, from the unit clause (a)
- b = false, from the unit clause (-b)
- c = true, because (-a, b, c) becomes unit once a is true and b is false
23Implementing Unit Propagation
- UP is the main (often only) inference rule applied at each search node.
- Performing UP occupies most of the time in these solvers.
- More efficient implementations of UP have been one of the recent advances.
24Implementing Unit Propagation
- Most DPLL solvers do not build an explicit representation of the reduced formula
- Too expensive in time and space to do this.
- Rather they keep the original formula and mark the changes made
- All changes generated by UP are undone when we backtrack.
25Tableau Crawford and Auton 95
- We number the variables and clauses.
- Each variable has
- a field to store its current value: true, false or unvalued
- the list of clauses it appears positively in
- the list of clauses it appears negatively in
- Each clause has
- a list of its literals
- a flag to indicate whether or not it is satisfied
- the number of unvalued literals it contains
26Tableau Crawford and Auton 95
- Unit propagated literal put on a stack
- pop the literal on top of the stack
- mark the variable with the appropriate value.
- mark each clause it appears positively in as satisfied.
- for each clause it appears negatively in
- if the clause is not already satisfied decrement the clause's counter
- if the counter is equal to 1, the clause is unit
- find the single unvalued literal in the clause and add that literal to the UP stack.
- remember all changes so that they can be undone on backtrack.
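A counter-based propagator in this style might look roughly as follows. It is a sketch under stated assumptions rather than the Tableau code itself: the class name `CounterPropagator`, its methods, and the integer literal encoding are invented for the example, clauses are assumed to contain no repeated or complementary literals, and the undo trail needed for backtracking is left out to keep it short.

```python
from typing import Dict, List, Optional


class CounterPropagator:
    """Unit propagation with per-clause counters (no backtracking support)."""

    def __init__(self, num_vars: int, clauses: List[List[int]]):
        self.clauses = clauses
        self.value: Dict[int, Optional[bool]] = {v: None for v in range(1, num_vars + 1)}
        self.satisfied = [False] * len(clauses)    # per-clause satisfied flag
        self.unvalued = [len(c) for c in clauses]  # per-clause unvalued-literal count
        self.occurs: Dict[int, List[int]] = {}     # literal -> clauses containing it
        for i, c in enumerate(clauses):
            for lit in c:
                self.occurs.setdefault(lit, []).append(i)

    def initial_units(self) -> List[int]:
        """Literals of the unit clauses in the input formula."""
        return [c[0] for c in self.clauses if len(c) == 1]

    def propagate(self, stack: List[int]) -> bool:
        """Process the UP stack; return False if a conflict is found."""
        while stack:
            lit = stack.pop()
            current = self.value[abs(lit)]
            if current is not None:
                if current != (lit > 0):
                    return False                   # literal already false: conflict
                continue                           # literal already true
            self.value[abs(lit)] = lit > 0
            for i in self.occurs.get(lit, []):     # clauses containing lit are satisfied
                self.satisfied[i] = True
            for i in self.occurs.get(-lit, []):    # clauses containing -lit lose a literal
                if self.satisfied[i]:
                    continue
                self.unvalued[i] -= 1
                if self.unvalued[i] == 0:
                    return False                   # every literal is false
                if self.unvalued[i] == 1:          # clause is unit: find its literal
                    stack.append(next(l for l in self.clauses[i]
                                      if self.value[abs(l)] is None))
        return True
```

Note that every clause containing the negation of the assigned literal is visited, which is the cost the watched-literal scheme avoids.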
27Watch literals SATO, Chaff
- Tableau's technique requires visiting each clause a variable appears in when we value a variable.
- When clause learning is employed, and 100,000s of long new clauses are added to the original formula, this becomes slow.
- The watch literal technique is more efficient.
28Watch literals SATO, Chaff
- For each clause, pick two literals to watch.
- At least one of these literals must be false for the clause to be unit.
- For each variable, instead of lists of all of the clauses it appears in positively and negatively, we only have lists of the clauses it is a watch for.
- reduces the total size of these lists from O(kn) to O(n)
29Watch literals SATO, Chaff
- When we assign a value to a variable we
- Ignore the clauses it watches positively
- For each clause it watches negatively, we search the clause
- if we find an unvalued literal or a true literal not equal to the other watch we make this literal the new watch
- otherwise the clause is unit and we UP the other watch literal if it is not already true.
- On backtrack we do nothing!
- The new watch literals retain the property that at least one of them must become false if the clause is to become unit.
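The watch scheme can be illustrated with a small propagator. This is a sketch, not SATO's or Chaff's code: the class `WatchedPropagator`, the convention of keeping the two watches at positions 0 and 1 of each clause, and the `pending` stack are assumptions of the example. It also assumes clauses are non-empty and contain no repeated or complementary literals, and it omits learning and backtracking.

```python
from typing import Dict, List, Optional


class WatchedPropagator:
    """Unit propagation with two watched literals per clause."""

    def __init__(self, clauses: List[List[int]]):
        # Assumed convention: the watches of a clause sit at positions 0 and 1.
        self.clauses = [list(c) for c in clauses]
        self.value: Dict[int, bool] = {}
        self.watchers: Dict[int, List[int]] = {}   # literal -> clauses watching it
        self.pending: List[int] = []               # literals waiting to be made true
        for i, c in enumerate(self.clauses):
            if len(c) == 1:
                self.pending.append(c[0])          # unit clauses need no watches
            else:
                for lit in c[:2]:
                    self.watchers.setdefault(lit, []).append(i)

    def lit_value(self, lit: int) -> Optional[bool]:
        v = self.value.get(abs(lit))
        return None if v is None else v == (lit > 0)

    def propagate(self) -> bool:
        """Return False on a conflict, True once nothing is left to propagate."""
        while self.pending:
            lit = self.pending.pop()
            current = self.lit_value(lit)
            if current is True:
                continue
            if current is False:
                return False
            self.value[abs(lit)] = lit > 0
            false_lit = -lit                       # only clauses watching -lit are visited
            keep: List[int] = []
            for i in self.watchers.get(false_lit, []):
                c = self.clauses[i]
                if c[0] == false_lit:              # keep the false watch at position 1
                    c[0], c[1] = c[1], c[0]
                for k in range(2, len(c)):
                    if self.lit_value(c[k]) is not False:
                        c[1], c[k] = c[k], c[1]    # unvalued or true literal: new watch
                        self.watchers.setdefault(c[1], []).append(i)
                        break
                else:
                    keep.append(i)                 # no replacement found
                    self.pending.append(c[0])      # UP the other watch
            self.watchers[false_lit] = keep
        return True
```

When no replacement watch exists the other watch is simply queued: if it is already true it is skipped when popped, and if it is already false the conflict is reported at that point.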
30Solving non-CNF formulae
- Convert into CNF
- Use efficient DPLL solver like Chaff
- Adapt DPLL solver to reason with non-CNF
- Exploit structure
- Permit complex gates (e.g. counting, XOR, ..)
31Encoding into CNF
- Most common (and relatively efficient?) is that of Tseitin 1970.
- Recursively converts a formula by adding a new variable for every subformula.
- Linear space
32Tseitin Encoding
- A ⇒ (C ∧ D)
- V1 ⇔ (C ∧ D)
- (-V1, C), (-V1, D), (-C, -D, V1)
- V2 ⇔ (A ⇒ V1)
- (-V2, -A, V1), (A, V2), (-V1, V2)
1. (-V1, C) 2. (-V1, D) 3. (-C, -D, V1) 4. (-V2, -A, V1) 5. (A, V2) 6. (-V1, V2) 7. (V2)
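The recursive conversion can be sketched as below, reading the slide's example as A ⇒ (C ∧ D). The nested-tuple formula representation, the operator names `"and"`, `"or"`, `"imp"`, `"not"`, and the string literals with a `-` prefix are choices made for this illustration; the sketch also assumes the input does not already use variable names of the form `V1`, `V2`, and so on.

```python
from typing import List, Tuple, Union

Formula = Union[str, Tuple]  # "A" is a variable; ("and", f, g), ("imp", f, g), ...


def neg(lit: str) -> str:
    """Negate a literal written as "X" or "-X"."""
    return lit[1:] if lit.startswith("-") else "-" + lit


def tseitin(formula: Formula) -> List[Tuple[str, ...]]:
    """Return CNF clauses, introducing one fresh variable per binary gate."""
    clauses: List[Tuple[str, ...]] = []
    counter = 0

    def encode(f: Formula) -> str:
        nonlocal counter
        if isinstance(f, str):
            return f
        op, *args = f
        lits = [encode(a) for a in args]       # encode subformulae first
        if op == "not":
            return neg(lits[0])                # negation needs no new variable
        counter += 1
        v = f"V{counter}"                      # fresh variable for this subformula
        x, y = lits
        if op == "and":                        # v <=> (x and y)
            clauses.extend([(neg(v), x), (neg(v), y), (neg(x), neg(y), v)])
        elif op == "or":                       # v <=> (x or y)
            clauses.extend([(neg(v), x, y), (neg(x), v), (neg(y), v)])
        elif op == "imp":                      # v <=> (x implies y)
            clauses.extend([(neg(v), neg(x), y), (x, v), (neg(y), v)])
        else:
            raise ValueError(f"unknown operator: {op}")
        return v

    clauses.append((encode(formula),))         # assert the top-level formula
    return clauses
```

Calling `tseitin(("imp", "A", ("and", "C", "D")))` yields three clauses per gate plus the final unit clause, seven in total.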
36Disadvantage of CNF
- Structural information is lost
- Flattens formulae into clauses.
- In a Boolean circuit
- Which variables are inputs?
- Which are internal wires?
- Additional variables are added.
- Potentially increases the size of the DPLL search.
37Structural Information
- Not all structural information can be recovered (Lang and Marquis, 1989).
- Recovering structural information can improve performance (EqSatZ, LSAT).
- Why lose this information in the first place?
- In addition, we can exploit more complex gates
38Extra Variables
- Potentially increase search space
- Do not branch on any of the newly introduced subformula variables.
- Theoretically this can increase exponentially the size of the smallest DPLL proof (Jarvisalo et al. 2004)
- Empirically, solvers restricted in this way can perform poorly
39Extra Variables
- The alternative is unrestricted branching.
- However, with unrestricted branching, a CNF
solver can waste a lot of time branching on
variables that have become irrelevant.
40Irrelevant Variables
- A ⇒ (C ∧ D), A = false
- formula satisfied
41Irrelevant Variables
Solver must still determine that the remaining
clauses are SAT
- A ⇒ (C ∧ D)
- V1 ⇔ (C ∧ D)
- V2 ⇔ (A ⇒ V1)
1. (-V1, C) 2. (-V1, D) 3. (-C, -D, V1) 4. (-V2, -A, V1) 5. (A, V2) 6. (-V1, V2) 7. (V2) 8. (-A)
42Converting to CNF is Unnecessary
- Search can be performed on the original formula.
- This has been noted in previous work on circuit based solvers, e.g. Ganai et al. 2002
- Reasoning with the original formula may permit other efficiencies
- E.g. exploiting structure, complex gates
43DPLL on formulae
- View formulae as DAGs
- Every node has a label (True/ False/ Unassigned)
- Branch on the truth value of any unassigned node
- Use Boolean logic to propagate truth values to neighbouring nodes
- Contradiction when node is labeled both True and False
- Find consistent labeling with truth values that assigns True to root (SAT)
- Or exhaust all possibilities (UNSAT)
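A much simplified version of this search can be written directly over the formula, with no CNF conversion. The sketch below is deliberately weaker than the procedure described above: it branches only on input variables (not on arbitrary nodes) and propagates labels only upward, using three-valued logic where `None` means unassigned. The tuple representation and the function names `label`, `variables` and `solve` are assumptions of the example.

```python
from typing import Dict, Optional, Set, Tuple, Union

Formula = Union[str, Tuple]  # "A" is an input; ("and", ...), ("or", ...), ("xor", ...), ("not", f)


def label(f: Formula, env: Dict[str, bool]) -> Optional[bool]:
    """Three-valued label of node f: True, False, or None (unassigned)."""
    if isinstance(f, str):
        return env.get(f)
    op, *args = f
    vals = [label(a, env) for a in args]
    if op == "not":
        return None if vals[0] is None else not vals[0]
    if op == "and":                      # one false child decides an AND
        if False in vals:
            return False
        return None if None in vals else True
    if op == "or":                       # one true child decides an OR
        if True in vals:
            return True
        return None if None in vals else False
    if op == "xor":                      # XOR needs every child labeled
        if None in vals:
            return None
        return sum(vals) % 2 == 1
    raise ValueError(f"unknown operator: {op}")


def variables(f: Formula) -> Set[str]:
    if isinstance(f, str):
        return {f}
    return set().union(*(variables(a) for a in f[1:]))


def solve(f: Formula, env: Optional[Dict[str, bool]] = None) -> Optional[Dict[str, bool]]:
    """Return a partial assignment labeling the root True, or None if UNSAT."""
    env = dict(env or {})
    root = label(f, env)
    if root is True:
        return env                       # consistent labeling found: SAT
    if root is False:
        return None                      # contradiction at the root
    var = next(v for v in sorted(variables(f)) if v not in env)
    for choice in (True, False):         # branch on an unassigned input
        result = solve(f, {**env, var: choice})
        if result is not None:
            return result
    return None                          # all possibilities exhausted
```

Because the root label is checked before branching, inputs that can no longer affect the root are never branched on: for `("or", ("not", "A"), ("and", "C", "D"))` with A already false, `solve` returns immediately without touching C or D.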
44[Figure: example formula DAG over the inputs A, B, C, D with OR and XOR gates; nodes carry True and False labels]
45Labeling corresponds to unit propagation
- Labeling a node corresponds to assigning a truth value to the corresponding variable in the CNF encoding
- Propagating labels in the DAG corresponds to unit propagation in the CNF encoding
46Learning
- Once a contradiction is detected a conflict clause can be learned
- set of impossible node assignments
- can use 1-UIP scheme (as in CNF solvers)
- Learned clauses stored and used to unit propagate node truth values
47Complex gates
- Gates can have arbitrary degree
- n-ary AND, n-ary OR,
- Gates can be complicated Boolean functions
- n-ary XOR (which requires an exponential number of CNF clauses)
- cardinality gates (at least one, k out of n, ..)
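To see where the exponential blow-up for XOR comes from, the sketch below enumerates the direct CNF encoding of an n-ary XOR when no auxiliary variables are introduced: one clause is needed to rule out each even-parity assignment, giving 2^(n-1) clauses. The function name `xor_to_cnf` and the integer literal encoding are assumptions of the example.

```python
from itertools import product
from typing import List, Tuple


def xor_to_cnf(variables: List[int]) -> List[Tuple[int, ...]]:
    """Clauses, without auxiliary variables, for x1 xor ... xor xn = true."""
    clauses = []
    for bits in product([False, True], repeat=len(variables)):
        if sum(bits) % 2 == 0:           # an even number of true inputs falsifies the XOR
            # This clause is false under exactly this one assignment.
            clauses.append(tuple(-v if b else v for v, b in zip(variables, bits)))
    return clauses
```

For 3 inputs this gives 4 clauses, for 10 inputs 512, while a native XOR gate in the DAG stays a single node.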
48Label propagation
- Use lazy data structures as in CNF solvers
- For example, assign one child as a true watch for an AND gate
- Don't check if the AND gate can be labeled true until its true watch becomes true
- Some benchmarks have AND gates with thousands of children
- No intrinsic loss of efficiency in using the DAG over CNF.
49Structure based optimizations
- We can also exploit the extra structural information the DAG provides
- Two such optimizations
- Don't care propagation to deal with irrelevant subformulae
- Conflict clause reduction
50Don't Care labeling
- Add a third truth value to the DAG: don't care
- A node C is don't care wrt a particular parent P
- If its truth value can no longer affect the truth value of P nor any of its P siblings.
- Or P is don't care.
- A node C is don't care if it is don't care wrt all of its parents
- No need to branch on don't cares!
51Don't Care labeling
- Assign a don't care watch parent for each node.
- When P is labeled, C can become don't care wrt its watch parent P
- If C becomes don't care wrt its don't care watch we look for another watch.
- If we can't find one, we know C has become don't care
52[Figure: the example formula DAG over the inputs A, B, C, D with OR and XOR gates; nodes carry True, False and Don't care labels]
53Conflict Clause Reductions
- If one learns (L1, L2, ...) and one has (-L1, L2) then we can reduce the conflict clause
- (-L1, L2) resolves with (L1, L2, ...) to give (L2, ...)
- Result subsumes the original conflict clause
- In CNF, we would have to search the clause database to detect this situation
- Probably not going to be effective
54Conflict Clause Reductions
- Suppose P is an AND node, and C is a child
- Then not C implies not P (equivalently, P implies C)
- If we have the conflict clause
- (-P, -C, X, ...)
- This reduces to
- (-P, X, ...)
- Equivalent to a resolution step against (C, -P)
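The AND-node case can be sketched as a small function. It relies on reading the slide as follows: for an AND node P with child C, P implies C, so the clause (C, -P) holds, and a conflict clause containing both -P and -C can drop -C by resolving against it. The function name `reduce_conflict`, the mapping `and_children` from AND nodes to their children, and the use of positive integers as node identifiers are assumptions of the example.

```python
from typing import Dict, List, Set


def reduce_conflict(clause: Set[int], and_children: Dict[int, List[int]]) -> Set[int]:
    """Drop -C from a clause that also contains -P, for each AND node P with child C."""
    reduced = set(clause)
    for p, children in and_children.items():
        if -p in reduced:                # clause contains -P
            for c in children:
                # Resolving with (C, -P) on C removes -C and leaves -P in place.
                reduced.discard(-c)
    return reduced
```

For example, with node 1 an AND gate over children 2 and 3, the clause (-1, -2, 5) reduces to (-1, 5). Only the DAG neighbours of the nodes in the clause are inspected, so no search of the clause database is involved.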
55Conflict Clause Reductions
- When conflict clause generated
- Search neighbours in DAG for such reductions
- More useful on shorter clauses
- Experimentally found it only worth looking for
such reductions on clauses of length 100 or less
56Empirical Results.
- We compared with Zchaff
- Tried to isolate impact of CNF v non-CNF
- Made the two solvers as close as possible
- Same magic numbers (e.g., clause database cleanup criteria, restart intervals etc.)
- Same branching heuristics
- Expect similar improvements could be obtained with other CNF solvers
57Empirical Results caveats
- Lack of non-clausal benchmarks
- Hope SAT-05 competition will include non-CNF
- Benchmarks we did obtain had already been transformed into simpler formulas
- No complex XOR or IFF gates
58FVP-UNSAT-2.0 (Velev) Time
59FVP-UNSAT-2.0 Decisions
60FVP-UNSAT-2.0 Don't Cares
61FVP-UNSAT-2.0 Clause Reduction
62Other Series
63Conclusions
- No intrinsic reason to convert to CNF
- Many other structure based optimizations remain to be investigated
- Branching heuristics
- Non-clausal conflicts
- More complex gates