Title: Hierarchical Hardness Models for SAT
1 Hierarchical Hardness Models for SAT
Lin Xu, Holger H. Hoos, Kevin Leyton-Brown,
University of British Columbia
Classifying SAT Instances
Background and Motivation
Hierarchical Hardness Models
One goal of using empirical hardness models is to
predict the runtime of an algorithm based on some
polytime-computable features. We use ridge linear
regression in this research.
- Building the hierarchical hardness models
- The classifier is trained on a set of training
instances, then tested on validation and test
data sets. - Three empirical hardness models Msat, Munsat,
Muncond are trained using training data with
satisfiable, unsatisfiable and mixture instances.
Each model is used to predict the runtime of
every instance in the test set, and the responses
are rsat, runsat, runcond. - For each test instance, the predicted runtime is
selected from one of three response sets based on
its classification score and score threshold of
sat and unsat. Those thresholds are selected by
cross-validation.
SMLR Sparse Multinomial Logistic
Regression Input vector the same set of raw
features as that in regression Output the
probabilities of an instance belonging to each
class
The response is a linear combination of basis
functions
Encouraging Results
The free parameters w are computed by
1. High overall classification accuracy
Experimental Results
Experimental Setup
Greater accuracy and less bias runtime prediction
Four types of SAT-distributions
Random 3-SAT Structured SAT
rand3-var, rand3-fix QCP,
SW-GCP
84 features can be classified into nine
categories
2. Big fraction of instances with high
classification scores
problem size, variable-clause graph, variable
graph, clause graph, balance features,
proximity-to-horn formulae, LP-based, DPLL search
space and local search space
Solvers for different problem distributions
Random 3-SAT
Structured SAT kcnfs, oksolver
oksolver, zchaff, sato
march_dl, satz satelite, minisat,
satzoo
Distribution rand3-var Solver satz
Motivation
Left rand3-var Right QCP
Much simpler and more accurate empirical hardness
models can be learned when all instances are
either satisfiable or unsatisfiable.
Left rand3-fix Right SW-GCP
3. High classification accuracy for a small
number of features
Distribution QCP Solversatelite
References
Unconditional model the model trained on a
mixture of satisfiable and unsatisfiable
instances. Magic model if we had an oracle that
could determine the satisfiability of an unsolved
test instance, we could use models trained on
satisfiable instances (Msat) or unsatisfiable
instances (Munsat) to predict the runtime.
- Using only five features can achieve over 97 of
the accuracy obtained using all features. - Local search-based features turned out to be very
important for all four data sets. - The important features for regression and
classification are similar.
- L. Xu, H. H. Hoos, K. Leyton-Brown. Hierarchical
hardness model. Submitted April, 2006 - E. Nudelman, K. Leyton-Brown, H. H. Hoos, A.
Devkar, and Y. Shoham. Understanding random SAT
Beyond the clauses-to-variables ratio. In CP 04,
438-452, 2004.
Approximating the performance of the magic model
offers the possibility of performance gains for
empirical hardness models. Using the wrong model
could result in a significant reduction in
performance. Key point An accurate,
computationally efficient way for distinguishing
between satisfiable and unsatisfiable instances.
Using a classifier to distinguish satisfiable and
unsatisfiable instances is feasible and
reliable. Since all the features are computed for
ridge linear regression anyway, the
classification is free. NP-hardness of SAT
problems indicates that no existing classifier
can achieve 100 classification accuracy.
Acknowledgement This research was supported by a
Precarn scholarship
16th Annual Canadian Conference on Intelligent
Systems, Victoria, BC, May, 2006