Title: Understanding Problem Hardness: Recent Developments and Directions Bart Selman Cornell University
1 Understanding Problem Hardness: Recent Developments and Directions
Bart Selman, Cornell University
2 Introduction: Motivation
- Computational challenges in planning, reasoning, learning, and adaptation.
- What are the characteristics of challenging computational problems?
3 A Few Examples
- Reasoning
- many forms of deduction
- abduction / diagnosis (e.g. de Kleer 1989)
- default reasoning (e.g. Kautz and Selman 1989)
- Bayesian inference (e.g. Dagum and Luby 1993)
- Planning
- domain-dependent and independent (STRIPS) (e.g. Chapman 1987; Gupta and Nau 1991; Bylander 1994)
- Learning
- neural net loading problem (e.g. Blum and Rivest 1989)
- Bayesian net learning
- decision tree learning
4
- An abundance of negative complexity results for many interesting tasks.
- Results often apply to very restricted formalisms, and also to finding approximate solutions.
- But these results are worst-case; what about the average case?
- Sometimes surprising results.
- A closer look leads to new insights, algorithms, and solution strategies.
5 Outline
- A --- Early results
- --- phase transitions and computational hardness
- B --- Current focus
- --- problem mixtures (tractable / intractable)
- --- adding global structure
- C --- Future directions and prospects
- --- modeling resource constraints
- --- adaptive computing
- --- deeper theoretical understanding
7 Example Domain: Satisfiability
- SAT: Given a formula in propositional calculus, is there an assignment to its variables making it true?
- We consider clausal form, e.g. (a ∨ b ∨ c) ∧ (b ∨ d) ∧ (b ∨ c ∨ e) ∧ ...
- The canonical NP-complete problem (exponential search space).
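The problem statement can be made concrete with a minimal Python sketch. The signed-integer clause encoding and the example formula are assumptions chosen for illustration, not taken from the slides. Enumerating all 2^n assignments makes the exponential search space explicit.

```python
from itertools import product

def satisfies(clauses, assignment):
    """True if every clause contains at least one true literal.

    A literal is a nonzero int: +v means variable v, -v means NOT v.
    assignment maps variable number -> bool.
    """
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )

def brute_force_sat(clauses, n_vars):
    """Try all 2**n_vars assignments; return a satisfying one or None."""
    for values in product([False, True], repeat=n_vars):
        assignment = {v + 1: values[v] for v in range(n_vars)}
        if satisfies(clauses, assignment):
            return assignment
    return None

# Hypothetical formula: (x1 or not x2) and (x2 or x3) and (not x1 or not x3)
example = [[1, -2], [2, 3], [-1, -3]]
model = brute_force_sat(example, 3)
```

This approach is only usable for a handful of variables, which is why the rest of the talk is concerned with where the hard instances lie and how search can be improved.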
9 Generating Hard Random Formulas
- Key: use the fixed-clause-length model (Mitchell, Selman, and Levesque 1992; Kirkpatrick and Selman 1994).
- Critical parameter: ratio of the number of clauses to the number of variables.
- Hardest 3-SAT problems at a ratio of about 4.25.
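A minimal sketch of a fixed-clause-length generator, assuming the usual reading of the model: each clause picks k distinct variables uniformly at random and negates each with probability 1/2. The function name, the signed-integer literal encoding, and the rounding of the clause count are illustrative choices.

```python
import random

def random_ksat(n_vars, ratio, k=3, rng=None):
    """Generate a random k-SAT formula with round(ratio * n_vars) clauses.

    Literals are nonzero ints: +v is variable v, -v is its negation.
    """
    rng = rng or random.Random()
    n_clauses = round(ratio * n_vars)
    clauses = []
    for _ in range(n_clauses):
        # k distinct variables, each negated independently with probability 1/2
        variables = rng.sample(range(1, n_vars + 1), k)
        clauses.append([v if rng.random() < 0.5 else -v for v in variables])
    return clauses

# 100 variables at the ratio the slides report as hardest for 3-SAT
formula = random_ksat(n_vars=100, ratio=4.25, rng=random.Random(0))
```

Sweeping `ratio` from low to high values while holding `n_vars` fixed is how one would reproduce the easy-hard-easy pattern experimentally.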
11 Intuition
- At low ratios
- few clauses (constraints)
- many satisfying assignments
- easily found
- At high ratios
- many clauses
- inconsistencies easily detected
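The intuition above can be made slightly more quantitative with a standard first-moment calculation (not on the slides). In the fixed-clause-length model, a fixed assignment satisfies one random k-clause with probability 1 - 2^-k, so a formula with n variables and m clauses has 2^n (1 - 2^-k)^m satisfying assignments in expectation. The sketch below, with illustrative function names, works in log2 to avoid overflow.

```python
import math

def log2_expected_solutions(n_vars, ratio, k=3):
    """log2 of the expected number of satisfying assignments
    for a random k-SAT formula with ratio * n_vars clauses."""
    n_clauses = ratio * n_vars
    return n_vars + n_clauses * math.log2(1 - 2 ** -k)

def first_moment_bound(k=3):
    """Ratio at which the expected number of solutions drops to 1."""
    return -1 / math.log2(1 - 2 ** -k)
```

For k = 3 this gives a ratio of roughly 5.19. Above it, the expected number of solutions shrinks exponentially in n, so almost all formulas are unsatisfiable. This is only a crude upper bound, not the location of the hardest instances, which the slides put near 4.25.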
13 Phase transition: 2-, 3-, 4-, 5-, and 6-SAT
14 Theoretical Status of Threshold
- Very challenging problem ...
- Current status: 3-SAT threshold lies between 3.003 and 4.6.
- (Motwani et al. 1994; Broder et al. 1992; Frieze and Suen 1996; Dubois 1990, 1997; Kirousis et al. 1995; Friedgut 1997; Achlioptas et al. 1999)
- Related work: Beame, Karp, Pitassi, and Saks 1998; Bollobas, Borgs, Chayes, Han Kim, and Wilson 1999.
15
- Phase transitions in combinatorial problems are an active research area with fruitful interactions between computer science, physics (approaches from statistical mechanics), and mathematics (combinatorics / random structures).
- Also, a close interaction between experimental and theoretical work. (Experimental findings are quite often confirmed by formal analysis within months to a few years.)
- Finally, relevance to applications via algorithmic advances and the notion of critically constrained problems.
16 Consequences for Algorithm Design
- Phase transition work and instances led to improvements in algorithms:
- --- local search methods (e.g., GSAT / Walksat) (Selman et al. 1992, 1996; Min Li 1996; Hoos 1998, etc.)
- --- backtrack-style methods (Davis-Putnam and variants / complete) (Crawford 1993; Dubois 1994; Bayardo 1997; Zane 1998, etc.)
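To illustrate the local search family, here is a simplified Walksat-style sketch. It is not the exact procedure from the cited papers: the `noise` parameter, the greedy criterion (fewest unsatisfied clauses after the flip), and the signed-integer literal encoding are all assumptions made for brevity.

```python
import random

def walksat(clauses, n_vars, max_flips=100_000, noise=0.5, rng=None):
    """Return a satisfying assignment (dict var -> bool) or None.

    Incomplete method: None means "not found", not "unsatisfiable".
    """
    rng = rng or random.Random()
    assign = {v: rng.random() < 0.5 for v in range(1, n_vars + 1)}

    def is_true(lit):
        return assign[abs(lit)] == (lit > 0)

    def num_unsat_after_flip(var):
        # Tentatively flip var, count unsatisfied clauses, then undo the flip.
        assign[var] = not assign[var]
        count = sum(1 for c in clauses if not any(is_true(l) for l in c))
        assign[var] = not assign[var]
        return count

    for _ in range(max_flips):
        unsat = [c for c in clauses if not any(is_true(l) for l in c)]
        if not unsat:
            return assign
        clause = rng.choice(unsat)  # focus on one violated clause
        if rng.random() < noise:
            var = abs(rng.choice(clause))  # random-walk move
        else:
            var = min((abs(l) for l in clause), key=num_unsat_after_flip)  # greedy move
        assign[var] = not assign[var]
    return None
```

The sketch recomputes clause status from scratch on every flip, which is fine for small formulas. Practical implementations maintain incremental counts instead.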
17 Progress
- Propositional reasoning and search (SAT)
- 1990: 100 variables / 200 clauses (constraints)
- 1998: 10,000 - 100,000 variables / 10^6 clauses
- Novel applications
- e.g. in planning (Kautz & Selman),
- program debugging (Jackson),
- protocol verification (Clarke), and
- machine learning (Resende).
18 B. Current Focus
- --- mixtures of problem classes, e.g., 2-SAT and 3-SAT (moving between P and NP): the 2+p-SAT model
- --- structured instances: perturbed quasigroup completion problems
19 Focus --- 1) Mixtures: the 2+p-SAT problem
- mixture of binary and ternary clauses
- p = fraction of ternary clauses
- p = 0.0 --- 2-SAT / p = 1.0 --- 3-SAT
- What happens in-between?
- (Monasson, Zecchina, Kirkpatrick, Selman, and Troyansky, Nature, to appear)
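A minimal sketch of a mixed-clause generator, assuming the model is read as: out of M clauses, a fraction p are random ternary clauses and the rest are random binary clauses. The function name, the rounding of the ternary count, and the signed-integer literal encoding are illustrative choices.

```python
import random

def random_2p_sat(n_vars, n_clauses, p, rng=None):
    """Random mixture: round(p * n_clauses) ternary clauses, the rest binary.

    p = 0.0 gives pure 2-SAT, p = 1.0 gives pure 3-SAT.
    """
    rng = rng or random.Random()
    n_ternary = round(p * n_clauses)
    lengths = [3] * n_ternary + [2] * (n_clauses - n_ternary)
    clauses = []
    for k in lengths:
        # k distinct variables, each negated with probability 1/2
        variables = rng.sample(range(1, n_vars + 1), k)
        clauses.append([v if rng.random() < 0.5 else -v for v in variables])
    rng.shuffle(clauses)  # interleave binary and ternary clauses
    return clauses

mixture = random_2p_sat(n_vars=50, n_clauses=100, p=0.4, rng=random.Random(0))
```

Varying `p` between 0 and 1 at a fixed clause-to-variable ratio is one way to probe the region between the polynomial and NP-complete endpoints.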
20 Phase Transition for 2+p-SAT
21 Location of Threshold
22 Computational Cost
23 Results for 2+p-SAT
- p < 0.41 --- model essentially behaves as 2-SAT
- search procedure sees only binary constraints
- smooth, continuous phase transition
- p > 0.41 --- behaves as 3-SAT (exponential scaling)
- abrupt, discontinuous scaling
- Many new, rigorous results (including scaling) by Achlioptas, Bollobas, Borgs, Chayes, Han Kim, and Wilson. (Next talk.)
24 Consequences for Algorithm Design
- 1) Strategies that exploit tractable substructure with propagation are most effective. (Consistent with the best empirically discovered methods.)
- 2) In addition, use early branching on critically constrained variables. (The backbone variables; suggests use of clustering and statistical learning methods.) (Boyan and Moore 1998)
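One common form of propagation is unit propagation, the simplification step used in Davis-Putnam-style solvers. The sketch below is an illustrative version, not the specific method evaluated in the talk. Clauses are lists of signed integers (+v for variable v, -v for its negation), an encoding assumed here.

```python
def unit_propagate(clauses):
    """Repeatedly assign literals forced by unit (single-literal) clauses.

    Returns (simplified_clauses, forced_literals), or (None, forced_literals)
    if an empty clause (a contradiction) is derived.
    """
    clauses = [list(c) for c in clauses]
    forced = []
    while True:
        unit = next((c[0] for c in clauses if len(c) == 1), None)
        if unit is None:
            return clauses, forced
        forced.append(unit)
        simplified = []
        for c in clauses:
            if unit in c:
                continue  # clause is satisfied by the forced literal
            reduced = [l for l in c if l != -unit]  # remove the falsified literal
            if not reduced:
                return None, forced  # empty clause: conflict
            simplified.append(reduced)
        clauses = simplified
```

For example, `unit_propagate([[1], [-1, 2], [-2, 3, 4]])` forces literals 1 and 2 and leaves the single clause `[3, 4]`. Binary clauses become unit clauses after one assignment, which is why propagation is especially effective on the binary part of a mixture.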
25 Focus --- 2) Structure
- Proposal: study the influence of global structure on problem hardness. (Gomes and Selman 1997, 1998)
26 Quasigroups
Defn.: a pair (Q, ·) where Q is a set and · is a binary operation on Q such that the equations a · x = b and y · a = b are uniquely solvable for every pair of elements a, b in Q.
The multiplication table of its binary operation defines a latin square (i.e., each element of Q appears exactly once in each row/column).
Example: quasigroup of order 4
27 Quasigroup Completion Problem (QCP)
Given a partial latin square, can it be completed?
Example
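As a minimal sketch of the completion question, the following plain backtracking search fills a partial latin square given as a list of rows, with 0 marking an empty cell. The representation and the two small instances are hypothetical and are not the example from the slide.

```python
def can_place(square, r, c, v):
    """True if value v does not already occur in row r or column c."""
    n = len(square)
    return (all(square[r][j] != v for j in range(n))
            and all(square[i][c] != v for i in range(n)))

def complete(square):
    """Backtracking completion of a partial latin square (0 = empty cell).

    Assumes the pre-filled cells are conflict-free. Fills `square` in place
    and returns True if a completion exists, otherwise restores it and
    returns False.
    """
    n = len(square)
    for r in range(n):
        for c in range(n):
            if square[r][c] == 0:
                for v in range(1, n + 1):
                    if can_place(square, r, c, v):
                        square[r][c] = v
                        if complete(square):
                            return True
                        square[r][c] = 0  # undo and try the next value
                return False  # no value fits this cell
    return True  # no empty cells left

# Hypothetical order-4 instance that can be completed
partial = [[1, 0, 0, 4],
           [0, 0, 0, 0],
           [0, 0, 1, 0],
           [4, 0, 0, 0]]
solvable = complete(partial)

# Hypothetical order-3 instance with no completion:
# the top-right cell needs 3, but column 2 already contains 3
blocked = [[1, 2, 0],
           [0, 0, 3],
           [0, 0, 0]]
unsolvable = not complete(blocked)
```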
28 Quasigroup Completion Problem: A Framework for Studying Search
- NP-complete (Colbourn 1983, 1984; Anderson 1985).
- Has a regular global structure not found in random instances.
- Leads to interesting search problems when structure is perturbed.
- Similar to e.g. structure found in the channel assignment problem for cellular networks.
29 Computational Cost
31 Consequences for Algorithm Design
- On these structured problems, backtrack search methods show so-called heavy-tailed probability distributions (Gomes, Selman & Crato 1997, 1998).
- Both very short and very long runs occur much more frequently than one would expect.
32 Standard Distribution
33 Heavy-Tailed Cost Distribution
34 Fringe of Search Tree
35 Algorithmic Strategy
- Rapid random restarts.
- Order of magnitude speedup. (Gomes et al. 1998, 1999)
- Related:
- Algorithm portfolios (Huberman 1998; Gomes 1998)
- Universal strategies (Ertel and Luby 1993; Alt et al. 1996)
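A restart strategy can be sketched as a wrapper that reruns a randomized search with a cutoff on each run. The code below is an illustrative sketch. `solve_once` stands for any hypothetical randomized procedure that returns a result or `None` when it hits its cutoff. The cutoff schedule shown is the sequence 1, 1, 2, 1, 1, 2, 4, ..., commonly called the Luby sequence, used here as one example of a universal strategy rather than as the schedule from the cited experiments.

```python
def luby(i):
    """i-th term (1-indexed) of the sequence 1, 1, 2, 1, 1, 2, 4, 1, ..."""
    k = 1
    while (1 << k) - 1 < i:
        k += 1
    if i == (1 << k) - 1:
        return 1 << (k - 1)  # last term of a block of length 2**k - 1
    return luby(i - (1 << (k - 1)) + 1)  # otherwise the earlier prefix repeats

def run_with_restarts(solve_once, cutoff_for_run, max_runs=1000):
    """Call solve_once(cutoff) on fresh runs until one succeeds.

    cutoff_for_run maps the run number (1, 2, 3, ...) to that run's cutoff.
    Returns (result, runs_used); result is None if every run was cut off.
    """
    for run in range(1, max_runs + 1):
        result = solve_once(cutoff_for_run(run))
        if result is not None:
            return result, run
    return None, max_runs
```

A fixed cutoff corresponds to `cutoff_for_run=lambda run: c` for some constant `c`. With heavy-tailed run times, cutting off long runs and restarting trades a small chance of losing a nearly finished run for a much larger chance of landing on one of the very short runs.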
36 Rapid Restarts --- Planning
37 Portfolio for heavy-tailed search procedures (2-20 processors)
38 C. Future Directions and Prospects
- Modeling resource constraints
- user requirements / utility
- should be possible to identify optimal restart strategies, possibly adaptive
- --- may need a way of measuring progress
- (Horvitz and Klein 1995; Gomes and Selman 1999)
39
- Adaptive computing
- combine statistical learning methods with combinatorial search techniques.
- first success: STAGE system for local search (Boyan and Moore 1998).
- extension: train a planner on small instances (Selman, Kautz, Huang 1999).
- Deeper theoretical understanding
- with continued interactions with experiments and applications.
40 Summary
- During the past few years, we have obtained a much better understanding of the nature of computationally hard problems.
- Rich interactions between physics, computer science, and mathematics, and between theory, experiments, and applications.
- Clear algorithmic progress with room for future improvements (possibly another level of scaling: 10^6 Boolean variables, 10^8 constraints; further applications).