Title: Exploiting Structure and Randomization in Combinatorial Search Carla P. Gomes gomes@cs.cornell.edu www.cs.cornell.edu/gomes Intelligent Information Systems Institute Department of Computer Science Cornell University
1Exploiting Structure and Randomization in Combinatorial Search
Carla P. Gomes
gomes@cs.cornell.edu
www.cs.cornell.edu/gomes
Intelligent Information Systems Institute
Department of Computer Science
Cornell University
2Outline
- A Structured Benchmark Domain
- Randomization
- Conclusions
4Quasigroups or Latin Squares: An Abstraction for Real World Applications
Given an N x N matrix and N colors, a quasigroup of order N is a colored matrix such that:
- all cells are colored;
- each color occurs exactly once in each row;
- each color occurs exactly once in each column.
Quasigroup or Latin Square (Order 4)
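The definition above can be checked mechanically. The following is a minimal sketch, assuming the matrix is given as a list of rows and colors are arbitrary hashable values; the function name `is_latin_square` is our own.

```python
def is_latin_square(grid):
    """Return True if grid is an N x N matrix in which each of N colors
    occurs exactly once in every row and every column."""
    n = len(grid)
    if any(len(row) != n for row in grid):
        return False
    colors = set(grid[0]) if n else set()
    if len(colors) != n:
        return False
    # A row of length n whose color set equals the n colors uses each once.
    rows_ok = all(set(row) == colors for row in grid)
    cols_ok = all({grid[r][c] for r in range(n)} == colors for c in range(n))
    return rows_ok and cols_ok


# Cyclic construction of an order-4 square: cell (r, c) gets color (r + c) mod 4.
order4 = [[(r + c) % 4 for c in range(4)] for r in range(4)]
print(is_latin_square(order4))
```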
5Quasigroup Completion Problem (QCP)
Given a partial assignment of colors (10 colors in this case), can the partial quasigroup (Latin square) be completed so that we obtain a full quasigroup?
Example: 32% preassignment
(Gomes & Selman 97)
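To make the completion question concrete, here is a minimal sketch of a plain chronological backtracking solver. It is only an illustration, not the solver used in the cited experiments; the representation (`None` for an uncolored cell, colors `0..N-1`) and the name `complete_quasigroup` are assumptions.

```python
def complete_quasigroup(grid):
    """Try to complete a partial Latin square in place.

    grid is an N x N list of lists; None marks an uncolored cell and
    colors are the integers 0..N-1. The preassigned cells are assumed
    to be conflict-free. Returns True if a completion was found.
    """
    n = len(grid)
    for r in range(n):
        for c in range(n):
            if grid[r][c] is None:
                # Colors already used in this row or this column.
                used = set(grid[r]) | {grid[i][c] for i in range(n)}
                for color in range(n):
                    if color not in used:
                        grid[r][c] = color
                        if complete_quasigroup(grid):
                            return True
                        grid[r][c] = None  # undo and try the next color
                return False  # no color fits this cell: backtrack
    return True  # no uncolored cell left


partial = [[0, None, None], [None, 0, None], [None, None, None]]
if complete_quasigroup(partial):
    for row in partial:
        print(row)
```

The fixed row-by-row cell order and increasing color order make this the deterministic baseline; a randomized variant would shuffle one or both.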
6Quasigroup Completion Problem: A Framework for Studying Search
- NP-complete.
- Has a structure not found in random instances, such as random K-SAT.
- Leads to interesting search problems when structure is perturbed (more about it later).
- Good abstraction for several real world problems: scheduling and timetabling, routing in fiber optics, coding, etc.
(Anderson 85, Colbourn 83, 84, Denes & Keedwell 94, Fujita et al. 93, Gent et al. 99, Gomes & Selman 97, Gomes et al. 98, Meseguer & Walsh 98, Stergiou and Walsh 99, Shaw et al. 98, Stickel 99, Walsh 99)
7Fiber Optic Networks
Nodes connect point to point fiber optic links
9Routing in Fiber Optic Networks
(Figure: a routing node with input ports 1-4 and output ports 1-4)
How can we achieve conflict-free routing in each node of the network?
Dynamic wavelength routing is an NP-hard problem.
10QCP Example Use: Routers in Fiber Optic Networks
Dynamic wavelength routing in Fiber Optic
Networks can be directly mapped into the
Quasigroup Completion Problem.
(Barry and Humblet 93, Cheung et al. 90, Green
92, Kumar et al. 99)
11Traditional View of Hard Problems - Worst Case View
- They're NP-complete: there's no way to do anything but try heuristic approaches and hope for the best.
12New Concepts in Computation
- Not all NP-hard problems are the same!
- We now have means for discriminating easy from hard instances ---> phase transition concepts
13 NP-completeness is a worst-case notion: what about average complexity?
Structural differences between instances of the same NP-complete problem (QCP)
14 Are all the Quasigroup Instances (of same size)
Equally Difficult?
What is the fundamental difference between
instances?
15 Are all the Quasigroup Instances Equally Difficult?
(Figure: four example instances, labeled 1820, 165, 50, and 40)
16Complexity of Quasigroup Completion
(Plot: median runtime (log scale) vs. fraction of pre-assignment)
17Phase Transition
(Plot: fraction of unsolvable cases vs. fraction of pre-assignment)
18- These results for the QCP, a structured domain, nicely complement previous results on phase transitions and computational complexity for random instances such as SAT, Graph Coloring, etc.
(Broder et al. 93, Clearwater and Hogg 96, Cheeseman et al. 91, Cook and Mitchell 98, Crawford and Auton 93, Crawford and Baker 94, Dubois 90, Frank et al. 98, Frost and Dechter 94, Gent and Walsh 95, Hogg et al. 96, Mitchell et al. 92, Kirkpatrick and Selman 94, Monasson et al. 99, Motwani et al. 94, Pemberton and Zhang 96, Prosser 96, Schrag and Crawford 96, Selman and Kirkpatrick 97, Smith and Grant 94, Smith and Dyer 96, Zhang and Korf 96, and more)
19QCP: Different Representations / Encodings
20Cubic representation of QCP
(Figure: cube with axes rows, columns, and colors)
21QCP as a MIP
- Variables
- Constraints
- Row/color line
- Column/color line
- Row/column line
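One common way to write the three families of line constraints is sketched below. It assumes binary variables $x_{ijk}$ that equal 1 exactly when the cell in row $i$, column $j$ has color $k$; the symbol names are our own choice and may differ from the notation on the original slide.

```latex
\begin{align*}
x_{ijk} &\in \{0, 1\} && i, j, k \in \{1, \dots, n\} \\
\sum_{j=1}^{n} x_{ijk} &= 1 && \forall\, i, k \quad \text{(row/color line)} \\
\sum_{i=1}^{n} x_{ijk} &= 1 && \forall\, j, k \quad \text{(column/color line)} \\
\sum_{k=1}^{n} x_{ijk} &= 1 && \forall\, i, j \quad \text{(row/column line)} \\
x_{ijk} &= 1 && \text{for every preassigned cell } (i, j) \text{ with color } k
\end{align*}
```

Each constraint says that one line of the $n \times n \times n$ cube contains exactly one variable set to 1.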
22QCP as a CSP
- Variables
- Constraints
vs. for MIP
- row
- column
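A sketch of the usual CSP formulation, assuming one variable $x_{ij}$ per cell whose value is the color of that cell (notation is ours, not taken from the slide):

```latex
\begin{align*}
x_{ij} &\in \{1, \dots, n\} && i, j \in \{1, \dots, n\} \\
\text{alldiff}(x_{i1}, x_{i2}, \dots, x_{in}) & && \forall\, i \quad \text{(row)} \\
\text{alldiff}(x_{1j}, x_{2j}, \dots, x_{nj}) & && \forall\, j \quad \text{(column)}
\end{align*}
```

Under these assumptions the CSP uses $n^2$ variables with domains of size $n$, compared with $n^3$ binary variables in a 0-1 formulation.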
23Exploiting Structure for Domain Reduction
- A very successful strategy for domain reduction in CSP is to exploit the structure of groups of constraints and treat them as global constraints.
- Example: using network flow algorithms
- All-different constraints
(Caseau and Laburthe 94, Focacci, Lodi, Milano 99, Nuijten & Aarts 95, Ottosson & Thorsteinsson 00, Refalo 99, Regin 94)
24Exploiting Structure in QCP: ALLDIFF as Global Constraint
(Berge 70, Regin 94, Shaw and Walsh 98)
25Exploiting Structure: Arc Consistency vs. AllDiff
AllDiff solves up to order 33
Size search space
26Quasigroup as Satisfiability
- Two different encodings for SAT
- 2D encoding (or minimal encoding)
- 3D encoding (or full encoding)
272D Encoding or Minimal Encoding
- Variables
- Each variable represents a color assigned to a cell.
- Clauses
- Some color must be assigned to each cell (clause of length n)
- No color is repeated in the same row (sets of negative binary clauses)
- No color is repeated in the same column (sets of negative binary clauses)
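The three clause families of the 2D encoding can be generated directly. This is a minimal sketch; the variable numbering in `var` (1-based integers, as in DIMACS CNF files), the function names, and the handling of preassigned cells as unit clauses are our assumptions.

```python
from itertools import combinations


def var(r, c, k, n):
    """Variable number (1-based) for 'cell (r, c) has color k'."""
    return r * n * n + c * n + k + 1


def encode_2d(n, preassigned=()):
    """Clauses of the 2D (minimal) encoding for order n.

    preassigned is an iterable of (row, column, color) triples, each
    added as a unit clause. Clauses are lists of signed integers.
    """
    clauses = []
    for r in range(n):
        for c in range(n):
            # Some color must be assigned to each cell (clause of length n).
            clauses.append([var(r, c, k, n) for k in range(n)])
    for k in range(n):
        for r in range(n):
            # No color is repeated in the same row.
            for c1, c2 in combinations(range(n), 2):
                clauses.append([-var(r, c1, k, n), -var(r, c2, k, n)])
        for c in range(n):
            # No color is repeated in the same column.
            for r1, r2 in combinations(range(n), 2):
                clauses.append([-var(r1, c, k, n), -var(r2, c, k, n)])
    clauses.extend([var(r, c, k, n)] for r, c, k in preassigned)
    return clauses


print(len(encode_2d(10)), "clauses for order 10")
```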
283D Encoding or Full Encoding
- This encoding is based on the cubic representation of the quasigroup: each line of the cube contains exactly one true variable.
- Variables
- Same as 2D encoding.
- Clauses
- Same as the 2D encoding, plus:
- Each color must appear at least once in each row
- Each color must appear at least once in each column
- No two colors are assigned to the same cell
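The additional clauses of the 3D encoding can be sketched in the same style. As before, the variable numbering and function names are our own assumptions; only the three clause families come from the slide.

```python
from itertools import combinations


def var(r, c, k, n):
    """Variable number (1-based) for 'cell (r, c) has color k'."""
    return r * n * n + c * n + k + 1


def extra_3d_clauses(n):
    """Clauses that the 3D (full) encoding adds on top of the 2D encoding."""
    clauses = []
    for k in range(n):
        for r in range(n):
            # Each color must appear at least once in each row.
            clauses.append([var(r, c, k, n) for c in range(n)])
        for c in range(n):
            # Each color must appear at least once in each column.
            clauses.append([var(r, c, k, n) for r in range(n)])
    for r in range(n):
        for c in range(n):
            # No two colors are assigned to the same cell.
            for k1, k2 in combinations(range(n), 2):
                clauses.append([-var(r, c, k1, n), -var(r, c, k2, n)])
    return clauses


print(len(extra_3d_clauses(10)), "extra clauses for order 10")
```

These clauses are logically redundant given the 2D clauses on a complete assignment, but they give unit propagation more to work with.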
29Capturing Structure - Performance of SAT Solvers
- State-of-the-art SAT solvers, both complete backtrack solvers and local search solvers, are very competitive with specialized CSP algorithms when using the 3D encoding.
- In contrast, complete SAT solvers (SATZ or SATO) perform very poorly on 2D encodings.
- Local search solvers (Walksat), however, perform well on 2D encodings.
30SATZ on 2D encoding (Order 20-28)
(Plot: curves from order 20 to order 28; axis tick at 1,000,000)
SATZ and SATO can only solve up to order 28 when using the 2D encoding. When using the 3D encoding, problems of the same size take only 0 or 1 backtracks, and much higher orders can be solved.
31Walksat on 2D and 3D encoding (Order 30-33)
(Plot: curves for the 2D and 3D encodings at order 33; axis tick at 1,000,000)
Walksat shows an unusual pattern: the 2D encodings are somewhat easier than the 3D encodings at the peak and harder in the underconstrained region.
32Quasigroup - Satisfiability
- Encoding the quasigroup using only Boolean variables in clausal form using the 3D encoding is very competitive.
- Very fast solvers: SATZ, GRASP, SATO, WALKSAT
33- Structural features of instances provide insights into their hardness, namely:
- Backbone
- Inherent Structure and Balance
34Backbone
Backbone is the shared structure of all the
solutions to a given instance.
This instance has 4 solutions
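For small instances the backbone can be computed by brute force: enumerate every completion and keep the cells whose color never changes. The sketch below assumes `None` marks an uncolored cell and reports only initially uncolored cells; the function names are ours, and this enumeration is exponential, so it is an illustration rather than a practical method.

```python
def solutions(grid):
    """Yield every completion of a partial Latin square (None = uncolored)."""
    n = len(grid)
    for r in range(n):
        for c in range(n):
            if grid[r][c] is None:
                used = set(grid[r]) | {grid[i][c] for i in range(n)}
                for color in range(n):
                    if color not in used:
                        grid[r][c] = color
                        yield from solutions(grid)
                grid[r][c] = None  # restore the cell before returning
                return
    yield [row[:] for row in grid]


def backbone(grid):
    """Map (row, col) -> color for uncolored cells that take the same
    color in all solutions. Returns None if there is no solution."""
    free = [(r, c) for r, row in enumerate(grid)
            for c, v in enumerate(row) if v is None]
    fixed = None
    for sol in solutions([row[:] for row in grid]):
        values = {cell: sol[cell[0]][cell[1]] for cell in free}
        if fixed is None:
            fixed = values
        else:
            fixed = {cell: v for cell, v in fixed.items() if values[cell] == v}
    return fixed


partial = [[0, 1, None], [None, None, None], [None, None, None]]
print(backbone(partial))
```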
35Phase Transition in the Backbone
- We have observed a transition in the backbone from a phase where the size of the backbone is around 0% to a phase with a backbone of size close to 100%.
- The phase transition in the backbone is sudden, and it coincides with the hardest problem instances.
(Achlioptas, Gomes, Kautz, Selman 00, Monasson et al. 99)
36New Phase Transition in Backbone: QCP (satisfiable instances only)
(Plot: backbone and computational cost vs. fraction of preassigned cells)
37- Inherent Structure and Balance
38Quasigroup Patterns and Problem Hardness
(Figure: quasigroup patterns ranging from tractable to very hard)
(Kautz, Ruan, Achlioptas, Gomes, Selman 2001)
39SATZ
(Plot legend: Balanced QCP, Rectangular QCP, QCP, QWH, Aligned QCP)
40Walksat
(Plot legend: Balanced filtered QCP, Balanced QWH, QCP, QWH, aligned, rectangular)
We observe the same ordering in hardness when using Walksat, SATZ, and SATO. Balancing makes instances harder.
41Phase Transitions, Backbone, Balance
- Summary
- The understanding of the structural properties of problem instances based on notions such as phase transitions, backbone, and balance provides new insights into the practical complexity of many computational tasks.
- Active research area with fruitful interactions between computer science, physics (approaches from statistical mechanics), and mathematics (combinatorics / random structures).
42Outline
- A Structured Benchmark Domain
- Randomization
- Conclusions
43Randomized Backtrack Search Procedures
44Background
- Stochastic strategies have been very successful in the area of local search.
- Simulated annealing
- Genetic algorithms
- Tabu Search
- Gsat and variants.
- Limitation: inherent incomplete nature of local search methods.
45Background
- We want to explore the addition of a stochastic element to a systematic search procedure without losing completeness.
46 Randomization
- We introduce stochasticity in a backtrack search method, e.g., by randomly breaking ties in variable and/or value selection.
- Compare with standard lexicographic tie-breaking.
47Randomization
- At each choice point, break ties (variable selection and/or value selection) randomly, or
- use a heuristic equivalence parameter (H): at every choice point, consider the H top choices as equally good and randomly select one of them.
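A minimal sketch of such a randomized choice point follows. It assumes the heuristic returns a numeric score (higher is better) and reads H as a count of top-ranked choices; the cited work may instead define H as a percentage window around the best score, so treat the parameter `h` and the name `pick_choice` as hypothetical.

```python
import random


def pick_choice(candidates, score, h=1, rng=random):
    """Randomly pick among the h best-scoring candidates.

    With h=1 only exact ties for the best score are broken randomly,
    which is the plain randomized tie-breaking case.
    """
    ranked = sorted(candidates, key=score, reverse=True)
    if not ranked:
        raise ValueError("no candidates")
    # Score of the h-th best candidate; anything at least this good
    # (including ties with it) is treated as equally good.
    cutoff = score(ranked[min(h, len(ranked)) - 1])
    pool = [c for c in ranked if score(c) >= cutoff]
    return rng.choice(pool)


scores = {"x1": 5, "x2": 5, "x3": 3, "x4": 1}  # hypothetical heuristic scores
rng = random.Random(0)
print(pick_choice(scores, scores.get, h=1, rng=rng))
print(pick_choice(scores, scores.get, h=3, rng=rng))
```

Because only the order of exploration changes, the backtrack search stays complete.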
48Randomized Strategies
49Quasigroup Demo
50Distributions of Randomized Backtrack Search
- Key Properties
- I Erratic behavior of mean
- II Distributions have heavy tails.
51Erratic Behavior of Search Cost: Quasigroup Completion Problem
(Plot: sample mean vs. number of runs; annotations: 3500!, Median = 1!)
521
53(Plot: proportion of cases solved; annotations: 75% < 30, 5% > 100000)
54Heavy-Tailed Distributions
- Infinite variance, infinite mean
- Introduced by Pareto in the 1920s --- probabilistic curiosity.
- Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena.
- Examples: stock market, earthquakes, weather, ...
55Decay of Distributions
- Standard --- Exponential Decay, e.g. Normal
- Heavy-Tailed --- Power Law Decay, e.g. Pareto-Levy
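The two kinds of decay can be written out as follows. These are the standard asymptotic tail forms for a standard normal and for a Pareto-Levy type distribution; the constant $C$ and index $\alpha$ are generic symbols, not values taken from the slide.

```latex
\begin{align*}
\text{Standard normal:} \quad & P[X > x] \sim \frac{1}{x\sqrt{2\pi}}\, e^{-x^2/2} && \text{(exponential decay)} \\
\text{Pareto-Levy:} \quad & P[X > x] \sim C\, x^{-\alpha}, \quad x > 0,\ 0 < \alpha < 2 && \text{(power law decay)}
\end{align*}
```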
56(No Transcript)
57Normal, Cauchy, and Levy
58Tail Probabilities (Standard Normal, Cauchy,
Levy)
59Example of Heavy Tailed Model (Random Walk)
- Random Walk
- Start at position 0
- Toss a fair coin
- with each head take a step up (+1)
- with each tail take a step down (-1)
X --- number of steps the random walk takes to return to position 0.
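The random walk is easy to simulate. This is a minimal sketch; the step cap `max_steps` is our addition, needed because the return time has infinite mean and an individual walk can run for a very long time.

```python
import random


def return_time(rng, max_steps=10**6):
    """Steps until a fair +1/-1 walk started at 0 first returns to 0.

    Returns None if the walk has not returned within max_steps.
    """
    position = 0
    for step in range(1, max_steps + 1):
        position += 1 if rng.random() < 0.5 else -1
        if position == 0:
            return step
    return None


rng = random.Random(0)
samples = [return_time(rng, max_steps=10_000) for _ in range(1000)]
returned = sorted(t for t in samples if t is not None)
print("median return time:", returned[len(returned) // 2])
print("largest observed return time:", returned[-1])
print("walks not back within 10,000 steps:", samples.count(None))
```

Most walks return almost immediately while a few take orders of magnitude longer, which is the signature of a heavy tail.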
60(No Transcript)
61Heavy-tails vs. Non-Heavy-Tails
(Plot: 1-F(x), unsolved fraction, vs. X, the number of steps the walk takes to return to zero (log scale); reference curves Normal(2,1) and Normal(2,1000000); annotation: 0.1% > 200000)
62How to Check for Heavy Tails?
- Log-log plot of the tail of the distribution should be approximately linear.
- Slope gives the value of the tail index α (the slope is -α):
- α <= 1: infinite mean and infinite variance
- 1 < α < 2: infinite variance
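The log-log check can be sketched as a least-squares fit on the empirical tail. This is one simple estimator among several, shown under our own assumptions: the fit uses only the largest 10% of samples (the `tail_fraction` parameter is hypothetical), and synthetic Pareto samples stand in for real run-time data.

```python
import math
import random


def tail_slope(samples, tail_fraction=0.1):
    """Least-squares slope of log(1 - F(x)) against log(x) over the largest
    tail_fraction of the samples. For a power-law tail the slope is
    roughly -alpha. Samples in the tail must be positive."""
    xs = sorted(samples)
    n = len(xs)
    start = int(n * (1 - tail_fraction))
    # Empirical survival 1 - F(x) at the i-th order statistic; the very
    # last point is dropped because its survival estimate would be 0.
    pts = [(math.log(xs[i]), math.log((n - i - 1) / n))
           for i in range(start, n - 1)]
    mx = sum(x for x, _ in pts) / len(pts)
    my = sum(y for _, y in pts) / len(pts)
    sxx = sum((x - mx) ** 2 for x, _ in pts)
    sxy = sum((x - mx) * (y - my) for x, y in pts)
    return sxy / sxx


rng = random.Random(0)
pareto = [rng.paretovariate(0.8) for _ in range(20000)]  # true alpha = 0.8
print("estimated alpha:", -tail_slope(pareto))
```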
63Heavy-Tailed Behavior in QCP Domain
(Plot: 1-F(x), unsolved fraction (log), vs. number of backtracks (log))
64-
-
- Formal Models of Heavy-Tailed Behavior in
Combinatorial Search
Chen, Gomes, Selman 2001
65Motivation
- Research on heavy tails has been largely based on empirical studies of run time distributions.
- Goal: to provide a formal characterization of tree search models and show under what conditions heavy-tailed distributions can arise.
- Intuition: heavy-tailed behavior arises
- from the fact that wrong branching decisions may lead the procedure to explore an exponentially large subtree of the search space that contains no solutions;
- when the procedure is characterized by a large variability in the time to find a solution on different runs, which leads to highly different trees from run to run.
66Balanced vs. Imbalanced Tree Model
- Balanced Tree Model
- chronological backtrack search model
- fixed variable ordering
- random child selection with no propagation
mechanisms
(show demo)
67- T(n): the number of leaf nodes visited.
- Choice at level i: 1 = bad choice, 0 = good choice.
- Note: there is exactly one choice of zero-one assignments to the choice variables for each possible value of T(n), and any such assignment has probability 2^-n.
- T(n) follows a uniform distribution.
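The uniformity claim can be checked by enumeration. The sketch below assumes a complete binary tree of height n with a single solution leaf, where a bad choice at level i forces the search through the whole wrong subtree of 2^(n-i) leaves before it backtracks; this cost formula is our reading of the model, not a formula shown on the slide.

```python
from itertools import product


def leaves_visited(choices):
    """T(n) for one run: choices[i-1] is 1 for a bad choice at level i,
    0 for a good one. A bad choice at level i costs the whole wrong
    subtree below it, which has 2**(n - i) leaves."""
    n = len(choices)
    wasted = sum(d * 2 ** (n - i) for i, d in enumerate(choices, start=1))
    return wasted + 1  # plus the solution leaf itself


n = 4
values = sorted(leaves_visited(c) for c in product((0, 1), repeat=n))
# Every value from 1 to 2**n should occur for exactly one assignment.
print(values == list(range(1, 2 ** n + 1)))
```

Since each of the 2^n assignments is equally likely and maps to a distinct value, T(n) is uniform on 1..2^n, so its mean and variance grow exponentially in n but its tail is not heavy.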
68The run time distribution of chronological
backtrack search on a complete balanced tree is
uniform (therefore not heavy-tailed). Both the
expected run time and variance scale
exponentially
69Balanced Tree Model
- The expected run time and variance scale exponentially in the height of the search tree (number of variables).
- The run time distribution is uniform (not heavy-tailed).
- Backtrack search on the balanced tree model has no restart strategy with expected polynomial time.
(Chen, Gomes & Selman 01)
70- How can we improve on the balanced search tree model?
- A very clever search heuristic that leads quickly to the solution node, but that is hard in general.
- A combination of pruning, propagation, and dynamic variable ordering that prunes subtrees that do not contain the solution, allowing for runs that are short.
- ---> resulting trees may vary dramatically from run to run.
71Formal Model Yielding Heavy-Tailed Behavior
- T: the number of leaf nodes visited up to and including the successful node
- b: branching factor
(show demo)
b = 2
72- Expected Run Time (infinite expected time)
- Variance (infinite variance)
- Tail (heavy-tailed)
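A worked version of these three statements, under an assumed form of the imbalanced-tree model: the search visits $T = b^i$ leaf nodes with probability $(1-p)\,p^i$, where $p$ is the probability of a wrong branching decision. The distribution and the symbol $p$ are assumptions based on our reading of Chen, Gomes, Selman 2001; the slides name only $T$ and $b$.

```latex
\begin{align*}
P[T = b^i] &= (1-p)\,p^i, \qquad i \ge 0 \\
E[T] &= (1-p)\sum_{i \ge 0} (pb)^i = \infty \quad \text{if } p \ge 1/b \\
E[T^2] &= (1-p)\sum_{i \ge 0} (pb^2)^i = \infty \quad \text{if } p \ge 1/b^2 \quad \text{(hence infinite variance)} \\
P[T \ge b^i] &= \sum_{j \ge i} (1-p)\,p^j = p^i
  \;\Longrightarrow\; P[T \ge L] = L^{-\alpha} \text{ for } L = b^i, \quad \alpha = \log_b(1/p)
\end{align*}
```

For example, with $b = 2$ and $p = 1/2$ this gives $\alpha = 1$: a power-law tail with infinite mean.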
73Bounded Heavy-Tailed Behavior
(show demo)
74No Heavy-tailed behavior for Proving Optimality
75Proving Optimality
76Small-World Vs. Heavy-Tailed Behavior
- Does a Small-World topology (Watts & Strogatz)
induce heavy-tail behavior?
The constraint graph of a quasigroup exhibits a
small-world topology (Walsh 99)
77Exploiting Heavy-Tailed Behavior
- Heavy-tailed behavior has been observed in several domains: QCP, Graph Coloring, Planning, Scheduling, Circuit synthesis, Decoding, etc.
- Consequence for algorithm design: use restarts or parallel / interleaved runs to exploit the extreme variance in performance.
Restarts provably eliminate heavy-tailed behavior.
(Gomes et al. 97, Hoos 99, Horvitz 99, Huberman, Lukose and Hogg 97, Karp et al. 96, Luby et al. 93, Rish et al. 97, Walsh 99)
78Super-linear Speedups
Interleaved (1 machine): 10 x 1 = 10 seconds
5x speedup
79Restarts
(Plot: 1-F(x), unsolved fraction, vs. number of backtracks (log); no restarts: 70% unsolved; restart every 4 backtracks: 0.001% unsolved at 250 backtracks (62 restarts))
80Example of Rapid Restart Speedup (planning)
(Plot: number of backtracks (log) vs. cutoff (log))
81Sketch of proof of elimination of heavy tails
- Let's truncate the search procedure after m backtracks.
- Probability of solving the problem with the truncated version
- Run the truncated procedure and restart it repeatedly.
82 Y - does not have Heavy Tails
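The argument can be illustrated by simulation. In this sketch, `run_once` is a stand-in for one randomized run and returns the number of backtracks that run would need; truncating at the cutoff m and restarting yields the total cost Y. If a truncated run succeeds with probability p_m > 0, then Y exceeds k*m only after k failures in a row, which has probability (1 - p_m)^k, an exponentially decaying tail. The Pareto cost model below is a hypothetical example, not data from the talk.

```python
import random


def run_with_restarts(run_once, cutoff, rng):
    """Restart a randomized search every `cutoff` backtracks until it succeeds.

    run_once(rng) returns the number of backtracks one unrestricted run
    would need. Returns the total backtracks Y spent across all runs.
    """
    total = 0
    while True:
        cost = run_once(rng)
        if cost <= cutoff:
            return total + cost
        total += cutoff  # abandon this run at the cutoff and restart


def heavy(rng):
    """Hypothetical heavy-tailed run cost: P[X > x] = x ** -0.5 for x >= 1."""
    return rng.paretovariate(0.5)


rng = random.Random(0)
no_restart = [heavy(rng) for _ in range(2000)]
restarted = [run_with_restarts(heavy, cutoff=4, rng=rng) for _ in range(2000)]
print("largest cost without restarts:", max(no_restart))
print("largest cost with restarts:   ", max(restarted))
```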
83Decoding in Communication Systems
84Retransmissions in Sequential Decoding
(Plot: 1-F(x), unsolved fraction, vs. number of backtracks (log), with and without retransmissions)
(Gomes et al. 2000 / 2001)
85Paramedic Crew Assignment
Paramedic crew assignment is the problem of
assigning paramedic crews from different
stations to cover a given region, given several
resource constraints.
86Deterministic Search
87Restarts
88Results on Effectiveness of Restarts
Deterministic
(*) not found after 2 days
89Algorithm Portfolio Design
90Motivation
- The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances.
- Goal: improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths.
91Branch & Bound: Best Bound vs. Depth First Search
92Branch & Bound (Randomized)
- Standard OR approach for solving Mixed Integer Programs (MIPs).
- Solve the linear relaxation of the MIP.
- Branch on the integer variables for which the solution of the LP relaxation is non-integer: apply a good heuristic (e.g., max infeasibility) for variable selection (randomization) and create two new nodes (floor and ceiling of the fractional value).
- Once we have found an integer solution, its objective value can be used to prune other nodes whose relaxations have worse values.
93Branch & Bound: Depth First vs. Best Bound
- Critical to the performance of Branch & Bound: the way in which the next node to be expanded is selected.
- Best-bound: select the node with the best LP bound (standard OR approach) ---> this case is equivalent to A*; the LP relaxation provides an admissible search heuristic.
- Depth-first: often quickly reaches an integer solution (may take longer to produce an overall optimal value).
94Portfolio of Algorithms
- A portfolio of algorithms is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors.
- Goal: to improve on the performance of the component algorithms in terms of
- expected computational cost
- risk (variance)
- Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk.
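Expected cost and risk of a portfolio can be estimated by simulation. The sketch below covers only the "different processors" case, where the portfolio finishes as soon as its fastest member does; the two run-time distributions are hypothetical stand-ins for a heavy-tailed depth-first search and a steadier best-bound search, and all names are our own.

```python
import random
import statistics


def portfolio_stats(samplers, trials, rng):
    """Mean and standard deviation of a portfolio's run time, by simulation.

    samplers holds one function per processor; each call samplers[i](rng)
    draws the run time of that processor's algorithm on the instance.
    The portfolio stops as soon as any processor finishes, so its run
    time is the minimum over processors.
    """
    times = [min(draw(rng) for draw in samplers) for _ in range(trials)]
    return statistics.mean(times), statistics.stdev(times)


def depth_first(rng):
    """Hypothetical heavy-tailed run time."""
    return 10 * rng.paretovariate(1.5)


def best_bound(rng):
    """Hypothetical well-behaved run time."""
    return rng.uniform(20, 40)


rng = random.Random(0)
for mix in ([depth_first] * 2, [depth_first, best_bound], [best_bound] * 2):
    mean, sd = portfolio_stats(mix, trials=5000, rng=rng)
    print([f.__name__ for f in mix], round(mean, 1), round(sd, 1))
```

Plotting the (standard deviation, mean) pair of every mix for a fixed number of processors gives the kind of picture from which an efficient frontier can be read off.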
95Branch & Bound for MIP: Depth-first vs. Best-bound
Depth-First: Average = 18000; St. Dev. = 30000
96 - Depth-First and Best-Bound do not dominate each other overall.
97Heavy-tailed behavior of Depth-first
98Portfolio for heavy-tailed search procedures (2 processors)
(Plot: expected run time vs. standard deviation of run time of portfolios, from 2 DF / 0 BB to 0 DF / 2 BB)
99Portfolio for 6 processors
(Plot: expected run time vs. standard deviation of run time of portfolios, from 0 DF / 6 BB to 6 DF / 0 BB)
100Portfolio for 20 processors
(Plot: expected run time vs. standard deviation of run time of portfolios, from 0 DF / 20 BB to 20 DF / 0 BB)
The optimal strategy is to run Depth First on the 20 processors!
Optimal collective behavior emerges from suboptimal individual behavior.
101Compute Clusters and Distributed Agents
- With the increasing popularity of compute clusters and distributed problem solving / agent paradigms, portfolios of algorithms --- and flexible computation in general --- are rapidly expanding research areas.
(Baptista and Marques da Silva 00, Boddy & Dean 95, Bayardo 99, Davenport 00, Hogg 00, Horvitz 96, Matsuo 00, Steinberg 00, Russell 95, Santos 99, Wellman 99, Zilberstein 99)
102Portfolio for heavy-tailed search procedures
(2-20 processors)
103 - A portfolio approach can lead to substantial
improvements in the expected cost and risk of
stochastic algorithms, especially in the presence
of heavy-tailed phenomena.
104Summary of Randomization
- Considered randomized backtrack search.
- Showed Heavy-Tailed Distributions.
- Suggests Rapid Restart Strategy.
- --- cuts very long runs
- --- exploits ultra-short runs
- Experimentally validated on previously unsolved planning and scheduling problems.
- Portfolio of Algorithms for cases where no single heuristic dominates.
105Research Direction: Learning Restart Policies
106Bayesian Model Structure Learning
Learning to infer predictive models from data and to identify key variables ---> restarts, cutoffs, and other adaptive behavior of search algorithms.
(Horvitz, Ruan, Gomes, Kautz, Selman, Chickering 2001)
107Quasigroup Order 34 (CSP)
(Figure labels: min depth; avg depth; variance in number of uncolored cells across rows and columns; number of uncolored cells per column; max number of uncolored cells across rows and columns. Green = long runs, gray = short runs.)
Model accuracy: 96.8% vs. 48% for the marginal model
108Analysis of different solver features and problem
features
109Outline
- A Structured Benchmark Domain
- Randomization
- Conclusions
110Summary
- The understanding of the structural properties of problem instances based on notions such as phase transitions, backbone, and balance provides new insights into the practical complexity of many computational tasks.
- Active research area with fruitful interactions between computer science, physics (approaches from statistical mechanics), and mathematics (combinatorics / random structures).
111 Summary
- Stochastic search methods (complete and incomplete) have been shown to be very effective.
- Restart strategies and portfolio approaches can lead to substantial improvements in the expected runtime and variance, especially in the presence of heavy-tailed phenomena.
- Randomization is therefore a tool to improve algorithmic performance and robustness.
- Machine learning techniques can be used to learn predictive models.
112Bridging the Gap
- Exploiting Structure: Tractable Components
- Transition Aware Systems (phase transition, constrainedness, backbone, resources)
- Randomization: exploits variance to improve robustness and performance
General Solution Methods
Real World Problems
113www.cs.cornell.edu/gomes
Check also www.cis.cornell.edu/iisi
Demos, papers, etc.