Exploiting Structure and Randomization in Combinatorial Search Carla P. Gomes gomes@cs.cornell.edu www.cs.cornell.edu/gomes Intelligent Information Systems Institute Department of Computer Science Cornell University

Description:

Given an N x N matrix, and given N colors, a quasigroup of order N is a ... (Berge 70, Regin 94, Shaw and Walsh 98) Carla P. Gomes. School on Optimization. CPAIOR02 ...

Slides: 113

Provided by: CarlaP5

Learn more at: http://www.cs.cornell.edu

Transcript and Presenter's Notes


1
Exploiting Structure and Randomization in Combinatorial Search
Carla P. Gomes
gomes@cs.cornell.edu
www.cs.cornell.edu/gomes
Intelligent Information Systems Institute
Department of Computer Science
Cornell University

2
Outline

A Structured Benchmark Domain
Randomization
Conclusions

3
Outline

A Structured Benchmark Domain
Randomization
Conclusions

4
Quasigroups or Latin Squares: An Abstraction for Real World Applications
Given an N x N matrix and N colors, a quasigroup of order N is a colored matrix such that:
- all cells are colored;
- each color occurs exactly once in each row;
- each color occurs exactly once in each column.
Quasigroup or Latin Square (Order 4)
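The three conditions in the definition can be checked mechanically. A minimal sketch, assuming colors are encoded as the integers 0..N-1 and the matrix is a list of rows (both are illustration choices, not from the slides):

```python
def is_latin_square(grid):
    """Return True if grid is an n x n Latin square over colors 0..n-1."""
    n = len(grid)
    colors = set(range(n))
    # Every row must have exactly n cells.
    if any(len(row) != n for row in grid):
        return False
    # Each color occurs exactly once in each row and in each column.
    rows_ok = all(set(row) == colors for row in grid)
    cols_ok = all({grid[r][c] for r in range(n)} == colors for c in range(n))
    return rows_ok and cols_ok


# Cyclic order-4 square: cell (r, c) gets color (r + c) mod 4.
cyclic4 = [[(r + c) % 4 for c in range(4)] for r in range(4)]
```

The cyclic square is only one of many order-4 quasigroups; it is used here because it is easy to generate.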
5
Quasigroup Completion Problem (QCP)
Given a partial assignment of colors (10 colors in this case), can the partial quasigroup (Latin square) be completed so that we obtain a full quasigroup?
Example: 32% preassignment
(Gomes & Selman 97)
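To make the completion question concrete, here is a minimal chronological backtracking sketch. It assumes colors are integers 0..N-1, that `None` marks an uncolored cell, and that the pre-assignment itself contains no row or column conflict. It is a plain illustration, not the solver used in the cited experiments.

```python
def complete_quasigroup(grid):
    """Complete a partial Latin square in place (None = empty cell).

    Returns True if a completion was found, False otherwise.
    Assumes the pre-assigned cells do not already conflict.
    """
    n = len(grid)
    for r in range(n):
        for c in range(n):
            if grid[r][c] is None:
                # Colors already used in this row or this column.
                used = set(grid[r]) | {grid[i][c] for i in range(n)}
                for color in range(n):
                    if color not in used:
                        grid[r][c] = color
                        if complete_quasigroup(grid):
                            return True
                # No color works here: undo and backtrack.
                grid[r][c] = None
                return False
    return True  # no empty cell left


solvable = [[0, None, None], [None, 0, None], [None, None, None]]
unsolvable = [[0, 1, None], [None, None, 2], [None, None, None]]
```

In `unsolvable`, the last cell of the first row must take color 2, which already appears in its column, so no completion exists.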
6
Quasigroup Completion Problem: A Framework for Studying Search

NP-Complete.
Has a structure not found in random instances, such as random K-SAT.
Leads to interesting search problems when structure is perturbed (more about it later).
Good abstraction for several real world problems: scheduling and timetabling, routing in fiber optics, coding, etc.

(Anderson 85, Colbourn 83, 84, Denes & Keedwell 94, Fujita et al. 93, Gent et al. 99, Gomes & Selman 97, Gomes et al. 98, Meseguer & Walsh 98, Stergiou and Walsh 99, Shaw et al. 98, Stickel 99, Walsh 99)
7
Fiber Optic Networks
Nodes connect point to point fiber optic links
8
Fiber Optic Networks
Nodes connect point to point fiber optic links
9
Routing in Fiber Optic Networks
(Figure: routing node with input ports 1-4 and output ports 1-4)
How can we achieve conflict-free routing in each node of the network?
Dynamic wavelength routing is an NP-hard problem.
10
QCP Example Use: Routers in Fiber Optic Networks
Dynamic wavelength routing in Fiber Optic Networks can be directly mapped into the Quasigroup Completion Problem.
(Barry and Humblet 93, Cheung et al. 90, Green 92, Kumar et al. 99)
11
Traditional View of Hard Problems - Worst Case View

They're NP-Complete; there's no way to do anything but try heuristic approaches and hope for the best.

12
New Concepts in Computation

Not all NP-Hard problems are the same!
We now have means for discriminating easy from
hard instances
---> Phase Transition concepts

13
NP-completeness is a worst-case notion: what about average complexity?
Structural differences between instances of the same NP-complete problem (QCP)
14

Are all the Quasigroup Instances (of same size)
Equally Difficult?
What is the fundamental difference between
instances?
15

Are all the Quasigroup Instances Equally
Difficult?
(Figure: four quasigroup instances, labeled 1820, 165, 50, and 40)
16
Complexity of Quasigroup Completion
(Plot: median runtime, log scale, vs. fraction of pre-assignment)
17
Phase Transition
(Plot: fraction of unsolvable cases vs. fraction of pre-assignment)
18

These results for the QCP, a structured domain, nicely complement previous results on phase transition and computational complexity for random instances such as SAT, Graph Coloring, etc.
(Broder et al. 93, Clearwater and Hogg 96, Cheeseman et al. 91, Cook and Mitchell 98, Crawford and Auton 93, Crawford and Baker 94, Dubois 90, Frank et al. 98, Frost and Dechter 94, Gent and Walsh 95, Hogg et al. 96, Mitchell et al. 92, Kirkpatrick and Selman 94, Monasson et al. 99, Motwani et al. 94, Pemberton and Zhang 96, Prosser 96, Schrag and Crawford 96, Selman and Kirkpatrick 97, Smith and Grant 94, Smith and Dyer 96, Zhang and Korf 96, and more)

19
QCP: Different Representations / Encodings
20

Cubic representation of QCP (axes: rows, columns, colors)
21
QCP as a MIP

Variables -
Constraints -

Row/color line
Column/color line
Row/column line
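The formulas on this slide did not survive transcription. A common way to write QCP as a 0-1 program, consistent with the three "line" labels above, is the 3D assignment formulation sketched below. The symbol x_{ijk} and the index names are assumptions for illustration, not taken from the slide.

```latex
\begin{align*}
& x_{ijk} \in \{0,1\}, \qquad x_{ijk} = 1 \iff \text{cell } (i,j) \text{ has color } k \\
& \sum_{j=1}^{n} x_{ijk} = 1 \quad \forall i,k \qquad \text{(row/color line)} \\
& \sum_{i=1}^{n} x_{ijk} = 1 \quad \forall j,k \qquad \text{(column/color line)} \\
& \sum_{k=1}^{n} x_{ijk} = 1 \quad \forall i,j \qquad \text{(row/column line)} \\
& x_{ijk} = 1 \quad \text{for every pre-assigned cell } (i,j) \text{ with color } k
\end{align*}
```

Each constraint says that one line of the n x n x n cube contains exactly one selected variable.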
22
QCP as a CSP

Variables -
Constraints -

vs. for MIP
row
column
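The CSP formulas are also missing from the transcript. A plausible reading, under the assumption that the slide contrasts variable counts with the MIP model, is one finite-domain variable per cell with all-different constraints on rows and columns. The symbol x_{ij} is an assumed name.

```latex
\begin{align*}
& x_{ij} \in \{1,\dots,n\} \quad \forall i,j
  \qquad (n^2 \text{ variables, vs. } n^3 \text{ 0-1 variables in a 3D MIP model}) \\
& \mathrm{alldiff}(x_{i1}, x_{i2}, \dots, x_{in}) \quad \forall i \qquad \text{(row)} \\
& \mathrm{alldiff}(x_{1j}, x_{2j}, \dots, x_{nj}) \quad \forall j \qquad \text{(column)}
\end{align*}
```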
23
Exploiting Structure for Domain Reduction

A very successful strategy for domain reduction in CSP is to exploit the structure of groups of constraints and treat them as global constraints.
Example: using Network Flow Algorithms - All-different constraints

(Caseau and Laburthe 94, Focacci, Lodi, Milano 99, Nuijten & Aarts 95, Ottosson & Thorsteinsson 00, Refalo 99, Regin 94)
24
Exploiting Structure in QCP: ALLDIFF as Global Constraint
(Berge 70, Regin 94, Shaw and Walsh 98)
25
Exploiting Structure: Arc Consistency vs. All Diff
AllDiff Solves up to order 33 Size search space

26
Quasigroup as Satisfiability

Two different encodings for SAT
2D encoding (or minimal encoding)
3D encoding (or full encoding)

27
2D Encoding or Minimal Encoding

Variables
Each variable represents a color assigned to a cell.
Clauses
Some color must be assigned to each cell (clause of length n)
No color is repeated in the same row (sets of negative binary clauses)
No color is repeated in the same column (sets of negative binary clauses)
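The three clause families of the 2D encoding can be generated directly. The sketch below assumes a DIMACS-style numbering in which variable `var(r, c, k, n)` is true when cell (r, c) has color k. The numbering scheme, the function names, and the `preassigned` dictionary are illustration choices, not part of the slides.

```python
from itertools import combinations


def var(r, c, k, n):
    """Map (row, column, color) to a positive DIMACS-style variable index."""
    return r * n * n + c * n + k + 1


def encode_2d(n, preassigned=None):
    """Return the 2D (minimal) encoding as a list of clauses (lists of ints)."""
    clauses = []
    # Some color must be assigned to each cell (clause of length n).
    for r in range(n):
        for c in range(n):
            clauses.append([var(r, c, k, n) for k in range(n)])
    for k in range(n):
        # No color is repeated in the same row.
        for r in range(n):
            for c1, c2 in combinations(range(n), 2):
                clauses.append([-var(r, c1, k, n), -var(r, c2, k, n)])
        # No color is repeated in the same column.
        for c in range(n):
            for r1, r2 in combinations(range(n), 2):
                clauses.append([-var(r1, c, k, n), -var(r2, c, k, n)])
    # Pre-assigned cells become unit clauses.
    for (r, c), k in (preassigned or {}).items():
        clauses.append([var(r, c, k, n)])
    return clauses
```

For order n this yields n^2 long clauses plus n^3 (n - 1) binary clauses, before any unit clauses for the pre-assignment.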

28
3D Encoding or Full Encoding

This encoding is based on the cubic representation of the quasigroup: each line of the cube contains exactly one true variable.
Variables
Same as 2D encoding.
Clauses
Same as the 2D encoding, plus:
Each color must appear at least once in each row
Each color must appear at least once in each column
No two colors are assigned to the same cell
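Written out in clausal form, the three additional clause families look as follows. This is a sketch, assuming x_{ijk} denotes "cell (i,j) has color k"; the symbol is an assumed name.

```latex
\begin{align*}
& \bigvee_{j=1}^{n} x_{ijk} \quad \forall i,k
  && \text{(color $k$ appears at least once in row $i$)} \\
& \bigvee_{i=1}^{n} x_{ijk} \quad \forall j,k
  && \text{(color $k$ appears at least once in column $j$)} \\
& \neg x_{ijk} \lor \neg x_{ijk'} \quad \forall i,j,\; k < k'
  && \text{(no two colors in cell $(i,j)$)}
\end{align*}
```

Together with the 2D clauses, these make every row/color, column/color, and row/column line of the cube contain exactly one true variable.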

29
Capturing Structure - Performance of SAT Solvers

State-of-the-art SAT solvers, both backtrack (complete) and local search, are very competitive with specialized CSP algorithms when using the 3D encoding.
In contrast, backtrack SAT solvers (SATZ or SATO) perform very poorly on 2D encodings.
Local search solvers (Walksat), however, perform well on 2D encodings.

30
SATZ on 2D encoding (Order 20-28)
(Plot: curves from order 20 to order 28; axis mark at 1,000,000)
SATZ and SATO can only solve up to order 28 when using the 2D encoding. When using the 3D encoding, problems of the same size take only 0 or 1 backtracks, and much higher orders can be solved.
31
Walksat on 2D and 3D encoding (Order 30-33)
(Plot: curves for 2D order 33 and 3D order 33; axis mark at 1,000,000)
Walksat shows an unusual pattern - the 2D encodings are somewhat easier than the 3D encoding at the peak and harder in the underconstrained region.
32
Quasigroup - Satisfiability

Encoding the quasigroup using only Boolean variables in clausal form using the 3D encoding is very competitive.
Very fast solvers - SATZ, GRASP, SATO, WALKSAT

Structural features of instances provide insights into their hardness, namely:
Backbone
Inherent Structure and Balance

34
Backbone
Backbone is the shared structure of all the
solutions to a given instance.
This instance has 4 solutions
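For small instances the backbone can be computed by brute force: enumerate all completions and keep the cells that take the same color in every one. The sketch below assumes colors 0..N-1 and `None` for empty cells, and it counts only initially empty cells as backbone (an illustration choice). Enumeration is exponential, so this is only meant for tiny orders.

```python
def all_completions(grid):
    """Yield every completion of a partial Latin square (None = empty cell)."""
    n = len(grid)
    for r in range(n):
        for c in range(n):
            if grid[r][c] is None:
                used = set(grid[r]) | {grid[i][c] for i in range(n)}
                for color in range(n):
                    if color not in used:
                        grid[r][c] = color
                        yield from all_completions(grid)
                grid[r][c] = None  # undo before returning to the caller
                return
    yield [row[:] for row in grid]  # grid is full: emit a copy


def backbone(grid):
    """Return {(row, col): color} for empty cells fixed across all solutions."""
    empty = [(r, c) for r, row in enumerate(grid)
             for c, v in enumerate(row) if v is None]
    solutions = list(all_completions([row[:] for row in grid]))
    if not solutions:
        return {}
    first = solutions[0]
    return {(r, c): first[r][c] for r, c in empty
            if all(s[r][c] == first[r][c] for s in solutions)}


loose = [[0, None, None], [None, None, None], [None, None, None]]
tight = [[0, 1, None], [1, None, None], [None, None, None]]
```

`loose` has four completions and an empty backbone; `tight` has a single completion, so every empty cell belongs to its backbone.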
35
Phase Transition in the Backbone

We have observed a transition in the backbone from a phase where the size of the backbone is around 0% to a phase with backbone of size close to 100%.
The phase transition in the backbone is sudden and it coincides with the hardest problem instances.

(Achlioptas, Gomes, Kautz, Selman 00, Monasson et al. 99)
36
New Phase Transition in Backbone: QCP (satisfiable instances only)
(Plot: % of backbone and computational cost vs. fraction of preassigned cells)
37

Inherent Structure and Balance

38
Quasigroup Patterns and Problem Hardness
Tractable
Very hard
(Kautz, Ruan, Achlioptas, Gomes, Selman 2001)
39
SATZ
(Plot legend: Balanced QCP, Rectangular QCP, QCP, QWH, Aligned QCP)
Walksat
(Plot legend: Balanced filtered QCP, Balanced QWH, QCP, QWH, aligned, rectangular)
We observe the same ordering in hardness when using Walksat, SATZ, and SATO. Balancing makes instances harder.
41
Phase Transitions, Backbone, Balance

Summary
The understanding of the structural properties of problem instances based on notions such as phase transitions, backbone, and balance provides new insights into the practical complexity of many computational tasks.
Active research area with fruitful interactions between computer science, physics (approaches from statistical mechanics), and mathematics (combinatorics / random structures).

42
Outline

A Structured Benchmark Domain
Randomization
Conclusions

43
Randomized Backtrack Search Procedures
44
Background

Stochastic strategies have been very successful in the area of local search:
Simulated annealing
Genetic algorithms
Tabu Search
GSAT and variants.
Limitation: the inherently incomplete nature of local search methods.

45
Background

We want to explore the addition of a
stochastic element to a systematic search
procedure without losing completeness.

46

Randomization

We introduce stochasticity in a backtrack
search method, e.g., by randomly breaking ties in
variable and/or value selection.
Compare with standard lexicographic
tie-breaking.

47
Randomization

At each choice point, break ties (variable selection and/or value selection) randomly, or use a
Heuristic equivalence parameter (H)
- at every choice point, consider the H% top choices as equally good, and randomly select a choice from the equally good choices.
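One way to realize both random tie-breaking and a heuristic equivalence parameter is sketched below. It assumes each candidate has a numeric heuristic score (higher is better) and treats candidates whose score is within `h_percent` of the best score as equally good. That score-based reading of H, and the names `pick_choice` and `h_percent`, are assumptions for illustration.

```python
import random


def pick_choice(scores, h_percent=0.0, rng=random):
    """Pick a key of `scores` at random among the near-best choices.

    scores: dict mapping choice -> heuristic score (higher is better).
    h_percent: 0 gives pure random tie-breaking among the best choices;
    larger values widen the set of choices treated as equally good.
    """
    best = max(scores.values())
    threshold = best - abs(best) * h_percent / 100.0
    candidates = [c for c, s in scores.items() if s >= threshold]
    return rng.choice(candidates)
```

With `h_percent=0` only exact ties are randomized, so the heuristic's ranking is otherwise respected. Because every choice remains reachable after backtracking, the search stays complete.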

48
Randomized Strategies

49
Quasigroup Demo
50
Distributions of Randomized Backtrack Search

Key Properties
I Erratic behavior of mean
II Distributions have heavy tails.

51
Erratic Behavior of Search Cost: Quasigroup Completion Problem
(Plot: sample mean vs. number of runs; annotations: 3500!, Median 1!)
52
1
53
75% < 30
5% > 100000
Proportion of cases solved
54
Heavy-Tailed Distributions

(infinite variance, infinite mean)
Introduced by Pareto in the 1920s --- probabilistic curiosity.
Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena.
Examples: stock market, earthquakes, weather, ...

55
Decay of Distributions

Standard --- Exponential Decay
e.g. Normal
Heavy-Tailed --- Power Law Decay
e.g. Pareto-Levy

56
(No Transcript)
57
Normal, Cauchy, and Levy
58
Tail Probabilities (Standard Normal, Cauchy,
Levy)

59
Example of Heavy-Tailed Model (Random Walk)

Random Walk
Start at position 0
Toss a fair coin:
with each head take a step up (+1)
with each tail take a step down (-1)

X --- number of steps the random walk takes to return to position 0.
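The walk is easy to simulate. In the sketch below the `max_steps` cap is an assumption added so that every run terminates: the tail of X decays only like x^(-1/2), so the expected return time is infinite and some runs are extremely long.

```python
import random


def return_time(rng, max_steps=10**6):
    """Steps until a fair +/-1 walk started at 0 first returns to 0.

    Returns None if the walk has not returned within max_steps.
    """
    position, steps = 0, 0
    while steps < max_steps:
        position += 1 if rng.random() < 0.5 else -1
        steps += 1
        if position == 0:
            return steps
    return None


rng = random.Random(0)
times = [return_time(rng, max_steps=10_000) for _ in range(2000)]
```

About half of the runs return after exactly 2 steps, while a small fraction hit the cap. That mix of very short and very long runs is the signature of a heavy tail.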
60
(No Transcript)
61
Heavy-tails vs. Non-Heavy-Tails
(Plot: unsolved fraction 1-F(x) vs. X, the number of steps the walk takes to return to zero, log scale; curves labeled Normal (2,1000000) and Normal (2,1); annotation: 0.1% > 200000)
62
How to Check for Heavy Tails?

Log-Log plot of tail of distribution should be approximately linear.
Slope gives value of alpha:
alpha <= 1: infinite mean and infinite variance
1 < alpha < 2: infinite variance
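The log-log check can be turned into a rough numerical estimate of the tail index. The sketch below fits a least-squares line to the upper part of the empirical survival function on log-log axes. The function name and the `tail_fraction` parameter are assumptions, and this simple regression is a diagnostic rather than a rigorous estimator.

```python
import math


def tail_index(samples, tail_fraction=0.1):
    """Estimate alpha in P[X > x] ~ C * x**(-alpha) from positive samples.

    Fits a line to (log x, log empirical survival) over the largest
    tail_fraction of the samples and returns minus its slope.
    """
    xs = sorted(samples)
    n = len(xs)
    start = int(n * (1 - tail_fraction))
    # Empirical survival at the i-th smallest sample is (n - i) / n.
    pts = [(math.log(xs[i]), math.log((n - i) / n)) for i in range(start, n)]
    mean_x = sum(x for x, _ in pts) / len(pts)
    mean_y = sum(y for _, y in pts) / len(pts)
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    return -sxy / sxx


# Exact Pareto quantiles with alpha = 1.5 lie on a straight line in log-log scale.
n_pts, alpha_true = 1000, 1.5
pareto_quantiles = [((n_pts - i) / n_pts) ** (-1 / alpha_true) for i in range(n_pts)]
```

An estimate below 2 suggests infinite variance, and an estimate at or below 1 suggests an infinite mean as well.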

63
Heavy-Tailed Behavior in QCP Domain
(Plot: unsolved fraction 1-F(x), log scale, vs. number of backtracks, log scale)
64

Formal Models of Heavy-Tailed Behavior in
Combinatorial Search

Chen, Gomes, Selman 2001
65
Motivation

Research on heavy tails has been largely based on empirical studies of run time distributions.
Goal: to provide a formal characterization of tree search models and show under what conditions heavy-tailed distributions can arise.
Intuition: heavy-tailed behavior arises from the fact that
- wrong branching decisions may lead the procedure to explore an exponentially large subtree of the search space that contains no solutions;
- the procedure is characterized by a large variability in the time to find a solution on different runs, which leads to highly different trees from run to run.

66
Balanced vs. Imbalanced Tree Model

Balanced Tree Model
chronological backtrack search model
fixed variable ordering
random child selection with no propagation
mechanisms

(show demo)
67

T(n) - the number of leaf nodes visited
- choice at level i (1 - bad choice, 0 - good choice)
(Note: there is exactly one choice of zero-one assignments to the variables for each possible value of T(n); any such assignment has probability .)
T(n) follows a uniform distribution.
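The symbols on this slide were lost in transcription. One reconstruction consistent with the text is sketched below. It assumes a complete binary tree of height n with a single solution leaf, and it writes X_i for the choice at level i (an assumed name), each X_i being 0 or 1 with probability 1/2 independently.

```latex
\begin{align*}
T(n) &= 1 + \sum_{i=1}^{n} X_i \, 2^{\,n-i}, \qquad X_i \in \{0,1\} \\
P[\,T(n) = t\,] &= 2^{-n}, \qquad t = 1, 2, \dots, 2^{n} \\
E[\,T(n)\,] &= \frac{2^{n} + 1}{2}, \qquad
\mathrm{Var}[\,T(n)\,] = \frac{2^{2n} - 1}{12}
\end{align*}
```

A bad choice at level i forces the search to visit all 2^(n-i) leaves of the wrong subtree, which is why the X_i act as the binary digits of T(n) - 1 and every value of T(n) is equally likely.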

68
The run time distribution of chronological backtrack search on a complete balanced tree is uniform (therefore not heavy-tailed). Both the expected run time and variance scale exponentially.
69
Balanced Tree Model

The expected run time and variance scale exponentially in the height of the search tree (number of variables).
The run time distribution is uniform (not heavy-tailed).
Backtrack search on the balanced tree model has no restart strategy with expected polynomial time.

Chen, Gomes & Selman 01
70

How can we improve on the balanced search tree model?
Very clever search heuristic that leads quickly to the solution node - but that is hard in general
Combination of pruning, propagation, and dynamic variable ordering that prunes subtrees that do not contain the solution, allowing for runs that are short.
---> resulting trees may vary dramatically from run to run.

71
Formal Model Yielding Heavy-Tailed Behavior

T - the number of leaf nodes visited up to and including the successful node
b - branching factor

(show demo)
b = 2
72

Expected Run Time
(infinite expected time)
Variance
(infinite variance)
Tail
(heavy-tailed)
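The formulas behind these three labels are missing from the transcript. The sketch below shows one simple imbalanced-tree model that produces exactly this behavior. It assumes that at each level the search makes a bad choice with probability p (an assumed parameter name) and that the cost multiplies by b with each consecutive bad choice.

```latex
\begin{align*}
P[\,T = b^{\,i}\,] &= (1-p)\,p^{\,i}, \qquad i = 0, 1, 2, \dots \\
E[\,T\,] &= \sum_{i \ge 0} (1-p)\,(pb)^{i} = \infty
  \quad \text{if } p \ge \tfrac{1}{b} \\
E[\,T^{2}\,] &= \sum_{i \ge 0} (1-p)\,(pb^{2})^{i} = \infty
  \quad \text{if } p \ge \tfrac{1}{b^{2}} \\
P[\,T \ge L\,] &= L^{\log_b p} = L^{-\alpha}
  \quad \text{for } L = b^{\,i}, \qquad \alpha = \log_b \tfrac{1}{p}
\end{align*}
```

The tail decays as a power law, and alpha < 2 whenever p > 1/b^2, which matches the infinite-variance case on the slide.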

73
Bounded Heavy-Tailed Behavior
(show demo)
74
No Heavy-tailed behavior for Proving Optimality
75
Proving Optimality
76
Small-World Vs. Heavy-Tailed Behavior

Does a Small-World topology (Watts & Strogatz) induce heavy-tailed behavior?

The constraint graph of a quasigroup exhibits a small-world topology (Walsh 99)
77
Exploiting Heavy-Tailed Behavior

Heavy-tailed behavior has been observed in several domains: QCP, Graph Coloring, Planning, Scheduling, Circuit synthesis, Decoding, etc.
Consequence for algorithm design:
Use restarts or parallel / interleaved runs to exploit the extreme variance in performance.

Restarts provably eliminate heavy-tailed behavior.
(Gomes et al. 97, Hoos 99, Horvitz 99, Huberman, Lukose and Hogg 97, Karp et al. 96, Luby et al. 93, Rish et al. 97, Walsh 99)
78
Super-linear Speedups
Interleaved (1 machine): 10 x 1 = 10 seconds
5 x speedup
79
Restarts
(Plot: unsolved fraction 1-F(x) vs. number of backtracks, log scale)
no restarts: 70% unsolved
restart every 4 backtracks: 0.001% unsolved at 250 backtracks (62 restarts)
80
Example of Rapid Restart Speedup (planning)
(Plot: number of backtracks, log scale, vs. cutoff, log scale)
81
Sketch of proof of elimination of heavy tails

Let's truncate the search procedure after m backtracks.
Probability of solving problem with truncated version
Run the truncated procedure and restart it repeatedly.
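The truncate-and-restart procedure can be sketched as a small wrapper around any randomized search. The interface `search(rng, cutoff)`, which returns the number of backtracks used or `None` when the cutoff is hit, is an assumption for illustration. `toy_search` is a hypothetical stand-in whose cost follows a power-law tail, not a real solver.

```python
import random


def run_with_restarts(search, cutoff, rng, max_restarts=10**6):
    """Run `search` with a fixed cutoff, restarting until it succeeds.

    Returns (total_backtracks, number_of_restarts).
    """
    total = 0
    for restarts in range(max_restarts):
        used = search(rng, cutoff)
        if used is not None:
            return total + used, restarts
        total += cutoff  # a failed run consumed the whole cutoff
    raise RuntimeError("no solution within max_restarts")


def toy_search(rng, cutoff, p=0.5, b=2):
    """Hypothetical randomized search: costs b**i with probability (1-p)*p**i."""
    cost = 1
    while rng.random() < p:
        cost *= b
    return cost if cost <= cutoff else None


total, restarts = run_with_restarts(toy_search, 4, random.Random(1))
```

Each truncated run succeeds with a fixed positive probability, so the number of restarts is geometrically distributed and very long runs become exponentially unlikely.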

82

Y - does not have Heavy Tails
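The missing formulas can be reconstructed along the following lines. This is a sketch that assumes X is the number of backtracks of the untruncated procedure, p_m is the success probability of one truncated run, and Y is the total number of backtracks of the restarted procedure. Only Y is named on the slide; the other symbols are assumed.

```latex
\begin{align*}
p_m &= P[\,X \le m\,] > 0 \\
P[\,Y > k\,m\,] &\le (1 - p_m)^{k}, \qquad k = 0, 1, 2, \dots \\
P[\,Y > x\,] &\le (1 - p_m)^{\lfloor x/m \rfloor}
\end{align*}
```

If Y exceeds km, the first k truncated runs must all have failed. The tail of Y therefore decays exponentially rather than as a power law.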
83
Decoding in Communication Systems
84
Retransmissions in Sequential Decoding
(Plot: unsolved fraction 1-F(x) vs. number of backtracks, log scale; curves: without retransmissions, with retransmissions)
Gomes et al. 2000 / 2001
85
Paramedic Crew Assignment
Paramedic crew assignment is the problem of
assigning paramedic crews from different
stations to cover a given region, given several
resource constraints.
86
Deterministic Search
87
Restarts
88
Results on Effectiveness of Restarts
Deterministic

() not found after 2 days
89
Algorithm Portfolio Design

90
Motivation

The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances.
Goal: improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths.

91
Branch & Bound: Best Bound vs. Depth First Search
92
Branch & Bound (Randomized)

Standard OR approach for solving Mixed Integer Programs (MIPs):
Solve the linear relaxation of the MIP.
Branch on the integer variables for which the solution of the LP relaxation is non-integer: apply a good heuristic (e.g., max infeasibility) for variable selection (randomization) and create two new nodes (floor and ceiling of the fractional value).
Once we have found an integer solution, its objective value can be used to prune other nodes whose relaxations have worse values.

93
Branch & Bound: Depth First vs. Best Bound

Critical in performance of Branch & Bound: the way in which the next node to be expanded is selected.
Best-bound - select the node with the best LP bound (standard OR approach) ---> this case is equivalent to A*; the LP relaxation provides an admissible search heuristic
Depth-first - often quickly reaches an integer solution (may take longer to produce an overall optimal value)
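The two node-selection rules can be compared on a toy problem. The sketch below swaps in a 0/1 knapsack for a general MIP, because its LP relaxation has a closed-form greedy solution and no LP solver is needed. It also branches on items in a fixed order rather than on the fractional variable. All names are assumptions for illustration.

```python
import heapq


def fractional_bound(values, weights, capacity, level, value, weight):
    """LP-relaxation bound: fill greedily, allowing one fractional item.

    Assumes items are sorted by value/weight ratio, descending.
    """
    bound, room = value, capacity - weight
    for i in range(level, len(values)):
        if weights[i] <= room:
            room -= weights[i]
            bound += values[i]
        else:
            bound += values[i] * room / weights[i]
            break
    return bound


def branch_and_bound(values, weights, capacity, best_bound=True):
    """Maximize a 0/1 knapsack; returns (best_value, nodes_expanded)."""
    order = sorted(range(len(values)),
                   key=lambda i: values[i] / weights[i], reverse=True)
    values = [values[i] for i in order]
    weights = [weights[i] for i in order]
    # Node = (-bound, level, value, weight); negated bound gives a max-heap.
    frontier = [(-fractional_bound(values, weights, capacity, 0, 0, 0), 0, 0, 0)]
    best, expanded = 0, 0
    while frontier:
        # Best-bound pops the most promising node; depth-first pops the newest.
        node = heapq.heappop(frontier) if best_bound else frontier.pop()
        neg_bound, level, value, weight = node
        if -neg_bound <= best:
            continue  # prune: relaxation cannot beat the incumbent
        expanded += 1
        if level == len(values):
            best = value  # integer solution better than the incumbent
            continue
        for take in (0, 1):  # exclude the item, then include it
            w = weight + take * weights[level]
            if w > capacity:
                continue
            v = value + take * values[level]
            b = fractional_bound(values, weights, capacity, level + 1, v, w)
            child = (-b, level + 1, v, w)
            if best_bound:
                heapq.heappush(frontier, child)
            else:
                frontier.append(child)
    return best, expanded
```

Both strategies return the same optimum; they differ in how many nodes they expand and in how early they find a first integer solution.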

94
Portfolio of Algorithms

A portfolio of algorithms is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors.
Goal: to improve on the performance of the component algorithms in terms of
expected computational cost
risk (variance)
Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk.
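The expected cost and risk of a portfolio can be estimated by simulation. The sketch below assumes one algorithm per processor, independent runs, and a portfolio that stops as soon as any component finishes, so its run time is the minimum of the component run times. The two samplers are hypothetical run-time distributions, not measurements.

```python
import random
import statistics


def portfolio_stats(samplers, trials, rng):
    """Estimate (mean, standard deviation) of a portfolio's run time.

    samplers: one function per processor, each mapping rng -> run time.
    """
    times = [min(sample(rng) for sample in samplers) for _ in range(trials)]
    return statistics.mean(times), statistics.stdev(times)


def depth_first(rng):
    """Hypothetical heavy-tailed run time (Pareto, infinite variance)."""
    return rng.paretovariate(1.1)


def best_bound(rng):
    """Hypothetical stable but slower run time."""
    return rng.uniform(8.0, 12.0)


rng = random.Random(0)
mixes = {
    "2 DF / 0 BB": [depth_first, depth_first],
    "1 DF / 1 BB": [depth_first, best_bound],
    "0 DF / 2 BB": [best_bound, best_bound],
}
results = {name: portfolio_stats(s, 2000, rng) for name, s in mixes.items()}
```

Plotting the mean against the standard deviation for each mix gives the kind of expected-cost vs. risk picture from which an efficient frontier is read.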

95
Branch & Bound for MIP: Depth-first vs. Best-bound
Depth-First: Average 18000, St. Dev. 30000
96

Depth-First and Best-Bound do not dominate each other overall.

97
Heavy-tailed behavior of Depth-first
98
Portfolio for heavy-tailed search procedures (2 processors)
(Plot: expected run time of portfolios vs. standard deviation of run time of portfolios; endpoints: 2 DF / 0 BB and 0 DF / 2 BB)
99
Portfolio for 6 processors
(Plot: expected run time of portfolios vs. standard deviation of run time of portfolios; endpoints: 0 DF / 6 BB and 6 DF / 0 BB)
100
Portfolio for 20 processors
(Plot: expected run time of portfolios vs. standard deviation of run time of portfolios; endpoints: 0 DF / 20 BB and 20 DF / 0 BB)
The optimal strategy is to run Depth First on the 20 processors!
Optimal collective behavior emerges from suboptimal individual behavior.
101
Compute Clusters and Distributed Agents

With the increasing popularity of compute clusters and distributed problem solving / agent paradigms, portfolios of algorithms --- and flexible computation in general --- are rapidly expanding research areas.

(Baptista and Marques da Silva 00, Boddy & Dean 95, Bayardo 99, Davenport 00, Hogg 00, Horvitz 96, Matsuo 00, Steinberg 00, Russell 95, Santos 99, Wellman 99, Zilberstein 99)
102
Portfolio for heavy-tailed search procedures
(2-20 processors)
103

A portfolio approach can lead to substantial
improvements in the expected cost and risk of
stochastic algorithms, especially in the presence
of heavy-tailed phenomena.

104
Summary of Randomization

Considered randomized backtrack search.
Showed Heavy-Tailed Distributions.
Suggests Rapid Restart Strategy.
--- cuts very long runs
--- exploits ultra-short runs
Experimentally validated on previously
unsolved planning and scheduling problems.
Portfolio of Algorithms for cases where no
single heuristic dominates

105
Research Direction: Learning Restart Policies

106
Bayesian Model Structure Learning
Learning to infer predictive models from data and to identify key variables => restarts, cutoffs and other adaptive behavior of search algorithms.
(Horvitz, Ruan, Gomes, Kautz, Selman, Chickering 2001)
107
Quasigroup Order 34 (CSP)
(Figure labels: min depth; avg depth; variance in number of uncolored cells across rows and columns; number of uncolored cells per column; max number of uncolored cells across rows and columns)
Green - long runs; Gray - short runs
Model accuracy 96.8% vs 48% for the marginal model
108
Analysis of different solver features and problem
features
109
Outline

A Structured Benchmark Domain
Randomization
Conclusions

110
Summary

The understanding of the structural properties of problem instances based on notions such as phase transitions, backbone, and balance provides new insights into the practical complexity of many computational tasks.
Active research area with fruitful interactions between computer science, physics (approaches from statistical mechanics), and mathematics (combinatorics / random structures).

111

Summary

Stochastic search methods (complete and incomplete) have been shown to be very effective.
Restart strategies and portfolio approaches can lead to substantial improvements in the expected runtime and variance, especially in the presence of heavy-tailed phenomena.
Randomization is therefore a tool to improve algorithmic performance and robustness.
Machine Learning techniques can be used to learn predictive models.

112
Bridging the Gap
Exploiting Structure: Tractable Components; Transition Aware Systems (phase transition, constrainedness, backbone, resources)
Randomization: Exploits variance to improve robustness and performance
General Solution Methods
Real World Problems
113
www.cs.cornell.edu/gomes
Check also www.cis.cornell.edu/iisi
Demos, papers, etc.
