Exploiting Structure and Randomization in Combinatorial Search
Carla P. Gomes, gomes@cs.cornell.edu, www.cs.cornell.edu/gomes
Intelligent Information Systems Institute, Department of Computer Science, Cornell University

1
Exploiting Structure and Randomization in Combinatorial Search
Carla P. Gomes
gomes@cs.cornell.edu
www.cs.cornell.edu/gomes
Intelligent Information Systems Institute
Department of Computer Science
Cornell University

2
Outline
  • A Structured Benchmark Domain
  • Randomization
  • Conclusions

3
Outline
  • A Structured Benchmark Domain
  • Randomization
  • Conclusions

4
Quasigroups or Latin Squares: An Abstraction for
Real-World Applications
Given an N x N matrix and N colors, a quasigroup
of order N is a colored matrix such that:
  - all cells are colored;
  - each color occurs exactly once in each row;
  - each color occurs exactly once in each column.
Quasigroup or Latin Square (Order 4)
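The defining property above can be checked mechanically. A minimal sketch (not part of the original slides) in Python:

```python
def is_latin_square(grid):
    """True if every row and every column of the N x N grid
    contains each of the N colors (0..N-1) exactly once."""
    n = len(grid)
    symbols = set(range(n))
    rows_ok = all(set(row) == symbols for row in grid)
    cols_ok = all(set(col) == symbols for col in zip(*grid))
    return rows_ok and cols_ok
```

For example, the cyclic order-4 square `[[0,1,2,3],[1,2,3,0],[2,3,0,1],[3,0,1,2]]` passes the check.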
5
Quasigroup Completion Problem (QCP)
Given a partial assignment of colors (10 colors
in this case), can the partial quasigroup (Latin
square) be completed so we obtain a full
quasigroup? Example:
32% preassignment
(Gomes Selman 97)
6
Quasigroup Completion Problem A Framework for
Studying Search
  • NP-Complete.
  • Has structure not found in random instances
    such as random K-SAT.
  • Leads to interesting search problems when the
    structure is perturbed (more about this later).
  • Good abstraction for several real-world problems:
    scheduling and timetabling, routing in fiber
    optics, coding, etc.

(Anderson 85, Colbourn 83, 84, Denes and Keedwell
94, Fujita et al. 93, Gent et al. 99, Gomes and
Selman 97, Gomes et al. 98, Meseguer and Walsh 98,
Stergiou and Walsh 99, Shaw et al. 98, Stickel
99, Walsh 99)
7
Fiber Optic Networks
Nodes connect point to point fiber optic links
8
Fiber Optic Networks
Nodes connect point to point fiber optic links
9
Routing in Fiber Optic Networks
[Figure: a routing node with input ports 1-4 and
output ports 1-4]
Routing Node
How can we achieve conflict-free routing in each
node of the network?
Dynamic wavelength routing is an NP-hard problem.
10
QCP Example Use: Routers in Fiber Optic Networks
Dynamic wavelength routing in Fiber Optic
Networks can be directly mapped into the
Quasigroup Completion Problem.
(Barry and Humblet 93, Cheung et al. 90, Green
92, Kumar et al. 99)
11
Traditional View of Hard Problems - Worst Case
View
  • They're NP-Complete; there's no way to do
    anything but try heuristic approaches and hope
    for the best.

12
New Concepts in Computation
  • Not all NP-Hard problems are the same!
  • We now have means for discriminating easy from
    hard instances
  • --> Phase transition concepts

13
NP-completeness is a worst-case notion; what
about average complexity? Structural
differences between instances of the same
NP-complete problem (QCP).
14

Are all the Quasigroup Instances (of the same size)
Equally Difficult?
What is the fundamental difference between
instances?
15

Are all the Quasigroup Instances Equally
Difficult?
[Figure: instances of the same size with very
different runtimes: 1820, 165, 50, and 40]
16
Complexity of Quasigroup Completion
[Plot: median runtime (log scale) vs. fraction of
pre-assignment]
17
Phase Transition
[Plot: fraction of unsolvable cases vs. fraction of
pre-assignment]
18
  • These results for the QCP, a structured
    domain, nicely complement previous results on
    phase transitions and computational complexity for
    random instances such as SAT, Graph Coloring,
    etc.
  • (Broder et al. 93, Clearwater and Hogg 96,
    Cheeseman et al. 91, Cook and Mitchell 98,
    Crawford and Auton 93, Crawford and Baker 94,
    Dubois 90, Frank et al. 98, Frost and Dechter
    94, Gent and Walsh 95, Hogg et al. 96,
    Mitchell et al. 92, Kirkpatrick and Selman 94,
    Monasson et al. 99, Motwani et al. 94, Pemberton
    and Zhang 96, Prosser 96, Schrag and Crawford
    96, Selman and Kirkpatrick 97, Smith and Grant
    94, Smith and Dyer 96, Zhang and Korf 96, and
    more)

19
QCP: Different Representations / Encodings
20

Rows
Colors
Columns
Cubic representation of QCP
21
QCP as a MIP
  • Variables -
  • Constraints -

Row/color line
Column/color line
Row/column line
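The variable and constraint formulas on this slide did not survive transcription. A standard 0-1 reconstruction consistent with the three line types listed above (an assumption, not the slide's original notation):

```latex
x_{ijk} \in \{0,1\} \quad \text{($x_{ijk}=1$ iff cell $(i,j)$ gets color $k$)}
\sum_{k=1}^{N} x_{ijk} = 1 \quad \forall i,j \quad \text{(row/column line: one color per cell)}
\sum_{j=1}^{N} x_{ijk} = 1 \quad \forall i,k \quad \text{(row/color line: each color once per row)}
\sum_{i=1}^{N} x_{ijk} = 1 \quad \forall j,k \quad \text{(column/color line: each color once per column)}
```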
22
QCP as a CSP
  • Variables - one per cell, with the N colors as
    its domain (vs. 0/1 variables in the MIP)
  • Constraints - all-different on each row and
    each column
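As an illustration of the CSP view, here is a minimal backtracking completion procedure (an illustrative sketch; the talk's actual solvers use stronger propagation such as all-different filtering):

```python
def complete_qcp(grid):
    """Complete a partial Latin square in place (None = uncolored cell)
    by chronological backtracking; a color is tried only if it does not
    already appear in the cell's row or column."""
    n = len(grid)
    cell = next(((i, j) for i in range(n) for j in range(n)
                 if grid[i][j] is None), None)
    if cell is None:
        return True                      # all cells colored: solution found
    i, j = cell
    used = {grid[i][k] for k in range(n)} | {grid[k][j] for k in range(n)}
    for color in range(n):
        if color not in used:            # respects both all-different constraints
            grid[i][j] = color
            if complete_qcp(grid):
                return True
            grid[i][j] = None            # undo assignment and backtrack
    return False
```

For example, `complete_qcp([[0, None, 2], [None]*3, [None]*3])` fills in a full order-3 quasigroup.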
23
Exploiting Structure for Domain Reduction
  • A very successful strategy for domain reduction
    in CSP is to exploit the structure of groups of
    constraints and treat them as global constraints.
  • Example using Network Flow Algorithms
  • All-different constraints

(Caseau and Laburthe 94; Focacci, Lodi, and Milano
99; Nuijten and Aarts 95; Ottosson and Thorsteinsson
00; Refalo 99; Regin 94)
24
Exploiting Structure in QCP: ALLDIFF as a Global
Constraint
(Berge 70, Regin 94, Shaw and Walsh 98 )
25
Exploiting Structure: Arc Consistency vs. AllDiff
AllDiff solves up to order 33. [Figure: size of the
search space]

26
Quasigroup as Satisfiability
  • Two different encodings for SAT
  • 2D encoding (or minimal encoding)
  • 3D encoding (or full encoding)

27
2D Encoding or Minimal Encoding
  • Variables
  • Each variable represents a color assigned to
    a cell.
  • Clauses
  • Some color must be assigned to each cell (clause
    of length n)
  • No color is repeated in the same row (sets of
    negative binary clauses)
  • No color is repeated in the same column (sets of
    negative binary clauses)
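The 2D clause groups above can be generated mechanically. A sketch (the variable numbering scheme is an assumption, DIMACS-style):

```python
import itertools

def qcp_2d_encoding(n):
    """Generate the 2D (minimal) SAT encoding of an empty order-n
    quasigroup. Variable v(i, j, k) is true iff cell (i, j) gets color k."""
    def v(i, j, k):
        return i * n * n + j * n + k + 1     # DIMACS-style positive integer
    clauses = []
    for i in range(n):
        for j in range(n):
            # some color must be assigned to each cell (clause of length n)
            clauses.append([v(i, j, k) for k in range(n)])
    for k in range(n):
        for i in range(n):
            # no color repeated in the same row (negative binary clauses)
            for j1, j2 in itertools.combinations(range(n), 2):
                clauses.append([-v(i, j1, k), -v(i, j2, k)])
        for j in range(n):
            # no color repeated in the same column (negative binary clauses)
            for i1, i2 in itertools.combinations(range(n), 2):
                clauses.append([-v(i1, j, k), -v(i2, j, k)])
    return clauses
```

For order n this produces n^2 length-n clauses and 2 n^2 C(n,2) binary clauses.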

28
3D Encoding or Full Encoding
  • This encoding is based on the cubic
    representation of the quasigroup: each line of
    the cube contains exactly one true variable.
  • Variables
  • Same as 2D encoding.
  • Clauses
  • Same as the 2D encoding, plus
  • Each color must appear at least once in each row
  • Each color must appear at least once in each
    column
  • No two colors are assigned to the same cell
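The additional clause groups of the 3D encoding can be sketched the same way (again with an assumed DIMACS-style variable numbering):

```python
import itertools

def qcp_3d_extra_clauses(n):
    """The clauses the 3D (full) encoding adds on top of the 2D encoding,
    so that every row/color, column/color, and cell line of the cube
    contains exactly one true variable."""
    def v(i, j, k):
        return i * n * n + j * n + k + 1
    clauses = []
    for k in range(n):
        for i in range(n):
            # each color appears at least once in each row
            clauses.append([v(i, j, k) for j in range(n)])
        for j in range(n):
            # each color appears at least once in each column
            clauses.append([v(i, j, k) for i in range(n)])
    for i in range(n):
        for j in range(n):
            # no two colors are assigned to the same cell
            for k1, k2 in itertools.combinations(range(n), 2):
                clauses.append([-v(i, j, k1), -v(i, j, k2)])
    return clauses
```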

29
Capturing Structure - Performance of SAT Solvers
  • State-of-the-art complete (backtrack) SAT
    solvers using the 3D encoding are very
    competitive with specialized CSP algorithms.
  • In contrast, complete SAT solvers (SATZ, SATO)
    perform very poorly on 2D encodings.
  • Local search solvers (Walksat), however,
    perform well on 2D encodings.

30
SATZ on 2D encoding (Orders 20-28)
Order 28
1,000,000
Order 20
SATZ and SATO can only solve up to order 28 when
using the 2D encoding. With the 3D encoding,
problems of the same size take only 0 or 1
backtracks, and much higher orders can be solved.
31
Walksat on 2D and 3D encodings (Orders 30-33)
1,000,000
2D order 33
3D order 33
Walksat shows an unusual pattern: the 2D
encodings are somewhat easier than the 3D
encoding at the peak, and harder in the
underconstrained region.
32
Quasigroup - Satisfiability
  • Encoding the quasigroup using only Boolean
    variables in clausal form with the 3D encoding
    is very competitive.
  • Very fast solvers: SATZ, GRASP, SATO, WALKSAT

33
  • Structural features of instances provide
    insights into their hardness, namely:
  • Backbone
  • Inherent structure and balance

34
Backbone
The backbone is the shared structure of all the
solutions to a given instance.
This instance has 4 solutions.
35
Phase Transition in the Backbone
  • We have observed a transition in the backbone
    from a phase where the size of the backbone is
    around 0% to a phase with a backbone of size close
    to 100%.
  • The phase transition in the backbone is sudden,
    and it coincides with the hardest problem
    instances.

(Achlioptas, Gomes, Kautz, Selman 00, Monasson et
al. 99)
36
New Phase Transition in Backbone QCP (satisfiable
instances only)
[Plot: % of backbone and computational cost vs.
fraction of preassigned cells]
37
  • Inherent Structure and Balance

38
Quasigroup Patterns and Problem Hardness
Tractable
Very hard
(Kautz, Ruan, Achlioptas, Gomes, Selman 2001)
39
[Plot (SATZ): hardness of Balanced QCP,
Rectangular QCP, QCP, QWH, and Aligned QCP]
40
Walksat
[Plot: hardness of balanced filtered QCP,
balanced QWH, QCP, QWH, aligned, rectangular]
We observe the same ordering in hardness when
using Walksat, SATZ, and SATO. Balancing makes
instances harder.
41
Phase Transitions, Backbone, Balance
  • Summary
  • The understanding of the structural properties
    of problem instances, based on notions such as
    phase transitions, backbone, and balance, provides
    new insights into the practical complexity of
    many computational tasks.
  • Active research area with fruitful interactions
    between computer science, physics (approaches
    from statistical mechanics), and mathematics
    (combinatorics / random structures).

42
Outline
  • A Structured Benchmark Domain
  • Randomization
  • Conclusions

43
Randomized Backtrack Search Procedures
44
Background
  • Stochastic strategies have been very successful
    in the area of local search:
  • Simulated annealing
  • Genetic algorithms
  • Tabu search
  • GSAT and variants.
  • Limitation: the inherent incomplete nature of
    local search methods.

45
Background
  • We want to explore the addition of a
    stochastic element to a systematic search
    procedure without losing completeness.

46

Randomization
  • We introduce stochasticity in a backtrack
    search method, e.g., by randomly breaking ties in
    variable and/or value selection.
  • Compare with standard lexicographic
    tie-breaking.

47
Randomization
  • At each choice point, break ties (in variable
    selection and/or value selection) randomly; or
  • Heuristic equivalence parameter (H):
    at every choice point, consider the H top
    choices equally good and randomly select
    among them.
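A sketch of the heuristic-equivalence rule (names and signature are illustrative, not from the talk):

```python
import random

def pick_choice(choices, score, H, rng=random):
    """Heuristic-equivalence randomization: rank the choices by the
    heuristic score (higher is better), treat the top H as equally good,
    and pick uniformly among them.  H = 1 recovers the deterministic
    heuristic; larger H injects more randomness."""
    ranked = sorted(choices, key=score, reverse=True)
    return rng.choice(ranked[:H])
```

For instance, `pick_choice([3, 1, 2], lambda x: x, 1)` always returns 3, while `H = 3` makes any of the three choices possible.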

48
Randomized Strategies

49
Quasigroup Demo
50
Distributions of Randomized Backtrack Search
  • Key Properties:
  • I. Erratic behavior of the mean.
  • II. Distributions have heavy tails.

51
Erratic Behavior of Search Cost: Quasigroup
Completion Problem
[Plot: sample mean (approx. 3500!) vs. number of
runs; the median is just 1]
52
53
75% of the cases are solved in fewer than 30
backtracks; 5% take more than 100,000.
Proportion of cases solved
54
Heavy-Tailed Distributions
  • Infinite variance; infinite mean.
  • Introduced by Pareto in the 1920s as a
    probabilistic curiosity.
  • Mandelbrot established the use of heavy-tailed
    distributions to model real-world fractal
    phenomena.
  • Examples: stock market, earthquakes, weather, ...

55
Decay of Distributions
  • Standard --- exponential decay,
    e.g., Normal.
  • Heavy-tailed --- power-law decay,
    e.g., Pareto-Levy.

57
Normal, Cauchy, and Levy
58
Tail Probabilities (Standard Normal, Cauchy,
Levy)

59
Example of Heavy Tailed Model(Random Walk)
  • Random Walk:
  • Start at position 0.
  • Toss a fair coin:
  • with each head, take a step up (+1);
  • with each tail, take a step down (-1).

X --- the number of steps the random walk takes
to return to position 0.
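The heavy tail of X is easy to observe by simulation (a sketch; the cap that censors extremely long walks is an implementation convenience, not part of the model):

```python
import random

def return_time(rng, cap=100000):
    """Number of steps a fair +1/-1 random walk takes to first return to
    position 0, censored at `cap` steps so no single run can take forever."""
    position = rng.choice([1, -1])
    steps = 1
    while position != 0 and steps < cap:
        position += rng.choice([1, -1])
        steps += 1
    return steps

rng = random.Random(42)
times = sorted(return_time(rng) for _ in range(2000))
# Heavy tail: the median is tiny, yet a few runs are enormous.
print("median:", times[len(times) // 2], "max:", times[-1])
```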
60
(No Transcript)
61
Heavy-tails vs. Non-Heavy-Tails
Normal(2, 1000000)
1-F(x): unsolved fraction
0.1% take more than 200,000 steps
Normal(2, 1)
X - number of steps the walk takes to return to
zero (log scale)
62
How to Check for Heavy Tails?
  • Log-log plot of the tail of the distribution
    should be approximately linear.
  • The slope gives the value of the index alpha:
  • alpha <= 1: infinite mean and
    infinite variance;
  • 1 < alpha <= 2: infinite variance.
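A sketch of the log-log check on synthetic data (the two-point slope estimator here is a simplification of fitting the whole tail):

```python
import math
import random

def tail_slope(samples, x_lo, x_hi):
    """Estimate the power-law tail index alpha as the negative slope of
    log(1 - F(x)) against log(x), read off at two points of the
    empirical survival function."""
    n = len(samples)
    def survival(x):
        return sum(1 for s in samples if s > x) / n
    return -(math.log(survival(x_hi)) - math.log(survival(x_lo))) / \
            (math.log(x_hi) - math.log(x_lo))

rng = random.Random(0)
# Pareto samples with alpha = 1 (infinite mean and variance):
# if U is uniform on [0, 1), then 1 / (1 - U) has P(X > x) = 1/x.
pareto = [1 / (1 - rng.random()) for _ in range(100000)]
print("estimated alpha:", round(tail_slope(pareto, 10, 1000), 2))
```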

63
Heavy-Tailed Behavior in QCP Domain
[Plot: 1-F(x) (log), the unsolved fraction, vs.
number of backtracks (log)]
64
  • Formal Models of Heavy-Tailed Behavior in
    Combinatorial Search

Chen, Gomes, Selman 2001
65
Motivation
  • Research on heavy tails has been largely based on
    empirical studies of run time distributions.
  • Goal: provide a formal characterization of
    tree search models and show under what conditions
    heavy-tailed distributions can arise.
  • Intuition: heavy-tailed behavior arises
  • from the fact that wrong branching decisions may
    lead the procedure to explore an exponentially
    large subtree of the search space that contains
    no solutions;
  • the procedure is characterized by a large
    variability in the time to find a solution on
    different runs, which leads to highly different
    trees from run to run.

66
Balanced vs. Imbalanced Tree Model
  • Balanced Tree Model:
  • chronological backtrack search model;
  • fixed variable ordering;
  • random child selection; no propagation
    mechanisms.

(show demo)
67
  • T(n) - the number of leaf nodes visited.
  • x_i - choice at level i (1 - bad choice, 0
    - good choice).
  • (Note: there is exactly one zero-one
    assignment to the x_i for each possible value
    of T(n); for a binary tree, any such assignment
    has probability (1/2)^n.)
  • T(n) follows a Uniform distribution.
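The uniform distribution of T(n) can be verified by simulating the model (a sketch under the binary-tree assumption):

```python
import random
from collections import Counter

def leaves_visited(n, rng):
    """Chronological backtrack search on a complete balanced binary tree
    of depth n: at each level a fair coin decides whether the branch
    taken first is the good one (x_i = 0) or the bad one (x_i = 1); a bad
    choice at level i costs an extra 2**(n - i - 1) leaves before the
    search backtracks onto the good branch."""
    return 1 + sum(rng.randint(0, 1) * 2 ** (n - i - 1) for i in range(n))

rng = random.Random(1)
counts = Counter(leaves_visited(3, rng) for _ in range(80000))
# T(3) should be (approximately) uniform over {1, ..., 8}.
print(sorted(counts.items()))
```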

68
The run time distribution of chronological
backtrack search on a complete balanced tree is
uniform (therefore not heavy-tailed). Both the
expected run time and the variance scale
exponentially.
69
Balanced Tree Model
  • The expected run time and variance scale
    exponentially in the height of the search tree
    (number of variables).
  • The run time distribution is Uniform (not
    heavy-tailed).
  • Backtrack search on the balanced tree model has
    no restart strategy with expected polynomial
    time.

Chen, Gomes Selman 01
70
  • How can we improve on the balanced search tree
    model?
  • A very clever search heuristic that leads quickly
    to the solution node - but that is hard in
    general.
  • A combination of pruning, propagation, and dynamic
    variable ordering that prunes subtrees that do not
    contain the solution, allowing for short runs.
  • --> resulting trees may vary dramatically from
    run to run.

71
Formal Model Yielding Heavy-Tailed Behavior
  • T - the number of leaf nodes visited up to and
    including the successful node; b - branching
    factor.

(show demo)
b = 2
72
  • Expected run time: infinite.
  • Variance: infinite.
  • Tail: power-law decay (heavy-tailed).
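One simple instantiation of the model can be simulated as follows (the exact parameterization is an assumption in the spirit of the Chen, Gomes, and Selman analysis, with b * p = 1 so the mean diverges):

```python
import random

def imbalanced_cost(b, p, rng, max_i=60):
    """Sample the search cost in a simple imbalanced-tree model: with
    probability p each successive branching decision is wrong, and i
    consecutive wrong decisions cost b**i leaves (so the cost is b raised
    to a geometric random variable)."""
    i = 0
    while rng.random() < p and i < max_i:
        i += 1
    return b ** i

rng = random.Random(7)
costs = sorted(imbalanced_cost(2, 0.5, rng) for _ in range(50000))
# With b * p = 1 the mean is unbounded: the sample mean is dominated by
# a few huge runs while the median stays tiny.
print("median:", costs[len(costs) // 2],
      "mean:", sum(costs) / len(costs))
```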

73
Bounded Heavy-Tailed Behavior
(show demo)
74
No Heavy-tailed behavior for Proving Optimality
75
Proving Optimality
76
Small-World Vs. Heavy-Tailed Behavior
  • Does a small-world topology (Watts and Strogatz)
    induce heavy-tailed behavior?

The constraint graph of a quasigroup exhibits a
small-world topology (Walsh 99).
77
Exploiting Heavy-Tailed Behavior
  • Heavy-tailed behavior has been observed in
    several domains: QCP, Graph Coloring, Planning,
    Scheduling, Circuit Synthesis, Decoding, etc.
  • Consequence for algorithm design:
  • use restarts or parallel / interleaved
    runs to exploit the extreme variance in
    performance.

Restarts provably eliminate heavy-tailed
behavior.
(Gomes et al. 97, Hoos 99, Horvitz 99, Huberman,
Lukose and Hogg 97, Karp et al. 96, Luby et al.
93, Rish et al. 97, Walsh 99)
78
Super-linear Speedups
Interleaved (1 machine): 10 x 1 = 10 seconds
5x speedup
79
Restarts
70% unsolved with
no restarts
1-F(x): unsolved fraction
restart every 4 backtracks:
0.001% unsolved after
250 backtracks (62 restarts)
Number of backtracks (log)
80
Example of Rapid Restart Speedup (planning)
[Plot: number of backtracks (log) vs. cutoff (log)]
81
Sketch of proof of elimination of heavy tails
  • Let's truncate the search procedure
    after m backtracks.
  • Probability of solving the problem with the
    truncated version:
  • Run the truncated procedure and restart it
    repeatedly.

82

Y - does not have Heavy Tails
83
Decoding in Communication Systems
84
Retransmissions in Sequential Decoding
without retransmissions
1-F(x): unsolved fraction
with retransmissions
Number of backtracks (log)
(Gomes et al. 2000/2001)
85
Paramedic Crew Assignment
Paramedic crew assignment is the problem of
assigning paramedic crews from different
stations to cover a given region, subject to
several resource constraints.
86
Deterministic Search
87
Restarts
88
Results on Effectiveness of Restarts
Deterministic

(*) not found after 2 days
89
Algorithm Portfolio Design

90
Motivation
  • The runtime and performance of randomized
    algorithms can vary dramatically on the same
    instance and on different instances.
  • Goal: improve the performance of different
    algorithms by combining them into a portfolio to
    exploit their relative strengths.

91
Branch & Bound: Best-Bound vs. Depth-First
Search
92
Branch & Bound (Randomized)
  • Standard OR approach for solving Mixed Integer
    Programs (MIPs):
  • Solve the linear relaxation of the MIP.
  • Branch on an integer variable for which the
    solution of the LP relaxation is non-integer:
  • apply a good heuristic (e.g., max infeasibility)
    for variable selection (randomization) and
    create two new nodes (floor and ceiling of the
    fractional value).
  • Once we have found an integer solution, its
    objective value can be used to prune other nodes
    whose relaxations have worse values.

93
Branch & Bound: Depth-First vs. Best-Bound
  • Critical to the performance of Branch & Bound:
    the way in which the next node to be expanded
    is selected.
  • Best-bound - select the node with the
    best LP bound
    (standard OR approach) -->
    this case is equivalent to A*; the LP
    relaxation provides an admissible search
    heuristic.
  • Depth-first - often quickly reaches an integer
    solution
    (may take longer to produce an overall optimal
    value).

94
Portfolio of Algorithms
  • A portfolio of algorithms is a collection of
    algorithms and/or copies of the same
    algorithm running interleaved or on different
    processors.
  • Goal: to improve on the performance of the
    component algorithms in terms of:
  • expected computational cost;
  • risk (variance).
  • Efficient Set or Efficient Frontier: the set of
    portfolios that are best in terms of expected
    value and risk.
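A sketch of evaluating portfolios empirically (df_like and bb_like are invented stand-ins for a heavy-tailed depth-first run and a stable best-bound run, not the talk's actual solvers):

```python
import random
import statistics

def portfolio_runtime(samplers, rng):
    """Runtime of a parallel portfolio: each processor runs one component
    algorithm, and the portfolio finishes as soon as the fastest does."""
    return min(sampler(rng) for sampler in samplers)

def df_like(rng):               # usually instant, occasionally catastrophic
    return 1 if rng.random() < 0.9 else 10000

def bb_like(rng):               # always moderate, low variance
    return rng.uniform(90, 110)

rng = random.Random(5)
for mix in [(df_like, df_like), (df_like, bb_like), (bb_like, bb_like)]:
    runs = [portfolio_runtime(mix, rng) for _ in range(20000)]
    print([f.__name__ for f in mix],
          "mean", round(statistics.mean(runs), 1),
          "stdev", round(statistics.stdev(runs), 1))
```

Plotting each mix's mean against its standard deviation traces out the efficient frontier described above.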

95
Branch & Bound for MIP: Depth-First vs.
Best-Bound
Depth-First: Average 18000; St. Dev. 30000
96
  • Depth-First and Best-Bound do not dominate
    each other overall.

97
Heavy-tailed behavior of Depth-first
98
Portfolio for heavy-tailed search procedures (2
processors)
2 DF / 0 BB
Expected run time of portfolios
0 DF / 2 BB
Standard deviation of run time of portfolios
99
Portfolio for 6 processors
0 DF / 6 BB
Expected run time of portfolios
6 DF / 0 BB
Standard deviation of run time of portfolios
100
Portfolio for 20 processors
0 DF / 20 BB
The optimal strategy is to run Depth First on
the 20 processors!
Expected run time of portfolios
Optimal collective behavior emerges from
suboptimal individual behavior.
20 DF / 0 BB
Standard deviation of run time of portfolios
101
Compute Clusters and Distributed Agents
  • With the increasing popularity of compute
    clusters and distributed problem solving / agent
    paradigms, portfolios of algorithms --- and
    flexible computation in general --- are rapidly
    expanding research areas.

(Baptista and Marques da Silva 00, Boddy and Dean
95, Bayardo 99, Davenport 00, Hogg 00, Horvitz
96, Matsuo 00, Steinberg 00, Russell 95, Santos
99, Wellman 99, Zilberstein 99)
102
Portfolio for heavy-tailed search procedures
(2-20 processors)
103
  • A portfolio approach can lead to substantial
    improvements in the expected cost and risk of
    stochastic algorithms, especially in the presence
    of heavy-tailed phenomena.

104
Summary of Randomization
  • Considered randomized backtrack search.
  • Showed heavy-tailed distributions.
  • Suggests a rapid restart strategy:
  • --- cuts very long runs;
  • --- exploits ultra-short runs.
  • Experimentally validated on previously
    unsolved planning and scheduling problems.
  • Portfolio of algorithms for cases where no
    single heuristic dominates.

105
Research Direction: Learning Restart Policies

106
Bayesian Model Structure Learning
Learning to infer predictive models from data and
to identify key variables --> restarts, cutoffs,
and other adaptive behavior of search algorithms.
(Horvitz, Ruan, Gomes, Kautz, Selman, Chickering
2001)
107
Quasigroup Order 34 (CSP)
[Plot features: min depth; avg depth; variance in
number of uncolored cells across rows and columns;
number of uncolored cells per column; max number of
uncolored cells across rows and columns.
Green - long runs, gray - short runs]
Model accuracy: 96.8% vs. 48% for the marginal model
108
Analysis of different solver features and problem
features
109
Outline
  • A Structured Benchmark Domain
  • Randomization
  • Conclusions

110
Summary
  • The understanding of the structural properties of
    problem instances, based on notions such as phase
    transitions, backbone, and balance, provides new
    insights into the practical complexity of many
    computational tasks.
  • Active research area with fruitful interactions
    between computer science, physics (approaches
    from statistical mechanics), and mathematics
    (combinatorics / random structures).

111

Summary
  • Stochastic search methods (complete and
    incomplete) have been shown to be very effective.
  • Restart strategies and portfolio approaches can
    lead to substantial improvements in expected
    runtime and variance, especially in the presence
    of heavy-tailed phenomena.
  • Randomization is therefore a tool to improve
    algorithmic performance and robustness.
  • Machine learning techniques can be used to learn
    predictive models.

112
Bridging the Gap
Exploiting Structure: tractable components;
transition-aware systems (phase transition,
constrainedness, backbone, resources).
Randomization: exploits variance to improve
robustness and performance.
General Solution Methods
Real World Problems
113
www.cs.cornell.edu/gomes
Check also www.cis.cornell.edu/iisi
Demos, papers, etc.