Constraint Optimization - PowerPoint PPT Presentation

About This Presentation
Title:

Constraint Optimization

Description:

Bucket elimination, dynamic programming, tree-clustering, bucket-elimination ... Non-serial Dynamic Programming (Bertele and Briochi, 1973) Generating the MPE-tuple ... – PowerPoint PPT presentation

Provided by: Rin21
Learn more at: https://ics.uci.edu

Transcript and Presenter's Notes

Title: Constraint Optimization


1
Chapter 13
  • Constraint Optimization
  • And counting, and enumeration
  • 275 class

2
From satisfaction to counting and
enumeration and to general graphical models
  • ICS-275, Spring 2007

3
Outline
  • Introduction
  • Optimization tasks for graphical models
  • Solving optimization problems with inference and
    search
  • Inference
  • Bucket elimination, dynamic programming
  • Mini-bucket elimination
  • Search
  • Branch and bound and best-first
  • Lower-bounding heuristics
  • AND/OR search spaces
  • Hybrids of search and inference
  • Cutset decomposition
  • Super-bucket scheme

4
Constraint Satisfaction
  • Example: map coloring
  • Variables: countries (A, B, C, etc.)
  • Values: colors (e.g., red, green, yellow)
  • Constraints: adjacent countries must be assigned different colors

Tasks: consistency? find one solution, find all solutions, counting
5
Propositional Satisfiability
φ = (¬C) ∧ (A ∨ B ∨ C) ∧ (¬A ∨ B ∨ E) ∧ (¬B ∨ C ∨ D)
6
Constraint Optimization Problems for Graphical Models
f(A,B,D) has scope A,B,D
7
Constraint Optimization Problems for Graphical Models
f(A,B,D) has scope {A,B,D}
Primal graph: variables → nodes; functions/constraints → arcs, e.g. f1(A,B,D), f2(D,F,G), f3(B,C,F)
F(a,b,c,d,f,g) = f1(a,b,d) + f2(d,f,g) + f3(b,c,f)
8
Constrained Optimization
Example: power plant scheduling
9
Probabilistic Networks
P(D|C,B)
P(S,C,B,X,D) = P(S) P(C|S) P(B|S) P(X|C,S) P(D|C,B)
10
Graphical models
  • A graphical model (X, D, C):
  • X = {X1, …, Xn} variables
  • D = {D1, …, Dn} domains
  • C = {F1, …, Ft} functions (constraints, CPTs, CNFs)
  • Primal graph

11
Graphical models
  • A graphical model (X, D, C):
  • X = {X1, …, Xn} variables
  • D = {D1, …, Dn} domains
  • C = {F1, …, Ft} functions (constraints, CPTs, CNFs)
  • Primal graph

A COP problem is defined by combination operators (sum, product) and marginalization operators (min, max) over consistent solutions:
  • MPE: max_X ∏_j P_j
  • CSP: is there an X such that ∏_j C_j = 1?
  • Max-CSP: min_X Σ_j F_j
  • Optimization: min_X Σ_j F_j
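To ground the operator view, here is a minimal brute-force sketch in Python (toy factors and names of my own choosing, not from the slides) that evaluates the MPE and Max-CSP objectives for a two-variable model by enumerating all assignments:

```python
from itertools import product

# Toy model over binary variables A and B.
domains = {"A": [0, 1], "B": [0, 1]}
P_A   = lambda a: [0.6, 0.4][a]                      # P(A)
P_B_A = lambda a, b: [[0.9, 0.1], [0.2, 0.8]][a][b]  # P(B|A)
F_AB  = lambda a, b: 0 if a != b else 3              # a cost function for Max-CSP

def assignments():
    for a, b in product(domains["A"], domains["B"]):
        yield a, b

# MPE: maximize the product of the probability factors over all assignments.
mpe = max(P_A(a) * P_B_A(a, b) for a, b in assignments())
# Max-CSP / optimization: minimize the sum of the cost functions.
opt = min(F_AB(a, b) for a, b in assignments())
print(mpe, opt)
```

Inference and search, discussed next, compute the same quantities without enumerating the full assignment space.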

12
Outline
  • Introduction
  • Optimization tasks for graphical models
  • Solving by inference and search
  • Inference
  • Bucket elimination, dynamic programming, tree-clustering
  • Mini-bucket elimination, belief propagation
  • Search
  • Branch and bound and best-first
  • Lower-bounding heuristics
  • AND/OR search spaces
  • Hybrids of search and inference
  • Cutset decomposition
  • Super-bucket scheme

13
Computing MPE
MPE = max_{a,b,c,d,e} P(a) P(b|a) P(c|a) P(d|a,b) P(e|b,c)
(Figure: buckets for variables B, C, E, D, A.)
14
Finding the MPE
Algorithm elim-mpe (Dechter, 1996); non-serial dynamic programming (Bertelè and Brioschi, 1973)
Elimination operator
15
Generating the MPE-tuple
16
Complexity
Algorithm elim-mpe (Dechter, 1996); non-serial dynamic programming (Bertelè and Brioschi, 1973)
Elimination operator
17
Complexity of bucket elimination
Bucket elimination is time and space exponential in the induced width w*(d) of the ordered primal graph: O(r · exp(w*(d))) time, where r is the number of functions.
The effect of the ordering: different orderings yield different induced widths.
Finding the smallest induced width is hard (NP-complete).
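The following is a minimal min-sum bucket-elimination sketch in Python (an illustrative data layout of my own, not the slides' code): each function is placed in the bucket of its latest variable in the ordering, buckets are processed last to first, and the message a bucket generates is defined over the bucket's remaining variables, which is where the exp(w*) dependence comes from.

```python
from itertools import product

def bucket_elimination_min_sum(functions, order, domains):
    """Sketch: functions are (scope, table) pairs, where scope is a tuple of
    variable names and table maps value tuples (in scope order) to costs.
    Returns the minimal total cost over all assignments."""
    buckets = {x: [] for x in order}
    for scope, table in functions:
        buckets[max(scope, key=order.index)].append((scope, table))
    constant = 0.0
    for x in reversed(order):                       # process the last variable first
        funcs = buckets[x]
        if not funcs:
            continue
        new_scope = sorted({v for s, _ in funcs for v in s if v != x},
                           key=order.index)
        new_table = {}
        for vals in product(*(domains[v] for v in new_scope)):
            env = dict(zip(new_scope, vals))
            new_table[vals] = min(                  # eliminate x by minimization
                sum(t[tuple({**env, x: xv}[v] for v in s)] for s, t in funcs)
                for xv in domains[x])
        if new_scope:                               # message goes to a lower bucket
            buckets[max(new_scope, key=order.index)].append((new_scope, new_table))
        else:
            constant += new_table[()]
    return constant
```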
18
Directional i-consistency
(Figure: bucket schematics for adaptive consistency, directional arc-consistency (d-arc), and directional path-consistency (d-path).)
19
Mini-bucket approximation: MPE task
Split a bucket into mini-buckets → bound complexity
20
Mini-Bucket Elimination
(Figure: bucket partitioning over the functions P(A), P(B|A), P(C|A), P(E|B,C), P(D|A,B).)
21
MBE-MPE(i): Algorithm approx-mpe (Dechter and Rish, 1997)
  • Input: i, the maximum number of variables allowed in a mini-bucket
  • Output: a lower bound (the probability of a suboptimal solution) and an upper bound

Example: approx-mpe(3) versus elim-mpe
22
Properties of MBE(i)
  • Complexity: O(r · exp(i)) time and O(exp(i)) space
  • Yields an upper bound and a lower bound
  • Accuracy determined by the upper/lower (U/L) bound ratio
  • As i increases, both accuracy and complexity increase
  • Possible uses of mini-bucket approximations:
  • As anytime algorithms
  • As heuristics in search
  • Other tasks: similar mini-bucket approximations for belief updating, MAP and MEU (Dechter and Rish, 1997)
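A sketch of the bucket-splitting step itself (greedy grouping; my own helper structure, consistent with the bucket-elimination sketch above): each mini-bucket's combined scope is kept to at most i variables, and eliminating the bucket's variable separately in each mini-bucket before combining the results gives a bound, since for minimization min_x [f(x) + g(x)] ≥ min_x f(x) + min_x g(x).

```python
def partition_into_minibuckets(bucket_functions, i):
    """Greedily group a bucket's (scope, table) functions into mini-buckets
    whose combined scopes have at most i variables (illustrative sketch)."""
    minibuckets = []          # each entry: {"scope": set, "functions": list}
    for scope, table in bucket_functions:
        for mb in minibuckets:
            union = mb["scope"] | set(scope)
            if len(union) <= i:           # fits without exceeding the i-bound
                mb["scope"] = union
                mb["functions"].append((scope, table))
                break
        else:                             # no existing mini-bucket fits: start a new one
            minibuckets.append({"scope": set(scope), "functions": [(scope, table)]})
    return minibuckets
```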

23
Outline
  • Introduction
  • Optimization tasks for graphical models
  • Solving by inference and search
  • Inference
  • Bucket elimination, dynamic programming
  • Mini-bucket elimination
  • Search
  • Branch and bound and best-first
  • Lower-bounding heuristics
  • AND/OR search spaces
  • Hybrids of search and inference
  • Cutset decomposition
  • Super-bucket scheme

24
The Search Space
Objective function
(Figure: full OR search tree over the binary variables.)
25
The Search Space
(Figure: the OR search tree with arc costs.)
Arc-cost is calculated based on cost components.
26
The Value Function
(Figure: the OR search tree annotated with node values.)
Value of a node = minimal-cost solution below it
27
An Optimal Solution
(Figure: the same tree with an optimal solution path highlighted.)
Value of a node = minimal-cost solution below it
28
Basic Heuristic Search Schemes
A heuristic function f(x) computes a lower bound on the best extension of x and can be used to guide a heuristic search algorithm. We focus on branch-and-bound and best-first search.
29
Classic Branch-and-Bound
Upper bound UB: the cost of the best solution found so far
Lower bound LB(n) = g(n) + h(n), where g(n) is the cost of the path from the root to node n and h(n) is a heuristic estimate of the best completion below n
Prune the subtree below n if LB(n) ≥ UB
(Figure: OR search tree with g(n) above node n and h(n) below it.)
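A compact depth-first branch-and-bound sketch over an OR tree (Python; cost and h are hypothetical callbacks standing in for the model's arc costs and a precompiled admissible heuristic):

```python
import math

def branch_and_bound(variables, domains, cost, h, assignment=None, g=0.0, ub=math.inf):
    """Depth-first branch and bound with a static variable ordering (sketch).
    cost(assignment, var, val): incremental cost of extending by var=val.
    h(assignment): admissible lower bound on completing 'assignment'."""
    if assignment is None:
        assignment = {}
    if len(assignment) == len(variables):
        return g                              # complete assignment: candidate UB
    var = variables[len(assignment)]
    for val in domains[var]:
        g_child = g + cost(assignment, var, val)
        child = {**assignment, var: val}
        if g_child + h(child) >= ub:          # LB(n) >= UB: prune this branch
            continue
        ub = min(ub, branch_and_bound(variables, domains, cost, h,
                                      child, g_child, ub))
    return ub
```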
30
How to Generate Heuristics
  • The principle of relaxed models
  • Linear optimization for integer programs
  • Mini-bucket elimination
  • Bounded directional consistency ideas

31
Generating Heuristics for Graphical Models (Kask and Dechter, 1999)
Given a cost function
C(a,b,c,d,e) = f(a) + f(b,a) + f(c,a) + f(e,b,c) + f(d,a,b),
define an evaluation function over a partial assignment as the cost of its best extension:
f(a,e,d) = min_{b,c} C(a,b,c,d,e)
         = f(a) + min_{b,c} [ f(b,a) + f(c,a) + f(e,b,c) + f(d,a,b) ]
         = g(a,e,d) + H(a,e,d)
32
Generating Heuristics (cont.)
H(a,e,d) = min_{b,c} [ f(b,a) + f(c,a) + f(e,b,c) + f(d,a,b) ]
         = min_c { f(c,a) + min_b [ f(e,b,c) + f(b,a) + f(d,a,b) ] }
         ≥ min_c [ f(c,a) + min_b f(e,b,c) ] + min_b [ f(b,a) + f(d,a,b) ]
         = hC(e,a) + hB(d,a)
Replacing H(a,e,d) by hB(d,a) + hC(e,a) gives an evaluation function f(a,e,d) = g(a,e,d) + hB(d,a) + hC(e,a) that lower-bounds the exact best-extension cost.
The heuristic function H is what is compiled during the preprocessing stage of the Mini-Bucket algorithm.
33
Generating Heuristics (cont.)
H(a,e,d) = min_{b,c} [ f(b,a) + f(c,a) + f(e,b,c) + f(d,a,b) ]
         = min_c { f(c,a) + min_b [ f(e,b,c) + f(b,a) + f(d,a,b) ] }
         ≥ min_c [ f(c,a) + min_b f(e,b,c) ] + min_b [ f(b,a) + f(d,a,b) ]
         = hC(e,a) + hB(d,a)
Replacing H(a,e,d) by hB(d,a) + hC(e,a) gives an evaluation function f(a,e,d) = g(a,e,d) + hB(d,a) + hC(e,a) that lower-bounds the exact best-extension cost.
The heuristic function H is what is compiled during the preprocessing stage of the Mini-Bucket algorithm.
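A quick numeric sanity check of the splitting inequality used above, with toy numbers of my own (not from the slides):

```python
# min_b [f1(b) + f2(b)]  >=  min_b f1(b) + min_b f2(b)
f1 = {0: 4, 1: 1}
f2 = {0: 2, 1: 5}
exact = min(f1[b] + f2[b] for b in (0, 1))    # joint minimization: 6
split = min(f1.values()) + min(f2.values())   # mini-bucket style bound: 3
assert split <= exact                          # the split is a valid lower bound
```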
34
Static MBE Heuristics
  • Given a partial assignment xp, estimate the cost
    of the best extension to a full solution
  • The evaluation function f(xp) can be computed using the functions recorded by the Mini-Bucket scheme

f(a,e,D) = g(a,e) + H(a,e,D)
35
Heuristics Properties
  • The MB heuristic is monotone and admissible
  • Retrieved in linear time
  • IMPORTANT:
  • Heuristic strength can vary with the i-bound of MB(i)
  • Higher i-bound → more pre-processing → stronger heuristic → less search
  • Allows a controlled trade-off between preprocessing and search

36
Experimental Methodology
  • Algorithms
  • BBMB(i): Branch-and-Bound with MB(i)
  • BBFB(i): Best-First search with MB(i)
  • MBE(i)
  • Test networks
  • Random Coding (Bayesian)
  • CPCS (Bayesian)
  • Random (CSP)
  • Measures of performance
  • Compare accuracy given a fixed amount of time -
    how close is the cost found to the optimal
    solution
  • Compare trade-off performance as a function of
    time

37
Empirical evaluation of mini-bucket heuristics: Bayesian networks, coding
38
Max-CSP experiments(Kask and Dechter, 2000)
39
Dynamic MB Heuristics
  • Rather than pre-compiling, the mini-bucket
    heuristics can be generated during search
  • Dynamic mini-bucket heuristics use the
    Mini-Bucket algorithm to produce a bound for any
    node in the search space
  • (a partial assignment, along the given variable
    ordering)

40
Dynamic MB and MBTE Heuristics(Kask, Marinescu
and Dechter, 2003)
  • Rather than precompiling, compute the heuristics during search
  • Dynamic MB: dynamic mini-bucket heuristics use the Mini-Bucket algorithm to produce a bound for any node during search
  • Dynamic MBTE: we can compute heuristics simultaneously for all uninstantiated variables using mini-bucket-tree elimination
  • MBTE is an approximation scheme defined over cluster trees. It outputs multiple bounds, for each variable and value extension, at once.

41
Cluster Tree Elimination - example
(Figure: cluster tree with clusters 1:{A,B,C}, 2:{B,C,D,F}, 3:{B,E,F}, 4:{E,F,G} and separators BC, BF, EF.)
42
Mini-Clustering
  • Motivation
  • The time and space complexity of Cluster Tree Elimination depend on the induced width w of the problem
  • When the induced width w is large, the CTE algorithm becomes infeasible
  • The basic idea:
  • Try to reduce the size of the cluster (the exponent): partition each cluster into mini-clusters with fewer variables
  • Accuracy parameter i: the maximum number of variables in a mini-cluster
  • The idea was explored for variable elimination (Mini-Bucket)

43
Idea of Mini-Clustering
Split a cluster into mini-clusters → bound complexity
44
Mini-Clustering - example
(Figure: the same cluster tree, with each cluster partitioned into mini-clusters.)
45
Mini Bucket Tree Elimination
(Figure: the bucket tree with clusters 1:{A,B,C}, 2:{B,C,D,F}, 3:{B,E,F}, 4:{E,F,G}, shown before and after mini-bucket partitioning.)
46
Mini-Clustering
  • Correctness and completeness: algorithm MC(i) computes a bound (or an approximation) for each variable and each of its values
  • MBTE is obtained when the clusters are the buckets of BTE

47
Branch and Bound w/ Mini-Buckets
  • BB with static Mini-Bucket Heuristics (s-BBMB)
  • Heuristic information is pre-compiled before
    search. Static variable ordering, prunes current
    variable
  • BB with dynamic Mini-Bucket Heuristics (d-BBMB)
  • Heuristic information is assembled during search.
    Static variable ordering, prunes current variable
  • BB with dynamic Mini-Bucket-Tree Heuristics
    (BBBT)
  • Heuristic information is assembled during search.
    Dynamic variable ordering, prunes all future
    variables

48
Empirical Evaluation
  • Measures
  • Time
  • Accuracy (% exact)
  • Backtracks
  • Bit Error Rate (coding)
  • Algorithms
  • Complete
  • BBBT
  • BBMB
  • Incomplete
  • DLM
  • GLS
  • SLS
  • IJGP
  • IBP (coding)
  • Benchmarks
  • Coding networks
  • Bayesian Network Repository
  • Grid networks (N-by-N)
  • Random noisy-OR networks
  • Random networks

49
Random Bayesian Networks Average Accuracy
Average accuracy. Random Bayesian networks (N=100, C=90, P=2), w*=17, 100 samples, 10 observations; (a) K=2, (b) K=3.
50
Random Bayesian Networks Solution Quality
Solution quality. Random Bayesian networks (N=100, C=90, P=3), w*=30, 100 samples, 10 observations; (a) K=2, (b) K=5.
51
Grid Networks
Average accuracy. Grid networks (N=100), w*=15, 100 samples, 10 observations.
52
Random Coding Networks Bit Error Rate
Average BER. Random coding networks (N=200, P=4), w*=22, 100 samples, 60 seconds.
53
Real World Benchmarks
Average accuracy and time. 30 samples, 10 observations, 30 seconds.
54
Empirical Results Max-CSP
  • Random binary problems ⟨N, K, C, T⟩
  • N: number of variables
  • K: domain size
  • C: number of constraints
  • T: tightness
  • Task: Max-CSP
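A generator sketch for such ⟨N, K, C, T⟩ instances (Python; I assume here that T counts forbidden value pairs per constraint, and the function name is my own):

```python
import random

def random_binary_maxcsp(n, k, c, t, seed=0):
    """Random binary Max-CSP instance <N, K, C, T> (sketch).
    Each constraint forbids t value pairs; violating a constraint costs 1."""
    rng = random.Random(seed)
    domains = {i: list(range(k)) for i in range(n)}
    all_pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    value_pairs = [(a, b) for a in range(k) for b in range(k)]
    constraints = []
    for (i, j) in rng.sample(all_pairs, c):
        nogoods = set(rng.sample(value_pairs, t))
        constraints.append(((i, j), nogoods))   # cost 1 if (x_i, x_j) in nogoods
    return domains, constraints
```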

55
BBBT(i) vs. BBMB(i), N=50
BBBT(i) vs. BBMB(i)
56
BBBT(i) vs. BBMB(i), N=100
BBBT(i) vs. BBMB(i).
57
Searching the Graph (caching goods)
(Figure: OR search graph with caching; context(A)={A}, context(B)={A,B}, context(C)={A,B,C}, context(D)={A,B,D}, context(E)={A,E}, context(F)={F}.)
58
Searching the Graph (caching goods)
(Figure: the same OR search graph annotated with node values; contexts as above.)
59
Outline
  • Introduction
  • Optimization tasks for graphical models
  • Solving by inference and search
  • Inference
  • Bucket elimination, dynamic programming
  • Mini-bucket elimination, belief propagation
  • Search
  • Branch and bound and best-first
  • Lower-bounding heuristics
  • AND/OR search spaces
  • Hybrids of search and inference
  • Cutset decomposition
  • Super-bucket scheme

60
Classic OR Search Space
Ordering: A, B, E, C, D, F
(Figure: full OR search tree over the six binary variables.)
61
AND/OR Search Space
(Figure: primal graph over A, B, C, D, E, F and a corresponding DFS (pseudo-) tree.)
62
AND/OR vs. OR
(Figure: AND/OR search tree alternating OR nodes (variables) and AND nodes (values 0/1).)
AND/OR size: exp(4); OR size: exp(6)
63
OR space vs. AND/OR space
Random graphs with 20 nodes, 20 edges and 2
values per node.
64
AND/OR vs. OR
(A=1, B=1), (B=0, C=0)
(Figure: AND/OR search tree.)
65
AND/OR vs. OR
(A=1, B=1), (B=0, C=0)
AND/OR: linear space, time O(exp(m)), also O(exp(w* log n))
OR: linear space, time O(exp(n))
(Figure: AND/OR search tree vs. OR search tree.)
66
CSP AND/OR Search Tree
(Figure: AND/OR search tree for the CSP, with OR levels A, B, C, E, D, F and AND levels for the values 0/1.)
67
CSP AND/OR Search Tree
(Figure: the same AND/OR search tree.)
68
CSP AND/OR Tree DFS
AND node: combination operator (product)
OR node: marginalization operator (summation)
(Figure: DFS traversal of the AND/OR tree, propagating counts upward.)
69
AND/OR Tree Search for COP
(Figure: AND/OR search tree annotated with node values.)
AND node: combination operator (summation)
OR node: marginalization operator (minimization)
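The two operators above translate directly into a recursive value computation over the pseudo-tree; here is a minimal min-sum sketch (Python; pseudo_children and cost are hypothetical callbacks describing the pseudo-tree and the instantiated cost functions):

```python
def or_node_value(var, assignment, pseudo_children, domains, cost):
    """Value of an OR node in an AND/OR search tree (min-sum sketch).
    pseudo_children[var]: children of var in the pseudo-tree.
    cost(var, val, assignment): sum of the cost functions fully instantiated
    by extending 'assignment' with var=val."""
    best = float("inf")
    for val in domains[var]:
        ext = {**assignment, var: val}
        # AND node: combine (sum) the values of the independent child subproblems.
        value = cost(var, val, assignment) + sum(
            or_node_value(child, ext, pseudo_children, domains, cost)
            for child in pseudo_children.get(var, []))
        best = min(best, value)   # OR node: marginalize (minimize) over the values
    return best
```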
70
Summary of AND/OR Search Trees
  • Based on a backbone pseudo-tree
  • A solution is a subtree
  • Each node has a value: the cost of the optimal solution to its subproblem (computed recursively from the values of its descendants)
  • Solving a task = finding the value of the root node
  • AND/OR search trees and algorithms: (Freuder and Quinn, 1985; Collin, Dechter and Katz, 1991; Bayardo and Miranker, 1995)
  • Space: O(n)
  • Time: O(exp(m)), where m is the depth of the pseudo-tree
  • Time: O(exp(w* log n))
  • BFS is time and space O(exp(w* log n))

71
An AND/OR Graph Caching Goods
(Figure: primal graph and pseudo-tree over A, B, C, D, E, F, G, H, J, K, and the corresponding AND/OR search graph with merged (cached) nodes.)
72
Context-based Caching
  • Caching is possible when context is the same
  • context = the parent-separator set in the induced pseudo-graph = the current variable plus its ancestors connected to the subtree below it

context(B) = {A,B}, context(C) = {A,B,C}, context(D) = {D}, context(F) = {F}
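Caching on contexts can be sketched as a memo table keyed by the variable and the values its context takes; the names below (contexts, compute_value) are my own stand-ins:

```python
cache = {}

def cached_value(var, assignment, contexts, compute_value):
    """Context-based caching sketch: two nodes with identical assignments to
    context(var) root identical subproblems, so their value is computed once.
    contexts[var]: the ancestor variables in var's context.
    compute_value(var, assignment): the underlying AND/OR traversal."""
    key = (var, tuple(assignment[a] for a in contexts[var]))
    if key not in cache:
        cache[key] = compute_value(var, assignment)
    return cache[key]
```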
73
Complexity of AND/OR Graph
  • Theorem: Traversing the AND/OR search graph is time and space exponential in the induced width / tree-width.
  • If applied to the OR graph, complexity is time and space exponential in the path-width.

74
CSP AND/OR Search Tree
(Figure: the AND/OR search tree for the CSP, repeated.)
75
CSP AND/OR Tree DFS
(Figure: DFS traversal of the AND/OR tree, repeated.)
76
CSP AND/OR Search Graph(Caching Goods)
(Figure: the AND/OR search graph obtained by merging nodes with identical contexts.)
77
CSP AND/OR Search Graph(Caching Goods)
Time and space: O(exp(w*))
(Figure: the context-minimal AND/OR search graph.)
78
All Four Search Spaces
Full OR search tree: 126 nodes
Context-minimal OR search graph: 28 nodes
Context-minimal AND/OR search graph: 18 AND nodes
Full AND/OR search tree: 54 AND nodes
79
AND/OR vs. OR DFS algorithms
k = domain size, m = pseudo-tree depth, n = number of variables, w* = induced width, pw = path-width
  • AND/OR tree: space O(n); time O(n k^m), also O(n k^(w* log n)) (Freuder 85; Bayardo 95; Darwiche 01)
  • AND/OR graph: space O(n k^w*); time O(n k^w*)
  • OR tree: space O(n); time O(k^n)
  • OR graph: space O(n k^pw); time O(n k^pw)
80
Searching AND/OR Graphs
  • AO(i) searches depth-first, caching i-contexts
  • i = the maximum size of a cache table (i.e., the number of variables in a context)

i = 0: space O(n), time O(exp(w* log n))
i = w*: space O(exp(w*)), time O(exp(w*))
AO(i) time complexity?
81
End of class
  • We did not cover the rest of the slides

82
AND/OR Branch-and-Bound (AOBB)
  • Associate each node n with a static heuristic estimate h(n) of v(n)
  • h(n) is a lower bound on the value v(n)
  • For every node n in the search tree:
  • ub(n) = current best solution cost rooted at n
  • lb(n) = lower bound on the minimal cost at n
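A simplified AOBB sketch at an OR node (Python, building on the earlier AND/OR value sketch; h is a hypothetical static mini-bucket heuristic, and for brevity the bound is only checked before expanding each AND child rather than being propagated through partial child sums as a full implementation would):

```python
def aobb_or_node(var, assignment, pseudo_children, domains, cost, h, ub=float("inf")):
    """AND/OR branch-and-bound sketch (min-sum).
    h(var, val, assignment): lower bound on the value of the AND child var=val."""
    best = ub
    for val in domains[var]:
        arc = cost(var, val, assignment)
        if arc + h(var, val, assignment) >= best:   # cannot improve: prune child
            continue
        ext = {**assignment, var: val}
        value = arc + sum(
            aobb_or_node(child, ext, pseudo_children, domains, cost, h)
            for child in pseudo_children.get(var, []))
        best = min(best, value)
    return best
```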

83
Lower/Upper Bounds
UB(X) = best cost found below X (i.e., v(X,0))
LB(X) = LB(X,1) = l(X,1) + v(A) + h(C) + LB(B)
LB(B) = LB(B,0) = h(B,0)
Prune below AND node (B,0) if LB(X) ≥ UB(X)
(Figure: partially explored AND/OR tree with explored values v(·) and heuristic estimates h(·).)
84
Shallow/Deep Cutoffs
Prune if LB(X) ≥ UB(X)
Shallow cutoff: LB(X) = h(X,1)
Deep cutoff: pruning occurs deeper below X
Reminiscent of minimax shallow/deep cutoffs
(Figure: two partially explored AND/OR trees illustrating shallow and deep cutoffs.)
85
Summary of AOBB
  • Traverses the AND/OR search tree in a depth-first
    manner
  • Lower bounds computed based on heuristic
    estimates of nodes at the frontier of search, as
    well as the values of nodes already explored
  • Prunes the search space as soon as an upper-lower
    bound violation occurs

86
Heuristics for AND/OR
  • In the AND/OR search space h(n) can be computed
    using any heuristic. We used
  • Static Mini-Bucket heuristics
  • Dynamic Mini-Bucket heuristics
  • Maintaining FDAC (Larrosa and Schiex, 2003)
  • (full directional soft arc-consistency)

87
Empirical Evaluation
  • Tasks
  • Solving WCSPs
  • Finding the MPE in belief networks
  • Benchmarks (WCSP)
  • Random binary WCSPs
  • RLFAP networks (CELAR6)
  • Bayesian Networks Repository
  • Algorithms
  • s-AOMB(i), d-AOMB(i), AOMFDAC
  • s-BBMB(i), d-BBMB(i), BBMFDAC
  • Static variable ordering (dfs traversal of the
    pseudo-tree)

88
Random Binary WCSPs (Marinescu and Dechter, 2005)
D-AOMB vs D-BBMB
S-AOMB vs S-BBMB
Random networks with n=20 (variables), d=5 (domain size), c=100 (constraints), t=70 (tightness). Time limit 180 seconds.
AND/OR search is superior to OR search.
89
Random Binary WCSPs (contd.)
Dense: n=20 (variables), d=5 (domain size), c=100 (constraints), t=70 (tightness)
Sparse: n=50 (variables), d=5 (domain size), c=80 (constraints), t=70 (tightness)
AOMB with large i is competitive with AOMFDAC
90
Resource Allocation
Radio Link Frequency Assignment Problem (RLFAP)
CELAR6 sub-instances
AOMFDAC is superior to ORMFDAC
91
Bayesian Networks Repository
Time limit 600 seconds
Available at http://www.cs.huji.ac.il/labs/compbio/Repository
Static AO is better with an accurate heuristic (large i)
92
Outline
  • Introduction
  • Optimization tasks for graphical models
  • Solving by inference and search
  • Inference
  • Bucket elimination, dynamic programming
  • Mini-bucket elimination, belief propagation
  • Search
  • Branch and bound and best-first
  • Lower-bounding heuristics
  • AND/OR search spaces
  • Searching trees
  • Searching graphs
  • Hybrids of search and inference
  • Cutset decomposition
  • Super-bucket scheme

93
From Searching Trees to Searching Graphs
  • Any two nodes that root identical
    subtrees/subgraphs can be merged
  • Minimal AND/OR search graph: the closure of the AND/OR search tree under merging
  • Inconsistent sub-trees can be pruned too.
  • Some portions can be collapsed or reduced.

94
AND/OR Search Graph
context(A) = {A}, context(B) = {B,A}, context(C) = {C,B}, context(D) = {D}, context(E) = {E,A}, context(F) = {F}
(Figure: primal graph and pseudo-tree over A, B, C, D, E, F.)
95
AND/OR Search Graph
Contexts as above.
(Figure: primal graph, pseudo-tree, and the corresponding AND/OR search graph with merged nodes.)
96
Context-based caching
context(A) = {A}, context(B) = {B,A}, context(C) = {C,B}, context(D) = {D}, context(E) = {E,A}, context(F) = {F}
(Figure: primal graph; the cache table for C has space O(exp(2)).)
97
Searching AND/OR Graphs
  • AO(j) searches depth-first, caching j-contexts
  • j = the maximum size of a cache table (i.e., the number of variables in a context)

j = 0: space O(n), time O(exp(w* log n))
j = w*: space O(exp(w*)), time O(exp(w*))
AO(j) time complexity?
98
Graph AND/OR Branch-and-Bound - AOBB(j)
(Marinescu and Dechter, CP 2005)
  • Like branch-and-bound on the tree except that, during search, nodes are merged based on context (caching): maintain cache tables of size O(exp(j)), where j is a bound on the size of the context.
  • Heuristics
  • Static mini-bucket
  • Dynamic mini-bucket
  • Soft directional arc-consistency

99
Pseudo-trees (I)
  • AND/OR graph/tree search algorithms are influenced by the quality of the pseudo-tree
  • Finding the minimal-context / minimal-depth pseudo-tree is a hard problem
  • Heuristics
  • Min-fill (min context)
  • Hypergraph separation (min depth)

100
Pseudo-trees (II)
  • MIN-FILL (Kjærulff, 1990; Bayardo and Miranker, 1995)
  • Depth-first traversal of the induced graph constructed along some elimination order
  • The elimination order prefers variables with the smallest fill set
  • HYPERGRAPH (Darwiche, 2001)
  • Constraints are vertices of the hypergraph and variables are hyperedges
  • Recursive decomposition of the hypergraph while minimizing the separator size (i.e., number of variables) at each step
  • Uses the state-of-the-art software package hMeTiS
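A min-fill elimination-ordering sketch (Python; the pseudo-tree is then obtained by a depth-first traversal of the induced graph built along this order, which is not shown here):

```python
def min_fill_order(adjacency):
    """Min-fill heuristic: repeatedly eliminate the variable whose elimination
    adds the fewest fill edges. 'adjacency' maps each variable to the set of
    its neighbors in the primal graph (sketch)."""
    adj = {v: set(ns) for v, ns in adjacency.items()}
    order = []
    while adj:
        def fill_count(v):
            nbrs = list(adj[v])
            return sum(1 for i, a in enumerate(nbrs)
                         for b in nbrs[i + 1:] if b not in adj[a])
        v = min(adj, key=fill_count)
        nbrs = list(adj[v])
        for i, a in enumerate(nbrs):        # add fill edges among v's neighbors
            for b in nbrs[i + 1:]:
                adj[a].add(b)
                adj[b].add(a)
        for a in nbrs:                      # remove v from the graph
            adj[a].discard(v)
        del adj[v]
        order.append(v)
    return order
```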

101
Quality of pseudo-trees
Bayesian Networks Repository
SPOT5 Benchmarks
102
Empirical Evaluation
  • Tasks
  • Solving binary WCSPs
  • Finding the MPE in belief networks
  • Benchmarks
  • Random networks
  • Resource allocation (SPOT5)
  • Genetic linkage analysis
  • Algorithms
  • s-AOMB(i,j), d-AOMB(i,j), AOMFDAC(j), VEC
  • j is the cache bound
  • Static variable ordering

103
Random Networks
Static heuristics
Dynamic Heuristics
Caching helps for static heuristics with small i-bounds.
Random networks with n=100 (variables), d=3 (domain size), c=90 (CPTs), p=2 (parents per CPT). Time limit 180 seconds.
104
Resource Allocation SPOT5
Time limit 1800 seconds
Cache means j=i; we picked the best-performing i.
The MB(i) heuristic is strong enough to prune, so caching is less relevant.
105
6 people, 3 markers
(Figure: pedigree Bayesian network with locus variables L, phenotype variables X and selector variables S for each person and marker.)
106
Empirical evaluation
averages over 5 runs
107
Pedigree 23
averages over 5 runs
108
Pedigree 30
averages over 5 runs
109
Outline
  • Introduction
  • Optimization tasks for graphical models
  • Solving by inference and search
  • Inference
  • Bucket elimination, dynamic programming
  • Mini-bucket elimination, belief propagation
  • Search
  • Branch and bound and best-first
  • Lower-bounding heuristics
  • AND/OR search spaces
  • Hybrids of search and inference
  • Cutset decomposition
  • Super-bucket scheme