Title: Inference
1. Inference

2. Overview
- The MC-SAT algorithm
- Knowledge-based model construction
- Lazy inference
- Lifted inference
3MCMC Gibbs Sampling
state ? random truth assignment for i ? 1 to
num-samples do for each variable x
sample x according to P(xneighbors(x))
state ? state with new value of x P(F) ? fraction
of states in which F is true
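The loop above can be sketched concretely. This is a minimal, self-contained illustration over a toy weighted-clause model with two atoms (the clause encoding and all helper names are hypothetical, not from the slides); P(x) ∝ exp(Σᵢ wᵢ fᵢ(x)), and each Gibbs step resamples one variable from its conditional.

```python
import math
import random

random.seed(0)

# Each clause is (weight, literals); a literal (var, sign) holds when
# state[var] == sign. A clause is satisfied if any literal holds.
clauses = [
    (1.5, [("Smokes", False), ("Cancer", True)]),  # Smokes => Cancer
    (0.5, [("Smokes", True)]),
]
variables = ["Smokes", "Cancer"]

def total_weight(state):
    """Sum of weights of satisfied clauses (the exponent of P, up to Z)."""
    return sum(w for w, lits in clauses
               if any(state[v] == sign for v, sign in lits))

def gibbs(num_samples, query="Cancer"):
    state = {v: random.random() < 0.5 for v in variables}
    hits = 0
    for _ in range(num_samples):
        for v in variables:
            # Conditional P(v = True | rest) from the two candidate states.
            state[v] = True
            wt = total_weight(state)
            state[v] = False
            wf = total_weight(state)
            p_true = math.exp(wt) / (math.exp(wt) + math.exp(wf))
            state[v] = random.random() < p_true
        hits += state[query]
    return hits / num_samples  # fraction of samples in which query is true

print(gibbs(2000))
```

With these weights the exact marginal P(Cancer) is about 0.66, and the estimate should land nearby.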
4. But Insufficient for Logic
- Problem: Deterministic dependencies break MCMC; near-deterministic ones make it very slow
- Solution: Combine MCMC and WalkSAT → the MC-SAT algorithm
5. The MC-SAT Algorithm
- MC-SAT = MCMC + SAT
- MCMC: Slice sampling with an auxiliary variable for each clause
- SAT: Wraps around SampleSAT (a uniform sampler) to sample from highly non-uniform distributions
- Sound: Satisfies ergodicity and detailed balance
- Efficient: Orders of magnitude faster than Gibbs and other MCMC algorithms
6. Auxiliary-Variable Methods
- Main ideas:
  - Use auxiliary variables to capture dependencies
  - Turn difficult sampling into uniform sampling
- Given distribution P(x):
  - Sample from f(x, u), then discard u
7. Slice Sampling [Damien et al. 1999]

[Figure: a 1-D density P(x); the auxiliary variable u(k) is drawn uniformly from [0, P(x(k))], defining a horizontal "slice" {x : P(x) ≥ u(k)}, and x(k+1) is drawn uniformly from that slice]
8. Slice Sampling
- Identifying the slice may be difficult
- Introduce an auxiliary variable uᵢ for each φᵢ
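The auxiliary-variable scheme can be sketched for a 1-D density. This toy avoids the slice-identification difficulty by assuming a known bounded support and rejection-sampling uniformly inside it (the function names and bounds are illustrative, not from the slides):

```python
import math
import random

random.seed(0)

def p(x):
    """Unnormalized density: a standard normal bump."""
    return math.exp(-0.5 * x * x)

def slice_sample(num_samples, lo=-5.0, hi=5.0):
    x = 0.0
    out = []
    for _ in range(num_samples):
        # Auxiliary variable: the slice height, uniform under the curve.
        u = random.uniform(0.0, p(x))
        # Sample uniformly from the slice {x : p(x) >= u} by rejection
        # inside the bounding box [lo, hi] (assumes bounded support).
        while True:
            x_new = random.uniform(lo, hi)
            if p(x_new) >= u:
                x = x_new
                break
        out.append(x)
    return out

xs = slice_sample(5000)
print(sum(xs) / len(xs))  # sample mean, close to 0 for this density
```

Discarding u leaves samples distributed according to p(x), which is the point of the auxiliary-variable construction.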
9. The MC-SAT Algorithm
- Approximate inference for Markov logic
- Use slice sampling in MCMC
- Auxiliary var. uᵢ for each clause Cᵢ:
  - Cᵢ unsatisfied: 0 ≤ uᵢ ≤ 1
    → exp(wᵢ fᵢ(x)) ≥ uᵢ for any next state x
  - Cᵢ satisfied: 0 ≤ uᵢ ≤ exp(wᵢ)
    → With prob. 1 − exp(−wᵢ), next state x must satisfy Cᵢ, to ensure that exp(wᵢ fᵢ(x)) ≥ uᵢ
10. The MC-SAT Algorithm
- Select random subset M of satisfied clauses
  - Larger wᵢ ⇒ Cᵢ more likely to be selected
  - Hard clause (wᵢ → ∞): Always selected
- Slice = States that satisfy all clauses in M
- Sample uniformly from these
11. The MC-SAT Algorithm

X(0) ← a random solution satisfying all hard clauses
for k ← 1 to num_samples
    M ← Ø
    forall Cᵢ satisfied by X(k−1)
        with prob. 1 − exp(−wᵢ) add Cᵢ to M
    endfor
    X(k) ← a uniformly random solution satisfying M
endfor
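The loop above can be sketched for a tiny model. For tractability this toy replaces SampleSAT with exhaustive enumeration of the satisfying states, which is a perfect uniform sampler on two atoms (the clause encoding and helper names are hypothetical):

```python
import math
import random
from itertools import product

random.seed(0)

atoms = ["A", "B"]
# (weight, literals); a literal (var, sign) holds when state[var] == sign.
clauses = [(2.0, [("A", False), ("B", True)]),  # A => B
           (1.0, [("A", True)])]

def satisfied(lits, state):
    return any(state[v] == sign for v, sign in lits)

def mc_sat(num_samples):
    all_states = [dict(zip(atoms, bits))
                  for bits in product([False, True], repeat=len(atoms))]
    x = random.choice(all_states)  # no hard clauses in this toy model
    b_count = 0
    for _ in range(num_samples):
        # M: keep each clause satisfied by x with prob. 1 - exp(-w_i).
        M = [lits for w, lits in clauses
             if satisfied(lits, x) and random.random() < 1 - math.exp(-w)]
        # Uniform sample from the slice: states satisfying all clauses in M
        # (never empty, since x itself satisfies M).
        candidates = [s for s in all_states
                      if all(satisfied(lits, s) for lits in M)]
        x = random.choice(candidates)
        b_count += x["B"]
    return b_count / num_samples

print(mc_sat(5000))
```

For these weights the exact marginal P(B) is about 0.73; the MC-SAT estimate should be close, illustrating that the chain mixes over the weighted distribution even though each step only does uniform sampling.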
12. The MC-SAT Algorithm
- Sound: Satisfies ergodicity and detailed balance
  (assuming we have a perfect uniform sampler)
- Approximately uniform sampler [Wei et al. 2004]
  - SampleSAT = WalkSAT + simulated annealing
  - WalkSAT: Finds a solution very efficiently, but may be highly non-uniform
  - Sim. anneal.: Uniform sampling as temperature → 0, but very slow to reach a solution
- Trade off uniformity vs. efficiency by tuning the prob. of WalkSAT steps vs. annealing steps
13. Combinatorial Explosion
- Problem: If there are n constants and the highest clause arity is c, the ground network requires O(n^c) memory (and inference time grows in proportion)
- Solutions:
  - Knowledge-based model construction
  - Lazy inference
  - Lifted inference
14. Knowledge-Based Model Construction
- Basic idea: Most of the ground network may be unnecessary, because the evidence renders the query independent of it
- Assumption: Evidence is a conjunction of ground atoms
- Knowledge-based model construction (KBMC):
  - First construct the minimum subset of the network needed to answer the query (generalization of KBMC)
  - Then apply MC-SAT (or another inference algorithm)
15. Ground Network Construction

network ← Ø
queue ← query nodes
repeat
    node ← front(queue)
    remove node from queue
    add node to network
    if node not in evidence then
        add neighbors(node) to queue
until queue = Ø
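The construction above is a breadth-first search from the query nodes that stops expanding at evidence. A minimal sketch, assuming the Markov blanket is given as a hypothetical adjacency dict:

```python
from collections import deque

def construct_network(query_nodes, evidence, neighbors):
    """BFS from the queries; evidence nodes are added but not expanded."""
    network = set()
    queue = deque(query_nodes)
    while queue:
        node = queue.popleft()
        if node in network:
            continue  # already added via another path
        network.add(node)
        if node not in evidence:  # evidence separates the query from the rest
            queue.extend(n for n in neighbors.get(node, ())
                         if n not in network)
    return network

# Usage: query Cancer(B); Smokes(A) is evidence, so Cancer(A) is never reached.
adj = {"Cancer(B)": ["Smokes(B)"],
       "Smokes(B)": ["Cancer(B)", "Smokes(A)"],
       "Smokes(A)": ["Cancer(A)", "Smokes(B)"]}
net = construct_network(["Cancer(B)"], {"Smokes(A)"}, adj)
print(sorted(net))  # → ['Cancer(B)', 'Smokes(A)', 'Smokes(B)']
```

Note that evidence atoms stay in the network (they are needed for conditioning); only their unexplored neighbors are pruned.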
16. Example: Grounding

P( Cancer(B) | Smokes(A), Friends(A,B), Friends(B,A) )

[Figure: ground network over Friends(A,A), Friends(A,B), Friends(B,A), Friends(B,B), Smokes(A), Smokes(B), Cancer(A), Cancer(B); the original slides animate construction of the subnetwork needed to answer the query]
25. Lazy Inference
- Most domains are extremely sparse
  - Most ground atoms are false
  - Therefore most clauses are trivially satisfied
- We can exploit this by:
  - Having a default state for atoms and clauses
  - Grounding only those atoms and clauses with non-default states
- Typically reduces memory (and time) by many orders of magnitude
26. Example: Scientific Research

1000 papers, 100 authors

Author(person, paper)
  100,000 possible groundings, but only a few thousand are true

Author(person1, paper) ∧ Author(person2, paper) ⇒ Coauthor(person1, person2)
  10 million possible groundings, but only tens of thousands are unsatisfied
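The grounding counts above follow directly from the predicate arities; a quick check:

```python
people, papers = 100, 1000

# Author(person, paper): one grounding per (person, paper) pair.
author_groundings = people * papers

# Author(p1, paper) ∧ Author(p2, paper) ⇒ Coauthor(p1, p2):
# one grounding per (person1, person2, paper) triple.
coauthor_rule_groundings = people * people * papers

print(author_groundings)         # → 100000
print(coauthor_rule_groundings)  # → 10000000
```

This is the O(n^c) blow-up from the Combinatorial Explosion slide with c = 2 and c = 3 respectively, and it is exactly what lazy inference avoids materializing.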
27. Lazy Inference
- Here: LazySAT (a lazy version of WalkSAT)
- The method is applicable to many other algorithms (including MC-SAT)
28. Naïve Approach
- Create the groundings and keep in memory:
  - True atoms
  - Unsatisfied clauses
- Memory cost is O(# unsatisfied clauses)
- Problem:
  - Need to go to the KB for each flip
  - Too slow!
- Solution idea: Keep more things in memory
  - A list of active atoms
  - Potentially unsatisfied clauses (active clauses)
29. LazySAT: Definitions
- An atom is an active atom if:
  - It is in the initial set of active atoms, or
  - It was flipped at some point during the search
- A clause is an active clause if:
  - It can be made unsatisfied by flipping zero or more active atoms in it
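The active-clause definition has a direct operational reading: with non-active atoms fixed at the default value, a clause can be made unsatisfied iff every literal over a non-active atom is already false. A sketch of that test (the clause encoding is hypothetical; a literal (atom, sign) holds when the atom's value equals sign, and the default value is False):

```python
def is_active(clause, active_atoms, default=False):
    """True iff the clause can be made unsatisfied by flipping only
    active atoms: every literal over a fixed (non-active) atom must
    already be false under the default assignment."""
    return all(sign != default
               for atom, sign in clause
               if atom not in active_atoms)

# Coa(A,B) ∧ Coa(B,C) ⇒ Coa(A,C)  ≡  ¬Coa(A,B) ∨ ¬Coa(B,C) ∨ Coa(A,C)
clause = [("Coa(A,B)", False), ("Coa(B,C)", False), ("Coa(A,C)", True)]

# With no active atoms, the negative literals hold by default,
# so the clause is trivially satisfied and stays inactive.
print(is_active(clause, set()))                       # → False
# Once the two body atoms are active, they can be flipped to True,
# leaving all literals false — the clause becomes active.
print(is_active(clause, {"Coa(A,B)", "Coa(B,C)"}))    # → True
```

This is why sparsity helps: with a false default, almost all clauses keep a true negative literal over a fixed atom and never need to be grounded.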
30. LazySAT: The Basics
- Activate all the atoms appearing in clauses unsatisfied by the evidence DB
- Create the corresponding clauses
- Randomly assign truth values to all active atoms
- Activate an atom when it is flipped, if not already active
  - Potentially activate additional clauses
- No need to go to the KB to calculate the change in cost for flipping an active atom
31. LazySAT

for i ← 1 to max-tries do
    active_atoms ← atoms in clauses unsatisfied by DB
    active_clauses ← clauses activated by active_atoms
    soln ← random truth assignment to active_atoms
    for j ← 1 to max-flips do
        if Σ weights(sat. clauses) ≥ threshold then
            return soln
        c ← random unsatisfied clause
        with probability p
            vf ← a randomly chosen variable from c
        else
            for each variable v in c do
                compute DeltaGain(v), using weighted_KB if v ∉ active_atoms
            vf ← v with highest DeltaGain(v)
        if vf ∉ active_atoms then
            activate vf and add clauses activated by vf
        soln ← soln with vf flipped
return failure, best soln found
35. Example

Clauses (groundings of Coa(x,y) ∧ Coa(y,z) ⇒ Coa(x,z)):
  Coa(C,A) ∧ Coa(A,A) ⇒ Coa(C,A)
  Coa(A,B) ∧ Coa(B,C) ⇒ Coa(A,C)
  Coa(C,A) ∧ Coa(A,B) ⇒ Coa(C,B)
  Coa(C,B) ∧ Coa(B,A) ⇒ Coa(C,A)
  Coa(C,B) ∧ Coa(B,B) ⇒ Coa(C,B)
  ...

Atoms: Coa(A,A) = False, Coa(A,B) = False, Coa(A,C) = False, ...
36. Example (Coa(A,B) flipped to True)

Same clauses as above.

Atoms: Coa(A,A) = False, Coa(A,B) = True, Coa(A,C) = False, ...
38. LazySAT: Performance
- Solution quality:
  - Performs the same sequence of flips as WalkSAT
  - Same result as WalkSAT
- Memory cost:
  - O(# potentially unsatisfied clauses)
- Time cost:
  - Much lower initialization cost
  - Cost of creating active clauses is amortized over many flips
39. Lifted Inference
- We can do inference in first-order logic without grounding the KB (e.g., resolution)
- Let's do the same for inference in MLNs:
  - Group atoms and clauses into indistinguishable sets
  - Do inference over those
- First approach: Lifted variable elimination (not practical)
- Here: Lifted belief propagation
40. Belief Propagation

[Figure: factor graph with features (f) on one side and nodes (x) on the other; BP passes messages along the edges]

41–43. Lifted Belief Propagation

[Figures: the same factor graph with indistinguishable nodes and features progressively merged; the merged messages carry weights that are functions of the edge counts]
44. Lifted Belief Propagation
- Form a lifted network composed of supernodes and superfeatures
  - Supernode: Set of ground atoms that all send and receive the same messages throughout BP
  - Superfeature: Set of ground clauses that all send and receive the same messages throughout BP
- Run belief propagation on the lifted network
- Guaranteed to produce the same results as ground BP
- Time and memory savings can be huge
45. Forming the Lifted Network
1. Form initial supernodes: one per predicate and truth value (true, false, unknown)
2. Form superfeatures by doing joins of their supernodes
3. Form supernodes by projecting superfeatures down to their predicates
   - Supernode: Groundings of a predicate with the same number of projections from each superfeature
4. Repeat until convergence
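The join/project fixpoint above can be sketched as iterated signature refinement: atoms start grouped by evidence value, then are repeatedly regrouped by the multiset of groups they interact with, until the grouping stops changing. This toy (all names hypothetical, mirroring the slides' Smokes/Friends example) groups Smokes atoms by a simplified neighbor structure rather than real clause joins:

```python
def lift(atoms, evidence, neighbors):
    """Group atoms into supernodes by refining signatures to a fixpoint."""
    # color[a] identifies a's current group; start with one group
    # per evidence value (true / false / unknown).
    color = {a: str(evidence.get(a, "unknown")) for a in atoms}
    while True:
        # New signature: own group plus sorted groups of interaction partners.
        sig = {a: (color[a], tuple(sorted(color[n] for n in neighbors[a])))
               for a in atoms}
        new = {a: str(sig[a]) for a in atoms}
        # Stop when the refinement induces the same partition as before.
        stable = (len(set(new.values())) == len(set(color.values())) and
                  all((new[a] == new[b]) == (color[a] == color[b])
                      for a in atoms for b in atoms))
        if stable:
            return color
        color = new

atoms = ["S(Ana)", "S(Bob)", "S(Charles)", "S(James)"]
evidence = {"S(Ana)": True}
# Simplified interaction structure: Bob and Charles are known friends;
# every atom interacts with Ana's atom.
neighbors = {"S(Ana)": ["S(Bob)", "S(Charles)", "S(James)"],
             "S(Bob)": ["S(Ana)", "S(Charles)"],
             "S(Charles)": ["S(Ana)", "S(Bob)"],
             "S(James)": ["S(Ana)"]}
groups = lift(atoms, evidence, neighbors)
print(groups["S(Bob)"] == groups["S(Charles)"])  # → True: one supernode
print(groups["S(Bob)"] == groups["S(James)"])    # → False: split apart
```

The fixpoint matches the intuitive grouping in the slides that follow: Smokes(Ana) alone, Smokes(Bob) with Smokes(Charles), and the evidence-free rest together.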
46. Example

Evidence: Smokes(Ana), Friends(Bob,Charles), Friends(Charles,Bob)
N people in the domain

47. Example

Evidence: Smokes(Ana), Friends(Bob,Charles), Friends(Charles,Bob)

Intuitive grouping:
  {Smokes(Ana)}
  {Smokes(Bob), Smokes(Charles)}
  {Smokes(James), Smokes(Harry), ...}
48. Initialization

Supernodes:
  {Smokes(Ana)}
  {Smokes(X)}, X ≠ Ana
  {Friends(Bob,Charles), Friends(Charles,Bob)}
  {Friends(Ana,X), Friends(X,Ana), Friends(Bob,X) with X ≠ Charles, ...}

Superfeatures: (none yet)
49–54. Joining the Supernodes

Superfeatures (formed by joining the supernodes above, one per animation step in the original slides):
  Smokes(Ana) ∧ Friends(Ana,X) ⇒ Smokes(X), X ≠ Ana
  Smokes(X) ∧ Friends(X,Ana) ⇒ Smokes(Ana), X ≠ Ana
  Smokes(Bob) ∧ Friends(Bob,Charles) ⇒ Smokes(Charles)
  Smokes(Bob) ∧ Friends(Bob,X) ⇒ Smokes(X), X ≠ Charles

Supernodes (unchanged from initialization):
  {Smokes(Ana)}
  {Smokes(X)}, X ≠ Ana
  {Friends(Bob,Charles), Friends(Charles,Bob)}
  {Friends(Ana,X), Friends(X,Ana), Friends(Bob,X) with X ≠ Charles, ...}
55–63. Projecting the Superfeatures

Populate each supernode with its projection counts from each superfeature; atoms with the same counts stay grouped, the rest split.

Superfeatures:
  Smokes(Ana) ∧ Friends(Ana,X) ⇒ Smokes(X), X ≠ Ana
  Smokes(X) ∧ Friends(X,Ana) ⇒ Smokes(Ana), X ≠ Ana
  Smokes(Bob) ∧ Friends(Bob,Charles) ⇒ Smokes(Charles)
  Smokes(Bob) ∧ Friends(Bob,X) ⇒ Smokes(X), X ≠ Charles

Resulting Smokes supernodes:
  {Smokes(Ana)}
  {Smokes(Bob), Smokes(Charles)}
  {Smokes(X)}, X ≠ Ana, Bob, Charles
64. Theorem
- There exists a unique minimal lifted network
- The lifted network construction algorithm finds it
- BP on the lifted network gives the same result as on the ground network
65. Representing Supernodes and Superfeatures
- List of tuples: Simple but inefficient
- Resolution-like: Use equality and inequality
- Form clusters (in progress)
66. Open Questions
- Can we do approximate KBMC / lazy / lifted inference?
- Can KBMC, lazy, and lifted inference be combined?
- Can we have lifted inference over both probabilistic and deterministic dependencies? (Lifted MC-SAT?)
- Can we unify resolution and lifted BP?
- Can other inference algorithms be lifted?