1
Inference
2
Overview
  • The MC-SAT algorithm
  • Knowledge-based model construction
  • Lazy inference
  • Lifted inference

3
MCMC: Gibbs Sampling
state ← random truth assignment
for i ← 1 to num-samples do
    for each variable x
        sample x according to P(x | neighbors(x))
        state ← state with new value of x
P(F) ← fraction of states in which F is true
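A minimal Python rendering of this loop may help; it assumes a hypothetical helper markov_blanket_prob(x, state) returning P(x = True | neighbors(x)) from the ground network, and a predicate holds_F testing the query formula (neither is part of the original slides):

import random

def gibbs_estimate(variables, markov_blanket_prob, holds_F, num_samples):
    # state <- random truth assignment
    state = {x: random.random() < 0.5 for x in variables}
    count_F = 0
    for _ in range(num_samples):
        for x in variables:
            # Sample x according to P(x | neighbors(x)).
            state[x] = random.random() < markov_blanket_prob(x, state)
        if holds_F(state):
            count_F += 1
    # P(F) <- fraction of sampled states in which F is true
    return count_F / num_samples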
4
But Insufficient for Logic
  • Problem: Deterministic dependencies break MCMC; near-deterministic ones make it very slow
  • Solution: Combine MCMC and WalkSAT → the MC-SAT algorithm

5
The MC-SAT Algorithm
  • MC-SAT = MCMC + SAT
  • MCMC: Slice sampling with an auxiliary variable for each clause
  • SAT: Wraps around SampleSAT (a uniform sampler) to sample from highly non-uniform distributions
  • Sound: Satisfies ergodicity and detailed balance
  • Efficient: Orders of magnitude faster than Gibbs and other MCMC algorithms

6
Auxiliary-Variable Methods
  • Main ideas:
  • Use auxiliary variables to capture dependencies
  • Turn difficult sampling into uniform sampling
  • Given a distribution P(x): define a joint f(x, u) whose marginal over u is P(x), sample from f(x, u), then discard u

7
Slice Sampling (Damien et al. 1999)
[Figure: a density P(x); the sampler alternates between drawing u(k) uniformly from [0, P(x(k))] and drawing x(k+1) uniformly from the slice {x : P(x) ≥ u(k)}]
8
Slice Sampling
  • Identifying the slice may be difficult
  • Introduce an auxiliary variable ui for each feature φi (a generic sampler sketch follows)
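For intuition, here is a generic one-dimensional slice sampler in Python, a sketch following the standard stepping-out/shrinkage procedure (the function and its parameters are illustrative, not from the slides); log_p is an unnormalized log-density:

import math
import random

def slice_sample(log_p, x0, num_samples, w=1.0):
    samples, x = [], x0
    for _ in range(num_samples):
        # Auxiliary variable: a uniform height under the density at x.
        log_u = log_p(x) + math.log(random.random())
        # Step out to find an interval that contains the slice.
        lo = x - w * random.random()
        hi = lo + w
        while log_p(lo) > log_u:
            lo -= w
        while log_p(hi) > log_u:
            hi += w
        # Shrink the interval until a point lands inside the slice.
        while True:
            x_new = random.uniform(lo, hi)
            if log_p(x_new) > log_u:
                x = x_new
                break
            if x_new < x:
                lo = x_new
            else:
                hi = x_new
        samples.append(x)
    return samples

For example, slice_sample(lambda x: -0.5 * x * x, 0.0, 1000) draws from a standard normal.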

9
The MC-SAT Algorithm
  • Approximate inference for Markov logic
  • Use slice sampling in MCMC
  • Auxiliary variable ui for each clause Ci:
  • Ci unsatisfied: 0 ≤ ui ≤ 1
    → exp(wi fi(x)) ≥ ui for any next state x
  • Ci satisfied: 0 ≤ ui ≤ exp(wi)
    → with probability 1 − exp(−wi), the next state x must satisfy Ci, to ensure that exp(wi fi(x)) ≥ ui (see the equations below)
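In equation form (consistent with the definitions above): at state $x^{(k)}$,

$$u_i \sim \mathrm{Uniform}\!\left[0,\; e^{\,w_i f_i(x^{(k)})}\right], \qquad \text{and the next state } x \text{ must satisfy } e^{\,w_i f_i(x)} \ge u_i .$$

When $C_i$ is currently satisfied ($f_i(x^{(k)}) = 1$), $u_i$ is uniform on $[0, e^{w_i}]$, so $P(u_i > 1) = 1 - e^{-w_i}$: exactly the probability with which $C_i$ is forced to stay satisfied in the next state.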

10
The MC-SAT Algorithm
  • Select a random subset M of the satisfied clauses:
  • Larger wi → Ci more likely to be selected
  • Hard clause (wi → ∞): always selected
  • Slice = states that satisfy all clauses in M
  • Sample uniformly from these

11
The MC-SAT Algorithm
X(0) ← a random solution satisfying all hard clauses
for k ← 1 to num_samples
    M ← Ø
    forall Ci satisfied by X(k−1)
        with probability 1 − exp(−wi) add Ci to M
    endfor
    X(k) ← a uniformly random solution satisfying M
endfor
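The same procedure as a Python sketch; sample_sat stands in for SampleSAT (assumed to return a near-uniform satisfying assignment), and clauses are assumed to expose a weight and a satisfied_by(state) test (these interfaces are illustrative, not Alchemy's actual API):

import math
import random

def mc_sat(soft_clauses, hard_clauses, sample_sat, num_samples):
    # X(0) <- a random solution satisfying all hard clauses
    state = sample_sat(list(hard_clauses))
    samples = []
    for _ in range(num_samples):
        m = list(hard_clauses)  # hard clauses (wi -> infinity) always selected
        for c in soft_clauses:
            # Select each satisfied clause Ci with probability 1 - exp(-wi).
            if c.satisfied_by(state) and random.random() < 1.0 - math.exp(-c.weight):
                m.append(c)
        # X(k) <- a (near-)uniform sample from the states satisfying M
        state = sample_sat(m)
        samples.append(state)
    return samples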

12
The MC-SAT Algorithm
  • Sound: satisfies ergodicity and detailed balance (assuming we have a perfect uniform sampler)
  • Approximately uniform sampler [Wei et al. 2004]: SampleSAT = WalkSAT + simulated annealing
  • WalkSAT: finds a solution very efficiently, but may be highly non-uniform
  • Sim. anneal.: uniform sampling as temperature → 0, but very slow to reach a solution
  • Trade off uniformity vs. efficiency by tuning the probability of WalkSAT steps vs. SA steps (see the sketch below)
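A sketch of a single SampleSAT move in Python, mixing the two step types; cost_delta and the clause/atom interfaces are illustrative assumptions, not the published implementation:

import math
import random

def cost_delta(state, clauses, atom):
    # Change in the number of unsatisfied clauses if `atom` were flipped.
    def num_unsat():
        return sum(not c.satisfied_by(state) for c in clauses)
    before = num_unsat()
    state[atom] = not state[atom]  # trial flip
    after = num_unsat()
    state[atom] = not state[atom]  # undo
    return after - before

def sample_sat_step(state, clauses, p_walksat=0.5, temperature=0.1):
    if random.random() < p_walksat:
        # WalkSAT step: reaches solutions fast, but samples them non-uniformly.
        unsat = [c for c in clauses if not c.satisfied_by(state)]
        if unsat:
            atom = random.choice(random.choice(unsat).atoms)
            state[atom] = not state[atom]
    else:
        # Simulated-annealing step: near-uniform at low temperature, but slow.
        atom = random.choice(list(state))
        delta = cost_delta(state, clauses, atom)
        if delta <= 0 or random.random() < math.exp(-delta / temperature):
            state[atom] = not state[atom]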

13
Combinatorial Explosion
  • Problem: If there are n constants and the highest clause arity is c, the ground network requires O(n^c) memory (and inference time grows in proportion)
  • Solutions:
  • Knowledge-based model construction
  • Lazy inference
  • Lifted inference

14
Knowledge-Based Model Construction
  • Basic idea: Most of the ground network may be unnecessary, because the evidence renders the query independent of it
  • Assumption: The evidence is a conjunction of ground atoms
  • Knowledge-based model construction (KBMC):
  • First construct the minimum subset of the network needed to answer the query (a generalization of KBMC)
  • Then apply MC-SAT (or another inference algorithm) to that subnetwork

15
Ground Network Construction
network ← Ø
queue ← query nodes
repeat
    node ← front(queue)
    remove node from queue
    add node to network
    if node not in evidence then
        add neighbors(node) to queue
until queue = Ø
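In Python, the construction is a plain breadth-first traversal; neighbors(node) is assumed to yield a node's Markov blanket in the ground network, and a visited check (implicit in the pseudocode above) is made explicit:

from collections import deque

def construct_network(query_nodes, evidence, neighbors):
    network = set()
    queue = deque(query_nodes)
    while queue:
        node = queue.popleft()
        if node in network:
            continue  # already added
        network.add(node)
        # Evidence nodes d-separate the query from the rest of the
        # network, so expansion stops there.
        if node not in evidence:
            queue.extend(neighbors(node))
    return network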
16
Example: Grounding
Ground network nodes: Friends(A,A), Friends(A,B), Friends(B,A), Friends(B,B), Smokes(A), Smokes(B), Cancer(A), Cancer(B)
Query: P(Cancer(B) | Smokes(A), Friends(A,B), Friends(B,A))
[Figure: the ground network is built outward from the query node Cancer(B), one node at a time, stopping at the evidence atoms]
25
Lazy Inference
  • Most domains are extremely sparse:
  • Most ground atoms are false
  • Therefore most clauses are trivially satisfied
  • We can exploit this by:
  • Having a default state for atoms and clauses (see the sketch below)
  • Grounding only those atoms and clauses with non-default states
  • Typically reduces memory (and time) by many orders of magnitude
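A toy Python sketch of the default-state idea (illustrative only): atoms default to False, and memory grows only with the atoms that ever leave the default state:

class LazyAtomStore:
    def __init__(self):
        self.true_atoms = set()  # only non-default (true) atoms are stored

    def value(self, atom):
        # Atoms never touched are implicitly False.
        return atom in self.true_atoms

    def flip(self, atom):
        # Materializes the atom on first flip; toggles it thereafter.
        self.true_atoms.symmetric_difference_update({atom})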

26
Example: Scientific Research
1000 papers, 100 authors
Author(person, paper): 100,000 possible groundings, but only a few thousand are true
Author(person1, paper) ∧ Author(person2, paper) ⇒ Coauthor(person1, person2): 10 million possible groundings, but only tens of thousands are unsatisfied
27
Lazy Inference
  • Here: LazySAT, a lazy version of WalkSAT
  • The method is applicable to many other algorithms (including MC-SAT)

28
Naïve Approach
  • Create the groundings and keep them in memory:
  • True atoms
  • Unsatisfied clauses
  • Memory cost is O(# unsatisfied clauses)
  • Problem:
  • Need to go to the KB for each flip
  • Too slow!
  • Solution idea: Keep more things in memory:
  • A list of active atoms
  • Potentially unsatisfied clauses (active clauses)

29
LazySAT Definitions
  • An atom is an active atom if:
  • It is in the initial set of active atoms, or
  • It was flipped at some point during the search
  • A clause is an active clause if:
  • It can be made unsatisfied by flipping zero or more active atoms in it

30
LazySAT The Basics
  • Activate all the atoms appearing in clauses unsatisfied by the evidence DB
  • Create the corresponding clauses
  • Randomly assign truth values to all active atoms
  • Activate an atom when it is flipped, if it is not already active
  • This potentially activates additional clauses
  • No need to go to the KB to calculate the change in cost for flipping an active atom

31
LazySAT
for i ← 1 to max-tries do
    active_atoms ← atoms in clauses unsatisfied by DB
    active_clauses ← clauses activated by active_atoms
    soln ← random truth assignment to active_atoms
    for j ← 1 to max-flips do
        if Σ weights(sat. clauses) ≥ threshold then
            return soln
        c ← random unsatisfied clause
        with probability p
            vf ← a randomly chosen variable from c
        else
            for each variable v in c do
                compute DeltaGain(v), using weighted KB if v ∉ active_atoms
            vf ← v with highest DeltaGain(v)
        if vf ∉ active_atoms then
            activate vf and add clauses activated by vf
        soln ← soln with vf flipped
return failure, best soln found
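One flip of the inner loop as a Python sketch; delta_gain and weighted_kb.clauses_activated_by are hypothetical stand-ins for the weighted-KB queries named in the pseudocode:

import random

def delta_gain(v, soln, active_clauses):
    # Decrease in unsatisfied weight if v were flipped (computed here from
    # the active set; LazySAT consults the weighted KB for inactive atoms).
    def unsat_weight():
        return sum(c.weight for c in active_clauses if not c.satisfied_by(soln))
    before = unsat_weight()
    soln[v] = not soln.get(v, False)  # trial flip
    after = unsat_weight()
    soln[v] = not soln[v]  # undo
    return before - after

def lazysat_flip(soln, active_atoms, active_clauses, weighted_kb, p):
    unsat = [c for c in active_clauses if not c.satisfied_by(soln)]
    c = random.choice(unsat)  # c <- random unsatisfied clause
    if random.random() < p:
        vf = random.choice(c.atoms)  # random-walk move
    else:
        vf = max(c.atoms, key=lambda v: delta_gain(v, soln, active_clauses))
    if vf not in active_atoms:
        # Activation: the only point where new ground clauses are created.
        active_atoms.add(vf)
        active_clauses.update(weighted_kb.clauses_activated_by(vf))
    soln[vf] = not soln.get(vf, False)  # soln <- soln with vf flipped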
35
Example
Active clauses:
    Coa(C,A) ∧ Coa(A,A) ⇒ Coa(C,A)
    Coa(A,B) ∧ Coa(B,C) ⇒ Coa(A,C)
    Coa(C,A) ∧ Coa(A,B) ⇒ Coa(C,B)
    Coa(C,B) ∧ Coa(B,A) ⇒ Coa(C,A)
    Coa(C,B) ∧ Coa(B,B) ⇒ Coa(C,B)
    . . .
Active atoms:
    Coa(A,A) = False
    Coa(A,B) = False
    Coa(A,C) = False
    . . .
36
Example
After flipping Coa(A,B) to True:
Active clauses:
    Coa(C,A) ∧ Coa(A,A) ⇒ Coa(C,A)
    Coa(A,B) ∧ Coa(B,C) ⇒ Coa(A,C)
    Coa(C,A) ∧ Coa(A,B) ⇒ Coa(C,B)
    Coa(C,B) ∧ Coa(B,A) ⇒ Coa(C,A)
    Coa(C,B) ∧ Coa(B,B) ⇒ Coa(C,B)
    . . .
Active atoms:
    Coa(A,A) = False
    Coa(A,B) = True
    Coa(A,C) = False
    . . .
38
LazySAT Performance
  • Solution quality:
  • Performs the same sequence of flips as WalkSAT
  • Hence gives the same result as WalkSAT
  • Memory cost:
  • O(# potentially unsatisfied clauses)
  • Time cost:
  • Much lower initialization cost
  • Cost of creating active clauses is amortized over many flips

39
Lifted Inference
  • We can do inference in first-order logic without grounding the KB (e.g., resolution)
  • Let's do the same for inference in MLNs:
  • Group atoms and clauses into indistinguishable sets
  • Do inference over those
  • First approach: lifted variable elimination (not practical)
  • Here: lifted belief propagation

40
Belief Propagation
[Figure: bipartite factor graph, with feature nodes (f) passing messages to and from variable nodes (x)]
41
Lifted Belief Propagation
[Figure: the same factor graph with indistinguishable features and nodes merged into superfeatures and supernodes; the lifted messages are ordinary BP messages raised to powers that are functions of the edge counts]
44
Lifted Belief Propagation
  • Form a lifted network composed of supernodes and superfeatures
  • Supernode: set of ground atoms that all send and receive the same messages throughout BP
  • Superfeature: set of ground clauses that all send and receive the same messages throughout BP
  • Run belief propagation on the lifted network (message equations below)
  • Guaranteed to produce the same results as ground BP
  • Time and memory savings can be huge
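Concretely, in the standard lifted BP formulation (Singla & Domingos 2008), the lifted messages are ordinary BP messages raised to powers given by edge counts, where $n(h, x)$ is the number of ground clauses of superfeature $h$ that contain a given atom of supernode $x$:

$$\mu_{x \to f}(x) \;=\; \mu_{f \to x}(x)^{\,n(f,x)-1} \prod_{h \in nb(x) \setminus \{f\}} \mu_{h \to x}(x)^{\,n(h,x)}$$

$$\mu_{f \to x}(x) \;=\; \sum_{\sim \{x\}} \Big( e^{\,w_f f(\mathbf{x})} \prod_{y \in nb(f) \setminus \{x\}} \mu_{y \to f}(y) \Big)$$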

45
Forming the Lifted Network
  • 1. Form initial supernodes: one per predicate and truth value (true, false, unknown)
  • 2. Form superfeatures by doing joins of their supernodes
  • 3. Form supernodes by projecting superfeatures down to their predicates: a supernode is a set of groundings of a predicate with the same number of projections from each superfeature
  • 4. Repeat until convergence (a sketch follows below)
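A compact Python sketch of this fixed-point computation, under an illustrative data model (atoms carry a .predicate field, ground clauses a .atoms list; this is not the actual published implementation):

from collections import defaultdict

def form_lifted_network(atoms, evidence, ground_clauses):
    # 1. Initial supernodes: one per predicate and truth value.
    signature = {a: (a.predicate, str(evidence.get(a, "unknown")))
                 for a in atoms}
    while True:
        # 2. Superfeatures: ground clauses grouped by the supernodes they join.
        feature_sig = {c: tuple(sorted(signature[a] for a in c.atoms))
                       for c in ground_clauses}
        # 3. Project down: count, for each atom, its ground clauses
        #    in each superfeature.
        counts = defaultdict(lambda: defaultdict(int))
        for c in ground_clauses:
            for a in c.atoms:
                counts[a][feature_sig[c]] += 1
        new_signature = {a: (signature[a], tuple(sorted(counts[a].items())))
                         for a in atoms}
        # 4. Repeat until the grouping stops splitting.
        if len(set(new_signature.values())) == len(set(signature.values())):
            return signature, feature_sig
        signature = new_signature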

46
Example
Evidence: Smokes(Ana), Friends(Bob, Charles), Friends(Charles, Bob)
N people in the domain
47
Example
Evidence: Smokes(Ana), Friends(Bob, Charles), Friends(Charles, Bob)
Intuitive grouping:
    {Smokes(Ana)}
    {Smokes(Bob), Smokes(Charles)}
    {Smokes(James), Smokes(Harry), ...}
48
Initialization
Supernodes:
    {Smokes(Ana)}
    {Smokes(X)}, X ≠ Ana
    {Friends(Bob, Charles), Friends(Charles, Bob)}
    {Friends(Ana, X), Friends(X, Ana), Friends(Bob, X) with X ≠ Charles, ...}
Superfeatures: none yet
49
Joining the Supernodes
Supernodes:
    {Smokes(Ana)}
    {Smokes(X)}, X ≠ Ana
    {Friends(Bob, Charles), Friends(Charles, Bob)}
    {Friends(Ana, X), Friends(X, Ana), Friends(Bob, X) with X ≠ Charles, ...}
Superfeatures (formed by joining the supernodes, one join at a time in the original slides):
    Smokes(Ana) ∧ Friends(Ana, X) ⇒ Smokes(X), X ≠ Ana
    Smokes(X) ∧ Friends(X, Ana) ⇒ Smokes(Ana), X ≠ Ana
    Smokes(Bob) ∧ Friends(Bob, Charles) ⇒ Smokes(Charles)
    Smokes(Bob) ∧ Friends(Bob, X) ⇒ Smokes(X), X ≠ Charles
55
Projecting the Superfeatures
Superfeatures:
    Smokes(Ana) ∧ Friends(Ana, X) ⇒ Smokes(X), X ≠ Ana
    Smokes(X) ∧ Friends(X, Ana) ⇒ Smokes(Ana), X ≠ Ana
    Smokes(Bob) ∧ Friends(Bob, Charles) ⇒ Smokes(Charles)
    Smokes(Bob) ∧ Friends(Bob, X) ⇒ Smokes(X), X ≠ Charles
Populate each ground atom with its projection counts from each superfeature; groundings with the same counts form the new supernodes:
    {Smokes(Ana)}
    {Smokes(Bob), Smokes(Charles)}
    {Smokes(X)}, X ∉ {Ana, Bob, Charles}
64
Theorem
  • There exists a unique minimal lifted network
  • The lifted network construction algorithm finds it
  • BP on the lifted network gives the same results as on the ground network

65
Representing Supernodes and Superfeatures
  • List of tuples: simple but inefficient
  • Resolution-like: use equality and inequality
  • Form clusters (work in progress)

66
Open Questions
  • Can we do approximate KBMC/lazy/lifting?
  • Can KBMC, lazy and lifted inference be combined?
  • Can we have lifted inference over both
    probabilistic and deterministic dependencies?
    (Lifted MC-SAT?)
  • Can we unify resolution and lifted BP?
  • Can other inference algorithms be lifted?