1
Memory-Efficient Inference in Relational Domains
  • Parag Singla, Pedro Domingos
  • Computer Science & Engineering
  • University of Washington

2
Outline
  • Motivation
  • Satisfiability and Relational Inference
  • Memory-Efficient Inference
  • Experiments
  • Discussion

3
Motivation
  • Most problems in AI are relational
  • Multiple types of objects
  • Relations between objects
  • An efficient inference approach:
    propositionalization followed by satisfiability
    testing [Kautz & Selman 96]
  • MPE/MAP inference in a large class of SRL models
    can be done using weighted satisfiability
    [Richardson & Domingos 06]

4
Problem
  • Exponential cost of propositionalization
  • Memory cost is O((# objects)^clause-arity)
  • Applicability is severely limited
  • Even moderately sized domains tend to blow out of
    memory

5
Example: Scientific Research Domain
1000 Papers, 100 Authors
Author(person,paper)
100,000 possible groundings ... but only a few thousand true
Author(person1,paper) ∧ Author(person2,paper) ⇒ Coauthor(person1,person2)
10 million possible groundings ... but only tens of thousands unsatisfied
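
The grounding counts on this slide follow from multiplying the domain sizes of the variables involved. A minimal sketch of that arithmetic, where the helper name num_groundings is a hypothetical label for illustration and not part of Alchemy:

```python
import math

def num_groundings(domain_sizes):
    """Number of groundings of a predicate or clause: the product of the
    domain sizes of its distinct variables (hypothetical helper)."""
    return math.prod(domain_sizes)

n_persons, n_papers = 100, 1000

# Author(person, paper): one variable per argument.
author_groundings = num_groundings([n_persons, n_papers])

# Author(person1,paper) ∧ Author(person2,paper) ⇒ Coauthor(person1,person2):
# three distinct variables (person1, person2, paper).
coauthor_rule_groundings = num_groundings([n_persons, n_persons, n_papers])

print(author_groundings)         # 100000
print(coauthor_rule_groundings)  # 10000000
```

The second count grows with the square of the number of persons, which is the O((# objects)^clause-arity) behaviour named on the Problem slide.
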
6
Solution
  • Most real-world domains characterized by extreme
    sparseness
  • Majority of predicates are false
  • Majority of clauses are satisfied
  • Exploit sparseness
  • Embodied in lazy variation of WalkSAT
  • Create only potentially unsatisfied clause
    groundings
  • LazySAT
  • Memory cost is O(# potentially unsatisfied
    clauses)

7
Outline
  • Motivation
  • Satisfiability and Relational Inference
  • Memory-Efficient Inference
  • Experiments
  • Discussion

8
Satisfiability
  • KB: Set of formulas over Boolean variables
  • Every KB can be converted to CNF
  • (X1 ∨ X5 ∨ X7) ∧ … ∧ (X12 ∨ X53 ∨ X5) ∧ …
  • SAT: Problem of finding a satisfying assignment
  • Prototypical NP-complete problem
  • Weighted SAT
  • Each clause given a weight
  • Maximize sum of weights of satisfied clauses
  • One of the most efficient approaches is stochastic
    local search (e.g., WalkSAT [Selman et al. 96])
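
The weighted SAT objective can be sketched in a few lines. The clause encoding below (a weight plus a list of (variable, is_positive) literals), the example weights, and the function name satisfied_weight are all assumptions made for illustration. The brute-force search is only there to show what "maximize" means on a toy instance; it is not how a practical solver works.

```python
from itertools import product

def satisfied_weight(clauses, assignment):
    """Sum of the weights of clauses with at least one true literal.
    clauses: list of (weight, [(variable, is_positive), ...])."""
    return sum(
        weight
        for weight, literals in clauses
        if any(assignment[var] == positive for var, positive in literals)
    )

# Toy weighted CNF (made-up weights): (X1 ∨ ¬X5), (X5), (¬X1)
toy_clauses = [
    (2.0, [("X1", True), ("X5", False)]),
    (1.5, [("X5", True)]),
    (0.5, [("X1", False)]),
]

# Exhaustive search over all 2^n assignments; feasible only for tiny n.
variables = ["X1", "X5"]
best_weight, best_assignment = max(
    (
        (satisfied_weight(toy_clauses, dict(zip(variables, values))), values)
        for values in product([False, True], repeat=len(variables))
    ),
    key=lambda pair: pair[0],
)
print(best_weight, dict(zip(variables, best_assignment)))
```

No assignment satisfies all three toy clauses at once, so the best achievable weight is below the total of 4.0; this is the situation weighted SAT is designed for.
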

9
(Max)WalkSAT
for i ← 1 to max-tries do
    soln ← random truth assignment to all atoms      // initialization
    for j ← 1 to max-flips do
        if Σ weights(sat. clauses) > threshold then
            return soln
        c ← random unsatisfied clause
        with probability p                            // random move
            vf ← a randomly chosen variable from c
        else                                          // greedy move
            for each variable v in c do
                compute DeltaGain(v)
            vf ← v with highest DeltaGain(v)
        soln ← soln with vf flipped
return failure, best solution found

// DeltaGain(v) returns the increase in Σ weights(sat. clauses)
// caused by flipping v
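
A runnable sketch of the MaxWalkSAT loop on this slide, under stated assumptions: clauses are (weight, literals) pairs with (variable, is_positive) literals, DeltaGain is recomputed from scratch on every call (real implementations keep incremental counts), and the termination test uses >= rather than the slide's > so that a threshold equal to the target weight stops the search. All function and parameter names are illustrative.

```python
import random

def satisfied(literals, soln):
    return any(soln[var] == positive for var, positive in literals)

def total_weight(clauses, soln):
    return sum(w for w, lits in clauses if satisfied(lits, soln))

def delta_gain(clauses, soln, var):
    """Increase in satisfied weight caused by flipping var (naive recompute)."""
    before = total_weight(clauses, soln)
    soln[var] = not soln[var]
    after = total_weight(clauses, soln)
    soln[var] = not soln[var]  # restore the original value
    return after - before

def max_walksat(clauses, threshold, max_tries=10, max_flips=1000, p=0.5, seed=0):
    """Returns (found, assignment, weight); found is False on failure,
    in which case the best assignment seen is returned."""
    rng = random.Random(seed)
    variables = sorted({var for _, lits in clauses for var, _ in lits})
    best_soln, best_weight = None, float("-inf")
    for _ in range(max_tries):
        # initialization: random truth assignment to all atoms
        soln = {var: rng.random() < 0.5 for var in variables}
        for _ in range(max_flips + 1):  # +1 so the last flip is also checked
            weight = total_weight(clauses, soln)
            if weight > best_weight:
                best_soln, best_weight = dict(soln), weight
            if weight >= threshold:  # assumption: >= instead of the slide's >
                return True, dict(soln), weight
            unsat = [lits for _, lits in clauses if not satisfied(lits, soln)]
            if not unsat:
                break  # everything satisfied but threshold unreachable
            c = rng.choice(unsat)
            if rng.random() < p:
                vf = rng.choice(c)[0]  # random move
            else:
                # greedy move: variable in c with the highest DeltaGain
                vf = max((var for var, _ in c),
                         key=lambda var: delta_gain(clauses, soln, var))
            soln[vf] = not soln[vf]
    return False, best_soln, best_weight

# Toy weighted CNF (made-up weights): (X1 ∨ ¬X5), (X5), (¬X1)
demo_clauses = [
    (2.0, [("X1", True), ("X5", False)]),
    (1.5, [("X5", True)]),
    (0.5, [("X1", False)]),
]
found, demo_soln, demo_weight = max_walksat(demo_clauses, threshold=3.5)
print(found, demo_soln, demo_weight)
```

The toy instance has a best achievable weight of 3.5 (X1 and X5 both true), so that value is used as the threshold.
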
10
Relational Inference
  • First-Order Logic (FOL) explicitly represents a
    domain's relational structure
  • Focus on function-free finite FOL
  • Propositionalize the first-order KB
  • Replace universal quantification by conjunction
  • Replace existential quantification by disjunction
  • Perform satisfiability testing over the
    propositionalized KB
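
Propositionalizing a universally quantified clause over finite domains amounts to enumerating every binding of its variables; the resulting ground clauses are read as a conjunction. A minimal sketch, assuming a clause is a list of (predicate, argument_variables, is_positive) literals and that each variable name maps directly to its domain. Existential quantifiers are not handled here, and ground_clause is a hypothetical name.

```python
from itertools import product

def ground_clause(literals, domains):
    """Yield every grounding of a universally quantified clause.
    literals: [(predicate, (var, ...), is_positive), ...]
    domains:  {var: [constant, ...]}"""
    variables = sorted({var for _, args, _ in literals for var in args})
    for values in product(*(domains[var] for var in variables)):
        binding = dict(zip(variables, values))
        yield [
            ((pred,) + tuple(binding[a] for a in args), positive)
            for pred, args, positive in literals
        ]

# ¬Author(person1,paper) ∨ ¬Author(person2,paper) ∨ Coauthor(person1,person2)
coauthor_rule = [
    ("Author", ("person1", "paper"), False),
    ("Author", ("person2", "paper"), False),
    ("Coauthor", ("person1", "person2"), True),
]
people = ["Ann", "Bob"]
small_domains = {"person1": people, "person2": people, "paper": ["P1"]}

ground_clauses = list(ground_clause(coauthor_rule, small_domains))
print(len(ground_clauses))  # 2 * 2 * 1 = 4
```

Every grounding is materialized regardless of whether it could ever be unsatisfied, which is the memory cost the rest of the talk sets out to avoid.
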

11
Statistical Relational Learning
  • Statistical relational models explicitly deal
    with:
  • Relational structure
  • Probabilistic dependencies
  • Markov Logic [Richardson & Domingos 06]
  • Weighted first-order formulas
  • Subsumes many other SRL models
  • MAP/MPE inference in Markov Logic is an instance
    of weighted satisfiability

12
Outline
  • Motivation
  • Satisfiability and Relational Inference
  • Memory-Efficient Inference
  • Experiments
  • Discussion

13
Naïve Approach
  • Create the groundings and keep in memory
  • True atoms
  • Unsatisfied clauses
  • Memory cost is O(# unsatisfied clauses)
  • Problem
  • Need to go to the KB for each flip
  • Too slow!
  • Solution idea: Keep more things in memory
  • A list of active atoms
  • Potentially unsatisfied clauses (active clauses)

14
LazySAT: Definitions
  • An atom is an Active Atom if
  • It is in the initial set of active atoms, or
  • It was flipped at some point during the search
  • A clause is an Active Clause if
  • It can be made unsatisfied by flipping zero or
    more active atoms in it
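
One way to read the Active Clause definition: a clause is active exactly when none of its inactive atoms currently satisfies it, because inactive atoms keep their values and only active atoms can be flipped. A minimal sketch of that check, assuming ground clauses are lists of (atom, is_positive) literals and that atoms missing from the assignment default to false; both conventions, and the name is_active_clause, are assumptions for illustration.

```python
def is_active_clause(clause, assignment, active_atoms):
    """True if flipping zero or more active atoms can leave the clause
    unsatisfied, i.e. no literal over an inactive atom is currently true."""
    for atom, positive in clause:
        if atom in active_atoms:
            continue  # an active atom can always be flipped to falsify its literal
        if assignment.get(atom, False) == positive:
            return False  # a fixed, inactive atom already satisfies the clause
    return True

author_ann = ("Author", "Ann", "P1")
author_bob = ("Author", "Bob", "P1")
coauthor = ("Coauthor", "Ann", "Bob")
# ¬Author(Ann,P1) ∨ ¬Author(Bob,P1) ∨ Coauthor(Ann,Bob)
rule = [(author_ann, False), (author_bob, False), (coauthor, True)]

# Both Author atoms true, Coauthor false: unsatisfied with zero flips, so active.
print(is_active_clause(rule, {author_ann: True, author_bob: True}, set()))
# Author(Bob,P1) is false and inactive: ¬Author(Bob,P1) holds for good, so inactive.
print(is_active_clause(rule, {author_ann: True}, set()))
# Once Author(Bob,P1) has been activated, the clause can become unsatisfied.
print(is_active_clause(rule, {author_ann: True}, {author_bob}))
```
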

15
LazySAT: The Basics
  • Activate all the atoms appearing in clauses
    unsatisfied by evidence DB
  • Create the corresponding clauses
  • Randomly assign truth value to all active atoms
  • Activate an atom when it is flipped if not
    already so
  • Potentially activate additional clauses
  • No need to go to the KB for calculating the
    change in cost for flipping an active atom
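
The first two steps above can be illustrated for the co-authorship rule from the earlier example. This is a sketch under assumptions, not Alchemy's implementation: Author is treated as fully observed evidence, Coauthor as a query predicate that defaults to false, and the function name initial_active_set is hypothetical. Indexing the evidence by paper lets the code enumerate only groundings whose body is true, instead of all (# persons)^2 × (# papers) combinations.

```python
from collections import defaultdict
from itertools import product

def initial_active_set(author_facts, coauthor_facts=frozenset()):
    """Ground only the clauses that the evidence leaves unsatisfied.
    author_facts:   {(person, paper), ...} true Author atoms (evidence)
    coauthor_facts: {(person1, person2), ...} Coauthor atoms known to be true"""
    authors_of = defaultdict(set)
    for person, paper in author_facts:
        authors_of[paper].add(person)

    active_atoms, active_clauses = set(), []
    for paper in sorted(authors_of):
        people = sorted(authors_of[paper])
        # Only pairs who share this paper make the rule body true.
        for p1, p2 in product(people, repeat=2):
            if (p1, p2) in coauthor_facts:
                continue  # head already true in the evidence: clause satisfied
            head = ("Coauthor", p1, p2)
            active_atoms.add(head)
            active_clauses.append([
                (("Author", p1, paper), False),
                (("Author", p2, paper), False),
                (head, True),
            ])
    return active_atoms, active_clauses

facts = {("Ann", "P1"), ("Bob", "P1"), ("Cal", "P2")}
lazy_atoms, lazy_clauses = initial_active_set(facts)

n_people, n_papers = 3, 2
full_count = n_people ** 2 * n_papers
print(len(lazy_clauses), "lazy groundings vs", full_count, "full groundings")
```

On this toy evidence the lazy set holds 5 clauses against 18 in the full grounding; the gap widens as the domain becomes sparser.
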

16
LazySAT
for i ← 1 to max-tries do
    active_atoms ← atoms in clauses unsatisfied by DB
    active_clauses ← clauses activated by active_atoms
    soln ← random truth assignment to active_atoms
    for j ← 1 to max-flips do
        if Σ weights(sat. clauses) > threshold then
            return soln
        c ← random unsatisfied clause
        with probability p
            vf ← a randomly chosen variable from c
        else
            for each variable v in c do
                compute DeltaGain(v), using weighted_KB if v ∉ active_atoms
            vf ← v with highest DeltaGain(v)
        if vf ∉ active_atoms then
            activate vf and add clauses activated by vf
        soln ← soln with vf flipped
return failure, best soln found

20
LazySAT: Performance Analysis
  • Solution Quality
  • Performs the same sequence of flips
  • Same result as WalkSAT
  • Memory cost
  • O(# potentially unsatisfied clauses)
  • Time cost
  • Much lower initialization cost
  • Cost of creating active clauses amortized over
    many flips

21
Outline
  • Motivation
  • Satisfiability and Relational Inference
  • Memory-Efficient Inference
  • Experiments
  • Discussion

22
Experiments
  • Two domains
  • De-duplicating citation databases
  • Cora
  • BibServ
  • Planning
  • Blocks world
  • Used the Alchemy system [Kok et al. 2005]
  • Experiments run on 3 GHz machines with
    3.46 GB of RAM

23
De-duplication
  • Weighted KB consisting of 33 first-order rules
  • High similarity score ⇒ Same field
  • Same record ⇒ Same fields
  • Cora
  • Cleaned version of McCallum's hand-labeled
    dataset
  • 1295 citations to 132 different research papers
  • BibServ
  • Public repository of half-a-million citations
  • Experimented on user-donated subset (21,805
    citations)

24
Methodology
  • Compared using varying number of records
  • Memory: Number of clauses grounded
  • Speed: Average flips/sec
  • Results averaged over 5 randomly chosen subsets
    for a given record count

25
Results: Memory
[Plots: Cora and BibServ; curve annotations n^2.97, n^2.98, n^1.75, n^2.34]
Memory reduction on full DB: 300X
Memory reduction on full DB: 400,000X

26
Results: Speed
[Plots: Cora and BibServ]

27
Outline
  • Motivation
  • Satisfiability and Relational Inference
  • Memory-Efficient Inference
  • Experiments
  • Discussion

28
Conclusion
  • Satisfiability testing is an effective approach
    to inference in relational domains
  • Limitation: Exponential cost of
    propositionalization
  • LazySAT
  • Exploits sparseness
  • Reduces memory cost by many orders of magnitude
  • Same solution with similar speed

29
Future Work
  • Extending LazySAT to other SAT solvers, MCMC
  • Degrading gracefully when number of clauses
    exceeds available memory
  • Combining LazySAT with KBMC
  • LazySAT available in the Alchemy system
  • http://www.cs.washington.edu/ai/alchemy