First Order Inference - PowerPoint PPT Presentation

1
First Order Inference
  • Bob McKay
  • School of Computer Science and Engineering
  • College of Engineering
  • Seoul National University
  • Largely based on
  • Russell & Norvig, Edn 1, Ch 9
  • Lecture Notes by Ng Hwee Tou (Singapore)

2
Outline
  • Reducing first-order inference to propositional
    inference
  • Unification
  • Generalized Modus Ponens
  • Forward chaining
  • Backward chaining
  • Resolution

3
References
  • Russell, S & Norvig, P, Artificial Intelligence:
    A Modern Approach, Prentice Hall
  • The library has edition 1, call number 006.3
    R917a
  • To buy, edition 2 (2003), ISBN 0137903952
  • Either is fine
  • Nilsson, NJ, Artificial Intelligence: A New
    Synthesis, Morgan Kaufmann, 1998, ISBN 1 55860
    535 5
  • Library call no 006.3 N599a
  • More detail than you'll ever need
  • Leitsch, A, The Resolution Calculus, Springer
    1997, ISBN 3 540 61882 1
  • Library call number 511.3 L537r

4
Universal instantiation (UI)
  • Every instantiation of a universally quantified
    sentence is entailed by it
  • From ∀v α, infer Subst({v/g}, α)
  • for any variable v and ground term g
  • E.g., ∀x King(x) ∧ Greedy(x) ⇒ Evil(x) yields
  • King(John) ∧ Greedy(John) ⇒ Evil(John)
  • King(Richard) ∧ Greedy(Richard) ⇒ Evil(Richard)
  • King(Father(John)) ∧ Greedy(Father(John)) ⇒
    Evil(Father(John))
  • ...

5
Existential instantiation (EI)
  • For any sentence α, variable v, and constant
    symbol k that does not appear elsewhere in the
    knowledge base
  • From ∃v α, infer Subst({v/k}, α)
  • E.g., ∃x Crown(x) ∧ OnHead(x,John) yields
  • Crown(C1) ∧ OnHead(C1,John)
  • provided C1 is a new constant symbol, called a
    Skolem constant
  • (Easiest to think of this in terms of adding a
    new individual to the World; it can have any
    relationships you like)
  • (There is a slight complication here:
  • what about ∃x (x = John)?
  • It generates a new Skolem constant C2 which is
    not John, but is in every respect identical to
    John.
  • Even worse, you can make the statement that
    there is only one thing equal to John, and still
    have two things satisfy it!)

6
Reduction to propositional inference
  • Suppose the KB contains just the following
  • ∀x King(x) ∧ Greedy(x) ⇒ Evil(x)
  • King(John)
  • Greedy(John)
  • Brother(Richard,John)
  • Instantiating the universal sentence in all
    possible ways, we have
  • King(John) ∧ Greedy(John) ⇒ Evil(John)
  • King(Richard) ∧ Greedy(Richard) ⇒ Evil(Richard)
  • King(John)
  • Greedy(John)
  • Brother(Richard,John)
  • The new KB is propositionalised: the proposition
    symbols are
  • King(John), Greedy(John), Evil(John),
    King(Richard), etc.
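
The instantiation step above can be sketched in a few lines. This is only an illustration: the tuple encoding of atoms, the convention that lowercase strings are variables, and the names `RULE`, `CONSTANTS` and `propositionalise` are assumptions, not part of the lecture.

```python
from itertools import product

# Assumed encoding: an atom is a tuple (predicate, arg1, ...); lowercase
# argument names are variables, capitalised names are constants.
CONSTANTS = ["John", "Richard"]

# King(x) & Greedy(x) => Evil(x), stored as (premises, conclusion).
RULE = ([("King", "x"), ("Greedy", "x")], ("Evil", "x"))


def is_variable(arg):
    return arg[:1].islower()


def ground(atom, theta):
    """Apply the substitution theta (variable -> constant) to one atom."""
    pred, *args = atom
    return (pred, *(theta.get(a, a) for a in args))


def propositionalise(rule, constants):
    """Instantiate the rule's variables with constants in all possible ways."""
    premises, conclusion = rule
    variables = sorted(
        {a for atom in premises + [conclusion] for a in atom[1:] if is_variable(a)}
    )
    instances = []
    for combo in product(constants, repeat=len(variables)):
        theta = dict(zip(variables, combo))
        instances.append(
            ([ground(p, theta) for p in premises], ground(conclusion, theta))
        )
    return instances


def show(atom):
    return f"{atom[0]}({','.join(atom[1:])})"


for premises, conclusion in propositionalise(RULE, CONSTANTS):
    print(" & ".join(show(p) for p in premises), "=>", show(conclusion))
```

Each ground atom such as `("King", "John")` then plays the role of a single proposition symbol.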

7
Reduction
  • Every first order KB can be propositionalised so
    as to preserve entailment
  • (a ground sentence is entailed by the new KB iff
    it is entailed by the original KB)
  • A ground sentence is one with no variables
  • Idea
  • propositionalise the KB
  • ask a query
  • apply resolution
  • return the result
  • Problem
  • with function symbols, there are infinitely many
    ground terms,
  • e.g., Father(Father(Father(John)))

8
Reduction contd.
  • Theorem (Herbrand, 1930): for first order logic,
  • if a sentence α is entailed by a knowledge base,
  • then it is entailed by a finite subset of the
    propositionalised KB
  • Idea: for n = 0 to ∞ do
  • create a propositional KB by instantiating
    with depth-n terms
  • see if α is entailed by this KB
  • Problem: works if α is entailed, loops if α is
    not entailed
  • Theorem (Turing, 1936; Church, 1936): entailment
    for FOL is semi-decidable
  • Algorithms exist that say yes to every entailed
    sentence
  • But no algorithm exists that also says no to
    every non-entailed sentence
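
To make "depth-n terms" concrete, here is a small sketch that enumerates the ground terms up to a given nesting depth. The representation (constants as strings, compound terms as tuples) and the name `ground_terms` are assumptions for illustration only.

```python
from itertools import product


def ground_terms(constants, functions, depth):
    """Return all ground terms whose nesting depth is at most `depth`.

    `functions` maps each function symbol to its arity; a compound term is
    a tuple (symbol, arg1, ...). Both conventions are assumed here.
    """
    terms = set(constants)
    for _ in range(depth):
        new_terms = set(terms)
        for name, arity in functions.items():
            for args in product(terms, repeat=arity):
                new_terms.add((name, *args))
        terms = new_terms
    return terms


# One constant and one unary function already give one new term per level,
# so the loop over n never runs out of terms to instantiate with.
for n in range(4):
    print(n, len(ground_terms(["John"], {"Father": 1}, n)))
```

With any function symbol present the set keeps growing with the depth, which is why the procedure can only stop early when the query is entailed.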

9
Problems with propositionalisation
  • Propositionalisation generates many
    irrelevant-seeming sentences.
  • E.g., from
  • ∀x King(x) ∧ Greedy(x) ⇒ Evil(x)
  • King(John)
  • ∀y Greedy(y)
  • Brother(Richard,John)
  • it seems obvious that Evil(John), but
    propositionalisation produces lots of facts such
    as Greedy(Richard) that seem irrelevant
  • With p k-ary predicates and n constants, there
    are p·n^k instantiations.
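
A quick check of that count on a toy vocabulary, as a sketch. The predicate and constant names below are picked only for illustration.

```python
from itertools import product


def instantiation_count(p, k, n):
    """Ground atoms for p predicates of arity k over n constants: p * n^k."""
    return p * n ** k


def enumerate_ground_atoms(predicates, arity, constants):
    """List every ground atom explicitly, to compare against the formula."""
    return [
        (pred, *args)
        for pred in predicates
        for args in product(constants, repeat=arity)
    ]


atoms = enumerate_ground_atoms(["Knows", "Brother"], 2, ["John", "Richard", "Jane"])
print(len(atoms), instantiation_count(2, 2, 3))
```

Even a modest vocabulary, say 10 ternary predicates over 100 constants, already gives ten million ground atoms.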

14
Unification
  • If we can find a substitution θ such that King(x)
    and Greedy(x) match King(John) and Greedy(y),
  • we don't need to generate all these irrelevant
    instances
  • We can generate the inference immediately
  • θ = {x/John, y/John} does this
  • In general,
  • Unify(α,β) = θ if αθ = βθ
  • p | q | θ
  • Knows(John,x) | Knows(John,Jane) | {x/Jane}
  • Knows(John,x) | Knows(y,OJ) | {x/OJ, y/John}
  • Knows(John,x) | Knows(y,Mother(y)) |
    {x/Mother(John), y/John}
  • Knows(John,x) | Knows(x,OJ) | fail
  • Standardizing apart eliminates overlap of
    variables, e.g., Knows(z17,OJ)

15
Unification
  • To unify Knows(John,x) and Knows(y,z),
  • θ = {y/John, x/z} or θ = {y/John, x/John,
    z/John}
  • The first unifier is more general than the
    second.
  • A unifier is a most general unifier (MGU) if no
    other unifier is more general
  • Theorem: any two MGUs are equivalent in the sense
    that we can get one from the other just by
    renaming variables
  • We say the MGU is unique up to renaming of
    variables
  • Note that we don't guarantee an MGU exists
  • MGU = {y/John, x/z}

16
The unification algorithm
  • (The algorithm itself, spread over slides 16 and
    17, did not survive in this transcript.)

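As a stand-in, here is a minimal sketch of a standard unification procedure with an occurs check. It is not necessarily the algorithm shown on the slides; the term encoding (lowercase strings are variables, capitalised strings are constants, tuples are compound terms) and all function names are assumptions.

```python
def is_variable(t):
    return isinstance(t, str) and t[:1].islower()


def walk(t, theta):
    """Follow variable bindings in theta until an unbound term is reached."""
    while is_variable(t) and t in theta:
        t = theta[t]
    return t


def occurs(var, t, theta):
    """True if var appears inside t (under theta); binding would then loop."""
    t = walk(t, theta)
    if isinstance(t, tuple):
        return any(occurs(var, arg, theta) for arg in t[1:])
    return t == var


def unify(a, b, theta=None):
    """Return a most general unifier extending theta, or None on failure."""
    theta = {} if theta is None else theta
    a, b = walk(a, theta), walk(b, theta)
    if a == b:
        return theta
    if is_variable(b) and not is_variable(a):
        a, b = b, a
    if is_variable(a):
        return None if occurs(a, b, theta) else {**theta, a: b}
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and len(a) == len(b) and a[0] == b[0]):
        for x, y in zip(a[1:], b[1:]):
            theta = unify(x, y, theta)
            if theta is None:
                return None
        return theta
    return None


def subst(theta, t):
    """Apply theta to t, following chains of bindings."""
    t = walk(t, theta)
    if isinstance(t, tuple):
        return (t[0], *(subst(theta, arg) for arg in t[1:]))
    return t


theta = unify(("Knows", "John", "x"), ("Knows", "y", ("Mother", "y")))
print(subst(theta, "x"), subst(theta, "y"))
```

The returned substitution is kept in "triangular" form (x may be bound to Mother(y) with y bound to John), so `subst` is used to read off the final bindings.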
18
Generalized Modus Ponens (GMP)
  • From p1', p2', ..., pn' and (p1 ∧ p2 ∧ ... ∧ pn ⇒ q),
    infer qθ
  • where θ is chosen so that pi'θ = piθ for all i
  • Example
  • p1' is King(John), p1 is King(x)
  • p2' is Greedy(y), p2 is Greedy(x)
  • θ is {x/John, y/John}, q is Evil(x)
  • qθ is Evil(John)
  • GMP is used with KBs of definite clauses
  • (exactly one positive literal)
  • All variables assumed universally quantified

19
Soundness of GMP
  • We need to show that
  • p1', ..., pn', (p1 ∧ ... ∧ pn ⇒ q) ⊨ qθ
  • if we assume that pi'θ = piθ for all i
  • We know that for any sentence p
  • The Universal Instantiation Property tells us
    that p ⊨ pθ
  • So we know that
  • 1. (p1 ∧ ... ∧ pn ⇒ q) ⊨ (p1 ∧ ... ∧ pn ⇒ q)θ
    (universal instantiation)
  • 2. But (p1 ∧ ... ∧ pn ⇒ q)θ = (p1θ ∧ ... ∧ pnθ ⇒ qθ)
  • 3. p1', ..., pn' ⊨ p1' ∧ ... ∧ pn' (conjunction
    introduction)
  • 4. And p1' ∧ ... ∧ pn' ⊨ p1'θ ∧ ... ∧ pn'θ
    (universal instantiation)
  • 5. but pi'θ = piθ by assumption
  • 6. So that (p1'θ ∧ ... ∧ pn'θ) = (p1θ ∧ ... ∧ pnθ)
  • 7. We already have (p1θ ∧ ... ∧ pnθ ⇒ qθ) from 2
  • 8. Combining 6 and 7, qθ follows by ordinary Modus
    Ponens

20
Example knowledge base
  • The law says that it is a crime for an American
    to sell weapons to hostile nations.
  • The country Nono, an enemy of America, has some
    missiles.
  • All of its missiles were sold to it by Colonel
    West, who is American.
  • Prove that Col. West is a criminal

21
Example knowledge base
  • ... it is a crime for an American to sell weapons
    to hostile nations
  • American(x) ∧ Weapon(y) ∧ Sells(x,y,z) ∧
    Hostile(z) ⇒ Criminal(x)
  • Nono has some missiles, i.e., ∃x Owns(Nono,x) ∧
    Missile(x)
  • Owns(Nono,M1) and Missile(M1)
  • all of its missiles were sold to it by Colonel
    West
  • Missile(x) ∧ Owns(Nono,x) ⇒ Sells(West,x,Nono)
  • Missiles are weapons
  • Missile(x) ⇒ Weapon(x)
  • An enemy of America counts as "hostile"
  • Enemy(x,America) ⇒ Hostile(x)
  • West, who is American
  • American(West)
  • The country Nono, an enemy of America
  • Enemy(Nono,America)

22
Forward chaining algorithm
  • (The algorithm did not survive in this
    transcript.)

23
Forward chaining proof
  • (Slides 23 to 25 stepped through the proof; their
    content did not survive in this transcript.)

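A minimal sketch of forward chaining on the crime knowledge base from slide 21. This is an illustrative implementation under stated assumptions, not the algorithm from the slides: atoms are tuples, lowercase arguments are variables, the KB is Datalog (no function symbols), and premises are matched against ground facts by simple pattern matching.

```python
def is_variable(t):
    return t[:1].islower()


def match(pattern, fact, theta):
    """Extend theta so that pattern equals the ground fact, else None."""
    if pattern[0] != fact[0] or len(pattern) != len(fact):
        return None
    theta = dict(theta)
    for p, f in zip(pattern[1:], fact[1:]):
        if is_variable(p):
            if theta.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return theta


def match_all(premises, facts, theta):
    """Yield every substitution that satisfies all premises."""
    if not premises:
        yield theta
        return
    for fact in facts:
        extended = match(premises[0], fact, theta)
        if extended is not None:
            yield from match_all(premises[1:], facts, extended)


def forward_chain(rules, facts, query):
    """Apply GMP repeatedly until the ground query appears or nothing is new."""
    facts = set(facts)
    while True:
        new = set()
        for premises, conclusion in rules:
            for theta in match_all(premises, facts, {}):
                derived = (conclusion[0],
                           *(theta.get(a, a) for a in conclusion[1:]))
                if derived not in facts:
                    new.add(derived)
        if query in facts | new:
            return True
        if not new:
            return False
        facts |= new


RULES = [
    ([("American", "x"), ("Weapon", "y"), ("Sells", "x", "y", "z"),
      ("Hostile", "z")], ("Criminal", "x")),
    ([("Missile", "x"), ("Owns", "Nono", "x")], ("Sells", "West", "x", "Nono")),
    ([("Missile", "x")], ("Weapon", "x")),
    ([("Enemy", "x", "America")], ("Hostile", "x")),
]
FACTS = {("Owns", "Nono", "M1"), ("Missile", "M1"),
         ("American", "West"), ("Enemy", "Nono", "America")}

print(forward_chain(RULES, FACTS, ("Criminal", "West")))
```

On this KB the first pass derives Sells(West,M1,Nono), Weapon(M1) and Hostile(Nono), and the second pass derives Criminal(West).
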
26
Properties of forward chaining
  • Sound and complete for first-order definite
    clauses
  • Datalog = first-order definite clauses + no
    functions
  • "No functions" is important, because it bounds
    the number of possible terms that may be
    constructed
  • Forward Chaining terminates for Datalog in a
    bounded number of iterations
  • If functions are allowed, Forward Chaining might
    not terminate if α is not entailed
    by the KB
  • This is unavoidable
  • Entailment with (predicate calculus) definite
    clauses is semi-decidable

27
Efficiency of forward chaining
  • We can improve the efficiency of forward chaining
    by some simple algorithm speed-ups
  • Incremental forward chaining
  • Suppose none of the premises of a particular
    rule were added (proven) on step k-1
  • We don't need to check again whether it
    matches, in step k
  • ⇒ only match rules whose premises contain
    newly added positive literals
  • Matching itself can be expensive
  • Database indexing allows O(1) retrieval of known
    facts
  • e.g., query Missile(x) retrieves Missile(M1)
  • The index creates a pointer to all matches to the
    query
  • Forward chaining is widely used in deductive
    databases
  • (databases that are extended by allowing rules
    to be added)

28
Hard matching example
  • Diff(wa,nt) ∧ Diff(wa,sa) ∧ Diff(nt,q) ∧
    Diff(nt,sa) ∧ Diff(q,nsw) ∧ Diff(q,sa) ∧
    Diff(nsw,v) ∧ Diff(nsw,sa) ∧ Diff(v,sa) ⇒
    Colorable()
  • Diff(Red,Blue), Diff(Red,Green),
    Diff(Green,Red), Diff(Green,Blue),
    Diff(Blue,Red), Diff(Blue,Green)
  • (we need to add axioms saying that Red, Green,
    Blue are the only colours)
  • Colorable() is inferred iff the CSP has a
    solution
  • CSPs include 3SAT as a special case, hence
    matching is NP-hard
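
To illustrate why matching this single rule amounts to solving a CSP, the sketch below finds all substitutions for the rule's variables by brute force. The names `COLOURS`, `PREMISES` and `matches` are assumptions for illustration, and exhaustive search is used only because the example is tiny.

```python
from itertools import product

COLOURS = ["Red", "Green", "Blue"]

# The Diff facts in the KB: every ordered pair of distinct colours.
DIFF_FACTS = {(a, b) for a in COLOURS for b in COLOURS if a != b}

# Variables of the rule and the pairs that appear in its Diff premises.
REGIONS = ["wa", "nt", "q", "nsw", "v", "sa"]
PREMISES = [("wa", "nt"), ("wa", "sa"), ("nt", "q"), ("nt", "sa"),
            ("q", "nsw"), ("q", "sa"), ("nsw", "v"), ("nsw", "sa"),
            ("v", "sa")]


def matches():
    """Yield each substitution under which every premise is a known fact."""
    for combo in product(COLOURS, repeat=len(REGIONS)):
        theta = dict(zip(REGIONS, combo))
        if all((theta[a], theta[b]) in DIFF_FACTS for a, b in PREMISES):
            yield theta


colorable = next(matches(), None) is not None
print(colorable, len(list(matches())))
```

The search space is 3^6 candidate substitutions here and grows exponentially with the number of variables in the premise, which is the point of the slide.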

29
Backward chaining algorithm
  • We use the notation that
  • SUBST(COMPOSE(θ1, θ2), p) =
    SUBST(θ2, SUBST(θ1, p))
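
The COMPOSE identity can be checked on a small case. In this sketch the function names mirror the slide's notation, but the dictionary representation of substitutions and the tuple encoding of terms are assumptions.

```python
def is_variable(t):
    return isinstance(t, str) and t[:1].islower()


def subst(theta, term):
    """Apply theta once, in parallel, to every variable in term."""
    if is_variable(term):
        return theta.get(term, term)
    if isinstance(term, tuple):
        return (term[0], *(subst(theta, arg) for arg in term[1:]))
    return term


def compose(theta1, theta2):
    """Substitution equivalent to applying theta1 first, then theta2."""
    composed = {v: subst(theta2, t) for v, t in theta1.items()}
    for v, t in theta2.items():
        composed.setdefault(v, t)  # theta1's bindings take precedence
    return composed


theta1 = {"x": ("Mother", "y")}
theta2 = {"y": "John"}
p = ("Knows", "y", "x")

left = subst(compose(theta1, theta2), p)
right = subst(theta2, subst(theta1, p))
print(left == right, left)
```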

30
Backward chaining example
  • (Slides 30 to 37 stepped through the example;
    their content did not survive in this
    transcript.)

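A minimal sketch of depth-first backward chaining on the crime knowledge base from slide 21. It is an illustration, not the slides' algorithm: clauses are stored as (head, body) pairs, lowercase strings are variables, rules are standardized apart by renaming on each use, and the occurs check is omitted because this KB has no function symbols.

```python
from itertools import count


def is_variable(t):
    return isinstance(t, str) and t[:1].islower()


def walk(t, theta):
    while is_variable(t) and t in theta:
        t = theta[t]
    return t


def unify(a, b, theta):
    """Unify two terms under theta (no occurs check: Datalog assumed)."""
    a, b = walk(a, theta), walk(b, theta)
    if a == b:
        return theta
    if is_variable(a):
        return {**theta, a: b}
    if is_variable(b):
        return {**theta, b: a}
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and len(a) == len(b) and a[0] == b[0]):
        for x, y in zip(a[1:], b[1:]):
            theta = unify(x, y, theta)
            if theta is None:
                return None
        return theta
    return None


def subst(theta, t):
    t = walk(t, theta)
    if isinstance(t, tuple):
        return (t[0], *(subst(theta, a) for a in t[1:]))
    return t


fresh = count()


def rename(clause):
    """Standardize apart: give the clause's variables fresh names."""
    n = next(fresh)

    def r(t):
        if is_variable(t):
            return f"{t}_{n}"
        if isinstance(t, tuple):
            return (t[0], *(r(a) for a in t[1:]))
        return t

    head, body = clause
    return r(head), [r(b) for b in body]


def prove(goals, theta, kb):
    """Yield every substitution that proves all goals, left to right."""
    if not goals:
        yield theta
        return
    for clause in kb:
        head, body = rename(clause)
        unified = unify(goals[0], head, theta)
        if unified is not None:
            yield from prove(body + goals[1:], unified, kb)


def ask(query, kb):
    for theta in prove([query], {}, kb):
        yield subst(theta, query)


KB = [
    (("Criminal", "x"), [("American", "x"), ("Weapon", "y"),
                         ("Sells", "x", "y", "z"), ("Hostile", "z")]),
    (("Sells", "West", "x", "Nono"), [("Missile", "x"), ("Owns", "Nono", "x")]),
    (("Weapon", "x"), [("Missile", "x")]),
    (("Hostile", "x"), [("Enemy", "x", "America")]),
    (("Owns", "Nono", "M1"), []),
    (("Missile", "M1"), []),
    (("American", "West"), []),
    (("Enemy", "Nono", "America"), []),
]

print(list(ask(("Criminal", "who"), KB)))
```

Like the procedure described on the slides, this sketch does no loop checking or caching, so it can loop on recursive rules and repeat work on shared subgoals.
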
38
Properties of backward chaining
  • Depth-first recursive proof search
  • Required space is linear in the size of the proof
  • The procedure is incomplete due to infinite loops
  • We can fix this by checking the current goal
    against every goal on the stack
  • The procedure is inefficient due to repeated
    checking of the same subgoals
  • (both successful and failing goals)
  • We can fix this by caching previous results
  • (at the cost of extra space)
  • Widely used for logic programming

39
Logic programming: Prolog
  • Basic idea: Algorithm = Logic + Control
  • An algorithm can be expressed as logical
    relationships
  • Combined with a control strategy for determining
    the order of proofs attempted
  • Backward chaining proof strategy
  • Horn clauses + some extensions
  • Especially, more than one positive literal
  • Unsoundly implemented negation
  • Widely used in Europe and Japan in the 1980s and
    early 1990s
  • More recent languages
  • Goedel, Mercury
  • Pure logic - no unsound inferences
  • Highly efficient compiler (Mercury)
  • Uses proof techniques to compile away much of the
    inefficiency of backward chaining
  • Backtracking avoided by proof techniques (may
    often be proven unnecessary)
  • Recursions may be replaced by iterations
  • Typically comparable in speed with Java
  • But allows more complex algorithms to be
    expressed compactly
  • Special strengths in generating correct code
  • And proving the code is correct

40
Prolog details
  • Program = set of clauses of the form
  • head :- literal1, ..., literaln.
  • criminal(X) :- american(X), weapon(Y),
    sells(X,Y,Z), hostile(Z).
  • Depth-first, left-to-right backward chaining
  • Some non-logical features
  • Unsound built-in predicates for arithmetic etc
  • X is Y*Z+3
  • Built-in predicates that have side effects
  • Required for input and output
  • Changing knowledge base
  • assert/retract predicates
  • Closed-world assumption ("negation as failure")
  • alive(X) :- not dead(X).
  • alive(joe) is proven if dead(joe) fails

41
Prolog
  • Appending two lists to produce a third
  • append([],Y,Y).
  • append([X|L],Y,[X|Z]) :- append(L,Y,Z).
  • In predicate calculus, we would write this as
  • ∀y Append(Empty, y, y)
  • ∀x, y, z, l (Append(l, y, z) ⇒ Append(Join(x,
    l), y, Join(x,z)))
  • query: append(A,B,[1,2]) ?
  • answers: A=[] B=[1,2]
  • A=[1] B=[2]
  • A=[1,2] B=[]
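
For readers more used to imperative code, the way Prolog enumerates the answers to this query can be imitated with a Python generator that follows the two clauses in order. This is only an analogy: the name `append_rel` is hypothetical, and the generator handles just the case where the third argument is known.

```python
def append_rel(z):
    """Yield every pair (a, b) with a + b == z, in Prolog's answer order."""
    # Clause 1: append([], Y, Y).
    yield [], z
    # Clause 2: append([X|L], Y, [X|Z]) :- append(L, Y, Z).
    if z:
        x, rest = z[0], z[1:]
        for l, y in append_rel(rest):
            yield [x] + l, y


for a, b in append_rel([1, 2]):
    print("A =", a, " B =", b)
```

Trying the clauses top to bottom and backtracking into the recursive call gives the three answers in the same order as on the slide.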

42
Resolution: brief summary
  • Full first-order version:
  • From l1 ∨ ... ∨ lk and m1 ∨ ... ∨ mn, infer
  • (l1 ∨ ... ∨ li-1 ∨ li+1 ∨ ... ∨ lk ∨ m1 ∨ ... ∨
    mj-1 ∨ mj+1 ∨ ... ∨ mn)θ
  • where Unify(li, ¬mj) = θ.
  • The two clauses are assumed to be standardized
    apart so that they share no variables.
  • For example,
  • from ¬Rich(x) ∨ Unhappy(x)
  • and Rich(Ken)
  • infer Unhappy(Ken)
  • with θ = {x/Ken}
  • Apply resolution steps to CNF(KB ∧ ¬α); complete
    for FOL
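
A single binary resolution step can be sketched as follows. The encoding is an assumption made for illustration: a clause is a list of literals, a literal is a pair (is_positive, atom), lowercase strings are variables, and the two clauses are taken to be standardized apart already. Factoring, which a complete resolution prover also needs, is left out.

```python
def is_variable(t):
    return isinstance(t, str) and t[:1].islower()


def walk(t, theta):
    while is_variable(t) and t in theta:
        t = theta[t]
    return t


def occurs(var, t, theta):
    t = walk(t, theta)
    if isinstance(t, tuple):
        return any(occurs(var, a, theta) for a in t[1:])
    return t == var


def unify(a, b, theta):
    """Most general unifier of a and b extending theta, or None."""
    a, b = walk(a, theta), walk(b, theta)
    if a == b:
        return theta
    if is_variable(b) and not is_variable(a):
        a, b = b, a
    if is_variable(a):
        return None if occurs(a, b, theta) else {**theta, a: b}
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and len(a) == len(b) and a[0] == b[0]):
        for x, y in zip(a[1:], b[1:]):
            theta = unify(x, y, theta)
            if theta is None:
                return None
        return theta
    return None


def subst(theta, t):
    t = walk(t, theta)
    if isinstance(t, tuple):
        return (t[0], *(subst(theta, a) for a in t[1:]))
    return t


def resolvents(c1, c2):
    """Yield each binary resolvent of two standardized-apart clauses."""
    for i, (sign1, atom1) in enumerate(c1):
        for j, (sign2, atom2) in enumerate(c2):
            if sign1 == sign2:
                continue  # need one positive and one negative literal
            theta = unify(atom1, atom2, {})
            if theta is not None:
                rest = c1[:i] + c1[i + 1:] + c2[:j] + c2[j + 1:]
                yield [(sign, subst(theta, atom)) for sign, atom in rest]


# not Rich(x) or Unhappy(x), together with Rich(Ken)
clause1 = [(False, ("Rich", "x")), (True, ("Unhappy", "x"))]
clause2 = [(True, ("Rich", "Ken"))]
print(list(resolvents(clause1, clause2)))
```

An empty list as a resolvent would correspond to the empty clause, i.e. a contradiction, which is what a refutation proof from CNF(KB ∧ ¬α) looks for.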

43
Conversion to CNF
  • Everyone who loves all animals is loved by
    someone
  • ∀x [∀y Animal(y) ⇒ Loves(x,y)] ⇒ [∃y Loves(y,x)]
  • 1. Eliminate biconditionals and implications
  • ∀x [¬∀y ¬Animal(y) ∨ Loves(x,y)] ∨ [∃y
    Loves(y,x)]
  • 2. Move ¬ inwards: ¬∀x p ≡ ∃x ¬p, ¬∃x p ≡ ∀x ¬p
  • ∀x [∃y ¬(¬Animal(y) ∨ Loves(x,y))] ∨ [∃y
    Loves(y,x)]
  • ∀x [∃y ¬¬Animal(y) ∧ ¬Loves(x,y)] ∨ [∃y
    Loves(y,x)]
  • ∀x [∃y Animal(y) ∧ ¬Loves(x,y)] ∨ [∃y Loves(y,x)]

44
Conversion to CNF contd.
  • 3. Standardize variables: each quantifier should
    use a different variable
  • ∀x [∃y Animal(y) ∧ ¬Loves(x,y)] ∨ [∃z Loves(z,x)]
  • 4. Skolemize: a more general form of existential
    instantiation.
  • Each existential variable is replaced by a Skolem
    function of the enclosing universally quantified
    variables
  • ∀x [Animal(F(x)) ∧ ¬Loves(x,F(x))] ∨
    Loves(G(x),x)
  • 5. Drop universal quantifiers
  • [Animal(F(x)) ∧ ¬Loves(x,F(x))] ∨ Loves(G(x),x)
  • 6. Distribute ∨ over ∧
  • [Animal(F(x)) ∨ Loves(G(x),x)] ∧ [¬Loves(x,F(x))
    ∨ Loves(G(x),x)]

45
Resolution proof: definite clauses
  • (The proof itself did not survive in this
    transcript.)