Title: First Order Inference
1First Order Inference
- Bob McKay
- School of Computer Science and Engineering
- College of Engineering
- Seoul National University
- Largely based on
- Russell & Norvig, Edn 1, Ch 9
- Lecture Notes by Ng Hwee Tou (Singapore)
2Outline
- Reducing first-order inference to propositional inference
- Unification
- Generalized Modus Ponens
- Forward chaining
- Backward chaining
- Resolution
3References
- Russell, S & Norvig, P, Artificial Intelligence: A Modern Approach, Prentice Hall
- The library has edition 1, call number 006.3 R917a
- To buy, edition 2 (1995), ISBN 0137903952
- Either is fine
- Nilsson, NJ, Artificial Intelligence: A New Synthesis, Morgan Kaufmann, 1998, ISBN 1 55860 535 5
- Library call no 006.3 N599a
- More detail than you'll ever need
- Leitsch, A, The Resolution Calculus, Springer 1997, ISBN 3 540 61882 1
- Library call number 511.3 L537r
4Universal instantiation (UI)
- Every instantiation of a universally quantified sentence is entailed by it:
- From ∀v α, infer Subst({v/g}, α)
- for any variable v and ground term g
- E.g., ∀x King(x) ∧ Greedy(x) ⇒ Evil(x) yields
- King(John) ∧ Greedy(John) ⇒ Evil(John)
- King(Richard) ∧ Greedy(Richard) ⇒ Evil(Richard)
- King(Father(John)) ∧ Greedy(Father(John)) ⇒ Evil(Father(John))
- ...
5Existential instantiation (EI)
- For any sentence α, variable v, and constant symbol k that does not appear elsewhere in the knowledge base:
- From ∃v α, infer Subst({v/k}, α)
- E.g., ∃x Crown(x) ∧ OnHead(x,John) yields
- Crown(C1) ∧ OnHead(C1,John)
- provided C1 is a new constant symbol, called a Skolem constant
- (Easiest to think of this in terms of adding a new individual to the World; it can have any relationships you like)
- (There is a slight complication here:
- what about ∃x (x = John)?
- This generates a new Skolem constant C2 which is not John, but is in every respect identical to John.
- Even worse, you can make the statement that there is only one thing equal to John, and still have two things satisfy it!)
6Reduction to propositional inference
- Suppose the KB contains just the following:
- ∀x King(x) ∧ Greedy(x) ⇒ Evil(x)
- King(John)
- Greedy(John)
- Brother(Richard,John)
- Instantiating the universal sentence in all possible ways, we have
- King(John) ∧ Greedy(John) ⇒ Evil(John)
- King(Richard) ∧ Greedy(Richard) ⇒ Evil(Richard)
- King(John)
- Greedy(John)
- Brother(Richard,John)
- The new KB is propositionalised: proposition symbols are
- King(John), Greedy(John), Evil(John), King(Richard), etc.
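The instantiation step can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the slides: atoms are written as tuples such as ("King", "x"), lowercase names are treated as variables, and the helper names (is_var, subst, ground_instances, show) are made up for the example.

```python
from itertools import product

def is_var(term):
    # Assumed convention for this sketch: lowercase names are variables.
    return term[:1].islower()

def subst(theta, atom):
    """Apply substitution theta (variable -> constant) to an atom (pred, arg, ...)."""
    return (atom[0],) + tuple(theta.get(a, a) for a in atom[1:])

def ground_instances(premises, conclusion, constants):
    """Return every ground instance of the rule: premises => conclusion."""
    atoms = premises + [conclusion]
    variables = sorted({a for atom in atoms for a in atom[1:] if is_var(a)})
    instances = []
    for values in product(constants, repeat=len(variables)):
        theta = dict(zip(variables, values))
        instances.append(([subst(theta, p) for p in premises],
                          subst(theta, conclusion)))
    return instances

def show(atom):
    return atom[0] + "(" + ",".join(atom[1:]) + ")"

rule_premises = [("King", "x"), ("Greedy", "x")]
rule_conclusion = ("Evil", "x")
instances = ground_instances(rule_premises, rule_conclusion, ["John", "Richard"])
for prem, concl in instances:
    print(" & ".join(show(p) for p in prem), "=>", show(concl))
```

With the two constants John and Richard this produces the two ground rules listed on the slide; each ground atom can then be treated as a proposition symbol.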
7Reduction
- Every first order KB can be propositionalised so
as to preserve entailment
- (A ground sentence is entailed by the new KB iff it is entailed by the original KB)
- A ground sentence is one with no variables
- Idea:
- propositionalise the KB
- ask a query
- apply resolution
- return the result
- Problem:
- with function symbols, there are infinitely many ground terms,
- e.g., Father(Father(Father(John)))
8Reduction contd.
- Theorem (Herbrand, 1930): for first order logic,
- if a sentence α is entailed by a knowledge base,
- then it is entailed by a finite subset of the propositionalised KB
- Idea: for n = 0 to ∞ do
- create a propositional KB by instantiating with depth-n terms
- see if α is entailed by this KB
- Problem: works if α is entailed, loops if α is not entailed
- Theorem (Turing 1936, Church 1936): entailment for FOL is semi-decidable
- Algorithms exist that say yes to every entailed sentence
- But no algorithm exists that also says no to every non-entailed sentence
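To make "depth-n terms" concrete, here is a small Python sketch that enumerates the ground terms of nesting depth at most n. The function name ground_terms and the string representation of terms are assumptions made for the illustration; the slides do not prescribe a representation.

```python
from itertools import product

def ground_terms(constants, functions, depth):
    """Return all ground terms of nesting depth <= depth.

    constants: list of constant names
    functions: dict mapping function name -> arity
    """
    terms = set(constants)
    for _ in range(depth):
        new_terms = set(terms)
        for name, arity in functions.items():
            # Build name(t1,...,tk) from every tuple of terms found so far.
            for args in product(terms, repeat=arity):
                new_terms.add(name + "(" + ",".join(args) + ")")
        terms = new_terms
    return terms

for n in range(3):
    print(n, sorted(ground_terms(["John"], {"Father": 1}, n)))
```

Each increase in depth adds new terms whenever at least one function symbol is present, which is why the loop over n has no natural stopping point when the query is not entailed.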
9Problems with propositionalisation
- Propositionalisation generates many irrelevant-seeming sentences.
- E.g., from
- ∀x King(x) ∧ Greedy(x) ⇒ Evil(x)
- King(John)
- ∀y Greedy(y)
- Brother(Richard,John)
- it seems obvious that Evil(John), but propositionalisation produces lots of facts such as Greedy(Richard) that seem irrelevant
- With p k-ary predicates and n constants, there are p·n^k instantiations.
14Unification
- If we can find a substitution θ such that King(x) and Greedy(x) match King(John) and Greedy(y), then:
- We don't need to generate all these irrelevant instances
- We can generate the inference immediately
- θ = {x/John, y/John} does this
- In general,
- Unify(α,β) = θ if αθ = βθ
- p                  q                      θ
- Knows(John,x)      Knows(John,Jane)       {x/Jane}
- Knows(John,x)      Knows(y,OJ)            {x/OJ, y/John}
- Knows(John,x)      Knows(y,Mother(y))     {x/Mother(John), y/John}
- Knows(John,x)      Knows(x,OJ)            fail
- Standardizing apart eliminates overlap of variables, e.g., Knows(z17,OJ)
15Unification
- To unify Knows(John,x) and Knows(y,z),
- θ = {y/John, x/z} or θ = {y/John, x/John, z/John}
- The first unifier is more general than the second.
- A unifier is a most general unifier (MGU) if no other unifier is more general
- Theorem: any two MGUs are equivalent in the sense that we can get one from the other just by renaming variables
- We say the MGU is unique up to renaming of variables
- Note that we don't guarantee an MGU exists
- MGU = {y/John, x/z}
16The unification algorithm
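The algorithm shown on this slide did not survive extraction. As a stand-in, here is a minimal Python sketch of a standard recursive unification with an occurs check; it is not necessarily the version presented in the lecture. The representation is an assumption: lowercase strings are variables, capitalised strings are constants, and compound terms are tuples such as ("Mother", "y"). The helper names walk, occurs and resolve are hypothetical.

```python
def is_var(t):
    # Assumed convention: lowercase strings are variables.
    return isinstance(t, str) and t[:1].islower()

def walk(t, theta):
    """Follow variable bindings in theta until an unbound term is reached."""
    while is_var(t) and t in theta:
        t = theta[t]
    return t

def occurs(v, t, theta):
    """True if variable v occurs inside term t (under theta)."""
    t = walk(t, theta)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, theta) for a in t[1:])

def unify(a, b, theta=None):
    """Return a substitution (dict) that unifies a and b, or None on failure."""
    theta = {} if theta is None else theta
    a, b = walk(a, theta), walk(b, theta)
    if a == b:
        return theta
    if is_var(a):
        return None if occurs(a, b, theta) else {**theta, a: b}
    if is_var(b):
        return None if occurs(b, a, theta) else {**theta, b: a}
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and a[0] == b[0] and len(a) == len(b)):
        for x, y in zip(a[1:], b[1:]):
            theta = unify(x, y, theta)
            if theta is None:
                return None
        return theta
    return None

def resolve(t, theta):
    """Apply theta to t all the way down, so bindings are fully expanded."""
    t = walk(t, theta)
    if isinstance(t, tuple):
        return (t[0],) + tuple(resolve(a, theta) for a in t[1:])
    return t

theta = unify(("Knows", "John", "x"), ("Knows", "y", ("Mother", "y")))
print(resolve("x", theta), resolve("y", theta))
print(unify(("Knows", "John", "x"), ("Knows", "x", "OJ")))
```

The sketch stores bindings one level at a time (x may be bound to Mother(y) with y bound to John), so resolve is used to read off the fully substituted term. The last call fails because x is not standardized apart, as in the table of examples.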
18Generalized Modus Ponens (GMP)
- From p1', p2', ..., pn' and (p1 ∧ p2 ∧ ... ∧ pn ⇒ q), infer qθ
- where θ is chosen so that pi'θ = piθ for all i
- Example:
- p1' is King(John), p1 is King(x)
- p2' is Greedy(y), p2 is Greedy(x)
- θ is {x/John, y/John}, q is Evil(x)
- qθ is Evil(John)
- GMP is used with KBs of definite clauses
- (exactly one positive literal)
- All variables assumed universally quantified
19Soundness of GMP
- We need to show that
- p1', ..., pn', (p1 ∧ ... ∧ pn ⇒ q) ⊨ qθ
- if we assume that pi'θ = piθ for all i
- We know that for any sentence p,
- the Universal Instantiation Property tells us that p ⊨ pθ
- So we know that
- 1. (p1 ∧ ... ∧ pn ⇒ q) ⊨ (p1 ∧ ... ∧ pn ⇒ q)θ (universal instantiation)
- 2. But (p1 ∧ ... ∧ pn ⇒ q)θ = (p1θ ∧ ... ∧ pnθ ⇒ qθ)
- 3. p1', ..., pn' ⊨ p1' ∧ ... ∧ pn' (conjunction introduction)
- 4. And p1' ∧ ... ∧ pn' ⊨ p1'θ ∧ ... ∧ pn'θ (universal instantiation)
- 5. but pi'θ = piθ by assumption
- 6. So that (p1'θ ∧ ... ∧ pn'θ) = (p1θ ∧ ... ∧ pnθ)
- 7. We already have (p1θ ∧ ... ∧ pnθ ⇒ qθ) from 2
- 8. Combining 6 and 7, qθ follows by ordinary Modus Ponens
20Example knowledge base
- The law says that it is a crime for an American to sell weapons to hostile nations.
- The country Nono, an enemy of America, has some missiles.
- All of its missiles were sold to it by Colonel West, who is American.
- Prove that Col. West is a criminal
21Example knowledge base
- ... it is a crime for an American to sell weapons to hostile nations:
- American(x) ∧ Weapon(y) ∧ Sells(x,y,z) ∧ Hostile(z) ⇒ Criminal(x)
- Nono ... has some missiles, i.e., ∃x Owns(Nono,x) ∧ Missile(x):
- Owns(Nono,M1) and Missile(M1)
- ... all of its missiles were sold to it by Colonel West:
- Missile(x) ∧ Owns(Nono,x) ⇒ Sells(West,x,Nono)
- Missiles are weapons:
- Missile(x) ⇒ Weapon(x)
- An enemy of America counts as "hostile":
- Enemy(x,America) ⇒ Hostile(x)
- West, who is American ...
- American(West)
- The country Nono, an enemy of America ...
- Enemy(Nono,America)
22Forward chaining algorithm
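The algorithm on this slide was a figure and is missing from the text. The Python sketch below is one simple way to realise forward chaining for function-free (Datalog) definite clauses, run on the example knowledge base about Colonel West; it is not claimed to be the algorithm from the slide. Atoms are tuples, lowercase arguments are variables, and the helper names match, match_all and forward_chain are assumptions of this example.

```python
def is_var(t):
    # Assumed convention: lowercase names are variables.
    return t[:1].islower()

def match(pattern, fact, theta):
    """Extend theta so that pattern equals the ground fact, or return None."""
    if pattern[0] != fact[0] or len(pattern) != len(fact):
        return None
    theta = dict(theta)
    for p, f in zip(pattern[1:], fact[1:]):
        if is_var(p):
            if theta.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return theta

def match_all(premises, facts, theta):
    """Yield every substitution matching all premises against known facts."""
    if not premises:
        yield theta
        return
    for fact in facts:
        t = match(premises[0], fact, theta)
        if t is not None:
            yield from match_all(premises[1:], facts, t)

def forward_chain(rules, facts):
    """Apply all rules repeatedly until no new fact can be derived."""
    facts = set(facts)
    while True:
        new = set()
        for premises, conclusion in rules:
            for theta in match_all(premises, facts, {}):
                derived = (conclusion[0],) + tuple(
                    theta.get(a, a) for a in conclusion[1:])
                if derived not in facts:
                    new.add(derived)
        if not new:
            return facts
        facts |= new

rules = [
    ([("American", "x"), ("Weapon", "y"), ("Sells", "x", "y", "z"),
      ("Hostile", "z")], ("Criminal", "x")),
    ([("Missile", "x"), ("Owns", "Nono", "x")], ("Sells", "West", "x", "Nono")),
    ([("Missile", "x")], ("Weapon", "x")),
    ([("Enemy", "x", "America")], ("Hostile", "x")),
]
facts = [("Owns", "Nono", "M1"), ("Missile", "M1"),
         ("American", "West"), ("Enemy", "Nono", "America")]
result = forward_chain(rules, facts)
print(("Criminal", "West") in result)
```

The first pass derives Sells(West,M1,Nono), Weapon(M1) and Hostile(Nono); the second pass derives Criminal(West); the third pass adds nothing, so the loop stops.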
23Forward chaining proof
26Properties of forward chaining
- Sound and complete for first-order definite clauses
- Datalog = first-order definite clauses + no functions
- No functions is important, because it bounds the number of possible terms that may be constructed
- Forward Chaining terminates for Datalog in a bounded number of iterations
- If functions are allowed, Forward Chaining might not terminate if α is not entailed by the KB
- This is unavoidable:
- Entailment with (predicate calculus) definite clauses is semi-decidable
27Efficiency of forward chaining
- We can improve the efficiency of forward chaining by some simple algorithm speed-ups
- Incremental forward chaining
- Suppose none of the premises of a particular rule were added (proven) on step k-1
- We don't need to check again whether it matches, in step k
- ⇒ only match rules whose premises contain newly added positive literals
- Matching itself can be expensive
- Database indexing allows O(1) retrieval of known facts
- e.g., query Missile(x) retrieves Missile(M1)
- The index creates a pointer to all matches to the query
- Forward chaining is widely used in deductive databases
- Databases in which the database is extended by allowing the addition of rules
28Hard matching example
Diff(wa,nt) ∧ Diff(wa,sa) ∧ Diff(nt,q) ∧ Diff(nt,sa) ∧ Diff(q,nsw) ∧ Diff(q,sa) ∧ Diff(nsw,v) ∧ Diff(nsw,sa) ∧ Diff(v,sa) ⇒ Colorable()
Diff(Red,Blue) Diff(Red,Green) Diff(Green,Red) Diff(Green,Blue) Diff(Blue,Red) Diff(Blue,Green)
(we need to add axioms saying that Red, Green, Blue are the only colours)
- Colorable() is inferred iff the CSP has a solution
- CSPs include 3SAT as a special case, hence matching is NP-hard
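To see why matching this single rule amounts to solving a constraint satisfaction problem, the following Python sketch tries every assignment of colours to the rule's variables and keeps those for which all Diff premises are known facts. It is a brute-force illustration only; the names regions, premises, diff_facts and matching_substitutions are assumptions of the example.

```python
from itertools import product

# Variables of the rule (one per region) and its Diff(a, b) premises.
regions = ["wa", "nt", "sa", "q", "nsw", "v"]
premises = [("wa", "nt"), ("wa", "sa"), ("nt", "q"), ("nt", "sa"),
            ("q", "nsw"), ("q", "sa"), ("nsw", "v"), ("nsw", "sa"),
            ("v", "sa")]

# Ground facts Diff(c1, c2) for every pair of distinct colours.
colours = ["Red", "Green", "Blue"]
diff_facts = {(a, b) for a in colours for b in colours if a != b}

def matching_substitutions():
    """Yield each substitution under which every premise is a known fact."""
    for values in product(colours, repeat=len(regions)):
        theta = dict(zip(regions, values))
        if all((theta[a], theta[b]) in diff_facts for a, b in premises):
            yield theta

solutions = list(matching_substitutions())
colorable = bool(solutions)
print(colorable, len(solutions))
```

The search space here is 3^6 assignments; in general it grows exponentially with the number of variables in the rule, which is the practical face of the NP-hardness claim.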
29Backward chaining algorithm
- We use the notation that
- SUBST(COMPOSE(θ1, θ2), p) = SUBST(θ2, SUBST(θ1, p))
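The algorithm figure for this slide is missing from the text, so the following Python sketch is offered only as one possible depth-first backward chainer, not as the slide's version. Instead of calling COMPOSE explicitly, it threads a single accumulated substitution through the proof, which has the same effect. Rules are (head, body) pairs, facts have an empty body, lowercase names are variables, and rename, prove and the query variable "who" are assumptions of the example.

```python
import itertools

def is_var(t):
    return isinstance(t, str) and t[:1].islower()

def walk(t, theta):
    while is_var(t) and t in theta:
        t = theta[t]
    return t

def occurs(v, t, theta):
    t = walk(t, theta)
    return t == v or (isinstance(t, tuple)
                      and any(occurs(v, a, theta) for a in t[1:]))

def unify(a, b, theta):
    a, b = walk(a, theta), walk(b, theta)
    if a == b:
        return theta
    if is_var(a):
        return None if occurs(a, b, theta) else {**theta, a: b}
    if is_var(b):
        return None if occurs(b, a, theta) else {**theta, b: a}
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and a[0] == b[0] and len(a) == len(b)):
        for x, y in zip(a[1:], b[1:]):
            theta = unify(x, y, theta)
            if theta is None:
                return None
        return theta
    return None

counter = itertools.count()

def rename(rule):
    """Standardize apart: give the rule's variables fresh names."""
    n = next(counter)
    def r(t):
        if isinstance(t, tuple):
            return (t[0],) + tuple(r(a) for a in t[1:])
        return f"{t}_{n}" if is_var(t) else t
    head, body = rule
    return r(head), [r(b) for b in body]

def prove(goals, theta, kb):
    """Yield each substitution under which all goals follow from kb."""
    if not goals:
        yield theta
        return
    first, rest = goals[0], goals[1:]
    for rule in kb:
        head, body = rename(rule)
        theta2 = unify(first, head, theta)
        if theta2 is not None:
            # Prove the rule body first, then the remaining goals.
            yield from prove(body + rest, theta2, kb)

kb = [
    (("Criminal", "x"), [("American", "x"), ("Weapon", "y"),
                         ("Sells", "x", "y", "z"), ("Hostile", "z")]),
    (("Sells", "West", "x", "Nono"), [("Missile", "x"), ("Owns", "Nono", "x")]),
    (("Weapon", "x"), [("Missile", "x")]),
    (("Hostile", "x"), [("Enemy", "x", "America")]),
    (("Owns", "Nono", "M1"), []), (("Missile", "M1"), []),
    (("American", "West"), []), (("Enemy", "Nono", "America"), []),
]
answers = [walk("who", t) for t in prove([("Criminal", "who")], {}, kb)]
print(answers)
```

The sketch has no loop check and no caching, so on a recursive knowledge base it can run forever or repeat work.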
30Backward chaining example
38Properties of backward chaining
- Depth-first recursive proof search
- Required space is linear in the size of the proof
- The procedure is incomplete due to infinite loops
- We can fix this by checking the current goal
against every goal on the stack
- The procedure is inefficient due to repeated checking of the same subgoals
- (both successful and failing goals)
- We can fix this by caching previous results
- (at the cost of extra space)
- Widely used for logic programming
39Logic programming Prolog
- Basic idea: Algorithm = Logic + Control
- An algorithm can be expressed as logical relationships
- Combined with a control strategy for determining the order of proofs attempted
- Backward chaining proof strategy
- Horn clauses + some extensions
- Especially, more than one positive literal
- Un-soundly implemented negation
- Widely used in Europe and Japan in the 1980s and early 1990s
- More recent languages
- Goedel, Mercury
- Pure logic - no unsound inferences
- Highly efficient compiler (Mercury)
- Uses proof techniques to compile away much of the inefficiency of backward chaining
- Backtracking avoided by proof techniques (may often be proven unnecessary)
- Recursions may be replaced by iterations
- Typically comparable in speed with Java
- But allows more complex algorithms to be expressed compactly
- Special strengths in generating correct code
- And proving the code is correct
40Prolog details
- Program = set of clauses of the form
- head :- literal1, ..., literaln.
- criminal(X) :- american(X), weapon(Y), sells(X,Y,Z), hostile(Z).
- Depth-first, left-to-right backward chaining
- Some non-logical features
- Unsound built-in predicates for arithmetic etc.
- X is Y*Z+3
- Built-in predicates that have side effects
- Required for input and output
- Changing knowledge base
- assert/retract predicates
- Closed-world assumption ("negation as failure")
- alive(X) :- not dead(X).
- alive(joe) is proven if dead(joe) fails
41Prolog
- Appending two lists to produce a third:
- append([],Y,Y).
- append([X|L],Y,[X|Z]) :- append(L,Y,Z).
- In predicate calculus, we would write this as
- ∀y Append(Empty, y, y)
- ∀x, y, z, l (Append(l, y, z) ⇒ Append(Join(x, l), y, Join(x, z)))
- query: append(A,B,[1,2]) ?
- answers: A=[] B=[1,2]
- A=[1] B=[2]
- A=[1,2] B=[]
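The way Prolog enumerates the answers to a query with the first two arguments unbound can be imitated in Python with a generator that follows the two clauses in order. This is only an analogy for that one query mode, not a general translation of Prolog; the function name append_solutions is an assumption of the example.

```python
def append_solutions(z):
    """Yield every pair (a, b) of lists with a + b == z, in Prolog's order."""
    # Clause 1: append([],Y,Y).
    yield [], list(z)
    # Clause 2: append([X|L],Y,[X|Z]) :- append(L,Y,Z).
    if z:
        x, rest = z[0], z[1:]
        for l, y in append_solutions(rest):
            yield [x] + l, y

for a, b in append_solutions([1, 2]):
    print("A =", a, " B =", b)
```

Trying the first clause before the second, and recursing depth-first, gives the answers in the same order as the three listed above.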
42Resolution brief summary
- Full first-order version
- From l1 ∨ ... ∨ lk and m1 ∨ ... ∨ mn,
- infer (l1 ∨ ... ∨ l(i-1) ∨ l(i+1) ∨ ... ∨ lk ∨ m1 ∨ ... ∨ m(j-1) ∨ m(j+1) ∨ ... ∨ mn)θ
- where Unify(li, ¬mj) = θ.
- The two clauses are assumed to be standardized apart so that they share no variables.
- For example, from
- ¬Rich(x) ∨ Unhappy(x)
- Rich(Ken)
- we infer
- Unhappy(Ken)
- with θ = {x/Ken}
- Apply resolution steps to CNF(KB ∧ ¬α); complete for FOL
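A single binary resolution step can be sketched in Python as follows. The representation is an assumption made for the illustration: a literal is a pair (sign, atom) with sign True for a positive literal, a clause is a list of literals, lowercase names are variables, and the clauses passed in are taken to be standardized apart already. The helper names substitute and resolvents are hypothetical. Factoring, which full first-order resolution also needs for completeness, is not included.

```python
def is_var(t):
    return isinstance(t, str) and t[:1].islower()

def walk(t, theta):
    while is_var(t) and t in theta:
        t = theta[t]
    return t

def occurs(v, t, theta):
    t = walk(t, theta)
    return t == v or (isinstance(t, tuple)
                      and any(occurs(v, a, theta) for a in t[1:]))

def unify(a, b, theta):
    a, b = walk(a, theta), walk(b, theta)
    if a == b:
        return theta
    if is_var(a):
        return None if occurs(a, b, theta) else {**theta, a: b}
    if is_var(b):
        return None if occurs(b, a, theta) else {**theta, b: a}
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and a[0] == b[0] and len(a) == len(b)):
        for x, y in zip(a[1:], b[1:]):
            theta = unify(x, y, theta)
            if theta is None:
                return None
        return theta
    return None

def substitute(t, theta):
    """Apply theta to a term or atom, expanding bindings fully."""
    t = walk(t, theta)
    if isinstance(t, tuple):
        return (t[0],) + tuple(substitute(a, theta) for a in t[1:])
    return t

def resolvents(c1, c2):
    """Yield each binary resolvent of clauses c1 and c2."""
    for i, (sign1, atom1) in enumerate(c1):
        for j, (sign2, atom2) in enumerate(c2):
            if sign1 != sign2:  # complementary literals
                theta = unify(atom1, atom2, {})
                if theta is not None:
                    rest = c1[:i] + c1[i + 1:] + c2[:j] + c2[j + 1:]
                    yield [(s, substitute(a, theta)) for s, a in rest]

c1 = [(False, ("Rich", "x")), (True, ("Unhappy", "x"))]  # ¬Rich(x) ∨ Unhappy(x)
c2 = [(True, ("Rich", "Ken"))]                           # Rich(Ken)
result = list(resolvents(c1, c2))
print(result)
```

Resolving the two example clauses on Rich gives the single clause Unhappy(Ken); an empty resolvent (an empty list) would signal a contradiction in a refutation proof.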
43Conversion to CNF
- Everyone who loves all animals is loved by someone:
- ∀x [∀y Animal(y) ⇒ Loves(x,y)] ⇒ [∃y Loves(y,x)]
- 1. Eliminate biconditionals and implications
- ∀x [¬∀y ¬Animal(y) ∨ Loves(x,y)] ∨ [∃y Loves(y,x)]
- 2. Move ¬ inwards: ¬∀x p ≡ ∃x ¬p, ¬∃x p ≡ ∀x ¬p
- ∀x [∃y ¬(¬Animal(y) ∨ Loves(x,y))] ∨ [∃y Loves(y,x)]
- ∀x [∃y ¬¬Animal(y) ∧ ¬Loves(x,y)] ∨ [∃y Loves(y,x)]
- ∀x [∃y Animal(y) ∧ ¬Loves(x,y)] ∨ [∃y Loves(y,x)]
44Conversion to CNF contd.
- 3. Standardize variables: each quantifier should use a different variable
- ∀x [∃y Animal(y) ∧ ¬Loves(x,y)] ∨ [∃z Loves(z,x)]
- 4. Skolemize: a more general form of existential instantiation.
- Each existential variable is replaced by a Skolem function of the enclosing universally quantified variables:
- ∀x [Animal(F(x)) ∧ ¬Loves(x,F(x))] ∨ Loves(G(x),x)
- 5. Drop universal quantifiers:
- [Animal(F(x)) ∧ ¬Loves(x,F(x))] ∨ Loves(G(x),x)
- 6. Distribute ∨ over ∧:
- [Animal(F(x)) ∨ Loves(G(x),x)] ∧ [¬Loves(x,F(x)) ∨ Loves(G(x),x)]
45Resolution proof definite clauses