Logic, Language and Learning

Provided by: profdrlu
1
Logic, Language and Learning
  • Chapter 14 Generality
  • Luc De Raedt

2
Generality in logic
  • Two ways of seeing generality
  • Various frameworks for generality
  • theta-subsumption and its variants
  • relative subsumption
  • (inverse) resolution

3
Generality in logic
4
Example
5
G ⊨ S
  • S follows deductively from G
  • G follows inductively from S
  • therefore induction is the inverse of deduction
  • this is an operational point of view, because
    there are many deductive operators ⊢ that
    implement ⊨
  • take any deductive operator, invert it, and one
    obtains an inductive operator

6
Various frameworks for generality
  • Depending on the form of G and S
  • single clause
  • clausal theory
  • full first order theory
  • Depending on the choice of ⊢ to invert
  • theta subsumption (most popular !)
  • implication
  • resolution

7
Subsumption in Propositional logic
  • Clause g subsumes clause s
  • if and only if g ⊨ s
  • or, equivalently,
  • g ⊆ s (as sets of literals)
  • pos :- p,q,r ⊨ pos :- p,q,r,s,t
  • because
  • {pos, ¬p, ¬q, ¬r} ⊆ {pos, ¬p, ¬q, ¬r, ¬s, ¬t}
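The subset view of propositional subsumption can be sketched in a few lines. This is only an illustration: the helper names `clause` and `subsumes` and the string encoding of negated body atoms are assumptions made here, not part of the slides.

```python
def clause(head, body):
    """Encode `head :- body` as a set of literals (body atoms are negated)."""
    return frozenset([head] + ["not " + atom for atom in body])


def subsumes(g, s):
    """Propositional subsumption: g subsumes s iff literals(g) is a subset of literals(s)."""
    return g <= s


g = clause("pos", ["p", "q", "r"])
s = clause("pos", ["p", "q", "r", "s", "t"])
print(subsumes(g, s))  # g is the more general clause
print(subsumes(s, g))
```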

8
Subsumption in propositional logic
pos
pos :- p    pos :- q    pos :- r
pos :- p,q    pos :- p,r    pos :- q,r
pos :- p,q,r
9
Subsumption in propositional logic
  • Perfect structure
  • Complete lattice
  • any two clauses have unique
  • least upper bound (least general generalization)
  • greatest lower bound
  • No syntactic variants
  • Easy specialization, generalization

10
Operators
  • Identical to those for item-sets/monomials
  • Specialization operator
  • Generalization operator

11
Ref. Operators for propositional clauses
  • Specialization operator
  • Generalization operator
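The operator definitions themselves did not survive in this transcript. Assuming, by analogy with item-sets, that specialization adds one literal and generalization drops one, a minimal sketch could look as follows (the function names and the literal encoding are hypothetical):

```python
def specializations(c, literals):
    """Minimal specializations: add one literal from the vocabulary that c lacks."""
    return [c | {lit} for lit in sorted(literals - c)]


def generalizations(c):
    """Minimal generalizations: drop one literal from c."""
    return [c - {lit} for lit in sorted(c)]


vocabulary = frozenset({"pos", "not p", "not q", "not r"})
c = frozenset({"pos", "not p"})  # encodes pos :- p

for spec in specializations(c, vocabulary):
    print("specialization:", sorted(spec))
for gen in generalizations(c):
    print("generalization:", sorted(gen))
```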

12
Subsumption in logical atoms
  • g subsumes s if and only if there is a
    substitution θ such that gθ = s
  • e.g. p(X,Y,X) subsumes p(a,Y,a)
  • e.g. p(f(X),Y) subsumes p(f(a),Y)
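Subsumption between atoms amounts to one-way matching. The sketch below assumes a representation chosen here for illustration: variables are strings starting with an uppercase letter, constants are lowercase strings, and compound terms or atoms are tuples `(functor, arg1, ..., argn)`. Variables of the specific atom s are treated as constants.

```python
def is_var(t):
    """Variables are strings that start with an uppercase letter."""
    return isinstance(t, str) and t[:1].isupper()


def match(g, s, theta=None):
    """Return a substitution theta with g*theta == s, or None if there is none."""
    theta = {} if theta is None else theta
    if is_var(g):
        if g in theta:
            return theta if theta[g] == s else None
        return {**theta, g: s}
    if isinstance(g, tuple) and isinstance(s, tuple):
        if g[0] != s[0] or len(g) != len(s):
            return None
        for a, b in zip(g[1:], s[1:]):
            theta = match(a, b, theta)
            if theta is None:
                return None
        return theta
    return theta if g == s else None


print(match(("p", "X", "Y", "X"), ("p", "a", "Y", "a")))
print(match(("p", ("f", "X"), "Y"), ("p", ("f", "a"), "Y")))
print(match(("p", "X", "X"), ("p", "a", "b")))
```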

13
Subsumption in simple logical atoms
P(X,Y,Z)
P(a,Y,Z) ... P(X,b,Z) ... P(X,Y,c)
P(a,b,Z) P(a,Y,c) ... P(X,b,c)
P(a,b,c)
14
Subsumption in simple logical atoms
P(X,Y)
P(X,X) ... P(a,Y) P(b,Y) P(X,a) P(X,b)
P(a,a) P(a,b) ... P(b,b) ...
15
Subsumption in logical atoms
P(X)
P(f(Y)) ... P(g(Y)) ... P(h(Y,Z)) ...
P(f(f(W))) P(f(g(W))) P(f(f(f(U))))
P(f(f(f(f(V))))) ...
16
Subsumption in logical atoms
  • g subsumes s if and only if there is a
    substitution θ such that gθ = s
  • Still nice properties: a complete lattice up to
    variable renaming
  • e.g. p(X,a) and p(U,a) are variants
  • greatest lower bound = unification
  • unification of p(X,a) and p(b,U) gives p(b,a)
  • least upper bound = anti-unification (lgg)
  • lgg of p(X,a,b) and p(c,a,d) is p(X,a,Y)
  • lgg of p(X,f(X,c)) and p(a,f(a,Y)) is p(U,f(U,T))
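The greatest lower bound of two atoms can be computed by standard syntactic unification with an occurs check. The sketch below is a textbook-style version under an assumed representation (uppercase strings are variables, tuples are compound terms); the function names are illustrative only.

```python
def is_var(t):
    return isinstance(t, str) and t[:1].isupper()


def walk(t, s):
    """Follow variable bindings in substitution s."""
    while is_var(t) and t in s:
        t = s[t]
    return t


def occurs(v, t, s):
    """Occurs check: does variable v appear inside term t under s?"""
    t = walk(t, s)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, s) for a in t[1:])


def unify(a, b, s=None):
    """Return a most general unifier of a and b extending s, or None."""
    s = {} if s is None else s
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        return None if occurs(a, b, s) else {**s, a: b}
    if is_var(b):
        return unify(b, a, s)
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and a[0] == b[0] and len(a) == len(b)):
        for x, y in zip(a[1:], b[1:]):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None


def substitute(t, s):
    """Apply substitution s to term t."""
    t = walk(t, s)
    if isinstance(t, tuple):
        return (t[0],) + tuple(substitute(a, s) for a in t[1:])
    return t


mgu = unify(("p", "X", "a"), ("p", "b", "U"))
print(substitute(("p", "X", "a"), mgu))  # the glb of p(X,a) and p(b,U)
```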

17
Lgg of atoms
  • lgg of terms
  • lgg(t,t) = t
  • lgg(f(s1,...,sn), f(t1,...,tn))
  • = f(lgg(s1,t1),...,lgg(sn,tn))
  • lgg(f(s1,...,sn), g(t1,...,tm)) = V, where the same
    variable V is used for this pair of terms throughout
  • lgg of atoms
  • lgg(p(s1,...,sn), p(t1,...,tn))
  • = p(lgg(s1,t1),...,lgg(sn,tn))
  • lgg(p(s1,...,sn), q(t1,...,tm)) is undefined
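The recursive definition above translates almost directly into code. In this sketch, terms are nested tuples `(functor, arg1, ..., argn)` or strings, and a table maps each pair of differing terms to one fresh variable so that the same pair is always generalized the same way. The generated names `V0`, `V1`, ... are an assumption and are presumed not to occur in the input.

```python
def lgg_term(s, t, table):
    """Anti-unify two terms; `table` maps (s, t) pairs to their shared variable."""
    if s == t:
        return s
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and s[0] == t[0] and len(s) == len(t)):
        return (s[0],) + tuple(lgg_term(a, b, table)
                               for a, b in zip(s[1:], t[1:]))
    if (s, t) not in table:
        table[(s, t)] = "V%d" % len(table)  # fresh variable for this pair
    return table[(s, t)]


def lgg_atom(a, b):
    """lgg of two atoms; None stands for 'undefined' (different predicates)."""
    if a[0] != b[0] or len(a) != len(b):
        return None
    return lgg_term(a, b, {})


print(lgg_atom(("p", "X", "a", "b"), ("p", "c", "a", "d")))
print(lgg_atom(("p", "X", ("f", "X", "c")), ("p", "a", ("f", "a", "Y"))))
print(lgg_atom(("p", "a"), ("q", "a")))
```

The results agree with the examples on the previous slide up to variable renaming.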

18
Operators
  • Ideal specialization operator
  • apply a substitution {X / Y} where X,Y already
    appear in the atom
  • apply a substitution {X / f(Y1,...,Yn)} where the
    Yi are new variables
  • apply a substitution {X / c} where c is a
    constant
  • Ideal generalization operator
  • apply an inverse substitution
  • An inverse substitution replaces terms at
    specified places by variables
  • Invert one of the specialization steps above
  • Replace some (but not all) occurrences of a
    variable X by a different variable Y
  • Replace all occurrences of a term f(Y1,...,Yn),
    where the Yi are distinct, by a new variable X
  • Replace some occurrences of a constant by a new
    variable

19
Ideal Specialization Operator
20
Optimal Specialization Operator
21
(No Transcript)
22
Inverting substitutions
23
(No Transcript)
24
Operators
  • Generalization
  • turn term into variable
  • p(a,f(b)) becomes p(X,f(b)) or p(a,f(X))
  • Inv. substitutions: {<1> / X} or {<2,1> / X}
  • p(a,a) becomes p(X,X) or p(a,X) or p(X,a)
  • Inv. substitutions: {<1> / X, <2> / X}, {<2> / X},
    {<1> / X}
  • They all invert {X / a}
  • replace two occurrences of a variable X by X1 and
    X2
  • p(X,X) becomes p(X1,X2)

25
Theta-subsumption (Plotkin 70)
  • Most important framework for inductive logic
    programming. Used by all major ILP systems.
  • S and G are single clauses
  • Combines propositional subsumption and
    subsumption on logical atoms
  • c1 theta-subsumes c2 if and only if there is a
    substitution θ such that c1θ ⊆ c2
  • c1 = father(X,Y) :- parent(X,Y), male(X)
  • c2 = father(jef,paul) :- parent(jef,paul),
    parent(jef,an), male(jef), female(an)
  • θ = {X / jef, Y / paul}
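A theta-subsumption test can be sketched as a backtracking search that maps each literal of c1 onto some literal of c2 under one consistent substitution. The representation is an assumption made for this example: a clause is a list of literals, a positive literal is an atom tuple, a negative literal is `("not", atom)`, and uppercase strings are variables. The search is exponential in the worst case, in line with the NP-completeness of the problem.

```python
def is_var(t):
    return isinstance(t, str) and t[:1].isupper()


def match(g, s, theta):
    """Extend theta so that g*theta == s, or return None (theta is not mutated)."""
    if is_var(g):
        if g in theta:
            return theta if theta[g] == s else None
        return {**theta, g: s}
    if isinstance(g, tuple) and isinstance(s, tuple):
        if g[0] != s[0] or len(g) != len(s):
            return None
        for a, b in zip(g[1:], s[1:]):
            theta = match(a, b, theta)
            if theta is None:
                return None
        return theta
    return theta if g == s else None


def theta_subsumes(c1, c2, theta=None):
    """Return theta with c1*theta a subset of c2, or None if no such theta exists."""
    theta = {} if theta is None else theta
    if not c1:
        return theta
    first, rest = c1[0], c1[1:]
    for lit in c2:                      # try every target literal, backtrack on failure
        extended = match(first, lit, theta)
        if extended is not None:
            result = theta_subsumes(rest, c2, extended)
            if result is not None:
                return result
    return None


c1 = [("father", "X", "Y"),
      ("not", ("parent", "X", "Y")), ("not", ("male", "X"))]
c2 = [("father", "jef", "paul"),
      ("not", ("parent", "jef", "paul")), ("not", ("parent", "jef", "an")),
      ("not", ("male", "jef")), ("not", ("female", "an"))]
print(theta_subsumes(c1, c2))
print(theta_subsumes(c2, c1))
```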

26
  • d1 = p(X,Y) :- q(X,Y), q(Y,X)
  • d2 = p(Z,Z) :- q(Z,Z)
  • d3 = p(a,a) :- q(a,a)
  • θ(1,2) = {X / Z, Y / Z}
  • θ(2,3) = {Z / a}
  • d1 is a generalization of d3
  • Mapping several literals onto one leads
    (sometimes) to combinatorial problems

27
Properties
  • Soundness: if c1 theta-subsumes c2 then
  • c1 ⊨ c2
  • Incompleteness (but only for self-recursive
    clauses) w.r.t. logical entailment
  • c1 = p(f(X)) :- p(X)
  • c2 = p(f(f(Y))) :- p(Y)
  • here c1 ⊨ c2, but c1 does not theta-subsume c2
  • Decidable (but NP-complete)
  • transitive and reflexive but not anti-symmetric

28
Structure
p(X,Y) :- m(X,Y)    p(X,Y) :- m(X,Y), m(X,Z)    p(X,Y) :- m(X,Y), m(X,Z), m(X,U)    ...
lgg
p(X,Y) :- m(X,Y), s(X)    p(X,Y) :- m(X,Y), m(X,Z), s(X)    ...
p(X,Y) :- m(X,Y), r(X)    p(X,Y) :- m(X,Y), m(X,Z), r(X)    ...
p(X,Y) :- m(X,Y), s(X), r(X)    p(X,Y) :- m(X,Y), m(X,Z), s(X), r(X)    ...
glb
reduced
29
Properties (2)
  • Equivalence classes c
  • parent(X,Y) :- mother(X,Y), mother(X,Z)
  • parent(X,Y) :- mother(X,Y)
  • c1 is a reduced clause of c2 iff c1 is a minimal subset
    of the literals of c2 that is equivalent to c2
  • parent(X,Y) :- mother(X,Y), mother(X,Z)
  • parent(X,Y) :- mother(X,Y) (reduced form)
  • this gives an algorithm for reduction
  • the reduced clause is the representative of its
    equivalence class, unique up to variable renaming
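One way to turn the definition into a reduction procedure is to drop a literal whenever the clause still theta-subsumes what remains; the remainder is then equivalent to the original. The sketch below follows that idea under an assumed representation (positive literals are atom tuples, negative literals are `("not", atom)`, uppercase strings are variables). It is an illustration, not necessarily the algorithm the slides had in mind.

```python
def is_var(t):
    return isinstance(t, str) and t[:1].isupper()


def match(g, s, theta):
    """Extend theta so that g*theta == s, treating variables of s as constants."""
    if is_var(g):
        if g in theta:
            return theta if theta[g] == s else None
        return {**theta, g: s}
    if isinstance(g, tuple) and isinstance(s, tuple):
        if g[0] != s[0] or len(g) != len(s):
            return None
        for a, b in zip(g[1:], s[1:]):
            theta = match(a, b, theta)
            if theta is None:
                return None
        return theta
    return theta if g == s else None


def theta_subsumes(c1, c2, theta=None):
    """Backtracking search for theta with c1*theta a subset of c2."""
    theta = {} if theta is None else theta
    if not c1:
        return theta
    for lit in c2:
        extended = match(c1[0], lit, theta)
        if extended is not None:
            result = theta_subsumes(c1[1:], c2, extended)
            if result is not None:
                return result
    return None


def reduce_clause(c):
    """Drop every literal whose removal keeps the clause equivalent."""
    c = list(c)
    for lit in list(c):
        rest = [l for l in c if l != lit]
        # rest is a subset of c, so rest subsumes c; equivalence needs c to subsume rest
        if theta_subsumes(c, rest) is not None:
            c = rest
    return c


c = [("parent", "X", "Y"),
     ("not", ("mother", "X", "Y")), ("not", ("mother", "X", "Z"))]
print(reduce_clause(c))
```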

30
(No Transcript)
31
Properties (3)
  • Equivalence classes induce a lattice L
  • any two equivalence classes have least upper
    bound (least general generalization - lgg)
  • any two equivalence classes have greatest lower
    bound
  • infinite descending and ascending chains exist,
    e.g.
  • :- p(X1,X2), p(X2,X1)
  • :- p(X1,X2), p(X2,X1), p(X1,X3), p(X3,X1), p(X2,X3),
    p(X3,X2)
  • :- p(Xi,Xj) for all i ≠ j with i and j between
    1 and n
  • ...
  • :- p(X1,X1)

32
Lgg of clauses
  • lgg of literals (= atoms or negated atoms)
  • lgg(atom1,atom2): see above
  • lgg(not atom1, not atom2) = not lgg(atom1, atom2)
  • lgg(not atom1, atom2) is undefined
  • lgg of clauses
  • lgg({l1,...,lm}, {k1,...,kn}) = { lgg(li,kj) |
    lgg(li,kj) is defined }
  • f(t,a) :- p(t,a), m(t), f(a)
  • f(j,p) :- p(j,p), m(j), m(p)
  • lgg = f(X,Y) :- p(X,Y), m(X), m(Z)
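The clause-level lgg pairs up every compatible literal of the two clauses and shares one variable table across all pairs. In this sketch a literal is assumed to be a pair `(sign, atom)` with `True` for the head and `False` for body atoms; the fresh variable names `V0`, `V1`, ... are generated by the code and are not from the slides.

```python
def lgg_term(s, t, table):
    """Anti-unify two terms, reusing one variable per distinct pair of terms."""
    if s == t:
        return s
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and s[0] == t[0] and len(s) == len(t)):
        return (s[0],) + tuple(lgg_term(a, b, table)
                               for a, b in zip(s[1:], t[1:]))
    if (s, t) not in table:
        table[(s, t)] = "V%d" % len(table)
    return table[(s, t)]


def lgg_literal(l, k, table):
    """lgg of two literals; None if sign, predicate or arity differ (undefined)."""
    (sign_l, atom_l), (sign_k, atom_k) = l, k
    if sign_l != sign_k or atom_l[0] != atom_k[0] or len(atom_l) != len(atom_k):
        return None
    return (sign_l, lgg_term(atom_l, atom_k, table))


def lgg_clause(c1, c2):
    """Collect lgg(l, k) for every pair of literals for which it is defined."""
    table, result = {}, []
    for l in c1:
        for k in c2:
            g = lgg_literal(l, k, table)
            if g is not None and g not in result:
                result.append(g)
    return result


c1 = [(True, ("f", "t", "a")),
      (False, ("p", "t", "a")), (False, ("m", "t")), (False, ("f", "a"))]
c2 = [(True, ("f", "j", "p")),
      (False, ("p", "j", "p")), (False, ("m", "j")), (False, ("m", "p"))]
for literal in lgg_clause(c1, c2):
    print(literal)
```

With V0, V1, V2 read as X, Y, Z, the output corresponds to f(X,Y) :- p(X,Y), m(X), m(Z).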

33
Refinement operators
  • In general,
  • optimal and ideal operators do not exist for
    theta-subsumption
  • due to the infinite ascending chains, the result
    may not be finite (and can therefore not be
    computed)

34
Generalization operator
  • On single clause
  • should return all proper minimal generalizations
    of given clause
  • problematic for some clauses
  • e.g. h(X,X) :- p(X,X)
  • infinite clauses !
  • Bound the size of clauses
  • better to start from two clauses and apply lgg

35
Generalization operator
  • On single clause
  • should return all proper minimal generalizations
    of given clause
  • problematic for some clauses
  • e.g. :- p(X,X)
  • infinite clause !
  • Ideal generalization operator does not exist.
  • Similarly for specialization (but more
    complicated)
  • Ideal operator does not exist
  • Some clauses have an infinite number of proper
    minimal specializations

36
Specialization Operators
  • Pragmatic solution
  • rho(c) = { c' | c' is a maximally general
    specialization of c } (theory)
  • rho(c) ⊆ { c ∪ {l} | l is a literal } ∪ { cθ | θ
    is a substitution } (practice)
  • rho(parent(X,Y)) includes
  • parent(X,X)
  • parent(X,Y) :- male(X)
  • parent(X,Y) :- parent(Y,Z)
  • ...

37
d = daughter, p = parent, f = female, m = male
d(X,Y)
d(X,Y) :- p(X,Z)
d(X,X)
d(X,Y) :- p(Y,X)
d(X,Y) :- f(X)
d(X,Y) :- f(X), p(X,Y)
d(X,Y) :- f(X), f(Y)
38
Variants of theta-subsumption
  • Inverting implication
  • to resolve the incompleteness of
    theta-subsumption w.r.t. entailment
  • OI subsumption
  • to resolve the problems w.r.t. the syntactic
    variants, the non-existence of ideal operators

39
OI subsumption
40
(No Transcript)
41
Inverting implication
  • Framework addresses the incompleteness of
    theta-subsumption (Muggleton, AIJ, to appear;
    Idestam-Almquist, JAIR)
  • main issue: find an lgg under implication
  • c = p(f(X)) :- p(X)
  • d = p(f(f(X))) :- p(X)
  • c does not theta-subsume d, but c ⊨ d
  • lgg(c,d) under implication is not unique
  • p(f(X)) :- p(X)
  • p(f(f(X))) :- p(Y)
  • Learning (recursive) clauses from few examples
  • computationally expensive

42
Relative generalization
  • Using background theory in the generality
    relation
  • B a set of definite clauses
  • g and s single clauses
  • again, various frameworks exist, due to the choice
    of ⊢ to implement

43
Basic ideas
  • bottom clause
  • the most specific clause covering a specific
    clause w.r.t. B
  • least general generalization relative to the
    background theory, the rlgg

44
Bottom clauses
45
Algorithm
46
Relative lgg
47
Relative lgg (Plotkin 71)
  • Relative to background theory B (here B is a set
    of ground facts, the model of the background
    theory)
  • B may be computed from Program
  • let e1 and e2 be two facts
  • rlgg(e1,e2) = lgg(e1 :- B, e2 :- B)
  • the basis of the Golem system (Muggleton and
    Feng, ALT90)

48
Example RLGG
  • Let
  • e1 = fa(t,a)
  • e2 = fa(j,p)
  • B = {p(t,a), m(t), f(a), p(j,p), m(j), m(p)}
    (all true facts)
  • Then H = rlgg(e1,e2)
  • = lgg( (fa(t,a) :- p(t,a), m(t), f(a), p(j,p), m(j),
    m(p)),
  • (fa(j,p) :- p(t,a), m(t), f(a), p(j,p),
    m(j), m(p)) )
  • = fa(Vtj,Vap) :- p(t,a), m(t), f(a), p(j,p),
    m(j), m(p),
  • p(Vtj,Vap), m(Vtj), m(Vtp),
    p(Vjt,Vpa),
  • m(Vjt), m(Vjp), m(Vpt),
    m(Vpj)

49
Simplify RLGG
  • The rlgg will be used relative to B to check
    coverage, i.e. H ∧ B ⊨ e
  • therefore B in H is redundant and can be deleted
  • reduction with respect to the background B
  • gives H =
  • fa(Vtj,Vap) :- p(Vtj,Vap), m(Vtj), m(Vtp),
    p(Vjt,Vpa),
  • m(Vjt), m(Vjp), m(Vpt),
    m(Vpj)
  • further reduction (under theta-subsumption)
    gives
  • fa(Vtj,Vap) :- p(Vtj,Vap), m(Vtj)
  • which is what was wanted

50
Simplify RLGG
  • RLGG is often used in simplified form
  • using BIAS
  • Bias is anything which influences the learning
    process and which is not justified by the data
  • Here: syntactic restrictions on the clauses to be
    induced
  • rlgg(e1,e2) then becomes
  • lgg(e1 :- bias(e1,B), e2 :- bias(e2,B))
  • where bias would compute the relevant literals in B
  • given the arguments of ei
  • e.g. in the previous example
  • bias(fa(j,p),B) = {p(j,p), m(j), m(p)}
  • bias would be a bias of the system
  • used by Golem (Muggleton and Feng, ALT90)

51
Inverting Resolution
  • G and S are sets of clauses and ⊢ is resolution
  • Absorption
  • from q :- A and p :- A,B
  • infer p :- q,B
  • Identification
  • from p :- A,B and p :- q,B
  • infer q :- A

p :- q,B    q :- A
p :- A,B
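For the propositional case, absorption can be sketched directly on clause bodies represented as sets. The pair encoding `(head, body)` and the function name are assumptions made for this illustration; first-order absorption would additionally need substitutions.

```python
def absorption(c1, c2):
    """From c1 = q :- A and c2 = p :- A,B infer p :- q,B (None if A is not in c2's body)."""
    q, a = c1
    p, body = c2
    if not a <= body:
        return None
    return (p, (body - a) | {q})  # replace A by q, keep the remaining literals B


c1 = ("q", frozenset({"a1", "a2"}))
c2 = ("p", frozenset({"a1", "a2", "b"}))
head, body = absorption(c1, c2)
print(head, ":-", ", ".join(sorted(body)))
```

Resolving the inferred clause p :- q,B with q :- A gives back p :- A,B, which is the sense in which the operator inverts a resolution step.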
52
Inverting Resolution
q :- B    p :- A,q    q :- C
  • Intra construction
  • from p :- A,B and p :- A,C
  • infer q :- B and p :- A,q and q :- C
  • Inter construction
  • from p :- A,B and q :- A,C
  • infer p :- r,B and r :- A and q :- r,C
  • Invent new predicates
  • apply intra construction on
  • grandparent(X,Y) :- father(X,Z), father(Z,Y)
  • grandparent(X,Y) :- father(X,Z), mother(Z,Y)

p :- A,B    p :- A,C
53
Problems
  • Results of inverse resolution are not unique
  • father(j,p) :- male(j)
  • parent(j,p)
  • gives
  • father(j,p) :- male(j), parent(j,p)
  • or
  • father(X,Y) :- male(X), parent(X,Y)
  • by inverse resolution

54
Example inverse resolution
m(j)
f(X,Y) :- p(X,Y), m(X)
f(j,Y) :- p(j,Y)
p(j,m)
f(j,m)
55
grandparent(X,Y) :- father(X,Z), parent(Z,Y)
father(X,Y) :- male(X), parent(X,Y)
grandparent(X,Y) :- male(X), parent(X,Z), parent(Z,Y)
male(jef)
grandparent(jef,Y) :- parent(jef,Z), parent(Z,Y)
parent(jef,an)
grandparent(jef,Y) :- parent(an,Y)
parent(an,paul)
grandparent(jef,paul)
56
Inverse resolution and bottom clauses
  • Production of the bottom clause can be regarded
    as repeated application of inverse resolution
  • consider
  • pos(X) :- red(X), square(X)
  • polygon(X) :- square(X)
  • inverse resolution could give
  • pos(X) :- red(X), polygon(X) but also
  • pos(X) :- red(X), polygon(X), square(X)
  • applying last choice systematically and
    repeatedly will result in bottom clause.
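The repeated application described here can be sketched for a propositional simplification of the example (the variable X is dropped). The `saturate` function and the `(head, body)` encoding are hypothetical names chosen for this illustration: whenever the body of a background clause is contained in the current body, its head is added and the matched literals are kept.

```python
def saturate(clause, background):
    """Add the head of every applicable background clause until nothing changes."""
    head, body = clause
    body = set(body)
    changed = True
    while changed:
        changed = False
        for q, a in background:
            if set(a) <= body and q not in body:
                body.add(q)          # keep the literals of A, add q as well
                changed = True
    return head, frozenset(body)


background = [("polygon", {"square"})]
example = ("pos", {"red", "square"})
head, body = saturate(example, background)
print(head, ":-", ", ".join(sorted(body)))
```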

57
Conclusions
  • Many frameworks exist; they have different
    purposes
  • Most important
  • theta-subsumption

58
Not part of the exam
  • Section 4.11 (working with borders)
  • Sections marked with a in Chapter 5 except for
    RLGG