1
Inductive Logic Programming
  • and its use in Data Mining
  • Filip Zelezny, CVUT Prague
  • March 2001

2
Structure of Talk
  • Motivation, ILP Concept
  • Basic Technique
  • Some Applications
  • Novel Approaches
  • Conclusions

3
Points of View
  • Software Engineering View
  • ILP synthesizes logic programs from examples
  • ... but the programs may be used for data
    classification ...
  • Machine Learning View
  • ILP develops theories about data using predicate
    logic
  • ... but the theories are as expressive as
    algorithms (Turing machine) ...

4
A Motivation
5
Data Mining Example 1
  • Table of cars
  • Predict the attribute 'affordable'!
  • Rule discovered:
  • size=small & luxury=low → affordable
  • Attribute-value learning is appropriate.
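
The discovered rule (size=small & luxury=low implies affordable) can be read as a plain test on each row of the table. A minimal sketch, assuming a small hypothetical table; the rows and the names `cars` and `affordable` are invented for illustration and are not the table from the slide:

```python
# Hypothetical car table; these rows are illustrative, not the slide's data.
cars = [
    {"name": "car_a", "size": "small", "luxury": "low"},
    {"name": "car_b", "size": "small", "luxury": "high"},
    {"name": "car_c", "size": "large", "luxury": "low"},
]

def affordable(car):
    """Apply the rule: size=small & luxury=low -> affordable."""
    return car["size"] == "small" and car["luxury"] == "low"

# Names of the cars that the rule predicts to be affordable.
predicted = [car["name"] for car in cars if affordable(car)]
print(predicted)
```

Each row is a fixed-length vector of attribute values, which is why an attribute-value learner is sufficient here.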

6
Data Mining Example 2 (1) [L. De Raedt, 2000]
  • Positive Examples
  • Negative Examples

7
Data Mining Example 2 (2) [L. De Raedt, 2000]
  • How to represent in attribute-value learning (AVL)?
  • Assume a fixed number of objects
  • Problem 1: exchanging objects 1 and 2
  • exponential number of different representations
    for the same entity
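
The blow-up caused by object order can be counted directly. A minimal sketch, assuming a hypothetical scene of three distinct objects stored in fixed columns (object 1, object 2, object 3); the scene itself is invented for illustration:

```python
from itertools import permutations

# Hypothetical scene with three objects, each described by (shape, orientation).
scene = [("triangle", "up"), ("circle", "none"), ("square", "none")]

# Every ordering of the objects fills the fixed columns differently,
# so each ordering is a different table row for the same scene.
rows = {tuple(p) for p in permutations(scene)}
print(len(rows))  # 6 rows (3!) for one and the same scene
```

With n distinct objects there are n! such rows, so a learner has to see or generalise over all of them to recognise a single entity.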

8
Data Mining Example 2 (3) [L. De Raedt, 2000]
  • Problem 2: Positional relations
  • explosion of false attributes
  • Problem 3: Variable number of objects
  • explosion of empty fields
  • explosion of entire table
  • We need a more powerful representation!
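
One way to picture a more powerful representation is to describe each example by a set of facts about its objects instead of one fixed-width row. A minimal sketch under that assumption; the object names (`t1`, `c1`, `c2`) and relation names are hypothetical:

```python
# Each example is a set of ground facts (tuples); names are hypothetical.
scene_1 = {("triangle", "t1"), ("circle", "c1"), ("inside", "c1", "t1")}

# The same facts listed in another order: still the same example.
scene_2 = {("circle", "c1"), ("inside", "c1", "t1"), ("triangle", "t1")}

# A scene with one more object simply has more facts, no empty fields.
scene_3 = scene_1 | {("circle", "c2"), ("right_of", "c2", "t1")}

same = scene_1 == scene_2
print(same, len(scene_1), len(scene_3))
```

Object order no longer matters, relations are stated only where they hold, and the number of objects can vary from example to example.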

9
The language of Prolog
10
The Language of Prolog - Informal Introduction (1)
  • Ground facts (predicates with constants)
  • add(1,1,2).
  • Variables
  • add(X,0,X).
  • Functions
  • e.g. s(X) - successor of X
  • Rules (implications)
  • add(0,X,X).
  • add(s(X),Y,s(Z)) :- add(X,Y,Z).
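
The two clauses for add over successor terms can be mirrored in a few lines. This is a sketch only: it reads the relation in one direction (first two arguments given, third computed), whereas the Prolog clauses define a relation. The term encoding (`0` for zero, a tuple `("s", X)` for s(X)) and the helper `to_int` are assumptions made for the illustration:

```python
def s(x):
    """Build the successor term s(x)."""
    return ("s", x)

def add(x, y):
    """Mirror add(0,Y,Y) and add(s(X),Y,s(Z)) :- add(X,Y,Z)."""
    if x == 0:
        return y                 # base clause: 0 + Y = Y
    _, pred = x                  # x has the form s(pred)
    return s(add(pred, y))       # recursive clause

def to_int(term):
    """Count the nested s(...) wrappers of a term."""
    n = 0
    while term != 0:
        _, term = term
        n += 1
    return n

two = s(s(0))
three = s(s(s(0)))
print(to_int(add(two, three)))  # 5
```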

11
The Language of Prolog - Informal Introduction (2)
  • Invertibility
  • minus(A,B,C) :- add(B,C,A).
  • Functions can be avoided (flattening)
  • suc(X,Y) :- X is Y-1. (built-in arithmetic)
  • add(0,X,X).
  • add(X,Y,Z) :- suc(A,X), suc(B,Z), add(A,Y,B).
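
Invertibility means that the same add relation can be queried with different arguments left open, which is how minus is obtained from add. A rough sketch of that idea, assuming a bounded domain 0..limit so that enumeration terminates; `add_solutions` and `limit` are hypothetical names, not part of the slides:

```python
def add_solutions(x=None, y=None, z=None, limit=10):
    """Enumerate (x, y, z) with x + y == z over 0..limit.

    Arguments that are given act as bound variables; None means unbound.
    """
    for a in range(limit + 1):
        if x is not None and a != x:
            continue
        for b in range(limit + 1):
            if y is not None and b != y:
                continue
            c = a + b
            if c <= limit and (z is None or c == z):
                yield (a, b, c)

def minus(a, b):
    """minus(A,B,C) :- add(B,C,A): find every C with B + C = A."""
    return [c for (_, c, _) in add_solutions(x=b, z=a)]

print(minus(5, 2))                # [3]
print(list(add_solutions(z=2)))   # [(0, 2, 2), (1, 1, 2), (2, 0, 2)]
```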

12
The ILP Concept
13
Deduction (in Logic Programming)
A priori (background) knowledge about integers:
suc(X,Y) :- X is Y-1.
Theory (hypothesis) about addition:
add(0,X,X).
add(X,Y,Z) :- suc(A,X), suc(B,Z), add(A,Y,B).
Positive examples of addition:
add(1,1,2), add(3,5,8), add(4,1,5), ...
Negative examples of addition:
add(1,3,5), add(8,7,6), add(1,1,1), ...
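
Deduction runs from the background knowledge and the theory to the examples: the theory decides which add facts hold. A minimal sketch of that check, assuming the flattened theory add(0,X,X) and add(X,Y,Z) :- suc(A,X), suc(B,Z), add(A,Y,B); the function names are hypothetical, and the guard for negative first arguments is added only so that the recursion terminates:

```python
def suc(y):
    """suc(X,Y) :- X is Y-1: return the X that precedes Y."""
    return y - 1

def add_holds(x, y, z):
    """True if add(x,y,z) follows from the theory."""
    if x == 0:
        return y == z                      # add(0,X,X).
    if x < 0:
        return False                       # guard: recursion would not terminate
    return add_holds(suc(x), y, suc(z))    # recursive clause

positives = [(1, 1, 2), (3, 5, 8), (4, 1, 5)]
negatives = [(1, 3, 5), (8, 7, 6), (1, 1, 1)]

print([add_holds(*e) for e in positives])  # [True, True, True]
print([add_holds(*e) for e in negatives])  # [False, False, False]
```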
14
Induction (in Inductive Logic Programming)
A priori (background) knowledge about integers:
suc(X,Y) :- X is Y-1.
Positive and negative examples of addition:
add(1,1,2), add(3,5,8), add(4,1,5), ...
add(1,3,5), add(8,7,6), add(1,1,1), ...
Theory (hypothesis) about addition:
add(0,X,X).
add(X,Y,Z) :- suc(A,X), suc(B,Z), add(A,Y,B).
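
Induction runs the other way: given the examples, look for a theory that covers all positives and no negatives. The sketch below only tests three hand-written candidate theories against the examples; a real ILP system would search a space of clauses instead of a fixed list. All function names are hypothetical:

```python
positives = [(1, 1, 2), (3, 5, 8), (4, 1, 5)]
negatives = [(1, 3, 5), (8, 7, 6), (1, 1, 1)]

def too_general(x, y, z):
    return True                       # add(X,Y,Z).

def base_only(x, y, z):
    return x == 0 and y == z          # add(0,X,X).

def recursive(x, y, z):
    # add(0,X,X). plus add(X,Y,Z) :- suc(A,X), suc(B,Z), add(A,Y,B).
    if x == 0:
        return y == z
    if x < 0:
        return False                  # guard so the recursion terminates
    return recursive(x - 1, y, z - 1)

candidates = {
    "too_general": too_general,
    "base_only": base_only,
    "recursive": recursive,
}

def consistent(h):
    """A hypothesis must cover every positive and no negative example."""
    return all(h(*e) for e in positives) and not any(h(*e) for e in negatives)

accepted = [name for name, h in candidates.items() if consistent(h)]
print(accepted)  # ['recursive']
```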
15
Basic ILP Technique (1)
  • Search through a clause implication lattice
  • From general to specific (top-down)
  • From specific to general (bottom-up)

add(X,Y,Z)
add(X,Y,Z) :- suc(A,X)
add(X,Y,Z) :- suc(B,Z)
add(X,Y,Z) :- suc(A,X), suc(B,X) ... etc.
add(X,Y,Z) :- suc(A,X), suc(B,Z), add(A,Y,B)
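
The top-down direction can be sketched as repeatedly adding one literal to the body of a clause, starting from the most general clause add(X,Y,Z). This is a simplified illustration: the refinement step below only picks literals from a fixed pool, whereas real refinement operators also introduce and unify variables. The names `LITERALS`, `refinements` and `canonical` are hypothetical:

```python
HEAD = "add(X,Y,Z)"
# Candidate body literals, taken from the clauses shown on the slide.
LITERALS = ["suc(A,X)", "suc(B,Z)", "add(A,Y,B)"]

def refinements(body):
    """All bodies obtained by adding one unused literal."""
    return [body + (lit,) for lit in LITERALS if lit not in body]

def canonical(body):
    """Order literals consistently so equal bodies compare equal."""
    return tuple(sorted(body, key=LITERALS.index))

def show(body):
    return HEAD if not body else HEAD + " :- " + ", ".join(body)

# Breadth-first walk from the most general clause (empty body) downwards.
level = [()]
seen = []
while level:
    seen.extend(level)
    next_level = {canonical(r) for b in level for r in refinements(b)}
    level = sorted(next_level, key=lambda b: [LITERALS.index(l) for l in b])

for body in seen:
    print(show(body))
```

The walk visits eight clauses, from the empty body down to the full recursive clause; a clause whose body is a subset of another one is the more general of the two.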
16
Basic ILP Technique (2)
  • Clauses are usually constructed one by one
  • e.g. specialize until the clause covers no negatives, then
    begin a new clause for the rest of the positives
  • Implication is undecidable
  • instead use syntactic subsumption (NP-hard)
  • measure generality of a clause with respect to background
    knowledge
  • Efficiency: use strong bias!
  • syntactical: indicate input/output variables, maximum clause length
  • semantical: e.g. preference heuristics
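
The one-by-one construction of clauses can be sketched as a covering loop: specialize a clause until it covers no negatives, keep it, and start again on the positives that are still uncovered. The sketch below is propositional for brevity (literals are attribute=value tests rather than first-order literals), uses a simple greedy choice of literal, and runs on invented data that is assumed to contain no positive identical to a negative:

```python
# Hypothetical toy data; every example has the same three attributes.
positives = [
    {"size": "small", "luxury": "low", "age": "new"},
    {"size": "small", "luxury": "low", "age": "old"},
    {"size": "large", "luxury": "low", "age": "old"},
]
negatives = [
    {"size": "small", "luxury": "high", "age": "new"},
    {"size": "large", "luxury": "low", "age": "new"},
    {"size": "large", "luxury": "high", "age": "old"},
]
LITERALS = [("size", "small"), ("size", "large"), ("luxury", "low"),
            ("luxury", "high"), ("age", "new"), ("age", "old")]

def covers(clause, example):
    return all(example[attr] == value for attr, value in clause)

def learn_clause(pos, neg):
    """Specialize the empty clause until it covers no negative example."""
    clause = []
    while any(covers(clause, n) for n in neg):
        # Only literals that keep at least one positive covered are allowed.
        options = [lit for lit in LITERALS if lit not in clause
                   and any(covers(clause + [lit], p) for p in pos)]
        # Greedy choice: fewest negatives covered, then most positives covered.
        best = min(options, key=lambda lit: (
            sum(covers(clause + [lit], n) for n in neg),
            -sum(covers(clause + [lit], p) for p in pos)))
        clause.append(best)
    return clause

def covering(pos, neg):
    """Learn clauses one by one until every positive is covered."""
    theory, remaining = [], list(pos)
    while remaining:
        clause = learn_clause(remaining, neg)
        theory.append(clause)
        remaining = [p for p in remaining if not covers(clause, p)]
    return theory

theory = covering(positives, negatives)
for clause in theory:
    print(" & ".join(f"{a}={v}" for a, v in clause))
```

The fixed literal pool and the greedy preference play the role of a (very strong) syntactic and semantic bias.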

17
Applications
18
Protein Structure Prediction (1) [Muggleton, 1992]
  • Predict the secondary structure of a protein
  • Examples:
  • alpha(Protein, Position). - residue at Position
    in Protein is in an alpha helix.
  • negatives: all other residues
  • Background knowledge:
  • position(Protein, Pos, Residue)
  • chem. properties of Residues
  • basic arithmetic
  • etc.

19
Protein Structure Prediction (2) [Muggleton, 1992]
  • Results

alpha0(A,B) :- ..., position(A,D,O),
not_aromatic(O), small_or_polar(O),
position(A,B,C), very_hydrophobic(C),
not_aromatic(C), ... etc. (22 literals)
  • added to background knowledge, then 2nd search

alpha1(A,B) :- oct(D,E,F,G,B,H,I,J,K),
alpha0(A,F), alpha0(A,G).
  • again added to B for the 3rd search

alpha2(A,B) :- oct(C,D,E,F,B,G,H,I,J),
alpha1(A,B), alpha1(A,G), alpha1(A,H).
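
The layering of the three searches can be sketched by applying the alpha1 and alpha2 rules to a set of first-level predictions. The sketch makes several assumptions that are not stated on the slide: oct relates nine consecutive positions with B in the middle, a single protein is considered (so the Protein argument is dropped), and the alpha0 predictions are invented numbers:

```python
def oct_windows(length):
    """Yield 9-tuples of consecutive positions 1..length (assumed meaning of oct)."""
    for start in range(1, length - 7):
        yield tuple(range(start, start + 9))

def alpha1(alpha0, length):
    """alpha1(B) :- oct(D,E,F,G,B,H,I,J,K), alpha0(F), alpha0(G)."""
    return {w[4] for w in oct_windows(length)
            if w[2] in alpha0 and w[3] in alpha0}

def alpha2(a1, length):
    """alpha2(B) :- oct(C,D,E,F,B,G,H,I,J), alpha1(B), alpha1(G), alpha1(H)."""
    return {w[4] for w in oct_windows(length)
            if w[4] in a1 and w[5] in a1 and w[6] in a1}

# Hypothetical first-level predictions for a protein of 20 residues.
alpha0 = {5, 6, 7, 8, 9, 12}
length = 20

a1 = alpha1(alpha0, length)
a2 = alpha2(a1, length)
print(sorted(a1))  # [7, 8, 9, 10]
print(sorted(a2))  # [7, 8]
```

Each level uses the previous level's predictions as background knowledge, so isolated predictions such as position 12 are dropped.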
20
Protein Structure Prediction (3) [Muggleton, 1992]
  • Final accuracy on the testing set: 81%
  • Best previous result (neural net): 76%
  • General-purpose bottom-up ILP system Golem used.
  • Experiment published in the 'Protein
    Engineering' journal.

21
Mutagenicity Prediction [Srinivasan, 1995]
  • Predict mutagenicity (carcinogenicity) of
    chemicals with the general system Progol [Muggleton]
  • Examples: compounds
  • Active / Inactive
  • Result: structural alert

22
Data Mining in Telephony [Zelezny et al., 2000]
  • Discover frequent patterns of operations in an
    enterprise telephone exchange
  • Examples: history of calls and related attributes
  • Result: e.g. the rule (lower case = constant)

redirection(A,B,C,10) :- day(tuesday,A),
prefix(C,[5,0],2).
  • covers

redirection(15, [13,14,48], [5,0,0,0,0,0,0,0], 10).
redirection(15, [14,18,58], [5,0,9,6,0,1,8,9], 10).
redirection(22, [18,50,30], [5,0,0,0,0,0,0,0], 10).
redirection(29, [13,35,56], [5,0,0,0,0,0,0,0], 10).
redirection(29, [13,57,36], [5,0,0,0,0,0,0,0], 10).
  • Predicates day, prefix, etc. are in the background
    knowledge.
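
Coverage of the redirection rule can be checked mechanically against the five listed facts. The sketch rests on assumptions: the arguments are grouped as (day of month, time, dialled digits, last argument), prefix(C,P,N) is read as "the first N digits of C are P", and the set of Tuesdays is hypothetical background knowledge chosen so that days 15, 22 and 29 qualify:

```python
# The five redirection facts, with the assumed argument grouping.
facts = [
    (15, (13, 14, 48), (5, 0, 0, 0, 0, 0, 0, 0), 10),
    (15, (14, 18, 58), (5, 0, 9, 6, 0, 1, 8, 9), 10),
    (22, (18, 50, 30), (5, 0, 0, 0, 0, 0, 0, 0), 10),
    (29, (13, 35, 56), (5, 0, 0, 0, 0, 0, 0, 0), 10),
    (29, (13, 57, 36), (5, 0, 0, 0, 0, 0, 0, 0), 10),
]

# Hypothetical background knowledge: days of the month that are Tuesdays.
TUESDAYS = {1, 8, 15, 22, 29}

def day(name, a):
    return name == "tuesday" and a in TUESDAYS

def prefix(digits, pre, n):
    """Assumed reading of prefix/3: the first n digits equal pre."""
    return len(pre) == n and tuple(digits[:n]) == tuple(pre)

def rule(a, b, c, d):
    """redirection(A,B,C,10) :- day(tuesday,A), prefix(C,[5,0],2)."""
    return d == 10 and day("tuesday", a) and prefix(c, (5, 0), 2)

covered = [f for f in facts if rule(*f)]
print(len(covered))  # 5

# A hypothetical call on another day with another prefix is not covered.
other = (16, (9, 0, 0), (6, 1, 0, 0, 0, 0, 0, 0), 10)
print(rule(*other))  # False
```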
23
Other Applications
  • Finite element mesh design
  • Control of dynamical systems
  • qualitative simulation
  • Software Engineering
  • Many more, especially in data mining

24
Novel Approaches
25
Descriptive ILP
  • Examples are interpretations (models)

triangle(t,up), circle(c1), inside(c1,t),
circle(c2), right_of(c2,t), class(positive)
  • is one example
  • Hypothesis must be true in all examples

class(positive) :- triangle(X,Y), circle(Z),
inside(Z,X).
  • Suited for data mining
  • finds ALL true hypotheses - maximum
    characterisation
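
Whether a hypothesis is true in an example can be tested by checking that the head holds whenever the body does. A minimal sketch for the clause class(positive) :- triangle(X,Y), circle(Z), inside(Z,X), assuming the example is stored as a set of ground facts and that the circle inside the triangle is c1; the function name is hypothetical:

```python
from itertools import product

# One example (interpretation) as a set of ground facts.
# Assumption: the circle inside the triangle is c1.
example = {
    ("triangle", "t", "up"), ("circle", "c1"), ("inside", "c1", "t"),
    ("circle", "c2"), ("right_of", "c2", "t"), ("class", "positive"),
}

def clause_true_in(interp):
    """Check class(positive) :- triangle(X,Y), circle(Z), inside(Z,X)."""
    triangles = [f for f in interp if f[0] == "triangle"]
    circles = [f for f in interp if f[0] == "circle"]
    for (_, x, _y), (_, z) in product(triangles, circles):
        body_holds = ("inside", z, x) in interp
        if body_holds and ("class", "positive") not in interp:
            return False          # body true but head false
    return True                   # no violating substitution found

print(clause_true_in(example))    # True

# The same scene labelled negative would falsify the clause.
relabelled = (example - {("class", "positive")}) | {("class", "negative")}
print(clause_true_in(relabelled))  # False
```

A clause whose body never matches is true in the example as well, which is why descriptive ILP typically reports many true hypotheses.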
26
Upgrades of Propositional Learners: 1st-order
Decision Trees
  • Upgrades the C4.5 algorithm
  • E.g. Tilde [Blockeel, De Raedt]

?- circle(C1)
?- triangle(T,up), inside(C1,T)
class(positive)
?- circle(C2), inside(C1,C2)
class(positive)
class(positive)
class(negative)
27
More Upgrades of Propositional Learners
  • 1st-order association rules
  • the WARMR system [Dehaspe]
  • upgrade of Apriori
  • 1st-order Bayesian Nets
  • 1st-order Clustering
  • 1st-order Distance Based Learning

28
Concluding Remarks
  • Advantages of ILP
  • Theoretical: Turing-equivalent expressive power
  • Practical: rich but understandable language,
    integration of background knowledge,
    MULTI-relational data mining
  • Problems still to be solved...
  • efficiency, handling numbers, user interfaces

29
http://cyber.felk.cvut.cz/ACAI01