Title: CIS303 Advanced Forensic Computing
Relational data mining
- Relational data mining is the application of data mining techniques to relational databases.
- Unlike traditional data mining algorithms, which look for patterns in a single table (propositional patterns), relational data mining algorithms look for patterns among multiple tables (relational patterns).
- For most types of propositional patterns, there are corresponding relational patterns.
- For example, there are relational classification rules, relational regression trees, relational association rules, and so on.
- The most important theoretical foundation of relational data mining is inductive logic programming (ILP).
Rule-Based Learning
(Figure: a set of positive examples and a set of negative examples)
- Goal: induce a rule (or rules) that explains ALL positive examples and NO negative examples
Inductive Logic Programming (ILP)
- Encode background knowledge in first-order logic, as facts
  containsBlock(ex1,block1A). containsBlock(ex1,block1B).
  is_red(block1A). is_square(block1A).
  is_blue(block1B). is_round(block1B).
  onTopOf(block1B,block1A).
  and logical relations
  above(A,B) :- onTopOf(A,B).
  above(A,B) :- onTopOf(A,Z), above(Z,B).
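Loading these facts and rules into any standard Prolog system lets the derived relation be queried directly; a minimal check (answers shown as comments):

  ?- above(block1B, block1A).   % true, via the first clause for above/2
  ?- above(A, B).               % A = block1B, B = block1A (the only stacking in ex1)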
Inductive Logic Programming (ILP)
- Covering algorithm applied to explain all the data (a Prolog sketch of the loop follows the list):
- Choose some positive example
- Generate the best rule that covers this example
- Remove all examples covered by this rule
- Repeat until every positive example is covered
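A minimal sketch of this covering loop in Prolog, assuming two hypothetical helpers: best_rule(Seed, Neg, Rule), which induces the best rule covering Seed while avoiding the negative examples Neg, and covers(Rule, Ex), which tests coverage:

  % cover(+Positives, +Negatives, -Rules): induce rules until all positives are covered.
  % exclude/3 filters out elements for which the goal succeeds (SWI-Prolog library(apply)).
  cover([], _Neg, []).
  cover([Seed|Pos], Neg, [Rule|Rules]) :-
      best_rule(Seed, Neg, Rule),                     % best rule for this seed example
      exclude(covers(Rule), [Seed|Pos], Uncovered),   % remove covered positives
      cover(Uncovered, Neg, Rules).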
Inductive Logic Programming (ILP)
- Saturate an example by writing everything true about it
- The saturation of an example is the bottom clause (⊥)
positive(ex2) :-
    containsBlock(ex2,block2A), containsBlock(ex2,block2B), containsBlock(ex2,block2C),
    isRed(block2A), isRound(block2A),
    isBlue(block2B), isRound(block2B),
    isBlue(block2C), isSquare(block2C),
    onTopOf(block2B,block2A), onTopOf(block2C,block2B),
    above(block2B,block2A), above(block2C,block2B), above(block2C,block2A).
(Figure: example ex2 is a stack of blocks, with C on top of B on top of A)
Inductive Logic Programming (ILP)
- Candidate clauses are generated by
- choosing literals from ⊥
- converting ground terms to variables
- Search through the space of candidate clauses using standard AI search algorithms
- The bottom clause ensures the search is finite
Selected literals from ⊥:
  containsBlock(ex2,block2B)
  isRed(block2A)
  onTopOf(block2B,block2A)
Candidate clause (with ex2 → A, block2B → B, block2A → C):
  positive(A) :- containsBlock(A,B), onTopOf(B,C), isRed(C).
Contact lens dataset
Contact lens decision list
IF Tear-production = reduced THEN Lenses = none (12)
ELSE /* Tear-production = normal */
  IF Astigmatic = no THEN Lenses = soft (6/1)
  ELSE /* Astigmatic = yes */
    IF Spectacles = myope THEN Lenses = hard (3)
    ELSE /* Spectacles = hypermetrope */ Lenses = none (3/1)

Confusion matrix:
   a  b  c    <-- classified as
   5  0  0 |  a = soft
   0  3  1 |  b = hard
   1  0 14 |  c = none
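The same decision list reads naturally as a Prolog program with cuts, in the style used for relational decision trees later in these slides; the attribute-access predicates (tear_production/2 etc.) are hypothetical names:

  lenses(X, none) :- tear_production(X, reduced), !.
  lenses(X, soft) :- astigmatic(X, no), !.
  lenses(X, hard) :- spectacles(X, myope), !.
  lenses(X, none).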
Contact lens association rules
1. (1.00) Tear-production=reduced 12 ==> Lenses=none 12
2. (1.00) Astigmatic=yes Tear-production=reduced 6 ==> Lenses=none 6
3. (1.00) Astigmatic=no Tear-production=reduced 6 ==> Lenses=none 6
4. (1.00) Spectacles=hypermetrope Tear-production=reduced 6 ==> Lenses=none 6
5. (1.00) Spectacles=myope Tear-production=reduced 6 ==> Lenses=none 6
6. (1.00) Lenses=soft 5 ==> Astigmatic=no Tear-production=normal 5
7. (1.00) Astigmatic=no Lenses=soft 5 ==> Tear-production=normal 5
8. (1.00) Tear-production=normal Lenses=soft 5 ==> Astigmatic=no 5
9. (1.00) Lenses=soft 5 ==> Tear-production=normal 5
10. (1.00) Lenses=soft 5 ==> Astigmatic=no 5
11. (0.86) Astigmatic=no Lenses=none 7 ==> Tear-production=reduced 6
12. (0.86) Spectacles=myope Lenses=none 7 ==> Tear-production=reduced 6
13. (0.83) Astigmatic=no Tear-production=normal 6 ==> Lenses=soft 5
14. (0.83) Spectacles=hypermetrope Astigmatic=yes 6 ==> Lenses=none 5
15. (0.80) Lenses=none 15 ==> Tear-production=reduced 12
16. (0.75) Astigmatic=yes Lenses=none 8 ==> Tear-production=reduced 6
17. (0.75) Spectacles=hypermetrope Lenses=none 8 ==> Tear-production=reduced 6
18. (0.75) Age=presbyopic 8 ==> Lenses=none 6
Contingency tables (B = rule body, H = rule head), e.g. for rules 1 (left) and 18 (right):

        B  ¬B                  B  ¬B
   H   12   3 | 15        H    6   9 | 15
  ¬H    0   9 |  9       ¬H    2   7 |  9
       12  12 | 24             8  16 | 24
Clauses sorted by confirmation
1. (.76 .00) Tear-prod=reduced => Lenses=none
2. (.76 .12) Lenses=none => Tear-prod=reduced
3. (.67 .04) Lenses=none => Age=presb or Tear-prod=reduced
4. (.63 .04) Astigm=no and Tear-prod=normal => Lenses=soft
5. (.54 .00) Astigm=no and Tear-prod=normal => Age=presb or Lenses=soft
6. (.50 .08) Astigm=yes and Tear-prod=normal => Lenses=hard
7. (.50 .08) Lenses=none => Age=pre-presb or Tear-prod=reduced
8. (.47 .04) Lenses=none => Specs=hmetr or Tear-prod=reduced
9. (.47 .04) Lenses=none => Astigm=yes or Tear-prod=reduced
10. (.47 .00) Lenses=soft => Astigm=no
11. (.47 .00) Lenses=soft => Tear-prod=normal
12. (.47 .00) Specs=myope and Astigm=yes and Tear-prod=normal => Lenses=hard
13. (.47 .00) Lenses=none => Age=presb or Specs=hmetr or Tear-prod=reduced
14. (.47 .00) Lenses=none => Age=presb or Astigm=yes or Tear-prod=reduced
15. (.45 .00) Specs=hmetr and Astigm=no and Tear-prod=normal => Lenses=soft
16. (.44 .29) Astigm=no => Lenses=soft
17. (.44 .29) Tear-prod=normal => Lenses=soft
Contingency tables (B = clause body, H = clause head) for two of the clauses above:

        B  ¬B                  B  ¬B
   H   12   3 | 15        H    5   0 |  5
  ¬H    0   9 |  9       ¬H    7  12 | 19
       12  12 | 24            12  12 | 24
A toy example
East-West trains (flattened)
- Example: eastbound(t1).
- Background knowledge:
  hasCar(t1,c1). hasCar(t1,c2). hasCar(t1,c3). hasCar(t1,c4).
  cshape(c1,rect). cshape(c2,rect). cshape(c3,rect). cshape(c4,rect).
  clength(c1,short). clength(c2,long). clength(c3,short). clength(c4,long).
  croof(c1,none). croof(c2,none). croof(c3,peak). croof(c4,none).
  cwheels(c1,2). cwheels(c2,3). cwheels(c3,2). cwheels(c4,2).
  hasLoad(c1,l1). hasLoad(c2,l2). hasLoad(c3,l3). hasLoad(c4,l4).
  lshape(l1,circ). lshape(l2,hexa). lshape(l3,tria). lshape(l4,rect).
  lnumber(l1,1). lnumber(l2,1). lnumber(l3,1). lnumber(l4,3).
- Hypothesis: eastbound(T) :- hasCar(T,C), clength(C,short), not croof(C,none).
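With these facts asserted, the hypothesis can be tried out directly; note that in a standard Prolog system negation as failure (\+) plays the role of not here (a sketch):

  ?- hasCar(t1, C), clength(C, short), \+ croof(C, none).
  C = c3.    % the short car with a peaked roof makes t1 eastbound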
East-West trains (terms)
- Example:
  eastbound([car(rect,short,none,2,load(circ,1)),
             car(rect,long,none,3,load(hexa,1)),
             car(rect,short,peak,2,load(tria,1)),
             car(rect,long,none,2,load(rect,3))]).
- Background knowledge: member/2, arg/3
- Hypothesis: eastbound(T) :- member(C,T), arg(2,C,short), not arg(3,C,none).
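Since each car is now a compound term, the built-in arg/3 extracts its arguments by position; a small sketch of the hypothesis at work on two of the cars above:

  ?- T = [car(rect,short,none,2,load(circ,1)),
          car(rect,short,peak,2,load(tria,1))],
     member(C, T), arg(2, C, short), \+ arg(3, C, none).
  C = car(rect, short, peak, 2, load(tria, 1)).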
ER diagram for East-West trains
Train-as-set database
SELECT DISTINCT TRAIN_TABLE.TRAIN
FROM TRAIN_TABLE, CAR_TABLE
WHERE TRAIN_TABLE.TRAIN = CAR_TABLE.TRAIN
  AND CAR_TABLE.LENGTH = 'short'
  AND CAR_TABLE.ROOF != 'none'
Individual-centred representations
- The ER diagram is a tree (approximately)
- the root denotes the individual
- looking downwards from the root, only one-to-one or one-to-many relations are allowed
- one-to-one cycles are allowed
- The database can be partitioned according to individual
- Alternative: all information about a single individual packed together in a term
- tuples, lists, trees, sets, multisets, graphs, ...
Mutagenesis
Complexity of learning problems
- Simplest case: single table with primary key
- attribute-value or propositional learning
- an example corresponds to a tuple of constants
- Next: single table without primary key
- multi-instance problem
- an example corresponds to a set of tuples of constants
- Complexity resides in many-to-one foreign keys
- non-determinate variables
- lists, trees, sets, multisets, graphs, ...
Subgroup discovery
- An interesting subgroup has a class distribution which differs significantly from the overall distribution
- This can be modelled as classification with profits (for true positives/negatives) and costs (for false positives/negatives)
- Requires different heuristics and/or a trade-off between accuracy and generality (one common heuristic is sketched below)
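One widely used heuristic of this kind, not named on the slide but a standard choice in subgroup discovery, is weighted relative accuracy, which trades generality against distributional unusualness:

  WRAcc(X -> Y) = P(X) * (P(Y|X) - P(Y))

The factor P(X) rewards general subgroups, while P(Y|X) - P(Y) rewards class distributions that deviate from the overall one.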
Upgrading to first-order logic
- Use function-free Prolog as the representation language
- normal-form logic, simple syntax
- specialisation well understood
- For rule evaluation, generate all grounding substitutions (see the sketch below)
- specialisation may increase sample size
- if problematic, use first-order features and count only over global variables
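To make the sample-size point concrete, the two counting regimes differ as follows on the trains data from the earlier slides (a sketch using findall/3):

  % one grounding substitution per (train, short car) pair
  ?- findall(T-C, (hasCar(T,C), clength(C,short)), Gs).
  Gs = [t1-c1, t1-c3].
  % counting only over the global variable T: one count per train
  ?- findall(T, (hasCar(T,C), clength(C,short)), Ts), sort(Ts, Trains).
  Trains = [t1].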
First-order features
- Features concern interactions of local variables
- The following rule has one boolean feature, "has a short closed car":
  eastbound(T) :- hasCar(T,C), clength(C,short), not croof(C,none).
- The following rule has two boolean features, "has a short car" and "has a closed car":
  eastbound(T) :- hasCar(T,C1), clength(C1,short), hasCar(T,C2), not croof(C2,none).
Propositionalising rules
- Equivalently:
  eastbound(T) :- hasShortCar(T), hasClosedCar(T).
  hasShortCar(T) :- hasCar(T,C1), clength(C1,short).
  hasClosedCar(T) :- hasCar(T,C2), not croof(C2,none).
- Given a way to construct and select first-order features, rule construction is semi-propositional
- head and body literals have the same global variable(s)
- corresponds to a single table, one row per example
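Under this encoding each example contributes exactly one row of boolean feature values; a quick check for train t1 from the earlier slides:

  ?- hasShortCar(t1), hasClosedCar(t1).
  true.    % t1's row: hasShortCar = true, hasClosedCar = true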
First-order feature bias in Tertius
- Flattened representation, but derived from a strongly-typed term representation
- one free global variable
- each (binary) structural predicate introduces a new existential local variable and uses either the global variable or a local variable introduced by another structural predicate
- utility predicates only use variables
- all variables are used
- NB: features can be non-boolean
- if all structural predicates are one-to-one
The Tertius system
- A*, anytime top-down search algorithm
- optimal refinement operator
- 7500 lines of GNU C
- propositional Weka plug-in available
- P.A. Flach & N. Lachiche (2001). Confirmation-guided discovery of first-order rules with Tertius. Machine Learning 42(1/2): 61-95.
- www.cs.bris.ac.uk/Research/MachineLearning/Tertius/
Subgroups vs. classifiers
- Classification rules aim at pure subgroups
- Subgroups aim at a significantly higher (or different) proportion of positives
- essentially the same as cost-sensitive classification
- instead of a FN cost we have a TP profit
Relational Data Mining
- Single-table assumption 
- (Multi-)relational data mining and ILP 
- FO representations 
- Upgrading propositional DM systems to FOL 
- A case study: mining association rules
- Conclusions
Standard Data Mining Approach
- Most existing data mining approaches look for 
 patterns in a single table of data (or DB
 relation)
- Each row represents an object and columns 
 represent properties of objects.
- Single table assumption
Standard Data Mining Approach
- In the customer table we can add as many attributes about our customers as we like.
- A person's number of children, for example
- For other kinds of information the single-table assumption turns out to be a significant limitation
- Add information about orders placed by a customer, in particular:
- Delivery and payment modes
- With which kind of store the order was placed (size, ownership, location)
- For simplicity, no information on the goods ordered
Standard Data Mining Approach
- This solution works fine for once-only customers
- What if our business has repeat customers?
- Under the single-table assumption we can
- make one entry for each order in our customer table
- We then have the usual problems of non-normalized tables
- redundancy, anomalies, ...
Standard Data Mining Approach
- One line per order → the analysis results will really be about orders, not customers, which is not what we might want!
- Aggregate order data into a single tuple per customer.
- No redundancy, and standard DM methods work fine, but
- there is a lot less information in the new table
- What if the payment mode and the store type are important?
Relational Data
- A database designer would represent the 
 information in our problem as a set of tables (or
 relations)
Relational Data Mining
- (Multi-)Relational data mining algorithms can 
 analyze data distributed in multiple relations,
 as they are available in relational database
 systems.
- These algorithms come from the field of inductive 
 logic programming (ILP)
- ILP has been concerned with finding patterns 
 expressed as logic programs
- Initially, ILP focussed on automated program 
 synthesis from examples
- In recent years, the scope of ILP has broadened to cover the whole spectrum of data mining tasks (association rules, regression, clustering, ...)
ILP successes in scientific fields
- In the field of chemistry/biology 
- Toxicology 
- Prediction of diterpene classes from nuclear magnetic resonance (NMR) spectra
- Analysis of traffic accident data 
- Analysis of survey data in medicine 
- Prediction of ecological biodegradation rates 
- The first commercial data mining systems with ILP 
 technology are becoming available.
Relational patterns
- Relational patterns involve multiple relations 
 from a relational database.
- They are typically stated in a more expressive 
 language than patterns defined on a single data
 table.
- Relational classification rules 
- Relational regression trees 
- Relational association rules 
- IF Customer(C1,N1,FN1,Str1,City1,Zip1,Sex1,SoSt1,In1,Age1,Resp1)
-  AND order(C1,O1,S1,Deliv1,Pay1)
-  AND Pay1 = credit_card
-  AND In1 ≥ 108000
- THEN Resp1 = Yes
- The same pattern expressed as a clause:
  good_customer(C1) ←
      customer(C1,N1,FN1,Str1,City1,Zip1,Sex1,SoSt1,In1,Age1,Resp1) ∧
      order(C1,O1,S1,Deliv1,credit_card) ∧
      In1 ≥ 108000
- This relational pattern is expressed in a subset 
 of first-order logic!
- A relation in a relational database corresponds 
 to a predicate in predicate logic (see deductive
 databases)
Relational decision tree
- Equivalent Prolog program:
  class(sendback) :- worn(X), not_replaceable(X), !.
  class(fix) :- worn(X), !.
  class(keep).
Relational regression rule
- Induced model (figure)
Relational association rule
likes(KID, piglet), likes(KID, ice-cream) → likes(KID, dolphin)   (9%, 85%)
likes(KID, A), has(KID, B) → prefers(KID, A, B)   (70%, 98%)
First-order representations
- An example is a set of ground facts, that is, a set of tuples in a relational database
- From the logical point of view this is called a (Herbrand) interpretation, because the facts represent all atoms which are true for the example; all facts not in the example are assumed to be false.
- From the computational point of view each example is a small relational database or Prolog knowledge base
- A Prolog interpreter can be used for querying an example.
FO representation (ground clauses)
- Example:
  eastbound(t1) :-
      car(t1,c1), rectangle(c1), short(c1), none(c1), two_wheels(c1),
      load(c1,l1), circle(l1), one_load(l1),
      car(t1,c2), rectangle(c2), long(c2), none(c2), three_wheels(c2),
      load(c2,l2), hexagon(l2), one_load(l2),
      car(t1,c3), rectangle(c3), short(c3), peaked(c3), two_wheels(c3),
      load(c3,l3), triangle(l3), one_load(l3),
      car(t1,c4), rectangle(c4), long(c4), none(c4), two_wheels(c4),
      load(c4,l4), rectangle(l4), three_load(l4).
- Background theory:
  polygon(X) :- rectangle(X).
  polygon(X) :- triangle(X).
- Hypothesis: eastbound(T) :- car(T,C), short(C), not none(C).
Background knowledge
- As the background knowledge is visible to each example, all facts that can be derived from the background knowledge and an example are part of the extended example.
- Formally, an extended example is the minimal Herbrand model of the example and the background theory.
- When querying an example, it suffices to assert the background knowledge and the example; the Prolog interpreter will do the necessary derivations (see the sketch below).
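For instance, with a fragment of the trains example and the polygon/1 background theory from the previous slide asserted, derived facts become queryable without ever being stored:

  ?- assert(rectangle(c1)),
     assert((polygon(X) :- rectangle(X))).
  true.
  ?- polygon(c1).    % derived by the interpreter, not stored as a fact
  true.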
Learning from interpretations
- The ground-clause representation is peculiar to the ILP setting known as learning from interpretations.
- Similar to older work on structural matching.
- It is common to several relational data mining systems, such as:
- CLAUDIEN: searches for a set of clausal regularities that hold on the set of examples
- TILDE: top-down induction of logical decision trees
- ICL: inductive classification logic (an upgrade of CN2)
- It contrasts with the classical ILP setting employed by the systems PROGOL and FOIL.
FO representation (flattened)
- Example: eastbound(t1).
- Background theory:
  car(t1,c1). car(t1,c2). car(t1,c3). car(t1,c4).
  rectangle(c1). rectangle(c2). rectangle(c3). rectangle(c4).
  short(c1). long(c2). short(c3). long(c4).
  none(c1). none(c2). peaked(c3). none(c4).
  two_wheels(c1). three_wheels(c2). two_wheels(c3). two_wheels(c4).
  load(c1,l1). load(c2,l2). load(c3,l3). load(c4,l4).
  circle(l1). hexagon(l2). triangle(l3). rectangle(l4).
  one_load(l1). one_load(l2). one_load(l3). three_loads(l4).
- Hypothesis: eastbound(T) :- car(T,C), short(C), not none(C).
FO representation (terms)
- Example:
  eastbound([c(rectangle,short,none,2,l(circle,1)),
             c(rectangle,long,none,3,l(hexagon,1)),
             c(rectangle,short,peaked,2,l(triangle,1)),
             c(rectangle,long,none,2,l(rectangle,3))]).
- Background theory: empty
- Hypothesis: eastbound(T) :- member(C,T), arg(2,C,short), not arg(3,C,none).
FO representation (strongly typed)
- Type signature:
  data Shape  = Rectangle | Hexagon | ...
  data Length = Long | Short
  data Roof   = None | Peaked | ...
  data Object = Circle | Hexagon | ...
  type Wheels = Int
  type Number = Int
  type Load   = (Object,Number)
  type Car    = (Shape,Length,Roof,Wheels,Load)
  type Train  = [Car]
  eastbound :: Train -> Bool
- Example:
  eastbound([(Rectangle,Short,None,2,(Circle,1)),
             (Rectangle,Long,None,3,(Hexagon,1)),
             (Rectangle,Short,Peaked,2,(Triangle,1)),
             (Rectangle,Long,None,2,(Rectangle,3))]) = True
- Hypothesis:
  eastbound(t) = (exists \c -> member(c,t) && proj2(c)==Short && proj3(c)!=None)
- Example language: Escher (functional logic programming)
FO representation (database)
SELECT DISTINCT TRAIN_TABLE.TRAIN
FROM TRAIN_TABLE, CAR_TABLE
WHERE TRAIN_TABLE.TRAIN = CAR_TABLE.TRAIN
  AND CAR_TABLE.LENGTH = 'short'
  AND CAR_TABLE.ROOF != 'none'
Individual-centered representation
- The database contains information on a number of trains.
- Each train is an individual.
- The database can be partitioned according to individual to obtain a ground-clause representation
- Problem: sometimes individuals share common parts.
- Example: we want to discriminate black and white figures on the basis of their position.
- Each geometric figure is an individual
Object-centered representation
- The whole sequence is an object, which can be represented by a multiple-head ground clause:
  black(x11) ∨ black(x12) ∨ white(x13) ∨ black(x14) :-
      first(x11), crl(x11), next(x12,x11), crl(x12),
      sqr(x13), crl(x14), next(x14,x13), next(x13,x12).
- This is the representation adopted in ATRE.
How to upgrade propositional DM algorithms to first-order
- Identify the propositional DM system that best matches the DM task
- Use interpretations to represent examples
- Upgrade the representation of propositional hypotheses: replace attribute-value tests by first-order literals and modify the coverage test accordingly
- Structure the search space by a more-general-than relation that works on first-order representations
- θ-subsumption (e.g. eastbound(T) :- hasCar(T,C) subsumes, and is therefore more general than, eastbound(T) :- hasCar(T,C), clength(C,short))
- Adapt the search operators for searching the corresponding rule space
- Use a declarative bias mechanism to limit the search space
- Implement
- Evaluate your (first-order) implementation on propositional and relational data
- Add interesting extra features
Mining association rules: a case study
- A set I of literals, called items.
- A set D of transactions t such that t ⊆ I.
- X ⇒ Y (s, c)
- "IF a pattern X appears in a transaction t, THEN the pattern Y tends to hold in the same transaction t"
- X ⊆ I, Y ⊆ I, X ∩ Y = ∅
- s = p(X ∪ Y): support
- c = p(Y|X) = p(X ∪ Y) / p(X): confidence
- Agrawal, Imielinski & Swami. Mining association rules between sets of items in large databases. Proc. SIGMOD 1993.
What is an association rule?
Example: market basket analysis. Each transaction is the list of items bought by a customer on a single visit to a store, represented as a row in a table.

  IF a customer buys bread and butter THEN he also buys cheese (20%, 66%)

That is: 20% of customers buy bread, cheese and butter, and 66% of customers who buy bread and butter also buy cheese.
Mining association rules: the propositional approach
- Problem statement 
- Given 
- a set of transactions D 
- a couple of thresholds, minsup and minconf 
- Find 
- all association rules that have support and 
 confidence greater than minsup and minconf
 respectively.
Mining association rules: the propositional approach
- Problem decomposition 
- Find large (or frequent) itemsets 
- Generate highly-confident association rules 
- Representation issues 
- The transaction set D may be a data file, a 
 relational table or the result of a relational
 expression
- Each transaction is a binary vector 
Mining association rules: the propositional approach
- Solution to the first sub-problem:
- the APRIORI algorithm (Agrawal & Srikant, 1999)
- Find large 1-itemsets
- Cycle on the size k (k>1) of the itemsets:
- APRIORI-gen: generate candidate k-itemsets from large (k-1)-itemsets
- generate large k-itemsets from candidate k-itemsets (cycle on the transactions in D)
- until no more large itemsets are found.
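A sketch of the APRIORI-gen join step in Prolog, assuming itemsets are kept as sorted lists; the subsequent prune step (discarding candidates with a non-large (k-1)-subset) is omitted:

  % join two large (k-1)-itemsets that share all but their last item
  apriori_gen(Large, Candidate) :-
      member(Xs, Large), member(Ys, Large),
      append(Prefix, [X], Xs),
      append(Prefix, [Y], Ys),
      X @< Y,                        % avoids duplicates and self-joins
      append(Xs, [Y], Candidate).

For example, apriori_gen([[a,b],[a,c]], C) yields C = [a,b,c].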
Mining association rules: the propositional approach
- Solution to the second sub-problem:
- for every large itemset Z, find all non-empty subsets X of Z
- for every subset X, output a rule of the form X ⇒ (Z−X) if support(Z)/support(X) ≥ minconf.
- Relevant work:
- Agrawal & Srikant (1999). Fast Algorithms for Mining Association Rules. In Readings in Database Systems, Morgan Kaufmann Publishers.
- Han & Fu (1995). Discovery of Multiple-Level Association Rules from Large Databases. Proc. 21st VLDB Conference.
Mining association rules: the ILP approach
- Problem statement 
- Given 
- a deductive relational database D 
- a couple of thresholds, minsup and minconf 
- Find 
- all association rules that have support and 
 confidence greater than minsup and minconf
 respectively.
Mining association rules: the ILP approach
- Problem decomposition 
- Find large (or frequent) atomsets 
- Generate highly-confident association rules 
- Representation issues 
- A deductive relational database is a relational database which may be represented in first-order logic as follows:
- relation → set of ground facts (EDB)
- view → set of rules (IDB)
Mining association rules: the ILP approach
- Example: relational database
  likes(joni, ice-cream)                               an atom
  likes(KID, piglet), likes(KID, ice-cream)            an atomset
  likes(KID, piglet), likes(KID, ice-cream) → likes(KID, dolphin)   (9%, 85%)
  likes(KID, A), has(KID, B) → prefers(KID, A, B)   (70%, 98%)
Mining association rules: the ILP approach
- Solution to the first sub-problem:
- the WARMR algorithm (Dehaspe & De Raedt, 1997)
- L. Dehaspe & L. De Raedt (1997). Mining Association Rules in Multiple Relations. Proc. Conf. on Inductive Logic Programming.
- Compute large 1-atomsets
- Cycle on the size k (k>1) of the atomsets:
- WARMR-gen: generate candidate k-atomsets from large (k-1)-atomsets
- generate large k-atomsets from candidate k-atomsets (cycle on the observations loaded from D)
- until no more large atomsets are found.
Mining association rules: the ILP approach
- WARMR
- breadth-first search of the atomset lattice
- loading of an observation o from D (query result)
- largeness of candidate atomsets computed by a coverage test (sketched below)
- APRIORI
- breadth-first search of the itemset lattice
- loading of a transaction t from D (tuple)
- largeness of candidate itemsets computed by a subset check
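The coverage test that replaces APRIORI's subset check can be sketched in two clauses: an atomset, represented as a list of goals, covers an observation when the conjunction of its atoms succeeds against the observation's loaded database (the minsup bookkeeping is omitted):

  covers_atomset([]).
  covers_atomset([Atom|Atoms]) :- call(Atom), covers_atomset(Atoms).

For example, ?- covers_atomset([is_a(X,large_town), intersects(X,R), is_a(R,road)]) succeeds exactly when the observation contains a large town crossed by a road.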
Mining association rules: the ILP approach
A fragment of the atomset lattice, ordered by generality:

false ≤ Q1 = ?- is_a(X, large_town) ∧ intersects(X, R) ∧ is_a(R, road)
      ≤ Q2 = ?- is_a(X, large_town) ∧ intersects(X, Y)
      ≤ Q3 = ?- is_a(X, large_town)
      ≤ true
Mining association rules: the ILP approach
Refinement step:
  is_a(X, large_town), intersects(X, R), is_a(R, road)
Pruning step
Mining association rules: the ILP approach
Candidate atomset:
  is_a(X, large_town), intersects(X, R), is_a(R, road), adjacent_to(X, W), is_a(W, water)
Query against the database D:
  ?- is_a(X, large_town), intersects(X, R), is_a(R, road), adjacent_to(X, W), is_a(W, water).
Large? Answer substitutions:
  <X=barletta, R=a14, W=adriatico>, <X=bari, R=ss16bis, W=adriatico>, ...
Mining association rules: the ILP approach
  is_a(X, large_town), intersects(X, R), is_a(R, road), adjacent_to(X, W), is_a(W, water)
Conclusions and future work
- Multi-relational data mining: more data mining than logic program synthesis
- choice of representation formalisms
- input format more important than output format
- data modelling, e.g. object-oriented data mining
- new learning tasks and evaluation measures
- Reference:
- Saso Dzeroski and Nada Lavrac (editors), Relational Data Mining, Springer-Verlag, September 2001.