Title: Advanced Topics in Data Mining
1Advanced Topics in Data Mining
- Association Rules
- Sequential Patterns
- Web Mining
2Where to Find References?
- Proceedings
  - Proceedings of ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
  - Proceedings of IEEE International Conference on Data Mining (ICDM)
  - Proceedings of IEEE International Conference on Data Engineering (ICDE)
  - Proceedings of the International Conference on Very Large Data Bases (VLDB)
  - ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery
  - Proceedings of the International Conference on Data Warehousing and Knowledge Discovery
3Where to Find References?
- Proceedings
  - Proceedings of ACM SIGMOD International Conference on Management of Data
  - Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD)
  - European Conference on Principles of Data Mining and Knowledge Discovery (PKDD)
  - Proceedings of the International Conference on Database Systems for Advanced Applications (DASFAA)
  - Proceedings of the International Conference on Database and Expert Systems Applications (DEXA)
4Where to Find References?
- Journals
  - IEEE Transactions on Knowledge and Data Engineering (TKDE)
  - Data Mining and Knowledge Discovery
  - Journal of Intelligent Information Systems
  - ACM SIGMOD Record
  - The VLDB Journal (The International Journal on Very Large Data Bases)
  - Knowledge and Information Systems
  - Data & Knowledge Engineering
  - International Journal of Cooperative Information Systems
5Advanced Topics in Data Mining: Association Rules
6Association Analysis
7What Is Association Mining?
- Association Rule Mining
  - Finding frequent patterns, associations, correlations, or causal structures among item sets in transaction databases, relational databases, and other information repositories
- Applications
  - Market basket analysis (marketing strategy: which items to put on sale at reduced prices), cross-marketing, catalog design, shelf space layout design, etc.
- Examples
  - Rule form: Body → Head [Support, Confidence]
  - buys(x, "Computer") → buys(x, "Software") [2%, 60%]
  - major(x, "CS") ∧ takes(x, "DB") → grade(x, "A") [1%, 75%]
8Market Basket Analysis
Typically, association rules are considered
interesting if they satisfy both a minimum
support threshold and a minimum confidence
threshold.
9Rule Measures Support and Confidence
- Let minimum support = 50% and minimum confidence = 50%; we have
  - A → C [50%, 66.6%]
  - C → A [50%, 100%]
10Support Confidence
11Association Rule Basic Concepts
- Given
  - (1) a database of transactions
  - (2) each transaction is a list of items (purchased by a customer in a visit)
- Find all rules that correlate the presence of one set of items with that of another set of items
  - Find all rules A → B with minimum confidence and support
  - support, s: P(A ∪ B)
  - confidence, c: P(B|A)
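These two measures can be checked on a small transaction database. A minimal sketch; the four transactions are hypothetical, chosen so that the numbers reproduce the A → C example on slide 9.

```python
# Hypothetical transaction database (chosen to match the slide-9 numbers)
transactions = [
    {"A", "B", "C"},
    {"A", "C"},
    {"A", "D"},
    {"B", "E", "F"},
]

def support(itemset, db):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in db) / len(db)

def confidence(antecedent, consequent, db):
    """support(A ∪ B) / support(A) for the rule A -> B."""
    return support(set(antecedent) | set(consequent), db) / support(antecedent, db)

print(support({"A", "C"}, transactions))        # 0.5   -> 50%
print(confidence({"A"}, {"C"}, transactions))   # 0.666... -> 66.6%
print(confidence({"C"}, {"A"}, transactions))   # 1.0   -> 100%
```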
12Association Rule Mining: A Road Map
- Boolean vs. quantitative associations (based on the types of values handled in the rule set)
  - buys(x, "SQLServer") ∧ buys(x, "DM Book") → buys(x, "DBMiner") [0.2%, 60%]
  - age(x, "30..39") ∧ income(x, "42..48K") → buys(x, "PC") [1%, 75%]
- Single-dimension vs. multiple-dimensional associations
- Single-level vs. multiple-level analysis (based on the levels of abstraction involved in the rule set)
13Terminologies
- Item
  - I1, I2, I3, ...
  - A, B, C, ...
- Itemset
  - {I1}, {I1, I7}, {I2, I3, I5}, ...
  - {A}, {A, G}, {B, C, E}, ...
- 1-Itemset
  - {I1}, {I2}, {A}, ...
- 2-Itemset
  - {I1, I7}, {I3, I5}, {A, G}, ...
14Terminologies
- K-Itemset
  - An itemset of length K
- Frequent K-Itemset
  - A K-itemset that satisfies a minimum support threshold
- Association Rule
  - A rule that satisfies both a minimum support threshold and a minimum confidence threshold
15Analysis
- The number of itemsets of a given cardinality
tends to grow exponentially
16Mining Association Rules Apriori Principle
Min. support = 50%, min. confidence = 50%
- For rule A → C
  - support = support({A, C}) = 50%
  - confidence = support({A, C}) / support({A}) = 66.6%
- The Apriori principle
  - Any subset of a frequent itemset must be frequent
17Mining Frequent Itemsets the Key Step
- Find the frequent itemsets: the sets of items that have minimum support
  - A subset of a frequent itemset must also be a frequent itemset
  - i.e., if {A, B} is a frequent itemset, both {A} and {B} must be frequent itemsets
  - Iteratively find frequent itemsets with cardinality from 1 to k (k-itemsets)
- Use the frequent itemsets to generate association rules
18Example
19Apriori Algorithm
20Apriori Algorithm
21Apriori Algorithm
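The level-wise procedure the slides above outline can be sketched in a few lines. This is a minimal illustration, not an optimized implementation; the transaction database is hypothetical.

```python
from itertools import combinations

def apriori(db, min_sup):
    """Minimal level-wise Apriori: returns {frozenset: support_count}."""
    db = [frozenset(t) for t in db]
    # L1: count single items, keep those meeting min_sup
    counts = {}
    for t in db:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    L = {s: c for s, c in counts.items() if c >= min_sup}
    frequent = dict(L)
    k = 2
    while L:
        prev = list(L)
        # Self-join: unions of (k-1)-itemsets that give k-itemsets
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune: every (k-1)-subset must be frequent (Apriori principle)
        candidates = {c for c in candidates
                      if all(frozenset(s) in L for s in combinations(c, k - 1))}
        # One DB scan per level to count the surviving candidates
        counts = {c: sum(c <= t for t in db) for c in candidates}
        L = {s: n for s, n in counts.items() if n >= min_sup}
        frequent.update(L)
        k += 1
    return frequent

db = [{"A", "B", "C"}, {"A", "C"}, {"A", "D"}, {"B", "E", "F"}]
for s, c in sorted(apriori(db, 2).items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(s), c)
# ['A'] 3
# ['B'] 2
# ['C'] 2
# ['A', 'C'] 2
```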
22Example of Generating Candidates
- L3 = {abc, abd, acd, ace, bcd}
- Self-joining: L3 ⋈ L3
  - abcd from abc and abd
  - acde from acd and ace
- Pruning
  - acde is removed because ade is not in L3
- C4 = {abcd}
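The join and prune steps in this example can be replayed mechanically; a sketch using the usual sorted-prefix join:

```python
from itertools import combinations

L3 = sorted([("a", "b", "c"), ("a", "b", "d"), ("a", "c", "d"),
             ("a", "c", "e"), ("b", "c", "d")])
L3_set = set(L3)

# Self-join: merge two sorted 3-itemsets that agree on their first 2 items
joined = [a + (b[-1],) for a in L3 for b in L3
          if a[:-1] == b[:-1] and a[-1] < b[-1]]
# joined == [('a','b','c','d'), ('a','c','d','e')]

# Prune: drop a candidate if any of its 3-subsets is missing from L3
C4 = [c for c in joined
      if all(s in L3_set for s in combinations(c, 3))]
print(C4)  # [('a', 'b', 'c', 'd')] -- acde pruned: ('a','d','e') not in L3
```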
23Another Example 1
24Another Example 2
25Is Apriori Fast Enough? Performance Bottlenecks
- The core of the Apriori algorithm
  - Use frequent (k-1)-itemsets to generate candidate frequent k-itemsets
  - Use database scans to collect counts for the candidate itemsets
- The bottleneck of Apriori
  - Huge candidate sets
    - 10^4 frequent 1-itemsets will generate 10^7 candidate 2-itemsets
    - To discover a frequent pattern of size 100, e.g., {a1, a2, ..., a100}, one needs to generate 2^100 ≈ 10^30 candidates
  - Multiple scans of the database
    - Needs (n + 1) scans, where n is the length of the longest pattern
26Demo-IBM Intelligent Miner
27Demo Database
31Methods to Improve Apriori's Efficiency
- Hash-based itemset counting: a k-itemset whose corresponding hashing bucket count is below the threshold cannot be frequent
- Transaction reduction: a transaction that does not contain any frequent k-itemset is useless in subsequent scans
- Partitioning: any itemset that is potentially frequent in DB must be frequent in at least one of the partitions of DB
- Sampling: mine on a subset of the given data with a lower support threshold, plus a method to determine completeness
32Partitioning
33Hash-Based Itemset Counting
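The idea behind hash-based counting (as in DHP) can be sketched as follows: while counting 1-itemsets, every 2-itemset of each transaction is hashed into a bucket, and pairs in below-threshold buckets are excluded from C2. The transactions and hash function below are hypothetical; note that collisions can keep an infrequent pair alive (a false positive), which the later counting scan filters out.

```python
from itertools import combinations

transactions = [{"A", "B", "C"}, {"A", "C"}, {"A", "D"}, {"B", "E", "F"}]
min_sup = 2
NUM_BUCKETS = 7          # hypothetical; real DHP sizes this to fit memory

def bucket(pair):
    # Hypothetical hash function over a sorted 2-itemset
    a, b = sorted(pair)
    return (ord(a) * 10 + ord(b)) % NUM_BUCKETS

# While scanning for 1-itemsets, also hash every 2-itemset into a bucket
bucket_count = [0] * NUM_BUCKETS
for t in transactions:
    for pair in combinations(sorted(t), 2):
        bucket_count[bucket(pair)] += 1

def may_be_frequent(pair):
    """A pair can be frequent only if its bucket total reaches min_sup."""
    return bucket_count[bucket(pair)] >= min_sup

print(may_be_frequent(("A", "C")))  # True: {A,C} occurs twice
print(may_be_frequent(("B", "C")))  # False: its bucket holds only 1 pair
# {A,B} survives only because {B,F} collides into the same bucket --
# a false positive that the actual C2 counting scan removes
print(may_be_frequent(("A", "B")))  # True
```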
34Compare Apriori & DHP (Direct Hashing and Pruning)
Apriori
35Compare Apriori & DHP
DHP
36DHP Database Trimming
37DHP (Direct Hashing and Pruning)
- A database has four transactions
- Let min_sup = 50%
38Example Apriori
39Example DHP
40Example DHP
41Example DHP
42Mining Frequent Patterns Without Candidate
Generation
- Compress a large database into a compact Frequent-Pattern tree (FP-tree) structure
  - highly condensed, but complete for frequent pattern mining
  - avoids costly database scans
- Develop an efficient, FP-tree-based frequent pattern mining method
  - A divide-and-conquer methodology: decompose mining tasks into smaller ones
  - Avoid candidate generation: sub-database test only!
43Construct FP-tree from a Transaction DB
44Construction Steps
- Scan the DB once; find frequent 1-itemsets (single-item patterns)
- Order frequent items in frequency-descending order
- Sort each transaction in the DB according to that order
- Scan the DB again; construct the FP-tree
45Benefits of the FP-Tree Structure
- Completeness
  - never breaks a long pattern of any transaction
  - preserves complete information for frequent pattern mining
- Compactness
  - reduces irrelevant information: infrequent items are gone
  - frequency-descending ordering: more frequent items are more likely to be shared
  - never larger than the original database (not counting node-links and counts)
  - the compression ratio could be over 100
46Frequent Pattern Growth
Order frequent items in frequency descending order
- For I5
  - {I1, I5}, {I2, I5}, {I2, I1, I5}
- For I4
  - {I2, I4}
- For I3
  - {I1, I3}, {I2, I3}, {I2, I1, I3}
- For I1
  - {I2, I1}
47Frequent Pattern Growth
- For I5
  - {I1, I5}, {I2, I5}, {I2, I1, I5}
- For I4
  - {I2, I4}
- For I3
  - {I1, I3}, {I2, I3}, {I2, I1, I3}
- For I1
  - {I2, I1}
[Figure: the database is trimmed into one sub-DB per item; from each sub-DB's conditional pattern base, a conditional FP-tree is built]
48Conditional FP-tree
Conditional FP-tree from Conditional
Pattern-Base for I3
49Mining Results Using FP-tree
- For I5
  - Conditional Pattern Base: {(I2 I1: 1), (I2 I1 I3: 1)}
  - Conditional FP-tree: NULL:2 → I2:2 → I1:2 (header table: I2: 2, I1: 2)
  - Generate frequent itemsets:
    - {I2}: 2 → {I2, I5}: 2
    - {I1}: 2 → {I1, I5}: 2
    - {I2, I1}: 2 → {I2, I1, I5}: 2
50Mining Results Using FP-tree
- For I4
  - Conditional Pattern Base: {(I2 I1: 1), (I2: 1)}
  - Conditional FP-tree: NULL:2 → I2:2 (header table: I2: 2)
  - Generate frequent itemsets:
    - {I2}: 2 → {I2, I4}: 2
51Mining Results Using FP-tree
- For I3
  - Conditional Pattern Base: {(I2 I1: 2), (I2: 2), (I1: 2)}
  - Conditional FP-tree: NULL:4 with branches I2:4 → I1:2 and I1:2 (header table: I2: 4, I1: 4)
52Mining Results Using FP-tree
- For I1/I3
  - Conditional Pattern Base: {(NULL: 2), (I2: 2)}
  - Conditional FP-tree: NULL:4 → I2:2 (header table: I2: 2)
  - Generate frequent itemsets:
    - NULL: 4 → {I1, I3}: 4
    - {I2}: 2 → {I2, I1, I3}: 2
53Mining Results Using FP-tree
- For I2/I3
  - Conditional Pattern Base: {(NULL: 4)}
  - Conditional FP-tree: NULL:4
  - Generate frequent itemsets:
    - NULL: 4 → {I2, I3}: 4
54Mining Results Using FP-tree
- For I1
  - Conditional Pattern Base: {(NULL: 2), (I2: 4)}
  - Conditional FP-tree: NULL:6 → I2:4 (header table: I2: 4)
  - Generate frequent itemsets:
    - {I2}: 4 → {I2, I1}: 4
55Mining Results Using FP-tree
56Mining Frequent Patterns Using FP-tree
- General idea (divide-and-conquer)
- Recursively grow frequent pattern path using the
FP-tree - Method
- For each item, construct its conditional
pattern-base, and then its conditional FP-tree - Repeat the process on each newly created
conditional FP-tree - Until the resulting FP-tree is empty, or it
contains only one path (single path will generate
all the combinations of its sub-paths, each of
which is a frequent pattern)
57Major Steps to Mine FP-tree
- Construct conditional pattern base for each item
in the FP-tree - Construct conditional FP-tree from each
conditional pattern-base - Recursively mine conditional FP-trees and grow
frequent patterns obtained so far - If the conditional FP-tree contains a single
path, simply enumerate all the patterns
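The recursive procedure above can be sketched without a physical tree by carrying the conditional pattern bases around as lists of (path, count) pairs. This mirrors the divide-and-conquer idea rather than reproducing the exact pseudocode; the nine-transaction database is the classic textbook example, assumed here.

```python
from collections import Counter

def fp_growth(db, min_sup, suffix=()):
    """Pattern growth over conditional pattern bases.

    `db` is a list of (frozenset_of_items, count) pairs; the conditional
    FP-tree is kept implicitly as prefix paths -- a sketch, not the
    slides' exact algorithm.
    """
    counts = Counter()
    for items, n in db:
        for i in items:
            counts[i] += n
    freq = {i: c for i, c in counts.items() if c >= min_sup}
    # Condition on items from least to most frequent; each conditional DB
    # keeps only strictly more frequent items, so every pattern is
    # generated exactly once.
    order = sorted(freq, key=lambda i: (freq[i], i))
    results = {}
    for idx, item in enumerate(order):
        results[tuple(sorted(suffix + (item,)))] = freq[item]
        higher = set(order[idx + 1:])
        cond = [(items & higher, n) for items, n in db if item in items]
        results.update(fp_growth([(s, n) for s, n in cond if s],
                                 min_sup, suffix + (item,)))
    return results

db = [{"I1","I2","I5"}, {"I2","I4"}, {"I2","I3"}, {"I1","I2","I4"},
      {"I1","I3"}, {"I2","I3"}, {"I1","I3"}, {"I1","I2","I3","I5"},
      {"I1","I2","I3"}]
patterns = fp_growth([(frozenset(t), 1) for t in db], min_sup=2)
print(patterns[("I1", "I2", "I5")])  # 2
print(patterns[("I2", "I3")])        # 4
```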
58Virtual Items in Association Mining
- Different regions exhibit different selling patterns. Including as a virtual item the information on the location or the type of store (existing or new) where the purchase was made enables comparisons between locations or store types within a single chain.
- A virtual item may record whether the purchase was made with cash, a credit card, or a check. Including such a virtual item allows analysis of the association between the payment method and the items purchased.
- A virtual item may record the day of the week or the time of day the transaction occurred. Including such a virtual item allows analysis of the association between the transaction time and the items purchased.
59Virtual Items An Example
60Dissociation Rules
- A dissociation rule is similar to an association rule except that it can have "not item-name" in the condition or the result of the rule
  - A and not B → C
  - A and D → not E
- Dissociation rules can be generated by a simple adaptation of association rule analysis
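The adaptation can be sketched as follows: add an inverted virtual item "not X" to every transaction that lacks X, then run ordinary association mining on the augmented transactions. The transactions are hypothetical, and only one item is inverted here, in line with the efficiency discussion on slide 61.

```python
transactions = [{"A", "B"}, {"A", "C"}, {"A"}, {"B", "C"}]
inverted_items = {"B"}   # hypothetical: invert only the most frequent items

# Each transaction gains "not X" for every chosen item X it lacks
augmented = [sorted(t | {f"not {i}" for i in inverted_items if i not in t})
             for t in transactions]
print(augmented)
# [['A', 'B'], ['A', 'C', 'not B'], ['A', 'not B'], ['B', 'C']]
```

Ordinary Apriori or FP-growth can then be run on `augmented`, and any frequent itemset containing a "not X" member yields a dissociation rule.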
61Discussions
- The size of a typical transaction grows because it now includes inverted items
- The total number of items used in the analysis doubles
  - Since the amount of computation grows exponentially with the number of items, doubling the number of items seriously degrades performance
- The frequency of an inverted item tends to be much larger than the frequency of the original item, so the analysis tends to produce rules in which all items are inverted; these rules are less likely to be actionable
  - not A and not B → not C
- It is therefore useful to invert only the most frequent items in the set used for analysis, as well as selected items whose inverses are of interest
62Interestingness Measurements
- Subjective Measures
- A rule (pattern) is interesting if
- it is unexpected (surprising to the user)
- actionable (the user can do something with it)
- Objective Measures
- Two popular measurements
- Support
- confidence
63Criticism to Support and Confidence
- Example 1
  - Among 5000 students
    - 3000 play basketball
    - 3750 eat cereal
    - 2000 both play basketball and eat cereal
  - play basketball → eat cereal [40%, 66.7%] is misleading because the overall percentage of students eating cereal is 75%, which is higher than 66.7%
64Criticism to Support and Confidence
- Example 2
  - X and Y: positively correlated
  - X and Z: negatively related
  - yet the support and confidence of X → Z dominate
- We need a measure of dependent or correlated events
65Criticism to Support and Confidence
- Improvement (correlation)
  - Takes both P(A) and P(B) into consideration
  - improvement = P(A ∪ B) / (P(A) × P(B)); it equals 1 if A and B are independent events
  - A and B are negatively correlated if the value is less than 1; otherwise A and B are positively correlated
- When improvement is less than 1, negating the result produces a better rule
  - X → not Z
66Multiple-Level Association Rules
- Items often form hierarchy
- Items at the lower level are expected to have
lower support - Rules regarding itemsets at appropriate levels
could be quite useful - Transaction database can be encoded based on
dimensions and levels - We can explore multi-level mining
67Transaction Database
68Concept Hierarchy
69Mining Multi-Level Associations
- A top-down, progressive deepening approach
  - First find high-level strong rules: milk → bread [20%, 60%]
  - Then find their lower-level "weaker" rules: 2% milk → wheat bread [6%, 50%]
- Variations on mining multiple-level association rules
  - Cross-level association rules (generalized association rules): 2% milk → Wonder wheat bread
  - Association rules with multiple, alternative hierarchies: 2% milk → Wonder bread
70Multi-level Association Uniform Support vs.
Reduced Support
- Uniform Support: the same minimum support for all levels
  - Only one minimum support threshold is needed
  - Lower-level items do not occur as frequently; if the support threshold is
    - too high → miss low-level associations
    - too low → generate too many high-level associations
- Reduced Support: reduced minimum support at lower levels
  - There are 4 search strategies
    - Level-by-level independent
    - Level-cross filtering by k-itemset
    - Level-cross filtering by single item
    - Controlled level-cross filtering by single item
71 Uniform Support
- Optimization Technique
- The search avoids examining itemsets containing
any item whose ancestors do not have minimum
support.
72Uniform Support An Example
- min_sup = 50%: L1 = {2, 3, 4, 5, 6}, L2 = {23, 24, 25, 26, 34, 45, 46, 56}, L3 = {234, 245, 246, 256, 456}, L4 = {2456}
- min_sup = 60%: L1 = {2, 3, 4}, L2 = {23, 24, 34}, L3 = {234}
- min_sup = 50%: L1 = {2, 3, 6, 8, 9}, L2 = {23, 68, 69, 89}, L3 = {689}
- min_sup = 60%: L1 = {9}
73Uniform Support An Example
[Figure: concept hierarchy with numbered leaf items. All → Cheese, Crab, Milk, Bread, Apple, Pie; the leaf brands are Kings Crab, Sunset Milk, Dairyland Milk, Dairyland Cheese, Best Cheese, Best Bread, Wonder Bread, Westcoast Bread, Goldenfarm Apple, and Tasty Pie, with item numbers 1-10 assigned to the leaves]
74Uniform Support An Example
Using Apriori/DHP or FP-Growth at each step, with min_sup = 50% at every level:
(1) L1 = {2, 3, 4, 5, 6}, L2 = {23, 24, 25, 26, 34, 45, 46, 56}, L3 = {234, 245, 246, 256, 456}, L4 = {2456}
(2) Scan the DB at the next level (min_sup = 50%): L1 = {2, 3, 6, 8, 9}
(3) C2 = {23, 36, 29, 69, 28, 68, 39, 89}, C3 = {239, 369, 289, 689}; scan the DB (min_sup = 50%): L2 = {23, 68, 69, 89}, L3 = {689}
75Uniform Support An Example
[Figure: the same concept hierarchy as on slide 73]
76Reduced Support
- Each level of abstraction has its own minimum
support threshold.
77Search Strategies for Reduced Support
- There are 4 search strategies
- Level-by-level independent
  - Full-breadth search, no pruning
  - No background knowledge of frequent itemsets is used for pruning
- Level-cross filtering by single item
  - An item at the ith level is examined if and only if its parent node at the (i-1)th level is frequent
- Level-cross filtering by k-itemset
  - A k-itemset at the ith level is examined if and only if its corresponding parent k-itemset at the (i-1)th level is frequent
- Controlled level-cross filtering by single item
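The single-item filtering rule can be sketched directly. The hierarchy, per-level thresholds, and support values below are all hypothetical; they also illustrate the "missed association" caveat discussed on slide 84.

```python
# Hypothetical 3-level hierarchy: child -> parent
hierarchy = {"milk": "food", "bread": "food",
             "2% milk": "milk", "skim milk": "milk",
             "wheat bread": "bread"}
level = {"food": 1, "milk": 2, "bread": 2,
         "2% milk": 3, "skim milk": 3, "wheat bread": 3}
support = {"food": 0.9, "milk": 0.25, "bread": 0.6,
           "2% milk": 0.22, "skim milk": 0.10, "wheat bread": 0.35}
min_sup = {1: 0.5, 2: 0.3, 3: 0.2}   # reduced support per level

def examined(item):
    """An item is examined only if its parent is frequent at its level."""
    parent = hierarchy.get(item)
    if parent is None:
        return True                   # top level: always examined
    return examined(parent) and support[parent] >= min_sup[level[parent]]

frequent = [i for i in support if examined(i) and support[i] >= min_sup[level[i]]]
print(frequent)  # ['food', 'bread', 'wheat bread']
# "2% milk" (0.22 >= 0.2) is missed because its parent "milk" fails the
# level-2 threshold -- exactly the drawback slide 84 describes
```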
78Level-Cross Filtering by Single Item
79Reduced Support An Example
Using Apriori/DHP or FP-Growth at each step:
(1) min_sup = 60%: L1 = {2, 3, 4}, L2 = {23, 24, 34}, L3 = {234}
(2) Scan the DB (min_sup = 50%): L1 = {2, 3, 6, 9}
(3) min_sup = 50%: L2 = {23, 69}
80Reduced Support An Example
[Figure: the same concept hierarchy as on slide 73]
81Level-Cross Filtering by K-Itemset
82Reduced Support An Example
Using Apriori/DHP or FP-Growth at each step:
(1) min_sup = 60%: L1 = {2, 3, 4}, L2 = {23, 24, 34}, L3 = {234}
(2) Scan the DB (min_sup = 50%): L1 = {2, 3, 6, 9}
(3) C2 = {23, 36, 29, 69, 39}, C3 = {239, 369}; scan the DB (min_sup = 50%): L2 = {23, 69}
83Reduced Support An Example
[Figure: the same concept hierarchy as on slide 73]
84Reduced Support
- Level-by-level independent
  - Very relaxed: it may examine numerous infrequent items at low levels, finding associations between items of little importance
- Level-cross filtering by k-itemset
  - Allows the mining system to examine only the children of frequent k-itemsets
  - This restriction is very strong, since there usually are not many frequent k-itemsets; many valuable patterns may be filtered out
- Level-cross filtering by single item
  - A compromise between the above two approaches
  - It may miss associations between low-level items that are frequent under a reduced minimum support, but whose ancestors do not satisfy their minimum support
85Controlled Level-Cross Filtering by Single Item
86Reduced Support An Example
Using Apriori/DHP or FP-Growth at each step:
(1) min_sup = 60%: L1 = {2, 3, 4}, L2 = {23, 24, 34}, L3 = {234}; with level_passage_sup = 50%, L1 = {2, 3, 4, 5, 6} is passed down
(2) Scan the DB (min_sup = 50%): L1 = {2, 3, 6, 8, 9}
(3) min_sup = 50%: L2 = {23, 68, 69, 89}, L3 = {689}
87Reduced Support An Example
[Figure: the same concept hierarchy as on slide 73]
88Multi-Dimensional Association
- Single-Dimensional (Intra-Dimension) Rules: a single dimension (predicate) with multiple occurrences
  - buys(X, "milk") → buys(X, "bread")
- Multi-Dimensional Rules: ≥ 2 dimensions
  - Inter-dimension association rules (no repeated predicates)
    - age(X, "19-25") ∧ occupation(X, "student") → buys(X, "coke")
  - Hybrid-dimension association rules (repeated predicates)
    - age(X, "19-25") ∧ buys(X, "popcorn") → buys(X, "coke")
89Summary
- Association rule mining
- Probably the most significant contribution from
the database community in KDD - A large number of papers have been published
- Some Important Issues
- Generalized Association Rules
- Multiple-Level Association Rules
- Association Analysis in Other Types of Data
- Spatial Data, Multimedia Data, Time Series Data,
etc. - Weighted Association Rules
- Quantitative Association Rules
90Weighted Association Rules
- Why Weighted Association Analysis?
- In previous work, all items in a transactional
database are treated uniformly - Items are given weights to reflect their
importance to the user - The weights may correspond to special promotions
on some products, or the profitability of
different items - Some products may be under promotion and hence
are more interesting, or some products are more
profitable and hence rules concerning them are of
greater values
91Weighted Association Rules
- A simple attempt to solve this problem is to eliminate the items with small weights
  - However, a rule for a heavily weighted item may also contain low-weighted items
- Is the Apriori algorithm feasible?
  - The Apriori algorithm depends on the downward closure property, which guarantees that subsets of a frequent itemset are also frequent
  - However, this property does not hold in the weighted case
92Weighted Association Rules: An Example
- Total Benefits = 500
- Benefits for the first transaction: (40+30+30+20+20+10+10) = 160
- Benefits for the second transaction: (40+30+20+20+10+10) = 130
- Benefits for the third transaction: (40+30+20+10+10) = 110
- Benefits for the fourth transaction: (30+30+20+10+10) = 100
- Suppose Weighted_Min_Sup = 40%
- Minimum Benefits = 500 × 40% = 200
93An Example
- Minimum Benefits = 500 × 40% = 200
- Itemset {3, 5, 6, 7}
  - Benefits = 70
  - Support Count (Frequency) = 3
  - 70 × 3 = 210 > 200 → {3, 5, 6, 7} is a frequent itemset
- Itemset {3, 5, 6}
  - Benefits = 60
  - Support Count (Frequency) = 3
  - 60 × 3 = 180 < 200 → {3, 5, 6} is not a frequent itemset
The Apriori principle cannot be applied!
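The check above reduces to comparing benefit × support count against Weighted_Min_Sup × total benefits:

```python
# Numbers from the example: total benefits 500, weighted min_sup 40%
total_benefits, weighted_min_sup = 500, 0.40
threshold = total_benefits * weighted_min_sup     # 200.0

def weighted_frequent(itemset_benefit, support_count):
    """Benefit-weighted support test from the example."""
    return itemset_benefit * support_count >= threshold

print(weighted_frequent(70, 3))   # True:  {3,5,6,7}: 70 * 3 = 210 >= 200
print(weighted_frequent(60, 3))   # False: {3,5,6}:   60 * 3 = 180 <  200
```

The superset {3, 5, 6, 7} passes while its subset {3, 5, 6} fails, which is exactly why downward closure breaks here.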
94K-Support Bound
- If Y is a frequent q-itemset
  - Support_Count(Y) ≥ (Weighted_Min_Sup × Total_Benefits) / Benefits(Y)
- Example
  - {3, 5, 6, 7} is a frequent 4-itemset
  - Support_Count({3, 5, 6, 7}) = 3 ≥ (40% × 500) / Benefits({3, 5, 6, 7}) = (40% × 500) / 70 = 2.857
- If X is a frequent k-itemset containing the q-itemset Y
  - Minimum_Support_Count(X) ≥ (Weighted_Min_Sup × Total_Benefits) / (Benefits(Y) + (k-q) × Maximum Remaining Weights)
- Example
  - X is a frequent 5-itemset containing {3, 5, 6, 7}
  - Minimum_Support_Count(X) ≥ (40% × 500) / (70 + 40) = 1.81
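The bound can be written as one hypothetical helper function; the three calls reproduce the 2.857 and 1.81 figures above and the value 2 used on the next slide.

```python
def k_support_bound(weighted_min_sup, total_benefits, benefits_y,
                    k, q, max_remaining_weights):
    """Least support count a frequent k-itemset containing the
    q-itemset Y must reach (the bound stated on the slide)."""
    return (weighted_min_sup * total_benefits) / (
        benefits_y + (k - q) * max_remaining_weights)

# {3,5,6,7} as a 4-itemset itself (k = q, so remaining weights are unused)
print(round(k_support_bound(0.40, 500, 70, 4, 4, 0), 3))   # 2.857
# A 5-itemset containing {3,5,6,7}, max remaining weight 40
print(round(k_support_bound(0.40, 500, 70, 5, 4, 40), 3))  # 1.818
# A 3-itemset containing {1,2} (benefit 70), max remaining weight 30
print(k_support_bound(0.40, 500, 70, 3, 2, 30))            # 2.0
```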
95K-Support Bound
- Itemset {1, 2}
  - Benefits = 70
  - Support_Count({1, 2}) = 1 < (40% × 500) / Benefits({1, 2}) = (40% × 500) / 70 = 2.857
  - {1, 2} is not a frequent itemset
- If X is a frequent 3-itemset containing {1, 2}
  - Minimum_Support_Count(X) ≥ (40% × 500) / (70 + 30) = 2
  - But Maximum_Support_Count(X) = 1 → no frequent 3-itemsets contain {1, 2}
- If X is a frequent 4-itemset containing {1, 2}
  - Minimum_Support_Count(X) ≥ (40% × 500) / (70 + 30 + 20) = 1.667
  - But Maximum_Support_Count(X) = 1 → no frequent 4-itemsets contain {1, 2}
- Similarly, no frequent 5-, 6-, or 7-itemsets contain {1, 2}
- The algorithm is designed based on this k-support bound
96MINWAL Algorithm
97Step by Step
- Input: product transactional database, Weighted_Min_Sup = 50%
- Total Profits = 1380
98Step 2, 7
- Search(D)
  - This subroutine finds the maximum transaction size in the transactional database D
  - Size = 4 in this case
- Counting(D, w)
  - This subroutine accumulates the support counts of the 1-itemsets
  - The k-support bounds of each 1-itemset are calculated, and the 1-itemsets with support counts greater than at least one of their k-support bounds are kept in C1
99Step 7
K-Support Bound ≥ (50% × 1380) / (10 + 90) = 6.9
100Step 11
- Join(Ck-1)
  - The Join step generates Ck from Ck-1, as in the Apriori algorithm
  - If we have {1, 2, 3} and {1, 2, 4} in Ck-1, {1, 2, 3, 4} will be generated in Ck
- In this case (numbers in parentheses are support counts)
  - C1 = {1} (4), {2} (5), {4} (6), {5} (7)
  - C2 = Join(C1) = {12, 14, 15, 24, 25, 45}
101Step 12
- Prune(Ck)
  - An itemset is pruned in either of the following cases:
    - A subset of the candidate itemset in Ck does not exist in Ck-1
    - Estimate an upper bound on the support count (SC) of the joined itemset X, namely the minimum support count among the k different (k-1)-subsets of X in Ck-1; if this estimated upper bound shows that X cannot be a subset of any large itemset in the coming passes (from the calculation of k-support bounds for all itemsets), X is pruned
- In this case (numbers in parentheses are estimated support counts)
  - C2 = Prune(C2) = {12 (4), 14 (4), 15 (4), 24 (5), 25 (5), 45 (6)}
  - Using the k-support bound
102Step 12
- Prune(Ck)
  - Using the k-support bound (no itemset is pruned)
103Step 13
- Checking(Ck, D)
  - Scan the DB, count the candidates in C2, and generate L2
104Step 11, 12
- Join(C2)
  - C2 = {15 (4), 24 (5), 25 (5), 45 (6)}
  - C3 = Join(C2) = {245}
- Prune(C3)
  - C3 = Prune(C3) = {245 (5)}
  - Using the k-support bound (no itemset is pruned)
105Step 13
- Checking(C3, D): scan the DB
  - C3 = L3 = {245}
- Finally, L = {45, 245}
106Step 15
- Generate rules for L = {45, 245}, with Min_Conf = 90%
  - 4 → 5 (confidence = 100%)
  - 5 → 4 (confidence = 85.7%)
  - 24 → 5 (confidence = 100%)
  - 25 → 4 (confidence = 100%)
  - 45 → 2 (confidence = 83.3%)
  - 2 → 45 (confidence = 100%)
  - 4 → 25 (confidence = 83.3%)
  - 5 → 24 (confidence = 71.4%)
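Rule generation from L can be replayed mechanically. The support counts below are inferred from the confidences listed above (they are not given explicitly on the slide):

```python
from itertools import combinations

# Support counts implied by the listed confidences (assumed)
sc = {frozenset({4}): 6, frozenset({5}): 7, frozenset({2}): 5,
      frozenset({2, 4}): 5, frozenset({2, 5}): 5, frozenset({4, 5}): 6,
      frozenset({2, 4, 5}): 5}
frequent = [frozenset({4, 5}), frozenset({2, 4, 5})]
min_conf = 0.90

rules = []
for itemset in frequent:
    # Every non-empty proper subset is a candidate rule body
    for r in range(1, len(itemset)):
        for body in combinations(sorted(itemset), r):
            body = frozenset(body)
            conf = sc[itemset] / sc[body]
            if conf >= min_conf:
                rules.append((sorted(body), sorted(itemset - body), conf))

for body, head, conf in rules:
    print(body, "->", head, f"{conf:.1%}")
# Keeps exactly the four rules with confidence >= 90%:
# 4 -> 5, 2 -> 45, 24 -> 5, 25 -> 4 (all 100%)
```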
107Generalized Association Rules
108Quantitative Association Rules
- Let min_sup = 50%; we have
  - {A, B}: 60%
  - {B, D}: 70%
  - {A, B, D}: 50%
  - {A(1..2), B(3)}: 50%
  - {A(1..2), B(3..5)}: 60%
  - {A(1..2), B(1..5)}: 60%
  - {B(1..5), D(1..2)}: 60%
  - {B(3..5), D(1..3)}: 60%
  - {B(1..5), D(1..3)}: 70%
  - {B(1..3), D(1..2)}: 50%
  - {B(3..5), D(1..2)}: 50%
  - {A(1..2), B(3..5), D(1..2)}: 50%