Title: Learning Fuzzy Association Rules and Associative Classification Rules
1. Learning Fuzzy Association Rules and Associative Classification Rules
- Jianchao Han
- Computer Science Department
- California State University Dominguez Hills
2. Agenda
- Introduction
- Traditional Association Rules
- Positive and Negative Fuzzy Association Rules
- An Illustrative Example
- Positive and Negative Fuzzy Associative Classification Rules
- Implementation Algorithms
- Conclusion
3. Introduction
- Association
  - A relationship between data items
- Sales data association
  - If a set of items A occurs in a sales transaction, then another set of items B will likely also occur in the same transaction
- Limitations
  - Data are described by binary attribute values
  - Only positive associations are pursued
- Solutions
  - Fuzzy attribute values
  - Negative associations
4. Traditional Association Rules
- Basket data
  - $I = \{I_1, I_2, \ldots, I_m\}$, a set of possible items
  - $D = \{t_1, t_2, \ldots, t_n\}$, a database of transactions
  - Each $t \in D$ is represented as a binary vector, with
    - $t[I_k] = 1$ if $t$ contains $I_k$
    - $t[I_k] = 0$ if $t$ does not contain $I_k$
- Support of an itemset
  - For any $X \subseteq I$, $t$ satisfies $X$ if $t[I_k] = 1$ for all $I_k \in X$
  - The support of $X$ in $D$ is defined as
    - $\mathrm{Supp}(X) = |\{t \in D \mid t \text{ satisfies } X\}|$
  - That is, the number of transactions that satisfy $X$
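The support count defined above can be sketched in a few lines of Python. The basket data, the item names, and the function name `support` are made up for illustration.

```python
# Hypothetical basket data: each transaction is stored as the set of items it
# contains, which is equivalent to a binary vector with t[Ik] = 1 for those items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter"},
    {"bread", "eggs", "milk"},
]

def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)

print(support({"bread", "milk"}, transactions))  # 3
print(support({"butter"}, transactions))         # 2
```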
5. Traditional Association Rules
- Itemset (binary) association rules
  - For any $X, Y \subseteq I$ with $X \cap Y = \emptyset$, $X \Rightarrow Y$ is an association rule, where
    - The support of the rule, $\mathrm{Supp}(X \Rightarrow Y)$, is the probability of occurrence of $X \cup Y$ in $D$
    - The confidence of the rule, $\mathrm{Conf}(X \Rightarrow Y)$, is the conditional probability of $Y$ given $X$
- Mining association rules
  - Look for all possible associations $X \Rightarrow Y$ such that $\mathrm{Supp}(X \Rightarrow Y) \ge \alpha$, a given threshold, and $\mathrm{Conf}(X \Rightarrow Y) \ge \beta$, another given threshold
6. Association Rules Mining Algorithm
- Two steps
  - Discovering all frequent itemsets, that is, those with support $\ge \alpha$
  - Generating association rules
    - Partition each frequent itemset into two parts, $X$ and $Y$
    - Test $\mathrm{Conf}(X \Rightarrow Y)$
- Level-wise algorithm
  - Observation: if $X$ is a frequent itemset, all of its subsets are frequent too
  - Test all 1-item itemsets
  - Test all 2-item itemsets that are supersets of frequent 1-item itemsets
  - Repeat until no new frequent itemsets are found
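The two steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the presenter's implementation: support is taken as a fraction of the database, and the function names and the toy transactions are hypothetical.

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def frequent_itemsets(transactions, min_supp):
    """Level-wise search: size-(k+1) candidates come from frequent k-itemsets."""
    level = [frozenset([i]) for i in sorted(set().union(*transactions))]
    frequent = {}
    while level:
        k = len(level[0])
        survivors = {}
        for c in level:
            s = support(c, transactions)
            if s >= min_supp:
                survivors[c] = s
        frequent.update(survivors)
        # Join pairs of frequent k-itemsets; keep a candidate only if every
        # one of its k-item subsets is frequent.
        candidates = set()
        for a, b in combinations(survivors, 2):
            u = a | b
            if len(u) == k + 1 and all(frozenset(s) in survivors for s in combinations(u, k)):
                candidates.add(u)
        level = list(candidates)
    return frequent

def association_rules(frequent, min_conf):
    """Split each frequent itemset into X and Y and keep confident rules X => Y."""
    rules = []
    for itemset, supp in frequent.items():
        for r in range(1, len(itemset)):
            for x in combinations(sorted(itemset), r):
                x = frozenset(x)
                conf = supp / frequent[x]  # X is frequent because itemset is
                if conf >= min_conf:
                    rules.append((x, itemset - x, supp, conf))
    return rules

# Hypothetical transactions over the items a, b, c.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
frequent = frequent_itemsets(transactions, min_supp=0.5)
for x, y, supp, conf in association_rules(frequent, min_conf=0.7):
    print(sorted(x), "=>", sorted(y), round(supp, 2), round(conf, 2))
```

Looking up `frequent[x]` in the rule step is safe only because every subset of a frequent itemset is itself frequent, which is the observation the level-wise search relies on.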
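7. Fuzzy Association Rules
- Binary value is extended to the interval $[0, 1]$
  - Example: item Tomato belongs to Vegetable to some degree, say 0.7
- Itemset $A = \{A_1, A_2, \ldots, A_l\} \subseteq I$, where each $A_i$ is a fuzzy subset of $I$
- Support of an itemset $A$ is defined as
  - $\mathrm{Supp}(A) = \sum_{t \in D} \prod_{A_i \in A} t[A_i]$
- Support of a rule $A \Rightarrow B$ is
  - $\mathrm{Supp}(A \Rightarrow B) = \mathrm{Supp}(A \cup B)$
- Confidence of a rule $A \Rightarrow B$ is
  - $\mathrm{Conf}(A \Rightarrow B) = \mathrm{Supp}(A \cup B) / \mathrm{Supp}(A)$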
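A minimal Python sketch of fuzzy support and confidence, assuming membership degrees are combined with the product. That choice reproduces the itemset supports in the worked example later in the deck (for instance 3.37 for the itemset i1, i4); other combinations such as the minimum also appear in the literature. The data and function names below are illustrative only.

```python
from math import prod

# Hypothetical fuzzy transactions: item -> membership degree in [0, 1].
transactions = [
    {"vegetable": 0.7, "fruit": 0.3, "dairy": 1.0},
    {"vegetable": 1.0, "fruit": 0.0, "dairy": 0.5},
    {"vegetable": 0.2, "fruit": 0.9, "dairy": 0.0},
]

def fuzzy_support(itemset, transactions):
    """Sum over transactions of the product of the items' membership degrees."""
    return sum(prod(t[i] for i in itemset) for t in transactions)

def fuzzy_confidence(antecedent, consequent, transactions):
    """Conf(A => B) = Supp(A u B) / Supp(A)."""
    both = set(antecedent) | set(consequent)
    return fuzzy_support(both, transactions) / fuzzy_support(antecedent, transactions)

print(round(fuzzy_support({"vegetable", "dairy"}, transactions), 2))       # 1.2
print(round(fuzzy_confidence({"vegetable"}, {"dairy"}, transactions), 3))  # 0.632
```

With crisp 0/1 degrees the product reduces to the binary "contains all items" test, so the traditional support count is a special case.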
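8. Positive vs. Negative Association Rules
- Positive association rules
  - Like $A \Rightarrow B$
- Negative association rules
  - Like $A \Rightarrow \neg B$, $\neg A \Rightarrow B$, $\neg A \Rightarrow \neg B$
- Different rule-interest measures exist for negative association rules, e.g.
  - A negative example of $A \Rightarrow B$ is a positive example of $\neg B \Rightarrow A$
  - $A \Rightarrow \neg B$, if
    - $A \cup B$ is infrequent
    - $A \cup \neg B$ is frequent
    - $\mathrm{Supp}(A \cup \neg B) - \mathrm{Supp}(A)\,\mathrm{Supp}(\neg B) \ge \alpha$
    - $\mathrm{Supp}(A \cup \neg B) / \mathrm{Supp}(A) \ge \beta$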
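9. Fuzzy Positive Association Rules
- Simple fuzzy extension to traditional association rules
- $A \Rightarrow B$ is a fuzzy positive association rule, if
  - $A \cap B = \emptyset$
  - $\mathrm{Supp}(A \Rightarrow B) \ge \alpha$
  - $\mathrm{Conf}(A \Rightarrow B) \ge \beta$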
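10. Fuzzy Negative Association Rules
- $A \Rightarrow \neg B$ is a negative association rule if
  - $A \cap B = \emptyset$
  - $\mathrm{Supp}(A) \ge \alpha$
  - $\mathrm{Supp}(B) \ge \alpha$
  - $\mathrm{Supp}(A \cup B) < \alpha$
  - $\mathrm{Supp}(A \cup \neg B) \ge \alpha$
  - $\mathrm{Conf}(A \Rightarrow \neg B) = \mathrm{Supp}(A \cup \neg B) / \mathrm{Supp}(A) \ge \beta$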
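11. Fuzzy Negative Association Rules
- $\neg A \Rightarrow B$ is a negative association rule if
  - $A \cap B = \emptyset$
  - $\mathrm{Supp}(A) \ge \alpha$
  - $\mathrm{Supp}(B) \ge \alpha$
  - $\mathrm{Supp}(A \cup B) < \alpha$
  - $\mathrm{Supp}(\neg A \cup B) \ge \alpha$
  - $\mathrm{Conf}(\neg A \Rightarrow B) = \mathrm{Supp}(\neg A \cup B) / \mathrm{Supp}(\neg A) \ge \beta$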
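12. Fuzzy Negative Association Rules
- $\neg A \Rightarrow \neg B$ is a negative association rule if
  - $A \cap B = \emptyset$
  - $\mathrm{Supp}(A) \ge \alpha$
  - $\mathrm{Supp}(B) \ge \alpha$
  - $\mathrm{Supp}(A \cup B) < \alpha$
  - $\mathrm{Supp}(\neg A \cup \neg B) \ge \alpha$
  - $\mathrm{Conf}(\neg A \Rightarrow \neg B) = \mathrm{Supp}(\neg A \cup \neg B) / \mathrm{Supp}(\neg A) \ge \beta$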
13. Algorithm for Mining both Positive and Negative Fuzzy Rules
- Two steps
  - Generating all frequent and infrequent itemsets
  - Extracting fuzzy association rules
    - Positive rules are extracted from the frequent itemsets
    - Negative rules are extracted from the infrequent itemsets
14. An Example
Transaction Database
Trans.  i1   i2   i3   i4   i5   i6
t1      1.0  0.7  0.2  0.0  1.0  1.0
t2      0.8  0.0  0.6  0.8  0.4  0.2
t3      0.5  0.8  0.0  0.8  0.8  0.0
t4      0.7  0.2  1.0  0.9  1.0  0.8
t5      0.4  0.4  0.0  0.6  0.8  0.9
t6      0.8  0.0  0.1  1.0  0.1  0.8
t7      0.9  0.9  0.8  0.2  1.0  1.0
t8      0.6  0.1  0.1  0.8  0.7  0.8
Frequent vs. Infrequent Itemsets (support threshold 40%)
1-itemsets          2-itemsets          3-itemsets
itemset  support    itemset  support    itemset     support
i1       5.7/8      i1, i4   3.37/8     i1, i4, i5  1.99/8
i2       3.1/8      i1, i5   4.14/8     i1, i5, i6  3.21/8
i3       2.8/8      i1, i6   4.10/8
i4       5.1/8      i4, i5   3.20/8
i5       5.8/8      i4, i6   3.06/8
i6       5.5/8      i5, i6   4.24/8
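The supports in the itemset table can be reproduced with a short Python sketch, assuming the fuzzy support of an itemset is the sum over transactions of the product of its membership degrees. The names `supp` and `is_frequent` are illustrative; the data is the transaction database above.

```python
from math import prod

# Transaction database from the slide: rows t1..t8, columns i1..i6.
ITEMS = ["i1", "i2", "i3", "i4", "i5", "i6"]
ROWS = [
    [1.0, 0.7, 0.2, 0.0, 1.0, 1.0],
    [0.8, 0.0, 0.6, 0.8, 0.4, 0.2],
    [0.5, 0.8, 0.0, 0.8, 0.8, 0.0],
    [0.7, 0.2, 1.0, 0.9, 1.0, 0.8],
    [0.4, 0.4, 0.0, 0.6, 0.8, 0.9],
    [0.8, 0.0, 0.1, 1.0, 0.1, 0.8],
    [0.9, 0.9, 0.8, 0.2, 1.0, 1.0],
    [0.6, 0.1, 0.1, 0.8, 0.7, 0.8],
]
DB = [dict(zip(ITEMS, row)) for row in ROWS]
MIN_SUPP = 0.40  # support threshold as a fraction of |D|

def supp(itemset):
    """Fuzzy support: sum over transactions of the product of membership degrees."""
    return sum(prod(t[i] for i in itemset) for t in DB)

def is_frequent(itemset, eps=1e-9):
    # eps absorbs floating-point noise for itemsets that sit exactly on the threshold
    return supp(itemset) / len(DB) >= MIN_SUPP - eps

for itemset in [("i1", "i4"), ("i4", "i5"), ("i4", "i6"),
                ("i1", "i4", "i5"), ("i1", "i5", "i6")]:
    label = "frequent" if is_frequent(itemset) else "infrequent"
    print(itemset, round(supp(itemset), 2), label)
```

Under the 40% threshold an itemset needs a support of at least 3.2 out of 8, so i1, i4 (3.37) and i1, i5, i6 (3.21) are frequent while i4, i6 (3.06) and i1, i4, i5 (1.99) are not; i4, i5 sits exactly on the threshold and is counted as frequent here.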
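15. An Example: Positive Fuzzy Association Rules
itemset     association                  support  confidence (%)
i1, i4      $i_1 \Rightarrow i_4$        3.37/8   59.1
i1, i4      $i_4 \Rightarrow i_1$        3.37/8   66.1
i1, i5      $i_1 \Rightarrow i_5$        4.14/8   72.6
i1, i5      $i_5 \Rightarrow i_1$        4.14/8   71.4
i1, i6      $i_1 \Rightarrow i_6$        4.10/8   71.9
i1, i6      $i_6 \Rightarrow i_1$        4.10/8   74.5
i4, i5      $i_4 \Rightarrow i_5$        3.20/8   62.7
i4, i5      $i_5 \Rightarrow i_4$        3.20/8   55.2
i5, i6      $i_5 \Rightarrow i_6$        4.24/8   73.1
i5, i6      $i_6 \Rightarrow i_5$        4.24/8   77.1
i1, i5, i6  $i_1, i_5 \Rightarrow i_6$   3.21/8   77.6
i1, i5, i6  $i_1, i_6 \Rightarrow i_5$   3.21/8   78.3
i1, i5, i6  $i_5, i_6 \Rightarrow i_1$   3.21/8   75.8
i1, i5, i6  $i_1 \Rightarrow i_5, i_6$   3.21/8   56.4
i1, i5, i6  $i_5 \Rightarrow i_1, i_6$   3.21/8   55.4
i1, i5, i6  $i_6 \Rightarrow i_1, i_5$   3.21/8   58.4
Support threshold 50%, confidence threshold 70%
Support threshold 40%, confidence threshold 75%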
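As a worked check of the table, assuming the confidence of a rule is the itemset support divided by the support of the rule's left-hand side, and taking the 1-itemset supports Supp(i1) = 5.7 and Supp(i4) = 5.1 from the itemset table:

```latex
\[
\mathrm{Conf}(i_1 \Rightarrow i_4)
  = \frac{\mathrm{Supp}(\{i_1, i_4\})}{\mathrm{Supp}(\{i_1\})}
  = \frac{3.37}{5.7} \approx 59.1\%,
\qquad
\mathrm{Conf}(i_4 \Rightarrow i_1)
  = \frac{3.37}{5.1} \approx 66.1\%.
\]
```

Read this way, the first threshold pair (50%, 70%) keeps the six rules built from the itemsets i1, i5; i1, i6; and i5, i6. The second pair (40%, 75%) keeps the rule with i6 on the left and i5 on the right, plus the three rules from i1, i5, i6 that have two items on the left-hand side.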
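16. An Example: Negative Fuzzy Association Rules
itemset     association                                support  confidence (%)
i4, i6      $i_4 \Rightarrow \neg i_6$                 2.04/8   35.8
i4, i6      $\neg i_6 \Rightarrow i_4$                 2.04/8   73.0
i4, i6      $i_6 \Rightarrow \neg i_4$                 2.44/8   44.4
i4, i6      $\neg i_4 \Rightarrow i_6$                 2.44/8   84.1
i4, i6      $\neg i_4 \Rightarrow \neg i_6$            0.46/8   15.9
i4, i6      $\neg i_6 \Rightarrow \neg i_4$            0.46/8   18.4
i1, i4, i5  $i_1, i_4 \Rightarrow \neg i_5$            1.376/8  40.8
i1, i4, i5  $\neg i_5 \Rightarrow i_1, i_4$            1.376/8  62.5
i1, i4, i5  $i_1, i_5 \Rightarrow \neg i_4$            2.146/8  51.8
i1, i4, i5  $\neg i_4 \Rightarrow i_1, i_5$            2.146/8  74.0
i1, i4, i5  $i_4, i_5 \Rightarrow \neg i_1$            1.206/8  37.6
i1, i4, i5  $\neg i_1 \Rightarrow i_4, i_5$            1.206/8  52.4
i1, i4, i5  $i_1 \Rightarrow \neg i_4, \neg i_5$       0.184/8  3.20
i1, i4, i5  $\neg i_4, \neg i_5 \Rightarrow i_1$       0.184/8  61.3
i1, i4, i5  $i_4 \Rightarrow \neg i_1, \neg i_5$       0.524/8  10.3
i1, i4, i5  $\neg i_1, \neg i_5 \Rightarrow i_4$       0.524/8  81.9
i1, i4, i5  $i_5 \Rightarrow \neg i_1, \neg i_4$       0.454/8  7.80
i1, i4, i5  $\neg i_1, \neg i_4 \Rightarrow i_5$       0.454/8  79.6
i1, i4, i5  $\neg i_1 \Rightarrow \neg i_4, \neg i_5$  0.116/8  5.00
i1, i4, i5  $\neg i_4, \neg i_5 \Rightarrow \neg i_1$  0.116/8  38.7
i1, i4, i5  $\neg i_4 \Rightarrow \neg i_1, \neg i_5$  0.116/8  4.00
i1, i4, i5  $\neg i_1, \neg i_5 \Rightarrow \neg i_4$  0.116/8  18.1
i1, i4, i5  $\neg i_5 \Rightarrow \neg i_1, \neg i_4$  0.116/8  5.30
i1, i4, i5  $\neg i_1, \neg i_4 \Rightarrow \neg i_5$  0.116/8  20.4
Support threshold 25%, confidence threshold 70%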
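How the supports of negated itemsets can be computed is sketched below in Python. Two assumptions are made that the slide does not state explicitly: a negated item has degree 1 minus the item's membership degree, and degrees are combined with the product. The `~` prefix for negation and the function names are illustrative conventions, not the presenter's notation.

```python
from math import prod

# Transaction database from the example: rows t1..t8, columns i1..i6.
ITEMS = ["i1", "i2", "i3", "i4", "i5", "i6"]
ROWS = [
    [1.0, 0.7, 0.2, 0.0, 1.0, 1.0],
    [0.8, 0.0, 0.6, 0.8, 0.4, 0.2],
    [0.5, 0.8, 0.0, 0.8, 0.8, 0.0],
    [0.7, 0.2, 1.0, 0.9, 1.0, 0.8],
    [0.4, 0.4, 0.0, 0.6, 0.8, 0.9],
    [0.8, 0.0, 0.1, 1.0, 0.1, 0.8],
    [0.9, 0.9, 0.8, 0.2, 1.0, 1.0],
    [0.6, 0.1, 0.1, 0.8, 0.7, 0.8],
]
DB = [dict(zip(ITEMS, row)) for row in ROWS]

def degree(t, literal):
    """Degree of a literal in transaction t; '~i4' is read as 'not i4' (assumed 1 - t[i4])."""
    return 1.0 - t[literal[1:]] if literal.startswith("~") else t[literal]

def supp(literals):
    """Fuzzy support of a list of possibly negated items."""
    return sum(prod(degree(t, lit) for lit in literals) for t in DB)

def conf(lhs, rhs):
    """Conf(lhs => rhs) = Supp(lhs u rhs) / Supp(lhs)."""
    return supp(lhs + rhs) / supp(lhs)

print(round(supp(["i6", "~i4"]), 2))    # 2.44
print(round(conf(["i6"], ["~i4"]), 3))  # 0.444
print(round(conf(["~i4"], ["i6"]), 3))  # 0.841
```

With the 25% support and 70% confidence thresholds, the rule from not-i4 to i6 (support 2.44/8, confidence 84.1%) is an example of a negative rule that qualifies.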
17. Associative Classification Rules
- Associative classification rules are a special subset of association rules whose right-hand side is restricted to the class labels.
- In classification, data attributes are partitioned into two categories: condition attributes and decision attributes.
- For simplicity, decision attributes are converted into decision attribute-value pairs, which serve as class labels.
- Thus, class labels are also items in the database, but separate from condition items.
18. Two Constraints
- The left-hand side of classification rules must be frequent itemsets of condition attributes, or the negation of infrequent conditional itemsets
- The class labels that appear in the right-hand side of classification rules must also be frequent 1-itemsets
19. Positive Fuzzy Associative Classification Rules
- Let $A \subseteq I$ be an itemset, and $c \in C$ be a class label. The relationship $A \Rightarrow c$ is a positive fuzzy associative classification rule if the following conditions hold:
  - 1) $A \cup \{c\}$ is a frequent itemset in $D$: $\mathrm{Supp}(A \cup \{c\}) / |D| \ge \mathit{minsupp}$
  - 2) $A \Rightarrow c$ is confident: $\mathrm{Conf}(A \Rightarrow c) = \mathrm{Supp}(A \cup \{c\}) / \mathrm{Supp}(A) \ge \mathit{minconf}$
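20. Negative Fuzzy Associative Classification Rules
- We only consider the form $\neg A \Rightarrow c$
  - where $A$ is a frequent itemset,
  - $c$ is a frequent class label,
  - $A \cup \{c\}$ is infrequent
- $\neg A \Rightarrow c$ is a negative fuzzy associative classification rule if
  - 1) $\mathrm{Supp}(A) \ge \mathit{minsupp}$
  - 2) $\mathrm{Supp}(c) \ge \mathit{minsupp}$
  - 3) $\mathrm{Supp}(A \cup \{c\}) / |D| < \mathit{minsupp}$
  - 4) $\mathrm{Supp}(\neg A \cup \{c\}) / |D| \ge \mathit{minsupp}$
  - 5) $\mathrm{Conf}(\neg A \Rightarrow c) = \mathrm{Supp}(\neg A \cup \{c\}) / \mathrm{Supp}(\neg A) \ge \mathit{minconf}$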
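Conditions 4 and 5 can be evaluated from supports of non-negated itemsets alone. The derivation below rests on two assumptions that the slide does not state: membership degrees are combined with the product, and the negated itemset has degree $1 - \prod_{a \in A} t[a]$ in each transaction $t$.

```latex
\begin{align*}
\mathrm{Supp}(\neg A \cup \{c\})
  &= \sum_{t \in D} \Bigl(1 - \prod_{a \in A} t[a]\Bigr)\, t[c]
   = \mathrm{Supp}(\{c\}) - \mathrm{Supp}(A \cup \{c\}), \\
\mathrm{Supp}(\neg A)
  &= \sum_{t \in D} \Bigl(1 - \prod_{a \in A} t[a]\Bigr)
   = |D| - \mathrm{Supp}(A), \\
\mathrm{Conf}(\neg A \Rightarrow c)
  &= \frac{\mathrm{Supp}(\{c\}) - \mathrm{Supp}(A \cup \{c\})}{|D| - \mathrm{Supp}(A)}.
\end{align*}
```

For single items this agrees with the earlier example: the support of not-i4 together with i6 is 5.5 - 3.06 = 2.44, and the confidence of the rule from not-i4 to i6 is 2.44 / (8 - 5.1), about 84.1%.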
21. Learning Algorithm
- Step 1: Finding the set of frequent conditional itemsets for associative classification rules
- Step 2: Inducing both positive and negative fuzzy associative classification rules
  - Add each frequent class label $c$ to each frequent itemset $X$
  - If $X \cup \{c\}$ is still frequent, then test whether $X \Rightarrow c$ is a positive fuzzy association rule
  - If $X \cup \{c\}$ is infrequent, then test whether $\neg X \Rightarrow c$ is a negative fuzzy association rule
  - A frequent itemset $Y$ is partitioned into two subsets $A$ and $B$, and the associations $\neg A \cup B \Rightarrow c$ and $A \cup \neg B \Rightarrow c$ are tested against the support threshold and confidence threshold
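The two steps can be sketched in Python as below. This is an illustration under stated assumptions rather than the presenter's implementation: degrees are combined with the product, all support thresholds are read as fractions of |D|, a negated itemset has degree 1 minus the product of its items' degrees, and frequent itemsets are enumerated by brute force. The mixed forms that negate only part of an itemset are left out to keep the sketch short. The weather-style data and all names are hypothetical.

```python
from itertools import combinations
from math import prod

def supp(items, db):
    """Fuzzy support: sum over transactions of the product of membership degrees."""
    return sum(prod(t[i] for i in items) for t in db)

def learn_rules(db, condition_items, class_labels, minsupp, minconf):
    n = len(db)
    # Step 1: frequent condition itemsets (brute-force enumeration for brevity;
    # a level-wise search would prune candidates instead).
    frequent = [x for k in range(1, len(condition_items) + 1)
                for x in combinations(condition_items, k)
                if supp(x, db) / n >= minsupp]
    labels = [c for c in class_labels if supp((c,), db) / n >= minsupp]
    # Step 2: pair every frequent itemset X with every frequent class label c.
    rules = []
    for x in frequent:
        for c in labels:
            s_x, s_c, s_xc = supp(x, db), supp((c,), db), supp(x + (c,), db)
            if s_xc / n >= minsupp:
                if s_xc / s_x >= minconf:
                    rules.append(("positive", x, c))
            else:
                # Assumption: not-X has degree 1 - prod(X), which gives
                # Supp(not-X u c) = Supp(c) - Supp(X u c), Supp(not-X) = n - Supp(X).
                s_notx_c, s_notx = s_c - s_xc, n - s_x
                if s_notx > 0 and s_notx_c / n >= minsupp and s_notx_c / s_notx >= minconf:
                    rules.append(("negative", x, c))
    return rules

# Hypothetical data: fuzzy condition items "hot" and "humid", crisp class labels.
db = [
    {"hot": 0.9, "humid": 0.8, "play": 0, "stay": 1},
    {"hot": 0.8, "humid": 0.3, "play": 0, "stay": 1},
    {"hot": 0.1, "humid": 0.6, "play": 1, "stay": 0},
    {"hot": 0.2, "humid": 0.9, "play": 1, "stay": 0},
    {"hot": 0.7, "humid": 0.2, "play": 0, "stay": 1},
    {"hot": 0.3, "humid": 0.4, "play": 1, "stay": 0},
]
rules = learn_rules(db, ["hot", "humid"], ["play", "stay"], minsupp=0.3, minconf=0.75)
print(rules)
# [('negative', ('hot',), 'play'), ('positive', ('hot',), 'stay')]
```

On this toy data the sketch finds one positive rule (hot implies stay) and one negative rule (not hot implies play), each with confidence 0.8.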
22. Conclusion
- Traditional association rules
- Fuzzy extensions and negative rules
- Fuzzy associative classification rules
- An example
- Algorithms