Title: Data Mining Techniques Association Rule
1Data Mining Techniques Association Rule
2What Is Association Mining?
- Association Rule Mining
- Finding frequent patterns, associations, correlations, or causal structures among itemsets in transaction databases, relational databases, and other information repositories
- Applications
- Market basket analysis (marketing strategy: items to put on sale at reduced prices), cross-marketing, catalog design, shelf space layout design, etc.
- Examples
- Rule form: Body → Head [support, confidence]
- buys(x, Computer) → buys(x, Software) [2%, 60%]
- major(x, CS) ∧ takes(x, DB) → grade(x, A) [1%, 75%]
3Market Basket Analysis
Typically, association rules are considered
interesting if they satisfy both a minimum
support threshold and a minimum confidence
threshold.
4Rule Measures Support and Confidence
- With minimum support 50% and minimum confidence 50%, we have:
- A → C (50%, 66.6%)
- C → A (50%, 100%)
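The support and confidence figures on this slide can be reproduced with a short sketch. The slide's own transaction table is not part of this transcript, so the four transactions below are an assumption chosen to yield the same numbers; the helper names `support` and `confidence` are illustrative as well.

```python
# Illustrative transactions (an assumption: the slide's table is not in the transcript).
transactions = [
    {"A", "B", "C"},
    {"A", "C"},
    {"A", "D"},
    {"B", "E", "F"},
]

def support(itemset, db):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in db) / len(db)

def confidence(body, head, db):
    """support(body ∪ head) / support(body)."""
    return support(set(body) | set(head), db) / support(body, db)

for body, head in [({"A"}, {"C"}), ({"C"}, {"A"})]:
    s = support(body | head, transactions)
    c = confidence(body, head, transactions)
    print(f"{sorted(body)} -> {sorted(head)}: support={s:.1%}, confidence={c:.1%}")
```

On this data, A → C has support 2/4 and confidence 2/3 (the slide truncates the latter to 66.6%), while C → A has support 2/4 and confidence 2/2, so both rules pass the 50% thresholds.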
5Support Confidence
6Association Rule Basic Concepts
- Given
- (1) a database of transactions,
- (2) each transaction is a list of items (purchased by a customer in a visit)
- Find all rules that correlate the presence of one set of items with that of another set of items
- Find all the rules A → B with minimum confidence and support
- support, s = P(A ∪ B): the probability that a transaction contains all items of both A and B
- confidence, c = P(B | A): the probability that a transaction containing A also contains B
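Written as transaction counts, the two measures look as follows. The symbols D (the transaction database) and T (a single transaction) are notation introduced here for illustration, not taken from the slides.

```latex
\[
s(A \rightarrow B) = P(A \cup B)
  = \frac{\lvert \{\, T \in D : A \cup B \subseteq T \,\} \rvert}{\lvert D \rvert},
\qquad
c(A \rightarrow B) = P(B \mid A)
  = \frac{\lvert \{\, T \in D : A \cup B \subseteq T \,\} \rvert}
         {\lvert \{\, T \in D : A \subseteq T \,\} \rvert}.
\]
```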
7Terminologies
- Item: I1, I2, I3, ... or A, B, C, ...
- Itemset: {I1}, {I1, I7}, {I2, I3, I5}, ... or {A}, {A, G}, {B, C, E}, ...
- 1-Itemset: {I1}, {I2}, {A}, ...
- 2-Itemset: {I1, I7}, {I3, I5}, {A, G}, ...
8Terminologies
- K-Itemset: an itemset of length K
- Frequent (Large) K-Itemset: an itemset of length K that satisfies a minimum support threshold
- Association Rule: a rule that satisfies both a minimum support threshold and a minimum confidence threshold
9Analysis
- The number of itemsets of a given cardinality
tends to grow exponentially
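To make the growth concrete: with n distinct items there are C(n, k) itemsets of cardinality k and 2^n - 1 non-empty itemsets in total. The sketch below tabulates these counts for a few values of n; the function name `itemset_counts` is hypothetical.

```python
from math import comb

def itemset_counts(n_items):
    """Return {k: number of distinct k-itemsets} for k = 1..n_items."""
    return {k: comb(n_items, k) for k in range(1, n_items + 1)}

for n in (5, 10, 20):
    counts = itemset_counts(n)
    total = sum(counts.values())  # equals 2**n - 1, the number of non-empty itemsets
    print(f"n={n}: 2-itemsets={counts[2]}, largest level={max(counts.values())}, total={total}")
```

Even 20 items already give more than a million possible itemsets, which is why candidate pruning matters.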
10Fast Algorithms for Mining Association Rules
11Mining Association Rules Apriori Principle
Min. support 50%, min. confidence 50%
- For rule A → C
- support = support(A ∪ C) = 50%
- confidence = support(A ∪ C) / support(A) = 66.6%
- The Apriori principle
- Any subset of a frequent itemset must be frequent
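The Apriori principle can be checked by brute force on a small database. This is a minimal sketch under stated assumptions: the four transactions are illustrative (the slide's table is not in the transcript), and the function names are hypothetical.

```python
from itertools import combinations

# Illustrative transactions (an assumption, not the slide's data).
transactions = [{"A", "B", "C"}, {"A", "C"}, {"A", "D"}, {"B", "E", "F"}]
MIN_SUPPORT = 0.5

def support(itemset, db):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(set(itemset) <= t for t in db) / len(db)

def frequent_itemsets_bruteforce(db, min_support):
    """Enumerate every possible itemset and keep those meeting min_support."""
    items = sorted(set().union(*db))
    found = set()
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            if support(combo, db) >= min_support:
                found.add(frozenset(combo))
    return found

def is_downward_closed(family):
    """True if every non-empty proper subset of each member is also a member."""
    return all(
        frozenset(sub) in family
        for itemset in family
        for k in range(1, len(itemset))
        for sub in combinations(itemset, k)
    )

frequent = frequent_itemsets_bruteforce(transactions, MIN_SUPPORT)
print(sorted("".join(sorted(s)) for s in frequent))
print(is_downward_closed(frequent))
```

Because a transaction that contains an itemset also contains all of its subsets, a subset's support can never be lower than the superset's, so the check holds for any database and threshold.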
12Mining Frequent Itemsets the Key Step
- Find the frequent itemsets: the sets of items that have minimum support
- A subset of a frequent itemset must also be a frequent itemset
- i.e., if {A, B} is a frequent itemset, both {A} and {B} must be frequent itemsets
- Iteratively find frequent itemsets with cardinality from 1 to k (k-itemsets)
- Use the frequent itemsets to generate association rules
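The two steps above (level-wise search for frequent itemsets, then rule generation) can be sketched as follows. This is a simplified illustration, not the exact procedure from the slides: candidates are built from unions of frequent (k-1)-itemsets rather than a prefix join, and the data and function names are assumptions.

```python
from itertools import combinations

def apriori(db, min_support):
    """Level-wise search: return {frozenset: support} for all frequent itemsets."""
    n = len(db)

    def supp(itemset):
        return sum(itemset <= t for t in db) / n

    def keep_frequent(candidates):
        scored = {c: supp(c) for c in candidates}
        return {c: v for c, v in scored.items() if v >= min_support}

    # Level 1: frequent single items.
    level = keep_frequent({frozenset([i]) for i in set().union(*db)})
    frequent = {}
    k = 1
    while level:
        frequent.update(level)
        k += 1
        prev = set(level)
        # Candidates: unions of two frequent (k-1)-itemsets that have exactly k items...
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # ...whose (k-1)-subsets are all frequent (Apriori pruning).
        candidates = {c for c in candidates
                      if all(frozenset(sub) in prev for sub in combinations(c, k - 1))}
        level = keep_frequent(candidates)
    return frequent

def generate_rules(frequent, min_confidence):
    """Return (body, head, support, confidence) for rules meeting min_confidence."""
    rules = []
    for itemset, s in frequent.items():
        for r in range(1, len(itemset)):
            for body in map(frozenset, combinations(itemset, r)):
                conf = s / frequent[body]  # body is frequent by the Apriori principle
                if conf >= min_confidence:
                    rules.append((body, itemset - body, s, conf))
    return rules

# Illustrative transactions (an assumption, not the slide's data).
transactions = [{"A", "B", "C"}, {"A", "C"}, {"A", "D"}, {"B", "E", "F"}]
frequent = apriori(transactions, min_support=0.5)
rules = generate_rules(frequent, min_confidence=0.5)
for body, head, s, c in rules:
    print(f"{sorted(body)} -> {sorted(head)}: support={s:.1%}, confidence={c:.1%}")
```

Rule generation needs no further database scans here, since the support of every body is already stored from the first step.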
13Example
14Example of Generating Candidates
- L3 = {abc, abd, acd, ace, bcd}
- Self-joining: L3 * L3
- abcd from abc and abd
- acde from acd and ace
- Pruning
- acde is removed because ade is not in L3
- C4 = {abcd}
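The join and prune steps of this example can be traced in code. The sketch below assumes itemsets are kept as sorted tuples and joins two (k-1)-itemsets when they agree on everything except their last item; the name `apriori_gen` is chosen for illustration.

```python
from itertools import combinations

def apriori_gen(prev_level):
    """Return (joined, pruned) candidate k-itemsets built from frequent (k-1)-itemsets."""
    prev = sorted(tuple(sorted(s)) for s in prev_level)
    prev_set = set(prev)
    joined = []
    # Join step: merge pairs that share all items except the last one.
    for i, p in enumerate(prev):
        for q in prev[i + 1:]:
            if p[:-1] == q[:-1]:
                joined.append(p + (q[-1],))
    # Prune step: drop candidates that have an infrequent (k-1)-subset.
    pruned = [c for c in joined
              if all(sub in prev_set for sub in combinations(c, len(c) - 1))]
    return joined, pruned

L3 = ["abc", "abd", "acd", "ace", "bcd"]
joined, C4 = apriori_gen(L3)
print(["".join(c) for c in joined])  # ['abcd', 'acde']
print(["".join(c) for c in C4])      # ['abcd']
```

The candidate acde is discarded without counting its support, because its subset ade (and also cde) is missing from L3.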
15Example
16Apriori Algorithm
17Apriori Algorithm
18Apriori Algorithm
19Exercise 4
min-sup = 20%, min-conf = 80%
20Demo-IBM Intelligent Miner
21Demo Database
25Multi-Dimensional Association
- Single-Dimensional (Intra-Dimension) Rules: a single dimension (predicate) with multiple occurrences
- buys(X, milk) → buys(X, bread)
- Multi-Dimensional Rules: ≥ 2 dimensions
- Inter-dimension association rules (no repeated predicates)
- age(X, 19-25) ∧ occupation(X, student) → buys(X, coke)
- Hybrid-dimension association rules (repeated predicates)
- age(X, 19-25) ∧ buys(X, popcorn) → buys(X, coke)
- Categorical (Nominal) Attributes
- finite number of possible values, no ordering among values
- Quantitative Attributes
- numeric, implicit ordering among values
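One common way to evaluate a multi-dimensional rule is to encode each record as a set of attribute=value items, discretizing quantitative attributes into intervals first. The slides do not prescribe this encoding; it is an assumption made for this sketch, and the records, bin edges, and function names are hypothetical.

```python
def age_bucket(age):
    """Map a numeric age to an interval label (hypothetical bin edges)."""
    return "19-25" if 19 <= age <= 25 else "other"

def to_transaction(record):
    """Encode one record as a set of 'attribute=value' items."""
    items = {f"age={age_bucket(record['age'])}", f"occupation={record['occupation']}"}
    items |= {f"buys={product}" for product in record["buys"]}
    return items

# Hypothetical records, for illustration only.
records = [
    {"age": 21, "occupation": "student", "buys": ["coke", "popcorn"]},
    {"age": 23, "occupation": "student", "buys": ["coke"]},
    {"age": 24, "occupation": "student", "buys": ["bread"]},
    {"age": 40, "occupation": "engineer", "buys": ["coke", "milk"]},
]
transactions = [to_transaction(r) for r in records]

def support(itemset, db):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in db) / len(db)

# Inter-dimension rule: age(X, 19-25) AND occupation(X, student) -> buys(X, coke)
body = {"age=19-25", "occupation=student"}
head = {"buys=coke"}
rule_support = support(body | head, transactions)
rule_confidence = rule_support / support(body, transactions)
print(f"support={rule_support:.1%}, confidence={rule_confidence:.1%}")
```

After this encoding, the same support and confidence computations used for single-dimensional rules apply unchanged.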
26Exercise 5
min-sup = 20%, min-conf = 80%
27Research Topics
- Quantitative Association Rules
- buys(bread, 5) → buys(milk, 3)
- Weighted Association Rules
- High Utility Association Rules
- Non-redundant Association Rule
- Constrained Association Rules Mining
- Multi-dimensional Association Rules
- Generalized Association Rules
- Negative Association Rules
- Incremental Mining Association Rules
- Data Stream Association Rule Mining
- Interactive Mining Association Rules