Title: Chapter 11 Association Rules
- Dongil Kim
- Data Mining Lab.
- March 31, 2008
- From the Survey Result of ??????? (n = 1,300)
- The most important ability (skill) in the fields
- 1. Excel
- 2. Powerpoint
- 3. Data Analysis
- 4..
1. Introduction
- What Goes with What?
- Market Baskets
1. Introduction
- What Goes with What?
- Book Stores
Which one seems more associated?
2. Association Rules
- Association Rules (Affinity Analysis)
- (A.K.A.) Market Basket Analysis (MBA)
- What goes with what?
- Which items are purchased together?
- Diaper and Beer
- They are purchased together
- Why?
2. Association Rules
- What Can We Do?
- Purpose of analyzing market basket
- Wine, Cheese
- Understanding customers
- Understanding merchandise
- Discount Wine (Discount Cheese?)
- Cross-selling
- Display (Short or longer?)
2. Association Rules
- Where Can MBA be Applied?
- Retail
- Wine, Cheese, Diaper, Beer, Bread, Car supplies
- Credit card
- Gas, Family Restaurant, Airlines, Bus, VISA
- Telecom company: option design
- Bank
- Medical history
2. Association Rules
Define Items and Transactions
Find Rules: Support, Confidence, Lift
Analyze Results
2. Association Rules
- Define Items (WHAT goes with WHAT)
- POS data
- Items: Beer, Whiskey, Coke, 7-up, Nacho, Homerun-ball
- Categories: Drinks, Beverages, Snacks
- Book Store
- Items: Hackers TOEIC, Tomato TOEIC, Hacker TOEFL, Norwegian Wood
- Categories: TOEIC, Business, Novel
2. Association Rules
- Define Items (WHAT goes with WHAT)
- Taxonomy
- Low level: more actionable result, less confident result
- High level: less actionable result, more confident result
- A trade-off
Merchandise
Drinks, Microwave Instants, Snacks
Coke, Beer, Wine, Milk
Is there any point you can easily find?
Cocacola, Pepsi, Krombacker, Becksdark
Cocacola Zero 500ml, Cocacola 1L
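The trade-off between taxonomy levels can be tried out by rewriting each basket at a higher level before mining. The following is a minimal sketch only: the `PARENT` mapping is a hypothetical reading of the hierarchy listed above (which brand belongs under which category is an assumption), and `roll_up` is an illustrative helper name, not part of any tool mentioned in these slides.

```python
# Hypothetical taxonomy: each item points to its parent one level up.
# The parent assignments are assumptions made for illustration.
PARENT = {
    "Cocacola Zero 500ml": "Cocacola",
    "Cocacola 1L": "Cocacola",
    "Cocacola": "Coke",
    "Pepsi": "Coke",
    "Krombacker": "Beer",
    "Becksdark": "Beer",
    "Coke": "Drinks",
    "Beer": "Drinks",
    "Wine": "Drinks",
    "Milk": "Drinks",
    "Drinks": "Merchandise",
    "Microwave Instants": "Merchandise",
    "Snacks": "Merchandise",
}


def roll_up(item, levels=1):
    """Replace an item by its ancestor `levels` steps up the taxonomy."""
    for _ in range(levels):
        item = PARENT.get(item, item)  # the root (or an unknown item) stays as it is
    return item


basket = ["Cocacola Zero 500ml", "Krombacker"]
print(sorted({roll_up(i, 1) for i in basket}))  # ['Beer', 'Cocacola']
print(sorted({roll_up(i, 2) for i in basket}))  # ['Coke', 'Drinks']
```

Rolling up makes items more frequent, so rules get more support, but a rule about "Drinks" is harder to act on than a rule about a specific product.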
2. Association Rules
- Define Transactions (what goes WITH what)
- POS data
- A single market basket
- Ingredient
- A single dish vs. a whole course meal
- Bank Account
- Open 3 or 4 different accounts in a single day?
- Credit Card
- Card use within a single day, or over all time?
- Time, Space, Purpose, Character
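How the transaction is defined changes what can "go with" what. As a rough sketch, the same records can be grouped per customer per day or per customer over all time. The records below are made up for illustration, and `build_transactions` is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical credit-card records: (customer, date, merchant type)
records = [
    ("c1", "2008-03-01", "Gas"),
    ("c1", "2008-03-01", "Family Restaurant"),
    ("c1", "2008-03-15", "Airlines"),
    ("c2", "2008-03-02", "Gas"),
    ("c2", "2008-03-20", "Bus"),
]


def build_transactions(records, key):
    """Group items into transactions; `key` decides what counts as one transaction."""
    groups = defaultdict(set)
    for customer, date, item in records:
        groups[key(customer, date)].add(item)
    return dict(groups)


per_day = build_transactions(records, key=lambda c, d: (c, d))
all_time = build_transactions(records, key=lambda c, d: c)

print(len(per_day))            # 4 transactions: one per customer per day
print(len(all_time))           # 2 transactions: one per customer
print(sorted(all_time["c1"]))  # ['Airlines', 'Family Restaurant', 'Gas']
```

With the per-day definition, Gas and Airlines never share a transaction for customer c1. With the all-time definition they do.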
2. Association Rules
Binary n × m table (XLMiner, any mathematical software)
Maybe an original form of transaction data (XLMiner)
2 1 table (SAS)
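As an illustration of these input layouts, the sketch below turns a list of baskets into a binary table (one row per transaction, one column per item) and into a long list with one row per (transaction ID, item) pair. The six baskets are hypothetical, and nothing here reproduces the exact input format of XLMiner or SAS.

```python
# Hypothetical baskets: one list of items per transaction.
transactions = [
    ["Beer", "Wine", "Snack"],
    ["Beer", "Snack"],
    ["Beer", "Snack"],
    ["Wine", "Snack"],
    ["Wine"],
    ["Snack"],
]

# Column order of the binary table: all distinct items, sorted by name.
items = sorted({item for t in transactions for item in t})

# Binary table: 1 if the item is in the transaction, else 0.
matrix = [[1 if item in t else 0 for item in items] for t in transactions]

# Long layout: one (transaction ID, item) row per purchased item.
long_format = [
    (tid, item)
    for tid, t in enumerate(transactions, start=1)
    for item in t
]

print(items)            # ['Beer', 'Snack', 'Wine']
print(matrix[0])        # [1, 1, 1]
print(matrix[4])        # [0, 0, 1]
print(long_format[:3])  # [(1, 'Beer'), (1, 'Wine'), (1, 'Snack')]
```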
2. Association Rules
- Statement of the Rule
- IF (condition) THEN (result)
- IF (antecedent) THEN (consequent)
- IF (A) THEN (B)
- A -> B
- Multi-item conditions
- IF (A,B) THEN (C)
- IF (Radiohead, Oasis) THEN (Suede)
- Symmetric?
- IF (Diaper) THEN (Beer)
- IF (Beer) THEN (Diaper)
2. Association Rules
- Support
- How often item A (or the itemset {A, B}) appears in the database, as a fraction of all transactions
- P(A) or P(A and B) or P(A ∩ B)
- Ex)
Support(Beer) = 3/6, Support(Wine) = 3/6, Support(Beer, Wine) = 1/6, Support(Snack) = 5/6
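The support values in the example can be checked with a few lines of code. The transaction table behind the example is not part of this text, so the six baskets below are a hypothetical dataset chosen only because it reproduces the stated numbers. `support` is an illustrative helper name.

```python
from fractions import Fraction

# Hypothetical baskets, chosen to match the support values in the example.
baskets = [
    {"Beer", "Wine", "Snack"},
    {"Beer", "Snack"},
    {"Beer", "Snack"},
    {"Wine", "Snack"},
    {"Wine"},
    {"Snack"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for b in baskets if itemset <= b)
    return Fraction(hits, len(baskets))


print(support({"Beer"}, baskets))          # 1/2, i.e. 3/6
print(support({"Wine"}, baskets))          # 1/2, i.e. 3/6
print(support({"Beer", "Wine"}, baskets))  # 1/6
print(support({"Snack"}, baskets))         # 5/6
```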
2. Association Rules
- Confidence
- Conditional probability (not symmetric)
- P(result | cond) = P(cond ∩ result)/P(cond)
- If confidence is large, we can say the items are associated
- Ex)
Conf(Beer -> Snack) = P(Beer ∩ Snack)/P(Beer) = (3/6)/(3/6) = 1
Conf(Wine -> Snack) = P(Wine ∩ Snack)/P(Wine) = (2/6)/(3/6) = 2/3
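A small sketch of the confidence calculation follows. It assumes a hypothetical six-basket dataset that is consistent with the numbers in the example, since the original table is not included here. The helper names are illustrative.

```python
from fractions import Fraction

# Hypothetical baskets, consistent with the numbers in the example.
baskets = [
    {"Beer", "Wine", "Snack"},
    {"Beer", "Snack"},
    {"Beer", "Snack"},
    {"Wine", "Snack"},
    {"Wine"},
    {"Snack"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    return Fraction(sum(1 for b in baskets if set(itemset) <= b), len(baskets))


def confidence(cond, result, baskets):
    """P(result | cond) = P(cond and result) / P(cond)."""
    return support(set(cond) | set(result), baskets) / support(cond, baskets)


print(confidence({"Beer"}, {"Snack"}, baskets))  # 1
print(confidence({"Wine"}, {"Snack"}, baskets))  # 2/3
print(confidence({"Snack"}, {"Beer"}, baskets))  # 3/5
```

The last line shows that swapping the two sides of the rule changes the confidence: Beer -> Snack has confidence 1, while Snack -> Beer has only 3/5.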
2. Association Rules
- Lift
- Lift(cond -> result) = Conf(cond -> result)/P(result)
- = P(cond ∩ result)/(P(cond) P(result))
- If lift > 1, cond and result are positively correlated (we can expect the result)
- Ex)
Lift(Beer -> Snack) = Conf(Beer -> Snack)/P(Snack) = 1/(5/6) = 6/5 > 1
Lift(Wine -> Snack) = Conf(Wine -> Snack)/P(Snack) = (2/3)/(5/6) = 12/15 < 1
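The lift values can be reproduced in the same way. As before, the six baskets are a hypothetical dataset picked to agree with the example, and `support` and `lift` are illustrative names.

```python
from fractions import Fraction

# Hypothetical baskets, consistent with the numbers in the example.
baskets = [
    {"Beer", "Wine", "Snack"},
    {"Beer", "Snack"},
    {"Beer", "Snack"},
    {"Wine", "Snack"},
    {"Wine"},
    {"Snack"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    return Fraction(sum(1 for b in baskets if set(itemset) <= b), len(baskets))


def lift(cond, result, baskets):
    """P(cond and result) / (P(cond) * P(result))."""
    joint = support(set(cond) | set(result), baskets)
    return joint / (support(cond, baskets) * support(result, baskets))


print(lift({"Beer"}, {"Snack"}, baskets))  # 6/5, greater than 1
print(lift({"Wine"}, {"Snack"}, baskets))  # 4/5, i.e. 12/15, less than 1
print(lift({"Snack"}, {"Beer"}, baskets))  # 6/5, same as Beer -> Snack
```

Unlike confidence, lift does not depend on the direction of the rule, because the formula treats cond and result the same way.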
2. Association Rules
- The Role of Lift
- Confidence just shows us a conditional probability
- Adjust Confidence by the frequency of the result item
- Lift = Confidence/P(result)
- Avoid meaningless rules
- Supermarket vs Liquor Store?
- Promotion sales?
- Ex)
- Conf(A -> B) = 0.9
- If P(B) = 1, then Lift(A -> B) = 0.9
- If P(B) = 1/10, then Lift(A -> B) = 9
2. Association Rules
- Analyze the Result
- Actionable Information
- Too trivial
- IF (Pizza) THEN (Coke)
- IF (Barbie Doll) THEN (Candy)
- So What
- Result of a previous action or an irregular event
- IF (Water) THEN (Chocolate)
- The dataset was gathered from 2/12 to 2/13
2. Association Rules
- Analyze the Result
- Correlation vs Causation
- Wine -> Strong Heart
- Rice Noodle -> Poor
- What can we do with the result?
- Changing display? (Shorter? Longer?)
- Discount?
- Cross-selling?
- Is there any profitable action?
2. Association Rules
- Analyze the Result
- Complement vs. Substitute
- Bread -> Milk
- Beer -> Wine
- Can we promote to cross-sell or discount?
- Why?
2. Association Rules
- Conclusion
- Advantages
- Originality
- Interpretability
- Easy to explain to people (who issue grants)
- Actionable
- Difficulties
- Scalability (Min-Support concept with Apriori)
- Defining items and transactions
- Rare items
- Problem dependent
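The min-support idea behind Apriori can be sketched as follows. This is a simplified illustration and not an optimized implementation: an itemset is only counted if all of its smaller subsets already met the minimum support, which keeps the number of candidates manageable. The baskets and the threshold of 0.3 are hypothetical.

```python
from itertools import combinations


def apriori(baskets, min_support):
    """Return {itemset: support} for all itemsets with support >= min_support."""
    n = len(baskets)

    def frequent(candidates):
        # Keep only the candidates that appear in enough baskets.
        kept = {}
        for c in candidates:
            s = sum(1 for b in baskets if c <= b) / n
            if s >= min_support:
                kept[c] = s
        return kept

    items = {i for b in baskets for i in b}
    level = frequent({frozenset([i]) for i in items})
    result = dict(level)
    k = 2
    while level:
        prev = set(level)
        # Join step: combine frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        level = frequent(candidates)
        result.update(level)
        k += 1
    return result


# Hypothetical baskets for illustration.
baskets = [
    {"Beer", "Wine", "Snack"},
    {"Beer", "Snack"},
    {"Beer", "Snack"},
    {"Wine", "Snack"},
    {"Wine"},
    {"Snack"},
]

for itemset, s in sorted(apriori(baskets, 0.3).items(),
                         key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), round(s, 2))
```

With this data, {Beer, Wine} falls below the threshold, so the three-item set {Beer, Wine, Snack} is pruned without being counted. The same threshold also explains the rare-item difficulty: itemsets involving rarely bought items are dropped even if the rule would be interesting.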
- Q & A
- Homepage: http://dmlab.snu.ac.kr
- E-mail: dikim01_at_snu.ac.kr
- Phone: 02-883-4913 (Lab.), 010-3439-7982 (Personal)