Title: Practices of Business Intelligence
1 Practices of Business Intelligence
Tamkang University
Data Mining for Business Intelligence
1022BI06 MI4 Wed, 9,10 (16:10-18:00) (B113)
Min-Yuh Day, Assistant Professor, Dept. of Information Management, Tamkang University
http://mail.tku.edu.tw/myday/
2014-03-26
2 Syllabus
- Week, Date, Subject/Topics
- 1 103/02/19 Introduction to Business Intelligence
- 2 103/02/26 Management Decision Support System and Business Intelligence
- 3 103/03/05 Business Performance Management
- 4 103/03/12 Data Warehousing
- 5 103/03/19 Data Mining for Business Intelligence
- 6 103/03/26 Data Mining for Business Intelligence
- 7 103/04/02 Off-campus study
- 8 103/04/09 Data Science and Big Data Analytics
3 Syllabus
- Week, Date, Subject/Topics
- 9 103/04/16 Midterm Project Presentation
- 10 103/04/23 Midterm Exam
- 11 103/04/30 Text and Web Mining
- 12 103/05/07 Opinion Mining and Sentiment Analysis
- 13 103/05/14 Social Network Analysis
- 14 103/05/21 Final Project Presentation
- 15 103/05/28 Final Exam
4 A Taxonomy for Data Mining Tasks
Source: Turban et al. (2011), Decision Support and Business Intelligence Systems
5 Market Basket Analysis
Source: Han & Kamber (2006)
6 Association Rule Mining
Source: Turban et al. (2011), Decision Support and Business Intelligence Systems
7 Basic Concepts: Frequent Patterns and Association Rules
- Itemset X = {x1, ..., xk}
- Find all the rules X ⇒ Y with minimum support and confidence
  - support, s: probability that a transaction contains X ∪ Y
  - confidence, c: conditional probability that a transaction having X also contains Y
Transaction-id   Items bought
10               A, B, D
20               A, C, D
30               A, D, E
40               B, E, F
50               B, C, D, E, F
Let supmin = 50%, confmin = 50%
Freq. Pat.: {A:3, B:3, D:4, E:3, AD:3}
Association rules:
- A ⇒ D (support = 3/5 = 60%, confidence = 3/3 = 100%)
- D ⇒ A (support = 3/5 = 60%, confidence = 3/4 = 75%)
Source: Han & Kamber (2006)
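The support and confidence figures on this slide can be checked by counting transactions directly. The following is a minimal sketch, not part of the original slides; the helper names `count`, `support`, and `confidence` are chosen here for illustration.

```python
# Transactions copied from the slide: transaction id -> items bought.
transactions = {
    10: {"A", "B", "D"},
    20: {"A", "C", "D"},
    30: {"A", "D", "E"},
    40: {"B", "E", "F"},
    50: {"B", "C", "D", "E", "F"},
}


def count(itemset):
    """Number of transactions that contain every item in `itemset`."""
    wanted = set(itemset)
    return sum(1 for items in transactions.values() if wanted <= items)


def support(lhs, rhs):
    """Share of all transactions that contain both sides of the rule."""
    return count(set(lhs) | set(rhs)) / len(transactions)


def confidence(lhs, rhs):
    """Share of the transactions containing `lhs` that also contain `rhs`."""
    return count(set(lhs) | set(rhs)) / count(lhs)


for lhs, rhs in [("A", "D"), ("D", "A")]:
    print(f"{lhs} => {rhs}: support = {support(lhs, rhs):.0%}, "
          f"confidence = {confidence(lhs, rhs):.0%}")
```

Both rules share the same support because support depends only on the combined itemset {A, D}; the confidences differ because the denominators, Count(A) = 3 and Count(D) = 4, differ.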
8 Market basket analysis
- Example
  - Which groups or sets of items are customers likely to purchase on a given trip to the store?
- Association Rule
  - Computer ⇒ antivirus_software [support = 2%, confidence = 60%]
  - A support of 2% means that 2% of all the transactions under analysis show that computer and antivirus software are purchased together.
  - A confidence of 60% means that 60% of the customers who purchased a computer also bought the software.
Source: Han & Kamber (2006)
9 Association rules
- Association rules are considered interesting if they satisfy both
  - a minimum support threshold and
  - a minimum confidence threshold.
Source: Han & Kamber (2006)
10 Frequent Itemsets, Closed Itemsets, and Association Rules
- Support (A ⇒ B) = P(A ∪ B)
- Confidence (A ⇒ B) = P(B|A)
Source: Han & Kamber (2006)
11 Support (A ⇒ B) = P(A ∪ B), Confidence (A ⇒ B) = P(B|A)
- The notation P(A ∪ B) indicates the probability that a transaction contains the union of set A and set B
  - (i.e., it contains every item in A and in B).
- This should not be confused with P(A or B), which indicates the probability that a transaction contains either A or B.
Source: Han & Kamber (2006)
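The difference between P(A ∪ B) and P(A or B) is easy to see by counting. The sketch below reuses the five transactions from the earlier slide, with the itemsets {A} and {D} picked here purely for illustration; the variable names are not from the slides.

```python
# The five transactions from the "Basic Concepts" slide.
transactions = [
    {"A", "B", "D"},
    {"A", "C", "D"},
    {"A", "D", "E"},
    {"B", "E", "F"},
    {"B", "C", "D", "E", "F"},
]

set_a = {"A"}  # itemset A (illustrative choice)
set_b = {"D"}  # itemset B (illustrative choice)
n = len(transactions)

# P(A ∪ B): the transaction contains every item of A and every item of B.
p_union = sum(1 for t in transactions if (set_a | set_b) <= t) / n

# P(A or B): the transaction contains all of A, or all of B, or both.
p_either = sum(1 for t in transactions if set_a <= t or set_b <= t) / n

print(p_union, p_either)
```

Transaction 50 contains D but not A, so it counts toward P(A or B) but not toward P(A ∪ B); this is why the second figure is larger.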
12
- Itemset
  - A set of items is referred to as an itemset.
- k-itemset
  - An itemset that contains k items is a k-itemset.
- Example
  - The set {computer, antivirus software} is a 2-itemset.
Source: Han & Kamber (2006)
13
- If the relative support of an itemset I satisfies a prespecified minimum support threshold, then I is a frequent itemset.
  - i.e., the absolute support of I satisfies the corresponding minimum support count threshold
- The set of frequent k-itemsets is commonly denoted by Lk.
Source: Han & Kamber (2006)
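The link between the relative threshold and the support count threshold can be sketched as follows. The example reuses the five transactions and the 50% minimum support from the earlier slide; the variable names are illustrative assumptions.

```python
from collections import Counter
from math import ceil

# The five transactions from the "Basic Concepts" slide.
transactions = [
    {"A", "B", "D"},
    {"A", "C", "D"},
    {"A", "D", "E"},
    {"B", "E", "F"},
    {"B", "C", "D", "E", "F"},
]

min_support = 0.5  # relative threshold (50%)
# Smallest whole number of transactions that reaches the relative threshold.
min_count = ceil(min_support * len(transactions))

# Count each single item, then keep those meeting the count threshold: this is L1.
item_counts = Counter(item for t in transactions for item in t)
L1 = {item: n for item, n in item_counts.items() if n >= min_count}

print(min_count, sorted(L1.items()))
```

Items C and F each appear in only two transactions, so they fall below the count threshold and are not part of L1, which agrees with the frequent patterns listed on the earlier slide.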
14
- The confidence of rule A ⇒ B can be easily derived from the support counts of A and A ∪ B.
- Once the support counts of A, B, and A ∪ B are found, it is straightforward to derive the corresponding association rules A ⇒ B and B ⇒ A and check whether they are strong.
- Thus the problem of mining association rules can be reduced to that of mining frequent itemsets.
Source: Han & Kamber (2006)
15 Transactional data for an AllElectronics branch
Source: Han & Kamber (2006)
16 Example: Apriori
- Let's look at a concrete example, based on the AllElectronics transaction database, D.
- There are nine transactions in this database, that is, |D| = 9.
- Apriori algorithm for finding frequent itemsets in D
Source: Han & Kamber (2006)
17 Example: Apriori Algorithm. Generation of candidate itemsets and frequent itemsets, where the minimum support count is 2.
Source: Han & Kamber (2006)
18 Example: Apriori Algorithm, C1 → L1
Source: Han & Kamber (2006)
19 Example: Apriori Algorithm, C2 → L2
Source: Han & Kamber (2006)
20 Example: Apriori Algorithm, C3 → L3
Source: Han & Kamber (2006)
21 The Apriori algorithm for discovering frequent itemsets for mining Boolean association rules.
Source: Han & Kamber (2006)
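The level-wise idea behind Apriori (count candidates Ck, keep the frequent ones as Lk, join and prune to get Ck+1) can be sketched in a few lines. This is an illustrative simplification, not the textbook pseudocode: the join step here simply unions pairs of frequent k-itemsets instead of using the prefix-based join. The transaction table did not survive in the text above, so the nine transactions below are an assumption based on the commonly reproduced AllElectronics example in Han & Kamber; the function and variable names are also chosen here.

```python
from itertools import combinations

# ASSUMPTION: nine AllElectronics transactions as commonly reproduced from
# Han & Kamber (2006); the original slide table is not available in the text.
D = [
    {"I1", "I2", "I5"},
    {"I2", "I4"},
    {"I2", "I3"},
    {"I1", "I2", "I4"},
    {"I1", "I3"},
    {"I2", "I3"},
    {"I1", "I3"},
    {"I1", "I2", "I3", "I5"},
    {"I1", "I2", "I3"},
]
MIN_SUPPORT_COUNT = 2  # minimum support count stated on the slide


def count_support(candidates, transactions):
    """Scan the database and count the transactions containing each candidate."""
    return {c: sum(1 for t in transactions if c <= t) for c in candidates}


def apriori(transactions, min_count):
    """Return {itemset: support count} for every frequent itemset."""
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([item]) for item in items]  # C1
    frequent = {}
    k = 1
    while candidates:
        counts = count_support(candidates, transactions)
        level = {c: n for c, n in counts.items() if n >= min_count}  # Lk
        frequent.update(level)
        # Join (simplified): union pairs of frequent k-itemsets of size k + 1.
        joined = {a | b for a in level for b in level if len(a | b) == k + 1}
        # Prune: every k-subset of a candidate must itself be frequent.
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k))
        ]
        k += 1
    return frequent


frequent_itemsets = apriori(D, MIN_SUPPORT_COUNT)
ordered = sorted(frequent_itemsets.items(),
                 key=lambda kv: (len(kv[0]), sorted(kv[0])))
for itemset, n in ordered:
    print(sorted(itemset), n)
```

The prune step is where the Apriori property is used: a candidate such as {I1, I2, I4} is discarded without scanning the database because its subset {I1, I4} is not frequent.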
22 Generating Association Rules from Frequent Itemsets
Source: Han & Kamber (2006)
23 Example: Generating association rules
- frequent itemset l = {I1, I2, I5}
- If the minimum confidence threshold is, say, 70%, then only the second, third, and last rules above are output, because these are the only ones generated that are strong.
Source: Han & Kamber (2006)
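Rule generation from a single frequent itemset can be sketched by splitting it into every non-empty left-hand side and remainder, then keeping the splits whose confidence meets the threshold. As before, the nine transactions are an assumption based on the commonly reproduced AllElectronics example in Han & Kamber, and the helper names are illustrative.

```python
from itertools import combinations

# ASSUMPTION: nine AllElectronics transactions as commonly reproduced from
# Han & Kamber (2006); the original slide table is not available in the text.
D = [
    {"I1", "I2", "I5"},
    {"I2", "I4"},
    {"I2", "I3"},
    {"I1", "I2", "I4"},
    {"I1", "I3"},
    {"I2", "I3"},
    {"I1", "I3"},
    {"I1", "I2", "I3", "I5"},
    {"I1", "I2", "I3"},
]


def support_count(itemset):
    """Number of transactions in D that contain every item in `itemset`."""
    return sum(1 for t in D if itemset <= t)


def rules_from(itemset, min_conf):
    """Return (lhs, rhs, confidence) for each strong rule lhs => rhs."""
    itemset = frozenset(itemset)
    whole = support_count(itemset)
    rules = []
    for size in range(1, len(itemset)):
        for lhs in combinations(sorted(itemset), size):
            lhs = frozenset(lhs)
            conf = whole / support_count(lhs)  # Count(l) / Count(lhs)
            if conf >= min_conf:
                rules.append((sorted(lhs), sorted(itemset - lhs), conf))
    return rules


strong_rules = rules_from({"I1", "I2", "I5"}, 0.70)
for lhs, rhs, conf in strong_rules:
    print(lhs, "=>", rhs, f"confidence = {conf:.0%}")
```

Under the assumed data, three of the six possible rules pass the 70% threshold, which is consistent with the count of strong rules stated on the slide.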
24 Support & Confidence
Source: SAS Enterprise Miner Course Notes, 2014, SAS
25 Support & Confidence
Saving Account (rows) by Checking Account (columns: No, Yes)
Saving Account = No: row total 4,000
Saving Account = Yes: row total 6,000
Total: 10,000
Lift (SVG ⇒ CK) = Confidence / Expected Confidence = 0.83 / 0.85 < 1
Source: SAS Enterprise Miner Course Notes, 2014, SAS
26 Lift
- Rule: Saving account ⇒ Checking account
- Lift (SVG ⇒ CK) = Confidence / Expected Confidence = 0.83 / 0.85 < 1
- A lift below 1 means that customers with a saving account are slightly less likely to have a checking account than customers overall, so the rule adds no useful information despite its high confidence.
Source: SAS Enterprise Miner Course Notes, 2014, SAS
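The lift calculation for the Saving account ⇒ Checking account rule can be reproduced with a few counts. The totals of 6,000 saving-account customers and 10,000 customers come from the slide; the other two counts below are assumptions chosen to be consistent with the slide's 0.83 confidence and 0.85 expected confidence, since the individual table cells did not survive in the text.

```python
total = 10_000         # all customers (from the slide)
saving_yes = 6_000     # customers with a saving account (from the slide)
both = 5_000           # ASSUMPTION: saving and checking, implied by 0.83
checking_yes = 8_500   # ASSUMPTION: customers with checking, implied by 0.85

support = both / total                      # P(SVG and CK)
confidence = both / saving_yes              # P(CK | SVG)
expected_confidence = checking_yes / total  # P(CK), ignoring the rule
lift = confidence / expected_confidence

print(round(confidence, 2), round(expected_confidence, 2), round(lift, 2))
```

Even though 83% confidence looks high, it is below the 85% baseline rate of checking accounts, which is exactly what a lift below 1 signals.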
27 Support (A ⇒ B), Confidence (A ⇒ B), Expected Confidence (A ⇒ B), Lift (A ⇒ B)
28
Support (A ⇒ B) = P(A ∪ B) = Count(A & B) / Count(Total)
Confidence (A ⇒ B) = P(B|A) = Supp (A ∪ B) / Supp (A) = Count(A & B) / Count(A)
Expected Confidence (A ⇒ B) = Support(B) = Count(B) / Count(Total)
Lift (A ⇒ B) = Confidence (A ⇒ B) / Expected Confidence (A ⇒ B)
Lift (A ⇒ B) = Supp (A ∪ B) / (Supp (A) × Supp (B))
Lift (Correlation): Lift (A ⇒ B) = Confidence (A ⇒ B) / Support(B)
29 Lift (A ⇒ B)
- Lift (A ⇒ B) = Confidence (A ⇒ B) / Expected Confidence (A ⇒ B) = Confidence (A ⇒ B) / Support(B) = (Supp (A ∪ B) / Supp (A)) / Supp (B) = Supp (A ∪ B) / (Supp (A) × Supp (B))
- Interpreting lift: Lift (A ⇒ B) = 2 means that customers who buy A are 2 times as likely to buy B as customers in general.
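The two forms of the lift formula can be checked against each other numerically. The sketch below uses the rule A ⇒ D from the five-transaction example earlier in the deck, where Count(A) = 3, Count(D) = 4, Count(A & D) = 3 and there are 5 transactions; the function name `rule_metrics` is an illustrative assumption.

```python
from math import isclose


def rule_metrics(count_a, count_b, count_ab, total):
    """Return (support, confidence, expected confidence, lift) for A => B."""
    support = count_ab / total
    confidence = count_ab / count_a
    expected_confidence = count_b / total
    lift = confidence / expected_confidence
    return support, confidence, expected_confidence, lift


# A => D in the five-transaction example.
support, confidence, expected, lift = rule_metrics(3, 4, 3, 5)

# Second form of the formula: Supp(A ∪ B) / (Supp(A) × Supp(B)).
lift_from_supports = support / ((3 / 5) * (4 / 5))

print(support, confidence, expected)
print(isclose(lift, lift_from_supports))
```

Because lift divides by Support(B), it corrects for items that are popular regardless of A; a rule can have high confidence and still have a lift near 1.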
32 Case Study 2 (Association Analysis using SAS EM): Web Site Usage Associations
34
- ABC offers Web services such as music streams, podcasts, news streams, live Web, and archives.
Source: SAS Enterprise Miner Course Notes, 2014, SAS
35
- Data set: webstation.sas7bdat
  - ARCHIVE
  - EXTREF
  - LIVESTREAM
  - MUSICSTREAM
  - NEWS
  - PODCAST
  - SIMULCAST
  - WEBSITE
Source: SAS Enterprise Miner Course Notes, 2014, SAS
37 SAS Enterprise Miner (SAS EM) Case Study
- SAS EM: 4 steps
  - Step 1. New Project
  - Step 2. New / Library
  - Step 3. Create Data Source
  - Step 4. Create Diagram
- SAS EM SEMMA
38 Download EM_Data.zip (SAS EM Datasets)
http://mail.tku.edu.tw/myday/teaching/1022/DM/Data/EM_Data.zip
http://mail.tku.edu.tw/myday/teaching.htm
39 Unzip EM_Data.zip to C:\DATA\EM_Data
41 VMware Horizon View Client, softcloud.tku.edu.tw, SAS Enterprise Miner
42 SAS Enterprise Guide (SAS EG)
43 SAS EG: New Project
44 SAS EG: Open Data
45 SAS EG: Open webstation.sas7bdat
46 webstation.sas7bdat
48 SAS Enterprise Miner 12.1 (SAS EM)
49 SAS EM: 4 steps
- Step 1. New Project
- Step 2. New / Library
- Step 3. Create Data Source
- Step 4. Create Diagram
50 Step 1. New Project
53 SAS Enterprise Miner (EM_Project2)
54 Step 2. New / Library
59 Step 3. Create Data Source
DatabaseName.TableName
LibraryName.TableName
EM_LIB.WEBSTATION
Data Source Attributes: Role = Transaction
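Association analysis generally works on transaction-style data, where each row pairs an identifier with one item and the rows sharing an identifier form a basket. The sketch below shows that grouping outside SAS EM. The rows and the `visitor_id` and `service` names are hypothetical and only reuse the service codes listed earlier; they are not taken from webstation.sas7bdat.

```python
from collections import defaultdict

# HYPOTHETICAL rows in transaction form: (visitor_id, service).
rows = [
    ("u1", "WEBSITE"),
    ("u1", "MUSICSTREAM"),
    ("u2", "WEBSITE"),
    ("u2", "NEWS"),
    ("u2", "PODCAST"),
    ("u3", "SIMULCAST"),
]

# Group the rows by identifier so that each visitor becomes one basket.
baskets = defaultdict(set)
for visitor_id, service in rows:
    baskets[visitor_id].add(service)

for visitor_id in sorted(baskets):
    print(visitor_id, sorted(baskets[visitor_id]))
```

Each basket can then be treated like one of the transactions in the earlier support and confidence examples.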
75 Step 4. Create Diagram
80 Sample
81 EM_Lib.Webstation
82 Sample: Edit Variable
83 Sample: Edit Variable - Explore
85 Explore - Association
86 Association Analysis
93 Association Analysis: Support 1 (Minimum Support 1)
96 Association Analysis: Rules Table
97 Association Analysis: Association Rules - Rules Table
99 Association Analysis: Link Graph
101 Association Analysis: Maximum Number of Items 3000000
103 Association Analysis: Association Rules - Rules Table
104 Association Analysis: Link Graph
105 References
- Efraim Turban, Ramesh Sharda, Dursun Delen, Decision Support and Business Intelligence Systems, Ninth Edition, 2011, Pearson.
- Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, Second Edition, 2006, Elsevier.
- Jim Georges, Jeff Thompson and Chip Wells, Applied Analytics Using SAS Enterprise Miner, SAS, 2010.
- SAS Enterprise Miner Course Notes, 2014, SAS.
- SAS Enterprise Miner Training Course, 2014, SAS.
- SAS Enterprise Guide Training Course, 2014, SAS.