Title: Data Warehouses
2. Knowledge Discovery in Databases (KDD)
3. Data Mining
4. Knowledge Discovery in Databases (KDD): Basic Steps
Based on Fayyad, Piatetsky-Shapiro, and Smyth (1996)
- Data selection
- Cleaning and pre-processing of target data
  - Removing noisy and erroneous data
  - Handling missing data values
- Dimensionality reduction and transformation
  - Selecting the most important features (attributes)
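The cleaning and reduction steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the column names, the "negative income is erroneous" rule, mean imputation, and the zero-variance filter are all hypothetical choices, not methods prescribed by the slides.

```python
from statistics import mean, pvariance

# Hypothetical target data; None marks a missing value.
records = [
    {"age": 34, "income": 52000.0, "region_code": 1},
    {"age": None, "income": 61000.0, "region_code": 1},
    {"age": 29, "income": -1.0, "region_code": 1},  # erroneous: negative income
    {"age": 45, "income": 78000.0, "region_code": 1},
]

def remove_erroneous(rows):
    """Drop rows with a negative income (an assumed validity rule)."""
    return [r for r in rows if r["income"] is None or r["income"] >= 0]

def impute_missing(rows, column):
    """Replace missing values in `column` with the mean of the observed ones."""
    fill = mean(r[column] for r in rows if r[column] is not None)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]

def drop_constant_features(rows):
    """Crude feature selection: keep only attributes whose variance is non-zero."""
    keep = [c for c in rows[0] if pvariance([r[c] for r in rows]) > 0]
    return [{c: r[c] for c in keep} for r in rows]

cleaned = drop_constant_features(impute_missing(remove_erroneous(records), "age"))
for row in cleaned:
    print(row)
```

The order matters in this sketch: erroneous rows are removed first so that they do not distort the mean used for imputation.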
5. Knowledge Discovery in Databases (continued): Basic Steps
Based on Fayyad, Piatetsky-Shapiro, and Smyth (1996)
- Data mining
- Selecting DM methods and tools
- Selecting the methods' parameters
- Building a model
6. Knowledge Discovery in Databases (continued): Basic Steps
Based on Fayyad, Piatetsky-Shapiro, and Smyth (1996)
- Post-processing, interpretation, and evaluation of results
- Visualization of results
- Evaluation of discovered patterns by statistical significance, interestingness, importance, relevance, actionability, etc.
- If necessary, returning to previous steps
7. Taxonomy of Data Mining Methods
- Verification
  - Goodness of fit
  - Hypothesis testing
  - Analysis of variance
- Discovery
  - Description
    - Clustering
    - Association rules
    - Linguistic summary
    - Visualization
  - Prediction
    - Regression
    - Classification
      - Bayesian networks
      - Decision trees
      - Support vectors
      - Neural networks
      - Others
8. Data Mining vs. Statistics
9. Methods of Data Mining
- Classification
  - Credit approval
  - Fraud detection
  - Churn detection
  - Medical diagnosis
  - Pattern recognition
- Clustering
  - Customer analysis
  - Document clustering
10. Methods of Data Mining (continued)
- Association rules
  - Retail analysis
- Regression
  - Forecasting
11. Look at all of the different industries
www.kdnuggets.com
12. Typical questions addressed by DM
- Which customers are most likely to drop their cell phone service?
- What is the probability that a customer will purchase at least $100 worth of merchandise from a particular mail-order catalog?
- Which prospects are most likely to respond to a particular offer?
13. Example
- X: bad situation
- O: good situation
- The graph represents historical data.
- How do we decide when to give a loan?
14. Rule of Thumb Example
- X: bad situation
- O: good situation
- 3 good cases will not get the loan.
- 2 bad cases will get the loan.
15. Data Mining Methods
- Classification: learning a function that maps a data item into one of several predefined classes.
- 1 good case will not get the loan.
- 2 bad cases will get the loan.
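As a rough illustration of "learning a function that maps a data item into a class", the sketch below learns a single income threshold from labelled history. Everything here is an assumption made for the example: the data points, the use of income as the only attribute, and the helper names `learn_threshold` and `classify`. The slide's graph is not reproduced, so the error counts differ from the ones quoted above.

```python
# Hypothetical historical data: (income in thousands, outcome).
# "O" = good situation, "X" = bad situation.
history = [(20, "X"), (25, "X"), (30, "O"), (35, "X"), (40, "O"), (50, "O"), (60, "O")]

def learn_threshold(samples):
    """Return (threshold, errors) for the rule 'income >= threshold -> O'
    that misclassifies the fewest historical samples."""
    best_t, best_errors = None, len(samples) + 1
    for t in sorted({income for income, _ in samples}):
        errors = sum((income >= t) != (label == "O") for income, label in samples)
        if errors < best_errors:  # strict '<' keeps the lowest threshold on ties
            best_t, best_errors = t, errors
    return best_t, best_errors

def classify(income, threshold):
    """Map a new applicant into one of the two predefined classes."""
    return "O" if income >= threshold else "X"

threshold, errors = learn_threshold(history)
print(threshold, errors)         # learned rule and its training errors
print(classify(45, threshold))   # decision for a new applicant
```

A learned rule of this kind plays the same role as the rule of thumb, except that the threshold is chosen from the data rather than by hand.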
16. Data Mining Methods
- Regression: learning a function that maps a data item to a real-valued variable, for example the debt value as a function of income.
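A minimal sketch of the debt-versus-income example, assuming a straight-line model fitted by ordinary least squares. The four data points and the function names are hypothetical; the slides do not say which regression method was intended.

```python
# Hypothetical observations: income and debt, both in thousands.
incomes = [10.0, 20.0, 30.0, 40.0]
debts = [5.0, 10.0, 12.0, 17.0]

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def predict(x, intercept, slope):
    """Map a data item (an income) to a real value (the estimated debt)."""
    return intercept + slope * x

intercept, slope = fit_line(incomes, debts)
print(round(intercept, 3), round(slope, 3))
print(round(predict(50.0, intercept, slope), 3))
```

Unlike classification, the output here is a number on a continuous scale rather than one of a fixed set of classes.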
17. Data Mining Methods
- Clustering: identifying sets of items with common characteristics.
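One common way to find such groups is k-means, sketched below for one-dimensional data. The choice of k-means, the spending values, and the initial centroids are assumptions for illustration only; the slides do not name a clustering algorithm.

```python
def kmeans_1d(values, centroids, max_iter=100):
    """Plain k-means on numbers: assign each value to its nearest centroid,
    recompute centroids as cluster means, and stop when nothing changes."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous centroid.
        new_centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters

# Hypothetical customer spending values (in thousands).
spending = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
centroids, clusters = kmeans_1d(spending, centroids=[1.0, 1.5])
print(centroids)
print(clusters)
```

No class labels are supplied here: the two groups emerge from the similarity of the values alone, which is what separates clustering from classification.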
18. ID3 Algorithm Example: Complete decision tree
outlook
- sunny: humidity
  - high: N
  - normal: P
- overcast: P
- rain: windy
  - true: N
  - false: P
19. Decision Tree Structure: Main Components
- Nodes: tests of some attribute
- Branches: the possible values of the attribute
- Leaves (leaf nodes): classifications
- Path (from the tree root to a leaf): a conjunction of attribute tests
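These components map directly onto a small data structure. The sketch below encodes the outlook / humidity / windy tree as nested tuples and dictionaries, classifies an instance by following one path, and lists every root-to-leaf path as a conjunction of tests. The leaf labels assume the classic weather tree from Quinlan's ID3 work (high humidity and windy = true lead to N); the representation and function names are illustrative choices.

```python
# Internal node: (attribute, {branch value: subtree}); leaf: class label "P" or "N".
TREE = ("outlook", {
    "sunny": ("humidity", {"high": "N", "normal": "P"}),
    "overcast": "P",
    "rain": ("windy", {"true": "N", "false": "P"}),
})

def classify(tree, instance):
    """Follow one path from the root to a leaf, testing one attribute per node."""
    while not isinstance(tree, str):
        attribute, branches = tree
        tree = branches[instance[attribute]]
    return tree

def extract_rules(tree, conditions=()):
    """Return every root-to-leaf path as (conjunction of tests, classification)."""
    if isinstance(tree, str):
        return [(conditions, tree)]
    attribute, branches = tree
    rules = []
    for value, subtree in branches.items():
        rules.extend(extract_rules(subtree, conditions + ((attribute, value),)))
    return rules

print(classify(TREE, {"outlook": "sunny", "humidity": "high", "windy": "false"}))
for conditions, label in extract_rules(TREE):
    tests = " and ".join(f"{a} = {v}" for a, v in conditions)
    print(f"if {tests} then {label}")
```

Each printed rule corresponds to exactly one leaf, so the tree and its rule set are two views of the same classifier.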
20. Decision Tree Learning
Example: a simple decision tree (university studies). Rule Extraction
- If the test score is over 700, then the final grade is above average
- If the test score is between 600 and 700, and the gender is male, then the final grade is below average
- If the test score is between 600 and 700, and the gender is female, then the final grade is above average
- If the test score is below 600 and the place of birth is Diaspora, then the final grade is average
- If the test score is below 600 and the place of birth is Israel, then the final grade is below average
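The extracted rules can be written as one function with nested conditions, which mirrors the shape of the underlying tree. The boundary handling is an assumption, since the rules do not say where scores of exactly 600 or 700 fall: here both are treated as "between 600 and 700". The function name and the treatment of unlisted attribute values are likewise hypothetical.

```python
def final_grade(test_score, gender, birthplace):
    """Apply the five extracted rules.

    Assumptions: 600 and 700 count as 'between 600 and 700'; any gender other
    than "male" follows the female rule; any birthplace other than "Diaspora"
    follows the Israel rule.
    """
    if test_score > 700:
        return "above average"
    if test_score >= 600:
        return "below average" if gender == "male" else "above average"
    return "average" if birthplace == "Diaspora" else "below average"

print(final_grade(720, "male", "Israel"))
print(final_grade(650, "female", "Diaspora"))
print(final_grade(550, "male", "Diaspora"))
```

The root test is the score, and only the lower branches consult gender or place of birth, so no rule ever needs all three attributes.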
21. Decision Tree Learning: Appropriate Problems
- Instances are described by a fixed set of attributes
- Each predicting attribute takes a small number of disjoint possible values
- The target function has discrete output values (each value is a class / concept)
- The training data may contain errors (noise)
- The training data may contain missing attribute values
22. Bayesian Networks
23. Bayesian Approach
Classify the new instance by the most probable target value.
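Written as a formula, the most probable target value is the maximum a posteriori (MAP) choice. The notation is an assumption borrowed from standard textbook presentations: a_1, ..., a_n are the attribute values of the new instance and V is the set of target values. The second line follows from Bayes' theorem, dropping the denominator because it does not depend on v_j.

```latex
\begin{aligned}
v_{MAP} &= \arg\max_{v_j \in V} P(v_j \mid a_1, a_2, \dots, a_n) \\
        &= \arg\max_{v_j \in V} P(a_1, a_2, \dots, a_n \mid v_j)\, P(v_j)
\end{aligned}
```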
24. Naive Bayes Classifier
Assumption: all attribute values are conditionally independent given the target value.
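Under this assumption the joint likelihood of the attribute values factorizes into a product of per-attribute terms, which gives the usual Naive Bayes decision rule. The symbols are assumed notation: a_1, ..., a_n are the attribute values of the instance, v_j ranges over the set V of target values.

```latex
\begin{aligned}
P(a_1, a_2, \dots, a_n \mid v_j) &= \prod_{i=1}^{n} P(a_i \mid v_j) \\
v_{NB} &= \arg\max_{v_j \in V} P(v_j) \prod_{i=1}^{n} P(a_i \mid v_j)
\end{aligned}
```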
25. Naive Bayes Classifier: example (two target values)
26. Naive Bayes Classifier: example (continued)
- New instance
  - Outlook: sunny
  - Temperature: cool
  - Humidity: high
  - Wind: strong
- Play Tennis: Yes / No?
27. Naive Bayes Classifier: example (continued)
- P(Yes) = 9/14 = 0.64
- P(No) = 5/14 = 0.36
- P(Yes) P(Sunny | Yes) P(Cool | Yes) P(High | Yes) P(Strong | Yes) < P(No) P(Sunny | No) P(Cool | No) P(High | No) P(Strong | No)
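The comparison can be checked numerically. The sketch below assumes the example uses the 14-row PlayTennis table from Mitchell's "Machine Learning" textbook, which matches the priors 9/14 and 5/14 quoted above; the table itself did not survive in these slides, so treat the data as an assumption. Probabilities are plain relative frequencies with no smoothing.

```python
from collections import Counter

# Assumed training data: (outlook, temperature, humidity, wind, play_tennis).
DATA = [
    ("sunny", "hot", "high", "weak", "no"),
    ("sunny", "hot", "high", "strong", "no"),
    ("overcast", "hot", "high", "weak", "yes"),
    ("rain", "mild", "high", "weak", "yes"),
    ("rain", "cool", "normal", "weak", "yes"),
    ("rain", "cool", "normal", "strong", "no"),
    ("overcast", "cool", "normal", "strong", "yes"),
    ("sunny", "mild", "high", "weak", "no"),
    ("sunny", "cool", "normal", "weak", "yes"),
    ("rain", "mild", "normal", "weak", "yes"),
    ("sunny", "mild", "normal", "strong", "yes"),
    ("overcast", "mild", "high", "strong", "yes"),
    ("overcast", "hot", "normal", "weak", "yes"),
    ("rain", "mild", "high", "strong", "no"),
]

def naive_bayes_scores(data, instance):
    """Return P(v) * prod_i P(a_i | v) for each target value v."""
    class_counts = Counter(row[-1] for row in data)
    scores = {}
    for label, count in class_counts.items():
        score = count / len(data)  # prior P(v)
        for i, value in enumerate(instance):
            matches = sum(1 for row in data if row[-1] == label and row[i] == value)
            score *= matches / count  # conditional P(a_i | v)
        scores[label] = score
    return scores

scores = naive_bayes_scores(DATA, ("sunny", "cool", "high", "strong"))
prediction = max(scores, key=scores.get)
print({label: round(s, 4) for label, s in scores.items()})
print(prediction)
```

With this table the score for Yes is about 0.0053 and the score for No is about 0.0206, so the new instance is classified as No, in line with the inequality on the slide.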
28. Multilayer Neural Network: A Sample Backpropagation Network
[Figure: an input layer, a hidden layer, and an output layer, joined by interconnection weights]
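To make the layer structure concrete, here is a minimal sketch of a network with two inputs, two hidden units, and one output, together with a single backpropagation update. The sigmoid activation, squared-error loss, learning rate, initial weights, and the training example are all assumptions chosen for illustration; the slide's actual network is not specified.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w_hidden, w_output):
    """Each weight list ends with a bias term. Returns (hidden activations, output)."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws[:-1], x)) + ws[-1]) for ws in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_output[:-1], hidden)) + w_output[-1])
    return hidden, output

def backprop_step(x, target, w_hidden, w_output, lr=0.5):
    """One gradient-descent update for the error E = 0.5 * (output - target)^2."""
    hidden, output = forward(x, w_hidden, w_output)
    delta_out = (output - target) * output * (1 - output)
    # Propagate the output error back through the (old) output weights.
    delta_hidden = [delta_out * w_output[j] * h * (1 - h) for j, h in enumerate(hidden)]
    new_w_output = [w - lr * delta_out * h for w, h in zip(w_output[:-1], hidden)]
    new_w_output.append(w_output[-1] - lr * delta_out)
    new_w_hidden = []
    for ws, d in zip(w_hidden, delta_hidden):
        updated = [w - lr * d * xi for w, xi in zip(ws[:-1], x)]
        updated.append(ws[-1] - lr * d)
        new_w_hidden.append(updated)
    return new_w_hidden, new_w_output

# Hypothetical weights and a single training example.
x, target = [1.0, 0.0], 1.0
w_hidden = [[0.1, -0.2, 0.0], [0.3, 0.4, 0.0]]
w_output = [0.5, -0.5, 0.0]

_, before = forward(x, w_hidden, w_output)
w_hidden2, w_output2 = backprop_step(x, target, w_hidden, w_output)
_, after = forward(x, w_hidden2, w_output2)
print(before, after)
```

After the update the output moves toward the target. Real training repeats this step over many examples and epochs, which is why long training times come up when judging whether a neural network suits a problem.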
29. Neural Network
30. Artificial Neural Networks: Appropriate Problems
- Instances are represented by many attributes
- The target function may be discrete-valued, real-valued, or a vector of several discrete / real-valued attributes
- The training data may contain errors
- Long training times are acceptable
- Fast prediction of the target function in a new instance may be required
- The ability to understand the learned target function is not important