Title: Tools for Knowledge Discovery
Part II
- Tools for Knowledge Discovery
Knowledge Discovery in Databases
5.1 A KDD Process Model
Step 1: Goal Identification
- Define the Problem.
- Choose a Data Mining Tool.
- Estimate Project Cost.
- Estimate Project Completion Time.
- Address Legal Issues.
- Develop a Maintenance Plan.
Step 2: Creating a Target Dataset
Step 3: Data Preprocessing
Noisy Data
- Locate Duplicate Records.
- Locate Incorrect Attribute Values.
- Smooth Data.
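The three noise-handling tasks above can be sketched in plain Python. This is a minimal illustration, not the method used by any particular tool: the function names are hypothetical, "incorrect" is assumed to mean "outside a known legal range", and a centered moving average is assumed as the smoothing technique because the slide does not name one.

```python
from collections import Counter

def find_duplicates(records):
    """Return each record that occurs more than once (exact match on all fields)."""
    counts = Counter(tuple(r) for r in records)
    return [rec for rec, n in counts.items() if n > 1]

def find_out_of_range(values, low, high):
    """Return the indices of values outside the assumed legal range [low, high]."""
    return [i for i, v in enumerate(values) if not (low <= v <= high)]

def smooth_moving_average(values, window=3):
    """Smooth a numeric sequence with a centered moving average.

    The window shrinks at the two ends so every position gets a value.
    """
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed
```

For example, `find_out_of_range(ages, 0, 120)` flags age values that cannot be correct, which can then be reviewed or treated as missing.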
Preprocessing Missing Data
- Discard Records With Missing Values.
- Replace Missing Real-valued Items With the Class Mean.
- Replace Missing Values With Values Found Within Highly Similar Instances.
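A rough sketch of the three preprocessing strategies follows. It assumes each record is a `(features, class_label)` pair with `None` marking a missing value, and it assumes squared Euclidean distance over the remaining numeric attributes as the measure of "highly similar"; the function names and both assumptions are illustrative only.

```python
def discard_incomplete(rows):
    """Strategy 1: drop every record that has at least one missing value."""
    return [(x, y) for x, y in rows if None not in x]

def fill_with_class_mean(rows, col):
    """Strategy 2: replace a missing real value with the mean of its class."""
    sums, counts = {}, {}
    for x, y in rows:
        if x[col] is not None:
            sums[y] = sums.get(y, 0.0) + x[col]
            counts[y] = counts.get(y, 0) + 1
    filled = []
    for x, y in rows:
        x = list(x)
        if x[col] is None and y in counts:
            x[col] = sums[y] / counts[y]
        filled.append((x, y))
    return filled

def fill_from_most_similar(rows, col):
    """Strategy 3: copy the value found in the most similar complete instance."""
    def distance(a, b):
        # Compare only the other columns, and only where both values are present.
        return sum((p - q) ** 2 for i, (p, q) in enumerate(zip(a, b))
                   if i != col and p is not None and q is not None)
    donors = [x for x, _ in rows if x[col] is not None]
    filled = []
    for x, y in rows:
        x = list(x)
        if x[col] is None and donors:
            x[col] = min(donors, key=lambda d: distance(x, d))[col]
        filled.append((x, y))
    return filled
```

Discarding is the simplest option but shrinks the dataset; the two replacement strategies keep every record at the cost of introducing estimated values.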
Processing Missing Data While Learning
- Ignore Missing Values.
- Treat Missing Values As Equal Compares.
- Treat Missing Values As Unequal Compares.
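The three policies above differ only in how a comparison involving a missing value is scored. A minimal sketch, assuming instances are compared attribute by attribute and similarity is the fraction of matching attributes (the `similarity` function and its `missing` parameter are hypothetical, not part of any tool named in the slides):

```python
def similarity(a, b, missing="ignore"):
    """Fraction of attributes on which two instances agree.

    missing="ignore"  : skip any attribute where either value is missing
    missing="equal"   : count a comparison with a missing value as a match
    missing="unequal" : count a comparison with a missing value as a mismatch
    """
    if missing not in ("ignore", "equal", "unequal"):
        raise ValueError("unknown policy: " + missing)
    matches = compared = 0
    for p, q in zip(a, b):
        if p is None or q is None:
            if missing == "ignore":
                continue
            compared += 1
            if missing == "equal":
                matches += 1
        else:
            compared += 1
            if p == q:
                matches += 1
    return matches / compared if compared else 0.0
```

With the same pair of instances, the "equal" policy gives the highest similarity and the "unequal" policy the lowest, so the choice can change which instances a learner treats as neighbors.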
Step 4: Data Transformation
- Data Normalization
- Data Type Conversion
- Attribute and Instance Selection
Data Normalization
- Decimal Scaling
- Min-Max Normalization
- Normalization using Z-scores
- Logarithmic Normalization
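The four normalization methods listed above can be sketched with the standard library alone. The function names are illustrative; the z-score version assumes the population standard deviation, and the logarithmic version assumes base 2 and strictly positive inputs, since the slide specifies neither.

```python
import math
from statistics import mean, pstdev

def decimal_scaling(values):
    """Divide by the smallest power of 10 that brings every |v| below 1."""
    largest = max(abs(v) for v in values)
    k = 0
    while largest / 10 ** k >= 1:
        k += 1
    return [v / 10 ** k for v in values]

def min_max(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values onto the interval [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:                      # constant attribute: nothing to spread out
        return [new_min for _ in values]
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]

def z_scores(values):
    """Express each value as its distance from the mean in standard deviations."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:                    # constant attribute: every z-score is 0
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def log_normalize(values, base=2):
    """Replace each (strictly positive) value with its logarithm."""
    return [math.log(v, base) for v in values]
```

Min-max normalization is sensitive to a single extreme value, whereas z-scores and logarithms compress the influence of outliers to different degrees, so the choice depends on the attribute's distribution.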
Attribute and Instance Selection
- Eliminating Attributes
- Creating Attributes
- Instance Selection
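One possible reading of the three operations above, sketched in plain Python. The specific choices are assumptions made for illustration: attributes are eliminated when they are constant, a new attribute is created as the ratio of two existing numeric columns, and instances are selected by uniform random sampling. Other criteria are equally valid.

```python
import random

def drop_constant_attributes(rows):
    """Eliminate attributes that take a single value across all instances.

    Returns the reduced rows and the indices of the columns that were kept.
    """
    keep = [j for j in range(len(rows[0])) if len({r[j] for r in rows}) > 1]
    return [[r[j] for j in keep] for r in rows], keep

def add_ratio_attribute(rows, num, den):
    """Create a new attribute as the ratio of two existing numeric columns."""
    return [r + [r[num] / r[den]] for r in rows]

def sample_instances(rows, k, seed=0):
    """Select k instances uniformly at random, without replacement."""
    return random.Random(seed).sample(rows, k)
```

A constant column carries no information for distinguishing instances, which is why it is the simplest candidate for elimination.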
Step 5: Data Mining
- Choose training and test data.
- Designate a set of input attributes.
- If learning is supervised, choose one or more output attributes.
- Select learning parameter values.
- Invoke the data mining tool.
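The steps above can be walked through with a toy example. This sketch substitutes a small k-nearest-neighbor classifier for the data mining tool, which is an assumption made only to keep the example self-contained; the data, attribute names, and function names are all hypothetical.

```python
import random
from collections import Counter

def split_train_test(instances, test_fraction=0.3, seed=0):
    """Shuffle a copy of the data and hold out a fraction for testing."""
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def knn_classify(train, query, input_attrs, output_attr, k=3):
    """Predict output_attr for query by majority vote of its k nearest neighbors."""
    def distance(a, b):
        return sum((a[f] - b[f]) ** 2 for f in input_attrs)
    nearest = sorted(train, key=lambda inst: distance(inst, query))[:k]
    votes = Counter(inst[output_attr] for inst in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical data: two numeric inputs and one categorical output.
data = [{"income": i, "debt": 10 - i, "risk": "low" if i >= 5 else "high"}
        for i in range(10)]

train, test = split_train_test(data, test_fraction=0.3)  # choose training and test data
inputs = ["income", "debt"]                               # designate input attributes
output = "risk"                                           # supervised: choose the output attribute
k = 3                                                     # select a learning parameter value
predictions = [knn_classify(train, t, inputs, output, k) for t in test]  # invoke the tool
```

Fixing the random seed makes the split reproducible, which matters when results from different parameter settings are compared.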
Step 6: Interpretation and Evaluation
- Statistical analysis.
- Heuristic analysis.
- Experimental analysis.
- Human analysis.
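As one concrete instance of statistical analysis, the sketch below scores a classifier on test data. It assumes a classification task, and the confidence interval uses the normal approximation to the binomial, which is only reasonable for fairly large test sets; the function names are illustrative.

```python
import math
from collections import Counter

def confusion_matrix(actual, predicted):
    """Count each (actual class, predicted class) pair."""
    return Counter(zip(actual, predicted))

def accuracy(actual, predicted):
    """Fraction of test instances classified correctly."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)

def error_rate_interval(error, n, z=1.96):
    """Approximate confidence interval for a test-set error rate.

    error is the observed error rate on n test instances; z=1.96
    corresponds to roughly 95% confidence under the normal approximation.
    """
    half_width = z * math.sqrt(error * (1 - error) / n)
    return max(0.0, error - half_width), min(1.0, error + half_width)
```

The interval narrows as the test set grows, so an error rate measured on a handful of instances says little about how a model will perform in use.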
Step 7: Taking Action
- Create a report.
- Relocate retail items.
- Mail promotional information.
- Detect fraud.
- Fund new research.
5.9 The CRISP-DM Process Model
- Business understanding
- Data understanding
- Data preparation
- Modeling
- Evaluation
- Deployment
5.10 Experimenting with ESX
A Four-Step Model for Knowledge Discovery
- Identify the goal.
- Prepare the data.
- Apply data mining.
- Interpret and evaluate the results.
Experiment 1: Attribute Evaluation
- Applying the Four-Step Process Model to the Credit Screening Dataset
Experiment 2: Parameter Evaluation
- Applying the Four-Step Process Model to the Satellite Image Dataset