Data Mining - PowerPoint PPT Presentation

Data Mining

Slides: 32
Provided by: MikeWh
Transcript and Presenter's Notes

Title: Data Mining


1
Data Mining
  • Kelby Lee

2
Overview
  • Transaction Database
  • What is Data Mining
  • Data Mining Primitives
  • Data Mining Objectives
  • Predictive Modeling
  • Knowledge Discovery
  • Other Objectives to Data Mining
  • What Data Mining is Not
  • Other Factors in Data Mining Categorization
  • Conclusion

3
Transaction Database
  • Relation Consisting of Transactions
  • TID (Transaction Identifier)
  • Regularities between Transaction Behavior

4
Transaction Database
  • Table 1.1 Transaction Database

    TID  Customer  Item          Date        Price   Quantity
    ---  --------  ------------  ----------  ------  --------
    100  C1        chocolate     01/11/2001    1.59  2
    100  C1        ice cream     01/11/2001    1.89  1
    200  C2        chocolate     01/12/2001    1.59  3
    200  C2        candy bar     01/12/2001    1.19  2
    200  C2        jackets       01/12/2001  120.39  2
    300  C3        jackets       01/14/2001  168.88  1
    300  C3        color shirts  01/14/2001   27.95  2
    400  C4        jackets       01/15/2001  149.49  1
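As a minimal sketch of how such a relation might be held in code, the rows of Table 1.1 can be stored as records and grouped by TID into per-transaction item sets. The names `Row`, `ROWS`, and `baskets` are illustrative assumptions, not part of the presentation.

```python
from collections import defaultdict
from typing import NamedTuple


class Row(NamedTuple):
    """One line of Table 1.1 (field names are assumed for illustration)."""
    tid: int
    customer: str
    item: str
    date: str
    price: float
    quantity: int


ROWS = [
    Row(100, "C1", "chocolate", "01/11/2001", 1.59, 2),
    Row(100, "C1", "ice cream", "01/11/2001", 1.89, 1),
    Row(200, "C2", "chocolate", "01/12/2001", 1.59, 3),
    Row(200, "C2", "candy bar", "01/12/2001", 1.19, 2),
    Row(200, "C2", "jackets", "01/12/2001", 120.39, 2),
    Row(300, "C3", "jackets", "01/14/2001", 168.88, 1),
    Row(300, "C3", "color shirts", "01/14/2001", 27.95, 2),
    Row(400, "C4", "jackets", "01/15/2001", 149.49, 1),
]


def baskets(rows):
    """Group rows by TID into the set of items bought in each transaction."""
    grouped = defaultdict(set)
    for row in rows:
        grouped[row.tid].add(row.item)
    return dict(grouped)


if __name__ == "__main__":
    for tid, items in baskets(ROWS).items():
        print(tid, sorted(items))
```

Grouping by TID is the step that turns a flat relation into the "baskets" in which regularities between transactions can be looked for.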

5
Association Rules
  • A customer who buys chocolate will likely buy
    candy bar
  • one type of Data Mining task
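The slides do not say how "likely" is measured. A common choice, assumed here, is to score a rule by support (the share of transactions containing all of its items) and confidence (the share of transactions with the antecedent that also contain the consequent). The sketch below applies both to the baskets of Table 1.1; the function names are illustrative.

```python
# Items per transaction, taken from Table 1.1 (TID -> set of items).
BASKETS = {
    100: {"chocolate", "ice cream"},
    200: {"chocolate", "candy bar", "jackets"},
    300: {"jackets", "color shirts"},
    400: {"jackets"},
}


def support(itemset, baskets):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for items in baskets.values() if itemset <= items)
    return hits / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Of the transactions containing `antecedent`, the fraction that
    also contain `consequent`. Returns 0.0 if the antecedent never occurs."""
    base = support(antecedent, baskets)
    if base == 0:
        return 0.0
    return support(set(antecedent) | set(consequent), baskets) / base


if __name__ == "__main__":
    print(support({"chocolate", "candy bar"}, BASKETS))
    print(confidence({"chocolate"}, {"candy bar"}, BASKETS))
    print(confidence({"candy bar"}, {"chocolate"}, BASKETS))
```

On this tiny table the rule "chocolate, then candy bar" holds in one of the two chocolate transactions, so a real analysis would need far more data before calling it a regularity.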

6
Discovered Rules
  • Table 1.2 Discovered Rules
  • Rule Bought this... ...also bought that
  • --------------------------------------------------
    -----------------------------------------------
  • 1 chocolate ice cream
  • 2 candy bar chocolate
  • 3 ski pants colored shirt
  • 4 beer diaper

7
What is Data Mining
  • Retrieve individual elements
  • Given a name of a product, find price and
    producer
  • Analysis
  • Average monthly sales amount and deviation
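The contrast between retrieving individual elements and analysing the data can be sketched on the rows of Table 1.1. The first function is a plain lookup. The second computes a mean and a standard deviation, assuming that is the kind of spread statistic the slide has in mind. It works per transaction because the table covers only a few days. All function names are illustrative.

```python
from collections import defaultdict
from statistics import mean, pstdev

# (TID, item, price, quantity) taken from Table 1.1.
SALES = [
    (100, "chocolate", 1.59, 2),
    (100, "ice cream", 1.89, 1),
    (200, "chocolate", 1.59, 3),
    (200, "candy bar", 1.19, 2),
    (200, "jackets", 120.39, 2),
    (300, "jackets", 168.88, 1),
    (300, "color shirts", 27.95, 2),
    (400, "jackets", 149.49, 1),
]


def prices_of(item):
    """Retrieval: look up the distinct prices recorded for one item."""
    return sorted({price for _, name, price, _ in SALES if name == item})


def sales_by_tid():
    """Total amount (price * quantity) of each transaction."""
    totals = defaultdict(float)
    for tid, _, price, quantity in SALES:
        totals[tid] += price * quantity
    return dict(totals)


def summary():
    """Analysis: mean and population standard deviation of the totals."""
    amounts = list(sales_by_tid().values())
    return mean(amounts), pstdev(amounts)


if __name__ == "__main__":
    print(prices_of("chocolate"))
    print(summary())
```

Neither function is data mining in the sense of the following slides: both answer a question the user already knew how to ask.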

8
Advances Allow For
  • Large amounts of Data to be Handled
  • Aspect of Analysis
  • Data Rich but Knowledge Poor

9
Discover Patterns
  • Improve Business Performance
  • Exploit favorable patterns
  • Avoid problematic patterns
  • Increase Understanding
  • Predict Outcome

10
Answer the Key Business Questions
  • Who will buy? What will they buy? How much?
  • Classification and Prediction
  • What are the different types of Customers?
  • Segmentation of Customers

11
Answer the Key Business Questions
  • What relationship exists between customers or
    Website visitors and the products?
  • Association
  • What are the groupings hidden in the data?
  • Clustering Analysis

12
Data Mining Definition
  • Nontrivial extraction of implicit, previously
    unknown, interesting, and potentially useful
    information from data

13
Different Types of Data Mining
  • Business Data Mining
  • Scientific Data Mining
  • Internet Data Mining

14
Data Mining Applications
  • Medical
  • Control Theory
  • Engineering
  • Public Administration
  • Marketing and Finance
  • Data Mining on the Web
  • Scientific Databases
  • Fraud Detection

15
Data Mining Primitives
  • Fundamental Elements Needed to Define a Data
    Mining Task
  • Eight Elements (P, D, K, B, T, M, I, U)
  • 8-Tuple

16
Elements
  • P - Problem Specification
  • D - Task Relevant Data
  • K - Kind of Knowledge to be Mined
  • B - Background Knowledge
  • T - Specific algorithms or techniques
  • M - Models developed or knowledge patterns
    extracted
  • I - Interestingness
  • U - User
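One way to picture the 8-tuple is as a record with one field per element. The sketch below is an assumption about how such a task description could be written down, not a structure given in the slides. The field names and the example values are hypothetical.

```python
from typing import Any, NamedTuple


class MiningTask(NamedTuple):
    """The eight primitives, in the order (P, D, K, B, T, M, I, U)."""
    problem: str            # P - problem specification
    data: Any               # D - task-relevant data
    knowledge_kind: str     # K - kind of knowledge to be mined
    background: Any         # B - background knowledge
    technique: str          # T - specific algorithm or technique
    model: Any              # M - model or patterns produced (None until mined)
    interestingness: Any    # I - interestingness criteria
    user: str               # U - user


# Hypothetical task description built around Table 1.1.
task = MiningTask(
    problem="Which items are bought together?",
    data="transaction database (Table 1.1)",
    knowledge_kind="association rules",
    background={"chocolate": "food", "jackets": "clothing"},
    technique="frequent itemset search",
    model=None,
    interestingness={"min_support": 0.25},
    user="marketing analyst",
)

if __name__ == "__main__":
    print(len(task), task._fields)
```

Using a `NamedTuple` keeps the object a literal 8-tuple while still naming each position.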

17
Diagram

18
Relationship between Elements
  • User Defines Problem (P) and specifies
    Interestingness (I)
  • Data Miner, with K and T as core elements,
    uses D and B and incorporates I
  • Data Miner produces M

19
Data Mining Objectives
  • Discovery
  • Finding human interpretable patterns describing
    the data
  • Prediction
  • Using some variables or fields in the database
    to predict unknown or future values of other
    variables of interest

20
Data Mining Objectives
  • Knowledge Discovery
  • Stage somewhat prior to prediction where
    information is insufficient
  • Closer to decision support

21
Predictive Modeling
  • Predict Values Based on Similar Groups of Data
  • Submit records with some unknown fields and
    system will predict value

22
Predictive Modeling
  • Pattern Recognition
  • Association of an observation to past experience
    or knowledge
  • Interchangeable with classification

23
Predictive Modeling
  • Classification
  • Process of assigning one of a finite set of
    labels to an observation
  • Estimation
  • Assign one of an infinite number of possible
    numeric labels to an observation
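The difference between the two tasks shows up in the return type: a classifier returns one label from a fixed set, while an estimator returns a number from a continuous range. The sketch below uses a k-nearest-neighbour rule, which is one possible technique and not one named in the slides. The prices and quantities come from Table 1.1. The "food" and "clothing" labels are assumptions added for illustration.

```python
from collections import Counter
from statistics import mean

# (price, category, quantity); the category labels are hypothetical.
TRAIN = [
    (1.59, "food", 2),
    (1.89, "food", 1),
    (1.59, "food", 3),
    (1.19, "food", 2),
    (120.39, "clothing", 2),
    (168.88, "clothing", 1),
    (27.95, "clothing", 2),
    (149.49, "clothing", 1),
]


def nearest(price, k):
    """The k training records whose price is closest to `price`."""
    return sorted(TRAIN, key=lambda rec: abs(rec[0] - price))[:k]


def classify(price, k=3):
    """Classification: majority label among the k nearest records."""
    labels = [label for _, label, _ in nearest(price, k)]
    return Counter(labels).most_common(1)[0][0]


def estimate(price, k=3):
    """Estimation: average quantity among the k nearest records."""
    return mean(quantity for _, _, quantity in nearest(price, k))


if __name__ == "__main__":
    print(classify(2.00), classify(130.0))
    print(estimate(130.0))
```

Both functions follow the "similar groups of data" idea: a record with an unknown field is filled in from the known records that resemble it most.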

24
Knowledge Discovery
  • Find Patterns in Database
  • If someone buys one thing, what else will they
    buy
  • Interesting Certain Knowledge
  • Output called Discovered Knowledge
  • KDD - Knowledge Discovery in Databases

25
Data Mining
  • Is about why, about hidden regularities,
    important aspect related to perception, learning
    and evolving
  • Decision support process in which we search
    for patterns of information in data
  • Once found, display in suitable format

26
Four Points of KDD
  • Discovered Knowledge Represented in High-Level
    Language
  • Accurately Portray contents of Database
  • Interesting to user
  • Process is Efficient

27
Important Issues
  • Human Centered
  • Under control of human user to meet human needs
  • Incorporate Interestingness
  • Provide Various Types
  • Provide Visualization

28
Other Objectives
  • Forensic analysis
  • Applying extracted patterns to find anomalous or
    unusual data elements largely involved in
    business applications
  • Find out what the norm is and find those that
    deviate from the norm
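"Find the norm, then find what deviates from it" can be sketched with a median-based rule: take the median as the norm and flag values whose distance from it exceeds a multiple of the median absolute deviation. This is one simple technique among many and is not prescribed by the slides. The daily totals and the factor of 5 are made-up values for illustration.

```python
from statistics import median

# Hypothetical daily sales totals; one day is clearly unusual.
DAILY_TOTALS = [52.0, 48.5, 50.1, 49.7, 51.3, 250.0, 50.8]


def anomalies(values, factor=5.0):
    """Return the values lying more than `factor` median absolute
    deviations away from the median (the assumed 'norm')."""
    norm = median(values)
    mad = median(abs(v - norm) for v in values)
    return [v for v in values if abs(v - norm) > factor * mad]


if __name__ == "__main__":
    print(anomalies(DAILY_TOTALS))
```

The median is used instead of the mean so that the unusual value does not distort the norm it is being compared against.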

29
What Data Mining is Not
  • Analysis vs Monitoring
  • Analysis - previously collected information
  • Monitoring
  • Collect data as it comes in and compare to set of
    conditions
  • Unexpected Discovery
  • Must have general goal in mind
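The analysis/monitoring distinction can be sketched as two functions over the same records. `monitor` checks each record against a fixed set of conditions as it arrives, while `analyse` works on the collection after the fact. The records come from Table 1.1. The two conditions and all names are hypothetical.

```python
from collections import Counter

RECORDS = [
    {"tid": 100, "item": "chocolate", "price": 1.59, "quantity": 2},
    {"tid": 100, "item": "ice cream", "price": 1.89, "quantity": 1},
    {"tid": 200, "item": "chocolate", "price": 1.59, "quantity": 3},
    {"tid": 200, "item": "candy bar", "price": 1.19, "quantity": 2},
    {"tid": 200, "item": "jackets", "price": 120.39, "quantity": 2},
    {"tid": 300, "item": "jackets", "price": 168.88, "quantity": 1},
    {"tid": 300, "item": "color shirts", "price": 27.95, "quantity": 2},
    {"tid": 400, "item": "jackets", "price": 149.49, "quantity": 1},
]

# Hypothetical conditions fixed in advance by the user.
CONDITIONS = {
    "high price": lambda rec: rec["price"] > 150.0,
    "bulk purchase": lambda rec: rec["quantity"] >= 3,
}


def monitor(stream, conditions):
    """Monitoring: test each incoming record against the conditions
    and yield an alert as soon as one of them matches."""
    for record in stream:
        for name, predicate in conditions.items():
            if predicate(record):
                yield name, record


def analyse(history):
    """Analysis: summarise previously collected records (item counts)."""
    return Counter(record["item"] for record in history)


if __name__ == "__main__":
    for name, record in monitor(iter(RECORDS), CONDITIONS):
        print(name, record["tid"], record["item"])
    print(analyse(RECORDS).most_common(1))
```

`monitor` never needs the full history, which is why the slide sets it apart from data mining proper.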

30
Other Factors in Categorization
  • Data Retention
  • Data is retained for future pattern matching
  • Pattern Distillation
  • Analyse data, extract pattern, leave data behind

31
Conclusion
  • Transaction Database
  • What is Data Mining
  • Data Mining Primitives
  • Data Mining Objectives
  • Predictive Modeling
  • Knowledge Discovery
  • Other Objectives to Data Mining
  • What Data Mining is Not
  • Other Factors in Data Mining Categorization