Data Mining in Database

About This Presentation
Title:

Data Mining in Database

Description:

Mining Association Rules ... Usually mining generalized and multiple-level association rules ... Mining Path Traversal Patterns ... – PowerPoint PPT presentation

Slides: 16
Provided by: aceboxUw


Transcript and Presenter's Notes


1
Data Mining in Database
  • Hongzhi Li
  • Department of Computing and Information Science
  • Queen's University
  • Kingston, Ontario
  • hongzhi_at_cs.queensu.ca
  • www.ace.uwaterloo.ca/liho/project/832

2
Introduction
  • Why data mining in databases?
  • What are the requirements for data mining?
  • What kind of knowledge can be mined?

3
Why Data Mining?
  • Knowledge Discovery (KDD)
  • -- Overall process of discovering useful knowledge
  • Data Mining (computer-driven exploration)
  • -- Query formulation problem
  • -- Visualizing and understanding a large data set
  • -- Data growth rate too high to be handled manually
  • Data Warehouses (human-driven exploration)
  • -- Querying summaries of transactions, etc.; decision support
  • Traditional Databases (transactions)
  • -- Querying data in well-defined processes; reliable storage

4
Requirements of Data Mining
  • Handling of different types of data
  • Efficiency and scalability of algorithms
  • Usefulness, certainty and expressiveness of results
  • Expression of various kinds of mining results
  • Interactive mining of knowledge at multiple levels
  • Mining information from different sources of data
  • Protection of privacy and data security

5
Data Mining Techniques
  • What kind of databases to work on?
  • -- relational data miner, object-oriented data miner, Internet information miner
  • What kind of techniques to be utilized?
  • -- data-driven miner, query-driven miner
  • -- generalization-based mining, pattern-based mining
  • What kind of knowledge to be mined?
  • -- association rules, characteristic rules, classification rules, discriminant rules, clustering

6
Mining different kinds of knowledge
  • Association rules
  • Data generalization and summarization
  • Data classification
  • Data clustering
  • Pattern-based similarity search
  • Path traversal patterns

7
Mining Association Rules
  • If a customer buys (one brand of) milk, s/he usually
    buys (another brand of) bread
  • Discover strong association rules only!
  • Interestingness of discovered association rules
  • Usually mining generalized and multiple-level
    association rules
  • Algorithm efficiency: counting the large
    itemsets
  • -- Apriori and DHP
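
The slide names Apriori without showing it, so here is a minimal sketch of its level-wise idea: count candidate itemsets, keep the large ones, and build bigger candidates only from those. The transactions, the thresholds, and the function names `support`, `apriori` and `strong_rules` are all illustrative assumptions, and the hash-based pruning that DHP adds is not shown.

```python
from itertools import combinations

# Hypothetical market-basket transactions (illustrative data, not from the slides).
transactions = [
    {"milk", "bread"},
    {"milk", "bread", "butter"},
    {"milk", "bread", "eggs"},
    {"bread", "eggs"},
    {"milk", "eggs"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def apriori(transactions, min_support):
    """Return {frozenset: support} for all large (frequent) itemsets."""
    items = sorted({item for t in transactions for item in t})
    current = [frozenset([i]) for i in items]
    large = {}
    k = 1
    while current:
        # Count the size-k candidates and keep those meeting min_support.
        level = {}
        for c in current:
            s = support(c, transactions)
            if s >= min_support:
                level[c] = s
        large.update(level)
        # Join step: build size k+1 candidates from the surviving itemsets.
        k += 1
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every size k-1 subset of a candidate must itself be large.
        current = [
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return large

def strong_rules(large, min_confidence):
    """Return (antecedent, consequent, confidence) for rules meeting min_confidence."""
    rules = []
    for itemset, s in large.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), r):
                lhs = frozenset(lhs)
                conf = s / large[lhs]  # confidence = support(X and Y) / support(X)
                if conf >= min_confidence:
                    rules.append((lhs, itemset - lhs, conf))
    return rules

large = apriori(transactions, min_support=0.5)
rules = strong_rules(large, min_confidence=0.7)
```

With this toy data only {milk, bread} survives as a large 2-itemset, so the strong rules are milk => bread and bread => milk; a rule is "strong" here only in the sense of meeting both assumed thresholds.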

8
Multi-level Data Generalization, Summarization,
and Characterization
  • How many customers buy milk in the Kingston area?
    (2, 1, , BasicFood at Bath, at Montreal )
  • (Data warehousing)
  • Two approaches
  • -- data cube approach
  • -- attribute-oriented induction approach
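
As a rough sketch of the generalization idea behind a question like "how many customers buy milk in the Kingston area", the code below climbs two small concept hierarchies and then counts distinct customers per generalized tuple. The records, the hierarchies `item_category` and `city_area` (including the assumption that Bath belongs to the Kingston area), and the function name `generalize` are hypothetical; a real data cube would precompute such counts over many dimensions.

```python
# Hypothetical purchase records: (customer, item, city). Illustrative only.
purchases = [
    ("c1", "2% milk", "Kingston"),
    ("c2", "skim milk", "Kingston"),
    ("c3", "white bread", "Bath"),
    ("c4", "2% milk", "Montreal"),
    ("c1", "white bread", "Kingston"),
]

# Hypothetical concept hierarchies mapping low-level values to higher-level concepts.
item_category = {"2% milk": "milk", "skim milk": "milk", "white bread": "bread"}
city_area = {
    "Kingston": "Kingston area",
    "Bath": "Kingston area",
    "Montreal": "Montreal area",
}

def generalize(purchases):
    """Generalize item and city, then count distinct customers per (category, area)."""
    customers = {}
    for customer, item, city in purchases:
        key = (item_category[item], city_area[city])
        customers.setdefault(key, set()).add(customer)
    return {key: len(group) for key, group in customers.items()}

summary = generalize(purchases)
milk_buyers_kingston = summary[("milk", "Kingston area")]
```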

9
Data Classification
  • A customer spends on average 7 hrs looking for
    clothes and buys 2 pieces each week
  • -- What's the gender of this customer?
  • Classification Methods
  • -- Decision Tree
  • -- Nearest Neighbor
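
Of the two methods listed, nearest neighbor is the easier to sketch: the unlabelled customer receives the label of the closest labelled customer. The training points and the neutral labels `group_a` / `group_b` below are made up for illustration and stand in for whatever attribute is being predicted.

```python
import math

# Hypothetical labelled customers:
# (hours shopping per week, pieces bought per week) -> label.
training = [
    ((7.0, 2.0), "group_a"),
    ((6.0, 3.0), "group_a"),
    ((1.0, 1.0), "group_b"),
    ((0.5, 2.0), "group_b"),
]

def nearest_neighbor(query, training):
    """Return the label of the training point closest to query (Euclidean distance)."""
    _, label = min(training, key=lambda pair: math.dist(query, pair[0]))
    return label

predicted = nearest_neighbor((6.5, 2.0), training)
```

A decision tree would instead learn threshold tests on the same two attributes; that variant is not shown here.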

10
Clustering Analysis
  • A group of customers
  • Some like to buy lots of beer
  • Some like to buy lots of cigarettes
  • They can be grouped as hard-working people (or Mental-Homeless??)
  • Distance-based clustering
  • Statistical-based clustering
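
Distance-based clustering can be sketched with plain k-means, which is one common choice rather than necessarily the method the presenter had in mind. The customer data, the starting centroids and the function name `k_means` are assumptions made for the example.

```python
import math

# Hypothetical customers: (beers bought per week, cigarette packs bought per week).
customers = [(12, 1), (10, 0), (11, 2), (1, 9), (0, 10), (2, 8)]

def k_means(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then recompute centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(p, centroids[i]),
            )
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its cluster;
        # keep the old centroid if the cluster ended up empty.
        centroids = [
            tuple(sum(coord) / len(c) for coord in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return clusters, centroids

clusters, centroids = k_means(customers, centroids=[(12, 1), (1, 9)])
```

The starting centroids are fixed so that the run is deterministic; in practice they are usually chosen at random and the result can depend on that choice.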

11
Pattern-based Similarity Search
  • Search the database for a customer who
  • Spends about 7 hrs shopping each week and buys
    about 100 in groceries in the Kingston area

  • -- Object-relative similarity query / all-pair similarity query
  • -- Whole matching / subsequence matching
  • -- Similarity measure: comparison in spatial / frequency domain
  • -- Subsequence of arbitrary length, scaling and translation
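
The difference between whole matching and subsequence matching can be sketched with a Euclidean distance and a tolerance. The sequences, the tolerance `epsilon` and all function names are hypothetical, and the comparison is done directly on the raw values; the frequency-domain comparison, scaling and translation mentioned on the slide are not covered.

```python
import math

def distance(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def whole_match(query, sequences, epsilon):
    """Whole matching: indices of stored sequences within epsilon of the query."""
    return [
        i for i, s in enumerate(sequences)
        if len(s) == len(query) and distance(query, s) <= epsilon
    ]

def subsequence_match(query, sequence, epsilon):
    """Subsequence matching: offsets where a window of sequence is within epsilon of the query."""
    n = len(query)
    return [
        start for start in range(len(sequence) - n + 1)
        if distance(query, sequence[start:start + n]) <= epsilon
    ]

# Hypothetical weekly shopping hours, one list per customer (illustrative data).
weekly_hours = [[7, 6, 8, 7], [1, 2, 1, 1], [7, 7, 7, 6]]
similar_customers = whole_match([7, 7, 7, 7], weekly_hours, epsilon=2.0)
offsets = subsequence_match([7, 7], [1, 7, 7, 2, 7, 8], epsilon=1.0)
```

Whole matching compares the query against entire stored sequences of the same length, while subsequence matching slides the query along one longer sequence and reports every offset that is close enough.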

12
Mining Path Traversal Patterns
  • Customers are surfing your company's Web pages: what are
    their preferred routes?

  • -- For a distributed information environment: WWW and on-line services
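
A much-simplified sketch of the question "which routes do visitors prefer": count how many sessions contain each contiguous run of pages and keep the frequent ones. The session data, the thresholds and the function name `frequent_paths` are hypothetical, and the sketch ignores backward moves within a session, which a full path-traversal miner would have to handle.

```python
from collections import Counter

# Hypothetical click sessions: the pages one visitor viewed, in order.
sessions = [
    ["home", "products", "pricing", "order"],
    ["home", "products", "pricing"],
    ["home", "about"],
    ["products", "pricing", "order"],
]

def frequent_paths(sessions, length, min_count):
    """Keep contiguous page sequences of the given length seen in at least min_count sessions."""
    counts = Counter()
    for session in sessions:
        # A set ensures a path repeated inside one session is counted once.
        paths = {
            tuple(session[i:i + length])
            for i in range(len(session) - length + 1)
        }
        counts.update(paths)
    return {path: n for path, n in counts.items() if n >= min_count}

paths = frequent_paths(sessions, length=2, min_count=2)
```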

13
Summary
  • Methods
  • -- Diversity / Rich Functionalities
  • Applications
  • -- Quest from IBM
  • Challenges
  • -- data mining in advanced DBS
  • -- Mining multiple kinds of knowledge at multiple abstraction levels

Take a minute to enjoy my Knowledge Discovery and have fun

14
Knowledge Discovery (Case to learn from)
It was so cold, the bird froze and fell to the
ground in a large field

15
Knowledge I Discovered.
1. Not everyone who drops shit on you is your enemy
2. Not everyone who gets you out of shit is your friend
3. And when you are in deep shit, keep your mouth shut!

So, I shut up and hear you singing