ISDA'2003 Data Mining Techniques in Index Techniques

Description:

University of Malaya. Faculty of Computer Science and Information Technology. 12/17/09 ... Work well only with low-cardinality data (Female, Male) ...

1
ISDA'2003 Data Mining Techniques in Index
Techniques
  • Ying Wah Teh and Abu Bakar Zaitun
  • tehyw_at_um.edu.my, zab_at_um.edu.my
  • University of Malaya
  • Faculty of Computer Science and Information
    Technology

2
Contents
  • Introduction
  • Query Processing Techniques
  • Evaluation of Data Mining Prototypes
  • Conclusion

3
Introduction
  • What data to gather and how to conceptually model
    the data and manage its storage
  • Logical database design
  • Physical database design
  • Very large data storage nowadays
  • Redundant data structures
  • The intelligent way of managing storage
  • Fast access to data
  • Selecting the right elements to build redundant
    data structures
  • Only a few data warehouse administrators can do
    justice to the task of picking the right
    redundant data structures.

4
Query Processing Techniques
  • Historical Perspectives
  • File Processing / Full Scan / Sequential Scan
  • Simple index
  • B-Tree index
  • Present Scenarios of Query Processing Techniques
  • BitMap Index
  • Single-column indexes

5
File Processing
  • A programmer needs to know at least one
    third-generation language (3GL) to write a data
    retrieval program that accesses the relevant
    information in a file system.
  • Query processing technique: sequential scan
    (full scan)
  • It is best suited to environments with small
    data volumes.
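
A sequential scan can be sketched in a few lines of Python. This is only an illustration: the in-memory list `staff`, its field names, and the function `sequential_scan` are assumptions made for the example, not part of the presentation.

```python
from typing import Iterable, Iterator


def sequential_scan(records: Iterable[dict], column: str, value) -> Iterator[dict]:
    """Yield every record whose `column` equals `value`.

    Every record is read in storage order, so the cost grows
    linearly with the size of the file.
    """
    for record in records:
        if record.get(column) == value:
            yield record


# Hypothetical in-memory "file" of staff records.
staff = [
    {"id": 1, "name": "Aminah", "gender": "Female"},
    {"id": 2, "name": "Bala", "gender": "Male"},
    {"id": 3, "name": "Chong", "gender": "Male"},
]

matches = list(sequential_scan(staff, "gender", "Male"))
```

Because the loop touches every record regardless of how many match, this approach is reasonable only while the data volume stays small.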

6
(No Transcript)
7
Simple Indexes / Hashed Key
  • DBMSs were developed that included simple
    indexes.
  • A simple index allows users to access information
    very quickly by a unique value.
  • It holds a list of record identifiers (RIDs)
    that act as pointers to records.
  • Data is accessed by exact key value only.
8
(No Transcript)
9
B-tree indexes
  • Support both partial-key and exact-key lookups.
  • It is very costly to create one for every query.
  • An intelligent way of handling B-Tree indexes is
    needed.
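
The difference from a hashed key is that the entries are kept in key order, which is what makes partial-key lookups possible. The sketch below uses a sorted list with binary search as a stand-in for that ordering; it is not a B-Tree implementation, and the function names and sample surnames are assumptions.

```python
import bisect


def build_sorted_index(records, key):
    """Return (key value, RID) pairs sorted by key value."""
    return sorted((record[key], rid) for rid, record in enumerate(records))


def exact_lookup(index, value):
    """Return the RIDs of all entries whose key equals `value`."""
    rids = []
    i = bisect.bisect_left(index, (value,))  # first entry with key >= value
    while i < len(index) and index[i][0] == value:
        rids.append(index[i][1])
        i += 1
    return rids


def prefix_lookup(index, prefix):
    """Return the RIDs of all entries whose string key starts with `prefix`."""
    rids = []
    i = bisect.bisect_left(index, (prefix,))  # first entry with key >= prefix
    while i < len(index) and index[i][0].startswith(prefix):
        rids.append(index[i][1])
        i += 1
    return rids


# Hypothetical records; the RID is the position in this list.
people = [{"surname": s} for s in ["Tan", "Teh", "Teo", "Abu", "Teh"]]
surname_index = build_sorted_index(people, "surname")

exact_rids = exact_lookup(surname_index, "Teh")
partial_rids = prefix_lookup(surname_index, "Te")
```

Both lookups start with one binary search and then read only neighbouring entries, because all keys sharing a prefix sit next to each other in sorted order.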

10
(No Transcript)
11
Present Scenario
  • A typical query requires only a small portion of
    a relation, and its predicate is on a
    non-primary-key column.
  • Only one RID index can be used at a time.

12
BitMap Index
  • Bit-vector approach
  • A RID occupies at least 8 bits, while a BitMap
    index uses only a 1-bit pointer to each tuple of
    the relation.
  • Works well only with low-cardinality data (e.g.
    Female, Male).
  • An intelligent way of handling BitMap indexes is
    the vital issue.
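
The bit-vector approach can be sketched with Python integers used as bitmaps, one per distinct column value. The helper names and the five-row sample column are assumptions; the 8 bits per RID is the lower bound quoted on the slide.

```python
def build_bitmap_index(records, column):
    """One integer per distinct value; bit i is set when record i holds that value."""
    bitmaps = {}
    for rid, record in enumerate(records):
        value = record[column]
        bitmaps[value] = bitmaps.get(value, 0) | (1 << rid)
    return bitmaps


def rids_from_bitmap(bitmap):
    """Return the positions of the set bits, lowest first."""
    rids = []
    rid = 0
    while bitmap:
        if bitmap & 1:
            rids.append(rid)
        bitmap >>= 1
        rid += 1
    return rids


# Hypothetical low-cardinality column with five rows.
records = [{"gender": g} for g in ["Female", "Male", "Male", "Female", "Male"]]
gender_index = build_bitmap_index(records, "gender")
male_rids = rids_from_bitmap(gender_index["Male"])

# Rough space comparison: one bit per row for each distinct value,
# versus at least 8 bits per row for a RID list.
bitmap_bits = len(gender_index) * len(records)
rid_list_bits = 8 * len(records)
```

The bitmap total grows with the number of distinct values, so with two values it needs 10 bits here against 40 for the RID list, but a column with many distinct values would reverse the advantage. That is the reason the technique suits low-cardinality data.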

13
Single-column indexes
  • Index intersection offers greater flexibility
  • A good strategy would be to define single-column
    indexes on all columns that will be frequently
    queried and let index intersection handle the
    situation.
  • An intelligent way of handling single-column
    indexes is the vital issue.
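
Index intersection can be sketched as a set intersection over the RID sets returned by each single-column index. The functions `build_rid_index` and `intersect_lookup`, and the sample columns, are hypothetical and only illustrate the principle.

```python
def build_rid_index(records, column):
    """Map each value of `column` to the set of RIDs holding it."""
    index = {}
    for rid, record in enumerate(records):
        index.setdefault(record[column], set()).add(rid)
    return index


def intersect_lookup(indexes, predicates):
    """AND together equality predicates using one single-column index each.

    `indexes` maps column name -> index; `predicates` maps column name -> value.
    """
    if not predicates:
        raise ValueError("at least one predicate is required")
    rid_sets = [indexes[col].get(val, set()) for col, val in predicates.items()]
    rid_sets.sort(key=len)  # start from the most selective predicate
    result = set(rid_sets[0])
    for rid_set in rid_sets[1:]:
        result &= rid_set
    return sorted(result)


# Hypothetical records; the RID is the position in this list.
staff = [
    {"gender": "Female", "faculty": "FSKTM"},
    {"gender": "Male", "faculty": "FSKTM"},
    {"gender": "Male", "faculty": "Science"},
    {"gender": "Male", "faculty": "FSKTM"},
]
indexes = {col: build_rid_index(staff, col) for col in ("gender", "faculty")}

rids = intersect_lookup(indexes, {"gender": "Male", "faculty": "FSKTM"})
```

Two single-column indexes answer the combined predicate here without a dedicated two-column index, which is the flexibility the slide refers to.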

14
Our Research Perspective
  • Most researchers apply data mining at the
    application level of the data warehouse.
  • We applied data mining in the physical design of
    data warehouses to optimise the base relation.
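
The transcript does not describe the mining method itself, so the following is not the authors' technique. It is a minimal sketch of one plausible starting point: counting how often each column appears in the predicates of a query workload and recommending single-column indexes for the frequent ones. The workload format, the threshold `min_queries`, and the function name are all assumptions.

```python
from collections import Counter


def recommend_single_column_indexes(workload, min_queries):
    """Return columns used in the predicates of at least `min_queries` queries.

    `workload` is a list of queries, each given as the list of columns
    that appear in its predicates (an assumed, simplified log format).
    """
    counts = Counter()
    for predicate_columns in workload:
        counts.update(set(predicate_columns))  # count each column once per query
    return sorted(col for col, n in counts.items() if n >= min_queries)


# Hypothetical query log: predicate columns of five queries.
workload = [
    ["gender", "faculty"],
    ["faculty"],
    ["faculty", "year"],
    ["gender"],
    ["salary"],
]

recommended = recommend_single_column_indexes(workload, min_queries=2)
```

With this log, `faculty` appears in three queries and `gender` in two, so both pass the threshold, while `year` and `salary` do not.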

15
Architecture of One-column Index Selection
16
Evaluation of Data Mining Prototypes
17
Conclusion
  • It is necessary to have an intelligent way of
    handling the various query processing techniques
    (such as indexes).
  • Data mining techniques can be used in the
    physical design of a data warehouse to generate
    single-column indexes.
  • The positive results from the study should
    motivate further efforts to develop the prototype
    into a fully functional SQL engine.

18
Thank You
  • Questions?
  • tehyw_at_um.edu.my