Data Warehousing - PowerPoint PPT Presentation

About This Presentation
Title:

Data Warehousing

Description:

Virtual University of Pakistan Data Warehousing Lecture-27 Need for Speed: Special Indexing Techniques Ahsan Abdullah Assoc. Prof. & Head Center for Agro-Informatics ... – PowerPoint PPT presentation

Number of Views:274
Avg rating:3.0/5.0
Slides: 21
Provided by: aa70130
Category:

less

Transcript and Presenter's Notes

Title: Data Warehousing


1
Data Warehousing
Virtual University of Pakistan
  • Lecture-27
  • Need for Speed Special Indexing Techniques

Ahsan Abdullah Assoc. Prof. Head Center for
Agro-Informatics Research www.nu.edu.pk/cairindex.
asp National University of Computers Emerging
Sciences, Islamabad Email ahsan1010_at_yahoo.com
2
Special Index Structures
  • Inverted index
  • Bit map index
  • Cluster index
  • Join indexes

3
Sample table
4
Inverted index Concept
5
Inverted Index Example-1
D1 M. Asalm BS Computer Science Lahore
Campus D2 Sana Aslam of Lahore MS Computer
Engineering with GPA 3.4 Karachi Campus
Inverted index for the documents D1 and D2 is
as follows

Karachi ? D2 Lahore ? D1, D2 M. ?
D1 MS ? D2 of ? D2 Sana ?
D2 Science ? D1 with ? D2
3.4 ? D2 Asalm ? D1, D2 BS ?
D1 Campus ? D1, D2 Computer ? D1,
D2 Engineering ? D2 GPA ? D2
6
Inverted Index Example-2

7
Inverted Index Query
  • Query
  • Get students with age 20 and tech telecom
  • List for age 20 r4, r18, r34, r35
  • List for tech telecom r5, r35
  • Answer is intersection r35

8
Bitmap Indexes Concept
9
Bitmap Indexes Example
  • The index consists of bitmaps, with a column for
    each unique value

Index on Tech (smaller table)
Index on City (larger table)
10
Bitmap Index Query
  • Query
  • Get students with age 20 and campus Lahore
  • List for age 20 1101100000
  • List for campus Lahore 1010000001
  • Answer is AND 1000000000
  • Good if domain cardinality is small
  • Bit vectors can be compressed
  • Run length encoding

11
Bitmap Index Compression
  • Basic Concept
  • 1111000011110000001111100000011111 INPUT
  • 14041406150615 OUTPUT
  • 1010101010101010101010101010101010 INPUT
  • 1101110111011101 OUTPUT
  • 11111111111111110000000000000000 INPUT
  • 117017 OUTPUT

Case-1 Case-2 Case-3
12
Bitmap Index More Queries
  • Which students from Lahore are enrolled in
    CS?
  • How many students are enrolled in CS?

13
Bitmap Index Adv.
14
Bitmap Index Performance Points
15
Bitmap Index Dis. Adv.
16
Cluster Index Concept
17
Cluster Index Example
One indexing column at a time
18
Cluster Index Issues
19
Join Index Example
The rows of the table consist entirely of such
references, which are the RIDs of the relevant
rows.
PROGRAM
join index
CAMPUS
20
Materialized views
Write a Comment
User Comments (0)
About PowerShow.com