Title: Data Warehousing
1Data Warehousing
Virtual University of Pakistan
- Lecture-27
- Need for Speed Special Indexing Techniques
Ahsan Abdullah Assoc. Prof. Head Center for
Agro-Informatics Research www.nu.edu.pk/cairindex.
asp National University of Computers Emerging
Sciences, Islamabad Email ahsan1010_at_yahoo.com
2Special Index Structures
- Inverted index
- Bit map index
- Cluster index
- Join indexes
3Sample table
4Inverted index Concept
5Inverted Index Example-1
D1 M. Asalm BS Computer Science Lahore
Campus D2 Sana Aslam of Lahore MS Computer
Engineering with GPA 3.4 Karachi Campus
Inverted index for the documents D1 and D2 is
as follows
Karachi ? D2 Lahore ? D1, D2 M. ?
D1 MS ? D2 of ? D2 Sana ?
D2 Science ? D1 with ? D2
3.4 ? D2 Asalm ? D1, D2 BS ?
D1 Campus ? D1, D2 Computer ? D1,
D2 Engineering ? D2 GPA ? D2
6Inverted Index Example-2
7Inverted Index Query
- Query
- Get students with age 20 and tech telecom
- List for age 20 r4, r18, r34, r35
- List for tech telecom r5, r35
- Answer is intersection r35
8Bitmap Indexes Concept
9Bitmap Indexes Example
- The index consists of bitmaps, with a column for
each unique value
Index on Tech (smaller table)
Index on City (larger table)
10Bitmap Index Query
- Query
- Get students with age 20 and campus Lahore
- List for age 20 1101100000
- List for campus Lahore 1010000001
- Answer is AND 1000000000
- Good if domain cardinality is small
- Bit vectors can be compressed
- Run length encoding
11Bitmap Index Compression
- Basic Concept
- 1111000011110000001111100000011111 INPUT
- 14041406150615 OUTPUT
- 1010101010101010101010101010101010 INPUT
- 1101110111011101 OUTPUT
- 11111111111111110000000000000000 INPUT
- 117017 OUTPUT
Case-1 Case-2 Case-3
12Bitmap Index More Queries
- Which students from Lahore are enrolled in
CS? - How many students are enrolled in CS?
13Bitmap Index Adv.
14Bitmap Index Performance Points
15Bitmap Index Dis. Adv.
16Cluster Index Concept
17Cluster Index Example
One indexing column at a time
18Cluster Index Issues
19Join Index Example
The rows of the table consist entirely of such
references, which are the RIDs of the relevant
rows.
PROGRAM
join index
CAMPUS
20Materialized views