An evaluation of Data Storage and Analysis With Oracle DBMS - PowerPoint PPT Presentation

About This Presentation
Title:

An evaluation of Data Storage and Analysis With Oracle DBMS

Description:

First step of the evaluation of the possibility to use advanced index features ... Most of the credit goes to Maciej Marczukajtis (technical student) ... – PowerPoint PPT presentation

Number of Views:60
Avg rating:3.0/5.0
Slides: 16
Provided by: NIC8169
Category:

less

Transcript and Presenter's Notes

Title: An evaluation of Data Storage and Analysis With Oracle DBMS


1
An evaluation of Data Storage and Analysis With
Oracle DBMS
  • Eric Grancher, CERN IT/DB
  • Database Workshop, CERN
  • July 11th-13th 2001

2
TAG data analysis
  • First step of the evaluation of the possibility
    to use advanced index features to perform simple
    cuts
  • Most of the credit goes to Maciej Marczukajtis
    (technical student)
  • TAG data imported from a ALEPH PAW file

3
Model
  • Data model has 16 double attributes
  • Cut retrieve entries for which field1 gt 12 and
    field2 lt 23 and field3 gt 56
  • The model can be implemented as a relational
    table, mapping every attribute to a column

4
Performing the Cuts
  • Normal Btree index cant be used in an efficient
    way as they cant be merged
  • Bitmap indexes can be merged
  • how to use the bitmap indexes to perform cuts?

5
Size f(cardinality)
1 million rows
5 million rows
size
size
cardinality
cardinality
6
Binning
0
10
20
30
40
1
2
3
4
0
  • Double values, problem with cardinality
  • Binning idea (c.f. Kurt Stockinger)
  • Purpose - reducing the cardinality. Binning
    function maps the whole range of values in column
    into bins (sub ranges)
  • Function used in tests lower, upper-gt
    0number of bins

7
Function based indexes
  • create bitmap index f_x1 on tag_data(binning(field
    1))
  • Queries have to exactly match the index creation
    statements

8
Several ways to use the bitmap indexes
Even if SQL is said non-procedural, the way you
write the statement produces different execution
plans !
Bitmap index 1 SELECT COUNT(PT_LEP_MC) FROM
tag_data WHERE F1(PT_LEP_MC) lt F1(20) AND
F2(ETA_LEP_MC) gt F2(0) UNION ALL SELECT
COUNT(PT_LEP_MC) FROM (SELECT PT_LEP_MC,
ETA_LEP_MC FROM tag_data WHERE (F1(PT_LEP_MC) lt
F1(20) AND F2(ETA_LEP_MC) gt F2(0)) AND NOT
(F1(PT_LEP_MC) lt F1(20) AND F2(ETA_LEP_MC) gt
F2(0)))WHERE PT_LEP_MC lt 20 AND ETA_LEP_MC gt 0
ETA_LEP_MC
F1 and F2 are the binningfunctions
9
Several Ways to Use the Bitmap Indexes
  • Bitmap index 2
  • SELECT COUNT(PT_LEP_MC) FROM (SELECT PT_LEP_MC,
    ETA_LEP_MC FROM tag_data WHERE F1(PT_LEP_MC) lt
    F1(20) AND F2(ETA_LEP_MC) gt F2(0)) WHERE
    PT_LEP_MC lt 20 AND ETA_LEP_MC gt 0

ETA_LEP_MC
F1 and F2 are the binningfunctions
10
Performance
  • Parallel query several processes to perform a
    single statement
  • Check the execution plan!
  • For some queries, it is almost as efficient in
    time to perform a full table scan (but with
    several users !)

11
Size
  • Block size 8KB
  • Table size 203.1 MB
  • Average bitmap index size 24.3 MB-gt more space
    used for the indexes than the data (not unusual
    for some heavily queried Oracle database)

12
Results
Bitmap index 1 only candidates are checked
against the table Bitmap index 2 all hits and
candidates are checked against the table
Heavily depends on the query. I/O doesnt mean
time.
13
Conclusion
  • Using bitmap index for tag data in the DBMS can
    provide a dramatic gain
  • Clear advantage in using database features when
    appropriate
  • Works efficiently with high selectivity (few
    records result from the query)
  • BTree indexes take a lot of space and are not
    efficient for multiple dimensional queries

14
Perspectives
  • Proper binning referring to the distribution of
    the data can improve performance of bitmap index,
    depends on distribution queries
  • Possibility to combine bitmap and Btree indexes
  • How to this fit for ESD/AOD and references/OCCI?

15
References Questions
  • RAW tests http//edmsoraweb.cern.ch8001/cedar/doc
    .info?document_id318241version1
  • TAG tests http//edmsoraweb.cern.ch8001/cedar/pro
    ject.info?proj_idCERN-0000005796
Write a Comment
User Comments (0)
About PowerShow.com