Title: What a DBA Needs to Know About Oracle
1 What a DBA Needs to Know AboutOracles
Bitmap Indexing to Retrieve Data Quickly? Part I
- Vilin Roufchaie
- Cingular Wireless
- vilin.roufchaie_at_cingular.com
- veeleen_at_yahoo.com
- Download slides www.nocoug.org
2Presentation Assumptions Prerequisites
- Familiarity with basic Database and Data
Warehouse concepts is required - Bitmap and Bit Vector will be used
interchangeably
3Who Is This Presentation For?
- Data warehouse designers developers
- DBAs
- Performance DBAs
- Capacity Planners
4Presentation SummaryPart I
- Overview, Characteristics, Structure Size of a
Bitmap Index - Performance Considerations of Bitmap Indexing
- Logical Layout of Bitmap Indexes
- Bitmap Index Creation Storage Issues
- Query Processing Bitmap Index Access Paths
5Presentation SummaryPart II
- Star Schema, Join, Transformation
- Star Optimization, and Transformation
- CBO Estimation
- Query Transformer
- Enabling Star Transformation
- Star Transformation Steps
6Presentation SummaryPart II
- Joinback Elimination
- Case Study Explain Plan Review
- To Star Query? OrTo Star Transform ?
- Conclusions
- Acknowledgements
7Indexing Rules-of-Thumb
- In building an index know data selectivity,
data distribution, workload, execution plan,
proof of utilization. - There is cost overhead in building, utilizing,
tuning, maintaining indexes of any sort! - What do we expect after paying those cost?
8Indexing Rules-of-Thumb
- Performance! Orders of magnitude in execution
speed ? Ideal!! - How about what makes the Business/users happy!
- Make sure business expectations are understood
adequate research tests are done to assess the
likelihood of succeeding before committing to
something
9Indexing Rules-of-Thumb
- The index must be beneficial to all SQL's
impacted by it across the board -- Wholly
beneficial - Try Aggregate/Collective tuning
- So we want to build few, efficient indexes
- Demand Low-overhead indexes, requiring less
frequent fast re/builds, be space-efficient
10Overview Of Bitmap Indexing
- Bitmap indexing is a query execution optimization
technique in Data Warehousing environments - Oracle provides OLTP Data Warehousing in one
engine
11Overview Of Bitmap Indexing
- Oracle Supports Ever-growing Types of Indexes
- B-tree
- B-tree Cluster
- Hash
- Reverse Key
- Function-Based
- Bitmap Index
- Bitmap Join Index (new in Oracle 9i)
12 Characteristics Of Bitmap Indexes
- Bitmap index entries have Bitmap vectors of 0s
and 1s lt01010000010000000010gt... - Each 1-bit in the Bitmap corresponds to a rowid
inside a table - BitmapltPAgt gtltstartrowidgtltendrowidgt lt010100000
100 gt - RID-list ltPAgtltrowid2gtltrowid4gtltrowid10gt...
13Characteristics Of A Bitmap Index
- 1-bits correspond to rowids
- A mapping function converts the bit position to
an actual rowid - A compression function compresses the long
sequence of 0s in the Bitmap - Good for low-mid-high cardinality columns
14Structure Size of A Bitmap Index?
- Index entries which contain bitmap/bit-vector,
instead of list of rowids (B-tree) - Each bit in bitmap maps to a rowid inside a table
- Bitmap index entry structure
- ltCAgtltstart rowidgtltend rowidgtlt100010000000001gt
- (key, 6 byte (start rowid), 6 byte (end rowid),
bitmap) - Size key 2 6 byte bitmap
15Structure Size of A Btree Index
- Uncompressed
- ltCAgtltrowid1gt (key, 6 byte rowid1)
- ltCAgtltrowid5gt (key, 6 byte rowid5)
- ltCAgtltrowid15gt (key, 6 byte rowid15)Size
(key 3 3 6 byte) - Compressed
- ltCAgtltrowid1gt gtltrowid5gtltrowid15gt
- (key, 6 byte rowid1, 6 byte rowid5, 6 byte
rowid15) - Size (key 3 6 byte)
16B-tree vs. Bitmap Index Storage
17Are all High Cardinality Columns Inappropriate
for Bitmap Indexing?
- Suppose we have CA as index entry (8 clustered
occurrences CA CA CA CA CA CA CA CA TX - Bitmapped Size (key 2 6 byte bitmap)
key 12 8 key 20 - Uncompressed B-tree
- Size (n key n 6 byte) 8 key 48
8 key 48 - Compressed B-tree
- Size (key n 6 byte) key 48
18Are all High Cardinality Columns Inappropriate
for Bitmap Indexing?
- Indexed a table with
- 31,029 rows, num_distinct 3879
- Column values were clustered
- 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4...9 9 9 9 9 9 9
9...15 15 15 15 15 15 15 15... 27 27 27 27 27 27
27 27 - Bitmap, Btree, and compressed B-tree indexes
were built on this column - BM index Size 167,631 bytes
- B-tree index Size 997,619 bytes
- B-tree Index Compressed 449,602 bytes
19 Oracle Manual As a general rule, a degree
of cardinality of under 1 makes a good
candidate for a bitmap index
20Characteristics Of A Bitmap Index
- Bitmap indexes are very space efficient, which
allow more entries per leaf block. - A bitmap can hold many bits pointing to many
rowids (low cardinality columns) - Significantly fewer index block processing and
disk I/O. - When a bitmap index entry is locked, many rows
stay locked (in OLTP) - In a B-tree index, a list of rowids are stored
for each index entry in a leaf block and, when an
index entry is locked, it will not impact many
table rows
21Characteristics Of A Bitmap Index
- Bitmaps are stored in the leaves of a B-tree
index as bitmap segments - Used on low/mid/high cardinality data
- Fast Boolean/Set operations on Bitmap values from
different Index entries - Example
- Bitmap AND, OR, AND-NOT ...
22Characteristics Of A Bitmap Index
- Bitmap Indexes can be combined with other Bitmap
indexes and B-tree indexes in the same access
path - Fully Updateable
- Parallelism Applied to All Aspects of Queries
Index Creation
23Performance ConsiderationsOf Bitmap Indexing
- Problem?
- Concurrent DML (updates/inserts/deletes)
operations can be problematic - I.e., not suitable for concurrent OLTP workloads
- Granularity of lock on Bitmap index segment 1
- Each index segment can hold hundreds of bits
- One lock per Index segment
24Performance ConsiderationsOf Bitmap Indexing
- Batched DMLs are done efficiently
- - Each bitmap index segment is modified by a
single statement (not transaction) in one trip,
once a segment lock is acquired - insert into table customer (select from
temp_table)
25Logical Layout of Bitmap Indexes
- Oracle implements B-tree structures to store
Bitmaps for each indexed key - Up to 30 keys can be specified for each composite
index - Bitmaps are broken down into chunks (not
exceeding half a database block) - They are laid out as
- ltkey1gt ... ... ltkeyngt
- ltstart rowidgtltend rowidgt ltstart rowidgtltend
rowidgt - ltbitmapgt... ... ... ltbitmapgt
- key start rowid end rowid bitmap
- B-777 10.0 11.7 1000101000011000
26Logical Layout of Bitmap Indexes
- Bitmaps are stored in the leaves of B-tree
indexes as bitmap segments - Aligned mapping in place between bitmaps and
table rows - BLOCK10
-
- Start End BLOCK 11
- Key Rowid Rowid Bitmap
- Red 10.0 12.7 100010010001001000000
- Blue 10.0 12.7 001000001100000010000
- yellow 10.0 12.7 010000100000000000100
- Green 10.0 12.7 000100000010010001000
-
- BLOCK12
27Logical Layout of Bitmap Indexes
- 1st Problem? Insertion of new rows misalign the
mapping in place between Bitmaps and table rows - Explanation I? Oracle divides up each block into
maximum number of slots - Based on minimum row size,
- (Minimum row size would be derived from
definition of each column in table)
28Logical Layout of Bitmap Indexes
- Oracle map bits to table rows based on maximum
number of rows per data block - bits-allocated-per-table-block
- And not to existing rows in the data block
- Excessive zeros are compressed
29Mapping Bitmaps to Rowids Optimization
Explanation II?
- Populate table 1st
- alter table XYZ minimize records_per_block
- (see nominimize to disable)
- Build bitmap indexes now
- Oracle scans table for maximum number of records
in any block - Oracle restricts the table to this maximum
records/block - Fewer bits go to each block
- Potentially smaller bitmap indexes -- run a test!
30Mapping Bitmaps to Rowids Optimization
- 2nd Problem? This approach creates a dependency
between table column definition Bitmaps created - Altering table definition, changing minimum row
length, may result in all bitmap indexes being
automatically invalidated, demanding index
rebuilds
31Creation Of Bitmap indexes
- A Bitmap Index may be constructed on one or more
columns of a table - In Oracle 9, a local bitmap index can be created
on a partitioned table
32Creation Of Bitmap indexes(1st Salve Set)
- Full table scan to fetch values of column(s)
- Column values are fed into bitmap generator to
create index entriesltkey1gtltkeyngtltstartrowidgtlten
drowidgtltbitmapgt - Allocate sufficient space for create_bitmap_area
_size (db_block_size 0.5) (num_distinct)
20
33Creation Of Bitmap indexes
- Second set of
- Parallel Salves
-
- Initial set of
- Parallel Salves
34Creation Of Bitmap indexes2nd Salve Set
- Bitmap index entries sorted onltkeygtltstartrowidgt
Set sort_area_size - Index entries compaction to piece together
bitmaps of the same key to reach half a database
block size - Placing index entries into a B-tree structure
35Bitmap Compression
- Patented algorithm (GennadyAntoshenkov),
Cleverly encoded, hence very low overhead - Storage Policy
- Store all 0-bits if next 1-bit is lt 8 bits away
(1000100100010010000001) - Store 0-bits length if distance to next 1-bit is
gt 8 bits(10000000000000100000000100000000000000
1)
36Query Processing Bitmap Access Methods
- Bitmap Index Probe (for equality and/or range
predicates) - Bitmap AND (set-based)(Intersection among
multiple bitmaps ) - Bitmap OR (set-based)(Union among multiple
bitmaps)
37Query Processing Bitmap Access Methods
- Bitmap MINUS (set-based)(between two bitmaps)
- Bitmap COUNT (set-based)
- Bitmap Merge (bitmap OR on bitmap values)
- Bitmap Conversion ( bitmap conversion to rowid
or count bits)
38Equality Predicates
- select count() from customer where
regionWEST -
- Add all counts together
-
- Count all 1-bits in the bitmaps fetched
-
- Fetch the bitmap corresponding
- to regionwest in the bitmap index
-
39AND Predicates
- select name from customer where stateGA
andgenderF
40OR Predicates
- select count() from customer where stateFL
OR state is NULL
41Not Equal Predicates
- select count() from customer where gender F
and state !VA - Assumption
- State is declared
- NOT NULL
42Range Predicates
- select count() from customer
- where genderF AND age gt 65
- BITMAP_MERGE_AREA_SIZE?
43Group By Queries
- select state, count()
- from customer group by state
- No need to sort for grouping
- Count 1-bits in each bitmapreturn key
- Key bitmap will be returned in order from
bitmap
44Distinct Queries
- select distinct statefrom customer
- Only the keys will be returned for each bitmap
index entry
Sort (Unique Nosort)
Bitmap Index (Full Scan)
45Combining Predicates
Table Access (By ROWID)
- select name from customer where income gt
90000AND gender F - Reverse not done!
- rowids sorted
- Generates all rowids
- satisfying the predicate
Bitmap Conversion (To ROWID)
Bitmap AND
Bitmap Conversion (From ROWID)
Bitmap Index (gender F)
Sort
B-tree Index (Income gt 90000)
46Star Transformation
- What makes bitmap indexes so powerful are their
ability to combine with same/other types of
indexes - So far we have learned how to combine indexes to
retrieve data from a table - Now we will learn how to combine indexes to
rapidly handle joins
47What a DBA Needs to Know About Oracles Star
Transformation Processing In a Star Schema? Part
II
- Vilin Roufchaie, Cingular Wireless
- vilin.roufchaie_at_cingular.com
- veeleen_at_yahoo.com
- Download Slideswww.nocoug.org
48Presentation SummaryPart II
- Star Schema, Join, Transformation
- Star Optimization, and Transformation
- CBO Estimation
- Query Transformer
- Enabling Star Transformation
- Star Transformation Steps
49Presentation SummaryPart II
- Joinback Elimination
- Case Study Explain Plan Review
- To Star Query? Or
- To Star Transform ?
- Conclusions
- Acknowledgements
50Star Transformation (ST)
- Star Transformation Is a Cost-Based Query
Transformation Aimed at Executing Star Queries
Efficiently - In a Star Schema
51Star Schema
- Is the Basic design of a data warehouse
- Made up of
- One or more fact tables
- A few dimension tables
52Times
Employee
Sales
Products
Customers
53Fact Table
- Contains all the quantitative information that
the user wants to see in the result set - Foreign keys to dimension tables
- Non-key columns contains numeric facts
summarized, analyzed, and reported - Narrow in record width
- Usually huge number of rows
- Examples Sales, Shipment
54Dimension Tables
- Contains the qualitative information defining
how users will analyze fact data - Primary Key column
- Non-key columns contain descriptive information
about a record - Therefore wide in record length
- Denormalized, enhances query performance
- Examples Time, product, employee
55What is a Star Join?
- A join process in which dimension tables
Primary Key column values are joined to the fact
tables Foreign Key column values in Star Schemas
(but the dimension tables are not joined to
each other)
56Star Transformation
- Oracle Data Warehousing functions equally apply
to - Star schemas
- 3rd Normal Form schemas
- Hybrid schemas
57Star Transformation
- ST is Powerful feature of Oracle utilizing
bitmap indexing to handle Star Joins - ST Handles
- Several dimension tables,
- Snowflakes, Views
- More than a single fact table (example sales
inventory) - Complex Queries (Inside-out)
- Predicate constraint on fact column(s)
- (should build BM index on facts non-key column)
- Parallelizable (by fact table rowid ranges)
58Star Transformation
- Data Warehousing Capabilities that work with all
schema models are - Partitioning
- Parallelism
- Materialized Views
59Star Transformation (vs.Star Optimization)
- Good for large number of dimension tables
- The Fact tables are sparsely/densely populated
- Ideal for creating combining single-column
bitmap indexes on fact columns (rather than
concatenated indexes) - Appropriate for cases where large number of
dimensions would lead to large Cartesian products
finding few matching rows in the fact table
60(Star Transformation vs.) Star Optimization
- Good for small number of dimension tables (with
relatively small number of rows) - Fact table should be densely populated
- Does not work well with sparsely populated facts
- Relies on computing a Cartesian product of the
dimension tables, based on the WHERE clause
predicates - In the last step, joins the result set to a fact
table via NESTED LOOPS through concatenated
B-tree index access path.
61What Does CBO Do?
- At the outset, the CBO DOES take indexing cost
into account when evaluating a query. It creates
two plans - Regular
- Transformed
- And picks the least costly plan to execute the
star query
62Query Transformer Plan Generator (CBO)
63Enabling and Implementing Star Transformation
- Set init.ora parameter star_transformation_enabl
ed TRUE - Create Single-column bitmap indexes on a fact
tables dimension keys (foreign-keys) - Create indexes on dimension tables attribute
columns found in a querys WHERE clause (also
known as dimension filters) - Analyze tables and indexes in the schema for the
CBO
64Star Transformation Passes
- Two passes are performed by the optimizer.
- 1st pass Matching fact rows (result set) are
retrieved via bitmap - 2nd pass Joins the result set to the dimension
tables (usually hash join) - This technique is called semi-join
65Star Transformation Processing
- Star Transformation Example
- Select c.Name, s.Pricefrom Employee e, Product
p, Customer c, Sales swhere e.Empid
s.Empid and p.Productid s.Productid and c
.Customerid s.Customerid and - c.Income gt 10000 and p.Supplier
Oracle and e.Department Telesales -
66Star Transformation Processing
- 1st Phase Identify and retrieve the needed rows
from the fact table by implementing sub-select
queries - Select c.Name, s.Pricefrom Employee e, Product
p, Customer c, Sales swhere e.Empid
s.Empid andp.Productid s.Productid andc.Cus
tomerid s.Customerid and - s.Productid in (select p.Productid from Product
p where p.Supplier Oracle) and s.Empid in
(select e.Empid from Employee e where
e.Department Telesales) and s.Customerid in
(select c.Customerid from Customer c wher
c.Income gt 100000)
67Star Transformation ProcessingSummary Of Steps
-- 1st Phase
- A dimension table predicate is executed to return
a subset of rows from the dimension - (SQL's creating dimensions temporary segments
(ORA_TEMP_1_4, etc.) can be retrieved from the
shared pool) - (A dimension must sufficiently constrain the fact
table to qualify, otherwise access will be
deferred to 2nd phase by the CBO)
68Star Transformation Processing Summary Of Steps
-- 1st Phase
- For each value in the return set, a bitmap index
is retrieved for the corresponding fact table
column (Bitmap Key Iteration step) - Bitmap merge operation is performed for all
bitmap values through a bitmap OR operation - Previous 3 steps are repeated for the remaining
dimension tables with predicate constraints
69Star Transformation Processing Summary Of Steps
-- 1st Phase
- Bitmap elimination of all bitmap merge values of
all fact rows not joining all dimension tables
are done through a bitmap AND operation (Bitmap
AND) - The final, merged bitmap set is then converted to
rowids (Bitmap Conversion To Rowid)
70Star Transformation ProcessingSummary Of Steps
-- 2nd Phase
- The final fact table rowid set is used to join
the dimension tables, utilizing the most
efficient joining algorithm available(usually a
Hash Join)
71Star Transformation Processing Joinback
Elimination
- Select c.Name, s.Price
- from Employee e, Product p, Customer c, Sales s
- where e.Empid s.Empid and
- p.Productid s.Productid and
- c.Customerids.Customerid and
- c.Income gt 10000 and
- p.Supplier Oracle and
- e.Department Telesales
-
72Star Transformation Processing
Bitmap Conversion to rowid
Bitmap AND
Bitmap Merge
Bitmap Key Iteration
Employee Table (DepartmentTeleSales)
sales.Empid Bitmap Index (Empidvariable)
73Star Transformation Joinback Elimination
Hash Join
Hash Join
Employee Table (Department Telesales)
Hash Join
Product Table (Supplier Oracle)
Sales Table (By ROWID)
Customer Table (Income gt 100000)
74Case Study
75Case Study
76Case Study
77Case Study
What low-cardinality predicate signifies is a low
percentage of
returned rows from a dimension table when the
predicate is
executed. This will result in significantly
reducing the number of
semi-joins of the dimension rows against the fact
table. The
example below illustrates the point
Operations
Cardinality ()
Number of Rows
-----------------------------------------
----------------------------- -------------------
-
select count() from TIME
where monthname in ('Apr97','Feb97')
57/3,866 1.48
TIME3866
78Case StudyTuning Steps
- Analyzed all tables in the FROM clause (use
DBMS_STATS ) - Single-column bitmap indexes were built on fact
tables dimension keys - Primary Key indexes were created on dimension
tables join columns - Appropriate indexes were built on dimension
filters - Explained the query in question
- Init.ora parameters were set
79Case studyExplain Plan
80Case study
81Case Study
82Star Query Or Star Transformation?
83Conclusion
- Reject transformation plan, if other plans prove
less costly - The performance of Oracle8/9 Star Transformation
algorithm is proportional to the amount of data
retrieved from the fact table - The performance gain associated with using Star
Transformation must outweigh the processing cost
incurred by applying star transformation to a SQL
query
84Conclusion
- Do not force a / STAR /. Other execution
plans might work better - Star Transformation with A Bitmap Join Index not
covered.
85Acknowledgements