Issues in Database Performance - PowerPoint PPT Presentation

1 / 24
About This Presentation
Title:

Issues in Database Performance

Description:

Issues in Database Performance Performance in Read / write are hardware issues = throw money at it Performance of DB = ability of engine to locate data – PowerPoint PPT presentation

Number of Views:65
Avg rating:3.0/5.0
Slides: 25
Provided by: FAd52
Category:

less

Transcript and Presenter's Notes

Title: Issues in Database Performance


1
Issues in Database Performance
  • Performance in Read / write are hardware issues
    gt throw money at it
  • Performance of DB ability of engine to locate
    data
  • Factors affecting speed of retrieval
  • Cache (sizing of objects)
  • Access method (table scan / indexes)
  • Contention between processes
  • Indirect processes (roll back, archiving)

2
Reading data in Oracle
  • 1 determine how to access block where data is
    located
  • read data dictionary stored in separate part of
    DB
  • DD may be loaded up in cache (aka row cache) to
    limit I/O activity
  • 2 - DD indicates preferred access method for
    block
  • B tree index, partitioning, hash clusters etc
  • 3 - Search begins either in full scan or with
    index until data found

3
Designing DB for reading
  • Supply methods for high precision access to data
  • But some queries will defeat the strategies
  • E.g. credit cards transactions monthly report
    of scattered items
  • No solution take off-line

4
Changing data
  • Oracle makes hard work of changes
  • Rollback data (immediate)
  • Log files (long term)
  • Changed blocks read and updated in buffer
  • Released to disk as buffer is cleared
  • But rollback info generate most I/O operations
  • In sensitive environments, simultaneous archiving
    makes it worse (ARCHIVELOG mode)

5
Indexing Problems
  • Super-fast indexes need updating as data is
    changed gt DB slows down.
  • More complex index more complex update
    mechanism more rollback .
  • DB physical structure degrades, so does index (eg
    split blocks)
  • Performance decreases over time
  • Rebuild needed (which interferes with operations)

INDEX_STATS tells you how big the index has grown
6
Side effects
  • Role of DBMS enforce data consistency
  • A reading process may need an older version of
    the data
  • Need to create a private version of the data
  • gt processes that should be Read Only require
    writes
  • Attempt to describe all required operations in
    executing a query requiring old data

7
Solution
  • To create an older version of data
  • Must apply roll back
  • find old roll back block (I/O)
  • Roll back index (I/O)
  • Find data (I/O)
  • Roll back data
  • Read old data
  • Reverse all changes (multiple I/O)
  • Significant I/O implications buffer full of old
    stuff

8
Conclusions
  • Writers and readers DO interfere with each other
  • The mechanisms used by Oracle to bypass locking
    have performance side effects
  • Performance come with minimising I/O
  • i.e. with good access techniques
  • Precision of data location
  • Physical proximity of related data (cf caching)
  • However Techniques to reduce I/O numbers tend to
    reduce the speed of access!

9
Indexing example (see figure 3.1)
  • Alphabetic search using B tree index for name
    Oscar Smith
  • Split the table in sections (eg half) and read
    until find start beyond letter S go back one
  • Do same in branch block
  • Smith, N is the one
  • Then look for page with Smith, N in header
  • Scan for actual entry in index
  • Read address
  • Move to table
  • Read

10
About Btree indexes
  • Root and Branch blocks approx 2 of index
    (small)
  • In frequent hit situations, both blocks loaded in
    Data Buffer all the time
  • Then only 2 I/O may be required
  • One to read leaf block
  • One to read the table
  • In practice, read in index and in table may
    require reading several blocks
  • ?

11
Creating and Using Indexes
  • Important for live access,
  • Even more for querying multiple tables
  • Value matching is costly process in RDB
  • No pointers
  • Connection purely on comparison basis only
  • One value against all values in joint field
  • If link between 2 huge tables, perf is low
  • All RDBs use some form of indexing
  • Some complexity involved as index can reduce
    physical I/O at the cost of logical I/O (CPU
    time)

12
Btree indexing
  • Creating an index means creating a table with X1
    columns
  • X number of columns in index
  • Rowid (added field) table block row
  • Index is then copied into consecutive blocks
  • PCTFREE function leaves space for data growth
  • high value will generate many leaf blocks but
    reduce occurrence of split blocks in time
  • Pointer is added to previous and next leaf blocks
    in header of block

13
Btree indexing (2)
  • Then branch layer is built
  • If index gt one block
  • Collect all first entries block address of each
    leaf block
  • Write down into the first level branch block
    (packed)
  • If branch block is full, initiates second level
    of branch blocks etc.
  • Room is saved in branch blocks
  • No forward and backward pointer in branch blocks
  • Entries are trimmed to the bare minimum
  • First entries are omitted
  • See figure 6.1

14
Syntax
  • CREATE INDEX name ON table name (field1, field2
    )
  • PCTFREE 50
  • utility programmes to assess performance of
    indexes eg INDEX_STATS View

15
Updating indexes
  • Index entries are NEVER changed
  • Marked as deleted and re-inserted
  • Space made available cannot be used until after
    index is re-built
  • Inserts that dont fit split the block (rarely
    50/50!)
  • If a blocks becomes empty, it is marked as free,
    but is never removed
  • Also, blocks never merge automatically

16
Some problems
  • Some situations cannot be addressed with indexes
  • e.g. In a FIFO processing situation (e.g. a
    queue), indexes will prove counterproductive
  • Index may grow to stupid proportions even with
    small error rate (unsuccessful processing of
    data)
  • Every time a transaction is added or processed
    (deleted) the index must change

17
Alternative Bitmap indexing
  • Imagine following query in huge table
  • Find customers living in London, with 2 cars and
    3 children occupying a 4 bed house
  • Index not useful why?
  • Too big
  • If query changes in any way gtnew index needed
  • Maintaining a set of indexes for each query would
    just be too costly
  • Use a bitmap (see table 6.1, 6.2 and 6.3)

18
Bitmap indexes (2)
  • Special for data warehouse type DBs
  • Build one bitmap for each relevant parameter
  • Combine bitmaps using the and SQL keyword
  • Also possible to use noting of bitmap (see
    table 6.4, 6.5)

19
Key points to remember
  • What is the key advantage of a bitmap index?
  • What situation does it best suit ?
  • Bitmaps can also be packed by Oracle compression
    features
  • But size is unpredictable why?

20
Example
  • Table with 1,000,000 rows
  • Bitmap on one column that can contain one of 8
    different values (e.g. city names)
  • Data is such that all same city together 125,000
    times
  • Write the bitmap
  • Imagine what compression can be achieved
  • Data is such that cities are in random order, but
    same number of each
  • Same questions

21
solution
  • First scenario
  • Bitmap for first city 125,000 ones and 875,000
    zeros trimmed off
  • Size 125,000 bits or approx 18Kbytes
  • Full bitmap 156 Kbytes
  • Second scenario
  • Bitmap for first city sequences of 1s and 7
    zeros, repeated 125,000 times
  • Size 1,000,000 bits or approx 140 Kbytes
  • Full bitmap 1.12 Mbytes
  • But BTree index for such data would be around
    12MB

22
Conclusion
  • Bitmap indexes work best when combined
  • They are very quick to build
  • Up to a million rows for 10 seconds
  • Work best when limited number of values when
    high repetitions
  • Best way to deal with huge volumes gt make
    drastic selection of interesting rows before
    reading the table
  • Warning one entry in a Bitmap hundreds of
    records gt locking can be crazy (OLTP systems)
  • gt for datawarehouse type applications (no
    contention)

23
What oracle says about it
  • Use index for queries with low hit ratio or when
    queries access lt 2 - 4 of data
  • Index maintenance is costly so index just in
    case is silly
  • Must analyse the type of data when deciding what
    kind of index
  • Do NOT use columns with loads of changes in an
    index
  • Use indexed fields in where statement
  • Can also write queries with NO_INDEX

24
Administering indexes
  • Indexes degrade over time
  • Should stabilise around 75 efficiency, but dont
  • Run stats
  • Analyse index NAME validate structure
  • Analyse index NAME compute statistics
  • Analyse index NAME estimate statistics sample 1
    percent

See table 6.6
Write a Comment
User Comments (0)
About PowerShow.com