Query Tree Question - PowerPoint PPT Presentation

1 / 42
About This Presentation
Title:

Query Tree Question

Description:

Query Tree Question. Should we do a ppname, pnumber then spname = Aquarius' then p pnumber ? ... project, see if pname = Aquarius' then use pnumber to perform ... – PowerPoint PPT presentation

Number of Views:33
Avg rating:3.0/5.0
Slides: 43
Provided by: susanv5
Category:

less

Transcript and Presenter's Notes

Title: Query Tree Question


1
Query Tree Question
  • Should we do a ppname, pnumber then
    spname Aquarius then p pnumber ?
  • No, since the operations are done together
  • the processor would read a row of project, see if
    pname Aquarius then use pnumber to perform
    the join.
  • Our query tree has only 2 groups, not 3

2
  • Select Operation Strategies
  • And Indexing
  • (Chapter 8)
  • Some info on slides from Dr. S. Son, U. Va

3
Disk access
  • DBs traditionally stored on disk
  • Cheaper to store on disk than in memory
  • Costs for
  • Seek time, latency, data transfer time
  •  Disk access is page oriented
  • 2 - 4 KB page size

4
Access time
  • Time to randomly access a page
  • 12-20 ms which is 50-83 I/O's per second
  • Large disparity between disk access and memory
    access (10-200 ns)
  • System initially determines if page in memory
    buffer (page tables, etc.)

5
Table scan
  • Linear search - all data rows read in
  • I/O parallelism can be used
  • multiple I/O read requests satisfied at the same
    time
  • stripe the data across different disks       
  • Problems with parallelism?
  • must balance disk arm load to gain maximum
    parallelism
  • requires the same total number of random I/O's,
    but using devices for a shorter time

6
Sequential prefetch I/O
  • Retrieve one disk page after another (on same
    track) - typically 32
  • Seek time no longer a problem
  • Must know in advance to read 32 successive pages
  • Speed up of I/O by a factor of 10 (500 I/O's
    per second vs. 70)

7
Access time
  • Seek time 10-15ms
  • Latency time 2-5 ms
  • Data transfer time 10-200 ns

8
Access time for fast I/O
  • RIO            Seq. Prefetch .010            
    .010                    Seek - disk arm to
    cylinder .002             .002                   
    Latency - platter to sector .0015          
    .048                 Data transfer - Page
    .0135           .060                   1 page
    vs. 32 pages
  • .43 seconds  .060 seconds for 32 pages for
    both

9
Textbook access time
  • RIO            Seq. Prefetch .008            
    .008                   Seek - disk arm to
    cylinder .004            .004                 
    Latency - platter to sector .0005          
    .016               Data transfer - Page
    .0125           .028                   1 page
    vs. 32 pages
  • .40 seconds  .028 seconds for 32 pages for
    both

10
Disk allocation
  • Disk Resource Allocation for Databases (DBA has
    control)
  • Goal contiguous sectors on disk - want data as
    close together as possible  to minimize seek time
  • No standard SQL approach, but general way to deal
    with allocation
  • Some OS allow specification of size of file and
    disk device

11
Tablespace
  • Allocation medium for tables and indexes for
    ORACLE, DB2, etc.
  • Usually relations (files) cannot span disk
    devices
  • Can put gt1 table in a table space if accessed
    together
  • Tablespace corresponds to 1 or more OS files and
    can span disk devices

12
Query Language
  • ORACLE DB's contain several tablespaces,
    including one called system -     data
    description   indexes user-defined tables
  • Create tablespace tspace1 datafile 'fname1',
    'fname2'
  • default tablespace given to each user
  • if multiple tablespaces - better control over
    load balancing
  • can take some disk space off-line

13
Extent
  • extent - contiguous storage on disk
  • when data segment or index segment first created,
    given an initial extent from tablespace 10KB (5
    pages)
  • if need more space given next contiguous extent
  • can increase the size by a positive (cannot
    decrease)                     initial n - size
    of initial extent                     next n -
    size of next                     max extents -
    maximum number of extents                    
    min extents - number of extents initially
    allocated                     pct increase n -
    by which next extent
  • grows over previous one

14
Create table
  • Create table statement - can specify tablespace,
    no. of extents
  • When initial extent full, new extent allocated
  • pctfree - determine how much space can be used
    for inserts of new rows
  • if pctfree 10, inserts stop when page is 90
    full
  • pctused determines when new inserts start again
  • if fall below certain percentage of total,
    default pctused 40                  pctfree
    pctused lt 100

15
Rows
  • Row layout on each disk page (see figure)
  • Row directory row number and page byte offset
  • Row number is row number in page book calls it
    slot
  • Page byte offset with varchar, row size not
    constant
  • To identify a particular row use RID (RowID)
  • page , slot file
  • slot is number in row directory (logical )

16
Differences in DBMSs
  • RID can be retrieved in ORACLE but not DB2
    (violates relational model rule)
  • ORACLE
  • rows can be slit between pages (row record
    fragmentation)
  • Can have rows from multiple tables on same page,
    more info
  • DB2, no splitting, entire row moved to new page,
    need forwarding pointer

17
Binary Search
  • Find all students with gpa gt 3.0
  • If data is in sorted file, do binary search to
    find first such student, then scan to find
    others.
  • Cost of binary search can be quite high.
  • Simple idea Create an index file.

Index File
kN
k2
k1
Data File
Page N
Page 3
Page 1
Page 2
18
Binary Search
  • Binary search on disk
  • optimal for comparisons - not optimal for
    disk-based look-up
  • must keep data in order
  • may be reading values from same page at
    different times
  •  Instead use B-tree index

19
Indexing
  • Keyed access retrieval method
  • index is a sorted file - sorted by index key
  • index entries
  • index key pointer  (RID)
  •   
  • pointer is RID  
  • index resides on disk, partially memory resident
    when accessed

20
Indexing
  • As for any index, 3 alternatives for data entries
    k
  • Data record with key value k
  • ltk, rid of data record with search key value kgt
  • ltk, list of rids of data records with search key
    kgt
  • Choice is orthogonal to the indexing technique
    used to locate data entries k.
  • Tree-structured indexing techniques support both
    range searches and equality searches.
  • B tree dynamic, adjusts gracefully under
    inserts and deletes.

21
B-tree
  • Most commonly used index structure type in DBs
    today
  • Based on B-tree
  • Used to minimize disk I/O
  • available in DB2, ORACLE also has hash cluster,
    Ingres has heap structure, B-tree, isam (chain
    together new nodes) Example

22
Structure of B Trees
  • leaf level pointers to data (RIDs)
  • the remaining are directory (index) nodes that
    point to other index nodes

23
Characteristics of B Tree
  • Insert/delete at log F N cost keep tree
    height-balanced. (F fanout, N leaf pages)
  • Minimum 50 occupancy (except for root). Each
    node contains d lt m lt 2d entries. The
    parameter d is called the order of the tree.
  • Supports equality and range-searches efficiently

24
Cost of I/O for B-tree
  • Assume number of entries in each index node fits
    on one page - one node is one page
  • If tree with depth of 3, 3 I/Os to get pointer
    to data B-tree structured to get most out of
    every disk page read
  • Read in index node, can make multiple probes to
    same page if remains in memory
  • likely since frequent access to upper -level
    nodes of actively used B-trees

25
B Trees in Practice
  • Typical order 100. Typical fill-factor 67.
  • average fanout 133
  • Typical capacities
  • Height 4 1334 312,900,700 records
  • Height 3 1333 2,352,637 records
  • Can often hold top levels in buffer pool
  • Level 1 1 page 8 Kbytes
  • Level 2 133 pages 1 Mbyte
  • Level 3 17,689 pages 133 MBytes

26
B-tree
  • Index has a directory structure that allows
    retrieval of a range of values efficiently
  • search for leftmost index entry Si such that
  • X lt Si
  • Index entries always placed in sequence by value
    - can use sequential prefetch on index
  • Index entries shorter than data rows and require
    proportionately less I/O

27
B-tree
  • Balancing of B-trees - insert, delete
  • nodes usually not full
  • utilities to reorganize to lower disk I/O
  • most systems allow nodes to become depopulated-
    no automatic algorithm to balance
  • average node below root level 71 full in active
    growing B-trees

28
Inserting into B Tree
  • Find correct leaf L.
  • Put data entry onto L.
  • If L has enough space, done!
  • Else, must split L (into L and a new node L2)
  • Redistribute entries evenly, copy up middle key.
  • Insert index entry pointing to L2 into parent of
    L.
  • This can happen recursively
  • To split index node, redistribute entries evenly,
    but push up middle key. (Contrast with leaf
    splits.)
  • Splits grow tree root split increases height.
  • Tree growth gets wider or one level taller at
    top.
  • Algorithm from MSU

29
Deleting from B tree
  • Start at root, find leaf L where entry belongs.
  • Remove the entry.
  • If L is at least half-full, done!
  • If L has only d-1 entries,
  • Try to re-distribute, borrowing from sibling
    (adjacent node with same parent as L).
  • If re-distribution fails, merge L and sibling.
  • If merge occurred, must delete entry (pointing to
    L or sibling) from parent of L.
  • Merge could propagate to root, decreasing height.
  • Algorithm from MSU

30
Duplicate key values
  • Duplicate key values in index
  • leaf nodes have sibling pointers
  • but a delete of a row that has a heavily
    duplicated key entails a long search through the
    leaf-level of the B-tree
  • Index compression - with multiple duplicates
  • header info PrX keyval RID RID ... RID PrX
    keyval RIDRID
  • where PrX is count of RID values

31
Create Index
  •    Options        
  • multiple columns         tablespace
            storage - initial extents, etc.
            percent free default 10
  • of each page left unfilled
  • free page (1 free page for every n index
    pages)
  •     Can control of B-tree node pages left
    unfilled when index created, refers to initial
    creation

32
Clustering
  • Placing rows on disk in order by some common
    index key value        (remember the index
    itself is always sorted)
  • clustered (clustering) index - index with rows in
    the same order as the key values
  • efficiency advantage        read in a page, get
    all of the rows with
  • the same value
  • clustering is useful for range queries
            e.g.  between keyval1 and keyval2

33
Clustering
  • can only cluster table by 1 clustering index at a
    time
  • In DB2
  • if the table is empty, rows sorted as placed on
    disk
  • subsequent insertions not clustered, must use
    REORG

34
Indexes vs. table scan
  • To illustrate the difference between table scan,
    secondary index (non clustered)     and
    clustered index
  • Assume 10 M customers, 200 cities
  • 2KB/page, row 100 bytes, 20 rows/page
  •             Select             From
    Customers             Where city Birmingham
  • 1/200 10M if assume selectivity 1/200
  • 50,000 customers in a city

35
Table Scan
  • Table Scan - read entire table
  • 10,000,000/20 500,000 pages   
  • If use prefetch?
  • 500000/32 .?

36
Clustering Index
  • Clustering Index
  • All entries for B'ham clustered on same pages
  • 50,000/20 2500 pages (with 20 rows per page)  
  • (3 50 2500)?

37
Secondary Index
  • Secondary Index
  • In the worst case 1 entry for B'ham per page
  • 50,000 pages (10M/200)
  • 3 upper nodes of the tree  
  • Assume 1000 index entries per leaf node, read
    50000/1000 index pages
  • (3 50 50,000)?

38
List Prefetch
  • Create list of data pages to access
  • system orders pages to minimize disk I/O
  • E.g. elevator algorithm for disk request
    scheduling

39
Free
  • Redo the previous calculations assuming relations
    created with 50 free option specified.

40
Multiple Indexes
  • More than one index on a relation            
  • e.g. class - one index, gender - one index

41
Composite Index
  • One index based on more than one attribute
     Create Index index_name on Table (col1,
    col2,... coln)
  •    Composite index entry - values for each
    attribute             class, gender            
    entry in index is  C1, C2, RID
  • What would B tree look like?

42
Creating Indexes
  • When determining what indexes to create consider
    workload - mix of queries and frequencies of
    requests             20 of requests are
    updates, etc.
  •             can create lots of indexes but
                    cost to create                
    insertions                 initial load time
    high if a large table                 index
    entries can become longer and longer as
    multiple columns included
Write a Comment
User Comments (0)
About PowerShow.com