DIRECT ACCESS FILES

1
DIRECT ACCESS FILES
  • Instructor: Ebru A. Sezer
  • esezer_at_ata.cs.hun.edu.tr

2
Purpose of using hashes
  • Achieve O(1) average access time
  • For efficient evaluation of equality selections

3
Hashing
  • A hash function is a black box that produces an
    address every time you drop in a key
  • The hash function converts a key K to an address
  • Note: for now, think of the address as a record
    number in a file (or table)

4
Example
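  • Hash function: convert the key to its ASCII codes,
    multiply the codes of the first two characters, and
    use the rightmost three digits of the product
  • H(LOWELL) = 4, since 'L' = 76 and 'O' = 79 give
    76 × 79 = 6004, whose rightmost three digits are 004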
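
The example hash function above can be sketched in a few lines of Python. The function name `h` and the use of `% 1000` to keep the rightmost three digits are illustrative choices, not part of the slides.

```python
def h(key: str) -> int:
    """Multiply the ASCII codes of the first two characters
    and keep the rightmost three digits of the product."""
    product = ord(key[0]) * ord(key[1])
    return product % 1000  # rightmost three decimal digits


print(h("LOWELL"))  # 76 * 79 = 6004 -> 4
print(h("Lowell"))  # 76 * 111 = 8436 -> 436
```

The result depends on letter case: the value 4 is obtained with the uppercase codes, while a lowercase second letter gives a different address.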

5
Good Hash Function
  • Generated record numbers should be uniformly and
    randomly distributed over the file (0 < h(key) <
    N)
  • A small variation in the value of the key should
    cause a large variation in the value of H(key)
  • The hashing function must minimize the creation
    of synonyms (minimize collisions).

6
Load Factor
  • Load Factor (LF) = number of logical records /
    (number of blocks × records per block)
  • If the address space increases:
    • Collisions decrease
    • LF decreases
  • LF < 0.6 is unacceptable
  • Reduce collisions while keeping LF at an acceptable
    level
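
As a small worked example of the load factor formula, the sketch below uses made-up file sizes (600 records, 10 records per block) to show how enlarging the address space lowers LF. The function and parameter names are assumptions for illustration.

```python
def load_factor(num_records: int, num_blocks: int, records_per_block: int) -> float:
    """LF = logical records / (blocks * records per block)."""
    return num_records / (num_blocks * records_per_block)


print(load_factor(600, 100, 10))  # 0.6
print(load_factor(600, 200, 10))  # 0.3 (doubling the blocks halves LF)
```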

7
Hashing Transformations-1
  • Digit analysis
    • Select the digits at predetermined positions of
      the key
  • Division method
    • f(x) = x mod n
  • Radix transformation
    • f(abc) = a × 11^2 + b × 11 + c
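
The three transformations on this slide might look as follows in Python. This is a minimal sketch: the function names, the 0-based digit positions, and the sample keys are assumptions, and the radix function simply reads the decimal digits of the key as base-11 digits.

```python
def digit_analysis(key: int, positions) -> int:
    """Build the address from the digits at the chosen (0-based) positions."""
    digits = str(key)
    return int("".join(digits[p] for p in positions))


def division(key: int, n: int) -> int:
    """f(x) = x mod n"""
    return key % n


def radix_11(key: int) -> int:
    """Interpret the decimal digits of the key as a base-11 number."""
    value = 0
    for d in str(key):
        value = value * 11 + int(d)
    return value


print(digit_analysis(987654, (1, 3, 5)))  # digits 8, 6, 4 -> 864
print(division(1234, 97))                 # 70
print(radix_11(123))                      # 1*121 + 2*11 + 3 = 146
```

The radix result can exceed the address space, so using it as an address would need a further reduction step (for example the division method); the slide does not specify one.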

8
Hashing Transformations-2
  • Truncation Method
    • f(k) = k mod 10^n
    • Produces an n-digit address
  • Midsquare Method
    • f(k) = Mid_y(k^2), the middle y digits of k^2
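
A possible Python rendering of the two methods is shown below. The slide does not say how "middle" is defined when the digits cannot be centred exactly; this sketch assumes the window leans toward the left in that case.

```python
def truncation(key: int, n: int) -> int:
    """f(k) = k mod 10^n: keep the rightmost n digits."""
    return key % 10 ** n


def midsquare(key: int, y: int) -> int:
    """Return the middle y digits of key squared.

    Assumption: if the window cannot be centred exactly,
    it starts one digit further to the left.
    """
    squared = str(key * key).zfill(y)  # pad so at least y digits exist
    start = (len(squared) - y) // 2
    return int(squared[start:start + y])


print(truncation(1234567, 3))  # 567
print(midsquare(1234, 3))      # 1234^2 = 1522756 -> 227
```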

9
Hashing Transformations-3
  • Shift Folding
    • The key is partitioned from left to right into
      subnumbers
    • Each subnumber contains as many digits as the
      required address
    • The subnumbers are added and the carry is discarded
  • Boundary Folding
    • Same as shift folding, except that
    • the first and last subnumbers are reversed
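
Both folding methods can be sketched as below, assuming a decimal integer key and an address width of `n` digits. The helper `_split` and the sample key are illustrative; when the key length is not a multiple of `n`, the last subnumber is simply shorter.

```python
def _split(key: int, n: int) -> list:
    """Partition the key's digits, left to right, into n-digit subnumbers."""
    digits = str(key)
    return [digits[i:i + n] for i in range(0, len(digits), n)]


def shift_folding(key: int, n: int) -> int:
    """Add the subnumbers and discard the carry beyond n digits."""
    total = sum(int(part) for part in _split(key, n))
    return total % 10 ** n


def boundary_folding(key: int, n: int) -> int:
    """Like shift folding, but the first and last subnumbers are reversed."""
    parts = _split(key, n)
    parts[0] = parts[0][::-1]
    if len(parts) > 1:
        parts[-1] = parts[-1][::-1]
    total = sum(int(part) for part in parts)
    return total % 10 ** n


print(shift_folding(123456789, 3))     # 123 + 456 + 789 = 1368 -> 368
print(boundary_folding(123456789, 3))  # 321 + 456 + 987 = 1764 -> 764
```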

10
Overflow Management Techniques
  • Open Addressing
    • Linear search
    • Nonlinear search
  • Chaining
  • Extendible Hashing
  • Linear Hashing

11
Open Addressing
  • When a collision occurs, the new record is
    inserted in the first free location after the
    hashed address
  • Linear Search
  • With pump
  • Without pump

12
Insert Algorithm-Linear Search
  • Algorithm Insert
  • home_address := h(k)
  • i := home_address
  • found := False
  • while ((location i is full) and (not found)) do
    • i := (i + 1) mod (address space)
    • if (i = home_address) then found := True
  • if (found) then write (disk full)
  • else insert record in location i
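
The insert algorithm above can be written as runnable Python. In this sketch the file is modelled as a list with `None` for empty locations, and h(k) = k mod (address space) is an assumed hash function; neither detail comes from the slides.

```python
def insert_linear(table: list, key: int) -> int:
    """Insert key using linear search; return the location used."""
    size = len(table)
    home = key % size            # assumed h(k)
    i = home
    while table[i] is not None:  # location i is full
        i = (i + 1) % size
        if i == home:            # wrapped around: every location is full
            raise OverflowError("disk full")
    table[i] = key
    return i


table = [None] * 7
for k in (7, 14, 21, 10):
    insert_linear(table, k)
print(table)  # [7, 14, 21, 10, None, None, None]
```

Keys 7, 14 and 21 are synonyms (all hash to 0), so 14 and 21 are pushed to locations 1 and 2.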

13
Search Length
  • Average search length =
    total search length / total number of records
  • Total search length =
    sum of each record's search length
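
To make the two definitions concrete, the sketch below counts the probes needed to find each record in a small linear-search table. It assumes h(k) = k mod (table size) and a table built by linear search with keys 7, 14, 21 and 10; both are illustrative assumptions.

```python
def search_length(table: list, key: int) -> int:
    """Number of locations examined to find key with linear search."""
    size = len(table)
    i = key % size  # assumed h(k)
    probes = 1
    while table[i] != key:
        i = (i + 1) % size
        probes += 1
        if probes > size:
            raise KeyError(key)  # key is not in the table
    return probes


# Table produced by inserting 7, 14, 21, 10 with linear search.
table = [7, 14, 21, 10, None, None, None]
records = [k for k in table if k is not None]

total = sum(search_length(table, k) for k in records)
average = total / len(records)
print(total, average)  # 1 + 2 + 3 + 1 = 7, average 1.75
```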

14
Open Addressing-Nonlinear Search
  • (a) Rehashing
    • Employs a second hashing function
    • For overflow records, rehashing is applied until
      a free location is found or a full file is detected
    • r(i) = (i + p) mod q

15
Insert Algorithm-Rehashing
  • Algorithm Insert
  • home_address := h(k)
  • i := home_address
  • found := False
  • while ((location i is full) and (not found)) do
    • i := r(i)
    • if (i = home_address) then found := True
  • if (found) then write (disk full)
  • else insert record in location i
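
A runnable version of the rehashing insert, with r(i) = (i + p) mod q, might look like this. The list-based table, the choice h(k) = k mod q, and the sample values p = 3 and q = 7 are assumptions made for the example.

```python
def insert_rehash(table: list, key: int, p: int) -> int:
    """Insert key using r(i) = (i + p) mod q; return the location used."""
    q = len(table)
    home = key % q               # assumed h(k)
    i = home
    while table[i] is not None:  # location i is full
        i = (i + p) % q          # i := r(i)
        if i == home:            # probe sequence returned to its start
            raise OverflowError("disk full")
    table[i] = key
    return i


table = [None] * 7
for k in (7, 14, 21):
    insert_rehash(table, k, p=3)
print(table)  # [7, None, None, 14, None, None, 21]
```

If p and q share a common factor, the probe sequence returns to the home address before visiting every location, so the file can be reported full while free locations remain. Choosing p and q relatively prime avoids this.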

16
Open Addressing-Nonlinear Search
  • (b) Double Hashing
    • Has two hashing functions, h1 and h2
    • For overflow records, h2 is used until a free
      location is found or a full file is detected
    • r(i) = (i + h2(k)) mod q
    • The purpose is to remove the similarity between
      the probe sequences that synonyms follow under
      rehashing
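
The following sketch illustrates double hashing with r(i) = (i + h2(k)) mod q. The slides do not define h1 and h2; here h1(k) = k mod q and h2(k) = 1 + (k mod (q - 1)) are assumed, with q = 7 prime so that every step size visits all locations.

```python
def h1(key: int, q: int) -> int:
    return key % q            # assumed primary hash


def h2(key: int, q: int) -> int:
    return 1 + key % (q - 1)  # assumed secondary hash, never zero


def insert_double(table: list, key: int) -> int:
    """Insert key using r(i) = (i + h2(k)) mod q; return the location used."""
    q = len(table)
    home = h1(key, q)
    step = h2(key, q)
    i = home
    while table[i] is not None:  # location i is full
        i = (i + step) % q
        if i == home:
            raise OverflowError("disk full")
    table[i] = key
    return i


table = [None] * 7
for k in (7, 14, 21):
    insert_double(table, k)
print(table)  # [7, None, None, 14, 21, None, None]
```

Keys 14 and 21 are synonyms under h1, but their step sizes differ (3 and 4), so they no longer follow the same sequence of overflow locations.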
