Data Models - PowerPoint PPT Presentation

Description: CS470, Chapter 5: Data Models.

Provided by: chitrad

1
Chapter 5
Data Models
  • Structure may be considered at different levels
  • High level/Conceptual (similar to the user's
    viewpoint)
      • Architect's view
  • Mid-level/Representational/Implementational
      • Civil engineer's view
  • Low-level/Physical (details of storage)
      • Contractor's view

Abstraction
2
Concepts
  • The true meaning of "near and dear" and "A bird
    in hand is worth two in the bush"
  • Cost (and performance) decreases with distance
  • PDA > laptop > desktop
  • Capacity increases with distance
  • I can carry a backpack, my car can carry my bike,
    my house can hold a lot more
  • Why getting ABS on your new car means also paying
    for the moonroof package
  • Flexibility in degree of granularity drives up
    costs (and retards performance)
  • Reason why FEDEX has pickups at end of day

3
Concepts
  • Increasing efficiency by parallelizing
  • Racing car team takes seconds to change all four
    tires
  • Protection against complete loss increases the
    probability of partial loss
  • Spreading the eggs (Diverse stock portfolio
    versus investing entire savings in Enron or
    Worldcom)

4
Physical storage
  • Primary faster than Secondary storage but smaller
  • Primary
  • Cache (L1, L2) faster than Main memory but
    smaller
  • Secondary
  • Magnetic disk faster than CD-ROM faster than
    magnetic tape
  • Tape jukeboxes
  • Flash memory (EEPROM)
  • Modem upgrades

5
Database storage
  • Dirty secret: Ultimately, everything is organized
    in terms of files, each containing a set of
    records
  • Records in file may be
      • Heaped (Stuff in teenager's room)
      • Sequential (Sorted, can use binary search)
      • Hashed (Almost unique shortcut to locations on
        disk)
      • Arranged as B-trees (Efficient retrieval)
  • Only part of database in main memory
      • Rest online on disk or offline on tape
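
The contrast between heaped and sequential files can be sketched in C. This is a minimal illustration, assuming fixed-length records held in an in-memory array; the `struct record` layout and the function names are hypothetical and not taken from the slides.

```c
#include <stddef.h>

/* Hypothetical fixed-length record; field names are illustrative only. */
struct record {
    int  key;
    char payload[28];
};

/* Sequential (sorted) file: binary search over records ordered by
 * ascending key. Returns the record index, or -1 if the key is absent. */
long find_sequential(const struct record *file, size_t count, int key)
{
    size_t lo = 0, hi = count;              /* half-open range [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (file[mid].key == key)
            return (long)mid;
        if (file[mid].key < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    return -1;
}

/* Heaped (unordered) file: nothing better than a linear scan is possible. */
long find_heaped(const struct record *file, size_t count, int key)
{
    for (size_t i = 0; i < count; i++)
        if (file[i].key == key)
            return (long)i;
    return -1;
}
```

The sorted file needs about log2(count) probes per lookup, while the heap may have to examine every record.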

6
Gramophone players still rule?
  • Disk drives (Figure 5.1) are digital gramophone
    players
  • Similarities
      • Circular Disks (Records)
      • Head (Stylus)
      • Seek time (Time to position stylus at beginning
        of song)
      • Latency/Rotational Delay (Wait for song to begin)
  • Differences
      • Disk packs/Cylinders (One LP record, one track at
        a time)
      • Concentric digital tracks (Spiral analog groove)
      • Blocks (Scratch by scratch)
      • Disk controller (Human being)
      • Uniform/Variable sector density (Variable sector
        density)
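
Seek time and rotational delay add up to the cost of reaching a block. The sketch below estimates that cost, assuming the usual approximation that the head waits half a rotation on average; the function names and parameters are hypothetical drive characteristics, not figures from the slides.

```c
/* Average rotational delay in milliseconds for a disk spinning at `rpm`.
 * Assumes the target sector is, on average, half a rotation away. */
double avg_rotational_latency_ms(double rpm)
{
    double ms_per_rotation = 60000.0 / rpm;   /* 60,000 ms in a minute */
    return ms_per_rotation / 2.0;
}

/* Estimated time to read one block: position the head (seek), wait for
 * the sector to rotate under it (latency), then transfer the data. */
double block_access_ms(double seek_ms, double rpm, double transfer_ms)
{
    return seek_ms + avg_rotational_latency_ms(rpm) + transfer_ms;
}
```

For example, `block_access_ms(8.0, 6000.0, 1.0)` combines an 8 ms seek, a 5 ms average rotational delay, and a 1 ms transfer.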

7
Redundant Array of Inexpensive (Independent) Disks
  • Improved Reliability and Performance compared to
    individual disks
  • Protect against total data loss (and improve
    performance) by striping across multiple disks
      • Bit level OR Block level
  • Protect against partial data loss by redundant
    copies of data
      • Duplication (Mirroring/Shadowing)
      • Error correcting codes (Parity bits, minimal
        redundant information)
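
Block-level striping and parity can be illustrated with a few lines of C. This is a sketch under simplifying assumptions: round-robin placement of blocks across disks and a single XOR parity block per stripe. The function names are hypothetical.

```c
#include <stddef.h>

/* Block-level striping, assuming round-robin placement:
 * logical block b is stored on disk (b % n_disks) in stripe (b / n_disks). */
size_t stripe_disk(size_t block, size_t n_disks) { return block % n_disks; }
size_t stripe_row(size_t block, size_t n_disks)  { return block / n_disks; }

/* XOR `src` into `dst`, byte by byte. Folding every data block of a stripe
 * into a zeroed buffer yields the parity block. Folding the parity block and
 * all surviving data blocks into a zeroed buffer rebuilds one lost block. */
void xor_into(unsigned char *dst, const unsigned char *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] ^= src[i];
}
```

Parity stores one extra block per stripe rather than a full copy, which is the "minimal redundant information" trade-off compared with mirroring.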

8
Concurrency (Multi-tasking, NOT Concurrency Control)
  • Apparent concurrency: Switching between jobs
    faster than ability to detect (Superman dealing
    with Hawaii volcano and Himalayan avalanche at
    the same time)
  • True concurrency: Simultaneous execution (Figure
    5.5)
      • Accessing RAID disks
      • Double buffering (Fig 5.6)
          • Process contents of block n in buffer A while
            block n+1 is being read into buffer B
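
The buffer alternation in double buffering can be sketched as follows. This is a sequential simulation only: `read_block` is a hypothetical stand-in for a disk transfer, and on real hardware that read would proceed in parallel with `process_block` rather than before it. The block size and the byte-summing "processing" are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE 4   /* tiny block size, chosen only for illustration */

/* Stand-in for a disk read: copies block n of `disk` into `buf`. */
static void read_block(const unsigned char *disk, size_t n, unsigned char *buf)
{
    memcpy(buf, disk + n * BLOCK_SIZE, BLOCK_SIZE);
}

/* Stand-in for processing: sums the bytes of one block. */
static unsigned long process_block(const unsigned char *buf)
{
    unsigned long sum = 0;
    for (size_t i = 0; i < BLOCK_SIZE; i++)
        sum += buf[i];
    return sum;
}

/* Processes n_blocks blocks using two buffers that swap roles each round. */
unsigned long sum_blocks_double_buffered(const unsigned char *disk,
                                         size_t n_blocks)
{
    unsigned char buffers[2][BLOCK_SIZE];
    unsigned long total = 0;

    if (n_blocks == 0)
        return 0;

    read_block(disk, 0, buffers[0]);            /* prime buffer A */
    for (size_t n = 0; n < n_blocks; n++) {
        unsigned char *current = buffers[n % 2];
        unsigned char *next = buffers[(n + 1) % 2];

        if (n + 1 < n_blocks)
            read_block(disk, n + 1, next);      /* would overlap with the
                                                   processing below */
        total += process_block(current);
    }
    return total;
}
```

Because the next block always lands in the buffer that is not being processed, the processor never has to wait for a buffer to be refilled, provided the read finishes in time.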

9
Record formats
  • Records and Record Types (déjà vu?)
      • Record type = Field names + Field types +
        Sequence (implicit)
      • Record = Collection of either actual values for
        fields OR pointers to values (Binary Large
        Objects - BLOBs)
  • Example
      struct movie {
          char name[50];
          char releaseDate[8];
      };
  • Film reel

10
Record Length
  • Records may be
  • Fixed-length: Good performance, wasted space
  • Variable-length: Compact space, bookkeeping
    overhead
  • Heterogeneous file
  • Variable field lengths
  • Optional fields
  • Repeating (multivalued, déjà vu?) fields
  • Accommodate variable length by
  • Including field name with field value
  • Field separator OR field length parameter
  • Nulls (pseudo variable length)
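
The "field length parameter" option can be sketched in C. This is a minimal illustration, assuming each field is stored as a one-byte length followed by its bytes; the encoding and the function names are hypothetical, not a format defined in the slides.

```c
#include <stddef.h>
#include <string.h>

/* Appends one field to the record as [1-byte length][bytes].
 * Returns the new write offset, or 0 if the field is longer than
 * 255 bytes or does not fit in the remaining capacity. */
size_t put_field(unsigned char *rec, size_t cap, size_t off, const char *value)
{
    size_t len = strlen(value);
    if (len > 255 || off + 1 + len > cap)
        return 0;
    rec[off] = (unsigned char)len;
    memcpy(rec + off + 1, value, len);
    return off + 1 + len;
}

/* Copies field number `index` (0-based) into `out` as a C string.
 * Returns the field length, or -1 if the field does not exist,
 * the record is malformed, or `out` is too small. */
int get_field(const unsigned char *rec, size_t used, size_t index,
              char *out, size_t out_cap)
{
    size_t off = 0;
    while (off < used) {
        size_t len = rec[off];
        if (off + 1 + len > used)
            return -1;                       /* length runs past the record */
        if (index == 0) {
            if (len + 1 > out_cap)
                return -1;
            memcpy(out, rec + off + 1, len);
            out[len] = '\0';
            return (int)len;
        }
        index--;
        off += 1 + len;                      /* skip to the next field */
    }
    return -1;
}
```

The bookkeeping overhead shows up in `get_field`: reaching field k means walking past the k fields before it, whereas a fixed-length layout could jump straight to a known offset.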

11
Putting records into blocks
  • If blockSize > recordSize, may have more than one
    record per block
  • Blocking factor = number of records/block
    = ⌊blockSize / recordSize⌋
  • If blockSize is a perfect multiple of recordSize,
    no space wasted (Figure 5.8)
  • Blocking factor for variable length records?
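
The blocking-factor arithmetic for fixed-length, unspanned records can be written out directly. The sketch below assumes sizes in bytes; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Blocking factor: whole records that fit in one block.
 * Integer division on unsigned values is the floor operation. */
size_t blocking_factor(size_t block_size, size_t record_size)
{
    assert(record_size > 0);
    return block_size / record_size;
}

/* Bytes left unused at the end of each block when records do not span. */
size_t wasted_bytes_per_block(size_t block_size, size_t record_size)
{
    return block_size - blocking_factor(block_size, record_size) * record_size;
}

/* Blocks needed to hold n_records unspanned records (ceiling division). */
size_t blocks_needed(size_t n_records, size_t block_size, size_t record_size)
{
    size_t bfr = blocking_factor(block_size, record_size);
    assert(bfr > 0);   /* a record larger than a block would need spanning */
    return (n_records + bfr - 1) / bfr;
}
```

With 512-byte blocks and 58-byte records, the blocking factor is 8 and 48 bytes per block go unused; with 64-byte records the block divides evenly and nothing is wasted.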

12
Putting blocks into records
  • If recordSize > blockSize, need spanned records
  • Spanned records also worth considering if
      • recordSize is almost as large as blockSize
      • record-length is variable (Figure 5.8)
  • Overhead in performance
      • More than one block may need to be retrieved
      • Need to search for start of record
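
The retrieval overhead of spanned records can be made concrete by counting how many blocks a single record touches. This sketch assumes records are packed back to back and addressed by byte offset within the file; the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Number of blocks that must be read to fetch a record of `record_size`
 * bytes starting at byte `offset`, when records may span block boundaries. */
size_t blocks_touched(size_t offset, size_t record_size, size_t block_size)
{
    assert(block_size > 0);
    if (record_size == 0)
        return 0;
    size_t first = offset / block_size;                      /* block holding the first byte */
    size_t last  = (offset + record_size - 1) / block_size;  /* block holding the last byte */
    return last - first + 1;
}
```

A 100-byte record at offset 0 in 512-byte blocks touches one block, but the same record at offset 500 straddles a boundary and costs two reads.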

Putting blocks into files
  • Consecutive OR Linked
  • Clusters are a compromise