Title: Data Storage
1. Data Storage
Source: our textbook
2. Strawman Implementation
- Use the UNIX file system to store relations, e.g.,
  Students(name, id, dept) in file /usr/db/Students
- One line per tuple, each component stored as a
  character string, with # as a separator, e.g., a
  tuple could be Smith#123#CS
- Store the schema in /usr/db/schema, e.g.,
  Students#name#STR#id#INT#dept#STR
3. Strawman cont'd
- To execute SELECT * FROM R WHERE <Condition>
  (sketched in code below):
  - read the schema file for R
  - check that <Condition> is valid for R
  - print out the column headers
  - read file R, and for each line:
    - check the condition
    - print the line if true
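A minimal Python sketch of this loop, assuming the #-separated file format from the previous slide; the `read_schema` helper and passing <Condition> as a Python predicate are illustrative choices, not the textbook's (and validity checking of the condition is omitted):

```python
def read_schema(schema_path, relation):
    """Find the schema line for `relation`, e.g. Students#name#STR#id#INT#dept#STR."""
    with open(schema_path) as f:
        for line in f:
            parts = line.rstrip("\n").split("#")
            if parts[0] == relation:
                # after the relation name, fields alternate: name, type, name, type, ...
                return list(zip(parts[1::2], parts[2::2]))
    raise KeyError(relation)

def select(relation, condition, db_dir="/usr/db", schema_file="/usr/db/schema"):
    schema = read_schema(schema_file, relation)      # read schema file for R
    print("\t".join(name for name, _ in schema))     # print column headers
    with open(f"{db_dir}/{relation}") as f:
        for line in f:                               # read file R, line by line
            fields = line.rstrip("\n").split("#")
            row = dict(zip((name for name, _ in schema), fields))
            if condition(row):                       # check the condition
                print(line.rstrip("\n"))             # print the line if true

# e.g. select("Students", lambda t: t["dept"] == "CS")
```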
4. Strawman cont'd
- To execute a query involving a join of two
  relations R and S (sketched in code below):
  - for each tuple (line) in R do
    - for each tuple (line) in S do
      - if the condition is satisfied then
        - display the desired attributes
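The join in the same style: a brute-force nested-loop join over the tuple files. The paths and the attribute positions in the usage example are hypothetical:

```python
def join(r_path, s_path, condition):
    """Nested-loop join: for each tuple (line) of R, scan all of S."""
    with open(r_path) as rf:
        for r_line in rf:
            r = r_line.rstrip("\n").split("#")
            with open(s_path) as sf:        # S is re-read once per R tuple!
                for s_line in sf:
                    s = s_line.rstrip("\n").split("#")
                    if condition(r, s):     # if the condition is satisfied
                        print(r + s)        # display the desired attributes

# e.g. join("/usr/db/Students", "/usr/db/Enrolled",
#           lambda r, s: r[1] == s[0])      # hypothetical: Students.id = Enrolled.id
```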
5. What's Wrong?
- The storage of the tuples on disk is inflexible:
  if a student changes major from EE to ECON, the
  entire file must be rewritten
- Search is very expensive (read the entire relation)
- Query processing is "brute force" -- there are
  faster ways to do joins, etc.
- Data is not buffered between disk and main memory
- No concurrency control
- No reliability in case of a crash
6. How to Fix These Problems
- Take advantage of the characteristics of computer
  hardware, with clever algorithms, to do things better
- We will cover:
  - data storage (predominantly disks)
  - how to represent data elements
  - indexes
  - query optimization
  - failure recovery
  - concurrency control
7. Memory Hierarchy
From faster, smaller, more expensive to slower, larger, cheaper:
- cache
- main memory
- secondary storage (disk)
- tertiary storage (tapes, CD-ROM)
8. Cache Memory
- Transfer a few bytes at a time between cache and
  main memory: an instruction, integer, floating
  point number, or short string
- Processor operates on instructions and data in the
  cache
- Typical size: 1 Mbyte (2^20 bytes)
- Typical speed to/from main memory: 100 nanosec
  (1 nanosec = 10^-9 sec)
9. Main Memory
- Typical size: 100 Mbytes to 10 Gbytes
  (1 Gbyte = 2^30 bytes)
- Typical access speed (to read or write): 10 to
  100 nanosec
- At least 100 times larger than cache
- At least 10 times slower than cache
10. Secondary Storage
- Usually disk
- Divided logically into blocks, the unit of transfer
  between disk and main memory (a transfer is called
  a disk I/O)
- Typical size: 100 Gbytes
- Typical access speed: 10 millisec (1 millisec = 10^-3 sec)
- At least 100 times larger than main memory
- Much slower than main memory, and much, much slower
  than cache: the processor can execute several million
  instructions during one disk I/O
11. Tertiary Storage
- Tape(s)
- CD-ROM(s)
- At least 1000 times slower than secondary storage
- At least 1000 times larger than secondary storage
12. Volatile vs. Nonvolatile
- Storage is volatile if the data is lost when the
  power is gone
- Usually main memory is volatile
- Usually secondary and tertiary storage are
  nonvolatile
- Thus every change made to a database in main
  memory must be backed up on disk before it can be
  permanent.
13Disks
platters each has two surfaces, each surface
consists of tracks (concentric rings)
one head per surface, very close to surface, does
the reading and writing
spindle
disk heads
14More on Disks
orange ring is a track black squares are
gaps, which don't hold data part of track
between two gaps is a sector one or more
sectors make a block
15Disk Controller
- controls mechanical actuator that moves the heads
in and out (radius, distance from spindle) - one track from each surface at the same radius
forms a cylinder - selects a surface
- selects a sector (senses when that sector is
under the corresponding head) - transfers bits
16. Typical Values
- Rotation speed: 5400 rpm
- Number of platters: 5
- Number of tracks/surface: 20,000
- Number of sectors/track: 500
- Number of bytes/sector: thousands
17. Disk Latency for a Read
- Time between issuing the command to read a block and
  when the contents of the block appear in main memory.
  Its components (a worked example follows below):
  - time for the processor and disk controller to
    process the request, including resolving any
    contention (negligible)
  - seek time: time to move the heads to the correct
    radius (0 to 40 millisec)
  - rotational latency: time until the first sector of
    the block is under the head (about 5 millisec)
  - transfer time: until all sectors of the block have
    passed under the head; depends on rotation speed
    and size of the block
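As a rough sanity check, these terms can be computed from slide 16's typical values; the block size of 16 sectors is an assumption for illustration:

```python
# Rough read latency for one block, using slide 16's typical values.
rpm = 5400
sectors_per_track = 500
sectors_per_block = 16                       # assumption: block = 16 sectors

rotation_time = 60.0 / rpm                   # one revolution: ~11.1 ms
avg_rotational_latency = rotation_time / 2   # ~5.6 ms, matching "about 5 millisec"
avg_seek = 0.020                             # midpoint of the 0-40 ms seek range
transfer = rotation_time * sectors_per_block / sectors_per_track   # ~0.36 ms

total = avg_seek + avg_rotational_latency + transfer
print(f"~{total * 1000:.1f} ms per block read")   # ~26 ms, dominated by seek + rotation
```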
18. Disk Latency for Updates
- For a write: like a read, plus verification
  (read back and compare)
- To modify a block:
  - read it into main memory
  - change it in main memory
  - write it back to disk
19Moral of the Story
- Disks accesses are orders of magnitude slower
than accesses to main memory. - They are unavoidable in large databases.
- Thus do everything possible to minimize them.
- Can lead to different algorithms than for main
memory model.
20. Speeding Up Disk Accesses
- Place blocks that are accessed together on the same
  cylinder
  - reduces seek time and rotational latency
- Divide data among several disks
  - head assemblies can move in parallel
- Mirror a disk, i.e., make copies of it
  - speeds up reads: get the data from the disk whose
    head is closest to the desired block
  - no effect on writes: must write to all copies
  - also helps with fault tolerance
21. Speeding up Disk Accesses, cont'd
- Be clever about the order in which read and write
  requests are serviced, i.e., the scheduling algorithm
  in the OS, DBMS, or disk controller
  - Ex: the elevator algorithm
- Prefetch blocks to main memory in anticipation of
  future use (buffering)
22Elevator Algorithm
- Works well when there are many "independent" read
and write requests, i.e., don't need to be done
in a particular order, that are randomly
distributed over the disk. - Disk head assembly sweeps in and out repeatedly
- When heads pass a cylinder with pending requests,
they stop to do the request - When reaching a point with no pending requests
ahead, change direction
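A minimal sketch of this policy, assuming pending requests are identified just by their cylinder numbers:

```python
def elevator_order(start_cyl, pending):
    """Service requests in elevator (sweep) order.

    pending: set of cylinder numbers with outstanding requests.
    Sweeps outward from start_cyl; reverses when nothing lies ahead.
    """
    order, direction, current = [], +1, start_cyl
    while pending:
        # requests at or beyond the current cylinder in the sweep direction
        ahead = [c for c in pending if (c - current) * direction >= 0]
        if not ahead:                  # no pending requests ahead: reverse
            direction = -direction
            continue
        nxt = min(ahead, key=lambda c: abs(c - current))
        order.append(nxt)              # heads stop at this cylinder
        pending.remove(nxt)
        current = nxt
    return order

# e.g. elevator_order(50, {10, 55, 90, 20}) -> [55, 90, 20, 10]
```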
23. Prefetching
- Suppose you can predict the order in which blocks
  will be requested from disk.
- Load them into main memory buffers before they are
  needed.
- This gives the flexibility to schedule the reads
  efficiently.
- Can also delay writing buffered blocks if the
  buffers are not needed immediately.
24. Disk Failures
- Intermittent failure: an attempt to read or write a
  sector fails, but a subsequent try succeeds
- Impossible to read a sector
- Impossible to write a sector
- Disk crash: the entire disk becomes unreadable
25. Coping with Intermittent Failures
- Use redundant bits in each sector
- Store checksums in the redundant bits
- After a read, check whether the checksums are correct;
  if not, try again
- After a write, can do a read and compare with the
  value written, or be optimistic and just check the
  checksum of the read
26Checksums
- Suppose we use one extra bit, a parity bit.
- if the number of 1's in the data bits is odd,
then set the parity bit to 1, otherwise to 0 - This is not foolproof 101 and 110 both have
even parity so checksum would be 0 for both - Use n parity bits in the checksum
- parity bit 1 stores parity of every n-th bit,
starting with first bit, - parity bit 2 stores parity of every n-th bit,
starting with second bit, etc. - Probability of missing an error is 1/2n
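A small sketch of this interleaved-parity checksum (the function name and the bit-list representation are just for illustration):

```python
def checksum(bits, n):
    """n interleaved parity bits: parity bit i covers bits i, i+n, i+2n, ...

    Each parity bit is 1 iff the number of 1's in its group is odd.
    """
    return [sum(bits[i::n]) % 2 for i in range(n)]

data = [1, 0, 1, 1, 0, 1]
print(checksum(data, 1))   # [0]    -- one parity bit: four 1's, even
print(checksum(data, 2))   # [0, 0] -- bits 0,2,4: 1+1+0 even; bits 1,3,5: 0+1+1 even
# An error goes undetected only if every one of the n groups has an even
# number of flipped bits, which happens with probability about 1/2**n.
```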
27. Coping with Permanent Read/Write Errors
- Stable storage policy:
  - each "virtual" sector X is represented by two real
    sectors, XL and XR.
- To write value v to X (see the sketch after slide 29):
  - repeat: write v to XL, then read XL back, until the
    read's checksum is correct or a maximum number of
    tries is exceeded
  - then do the same thing with XR
- If XL or XR is discovered to be bad, a substitute
  must be found.
28Handling Write Failures
- Suppose write(s) to XL all fail.
- Then old value is safe in XR.
- Suppose write(s) to XR all fail.
- Then new value is safe in XL.
- Assumption is that it is highly unlikely for two
sectors to fail around the same time.
29. More on Stable Storage
- To read from X:
  - repeatedly read XL until the checksum is good or a
    maximum number of tries is exceeded
  - if the reads of XL failed, then repeatedly read XR
    until the checksum is good or a maximum number of
    tries is exceeded
- This handles permanent read failures, unless both XL
  and XR fail at about the same time (unlikely).
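Putting slides 27 and 29 together, a sketch of the stable-storage policy; `write_sector`, `read_sector`, `checksum_ok`, and `mark_bad` stand in for hypothetical device-level primitives, and MAX_TRIES is an assumed retry limit:

```python
MAX_TRIES = 5   # assumed retry limit

def stable_write(x, value):
    """Write to XL until the read-back verifies (or give up), then same for XR."""
    for copy in ("XL", "XR"):
        for _ in range(MAX_TRIES):
            write_sector(x, copy, value)            # hypothetical primitive
            if checksum_ok(read_sector(x, copy)):   # read back and verify
                break
        else:
            mark_bad(x, copy)   # sector is bad: a substitute must be found

def stable_read(x):
    """Read XL; if every try fails, fall back to XR."""
    for copy in ("XL", "XR"):
        for _ in range(MAX_TRIES):
            data = read_sector(x, copy)
            if checksum_ok(data):
                return data
    raise IOError("both XL and XR failed (unlikely)")
```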
30Coping with Disk Crashes
- "Mean time to failure" of a disk is length of
time by which 50 of such disks will have had a
head crash - Goal is to have a much longer "mean time to data
loss" for your system - Key idea use redundancy
- Discuss three such approaches next
31. Mirroring (RAID Level 1)
- Keep another copy of each disk; write to both, read
  from one.
- The only way data can be lost is if the second disk
  crashes while the first is being repaired.
- If the mean time to crash of a single disk is 10
  years and it takes 3 hours to repair a disk, then the
  mean time to data loss is 146,000 years (see the
  arithmetic below).
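A back-of-the-envelope check of the 146,000-year figure, under the usual simplifying assumptions (failures are independent, and the second failure is equally likely at any moment):

```python
mttf_years = 10
repair_hours = 3

# One of the two mirrored disks fails every mttf/2 = 5 years on average.
first_failure_every = mttf_years / 2

# Probability the surviving disk also fails during the 3-hour repair window:
hours_per_year = 24 * 365
p_second_fails = repair_hours / (mttf_years * hours_per_year)   # ~1/29,200

mean_time_to_data_loss = first_failure_every / p_second_fails
print(f"~{mean_time_to_data_loss:,.0f} years")   # ~146,000 years
```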
32Parity Blocks (RAID Level 4)
- Drawback of previous scheme is that you need
double the number of disks. - Instead use one spare disk no matter how many
data disks you have. - Block i of the spare disk contains the parity
checks for block i of all the data disks. - If spare disk fails, get a new spare.
- If a data disk fails, recompute its data from the
other data disks and the spare.
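A sketch of the XOR arithmetic behind this recovery, treating corresponding blocks as lists of bits (purely illustrative):

```python
from functools import reduce
import operator

def parity_block(blocks):
    """XOR across block i of all data disks gives block i of the spare."""
    return [reduce(operator.xor, bits) for bits in zip(*blocks)]

def recover(surviving_blocks, spare_block):
    """Rebuild a failed disk's block from the survivors plus the parity block."""
    return parity_block(surviving_blocks + [spare_block])

d1, d2, d3 = [1, 0, 1, 1], [0, 1, 1, 0], [1, 1, 0, 0]
spare = parity_block([d1, d2, d3])        # [0, 0, 0, 1]
assert recover([d1, d3], spare) == d2     # d2 reconstructed after a "crash"
```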
33. RAID Level 5
- The drawback of the previous scheme is that the spare
  disk is a bottleneck: every write must also update it.
- Instead, let each data disk also serve as the spare
  disk for some blocks.
- All these schemes assume only one crash at a time.
  RAID Level 6 uses error-correcting codes to be able
  to handle multiple crashes.