Storage Systems: Management of Persistent Data

Provided by: zach63

1
Storage Systems: Management of Persistent Data
  • Randal Burns

2
What distinguishes storage problems?
  • What makes something a storage system?
  • Any software that manages persistent data.
  • Why is storage an interesting area?
  • Storage holds the crown jewels of every
    organization: its information
  • Storage is the performance bottleneck in almost
    all systems
  • Cause of the longest lasting and most expensive
    failures

3
The Storage Systems Chainsaw
  • Horizontal cut across
  • Database
  • Operating systems
  • Web serving
  • Multimedia serving
  • Mobile and wireless devices
  • Discipline is broad
  • Studies same class of problems in entirely
    different systems

4
The Storage Systems Auger
  • Vertical cut through
  • Applications
  • Middleware
  • Operating systems
  • Logical volume
  • Device
  • Discipline is deep
  • Persistence needs to be considered at all levels

5
What are we going to do?
  • Look at the evolution of storage system
    technology
  • Disk
  • Tape
  • Future I/O devices
  • Sample a couple of research areas
  • Volume managers
  • File system recovery

6
The History of Innovation: Devices
  • Punch cards (1884)
  • Used to store information for calculators
  • Card readers allowed data to be reloaded
  • In computer systems (1950s)
  • Load programs and program data
  • Computers had no memory or storage

7
The History of Innovation: Devices
  • Tape (1952)
  • Used to store information for batch applications
  • Use same data set during multiple runs
  • Routine library shared by all applications
  • Reduce load time
  • Made programs smaller because not all information
    needs to be on punch cards
  • What happened to tape?
  • Became used as part of a hierarchical on-line
    store
  • Used for backup/archive, off-line data only
  • Has been declared dead several times

8
The History of Innovation: Devices
  • IBM 7494 at Houston

Modern Tape Robots
9
The History of Innovation: Devices
  • Disk (1956)
  • Introduced as a random-access device
  • Disks were small and held runtime data; tapes
    were used for persistent storage
  • Disks were a memory
  • The evolution of disk drives
  • No longer a random-access device
  • Evolved into a store, with RAM as a memory
  • Starting to be used as archival storage
  • Has recently been said to be dying

10
The History of Innovation: Devices
Microdrive
RAMAC
11
The History of Innovation: Devices
  • Non-volatile memories (1995)
  • Used as small, stable memories for crash recovery
    and synchronous I/O, in front of disks
  • Same relationship as between disk and tape in 1958
  • Evolution of NV memories
  • Stable storage for portable devices
  • Cell-phones, PDAs, and handhelds

12
Conclusions: Devices
  • Some historical observations
  • Device technologies move down the memory/storage
    hierarchy as capacity increases
  • Devices are replaced by faster technology
  • Not all emerging devices fit the old model

13
Emerging Technology: Devices
  • MEMS storage
  • Micro-electromechanical systems
  • Not rotational: a sled translates in the X and Y
    directions
  • Performs much like a disk, but with different
    cost/density profile

14
Emerging Technology: Devices
  • MRAM: magnetic RAM
  • Stable storage without power
  • Access speed, cost, and density of DRAM (memory
    used in computers today)
  • Will not replace disk right away, because costs
    per unit will be high
  • First stable memory since disk drives

15
Volume Managers
  • External large-scale block storage
  • Highly reliable
  • Central repository of information is easy to
    manage
  • Backup and restore
  • User quotas
  • Configurable
  • Extend and move volumes
  • RAID: redundant array of inexpensive disks
  • Add an extra disk to a system of disks so that
    data are preserved when a disk fails

16
Volume Managers
  • EMC Symmetrix
  • IBM Enterprise Storage Server

17
RAID: Adding a Parity Disk
  • RAID is the most important research idea
    underlying volume managers
  • A parity disk allows a storage system to survive
    any single failure
  • Data can be read/written after a failure

18
RAID: How does it go?
  • Observe that if $A \oplus B = P$ then
    $A \oplus P = B$ and $B \oplus P = A$
  • Example
  • P stands for parity

19
RAID: Handling a Single Failure
  • Block A can be rebuilt from B and P
  • Block B can be rebuilt from A and P
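
The rebuild rule above can be sketched in a few lines, assuming the parity block is the bytewise XOR of the data blocks. The block contents and the helper name xor_blocks are made up for illustration and do not come from the slides.

```python
def xor_blocks(x: bytes, y: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    if len(x) != len(y):
        raise ValueError("blocks must have the same length")
    return bytes(a ^ b for a, b in zip(x, y))


# Two data blocks (illustrative contents) and their parity block.
block_a = b"\x0f\xf0\xaa"
block_b = b"\x33\x55\xaa"
parity = xor_blocks(block_a, block_b)    # P = A xor B

# If the disk holding A fails, A is rebuilt from B and P.
rebuilt_a = xor_blocks(block_b, parity)
# If the disk holding B fails, B is rebuilt from A and P.
rebuilt_b = xor_blocks(block_a, parity)
```

The same function serves for computing parity and for rebuilding, because XOR is its own inverse.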

20
RAID: Adding a Parity Disk
  • Data becomes highly available
  • Disk MTTF: 5 years
  • 5 disks: MTTF 1 year
  • RAID 5 P
  • Assume MTTR of 1 day, add a new disk and rebuild
    the contents
  • MTTF: 11,000 years
  • Depends on the chance of another disk failing
    during the day it takes to recover from the
    first failure
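
The reasoning on this slide can be turned into a rough calculation. The sketch below assumes independent disk failures at a constant rate and counts data as lost only when a second disk fails during the repair window. The function names are hypothetical. The slides do not state the model behind their figure, so the number this sketch produces need not match it.

```python
def no_redundancy_mttf_years(disk_mttf_years: float, n_disks: int) -> float:
    """Without redundancy, the array fails as soon as any one disk fails."""
    return disk_mttf_years / n_disks


def array_mttf_years(disk_mttf_years: float, n_disks: int,
                     mttr_days: float) -> float:
    """Approximate mean time to data loss with single-failure protection.

    Assumption: failures are independent, and data is lost only if one
    of the remaining disks fails while the first failure is repaired.
    """
    mttr_years = mttr_days / 365.0
    # Expected time until the first of n_disks fails.
    first_failure = disk_mttf_years / n_disks
    # Chance that one of the other disks fails during the repair window.
    p_second = (n_disks - 1) * mttr_years / disk_mttf_years
    return first_failure / p_second


unprotected = no_redundancy_mttf_years(5, 5)      # 1.0 year
protected = array_mttf_years(5, 5, mttr_days=1)   # hundreds of years
```

The result is very sensitive to the repair time: halving the MTTR doubles the estimate.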

21
RAID: Benefits
  • Parallelism: can write to n disks at the same
    time
  • Good when writing a whole stripe of data

22
RAID: Drawbacks
  • Small write problem
  • To update data smaller than a whole stripe, must
    read data from the disks before writing

23
RAID: Drawbacks
  • Small write problem
  • Parity must change when writing just one block
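
A minimal sketch of why a small write needs reads first, assuming XOR parity across the stripe. The new parity can be computed from the old data and the old parity alone, but both must be read before anything is written. The helper names and block contents are illustrative only.

```python
from functools import reduce


def xor_blocks(x: bytes, y: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    return bytes(a ^ b for a, b in zip(x, y))


def full_parity(blocks: list) -> bytes:
    """Parity for a whole stripe: XOR of every data block."""
    return reduce(xor_blocks, blocks)


def small_write_parity(old_data: bytes, new_data: bytes,
                       old_parity: bytes) -> bytes:
    """New parity after updating one block.

    Requires reading old_data and old_parity first, so one logical
    write costs two reads and two writes.
    """
    return xor_blocks(xor_blocks(old_parity, old_data), new_data)


stripe = [b"\x01\x02", b"\x10\x20", b"\x0a\x0b"]
parity = full_parity(stripe)

# Update only the middle block of the stripe.
new_block = b"\xff\x00"
new_parity = small_write_parity(stripe[1], new_block, parity)
stripe[1] = new_block
# new_parity now equals the parity recomputed over the whole stripe.
```

A full-stripe write avoids the reads entirely, since the parity is computed from data already in memory.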

24
RAID: Drawbacks
  • Small write problem

25
RAID: Tradeoffs
  • Small write problem plagues RAID systems
  • Advanced techniques attempt to mitigate it
  • RAID performs well in high-performance
    environments with large data
  • Scientific computing
  • Supercomputing
  • Availability benefits overshadow any performance
    drawbacks

26
File System Recovery
  • When your (or any) computer crashes, the
    operating system must check the file system
  • Problem is called restart recovery
  • Some file system data in memory was lost when
    system crashed
  • File system integrity is in question
  • Lost blocks of storage
  • Pointers to nowhere
  • Bad data in files
  • Unreachable files, unnavigable namespace

27
File System Checking
  • Traditional approach to file system recovery
  • Check all of the file system data structures,
    make sure that
  • Every file is in a directory
  • Every directory can be reached from the root
  • All disk blocks are either in files or empty
  • For small disks and file systems, no problem
  • Windows and UNIX
  • For large-scale storage systems, file system
    checking can take hours
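
The checks listed above can be sketched against a toy in-memory file system. This is an illustration of the idea, not how any real checker is implemented. The data layout (dictionaries of directories and files, a set of empty blocks) and the function name are assumptions.

```python
def check_file_system(dirs, files, free_blocks, total_blocks, root="/"):
    """Return a list of inconsistencies found in a toy file system.

    dirs:        directory path -> list of child paths
    files:       file path -> list of block numbers the file owns
    free_blocks: set of block numbers marked as empty
    """
    problems = []

    # Walk the namespace from the root, collecting everything reachable.
    reachable = set()
    stack = [root]
    while stack:
        path = stack.pop()
        if path in reachable:
            continue
        reachable.add(path)
        for child in dirs.get(path, []):
            if child in dirs:
                stack.append(child)
            elif child in files:
                reachable.add(child)
            else:
                problems.append(f"pointer to nowhere: {child}")

    # Every file and directory must be reachable from the root.
    for path in sorted(set(dirs) | set(files)):
        if path not in reachable:
            problems.append(f"unreachable: {path}")

    # Every block must be either in a file or empty, never both or neither.
    used = set()
    for blocks in files.values():
        used.update(blocks)
    for block in range(total_blocks):
        if block in used and block in free_blocks:
            problems.append(f"block {block} is both in a file and empty")
        elif block not in used and block not in free_blocks:
            problems.append(f"lost block: {block}")
    return problems
```

The cost is visible in the structure: every directory, file, and block is visited, so the running time grows with the size of the file system rather than with the amount of work in flight at the crash.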

28
A Simple Operation
rename /home/randal/files /home/bob
  • Before
  • After

29
Updating the Namespace
  • Changes do not occur all at the same time
  • System crash can occur at any point

30
Need for File System Checking
  • Several inconsistent states
  • All are recoverable by searching file system

31
Transactional Storage
  • For each operation that modifies the file system
  • Record actions in a log describing how the file
    system is going to be changed
  • Change the file system
  • The log contains all information needed to fix
    the file system
  • If a failure occurs during the log-write, the
    action never happened
  • If a failure occurs while changing the FS, the
    action can be redone based on the log
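
The log-then-change discipline can be sketched with a toy in-memory namespace. The class ToyFS, its crash_at parameter, and the list standing in for an on-disk log are hypothetical devices for simulating a crash at each point. A real file system would force the log record to disk before changing anything.

```python
class ToyFS:
    def __init__(self):
        self.names = {}   # path -> file id (the namespace)
        self.log = []     # stands in for a sequential on-disk log

    def rename(self, src, dst, crash_at=None):
        """Rename src to dst, optionally stopping early to simulate a crash."""
        if crash_at == "before_log":
            return                         # nothing logged: never happened
        self.log.append(("rename", src, dst))
        if crash_at == "after_log":
            return                         # logged, file system untouched
        self.names[dst] = self.names[src]  # step 1: add the new name
        if crash_at == "mid_change":
            return                         # both names exist: inconsistent
        del self.names[src]                # step 2: remove the old name
        self.log.pop()                     # change complete: retire record

    def recover(self):
        """Redo every logged action; redo is safe to apply more than once."""
        for op, src, dst in self.log:
            if op == "rename" and src in self.names:
                self.names[dst] = self.names.pop(src)
        self.log.clear()


fs = ToyFS()
fs.names["/home/randal/files"] = 7
fs.rename("/home/randal/files", "/home/bob", crash_at="mid_change")
fs.recover()   # leaves only /home/bob pointing at file 7
```

Recovery reads only the log, so its cost depends on the number of outstanding actions, not on the size of the file system.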

32
Updating the Namespace
  • After a system crash, read the log
  • Specifies actions that may be incomplete

LOG: action | action | mv /home/randal/files /home/bob | action
33
Performance of Transactional Storage
  • Logging can reduce latency and improve
    performance for writes to disk
  • Writing changes to log alone is sufficient
  • File system can be changed in the background
  • Log writes can be done sequentially
  • Does not reduce overall I/O, but can make single
    operations more responsive
  • File system recovery takes seconds rather than
    hours

34
Summary
  • Motivated storage systems as a broad and deep
    research area
  • Overview of device technology that drives the
    field
  • Disk, tape, NV RAM, and futuristic devices
  • Two exemplary research problems
  • RAID volume managers
  • File system recovery