1
Why panic()? Improving Reliability through
Restartable File Systems
  • Swaminathan Sundararaman, Sriram Subramanian, Abhishek Rajimwale,
    Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, Michael M. Swift

2
Data Availability
  • Applications require data
  • Use FS to reliably store data
  • Both hardware and software can fail
  • Typical solution:
  • Large clusters for availability
  • Reliability through replication

[Diagram: GFS masters replicating data across slave nodes]
3
User Desktop Environment
  • Replication infeasible for desktop environments
  • Wouldn't RAID work?
  • Can only tolerate H/W failures
  • FS crashes are more severe
  • Services/applications are killed, requiring OS reboot and recovery
  • Need better reliability in the event of file-system failures

[Diagram: desktop stack of App / OS / FS / Disk, alongside a RAID configuration with a RAID controller and multiple disks]
4
Outline
  • Motivation
  • Background
  • Restartable file systems
  • Advantages and limitations
  • Conclusions

5
Failure Handling in File Systems
  • Exception paths not tested thoroughly
  • Exceptions: failed I/O, bad arguments, null pointers
  • On errors, call panic(), BUG(), or BUG_ON()
  • After failure, data becomes inaccessible
  • Reasons for no recovery code:
  • Hard to apply corrective measures
  • Not straightforward to add recovery

6
Real-world Example: Linux 2.6.15
ReiserFS:

    int journal_mark_dirty(...)
    {
        struct reiserfs_journal_cnode *cn = NULL;
        ...
        if (!cn) {
            cn = get_cnode(p_s_sb);
            if (!cn) {
                reiserfs_panic(p_s_sb, "get_cnode failed!\n");
            }
        }
        ...
    }

File systems already detect failures

    void reiserfs_panic(struct super_block *sb, ...)
    {
        BUG();
        /* this is not actually called, but makes
           reiserfs_panic() "noreturn" */
        panic("REISERFS: panic %s\n", error_buf);
    }
Recovery simplified by generic recovery mechanism
7
Possible Solutions
  • Code to recover from all failures
  • Not feasible in reality
  • Restart on failure
  • Previous work has taken this approach
  • FS needs stateful, lightweight recovery

                Heavyweight                            Lightweight
  Stateful      CuriOS, EROS                           (this work's goal)
  Stateless     Nooks/Shadow, Xen, Minix, L4, Nexus    SafeDrive, Singularity
8
Restartable File Systems
  • Goal: build a lightweight, stateful solution to tolerate
    file-system failures
  • Solution: a single, generic recovery mechanism for any
    file-system failure
  • Detect failures through assertions
  • Clean up resources used by the file system
  • Restore file-system state from before the crash
  • Continue to service new file-system requests

FS failures are completely transparent to applications
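The four steps above can be pictured as one recovery driver. A minimal sketch, assuming entirely hypothetical types and helper names (fs_state, park_new_requests, and so on), not the actual implementation:

    /* All names below are illustrative assumptions. */
    struct fs_state;                                          /* opaque per-mount state   */
    extern void park_new_requests(struct fs_state *fs);       /* fail-stop: block callers */
    extern void release_fs_resources(struct fs_state *fs);    /* locks, memory, I/O       */
    extern void restore_last_checkpoint(struct fs_state *fs); /* revert on-disk state     */
    extern void replay_op_log(struct fs_state *fs);           /* re-run logged operations */
    extern void resume_requests(struct fs_state *fs);         /* let applications proceed */

    /* Invoked when an assertion (panic/BUG/BUG_ON) fires inside the FS. */
    void recover_file_system(struct fs_state *fs)
    {
        park_new_requests(fs);        /* 1. no new requests enter the dead FS */
        release_fs_resources(fs);     /* 2. clean up what the FS held         */
        restore_last_checkpoint(fs);  /* 3. roll back to the pre-crash state  */
        replay_op_log(fs);            /* 4. bring state forward again         */
        resume_requests(fs);          /* 5. continue servicing requests       */
    }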
9
Challenges
  • Transparency:
  • Multiple applications using the FS upon crash
  • Intertwined execution
  • Fault-tolerance:
  • Handle a gamut of failures
  • Transform them into fail-stop failures
  • Consistency:
  • OS and FS could be left in an inconsistent state

10
Guaranteeing FS Consistency
  • FS consistency required to prevent data loss
  • Not all FSes support crash consistency
  • FS state constantly modified by applications
  • Periodically checkpoint FS state
  • Mark dirty blocks as copy-on-write
  • Ensure each checkpoint is atomically written
  • On crash, revert to the last checkpoint
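A minimal sketch of the checkpoint step described above, with invented names (struct page, cur_epoch) standing in for the real kernel structures:

    #include <stdbool.h>

    /* Hypothetical page descriptor; the real mechanism operates on
       kernel page-cache pages. */
    struct page { int epoch; bool cow; };

    /* Freeze the current epoch: mark its dirty pages copy-on-write so
       they can be flushed atomically while new writes land on fresh
       copies in the next epoch. */
    void checkpoint(struct page *dirty[], int n, int *cur_epoch)
    {
        for (int i = 0; i < n; i++) {
            dirty[i]->cow = true;          /* later writes copy the page  */
            dirty[i]->epoch = *cur_epoch;  /* page belongs to this epoch  */
        }
        (*cur_epoch)++;                    /* new writes open a new epoch */
        /* ...flush this epoch's pages in the background; once all reach
           disk, the checkpoint becomes the recovery point... */
    }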

11
Overview of Our Approach
[Timeline diagram: application calls (open, write, read, write, write, close) pass through the VFS to the file system; a checkpoint divides the timeline into Epoch 0 (completed) and Epoch 1 (in progress); a crash occurs during Epoch 1]
12
Checkpoint Mechanism
  • File systems are constantly modified
  • Hard to identify a consistent recovery point
  • Naïve solution: prevent any new FS operations and call sync
  • Inefficient, with unacceptable overhead

13
Key Insight
[Diagram: applications above the VFS layer; file systems (ext3, VFAT) below it; file systems write to disk through the page cache]
  • All requests go through the VFS layer
  • Control requests to the FS and dirty pages to disk
  • File systems write to disk through the page cache
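This insight suggests interposing a thin wrapper at the VFS boundary. A hedged sketch, with every name (membrane_vfs_write, wait_if_recovering, log_write) invented for illustration:

    /* Assumed helpers: a barrier that parks callers during recovery,
       an operation logger, and the underlying VFS entry point. */
    extern void wait_if_recovering(void);
    extern void log_write(int fd, long long count);
    extern long real_vfs_write(int fd, const void *buf, long long count);

    /* Every application request funnels through a wrapper like this,
       giving one control point both for logging operations and for
       pausing requests while the file system below is restarted. */
    long membrane_vfs_write(int fd, const void *buf, long long count)
    {
        wait_if_recovering();   /* block while the FS restarts  */
        log_write(fd, count);   /* record metadata for replay   */
        return real_vfs_write(fd, buf, count);
    }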
14
Generic COW-based Checkpoint
[Three-panel diagram (Regular operation, At Checkpoint, After Checkpoint): App / VFS / File System / Page Cache / Disk stacks; the legend distinguishes regular kernel components from Membrane additions]
15
Interaction with Modern FSes
  • Have built-in crash-consistency mechanisms
  • Journaling or snapshotting
  • Seamlessly integrate with these mechanisms
  • Need FSes to indicate the beginning and end of a transaction
    (see the sketch below)
  • Works for data and ordered journaling modes
  • Need to combine writeback mode with COW
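One way to picture that integration; the hook names (membrane_txn_begin/membrane_txn_end) are assumptions for illustration, not an API from the paper:

    /* Hypothetical hooks a journaling FS calls around each transaction,
       so a checkpoint never cuts an epoch mid-transaction. */
    extern void membrane_txn_begin(void *journal);
    extern void membrane_txn_end(void *journal);

    void fs_commit_transaction(void *journal)
    {
        membrane_txn_begin(journal);   /* checkpoint must not start here */
        /* ...write journal blocks, then the commit block... */
        membrane_txn_end(journal);     /* epoch may close at this point  */
    }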

16
Light-weight Logging
  • Log operations at the VFS level
  • No need to modify existing file systems
  • Operations: open, close, read, write, symlink, unlink, seek, etc.
  • Logs are discarded after each checkpoint
  • What about logging writes?
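A hedged sketch of what a VFS-level log record might hold; the struct and its fields are invented for illustration:

    /* Hypothetical per-operation log record, appended at the VFS layer
       and discarded wholesale at each checkpoint. */
    enum op_type { OP_OPEN, OP_CLOSE, OP_READ, OP_WRITE, OP_UNLINK /* ... */ };

    struct op_record {
        enum op_type op;      /* which VFS operation           */
        int          fd;      /* file it applied to            */
        long long    offset;  /* file offset, where applicable */
        long long    count;   /* byte count, where applicable  */
        /* Note: the *data* of a write is not stored here; page
           stealing (next slide) recovers it from the page cache. */
    };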

17
Page Stealing Mechanism
  • Mainly used for replaying writes
  • Goal: reduce the overhead of logging writes
  • Solution: grab data from the page cache during recovery
    (see the sketch below the diagram)

[Three-panel diagram (Before Crash, During Recovery, After Recovery): a write(fd, buf, offset, count) leaves its data in the page cache; during recovery, replayed writes steal that data from the surviving page cache instead of from the log]
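A minimal sketch of replaying one logged write via page stealing; lookup_page_cache and fs_write are assumed helpers, not real kernel calls:

    /* The log stores only (fd, offset, count); the written data itself
       survived the FS crash in the page cache and is "stolen" from
       there during replay. */
    extern void *lookup_page_cache(int fd, long long offset);
    extern void  fs_write(int fd, const void *buf,
                          long long offset, long long count);

    void replay_write(int fd, long long offset, long long count)
    {
        void *data = lookup_page_cache(fd, offset);  /* steal the data     */
        fs_write(fd, data, offset, count);           /* re-issue the write */
    }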
18
Handling Non-Determinism
19
Skip/Trust Unwind Protocol
20
Evaluation
  • Setup

21
OpenSSH Benchmark
22
Postmark Benchmark
23
Recovery Time
  • Restart ext2 during a random-read microbenchmark

24
Recovery Time (Cont.)
    Data (MB)    Recovery Time (ms)
    10           12.9
    20           13.2
    40           16.1

    Open Sessions    Recovery Time (ms)
    200              11.4
    400              14.6
    800              22.0

    Log Records    Recovery Time (ms)
    200            11.4
    400            14.6
    800            22.0
25
Advantages
  • Improves tolerance to file-system failures
  • Builds trust in new file systems (e.g., ext4, btrfs)
  • Quick-fix bug patching
  • Developers can transform corruption bugs into restarts
  • Restart instead of extensive code restructuring
  • Encourages more integrity checks in FS code
  • Assertions can be seamlessly transformed into restarts
    (see the sketch below)
  • File systems become more robust to failures/crashes
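For instance, an existing assertion site could be redirected to the generic recovery path. A sketch with invented names (FS_ASSERT, membrane_restart):

    /* Hypothetical macro: an FS assertion that used to take down the
       whole OS now triggers a transparent file-system restart. */
    extern void membrane_restart(void);   /* assumed entry to recovery */

    #define FS_ASSERT(cond)          \
        do {                         \
            if (!(cond))             \
                membrane_restart();  \
        } while (0)

With such a macro, a check like FS_ASSERT(cn != NULL) in the ReiserFS example earlier would restart the file system instead of calling reiserfs_panic().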

26
Limitations
  • Only tolerates fail-stop failures
  • Not address-space based
  • Faults could corrupt other kernel components
  • FS restart may be visible to applications
  • e.g., inode numbers could change after a restart

[Diagram (Inode Mismatch): before the crash, create(file1) in Epoch 0 assigns inode 15, which stat(file1) observes; after crash recovery, the replayed create(file1) assigns inode 12, so the inode number the application saw no longer matches]
27
Conclusions
  • Failures are inevitable in file systems
  • Learn to cope with them, not hope to avoid them
  • Generic recovery mechanism for FS failures
  • Improves FS reliability and availability of data
  • Users: install new FSes with confidence
  • Developers: ship FSes faster, as not all exception cases are
    show-stoppers

28
Thank You!
  • Questions and Comments

Advanced Systems Lab (ADSL)
University of Wisconsin-Madison
http://www.cs.wisc.edu/adsl