Disks and Files - PowerPoint PPT Presentation

Slides: 38
Provided by: Kai45
1
Disks and Files
  • Vivek Pai
  • Princeton University

2
Gedanken
  • Imagine the following
  • A disk scheduling policy says 'handle the request
    that is closest to where the disk head currently
    is'
  • On a system with lots of disk-intensive jobs,
    what problem can arise?
  • What tweaks can avoid this problem?
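
One way to explore the question above is to simulate the policy. The sketch below is a minimal model, assuming requests are plain track numbers and that new requests keep arriving near the head. The function name sstf_order and the workload are hypothetical and are not part of the original slides.

```python
def sstf_order(head, pending, arrivals):
    """Serve requests closest-to-head first.

    head: starting track
    pending: tracks already waiting
    arrivals: dict mapping step number -> tracks that arrive before that step
    Returns the order in which tracks are served.
    """
    pending = list(pending)
    served = []
    step = 0
    while pending:
        pending.extend(arrivals.get(step, []))
        nearest = min(pending, key=lambda t: abs(t - head))
        pending.remove(nearest)
        served.append(nearest)
        head = nearest
        step += 1
    return served

# A far-away request (track 900) competes with a stream of nearby ones.
arrivals = {i: [50 + i] for i in range(1, 6)}
order = sstf_order(head=50, pending=[900, 52], arrivals=arrivals)
print(order)
```

In this run the request for track 900 is served only after every nearby request has been handled, and it would wait longer if nearby requests kept arriving.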

3
Why Files
  • Physical reality
  • Block oriented
  • Physical sector #s
  • No protection among users of the system
  • Data might be corrupted if machine crashes
  • Filesystem model
  • Byte oriented
  • Named files
  • Users protected from each other
  • Robust to machine failures

4
File Structures
  • Byte sequence
  • Read or write a number of bytes
  • Unstructured or linear
  • Record sequence
  • Fixed or variable length
  • Read or write a number of records
  • Tree
  • Records with keys
  • Read, insert, delete a record (typically using
    B-tree)

5
File Structures Today
  • Stream of bytes
  • Simplest to implement in kernel
  • Easy to manipulate in other forms
  • Little performance loss
  • More complicated structures
  • Hardware assist fell out of favor
  • Special-purpose hardware slower, costly

6
File Types
  • ASCII plain text
  • A Unix executable file
  • Header: magic number, sizes, entry point, flags
  • Text (code)
  • Data
  • relocation bits
  • symbol table
  • Devices
  • Everything else in the system

7
So What Makes Filesystems Hard?
  • Files grow and shrink in pieces
  • Little a priori knowledge
  • 6 orders of magnitude in file sizes
  • Overcoming disk performance behavior
  • Desire for efficiency
  • Coping with failure

8
File System Components
  • Disk management
  • Arrange collection of disk blocks into files
  • Naming
  • User gives file name, not track or sector number,
    to locate data
  • Security
  • Keep information secure
  • Reliability/durability
  • When system crashes, lose stuff in memory, but
    want files to be durable

[Figure: layers from top to bottom: user, file naming, file access, disk management, disk drivers]

9
Some Definitions
  • File descriptor (fd): an integer used to
    represent a file; easier than using names
  • Metadata: data about data; bookkeeping data
    used to eventually access the real data
  • Open file table: system-wide list of descriptors
    in use
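
As a small illustration of the descriptor idea, the sketch below uses Python's os module, whose open, read, fstat, and close calls are thin wrappers over the corresponding system calls. The scratch file and its contents are made up for the example.

```python
import os
import tempfile

# Create a scratch file so the example is self-contained.
tmp = tempfile.NamedTemporaryFile(delete=False)
tmp.write(b"hello, filesystem")
tmp.close()

fd = os.open(tmp.name, os.O_RDONLY)   # the name is looked up once, here
print(type(fd), fd)                   # the descriptor is a small integer
data = os.read(fd, 5)                 # later calls use the integer, not the name
meta = os.fstat(fd)                   # metadata reached through the descriptor
os.close(fd)
os.unlink(tmp.name)

print(data, meta.st_size)
```

The actual descriptor value depends on what the process already has open, so it is printed rather than assumed.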

10
Kinds of Metadata
  • inode: index node, or a specific set of
    information kept about each file
  • Two forms: on disk and in memory
  • Directory: names and location information for
    files and subdirectories
  • Note: stored in files in Unix
  • Superblock: contains information to describe the
    file system, disk layout
  • Information about free blocks/inodes on disk

11
Contents of an Inode
  • Disk inode
  • File type, size, blocks on disk
  • Owner, group, permissions (r/w/x)
  • Reference count
  • Times: creation, last access, last modification
  • Inode generation number
  • Padding and other stuff
  • 128 bytes on classic Unix

12
Directories in Unix
  • Stored like regular files
  • Contents are file names and inode #s
  • Names are nul-terminated strings
  • Logic
  • Separates file from location in tree
  • File can appear in multiple places
  • What are the drawbacks?
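
The point that one file can appear in multiple places can be seen with hard links. The following is a minimal sketch, assuming a Unix-like system and a filesystem that supports hard links; the file names are arbitrary.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    first = os.path.join(d, "a.txt")
    second = os.path.join(d, "b.txt")
    with open(first, "w") as f:
        f.write("same file")
    os.link(first, second)            # second directory entry, same inode

    st1, st2 = os.stat(first), os.stat(second)
    same_inode = st1.st_ino == st2.st_ino
    link_count = st1.st_nlink         # the inode's reference count

    os.unlink(first)                  # remove one name
    with open(second) as f:
        survivor = f.read()           # data is still reachable via the other

print(same_inode, link_count, survivor)
```

Both names map to the same inode number, and removing one name only drops the reference count; the data goes away when the last name does.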

13
Effects of Corruption
  • inode: file gets damaged
  • Maybe some free block gets viewed
  • Directory: lose files/directories
  • Might get to read deleted files
  • Superblock: can't figure out anything
  • This is why we replicate the superblock

14
Data Structures for A Typical File System
[Figure: process control block with open file pointer array, open file table (systemwide), memory inode, disk inode]

15
Opening A File
fd = open( FileName, access )
  • File name lookup and authentication
  • Copy the file metadata into the in-memory data
    structure, if it is not in yet
  • Create an entry in the open file table (system
    wide) if there isn't one
  • Create an entry in PCB
  • Link up the data structures
  • Return a descriptor (fd) to the user

[Figure: file name lookup and authentication against the file system on disk, then metadata, open file table, and PCB are allocated and linked up]

16
Reading And Writing
  • What happens when you
  • read 10 bytes from a file?
  • write 10 bytes into an existing file?
  • write 4096 bytes into a file?
  • Disk works on blocks (sectors)
  • Can have temporary (ephemeral) buffers
  • Longer lasting buffers: disk cache
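
Because the disk works on whole blocks, even a 10-byte request touches at least one full block. The sketch below works out which blocks a byte range covers, assuming a 4096-byte block size (suggested by the 4096-byte example above, but still an assumption). The helper name blocks_touched is hypothetical.

```python
BLOCK_SIZE = 4096  # assumed block size

def blocks_touched(offset, length, block_size=BLOCK_SIZE):
    """Return (first_block, last_block, partial) for a byte range."""
    first = offset // block_size
    last = (offset + length - 1) // block_size
    # The range is "partial" if it does not cover whole blocks. A partial
    # write needs the old block contents before the new bytes are merged in.
    partial = offset % block_size != 0 or (offset + length) % block_size != 0
    return first, last, partial

print(blocks_touched(0, 10))        # (0, 0, True)
print(blocks_touched(4090, 10))     # (0, 1, True)
print(blocks_touched(4096, 4096))   # (1, 1, False)
```

Only the aligned 4096-byte write can replace a block outright; the other two cases go through a buffer that holds the rest of the block.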

17
Reading A Block
read( fd, userBuf, size )
[Figure: read path: PCB, open file table, metadata (logical → physical block mapping), buffer cache, read( device, phyBlock, size ), disk device driver; get the physical block into sysBuf, then copy to userBuf]

18
A Disk Layout for A File System
[Figure: disk layout: boot block, superblock, file metadata (i-node in Unix), file data blocks]
  • Superblock defines a file system
  • size of the file system
  • size of the file descriptor area
  • free list pointer, or pointer to bitmap
  • location of the file descriptor of the root
    directory
  • other meta-data such as permission and various
    times
  • For reliability, replicate the superblock

19
File Usage Patterns
  • How do users access files?
  • Sequential: bytes read in order
  • Random: read/write element out of middle of
    arrays
  • Whole file or partial file
  • How are files used?
  • Most files are small
  • Large files use up most of the disk space
  • Large files account for most of the bytes
    transferred
  • Bad news: need everything to be efficient

20
Data Structures for Disk Management
  • A header for each file (part of the file
    meta-data)
  • Disk sectors associated with each file
  • A data structure to represent free space on disk
  • Bit map
  • 1 bit per block (sector)
  • blocks numbered in cylinder-major order, why?
  • Linked list
  • Others?
  • How much space does a bit map need for a 4 GB disk?
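
The bitmap question comes down to one bit per block. The sketch below does the arithmetic, assuming "4G" means 4 × 2^30 bytes and trying two block sizes (512-byte sectors and 4096-byte blocks); both sizes are assumptions, since the slide does not fix one.

```python
def bitmap_bytes(disk_bytes, block_bytes):
    """Bytes needed for a free-space bitmap with one bit per block."""
    blocks = disk_bytes // block_bytes
    return (blocks + 7) // 8          # round up to whole bytes

GB = 2**30
print(bitmap_bytes(4 * GB, 512))      # 1048576 bytes (1 MB)
print(bitmap_bytes(4 * GB, 4096))     # 131072 bytes (128 KB)
```

Either way the bitmap is a tiny fraction of the disk: 1 bit per 512-byte sector is about 0.02% overhead.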

21
Linked Files (Alto)
  • File header points to 1st block on disk
  • Each block points to next
  • Pros
  • Can grow files dynamically
  • Free list is similar to a file
  • Cons
  • Random access is horrible
  • Unreliable: losing a block means losing the rest

[Figure: file header points to the first block; each block points to the next; the last block points to null]

22
Contiguous Allocation
  • Request in advance for the size of the file
  • Search bit map or linked list to locate a space
  • File header
  • first sector in file
  • number of sectors
  • Pros
  • Fast sequential access
  • Easy random access
  • Cons
  • External fragmentation
  • Hard to grow files

23
Single-Level Indexed Files or Extent-based Filesystems
  • A user declares max size
  • A file header holds an array of pointers to point
    to disk blocks
  • Pros
  • Can grow up to a limit
  • Random access is fast
  • Cons
  • Clumsy to grow beyond limit
  • Periodic cleanup of new files
  • Up-front declaration a real pain

[Figure: file header holding an array of pointers to disk blocks]

24
File Allocation Table (FAT)
  • Approach
  • A section of disk for each partition is reserved
  • One entry for each block
  • A file is a linked list of blocks
  • A directory entry points to the 1st block of the
    file
  • Pros
  • Simple
  • Cons
  • Always go to FAT
  • Wasting space

[Figure: directory entry "foo" points to block 217; FAT entry 217 holds 619, entry 619 holds 399, entry 399 holds EOF]

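
The chain in the FAT figure can be followed mechanically. The sketch below is a toy model, assuming the table is a dictionary from block number to next block and using -1 as a hypothetical end-of-chain marker; the values 217, 619, and 399 come from the figure.

```python
EOF = -1  # hypothetical end-of-chain marker

# FAT entries taken from the figure: one entry per block, holding the next block.
fat = {217: 619, 619: 399, 399: EOF}
directory = {"foo": 217}   # directory entry points to the first block

def file_blocks(name):
    """Follow the FAT chain and return the file's blocks in order."""
    blocks = []
    block = directory[name]
    while block != EOF:
        blocks.append(block)
        block = fat[block]
    return blocks

print(file_blocks("foo"))  # [217, 619, 399]
```

Every step of the loop is a FAT lookup, which is the "always go to FAT" cost listed under Cons.
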
25
Multi-Level Indexed Files (Unix)
  • 13 pointers in a header
  • 10 direct pointers
  • 11: 1-level indirect
  • 12: 2-level indirect
  • 13: 3-level indirect
  • Pros and cons
  • In favor of small files
  • Can grow
  • Limit is 16 GB, and lots of seeks
  • What happens to reach block 23, 5, 340?

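[Figure: header with pointers 1, 2, ..., 11, 12, 13; direct pointers lead to data blocks; pointers 11, 12, and 13 lead through one, two, and three levels of indirect blocks to data blocks]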
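
The question about reaching blocks 23, 5, and 340 can be worked through in code. The sketch below assumes 1024-byte blocks and 4-byte pointers, which gives 256 pointers per indirect block and a maximum file size of roughly 16 GB, consistent with the stated limit. It also assumes block numbers start at 0. None of these parameters are stated on the slide, and the function name locate is hypothetical.

```python
BLOCK_SIZE = 1024                 # assumed
PTR_SIZE = 4                      # assumed
PTRS = BLOCK_SIZE // PTR_SIZE     # 256 pointers per indirect block
NDIRECT = 10                      # direct pointers in the header

def locate(block_no):
    """Return (kind, slots): which pointer is used and the slot at each level."""
    if block_no < NDIRECT:
        return "direct", [block_no]
    n = block_no - NDIRECT
    if n < PTRS:
        return "1-level indirect", [n]
    n -= PTRS
    if n < PTRS ** 2:
        return "2-level indirect", [n // PTRS, n % PTRS]
    n -= PTRS ** 2
    if n < PTRS ** 3:
        return "3-level indirect", [n // PTRS ** 2, (n // PTRS) % PTRS, n % PTRS]
    raise ValueError("block number beyond maximum file size")

for b in (23, 5, 340):
    print(b, locate(b))
```

Under these assumptions block 5 is reached directly, block 23 needs one extra read for the indirect block, and block 340 needs two extra reads, which is where the extra seeks come from.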

26
Reliability In Disk Systems
  • Make sure certain actions have occurred before
    function completes
  • Known as synchronous operation
  • Ex: make sure the new inode is on disk and that the
    directory has been modified before declaring a
    file creation complete
  • Drawback: speed
  • Some ops are easily made asynchronous: access time
  • Some filesystems don't care: Linux ext2fs

27
Recovery After Failure
  • Need to ensure consistency
  • Does free bitmap match tree walk?
  • Do reference counts in inodes match directory
    entries?
  • Do blocks appear in multiple inodes?
  • Time for this kind of recovery grows with disk size
  • Clean shutdown: mark as such, no recovery
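
The three consistency questions above can be sketched as a toy checker. This is not how any real fsck is implemented; it assumes the whole file system fits in a few Python structures, and the names check_consistency, free_bitmap, inodes, and directory_entries are hypothetical.

```python
def check_consistency(free_bitmap, inodes, directory_entries):
    """Toy consistency check.

    free_bitmap: list of bools, True means the block is marked free
    inodes: dict inode_no -> {"blocks": [...], "refcount": int}
    directory_entries: list of inode numbers, one per directory entry
    Returns a list of problem descriptions.
    """
    problems = []

    # Walk every inode to find out who claims each block.
    owners = {}
    for ino, inode in inodes.items():
        for b in inode["blocks"]:
            owners.setdefault(b, []).append(ino)

    # Does the free bitmap match the walk?
    for b, is_free in enumerate(free_bitmap):
        used = b in owners
        if used and is_free:
            problems.append(f"block {b} in use but marked free")
        if not used and not is_free:
            problems.append(f"block {b} unused but marked allocated")

    # Do blocks appear in multiple inodes?
    for b, who in owners.items():
        if len(who) > 1:
            problems.append(f"block {b} appears in inodes {who}")

    # Do reference counts match directory entries?
    for ino, inode in inodes.items():
        links = directory_entries.count(ino)
        if links != inode["refcount"]:
            problems.append(f"inode {ino} refcount {inode['refcount']} != {links} entries")
    return problems

problems = check_consistency(
    free_bitmap=[False, False, True, True],
    inodes={1: {"blocks": [0, 1], "refcount": 1},
            2: {"blocks": [1, 2], "refcount": 2}},
    directory_entries=[1, 2],
)
for p in problems:
    print(p)
```

Every block and every inode is visited, which is why this style of recovery takes longer as the disk grows.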

28
Reducing Synchronous Times
  • Write to a faster storage
  • Nonvolatile memory: expensive, requires some
    additional OS/firmware support
  • Write to a special disk or section: logging
  • Only have to examine log when recovering
  • Eventually have to put information in place
  • Some information dies in the log itself
  • Write in a special order
  • Write metadata in a way that is consistent but
    possibly recovers less

29
Challenges
  • Unix filesystem has great flexibility
  • Extent-based filesystems have speed
  • Seeks kill performance: locality matters
  • Bitmaps show contiguous free space
  • Linked lists easy to search
  • How do you perform backup/restore?

30
A Quick XOR Overview
  • XOR = eXclusive OR
  • a XOR a = 0
  • a XOR 0 = a
  • a XOR b = b XOR a
  • (a XOR b) XOR c = a XOR (b XOR c)
  • In other words, count the 1 bits:
  • even = 0, odd = 1

31
More Fun With XOR
  • Result = XOR (a1, a2, a3, a4, ...)
  • a2 goes bad
  • Can we reconstruct a2?
  • a2 = XOR (a1, result, a3, a4, ...)
  • What does this imply for disks?
  • What kinds of failures does it handle?
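
The reconstruction identity can be checked directly. The sketch below is a minimal illustration, assuming each a_i is an equal-length byte string standing in for a disk block; the helper name xor_blocks and the sample contents are made up.

```python
from functools import reduce

def xor_blocks(blocks):
    """Bytewise XOR of equal-length byte strings."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))

a1, a2, a3, a4 = b"disk", b"file", b"raid", b"unix"
result = xor_blocks([a1, a2, a3, a4])       # the parity block

# a2 goes bad: rebuild it from the survivors plus the parity.
rebuilt = xor_blocks([a1, result, a3, a4])
print(rebuilt)  # b'file'
```

This only works when it is known which block is missing, and only for one missing block at a time.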

32
Bigger, Faster, Stronger
  • Making individual disks larger is hard
  • Throw more disks at the problem
  • Capacity increases
  • Effective access speed may increase
  • Probability of failure also increases
  • Use some disks to provide redundancy
  • Generally assume a fail-stop model
  • Fail-stop versus Byzantine failures

33
RAID (Redundant Array of Inexpensive Disks)
  • Main idea
  • Store the error correcting codes on other disks
  • General error correcting codes are too powerful
  • Use XORs or single parity
  • Upon any failure, one can recover the entire
    block from the spare disk (or any disk) using
    XORs
  • Pros
  • Reliability
  • High bandwidth
  • Cons
  • The controller is complex

[Figure: RAID controller computing XOR across the disks]

34
Synopsis of RAID Levels
RAID Level 0: Non-redundant (JBOD)
RAID Level 1: Mirroring
RAID Level 2: Byte-interleaved, ECC
RAID Level 3: Byte-interleaved, parity
RAID Level 4: Block-interleaved, parity
RAID Level 5: Block-interleaved, distributed parity

35
Did RAID Work?
  • Performance: yes
  • Reliability: yes
  • Cost: no
  • Controller design complicated
  • Fewer economies of scale
  • High-reliability environments don't care
  • Now also software implementations

36
RAID's Real Benefit
  • Partly addresses the failure problem
  • Backup/restore less of an issue
  • Failed disk rebuilt at sector level
  • Lower performance during rebuild, but system
    still on-line
  • Still not perfect
  • Geographic problems
  • Failure during rebuild
