Chapter 11: File System Implementation - PowerPoint PPT Presentation

Provided by: luce164
Transcript and Presenter's Notes

1
Chapter 11: File System Implementation
2
Chapter 11: File System Implementation
  • File-System Structure
  • File-System Implementation
  • Directory Implementation
  • Allocation Methods
  • Free-Space Management
  • Efficiency and Performance
  • Recovery
  • Log-Structured File Systems
  • NFS
  • Example: WAFL File System

3
Objectives
  • To describe the details of implementing local
    file systems and directory structures
  • To describe the implementation of remote file
    systems
  • To discuss block allocation and free-block
    algorithms and trade-offs

4
File-System Structure
  • File structure
  • Logical storage unit
  • Collection of related information
  • File system resides on secondary storage (disks)
  • File system organized into layers
  • File control block (FCB): storage structure
    consisting of information about a file
  • File control blocks reside on disk and are copied
    into memory

5
Layered File System
6
A Typical File Control Block
  • Used by the logical file system; includes
    metadata about the file
  • Sometimes called an inode (or a vnode in VFS)

7
Other File System Structures
  • In-memory partition table
  • Contains information about each mounted partition
  • In-memory directory structure
  • Contains copies of recently accessed directories
  • System-wide open file table
  • Contains copy of FCB for each open file, plus
    other info.
  • Per-process open file table
  • Contains pointers to appropriate entries in
    system-wide open file table

8
Initial File Operations
  • To create a file
  • The logical file system (LFS) allocates a new FCB,
    reads the appropriate directory into memory,
    updates it, and forces it back out to disk
  • Some file systems (e.g., Unix) treat a directory
    just like a file
  • i.e., when creating a directory, allocate an FCB /
    inode / vnode for it, and set a bit to indicate
    it is a directory
  • Other file systems (e.g., Windows) use a different
    kind of structure
  • To open a file (so it can be used)
  • Pass the OS the filename; look it up in the
    directory structure
  • Copy the file's FCB from disk into memory, into the
    system-wide open file table (or increment the
    number of processes that have that file open)
  • Update the per-process file table to point to the
    entry in the system-wide open file table (SWOFT), etc.
  • Return to the caller a pointer (file descriptor /
    file handle) to the appropriate entry in the
    per-process file table
  • Caller uses this handle for all I/O to the file
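
The open sequence above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: `disk_fcbs`, `system_wide_table`, `Process`, and `open_file` are hypothetical names, dictionaries stand in for on-disk structures, and the file descriptor is simply an index into the per-process table.

```python
# Toy model of the open-file tables; all names here are illustrative assumptions.
disk_fcbs = {"notes.txt": {"size": 1200, "blocks": [7, 8, 9]}}  # stand-in for FCBs on disk

system_wide_table = {}  # filename -> {"fcb": copy of FCB, "open_count": n}


class Process:
    def __init__(self):
        self.open_files = []  # per-process table; the list index is the file descriptor


def open_file(proc, name):
    """Look up the name, load or share the FCB, and return a file descriptor."""
    if name not in disk_fcbs:
        raise FileNotFoundError(name)
    entry = system_wide_table.get(name)
    if entry is None:
        # First open: copy the FCB from "disk" into the system-wide table.
        entry = {"fcb": dict(disk_fcbs[name]), "open_count": 0}
        system_wide_table[name] = entry
    entry["open_count"] += 1
    # The per-process entry points at the shared system-wide entry.
    proc.open_files.append({"entry": entry, "offset": 0})
    return len(proc.open_files) - 1


p1, p2 = Process(), Process()
fd1 = open_file(p1, "notes.txt")
fd2 = open_file(p2, "notes.txt")
```

Both processes end up with their own descriptor and offset, while sharing one system-wide entry whose open count is 2.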

9
In-Memory File System Structures
Opening a file
Reading a file
10
Virtual File Systems
  • Virtual File Systems (VFS): a common standard that
    provides an object-oriented way of implementing
    file systems (common abstraction layer).
  • Separates generic file system operations from
    their implementations by defining a VFS interface
    (set of APIs)
  • VFS allows the same system call interface (the
    API) to be used for different types of file
    systems.
  • Invoke the generic API on the VFS interface,
    rather than a special API for each specific type
    of file system.
  • The VFS is based on a file representation
    structure called a vnode
  • Contains numerical designator for a network-wide
    unique file (or directory)
  • Kernel maintains one vnode structure for each
    active node (file or directory)
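
A minimal sketch of the idea, assuming a single `read` operation: callers use one generic interface, and the VFS layer dispatches to whichever file system is mounted at the matching path prefix. The class names (`FileSystem`, `LocalFS`, `RemoteFS`, `VFS`) are hypothetical, and the "remote" file system is only a dictionary standing in for a server.

```python
from abc import ABC, abstractmethod


class FileSystem(ABC):
    """Generic interface every concrete file system must implement."""

    @abstractmethod
    def read(self, path):
        ...


class LocalFS(FileSystem):
    def __init__(self, files):
        self.files = files

    def read(self, path):
        return self.files[path]


class RemoteFS(FileSystem):
    def __init__(self, server_files):
        self.server_files = server_files

    def read(self, path):
        # A real implementation would issue a network request here.
        return self.server_files[path]


class VFS:
    def __init__(self):
        self.mounts = {}  # mount prefix -> FileSystem

    def mount(self, prefix, fs):
        self.mounts[prefix] = fs

    def read(self, path):
        # The longest matching mount prefix decides which file system handles the call.
        prefix = max((m for m in self.mounts if path.startswith(m)), key=len)
        return self.mounts[prefix].read(path[len(prefix):])


vfs = VFS()
vfs.mount("/", LocalFS({"etc/hosts": "local-data"}))
vfs.mount("/net/", RemoteFS({"data.txt": "remote-data"}))
local_result = vfs.read("/etc/hosts")
remote_result = vfs.read("/net/data.txt")
```

The caller cannot tell from the call itself whether the data came from a local or a remote file system.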

11
Schematic View of Virtual File System
The VFS hides file system type and whether file
systems are local or remote
12
Secondary Storage: Where Files Reside
  • Secondary storage: extension of system storage,
    which provides large, non-volatile area of
    storage
  • Today: magnetic disks; formerly: magnetic tape (or
    cards)
  • Fixed head / movable head
  • Fixed / removable / RAM
  • Platter / cylinder / track / sector
  • Drive / controller / subsystem
  • Floppies / drums, even CDs / DVDs / thumb
    drives / USB drives
  • Characteristics
  • Permanent
  • Random-access
  • Reusable
  • Data stored in files looks like a large contiguous
    address space

13
Disk Structure
14
Magnetic Disk Structure
15
Logical Disk Structure / Addressing
  • Can be viewed as an array of blocks (sectors),
    like tape
  • A mapping scheme exists to map from logical
    block to physical address (track and sector):
    IMPORTANT POINT
  • Sometimes block size = sector size = page size
    (512 bytes)
  • Smallest storage allocation area is a block
  • Storage allocated by block
  • Internal fragmentation within a block
  • Often each disk has a directory (VTOC, volume table of contents)
  • Exists on disk itself
  • Contains information about files on disk
  • Filename / date(s)
  • Address / length
  • Owner / security
  • Some systems use other techniques (single level
    store)
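
As a rough illustration of the logical-to-physical mapping, the sketch below converts a logical block number into a (cylinder, head, sector) triple. It assumes a simple fixed geometry with the same number of sectors on every track and sectors numbered from 1; the function name and the geometry values are hypothetical, and real drives hide a more complex layout behind the same idea.

```python
def lba_to_chs(lba, heads, sectors_per_track):
    """Map a logical block number to (cylinder, head, sector) for a fixed geometry."""
    cylinder = lba // (heads * sectors_per_track)
    head = (lba // sectors_per_track) % heads
    sector = lba % sectors_per_track + 1  # sectors are numbered from 1 by convention
    return cylinder, head, sector


# Assumed geometry: 4 heads, 16 sectors per track.
first = lba_to_chs(0, heads=4, sectors_per_track=16)
later = lba_to_chs(100, heads=4, sectors_per_track=16)
```

With this geometry, block 100 lands on cylinder 1, head 2, sector 5, since 1×64 + 2×16 + 4 = 100.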

16
Device Directory Implementation
  • Linear list of file names with pointer to the
    data blocks.
  • Simple to program
  • Time-consuming to execute
  • Can try to improve access by caching, sorting,
    other techniques
  • The difficulty is, this directory is on disk, and
    is not easy to expand, contract, etc.
  • Possible option -- hash table: linear list with
    hash data structure.
  • Decreases directory search time
  • BUT -- must handle collisions: situations where
    two file names hash to the same location
  • Hash table usually fixed size
  • Limited size and collisions can impact performance
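
A small sketch of a fixed-size hashed directory with chaining, to make the collision case concrete. The class `HashedDirectory` and its toy hash (sum of the name's bytes) are assumptions chosen so that the collision is easy to see; real file systems use stronger hash functions.

```python
class HashedDirectory:
    """Fixed-size hash table of directory entries; collisions are chained in a bucket."""

    def __init__(self, buckets=8):
        self.buckets = [[] for _ in range(buckets)]

    def _index(self, name):
        # Deterministic toy hash: names with the same letters collide on purpose.
        return sum(name.encode()) % len(self.buckets)

    def add(self, name, first_block):
        self.buckets[self._index(name)].append((name, first_block))

    def lookup(self, name):
        # Only one bucket is scanned, not the whole directory.
        for entry_name, block in self.buckets[self._index(name)]:
            if entry_name == name:
                return block
        return None


d = HashedDirectory()
d.add("ab", 10)
d.add("ba", 20)  # same byte sum as "ab", so it lands in the same bucket
```

Lookups stay correct despite the collision, but a bucket that fills with colliding names degrades toward the linear-list search time.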

17
File Allocation Methods
  • File is a logical unit of storage
  • Collection of related information
  • Exists at some stage in main memory, but stored
    permanently on mass storage (disk / tape)
  • May be stored on disk in a variety of ways
  • An allocation method refers to how disk blocks
    are allocated for files on disk
  • Contiguous allocation
  • Linked allocation
  • Indexed allocation
  • Look at implementation and
    advantages/disadvantages of each

18
Contiguous Allocation
  • Each file occupies a set of contiguous blocks on
    the disk.
  • Simple: only need starting location (block #)
    and length (number of blocks)
  • Random access is quick and easy
  • Files cannot grow, so must create extra large
    (results in internal fragmentation)
  • Wasteful of space: external fragmentation
  • Allocate by first fit / best fit / worst fit /
    etc.
  • Compaction sometimes necessary
  • Like an array data structure
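
The two core operations of contiguous allocation can be sketched as follows: a first-fit search for a run of free blocks, and the array-style mapping from a logical block number to a physical one. The free map here is a plain list of booleans (True means free), which is an assumption for illustration only.

```python
def first_fit(free_map, length):
    """Return the start of the first run of `length` free blocks, or None."""
    run_start, run_len = None, 0
    for i, is_free in enumerate(free_map):
        if is_free:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len == length:
                return run_start
        else:
            run_len = 0
    return None


def physical_block(start, length, logical):
    """Random access: the n-th block of a file is simply start + n."""
    if not 0 <= logical < length:
        raise IndexError("logical block outside the file")
    return start + logical


free_map = [False, True, True, False, True, True, True, True]
start_of_three = first_fit(free_map, 3)
```

Note that a request for 5 blocks fails here even though 6 blocks are free in total, which is external fragmentation in miniature.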

19
Contiguous Allocation of Disk Space
20
Extent-Based Systems
  • Some newer file systems (e.g., Veritas File
    System) use a modified contiguous allocation
    scheme.
  • Extent-based file systems allocate disk blocks in
    extents.
  • An extent is a contiguous set of disk blocks.
    Extents are allocated for file allocation. A file
    consists of one or more extents.
  • Extents are also handy in saving directory space
  • OS/400 / i5/OS has used variable-sized extents
    for 30 years, along with other innovative
    techniques
  • Will discuss later

21
Linked Allocation
  • Each file is a linked list of disk blocks
  • Blocks in a file may be scattered anywhere on the
    disk
  • Each block contains a pointer (address) to next
    block in file
  • Like linked-list data structure

22
Linked Allocation (Cont.)
  • Files created easily, can grow easily
  • Simple: need only starting address
  • No external fragmentation, no compaction
  • Potential difficulties
  • Random access
  • Reliability
  • Pointer space required in each block (no longer
    matches page size)
  • Addressed somewhat by not chaining individual
    blocks together, but rather chaining clusters of
    blocks together
  • Cluster size important in internal fragmentation
  • Variant: FAT, used by DOS, early Windows,
    OS/2 (like a linked list)
  • Each File Allocation Table entry corresponds to
    one data block (cluster) on disk and holds the
    number of the next FAT entry for the file
  • Also improves random access performance, if FAT
    cached
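
A sketch of following a FAT chain, assuming the table is held in memory as a mapping from cluster number to next cluster number, with a hypothetical end-of-file marker `EOF_MARK`. The cluster numbers are made up for the example.

```python
EOF_MARK = -1  # assumed end-of-chain marker for this sketch

# Hypothetical FAT: cluster number -> next cluster in the same file.
fat = {4: 9, 9: 2, 2: EOF_MARK}


def chain(fat, start):
    """Return every cluster of the file, in order, by following FAT links."""
    clusters = []
    current = start
    while current != EOF_MARK:
        clusters.append(current)
        current = fat[current]
    return clusters


def nth_cluster(fat, start, n):
    """Random access still walks n links, but only in the table, not on disk."""
    current = start
    for _ in range(n):
        current = fat[current]
    return current


file_clusters = chain(fat, 4)
```

Because the links live in the table rather than inside the data blocks, a cached FAT lets the walk happen without any disk reads.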

23
Linked Allocation
24
File-Allocation Table
Directory entry points to FAT entry. FAT has
chain of addresses of disk blocks in
file. Usually clusters, not blocks. If FAT is
damaged, can lose pieces of file.
25
Indexed Allocation
  • Each file has an index block, contains pointers
    to all blocks in file
  • Random access quick, easy
  • Easy to expand file
  • No external fragmentation
  • BUT need extra space for index tables: one
    table per file, plus internal fragmentation in
    index block
  • Like table of pointers/references to data
    structures/objects
  • Logical view

[Figure: index table]
26
Example of Indexed Allocation
27
What if Index Table Gets Full?
  • Several possibilities
  • Link to another index block
  • No theoretical limit; may be a physical limit
  • Multi-level index
  • Multiple levels of index blocks
  • Combinations
  • Especially to address performance
  • e.g., inode structure

28
Indexed Allocation Mapping (Cont.)
[Figure: multi-level index. An outer index points to index tables, whose entries point to the blocks of the file.]
29
Combined Scheme: UNIX (4K bytes per block)
This structure is an inode / vnode
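
To make the combined scheme concrete, the sketch below works out which level of the inode serves a given logical block. Only the 4K block size comes from the slide; the 4-byte pointer size and the 12 direct pointers are assumptions chosen for illustration, and real file systems vary.

```python
BLOCK_SIZE = 4096                      # from the slide: 4K bytes per block
PTR_SIZE = 4                           # assumed pointer size in bytes
N_DIRECT = 12                          # assumed number of direct pointers
PTRS_PER_BLOCK = BLOCK_SIZE // PTR_SIZE  # 1024 pointers fit in one index block


def level_for(logical_block):
    """Return which part of the inode reaches the given logical block."""
    n = logical_block
    if n < N_DIRECT:
        return "direct"
    n -= N_DIRECT
    if n < PTRS_PER_BLOCK:
        return "single indirect"
    n -= PTRS_PER_BLOCK
    if n < PTRS_PER_BLOCK ** 2:
        return "double indirect"
    n -= PTRS_PER_BLOCK ** 2
    if n < PTRS_PER_BLOCK ** 3:
        return "triple indirect"
    raise ValueError("block number beyond maximum file size")


# Largest file, in blocks, that this assumed layout can address.
max_blocks = N_DIRECT + PTRS_PER_BLOCK + PTRS_PER_BLOCK ** 2 + PTRS_PER_BLOCK ** 3
```

Small files are reached through direct pointers with no extra I/O, while only very large files pay for the additional index-block reads.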
30
Free Space Management
  • Another question related to disk space allocation
  • HOW DO WE KEEP TRACK OF AND MANAGE
    THE FREE SPACE ON DISK?
  • Need free space list / free space directory
  • Then how do we implement it?

31
Free-Space Management: Bit Map
  • Simplest FS (free-space) directory is a bit map:
    one bit for each block
  • Simple, fast, easy to find contiguous space
  • BUT can take up much space, especially since it
    must be kept in mainstore to be very efficient
  • Block size = 2^9 = 512 bytes
  • Disk size = 2^30 bytes = 1 GB
  • Bitmap size = (number of blocks) / (bits per
    block)
  • = (2^30 / 2^9) / (2^9 × 2^3) = 2^21 / 2^12
    = 2^9 blocks = ¼ MB per GB, just for the bitmap
  • Problem similar (but worse) for
    larger disks
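
The bitmap arithmetic and a first-free-block scan can be checked with a short sketch. The function names are hypothetical, and the convention that a 1 bit means "allocated" with bits numbered from the least significant end of each byte is an assumption for this example.

```python
def bitmap_blocks(disk_bytes, block_bytes):
    """Number of disk blocks needed to hold one bit per block."""
    n_blocks = disk_bytes // block_bytes
    bits_per_block = block_bytes * 8
    return -(-n_blocks // bits_per_block)  # ceiling division


def first_free(bitmap):
    """Return the number of the first free block (bit = 0), or None if full."""
    for byte_index, byte in enumerate(bitmap):
        if byte != 0xFF:  # skip fully allocated bytes quickly
            for bit in range(8):
                if not (byte >> bit) & 1:
                    return byte_index * 8 + bit
    return None


blocks_needed = bitmap_blocks(2 ** 30, 2 ** 9)
bytes_needed = blocks_needed * 2 ** 9

# Blocks 0-10 allocated, block 11 free.
example_map = bytes([0xFF, 0b00000111])
free_block = first_free(example_map)
```

For the 1 GB disk with 512-byte blocks this gives 512 bitmap blocks, or 256 KB, matching the ¼ MB figure.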

32
Other Problems with Bit Maps
  • Have multiple copies
  • One copy in memory, for quick access
  • One copy on disk, to maintain permanent state
  • Must keep copies in sync
  • Difficult to make updating both copies atomic
  • So, practically, the copies in memory and disk
    may differ.
  • BUT cannot have a situation where the in-memory
    copy says a block is allocated but the on-disk
    copy says it is not
  • Runtime solution
  • Set bit[i] = 1 on disk
  • Allocate block[i]
  • Set bit[i] = 1 in memory
  • But takes time
  • AND what happens if there is a crash?

33
Free-Space Management: Linked List
  • Maintain a pointer in each free block to point to
    the next free block
  • Must also maintain a pointer to head of free
    space list
  • This must be kept on disk, so permanent
  • Little wasted space
  • Cannot get contiguous space easily, and
    sometimes this is required
  • SLOW: requires substantial I/O time to traverse
    list
  • Cannot take advantage of fact that most systems
    can read / write multiple contiguous blocks at
    once
  • Unless free space links are to sets of blocks,
    etc., and then can have fragmentation

34
Linked Free Space List on Disk
35
Free-Space Management: Other Techniques
  • Grouping: free-space (FS) directory in a series of
    sectors / blocks
  • Store blocks of addresses of FS blocks all in
    same sector
  • Like an index block
  • Last address points to another block of addresses
  • Can find large number of blocks of FS quickly
    with little I/O
  • Counting: to improve on grouping technique
    above
  • Since FS blocks tend to occur in groups
  • Keep address of first free block in group, plus
    number of contiguous following free blocks
  • Requires larger FS list entries, but list will be
    shorter
  • Contiguous space will be easier to find
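
The counting technique can be sketched as a conversion from a list of free block numbers into (first block, number of following free blocks) entries. The function name `to_runs` and the sample block numbers are assumptions for illustration.

```python
def to_runs(free_blocks):
    """Compress free block numbers into (first, following) run entries."""
    runs = []
    for block in sorted(free_blocks):
        if runs and runs[-1][0] + runs[-1][1] + 1 == block:
            # Block is adjacent to the current run: extend its count.
            first, following = runs[-1]
            runs[-1] = (first, following + 1)
        else:
            runs.append((block, 0))
    return runs


runs = to_runs([2, 3, 4, 5, 8, 9, 17])
```

Seven free blocks collapse into three entries, and any request for up to four contiguous blocks can be satisfied by inspecting the first entry alone.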

36
Efficiency and Performance
  • Efficiency dependent on
  • Disk allocation and directory algorithms
  • Types of data kept in file's directory entry
  • Location of directories on disk
  • Performance
  • Disk cache: separate section of main memory for
    frequently used blocks
  • Also, caches in many controllers today
  • Free-behind and read-ahead: techniques to
    optimize sequential access
  • Free-behind: remove page from buffer when next
    page is requested
  • Read-ahead: read in several pages past the page
    requested
  • Improve performance by dedicating section of
    memory as virtual disk, or RAM disk.

37
Various Disk-Caching Locations
38
Page Cache
  • Have been discussing caching blocks from disk
  • Now discuss what happens after the blocks become
    pages in memory
  • A page cache caches pages rather than disk blocks
    using virtual memory techniques
  • Note: this is a cache of pages; the actual
    pages that will be used are elsewhere in memory,
    referenced by the page table, etc.
  • Memory-mapped I/O uses a page cache
  • Routine I/O through the file system uses the
    buffer (disk) cache

39
I/O Without a Unified Buffer Cache
Memory-mapped I/O goes first to the page cache and
then to the common buffer cache. Regular I/O goes
directly to the buffer cache. The buffer cache then
goes to the file system.
40
Unified Buffer Cache
  • Note: multiple caches result in cache coherency
    concerns if the same data is in both caches
  • Also, there is problem of having to cache data
    for memory mapped I/O twice
  • Have to move much data
  • Have to allocate twice as much space in main
    memory
  • A unified buffer cache uses the same page cache
    to cache both memory-mapped pages and ordinary
    file system I/O.

41
I/O Using a Unified Buffer Cache
Here, both memory-mapped I/O and regular I/O go
directly to the same buffer cache. Buffer cache
then goes to file system
42
File and Data Recovery
  • Consistency checking: compares data in directory
    structure with data blocks on disk, and tries to
    fix inconsistencies.
  • Use system programs to back up data from disk to
    another storage device (tape, CD, DVD, SAN,
    library, etc.).
  • Need organized backup plan
  • Recover lost file or disk by restoring data from
    backup.
  • Often tradeoffs: minimizing backup or recovery
    time
  • Backups more frequent than recoveries
  • So optimize performance for most frequent
    (backups)
  • Most recoveries are not disaster recoveries;
    rather, single files
  • However, must also be able to handle disaster
    recoveries
  • Performance critical in disaster recoveries
  • Need to test backups, replace media, disaster
    recovery processes/plans, etc.

43
Log Structured File Systems
  • Log structured (or journaling) file systems
    record each update to the file system as a
    transaction.
  • All transactions are written to a log.
  • A transaction is considered committed once it is
    written to the log.
  • However, the file system may not yet be updated.
  • The transactions in the log are written to the
    file system.
  • May be asynchronous or may be forced to be
    synchronous (and appear atomic)
  • When the file system is modified again, some
    systems remove the previous transaction from the
    log, other systems leave the log as a
    record/backup.
  • If the file system crashes, all remaining
    transactions in the log must still be performed.
  • So, keep log on separate disk / separate file
    system
  • However, if system crashes, may leave filesystem
    in unknown state, unless logs have been forced
    out
  • Most journaled file systems only journal changes
    to metadata, NOT changes to data in files
  • e.g., NTFS, JFS, ext3, ReiserFS, UFS
  • iSeries also journals changes to data!
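
The commit-then-apply behavior described above can be sketched with an in-memory toy. `JournaledStore` and its methods are hypothetical names; a real journal is written to stable storage, which this sketch does not attempt to model.

```python
class JournaledStore:
    """Toy journal: updates are committed to a log first, applied to the data later."""

    def __init__(self):
        self.log = []    # committed transactions not yet applied
        self.data = {}   # stands in for the file system proper

    def commit(self, updates):
        # The transaction counts as committed once it is in the log.
        self.log.append(dict(updates))

    def checkpoint(self):
        # Apply each logged transaction in order, then drop it from the log.
        while self.log:
            self.data.update(self.log[0])
            del self.log[0]

    def recover(self):
        # After a crash, every transaction still in the log is replayed.
        self.checkpoint()


store = JournaledStore()
store.commit({"a": 1})
store.commit({"a": 2, "b": 3})
state_before = dict(store.data)  # nothing applied yet
store.recover()
```

Because each entry is applied before it is removed, replaying an entry twice after an interrupted checkpoint leaves the same result.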

44
Can Also Use Journals for Backup / HA
  • Possible if also journaling data in a filesystem
  • Have two physical computer systems, a local
    (primary) and remote (backup)
  • Remote system also has copy of critical data (or
    database)
  • When change data on local system, in addition to
    generating local journal of changes, also send
    copy of all journal entries to remote system
  • Can be done either by system or application
    software
  • Remote system then applies changes to its copy of
    data
  • If primary system fails, can switch to backup
    system and backup copy of data will be up to date

45
Network File System (NFS)
  • Network file systems are common
  • Use NFS as an example
  • An implementation and a specification of a
    software system for accessing remote files across
    LANs (or WANs).
  • Available in implementations for many operating
    systems and architectures
  • Ubiquitous: expect it to be available everywhere
  • Both clients and servers (at least expect NFS
    clients)
  • Details of implementations may vary
  • Text looks at Sun's implementation
  • Applicable to more general implementations as well

46
NFS (Cont.)
  • NFS views interconnected workstations as a set of
    independent machines with independent file
    systems
  • Goal is to allow sharing among these file systems
    in a transparent manner
  • A remote directory is mounted over a local file
    system directory
  • To the user, the mounted directory looks like a
    local mount: like an integral subtree of the
    local file system, replacing the subtree
    descending from the local directory
  • Specification of the remote directory for the
    mount operation is nontransparent
  • This means the host name of the remote
    directory has to be provided
  • After the remote directory is mounted, however,
    files in it can then be accessed in a transparent
    manner using VFS interface
  • Subject to access-rights accreditation,
    potentially any file system (or directory within
    a file system), can be mounted remotely on top of
    any local directory (local mount point)

47
NFS (Cont.)
  • NFS is designed to operate in a heterogeneous
    environment of different machines, operating
    systems, and network architectures
  • The NFS specification is independent of these media
  • This independence is achieved through the use of
    RPC primitives built on top of an External Data
    Representation (XDR) protocol used between two
    implementation-independent interfaces
  • Standard interfaces, standard data
    representations, standard RPC primitives
  • The NFS specification distinguishes between the
    services provided by a mount mechanism and the
    actual remote-file-access services
  • i.e., the user must know about the remote system
    to mount the remote file system
  • But once file system is mounted, the user does
    not really know or care

48
Three Independent File Systems
Three independent file systems; each must be mounted
in order to be used
49
Mounting in NFS
The new /dir1 is mounted over the old /dir1,
covering it up and making it inaccessible until
the new /dir1 is unmounted.
Mounts
Cascading mounts
50
NFS Mount Protocol
  • Establishes initial logical connection between
    server and client
  • Mount operation includes name of remote directory
    to be mounted and name of server machine storing
    it
  • Mount request is mapped to corresponding RPC and
    forwarded to mount server running on server
    machine
  • Export list specifies local file systems that
    server exports for mounting, along with names of
    machines that are permitted to mount them
  • Following a mount request that conforms to its
    export list, the server returns a file handle, a
    key for further accesses
  • File handle: a file-system identifier and an
    inode number to identify the mounted directory
    within the exported file system
  • The mount operation changes only the user's view
    and does not affect the server side
  • In the user's view, under VFS, looks just like a
    locally mounted file system

51
NFS Protocol
  • Provides a set of remote procedure calls for
    remote file operations. The procedures support
    the following operations
  • Searching for a file within a directory
  • Reading a set of directory entries
  • Manipulating links and directories
  • Accessing file attributes
  • Reading and writing files
  • NFS servers are stateless: each request has to
    provide a full set of arguments (NFS V4 is just
    becoming available; it is quite different and
    stateful)
  • Modified data must be committed to the server's
    disk before results are returned to the client
    (lose advantages of caching)
  • The NFS protocol does not provide
    concurrency-control mechanisms
  • Program must handle (e.g., by byte-range locking)
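
A sketch of what "stateless" means for a read: the server keeps no open-file table or current offset, so every request carries the file handle, the offset, and the byte count. The function `nfs_read`, the handle tuple, and the dictionary standing in for the server's files are all hypothetical.

```python
# Hypothetical server-side storage: file handle -> file contents.
# The handle is modeled as (file-system identifier, inode number).
server_files = {(1, 42): b"hello, stateless world"}


def nfs_read(files, file_handle, offset, count):
    """Serve one read using only the arguments in the request itself."""
    return files[file_handle][offset:offset + count]


# The client, not the server, remembers where the next read should start.
handle = (1, 42)
client_offset = 0
first_chunk = nfs_read(server_files, handle, client_offset, 5)
client_offset += len(first_chunk)
second_chunk = nfs_read(server_files, handle, client_offset, 2)
```

Since each request is self-contained, a server that restarts between the two reads can answer the second one without any recovery step.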

52
Three Major Layers of NFS Architecture
  • UNIX / Linux / Posix file-system interface (based
    on the open, read, write, and close calls, and
    file descriptors)
  • Virtual File System (VFS) layer: distinguishes
    local files from remote ones, and local files are
    further distinguished according to their
    file-system types
  • The VFS activates file-system-specific operations
    to handle local requests according to their
    file-system types
  • Calls the NFS protocol procedures for remote
    requests
  • NFS service layer: bottom layer of the
    architecture
  • Implements the NFS protocol

53
Schematic View of NFS Architecture
54
NFS Path-Name Translation
  • Performed by breaking the path into component
    names and performing a separate NFS lookup call
    for every pair of component name and directory
    vnode
  • To make lookup faster, a directory name lookup
    cache on the client's side holds the vnodes for
    remote directory names
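
Component-by-component translation with a client-side cache can be sketched as below. Nested dictionaries stand in for remote directories, `nfs_lookup` stands in for one lookup call to the server, and the cache is keyed by path prefix for simplicity; all of these are assumptions made for illustration rather than the actual cache layout.

```python
# Hypothetical remote tree: directories are dicts, files are strings.
remote_tree = {"usr": {"local": {"readme": "file-data"}}}

lookup_calls = []  # records each simulated call to the server


def nfs_lookup(dir_node, name):
    """Stand-in for one remote lookup of `name` within a directory."""
    lookup_calls.append(name)
    return dir_node[name]


def translate(root, path, cache):
    """Resolve a path one component at a time, consulting the cache first."""
    node = root
    prefix = ""
    for component in path.strip("/").split("/"):
        prefix += "/" + component
        if prefix in cache:
            node = cache[prefix]
        else:
            node = nfs_lookup(node, component)
            cache[prefix] = node
    return node


cache = {}
result = translate(remote_tree, "/usr/local/readme", cache)
calls_after_first = len(lookup_calls)
translate(remote_tree, "/usr/local/readme", cache)  # served entirely from the cache
calls_after_second = len(lookup_calls)
```

The first translation costs one lookup per component; the repeat costs none, which is the saving the cache is meant to provide.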

55
NFS Remote Operations
  • Nearly one-to-one correspondence between regular
    UNIX / Linux system calls and the NFS protocol
    RPCs (except opening and closing files)
  • NFS adheres to the remote-service paradigm, but
    employs buffering and caching techniques for the
    sake of performance
  • File-blocks cache: when a file is opened, the
    kernel checks with the remote server whether to
    fetch or revalidate the cached attributes
  • Cached file blocks are used only if the
    corresponding cached attributes are up to date
  • File-attribute cache: the attribute cache is
    updated whenever new attributes arrive from the
    server
  • Clients do not free delayed-write blocks until
    the server confirms that the data have been
    written to disk

56
Example: WAFL File System
  • Used on Network Appliance Filers: distributed
    file system appliances
  • Write-anywhere file layout
  • Serves up NFS, CIFS, http, ftp
  • Random I/O optimized, write optimized
  • NVRAM for write caching
  • Similar to Berkeley Fast File System, with
    extensive modifications

57
The WAFL File Layout
58
Snapshots in WAFL
59
End of Chapter 11