Distributed File Systems: Andrew file system

Provided by: wort
1
Distributed File Systems: Andrew file system
2
The Andrew File System
  • uses normal UNIX file primitives
  • AFS servers hold local user files, but the server
    filing systems are NFS-based
  • the primary design goal was scalability, which is
    achieved by
      • whole-file serving (unlike NFS)
      • whole-file caching on the client computer

3
Performance of the Andrew File System
  • the design was based on a number of assumptions
      • locally cached copies of shared files that are
        infrequently updated (e.g. system files), or of
        files used by a single user, are likely to remain
        valid for long periods
      • the local cache is sufficiently large (> 100 MB) to
        hold a working set of files for a user
  • common file characteristics
      • small (< 10 KB)
      • reads are about six times more common than writes
      • most files are read and written by only one user;
        when a file is shared, usually only one user
        modifies it
      • files are referenced in bursts
  • not suitable for distributed databases (files that a
    large number of users can modify)

4
Distribution of processes in the Andrew File System
5
File name space seen by clients of AFS
6
System call interception in AFS
7
Implementation of file system calls in AFS
8
The main components of the Vice service interface
9
Cache consistency
  • maintained by the use of callbacks
  • when a server updates a file, it sends a callback
    to every client holding a copy of that file; each
    of those clients then marks its callback for the
    file as cancelled
  • when accessing a cached copy of a file, a client
    checks the callback; if it is cancelled, a new
    copy of the file must be fetched from the server
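
The callback scheme on this slide can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the class and method names (ViceServer, VenusClient, fetch, store, break_callback) are hypothetical, everything runs in one process, and failures and concurrency are ignored.

```python
class ViceServer:
    """Toy server: stores files and remembers who holds a callback promise."""

    def __init__(self):
        self.files = {}      # file name -> contents
        self.promises = {}   # file name -> set of clients holding a promise

    def fetch(self, client, name):
        # Handing out a copy also records a callback promise for the client.
        self.promises.setdefault(name, set()).add(client)
        return self.files[name]

    def store(self, writer, name, data):
        self.files[name] = data
        # Notify every other holder that its cached copy is now stale.
        for client in self.promises.get(name, set()) - {writer}:
            client.break_callback(name)
        self.promises[name] = {writer}


class VenusClient:
    """Toy client: whole-file cache with a callback state per file."""

    def __init__(self, server):
        self.server = server
        self.cache = {}      # file name -> [contents, "valid" or "cancelled"]

    def break_callback(self, name):
        if name in self.cache:
            self.cache[name][1] = "cancelled"

    def open(self, name):
        entry = self.cache.get(name)
        if entry is None or entry[1] == "cancelled":
            # No usable copy: fetch a fresh one from the server.
            self.cache[name] = [self.server.fetch(self, name), "valid"]
        return self.cache[name][0]

    def close(self, name, data):
        # Whole-file update: the new contents go back to the server on close.
        self.cache[name] = [data, "valid"]
        self.server.store(self, name, data)
```

With two clients that have both opened the same file, a close by one marks the other's callback as cancelled, so the other's next open fetches the new contents instead of using its stale copy.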

10
Cache consistency
  • upon reboot, a client checks all of its file
    callbacks, as some callbacks may have been missed
  • for each callback it finds marked valid, it sends
    a validation request carrying the file's timestamp
  • if the server agrees with the timestamp, the callback
    is valid; if not, the callback is cancelled (i.e.
    a new copy of the file must be fetched on next use)
  • callbacks must be renewed before an open if a time
    T (typically a few minutes) has elapsed since the
    file was cached with no communication from the
    server
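
The decision a client makes before an open, and the effect of a validation reply, could be sketched as follows. The names (CacheEntry, action_on_open, revalidate) and the value chosen for T are assumptions made for illustration, not part of AFS.

```python
from dataclasses import dataclass

RENEWAL_INTERVAL = 300.0  # assumed value of T in seconds ("a few minutes")


@dataclass
class CacheEntry:
    timestamp: float       # modification timestamp of the cached copy
    callback_valid: bool   # False once the callback has been cancelled
    last_contact: float    # when the copy was cached or last confirmed


def action_on_open(entry, now, interval=RENEWAL_INTERVAL):
    """Return what the client should do before using the cached copy."""
    if not entry.callback_valid:
        return "fetch"      # callback cancelled: a new copy is required
    if now - entry.last_contact > interval:
        return "validate"   # too long without contact: renew the callback
    return "use_cache"


def revalidate(entry, server_timestamp, now):
    """Apply the server's reply to a validation request."""
    entry.callback_valid = (server_timestamp == entry.timestamp)
    if entry.callback_valid:
        entry.last_contact = now
    return entry.callback_valid
```

After a reboot, the same revalidate step would be applied to every entry still marked valid, since callbacks sent while the client was down may have been missed.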

11
Cache consistency
  • callbacks work better than timestamps alone
    (with timestamps, every open would require a check
    with the server)
  • callbacks require that Vice servers retain some
    state on behalf of the Venus clients, and that
    state must survive a server reboot
  • strict one-copy semantics are not practical in
    large systems; the callback promise mechanism
    maintains a well-defined approximation to them

12
Other aspects of AFS worth noting
  • the UNIX kernel in AFS hosts is modified to allow
    Vice to perform file operations in terms of file
    handles rather than conventional UNIX file descriptors
  • each server contains a fully replicated location
    database giving a mapping of volume names to
    servers. If a volume is moved, forwarding
    information is left behind
  • Vice and Venus make use of a non-pre-emptive
    thread package so that requests can be processed
    concurrently at both the server and the client.
    Cache content maps are stored in the common
    memory that is shared between Venus threads

13
Other aspects of AFS worth noting
  • read-only replicas can be used for infrequently
    updated files
  • bulk transfers in large packets minimise the
    effect of network latency
  • AFS-3 allows for partial file caching in 64 KB
    chunks
  • load on AFS servers is considerably lower than
    on NFS servers
  • allows for wide-area support with multiple
    administrative cells
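
With partial file caching, a client only needs the chunks that a given read touches. A minimal sketch of that arithmetic, assuming fixed 64 KB chunks numbered from zero (the function name chunks_for_read is hypothetical):

```python
CHUNK_SIZE = 64 * 1024  # 64 KB chunks, as on the slide


def chunks_for_read(offset, length, chunk_size=CHUNK_SIZE):
    """Return the indices of the chunks covering bytes [offset, offset + length)."""
    if length <= 0:
        return []
    first = offset // chunk_size
    last = (offset + length - 1) // chunk_size
    return list(range(first, last + 1))
```

For example, a two-byte read that starts at offset 65535 straddles a chunk boundary and therefore needs chunks 0 and 1.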

14
Recent advances in file services
  • NFS enhancements
      • WebNFS - an NFS server implements a web-like service
        on a well-known port. Requests use a 'public file
        handle' and a pathname-capable variant of
        lookup(). Enables applications to access NFS
        servers directly, e.g. to read a portion of a
        large file.
      • One-copy update semantics (Spritely NFS, NQNFS) -
        include an open() operation and maintain tables
        of open files at servers, which are used to
        prevent multiple writers and to generate
        callbacks to clients notifying them of updates.
        Performance was improved by a reduction in
        getattr() traffic.

15
Recent advances in file services
  • Improvements in disk storage organisation
      • RAID - improves performance and reliability by
        striping data redundantly across several disk
        drives
      • Log-structured file storage - updated pages are
        stored contiguously in memory and committed to
        disk in large contiguous blocks (1 Mbyte). File
        maps are modified whenever an update occurs.
        Garbage collection is needed to recover disk
        space.
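
One common way to stripe data redundantly is to add an XOR parity block to each stripe, so that any single lost block can be rebuilt from the survivors. The slide does not name a RAID level, so the single-parity layout and the function names below are assumptions chosen for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Return the data blocks followed by one parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def recover(stripe, lost_index):
    """Rebuild the block at lost_index from the remaining blocks."""
    survivors = [b for i, b in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)
```

Because XOR is its own inverse, the same recover call rebuilds either a data block or the parity block.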

16
New design approaches
  • Distribute file data across several servers
      • Exploits high-speed networks (ATM, Gigabit
        Ethernet)
      • Layered approach, lowest level is like a
        'distributed virtual disk'
      • Achieves scalability even for a single
        heavily-used file
  • 'Serverless' architecture
      • Exploits processing and disk resources in all
        available network nodes
      • Service is distributed at the level of individual
        files
  • Examples
      • xFS (section 8.5): experimental implementation
        demonstrated a substantial performance gain over
        NFS and AFS
      • Frangipani (section 8.5): performance similar to
        local UNIX file access
      • Tiger Video File System (see Chapter 15)
      • Peer-to-peer systems: Napster, OceanStore (UCB),
        Farsite (MSR), Publius (AT&T Research) - see web
        for documentation on these recent systems

17
New design approaches
  • Replicated read-write files
      • High availability
      • Disconnected working
          • re-integration after disconnection is a major
            problem if conflicting updates have occurred
      • Examples
          • Bayou system (Section 14.4.2)
          • Coda system (Section 14.4.3)

18
Summary
  • Sun NFS is an excellent example of a distributed
    service designed to meet many important design
    requirements
  • Effective client caching can produce file service
    performance equal to or better than local file
    systems
  • Consistency versus update semantics versus fault
    tolerance remains an issue
  • Most client and server failures can be masked
  • Superior scalability can be achieved with
    whole-file serving (Andrew FS) or the distributed
    virtual disk approach
  • Future requirements
      • support for mobile users, disconnected operation,
        automatic re-integration (cf. Coda file system,
        Chapter 14)
      • support for data streaming and quality of service
        (cf. Tiger file system, Chapter 15)