1
Network File System (NFS)
  • Architecture and system model
  • Communication and processes

2
Network File System (NFS)
  • Developed by Sun Microsystems Inc.
  • Widely used (version 3)
  • Interoperable with various operating systems,
    mostly used on UNIX systems
  • The version 4 specification has been announced
  • The basic idea is that a file server provides a
    standardized view of its local file system
  • A communication protocol allows clients to access
    the files stored on a server
  • This makes it possible for processes on different
    operating systems and machines to share a common
    file system

3
NFS Architecture
  • The NFS model is a remote file service model
  • Clients are normally unaware of the actual
    location of files (local/remote)
  • The file system interface supports normal file
    operations (read, write, etc.); this is called the
    remote access model (in contrast to the
    upload/download model)
  • Files are accessed (typically in UNIX) via a
    single interface to the file system, the Virtual
    File System (VFS) layer, which hides the
    differences between local and remote file systems
  • Communication is done through Remote Procedure
    Calls, and the server converts these calls to
    normal local file system calls via the VFS
  • NFS is largely independent of local file systems
    and operating systems

4
NFS Architecture
  • The remote access model (for example NFS)
  • The upload/download model (for example FTP)

5
NFS Architecture
  • The basic NFS architecture for UNIX systems.

6
NFS File System Model
  • The file system model is almost the same as in
    UNIX-based systems (files are sequences of bytes;
    a hierarchical naming graph represents directories
    and files)
  • Hard and symbolic links are supported
  • Files are named but accessed via file handles
  • There are some differences in system operations
    between versions 3 and 4 of the NFS protocol
  • Files have various attributes associated with
    them

7
NFS File System Model
  • An incomplete list of file system operations
    supported by NFS.

8
NFS Communication
  • The design of NFS makes it independent of
    operating systems, network architectures and
    transport protocols. For example, a Windows system
    can communicate with a UNIX file server
  • That is possible because NFS is placed on top of
    an RPC layer that hides the differences between
    the various systems
  • Every NFS operation is implemented as a single
    RPC to a file server (in NFS v3). Version 4 of the
    NFS protocol supports compound procedures, by
    which several RPCs can be combined into a single
    request
  • The operations in a compound procedure are
    executed in order; if one of them fails, an error
    is returned and no further operations are executed
    on the server side (see the sketch below)
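  • A minimal sketch (in Python, not part of the NFS specification) of
    compound-procedure semantics: operations run in order and processing
    stops at the first failure. The operation names are illustrative.

      def run_compound(operations):
          """Each operation is a callable returning (ok, result)."""
          results = []
          for op in operations:
              ok, result = op()
              results.append(result)
              if not ok:                 # first failure aborts the rest
                  return results, "error"
          return results, "ok"

      # Hypothetical LOOKUP/OPEN/READ steps combined into one request.
      ops = [
          lambda: (True, "lookup: file handle found"),
          lambda: (True, "open: file opened"),
          lambda: (True, "read: 4096 bytes"),
      ]
      print(run_compound(ops))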

9
NFS Communication
  • Reading data from a file in NFS version 3
  • Reading data using a compound procedure in
    version 4

10
NFS Processes
  • NFS is a traditional client-server system in
    which clients request a file server to perform
    operations on files
  • The server implementation can be stateless (in
    NFS v3) and therefore simple. In version 4 of NFS
    the server maintains information about its
    clients.
  • For example, file locking and authentication
    require the server to maintain information about
    clients. In NFS v3, file locking is done with a
    separate NFS lock manager
  • NFS v4 maintains only very little information
    about clients, and it is expected to also work in
    Wide Area Networks (WANs)
  • WAN usage requires an efficient client cache and
    a cache consistency protocol. These often work
    best when the server maintains some information on
    the files used by its clients

11
Network File System (NFS)
  • Naming
  • File handles
  • Automounting
  • File attributes

12
NFS Naming
  • The idea of the NFS naming model is to provide
    clients with completely transparent access to a
    remote file system maintained by a server
  • Transparency is achieved by letting a client
    mount a remote file system into its local file
    system
  • Instead of mounting a server's entire file
    system, NFS allows clients to mount only part of
    it. A server is said to export a directory when it
    makes that directory and its entries available to
    clients
  • A directory exported by a server can be mounted
    into a client's local name space (at a directory
    called a mount point)
  • Clients can have different name spaces; for
    example, two clients may refer to the same file on
    the server by different path names because their
    local name spaces differ (see the sketch below)
  • An NFS server can mount a directory from another
    NFS server, but it cannot export that mounted
    directory to its own clients. If clients want to
    access files on the other server, they must mount
    the directory directly from that server
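  • A minimal sketch (Python, illustrative only) of mounting an exported
    directory into a client's local name space: two clients reach the
    same server file through different local path names. The server,
    export and directory names are made up for the example.

      mounts = {}   # client -> {mount point: (server, exported directory)}

      def mount(client, mount_point, server, exported_dir):
          mounts.setdefault(client, {})[mount_point] = (server, exported_dir)

      def resolve(client, path):
          for mount_point, (server, exported) in mounts.get(client, {}).items():
              if path.startswith(mount_point + "/"):
                  return server, exported + path[len(mount_point):]
          return None, path          # a purely local path

      mount("clientA", "/remote/home", "nfs-server", "/export/users")
      mount("clientB", "/home", "nfs-server", "/export/users")
      print(resolve("clientA", "/remote/home/alice/mbox"))
      print(resolve("clientB", "/home/alice/mbox"))   # same file, other name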

13
NFS Naming
  • Mounting (part of) a remote file system in NFS.

14
NFS Naming
  • Mounting nested directories from multiple servers
    in NFS.

15
NFS File Handles
  • A file handle is a reference to a file within a
    file system
  • It is independent of the name of the file it
    refers to
  • A file handle is created by the server hosting
    the file system and is unique with respect to all
    file systems exported by that server
  • The client is unaware of the actual content of a
    file handle
  • The length of a file handle is up to 64 bytes
    (NFS v3) or 128 bytes (NFS v4)
  • A file handle is a true identifier for a file
    relative to a file system. It must remain the same
    for the lifetime of the file and cannot be reused
    after the file has been deleted (see the sketch
    below)
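  • A minimal sketch of how a server might build an opaque file handle
    from internal identifiers (file system id, inode number, generation
    count). The field layout is an assumption for illustration; clients
    must treat the handle as opaque bytes.

      import struct

      def make_file_handle(fsid: int, inode: int, generation: int) -> bytes:
          handle = struct.pack("!IQI", fsid, inode, generation)
          assert len(handle) <= 64   # NFS v3 limits handles to 64 bytes
          return handle

      fh = make_file_handle(fsid=7, inode=123456, generation=3)
      print(fh.hex())                # the client only sees opaque bytes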

16
NFS Automounting
  • Because of the problem of differing local name
    spaces, NFS client name spaces can be partly
    standardized
  • A remote file system can also be mounted only
    when needed; this is called automounting
  • Automounting is done by an automounter, which
    runs as a separate process on the client's machine
  • For example, the automounter can handle the
    directory /home as an automounting directory. When
    a process wants to access a directory under /home,
    the file access operation is forwarded from the
    kernel to the NFS client, and from the NFS client
    to the automounter
  • The automounter then creates the mount point,
    mounts the required remote file system, and the
    file can be accessed (a sketch follows below).
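  • A minimal sketch (Python) of the automounter idea: the first lookup
    under the automount directory triggers mounting of the corresponding
    remote directory before the access proceeds. The export map, server
    name and mount command are placeholders, not a real NFS client.

      import os

      AUTOMOUNT_ROOT = "/home"
      EXPORTS = {"alice": "fileserver:/export/home/alice"}  # assumed map
      mounted = set()

      def access(path: str):
          user = path[len(AUTOMOUNT_ROOT) + 1:].split("/")[0]
          if user not in mounted and user in EXPORTS:
              mount_point = os.path.join(AUTOMOUNT_ROOT, user)
              os.makedirs(mount_point, exist_ok=True)
              # A real automounter would now issue an NFS mount, e.g.
              # mount -t nfs fileserver:/export/home/alice /home/alice
              mounted.add(user)
          return open(path)          # the access now hits the mounted tree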

17
NFS Automounting
  • A simple automounter for NFS.

18
NFS Automounting
  • A problem with this simple automounter is that it
    has to be involved in all file operations to
    guarantee transparency, which can cause
    performance problems
  • A simple solution is to let the automounter mount
    directories in a special subdirectory and create a
    symbolic link to each mounted directory

19
NFS Automounting
  • Using symbolic links with automounting.

20
NFS File Attributes
  • An NFS file has a number of associated attributes
  • In version 3 of NFS, the set of attributes is
    fixed and every NFS implementation is expected to
    support those attributes (fully implementing NFS
    on non-UNIX systems was therefore sometimes
    difficult or impossible)
  • In version 4 of NFS, the set of attributes has
    been split into a set of mandatory attributes that
    every implementation must support, a set of
    recommended attributes that should preferably be
    supported, and an additional set of named
    attributes
  • Named attributes are actually not part of the NFS
    protocol, but are encoded as an array of
    (attribute, value) pairs, where the attribute is a
    string and the value is a sequence of bytes
  • Attributes and values are stored with a file or
    directory, and there are NFS operations to read
    and write them. The interpretation of attributes
    and their values is left to the application; it is
    not defined in the NFS specification (see the
    sketch below)
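  • A minimal sketch (Python) of named attributes as an array of
    (attribute, value) pairs attached to a file, with read and write
    helpers analogous to the NFS operations. This models only the data
    layout, not the protocol; paths and attribute names are made up.

      named_attrs = {}   # path -> list of (attribute, value) pairs

      def set_named_attr(path, attribute, value):
          pairs = named_attrs.setdefault(path, [])
          pairs[:] = [(a, v) for a, v in pairs if a != attribute]
          pairs.append((attribute, value))

      def get_named_attr(path, attribute):
          for a, v in named_attrs.get(path, []):
              if a == attribute:
                  return v
          return None

      # Interpretation is left to applications; NFS just stores the bytes.
      set_named_attr("/export/report.txt", "review-status", b"approved")
      print(get_named_attr("/export/report.txt", "review-status"))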

21
NFS File Attributes
  • Some general mandatory file attributes in NFS.

22
NFS File Attributes
  • Some general recommended file attributes.

23
Network File System (NFS)
  • File sharing
  • File locking
  • Caching and replication

24
NFS File Sharing
  • When two or more users share the same file, it is
    necessary to define the semantics of reading and
    writing precisely to avoid problems
  • In single-processor systems the semantics
    normally state that when a read operation follows
    a write operation, the read returns the value just
    written. Similarly, when two writes happen in
    quick succession, followed by a read, the value
    read is the value stored by the last write. The
    system enforces an absolute time ordering and
    always returns the most recent value. This model
    is called UNIX semantics.
  • In a distributed system, UNIX semantics can be
    achieved easily as long as there is only one file
    server and clients do not cache files
  • Performance in such a single-server system is
    frequently poor
  • If local client caching is allowed, other
    problems occur
  • If a client locally modifies a cached file and
    shortly afterwards another client reads the same
    file from the server, the second client will get
    an obsolete file

25
Semantics of File Sharing
  • On a single processor, when a read follows a
    write, the value returned by the read is the
    value just written.
  • In a distributed system with caching, obsolete
    values may be returned.

26
Semantics of File Sharing
  • It is difficult to propagate all changes to
    cached files back to the server immediately
  • An alternative solution is to relax the semantics
    of file sharing, for example by defining the new
    rule "Changes to an open file are initially
    visible only to the process that modified the
    file"
  • The new rule does not change the behavior; it
    only defines that behavior to be correct. It is
    known as session semantics, and NFS supports it
    (see the sketch below).
  • If two or more clients are caching and modifying
    the same file at the same time under session
    semantics, the most recently processed close
    operation wins. Alternatively, the winning
    modification is one of the candidates, with the
    winner left unspecified
  • Immutable semantics: only create and read
    operations are supported, which simplifies the
    implementation of sharing and replication
  • Transactions: BEGIN_TRANSACTION, operations,
    END_TRANSACTION. The system guarantees that either
    all operations are executed correctly, or an error
    is returned and none of the operations takes
    effect. If two or more transactions run at the
    same time, the system ensures that the final
    result is correct (as if they had run one after
    another).
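  • A minimal sketch (Python) of session semantics: each client works on
    its own copy, changes become visible only when the file is closed,
    and with concurrent writers the last processed close wins.

      server_file = {"data": "v0"}

      class Session:
          def __init__(self):
              self.local = server_file["data"]   # open: copy the contents

          def write(self, data):
              self.local = data                  # visible only locally

          def close(self):
              server_file["data"] = self.local   # flush on close

      a, b = Session(), Session()
      a.write("A's changes"); b.write("B's changes")
      a.close(); b.close()
      print(server_file["data"])                 # "B's changes": last close wins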

27
Semantics of File Sharing
  • Four ways of dealing with the shared files in a
    distributed system.

28
File locking in NFS
  • Locking files can be problematic on stateless
    servers
  • In NFS v3, file locking is handled by a separate
    protocol and implemented by a stateful lock
    manager. Performance is poor and implementations
    can be faulty
  • In NFS v4, file locking is integrated into the
    NFS protocol, making locking simpler
  • There are only four operations related to
    locking, and NFS distinguishes read locks from
    write locks.
  • Multiple clients can simultaneously access the
    same part of a file provided they only read data.
    A write lock is needed to obtain exclusive access
    in order to modify part of a file
  • Locks are granted for a specified time (leases).
    A client must renew the lease if it wants to keep
    the lock. The server automatically removes locks
    that have not been renewed (see the sketch below)
  • In addition to these locking operations, there is
    also an implicit way to lock a file, referred to
    as a share reservation. It is independent of
    locking and can be used to implement NFS for
    Windows-based systems
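  • A minimal sketch (Python) of lease-based locking: a lock is granted
    for a fixed lease period, the client must renew it, and the server
    removes locks whose lease has expired. The lease length and data
    structures are illustrative only.

      import time

      LEASE_SECONDS = 30
      locks = {}   # (file, byte_range) -> (client_id, expiry_time)

      def grant_lock(file, byte_range, client_id):
          locks[(file, byte_range)] = (client_id, time.time() + LEASE_SECONDS)

      def renew_lock(file, byte_range, client_id):
          key = (file, byte_range)
          if key in locks and locks[key][0] == client_id:
              locks[key] = (client_id, time.time() + LEASE_SECONDS)
              return True
          return False   # lease already expired and removed

      def expire_locks():
          now = time.time()
          for key in [k for k, (_, exp) in locks.items() if exp < now]:
              del locks[key]         # server reclaims unrenewed locks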

29
File Locking in NFS
  • NFS version 4 operations related to file locking.

30
Client Caching
  • Like many distributed file systems, NFS makes
    extensive use of client caching to improve
    performance
  • Caching in NFS v3 has mainly been left outside
    the protocol. This has led to the implementation
    of different caching policies, most of which never
    guaranteed consistency
  • NFS v4 solves some of the consistency problems,
    but essentially still leaves cache consistency to
    be handled in an implementation-dependent way
  • The general caching model of NFS consists of a
    client having a memory cache that contains data
    previously read from the server. In addition,
    there may also be a disk cache, added as an
    extension to the memory cache and using the same
    consistency parameters.
  • Typically, clients cache file data, attributes,
    file handles and directories. Different strategies
    exist to handle the consistency of the cached data

31
Client Caching
  • Client-side caching in NFS.

32
Client Caching
  • NFS v4 supports two approaches for caching file
    data
  • The simplest approach: the client opens a file,
    caches data from read operations, and applies
    write operations to the cache; when the client
    closes the file and modifications have been made,
    the cached data must be flushed back to the server
    (session semantics)
  • The second approach: once (part of) a file has
    been cached, a client can keep its data in the
    cache even after closing the file. NFS requires
    that when a client opens a previously closed file
    that has been (partly) cached, the client must
    immediately revalidate the cached data.
    Revalidation takes place by checking when the file
    was last modified and invalidating the cache in
    case it contains stale data (see the sketch
    below).
  • The server may delegate some of its rights to a
    client when a file is opened. Open delegation
    takes place when the client machine is allowed to
    locally handle open and close operations from
    other clients on the same machine. This reduces
    the need to communicate with the server.
  • The server is able to recall the delegation, for
    example if another client needs to obtain access
    rights. Recalling is implemented as a callback
    RPC, which requires that the server keep track of
    the clients to which it has delegated a file
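  • A minimal sketch (Python) of the second approach, revalidation on
    open: the client keeps cached data across close, and on the next open
    compares the file's last-modified time on the server with the time
    recorded when the data was cached. The helper functions stand in for
    over-the-wire attribute and read requests.

      cache = {}   # path -> {"data": ..., "mtime": ...}

      def server_getattr(path):
          return {"mtime": 1700000000.0}     # placeholder attribute fetch

      def server_read(path):
          return b"fresh contents", server_getattr(path)["mtime"]

      def open_cached(path):
          attrs = server_getattr(path)
          entry = cache.get(path)
          if entry is None or entry["mtime"] != attrs["mtime"]:
              data, mtime = server_read(path)   # stale: refetch from server
              cache[path] = {"data": data, "mtime": mtime}
          return cache[path]["data"]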

33
Client Caching
  • Using the NFS version 4 callback mechanism to
    recall file delegation.

34
Replication
  • NFS version 4 provides minimal support for file
    replication
  • Only whole file systems can be replicated
  • Support is provided in the form of the
    FS_LOCATIONS attribute, which is recommended for
    each file
  • The attribute contains a list of locations where
    replicas of the file system containing the
    associated file may possibly be found
  • Each location is given as a DNS name or an IP
    address
  • It is up to a specific NFS implementation to
    actually provide replicated servers. NFS v4 does
    not specify how replication should take place

35
Network File System (NFS)
  • Fault tolerance
  • Security

36
Fault tolerance
  • Fault tolerance in NFS v3 has hardly been an
    issue, because the NFS protocol did not require
    servers to be stateful, so no state could ever be
    lost. Of course, the separate lock managers were
    stateful, and special measures were needed for
    them
  • In NFS v4, statefulness occurs in file locking
    and delegation
  • In addition, special measures need to be taken to
    handle the unreliability of the RPC mechanism
    underlying the NFS protocol
  • The RPC stubs can be configured to use reliable
    TCP or unreliable UDP
  • If an RPC reply is lost, the client can send the
    request again and the server may execute the
    operation again (more than once, where exactly
    once was intended)
  • This situation can be handled with a
    duplicate-request cache (see the sketch below)
  • Each RPC request from a client carries a unique
    transaction identifier (XID) in its header, which
    is cached by the server when the request comes
    in.
  • As long as the server has not sent a reply, it
    will indicate that the RPC request is in progress.
    When the request has been handled, its associated
    reply is also cached, after which the reply is
    returned to the client.
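  • A minimal sketch (Python) of a duplicate-request cache keyed on the
    XID: a retransmitted request either gets an "in progress" indication
    (no reply yet) or the cached reply, so the operation itself is not
    executed twice. The reply text is illustrative.

      IN_PROGRESS = object()
      request_cache = {}   # xid -> IN_PROGRESS or the cached reply

      def handle_request(xid, operation):
          if xid in request_cache:
              cached = request_cache[xid]
              return "in progress" if cached is IN_PROGRESS else cached
          request_cache[xid] = IN_PROGRESS   # remember we have seen this XID
          reply = operation()                # execute the operation once
          request_cache[xid] = reply         # cache reply for retransmissions
          return reply

      print(handle_request(42, lambda: "wrote 4096 bytes"))
      print(handle_request(42, lambda: "wrote 4096 bytes"))  # from cache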

37
RPC Failures
  • Three situations for handling retransmissions.
  • The request is still in progress
  • The reply has just been returned
  • The reply was returned some time ago, but was lost.

38
Fault tolerance
  • File locking
  • A client must renew its lease, but false removal
    may happen, for example if the network is
    (temporarily) partitioned and the client's renewal
    messages do not reach the server. No special
    measures are taken to handle such situations.
  • When the server crashes and subsequently
    recovers, it may have lost the information on the
    locks it granted to clients. The solution is to
    enter a grace period in which clients can reclaim
    locks that were previously granted to them. In
    this way the server rebuilds its previous state
    with respect to locks. During the grace period,
    normal lock requests are refused (see the sketch
    below).
  • Using leases also introduces numerous problems.
    For example, leasing requires clocks to be
    synchronized, and that may not be easily solved in
    wide-area systems.
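  • A minimal sketch (Python) of lock recovery after a server crash:
    during a grace period only reclaim requests (for locks previously
    held) are accepted, while ordinary new lock requests are refused.
    The grace period length is illustrative.

      import time

      GRACE_SECONDS = 90

      class LockServer:
          def __init__(self):
              self.recovered_at = time.time()   # server just restarted
              self.locks = {}                   # file -> client_id

          def in_grace_period(self):
              return time.time() - self.recovered_at < GRACE_SECONDS

          def lock(self, file, client_id, reclaim=False):
              if self.in_grace_period() and not reclaim:
                  return "denied: grace period, only reclaims accepted"
              if file in self.locks and self.locks[file] != client_id:
                  return "denied: held by another client"
              self.locks[file] = client_id      # grant or reclaim the lock
              return "granted"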

39
Fault tolerance
  • Delegation
  • Open delegation introduces additional problems
    when a client or server crashes
  • If a client to which the opening of a file has
    been delegated crashes, it presumably has not
    propagated its updates to the server. In that
    case, unless the client's updates are saved
    locally on stable storage, full recovery of the
    file will be impossible.
  • In any case, the client is made partially
    responsible for file recovery
  • When the server crashes and subsequently
    recovers, it follows a procedure similar to lock
    recovery. A client to which the opening of a file
    has been delegated will reclaim that delegation
    when the server comes up again. However, the
    server forces the client to flush all
    modifications back to the server, effectively
    recalling the delegation.
  • As a result, the file server is up to date with
    respect to the most recent modifications of each
    file it had delegated to a client. The server is
    again in full charge of the file and may decide to
    delegate it to another client

40
Security
  • The basic idea behind NFS is that a remote file
    system is presented to clients as if it were a
    local file system. Therefore, the security of NFS
    mainly focuses on the communication between a
    client and a server. Secure communication means
    that a secure channel between the two should be
    set up.
  • Because NFS is layered on top of an RPC system,
    setting up a secure channel in NFS boils down to
    establishing secure RPCs
  • In addition to secure RPCs, it is necessary to
    control access to files, which is handled by means
    of access control file attributes in NFS. A file
    server is in charge of verifying the access rights
    of its clients
  • The NFS security architecture is thus based on
    secure RPCs and access control attributes

41
Security
  • The NFS security architecture.

42
Security
  • Secure RPCs
  • In NFS v3, secure RPC meant that only
    authentication was taken care of. Possible ways to
    do authentication were system authentication (user
    ID and group ID), secure NFS (Diffie-Hellman key
    exchange to establish a session key) and Kerberos
    (tickets).
  • In NFS v4, security is enhanced by the support
    for RPCSEC_GSS.
  • RPCSEC_GSS is a general security framework: it
    provides hooks for different authentication
    methods and also supports message integrity and
    confidentiality, which were not supported in NFS
    v3
  • RPCSEC_GSS is based on a standard interface for
    security services, namely the GSS-API, on top of
    which it is layered.
  • For NFS v4, RPCSEC_GSS should be configured with
    support for Kerberos v5 and LIPKEY (a public-key
    system that allows clients to be authenticated
    using a password, while servers can be
    authenticated using a public key)

43
Secure RPCs
  • Secure RPC in NFS version 4.

44
Security
  • Access control
  • Access control is supported by means of an ACL
    file attribute
  • This attribute is a list of access control
    entries, where each entry specifies the access
    rights for a specific user or group
  • The operations that NFS distinguishes with
    respect to access control are relatively
    straightforward
  • Compared to the simple access control mechanisms
    in, for example, UNIX systems, NFS distinguishes
    many different kinds of operations
  • NFS has richer semantics than most UNIX systems

45
Access Control
  • The classification of operations recognized by
    NFS with respect to access control.

46
Coda distributed file system
  • Overview
  • Communication and naming
  • File sharing and transactional semantics

47
Overview
  • Coda was designed to be a scalable, secure and
    highly available distributed file system
  • Coda is a descendant of version 2 of the Andrew
    File System (AFS)
  • AFS was designed to support a large community of
    users; to meet this requirement, nodes are
    partitioned into two groups. One group consists of
    a relatively small number of dedicated Vice file
    servers, which are centrally administered. The
    other group consists of a very much larger
    collection of Virtue workstations that give users
    and processes access to the file system
  • Coda follows the same organization as AFS. Every
    Virtue workstation hosts a user-level process
    called Venus, whose role is similar to that of an
    NFS client. A Venus process is responsible for
    providing access to the files that are maintained
    by the Vice file servers. Venus is also
    responsible for continuing operation even when
    access to the file servers is (temporarily)
    impossible.

48
Architecture
  • The overall organization of Coda.

49
Architecture
  • Internal architecture of a Virtue workstation
  • An important point is that Venus runs as a
    user-level process
  • The Virtual File System (VFS) layer intercepts
    all calls from client applications and forwards
    these calls either to the local file system or to
    Venus
  • This organization with a VFS layer is the same as
    in NFS
  • Venus communicates with the Vice file servers
    using a user-level RPC system
  • The RPC system is constructed on top of UDP
    datagrams and provides at-most-once semantics
  • On the server side there are three different
    types of processes
  • Vice servers are responsible for maintaining a
    local collection of files
  • Trusted Vice machines are allowed to run an
    authentication server
  • Update processes are used to keep
    meta-information on the file system consistent at
    each Vice server
  • Coda appears to its users as a traditional
    UNIX-based file system
  • Coda provides a globally shared name space that
    is maintained by the Vice servers. Clients have
    access to this name space via the local /afs
    directory. When a client looks up a name in this
    directory, Venus ensures that the appropriate part
    of the shared name space is mounted locally

50
Architecture
  • The internal organization of a Virtue workstation.

51
Communication
  • Interprocess communication is performed using
    RPCs
  • The RPC2 system used by Coda is more
    sophisticated than the traditional RPC systems
    used, for example, in NFS
  • RPC2 offers reliable RPCs on top of the
    (unreliable) UDP protocol
  • Each time a remote procedure is called, the RPC2
    client code starts a new thread that sends an
    invocation request to the server and subsequently
    blocks until it receives an answer. As request
    processing may take an arbitrary time to complete,
    the server regularly sends back messages to the
    client to let it know it is still working on the
    request. If the server dies, sooner or later the
    thread will notice and return an error to the
    calling application (see the sketch below)
  • An interesting aspect of RPC2 is its support for
    side effects. A side effect is a mechanism by
    which client and server can communicate using an
    application-specific protocol. For example,
    opening a file at a video server can be done as a
    continuous data stream. There are routines for
    setting up the connection and transferring the
    data
  • RPC2 also supports multicasting, which is used to
    invalidate client caches in parallel
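  • A minimal sketch (Python, using threads and queues in place of UDP)
    of the RPC2 idea that the server periodically sends "still working"
    messages while handling a long request, so the blocked client thread
    can distinguish a slow server from a dead one. Timings and message
    names are illustrative.

      import queue, threading, time

      def server(request_q, reply_q):
          request_q.get()                    # receive the invocation request
          for _ in range(3):                 # long-running request handling
              time.sleep(0.2)
              reply_q.put(("busy", None))    # keepalive: still working
          reply_q.put(("reply", "result"))

      def client_call(request_q, reply_q, timeout=1.0):
          request_q.put("invoke")
          while True:
              try:
                  kind, value = reply_q.get(timeout=timeout)
              except queue.Empty:
                  return None                # no keepalive either: give up
              if kind == "reply":
                  return value               # keepalives just reset the timer

      req, rep = queue.Queue(), queue.Queue()
      threading.Thread(target=server, args=(req, rep), daemon=True).start()
      print(client_call(req, rep))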

52
Communication
  • Side effects in Coda's RPC2 system.

53
Communication
  • Sending invalidation messages one at a time.
  • Sending invalidation messages in parallel.

54
Naming
  • Coda maintains a naming system analogous to that
    of UNIX
  • Files are grouped into units referred to as
    volumes. A volume is similar to a UNIX disk
    partition, but generally has a much smaller
    granularity. It corresponds to a partial subtree
    in the shared name space. Like disk partitions,
    volumes can be mounted.
  • Volumes form the basic unit by which the entire
    name space is constructed. This construction takes
    place by mounting volumes at mount points. Clients
    can only mount the root of a volume. Volumes also
    form the unit for server-side replication
  • The granularity of volumes means that a name
    lookup will often cross several mount points. To
    support naming transparency, a Vice file server
    returns mounting information to a Venus process
    during name lookup. This allows Venus to
    automatically mount a volume into the client's
    name space when necessary. This is similar to
    crossing mount points as supported in NFS v4
  • The shared name space is accessible by means of
    the subdirectory /afs in a client's local name
    space

55
Naming
  • Clients in Coda have access to a single shared
    name space.

56
File identifiers
  • Because the collection of shared files may be
    replicated and distributed across multiple Vice
    servers, it becomes important to uniquely identify
    each file in such a way that it can be tracked to
    its physical location, while at the same time
    maintaining replication and location transparency
  • Each file in Coda is contained in exactly one
    volume. Because a volume may be replicated across
    several servers, Coda makes a distinction between
    logical and physical volumes.
  • A logical volume represents a possibly replicated
    physical volume and has an associated Replicated
    Volume Identifier (RVID). An RVID is a location-
    and replication-independent volume identifier.
    Multiple replicas may be associated with the same
    RVID
  • Each physical volume has its own Volume
    Identifier (VID), which identifies a specific
    replica in a location-independent way.
  • Each file has a 96-bit file identifier: the first
    part is a 32-bit RVID, and the second part is a
    64-bit file identifier that uniquely identifies
    the file within a volume (see the sketch below)
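  • A minimal sketch (Python) of a 96-bit Coda file identifier: a 32-bit
    RVID naming the (possibly replicated) logical volume, plus a 64-bit
    file identifier within that volume. Resolution maps the RVID to VIDs
    and VIDs to servers; the lookup tables and server names are made up.

      from dataclasses import dataclass

      @dataclass(frozen=True)
      class CodaFileId:
          rvid: int       # 32-bit replicated volume identifier
          file_id: int    # 64-bit identifier within the volume

      rvid_to_vids = {0x0001: [0x1001, 0x1002]}           # logical -> physical
      vid_to_server = {0x1001: "vice1", 0x1002: "vice2"}  # physical -> server

      def resolve(fid):
          """Return the servers holding a replica of the file's volume."""
          return [vid_to_server[vid] for vid in rvid_to_vids[fid.rvid]]

      print(resolve(CodaFileId(rvid=0x0001, file_id=0x2A)))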

57
File Identifiers
  • The implementation and resolution of a Coda file
    identifier.

58
Sharing files
  • When a client successfully opens a file, an
    entire copy of the file is transferred to the
    client's machine. The server records that the
    client has a copy of this file.
  • If a client has opened the file for writing and
    another client wants to open the same file, the
    second open will fail. This is caused by the fact
    that the server has recorded that the first client
    might have modified the file. If the first client
    had opened the file for reading and the other for
    writing, both opens would have succeeded.
  • If several copies of the same file are stored
    locally on clients, and one client modifies its
    copy and closes it, the file is transferred back
    to the server. The other clients may proceed to
    read their own copies, despite the fact that those
    copies are actually outdated.
  • The reason for this apparently inconsistent
    behavior is that a session is treated as a
    transaction in Coda.

59
Sharing Files
  • The transactional behavior in sharing files in
    Coda.

60
Transactional Semantics
  • In Coda, the notion of a network partition plays
    a crucial role in defining transactional semantics
  • A partition is a part of the network that is
    isolated from the rest and which consists of a
    collection of clients or servers, or both. The
    basic idea is that a series of file operations
    should continue to execute in the presence of
    conflicting operations across different
    partitions. Two operations are said to conflict if
    they both operate on the same data and at least
    one of them is a write operation.
  • Coda recognizes different types of sessions. For
    example, each UNIX system call is associated with
    a different session type. More complex session
    types are the ones that start with a call to open.
  • As an example, the store session type starts with
    opening a file for writing as a specific user. The
    metadata entries associated with the file are read
    and modified as needed.
  • Conflicting transactions force a client to save
    its local version of the file for manual
    reconciliation.

61
Transactional Semantics
  • The metadata read and modified for a store
    session type in Coda.

62
Coda distributed file system
  • Caching and replication

63
Client Caching
  • Client-side caching is crucial to the operation
    of Coda for two reasons. First, caching is done
    to achieve scalability. Second, caching provides
    a higher degree of fault tolerance as the client
    becomes less dependent on the availability of the
    server. For these reasons, clients in Coda always
    cache entire files
  • When a file is opened for either reading or
    writing, an entire copy of the file is transferred
    to the client, where it is subsequently cached
  • Cache coherence in Coda is maintained by means of
    callbacks. The server keeps track of which clients
    have a copy of a file; the server is said to
    record a callback promise for a client. When a
    client updates its local copy for the first time,
    it notifies the server, which sends an
    invalidation message to the other clients. The
    invalidation message is called a callback break,
    because the server then discards the callback
    promise it held for each client to which it just
    sent an invalidation (see the sketch below)
  • As long as a client knows it has an outstanding
    callback promise at the server, it can safely
    access the file locally.
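  • A minimal sketch (Python) of the callback scheme: when a client
    caches a file, the server records a callback promise; when some
    client updates the file, the server sends callback breaks to the
    other clients and discards their promises. File and client names are
    illustrative.

      promises = {}   # filename -> set of clients holding a callback promise

      def client_opens(filename, client_id):
          promises.setdefault(filename, set()).add(client_id)  # record promise

      def client_updates(filename, client_id):
          broken = sorted(promises.get(filename, set()) - {client_id})
          promises[filename] = {client_id}   # other promises are discarded
          return broken                      # callback breaks to send

      client_opens("report.txt", "A")
      client_opens("report.txt", "B")
      print(client_updates("report.txt", "A"))   # ['B'] gets a callback break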

64
Client Caching
  • The use of local copies when opening a session
    in Coda.

65
Server replication
  • Coda allows file servers to be replicated; the
    unit of replication is a volume. The collection of
    servers that have a copy of a volume is known as
    that volume's Volume Storage Group (VSG)
  • In the presence of failures, a client may not
    have access to all servers in a VSG. The client
    maintains a list of the servers that are
    accessible, called the Accessible Volume Storage
    Group (AVSG). If the AVSG is empty, the client is
    said to be disconnected.
  • Coda uses a replicated-write protocol to maintain
    consistency of a replicated volume. It uses a
    variant of Read-One, Write-All (see the sketch
    below).
  • If a client needs to read a file, it contacts one
    of the members of its AVSG. Writing is different:
    when the file is closed, the client transfers the
    modified file back in parallel to each member of
    the AVSG. The parallel transfer is done with
    multicast RPC2.
  • This scheme works fine as long as there are no
    failures (that is, for each client, that client's
    AVSG of a volume is the same as its VSG)
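  • A minimal sketch (Python) of the Read-One, Write-All variant used
    with the AVSG: reads go to a single accessible server, while on close
    a modified file is written to every server in the AVSG (in Coda this
    uses multicast RPC2). Server names are illustrative.

      vsg = ["vice1", "vice2", "vice3"]    # all replicas of the volume
      avsg = ["vice1", "vice3"]            # replicas currently reachable
      store = {s: {} for s in vsg}         # per-server file contents

      def read(filename):
          if not avsg:
              raise RuntimeError("disconnected: AVSG is empty")
          return store[avsg[0]].get(filename)     # read from one member

      def close_after_write(filename, data):
          for server in avsg:                     # write to all reachable members
              store[server][filename] = data

      close_after_write("notes.txt", "v1")
      print(read("notes.txt"))                    # 'v1'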

66
Server Replication
  • Two clients with different AVSG for the same
    replicated file.