1
Recap
  • Consistency Models
  • Fault Tolerance
  • Reliable Communication
  • Security

2
Today
  • Distributed File Systems

3
Distributed File Systems
  • The basis of many distributed applications
  • Allow multiple processes to share data over long
    periods of time in a secure and reliable way
  • We'll look at two of them:
  • Sun Network File System (NFS)
  • Coda (a descendant of the Andrew File System, AFS)

4
Sun Network File System (NFS)
  • Originally developed by Sun for its UNIX systems
  • Each file server provides a standardized view of
    its local file system
  • NFS has been around a long time
  • Version 1 internal to Sun, never released
  • Version 2 released in SunOS 2.0, 1985
  • Version 3 released in Solaris, 1994
  • Version 4 not yet released

5
NFS Overview
  • NFS isn't really a file system; it's a
    collection of protocols that provide clients with
    a model of a distributed file system
  • Clients use local file system calls, but the file
    system interface is really an interface to a
    Virtual File System (VFS)
  • The VFS determines whether the call gets passed
    to a local file system, or to an NFS client that
    communicates with an NFS server via RPC (a
    dispatch sketch follows below)
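A minimal sketch, in Python, of the VFS dispatch just described: ordinary file-system calls are routed, by mount point, either to the local file system or to an NFS client stub that would issue an RPC. All class and server names here are hypothetical, not real kernel interfaces.

    class LocalFS:
        def read(self, path):
            return f"<local bytes of {path}>"

    class NFSClient:
        """Stub standing in for the NFS client that issues RPCs."""
        def __init__(self, server):
            self.server = server
        def read(self, path):
            return f"<RPC READ to {self.server} for {path}>"

    class VFS:
        def __init__(self):
            # mount table: longest matching prefix decides who handles a path
            self.mounts = {"/": LocalFS(), "/mnt/nfs": NFSClient("fileserver")}
        def read(self, path):
            mount = max((m for m in self.mounts if path.startswith(m)), key=len)
            return self.mounts[mount].read(path)

    vfs = VFS()
    print(vfs.read("/etc/passwd"))    # handled by the local file system
    print(vfs.read("/mnt/nfs/data"))  # forwarded to the NFS client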

6
NFS Architecture
  • NFS is independent of the underlying local file
    systems (it can live on UFS, EXT2, HFS, etc.)

7
NFS File System Model
  • NFS's file system model is very similar to the
    UFS file system model
  • Files are uninterpreted sequences of bytes
  • Hierarchical organization into a naming graph in
    which nodes represent directories and files
  • Hard links as well as symbolic links
  • Files are named, but are accessed using file
    handles
  • Each file has a set of attributes that can be
    examined and changed

8
Communication in NFS
  • NFS is designed to be platform-, network
    architecture-, and transport-independent
  • All communication in NFS uses the Open Network
    Computing RPC (ONC RPC) protocol
  • Every NFS operation can be implemented as a
    single RPC to a file server
  • NFS Version 4 allows several such RPCs to be
    grouped into a single request, but with no
    transactional semantics (see the sketch below)
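A minimal sketch of that grouping, with hypothetical operation handlers: the operations in a grouped request run in order, execution stops at the first failure, and earlier effects are not rolled back, which is exactly what "no transactional semantics" means here.

    def run_compound(handlers, ops):
        """Run ops in order; stop at the first error; never undo."""
        results = []
        for name, arg in ops:
            try:
                results.append((name, handlers[name](arg)))
            except Exception as err:
                results.append((name, f"error: {err}"))
                break  # later ops are skipped; earlier ones are NOT undone
        return results

    # Toy handlers standing in for real NFSv4 operations.
    handlers = {
        "PUTFH":  lambda fh: f"current fh = {fh}",
        "LOOKUP": lambda name: f"looked up {name}",
        "READ":   lambda rng: f"read bytes {rng[0]}..{rng[1]}",
    }
    print(run_compound(handlers, [
        ("PUTFH", "rootfh"), ("LOOKUP", "file1"), ("READ", (0, 4096)),
    ]))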

9
Communication in NFS
  • Reading data from a file: (a) depicts NFS
    version 3, (b) depicts NFS version 4

10
Processes in NFS
  • Servers in NFSv2 and NFSv3 are stateless, but
    NFSv4 servers maintain some state about clients
  • NFSv4 is expected to work over wide-area as well
    as local-area networks, so clients need to use
    caches for efficiency - it helps to have the
    server maintain information on files being used,
    such as read/write leases (a lease-table sketch
    follows below)
  • NFSv4 supports OPEN and CLOSE operations, where
    NFSv2 and NFSv3 did not, so an NFSv4 server has
    to maintain state on open/closed files
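A minimal sketch, under assumed semantics, of the kind of per-client state an NFSv4 server might keep for read/write leases. The field names and the conflict rule are illustrative assumptions, not the actual protocol.

    import time

    class LeaseTable:
        def __init__(self, lease_secs=90):
            self.lease_secs = lease_secs
            self.leases = {}  # file -> (client, "read" or "write", expiry)

        def grant(self, file, client, mode):
            self.leases[file] = (client, mode, time.time() + self.lease_secs)

        def conflicts(self, file, client, mode):
            """Assumed rule: someone else's unexpired write lease conflicts
            with any access, and any unexpired lease conflicts with a write."""
            entry = self.leases.get(file)
            if entry is None:
                return False
            holder, held_mode, expiry = entry
            if holder == client or expiry < time.time():
                return False
            return mode == "write" or held_mode == "write"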

11
Naming in NFS
  • Servers can export whole or partial file systems,
    and clients can mount these into their local name
    spaces

12
Naming in NFS
  • Exports don't work recursively (an NFS server
    can't mount and re-export another server's file
    system)

13
Synchronization in NFS
  • NFS implements session semantics for file
    sharing
  • Changes to an open file are initially visible
    only to the process that modified the file, and
    are made visible to other machines only when the
    file is closed
  • Local caches are generally used - so if two
    people on two machines are editing the same file
    over NFS and both save their copies at around the
    same time, it's not determined which copy will be
    the one that is actually saved (see the sketch
    below)
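A minimal sketch of session semantics with two clients: each session works on a private copy taken at open, the server sees the result only at close, and the last close silently wins. All names are hypothetical.

    class Server:
        def __init__(self):
            self.files = {"doc": "v0"}

    class Session:
        def __init__(self, server, name):
            self.server, self.name = server, name
            self.copy = server.files[name]   # private copy taken at open()
        def write(self, data):
            self.copy = data                 # invisible to other clients
        def close(self):
            self.server.files[self.name] = self.copy  # published only now

    srv = Server()
    a, b = Session(srv, "doc"), Session(srv, "doc")
    a.write("edit by A")
    b.write("edit by B")
    a.close()
    b.close()
    print(srv.files["doc"])  # "edit by B": B closed last, A's edit is lost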

14
Synchronization in NFS
  • There's a protocol for file locking in NFSv3,
    including the use of a separate lock manager
    (because servers are stateless)
  • It's almost never used, because it's so intricate
    that few implementations actually work well
  • In NFSv4, file locking is integrated into the
    file access protocol (this is possible because
    servers are stateful)
  • Recovery mechanisms are necessary to prevent
    deadlocks

15
Security in NFSv3
  • The most widely used authentication method in
    NFS is system authentication, where the NFS
    client passes its effective UID and GID to the
    server along with a list of groups it claims to
    be a member of - we've discussed this before
  • Secure NFS uses Diffie-Hellman key exchange to
    establish a session key for a secure channel
    (sketched below)
  • Kerberos v4 is also sometimes used for NFS
    authentication
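A minimal sketch of the Diffie-Hellman exchange underlying that session-key setup, with deliberately toy numbers; a real deployment uses a much larger prime and authenticated long-term keys.

    p, g = 2_147_483_647, 5    # public modulus (a prime) and base; toy values

    client_secret = 123456     # each side picks a private exponent
    server_secret = 654321

    client_public = pow(g, client_secret, p)   # these two values are
    server_public = pow(g, server_secret, p)   # exchanged over the network

    # both sides derive the same session key without ever transmitting it
    key_at_client = pow(server_public, client_secret, p)
    key_at_server = pow(client_public, server_secret, p)
    assert key_at_client == key_at_server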

16
Security in NFSv4
  • NFSv4 supports RPCSEC_GSS, a general security
    framework that has plugins for multiple security
    mechanisms and supports both message integrity
    and confidentiality as well as authentication
  • RPCSEC_GSS is based on GSS-API, a standard
    interface for security services
  • The primary authentication method is Kerberos v5,
    and an alternative is LIPKEY, a public-key system
  • Access control is done with ACLs

17
Security in NFSv4
  • Because NFSv4 has no built-in security mechanism
    of its own, its security architecture is
    adaptable
  • If an exploit is found for some security
    mechanism, another can be plugged in instead
  • Multiple security mechanisms can be used by the
    same NFS server

18
Coda File System
  • Developed at CMU in the early 1990s, and now
    integrated with many UNIX-based OSes
  • Designed to be scalable, secure, highly available
  • Highly available ⇒ high failure transparency
  • Descended from version 2 of the Andrew File
    System (AFS), also developed at CMU - much of the
    architecture is the same as AFS
  • AFS was designed to support the entire CMU
    community, approximately 10,000 workstations

19
Coda Architecture
  • Coda nodes are partitioned into two groups
  • A relatively small number of Vice file servers,
    which are centrally administered
  • A much larger number of Virtue workstations that
    give users and processes access to the file
    system
  • Every Virtue workstation hosts a user-level
    process called Venus, which is much like an NFS
    client
  • Venus allows a workstation to continue operation
    even if access to the file servers is
    (temporarily) impossible

20
Coda Architecture
21
Coda Architecture
  • Venus communicates with Vice file servers using a
    user-level RPC system
  • RPC system is constructed on top of UDP
    datagrams, and has at-most-once semantics
  • Three different server-side processes
  • Vice file servers do most of the work
  • Trusted Vice machines run an authentication
    server
  • Update processes are used to keep metainformation
    on the file system consistent at each Vice server

22
Coda Architecture
  • Coda appears to its users as a traditional
    UNIX-based file system
  • Supports most of the VFS operations, just like
    NFS
  • Unlike NFS, Coda has a globally shared name space
    maintained by the Vice servers
  • Clients have access to the name space through a
    special subdirectory in their local name space,
    such as /afs

23
Communication in Coda
  • Communication in Coda is performed using an RPC
    system called RPC2, which implements reliable
    RPCs on top of UDP
  • Each time an RPC is called, the RPC2 client
    starts a new thread that sends an invocation and
    waits for an answer
  • This thread also receives heartbeats from the
    server indicating that the request is still being
    processed
  • If the server dies, sooner or later the thread
    will notice and report a failure to the calling
    application

24
Communication in Coda
  • One special feature of RPC2 is side effects,
    which allow a client and server to communicate
    using an application-specific protocol

25
Communication in Coda
  • Another feature of RPC2 is multicast support,
    which allows (for instance) a server to
    invalidate multiple clients' local copies of a
    file simultaneously

26
Naming in Coda
  • Files are grouped into volumes, similar to disk
    partitions but typically of smaller granularity
  • A volume corresponds to a partial subtree in the
    shared name space maintained by the servers
  • Usually a collection of files associated with a
    user
  • Volumes are the basic unit of the namespace,
    which is constructed by mounting volumes at mount
    points
  • Volumes form the unit for server-side replication

27
Naming in Coda
  • Since volumes are of small granularity, a single
    name lookup will often cross several mount points
  • During name lookup, a Vice file server returns
    mounting information to the Venus process, so
    that the volumes can be automatically mounted as
    necessary
  • When a volume from the shared name space is
    mounted in the client's name space, Venus follows
    the structure of the shared name space

28
Naming in Coda
  • Shared files have the same name as seen from all
    clients - this is fundamentally different from NFS

29
Naming in Coda
  • Each file is contained in exactly one volume
  • Volumes can be replicated, so Coda distinguishes
    between logical and physical volumes, where a
    logical volume represents a possibly replicated
    physical volume
  • Each logical volume has a Replicated Volume
    Identifier (RVID), and each physical volume has
    its own Volume Identifier (VID)
  • Each file has a 96-bit file identifier: 32 bits
    are an RVID and 64 bits are a unique file ID (see
    the packing sketch below)
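A minimal sketch of packing that 96-bit identifier as a 32-bit RVID concatenated with a 64-bit file ID; the exact bit layout is an assumption for illustration.

    def pack_fid(rvid: int, file_id: int) -> int:
        assert 0 <= rvid < 2**32 and 0 <= file_id < 2**64
        return (rvid << 64) | file_id            # one 96-bit integer

    def unpack_fid(fid: int) -> tuple[int, int]:
        return fid >> 64, fid & (2**64 - 1)      # (rvid, file_id)

    fid = pack_fid(0xCAFE, 42)
    assert unpack_fid(fid) == (0xCAFE, 42)

Note that the identifier contains the RVID rather than a VID, so a file's name stays valid no matter which physical replica ends up serving it.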

30
Naming in Coda
  • Implementation and resolution of a Coda file
    identifier

31
Synchronization in Coda
  • Unlike NFS (and AFS), Coda supports a form of
    transactional semantics
  • One of its goals is to support high availability,
    so Coda ensures that a client can use files it
    has cached locally even if it's disconnected from
    the network
  • When a client opens a file, it gets a local copy
    of the file - at most one client can open a file
    for writing, and if a client has already opened a
    file for writing, no other client can open it at
    all

32
Synchronization in Coda
  • A client can continue to read its local copy of a
    file, even if that copy is outdated (because some
    other client wrote a new version to the server)

33
Synchronization in Coda
  • The notion of a network partition, a part of the
    network that is isolated from the rest, is
    critical in defining Coda's transactional
    semantics
  • The basic idea: a series of file operations
    should continue to execute even in the presence
    of conflicting operations across different
    partitions
  • Ideally, we'd like one-copy serializability -
    concurrent execution of operations by two
    processes is equivalent to a joint serial
    execution of those operations

34
Synchronization in Coda
  • The main problem is recognizing serializable
    executions after they have taken place within a
    partition
  • Coda addresses this problem by interpreting
    sessions (open, read/write, close) as transactions
  • Most UNIX system calls are independent
    transactions
  • These sessions are of various types, and Coda
    knows in advance for each type what metadata will
    be read or modified by a session of that type

35
Synchronization in Coda
  • For example, the store session type starts by
    opening a file for writing on behalf of a user,
    and ends by closing the file
  • Such a session reads, but does not modify, the
    file identifier and the access rights of the file
    - on the other hand, it both reads and
    (potentially) modifies the last modification
    time, file length, and file contents
  • The upshot is that it's easier to recognize
    conflicting operations when there are separate
    locks for all the different metadata (see the
    sketch below)
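A minimal sketch of that idea: a table of assumed read/write sets per session type, and a check that flags two session types as conflicting when one writes metadata the other touches. The "store" entry mirrors the example above; the other entry and the exact tables are illustrative assumptions, not Coda's actual ones.

    READ_WRITE_SETS = {
        "store": {  # open for writing ... close, as in the example above
            "reads":  {"file_id", "access_rights",
                       "mod_time", "length", "contents"},
            "writes": {"mod_time", "length", "contents"},
        },
        "setrights": {  # hypothetical session type that edits access rights
            "reads":  {"file_id", "access_rights"},
            "writes": {"access_rights"},
        },
    }

    def sessions_conflict(a: str, b: str) -> bool:
        """Conflict if one session writes what the other reads or writes."""
        sa, sb = READ_WRITE_SETS[a], READ_WRITE_SETS[b]
        return bool(sa["writes"] & (sb["reads"] | sb["writes"]) or
                    sb["writes"] & (sa["reads"] | sa["writes"]))

    # True: "setrights" modifies access rights, which "store" reads.
    print(sessions_conflict("store", "setrights"))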

36
Synchronization in Coda
  • All locks are acquired at the start of a session,
    making it essentially a two-phase locking (2PL)
    system
  • If there are partitions, conflicts may need to be
    resolved
  • When a partition occurs that disconnects a client
    and server during a session, Venus allows the
    client to continue and finish the session as if
    nothing happened
  • When possible, it transfers the updates to the
    server in the order they occurred at the client

37
Synchronization in Coda
  • Each file has a version number that indicates how
    many updates have taken place since the file was
    created - these are used to detect conflicts
  • The next update for a file f from the session of
    a client whose connection has been restored can
    be accepted if and only if
    V_now + 1 = V_client + N_updates
  • V_client is the version number acquired when f was
    transferred from the server at the start of the
    session, N_updates is the number of updates in the
    session accepted so far by the server during
    reintegration (counting the update under
    consideration), and V_now is the current version
    number of f on the server
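A minimal sketch of that acceptance test during reintegration: counting the candidate update in N_updates, each accepted update must satisfy V_now + 1 = V_client + N_updates, which holds exactly when every version bump since the session began came from this session.

    def reintegrate(server_version, v_client, session_updates):
        v_now, accepted = server_version, 0
        for n in range(1, session_updates + 1):  # n = N_updates, counting
            if v_now + 1 == v_client + n:        # the update being checked
                v_now += 1                       # accepted: f's version bumps
                accepted += 1
            else:
                break  # conflict: f was updated in another partition
        return accepted, v_now

    # Nobody else touched f: all 3 updates from the session are accepted.
    print(reintegrate(server_version=7, v_client=7, session_updates=3))
    # f reached version 9 in another partition: conflict, nothing accepted.
    print(reintegrate(server_version=9, v_client=7, session_updates=3))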

38
Synchronization in Coda
  • When a conflict is detected (f is updated in
    concurrently executed sessions), the updates from
    the client's session are undone, and the client
    must save the local version of f for manual
    reconciliation
  • In transactional terms, the transaction has been
    aborted, and the conflict resolution is left to
    the user

39
Caching and Replication in Coda
  • Caching and replication are critical to Coda's
    operation
  • Cache coherence is maintained by means of
    callbacks
  • The server keeps a callback promise for each
    client that has a local copy of a file
  • When a client changes its local copy, it notifies
    the server, and the server sends an invalidation
    message - or callback break - to the other
    clients
  • As long as a client knows it has a callback
    promise for a file, it can safely use its local
    copy (see the sketch below)
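A minimal sketch of callback bookkeeping on the server side: a promise is recorded when a client fetches a file, and a store by one client breaks every other client's promise. The client names are hypothetical.

    class CallbackServer:
        def __init__(self):
            self.promises = {}  # file -> set of clients holding a promise

        def fetch(self, file, client):
            self.promises.setdefault(file, set()).add(client)  # promise made

        def store(self, file, writer):
            for client in self.promises.get(file, set()) - {writer}:
                print(f"callback break -> {client}: drop your copy of {file}")
            self.promises[file] = {writer}  # only the writer's promise survives

    srv = CallbackServer()
    srv.fetch("paper.tex", "venus-A")
    srv.fetch("paper.tex", "venus-B")
    srv.store("paper.tex", "venus-A")   # breaks venus-B's callback promise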

40
Caching and Replication in Coda
  • File servers can be replicated in Coda, on a
    per-volume basis
  • The servers that have a copy of a volume are that
    volume's Volume Storage Group (VSG)
  • The Accessible Volume Storage Group (AVSG) for a
    volume is the set of servers in the VSG that can
    currently be contacted - if the AVSG is empty,
    the client is disconnected
  • Coda uses a variant of Read-One, Write-All to
    maintain consistency of replicated volumes
    (sketched below)
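A minimal sketch of that Read-One, Write-All variant over the AVSG: a read is served by any one reachable member of the VSG, while a write goes to every reachable member. Server names are hypothetical.

    def avsg(vsg, reachable):
        """Accessible Volume Storage Group: reachable subset of the VSG."""
        return [s for s in vsg if s in reachable]

    def read(vsg, reachable):
        servers = avsg(vsg, reachable)
        if not servers:
            raise ConnectionError("AVSG is empty: client is disconnected")
        return f"READ from {servers[0]}"  # read one

    def write(vsg, reachable, data):
        servers = avsg(vsg, reachable)
        if not servers:
            raise ConnectionError("AVSG is empty: client is disconnected")
        return [f"WRITE {data!r} to {s}" for s in servers]  # write all reachable

    vsg = ["vice1", "vice2", "vice3"]
    print(read(vsg, reachable={"vice2", "vice3"}))
    print(write(vsg, reachable={"vice2", "vice3"}, data="x"))
    # vice1 missed the write; the versioning scheme catches this later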

41
Caching and Replication in Coda
  • Coda uses an optimistic strategy for replication
  • Inconsistencies are detected through the
    versioning scheme
  • Every server in a VSG maintains a Coda Version
    Vector (CVV) for each file (much like logical
    clock vectors)
  • The servers compare their version vectors while
    reintegrating after a partition, in order to
    detect conflicts (see the comparison sketch below)
  • Sometimes conflict resolution can be automated,
    but often users will have to resolve conflicts
    manually
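A minimal sketch of the vector comparison, assuming one counter per server in the VSG: if one vector dominates, the other replica is merely stale; if neither dominates, the replicas were updated in different partitions and there is a conflict.

    def compare_cvv(a, b):
        a_ge = all(x >= y for x, y in zip(a, b))
        b_ge = all(y >= x for x, y in zip(a, b))
        if a_ge and b_ge:
            return "identical"
        if a_ge:
            return "A dominates: B is stale and can be brought up to date"
        if b_ge:
            return "B dominates: A is stale and can be brought up to date"
        return "conflict: concurrent updates in different partitions"

    print(compare_cvv([2, 2, 1], [1, 1, 1]))  # A dominates
    print(compare_cvv([2, 1, 1], [1, 2, 1]))  # conflict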

42
Fault Tolerance in Coda
  • Coda supports disconnected operation - a client
    can keep working with local copies of files even
    if it has empty AVSGs for those files
  • This works mainly because of Coda's caching
    strategy - Coda attempts to fill the cache with
    useful information, based on hints from users and
    information about which files have been used
    recently
  • Another contributing factor is that, in practice,
    write sharing rarely occurs

43
Fault Tolerance in Coda
  • Normally, a client is in the HOARDING state,
    where it tries to keep its cache full
  • If the client becomes disconnected, it enters the
    EMULATION state, where all file requests are
    serviced locally
  • When it reconnects, it enters the REINTEGRATION
    state, where conflicts are detected and resolved
    (see the state-machine sketch below)
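A minimal sketch of that three-state client life cycle as a transition table; the event names are hypothetical.

    TRANSITIONS = {
        ("HOARDING",      "disconnect"): "EMULATION",
        ("EMULATION",     "reconnect"):  "REINTEGRATION",
        ("REINTEGRATION", "done"):       "HOARDING",
        ("REINTEGRATION", "disconnect"): "EMULATION",
    }

    state = "HOARDING"
    for event in ["disconnect", "reconnect", "done"]:
        state = TRANSITIONS[(state, event)]
        print(f"{event} -> {state}")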

44
Security in Coda
  • Coda uses the same security architecture as AFS
  • Secure channels are established using a
    secret-key cryptosystem and a protocol derived
    from Needham-Schroeder
  • All client-server communication is based on
    Coda's secure RPC mechanism
  • Authentication is done with tokens, much like
    Kerberos tickets

45
Security in Coda
  • Access control in Coda is somewhat different from
    UNIX access control
  • A Vice server associates an ACL with directories
    only, not with individual files
  • All files in the same directory share the same
    protection
  • Access rights are distinguished with respect to
    the following operations: read, write, lookup
    (check the status of a file), insert (add a
    file), delete, and administer (modify a directory
    ACL)

46
Security in Coda
  • Coda maintains user and group information
  • Unlike many other systems, it supports the
    assignment of both rights and negative rights to
    users and groups - it is possible to explicitly
    state that a user is not allowed to do certain
    things
  • This approach is convenient when a user's rights
    need to be revoked - rather than immediately
    removing the user from all of the groups they
    belong to, the user can be given negative access
    rights, and the groups can be cleaned up later
    (see the sketch below)
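A minimal sketch of directory ACL evaluation with negative rights, under the assumed rule that a matching negative entry overrides any granted right. The right names follow the slide; the users, groups, and override rule are illustrative assumptions.

    def allowed(acl, user, groups, right):
        granted, denied = set(), set()
        for principal, rights, negative in acl:  # entries on the directory
            if principal == user or principal in groups:
                (denied if negative else granted).update(rights)
        return right in granted and right not in denied  # negative rights win

    acl = [
        ("staff",   {"read", "lookup", "insert"}, False),
        ("mallory", {"insert"},                   True),  # quick revocation
    ]
    print(allowed(acl, "alice",   {"staff"}, "insert"))  # True
    print(allowed(acl, "mallory", {"staff"}, "insert"))  # False: denied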

47
Next Class
  • Project Presentations