Title: CS 7103 Advanced Operating Systems Louisiana State University Rajgopal Kannan
1 Distributed File System
CS 7103 Advanced Operating Systems
Louisiana State University
Rajgopal Kannan
2 Characteristics of DFS
- Dispersed files and clients: login transparency, access transparency, location transparency, location independence
- Multiple users and files: concurrency transparency, replication transparency
3 DFS Design and Implementation
Basic Concepts
- Files and file systems
- Services and servers
- File mounting and server registration
- Stateful and stateless file servers
- File access and semantics of sharing
- Version control
4 1. Files and File Systems
- Files have names, attributes and data.
- File organization: flat or hierarchical.
- File access: sequential, direct, indexed/indexed-sequential.
- Key components of a file service: directory, authorization, file service (basic and transaction), and system services.
5 2. Services and Servers
- A service may be provided by one or many servers. Similarly, one process may provide multiple services.
- The client/server relationship is relative. A schematic is given below.
[Figure: clients and the directory, authorization, file, and system services]
6 3. File Mounting and Server Registration
- Mounting is a method of building a user's directory structure from files and directories dispersed over multiple systems.
- Types of mounting: explicit, boot, and auto-mounting.
- Mounting may or may not provide a uniform global view.
- Mounting is not a location-transparent protocol; hence server registration is useful.
7 4. Stateful and Stateless Servers
- A stateful server maintains state information for each open file; a stateless server does not. Examples of state information: list of open files and clients, file descriptors and handles, file position pointers, mounting information, lock status, session keys, cache or buffer, etc.
- For stateless servers, certain issues must be taken care of:
- Idempotency requirement
- File locking mechanism
- Session key management
- Cache consistency
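The idempotency requirement can be illustrated with a minimal sketch. The names `StatelessFileServer`, `read_at`, and `write_at` are hypothetical and not taken from any real file service. The only point is that each request carries the file name and offset itself, so the server keeps no per-client position pointer and a retransmitted request gives the same result.

```python
class StatelessFileServer:
    """Toy in-memory server that keeps no per-client state (illustrative only)."""

    def __init__(self, files):
        self.files = dict(files)  # file name -> bytes

    def read_at(self, name, offset, count):
        # The request is self-contained: name, offset, and count.
        # Repeating it returns the same bytes, so it is idempotent.
        return self.files[name][offset:offset + count]

    def write_at(self, name, offset, data):
        # Writing at an explicit offset is idempotent as well,
        # unlike an append that depends on a server-side position.
        old = self.files.get(name, b"")
        padded = old.ljust(offset, b"\0")
        self.files[name] = padded[:offset] + data + padded[offset + len(data):]
        return len(data)


server = StatelessFileServer({"notes.txt": b"hello world"})
first = server.read_at("notes.txt", 6, 5)
retry = server.read_at("notes.txt", 6, 5)  # a retransmitted request
```

A stateful server would instead remember the file position between calls, so a duplicated request could advance the pointer twice.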
8 5a. File access
9 5b. Semantics of File Sharing
- UNIX semantics: Can be implemented using only a server-side cache. Closely approximated using a client-side cache with write-through and write-invalidate/update. As a compromise, a write-back cache coherence policy (file updated and cache invalidated only at the end of the batch) may be used.
- Transaction semantics: The file at the server is updated only at the end of a successful transaction.
- Session semantics: The file at the server is updated only at the end of the session.
10 6. Version Control
- One solution to the file sharing/replication problem is to create a new version of the file upon every write or upon every close.
- Problem: What if two applications open an older version of a file, application 1 closes the file (and thus creates a newer version) before application 2, and then application 2 closes the file?
- Three solutions:
- Ignore conflict
- Resolve version conflict
- Resolve serializability conflict
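The two-application scenario can be made concrete with a small sketch. `VersionedFile` and its methods are hypothetical names, and the policy shown is only the "ignore conflict" option: a close always creates a new version, and the conflict is merely detected and reported. How a real system would resolve it is not specified here.

```python
class VersionedFile:
    """Toy file that creates a new version on every close (illustrative only)."""

    def __init__(self, data):
        self.versions = [data]  # list index = version number

    def open(self):
        # Return the version the application is based on, plus its data.
        base = len(self.versions) - 1
        return base, self.versions[base]

    def close(self, base, new_data):
        # A conflict exists if someone else created a version since we opened.
        current = len(self.versions) - 1
        conflict = base != current
        # "Ignore conflict" policy: create the new version regardless.
        self.versions.append(new_data)
        return len(self.versions) - 1, conflict


f = VersionedFile("v0")
base1, _ = f.open()                      # application 1 opens version 0
base2, _ = f.open()                      # application 2 opens version 0
v1, conflict1 = f.close(base1, "app1")   # creates version 1, no conflict
v2, conflict2 = f.close(base2, "app2")   # creates version 2, based on stale version 0
```

The other two options would act on the `conflict` flag instead of ignoring it.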
11 Transactions and Concurrency Control
- Transaction requirements
- The execution of a transaction is all or none.
- The interleaving of multiple transactions is serializable.
- Update is atomic.
[Figure: Clients, Transaction manager, Scheduler, Object manager, Objects (two instances of each)]
12 Serializability
A serializable schedule is a legal schedule whose execution produces a result equivalent to some serial execution of all transactions.
A simple, inefficient method to achieve serializability: complete transactions in private space, then update distributed objects using total-order multicast.
- Three concurrency control protocols for transaction management
- Two-phase locking
- Timestamp ordering
- Optimistic concurrency control
13 Two-Phase Locking
An object must be locked before it is accessed. No new lock can be acquired after a lock has been released.
- Problems
- Potential deadlock. May be solved by rollback or abort.
- Commit dependence, resulting in probable rolling aborts if locks are released as early as possible. May be solved using strict two-phase locking: locks are released only at the commit or abort point.
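The two rules and the strict variant can be sketched as follows. This is a single-threaded toy, assuming exclusive locks only and a shared dictionary as the lock table. `Transaction` and `TwoPhaseError` are hypothetical names, and deadlock handling is left out.

```python
class TwoPhaseError(Exception):
    """Raised when a transaction tries to lock after it has unlocked."""


class Transaction:
    def __init__(self, name, lock_table):
        self.name = name
        self.lock_table = lock_table  # shared dict: object -> owner name
        self.held = set()
        self.shrinking = False        # True once any lock has been released

    def lock(self, obj):
        if self.shrinking:
            raise TwoPhaseError("cannot acquire a lock after releasing one")
        owner = self.lock_table.get(obj)
        if owner not in (None, self.name):
            return False              # held by another transaction: caller must wait
        self.lock_table[obj] = self.name
        self.held.add(obj)
        return True

    def unlock(self, obj):
        self.shrinking = True
        self.held.discard(obj)
        if self.lock_table.get(obj) == self.name:
            del self.lock_table[obj]

    def commit(self):
        # Strict two-phase locking: release all locks only at the commit point.
        for obj in list(self.held):
            self.unlock(obj)


table = {}
t1 = Transaction("t1", table)
t2 = Transaction("t2", table)
got_x = t1.lock("x")
blocked = not t2.lock("x")   # t2 cannot lock x while t1 holds it
t1.commit()
t2_got_x = t2.lock("x")      # succeeds once t1 has committed
```

Because `commit` is the only place this sketch calls `unlock`, no other transaction can see an object before its writer has committed, which is what avoids rolling aborts.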
14 Timestamp Ordering
Each transaction has a timestamp. Transactions are serialized according to this timestamp. In other words, older transactions abort and restart if they have a conflict with a younger transaction. Each object has two timestamps, RD and WR: the times of the transactions that last read and last wrote that object, respectively. Further, each object has a list of tentative transaction times for pending writes. Let Tmin be the minimum of these tentative times.
15 Timestamp Ordering
- The following actions are taken with each event:
- READ: Does not conflict with other reads. If the timestamp of this read is smaller than WR, the transaction is aborted. It is allowed to proceed if its timestamp is between WR and Tmin. Otherwise it is kept in a queue, waiting for some other writes to commit.
- WRITE: Allowed to proceed only if its timestamp is greater than both RD and WR. The write is then marked tentative and put in the list.
- ABORT: An aborted read has no effect. An aborted write is removed from the tentative list, and if a pending read reaches the head of the queue, it is performed.
- COMMIT: A commit may not involve any pending read. If it has any pending write, then all preceding tentative writes are aborted and this transaction is committed.
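The READ and WRITE rules can be sketched as a decision function per object. This is a simplification under stated assumptions: timestamps are plain integers, `TimestampedObject` is a hypothetical name, a proceeding read advances RD, and the queueing, ABORT, and COMMIT handling are omitted.

```python
class TimestampedObject:
    """Per-object state for timestamp ordering (READ/WRITE decisions only)."""

    def __init__(self):
        self.rd = 0          # RD: timestamp of the last read
        self.wr = 0          # WR: timestamp of the last committed write
        self.tentative = []  # timestamps of pending (tentative) writes

    def t_min(self):
        # Tmin: smallest tentative write time, or infinity if none are pending.
        return min(self.tentative) if self.tentative else float("inf")

    def read(self, ts):
        if ts < self.wr:
            return "abort"               # a younger write is already committed
        if ts < self.t_min():
            self.rd = max(self.rd, ts)   # between WR and Tmin: proceed
            return "proceed"
        return "wait"                    # must wait for pending writes to commit

    def write(self, ts):
        if ts > self.rd and ts > self.wr:
            self.tentative.append(ts)    # mark the write tentative
            return "tentative"
        return "abort"


obj = TimestampedObject()
r1 = obj.read(5)    # no writes yet, so the read proceeds and RD becomes 5
w1 = obj.write(3)   # older than the last read, so it aborts
w2 = obj.write(7)   # newer than RD and WR, so it becomes tentative (Tmin = 7)
r2 = obj.read(6)    # between WR and Tmin, so it proceeds
r3 = obj.read(9)    # later than a pending write, so it waits
```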
16 Optimistic Concurrency Control
Transactions are allowed to proceed freely in a private workspace. Before committing, a transaction is validated. Once validated, the objects may be updated in the update phase. Each transaction ti has two timestamps, TSi and TVi, for the start of the execution phase and the validation phase, respectively. Each object Oj records the times of its last read and write commits, RDj and WRj, respectively. Ri and Wi are the read set and write set of ti. The transactions are serialized in the order of TVi.
VALIDATION
- If tk is already in the validation phase and TVi < TVk, then ti cannot be validated and must be aborted.
- If ti has no overlap with any other transaction, it is validated.
- If the execution phase of ti overlaps with the update phase of tk, then Ri and Wk must be disjoint to validate ti.
- If the execution phase of ti overlaps with the validation and update phases of tk, then Ri and Wk must be disjoint, and Wi and Wk must be disjoint as well.
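The overlap rules can be read as a set-intersection test against each transaction tk that validated before ti. The sketch below is one literal reading of those rules, not a definitive implementation. The `Txn` type and the field `tf` (end of the update phase) are assumptions introduced here to decide which phases overlap; the text itself defines only TSi and TVi.

```python
from dataclasses import dataclass, field


@dataclass
class Txn:
    ts: int   # TSi: start of the execution phase
    tv: int   # TVi: start of the validation phase
    tf: int   # assumed: end of the update phase (not defined in the text)
    reads: set = field(default_factory=set)    # Ri
    writes: set = field(default_factory=set)   # Wi


def validate(ti, earlier):
    """Validate ti against transactions tk that have TVk < TVi."""
    for tk in earlier:
        if tk.tf <= ti.ts:
            continue  # tk finished before ti started: no overlap
        # tk's update phase overlaps ti's execution: Ri and Wk must be disjoint.
        if ti.reads & tk.writes:
            return False
        # tk's validation also falls inside ti's execution:
        # Wi and Wk must be disjoint as well.
        if ti.ts < tk.tv and ti.writes & tk.writes:
            return False
    return True


tk = Txn(ts=0, tv=5, tf=8, reads={"a"}, writes={"x"})
no_overlap = Txn(ts=9, tv=12, tf=14, reads={"x"}, writes={"x"})
update_only = Txn(ts=6, tv=10, tf=12, reads={"y"}, writes={"x"})
stale_read = Txn(ts=6, tv=10, tf=12, reads={"x"}, writes={"z"})
write_clash = Txn(ts=3, tv=10, tf=12, reads={"y"}, writes={"x"})
```

Under this reading, `no_overlap` and `update_only` validate, while `stale_read` and `write_clash` must abort.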
17 Data and File Replication
- Reading operation
- Read-one-primary
- Read-one
- Read-quorum
- Writing operation
- Write-one-primary
- Write-all
- Write-all-available
- Write-quorum
- Write-Gossip
18 Quorum Voting
- 2 × write-quorum > number of replicas
- read-quorum + write-quorum > number of replicas
- Usually the read quorum is chosen to be smaller than the write quorum.
- Voting by witnesses
- Weighted voting schemes
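The two quorum inequalities are easy to check mechanically. The helper below is a small sketch, assuming one vote per replica (no weights or witnesses); `valid_quorums` is a hypothetical name.

```python
def valid_quorums(n, r, w):
    """Check the two quorum conditions for n replicas with one vote each."""
    write_write = 2 * w > n   # any two write quorums intersect
    read_write = r + w > n    # every read quorum intersects every write quorum
    return write_write and read_write


# All (read, write) quorum sizes that are valid for 5 replicas.
choices = [(r, w) for r in range(1, 6) for w in range(1, 6)
           if valid_quorums(5, r, w)]
```

For five replicas, a read quorum of 2 with a write quorum of 4 satisfies both conditions and keeps reads cheaper than writes.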
19 Gossip Update
Each file service agent (FSA) f maintains a timestamp of its last update, TSf. Each replica manager (RM) i maintains a timestamp of its last update, TSi.
Basic Gossip Protocol
- Read: If TSf > TSi, the replica manager does not have current data. Either wait or contact a different replica manager. Otherwise proceed with reading.
- Write: Increase TSf. If TSf > TSi, update and ask the replica manager to propagate the update by gossip. Otherwise, depending on the application, either overwrite or update and read. In both cases increase TSf.
- Gossip message: If the gossip message carries a newer timestamp, update the data and TSi and, depending on the application, repeat the gossip with TSi.
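The basic protocol can be sketched with scalar integer timestamps. `ReplicaManager` and its method names are hypothetical. The application-dependent branches (what to do when a write is not newer, and whether to repeat a gossip) are left to the caller and only signalled by the return value.

```python
class ReplicaManager:
    """Toy replica manager holding one data item and its timestamp TSi."""

    def __init__(self):
        self.ts = 0       # TSi: timestamp of the last update
        self.data = None

    def read(self, ts_f):
        # If TSf > TSi this replica is stale: the FSA must wait or go elsewhere.
        if ts_f > self.ts:
            return None
        return self.data, self.ts

    def write(self, ts_f, data):
        # The FSA has already increased TSf before calling.
        if ts_f > self.ts:
            self.data, self.ts = data, ts_f
            return True   # caller should ask this RM to gossip the update
        return False      # application-specific handling, not shown

    def on_gossip(self, ts, data):
        # Accept only gossip that carries a newer timestamp.
        if ts > self.ts:
            self.data, self.ts = data, ts
            return True   # caller may repeat the gossip with the new TSi
        return False


rm_a, rm_b = ReplicaManager(), ReplicaManager()
ts_f = 0
ts_f += 1                                     # FSA increases TSf before writing
wrote = rm_a.write(ts_f, "v1")
stale = rm_b.read(ts_f)                       # rm_b has not seen the update yet
updated = rm_b.on_gossip(rm_a.ts, rm_a.data)  # gossip from rm_a to rm_b
fresh = rm_b.read(ts_f)
```

After the gossip message arrives, `rm_b` can serve the read that it previously had to refuse.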