Title: CSS434: Parallel
1. CSS434 Distributed File Systems (Textbook Ch. 8, 13)
Professor Munehiro Fukuda
2. DFS Desirable Features
- Transparency
  - Access transparency: a single set of operations
  - Location transparency: a uniform file name space
  - Mobility transparency: file mobility
  - Performance transparency: comparable to a centralized file system
- Concurrency and synchronization: should complete concurrent access requests consistently.
  - Forward/backward validation
- File caching and replication
  - Caching at client/server for scalability
  - Replication at multiple servers for availability
- Heterogeneity: should allow a variety of nodes to share files in different storage media and OS
  - Similarity between Unix and NTFS: stream-oriented files, a tree-structured system
  - Difference between Unix and NTFS: CR character included in NTFS, file naming
- Fault tolerance: at-most-once or at-least-once semantics
- Consistency: Unix one-copy update semantics, session semantics, etc.
- Security: should protect files from network intruders.
3. Consistency Maintenance in Various Storage Systems
4. File Service Architecture
[Figure: file service architecture. Labels: file caching/replication, file caching, consistency maintenance.]
5. DFS Services
- Flat file service
  - File-accessing mechanism: deciding a place to manage remote files and a unit to transfer data (at server or client? file, block, or byte?)
  - File-sharing semantics: providing file update semantics similar to, but weaker than, Unix
  - File-caching mechanism: improving performance/scalability
  - File-replication mechanism: improving performance/availability
- Directory service
  - Mapping between text file names and references to files (i.e., file IDs)
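6. Flat File Service Operations
7. Directory Service Operations
[Figure: a directory Dir holds Name1, Name2, and Name3 (used from host1, host2, and host3), all added with addName(Dir, Name, file) and referring to the same file. The file has a reference count of 3; if the reference count becomes 0, the file is deleted.]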
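The name-to-file mapping and the reference-count rule can be sketched as follows. This is a minimal in-memory illustration, not the service from the textbook: the class name `DirectoryService`, the methods `create`, `add_name`, `lookup`, and `un_name`, and the file ID `"fid-1"` are all hypothetical names chosen for the example.

```python
class DirectoryService:
    """Toy directory service: maps text names to file IDs and counts references."""

    def __init__(self):
        self.dirs = {}       # directory name -> {text name -> file ID}
        self.ref_count = {}  # file ID -> number of names referring to it
        self.files = {}      # file ID -> contents (stands in for the flat file service)

    def create(self, file_id, data=b""):
        self.files[file_id] = data
        self.ref_count[file_id] = 0

    def add_name(self, dir_name, name, file_id):
        entries = self.dirs.setdefault(dir_name, {})
        if name in entries:
            raise FileExistsError(name)
        entries[name] = file_id
        self.ref_count[file_id] += 1

    def lookup(self, dir_name, name):
        return self.dirs[dir_name][name]

    def un_name(self, dir_name, name):
        file_id = self.dirs[dir_name].pop(name)
        self.ref_count[file_id] -= 1
        if self.ref_count[file_id] == 0:
            # No name refers to the file any more, so it is deleted.
            del self.files[file_id]
            del self.ref_count[file_id]


ds = DirectoryService()
ds.create("fid-1", b"hello")
for n in ("Name1", "Name2", "Name3"):
    ds.add_name("Dir", n, "fid-1")
count_after_add = ds.ref_count["fid-1"]      # three names, one file

for n in ("Name1", "Name2", "Name3"):
    ds.un_name("Dir", n)
file_deleted = "fid-1" not in ds.files       # last name removed, file gone
```

Keeping the count next to the file rather than in each directory entry lets any number of directories share one file without knowing about each other.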
8. File-Accessing Models
Model | File access | Merits | Demerits
Remote service model | At a server | A simple implementation | Communication overhead
Data caching model | At a client that cached a file copy | Reducing network traffic | Cache consistency problem
9. File-Sharing Semantics
- Define when modifications of the file data made by a user are observable by other users
- Unix semantics
- Session semantics
- Immutable shared-files semantics
- Transaction-like semantics
10. File-Sharing Semantics: Unix Semantics (One-Copy Update Semantics)
- Absolute ordering (seen by all clients as if only a single copy existed and were updated immediately)
- Network delays (inevitable to have weaker semantics)
[Figure: Client A issues Append(e) and a read, and Client B issues Append(d) and a read, on the same file. Some messages are delayed, so the order in which the updates are applied and observed differs.]
11. File-Sharing Semantics: Session Semantics
[Figure: timeline of Clients A, B, and C and the server. Each client opens the file with Open(file), appends to its own copy (Append of c, d, e, x, y, z, m), and writes the copy back with Close(file).]
File writes may overwrite previous updates. A file lock is needed to prevent these overwrites.
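The lost-update problem under session semantics can be reproduced with a small simulation. This is a sketch under simplifying assumptions: the whole file is copied on open and written back on close, and the names `Server` and `Session` are hypothetical.

```python
class Server:
    """Holds the single master copy of one file."""

    def __init__(self, data=""):
        self.data = data


class Session:
    """One open/close session: changes stay local until close."""

    def __init__(self, server):
        self.server = server
        self.copy = server.data      # open: download the whole file

    def append(self, text):
        self.copy += text            # visible only inside this session

    def close(self):
        self.server.data = self.copy  # close: write the whole copy back


server = Server("ab")
a = Session(server)                  # Client A opens the file
b = Session(server)                  # Client B opens the same file
a.append("c")
b.append("x")

a.close()
after_a = server.data                # "abc"
b.close()
after_b = server.data                # "abx": Client A's append was overwritten
```

Because each close replaces the whole file, the last session to close wins, which is why a lock (or another conflict policy) is needed when two sessions overlap.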
12. File-Sharing Semantics: Session Semantics with File Lock
[Figure: timeline of Client A, the server, and Client B sharing a file. Client A issues Open(file), tests the lock (lockt), and issues Append(c). Client B issues Open(file), tests the lock (lockt), issues Append(x), and later Close(file). Copies shown: file, file2, file3.]
User needs to choose quit, steal, or proceed
User needs to choose quit, save anyway, or type
13. File-Sharing Semantics: Transaction-Like Semantics (Concurrency Control)
- Forward validation
  - Compare writes with later reads
  - Abort itself or conflicting active transactions
- Backward validation
  - Compare reads with former writes
  - Trans_abort, then Trans_restart
[Figure: two timelines of Clients A, B, C, and D, one per validation scheme. Each transaction runs from Trans_start through validation and commitment to Trans_end, or to Trans_abort and Trans_restart. Example operation sequences: R1 R2 W6 R4 W7; R1 R2 W9 R4 W8; R1 R2 R6 R8 W8.]
Which validation is better?
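The two validation rules reduce to set intersections, as in the sketch below. It is an illustration only: the `Txn` class and the function names are hypothetical, and although the item numbers are borrowed from the operation sequences on the slide, which transaction overlaps which is an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Txn:
    name: str
    reads: set = field(default_factory=set)   # items read so far
    writes: set = field(default_factory=set)  # items written tentatively


def backward_validate(txn, committed_overlapping):
    """Compare txn's reads with the writes of transactions that already
    committed while txn was running. Any overlap means txn read stale data."""
    return all(not (txn.reads & other.writes) for other in committed_overlapping)


def forward_validate(txn, active):
    """Compare txn's writes with the reads made so far by transactions
    that are still active. Any overlap is a conflict."""
    return all(not (txn.writes & other.reads) for other in active)


t1 = Txn("T1", reads={1, 2, 4}, writes={6, 7})      # R1 R2 W6 R4 W7
t2 = Txn("T2", reads={1, 2, 4}, writes={8, 9})      # R1 R2 W9 R4 W8
t3 = Txn("T3", reads={1, 2, 6, 8}, writes={8})      # R1 R2 R6 R8 W8

# Backward: T3 validates after T1 and T2 committed (assumed overlap).
backward_ok = backward_validate(t3, [t1, t2])       # False: T3 read 6 and 8

# Forward: T1 validates while T3 is still active (assumed overlap).
forward_t1_ok = forward_validate(t1, [t3])          # False: T1 wrote 6, T3 read 6

# Forward: T2 validates while T1 is still active (assumed overlap).
forward_t2_ok = forward_validate(t2, [t1])          # True: no overlap
```

On a backward failure the only choice is to abort the validating transaction, since the others have already committed. On a forward failure the validating transaction can abort itself or abort the conflicting active ones. Note also that backward validation has nothing to check for a write-only transaction, and forward validation has nothing to check for a read-only one.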
14. File-Sharing Semantics: Immutable Shared-Files Semantics
[Figure: the server holds Version 1.0. Clients A and B each create a tentative version based on 1.0. One tentative version becomes Version 1.1; the other then causes a version conflict.]
Handling a version conflict depends on each file system:
- Abort: simple (later, Client A can decide to overwrite the file with its tentative version based on 1.0 by changing the corresponding directory)
- Merge: Version 1.2
- Ignore conflict: Version 1.2
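15. File-Caching Schemes: Cache Location
[Figure: a client and a server separated by the node boundary. The file resides on the server's disk; copies can be cached in the server's main memory, the client's main memory, and the client's disk.]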
16. File-Caching Schemes: Modification Propagation
- Write-through scheme
  - Pros: Unix-like semantics and high reliability
  - Cons: poor write performance
- Delayed-write scheme
  - Write on cache displacement
  - Periodic write
  - Write on close
  - Pros
    - Write accesses complete quickly.
    - Some writes may be omitted because later writes supersede them.
    - Gathering all writes mitigates network overhead.
  - Cons
    - Delaying write propagation results in fuzzier file-sharing semantics.
[Figure: two diagrams of Client 1 and Client 2, each with a copy in main memory, and the file on the server's disk. In the second diagram a client writes (W) a new copy locally, and the update reaches the file on disk as a delayed write.]
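The trade-off between the two propagation schemes can be seen by counting the writes that reach the server. The sketch below uses hypothetical class names (`Server`, `WriteThroughCache`, `DelayedWriteCache`) and models write-on-close with an explicit `flush` call.

```python
class Server:
    def __init__(self):
        self.blocks = {}
        self.writes_received = 0     # network writes seen by the server

    def write(self, key, value):
        self.blocks[key] = value
        self.writes_received += 1


class WriteThroughCache:
    """Every write updates the cache and is sent to the server at once."""

    def __init__(self, server):
        self.server = server
        self.cache = {}

    def write(self, key, value):
        self.cache[key] = value
        self.server.write(key, value)


class DelayedWriteCache:
    """Writes stay in the cache; dirty blocks are sent on flush (e.g. on close)."""

    def __init__(self, server):
        self.server = server
        self.cache = {}
        self.dirty = set()

    def write(self, key, value):
        self.cache[key] = value
        self.dirty.add(key)

    def flush(self):
        for key in sorted(self.dirty):
            self.server.write(key, self.cache[key])
        self.dirty.clear()


s1, s2 = Server(), Server()
wt, dw = WriteThroughCache(s1), DelayedWriteCache(s2)
for value in ("v1", "v2", "v3"):     # three writes to the same block
    wt.write("block0", value)
    dw.write("block0", value)

stale_before_flush = "block0" not in s2.blocks   # other clients cannot see it yet
dw.flush()
write_through_count = s1.writes_received         # 3
delayed_count = s2.writes_received               # 1: earlier writes were absorbed
```

The delayed-write client sends one message instead of three, but until the flush the server (and therefore every other client) still sees the old contents.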
17. File-Caching Schemes: Cache Validation Schemes, Client-Initiated Approach
- Checking before every access (Unix-like semantics but too slow)
- Checking periodically (better performance but fuzzy file-sharing semantics)
- Checking on file open (simple, suitable for session semantics)
- Problem: high network traffic
[Figure: two diagrams of Client 1 and Client 2, each with a copy in main memory, and the file on the server's disk. First diagram: write through with check before every access (delayed write?). Second diagram: check-on-open with write-on-close (check-on-close?).]
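Check-on-open can be sketched with a version number per file: on every open the client asks the server for the current version and refetches only on a mismatch. The version counter and the names `Server`, `Client`, `get_version`, and `fetch` are assumptions made for this example; real systems often compare modification timestamps instead.

```python
class Server:
    def __init__(self):
        self.files = {}              # name -> (version, data)

    def store(self, name, data):
        version = self.files.get(name, (0, ""))[0] + 1
        self.files[name] = (version, data)

    def get_version(self, name):
        return self.files[name][0]   # small validation message

    def fetch(self, name):
        return self.files[name]      # full transfer: (version, data)


class Client:
    def __init__(self, server):
        self.server = server
        self.cache = {}              # name -> (version, data)
        self.fetches = 0             # number of full transfers

    def open(self, name):
        cached = self.cache.get(name)
        # Client-initiated validation: ask the server on every open.
        if cached is None or cached[0] != self.server.get_version(name):
            self.cache[name] = self.server.fetch(name)
            self.fetches += 1
        return self.cache[name][1]


server = Server()
server.store("f", "v1")
client = Client(server)

first = client.open("f")             # cache miss: fetch
second = client.open("f")            # version matches: served from cache
fetches_before_update = client.fetches

server.store("f", "v2")              # another client updates the file
third = client.open("f")             # version mismatch: refetch
fetches_after_update = client.fetches
```

The server keeps no record of who caches what, so it stays stateless; the price is one validation message per open even when nothing has changed.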
18. File-Caching Schemes: Cache Validation Schemes, Server-Initiated Approach
- Keeping track of clients having a copy
- Denying a new request, queuing it, and disabling caching
- Notifying all clients of any update on the original file
- Problem
  - Violating the client-server model
  - Stateful servers
  - Check-on-open still needed for the 2nd file opening
[Figure: Clients 1 to 4, each with main memory, and the file on the server's disk. Clients holding a copy write (W); the server notifies (invalidates) the other copies and denies a new open. Write through or delayed write?]
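A server-initiated scheme needs the server to remember which clients hold a copy, which is exactly the state that makes it stateful. The sketch below is illustrative only: `holders`, `open`, `write`, and `invalidate` are hypothetical names, and a direct method call stands in for the callback RPC from server to client.

```python
class Client:
    def __init__(self, name):
        self.name = name
        self.cache = {}              # filename -> data

    def invalidate(self, filename):
        # Called by the server (a callback), reversing the usual roles.
        self.cache.pop(filename, None)


class Server:
    def __init__(self):
        self.files = {}
        self.holders = {}            # filename -> set of clients with a copy

    def open(self, client, filename):
        client.cache[filename] = self.files[filename]
        self.holders.setdefault(filename, set()).add(client)

    def write(self, writer, filename, data):
        self.files[filename] = data
        # Notify every other client that holds a copy.
        for client in self.holders.get(filename, set()):
            if client is not writer:
                client.invalidate(filename)
        self.holders[filename] = {writer}
        writer.cache[filename] = data


server = Server()
server.files["f"] = "old"
c1, c2 = Client("client1"), Client("client2")
server.open(c1, "f")
server.open(c2, "f")

server.write(c1, "f", "new")
c2_invalidated = "f" not in c2.cache     # Client 2 must reopen to see "new"
c1_value = c1.cache["f"]
holders_after = {c.name for c in server.holders["f"]}
```

If the server crashed and lost `holders`, it could no longer invalidate stale copies, which is why this approach complicates recovery compared with a stateless server.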
19. Homework Assignment 4
[Figure: Client 1 and Client 2 each run emacs on cached copies of file1 and file2 under /tmp, with permissions set by chmod 600 or chmod 400; the server keeps the files under its cwd. The server provides download( ) and upload( ); each client provides invalidate( ) and writeback( ).]
- Session semantics
- Client-side/server-side caching
- Server-initiated invalidation
20. File Access Improvements
- Data sieving for a single client
  - Read a larger contiguous file portion
  - Extract actual file portions from it
- Collective I/O for multiple clients
  - Read contiguous space, thereafter distribute subspaces to each client
  - Disk-directed I/O
  - Server-directed I/O
  - Two-phase I/O (client-directed)
21. Data Sieving
- Users request non-contiguous file portions
- Read a larger contiguous block into memory
- Copy the requested portions into the user's buffer
(from R. Thakur's Data Sieving and Collective I/O in ROMIO, 1998)
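The three steps can be sketched on an in-memory file. This is a simplified illustration, not ROMIO's implementation: `sieve_read` is a hypothetical helper, requests are given as `(offset, length)` pairs, and the whole span is read in one call regardless of how large the gaps are.

```python
import io


def sieve_read(f, requests):
    """Serve non-contiguous (offset, length) requests with one contiguous read."""
    if not requests:
        return []
    start = min(offset for offset, _ in requests)
    end = max(offset + length for offset, length in requests)
    f.seek(start)
    block = f.read(end - start)      # one large contiguous read
    # Copy only the requested portions out of the block.
    return [block[offset - start: offset - start + length]
            for offset, length in requests]


f = io.BytesIO(b"0123456789abcdefghij")
pieces = sieve_read(f, [(2, 3), (10, 2), (15, 4)])
# pieces == [b"234", b"ab", b"fghi"], obtained with a single read of 17 bytes
```

One read replaces three, at the cost of transferring the unwanted bytes in the gaps, so the technique pays off when the requested portions are small and close together.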
22. Two-Phase I/O
[Figure: processes P0, P1, P2, and P3. In the first phase each process reads a contiguous portion of the file; in the second phase the processes redistribute the data among themselves.]
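The two phases can be simulated in a single program. In this sketch the "file" is a Python list, each process wants the elements whose index is congruent to its rank modulo the number of processes (an assumed cyclic access pattern), and `two_phase_read` is a hypothetical function; a real implementation would exchange the data with message passing.

```python
def two_phase_read(data, nprocs):
    """Simulate two-phase I/O for a cyclic distribution over nprocs processes."""
    chunk = (len(data) + nprocs - 1) // nprocs

    # Phase 1: each process reads one contiguous chunk of the file.
    contiguous = [data[p * chunk:(p + 1) * chunk] for p in range(nprocs)]

    # Phase 2: redistribute so that element i ends up at process i % nprocs.
    final = [[] for _ in range(nprocs)]
    for p, part in enumerate(contiguous):
        for j, item in enumerate(part):
            global_index = p * chunk + j
            final[global_index % nprocs].append(item)
    return contiguous, final


contiguous, final = two_phase_read(list(range(8)), 4)
# contiguous == [[0, 1], [2, 3], [4, 5], [6, 7]]
# final      == [[0, 4], [1, 5], [2, 6], [3, 7]]
```

The file system sees only four large sequential reads, and the fine-grained shuffling happens over the interconnect between processes, which is typically much faster than many small disk accesses.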
23. File Stripes Transfer in a Hierarchy (from Fukuda/Miyauchi, Journal of Supercomputing)
[Figure: file stripes stored as key/value pairs (keys such as 32_inputFile1_0, 32_inputFile2_0, 128_inputFile1_1, 528_inputFile1_7, and 528_inputFile2_7, each with its contents as the value) are read from files through the GUI and transferred through a hierarchy: commander Id 0, root sentinel Id 2, sentinels Id 8 and 9, and sentinels Id 32, 33, 36 to 39, 128 to 132, and 528.]
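24. DFS Example: Sun NFS
[Figure: a server and Clients A and B, each with its own directory tree rooted at / (directories such as usr, opt, bin, org, shared, and export). On each client a user process calls the VFS, which uses either the local FS or the NFS client; the NFS client reaches the NFS server through RPC stubs, and the NFS server accesses the server's local FS through its VFS.]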
25. Sun NFS: Installation
- Server
  - Check if NFS is running: rpcinfo -p
  - Start NFS: /etc/rc.d/init.d/nfs start
  - Edit the /etc/exports file: /dir/to/export client1(permissions), client2(
  - Export the directories in /etc/exports: exportfs -a
  - Check exported directories: showmount -e
- Client
  - Import a server's directory: mount -o options server_name:/dir /my_dir
    - bg: continue working on importing upon a failure
    - intr: a process will be interrupted if its I/O request to the server directory is pending
    - soft: allow a client to time out the connection after a number of retries
    - rw/ro: normal read/write or read only
- Underlying Connections
[Figure: underlying connections among the client, the portmapper, mountd (NFS mount service port, permission check), and nfs (port 2049) over rpc.]
26. Sun NFS: Overviews
- Communication
  - RPC: a compound procedure (Lookup, Open, and Read)
- Server status
  - Stateless: simple implementation in ver 3
  - Stateful: allowing clients to cache files in ver 4
    - RPC callback from a server to invalidate a client's cache
- Synchronization
  - Session semantics
  - File locking in ver 4: lock, lockt, locku, and renew
    - Ex. Emacs: tests with lockt when modifying a buffer, locks the file with lock, and unlocks it with locku after writing the buffer contents to the file
  - Share reservation: specifies how to share a file (with ro, wo, or r/w)
27. Sun NFS: Overviews (Cont'd)
- Caching
  - In a client's memory
  - Session semantics
  - Revalidation of a client's cache upon re-opening the same file
  - Open delegation
    - A server delegates an open decision to a writing client, which can handle open requests from other clients on the same machine.
    - A server calls back the client when receiving an open request from another machine.
- Fault Tolerance
  - RPC failure: use a duplicate-request cache
  - File locking failure: provide a grace period during which a client reclaims locks previously granted and the server rebuilds its previous state
28. Sun NFS: Duplicate Request Cache
[Figure: three client/server timelines for a request with XID 1234 that the client sends again. Labels: "Too soon, ignore", "Just replied, ignore", "Transaction completed", and "reply".]
Then, when does the server delete this cached result?
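A duplicate-request cache can be sketched as a table keyed by XID. The sketch below follows the commonly described three cases for a retransmitted request (still in progress, just replied, replied some time ago); the class and method names, the explicit `now` argument, and the one-second `recent_window` are assumptions made for the example, not values taken from NFS.

```python
class DuplicateRequestCache:
    """Decides what to do with an incoming request, based on its XID."""

    def __init__(self, recent_window=1.0):
        self.entries = {}                    # xid -> {"done", "reply", "replied_at"}
        self.recent_window = recent_window   # assumed threshold, in seconds

    def receive(self, xid, now):
        entry = self.entries.get(xid)
        if entry is None:
            self.entries[xid] = {"done": False, "reply": None, "replied_at": None}
            return "execute"                 # first time this XID is seen
        if not entry["done"]:
            return "ignore"                  # original is still in progress
        if now - entry["replied_at"] < self.recent_window:
            return "ignore"                  # just replied; reply is on its way
        return "resend"                      # reply was probably lost

    def complete(self, xid, reply, now):
        self.entries[xid] = {"done": True, "reply": reply, "replied_at": now}

    def cached_reply(self, xid):
        return self.entries[xid]["reply"]


drc = DuplicateRequestCache(recent_window=1.0)
first = drc.receive(1234, now=0.0)           # "execute"
too_soon = drc.receive(1234, now=0.2)        # "ignore": still in progress
drc.complete(1234, reply="OK", now=0.5)
just_replied = drc.receive(1234, now=0.6)    # "ignore": replied 0.1 s ago
lost_reply = drc.receive(1234, now=5.0)      # "resend": send cached reply again
resent = drc.cached_reply(1234)
```

Resending the cached reply instead of re-executing is what protects non-idempotent operations. The sketch never evicts entries, which is the open question raised on the slide: a real server must bound the cache, for example by age or by size.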
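29. DFS Example: Andrew File System
30. AFS: File Name Space
[Figure: a client and a server, each with a directory tree rooted at / (usr, tmp, bin). The client's name space is divided into local and shared parts connected by symbolic links. The client runs user processes and a Venus process with a cache on top of the Unix kernel (Unix FS); the server runs a Vice process on top of the Unix kernel (Unix FS).]
31. AFS: System Call Interception
32. AFS: Implementation of File System Calls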
33. DFS Example: XFS
[Figure: clients, storage servers, and metadata managers connected by a LAN.]
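34. DFS Example: Plan 9
[Figure: a client's name space rooted at / contains a union directory and a net directory. The client imports directories (containing files such as a, b, c, d, x, and y) exported by File server 1 and File server 2, and uses a computation server for remote execution. A network interface N provides network access to the Internet through net.]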
35. Paper Review by Students
- Sun NFS
- Andrew File System
- XFS
- Plan 9
- LFS
- Discussions
  - What file-sharing semantics is each system based on?
  - Which systems use server-side caching?
  - Which systems use client-side caching?
  - Which systems use client-initiated validation?
  - Which systems use server-initiated validation?
36. Non-Turn-In Exercises
Q1. In transaction-like semantics, a.k.a. concurrency control, compare the pros and cons of backward and forward validation. In particular, consider the case where each transaction includes more read than write operations.
- Backward validation
  - Pros:
  - Cons:
- Forward validation
  - Pros:
Q2. Answer the following five questions about file caching. When you are asked to show which systems use a given caching scheme, choose all applicable systems from NFS, AFS, xFS, and Plan9.
Q2-1. Why can file caching contribute to performance improvement? Give two reasons.
- Reason 1:
- Reason 2:
Q2-2. State one merit of using server-side caching. Which system uses server-side caching?
- Merit:
- System: Plan9 (answer)
37. Non-Turn-In Exercises (Cont'd)
Q2-3. Client-side caching allows multiple clients to cache the same file. There are two schemes to validate the contents of a locally cached file (or invalidate the contents of the same file cached at remote clients): client-initiated and server-initiated validation. Does client-initiated validation require a file server to be stateful? Justify your answer. Also show which systems use client-initiated validation.
- Stateless or stateful?
- Reason:
- Systems: NFS, Plan9 (answer)
Q2-4. Does server-initiated validation require a file server to be stateful? Justify your answer. Also show which systems use server-initiated validation.
- Stateless or stateful?
- Reason:
- Systems: AFS, xFS (answer)