Title: Scale and Performance in a Distributed File System
1. Scale and Performance in a Distributed File System
- Howard et al.
- Carnegie Mellon University
Presented by Tsuen Wan Ngan (Johnny), assisted by Shu Du
2. Andrew File System
- A distributed file system
- Homogeneous, location-transparent file name space
- Motivated primarily by scalability
- Maximize client/server ratio
- Scales gracefully
3. Andrew File System
- Whole-file transfer
- Caching (file and status)
4. Locating a File
[Diagram: a directory tree (/, /bin, /usr, /usr/local, /usr/share) partitioned between two servers, Vice 1 and Vice 2]
5. Example of Accessing a File
- Intercept the open call
- Locate the server
- Fetch the file if there is no valid cached copy
- Serve read/write calls locally
- Update the server upon the close call
[Diagram: a client running Venus communicating with servers Vice 1-4]
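The open/close flow above can be sketched as a minimal Python model. All class and method names here are illustrative, not the real Venus/Vice interfaces:

```python
# Minimal sketch of the Venus open/close flow: fetch the whole file on
# open, serve reads/writes locally, push the file back on close.
# All names are illustrative; the real Venus is a user-level cache manager.

class Vice:
    """Stand-in for a file server."""
    def __init__(self, files):
        self.files = files                    # path -> file contents
    def fetch(self, path):
        return self.files[path]
    def store(self, path, data):
        self.files[path] = data

class Venus:
    """Stand-in for the client-side cache manager."""
    def __init__(self, server):
        self.server = server
        self.cache = {}                       # path -> locally cached contents

    def open(self, path):
        # Fetch the whole file only if there is no cached copy.
        if path not in self.cache:
            self.cache[path] = self.server.fetch(path)

    def read(self, path):
        return self.cache[path]               # served from the local cache

    def write(self, path, data):
        self.cache[path] = data               # stays local until close

    def close(self, path):
        self.server.store(path, self.cache[path])   # update server on close
```

Note that between open and close the server is never contacted, which is what makes the client/server ratio scale.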
6. Example of Accessing a File
[Diagram: on open, the client fetches the file from one of the servers (Fetch)]
7. Example of Accessing a File
[Diagram: on close, the client sends the update back to the server (Update)]
8. Overview
- Brief discussion of the prototype
- Changes for performance
- Effect of changes
- Comparison with NFS
- Conclusion
9. Prototype Description
- Files addressed by full pathname
- No notion of a low-level name such as an inode
- Each server mirrors the directory hierarchy
- A search miss ends up in a stub directory pointing to another server
- All cached copies are suspect: clients verify them with the server before use
10. Server Process Structure
[Diagram: each client is served by a dedicated server process; a distinguished lock-server process coordinates locking]
11. Qualitative Observations
- Most programs run without changes
- Some programs are much slower
- Due to frequent stat calls
- Performance acceptable up to about 20 users per server
- RPC load exceeded the network resources available in the kernel
12. Performance Observations
- Observed 9 to 5 on weekdays for 2 weeks
- File and status cache hit ratios over 80%
- Server loads not evenly balanced (5:1 ratio)
- Bottleneck was server CPU (40% average utilization)
- Causes: frequent context switches between server processes, and full-pathname traversal
13. Distribution of Calls
[Table: distribution of Vice calls; cache-validation status calls account for most of the traffic]
14. Changes for Performance
- Cache management
- Name resolution
- Low-level storage representation
- Communication and server process structure
15. Cache Management
- Also caches directories and symbolic links
- Uses an LRU algorithm to bound cache size
- Keeps the status cache in virtual memory
- Assumes cached entries are valid
- Server notifies clients of changes (callback)
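Callback-based caching can be sketched as follows. The names are hypothetical, and the real protocol must also handle failures and lost callback state:

```python
# Sketch of callback-based invalidation: the server promises to notify a
# client before its cached copy becomes stale, so the client can trust the
# cache without asking the server on every open.

class CallbackServer:
    def __init__(self):
        self.data = {}
        self.callbacks = {}          # path -> set of clients holding a callback

    def fetch(self, path, client):
        # Granting a callback: the server will notify this client of changes.
        self.callbacks.setdefault(path, set()).add(client)
        return self.data.get(path)

    def store(self, path, data, writer):
        self.data[path] = data
        # Break callbacks: every other holder learns its copy is stale.
        for c in self.callbacks.pop(path, set()):
            if c is not writer:
                c.invalidate(path)
        self.callbacks[path] = {writer}

class Client:
    def __init__(self, server):
        self.server = server
        self.cache = {}

    def open(self, path):
        if path not in self.cache:             # valid while the callback holds
            self.cache[path] = self.server.fetch(path, self)
        return self.cache[path]

    def close(self, path, data):
        self.cache[path] = data
        self.server.store(path, data, self)

    def invalidate(self, path):
        self.cache.pop(path, None)             # next open refetches
```

This replaces the per-open validation traffic of the prototype with one message per actual change.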
16. Name Resolution
- Eliminate namei (pathname-traversal) operations on servers
- Clients logically perform namei themselves
- Clients identify files by fid (volume number, vnode number, uniquifier)
- Volume location information replicated on each server
17. Low-level Storage Representation
- Files are accessed directly by inode
- A fid is mapped to an inode via table lookup
- Clients use the same mechanism to access cache entries
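A minimal sketch of fid-based lookup, assuming the paper's three-component fid (volume number, vnode number, uniquifier); the table structures here are illustrative:

```python
# Sketch: a fid identifies a file without any pathname; the server resolves
# it to a low-level inode with two table lookups instead of namei traversal.
from collections import namedtuple

# Three components; the uniquifier distinguishes reuses of a vnode slot
# over time (not checked in this simplified sketch).
Fid = namedtuple("Fid", ["volume", "vnode", "uniquifier"])

class Volume:
    def __init__(self):
        self.vnodes = {}             # vnode number -> inode number

class FileServer:
    def __init__(self):
        self.volumes = {}            # volume number -> Volume
        self.inodes = {}             # inode number -> file data

    def lookup(self, fid):
        # Two table lookups; no pathname is ever traversed on the server.
        vol = self.volumes[fid.volume]
        inode = vol.vnodes[fid.vnode]
        return self.inodes[inode]
```

The same fid-to-local-file mapping idea serves the client for locating its cache entries.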
18. Communication and Server Process Structure
- A single server process
- User-level, non-preemptive Lightweight Processes (LWPs)
- An LWP is bound to a client only for the duration of one server operation
- RPC moved out of the kernel
19Overall Design
in cache?
open a file
no
yes
has callback?
fetch, est. callback
no
no
yes
valid cache?
use
yes
est. callback
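The decision flow above can be written as straight-line code (illustrative names):

```python
# The open-time decision flow: contact the server only when there is no
# cached copy or no callback on the cached copy.

def open_file(path, cache, server):
    entry = cache.get(path)
    if entry is None:
        # Not cached: fetch the file and establish a callback.
        cache[path] = {"data": server.fetch(path), "callback": True}
    elif not entry["callback"]:
        # Cached but no callback: ask the server whether the copy is valid.
        if server.is_valid(path, entry["data"]):
            entry["callback"] = True           # re-establish the callback
        else:
            entry["data"] = server.fetch(path)
            entry["callback"] = True
    # Cached with a callback: use it without contacting the server.
    return cache[path]["data"]
```

In the common case (cache hit with callback) the server is not contacted at all.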
20. Notion of File Consistency
- Writes to an open file are visible to all processes on the same workstation, but invisible elsewhere
- Once a file is closed, the changes are visible to new opens anywhere, but not to already-open instances
- All other operations are visible everywhere immediately
- Multiple workstations can perform all operations concurrently
21. Benchmark
- A source tree of 70 files, about 200 KB in total
- 5 distinct phases:
  - MakeDir: construct the target directory tree
  - Copy: copy every file into the target tree
  - ScanDir: examine the status of every file
  - ReadAll: read every byte of every file
  - Make: compile and link the files
- The load imposed by one running instance of the benchmark is called a Load Unit
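The five phases can be sketched in Python (paths and the build step are illustrative; the real benchmark is a command script over a fixed source tree):

```python
# Sketch of the five benchmark phases. The real benchmark operates on a
# 70-file, 200 KB source tree; the tree layout and build step here are
# illustrative assumptions.
import os
import shutil
import subprocess

def run_benchmark(src, work):
    os.makedirs(work, exist_ok=True)
    dst = os.path.join(work, "copy")

    shutil.copytree(src, dst)                 # MakeDir + Copy: build the
                                              # target tree and copy each file
    for root, _, files in os.walk(dst):       # ScanDir: stat every file
        for f in files:
            os.stat(os.path.join(root, f))

    for root, _, files in os.walk(dst):       # ReadAll: read every byte
        for f in files:
            with open(os.path.join(root, f), "rb") as fh:
                fh.read()

    if shutil.which("make"):                  # Make: compile the tree
        subprocess.run(["make", "-C", dst], check=False)
    return dst
```

Each phase stresses a different mix of status and data operations, which is why the paper reports them separately.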
22. Scalability
[Graph: benchmark time vs. load units; the revised system scales much better than the prototype]
23. The Sun Network File System
- No distinguished clients or servers
- File location is not transparent
- Implemented within the kernel
- Indeterminate consistency semantics
- The remote file is involved in each operation
- With read-ahead and write-behind
24. Experimental Comparison
- Andrew File System benchmarked in two configurations:
  - Cold cache set
  - Warm cache set
- NFS failed at high load
  - Due to file system errors
  - Caused by lost RPC reply packets
25. Benchmark Time
Andrew is more scalable: its benchmark time grows more slowly with load.
26. Percent CPU Utilization
Andrew consumes fewer CPU cycles.
27. Percent Disk Utilization
Andrew has lower disk utilization.
28. Latency
NFS has low latency independent of file size, since it fetches pages on demand; Andrew must fetch the whole file before the first read.
29. Conclusion
- Andrew is scalable
- Has room for further improvement
- By moving code into kernel
- Satisfied with its performance
30. Thank you!
31. Changes for Operability
- Volume partitioning
- Volume movement
- Quotas
- Read-only replication
- Backup