Scale and Performance in a Distributed File System

Transcript and Presenter's Notes
1
Scale and Performance in a Distributed File System
  • Howard et al.
  • Carnegie Mellon University

Presented by Tsuen Wan Ngan (Johnny), assisted by Shu Du
2
Andrew File System
  • A distributed file system
  • Homogeneous, location-transparent file name space
  • Motivated primarily by scalability
  • Maximize client/server ratio
  • Scales gracefully

3
Andrew File System
  • Whole-file transfer
  • Caching (file and status)
4
Locating a File
(Diagram: a single file name tree (/, /bin, /usr, /usr/local, /usr/share) partitioned between two servers, Vice 1 and Vice 2.)
5
Example of Accessing a File
  • Intercepts an open call
  • Locates the server
  • Fetches the file if there is no valid cached copy
  • Serves read/write calls locally
  • Updates the server upon a close call

(Diagram: the Venus cache manager on a client workstation and four Vice servers; a code sketch of this open/close flow follows.)
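A minimal sketch of this whole-file flow, in Python with hypothetical names (cache_path, fetch_file, store_file, and cache_is_valid stand in for Venus internals and Vice RPCs; they are not the actual interfaces):

import os

CACHE_DIR = "/tmp/venus-cache"   # hypothetical local cache directory

def cache_path(vice_path):
    # Map a Vice pathname to a local cache file name.
    return os.path.join(CACHE_DIR, vice_path.strip("/").replace("/", "_"))

def venus_open(vice_path, fetch_file, cache_is_valid):
    # Intercepted open(): fetch the whole file unless a valid cached copy exists.
    local = cache_path(vice_path)
    if not cache_is_valid(vice_path):
        fetch_file(vice_path, local)     # whole-file transfer from a Vice server
    return open(local, "rb+")            # subsequent reads/writes are purely local

def venus_close(vice_path, handle, store_file, was_modified):
    # Intercepted close(): ship the updated file back to the server.
    handle.close()
    if was_modified:
        store_file(vice_path, cache_path(vice_path))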
6
Example of Accessing a File
(Same steps as above; the diagram shows the client fetching the file from one of the servers when it has no valid cached copy.)
7
Example of Accessing a File
(Same steps as above; the diagram shows the client updating the server when the file is closed.)
8
Overview
  • Brief discussion of the prototype
  • Changes for performance
  • Effect of changes
  • Comparison with NFS
  • Conclusion

9
Prototype Description
  • Address files by full pathnames
  • No notion of a low-level name like inode
  • Server mirrors directory hierarchy
  • A search miss ends up in a stub directory pointing to another server
  • All cached copies are suspect (validated before use); a lookup sketch follows
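A rough sketch of how a prototype server might resolve a full pathname and redirect the client via a stub directory; the dictionary data model and names here are purely illustrative:

def resolve_on_server(server_tree, components):
    # Walk this server's mirrored hierarchy; a stub directory names the
    # server that actually stores the subtree (illustrative data model).
    node = server_tree
    for i, name in enumerate(components):
        child = node["entries"].get(name)
        if child is None:
            raise FileNotFoundError("/".join(components))
        if child.get("stub"):                 # search miss: landed in a stub
            return ("redirect", child["server"], components[i:])
        node = child
    return ("here", node)

# Example: the /usr subtree lives on another server, so the lookup is redirected.
tree = {"entries": {
    "bin": {"entries": {}},
    "usr": {"stub": True, "server": "vice2"},
}}
print(resolve_on_server(tree, ["usr", "local", "lib"]))
# ('redirect', 'vice2', ['usr', 'local', 'lib'])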

10
Server Process Structure
(Diagram: a lock-server process plus one dedicated server process per client.)
11
Qualitative Observations
  • Most programs run without changes
  • Some programs much slower
  • Due to stat calls
  • Acceptable up to 20 users/server
  • RPC use exceeded network-related resource limits in the kernel

12
Performance Observations
  • Measured 9 a.m. to 5 p.m. on weekdays for 2 weeks
  • File and status cache hit ratios over 80%
  • Server loads not evenly balanced (roughly 5:1)
  • Bottleneck was server CPU (about 40% utilization on average)
  • Frequent context switches between server
    processes
  • Full pathname traversal was costly

13
Distribution of Calls
14
Changes for Performance
  • Cache management
  • Name resolution
  • Low-level storage representation
  • Communication and server process structure

15
Cache Management
  • Also caches directories and symbolic links
  • Uses an LRU algorithm to bound cache size
  • Keeps the status cache in virtual memory
  • Assumes cached entries are valid unless notified otherwise
  • The server notifies clients of changes (callback), as sketched below
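A small illustrative sketch of the callback idea (class and method names are hypothetical, not the Vice interface): the server remembers which clients cache a file and breaks those callbacks when the file changes, so clients can use their caches without checking on every open:

from collections import defaultdict

class CallbackServer:
    def __init__(self):
        self.callbacks = defaultdict(set)   # file id -> clients holding a callback

    def fetch(self, client, fid):
        self.callbacks[fid].add(client)     # fetching establishes a callback promise

    def store(self, writer, fid):
        # Breaking the callback tells every other client its cached copy is stale.
        for client in self.callbacks.pop(fid, set()):
            if client is not writer:
                client.invalidate(fid)
        self.callbacks[fid].add(writer)

class Client:
    def __init__(self, name):
        self.name, self.stale = name, set()
    def invalidate(self, fid):
        self.stale.add(fid)                 # next open of this fid must re-fetch

server, a, b = CallbackServer(), Client("A"), Client("B")
server.fetch(a, 42); server.fetch(b, 42)
server.store(a, 42)                         # B's callback is broken; A keeps its copy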

16
Name Resolution
  • Avoids namei (pathname traversal) work on servers
  • Clients logically perform namei themselves
  • Clients identify files by fid
  • Volume location information is replicated on every server (see the sketch below)
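A hedged sketch of client-side name resolution: Venus walks its own cached directories to turn a pathname into a fid and then uses replicated volume location information to pick a server. The fid fields follow the paper; the lookup tables below are illustrative:

from typing import NamedTuple

class Fid(NamedTuple):
    volume: int        # which volume the file lives in
    vnode: int         # index of the file within that volume
    uniquifier: int    # guards against vnode number reuse

ROOT = Fid(1, 1, 1)
# Cached directory contents, keyed by directory fid (illustrative values).
directories = {ROOT: {"usr": Fid(1, 2, 1)},
               Fid(1, 2, 1): {"local": Fid(7, 3, 1)}}
# Volume location information, replicated on every server.
volume_locations = {1: "vice1", 7: "vice2"}

def resolve(path):
    # Client-side namei: map a pathname to a fid without server involvement.
    fid = ROOT
    for name in path.strip("/").split("/"):
        fid = directories[fid][name]
    return fid, volume_locations[fid.volume]

print(resolve("/usr/local"))   # (Fid(volume=7, vnode=3, uniquifier=1), 'vice2')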

17
Low-level Storage Representation
  • Files are accessed directly by inode
  • The fid allows a table lookup to locate the file
  • The same mechanism lets clients access their cache entries (sketched below)
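On the server, the fid then indexes a table yielding the inode of the underlying data file; a minimal sketch (the table contents are made up):

# Illustrative per-volume table: (volume, vnode) -> inode number of the data file.
fid_to_inode = {(1, 1): 4096, (1, 2): 4097, (7, 3): 5120}

def locate_by_fid(volume, vnode):
    # A table lookup replaces the prototype's namei-style pathname traversal;
    # the file is then opened directly by its inode.
    return fid_to_inode[(volume, vnode)]

print(locate_by_fid(7, 3))   # 5120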

18
Communication and Server Process Structure
  • A single server process serves all clients
  • Built from user-level, non-preemptive Lightweight Processes (LWPs)
  • An LWP is bound to a client only for the duration of one server operation
  • RPC is implemented outside the kernel (see the sketch below)
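A loose analogy in code for the revised server structure: one process, many cooperative lightweight workers, each bound to a client only while serving a single operation. Python's asyncio stands in for the non-preemptive LWP package here; it is only an illustration, not the actual implementation:

import asyncio

async def lwp(name, requests):
    # One lightweight worker: pick up a request, serve it, move to the next.
    # It is bound to a client only for the duration of that one operation.
    while True:
        client, op = await requests.get()
        print(f"{name} serving {op} for {client}")
        await asyncio.sleep(0)           # explicit yield point (non-preemptive)
        requests.task_done()

async def main():
    requests = asyncio.Queue()
    for req in [("clientA", "Fetch"), ("clientB", "Store"), ("clientA", "GetStatus")]:
        requests.put_nowait(req)
    workers = [asyncio.create_task(lwp(f"LWP-{i}", requests)) for i in range(2)]
    await requests.join()                # every request handled inside one process
    for w in workers:
        w.cancel()

asyncio.run(main())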

19
Overall Design
Flow when opening a file:
  • Not in cache: fetch the file and establish a callback, then use it
  • In cache with a callback: use the cached copy directly
  • In cache without a callback: ask the server whether the copy is still valid;
    if valid, re-establish the callback and use it, otherwise fetch the file and
    establish a callback
(A code sketch of this decision flow follows.)
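The same decision flow as a sketch in code; fetch and check_with_server are stand-in callables, not the actual Vice calls:

def open_file(fid, cache, fetch, check_with_server):
    # Open a file: with a cached copy and a callback, no server traffic is needed.
    entry = cache.get(fid)
    if entry is None:
        cache[fid] = fetch(fid)               # fetch file; server establishes callback
    elif not entry["has_callback"]:
        if check_with_server(fid, entry):     # still valid: re-establish the callback
            entry["has_callback"] = True
        else:
            cache[fid] = fetch(fid)
    return cache[fid]["data"]                 # use the cached copy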
20
Notion of File Consistency
  • Writes to an open file are visible to other processes on the same workstation,
    but invisible elsewhere
  • Changes made when a file is closed are visible to any later open anywhere,
    but not to instances already open
  • All other operations are visible everywhere immediately
  • Multiple workstations can perform these operations concurrently

21
Benchmark
  • Source tree of about 70 files, roughly 200 KB in total
  • Five distinct phases:
  • MakeDir
  • Copy
  • ScanDir
  • ReadAll
  • Make
  • The load of one benchmark instance is referred to as a Load Unit (a sample script sketch follows)
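A sketch of what one run might look like as a script; the phase names follow the paper, but the commands and paths are illustrative:

import subprocess, time

PHASES = [
    ("MakeDir", ["mkdir", "-p", "target/subtree"]),                   # build directory skeleton
    ("Copy",    ["cp", "-r", "source/.", "target/"]),                 # copy the ~70 source files
    ("ScanDir", ["find", "target", "-exec", "ls", "-l", "{}", ";"]),  # stat every file
    ("ReadAll", ["sh", "-c", "cat target/* > /dev/null"]),            # read every byte
    ("Make",    ["make", "-C", "target"]),                            # compile and link
]

for name, cmd in PHASES:
    start = time.time()
    subprocess.run(cmd, check=False)
    print(f"{name}: {time.time() - start:.1f}s")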

22
Scalability
The revised system is more scalable than the prototype
23
The Sun Network File System
  • No distinguished client or server
  • No transparent file location
  • Implemented within kernel
  • Indeterminate consistency semantics
  • Remote file involved in each operation
  • With read-ahead and write-behind

24
Experimental Comparison
  • Experimented on Andrew File System
  • Cold cache set
  • Warm cache set
  • NFS failed at high load
  • Due to file system errors
  • Caused by lost RPC reply packets

25
Benchmark Time
Andrew's performance scales better with load
26
Percent CPU Utilization
Andrew consumes fewer CPU cycles
27
Percent Disk Utilization
Andrew has lower disk utilization
28
Latency
NFS has low latency independent of file size
29
Conclusion
  • Andrew is scalable
  • Has room for further improvement
  • E.g., by moving code into the kernel
  • The authors are satisfied with its performance

30
Thank you!
31
Changes for Operability
  • Volume partitioning
  • Volume movement
  • Quotas
  • Read-only replication
  • Backup