Title: Distributed File System Design and Implementation
2. Contents
- File and File System concept
- File Mounting
- Stateful/Stateless server concept
- Current work and Future work
3. Files and File Systems
- Files are named data objects. They hold structured data that programs use but that is not part of the programs themselves.
- The file system is responsible for the naming, creation, deletion, retrieval, modification, and protection of the files in the system.
- Logical components of a file, as seen by users:
  - File name
  - File attributes
  - Data units
4. Example
- UNIX
  - Files are streams of characters to application programs and sequences of fixed-size logical blocks to the file system.
  - Both sequential and direct access methods are supported. Other access methods can be built on top of the flat file structure.
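As a minimal sketch of the two access methods, the Python snippet below reads a file of fixed-size records first sequentially and then directly with a seek. The record size, the demo contents, and the helper names read_sequential and read_direct are assumptions made for illustration.

```python
import os
import tempfile

RECORD_SIZE = 8  # assumed fixed logical record size, in bytes


def read_sequential(path):
    """Read every record in order, from the start of the byte stream."""
    records = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(RECORD_SIZE)
            if not chunk:
                break
            records.append(chunk)
    return records


def read_direct(path, index):
    """Jump straight to record `index` without reading the earlier records."""
    with open(path, "rb") as f:
        f.seek(index * RECORD_SIZE)
        return f.read(RECORD_SIZE)


# Build a small demo file holding three 8-byte records.
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"record-0record-1record-2")

all_records = read_sequential(demo_path)
third = read_direct(demo_path, 2)
os.remove(demo_path)
```

Direct access here is nothing more than a seek on the flat byte stream, which is why other access methods can be layered on top of it.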
5. Major Components in a File System
- Directory service: name resolution, addition and deletion of files
- Authorization service: capability and/or access control list
- File service (transaction): concurrency and replication management
- File service (basic): read/write files and get/set attributes
- System service: device, cache, and block management
6. Directory Service
- Directories are files that contain the names and addresses of other files and subdirectories.
- Mapping and locating
- Search for a file
- Create a file
- Delete a file
- List a directory
- Rename a file
- Traverse the file system
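The operations listed above can be sketched with a directory modeled as a mapping from names to addresses. This is only an illustrative in-memory model: the Directory class, its method names, and the integer "addresses" are assumptions, not part of any real file system.

```python
class Directory:
    """A directory maps names to file addresses or to subdirectories."""

    def __init__(self):
        self.entries = {}  # name -> int address, or a nested Directory

    def create(self, name, address):
        if name in self.entries:
            raise FileExistsError(name)
        self.entries[name] = address

    def search(self, name):
        return self.entries.get(name)  # None when the name is absent

    def delete(self, name):
        del self.entries[name]

    def rename(self, old, new):
        if new in self.entries:
            raise FileExistsError(new)
        self.entries[new] = self.entries.pop(old)

    def list_names(self):
        return sorted(self.entries)

    def traverse(self, prefix=""):
        """Yield the full path of every entry, depth first."""
        for name in sorted(self.entries):
            path = f"{prefix}/{name}"
            yield path
            child = self.entries[name]
            if isinstance(child, Directory):
                yield from child.traverse(path)


root = Directory()
docs = Directory()
root.create("docs", docs)
root.create("a.txt", 100)
docs.create("b.txt", 200)
root.rename("a.txt", "notes.txt")
paths = list(root.traverse())
```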
7. Authorization Service
- File access must be regulated to ensure security
- Types of access
- Read
- Write
- Execute
- Append
- Delete
- List
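One way to regulate these access types is an access control list that records which rights each user holds on each file. The sketch below is a hypothetical model: the Access flag names mirror the list above, while the acl table, the user names, and is_allowed are invented for illustration.

```python
from enum import Flag, auto


class Access(Flag):
    READ = auto()
    WRITE = auto()
    EXECUTE = auto()
    APPEND = auto()
    DELETE = auto()
    LIST = auto()


# Hypothetical access control list: file name -> {user: granted rights}.
acl = {
    "report.txt": {
        "alice": Access.READ | Access.WRITE | Access.DELETE,
        "bob": Access.READ,
    },
}


def is_allowed(user, filename, requested):
    """Return True only if every requested right has been granted."""
    granted = acl.get(filename, {}).get(user, Access(0))
    return (granted & requested) == requested
```

A user who does not appear in the list for a file is granted nothing, so access is denied by default.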
8. File Service: Basic Operations
- Create
  - Allocate space
  - Make an entry in the directory
- Write
  - Search the directory
  - The write takes place at the location of the write pointer
- Read
  - Search the directory
  - The read takes place at the location of the read pointer
- Reposition within a file (file seek)
  - Set the current file pointer to a given value
- Delete
  - Search the directory
  - Release all file space
- Truncate
  - Reset the file to length zero
- Open(Fi)
  - Search the directory structure
  - Move the content of the directory entry to memory
- Close(Fi)
  - Move the content in memory to the directory structure on disk
- Get/set file attributes
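The operations above can be sketched as a toy in-memory file service. This is a simplification under stated assumptions: the "on-disk" directory is a Python dict that also holds the file data, a single pointer serves both reads and writes, and the FileService class and its method names are hypothetical.

```python
class FileService:
    def __init__(self):
        self.directory = {}   # stands in for the on-disk directory
        self.open_files = {}  # in-memory copies of opened entries

    def create(self, name):
        # Allocate (empty) space and make an entry in the directory.
        self.directory[name] = {"data": bytearray(), "length": 0}

    def open(self, name):
        # Search the directory and copy the entry into memory.
        entry = self.directory[name]
        self.open_files[name] = {"data": bytearray(entry["data"]), "pos": 0}

    def seek(self, name, offset):
        # Set the current file pointer to a given value.
        self.open_files[name]["pos"] = offset

    def write(self, name, payload):
        f = self.open_files[name]
        end = f["pos"] + len(payload)
        f["data"][f["pos"]:end] = payload  # write at the file pointer
        f["pos"] = end

    def read(self, name, count):
        f = self.open_files[name]
        chunk = bytes(f["data"][f["pos"]:f["pos"] + count])
        f["pos"] += len(chunk)
        return chunk

    def truncate(self, name):
        # Reset the open file to length zero.
        f = self.open_files[name]
        f["data"] = bytearray()
        f["pos"] = 0

    def close(self, name):
        # Move the in-memory content back to the directory structure.
        f = self.open_files.pop(name)
        self.directory[name] = {"data": f["data"], "length": len(f["data"])}

    def delete(self, name):
        # Search the directory and release all file space.
        del self.directory[name]


fs = FileService()
fs.create("f")
fs.open("f")
fs.write("f", b"hello world")
fs.seek("f", 6)
fs.write("f", b"there")   # overwrites "world"
fs.seek("f", 0)
content = fs.read("f", 11)
fs.close("f")
```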
9. System Service
- System services are a file system's interface to the hardware and are transparent to users of the file system
  - Mapping of logical to physical block addresses
  - Interfacing to services at the device level for file space allocation/de-allocation
  - Actual read/write file operations
  - Caching for performance enhancement
  - Replicating for reliability improvement
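The first item, mapping logical to physical block addresses, can be illustrated with a per-file block table. The block size and the table contents below are assumed values chosen for the example, and locate is a hypothetical helper.

```python
BLOCK_SIZE = 4096  # assumed block size in bytes

# Hypothetical per-file block table: logical block number -> physical block.
block_table = [17, 4, 92]


def locate(offset):
    """Translate a byte offset in the file to (physical block, offset in block)."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    logical_block, within = divmod(offset, BLOCK_SIZE)
    if logical_block >= len(block_table):
        raise ValueError("offset is beyond the allocated blocks")
    return block_table[logical_block], within
```

For instance, byte 8200 falls in logical block 2 (8200 = 2 * 4096 + 8), which this table places in physical block 92 at offset 8.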
10. File Mounting and Server Registration
11. File Mounting
- Attach a remote named file system to the client's file system hierarchy at the position pointed to by a path name
- A mount point is usually a leaf of the directory tree that contains only an empty subdirectory
- Once files are mounted, they are accessed by using the concatenated logical path names, without referencing either the remote hosts or local devices
- Location transparency
- The linked information (mount table) is kept until the file systems are unmounted
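A mount table and the path concatenation described above can be sketched as follows. The host name, paths, and function names are hypothetical, and longest-prefix matching is one reasonable way to pick the mount point, assumed here for illustration.

```python
mount_table = {}  # mount point -> (remote host, remote path)


def mount(mount_point, host, remote_path):
    mount_table[mount_point.rstrip("/")] = (host, remote_path.rstrip("/"))


def unmount(mount_point):
    del mount_table[mount_point.rstrip("/")]


def resolve(path):
    """Map a logical path to (host, path on that host) by longest-prefix match."""
    best = None
    for point in mount_table:
        if path == point or path.startswith(point + "/"):
            if best is None or len(point) > len(best):
                best = point
    if best is None:
        return ("localhost", path)  # not under any mount point
    host, remote_root = mount_table[best]
    return (host, remote_root + path[len(best):])


mount("/mnt/projects", "fileserver1", "/export/projects")
remote = resolve("/mnt/projects/dfs/notes.txt")
local = resolve("/etc/hosts")
```

The caller only ever names the logical path; the host and remote path come out of the table, which is the location transparency the slide refers to.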
12. File Mounting
- Different clients may perceive a different file system view
- To achieve a global file system view, the system administrator (SA) enforces mounting rules
- Export: a file server restricts/allows the mounting of all or parts of its file system to a predefined set of hosts
  - The information is kept in the server's export file
- File system mounting
  - Explicit mounting: clients make explicit mounting system calls whenever one is desired
  - Boot mounting: a set of file servers is prescribed and all mountings are performed at the client's boot time
  - Auto-mounting: mounting of the servers is implicitly done on demand when a file is first opened by a client
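The export rule and auto-mounting can be combined in a small sketch. Everything named here is hypothetical: the exports and automount_map tables stand in for the server's export file and a client-side configuration, and the Client class is invented for illustration.

```python
# Hypothetical export file on the server: exported path -> allowed hosts.
exports = {"/export/home": {"client-a", "client-b"}}

# Hypothetical automount map on the client: mount point -> exported path.
automount_map = {"/home": "/export/home"}


class Client:
    def __init__(self, hostname):
        self.hostname = hostname
        self.mounted = {}  # mount point -> exported path

    def mount(self, mount_point, exported_path):
        """Explicit mounting, subject to the server's export rules."""
        if self.hostname not in exports.get(exported_path, set()):
            raise PermissionError(f"{self.hostname} may not mount {exported_path}")
        self.mounted[mount_point] = exported_path

    def open(self, path):
        """Auto-mounting: mount on demand the first time a file is opened."""
        for point, exported_path in automount_map.items():
            if path.startswith(point + "/") and point not in self.mounted:
                self.mount(point, exported_path)
        for point, exported_path in self.mounted.items():
            if path.startswith(point + "/"):
                return exported_path + path[len(point):]
        return path  # not under any mount point: a local file


client = Client("client-a")
before = dict(client.mounted)
remote_path = client.open("/home/user/notes.txt")
```

Boot mounting would amount to calling mount for every entry of the map at start-up instead of waiting for the first open.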
13. Server Registration
- The mounting protocol is not transparent: the initial mounting requires knowledge of the location of file servers
- Server registration
  - File servers register their services, and clients consult the registration server before mounting
  - Clients broadcast mounting requests, and file servers respond to the clients' requests
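The first approach, a registration server that clients consult before mounting, might look like the sketch below. The RegistrationServer class, the file system name, and the address are assumptions for illustration; the broadcast approach is not modeled.

```python
class RegistrationServer:
    def __init__(self):
        self.services = {}  # exported file system name -> server address

    def register(self, fs_name, address):
        """Called by a file server to advertise a file system it exports."""
        self.services[fs_name] = address

    def lookup(self, fs_name):
        """Called by a client before mounting; None if nothing is registered."""
        return self.services.get(fs_name)


registry = RegistrationServer()
registry.register("projects", ("fileserver1", 2049))  # hypothetical host and port
location = registry.lookup("projects")
```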
14. Stateful and Stateless File Servers
15. Stateful and Stateless File Servers
- State information
- Opened files and their clients
- File descriptors and file handles
- Current file position pointers
- Mounting information
- Lock status
- Session keys
- Cache or buffer
16. Stateful and Stateless File Servers
- Stateful: a file server maintains internally some of the state information
- Stateless: a file server maintains none at all
- Stateful file server: the file server maintains state information about clients between requests
- Stateless file server: when a client sends a request to a server, the server carries out the request, sends the reply, and then removes from its internal tables all information about the request
  - Between requests, no client-specific information is kept on the server
  - Each request must be self-contained: full file name and offset
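The difference shows up in the shape of a read request. In the sketch below, which uses hypothetical class names and an in-memory FILES table in place of real storage, the stateful server remembers each client's position, while the stateless server expects the file name and offset in every request.

```python
FILES = {"/data/log.txt": b"0123456789"}  # hypothetical server-side storage


class StatefulServer:
    def __init__(self):
        self.open_table = {}  # handle -> [file name, current position]
        self.next_handle = 0

    def open(self, name):
        handle = self.next_handle
        self.next_handle += 1
        self.open_table[handle] = [name, 0]
        return handle

    def read(self, handle, count):
        name, pos = self.open_table[handle]
        data = FILES[name][pos:pos + count]
        self.open_table[handle][1] = pos + len(data)  # server remembers position
        return data

    def close(self, handle):
        del self.open_table[handle]


class StatelessServer:
    def read(self, name, offset, count):
        # Everything needed is in the request; nothing is kept afterwards.
        return FILES[name][offset:offset + count]


stateful = StatefulServer()
h = stateful.open("/data/log.txt")
first = stateful.read(h, 4)
second = stateful.read(h, 4)
stateful.close(h)

stateless = StatelessServer()
offset = 0  # the client tracks its own position
a = stateless.read("/data/log.txt", offset, 4)
offset += len(a)
b = stateless.read("/data/log.txt", offset, 4)
```

If the stateful server crashes, its open table is lost and clients must reopen their files; the stateless server has nothing to lose, at the cost of longer requests.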
17. Comparing
18. File Sharing and Space Multiplexing
19. File Sharing
- Overlapping access: multiple copies of the same file
  - Space multiplexing of the file
  - Cache or replication
  - Coherency control: managing accesses to the replicas to provide a coherent view of the shared file
  - Desirable to guarantee the atomicity of updates (to all copies)
- Interleaving access: multiple granularities of data access operations
  - Time multiplexing of the file
  - Simple read/write, transaction, session
  - Concurrency control: how to prevent one execution sequence from interfering with the others when they are interleaved, and how to avoid inconsistent or erroneous results
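A small illustration of concurrency control is a lock around a read-modify-write sequence, so that interleaved writers cannot lose each other's updates. The SharedFile class and its counter are stand-ins for a record in a shared file; locking is only one of several possible mechanisms and is assumed here for the example.

```python
import threading


class SharedFile:
    def __init__(self):
        self.counter = 0  # stands in for a record stored in the shared file
        self.lock = threading.Lock()

    def increment(self):
        # The lock makes the read-modify-write sequence atomic.
        with self.lock:
            value = self.counter
            self.counter = value + 1


def worker(shared, times):
    for _ in range(times):
        shared.increment()


shared = SharedFile()
threads = [threading.Thread(target=worker, args=(shared, 1000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Without the lock, two threads could both read the same value and write back the same result, silently dropping one update.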
20. Space Multiplexing
- Remote access: no file data is kept on the client machine. Each access request is transmitted directly to the remote file server through the underlying network.
- Cache access: a small part of the file data is maintained in a local cache. A write operation or a cache miss results in a remote access and an update of the cache.
- Download/upload access: the entire file is downloaded for local accesses. A remote access or upload is performed when updating the remote file.
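The cache access mode can be sketched with a client that keeps recently used blocks locally and goes to the server only on a miss or a write. The class names and the block-level interface are assumptions, and a write-through policy is chosen to match the description above.

```python
class RemoteServer:
    def __init__(self):
        self.blocks = {}          # block number -> bytes
        self.remote_accesses = 0  # counts simulated network round trips

    def read_block(self, n):
        self.remote_accesses += 1
        return self.blocks.get(n, b"")

    def write_block(self, n, data):
        self.remote_accesses += 1
        self.blocks[n] = data


class CachingClient:
    def __init__(self, server):
        self.server = server
        self.cache = {}

    def read(self, n):
        if n not in self.cache:           # cache miss: go to the server
            self.cache[n] = self.server.read_block(n)
        return self.cache[n]              # cache hit: no network traffic

    def write(self, n, data):
        self.server.write_block(n, data)  # write-through to the server
        self.cache[n] = data              # and update the local cache


server = RemoteServer()
client = CachingClient(server)
client.write(0, b"abc")  # one remote access
client.read(0)           # hit
client.read(1)           # miss: one remote access
client.read(1)           # hit
```

This sketch ignores coherency: if another client updated block 0 on the server, the cached copy here would be stale.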
21. Current work
- Lakshman, A. and Malik, P., "Cassandra: a decentralized structured storage system", ACM SIGOPS Operating Systems Review, volume 44, number 2, pages 35-40, 2010
  -> Facebook and Twitter use Cassandra (a distributed storage system)
  -> Used for inbox search for about 800 million active users
  -> The cluster of computers uses regular commodity hardware that is prone to failure
22. Current work
- Shvachko, K., Kuang, H., Radia, S. and Chansler, R., "The Hadoop distributed file system", Symposium on Mass Storage Systems and Technologies, pages 1-10, 2010
- Borthakur, D., "The Hadoop distributed file system: Architecture and design", Hadoop Project Website, 2007
  -> HDFS is a file system for Hadoop
  -> Designed to run on low-cost hardware
  -> Highly fault-tolerant and suitable for large data sets
  -> Hardware failure is the norm rather than the exception
  -> Moving computation is cheaper than moving data
  -> Emphasis on high throughput of data
23. Current work
- Ungureanu, C., Atkin, B., Aranya, A., Gokhale, S., Rago, S., Cakowski, G., Dubnicki, C. and Bohra, A., "HydraFS: a high-throughput file system for the HYDRAstor content-addressable storage system", Proceedings of the 8th USENIX Conference on File and Storage Technologies, 2010
  -> Content-addressable storage (CAS)
  -> Stores information that can be retrieved based on its content, not its storage location
  -> HydraFS is built on top of CAS
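The idea of content-addressable storage can be shown in a few lines: the address of a block is derived from its content, here with a hash. This is a generic sketch, not HYDRAstor's design; the ContentStore class is hypothetical and SHA-256 is an assumed choice of hash.

```python
import hashlib


class ContentStore:
    def __init__(self):
        self.blocks = {}  # address (hex digest) -> content

    def put(self, content: bytes) -> str:
        address = hashlib.sha256(content).hexdigest()
        self.blocks[address] = content  # identical content is stored once
        return address

    def get(self, address: str) -> bytes:
        return self.blocks[address]


store = ContentStore()
addr1 = store.put(b"same bytes")
addr2 = store.put(b"same bytes")
addr3 = store.put(b"other bytes")
```

Because equal content always yields the same address, writing the same block twice consumes no extra space.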
24. Future work
- DFS at Exascale
  - Today (2011): Petascale computing
    - O(10K) nodes and O(100K) cores
  - Near future (2018): Exascale computing
    - 1M nodes (100X)
    - 1B processor-cores/threads (10000X)
- Ioan Raicu, Pete Beckman, Ian Foster, "Making a Case for Distributed File Systems at Exascale", ACM Workshop on Large-scale System and Application Performance (LSAP), 2011
25. Thanks!