
1
Serverless Network File Systems
  • Overview by Joseph Thompson

2
Problem
  • Centralized file systems fundamentally limit
    performance and availability
  • All reads and writes go through the centralized
    server
  • Increased server performance is expensive

3
Purpose
  • Better performance and scalability
  • High availability via redundant data storage

4
Assumption
  • SNFS is only appropriate among machines that
    communicate over a fast network and that trust
    each other to enforce security
  • SNFS generates a significant amount of network
    traffic
  • Security will be covered later

5
Components of SNFS
  • Software RAID
  • Log-structured File System (LFS)
  • Zebra
  • Merges RAID and LFS in a distributed network
  • Don't miss my next presentation on Zebra!
  • Multiprocessor Cache Consistency
  • In this model, each processor is one client

6
Three Problems to Be Solved
  • Need distributed metadata that provides both cache
    consistency management and the flexibility to
    dynamically reconfigure client responsibilities
  • A scalable way to divide storage servers into
    subsets for efficiency
  • Scalable log cleaning

7
Metadata
  • Manager Map
  • IMap
  • File Directories
  • Stripe Group Map

8
Managers
  • The manager of a file controls two sets of
    information about it
  • Cache consistency state
  • Disk location metadata

9
Manager Map
  • Table that indicates which physical machines
    manage which groups of index numbers at any given
    time
  • Globally replicated to all managers and clients in
    the system (a minimal sketch follows below)
  • Table is relatively small (tens of kilobytes for
    hundreds of clients)
  • Table rarely changes
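
As a rough illustration of the lookup the manager map supports, here is a
small Python sketch that maps groups of index numbers to manager machines.
The group size, names, and round-robin assignment are illustrative
assumptions, not the paper's actual layout.

  # Illustrative manager map: a small, globally replicated table mapping
  # groups of index numbers to the machine currently managing them.
  GROUP_SIZE = 1024  # hypothetical number of index numbers per map entry

  class ManagerMap:
      def __init__(self, managers):
          # managers: machine names; group i is assigned round-robin
          self.managers = list(managers)

      def manager_for(self, index_number):
          """Return the machine that currently manages this index number."""
          group = index_number // GROUP_SIZE
          return self.managers[group % len(self.managers)]

  # With three manager machines, index number 5000 falls in group 4 -> "m1".
  mmap = ManagerMap(["m0", "m1", "m2"])
  print(mmap.manager_for(5000))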

10
IMap
  • A file's imap entry contains the log address of
    the file's inode
  • For scalability, the imap is distributed so that
    each manager holds only the entries for the files
    assigned to it

11
File Directories
  • Contains mappings from file names to index
    numbers
  • Directories are themselves stored as regular files
  • Files created by a client are assigned to the
    manager on that machine (if there is one)
  • Index Numbers
  • Used to find the manager that is responsible for
    the file

12
Stripe Group Map Justification
  • With a single stripe group spanning a large RAID,
    even large log segments would be broken into small,
    inefficient writes at each server
  • While one client writes at its full network
    bandwidth to one stripe group, another client can
    do the same to a different group
  • Smaller segments also make cleaning more efficient
  • Stripe groups greatly improve availability
  • Each group stores its own parity, so the system can
    survive multiple server failures as long as they
    occur in different groups

13
Stripe Group Implementation
  • Group ID
  • Group Members
  • Current or Obsolete
  • The current/obsolete flag lets reconfiguration
    proceed efficiently: new writes go only to current
    groups, and the cleaner eventually moves all data
    out of an obsolete group so it can be removed
    (a sketch of an entry follows below)
  • Also globally replicated to each client
  • Small and rarely changes
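
A minimal sketch of one stripe group map entry as a record with the three
fields listed above; the field names and the way clients filter on the
current flag are illustrative assumptions.

  from dataclasses import dataclass

  # Illustrative model of one stripe group map entry (field names assumed).
  @dataclass
  class StripeGroupEntry:
      group_id: int
      members: tuple       # storage servers holding this group's fragments
      current: bool        # True: clients may write new segments here
                           # False: obsolete; the cleaner drains it for removal

  stripe_group_map = {
      0: StripeGroupEntry(0, ("ss0", "ss1", "ss2", "ss3"), current=True),
      1: StripeGroupEntry(1, ("ss4", "ss5", "ss6", "ss7"), current=False),
  }

  # Clients pick only current groups for new writes; obsolete groups are
  # emptied by the cleaner and then dropped from the map.
  writable = [g for g in stripe_group_map.values() if g.current]
  print([g.group_id for g in writable])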

14
Cleaning
  • Three main tasks
  • Track the utilization status of each segment
  • Use that status to decide which segments to clean
  • Copy live blocks from old segments to new segments

15
Distributed Utilization
  • Assign the burden of maintaining each segment's
    utilization status to the client that wrote the
    segment
  • Each client stores this utilization information in
    s-files, one per stripe group it writes to; s-files
    are written like normal files and can be found by
    the stripe group leader

16
Distributed Cleaning
  • A stripe group leader (dynamically appointed)
    initiates cleaning when the number of free
    segments drops below a threshold value or when
    the group is idle
  • The leader accumulates the s-files for the group
    and can dynamically assign cleaners on different
    machines to clean subsections of the stripe group
    efficiently (a sketch follows below)
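
A toy sketch of the leader's decision, assuming s-files boil down to
per-segment live-byte fractions and that cleaning work is spread round-robin
across cleaner machines; the threshold and all names are invented for
illustration.

  # Hypothetical leader-initiated cleaning for one stripe group.
  # s_files: writer client -> {segment id: fraction of live bytes}.
  FREE_SEGMENT_THRESHOLD = 16  # invented threshold

  def plan_cleaning(s_files, free_segments, cleaners):
      """Pick the emptiest segments and spread them across cleaner machines."""
      if free_segments >= FREE_SEGMENT_THRESHOLD:
          return {}  # enough free space; stay idle

      # Merge utilization gathered from each writer's s-file.
      utilization = {}
      for per_client in s_files.values():
          utilization.update(per_client)

      # Clean least-utilized segments first: most free space per block copied.
      victims = sorted(utilization, key=utilization.get)
      assignments = {c: [] for c in cleaners}
      for i, seg in enumerate(victims):
          assignments[cleaners[i % len(cleaners)]].append(seg)
      return assignments

  plan = plan_cleaning(
      {"client0": {"seg1": 0.1, "seg2": 0.9}, "client1": {"seg3": 0.4}},
      free_segments=2,
      cleaners=["client0", "client1"],
  )
  print(plan)  # {'client0': ['seg1', 'seg2'], 'client1': ['seg3']}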

17
Procedure to Read a Block
  • Diagram Demystified!
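
Since the diagram itself does not survive in this transcript, here is a
rough, self-contained Python sketch of the read path it walks through. Toy
dictionaries stand in for the directory, manager map, imap, cache consistency
state, and stripe group map; every name and format is an illustrative
assumption.

  # Toy read path: client cache -> directory -> manager map -> manager
  # (cache consistency state, imap) -> stripe group map -> storage server.
  directory        = {"/foo": 7}                 # file name -> index number
  manager_map      = {0: "mgr0"}                 # index-number group -> manager
  imap             = {"mgr0": {7: "log@1234"}}   # per manager: index no -> inode log address
  consistency      = {"mgr0": {}}                # per manager: (index no, block) -> caching client
  stripe_group_map = {"log@1234": "ss2"}         # log address -> storage server
  storage          = {("ss2", "log@1234"): b"block data"}
  client_cache     = {}                          # (path, block) -> data

  def read_block(path, block_no=0):
      # 1. Check the client's own cache first.
      if (path, block_no) in client_cache:
          return client_cache[(path, block_no)]
      # 2. The directory maps the name to an index number; the globally
      #    replicated manager map names the responsible manager.
      idx = directory[path]
      mgr = manager_map[idx // 1024]              # assume 1024 index numbers per group
      # 3. The manager checks its cache consistency state: if another client
      #    caches the block, it would forward the request there.
      holder = consistency[mgr].get((idx, block_no))
      if holder is not None:
          raise NotImplementedError("forward read to caching client " + holder)
      # 4. Otherwise the manager's imap locates the inode/block in the log,
      #    and the stripe group map names the storage server holding it.
      log_addr = imap[mgr][idx]
      server = stripe_group_map[log_addr]
      data = storage[(server, log_addr)]
      client_cache[(path, block_no)] = data
      return data

  print(read_block("/foo"))  # b'block data'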

18
Writing and Cache Consistency
  • To write, a client must request a lock from the
    owning manager, which the manager can revoke at any
    time
  • Before granting the lock, the manager invalidates
    other clients' cached copies and updates its cache
    consistency information
  • The implementation keeps per-block lists of caching
    clients, used to invalidate stale client caches and
    to forward read requests to clients with valid
    cached copies (sketched below)
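
A minimal sketch of that lock-and-invalidate idea, assuming the manager keeps
a write owner and a set of caching clients per block; the class and method
names are illustrative, not xFS's actual interfaces.

  # Toy write ownership and invalidation between a manager and clients.
  class Client:
      def __init__(self, name):
          self.name, self.cache, self.can_write = name, {}, set()
      def revoke_write(self, block):
          self.can_write.discard(block)
      def invalidate(self, block):
          self.cache.pop(block, None)

  class Manager:
      def __init__(self):
          self.write_owner = {}   # block -> client holding write permission
          self.cachers = {}       # block -> set of clients caching the block

      def request_write(self, block, client):
          """Grant write permission, revoking it and stale copies elsewhere."""
          prev = self.write_owner.get(block)
          if prev is not None and prev is not client:
              prev.revoke_write(block)               # revoke the old lock
          for c in self.cachers.get(block, set()) - {client}:
              c.invalidate(block)                    # drop stale cached copies
          self.cachers[block] = {client}
          self.write_owner[block] = client
          client.can_write.add(block)

  mgr, a, b = Manager(), Client("a"), Client("b")
  mgr.cachers["blk0"] = {a, b}
  a.cache["blk0"] = b"old"
  mgr.request_write("blk0", b)   # a's stale copy of blk0 is invalidated
  print(a.cache, b.can_write)    # {} {'blk0'}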

19
Recovery and Reconfiguration
  • General Recovery Strategy
  • Data Structure Recovery
  • Storage Server Recovery
  • Manager Recovery
  • Cleaner Recovery
  • Scalability of Recovery

20
General Recovery Strategy
  • LFS keeps an append-only log; each file
    modification is recorded there as a delta
  • Recovery uses checkpoints plus roll-forward through
    the deltas (a small sketch follows below)
  • If multiple storage servers in a single stripe
    group are unreachable, there can be no full
    recovery unless additional parity servers per
    stripe group are used
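
A toy sketch of checkpoint-plus-roll-forward recovery, treating the log as a
list of deltas and the checkpoint as a summary of state up to some log
position; the delta format and names are illustrative assumptions.

  # Checkpoint + roll-forward over a log of deltas (illustrative formats).
  def recover(checkpoint, log):
      """Rebuild metadata from the checkpoint, then roll forward through deltas."""
      state = dict(checkpoint["imap"])        # index number -> inode log address
      for position, delta in enumerate(log):
          if position <= checkpoint["position"]:
              continue                        # already reflected in the checkpoint
          state[delta["index_number"]] = delta["new_inode_address"]
      return state

  checkpoint = {"position": 0, "imap": {7: "log@100"}}
  log = [
      {"index_number": 7, "new_inode_address": "log@100"},   # before the checkpoint
      {"index_number": 7, "new_inode_address": "log@220"},   # rolled forward
      {"index_number": 9, "new_inode_address": "log@300"},   # rolled forward
  ]
  print(recover(checkpoint, log))  # {7: 'log@220', 9: 'log@300'}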

21
Data Structure Recovery
  • Layered dependencies require recovery to start with
    the storage servers, then the managers, then the
    cleaner

22
Storage Server Recovery
  • As we have seen with RAID architectures,
    recovering a single storage server is easy
  • Once the initial recovery is done, the LFS deltas
    let us poll clients for their unwritten changes
    while rolling forward

23
Manager Recovery
  • Retrieves its last known imap from its last
    checkpoint written to a storage server
  • During roll-forward, the manager gathers a
    consensus view of the manager map from clients and
    applies the appropriate changes to data block
    locations

24
Cleaner Recovery
  • Since s-files are stored like normal files, they
    will be recovered from the respective storage
    server
  • The cleaner must then go through a roll-forward
    stage where it checks the clients for summaries of
    their more recent modifications to those segments
  • To avoid making clients search their logs multiple
    times, they can gather this utilization information
    during the manager recovery process

25
Scalability of Recovery
  • The roll-forward process can generate O(N²)
    messages per object, where N is the number of
    clients, managers, or storage servers
  • One optimization: each object need only contact
    the N objects in the layer below it, and with
    randomization to reduce concurrent accesses to any
    single storage server, the managers can roll
    forward in parallel

26
Other Information Not Covered Here
  • Details of xFS prototype and performance testing
  • Further research on the state of xFS since 1995,
    when this paper was written

27
Conclusion
  • Paper is valuable
  • Provides a creative use of new and old ideas to
    pioneer a new file system
  • Problems
  • Restrictions on the usability of this system in a
    non-secure environment
  • Solutions
  • P2P security solutions we discussed in class