Title: The Zebra Striped Network File System
1The Zebra Striped Network File System
- Presentation by
- Joseph Thompson
2Purpose
- Single file server architectures will not be able
to support future throughput needs.
- Need a striping technique that supports writes to
files of all sizes in an efficient and uniform
manner.
3Striping in Zebra
- RAID
- Per-File Striping in a Network File System
- Log-Structured File Systems and Per-Client
Striping
4RAID-Problems
- Small writes in RAID are about four times as
expensive as they would be in a disk array without
parity.
- All the disks are attached to a single machine,
so its memory and I/O system are performance
bottlenecks.
- Note: there is no reason there has to be a
dedicated parity disk.
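To make the small-write cost concrete, here is a minimal sketch of a parity update, assuming XOR parity over equal-sized blocks. The function names and the in-memory "disks" are made up for illustration. Changing one block takes two reads and two writes, where a disk array without parity would need a single write.

```python
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))

def small_write(blocks, parity, index, new_block):
    """Overwrite one data block and return the updated parity block."""
    old_block = blocks[index]          # I/O 1: read the old data
    old_parity = parity                # I/O 2: read the old parity
    # Remove the old block's contribution, then add the new block's.
    new_parity = xor_bytes(xor_bytes(old_parity, old_block), new_block)
    blocks[index] = new_block          # I/O 3: write the new data
    return new_parity                  # I/O 4: caller writes the new parity

# Hypothetical three-disk stripe with two-byte blocks.
blocks = [b"\x01\x02", b"\x10\x20", b"\x0f\xf0"]
parity = reduce(xor_bytes, blocks)
parity = small_write(blocks, parity, 1, b"\xaa\xbb")
```

The incremental update gives the same parity as recomputing it over the whole stripe, which is why only the changed block and the parity block need to be touched.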
5Per-File Striping in a Network File System
- Note: A collection of file data that spans the
servers is called a stripe, and the portion of a
stripe stored on a single server is called a
stripe fragment.
- Small files are difficult to handle efficiently
- Inefficient parity management during updates
6Log-Structured File Systems and Per-Client
Striping
- Solution to per-file problems
- Zebra applies techniques of log-structured file
systems (LFS) and per-client striping
- Creates an append-only log for each client, which
can then convert many small writes into one large
write to a single stripe (the client is
responsible for calculating parity).
- Requires a File Manager to facilitate client
interaction and keep a record of file metadata
such as file attributes, directory structures, etc.
- Like all other LFSs, this solution also requires
a stripe cleaner.
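The batching idea can be sketched as follows. This is an illustration only: it assumes XOR parity, and the fragment size, server count, and the name build_stripe are invented for the example. Several small writes are concatenated into the client's log, cut into fragments, and protected by one parity fragment computed on the client.

```python
from functools import reduce

FRAGMENT_SIZE = 8   # bytes per fragment; tiny, for illustration only
DATA_SERVERS = 3    # data fragments per stripe; one more server holds parity

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length fragments."""
    return bytes(x ^ y for x, y in zip(a, b))

def build_stripe(small_writes):
    """Pack many small writes into one stripe: data fragments plus parity."""
    log = b"".join(small_writes)
    stripe_bytes = FRAGMENT_SIZE * DATA_SERVERS
    if len(log) > stripe_bytes:
        raise ValueError("writes do not fit in one stripe")
    log = log.ljust(stripe_bytes, b"\0")  # pad the unused tail of the stripe
    fragments = [log[i:i + FRAGMENT_SIZE]
                 for i in range(0, stripe_bytes, FRAGMENT_SIZE)]
    parity = reduce(xor_bytes, fragments)  # computed once, on the client
    return fragments, parity

# Four small writes, possibly from different files, become one stripe.
fragments, parity = build_stripe([b"abc", b"defgh", b"ij", b"klmnopq"])
```

Parity is computed once per stripe rather than once per small write, which is the point of batching writes in the log.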
7Zebra Components
- Storage Servers
- Clients
- File Managers
- Stripe Cleaners
8Storage Servers
- Storage server requirements
- Store a fragment
- Append to an existing fragment
- Used for periodic writes of a log
- Retrieve a fragment
- Delete a fragment
- Identify fragments
- Used to identify end of client logs after crashes
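The five operations above map naturally onto a small interface. The class below is an in-memory sketch, not Zebra's actual server code. Fragment ids are assumed to be (client id, sequence number) pairs, which is an assumption made so that identify has something to search on.

```python
class StorageServer:
    """In-memory stand-in for a storage server (hypothetical interface)."""

    def __init__(self):
        self.fragments = {}  # (client_id, sequence_number) -> bytearray

    def store(self, fragment_id, data):
        """Create a new fragment; refuses to overwrite an existing one."""
        if fragment_id in self.fragments:
            raise KeyError(f"fragment {fragment_id} already exists")
        self.fragments[fragment_id] = bytearray(data)

    def append(self, fragment_id, data):
        """Add data to the end of an existing fragment."""
        self.fragments[fragment_id].extend(data)

    def retrieve(self, fragment_id, offset=0, length=None):
        """Return all or part of a fragment."""
        data = self.fragments[fragment_id]
        end = len(data) if length is None else offset + length
        return bytes(data[offset:end])

    def delete(self, fragment_id):
        """Remove a fragment, for example after its stripe is cleaned."""
        del self.fragments[fragment_id]

    def identify(self, client_id):
        """Return the id of the client's most recent fragment, or None."""
        ids = [fid for fid in self.fragments if fid[0] == client_id]
        return max(ids) if ids else None

server = StorageServer()
server.store(("c1", 0), b"abc")
server.append(("c1", 0), b"def")
server.store(("c1", 1), b"xyz")
```

After a crash, identify("c1") would return ("c1", 1) here, which marks the end of that client's log on this server.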
9Clients
- On Read
- Client must determine which stripe fragments
store the desired data, retrieve the data from
the storage servers, and return them to the
application.
- On Write
- Client appends the new data to its log by
creating new stripes to hold the data, computing
the parity of the stripes, and writing the stripe
to the storage servers.
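The read path can be sketched as a mapping from a byte range in a stripe to the fragments that hold it. This assumes fixed-size fragments laid out in order across the servers. The names locate and read_range and the fragment size are illustrative, not taken from Zebra.

```python
FRAGMENT_SIZE = 8  # bytes per fragment; tiny, for illustration only

def locate(stripe_offset, length):
    """Split a byte range into (fragment_index, offset, length) pieces."""
    pieces = []
    while length > 0:
        index, within = divmod(stripe_offset, FRAGMENT_SIZE)
        take = min(length, FRAGMENT_SIZE - within)
        pieces.append((index, within, take))
        stripe_offset += take
        length -= take
    return pieces

def read_range(fragments, stripe_offset, length):
    """Fetch each piece from its fragment and join them for the application."""
    return b"".join(fragments[i][o:o + n]
                    for i, o, n in locate(stripe_offset, length))

# Each list entry stands in for a fragment held by a different storage server.
fragments = [b"abcdefgh", b"ijklmnop", b"qrstuvwx"]
pieces = locate(6, 5)
data = read_range(fragments, 6, 5)
```

A read that crosses a fragment boundary turns into requests to two servers, which can be issued in parallel.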
10File Managers
- File Manager stores all of the information in the
file system except for file data.
- The client requests block pointers from the File
Manager, and accesses the block data itself.
- Performance of the File Manager is a concern
because it is a centralized resource.
- Solution: clients cache naming information from
the File Manager so that they contact it less
often.
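A minimal sketch of the caching idea, with invented class and method names. Cache consistency (invalidating entries when names change) is deliberately left out, so this only shows why repeated lookups stop reaching the File Manager.

```python
class FileManager:
    """Stand-in for the centralized File Manager (hypothetical interface)."""

    def __init__(self, names):
        self.names = names   # path -> file id
        self.requests = 0    # how many times a client contacted the manager

    def lookup(self, path):
        self.requests += 1
        return self.names[path]

class Client:
    """Client that remembers naming information it has already fetched."""

    def __init__(self, manager):
        self.manager = manager
        self.name_cache = {}

    def lookup(self, path):
        if path not in self.name_cache:          # miss: ask the manager
            self.name_cache[path] = self.manager.lookup(path)
        return self.name_cache[path]             # hit: no network traffic

manager = FileManager({"/home/a.txt": 17})
client = Client(manager)
first = client.lookup("/home/a.txt")
second = client.lookup("/home/a.txt")
```

Only the first lookup reaches the manager, so its load grows with the number of distinct names a client touches rather than with the number of accesses.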
11Stripe Cleaners (first glance)
- The only way to reuse free space in a stripe is
to clean the stripe so that it contains no live
data, then delete it.
- Since the cleaner is a client itself, it just
reads live data from the stripes with the largest
amounts of free space, appends the data to
its own client log to be written to a new stripe,
and then deletes the old stripes.
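The cleaning pass described above can be sketched like this, assuming equal-sized stripes so that "most free space" means "fewest live blocks". The data layout and the name pick_and_clean are made up for the example.

```python
def pick_and_clean(stripes, live_blocks, how_many):
    """Clean the stripes holding the least live data.

    stripes: dict mapping stripe id -> list of block ids stored in it
    live_blocks: set of block ids still referenced by some file
    Returns (ids of deleted stripes, live blocks to append to the cleaner's log).
    """
    def live_count(stripe_id):
        return sum(1 for b in stripes[stripe_id] if b in live_blocks)

    # Fewest live blocks first, i.e. the most free space in equal-sized stripes.
    victims = sorted(stripes, key=live_count)[:how_many]
    survivors = [b for s in victims for b in stripes[s] if b in live_blocks]
    for s in victims:
        del stripes[s]   # the old stripe's space can now be reused
    return victims, survivors

stripes = {
    "s1": ["a", "b", "c", "d"],
    "s2": ["e", "f", "g", "h"],
    "s3": ["i", "j", "k", "l"],
}
live = {"a", "b", "c", "e", "i", "j"}
victims, survivors = pick_and_clean(stripes, live, 2)
```

The surviving blocks would be appended to the cleaner's own log, so two mostly empty stripes are traded for part of one new stripe.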
12System Operations
- Communication Deltas
- Stripe Cleaning (additional details)
- Adding Additional Storage Servers
13Communication Deltas
- Deltas provide a simple and reliable way for
various system components to communicate changes
to files.
- A client's log also contains deltas.
- Delta Information
- File ID, File Version (time edited), Block Number,
Old Block pointer, New Block pointer.
- Three types of deltas
- Update delta, cleaner delta, reject delta.
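One way to picture a delta is as a small immutable record with the five fields listed above plus its type. The field types, and using None for "no block", are assumptions made for this sketch.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class DeltaKind(Enum):
    UPDATE = "update"    # a client changed a block
    CLEANER = "cleaner"  # the stripe cleaner moved a block
    REJECT = "reject"    # the file manager refused a change

@dataclass(frozen=True)
class Delta:
    kind: DeltaKind
    file_id: int
    version: int                 # file version at the time of the change
    block_number: int
    old_pointer: Optional[str]   # None when the block did not exist before
    new_pointer: Optional[str]   # None when the block is being deleted

# A new block is created, then later overwritten.
create = Delta(DeltaKind.UPDATE, file_id=7, version=1, block_number=0,
               old_pointer=None, new_pointer="stripe3:frag1:0")
overwrite = Delta(DeltaKind.UPDATE, file_id=7, version=2, block_number=0,
                  old_pointer=create.new_pointer, new_pointer="stripe9:frag0:512")
```

Because each delta carries both the old and the new pointer, a reader can tell whether it was based on the current state of the block.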
14Stripe Cleaning (additional details)
- Evaluating stripe space utilization
- Cleaner must process the deltas in every client
log to keep a running count of the free space in
each stripe.
- The cleaner appends all of the deltas that refer
to a given stripe to a special file for that
stripe, called a Stripe Status File.
- Conflicts between cleaning and file access
- Stripe cleaner does not lock any files during
cleaning. It only issues a special cleaner delta.
- If a conflict occurs because an update took
place during cleaning, the file manager will
notice two different deltas and make sure the
final pointer for the block reflects the update
delta.
- The manager generates a reject delta that tells
the cleaner that the new block it created is
unused.
- (just to show how adding a stripe cleaner
significantly adds complexity)
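The conflict rule can be sketched as a single function on the file manager's block-pointer map. This is a simplified illustration with invented dictionary keys, not Zebra's full case analysis. The update delta always wins, and whichever copy the cleaner made is reported back as unused.

```python
def apply_delta(pointers, delta):
    """Apply one delta to the block-pointer map; return a reject delta or None."""
    key = (delta["file"], delta["block"])
    current = pointers.get(key)
    if delta["kind"] == "update":
        pointers[key] = delta["new"]          # the update always wins
        if current != delta["old"]:
            # The cleaner moved the block first; its copy is now unused.
            return {"kind": "reject", "file": key[0], "block": key[1],
                    "unused": current}
        return None
    if delta["kind"] == "cleaner":
        if current == delta["old"]:
            pointers[key] = delta["new"]      # no conflict: accept the move
            return None
        # The block was updated during cleaning; the cleaner's copy is unused.
        return {"kind": "reject", "file": key[0], "block": key[1],
                "unused": delta["new"]}
    raise ValueError(f"unexpected delta kind: {delta['kind']}")

update = {"kind": "update", "file": 1, "block": 0, "old": "s1", "new": "s2"}
cleaner = {"kind": "cleaner", "file": 1, "block": 0, "old": "s1", "new": "s3"}

# The update is processed first, then the cleaner delta arrives.
pointers_a = {(1, 0): "s1"}
rejects_a = [apply_delta(pointers_a, update), apply_delta(pointers_a, cleaner)]

# The cleaner delta is processed first, then the update arrives.
pointers_b = {(1, 0): "s1"}
rejects_b = [apply_delta(pointers_b, cleaner), apply_delta(pointers_b, update)]
```

Both orderings end with the block pointing at the update's copy and with a reject delta naming the cleaner's copy, which is why no file locking is needed during cleaning.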
15Adding Additional Storage Servers
- When a new storage server becomes available, all
that must be done is notify the clients, file
manager, and stripe cleaner that each stripe
group has one more fragment.
16Restoring Consistency After Crashes
- Two general issues upon crash
- Consistency
- Availability
- Zebra uses a checkpoint and roll-forward method
for restoring consistency.
- Three new consistency problems
- Stripes may become internally inconsistent
- Some of the data or parity is written but not all
of it
- Information written to stripes may become
inconsistent with metadata
- Stripe cleaner state becomes inconsistent with
stripes
17Stripes may become internally inconsistent
- Zebra stores a simple checksum for each fragment.
- On storage server reboot
- Verifies checksums only for fragments written
around the time of the crash (using deltas)
- Discards incomplete fragments
- Queries other storage servers to find out what
new stripes were written while the server was down.
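The reboot check can be sketched with a CRC standing in for the "simple checksum". The slides do not say which checksum Zebra uses, so zlib.crc32 is an assumption, as are the function names and the dictionary used as a disk.

```python
import zlib

def store_fragment(disk, fragment_id, data):
    """Record a fragment together with the checksum of its full contents."""
    disk[fragment_id] = (data, zlib.crc32(data))

def verify_recent(disk, recent_ids):
    """Check only fragments written around the crash; drop the incomplete ones."""
    discarded = []
    for fid in recent_ids:
        data, checksum = disk[fid]
        if zlib.crc32(data) != checksum:
            del disk[fid]
            discarded.append(fid)
    return discarded

disk = {}
store_fragment(disk, "f1", b"complete fragment")

# Simulate a torn write: the last four bytes never reached the disk,
# but the checksum of the intended contents was recorded.
intended = b"stripe fragment payload"
torn = intended[:-4] + b"\0\0\0\0"
disk["f2"] = (torn, zlib.crc32(intended))

discarded = verify_recent(disk, ["f1", "f2"])
```

Only the fragments named in recent_ids are checked, which keeps the reboot cost independent of how much data the server holds.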
18Information written to stripes may become
inconsistent with metadata
- If a client crashes, the file manager must check
the client's log to make sure the last writes to
it completed successfully.
- If the manager crashes, it has to start from its
last checkpoint and roll forward through the rest
of every client's log to update its information
since that checkpoint.
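Checkpoint and roll-forward can be sketched as replaying the part of each client log that lies past the checkpoint. The checkpoint layout and log entries here are invented, and ordering between different clients' deltas is ignored to keep the example short.

```python
def roll_forward(checkpoint, logs):
    """Rebuild the manager's block pointers after a crash.

    checkpoint: {"pointers": {(file, block): pointer},
                 "positions": {client: number of log entries already applied}}
    logs: {client: [(file, block, new_pointer), ...]}
    """
    pointers = dict(checkpoint["pointers"])
    positions = dict(checkpoint["positions"])
    for client, log in logs.items():
        start = positions.get(client, 0)      # skip what the checkpoint covers
        for file_id, block, new_pointer in log[start:]:
            pointers[(file_id, block)] = new_pointer
        positions[client] = len(log)
    return pointers, positions

checkpoint = {"pointers": {(1, 0): "s1:0"}, "positions": {"c1": 1}}
logs = {
    "c1": [(1, 0, "s1:0"), (1, 0, "s4:2"), (2, 0, "s4:3")],
    "c2": [(3, 0, "s5:0")],
}
pointers, positions = roll_forward(checkpoint, logs)
```

Recovery time depends on how much was logged since the last checkpoint, not on the total size of the file system.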
19Stripe cleaner state becomes inconsistent with
stripes
- Stripe cleaner also stores periodic checkpoints
and on restart reads (and corrects) status files
then starts collecting more utilization
information from the point of its last checkpoint
(roll-forward).
20Performance overview
21Performance overview cont
22Conclusion
- Zebra provides higher throughput, availability,
and scalability than previous file systems at
the cost of increased system complexity.
- It's only step one. As we saw, xFS included and
improved on Zebra's core functionality.