Building Distributed, Wide-Area Applications with WheelFS

1
Building Distributed, Wide-Area Applications with WheelFS
  • Jeremy Stribling, Emil Sit,
  • Frans Kaashoek, Jinyang Li, and Robert Morris
  • MIT CSAIL and NYU

2
Grid Computations Share Data
  • Nodes in a distributed computation share:
    • Program binaries
    • Initial input data
    • Processed output from one node as intermediary
      input to another node

3
So Do Users and Distributed Apps
  • Shared home directory for testbeds (e.g.,
    PlanetLab, RON)
  • Distributed apps reinvent the wheel:
    • Distributed digital research library
    • Wide-area measurement experiments
    • Cooperative web cache
  • Can we invent a shared data layer once?

4
Our Goal
  • Distributed file system for testbeds/Grids
  • App can share data between nodes
  • Users can easily access data
  • Simple-to-build distributed apps

[Figure: a testbed/Grid of nodes, several of which hold file foo]

5
Current Solutions
  • Usual drawbacks:
    • All data flows through one node
    • File systems are too transparent
      • Mask failures
      • Incur long delays

[Figure: testbed/Grid nodes copy file foo through a central file server]

6
Our Proposal: WheelFS
  • A decentralized, wide-area FS
  • Main contributions:
    • 1) Provide good performance according to "Read
      Globally, Write Locally"
    • 2) Give apps control with semantic cues

7
Talk Outline
  1. How to decentralize your file system
  2. How to control your files

8
What Does a File System Buy You?
  • A familiar interface
  • Language-independent usage model
  • Hierarchical namespace useful for apps
  • Quick-prototyping for apps

9
File Systems 101
[Figure: on one node, App 1 and App 2 make API calls to the operating system's file system, which is backed by the local hard disk]
  • File system (FS) API
    • Open <filename> → <file_id>
    • Close/Read/Write <file_id>
  • Directories translate file names to IDs
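
The API on this slide can be sketched as a toy in-memory file system. This is only an illustration: the class name `ToyFS` and its methods are hypothetical, and a real file system does far more (permissions, offsets, persistence).

```python
class ToyFS:
    """In-memory stand-in for the FS API on the slide (hypothetical names)."""

    def __init__(self):
        self.directory = {}    # file name -> file ID
        self.files = {}        # file ID -> bytearray of contents
        self.open_ids = set()  # IDs currently open
        self.next_id = 1

    def open(self, filename):
        # The directory translates the name to an ID, creating the file if absent.
        if filename not in self.directory:
            self.directory[filename] = self.next_id
            self.files[self.next_id] = bytearray()
            self.next_id += 1
        file_id = self.directory[filename]
        self.open_ids.add(file_id)
        return file_id

    def write(self, file_id, data):
        self._check_open(file_id)
        self.files[file_id] += data  # append-only for simplicity

    def read(self, file_id):
        self._check_open(file_id)
        return bytes(self.files[file_id])

    def close(self, file_id):
        self.open_ids.discard(file_id)

    def _check_open(self, file_id):
        if file_id not in self.open_ids:
            raise ValueError("file is not open")


fs = ToyFS()
fid = fs.open("foo")
fs.write(fid, b"hello")
```

After `open`, every other call works on the ID rather than the name, which is what lets a distributed design store the directory and the file contents on different nodes.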

10
Distributed File Systems
[Figure: the same App / operating system / file system stack on one node of a testbed/Grid; file 135 and dir 500 (entry foo → 135) are stored on other nodes]

11
Basic Design of WheelFS
[Figure: nodes 076, 150, 257, 402, 554, and 653 store file 135 and its later versions v2 and v3, with copies of each version on other nodes]

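
One way to map a file or directory ID onto the node IDs shown here is a successor rule, as in consistent hashing. The slides do not state the placement rule, so treat both the rule and the function name `responsible_node` as assumptions made for illustration.

```python
import bisect

# Node IDs as listed on the slide.
NODE_IDS = [76, 150, 257, 402, 554, 653]


def responsible_node(object_id, node_ids=NODE_IDS):
    """Return the first node ID >= object_id, wrapping around the ID space.

    Assumed placement rule (successor on a ring); not taken from the slides.
    """
    ids = sorted(node_ids)
    index = bisect.bisect_left(ids, object_id)
    return ids[index % len(ids)]
```

Under this assumed rule, any node can compute where an ID lives from the membership list alone, without asking a central server.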
12
Read Globally, Write Locally
  • Perform writes at local disk speeds
  • Efficient bulk data transfer
  • Avoid overloading nodes w/ popular files

13
Write Locally
  1. Choose an ID
  2. Create dir entry
  3. Write local file
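
The three steps can be sketched as follows. The `Node` class, the `write_locally` function, and the `low_id` parameter are hypothetical; the sketch assumes the writer picks an ID from a range it is itself responsible for, which the slides do not spell out.

```python
import random


class Node:
    """Minimal node state (hypothetical): local files plus directory entries."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.local_files = {}  # file ID -> contents on this node's disk
        self.dir_entries = {}  # dir ID -> {name: file ID}


def write_locally(writer, dir_node, dir_id, name, data, low_id):
    # 1. Choose an ID. Assumption: any ID in [low_id, writer.node_id] maps
    #    to the writer, so the data can stay on the writer's own disk.
    #    (Collisions with existing IDs are ignored in this sketch.)
    file_id = random.randint(low_id, writer.node_id)
    # 2. Create the directory entry on the node that stores the directory.
    dir_node.dir_entries.setdefault(dir_id, {})[name] = file_id
    # 3. Write the contents to the writer's local disk.
    writer.local_files[file_id] = data
    return file_id


writer = Node(554)
dir_node = Node(257)
file_id = write_locally(writer, dir_node, 209, "bar", b"results", low_id=403)
```

Only step 2 needs the network; the bulk data in step 3 never leaves the writer, which is how writes can proceed at local disk speed.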

[Figure: nodes 076, 150, 257, 402, 554, and 653; file 550 (bar) is written locally and the entry bar → 550 is added to dir 209 (foo)]

14
Read Globally
  1. Contact node
  2. Receive list
  3. Get chunks
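
A minimal sketch of the three read steps, assuming the node responsible for a file keeps a list of nodes holding cached copies and that a reader spreads chunk requests across that list. `FileHome`, `read_globally`, and the round-robin choice are illustrative assumptions, not the system's actual protocol.

```python
class FileHome:
    """Node responsible for a file; tracks which nodes cache it (hypothetical)."""

    def __init__(self, node_id, chunks):
        self.node_id = node_id
        self.chunks = chunks  # list of bytes, one entry per chunk
        self.cachers = []     # IDs of nodes known to hold cached copies


def read_globally(reader_id, home, caches):
    # 1. Contact the node responsible for the file.
    # 2. Receive the list of nodes that can serve it.
    sources = [home.node_id] + list(home.cachers)
    # 3. Get chunks, spreading requests round-robin over the sources.
    data, served_by = [], []
    for i in range(len(home.chunks)):
        source = sources[i % len(sources)]
        store = home.chunks if source == home.node_id else caches[source]
        data.append(store[i])
        served_by.append(source)
    # The reader now holds a copy and is advertised to later readers.
    caches[reader_id] = list(data)
    home.cachers.append(reader_id)
    return b"".join(data), served_by


home = FileHome(150, [b"a", b"b", b"c"])
caches = {}
first = read_globally(76, home, caches)
second = read_globally(653, home, caches)
```

Each reader becomes a source for the next one, so load on the responsible node drops as a file grows more popular.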

[Figure: a reader fetches chunks of file 135 from nodes holding cached copies; the list of caching nodes grows from 076, 653 to 076, 554, 653]

15
Example: BLAST
  • DNA alignment tool run on Grids
  • Copy separate DB portions and queries to many
    nodes
  • Run separate computations
  • Later fetch and combine results

16
Example: BLAST
  • With WheelFS, however:
    • No explicit DB copying necessary
    • Efficient initial DB transfers
    • Automatic caching for reused DBs and queries
  • Could be better since data is never updated

17
Example: Cooperative Web Cache
  • Collection of nodes that:
    • Serve redirected web requests
    • Fetch web content from original web servers
    • Cache web content and serve it directly
    • Find cached content on other CWC nodes

18
Example: Cooperative Web Cache
if [ -f /wfs/cwc/$URL ]; then
  if notexpired /wfs/cwc/$URL; then
    cat /wfs/cwc/$URL
    exit
  fi
fi
wget $URL -O - | tee /wfs/cwc/$URL
  • Avoid hotspots
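
The same cache logic can be written against any mounted directory, which is the point of using a file system as the shared layer. The sketch below is a Python rendering under assumptions: `cached_fetch`, the `fetch` callback, and the `max_age` expiry test are hypothetical stand-ins (the slide does not define `notexpired`), and URLs are percent-encoded so each one is a valid file name.

```python
import os
import time
from urllib.parse import quote


def cached_fetch(url, cache_dir, fetch, max_age=60.0):
    """Serve url from cache_dir if a fresh copy exists, else fetch and store it.

    fetch(url) -> bytes and max_age (seconds) are assumptions of this sketch.
    """
    # Encode the URL so it can be used as a single file name.
    path = os.path.join(cache_dir, quote(url, safe=""))
    if os.path.isfile(path) and time.time() - os.path.getmtime(path) < max_age:
        with open(path, "rb") as cached:
            return cached.read()
    # Miss or expired: go to the origin and leave a copy for other nodes.
    body = fetch(url)
    with open(path, "wb") as out:
        out.write(body)
    return body
```

If `cache_dir` were a directory under `/wfs/cwc`, every node running this function would see the copies written by the others, with no cache-specific networking code.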

19
Example: Cooperative Web Cache
[Figure: a client's request for URL is looked up in dir 070 (/wfs/cwc); the nodes hold file 550, file 135, and cached copies of 135, and chunks are fetched from several nodes]
if [ -f /wfs/cwc/$URL ]; then
  if notexpired /wfs/cwc/$URL; then
    cat /wfs/cwc/$URL
    exit
  fi
fi
wget $URL -O - | tee /wfs/cwc/$URL

20
Talk Outline
  1. How to decentralize your file system
  2. How to control your files

21
Example: Cooperative Web Cache
if [ -f /wfs/cwc/$URL ]; then
  if notexpired /wfs/cwc/$URL; then
    cat /wfs/cwc/$URL
    exit
  fi
fi
wget $URL -O - | tee /wfs/cwc/$URL
  • Would rather fail and refetch than wait
  • Perfect consistency isn't crucial

22
Explicit Semantic Cues
  • Allow direct control over system behavior
  • Meta-data that attach to files, dirs, or refs
  • Apply recursively down dir tree
  • Possible impl: intra-path component
    • /wfs/cwc/.cue/foo/bar
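
A small sketch of how cue components could be separated from the rest of a path. It assumes that any component starting with "." (other than "." and "..") is a comma-separated list of cues and that a cue may carry a value after "="; the function name `split_cues` and these parsing rules are illustrative, not the system's actual syntax rules.

```python
def split_cues(path):
    """Separate cue components (assumed to start with '.') from a path."""
    plain, cues = [], {}
    for part in path.split("/"):
        if part.startswith(".") and part not in (".", ".."):
            # One component may hold several cues, e.g. ".lax,writemany".
            for cue in part[1:].split(","):
                name, _, value = cue.partition("=")
                cues[name.lower()] = value or True
        else:
            plain.append(part)
    return "/".join(plain), cues


path, cues = split_cues("/wfs/cwc/.lax,writemany/foo")
```

Because the cues ride inside the path, unmodified programs such as `cat` and `wget` can pass them without any new system call. One cost of this assumed rule is that ordinary dotfiles could no longer be named directly.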

23
Semantic Cues: Writability
  • Applies to files
  • WriteMany (default)
  • WriteOnce
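
The difference between the two cues can be sketched with a versioned file object. The class `CuedFile` and its methods are hypothetical, and the sketch assumes WriteOnce means that a file rejects every write after its first.

```python
class CuedFile:
    """File with a writability cue (hypothetical model)."""

    def __init__(self, writability="WriteMany"):
        self.writability = writability
        self.version = 0
        self.data = b""

    def write(self, data):
        # Assumed WriteOnce semantics: the first write succeeds, later ones fail.
        if self.writability == "WriteOnce" and self.version > 0:
            raise PermissionError("WriteOnce file has already been written")
        self.data = data
        self.version += 1

    def is_immutable(self):
        # Once a WriteOnce file is written, its contents can never change.
        return self.writability == "WriteOnce" and self.version > 0


db = CuedFile("WriteOnce")
db.write(b"sequence data")
```

Under this reading, a cached copy of a written WriteOnce file never needs to be checked against a newer version.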

[Figure: file 135 and its versions v2 and v3 on the nodes, with cached copies of 135 and 135 v3]

24
Semantic Cues: Freshness
  • Applies to file references
  • LatestVersion (default)
  • AnyVersion
  • BestVersion
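
The slide names the three cues without defining them, so the sketch below encodes one plausible reading, stated as an assumption: LatestVersion insists on the newest version, AnyVersion accepts the first copy that answers, and BestVersion takes the newest copy among those that answered in time. The function `choose_version` and its parameters are hypothetical.

```python
def choose_version(cue, reachable_versions, latest_version):
    """Pick a version number under an assumed reading of each freshness cue.

    reachable_versions: versions of copies that answered in time, in
        arrival order.
    latest_version: newest version known to the responsible node, or None
        if that node could not be reached.
    """
    if cue == "LatestVersion":
        if latest_version is None or latest_version not in reachable_versions:
            raise TimeoutError("latest version is unavailable")
        return latest_version
    if not reachable_versions:
        raise FileNotFoundError("no copy is reachable")
    if cue == "AnyVersion":
        return reachable_versions[0]  # first copy that answered
    if cue == "BestVersion":
        return max(reachable_versions)
    raise ValueError(f"unknown cue: {cue}")
```

The trade-off is the same under any reading: weaker freshness lets a read succeed from a nearby or stale copy instead of waiting on a distant or failed node.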

[Figure: file 135 on one node and a cached copy of 135 on another]

25
Semantic Cues: Write Consistency
  • Applies to files or directories
  • Strict (default)
  • Lax

[Figure: file 135 and file 135 v2 on different nodes, with copies of 135 and 135 v2]

26
Example: BLAST
  • WriteOnce for all:
    • DB files
    • Query files
    • Result files
  • Improves cacheability of these files

27
Example: Cooperative Web Cache
  • Reading an older version is ok
    • cat /wfs/cwc/.maxtime=250,bestversion/foo
  • Writing conflicting versions is ok
    • wget http://foo -O - > /wfs/cwc/.lax,writemany/foo

if [ -f /wfs/cwc/.maxtime=250,bestversion/$URL ]; then
  if notexpired /wfs/cwc/.maxtime=250,bestversion/$URL; then
    cat /wfs/cwc/.maxtime=250,bestversion/$URL
    exit
  fi
fi
wget $URL -O - | tee /wfs/cwc/.lax,writemany/$URL

28
Discussion
  • Must break data up into files small enough to fit
    on one disk
  • Stuff we swept under the rug:
    • Security
    • Atomic renames across dirs
    • Unreferenced files

29
Related Work
  • Every FS paper ever written
  • Specifically:
    • Cluster FS: Farsite, GFS, xFS, Ceph
    • Wide-area FS: JetFile, CFS, Shark
    • Grid: LegionFS, GridFTP, IBP
    • POSIX I/O High Performance Computing Extensions

30
Conclusion
  • WheelFS: distributed storage layer for
    newly-written applications
  • Performance by reading globally and writing
    locally
  • Control through explicit semantic cues