CS176C: Spring 2006 Applications on Structured Overlays
1
CS176C Spring 2006: Applications on Structured Overlays
  • Administrivia
  • Homework due Friday night
  • Chimera memory leak (thanks to Chang Kou and his group)
  • Today
  • Quick overview of Chimera
  • Peer-to-peer APIs: DHT / DOLR / CAST
  • A storage application: OceanStore

2
What is Chimera?
  • Several protocol implementations already exist, but most have drawbacks:
  • Large code bases (integrated w/ other projects)
  • Java implementations with a big footprint
  • Research prototypes, often buggy
  • Chimera's goal: a lean, mean routing machine
  • Combine best of previous protocols
  • Higher throughput routing
  • C-implementation, small footprint for
    distribution
  • Thus far, Chimera includes
  • Prefix routing
  • More stability / simplicity (Leafsets from
    Pastry)
  • Optimized routing topology (Join algorithms from
    Tapestry)
  • Ver 1.04: 3900 lines of C (versus Tapestry: 35,000 lines of Java)

3
How Can You Use Chimera?
  • Focus on being a high-throughput routing layer
  • Route messages to a key (KBR)
  • Route messages to a node
  • Additional services on top of Chimera
  • What are the APIs for overlay applications?
  • Distributed storage on Chimera (DHT): treat machines like buckets in a hash table; hash(file) = 5, so store it on machine 5
  • Distributed yellow pages (DOLR): store your file or your service wherever you like; just tell everyone that location(file) = ME
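
To make the "route messages to a key" idea concrete, here is a minimal single-process sketch of a KBR-style interface. The names (kbr_route, kbr_set_deliver) and types are invented for illustration and are not Chimera's actual API; in a real overlay the message would be forwarded hop by hop until it reaches root(key), where the application's deliver upcall fires.

```c
/*
 * Hypothetical KBR-style interface (NOT Chimera's real API) simulated in a
 * single process: kbr_route() would normally forward the message hop by hop
 * toward root(key); here it simply invokes the registered deliver upcall,
 * which is what the application running on root(key) would see.
 */
#include <stdio.h>

typedef unsigned int kbr_key;                        /* toy key space      */
typedef void (*kbr_deliver_fn)(kbr_key key, const char *msg);

static kbr_deliver_fn deliver_cb;                    /* application upcall */

static void kbr_set_deliver(kbr_deliver_fn fn) { deliver_cb = fn; }

static void kbr_route(kbr_key key, const char *msg)
{
    /* Real overlay: pick next hop by prefix match, repeat until root(key). */
    if (deliver_cb)
        deliver_cb(key, msg);
}

static void on_deliver(kbr_key key, const char *msg)
{
    printf("delivered at root(%u): %s\n", key, msg);
}

int main(void)
{
    kbr_set_deliver(on_deliver);
    kbr_route(800, "hello overlay");                 /* route to key 800   */
    return 0;
}
```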

4
A Distributed Hash Table
  • Just like a traditional hash table
  • Write data: put(key, data)
  • Read data: get(key)
  • The difference?
  • Spread across many machines
  • Each machine is a bucket
  • Who stores data with key K?
  • Root(K)
  • put(key, data)
  • Sends the file via KBR to the remote host
  • get(key)
  • Retrieves data via KBR from root(K) (toy example below)

[Diagram: identifier ring 0-1023 with nodes at 0, 128, 256, 384, 512, 640, 768, 896; a put for key K=800 is routed to the node acting as its bucket.]
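
A toy version of the put/get semantics above, using the ring from the figure (nodes 0, 128, ..., 896 in a 0-1023 identifier space). root(key) here is simply the first node ID at or past the key, wrapping around; the exact root-selection rule differs between protocols, and a real DHT would reach that node via KBR rather than indexing a local array.

```c
/*
 * Toy DHT over the ring in the figure (nodes 0,128,...,896 in a 0..1023
 * identifier space; all values invented for illustration). put() stores a
 * value at root(key), and get() reads it back from the same node.
 */
#include <stdio.h>

#define RING    1024
#define NNODES  8
#define SLOTS   RING                       /* one slot per possible key   */

static const int node_id[NNODES] = {0, 128, 256, 384, 512, 640, 768, 896};
static const char *bucket[NNODES][SLOTS];  /* bucket[i] = node i's store  */

static int root(int key)                   /* successor of key on the ring */
{
    for (int i = 0; i < NNODES; i++)
        if (node_id[i] >= key)
            return i;
    return 0;                              /* wrap past the highest ID    */
}

static void dht_put(int key, const char *data) { bucket[root(key)][key] = data; }
static const char *dht_get(int key)            { return bucket[root(key)][key]; }

int main(void)
{
    dht_put(800, "file contents");         /* lands at node 896 under this rule */
    printf("get(800) -> %s (held by node %d)\n",
           dht_get(800), node_id[root(800)]);
    return 0;
}
```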
5
Enhancements to DHT
  • A distributed hash table (DHT) is simple, useful
  • But not enough
  • Root(K) might be far away
  • Root(K) might be fragile
  • Root(K) might be insecure
  • So what do we do?
  • Enhancements to DHT
  • Caching DHT: local copies of files
  • Root replication: multiple independent roots (one way to derive them is sketched below)
  • Is this enough?
  • Replication is expensive
  • Consider Erasure coding
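
A sketch of one way to get the "multiple roots" mentioned above: derive several replica keys by hashing the original key together with a small salt, so each replica key maps to a different root node. The salting scheme is illustrative rather than something any particular DHT mandates, and FNV-1a appears only to keep the code self-contained; a deployed system would use a cryptographic hash.

```c
/*
 * Root replication by key salting (illustrative scheme, not a standard):
 * replica i of key k lives at root(hash(k || i)), giving r independent
 * roots that fail and get attacked independently.
 */
#include <stdint.h>
#include <stdio.h>

static uint64_t fnv1a(const void *buf, size_t len)     /* 64-bit FNV-1a */
{
    const uint8_t *p = buf;
    uint64_t h = 14695981039346656037ULL;              /* offset basis  */
    while (len--) { h ^= *p++; h *= 1099511628211ULL; } /* FNV prime    */
    return h;
}

/* replica_key(k, i): key whose root holds the i-th replica of k. */
static uint64_t replica_key(uint64_t k, uint32_t i)
{
    uint8_t buf[12];
    for (int b = 0; b < 8; b++) buf[b]     = (uint8_t)(k >> (8 * b));
    for (int b = 0; b < 4; b++) buf[8 + b] = (uint8_t)(i >> (8 * b));
    return fnv1a(buf, sizeof buf);
}

int main(void)
{
    uint64_t k = fnv1a("myfile.txt", 10);
    for (uint32_t i = 0; i < 3; i++)                   /* three roots   */
        printf("replica %u -> key %016llx\n", i,
               (unsigned long long)replica_key(k, i));
    return 0;
}
```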

6
A Caching DHT on KBR
[Diagram: file F hashes to key K and is stored at root(K); lookups routed toward the root can be answered from cached copies along the way.]
7
Distributed Directory Service
  • A distributed yellow pages for data
  • K's server: publish(K)
  • K's clients: routeMsg(K, msg)
  • Sends msg to K's server
  • How does it work?
  • Sprinkle yellow-page entries on the path from the server to root(K)
  • routeMsg routes toward K, finds a sprinkled entry, and redirects to the server (see the sketch below)
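
A self-contained sketch of the pointer-sprinkling scheme just described. The overlay is faked as a fixed next-hop table toward root(K); publish() leaves a pointer to the server at every hop on its path to the root, and locate() walks from a client toward the root and turns toward the server at the first pointer it finds. The node IDs, routing table, and function names are all invented for illustration.

```c
/*
 * DOLR "sprinkle pointers" sketch over a tiny faked overlay: next_hop[n]
 * is node n's forwarding choice when routing toward root(K).
 */
#include <stdio.h>

#define NNODES 8
#define ROOT   7                       /* pretend node 7 is root(K)        */

static const int next_hop[NNODES] = {2, 2, 5, 5, 5, 7, 7, 7};
static int pointer_to_server[NNODES];  /* 0 = no pointer, else server id+1 */

static void publish(int server)
{
    for (int n = server; ; n = next_hop[n]) {
        pointer_to_server[n] = server + 1;     /* sprinkle a pointer       */
        if (n == ROOT) break;
    }
}

static int locate(int client)          /* returns the server's node id     */
{
    for (int n = client; ; n = next_hop[n]) {
        if (pointer_to_server[n])              /* found a pointer: redirect */
            return pointer_to_server[n] - 1;
        if (n == ROOT) break;
    }
    return -1;                                 /* K was never published    */
}

int main(void)
{
    publish(0);                                /* K's server lives at node 0 */
    printf("client 3 finds K's server at node %d\n", locate(3));
    return 0;
}
```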

8
DOLR on Routing API
9
Some Real Applications
  • Many possibilities
  • Storage, media delivery, resilient routing
  • Storage
  • Distributed file systems (we'll talk about one of them)
  • Automatic backup services
  • Distributed CVS
  • Media delivery
  • Wide-area multicast systems
  • Video-on-demand systems
  • Content distribution networks (CDNs)
  • Resilient routing
  • Route around Internet failures

10
OceanStore: A Global Storage Utility
11
The Challenges
  • Maintenance
  • Many components, many administrative domains
  • Constant change
  • Must be self-organizing
  • Must be self-maintaining
  • All resources are virtualized; no physical names
  • Security
  • High availability is a hacker's target-rich environment
  • Must have end-to-end encryption
  • Must not place too much trust in any one host

12
The Big Picture
  • For durability (archival layer)
  • Apply Erasure coding
  • Replicate copies of fragments across network
  • Periodically check for level of redundancy
  • For quick access (dissemination layer)
  • Maintain a small number of copies replicated in the network
  • Use access tracking to move copies closer to
    clients
  • For attack resilience (Byzantine layer,
    inner-ring)
  • Key decisions made per file by inner-ring of
    servers
  • Distributed decisions verified by a threshold
    signature

13
Technologies: Tapestry
  • You guys already know this
  • This is Tapestry in action on a wide-area network
  • Tapestry performs
  • Distributed Object Location and Routing
  • From any host, find a nearby replica of a data object
  • Efficient
  • O(log N) location time, where N = number of hosts in the system (quick arithmetic below)
  • Self-organizing, self-maintaining
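
A quick sanity check on the O(log N) claim: with base-b prefix routing, each hop resolves one more digit of the destination ID, so routes are roughly log_b(N) hops long. Base 16 matches Tapestry's usual digit size; the host counts are arbitrary examples, not measurements.

```c
/*
 * Expected route length under base-b prefix routing: about log_b(N) hops,
 * since every hop fixes one more digit of the destination ID.
 */
#include <math.h>
#include <stdio.h>

int main(void)
{
    const double base = 16.0;                      /* hex digits, as in Tapestry */
    const double hosts[] = {100, 10000, 1000000};

    for (int i = 0; i < 3; i++)
        printf("N = %8.0f hosts -> ~%.1f hops\n",
               hosts[i], log(hosts[i]) / log(base));
    return 0;
}
```

Even at a million hosts the expected route stays around five overlay hops.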

14
Technologies: Tapestry (cont.)
15
Technologies: Erasure Codes
  • More durable than replication for same space
  • The technique

16
Technologies: Erasure Codes
  • Properties
  • Code(data) → N fragments; only m are needed to regenerate the data
  • Computationally expensive to fragment /
    reconstruct
  • Using erasure codes
  • What's the right coding factor?
  • Where do you distribute fragments?
  • Concerns?
  • A minimum of m fragments is needed
  • If fewer than m fragments survive: total data loss (a toy 2-of-3 example follows below)
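
The smallest possible erasure code, to make the m-of-n property concrete: split the data into m = 2 halves and add one XOR parity fragment, giving n = 3 fragments of which any 2 rebuild the original. OceanStore used far larger Reed-Solomon-style codes; the fragment size and contents below are just an example.

```c
/*
 * Minimal (n=3, m=2) erasure code: two data fragments plus one XOR parity
 * fragment. Losing any single fragment is survivable.
 */
#include <stdio.h>
#include <string.h>

#define FRAG 8                              /* bytes per fragment       */

static void xor_frag(unsigned char *dst, const unsigned char *a,
                     const unsigned char *b)
{
    for (int i = 0; i < FRAG; i++)
        dst[i] = a[i] ^ b[i];
}

int main(void)
{
    unsigned char d0[FRAG] = "ABCDEFG";     /* first half of the data   */
    unsigned char d1[FRAG] = "HIJKLMN";     /* second half              */
    unsigned char parity[FRAG];

    xor_frag(parity, d0, d1);               /* encode: parity = d0 ^ d1 */

    /* Pretend fragment d0 was lost; rebuild it from the surviving two. */
    unsigned char rebuilt[FRAG];
    xor_frag(rebuilt, parity, d1);          /* d0 = parity ^ d1         */

    printf("recovered fragment 0: %.7s (matches: %s)\n",
           rebuilt, memcmp(rebuilt, d0, FRAG) == 0 ? "yes" : "no");
    return 0;
}
```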

17
Technologies: Byzantine Agreement
  • Guarantees all non-faulty replicas agree
  • Given N = 3f + 1 replicas, up to f may be faulty/corrupt
  • Expensive
  • Requires O(N²) communication (worked numbers below)
  • Combine with primary-copy replication
  • Small number participate in Byzantine agreement
  • Multicast results of decisions to remainder
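
The arithmetic behind the bound above, worked out for a few values of f. The 2f + 1 quorum size is the standard figure for Byzantine fault-tolerant protocols rather than something stated on the slide, and the N² term is just the all-to-all message count.

```c
/*
 * N = 3f + 1 replicas tolerate f corrupt ones; all-to-all agreement
 * traffic grows like N^2, which is why OceanStore keeps the agreeing
 * inner ring small and multicasts results to the other replicas.
 */
#include <stdio.h>

int main(void)
{
    for (int f = 1; f <= 4; f++) {
        int n = 3 * f + 1;
        printf("f = %d faulty -> N = %2d replicas, quorum = %d, "
               "~%3d msgs per all-to-all round\n",
               f, n, 2 * f + 1, n * n);
    }
    return 0;
}
```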

18
Putting it all together: a Write
19
Prototype Implementation
  • All major subsystems operational
  • Self-organizing Tapestry base
  • Primary replicas use Byzantine agreement
  • Secondary replicas self-organize into multicast
    tree
  • Erasure-coding archive
  • Application interfaces: NFS, IMAP/SMTP, HTTP
  • Event-driven architecture
  • Built on SEDA
  • 280K lines of Java (J2SE v1.3)
  • JNI libraries for cryptography, erasure coding

20
Deployment on PlanetLab
  • http://www.planet-lab.org
  • 100 hosts, 40 sites (in 2003)
  • Shared .ssh/authorized_keys file
  • Pond: up to 1000 virtual nodes
  • Using custom Perl scripts
  • 5 minute startup
  • Gives global scale for free
  • Performance
  • Faster reads than NFS (why?)
  • Slower writes (why?)

21
A Case for Common APIs
  • Lots and lots of peer-to-peer applications
  • Decentralized file systems, archival backup
  • Group communication / coordination
  • Routing layers for anonymity, attack resilience
  • Scalable content distribution
  • A number of scalable, self-organizing overlays
  • E.g., CAN, Chord, Pastry, Tapestry, Kademlia, etc.
  • Semantic differences
  • Store/get data, locate objects, multicast /
    anycast
  • How do these functional layers relate?
  • What is the smallest common denominator?
  • One ring to rule them all?
  • Common API would encourage application portability

22
Some Abstractions
  • Distributed Hash Tables (DHT)
  • Simple store and retrieve of values with a key
  • Values can be of any type
  • Decentralized Object Location and Routing (DOLR)
  • Decentralized directory service for
    endpoints/objects
  • Route messages to nearest available endpoint
  • Multicast / Anycast (CAST)
  • Scalable group communication
  • Decentralized membership management
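
To show how differently the three abstractions read to an application, here is a rough set of C stubs over a shared KBR layer. The names and signatures are invented for this sketch, loosely in the spirit of the common-API idea from the previous slide, and should not be taken as the actual Tier 1 interfaces.

```c
/*
 * Illustrative Tier 1 operations over a shared KBR layer (names and
 * signatures are invented). Each stub prints what a real implementation
 * would do via key-based routing.
 */
#include <stdio.h>

typedef unsigned long overlay_key;

/* DHT: store / retrieve a value under a key, at root(key). */
static void dht_put(overlay_key k, const char *v)
{
    printf("DHT put  : store \"%s\" at root(%lu)\n", v, k);
}
static void dht_get(overlay_key k)
{
    printf("DHT get  : fetch the value from root(%lu)\n", k);
}

/* DOLR: publish an object's location, route messages to a nearby replica. */
static void dolr_publish(overlay_key obj)
{
    printf("DOLR publish: leave pointers to this host on the path to root(%lu)\n", obj);
}
static void dolr_route(overlay_key obj, const char *msg)
{
    printf("DOLR route  : deliver \"%s\" to the nearest replica of object %lu\n", msg, obj);
}

/* CAST: group membership and multicast keyed by a group ID. */
static void cast_join(overlay_key group)
{
    printf("CAST join     : graft onto the tree rooted at root(%lu)\n", group);
}
static void cast_multicast(overlay_key group, const char *msg)
{
    printf("CAST multicast: send \"%s\" down group %lu's tree\n", msg, group);
}

int main(void)
{
    dht_put(42, "hello"); dht_get(42);
    dolr_publish(7);      dolr_route(7, "ping");
    cast_join(99);        cast_multicast(99, "meeting at noon");
    return 0;
}
```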

23
Tier 1 Interfaces
24
Structured P2P Overlays
  • Tier 2 (applications): CFS, PAST, SplitStream, i3, OceanStore, Bayeux
  • Tier 1 (abstractions): CAST, DHT, DOLR
  • Tier 0: Key-based Routing (KBR)
25
Next Time
  • We'll talk more about applications for data streaming
  • Think about Homework 2 (specs will go up this
    week)
  • Deploy a simple native application on Chimera
  • Default: remote file transfer (FTP)
  • More interesting ideas are ok, but check with me
    first!
  • Think about projects!!
  • Implement a resilient DHT on Chimera
  • Integrating Chimera with network sensors
  • Design / implement a resilient chat application
  • Large projects require group merges (2 groups → 4 members)
  • Promising projects can become BS/MS projects
  • More ideas for projects next lecture