Storage management and caching in PAST

1
Storage management and caching in PAST
  • Antony Rowstron and Peter Druschel
  • Presented to cs294-4 by Owen Cooper

2
Outline
  • PAST goals
  • PAST API
  • File storage overview
  • File and replica diversion
  • Replica management
  • Caching
  • Performance
  • Discussion

3
PAST (non)goals
  • P2P global storage network
  • Use properties of existing P2P systems (Pastry)
  • Support for strong persistence
      • Via a core set of replicas
  • High availability
      • Via local caching
  • Scalable
      • Obtain high storage utilization via local
        cooperation
  • Secure
  • Design goals do not include
      • Replacing the file system
      • Updatable files
      • Directory or lookup service

4
Security Model
  • Pastry node ids are a hash of a public key
  • Smartcard-based security
      • Provides keys
      • Quota management
  • Nodeid and fileid generation controlled
      • Try to stop nodes from getting consecutive ids
      • Or clients from overloading parts of the network
  • But node id and real-world identity may not be
    linked
  • Data not encrypted

5
PAST APIs
  • In PAST, files are immutable
  • Fileid = Insert(filename, credentials, k, file)
      • Insert k copies of the file into the network, or
        fail.
      • Fileid = a signed (filename, credentials, salt)
      • Successful if ack with receipts from k nodes
  • File = Lookup(fileid)
      • Return a copy of the file if it exists
  • Reclaim(fileid, credentials)
      • Reclaim accepted if requested by the owner
      • Allows, but does not require, storage reclamation
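
The fileid derivation can be sketched as follows. This is a minimal illustration, not the presenter's code: it assumes the fileid is a SHA-1 hash over the filename, the owner's public key (standing in for "credentials"), and a salt, and the helper names make_fileid and new_salt are hypothetical.

```python
import hashlib
import os


def make_fileid(filename: str, owner_public_key: bytes, salt: bytes) -> str:
    """Derive a 160-bit fileid (hex) from filename, owner key, and salt."""
    h = hashlib.sha1()
    for part in (filename.encode("utf-8"), owner_public_key, salt):
        # Length-prefix each field so field boundaries stay unambiguous.
        h.update(len(part).to_bytes(4, "big"))
        h.update(part)
    return h.hexdigest()


def new_salt(num_bytes: int = 8) -> bytes:
    """Pick a fresh random salt (the length is an arbitrary choice here)."""
    return os.urandom(num_bytes)
```

Because the salt is part of the hash input, the same file can be given a different fileid, and so a different region of the id space, simply by choosing a new salt.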

6
File insertion
  • Insert(name, c, k, file)
      • Computes a storage certificate
          • Contains fileid, hash of content, k, salt
      • Deducts k × filesize from quota
      • Routes file and storage certificate via Pastry,
        keyed by fileid
  • Node verifies the integrity of the file, stores
    it, and asks the k-1 closest nodes to store the file.
      • The k-1 nodes are in its leaf set (k-1 < l)
  • Node returns an ack with k signed storage receipts,
    or a nak.
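
The client-side bookkeeping on this slide can be sketched roughly as below. The class and function names (Smartcard, StorageCertificate, verify_content) are hypothetical, SHA-1 is assumed as the content hash, and signing is left out entirely; only the certificate fields and the k × filesize quota charge come from the slide.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageCertificate:
    fileid: str
    content_hash: str
    k: int
    salt: bytes


class QuotaExceeded(Exception):
    """Raised when k * filesize would exceed the remaining quota."""


class Smartcard:
    """Hypothetical stand-in for the client's smartcard (no signing)."""

    def __init__(self, quota_bytes: int) -> None:
        self.quota_bytes = quota_bytes

    def issue_certificate(self, fileid: str, content: bytes,
                          k: int, salt: bytes) -> StorageCertificate:
        cost = k * len(content)  # every replica is charged to the owner
        if cost > self.quota_bytes:
            raise QuotaExceeded(f"need {cost}, have {self.quota_bytes}")
        self.quota_bytes -= cost
        digest = hashlib.sha1(content).hexdigest()
        return StorageCertificate(fileid, digest, k, salt)


def verify_content(cert: StorageCertificate, content: bytes) -> bool:
    """Integrity check a storing node could run before accepting the file."""
    return hashlib.sha1(content).hexdigest() == cert.content_hash
```

A node that receives the file and the certificate would call verify_content before storing, and return a nak if the check fails.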

7
Lookup and Reclamation
  • Pastry ensures a replica is found
      • Since a lookup is routed to the closest nodeid
  • Reclamation
      • Client generates a reclaim certificate
      • Routes it to the fileid via Pastry
      • Recipients verify the certificate and issue a
        receipt
      • Client reclaims quota
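
To make "closest nodeid" concrete, here is a small sketch that picks the k node ids numerically closest to a fileid. It assumes a circular id space of 2^bits values, and the function names are illustrative only. Real Pastry reaches these nodes by prefix routing rather than by sorting a global list.

```python
from typing import Iterable, List


def circular_distance(a: int, b: int, bits: int = 128) -> int:
    """Distance between two ids on a ring of 2**bits positions."""
    space = 1 << bits
    d = abs(a - b) % space
    return min(d, space - d)


def k_closest(node_ids: Iterable[int], fileid: int, k: int,
              bits: int = 128) -> List[int]:
    """Return the k node ids numerically closest to fileid."""
    ranked = sorted(node_ids, key=lambda n: circular_distance(n, fileid, bits))
    return ranked[:k]
```

Insert and lookup both use the same notion of closeness, which is why a lookup routed toward the fileid arrives at a node holding a replica.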

8
Diversion
  • A file or replica can be relocated
      • For a replica, to another close node
          • If one of the k closest is overloaded
      • For a file, to another set of nodes in the
        idspace
          • If the nodes around a fileid are (possibly
            locally) congested
  • Why is this necessary?
      • Differing storage capacity at nodes
      • Differing file size for inserted files

9
Replica Diversion
  • Node responsible for fileid asks k-1 neighbors to
    store the file
  • Neighbor (N) may divert a copy to a node in its
    leaf set
      • Pointer to copy inserted at N
      • N issues the storage receipt
      • N also inserts a pointer on the (k+1)th closest
        node
          • No orphan if N fails
      • N remains responsible for pointer maintenance

10
File Diversion
  • Replica diversion is local
      • Allows storage choice between nodes around fileid
  • File diversion
      • Triggered when an insert with a fileid fails
      • Insert is tried a total of three times
      • New fileid generated by changing the salt
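
The retry loop behind file diversion might look like the sketch below. The try_insert callback is a hypothetical placeholder for the whole insert path (derive a fileid from the salt, route, collect k receipts); it returns the fileid on success and None on failure. The default of three attempts follows the slide.

```python
import os
from typing import Callable, Optional


def insert_with_file_diversion(
    try_insert: Callable[[bytes], Optional[str]],
    max_attempts: int = 3,
) -> Optional[str]:
    """Retry an insert with a fresh salt each time; None if all attempts fail."""
    for _ in range(max_attempts):
        salt = os.urandom(8)  # new salt => new fileid => different nodes
        fileid = try_insert(salt)
        if fileid is not None:
            return fileid
    return None
```

Each new salt moves the file to an unrelated part of the id space, so a locally congested neighborhood does not doom the insert.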

11
Storage Policy
  • How does a node choose to accept or reject a
    replica?
      • Computes sizeof(file)/sizeof(free_space)
      • Compares it to Tpri or Tdiv, depending on the
        node's role; rejects if the ratio is larger
      • Tpri > Tdiv
  • How is a node chosen for replica diversion?
      • Search the leaf set for the node that
          • Has maximal free space
          • Doesn't already hold a diverted or primary
            replica
  • File diversion
      • Used when k copies cannot be placed (via primary
        or diversion)
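
The acceptance test and the choice of a diversion target can be sketched as follows. The Node class and function names are hypothetical, and the threshold values 0.1 and 0.05 are illustrative settings chosen only to satisfy Tpri > Tdiv.

```python
from dataclasses import dataclass, field
from typing import Iterable, Optional, Set

T_PRI = 0.1   # illustrative threshold for primary replicas
T_DIV = 0.05  # illustrative, stricter threshold for diverted replicas


@dataclass
class Node:
    name: str
    free_space: int
    replicas: Set[str] = field(default_factory=set)  # fileids held here


def accepts(node: Node, file_size: int, threshold: float) -> bool:
    """Accept only if the file is small relative to remaining free space."""
    if node.free_space <= 0:
        return False
    return file_size / node.free_space <= threshold


def choose_diversion_target(leaf_set: Iterable[Node], fileid: str,
                            file_size: int,
                            t_div: float = T_DIV) -> Optional[Node]:
    """Pick the leaf-set node with the most free space that does not
    already hold a replica of this file and accepts it under t_div."""
    candidates = [n for n in leaf_set
                  if fileid not in n.replicas and accepts(n, file_size, t_div)]
    return max(candidates, key=lambda n: n.free_space, default=None)
```

If choose_diversion_target returns None, no node in the leaf set can take the replica, which is the situation where file diversion takes over.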

12
Replica maintenance
  • Node join/leave causes responsibility shift
      • Pastry node failure detection will cause leaf set
        updates
      • PAST detects responsibility shifts this way
  • Newly responsible node must copy files
      • Make a copy immediately, OR
      • Keep a pointer to the old owner and copy lazily
  • Diverted replicas
      • Target of diversion may move out of leaf set
      • Node to store replica can be any one in leaf set
      • Diverting node and target must then exchange
        keepalive messages themselves
      • Such replicas should be relocated

13
Replica maintenance (2)
  • Node failure may cause storage shortage
  • No node in leaf set can take over ownership
  • Search space is widened
  • Ask most extreme nodes to locate storage
  • Increases search space to 2l nodes
  • If no storage space found, fail.

14
Caching
  • Pastry's locality-based routing will tend to
    direct requests to nearby copies
  • PAST also stores cached copies
      • Along routing path between client and fileid
      • For insert and lookup operations
  • Cache maintained using the GreedyDual-Size (GD-S)
    algorithm
      • Weight per file = 1/size(file)
      • Eviction
          • Pick file with minimum weight
          • Subtract weight of evicted file from all others
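
The eviction rule above can be sketched directly. This is a simplified model under stated assumptions: the class name GDSCache is hypothetical, file sizes are positive integers, and a cache hit resets the file's weight to 1/size, which is how GreedyDual-Size usually treats hits but is not stated on the slide.

```python
from typing import Dict


class GDSCache:
    """Toy GreedyDual-Size cache with weight = 1/size per file."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.used = 0
        self.sizes: Dict[str, int] = {}
        self.weights: Dict[str, float] = {}

    def lookup(self, fileid: str) -> bool:
        if fileid not in self.weights:
            return False
        # Assumed hit behavior: restore the file's full weight.
        self.weights[fileid] = 1.0 / self.sizes[fileid]
        return True

    def insert(self, fileid: str, size: int) -> bool:
        if size > self.capacity or fileid in self.sizes:
            return False
        while self.used + size > self.capacity:
            self._evict()
        self.sizes[fileid] = size
        self.weights[fileid] = 1.0 / size
        self.used += size
        return True

    def _evict(self) -> None:
        # Pick the file with minimum weight ...
        victim = min(self.weights, key=self.weights.get)
        w = self.weights.pop(victim)
        self.used -= self.sizes.pop(victim)
        # ... and subtract its weight from all remaining files.
        for fileid in self.weights:
            self.weights[fileid] -= w
```

Subtracting the victim's weight from every survivor ages files that are not being hit, so large and stale files are evicted first. A production version would track a single running offset instead of touching every entry on each eviction.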

15
Experiments without diversion
  • Experiments use
      • Large trace from web server
      • Files from local web server
  • The case for diversion with web trace
      • Without diversion
          • 51.1% of insertions failed
          • 60.8% storage utilization

16
Experiments (2) with diversion
  • With diversion
  • A bigger leaf set size is a plus

17
Experiments (3) varying Tpri
  • Effects of varying Tpri
  • Files stored vs. size of file

18
Experiments (4) Varying Tdiv
  • Varying Tdiv
  • Tpri is constant

19
File and Replica Diversion

20
Caching
  • 8 traces combined
  • Requests from clients in each trace are mapped to
    close PAST nodes