Title: Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility
Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility
- Presented by Deniz Hastorun
Outline
- PAST motivation and goals
- PAST API
- Security
- Storage management
- File and replica diversion
- Replica management
- Caching
- Performance
- Conclusion
PAST
- Internet-based p2p global storage and content distribution utility
- Based on a self-organizing overlay network of storage nodes
- Cooperative routing, replication of files, and caching of popular files
- File storage rather than block storage
- Built on top of Pastry
Motivation and Goals
- Current p2p application research is directed more towards constructing applications and understanding the related issues
- Goals:
- Strong persistence, by providing persistent storage for replicated read-only files
- High availability, through replication and caching
- Scalability, by obtaining high storage utilization via local cooperation
- A secure system
- Not intended as a general-purpose file system: no searching, directory lookup, or key distribution operations
PAST Nodes
- The multitude and diversity of nodes in the Internet are exploited
- The collection of PAST nodes forms an overlay network
- Minimally, a PAST node is an access point
- Optionally, it contributes to storage and participates in the routing
- PAST permits nodes to jointly store and publish content exceeding the capacity or bandwidth of any single node
PAST API
- nodeID: a 128-bit node identifier in a circular namespace
- Computed as the SHA-1 hash of the node's public key, giving a quasi-random assignment of nodeIDs
- No correlation between a nodeID and the node's geographic location
- fileID: a 160-bit file identifier
- Computed as the SHA-1 hash of the file name, the owner's public key, and a salt
- Root node: the node whose nodeID is numerically closest to the fileID
- nodeIDs and fileIDs are uniformly distributed over the keyspace
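As a rough illustration of this ID scheme, the Python sketch below derives both identifiers with SHA-1; the function names and the exact byte concatenation are assumptions for illustration, not the paper's wire format.

    import hashlib

    def node_id(public_key: bytes) -> int:
        # nodeID: SHA-1 of the node's public key, using 128 bits of the digest
        digest = hashlib.sha1(public_key).digest()
        return int.from_bytes(digest[:16], "big")

    def file_id(filename: str, owner_public_key: bytes, salt: bytes) -> int:
        # fileID: 160-bit SHA-1 over the file name, owner's public key, and salt
        digest = hashlib.sha1(filename.encode() + owner_public_key + salt).digest()
        return int.from_bytes(digest, "big")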
PAST API (2)
- fileId = Insert(filename, credentials, k, file)
- Inserts k copies of the file into the network, or fails
- The fileId, computed from (filename, credentials, salt), is returned as the result
- Successful if store receipts are received from all k nodes
- file = Lookup(fileId)
- Returns a copy of the file if it exists and at least one of the k storing nodes is accessible
- Reclaim(fileId, credentials)
- Accepted only if requested by the file's owner
- Allows, but does not require, storage reclamation
- Afterwards, a lookup of the file is no longer guaranteed to succeed
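A minimal Python sketch of the three exported operations, assuming a hypothetical PastClient class; certificates, smartcard signing, and error handling are omitted.

    class PastClient:
        def insert(self, filename, credentials, k, data):
            """Store k replicas; returns the fileId, or raises on failure.
            Succeeds only if store receipts arrive from all k nodes."""
            ...

        def lookup(self, file_id):
            """Return a copy of the file if one of the k storing
            nodes is reachable, else None."""
            ...

        def reclaim(self, file_id, credentials):
            """Owner-only; permits (but does not force) reclamation.
            Afterwards lookup(file_id) is no longer guaranteed."""
            ...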
Pastry
- Routing: a message is routed to the live node with the nodeID numerically closest to the fileID in less than O(log_{2^b} N) hops (b = 4)
- A file is stored on k nodes, those whose nodeIDs are closest to the 128 most significant bits of the fileID
- Routing table: the nth row holds the IP addresses of 2^b - 1 nodes whose nodeIDs share the first n digits with the current node's but differ in the next digit
- Leaf set: the l/2 nodes with the next larger and the l/2 nodes with the next smaller nodeIDs relative to the current node
- Neighborhood set: the l nodes closest to the current node by the proximity metric (used in recovery updates)
- Each hop forwards the message to a node whose nodeID shares one more prefix digit with the destination
- Routing table entries are updated lazily
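The per-hop prefix match behind the O(log_{2^b} N) bound can be sketched as below; IDs are hex digit strings (b = 4), and the routing-table layout (a list of rows, each a dict keyed by the next digit) is a simplified assumption.

    def shared_prefix_len(a: str, b: str) -> int:
        # Number of leading hex digits two IDs have in common
        n = 0
        for x, y in zip(a, b):
            if x != y:
                break
            n += 1
        return n

    def next_hop(routing_table, local_id: str, key: str):
        # Row p holds nodes sharing p digits with us; pick the entry whose
        # (p+1)th digit matches the key, i.e. one more digit in common
        p = shared_prefix_len(local_id, key)
        if p == len(key):
            return None  # we are already the destination
        return routing_table[p].get(key[p])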
PAST Operations
- File Insertion
- Lookup
- Reclaim
File Insertion
- The client computes a fileID and a storage certificate
- The certificate contains: the fileID, a SHA-1 hash of the content, k, the salt, the creation date, and optional file metadata (see the sketch below)
- k x filesize is debited from the client's storage quota
- The file and storage certificate are routed via Pastry towards the fileID
- The destination node verifies the integrity of the file, stores it, and asks the k-1 nodes with the next closest nodeIDs to store the file
- These k-1 nodes are in its leaf set (k - 1 < l)
- Upon acceptance, the nodes return an ack with k signed store receipts, or an appropriate error, to the client
- Files are immutable
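A hedged sketch of the client-side certificate and quota step; the dict field names are illustrative assumptions (in PAST the certificate would be signed by the client's smartcard).

    import hashlib, time

    def make_storage_certificate(file_id, content, k, salt, metadata=None):
        # Fields from the slide: fileID, SHA-1 of the content, k, salt,
        # creation date, optional metadata; signed by the smartcard in PAST
        return {
            "file_id": file_id,
            "content_hash": hashlib.sha1(content).hexdigest(),
            "replicas": k,
            "salt": salt,
            "created": time.time(),
            "metadata": metadata,
        }

    def debit_quota(quota: int, k: int, content: bytes) -> int:
        # k copies are charged against the owner's storage quota
        return quota - k * len(content)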
Lookup and Reclamation
- Lookup: the client sends a request message using the requested fileID as the destination
- Pastry ensures a replica is found, since the lookup is routed to the closest nodeID and replicas are stored on the k nodes with adjacent nodeIDs
- Reclamation is analogous to insertion
- The client generates a reclaim certificate
- The certificate is routed to the fileID via Pastry
- The storing nodes verify the certificate and issue reclaim receipts
- The client reclaims the credit against the user's storage quota
Security
- Smartcard-based security model
- Each PAST node and each user of the system holds a smartcard
- A private/public key pair is associated with each card
- Smartcards generate and verify certificates and maintain storage quotas
- They ensure the integrity of the nodeID and fileID assignments
- Store receipts prevent nodes from creating fewer than k replicas
- File certificates allow the integrity and authenticity of stored content to be verified
- File and reclaim certificates help enforce client storage quotas
- Data is not stored encrypted by PAST
Storage Management
- Goals: high global storage utilization and graceful degradation as maximum utilization is reached
- Rely on local coordination among nodes with adjacent nodeIDs
- Fully integrate file insertion with storage management
- Incur only modest performance overhead
Storage Management (2)
- Handles the case where the k closest nodes cannot store a replica
- Caused by storage imbalance among nodes, for these reasons:
- Statistical variation in the assignment of nodeIDs and fileIDs, so some nodes store more than others
- High variance in the inserted file size distribution
- Differences in the storage capacity of PAST nodes
- Assumption: capacities differ by no more than 2 orders of magnitude
Storage Management Techniques
- Replica diversion: used when one of the k closest nodes is overloaded
- Balances differences in the storage capacity and utilization of nodes within a leaf set
- Node A diverts a copy to a node B in its leaf set if B is not among the k closest and does not already hold a diverted replica
- A enters a pointer to the copy at B into its own table and issues a store receipt, as if it stored the replica itself
- A also installs a pointer on the (k+1)th closest node C (see the sketch after this list)
- If B fails, a replacement replica is created
- If C fails, A installs another pointer on the current (k+1)th node
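The pointer bookkeeping might look roughly like the Python sketch below; the node attributes (free_space, pointers, has_diverted) are hypothetical names, and failure handling is left out.

    def divert_replica(A, file_id, leaf_set, k_closest, node_k_plus_1):
        # Candidate B: in A's leaf set, not among the k closest nodes,
        # and not already holding a diverted replica of this file
        candidates = [n for n in leaf_set
                      if n not in k_closest and not n.has_diverted(file_id)]
        if not candidates:
            return None                        # fall back to file diversion
        B = max(candidates, key=lambda n: n.free_space)
        A.pointers[file_id] = B                # A keeps a pointer, issues receipt
        node_k_plus_1.pointers[file_id] = B    # (k+1)th closest node C, too
        return B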
Storage Management (3)
- File diversion
- Balances the remaining free storage space among different portions of the nodeID space
- Triggered by a negative ack during file insertion
- The client generates a new fileID using a different salt value and retries the insert operation
- This process is repeated up to 3 times
- If the insert still fails, it is aborted and an error is returned to the caller (sketched below)
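The retry loop can be sketched as follows, reusing the file_id helper from the earlier ID sketch; insert_at and credentials.public_key are assumed, illustrative names.

    import os

    def insert_with_file_diversion(client, filename, credentials, k, data):
        for _ in range(3):                     # at most 3 retries, per the slide
            salt = os.urandom(20)              # fresh salt -> new fileID, hence
            fid = file_id(filename, credentials.public_key, salt)  # new region
            if client.insert_at(fid, credentials, k, data):
                return fid
        raise IOError("insert failed after 3 attempts with different salts")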
Storage Policies
- How does a node choose to accept or reject a replica?
- It computes sizeof(file) / sizeof(free space)
- and compares the ratio to t_pri or t_div, depending on the node's role (primary store vs. diverted store), where t_pri > t_div
- A node accepts all but oversized files as long as its utilization is low
- This prevents unnecessary diversion
- Large files are discriminated against: as free space shrinks, the largest acceptable file size decreases (see the sketch below)
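A compact sketch of the threshold test under these rules; the numeric values are the ones used later in the experiments (t_pri = 0.1, t_div = 0.05) and are tunable.

    T_PRI, T_DIV = 0.1, 0.05   # primary threshold > diverted threshold

    def accepts(file_size: int, free_space: int, primary: bool) -> bool:
        # Reject when the file would consume too large a fraction of the
        # node's remaining space; diverted replicas face the stricter bound,
        # which discriminates against large files as free space shrinks.
        if free_space <= 0:
            return False
        return file_size / free_space < (T_PRI if primary else T_DIV)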
Maintaining Replicas
- Node joins and departures cause leaf set adjustments
- On a node failure, the failed node is removed from the leaf sets of l nodes, and the live node with the next closest nodeID is included
- On a node join, the joining node is included and one node is dropped from the affected leaf sets
- A newly responsible node must copy the files it now covers
- It either acquires a replica copy immediately,
- or installs a reference pointer to the old owner, with gradual migration of the copies
- Diverted replicas:
- The target of a diversion may move out of the leaf set
- The node storing the replica and the node referencing it may then no longer share a leaf set, so they must exchange keepalive messages themselves
- Such replicas should be gradually relocated to a node in the same leaf set as the referring node
Maintaining Replicas (2)
- A node failure may cause a storage shortage: no node in the leaf set can take over ownership of the failed node's replicas
- To maintain the storage invariant, the two most distant nodes in the leaf set are asked to locate storage in their own leaf sets
- This increases the search space to 2l nodes
- If no storage space is found, the operation fails
- The number of replicas then drops below k until space becomes available
Caching
- Goals: minimize client access latency, maximize query throughput, and balance the query load
- k replicas are maintained for high availability
- Pastry routes a client lookup request to the replica closest to the client
- For popular files, cached copies are stored to improve performance
- Cached copies are inserted at nodes along the route between the client and the fileID
- Insertion policy: cache a file only if its size is less than a fraction c of the node's current cache size
- Cache replacement is based on the GreedyDual-Size (GD-S) policy
- A weight H = cost(file) / size(file) is maintained per file
- Eviction: pick the file with the minimum weight H, then subtract that weight from the weights of all remaining files (sketched below)
- With cost(file) = 1, the cache hit rate is maximized
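A minimal GD-S sketch following the slide's description (weight H = cost/size, evict the minimum, subtract its weight from the rest); production implementations usually track a global inflation value instead of rescanning all entries, but the direct form mirrors the slide.

    class GDSCache:
        def __init__(self, capacity: int):
            self.capacity = capacity
            self.used = 0
            self.files = {}                    # file_id -> [weight H, size]

        def insert(self, file_id, size: int, cost: float = 1.0):
            if size > self.capacity:
                return                         # oversized files are not cached
            while self.used + size > self.capacity:
                # Evict the file with the minimum weight H ...
                victim = min(self.files, key=lambda f: self.files[f][0])
                h_min, v_size = self.files.pop(victim)
                self.used -= v_size
                # ... and subtract its weight from all remaining files
                for entry in self.files.values():
                    entry[0] -= h_min
            self.files[file_id] = [cost / size, size]
            self.used += size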
Experimental Results
- Web proxy and file system workloads were used to evaluate storage management and caching
- First set of experiments: no diversion (t_pri = 1, t_div = 0, abort after the first insert failure)
- 51.1% of file insertions failed
- Global storage utilization reached only 60.8%
- These results make obvious the need for storage management in a system like PAST
Experimental Results (2)
- With diversion: t_pri = 0.1, t_div = 0.05, l = 16 or 32
- With l = 16, utilization rises to >94%; with l = 32, to >98%
- A larger leaf set increases the scope for load balancing
- No further improvement beyond l = 32, while node arrival and departure costs increase
Experiments (3)
As t_pri is increased, fewer file insertions succeed but storage utilization is higher.
Experiments (4)
As t_div is increased, fewer file insertions succeed but storage utilization is higher.
Impact of File and Replica Diversion
- File diversion is negligible while storage utilization stays below 83%
- The number of diverted replicas remains small even at high utilization (about 10% at 80% utilization)
- The overhead imposed is moderate as long as utilization remains below 95%
Impact of File Size
- Up to 80% utilization, no file smaller than 0.5 MB is dropped (large files can find adequate resources)
Impact of Caching
- 8 combined NLANR traces were used
- Requests from clients in each trace are mapped to nearby PAST nodes
- With caching disabled, the number of routing hops stays constant up to 70% utilization, then begins to rise
- At low utilization, files are cached in the network close to where they are requested
- When the global cache hit ratio drops due to high utilization, the average number of routing hops increases
- Caching still performs better than no caching, even at 99% utilization
Conclusion
- The storage management ideas are effective, but more work is required to make the system operational
- Support for a directory service and key lookup is needed
- A third-party evaluation of the system is needed
- More experiments are needed to compare how PAST's caching performs against other systems
- Also needed: experiments on file retrieval and reclaim performance, the number of hops required for file insertion, the overhead due to diversion, overlay routing overhead, the effort to cache, etc.
- Possible improvements:
- Avoiding or reducing replica diversion: nodes could simply forward to the next node, or store statistics in the routing table
- A directory service to improve retrieval performance
- Centralized yet distributed storage quota management mechanisms; "master nodes" or distributed proxies are possible solutions