Adaptive File Caching in Distributed Systems
Ekow J. Otoo, Frank Olken, Arie Shoshani
Objectives
- Goals
  - Develop coordinated, optimal file caching and replication of distributed datasets
  - Develop a software module, called the Policy Advisory Module (PAM), as part of Storage Resource Managers (SRMs) and other grid storage middleware
- Examples of application areas
  - Particle Physics Data Grid (PPDG)
  - Earth Science Grid (ESG)
  - Grid Physics Network (GriPhyN)
Managing File Requests at a Single Site
[Figure: multiple clients use a shared disk cache to access a remote Mass Storage System. File requests enter the Storage Resource Manager, which performs queuing and scheduling advised by the Policy Advisory Module, and moves files between the shared disk, the Mass Storage System, and other sites over the network.]
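To make the figure concrete, here is a minimal sketch of the request flow at a single site, assuming a simplified single-threaded event loop; all names (`srm_loop`, `pam`, `disk_cache`, `mss`) are illustrative, not the actual SRM interfaces.

```python
def srm_loop(request_queue, pam, disk_cache, mss):
    """Toy single-site SRM loop (illustrative only).

    PAM's admission policy picks the next request to serve; files are read
    from the shared disk cache when present, otherwise staged in from the
    remote Mass Storage System after PAM's replacement policy frees space.
    """
    while request_queue:
        req = pam.next_request(request_queue, disk_cache)    # admission policy
        request_queue.remove(req)
        if not disk_cache.contains(req.filename):
            while disk_cache.free_space() < req.size:
                victim = pam.eviction_candidate(disk_cache)  # replacement policy
                disk_cache.evict(victim)
            disk_cache.store(req.filename, mss.fetch(req.filename))
        yield disk_cache.read(req.filename)                  # serve the client
```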
Two Principal Components of the Policy Advisory Module
- A disk cache replacement policy
  - Evaluates which files are to be replaced when space is needed
- An admission policy for file requests
  - Determines which request is to be processed next
  - e.g. may prefer to admit requests for files already in cache
- Work done so far concerns
  - Disk cache replacement policies
  - Development of the SRM-PAM interface (sketched below)
  - Some models of file admission policies
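A hypothetical Python rendering of the two-component interface described above; the method names are ours, not the SRM-PAM specification.

```python
from abc import ABC, abstractmethod

class PolicyAdvisoryModule(ABC):
    """Hypothetical interface for the two PAM components (names are ours)."""

    @abstractmethod
    def eviction_candidate(self, cached_files, now):
        """Replacement policy: choose which cached file to replace
        when space is needed."""

    @abstractmethod
    def next_request(self, pending_requests, cached_files, now):
        """Admission policy: choose which pending request to process next,
        e.g. preferring requests for files already in cache."""
```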
New Results Since the Last Meeting
- Implementation of the Greedy Dual Size (GDS) replacement policy
- New experimental runs with new workloads
  - A six-month log of access traces from JLab
  - A synthetic workload with file sizes from 500 KB to 2.14 GB
- Implementation of the SRM-PAM simulation in OMNeT++
- Papers
  - Disk cache replacement algorithm for storage resource managers on the grid (SC2002)
  - Disk file caching algorithms that account for delays in space reservation, file transfers and file processing (to be submitted to the Mass Storage Conference)
  - A discrete event simulation model of a storage resource manager (to be submitted to SIGMETRICS 2002)
Some Known Results in Caching (1)
- Disk-to-memory caching
  - Least Recently Used (LRU) keeps the last reference time
  - Least Frequently Used (LFU) keeps reference counts
  - LRU-K keeps the last reference times up to a maximum of K
    - Best known result (O'Neil et al. 1993)
    - A small K is sufficient (K = 2 or 3)
    - Gains of 5-10% over LRU, depending on the reference pattern (a sketch of LRU-K follows this list)
- Significance of a 10% saving in time per reference
  - Improved response time
  - In the Grid and wide area networks, this translates to
    - Reduced network traffic
    - Reduced load at the source
    - Savings in time to access files
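As a reference point, a minimal sketch of LRU-K as described by O'Neil et al.: the victim is the file whose K-th most recent reference is oldest, with files having fewer than K references ranking oldest of all.

```python
from collections import defaultdict

K = 2  # O'Neil et al. report that K = 2 or 3 suffices

def record_reference(history, name, t, k=K):
    """Keep only the last k reference times per file, most recent first."""
    history[name].insert(0, t)
    del history[name][k:]

def lruk_victim(history, k=K):
    """Evict the file with the oldest (or missing) k-th backward reference."""
    def kth_ref(name):
        times = history[name]
        return times[k - 1] if len(times) >= k else float("-inf")
    return min(history, key=kth_ref)

history = defaultdict(list)
for t, name in enumerate(["a", "b", "a", "c", "b", "a"]):
    record_reference(history, name, t)
print(lruk_victim(history))  # "c": only one reference, so it ranks oldest
```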
Some Known Results in Caching (2)
- File caching from tertiary storage to disk
  - Modeling of robotic tapes (Johnson 1995; Sarawagi 1995; ...)
  - Hazard-rate optimal (Olken 1983)
  - Object LRU (Hahn et al. 1999)
- Web caching
  - Self-adjusted LRU (Aggarwal and Yu 1997)
  - Greedy Dual (Young 1991)
  - Greedy Dual Size (GDS) (Cao and Irani 1997)
Differences Between Environments
- Caching in primary memory
  - Fixed page size
  - Cost (in time) is assumed constant
  - Transfer time is negligible
  - Latency is assumed fixed for disk
  - Memory references are instantaneous
- Caching in a Grid environment
  - Large files with variable sizes
  - Cost of retrieval (in time) varies considerably from one time instant to another, even for the same file
  - Files may be available from different locations in a WAN
  - Transfer time may be comparable to the latency in a WAN
  - The duration of a file reference is significant and cannot be ignored
- The main goal is to reduce network traffic and file access times
Our Theoretical Results on Caching Policies in a Grid Environment
- Latency delays, transfer delays and file sizes impact caching policies in the Grid
- Cache replacement algorithms such as LRU and LRU-K do not take these into account and are therefore inappropriate
- The replacement policy we advocate is based on a cost-beneficial function computed at time $t_0$:

  $$g_i(t_0) = \frac{k_i(t_0)}{t_0 - t_{-K}} \cdot \frac{C_i(t_0)}{S_i}$$

  where $t_0$ is the current time, $k_i(t_0)$ is the count of references to file $i$ up to a maximum of $K$, $C_i(t_0)$ is the cost in time of accessing file $i$, $S_i$ is the size of file $i$, $f_i(t_0)$ is the total count of references to file $i$ over its active time $T$, and $t_{-K}$ is the time of the $K$th backward reference.
- The eviction candidate is the file with the minimum $g_i(t_0)$
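A minimal sketch of this cost-beneficial ranking, assuming the reconstruction above (reference rate over the last K references, weighted by the cost per byte of re-fetching); `FileStats` and its fields are illustrative names, not the authors' code.

```python
from dataclasses import dataclass, field

K = 3  # number of backward reference times retained per file

@dataclass
class FileStats:
    size_bytes: int            # S_i
    access_cost_s: float       # C_i(t0): time to re-fetch the file
    ref_times: list = field(default_factory=list)  # most recent first

def g(stats: FileStats, t0: float, k_max: int = K) -> float:
    """Cost-beneficial value g_i(t0) under the reconstruction above."""
    k = min(len(stats.ref_times), k_max)
    if k == 0:
        return 0.0                                     # unreferenced: evict first
    rate = k / max(t0 - stats.ref_times[k - 1], 1e-9)  # references per second
    return rate * stats.access_cost_s / stats.size_bytes

def eviction_candidate(cache: dict, t0: float) -> str:
    """The file with the minimum g_i(t0) is the eviction candidate."""
    return min(cache, key=lambda name: g(cache[name], t0))
```

Evicting the minimum g favors keeping files that are referenced often and are expensive per byte to bring back.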
Implementations from the Theoretical Results
- Two new practical implementations were developed
- MIT-K: Maximum average Inter-arrival Time, an improved LRU-K
  - MIT-K dominates LRU-K
  - Does not take into account access costs and file size
  - The main ranking function is the average inter-arrival time over the last $K$ references, $(t_0 - t_{-K})/k_i(t_0)$; the eviction candidate is the file with the maximum (a sketch follows)
- LCB-K: Least Cost Beneficial with K backward references
  - Does take into account retrieval delay and file size
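And a sketch of the MIT-K ranking under the same assumption: the rank is the average inter-arrival time over the last K references, and the file with the maximum is evicted.

```python
def mit_rank(ref_times, t0, k_max=3):
    """Average inter-arrival time over the last k references (MIT-K sketch).

    ref_times holds reference timestamps, most recent first; larger ranks
    mean colder files, so the eviction candidate is the MAXIMUM rank. Note
    that, unlike LCB-K, no access cost or file size appears here.
    """
    k = min(len(ref_times), k_max)
    if k == 0:
        return float("inf")          # never referenced: evict first
    return (t0 - ref_times[k - 1]) / k

def mit_victim(history, t0):
    """history: dict name -> reference-time list, most recent first."""
    return max(history, key=lambda name: mit_rank(history[name], t0))
```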
Some Details of the Implementation Algorithms
- Evaluation of replacement policies with no delay considerations involves
  - a reference stream $r_1, r_2, r_3, \ldots, r_N$,
  - a specified cache size $Z$, and
  - two appropriate data structures:
    - one holds information on the referenced files, and
    - a second holds information about the files in cache while also allowing fast selection of the eviction candidate (sketched below)
- [Figure: the cache content is maintained as a priority queue (PQ); for LRU, the referenced files are indexed in a binary search tree (BST).]
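A sketch of the second structure, assuming a binary heap in place of a balanced tree: cached files are ordered by a policy-supplied rank, with lazy invalidation because Python's `heapq` cannot re-key entries in place.

```python
import heapq
import itertools

class RankedCache:
    """Fixed-capacity cache; the minimum-rank file is the eviction candidate."""

    def __init__(self, capacity_bytes, rank):
        self.capacity = capacity_bytes
        self.used = 0
        self.rank = rank          # rank(name, t) -> priority; minimum evicted
        self.cached = {}          # name -> size of files currently in cache
        self.heap = []            # (rank, version, name); stale entries remain
        self.latest = {}          # name -> version of its newest heap entry
        self.counter = itertools.count()

    def _push(self, name, t):
        v = next(self.counter)
        self.latest[name] = v
        heapq.heappush(self.heap, (self.rank(name, t), v, name))

    def _evict_one(self):
        while self.heap:
            _, v, name = heapq.heappop(self.heap)
            if self.latest.get(name) == v:        # skip stale heap entries
                self.used -= self.cached.pop(name)
                del self.latest[name]
                return

    def reference(self, name, size, t):
        """Record a reference; returns True on a hit, False on a miss."""
        if name in self.cached:
            self._push(name, t)                   # lazily re-rank on a hit
            return True
        while self.used + size > self.capacity and self.cached:
            self._evict_one()
        self.cached[name] = size
        self.used += size
        self._push(name, t)
        return False
```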
Implementation When Delays Are Considered
When delays are considered, each reference $r_i$ in the reference stream has five event times: the time of arrival, the time when file caching starts, the time when caching ends, the time when processing begins, and the time when processing ends and the file is released (see the sketch below).
[Figure: a BST over references within the active lifetime $T$, which varies with different policies; a vector of pinned files in cache; and a PQ of unpinned files in cache.]
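A minimal sketch of the per-reference bookkeeping this model needs; the field names are ours.

```python
from dataclasses import dataclass

@dataclass
class Reference:
    """The five event times attached to each reference r_i (names are ours)."""
    arrival: float           # request arrives at the SRM
    caching_start: float     # transfer into the disk cache begins
    caching_end: float       # file is fully cached
    processing_start: float  # client begins processing the file
    processing_end: float    # processing ends and the file is released

def is_pinned(ref: Reference, now: float) -> bool:
    """Pinned files live in the vector and cannot be evicted; only after
    release does a file move to the PQ of unpinned, evictable files."""
    return ref.caching_start <= now < ref.processing_end
```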
Performance Metrics
- Hit Ratio
- Byte Hit Ratio
- Retrieval Cost Per Reference
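For concreteness, the standard definitions of these three metrics (an assumption on our part, consistent with the interpretations on the next slide), with $r_i$ the number of references to file $i$, $h_i$ of those served from cache, $S_i$ the size of file $i$, and $C_i$ its retrieval cost:

```latex
% Assumed standard definitions, consistent with the next slide.
\text{Hit Ratio} = \frac{\sum_i h_i}{\sum_i r_i},
\qquad
\text{Byte Hit Ratio} = \frac{\sum_i h_i\, S_i}{\sum_i r_i\, S_i},
\qquad
\text{Retrieval Cost per Reference} = \frac{\sum_i (r_i - h_i)\, C_i}{\sum_i r_i}
```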
Implications of the Metric Measures
- Hit Ratio
  - Measures the relative savings as a count of the number of file hits
- Byte Hit Ratio
  - Measures the relative savings as the time avoided in data transfers
- Retrieval Cost Per Reference
  - Measures the relative savings as the time avoided in data transfers and in retrieving data from their sources
Parameters of the Simulations
- Real workload from the Jefferson National Accelerator Facility (JLab)
  - A six-month trace log of file accesses to tertiary storage
  - The log contains batched requests
  - The replacement policy used at JLab is LRU
- Synthetic workload based on JLab
  - 250,000 files, with sizes uniformly distributed between 500 KB and 2.147 GB
  - Inter-arrival times are exponentially distributed with a mean of 90 sec
  - About 500,000 references are generated
  - Locality of reference: partition the references into random-size intervals; within each interval, references follow the 80-20 rule (80% of the references go to 20% of the files). A generator sketch follows.
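A sketch of a generator matching these parameters; the interval-length bounds are our assumption, since the slide only says the intervals are of random size.

```python
import random

def synthetic_workload(n_refs=500_000, n_files=250_000, mean_iat=90.0, seed=1):
    """Yield (time, file_id, size_bytes) tuples with the stated parameters:
    sizes uniform on [500 KB, 2.147 GB], exponential inter-arrivals with a
    90 s mean, and the 80-20 rule within each random-size interval."""
    rng = random.Random(seed)
    sizes = [rng.uniform(500e3, 2.147e9) for _ in range(n_files)]
    t, emitted = 0.0, 0
    while emitted < n_refs:
        span = rng.randint(1_000, 10_000)  # assumed interval-length bounds
        hot = rng.sample(range(n_files), n_files // 5)  # this interval's hot 20%
        for _ in range(min(span, n_refs - emitted)):
            t += rng.expovariate(1.0 / mean_iat)
            if rng.random() < 0.8:          # 80% of references hit the hot set
                f = rng.choice(hot)
            else:
                f = rng.randrange(n_files)
            yield t, f, sizes[f]
            emitted += 1
```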
Replacement Policies Compared
- RND: Random
- LFU: Least Frequently Used
- LRU: Least Recently Used
- MIT-K: Maximum Inter-arrival Time based on the last K references
- LCB-K: Least Cost Beneficial based on the last K references
- GDS: Greedy Dual Size (sketched below)
- The active lifetime of a file, T, is set at 5 days
- All results are accumulated with a variance reduction technique.
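For comparison with LCB-K, a minimal sketch of Greedy Dual Size as described by Cao and Irani (1997): each cached file carries H = L + cost/size, and the running inflation value L is raised to the H of each evicted file.

```python
import heapq

def gds_hits(refs, capacity_bytes):
    """Count hits under Greedy Dual Size. refs: iterable of
    (name, size_bytes, fetch_cost); evicts the minimum H = L + cost/size."""
    L, used, hits = 0.0, 0, 0
    H, size = {}, {}
    heap = []                                   # (H, name); stale entries skipped
    for name, s, cost in refs:
        if name in H:
            hits += 1
            H[name] = L + cost / s              # refresh H on a hit
            heapq.heappush(heap, (H[name], name))
            continue
        if s > capacity_bytes:
            continue                            # file cannot fit at all
        while used + s > capacity_bytes:
            h, victim = heapq.heappop(heap)
            if victim in H and H[victim] == h:  # live entry
                L = h                           # inflate L on eviction
                used -= size.pop(victim)
                del H[victim]
        H[name] = L + cost / s
        size[name] = s
        used += s
        heapq.heappush(heap, (H[name], name))
    return hits
```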
Simulation Results for the JLab Workload: Comparison of Hit Ratios
- Higher values represent better performance
- MIT-K and LRU give the best performance
- LCB-K, GDS and RND are comparable
- LFU is the worst
Simulation Results for the JLab Workload: Comparison of Byte Hit Ratios
- Higher values represent better performance
- MIT-K and LRU give slightly better performance
- All policies except LFU are comparable
- LFU is the worst
Simulation Results for the JLab Workload: Comparison of Average Retrieval Time Per Reference
- Lower values represent better performance
- LCB-K and GDS give the best performance
- MIT-K, LRU and RND are comparable
- LFU shows the worst performance
Simulation Results for the Synthetic Workload: Comparison of Average Retrieval Time Per Reference
- Lower values represent better performance
- LCB-K gives the best performance, though not significantly better than GDS
- LFU is still the worst
- Hit ratio and byte hit ratio are not good measures of caching policy effectiveness on the Grid
Summary
- Developed a good replacement policy, LCB-K, for caching in storage resource management on the grid.
- Developed a realistic model for evaluating cache replacement policies that takes into account delays at the data sources, in transfers, and in processing.
- Applied the model in extensive simulations of different policies under synthetic and real workloads of accesses to the mass storage system at JLab.
- We conclude that two worthwhile replacement policies for storage resource management on the Grid are LCB-K and GDS.
- LCB-K gives about a 10% saving in retrieval cost per reference compared with the widely used LRU.
- The cumulative effect can be significant in terms of reduced network traffic and reduced load at the source.