A Hybrid Caching Strategy for Streaming Media Files

1
A Hybrid Caching Strategy for Streaming Media
Files
  • Jussara M. Almeida Derek L. Eager
    Mary K. Vernon
  • University of Wisconsin-Madison
  • University of Saskatchewan
  • November 2001

2
Outline
  • Characteristics of Streaming Media (SM) files
  • Delivery of SM files
  • Hypothesis and Assumptions
  • Previous Caching Policies
  • New Policy Performance Comparison
  • New Caching Policies
  • Conclusions and Future Work

3
Characteristics of SM Files
  • Large file size
  • cache on disk
  • Sustained I/O bandwidth
  • inserting and reading new content
  • Clients access partial files
  • initial portion
  • favored segment
  • base + variable number of layers of layered
    encoding

4
Delivery of SM Files
  • Unicast streaming
  • server bandwidth is linear in client request rate
  • goal: maximize byte hit ratio
  • Multicast streaming
  • save bandwidth
  • cost sharing introduces new tradeoffs
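
The byte hit ratio named above as the unicast goal can be sketched as follows. This is a minimal illustration, not the authors' code: the function name and the (bytes delivered, bytes from cache) request layout are assumptions made for the example.

```python
def byte_hit_ratio(requests):
    """Fraction of all delivered bytes that were served from the cache.

    requests: iterable of (bytes_delivered, bytes_from_cache) pairs
    (hypothetical layout chosen for this sketch).
    """
    requests = list(requests)
    total = sum(delivered for delivered, _ in requests)
    cached = sum(from_cache for _, from_cache in requests)
    return cached / total if total else 0.0


# Three hypothetical requests: a full hit, a partial hit, and a miss.
trace = [(100, 100), (200, 50), (100, 0)]
print(byte_hit_ratio(trace))  # 150 of 400 bytes came from the cache
```

Counting bytes rather than requests matters for streaming media because clients often access only part of a file.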

5
Caching for Multicast Streams: Tradeoffs
  • example:
  • 10 distributed proxy servers, each serving a
    local region
  • 100 requests (on avg) arrive per region
    during the playback of a given popular video
  • need 7 streams per region, or 12 streams at the
    remote server

6
Caching for Multicast Streams: Tradeoffs
  • caching popular content reduces the load on the
    remote server and network
  • delivering popular content from the remote server
    amortizes the cost of a stream over more clients
  • earlier portions of a popular video require more
    bandwidth and have less cost-sharing than later
    portions

7
New Caching Policies Research
  • Hypothesis: a popularity-based strategy will
    outperform a replacement-based strategy
  • significant fraction of requests to uncached
    files may be for files that are accessed very
    sporadically
  • Assumptions
  • limited disk space implies limited disk bandwidth
  • proxy bandwidth for delivering cached streams is
    equal to min of proxy disk bw and proxy network
    bw
  • (call this proxy disk bandwidth)

8
Current Web Caching Policies
  • Replacement based (cache on each miss)
  • Top replacement candidate is an ad-hoc
    combination of:
  • large files
  • least recently accessed or lower access frequency
  • miss penalty (server latency, bandwidth)
  • Cache whole file or none
  • Unicast
  • Ignore limited disk bandwidth

9
Previous SM Caching Policies
  • Interval Caching [DaSi93, KaRT95]
  • Resource Based Caching (RBC) [TVDS98]
  • Least Frequently Used (LFU)
  • Block-based insertion and deletion [AcSm00]
  • Popularity-based caching for layered encoding
    [RYHE00]
  • Prefix and Segment Caching for smoothing
    [SeRT99, WZDS98]

10
Interval Caching
  • Cache smallest intervals
  • Target memory caches (lots of insertions)

[Figure: file f]

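
The "cache smallest intervals" rule can be sketched as below. This is a simplified illustration under stated assumptions, not the published algorithm: the function name and data layout are hypothetical, interval size is measured in seconds of media (which assumes a uniform delivery rate), and intervals are picked greedily from the currently active streams.

```python
def choose_intervals(active_starts, capacity):
    """Pick the smallest intervals between consecutive streams of a file.

    active_starts: {file_id: [start times of active streams]} (hypothetical).
    capacity: cache space, in seconds of media.
    Returns (size, file_id, leader_start, follower_start) tuples, smallest first.
    Caching an interval lets the follower stream read what the leader just read.
    """
    intervals = []
    for file_id, starts in active_starts.items():
        ordered = sorted(starts)
        for leader, follower in zip(ordered, ordered[1:]):
            intervals.append((follower - leader, file_id, leader, follower))
    intervals.sort()  # smallest interval first

    chosen, used = [], 0
    for interval in intervals:
        size = interval[0]
        if used + size > capacity:
            break  # every remaining interval is at least this large
        chosen.append(interval)
        used += size
    return chosen


active = {"f": [0, 10, 70], "g": [5, 25]}
print(choose_intervals(active, capacity=40))
```

Small intervals give one cache-served stream for little space, which is why the policy favors them; the price is frequent insertion of new data, which suits memory caches.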
11
Resource Based Caching
  • Cache entire files and intervals/runs
  • Goal: efficiently utilize the limited resource
  • limited space: cache smallest space requirement
  • limited bandwidth: cache smallest write overhead
  • Pre-allocate bandwidth to each cached entity
  • Complex algorithm
  • Complex implementation
  • High time complexity

12
RBC Algorithm

13
Least Frequently Used
  • Different implementation options
  • What to do on the first access to an
    object?
  • How to estimate frequency?
  • Version studied: Currently Most Popular (CMP)
  • Insert only the most frequently accessed
    (file or segment)
  • On-line popularity estimate: future research
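
A minimal sketch of the CMP idea, assuming file popularities are known in advance: fill the cache with the most frequently accessed files first. The function name and arguments are hypothetical, and caching a partial prefix of the last file that fits is an assumption made for this example.

```python
def cmp_cache_contents(access_freq, file_size, capacity):
    """Return {file_id: cached_amount}, most popular files first.

    access_freq: {file_id: access frequency} (assumed known).
    file_size:   {file_id: size}, in the same units as capacity.
    The last file inserted may be cached only partially (an assumption here).
    """
    contents, free = {}, capacity
    for file_id in sorted(access_freq, key=access_freq.get, reverse=True):
        if free <= 0:
            break
        amount = min(file_size[file_id], free)
        contents[file_id] = amount
        free -= amount
    return contents


freq = {"a": 50, "b": 30, "c": 20}
size = {"a": 10, "b": 10, "c": 10}
print(cmp_cache_contents(freq, size, capacity=15))
```

Because nothing is inserted on a miss, this policy generates no write traffic for sporadically accessed files.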

14
Previous Comparison: RBC vs. CMP [TVDS98]
  • Fixed file access frequencies
  • RBC outperforms CMP for all parameter values
    studied
  • Limited design space
  • e.g. total cache size ? 16GB
  • Inconsistent results

15
New Performance Comparison
  • Re-evaluate byte hit ratio of CMP and RBC
  • Simulation with synthetic workload
  • Broad design space
  • New Pooled RBC
  • New simple hybrid CMP/interval caching (CMP/IC)
    policy

16
System Assumptions
  • Arrivals: Poisson(λ)
  • extra experiments with Pareto(α, k)
  • File access frequency: Zipf(θ)
  • Perfect knowledge of file popularity
  • extra experiments with approximate file
    popularity
  • Uniform file size and delivery rate
  • extra experiments with variable file size and
    delivery rate
  • Load balanced across multiple disks
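
A synthetic workload with these assumptions could be generated roughly as follows. This is a sketch only: the function names are hypothetical, and the Zipf convention used (access probability of the i-th most popular file proportional to 1 / i^(1 - theta), so theta = 0 is pure Zipf) is an assumption, since the slides do not spell it out.

```python
import random


def zipf_probabilities(n, theta):
    # Assumed convention: p_i proportional to 1 / i**(1 - theta).
    weights = [1.0 / (i ** (1.0 - theta)) for i in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def generate_requests(n_files, theta, rate, horizon, seed=0):
    """Poisson arrivals at `rate` per unit time over [0, horizon).

    Returns a list of (arrival_time, file_rank) pairs; rank 1 is most popular.
    """
    rng = random.Random(seed)
    probs = zipf_probabilities(n_files, theta)
    files = list(range(1, n_files + 1))
    t, requests = 0.0, []
    while True:
        # Exponential inter-arrival times yield a Poisson arrival process.
        t += rng.expovariate(rate)
        if t >= horizon:
            return requests
        requests.append((t, rng.choices(files, weights=probs)[0]))


trace = generate_requests(n_files=100, theta=0.0, rate=450.0, horizon=1.0)
print(len(trace), trace[:3])
```

With time measured in units of the file duration, the `rate` argument plays the role of the average number of requests per file duration.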

17
System Parameters
  • n: number of files
  • θ: Zipf parameter
  • N: arrival rate
    (avg. number of requests per
    avg. file duration T)
  • N = λT
  • C: cache size (fraction of media data accessed)

18
System Parameters
  • B: normalized disk bandwidth
  • (disk bandwidth as a fraction of the average number of
    simultaneous streams needed to deliver the data that
    is cached by CMP)
  • B depends on N, θ, n, C and disk technology
  • Relative performance of policies depends mainly
    on B
  • B = 1.0: CMP system is bandwidth balanced
  • B < 1.0: CMP system is bandwidth deficient
  • B > 1.0: CMP system is bandwidth abundant

19
Normalized Disk Bandwidth (B): Example
  • Ultrastar 72ZX disk
  • disk space: 116.76 hours of MPEG-1 video (73.4 GB)
  • disk bandwidth: 108 MPEG-1 streams (22-37 MB/s)
  • Assume 100 requests / hour for cached files
  • If cache contains 2-hour movies
  • Need 200 streams
  • B = 108/200 = 0.54
  • If cache contains 30-minute TV shows
  • Need 50 streams for cache content
  • B = 108/50 = 2.16
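
The arithmetic in this example can be reproduced with a short sketch. The function name is hypothetical; the number of streams needed is taken as request rate times file duration, which is how the 200 and 50 stream figures above arise.

```python
def normalized_disk_bandwidth(disk_streams, requests_per_hour, duration_hours):
    """Disk stream capacity divided by the streams needed for cached content."""
    # Average number of simultaneous streams = arrival rate x stream duration.
    streams_needed = requests_per_hour * duration_hours
    return disk_streams / streams_needed


movies = normalized_disk_bandwidth(108, 100, 2.0)    # 2-hour movies
tv_shows = normalized_disk_bandwidth(108, 100, 0.5)  # 30-minute TV shows
print(round(movies, 2), round(tv_shows, 2))
```

The same disk is short of bandwidth for long movies and has bandwidth to spare for short shows, because longer files keep more streams active at once.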

20
RBC vs. CMP
N = 450, n = 100, θ = 0
  • CMP outperforms RBC if B ? 1.0
  • RBC slightly outperforms CMP if B ? 1.0 and
    small caches

21
Files Cached by RBC
  • Average fraction of each file cached by RBC
    (N = 450, n = 100, C = 0.25)

[Figure panels: B = 0.75, B = 1.0, B = 2.0]

22
Space and Bandwidth Utilization
[Figure panels: B = 0.75, B = 1.0, B = 2.0]

23
Pooled RBC
  • Three improvements over RBC
  • simpler rule to select entity to cache
  • can keep cached intervals when deleting a full
    file
  • pool of pre-allocated bandwidth
  • Similar complexity as RBC

24
Pooled RBC, RBC and LFU
N = 450, n = 100, θ = 0
  • Pooled RBC ? CMP
  • BUT, Pooled RBC is much more complex than CMP

25
Hybrid CMP/IC Policies
  • Do interval caching on a separate (small) cache
  • Interval Cache in Main Memory:
    CMP/ICmem and Pooled RBC/ICmem
  • Interval Cache on Disk: CMP/ICdisk
  • e.g. 5% of disk cache

26
CMP/ICmem vs. Pooled RBC/ICmem
N = 450, n = 100, θ = 0
  • Memory cache improves CMP and Pooled RBC
  • B ? 1.0 greater improvement for CMP

27
CMP/ICdisk vs. Pooled RBC
N = 450, n = 100, θ = 0
  • CMP/ICdisk ? Pooled RBC ? CMP

28
Conclusions
  • Simple CMP
  • simple to implement
  • performance similar to Pooled RBC, CMP/ICdisk
    (static file popularities)
  • Hybrid CMP/IC policy
  • Performance ? Pooled RBC
  • simple to implement
  • possibly more robust
    (imperfect and dynamic popularity
    measures)

29
Future Work
  • Develop on-line estimate of file popularity
  • Server log analysis
  • client behavior and workloads (NOSSDAV01 paper)
  • More logs!!!!
  • Caching Policies for Multicast Streams
  • popular file has greater cost-sharing if not
    cached
  • determine cache content that minimizes per-client
    cost
  • caching principles / on-line policy
  • (coming up soon)
  • Prototype, experimental ("live") workloads