Title: Memory Resource Allocation for File System Prefetching From a Supply Chain Management Perspective
1. Memory Resource Allocation for File System Prefetching -- From a Supply Chain Management Perspective
Zhe Zhang (NCSU), Amit Kulkarni (NCSU), Xiaosong Ma (NCSU/ORNL), Yuanyuan Zhou (UIUC)
2. Aggressive prefetching: an idea whose time has come
- Widening processor-I/O gap
  - Processing power doubling every 18 to 24 months
- Disparity between growth of disk latency and throughput
  - Latency improving 10% per year while throughput improving 40% per year (Hennessy 03)
- Large memory cache sizes
  - Usually 0.05% to 0.2% of storage capacity (Hsu 04, Papathanasiou 05)
3. ... and whose challenges follow
- Systems facing large numbers of concurrent requests (e.g., Facebook)
- Servers handling large numbers of clients
- How to manage the file system's memory resource for aggressive prefetching?
4. All streams are not created equal
- MP3: 128 kbps; Youtube: 200 kbps; Youtube HQ: 900 kbps
- Allocating memory resource according to access rate?
- Related work
  - Access pattern detection -- rate not detected: Lee 87, Li 04, Soundararajan 08
  - Aggressiveness control based on sequentiality: Patterson 95, Kaplan 02, Li 05
  - Multi-stream prefetching -- rate not sufficiently utilized: Cao 96, Tomkins 97, Gill 07
5. Similar story in grocery stores!
- Milk: 200 per day; Beer: 80 per day; $300 wine: 1 per year
- Allocating storage resource according to consumption rate?
- Studied in Supply Chain Management (SCM)
  - Demand rate measurement/analysis/prediction
  - Dates back to the first wars
  - Yet still active
    - Wal-Mart: $24M on a satellite network for instant inventory control
    - Dell: aiming at zero inventory
6. Our contributions
- A mapping between data prefetching and SCM problems
- Novel rate-aware multi-stream prefetching techniques based on SCM heuristics
- Implementation and performance evaluation
  - Modified Linux 2.6.18 kernel
  - Extensive experiments with a modern server and multiple workloads
- Coordinated multi-level prefetching
  - Based on multi-echelon inventory control
  - Extending application access pattern to lower level
  - Evaluation with combinations of state-of-the-art single-level algorithms
7. Outline
- Motivation
- Background and problem mapping
- Algorithms
- Performance evaluation
- Conclusions
8. Background: Inventory cycles
- Inventory theory
  - Task: manage inventory for goods
  - Goal: satisfy customer demands
[Figure: inventory level over time -- each cycle starts with an order of a fixed order quantity; the level drains at the demand rate (fast, average, or slow); reaching the reorder point triggers a new order, which arrives after the lead time; the safety inventory below the cycle inventory buffers demand uncertainty]
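The reorder-point cycle described above can be sketched in a few lines. This is a minimal illustration, not code from the talk; the function names and all numbers are assumptions chosen for clarity.

```python
# Sketch of a reorder-point inventory cycle (illustrative, not from the talk).

def reorder_point(lead_time, avg_demand, safety_inventory):
    """Reorder when remaining stock just covers expected demand during the
    lead time, plus a safety margin against demand uncertainty."""
    return lead_time * avg_demand + safety_inventory

def simulate(days, demand_per_day, order_quantity, lead_time, safety_inventory):
    """Simulate steady daily demand; return the lowest inventory level seen."""
    rop = reorder_point(lead_time, demand_per_day, safety_inventory)
    level = order_quantity + safety_inventory
    arrival = None  # day the in-flight order arrives, if any
    lowest = level
    for day in range(days):
        if arrival == day:           # replenishment arrives after the lead time
            level += order_quantity
            arrival = None
        level -= demand_per_day      # demand drains the inventory
        lowest = min(lowest, level)
        if level <= rop and arrival is None:
            arrival = day + lead_time  # place a new order at the reorder point
    return lowest
```

With steady demand the level never dips below the safety margin; uncertainty in the demand rate is what the safety inventory absorbs.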
9. Background: Prefetching basics
[Figure: memory cache backed by disk]
10. Background: Prefetching cycles
- Prefetching techniques
  - Task: manage the cache for data blocks
  - Goal: satisfy application requests
[Figure: prefetched blocks over time, mirroring the inventory diagram -- prefetch degree plays the role of order quantity, trigger distance (Tc) of the reorder point, Ts of the safety inventory, and disk access time of the lead time; demand may be fast, average, or slow]
11. Challenges in mapping
- Data requests <-> customer demands
  - Data blocks are unique
  - Linear sequence of blocks in detected streams
  - GroceryStore::getMilk() vs. FileSystem::getNextBlock() or FileSystem::getBlock(Position p)
- Prefetched data blocks <-> inventory
  - Accessed data blocks remain in the cache
  - But as second-class citizens: Gill 05, Li 05
12. Outline
- Motivation
- Background and problem mapping
- Algorithms
- Performance evaluation
- Conclusions
13. Performance metrics and objectives
- Prefetching optimization objective: improve cache hit rate
  - Dynamically adjust
    - Trigger distance
    - Prefetch degree
- SCM optimization objective: improve fill rate
  - Fraction of demand satisfied from inventory: fill rate = 1 - ESC / Q
  - ESC: Expected Shortage per Cycle
  - Q: order quantity
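The fill-rate objective has a standard closed form in inventory theory; as a reconstruction (not taken verbatim from the talk), with order quantity $Q$, lead-time demand standard deviation $\sigma_L$, and safety factor $z$:

```latex
\text{fill rate} = 1 - \frac{\mathrm{ESC}}{Q},
\qquad
\mathrm{ESC} = \sigma_L \left[ \varphi(z) - z \bigl( 1 - \Phi(z) \bigr) \right],
```

where $\varphi$ and $\Phi$ are the standard normal density and distribution functions. Raising the safety factor $z$ shrinks ESC and pushes the fill rate toward 1.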
14. Rate-aware prefetching algorithms
[Figure: the prefetching-cycle diagram again, annotated with prefetch degree, cycle inventory, fast/average/slow demand, reorder point, trigger distance, and safety inventory]
- Task: calculating Tc and Ts
  - Tc = lead time × average consumption rate
  - Ts: based on estimation of uncertainty
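The Tc computation above is simple enough to state directly. A minimal sketch, with illustrative names (not the kernel code from the talk):

```python
def trigger_distance(lead_time_s, rate_pages_per_s, safety_pages):
    """Tc: issue the next prefetch when the unread prefetched blocks just
    cover the demand expected during one disk access (the lead time),
    plus a safety margin Ts against rate fluctuations."""
    return lead_time_s * rate_pages_per_s + safety_pages
```

The two algorithms that follow differ only in how the safety margin (Ts) is split across streams.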
15. Algorithm 1: Equal Time Supplies (ETS)
- Safety inventory for all goods set to the same time supply (e.g., the amount of goods consumed in 5 days)
- With standard distribution shapes, uncertainty is proportional to the mean value
- Ts set to be proportional to the average data access rate
  - trigger distance of stream_i = (average rate of stream_i / sum of all stream rates) × total allowed trigger distance
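The proportional split above amounts to one line per stream. A sketch under the stated proportionality rule (function name and inputs are illustrative):

```python
def ets_trigger_distances(rates, total_trigger_distance):
    """Equal Time Supplies: divide the total allowed trigger distance among
    streams in proportion to each stream's average access rate, so every
    stream's margin covers the same time supply."""
    total_rate = sum(rates)
    return [total_trigger_distance * r / total_rate for r in rates]
```

A stream consuming three times as fast gets three times the trigger distance, while the aggregate memory budget stays fixed.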
16. Algorithm 2: Equal Safety Factors (ESF)
- Safety inventory set to maintain the same safety factor across all goods
- Ts set to be proportional to the standard deviation of the access rate
- Implementation challenges
  - Measurement and calculation overhead (per-stream standard deviation)
  - Limited floating point calculation in the kernel
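ESF swaps the mean rate for the rate's standard deviation as the allocation weight. A sketch (illustrative names, not the kernel patch), using integer-only arithmetic in the spirit of the floating-point constraint noted above:

```python
def esf_trigger_distances(stddevs, total_trigger_distance):
    """Equal Safety Factors: allocate trigger distance in proportion to each
    stream's demand-rate standard deviation, so all streams end up with the
    same safety factor z = Ts / sigma.  Integer division only, since kernel
    code has limited floating-point support."""
    total_sd = sum(stddevs)
    return [total_trigger_distance * sd // total_sd for sd in stddevs]
```

An unstable stream (large sigma) receives a larger margin than a steady stream of the same average rate, which is exactly where ESF beats ETS in the deviation experiment later.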
17. Outline
- Motivation
- Background and problem mapping
- Algorithms
- Performance evaluation
- Conclusions
18. Comparing with Linux native prefetching
- Linux prefetching algorithm (kernel 2.6.18)
  - Trigger distance (T), prefetch degree (P)
  - Doubling T and P on each sequential hit
  - Upper bounds: T = P = 32 (pages)
- Implementation of SCM-based algorithms
  - Principle: maintain the same memory consumption as the original algorithm
  - Default parameters: T_default = 24, P_default = 48
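The native ramp-up described above can be sketched as a two-line rule. This is a simplified illustration of the doubling-with-cap behavior, not the actual 2.6.18 readahead code:

```python
T_MAX = P_MAX = 32  # upper bounds, in pages

def ramp_up(t, p):
    """On a sequential hit, double both the trigger distance and the
    prefetch degree, capped at the upper bounds."""
    return min(2 * t, T_MAX), min(2 * p, P_MAX)

t, p = 4, 4
for _ in range(5):          # five consecutive sequential hits
    t, p = ramp_up(t, p)    # 4 -> 8 -> 16 -> 32 -> 32 -> 32
```

Note the contrast with the SCM-based algorithms: here every stream ramps to the same fixed cap regardless of its access rate.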
19. Experimental setup
- Platform
  - Linux server
  - 2.33 GHz quad-core CPU, 16 GB memory
- Comparing 32-32, 24-48, ETS, and ESF algorithms
- Workloads
  - Synthetic benchmarks
  - Linux file transfer applications
  - HTTP web server workload
  - Server benchmarks
    - SPC2-VOD-like (sequential)
    - TPC-H (random)
20. Two streams with different rates
- Rate of stream 1 fixed at 1000 pages/second
- Rate of stream 2 varying between 3000 and 7000 pages/second
[Figure: x-axis is the rate of the fast stream (pages/second). Average response time: ETS shows 19-25% improvement over 32-32. Number of cache misses per prefetch cycle (ESC): ETS uses the same number of cycles as 24-48 with ESC similar to 32-32]
21. Two streams with different deviations
- SD of stream 1 fixed at the square root of its rate
- SD of stream 2 varying between 3 and 7 times the average rate
[Figure: x-axis is the SD of the unstable stream. Average response time: ESF shows 20-35% improvement over ETS. Response time of individual streams: ESF gives a large improvement for the unstable stream with a small degradation for the stable stream]
22. Throughput of server benchmarks
- SPC2-VOD-like (sequential streams)
- TPC-H (random accesses)
[Figure: random application throughput -- ETS never worse than 32-32, 2.5% average improvement. Sequential+random application throughput -- ETS 6-53% improvement over 32-32. Sequential+random application memory consumption]
23. Conclusions and future work
- Observations
  - File blocks can be managed like apples!
  - Simple approaches such as ETS seem to perform well
- Future work
  - Awareness of both access rate and delivery time
  - Adjusting the prefetch degree
- Acknowledgements
  - Anonymous reviewers
  - Our shepherd, George Candea
  - Our sponsors: NSF and the DOE Office of Science