Title: File and Storage Systems for MEMS-based Storage
1 File and Storage Systems for MEMS-based Storage
- Bo Hong
- Advisors: Scott Brandt and Darrell Long
Storage Systems Research Center, University of California, Santa Cruz
2 Why New File and Storage Systems for MEMS-based Storage
- Micro-Electro-Mechanical Systems (MEMS) storage
- A promising alternative secondary storage technology
- RAM/disk replacements or complements
- Why not use existing file systems and storage architectures?
- Optimized for disks
- MEMS has radically different performance characteristics and underlying architecture from disks
- Forcing MEMS to match existing file systems and disk-based architectures is suboptimal
- A better understanding of design options and trade-offs of file/storage systems based on MEMS storage will result in better system performance
3 MEMS Storage Technology
- Hardware research: IBM, HP, CMU, Nanotech
- Recording techniques: magnetic, physical
- Non-volatile
- Orthogonal magnetic recording
- Higher recording density
- Thousands of read/write tips
- A subset of tips active simultaneously
- Higher throughput and parallelism
- Tip array and media sled move relative to each other
- In the X and Y directions independently
- Two degrees of freedom
- No rotating media
4 MEMS Storage Device
5 MEMS Storage Device Characteristics
- Physical size: 1-2 cm²
- Recording density: 250-750 Gb/in²
- Capacity: 2-10 GB
- Price: $5-50/GB
- Access latency: 0.1-1 ms
- Tip bandwidth: 400-1000 Kb/s
- Aggregate bandwidth: 100-400 MB/s
6 Performance Comparison
[Figure: throughput (1-7 GB/s) versus access latency (1 ns to 10 ms, log scale) for DRAM, MEMS, and disk]
- DRAM: 0.5-2 GB, $100-200/GB
- MEMS: 2-10 GB, $5-50/GB
- Disk: 100-500 GB, $1/GB
7 How We Use MEMS-Based Devices
- As super disks?
- Yes. But
- Very expensive
- Limited capacity
- Existing file systems are optimized for disks
- MEMS-based file systems need to consider
- Two-dimensional data layout
- Unique seek behaviors
- Zero rotational delay
- Low access latency
- High throughput and parallelism
8 Current Status: System-Level MEMS Storage Research
- Device Modeling
- Performance characteristics analysis
- Request scheduling
- Storage subsystem architectures
9 Proposed Work: System-Level MEMS Storage Research
- Device Modeling (completed)
- Performance characteristics analysis (completed)
- Request scheduling (started)
- Storage subsystem architectures (started)
- Data layout and file allocation
- Caching and prefetching
- Putting it all together (started)
10 Definitions: MEMS Disk Analogies
- Tip Region: the portion of the media sled accessible by a single tip
11 MEMS Disk Analogies
- Tip Sector: the smallest unit of data accessible by a single tip
- Tip Sector = 8 data bytes + ECC + servo info
12 MEMS Disk Analogies
- Sector: the tip sectors accessed by n simultaneously active tips; the standard unit of data access
- Recall: all tips are over the same relative tip sector at the same time
13 MEMS Disk Analogies
- Track: the sectors accessible by a set of active tips with no X movement
- X movement incurs additional settling time
14 MEMS Disk Analogies
- Cylinder: the tracks accessible by all tips with no X movement
- Assumption: any Y movement is cheaper than any X movement
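The disk analogies above can be made concrete with a small sketch. Only the 8 data bytes per tip sector comes from the slides; the class name `MemsGeometry` and the tip counts in the usage note are illustrative assumptions, not parameters of any particular device.

```python
from dataclasses import dataclass

TIP_SECTOR_DATA_BYTES = 8  # from the slides: one tip sector carries 8 data bytes


@dataclass(frozen=True)
class MemsGeometry:
    """Hypothetical device geometry; all field values are assumptions."""
    total_tips: int       # read/write tips on the device
    active_tips: int      # tips that can be active at the same time
    tips_per_sector: int  # n: tips whose tip sectors form one sector

    @property
    def sector_bytes(self) -> int:
        # A sector is the set of tip sectors read by n tips in parallel.
        return self.tips_per_sector * TIP_SECTOR_DATA_BYTES

    @property
    def concurrent_sectors(self) -> int:
        # Sectors that one set of active tips can transfer at once.
        return self.active_tips // self.tips_per_sector

    @property
    def tracks_per_cylinder(self) -> int:
        # One track per set of active tips; a cylinder covers all tips.
        return self.total_tips // self.active_tips
```

With the assumed values `MemsGeometry(total_tips=6400, active_tips=1280, tips_per_sector=64)`, a sector holds 512 data bytes, 20 sectors can be transferred concurrently, and each cylinder contains 5 tracks.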
15 1. Device Modeling: MEMS Positioning
- t_position = max(t_x, t_y)
- t_x includes X-dimension settling time
- Oscillations in X lead to off-track interference
- t_y includes Y-dimension turnaround times
- The media sled may change its movement direction during seeks
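A minimal sketch of the positioning rule above: because the sled moves in X and Y independently, the slower dimension determines the positioning time. The function name and the default settling and turnaround values are assumptions chosen for illustration, not measured device parameters.

```python
def positioning_time(x_seek_ms: float, y_seek_ms: float,
                     x_settle_ms: float = 0.2,
                     turnarounds: int = 0,
                     turnaround_ms: float = 0.06) -> float:
    """Return t_position = max(t_x, t_y) in milliseconds.

    x_settle_ms and turnaround_ms are assumed placeholder values.
    """
    # Settling time is only paid when the sled actually moves in X.
    t_x = x_seek_ms + (x_settle_ms if x_seek_ms > 0 else 0.0)
    # Each change of Y direction during the seek adds one turnaround.
    t_y = y_seek_ms + turnarounds * turnaround_ms
    return max(t_x, t_y)
```

For example, a request that needs no X movement pays only the Y seek plus any turnarounds, which is why accesses within a track or cylinder are cheap.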
16 A MEMS Positioning Model
- The model takes into account
- Actuator forces (constant but bidirectional)
- Spring forces
- Initial and final access velocities
- X-dimension settling time
- Y-dimension turnaround times
- Originally proposed by CMU [Griffin]
- Only solved iteratively, assuming piecewise-constant spring forces
- Less accurate and computationally complex
- We provided an analytical solution to the positioning time equations
17 Analytical Model
- Seek time in the X dimension
- Seek time in the Y dimension
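The seek-time equations themselves did not survive in this transcript. As a point of reference only, the sketch below computes the textbook rest-to-rest seek time under a constant bidirectional actuator force, deliberately ignoring spring forces, nonzero access velocities, settling, and turnarounds. It is a simplification and not the analytical solution referred to on this slide; the function name and units are assumptions.

```python
import math


def rest_to_rest_seek_time(distance: float, acceleration: float) -> float:
    """Seek time for a bang-bang move: accelerate for half the distance,
    then decelerate for the other half, starting and ending at rest.

    Simplified model (assumption): no spring force, so acceleration is constant.
    distance and acceleration must use consistent units (e.g. m and m/s^2).
    """
    if distance < 0 or acceleration <= 0:
        raise ValueError("distance must be >= 0 and acceleration > 0")
    # Half the distance: d/2 = (1/2) * a * t_half^2, so t_half = sqrt(d / a).
    t_half = math.sqrt(distance / acceleration)
    return 2.0 * t_half
```

Because the time grows with the square root of the distance, short seeks are relatively expensive per unit of distance, which is one reason fixed overheads such as X settling time matter so much.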
18 2. Performance Characteristics Analysis: Seek Time Equivalence Regions
[Figure: equivalence regions from the center to (a) even- and (b) odd-indexed bit columns]
- Seek time equivalence regions
- Bounded and predictable seek times within an equivalence region
- Rectangular, with an average x:y ratio that depends on physical parameters
- 1:10 under a set of default parameters
- Different from the equivalence regions of disks (cylinders)
19 Experimental Methodology
- DiskSim device simulator (CMU)
- CMU and UCSC MEMS device models
- Disk access traces
- Cello
- HP-UX time-sharing system at HP Labs in 1999
- Random access
- Hplajw
- HP-UX personal workstation at HP Labs in 1999
- Sequential access
- Scale request inter-arrival times to increase I/O
workload intensities
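Scaling inter-arrival times can be sketched as follows. The function name and the list-of-timestamps trace representation are assumptions for illustration; real trace formats carry more fields per request.

```python
from typing import List


def scale_interarrival(arrivals: List[float], factor: float) -> List[float]:
    """Return new arrival times whose gaps are the original gaps times factor.

    A factor below 1 compresses the trace and so increases workload intensity.
    The first request keeps its original arrival time.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    if not arrivals:
        return []
    scaled = [arrivals[0]]
    for prev, cur in zip(arrivals, arrivals[1:]):
        scaled.append(scaled[-1] + (cur - prev) * factor)
    return scaled
```

For instance, `scale_interarrival([0.0, 10.0, 30.0], 0.5)` halves every gap, doubling the average request rate while preserving the request order and burst structure.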
20 3. Request Scheduling: Request Scheduling Algorithms
- The goals of request scheduling algorithms
- Reduce response times
- Provide fairness, i.e. minimize variation in response times
- Standard request scheduling algorithms
- Designed for disks
- Minimize seek distances
- Minimize rotational delays
- Feasible on MEMS?
- Request scheduling designed for MEMS
- Take advantage of unique seek behaviors of MEMS
21 Existing Algorithms
- First Come First Served (FCFS)
- Circular LOOK (C-LOOK)
- Keeps fairness by servicing in a fixed order
- Shortest Seek Time First (SSTF) (not shown)
- Only considers t_x
- Shortest Positioning Time First (SPTF)
- t_position = max(t_x, t_y)
- Aged Shortest Positioning Time First (ASPTF) [Jacobson]
- Also considers the time that the request has been waiting for service
- Scheduled by FCFS when t_effective < 0
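The difference between SPTF and ASPTF selection can be sketched as below. The slides do not give the aging formula, so the sketch assumes the commonly described form t_effective = t_position - weight * t_wait; the `Request` fields, the function names, and the weight values are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Request:
    req_id: int
    arrival_ms: float
    t_position_ms: float  # positioning time from the current sled position


def pick_sptf(queue: List[Request]) -> Request:
    """SPTF: serve the request with the smallest positioning time."""
    return min(queue, key=lambda r: r.t_position_ms)


def pick_asptf(queue: List[Request], now_ms: float,
               age_weight: float = 0.01) -> Request:
    """ASPTF under the assumed aging formula.

    Requests whose effective time has dropped below zero are served
    in arrival order (FCFS); otherwise the smallest effective time wins.
    """
    def t_effective(r: Request) -> float:
        return r.t_position_ms - age_weight * (now_ms - r.arrival_ms)

    overdue = [r for r in queue if t_effective(r) < 0]
    if overdue:
        return min(overdue, key=lambda r: r.arrival_ms)
    return min(queue, key=t_effective)
```

Note that both functions scan the whole queue on every decision, which is the reordering cost that motivates a simpler algorithm.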
22 Existing Algorithms on MEMS
- The insights from disks also hold for MEMS (also in [Griffin])
- SPTF performs best but suffers from high response time variations
- ASPTF performs as well as SPTF but suffers from the aging effect under heavy workloads
- FCFS performs well only under light workloads
- C-LOOK and SSTF perform well under light and moderate workloads
- FCFS, C-LOOK, and ASPTF have low response time variations
- ASPTF performs best overall
- BUT, reordering the queue is NOT free!
- 5.1 µs per entry (table-driven calculation on a modern Linux machine)
- Reordering a 200-entry queue takes about 1 ms
- Comparable to the maximum MEMS positioning time
- Goal: a simple algorithm with ASPTF-like performance
23 ZONE Scheduling Algorithm
- Zone-based Shortest Positioning Time First (ZONE)
- Divide the device into zones
- Shaped like equivalence regions
- SPTF within zones
- Optimizes seek time within zones
- C-LOOK between zones
- Ensures overall fairness
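One way to read the ZONE description is sketched below: zones are visited in C-LOOK order, and SPTF is applied only to the requests in the zone being served. The linear zone numbering, the function name `zone_pick`, and the caller-supplied `zone_of` and `t_position` callables are assumptions; the slides do not specify how zones are ordered or sized.

```python
from typing import Callable, List, Optional, Tuple, TypeVar

R = TypeVar("R")


def zone_pick(queue: List[R], current_zone: int,
              zone_of: Callable[[R], int],
              t_position: Callable[[R], float]) -> Optional[Tuple[R, int]]:
    """Return (request, zone) to serve next, or None if the queue is empty."""
    if not queue:
        return None
    pending_zones = sorted({zone_of(r) for r in queue})
    # C-LOOK between zones: stay in the current zone while it has work,
    # else advance to the next higher zone, wrapping around to the lowest.
    ahead = [z for z in pending_zones if z >= current_zone]
    zone = ahead[0] if ahead else pending_zones[0]
    # SPTF within the zone: only this zone's requests are compared.
    candidates = [r for r in queue if zone_of(r) == zone]
    return min(candidates, key=t_position), zone
```

Only the requests of one zone are ranked by positioning time at each step, so the per-decision cost depends on the zone's queue length rather than the whole queue, while the fixed sweep order bounds how long any zone waits.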
24 ZONE vs. Existing Algorithms: Average Response Time
- ZONE performs as well as SPTF
25 ZONE vs. Existing Algorithms: Response Time Variation
- ZONE has response time variations like C-LOOK
26 Remaining Work: Pyramiding
- ZONE only reduces ASPTF complexity by a constant factor
- May reduce opportunity for optimization at low and moderate workloads
- Pyramiding
- Variable-sized zones
- Better queue lengths under all workloads
- Preliminary results are encouraging
[Figure: an example of pyramiding]
27 Request Scheduling Summary
- Standard disk request scheduling algorithms are suboptimal for MEMS storage devices
- ZONE appears to be nearly optimal
- SPTF-like average response times
- C-LOOK-like response time variation
- No severe reordering overhead and no aging effect
- Customizable zone size and order (e.g. for hot spots and cold spots)
- Pyramiding could provide better performance under all workloads
28 4. Storage Subsystem Architectures: Architectural Alternatives
- MEMS instead of DRAM [Wang]
- Slows down instruction execution by 9-16 times
- MEMS instead of disk
- Improves I/O response times by 6-10 times
- Even better when I/O traffic is more intensive
- Capacity and price issues
- Capacity: 2-10 GB
- Price: $5-50/GB
- Ultimate goal
- As fast as MEMS
- As large and cheap as disk
29 MEMS in the Storage Hierarchy
- I/O workload characteristics in general-purpose UNIX systems
- Write traffic is the majority (50-80%) and bursty
- Metadata traffic is substantial (up to 80%) and bursty
- Metadata takes 1-2% of disk storage
- MEMS as complements of disks
- Idea 1: store all metadata on the MEMS device
- Idea 2: service write requests on the MEMS device
30 MEMS Metadata Storage
- Store all metadata on MEMS device
- Periodically write metadata back to disk
- Advantages
- Metadata requests are fast
- Disk workload becomes more sequential
- Data and metadata can be serviced in parallel
31 MEMS as Disk Write Buffer
- All writes appended to logs on MEMS
- Logs written to disk when disk is idle
- Hit reads are also serviced by MEMS
- Advantages
- Writes are faster and more reliable
- Total number of requests to disk decreases
[Figure: incoming requests enter the disk driver; writes and hit reads go to the MEMS write buffer, missed reads go to the disk, and write cleaning moves buffered data from MEMS to disk]
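The write-buffer behavior described on this slide can be sketched with in-memory stand-ins. The class name, the dictionary standing in for the disk, and the choice to flush only the newest version of each block during cleaning are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple


class MemsWriteBuffer:
    """Toy model: an append-only log on MEMS in front of a disk."""

    def __init__(self) -> None:
        self.log: List[Tuple[int, bytes]] = []  # records appended on MEMS
        self.index: Dict[int, int] = {}         # block -> newest log position
        self.disk: Dict[int, bytes] = {}        # stand-in for the disk

    def write(self, block: int, data: bytes) -> None:
        # All writes are appended to the log; none touch the disk directly.
        self.index[block] = len(self.log)
        self.log.append((block, data))

    def read(self, block: int) -> Tuple[Optional[bytes], str]:
        # Hit reads are served from MEMS; missed reads go to the disk.
        if block in self.index:
            return self.log[self.index[block]][1], "mems"
        return self.disk.get(block), "disk"

    def clean(self) -> int:
        """Flush the log when the disk is idle; return the number of disk writes."""
        for block, pos in self.index.items():
            self.disk[block] = self.log[pos][1]
        flushed = len(self.index)
        self.log.clear()
        self.index.clear()
        return flushed
```

In this model, repeated writes to the same block are coalesced before cleaning, which is one way the total number of requests reaching the disk can decrease.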
32 MEMS Metadata Storage with MEMS Disk Write Buffer
[Figure: the disk driver sends writes, hit reads, and metadata requests to MEMS and missed reads to disk; write cleaning and metadata write-back move data from MEMS to disk]
- Combine the two aforementioned techniques
- Feasible because the MEMS write buffer only takes 100-500 MB
33 Experimental Results: Average Response Time
- MEMS metadata storage marginally improves performance
- Heavily dependent on the percentage of metadata requests in the workload
- MEMS write buffer significantly improves performance
- As good as MEMS alone
34 Remaining Work: MEMS Virtual Disk
- VM-like storage system
- Requests serviced by MEMS
- MEMS and disk exchange data in segments
- A segment is a set of contiguous data blocks
- Segment management
- Segment size
- Segment layout on MEMS
- Segment replacement
- Segment prefetching
- Impacts on data layout
- Metadata layout
- File layout
[Figure: incoming requests pass through the disk driver to MEMS storage, which holds segments 1 to n; segments are exchanged between MEMS storage and the disk]
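The slides leave segment size and replacement as open questions. Purely as an illustration of the VM-like idea, the sketch below assumes LRU replacement and fixed-size segments; the class name, the counters, and the policy itself are assumptions rather than the proposed design.

```python
from collections import OrderedDict


class SegmentCache:
    """Toy MEMS virtual disk: MEMS holds whole segments fetched from disk."""

    def __init__(self, capacity_segments: int, blocks_per_segment: int) -> None:
        self.capacity = capacity_segments
        self.blocks_per_segment = blocks_per_segment
        self.resident: "OrderedDict[int, bool]" = OrderedDict()  # seg -> dirty
        self.fetches = 0      # segments read from disk into MEMS
        self.writebacks = 0   # dirty segments written back to disk

    def access(self, block: int, write: bool = False) -> bool:
        """Service one block request on MEMS; return True on a segment hit."""
        seg = block // self.blocks_per_segment
        hit = seg in self.resident
        if hit:
            self.resident.move_to_end(seg)  # mark as most recently used
        else:
            if len(self.resident) >= self.capacity:
                _, dirty = self.resident.popitem(last=False)  # evict LRU
                if dirty:
                    self.writebacks += 1
            self.resident[seg] = False
            self.fetches += 1
        if write:
            self.resident[seg] = True
        return hit
```

A model like this makes the segment-size trade-off easy to explore: larger segments raise the hit rate for sequential access but make each fetch and write-back more expensive.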
35 Remaining Work (maybe): Other Storage Subsystem Architectures
- MEMS vs. NVRAM
- Much slower but much larger and cheaper
- MEMS in RAID systems
- Metadata server in large distributed file systems
36 Storage Subsystem Architectures: Summary
- As primary storage
- MEMS performs poorly
- Useful in low power/performance applications
- As secondary storage
- 10 times faster than disk, but
- Expensive
- As a layer in the storage hierarchy
- Performance could match MEMS alone
- Relatively inexpensive
37 5. Data Layout and File Allocation for MEMS Storage (all remaining work)
- Interesting problems
- Initial file location
- Extending an existing file
- Inter-file layout
- MEMS properties
- Determinism
- Multi-dimensionality
- High parallelism and bandwidth, even higher with multiple sleds
- FFS-like layout (FFS, Ext2, Ext3)
- Cylinder groups based on seek time equivalence regions
- Log-structure-like layout (LFS)
- Extent-like layout (XFS, VxFS, NTFS)
- Other layouts
38 5. Data Layout and File Allocation for MEMS Storage (cont.)
- Aggressive file striping
- File grouping [Ahmad, Yeh]
- Impact on scheduling
39 6. Caching and Prefetching for MEMS Storage (all remaining work)
- MEMS properties
- Low access latency
- High bandwidth
- Non-volatility
- Cheap and large compared to NVRAM
- Interesting questions
- Relevance and feasibility of existing cache replacement and prefetching policies?
- How to improve?
- Aggressive prefetching
- Large logical block size
- Small logical block size for space efficiency [Wang]
- Caching and prefetching in MEMS/disk systems
- File-level prefetching [Yeh]
- Segment-level prefetching in MEMS virtual disk
40 7. Putting it All Together
- Performance of all parts together
- Trade-offs when designing file systems for MEMS
- Mobile computing: low power consumption
- High performance computing: high throughput and low access latency
- Others
- Identify typical working environments of MEMS devices and examine the corresponding configurations
41 Overall Summary
- Existing file systems are suboptimal for MEMS-based storage devices
- A better understanding of design options and trade-offs of file systems based on MEMS storage is necessary
- Data layout and file allocation
- Scheduling
- Storage subsystem architecture
- Caching and prefetching
42 Related Work
- MEMS technology development
- IBM, HP, CMU CHI2PS, Nanotech, University of Colorado Boulder
- MEMS systems research by CMU Parallel Data Lab [Griffin, Schlosser, Nagle, Ganger, Carley]
- Simulation environments
- Modeling
- Performance characteristics
- Performance sensitivity to design parameters
- Disk-analogous data layout
- Simple MEMS data placement schemes
- Feasibility of existing request scheduling algorithms
- Benchmarks on MEMS-based storage systems
- Failure management
- Power utilization
43 Related Work (cont.)
- MEMS systems research by HP Labs
- Using MEMS in RAID [Uysal et al.]
- MEMS systems research by UCSC Storage Technology Advanced Research group
- Striping unit size vs. throughput in MEMS RAID 0 [Zimet]
- Workload-based optimization of MEMS design parameters [Zimet, Dramaliev and Madhyastha]
- MEMS systems research by other UCSC SSRC researchers
- Modeling [Yang and Madhyastha]
- MEMS-based storage system hierarchies [Wang]
- Power management [Lin et al.]
- Disk scheduling
- Data layout
- Caching and prefetching
44 Thank You!
- Acknowledgements
- Feng Wang, Karen Glocer, Zachary Peterson, Ying Lin
- Dave Nagle, Greg Ganger, CMU PDL
- The rest of the UCSC SSRC
- More information
- http://ssrc.cse.ucsc.edu
- http://ssrc.cse.ucsc.edu/mems.shtml
- http://www.cse.ucsc.edu/hongbo
- Questions?