Title: DFTL: A flash translation layer employing demand-based selective caching of page-level address mappings
1. DFTL: A flash translation layer employing demand-based selective caching of page-level address mappings
- A. Gupta, Y. Kim, B. Urgaonkar, Penn State; ASPLOS 2009
- Presented by Shimin Chen, Big Data Reading Group
2. Introduction
- Goal: improve the performance of flash-based devices for workloads with random writes
- New proposal: DFTL (Demand-based FTL)
- FTL: flash translation layer
  - The FTL maintains a mapping table from virtual to physical addresses
3. Outline
- Introduction
- Background on FTL
- Design of DFTL
- Experimental Results
- Summary
4. Basics of Flash Memory
- OOB (out-of-band) area stores per-page metadata:
  - ECC
  - Logical page number
  - State: erased / valid / invalid
5. Flash Translation Layer
- Maintains the mapping: virtual address (exposed to the upper level) → physical address (on flash)
  - Uses a small, fast SRAM to store this mapping
- Hides the erase operation from the layers above by:
  - Avoiding in-place updates
  - Updating a clean page instead
  - Performing garbage collection and erasure
- Note:
  - The OOB area holds the physical → virtual mapping
  - So the FTL's virtual → physical mapping can be rebuilt at restart
6. Page-Level FTL
- Keeps a page-to-page mapping table
- Pro: can map any logical page to any physical page
  - Efficient flash page utilization
- Con: the mapping table is large
  - E.g., a 16GB flash with 2KB pages requires 32MB of SRAM
  - As flash size increases, the SRAM size must scale with it
  - Too expensive!
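The 32MB figure above follows from the page count times the per-entry size. A minimal back-of-the-envelope check in Python, assuming 4-byte mapping entries (an assumption implied by the slide's numbers, not stated explicitly):

```python
# Back-of-the-envelope size of a full page-level mapping table.
# Assumes each LPN -> PPN entry is 4 bytes (enough for a 32-bit PPN).
FLASH_SIZE = 16 * 2**30   # 16 GB
PAGE_SIZE = 2 * 2**10     # 2 KB
ENTRY_SIZE = 4            # bytes per mapping entry (assumption)

num_pages = FLASH_SIZE // PAGE_SIZE   # 8M logical pages
table_size = num_pages * ENTRY_SIZE   # bytes of SRAM needed

print(table_size // 2**20, "MB")      # -> 32 MB
```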
7. Block-Level FTL
- Keeps a block-to-block mapping
- Pro: small mapping table
  - Size reduced by a factor of (block size / page size), i.e., 64x here
- Con: a page's offset within its block is fixed
  - Garbage collection overheads grow
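Under block-level mapping, only the block number is translated; the page offset within the block is preserved. A minimal sketch (the map contents are illustrative, not from the paper):

```python
# Block-level address translation: only the block number is remapped;
# the page offset within the block stays fixed.
PAGES_PER_BLOCK = 64  # 128KB block / 2KB page

# Hypothetical block map: logical block number -> physical block number.
block_map = {0: 17, 1: 3}

def translate(lpn: int) -> int:
    """Translate a logical page number to a physical page number."""
    lbn, offset = divmod(lpn, PAGES_PER_BLOCK)
    return block_map[lbn] * PAGES_PER_BLOCK + offset

print(translate(65))  # logical block 1, offset 1 -> 3*64 + 1 = 193
```

The fixed offset is exactly the limitation the slide names: an update to one logical page cannot simply go to any free page, which is what drives the merge and GC overheads.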
8. Hybrid FTLs (a generic description)
- LPN: Logical Page Number
- Data blocks: block-level mapping
- Log/update blocks: page-level mapping
9. Operations in Hybrid FTLs
- Updates to data blocks are written to log blocks
  - The log region is small (e.g., 3% of total flash size)
- Garbage collection (GC)
  - When no free log blocks are available, GC is invoked to merge log blocks with data blocks
10. Full Merges Can Be Recursive, thus Expensive
- Often caused by random writes
11. Outline
- Introduction
- Background on FTL
- Design of DFTL
- Experimental Results
- Summary
12. DFTL Idea
- Avoid expensive full merges entirely
  - Do not use log blocks at all
- Idea:
  - Use page-level mapping
  - Keep the full mapping table on flash to reduce SRAM use
  - Exploit temporal locality in workloads
  - Dynamically load/unload page-level mappings into/out of SRAM
13. DFTL Architecture
[Figure: DFTL architecture; global mapping table on flash]
14. DFTL Address Translation
- Case 1: the requested LPN hits in the cached mapping table (CMT)
  - Done. Retrieve the mapping directly
15. DFTL Address Translation
- Case 2: a miss in the cached mapping table (CMT), and the CMT is not full
  - Look up the GTD (global translation directory)
  - Read the translation page
  - Fill in the CMT entry
  - Go to Case 1
16. DFTL Address Translation
- Case 3: a miss in the cached mapping table (CMT), and the CMT is full
  - Select a CMT entry to evict (LRU)
  - Write back the evicted entry if dirty
  - Go to Case 2
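The three cases above amount to an LRU cache sitting in front of the on-flash mapping. A sketch under assumptions: flash I/O is simulated with plain dicts, and the structure names are illustrative, not the paper's implementation:

```python
from collections import OrderedDict

# Sketch of DFTL's cached mapping table (CMT). The full LPN -> PPN map
# lives on flash in translation pages; the GTD locates those pages.
# Here translation pages are simulated with in-memory dicts.
translation_pages = {0: {0: 100, 1: 101}, 1: {64: 200}}  # hypothetical
ENTRIES_PER_TPAGE = 64

class CMT:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.cache = OrderedDict()  # LPN -> (PPN, dirty)

    def lookup(self, lpn: int) -> int:
        # Case 1: hit in the CMT -- refresh LRU order and return.
        if lpn in self.cache:
            self.cache.move_to_end(lpn)
            return self.cache[lpn][0]
        # Case 3: CMT full -> evict the LRU entry, writing it back if dirty.
        if len(self.cache) >= self.capacity:
            victim, (ppn, dirty) = self.cache.popitem(last=False)
            if dirty:
                translation_pages[victim // ENTRIES_PER_TPAGE][victim] = ppn
        # Case 2: consult the GTD, read the translation page, fill the CMT.
        tpage = translation_pages[lpn // ENTRIES_PER_TPAGE]
        self.cache[lpn] = (tpage[lpn], False)
        return tpage[lpn]

cmt = CMT(capacity=2)
print(cmt.lookup(0))   # miss: read translation page -> 100
print(cmt.lookup(0))   # hit: served from the CMT    -> 100
```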
17. Address Translation Cost
- Worst-case cost (Case 3):
  - 2 translation page reads
  - 1 translation page write
- Temporal locality helps:
  - More hits, fewer misses, fewer evictions
- Multiple CMT entries can belong to a single translation page
  - Enables batched updates
18. Data Read
- Address translation: LPN → PPN
- Read the data page at PPN
19. Writes
- Current data block
  - The updated data page is appended to the current data block
- Current translation block
  - The updated translation page is appended to the current translation block
- Continues until the number of free blocks < GC_threshold
20. Garbage Collection
- Victim selection follows [15] Kawaguchi et al., 1995
21. Garbage Collection
- If the selected victim block is a translation block:
  - Copy valid pages to a free translation block
  - Update the GTD (global translation directory)
- If the selected victim block is a data block:
  - Copy valid pages to a free data block
  - Update the page-level translation for each copied data page:
    - Possibly update the CMT entry (if present, done)
    - Otherwise, locate the translation page, update it, and change the GTD
  - Batch-update opportunities arise when multiple page-level translations are in the same translation page
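The two victim cases above can be sketched compactly. This is illustrative only: structures are plain dicts, flash I/O is elided, and the names are not from the paper:

```python
# Sketch of DFTL garbage collection for the two victim-block types.

def collect(victim, free_ppns, gtd, cmt, entries_per_tpage=64):
    """Copy valid pages out of `victim` and fix up the mappings.
    Returns the set of translation pages needing a (batchable) update."""
    dirty_tpages = set()
    for lpn, ppn in list(victim["valid"].items()):
        new_ppn = free_ppns.pop(0)           # copy the page to a free block
        if victim["type"] == "translation":
            gtd[lpn] = new_ppn               # GTD points at translation pages
        else:
            if lpn in cmt:
                cmt[lpn] = new_ppn           # cached: update the CMT, done
            else:                            # else the translation page (and
                dirty_tpages.add(lpn // entries_per_tpage)  # GTD) must change
    victim["valid"].clear()                  # victim can now be erased
    return dirty_tpages

# Two data pages sharing one translation page -> one batched update.
victim = {"type": "data", "valid": {10: 500, 11: 501}}
print(collect(victim, [900, 901], {}, {}))  # -> {0}
```

The returned set shows the batching opportunity the slide mentions: co-located translations collapse into a single translation-page rewrite.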
22. Benefits
- Page-level mapping
  - No expensive full merge operations
  - Better random write performance as a result
- But random writes are still worse than sequential ones:
  - More CMT misses, more translation page writes
  - Data pages in a block are more scattered
  - GC costs are higher; fewer opportunities for batch updates
23. Outline
- Introduction
- Background on FTL
- Design of DFTL
- Experimental Results
- Summary
24. FTL Schemes Implemented
- FlashSim simulator
- The authors enhanced DiskSim
- Block-based FTL
- A state-of-the-art hybrid FTL (FAST FTL)
- DFTL
- An idealized page-based FTL
25. Experimental Setup
- Model: 32GB flash memory, 2KB pages, 128KB blocks
- Timing parameters are shown in Table 1
26. Traces Used in Experiments
27. Block Erases
- Baseline: idealized page-level FTL
28. Extra Read/Write Operations
- 63% CMT hit rate for the Financial trace
29. Response Times (from the tech report)
30. CDF
31. CDF
- Address translation overhead shows up
32. CDF
- FAST has a long tail
33. Figure 10: Microscopic Analysis
34. Summary
- Demand-based page-level FTL
- Two-level page table:
  - (Flash) Translation pages: LPN → PPN entries
  - (SRAM) Global translation directory: translation page entries
- Mapping cache (CMT) in SRAM