DFTL: A flash translation layer employing demand-based selective caching of page-level address mappings

1
DFTL: A flash translation layer employing
demand-based selective caching of page-level
address mappings
  • A. Gupta, Y. Kim, B. Urgaonkar, Penn State, ASPLOS
    2009
  • Shimin Chen, Big Data Reading Group

2
Introduction
  • Goal: improve performance of flash-based devices
    for workloads with random writes
  • New proposal: DFTL (Demand-based FTL)
  • FTL (flash translation layer)
  • FTL maintains a mapping table: virtual → physical
    address

3
Outline
  • Introduction
  • Background on FTL
  • Design of DFTL
  • Experimental Results
  • Summary

4
Basics of Flash Memory
  • OOB (out-of-band) area holds:
  • ECC
  • Logical page number
  • State: erased/valid/invalid

5
Flash Translation Layer
  • Maintain mapping
  • Virtual address (exposed to upper level) →
    physical address (on flash)
  • Use a small, fast SRAM for storing this mapping
  • Hide erase operations from the upper layers
  • Avoid in-place updates
  • Write updates to a clean page
  • Perform garbage collection and erasure
  • Note
  • OOB has the physical → virtual mapping
  • FTL virtual → physical mapping can be rebuilt (at
    restart)
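The note about rebuilding the mapping at restart can be illustrated with a small sketch. It assumes the OOB area of every physical page can be read as an (LPN, state) pair; the function name and data layout are hypothetical and only illustrate the idea.

```python
# Rebuild the virtual -> physical mapping by scanning OOB areas.
# ASSUMPTION: oob[ppn] is an (lpn, state) pair, with state one of
# "erased", "valid", "invalid" (illustrative layout, not from the slides).
def rebuild_mapping(oob):
    mapping = {}
    for ppn, (lpn, state) in enumerate(oob):
        if state == "valid":      # only valid pages hold the current copy
            mapping[lpn] = ppn
    return mapping


oob = [(5, "invalid"), (5, "valid"), (None, "erased"), (2, "valid")]
mapping = rebuild_mapping(oob)    # LPN 5 now lives at PPN 1, LPN 2 at PPN 3
```

Because at most one physical page is valid per logical page, a single scan is enough to recover the table.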

6
Page-Level FTL
  • Keep a page-to-page mapping table
  • Pro: can map any logical page to any physical
    page
  • Efficient flash page utilization
  • Con: mapping table is large
  • E.g., 16GB flash with 2KB flash pages requires 32MB
    SRAM
  • As flash size increases, SRAM size must scale
  • Too expensive!
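The 32MB figure can be checked with a quick calculation. The sketch below assumes 4-byte mapping entries, which the slide does not state but which is consistent with its numbers; the function name is illustrative.

```python
# Back-of-the-envelope size of a page-level mapping table.
# ASSUMPTION: each mapping entry takes 4 bytes (not stated on the slide).
KB, MB, GB = 2**10, 2**20, 2**30


def page_table_bytes(flash_bytes, page_bytes, entry_bytes=4):
    """One mapping entry per flash page."""
    num_pages = flash_bytes // page_bytes
    return num_pages * entry_bytes


size = page_table_bytes(16 * GB, 2 * KB)
print(size // MB, "MB")  # 32 MB
```

Doubling the flash capacity doubles the table, which is why the SRAM requirement scales with device size.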

7
Block-Level FTL
  • Keep a block-to-block mapping
  • Pro: small
  • Mapping table size reduced by a factor of (block
    size / page size), about 64 times
  • Con: page number offset within a block is fixed
  • Garbage collection overheads grow
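A minimal sketch of block-level translation, showing why the page offset is fixed. It assumes 64 pages per block (for example a 128KB block with 2KB pages); the map contents and function name are hypothetical.

```python
from typing import Dict

# Block-level translation: only the block number is remapped;
# the page offset inside the block is carried over unchanged.
PAGES_PER_BLOCK = 64  # ASSUMPTION: e.g. 128KB block / 2KB page


def block_level_translate(lpn: int, block_map: Dict[int, int]) -> int:
    lbn, offset = divmod(lpn, PAGES_PER_BLOCK)
    pbn = block_map[lbn]               # one table entry per block
    return pbn * PAGES_PER_BLOCK + offset


block_map = {0: 7, 1: 3}               # hypothetical logical -> physical blocks
ppn = block_level_translate(70, block_map)   # LPN 70 = block 1, offset 6
```

Since a logical page can only live at one offset of its mapped block, updating it in place is impossible without moving the rest of the block, which is where the extra garbage collection cost comes from.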

8
Hybrid FTLs (a generic description)
LPN: Logical Page Number
  • Data blocks: block-level mapping
  • Log/update blocks: page-level mapping

9
Operations in Hybrid FTLs
  • Update on data blocks: write to log blocks
  • Log region is small (e.g., 3% of total flash
    size)
  • Garbage collection (gc)
  • When no free log blocks are available, invoke gc
    to merge log blocks with data blocks

10
Full Merge Can Be Recursive and Thus Expensive
  • Often results from random writes

11
Outline
  • Introduction
  • Background on FTL
  • Design of DFTL
  • Experimental Results
  • Summary

12
DFTL Idea
  • Avoid expensive full merges totally
  • Do not use log blocks at all
  • Idea
  • Use page-level mapping
  • Keep the full mapping on flash to reduce SRAM use
  • Exploit temporal locality in workloads
  • Dynamically load / unload page-level mappings
    into SRAM

13
DFTL Architecture
[Figure: DFTL architecture; visible label: Global mapping table]

14
DFTL Address Translation
Case 1: request_LPN hits in the cached mapping
table (CMT)
    Done. Retrieve the mapping directly

15
DFTL Address Translation
Case 2: a miss in the cached mapping table (CMT)
If (CMT is not full) then
    look up GTD
    read the translation page
    fill in CMT entry
    goto case 1

16
DFTL Address Translation
Case 3: a miss in the cached mapping table (CMT)
If (CMT is full) then
    select CMT entry to evict (LRU)
    write back dirty entry
    goto case 2

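The three translation cases can be put together in a toy model. This is only a sketch under stated assumptions: 512 mapping entries per translation page (2KB page, 4-byte entries), an in-memory dict standing in for flash, and hypothetical class and method names. It counts translation page reads and writes so the cost of each case is visible.

```python
from collections import OrderedDict

ENTRIES_PER_TPAGE = 512  # ASSUMPTION: 2KB translation page / 4-byte entries


class DFTLTranslator:
    """Toy model of DFTL address translation (cases 1-3)."""

    def __init__(self, cmt_capacity):
        self.cmt_capacity = cmt_capacity
        self.cmt = OrderedDict()  # lpn -> [ppn, dirty], oldest entry first
        self.gtd = {}             # translation page number -> flash location
        self.flash = {}           # flash location -> {lpn: ppn}
        self.next_loc = 0
        self.tp_reads = 0
        self.tp_writes = 0

    def _read_tpage(self, tvpn):
        loc = self.gtd.get(tvpn)
        if loc is None:
            return {}             # translation page never written yet
        self.tp_reads += 1
        return dict(self.flash[loc])

    def _write_tpage(self, tvpn, entries):
        # Out-of-place write: new location, then point the GTD at it.
        self.tp_writes += 1
        self.flash[self.next_loc] = entries
        self.gtd[tvpn] = self.next_loc
        self.next_loc += 1

    def _evict(self):
        # Case 3: evict the LRU entry; write it back only if dirty.
        lpn, (ppn, dirty) = self.cmt.popitem(last=False)
        if dirty:
            tvpn = lpn // ENTRIES_PER_TPAGE
            entries = self._read_tpage(tvpn)
            entries[lpn] = ppn
            self._write_tpage(tvpn, entries)

    def _load(self, lpn):
        # Case 2: GTD lookup, read translation page, fill in CMT entry.
        if len(self.cmt) >= self.cmt_capacity:
            self._evict()
        entries = self._read_tpage(lpn // ENTRIES_PER_TPAGE)
        self.cmt[lpn] = [entries.get(lpn), False]

    def translate(self, lpn):
        if lpn not in self.cmt:
            self._load(lpn)
        self.cmt.move_to_end(lpn)  # Case 1: hit, mark most recently used
        return self.cmt[lpn][0]

    def update(self, lpn, new_ppn):
        """Record a new physical page for lpn; the entry becomes dirty."""
        if lpn not in self.cmt:
            self._load(lpn)
        self.cmt[lpn] = [new_ppn, True]
        self.cmt.move_to_end(lpn)


t = DFTLTranslator(cmt_capacity=1)
t.update(0, 100)       # miss, nothing to evict yet
t.update(600, 200)     # miss, evicts dirty LPN 0 (written back)
print(t.translate(0))  # 100, read back from its translation page
```

With a one-entry CMT every access to a different LPN forces an eviction, which makes the worst case easy to trigger; a realistic CMT would hold many entries and rely on temporal locality.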
17
Address Translation Cost
  • Worst-case cost (case 3)
  • 2 translation page reads
  • 1 translation page write
  • Temporal locality
  • More hits, fewer misses, fewer evictions
  • CMT may hold multiple mappings from the same
    translation page
  • Batch updates: write them back together
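The effect of temporal locality on the average overhead can be sketched with a simple cost model. The model and the latencies below are illustrative assumptions, not the values from the paper's timing table: a hit costs nothing, a miss costs one translation page read, and a miss that evicts a dirty entry costs one more read plus one write.

```python
def expected_translation_cost(hit_ratio, dirty_evict_ratio, t_read, t_write):
    """Average extra flash time per request spent on address translation.

    dirty_evict_ratio is the fraction of misses that must write back
    a dirty CMT entry (the 2 reads + 1 write worst case).
    """
    miss = 1.0 - hit_ratio
    return miss * (t_read + dirty_evict_ratio * (t_read + t_write))


# Placeholder latencies in microseconds, NOT taken from Table 1.
worst = expected_translation_cost(0.0, 1.0, t_read=25.0, t_write=200.0)
best = expected_translation_cost(1.0, 1.0, t_read=25.0, t_write=200.0)
```

Under this model the overhead falls linearly as the CMT hit ratio rises, which is the reason the design leans on temporal locality.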

18
Data Read
  • Address translation: LPN → PPN
  • Read the data page at PPN

19
Writes
  • Current data block
  • Updated data page is appended to the current data
    block
  • Current translation block
  • Updated translation page is appended to the current
    translation block
  • Until number of free blocks < GC_threshold (then
    invoke garbage collection)
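The append-only write path can be sketched as a small allocator that keeps one current block per kind of page. All names are hypothetical, and the free-block list is a simplification of whatever bookkeeping a real FTL uses.

```python
class BlockAllocator:
    """Append pages to a 'current' block; take a free block when it fills."""

    def __init__(self, num_blocks, pages_per_block, gc_threshold):
        self.free_blocks = list(range(num_blocks))
        self.pages_per_block = pages_per_block
        self.gc_threshold = gc_threshold
        # kind ("data" or "translation") -> (block, next free offset)
        self.current = {}

    def append(self, kind):
        """Return the PPN where the next page of this kind is written."""
        block, offset = self.current.get(kind, (None, self.pages_per_block))
        if offset == self.pages_per_block:   # no current block, or it is full
            block, offset = self.free_blocks.pop(0), 0
        self.current[kind] = (block, offset + 1)
        return block * self.pages_per_block + offset

    def needs_gc(self):
        return len(self.free_blocks) < self.gc_threshold


alloc = BlockAllocator(num_blocks=4, pages_per_block=2, gc_threshold=2)
first_data = alloc.append("data")          # opens block 0
first_tp = alloc.append("translation")     # opens block 1
```

Keeping data pages and translation pages in separate current blocks means a victim block chosen later contains only one kind of page.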

20
Garbage Collection
  • Select a victim block (Kawaguchi et al. 1995 [15])

21
Garbage Collection
  • If the selected victim block is a translation block
  • Copy valid pages to a free translation block
  • Update GTD (global translation directory)
  • If the selected victim block is a data block
  • Copy valid pages to a free data block
  • Update the page-level translation for each copied
    data page
  • Possibly update CMT entry (if so, done)
  • Locate translation page, update it, change GTD
  • Batch update opportunities if multiple page-level
    translations are in the same translation page
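The batch update opportunity can be sketched by grouping the moved pages by the translation page that holds their mapping. The 512-entries-per-page figure is an assumption (2KB page, 4-byte entries), and the function name is hypothetical.

```python
from collections import defaultdict

ENTRIES_PER_TPAGE = 512  # ASSUMPTION: 2KB translation page / 4-byte entries


def group_by_translation_page(moved):
    """Group relocated pages by the translation page holding their mapping.

    moved: dict lpn -> new_ppn for valid pages copied out of the victim.
    Returns: translation page number -> {lpn: new_ppn}, so each
    translation page needs to be rewritten only once.
    """
    batches = defaultdict(dict)
    for lpn, new_ppn in moved.items():
        batches[lpn // ENTRIES_PER_TPAGE][lpn] = new_ppn
    return dict(batches)


moved = {3: 900, 4: 901, 1030: 902}
batches = group_by_translation_page(moved)
# LPNs 3 and 4 share translation page 0; LPN 1030 falls in page 2,
# so two translation pages are rewritten instead of three.
```

Sequential workloads tend to put neighbouring LPNs in the same block, so their mappings share translation pages and batch well; random workloads scatter them.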

22
Benefits
  • Page-level mapping
  • No expensive full merge operations
  • Better random write performance as a result
  • But random writes are still worse than sequential
    writes
  • More CMT misses, more translation page writes
  • Data pages in a block are more scattered
  • GC costs are higher: fewer opportunities for batch
    updates

23
Outline
  • Introduction
  • Background on FTL
  • Design of DFTL
  • Experimental Results
  • Summary

24
FTL Schemes Implemented
  • FlashSim simulator
  • The authors enhanced DiskSim
  • Block-based FTL
  • A state-of-the-art hybrid FTL (FAST FTL)
  • DFTL
  • An idealized page-based FTL

25
Experimental Setup
  • Model: 32GB flash memory, 2KB page, 128KB block
  • Timing is displayed in Table 1

26
Traces Used in Experiments

27
Block Erases
Baseline: idealized page-level FTL

28
Extra Read/Write Operations
63% CMT hits for financial

29
Response Times (from tech report)

30
CDF

31
CDF
Address translation overhead shows up

32
CDF
FAST has a long tail

33
Figure 10. Microscopic analysis

34
Summary
  • Demand-based page-level FTL
  • Two-level page table
  • (Flash) Translation pages: LPN-to-PPN entries
  • (SRAM) Global translation directory: translation
    page entries
  • Mapping cache in SRAM
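To see why the two-level table keeps SRAM use small, the size of the global translation directory can be estimated for the 32GB, 2KB-page configuration from the experimental setup. The 4-byte entry size is an assumption and the function name is illustrative.

```python
KB, GB = 2**10, 2**30
ENTRY_BYTES = 4  # ASSUMPTION: 4 bytes per mapping or directory entry


def gtd_bytes(flash_bytes, page_bytes, entry_bytes=ENTRY_BYTES):
    """SRAM needed for one directory entry per translation page."""
    num_pages = flash_bytes // page_bytes
    entries_per_tpage = page_bytes // entry_bytes
    num_tpages = -(-num_pages // entries_per_tpage)  # ceiling division
    return num_tpages * entry_bytes


print(gtd_bytes(32 * GB, 2 * KB) // KB, "KB")  # 128 KB
```

Under the same assumption a full page-level table for this device would need 64MB, so only the directory and the cached mappings have to stay in SRAM.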