Title: Some Results on Codes for Flash Memory
1. Some Results on Codes for Flash Memory
- Michael Mitzenmacher
- Includes work with Hilary Finucane, Zhenming Liu, and Flavio Chierichetti
2. Flash Memory
- Now becoming the standard for many products and devices.
- Even flash hard drives are becoming standard.
- But flash memory works differently than traditional memories.
- New, interesting questions.
3. Basics of Flash
- Data organized into cells
- Can write at the cell level
- Cells contain electrons
- Can ADD electrons at the cell level
- Typical ranges are 2-4 possible states per cell, but may increase (256 someday?)
- Cells organized into blocks
- Can only ERASE at the block level
- Blocks can be thousands/hundreds of thousands of cells
4. The Problem with Erasures
- Erasing a block is expensive
- In terms of time: solved by preemptive moves of data.
- In terms of wear.
- Limited life cycles imply that minimizing block erasures is an important goal.
5. Basics of Flash
- Reading and one-way writing (adding electrons) are easy.
- Writing general values is hard.
- What should our data representation look like in such a setting?
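Example cell states:
Easy: 0 2 3 1 -> 2 2 3 1 (a cell value only increases)
Hard: 0 2 3 1 -> 0 2 1 1 (a cell value would have to decrease)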
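The asymmetry between adding charge and erasing can be sketched as a simple check: a rewrite is cheap only if no cell value has to decrease. The function name `can_write_without_erase` is hypothetical, and the two transitions reuse the cell values listed above as illustrative inputs (pairing them this way is an assumption).

```python
from typing import Sequence


def can_write_without_erase(old: Sequence[int], new: Sequence[int]) -> bool:
    """Return True if `new` is reachable from `old` by only adding charge.

    Each entry is the charge level of one cell. Lowering any cell would
    require erasing the whole block, so such a rewrite is rejected here.
    """
    return len(old) == len(new) and all(n >= o for o, n in zip(old, new))


# First cell goes 0 -> 2, nothing decreases: allowed.
print(can_write_without_erase([0, 2, 3, 1], [2, 2, 3, 1]))  # True
# Third cell would go 3 -> 1: needs a block erase.
print(can_write_without_erase([0, 2, 3, 1], [0, 2, 1, 1]))  # False
```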
6. Big Underlying Question
- How should flash change our underlying algorithms, data structures, data representation?
- Memory structure and hierarchy have a big impact on performance.
- Algorithmists should care!
- Here focusing on the basic question of data representation.
7. Some History
- Write-once memories (WOMs)
- Introduced by Rivest and Shamir, early 1980s.
- Punch cards, optical disks.
- Can turn 0s to 1s, but not back again.
- Question: How many punch card bits do you need to represent t rewrites of a k-bit value?
- Starting point for this kind of analysis.
- Better schemes than the naïve kt bits.
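As an illustration that is not taken from the slides, the standard two-write example from the WOM literature stores a 2-bit value twice (k = 2, t = 2) in 3 write-once bits instead of the naive kt = 4. The names `FIRST`, `SECOND`, `DECODE`, and `write` below are illustrative only.

```python
# First-generation codewords have weight <= 1; second-generation codewords
# are their bitwise complements, so every 3-bit pattern decodes to one value.
FIRST = {"00": "000", "01": "100", "10": "010", "11": "001"}
SECOND = {
    value: "".join("1" if bit == "0" else "0" for bit in word)
    for value, word in FIRST.items()
}
DECODE = {word: value for table in (FIRST, SECOND) for value, word in table.items()}


def write(state: str, value: str) -> str:
    """Return a new 3-bit state storing `value`, only turning 0s into 1s."""
    if DECODE[state] == value:
        return state  # nothing to change
    for table in (FIRST, SECOND):
        target = table[value]
        # Legal only if no bit has to go from 1 back to 0.
        if all(t >= s for s, t in zip(state, target)):
            return target
    raise ValueError("no legal write left; the memory is used up")


state = write("000", "01")   # first write
state = write(state, "10")   # second write
print(state, DECODE[state])  # 101 10
```

A third write of a different value raises `ValueError`, which is the point at which a fresh card (or, in flash terms, an erase) would be needed.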
8. Floating Codes
- Data representation for flash memory.
- State is a length-n sequence of q-ary numbers.
- Represents a block of n cells; each cell holds an electric charge, with q states.
- State mapped to variable values.
- Gives a length-k sequence of l-ary numbers.
- State changes by increasing one or more cell values, or by resetting the entire block.
- Resets are expensive!
9. Floating Codes: The Problem
- As variable values change, need state to track variables.
- How do we choose the mapping function from states to variables AND the transition function from variable changes to state changes to maximize the time between reset operations?
- These codes do not correct errors. Just data representation.
- Errors are a separate issue.
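To make the two design choices concrete, here is a minimal sketch of a mapping function D and a transition function R for binary variables, plus a driver that counts updates until a reset is forced. The code itself is a deliberately naive baseline (each bit gets its own n/k cells and is stored as their parity); it is an assumption for illustration, not one of the codes from the talk, and all function names are hypothetical.

```python
from typing import Callable, List, Optional, Sequence, Tuple

State = List[int]
Decoder = Callable[[State], List[int]]
Updater = Callable[[State, int], Optional[State]]


def make_partition_code(n: int, k: int, q: int) -> Tuple[Decoder, Updater]:
    """Naive code: bit b owns cells [b*size, (b+1)*size) and equals their parity."""
    size = n // k  # assumes k divides n

    def D(state: State) -> List[int]:
        return [sum(state[b * size:(b + 1) * size]) % 2 for b in range(k)]

    def R(state: State, bit: int) -> Optional[State]:
        # Flip `bit` by raising the first non-full cell in its region by 1.
        for i in range(bit * size, (bit + 1) * size):
            if state[i] < q - 1:
                new_state = list(state)
                new_state[i] += 1
                return new_state
        return None  # region is full: a block reset would be needed

    return D, R


def updates_before_reset(D: Decoder, R: Updater, n: int, toggles: Sequence[int]) -> int:
    """Apply bit toggles from the all-zero state; count those done before a reset."""
    state, done = [0] * n, 0
    for bit in toggles:
        nxt = R(state, bit)
        if nxt is None:
            break
        want = D(state)
        want[bit] ^= 1
        assert D(nxt) == want, "transition must flip exactly the requested bit"
        state, done = nxt, done + 1
    return done


D, R = make_partition_code(n=8, k=4, q=4)
print(updates_before_reset(D, R, 8, [0] * 100))           # 6
print(updates_before_reset(D, R, 8, [0, 1, 2, 3] * 10))   # 24
```

The gap between the two runs (6 updates when one bit keeps changing, 24 when changes are spread evenly) is the kind of behavior a better choice of D and R is meant to avoid.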
10. Formal Model
- General Codes
- We usually consider a limited variation: one variable changes per step.
Example: track k = 4 bits (so l = 2) with n = 8 cells having q = 4 states.
Cells: 3 2 2 0 3 0 3 1   Bits: 1 0 1 0
Change bit 3
Cells: 3 2 2 0 3 1 3 1   Bits: 1 0 0 0
Change bit 2
Cells: 3 2 3 0 3 1 3 1   Bits: 1 1 0 0
Change bit 1
Cells: 3 3 2 0 3 1 3 1   Bits: 0 1 0 0
Change bit 1
Cells: 1 0 1 0 0 0 0 0   Bits: 1 1 0 0
- Floating codes introduced by Jiang, Bohossian, Bruck (ISIT 2007) as a model for flash memory.
- Designed to maximize worst-case time between resets.
- New multidimensional flash codes suggested by Yaakobi, Vardy, Siegel, Wolf in Allerton 2008.
- Average case studied by Finucane, Liu, Mitzenmacher in Allerton 2008.
13. Contribution 1: New Worst-Case Codes
- Hilary Finucane's senior thesis.
- Similar codes also found simultaneously by Yaakobi et al.
- Simple construction, best known performance.
- Tracks k bits of data, for even k.
- Performance measured by deficiency.
- Max possible updates is n(q-1).
- Deficiency is the smallest t such that n(q-1)-t updates are always possible.
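A short worked calculation may help fix the definition. The bound n(q-1) holds because every update must raise at least one cell by at least 1, and the n cells can absorb at most n(q-1) increments in total. The sketch below also evaluates the deficiency of a naive layout that reserves n/k cells per bit; that layout is a hypothetical baseline for comparison, not a code from the talk, and the function names are illustrative.

```python
def max_updates(n: int, q: int) -> int:
    """Upper bound on updates before a reset: total increments the block can hold."""
    return n * (q - 1)


def guaranteed_updates(n: int, q: int, t: int) -> int:
    """Updates a code with deficiency t can always perform before a reset."""
    return max_updates(n, q) - t


def naive_partition_deficiency(n: int, k: int, q: int) -> int:
    """Deficiency when each bit owns n/k cells (assumes k divides n).

    Worst case: the same bit changes every time, so only its own
    (n/k)(q-1) increments are usable before a reset.
    """
    return max_updates(n, q) - (n // k) * (q - 1)


print(max_updates(8, 4))                      # 24
print(naive_partition_deficiency(8, 4, 4))    # 18
print(naive_partition_deficiency(16, 4, 4))   # 36
```

For this naive layout the deficiency grows with n, which is the behavior a good worst-case code is designed to avoid.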
14. Mod-Based Codes
- Break block into groups of k cells.
- Each group will represent 1 bit.
- And at most one active group per bit.
- Parity of group determines value of bit.
- Increase a cell by 1 each time the bit changes.
- How do we know which bit for each group?
- Start with jth cell within a group to represent bit j.
- As cells fill, go right, moving back to the first cell at the end.
- Either the last empty cell is j - 1, or the only non-full cell is j - 1; either way, can figure out which bit.
- Maximum deficiency k^2 q. Independent of n!
Example: track k = 8 bits with cells having q = 4 states.
0 0 0 0 3 0 0 0 : Bit 5 is 1
0 0 0 0 3 3 2 0 : Bit 5 is 0
3 3 3 3 3 3 2 0 : Bit 1 is 0
3 3 1 3 3 3 3 3 : Bit 4 is 0
0 0 0 0 0 0 0 0 : Empty block, ignore
3 3 3 3 3 3 3 3 : Full block, ignore
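The decoding rule for a single group can be sketched as follows, under the reading that cells are filled rightward from cell j with wraparound and that the bit value is the parity of the group's total charge. The function name `decode_group` is hypothetical, and bit indices are 1-based to match the example.

```python
from typing import List, Optional, Tuple


def decode_group(cells: List[int], q: int) -> Optional[Tuple[int, int]]:
    """Return (bit index, bit value) for one group of k cells, or None.

    None means the group is completely empty or completely full and
    therefore carries no information.
    """
    k = len(cells)
    if all(c == 0 for c in cells) or all(c == q - 1 for c in cells):
        return None
    if 0 in cells:
        # Filling started at cell j and moved right, so j is the one
        # non-empty cell whose (cyclic) left neighbor is still empty.
        start = next(i for i in range(k) if cells[i] != 0 and cells[i - 1] == 0)
    else:
        # No empty cells: the only non-full cell is the one just before j.
        last = next(i for i in range(k) if cells[i] != q - 1)
        start = (last + 1) % k
    return start + 1, sum(cells) % 2


for group in ([0, 0, 0, 0, 3, 0, 0, 0],
              [0, 0, 0, 0, 3, 3, 2, 0],
              [3, 3, 3, 3, 3, 3, 2, 0],
              [3, 3, 1, 3, 3, 3, 3, 3],
              [0, 0, 0, 0, 0, 0, 0, 0],
              [3, 3, 3, 3, 3, 3, 3, 3]):
    print(group, decode_group(group, q=4))
```

Run on the six groups from the example, this yields (5, 1), (5, 0), (1, 0), (4, 0), and None for the empty and full groups.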
16. Further Improvements
- Can improve basic construction by being more careful as the number of available cells gets small.
- Can prove O(kq (log_2 k)(log_q k)) deficiency.
- Use smaller blocks of cells, but explicitly write which bit each stores, when number of cells gets small.
17. Contribution 2: Average Case
- Argument: Worst-case time between resets is not the right design criterion.
- Many resets in a lifetime.
- Mass-produced product.
- Potential to model user behavior.
- Statistical performance guarantees more appropriate.
- Expected time between resets.
- Time with high probability.
- Given a model.
18. Specific Contributions
- Problem definition / model
- Codes for simple cases
19. Formal Model: Average Case
- Above when
- Cost is 0 when R moves to a cell state above the previous one, 1 otherwise.
- Assumption: variable changes are given by a Markov chain.
- Example: ith bit changes with prob. p_i
- Given D, R, gives Markov chain on cell states.
- Let π be the equilibrium distribution on cell states.
- Goal is to minimize average cost
- Same as maximizing average time between resets.
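One way to get a feel for this objective is a small Monte Carlo estimate of the expected number of updates between resets. The sketch below assumes the simplest possible setting, chosen only for illustration: two bits, two cells, each bit stored as the parity of its own cell, and at each step bit 0 changes with probability p and bit 1 otherwise. This naive code is a hypothetical baseline, not a code from the talk, and every run starts from an empty block.

```python
import random


def updates_until_reset(q: int, p: float, rng: random.Random) -> int:
    """Count updates from the all-zero state until a full cell must grow."""
    cells = [0, 0]
    updates = 0
    while True:
        bit = 0 if rng.random() < p else 1  # which bit changes this step
        if cells[bit] == q - 1:
            return updates                  # this change would force a reset
        cells[bit] += 1
        updates += 1


def mean_updates(q: int, p: float, trials: int = 2000, seed: int = 0) -> float:
    """Monte Carlo estimate of the expected time between resets."""
    rng = random.Random(seed)
    return sum(updates_until_reset(q, p, rng) for _ in range(trials)) / trials


for p in (0.5, 0.7, 0.9):
    print(p, mean_updates(q=64, p=p))
```

For this baseline the estimate falls as p moves away from 1/2, since one cell fills long before the other; that sensitivity to the change probabilities is what an average-case design tries to remove.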
- Many possible variations
- Multiple variables change per step
- More general random processes for values
- Rules limiting transitions
- General costs, optimizations
- Hardness results?
- Conjecture: some variations are NP-hard or worse.
21. Building Block Code: n = 2, k = 2, l = 2
- 2 bit values.
- 2 cells.
- Code based on striped Gray code.
- Expected time/time with high probability before reset: 2q - o(q)
- Asymptotically optimal for all p, 0 < p < 1.
- Worst-case optimal is approx. 3q/2.
Example: D(0,0) = 00, D(1,3) = 11, R((1,0),2,1) = (2,0)
22. Proof Sketch
- Even cells: down with probability p, right with probability 1-p.
- Odd cells: right with probability p, down with probability 1-p.
- Code hugs the diagonal.
- Right/down moves approximately balance for the first 2q-o(q) steps.
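The balancing argument can be checked numerically with a small simulation of the walk on the q-by-q grid of cell states. The sketch assumes that "even" and "odd" refer to the parity of the sum of the two cell values (the slide does not spell this out) and that the walk ends when a move would leave the grid; the function names are hypothetical.

```python
import random


def steps_before_reset(q: int, p: float, rng: random.Random) -> int:
    """Walk from (0, 0); stop when a move would push a cell past q - 1."""
    x = y = 0   # x grows on a "right" move, y grows on a "down" move
    steps = 0
    while True:
        first_bit = rng.random() < p          # first bit changes w.p. p
        even = (x + y) % 2 == 0
        # Even state: first bit -> down, second bit -> right.
        # Odd state:  first bit -> right, second bit -> down.
        move_down = first_bit if even else not first_bit
        if move_down:
            if y == q - 1:
                return steps
            y += 1
        else:
            if x == q - 1:
                return steps
            x += 1
        steps += 1


def mean_steps(q: int, p: float, trials: int = 2000, seed: int = 0) -> float:
    """Monte Carlo estimate of the expected number of steps before a reset."""
    rng = random.Random(seed)
    return sum(steps_before_reset(q, p, rng) for _ in range(trials)) / trials


for p in (0.1, 0.3, 0.5):
    print(p, mean_steps(q=100, p=p))
```

Because the parity flips on every step, each pair of consecutive steps contributes one down move and one right move in expectation regardless of p, which is why the walk stays close to the diagonal.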
23. A Slightly Better Code
- Changing the final corner improves things.
24. Performance Results
25. Codes for k = l = 2
- Break into Gray code blocks for larger n.
- Each bit walks along the diagonal of its own Gray code block.
- At the last block, behaves like n = 2, k = 2, l = 2
- Expected deficiency O(sqrt(q)).
Bit 1 changes recorded from the left
Meet somewhere in the middle, depending on rates
Bit 2 changes recorded from the right
27. Random Codes
- Average-case analysis looks at random data
- Natural also to look at random codes (Shannon-style arguments)
- We consider random codes in the setting of general transitions.
- All k bits can change simultaneously
- Give some insights into what may be possible.
- Results in paper.
- New questions arising from flash memory.
- How to store data to maximize lifetimes.
- How to code to deal with errors.
- How to optimize algorithms and data structures.
- How to optimize memory hierarchies and variable-type memory systems.
- Big question: is this a core science game-changer?
- How much should we be re-thinking?