Mobile Memory: Improving memory locality in very large reconfigurable fabrics (presentation transcript)

1
Mobile Memory: Improving memory locality in very large reconfigurable fabrics
Rong Yan, Seth C. Goldstein
  • Carnegie Mellon University
  • {yanrong, seth}@cs.cmu.edu
  • 3/22/2002

2
Outline
  • Motivation
  • Mobile Memory vs. Cache-Only Memory Architecture
  • Design Issues
  • Implementation Cost
  • Conclusion

3
Outline
  • Motivation
  • Mobile Memory vs. Cache-Only Memory Architecture
  • Design Issues
  • Implementation Cost
  • Conclusion

4
Increasing FPGA Density
http://www.xilinx.com/support/techxclusives/evolution-techX20.htm
5
Increasing FPGA Density
  • Configurable Logic Blocks with Embedded Memory
  • Expect entire applications to be mapped onto a
    very large reconfigurable fabric (VLRF)

http://www.xilinx.com/support/techxclusives/evolution-techX20.htm
6
Abstraction for VLRF
(Figure: a very large reconfigurable fabric abstracted as computation cores interleaved with embedded memories)
7
Problem in VLRF
  • Long idle time for some benchmarks
  • One of the main reasons: large memory latency
  • The benchmarks above are all those from
    MediaBench and SpecInt95 that run on our simulator

8
Possible Solutions
  • Possible solutions: exploit the reference
    locality
  • introduce a cache
  • move memory data at run time
  • Our choice: mobile memory
  • Move memory data closer to its accessor at run
    time, inspired by cache-only memory
    architecture (COMA)
  • Investigate whether this is enough or whether a
    more complex solution (e.g., replication) is needed

9
Outline
  • Motivation
  • Mobile Memory vs. Cache-Only Memory Architecture
  • Design Issues
  • Implementation Cost
  • Conclusion

10
Quick Review - COMA
  • Key Points
  • Shared-memory multiprocessors connected by a
    network
  • Main memory acts as a cache
  • Automatically replicate/migrate data to the
    accessing processor at run time

11
Mobile memory vs. COMA
  • Similar idea in different contexts
  • Analogy: code regions in a VLRF play the role
    that processors play in a multiprocessor
12
Mobile memory vs. COMA
13
Limit Study
  • Purpose: examine whether mobile memory is
    beneficial in the context of a VLRF
  • Definitions in our computational model
  • Unit: the area of a 32-bit memory or a 32-bit
    adder (assumed equal in size)
  • Cluster: a number of units grouped together
  • Assumptions
  • Effectively unlimited resources are available
  • Memory data can move to any position at run
    time, even overlapping the code region
  • No additional cost for memory movement
  • No replacement policy
  • Only one memory word moves at a time
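The limit-study assumptions above can be written as a toy cost model. The Manhattan distance metric and the accessor positions below are our own illustrative assumptions, not taken from the paper:

```python
def access_cost(accessor, word_loc):
    """Cycles charged for one access: Manhattan distance between cluster
    coordinates on the fabric grid (an assumed metric, for illustration)."""
    return abs(accessor[0] - word_loc[0]) + abs(accessor[1] - word_loc[1])

# Because movement is free in the limit study, an oracle can place the word
# on the accessor before every access, so the best achievable cost is zero:
trace = [(0, 0), (5, 3), (2, 7)]            # hypothetical accessor positions
best = sum(access_cost(a, a) for a in trace)
print(best)  # 0 -- the bound that real movement policies are measured against
```

Any concrete policy is then judged by how close its total access cost comes to this free-movement bound.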

14
Outline
  • Motivation
  • Mobile Memory vs. Cache-Only Memory Architecture
  • Design Issues
  • Analytical Model
  • Implementation Cost
  • Conclusion

15
Mobile Memory
  • Goal: reduce memory latency by exploiting
    memory locality
  • Approach: move memory data at run time, without
    replication
  • Mobile memory policies

16
Mobile memory policies
  • Three main design axes
  • When to move
  • Where to move (our focus)
  • How much to move
  • Proposed policies: Greedy, N-Best, and Centroid

17
Greedy Policy
  • Always move memory data to the most recent
    accessor

Example (figure: accessor A and memory word M)
18
Greedy Policy
  • Always move memory data to the most recent
    accessor

Example (figure: M after moving to accessor A)
19
Bad case for Greedy Policy
  • Example: ping-pong access, in which two
    accessors alternately access the same memory location
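The greedy rule and its ping-pong pathology can be sketched in a toy 1-D model (the positions and distances are hypothetical, not the paper's simulator):

```python
def greedy_cost(accesses, start):
    """Total access cost when the word always migrates to its last accessor.
    Positions are 1-D cluster coordinates; one access costs the distance."""
    loc, cost = start, 0
    for a in accesses:
        cost += abs(a - loc)   # pay the current distance to the data
        loc = a                # Greedy: move to the most recent accessor
    return cost

# Two accessors at positions 0 and 10 alternate (the ping-pong pattern):
pingpong = [0, 10, 0, 10, 0, 10, 0, 10]
print(greedy_cost(pingpong, start=0))   # 70: every access after the first pays 10
```

The word chases each accessor in turn, so under ping-pong it pays the full inter-accessor distance on almost every access.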

20
N-Best Policy
  • History of the last N accessors, shared by the
    whole cluster
  • Assumes the access pattern repeats
  • Move to the accessor in the history that
    minimizes memory access cycles

Example (N = 3)
21
N-Best Policy
  • History of the last N accessors, shared by the
    whole cluster
  • Assumes the access pattern repeats
  • Move to the accessor in the history that
    minimizes memory access cycles

Example (N = 3; figure: accessors A1, A2, A and memory word M)
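Under the repeating-pattern assumption, the N-Best choice reduces to a minimization over the history; a sketch with 1-D toy positions (our assumption):

```python
def nbest_target(history):
    """Among the last N accessor positions, pick the one that minimizes
    total access cycles if the recorded pattern repeats verbatim."""
    return min(history, key=lambda c: sum(abs(a - c) for a in history))

# Ping-pong history with N = 3: parking the word at position 0 costs
# 0 + 10 + 0 = 10 per repetition, versus 20 at position 10.
print(nbest_target([0, 10, 0]))  # 0
```

Unlike Greedy, N-Best can keep the word parked at the majority accessor instead of bouncing it back and forth.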
22
Centroid Policy
  • History of the last N accessors, shared by the
    whole cluster
  • Move to the centroid of the N accessors

Example (N = 3)
23
Centroid Policy
  • History of the last N accessors, shared by the
    whole cluster
  • Move to the centroid of the N accessors

Example (N = 3; figure: accessors A1, A, A2 and memory word M)
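A minimal sketch of the centroid rule, assuming 2-D cluster coordinates (an illustrative layout, not the paper's):

```python
def centroid_target(history):
    """Move the word to the centroid of the last N accessor positions,
    rounded to the nearest cluster coordinate."""
    n = len(history)
    return (round(sum(x for x, _ in history) / n),
            round(sum(y for _, y in history) / n))

# Three accessors (N = 3) at hypothetical coordinates:
print(centroid_target([(0, 0), (4, 0), (2, 6)]))  # (2, 2)
```

The centroid need not coincide with any accessor; it trades a small distance to every accessor against the larger swings of Greedy.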
24
Comparison
  • An offline algorithm is used to estimate the
    optimal performance
  • Please refer to our paper for more details

25
Memory Access Cycles
Memory access cycles for each policy, normalized
to the baseline cycles
26
Total Cycles
Total cycles for each policy, normalized to the
baseline cycles
27
Outline
  • Motivation
  • Mobile Memory vs. Cache-Only Memory Architecture
  • Design Issues
  • Implementation Cost
  • Conclusion

28
Implementation Cost
  • Directories
  • Cost of locating moved memory
  • Assume each cluster is coupled with two directories

(Figure: a cluster with its Local DIR and Home DIR)
29
Implementation Cost
  • Directories (Cont.)

(Figure: on a local directory miss, accessor A consults the home cluster's Home DIR to locate M)
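The lookup path in the figure can be sketched roughly as follows; the directory names and structure are assumptions based on the slide, not the paper's actual design:

```python
class Cluster:
    def __init__(self):
        self.local_dir = {}   # cached address -> holding cluster (may miss)
        self.home_dir = {}    # authoritative for addresses homed here

def locate(addr, cluster, home_cluster, stats):
    """Return the cluster currently holding addr, counting local misses."""
    if addr in cluster.local_dir:            # Local DIR hit: no remote trip
        return cluster.local_dir[addr]
    stats["local_misses"] += 1               # Local DIR miss:
    loc = home_cluster.home_dir[addr]        # ask the word's home cluster
    cluster.local_dir[addr] = loc            # cache the answer for next time
    return loc

c, home = Cluster(), Cluster()
home.home_dir[0x40] = "cluster_7"            # word 0x40 now lives at cluster 7
stats = {"local_misses": 0}
print(locate(0x40, c, home, stats))          # cluster_7 (one local miss)
print(locate(0x40, c, home, stats))          # cluster_7 (now a local hit)
print(stats["local_misses"])                 # 1
```

This is why the local directory's size matters: the larger it is, the fewer lookups pay the extra trip to the home cluster.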
30
Sensitivity to directory size
The effect of local directory size on memory
access cycles
31
Implementation Cost
  • Cost of making room
  • Ensure enough room to accommodate incoming data
  • Reserve a portion of memory, and thus expand
    the graph
  • Increases both control-transfer cycles and
    memory access cycles

(Figure: four clusters laid out with dilation factor 2)
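The effect of dilation on distances can be sketched as follows, assuming Manhattan distance between cluster coordinates (our illustrative metric):

```python
def dilate(pos, factor):
    """Map a cluster coordinate in the compact layout onto the expanded
    layout; a dilation factor of 2 doubles every inter-cluster distance."""
    return (pos[0] * factor, pos[1] * factor)

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

a, b = (1, 2), (3, 5)                 # two clusters in the compact layout
print(manhattan(a, b))                # 5 hops before dilation
print(manhattan(dilate(a, 2), dilate(b, 2)))  # 10: twice the cycles per transfer
```

Every control transfer and memory access covers a stretched distance, which is why both cycle counts grow with the dilation factor.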
32
Effect of implementation cost
Total execution cycles for the centroid policy
with different dilation factors, normalized to the
baseline cycles
33
Conclusion
  • Mobile memory, inspired by COMA, aims to solve
    the memory bottleneck in VLRFs
  • Simple heuristics are enough
  • Keeping the implementation cost low is a key issue
  • Mobile memory alone may not be sufficient
  • Replication is probably required