Title: Memory
Memory
- Main Memory (Sections 5.8 and 5.9)
- Simple main memory
- Wider memory
- Interleaved memory
- Memory Technologies
- DRAM, SRAM
- Advances in DRAM technology
- Virtual Memory (Section 5.10 and 5.11)
- Motivation
- Basics
- Address translation
- Interaction with caches
- Protection
Simple Main Memory
- Consider a memory with these parameters
- 1 cycle to send address
- 6 cycles to access each word
- 1 cycle to send word back to CPU/Cache
- What's the miss penalty for a 4-word block?
- (1 cycle + 6 cycles + 1 cycle) × 4 words
- 32 cycles
- How can we speed this up?
Wider Main Memory
- Make the memory wider
- Read out 2 (or more) words in parallel
- Memory parameters
- 1 cycle to send address
- 6 cycles to access each doubleword
- 1 cycle to send doubleword back to CPU/Cache
- Miss penalty for a 4-word block
- (1 cycle + 6 cycles + 1 cycle) × 2 doublewords
- 16 cycles
- Cost
- Wider bus
- Larger minimum expansion increment
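The two miss-penalty calculations above can be sketched as a short Python helper (the function name and parameter names are ours, not from the slides):

```python
# Miss penalty for fetching a block from main memory, using the
# slides' model: 1 cycle to send the address, 6 cycles per access,
# and 1 cycle to return each access's worth of data.
def miss_penalty(block_words, bus_words,
                 addr_cycles=1, access_cycles=6, xfer_cycles=1):
    """Cycles to fetch a block_words-word block over a bus_words-wide memory."""
    accesses = block_words // bus_words  # sequential accesses needed
    return (addr_cycles + access_cycles + xfer_cycles) * accesses

print(miss_penalty(4, 1))  # one-word-wide memory -> 32 cycles
print(miss_penalty(4, 2))  # doubleword-wide memory -> 16 cycles
```

Doubling the bus width halves the number of accesses and thus the penalty, at the cost of the wider bus noted above.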
Interleaved Main Memory
- Organize memory in banks
- Subsequent words map to different banks
- Word A in bank (A mod M)
- Within a bank, word A in location (A div M)
[Figure: word address split into bank number and word-within-bank fields]
How many banks to include? banks ≥ clock
cycles to access a word in a bank
Interleaved Main Memory (Cont.)
- Simple interleaving for sequential accesses
- (e.g., cache blocks)
- Complex interleaving for others
- (e.g., requests from non-blocking caches)
- Alternative: independent memory banks
- Each bank has a separate controller, separate
address lines, and maybe separate data lines
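The bank-mapping rule above (bank = A mod M, location = A div M) can be sketched directly; the helper names are ours:

```python
# Simple interleaving across M banks: sequential word addresses
# land in different banks, so their accesses can overlap.
def bank_of(addr, num_banks):
    return addr % num_banks        # word A in bank (A mod M)

def location_in_bank(addr, num_banks):
    return addr // num_banks       # within a bank, location (A div M)

M = 4  # example: 4 banks
for a in range(8):
    print(a, bank_of(a, M), location_in_bank(a, M))
```

With M = 4, words 0..3 fall in banks 0..3 at location 0, and words 4..7 wrap around to banks 0..3 at location 1.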
Memory Technologies
- Dynamic Random Access Memory (DRAM)
- Optimized for density, not speed
- One-transistor cells
- Multiplexed address pins
- Row Address Strobe (RAS)
- Column Address Strobe (CAS)
- Cycle time roughly twice access time
- Destructive reads
- Must refresh every few ms
- Access every row
- Sold as dual inline memory modules (DIMMs)
- 4 to 16 DRAMs on a board, 8 bytes wide
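The multiplexed address pins above can be illustrated with a small sketch: the row half of the address is latched on RAS, then the column half on CAS. The split point (`col_bits`) and function name are illustrative assumptions, not from the slides:

```python
# DRAM multiplexes its address pins: the row address is sent first
# (latched by RAS), then the column address (latched by CAS), so the
# chip needs only half the address pins.
def split_dram_address(addr, col_bits):
    row = addr >> col_bits                # upper bits: row, sent on RAS
    col = addr & ((1 << col_bits) - 1)    # lower bits: column, sent on CAS
    return row, col

print(split_dram_address(0b1011_0110, 4))  # -> (0b1011, 0b0110)
```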
Memory Technologies, cont.
- Static Random Access Memory (SRAM)
- Optimized for speed, then density
- 4-6 transistors per cell
- Separate address pins
- Static → no refresh
- Greater power dissipation than DRAM
- Access time ≈ cycle time
DRAM Advances: Page Mode
- Normal DRAM
- First read entire row
- Then select column from row
- Stores entire row in a buffer
- Page Mode
- Row buffer acts like an SRAM
- By changing column address, random bits can be
accessed within a row.
DRAM Advances: Synchronous DRAM
- Normal DRAM has an asynchronous interface
- Each transfer involves handshaking with the controller
- Synchronous DRAM (SDRAM)
- Clock added to interface
- Register to hold number of bytes requested
- Send multiple bytes per request
- Double Data Rate (DDR)
- Send data on rising and falling edge of clock
DRAM Advances: RAMBUS
- RAMBUS uses the same core DRAM technology, but a new
interface
- Each chip is a memory system
- Interleaved memory
- High-speed interface
- No RAS/CAS
- Packet-switched or split-transaction bus
- Chip can return a variable amount of data, perform
refresh
- Uses a clock, transfers on both edges
- First generation: RDRAM
- Second generation: Direct RDRAM (faster, wider)
Virtual Memory
- User operates in a virtual address space; the mapping
between the virtual space and main memory is
determined at runtime
- Original motivation
- Avoid overlays
- Use main memory as a cache for disk
- Current motivation
- Relocation
- Protection
- Sharing
- Fast startup
- Engineered differently than CPU caches
- Miss access time ~O(1,000,000) cycles
- Miss access time >> miss transfer time
Virtual Memory, cont.
- Blocks, called pages, are 512 bytes to 16 KB.
- Page placement
- Fully associative -- avoids expensive misses
- Page identification
- Address translation -- virtual to physical
address
- Indirection through one or two page tables
- Translation cached in a translation buffer
- Page replacement
- Approximate LRU
- Write strategy
- Write-back (with page dirty bit)
Address Translation
[Figure: virtual address = virtual page number | page offset; the
page-table base register points to the page table, whose entries hold
protection, dirty, reference, and in-memory bits plus the page frame
number; physical address = page frame number | page offset]
- Logical path
- Two memory operations
- Often two or three levels of page tables
- TOO SLOW!
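The logical path above can be sketched as a two-level page-table walk, with dict lookups standing in for the two memory references. The page size, the 10-bit second-level index, and all names are illustrative assumptions:

```python
# Two-level page-table walk: each level costs one memory reference.
PAGE_BITS = 12  # assume 4 KB pages

def translate(vaddr, l1_table):
    vpn = vaddr >> PAGE_BITS
    offset = vaddr & ((1 << PAGE_BITS) - 1)
    l2_table = l1_table[vpn >> 10]   # first memory reference (level 1)
    pfn = l2_table[vpn & 0x3FF]      # second memory reference (level 2)
    return (pfn << PAGE_BITS) | offset

l1 = {0: {5: 0x42}}                  # map VPN 5 to page frame 0x42
print(hex(translate(0x5ABC, l1)))    # -> 0x42abc
```

Every load or store would pay these extra references, which is why the slide concludes "TOO SLOW!" and the next slide caches translations instead.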
Address Translation
[Figure: the virtual page number indexes the TLB; incoming and stored
tags are compared to select a PTE and signal hit/miss; the resulting
page frame number is concatenated with the page offset]
- Fast path
- Translation Lookaside Buffer (TLB, TB)
- A cache with PTEs for data
- Number of entries: 32 to 1024
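A TLB is just a small cache whose data payload is a PTE. A minimal fully associative sketch follows; the entry count and the FIFO eviction (standing in for approximate LRU) are illustrative assumptions:

```python
# A tiny fully associative TLB: maps VPN -> PFN (the cached PTE).
class TLB:
    def __init__(self, entries=64):
        self.entries = entries
        self.map = {}                 # VPN -> PFN

    def lookup(self, vpn):
        return self.map.get(vpn)      # None signals a TLB miss

    def insert(self, vpn, pfn):
        if len(self.map) >= self.entries:
            # Evict the oldest entry (FIFO stand-in for LRU).
            self.map.pop(next(iter(self.map)))
        self.map[vpn] = pfn

tlb = TLB()
tlb.insert(5, 0x42)
print(tlb.lookup(5), tlb.lookup(6))   # hit, then miss
```

On a miss, hardware or the OS walks the page tables (the slow logical path) and inserts the resulting PTE.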
Address Translation / Cache Interaction
- Address translation
- Cache lookup
[Figure: the virtual address (VPN | PO) goes through the TLB to give
the physical address (PFN | PO), which is then split into tag, index,
and block offset for the cache tag read and compare]
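The same physical address is carved up two ways above: PFN | page offset for translation, and tag | index | block offset for the cache lookup. A sketch with illustrative field widths (4 KB pages, 32-byte blocks, 128 sets, all our assumptions):

```python
# Split a physical address into the cache's tag / index / block-offset
# fields. With these widths, index + block offset exactly fill the
# 12-bit page offset.
PAGE_BITS, BLOCK_BITS, INDEX_BITS = 12, 5, 7

def cache_fields(paddr):
    bo = paddr & ((1 << BLOCK_BITS) - 1)
    idx = (paddr >> BLOCK_BITS) & ((1 << INDEX_BITS) - 1)
    tag = paddr >> (BLOCK_BITS + INDEX_BITS)
    return tag, idx, bo

tag, idx, bo = cache_fields(0x42ABC)
print(hex(tag), hex(idx), hex(bo))  # -> 0x42 0x55 0x1c
```

Because index + block offset fit within the page offset here, the index is known before translation finishes; the next slides explore what changes when a larger cache pushes index bits up into the virtual page number.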
Sequential TLB Access
- Address translation before cache lookup
[Figure: for both a small cache and a large cache, the TLB is accessed
first (VPN | PO → PFN | PO); only then is the physical address split
into tag, index, and block offset to read and compare the cache tags]
Problems: slow; may increase cycle time, CPI, or
pipeline depth
Parallel TLB Access
- Address translation in parallel with cache lookup
[Figure: small cache -- the index and block offset come entirely from
the page offset, so the cache tags are read while the TLB translates
the VPN; the resulting PFN is then compared against the stored tags]
Parallel TLB Access
- Address translation in parallel with cache lookup
- Index taken from virtual page number
- Could cause problems with synonyms
[Figure: large cache -- some index bits come from the virtual page
number, so the tags are read using a partly virtual index while the
TLB translates; the PFN is then compared against the stored tags]
Virtual Address Synonyms
[Figure: two virtual addresses V0 and V1 in the virtual address space
map to the same physical page P0; with a virtual index, V0 and V1 can
select different cache entries, leaving two copies of the same data]
Solutions to Synonyms
- (1) Limit cache size to page size times associativity
- Extract index from page offset
- (2) Search all sets in parallel
- e.g., 64 KB 4-way cache with 4 KB pages
- Search 4 sets (16 entries) in parallel
- (3) Restrict page placement in the operating system
- Guarantee that Index(VA) = Index(PA)
- (4) Eliminate by operating system convention
- Single virtual address space
- Restrictive sharing model
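Solution (1) has a one-line arithmetic condition: the index can come entirely from the page offset only when cache size ≤ page size × associativity. A sketch (function name is ours):

```python
# Can the cache index be extracted from the page offset alone?
# True iff index bits + block-offset bits fit in the page-offset bits,
# i.e. cache_bytes <= page_bytes * associativity.
def index_from_page_offset(cache_bytes, page_bytes, assoc):
    return cache_bytes <= page_bytes * assoc

print(index_from_page_offset(16 * 1024, 4 * 1024, 4))  # True: 16 KB 4-way
print(index_from_page_offset(64 * 1024, 4 * 1024, 4))  # False: the 64 KB case
```

The 64 KB 4-way example above fails the test, which is exactly why solution (2) must search 4 sets (16 entries) in parallel instead.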
Virtual Address Cache
[Figure: the cache is indexed and tagged with virtual addresses; the
TLB sits after the cache and is needed on misses only]
- Address translation after cache miss
- Implies fast lookup even for large caches
- Must handle
- Virtual-address synonyms (aliases)
- Virtual-address space changes
- Status and protection bit changes
Protection
- Goal
- One process should not be able to interfere with
the execution of another
- Process model
- Privileged kernel
- Independent user processes
- Primitives vs. Policy
- Architecture provides the primitives
- Operating system implements the policy
- Problems arise when hardware implements policy
Protection Primitives
- User vs. kernel
- At least one privileged mode
- Usually implemented as mode bit(s)
- How do we switch to kernel mode?
- Change mode and continue execution at a
predetermined location
- Hardware to compare mode bits to access rights
- Access certain resources only in kernel mode
Protection Primitives, cont.
- Base and bounds
- Privileged registers
- Base ≤ Address ≤ Bounds
- Page-level protection
- Protection bits in page table entry
- Cache them in TLB
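The base-and-bounds primitive above reduces to one comparison per access; the hardware performs the check while the privileged registers are set only by the OS. A sketch (names are ours):

```python
# Base-and-bounds protection: an access is legal only if the address
# falls between the two privileged registers, inclusive.
def access_ok(addr, base, bounds):
    return base <= addr <= bounds

print(access_ok(0x1500, base=0x1000, bounds=0x1FFF))  # True: in range
print(access_ok(0x2500, base=0x1000, bounds=0x1FFF))  # False: fault
```

This illustrates the primitives-vs-policy split from the previous slide: the comparison is the architectural primitive; which base/bounds values each process gets is OS policy.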
Summary: Memory Hierarchy Design
- Caches
- Main Memory
- Virtual Memory