Title: Computer Systems Organization & Architecture
1Computer Systems Organization
Architecture Chapter 1 Part 7 Memory
2Memory
- A group of circuits used to store data
- Not strictly combinatorial, but can be used in
combinatorial circuits
3Memory
- Has some number of memory locations
- Number may vary between chips
- But is fixed within a chip
- Each stores a binary value of some fixed length
- Size of chip = number of locations times the number of bits per location
- A 512 x 8 chip has 512 memory locations, each of 8 bits
4Memory
- Address inputs of a memory chip
- Designates one of its locations
- A chip with 2^n memory locations requires n address inputs
- Label the inputs A(n-1), A(n-2), ..., A0
- A 512 x 8 memory has address lines A8, A7, A6, ..., A0 since there are 512 = 2^9 memory locations
5Memory
- Data pins of a memory chip
- Used to access the data
- A chip with m bits per location requires m data pins
- Label the pins D(m-1), D(m-2), ..., D0
- A 512 x 8 memory has data lines D7, D6, ..., D0 since there are 8 bits per location
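The pin counts above follow directly from the chip's dimensions. A minimal sketch, assuming the number of locations is a power of two; the function name `chip_pins` is made up for illustration:

```python
def chip_pins(locations, bits_per_location):
    """Return address-pin count, data-pin count and total bits for a chip."""
    if locations <= 0 or locations & (locations - 1):
        raise ValueError("locations must be a power of two")
    address_pins = locations.bit_length() - 1  # n such that 2**n == locations
    return {
        "address_pins": address_pins,          # A(n-1) ... A0
        "data_pins": bits_per_location,        # D(m-1) ... D0
        "total_bits": locations * bits_per_location,
    }

print(chip_pins(512, 8))
# {'address_pins': 9, 'data_pins': 8, 'total_bits': 4096}
```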
6Memory Chips (a) ROM and (b) RAM
7Memory
- Other pins: enable or chip select
- Chip enable signal (CE) enables or disables the entire chip
- When disabled, data pins output high impedance (Z).
- CE may be active high or active low
- A system may have more than one memory chip, so it might need to select a specific chip
8Memory
- Other pins: output enable
- Chip output enable signal (OE) enables or disables the output of the chip
- Both OE and CE must be active for a ROM chip to output data (see later slide for definition of ROM)
- R/W = 1, RAM inputs data.
- R/W = 0, RAM outputs data.
9Memory
- Classes of Memory Chips
- Read Only Memory (ROM)
- Once programmed, cannot be changed
- Used as lookup tables to implement functions
- Used in PCs to store basic input/output systems (BIOS)
- Nonvolatile: keeps value when power is removed
10Memory
- Classes of Memory Chips
- Random Access Memory (RAM)
- Initially contains no data
- Digital circuit can retrieve and store data at various locations on a RAM chip
- Data pins are bidirectional (into or out of the chip)
- Volatile: loses data when power is removed
11Memory
- Reading a memory
- place address code of the desired cell on the address pins of the RAM/ROM
- internal circuit must read address and enable the correct cell
- places contents of the cell on the output lines
12ROM
- true ROM
- programmed during fabrication
- programmable ROMs (PROMs)
- can be programmed by the user, but only once
- erasable-programmable ROMs (EPROMs)
- can be programmed, erased, and reprogrammed many times
- other types exist
13ROM
- true ROM
- programmed during fabrication
- manufacturer takes a ROM mask
- modifies it to store desired values
- fabricates
- initial cost high, production relatively cheap
- economical when ROM will be used in large
quantities
14ROM
- Programmable ROMs (PROMs)
- example: 2764 8K x 8 PROM, 8-bit cells, 13 address lines. Initialized with 1s in all cells.
- each bit is a fuse, a conductor that can be melted by larger than normal current.
- blowing the fuse changes the stored value to 0
- commercially available PROM programmers can be used to blow selected fuses
- once blown, a fuse cannot be restored.
- data lasts forever
15ROM
- Erasable PROMs (EPROMs)
- Can be programmed by user.
- Data is stored as electric charges, not fuses.
- Can erase data using ultraviolet light (UVPROMs)
- or electric impulses (EEPROMs)
- used for developing and debugging new circuits
- Can be reprogrammed thousands of times before it deteriorates.
- Can hold data for 10 years or more.
- relatively expensive
16ROM use
- Can replace a combinational circuit with a ROM.
- Create a truth table.
- inputs become the address in the ROM
- output becomes the value stored in that cell.
17ROM use
- Implementing 2-bit multiplication
- A1A0 x B1B0
- Number of cells: 16 different combinations of possible multiplicands, so need 16 cells
- Largest result: 11 x 11 = 1001 (binary), so need 4 bits in each cell.
- Require a 16 x 4 ROM.
- Individual cells will store the result of multiplying the first two bits of their address with the last two bits.
- Fast, but large. Multiplying two 16-bit numbers would require 2^32 cells, each of 32 bits.
- Modern computers do multiplication in combinational circuits
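The 16 x 4 multiplier ROM can be sketched as a lookup table. This is an illustration only: it assumes A1A0 forms the high two address bits and B1B0 the low two, and the names `build_multiplier_rom` and `multiply` are hypothetical.

```python
def build_multiplier_rom():
    """Fill a 16-entry table: the address is A1 A0 B1 B0, the cell holds A x B."""
    rom = [0] * 16
    for address in range(16):
        a = (address >> 2) & 0b11   # first two address bits: A1A0
        b = address & 0b11          # last two address bits: B1B0
        rom[address] = a * b        # largest value is 3 x 3 = 9, fits in 4 bits
    return rom

ROM = build_multiplier_rom()

def multiply(a, b):
    """Multiply two 2-bit numbers by reading the ROM instead of computing."""
    return ROM[(a << 2) | b]

print(multiply(3, 3))  # 9, i.e. 1001 in binary
```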
18ROM vs. PLA
- ROM is fully decoded
- contains a full output word for every possible input combination
- A PLA is only partially decoded.
- only contains entries for specific inputs
- PLA smaller
- ROM entries grow exponentially with the number of inputs, but the number of product terms grows much more slowly
- but to add a new entry, must change size
- ROM easier to change
- normally not all entries are used, but all are present
- to expand, just fill in the entry
19RAM
- static RAM (SRAM).
- based on latches: value is stored on a pair of inverting gates
- Takes 4-6 transistors
- keeps stored data as long as the power is on.
- dynamic RAM (DRAM)
- based on capacitors formed by transistors (1 transistor also used)
- the capacitors represent data as stored charge
- capacitors slowly lose charge: will lose it in milliseconds even if power is on.
- capacitors must be periodically rewritten or refreshed
20RAM
- static RAM (SRAM).
- cells take more space, so cannot be made as dense as DRAM
- result: SRAM is more expensive than DRAM
- used where high speed is necessary (e.g., cache)
- dynamic RAM (DRAM)
- DRAM needs to be refreshed after several milliseconds
- results in longer cycle time, the minimum time between consecutive memory accesses
- simpler internally
- normally DRAM is a generation ahead of SRAM in terms of size
21SRAM
- SRAM is an IC that has a memory array
- Normally a single access port that provides a read or write
- has fixed access time to any data
- write and read access characteristics often differ
- has a specific configuration
22SRAM
- SRAM is an IC that has a memory array
- example 4M x 8
- 4M entries (called the height)
- each entry has 8 bits (called the width)
- 22 address lines (4M = 2^22)
- 8-bit input line
- 8-bit output line
- For technical reasons most SRAM are x 1 and x 4.
23SRAM
Chip Select determines if this chip is used.
[Figure: SRAM 2M x 16 block diagram. Inputs: Address (21 bits), Chip select, Output enable, Write enable, Din15-0 (16 bits). Output: Dout15-0 (16 bits).]
24SRAM
- SRAM read access time
- specified as the delay from the time that Output enable is true and address values are valid until the time that the data is on the output lines
- read access time for SRAM (CMOS, 2004): 2-4 ns for best (and narrowest)
- read access time: 8-20 ns for typical largest parts (32 million bits of data).
- read access time for low-power SRAM: 5-10 times slower
- but much lower power consumption
25SRAM
- SRAM write access time
- specified as the delay from the time that Write enable is true, address values are valid, and data in is valid until the time that the value is written into the cell
- there are set-up time and hold-time requirements for the address and data lines.
- the Write enable signal is not a clock edge, rather a pulse with a minimum width requirement.
26SRAM
- SRAM design
- cannot build the same as a register file
- would need too large a multiplexor: 64K-to-1 for a 64K x 1 SRAM.
- Instead use a shared output line called a bit line
- use tri-state buffers to determine which cell drives the bit line.
- Put the tri-state buffers in each cell of the SRAM.
- More efficient than using a large centralized multiplexor
- See next slide
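The shared bit line can be modelled in a few lines. This is a behavioural sketch, not a circuit description: `None` stands in for the high-impedance state, and the names `cell_output` and `read_bit_line` are invented for the example.

```python
HIGH_Z = None  # stands in for the high-impedance (Z) state

def cell_output(stored_bit, enabled):
    """Tri-state buffer inside one cell: drive the bit only when enabled."""
    return stored_bit if enabled else HIGH_Z

def read_bit_line(cells, address):
    """Decode the address into one-hot word lines and resolve the bit line."""
    word_lines = [i == address for i in range(len(cells))]
    outputs = [cell_output(bit, en) for bit, en in zip(cells, word_lines)]
    drivers = [out for out in outputs if out is not HIGH_Z]
    if len(drivers) != 1:
        raise ValueError("exactly one cell must drive the bit line")
    return drivers[0]

cells = [1, 0, 1, 1]            # a 4 x 1 memory
print(read_bit_line(cells, 1))  # 0
```

No multiplexor appears in the model: every cell is wired to the same line, and the decoder guarantees that only one of them drives it.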
274X2 SRAM
Uses D latches. Enable controls a tri-state gate.
282D Memory Organization
- SRAM design
- To further minimize the size of decoder/multiplexor, we can split the address into two parts
- One part will select the row
- The second part will select the appropriate column
- This is called a 2D Memory Organization
292D Memory organization
The leftmost 12 bits select the row. The rightmost 10 bits select the column.
30SRAM
- SRAM design
- Still need a large decoder and many word lines (to enable the individual flip-flops)
- Decoder requires at least one AND gate for each output line.
- 2^n AND gates in an n-bit decoder
- in a 4M x 8 SRAM need a 22-to-4M decoder and 4M word lines.
- Also has a high pin count
- 2D organization requires n + w + 4 pins to implement a 2^n x w memory
- n pins for the address, w lines for the data, 4 lines for control, voltage, and ground
31SRAM
- SRAM design: how many pins to construct a 1024 x 8 memory using a 2D organization?
- 22 pins total
- 10 address pins (1024 = 2^10),
- 8 data pins,
- a CS pin, a R/W pin,
- a ground pin, and a common voltage pin.
32SRAM
- Relationship between the costs for implementing memories with 4 Kbits (4096 bits).
- The more square the memory, the lower the cost of the decoder but the greater the number of pins.
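The trade-off can be tabulated from the two formulas given earlier (2^n AND gates in an n-bit decoder, n + w + 4 pins for a 2^n x w memory). The sketch below is a reconstruction under those formulas, not the original chart; `costs` and `all_shapes` are illustrative names.

```python
TOTAL_BITS = 4096  # 4 Kbits

def costs(height, width):
    """Decoder gates and pin count for a height x width memory (2D organization)."""
    n = height.bit_length() - 1       # address bits, height == 2**n
    return {
        "shape": f"{height} x {width}",
        "decoder_gates": 2 ** n,      # one AND gate per decoder output
        "pins": n + width + 4,        # address + data + 4 control/power pins
    }

def all_shapes(total_bits=TOTAL_BITS):
    """Power-of-two shapes from the square 64 x 64 up to 4096 x 1."""
    shapes = []
    height = 64
    while height <= total_bits:
        shapes.append(costs(height, total_bits // height))
        height *= 2
    return shapes

for row in all_shapes():
    print(row)
```

Under these assumptions the square 64 x 64 shape needs the fewest decoder gates but the most pins, and 4096 x 1 is the reverse.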
33SRAM
- Goal
- Minimize the number of decoder gates.
- Minimize the total number of pins required.
- Implementation
- Want a square memory to minimize number of decoder gates
- Want a large height/small width to minimize the number of data pins (like a 4096 X 1 size in the chart).
34SRAM
- Solution: a 2 1/2 D organization
- Use square memory
- But make the memory appear to be rectangular (not square) to minimize the number of pins.
- See next two slides
352 1/2 D Memory Organization
4096 X 1 memory. Uses a square 64 X 64 memory array. Logical word size is 1 bit (i.e., size of word that is output).
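One way to picture the 2 1/2 D organization: the 12-bit address of the 4096 X 1 memory is split into a 6-bit row and a 6-bit column of the 64 X 64 array. The sketch assumes the high-order bits select the row, which the slide does not state; `locate` and `read_bit` are hypothetical names.

```python
ROW_BITS = 6   # 64 rows
COL_BITS = 6   # 64 columns

def locate(address):
    """Map a 12-bit logical address to (row, column) in the 64 x 64 array."""
    if not 0 <= address < 2 ** (ROW_BITS + COL_BITS):
        raise ValueError("address out of range for a 4096 x 1 memory")
    row = address >> COL_BITS                 # assumed: high-order 6 bits
    column = address & ((1 << COL_BITS) - 1)  # assumed: low-order 6 bits
    return row, column

def read_bit(array, address):
    """Return the single output bit stored at the logical address."""
    row, column = locate(address)
    return array[row][column]

print(locate(65))  # (1, 1)
```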
36Memory Organization
- Build larger memories from these simple 64K x 1 chips
- Also build 8-bit memory cells from combinations of these chips.
- Example: build a 128K x 8 memory from 16 64K x 1 chips
- Both hold 2^20 bits
- Would use two banks of memory, each of which would provide 64K x 8 memory
- Each bank would contain 8 64K x 1 chips.
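The chip count in this example can be checked with a short calculation. It assumes the building block is a 64K x 1 chip, which is what the 16-chip, two-bank arithmetic implies; `chips_needed` is an illustrative name.

```python
K = 1024

def chips_needed(target_locations, target_width, chip_locations, chip_width):
    """Return (banks, chips per bank, total chips) to build the target memory."""
    if target_locations % chip_locations or target_width % chip_width:
        raise ValueError("target must be a whole multiple of the chip size")
    banks = target_locations // chip_locations   # stacked to add locations
    chips_per_bank = target_width // chip_width  # side by side to add bits
    return banks, chips_per_bank, banks * chips_per_bank

print(chips_needed(128 * K, 8, 64 * K, 1))  # (2, 8, 16)
```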
37Memory Organization
- 2 1/2 D organization does not work well for large memories
- To put 4 Mb on a chip need 27 pins
- 22 address, 1 data, 2 control, 2 common (voltage and ground)
- Must also multiplex the address lines.
- Use a row address and a column address: can cut number of address pins in half
- Will see this later
38Multiplexing Address
- 16 Mbit DRAM configured as a 2M x 8 chip.
- Cells are 4K x 4K array
- The 4096 cells in each row are divided into 512 groups of 8.
- Row stores 512 bytes of data.
- 21 bits in address
- 12 address bits to select row
- use high-order 12 bits for this
- 9 bits to specify a group of 8 bits in the selected row
- use low-order 9 bits of address
- these are the column address
- Continued
39Multiplexing Address
- Read or Write operation
- Apply row address on the 12 input pins first
- loaded into the row address latch
- controlled by the RAS signal
- then initiate a read operation
- all cells on the selected row are read and refreshed.
- next apply column address on the address pins
- load into the column address latch
- Controlled by the CAS signal
- selects the appropriate group of 8 Sense/Write circuits.
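The row/column sequence can be sketched for the 2M x 8 chip described here: the high-order 12 bits of the 21-bit address go to the row latch, the low-order 9 bits to the column latch. The function names and the `array` structure (a mapping from row number to 512 bytes) are assumptions for illustration.

```python
ROW_BITS = 12   # selects one of 4096 rows
COL_BITS = 9    # selects one of 512 groups of 8 bits

def split_address(address):
    """Split a 21-bit address into (row address, column address)."""
    if not 0 <= address < 2 ** (ROW_BITS + COL_BITS):
        raise ValueError("address must fit in 21 bits")
    row = address >> COL_BITS
    column = address & ((1 << COL_BITS) - 1)
    return row, column

def read_byte(array, address):
    """Model a read: latch the row (RAS), read the row, latch the column (CAS)."""
    row, column = split_address(address)
    row_latch = row               # loaded when RAS is asserted
    row_data = array[row_latch]   # the whole 512-byte row is read
    column_latch = column         # loaded when CAS is asserted
    return row_data[column_latch]

print(split_address((5 << 9) | 3))  # (5, 3)
```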
40DRAM
- DRAM value is stored as a charge on a capacitor.
- A single transistor is used to access the stored charge.
- Capacitors lose charge after several milliseconds
- To keep a charge, must refresh the capacitor
- Single-chip memory controllers handle the refresh function independent of the CPU.
- SRAM stores value on a pair of inverting gates
- 4-6 transistors.
- DRAM cells are smaller, thus DRAMs are denser
41DRAM
- Refreshing
- Use a two-level decoding structure.
- Allows us to refresh an entire row at a time
- Use a read cycle followed immediately by a write cycle
- Refresh typically consumes 1% to 2% of the active cycles of the DRAM.
42DRAM
43DRAM 2Mx8
The Sense/write logic senses the outputs of the selected memory cells during the first phase of a memory access and sets the values of the selected memory cell in the second phase.
The leftmost 12 bits select the row. The rightmost 9 bits select the column.
4K x 4K memory array: 4096 rows, 512 groups of columns. Each group stores 8 bits.
44DRAM 2Mx8
- Can multiplex the address lines on 12 pins
- During a read or write, row address applied first
- the Row Address Strobe is input
- Causes the row address to be loaded into the row address latch
45DRAM 2Mx8
- next a read operation is initiated.
- all cells on the selected row are read and
refreshed.
46DRAM 2Mx8
- Now put column address on input pins
- Load into the column address latch when CAS is asserted.
- Appropriate group of 8 sense/write circuits are selected
- If the R/W indicates READ, transfer values to output lines
- If WRITE, information on D7-0 transferred to sense/write circuit and used to overwrite the appropriate row values
47Memory Banks
- Problem: CPU needs to manipulate different types of data
- character might be 8 bits
- integer might be 16 or 32 bits
- float might be 64 bits
- Memory must accommodate this need
- CPU must be allowed to store different sized data
48Memory Banks
- Solution: memory banks
- Modern memory provides access to byte (8-bit), halfword (16-bit) and word (32-bit).
- Some provide access to 64 or 128-bit values
- Example: memory might allow access to 8-bit and 16-bit values
- need size signal
- need two banks of memory
- each bank acts as an independent 2^(n-1) x 8 memory
- to access a 16-bit value must access an 8-bit value from each bank
49Memory Banks
50Memory Banks
- Addresses
- global addresses
- range from 0 to 2^n - 1
- two sets of relative addresses (one for each bank)
- for each bank, range from 0 to 2^(n-1) - 1
- Converting a global address to a local address
- least significant bit of the global address identifies the bank
- most significant n - 1 bits specify the relative address
- see next slide
51Memory Banks
Bank 0
Bank 1
52Memory Banks
Bank 0
Bank 1
53Memory Banks
A 16-bit value stored in global address 10 would
have 8 bits in bank 0, address 5 (101) and 8 bits
in bank 1, address 5 (101)
Bank 0
Bank 1
54Memory Banks
A 16-bit value stored in global address 9 would have 8 bits in bank 1, address 4 (100) and 8 bits in bank 0, address 5 (101)
Bank 0
Bank 1
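The global-to-relative conversion used in these examples can be written out as follows. A minimal sketch for the two-bank case; `to_bank` and `locate_halfword` are names chosen for illustration.

```python
def to_bank(global_address):
    """Return (bank, relative address) for a two-bank interleaved memory."""
    bank = global_address & 1       # least significant bit picks the bank
    relative = global_address >> 1  # remaining n - 1 bits
    return bank, relative

def locate_halfword(global_address):
    """A 16-bit value occupies two consecutive global addresses."""
    return [to_bank(global_address), to_bank(global_address + 1)]

print(locate_halfword(10))  # [(0, 5), (1, 5)]
print(locate_halfword(9))   # [(1, 4), (0, 5)]
```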
55Memory Banks
- Addresses
- this memory is called interleaved
- global memory addresses are interleaved between two banks
- could instead allocate addresses 0 through 2^(n-1) - 1 to bank 0 and 2^(n-1) to 2^n - 1 to bank 1
- this makes it difficult to store/access a 16-bit word. Would have to store it in two adjacent memory locations in one bank
- must then use two memory cycles to load/store a word.
56Memory Banks
- Addresses
- Suppose you access an 8-bit value
- would the value be put on lines D0 to D7 or D8 to D15?
- If the memory has a justified port, will always place an 8-bit value on the same lines (e.g. D0 to D7)
- Suppose you access a 16-bit value
- if the value begins on an even address, easy
- if value begins on odd address, complicated
- must cross data lines
- must add one to get the second (even) part of the address
- Memory systems often allow 16-bit values to be stored only on an even address. Called an aligned system.
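The extra work for an odd starting address can be made explicit. The sketch below assumes the two-bank interleaved memory from the earlier slides, with bank 0 normally driving D0 to D7 and bank 1 driving D8 to D15; that lane assignment and the name `plan_halfword_access` are assumptions, not taken from the slides.

```python
def plan_halfword_access(address):
    """Describe what each bank must do for a 16-bit access at this address."""
    aligned = address % 2 == 0
    base = address >> 1   # relative address of the byte at `address`
    return {
        "aligned": aligned,
        # An odd start puts the second byte in bank 0 at the next relative address.
        "bank0_relative": base if aligned else base + 1,
        "bank1_relative": base,
        # An odd start also means the two bytes arrive on swapped data lines.
        "cross_data_lines": not aligned,
    }

print(plan_halfword_access(10))
print(plan_halfword_access(9))
```

An aligned system avoids both complications by rejecting the odd case outright.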
57Synchronous DRAM (SDRAM)
- SDRAM transfers a burst of data from a series of sequential addresses within an array or row.
- Burst is defined by the starting address and a burst length.
- Advantage: don't have to specify additional address bits.
- Instead use a clock to transfer the successive bits in the burst.
- Much faster block transfer times.
58Synchronous DRAM (SDRAM)
- Example: a 64M x 4 DRAM actually accesses 8K bits on every row access
- Then throws away all but 4 of those bits during a column access.
- Instead allow the column address to change without changing the row address
- Thus just access the other bits in the column latches
- Can use a clock signal and a column counter to transfer successive addresses
- Need a special mode register to determine whether/how many consecutive cells to access
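A burst transfer can be sketched as a column counter stepping through the already-latched row, one entry per clock. This is a simplified model: the wrap-around at the end of the row is an assumption, and `burst_read` is a hypothetical name.

```python
def burst_read(row_latch_data, start_column, burst_length):
    """Return burst_length consecutive entries starting at start_column."""
    column_counter = start_column
    transferred = []
    for _ in range(burst_length):   # one transfer per clock tick
        transferred.append(row_latch_data[column_counter])
        # Assumed behaviour: the counter wraps at the end of the row.
        column_counter = (column_counter + 1) % len(row_latch_data)
    return transferred

row = list(range(16))           # contents of the column latches after a row access
print(burst_read(row, 14, 4))   # [14, 15, 0, 1]
```

Only the starting column and the burst length are supplied; the row address is not sent again.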
59SDRAM
60Double Data Rate Synchronous DRAM (DDR SDRAM)
- Standard SDRAM performs all actions on the rising edge of the clock signal.
- DDR SDRAM transfers data on both edges of the clock.
- Doubles bandwidth for long burst transfers.
- Must organize memory into two data banks
- Each bank accessed separately
- Consecutive words of a given block are stored in different banks (called interleaving).
61Rambus memory