Title: FSM, Cache memory
1. FSM, Cache memory
- Prof. Sin-Min Lee
- Department of Computer Science
CS147 Lecture 12
2. The Five Classic Components of a Computer
3. The Processor Picture
4. Processor/Memory Bus
- PCI Bus
- I/O Buses
5. Using T Flip-Flops and JK Flip-Flops
- log2(4) = 2, so 2 flip-flops are needed to implement this FSA
6. Step 1 - Translate the diagram into a state table
7. Step 2 - Create maps for T and JK
8. Step 3 - Determine the T, J, and K equations
9. Step 4 - Draw the resulting diagram (a worked sketch of Steps 1-3 follows below)
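The state diagram and tables themselves are images in the original slides, so here is a minimal sketch of Steps 1-3 in Python, assuming a simple 4-state up-counter encoded on two flip-flops Q1 Q0, with Q1 driven by a T flip-flop and Q0 by a JK flip-flop (the transition table and all names are illustrative, not the lecture's actual FSA):

    # Step 1: state table (current state -> next state) for an assumed
    # 4-state up-counter; the real FSA diagram is an image in the slides.
    state_table = {
        (0, 0): (0, 1),
        (0, 1): (1, 0),
        (1, 0): (1, 1),
        (1, 1): (0, 0),
    }

    def t_input(q, q_next):
        # T flip-flop excitation: T = Q XOR Q+ (toggle when the bit changes)
        return q ^ q_next

    def jk_inputs(q, q_next):
        # JK excitation table; None marks a don't-care entry
        if q == 0:
            return (q_next, None)      # 0->0: J=0,K=X ; 0->1: J=1,K=X
        return (None, 1 - q_next)      # 1->1: J=X,K=0 ; 1->0: J=X,K=1

    # Steps 2-3: tabulate the required T1 and (J0, K0) per state; the
    # Boolean equations are then read off Karnaugh maps built from this.
    for (q1, q0), (n1, n0) in sorted(state_table.items()):
        print(f"Q1Q0={q1}{q0} -> {n1}{n0}: T1={t_input(q1, n1)}, "
              f"(J0, K0)={jk_inputs(q0, n0)}")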
10. Implementing an FSM with No Inputs Using D, T, and JK Flip-Flops
- Convert the diagram into a chart (see the characteristic-equation sketch below)
11. Implementing an FSM with No Inputs Using D, T, and JK Flip-Flops (cont.)
12. Implementing an FSM with No Inputs Using D, T, and JK Flip-Flops (cont.)
13. Implementing an FSM with No Inputs Using D, T, and JK Flip-Flops (cont.)
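What such a chart encodes are the characteristic equations of the three flip-flop types. As a hedged sketch (not the lecture's actual chart), they can be written directly in Python:

    # Characteristic equations: next state Q+ from present state Q and inputs.
    def d_next(q, d):
        return d                                 # D:  Q+ = D

    def t_next(q, t):
        return q ^ t                             # T:  Q+ = Q XOR T

    def jk_next(q, j, k):
        return (j & (1 - q)) | ((1 - k) & q)     # JK: Q+ = J*Q' + K'*Q

    print(d_next(0, 1), jk_next(0, 1, 0))        # 1 1

    # With no inputs the FSM advances on every clock; e.g. a T flip-flop
    # with T tied to 1 toggles each cycle (a divide-by-two counter):
    q = 0
    for _ in range(4):
        q = t_next(q, 1)
        print(q)                                 # prints 1 0 1 0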
14. von Neumann Architecture (Princeton)
- Diagram: the Program Counter (PC <- PC + 1) supplies an address pointer into Memory, which holds both Data and Instructions feeding the Arithmetic Logic Unit (ALU)
- Featuring deterministic execution (a fetch-cycle sketch follows below)
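A minimal sketch of the fetch loop the diagram depicts, with made-up memory contents (the point is only that one memory holds the instructions, the PC supplies the address, and PC <- PC + 1 on each step):

    # Toy von Neumann fetch cycle; instruction names are illustrative.
    memory = {0: "LOAD", 1: "ADD", 2: "STORE", 3: "HALT"}
    pc = 0
    while True:
        instruction = memory[pc]   # address pointer into the single memory
        pc = pc + 1                # PC <- PC + 1: deterministic sequencing
        print(pc - 1, instruction)
        if instruction == "HALT":
            break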
19. Cache Memory
- Physical memory is slow (more than 30 times slower than the processor)
- Cache memory uses SRAM chips: much faster, but much more expensive
- Situated closest to the processor
- Can be arranged hierarchically: L1 cache is incorporated into the processor, L2 cache is outside
20. Cache Memory
- This photo shows Level 2 cache memory on the processor board, beside the CPU
21. Cache Memory - Three-Level Architecture
- Diagram: a CPU with a 2 Gigahertz clock and cache control logic, backed by three cache levels and main memory; the address pointer goes to the caches first
- L1 Cache Memory: 32 Kilobytes (2X access time)
- L2 Cache Memory: 128 Kilobytes (8X)
- L3 Cache Memory: 16 Megabytes (16X)
- Memory: multi-gigabytes, large and slow (160X)
- Featuring really non-deterministic execution
23. Cache (1)
- The cache is the first level of the memory hierarchy encountered once the address leaves the CPU
- Since the principle of locality applies, and taking advantage of locality to improve performance is so popular, the term cache is now applied whenever buffering is employed to reuse commonly occurring items
- We will study caches by trying to answer the four questions for the first level of the memory hierarchy
29. Cache (2)
- Every address reference goes first to the cache
- If the desired address is not there, we have a cache miss: the contents are fetched from main memory into the indicated CPU register, and the content is also saved into the cache memory
- If the desired data is in the cache, we have a cache hit: the desired data is brought from the cache at very high speed (low access time)
- Most software exhibits temporal locality of access, meaning that it is likely that the same address will be used again soon, and if so, the address will be found in the cache
- Transfers between main memory and cache occur at the granularity of cache lines or cache blocks, around 32 or 64 bytes (rather than bytes or processor words). Burst transfers of this kind receive hardware support and exploit spatial locality of access (future accesses are often to addresses near the previous one); a toy model follows below
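A toy model of this flow, assuming 32-byte cache lines (all sizes and structures here are illustrative, not from the slides):

    LINE_SIZE = 32                 # cache line granularity (assumed)
    cache = {}                     # line address -> line contents
    main_memory = bytearray(1024)  # stand-in for slow physical memory

    def load_byte(addr):
        line_addr = addr // LINE_SIZE
        if line_addr not in cache:
            # Cache miss: burst-fetch the whole line (spatial locality)
            start = line_addr * LINE_SIZE
            cache[line_addr] = bytes(main_memory[start:start + LINE_SIZE])
        # Cache hit path: serve the byte from the cached line
        return cache[line_addr][addr % LINE_SIZE]

    load_byte(100)   # miss: fetches the line covering bytes 96..127
    load_byte(101)   # hit: spatial locality pays off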
30. Where Can a Block Be Placed in the Cache? (1)
- Our cache has eight block frames and the main memory has 32 blocks
31. Where Can a Block Be Placed in the Cache? (2)
- Direct mapped cache
- Each block has only one place where it can appear in the cache
- (Block address) MOD (Number of blocks in cache)
- Fully associative cache
- A block can be placed anywhere in the cache
- Set associative cache
- A block can be placed in a restricted set of places in the cache
- A set is a group of blocks in the cache
- (Block address) MOD (Number of sets in cache)
- If there are n blocks in a set, the placement is said to be n-way set associative (a small placement sketch follows below)
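Using the slides' configuration (8 block frames, 32 memory blocks), the placement rules can be sketched as below; block address 12 is chosen only for illustration:

    NUM_FRAMES = 8                 # block frames in the cache

    def direct_mapped_frame(block_addr):
        return block_addr % NUM_FRAMES        # exactly one possible frame

    def set_associative_set(block_addr, num_sets):
        return block_addr % num_sets          # any way within that set

    print(direct_mapped_frame(12))            # 12 MOD 8 = frame 4
    print(set_associative_set(12, 4))         # 2-way: 4 sets, 12 MOD 4 = set 0
    # Fully associative: block 12 may go into any of the 8 frames.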
32. How Is a Block Found in the Cache?
- Caches have an address tag on each block frame that gives the block address. The tag is checked against the address coming from the CPU
- All tags are searched in parallel since speed is critical
- A valid bit is appended to every tag to say whether the entry contains a valid address or not
- Address fields: the block address is divided into a tag (compared against for a hit) and an index (selects the set); the block offset selects the desired data from the block
- Set associative cache: a larger index means more sets with fewer blocks per set; with a smaller index, the associativity increases
- In a fully associative cache the index field does not exist (a field-splitting sketch follows below)
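A sketch of splitting an address into its fields, assuming for illustration a cache with 64-byte blocks and 128 sets, so 6 offset bits and 7 index bits (none of these sizes come from the slides):

    OFFSET_BITS = 6    # log2(64-byte block)
    INDEX_BITS = 7     # log2(128 sets)

    def split_address(addr):
        offset = addr & ((1 << OFFSET_BITS) - 1)                 # data in block
        index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)  # selects the set
        tag = addr >> (OFFSET_BITS + INDEX_BITS)                 # compared for a hit
        return tag, index, offset

    print(split_address(0x12345678))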
34. Which Block Should Be Replaced on a Cache Miss?
- When a miss occurs, the cache controller must select a block to be replaced with the desired data
- A benefit of direct mapping is that the hardware decision is much simplified
- Two primary strategies for fully and set associative caches (sketched below):
- Random: candidate blocks are randomly selected. Some systems generate pseudo-random block numbers to get reproducible behavior, which is useful for debugging
- LRU (Least Recently Used): to reduce the chance of discarding information that will soon be needed again, the block replaced is the least recently used one. Accesses to blocks are recorded to be able to implement LRU
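A small sketch of the two victim-selection strategies; seeding the generator gives the pseudo-random, reproducible behavior mentioned above (function names are illustrative):

    import random

    def choose_victim_random(num_ways, seed=None):
        # A fixed seed makes the "random" choice reproducible for debugging.
        return random.Random(seed).randrange(num_ways)

    def choose_victim_lru(last_used_times):
        # Evict the way whose recorded access time is the oldest.
        return min(range(len(last_used_times)),
                   key=lambda way: last_used_times[way])

    print(choose_victim_random(4, seed=42))
    print(choose_victim_lru([10, 3, 7, 8]))   # way 1 is least recently used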
35. What Happens on a Write?
- Two basic options when writing to the cache (sketched below):
- Write through: the information is written to both the block in the cache and the block in the lower-level memory
- Write back: the information is written only to the block in the cache. The modified block of cache is written back into the lower-level memory only when it is replaced
- To reduce the frequency of writing back blocks on replacement, an implementation feature called the dirty bit is commonly used. This bit indicates whether a block is dirty (has been modified since loaded) or clean (not modified). If clean, no write back is involved
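A sketch of the two write policies and the dirty bit (the class and function names are illustrative):

    class Block:
        def __init__(self):
            self.data = 0
            self.dirty = False     # dirty bit: modified since being loaded?

    def write_through(block, memory, addr, value):
        block.data = value
        memory[addr] = value       # cache and lower-level memory both updated

    def write_back(block, addr, value):
        block.data = value
        block.dirty = True         # lower level is updated only on eviction

    def evict(block, memory, addr):
        if block.dirty:            # a clean block needs no write back
            memory[addr] = block.data

    memory = {0: 0}
    b = Block()
    write_back(b, 0, 42)
    evict(b, memory, 0)
    print(memory[0])               # 42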
38. There are three methods of block placement:
- Direct mapped: if each block has only one place it can appear in the cache, the cache is said to be direct mapped. The mapping is usually (Block address) MOD (Number of blocks in cache)
- Fully associative: if a block can be placed anywhere in the cache, the cache is said to be fully associative
- Set associative: if a block can be placed in a restricted set of places in the cache, the cache is said to be set associative. A set is a group of blocks in the cache. A block is first mapped onto a set, and then the block can be placed anywhere within that set. The set is usually chosen by bit selection, that is, (Block address) MOD (Number of sets in cache)
39. A pictorial example for a cache with only 4 blocks and a memory with only 16 blocks
40. Direct mapped cache: A block from main memory can go in exactly one place in the cache. This is called direct mapped because there is a direct mapping from any block address in memory to a single location in the cache. (Diagram: cache and main memory)
41. Fully associative cache: A block from main memory can be placed in any location in the cache. This is called fully associative because a block in main memory may be associated with any entry in the cache.
42. Memory/Cache Related Terms
- Set associative cache: The middle range of designs between direct mapped cache and fully associative cache is called set-associative cache. In an n-way set-associative cache, a block from main memory can go into n (n at least 2) locations in the cache.
43. Locality of Reference
- If location X is accessed, it is very likely that location X+1 will be accessed soon
- This is the benefit of a cache with multi-word data blocks
44. Current CPUs
45. Replacing Data
- Initially all valid bits are set to 0
- As instructions and data are fetched from memory, the cache fills and some data needs to be replaced
- Which ones?
- Direct mapping: the choice is obvious
46. Replacement Policies for Associative Cache
- FIFO: fills from top to bottom and goes back to the top (may store data in physical memory before replacing it)
- LRU: replaces the least recently used data; requires a counter
- Random
47. Replacement in Set-Associative Cache
- Which of the n ways within the location should be replaced?
- FIFO
- Random
- LRU
- Example: the accessed locations are D, E, A
48. Writing Data
- If the location is in the cache, the cached value and possibly the value in physical memory must be updated
- If the location is not in the cache, it may be loaded into the cache or not (write-allocate vs. write-no-allocate)
- Two methodologies
- Write-through: physical memory always contains the correct value
- Write-back: the value is written to physical memory only when it is removed from the cache
49. Cache Performance
- Cache hits and cache misses
- Hit ratio h is the percentage of memory accesses that are served from the cache
- Average memory access time: TM = h * TC + (1 - h) * TP
- Example values: TC = 10 ns, TP = 60 ns (a worked example follows below)
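As a worked example with an assumed hit ratio (the slide gives only TC and TP): if h = 0.9, then TM = 0.9 x 10 ns + 0.1 x 60 ns = 9 + 6 = 15 ns; at h = 0.5 the average rises to 35 ns, which shows how strongly the hit ratio dominates.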
51. Page Replacement - FIFO
- FIFO is simple to implement
- When a page comes in, place its page id at the end of the list
- Evict the page at the head of the list
- Might be good? The page to be evicted has been in memory the longest time
- But? Maybe it is still being used; we just don't know
- FIFO suffers from Belady's Anomaly: the fault rate may increase when there is more physical memory!
52. FIFO vs. Optimal
- Reference string: the ordered list of pages accessed as the process executes
- Example: the reference string is A B C A B D A D B C B, and the system has 3 page frames
- OPTIMAL: 5 faults; at the final fault, toss A or D (neither is referenced again)
- FIFO: 7 faults; the faulting pages are A B C D A B C
53. Least Recently Used (LRU)
- Replace the page that has not been used for the longest time
- Example: 3 page frames, reference string A B C A B D A D B C
- LRU: 5 faults (a simulation of these traces follows below)
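A sketch that replays the traces from the last two slides and reproduces their fault counts (7 for FIFO, 5 for OPT, 5 for LRU; 3 page frames throughout):

    def fifo_faults(refs, frames=3):
        mem, faults = [], 0
        for p in refs:
            if p not in mem:
                faults += 1
                if len(mem) == frames:
                    mem.pop(0)            # evict the oldest arrival
                mem.append(p)
        return faults

    def lru_faults(refs, frames=3):
        mem, faults = [], 0               # ordered least- to most-recent
        for p in refs:
            if p in mem:
                mem.remove(p)             # a hit refreshes recency
            else:
                faults += 1
                if len(mem) == frames:
                    mem.pop(0)            # evict the least recently used
            mem.append(p)
        return faults

    def opt_faults(refs, frames=3):
        mem, faults = [], 0
        for i, p in enumerate(refs):
            if p in mem:
                continue
            faults += 1
            if len(mem) == frames:
                future = refs[i + 1:]     # evict the page used farthest ahead
                victim = max(mem, key=lambda q: future.index(q)
                             if q in future else len(future) + 1)
                mem.remove(victim)
            mem.append(p)
        return faults

    print(fifo_faults(list("ABCABDADBCB")))   # 7
    print(opt_faults(list("ABCABDADBCB")))    # 5
    print(lru_faults(list("ABCABDADBC")))     # 5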
54. LRU
- Past experience may indicate future behavior
- Perfect LRU requires some form of timestamp to be associated with a PTE on every memory reference!
- Counter implementation: every page entry has a counter; every time the page is referenced through this entry, copy the clock into the counter. When a page needs to be replaced, look at the counters to determine which page to replace
- Stack implementation: keep a stack of page numbers in doubly linked form. When a page is referenced, move it to the top; then no search is needed for replacement (a sketch follows below)
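A minimal sketch of the stack (move-to-top) implementation, using an ordered dictionary in place of the doubly linked list (names are illustrative):

    from collections import OrderedDict

    class LRUTracker:
        def __init__(self):
            self.order = OrderedDict()         # least recent entry first

        def access(self, page):
            if page in self.order:
                self.order.move_to_end(page)   # referenced: move to the top
            else:
                self.order[page] = True

        def victim(self):
            return next(iter(self.order))      # bottom of the stack: no search

    t = LRUTracker()
    for p in "ABCA":
        t.access(p)
    print(t.victim())   # B is now the least recently used page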