FSM, Cache memory - PowerPoint PPT Presentation

Provided by: Lee144
1
FSM, Cache memory
  • Prof. Sin-Min Lee
  • Department of Computer Science

CS147 Lecture 12
2
The Five Classic Components of a Computer
3
The Processor Picture
4
Processor/Memory Bus
PCI Bus
I/O Busses
5
Using T Flip Flop and JK Flip Flop
  • log₂ 4 = 2, so 2 flip-flops are needed to
    implement this FSA
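
The flip-flop count above generalizes: with binary state encoding, a machine with n states needs ceil(log2(n)) flip-flops. A minimal sketch in Python, where the function name `flip_flops_needed` is hypothetical and binary encoding (rather than, say, one-hot) is assumed:

```python
def flip_flops_needed(num_states: int) -> int:
    """Flip-flops needed to binary-encode num_states states: ceil(log2(n))."""
    if num_states < 1:
        raise ValueError("a state machine needs at least one state")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, without floats.
    return (num_states - 1).bit_length()


print(flip_flops_needed(4))  # 2, matching the four-state FSA on this slide
```

The integer `bit_length` trick avoids floating-point rounding that `math.log2` could introduce for large state counts.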

6
Step 1 - Translate diagram into State Table

7
Step 2 - Create maps for T and JK

8
Step 3 - Determine T, J, and K equations

9
Step 4 - Draw resulting diagram

10
Implementing FSM with No Inputs Using D, T, and
JK Flip Flops
  • Convert the diagram into a chart

11
Implementing FSM with No Inputs Using D, T, and
JK Flip Flops Cont.
  • For D and T Flip Flops

12
Implementing FSM with No Inputs Using D, T, and
JK Flip Flops Cont.
  • For JK Flip Flop

13
Implementing FSM with No Inputs Using D, T, and
JK Flip Flops Cont.
  • Final Implementation

14
von Neumann Architecture (Princeton)
[Diagram labels: Memory (Data/Instructions), Address Pointer, Arithmetic Logic Unit (ALU), Program Counter (PC = PC + 1)]
Featuring Deterministic Execution
15
(No Transcript)
16
(No Transcript)
17
(No Transcript)
18
(No Transcript)
19
Cache Memory
  • Physical memory is slow (more than 30 times
    slower than processor)
  • Cache memory uses SRAM chips.
  • Much faster
  • Much more expensive
  • Situated closest to the processor
  • Can be arranged hierarchically
  • L1 cache is incorporated into processor
  • L2 cache is outside

20
Cache Memory
  • This photo shows level 2 cache memory on the
    Processor board, beside the CPU

21
Cache Memory - Three Levels Architecture
[Diagram labels: 2 Gigahertz Clock, Cache Control Logic, Address Pointer, L1 Cache Memory, L2 Cache Memory, L3 Cache Memory, Memory (Multi-Gigabytes, Large and Slow, 160X). Size labels: 32 Kilobytes, 128 Kilobytes, 16 Megabytes. Speed labels: 2X, 8X, 16X.]
Featuring Really Non-Deterministic Execution
22
(No Transcript)
23
Cache (1)
  • Is the first level of memory hierarchy
    encountered once the address leaves the CPU
  • Since the principle of locality applies, and
    taking advantage of locality to improve
    performance is so popular, the term cache is now
    applied whenever buffering is employed to reuse
    commonly occurring items
  • We will study caches by trying to answer the four
    questions for the first level of the memory
    hierarchy

24
(No Transcript)
25
(No Transcript)
26
(No Transcript)
27
(No Transcript)
28
(No Transcript)
29
Cache (2)
  • Every address reference goes first to the cache
  • If the desired address is not there, we have
    a cache miss
  • The contents are fetched from main memory into
    the indicated CPU register, and the content is
    also saved into the cache memory
  • If the desired data is in the cache, then we have
    a cache hit
  • The desired data is brought from the cache, at
    very high speed (low access time)
  • Most software exhibits temporal locality of
    access, meaning that the same address is likely
    to be used again soon, and if so, the
    address will be found in the cache
  • Transfers between main memory and cache occur at
    the granularity of cache lines or cache blocks,
    around 32 or 64 bytes (rather than bytes or
    processor words). Burst transfers of this kind
    receive hardware support and exploit spatial
    locality of access to the cache (future accesses
    are often to addresses near the previous one)

30
Where can a block be placed in Cache? (1)
  • Our cache has eight block frames and the main
    memory has 32 blocks

31
Where can a block be placed in Cache? (2)
  • Direct mapped Cache
  • Each block has only one place where it can appear
    in the cache
  • (Block Address) MOD (Number of blocks in cache)
  • Fully associative Cache
  • A block can be placed anywhere in the cache
  • Set associative Cache
  • A block can be placed in a restricted set of
    places in the cache
  • A set is a group of blocks in the cache
  • (Block Address) MOD (Number of sets in the cache)
  • If there are n blocks in a set, the placement
    is said to be n-way set associative
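
The three placement rules can be seen as one rule with a different number of ways. The sketch below uses the eight-frame cache from the previous slide; the function name `candidate_frames` and the assumption that the frames of a set are numbered contiguously are illustrative choices, not something the slides specify.

```python
def candidate_frames(block_address: int, num_frames: int, ways: int) -> list[int]:
    """Return the cache frames in which a memory block may be placed.

    ways == 1           -> direct mapped
    ways == num_frames  -> fully associative (a single set)
    otherwise           -> ways-way set associative
    """
    if num_frames % ways != 0:
        raise ValueError("ways must divide the number of frames")
    num_sets = num_frames // ways
    set_index = block_address % num_sets  # (Block address) MOD (number of sets)
    first = set_index * ways              # assumption: a set's frames are contiguous
    return list(range(first, first + ways))


# Memory block 12 in a cache with eight block frames:
print(candidate_frames(12, 8, 1))  # direct mapped: [4]
print(candidate_frames(12, 8, 2))  # 2-way set associative: [0, 1]
print(candidate_frames(12, 8, 8))  # fully associative: [0, 1, 2, 3, 4, 5, 6, 7]
```

With one way the number of sets equals the number of frames, which reduces to (Block Address) MOD (Number of blocks in cache).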

32
How is a Block Found in the Cache?
  • Caches have an address tag on each block frame
    that gives the block address. The tag is checked
    against the address coming from CPU
  • All tags are searched in parallel since speed is
    critical
  • A valid bit is appended to every tag to say whether
    this entry contains a valid address or not
  • Address fields
  • Block address
  • Tag: compared against for a hit
  • Index: selects the set
  • Block offset: selects the desired data from the
    block
  • Set associative cache
  • For a fixed cache size, a larger index means more
    sets, with fewer blocks per set
  • With a smaller index, the associativity increases
  • Fully associative cache: the index field does not
    exist
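
To make the address fields concrete, the following sketch splits an address into tag, index, and block offset. The function name `split_address` and the example geometry (64-byte blocks, 128 sets) are assumptions chosen for illustration; both sizes are assumed to be powers of two.

```python
def split_address(address: int, block_size: int, num_sets: int) -> tuple[int, int, int]:
    """Split an address into (tag, index, block offset).

    block_size and num_sets are assumed to be powers of two.
    num_sets == 1 models a fully associative cache: the index is 0 bits wide.
    """
    offset_bits = block_size.bit_length() - 1
    index_bits = num_sets.bit_length() - 1
    offset = address & (block_size - 1)                # selects data within the block
    index = (address >> offset_bits) & (num_sets - 1)  # selects the set
    tag = address >> (offset_bits + index_bits)        # compared against stored tags
    return tag, index, offset


print(split_address(0x12345678, block_size=64, num_sets=128))  # (37282, 89, 56)
print(split_address(0x12345678, block_size=64, num_sets=1))    # index is always 0
```

Halving the number of sets at a fixed cache size moves one bit from the index to the tag, which is the trade-off between index width and associativity.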

33
(No Transcript)
34
Which Block should be Replaced on a Cache Miss?
  • When a miss occurs, the cache controller must
    select a block to be replaced with the desired
    data
  • Benefit of direct mapping is that the hardware
    decision is much simplified
  • Two primary strategies for fully and set
    associative caches
  • Random: candidate blocks are randomly selected
  • Some systems generate pseudo-random block
    numbers, to get reproducible behavior useful for
    debugging
  • LRU (Least Recently Used): to reduce the chance
    of throwing out information that will be needed
    again soon, the block replaced is the
    least recently used one
  • Accesses to blocks are recorded to be able to
    implement LRU

35
What Happens on a Write?
  • Two basic options when writing to the cache
  • Write through: the information is written to
    both the block in the cache and the block in the
    lower-level memory
  • Write back: the information is written only to
    the block in the cache
  • The modified block of cache is written back into
    the lower-level memory only when it is replaced
  • To reduce the frequency of writing back blocks on
    replacement, an implementation feature called
    the dirty bit is commonly used
  • This bit indicates whether a block is dirty (has
    been modified since it was loaded) or clean (not
    modified). If clean, no write back is needed
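
The difference between the two write policies shows up in how many writes reach the lower-level memory. Below is a toy sketch, not a model of real hardware: the class `OneBlockCache` and the helper `count_memory_writes` are hypothetical, the cache holds a single block, and write-allocate on a miss is assumed.

```python
class OneBlockCache:
    """Toy cache holding a single block, just enough to contrast write policies."""

    def __init__(self, memory: dict, write_back: bool):
        self.memory = memory      # stands in for the lower-level memory
        self.write_back = write_back
        self.address = None       # address of the cached block (None = empty)
        self.value = None
        self.dirty = False
        self.memory_writes = 0    # counts writes reaching lower-level memory

    def _write_memory(self, address, value):
        self.memory[address] = value
        self.memory_writes += 1

    def _load(self, address):
        # Replacing a block: a dirty block must be written back first.
        if self.address is not None and self.dirty:
            self._write_memory(self.address, self.value)
        self.address, self.value, self.dirty = address, self.memory.get(address, 0), False

    def write(self, address, value):
        if self.address != address:
            self._load(address)   # assumption: write-allocate on a miss
        self.value = value
        if self.write_back:
            self.dirty = True     # defer the memory update until replacement
        else:
            self._write_memory(address, value)  # write through immediately


def count_memory_writes(write_back: bool) -> int:
    cache = OneBlockCache(memory={}, write_back=write_back)
    for i in range(10):
        cache.write(0x40, i)      # ten writes to the same block
    cache.write(0x80, 99)         # forces replacement of block 0x40
    return cache.memory_writes


print(count_memory_writes(write_back=False))  # 11: every write goes to memory
print(count_memory_writes(write_back=True))   # 1: only the dirty block on replacement
```

In the write-back run the block at 0x80 is still dirty when the loop ends, so memory is temporarily out of date for it; that is the price of the reduced write traffic.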

36
(No Transcript)
37
(No Transcript)
38
There are three methods of block placement
  • Direct mapped: if each block has only one place
    it can appear in the cache, the cache is said to
    be direct mapped. The mapping is usually (Block
    address) MOD (Number of blocks in cache)
  • Fully associative: if a block can be placed
    anywhere in the cache, the cache is said to be
    fully associative
  • Set associative: if a block can be placed in a
    restricted set of places in the cache, the cache
    is said to be set associative. A set is a group
    of blocks in the cache. A block is first mapped
    onto a set, and then the block can be placed
    anywhere within that set. The set is usually
    chosen by bit selection, that is,
    (Block address) MOD (Number of sets in cache)
39
  • A pictorial example for a cache with only 4
    blocks and a memory with only 16 blocks.

40
  • Direct mapped cache: A block from main memory can
    go in exactly one place in the cache. This is
    called direct mapped because there is a direct
    mapping from any block address in memory to a
    single location in the cache.

cache
Main memory
41
  • Fully associative cache: A block from main
    memory can be placed in any location in the
    cache. This is called fully associative because a
    block in main memory may be associated with any
    entry in the cache.

42
Memory/Cache Related Terms
  • Set associative cache: The middle range of
    designs between direct mapped cache and fully
    associative cache is called set-associative
    cache. In an n-way set-associative cache, a block
    from main memory can go into n (n at least 2)
    locations in the cache.

43
Locality of Reference
  • If location X is accessed, it is very likely that
    location X+1 will be accessed.
  • Benefit of the cache with data blocks

44
Current CPUs
45
Replacing Data
  • Initially all valid bits are set to 0
  • As instructions and data are fetched from memory,
    the cache fills up and some data needs to be
    replaced.
  • Which ones?
  • Direct mapping: obvious

46
Replacement Policies for Associative Cache
  • FIFO - fills from top to bottom and goes back to
    top. (May store data in physical memory before
    replacing it)
  • LRU - replaces the least recently used data.
    Requires a counter.
  • Random

47
Replacement in Set-Associative Cache
  • Which of the n ways within the location to replace?
  • FIFO
  • Random
  • LRU

Accessed locations are D, E, A
48
Writing Data
  • If the location is in the cache, the cached value
    and possibly the value in physical memory must
    be updated.
  • If the location is not in the cache, it may be
    loaded into the cache or not (write-allocate and
    write-no-allocate)
  • Two methodologies
  • Write-through
  • Physical memory always contains the correct value
  • Write-back
  • The value is written to physical memory only when it
    is removed from the cache

49
Cache Performance
  • Cache hits and cache misses.
  • Hit ratio is the percentage of memory accesses
    that are served from the cache
  • Average memory access time
  • T_M = h T_C + (1 - h) T_P, where h is the hit ratio

T_C = 10 ns, T_P = 60 ns
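
The average access time formula is easy to evaluate for the cache and physical-memory times given on this slide. In the sketch below the function name `average_access_time` is hypothetical and the hit ratios are illustrative values, not figures from the lecture.

```python
def average_access_time(hit_ratio: float, t_cache: float, t_memory: float) -> float:
    """T_M = h * T_C + (1 - h) * T_P, with all times in the same unit."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit ratio must be between 0 and 1")
    return hit_ratio * t_cache + (1.0 - hit_ratio) * t_memory


# T_C = 10 ns and T_P = 60 ns as on the slide; hit ratios are illustrative.
for h in (0.0, 0.5, 0.9, 1.0):
    print(f"h = {h:.1f}: T_M = {average_access_time(h, 10, 60):.1f} ns")
```

At h = 0.9 the average is about 15 ns, so even a 10% miss rate makes the average access half again as slow as a pure cache hit.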
50
(No Transcript)
51
Page Replacement - FIFO
  • FIFO is simple to implement
  • On page-in, place the page id on the end of the list
  • Evict the page at the head of the list
  • Might be good? The page to be evicted has been in
    memory the longest time
  • But?
  • Maybe it is being used
  • We just don't know
  • FIFO suffers from Belady's Anomaly: the fault rate
    may increase when there is more physical memory!
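
Belady's Anomaly can be reproduced with a short FIFO simulation. The reference string below is a commonly used textbook example rather than one from these slides, and the function name `fifo_faults` is hypothetical.

```python
from collections import deque


def fifo_faults(reference_string, num_frames: int) -> int:
    """Count page faults under FIFO replacement."""
    frames = deque()          # head of the deque = oldest resident page
    faults = 0
    for page in reference_string:
        if page in frames:
            continue          # hit: FIFO order does not change
        faults += 1
        if len(frames) == num_frames:
            frames.popleft()  # evict the page at the head of the list
        frames.append(page)   # newly paged-in page goes on the end
    return faults


refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(refs, 3))  # 9
print(fifo_faults(refs, 4))  # 10: more frames, yet more faults
```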

52
FIFO vs. Optimal
  • Reference string: ordered list of pages accessed
    as the process executes
  • Ex. Reference String is A B C A B D A D B C B
  • System has 3 page frames
  • OPTIMAL: 5 faults (on the final fault, toss A or D)
  • FIFO: 7 faults (faulting pages: A B C D A B C)
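
The fault counts for the reference string A B C A B D A D B C B with 3 page frames can be checked with a small simulation. This is a sketch with hypothetical function names; the optimal policy is implemented as evicting the resident page whose next use lies farthest in the future, with ties broken arbitrarily.

```python
def fifo_faults(refs, num_frames):
    """Count faults when the oldest resident page is always evicted."""
    frames, faults = [], 0
    for page in refs:
        if page not in frames:
            faults += 1
            if len(frames) == num_frames:
                frames.pop(0)  # head of the list is the oldest page
            frames.append(page)
    return faults


def optimal_faults(refs, num_frames):
    """Count faults when the page used farthest in the future is evicted."""
    frames, faults = [], 0
    for i, page in enumerate(refs):
        if page in frames:
            continue
        faults += 1
        if len(frames) == num_frames:
            future = refs[i + 1:]

            def next_use(p):
                # Pages never used again sort last, so they are evicted first.
                return future.index(p) if p in future else len(future)

            frames.remove(max(frames, key=next_use))
        frames.append(page)
    return faults


refs = "A B C A B D A D B C B".split()
print(optimal_faults(refs, 3))  # 5
print(fifo_faults(refs, 3))     # 7
```

The optimal policy needs knowledge of future references, so it serves as a lower bound for comparison rather than something an operating system can implement.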
53
Least Recently Used (LRU)
  • Replace the page that has not been used for the
    longest time

3 Page Frames
Reference String - A B C A B D A D B C
LRU: 5 faults
54
LRU
  • Past experience may indicate future behavior
  • Perfect LRU requires some form of timestamp to be
    associated with a PTE on every memory reference
    !!!
  • Counter implementation
  • Every page entry has a counter; every time the page
    is referenced through this entry, copy the clock
    into the counter
  • When a page needs to be replaced, look at the
    counters to determine which page to replace
  • Stack implementation: keep a stack of page
    numbers in doubly linked form
  • Page referenced: move it to the top
  • No search for replacement
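
Both LRU implementations can be sketched in a few lines. The function names are hypothetical; the reference-string position stands in for the clock, and Python's `OrderedDict` (which is backed by a doubly linked list) stands in for the stack. The reference string is the LRU example A B C A B D A D B C with 3 page frames.

```python
from collections import OrderedDict


def lru_faults_counter(refs, num_frames):
    """Counter implementation: each resident page stores the time of its last use."""
    last_used, faults = {}, 0
    for clock, page in enumerate(refs):
        if page not in last_used:
            faults += 1
            if len(last_used) == num_frames:
                victim = min(last_used, key=last_used.get)  # search all counters
                del last_used[victim]
        last_used[page] = clock  # copy the clock into the counter
    return faults


def lru_faults_stack(refs, num_frames):
    """Stack implementation: the most recently used page is kept at the top (end)."""
    stack, faults = OrderedDict(), 0
    for page in refs:
        if page in stack:
            stack.move_to_end(page)    # referenced page moves to the top
            continue
        faults += 1
        if len(stack) == num_frames:
            stack.popitem(last=False)  # bottom of the stack is the LRU page
        stack[page] = True
    return faults


refs = "A B C A B D A D B C".split()
print(lru_faults_counter(refs, 3), lru_faults_stack(refs, 3))  # 5 5
```

The counter version pays for a search over all counters at replacement time, while the stack version pays on every reference to reorder the list but needs no search when evicting.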