Title: Review For Final
1Review For Final
2Administrative Issues
- Final Exam
- Date/Time: Wednesday, April 25th, 7:00 PM
- Location: Same as regular lecture room
- Basics
- Closed book
- Can bring one sheet of notes (can write on both sides)
- Calculators are allowed
- Computers are not allowed
3More Details
- Exam structure: Similar to the mid-term examination
- Questions
- True or False (explanation needed for both TRUE and FALSE answers)
- Short answer questions
- Exercise questions (resembling homework problems)
- Total: 5 to 6 questions (multiple parts)
- Syllabus: Non-cumulative
- Material Covered
- Chapter 5: Memory Hierarchy Design (5.1-5.11)
- Chapter 7: Storage Systems (7.1-7.5, 7.7, 7.8, 7.10, 7.11)
- Associated homework problems (HW4, 5)
- Associated lecture notes
4Big Picture The World We Live In
Technology: Why? Communication, Computing, Pleasure, Comfort, Peace, Understanding, Efficiency/Productivity
[Figure labels: LAN, WAN]
What Matters?
5Big Picture Diving In
[Figure: system block diagram. Processor (registers, ALU, ILP, pipelining, ISA, interrupts) with cache; a memory-I/O bus connects main memory and the I/O controllers; the controllers attach disks, graphics, and network]
Where do new technologies stand? Are new applications more interesting? What else?
What Matters? Performance as experienced by a user. Benchmarks?
6Memory Hierarchy Design
- Principles of Locality
- Performance: Average Access Time (includes the effect of misses)
- Caches
- Associativity, block size, capacity: performance (pros and cons of each option)
- Frame: Data, Tag, State bits
- Direct mapped: Address mapping details
- Block placement, identification, replacement, write strategy
- Cache Miss: CCC (compulsory, capacity, conflict)
- L2 Cache Design Issues
- Virtual Address Translation Issues
- Optimization techniques (miss rate, miss penalty, hit time)
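The average access time listed above is usually written as hit time + miss rate × miss penalty. A minimal sketch of that formula follows; the function names and all numeric values (cycle counts, miss rates) are illustrative assumptions, not figures from the lecture.

```python
def average_access_time(hit_time, miss_rate, miss_penalty):
    """Average access time = hit time + miss rate * miss penalty."""
    return hit_time + miss_rate * miss_penalty


def two_level_access_time(l1_hit, l1_miss_rate, l2_hit, l2_local_miss_rate, mem_penalty):
    """With an L2 cache, the L1 miss penalty is itself an average access time."""
    l1_miss_penalty = average_access_time(l2_hit, l2_local_miss_rate, mem_penalty)
    return average_access_time(l1_hit, l1_miss_rate, l1_miss_penalty)


# Hypothetical numbers (in cycles), chosen only to exercise the formula.
amat_one_level = average_access_time(hit_time=1, miss_rate=0.05, miss_penalty=100)
amat_two_level = two_level_access_time(1, 0.05, 10, 0.2, 100)
print(amat_one_level, amat_two_level)
```

Under these assumed numbers, adding an L2 cache lowers the average because most L1 misses pay the L2 hit time instead of the full memory penalty.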
7Memory Hierarchy Design
- Main Memory
- Technology: DRAM (slow, but dense)
- Latency and bandwidth issues
- Interleaving/banking for high bandwidth
- Simple vs. complex
- Virtual Memory
- Virtual Memory vs. Physical Memory
- Larger memory, protection, relocation, multiprogramming
- VA vs. PA → Address translation
- Pages vs. Frames
- Page Tables (inverted, multi-level), TLBs
8Storage Systems
- Disks
- Parameters, Performance
- Redundancy and RAID
- Reliability, Availability, and Dependability
- Buses
- I/O System Architecture
- CPU vs. DMAC vs. IOP
- DMA, I/O Processors
- Polling vs. Interrupts
- I/O System Example
9Example True or False questions with explanations
- In a memory system with DRAM banks totaling a width of N bytes, if all memory accesses are at least N bytes long and aligned on N-byte boundaries, then there is no advantage to implementing complex interleaving over simple interleaving.
- The main difference between DRAM (technology for main memory) and SRAM (technology for caches) is that DRAM is optimized for access speed while SRAM is optimized for density.
- Replication and multibanking are two ways to increase available cache write bandwidth.
10Example Short Questions
- Disk access latency, t-disk, is the sum of several components. Three of these are t-seek, t-rotation, and t-transfer. What are these? Which are typically short and which are typically long?
- Between Polling and Interrupts, which one is a better I/O mechanism to implement in network computer architectures?
- List methods (not including changing the size of the cache) for reducing capacity misses
11Review Q Caches
- 95% hit ratio; block = 2 words; the whole block is read on a miss
- Processor requests 10^9 words/second; 25% of references are writes
- Memory system can support 10^9 words/second, and it transfers a word at a time
- Stats: At any time 30% of the blocks are modified; write allocate on a write miss
- How much memory system BW is used for a) write-through, b) write-back?
Only data transfer traffic is considered
12Continued..
- Write-through: words transferred per event: read-miss-clean = 2, write-hit-clean = 1, write-miss-clean = 3
- BW used = (0.75 × 0.05 × 2 + 0.25 × 0.95 × 1 + 0.25 × 0.05 × 3) × 10^9 words/second
- Write-back
- Words transferred per event: read-miss-clean = 2, read-miss-dirty = 4, write-miss-clean = 2, write-miss-dirty = 4
- BW used = [0.75 × 0.05 × (0.7 × 2 + 0.3 × 4) + 0.25 × 0.05 × (0.7 × 2 + 0.3 × 4)] × 10^9 words/second
- = 0.05 × (0.7 × 2 + 0.3 × 4) × 10^9 words/second
In write-through, the % of dirty blocks is 0 (equivalently, the % of clean blocks is 100)
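The two bandwidth expressions can be evaluated numerically. The sketch below only plugs in the parameters stated in the problem (95% hit ratio, 25% writes, 30% dirty blocks, 2-word blocks, 10^9 words/second); the variable names are illustrative assumptions.

```python
PEAK_WORDS_PER_SEC = 1e9   # processor request rate and memory system peak rate
MISS_RATE = 0.05           # 95% hit ratio
WRITE_FRACTION = 0.25
READ_FRACTION = 1 - WRITE_FRACTION
DIRTY_FRACTION = 0.30
BLOCK_WORDS = 2

# Write-through with write allocate:
#   read miss  -> fetch the block (2 words)
#   write hit  -> write one word through to memory
#   write miss -> fetch the block (2 words) and write one word (3 words)
wt_words_per_ref = (READ_FRACTION * MISS_RATE * BLOCK_WORDS
                    + WRITE_FRACTION * (1 - MISS_RATE) * 1
                    + WRITE_FRACTION * MISS_RATE * (BLOCK_WORDS + 1))

# Write-back: every miss fetches a block; a dirty victim is also written back.
wb_words_per_miss = ((1 - DIRTY_FRACTION) * BLOCK_WORDS
                     + DIRTY_FRACTION * 2 * BLOCK_WORDS)
wb_words_per_ref = MISS_RATE * wb_words_per_miss

write_through_bw = wt_words_per_ref * PEAK_WORDS_PER_SEC
write_back_bw = wb_words_per_ref * PEAK_WORDS_PER_SEC
print(f"write-through: {wt_words_per_ref:.0%} of memory bandwidth")
print(f"write-back:    {wb_words_per_ref:.0%} of memory bandwidth")
```

With these inputs write-through uses about 35% of the available bandwidth and write-back about 13%.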
13Review Q Caches
- Fully associative 64-byte cache, 8-byte lines, LRU replacement
- How many sets does the cache have?
- For a given series of accesses (addresses in octal), label the compulsory, capacity, and conflict misses
- Double the block size to 16 bytes and repeat the first two parts.
- ONE SET
- 8-byte block → the last octal digit represents the offset
- One set → no index bits → the upper three octal digits are the block tag
- Compulsory miss: First miss on any block address
- Conflict miss: Any miss that would not occur in a fully associative cache → no conflict misses
- So the rest are capacity misses!
- Follow the table
15Q Cache Tag Size
- Note: Address synonyms arise if the number of sets in the cache times its block size is greater than the size of a virtual memory page
- Q: 16 KB first-level data cache, 32 B lines, 2-way set-associative
- Virtually indexed, physically tagged
- Virtual address: 32 bits; page: 8 KB
- Total size of tags?
- If 64 B lines and half the number of sets (keeping associativity the same), what is the tag size?
- Can we halve the associativity and leave the # of sets the same?
- Block: 32 B → 5 offset bits
- Set: 2 blocks → # of sets = 16 KB / 64 B = 256 → 8 index bits
- Tag size = 32 − 5 − 8 = 19 bits
- Note on synonyms: 256 × 32 B = 8 KB, which does not exceed the page size
- Do the other parts yourself.
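The tag arithmetic for the first part can be checked with a short sketch. It assumes, as the slide's calculation does, that the tag is taken from a 32-bit address, that all sizes are powers of two, and that only tag bits are counted (no valid or state bits); the function name is hypothetical.

```python
def tag_storage(cache_bytes, line_bytes, ways, addr_bits=32):
    """Return (number of sets, tag bits per line, total tag bits).

    Assumes power-of-two sizes and counts tag bits only.
    """
    sets = cache_bytes // (line_bytes * ways)
    offset_bits = line_bytes.bit_length() - 1   # log2 of a power of two
    index_bits = sets.bit_length() - 1
    tag_bits = addr_bits - offset_bits - index_bits
    total_tag_bits = tag_bits * sets * ways
    return sets, tag_bits, total_tag_bits


sets, tag_bits, total_tag_bits = tag_storage(16 * 1024, 32, 2)
print(sets, tag_bits, total_tag_bits)
```

For the 16 KB, 32 B line, 2-way cache this gives 256 sets and 19-bit tags; with 512 lines the tags alone occupy 19 × 512 = 9,728 bits under the stated assumptions. Calling the same function with other line sizes or associativities is one way to check the remaining parts.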
16Q Page Table Size
- Page table size (1 level) = # of virtual pages × size of each entry
- The size of each entry is determined by the number of physical pages + protection bits
- The # of virtual pages is determined by the address space and the size of the page
17One-Level Page Table
- 32-bit machine with 16 KB pages and 64 MB physical memory
- PTE contains the PFN only, and PTEs are an integer number of bytes
- How much storage is needed for a single-level page table?
- # of entries in the page table = # of virtual pages
- 2^32 / 16 K = 2^18 = 256 K
- Size of an entry in the page table = bits required to represent the # of physical pages
- # of physical pages = 64 MB / 16 KB = 4 K → 12 bits
- Size of an entry in the page table = 12 bits → 2 bytes (rounded up to an integer number of bytes)
- Page table size = # of entries × entry size = 256 K × 2 bytes = 512 Kbytes
- (Assumption: 1 byte = 1 word)
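The single-level calculation can be reproduced step by step. This is a minimal sketch under the problem's assumptions (the PTE holds only the PFN and is rounded up to whole bytes); the function and variable names are illustrative.

```python
def one_level_page_table(va_bits, page_bytes, phys_mem_bytes):
    """Return (entries, PFN bits, PTE bytes, table bytes) for a flat page table."""
    virtual_pages = 2**va_bits // page_bytes          # one entry per virtual page
    physical_pages = phys_mem_bytes // page_bytes
    pfn_bits = (physical_pages - 1).bit_length()      # bits needed to name a frame
    pte_bytes = (pfn_bits + 7) // 8                   # round up to whole bytes
    return virtual_pages, pfn_bits, pte_bytes, virtual_pages * pte_bytes


entries, pfn_bits, pte_bytes, table_bytes = one_level_page_table(
    va_bits=32, page_bytes=16 * 1024, phys_mem_bytes=64 * 2**20)
print(entries, pfn_bits, pte_bytes, table_bytes // 1024, "KB")
```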
18Two-Level Page Table
- 32-bit machine with 16 KB pages and 64 MB physical memory
- PTE contains the PFN only, and PTEs are an integer number of bytes
- How much storage is needed for the first-level table only of a two-level virtual page table?
- How much if it is a two-level physical page table?
- # of physical pages = 64 MB / 16 KB = 4 K
- → PFN = 12 bits → 2 bytes (rounded up)
- How many second-level entries are there in total?
- 1 per virtual page → 2^32 / 16 K = 2^18 = 256 K
- Total size of second-level tables = 256 K × 2 B = 512 KB
- How large is the first-level table?
- Each second-level table is one 16 KB page in size, meaning there are 512 KB / 16 KB = 32 of them.
- (2nd-level tables are each the size of 1 data page)
- As a result, the first-level table must contain 32 pointers.
For more, see pages 87-92 of the Cache-memory lecture notes
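The count of first-level pointers follows from the second-level totals. The sketch below reproduces that chain; the final byte count additionally assumes that each first-level pointer is a 2-byte PFN (the physical-table case), which is an illustrative assumption rather than a figure stated on the slide.

```python
def two_level_page_table(va_bits=32, page_bytes=16 * 1024, phys_mem_bytes=64 * 2**20):
    virtual_pages = 2**va_bits // page_bytes
    physical_pages = phys_mem_bytes // page_bytes
    pfn_bits = (physical_pages - 1).bit_length()
    pte_bytes = (pfn_bits + 7) // 8                      # PFN rounded up to whole bytes

    second_level_bytes = virtual_pages * pte_bytes       # one PTE per virtual page
    second_level_tables = second_level_bytes // page_bytes  # each table fills one page

    # Assumption: a first-level pointer is a PFN-sized entry (physical table).
    first_level_bytes = second_level_tables * pte_bytes
    return second_level_bytes, second_level_tables, first_level_bytes


second_level_bytes, first_level_pointers, first_level_bytes = two_level_page_table()
print(second_level_bytes // 1024, "KB,", first_level_pointers, "pointers,",
      first_level_bytes, "bytes")
```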
19Concept of Physical and Virtual tables in 2-level tables
- The first-level table in a 2-level table implementation can be implemented in either VA space or PA space.
- Normally the system performs the translation and accesses the page table info on the process's behalf
- Inside the 1st-level table
- If the pointers to the 2nd-level tables are in the Physical Address Space, then it is a Physical Table
- If the pointers to the 2nd-level tables are in the Virtual Address Space, then it is a Virtual Table
20How to Access A Physical Cache with a Virtual
Address?
- Only the index bits matter for cache access.
- Also, only part of the VA changes during translation to a PA
- → Ensure that the index bits are in the untranslated part of the address
- The index is within the page offset, or
- Virtual index = physical index
- Sometimes called virtually indexed, physically tagged
- Advantages: Fast cache access from VA space
- Problems: Restricted cache size
- Block size × # of sets ≤ page size
- It is OK, just use associativity to increase cache size
- Virtually indexed, virtually tagged if the cache keeps the VPN instead of the tag
[Figure: the virtual address (VPN | page offset) aligned against the cache address fields (tag | index | offset) of the physical cache]
21Synonyms
- What happens if (index + offset) > page offset?
- Assume j VPN bits are used in the index
- It means the same physical block may be in 2^j sets
- Impossible to know which, given only the physical address
- Called a synonym: an intra-cache coherence problem
- Solutions
- Search all possible synonymous sets in parallel
- Restrict page placement in the OS such that index(VA) = index(PA)
- Eliminate by OS convention: a single shared virtual address space
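The synonym condition is a comparison of bit counts, which a few lines can make concrete. This sketch assumes power-of-two sizes; the function name and the second example configuration are hypothetical.

```python
def synonym_set_count(num_sets, block_bytes, page_bytes):
    """Number of sets in which one physical block could reside.

    j is the number of VPN bits that spill into the cache index;
    the block may then be found in 2**j different sets.
    """
    index_plus_offset_bits = (num_sets * block_bytes).bit_length() - 1
    page_offset_bits = page_bytes.bit_length() - 1
    j = max(0, index_plus_offset_bits - page_offset_bits)
    return 2**j


# 256 sets of 32 B blocks with 8 KB pages: index + offset = 13 bits = page offset.
no_synonyms = synonym_set_count(256, 32, 8 * 1024)
# Hypothetical larger cache: 1024 sets of 32 B blocks, same page size.
with_synonyms = synonym_set_count(1024, 32, 8 * 1024)
print(no_synonyms, with_synonyms)
```

A result of 1 means the index lies entirely within the page offset, so no synonym handling is needed; the hypothetical 1024-set cache uses two VPN bits in the index and so has four candidate sets.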
22How do you separate misses?
- Rules
- Any miss to a block you have not seen before is a compulsory miss
- Any miss to a block you have not seen within the last N distinct blocks, where N is the total number of blocks in the cache (and the victim buffer), is a capacity miss. (Basically, the last N distinct blocks will be the N blocks present in a fully associative cache.)
- All other misses are conflict misses.
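These rules translate directly into a small classifier. The sketch below is one possible reading: it simulates a set-associative LRU cache next to a record of the last N distinct blocks, takes block numbers rather than byte addresses, and ignores the victim buffer. The function name and the reference sequence are hypothetical.

```python
from collections import OrderedDict


def classify_misses(block_numbers, num_sets, ways=1):
    """Label each reference as hit, compulsory, capacity, or conflict."""
    capacity = num_sets * ways                 # N: total blocks in the cache
    seen = set()                               # every block referenced so far
    recent = OrderedDict()                     # last N distinct blocks (LRU order)
    sets = [OrderedDict() for _ in range(num_sets)]
    labels = []
    for block in block_numbers:
        cache_set = sets[block % num_sets]
        if block in cache_set:
            cache_set.move_to_end(block)       # refresh LRU position
            labels.append("hit")
        else:
            if block not in seen:
                labels.append("compulsory")
            elif block not in recent:
                labels.append("capacity")
            else:
                labels.append("conflict")
            if len(cache_set) >= ways:
                cache_set.popitem(last=False)  # evict the LRU block of this set
            cache_set[block] = True
        seen.add(block)
        recent[block] = True
        recent.move_to_end(block)
        if len(recent) > capacity:
            recent.popitem(last=False)
    return labels


# Hypothetical trace on a direct-mapped cache with 2 sets (N = 2).
labels = classify_misses([0, 0, 2, 0, 1, 3, 2], num_sets=2)
print(labels)
```

In this trace the fourth reference misses only because blocks 0 and 2 share a set, so it is a conflict miss; the last reference would miss even in a fully associative cache of the same size, so it is a capacity miss.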
23Example Problem with all types of misses and a victim buffer
- Block: 8 B, addresses in octal, direct-mapped cache, 8 sets
- Victim buffer (VB) of 1 line
- Initial cache: 000, 010, 020, 030, 040, 050, 060, and 070, accessed in that order; the VB is empty
- For the given sequence of references, label the cache hits and misses
- Offset = the last octal digit → keep track of the most significant 2 digits
- Direct mapped with 8 sets → the middle address digit maps the block to its set
- Victim buffer: On a miss, the replaced block goes to the VB
- On a VB hit, the block in the VB and the evicted block are swapped
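The setup above is small enough to simulate. The following is a minimal sketch of a direct-mapped cache with a one-line victim buffer under the stated rules; the reference sequence used here is hypothetical, since the original sequence is not part of this transcript, and the function name is illustrative.

```python
def simulate_victim_buffer(refs_octal, num_sets=8, block_bytes=8):
    """Label each octal address as 'hit', 'VB hit', or 'miss'."""
    # Initial contents: blocks at octal 000, 010, ..., 070 are block numbers 0..7,
    # one per set; the victim buffer starts empty.
    cache = {s: s for s in range(num_sets)}
    victim = None
    labels = []
    for text in refs_octal:
        block = int(text, 8) // block_bytes    # drop the offset (last octal digit)
        s = block % num_sets                   # middle octal digit selects the set
        if cache[s] == block:
            labels.append("hit")
        elif victim == block:
            cache[s], victim = block, cache[s]  # swap VB block with the evicted block
            labels.append("VB hit")
        else:
            victim = cache[s]                  # replaced block goes to the VB
            cache[s] = block
            labels.append("miss")
    return labels


# Hypothetical reference sequence (octal addresses).
labels = simulate_victim_buffer(["004", "100", "001", "205", "100"])
print(labels)
```

Each miss label from this simulation could then be split into compulsory, capacity, or conflict by applying the classification rules separately.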
24Contd.
25I/O System Example Revisited
- Given
- 500 MIPS CPU
- 16B wide, 100 ns memory system
- 10,000 instructions per I/O
- 16KB per I/O
- 200 MB/s I/O bus, with room for 20 SCSI-2 controllers
- SCSI-2 strings: 20 MB/s with 15 disks per bus
- SCSI-2: 1 ms overhead per I/O
- 7,200 RPM (120 RPS), 8 ms avg seek, 6 MB/s transfer disks
- 200 GB total storage
- Q: Choose 2 GB or 8 GB disks for maximum IOPS?
- How to arrange disks and controllers?
Similar example in the book on page 744
[Figure: CPU, memory, bus, and disks]
26I/O System Example (contd)
- Step 1: Calculate CPU, memory, I/O bus peak IOPS
- CPU: 500 MIPS / (10,000 instructions/IO) = 50,000 IOPS
- Memory: (16 bytes / 100 ns) / (16 KB/IO) = 10,000 IOPS
- I/O bus: (200 MB/s) / 16 KB = 12,500 IOPS
- Memory bus is the bottleneck with 10,000 IOPS!
- Step 2: Calculate disk IOPS
- t_disk = 8 ms + 0.5 / (120 RPS) + 16 KB / (6 MB/s) ≈ 15 ms
- Disk: 1 / 15 ms ≈ 67 IOPS
- 8 GB disks → need 25 → 25 × 67 IOPS = 1,675 IOPS
- 2 GB disks → need 100 → 100 × 67 IOPS = 6,700 IOPS
- 100 2 GB disks with 6,700 IOPS are the new bottleneck!
- Answer I: 100 2 GB disks!
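Steps 1 and 2 are plain arithmetic and can be reproduced as below. The sketch assumes decimal units (1 KB = 1,000 bytes), which is what yields the round figures on the slide, and it rounds the disk time to a whole millisecond the way the slide does; the variable names are illustrative.

```python
KB, MB = 1e3, 1e6            # assumption: decimal units, matching the slide's numbers
IO_BYTES = 16 * KB

# Step 1: peak IOPS of CPU, memory, and I/O bus.
cpu_iops = 500e6 / 10_000                     # 500 MIPS, 10,000 instructions per I/O
memory_iops = (16 / 100e-9) / IO_BYTES        # 16 bytes every 100 ns
bus_iops = 200 * MB / IO_BYTES
system_limit = min(cpu_iops, memory_iops, bus_iops)

# Step 2: per-disk IOPS = 1 / (seek + half a rotation + transfer).
t_disk_ms = round(8 + 0.5 / 120 * 1e3 + IO_BYTES / (6 * MB) * 1e3)
disk_iops = round(1e3 / t_disk_ms)


def disk_option(disk_gb, total_gb=200):
    """Return (number of disks, aggregate IOPS) for a given disk size."""
    count = total_gb // disk_gb
    return count, count * disk_iops


print(cpu_iops, memory_iops, bus_iops)
print(t_disk_ms, disk_iops, disk_option(8), disk_option(2))
```

Both disk options deliver fewer IOPS than the 10,000 IOPS memory limit, so the disks are the bottleneck, and the larger count of 2 GB disks gives the higher aggregate rate.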
27I/O System Example (contd)
- Step 3: Calculate SCSI-2 controller peak IOPS
- t_SCSI-2 = 1 ms + 16 KB / (20 MB/s) = 1.8 ms
- SCSI-2: 1 / 1.8 ms ≈ 556 IOPS
- Step 4: How many disks per controller?
- 556 IOPS / 67 IOPS ≈ 8 disks per controller
- Step 5: How many controllers?
- 100 disks / (8 disks / controller) = 12.5 → 13 controllers
- Answer II: 13 controllers, 8 disks each
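Steps 3 to 5 can be checked the same way. This sketch takes the 67 IOPS per disk and the 100-disk choice from the earlier steps as given, again assumes decimal units, and applies the stated limits of 15 disks per SCSI-2 string and 20 controllers on the I/O bus; the names are illustrative.

```python
import math

IO_BYTES = 16e3              # 16 KB per I/O (decimal units assumed)
DISK_IOPS = 67               # per-disk rate taken from Step 2
NUM_DISKS = 100              # 2 GB disks chosen in Step 2
MAX_DISKS_PER_STRING = 15
MAX_CONTROLLERS = 20

# Step 3: controller time per I/O = fixed overhead + transfer over the 20 MB/s string.
t_scsi_s = 1e-3 + IO_BYTES / 20e6
scsi_iops = round(1 / t_scsi_s)

# Step 4: disks one controller can keep busy, capped by the string limit.
disks_per_controller = min(MAX_DISKS_PER_STRING, scsi_iops // DISK_IOPS)

# Step 5: controllers needed for all disks, rounded up.
controllers = math.ceil(NUM_DISKS / disks_per_controller)

print(scsi_iops, disks_per_controller, controllers)
```

The result stays within the 20-controller limit of the I/O bus, so the arrangement is feasible.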
28Thank You
- Best of Luck with Your Exam and Careers