EECE 374: Microprocessor Architecture and Applications Chapter 1 (Part I) - PowerPoint PPT Presentation

Description:

Title: EE 471 (Spring 2000): Computer Design Author: Sunggu Lee Last modified by: Sungjoo.Yoo Created Date: 2/21/2000 5:24:32 AM Document presentation format – PowerPoint PPT presentation

1
EECE 374: Microprocessor Architecture and Applications, Chapter 1 (Part I)
2
Agenda (Chapter 1)
  • Part I (3/13 and 3/15)
  • 3/13
  • A minimum processor
  • 3/15
  • Review on our minimum processor
  • Bus
  • Intel 4004 vs. our minimum processor
  • Memory map
  • Microprocessor performance improvement
  • Part II (3/20)
  • Number system
  • Multiplier and divisor

3
Microprocessor
  • What is it?

(Program Control Unit)
(Data Processing Unit)
4
Microprocessor
  • What is it?
  • What is the function of control, data path,
    memory, and input/output?
  • A minimum set of functions in microprocessor
  • Control: (un)conditional branch
  • Data path: addition of two numbers
  • Results: sum and status (e.g., zero,
    over/underflow)
  • Memory: read and write
  • Data is located with an address
  • Input/output

5
A Minimum Microprocessor
Function | Text representation (with instruction mnemonics) | Binary representation
Conditional branch | BRNZ 10 // if flag Z != 1, go to 10 | 00 10 00 00
Addition | ADD R3 R1 R2 // R3 = R1 + R2; if (R3 == 0) Z = 1 | 01 11 01 10
Read and write | LD R1 00 // read the data at address 00 into R1 | 10 01 00 00
Read and write | ST 01 R3 // store the data in R3 to address 01 | 11 01 00 11

My first program:
00: 5
01: LD R2 00
10: ADD R3 R1 R2
11: BRNZ 01

Binary format:
00: 00000101
01: 10100000
10: 01110110
11: 00010000

R1 is assumed to be initialized to zero.
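The mapping from mnemonics to the 8-bit binary format can be sketched in a few lines of Python. This is only an illustration: the function names `pack` and `encode` are hypothetical, and the field layout (a 2-bit opcode followed by three 2-bit fields) is an assumption read off the table above.

```python
# Field layout assumed from the slide's table: four 2-bit fields, opcode first.
OPCODES = {"BRNZ": 0b00, "ADD": 0b01, "LD": 0b10, "ST": 0b11}

def pack(op, f1=0, f2=0, f3=0):
    """Pack an opcode and three 2-bit fields into one 8-bit word."""
    return (OPCODES[op] << 6) | (f1 << 4) | (f2 << 2) | f3

def encode(text):
    """Encode one mnemonic line, e.g. 'ADD R3 R1 R2' or 'LD R2 00'."""
    op, *args = text.split()
    # Registers are written R<n> (decimal); addresses are written in binary.
    vals = [int(a[1:]) if a.startswith("R") else int(a, 2) for a in args]
    if op == "BRNZ":            # 00 addr 00 00
        return pack(op, vals[0])
    if op == "ADD":             # 01 rd rs1 rs2
        return pack(op, *vals)
    if op == "LD":              # 10 rd addr 00
        return pack(op, vals[0], vals[1])
    if op == "ST":              # 11 addr 00 rs
        return pack(op, vals[0], 0, vals[1])
    raise ValueError(f"unknown mnemonic: {op}")

program = ["LD R2 00", "ADD R3 R1 R2", "BRNZ 01"]
words = [format(encode(line), "08b") for line in program]
print(words)   # ['10100000', '01110110', '00010000']
```

The three words match the binary listing for addresses 01, 10, and 11 on the slide.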

6
A Minimum Microprocessor
Memory
My first program
LD R2 00, i.e., b10100000
00: 5
01: LD R2 00
10: ADD R3 R1 R2
11: BRNZ 10
PC
Control decode branch
Data path adder
Registers R1,R2,R3
0
Zero
PC program counter
7
A Minimum Microprocessor
Memory
My first program
Data at 00, i.e., b00000101
00: 5
01: LD R2 00
10: ADD R3 R1 R2
11: BRNZ 10
PC
Control decode branch
Data path adder
Registers R1,R2,R3
5
Zero
8
A Minimum Microprocessor
Memory
My first program
00: 5
01: LD R2 00
10: ADD R3 R1 R2
11: BRNZ 10
Control decode branch
Data path adder
Registers R1,R2,R3
PC
5
Zero
0
If the result is zero, Zero = 1; else Zero = 0
9
A Minimum Microprocessor
Memory
My first program
00: 5
01: LD R2 00
10: ADD R3 R1 R2
11: BRNZ 10
Control decode branch
Data path adder
Registers R1,R2,R3
PC
Zero
PC = 10 since Zero != 1
10
A Minimum Microprocessor
Memory
My first program
ADD R3 R1 R2, i.e., b01110110
00: 5
01: LD R2 00
10: ADD R3 R1 R2
11: BRNZ 10
Control decode branch
Data path adder
Registers R1,R2,R3
PC
Zero
We restart at 10. We do the same thing,
R3 = R1 + R2 = 0 + 5, over and over again!
11
A Minimum Microprocessor
Function | Text representation (mnemonics) | Binary representation
Conditional branch | BRNZ 10 // if flag Z != 1, go to 10 | 00 10 00 00
Addition | ADD R3 R1 R2 // R3 = R1 + R2; if (R3 == 0) Z = 1 | 01 11 01 10
Read and write | LD R1 00 // read the data at address 00 | 10 01 00 00
Read and write | ST 01 R3 // store the data in R3 to address 01 | 11 01 00 11

My first program!
00: 5
01: LD R2 00
10: ADD R3 R1 R2
11: BRNZ 01

The code size is too small! What if we use a 4-bit address
instead of 2 bits?
12
A Minimum Microprocessor
Function | Text representation (mnemonics) | Binary representation
Conditional branch | BRNZ 0010 // if flag Z != 1, go to 0010 | 00 0010 00 00
Addition | ADD R3 R1 R2 // R3 = R1 + R2; if (R3 == 0) Z = 1 | 01 0011 01 10
Read and write | LD R1 0000 // read the data at address 0000 | 10 0001 00 00
Read and write | ST 0001 R3 // store the data in R3 to address 0001 | 11 0001 00 11

My first program!
0000: 5
0001: LD R2 0000
0010: ADD R3 R1 R2
0011: BRNZ 0001

Let's change my program to do something
meaningful!
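To make the widened 10-bit format concrete, here is a small decoder sketch. The function name `decode` is hypothetical, and the layout it assumes (a 2-bit opcode, then two 4-bit fields, with ADD splitting the last field into two 2-bit source registers) is inferred from the four encodings in the table above rather than stated by the slide.

```python
MNEMONICS = {0b00: "BRNZ", 0b01: "ADD", 0b10: "LD", 0b11: "ST"}

def decode(word):
    """Decode a 10-bit word laid out as opcode(2) | field A(4) | field B(4)."""
    op = MNEMONICS[(word >> 8) & 0b11]
    a = (word >> 4) & 0b1111
    b = word & 0b1111
    if op == "BRNZ":   # A = branch target, B unused
        return f"BRNZ {a:04b}"
    if op == "ADD":    # A = destination, B = two 2-bit source registers
        return f"ADD R{a} R{b >> 2} R{b & 0b11}"
    if op == "LD":     # A = destination register, B = address
        return f"LD R{a} {b:04b}"
    return f"ST {a:04b} R{b}"   # A = address, B = source register

# The four encodings from the table, written with the slide's spacing.
table = ["00 0010 00 00", "01 0011 01 10", "10 0001 00 00", "11 0001 00 11"]
decoded = [decode(int(bits.replace(" ", ""), 2)) for bits in table]
print(decoded)
```

Decoding the table's own bit patterns returns the mnemonics in the middle column, which is a quick consistency check on the assumed layout.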
13
A Meaningful Example
Assumption: R0 is initialized to 0
Assumption: 1 integer = 1 byte
0000: 1
0001: 5 // iterations
0010: -1
0011: XX // result
0100: LD R1 0000
0101: LD R2 0001
0110: LD R3 0010
0111: ADD R0 R0 R1
1000: ADD R2 R2 R3
1001: BRNZ 0111
1010: ST 0011 R0
int a = 1;    // -> R1
int n = 5;    // -> R2
int d = -1;   // -> R3
int r = 0;    // -> R0
while (n != 0) {  // BRNZ
    r = r + a;    // R0 = R0 + R1
    n = n + d;    // R2 = R2 + R3
}
Compiler!
14
A Minimum Microprocessor
Memory
My first program
LD R1 0000, i.e., b10 0001 0000
0000: 1
0001: 5 // iterations
0010: -1
0011: XX // result
0100: LD R1 0000
0101: LD R2 0001
0110: LD R3 0010
0111: ADD R0 R0 R1
1000: ADD R2 R2 R3
1001: BRNZ 0111
1010: ST 0011 R0
Control decode branch
Data path adder
Registers R0,R1,R2,R3
PC
1
0
Zero
15
A Minimum Microprocessor
Memory
My first program
LD R2 0001, i.e., b10 0010 0001
0000: 1
0001: 5 // iterations
0010: -1
0011: XX // result
0100: LD R1 0000
0101: LD R2 0001
0110: LD R3 0010
0111: ADD R0 R0 R1
1000: ADD R2 R2 R3
1001: BRNZ 0111
1010: ST 0011 R0
Control decode branch
Data path adder
Registers R0,R1,R2,R3
5
1
0
PC
Zero
16
A Minimum Microprocessor
Memory
My first program
LD R3 0010, i.e., b10 0011 0010
0000: 1
0001: 5 // iterations
0010: -1
0011: XX // result
0100: LD R1 0000
0101: LD R2 0001
0110: LD R3 0010
0111: ADD R0 R0 R1
1000: ADD R2 R2 R3
1001: BRNZ 0111
1010: ST 0011 R0
Control decode branch
Data path adder
Registers R0,R1,R2,R3
-1
5
1
0
Zero
PC
17
A Minimum Microprocessor
Memory
My first program
ADD R0 R0 R1, i.e., b01 0000 0001
0000: 1
0001: 5 // iterations
0010: -1
0011: XX // result
0100: LD R1 0000
0101: LD R2 0001
0110: LD R3 0010
0111: ADD R0 R0 R1
1000: ADD R2 R2 R3
1001: BRNZ 0111
1010: ST 0011 R0
Control decode branch
Data path adder
Registers R0,R1,R2,R3
-1
5
1
1
Zero
PC
18
A Minimum Microprocessor
Memory
My first program
ADD R2 R2 R3, i.e., b01 0010 1011
0000: 1
0001: 5 // iterations
0010: -1
0011: XX // result
0100: LD R1 0000
0101: LD R2 0001
0110: LD R3 0010
0111: ADD R0 R0 R1
1000: ADD R2 R2 R3
1001: BRNZ 0111
1010: ST 0011 R0
Control decode branch
Data path adder
Registers R0,R1,R2,R3
-1
4
1
1
Zero
0
PC
19
A Minimum Microprocessor
Memory
My first program
BRNZ 0111, i.e., b00 0111 0000
0000: 1
0001: 5 // iterations
0010: -1
0011: XX // result
0100: LD R1 0000
0101: LD R2 0001
0110: LD R3 0010
0111: ADD R0 R0 R1
1000: ADD R2 R2 R3
1001: BRNZ 0111
1010: ST 0011 R0
Control decode branch
Data path adder
Registers R0,R1,R2,R3
-1
4
1
1
Zero
0
PC = 0111 since Zero != 1
PC
20
A Minimum Microprocessor
Memory
My first program
ADD R0 R0 R1, i.e., b01 0000 0001
0000: 1
0001: 5 // iterations
0010: -1
0011: XX // result
0100: LD R1 0000
0101: LD R2 0001
0110: LD R3 0010
0111: ADD R0 R0 R1
1000: ADD R2 R2 R3
1001: BRNZ 0111
1010: ST 0011 R0
Control decode branch
Data path adder
Registers R0,R1,R2,R3
-1
4
1
2
Zero
0
PC
21
A Minimum Microprocessor
Memory
My first program
ADD R2 R2 R3, i.e., b01 0010 1011
0000: 1
0001: 5 // iterations
0010: -1
0011: XX // result
0100: LD R1 0000
0101: LD R2 0001
0110: LD R3 0010
0111: ADD R0 R0 R1
1000: ADD R2 R2 R3
1001: BRNZ 0111
1010: ST 0011 R0
Control decode branch
Data path adder
Registers R0,R1,R2,R3
-1
3
1
2
Zero
0
PC
22
A Minimum Microprocessor
Memory
My first program
BRNZ 0111, i.e., b00 0111 0000
0000: 1
0001: 5 // iterations
0010: -1
0011: XX // result
0100: LD R1 0000
0101: LD R2 0001
0110: LD R3 0010
0111: ADD R0 R0 R1
1000: ADD R2 R2 R3
1001: BRNZ 0111
1010: ST 0011 R0
Control decode branch
Data path adder
Registers R0,R1,R2,R3
-1
0
1
5
Zero
1
PC
After three more iterations
23
A Minimum Microprocessor
Memory
My first program
BRNZ 0111, i.e., b00 0111 0000
0000: 1
0001: 5 // iterations
0010: -1
0011: XX // result
0100: LD R1 0000
0101: LD R2 0001
0110: LD R3 0010
0111: ADD R0 R0 R1
1000: ADD R2 R2 R3
1001: BRNZ 0111
1010: ST 0011 R0
Control decode branch
Data path adder
Registers R0,R1,R2,R3
-1
1
5
0
Zero
1
PC = 1010 since Zero = 1
PC
24
A Minimum Microprocessor
Memory
My first program
ST 0011 R0, i.e., b11 0011 0000
0000: 1
0001: 5 // iterations
0010: -1
0011: 5 // result
0100: LD R1 0000
0101: LD R2 0001
0110: LD R3 0010
0111: ADD R0 R0 R1
1000: ADD R2 R2 R3
1001: BRNZ 0111
1010: ST 0011 R0
Control decode branch
Data path adder
Registers R0,R1,R2,R3
-1
1
5
0
Zero
1
PC
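The walk-through on the preceding slides can be replayed with a small simulator. This is a minimal sketch under stated assumptions: instructions are kept as Python tuples instead of 10-bit words, execution starts at address 0100, and the machine simply stops when the PC runs past the last instruction (the slides do not define a halt). The names `run` and `memory` are hypothetical.

```python
def run(memory, max_steps=1000):
    """Execute until the PC leaves memory; returns (registers, steps)."""
    regs = [0, 0, 0, 0]      # R0..R3; R0 starts at 0 as the slides assume
    zero = 0                 # Zero flag, updated only by ADD
    pc = 0b0100              # first instruction; 0000 to 0011 hold data
    steps = 0
    while pc < len(memory) and steps < max_steps:
        op, *args = memory[pc]
        pc += 1
        steps += 1
        if op == "LD":                      # LD Rd addr
            rd, addr = args
            regs[rd] = memory[addr]
        elif op == "ADD":                   # ADD Rd Rs1 Rs2, sets Zero
            rd, rs1, rs2 = args
            regs[rd] = regs[rs1] + regs[rs2]
            zero = 1 if regs[rd] == 0 else 0
        elif op == "BRNZ":                  # branch if Zero != 1
            if zero != 1:
                pc = args[0]
        elif op == "ST":                    # ST addr Rs
            addr, rs = args
            memory[addr] = regs[rs]
    return regs, steps

memory = [
    1, 5, -1, None,                                   # data area
    ("LD", 1, 0b0000), ("LD", 2, 0b0001), ("LD", 3, 0b0010),
    ("ADD", 0, 0, 1), ("ADD", 2, 2, 3), ("BRNZ", 0b0111),
    ("ST", 0b0011, 0),                                # code area
]
regs, steps = run(memory)
print(memory[0b0011], regs, steps)   # 5 [5, 1, 0, -1] 19
```

The result 5 lands at address 0011 after 19 instructions: three loads, five passes through the three-instruction loop, and one store.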
25
Empirical Summary
0000: 1
0001: 5 // iterations
0010: -1
0011: 5 // result
0100: LD R1 0000
0101: LD R2 0001
0110: LD R3 0010
0111: ADD R0 R0 R1
1000: ADD R2 R2 R3
1001: BRNZ 0111
1010: ST 0011 R0
Address space
26
Empirical Summary
0000: 1
0001: 5 // iterations
0010: -1
0011: 5 // result
0100: LD R1 0000
0101: LD R2 0001
0110: LD R3 0010
0111: ADD R0 R0 R1
1000: ADD R2 R2 R3
1001: BRNZ 0111
1010: ST 0011 R0
Data area
Who decides data and code area?
Code area
27
Empirical Summary
Your program!
0000: 1
0001: 5 // iterations
0010: -1
0011: X // result
0100: LD R1 0000
0101: LD R2 0001
0110: LD R3 0010
0111: ADD R0 R0 R1
1000: ADD R2 R2 R3
1001: BRNZ 0111
1010: ST 0011 R0
int a = 1;    // -> R1
int n = 5;    // -> R2
int d = -1;   // -> R3
int r = 0;    // -> R0
while (n != 0) {  // BRNZ
    r = r + a;    // R0 = R0 + R1
    n = n + d;    // R2 = R2 + R3
}
Data area
Code area
Compiler!
28
Empirical Summary
Function | Text representation (mnemonics) | Binary representation
Conditional branch | BRNZ 0010 // if flag Z != 1, go to 0010 | 00 0010 00 00
Addition | ADD R3 R1 R2 // R3 = R1 + R2; if (R3 == 0) Z = 1 | 01 0011 01 10
Read and write | LD R1 0000 // read the data at address 0000 | 10 0001 00 00
Read and write | ST 0001 R3 // store the data in R3 to address 0001 | 11 0001 00 11

Operation code (op code)
Instruction code
Operands: register, address
29
What We Need to Know In This Class
  • Op code
  • Given n instructions → log2(n) bits for the op code.
    Is that all?
  • Operands
  • How to represent a 64b address with a 32b address
    field? → Addressing modes!
  • Control flow
  • How to execute function calls?
  • Data
  • Data types: how to represent a floating-point number,
    e.g., 3.14159?
  • Operations: is addition enough? Subtraction,
    multiplication, division, AND, OR, ...
  • Memory allocation, e.g., malloc, free, ... How do
    they work?
  • (Physical) memory
  • How to read/write data from/to memory?
  • Level 1 and 2 cache, DRAM, ... What are they for?
    How do they work?
  • How to integrate everything to make an instruction work?
  • Internal structure of a microprocessor, e.g.,
    address/data/control bus, I/O, ...
  • Advanced topics
  • Interrupts, pipelining, branch prediction, DMA
    (direct memory access), SIMD and MMX, ...
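As a quick illustration of the op code question above, the number of bits needed for n distinct instructions is log2(n) rounded up. The helper name `opcode_bits` is hypothetical; the sketch uses integer arithmetic to avoid floating-point rounding.

```python
def opcode_bits(n_instructions):
    """Smallest field width that can give every instruction a distinct op code."""
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1; keep at least 1 bit.
    return max(1, (n_instructions - 1).bit_length())

for n in (4, 5, 45):
    print(n, "instructions ->", opcode_bits(n), "bits")
```

Four instructions (BRNZ, ADD, LD, ST) fit in 2 bits, a fifth instruction forces 3 bits, and 45 instructions need 6 bits.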

30
Agenda (Chapter 1)
  • Part I (3/13 and 3/15)
  • 3/13
  • A minimum processor
  • 3/15
  • Review on our minimum processor
  • Bus
  • Intel 4004 vs. our minimum processor
  • Memory map
  • Microprocessor performance improvement
  • Part II (3/20)
  • Number system
  • Multiplier and divisor

31
A Minimum Microprocessor: How to Access Memory?
Memory
My first program
LD R2 00, i.e., b10100000
00: 5
01: LD R2 00
10: ADD R3 R1 R2
11: BRNZ 10
PC
Control decode branch
Data path adder
Registers R1,R2,R3
0
Zero
Is this enough? How do we tell memory to give us b10100000
at 0001? Contract with memory: in the 1st cycle, send the address
with RD or WR = 1; in the 2nd cycle, transfer the data.
8b
Processor
Memory
RD
WR
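The two-cycle contract can be modeled as a toy class. This is a sketch only: the class name `Memory` and its methods `cycle1` and `cycle2` are hypothetical, and it assumes a 2-bit address into the four-entry memory of the first program.

```python
class Memory:
    """Toy memory that follows the two-cycle contract from the slide."""

    def __init__(self, cells):
        self.cells = list(cells)
        self.pending = None          # (address, is_read) latched in cycle 1

    def cycle1(self, address, rd, wr):
        """Cycle 1: latch the address and the command (RD or WR = 1)."""
        if rd == wr:
            raise ValueError("exactly one of RD and WR must be 1")
        self.pending = (address, rd == 1)

    def cycle2(self, data=None):
        """Cycle 2: data moves; returns the cell on a read, stores on a write."""
        address, is_read = self.pending
        self.pending = None
        if is_read:
            return self.cells[address]
        self.cells[address] = data
        return None

mem = Memory([0b00000101, 0b10100000, 0b01110110, 0b00100000])
mem.cycle1(0b01, rd=1, wr=0)          # cycle 1: address 01, RD = 1
word = mem.cycle2()                   # cycle 2: memory returns the data
print(format(word, "08b"))            # 10100000
```

A write uses the same two steps with WR = 1 in cycle 1 and the data supplied in cycle 2.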
32
A Minimum Microprocessor: How to Access Memory?
Memory
My first program
I want to read from 0001
00: 5
01: LD R2 00
10: ADD R3 R1 R2
11: BRNZ 10
PC
Control decode branch
Data path adder
Registers R1,R2,R3
0
Zero
Cycle 1: the processor drives address 0001 on the 8b bus to memory, with RD = 1 and WR = 0
33
A Minimum Microprocessor: How to Access Memory?
Memory
My first program
LD R2 00, i.e., b10100000
00: 5
01: LD R2 00
10: ADD R3 R1 R2
11: BRNZ 10
PC
Control decode branch
Data path adder
Registers R1,R2,R3
0
Zero
Cycle 2: memory returns data 10100000 on the 8b bus to the processor, with RD = 0 and WR = 0
34
A Minimum Microprocessor: What if You Want to See
the Result?
Memory
My first program
ST 0011 R0, i.e., b11 0011 0000
0000: 1
0001: 5 // iterations
0010: -1
0011: 5 // result
0100: LD R1 0000
0101: LD R2 0001
0110: LD R3 0010
0111: ADD R0 R0 R1
1000: ADD R2 R2 R3
1001: BRNZ 0111
1010: ST 0011 R0
Control decode branch
Data path adder
Registers R0,R1,R2,R3
-1
1
5
0
Zero
1
PC
35
A Minimum Microprocessor: New Instruction for
Output!
Memory
My first program
OUT 0001 R0, i.e., b100 0001 0000
0000: 1
0001: 5 // iterations
0010: -1
0011: XX // result
0100: LD R1 0000
0101: LD R2 0001
0110: LD R3 0010
0111: ADD R0 R0 R1
1000: ADD R2 R2 R3
1001: BRNZ 0111
1010: OUT 0001 R0
Control decode branch
Data path adder
Registers R0,R1,R2,R3
5
Zero
I/O device (I/O address 0001)
PC
36
Bus System
Buses used in a computer system:
  • Address bus
  • Data bus
  • Control bus: MWTC (memory write), MRDC (memory read), IOWC (I/O
    write), IORC (I/O read)
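One way to picture the four control signals is as a selector between two address spaces that share the same address and data buses. The sketch below is a simplification with a hypothetical function name `bus_cycle`; it ignores timing and treats each transfer as a single step.

```python
def bus_cycle(signal, address, memory, io_ports, data=None):
    """Perform one transfer selected by a control-bus signal."""
    if signal == "MRDC":            # memory read
        return memory[address]
    if signal == "MWTC":            # memory write
        memory[address] = data
    elif signal == "IORC":          # I/O read
        return io_ports[address]
    elif signal == "IOWC":          # I/O write
        io_ports[address] = data
    else:
        raise ValueError(f"unknown control signal: {signal}")
    return None

memory = [0] * 16
io_ports = [0] * 16
bus_cycle("MWTC", 0b0001, memory, io_ports, data=5)
bus_cycle("IOWC", 0b0001, memory, io_ports, data=9)
print(memory[1], io_ports[1])   # 5 9
```

The same address 0001 reaches a memory cell or an I/O device depending only on which control line is asserted, which is why an I/O address can coincide with a memory address.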
37
The Minimum Microprocessor is
  • Similar to Intel 4004 (1971)

Memory
Control decode branch
Data path adder
Registers R1,R2,R3
Zero
38
Intel 4004 vs. Our Minimum Processor
Item | Intel 4004 | Ours | Comments
Data bit width in memory | 4b | 8b → 10b | Why did we change it to 10b?
Instruction bit width | 8b (45 instructions) | 8b → 10b (4 instructions, i.e., BRNZ, ADD, LD, ST, since 2b for instructions) | How to support an 8b instruction with 4b data in the 4004?
Address bit width | 12b (4K entries of 4b data) | 4b (16 entries of 10b data) | How to support a 12b address with an 8b instruction in the 4004?
39
Intel 4004
  • 4b data I/O

40
Memory Space Comparison
4004: 12b address, 4K entries of 4b data
Ours: 4b address, 16 entries of 10b data
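The entry counts follow directly from the address width: an n-bit address selects one of 2^n entries. A short sketch (the helper name `capacity` is hypothetical) reproduces the two figures:

```python
def capacity(address_bits, data_bits):
    """Return (number of entries, total bits) for a memory."""
    entries = 2 ** address_bits
    return entries, entries * data_bits

print("4004:", capacity(12, 4))   # (4096, 16384)
print("Ours:", capacity(4, 10))   # (16, 160)
```

So the 4004 addresses 4K entries (16,384 bits in total), while the minimum processor addresses only 16 entries (160 bits).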
41
Memory Access
4b
Fetching an 8b instruction may take 8 cycles
10100000
1010
000011010001
0000
12b address 4K entries of 4b data
4b
Cycle 1
0000
Processor
Memory
1
RD
WR
0
4004
42
Memory Access
4b
Fetching an 8b instruction may take 8 cycles
10100000
1010
000011010001
0000
12b address 4K entries of 4b data
4b
Cycle 2
1101
Processor
Memory
0
RD
WR
0
4004
43
Memory Access
4b
Fetching an 8b instruction may take 8 cycles
10100000
1010
000011010001
0000
12b address 4K entries of 4b data
4b
Cycle 3
0001
Processor
Memory
0
RD
WR
0
4004
44
Memory Access
4b
Fetching an 8b instruction may take 8 cycles
10100000
1010
000011010001
0000
12b address 4K entries of 4b data
4b
Cycle 4
0000
Processor
Memory
0
RD
WR
0
4004
45
Memory Access
4b
10100000
Fetching an 8b instruction may take 8 cycles
000011010010
1010
0000
12b address 4K entries of 4b data
4b
Cycles 5 to 8
0000
Processor
Memory
0
RD
WR
0
4004
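The 8-cycle figure can be reproduced with a simple model of the slides' bus, not of the real 4004's timing. The assumptions are that the 12b address is sent 4 bits per cycle (3 cycles), one more cycle returns 4 bits of data, and an 8b instruction occupies two consecutive 4b entries. The function names are hypothetical.

```python
def fetch_nibble(memory, address, bus_width=4, address_bits=12):
    """Return (nibble, cycles) for one read over a shared narrow bus."""
    address_cycles = address_bits // bus_width     # 12b address / 4b bus = 3
    return memory[address], address_cycles + 1     # plus one data cycle

def fetch_instruction(memory, address):
    """Fetch an 8b instruction stored as two consecutive 4b entries."""
    high, c1 = fetch_nibble(memory, address)
    low, c2 = fetch_nibble(memory, address + 1)
    return (high << 4) | low, c1 + c2

# Memory contents taken from the slides: 1010 and 0000 at consecutive addresses.
memory = {0b000011010001: 0b1010, 0b000011010010: 0b0000}
instruction, cycles = fetch_instruction(memory, 0b000011010001)
print(format(instruction, "08b"), cycles)   # 10100000 8
```

Under these assumptions, cycles 1 to 4 fetch the upper half 1010 and cycles 5 to 8 fetch the lower half 0000.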
52
Bus and Memory Sizes
1 in address corresponds to one byte
53
Data Width and Address Width I
8b data
16b data
Memory size, how much?
10b address
10b address
How many bits are transferred per cycle?
I/O
I/O
54
Data Width and Address Width II
8b data
16b data
10b address
10b address
0000000011
6
7
4
5
0000000010
2
3
0000000001
What if we want to access data at address 3?
0
1
0000000000
I/O
I/O
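The two questions on these slides (how large is the memory, and where does byte 3 live) can be worked through numerically. This sketch assumes one address unit per memory row and consecutive byte numbering within a row, as the diagram suggests; the helper names are hypothetical.

```python
def memory_size_bytes(address_bits, data_bits):
    """Total capacity when every address selects one data_bits-wide row."""
    return (2 ** address_bits) * data_bits // 8

def locate_byte(byte_number, data_bits):
    """Map a byte number to (row address, byte position within the row)."""
    return divmod(byte_number, data_bits // 8)

print(memory_size_bytes(10, 8))    # 1024 bytes with 8b data
print(memory_size_bytes(10, 16))   # 2048 bytes with 16b data

row, position = locate_byte(3, 16)
print(format(row, "010b"), position)   # 0000000001 1
```

With 16b data, byte 3 sits in the row at address 0000000001 alongside byte 2, so reading it transfers the whole 16b row and the processor picks out one half.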
55
Physical Memory System
Pentium
56
Memory Map of Intel PC
TPA transient program area
57
Intel Processor History
  • 4004: 50 KIPS (kilo-instructions per second), 4K x
    4b memory, 45 instructions
  • 8008 (1971): 8b data, 16KB memory
  • 8080 (1973): 500 KIPS, 64KB memory
  • MITS Altair 8800, BASIC language interpreter
    developed by Bill Gates and Paul Allen
  • 8085: 769 KIPS, 246 instructions
  • 8086 (1978): 2.5 MIPS (million instructions per
    second), 1MB memory, 16b data, 4B or 6B
    instruction cache, multiply/divide instructions,
    20,000 variations of instructions → CISC (complex
    instruction set computer)
  • 80286: 4 MIPS, 16MB

58
Intel Processor History (Contd)
  • 80386 (1986): 4GB (32b address), 32b data (good
    for floating-point numbers), memory management
    unit in hardware
  • 80486 (1989): 8KB cache, 1-cycle instructions
    (50%), 50MHz
  • Pentium (1993): 64b data, 16KB cache (I+D),
    100-233MHz, superscalar (two instructions/cycle),
    branch (jump) prediction
  • Pentium Pro (1995): floating-point unit, 166MHz,
    Level 2 cache (256KB), three instructions/cycle,
    64GB (36b address)
  • Pentium II (1997): 266-333MHz, bus speed-up
    (66MHz → 100MHz)
  • Pentium III: 1GHz, bus 100-133MHz
  • Pentium 4 and Core2: 3.2GHz, DDR SDRAM, 256KB-4MB
    Level 2 cache, 64b address (in reality, 40b used
    for 1TB)

59
Performance Improvement in Intel Processor History
  • How could they improve performance?
  • How to address the gap between processor and
    memory speed?

[Figure: performance (log scale, 1 to 1000) versus time (1980 to 2000). CPU (µProc): 60%/year (Moore's Law). DRAM: 7%/year. The processor-memory performance gap grows 50%/year.]
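The 50%/year gap follows from the two growth rates: if processor performance is multiplied by 1.60 each year and DRAM performance by 1.07, their ratio is multiplied by 1.60 / 1.07 each year. A short check, with hypothetical variable names:

```python
cpu_growth = 1.60    # processor performance multiplier per year (60%/year)
dram_growth = 1.07   # DRAM performance multiplier per year (7%/year)

# The gap is the ratio of the two curves, so its yearly multiplier is the ratio
# of the yearly multipliers.
gap_growth = cpu_growth / dram_growth
print(f"gap grows by about {100 * (gap_growth - 1):.0f}% per year")

def gap_after(years):
    """Relative processor/DRAM gap after the given number of years."""
    return gap_growth ** years
```

Because the gap compounds, `gap_after(n)` grows exponentially with n, which is why caches and prefetching become more important with every processor generation.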
60
Microprocessor Performance Improvement
  • Higher clock frequency
  • Larger cache
  • Multi-level cache
  • Superscalar
  • Simultaneous multi-threading (SMT), or
    HyperThreading

61
High Clock Frequency
  • Low latency logic design
  • E.g., logic optimization

a = b AND c AND d AND e
Chained gates, ((b AND c) AND d) AND e: latency 3
Balanced tree, (b AND c) AND (d AND e): latency 2
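The logic optimization above trades a chain of gates for a balanced tree without changing the function. A brief sketch, assuming 2-input AND gates with one unit of delay each (the function names are hypothetical), checks both the equivalence and the latencies:

```python
from itertools import product

def and_chain(b, c, d, e):
    """((b AND c) AND d) AND e: three gates in series."""
    return ((b & c) & d) & e

def and_tree(b, c, d, e):
    """(b AND c) AND (d AND e): two levels of gates."""
    return (b & c) & (d & e)

def chain_latency(n_inputs):
    """Gate delays on the longest path of a chain of 2-input gates."""
    return n_inputs - 1

def tree_latency(n_inputs):
    """Gate delays of a balanced tree: ceil(log2(n_inputs))."""
    return (n_inputs - 1).bit_length()

same = all(and_chain(*bits) == and_tree(*bits)
           for bits in product((0, 1), repeat=4))
print(same, chain_latency(4), tree_latency(4))   # True 3 2
```

The shorter critical path is what allows a higher clock frequency, since the clock period must cover the slowest path between registers.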
62
A Minimum Microprocessor
Memory
My first program
LD R3 0010, i.e., b10 0011 0010
0000: 1
0001: 5 // iterations
0010: -1
0011: XX // result
0100: LD R1 0000
0101: LD R2 0001
0110: LD R3 0010
0111: ADD R0 R0 R1
1000: ADD R2 R2 R3
1001: BRNZ 0111
1010: ST 0011 R0
Control decode branch
Data path adder
Registers R0,R1,R2,R3
-1
5
1
0
Zero
PC
63
A Minimum Microprocessor
Memory
My first program
ADD R0 R0 R1, i.e., b01 0000 0001
0000: 1
0001: 5 // iterations
0010: -1
0011: XX // result
0100: LD R1 0000
0101: LD R2 0001
0110: LD R3 0010
0111: ADD R0 R0 R1
1000: ADD R2 R2 R3
1001: BRNZ 0111
1010: ST 0011 R0
Control decode branch
Data path adder
Registers R0,R1,R2,R3
-1
5
1
1
Zero
Instruction fetch
PC
Addition
Time
Clock rising edge
64
Pipelining in A Minimum Microprocessor
Memory
My first program
ADD R0 R0 R1, i.e., b01 0000 0001
0000: 1
0001: 5 // iterations
0010: -1
0011: XX // result
0100: LD R1 0000
0101: LD R2 0001
0110: LD R3 0010
0111: ADD R0 R0 R1
1000: ADD R2 R2 R3
1001: BRNZ 0111
1010: ST 0011 R0
Control decode branch
Data path adder
Registers R0,R1,R2,R3
-1
5
1
1
Zero
PC
Timing diagram (time runs left to right; stages advance on each clock rising edge):
Instruction fetch: 0111, 1000, 1001
Addition: R0 = R0 + R1, R2 = R2 + R3, ???
2x frequency!
65
Pipelining Requires Pipeline Registers and More
Memory
My first program
ADD R0 R0 R1, i.e., b01 0000 0001
0000: 1
0001: 5 // iterations
0010: -1
0011: XX // result
0100: LD R1 0000
0101: LD R2 0001
0110: LD R3 0010
0111: ADD R0 R0 R1
1000: ADD R2 R2 R3
1001: BRNZ 0111
1010: ST 0011 R0
Control decode branch
Data path adder
Registers R0,R1,R2,R3
-1
5
1
1
Zero
PC
Timing diagram (time runs left to right; stages advance on each clock rising edge):
Instruction fetch: 0111, 1000, 1001
Addition: R0 = R0 + R1, R2 = R2 + R3, ???
2x frequency!
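The "2x frequency" claim can be checked with a small timing model. This is a sketch under stated assumptions: there are two stages (instruction fetch and addition), the pipelined clock period is set by the slower stage, and branches are ignored (the "???" after the BRNZ fetch is exactly the case this model leaves out). The function names are hypothetical.

```python
def unpipelined_time(n_instructions, t_fetch, t_add):
    """Each instruction does fetch then add within one long clock period."""
    return n_instructions * (t_fetch + t_add)

def pipelined_time(n_instructions, t_fetch, t_add):
    """Fetch of instruction i+1 overlaps the add of instruction i."""
    period = max(t_fetch, t_add)          # clock limited by the slower stage
    return (n_instructions + 1) * period  # one extra period to fill the pipe

for n in (2, 1000):
    slow = unpipelined_time(n, 1, 1)
    fast = pipelined_time(n, 1, 1)
    print(n, slow, fast, round(slow / fast, 3))
```

With equal stage delays the speedup approaches 2 for long instruction sequences. The pipeline register between the stages is what lets the fetch stage move on while the adder is still working on the previous instruction.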
66
Working with Slow Memory
  • Cache
  • Prefetch in 16b instruction case

16b
Processor
Memory
1ns
2ns
IPC (instructions per cycle) = 1/3
Instruction from memory
0000
0010
0100
1 instruction per 3 cycles
Processor
Time
67
Working with Slow Memory
  • Cache
  • Prefetch in 16b instruction case

32b
16b
Processor
Cache
Memory
1ns
2ns
0ns
IPC (instructions per cycle) = 2/3
Instruction from memory
0000
0100
1000
2 instructions per 3 cycles
Processor
0000
0100
1000
0010
0110
1010
Time
68
Working with Slow Memory
  • Cache
  • Long memory latency degrades performance

32b
16b
Processor
Cache
Memory
1ns
3ns
0ns
IPC (instructions per cycle) = 2/5
Instruction from memory
0000
0100
2 instructions per 5 cycles
Processor
0000
0100
0010
0110
Time
69
Working with Slow Memory
  • If there is access locality (e.g., addresses 0000 to 0110 are
    repeatedly accessed), a larger cache improves
    performance through reuse

0100
0110
32b
16b
0000
0010
Processor
Memory
Cache
1ns
3ns
0ns
Instruction from memory
0000
0100
IPC = 1
Processor
0000
0100
0010
0110
0000
0100
0010
0110
Time
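The effect of locality can be illustrated with a toy cache simulation. The assumptions are mine, not the slide's: 16b instructions at addresses 0000, 0010, 0100, 0110 executed in a loop, 32b (4-byte) cache lines, and least-recently-used replacement. The function name `hit_rate` is hypothetical.

```python
from collections import OrderedDict

def hit_rate(addresses, capacity_lines, line_bytes=4):
    """Fraction of accesses served from an LRU cache of the given size."""
    cache = OrderedDict()          # line number -> None, oldest first
    hits = 0
    for address in addresses:
        line = address // line_bytes
        if line in cache:
            hits += 1
            cache.move_to_end(line)          # mark as most recently used
        else:
            cache[line] = None
            if len(cache) > capacity_lines:
                cache.popitem(last=False)    # evict least recently used
    return hits / len(addresses)

loop = [0b0000, 0b0010, 0b0100, 0b0110] * 100
print(hit_rate(loop, capacity_lines=1))   # 0.5
print(hit_rate(loop, capacity_lines=2))   # 0.995
```

With room for only one line, every second fetch still goes to slow memory. With two lines the whole loop fits, so after the first pass every fetch hits in the cache and IPC can approach 1.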
70
A Minimum Microprocessor
Memory
My first program
ADD R0 R0 R1, i.e., b01 0000 0001
0000: 1
0001: 5 // iterations
0010: -1
0011: XX // result
0100: LD R1 0000
0101: LD R2 0001
0110: LD R3 0010
0111: ADD R0 R0 R1
1000: ADD R2 R2 R3
1001: BRNZ 0111
1010: ST 0011 R0
Control decode branch
Data path adder
Registers R0,R1,R2,R3
-1
5
1
1
Zero
PC
71
A Minimum Microprocessor
Memory
My first program
ADD R2 R2 R3, i.e., b01 0010 1011
0000: 1
0001: 5 // iterations
0010: -1
0011: XX // result
0100: LD R1 0000
0101: LD R2 0001
0110: LD R3 0010
0111: ADD R0 R0 R1
1000: ADD R2 R2 R3
1001: BRNZ 0111
1010: ST 0011 R0
Control decode branch
Data path adder
Registers R0,R1,R2,R3
-1
4
1
1
Zero
0
PC
72
Superscalar Operation in Our Minimum
Microprocessor
Memory
My first program
ADD R0 R0 R1, i.e., b01 0000 0001
0000: 1
0001: 5 // iterations
0010: -1
0011: XX // result
0100: LD R1 0000
0101: LD R2 0001
0110: LD R3 0010
0111: ADD R0 R0 R1
1000: ADD R2 R2 R3
1001: BRNZ 0111
1010: ST 0011 R0
Control decode branch
Data path adder
Registers R0,R1
1
1
Zero
0
PC
Data path adder
Registers R2,R3
-1
4
Zero
0
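Two instructions can run side by side on the two data paths only when neither depends on the other. The sketch below is a deliberately conservative register check with a hypothetical function name; it assumes each data path has its own Zero flag, as the diagram suggests, and does not model the branch that follows.

```python
def can_dual_issue(first, second):
    """True when two ADDs, given as (rd, rs1, rs2), share no register hazard."""
    rd1, *sources1 = first
    rd2, *sources2 = second
    read_after_write = rd1 in sources2    # second reads what first writes
    write_after_read = rd2 in sources1    # second overwrites a source of first
    write_after_write = rd1 == rd2        # both write the same register
    return not (read_after_write or write_after_read or write_after_write)

add_r0 = (0, 0, 1)   # ADD R0 R0 R1
add_r2 = (2, 2, 3)   # ADD R2 R2 R3
print(can_dual_issue(add_r0, add_r2))      # True
print(can_dual_issue(add_r0, (2, 0, 3)))   # False: reads R0 written by the first
```

ADD R0 R0 R1 touches only R0 and R1 while ADD R2 R2 R3 touches only R2 and R3, so the two additions of the loop body can execute in the same cycle.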