Title: Processor architectures
1. Processor architectures
- SYSC5603 (ELG6163) Digital Signal Processing Microprocessors, Software and Applications
- Miodrag Bolic
2. Overview
- Introduction
- Basic structure of a processor
- Basic Operations
- Pipelining
- Registers
- Example design on an application-specific processor
- General purpose processors
- Example of FIR on a general purpose processor
- Datapath of a MIPS processor
3. What is Computer Architecture?
[Figure: layers of abstraction, from top to bottom: Application; Operating System; Compiler; Firmware; Instruction Set Architecture; Instr. Set Proc. and I/O system; Datapath and Control; Digital Design; Circuit Design; Layout]
- Coordination of many levels of abstraction
- Under a rapidly changing set of forces
4. Levels of abstraction
- Delving into the depths reveals more information
- An abstraction omits unneeded detail and helps us cope with complexity
5. Basic Structure of a Computer
References: [Patterson04]
6. Basic Structure of a Computer (2)
- Input Unit
  - Keyboards, joysticks, trackballs, microphones and mice
- Output Unit
  - Printers and graphic displays
- Memory Unit
  - Primary (cache, RAM) and secondary (HDD, CD-ROM, tape drives)
- Arithmetic and Logic Unit (ALU)
  - Operations are executed here and the results are stored in fast-access registers
- Control Unit (CU)
  - Provides control to all other units, including timing signals
References: [Patterson04]
7. Basic Operation of a Computer
- The computer accepts information in the form of programs and data through an input unit and stores it in memory
- The information stored in memory is fetched, under program control, and processed in an ALU
- The processed information leaves the computer through an output unit
- All activities inside the computer are directed by the control unit
References: [Patterson04]
8. Detailed Instruction Cycle
Copied from [Patterson04]
9. Detailed Instruction Cycle (2)
- Instruction address calculation
  - Determines the address of the next instruction to be executed
- Instruction fetch
  - Reads the instruction from its memory location into the processor
- Instruction operation decoding
  - Analyzes the instruction to determine the type of operation to be performed and the operand(s) to be used
- Operand address calculation
  - Determines the address of the operand (if needed)
- Operand fetch
  - Fetches the operand from memory or reads it from I/O
- Data operation
  - Performs the operation indicated in the instruction
- Operand store
  - Writes the results into memory or out to I/O
References: [Patterson04]
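The steps of the instruction cycle can be sketched as a software loop. The toy accumulator machine below is hypothetical: its 16-bit instruction layout, the opcodes `OP_HALT`, `OP_LOAD`, `OP_ADD`, `OP_STORE`, and the function `run` are assumptions made for illustration and are not taken from [Patterson04].

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy accumulator machine, for illustration only. */
enum { OP_HALT = 0, OP_LOAD = 1, OP_ADD = 2, OP_STORE = 3 };

#define MEM_WORDS 256

/* Runs the program stored in mem, starting at address 0, and returns the
   final accumulator value. Each instruction is one 16-bit word:
   upper 8 bits = opcode, lower 8 bits = operand address. */
uint16_t run(uint16_t mem[MEM_WORDS])
{
    assert(mem != NULL);
    uint8_t  pc  = 0;   /* address of the next instruction */
    uint16_t acc = 0;   /* accumulator holding the result  */

    for (;;) {
        uint8_t  iaddr  = pc;                    /* instruction address calculation */
        uint16_t ir     = mem[iaddr];            /* instruction fetch */
        pc = (uint8_t)(pc + 1);
        uint8_t  opcode = (uint8_t)(ir >> 8);    /* instruction operation decoding */
        uint8_t  oaddr  = (uint8_t)(ir & 0xFF);  /* operand address calculation */

        switch (opcode) {
        case OP_LOAD:                            /* operand fetch */
            acc = mem[oaddr];
            break;
        case OP_ADD:                             /* operand fetch + data operation */
            acc = (uint16_t)(acc + mem[oaddr]);
            break;
        case OP_STORE:                           /* operand store */
            mem[oaddr] = acc;
            break;
        default:                                 /* OP_HALT or unknown opcode */
            return acc;
        }
    }
}
```

A real processor performs these steps in hardware, and several of them can overlap in time; the loop only shows their logical order.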
10. Fast, Pipelined Instruction Interpretation
[Figure: pipelined instruction interpretation over time; stage registers: Instruction Address, Instruction Register, Operand Registers, Result Registers, Registers or Mem]
Copied from [Culler-Slides]
11. Visualizing Pipelining
[Figure: pipeline diagram; horizontal axis: time (clock cycles); vertical axis: instruction order]
Copied from [Culler-Slides]
12. Fast-Access Registers
- These are some of the registers that help us in the execution of programs
- Instruction Register (IR)
  - Holds the instruction that is currently being executed
- Program Counter (PC)
  - Contains the memory address of the next instruction to be fetched and executed
- n general-purpose registers (GPRs)
  - Numbered R0 through R(n-1)
  - Used to store results of general-purpose operations
- Stack Pointer (SP)
  - Holds the address of the top of the stack
- Memory Address Register (MAR)
  - Holds the address of the location to be accessed
- Memory Data Register (MDR)
  - Holds the data to be written into or read out of the addressed location
References: [Hamacher01]
13. Connections between the processor and the memory
Copied from [Hamacher01]
14. Basic ISA Classes
[Figure: basic ISA classes; operand labels C, A, B]
Copied from [Meerbergen-Slides]
15. Terminology
- Performance
  - Time
  - MIPS (millions of instructions per second)
  - MFLOPS (millions of floating-point operations per second)
  - Cycles per Instruction (CPI)
- Architectures
  - RISC: Reduced Instruction Set Computer
  - CISC: Complex Instruction Set Computer
  - Scalar
  - Superscalar
  - Very long instruction word (VLIW)
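The performance terms are related by the standard textbook identities below. The symbols $T$ (execution time), $IC$ (instruction count) and $f_{clk}$ (clock frequency) are notation introduced here for illustration; they do not appear on the slides.

```latex
T = \frac{IC \times CPI}{f_{clk}},
\qquad
\mathrm{MIPS} = \frac{IC}{T \times 10^{6}} = \frac{f_{clk}}{CPI \times 10^{6}}
```

As a worked example under these definitions, a processor clocked at 100 MHz with CPI = 2 would deliver 100e6 / (2 x 1e6) = 50 MIPS.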
16. Comparison: CISC, RISC, VLIW
Copied from [Philips]
17. Sequential application-specific processor
- A processor tuned only for a particular application
- Can be used for low-power implementations
- Word lengths can be adjusted to the current problem
- Example: FIR filter
18. Direct form FIR filter
Copied from [Wanhammer99]
19. Transposed FIR
Copied from [Wanhammer99]
20. Assignment
- Design an N-tap transposed linear-phase FIR filter as a sequential application-specific processor. Use only one multiplier and show how the processing time can be halved.
- Hint: design a transposed FIR filter structure as in the previous slide, but allow for generating the sums in reversed order: PS_{N-1}, PS_{N-2}, ..., PS_1, y(n).
Copied from [Wanhammer99]
21. General purpose processor architecture
- FIR example
- We will study RISC architectures
- Single-cycle processor
- Implementation of add and load instructions
- Pipelined implementation
- Why do all instructions have the same number of cycles?
22. Example: Digital Filtering
- The basic FIR filter equation is
  y[n] = \sum_{k=0}^{N-1} h[k] \, x[n-k]
- where h[k] is an array of constants
- In C language:
  for (n = 0; n < N; n++) {
      y[n] = 0;
      for (k = 0; k < N; k++)   // inner loop
          y[n] = y[n] + h[k] * x[n - k];
  }
- Only Multiply and Accumulate (MAC) is needed!
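The loop on this slide reads x[n-k] even when n is smaller than k. A minimal runnable sketch follows, assuming that samples before the start of the input are zero; the function name `fir` and its parameters are chosen here for illustration and are not part of the slides.

```c
#include <assert.h>
#include <stddef.h>

/* Direct-form FIR: y[n] = sum over k of h[k] * x[n-k], k = 0..taps-1.
   Assumption: samples before the start of x (n - k < 0) are zero,
   so the inner loop stops at k == n for the first few outputs. */
void fir(const double *h, size_t taps,
         const double *x, double *y, size_t len)
{
    assert(h != NULL && x != NULL && y != NULL);
    for (size_t n = 0; n < len; n++) {
        double acc = 0.0;
        for (size_t k = 0; k < taps && k <= n; k++) {
            acc += h[k] * x[n - k];   /* one multiply-accumulate (MAC) */
        }
        y[n] = acc;
    }
}
```

Feeding a unit impulse through the filter returns the coefficients h[0], h[1], ... in order, which is a quick way to check the indexing.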
23. MAC using General Purpose Processor (GPP)
[Figure: multiply-accumulate on a GPP; labels: R0, R1, R2, X, 44]
24. The MIPS Instruction Formats
- All MIPS instructions are 32 bits long. The three instruction formats are
  - R-type
  - I-type
  - J-type
- The different fields are
  - op: operation of the instruction
  - rs, rt, rd: the source and destination registers
  - shamt: shift amount
  - funct: selects the variant of the operation in the op field
  - address / immediate: address offset or immediate value
  - target address: target address of the jump instruction
Copied from [Shulte-Slides]
25. Translating MIPS Assembly into Machine Language
- Humans see instructions as words (assembly language), but the computer sees them as ones and zeros (machine language).
- An assembler translates from assembly language to machine language.
- For example, the MIPS instruction add $t0, $s1, $s2 is translated as follows:

  Assembly   Comment
  add        op = 0, shamt = 0, funct = 32
  $t0        rd = 8
  $s1        rs = 17
  $s2        rt = 18

  op      rs     rt     rd     shamt  funct
  000000  10001  10010  01000  00000  100000
Copied from [Shulte-Slides]
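The field packing in this example can be reproduced with a few shifts. The sketch below assumes the standard R-type field widths (6, 5, 5, 5, 5, 6 bits); the function name `encode_rtype` is hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Packs the six R-type fields into one 32-bit MIPS instruction word:
   op(6) | rs(5) | rt(5) | rd(5) | shamt(5) | funct(6). */
uint32_t encode_rtype(uint32_t op, uint32_t rs, uint32_t rt,
                      uint32_t rd, uint32_t shamt, uint32_t funct)
{
    /* Each field must fit in its bit width. */
    assert(op < 64 && rs < 32 && rt < 32 && rd < 32 && shamt < 32 && funct < 64);
    return (op << 26) | (rs << 21) | (rt << 16) |
           (rd << 11) | (shamt << 6) | funct;
}
```

Called with the values from the example (op 0, rs 17, rt 18, rd 8, shamt 0, funct 32), it returns 0x02324020, which is the bit pattern 000000 10001 10010 01000 00000 100000.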
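26. MIPS Addressing Modes/Instruction Formats
- All MIPS instructions are 32 bits wide (fixed length)
- Register (direct): add $s1, $s2, $s3
  - fields: op | rs | rt | rd; the operand is in a register
- Immediate: addi $s1, $s2, 200
  - fields: op | rs | rt | immed
- Base+index: lw $s1, 200($s2)
  - fields: op | rs | rt | immed; the memory address is formed from the register and immed
- PC-relative: beq $s1, $s2, 200
  - fields: op | rs | rt | immed; the memory address is formed from the PC and immed
Copied from [Shulte-Slides]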
27. Architecture of the MIPS core
Copied from [Meerbergen-Slides]
28. Example 1: R-type add instruction
Copied from [Meerbergen-Slides]
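29. Critical path: R-type operation
[Figure: single-cycle datapath; PC feeds the instruction address of the instruction memory; the instruction supplies Rd, Rt, Rs (5 bits each) and Imm (16 bits); register file with 32 32-bit registers (ports Rw, Ra, Rb); data memory (data address, data in, data out, 32 bits each); PC, register file and data memory are clocked by Clk]
Copied from [Meerbergen-Slides]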
30. Critical path: R-type operation
[Figure: timing diagram; after the clock edge each signal changes from its old value to its new value in turn]
- PC: after the clock-to-Q delay
- Rs, Rt, Rd, op, funct: after the instruction memory access time
- Bus A, B: after the register file (RFile) access time
- Bus W: after the ALU delay
- Write into RFile: after setup time and clock skew
Copied from [Meerbergen-Slides]
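31. Example 2: I-type load word
- lw rs, rt, imm16
  - mem[PC] (fetch the instruction)
  - addr <- R[rs] + ext(imm16) (compute the memory address)
  - R[rt] <- mem[addr] (load the data into the register)
  - PC <- PC + 4 (advance to the next instruction)
Copied from [Meerbergen-Slides]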
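The register transfers of the load word can be sketched in C. The `Cpu` struct, the memory size `MEM_BYTES`, and the function `exec_lw` are hypothetical names introduced for illustration; the sketch also assumes that ext() is a sign extension and it does not model the instruction fetch mem[PC].

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MEM_BYTES 1024u               /* hypothetical data memory size */

typedef struct {
    uint32_t pc;                      /* program counter */
    uint32_t r[32];                   /* register file R[0..31] */
    uint32_t dmem[MEM_BYTES / 4];     /* word-organised data memory */
} Cpu;

/* Executes the register transfers of one load word:
   addr = R[rs] + ext(imm16); R[rt] = mem[addr]; PC = PC + 4. */
void exec_lw(Cpu *cpu, unsigned rs, unsigned rt, uint16_t imm16)
{
    assert(cpu != NULL && rs < 32 && rt < 32);

    /* Sign-extend the 16-bit immediate to 32 bits. */
    int32_t offset = (imm16 & 0x8000u) ? (int32_t)imm16 - 0x10000
                                       : (int32_t)imm16;

    uint32_t addr = cpu->r[rs] + (uint32_t)offset;   /* ALU computes the address */
    assert(addr % 4 == 0 && addr < MEM_BYTES);       /* aligned and in range */

    cpu->r[rt] = cpu->dmem[addr / 4];                /* data memory read */
    cpu->pc += 4;                                    /* next instruction */
}
```

The order of the statements mirrors the critical path of the load: register file read, ALU, data memory, register file write.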
32. Critical path: load operation
[Figure: timing diagram; after the clock edge each signal changes from its old value to its new value in turn]
- PC: after the clock-to-Q delay
- Rs, Rt, Rd, op, funct: after the instruction memory access time
- Bus A, B: after the register file (RFile) access time
- address: after the ALU delay
- Bus W: after the data memory access time
- Register file write: after setup time and clock skew
Copied from [Meerbergen-Slides]
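Read as a chain of delays, the load timing gives a lower bound on the clock period of the single-cycle design. The symbol names below are introduced here for illustration and are not taken from [Meerbergen-Slides].

```latex
T_{clk} \;\ge\; t_{clk\text{-}to\text{-}Q} + t_{imem} + t_{rf} + t_{alu}
              + t_{dmem} + t_{setup} + t_{skew}
```

For an R-type instruction the $t_{dmem}$ term is absent, so its critical path is shorter, but a single-cycle clock still has to accommodate the load.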
33. Architecture of the MIPS core
- Problem: long critical path
  - defined by the slowest instruction (load)
- Solution: pipelining
  - break the instruction into smaller steps
  - all steps have about the same critical path
Copied from [Meerbergen-Slides]
34. Pipelining lw instructions
HennessyPatterson

      cycle 1  cycle 2  cycle 3  cycle 4  cycle 5   cycle 6   cycle 7
  lw  Ifetch   RF read  ALU      dmem     RF write
  lw           Ifetch   RF read  ALU      dmem      RF write
  lw                    Ifetch   RF read  ALU       dmem      RF write

- One instruction enters the pipeline every clock cycle
- One instruction leaves the pipeline every clock cycle
- => CPI = 1 (Cycles per Instruction)
Copied from [Meerbergen-Slides]
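The cycle count in this diagram follows a simple rule for an ideal pipeline without stalls. The helper names `pipeline_cycles` and `pipeline_cpi` below are hypothetical and only illustrate that rule.

```c
#include <assert.h>

/* Cycles needed to run n instructions through an ideal k-stage pipeline
   with no stalls: the first instruction takes k cycles, and each
   following instruction completes one cycle later. */
unsigned long pipeline_cycles(unsigned long n, unsigned long k)
{
    assert(k > 0);
    return (n == 0) ? 0 : k + n - 1;
}

/* Average cycles per instruction for the same ideal pipeline. */
double pipeline_cpi(unsigned long n, unsigned long k)
{
    assert(n > 0);
    return (double)pipeline_cycles(n, k) / (double)n;
}
```

Three lw instructions in five stages take 5 + 3 - 1 = 7 cycles, matching the seven columns of the diagram; as n grows, the average approaches the CPI of 1 stated on the slide.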
35. Pipelining lw instructions
[Figure: the five pipeline stages I, R, A, M, W with the instructions and data they hold in the current CPU cycle]
Copied from [Meerbergen-Slides]
36. 4 stages of R-type instruction

            cycle 1  cycle 2  cycle 3  cycle 4
  E.g. ADD  Ifetch   RF read  ALU      RF write

Copied from [Meerbergen-Slides]
37. Pipelining lw and R-type instructions
HennessyPatterson

       cycle 1  cycle 2  cycle 3  cycle 4  cycle 5
  lw   Ifetch   RF read  ALU      dmem     RF write
  add           Ifetch   RF read  ALU      RF write

- Both instructions reach RF write in cycle 5
Copied from [Meerbergen-Slides]
38. Solution: stretch R-type to 5 stages

  Ifetch   RF read  ALU      dmem             RF write
                             dummy op (noop)

Copied from [Meerbergen-Slides]
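39. [Figure: pipelined datapath with stages Ifetch, Reg/dec, exec, mem, wr; blocks: Next PC, Prog mem, Rfile (Ra, Rb, Rw, Di, BusA, BusB), ext. (Imm16), ALU flags, Data mem (adr, Din, Dout); instruction fields: Rs, Rt, Rd; control signals: RegWr, branch, RegDst, ALUSrc, ExtOp, ALUop, MemWr, MemtoReg]
HennessyPatterson
Copied from [Meerbergen-Slides]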
40. Data dependencies: R-type instructions
HennessyPatterson
[Figure: five consecutive pipelined instructions, each labelled "R1 ..."]
Copied from [Meerbergen-Slides]
41. References
- [Culler-Slides] D. E. Culler, Computer Architecture, Lecture slides, Computer Science at Berkeley.
- [Hamacher01] C. Hamacher, Z. Vranesic, S. Zaky, Computer Organization, McGraw-Hill Science/Engineering/Math, 5th edition, August 2, 2001.
- [Patterson04] D. A. Patterson, J. L. Hennessy, Computer Organization and Design: The Hardware/Software Interface, Morgan Kaufmann, 3rd edition, August 2, 2004.
- [Shulte-Slides] M. Schulte, Computer Architecture ECE 201, Lecture slides.
- The other references can be found at www.site.uottawa.ca/mbolic/elg6131/References.htm