Title: PIPELINING OF PROCESSORS USING VHDL
1 PIPELINING OF PROCESSORS USING VHDL
BY M. SWARNALAKSHMI (1999B5A8518) AND S. PRIYA (1999B5A3272)
2 INTRODUCTION
- A processor has two main components: the datapath and the control.
- The design of a machine depends on how the logic implementing the machine operates and how the machine is clocked.
3 INTRODUCTION
- The functional units in the MIPS implementation consist of two types of logic elements: combinational elements and state elements.
- The clocking methodology defines when signals can be read and when they can be written. The edge-triggered methodology is the most commonly used.
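The difference between the two kinds of logic elements can be illustrated with a small behavioural model. This is only a sketch in Python, not the VHDL of the project; the names `Register`, `tick` and `adder` are made up for this example, and a rising-edge trigger is assumed.

```python
class Register:
    """State element: its output changes only on a rising clock edge."""

    def __init__(self, value=0):
        self.q = value          # stored state, visible at the output
        self._prev_clk = 0      # last clock level seen

    def tick(self, clk, d):
        # Capture the input d only on a 0 -> 1 clock transition.
        if self._prev_clk == 0 and clk == 1:
            self.q = d
        self._prev_clk = clk
        return self.q


def adder(a, b):
    """Combinational element: output depends only on the current inputs."""
    return (a + b) & 0xFFFFFFFF   # wrap to 32 bits
```

Between clock edges the register ignores changes on its input, which is what lets a value be read and written in the same clock cycle without a race.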
4 BUILDING DATAPATH
- We assume that the implementation uses a single long clock cycle for every instruction.
- The first step is to examine the major components required to execute each class of instruction.
5 For every class of instruction, the first step remains identical: FETCH THE INSTRUCTION FROM MEMORY.
6 Portion of Datapath used for Fetching Instructions
[Figure: the PC drives the read address of the instruction memory, which outputs the instruction; an adder with the constant 4 is attached to the PC.]
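The fetch portion of the datapath can be sketched as follows. This is a behavioural Python model rather than VHDL, and it assumes a byte-addressed PC with one 32-bit instruction per list entry; the function name `fetch` is hypothetical.

```python
def fetch(instruction_memory, pc):
    """Return (instruction, next_pc) for a word-aligned, byte-addressed PC."""
    instruction = instruction_memory[pc // 4]   # PC supplies the read address
    next_pc = (pc + 4) & 0xFFFFFFFF             # the adder with constant 4
    return instruction, next_pc
```

Calling `fetch(memory, 0)` returns the first word in `memory` together with the next PC value, 4.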
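7 DATA PATH FOR R-TYPE INSTRUCTIONS
- A typical instruction:
- add $t1, $t2, $t3
- $t1 = $t2 + $t3
- Thus R-format instructions have three register operands.
- The register file is a collection of registers. In this case it needs a total of four inputs (two read register numbers, the write register number and the write data) and two outputs (the two data values read).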
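A register file with these four inputs and two outputs might be modelled as below. This is a Python sketch under stated assumptions: 32 registers of 32 bits each, register 0 hard-wired to zero as in MIPS, and a `reg_write` control input that enables writing. The class and method names are invented for the example.

```python
class RegisterFile:
    """32 x 32-bit register file: two read ports, one write port."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, read_reg1, read_reg2):
        # Two register-number inputs give two data outputs.
        return self.regs[read_reg1], self.regs[read_reg2]

    def write(self, write_reg, write_data, reg_write):
        # The write happens only when the control signal is asserted.
        # Register 0 always reads as zero in MIPS, so writes to it are dropped.
        if reg_write and write_reg != 0:
            self.regs[write_reg] = write_data & 0xFFFFFFFF
```

Reads need no control signal because reading does not change state; only the write port is gated.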
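8 DATA PATH FOR R-TYPE INSTRUCTIONS
[Figure: the instruction supplies Read Reg1, Read Reg2 and Write Reg to the REGISTERS block, which also has a Write Data input and a Reg Write control; its outputs Read Data1 and Read Data2 feed the unit labelled ADD.]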
9 BUILDING DATAPATH
- The datapath for the load and store instructions makes use of a sign extension unit and a data memory unit in addition to the register file and ALU.
- Similarly, branch instructions require a shift unit in addition to the units mentioned above.
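The role of the sign extension and shift units can be sketched in Python as follows. The sketch assumes the usual MIPS conventions: a 16-bit offset, a word offset for branches that is shifted left by 2, and a branch target relative to PC + 4. The function names are hypothetical.

```python
def sign_extend16(imm):
    """Interpret the low 16 bits of imm as a signed two's-complement value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def load_store_address(base, offset16):
    """Effective address for a load or store: base register + offset."""
    return (base + sign_extend16(offset16)) & 0xFFFFFFFF


def branch_target(pc_plus_4, offset16):
    """Branch target: PC + 4 plus the word offset shifted left by 2."""
    return (pc_plus_4 + (sign_extend16(offset16) << 2)) & 0xFFFFFFFF
```

For instance, `load_store_address(0x1000, 0xFFFC)` treats `0xFFFC` as an offset of -4 and yields `0x0FFC`.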
10 Creating a Single Data path
- The simplest datapath might attempt to execute all instructions in one clock cycle, but that requires duplicating any functional unit that an instruction needs more than once.
- The best alternative is to use data selectors, also called multiplexors. They allow a datapath element to be shared among different classes of instructions.
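A two-input multiplexor is simple enough to sketch directly. In the Python model below, the second function shows one typical use: sharing the ALU's second input between R-type instructions (a register value) and loads or stores (the sign-extended offset). The select signal name `alu_src` is an assumption borrowed from common MIPS textbook usage, not taken from these slides.

```python
def mux2(select, in0, in1):
    """Two-way data selector: pass in1 when select is asserted, else in0."""
    return in1 if select else in0


def alu_second_operand(alu_src, read_data2, sign_extended_imm):
    # One ALU serves both instruction classes; the multiplexor picks
    # which value reaches its second input.
    return mux2(alu_src, read_data2, sign_extended_imm)
```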
11 SINGLE CYCLE IMPLEMENTATION
12 ALU CONTROL
- Depending on the class of instruction, the ALU performs one of five functions.
- The 6-bit function field and a 2-bit control field called ALUop are used to generate the 3-bit ALU control signal.
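The mapping from ALUop and the function field to the ALU control signal can be sketched as a lookup. The encodings below are not listed on the slides; they are assumed from the standard textbook MIPS subset (AND = 000, OR = 001, add = 010, subtract = 110, set on less than = 111, with ALUop 00 for load/store, 01 for branch equal and 10 for R-type).

```python
# funct field (bits 5-0) -> 3-bit ALU control, for R-type instructions
FUNCT_TO_ALU = {
    0b100000: 0b010,  # add
    0b100010: 0b110,  # subtract
    0b100100: 0b000,  # AND
    0b100101: 0b001,  # OR
    0b101010: 0b111,  # set on less than
}


def alu_control(alu_op, funct):
    """Generate the 3-bit ALU control signal from ALUop and funct."""
    if alu_op == 0b00:            # load/store: add base and offset
        return 0b010
    if alu_op == 0b01:            # branch equal: subtract and test for zero
        return 0b110
    if alu_op == 0b10:            # R-type: operation chosen by funct
        return FUNCT_TO_ALU[funct & 0x3F]
    raise ValueError("unsupported ALUop value")
```

The funct field is ignored for loads, stores and branches, so the main control unit only has to produce the 2-bit ALUop.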
13 INSTRUCTION CLASSES
- R-type instruction
- Bit positions: 31-26 (op), 25-21 (rs), 20-16 (rt), 15-11 (rd), 10-6 (shamt), 5-0 (funct)
- Load or store instruction
- Bit positions: 31-26 (op), 25-21 (rs), 20-16 (rt), 15-0 (address)
14 INSTRUCTION CLASSES
- Bit positions: 31-26, 25-21, 20-16, 15-0
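The bit positions above translate directly into shifts and masks. The following Python sketch extracts every field from a 32-bit instruction word; the field names (op, rs, rt, rd, shamt, funct) follow standard MIPS usage, and `imm16` is a name made up here for the 16-bit field in bits 15-0.

```python
def decode(instr):
    """Split a 32-bit MIPS instruction word into its bit fields."""
    return {
        "op":    (instr >> 26) & 0x3F,   # bits 31-26
        "rs":    (instr >> 21) & 0x1F,   # bits 25-21
        "rt":    (instr >> 16) & 0x1F,   # bits 20-16
        "rd":    (instr >> 11) & 0x1F,   # bits 15-11 (R-type only)
        "shamt": (instr >> 6) & 0x1F,    # bits 10-6  (R-type only)
        "funct": instr & 0x3F,           # bits 5-0   (R-type only)
        "imm16": instr & 0xFFFF,         # bits 15-0  (load, store, branch)
    }
```

All fields are extracted unconditionally, much as the hardware does; the control unit decides which of them are meaningful for a given opcode.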
15 DESIGNING-MAIN CONTROL UNIT
- The op field, also called the opcode, is always contained in bits 31-26. This field is referred to as Op[5-0].
- The two registers to be read are always specified by the rs and rt fields, at positions 25-21 and 20-16. This is true for the R-type instructions, branch equal, and store.
16 DESIGNING-MAIN CONTROL UNIT
- The base register for the load and store instructions is always in bit positions 25-21 (rs).
- The 16-bit offset for branch equal, load and store is always in positions 15-0.
- The destination register is in one of two places. For a load it is in bit positions 20-16 (rt), while for an R-type instruction it is in bit positions 15-11 (rd). Thus we will need to add a multiplexor to select which field of the instruction is used to indicate the register number to be written.
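The destination-register selection can be sketched as a two-way choice between the rt field (bits 20-16, used by loads) and the rd field (bits 15-11, used by R-type instructions). The control signal name `reg_dst` is an assumption taken from common MIPS textbook usage; the slides do not name it.

```python
def write_register(instr, reg_dst):
    """Select the register number to be written.

    reg_dst = 0 selects rt (bits 20-16), as needed for a load;
    reg_dst = 1 selects rd (bits 15-11), as needed for an R-type instruction.
    """
    rt = (instr >> 16) & 0x1F
    rd = (instr >> 11) & 0x1F
    return rd if reg_dst else rt
```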
17 SINGLE CYCLE IMPLEMENTATION
18 OPERATION OF DATA PATH
- R-type instruction: add $t1, $t2, $t3
- An instruction is fetched from the instruction memory and the PC is incremented.
- The two registers $t2 and $t3 are read from the register file. The main control unit also computes the setting of the control lines during this step.
19 OPERATION OF DATA PATH OF AN R-TYPE INSTRUCTION
- The ALU operates on the data read from the register file, using the function code (bits 5-0, the funct field of the instruction) to generate the ALU function.
- The result from the ALU is written into the register file, using bits 15-11 of the instruction to select the destination register ($t1).
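The four steps of an R-type instruction can be traced in a short behavioural model. This Python sketch is an illustration, not the project's VHDL: it handles only the add, subtract, AND and OR function codes, represents the register file as a plain list, and uses the hypothetical name `step_r_type`. The sample word `0x014B4820` is the standard MIPS encoding of add $t1, $t2, $t3, where $t1, $t2 and $t3 are registers 9, 10 and 11.

```python
def step_r_type(memory, regs, pc):
    """Execute one R-type instruction and return the next PC."""
    # Step 1: fetch the instruction and increment the PC.
    instr = memory[pc // 4]
    next_pc = (pc + 4) & 0xFFFFFFFF

    # Step 2: read the two source registers (rs and rt fields).
    a = regs[(instr >> 21) & 0x1F]
    b = regs[(instr >> 16) & 0x1F]

    # Step 3: the ALU operates on the data, selected by funct (bits 5-0).
    funct = instr & 0x3F
    if funct == 0x20:
        result = a + b
    elif funct == 0x22:
        result = a - b
    elif funct == 0x24:
        result = a & b
    elif funct == 0x25:
        result = a | b
    else:
        raise ValueError("funct code not covered by this sketch")

    # Step 4: write the result to the register named by bits 15-11 (rd).
    regs[(instr >> 11) & 0x1F] = result & 0xFFFFFFFF
    return next_pc
```

With `regs[10] = 7` and `regs[11] = 5`, running the sample instruction leaves 12 in `regs[9]` and returns a next PC of 4.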
20 WHY NOT A SINGLE CYCLE IMPLEMENTATION?
- It is inefficient, since the clock cycle has the same length for every instruction in this design and must therefore be long enough for the slowest instruction. Overall performance suffers because several of the instruction classes could fit in a shorter clock cycle.
- A variable-clock implementation is found to be faster.
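The comparison can be made concrete with a small calculation. The per-class latencies and the instruction mix below are illustrative values assumed for this sketch, not measurements from these slides; only the method (fixed cycle = slowest class, variable clock = weighted average) reflects the argument above.

```python
# Hypothetical latency of each instruction class, in picoseconds.
LATENCY_PS = {"load": 600, "store": 550, "r_type": 400, "branch": 350, "jump": 200}

# Hypothetical instruction mix, in percent (sums to 100).
MIX_PERCENT = {"load": 25, "store": 10, "r_type": 45, "branch": 15, "jump": 5}


def single_cycle_time(latency):
    """A fixed clock must be as long as the slowest instruction class."""
    return max(latency.values())


def variable_clock_time(latency, mix_percent):
    """Average time per instruction when each class takes only what it needs."""
    return sum(latency[c] * mix_percent[c] for c in latency) / 100


fixed = single_cycle_time(LATENCY_PS)
variable = variable_clock_time(LATENCY_PS, MIX_PERCENT)
speedup = fixed / variable
```

Under these assumed numbers the fixed clock is 600 ps per instruction while the variable clock averages 447.5 ps, a speedup of roughly 1.34.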
21 Continuing...