Optimizing the mMIPS - PowerPoint PPT Presentation

Slides: 34
Provided by: sander7

1
Optimizing the mMIPS
  • Sander Stuijk

2
The mMIPS
  • Pipelined core
  • Hazard detection
  • No forwarding
  • mMIPS instruction set
  • 31 instructions available in hardware (add, bnez,
    mul, ...)
  • Other instructions are supported via the C compiler
    (div, sra, ...)

3
Outline
  • LCC compiler for the mMIPS
  • Using memories in the mMIPS
  • The mMIPS vs Hennessy and Patterson

4
Toolflow
[Figure: toolflow diagram with a test path and an implementation path]
  • sw: Application (C source), compiled with the LCC C Compiler
  • hw: mMIPS (C sources that use SystemC libraries)
  • Tools shown: Borland C Compiler, Synopsys SystemC compiler,
    Synopsys FPGA Compiler II, Xilinx ISE

5
LCC compiler: it's a C compiler
  • Consider the following code fragment
  • for (int i = 0; i < 3; i++)
  • a[i] = ...
  • It should be (declare the loop variable before the loop)
  • int i;
  • for (i = 0; i < 3; i++)
  • a[i] = ...

6
How does a compiler work?
lcc prog.c -o mips_rom.bin
7
Adding special functions
  • Examples
  • Division, multiply, swap, clip, ...
  • Constraints
  • At most 2 input operands and 1 output operand
  • Manifest loop bounds
  • Clock frequency
  • Chip area

8
Securing our skies
  • Measure height each second
  • The airplane may never be below 1000 ft for more than
    1 second
  • If needed, take appropriate action

10
missile.c
  • #define TRUE 1
  • #define FALSE 0
  • int launch(int height1, int height2) {
  • int l;
  • if (height1 < 1000 && height2 < 1000)
  • l = TRUE;
  • else
  • l = FALSE;
  • return l;
  • }
  • void main(void) {
  • int height1, height2;
  • ...
  • }

11
Assembler
C source:
int launch(int height1, int height2) {
  int l;
  if (height1 < 1000 && height2 < 1000)
    l = TRUE;
  else
    l = FALSE;
  return l;
}
Disassembly:
80  addiu sp,sp,-8
84  li    t8,1000
88  slt   s8,a0,t8
8c  beqz  s8,0xac
90  nop
94  slt   s8,a1,t8
98  beqz  s8,0xac
9c  nop
a0  li    t8,1
a4  b     0xb0
a8  sw    t8,4(sp)
ac  sw    zero,4(sp)
b0  lw    v0,4(sp)
b4  jr    ra
b8  addiu sp,sp,8
Commands:
lcc missile.c -o missile
disas missile
12
Adding a special function to the mMIPS (overview)
  • New mMIPS instruction launch
  • Select an opcode and function code

opcode = 0, function code = 0x10 (not yet used)
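
As a rough illustration of what selecting an opcode and a function code means, the sketch below packs the fields of a standard MIPS R-type instruction word. The helper name `encode_rtype` and the register numbers in the usage note are assumptions made for this example; the actual mMIPS decoder may use the fields of the launch instruction differently.

```c
#include <stdint.h>

/* Pack a standard MIPS R-type instruction word:
 * opcode[31:26] rs[25:21] rt[20:16] rd[15:11] shamt[10:6] funct[5:0].
 * encode_rtype is a hypothetical helper, not part of the mMIPS tools. */
uint32_t encode_rtype(uint32_t opcode, uint32_t rs, uint32_t rt,
                      uint32_t rd, uint32_t shamt, uint32_t funct)
{
    return ((opcode & 0x3Fu) << 26) |
           ((rs     & 0x1Fu) << 21) |
           ((rt     & 0x1Fu) << 16) |
           ((rd     & 0x1Fu) << 11) |
           ((shamt  & 0x1Fu) << 6)  |
            (funct  & 0x3Fu);
}
```

With opcode 0 and function code 0x10, `encode_rtype(0, 4, 5, 2, 0, 0x10)` gives 0x00851010 (source registers 4 and 5, destination register 2).
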
13
Adding a special function to the mMIPS (hardware)
[Figure: the aluctrl and alu blocks]
14
Converting a C program to the LCC data
representation
int main(void) {
  int a = 3;
  if (a == 3)
    return 1;
  return 0;
}

15
What does a rule look like?
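A rule for adding two unsigned integers (4 bytes):
reg: ADDU4(reg,reg)  "\taddu $%c,$%0,$%1\n"  1
  • ADDU4(reg,reg) matches an addition with two source registers
  • The quoted string is the assembler instruction
  • The trailing 1 is the weight of the rule
  • %0 The first source operand register
  • %1 The second source operand register
  • %c The destination register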
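
To make the role of the placeholders concrete, here is a minimal sketch of how a template of this shape could be expanded into an assembler line. The function `expand_rule` and its parameters are hypothetical and written only for this illustration; it is not LCC's own emitter, and it handles only the `%c`, `%0` and `%1` placeholders.

```c
#include <stddef.h>
#include <string.h>

/* Expand %c, %0 and %1 in a rule template (illustrative helper only).
 * Writes at most outsz-1 characters plus a terminating '\0' to `out`
 * and returns the number of characters written. */
size_t expand_rule(const char *tmpl, const char *dst, const char *src0,
                   const char *src1, char *out, size_t outsz)
{
    const char *p;
    size_t n = 0;

    if (outsz == 0)
        return 0;
    for (p = tmpl; *p != '\0'; p++) {
        const char *sub = NULL;
        if (p[0] == '%') {
            if (p[1] == 'c') sub = dst;        /* destination register */
            else if (p[1] == '0') sub = src0;  /* first source operand */
            else if (p[1] == '1') sub = src1;  /* second source operand */
        }
        if (sub != NULL) {
            size_t len = strlen(sub);
            if (n + len >= outsz)
                break;                         /* not enough room left */
            memcpy(out + n, sub, len);
            n += len;
            p++;                               /* skip the placeholder letter */
        } else {
            if (n + 1 >= outsz)
                break;
            out[n++] = *p;
        }
    }
    out[n] = '\0';
    return n;
}
```

Calling `expand_rule("\taddu $%c,$%0,$%1\n", "2", "4", "5", buf, sizeof buf)` leaves `"\taddu $2,$4,$5\n"` in `buf`.
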
16
Converting the LCC data-structure to assembler
.set reorder
.globl main
.text
.text
.align 2
.ent main
main:
.frame $sp,8,$31
addu $sp,$sp,-8
la $24,3
sw $24,-4+8($sp)
lw $24,-4+8($sp)
la $15,3
bne $24,$15,L.2
la $2,1
b L.1
L.2:
move $2,$0
L.1:
addu $sp,$sp,8
j $31
.end main
17
Adding a special function to the mMIPS (software)
  • The launch function must be detected by LCC
  • Use a special pattern to indicate use of the launch
    function
  • Example ((a) - ((b) + *(int *) 0x12344321))
  • The following 4 constructs map to custom
    operations in LCC
  • ((a) - ((b) + *(int *) 0x12344321))
  • ((a) + ((b) + *(int *) 0x12344321))
  • ((a) - ((b) - *(int *) 0x12344321))
  • ((a) + ((b) - *(int *) 0x12344321))
  • More operations (possibly with more operands) can
    be added. Look at the website for more
    information.
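
One way to keep such a program testable on an ordinary host compiler is to switch between the special pattern and a plain C reference model. The sketch below assumes a build flag named `MMIPS_LCC`, which is hypothetical, and assumes that launch means "both heights are below 1000", as in missile.c; the pattern itself is written as reconstructed from the slide.

```c
/* MMIPS_LCC is a hypothetical build flag: define it only when compiling
 * with the modified LCC, so that the pattern maps to the custom operation. */
#ifdef MMIPS_LCC
#define launch(h1, h2) ((h1) - ((h2) + *(int *) 0x12344321))
#else
/* Host reference model (assumption: launch is true when both heights
 * are below 1000, as in missile.c). */
static int launch(int h1, int h2)
{
    return (h1 < 1000 && h2 < 1000) ? 1 : 0;
}
#endif
```

The reference model can then also be used to produce expected values when checking the custom instruction in simulation.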

18
Custom operation in C and assembler
  • #define TRUE 1
  • #define FALSE 0
  • #define launch(h1, h2) ((h1) - ((h2) + *(int *)
    0x12344321))
  • void main(void) {
  • int height1, height2;
  • int l;
  • while (TRUE)
  • l = launch(height1, height2);
  • }

Disassembly:
80  addiu sp,sp,-16
84  sw    s5,0(sp)
88  sw    s6,4(sp)
8c  b     0x98
90  sw    s7,8(sp)
94  tgeu  s7,s6,0x2a0
98  b     0x94
9c  nop
a0  lw    s5,0(sp)
a4  lw    s6,4(sp)
a8  lw    s7,8(sp)
ac  jr    ra
b0  addiu sp,sp,16
Commands:
lcc missile.c -o missile
disas missile
19
Comparison
Original (15 instructions):
80  addiu sp,sp,-8
84  li    t8,1000
88  slt   s8,a0,t8
8c  beqz  s8,0xac
90  nop
94  slt   s8,a1,t8
98  beqz  s8,0xac
9c  nop
a0  li    t8,1
a4  b     0xb0
a8  sw    t8,4(sp)
ac  sw    zero,4(sp)
b0  lw    v0,4(sp)
b4  jr    ra
b8  addiu sp,sp,8
Added custom instruction (1 instruction):
94  tgeu  s7,s6,0x2a0
Reduction of 14 instructions per execution!
20
Outline
  • LCC compiler for the mMIPS
  • Using memories in the mMIPS
  • The mMIPS vs Hennessy and Patterson

21
The mMIPS memory layout
22
Taking the memory from LCC to the mMIPS
mips_rom.bin
mips_ram.bin
23
Using the data memory in your program
  • /* The program copies the data from str1 to str2.
    Note that
  • at most 512 characters are copied.
  • */
  • char *str1 = (char *)0x0; // Memory address 0
    in ram
  • char *str2 = (char *)0x200; // Memory address
    0x200 in ram
  • void main (void) {
  • int i;
  • for (i = 0; str1[i] != '\0' && i < 0x200; i++)
  • str2[i] = str1[i];
  • str2[i] = '\0';
  • }

24
Outline
  • LCC compiler for the mMIPS
  • Using memories in the mMIPS
  • The mMIPS vs Hennessy and Patterson

25
Registerfile and write-back hazards
[Figure: timing diagrams of the Input, Write and Output signals]
  • Hennessy and Patterson (HP): data written to the registerfile
    is available on its output in the current cycle
  • mMIPS: data written to the registerfile is available on its
    output in the next cycle
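
The difference can be sketched with a toy register file model in C. The type `regfile_t`, the function `regfile_cycle` and the `write_through` flag are names invented for this illustration; the model ignores details such as the hardwired zero register and is not taken from the mMIPS SystemC sources.

```c
#include <stdint.h>

enum { NUM_REGS = 32 };

/* Toy register file; all names are illustrative. */
typedef struct {
    uint32_t regs[NUM_REGS];
    int      write_through;  /* 1: HP-style, 0: mMIPS-style */
} regfile_t;

/* Simulate one clock cycle: write `wdata` to `waddr` when `wen` is set
 * and read `raddr`. Returns the value seen on the read port in this cycle. */
uint32_t regfile_cycle(regfile_t *rf, int wen, unsigned waddr,
                       uint32_t wdata, unsigned raddr)
{
    uint32_t out = rf->regs[raddr % NUM_REGS];  /* value stored before this cycle */

    if (wen) {
        rf->regs[waddr % NUM_REGS] = wdata;     /* stored for later cycles */
        if (rf->write_through && waddr % NUM_REGS == raddr % NUM_REGS)
            out = wdata;                        /* HP: visible in the same cycle */
    }
    return out;
}
```

With `write_through` set to 0, a read of the register being written returns the old value, and only a read in the following cycle returns the new one. This is why an instruction that reads a register in the cycle of its write-back needs extra care in the mMIPS.
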
26
Branch hazards
  • Hennessy and Patterson
  • Branch detection in the decoding phase (after
    registerfile)
  • Two cycles needed to determine branch taken (IF
    and ID)
  • The first instruction after the branch is the
    branch delay slot filled by the assembler.
  • mMIPS
  • Branch detection in the execution phase (using
    the alu)
  • Three cycles needed to determine branch taken
    (IF, ID, EX)
  • Two branch delay slots (one used by the
    assembler, the second delay slot is filled with a
    NOP by the hazard detection unit).
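
The cost of the extra delay slot can be estimated with a small calculation. The function below is a sketch with made-up names; it simply counts the delay slots that do not hold a useful instruction, and leaves it to the caller to say how many slots the assembler managed to fill usefully.

```c
/* Cycles spent on NOPs in branch delay slots (illustrative model).
 * branches:     number of executed branches
 * delay_slots:  delay slots per branch (1 for HP, 2 for the mMIPS)
 * useful_slots: total number of delay slots holding a useful instruction */
unsigned long wasted_branch_cycles(unsigned long branches,
                                   unsigned long delay_slots,
                                   unsigned long useful_slots)
{
    unsigned long total_slots = branches * delay_slots;

    if (useful_slots > total_slots)
        return 0;  /* inconsistent input: treat as nothing wasted */
    return total_slots - useful_slots;
}
```

For 100 executed branches whose assembler-filled slot always holds useful work, the HP pipeline wastes `wasted_branch_cycles(100, 1, 100)` = 0 cycles, while the mMIPS wastes `wasted_branch_cycles(100, 2, 100)` = 100 cycles on hardware-inserted NOPs. When the assembler can only place a NOP in its slot, both numbers grow accordingly.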

27
  • Questions?

29
Assignment
30
Assignment
  • Optimize the run-time of an image processing
    algorithm running on the mMIPS.
  • Allowed
  • Add special instructions to the mMIPS
  • Change design of the mMIPS (e.g. forwarding).
  • Not allowed
  • Modifications of the image processing algorithm
    that are not needed to use special instructions
    (e.g. replacing multiplies with shifts).

31
Testing and implementing the design
  • Test for functional correctness
  • Run the original mMIPS with the algorithm to
    produce a reference output.
  • Compare the results of your mMIPS to the
    reference output.
  • Implement your design on the FPGA
  • You must complete the flow up to and including the
    FPGA. The maximum clock frequency at which your mMIPS
    can be synthesized is part of the performance.
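
The comparison against the reference output can be as simple as a byte-wise check. The helper below is a sketch; the name `first_mismatch` and the idea of holding both outputs in memory buffers are assumptions, since the slides do not say in which form the output is stored.

```c
#include <stddef.h>

/* Return the index of the first byte that differs between the reference
 * output and the output of the modified mMIPS, or `len` when the two
 * buffers are identical (hypothetical helper for illustration). */
size_t first_mismatch(const unsigned char *ref, const unsigned char *out,
                      size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        if (ref[i] != out[i])
            return i;
    return len;
}
```

Reporting the position of the first difference, rather than only pass or fail, makes it easier to relate a wrong pixel to the instruction that produced it.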

32
ImageProcessing.zip
  • Download the file ImageProcessing.zip at
  • http://www.es.ele.tue.nl/education/Computation/ogo12/
  • Content
  • ImageProcessing/algorithm
  • ImageProcessing/bendime
  • ImageProcessing/bennoc
  • ImageProcessing/cocentric
  • ImageProcessing/lcc
  • ImageProcessing/mips
  • ImageProcessing/SystemC2.0.1borland
  • bennoc_setup.csh

33
Support and Information
  • Dominic Gawlowski - FPGA
  • Valentin Gheorghita - LCC
  • Sander Stuijk - SystemC
  • Each Tuesday and Friday between 14.00 and 16.00h.
  • See also http://www.es.ele.tue.nl/education/Computation/ogo12/
    for information, tips, etc.