Title: HW
1HW 1 Problems Industrial Strength Assembly
2Reading the Question
- If there are no bs in the string, then bfind
should return a pointer to the null character at
the end of the string - lbu t2, 0(t3)
- beq t2, 0, term
-
- term la a0, ndone
- li v0 ,4
- syscall
- Its great that you tell me whats happening.. How
about putting t3 in to v0?
3Reading the Question
- The bfind procedure should locate the first b
character in the string and return its address in
register v0 - bfind lbu t1,0(a0)
- beqz t1, Ret
- beq t1, b, Ret
- addi a0, a0, 1
- j bfind
- Ret sb v0, (t1)
4Using the bfind procedure in bcount
- You must use your bfind procedure in Exercise
3.23 in your implementation of bcount. - What does bfind return?
- So how do we check if were at the end of the
string? - jal bfind
- lb t0,0(v0)
- beqz t0, endOfString
5RTFQ Reading the Question
- Tough parts of Software
- Specifying Requirements
- Understanding the Specified Requirements
- Why it matters at GWU Grading Scripts
- Besides.. It should be really frustrating to
spend hours answering the wrong question
6RTFQ- Reading the Question
- Typical Errors
- Didnt write bcount as a procedure
- This requires saving ra before the call to
bfind, and restoring ra after the call to bfind - main
- jal bcount
- bcount jal bfind
- What happens to ra when jal is called?
7Answering the Question
- add t0, 0, 0
- bfind add t3, t0, a0
- lbu t2, 0 (t3)
- beq t2, 0, done
- addi t2, t2, -98
- beq t2, 0, done
- addi t0, t0, 1
- j bfind
- done addi t0,1
- add v0,t0,0
- jr ra
8Answering the Question
- li t0,0
- bfind add t3,a0,t0
- lbu t2, 0 (t3)
- beq t2, 0, done
- beq t2, b, done
- addi t0, t0, 4
- j bfind
- done move s2,t0
- jr ra
9Why do we program in Assembly?
10Why do we program in Assembly?
- What matters in real world
- SPEED, SPEED SPEED
- Other acceptable answer
- Low Level Interface with hardware (mouse,
ethernet card, serial port, video card, etc.) - Unfortunately we cant simulate these hardware
interfaces with SPIM.
11Naïve Approach to Style
- Good Style lots of comments
- Even poor code well
- documented is still
- poor code.
12Style
- Accomplish task in straight forward manner with
minimum effort - Unfortunately, Good Style comes from experience
and experience comes from Bad Style - We dont expect you to have wonderful style in
the first couple assignments - By the end of semester you should be writing
simple straight forward code.
13Minimum Effort
- la t0, a0
- bfind lbu t1, 0 (t0)
- beq t1, Done
- beq t1,b, Done
- addi t0, 1
- addi a0, 1
- j bfind
- Done
- move v0,a0
14Minimum Effort
- Registers are a Commodity. Do not waste them!
- Why is minimizing the use of registers so
important? - What is the slowest type of instruction on our
processor? - If we have registers available... We can use
them. - This will allow your programs to run faster!
15Minimum Effort
- Registers are a Commodity. Do not waste them!
- Why is minimizing the use of registers so
important? - Modern processors have multiple execution units
- Processors can execute several different portions
of your code at the same time... - Lack of registers prevents multiple execution
- Help the processor out by using minimal number of
registers and having minimal register overlap. - This will allow your programs to run faster!
16Do we need all these registers?
- li s1,0x00
- bfind
- lbu t2,0(a0)
- beq t2,s1,exit
- beq t2,b,exit
- addi a0,a0,1
- j bfind
- exit
17Do we need all this code?
- bfind
- lbu t2,0(a0)
- addi a0,a0,1
- beq t2,zero,exit
- beq t2,b,exit
- j bfind
- exit
- subu a0,a0,1
18Minimum Effort
- Know your architecture (load - store) and what
your pseudo instructions do - Is there a difference difference between
- beq t0, b, Done
- and
- addi at,zero,b
- beq t0,at, Done
19Do I need everything in my loop?
- bfind lbu t0, 0 (a0)
- beqz t0, done
- addi t1,zero,b
- beq t0, t1, done
- add a0, a0, 1
- j bfind
- done mov v0, a0
- jr ra
20Extra Branches Border Case Failure
- bfind li t1, b
- lb t0,0(a0)
- bne t0,t1,loop
- move v0,a0
- quit jr ra
- loop addi a0,a0,1
- lbu t0,0(a0)
- beq t0,zero,quit
- bne t0,t1,loop
- move v0,a0
- jr ra
21Extra Branches and Loops
- Why are extra branches bad?
- -Optimizing Branches
- Why are extra loops bad?
22Public and Private Variables
- In SPIM- When you call a function and the
function returns the only values from that
function your procedure should access are - v0, v1
- Values saved by the function on the stack
- Locations in Memory
- DO NOT access the internal variables (registers)
of another function! This breaks the whole
function paradigm. It isnt a black box if
you go digging around
23Public and Private Variables
GWU
FUNCTION
24Breaking the Function Paradigm
- addi sp, sp, -4
- sw ra,0(sp)
- jal bfind
- lw ra,0(sp)
- addi sp,sp,4
- quit bne t0, zero, adder
- Who set t0? bfind did. But the only return
value was supposed to be v0? - What are the problems with this?
25Breaking the Function Paradigm
- I should be able to plug in ANY bfind in to
bcount and make it work. - Dont fight the framework!
26Real World Disclaimer
- Register assignments arent as structured
- (no t0-t9, s0-s7..just registers 1-31)
- Saving values on stack or in memory is slow.
- More registers can be used to return values.
- Function writers and function callers typically
have specific agreements about what registers to
use for passing and returning values.
27Proper Instructions
- jr 1 is jump to a value given by the register 1
(Where do you typically see jr ra ?) - j is jump to a label
- Use addi to add a constant to a register
- Use add to add registers together
- Watch the ordering of registers in instruction
- sw sp, ra What does this do?
28Good Implementation of Bfind
- addi t1,zero,b
- bfind lbu t0, 0 (a0)
- beqz t0, done
- beq t0, t1, done
- add a0, a0, 1
- j bfind
- done move v0, a0
- jr ra
- 8 lines of assembly. No extra branches, loops,
variables. Common case is fast.
29Lab Assignment Endian Fixer
- TCP/IP message headers are in Big Endian
- Your PC is a Little Endian machine
- Write function that takes word value in a0 and
returns a Endian reversed value in v0 - Goals
- Learn how to use masks
- li t0, 0x000000ff
- andi t1, a0, t0
- ori v0, t0, t1
-
30- endianFixer 1234
- rol a0,a0,8 2341
- lui t1, 0x00ff
- ori t1, t1, 0x00ff
- and v0,t1,a0 0301
- sll t1,t1,8 t1ff00ff00
- rol a0,a0,16 4123
- and a0,t1,a0 4020
- or v0,v0,a0 4321
- jr ra
Why dont I ever zeroize v0? Is li
t1,0x00ff00ff ok?
31Adder Review
- How many delays does a 4 bit ripple carry adder
take to add two 4 bit numbers? Why?
32Adder Review
- What are the equations for a 4 bit carry look
ahead adder? Do you understand this?
33- From when a and bs show up how long does it
take until sum appears on the output lines? - How long for carry generation how long for sum?
348 bit adder
- We need an 8 bit adderhow can we do this?
358 bit adder
- We could have a pure ripple carry
- How long would this take to calculate
- the sums?
368 bit adder
- We could use ripple carry to connect
- two 4 bit carry look ahead groups
- How long would this take to calculate
- the carries? The sums?
372nd layer of Carry Lookahead
- Lets add another layer of Carry Lookahead for our
8 bit adder - What are the equations for big Ps?
- What are the equations for big Gs?
- What are the equations for the big Cs?
382nd layer of Carry Lookahead
- How long would it take to calculate the carries?
- How long would it take to calculate the sums?