CSC324 Machine Organization - PowerPoint PPT Presentation

Provided by: kaiw
1
CSC324 Machine Organization
  • Chapter 3
  • Kai Wang
  • Computer Science Department
  • University of South Dakota

http://www.usd.edu/Kai.Wang/csc324/csc324.html
2
Review of MIPS
3
Alternative Architecture IA - 32
  • 1978 The Intel 8086 is announced (16 bit
    architecture)
  • 1980 The 8087 floating point coprocessor is
    added
  • 1982 The 80286 increases address space to 24
    bits, instructions
  • 1985 The 80386 extends to 32 bits, new
    addressing modes
  • 1989-1995 The 80486, Pentium, Pentium Pro add a
    few instructions (mostly designed for higher
    performance)
  • 1997 57 new MMX instructions are added,
    Pentium II
  • 1999 The Pentium III added another 70
    instructions (SSE)
  • 2001 Another 144 instructions (SSE2)
  • 2003 AMD extends the architecture to increase
    address space to 64 bits, widens all registers
    to 64 bits and other changes (AMD64)
  • 2004 Intel capitulates and embraces AMD64
    (calls it EM64T) and adds more media extensions

4
Chapter Three: Arithmetic for Computers
5
The Questions To Design a Computer
  • What you know
  • Decimal number
  • Arithmetic, logical
  • add, sub
  • and, or, xor
  • slti
  • What you have
  • 0 and 1
  • Logical gates
  • and, or, xor
  • Solutions
  • Complement number

6
How to Represent Numbers
  • Number is expressed as binary number in computer
  • Binary numbers (base 2) 0000 0001 0010 0011 0100
    0101 0110 0111 1000 1001...
  • Each bit is 0 or 1
  • Bits are just bits (no inherent meaning)
  • Of course it gets more complicated:
    numbers are finite (overflow);
    fractions and real numbers;
    negative numbers, e.g., there is no MIPS subi instruction
    because addi can add a negative number
  • How do we represent negative numbers? i.e.,
    which bit patterns will represent which numbers?

7
Possible Representations
  • Sign and Magnitude Expression
  • The first bit represent the sign, 0 is positive,
    1 is negative
  • The other bits represent the magnitude
  • 0000 = 0, 0001 = 1, 0010 = 2
  • 1000 = -0, 1001 = -1, 1010 = -2
  • The value is computed by
  • Take the magnitude part out
  • Use $(x_{i-1} \times 2^{i-1}) + (x_{i-2} \times 2^{i-2}) + \dots + (x_2 \times 2^2) + (x_1 \times 2^1) + (x_0 \times 2^0)$
  • Observe and add the sign
  • 00100101 ?
  • 10100110 ?
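The decoding steps above (take the magnitude out, weight each bit by a power of two, then apply the sign) can be sketched in a few lines. This is only an illustration; the function name `sign_magnitude_value` is made up for this example and is not part of the course material.

```python
def sign_magnitude_value(bits: str) -> int:
    """Decode a sign-and-magnitude bit string whose first bit is the sign."""
    sign = -1 if bits[0] == "1" else 1
    magnitude = int(bits[1:], 2)  # remaining bits weighted by powers of two
    return sign * magnitude


print(sign_magnitude_value("00100101"))  # 37
print(sign_magnitude_value("10100110"))  # -38
```

Note that `1000` and `0000` both decode to zero, which is the "two zeros" drawback of this representation.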

8
Possible Representations
  • Advantage
  • Easy to understand
  • Straightforward implementation
  • Disadvantage
  • Difficult to do arithmetic (1-byte example): 00100101 - 00100110
  • There are two zeros

9
Two's Complement Representation
  • Positive numbers are the same as sign and
    magnitude expression
  • Negative numbers are expressed by a process
    called negating a two's complement number:
  • Represent its magnitude
  • Invert all bits
  • Add 1
  • Use one byte to express 37
  • 00100101
  • Use one byte to express -38
  • 00100110 (magnitude of 38)
  • 11011001 (all bits inverted)
  • 11011010 (after adding 1)
  • The first bit is still sign, 0 is positive, 1 is
    negative
  • $\text{Value} = (-x_i \times 2^i) + (x_{i-1} \times 2^{i-1}) + (x_{i-2} \times 2^{i-2}) + \dots + (x_2 \times 2^2) + (x_1 \times 2^1) + (x_0 \times 2^0)$
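As a rough sketch of the negation recipe (invert all bits, add 1) and of the value formula with a negatively weighted sign bit, the following may help. The helper names are hypothetical and bit strings are assumed to be written most significant bit first.

```python
def negate_twos_complement(bits: str) -> str:
    """Negate a two's complement bit string: invert every bit, then add 1."""
    width = len(bits)
    inverted = "".join("1" if b == "0" else "0" for b in bits)
    # Wrap around at the register width, as fixed-width hardware would.
    return format((int(inverted, 2) + 1) % (1 << width), f"0{width}b")


def twos_complement_value(bits: str) -> int:
    """Value of a bit string where the top bit carries a negative weight."""
    unsigned = int(bits, 2)
    return unsigned - (1 << len(bits)) if bits[0] == "1" else unsigned


print(negate_twos_complement("00100110"))  # 11011010
print(twos_complement_value("11011010"))   # -38
```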

10
MIPS
  • 32-bit signed numbers:
    0000 0000 0000 0000 0000 0000 0000 0000two = 0ten
    0000 0000 0000 0000 0000 0000 0000 0001two = +1ten
    0000 0000 0000 0000 0000 0000 0000 0010two = +2ten
    ...
    0111 1111 1111 1111 1111 1111 1111 1110two = +2,147,483,646ten
    0111 1111 1111 1111 1111 1111 1111 1111two = +2,147,483,647ten
    1000 0000 0000 0000 0000 0000 0000 0000two = -2,147,483,648ten
    1000 0000 0000 0000 0000 0000 0000 0001two = -2,147,483,647ten
    1000 0000 0000 0000 0000 0000 0000 0010two = -2,147,483,646ten
    ...
    1111 1111 1111 1111 1111 1111 1111 1101two = -3ten
    1111 1111 1111 1111 1111 1111 1111 1110two = -2ten
    1111 1111 1111 1111 1111 1111 1111 1111two = -1ten

11
Sign Extension
  • Converting n bit numbers into numbers with more
    than n bits
  • MIPS 16 bit immediate gets converted to 32 bits
    for arithmetic
  • copy the most significant bit (the sign bit) into
    the other bits: 0010 -> 0000 0010, 1010 -> 1111 1010
  • "sign extension" (lbu vs. lb)
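A minimal sketch of sign extension on bit strings, assuming the most significant bit is written first; `sign_extend` is an illustrative name, not a MIPS or library function.

```python
def sign_extend(bits: str, new_width: int) -> str:
    """Widen a two's complement bit string by replicating its sign bit."""
    if new_width < len(bits):
        raise ValueError("new_width must not be smaller than the input width")
    return bits[0] * (new_width - len(bits)) + bits


print(sign_extend("0010", 8))  # 00000010
print(sign_extend("1010", 8))  # 11111010
```

Zero extension (as `lbu` performs on a loaded byte) would pad with `"0"` regardless of the top bit.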

12
Two's Comp. Addition
  • To add two's complement numbers, add the
    corresponding bits of both numbers with carry
    between bits.
  • For example,
       3  0011      -3  1101      -3  1101       3  0011
     + 2  0010    + -2  1110    +  2  0010    + -2  1110
     ---------    ----------    ----------    ----------
       5  0101      -5  1011      -1  1111       1  0001
  • Unsigned and two's complement addition are
    performed exactly the same way, but how they
    detect overflow differs.

13
Two's Comp. Subtraction
  • It is actually done by addition
  • To subtract two's complement numbers we first
    negate the second number and then add the
    corresponding bits of both numbers.
  • For example
       3  0011      -3  1101      -3  1101       3  0011
     - 2  0010    - -2  1110    -  2  0010    - -2  1110
     ---------    ----------    ----------    ----------
       1  0001      -1  1111      -5  1011       5  0101

14
Set-on-less-than
  • Can be done by an addition
  • The set-on-less instruction
  • slt $s1, $s2, $s3
  • sets $s1 to 1 if ($s2 < $s3) and to 0
    otherwise.
  • This can be accomplished by
  • subtracting $s3 from $s2
  • setting the least significant bit to the sign bit
    of the result
  • setting all other bits to zero
  • For example,
       $s2  1010        $s2  0111
     - $s3  1011      - $s3  0100
     -----------      -----------
            1111             0011
       $s1  0001        $s1  0000
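The subtract-and-take-the-sign-bit idea can be sketched as below. This is a simplified model: it assumes 4-bit operands by default and, like the slide, ignores the case where the subtraction itself overflows (a real ALU has to correct for that).

```python
def slt(s2: int, s3: int, width: int = 4) -> int:
    """Set-on-less-than via subtraction: return the sign bit of s2 - s3.

    Operands are raw bit patterns of `width` bits. Overflow of the
    subtraction is deliberately not handled in this sketch.
    """
    mask = (1 << width) - 1
    diff = (s2 - s3) & mask      # subtraction wraps to the register width
    return diff >> (width - 1)   # the sign bit becomes the least significant bit


print(slt(0b1010, 0b1011))  # 1
print(slt(0b0111, 0b0100))  # 0
```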

15
Overflow
  • When adding or subtracting numbers, the sum or
    difference can go beyond the range of numbers
    that hardware can represent.
  • This is known as overflow. For example, for two's
    complement numbers,
       5  0101      -5  1011       5  0101      -5  1011
     + 6  0110    + -6  1010    - -6  1010    -  6  0110
     ---------    ----------    ----------    ----------
      -5  1011       5  0101      -5  1011       5  0101
  • Overflow creates an incorrect result that should
    be detected.

16
2's Comp - Detecting Overflow
  • When adding two's complement numbers, overflow
    will only occur if
  • the numbers being added have the same sign
  • the sign of the result is different
  • If we perform the addition
        a(n-1) a(n-2) ... a1 a0
      + b(n-1) b(n-2) ... b1 b0
      -------------------------
        s(n-1) s(n-2) ... s1 s0
  • Overflow can be detected as
    $\text{overflow} = a_{n-1}\, b_{n-1}\, \overline{s_{n-1}} + \overline{a_{n-1}}\, \overline{b_{n-1}}\, s_{n-1}$
  • Overflow can also be detected as
    $\text{overflow} = c_{n-1} \oplus c_n$
  • where $c_{n-1}$ and $c_n$ are the carry in and carry out
    of the most significant bit.
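Both detection rules (same-sign operands with a different-sign result, and carry into the MSB differing from carry out of it) can be checked side by side. The sketch below assumes 4-bit operands given as raw bit patterns; the function name is illustrative.

```python
def add_with_overflow(a: int, b: int, width: int = 4):
    """Add two `width`-bit patterns; return (sum, sign_rule, carry_rule)."""
    mask = (1 << width) - 1
    msb = width - 1
    a &= mask
    b &= mask
    total = a + b
    s = total & mask

    # Rule 1: operands share a sign, but the result's sign differs.
    a_sign, b_sign, s_sign = a >> msb, b >> msb, s >> msb
    sign_rule = (a_sign == b_sign) and (s_sign != a_sign)

    # Rule 2: carry into the MSB differs from carry out of the MSB.
    low_mask = (1 << msb) - 1
    carry_in = ((a & low_mask) + (b & low_mask)) >> msb
    carry_out = total >> width
    carry_rule = carry_in != carry_out

    return s, sign_rule, carry_rule


print(add_with_overflow(0b0101, 0b0110))  # (11, True, True): 5 + 6 overflows
print(add_with_overflow(0b0011, 0b1101))  # (0, False, False): 3 + (-3) is fine
```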

17
Detecting Overflow
  • No overflow when adding a positive and a negative
    number
  • No overflow when signs are the same for
    subtraction
  • Overflow occurs when the value affects the sign
  • overflow when adding two positives yields a
    negative
  • or, adding two negatives gives a positive
  • or, subtract a negative from a positive and get a
    negative
  • or, subtract a positive from a negative and get a
    positive
  • Consider the operations A + B, and A - B
  • Can overflow occur if B is 0 ?
  • Can overflow occur if A is 0 ?

18
MIPS arithmetic instruction format
Bit positions: 31 | 25 | 20 | 15 | 5 | 0
R-type: op | Rs | Rt | Rd | funct
I-type: op | Rs | Rt | Immed 16

Type    op   funct
ADDI    10   xx
ADDIU   11   xx
SLTI    12   xx
SLTIU   13   xx
ANDI    14   xx
ORI     15   xx
XORI    16   xx
LUI     17   xx

Type    op   funct
ADD     00   40
ADDU    00   41
SUB     00   42
SUBU    00   43
AND     00   44
OR      00   45
XOR     00   46
NOR     00   47

Type    op   funct
SLT     00   52
SLTU    00   53
19
Refined Requirements
(1) Functional Specification
    inputs: 2 x 32-bit operands A, B, 4-bit mode
    outputs: 32-bit result S, 1-bit carry, 1-bit overflow
    operations: add, addu, sub, subu, and, or, xor, nor, slt, sltu
(2) Block Diagram
    [Figure: ALU block with 32-bit inputs A and B, 4-bit mode input m, and outputs c (carry), ovf (overflow), and 32-bit S]
(3) How to design?
20
Full Adder
  • A fundamental building block in the ALU is a full
    adder (FA).
  • A FA performs a one bit addition.
  • $a_i + b_i + c_i = 2c_{i+1} + s_i$

[Figure: FA block with inputs ai, bi, ci and outputs ci+1, si. Worked example with carries 1 1 1 0: 0101 + 0111 = 1100]
21
Full Adder Logic Equations
  • si is 1 if an odd number of inputs are 1.
  • ci+1 is 1 if two or more inputs are 1.

ai bi ci ci+1 si
0 0 0 0 0
0 0 1 0 1
0 1 0 0 1
0 1 1 1 0
1 0 0 0 1
1 0 1 1 0
1 1 0 1 0
1 1 1 1 1
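The two statements above translate directly into gate-level expressions: the sum is the XOR (odd parity) of the three inputs and the carry out is their majority. The sketch below is one common formulation and not necessarily the nine-gate circuit the slides refer to.

```python
def full_adder(a: int, b: int, c_in: int):
    """One-bit full adder; returns (carry_out, sum)."""
    s = a ^ b ^ c_in                             # 1 when an odd number of inputs are 1
    c_out = (a & b) | (a & c_in) | (b & c_in)    # 1 when two or more inputs are 1
    return c_out, s


# Reproduce the truth table, columns: ai bi ci ci+1 si
for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            c_out, s = full_adder(a, b, c)
            print(a, b, c, c_out, s)
```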
22
Full Adder Design
  • One possible implementation of a full adder uses
    nine gates.

23
ALU Design
24
One bit ALU
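[Figure: one-bit ALU. Inputs A, B, and CarryIn feed an AND gate, an OR gate, and an adder; a Mux controlled by S-select picks the Result; the adder also produces CarryOut]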
25
Additional operations
  • A - B = A + (-B)
  • form two's complement by invert and add one

[Figure: one-bit ALU as before (AND, OR, 1-bit full adder, Mux controlled by S-select, CarryIn, CarryOut), with an "invert" control that complements B before the adder]
Set-less-than? left as an exercise
26
Overflow Detection Logic
  • Carry into MSB ≠ Carry out of MSB
  • For an N-bit ALU: Overflow = CarryIn[N-1] XOR
    CarryOut[N-1]

X Y | X XOR Y
0 0 | 0
0 1 | 1
1 0 | 1
1 1 | 0

[Figure: chain of 1-bit ALUs with inputs A0/B0 ... A3/B3, carries CarryIn0 ... CarryIn3, and outputs Result0 ... Result3; Overflow is the XOR of CarryIn3 and CarryOut3]
27
Larger ALUs
  • Three 1-bit ALUs, a 1-bit MSB ALU, and a 4-input
    NOR gate can be concatenated to form a 4-bit ALU.

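[Figure: 4-bit ALU built from three 1-bit ALUs and a 1-bit MSB ALU. Inputs a0..a3, b0..b3, carries c0..c4, ALUop control lines, "less" inputs tied to 0 on the upper bits; outputs r0..r3, set, overflow V, and zero flag Z from the NOR gate]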
  • 31 1-bit ALUs, a 1-bit MSB ALU, and a 32-input
    NOR gate can be concatenated to form a 32-bit ALU.

28
But What about Performance?
  • Critical path of an n-bit ripple-carry adder is n × CP
    (CP: critical path of one 1-bit stage)

[Figure: four 1-bit ALUs chained so that CarryOut0 feeds CarryIn1, CarryOut1 feeds CarryIn2, and CarryOut2 feeds CarryIn3; inputs A0/B0 ... A3/B3, outputs Result0 ... Result3]
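To see why the delay grows with n, it can help to model the ripple-carry chain in software: each stage cannot finish until the previous stage's carry is known. The sketch assumes bit lists are given least significant bit first; the names are illustrative.

```python
def full_adder(a: int, b: int, c_in: int):
    """One-bit full adder; returns (carry_out, sum)."""
    return (a & b) | (a & c_in) | (b & c_in), a ^ b ^ c_in


def ripple_carry_add(a_bits, b_bits, c0=0):
    """Add two equal-length bit lists (LSB first) one stage at a time."""
    carry = c0
    result = []
    for a, b in zip(a_bits, b_bits):
        # Stage i must wait for the carry produced by stage i-1.
        carry, s = full_adder(a, b, carry)
        result.append(s)
    return result, carry


# 0101 + 0111 = 1100 (5 + 7 = 12), written LSB first
print(ripple_carry_add([1, 0, 1, 0], [1, 1, 1, 0]))  # ([0, 0, 1, 1], 0)
```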
29
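Carry Look Ahead (Design trick: peek)

A B | C-out
0 0 | 0     "kill"
0 1 | C-in  "propagate"
1 0 | C-in  "propagate"
1 1 | 1     "generate"

G = A and B
P = A xor B

$C_1 = G_0 + C_0 P_0$
$C_2 = G_1 + G_0 P_1 + C_0 P_0 P_1$
$C_3 = G_2 + G_1 P_2 + G_0 P_1 P_2 + C_0 P_0 P_1 P_2$
$C_4 = \dots$

[Figure: four adder cells, each with inputs A, B, Cin and outputs S, G, P; the G and P signals feed the carry equations above]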
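As a sanity check on the lookahead equations, the sketch below computes the first carries of a 4-bit addition directly from the generate and propagate signals, taking G = A and B and P = A xor B as in the kill/propagate/generate table. The function name and the fixed 4-bit width are assumptions of this example.

```python
def lookahead_carries(a: int, b: int, c0: int = 0):
    """Return [C0, C1, C2, C3] for a 4-bit add using carry lookahead."""
    g = [(a >> i) & (b >> i) & 1 for i in range(4)]      # generate
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(4)]    # propagate
    c1 = g[0] | (c0 & p[0])
    c2 = g[1] | (g[0] & p[1]) | (c0 & p[0] & p[1])
    c3 = (g[2] | (g[1] & p[2]) | (g[0] & p[1] & p[2])
          | (c0 & p[0] & p[1] & p[2]))
    return [c0, c1, c2, c3]


print(lookahead_carries(0b0101, 0b0111))  # [0, 1, 1, 1]
```

Every carry depends only on the inputs and C0, so in hardware all of them can be produced in parallel instead of rippling.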
30
Cascaded Carry Look-ahead (16-bit) Abstraction
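[Figure: 4-bit lookahead blocks cascaded to 16 bits; each block takes C0, G0, P0 and so on, and exports a block-level G and P]

$C_1 = G_0 + C_0 P_0$
$C_2 = G_1 + G_0 P_1 + C_0 P_0 P_1$
$C_3 = G_2 + G_1 P_2 + G_0 P_1 P_2 + C_0 P_0 P_1 P_2$
$C_4 = \dots$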
31
Conclusions
  • An n-bit ALU can be designed by concatenating n
    1-bit ALUs.
  • Carry lookahead logic can be used to improve the
    speed of the computation.
  • A variety of design options exist for
    implementing the ALU.
  • The best design depends on area, delay, and power
    requirements, which vary based on the underlying
    technology.

32
Review of Previous Contents
  • Design ALU (Arithmetic Logic Unit)
  • It is easy to design logical gates (and gate, or
    gate)
  • Key Point
  • Use logical gates to express arithmetic operation
  • the addition operation
  • Subtraction is addition
  • Set less than is subtraction
  • Solution
  • Design a one-bit adder (full adder)
  • Input 2 one-bit data and a one-bit carry in
  • Output one-bit result and a one-bit carry out
  • Extend the full adder to one-bit ALU
  • Support logical, subtraction, set less than
    operations
  • Cascade many one-bit ALUs into a larger ALU
  • The last one-bit ALU should also support overflow
    detection
  • Carry look ahead to improve its performance

33
Review of The Previous Content: full adder
ai bi ci ci+1 si
0 0 0 0 0
0 0 1 0 1
0 1 0 0 1
0 1 1 1 0
1 0 0 0 1
1 0 1 1 0
1 1 0 1 0
1 1 1 1 1
Only support 1-bit add
34
Review of The Previous Content 1-bit ALU
  • a full adder
  • an xor gate
  • a 4-to-1 mux

[Figure: 1-bit ALU block with inputs ai, bi, lessi, ci, and ALUop; outputs ci+1 and ri; the mux inputs are numbered 0 to 3]

ALUOp | Function
000   | AND
001   | OR
010   | ADD
110   | SUBTRACT
111   | SET-ON-LESS-THAN
35
MIPS arithmetic instructions
  • Instruction | Example | Meaning | Comments
  • multiply | mult $2,$3 | Hi, Lo = $2 x $3 | 64-bit signed product
  • multiply unsigned | multu $2,$3 | Hi, Lo = $2 x $3 | 64-bit unsigned product
  • divide | div $2,$3 | Lo = $2 ÷ $3, Hi = $2 mod $3 | Lo = quotient, Hi = remainder
  • divide unsigned | divu $2,$3 | Lo = $2 ÷ $3, Hi = $2 mod $3 | Unsigned quotient & remainder
  • move from Hi | mfhi $1 | $1 = Hi | Used to get copy of Hi
  • move from Lo | mflo $1 | $1 = Lo | Used to get copy of Lo
  • shift left logical | sll $1,$2,10 | $1 = $2 << 10 | Shift left by constant
  • shift right logical | srl $1,$2,10 | $1 = $2 >> 10 | Shift right by constant
  • shift right arithm. | sra $1,$2,10 | $1 = $2 >> 10 | Shift right (sign extend)
  • shift left logical | sllv $1,$2,$3 | $1 = $2 << $3 | Shift left by variable
  • shift right logical | srlv $1,$2,$3 | $1 = $2 >> $3 | Shift right by variable
  • shift right arithm. | srav $1,$2,$3 | $1 = $2 >> $3 | Shift right arith. by variable

36
MULTIPLY (unsigned)
  • Paper and pencil example (unsigned)
      Multiplicand       1000
      Multiplier       x 1001
                         1000
                        0000
                       0000
                      1000
      Product        01001000
  • m bits x n bits = m+n bit product
  • Binary makes it easy
  • 0 => place 0 (0 x multiplicand)
  • 1 => place a copy (1 x multiplicand)

37
Unsigned Combinational Multiplier
  • Stage i accumulates $A \times 2^i$ if $B_i = 1$

38
How does it work?
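[Figure: array of adders forming a 4-bit combinational multiplier; rows are gated by multiplier bits B0..B3, unused inputs are tied to 0, and the outputs are product bits P0..P7]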
  • at each stage shift A left ( x 2)
  • use next bit of B to determine whether to add in
    shifted multiplicand
  • accumulate 2n bit partial product at each stage
  • 4 versions of multiply hardware algorithm
  • successive refinement

39
Unsigned shift-add multiplier (version 1)
  • 64-bit Multiplicand reg, 64-bit ALU, 64-bit
    Product reg, 32-bit multiplier reg

[Figure: multiplier datapath and control. 64-bit Multiplicand register (shift left), 32-bit Multiplier register (shift right), 64-bit ALU, 64-bit Product register (write), and Control]
40
Multiply Algorithm Version 1
Start
  • Example: Multiplier = 0011, Multiplicand = 0010

1. Test Multiplier0
   Multiplier0 = 1: 1a. Add multiplicand to product & place
   the result in Product register
   Multiplier0 = 0: skip the add
2. Shift the Multiplicand register left 1 bit.
3. Shift the Multiplier register right 1 bit.
nth repetition?
   No (< n repetitions): go back to step 1
   Yes (n repetitions): Done

  • Product | Multiplier | Multiplicand
  • 0000 0000 | 0011 | 0000 0010
  • 0000 0010 | 0001 | 0000 0100
  • 0000 0110 | 0000 | 0000 1000
  • 0000 0110
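The version 1 loop (test the low multiplier bit, conditionally add, shift the multiplicand left, shift the multiplier right) can be sketched as follows. Register widths are modeled loosely with Python integers and `n` is the operand width; the function name is made up for this example.

```python
def multiply_v1(multiplicand: int, multiplier: int, n: int = 4) -> int:
    """Unsigned shift-add multiply in the style of version 1."""
    product = 0
    for _ in range(n):
        if multiplier & 1:           # step 1: test Multiplier0
            product += multiplicand  # step 1a: add multiplicand to product
        multiplicand <<= 1           # step 2: shift Multiplicand left
        multiplier >>= 1             # step 3: shift Multiplier right
    return product & ((1 << (2 * n)) - 1)  # product register holds 2n bits


print(format(multiply_v1(0b0010, 0b0011), "08b"))  # 00000110
print(format(multiply_v1(0b1000, 0b1001), "08b"))  # 01001000
```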
41
Observations on Multiply Version 1
  • 1 clock per cycle => ≈ 100 clocks per multiply
  • 1/2 bits in multiplicand always 0 => 64-bit adder
    is wasted
  • 0s inserted in left of multiplicand as
    shifted => least significant bits of product
    never changed once formed
  • Instead of shifting multiplicand to left, shift
    product to right.

42
MULTIPLY HARDWARE Version 2
  • Half of the shifted multiplicand register
    contains 0. Why not shift the product right?
  • 32-bit Multiplicand reg, 32 -bit ALU, 64-bit
    Product reg, 32-bit Multiplier reg

[Figure: version 2 datapath. 32-bit Multiplicand register, 32-bit Multiplier register (shift right), 32-bit ALU, 64-bit Product register (shift right, write), and Control]
43
Multiply Algorithm Version 2
Step | Product   | Multiplier | Multiplicand
init | 0000 0000 | 0011       | 0010
1    | 0010 0000 | 0011       | 0010
2    | 0001 0000 | 0011       | 0010
3    | 0001 0000 | 0001       | 0010
1    | 0011 0000 | 0001       | 0010
2    | 0001 1000 | 0001       | 0010
3    | 0001 1000 | 0000       | 0010
1    | 0001 1000 | 0000       | 0010
2    | 0000 1100 | 0000       | 0010
3    | 0000 1100 | 0000       | 0010
1    | 0000 1100 | 0000       | 0010
2    | 0000 0110 | 0000       | 0010
3    | 0000 0110 | 0000       | 0010
done | 0000 0110 | 0000       | 0010
44
Still more wasted space!
[Figure: the same Product / Multiplier / Multiplicand trace as on the Multiply Algorithm Version 2 slide, ending with Product = 0000 0110, Multiplier = 0000, Multiplicand = 0010]
45
Observations on Multiply Version 2
  • Product register wastes space that exactly
    matches size of multiplier
  • Both Multiplier register and Product register
    require right shift
  • Combine Multiplier register and Product register

46
MULTIPLY HARDWARE Version 3
  • Product register wastes space that exactly
    matches size of multiplier => combine Multiplier
    register and Product register
  • 32-bit Multiplicand reg, 32-bit ALU, 64-bit
    Product reg, (0-bit Multiplier reg)

[Figure: version 3 datapath. 32-bit Multiplicand register, 32-bit ALU, 64-bit Product register that also holds the Multiplier (shift right, write), and Control]
47
Multiply Algorithm Version 3
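Start
  • Example: Multiplicand = 0010, Product = 0000 0011

1. Test Product0
   Product0 = 1: 1a. Add multiplicand to the higher (left) bits
   of the Product register
   Product0 = 0: skip the add
2. Shift the Product register right 1 bit.
32nd repetition?
   No (< 32 repetitions): go back to step 1
   Yes (32 repetitions): Done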
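A sketch of the version 3 idea, where the multiplier starts in the right half of the product register and is shifted out as the product grows in from the left. It uses an operand width `n` (4 here, 32 in the MIPS case) and lets a Python integer absorb the carry out of the add, which real hardware would shift in explicitly.

```python
def multiply_v3(multiplicand: int, multiplier: int, n: int = 4) -> int:
    """Unsigned multiply with a combined Product/Multiplier register."""
    product = multiplier                 # multiplier sits in the right half
    for _ in range(n):
        if product & 1:                  # test Product0
            product += multiplicand << n # add to the left half only
        product >>= 1                    # one shift serves both halves
    return product


print(format(multiply_v3(0b0010, 0b0011), "08b"))  # 00000110
```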
48
Observations on Multiply Version 3
  • 2 steps per bit because Multiplier & Product are
    combined
  • MIPS registers Hi and Lo are left and right half
    of Product
  • Gives us MIPS instruction multu
  • Can be improved further
  • Booth's Algorithm

49
Booth's Algorithm
[Figure: bit string 0 1 1 1 1 0 annotated, from left to right, with "end of run", "middle of run", and "beginning of run"]
  • Current Bit | Bit to the Right | Explanation | Example | Op
  • 1 | 0 | Begins run of 1s | 0001111000 | sub
  • 1 | 1 | Middle of run of 1s | 0001111000 | none
  • 0 | 1 | End of run of 1s | 0001111000 | add
  • 0 | 0 | Middle of run of 0s | 0001111000 | none
  • Originally for speed (when shift was faster than
    add)
  • When we see a string of 1s in multiplier
  • Replace it with an initial subtract when we first
    see a one
  • Then later add for the bit after the last one
  • 011110 = -000010 + 100000
50
Motivation for Booth's Algorithm
  • Example: 2 x 6 = 0010 x 0110
          0010
      x   0110
      +   0000     shift (0 in multiplier)
      +  0010      add (1 in multiplier)
      + 0010       add (1 in multiplier)
      +0000        shift (0 in multiplier)
       00001100
  • ALU with add or subtract gets same result in more
    than one way: 6 = -2 + 8, i.e., 0110 = -00010 + 01000
  • For example
          0010
      x   0110
          0000     shift (0 in multiplier)
      -  0010      sub (start string of 1s)
        0000       shift (mid run of 1s)
      +0010        add (end run of 1s)
       00001100

51
Booth's Example (2 x 7)
Operation         | Multiplicand | Product       | next?
0. initial value  | 0010         | 0000 0111 0   | subtract 0010
1a. P = P - m     | 1110         | + 1110        |
                  |              | 1110 0111 0   | shift P (sign ext)
1b.               | 0010         | 1111 0011 1   | 11 -> nop, shift
2.                | 0010         | 1111 1001 1   | 11 -> nop, shift
3.                | 0010         | 1111 1100 1   | 01 -> add
4a.               | 0010         | + 0010        |
                  |              | 0001 1100 1   | shift
4b.               | 0010         | 0000 1110 0   | done
52
Booth's Example (2 x -3)
Operation         | Multiplicand | Product       | next?
0. initial value  | 0010         | 0000 1101 0   | 10 -> sub
1a. P = P - m     | 1110         | + 1110        |
                  |              | 1110 1101 0   | shift P (sign ext)
1b.               | 0010         | 1111 0110 1   | 01 -> add
                  |              | + 0010        |
2a.               |              | 0001 0110 1   | shift P
2b.               | 0010         | 0000 1011 0   | 10 -> sub
                  |              | + 1110        |
3a.               | 0010         | 1110 1011 0   | shift
3b.               | 0010         | 1111 0101 1   | 11 -> nop
4a.               |              | 1111 0101 1   | shift
4b.               | 0010         | 1111 1010 1   | done
53
Review of the Last Lecture
  • Multiply can be accomplished by addition and
    shift
  • Four versions of multiply algorithm
  • Come out directly from paper and pencil
  • Test the last bit of the multiplier
  • If 1
  • Add the multiplicand to the product
  • Shift the multiplicand to left
  • Shift the multiplier to right
  • Shift the product to right will get the same
    result
  • Add multiplicand to the higher bits of the
    product register
  • No shift of the multiplicand
  • Space saved
  • ALU only need to handle half of the original bits
  • Computation saved
  • Lower bits of the product register as the
    multiplier register
  • No separate multiplier register
  • Space saved
  • One shift instead of two
  • Computation saved

54
Shifters
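Two kinds:
  • logical: value shifted in is always "0"
  • arithmetic: on right shifts, sign extend
[Figure: shift diagrams showing msb and lsb, with "0" shifted in for logical shifts and the msb replicated for arithmetic right shifts]
Note: these are single-bit shifts. A given
instruction might request 0 to 32 bits to be
shifted!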
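The difference between the two kinds of right shift shows up only for values whose top bit is set. The sketch below models 32-bit registers as unsigned Python integers; the names `srl` and `sra` mirror the MIPS mnemonics but are ordinary functions written for this example.

```python
def srl(value: int, amount: int, width: int = 32) -> int:
    """Logical right shift: zeros are shifted in from the left."""
    mask = (1 << width) - 1
    return (value & mask) >> amount


def sra(value: int, amount: int, width: int = 32) -> int:
    """Arithmetic right shift: the sign bit is replicated from the left."""
    mask = (1 << width) - 1
    value &= mask
    if value >> (width - 1):      # top bit set: treat as negative
        value -= 1 << width
    return (value >> amount) & mask  # Python's >> on negatives sign-extends


print(hex(srl(0xFFFFFFF0, 2)))  # 0x3ffffffc
print(hex(sra(0xFFFFFFF0, 2)))  # 0xfffffffc
```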
55
Divide: Paper & Pencil
                        1001     Quotient
      Divisor 1000 | 1001010     Dividend
                    -1000
                        10
                        101
                        1010
                       -1000
                          10     Remainder (or Modulo result)
  • Creating quotient bit on each step
  • Quotient bit is 1 if Divisor <= Dividend, otherwise 0
  • How does the ALU know if Divisor <= Dividend?
  • Subtract and if remainder is less than 0
  • add back,
  • Shift and try again.
  • Three versions of divide, successive refinement

56
DIVIDE HARDWARE Version 1
  • 64-bit Divisor reg, 64-bit ALU, 64-bit Remainder
    reg, 32-bit Quotient reg

[Figure: divide version 1 datapath. 64-bit Divisor register (shift right), 32-bit Quotient register (shift left), 64-bit ALU, 64-bit Remainder register (write), and Control]
57
Divide Algorithm Version 1
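  • Takes n+1 steps for n-bit Quotient & Rem.

[Flowchart: Test Remainder, with branches "Remainder >= 0" and "Remainder < 0"; then "(n+1)th repetition?" with "No: < n+1 repetitions" looping back and "Yes: n+1 repetitions (n = 4 here)" leading to Done]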
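The flowchart details did not survive extraction, so the following is a sketch of the usual restoring division scheme that matches the description given earlier (subtract; if the remainder goes negative, add back; shift and try again). The step order and the function name are assumptions of this example, not a transcription of the slide.

```python
def divide_v1(dividend: int, divisor: int, n: int = 4):
    """Unsigned restoring division; returns (quotient, remainder).

    The divisor starts in the left half of a 2n-bit register and is
    shifted right once per step, for n + 1 steps in total.
    """
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    remainder = dividend
    div = divisor << n
    quotient = 0
    for _ in range(n + 1):
        remainder -= div                   # subtract divisor from remainder
        if remainder >= 0:
            quotient = (quotient << 1) | 1 # shift in a 1
        else:
            remainder += div               # restore the old remainder
            quotient <<= 1                 # shift in a 0
        div >>= 1                          # shift the divisor right
    return quotient, remainder


print(divide_v1(0b0111, 0b0010))     # (3, 1)
print(divide_v1(0b1001010, 0b1000))  # (9, 2)
```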
58
Observations on Divide Version 1
  • 1/2 bits in divisor always 0 => 1/2 of 64-bit
    adder is wasted => 1/2 of divisor is wasted
  • Instead of shifting divisor to right, shift
    remainder to left?
  • 1st step cannot produce a 1 in quotient bit
    (otherwise too big) => switch order to shift
    first and then subtract, can save 1 iteration

59
Divide: Paper & Pencil
                       01010     Quotient
      Divisor 0001 | 00001010    Dividend
      (intermediate rows as extracted: 00001, 0001, 00001, 0001, 0001)
                          00     Remainder (or Modulo result)
  • Notice that there is no way to get a 1 in
    leading digit! (This would be an overflow, since
    quotient would have n+1 bits.)

60
DIVIDE HARDWARE Version 2
  • 32-bit Divisor reg, 32-bit ALU, 64-bit Remainder
    reg, 32-bit Quotient reg

[Figure: divide version 2 datapath. 32-bit Divisor register, 32-bit Quotient register (shift left), 32-bit ALU, 64-bit Remainder register (shift left, write), and Control]
61
Divide Algorithm Version 2
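[Flowchart: Test Remainder, with branches "Remainder >= 0" and "Remainder < 0"; then "nth repetition?" with "No: < n repetitions" looping back and "Yes: n repetitions (n = 4 here)" leading to Done]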
62
Observations on Divide Version 2
  • Eliminate Quotient register by combining with
    remainder as shifted left
  • Start by shifting the Remainder left as before.
  • Thereafter loop contains only two steps because
    the shifting of the Remainder register shifts
    both the remainder in the left half and the
    quotient in the right half
  • The consequence of combining the two registers
    together and the new order of the operations in
    the loop is that the remainder will be shifted left
    one time too many.
  • Thus the final correction step must shift back
    only the remainder in the left half of the
    register

63
DIVIDE HARDWARE Version 3
  • 32-bit Divisor reg, 32 -bit ALU, 64-bit Remainder
    reg, (0-bit Quotient reg)

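[Figure: divide version 3 datapath. 32-bit Divisor register, 32-bit ALU, 64-bit Remainder register split into HI and LO halves that also holds the Quotient (shift left, write), and Control]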
64
Divide Algorithm Version 3
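[Flowchart: Test Remainder, with branches "Remainder < 0" and "Remainder >= 0"; then "nth repetition?" with "No: < n repetitions" looping back and "Yes: n repetitions (n = 4 here)" leading to Done]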
65
Observations on Divide Version 3
  • Same hardware as multiply: just need ALU to add
    or subtract, and 64-bit register to shift left or
    shift right
  • Hi and Lo registers in MIPS combine to act as
    64-bit register for multiply and divide
  • Signed divides: simplest is to remember signs,
    make positive, and complement quotient and
    remainder if necessary
  • Note: Dividend and Remainder must have same sign
  • Note: Quotient negated if Divisor sign & Dividend
    sign disagree, e.g., -7 ÷ 2 = -3, remainder = -1
  • Possible for quotient to be too large: if a
    64-bit integer is divided by 1, the quotient is 64 bits
    (called saturation)
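The "remember signs, make positive, fix up afterwards" rule can be sketched as below. The unsigned step uses Python's built-in `divmod` on absolute values as a stand-in for the hardware divider; the function name is illustrative.

```python
def signed_divide(dividend: int, divisor: int):
    """Signed division where the remainder takes the dividend's sign."""
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    quotient, remainder = divmod(abs(dividend), abs(divisor))
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient     # signs disagree: negate the quotient
    if dividend < 0:
        remainder = -remainder   # remainder follows the dividend's sign
    return quotient, remainder


print(signed_divide(-7, 2))  # (-3, -1)
print(signed_divide(7, -2))  # (-3, 1)
```

Python's own `divmod(-7, 2)` returns `(-4, 1)` because it rounds toward negative infinity, so it does not follow the convention described on the slide.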

66
Floating Point (a brief look)
  • We need a way to represent
  • numbers with fractions, e.g., 3.1416
  • very small numbers, e.g., .000000001
  • very large numbers, e.g., $3.15576 \times 10^9$
  • Representation
  • sign, exponent, significand:
    $(-1)^{\text{sign}} \times \text{significand} \times 2^{\text{exponent}}$
  • more bits for significand gives more accuracy
  • more bits for exponent increases range
  • IEEE 754 floating point standard
  • single precision 8 bit exponent, 23 bit
    significand
  • double precision 11 bit exponent, 52 bit
    significand

67
IEEE 754 floating-point standard
  • Leading 1 bit of significand is implicit
  • Exponent is biased to make sorting easier
  • all 0s is smallest exponent all 1s is largest
  • bias of 127 for single precision and 1023 for
    double precision
  • summary: $(-1)^{\text{sign}} \times (1 + \text{significand}) \times 2^{\text{exponent} - \text{bias}}$
  • Example
  • decimal: -.75 = -(½ + ¼)
  • binary: -.11 = -1.1 x $2^{-1}$
  • floating point: exponent = 126 = 01111110
  • IEEE single precision: 10111111010000000000000000000000
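The -.75 example can be checked with Python's standard `struct` module, which packs a float into IEEE 754 single precision. The decoder below applies the summary formula and is only a sketch for normalized numbers (it ignores zero, denormals, infinities, and NaN); both function names are made up for this example.

```python
import struct


def float_to_bits(x: float) -> str:
    """Return the 32-bit IEEE 754 single-precision pattern of x."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    return format(raw, "032b")


def decode_single(bits: str) -> float:
    """Apply (-1)^sign x (1 + significand) x 2^(exponent - 127)."""
    sign = int(bits[0])
    exponent = int(bits[1:9], 2)
    fraction = int(bits[9:], 2) / 2**23  # the stored 23 bits, without the implicit 1
    return (-1) ** sign * (1 + fraction) * 2.0 ** (exponent - 127)


print(float_to_bits(-0.75))  # 10111111010000000000000000000000
print(decode_single("10111111010000000000000000000000"))  # -0.75
```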

68
Floating point addition

69
Floating Point Complexities
  • Operations are more complicated (see textbook)
  • In addition to overflow we can have underflow
  • Accuracy can be a big problem
  • IEEE 754 keeps two extra bits, guard and round
  • four rounding modes
  • positive divided by zero yields infinity
  • zero divided by zero yields not a number
  • other complexities
  • Implementing the standard can be tricky
  • Not using the standard can be even worse
  • see text for description of 80x86 and Pentium bug!

70
Chapter Three Summary
  • Computer arithmetic is constrained by limited
    precision
  • Bit patterns have no inherent meaning but
    standards do exist
  • two's complement
  • IEEE 754 floating point
  • Computer instructions determine meaning of the
    bit patterns
  • Performance and accuracy are important so there
    are many complexities in real machines
  • Algorithm choice is important and may lead to
    hardware optimizations for both space and time
    (e.g., multiplication)
  • You may want to look back (Section 3.10 is great
    reading!)