ECE 4120 FloPt 1 presentation

1
Floating-Point Computer Arithmetic

ECE 4120 Fundamentals of Computer Design
Dr. Roger L. Haggard, Associate Professor
Department of Electrical and Computer Engineering
Tennessee Technological University
Spring 2004

2
Floating Point Number Systems - Introduction

Can represent both integers and fractions with much wider range of values
N bits still represent up to $2^N$ values, but interpret differently
Scientific notation in base 10 is an FPNS
$-2.34 \times 10^{12} = -2.34\ \mathrm{E}12$
Parts of $-2.34 \times 10^{12}$:
Mantissa: sign -, mag 2.34, radix 10
Base: 10 (can be assumed)
Exponent: sign +, mag 12, radix 10
(radix usually assumed, not written with number)
∴ Every number must contain
mantissa sign & mag: - / 2.34
exponent sign & mag: + / 12

3
FPNS Basics (1)

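For Computer: Binary values most efficient
$V_{FPN} = \text{Mantissa} \times \text{Base}^{\text{Exponent}}$
(mantissa usually S/M, exponent usually excess code)
∴ $V_{FPN} = (-1)^{SIGN} \times V_M \times r_b^{V_E}$
$V_M$: value of mantissa, in base $r_b$, m digits
$r_b$: base of system
$V_E$: value of exponent, in base $r_e$, e digits
$V_{MMIN} \le V_M \le V_{MMAX}$
$V_{EMIN} \le V_E \le V_{EMAX}$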

4
FPNS Basics (2)

Mantissa is fixed point: assume value of p (number of digits to the right of the radix point)
Normalization: convert number so that the msd $\ne 0$, giving maximum precision for the number
Usually $p = m$, so mantissa is a pure fraction
∴ $V_{MMIN} = 0.1000 = 1/r_b$
∴ $V_{MMAX} = 0.1111 = 1 - r_b^{-m}$
Number of legal mantissas: $NLM = (r_b - 1) \times r_b^{m-1}$
(possible 1st digits times other digits)
Number of legal exponents: $NLE = r_e^{e}$ (code-dependent)
Number of representable values: $NRV = NLM \times NLE \times 2$ (signed!)
Min FP value: $V_{MIN} = V_{MMIN} \times r_b^{V_{EMIN}}$
Max FP value: $V_{MAX} = V_{MMAX} \times r_b^{V_{EMAX}}$
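The counting formulas above can be checked with a few lines of Python. This is only a sketch: the function name `fpns_limits` and its parameters are made up for illustration, and it assumes a normalized pure-fraction mantissa ($p = m$) with every exponent code legal.

```python
from fractions import Fraction

def fpns_limits(rb, re, m, e, ve_min, ve_max):
    """Evaluate the slide formulas for a normalized FPNS with p = m.

    Hypothetical helper: rb/re are the mantissa and exponent radices,
    m/e the digit counts, ve_min/ve_max the smallest and largest exponent values.
    """
    vm_min = Fraction(1, rb)             # 0.1000... = 1/rb
    vm_max = 1 - Fraction(1, rb ** m)    # largest m-digit fraction = 1 - rb^-m
    nlm = (rb - 1) * rb ** (m - 1)       # nonzero first digit, any other digits
    nle = re ** e                        # assumes every exponent code is legal
    nrv = nlm * nle * 2                  # times 2 for the sign
    v_min = vm_min * Fraction(rb) ** ve_min
    v_max = vm_max * Fraction(rb) ** ve_max
    return nlm, nle, nrv, v_min, v_max

# A binary system with a 4-bit mantissa and a 2-bit unsigned exponent (0..3)
print(fpns_limits(2, 2, 4, 2, 0, 3))
```

Exact fractions are used so that values such as 15/16 are not blurred by the host machine's own floating point.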

5
7-Bit Example 1

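FPNS with $r_b = 2$, $r_e = 2$, $m = 4$, $e = 2$, $p = m = 4$
Mantissa: (S/M), normalized, msd = 1
Exponent: (unsigned)
so value = (sign) $0.mmmm \times 2^{ee}$
$V_{MMIN} = 0.1000_2 = 1/2$
$V_{MMAX} = 0.1111_2 = 15/16$
$NLM = 8$
$V_{EMIN} = 00_2 = 0$
$V_{EMAX} = 11_2 = 3$
$NLE = 4$
$NRV = 8 \times 4 \times 2 = 64$
$V_{MAX} = 0.1111_2 \times 2^{11_2} = 15/16 \times 8 = 7\tfrac{1}{2}$
$V_{MIN} = 0.1000_2 \times 2^{00_2} = 1/2 \times 1 = 1/2$
If $V_E = 3$: $\Delta r = 0.0001_2 \times 2^3 = 1/2$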
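One way to confirm the counts in this example is to enumerate all 128 bit patterns and keep only the normalized ones. The sketch below assumes a field order of sign, 4 mantissa bits, 2 exponent bits; the slide does not state the order, and `decode7` is a name invented here.

```python
from fractions import Fraction

def decode7(bits):
    """Decode a 7-bit pattern laid out as s mmmm ee (field order is an assumption)."""
    sign = (bits >> 6) & 1
    mant = (bits >> 2) & 0b1111
    exp = bits & 0b11
    if mant < 0b1000:
        return None                        # msd = 0: not normalized, not a legal value
    value = Fraction(mant, 16) * 2 ** exp  # 0.mmmm x 2^ee
    return -value if sign else value

values = [v for v in map(decode7, range(128)) if v is not None]
positives = sorted(v for v in values if v > 0)

print(len(values))                   # number of representable values
print(positives[0], positives[-1])   # smallest and largest positive value
print(positives[-1] - positives[-2]) # spacing between neighbors when V_E = 3
```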

6
7-Bit Example 2
7
IEEE Floating Point Standard (Std 754)

$r_b = 2$, $r_e = 2$, $m = 24$, $e = 8$, $p = m - 1 = 23$
Mantissa: (S/M), MSB hidden, not stored
Exponent: (unsigned, excess 127)

8
IEEE Floating Point Standard (Std 754)
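$V_{MMIN} = 1.00\ldots0_2 = 1$
$V_{MMAX} = 1.11\ldots1_2 = 2 - 2^{-23}$
$NLM = 2^{23}$
$V_{EMIN} = 1 - 127 = -126$
$V_{EMAX} = 254 - 127 = 127$
$NLE = 2^8 - 2 = 254$
$NRV = 2^{23} \times 254 \times 2 \approx 4.2 \times 10^{9}$
$V_{MIN} = 1.00\ldots0_2 \times 2^{-126} \approx 1.2 \times 10^{-38}$
$V_{MAX} = 1.11\ldots1_2 \times 2^{127} \approx 3.4 \times 10^{38}$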

$2^{23} \approx 8 \times 10^{6}$, about 7 significant decimal digits ($\Delta r = 2^{-23} \times 2^{V_E}$)
Gradual underflow
$V < 2^{-126}$ (denormalized) when $S_E = 0$, $V_M \ne 0$ (Hidden bit = 0)
Error reporting
NaN (Not a Number): 0/0, $\infty/\infty$, /0, (NaN op X); $S_E = 255$, $V_M \ne 0$
Infinity: when $V > 2 \times 2^{127}$; $S_E = 255$, $V_M = 0$
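The limits and special encodings on this slide can be observed directly by reinterpreting 32-bit patterns as single-precision values with Python's standard `struct` module. The helper name `bits_to_float` and the particular NaN pattern chosen are illustrative assumptions; any pattern with $S_E = 255$ and a nonzero fraction is a NaN.

```python
import math
import struct

def bits_to_float(word):
    """Reinterpret a 32-bit pattern as an IEEE 754 single-precision value."""
    return struct.unpack(">f", struct.pack(">I", word))[0]

v_min = bits_to_float(0x00800000)      # SE = 1,   fraction = 0       -> 1.00...0 x 2^-126
v_max = bits_to_float(0x7F7FFFFF)      # SE = 254, fraction all ones  -> 1.11...1 x 2^127
denormal = bits_to_float(0x00000001)   # SE = 0,   fraction != 0      -> hidden bit is 0
infinity = bits_to_float(0x7F800000)   # SE = 255, fraction = 0
nan = bits_to_float(0x7FC00000)        # SE = 255, fraction != 0

print(v_min, v_max, denormal, infinity, nan)
```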

9
IEEE FPNS Conversion Example

Convert IEEE value $\mathrm{C0500000}_{16}$ to its decimal value

$S = 1$ (-)
$S_E = 128$, so $V_E = 128 - 127 = 1$
$V_M = 1.1010\ldots0_2 = 1.625_{10}$
$V = (-1)^1 \times 1.625 \times 2^1 = -3.250_{10}$
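The same conversion can be sketched in Python by pulling the three fields out of the word by hand. `decode_single` is a hypothetical helper that handles normalized values only (it ignores denormals, NaN, and infinity); the last line compares it with the standard `struct` decoding.

```python
import struct

def decode_single(word):
    """Decode a normalized IEEE 754 single-precision pattern field by field."""
    sign = word >> 31                 # S:  bit 31
    se = (word >> 23) & 0xFF          # SE: bits 30..23, stored in excess 127
    frac = word & 0x7FFFFF            # bits 22..0, fraction after the hidden 1
    ve = se - 127                     # VE = SE - 127
    vm = 1 + frac / 2 ** 23           # VM = 1.f
    return (-1) ** sign * vm * 2.0 ** ve

print(decode_single(0xC0500000))
print(struct.unpack(">f", bytes.fromhex("C0500000"))[0])
```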
10
IEEE FPNS Addition
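Floating Point Add (Positive Operands)
∴ Align smaller value
Decimal: $1.2 \times 10^2 \rightarrow 0.12 \times 10^3$ (Shift Right 1)
$0.12 \times 10^3 + 2.4 \times 10^3 = 2.52 \times 10^3$
Binary: $1.11 \times 2^2 \rightarrow 0.111 \times 2^3$ (SR1)
$1.01 \times 2^3 + 0.111 \times 2^3 = 10.001 \times 2^3$
Post Normalize (SR1): $1.0001 \times 2^4$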
11
IEEE FPNS Addition Hardware Diagram
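[Figure: addition hardware block diagram. Labels: Databus (31:0), 32 bits; input registers MA, MB (23 bits), EA, EB (8 bits); Compare (output D); Align MA / Align MB, SR (n), 24 bits; adder with ports A, B, Cout, F, Cin (outputs M24 and M23-0); Select Increment for Normalize; Adjust E; Normalize M, SR (1); 3S Reg outputs SY, EY (8 bits), MY (23 bits) back to Databus (31:0).]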
12
IEEE FPNS Addition Algorithm Version 1 (1)

Add positive normalized operands
Omit gradual underflow, NaN, $\infty$

13
Version 1 (2)
1. Load A & B
EA ← DB(30:23)
MA ← 1, DB(22:0)    (hidden bit = MSB)
EB ← DB(30:23)
MB ← 1, DB(22:0)    (hidden bit = MSB)
2. Align
D ← EA - EB
If D < 0 then    --- A smaller
    E ← EB
    Shift Right MA by |D| → MA
Else (D ≥ 0)    --- B smaller
    E ← EA
    Shift Right MB by D → MB
Endif
14
Version 1 (3)
4. Normalize
If M24 = 1 then
    Shift Right M by 1
    E ← E + 1
Endif
5. Store Y
MY ← M
EY ← E
SY ← 0
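A minimal Python sketch of Version 1, operating on 32-bit patterns. The transcript does not show step 3, so the mantissa add `m = ma + mb` is an assumption based on the overall step list; `to_bits`, `from_bits`, and `fp_add_v1` are names invented here. Bits shifted out are simply dropped, and there is no overflow check.

```python
import struct

def to_bits(x):
    """Pack a Python float into its 32-bit IEEE 754 single-precision pattern."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

def from_bits(word):
    """Reinterpret a 32-bit pattern as a single-precision value."""
    return struct.unpack(">f", struct.pack(">I", word))[0]

def fp_add_v1(a_word, b_word):
    """Add two positive normalized operands following the five steps."""
    # 1. Load A and B: 8-bit exponent, 24-bit mantissa with hidden bit restored
    ea, ma = (a_word >> 23) & 0xFF, (a_word & 0x7FFFFF) | 0x800000
    eb, mb = (b_word >> 23) & 0xFF, (b_word & 0x7FFFFF) | 0x800000
    # 2. Align: shift the mantissa of the smaller operand right by |D|
    d = ea - eb
    if d < 0:
        e, ma = eb, ma >> -d
    else:
        e, mb = ea, mb >> d
    # 3. Add mantissas (assumed step, not shown in the transcript)
    m = ma + mb
    # 4. Normalize: a carry into bit 24 means the sum is 1x.xxx
    if (m >> 24) & 1:
        m >>= 1
        e += 1
    # 5. Store Y: sign 0, drop the hidden bit
    return (e << 23) | (m & 0x7FFFFF)

# 1.01 x 2^3 (10.0) plus 1.11 x 2^2 (7.0)
print(from_bits(fp_add_v1(to_bits(10.0), to_bits(7.0))))
```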
15
IEEE FPNS Addition Algorithm Version 1S (1)

More efficient with special case than Version 1
Special Case: Exponents differ by $\ge 24$
All steps except 2 (align) are the same as Version 1

16
Version 1S (2)
2. Align
D ← EA - EB
If D ≥ 24 then    -- B very small
    MB ← 0
    E ← EA
Elseif D ≥ 0 then    -- B smaller
    Shift Right MB by D → MB
    E ← EA
Elseif D ≤ -24 then    -- A very small
    MA ← 0
    E ← EB
Else (D < 0)    -- A smaller
    Shift Right MA by |D| → MA
    E ← EB
Endif
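The Version 1S align step, sketched in Python with one branch per case. `align_v1s` is a hypothetical name; it takes the stored exponents and 24-bit mantissas and returns `(E, MA, MB)`. In software a right shift by 24 or more already yields 0, so the special case matters mainly for hardware, where it bounds the shifter to at most 23 places.

```python
def align_v1s(ea, ma, eb, mb):
    """Align two 24-bit mantissas; returns (E, MA, MB) after alignment."""
    d = ea - eb
    if d >= 24:          # B very small: every bit of MB would be shifted out
        return ea, ma, 0
    elif d >= 0:         # B smaller
        return ea, ma, mb >> d
    elif d <= -24:       # A very small
        return eb, 0, mb
    else:                # A smaller (-24 < D < 0)
        return eb, ma >> -d, mb

# 1.01 x 2^3 and 1.11 x 2^2 with excess-127 exponents 130 and 129
print([hex(v) for v in align_v1s(130, 0xA00000, 129, 0xE00000)])
```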
17
IEEE FPNS Addition/Subtraction
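Solve: Need 2 extra bits
$A = +1.10 \times 2^2$, 2's comp. mantissa 001.10
$B = -1.11 \times 2^2$, 2's comp. mantissa 110.01
Sum of mantissas: 001.10 + 110.01 = 111.11, so $Y = -0.01 \times 2^2$
Negate mantissa, set sign: $-\,000.01 \times 2^2$
Normalize (SL2): $-1.00 \times 2^0$, S = 1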
18
IEEE FPNS Addition/Subtraction Hardware
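[Figure: addition/subtraction hardware block diagram. Labels: Databus (31:0), 32 bits; input registers SA, SB (1 bit), EA, EB (8 bits), MA, MB (23 bits); Compare (output D); Align MA / Align MB, SR (n), 24 bits; Sign Logic with Add/Sub Op input; 26-bit Add/Sub with ports A, B, A/S (output F, Msign); Absolute Value (25 bits); Adjust E; Normalize M, SR (1) or SL (n); outputs SY, EY (8 bits), MY (23 bits) back to Databus (31:0).]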
19
IEEE FPNS Addition/Subtraction Algorithm Version 2 (1)
FP ADD/SUB (overall)
1. Load A & B
2. Align
3. Add or Sub Mantissas
4. Normalize
5. Store Y
20
Version 2 (2)
1. Load A & B
SA ← DB(31)
EA ← DB(30:23)
MA ← 1, DB(22:0)    (hidden bit = MSB)
SB ← DB(31)
EB ← DB(30:23)
MB ← 1, DB(22:0)    (hidden bit = MSB)
2. Align
Same as V1S Align
21
Version 2 (3)
5. Store Y
MY ← M
EY ← E
SY ← S
Steps 3 and 4 are discussed on the following pages
22
Version 2 (4)
Possible Add/Sub Combinations
∴ Basically, $A + B$ or $A - B$ or $-(A + B)$ or $-(A - B)$
23
Version 2 (5)
3. Add/Sub Mantissas
CASE
    (Sub & SA = 0 & SB = 1) or (Add & SA = 0 & SB = 0):  M ← MA + MB, S ← 0
    (Sub & SA = 0 & SB = 0) or (Add & SA = 0 & SB = 1):  M ← MA - MB, S ← 0
    (Sub & SA = 1 & SB = 1) or (Add & SA = 1 & SB = 0):  M ← MA - MB, S ← 1
    (Sub & SA = 1 & SB = 0) or (Add & SA = 1 & SB = 1):  M ← MA + MB, S ← 1
END CASE
If MSIGN = 1 then    -- If negative mantissa
    M ← -M           -- Make positive (abs value)
    S ← not S        -- Change sign
End If
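The four CASE rows can be folded into one rule: subtracting B flips its sign, the magnitudes are added when the effective signs match and subtracted otherwise, and the provisional sign is always SA. The Python sketch below uses that compressed form rather than a literal four-way case; `add_sub_mantissas` is a hypothetical name, and Python's signed integers stand in for the two's complement adder and its MSIGN output.

```python
def add_sub_mantissas(op, sa, ma, sb, mb):
    """Step 3: combine aligned mantissas. op is 'add' or 'sub'. Returns (S, M), M >= 0."""
    effective_sb = sb ^ (1 if op == "sub" else 0)  # subtraction negates B
    if sa == effective_sb:
        m = ma + mb      # same effective sign: magnitudes add
    else:
        m = ma - mb      # different effective sign: magnitudes subtract
    s = sa               # every CASE row sets S to SA
    if m < 0:            # MSIGN = 1: negative mantissa
        m = -m           # make positive (absolute value)
        s ^= 1           # change sign
    return s, m

# A = +1.10, B = -1.11 (mantissas scaled by 4 so they are integers): A + B = -0.01
print(add_sub_mantissas("add", 0, 0b110, 1, 0b111))
```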
24
Version 2 (6)
4. Normalize
Mantissa bit positions: 24 23 ----- 0 (radix point after bit 23)
If M = 0 then
    E ← 0
Else If M24 = 1 then      -- 11.xxx
    Shift Right M by 1    -- 01.1xxx
    E ← E + 1
Else
    While M23 = 0 do      -- 00.01xx
        Shift left M by 1
        E ← E - 1
    End While             -- 01.xx
End If
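A direct Python transcription of the normalize step, assuming M is a non-negative 25-bit integer with the radix point after bit 23. `normalize_v2` is a hypothetical name, and exponent overflow or underflow is not checked.

```python
def normalize_v2(m, e):
    """Step 4: return (M, E) with bit 23 of M set, or (0, 0) for a zero mantissa."""
    if m == 0:
        return 0, 0
    if (m >> 24) & 1:            # 1x.xxx: carry out of the add
        return m >> 1, e + 1     # shift right by 1, increment exponent
    while not (m >> 23) & 1:     # 00.01xx: leading zeros after a subtract
        m <<= 1                  # shift left by 1
        e -= 1                   # decrement exponent
    return m, e

# 000.01 x 2^2 normalizes (SL2) to 1.00 x 2^0; exponents are in excess 127
print(normalize_v2(0x200000, 129))
```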
25
IEEE Floating Point Multiplication Examples
(simpler than addition!)
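Ex. 1: $(6.0 \times 10^2) \times (4.0 \times 10^3) = 24.0 \times 10^5$ ⇒ Mult. Mantissas, Add Exponents
Ex. 2: $(1.01 \times 2^3) \times (1.10 \times 2^4)$, No Alignment needed
    1.01
  x 1.10
  ------
     000
    101
   101
  ------
 01.1110   (times $2^7$)
No Normalization: $1.11 \times 2^7$ (Rounding?)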
26
IEEE Floating Point Multiplication Examples
Ex. 3: $(1.11 \times 2^1) \times (1.11 \times 2^5)$, No Alignment!
    1.11
  x 1.11
  ------
     111
    111
   111
  ------
 11.0001   (times $2^6$)
Normalize (SR1): $01.10001 \times 2^7$, stored as $1.10 \times 2^7$
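The multiplication flow (XOR the signs, add the excess-127 exponents, multiply the mantissas, normalize with at most one right shift) can be sketched on 32-bit patterns as follows. All function names are invented for illustration. The sketch truncates the product to 24 bits and ignores overflow, underflow, and special values, so with the full-width mantissa Ex. 3 comes out as exactly 196 rather than the 3-bit result $1.10 \times 2^7$.

```python
import struct

def to_bits(x):
    return struct.unpack(">I", struct.pack(">f", x))[0]

def from_bits(word):
    return struct.unpack(">f", struct.pack(">I", word))[0]

def fields(word):
    """Split a word into sign, stored exponent, and 24-bit mantissa (hidden bit restored)."""
    return word >> 31, (word >> 23) & 0xFF, (word & 0x7FFFFF) | 0x800000

def fp_mul(a_word, b_word):
    sa, ea, ma = fields(a_word)
    sb, eb, mb = fields(b_word)
    s = sa ^ sb              # sign logic (XOR)
    e = ea + eb - 127        # excess-127 add: remove the doubled bias
    p = ma * mb              # 48-bit product, value in [1, 4)
    if (p >> 47) & 1:        # product >= 2: normalize with SR1
        p >>= 1
        e += 1
    m = p >> 23              # keep the top 24 bits (truncation)
    return (s << 31) | (e << 23) | (m & 0x7FFFFF)

print(from_bits(fp_mul(to_bits(10.0), to_bits(24.0))))   # Ex. 2 operands
print(from_bits(fp_mul(to_bits(3.5), to_bits(56.0))))    # Ex. 3 operands
```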
27
IEEE Floating Point Multiplication Hardware
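[Figure: multiplication hardware block diagram. Labels: inputs SA, SB, EA, EB (8 bits), MA, MB (24 bits); Add (Excess 127); Sign Logic (XOR); Integer Multiplier producing P ("HOW?"); Adjust (Increment); Normalize, SR (1); outputs SY, EY (8 bits), MY (24 bits).]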
28
IEEE Floating Point Division Examples (similar to Multiplication)
29
IEEE Floating Point Division Examples
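Ex. 3: $(1.00 \times 2^4) \div (1.11 \times 2^2)$ (decimal check: $16 \div 7 = 2$ r 2)
Mantissa long division of 1.0000 by 1.11:
  0100 - 0111 = 1101 (failed), quotient bit 0
  1000 - 0111 = 0001, quotient bit 1
$Q = 0.10 \times 2^2$, R (ignored)
⇒ Normalize (SL1): $1.00 \times 2^1$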
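Division follows the same pattern with a subtract on the exponents and a possible left shift to normalize. The sketch below is an assumption-laden illustration, not the slide's hardware: names are invented, the remainder is discarded (truncation), and when the quotient is below 1 the division is redone with one more bit instead of shifting a shorter quotient left, so that no precision is lost.

```python
import struct

def to_bits(x):
    return struct.unpack(">I", struct.pack(">f", x))[0]

def from_bits(word):
    return struct.unpack(">f", struct.pack(">I", word))[0]

def fields(word):
    """Split a word into sign, stored exponent, and 24-bit mantissa (hidden bit restored)."""
    return word >> 31, (word >> 23) & 0xFF, (word & 0x7FFFFF) | 0x800000

def fp_div(a_word, b_word):
    sa, ea, ma = fields(a_word)
    sb, eb, mb = fields(b_word)
    s = sa ^ sb                   # sign logic (XOR)
    e = ea - eb + 127             # excess-127 subtract: restore the cancelled bias
    q = (ma << 23) // mb          # integer divide, quotient value in (1/2, 2)
    if not (q >> 23) & 1:         # quotient < 1: normalize with SL1
        q = (ma << 24) // mb      # same quotient with one extra bit
        e -= 1
    return (s << 31) | (e << 23) | (q & 0x7FFFFF)

print(from_bits(fp_div(to_bits(16.0), to_bits(7.0))))   # Ex. 3 operands, 24-bit mantissa
print(from_bits(fp_div(to_bits(9.0), to_bits(12.0))))
```

With a 24-bit mantissa the first result is 16/7 truncated to single precision; the slide's 3-bit mantissa gives $1.00 \times 2^1$.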
30
IEEE Floating Point Division Hardware
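[Figure: division hardware block diagram. Labels: inputs SA, SB, EA, EB, MA, MB; Subtract (Excess 127); Sign Logic (XOR); Integer Divider producing Q and R ("HOW?"; separate norm. and exp. if R needed); Adjust (Decrement); Normalize, SL (1); outputs SY, EY, MY.]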
31
Floating Point Extra Bit Errors

Bit Shifting for Align and Normalize can create
wider words
Must be reduced to standard width result
Reduction creates error and bias, depending on
method
Truncation
Rounding
Others

32
Extra Bit Errors - Examples
4-bit Addition Example
$0.1101 \times 2^0$, Align (SR3): $0.00011010 \times 2^3$
$0.1001 \times 2^3$ stays $0.1001 \times 2^3$
Sum: $0.00011010 + 0.10010000 = 0.10101010$ (times $2^3$)
Sum is $0.1010$ if 4-bit Add
Reducing width causes a small error
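To make the size of that error concrete, the sketch below reduces the 8-bit sum from this example back to 4 bits by truncation and by rounding. `reduce_width` is a hypothetical helper, and "round" here means round half up, which is only one of several possible rounding rules; a real design would also have to renormalize if rounding carries out of the top bit.

```python
from fractions import Fraction

def reduce_width(mantissa, extra_bits, method):
    """Drop extra_bits low-order bits from an integer mantissa."""
    if method == "truncate":
        return mantissa >> extra_bits
    if method == "round":
        half = 1 << (extra_bits - 1)             # half of the last kept bit
        return (mantissa + half) >> extra_bits   # round half up
    raise ValueError(method)

full = 0b00011010 + 0b10010000    # .00011010 + .10010000, both times 2^3
exact = Fraction(full, 256) * 8

for method in ("truncate", "round"):
    kept = reduce_width(full, 4, method)
    value = Fraction(kept, 16) * 8
    print(method, format(kept, "04b"), float(value), float(value - exact))
```

On this input truncation errs low and rounding errs high, which illustrates why the choice of method affects bias as well as error size.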
33
Extra Bit Errors - Examples
4-bit Subtraction Example

We must usually consider
Increased ALU / Reg width
Rounding method

34
Floating Point Status

Separate from the fixed point status bits
Extra information available
Overflow: exp too large (add, mult)
Underflow: exp too small (mult, div) → 0
Zero: (mult by 0, div by 0, add 0s, sub )
Sign: sign of result (Not MSB)
NaN: Not legal number (0/0, $\infty/\infty$), Invalid Result
Inexact: due to rounding
