CSE 246: Computer Arithmetic Algorithms and Hardware Design

Learn more at: https://cseweb.ucsd.edu

1
CSE 246: Computer Arithmetic Algorithms and Hardware Design
Fall 2006, Lecture 9: Floating Point Numbers
  • Instructor: Prof. Chung-Kuan Cheng

2
Motivation
  • Maximal information for a given number of bits.
  • Arithmetic with proper precision.
  • Fairness of rounding.
  • These features come at the expense of more complex operations.

3
Topics
  • Floating Point Numbers (IEEE P754)
  • Standard
  • Operations
  • Exceptional Situations
  • Rounding Modes
  • Reference: Numerical Computing with IEEE Floating Point
    Arithmetic, Michael L. Overton, SIAM

4
Standard
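  • $2^{32}$ distinct bit patterns, typically (32-bit word)
  • Goal: dynamic range (largest / smallest representable magnitude)
  • If the range is too large, the holes between representable numbers grow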

5
Standard
  • ulp (unit in the last place)
  • Difference between two consecutive values of the
    significand.
  • 3 parts: $x = \pm s \times b^{e}$ (sign, significand, exponent)

[Figure: single-precision word layout with a sign bit, an 8-bit exponent, and a 23-bit significand]

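
The ulp definition above can be checked numerically. The following is a minimal sketch, assuming Python's `struct` module is used to round-trip values through the 32-bit single-precision format; the helper names `f32_to_bits`, `bits_to_f32`, and `ulp_f32` are hypothetical and not part of the lecture.

```python
import struct

def f32_to_bits(x: float) -> int:
    """Return the 32-bit pattern of x rounded to IEEE single precision."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

def bits_to_f32(bits: int) -> float:
    """Interpret a 32-bit pattern as an IEEE single-precision value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

def ulp_f32(x: float) -> float:
    """Gap between a positive finite x and the next larger single-precision value."""
    bits = f32_to_bits(x)
    # Adding 1 to the bit pattern steps the significand by one unit in the last place.
    return bits_to_f32(bits + 1) - bits_to_f32(bits)

print(ulp_f32(1.0) == 2.0 ** -23)  # significand step when the exponent is 0
print(ulp_f32(8.0) == 2.0 ** -20)  # same significand step, scaled by 2^3
```

The significand always advances by $2^{-23}$; the gap between the represented values is that step scaled by the exponent.
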
6
Standard
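  • Bit layout: $\pm\; e_1 e_2 \ldots e_8\; s_1 s_2 \ldots s_{23}$
  • $1.s_1 s_2 \ldots s_{23}$: normalized number
  • $0.s_1 s_2 \ldots s_{23}$: denormalized number
  • Exponent field $e_1 e_2 \ldots e_8$ (decimal value, bits) and the number $x$ it encodes:
  • 0 = 00000000: $x = \pm 0.s_1 s_2 \ldots s_{23} \times 2^{-126}$
  • 1 = 00000001: $x = \pm 1.s_1 s_2 \ldots s_{23} \times 2^{-126}$
  • 2 = 00000010: $x = \pm 1.s_1 s_2 \ldots s_{23} \times 2^{-125}$
  • ...
  • 127 = 01111111: $x = \pm 1.s_1 s_2 \ldots s_{23} \times 2^{0}$
  • ...
  • 253 = 11111101: $x = \pm 1.s_1 s_2 \ldots s_{23} \times 2^{126}$
  • 254 = 11111110: $x = \pm 1.s_1 s_2 \ldots s_{23} \times 2^{127}$
  • 255 = 11111111: $x = \pm\mathrm{Inf}$ if $(s_1 \ldots s_{23}) = 0$,
    NaN otherwise. NaN: Not a Number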
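
The encoding table above can be turned into a small decoder. This is a sketch only, assuming exact rational arithmetic via `fractions.Fraction`; the function name `decode_f32` and its return shape are illustrative choices, not something the slides define.

```python
import struct
from fractions import Fraction

def decode_f32(bits: int):
    """Split a 32-bit pattern into (sign, exponent field, fraction field, value)."""
    sign = bits >> 31
    e = (bits >> 23) & 0xFF          # e1..e8
    f = bits & 0x7FFFFF              # s1..s23
    if e == 255:                     # all-ones exponent: Inf or NaN
        value = "NaN" if f else ("-Inf" if sign else "+Inf")
    elif e == 0:                     # denormalized: 0.s1...s23 x 2^-126
        value = (-1) ** sign * Fraction(f, 2 ** 23) * Fraction(1, 2 ** 126)
    else:                            # normalized: 1.s1...s23 x 2^(e-127)
        value = (-1) ** sign * (1 + Fraction(f, 2 ** 23)) * Fraction(2) ** (e - 127)
    return sign, e, f, value

print(decode_f32(0x3F800000))        # exponent field 127, value 1
print(decode_f32(0x7F800000)[3])     # +Inf

# Cross-check one arbitrary pattern against the hardware interpretation.
pattern = 0x40490FDB
hardware = struct.unpack(">f", struct.pack(">I", pattern))[0]
print(float(decode_f32(pattern)[3]) == hardware)
```
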

7
Standard
  • $0.01 \times 2^{-3} = 0.001 \times 2^{-2}$
  • Same number, so normalize to remove redundancy
  • Use an implicit leading 1 for one more bit of
    precision.
  • Smallest number:
  • $0.00\ldots01 \times 2^{-126} = 1.0 \times 2^{-23} \times 2^{-126}$
  • $= 1 \times 2^{-149}$

8
Standard - Example
  • Format: sign, eeeeeeee, sssssssssssssssssssssss
  • 0 00000000 00000000000000000000000: $+0.00\ldots00 \times 2^{-126}$
  • 1 00000000 00000000000000000000000: $-0.00\ldots00 \times 2^{-126}$
  • 0 00000000 00000000000000000000001: $0.00\ldots01 \times 2^{-126} = 2^{-149}$
  • 0 00000001 00000000000000000000000: $1.00\ldots00 \times 2^{-126}$
    (normalized minimum)
  • 0 00000001 00000000000000000000001: $1.00\ldots01 \times 2^{-126}$
  • ...
  • 0 01111111 00000000000000000000000: $1.00\ldots00 \times 2^{0}$
  • 0 01111111 00000000000000000000001: $1.00\ldots01 \times 2^{0}$
  • 0 10000000 00000000000000000000001: $1.00\ldots01 \times 2^{1}$

9
Standard - Example (Cont.)
  • 0 11111110 00000000000000000000000: $1.00\ldots00 \times 2^{127}$
  • 0 11111110 00000000000000000000001: $1.00\ldots01 \times 2^{127}$
  • 0 11111110 11111111111111111111111: $1.11\ldots11 \times 2^{127}$
    (normalized maximum)
  • 0 11111111 00000000000000000000000: Inf
  • $N_{min} = 1.0 \times 2^{-126}$
  • $N_{max} = (2 - 2^{-23}) \times 2^{127}$
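
The limits listed in these examples can be confirmed from the bit patterns themselves. A minimal sketch, assuming the `struct` module reinterprets each 32-bit pattern as a single-precision value; `bits_to_f32` is a hypothetical helper name.

```python
import struct

def bits_to_f32(bits: int) -> float:
    """Interpret a 32-bit pattern as an IEEE single-precision value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

n_min = bits_to_f32(0x00800000)      # 0 00000001 000...0 (normalized minimum)
n_max = bits_to_f32(0x7F7FFFFF)      # 0 11111110 111...1 (normalized maximum)
smallest = bits_to_f32(0x00000001)   # 0 00000000 000...1 (smallest denormalized)

print(n_min == 2.0 ** -126)
print(n_max == (2 - 2.0 ** -23) * 2.0 ** 127)
print(smallest == 2.0 ** -149)
```

All three comparisons are exact because every single-precision value is also representable in the double-precision format Python uses for `float`.
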

10
Double Floating Point
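  • Bit layout: $\pm\; e_1 e_2 \ldots e_{11}\; s_1 s_2 \ldots s_{52}$
  • 0 00...00 $s_1 s_2 \ldots s_{52}$: $x = 0.s_1 s_2 \ldots s_{52} \times 2^{-1022}$
  • 0 00...01 $s_1 s_2 \ldots s_{52}$: $x = 1.s_1 s_2 \ldots s_{52} \times 2^{-1022}$
  • ...
  • 0 01...11 $s_1 s_2 \ldots s_{52}$: $x = 1.s_1 s_2 \ldots s_{52} \times 2^{0}$
  • 0 10...00 $s_1 s_2 \ldots s_{52}$: $x = 1.s_1 s_2 \ldots s_{52} \times 2^{1}$
  • ...
  • 0 11...10 $s_1 s_2 \ldots s_{52}$: $x = 1.s_1 s_2 \ldots s_{52} \times 2^{1023}$
  • 0 11...11 $s_1 s_2 \ldots s_{52}$: $x = \mathrm{Inf}$ if $(s_1 \ldots s_{52}) = 0$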
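
The double-precision layout can be inspected the same way. The sketch below assumes that Python's `float` is an IEEE double, which holds on mainstream platforms; `decode_f64` is a hypothetical helper name.

```python
import struct
import sys

def decode_f64(x: float):
    """Return (sign, 11-bit exponent field, 52-bit fraction field) of a double."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = bits >> 63
    e = (bits >> 52) & 0x7FF
    f = bits & ((1 << 52) - 1)
    return sign, e, f

print(decode_f64(1.0))                   # exponent field 1023 encodes 2^0
print(decode_f64(sys.float_info.min))    # exponent field 1: 1.0 x 2^-1022
print(decode_f64(sys.float_info.max))    # exponent field 2046, all fraction bits set
print(decode_f64(float("inf")))          # exponent field 2047, fraction 0
```
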

11
Overflow/Underflow
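[Figure: number line from Nmin to Nmax. Representable values are denser near Nmin and sparser near Nmax; underflow occurs below Nmin and overflow above Nmax.]
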
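
The spacing and the two failure regions can be observed directly on double-precision values. This is a sketch under the assumption of Python 3.9 or later (for `math.ulp`) and an IEEE double `float`; the variable names are illustrative.

```python
import math
import sys

# Spacing grows with magnitude: denser near Nmin, sparser near Nmax.
gap_small = math.ulp(1.0)                    # gap just above 1.0
gap_large = math.ulp(math.ldexp(1.0, 1000))  # gap just above 2^1000
print(gap_small == math.ldexp(1.0, -52))
print(gap_large == math.ldexp(1.0, 948))

# Overflow: a product beyond Nmax becomes Inf.
overflowed = sys.float_info.max * 2.0
print(math.isinf(overflowed))

# Underflow: half of the smallest denormalized number rounds to zero.
tiny = math.ldexp(1.0, -1074)
print(tiny > 0.0, tiny / 2.0 == 0.0)
```
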
12
Addition/Multiplication
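  • $s_1 \times b^{e_1} + (s_2 \times b^{e_2}) = s \times b^{e}$
  • $= s_1 \times b^{e_1} + (s_2 / b^{e_1 - e_2}) \times b^{e_1}$
  • $= (s_1 + s_2 / b^{e_1 - e_2}) \times b^{e_1}$
  • $(s_1 \times b^{e_1}) \times (s_2 \times b^{e_2}) = (s_1 \times s_2) \times b^{e_1 + e_2}$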
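
The alignment step in these identities can be sketched in code. The example assumes radix 2 and exact rational significands (`fractions.Fraction`), and it deliberately omits normalization and rounding; `fp_add`, `fp_mul`, and `value` are hypothetical names.

```python
from fractions import Fraction

B = 2  # radix b

def fp_add(s1, e1, s2, e2):
    """Add s1*B**e1 and s2*B**e2 by aligning to the larger exponent."""
    if e1 < e2:                               # make (s1, e1) the larger-exponent operand
        s1, e1, s2, e2 = s2, e2, s1, e1
    s = s1 + s2 / Fraction(B) ** (e1 - e2)    # shift s2 right by e1 - e2 digits
    return s, e1

def fp_mul(s1, e1, s2, e2):
    """Multiply significands and add exponents."""
    return s1 * s2, e1 + e2

def value(s, e):
    """Exact value of the pair (s, e)."""
    return s * Fraction(B) ** e

# 1.101 x 2^3 + 1.110 x 2^3 = 11.011 x 2^3 (not yet normalized)
s, e = fp_add(Fraction(13, 8), 3, Fraction(14, 8), 3)
print(s, e, value(s, e))
```

A real adder would follow this with normalization of the significand and rounding to the available bits.
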

13
Exceptions
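  • $a / 0 = \mathrm{Inf}$ if $a > 0$
  • $a / \mathrm{Inf} = 0$ for finite $a$
  • $a \times 0 = 0$ for finite $a$
  • $a \times \mathrm{Inf} = \mathrm{Inf}$ if $a > 0$
  • $a + \mathrm{Inf} = \mathrm{Inf}$ for finite $a$
  • $0 \times \mathrm{Inf}$: invalid operation (NaN)
  • $0 / 0$: invalid operation (NaN)
  • $\mathrm{Inf} - \mathrm{Inf}$: invalid operation (NaN)
  • NaN op $a$ = NaN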
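
Most of these rules can be tried out with ordinary Python floats. The sketch below uses `math.inf` and `math.nan`; note that Python itself raises `ZeroDivisionError` for a float division by zero instead of returning Inf, so that rule is shown through the exception.

```python
import math

inf, nan = math.inf, math.nan
a = 3.0

print(a / inf == 0.0)             # a / Inf = 0
print(a * inf == inf)             # a x Inf = Inf for a > 0
print(a + inf == inf)             # a + Inf = Inf
print(math.isnan(0.0 * inf))      # 0 x Inf is invalid
print(math.isnan(inf - inf))      # Inf - Inf is invalid
print(math.isnan(nan + a))        # NaN propagates through operations

try:
    a / 0.0
except ZeroDivisionError:
    # Python traps this case rather than producing Inf.
    print("division by zero raises ZeroDivisionError in Python")
```
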

14
Rounding Mode
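  • Adder output: $c_{out}\; z_1 z_0 . z_{-1} z_{-2} \ldots z_{-l}\; G\, R\, S$
  • G: guard bit
  • R: round bit
  • S: sticky bit, the OR of all bits below bit R
  • Example: $1.101 \times 2^{3} + 1.110 \times 2^{3} = 11.011 \times 2^{3}$
  • $= 1.1011 \times 2^{4}$ after normalizing; need to round or ...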
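
The guard, round, and sticky bits can be illustrated on a plain string of binary digits. This is a sketch only: `split_grs` is a hypothetical helper, and it assumes the exact result is given most significant digit first with the binary point ignored.

```python
def split_grs(bits: str, keep: int):
    """Split an exact result into the kept digits and the (G, R, S) bits.

    bits: binary digits of the exact result, most significant first.
    keep: number of leading digits that fit in the result register.
    """
    kept, rest = bits[:keep], bits[keep:]
    rest = rest.ljust(2, "0")        # digits that do not exist are zero
    g = int(rest[0])                 # guard: first digit below the kept ones
    r = int(rest[1])                 # round: next digit
    s = int("1" in rest[2:])         # sticky: OR of everything below R
    return kept, (g, r, s)

# 1.1011 with four digits kept (1.101): the dropped 1 lands in the guard bit.
print(split_grs("11011", 4))
# A long tail collapses into the sticky bit.
print(split_grs("1101001001", 4))
```
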

15
Rounding
  • $1.110 \times 2^{3} - 1.101 \times 2^{3} = 0.001 \times 2^{3}$
  • $= 1.000 \times 2^{0}$ (normalize)
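  • $1.101 \times 2^{3} - 1.101 \times 2^{2} = 1.101 \times 2^{3} - 0.1101 \times 2^{3}$
  • $= 0.1101 \times 2^{3} = 1.101 \times 2^{2}$
  • Guard bit: holds the last bit of the aligned operand $0.1101$
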
16
Rounding
  • Example: $1.10111$ rounded to four fraction bits
  • Round to nearest even: $1.1100$ (a tie, so the even neighbor is chosen)
  • Toward 0: $1.1011$
  • Toward +Inf: $1.1100$
  • Toward -Inf: $1.1011$
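
The four modes can be compared on the slide's example value. The following is a minimal sketch using exact rational arithmetic; `round_bin` and the mode strings are hypothetical names chosen for illustration.

```python
import math
from fractions import Fraction

def round_bin(x: Fraction, frac_bits: int, mode: str) -> Fraction:
    """Round x to frac_bits binary fraction bits under the given mode."""
    scaled = x * 2 ** frac_bits          # x in units of the last kept bit
    lo, hi = math.floor(scaled), math.ceil(scaled)
    if mode == "toward_zero":
        q = lo if scaled >= 0 else hi
    elif mode == "toward_pos_inf":
        q = hi
    elif mode == "toward_neg_inf":
        q = lo
    elif mode == "nearest_even":
        diff = scaled - lo
        # Round up past the midpoint, or on a tie when lo is odd.
        if diff > Fraction(1, 2) or (diff == Fraction(1, 2) and lo % 2 == 1):
            q = hi
        else:
            q = lo
    else:
        raise ValueError(mode)
    return Fraction(q, 2 ** frac_bits)

x = Fraction(int("110111", 2), 2 ** 5)   # 1.10111 in binary
for mode in ("nearest_even", "toward_zero", "toward_pos_inf", "toward_neg_inf"):
    print(mode, round_bin(x, 4, mode))
```

Here `27/16` corresponds to $1.1011$ and `7/4` to $1.1100$.
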

17
Conventional Rounding Error
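  • Rounding error (in ulps of the three-bit fraction), rounding half up:
  • $1.10100 \to 1.101$, error $0$
  • $1.10101 \to 1.101$, error $-0.25$
  • $1.10110 \to 1.110$, error $+0.5$
  • $1.10111 \to 1.110$, error $+0.25$
  • Average error: $0.5 / 4 = 0.125$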
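
The average above can be reproduced, and compared with round to nearest even, in a few lines. This sketch goes slightly beyond the slide: the comparison with `half_even` and the averaging over both an even and an odd kept value are additions for illustration, and all function names are hypothetical.

```python
from fractions import Fraction

def half_up(scaled: Fraction) -> int:
    """Conventional rounding: add half an ulp and truncate (non-negative input)."""
    return int(scaled + Fraction(1, 2))

def half_even(scaled: Fraction) -> int:
    """Round to nearest, ties to the even neighbor (non-negative input)."""
    lo = int(scaled)
    diff = scaled - lo
    if diff > Fraction(1, 2) or (diff == Fraction(1, 2) and lo % 2 == 1):
        return lo + 1
    return lo

def average_error(round_fn, kept_values):
    """Mean of (rounded - exact), in ulps, over the two-bit tails 00, 01, 10, 11."""
    errors = [round_fn(k + Fraction(t, 4)) - (k + Fraction(t, 4))
              for k in kept_values for t in range(4)]
    return sum(errors) / len(errors)

print(average_error(half_up, [13]))         # kept bits 1101, the slide's four cases
print(average_error(half_up, [12, 13]))     # same bias with an even kept value added
print(average_error(half_even, [12, 13]))   # ties split evenly, so the bias cancels
```
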