How to represent real numbers

Slides: 15

Provided by: cse46

Learn more at: https://courses.cs.washington.edu

1
How to represent real numbers

2
Real numbers representation inside computer

Use a representation akin to scientific notation
sign, exponent, mantissa
Many variations in the choice of representation for
sign and mantissa (could be 2's complement, sign
and magnitude, etc.)
exponent (cf. mantissa) and base (could be 2, 8,
16, etc.) to which the exponent is raised
Arithmetic support for real numbers is called
floating-point arithmetic
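
As a rough illustration of the "scientific notation" idea above, the sketch below splits a value into a sign, a mantissa, and a base-2 exponent using Python's standard `math.frexp`. The helper name `to_scientific_base2` is hypothetical, and `frexp` returns a mantissa in the range 0.5 to 1, which is only one of the possible conventions the slide alludes to.

```python
import math

def to_scientific_base2(x: float):
    """Split x into (sign, mantissa, exponent) such that
    x == (-1)**sign * mantissa * 2**exponent."""
    sign = 1 if math.copysign(1.0, x) < 0 else 0
    # math.frexp returns a mantissa with 0.5 <= mantissa < 1 for nonzero input.
    mantissa, exponent = math.frexp(abs(x))
    return sign, mantissa, exponent

sign, mantissa, exponent = to_scientific_base2(-6.5)
print(sign, mantissa, exponent)  # 1 0.8125 3, since 6.5 == 0.8125 * 2**3
```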

3
Floating-point representation IEEE Standard

Basic choices
A single precision number must fit into 1 word (4
bytes, 32 bits)
A double precision number must fit into 2 words
(used most often)
Base for the exponent is 2
There should be approximately as many positive
as negative numbers, and as many positive as
negative exponents
Single representation of 0 compatible with
integer representation
Numbers will be normalized

4
Example MIPS representation

[Figure: 32-bit word layout] s (1 bit, bit 31) | exponent (8 bits, bits 30-23) | mantissa (23 bits, bits 22-0)

5
MIPS representation (continued)

Mantissa in sign and magnitude form
s: bit 31, sign bit for the mantissa (0 = positive, 1 = negative)
mantissa: 23 bits, always a fraction with an
implied binary point at the left of bit 22
(normalized, see next slides)
exponent: 8 bits (biased exponent, see next
slide)
0 is represented by all zeroes.
Note that having the most significant bit as sign
bit makes it easier to test for 0, positive, and
negative.
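
The field layout described above can be inspected from Python. This is a minimal sketch using the standard `struct` module to reinterpret a 32-bit float as an integer word; the helper name `float32_fields` is hypothetical.

```python
import struct

def float32_fields(x: float):
    """Return (sign, biased_exponent, mantissa_bits) of x stored as a 32-bit float."""
    # Pack as a big-endian single-precision float, then reread the same 4 bytes
    # as an unsigned 32-bit integer.
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    sign = word >> 31                      # bit 31
    biased_exponent = (word >> 23) & 0xFF  # bits 30-23
    mantissa = word & 0x7FFFFF             # bits 22-0
    return sign, biased_exponent, mantissa

print(float32_fields(0.0))    # (0, 0, 0): zero is all zeroes
print(float32_fields(-0.75))  # (1, 126, 4194304)
```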

6
Biased exponent

7
Normalization

Since numbers must be normalized, there is an
implicit one at the left of the binary point.
No need to put it in (improves precision by 1
bit)
But need to reinstate it when performing
operations.
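
To make the implicit one concrete, here is a sketch that rebuilds a value from a 32-bit word by reinstating the hidden bit. The bias of 127 is the IEEE 754 single-precision value and is supplied here as an assumption, since the transcript of the "Biased exponent" slide is empty. The function name `decode_float32` is hypothetical, and special encodings (denormals, infinities, NaN) are deliberately ignored.

```python
BIAS = 127  # assumption: IEEE 754 single-precision bias, not stated in the slides

def decode_float32(word: int) -> float:
    """Decode a normalized 32-bit pattern; denormals, infinities and NaN are not handled."""
    sign = word >> 31
    biased_exponent = (word >> 23) & 0xFF
    fraction = word & 0x7FFFFF
    if biased_exponent == 0 and fraction == 0:
        return 0.0  # zero has no implicit one
    # Reinstate the implicit 1 at the left of the binary point.
    significand = 1 + fraction / 2**23
    value = significand * 2.0 ** (biased_exponent - BIAS)
    return -value if sign else value

print(decode_float32(0x3F800000))  # 1.0
print(decode_float32(0xC0D00000))  # -6.5
```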

8
Double precision

Takes 2 words (64 bits)
Exponent 11 bits (instead of 8)
Mantissa 52 bits (instead of 23)
Still biased exponent and normalized numbers
Still 0 is represented by all zeroes
We can still have overflow (the exponent cannot
handle super big numbers) and underflow (the
exponent cannot handle super small numbers)
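
Python's `float` is a double on common platforms, so the points above can be checked directly. The sketch below extracts the three fields of a 64-bit value and shows one overflow and one underflow; the helper name `float64_fields` is hypothetical.

```python
import struct

def float64_fields(x: float):
    """Return (sign, biased_exponent, mantissa_bits) of x stored as a 64-bit double."""
    (word,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = word >> 63
    biased_exponent = (word >> 52) & 0x7FF  # 11 bits
    mantissa = word & ((1 << 52) - 1)       # 52 bits
    return sign, biased_exponent, mantissa

big = 1e200
small = 1e-200
overflowed = big * big       # exponent too large for 11 bits: result is inf
underflowed = small * small  # exponent too small: result is 0.0

print(float64_fields(1.0))
print(overflowed, underflowed)
```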

9
Floating-Point Addition

Quite complex (logically more complex than
multiplication)
Need to know which of the addends is larger
(compare exponents)
Need to shift smaller mantissa
Need to know if mantissas have to be added or
subtracted (since sign and magnitude
representation)
Need to normalize the result
Correct round-off procedures are not simple (not
covered here)

10
F-P add (details for round-off omitted)

1. Compare exponents. If e1 < e2, swap the 2
operands such that
d = e1 - e2 >= 0. Tentatively set exponent
of result to e1.
2. Insert a 1 at left of each mantissa. If the
signs of operands differ, replace 2nd mantissa by
its 2's complement.
3. Shift 2nd mantissa d bits to the right
(inserting 0s if not complemented, 1s if it
were).
4. Add the (shifted) mantissas. (There is one
case where the result could be negative and you
have to take the 2's complement; this can happen
only when d = 0 and the signs of the operands are
different.)
5. Normalize (if there was a carry-out in step 4,
shift right once; else shift left until the first
1 appears in the msb).
6. Modify exponent to reflect the number of bits
shifted in previous step
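
The six steps above can be sketched in Python on (sign, biased exponent, fraction) triples. This is an illustration under stated assumptions, not the slides' hardware design: Python's signed integers stand in for the 2's complement mantissa (a right shift of a negative integer is arithmetic, which matches "inserting 1s"), bits shifted out are simply dropped instead of rounded, exponent overflow and underflow are not checked, and both inputs are assumed to be normalized and nonzero. The names `to_fields`, `from_fields`, and `fp_add` are hypothetical.

```python
import struct

FRACTION_BITS = 23
HIDDEN_ONE = 1 << FRACTION_BITS

def to_fields(x: float):
    """(sign, biased_exponent, fraction) of x as a 32-bit float."""
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    return word >> 31, (word >> 23) & 0xFF, word & (HIDDEN_ONE - 1)

def from_fields(sign: int, exponent: int, fraction: int) -> float:
    word = (sign << 31) | (exponent << 23) | fraction
    return struct.unpack(">f", struct.pack(">I", word))[0]

def fp_add(a, b):
    (s1, e1, f1), (s2, e2, f2) = a, b
    # Step 1: make the first operand the one with the larger exponent.
    if e1 < e2:
        (s1, e1, f1), (s2, e2, f2) = (s2, e2, f2), (s1, e1, f1)
    d = e1 - e2
    exponent = e1
    # Step 2: reinstate the hidden 1; negate the 2nd mantissa if signs differ.
    m1 = f1 | HIDDEN_ONE
    m2 = f2 | HIDDEN_ONE
    if s1 != s2:
        m2 = -m2
    # Step 3: align the 2nd mantissa (arithmetic shift, no rounding).
    m2 >>= d
    # Step 4: add; a negative total can only occur when d == 0 and signs differ.
    total = m1 + m2
    sign = s1
    if total < 0:
        total = -total
        sign = s2
    if total == 0:
        return (0, 0, 0)
    # Steps 5 and 6: normalize and adjust the exponent accordingly.
    if total >> (FRACTION_BITS + 1):  # carry-out of the 24-bit significand
        total >>= 1
        exponent += 1
    else:
        while not (total & HIDDEN_ONE):
            total <<= 1
            exponent -= 1
    return (sign, exponent, total & (HIDDEN_ONE - 1))

result = fp_add(to_fields(1.5), to_fields(-2.25))
print(from_fields(*result))  # -0.75
```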

11
Using pipelining

12
Floating-point multiplication

13
Implementing fast multiplication

Use Carry-Save adders (3 inputs, 2 outputs) until
the last addition where you need a CLA (cf. CSE
370?)
Use a (Wallace) tree. Can cut it off in several
stages depending on hardware available.
The O(n^2) process has been replaced by an
O(n log n) one.
Possibility of some pipelining inter-operations.
Possibility of accumulation as in dot products
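
The carry-save idea can be sketched in software on arbitrary-width integers. A carry-save adder takes three operands and produces two (a sum word and a carry word) with no carry propagation; partial products are reduced three-to-two until only two remain, and only then is a normal carry-propagating add performed. The function names are hypothetical, the grouping below is a simple layer-by-layer reduction and not a faithful Wallace tree wiring, and Python's built-in addition stands in for the final CLA.

```python
def carry_save_add(a: int, b: int, c: int):
    """3 inputs, 2 outputs: returns (s, carry) with a + b + c == s + carry."""
    s = a ^ b ^ c                                # bitwise sum without carries
    carry = ((a & b) | (a & c) | (b & c)) << 1   # carries, moved one place left
    return s, carry

def multiply(x: int, y: int, width: int = 24) -> int:
    """Multiply unsigned integers, where y fits in `width` bits."""
    # One shifted copy of x per set bit of y (the partial products).
    terms = [x << i for i in range(width) if (y >> i) & 1]
    # Reduce each group of three terms to two until at most two terms remain.
    while len(terms) > 2:
        reduced = []
        for i in range(0, len(terms) - 2, 3):
            reduced.extend(carry_save_add(*terms[i:i + 3]))
        reduced.extend(terms[len(terms) - len(terms) % 3:])  # leftover terms
        terms = reduced
    # Final carry-propagating addition (a CLA in hardware).
    return sum(terms)

print(multiply(13, 11, width=4))  # 143
```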

14
Division

A guessing game
True also for integer divisions
In some implementations, replace the division by two
operations:
Find x = 1/denominator (with hardware tables to
guess the first few bits; recall the denominator will
be normalized)
Multiply x and numerator.
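
One way to picture the two-step division is sketched below. The slides only say that a table supplies the first few bits of 1/denominator; the refinement used here (Newton-Raphson iteration, x = x * (2 - d * x)) is an assumption, as are the 8-entry table, the iteration count, and the names `RECIPROCAL_TABLE`, `reciprocal`, and `divide`. The denominator is assumed to be a normalized significand with 1 <= d < 2.

```python
# Assumed 8-entry lookup table indexed by the top 3 fraction bits of d:
# each entry is the reciprocal of the midpoint of its interval.
RECIPROCAL_TABLE = [1 / (1 + (k + 0.5) / 8) for k in range(8)]

def reciprocal(d: float, iterations: int = 4) -> float:
    """Approximate 1/d for a normalized significand 1 <= d < 2."""
    if not 1.0 <= d < 2.0:
        raise ValueError("d must be normalized: 1 <= d < 2")
    x = RECIPROCAL_TABLE[int((d - 1.0) * 8)]  # first guess from the table
    for _ in range(iterations):
        # Newton-Raphson step (assumption): the error roughly squares each time.
        x = x * (2.0 - d * x)
    return x

def divide(numerator: float, denominator: float) -> float:
    """Division as two operations: find x = 1/denominator, then multiply."""
    return numerator * reciprocal(denominator)

print(divide(3.0, 1.5))
```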
