Standardization and Testing of Mathematical Functions - PowerPoint PPT Presentation

About This Presentation

Title:

Standardization and Testing of Mathematical Functions

Description:

Linux SLES. sparc. UltraSparc III. Solaris libc. Solaris 10. VC8. x86_64. MS Visual C 2005 ... Extension of IEEE 754 on all library functions. Correctly rounded ... – PowerPoint PPT presentation

Number of Views:44

Avg rating:3.0/5.0

Slides: 26

Provided by: victork

Category:

more less

Transcript and Presenter's Notes

Title: Standardization and Testing of Mathematical Functions

1
Standardization and Testing of Mathematical
Functions

Victor Kuliamin
Institute for System Programming, Russian Academy
of Sciences
Moscow

2
Mathematical Modeling

Astrophysics
Geosciences
Biosciences
Social Sciences
Confidence?

3
Mathematical Libraries

Floating-point numbers arithmetics
Basic math functions
Elementary
Special
Specialized libraries
Linear algebra
Number theory
Numeric calculus
Dynamic systems
Optimization

sqrt, pow, exp, log, sin, atan, cosh,
erf, tgamma, j0, y1,
4
IEEE 754 Floating-point Numbers
k1
n-1
0
1
k
n, k
0
0
1
1
1
1
1
1
0
1
0
0
1
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
S
E
M
sign
exponent
mantissa
B 2(k1) 1

Normal E gt 0 E lt 2k 1 X
(1)S2(EB)(1M/2(nk1))
Denormal E 0 X (1)S2(B1)(M/2(n
k1))
Exceptional E 2k 1
M 0 ?, ?
M ? 0 NaN

2(1)1.1012 13/16 0,8125
0, -0
1/0 ?, (1)/0 ?
n 32, k 8 float (single precision) n
64, k 11 double n 79, k 15
extended double n 128, k 15 quadruple
0/0 NaN
1/2(n-k-1) 1 ulp
5
IEEE 754 computations

Correct rounding 4 rounding modes
to ?
to ?
to 0
to the nearest
Exception flags
INVALID Incorrect arguments (NaN result)
DIVISION-BY-ZERO Infinite result (precise 8)
OVERFLOW Too big result (approximate 8)
UNDERFLOW Too small (or denormal) result
INEXACT Inexact result

0
6
Tested Libraries (ISO C, POSIX)
ID Processor arch Library OS
x86 i686 glibc 2.5 Linux Fedora
ia64 ia64 glibc 2.4 Linux Debian
x86_64 x86_64 glibc 2.3.4 Linux RHEL
s390 s390 glibc 2.4 Linux Debian
ppc64 ppc64 glibc 2.7 Linux Debian
ppc32 ppc32 glibc 2.3.5 Linux SLES
sparc UltraSparc III Solaris libc Solaris 10
VC8 x86_64 MS Visual C 2005 Windows XP
VC6 i686 MS Visual C 6.0 Windows XP
7
TEST RESULTS DETAILS
rint(262144.25)? 262144
logb(2-1074) -1022
expm1(2.2250738585072e-308) 5.421010862427522e-2
0
to nearest
to 8
to 0
exp(553.8042397037792) -1.710893968937284e239
to 8
exp(-6.453852113757105e-02) 2.255531908873594e1
5
sin(33.63133354799544) 7.99995094799809616e22
erf(3.296656889776298) 8.035526204864467e8
cosh(627.9957549410666) -1.453242606709252e272
cos(917.2279304172412) -13.44757421002838
acos(-1.0) -3.141592653589794
erfc(-5.179813474865007) -3.419501182737284e287
sinh(29.22104351584205) -1.139998423128585e12
sin(- 1.793463141525662e-76) 9.801714032956058e-
2
Exact
6-210 ulp errors
Errors in exceptional cases
210-220 ulp errors
1 ulp errors
Errors for denormals
gt220 ulp errors
Unsupported
2-5 ulp errors
Completely buggy
8
Differences in sin() implementations
9
Differences in j0() implementations
10
Test results similar implementations
Unsupported
11
Interoperability base

Standards
IEEE 754 (Floating-point arithmetics)FP numbers,
basic operations
ISO 9899 (C language and libraries)56 real 16
complex functions
IEEE 1003.1 (POSIX)63 real 22 complex
functions
ISO 10697.1-3 (Language independent
arithmetics)Elementary real and complex functions

12
Standard Requirements IEEE 754

type conversions, , , , /, sqrt, remainder,
fma (2008)
Correctly rounded results
4 rounding modes
Infinite results in overflow and precise infinity
cases
In overflow rounding to 0 returns the biggest
finite number
NaN results outside of function domain (and for
NaN args)
Exception flags
INVALID, DIVISION-BY-ZERO, OVERFLOW, UNDERFLOW,
INEXACT

13
Standard Requirements ISO C POSIX

ISO/IEC 9899 (C language) 54 real functions
Exact values sin(0) 0, log(1) 0,
DIVISION-BY-ZERO flag log(0), atanh(1),
pow(0,x), ?(-n)
NaN results and INVALID flag outside of domains
IEEE 1003.1 (POSIX) 63 real 22 complex
All IEEE 754 flags (except for INEXACT) for real
functions
errno settingDomain error INVALID or
DIVISION-BY-ZERORange error OVERFLOW or
UNDERFLOW
If x is denormalf(x) x for each f(x)x in 0
(sin, asin, sinh, expm1)
In overflow HUGE_VAL should be returned(value of
HUGE_VAL unspecified)

glibc 8 MSVCRT max double
(1.797693134862316e308) Solaris libc max
float (3.402823466385289e38)
Inconsistency with rounding modes
Source of non-interoperability
14
Standard Requirements ISO 10697

Real and complex elementary functions (no erf,
gamma, j0, y1, )
Only symmetric rounding modes (no rounding to ?
or to ?)
Preservation of sign
Preservation of monotonicity
Inaccuracy 0.5-2.0 ulp
Evenness and oddity
Exact values cosh(0) 1, log(1) 0,
Asymptotics near 0 cos(x) 1, sin(x) x,
Relations expm1 lt exp, cosh gt sinh, atan lt
?( p/2 ) ,

for sin, cos, tan small arguments only
15
Summary of Requirements

Domain boundaries and poles ( flags)
Exact values, limits and asymptotics
Preservation of sign and monotonicity
SymmetriesEvenness, periodicity, others
?(1x) x?(x)
Relations and range boundaries
Precision
Correct rounding (according to mode)
Computational accuracy
Interoperability and portabilityof libraries and
applications
Feasible ia64 (Intel), crlibm (INRIA)

16
inconsistencies

Correct rounding
Oddity (sym. with x, 1/x)

POSIX HUGE_VAL instead of 8
POSIX f(x) x for denormal x and f(x)x in 0
Range boundaries
17
Standardization proposal

Extension of IEEE 754 on all library functions
Correctly rounded results for 4 modes
Except for ones contradicting to range boundaries
Infinite results in overflow and precise infinity
cases
In overflow rounding to 0 returns the biggest
finite number
NaN results outside of function domain (and for
NaN args)
Exception flags
INVALID (and EDOM for errno) Incorrect
arguments
DIVISION-BY-ZERO (and ERANGE for errno)
Infinite result
OVERFLOW (and ERANGE for errno) Too big
result
UNDERFLOW (and ERANGE for errno) Too small
result ( dnr)
INEXACT Inexact result

18
Test Data sources

Bit structure of FP numbers
Boundaries
0, -0, ?, -?, NaN
Least and greatest positive and negative, normal
and denormal
Mantissa patterns
FFFFFFFFFFFFF16 FFFFF1111000016
555550000FFFF16
Both arguments and values of a function
Intervals of uniform function behavior
Points hard to compute correctly rounded result

rint(262144.25)? 262144 010000010001000000000000
0000000100000000000000000000000000000000 x10000010
001xxxxxxxxxxxxxxxxxx01000000000000000000000000000
00000
19
Intervals and their boundaries

Neighbourhoods of 0, 8
Poles and overflow points
Zeroes and extremes
Tangents and asymtotics horizontal and diagonal

max
0
20
Table Maker Dilemma (TMD)

Rounding to the nearest
f x.xxxxxxxxxx011111111...1xx...
f x.xxxxxxxxxx100000000...0xx...
Rounding to 0, ?, -?
f x.xxxxxxxxxx00000000...0xx...
f x.xxxxxxxxxx11111111...1xx...

0,5 ulp
!
?

tan(1.11011111111111111111111111111111111111111111
0001111122-22)
1.111000000000000000000000000000000000000000
0101010001 0 178 01022-22
sin(1.11100000000000000000000000000000000000000111
0000100022-19)
1.110111111111111111111111111111111111110000
0010111000 067 1110122-19
j1(1.100000000000000000000000000000000000000000000
000001122-23)
1.0111111111111111111111111111111111111111111
111101000 094 1100122-22

21
Number of Hard Points
Eval. ? 0, ?, -? ? N
54 0.5 0 1
53 1 1 2
52 2 4 4
51 4 6 6
50 8 10 12
49 16 19 21
48 32 32 37
47 64 70 67
46 128 142 106
45 256 280 239
44 512 547 518
43 1024 1073 996
42 2048 2103 1985
41 4096 4187 4040
40 8192 8325 8142

Probabilistic evaluation
Uniform independent bits distribution
Total N 2(n-k-1) values
N2-m have m consecutive equal bits
Real data for sin on exponent -16

22
Hard Points Calculation
Feasible only for single precision numbers

Exhaustive search
Continued fractions (Kahan, 1983)
Dyadic method (Tang, 1989 Kahan, 1994)
Reduced search (Lefevre, 1997)
Lattice reduction (Gonnet, 2002 Stehle, Lefevre,
Zimmermann, 2003)
Integer secants method (2007)

X Np X M2m 2(n k 1) lt M lt 2(n
k) ? p (2mM)/N
sqrt(N2m) M ½ 2(n-k-1) lt M, N lt 2(n-k)
? 2(m2)N (2M 1)2 j ? (2M 1)2
j (mod 2(m2))
F(x) f(x) ax b c1x2 c2x3 c3x4
F(x) c1(G(x) )2, G(x) x d1x2 d2x3
G(x) y ? x H(y), H is the reversed
series xm H(sqrt(m/c12z)) ? F(xm) axm b
m/2z
33864178045159811206438920823311565991202393932998
38035242121518428537554064774221620930267583474709
60206804568602636298927181441186370849986972132271
59466226343020116976329729079225588927108306160340
38541342154669787134871905353772776431251615694251
273653 p/2 1.011010101100010110110010011000
1011001010000111111110 1857 01122849 sin(1.0110
1010110001011011001001100010110010100001111111112
2849) 1.1111111111111111111111111111111111111111
1111111111 169011022-1

j 15
sqrt(1.0010010101100101011001011100101011011100101
1111101002)
1.0001001000001111100110011001111010011001001101
110100 0 150 0002

2z
23
Test suite composition

Hard points
double
Some hard points with 48 additional bits can be
found in crlibm tests http//lipforge.ens-lyon.fr
/projects/crlibm
Calculated (some) hard points with 40
additional bits for sqrt, cbrt, sin, asin, cos,
acos, tan, atan, sinh, asinh, cosh, tanh, atanh,
exp, log, exp2, expm1, log1p, erf, erfc, j0, j1
float (single precision)
All hard points with 17 additional bits for
sqrt, cbrt, exp, sin, cos
extended double
All with 53 additional bits for sqrt, some for
sin, exp
Test suites developed
double all 37 single real variable POSIX
functions
Correct values calculated by Maple and MPFR

sqrt exp sin atan lgamma j1
Boundary 20 20 20 20 20 20
Intrevals 106 1622 3674 4242 11680 24538
Patterns 141009 138451 331744 155008 121502 109036
Hard points 170170 28587 62342 95512 0 29436
Other 84820 0 4616 0 229 5664
Total 396125 168680 402396 254782 133431 168694

24
Conclusion

No adequate standards for math librariesSeveral
standards, sometimes inconsistent, highly
incomplete
Correct rounding is needed for interoperability
Test suites are useful even without standard