Tr - PowerPoint PPT Presentation

Provided by: Anv60


1

Very short presentation
of the Perpignan group

2
People involved

Permanent staff
Philippe Langlois (30%)
Marc Daumas (20%)
David Defour (10%)
Non-permanent staff
Nicolas Louvet
Sylvain Collange

3
Progress status

Automated proofs
M. Daumas
Using statistics to characterize the
behavior of floating-point code
Specification
M. Daumas, D. Defour
Exotic arithmetics (GPU)
Algorithms for evaluating floating-point
expressions
N. Louvet, P. Langlois, M. Daumas, D. Defour
Compensated algorithms
Table-based bivariate approximation

Statistics and errors in floating-point arithmetic

5
Systems are now running fast enough and long
enough for their errors to impact their
functionality

Worst-case analysis is meaningless for
applications that run for a long time
For example
A process adds numbers in [-1, 1] in single precision
Each addition produces a round-off error of
at most 2^-25
This process adds 2^25 items
The accumulated worst-case error is 1
Note that
10 hours of flight time
at an operating frequency of 1 kHz
is approximately 2^25 operations
Provided round-off errors are not correlated, the
actual accumulated error will be much smaller
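
The gap between the worst case and the uncorrelated case on this slide can be sketched numerically. The sketch below is illustrative only: it assumes each round-off error is independent and uniformly distributed on [-u, u] with u = 2^-25, which the slide does not state, and the function names are hypothetical.

```python
import math
import random

U = 2.0 ** -25   # per-addition round-off bound taken from the slide
N = 2 ** 25      # number of additions taken from the slide

def worst_case_error(n, u):
    """Accumulated error when all n errors have magnitude u and the same sign."""
    return n * u

def rms_error(n, u):
    """Standard deviation of the sum of n independent errors,
    each ASSUMED uniform on [-u, u] (variance u**2 / 3)."""
    return u * math.sqrt(n / 3.0)

def simulate(n, u, seed=0):
    """One Monte Carlo draw of the accumulated error under the same assumption."""
    rng = random.Random(seed)
    return sum(rng.uniform(-u, u) for _ in range(n))

print("10 h at 1 kHz:", 10 * 3600 * 1000, "operations, 2^25 =", N)
print("worst case   :", worst_case_error(N, U))
print("typical (rms):", rms_error(N, U))
# A smaller n keeps the simulation fast.
print("simulated    :", abs(simulate(2 ** 16, U)), "rms:", rms_error(2 ** 16, U))
```

Under this assumption the typical error grows like the square root of the number of operations, while the worst case grows linearly.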

6
FAA regulations for aircraft require

The probability of an error to be below 10^-9 for a
10-hour flight
Provides a bound on the number of numeric
operations (fixed or floating point) that can
safely be performed before accuracy is lost
Important implications for control systems with
safety-critical software
Worst-case analysis would blindly advise the
replacement of existing systems that have been
successfully running for years
Set of formal theorems validated by the PVS proof
assistant
Allows code analysis tools to produce formal
certificates
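
As a rough illustration of how a probability target such as 10^-9 turns into a bound on the number of operations, the sketch below uses Hoeffding's inequality. This is a swapped-in textbook inequality, not the set of theorems validated in PVS that the slide refers to. It assumes independent, zero-mean errors bounded in [-u, u]; the function names and the tolerance 2^-10 are hypothetical.

```python
import math

def hoeffding_error_bound(n, u, p):
    """Threshold t such that P(|sum of n errors| >= t) <= p, for independent
    zero-mean errors each bounded in [-u, u] (Hoeffding's inequality):
    2 * exp(-t**2 / (2 * n * u**2)) = p."""
    return u * math.sqrt(2.0 * n * math.log(2.0 / p))

def max_operations(tolerance, u, p):
    """Largest n for which the Hoeffding threshold stays within `tolerance`."""
    return math.floor(tolerance ** 2 / (2.0 * u ** 2 * math.log(2.0 / p)))

u = 2.0 ** -25   # per-operation error bound (from the earlier slide)
n = 2 ** 25      # operations in a 10-hour flight at about 1 kHz
p = 1e-9         # target failure probability

t = hoeffding_error_bound(n, u, p)
print("error exceeded with probability <= 1e-9:", t)
print("worst-case bound                       :", n * u)
print("operations allowed for tolerance 2^-10 :", max_operations(2.0 ** -10, u, p))
```

With these assumptions the probabilistic threshold is about three orders of magnitude below the worst-case bound of 1.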

7
Some easy ways to obtain worst case behavior

Systematic ad-hoc errors may lead to the slow
accumulation of small quantities of the same sign
Biased measures
Synchronized time shift
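
A time counter incremented by a constant step is one simple way to see same-sign errors accumulate. The sketch below is an illustration, not taken from the slides: it emulates single precision by rounding through `struct` after every addition and adds 0.1 ten thousand times. The helper names are hypothetical.

```python
import struct

def to_f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def accumulate_f32(step, n):
    """Add `step` n times, rounding to single precision after every addition."""
    step32 = to_f32(step)
    acc = 0.0
    for _ in range(n):
        acc = to_f32(acc + step32)
    return acc

n = 10_000
clock = accumulate_f32(0.1, n)
drift = clock - n * 0.1
print("accumulated clock:", clock)
print("drift            :", drift)
```

Because the step is constant, the rounding error has the same sign for long stretches of the loop, so the drift grows roughly linearly instead of averaging out.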

8
Developing probabilities on floating point
arithmetic

Formal proof assistants such as ACL2, HOL, Coq
and PVS are used in areas where
Errors can cause loss of life or significant
financial damage
Common misunderstandings can falsify key
assumptions
Developments in probability share many features
with developments in floating point arithmetic
Each result usually relies on a long list of
hypotheses and slight variations induce a large
number of results that look almost identical
Most people want a trustworthy result but they
are not proficient enough to either select the
best scheme or detect minor faults that can
quickly lead to huge problems
Validation of safety-critical numeric software
using probability should be done using an
automatic proof checker

9
The Central Limit Theorem in action (n = 1, 2 or 5)
10
Limitations of the Central Limit Theorem to
target probability 10^-9 (n = 5, 40, 100 or 200)
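
The limitation named on this slide can be sketched by comparing an exact tail probability with its normal approximation. The figures of the original slides are not available, so the sketch assumes sums of n independent variables uniform on [0, 1] (the Irwin-Hall distribution); that choice and the function names are assumptions.

```python
from fractions import Fraction
from math import comb, erfc, factorial, floor, sqrt

def irwin_hall_upper_tail(n, x):
    """Exact P(U_1 + ... + U_n >= x) for independent U_i uniform on [0, 1]."""
    if x <= 0:
        return 1.0
    if x >= n:
        return 0.0
    # By symmetry, P(S >= x) = P(S <= n - x); evaluate the CDF exactly.
    y = Fraction(n) - Fraction(x)
    total = sum((-1) ** k * comb(n, k) * (y - k) ** n for k in range(floor(y) + 1))
    return float(total / factorial(n))

def normal_upper_tail(n, x):
    """Normal approximation of the same tail (mean n/2, variance n/12)."""
    sigma = sqrt(n / 12.0)
    return 0.5 * erfc((x - n / 2.0) / (sigma * sqrt(2.0)))

for n in (5, 40, 100, 200):
    x = n / 2.0 + 6.0 * sqrt(n / 12.0)   # six standard deviations above the mean
    print(n, irwin_hall_upper_tail(n, x), normal_upper_tail(n, x))
```

Six standard deviations is where the normal tail is close to 10^-9. For n = 5 that point lies beyond the largest possible sum, so the exact probability is zero while the approximation is not.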
11

Exotic arithmetics

12
Problem statement

Our expertise: IEEE-754 arithmetic
A very precise framework
Precision, rounding, exception handling
Portability
New architectures (GPU)
Do not comply with the standard
Rounding and exception handling
Open questions
How will algorithms behave on these
architectures?
Is it possible to define robust
algorithms?

13
Characteristics of GPU arithmetic

Depends on the GPU generation and the vendor
Several compute units
3 MAD units
A.x + B
1 unit for computing special functions
(exp, log, cos, sin, 1/x, 1/√x)
1 interpolator: bilinear, trilinear,
anisotropic
Example: a.x + b.y, a0.(a1.x1 + b1.y1) +
b0.(a2.x2 + b2.y2)
1 blending unit
Example: r = a.r + b.y
Each unit sits along a pipeline
Constraints on how they can be used

14
Block diagram of a GPU
[Figure: block diagram. Labels: Command data fetch; Vertex Shader; Cull/Clip/Setup; Rasterization; Z-Cull; Shared L2 texture cache; Pixel Shader; Fragment pixel crossbar; Z-compare Blend; four memory partitions, each attached to 128 MB of GDDR3]
15
Programmable Vertex Shader
[Figure: vertex shader pipeline. Labels: Vertex data; VLIW; 4-way MIMD MAD; Vertex Texture Fetch; FP32 Scalar Unit; L1 cache; 1 lane: sin, cos, log, exp, rcp, rsq; Shared L2 texture cache; Branch Unit; Primitive Assembly; Viewport Processing; Texture memory; Triangle setup]

Vertex engine
Multithreaded
Penalty-free branching
2 instructions / cycle
9 FLOPS
16
Pixel Shader
[Figure: pixel shader pipeline. Labels: Texture data; Pixel data; 4-way SIMD MADD, texture address computation, Mini-ALU, FP16 normalization; Mip-mapping, filtering; FP Texture Processor; L1 cache; Mini-ALU; 4-way SIMD MADD, Mini-ALU; Shared L2 texture cache; Mini-ALU; Branch Unit; Fog Unit; Texture memory; Fragment pixel crossbar]

Pixel engine
Multithreaded
SIMD
17
Our work

Characterization of the MAD units
A.x + B with intermediate rounding (≠ FMA)
Rounding mode: truncation
Number of extra bits: between 0 and 2
Multiplication without computing all the
partial products
Possible addition of a bias constant
No support for denormals (→ 0)
No qNaN
Precision
Definition of working float-float algorithms
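
The difference between a MAD with intermediate rounding and a fused multiply-add can be shown on a CPU. This is a minimal sketch, not a model of the GPU units characterized on the slide: it uses binary64 with round-to-nearest rather than single precision with truncation, and it emulates the fused operation with exact rationals. The function names are hypothetical.

```python
from fractions import Fraction

def mad(a, x, b):
    """Multiply-add with intermediate rounding: round(round(a * x) + b)."""
    return a * x + b

def fma_exact(a, x, b):
    """Fused multiply-add emulated with exact rationals:
    a * x + b is computed exactly, then rounded once to binary64."""
    return float(Fraction(a) * Fraction(x) + Fraction(b))

a = 1.0 + 2.0 ** -30
x = 1.0 - 2.0 ** -30
b = -1.0

# Exactly, a * x = 1 - 2^-60, which rounds to 1.0 in binary64.
print("MAD:", mad(a, x, b))
print("FMA:", fma_exact(a, x, b))
```

The MAD loses the low-order part of the product before the addition, so algorithms that rely on an FMA to recover that part cannot be ported unchanged.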

18
Discussion

Objectives
Define robust algorithms in the absence
of a floating-point standard
Quantify the resulting overhead
Example
Float-float addition / multiplication with
faithful arithmetic

D. M. Priest, On properties of floating point
arithmetics: numerical stability and the cost of
accurate computations. PhD thesis, 1992
19
Float-float operators
D. M. Priest, Algorithms for arbitrary precision
floating point arithmetic, Proceedings of the
10th IEEE Symposium on Computer Arithmetic
(Arith-10), 1991
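
For readers unfamiliar with float-float operators, here is a minimal sketch of a float-float addition, where a value is stored as an unevaluated sum (hi, lo). It uses the classical TwoSum and FastTwoSum error-free transformations, which assume IEEE round-to-nearest. That is precisely what the GPU arithmetic discussed in these slides does not guarantee, so this shows the shape of the operator only, not the robust variants studied by the group or by Priest. It runs in binary64 because Python floats are doubles.

```python
from fractions import Fraction

def two_sum(a, b):
    """Return (s, e) with s = fl(a + b) and a + b = s + e exactly
    (ASSUMES round-to-nearest arithmetic)."""
    s = a + b
    bb = s - a
    e = (a - (s - bb)) + (b - bb)
    return s, e

def fast_two_sum(a, b):
    """Same result as two_sum, valid when |a| >= |b|."""
    s = a + b
    e = b - (s - a)
    return s, e

def ff_add(x, y):
    """Add two float-float numbers x = (hi, lo) and y = (hi, lo)."""
    s, e = two_sum(x[0], y[0])
    e += x[1] + y[1]
    return fast_two_sum(s, e)

hi, lo = ff_add((1.0, 0.0), (2.0 ** -60, 0.0))
print(hi, lo)
print(Fraction(hi) + Fraction(lo) == 1 + Fraction(1, 2 ** 60))
```

A plain addition would return 1.0 and drop the small term; the pair (hi, lo) keeps it.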
20
Floating-point arithmetic on GPUs
Reference          Total bits  Sign  Exponent  Mantissa  Special values

Nvidia             16          1     5         10        NaN, Inf
Nvidia             32          1     8         23 (+1)   NaN, Inf
ATI                16          1     5         10        No
ATI                24          1     7         16        No
ATI                32          1     8         23 (+1)   No documentation

IEEE-754 ANSI-ISO  32          1     8         23 (+1)   NaN, Inf
IEEE-754 ANSI-ISO  64          1     11        52 (+1)   NaN, Inf
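
To compare the formats in the table, the sketch below derives the spacing between 1 and the next representable number from the stored mantissa width. It assumes every format uses a hidden leading bit, which the table marks only for the 32-bit and 64-bit formats, so the 16-bit and 24-bit values are an assumption. The names are hypothetical.

```python
FORMATS = [
    # (reference, total, sign, exponent, mantissa), copied from the table
    ("Nvidia 16", 16, 1, 5, 10),
    ("Nvidia 32", 32, 1, 8, 23),
    ("ATI 16", 16, 1, 5, 10),
    ("ATI 24", 24, 1, 7, 16),
    ("ATI 32", 32, 1, 8, 23),
    ("IEEE-754 32", 32, 1, 8, 23),
    ("IEEE-754 64", 64, 1, 11, 52),
]

def spacing_above_one(mantissa_bits):
    """Gap between 1.0 and the next representable number,
    ASSUMING a hidden leading bit and `mantissa_bits` stored bits."""
    return 2.0 ** -mantissa_bits

for name, total, sign, exponent, mantissa in FORMATS:
    consistent = (total == sign + exponent + mantissa)
    print(f"{name:12s} spacing above 1 = {spacing_above_one(mantissa):.3e} "
          f"(field widths consistent: {consistent})")
```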
