Title: Checking Computation of Numerical Functions by the Use of Functional Equations
1Checking Computation of Numerical Functions by
the Use of Functional Equations
- REC 2006
- NSF Workshop on Reliable Engineering Computing
- F. Vainstein and C. Jones
2Presentation Summary
- Background
- Fault tolerance
- Computing
- Numerical Functions
- Theory
- Finding checking polynomials
- The general method
- A program developed by this research
- Some examples
- Considerations for deployment
- Future directions
3Fault Tolerance
- Grace in response to the unexpected
- Withstands failures
- Exhibits desirable behavior
- Does not endanger life (military, transportation, medical)
- Preserves scientific investment (space, supercomputing)
- Meets consumer expectations
4Fault Tolerance Can Be Critical
- Military: Global Hawk
- Science: Gravity Probe B
- Exploration: Mars Opportunity Rover
- Civilian: Airbus A380
5Methods for Fault Tolerance
- Modular redundancy
- Backup systems take over in the event the primary unit fails
- Replication with voting
- Duplicate function blocks and compare outputs; the majority wins
- Error-correcting codes
- Reed-Solomon, parity checks
- Algorithm-based fault tolerance (ABFT)
- - Encodes data and augments the algorithm to detect errors
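As a rough illustration of replication with voting, the sketch below runs the same computation on three replicas and keeps the majority answer. The function names and the off-by-one fault model are hypothetical and are not taken from the presentation.

```python
from collections import Counter

def majority_vote(results):
    """Return the value reported by a strict majority of replicas, else None."""
    value, count = Counter(results).most_common(1)[0]
    return value if count > len(results) / 2 else None

def replicated_call(replicas, x):
    """Run every replica on the same input and vote on the outputs."""
    return majority_vote([unit(x) for unit in replicas])

def square(x):
    return x * x

def faulty_square(x):
    # Hypothetical faulty unit: its result is always off by one.
    return x * x + 1

# One faulty unit out of three is outvoted by the two healthy ones.
print(replicated_call([square, square, faulty_square], 7))
```

Voting masks the fault at the cost of tripling the hardware, which is the overhead that result checking tries to avoid.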
6A Complex System: The Space Shuttle
- Total number of parts: > 600,000
- Total weight: 4,500,000 pounds
- Cost to move one pound of cargo: $20,000
- Budget: $3.3 billion / year
7Modular Redundancy: Space Shuttle
Space shuttle avionics, from Sklaroff, Redundancy Management Techniques for Space Shuttle Computers, IBM Journal of Research and Development, 1976.
8Replication With Voting: Space Shuttle
9Complex System: The Microprocessor
Intel Pentium 4, Prescott core
- Number of transistors: > 125 million
- Transistor size: 90 nm
- Pipeline: 31 stages
- Development budget: $4.2 billion/year
"Never in the history of mankind has it been possible to produce so many wrong answers so quickly." Carl-Erik Froeberg
10What Does a Microprocessor Do?
- ALU (arithmetic logic unit) performs math and logic functions.
- Math coprocessors were big business for Intel and others in the 1980s.
- Today, most processors incorporate a math coprocessor or emulator for numerical calculations.
- Moves data from one memory location to another
- Makes decisions and jumps to a new set of instructions
11IBM FPU Core
"Scientific codes typically spend much of their time in common numerical subroutines - about 70% of a phase retrieval application, for example, is spent in the Fast Fourier Transform alone." M. Turmon, Annual Report for FY 2001, Final Report: Algorithm-Based Fault Tolerance, NASA-JPL, Remote Exploration and Experimentation Project.
Image legend:
- Dark blue: Interface, Decode and Issue
- Pink: Pipe Management and Data Forwarding
- Yellow: Arithmetic Pipe
- Aqua: Load/Store Pipe
12Numerical Functions
Numbers from numbers
- Degrees to radians
- Cosine
- Hyperbolic Sine
- ArcSine
- SINC function
- Next positive power of 2
- Linear interpolation
- Root finding
- Gaussian
- Mod
- Greatest Common Divisor
- Absolute value
- Minimum
- Maximum
- Round to next integer
- Return the fractional part of a value
- Clip in a saturation fashion
- Wrapping for integers
- Log
- Fast Fourier Transform (FFT)
- Numerical Differentiation
- Kalman Filtering
13Numerical Functions in Action 1
- IMAGE PROCESSING
- The FIDO Mars Exploration Rover (MER) relies on detailed panoramic views in its operation for near real-time tasks
- Determination of exact location
- Navigation
- Science target identification
- Mapping
- WEATHER MODELING
- Roe, K., et al., High Resolution Weather Monitoring for Improved Fire Management, 2001, Maui HPCC
- Real-time analysis of environmental information for prediction of fire behavior
14Numerical Functions in Action 2
- NON-LINEAR CONTROL SYSTEMS
- Brennan, S., Integrated Chassis Control for Vehicles, 2000
- SCIENTIFIC SUPERCOMPUTING
- U. Landman, et al., Large-scale classical molecular dynamics, 2001, Georgia Tech
15Background Summary
- Computing is at the heart of most modern systems
- Fault tolerance is a concern, especially for mission- and safety-critical systems
- The computation of numerical functions is a critical area of computing
16Notable Work in Numerical Result Checking
- M. Blum, R. Rubinfeld
- - Self-Testing/Correcting with Application to Numerical Problems, 1990
- M. Blum, H. Wasserman
- - Reflections on the Pentium Division Bug, 1995
- - Software Reliability Via Runtime Result Checking, 1997
- Promoted numerical checking
- A motivation for result checking
Used functional equations, but no general method existed.
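To make the idea of checking with a functional equation concrete, here is a minimal sketch that spot-checks an exponential routine with the identity f(x + a) = f(x) * f(a). The function names, the fault model, the shift range, and the tolerance are all assumptions chosen for illustration. Passing the check is a necessary condition only, since every function of the form c^x satisfies the same equation.

```python
import math
import random

def check_exp(f, x, trials=5, rel_tol=1e-9, seed=0):
    """Spot-check f at x using the functional equation f(x + a) = f(x) * f(a)."""
    rng = random.Random(seed)  # seeded so that the sketch is reproducible
    for _ in range(trials):
        a = rng.uniform(-1.0, 1.0)  # random shift; the range is an arbitrary choice
        if not math.isclose(f(x + a), f(x) * f(a), rel_tol=rel_tol):
            return False
    return True

def faulty_exp(x):
    # Hypothetical faulty implementation: a small additive error on every result.
    return math.exp(x) + 1e-3

print(check_exp(math.exp, 2.5))    # the library routine satisfies the equation
print(check_exp(faulty_exp, 2.5))  # the additive error breaks the equation
```

The check costs a few extra evaluations and one multiplication per trial, and it never needs a second, independent implementation of the function.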
17An Algebraic Method for Fault Tolerance
1991: Feodor Vainstein, Georgia Tech, Error Detection and Correction in Numerical Computations by Algebraic Methods
Developed a general theory for generating functional equations. Showed that many numerical functions have functional equations and that computations of such functions can be verified by checking polynomials, a novel technique based upon algebraic concepts such as the transcendental degree of field extensions.
18Contribution of This Work: A Method for Practical Numerical Checking
- Developed a software method for finding checking polynomials.
- Treated the case of functions that are not polynomially checkable.
- User-friendly program for hardware/software engineering
- Design considerations
19Polynomial Numerical Checking: Example 1
20Polynomial Numerical Checking: Example 2
21Polynomial Numerical Checking: Example 3
22Polynomial Numerical Checking: Example 4
23Algebra: Fields
S. Lang, Algebra, Addison-Wesley, 1965
24Algebra: Algebraically Dependent
25Algebra: Transcendental Degree of Field Extension
26Algebra: Algebraically Closed and Algebraic Closure
27Algebra: Linear Independence
28Theory: Polynomially Checkable
29Theorem
30Theory: Example and Generality
31Theory: Linearly Checkable
32Theory: Other Cases
- We also considered:
- Functions over various fields
- Polynomially checkable (PC) and linearly checkable (LC) functions of several variables
- Partially polynomially checkable functions
- The focus of the present work is on finding a practical method for determining approximate checking polynomials for real-valued PC and non-PC functions of a single variable.
33Least Squares Estimation
The least squares estimation technique is used to estimate parameters and to fit data. Since some functions are not PC, the method is generalized to find approximate checking polynomials for non-PC functions.
- There are other methods, but this one was chosen to:
- Add robustness
- Develop a practical process
- Treat all polynomially checkable functions
34Application of Least Squares Estimation 1
The problem of finding a checking polynomial can
be reduced to the following optimization problem.
Let
35Application of Least Squares Estimation 2
36Application of Least Squares Estimation 3
37Application of Least Squares Estimation 4
38Software Implementation of Least Squares Estimation 1
Solve the matrix equation
39Software Implementation of Least Squares Estimation 2
The coefficients of the checking polynomial are then in vector X.
Those values can be used to find the value of the delta function.
The deviation shows how good our approximation is.
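The matrix equation itself is not reproduced in this transcript, so the sketch below is an assumption about its shape, not the authors' program. It fits a linear check of the form f(x + k*h) ~ beta + sum of alphas[i] * f(x + i*h) by solving the normal equations, returns the coefficients (the role played by vector X), and reports the largest residual as a deviation. All names and the sample count are hypothetical.

```python
import math

def solve_linear(M, v):
    """Solve M x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [v[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def fit_linear_check(f, lo, hi, k, step, samples=200):
    """Least squares fit of f(x + k*step) ~ beta + sum_i alphas[i] * f(x + i*step)."""
    xs = [lo + (hi - lo) * j / (samples - 1) for j in range(samples)]
    rows = [[1.0] + [f(x + i * step) for i in range(k)] for x in xs]
    targets = [f(x + k * step) for x in xs]
    n = k + 1
    # Normal equations: (R^T R) X = R^T t, with R = rows and t = targets.
    gram = [[sum(r[p] * r[q] for r in rows) for q in range(n)] for p in range(n)]
    rhs = [sum(r[p] * t for r, t in zip(rows, targets)) for p in range(n)]
    X = solve_linear(gram, rhs)
    beta, alphas = X[0], X[1:]
    # Deviation: largest absolute residual over the sample points.
    delta = max(abs(t - sum(c * val for c, val in zip(X, r)))
                for r, t in zip(rows, targets))
    return alphas, beta, delta

alphas, beta, delta = fit_linear_check(math.sin, 0.0, math.pi, k=2, step=0.5)
print(alphas, beta, delta)
```

For the sine function with k = 2 the fit recovers the exact relation sin(x + 2h) = 2 cos(h) sin(x + h) - sin(x), so the deviation is at rounding level. For a function with no exact linear check the same procedure returns the best approximate coefficients and a nonzero deviation.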
40The Matlab Function
- Solves the least squares estimation problem
- Finds the delta function value for a range of k
- Returns the checking polynomial coefficients for the best (smallest error) delta function
- Plots the error over the function domain for the best delta
- Plots deviation for a range of k
- Simulink with DSP Builder generates VHDL and deploys to Altera FPGA (Xilinx similar)
41Function Input
42Function Output
43Example: SINE Function Output
44Example: SINE Function Plots
The sine function is linearly checkable (LC)
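One linear relation that the sine function satisfies exactly is sin(x + 2h) - 2 cos(h) sin(x + h) + sin(x) = 0 for any fixed step h. The sketch below uses it as a run-time check. The step size, tolerance, and fault model are assumptions for illustration, and the relation is a standard identity rather than necessarily the one shown in the original plots.

```python
import math

def sine_residual(f, x, step):
    """Residual of the linear check f(x+2h) - 2*cos(h)*f(x+h) + f(x) = 0."""
    # In a deployed checker 2*cos(h) would be a precomputed constant.
    return f(x + 2 * step) - 2 * math.cos(step) * f(x + step) + f(x)

def passes_check(f, x, step=0.25, tol=1e-12):
    """Accept the computation when the residual is within the tolerance."""
    return abs(sine_residual(f, x, step)) <= tol

def faulty_sine(x):
    # Hypothetical fault model: the result is perturbed on part of the domain.
    y = math.sin(x)
    return y + 1e-6 if 1.0 < x < 1.2 else y

print(passes_check(math.sin, 0.9))     # healthy routine: residual at rounding level
print(passes_check(faulty_sine, 0.9))  # the perturbed value at x + h is caught
```

The check needs only two multiplications and two additions per tested point, which is far cheaper than recomputing the sine itself.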
45The Logarithm Function: Output
46The Logarithm Function: Plots
47The Logarithm Function: k = 140
48Checking Polynomials: Simple Functions
49Checking Polynomials: Compound Functions
50Why Matlab
Matlab (MATrix LABoratory)
- Matrix-oriented programming environment
- Code can compile to C/C++
- Built-in routines for data analysis and visualization
- GUI/Web publishing support
- A popular environment for technical computing
http://www.gtrep.gatech.edu/undergradlabs/labman/CheckingPolynomial
51Deployment Considerations
- Hardware or software
- Pipeline or parallel
- If a non-LC function returns a high-order checking polynomial:
- Break up the function domain
- Generate a separate checking polynomial for each sub-interval
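The effect of breaking up the domain can be sketched with the simplest possible check, f(x + h) ~ alpha * f(x) + beta, fitted separately on each sub-interval. The logarithm, the interval [1, 9], the step, and the piece counts below are arbitrary illustrative choices, and the function names are hypothetical.

```python
import math

def fit_line(us, ys):
    """Closed-form least squares fit ys ~ alpha * us + beta."""
    n = len(us)
    mu, my = sum(us) / n, sum(ys) / n
    suu = sum((u - mu) ** 2 for u in us)
    suy = sum((u - mu) * (y - my) for u, y in zip(us, ys))
    alpha = suy / suu
    return alpha, my - alpha * mu

def piecewise_deviation(f, lo, hi, pieces, step, samples_per_piece=50):
    """Fit f(x + step) ~ alpha * f(x) + beta per sub-interval; return the worst residual."""
    width = (hi - lo) / pieces
    worst = 0.0
    for p in range(pieces):
        a = lo + p * width
        xs = [a + width * j / (samples_per_piece - 1) for j in range(samples_per_piece)]
        us = [f(x) for x in xs]
        ys = [f(x + step) for x in xs]
        alpha, beta = fit_line(us, ys)
        residual = max(abs(y - (alpha * u + beta)) for u, y in zip(us, ys))
        worst = max(worst, residual)
    return worst

# Print the worst-case deviation for an increasing number of sub-intervals.
for pieces in (1, 2, 4, 8):
    print(pieces, piecewise_deviation(math.log, 1.0, 9.0, pieces, step=0.5))
```

The trade-off is storage: each sub-interval needs its own coefficients, so a low-order check on many pieces exchanges arithmetic for table space.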
52Simulink Design
[k,delta,alphas,betao,stepsize,A,B] = LSEFUNRUN('exp(x).*sin(x)',10^-4,0,3.1415,(1:2))
k
delta
alphas
beta
- We show a Simulink example
- Extension of Matlab
- Modeling, simulating
- GUI environment
- Toolboxes for DSP, etc
- Toolboxes for targeting FPGA devices
53Simulink Implementation of Checking Algorithm
54Space Complexity
For a ROM implementation that stores
b-bit numbers and has m address lines.
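The size formula itself is not reproduced in this transcript. As a hedged reminder of the standard count, which is not necessarily the exact expression used in the original: m address lines select one of 2^m words, and each word is b bits wide, so the table needs

```latex
S_{\mathrm{ROM}} = b \cdot 2^{m} \ \text{bits}
```

For example, m = 10 and b = 16 give 16 x 1024 = 16,384 bits (2 KiB).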
55Error Coverage
56Error Coverage Example
This is the percentage of all errors covered.
57Design Flow
58Target Markets
- Numerically intense, safety-critical, or mission-critical systems
- Supercomputing
- Moletronics and nanosystems
- Space or remote systems
- Control systems using COTS components
59Example: NASA Seeks COTS Remote Supercomputing
Space Radiation
S. Kayali, Space Radiation Effects on
Microelectronics, Radiation Effects Group, JPL,
Section 514.
60Traditional Fault Tolerant Devices are Costly in
Terms of Design Space, Time, and Money
- Perry COTS initiative
- Buy more commercial products
- Use industrial specifications
- Reduce costs
- William J. Perry, Specifications and Standards: A New Way of Doing Business, Memorandum, 1994
Radiation-Hardened Half-Micron CMOS 16K SRAM,
Sandia National Laboratories
61Moletronics, CMOL, and Nanodevices Will Require
Minimizing Fault Tolerant Strategies
"Low yield and structural defects will be considerable (in moletronic devices). Hence, the target architecture has to be inherently fault-tolerant/configurable. If you want to compensate for the errors then you have to use error-correcting codes and fault-tolerant circuits." V. Roychowdhury, A Quest for Information, Frontiers in Nanocomputing Seminar, 2004
Single molecular implementation of single-electron transistor, K. Likharev, Electronics Below 10nm, 2003
62Demands for Numerical Fault Tolerant Computing
63Numerical Checking Only Part of the Solution: Complex Systems Require Multiple Fault Tolerant Strategies
64Conclusions and Future Directions
- Remaining Tasks
- Tame functional discontinuities
- Deploy to hardware/software testbed
- Investigate impact of single and multiple checking polynomial strategies
- Investigate best interface strategies
- Develop Numerical Checking Toolbox
- Functions of several variables
- Partially polynomially checkable functions
65 Thank You!