Title: 332:437 Lecture 19 Time Redundancy
1 332:437 Lecture 19 Time Redundancy
- Time redundancy
- Alternating logic
- Boolean self-dual functions
- Recomputing with shifted operands
- Recomputing with swapped operands
- Recomputing with duplication with comparison
- Software redundancy
- Summary
2 Material from Design and Analysis of Fault-Tolerant Digital Systems, by Barry Johnson, Addison-Wesley.
3 Time Redundancy
- Reduce extra fault-tolerant hardware at the expense of time
- Detects transient faults, but not permanent ones
- Method: repeat computation, store results, compare results
- Example: have error detection and correction, but want to know whether fault is permanent or transient
- Must keep data around to perform comparison
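The repeat, store, compare method can be sketched in a few lines. This is a minimal illustration, not taken from the slides: `FlakyAdder` and `run_with_time_redundancy` are hypothetical names, and the fault is modeled as a flip of bit 3 of the result on selected calls.

```python
def run_with_time_redundancy(compute, operands, repeats=2):
    """Repeat the same computation and compare the stored results.

    Returns (first_result, fault_detected). A mismatch between runs
    signals a transient fault; a permanent fault corrupts every run
    identically and goes unnoticed.
    """
    results = [compute(*operands) for _ in range(repeats)]  # store results
    fault_detected = any(r != results[0] for r in results[1:])
    return results[0], fault_detected


class FlakyAdder:
    """Hypothetical 8-bit adder whose result bit 3 flips on selected calls."""

    def __init__(self, faulty_calls):
        self.faulty_calls = set(faulty_calls)  # call indices hit by the fault
        self.calls = 0

    def __call__(self, a, b):
        result = (a + b) & 0xFF
        if self.calls in self.faulty_calls:
            result ^= 0b1000  # the fault flips bit 3 of this result
        self.calls += 1
        return result


transient = FlakyAdder(faulty_calls=[1])     # fault hits only the 2nd run
permanent = FlakyAdder(faulty_calls=[0, 1])  # fault hits every run

print(run_with_time_redundancy(transient, (20, 22)))  # runs disagree
print(run_with_time_redundancy(permanent, (20, 22)))  # runs agree on a wrong value
```

The second call shows why plain repetition misses permanent faults: both runs are corrupted the same way, so the comparison passes.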
4 Time Redundancy Method
5 Extension to Detect Permanent Faults
6 Alternating Logic Method
- Time t0: transmit original data
- Time t0 + δ: transmit complemented data
- Detects stuck-at faults on bus
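A small model shows why a stuck-at line on the bus is caught. The sketch assumes an 8-bit bus and represents stuck lines as bit masks; `bus` and `alternating_transfer` are illustrative names, not part of the lecture.

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1


def bus(word, stuck_at_0=0, stuck_at_1=0):
    """Model a bus whose lines may be stuck at 0 or 1 (given as bit masks)."""
    return ((word & ~stuck_at_0) | stuck_at_1) & MASK


def alternating_transfer(word, **faults):
    """Send the word at t0 and its complement at t0 + delta.

    Returns (received_word, error_detected). On a fault-free bus every
    line alternates, so the two received words are bitwise complements.
    """
    first = bus(word, **faults)           # time t0: original data
    second = bus(~word & MASK, **faults)  # time t0 + delta: complemented data
    error_detected = (first ^ second) != MASK  # some line failed to alternate
    return first, error_detected


print(alternating_transfer(0b10110010))                      # fault-free bus
print(alternating_transfer(0b10110010, stuck_at_0=0b10000))  # line 4 stuck at 0
```

A stuck line carries the same value in both transfers, so it fails to alternate whatever data word is sent.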
7 Example: Alternating Logic Bus Transmission
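8 Self-Dual Boolean Functions
- Dual of Boolean function f is
- $f_d(x_1, x_2, \ldots, x_n) = \overline{f(\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_n)}$
- Self-dual Boolean function
- $f_{sd}(\bar{x}) = \overline{f_{sd}(x)}$
- By definition
- $f_{sd}(x_1, \ldots, x_n) = f(x_1, \ldots, x_n) = f_d(x_1, \ldots, x_n)$
- Complementary inputs give complementary outputs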
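Self-duality can be checked by brute force over the truth table: complement every input and confirm the output complements. The helper `is_self_dual` and the three example functions are chosen here for illustration only.

```python
from itertools import product


def is_self_dual(f, n):
    """True if complementing all n inputs always complements the output."""
    return all(
        f(*[1 - x for x in xs]) == 1 - f(*xs)
        for xs in product((0, 1), repeat=n)
    )


def inverter(a):
    return 1 - a


def majority(a, b, c):
    return (a & b) | (a & c) | (b & c)


def and2(a, b):
    return a & b


print(is_self_dual(inverter, 1))  # True
print(is_self_dual(majority, 3))  # True
print(is_self_dual(and2, 2))      # False: inputs 01 and 10 both give 0
```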
9 Alternating Logic Fault Detection Condition
- Detects faults if, for every fault, at least one input combination produces non-alternating outputs
- Example: full adder
- Problem: may take some computation time before you hit the input combination that reveals the fault
- Error latency
10 Full-Adder Is Self-Dual
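The sketch below checks that both full-adder outputs alternate under complemented inputs, then injects one fault to illustrate error latency. The fault site is an assumption made for this example: the branch of input `a` that feeds the carry logic is stuck at 0, while the sum logic still sees the true `a`.

```python
from itertools import product


def full_adder(a, b, cin, fault=False):
    """One-bit full adder returning (sum, carry_out).

    With fault=True, the branch of input `a` feeding the carry logic is
    stuck at 0 (a hypothetical fault site chosen for this example).
    """
    s = a ^ b ^ cin
    a_carry = 0 if fault else a
    cout = (a_carry & b) | (a_carry & cin) | (b & cin)
    return s, cout


def alternates(adder, inputs):
    """Apply the inputs, then their complements; check both outputs flip."""
    first = adder(*inputs)
    second = adder(*[1 - x for x in inputs])
    return all(x != y for x, y in zip(first, second))


all_inputs = list(product((0, 1), repeat=3))

# Fault-free adder: every input pair alternates, so sum and carry are self-dual.
print(all(alternates(full_adder, xs) for xs in all_inputs))


def faulty(a, b, cin):
    return full_adder(a, b, cin, fault=True)


# Only some inputs expose the fault; until one occurs, the fault is latent.
revealing = [xs for xs in all_inputs if not alternates(faulty, xs)]
print(len(revealing), "of", len(all_inputs), "input combinations reveal the fault")
```

For this particular fault the carry fails to alternate only when `b` and `cin` differ, which is also the only case where the fault can produce a wrong carry.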
11 Recomputing with Shifted Operands (RESO)
- Developed for ALUs
- Encoding function: left shift
- Decoding function: right shift
- For bit-sliced, ripple-carry adder
- Two-bit arithmetic shift needed to guarantee error detection
- Extra hardware
- 3 Shifters
- Storage register
- Comparator
- 2 Extra bits of ALU
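A software model of RESO on a ripple-carry adder might look like the following. It assumes unsigned 8-bit operands, a 10-bit ALU (the 2 extra bit slices), results taken modulo 2^8, and a single hypothetical fault type: the sum output of one bit slice stuck at 0. The function names are illustrative.

```python
def ripple_add(a, b, width, faulty_slice=None):
    """Bit-sliced ripple-carry adder; the final carry out is discarded.

    If `faulty_slice` is given, that slice's sum output is stuck at 0
    (a hypothetical fault used for this sketch).
    """
    result, carry = 0, 0
    for i in range(width):
        x, y = (a >> i) & 1, (b >> i) & 1
        s = x ^ y ^ carry
        carry = (x & y) | (x & carry) | (y & carry)
        if i == faulty_slice:
            s = 0
        result |= s << i
    return result


def reso_add(a, b, n=8, shift=2, faulty_slice=None):
    """Add twice on an (n + shift)-bit ALU: plain, then with shifted operands.

    Returns (stored_result, error_detected), with results taken mod 2**n.
    """
    width = n + shift                      # 2 extra ALU bit slices
    mask = (1 << n) - 1
    first = ripple_add(a, b, width, faulty_slice) & mask
    shifted = ripple_add(a << shift, b << shift, width, faulty_slice)  # encode
    second = shifted >> shift              # decode: shift the result back
    return first, first != second


print(reso_add(45, 22))                  # fault-free
print(reso_add(45, 22, faulty_slice=1))  # slice 1 faulty
```

Because the shifted computation maps result bit i onto slice i + 2, the faulty slice corrupts a different result bit in each pass, and the comparison exposes the disagreement.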
12 RESO Faults
13 RESO Example on ALU
14 RESO Problems
- Extra Hardware
- No fault coverage for shifters
- Comparator must be fully self-checking
- Can instead do a circular shift
- Avoids creating 2 extra ALU bit slices
- Complicates carry handling circuit
- Perhaps not worthwhile
15 Recomputing with Swapped Operands (RESWO)
- Swap upper and lower halves of operands for 2nd operation
16 RESWO Error Analysis
- Normal mode
- Error in bit slice i affects sum and carry out
- Changes result by $0$, $\pm 2^{i}$, $\pm 2^{i+1}$, or $\pm 2^{i} \pm 2^{i+1}$
- Computing with swapped operands
- Error in bit slice i throws off result by $0$, $\pm 2^{i+r+1}$, $\pm 2^{i+r+2}$, or $\pm 2^{i+r+1} \pm 2^{i+r+2}$
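The following sketch models RESWO on an 8-bit ripple-carry adder. Two points are assumptions of this example, not statements from the slides: in the swapped pass the carry chain starts at the slice holding the original low half and wraps from the top slice into slice 0, and the injected fault is a sum output stuck at 1.

```python
def swap_halves(x, n):
    """Exchange the upper and lower halves of an n-bit word."""
    r = n // 2
    low, high = x & ((1 << r) - 1), x >> r
    return (low << r) | high


def sliced_add(a, b, order, faulty_slice=None):
    """Ripple-carry add that visits bit slices in `order`.

    If `faulty_slice` is given, that slice's sum output is stuck at 1
    (a hypothetical fault used for this sketch).
    """
    result, carry = 0, 0
    for i in order:
        x, y = (a >> i) & 1, (b >> i) & 1
        s = x ^ y ^ carry
        carry = (x & y) | (x & carry) | (y & carry)
        if i == faulty_slice:
            s = 1
        result |= s << i
    return result


def reswo_add(a, b, n=8, faulty_slice=None):
    """Return (stored_result, error_detected) for an n-bit add, mod 2**n."""
    r = n // 2
    normal = list(range(n))
    # Assumed carry path for the swapped pass: slices r..n-1 hold the
    # original low halves, and their carry out feeds slice 0.
    swapped = list(range(r, n)) + list(range(r))
    first = sliced_add(a, b, normal, faulty_slice)
    encoded = sliced_add(swap_halves(a, n), swap_halves(b, n), swapped, faulty_slice)
    second = swap_halves(encoded, n)  # swap the result halves back
    return first, first != second


print(reswo_add(45, 22))                  # fault-free
print(reswo_add(45, 22, faulty_slice=2))  # slice 2 faulty
```

In this model the faulty slice touches result bit i in the normal pass and a bit half a word away in the swapped pass, so a single faulty slice cannot corrupt both results identically.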
17 RESWO Example on ALU
18 Recomputing with Duplication with Comparison (REDWC)
- First calculation
- Each 16-bit adder adds lower halves
- Second calculation
- Each 16-bit adder adds upper halves
- Using output carry of 1st calculation as input carry
- Both times, 2 results are compared and 1 result is stored
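The two-step REDWC addition can be modeled as below for a 32-bit add on two 16-bit adders. The names and the injected fault (one sum bit of the second adder stuck at 1) are assumptions made for illustration.

```python
HALF = 16
MASK = (1 << HALF) - 1


def adder16(a, b, cin, stuck_bit=None):
    """16-bit adder returning (sum, carry_out).

    If `stuck_bit` is given, that sum bit is stuck at 1 (hypothetical fault).
    """
    total = a + b + cin
    s, cout = total & MASK, total >> HALF
    if stuck_bit is not None:
        s |= 1 << stuck_bit
    return s, cout


def redwc_add(a, b, fault_in_adder_b=None):
    """32-bit add in two calculations on two 16-bit adders.

    Returns (result mod 2**32, error_detected).
    """
    error = False
    carry, halves = 0, []
    for shift in (0, HALF):               # lower halves first, then upper halves
        x, y = (a >> shift) & MASK, (b >> shift) & MASK
        res_a = adder16(x, y, carry)      # adder A: assumed fault-free here
        res_b = adder16(x, y, carry, fault_in_adder_b)
        error |= res_a != res_b           # compare the 2 results
        halves.append(res_a[0])           # store 1 result
        carry = res_a[1]                  # carry into the second calculation
    return (halves[1] << HALF) | halves[0], error


print(redwc_add(0x0001FFFF, 0x00000001))
print(redwc_add(0x0001FFFF, 0x00000001, fault_in_adder_b=5))
```

As with ordinary duplication with comparison, a fault is flagged only when it makes the two adders disagree on the data being processed.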
19 First Calculation
20 Second Calculation
- Same hardware used 2 times to compare results
21 REDWC Summary
- Same fault detection ability as Duplication with Comparison
- Can use with 4-bit carry lookahead adders
- Carry bit must ripple through lookahead units
22 Example on 32-bit Adder
23 Example with Carry Lookahead Adder (CLA)
24 Hardware Overhead for Time Redundancy Methods
25 Time Redundancy for Error Correction
26 Error Correction Using Time Redundancy
- Works for bit-wise AND operation
- Does not work for arithmetic, since adjacent bits
are not independent
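One way to picture correction for a bit-wise AND is to compute three times with different operand alignments and take a bit-wise majority vote. The text above does not say which encodings are used, so this sketch assumes no rotation, a one-bit left rotation, and a one-bit right rotation on an 8-bit word, with one bit slice's output stuck at 0.

```python
N = 8
MASK = (1 << N) - 1


def rotl(x, k):
    """Rotate an N-bit word left by k positions."""
    k %= N
    return ((x << k) | (x >> (N - k))) & MASK


def faulty_and(a, b, stuck_slice=None):
    """Bit-wise AND; optionally one bit slice's output is stuck at 0."""
    out = a & b
    if stuck_slice is not None:
        out &= ~(1 << stuck_slice) & MASK
    return out


def corrected_and(a, b, stuck_slice=None):
    """Compute three times with different rotations, then vote per bit."""
    results = []
    for k in (0, 1, N - 1):                  # none, left by 1, right by 1
        encoded = faulty_and(rotl(a, k), rotl(b, k), stuck_slice)
        results.append(rotl(encoded, N - k))  # decode: rotate back
    r0, r1, r2 = results
    return (r0 & r1) | (r0 & r2) | (r1 & r2)  # bit-wise majority


print(bin(corrected_and(0b11111111, 0b10111101, stuck_slice=4)))
```

The vote works because each result bit of an AND depends only on its own slice: the faulty slice corrupts a different result bit in each pass, so at most one of the three copies of any bit is wrong. In an adder a carry can spread one slice's error to neighboring bits, which is why the same trick fails for arithmetic.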
27 Three Methods of Software Redundancy
- Consistency checks: use a priori knowledge of information characteristics
- Check that some digitized reading does not exceed a known maximum for the sensor
- Amount of cash requested from ATM should never exceed maximum withdrawal amount
- Computer should never receive illegal OPCODE from program instruction (otherwise, you are executing data)
- Compare measured performance of control system with predicted performance
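Each consistency check above reduces to a guard on a value whose legal range is known in advance. The limits, the opcode set, and the tolerance in this sketch are all hypothetical placeholders.

```python
SENSOR_MAX = 1023                   # hypothetical full-scale ADC reading
MAX_WITHDRAWAL = 500                # hypothetical ATM withdrawal limit
LEGAL_OPCODES = {0x01, 0x02, 0x10}  # hypothetical instruction set


def sensor_ok(reading):
    """A digitized reading must not exceed the sensor's known maximum."""
    return 0 <= reading <= SENSOR_MAX


def withdrawal_ok(amount):
    """Requested cash must not exceed the maximum withdrawal amount."""
    return 0 < amount <= MAX_WITHDRAWAL


def opcode_ok(opcode):
    """An illegal opcode suggests the processor is executing data."""
    return opcode in LEGAL_OPCODES


def control_ok(measured, predicted, tolerance=0.05):
    """Measured response must stay within a relative tolerance of the model."""
    return abs(measured - predicted) <= tolerance * abs(predicted)


print(sensor_ok(4096), withdrawal_ok(200), opcode_ok(0xFF), control_ok(10.2, 10.0))
```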
28 Software Redundancy (continued)
- Packet data transfer systems check for word count overflow
- Capability checks
- Periodically write patterns to memory and read them to see if you get the same data back
- Run ALU test cases and compare to correct answers stored in ROM
- Check that all processors can communicate: each sets a bit in shared memory, other CPUs check that bit got set
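The memory capability check can be sketched with a toy RAM model. `Memory` and `memory_check` are illustrative names, the stuck data bit is a hypothetical fault, and the patterns 0x55 and 0xAA are chosen so that every bit is exercised at both 0 and 1.

```python
class Memory:
    """Byte-wide RAM model; optionally one data bit is stuck at 0."""

    def __init__(self, size, stuck_bit=None):
        self.cells = bytearray(size)
        self.stuck_bit = stuck_bit

    def write(self, addr, value):
        if self.stuck_bit is not None:
            value &= ~(1 << self.stuck_bit) & 0xFF  # faulty bit cannot hold a 1
        self.cells[addr] = value

    def read(self, addr):
        return self.cells[addr]


def memory_check(mem, size, patterns=(0x55, 0xAA)):
    """Write each pattern everywhere, read it back, and report success."""
    for pattern in patterns:
        for addr in range(size):
            mem.write(addr, pattern)
        if any(mem.read(addr) != pattern for addr in range(size)):
            return False
    return True


print(memory_check(Memory(64), 64))
print(memory_check(Memory(64, stuck_bit=3), 64))
```

A check run on a live system would also need to save and restore the tested locations, which this sketch leaves out.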
29 Software Redundancy (continued)
- N-Version Programming
- Software faults result from incorrect design/coding
- Each of N modules designed and coded by separate programmer group
- Get N results from N software copies and compare the N results
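A toy 3-version example follows: three independently written integer square root routines, one with a deliberate rounding bug, and a comparison step. Majority voting is one possible way to compare the N results and is an assumption of this sketch, as are all the function names.

```python
import math
from collections import Counter


def isqrt_v1(n):
    """Version 1: library routine."""
    return math.isqrt(n)


def isqrt_v2(n):
    """Version 2: integer Newton iteration."""
    if n < 2:
        return n
    x = n
    while x * x > n:
        x = (x + n // x) // 2
    return x


def isqrt_v3(n):
    """Version 3: deliberately buggy, rounds to nearest instead of flooring."""
    return round(n ** 0.5)


def n_version(n, versions):
    """Run every version; return (majority value, majority reached, all results)."""
    results = [version(n) for version in versions]
    value, count = Counter(results).most_common(1)[0]
    return value, count > len(versions) // 2, results


versions = [isqrt_v1, isqrt_v2, isqrt_v3]
print(n_version(9, versions))  # all versions agree
print(n_version(8, versions))  # version 3 is outvoted
```

The vote helps only when the versions fail independently; if two groups make the same mistake, the wrong answer wins.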
30 N-Version Programming
- Ferociously expensive
- Software designers/coders often make similar mistakes
- Does not catch specification errors
- All N groups work from same spec.
- Better method
- Use rigid design methods and rules to prevent software faults from occurring.
- Perform massive numbers of acceptance tests before using the software
31 Time Redundancy Summary
- Repeat hardware calculation at some Δt later, with a different encoding, and compare
- Alternating logic
- Recomputing with Shifted Operands
- Recomputing with Swapped Operands
- Recomputing with Duplication with Comparison
- Use software redundancy
- Faults are more likely to be in software than in hardware in PCs and phone exchanges