Title: active redundancy
1 active redundancy
- purpose (of the system)
- redundancy types, examples
- activeness of redundancy
- case study - aviation safety
- in the large
- in the medium
- in the small - hardware
- redundancy for what? (performance? reliability?)
- redundancy use for reliability and fault
tolerance - equations and processes
- are we good? comparison with classic
- where else can we use this?
17 dec 08, FoC, LMU
Igor Schagaev
2 active redundancy - why? we can use it in:
computer science (theoretical, HW, SW)
aerospace ways and systems of operation
avionic (RTD, safety critical systems)
econometrics (large scale project life cycle)
curriculum design, teaching...
3 ar - purpose of classification
Why do we need classification? What is the role of any classification in research and development? What is the benefit of all these theories and classifications? Great thinkers made some effort to clarify what we should expect from research and science:
"Only method is able to control the thought, lead it to and keep it within the subject." Hegel (Foreword to Encyclopaedia of Philosophical Sciences), Berlin, 25.05.1827
"I must invent my own system, or be enslaved by another man's. I will not reason and compare: my business is to create." William Blake, Jerusalem (1804)
And, finally, "those who distinguish themselves by their independence of judgment and not just their quick-wittedness happily began discussions about the goals and method of science." A. Einstein, Phys. Zeitschr. 17, 101 (1916)
4 ar purpose of the system
- we assume we know the purpose of the system
- power generation
- control of the process
- transportation
- reduction of cost
- construction
- manufacturing, etc
what if we need something extra from the system: efficiency, zero latency, safety? we need to introduce it into the system -> we need a theory of how...
5 ar classification
This is how we think
This is how we implement
Open question: S x I x T = const?
6 ar - types
Hardware based redundancy types
- HW(2S) structural redundancy of hardware, such as duplicated systems
- HW(S1,S2) hardware system with different (non-identical) units for the same function
- HW(I1) extra information in hardware, to check errors
- HW(nT) special hardware implementing an n-time delay to repeat functions
- HW(dT) special hardware implementing a small delay to avoid malfunction
Software based redundancy types
- SW(2T) double repetition of the same procedure to check the results
- SW(I) informational redundancy of the program: back-up files, recovery points
- SW(S1,S2) two different versions of the program for the same function
- SW(dT) time delays realized in software to wait for a guaranteed result
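Two of the software based types can be illustrated with a minimal sketch. It is only one possible reading of SW(2T) and SW(S1,S2); the names `run_twice_and_compare`, `run_two_versions` and `MismatchError` are hypothetical and do not come from the slides.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class MismatchError(Exception):
    """Raised when two redundant computations disagree (hypothetical name)."""


def run_twice_and_compare(procedure: Callable[[], T]) -> T:
    """SW(2T)-style check: repeat the same procedure and compare the results."""
    first = procedure()
    second = procedure()  # time redundancy: the same code runs a second time
    if first != second:
        raise MismatchError(f"repetitions differ: {first!r} vs {second!r}")
    return first


def run_two_versions(version_a: Callable[[], T], version_b: Callable[[], T]) -> T:
    """SW(S1,S2)-style check: two different versions of the same function."""
    result_a = version_a()
    result_b = version_b()  # structural redundancy: a different implementation
    if result_a != result_b:
        raise MismatchError(f"versions differ: {result_a!r} vs {result_b!r}")
    return result_a


if __name__ == "__main__":
    # Same procedure executed twice.
    print(run_twice_and_compare(lambda: sum(range(10))))
    # Two different implementations of "sum of 0..4".
    print(run_two_versions(lambda: sum(range(5)), lambda: 4 * 5 // 2))
```

A mismatch only detects that something went wrong; deciding what to do next is the job of the recovery steps.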
how can we use it? patience please, see next slide...
7 ar reliability vs. fault tolerance - an algorithm
a. prove that fault does not exist, else
b. determine type of fault
c. if fault is permanent then
d. locate faulty element
e. reconfigure HW
else
f. prove the absence of fault influence on SW, else
g. locate faulty states of SW
h. reconfigure SW
i. continue
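The control flow of steps a-i can be sketched as a small decision routine. This is a sketch under assumptions: the `FaultReport` fields and the function `handle_fault` are hypothetical, and the branching follows a literal reading of the slide (permanent fault: d, e; otherwise f, and g, h only if software was influenced).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FaultReport:
    """Hypothetical summary of what the checking scheme observed."""
    fault_detected: bool = False     # outcome of step a
    permanent: bool = False          # outcome of steps b and c
    software_affected: bool = False  # outcome of step f


def handle_fault(report: FaultReport) -> List[str]:
    """Walk steps a-i and return the labels of the steps that were executed."""
    steps = ["a"]                    # a. try to prove that no fault exists
    if not report.fault_detected:
        steps.append("i")            # i. continue
        return steps
    steps.append("b")                # b. determine the type of fault
    steps.append("c")                # c. is the fault permanent?
    if report.permanent:
        steps += ["d", "e"]          # d. locate faulty element, e. reconfigure HW
    else:
        steps.append("f")            # f. try to prove SW was not influenced
        if report.software_affected:
            steps += ["g", "h"]      # g. locate faulty SW states, h. reconfigure SW
    steps.append("i")                # i. continue
    return steps


if __name__ == "__main__":
    print(handle_fault(FaultReport(fault_detected=True, permanent=True)))
```

Returning the list of executed steps keeps the sketch testable without modelling real hardware or software reconfiguration.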
8 ar - how to use it
1. any feature might be achieved as an algorithm (a-h steps)
2. redundancy types might be applicable at every step
3. ALL possible systems are described by this table
4. X might be cost, reliability (Lambda), power consumption, whatever...
NB we will be back to this picture...
9 ar is it computer systems only?
10 ar - how to use it for active aircraft safety...
11 ar concrete example - processor design
reliability of system with checking of state
reliability of malfunction-free system
reliability of naked system
$P = e^{-(1+d+r)\,\lambda_{pf1}\,t}$
d - redundancy for detection, r - redundancy for recovery, $\lambda_{pf1}$ - permanent fault rate
12 ar - reliability by fault tolerance
$P_1$ (checking and recovery) $= e^{-(1+d+r)(1+ak)\,\lambda_{pf1}\,t}$
$\mathrm{MTTF} = \dfrac{1}{(1+d+r)(1+ak)\,\lambda_{pf1}}$
Denote the success function of malfunction reduction as SF. For a duplicated system (that covers all possible faults by comparison) $SF \to 1$, i.e. 100% redundancy guarantees full success of system operation (including fault detection and complete recovery). When no redundancy is used ($x = 0$), the probability of recovery from a malfunction is zero, $SF \to 0$. (Compare: if you don't pay for people's healthcare, they die from any disease.) The function $SF = x\,e^{(1-x)}$ satisfies both boundary conditions. Denote coefficient a as malfunction reduction; then $a = 1 - c\,(x\,e^{(1-x)})$
c - coverage of faults - strength of diagnostics
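The formulas on this slide can be evaluated numerically with a short sketch. It assumes the reading P = exp(-(1+d+r)(1+ak) * lambda * t) and MTTF = 1/((1+d+r)(1+ak) * lambda), and it assumes that k is the ratio of malfunctions to permanent faults; the slide does not define k, and all function names are hypothetical.

```python
import math


def success_function(x: float) -> float:
    """SF = x * e^(1 - x): 0 with no redundancy (x = 0), 1 at x = 1 (100%)."""
    return x * math.exp(1.0 - x)


def malfunction_reduction(c: float, x: float) -> float:
    """a = 1 - c * SF(x), where c is the coverage of faults."""
    return 1.0 - c * success_function(x)


def failure_rate(d: float, r: float, a: float, k: float, lam_pf: float) -> float:
    """Effective rate (1 + d + r) * (1 + a*k) * lambda (assumed reading)."""
    return (1.0 + d + r) * (1.0 + a * k) * lam_pf


def reliability(d: float, r: float, a: float, k: float, lam_pf: float, t: float) -> float:
    """P = exp(-rate * t) for the system with checking and recovery."""
    return math.exp(-failure_rate(d, r, a, k, lam_pf) * t)


def mttf(d: float, r: float, a: float, k: float, lam_pf: float) -> float:
    """Mean time to failure of an exponential law: 1 / rate."""
    return 1.0 / failure_rate(d, r, a, k, lam_pf)


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the slides.
    a = malfunction_reduction(c=0.9, x=0.2)
    print(a, mttf(d=0.1, r=0.1, a=a, k=10.0, lam_pf=1e-5))
```

Varying x and c in such a sketch shows how partial redundancy trades hardware overhead (the factor 1+d+r) against the reduced malfunction term (1+ak).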
13 ar design efficiency - classics were wrong?
- to detect diseases and separate hiccups from fatal ones, it is possible to spend less than 100% redundancy in comparison with the naked system
- efficiency of the proposed AR is almost linear in the ratio of malfunctions vs. permanent faults
- it is achievable with less than 100% redundancy -> no need for duplication or triplication
Design of safety critical systems should be done using malfunction tolerant elements... can we do it? sure!
14 ar design of reliable component - an example
malfunction tolerant processor
- redundancy of processor - around 12.6%
- complexity - < 10,000 gates
- performance - 1.67 of ARM or Intel (at the same frequency...)
15 active redundancy - questions?
QUESTIONS?