Title: No. 1
1 Availability Analysis of Xilinx FPGA on Orbit
- Nozomu Nishinaga
- National Institute of Information and Communications Technology
- Masayoshi Yoneda
- NEC TOSHIBA Space Systems, Ltd.
2 Outline
- Motivation
- Heavy ion test results of Virtex II Pro
- Availability analysis
- Conclusion
3 Motivation
- Very high availability (low non-availability) is required for consumer communications equipment.
- A typical non-availability value for terrestrial network equipment is 10E-6.
- An SEU can be defined as an accidental failure if the failure can be fixed without any loss of the original device function.
- The rebooting process can then be defined as a repair.
- Does equipment with SRAM-type FPGAs meet the non-availability criterion?
4 Radiation test of Virtex II Pro
- Virtex II Pro (XC2VP7-5FG456 and XC2VP4)
- Tests carried out in November 2003 and February 2004 at TIARA in Takasaki, Japan
- Heavy ions (N, Ne, and Kr)
- Results compared with those of Virtex II (Gary Swift, Candice Yui, and Carl Carmichael, "Single-Event Upset Susceptibility Testing of the Xilinx Virtex II FPGA," MAPLD 2002, paper P29)
5 Devices Under Test
                             XC2VP4     XC2VP7     XC2VP100
Configuration Memory         3.01 Mbit  4.49 Mbit  34.29 Mbit
DCM (Digital Clock Manager)  4 unit     4 unit     12 unit
Block RAM                    28 unit    44 unit    7992 kbit
F/F                          6016 unit  9856 unit  9856 unit
Multiplier                   28 unit    44 unit    44 unit
Rocket I/O                   4 Block    8 Block    8 Block
6 Radiation test result (1)
[Figure: cross section (cm2/bit, log scale from 1E-11 to 1E-06) versus LET (MeV cm2/mg, 0 to 70)]
7 Radiation test result (2)
- Configuration Memory region
8 SEU frequency analysis (CREME96)
XC2VP4
              Solar MAX       Flare Peak (1 week)
Conf. Memory  0.33 times/day  163.4 times/day
DCM           0.00 times/day  0.11 times/day
Block RAM     0.04 times/day  21.87 times/day
Multiplier    0.00 times/day  0.46 times/day
XC2VP7
              Solar MAX       Flare Peak (1 week)
Conf. Memory  0.49 times/day  243.8 times/day
DCM           0.00 times/day  0.11 times/day
Block RAM     0.07 times/day  34.4 times/day
Multiplier    0.00 times/day  0.72 times/day
9 Mean Time Between Failures (MTBF) Analysis
All values are in seconds. "Flare Peak" is the 1-week flare peak.
              XC2VP4                 XC2VP7                 XC2VP100 (Simulated)
              Solar MAX  Flare Peak  Solar MAX  Flare Peak  Solar MAX  Flare Peak
Conf. Memory  2.64E05    5.29E02     1.77E05    3.55E02     2.32E04    4.64E01
DCM           4.14E08    8.09E05     4.14E08    8.09E05     1.38E08    2.70E05
Block RAM     2.02E06    3.95E03     1.28E06    2.51E03     1.27E05    2.49E02
Multipliers   7.89E07    1.89E05     5.02E07    1.21E05     4.98E06    1.19E04
SYSTEM        2.3267E05  4.6495E02   1.5501E05  3.0972E02   1.95E04    3.90E01
- If an SEU is considered a failure, the MTTR is roughly proportional to the device size.
- System MTBF -> harmonic mean of all functional blocks
- Assumption 1: All SEUs can be detected.
- Assumption 2: All gates are used.
- Assumption 3: Every SEU must be repaired as quickly as possible.
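The SYSTEM row can be checked with a short calculation. The sketch below assumes (this is not stated on the slide) that each block's MTBF is one day divided by its daily SEU count, and that the blocks fail independently in series, so their failure rates add. Under that reading, "harmonic mean" amounts to the reciprocal of the summed reciprocal MTBFs. The function names are illustrative only.

```python
SECONDS_PER_DAY = 86_400


def mtbf_from_rate(upsets_per_day):
    """MTBF in seconds for a block that upsets `upsets_per_day` times a day."""
    return SECONDS_PER_DAY / upsets_per_day


def system_mtbf(block_mtbfs):
    """Series combination (assumed): failure rates 1/MTBF add across blocks."""
    return 1.0 / sum(1.0 / m for m in block_mtbfs)


# XC2VP4 columns of the MTBF table: Conf. Memory, DCM, Block RAM, Multipliers.
xc2vp4_solar_max = [2.64e5, 4.14e8, 2.02e6, 7.89e7]
xc2vp4_flare_peak = [5.29e2, 8.09e5, 3.95e3, 1.89e5]

print(f"Solar MAX : {system_mtbf(xc2vp4_solar_max):.4e} s")
print(f"Flare Peak: {system_mtbf(xc2vp4_flare_peak):.4e} s")
```

Both results land within about 0.1% of the tabulated SYSTEM values (2.3267E05 s and 4.6495E02 s); the small difference is expected because the per-block inputs are rounded to three digits.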
10 Mean Time To Repair (MTTR)
                          XC2VP4     XC2VP7     XC2VP100
Configuration data (bit)  3,006,560  4,485,472  34,292,832
MTTR (s) (10 Mbyte/s)     0.037582   0.056068   0.42866
MTTR (s) (50 Mbyte/s)     0.007516   0.011214   0.085732
- Reboot = repair
- The effects of an SEU are volatile.
- Loading the correct configuration data returns the device to normal operation.
- Rebooting time -> repair time
- The maximum data rate for loading is fixed at 50 Mbyte/s for the XC2VP series.
- The larger the gate count or configuration size, the longer the MTTR.
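The MTTR rows follow from the configuration size and the loading rate. A minimal sketch, assuming that 1 Mbyte means 10^6 bytes and that the repair time is the pure transfer time with no detection or start-up overhead (both assumptions chosen because they match the table); `mttr_seconds` is an illustrative name.

```python
def mttr_seconds(config_bits, rate_mbyte_per_s):
    """Time to reload the full configuration, in seconds."""
    bits_per_second = rate_mbyte_per_s * 1_000_000 * 8
    return config_bits / bits_per_second


# Configuration sizes in bits, copied from the MTTR table.
CONFIG_BITS = {
    "XC2VP4": 3_006_560,
    "XC2VP7": 4_485_472,
    "XC2VP100": 34_292_832,
}

for device, bits in CONFIG_BITS.items():
    slow = mttr_seconds(bits, 10)  # 10 Mbyte/s
    fast = mttr_seconds(bits, 50)  # 50 Mbyte/s
    print(f"{device:9s} {slow:.6f} s  {fast:.6f} s")
```

Because the MTTR is linear in the configuration size, the XC2VP100 takes roughly 11 times longer to repair than the XC2VP4 at the same loading rate.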
11 Triple Modular Redundancy
- Case 1: Failure of one out of three systems is acceptable.
  - Loose regulation
  - Acceptable when the MTBF is quite large compared with the MTTR
- Case 2: No failure is acceptable.
  - Tight configuration
  - The output is always guaranteed.
12 Non-Availability Analysis
Case 1
            XC2VP4                 XC2VP7                 XC2VP100 (Simulated)
            Solar MAX  Flare Peak  Solar MAX  Flare Peak  Solar MAX  Flare Peak
10 Mbyte/s  7.83E-14   1.96E-08    3.93E-13   9.83E-08    1.45E-09   3.53E-04
50 Mbyte/s  3.13E-15   7.84E-10    1.57E-14   3.93E-09    5.79E-11   1.44E-05
Case 2
            XC2VP4                 XC2VP7                 XC2VP100 (Simulated)
            Solar MAX  Flare Peak  Solar MAX  Flare Peak  Solar MAX  Flare Peak
10 Mbyte/s  4.85E-07   2.42E-04    1.09E-06   5.43E-04    6.59E-05   3.23E-02
50 Mbyte/s  9.69E-08   4.85E-05    2.17E-07   1.09E-04    1.32E-05   6.57E-03
- MTBF is inversely proportional to the die area, while MTTR is proportional to it. -> A large FPGA is at a disadvantage.
- A large FPGA does not meet the 10e-6 criterion.
- How to mitigate? Divide the design across several small FPGAs.
- A much higher download rate will be needed (50 Mbyte/s is too slow).
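The Case 1 and Case 2 figures can be reproduced from the SYSTEM MTBF and the MTTR. The formulas below are an assumption, not taken from the slides: each of the three modules is unavailable a fraction u = MTTR / (MTBF + MTTR) of the time, modules fail independently, Case 1 counts the system as down when two or more modules are down, and Case 2 counts it as down when any module is down. The function names are illustrative.

```python
def unit_unavailability(mtbf_s, mttr_s):
    """Steady-state fraction of time a single module is being repaired."""
    return mttr_s / (mtbf_s + mttr_s)


def tmr_case1(u):
    """One failed module is tolerated: down when 2 or 3 modules are down."""
    return 3 * u**2 * (1 - u) + u**3


def tmr_case2(u):
    """No failed module is tolerated: down when any module is down."""
    return 1 - (1 - u) ** 3


# (SYSTEM MTBF in s, MTTR in s at 10 Mbyte/s), copied from the tables.
cases = {
    "XC2VP4, Solar MAX": (2.3267e5, 0.037582),
    "XC2VP4, Flare Peak": (4.6495e2, 0.037582),
    "XC2VP100, Flare Peak": (3.90e1, 0.42866),
}

for label, (mtbf, mttr) in cases.items():
    u = unit_unavailability(mtbf, mttr)
    print(f"{label:22s} case 1: {tmr_case1(u):.2e}  case 2: {tmr_case2(u):.2e}")
```

Under these assumptions the results agree with the 10 Mbyte/s rows to within about 1%. For small u, Case 1 behaves like 3u^2 and Case 2 like 3u, which is why Case 2 is so much harder to satisfy.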
13 Dividing
- The non-availability depends on the device size.
- A large FPGA is split into several (D) small FPGAs.
- Sc -> configuration data size (bits)
- R -> configuration rate (bps)
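The effect of the split on a single device can be sketched as follows. This is a hypothetical model, since the slide's own formula did not survive: it assumes each of the D small FPGAs holds Sc/D configuration bits, that its MTBF is D times that of the undivided device (upset rate proportional to size), and that each device is reloaded independently at rate R. The function name and the example values (XC2VP100 at Solar MAX, 10 Mbyte/s) are chosen for illustration.

```python
def divided_unit_unavailability(sc_bits, r_bps, mtbf_whole_s, d):
    """Unavailability of ONE of `d` equal pieces of a device (assumed model)."""
    mttr_piece = sc_bits / (d * r_bps)   # less data to reload
    mtbf_piece = mtbf_whole_s * d        # fewer bits to upset
    return mttr_piece / (mtbf_piece + mttr_piece)


SC_BITS = 34_292_832      # XC2VP100 configuration size (bits)
R_BPS = 10_000_000 * 8    # 10 Mbyte/s expressed in bits per second
MTBF_WHOLE = 1.95e4       # XC2VP100 SYSTEM MTBF at Solar MAX (s)

for d in (1, 2, 4, 8):
    u = divided_unit_unavailability(SC_BITS, R_BPS, MTBF_WHOLE, d)
    print(f"D = {d}: per-device unavailability = {u:.3e}")
```

In this model the per-device unavailability falls roughly as 1/D^2, because the MTTR shrinks and the MTBF grows by the same factor D. The overall system figure also depends on how the D devices are combined.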
14 Interstage voter
- The availability differs with and without the interstage voter.
- The performance with interstage voters is superior to that without the voters.
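One way to see why interstage voters help is to compare two idealized arrangements of a design split into D stages, each stage triplicated. The model below is an assumption for illustration, not the authors' analysis: voters are perfect, modules fail independently with per-module unavailability u, and a voted group is down when two or more of its three inputs are down. All names are hypothetical.

```python
def majority_down(p):
    """Probability that 2 or 3 of three independent inputs are down."""
    return 3 * p**2 * (1 - p) + p**3


def without_interstage_voters(u, d):
    """Three chains of d modules, voted once at the final output."""
    chain_down = 1 - (1 - u) ** d        # a chain is down if any module is
    return majority_down(chain_down)


def with_interstage_voters(u, d):
    """Each of the d stages is voted; the system needs every stage up."""
    stage_down = majority_down(u)
    return 1 - (1 - stage_down) ** d


u = 1e-4  # illustrative per-module unavailability
for d in (1, 2, 4, 8):
    a = without_interstage_voters(u, d)
    b = with_interstage_voters(u, d)
    print(f"D = {d}: without = {a:.3e}, with = {b:.3e}")
```

For small u the unvoted chains give roughly 3(D*u)^2 while the voted stages give roughly 3*D*u^2, so in this model the advantage of interstage voting grows about linearly with D.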
15 Non-Availability Analysis with dividing
- Area or gate loss due to the division is not taken into account in this figure. -> next issue
16 Conclusion
- Non-availability analysis for Virtex II Pro
- Large-scale FPGAs do not meet the non-availability criterion for communication equipment (10e-6).
- A much faster or wider configuration interface is needed to enhance availability.