Soft Errors: Tools and Interactions with Power Optimizations

1
Soft Errors: Tools and Interactions with Power
Optimizations
  • Vijaykrishnan Narayanan
  • Embedded and Mobile Computing Design Center
  • The Pennsylvania State University
  • Acknowledgment
  • Vijay Degalahal, Rajaraman Ramanarayanan, Profs.
    Kenan Unlu and Yuan Xie
  • This work was supported in part by grants from
    the National Science Foundation and the Department
    of Energy. All opinions expressed are those of the
    author.

2
Soft Error in Action
[Figure: transistor cross-section with terminals G, D, S and B, an n region, and a p substrate]
3
Problems caused by SEU
  • Single event upsets can cause problems in
    different ways:
  • Change the data value in the caches and memory
    (example value: 11100, a 28 minute talk)
  • Corrupt the execution of an instruction due to the
    flip of data in the pipeline registers
  • Change the configuration of an SRAM-based FPGA
    circuit (firm error)
  • Cause glitches in combinational logic that can
    propagate to state elements
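
The first case above, a changed data value, can be illustrated with the slide's own number: 11100 in binary is 28. The sketch below is only an illustration; the helper name `flip_bit` and the choice of which bit is upset are assumptions, not taken from the talk.

```python
def flip_bit(value: int, bit: int) -> int:
    """Return value with one bit inverted (bit 0 is the least significant)."""
    return value ^ (1 << bit)


stored = 0b11100              # 28, the value shown on the slide
upset = flip_bit(stored, 4)   # hypothetical upset of the top bit
print(f"{stored:05b} ({stored}) -> {upset:05b} ({upset})")
```

A single inverted bit is enough to change the stored value substantially, which is why the same upset matters differently depending on where it lands.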

4
Logic SER: The New Menace
[Figure: logic circuit between two REGISTERS blocks, with inputs I1 to I7, nodes A to E, outputs O1 and O2, and a particle strike on an internal node. Annotations: effect of electrical masking; effect of logical masking.]
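
Logical masking can be shown with a single gate: a glitch on one input only reaches the output if the other input does not already decide the result. The following is a minimal sketch with hypothetical helper names (`and_gate`, `glitch_propagates`); it does not model the circuit in the figure.

```python
def and_gate(a: int, b: int) -> int:
    """Two-input AND on single-bit values."""
    return a & b


def glitch_propagates(gate, struck_value: int, side_input: int) -> bool:
    """True if inverting the struck input changes the gate output."""
    return gate(struck_value, side_input) != gate(struck_value ^ 1, side_input)


# Side input 0 forces the AND output to 0, so the glitch is masked.
print(glitch_propagates(and_gate, 1, 0))
# Side input 1 makes the output follow the struck input, so the glitch passes.
print(glitch_propagates(and_gate, 1, 1))
```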
5
Latch Window Masking
Faster clocks (shallow pipelines) and reduced
capacitance and voltage make logic errors a
critical problem
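
One way to see why faster clocks weaken latch window masking is a simple timing model. The sketch below assumes a glitch arrives at a uniformly random time within the clock period and is captured only if it covers the whole latching window. This model, the function name and the numbers are assumptions for illustration, not the model used in the talk.

```python
def latch_probability(pulse_width_ps: float,
                      window_ps: float,
                      clock_period_ps: float) -> float:
    """Probability that a glitch is captured by a latch.

    Assumed model: the glitch must span the full latching window, and its
    arrival time is uniform over one clock period.
    """
    if pulse_width_ps <= window_ps:
        return 0.0  # too narrow to cover the window at any arrival time
    return min(1.0, (pulse_width_ps - window_ps) / clock_period_ps)


# Hypothetical 100 ps glitch and 20 ps latching window.
for period_ps in (1000.0, 500.0, 250.0):
    print(period_ps, latch_probability(100.0, 20.0, period_ps))
```

Under this model, halving the clock period doubles the chance that the same glitch is latched.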
6
Why Care about Soft Errors?
  • SUN FIRE 15K Crash Mysteriously
  • It's ridiculous. I've got a $300,000 server that
    doesn't work. The thing should be bullet-proof.
    --- Forbes magazine, 2000
  • All future designs that require highest
    availability must counter unavoidable SEUs.
  • Cisco 12000 line cards may reset after single
    event upset (SEU) failures. This field notice
    highlights some of those failures, why they
    occur, and what workarounds are available.
  • Highest failure rate of all other reliability
    mechanisms combined. - TI, Baumann, IRPS 2002

7
Error Impact on System Operation
  • Soft errors: not a new problem!
  • J. Wallmark, S. Marcus, Minimum size and
    maximum packaging density of non-redundant
    semiconductor devices, In Proc. IRE, 50, 1962.
  • Existing solutions employed for space/military
    applications consume more power, reduce
    manufacturability and severely influence
    performance

Challenge: How to provision for error handling
within given performance, power and cost
constraints?
8
Accelerated Soft Error Testing
9
Accelerated Testing Results
Toshiba chip - 4Mbits
Cypress Chip (2KX8 SRAM)
  • Change in reactor power varies acceleration rate
  • 10^7 neutrons/cm^2-sec at 1 MW
  • 10^6 neutrons/cm^2-sec at 100 kW
  • 360 particles/m^2-sec natural radiation at
    ground level
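
The acceleration implied by these figures can be estimated by putting both fluxes in the same units. The sketch below reads the reactor fluxes as 10^7 and 10^6 neutrons/cm^2-sec and assumes they are directly comparable with the 360 particles/m^2-sec ground-level figure (same particle type and energy range), which the slide does not state. The function name is hypothetical.

```python
CM2_PER_M2 = 1e4  # 1 m^2 = 10,000 cm^2


def acceleration_factor(reactor_flux_per_cm2_s: float,
                        natural_flux_per_m2_s: float = 360.0) -> float:
    """Ratio of reactor flux to natural flux, both per cm^2 per second."""
    natural_flux_per_cm2_s = natural_flux_per_m2_s / CM2_PER_M2
    return reactor_flux_per_cm2_s / natural_flux_per_cm2_s


print(f"1 MW:   {acceleration_factor(1e7):.3g}x")
print(f"100 kW: {acceleration_factor(1e6):.3g}x")
```

Under that assumption, one second in the beam at 1 MW corresponds to several years of ground-level exposure, and dropping to 100 kW reduces the acceleration tenfold.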

10
SEAT: Soft Error Analysis Toolset
Device Level
Circuit Level
Logic Level
Block Error Prob.
Arch. Level
Application Error Prob.
11
SEAT-DA: Modeling Charge Collection
  • n-Si interaction: MCNP codes, using PTRAC card.
    Yields reaction products, energies
  • Charge deposition: TRIM/SRIM codes.
    Yields electron-hole generation rate and ion
    stopping range
  • Charge collection: Synopsys Davinci device
    simulator. Yields charge collected, current and
    voltage transient
12
Charge Collection at Different Supply Voltages
13
SEAT - LA
14
SEAT - LA
1000x speedup over circuit simulation. Within 5%
error margin of SER estimates
15
Lowering Data Retention Voltage for Low Power
16
Increasing Threshold Voltage for Leakage Reduction
HIGHER Qcritical is better
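
Why a higher Qcritical is better can be seen from a commonly used empirical model (Hazucha and Svensson), in which the soft error rate falls off exponentially with Qcritical divided by a charge collection efficiency Qs. The sketch below uses that model purely as an illustration; it is not necessarily the model used by SEAT, and the Qs value is hypothetical.

```python
import math


def relative_ser(q_critical: float, q_s: float) -> float:
    """Relative soft error rate, exp(-Qcritical / Qs), with flux and area
    factors left out because they cancel when comparing two designs."""
    return math.exp(-q_critical / q_s)


Q_S = 1e-20  # hypothetical charge collection efficiency, in coulombs

low = relative_ser(1e-20, Q_S)
high = relative_ser(2e-20, Q_S)
print(f"SER ratio (higher Qcritical / lower Qcritical): {high / low:.3f}")
```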
17
Voltage Assignment for Low Leakage
[Figure: chart of Qcritical (C), axis from 0 to 1.6E-20, comparing a slow path (6 inverters) and a fast path (3 inverters); category labels include "Low Vth" and "Low Vth FF"]
18
Balancing Power, Performance, Reliability Tradeoffs
  • Not all functional units are active in all cycles
  • Exploit idleness
  • Switch off to save power
  • Execute replicated computation to increase
    reliability
  • EDF: Energy-Delay-Fallibility product
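
The trade-off above can be sketched by scoring each option for an idle functional unit with an energy-delay-fallibility product. The plain unweighted product, the function name and all numbers below are assumptions made up for illustration; the talk does not give the exact definition or any values.

```python
def edf_product(energy: float, delay: float, fallibility: float) -> float:
    """Unweighted energy x delay x fallibility (lower is better)."""
    return energy * delay * fallibility


# Hypothetical values normalized to the baseline: (energy, delay, fallibility).
configs = {
    "baseline": (1.00, 1.00, 1.00),
    "idle units switched off": (0.80, 1.00, 1.00),
    "idle units run replicas": (1.10, 1.00, 0.50),
}

scores = {name: edf_product(*values) for name, values in configs.items()}
best = min(scores, key=scores.get)
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
print("lowest EDF:", best)
```

With these made-up numbers, replication wins because its reliability gain outweighs its energy cost; different numbers can reverse the ranking, which is the point of a combined metric.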

19
Conclusion
  • Tools for quick and accurate estimation of soft
    error rates are necessary
  • Soft error optimizations interact with power,
    performance and area
  • Proper choice of control knobs is critical for
    multi-criteria optimizations
  • Combinational logic and on-chip networks will
    require soft-error tolerant provisioning from
    MPSoC designers
  • Soft errors under other stress conditions, such
    as thermal hotspots and supply noise
    fluctuations, require further understanding