Cost-Effective Register File Soft Error Reduction

Slides: 19
Provided by: Rei45

Transcript and Presenter's Notes

1
Cost-Effective Register File Soft Error Reduction
  • Pablo Montesinos, Wei Liu and Josep Torrellas
  • University of Illinois at Urbana-Champaign

2
Overview
  • Study of register file vulnerability to
    SDC (Silent Data Corruption)
  • Shield: cost-effective protection for register
    files
  • Highlighting the policies and techniques used in
    Shield
  • Experiments and results

3
Register File AVF
  • RF-AVF (register file Architectural Vulnerability
    Factor) is the probability that a fault in the
    register file leads to an error.
  • A register's lifetime is divided into PreWrite,
    Useful, and PostLastRead parts.
  • For the AVF calculation, the lifetime of a bit is
    divided into ACE (Architecturally Correct
    Execution) and un-ACE cycles.

4
Register File AVF
  • During the PreWrite period the register is
    un-ACE.
  • If it is read at least once after the write, the
    register switches to the ACE state.
  • After the last read, the register switches back
    to un-ACE for the PostLastRead period.
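
The lifetime split above can be sketched in a few lines of Python. This is only an illustration, not the authors' tooling: the function names, the event tuple `(alloc, write, reads, free)`, and the estimate "AVF = Useful register-cycles / all register-cycles" are assumptions made for the example.

```python
def lifetime_phases(alloc, write, reads, free):
    """Split one register version's lifetime (in cycles) into its three phases.

    alloc/write/free are cycle numbers; reads is a list of read cycles.
    All names here are hypothetical, chosen for this sketch.
    """
    if not reads:
        # Never read: there is no Useful (ACE) period at all.
        return {"PreWrite": write - alloc, "Useful": 0,
                "PostLastRead": free - write}
    last_read = max(reads)
    return {
        "PreWrite": write - alloc,         # un-ACE
        "Useful": last_read - write,       # ACE
        "PostLastRead": free - last_read,  # un-ACE
    }


def register_file_avf(versions, num_registers, total_cycles):
    """Assumed estimate: ACE register-cycles over all register-cycles."""
    ace = sum(lifetime_phases(*v)["Useful"] for v in versions)
    return ace / (num_registers * total_cycles)


# Made-up trace: two register versions in a 4-register file over 20 cycles.
versions = [(0, 2, [5, 9], 12), (3, 4, [], 10)]
print(lifetime_phases(*versions[0]))
print(register_file_avf(versions, num_registers=4, total_cycles=20))
```

Only the Useful phase counts toward the estimate, which is why a version that is written but never read contributes nothing.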

5
Highlighting Insights (1)
  • The combined Useful time of all registers is
    small

6
Highlighting Insights (1)
  • The average number of useful (live) registers is
    less than 20 (SPECint) and 17 (SPECfp).
  • It is thus possible to reduce the vulnerability
    of the register file by protecting only a subset
    of carefully chosen registers at a time.

7
Highlighting Insights (2)
  • Only a few long-lived registers contribute most
    of the total useful time.
  • On average, less than 10% of register versions
    are long-lived.

8
Highlighting Insights (2)
  • On average, 40% of the useful time comes from the
    few long-lived versions.
  • In SPECfp, the 5% of versions that are long-lived
    account for 46% of the useful time.
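
As a rough illustration of how such a share could be measured, the sketch below sorts per-version Useful times and sums the top fraction. The function name and the sample numbers are made up for the example and are not data from the study.

```python
def useful_time_share(useful_times, top_fraction):
    """Fraction of total Useful time contributed by the longest-lived versions.

    useful_times: Useful cycles per register version (hypothetical input).
    top_fraction: e.g. 0.10 for the longest-lived 10% of versions.
    """
    ordered = sorted(useful_times, reverse=True)
    k = max(1, round(len(ordered) * top_fraction))  # at least one version
    total = sum(ordered)
    return sum(ordered[:k]) / total if total else 0.0


# Made-up data: nine short-lived versions and one long-lived version.
sample = [1] * 9 + [6]
print(useful_time_share(sample, 0.10))
```

With this toy input, the single long-lived version holds 6 of the 15 Useful cycles.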

9
Motivation
  • Register files have a very high access rate.
  • This leads to high temperature and thus a lower
    critical charge (Qcrit) for the devices.
  • An error in the RF can propagate with a high
    failure probability.
  • If we isolate a few register versions, predicting
    their life-time, and protect these register
    versions alone, high reliability can be achieved
    with limited overhead.

10
Shield - Architecture
  • Life-Time Prediction
  • Register Error Check
  • Shielding Decision
  • Error Recovery

11
Reg-Version Lifetime Prediction
  • P12 -> Used(1), Renamed(1)
  • P7 -> Used(0), Renamed(1)

12
Shielding Decision
  • These prediction bits are stored as status in the
    ECC table.
  • The decision to shield an incoming (newly
    written) register version depends on:
  • Whether a free ECC-table entry is available.
  • Whether the same register is already present in
    the ECC table; if so, its entry is replaced with
    the new one.
  • Whether an existing reg-version has a shorter
    predicted lifetime than the incoming
    reg-version; if so, it is replaced.
  • Replacement policy
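
The decision rules on this slide can be sketched as a small table update. This is a minimal sketch under stated assumptions, not the Shield hardware: `Entry`, `shield_decision`, the integer `predicted_lifetime`, and the order in which the rules are checked (same register first, to avoid duplicate tags) are all choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    reg: int                 # physical register tag (hypothetical field)
    predicted_lifetime: int  # larger means predicted to live longer


def shield_decision(table, capacity, incoming):
    """Return the ECC table after trying to shield `incoming`."""
    # Rule: the same register is already shielded, so replace its entry.
    for i, entry in enumerate(table):
        if entry.reg == incoming.reg:
            return table[:i] + [incoming] + table[i + 1:]
    # Rule: a free entry is available.
    if len(table) < capacity:
        return table + [incoming]
    # Rule: evict the entry with the shortest predicted lifetime,
    # but only if the incoming version is predicted to live longer.
    victim = min(range(len(table)),
                 key=lambda i: table[i].predicted_lifetime)
    if table[victim].predicted_lifetime < incoming.predicted_lifetime:
        return table[:victim] + [incoming] + table[victim + 1:]
    return table  # incoming version is left unprotected


table = [Entry(reg=12, predicted_lifetime=3), Entry(reg=7, predicted_lifetime=1)]
table = shield_decision(table, capacity=2, incoming=Entry(reg=9, predicted_lifetime=2))
print([e.reg for e in table])
```

In the usage lines, the full table gives up the entry for register 7, whose predicted lifetime is the shortest, in favour of the incoming version.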

13
Register Error Check and Recovery
  • On a read request, the register data is sent to
    both the original datapath and Shield.
  • If the register matches a tag entry, the
    reg-data is checked for errors at the
    ECC checker.
  • If an error is detected:
  • The processor stalls the instruction I that is
    reading reg P.
  • The reg-data is corrected and written back into
    the RF.
  • The oldest instruction in the ROB that reads
    reg P, and all succeeding instructions, are
    flushed.
  • The processor resumes from the flushed
    instruction.
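
The check-and-recover sequence above can be sketched as follows. Everything here is an assumption made for illustration: the slides do not specify the ECC code, so the example uses a simple Hamming-style syndrome (XOR of the 1-based positions of set bits) that corrects a single flipped data bit and assumes the stored check bits themselves are intact. The ROB is modelled as a list of source-register sets, oldest first.

```python
def check_bits(value, width=8):
    """XOR of the 1-based positions of all set bits in `value`."""
    syndrome = 0
    for pos in range(width):
        if (value >> pos) & 1:
            syndrome ^= pos + 1
    return syndrome


def read_with_shield(reg, rf, ecc_table, rob):
    """Read `reg`; return (value, ROB entries that survive)."""
    value = rf[reg]
    if reg not in ecc_table:          # no tag match: register is unprotected
        return value, rob
    syndrome = check_bits(value) ^ ecc_table[reg]
    if syndrome == 0:                 # no error detected
        return value, rob
    # Error: correct the flipped bit and write the value back into the RF.
    rf[reg] = value ^ (1 << (syndrome - 1))
    # Flush the oldest instruction reading `reg` and everything younger.
    oldest = next((i for i, srcs in enumerate(rob) if reg in srcs), len(rob))
    return rf[reg], rob[:oldest]


rf = {5: 0b10110010}
ecc_table = {5: check_bits(rf[5])}    # check bits stored when reg 5 was written
rf[5] ^= 1 << 3                       # inject a single-bit soft error
rob = [{1, 2}, {5}, {3}, {5, 6}]      # source registers per instruction
value, surviving = read_with_shield(5, rf, ecc_table, rob)
print(bin(value), surviving)
```

After the read, the register holds its original value again and only the instruction older than the first reader of register 5 remains in the ROB; the flushed instructions would be re-fetched when the processor resumes.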

14
Experiments - Results
  • AVF computation for RF with shield

15
Experiments - Results
  • AVF reduction of the integer register file
    (intREG) under different replacement policies:
  • LRU: 31%
  • Effective: 63%
  • OptEffective: 84% (Effective plus pinning of
    global pointers to particular ECC entries)
  • AVF of the fp register file (fpREG) can be
    reduced by up to 100%, because fewer fp registers
    are in the Useful state.

16
Power and Area Impact
  • Shield uses only 3 ECC generators and 3 ECC
    checkers.
  • Shield has a 45% power overhead over a plain
    register file (full ECC has 2X).
  • Shield introduces an overall 10% area overhead.

17
Conclusion
  • A cost-effective architectural technique has been
    proposed that reduces the vulnerability of the RF
    by 84%.
  • The area and power overheads are a modest price
    for the reliability achieved.
