Diagnosing Faults in CLB Array - PowerPoint PPT Presentation

Slides: 31
Provided by: K2124

Transcript and Presenter's Notes

1
Diagnosing Faults in CLB Array
  • The diagnosis targeted here consists of locating
    the faulty CLB. Some believe that further effort
    to pinpoint the exact faulty element inside the
    CLB (MUX, LUT, connection, etc.) is not required
    at this stage of the research.
  • It does not matter which part of the CLB is
    faulty if the entire faulty CLB will not be used.
    Until now, almost all proposed fault tolerance
    methods discard the entire faulty CLB.

2
Diagnosing Faulty CLB Using the Programmability
  • Almost all strategies proposed for detecting
    faults in CLB resources were later extended to
    diagnose faults.
  • BIST approach extended for fault diagnosis
  • The main concept behind this extension is the
    use of the regularity of the FPGA chip.
  • As shown in this diagram, the FPGA is diagnosed
    in four sessions (NS, SN, WE, EW).
  • In each session, one CLB row is programmed as the
    test pattern generator (denoted TPG in the
    figure).

3
Diagnosing Faulty CLB Using the Programmability
  • Some CLB rows are under test (denoted Bus in the
    figure), while other CLB rows are programmed as
    output response analyzers (denoted ORA in the
    figure). Note that after the two sessions NS and
    SN, all CLB rows have been covered by the test.
    Therefore, after these two sessions we can
    determine which row is faulty, but not the exact
    position of the faulty CLB. Some scientists
    suggest rotating the chip by 90 degrees and
    applying the same strategy to the columns
    (sessions EW and WE). The faulty CLB column is
    then identified once these two sessions are
    complete.
  • After all four sessions, we know both the faulty
    CLB row and the faulty CLB column, so the
    position of the faulty CLB can be deduced.
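
The row and column deduction above can be sketched in a few lines of Python. This is only an illustrative model, not the BIST hardware itself: the function names are hypothetical, and each pair of sessions is assumed to simply report which rows (or columns) contain a failing CLB.

```python
from typing import List, Set, Tuple

Cell = Tuple[int, int]  # (row, column) of a CLB


def run_row_sessions(rows: int, faulty: Set[Cell]) -> Set[int]:
    """Model of sessions NS and SN: rows whose ORA reports a mismatch."""
    return {r for r in range(rows) if any(fr == r for fr, _ in faulty)}


def run_column_sessions(cols: int, faulty: Set[Cell]) -> Set[int]:
    """Model of sessions WE and EW: columns whose ORA reports a mismatch."""
    return {c for c in range(cols) if any(fc == c for _, fc in faulty)}


def locate_faulty_clb(rows: int, cols: int, faulty: Set[Cell]) -> List[Cell]:
    """Candidate faulty CLBs: every failing row crossed with every failing column."""
    bad_rows = run_row_sessions(rows, faulty)
    bad_cols = run_column_sessions(cols, faulty)
    return [(r, c) for r in sorted(bad_rows) for c in sorted(bad_cols)]


single = locate_faulty_clb(8, 8, {(2, 5)})
double = locate_faulty_clb(8, 8, {(1, 2), (3, 4)})
```

With a single faulty CLB the intersection is unique. With two faults in different rows and columns, the same procedure yields four candidates, so this sketch only pinpoints the position under a single-fault assumption.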

4
  • Universal Testing Approach
  • This approach is achieved with low complexity.
    Disadvantage: the test time is long in some
    cases.
  • Array-Based Approach
  • Can locate the fault by applying the test
    strategy twice: once with the chip in its normal
    position and a second time with the chip rotated
    by 90 degrees (the device is symmetric).
  • Disadvantage: the diagnosis time is twice the
    testing time.

5
  • I Approach
  • Can directly detect and diagnose faults since
    each CLB under testing is observed directly from
    off the chip.
  • No additional time is required
  • The same method proposed for detecting faults
    diagnoses faults as well.
  • However, this method is inherently slow compared
    to the other methods.

6
Diagnosing the Faulty CLB Using Design for
Testability
  • Primary concern: how to improve the actual FPGA
    design in order to make fault diagnosis easy.
  • Not much headway has been made in this research
    area because improving the FPGA design requires
    knowledge of the actual structure of the FPGA.
  • Two main proposals in this area:
  • A modified scan procedure that sequentially tests
    every module in the FPGA can be used for
    diagnosing faults if the rows and columns of CLBs
    are used.
  • Diagnosing faults in CLBs by shifting the
    configuration data
  • The idea is to develop an algorithm that shifts
    the configuration data with the aim of diagnosis.
    The algorithm shifts the data row by row and
    column by column. Row-by-row shifting identifies
    the faulty row, while column-by-column shifting
    identifies the faulty column. Together they
    locate the faulty CLB.

7
Diagnosing Faults in Interconnect Resources
  • Fault diagnosis in interconnects may require a
    long time since interconnect resources are very
    complex. Diagnosing faults is always more
    difficult than detecting faults.
  • Fault Diagnosis in Interconnect Resources Using
    the Programmability
  • Two ways to diagnose faults:
  • BIST
  • non-BIST
  • Both of these methods were proposed for
    detecting faults in interconnect resources and
    are extended here for diagnosing faults.
  • Main difference between detecting faults and
    diagnosing faults: the number of configurations.
    Diagnosing faults requires a large number of
    configurations.

8
Diagnosing Faults in Interconnect Resources
  • Many scientists have proposed different methods
    to minimize the number of configurations
    required, at the expense of fault coverage.
  • We must maintain a balance between the number of
    configurations and the fault coverage (or the
    FPGA model generalization).

9
Fault Diagnosis in Interconnect Resources Using
the Design for Testability
  • Some of the research proposed in this area
    requires a regular distribution of the
    interconnect resources. However, the actual
    designs of FPGA chips on the market are not
    symmetric:
  • Altera FPGA: interconnect resources are
    concentrated in the center since more
    functionality is in the middle of the chip
  • Xilinx FPGA: more interconnect resources are at
    the border since more functionality exists at the
    border
  • This area holds potential future research topics.
    As the concept of embedded design emerges,
    designers will introduce highly complex FPGAs,
    necessitating the integration of testing on the
    chip.

10
DEFECT AND FAULT TOLERANT FPGA
  • Defect Tolerant FPGA
  • Defect tolerance means tolerating defects that
    occur during the fabrication of the chip
  • Presents problems on the manufacturer side
  • Fault Tolerant FPGA
  • Fault tolerance means tolerating faults that
    occur during the use of the chip
  • Presents problems on the user side
  • Defect-Tolerant FPGA
  • Manufacturers are still searching for more
    reliable chips with low cost (area, hardware,
    complexity, delay, etc.) and high yield
    improvement.
  • General goal: after detecting and locating a
    defect in the chip, instead of throwing out the
    entire chip, only the defective CLB or wire
    should be isolated and avoided, without
    compromising the original performance
  • Fault detection and diagnosis
  • CLB defects
  • Interconnect defects

11
Tolerating Defective CLB
  • Since the FPGA is constructed of 2-D arrays of
    identical CLBs, a defective CLB can be avoided by
    remapping the user's application data around it
    using spare or other unused resources. This
    solution may fit well for FPGAs with flexible
    interconnect resources.
  • Disadvantage: likely to generate significant
    delay after remapping the user application data,
    especially for FPGAs with limited interconnect
    resources
  • One variation of this method benefits from the
    regular array of the FPGA by using a spare column
    of CLBs, or one spare column and one spare row of
    CLBs.
  • Though this method improves delay, a defect
    within one CLB results in a whole row or column
    of CLBs being unused. This obviously limits the
    yield enhancement. To remedy this shortcoming,
    scientists proposed fast reconfiguration as a key
    mechanism for obtaining a significant increase in
    yield.
  • Utilization of laser technique: this is another
    approach to defect tolerance. The idea is based
    on adding one grid of defect avoidance buses to
    the original FPGA interconnect resources. When a
    defect is detected and located, the defective
    cell is avoided by using the additional buses.
    Disadvantage: additional hardware overhead and
    delay caused by the additional switches.

12
  • Node-Covering Technique: achieves defect
    tolerance by allowing each node (CLB) to cover
    its neighbor in the row. The defective CLB is
    avoided by reconfiguring around it using
    laser-burned fuses. The defect is transparent to
    the user: the user configuration data loaded into
    the FPGA remains the same whether or not the chip
    is defect-free. The SRAM corresponding to the
    CLBs and that of the interconnect resources are
    assumed to be separate. The figure shows an
    example of the SRAM structure corresponding to
    the CLBs in two rows of a simplified FPGA model.
    An additional multiplexer is added to the SRAM
    corresponding to each CLB. When a fault occurs,
    the data is shifted by one CLB to the right and
    the multiplexer corresponding to the defective
    CLB is activated so that this CLB is avoided.
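
The shift-by-one behaviour of node covering can be illustrated with a small Python sketch. It assumes, beyond what the slide states, that each row has exactly one spare CLB at its right end; the function name `load_row` and the configuration strings are hypothetical.

```python
from typing import List, Optional


def load_row(user_config: List[str],
             defective: Optional[int]) -> List[Optional[str]]:
    """Place one row of user data onto n + 1 physical CLBs.

    The last physical CLB is the assumed spare. A None entry marks a
    CLB that is bypassed (the defective one) or left unused (the spare).
    """
    physical: List[Optional[str]] = [None] * (len(user_config) + 1)
    for logical, data in enumerate(user_config):
        # Data at or to the right of the defect moves one CLB to the right.
        shift = 1 if defective is not None and logical >= defective else 0
        physical[logical + shift] = data
    return physical


healthy_row = load_row(["f0", "f1", "f2", "f3"], defective=None)
repaired_row = load_row(["f0", "f1", "f2", "f3"], defective=1)
```

In both calls the user supplies the same four configuration entries, which mirrors the point that the defect stays transparent to the user: only the physical placement changes.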

13
(No Transcript)
14
  • The data corresponding to the interconnect
    resources avoids the defect using the same
    principle.
  • Every segment covers its neighbor, and the last
    segment in the channel is covered by reserving
    one supplementary channel segment.
  • Figure (a) shows an example of the implementation
    of defect-tolerant routing of nets using a cover
    segment, and Figure (b) shows the reconfiguration
    around the defect.
  • Advantage: high yield improvement with moderate
    cost
  • Disadvantages:
  • Susceptible to failures in the case of some
    FPGAs that contain various types of links in
    their interconnect network.
  • Hardware overhead is high

15
(No Transcript)
16
Shifting Approach
  • This approach achieves defect tolerance by
    shifting the configuration data on the chip.
  • The following two figures show an example of the
    design and its ability to shift data on a chip.
    In this example the user data is shifted by one
    row or column (top, down and right) of CLBs.
  • Two distributions of spare CLBs are proposed
    (named after chess pieces):
  • King shifting
  • Horse allocation
  • When a defect occurs, the data is shifted in the
    corresponding direction so that the defect is
    avoided. The king distribution requires eight
    shifting directions and the horse distribution
    requires four directions.
  • Main problem: this method may fail if a defect
    occurs in one memory cell. However, it can be
    improved by tolerating defects and faults of the
    SRAM part separately.
17
Shifting Approach
18
(No Transcript)
19
Shifting the data in 8 directions with King
Shifting distribution
20
Tolerating Interconnect Defects
  • When a defect occurs in the interconnect, it can
    easily be avoided using computer-aided design
    (CAD) tools, with less delay than when a defect
    occurs in a CLB.
  • Some scientists suggest the node-covering method
    for interconnect resources, yet it is not a very
    attractive solution for them.

21
Fault Tolerant FPGA
  • Today, FPGA devices are not fault tolerant since:
  • Manufacturers do not have any particular cost
    benefits
  • Fault tolerance can be addressed partially at the
    board or system level
  • Solutions are based on:
  • Chip level
  • Node covering method, CAD tools method, laser
    techniques
  • Unfortunately, these methods present several
    problems
  • When a fault occurs, the user must contact the
    manufacturer, so the customer is perpetually
    dependent on the manufacturer.
  • Board or system level
  • Recommended for fault tolerance since the
    chip-level methods are complex and expensive

22
Board or System level Fault Tolerance
  • Some scientists proposed solving the problem at
    the board level using a low-overhead approach.
    This approach achieves fault tolerance by
    partitioning the FPGA into several tiles; within
    each tile, some CLBs are used as spares.
    Consequently, when a fault is detected in one
    tile, only that tile is reconfigured, using
    partial reconfiguration, so that the fault is
    avoided.
  • For instance, consider a Boolean function Y
    (AB) (CD) implemented in a tile containing 4
    CLBs, with a configuration that leaves one spare
    CLB. Upon detecting a fault, an alternate tile
    configuration is activated. This concept is shown
    in the following figure.
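
The tile example can be modelled with a short Python sketch. The operators of the function are lost in the transcript, so the code assumes Y = (A AND B) OR (C AND D); the stuck-at-0 fault model, the CLB numbering and the function names are likewise assumptions made for illustration.

```python
from typing import Dict, Optional

# Assumed decomposition of Y into three CLB-sized functions.
FUNCTIONS = ("and_ab", "and_cd", "or_y")
TILE_SIZE = 4  # four CLBs per tile, so one is left as a spare


def tile_configuration(faulty: Optional[int]) -> Dict[str, int]:
    """Assign each function to a CLB index, skipping the faulty CLB."""
    healthy = [clb for clb in range(TILE_SIZE) if clb != faulty]
    return dict(zip(FUNCTIONS, healthy))


def evaluate_tile(config: Dict[str, int], faulty: Optional[int],
                  a: bool, b: bool, c: bool, d: bool) -> bool:
    """Evaluate Y, forcing the output of a faulty CLB to 0 (assumed model)."""
    def clb_output(name: str, value: bool) -> bool:
        return False if config[name] == faulty else value

    ab = clb_output("and_ab", a and b)
    cd = clb_output("and_cd", c and d)
    return clb_output("or_y", ab or cd)


default = tile_configuration(None)    # CLB 3 is the spare
alternate = tile_configuration(0)     # used once CLB 0 is found faulty
wrong = evaluate_tile(default, 0, True, True, False, False)
fixed = evaluate_tile(alternate, 0, True, True, False, False)
```

With CLB 0 faulty, the default configuration produces the wrong output for A = B = 1, while the alternate configuration routes around CLB 0 and restores the expected value. Only this tile's mapping changes, which is the point of the partial reconfiguration described above.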

23
(No Transcript)
24
Fault Detection, Diagnosis, and Defect/Fault
Tolerance in New FPGA Generations
  • Two types of FPGA:
  • Static FPGA
  • Dynamically reconfigurable FPGA (DRFPGA):
    emerging market
  • The structure that we studied earlier is a static
    FPGA
  • Today's market is geared towards DRFPGA. Can we
    therefore apply the fault detection, diagnosis,
    and defect and fault tolerance approaches that we
    studied earlier for static FPGAs to DRFPGAs?
  • To answer this question we must first study the
    DRFPGA structure.

25
Structure of the new FPGA Generations
  • Two types of DRFPGA:
  • Partially reconfigurable FPGA
  • This type permits reconfiguration of some logic
    blocks and wire segments while other
    programmable hardware is busy in functional
    mode.
  • Switch context FPGA
  • This concept as a whole is still in the research
    stage. This type of FPGA can change from one
    context to another in only one clock period. This
    means that users can store several designs, each
    in its own configuration memory, and then switch
    from one design to another in one clock period.

26
Structure of dynamically re-configurable FPGA
27
Detecting and Diagnosing Faults in the New FPGA
  • The new FPGA generations do not introduce any new
    components that would require a new fault model.
    Thus the fault model is the same as the one
    adopted for static FPGAs.
  • As explained earlier, some methods are based on
    the programmability of the FPGA and others on
    modifying the FPGA structure with the goal of
    testing or diagnosing.
  • Though we can use the methods based on FPGA
    programmability for DRFPGAs, we cannot use them
    directly: most of them would require changes. The
    following table reflects the difficulties of
    these changes.

28
Tolerating Faults and Tolerating Defects in the
New FPGA
  • Approaches designed for static FPGAs are
    generally difficult to adopt for the new FPGAs,
    since these approaches rely on exact knowledge of
    the FPGA structure details.
  • Scientists believe that each approach requires
    some changes. The following table reflects the
    difficulties of these approaches.

29
Conclusion and Future Research Directions
  • The SRAM-based FPGA presents several advantages:
  • Re-configurability, which provides the
    flexibility to implement several designs in a
    single FPGA without changing the hardware
  • Approaches for testing and fault tolerance were
    introduced based on this re-configurability
    feature
  • Fault detection: most methods are based on the
    re-configurability feature
  • Diagnosis: most methods are improved versions of
    the fault detection methods
  • Defect and fault tolerance: holds less interest
    than testing. More studies are available for
    defect tolerance than for fault tolerance
  • Other FPGA structures: in addition to the 2-D
    array of CLBs, other structures of SRAM-based
    FPGAs, such as hierarchical and dynamically
    reconfigurable ones, are available

30
  • Fault model: academic research must be performed
    in conjunction with the manufacturers, and
    researchers must collect more information about
    the structures of the FPGA.
  • Fault detection and diagnosis: until now there
    has been no research targeting the fault
    detection or the diagnosis of the entire FPGA
    chip area. Research studies until now have
    treated fault detection and diagnosis separately.
  • On-line testing: this is a very difficult issue
    within the framework of fault tolerance. It would
    be useful to know how to achieve on-line testing
    for a standard 2-D FPGA.