Test Technology Trends - PowerPoint PPT Presentation

About This Presentation
Title:

Test Technology Trends

Description:

Title: No Slide Title Author: kaat Last modified by: LT Wang Created Date: 4/13/1997 2:24:48 PM Document presentation format: On-screen Show Company – PowerPoint PPT presentation

Slides: 180
Provided by: kaat95

Transcript and Presenter's Notes

Title: Test Technology Trends


1
Chapter 12
Test Technology Trends in Nanometer Age
2
What is this Chapter About?
  • Introduce the test technology roadmap
  • Focus on a number of difficult challenges and
    test solutions
  • Delay testing, Physical failures and Soft errors,
    FPGA testing, MEMS testing, High-speed I/O
    testing, and RF testing
  • Concluding remarks

3
Section 12.1
Test Technology Roadmap
4
Moore's Law and Test Challenges
  • Moore's law: the number of transistors integrated
    per square inch will double approximately every
    18 months.
  • To keep pace with Moore's law: die size ↑, feature
    size ↓, gate delay ↓, interconnect delay ↑
  • To reduce interconnect delay, interconnects are
    made taller and taller, which causes crosstalk
    noise between adjacent lines due to capacitive
    and inductive coupling (the signal integrity
    problem). This is very difficult to test.

5
Moore's Law and Test Challenges
  • Power integrity: as clock frequency ↑ and supply
    voltage ↓, the power supply voltage can drop by
    L(di/dt). This is very difficult to test.
  • Process variation: precise control of the silicon
    process is becoming more and more difficult. For
    example, it is hard to control the effective channel
    length of a transistor. This makes power and
    delay exhibit large variability, which is hard to
    detect.
  • Low-power design faults: low-power design
    circuits might result in fault models that are
    difficult to test. For example, a drowsy cache
    design that reduces the power supply can cause
    drowsy faults.

6
Fabrication Capital versus Test Capital
7
International Technology Roadmap for
Semiconductors (ITRS)
  • ITRS identifies technological challenges and
    needs facing the semiconductor industry over the
    next 15 years
  • ITRS test and test equipment near-term challenges
    [SIA 2004]
  • High-speed device interfaces
  • Highly integrated designs
  • Reliability screens
  • Manufacturing test costs
  • Modeling and simulation

8
International Technology Roadmap for
Semiconductors (ITRS)
  • ITRS test and test equipment long-term challenges
    [SIA 2004]
  • DUT (device under test) and ATE (automatic test
    equipment) interfaces
  • Test methodologies
  • Defect analysis
  • Failure analysis
  • Disruptive device technologies

9
International Technology Roadmap for
Semiconductors (ITRS)
  • ITRS design test near-term challenges [SIA 2004]
  • Effective speed test with increasing core
    frequencies and widespread proliferation of
    multi-GHz serial I/O protocols
  • Capacity gap between DFT/test generation/fault
    grading tools/design complexity
  • Quality and yield impact due to test process
    diagnostic limitations
  • Signal integrity testability and new fault models
  • SOC and SIP test
  • ITRS design test long-term challenges [SIA 2004]
  • Integrated self-testing for heterogeneous SOCs
    and SIPs
  • Diagnosis, reliability screens and yield
    improvement
  • Fault tolerance and on-line testing

10
Section 12.2
Delay Testing
11
Why Delay Testing?
  • Three sources of yield loss
  • Random defects causing both logical and timing
    failures
  • Systematic failures causing both logical and
    timing failures
  • Parametric variations more likely causing
    timing failures

Yrandom
Ysystematic
Yparametric
12
Fault Models: Path-Delay and Gate-Delay Faults
  • Path-delay fault
  • Propagation delay of path exceeds clock interval
  • Number of paths grows exponentially with number of
    gates
  • Only consider long paths or a subset of paths
  • Tests can detect small distributed failures
  • Tests for longest paths also useful for
    speed-sorting
  • Gate-delay fault
  • A logic model for a defect that delays a rising
    or a falling transition
  • Small distributed timing failures could be missed
  • Number of modeled faults is much smaller and manageable
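The exponential growth in path count can be illustrated with a minimal sketch. The circuit below is hypothetical (it is not from the slides): each stage holds two gates, and each gate is driven by both gates of the previous stage, so every added stage doubles the number of input-to-output paths.

```python
from functools import lru_cache

def count_paths(fanin, outputs):
    """Count input-to-output paths in a combinational DAG.

    fanin maps each gate to the signals driving it; any signal that is
    not a key of fanin is treated as a primary input.
    """
    @lru_cache(maxsize=None)
    def paths_to(node):
        drivers = fanin.get(node)
        if not drivers:
            return 1  # a primary input starts exactly one path
        return sum(paths_to(d) for d in drivers)

    return sum(paths_to(o) for o in outputs)

def reconvergent_chain(stages):
    """Hypothetical circuit: two gates per stage, fully cross-connected."""
    fanin = {}
    prev = ["in_a", "in_b"]
    for s in range(stages):
        cur = [f"g{s}_0", f"g{s}_1"]
        for g in cur:
            fanin[g] = list(prev)
        prev = cur
    return fanin, prev

for stages in (1, 5, 10, 20):
    fanin, outs = reconvergent_chain(stages)
    print(2 * stages, "gates:", count_paths(fanin, outs), "paths")
```

With 40 gates this toy circuit already has about two million paths, while the number of gate-delay (or transition) faults stays proportional to the gate count.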

13
Transition Faults and Small Gate-Delay Faults
  • Transition fault (Gross gate-delay fault)
  • The extra delay caused by the fault is assumed to
    be large enough to prevent the transition from
    reaching any PO at the time of observation
  • Can be tested along any path from fault site to
    any PO
  • The test is a vector pair that creates a
    transition at the fault site and the second
    vector is a test for the stuck-at fault at the
    fault site
  • Small gate-delay fault
  • Is tested along the longest propagation delay path

14
Path Delay Faults - Type of Tests
  • Single-path-sensitization test
  • Guarantee that DUT will fail if and only if the
    path under test has excessive delay
  • Fully characterize the timing of the path and
    is ideal for delay fault diagnosis
  • All side inputs of gates along the given path
    must be stable

[Figure: single-path-sensitization test conditions (for AND gate). Labels: V1, V2, target path; side inputs must be stable 1.]
15
Path Delay Faults - Type of Tests (Contd)
  • Non-robust test
  • Test may be invalidated in presence of other
    path delay faults

[Figure: non-robust test conditions (for AND gate). Labels: V1, V2, target path; side-input label: could be either 1 or 0.]
16
Path Delay Faults - Type of Tests (Contd)
  • Robust test
  • Guarantees DUT will fail if the path under test
    has excessive delay

[Figure: robust test conditions (for AND gate). Labels: V1, V2, target path; side-input labels: could be either 1 or 0, must be stable 1.]
17
Application of Delay Tests
  • Require application of a vector pair to the
    combinational logic portion and the circuit being
    clocked at speed

[Figure: input latches, combinational circuit, and output latches, driven by an input clock and an output clock. Timing labels: rated clock period, V1 applied, V2 applied, output latched.]
  • An arbitrary vector pair may not be applied to a
    sequential circuit under full-scan, partial-scan
    or non-scan methodology

18
At-Speed Test
  • At-speed test means application of test vectors
    at the rated-clock speed.
  • Methods of at-speed test
  • External, functional test
  • Functional vectors applied by high-speed testers
  • At-speed scan test
  • Built-in self-test (BIST)
  • Software-based self-test

At-speed test does not necessarily guarantee
high-quality delay testing unless tests are
designed to detect delay faults!
19
Applying At-Speed Scan Tests
V2 states are generated either (A) by a one-bit scan shift of
V1, or (B) by applying V1 in functional mode.
Scan chain length: L
[Figure: test application scheme (A) and test application scheme (B)]
20
Classifications of Paths
There are untestable paths even if full-scan is
used to deliver tests! But do we really need to
test them, if defects/variations on them do NOT
degrade circuit performance in functional mode?
21
A Cost-Effective Test Strategy
  • Use functional vectors
  • Functional vectors can be applied at-speed and
    should catch some delay defects.
  • Functional vectors should be evaluated for
    transition fault coverage.
  • Derive and apply tests for undetected transition
    faults
  • Derive and apply tests for long path-delay faults

22
Delay Test/Speed Binning Challenges for Nanometer
Devices
  • Delay variability increases due to process,
    circuit, temperature, power, and noise factors.
  • No. of critical paths increases due to speed and
    power saving techniques.
  • Clock is increasingly susceptible to
    faults/variations creating test inaccuracy and
    escapes.
  • Conventional transition and path delay models and
    test methodologies are severely challenged!

23
Variability of Path Delay
  • Noise-induced variability
  • Coupling cap -- pattern (excitation/propagation)
    and timing specific
  • Power grid fluctuation -- pattern specific
  • Circuit induced -- leakage, charge sharing
    (pattern specific)
  • Process-induced variability
  • Spatial and temporal parametric variability: lot to
    lot, wafer to wafer, die to die
  • Limitations in lithography
  • CMP induced variability
  • Thermal-induced variability
  • Power-induced variability

Source: T.M. Mak, Intel
24
More Critical Paths: Slowing Down Non-Critical
Paths
  • Severe power constraints drove power optimization
    everywhere.
  • Slowed-down paths and sped-up paths are all crowded
    around the required period.
  • More critical paths make it easier for
    crosstalk-slowdown to propagate.
  • Bus coupling effects over local wires may be more
    frequent.

25
Potential Solution: Go Statistical!
  • Circuit delays can be modeled as correlated
    random variables to take various local and global
    factors into account
  • Noise, process variations, pattern dependency,
    temp. variations, etc.
  • Global effects can be modeled by correlation
    factors between delay random variables.

Mean/variance of pin-to-pin delay or interconnect
delay
a
26
Notion of Critical Path
The most critical path can be different
based upon which delay model you have in mind!
27
Critical Path Varies from Chip Instance to
Instance
Suppose 10000 chip instances are produced:
Path   Segments   Share of chips on which the path is critical
P1     a, e, g    43.6%
P2     b, e, g    19.1%
P3     c, f, g    23.6%
P4     d, f, g    13.7%
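A minimal Monte Carlo sketch of why the critical path changes from chip to chip. The four paths and their segments follow the slide; the segment means, sigmas, and the shared per-chip shift are hypothetical values chosen only for illustration, so the printed shares will not match the slide's numbers.

```python
import random
from collections import Counter

# Paths as listed on the slide: each is a sequence of delay segments.
PATHS = {"P1": "aeg", "P2": "beg", "P3": "cfg", "P4": "dfg"}

# Hypothetical (mean, sigma) for each segment delay, arbitrary time units.
SEGMENTS = {
    "a": (10.0, 1.0), "b": (9.5, 1.0), "c": (9.7, 1.0), "d": (9.3, 1.0),
    "e": (5.0, 0.5), "f": (5.0, 0.5), "g": (3.0, 0.3),
}

def critical_path_shares(n_chips, seed=0):
    """Return the fraction of simulated chips on which each path is longest."""
    rng = random.Random(seed)
    wins = Counter()
    for _ in range(n_chips):
        # One shift per chip, shared by all segments (a crude global factor).
        chip_shift = rng.gauss(0.0, 0.5)
        delay = {s: rng.gauss(mu, sigma) + chip_shift
                 for s, (mu, sigma) in SEGMENTS.items()}
        path_delay = {p: sum(delay[s] for s in segs)
                      for p, segs in PATHS.items()}
        wins[max(path_delay, key=path_delay.get)] += 1
    return {p: wins[p] / n_chips for p in PATHS}

shares = critical_path_shares(10000)
for path, share in shares.items():
    print(f"{path}: critical on {100 * share:.1f}% of chips")
```

Because P1/P2 share segment e and P3/P4 share segment f, their delays are correlated, which is the kind of structure a statistical timing model has to capture.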
28
Statistical Delay Test Diagnosis Framework
  • Need to consist of five major components
  • Statistical timing analysis
  • Statistical critical path selection
  • Selecting statistical long and true paths whose
    tests maximize the detection of DSM delay defects
  • Path coverage metric
  • Estimating the quality of a path set
  • Generation of high quality tests for target paths
  • Identifying tests that activate longest delay
    along the target path
  • Path delay is highly pattern dependent
  • Delay fault diagnosis based on statistical timing
    model

29
Statistical Timing Analyzer
  • Gate/Cell level
  • Correlated delay vs.
  • Cell delay library
  • Interconnect model
  • Monte Carlo Based
  • Automatically determine convergence condition
  • Static and dynamic
  • Vector-less or vector-dependent

Arrival times
(V1V2.)
Estimate signal arrival time as random variable
30
Statistical Critical Paths
Arrival time of O
O
I
  • A critical path can be defined as the one with
    greater than P probability of exceeding a cut-off
    period T
  • Adjusting P and T to limit the size of critical
    path set
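The definition above can be sketched as follows, assuming (purely for illustration) that each path delay is normally distributed. The path names and their mean/sigma values are hypothetical.

```python
import math

def prob_exceeds(mean, sigma, cutoff):
    """P(delay > cutoff) for a normally distributed path delay."""
    z = (cutoff - mean) / (sigma * math.sqrt(2.0))
    return 0.5 * math.erfc(z)

def select_critical(paths, cutoff_T, prob_P):
    """Keep the paths whose probability of exceeding T is greater than P."""
    return [name for name, (mean, sigma) in paths.items()
            if prob_exceeds(mean, sigma, cutoff_T) > prob_P]

# Hypothetical (mean, sigma) path delays.
paths = {"A": (25.0, 3.0), "B": (24.0, 3.0), "C": (22.0, 2.0)}

print(select_critical(paths, cutoff_T=26.0, prob_P=0.2))  # larger set
print(select_critical(paths, cutoff_T=26.0, prob_P=0.3))  # smaller set
```

Raising P (or T) shrinks the critical path set; lowering either one enlarges it.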

31
Considering Path Correlation for Path Selection
Output arrival times
25/3
A
overlap
24/3
B
C
22/2
After selecting path A, should path B or C be
selected?
32
Considering Path Independence for Path Selection
Paths selected for test generation
  • Defects on selected paths can be captured.
  • However, a (small) defect that falls outside the
    selected paths may not be captured.
  • Even with transition fault tests, path
    independence can still be an important factor for
    path selection.

33
Statistical Critical Path Selection
  • A new method achieving four objectives
  • Select statistical long paths
  • Consider path correlation
  • Achieve path independence
  • Eliminate statistical false paths
  • Results indicate that selecting statistical
    long paths considering correlation and
    independence simultaneously
  • Achieves higher test quality with the same number
    of selected paths
  • Selects fewer paths to achieve same level of test
    quality

J.-J. Liou, et al., "Experience in Critical Path Selection For Deep Sub-Micron Delay Test and Timing Validation," ASPDAC 2003.
J.-J. Liou, et al., "False-Path Aware Statistical Timing Analysis and Efficient Path Selection for Delay Testing and Timing Validation," DAC 2002.
34
Section 12.3.1
Signal Integrity and Power Supply Noise
35
Coping with Signal Integrity
36
Signal Integrity
  • Motivation
  • Modeling
  • Test Methodologies
  • Enhanced BIST
  • Enhanced Scan

37
Motivation
  • Test cost will be dominant in this decade
    [ITRS01]

38
Result of Technology Scaling
Source: ITRS01 Roadmap
Factor            Value per technology node
Technology (µm)   0.35    0.25    0.18    0.13    0.10
Coupling C (pF)   41.59   49.73   56.93   64.17   70.54
Ground C (pF)     12.89   10.06   9.65    7.30    6.42
Mutual L (nH)     0.80    0.84    0.88    0.93    0.97
Self L (nH)       1.17    1.17    1.18    1.21    1.23
39
Testing for Signal Integrity
Physical Defects
Process Variations
40
Fatal Problems on First Spin
  • Overall, 61% of new ICs require at least one
    re-spin

www.deepchip.com
41
The Bottom Line
  • Signal integrity loss occurs due to process
    variations, manufacturing defects, the parasitic
    and coupling C/L. Integrity loss leads to
    failure.
  • The signal integrity problem is both a design and a
    test issue. A systematic approach for testing is
    needed.

42
Interconnect Model
  • Signal integrity problems originate from
    interconnects.
  • Distributed RLC model is too complicated.

43
Integrity Loss Model
  • Excessive delay degrades performance and causes
    functional error.
  • Ringing causes functional error.
  • Overshoot contributes to noise, delay, hot
    carrier, time-dependent dielectric breakdown, and
    electromigration.

44
Prior Works
  • Fault model and test pattern generation
  • W. Chen, S. Gupta and M. Breuer [ITC98]
  • M. Cuviello, S. Dey, X. Bai and Y. Zhao [ICCAD99]
  • A. Attarha, M. Nourani [VTS02]
  • Self-test methods for testing interconnects
  • X. Bai, S. Dey and J. Rajski [DAC00]
  • M. Nourani and A. Attarha [DAC01, JETTA02]
  • I. Rayane, J. Medina and M. Nicolaidis [VTS99]
  • Modified boundary scan
  • J. Shin, H. Kim and S. Kang [DATE99]
  • K. Lofstrom [ITC96]
  • C. Chiang and S. Gupta [VTS97]
  • S. Yang, C. Papachristou and M. Tabib-Azar
    [DAC01]
  • M. Tehranipoor, N. Ahmed, M. Nourani [TCAD04]

45
Method 1 Enhanced BIST
  • The adverse effects of integrity loss will appear
    only at the working frequency.
  • The effects of integrity loss are usually
    transient and intermittent.
  • At-speed testing requires high-performance ATEs.
  • External test of signal integrity is limited due
    to speed, access and probing difficulties.

Test Pattern Generator
Output Response Analyzer
Interconnect Under Test (IUT)
Test Controller
46
On-Chip Noise Detection
  • The internal Noise Detector (ND) and Skew
    Detector (SD) cells sample signals and record
    skew and delay violations.
  • Our BIST-based methodology can be integrated
    within conventional BIST environments with 20% to
    50% more overhead.

[Figure: BIST architecture. Blocks: TPGR, Interconnect Under Test (IUT), ND cell, SD cell, MISR, BIST controller.]
47
Noise Detector (ND) Cell
  • The ND cell detects voltage violations, e.g.
    overshoot and ringing.

[Figure: ND cell schematic on the signal line of the IUT between Core i and Core j. Labels: signal noise, transistors T1-T7, nodes x, y, c, Test_mode, to read-out circuit.]
48
Behavior of the ND Cell
  • The noise detector (ND) cell shows a hysteresis
    property and can detect two threshold voltages.

49
Skew Detector (SD) Cell
[Figure: SD cell schematic. Labels: XNOR, sensor, level restorer, PDN, Inverter 1, Inverter 2, TCK, ADR, nodes a, b, c, interconnect signal (signal delay), to flip-flop.]
50
Behavior of the SD Cell
ADR
Violation
51
Readout Architecture
SI SO
ND Cell Vb Vc
0
1
Test Controller
52
Method 2 Enhanced JTAG
  • Boundary scan provides easy access to the
    interconnects
  • The boundary scan cells and TAP controller need
    modification to
  • generate and apply the test patterns (PGBSC)
  • capture and read out the integrity violations
    (OBSC)

53
Maximum Aggressor (MA) Model

54
Pattern Analysis in MA Model

55
Pattern Generation BSC (PGBSC)
[Figure: PGBSC schematic. Labels: FF1 (D1, Q1), FF2 (D2, Q2), FF3 (T, Q3), 0/1 multiplexers, input pin/core output, output pin/core input, TDI/previous cell, TDO/next cell, Mode, SI, ShiftDR, ClockDR, UpdateDR.]
56
Operational Modes of PGBSC
PGBSC Mode Q1 SI
Victim 1 1
Aggressor 0 1
Normal X 0

Victim mode
Aggressor mode
UpdateDR
CLK-FF2
Q2
57
Encoded Data for Victim Line

Victim-Select Victim Line
10000 1
01000 2
00100 3
00010 4
00001 5
58
Observation BSC (OBSC)
[Figure: OBSC schematic. Labels: FF1 (D1, Q1), FF2 (D2, Q2), ND FF, SD FF, 0/1 multiplexers, sel, ND/SD, input pin/core output, output pin/core input, TDI/previous cell, TDO/next cell, Mode, SI, ShiftDR, ClockDR, UpdateDR.]
59
Operational Modes of OBSC

Modes ND/SD SI
NDFF 1 1
SDFF 0 1
Normal X 0
  • Values of signal sel

SI ShiftDR sel
1 0 0
1 1 1
0 X 1
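The three rows of the sel table are consistent with sel = NOT(SI) OR ShiftDR. That expression is inferred from the table rather than quoted from the slides; a small sketch that enumerates it:

```python
def sel(si, shift_dr):
    """sel = NOT(SI) OR ShiftDR, inferred from the table of sel values."""
    return int((not si) or shift_dr)

for si in (0, 1):
    for shift_dr in (0, 1):
        print(f"SI={si} ShiftDR={shift_dr} -> sel={sel(si, shift_dr)}")
```

The only combination that gives sel = 0 is SI = 1 with ShiftDR = 0, matching the first row of the table.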
60
Operation of OBSC
[Figure: OBSC timing diagram. Signals: TCK, controller state (Capture-DR, Shift-DR), ClockDR, ShiftDR, SI = 1. Annotation: select ND/SD cell (form the scan chain), with $sel = \overline{SI} + ShiftDR$.]
61
Test Architecture

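[Figure: test architecture. Standard BSCs, OBSCs, and PGBSCs (cells numbered 1, 2, ..., k and 1, 2, ..., m) connected through the standard IEEE 1149.1 interface.]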
62
New Test Instructions
  • Two new instructions are added to the IEEE1149.1
    instruction set
  • G-SITEST Instruction
  • Facilitates test pattern generation based on the
    MA fault model
  • PGBSCs are enabled with signal SI = 1
  • ND/SD cells become active (CE = 1) to capture the
    signal integrity information
  • O-SITEST Instruction
  • Is used to capture and scan out the ND/SD FFs
    data
  • Is loaded after the G-SITEST instruction

63
Concluding Remarks - SI
  • Signal integrity failures are intermittent;
    therefore, new test pattern generation, detection
    and readout strategies are required.
  • Enhanced BIST Methodology
  • Is capable of at-speed testing.
  • Is relatively expensive unless limited to long
    buses.
  • Provides data for test/reliability analysis and
    diagnosis.
  • Enhanced JTAG Architecture
  • Requires noise/skew detector cells on
    interconnects.
  • Needs modified boundary scan cells.
  • Provides a cost effective solution to test
    interconnects for integrity loss.

64
Coping with Power Supply Noise
65
Power Supply Noise (PSN)
  • Noise in high speed design
  • Sharp rise and fall times (small dt)
  • Changes in current drawn from Vdd (large di)
  • $V_{PSN} = L\,\frac{di(t)}{dt} + R\,i(t)$
  • Large voltage fluctuation
  • Intermittent malfunctions
  • Functional failure
  • Reliability problem
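A minimal numeric sketch of the relation V_PSN = L di/dt + R i(t), using a backward difference for di/dt. The inductance, resistance, and current ramp are hypothetical values picked only to show that a shorter dt for the same di produces a larger noise voltage.

```python
def psn_voltage(inductance, resistance, current_samples, dt):
    """Return V_PSN at each sample after the first: L * di/dt + R * i."""
    noise = []
    for k in range(1, len(current_samples)):
        di_dt = (current_samples[k] - current_samples[k - 1]) / dt
        noise.append(inductance * di_dt + resistance * current_samples[k])
    return noise

L_PKG = 1e-9    # hypothetical 1 nH supply inductance
R_GRID = 0.05   # hypothetical 50 milliohm grid resistance
ramp = [0.0, 0.25, 0.5, 0.75, 1.0]  # current rising from 0 A to 1 A

slow = psn_voltage(L_PKG, R_GRID, ramp, dt=1e-9)    # 1 ns per step
fast = psn_voltage(L_PKG, R_GRID, ramp, dt=0.1e-9)  # 0.1 ns per step

print("peak noise, slow edge:", max(slow), "V")
print("peak noise, fast edge:", max(fast), "V")
```

The resistive term is the same in both runs; only the inductive term grows when the edge gets sharper.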

66
Motivation
  • Power supply noise analysis captures noise more
    globally in high speed circuits.
  • PSN analysis can be useful for
  • Pre synthesis noise/performance estimation
  • Sensitivity analysis
  • Power supply network design
  • Accurate estimation of PSN in a core based SoC
    without exhaustive simulation.

67
Prior Work
  • Genetic algorithms
  • Y. Jiang, K. Cheng and A. Krstic [CICC97]
  • G. Bai, S. Bobba and I. Hajj [ICECS01]
  • S. Zhao, K. Roy and C. Koh [ICCD00]
  • Estimation based on modeling
  • Y. Chang, S. Gupta and M. Breuer [VTS97]
  • L. Zheng, B. Li, and H. Tenhunen [ISCAS00]
  • M. Nourani, M. Tehranipoor, N. Ahmed [VTS05]
  • Application: power distribution and floor
    planning
  • N. Pham, M. Cases, D. Araujo and E. Matoglu
    [VTS04]
  • S. Zhao, K. Roy and C. Koh [ASP-DAC02]

68
Power Distribution Wire and Pin Model
  • Vnoise(t) (Vdd,pin - Vdd,block(t)) (Vss,pin -
    Vss,block(t))

69
PSN Analysis Metrics
  • Level: the level of a node (its distance from the
    primary inputs) indicates how difficult or
    restrictive it is to switch that node.
  • Fan-out: switching time (tP) grows with fan-out,
    so switching low fan-out gates reduces dt and
    therefore increases di/dt.
  • Fan-in: large fan-in gates are resized (made
    wider) to allow proper pull-up and pull-down.
    A larger width allows the gate to draw more current.

70
Effect of Level
71
Effect of Fan-Out
72
Effect of Fan-In
73
Different Methods
Method             M1    M2    M3    M4    M5    M6
Order of metrics   ILO   IOL   OLI   OIL   LOI   LIO
(L = level, O = fan-out, I = fan-in)
  • Ordering of metrics gives priority to switching
    one gate over the other.
  • Level is given the highest priority due to ease
    of control.
  • M5 (LOI) and M6 (LIO) methods are used.

74
Test Pattern Generation
  • Exhaustive
  • $2^n(2^n-1) \approx 2^{2n}$ possible transitions
  • Random
  • Cannot guarantee peak power in short time.
  • Instead of exhaustive or random search,
    heuristic-based approach should be used.
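A short sketch of why exhaustive search is impractical: for an n-input block there are 2^n choices for the first vector and 2^n - 1 different choices for the second, so the number of ordered pattern pairs is close to 2^(2n).

```python
def num_pattern_pairs(n_inputs):
    """Ordered pairs (V1, V2) of distinct n-bit vectors."""
    return 2 ** n_inputs * (2 ** n_inputs - 1)

for n in (4, 16, 32, 64):
    exact = num_pattern_pairs(n)
    approx = 2 ** (2 * n)
    print(f"n={n}: {exact} pairs (2^(2n) = {approx})")
```

Already at n = 32 the count is above 10^19, which is why a heuristic guided by the PSN metrics is used instead.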

75
Basic Approach
  • Use conventional stuck-at-fault (s-a-f) pattern
    generation methods and tools to stimulate a 0→1
    or 1→0 transition.
  • Determine test pattern pairs by estimating
    maximum PSN according to three PSN metrics.

76
Concluding Remarks (PSN)
  • Power supply noise captures the effect of noise
    globally.
  • One goal is often to generate a pattern pair to
    stimulate maximum PSN in non-embedded cores.
  • Combining individual PSN curves strategically
    provides a way to estimate PSN of SoC without
    full simulation.

77
Section 12.3.2
Parametric Defects, Process Variation, and Yield
78
Defects and Physical Defects
  • Physical defects occur during manufacturing, and
    can cause static or timing physical failures
  • Defects can be random or systematic, and can be
    functional or parametric
  • Traditional work is more on functional random
    spots
  • Other three types of defects need to be
    researched
  • Defects can also be caused by process variations
    and random imperfections

79
Defect-Based Test
  • Defect-based test can be done by enumerating
    likely defect sites from layout
  • At-speed tests (path-delay tests and transition
    tests) must be used

80
Supplements of Conventional Stuck-at Tests
  • Bridging tests enumerate likely bridging fault
    sites (interconnects) by layout simulation
  • N-defect stuck-at tests detect every stuck-at
    fault N times by targeting different sensitive
    paths
  • TARO (transition fault propagation to all
    reachable outputs) generate transition tests one
    for each reachable output, for each given
    transition fault.
  • IDDQ tests test by measuring current flow
  • Functional Testing must be added to supplement
    structural tests
  • Key issue how to generate these defect-based
    tests in a timely manner to meet test goals.

81
Sections 12.3.3-12.3.4
Soft Errors and Fault Tolerance
82
Cosmic Ray/Radiation Mechanics
high energy neutrons
lighter Particles
some particles also may pass through all material
without colliding with any atoms
83
Mechanism of Neutron SER
84
Neutron Environment
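[Figure: neutron environment versus altitude (Normand et al.)]
  • 1,000,000 feet (330 km): Shuttle; primary cosmic rays
  • 150,000 feet (50,000 m): top of atmosphere
  • 60,000 feet (20,000 m): peak neutron flux
  • 35,000 feet (10,000 m): aircraft
  • Ground: 1/500 of peak flux
  • Primary cosmic rays interact with N and O in the
    atmosphere, producing neutrons (secondary cosmic rays)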
85
electrons gammas
varies with altitude and geography
n0-Si interaction can result in a short range,
intense burst of charge
86
Mechanism of Alpha Particle SER
Host material (package, Si, oxide, etc.)
  • Impurities disintegrate in alpha decay
  • Observed alpha flux 0.01/h/cm2

Impurity in ppb amounts (238U, 232Th, etc.)
87
Sources of alpha
  • alpha decay from radioactive isotope Pb 210
  • travel short distance, a few centimeters
  • but they are located at the surface of the dies
    with C4 mounting
  • Traditional solutions, e.g., epoxy die coat, are
    not effective

88
Shrinking Process Decreases Charge per Node
Soft error rate is a function of the stored charge at
sensitive nodes: Q = CV, i.e., Cnode and Vcc
How low will it go?
89
RAM cell sensitive area
SER Critical area is drain/source junction rather
than metal interconnect
Not all diffusion areas are sensitive. Different
junctions have different sensitivity
characteristics
90
Caches on CPU chip
Itanium2
Neutron Scaling Trends (scaling factors)
Tech.    ADIFF   CGATE   Vcc   SER / bit
λ        1       1       1     1
0.7x λ   0.54    0.75    0.7   1.08-1.45
0.7x λ   0.54    0.75    0.8   0.92-1.18
Itanium
This may lead to false security
Doubling of devices every generation will double
soft error rate as well !!
91
Logic Circuit Subjected to SEU
Feedback loop exists
  • Logic is not immune to SER
  • All feedback nodes are susceptible
  • Typical hardening techniques will pose
    performance penalty
  • Latch is susceptible when in the holding phase
    (50% of cycle)
  • F/F Master/Slave
  • Each is sensitive during its inactive phase

typical D-latch
A disturbance changes the logic state permanently
Focusing on the meta-stable window results in a low
probability event
92
SER Cannot be Screened During Manufacturing
  • SER is recoverable error (if recomputed)
  • Most memory elements (with feedback) are
    susceptible to SER (maybe to different degree)
  • Errors in compute elements may become SDC (silent
    data corruption) or at best system crash (equally
    undesirable)
  • A low-end system may be OK with a reboot; this is
    not acceptable for a server
  • SDC is the most vulnerable case: we won't know
    unless the computation is repeated at another time

93
Redundancy - Spatial
Identical Compute elements
  • Spatial
  • Duplication and compare
  • Triplication and vote
  • Checkpointing and rollback is a common recovery
    technique
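A minimal software sketch of the two spatial schemes. The function names and the fault-injection hook are hypothetical; in hardware the copies are identical compute elements running in parallel. Duplication can only detect a mismatch (and then trigger rollback), while triplication can mask a single faulty copy.

```python
from collections import Counter

def duplicate_and_compare(compute, x, inject_fault=None):
    """Run two copies; return (result, ok). ok is False on a miscompare."""
    first = compute(x)
    second = compute(x)
    if inject_fault is not None:
        second = inject_fault(second)  # model a soft error in copy 2
    return first, first == second

def triplicate_and_vote(results):
    """Majority vote over three results; raise if no two copies agree."""
    value, votes = Counter(results).most_common(1)[0]
    if votes < 2:
        raise RuntimeError("no majority: all three copies disagree")
    return value

def square(v):
    return v * v

def flip_lsb(v):
    return v ^ 1  # single-bit upset in the result

print(duplicate_and_compare(square, 7))                         # (49, True)
print(duplicate_and_compare(square, 7, inject_fault=flip_lsb))  # (49, False)
print(triplicate_and_vote([49, flip_lsb(49), 49]))              # 49
```

On a miscompare in the duplicated scheme, the system would roll back to the last checkpoint and recompute.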

checker
Miscompare !
Initiate rollback
Continue if correct Checkpointing regularly
94
Redundancy - Temporal
Compute element
  • Temporal
  • Assume error is not persistent
  • Recompute everything (twice)
  • First result is compared to the second
  • Checkpointing and rollback is common recovery
    technique

Compute twice
Temporary storage
1st result stored temporarily
2nd result forward for comparison
checker
Miscompare !
Initiate rollback
Continue if correct Checkpointing regularly
95
Redundancy - Information
parity
  • Error detection by extra information
  • E.g. parity
  • Parity is preserved through computation
  • Error correction by extra information
  • E.g. Hamming code
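As a concrete instance of correction by extra information, here is a sketch of the textbook Hamming(7,4) code: three check bits protect four data bits, and the syndrome gives the position of a single flipped bit. The bit layout (check bits at positions 1, 2, 4) is the usual convention, not something specified on the slide.

```python
def hamming74_encode(data):
    """Encode 4 data bits as [p1, p2, d1, p4, d2, d3, d4]."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4   # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]

def hamming74_decode(codeword):
    """Return (data bits, error position); position 0 means no error seen."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4   # 1-based index of the flipped bit
    if position:
        c[position - 1] ^= 1          # correct the single-bit error
    return [c[2], c[4], c[5], c[6]], position

word = [1, 0, 1, 1]
code = hamming74_encode(word)
corrupted = list(code)
corrupted[4] ^= 1                     # flip the bit at position 5
print(hamming74_decode(corrupted))    # recovers word, reports position 5
```

A single parity bit would detect this error but could not locate it; ECC in caches and DRAM typically adds one more overall parity bit so that double errors are also detected.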

data
Parity should match
operation
result
ECC check bits
One bit in a row defective
data
ECC logic
data
Defective bit, correct by ECC
96
Hierarchy of Protection
  • Start with most unreliable (and highest quantity)
    elements
  • DRAM (highest number of bits)
  • Parity, ECC
  • Hard disk (mechanical)
  • RAID (Redundant Array of Independent Disks)
  • IO Channels
  • CRC, ECC
  • Cache
  • Parity, ECC
  • Register files
  • Parity
  • Execution units
  • DMR

97
Section 12.3.5
Defect and Error Tolerance
98
Defect Tolerance
  • Assumptions
  • Defect rate is low
  • Most defects cause single cell/row/column
    failures
  • Repair defective elements using redundant resources

99
Redundancy Repair An Example
  • Design in spare rows or columns or blocks
  • Test to identify where bad cell or row/column is
  • Algorithm to figure out how best to use spare
    elements to repair all defects
  • Fuse: change decoders to address spare elements
    instead of faulty elements

spares
100
Error Tolerance
  • An increasing number of new computing applications
    work with multimedia data
  • Compression is generally used for these data
    types
  • E.g. MP3, JPEG, MPEG
  • Lossy in nature
  • Human senses are not keen enough to spot the
    difference
  • Errors may be tolerated for these kinds of
    applications
  • Accepting more faulty chips will increase
    effective yield while lowering cost

re Error Tolerance, Mel Breuer, 2005
101
Section 12.4
FPGA Testing
102
Field Programmable Gate Arrays
  • 2-dimensional array
  • Programmable logic blocks (PLBs)
  • Programmable routing network
  • Programmable I/O cells
  • Recent FPGAs incorporate specialized cores
  • Memory cores RAMs, FIFOs, etc.
  • Digital signal processors (DSPs)
  • Embedded processors

103
Basic PLB Architecture
  • Look-up Table (LUT)
  • implements combinational logic truth table
  • Memory elements
  • Flip-flop/latch
  • Some LUTs also implement small RAMs
  • Carry and control logic

104
Programmable Interconnect Network
  • Wire segments of varying length
  • xN: N PLBs in length
  • N = 1, 2, 4, 6, 8 are most common
  • xH: half the array in length
  • xL: length of full array
  • Programmable Interconnect Points (PIPs)
  • Transmission gate connects 2 wire segments
  • Controlled by configuration memory bit
  • Four basic types of PIPs

105
Programmable Interconnect Points
  • Break-point PIP
  • Connect or isolate 2 wire segments
  • Cross-point PIP
  • 2 nets straight through
  • 1 net turns corner and/or fans out
  • Compound cross-point PIP
  • Collection of 6 break-point PIPs
  • Can route 2 isolated signal nets
  • Multiplexer PIP
  • Directional and buffered
  • Main routing resource in recent FPGAs
  • Select 1-of-N inputs for output

106
Ranges of Programmable Resources
Category            FPGA Resource                   Small FPGA   Large FPGA
Logic               PLBs per FPGA                   100          22,000
Logic               LUTs and flip-flops per PLB     1            4
Routing             Wire segments per PLB           50           400
Routing             PIPs per PLB                    80           1,000
Specialized Cores   Bits per memory core            128          18,432
Specialized Cores   Memory cores per FPGA           16           500
Specialized Cores   DSP cores                       0            500
Other               Input/output cells              62           1,200
Other               Configuration memory bits       32,000       50,000,000
Large FPGAs easily exceed 500 million transistors
107
FPGA Testing Problem
  • Must test all modes of operation
  • Many test configurations must be downloaded
  • Long test time
  • Large and complex devices
  • Large FPGAs exceed 500 million transistors
  • Many different types of functions to test
  • PLBs
  • Routing resources
  • Specialized cores (RAMs, FIFOs, DSPs, etc.)
  • Frequently changing architectures

108
FPGA Testing Approaches
  • With respect to system application
  • Application independent testing
  • Test all resources in FPGA regardless of system
    function to be implemented
  • Application dependent testing
  • Test only those resources that will be used by a
    given system function
  • Testing techniques
  • External testing
  • Test patterns applied and output responses
    monitored through I/O pins with external
    equipment
  • Built-In Self-Test (BIST)

109
BIST for FPGAs
  • Basic idea: reprogram FPGA to test itself
  • No area overhead or performance penalties for
    system applications
  • Applicable to all levels of testing
  • From device-level through system-level testing
  • Cost
  • Memory to store BIST configurations
  • Goal: minimize number of configurations
  • Download time to execute BIST configurations
  • Goal: minimize downloads and/or download time

110
BIST for PLBs
  • Program PLBs as
  • Test Pattern Generators (TPGs)
  • Multiple TPGs prevent faulty PLBs under test from
    escaping detection when there is a fault in a TPG
    PLB
  • Identically configured logic blocks under test
    (BUTs)
  • Output Response Analyzers (ORAs)
  • Comparison-based
  • Row or column orientation
  • Two test sessions required
  • TPGs/ORAs and BUTs reverse roles

Test session 1
Test session 2
ORA
111
BIST for Routing Resources
  • Program groups of wire segments and PIPs as wires
    under test
  • Program some PLBs as TPGs and ORAs
  • Similar to BIST for PLBs
  • Two BIST approaches
  • Comparison-based
  • Similar to BIST for PLBs
  • Parity-based
  • TPG produces test patterns with parity
  • ORA performs parity check
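The parity-based approach can be sketched in software as follows. The counter-style TPG, the even-parity convention, and the fault models are assumptions made for illustration; on the device these roles are played by PLBs programmed as TPG and ORA, with the wires under test in between.

```python
def tpg_with_parity(width):
    """Yield every width-bit pattern with an even-parity bit appended."""
    for value in range(2 ** width):
        bits = [(value >> i) & 1 for i in range(width)]
        yield bits + [sum(bits) % 2]

def ora_parity_check(received):
    """Pass when the received bits (data plus parity) have even parity."""
    return sum(received) % 2 == 0

def fault_detected(width, wires_under_test):
    """Send all patterns through the wires; True if any parity check fails."""
    return any(not ora_parity_check(wires_under_test(pattern))
               for pattern in tpg_with_parity(width))

def fault_free(pattern):
    return list(pattern)

def wire0_stuck_at_0(pattern):
    return [0] + list(pattern[1:])   # first wire stuck at 0

print(fault_detected(4, fault_free))        # False
print(fault_detected(4, wire0_stuck_at_0))  # True
```

A fault that flips an even number of wires in every pattern would escape a pure parity check, which is one reason the comparison-based approach is also used.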

112
Concluding Remarks
  • FPGAs are more SoC-like with specialized cores
  • RAMs, DSPs, etc. can be tested with approach
    similar to BIST for PLBs
  • SoCs are incorporating FPGA cores
  • These cores can be tested with BIST for FPGA
    techniques
  • Complex Programmable Logic Devices (CPLDs) are
    similar to FPGAs
  • Can be tested with approaches similar to those
    used for FPGAs

113
Section 12.5
MEMS Testing
114
Introduction to MEMS
  • What is MEMS?
  • MEMS: Micro-Electro-Mechanical System
  • Extremely small (in the range of µm) devices
    utilizing both electrical and mechanical
    properties.

MEMS gear chain and a mite for size comparison
(Sandia MEMS)
115
Video Clips from Sandia MEMS
  • A micro-resonator is used as an actuator to drive
    MEMS gears with mites crawling on top.

116
Commercially Available MEMS Devices
Digital Micromirror Device (DMD) (Texas
Instruments, Inc.)
MEMS ink jet print head (nozzle) (HP, Inc.)
117
Commercially Available MEMS Devices
ADXL50 accelerometer (Analog Devices, Inc.)
LambdaRouter optical switch (Lucent, Inc.)
118
Why MEMS?
  • Why MEMS?
  • Lower cost due to batch fabrication
  • Light weight
  • Smaller size
  • Lower energy consumption
  • Higher performance
  • MEMS applications
  • Automobile industry
  • Health care
  • Aerospace
  • Consumer Products
  • RF telecommunications
  • Other areas

119
World MEMS Market Prediction
  • Worldwide MEMS market is increasing rapidly.
  • Worldwide MEMS sales: > $5 billion in 2004,
    > $8 billion in 2007.

120
Basic Concepts for Capacitive MEMS Devices
  • In static mode, C1 = C2 = ε S / d0, where
  • ε: dielectric constant of air
  • S: overlap area between M and F1/F2
  • d0: static capacitance gap between M and F1/F2

121
Basic Concepts for Capacitive MEMS Devices
  • Assume a vertical stimulus force causes the movable
    mass M to move upward with displacement x, where
    x << d0

122
Basic Concepts for Capacitive MEMS Devices
  • To sense displacement x, modulation voltages Vmp
    and Vmn are applied to F1 and F2, respectively

Vmp = V0 sqr(ωt), Vmn = -V0 sqr(ωt)
where V0 is the modulation voltage amplitude, ω is
the modulation frequency, t is time, and sqr
denotes a square waveform
123
Basic Concepts for Capacitive MEMS Devices
  • According to charge conservation law,

where VM is the voltage sensed by the movable plate M
  • Solve the equation for VM
  • By sensing VM, we find the displacement x.
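A worked sketch of the sensing derivation is given here under explicit assumptions that the transcript does not spell out: F1 sits above M and F2 below it (so an upward displacement x narrows the gap to F1), both capacitors are ideal parallel plates, the modulation voltages are complementary square waves, and M carries no net charge.

```latex
\begin{align}
C_1 &= \frac{\varepsilon S}{d_0 - x}, \qquad
C_2 = \frac{\varepsilon S}{d_0 + x} \\
V_{mp} &= V_0\,\mathrm{sqr}(\omega t), \qquad
V_{mn} = -V_0\,\mathrm{sqr}(\omega t) \\
0 &= C_1\,(V_{mp} - V_M) + C_2\,(V_{mn} - V_M) \\
V_M &= \frac{C_1 - C_2}{C_1 + C_2}\,V_{mp}
     = \frac{x}{d_0}\,V_0\,\mathrm{sqr}(\omega t)
\end{align}
```

Under these assumptions the amplitude of the sensed voltage is linear in x, which is why measuring it recovers the displacement.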

124
MEMS BIST Research
  • Statement of problem: MEMS testing is becoming an
    urgent need.
  • MEMS is finding applications in safety-critical
    areas, such as automobile, aerospace, medical
    instruments, etc.
  • MEMS will be integrated into SoC, so it needs to
    be thoroughly tested to ensure reliability.
  • The commercialization of MEMS technology needs a
    thorough and efficient testing solution in order
    to ensure reliability and reduce the test cost.

125
MEMS Testing: A Challenging Topic
  • MEMS testing is challenging
  • multi-field coupling
  • diversity in device structure and working principle
  • analog signals involved
  • vulnerable to various defect sources (stiction,
    etch variances, particle contamination, etc.)
  • As in VLSI testing, built-in self-test (BIST)
    is believed to be a promising solution for MEMS
    testing.
  • Research goal: develop a robust and efficient
    BIST solution for capacitive MEMS devices.

126
MEMS Sensitivity BIST Scheme
  • How to generate test stimulus for MEMS?
  • Apply voltage Vd to F1 and nominal voltage Vnom
    to M
  • M experiences an electrostatic force Fd, where
    Fd = ε S (Vd - Vnom)^2 / (2 d0^2)

127
MEMS Sensitivity BIST Scheme
  • The force will result in a displacement x.
  • The sensitivity BIST scheme requires another
    similar structure to sense the displacement x.
  • The sensed VM is compared with a known reference
    value.
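The stimulus and check steps of the sensitivity BIST can be sketched numerically. The sketch assumes an ideal parallel-plate actuator and a linear spring, neither of which is stated in the transcript; the spring constant, the tolerance, and all function names are hypothetical.

```python
EPSILON_0 = 8.854e-12  # F/m, permittivity of free space (air is very close to this)


def electrostatic_force(area_m2: float, gap_m: float,
                        v_drive: float, v_nom: float) -> float:
    """Parallel-plate attraction between a driving plate and the movable plate, in newtons."""
    return EPSILON_0 * area_m2 * (v_drive - v_nom) ** 2 / (2.0 * gap_m ** 2)


def displacement(force_n: float, spring_constant_n_per_m: float) -> float:
    """Static displacement of the movable mass, assuming a linear spring (assumption)."""
    return force_n / spring_constant_n_per_m


def sensitivity_bist_pass(v_sensed: float, v_expected: float, tolerance: float) -> bool:
    """Compare the sensed voltage against a known reference value within a tolerance."""
    return abs(v_sensed - v_expected) <= tolerance
```

For example, a hypothetical 100 µm by 100 µm plate with a 2 µm gap and a 5 V difference gives `electrostatic_force(1e-8, 2e-6, 5.0, 0.0)`, a force of a few tenths of a micronewton. The force depends on the square of the voltage difference, so it is always attractive.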

128
Symmetry BIST for Capacitive MEMS
129
Symmetry BIST for Capacitive MEMS (cont.)
  • For a fault-free device
  • C1 = C2, then VM = 0
  • If C1 ≠ C2 (faulty), VM ≠ 0
  • Fixed capacitance plates are partitioned into two
    equal portions (S1, S2 at the top and S3, S4 at the
    bottom).
  • During symmetry BIST, always check whether VM = 0
  • A local defect causing left-right asymmetry is
    detected.
  • For partitioning of the movable plate, see N. Deb and
    R. D. (Shawn) Blanton, "Built-in Self Test of
    CMOS-MEMS Accelerometers," ITC'02, p. 1075.

130
Capacitance Partition for Dual-mode BIST
  • In Fig. (b)
  • S1, S2: top sensing plates
  • S3, S4: bottom sensing plates
  • D1, D2: driving plates
  • In Fig. (b), cont.
  • M: movable plate
  • C1, C2: top sensing capacitances
  • C3, C4: bottom sensing capacitances

131
Capacitance Partition for Dual-mode BIST
  • The fixed plate at each side of the movable plate is
    divided into three portions: one for electrostatic
    driving and two equal portions for capacitance
    sensing.
  • The movable capacitance plate is not partitioned.
  • Sensitivity and symmetry BIST can be implemented.

132
Voltage Biasing Scheme of Dual-Mode BIST
  • D1 and D2 also participate in normal operation.
  • Analog switches are used for mode-switching.
  • Any defect causing sensitivity change or
    left-right asymmetry will be detected.

133
MEMS Comb Accelerometer
  • Device prototype: ADXL50 by Analog Devices, Inc.
  • The serrated comb finger groups are extremely
    vulnerable to defects, so BISR is highly
    desirable.

134
Working Principles of Accelerometer
  • Uses differential capacitance sensing technique
  • Vmp and Vmn: complementary
    modulation voltages

135
Working Principles of Accelerometer (cont.)
  • VM is proportional to displacement x, which is in
    turn directly proportional to acceleration a.
  • By sensing VM, the value of a can be measured.

136
BIST for Comb Accelerometer
  • Device prototype: ADXL50
  • Since the fixed plates are separated comb
    fingers, the partition can be easily realized.
  • M1-M8: movable plates
  • D1-D8: fixed driving plates
  • S1-S8: fixed sensing plates

Voltage biasing | Normal operation         | Sensitivity BIST         | Symmetry BIST
Vd              | -                        | D1,D3,D5,D7              | D1,D3,D5,D7
Vnom            | -                        | D2,D4,D6,D8; M1,M4,M5,M8 | D2,D4,D6,D8; M1,M4,M5,M8; S2,S4,S6,S8
Vmp             | S1,S3,S5,S7; D1,D3,D5,D7 | S1,S3,S5,S7              | S1,S5
Vmn             | S2,S4,S6,S8; D2,D4,D6,D8 | S2,S4,S6,S8              | S3,S7
137
ANSYS Fault Simulation of Comb Accelerometer
Stiction defect simulation result
  • Stiction (on right central movable finger):
    sensitivity BIST is more efficient.
  • Finger height mismatch (only in right portion):
    symmetry BIST is more efficient.

Defect location | Frequency (kHz) | Sensitivity BIST (mV) | Symmetry BIST (mV)
Defect-free     | 11.85           | 967.6                 | 0
0               | 28.80           | 107.8                 | 2.5
10              | 32.75           | 82.2                  | 2.1
20              | 39.23           | 56.4                  | 1.6
30              | 46.80           | 39.2                  | 1.2
Finger height mismatch simulation
Height mismatch ΔH (µm) | Frequency (kHz) | Sensitivity BIST (mV) | Symmetry BIST (mV)
Defect-free             | 11.85           | 967.6                 | 0
0.1                     | 11.85           | 951.8                 | 42.9
0.2                     | 11.85           | 937.1                 | 83.9
0.3                     | 11.85           | 921.8                 | 128.2
0.4                     | 11.85           | 904.4                 | 181.3
For definitions of simulated defects, see the
paper at ITC'02, p. 1075.
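To illustrate why the two BIST modes complement each other, the simulated outputs from the two tables can be run through a simple pass/fail rule. The output values below are copied from the tables; the two thresholds (`sens_tol_mv`, `sym_limit_mv`) and the function name are hypothetical choices made only for this sketch.

```python
from typing import List

DEFECT_FREE_SENS_MV = 967.6  # defect-free sensitivity BIST output from the tables

# (defect label, sensitivity BIST output in mV, symmetry BIST output in mV)
STICTION = [("0", 107.8, 2.5), ("10", 82.2, 2.1), ("20", 56.4, 1.6), ("30", 39.2, 1.2)]
HEIGHT_MISMATCH = [("0.1", 951.8, 42.9), ("0.2", 937.1, 83.9),
                   ("0.3", 921.8, 128.2), ("0.4", 904.4, 181.3)]


def dual_mode_verdict(sens_mv: float, sym_mv: float,
                      sens_tol_mv: float = 100.0,
                      sym_limit_mv: float = 10.0) -> List[str]:
    """Return the list of BIST modes that flag the device as faulty."""
    flags = []
    if abs(sens_mv - DEFECT_FREE_SENS_MV) > sens_tol_mv:
        flags.append("sensitivity")  # output deviates too far from the defect-free value
    if abs(sym_mv) > sym_limit_mv:
        flags.append("symmetry")     # left-right imbalance produces a nonzero output
    return flags


stiction_flags = [dual_mode_verdict(s, y) for _, s, y in STICTION]
mismatch_flags = [dual_mode_verdict(s, y) for _, s, y in HEIGHT_MISMATCH]
```

With these assumed thresholds every stiction case is caught only by the sensitivity mode and every height-mismatch case only by the symmetry mode, which matches the observation that combining both modes improves coverage.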
138
Conclusions of Dual-Mode BIST
  • By partitioning fixed instead of movable
    capacitance plate, the BIST technique can be
    extended to bulk-micromachining and other MEMS
    technologies.
  • Sensitivity BIST and symmetry BIST each have their
    own fault coverage. Combining both ensures
    better coverage.
  • Sensitivity BIST is necessary during in-field
    usage even after calibration.
  • Some unstable defects may change status.
  • New defects may develop during in-field usage.
  • Stiction is also possible during in-field usage.

139
Section 12.6
High-Speed I/O Testing
140
I/O Test Requirements and Architectures
  • I/O test requirements are largely driven by the
    interoperability, system performance, and
    functional performance goals.
  • I/O test requirements are closely related to the
    link architectures.
  • For data rates < 1 Gbps, global common clock (CC)
    and source-synchronous (SS) architectures are popular.
  • Above 1 Gbps, serial architectures are dominant.
  • Timing, voltage, and bit error rate (BER) are
    common testing parameters.

141
I/O Architecture (I): Common Clock (CC)
  • Synchronized global (common) clock
  • Common clocks for Tx data driving and Rx data
    sampling
  • Clock skew on the board limits its use to data rates
    below a few hundred Mbps
  • Needs to test
  • Data to clock delay at Tx
  • Setup/hold time at Rx

143
I/O Architecture (II): Source Synchronous (SS)
  • Tx sends data along with strobe (another clock)
  • Rx uses sent strobe to sample the data
  • No clock or strobe skew issue

144
I/O Architecture (II): Source Synchronous (Cont'd)
  • Some designs use a strobe and its complement
    (strobe/strobe#) to improve timing accuracy.
  • Needs to test
  • Data valid before and after strobe at Tx end
  • Setup and hold times at Rx end

145
I/O Architecture (II): Source Synchronous (Cont'd)
  • Limited by data-to-data skew due to uneven
    channels
  • Board layout
  • EM issues, e.g., coupling and noise
  • Variation in drive among channels
  • Achieves data rates up to 1000 Mbps for a wide bus
  • Data rate can be improved by splitting the bus into
    several narrower buses

146
AC I/O loopback self-test
The loopback circuit is similar to the receiving end, so
the testing hardware already exists. It tests both the
drive and receive paths with low overhead.
Loop time = Tco (or Tvb) + Tsetup, or Tva + Thold
147
Defect-based I/O test
  • A wider spread of data valid time indicates faults

148
I/O Architecture (III): Serializer/Deserializer
(SERDES)
  • Bit clock is embedded in the serial data and gets
    recovered at Rx via clock recovery circuit.
  • Link layer is composed of encoder and decoder.
  • Physical layer (PHY) is composed of Tx, channel,
    and Rx.

149
I/O Architecture (III) A SERDES PHY Implementation
  • Bit clock is recovered via a phase interpolator
    (PI) clock recovery.
  • PI clock recovery tracks low frequency jitter
    from reference clock, Tx, and channel.
  • Differential PLLs reduce the jitter from
    reference clock.
  • Used widely in PCI Express (2.5 Gbps Gen I, 5.0 Gbps
    Gen II) and FB-DIMM (3.2, 4.0, and 4.8 Gbps Gen
    I).
  • Jitter is the major limiting factor/performance
    metric.

150
Jitter Components and Terminology
151
Eye-Diagram, Jitter, and Noise Testing
  • For output testing, jitter and noise probability
    density functions (PDFs), and eye-opening should
    be upper bounded.
  • For input tolerance testing, jitter and noise
    PDFs, and eye-opening should be lower bounded.

152
Summary
  • Link architecture determines the relevant test
    parameters and methods.
  • For synchronized CC and SS architectures,
    critical test parameters include
  • Data valid to clock/strobe
  • Setup/hold time
  • For a SERDES architecture, critical test
    parameters include
  • Jitter, including deterministic jitter (DJ),
    random jitter (RJ), and total jitter (TJ), or
    timing eye-opening
  • Noise, and voltage eye-opening
  • Bit error rate (BER)
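The relation between DJ, RJ, TJ, BER, and the timing eye-opening can be sketched with the widely used dual-Dirac approximation. The slides do not name this model, so treating it as the estimation method is an assumption; the function names and the example numbers are hypothetical.

```python
from statistics import NormalDist


def q_factor(ber: float) -> float:
    """Number of Gaussian sigmas whose one-sided tail probability equals the target BER."""
    return -NormalDist().inv_cdf(ber)


def total_jitter(dj_pp: float, rj_rms: float, ber: float = 1e-12) -> float:
    """Dual-Dirac estimate: TJ(BER) = DJ + 2 * Q(BER) * RJ_rms (same time unit throughout)."""
    return dj_pp + 2.0 * q_factor(ber) * rj_rms


def timing_eye_opening(unit_interval: float, dj_pp: float, rj_rms: float,
                       ber: float = 1e-12) -> float:
    """Remaining horizontal eye opening at the target BER."""
    return unit_interval - total_jitter(dj_pp, rj_rms, ber)


# Hypothetical example: 2.5 Gbps link (400 ps unit interval), 60 ps DJ, 5 ps rms RJ.
example_eye_ps = timing_eye_opening(400.0, 60.0, 5.0)
```

At a BER of 1e-12 the multiplier on RJ is roughly 14, so even a few picoseconds of random jitter consume a large share of the unit interval, which is one reason jitter is the limiting metric for SERDES links.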

153
Section 12.7
RF Testing

154
Outline of the Section
  • Introduce the basic concepts related to RF
  • Discuss the various challenges associated with RF
    test
  • Describe different core RF building blocks
  • Elaborate various test specifications for RF
    devices
  • System-level testing and associated
    specifications
  • Conclude with present and future trends

155
Introduction to RF
  • RF stands for Radio Frequency
  • Usually very high frequencies where signals can
    be transmitted wirelessly
  • Range of frequencies: 300 MHz to 3 GHz
  • RF is used synonymously with wireless
  • Significant growth during the last decade in the
    consumer segment
  • Increased consumer applications

156
Applications of RF
  • Earlier, consumer applications of wireless
    technology were limited
  • Military, space communications, air traffic
    control
  • Currently, consumer applications are on the rise
  • Cell phone, laptop, PDA, satellite radio
  • Radio-frequency identification (RFID)

157
Challenges with RF testing
  • Tests are performed in two steps
  • Characterization test
  • Production test
  • Various challenges make production test hard and
    expensive
  • RF devices need extra attention during testing
    (challenge 1)
  • Impedance matching at input and output ports to
    ensure optimal power transfer
  • Shielding from external wireless signals during
    testing

158
Characterization test
  • Characterization validates the first set of
    silicon
  • Uses highly accurate instruments to
  • Verify the functionality of the design
  • Ascertain that the specifications are met
  • Ensure high repeatability of the measurement
    system
  • Production test needs to perform all of the above
    with
  • Cheaper instrumentation: a least-cost commercial
    tester (challenge 2)
  • A much shorter test time than that used for
    characterization (challenge 3)

159
Repeatability and accuracy issues
  • Test procedure should classify
  • Good devices as good and
  • Bad devices as bad
  • This is constrained by measurement noise
  • Reduces the accuracy of measurement during
    production testing (challenge 4)
  • Introduces large variability in the same
    measurement repeated many times (challenge 5)
  • These can be mitigated by using
  • Accurate test application procedure
  • High resolution measurements

160
Summary of challenges
  • Challenge 1 is very specific to RF devices: it
    needs a careful measurement setup
  • Challenges 2 and 3 are also specific to RF
  • RF testers are very expensive (> $1M) compared to
    their analog and digital counterparts
  • RF tests are usually longer than analog
    tests
  • Challenges 4 and 5 are general for all electronic
    devices
  • However, these are more prominent in RF due to
    the large amount of noise involved

161
Typical RF system
162
RF specifications
  • Linearity specifications
  • Gain, conversion gain, output power
  • Non-linearity specifications
  • Third-order intercept (TOI), adjacent channel
    power ratio (ACPR)
  • Noise specifications
  • Noise figure (NF), signal-to-noise ratio (SNR),
    sensitivity, dynamic range
  • System specifications
  • Error-vector magnitude (EVM), bit error rate (BER)

163
A note on decibel
  • The decibel is a very commonly used unit in the
    wireless domain
  • Notation for decibel: dB
  • Any amplitude ratio N can be converted to decibels by
  • N_dB = 20 log10(N)
  • A related unit is the decibel-milliwatt (notation: dBm)
  • Used to denote power with reference to 1 mW
  • P watts of power is converted to dBm by
  • P_dBm = 10 log10(P / 1 mW)
  • Thus 1 W = 1000 mW = 30 dBm, and 10 µW = 0.01 mW = -20 dBm
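The conversions above are easy to check numerically. The following is a minimal sketch; the function names are hypothetical helpers, not part of any RF library.

```python
import math


def amplitude_ratio_to_db(ratio: float) -> float:
    """Convert an amplitude (voltage) ratio to decibels: 20 * log10(ratio)."""
    return 20.0 * math.log10(ratio)


def watts_to_dbm(power_w: float) -> float:
    """Convert absolute power in watts to dBm (decibels relative to 1 mW)."""
    return 10.0 * math.log10(power_w / 1e-3)


def dbm_to_watts(power_dbm: float) -> float:
    """Inverse conversion: dBm back to watts."""
    return 1e-3 * 10.0 ** (power_dbm / 10.0)
```

The factor is 20 for amplitude ratios and 10 for power because power is proportional to amplitude squared; both forms describe the same number of decibels for the same signal.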

164
Gain
  • Measures the small-signal gain of the
    device/system
  • Input is a single-tone stimulus within the
    operating frequency
  • Amplitude is within linear range of operation
  • Gain = (output amplitude / input amplitude),
    usually expressed in dB

[Figure: input tone of amplitude A1 at f1 and output tone of amplitude A2 at f1]
Gain = A2/A1, or 20 log10(A2/A1) in dB
165
Conversion Gain
  • Measures the small-signal gain of mixers
  • Mixers translate the input frequency to a
    different output frequency
  • Input is a single-tone stimulus within the input
    operating frequency, amplitude within linear
    range of operation
  • Gain = (output amplitude at f2 / input amplitude
    at f1), usually expressed in dB

[Figure: input tone of amplitude A1 at f1 and output tone of amplitude A2 at a different frequency f2]
Gain = A2/A1, or 20 log10(A2/A1) in dB
166
TOI
  • Measure of nonlinearity for a device/system
  • Two-tone input, within operating range
  • Frequencies are closely spaced; the difference is
    usually <1% of the device bandwidth
  • Amplitude is larger than the linear range of
    operation
  • TOI = Pout + (Pout - PIMD)/2

[Figure: two-tone input at f1 and f2 applied to the DUT; the output contains the fundamental tones at f1 and f2 (power Pout) and intermodulation tones at 2f1 - f2 and 2f2 - f1 (power PIMD)]
167
A note on TOI
  • Usually, RF devices exhibit third-order
    nonlinearity
  • y(t) = A0 + A1 x(t) + A3 x(t)^3, where A3 x(t)^3 is
    the third-order nonlinear term
  • TOI is expressed in dBm
  • It indicates the output power level at which the
    fundamental tones and intermodulation tones
    attain the same power
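A two-tone TOI measurement reduces to a small calculation once the fundamental and intermodulation powers are read off the spectrum. The sketch below assumes the standard relation TOI = Pout + (Pout - PIMD)/2 with all powers in dBm; the function names and example numbers are hypothetical.

```python
from typing import Tuple


def third_order_imd_frequencies(f1_hz: float, f2_hz: float) -> Tuple[float, float]:
    """Frequencies of the two third-order intermodulation tones next to the fundamentals."""
    return 2.0 * f1_hz - f2_hz, 2.0 * f2_hz - f1_hz


def toi_dbm(p_out_dbm: float, p_imd_dbm: float) -> float:
    """Output third-order intercept from fundamental and intermodulation powers (dBm)."""
    return p_out_dbm + (p_out_dbm - p_imd_dbm) / 2.0
```

For instance, fundamentals at -10 dBm with intermodulation tones at -50 dBm give a TOI of +10 dBm. The division by two reflects that the intermodulation tones grow 3 dB for every 1 dB of fundamental power, closing the gap at 2 dB per dB.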

168
Basics of SNR
  • SNR is an important factor for any signal
  • SNR for a known signal can be easily computed
  • SNR denotes the level of purity
  • For this sinusoid, SNR = 50 dB
  • (-23) - (-73) = 50 dB
  • This notion can be extended to any known signal

[Figure: spectrum (amplitude vs. frequency) showing the sinusoid at -23 dBm and the noise level at -73 dBm]
169
Noise figure
  • Noise figure measures the degradation of SNR of a
    signal when it passes through the DUT
  • Noise figure = SNRin/SNRout
  • This indicates the amount of noise added by the
    DUT
  • NF is usually expressed in dB (it is a ratio)
  • NF is measured using an NF meter
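Because SNR and NF are ratios, they become simple differences when expressed in dB. The sketch below works through the SNR example from the previous slide and a hypothetical noise-figure case; the function names are illustrative only.

```python
def snr_db(signal_dbm: float, noise_dbm: float) -> float:
    """SNR in dB as the difference between signal and noise power levels in dBm."""
    return signal_dbm - noise_dbm


def noise_figure_db(snr_in_db: float, snr_out_db: float) -> float:
    """Noise figure in dB: the ratio SNRin/SNRout becomes a subtraction in dB."""
    return snr_in_db - snr_out_db


def noise_factor(nf_db: float) -> float:
    """Convert a noise figure in dB back to the linear ratio SNRin/SNRout."""
    return 10.0 ** (nf_db / 10.0)
```

With the signal at -23 dBm and the noise at -73 dBm, `snr_db` returns 50 dB. If a hypothetical DUT lowered that SNR to 47 dB at its output, its noise figure would be 3 dB, a linear ratio of about 2.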

170
System-level