Evolvable Hardware Techniques for Autonomous Repair of FPGAs

1
Evolvable Hardware Techniques for Autonomous Repair of FPGAs
5 October 2003
Ronald F. DeMara, Department of Electrical and Computer Engineering,
University of Central Florida
Jason D. Lohn, Gregory A. Larchev, Computational Sciences Division,
NASA Ames Research Center
2
What is Evolvable Hardware?
Evolvable Hardware: combining two fields to enable complex
dynamic electronics applications
  • Intelligent Search: Bayesian, Simulated Annealing,
    Genetic Algorithms, Nearest Neighbor
  • Hardware Design: Amplifiers, Filters, FPGAs, Antennas
Applications
  • Automated Construction: develop electronic
    circuits by intelligent search
  • Applications: support the
    Design, Optimization, or Failure Recovery
    phases
  • Research Focus: configuration of
    Field Programmable Gate Arrays (FPGAs)
    using Genetic Algorithms (GAs), with
    applications to
    autonomous repair of permanent faults

3
Evolvable Hardware (EHW)

Conceptual Inspiration: Biological Models of Genetic
Representations and Evolutionary Principles (Science, CACM)
  • Powerful technique for multi-objective
    optimization problems, e.g. power consumption,
    weight, size, cost, speed, or reliability
  • Faster design cycle: can be used to optimize or
    repair human-generated designs
  • Excellent for difficult-to-design systems, e.g.
    adaptive systems and dynamic devices in
    unpredictable environments

4
EHW in the Big Picture
Diagram labels:
  • Intelligent Search, Machine Intelligence Techniques,
    other sub-disciplines
  • Adaptive/Soft Computing: Evolutionary Computation,
    Fuzzy Systems, Neural Networks
  • Genetic Algorithms, Cellular Automata, Simulated Annealing
  • Application Domains: Numerical Optimization,
    Mechanical Design, Evolvable FPGAs
5
Operational Flow of EHW Techniques
  • 1. Objective for the EHW procedure is specified
  • e.g. realize an 8-bit adder circuit, or program a
    digital chip to perform a function
    such as tone discrimination
  • A relative ranking, called the Fitness Function, is
    defined
  • 2. Population of alternative designs is created
  • completely at random, or seeded with hand-designed
    circuits
  • 3. Genetic Algorithm is invoked to evolve each
    alternative
  • Fitness is evaluated for each alternative using the FPGA
  • The FPGA contains programmable logic and interconnect
    resources to realize an arbitrary number of circuits
  • Genetic Operators are used to increase fitness
  • 4. Fitness Exit Criteria are checked
  • If max(fitness) < threshold, then repeat Step 3
  • 5. Best design represents the desired hardware
    configuration
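
The five steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the 8-bit `TARGET` pattern stands in for "desired circuit behaviour", the software `fitness` function stands in for evaluation on the FPGA, and the population size, mutation rate, and truncation-style parent choice are all assumptions made for the example.

```python
import random

TARGET = [1, 0, 1, 1, 0, 0, 1, 0]   # hypothetical stand-in for the desired behaviour
THRESHOLD = 1.0                      # Step 4 exit criterion (assumed: fully correct)


def fitness(design):
    """Fraction of positions matching TARGET (stand-in for testing on the FPGA)."""
    return sum(d == t for d, t in zip(design, TARGET)) / len(TARGET)


def evolve(pop_size=20, mutation_rate=0.05, max_generations=500, seed=0):
    rng = random.Random(seed)
    # Step 2: create a random population of alternative designs.
    population = [[rng.randint(0, 1) for _ in TARGET] for _ in range(pop_size)]
    for generation in range(max_generations):
        # Step 3: evaluate fitness and rank the alternatives.
        ranked = sorted(population, key=fitness, reverse=True)
        best = ranked[0]
        # Step 4: stop once max(fitness) reaches the threshold.
        if fitness(best) >= THRESHOLD:
            return best, generation
        # Step 3 (continued): genetic operators build the next population.
        parents = ranked[: pop_size // 2]
        population = [best]  # carry the best design forward unchanged
        while len(population) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(TARGET))
            child = a[:cut] + b[cut:]                      # crossover
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]                     # mutation
            population.append(child)
    return max(population, key=fitness), max_generations


# Step 5: the best design is the configuration to use.
best, generations = evolve()
print(best, fitness(best), generations)
```

In an intrinsic setup the call to `fitness` would be replaced by downloading the candidate configuration to the device and scoring its measured outputs.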

FPGA Configuration (figure): CIRCUIT INPUT passes through
logic (AND, OR, XOR, NOR) and interconnect (buses, muxes,
pass transistors) to CIRCUIT OUTPUT
  • FPGA final configuration implements the
    circuit

Example (figure): GA running on a PC platform configures a
reprogrammable Static RAM based FPGA; the PC sends the
config to the FPGA and the FPGA returns results
6
Genetic Algorithms (GAs)
  • Mechanism coarsely modeled after neo-Darwinism
    (natural selection and genetics)

GA cycle (flowchart): start; population of candidate
solutions; evaluate fitness of individuals with the fitness
function; check whether the goal is reached; selection of
parents; crossover and mutation of the parents; offspring;
replacement back into the population
7
Genetic Mechanisms
  • Guided trial-and-error search techniques using
    principles of Darwinian evolution
  • iterative selection, survival of the fittest
  • genetic operators: mutation, crossover
  • implementor must define the fitness function
  • GAs frequently use strings of 1s and 0s to
    represent candidate solutions
  • if 100101 is better than 010001, it will have more
    chance to breed and influence the future population
  • GAs cast a net over the entire solution space to
    find regions of high fitness
  • Can invoke an Elitism Operator (E1, E2)
  • guarantees that the fitness of the best individual
    never decreases from one generation to the next
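
To make "more chance to breed" concrete, here is a small sketch of fitness-proportionate (roulette-wheel) selection, one common way to bias parent choice toward fitter individuals. The slides do not say which selection scheme was used, and the fitness scores assigned to the two bitstrings below are invented for illustration.

```python
import random


def selection_probabilities(fitness_by_individual):
    """Map each individual to fitness / total fitness."""
    total = sum(fitness_by_individual.values())
    return {ind: f / total for ind, f in fitness_by_individual.items()}


def select_parent(fitness_by_individual, rng):
    """Draw one parent, weighted by fitness (roulette-wheel selection)."""
    individuals = list(fitness_by_individual)
    weights = [fitness_by_individual[ind] for ind in individuals]
    return rng.choices(individuals, weights=weights, k=1)[0]


# Hypothetical fitness scores for the two bitstrings named on the slide.
scores = {"100101": 0.75, "010001": 0.25}
probs = selection_probabilities(scores)

rng = random.Random(1)
draws = [select_parent(scores, rng) for _ in range(1000)]
print(probs)
print(draws.count("100101"), draws.count("010001"))
```

The weaker individual is still selected some of the time, which keeps diversity in the population while favouring the fitter one.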

8
GA Success Stories
  • Commercial Applications
  • Nextel: frequency allocation for cellular phone
    networks, $15M predicted savings in
    NY market
  • Pratt & Whitney: turbine engine design;
    engineer: 8 weeks,
    GA: 2 days with 3x improvement
  • International Truck: production scheduling
    improved by 90% in 5 plants
  • NASA: superior Jupiter trajectory optimization,
    antennas, FPGAs
  • Koza: 25 instances showing human-competitive
    performance, such as analog circuit design,
    amplifiers, filters

9
Representing Candidate Solutions
  • An individual can be represented using
    discrete values (binary, integer, or any other
    system with a discrete set of values)
  • Example of Binary DNA Encoding:

Individual (Chromosome)
GENE
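
As a rough illustration of the chromosome and gene labels, the sketch below splits a binary chromosome into fixed-width genes and decodes each gene to an integer. The 12-bit chromosome and the 4-bit gene width are assumptions chosen for the example, not values from the presentation.

```python
def split_genes(chromosome, gene_width):
    """Cut a bitstring chromosome into consecutive fixed-width genes."""
    if len(chromosome) % gene_width != 0:
        raise ValueError("chromosome length must be a multiple of gene_width")
    return [chromosome[i:i + gene_width]
            for i in range(0, len(chromosome), gene_width)]


def decode_gene(gene):
    """Interpret one gene as an unsigned binary integer."""
    return int(gene, 2)


chromosome = "100101110001"            # hypothetical 12-bit individual
genes = split_genes(chromosome, 4)     # ['1001', '0111', '0001']
values = [decode_gene(g) for g in genes]
print(genes, values)                   # values are [9, 7, 1]
```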
10
Genetic Operators
Population at generation t, then selection, then
reproduction, giving the population at generation t + 1
11
Crossover Operator
Population
offspring
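
A minimal sketch of single-point crossover, assuming bitstring parents of equal length. The cut position is passed in explicitly so the result is reproducible; a GA would normally draw it at random. Other crossover variants (two-point, uniform) exist, and the slide does not say which one is pictured.

```python
def single_point_crossover(parent_a, parent_b, cut):
    """Swap the tails of two equal-length parents at position `cut`."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    if not 0 < cut < len(parent_a):
        raise ValueError("cut must fall strictly inside the chromosome")
    child_1 = parent_a[:cut] + parent_b[cut:]
    child_2 = parent_b[:cut] + parent_a[cut:]
    return child_1, child_2


child_1, child_2 = single_point_crossover("111111", "000000", 2)
print(child_1, child_2)   # 110000 001111
```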
12
Mutation Operator
Figure: Boolean and Biology representations of mutation;
the chromosome 1 1 1 1 1 1 1 is shown before mutation,
and the mutated gene is marked afterwards
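
Bit-flip mutation on a Boolean chromosome can be sketched as follows. Each bit is flipped independently with probability `rate`; the rate of 0.2 and the random seed are arbitrary choices for the example.

```python
import random


def mutate(chromosome, rate, rng):
    """Return a copy in which each bit is flipped with probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in chromosome]


before = [1, 1, 1, 1, 1, 1, 1]
after = mutate(before, 0.2, random.Random(0))
flipped = [i for i, (a, b) in enumerate(zip(before, after)) if a != b]
print(before, after, flipped)
```

Because `mutate` returns a new list, the parent chromosome is left intact and can still be copied into the next generation unchanged.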
13
Visualizing GA Operation
Roadmap to animation on the next slide
14
Visualizing GA Operation
  • Individuals pass from the current population to the
    new population
  • 2 parent individuals potentially undergo crossover
  • Each individual is potentially mutated
15
EHW Environments
  • Evolvable Hardware (EHW) Environments enable
    experimental methods to research soft
    computing intelligent search techniques
  • EHW operates by repetitive reprogramming of
    real-world physical devices using an iterative
    refinement process

Two modes of Evolvable Hardware
  • Extrinsic Evolution: Genetic Algorithm with simulation
    in the loop (software model); when done, build it;
    device design-time refinement
  • Intrinsic Evolution: Genetic Algorithm with hardware
    in the loop; device run-time refinement; new approach
    to Autonomous Repair of failed devices
Application
  • Stardust Satellite: >100 FPGAs onboard
  • Hostile environment: radiation, thermal stress
  • How to achieve reliability to avoid mission failure?
16
Our Goal: Autonomous FPGA Repair
An alternative to redundancy for increased
reliability without carrying spare hardware
  • Overhead from Unutilized Spares (weight, size, power)
  • Redundancy: increases with amount of spare capacity
  • Repair: independent of number of viable spares
  • Granularity of Fault Coverage (resolution where fault
    is handled)
  • Redundancy: restricted at design-time
  • Repair: variable at recovery-time
  • Fault-Resolution Latency (availability or downtime
    required to handle fault)
  • Redundancy: based on time required to select spare
    resource
  • Repair: based on time required to find suitable repair
  • Quality of Repair (likelihood and completeness)
  • Redundancy: determined by adequacy of spares available
  • Repair: affected by multiple characteristics (+ or -)
  • Autonomous Operation (fix without outside intervention)
  • Redundancy: yes
  • Repair: yes
  • Everyday example
  • Redundancy: automobile spare tire
  • Repair: can of fix-a-flat
17
Autonomous Repair
A new approach to Autonomous Repair of failed
reprogrammable devices
  • UCF has developed an evolutionary fault-recovery
    system for FPGAs
  • Employs a genetic representation that can
    accommodate both logic and interconnect
    failures
  • Experiments were run using a Xilinx Virtex FPGA
  • Demonstrates that a complete repair of some
    combinational and sequential circuits is
    realizable
  • Contributes new evolutionary procedures for
    repair and novel insights into fault occlusion,
    resource recycling, and parameter optimization

18
Related Work
  • Evolutionary Design Techniques for FPGA
    Fault-Tolerance
  • Evolve redundancy into the design before the
    anticipated failure occurs
  • Messy Gate Approach [Miller 2001]
  • logic functions contain redundant terms as
    functional boundaries change and overlap
  • Fault-tolerant Oscillator Design [Canham and
    Tyrrell 2002]
  • designs evolved under a range of faults during
    fitness assessment
  • population-based approach with fitness function
    corresponding to operation without faults
  • additional pass evaluates tolerance to a range of
    faults
  • Design with Potentially Faulty Components
    [Thompson 1997]
  • evolution of designs with redundant capabilities
  • range of fault cases introduced
  • individuals able to exploit whatever component
    behaviors exist, even faulty ones
  • Evolutionary Fault Recovery for FPGA Fault
    Handling
  • Evolve recovery from a specific
    failure after (and if) it actually occurs
  • Evolutionary Repair of 4x4 Multiplier [Vigander
    2001]
  • attempts to restore functionality after random
    faults are injected into FPGA CLBs
  • completely correct repair not achieved, although
    excellent partial repairs were obtained
  • voting mechanism proposed using alternative
    partially repaired circuits

19
Fault-Handling Techniques for SRAM-based FPGAs
Device Failure Characteristics
Duration
Transient: SEU
Permanent: SEL, Oxide Breakdown, Electron Migration
Device Configuration
Processing Datapath
Device Configuration
Processing Datapath
Target
BIST
Evolutionary
Repetitive Readback
Approach
TMR
STARS
CED
Vigander
UCF
Methods
Supplementary Testbench
Duplex Output Comparison
Duplex Output Comparison
Detection
(not addressed)
Cartesian Intersection
Isolation
(not addressed)
Bitwise Comparison
Majority Vote
unnecessary
Fast Run-time Location
Worst-case Clock Period Dilation
Diagnosis
unnecessary
unnecessary
Population-based GA using Extrinsic Fitness Evaluation
Evolutionary Algorithm using Intrinsic Fitness Evaluation
Recovery
Replicate in Spare Resource
Select Spare Resource
Invert Bit Value
Ignore Discrepancy
20
Quadrature Decoder
  • Used in applications requiring determination of angular
    translation (or speed)
  • Example: a DC-motor drive system for a mobile
    robot that we may wish to move forward (or reverse) by
    a fixed distance
  • Decoder determines rotation direction

21
Quadrature Decoder
  • Finite state machine
  • Input and State traces
  • State transition table
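
The finite state machine can be sketched in software as a lookup on (previous sample, current sample) pairs. This is a generic quadrature decoder, not the circuit used in the experiments: the Gray-code channel order, the choice of which direction counts as forward, and the step counter are assumptions made for illustration.

```python
# Gray-code order of the (A, B) channel pair for one rotation direction
# (assumed here to be "forward").
FORWARD_SEQUENCE = [(0, 0), (0, 1), (1, 1), (1, 0)]


def build_transition_table():
    """Map (previous, current) channel pairs to +1, -1, or 0 steps."""
    table = {}
    n = len(FORWARD_SEQUENCE)
    for i, state in enumerate(FORWARD_SEQUENCE):
        table[(state, FORWARD_SEQUENCE[(i + 1) % n])] = +1   # forward step
        table[(state, FORWARD_SEQUENCE[(i - 1) % n])] = -1   # reverse step
        table[(state, state)] = 0                            # no movement
    return table


TRANSITIONS = build_transition_table()


def decode(samples):
    """Net step count; transitions where both bits change are ignored."""
    position = 0
    previous = samples[0]
    for current in samples[1:]:
        position += TRANSITIONS.get((previous, current), 0)
        previous = current
    return position


forward = [(0, 0), (0, 1), (1, 1), (1, 0), (0, 0)]
print(decode(forward))         # 4
print(decode(forward[::-1]))   # -4
```

The sign of each step gives the rotation direction; accumulating the steps gives the distance travelled.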

22
Genetic Representation
  • Representation: how we represent FPGA
    configurations in the GA
  • Goals:
  • Allow all possible LUT configurations
  • Allow all possible CLB interconnections, given
    constraints of routing support
  • Disallow illegal FPGA configurations
  • Make it easy for crossover to combine good
    configurations
  • Minimize non-coding introns (junk DNA)
  • Bitstring representation is a natural choice,
    though it may not scale well (investigating
    generative representations)
  • Representation is specific to the Xilinx Virtex FPGA

23
Genetic Representation
  • Logic bits in the LUTs
  • Routing bits specify how to connect LUT outputs
    to LUT inputs

Figure labels: CLB 0, CLB 1, ... CLB n; LUT 0, LUT 1, LUT 2, LUT 3
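
A toy model of the two kinds of bits may help: logic bits are the truth-table contents of each LUT, and routing bits choose which signals drive each LUT input. The sketch below is an illustrative simplification and not the Virtex-specific encoding from the talk. It assumes 2-input LUTs for brevity (Virtex LUTs have 4 inputs), a feed-forward ordering, and routing expressed as indices into a signal list.

```python
def lut_output(logic_bits, inputs):
    """Look up a LUT output: the inputs form an index into the logic bits."""
    index = 0
    for bit in inputs:
        index = (index << 1) | bit
    return logic_bits[index]


def evaluate(luts, primary_inputs):
    """Evaluate LUTs in order; routing picks each LUT's inputs from earlier signals."""
    signals = list(primary_inputs)
    for logic_bits, routing in luts:
        chosen = [signals[source] for source in routing]
        signals.append(lut_output(logic_bits, chosen))
    return signals[-1]


XOR = [0, 1, 1, 0]   # logic bits (truth table) of a 2-input XOR
AND = [0, 0, 0, 1]   # logic bits (truth table) of a 2-input AND

# LUT 0 = XOR of primary inputs 0 and 1.
# LUT 1 = AND of LUT 0's output (signal 3) and primary input 2.
circuit = [(XOR, [0, 1]), (AND, [3, 2])]

print(evaluate(circuit, [1, 0, 1]))   # 1
print(evaluate(circuit, [1, 1, 1]))   # 0
```

Mutating a logic bit changes one truth-table entry, while mutating a routing entry rewires a LUT input, so both logic and interconnect can be varied by the search.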
24
Experimental Setup
  • Software and Hardware Testbeds
  • ECJ
  • Xilinx JBits
  • Xilinx Virtex DS simulator
  • JBuilder Java SDK
  • Evaluation
  • Input stream of 100 bit pairs
  • Output stream of 110 bits sampled across 4 CLBs
  • Stuck-at-zero fault on CLB2 F1 slice 0
  • Fitness: percentage of correct output bits,
    taking the max across 100-bit sliding windows
    and across CLBs
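
One possible reading of this fitness measure is sketched below: for each CLB's 110-bit output stream, slide a 100-bit window over it, score each window by the fraction of bits matching the expected output, and keep the maximum over all windows and all CLBs. The interpretation of the sliding window (tolerating output latency) and all of the test data are assumptions for illustration.

```python
def window_score(window, expected):
    """Fraction of bits in the window that match the expected bits."""
    return sum(w == e for w, e in zip(window, expected)) / len(expected)


def repair_fitness(outputs_by_clb, expected):
    """Max match fraction over every CLB output stream and every window offset."""
    width = len(expected)
    best = 0.0
    for stream in outputs_by_clb:
        for offset in range(len(stream) - width + 1):
            window = stream[offset:offset + width]
            best = max(best, window_score(window, expected))
    return best


# Hypothetical data: 100 expected bits and 110-bit streams from 4 CLBs.
expected = [1 if i % 3 == 0 else 0 for i in range(100)]
delayed = [0] * 4 + expected + [0] * 6   # correct output arriving 4 samples late
stuck = [0] * 110                        # output stuck at zero
streams = [stuck, stuck, delayed, stuck]

print(repair_fitness(streams, expected))   # 1.0
print(repair_fitness([stuck], expected))   # 0.66
```

Taking the maximum means the GA is not penalized for where, or how late, the correct output appears, only for how well the best candidate stream matches.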

25
FPGA with Fault Injected
26
GA Parameters
  • Generational GA
  • Population size: 40
  • Crossover: 80%
  • Mutation: up to 0.2 per bit
  • Elitism: 2 individuals
  • Generation 0 seeding: 20 individuals seeded with the
    hand-designed Quad Decoder
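
These settings could be collected in a small configuration object, with generation 0 built from seeded and random individuals. This is a sketch only: the field names are invented, the mutation figure is copied as written on the slide (its unit is not recoverable from the transcript), the 8-bit `seed_design` is a placeholder for the hand-designed decoder configuration, and filling the remaining 20 slots at random is an assumption.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class GAParams:
    pop_size: int = 40
    crossover_rate: float = 0.80
    max_mutation_per_bit: float = 0.2   # as written on the slide; unit unclear
    elites: int = 2
    seeded: int = 20


def initial_population(params, seed_design, rng):
    """Generation 0: copies of the seed design plus random individuals (assumed)."""
    seeded = [list(seed_design) for _ in range(params.seeded)]
    randoms = [[rng.randint(0, 1) for _ in seed_design]
               for _ in range(params.pop_size - params.seeded)]
    return seeded + randoms


params = GAParams()
seed_design = [1, 0, 1, 1, 0, 0, 1, 0]   # placeholder for the hand-designed decoder
population = initial_population(params, seed_design, random.Random(0))
print(len(population), population[0], population[-1])
```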

27
Temperature map of FPGA logic cells during
evolution
HW: Xilinx Virtex XCV1000 FPGA; Ckt: Quadrature
Decoder; Exp: 3
28
Evolving a Complete Repair
Plot: fitness (elitist and average) versus generation
29
Results
  • Genetic algorithm is able to consistently find
    quad decoders operating at 100% accuracy with a
    single injected stuck-at fault
  • Out-of-sample test yields 97% accuracy (expected
    to rise as fitness test case length increases)
  • The stuck-at fault is used in the solutions found
    (GA is exploiting the fault)
  • Most runs converge after 1500-2000 circuit
    evaluations
  • Average population fitness increases until
    convergence (useful search)

30
Recent Publications
  • Evolvable Hardware Technical Papers
  • J. D. Lohn, G. Larchev, and R. F. DeMara,
    "Evolutionary Fault Recovery in a Virtex FPGA
    Using a Representation That Incorporates
    Routing," in Proceedings of the 10th
    Reconfigurable Architectures Workshop (RAW 2003),
    Nice, France, April 22, 2003.
  • J. D. Lohn, G. Larchev, and R. F. DeMara, "A
    Genetic Representation for Evolutionary Fault
    Recovery in Virtex FPGAs," in Proceedings of the
    5th International Conference on Evolvable Systems
    (ICES), Trondheim, Norway, March 17 - 20, 2003.
  • J. D. Lohn and R. F. DeMara, "A Co-evolutionary
    Genetic Algorithm for Autonomous Fault-Handling
    in FPGAs," accepted to International Conference
    on Military and Aerospace Programmable Logic
    Devices, Laurel, MD, September 10 - 12, 2002.
  • Machine Learning (EHW subcomponent) Curriculum
    and Educational
  • M. Georgiopoulos, J. Castro, A. Wu, R. DeMara,
    E. Gelenbe, A. Gonzalez, M. Kysilka, M.
    Mollaghasemi, "CRCD in Machine Learning at the
    University of Central Florida: Preliminary
    Experiences," in Proceedings of 8th Annual
    Conference on Innovation and Technology in
    Computer Science Education, University of
    Macedonia, Thessaloniki, Greece, June 30 - July
    2, 2003.
  • M. Georgiopoulos, I. Russell, J. Castro, A. Wu,
    M. Kysilka, R. DeMara, A. Gonzalez, E. Gelenbe, M.
    Mollaghasemi, "A CRCD Experience: Integrating
    Machine Learning Concepts into Introductory
    Engineering and Science Programming Courses," in
    Proceedings of 2003 American Society for
    Engineering Education (ASEE) Annual Conference
    and Exposition, Nashville, Tennessee, June 22 -
    25, 2003.

31
GA Advantages
  • Widely applicable
  • Low development costs (engineering ready)
  • Creativity: surprising solutions
  • Can be run interactively and accommodate
    user-proposed solutions
  • Provide many alternative solutions: design-time
    fault tolerance
  • Abundant intrinsic parallelism
  • Scales with Moore's Law: 10x in 5 years

32
Conclusion
  • One of the first studies to look at evolving
    interconnect for fault-recovery in FPGAs
  • Output results encouraging
  • Current work
  • Reducing execution time for autonomous recovery
  • Scaling to complex problems
  • Robustness of evolved solutions
  • On-line experiments that can safeguard the FPGA
  • Integrating Machine Learning and EHW into the UCF
    curriculum
  • EHW in EEL4851, EEL4972, EEL6763
  • Subpart of multi-year NSF CRCD Award
    (Georgiopoulos, DeMara, Gelenbe, Gonzalez,
    Kysilka, Wu)