FPGA Partial Reconfiguration

1
FPGA Partial Reconfiguration
  • Presented by Abelardo Jara-Berrocal
  • HCS Research Laboratory
  • College of Engineering
  • University of Florida

April 10th, 2009
2
Outline
  • Introduction
  • Partial Reconfiguration (PR) Overview
  • Proposed Design Methodologies
  • Framework analysis
  • F4 Virtual Architecture for Partial
    Reconfiguration and Design Automation for PR
    Design

3
Introduction: Fully reconfigurable systems
[Figure: fully reconfigurable system. Labels: Battery,
FPGA, Configuration lines (enabled/disabled), System
controller, General purpose I/O, Bitstreams storage
(Config 1, Config 2, Config 3), Required design, Shared
memory, External I/O, Config 1 Request, Config 2 Request,
Design station]
1. Device too small for complex designs
2. Big full bitstreams (long reconfiguration time)
3. Complete system operation is halted prior to
reconfiguration
4
Introduction: Modular Reconfiguration
  • Types of Modular Dynamic Reconfiguration
      • Static Partial Reconfiguration: reconfiguring a
        portion of the device (changing its
        functionality) while the device is inactive,
        without affecting other areas of the device
      • Dynamic Partial Reconfiguration (DPR):
        reconfiguring a portion of the device while the
        remaining design is still active and operating,
        without affecting the remaining portion of the
        device
  • Virtex-4 and Virtex-5 devices support DPR

[Figure labels: Reconfigurable region 1,
Reconfigurable region 2]
5
Partial Reconfiguration
  • Partial Reconfiguration is useful for systems
    with multiple functions that can time-share the
    same FPGA resources.
  • TERMINOLOGY
  • Reconfigurable Region (PRR)
  • Reconfigurable Module (PRM)
  • Static Logic
  • Bus Macro
  • Partial Bitstream
  • Merged Bitstream

6
Introduction: A sample PR architecture
[Figure: sample PR architecture. Labels: Battery, FPGA,
JTAG, Base system configuration, Bitstreams storage,
External I/O, Reconfigurable area, Static area,
Module A request, enabled/disabled markers]
1. System controller does not need to be placed
in an external device
2. Access to the fast Internal Configuration Access
Port (ICAP: 32 bits, 100 MHz)
3. Smaller partial bitstreams
4. No need to halt the complete system when
reconfiguring a module
5. Time multiplexing of FPGA resources: load and
unload HW modules on demand
7
Medium for Partial Reconfiguration
  • External: JTAG, UART (RS232)
  • Internal: ICAP
  • ICAP (Internal Configuration Access Port)
      • Self-reconfiguration controlled by a soft processor
      • Internal read and write access to the
        configuration logic
      • Faster than the external options
  • HWICAP (provided by Xilinx)
      • Wraps the ICAP with additional logic to read and
        write frames to BRAM
      • Slave to the PLB (Processor Local Bus)
      • 100 MHz, 32 bits
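
The 32-bit, 100 MHz ICAP figures above imply an upper bound on configuration throughput, and therefore a lower bound on reconfiguration time for a given partial bitstream. The sketch below works through that arithmetic. It is a rough estimate under stated assumptions: one full word is transferred every clock cycle, and bus, processor, and storage overheads are ignored. The function names and the example bitstream sizes are hypothetical and do not come from the slides.

```python
def icap_peak_bytes_per_second(width_bits: int = 32, clock_hz: float = 100e6) -> float:
    """Theoretical peak ICAP throughput, assuming one word per clock cycle."""
    return (width_bits / 8) * clock_hz


def reconfiguration_time_ms(bitstream_bytes: float,
                            width_bits: int = 32,
                            clock_hz: float = 100e6) -> float:
    """Lower bound on the time to push a bitstream through the ICAP, in milliseconds."""
    seconds = bitstream_bytes / icap_peak_bytes_per_second(width_bits, clock_hz)
    return seconds * 1e3


if __name__ == "__main__":
    # Hypothetical partial bitstream sizes in bytes, chosen only for illustration.
    for size_bytes in (100_000, 400_000, 1_600_000):
        print(f"{size_bytes} bytes: {reconfiguration_time_ms(size_bytes):.3f} ms")
```

Measured reconfiguration times are expected to be longer than this bound, since the bitstream must first be fetched from storage and moved across the bus to the HWICAP.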

8
Additional considerations
  • General benefits from DPR
      • Saves space on the FPGA
      • Less time is needed to change only a part of the
        design
      • Reduction of power dissipation by storing
        functionality in external memory
      • Smaller FPGAs can be used to run an application
      • Architecture adaptation
  • Architecture adaptability
      • Main advantage: the system can modify its
        internal modules based on two schemes
      • Data-driven: characteristics of the input data
        change at runtime
          • Artificial intelligence, evolutionary
            architectures, adaptive signal processing
      • Situation-driven: the system loads/unloads
        modules to adapt to environmental conditions
          • Adaptive fault tolerance, intelligent
            management of system resources

9
Bus Macros
  • Bus macros: means of communication between PRMs
    and the static design
  • All connections between PRMs and the static design
    must pass through a bus macro, with the exception
    of a clock signal
  • Types of bus macros
      • Tri-state buffer (TBUF) based bus macros
      • Slice-based (or LUT-based) bus macros
  • Advantage of slice-based bus macros
      • No signal lines should cross the border in
        partial reconfiguration
      • TBUFs will ignore the boundaries
      • Slice-based signals do not cross boundaries
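
The rule above (every connection between a PRM and the static design passes through a bus macro, clocks excepted) can be expressed as a simple design-rule check. The sketch below is illustrative only: the `Net` record and the `boundary_violations` function are hypothetical names, not part of any Xilinx tool, and a real netlist carries far more detail than this model.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Net:
    """Hypothetical, simplified view of one connection in the design."""
    name: str
    source_region: str           # "static" or the name of a PRR
    sink_region: str             # "static" or the name of a PRR
    via_bus_macro: bool = False  # True if the net is routed through a bus macro
    is_clock: bool = False       # Clock signals are exempt from the rule


def boundary_violations(nets: List[Net]) -> List[str]:
    """Return the names of nets that cross a region boundary without a bus macro."""
    return [
        net.name
        for net in nets
        if net.source_region != net.sink_region
        and not net.is_clock
        and not net.via_bus_macro
    ]


if __name__ == "__main__":
    design = [
        Net("clk", "static", "PRR1", is_clock=True),
        Net("data_in", "static", "PRR1", via_bus_macro=True),
        Net("data_out", "PRR1", "static"),    # crosses the boundary directly
        Net("internal", "PRR1", "PRR1"),      # stays inside one region
    ]
    print(boundary_violations(design))
```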

10
LUT-based Slice Macros
11
Introduction: Current PR Design Flow
  • Steps
      • Partition the system into modules
      • Define static modules and reconfigurable modules
      • Decide the number of PR regions (PRRs)
      • Decide PRR sizes, shapes and locations
      • Map modules to PRRs
      • Define PRR interfaces, instantiate slice macros
        for PRR interfaces
  • Many manual steps
      • Design partitioning
      • Number of PRRs
      • PRR sizes, shapes and locations
      • Mapping PRMs to PRRs
      • Type and placement of PRR interfaces

[Figure: current PR design flow. Labels: Design
partitioning, Design floorplanning and budgeting,
Static modules, Reconfigurable Modules (PRMs), FPGA,
Static region, 1, 2, # of PRRs?]
12
Introduction: Early Access PR Design Flow
  • Introduced by Xilinx at FPL 2006
  • Major improvements
      • Automatic implementation scripts
      • Rectangular regions (not full-column
        reconfiguration)
      • Static nets can cross reconfigurable regions
      • Slice macros replace bus macros
  • Partitioning and floorplanning steps are manually
    executed
  • Design guidelines for these steps are not
    provided

[Figure: Early Access PR design flow. Labels: Placement
and PRR constraints, Reconfigurable design
specifications, PRM Bitstreams, Xilinx PR Implementation
Flow, Design floorplanning and budgeting, Design
partitioning, (manual), Full Initial Bitstream,
(automatic), Potential for development of automatic
CAD tools]
13
Introduction: Current PR design tool limitations
  • PR design is a very specialized task
  • Only a physical level of support is provided
  • Architectural knowledge of the target device is a
    must
  • Not very flexible, many design constraints
  • Partitioning and floorplanning steps are manually
    executed
  • No performance-sensitive design guidelines are
    provided
  • No automatic heuristics-based design flow is
    available either
  • Lack of abstraction from low-level details

14
PR Overview: Taxonomy of PR systems design flows
PR Designs
Special purpose
  • Highly specialized systems design
  • All PRMs that will exist on the system are known
    at design time
  • Each PRR is independently optimized (size, shape,
    location, interface) based on the PRMs that will
    be mapped to it
  • Output is
      • Floorplan defining a static region and a set of
        optimized PRRs
      • The set of PRMs that can be placed in each PRR
        (PRMs to PRRs mapping)
Multipurpose
  • Not optimized for a specific application
  • PRMs required by the application are not known
    when designing the base system
  • Goal is to design a flexible and reusable base
    design that can be used for several different PR
    systems
  • Base system designer defines a set of PRRs with
    fixed shapes, sizes, locations and interfaces
  • Generated floorplan is used as an input template
    for PRM implementation
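
In the special-purpose flow, each PRR is sized from the PRMs that will be mapped to it. One simple way to read that requirement is that a PRR must offer, for every resource type, at least the largest demand among its PRMs. The sketch below applies that reading to the slice and RAMB16 counts reported on the Framework analysis slides. Mapping all three cores to a single PRR is an assumption made purely for illustration, and `size_prr` is a hypothetical helper, not the authors' tool.

```python
from typing import Dict


def size_prr(prm_requirements: Dict[str, Dict[str, int]]) -> Dict[str, int]:
    """Return the per-resource maximum over all PRMs mapped to one PRR."""
    sized: Dict[str, int] = {}
    for needs in prm_requirements.values():
        for resource, amount in needs.items():
            sized[resource] = max(sized.get(resource, 0), amount)
    return sized


# Slice and RAMB16 counts as listed on the Framework analysis slides.
# Assumption: all three cores time-share the same PRR.
test_cores = {
    "beamforming": {"slices": 5022, "ramb16": 17},
    "cfar": {"slices": 2610, "ramb16": 34},
    "aes": {"slices": 3634, "ramb16": 4},
}

if __name__ == "__main__":
    print(size_prr(test_cores))
```

The result is only a resource lower bound. It says nothing about shape, location, or interface placement, which the slides treat as separate optimization steps.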

15
PRR Geometries
  • PR system design flows require
      • Proper metrics for PRR performance analysis
      • Design guidelines for efficient PRR floorplanning
  • Study of the effects of varying PRR shape on
      • Maximum clock frequency
      • Partial bitstream size
  • Five separate test cores
      • Beamforming (DSP/slice)
      • CFAR (slice/memory)
      • AES (register)
  • Performed on V4SX55 thus far

Aspect ratio = PRR height / PRR width
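
To make the aspect-ratio definition concrete, the sketch below enumerates rectangular PRR shapes that hold a given number of slices and reports height / width for each. It is a simplified model under stated assumptions: dimensions are counted in CLBs, `slices_per_clb` defaults to 4, the `max_height` and `max_width` limits are hypothetical, and DSP and block RAM columns are ignored. `candidate_prr_shapes` is an illustrative name, not part of the study.

```python
import math
from typing import List, Tuple


def candidate_prr_shapes(slices_needed: int,
                         slices_per_clb: int = 4,
                         max_height: int = 64,
                         max_width: int = 64) -> List[Tuple[int, int, float]]:
    """List (height, width, aspect_ratio) for rectangles that fit the slice demand.

    For each height, the narrowest width that still provides enough CLBs is chosen.
    Heights and widths are in CLBs; aspect_ratio is height / width.
    """
    clbs_needed = math.ceil(slices_needed / slices_per_clb)
    shapes = []
    for height in range(1, max_height + 1):
        width = math.ceil(clbs_needed / height)
        if width <= max_width:
            shapes.append((height, width, height / width))
    return shapes


if __name__ == "__main__":
    # Small hypothetical demand so the full list stays readable.
    for height, width, ratio in candidate_prr_shapes(16, max_height=4, max_width=4):
        print(f"{height} x {width} CLBs, aspect ratio {ratio:.2f}")
```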
16
Framework analysis: Beamforming (125 MHz, 40)
  • 5022 slices
  • 16 DSP48s
  • 17 RAMB16s
  • Baseline, non-PR performance: 1614 kB, 127.845
    MHz

[Plots: clock frequency (MHz) versus aspect ratio;
bitstream size (kB) versus aspect ratio]
17
Framework analysis: CFAR (100 MHz, 16)
  • 2610 slices
  • 2 DSP48s
  • 34 RAMB16s
  • Baseline, non-PR performance: 1001 kB, 103.616
    MHz

[Plots: clock frequency (MHz) versus aspect ratio;
bitstream size (kB) versus aspect ratio]
18
Framework analysis: AES (80 MHz, 13.75)
  • 3634 slices
  • 3943 registers
  • 4 RAMB16s
  • Baseline, non-PR performance: 1393 kB, 80.483
    MHz

[Plots: bitstream size (kB) versus aspect ratio;
clock frequency (MHz) versus aspect ratio]
19
F4 Virtual Architecture and Design Automation
for Partial Reconfiguration
  • Dr. Ann Gordon-Ross
  • Dr. Alan D. George
  • UF ECE Faculty

  • Abelardo Jara
  • Shaon Yousuft
  • Rohit Kumar
  • Terence Frederick
  • CHREC Students
20
Approach
  • Task 3: Bitstream Relocation
      • Port Bit Reloc to Microblaze
      • Context save and restore for PRMs

PR for Application Designers
  • Task 2: PR Design Flow Automation
      • Framework to model and design PR systems
      • Identification of points in the Xilinx PR Design
        Flow amenable to automation
      • Software tools (C/C++ programs/scripts) for
        automatable steps
  • Task 1: VA for PR Adaptive Embedded Systems
      • SCORES Inter-module Communication Architecture
      • VAPRES Multipurpose Base Embedded Platform
      • Initial research on fast algorithms for online
        PRM placement and scheduling

21
Background: VA for Adaptive PR Embedded Systems
  • Multi-purpose base system platform to build
    runtime-adaptive HW processing embedded systems
  • Architectural support for on-demand HW module
    loading/unloading
  • HW modules can offer better performance than SW
    modules
  • Exploit increased parallelism
  • Main bottleneck
      • Inter-module communication flows through a
        centralized controller
      • Can be alleviated by adding a custom inter-module
        communication architecture
  • VA benefits
      • Adaptive base system platform
      • Response to environmental changes
      • HW/SW partitioned applications
      • Time-shared virtual resources enable a larger
        available area for system operations
      • Improved system resource utilization
  • Case study application: PR for Mobile Agents

[Figure: mobile agents case study. Labels: Target A,
Target B, Adaptive embedded system at each processing
node, Type A target, Type B target, External memory,
Type A module (two instances), Free slot, Controller
and peripherals, SCORES, Type B module, VAPRES]
22
VAPRES
(Virtual Architecture for Partially
Reconfigurable Adaptive Embedded Systems)
[Figure: VAPRES block diagram. Labels: Microblaze, USB,
Shared memory, Network (other VAPRES nodes), Fast
Simplex Link (FSL), UART, PLB Bus, Flash controller,
PRR1, PRR2, PRR3, PRR4, PRM A, BUFR, ICAP, Network,
Interface (four instances), Switch, Network-on-chip
(SCORES)]
  • VAPRES Motivations/Benefits
      • Embedded base architecture for multi-purpose PR
        systems
      • Facilitates dynamic HW module placement and
        scheduling
      • Provides dynamic module frequency scaling
      • Computing power can be distributed among
        VAPRES-based nodes
  • VAPRES Architectural Components
      • Partially Reconfigurable Regions (PRRs)
          • Independently clocked using BUFRs
          • PR modules (PRMs) can span multiple PRRs
      • Controlling agent (Microblaze)
          • Dynamic module placement and scheduling
          • Module control and context save/restore
          • Partial reconfiguration through ICAP
          • Communication with other VAPRES nodes
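
Since PRMs can span multiple PRRs and the controlling agent handles dynamic placement, a placement routine has to find enough free PRRs for each incoming module. The sketch below shows one possible first-fit approach. It is not the VAPRES algorithm: it assumes that PRRs are arranged in a row, that a spanning PRM needs adjacent PRRs, and that occupancy can be tracked in a plain list. The `place_prm` name and its parameters are hypothetical.

```python
from typing import List, Optional


def place_prm(prr_occupancy: List[Optional[str]],
              prm_name: str,
              prrs_needed: int) -> Optional[int]:
    """Place a PRM into the first run of adjacent free PRRs (first fit).

    prr_occupancy holds the PRM name occupying each PRR, or None if it is free.
    On success the list is updated in place and the index of the first PRR used
    is returned. If no run is long enough, the list is unchanged and None is
    returned.
    """
    if prrs_needed < 1:
        raise ValueError("prrs_needed must be at least 1")
    run_start, run_length = 0, 0
    for index, occupant in enumerate(prr_occupancy):
        if occupant is not None:
            run_length = 0  # A busy PRR breaks the current run of free PRRs.
            continue
        if run_length == 0:
            run_start = index
        run_length += 1
        if run_length == prrs_needed:
            for slot in range(run_start, run_start + prrs_needed):
                prr_occupancy[slot] = prm_name
            return run_start
    return None


if __name__ == "__main__":
    prrs = ["PRM A", None, None, None]  # Four PRRs, the first one occupied.
    print(place_prm(prrs, "PRM B", 2), prrs)
```

A real controller would follow a successful placement with the actual reconfiguration through the ICAP, and would also need an eviction or scheduling policy for the case where no free run exists.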

23
Background: Current Application PR Design Flow
  • PR is a very powerful feature of Xilinx FPGAs,
    but requires specialized skills
  • Manual steps
      • Partition the application into modules
      • Define static modules and partially
        reconfigurable modules (PRMs)
      • Determine the number of PR regions (PRRs)
      • Determine PRR sizes, shapes, and locations
        (resource allocation)
      • Map PRMs to PRRs
      • Define PRR interfaces and instantiate slice
        macros for PRR interfaces
  • Automatable points and optimization problems
    (design-time)
      • Design partitioning
      • Number of PRRs
      • PRR sizes, shapes, and locations
      • Mapping PRMs to PRRs
      • Type and placement of PRR interfaces
      • Reconfiguration schedule
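
As an illustration of why "Mapping PRMs to PRRs" lends itself to automation, the sketch below computes, for each PRM, the PRRs whose resources cover its requirements. This is only a feasibility filter, not an optimizer. The function name, the resource keys, and every number in the example are hypothetical.

```python
from typing import Dict, List


def compatible_prrs(prm_needs: Dict[str, Dict[str, int]],
                    prr_capacities: Dict[str, Dict[str, int]]) -> Dict[str, List[str]]:
    """Map each PRM name to the PRRs that provide every resource it needs."""
    mapping: Dict[str, List[str]] = {}
    for prm, needs in prm_needs.items():
        mapping[prm] = [
            prr
            for prr, capacity in prr_capacities.items()
            # A resource type missing from a PRR counts as zero capacity.
            if all(capacity.get(resource, 0) >= amount
                   for resource, amount in needs.items())
        ]
    return mapping


if __name__ == "__main__":
    # Hypothetical PRMs and PRRs, for illustration only.
    prms = {
        "filter": {"slices": 800, "ramb16": 4},
        "fft": {"slices": 2000, "dsp48": 8},
    }
    prrs = {
        "PRR1": {"slices": 1000, "ramb16": 8},
        "PRR2": {"slices": 2500, "ramb16": 8, "dsp48": 16},
    }
    print(compatible_prrs(prms, prrs))
```

A design-time tool could feed a table like this into the remaining optimization problems, such as choosing the number of PRRs or building a reconfiguration schedule.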

[Figure: application PR design flow. Labels: Design
partitioning, Design floorplanning and budgeting,
Static modules, Reconfigurable Modules (PRMs),
# of PRRs?, 1, 2, FPGA, Static region, Potential for
automation through C/C++ programs or scripts]
24
Questions