4th HiPEAC Industrial Workshop on Compilers and Architectures


1
Arnaud Grasset, PhD, arnaud.grasset@thalesgroup.com
THALES Research & Technology, Embedded System
Lab - France
Mapping High Performance and Mission-critical
Applications to Embedded Architectures
  • Philippe Bonnot, Sami Yehia, Arnaud Grasset, Eric
    Lenormand, Gilbert Edelin

2
Aerospace, defence and security requirements
  • Embedded systems specific requirements:
  • Performance, power consumption, etc.
  • Aerospace, defence and security requirements in
    THALES:
  • Long life cycle, certification procedures,
    relatively low volumes
  • Designing and programming appropriate
    architectures is a challenge
  • Emergence of on-chip reconfigurable and parallel
    architectures
  • SPEAR: a framework for application mapping
  • Supporting new reconfigurable architectures

3
Outline
  • Embedded Systems for low-volume products
  • MORPHEUS: a multi-purpose dynamically
    reconfigurable platform
  • SPEAR: a framework for application mapping on
    the MORPHEUS platform
  • Using MORPHEUS for a motion detection application
  • Conclusion and perspectives

4
Embedded Systems for low-volume products
  • Investigates solutions to satisfy and anticipate
    critical embedded systems needs
  • NRE costs are prominent in low-volume products
    with long life cycles → need for flexibility

[Figure: the SPEAR tool sits between the application
space (diversity of THALES applications distributed in
business units) and the architectural space (diversity
of architectural solutions: MORPHEUS, Ter@ops, Ter@pix,
Cell).]
  • An application development framework is needed to:
  • Master the growing complexity of applications and
    heterogeneous architectures
  • Redeploy applications on several architectures
  • Structure software development methodologies

5
Computation intensive and flexible architectures
  • Data and/or task level parallelism
  • Ter@pix: SIMD for image processing (at DATE'08)
  • Xilinx Virtex4 SX55 FPGA
  • 128 processing elements
  • 20 GOPS peak performance
  • SIMD frequency 150 MHz
  • Power consumption 14 W
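
The Ter@pix figures above can be cross-checked with a little arithmetic. The sketch below assumes one operation per processing element per clock cycle, which the slide does not state, so the estimated peak is only a plausibility check against the quoted 20 GOPS.

```python
# Figures quoted on the slide for the Ter@pix SIMD implementation.
PROCESSING_ELEMENTS = 128
SIMD_FREQUENCY_HZ = 150e6
QUOTED_PEAK_GOPS = 20.0
POWER_W = 14.0

# Assumption (not stated on the slide): one operation per PE per clock cycle.
OPS_PER_PE_PER_CYCLE = 1

estimated_peak_gops = (
    PROCESSING_ELEMENTS * SIMD_FREQUENCY_HZ * OPS_PER_PE_PER_CYCLE / 1e9
)
quoted_gops_per_watt = QUOTED_PEAK_GOPS / POWER_W

print(f"estimated peak: {estimated_peak_gops:.1f} GOPS")
print(f"quoted efficiency: {quoted_gops_per_watt:.2f} GOPS/W")
```

Under that assumption the estimate comes out close to the quoted peak, and the quoted peak divided by the quoted power gives the GOPS/W figure of merit used later for Ter@ops.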

[Figure: Ter@pix architecture. Labels: processor(s)
embedded in FPGA; processors memory; C code; application
dependent software code; sequencing; sequencer; µ-code
memory; micro code; library of optronics operators;
stable API; PE array; DMA; image memory; dataflow; VHDL
code; optronics standard hardware platform; IR; VIS;
display1, display2, display3.]
  • Ter@ops: MPSoC for stream-processing-dominant
    applications
  • Multi-domain
  • High level of SW productivity
  • High GOPS/W (with respect to GPP)
  • Project started in 2007

[Figure: Ter@ops architecture. Labels: tiles T1..Tn of
TYPE RECONF, TYPE SIMD and TYPE GPP; SMEM1..SMEMn;
bridge; Network-on-Chip; DDR1..DDRn; I/O.]

6
Outline
  • Embedded Systems for low-volume products
  • MORPHEUS: a multi-purpose dynamically
    reconfigurable platform
  • SPEAR: a framework for application mapping on
    the MORPHEUS platform
  • Using MORPHEUS for a motion detection application
  • Conclusion and perspectives

7
MORPHEUS project
  • MORPHEUS: Multi-purpose dynamically
    Reconfigurable Platform for intensive
    Heterogeneous processing
  • EU 6th Framework Program, IST 027342
  • 3-year project (2006-2008)
  • Coordinated by THALES R&T
  • 11 industrial and 7 academic partners
  • Focus on reconfigurable computing
  • Rising complexity of embedded systems
  • Design productivity gap

[Figure: positioning of MORPHEUS among Computation
Intensive Flexible Embedded Systems. Labels: programming
efficiency; hardware flexibility; heterogeneous optimized
infrastructure; flexible domain focused platforms.]
  • GPP: low computational density, power inefficient,
    low speed
  • SoC: high NRC, low reconfigurability,
    time-to-market
  • FPGA: design inefficient, power inefficient,
    area overhead

8
MORPHEUS architecture
  • MORPHEUS is a reconfigurable platform
  • Several Reconfigurable Units
  • Several granularity levels
  • MORPHEUS is a modular and scalable platform
  • Network-on-Chip
  • Homogeneous interfaces with RU
  • Target implementation characteristics:
  • ARM9 processor and infrastructure
  • Area: 90 mm2 in 90 nm ST technology
  • Frequency: 200 MHz
  • Power consumption: 3 W
  • M2000 Embedded FPGA
  • Fine grain

9
MORPHEUS toolset
  • Proposes a software design approach for
    productivity, flexibility and performance

[Figure: MORPHEUS toolset flow. Labels: annotated C code
of the global application; description of the
accelerated function f(.); compilation-time scheduling
of accelerated functions setting and execution; run-time
scheduling of the application; ARM sub-system;
communication mechanisms; Configuration Manager;
Reconfigurable Units; configuration memory.]
  • SystemC model in Coware ModelDesigner
  • Silicon prototype

10
Outline
  • Embedded Systems for low-volume products
  • MORPHEUS: a multi-purpose dynamically
    reconfigurable platform
  • SPEAR: a framework for application mapping on
    the MORPHEUS platform
  • Using MORPHEUS for a motion detection application
  • Conclusion and perspectives

11
SPEAR Application Development Framework
Model-based Architecture Capture
Model-based Application Capture
  • Captures both application and architecture
  • Application modelled as acyclic graphs of tasks
  • Heterogeneous architecture accepted
  • Supports application partitioning/mapping,
    communication insertion, scheduling
  • Mapping is human-driven
  • Enables iterative design exploration

Mapping commands GUI
  • Code generation and performance simulation
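
The flow listed above (task graph capture, human-driven mapping, communication insertion, scheduling) can be illustrated with a minimal sketch. It is not SPEAR's implementation: the task graph, the unit names and both helper functions are hypothetical, and the schedule is simply one topological order of the acyclic graph.

```python
from graphlib import TopologicalSorter

# Hypothetical application: each task lists the tasks whose output it consumes.
predecessors = {
    "F1": [],
    "F2": ["F1"],
    "F3": ["F1"],
    "F4": ["F2", "F3"],
}

# Human-driven mapping: the designer assigns each task to a unit.
# Unit names are illustrative only.
mapping = {"F1": "ARM", "F2": "RU0", "F3": "RU1", "F4": "RU0"}


def insert_communications(predecessors, mapping):
    """Return the (producer, consumer) edges that cross units and need a transfer."""
    return sorted(
        (producer, consumer)
        for consumer, producers in predecessors.items()
        for producer in producers
        if mapping[producer] != mapping[consumer]
    )


def schedule(predecessors):
    """Return one valid execution order of the acyclic task graph."""
    return list(TopologicalSorter(predecessors).static_order())


transfers = insert_communications(predecessors, mapping)
order = schedule(predecessors)
print(transfers)
print(order)
```

Changing one entry of `mapping` and re-running is the kind of iterative exploration the slide refers to: the set of inserted transfers changes while the application graph stays untouched.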

12
A formalism for accelerated function modelling
  • Accelerated function modelled as an acyclic graph
    of tasks
  • Represents deterministic and data streaming
    applications
  • ARRAY-OL formalism based on multidimensional
    arrays
  • Tasks represent nested-loops executing an
    Elementary Function
  • C-based description of Elementary Functions
  • Linear access patterns from/to input/output array
    of data produced and consumed
  • Expression of data and task level concurrency
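
A task with linear access patterns can be sketched as follows. The formula used, element(r, i) = (origin + paving·r + fitting·i) mod array shape, is the usual ARRAY-OL style tiler; the function names, the parameter names and the 4x4 example are assumptions made for illustration and are written in Python rather than the C used for Elementary Functions.

```python
from itertools import product


def tile_indices(origin, paving, fitting, array_shape, pattern_shape, r):
    """Array indices read by repetition r under a linear access pattern."""

    def matvec(matrix, vector):
        return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

    # Reference element of the pattern for this repetition.
    ref = [o + p for o, p in zip(origin, matvec(paving, r))]
    indices = []
    for i in product(*(range(n) for n in pattern_shape)):
        offset = matvec(fitting, i)
        indices.append(
            tuple((a + b) % s for a, b, s in zip(ref, offset, array_shape))
        )
    return indices


def run_task(elementary_function, array, repetition_shape, **tiler):
    """Nested loops over the repetition space, one Elementary Function call each."""
    results = {}
    for r in product(*(range(n) for n in repetition_shape)):
        pattern = [array[y][x] for y, x in tile_indices(r=r, **tiler)]
        results[r] = elementary_function(pattern)
    return results


# Example: sum each non-overlapping 2x2 block of a 4x4 array.
array = [[4 * y + x for x in range(4)] for y in range(4)]
sums = run_task(
    sum,
    array,
    repetition_shape=(2, 2),
    origin=(0, 0),
    paving=[[2, 0], [0, 2]],
    fitting=[[1, 0], [0, 1]],
    array_shape=(4, 4),
    pattern_shape=(2, 2),
)
print(sums)
```

Every repetition is independent of the others, which is how this kind of model exposes data-level concurrency; task-level concurrency comes from independent tasks in the graph.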

[Figure: task graph F1(..), F2(FlowA, ...), F3(..), with
a for-loop nest shown for one task.]

13
SPEAR tasks mapping on MORPHEUS architecture
  • Function model captured with a Graphical User
    Interface (GUI)
  • Manual iteration-space scheduling
  • Splitting of the iteration space and loop
    permutations

[Figure: labels: two Reconfigurable Units; tasks F, G
and C; Data Exchange Buffers; Network-on-Chip; On-chip
Memory.]
  • Fusion lowers pipeline granularity → reduces
    memory needs
  • SPEAR manages tight memory resources
  • Generation of DMA transfer parameters and memory
    access patterns
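
The effect of fusion on memory needs can be shown with a small sketch. It assumes two chained tasks, here called f and g, working tile by tile; the function names and the tile data are hypothetical, and buffer occupancy is counted in tiles rather than bytes.

```python
def run_unfused(tiles, f, g):
    """Run f over every tile, then g: the whole intermediate array is buffered."""
    intermediate = [f(tile) for tile in tiles]
    peak_buffered_tiles = len(intermediate)
    return [g(tile) for tile in intermediate], peak_buffered_tiles


def run_fused(tiles, f, g):
    """Run g on each tile right after f: one intermediate tile is live at a time."""
    outputs = []
    peak_buffered_tiles = 0
    for tile in tiles:
        intermediate = f(tile)  # occupies a single exchange buffer
        peak_buffered_tiles = max(peak_buffered_tiles, 1)
        outputs.append(g(intermediate))
    return outputs, peak_buffered_tiles


tiles = [[1, 2], [3, 4], [5, 6]]
double = lambda tile: [2 * p for p in tile]

unfused_out, unfused_peak = run_unfused(tiles, double, sum)
fused_out, fused_peak = run_fused(tiles, double, sum)
print(unfused_out, unfused_peak)
print(fused_out, fused_peak)
```

Both versions produce the same results; only the amount of intermediate data held between the two tasks differs, which matters when the exchange buffers are small.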

14
Memory access patterns generation
  • Array streaming according to iteration
    scheduling
  • Generation of addressing patterns specific to
    data stream characteristics
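
One common way to express such an addressing pattern is a strided 2D transfer descriptor. The sketch below is an assumption about what generated DMA parameters could look like, not the MORPHEUS format: the `DmaDescriptor` fields and both helper functions are hypothetical. The image width and pixel size in the example match the motion detection application described later (768 pixels per line, 16 bits per pixel).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DmaDescriptor:
    """Hypothetical 2D transfer descriptor; field names are illustrative."""

    start_address: int  # byte address of the first element
    line_bytes: int     # contiguous bytes per line
    stride_bytes: int   # distance between the starts of consecutive lines
    line_count: int     # number of lines to transfer


def tile_descriptor(base, image_width, bytes_per_pixel, x0, y0, tile_w, tile_h):
    """Describe a rectangular tile of a row-major image."""
    row_bytes = image_width * bytes_per_pixel
    return DmaDescriptor(
        start_address=base + y0 * row_bytes + x0 * bytes_per_pixel,
        line_bytes=tile_w * bytes_per_pixel,
        stride_bytes=row_bytes,
        line_count=tile_h,
    )


def bursts(desc):
    """Expand a descriptor into the (address, length) bursts it stands for."""
    return [
        (desc.start_address + line * desc.stride_bytes, desc.line_bytes)
        for line in range(desc.line_count)
    ]


desc = tile_descriptor(base=0, image_width=768, bytes_per_pixel=2,
                       x0=64, y0=2, tile_w=64, tile_h=4)
print(desc)
print(bursts(desc))
```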

Producing task
Consuming task
  • 2 memory banks with ping-pong policy
  • Design of a pipeline not explicit in the
    application model
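
The ping-pong policy with two memory banks can be simulated in a few lines. This is a sequential sketch of behaviour that is concurrent in hardware: within one step the producing task writes one bank while the consuming task reads the other, so the order of the two actions inside a step does not matter. The function and parameter names are hypothetical.

```python
def ping_pong(blocks, produce, consume):
    """Stream blocks through two banks whose roles swap after every step."""
    banks = [None, None]
    outputs, trace = [], []
    for step in range(len(blocks) + 1):
        write_bank, read_bank = step % 2, (step + 1) % 2
        reading = step > 0            # nothing to read during the first step
        writing = step < len(blocks)  # nothing left to write in the last step
        if reading:
            outputs.append(consume(banks[read_bank]))
        if writing:
            banks[write_bank] = produce(blocks[step])
        trace.append((write_bank if writing else None,
                      read_bank if reading else None))
    return outputs, trace


outputs, trace = ping_pong([1, 2, 3],
                           produce=lambda block: block * 10,
                           consume=lambda data: data + 1)
print(outputs)
print(trace)  # (bank written, bank read) per step
```

The trace shows the one-step pipeline latency: the first step only fills a bank and the last step only drains one, which is the pipeline that the application model itself does not describe.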

15
Outline
  • Embedded Systems for low-volume products
  • MORPHEUS: a multi-purpose dynamically
    reconfigurable platform
  • SPEAR: a framework for application mapping on
    the MORPHEUS platform
  • Using MORPHEUS for a motion detection application
  • Conclusion and perspectives

16
MORPHEUS target applications
  • INTRACOM: broadband wireless access systems
  • Switch between IEEE 802.11a and IEEE 802.16
  • Alcatel-Lucent: network routing systems
  • Update automatically distributed within the
    network
  • Deutsche Thomson: professional video
  • Dynamic selection of algorithm to apply
  • High processing workload: up to 4096x3112 @ 24
    frames per second
  • Film grain noise reduction: 2k operations / pixel
  • THALES: intelligent cameras for homeland security
  • Need to change operating modes and very high
    computational complexity
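
To give an order of magnitude for the professional video workload, the two figures quoted above can be multiplied together. This assumes that film grain noise reduction runs on every pixel at the maximum resolution and frame rate, and that "2k" means 2000 operations; neither is stated explicitly on the slide.

```python
WIDTH, HEIGHT = 4096, 3112
FRAMES_PER_SECOND = 24
OPS_PER_PIXEL = 2_000  # assumption: "2k" read as 2000

pixels_per_second = WIDTH * HEIGHT * FRAMES_PER_SECOND
ops_per_second = pixels_per_second * OPS_PER_PIXEL

print(f"{pixels_per_second:,} pixels/s")
print(f"{ops_per_second / 1e9:.0f} GOPS")
```

Under those assumptions the requirement is in the range of several hundred GOPS.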

On PACT XPP
On PiCoGA
17
A motion detection image processing application
  • Detects movements with respect to a static
    background

18
Motion detection application captured in SPEAR
  • Image resolution: 768x576 pixels @ 25 frames per
    second
  • greyscale, 16 bits per pixel
  • SPEAR model well adapted to the application
    specification
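
A minimal software sketch of such a motion detection chain is given below. The stage names (binarization, opening, edge detection, merging) come from the SPEAR model, but their order, the threshold, the 3x3 morphological operators and the way edges are merged into the frame are assumptions made for illustration, not the implementation used in the project.

```python
def binarize(frame, background, threshold):
    """1 where a pixel differs from the static background by more than threshold."""
    return [[1 if abs(p - b) > threshold else 0 for p, b in zip(fr, br)]
            for fr, br in zip(frame, background)]


def _window(mask, y, x):
    """3x3 neighbourhood values, treating pixels outside the image as 0."""
    h, w = len(mask), len(mask[0])
    return [mask[j][i] if 0 <= j < h and 0 <= i < w else 0
            for j in range(y - 1, y + 2) for i in range(x - 1, x + 2)]


def erode(mask):
    return [[min(_window(mask, y, x)) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def dilate(mask):
    return [[max(_window(mask, y, x)) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def opening(mask):
    """Erosion followed by dilation: removes isolated noise pixels."""
    return dilate(erode(mask))


def edges(mask):
    """Morphological edge: foreground pixels that an erosion would remove."""
    eroded = erode(mask)
    return [[m - e for m, e in zip(mr, er)] for mr, er in zip(mask, eroded)]


def merge(frame, edge_mask, marker):
    """Overlay the detected edges on the original frame."""
    return [[marker if e else p for p, e in zip(fr, er)]
            for fr, er in zip(frame, edge_mask)]


def detect_motion(frame, background, threshold=10, marker=0xFFFF):
    moving = opening(binarize(frame, background, threshold))
    return merge(frame, edges(moving), marker)


# Example: a 3x3 moving object plus one noise pixel on a 7x7 static background.
background = [[0] * 7 for _ in range(7)]
frame = [[0] * 7 for _ in range(7)]
for y in range(2, 5):
    for x in range(2, 5):
        frame[y][x] = 100
frame[0][6] = 100  # isolated noise pixel, removed by the opening
result = detect_motion(frame, background)
for row in result:
    print(row)
```

The marker value 0xFFFF is the brightest value of a 16-bit greyscale pixel. Each stage only needs a small neighbourhood of its input, which is what makes a tiled, streamed mapping possible.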

[Figure: SPEAR model of the application. Labels:
Binarization, Opening, Edge detection, Merging.]

19
Application mapping
  • Image partitioning in tiles
  • Opening and edge detection: 3x3 pixel matrices
  • Merging and binarization: data organization
    aligned on other operators
  • Manual implementation on PiCoGA reconfigurable
    unit [C. Mucci, SoC 2007]
  • 342x performance speed-up with respect to a SW
    approach
  • Project status
  • Architecture defined with a SystemC model
    available
  • Tape-out planned for mid-2008
  • MORPHEUS toolset modules developed and
    integration performed
  • CDFG synthesis in progress
  • Verification on-going
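
The image partitioning in tiles mentioned under application mapping can be sketched as follows. Because opening and edge detection work on 3x3 pixel matrices, each tile has to read one extra pixel on every side (clipped at the image border). The 64x64 tile size and the function name are hypothetical; the slides do not give the tile dimensions.

```python
def partition_into_tiles(width, height, tile_w, tile_h, halo=1):
    """Split an image into tiles with the enlarged region each tile must read.

    Rectangles are (x0, y0, x1, y1) with exclusive upper bounds.
    """
    tiles = []
    for y0 in range(0, height, tile_h):
        for x0 in range(0, width, tile_w):
            x1 = min(x0 + tile_w, width)
            y1 = min(y0 + tile_h, height)
            read = (max(x0 - halo, 0), max(y0 - halo, 0),
                    min(x1 + halo, width), min(y1 + halo, height))
            tiles.append({"write": (x0, y0, x1, y1), "read": read})
    return tiles


tiles = partition_into_tiles(768, 576, tile_w=64, tile_h=64)
print(len(tiles))
print(tiles[0])
print(tiles[13])
```

The read regions of neighbouring tiles overlap by the halo width, so the DMA patterns that fetch a tile differ from those that store its result.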

[Figure: labels: Reconfigurable Unit with two local
memories.]

20
Outline
  • Embedded Systems for low-volume products
  • MORPHEUS: a multi-purpose dynamically
    reconfigurable platform
  • SPEAR: a framework for application mapping on
    the MORPHEUS platform
  • Using MORPHEUS for a motion detection application
  • Conclusion and perspectives

21
Conclusion
  • MORPHEUS aims to reduce time-to-market and
    improve flexibility
  • A modular and scalable reconfigurable
    heterogeneous architecture
  • A comprehensive toolset to fully exploit MORPHEUS
    architecture
  • A framework for application mapping on embedded
    systems architecture
  • An array-based programming model expressing both
    task and data parallelism
  • Validation in progress with a motion detection
    application
  • Application captured in SPEAR and first
    implementation results on PiCoGA
  • Toolset modules enhancements in second phase of
    the project
  • High-level synthesis in progress, optimization
    planned
  • Mapping on several Reconfigurable Units
  • Reconfigurable unit to reconfigurable unit
    communications
  • Parallel control of accelerated tasks
  • Dynamic allocation of accelerated tasks at RTOS
    level

22
Arnaud Grasset, arnaud.grasset@thalesgroup.com
THALES Research & Technology, Embedded System
Lab - France
Further information on MORPHEUS project
http://www.morpheus-ist.org
Mapping High Performance and Mission-critical
Applications to Embedded Architectures
23
MORPHEUS project objectives
  • Develop a Reconfigurable System-On-Chip in
    silicon
  • 3 different reconfigurable engines →
    multi-granular
  • Network-On-Chip → scalable hardware architecture
  • Predictive configuration manager
  • High-bandwidth memory system
  • Provide a software oriented toolset
  • C-Code for control part of applications
  • Spatial design for kernel operations
    implementation on the reconfigurable units
  • Dynamic control services as interface between
    both
  • Formal methods are proposed
  • Demonstration with four applications from
    different domains

24
Accelerated function synthesis
  • Data movements management
  • DMA parameters generation
  • Synthesis on the reconfigurable units
  • Supports generation of target specific code
  • Separation of behavior and communication concerns
  • Memory oriented task mapping
  • SPEAR (THALES R&T)
  • CDFG extraction from C code of tasks
  • Cascade (Critical Blue)
  • Architectural and physical synthesis from CDFG
  • Madeo (University of Bretagne Occidentale)