Title: 4th HiPEAC Industrial Workshop on Compilers and Architectures
1. Mapping High Performance and Mission-critical Applications to Embedded Architectures
Arnaud Grasset, PhD, arnaud.grasset@thalesgroup.com
THALES Research & Technology, Embedded System Lab - France
- Philippe Bonnot, Sami Yehia, Arnaud Grasset, Eric Lenormand, Gilbert Edelin
2. Aerospace, defence and security requirements
- Embedded systems specific requirements
- Performance, power consumption, etc.
- Aerospace, defence and security requirements in THALES
- Long life cycle, certification procedures, relatively low volumes
- Designing and programming appropriate architectures is a challenge
- Emergence of on-chip reconfigurable and parallel architectures
- SPEAR: A framework for application mapping
- Supporting new reconfigurable architectures
3. Outline
- Embedded Systems for low-volume products
- MORPHEUS: A multi-purpose dynamically reconfigurable platform
- SPEAR: A framework for application mapping on MORPHEUS platform
- Using MORPHEUS for a motion detection application
- Conclusion and perspectives
4. Embedded Systems for low-volume products
- Investigates solutions to satisfy and anticipate critical Embedded Systems needs
- NRE costs prominent in low-volume products, long life cycle
- Flexibility
[Figure: Application Space (diversity of THALES applications distributed in Business Units); SPEAR tool; Architectural Space (diversity of architectural solutions: MORPHEUS, Ter@ops, Ter@pix, Cell)]
- Application Development Framework is needed to
- Master growing complexity of application and heterogeneous architectures
- Redeploy applications on several architectures
- Structure software development methodologies
5. Computation intensive and flexible architectures
- Data and/or task level parallelism
- Ter@pix: SIMD for image processing (at DATE08)
- Xilinx Virtex4 SX55 FPGA
- 128 processing elements
- 20 GOPS peak performance
- SIMD frequency 150 MHz
- Power consumption 14 W
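As a rough consistency check of the Ter@pix figures above, the sketch below recomputes the peak throughput from the number of processing elements and the SIMD frequency. It assumes one operation per processing element per clock cycle, which is not stated on the slide; the function name is illustrative.

```python
# Back-of-the-envelope check of the Ter@pix figures quoted above.
# Assumption (not from the slide): one operation per PE per clock cycle.
PROCESSING_ELEMENTS = 128
SIMD_FREQUENCY_HZ = 150e6
QUOTED_PEAK_GOPS = 20.0
POWER_W = 14.0


def peak_gops(num_pe, freq_hz, ops_per_cycle=1.0):
    """Peak throughput in giga-operations per second."""
    return num_pe * freq_hz * ops_per_cycle / 1e9


estimated = peak_gops(PROCESSING_ELEMENTS, SIMD_FREQUENCY_HZ)
efficiency = QUOTED_PEAK_GOPS / POWER_W

print(f"estimated peak: {estimated:.1f} GOPS")
print(f"quoted efficiency: {efficiency:.2f} GOPS/W")
```

Under that assumption the estimate is 19.2 GOPS, in line with the quoted 20 GOPS peak, and the quoted figures correspond to about 1.43 GOPS/W.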
[Figure: Ter@pix architecture. Labels: application dependent software code (C code) on processor(s) embedded in FPGA, with processor memory; sequencing IP; stable API; sequencer with µ-code memory (micro code, library of optronics operators); array of PEs; DMAs; image memory; dataflow; IR, VIS; display1, display2, display3; optronics standard hardware platform (VHDL code)]
- Ter@ops: MPSoC for stream processing dominant applications
- Multi-domain
- High level of SW productivity
- High GOPS/W (with respect to GPP)
- Project started in 2007
[Figure: Ter@ops architecture. Labels: T1 to Tn of TYPE RECONF, TYPE SIMD and TYPE GPP; SMEM1 to SMEMn; bridge; Network-on-Chip; DDR1 to DDRn; I/O]
7. MORPHEUS project
- MORPHEUS: Multi-purpose dynamically Reconfigurable Platform for intensive Heterogeneous processing
- EU 6th Framework Program, IST 027342
- 3-year project (2006-2008)
- Coordinated by THALES R&T
- 11 industrial and 7 academic partners
- Focus on reconfigurable computing
- Rising complexity of embedded systems
- Design productivity gap
[Figure: positioning of MORPHEUS (flexible domain focused platforms; Computation Intensive Flexible Embedded Systems) with respect to existing solutions. Labels: programming efficiency; heterogeneous optimized infrastructure; hardware flexibility]
- GPP: low comp. density, power inefficient, low speed
- SoC: high NRC, low reconfigurability, time-to-market
- FPGA: design inefficient, power inefficient, area overhead
8. MORPHEUS architecture
- MORPHEUS is a reconfigurable platform
- Several Reconfigurable Units
- Several granularity levels
- MORPHEUS is a modular and scalable platform
- Network-on-Chip
- Homogeneous interfaces with RU
- Target implementation characteristics
- ARM9 processor and infrastructure
- Area 90 mm² / 90 nm ST technology
- Frequency 200 MHz
- Power consumption 3 W
- M2000 Embedded FPGA
- Fine grain
9. MORPHEUS toolset
- Proposes a software design approach for productivity, flexibility, performance
[Figure: toolset flow. Labels: annotated C code of global application containing a call f(.); description of accelerated function; compilation-time scheduling of accelerated functions setting and execution; run-time scheduling of the application; ARM sub-system; communication mechanisms; Configuration Manager; Reconfigurable Units; configuration memory]
- SystemC model in Coware ModelDesigner
- Silicon prototype
11. SPEAR Application Development Framework
[Figure labels: Model-based Architecture Capture; Model-based Application Capture; Mapping commands GUI]
- Captures both application and architecture
- Application modelled as acyclic graphs of tasks
- Heterogeneous architecture accepted
- Supports application partitioning/mapping, communication insertion, scheduling
- Mapping is human-driven
- Enables iterative design exploration
- Code generation and performance simulation
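The flow listed above (acyclic task graph, human-driven mapping, communication insertion, scheduling) can be pictured with a small sketch. This is not the SPEAR tool or its API: the task names, the unit names and both helper functions are hypothetical, and the scheduling is a plain topological sort from the Python standard library.

```python
from graphlib import TopologicalSorter

# Hypothetical application graph: task -> set of tasks it depends on.
task_graph = {
    "read_frame": set(),
    "filter": {"read_frame"},
    "detect": {"filter"},
    "overlay": {"read_frame", "detect"},
}

# Human-driven mapping: the designer assigns each task to a processing unit.
# Unit names are illustrative only.
mapping = {
    "read_frame": "ARM",
    "filter": "RU0",
    "detect": "RU1",
    "overlay": "ARM",
}


def schedule(graph):
    """Return one valid execution order (raises CycleError if not acyclic)."""
    return list(TopologicalSorter(graph).static_order())


def inserted_communications(graph, mapping):
    """List the edges whose endpoints sit on different units.

    Each such edge needs an explicit transfer (for example a DMA copy).
    """
    return sorted(
        (src, dst)
        for dst, deps in graph.items()
        for src in deps
        if mapping[src] != mapping[dst]
    )


order = schedule(task_graph)
transfers = inserted_communications(task_graph, mapping)
print(order)
print(transfers)
```

Changing one entry of `mapping` and re-running gives a different set of transfers, which is the kind of iterative exploration the slide refers to.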
12. A formalism for accelerated function modelling
- Accelerated function modelled as an acyclic graph of tasks
- Represents deterministic and data streaming applications
- ARRAY-OL formalism based on multidimensional arrays
- Tasks represent nested-loops executing an Elementary Function
- C-based description of Elementary Functions
- Linear access patterns from/to input/output array of data produced and consumed
- Expression of data and task level concurrency
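A minimal sketch of what a task with linear access patterns looks like, in the spirit of ARRAY-OL: each repetition of the nested loop reads a small pattern whose array indices are an affine function of the loop counters, taken modulo the array shape. The names `tile_indices` and `elementary_sum`, the matrices and the array sizes are illustrative assumptions, not taken from the talk.

```python
from itertools import product


def tile_indices(origin, paving, fitting, rep, pattern_shape, array_shape):
    """Array indices read by repetition `rep` of a task.

    Every index is a linear (affine) function of the loop counters:
        index = (origin + paving * rep + fitting * i) mod array_shape
    where `i` ranges over the pattern (the elementary function's input).
    Matrices are given as lists of columns.
    """
    dims = len(array_shape)
    indices = []
    for i in product(*(range(n) for n in pattern_shape)):
        idx = []
        for d in range(dims):
            v = origin[d]
            v += sum(paving[k][d] * rep[k] for k in range(len(rep)))
            v += sum(fitting[k][d] * i[k] for k in range(len(i)))
            idx.append(v % array_shape[d])
        indices.append(tuple(idx))
    return indices


def elementary_sum(values):
    """Hypothetical Elementary Function: sum of its input pattern."""
    return sum(values)


array = [[r * 6 + c for c in range(6)] for r in range(4)]  # 4x6 input array
origin, paving, fitting = (0, 0), [(1, 0), (0, 3)], [(0, 1)]

# The task is the nested loop over the repetition space
# (4 rows, 2 patterns of 3 consecutive elements per row).
result = [
    [
        elementary_sum(
            array[r][c]
            for r, c in tile_indices(origin, paving, fitting, (r0, r1), (3,), (4, 6))
        )
        for r1 in range(2)
    ]
    for r0 in range(4)
]
print(result)
```

Because the repetitions are independent of each other, they can run in any order or in parallel, which is how data-level concurrency is expressed in this style of model.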
[Figure: tasks F1(..), F2(FlowA, ...) and F3(..), with a nested-loop fragment "for(i0i" truncated in the source]
13. SPEAR tasks mapping on MORPHEUS architecture
- Function model captured with a Graphical User Interface (GUI)
- Manual iterations space scheduling
- Splitting iterations space and loop permutations
[Figure labels: Reconfigurable Unit (x2); F, G, C; Data Exchange Buffers; Network-on-Chip; On-chip Memory]
- Fusion lowers pipeline granularity → reduces memory needs
- SPEAR manages tight memory resources
- Generation of DMA transfers parameters and memory access patterns
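The effect of fusion on memory needs can be illustrated with a toy producer/consumer pair. The sketch below is an assumption-laden illustration, not SPEAR output: `f` and `g` are hypothetical elementary functions, and the tile size of 64 is arbitrary. It compares the peak size of the intermediate buffer when the two tasks run one after the other against a fused, tile-by-tile execution.

```python
def f(x):
    """Hypothetical producer elementary function."""
    return x + 1


def g(x):
    """Hypothetical consumer elementary function."""
    return 2 * x


def run_unfused(data):
    """Run F over the whole array, then G: the full intermediate is buffered."""
    intermediate = [f(x) for x in data]
    peak = len(intermediate)
    return [g(x) for x in intermediate], peak


def run_fused(data, tile):
    """Split the iteration space into tiles and run F then G on each tile.

    Only one tile of intermediate data is live at a time.
    """
    out, peak = [], 0
    for start in range(0, len(data), tile):
        intermediate = [f(x) for x in data[start:start + tile]]
        peak = max(peak, len(intermediate))
        out.extend(g(x) for x in intermediate)
    return out, peak


data = list(range(1024))
ref, peak_unfused = run_unfused(data)
fused, peak_fused = run_fused(data, tile=64)
print(peak_unfused, peak_fused)
```

Both versions compute the same result; the fused one only ever holds one tile of intermediate data, which matters when the data exchange buffers are small.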
14. Memory access patterns generation
- Array streaming according to iterations scheduling
- Generation of addressing patterns specific to data stream characteristics
[Figure labels: Producing task; Consuming task]
- 2 memory banks with ping-pong policy
- Design of a pipeline not explicit in the application model
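A ping-pong policy over two memory banks can be sketched as follows. This is a sequential simulation under simple assumptions (one tile per step, hypothetical `produce` and `consume` callables), not the generated MORPHEUS code: at each step the producing task fills one bank while the consuming task reads the other, and the roles swap at the next step.

```python
def ping_pong(tiles, produce, consume):
    """Stream tiles through two memory banks with a ping-pong policy.

    At step k the producer fills bank k % 2 while the consumer drains the
    other bank, which was filled at step k - 1. The two accesses never touch
    the same bank in the same step, so on hardware they can overlap.
    """
    banks = [None, None]
    results = []
    trace = []  # (step, bank written, bank read) for inspection
    for k in range(len(tiles) + 1):
        write_bank = k % 2
        read_bank = 1 - write_bank
        # Consumer side: data from the previous step, if any.
        if k > 0:
            results.append(consume(banks[read_bank]))
        # Producer side: next tile, if any.
        if k < len(tiles):
            banks[write_bank] = produce(tiles[k])
        trace.append((k,
                      write_bank if k < len(tiles) else None,
                      read_bank if k > 0 else None))
    return results, trace


tiles = [[1, 2], [3, 4], [5, 6]]
results, trace = ping_pong(tiles, lambda t: [x * x for x in t], sum)
print(results)
print(trace)
```

The trace shows the pipeline that is implicit in the application model: a fill-only first step, overlapped steps in the middle, and a drain-only last step.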
16. MORPHEUS target applications
- INTRACOM: Broadband wireless access systems
- Switch between IEEE 802.11a and IEEE 802.16
- Alcatel-Lucent: Network routing systems
- Update automatically distributed within the network
- Deutsche Thomson: Professional video
- Dynamic selection of algorithm to apply
- High processing workload: up to 4096x3112 @ 24 frames per second
- Film grain noise reduction: 2k operations / pixel
- THALES: Intelligent cameras for Homeland security
- Need to change operating modes and very high computational complexity
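To give a sense of scale for the professional video case, the sketch below multiplies out the figures quoted above. It assumes that "2k operations / pixel" means 2000 operations and that this cost applies at the maximum resolution and frame rate; neither is stated explicitly on the slide.

```python
# Rough workload estimate for the professional video figures quoted above.
WIDTH, HEIGHT = 4096, 3112
FRAMES_PER_SECOND = 24
OPS_PER_PIXEL = 2000  # "2k operations / pixel", read here as 2000 (assumption)

pixels_per_second = WIDTH * HEIGHT * FRAMES_PER_SECOND
ops_per_second = pixels_per_second * OPS_PER_PIXEL

print(f"{pixels_per_second / 1e6:.1f} Mpixel/s")
print(f"{ops_per_second / 1e9:.1f} GOPS")
```

Under these assumptions the film grain noise reduction alone would need roughly 612 GOPS at full resolution.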
[Figure labels: On PACT XPP; On PiCoGA]
17. A motion detection image processing application
- Detects movements with respect to a static background
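A minimal sketch of detection against a static background: a pixel is flagged when it differs from the reference background by more than a threshold. Thresholded absolute difference is a common technique used here for illustration; the function name, the threshold value and the tiny test images are assumptions and do not describe the actual THALES algorithm.

```python
def detect_motion(frame, background, threshold):
    """Return a binary mask: 1 where the frame differs from the background."""
    return [
        [1 if abs(p - b) > threshold else 0 for p, b in zip(frow, brow)]
        for frow, brow in zip(frame, background)
    ]


# Static background and a frame in which two pixels changed strongly.
background = [[10] * 4 for _ in range(3)]
frame = [
    [10, 11, 10, 10],
    [10, 90, 95, 10],
    [10, 10, 12, 10],
]

mask = detect_motion(frame, background, threshold=20)
for row in mask:
    print(row)
```

Small differences such as sensor noise stay below the threshold and are not flagged.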
18. Motion detection application captured in SPEAR
- Image resolution: 768x576 pixels @ 25 frames per second
- Greyscale, 16 bits per pixel
- SPEAR model well adapted to the application specification
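The image format above fixes the size of one frame and the input data rate, which is what the memory and DMA planning has to accommodate. The short calculation below uses only the figures from the slide; the variable names are illustrative.

```python
# Frame size and input data rate for the motion detection application.
WIDTH, HEIGHT = 768, 576
BITS_PER_PIXEL = 16
FRAMES_PER_SECOND = 25

frame_bytes = WIDTH * HEIGHT * BITS_PER_PIXEL // 8
input_bytes_per_second = frame_bytes * FRAMES_PER_SECOND

print(f"frame size: {frame_bytes / 1024:.0f} KiB")
print(f"input rate: {input_bytes_per_second / 1e6:.1f} MB/s")
```

One frame is 864 KiB and the input stream is about 22.1 MB/s, before counting any intermediate images.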
[Figure: SPEAR application model. Labels: Binarization; Opening; Edge detection; Merging]
19. Application mapping
- Image partitioning in tiles
- Opening and edge detection: 3x3 pixel matrices
- Merging and binarization: data organization aligned on other operators
- Manual implementation on PiCoGA reconfigurable unit [C. Mucci, SoC 2007]
- 342x performance speed-up with respect to a SW approach
- Project status
- Architecture defined with a SystemC model available
- Tape-out planned for mid-2008
- MORPHEUS toolset modules developed and integration performed
- CDFG synthesis in progress
- Verification on-going
[Figure labels: Reconfigurable Unit; local memory (x2)]
21. Conclusion
- MORPHEUS seeks to reduce time-to-market and improve flexibility
- A reconfigurable heterogeneous architecture, modular and scalable
- A comprehensive toolset to fully exploit MORPHEUS architecture
- A framework for application mapping on embedded systems architecture
- An array-based programming model expressing both task and data parallelism
- Validation in progress with a motion detection application
- Application captured in SPEAR and first implementation results on PiCoGA
- Toolset modules enhancements in second phase of the project
- High-level synthesis in progress, optimization planned
- Mapping on several Reconfigurable Units
- Reconfigurable unit to reconfigurable unit communications
- Parallel control of accelerated tasks
- Dynamic allocation of accelerated tasks at RTOS level
22. Mapping High Performance and Mission-critical Applications to Embedded Architectures
Arnaud Grasset, arnaud.grasset@thalesgroup.com
THALES Research & Technology, Embedded System Lab - France
Further information on MORPHEUS project: http://www.morpheus-ist.org
23. MORPHEUS project objectives
- Develop a Reconfigurable System-On-Chip in silicon
- 3 different reconfigurable engines → multi-granular
- Network-On-Chip → scalable hardware architecture
- Predictive configuration manager
- High-bandwidth memory system
- Provide a software oriented toolset
- C-Code for control part of applications
- Spatial design for kernel operations implementation on the reconfigurable units
- Dynamic control services as interface between both
- Formal methods are proposed
- Demonstration with four applications from different domains
24. Accelerated function synthesis
- Data movements management
- DMA parameters generation
- Synthesis on the reconfigurable units
- Supports generation of target specific code
- Separation of behavior and communication concerns
- Memory oriented task mapping
- SPEAR (THALES R&T)
- CDFG extraction from C code of tasks
- Cascade (Critical Blue)
- Architectural and physical synthesis from CDFG
- Madeo (University of Bretagne Occidentale)