Fabrizio Ferrandi - PowerPoint PPT Presentation

Title: Fabrizio Ferrandi


1
hArtes SW Partitioning overview
  • Fabrizio Ferrandi
  • Politecnico di Milano
  • CASTNESS'08 15-18 January 2008 - ROMA - Italy

2
Summary
  • Objectives
  • Task partitioning
  • C based descriptions
  • The PandA framework
  • Preliminary results
  • Conclusions

3
hArtes partitioning objectives
  • An important step in the hArtes design space
    exploration process is the identification of a
    good partitioning of the given application
  • identifying fragments of parallel code on which
    transformations and metric analysis can be
    performed
  • starting from C based application coming from
    advanced audio and video systems that support the
    next-generation of communication and
    entertainment facilities
  • targeting a combination of embedded processors,
    digital signal processing and reconfigurable
    hardware.

4
hArtes concurrency model
  • hArtes applications characteristics
  • Data intensive
  • Control intensive
  • Parallelism granularity
  • Fine grain (basic blocks)
  • Coarse grain (tasks)
  • Specifications
  • Sequential programs
  • Concurrency extracted
  • Explicit concurrency with split/join barriers

5
Detection of clusters/tasks
  • Detects clusters of operations connected by data
    exchanges
  • Uses dataflow analysis based on Program Dependence
    Graphs (PDG), which combine control and data
    dependences
  • Two strategies
  • Top-down
  • starting from high-level clusters of operations
    specified by the system designers (the set of
    functions) decompose each function in sub-tasks
    that are connected by minimal sets of data
    exchanges.
  • Bottom-up
  • starting from minimum-size clusters of operations
    (i.e., the individual instructions as specified
    by the C intermediate representation), cluster
    instructions along the heaviest communications.
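
The bottom-up strategy can be illustrated with a minimal sketch. It assumes the dependence graph is available as a list of weighted edges (the weight standing for the amount of data exchanged) and that clusters are merged greedily, heaviest edge first, until a target number of clusters remains. The names `Edge` and `cluster_bottom_up`, and the stopping rule, are hypothetical and are not taken from the PandA framework.

```c
#include <stdlib.h>

/* Hypothetical edge of a dependence graph: instructions u and v
   exchange `weight` units of data. */
typedef struct { int u, v, weight; } Edge;

/* Union-find: find the representative cluster of instruction x. */
static int find_root(int *parent, int x) {
    while (parent[x] != x) {
        parent[x] = parent[parent[x]];  /* path halving */
        x = parent[x];
    }
    return x;
}

/* qsort comparator: heavier edges first. */
static int by_weight_desc(const void *a, const void *b) {
    const Edge *ea = a, *eb = b;
    return (eb->weight > ea->weight) - (eb->weight < ea->weight);
}

/* Greedy bottom-up clustering (assumed stopping rule: target count).
   Starts with one cluster per instruction and merges the endpoints of
   the heaviest remaining edge until `target` clusters are left.
   On return, cluster_of[i] holds the cluster representative of i.
   Returns the number of clusters, or -1 on allocation failure.
   Note: sorts `edges` in place. */
int cluster_bottom_up(int n, Edge *edges, int m, int target, int *cluster_of) {
    int *parent = malloc((size_t)n * sizeof *parent);
    if (parent == NULL) return -1;
    for (int i = 0; i < n; i++) parent[i] = i;

    qsort(edges, (size_t)m, sizeof *edges, by_weight_desc);

    int clusters = n;
    for (int e = 0; e < m && clusters > target; e++) {
        int ru = find_root(parent, edges[e].u);
        int rv = find_root(parent, edges[e].v);
        if (ru != rv) {          /* merge two distinct clusters */
            parent[rv] = ru;
            clusters--;
        }
    }
    for (int i = 0; i < n; i++) cluster_of[i] = find_root(parent, i);
    free(parent);
    return clusters;
}
```

With four instructions and edges (0,1) of weight 10, (2,3) of weight 8, and (1,2) of weight 1, asking for two clusters groups {0,1} and {2,3} and leaves only the lightest exchange crossing the cluster boundary.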

6
Task granularity
  • The actual size of the tasks can be controlled
  • exploiting replication of the operations to
    obtain more parallelism
  • exploiting loop-based transformations:
    fusion/fission, loop unrolling, ...
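
To illustrate how a loop transformation changes task size, the sketch below applies loop fission to a loop whose body computes two independent results. The functions `scale_and_offset_fused` and `scale_and_offset_split` are made-up examples, not code from the hArtes applications, and they assume the output arrays do not alias the input.

```c
#include <stddef.h>

/* Original form: one loop, hence one coarse task computing both outputs.
   Assumes `scaled` and `shifted` do not overlap `in`. */
void scale_and_offset_fused(const int *in, int *scaled, int *shifted, size_t n) {
    for (size_t i = 0; i < n; i++) {
        scaled[i]  = 2 * in[i];
        shifted[i] = in[i] + 1;
    }
}

/* After fission: two loops with no data dependence between them,
   so each can become a separate, smaller task. */
void scale_and_offset_split(const int *in, int *scaled, int *shifted, size_t n) {
    for (size_t i = 0; i < n; i++)
        scaled[i] = 2 * in[i];
    for (size_t i = 0; i < n; i++)
        shifted[i] = in[i] + 1;
}
```

Both versions produce the same arrays. Fusion is the inverse step and yields one larger task again.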

7
Task mapping: goal
  • Goal: identify the best trade-off in terms of
    hardware and software tasks to satisfy the
    designers' constraints
  • Parameters to be considered
  • Platform architecture
  • Performance required by the application
  • Reconfigurable area available
  • Profiling information starting from performance
    costs evaluated on each specific platform
    component
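
A rough sketch of how such parameters could feed the evaluation of one candidate hardware/software mapping is given below. It assumes each task has profiled software and hardware execution times plus a hardware area cost, and that tasks run one after another so their times add up. The type `TaskProfile` and the function `evaluate_mapping` are hypothetical and do not describe the actual hArtes cost model.

```c
#include <stddef.h>

/* Hypothetical per-task profiling data. */
typedef struct {
    double sw_time;   /* execution time on the processor */
    double hw_time;   /* execution time on reconfigurable hardware */
    double hw_area;   /* area needed when mapped to hardware */
} TaskProfile;

/* Evaluates a candidate mapping (on_hw[i] != 0 means task i goes to
   hardware). Assumes sequential execution, so times simply add up.
   Stores the total time in *total_time and returns 1 if the mapping
   fits in the available area, 0 otherwise. */
int evaluate_mapping(const TaskProfile *tasks, const int *on_hw, size_t n,
                     double area_available, double *total_time) {
    double time = 0.0, area = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (on_hw[i]) {
            time += tasks[i].hw_time;
            area += tasks[i].hw_area;
        } else {
            time += tasks[i].sw_time;
        }
    }
    *total_time = time;
    return area <= area_available;
}
```

A mapping that moves every task to hardware may give the lowest time yet be rejected because it exceeds the reconfigurable area, which is the trade-off named in the goal.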

8
Task mapping: how
  • The evaluation of the allocation of the modules
    onto the different components of the platform is
    performed as a step-by-step process.
  • Here the focus is either
  • on traditional reinforcement learning algorithms
    based on dynamic programming, such as Q-Learning
    and TD(λ),
  • or on more advanced techniques that exploit both
    reinforcement learning and evolutionary
    computation, such as learning classifier systems,
    which provide more sophisticated generalization
    capabilities
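
For reference, the standard tabular Q-Learning update can be sketched as follows. How states, actions, and rewards are defined for the mapping problem is not stated in the slides; one plausible reading, assumed here, is that a state is a partial allocation, an action maps the next module onto a platform component, and the reward comes from a cost estimate. The sizes `N_STATES` and `N_ACTIONS` and the function names are hypothetical.

```c
/* Hypothetical problem sizes for the sketch. */
#define N_STATES  4
#define N_ACTIONS 2

/* Largest Q-value available in state s. */
static double max_q(double q[N_STATES][N_ACTIONS], int s) {
    double best = q[s][0];
    for (int a = 1; a < N_ACTIONS; a++)
        if (q[s][a] > best) best = q[s][a];
    return best;
}

/* One tabular Q-Learning step:
   Q(s,a) <- Q(s,a) + alpha * (reward + discount * max_a' Q(s',a') - Q(s,a)) */
void q_update(double q[N_STATES][N_ACTIONS], int s, int a,
              double reward, int s_next, double alpha, double discount) {
    double target = reward + discount * max_q(q, s_next);
    q[s][a] += alpha * (target - q[s][a]);
}
```

Calling `q_update` once per allocation step matches the step-by-step nature of the process: each decision refines the estimated value of placing a module on a given component.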

9
Cost estimations and metrics
  • Metric evaluation for
  • partitioning
  • mapping
  • Input
  • C based annotated description
  • Application constraints: maximum task size,
    bandwidth, performance
  • Task decomposition (expressed as annotations or
    external file)
  • Profiling information (expressed as annotations
    or external file)
  • Target architecture constraints and description
  • Tool extracts
  • Inter-process synchronization and communication
  • Inter-procedural control and data flow dependency
  • Output
  • C based annotated description
  • Control and data characterization of task
    exchanges
  • Behavior similarity
  • Closeness between tasks
  • Structural relationship between application and
    target architecture

10
Task partitioning interactions

11
C based descriptions
  • C has no instructions for expressing parallelism
  • Task described by C function
  • A notation to express parallelism is required
  • Many different notations are possible
  • Pragma based
  • Comment based
  • XML based
  • The notation must be powerful enough to express
    that
  • two or more tasks can run in parallel (fork
    operation)
  • the execution flow must wait for the termination
    of one or more tasks (join operation)

12
Motivations for OpenMP
  • OpenMP is a collection of pre-processor
    directives (pragmas) used to express parallelism
    in C programs
  • Purposely created for a fork/join model
  • It is an open and widely adopted standard
  • supported in the next release of the GNU GCC
    compiler (gcc 4.2)
  • platform independent
  • mapping independent
  • easy functional verification of partitioned
    programs on host machines
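
A minimal sketch of the fork/join model expressed with OpenMP pragmas is shown below. It uses the standard `parallel sections` construct; the functions `sum_range` and `run_two_tasks` are invented for illustration and are not output of the partitioning tool.

```c
/* Hypothetical task body: each task is described by a C function. */
static int sum_range(int lo, int hi) {
    int s = 0;
    for (int i = lo; i < hi; i++) s += i;
    return s;
}

int run_two_tasks(void) {
    int a = 0, b = 0;

    /* Fork: each section may run on its own thread. */
    #pragma omp parallel sections
    {
        #pragma omp section
        a = sum_range(0, 50);

        #pragma omp section
        b = sum_range(50, 100);
    }
    /* Join: the implicit barrier at the end of the construct
       guarantees both tasks have finished here. */
    return a + b;
}
```

When the file is compiled without OpenMP support the pragmas are ignored and the two sections run one after the other with the same result, which is what makes functional verification on a host machine straightforward.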

13
Supported Constructs in Source Code
  • We already support a large number of C constructs
    (e.g. nested struct, union, pointer arithmetic,
    ...)
  • Selected a meaningful subset of the GNU/GCC
    Torture Testsuite composed of 834 benchmarks.
    Currently, we cover 828/834 benchmarks (99.3%).
  • Covering means
  • parsing
  • building internal representation
  • dumping back the C code
  • compiling the produced code
  • executing it without errors (most of the
    benchmarks check their own results, so execution
    errors are detected)
  • Ongoing work to support varargs, computed goto,
    and irreducible loops.
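
For readers unfamiliar with these constructs, the sketch below shows two of them in standard C: a variadic function and an irreducible loop, i.e. a cycle in the control flow that can be entered at more than one point. Computed goto is a GNU C extension and is left out to keep the example portable. Both functions are invented for illustration and are not taken from the Torture Testsuite.

```c
#include <stdarg.h>

/* Variadic function (varargs): sums `count` int arguments. */
int sum_ints(int count, ...) {
    va_list ap;
    int total = 0;
    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}

/* Irreducible loop: the cycle between labels `even` and `odd` can be
   entered at either label, so it has no single loop header.
   Counts down from n and returns the number of steps taken. */
int count_steps(int n) {
    int steps = 0;
    if (n % 2 != 0) goto odd;   /* second entry point into the cycle */
even:
    if (n <= 0) return steps;
    n--; steps++;
odd:
    if (n <= 0) return steps;
    n--; steps++;
    goto even;
}
```

Loops of the second kind are awkward for analyses that assume every loop has a unique header.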

14
The PandA framework
  • Integrated with the GCC infrastructure
  • The output of the task partitioning is
    OpenMP-compliant C code
  • Groups operations into tasks in an efficient way,
    in order to meet performance requirements
  • Initial mapping on the platform based on metrics
    under development

15
Preliminary results
[Figure panels: ADPCM, JPEG]
  • Results on ADPCM and JPEG algorithms (1.7-1.4 max
    speedup)
  • Correctly analyzed all the hArtes applications

16
Questions?
  • Hopefully on Friday, directly to Fabrizio