Presentation on Memory Optimizations in High Level Synthesis

1
Presentation on Memory Optimizations in High Level Synthesis
  • By
  • Nitin K. Agarwal.
  • Under the Supervision of
  • Dr. Preeti Ranjan Panda
  • Department Of Computer Science, I.I.T., Delhi

2
Overview
  • Motivation & Objectives.
  • Literature Survey.
  • Existing C to VHDL Converter.
  • Possible Memory Optimizations.
  • Case Study Framework.
  • Acknowledgements & References.

3
(No Transcript)
4
[Diagram: hardware/software co-design flow. Blocks: Application Specification in C, Partitioner, Functions Mapped on Hardware, C to VHDL (my project), VHDL Code for Functions, Functions Mapped on Software, Compiler. Targets: two ASICs, Processor, Memory.]
5
Input/Output of the C to VHDL Converter
[Diagram: blocks: Function Specification in C, C to VHDL Converter, Behavioral VHDL Code, Behavioral Compiler, Component Library, Structural RTL Code in VHDL.]
6
Objectives
  • Complete the existing C to VHDL converter.
  • Implement bit-width analysis.
  • Partition the local variables efficiently.
  • Explore the possibilities of overlapping memory
    communication with computation, for example by
    prefetching.
  • Explore the possibilities of burst & pipelined
    memory accesses.
  • Implement the above ideas in the C to VHDL
    converter.
  • Perform case studies to observe the gain.

7
Complete Flow of C to VHDL
[Diagram: blocks: C Code, SUIF Front End, SUIF IR, PORKY Optimizations, BitWidth Analyzer, Partitioner, VHDL Generator, Behavioral VHDL Code, Behavioral Compiler, Structural VHDL Code.]
8
Motivation Behind C as an HDL
  • A successful hardware compiler for a high-level
    language allows for more flexible
    hardware-software co-design and simulation.
  • Design rule checking and gate-level optimizations
    are becoming impossible for large designs without
    computer assistance.
  • Writing the RTL code for a large design is a
    cumbersome and error-prone task.
  • A better design space exploration is possible
    because of automation.

9
Limitations of C as an HDL
  • There is no timing grammar to define input/output
    behavior.
  • Restricting functions to a single return value
    is not suitable for hardware.
  • C is defined with a general memory and
    computation model that does not hold for
    hardware.
  • Pointer arithmetic requires a full-fledged memory
    interface.
  • C's type system of fixed-width data types is
    inflexible.
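
To make two of these limitations concrete, here is a minimal sketch; it is not taken from the converter. The function name `div_mod` and its signature are hypothetical, and the caller is assumed to pass a non-zero divisor.

```c
#include <stdint.h>

/* A divider block naturally has two outputs: quotient and remainder.
 * C allows only one return value, so the second output has to be
 * written through a pointer (assumption: b is never zero). */
uint32_t div_mod(uint32_t a, uint32_t b, uint32_t *rem)
{
    *rem = a % b;   /* second "output", passed back through a pointer */
    return a / b;   /* the single C return value */
}
```

Both results are also forced into 32-bit types here, even when the operands are known to need far fewer bits.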

10
Literature Survey
  • SiliconC: A Hardware Backend for SUIF
  • Brass
  • Spark

11
SiliconC - A Hardware Back End
[Diagram: blocks: C Code, SUIF Front End, SUIF IR, EXIT1, PORKY, SSA, BITSIZE, VGEN, Structural VHDL Code.]
12
SiliconC: Different Stages
  • SUIF: generates a structured
    intermediate-format file.
  • EXIT1: creates single-entry, single-exit
    functions.
  • PORKY (optimization pass): performs some
    classical optimizations and breaks down
    high-level constructs.

13
SiliconC: Different Stages (Contd.)
  • SSA: translates the intermediate representation
    into Static Single Assignment (SSA) form and
    generates the control-flow graph (CFG) of basic
    blocks.
  • BITSIZE: tries to narrow the bit widths of
    variables.
  • VGEN: generates structural VHDL code.

14
Limitations of the Existing C to VHDL Converter
  • It does not support global variables in the
    relevant function.
  • Nested function calls are not supported.
  • It generates only behavioral VHDL code.
  • The bit widths of variables are not handled
    efficiently.
  • All local variables are mapped to local
    registers.

15
Overcoming the Limitations
  • Global variable access:
  • Pass a pointer to the global variable as a
    function parameter, and then treat it like any
    other pointer.
  • Bit-width analysis.
  • Partitioning:
  • Partition the local variables between
    registers and memory.
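
The global-variable workaround can be sketched in C as a before/after pair. The names `total`, `accumulate_global`, and `accumulate_param` are made up for illustration and do not come from the converter.

```c
/* Original form: the function reads and writes a global directly,
 * which the existing converter does not support. */
int total = 0;

void accumulate_global(int x)
{
    total += x;
}

/* Rewritten form: the caller passes the address of the global, so
 * inside the function it is an ordinary pointer access. */
void accumulate_param(int x, int *total_ptr)
{
    *total_ptr += x;
}
```

A caller would replace `accumulate_global(x)` with `accumulate_param(x, &total)`; both leave `total` in the same state.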

16
Interface of the Core Co-processor
[Diagram: CORE ASIC (pure behavioral code, no clock) connected to CPU/Memory. Signals: Start, Busy, A, B, Data Bus, Address Bus, Valid Address, Address Accepted, R/W.]
17
Complete Picture
[Diagram: blocks: CPU, Memory, System Bus, Wrapper, ASIC.]
18
Different Memory Optimizations
  • Prefetching: reduces the waiting time for
    memory operations.
  • Utilizing burst mode.
  • Arranging memory requests in a pipelined fashion.
  • Bit-width analysis.
  • Partitioning: assign each local variable either
    to memory or to local registers.

19
Prefetching
  • There is no notion of clock in the computing
    part.
  • Execution is sequential.
  • The synthesizer can schedule memory operations
    to achieve the effect of prefetching.
  • -------. Computing part independent of
    next.
  • -------. Memory read operation.
  • Wait until (------) --waiting for memory read
    to complete.
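
The benefit of overlapping a memory read with independent computation can be estimated with a simple cycle-count model. This is a rough sketch under stated assumptions, not the scheduler's behavior: each iteration performs one read taking `mem_latency` cycles and `compute` cycles of work that does not depend on that read. The function names are hypothetical.

```c
/* No overlap: every iteration waits for its read to finish and only
 * then runs its computation. */
unsigned cycles_sequential(unsigned n, unsigned mem_latency, unsigned compute)
{
    return n * (mem_latency + compute);
}

/* Prefetch-style overlap: the read is issued, the independent
 * computation runs while it is in flight, and the wait covers only
 * the latency that is still outstanding. Each iteration therefore
 * costs the longer of the two. */
unsigned cycles_prefetch(unsigned n, unsigned mem_latency, unsigned compute)
{
    unsigned per_iter = (mem_latency > compute) ? mem_latency : compute;
    return n * per_iter;
}
```

Under this model, 10 iterations with a 4-cycle read and 3 cycles of independent work take 70 cycles sequentially and 40 cycles when overlapped.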

20
Burst Mode & Pipelined Accesses
  • SRAMs generally have a read/write latency of
    1 clock cycle.
  • Utilizing burst mode saves some transitions on
    the address bus.
  • It gives no saving in terms of time.
  • One can pipeline the requests to achieve
    concurrency.
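
These points can be illustrated with a small cycle-count model. It is a sketch under simplifying assumptions: one cycle to present an address, data available `latency` cycles later, and a burst that needs an explicit address only for its first word. The function names are hypothetical.

```c
/* Unpipelined: each read presents its address and waits for the data
 * before the next address is issued. */
unsigned cycles_unpipelined(unsigned n, unsigned latency)
{
    return n * (1 + latency);
}

/* Pipelined: a new address is issued every cycle, so only the last
 * read's latency is exposed. */
unsigned cycles_pipelined(unsigned n, unsigned latency)
{
    return (n == 0) ? 0 : n + latency;
}

/* Burst mode: one address load per burst instead of one per word
 * (assumption: burst_len >= 1). This reduces address-bus activity,
 * not the cycle count. */
unsigned address_loads(unsigned n, unsigned burst_len)
{
    return (n + burst_len - 1) / burst_len;
}
```

For 8 reads at a latency of 1 cycle, the model gives 16 cycles unpipelined and 9 pipelined. A burst length of 4 cuts the address loads from 8 to 2.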

21
ZBT SRAM Read Timing Diagram
[Timing diagram: signals CLK, OE, ADV/LD, R/W, Address, Data.]
22
ZBT SRAM Write Timing Diagram
[Timing diagram: signals CLK, OE, ADV/LD, R/W, Address, Data.]
23
Bit-Width Analysis
  • Overcomes C's inflexible type system of
    fixed-width data types.
  • Associates a bit width with each variable used
    in the function description.
  • This information is used during VHDL code
    generation to reduce the storage area in the
    final layout.
  • All primitive VHDL operators are overloaded to
    handle variables as bit vectors.
  • It will be implemented as a preprocessing pass on
    the SUIF IR.
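
The core calculation behind bit-width narrowing is finding how many bits a known value range needs. The helper below is an illustrative sketch for unsigned ranges only; the name `bits_for_unsigned` is hypothetical, and this is not the analysis pass itself.

```c
/* Minimum number of bits needed to hold any unsigned value in the
 * range [0, max_value]. */
unsigned bits_for_unsigned(unsigned long max_value)
{
    unsigned bits = 1;          /* even the constant 0 occupies one bit */
    while (max_value >>= 1)     /* one more bit per remaining binary digit */
        bits++;
    return bits;
}
```

For example, a loop counter running from 0 to 99 is usually declared as a 32-bit `int` in C, but `bits_for_unsigned(99)` shows that 7 bits are enough.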

24
Partitioning
  • Assign each variable to either memory or a local
    register.
  • Area-time tradeoff.
  • The possibility of using scratchpad memory can
    be explored.
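
One way to make the area-time tradeoff concrete is a greedy heuristic: keep the most frequently accessed variables, relative to their width, in registers until a register-bit budget runs out. The slides do not say which partitioning algorithm is used, so this heuristic, the `local_var` structure, and the access-count estimate are all assumptions made for this sketch.

```c
#include <stddef.h>

/* Hypothetical description of one local variable. */
struct local_var {
    unsigned bits;        /* width, e.g. from bit-width analysis */
    unsigned accesses;    /* estimated number of reads and writes */
    int      in_register; /* output: 1 = register, 0 = memory */
};

/* Greedy sketch: repeatedly choose the unassigned variable with the
 * highest accesses-per-bit ratio that still fits in the remaining
 * register budget. Returns the number of register bits used. */
unsigned partition_locals(struct local_var *v, size_t n, unsigned reg_bits_budget)
{
    unsigned used = 0;

    for (size_t i = 0; i < n; i++)
        v[i].in_register = 0;           /* start with everything in memory */

    for (;;) {
        size_t best = n;                /* n means "no candidate yet" */
        for (size_t i = 0; i < n; i++) {
            if (v[i].in_register || v[i].bits == 0 ||
                used + v[i].bits > reg_bits_budget)
                continue;
            /* Compare ratios without division:
             * a_i / b_i > a_best / b_best  <=>  a_i * b_best > a_best * b_i */
            if (best == n ||
                (unsigned long long)v[i].accesses * v[best].bits >
                (unsigned long long)v[best].accesses * v[i].bits)
                best = i;
        }
        if (best == n)
            break;                      /* nothing else fits */
        v[best].in_register = 1;
        used += v[best].bits;
    }
    return used;
}
```

With a 24-bit budget and variables of (8 bits, 100 accesses), (32 bits, 50 accesses), and (16 bits, 10 accesses), the first and third go to registers and the 32-bit variable stays in memory.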

25
Partitioning
[Diagram: the local variables of the function are split between Memory and the Registers of the ASIC.]
26
Another Possibility
[Diagram: blocks: Processor, Memory, ASIC1, ASIC2, Scratch Pad Memory.]
27
Case Study
  • Download the functions that are mapped to
    hardware onto the FPGA board.
  • Perform co-simulation with the software part.
  • The software part can be mapped onto either:
  • the host computer, or
  • a LEON processor, which will also be on the FPGA.
  • Observe the results.

28
S/W on PC, H/W on XCV800, Communication Over PCI.
29
S/W & H/W Both on XCV800, Communication Over AHB.
30
Schedule
31
Acknowledgements
  • Prof. M. Balakrishnan.
  • Prof. Anshul Kumar.
  • Anup Gangwr (PhD Student).
  • Basant Dewadi (PhD Student).

32
References
  • SiliconC - Hardware backend for SUIF:
    http://www.flex-compiler.ics.mit.edu/SiliconC
  • Embedded Systems Group, I.I.T., Delhi:
    http://www.iitd.ernet.in/esproject
  • Stanford Compiler Group:
    http://www.suif.stanford.edu

33
  • Thank You