Title: Ph.D. Research Plan Presentation


1
Ph.D. Research Plan Presentation
  • Anup Gangwar

2
Presentation Outline
  • Introduction and motivation
  • Specialization opportunities in VLIW processors
  • Methodology
  • Validation framework (supporting tools required)
  • Work plan
  • Status of work
  • References

3
Introduction
  • Why customize architectures?
  • General purpose computing domain vs. embedded
  • Customization leads to cheaper design solutions
  • Architectural choices for exploiting ILP
  • Superscalar processors
  • Try to extract ILP at run time, which requires
    complex hardware
  • Limited clock speeds and high power dissipation
  • Not suited for embedded applications
  • VLIW processors
  • Compiler has a lot of knowledge about the hardware
  • Compiler extracts ILP statically, which simplifies
    the hardware
  • Possible to attain higher clock speeds

4
Introduction - Problems with VLIW Processors
  • Complex compiler required for extracting ILP
  • Adequate hardware support needed for compiler
    controlled execution
  • Code size expansion due to explicit NOPs if:
  • The application does not contain enough
    parallelism
  • The compiler is not able to extract parallelism
    from the application
  • Need for good instruction encoding and NOP
    compression schemes
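
The code size cost of explicit NOPs can be estimated with a short calculation. The sketch below is only an illustration: the function name `nop_overhead`, the 4-issue width, and the sample schedule are assumptions, not taken from the slides.

```python
def nop_overhead(schedule, issue_width):
    """Return (total_slots, nop_slots, nop_fraction) for a list of MultiOps.

    Each MultiOp is given as the list of real operations it issues.
    In an uncompressed VLIW encoding every unused slot holds an explicit NOP.
    """
    for multiop in schedule:
        if len(multiop) > issue_width:
            raise ValueError("MultiOp has more operations than issue slots")
    total = len(schedule) * issue_width
    used = sum(len(multiop) for multiop in schedule)
    nops = total - used
    return total, nops, nops / total


# Hypothetical 4-issue schedule for code with little parallelism.
schedule = [["ADD", "FMUL"], ["LD"], ["ADD"], ["ST", "BR"]]
total, nops, fraction = nop_overhead(schedule, issue_width=4)
print(f"{nops} of {total} slots are NOPs (fraction {fraction})")
```

In this made-up schedule more than half of the encoded slots carry no useful work, which is the situation NOP compression schemes aim to address.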

5
Presentation Outline
  • Introduction and motivation
  • Specialization opportunities in VLIW processors
  • Methodology
  • Validation framework (supporting tools required)
  • Work plan
  • Status of work
  • References

6
Specialization Opportunities -> FUs
7
Specialization Opportunities -> FUs (contd...)
  • Functional Unit Types
  • MISO or Multiple Input Single Output
  • MIMO or Multiple Input Multiple Output
  • MIMO with LD/ST or MIMOs with memory interaction
  • Rigid or flexible I/O timeshapes

8
Specialization Opportunities -> Reg. File
  • A single register file organization does not
    scale well
  • Area grows as N^3
  • Delay grows as N^(3/2)
  • Power grows as N^3
  • where N is the number of functional units
    connected to the register file
  • Clustered VLIW architectures are the solution
  • Each FU can read from/write to only a subset of
    registers
  • Data copying may increase execution latency
  • Powerful application analysis is required to
    overcome the above-mentioned problems
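
The scaling figures above can be turned into a rough comparison between a single register file and a clustered one. This is a minimal sketch under strong assumptions: costs are in arbitrary relative units, clusters are equal in size, and the inter-cluster interconnect and copy operations are ignored entirely. The function names are illustrative.

```python
def monolithic_cost(n_fus):
    """Relative cost of one register file shared by n_fus functional units,
    using the growth rates quoted above (area N^3, delay N^(3/2), power N^3)."""
    return {"area": n_fus ** 3, "delay": n_fus ** 1.5, "power": n_fus ** 3}


def clustered_cost(n_fus, n_clusters):
    """Relative cost when the FUs are split evenly over n_clusters register files.

    Assumption: inter-cluster interconnect and copy operations cost nothing,
    so this is an optimistic bound rather than a prediction.
    """
    per_cluster = monolithic_cost(n_fus / n_clusters)
    return {
        "area": n_clusters * per_cluster["area"],    # all files add up
        "delay": per_cluster["delay"],               # files are accessed in parallel
        "power": n_clusters * per_cluster["power"],
    }


single = monolithic_cost(8)
clustered = clustered_cost(8, n_clusters=4)
for key in ("area", "delay", "power"):
    print(key, single[key], clustered[key])
```

Under these assumptions, splitting N functional units over c clusters reduces register file area and power by a factor of c^2. The data copying mentioned above is what eats into that gain in practice.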

9
Specialization Opportunities -> Reg. File (contd...)
A Clustered VLIW Architecture
10
Specialization Opportunities -> Interconnect
  • Clustering FUs together requires deciding on the
    interconnection network (ICN)
  • between different clusters
  • between clusters and memory
  • Analysis of data access patterns required for
    evaluating cost-performance tradeoffs
  • Current ASIP vendors do not offer customizable
    interconnects

11
Specialization Opportunities -> Encoding
  • Instruction encoding/decoding scheme affects
  • Code size
  • Object code compatibility
  • Branch misprediction penalty
  • Hardware cost
  • Address specification in code size
  • Each UniOp is equivalent to a RISC/CISC
    instruction

[Figure: a MultiOp made up of four UniOps]
12
Specialization Opportunities -> Encoding (contd...)
NOPs in a MultiOp

Slot    IALU.0  IALU.1  FALU.0  BU.0
Op      ADD     NOP     FMUL    NOP

[Figure: VLIW Processor Pipeline with Instruction Decompressor]
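
One common way to avoid storing the NOPs in a MultiOp such as ADD, NOP, FMUL, NOP is to keep a presence mask plus only the real operations, and let a decompressor stage re-insert the NOPs. The sketch below illustrates that general idea. It is not the encoding proposed in this presentation, and the function names are assumptions.

```python
def compress(multiop):
    """Compress a MultiOp given as one op name per issue slot ('NOP' if empty).

    Returns (mask, ops): bit i of mask is set when slot i holds a real
    operation, and ops lists the real operations in slot order.
    """
    mask = 0
    ops = []
    for slot, op in enumerate(multiop):
        if op != "NOP":
            mask |= 1 << slot
            ops.append(op)
    return mask, ops


def decompress(mask, ops, width):
    """Rebuild the full-width MultiOp, re-inserting NOPs where the mask bit is 0."""
    remaining = iter(ops)
    return [next(remaining) if mask & (1 << slot) else "NOP"
            for slot in range(width)]


# Slots in order: IALU.0, IALU.1, FALU.0, BU.0
multiop = ["ADD", "NOP", "FMUL", "NOP"]
mask, ops = compress(multiop)
restored = decompress(mask, ops, width=len(multiop))
print(bin(mask), ops, restored)
```

Here only two of the four operations are stored, at the cost of a 4-bit mask per MultiOp and a decompression step in the fetch pipeline.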
13
Specialization Opportunities -> Summary
14
Presentation Outline
  • Introduction and motivation
  • Specialization opportunities in VLIW processors
  • Methodology
  • Validation framework (supporting tools required)
  • Work plan
  • Status of work
  • References

15
Existing Methodologies -> Simulation Driven
16
VLIW ASIP Synthesis Methodology
  • Task Set and Constraints
  • Architecture Description
  • Application Parameter Extraction
  • Architecture Design Space Exploration
  • Retargetable Compiler
  • Instruction Encoding Specialization
  • Validation (Simulation with encoded instructions)
  • Architecture Description (Output to synthesizer)
17
Presentation Outline
  • Introduction and motivation
  • Specialization opportunities in VLIW processors
  • Methodology
  • Validation framework (supporting tools required)
  • Work plan
  • Status of work
  • References

18
Validation Framework -> Trimaran
C Program
Bridge Code
IMPACT
  • ANSI C Parsing
  • Code profiling
  • Classical machine independent optimizations
  • Block formation

ELCOR
  • Machine dependent code optimizations
  • Code scheduling
  • Register allocation

ELCOR IR
SIMULATOR Generator
Generated Simulator (Statistics)
  • ELCOR IR to low level C files
  • HPL-PD virtual machine
  • Cache simulation
  • Performance statistics
  • Compute and stall cycles
  • Cache stats
  • Spill code info

HMDES Machine Description
19
Validation Framework -> Trimaran (contd...)
REBEL
Low level C files
C libraries
Emulation Library
Code Processor
HMDES
Native Compiler
Executable for the host platform
20
Validation Framework -> Retargetable Assembler
Instruction Encoding Description
Toolkit Generator
Generated Assembler
Assembly Instructions
Object Code
To Simulator (for simulation with encoded instructions)
21
Presentation Outline
  • Introduction and motivation
  • Specialization opportunities in VLIW processors
  • Methodology
  • Validation framework (supporting tools required)
  • Work plan
  • Status of work
  • References

22
Work Plan -> Interconnect/RF/FU Specialization
  • Initially model the interconnect problem as an
    integer linear program (ILP) and later move to
    other solutions
  • Code selection problem in compilers is similar to
    identifying compute intensive parts for AFUs
  • The number and type of FUs have not been properly
    explored
  • RF clustering problem has not been dealt with
    elsewhere
  • Jacome et al.
  • Deal with Interconnect/RF/FU specialization
    simultaneously
  • Operation chaining is not considered

23
Work Plan -> Encoding/Decoding Specialization
  • Goal is to be able to generate encoding schemes
    automatically
  • Work of Shail Aditya et al.
  • Basically a parameterized encoding scheme
  • Techniques specific to the HPL-PD architecture
  • Does not address dynamic code size minimization
  • Encoding template is fixed; exploration is limited
    to the template design space
  • Various encoding templates need to be explored;
    the template itself may also be derived from the
    application
  • Dynamic code size minimization needs to be
    considered

24
Presentation Outline
  • Introduction and motivation
  • Specialization opportunities in VLIW processors
  • Methodology
  • Validation framework (supporting tools required)
  • Work plan
  • Status of work
  • References

25
Work Status -> Specialized FUs in Trimaran
  • Modeling MISOs
  • Model as external function calls
  • Locate the call in Trimaran bridge code and
    replace it with an AFU op
  • Model new AFU in MDES with the required ops
  • Introduce the semantics in simulator op
    definitions file
  • Modeling MIMOs
  • Model as external function calls returning void
  • Locate the call in Trimaran bridge code and
    replace it with an AFU op
  • Explicitly reserve registers in C-code for
    returning values
  • Introduce operation semantics in simulator op
    definitions file
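
The "model as an external function call, then replace the call with an AFU op" step can be pictured on a toy intermediate representation. Everything in this sketch is hypothetical: the tuple-based IR, the `CALL` opcode, and the names `mac` and `AFU_MAC` are made up for illustration and do not reflect the actual Trimaran bridge code format.

```python
def replace_calls_with_afu(ops, afu_map):
    """Rewrite calls to known external functions into single AFU operations.

    ops is a list of (opcode, dest, srcs) tuples. A call is written as
    ("CALL", dest, (function_name, arg1, arg2, ...)). afu_map maps a
    function name to the AFU opcode that implements it.
    """
    rewritten = []
    for opcode, dest, srcs in ops:
        if opcode == "CALL" and srcs and srcs[0] in afu_map:
            # Drop the function name; the arguments become AFU inputs.
            rewritten.append((afu_map[srcs[0]], dest, srcs[1:]))
        else:
            rewritten.append((opcode, dest, srcs))
    return rewritten


# Hypothetical code fragment in which mac() stands in for a MISO unit.
ops = [
    ("ADD", "r1", ("r2", "r3")),
    ("CALL", "r4", ("mac", "r1", "r5", "r6")),
    ("ST", None, ("r4", "r7")),
]
rewritten = replace_calls_with_afu(ops, {"mac": "AFU_MAC"})
for op in rewritten:
    print(op)
```

A MISO fits this shape directly because it has a single destination. A MIMO needs the extra step listed above, where registers are explicitly reserved to carry the additional results.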

26
Work Status -> Specialized FUs in Trimaran (contd...)
  • Modeling MIMOs with LD/ST
  • Model as regular MIMOs
  • Memory interaction with block LD/ST at beginning
    and end of execute cycles
  • Additionally
  • Possible to impose register file constraints
  • Various I/O timeshapes, rigid or flexible
  • Possible to introduce pipelined functional units

27
Work Status -> Instruction Enc. in Trimaran
28
Work Status -> Instruction Enc. in Trimaran (contd...)
  • New Jersey Machine Code Toolkit (NJMC)
  • Deals with bits at symbolic level
  • Can be used to write assemblers/disassemblers
  • Specification in SLED (Specification Language for
    Encoding/Decoding)
  • Model instruction decompressor in HMDES
  • Instrument ELCOR to generate assembly code
  • Encoding is done using procedures generated by
    NJMC
  • Problems with NJMC
  • VLIW instructions need to be broken up into
    32-bit tokens
  • Encoded instructions must end on an 8-bit boundary
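
The two constraints above can be illustrated with a small helper that pads a wide instruction to a byte boundary and cuts it into tokens of at most 32 bits. This sketch does not use NJMC or SLED. The function name `tokenize`, the bit-string representation, and the choice of zero padding are assumptions made for illustration.

```python
def tokenize(bits, token_bits=32, align_bits=8):
    """Split an encoded instruction into tokens of at most token_bits bits.

    bits is a string of '0' and '1' characters. The instruction is first
    padded with zeros so that its length is a multiple of align_bits.
    """
    if set(bits) - {"0", "1"}:
        raise ValueError("bits must contain only '0' and '1'")
    padding = (-len(bits)) % align_bits
    padded = bits + "0" * padding
    return [padded[i:i + token_bits] for i in range(0, len(padded), token_bits)]


# Hypothetical 75-bit VLIW instruction.
instruction = "1" * 75
tokens = tokenize(instruction)
print([len(token) for token in tokens])
```

For the 75-bit example, five padding bits are added and the result is split into two full tokens and one shorter one. These wasted bits are one reason the 8-bit boundary requirement is listed as a problem.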

29
Work Status -> Code Gen. for Clustered ASIPs
  • ELCOR
  • Disadvantages
  • ELCOR is heavily oriented towards the HPL-PD
    architecture
  • Does not support clustered VLIW architectures
  • Advantages
  • Strong optimizing compiler
  • Rich library to deal with the IR
  • IMPACT compiler system offers another choice for
    building a backend
  • A feasibility study is being carried out to
    settle on a direction of work

30
Presentation Outline
  • Introduction and motivation
  • Specialization opportunities in VLIW processors
  • Methodology
  • Validation framework (supporting tools required)
  • Work plan
  • Status of work
  • References

31
References
  • Bhuvan Middha, Varun Raj, Anup Gangwar, M.
    Balakrishnan, Anshul Kumar and Paolo Ienne, A
    Trimaran based framework for exploring design
    space of VLIW ASIPs with coarse grain FUs,
    ISSS-2002.
  • Anup Gangwar, M. Balakrishnan and Anshul Kumar,
    A framework for studying the effect of VLIW
    processor instruction encoding and decoding
    schemes, Mini Project, Dept. of CSE.
  • M. Jacome and G. de Veciana, Design challenges
    for new application specific processors, IEEE
    Design and Test of Computers-2000.
  • B. Ramakrishna Rau and Michael S. Schlansker,
    Embedded computer architecture and automation,
    IEEE Computer-2001.
  • Michael S. Schlansker and B. Ramakrishna Rau,
    EPIC: An architecture for instruction-level
    parallel processors, HPCA-2000.
  • N. G. Busa, A. van der Werf and M. Bekooij,
    Scheduling coarse grain operations for VLIW
    processors, ASPDAC-1998.
  • Shail Aditya, Scott A. Mahlke and B. Ramakrishna
    Rau, Code size minimization and retargetable
    assembly for custom EPIC and VLIW processors,
    ISSS-1999.