Provided by: spaane

1
Syllabus Summary
Microelectronics for the Global World
Collaborative Engineering ECE992/777
2
Lecture 1. Introduction

Overview of Field Programmable Gate Array (FPGA)
design development within the appropriate
software/firmware component development
environment.
In the global design world, we will have to deal
with Intellectual Property (IP), IP testing,
trust and design efficiency.
Differences in technology status, design
environments and proficiency will lead to the
need for tools for design-space excursions and
optimizations.
As a result of the taught design elements,
globally distributed engineering can be
accomplished.

The course is divided into three parts:
collaborative design elements,
the collaborative development process, and
the subsequent approach for integration and
optimization.

4
Part I.
Collaborative Design Elements
5
Lecture 2. Outsourcing Economy

Outsourcing has been approached fearfully but can
also be approached as an opportunity for
innovation.
We will review traditional versus
outsourcing-driven design methodology.
Security/trust issues related to foreign killer
chips will also be discussed.

6
Global Collaboration in Outsourcing
7
Disruptive Technologies
(Figure: performance versus time, showing the
performance trajectory of present technology
driven by sustaining technological improvements,
the performance that customers can absorb or
utilize, and the new performance trajectory of a
disruptive technology.)
Clayton M. Christensen, The Innovator's Dilemma:
When New Technologies Cause Great Firms to
Fail, HarperBusiness, 2000 (Revised Edition)
8
Lecture 3. Reconfigurable Computing

Positioned in computing density between
Application-Specific Integrated Circuits (ASICs)
and Digital Signal Processors (DSPs), FPGAs
provide increased flexibility in computational
details such as degrees of parallelism and
pipelining, as well as advantages in real estate
and power consumption over DSPs and
General-Purpose (GP) microprocessors.
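
The flexibility in degrees of parallelism and pipelining can be made concrete with a first-order cycle-count model. The sketch below is an illustration only and is not taken from the lecture: the function name, the idealized model (each lane accepts one item per cycle, no stalls) and the example numbers are all assumptions.

```python
import math

def cycles_to_process(items, lanes, stages):
    """Idealized cycle count for `items` data items.

    Hypothetical model: `lanes` identical parallel pipelines, each
    `stages` deep, each accepting one item per cycle with no stalls.
    """
    if items <= 0:
        return 0
    per_lane = math.ceil(items / lanes)   # items handled by the busiest lane
    return stages + per_lane - 1          # fill latency, then one result per cycle

# More lanes shrink the streaming term; deeper pipelines add fill latency.
for lanes, stages in [(1, 1), (1, 4), (4, 4), (16, 4)]:
    print(lanes, stages, cycles_to_process(1024, lanes, stages))
```

The model counts cycles only. A deeper pipeline usually pays off through a shorter clock period, which this sketch deliberately leaves out.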

9
Estrin's Fixed-Plus-Variable Structure Computer
Gerald Estrin, Organization of Computer Systems:
The Fixed Plus Variable Structure Computer, Proc.
WJCC, 1960
10
FPGA Architectural Developments

Traditional Sea-of-CLBs (Xilinx, Altera)
Extreme-DSP, FPGA with embedded 192 18x18
multipliers (Xilinx, Altera), with embedded
PowerPC cores, RapidIO cells (Xilinx)
Fixed-plus-Variable, that is core processor with
FPGA (Quicksilver, Stretch)
Macro-Pipeline Processor (PipeRench)
Sea-of-ALUs, chunky arrays (MorphoSys,
MathStar)
Dynamic Reconfiguration (IPFlex)
DARPA-sponsored Polymorphous Computing
Architectures (PCA) developments

11
Xilinx Virtex-4 FPGA Family
12
MathStar FPOA

Chunky, coarse-grain array
Five Silicon Object types
Arithmetic Logic Unit (ALU)
Content Addressable Memory (CAM)
Cyclic Redundancy Check (CRC)
Multiply Accumulator (MAC)
Register File (RF).
In addition, RAM resources are distributed
in the array.
The function and ratio of these different Silicon
Objects are chosen based on a detailed study of
the application space for the product offerings.

13
Processing Spectrum Continuum

(Figure: a spectrum running from ASIC through FPGA
and DSP to GPU. Horizontal bars mark the span
covered by each architecture class: Sea-of-CLBs,
Sea-of-ALUs, Fixed-plus-Variable, Macro-Pipeline
and Dynamic Reconfiguration. A second set of bars
marks the span of the design languages
VHDL/Verilog, C/C++ and SystemC.)

14
Efficiency versus Application Space
(Figure: SWEPT efficiency plotted against
application type (vectors/streaming, structured
bit-operations, symbolic operations) for PCA,
ASIC, FPGA and GP devices. Caption: Optimized
Performance Over Broad Application Space.)
15
Native Stream Mode
16
Native Threaded Mode
17
Application Flow
(Figure: numbered application flow, steps 1 to 5,
with the labels Control, Stream Processing (three
MC-SM blocks), Inter-chip I/O (crossbar),
Inter-chip Memory Transfer, Threaded Processing
(two MC-TM blocks) and Parcel Interface.)
18
Lecture 4. Levels of Abstraction

It is a misconception to expect that FPGA
bit-level personalization code can be reused for
updates and upgrades.
Too many technology-specific design decisions
have been made to arrive at that particular
synthesized code pattern.
Only optimization at higher levels of abstraction
will pay off in the long run.
[LIEV01] P. Lieverse, P. van der Wolf, E.
Deprettere, K. Vissers, A Methodology for
Architecture Exploration of Heterogeneous Signal
Processing Systems, Journal of VLSI Signal
Processing, 29, 197-206 (2001), Kluwer Academic
Publishers, Boston

19
The Design Pyramid [LIEV01]
20
Effect of Abstraction Level
(Figure: relative efficiency versus abstraction
level (VHDL, SystemC, UML), with curves labeled
Compiler Performance and Tradeoff Curve
Optimization Potential.)
21
Lecture 5. Design Flow

Design elements from UML down to VHDL, including
SystemC, MathWorks Simulink and Xilinx
SystemBuilder will be reviewed, as well as
general design/test flows.

22
(No Transcript)
23
SystemC-based Hardware/Software Co-Design
(Figure: design flow with the stages System
Behavior, System Architecture, Mapping,
Performance Simulation, Refine, and
Implementation in Software and Hardware.)
Keutzer, K., Malik, S., Newton, R., Rabaey, J.,
Sangiovanni-Vincentelli, A., System-Level
Design: Orthogonalization of Concerns and
Platform-Based Design, IEEE Transactions on
Computer-Aided Design of Integrated Circuits and
Systems, 2000, 19(12)
24
Lecture 6. Tools for Design

The state of the art in design elements needed
for collaborative design development, including
verification, trade-off and optimization tools,
will be described and evaluated.

25
MILAN

MILAN is a model-based, extensible simulation
framework that provides a unified environment
capable of
modeling a large class of embedded systems and
applications
seamlessly integrating different widely-used
simulators into a single framework
enabling rapid evaluation of performance metrics
such as power, latency, and throughput
facilitating simulation at various levels of
granularity
rapid evaluation of a large design space

MILAN, Institute for Software Integrated Systems,
Vanderbilt University, Nashville
26
The MILAN Architecture
(Figure: block diagram with the labels GME 2000,
Design Space Exploration Tools, Functional
Simulators, High-level Power Estimators,
Cycle-Accurate Power Simulators, System
Generation and Synthesis Tools, and Target
System. Model interpreters drive the
simulators/tools and feed back results.)
MILAN, Institute for Software Integrated Systems,
Vanderbilt University, Nashville
27
Part II.
Collaborative Development
28
Lecture 7. Intellectual Property (IP)

The IP business model and some of its limitations
will be reviewed. Several other business
propositions, such as the open model and fabless
design companies, will be analyzed.
Business Proposition, Cost Model
Re-use Potential, Patentable
Hardcore or Softcore IP
Hardware versus Software Components

29
Lecture 8. Open Standards VIA, OCP, VSIA

Interface standards defined and developed for
System-on-Chip design in particular will be
reviewed and analyzed for compatibility with IP
component development.
Open Core Protocol (OCP)-IP www.ocpip.org
Virtual Interface Architecture (VIA)
Virtual Socket Interface Alliance (VSIA)
www.vsia.org

30
Lecture 9. Component/System Testing

Testability aspects of firmware components,
including generation of test vectors, assessment
of coverage, JTAG testing and the test monitor
concept, will be illustrated.
Intellitech (Durham, NH) TEST-IP
Plug and Play Scan Components
Boundary Scan
Self-Test
Observability
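
Assessment of coverage for a set of test vectors can be illustrated with a single stuck-at fault model. The sketch below is an assumption-laden toy, not a tool from the lecture: the three-input circuit y = (a AND b) OR c, the net names and the function names are all hypothetical.

```python
from itertools import product

NETS = ["a", "b", "c", "n1", "y"]   # n1 is the internal AND output

def simulate(a, b, c, fault=None):
    """Evaluate y = (a AND b) OR c, optionally with one net stuck at 0 or 1."""
    def net(name, value):
        # Force the faulty net to its stuck value, pass other nets through.
        if fault is not None and fault[0] == name:
            return fault[1]
        return value
    a, b, c = net("a", a), net("b", b), net("c", c)
    n1 = net("n1", a & b)
    return net("y", n1 | c)

def fault_coverage(vectors):
    """Fraction of single stuck-at faults detected by the test vectors."""
    faults = [(n, v) for n in NETS for v in (0, 1)]
    detected = [f for f in faults
                if any(simulate(*vec) != simulate(*vec, fault=f)
                       for vec in vectors)]
    return len(detected) / len(faults)

print(fault_coverage([(1, 1, 0)]))                                   # 0.4
print(fault_coverage([(1, 1, 0), (0, 1, 0), (1, 0, 0), (0, 0, 1)]))  # 1.0
print(fault_coverage(list(product((0, 1), repeat=3))))               # 1.0
```

In this toy case four well-chosen vectors reach the same coverage as the exhaustive set of eight, which is the kind of saving test-vector generation aims for.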

31
Lecture 10. Trusted Circuits

The use of more globally developed ICs has
increased the need for tools that support the
trustworthy development of complex and
performance-sensitive applications.
"...develop enabling trusted assembly,
integration, and test technologies that verify
the correctness, reliability, and functionality
of designed Integrated Circuits (ICs), i.e.,
approaches that enable IC users to fully trust
the ICs they employ." (DARPA SBIR 2005.2)

32
Part III.
Collaborative Integration and Optimization
33
Lecture 11. Component Tradeoffs

In heterogeneous computing environments, the
constituent functions and subsystems can be
implemented at various points along their
respective design space tradeoff curves.
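
A tradeoff curve of this kind can be extracted from a set of candidate implementations by keeping only the non-dominated points. The sketch below is a minimal illustration under assumptions: the (cost, time) pairs are invented, and the function name is hypothetical.

```python
def pareto_front(points):
    """Return the (cost, time) points not dominated by any other point.

    A point dominates another if it is no worse in both cost and time
    and strictly better in at least one of them.
    """
    front = []
    best_time = float("inf")
    for cost, time in sorted(set(points)):   # ascending cost, then time
        if time < best_time:                 # faster than every cheaper point
            front.append((cost, time))
            best_time = time
    return front

# Hypothetical implementations of one function: (cost units, time in ms)
candidates = [(1, 100), (2, 60), (3, 65), (4, 30), (6, 30), (8, 12)]
print(pareto_front(candidates))   # [(1, 100), (2, 60), (4, 30), (8, 12)]
```

The points (3, 65) and (6, 30) drop out because a cheaper implementation is at least as fast.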

34
Performance/Cost Tradeoffs
The Analysis of Processor-Time Trade-Off
Opportunities in a Reconfigurable Multi-Processor
System, H.A.E. Spaanenburg, Syracuse University,
1979
35
Lecture 12. Design Excursions (SPADE)

In the Leiden University approach [STEF02],
particular computational instances are
transformed by small perturbations in the
design space. These techniques support a system
designer in exploring alternative instances of an
application mapped onto an architecture template.
[STEF02] T. Stefanov, B. Kienhuis, E. Deprettere,
Algorithmic Transformation Techniques for
Efficient Exploration of Alternative Application
Instances, Proceedings 10th International
Symposium on Hardware/Software Codesign
(CODES'02), Estes Park, Colorado, May 6-8, 2002

36
The Y-chart extended with the Application
Transformation Layer [STEF02].
37
Alternative instances of the application have to
be generated, mapped onto the architecture
template and explored in order to evaluate the
performance of the Application-Architecture pair
[STEF02].
38
Simple example illustrating the unfolding and
skewing transformations [STEF02].
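
Unfolding and skewing can be shown on ordinary loops. The sketch below is a generic loop-level illustration and not the example from the cited paper: the recurrence, the unfolding factor of 2 and all function names are assumptions. Each transformed version computes the same result as its original.

```python
def total(xs):
    """Original loop: one sequential accumulation."""
    acc = 0
    for x in xs:
        acc += x
    return acc

def total_unfolded(xs):
    """Unfolded by 2: even and odd elements form two independent streams."""
    even = odd = 0
    for k in range(0, len(xs) - 1, 2):
        even += xs[k]
        odd += xs[k + 1]
    if len(xs) % 2:              # leftover element for odd lengths
        even += xs[-1]
    return even + odd

def reference(n):
    """Original loop nest: each cell depends on its upper and left neighbours."""
    a = [[1] * n for _ in range(n)]
    for i in range(1, n):
        for j in range(1, n):
            a[i][j] = a[i - 1][j] + a[i][j - 1]
    return a

def skewed(n):
    """Skewed order: walk anti-diagonals t = i + j.

    Cells on one anti-diagonal depend only on the previous one,
    so the inner loop carries no dependence and could run in parallel.
    """
    a = [[1] * n for _ in range(n)]
    for t in range(2, 2 * n - 1):
        for i in range(max(1, t - n + 1), min(n - 1, t - 1) + 1):
            j = t - i
            a[i][j] = a[i - 1][j] + a[i][j - 1]
    return a

print(total_unfolded(list(range(7))) == total(list(range(7))))   # True
print(skewed(5) == reference(5))                                  # True
```

Unfolding exposes parallel streams by replicating the loop body, while skewing reorders iterations so that independent ones line up.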
39
Lecture 13. Optimization (SPIRAL)

SPIRAL [PUCH05], a program generation technique
developed at Carnegie Mellon University,
automatically generates high-performance code
that is tuned to the given platform. SPIRAL
generates code for a broad set of DSP transforms
including the discrete Fourier transform, other
trigonometric transforms, filter transforms, and
discrete wavelet transforms.
[PUCH05] M. Püschel, J. Moura, J. Johnson, D.
Padua, M. Veloso, B. Singer, J. Xiong, F.
Franchetti, A. Gacic, Y. Voronenko, K. Chen, R.
W. Johnson, and N. Rizzolo, SPIRAL: Code
Generation for DSP Transforms, Proceedings of
the IEEE Special Issue on Program Generation,
Optimization, and Adaptation, Vol. 93, No. 2,
2005, pp. 232-275
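
The idea of tuning to the given platform can be sketched as "generate alternatives, measure them on the target, keep the fastest". The code below is not SPIRAL and does not use its search over formula space. It is a minimal stand-in with two hand-written DFT variants, and every function name in it is an assumption.

```python
import cmath
import timeit

def dft_naive(x):
    """Direct O(n^2) evaluation of the DFT definition."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * m * k / n) for k in range(n))
            for m in range(n)]

def fft_radix2(x):
    """Recursive Cooley-Tukey FFT; the length must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft_radix2(x[0::2]), fft_radix2(x[1::2])
    tw = [cmath.exp(-2j * cmath.pi * k / n) * odd[k] for k in range(n // 2)]
    return ([even[k] + tw[k] for k in range(n // 2)] +
            [even[k] - tw[k] for k in range(n // 2)])

def pick_fastest(candidates, x, repeats=3):
    """Time each candidate on this machine; return the fastest name and all timings."""
    timings = {name: min(timeit.repeat(lambda f=f: f(x), number=1, repeat=repeats))
               for name, f in candidates.items()}
    return min(timings, key=timings.get), timings

signal = [complex(i % 7, 0) for i in range(256)]
candidates = {"naive": dft_naive, "radix2": fft_radix2}
best, timings = pick_fastest(candidates, signal)
print(best, timings)
```

Both variants compute the same transform, so the choice between them is purely a performance decision made by measurement on the platform at hand.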

40
SPIRAL
Automates the
A library generator for highly optimized,
platform-adapted signal processing transforms
J. Moura et al, Generating Platform-Adapted DSP
Libraries Using SPIRAL, HPEC 2001
41
SPIRAL Methodology
(Figure: SPIRAL methodology, given a DSP transform
(DFT, DCT, wavelets, etc.) and given a computer
architecture.)
J. Moura et al, Generating Platform-Adapted DSP
Libraries Using SPIRAL, HPEC 2001
42
SPIRAL vs. FFTW (lower better)
(Figure: three panels comparing SPIRAL and FFTW
on Pentium III/Linux/gcc, Athlon/Linux/gcc and
Pentium III/Win2000/Intel compiler. Annotation:
comparable performance.)
J. Moura et al, Generating Platform-Adapted DSP
Libraries Using SPIRAL, HPEC 2001
43
Lecture 14. System Optimization

The total system solution can be evaluated for
the right combination of design space points for
its constituent elements.
Within the total system constraint, this
procedure provides an efficient process for
increasing benefits at the least incremental cost.
These procedures especially facilitate the
introduction of technology updates, since they
allow for the reestablishment of the proper
computational operating point for the combination
of the old and new technology.
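
One way to read "increasing benefits for the least incremental cost" is as a greedy marginal analysis over the component tradeoff curves. The sketch below is a heuristic written under stated assumptions, not the procedure from the cited thesis: system time is taken as the sum of component times, the component names and (cost, time) points are invented, and the function name is hypothetical.

```python
def optimize(components, budget):
    """Greedy marginal-analysis sketch.

    components: dict name -> list of (cost, time) points, sorted by
    strictly ascending cost and descending time (a tradeoff curve).
    Returns the chosen point index per component, total cost, total time.
    """
    choice = {name: 0 for name in components}          # start at cheapest points
    cost = sum(pts[0][0] for pts in components.values())
    while True:
        best = None
        for name, pts in components.items():
            i = choice[name]
            if i + 1 < len(pts):
                d_cost = pts[i + 1][0] - pts[i][0]
                d_time = pts[i][1] - pts[i + 1][1]
                gain = d_time / d_cost                 # time saved per unit cost
                if cost + d_cost <= budget and (best is None or gain > best[0]):
                    best = (gain, name, d_cost)
        if best is None:                               # no affordable upgrade left
            break
        _, name, d_cost = best
        choice[name] += 1
        cost += d_cost
    time = sum(components[n][i][1] for n, i in choice.items())
    return choice, cost, time

system = {
    "filter":    [(1, 40), (2, 25), (4, 20)],
    "fft":       [(2, 60), (4, 30), (8, 20)],
    "detection": [(1, 30), (3, 22)],
}
print(optimize(system, budget=10))
# ({'filter': 1, 'fft': 1, 'detection': 1}, 9, 77)
```

A greedy walk like this is not guaranteed to find the global optimum, but it can be rerun cheaply whenever a technology update adds new points to one component's curve.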

44
Processor-Time System Tradeoffs
The Analysis of Processor-Time Trade-Off
Opportunities in a Reconfigurable Multi-Processor
System, H.A.E. Spaanenburg, Syracuse University,
1979
45
Order-of-Magnitude Improvements
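Insertion of a next-level processor into an
embedded heterogeneous environment needs to
present an order-of-magnitude improvement
potential.
(Figure: MOPS per Kg.Watt on a logarithmic scale
(10, 100, 1000) versus time, with curves for
RISC, DSP, FPGA and ASIC and a point marked X.
Annotation: > 3.3 x 18 months ≈ 5 years.)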
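
One plausible reading of the "3.3 x 18 months" annotation is that an order-of-magnitude gain takes about 3.3 doublings. The arithmetic below assumes that performance doubles every 18 months. That doubling period is an assumption made here for illustration and is not stated on the slide, and the function name is hypothetical.

```python
import math

DOUBLING_PERIOD_MONTHS = 18   # assumption: performance doubles every 18 months

def months_for_gain(factor, doubling_months=DOUBLING_PERIOD_MONTHS):
    """Months of steady doubling needed to reach the given improvement factor."""
    return math.log2(factor) * doubling_months

doublings = math.log2(10)      # doublings needed for a 10x gain
months = months_for_gain(10)
print(round(doublings, 2), round(months, 1), round(months / 12, 1))
# 3.32 59.8 5.0
```

Under that assumption, a processor that offers less than a 10x gain would be overtaken by ordinary scaling of the existing technology within about five years.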
46
Lecture 15. Heterogeneous Systems

Heterogeneous processing systems currently
contain a continuum of processing alternatives
from general-purpose processors (GPP), to digital
signal processors (DSP), to Field-Programmable
Gate Arrays (FPGA) and Application-Specific
Integrated Circuits (ASIC).
The FPGA domain in particular has recently
produced its own range of architectural
alternatives along that processing continuum.

47
Not One Machine Does Everything
Since no single architecture can satisfy the
needs of all users, it has been desirable to have
a compute system whose architecture can be defined
and varied dynamically.
S.S. Reddi and E.A. Feustel, A Conceptual
Framework for Computer Architecture, Computing
Surveys, Vol. 8, No. 2, June 1976
(Figure: illustration labeled Top of Empire State
Building in New York, Top of Foshay Tower in
Minneapolis, and Airport (twice).)
48
Performance-Flexibility Trades
(Figure: energy efficiency in MOPS/mW (or MIPS/mW)
on a logarithmic scale from 0.1 to 1000 versus
flexibility (coverage), with a region labeled
Dedicated ASICs.)
Pleiades Ultra-Low Power Hybrid and
Reconfigurable Computing, Jan Rabaey, UC
Berkeley, 1999
49
Lecture 16. Upgrade/Updates, Technology Transparency

System developers must continue to reevaluate
which combination of implementation alternatives
will best meet their overall system requirements.
This question is not only important for the
initial design, but also for subsequent
technology updates and upgrades, especially when
they have to be implemented in the same
constrained real estate.

50
Upgrade/Update Approach
(Figure: flow from UML through a UML-to-SystemC
front-end to SystemC, then through a
SystemC-to-VHDL compiler to VHDL for each target,
and through a VHDL-to-FPGA synthesizer to a
design in Technology 1, e.g. Xilinx Virtex-4, or
a design in Technology 2, e.g. MathStar FPOA.)
51
Lecture 17. Virtualization

A virtual middleware architecture can be
carefully mapped onto an FPGA architecture.
This approach results in effective performance of
the virtual architecture, with maximum
parallelism and throughput.
For the system programmer, the virtual
(middleware) machine becomes the programming
environment.
Programming and code generation for the actual
virtual machine will make use of conventional
software tools, such as compilers and assemblers.

52
Virtual Middleware Concept
53
Virtual PSP Middleware Concept
54
Conclusion

In a recent interview with Electronics Weekly (9
May 2005), Wim Roelandts, president and CEO of
Xilinx, made the following observation:
"The next step is really to make FPGAs disappear.
Today our customers are hardware engineers. But
FPGAs are programmable devices. If we can create
a level of abstraction that appeals to software
engineers, we can increase our customer base by
at least 10x. That's really where our future is.
As long as you have a set of interfaces that you
can programme to, you don't have to know what the
hardware looks like."