Title: F5: FPLD Device Architectures and Tradeoffs
1. F5: FPLD Device Architectures and Tradeoffs
- Dr. Alan D. George
- Professor of ECE, University of Florida
- Ryan DeVille
- Ph.D. Student, University of Florida
2. Outline
- Project Goals, Motivations, Challenges
- Background and Related Research
- Project Team Members (faculty & students)
- Y1 Tasks
- Overview of Y1 Tasks
- Task 1 - Device architecture evaluation & down-select
- Task 2 - Testbed comparisons of strategic options
- Task 3 - Analytical performance & power modeling
- Task 4 - Preliminary exploration of new device options
- Y1 Milestones, Deliverables, Budget
- Conclusions, Member Benefits
3. Project Goals, Motivations, Challenges
- Project goal: develop fundamental research foundation for comparative analysis and insight on RC processing technologies for HPEC
- How do embedded processing technologies compare?
- What analytical/sim models are needed to comprehensively study differences?
- What is ideal FPLD to serve needs of broad range of HPEC apps?
- Project motivations: comprehensive tradeoff analysis to determine what future roadmap should be for FPLDs in HPEC
- Historical advances with FPLDs have tracked communications & networking market
- What changes are necessary to target broader scientific applications?
- HPEC systems usually require a strong awareness of perf/power
- How do FPLDs compare with other technologies over a suite of apps?
- Can architectural changes in FPGA improve this metric?
- Exciting new processing technologies are entering the market
- Can these technologies provide lessons to improve FPLD design?
- Project challenges
- Analytical modeling of resource, performance, & power characteristics
- Simulative analysis of new, emerging, and futuristic device architectures
- Testbed experimentation to calibrate analytical & simulative models
- Small-scale prototyping of promising innovations or hybrids
4. Background & Related Research
- HPEC showcases three primary types of processing technology
- Traditional FPLD devices (FPGAs)
- Provide fine-grain parallelism options
- Proven role in communication and DSP apps
- Bleeding edge of process technology (65 nm)
- Emerging non-traditional FPLD devices
- Attack processing needs from different angles
- FPOAs & FPMCs: coarser-grain logic elements for faster clock frequency
- SW-configurable RC processors: custom RC instructions exploit additional ILP
- May have potential to one day achieve lower design complexity than traditional FPLDs
- Classical CPUs
- Exploit varying levels of instruction-level parallelism
- Vector processing engines such as AltiVec provide acceleration capabilities
- DSPs have additional architectural elements for signal processing applications
- Novel emerging architectures (Cell, CoolThreads, multi-core PPC, etc.) target new methods for using inherent application parallelism
5. Background & Related Research
- Comprehensive comparisons between devices for tradeoff analysis
- Suitable benchmarks key for contrasting technologies
- Highlight technology suitability for aerospace/defense applications
- HPEC Challenge benchmark suite (http://www.ll.mit.edu/HPECchallenge/)
- Addresses important operations across DOD and image processing apps
- Contains 9 kernel and 1 multiprocessor application benchmarks
- Early work at UF successful in finding key micro-benchmark operations (Taylor series, FFT, I-FFT, etc.)
Speedup increases at a higher rate with increasing parallelism than with increasing frequency (Taylor series & PM)
- Need to explore all degrees of freedom
- Extended HPEC metrics
- Performance/power, fault tolerance capabilities, learning curve, etc.
- Clearly establish boundaries for device strengths/weaknesses
- Establish a relationship between device characteristics and their relative performance and power standings
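As an illustration of the kind of Taylor-series micro-benchmark mentioned above, the following is a minimal sketch, not the kernel used in the UF work. The function names, the choice of sin(x), and the default of 10 terms are all assumptions made for the example. It shows why such a kernel scales with parallelism: each sample is evaluated independently of the others, while the terms within one series depend on each other in this recurrence form.

```python
import math


def taylor_sin(x, terms=10):
    """Approximate sin(x) with the first `terms` terms of its Maclaurin series.

    Hypothetical helper for illustration; sin(x) and terms=10 are assumptions.
    """
    total = 0.0
    term = x  # first term of the series: x^1 / 1!
    for k in range(terms):
        total += term
        # Recurrence to the next term: multiply by -x^2 / ((2k+2)(2k+3)).
        term *= -x * x / ((2 * k + 2) * (2 * k + 3))
    return total


def taylor_kernel(samples, terms=10):
    """Apply the series to every sample.

    No sample depends on another, so the outer loop is the dimension a
    parallel device could replicate across processing elements.
    """
    return [taylor_sin(x, terms) for x in samples]


if __name__ == "__main__":
    inputs = [i / 10.0 for i in range(-10, 11)]
    worst = max(abs(a - math.sin(x)) for a, x in zip(taylor_kernel(inputs), inputs))
    print("largest absolute error over inputs:", worst)
```

The sequential recurrence inside `taylor_sin` keeps the arithmetic per term small; a hardware design could instead unroll or pipeline the terms, which is a separate degree of freedom from replicating the outer loop.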
6. Project Team Members
- Faculty members
- Dr. Alan D. George
- Professor of ECE
- Dr. Herman Lam
- Associate Professor of ECE
- Students
- Ryan DeVille, student project leader
- 3rd-year doctoral student in ECE
- Alumni Fellow
- Intern at PNNL
- Jason Ling
- MS student in ECE
- BS in ECE, University of Florida, 05/2006
- Intern at Honeywell
- TBD
- Optional third graduate student
- Undergraduate student (volunteer) TBD
7. Overview of Y1 Tasks
- Project focus: architecture tradeoffs between current FPLD and strategic embedded systems
- Determine relative performance strengths and weaknesses in FPLD and other processing technologies
- Extrapolate trends for COTS FPGAs and processors
- With particular awareness of suitability for HPEC
- Determine new components for novel FPLD architectures
- Y1 Tasks
- Task 1 - Device architecture evaluation & down-select
- Literature review, detailed architectural studies of select processing technologies
- Task 2 - Testbed comparisons of strategic options
- Broad range of experiments to establish device and system characteristics
- Task 3 - Analytical performance & power modeling
- Use architectural awareness to stress the device
- Framework of design and tradeoff rules to categorize strengths and weaknesses
- Task 4 - Preliminary exploration of new device options
- Set stage for follow-up annual project via simulation and/or prototyping
8. Task 1: Architecture Evaluation & Selection
- Comprehensive literature review of embedded processing technologies
- Academic studies and future architectures
- Where available, industry comparisons and analysis
- Detailed architectural study
- Determine level of detail needed to accurately model each processing technology
- Determine key pieces of each technology
- What makes it unique?
- How does that uniqueness translate into performance gains?
- What tradeoffs did the designers face?
- Use existing architectural modeling software to aid in study
- Rough models as needed to aid in determining tradeoffs
- Help in determining detail level required for analysis
9. Task 2: Testbed Comparisons
- Determine suite of benchmarks for analysis of strategic options
- Sponsor input on benchmark choices and devices/systems of prime interest
- Use of HPEC Challenge benchmark suite has already begun
- Early work has focused on developing key benchmarks on FPGA, FPOA, and AltiVec processing engines
- Pattern-matching kernel benchmark
- Searches for a signal signature across a database of spectra
- High computation/communication ratio
- Time- and frequency-domain FIR kernel benchmarks
- Signal filtering controlled by tap coefficients
- High-data-rate application implies low computation/communication ratio
- Conduct a broad range of testbed experiments with benchmark suite
- Delve deeper into device and system characteristics
- Use knowledge discovered in Task 1 to help locate strengths & weaknesses
- Generate data to ascertain inherent traits
- Aggressively use micro-benchmarks to further refine relationships
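To make the time- and frequency-domain FIR kernels listed above concrete, here is a minimal sketch in plain Python. It is not the HPEC Challenge reference code: the function names are hypothetical, and a naive O(N^2) DFT stands in for an FFT so the example has no external dependencies. It shows that both forms produce the same filtered output from the same tap coefficients, which is what allows the two kernels to be compared across devices.

```python
import cmath


def fir_time_domain(signal, taps):
    """Direct-form FIR: y[n] = sum over k of taps[k] * signal[n - k].

    Returns the full convolution (len(signal) + len(taps) - 1 outputs).
    """
    out = [0.0] * (len(signal) + len(taps) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(taps):
            out[n + k] += h * x
    return out


def dft(values, inverse=False):
    """Naive O(N^2) DFT (assumption: used here in place of a real FFT)."""
    n = len(values)
    sign = 1 if inverse else -1
    result = []
    for k in range(n):
        acc = 0j
        for t, v in enumerate(values):
            acc += v * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        result.append(acc / n if inverse else acc)
    return result


def fir_frequency_domain(signal, taps):
    """Zero-pad both inputs, multiply their spectra, and transform back."""
    n = len(signal) + len(taps) - 1
    padded_signal = list(signal) + [0.0] * (n - len(signal))
    padded_taps = list(taps) + [0.0] * (n - len(taps))
    spectrum = [a * b for a, b in zip(dft(padded_signal), dft(padded_taps))]
    return [value.real for value in dft(spectrum, inverse=True)]


if __name__ == "__main__":
    samples = [1.0, 2.0, 3.0, 4.0]
    coefficients = [0.5, 0.5]  # two-tap moving average, chosen for illustration
    print(fir_time_domain(samples, coefficients))
    print(fir_frequency_domain(samples, coefficients))
```

In the time-domain form each output costs roughly one multiply and one add per tap, so short filters on high-rate data do little arithmetic per sample moved, which is consistent with the low computation/communication ratio noted above.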
10. Tasks 3 & 4: Modeling, Analysis, Exploration
- Task 3 - Develop analytical performance/power models of devices
- Develop system and device models for further study
- Combine analysis from architectural evaluation and testbed experiments
- Use information gathered from early rough models to determine level of detail
- Validate models using benchmark experiments
- Verify that models accurately depict performance of processing technology
- Verify that other important metrics are within reasonable error bounds (e.g., power, area)
- Stress processors to clearly demonstrate strengths and weaknesses
- Depict best and worst case benchmark studies
- Determine issues at scale
- Does parallelism break down due to internal costs?
- Task 4 - Preliminary exploration of new device options
- Using lessons learned about tradeoffs in HPEC processing technologies
- Propose novel FPLD architecture to mitigate weaknesses, enhance strengths
- Based upon previous analytical models, estimate performance gains
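The Task 3 ideas of validating a model against measurements and checking whether parallelism breaks down at scale can be sketched as follows. This is a toy model under stated assumptions, not the project's model: the linear per-unit overhead term, the 10% error bound, and every function and parameter name are hypothetical placeholders.

```python
def modeled_time(work_ops, units, ops_per_sec_per_unit, overhead_per_unit_sec):
    """Predicted execution time in seconds.

    Assumption: ideal parallel compute time plus an internal cost that grows
    linearly with the number of additional parallel units.
    """
    compute = work_ops / (units * ops_per_sec_per_unit)
    overhead = overhead_per_unit_sec * (units - 1)
    return compute + overhead


def speedup(work_ops, units, ops_per_sec_per_unit, overhead_per_unit_sec):
    """Speedup of `units` parallel units relative to a single unit."""
    base = modeled_time(work_ops, 1, ops_per_sec_per_unit, overhead_per_unit_sec)
    scaled = modeled_time(work_ops, units, ops_per_sec_per_unit, overhead_per_unit_sec)
    return base / scaled


def relative_error(predicted, measured):
    """Relative error of a model prediction against a testbed measurement."""
    return abs(predicted - measured) / abs(measured)


def within_bound(predicted, measured, bound=0.10):
    """True if the prediction is within `bound` (placeholder: 10%) of the measurement."""
    return relative_error(predicted, measured) <= bound


if __name__ == "__main__":
    # Placeholder parameters, not measured data.
    work, rate, overhead = 1e9, 1e8, 0.1
    for units in (1, 2, 10, 100):
        print(units, round(speedup(work, units, rate, overhead), 2))
```

With these placeholder parameters the modeled speedup rises up to about 10 units and then falls as the overhead term dominates, which is the kind of breakdown the "issues at scale" question asks about. A real model would replace the linear overhead with terms derived from the architectural study and calibrated on the testbed.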
11. Y1 Milestones, Deliverables, Budget
- Milestones
- Finalize literature review & architectural down-select (Feb.07)
- Complete testbed comparisons of strategic options (Aug.07)
- Finalize & validate architectural models (Dec.07)
- Deliverables
- Midterm and final reports documenting research methods, progress, results, and analysis
- Performance and power models for determining device tradeoffs; code cores from benchmarks
- One or two scholarly conference and/or journal publications
- Budget
- 2-3 CHREC memberships
- 2 memberships allow completion of first three tasks
- 3 memberships allow completion of fourth task as well
12. Conclusions & Member Benefits
- Conclusions
- Project seeks to clearly identify tradeoffs between HPEC processing technologies
- Tradeoff analysis needed to determine suitability based upon classes of apps
- Broad range of experiments needed to encompass all degrees of freedom
- Project defined by tasks that:
- Comprehensively evaluate technologies through architectural studies
- Clearly contrast current processing architectures
- Exploit strengths and minimize weaknesses
- Perform a detailed benchmark analysis for real-world results
- Model current architectures for validation and larger system scale
- Exploit key architectural strengths for paper study of future FPLD designs
- Member Benefits
- Early access to detailed analysis of various processing domains for HPEC
- Direct input into recommendations for down-selects and future generations of FPLD technology
- Insight on architectural tradeoffs for future project design considerations