Title: FFT VLSI Implementation
1. FFT VLSI Implementation
- VLSI Signal Processing
- Shousheng He and Mats Torkelson, A new approach
to pipeline FFT processor. IEEE Proc. of IPPS,
pp. 766-770, 1996.
- E. Bidet, D. Castelain, C. Joanblanq, and P.
Senn, A fast single-chip implementation of 8192
complex point FFT. IEEE J. Solid-State Circuits,
pp. 300-305, March 1995.
Updated on 4/2/2001
2. FFT Review
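As a refresher, the DFT and its radix-2 decimation-in-frequency (DIF) factorization can be sketched in a few lines of Python. This is only an illustrative software model, not the slide's original material; the names `dft` and `fft_dif` are made up for this sketch, and the input length is assumed to be a power of two.

```python
import cmath


def dft(x):
    """Direct O(N^2) DFT: X[k] = sum_n x[n] * W_N^(n*k), with W_N = exp(-2j*pi/N)."""
    n_pts = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / n_pts) for n in range(n_pts))
        for k in range(n_pts)
    ]


def fft_dif(x):
    """Recursive radix-2 DIF FFT (hypothetical helper; len(x) must be a power of two)."""
    n_pts = len(x)
    if n_pts == 1:
        return list(x)
    half = n_pts // 2
    # Butterfly: the sums feed the even-indexed outputs,
    # the twiddled differences feed the odd-indexed outputs.
    top = [x[n] + x[n + half] for n in range(half)]
    bot = [
        (x[n] - x[n + half]) * cmath.exp(-2j * cmath.pi * n / n_pts)
        for n in range(half)
    ]
    out = [0j] * n_pts
    out[0::2] = fft_dif(top)
    out[1::2] = fft_dif(bot)
    return out
```

Each recursion level corresponds to one butterfly stage, so an N-point transform has log2(N) stages of N/2 butterflies.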
3. Implementation --- Two Extreme Methods
- Slow <----------------- Speed -----------------> Fast
Small <------------------ Area ------------------> Large
Complicated <------------ Control ---------------> Simple
4. Design Considerations
- System requirements
- e.g., speed, area, power
- To trade off between these two extremes, we need
- More Processing Elements (PEs)
- Better Processing Element utilization rate
- Better control scheme
5. FFT Processor --- Block Diagram
6. Some Current Themes
Radix-2 Single-path Delay Feedback (N = 16)
Radix-2 Multi-path Delay Commutator (N = 16)
7. Some Current Themes (cont.)
8. Comparison
Radix / Speed: Low <-----------------------------------> High
- Control Theme
- Simple <-----------------------------------> Complex
Processing Ability / Unit: Low <-----------------------------------> High
Combine the advantages => further decompose the
high-radix PE
9. Decomposition Method (1)
- Simply reuse the repeated micro unit
A radix-4 PE
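One way to picture the reuse is to build the radix-4 butterfly from four copies of a radix-2 butterfly plus one trivial multiplication by -j. The sketch below is an assumption about what "micro unit" means here (a radix-2 add/subtract unit); the names `bf2`, `mul_neg_j`, and `radix4_pe` are hypothetical.

```python
def bf2(a, b):
    """Radix-2 butterfly (the repeated micro unit): returns (a + b, a - b)."""
    return a + b, a - b


def mul_neg_j(z):
    """Multiply by -j without a multiplier: swap real/imaginary parts, negate one."""
    z = complex(z)
    return complex(z.imag, -z.real)


def radix4_pe(x0, x1, x2, x3):
    """4-point DFT built from two columns of radix-2 butterflies."""
    # First column: combine samples that are two positions apart.
    a0, b0 = bf2(x0, x2)
    a1, b1 = bf2(x1, x3)
    # The only twiddle inside a radix-4 butterfly is the trivial factor -j.
    b1 = mul_neg_j(b1)
    # Second column: produces outputs in the order X0, X2 and X1, X3.
    big_x0, big_x2 = bf2(a0, a1)
    big_x1, big_x3 = bf2(b0, b1)
    return big_x0, big_x1, big_x2, big_x3
```

The same `bf2` unit appears four times, so a time-multiplexed design could share one or two physical units instead of instantiating a full radix-4 PE.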
10. Decomposition Method (2)
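For reference, the radix-2^2 decomposition can be summarised as below. This is a sketch following the derivation in the He and Torkelson paper cited on the title slide, assuming a decimation-in-frequency formulation with W_N = e^{-j 2 pi / N}; the index names n_1, n_2, n_3, k_1, k_2, k_3 and the symbol H are that paper's, not necessarily this slide's.

```latex
\begin{align}
X(k) &= \sum_{n=0}^{N-1} x(n)\, W_N^{nk}, \qquad W_N = e^{-j 2\pi / N} \\
n &= \left\langle \tfrac{N}{2} n_1 + \tfrac{N}{4} n_2 + n_3 \right\rangle_N, \qquad
k = \left\langle k_1 + 2 k_2 + 4 k_3 \right\rangle_N \\
X(k_1 + 2k_2 + 4k_3) &= \sum_{n_3=0}^{N/4-1}
  \left[ H(k_1, k_2, n_3)\, W_N^{n_3 (k_1 + 2 k_2)} \right] W_{N/4}^{n_3 k_3} \\
H(k_1, k_2, n_3) &= \left[ x(n_3) + (-1)^{k_1} x\!\left(n_3 + \tfrac{N}{2}\right) \right]
  + (-j)^{(k_1 + 2 k_2)} \left[ x\!\left(n_3 + \tfrac{N}{4}\right)
  + (-1)^{k_1} x\!\left(n_3 + \tfrac{3N}{4}\right) \right]
\end{align}
```

Here n_1, n_2, k_1, k_2 take the values 0 and 1, and n_3, k_3 run from 0 to N/4 - 1. The bracketed terms in H are plain radix-2 butterflies, the factor (-j)^{(k_1 + 2 k_2)} needs no multiplier, and the only full complex multiplication is the twiddle W_N^{n_3 (k_1 + 2 k_2)} after every second butterfly stage.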
11. Graphical Explanation (N = 16)
12. Graphical Explanation (cont.)
- The equations are equivalent to the operations below
13. Circuit of BF2I and BF2II
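The arithmetic performed by the two butterfly types can be sketched in software. This is a behavioural model only, under these assumptions: BF2I is a plain radix-2 butterfly, BF2II is a radix-2 butterfly that can first rotate its second input by -j, and the feedback delay lines and multiplexer control of the real circuits are ignored. The function names `bf2i`, `bf2ii`, `fft_r22`, and `dft_ref` are hypothetical.

```python
import cmath


def dft_ref(x):
    """Direct DFT, used only to check the sketch."""
    n_pts = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / n_pts) for n in range(n_pts))
        for k in range(n_pts)
    ]


def bf2i(a, b):
    """First butterfly type: plain add/subtract."""
    return a + b, a - b


def bf2ii(a, b, rotate):
    """Second butterfly type: optional trivial -j rotation, then add/subtract."""
    if rotate:
        b = complex(b.imag, -b.real)  # b * (-j) via real/imaginary swap
    return a + b, a - b


def fft_r22(x):
    """Radix-2^2 DIF FFT; len(x) is assumed to be a power of two."""
    x = [complex(v) for v in x]
    n_pts = len(x)
    if n_pts == 1:
        return x
    if n_pts == 2:
        return list(bf2i(x[0], x[1]))
    q = n_pts // 4
    branches = [[0j] * q for _ in range(4)]  # indexed by m = k1 + 2*k2
    for n3 in range(q):
        s0, d0 = bf2i(x[n3], x[n3 + 2 * q])       # sum: k1 = 0, difference: k1 = 1
        s1, d1 = bf2i(x[n3 + q], x[n3 + 3 * q])
        h0, h2 = bf2ii(s0, s1, rotate=False)       # k1 = 0, k2 = 0 and 1
        h1, h3 = bf2ii(d0, d1, rotate=True)        # k1 = 1, k2 = 0 and 1
        for m, h in enumerate((h0, h1, h2, h3)):
            # One full complex twiddle multiplication per pair of butterfly stages.
            branches[m][n3] = h * cmath.exp(-2j * cmath.pi * n3 * m / n_pts)
    out = [0j] * n_pts
    for m in range(4):
        out[m::4] = fft_r22(branches[m])  # output index k = m + 4*k3
    return out
```

For N = 16 the recursion passes through two BF2I/BF2II pairs, with a twiddle multiplication only between the pairs.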
14. Radix-2^2 Single-path Delay Feedback
FFT architecture using the above technique, for
N = 256
Compare with original architecture
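The comparison can be made concrete with a small calculator for hardware cost as a function of N. The formulas are the ones reported in the comparison table of the He and Torkelson paper cited on the title slide, as recalled here; treat them as an assumption and check them against the paper before relying on them. The function name `pipeline_costs` is hypothetical, and N is assumed to be a power of 4.

```python
def pipeline_costs(n_points):
    """Return {architecture: (complex multipliers, complex adders, memory words)}.

    Formulas assumed from He and Torkelson's comparison table.
    """
    stages = (n_points.bit_length() - 1) // 2  # log4(N) when N is a power of 4
    if n_points < 4 or 4 ** stages != n_points:
        raise ValueError("n_points must be a power of 4")
    s, n = stages, n_points
    return {
        "R2MDC": (2 * (s - 1), 4 * s, 3 * n // 2 - 2),
        "R4MDC": (3 * (s - 1), 8 * s, 5 * n // 2 - 4),
        "R2SDF": (2 * (s - 1), 4 * s, n - 1),
        "R4SDF": (s - 1, 8 * s, n - 1),
        "R4SDC": (s - 1, 3 * s, 2 * n - 2),
        "R2^2SDF": (s - 1, 4 * s, n - 1),
    }


if __name__ == "__main__":
    for name, (mults, adders, memory) in pipeline_costs(256).items():
        print(f"{name:8s} multipliers={mults:2d} adders={adders:2d} memory={memory}")
```

Under these formulas, for N = 256 the radix-2^2 SDF needs as few multipliers as the radix-4 SDF while keeping the adder count and memory size of the radix-2 SDF.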
15. Conclusions
- FFT applications: radar signal processing, fast
convolution, spectrum estimation, OFDM-based
modulation/demodulation.
- Efficient VLSI architectures (parallel
processing) are required for real-time
processing.
- However, most systems still employ DSP processors
(e.g., TI C3x/C5x) for computations (fast
algorithms like DIT and DIF FFT).
- VLIW (Very Long Instruction Word)-based
processors (TI C6x) need new programming skills
to utilize the two parallel MAC units.