Title: Parallel FIR implementation
1 Project: IEEE P802.15 Working Group for Wireless Personal Area Networks (WPANs)
Submission Title: Fast FIR Filter structure
Date Submitted: 7 November 2003
Source: Michael Mc Laughlin, Roger Maher, Damien Nolan
Company: ParthusCeva Inc.
Address: 32-34 Harcourt Street, Dublin 2, Ireland.
Re: Fast FIR Filter structure
Abstract: A Fast Finite Impulse Response filter structure is analysed and gate counts are given for example implementations.
Purpose: Backup material for ParthusCeva/XSI/CRL TG3a PHY Proposal
Notice: This document has been prepared to assist the IEEE P802.15. It is offered as a basis for discussion and is not binding on the contributing individual(s) or organization(s). The material in this document is subject to change in form and content after further study. The contributor(s) reserve(s) the right to add, amend or withdraw material contained herein.
Release: The contributor acknowledges and accepts that these viewgraphs become the property of IEEE and may be made publicly available by P802.15.
2 FIR gate count for example real and complex FIR implementations
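3 Example Matched Filter Configuration
[Figure: matched filter block diagram. Coefficients Cn, Cn+1, ..., Cn+N, Cn+N+1 and delayed data samples Di, Di-1, ..., Di-N, Di-N-1 (signal widths labelled 4 and 1) feed a row of "4x" blocks, whose 4 bit outputs are summed in pairs by 4 bit adders, followed by 5 bit adders.]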
4 Serial real FIR implementation
Input rate: C x m (C = chip rate, m = over-sampling factor)
FIR1
Filter rate: C = 1368 MHz (too fast)
Decimated output rate: C
Real FIR allows up to 224 Mbps
5 Parallel real FIR implementation
Input rate: C x m
FIR0, FIR1, FIR2, ..., FIRn (in parallel)
Filter rate: C/n
Output rate: C
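The serial and parallel block diagrams can be compared with a small sketch. It assumes one plausible reading of the parallel diagram that the slide does not spell out: each of the n filters holds the full set of taps and computes every n-th decimated output, so each filter runs at C/n and the interleaved outputs form the rate-C stream. The function names, coefficients and input samples are hypothetical.

```python
def fir_output(h, x, i):
    """Direct-form FIR output at input index i: sum of h[t] * x[i - t]."""
    return sum(h[t] * x[i - t] for t in range(len(h)) if 0 <= i - t < len(x))


def serial_decimated(h, x, m):
    """One filter computes every m-th output (input rate C x m, output rate C)."""
    return [fir_output(h, x, j * m) for j in range(len(x) // m)]


def parallel_decimated(h, x, m, n):
    """n identical filters; filter k computes decimated outputs k, k+n, k+2n, ...

    Each filter therefore only needs to produce results at rate C/n.
    """
    n_out = len(x) // m
    y = [0] * n_out
    for k in range(n):
        branch = [fir_output(h, x, j * m) for j in range(k, n_out, n)]
        y[k::n] = branch  # interleave branch k back into the rate-C stream
    return y


# Hypothetical test data: 6 taps, 64 input samples, m = 4, n = 8.
h = [1, -2, 3, 1, -1, 2]
x = [(7 * i) % 11 - 5 for i in range(64)]
y_serial = serial_decimated(h, x, m=4)
y_parallel = parallel_decimated(h, x, m=4, n=8)
print(y_serial == y_parallel)
```

Under this reading the parallel form costs n times the taps of the serial form but lets each filter run n times slower.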
6 RTL Synthesis summary
- Circuit synthesised in RTL using 130 nm standard cell library
- Worst case conditions: Voltage, Temperature, Process
- Average gate count per real adder: 34
- Can be clocked at up to 193 MHz
  - worst case conditions
  - full result available in a single cycle
- Used standard Ripple Carry adders
- Further speed optimisation possible, e.g.
  - Introduce pipeline half way down adder tree
    - double speed, half adder gate count, same power
  - fast adders
    - find your own ones!
7 Filter rate
- C = 1368, m = 4
- n = 8 => Filter rate = 171 MHz
- (Synthesis shows 193 MHz possible in 130 nm)
- Taps per filter = 300 (Spread of 55 ns)
- Total number of taps = n x 300 = 2400
- No. 1st stage pseudo-adders (or gates) = 1200
- No. second stage adders (4 bit) = 600
- No. of rest of adders (third up to last stage) = 600
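The figures on this slide follow from simple arithmetic, sketched below. Two assumptions are made that the slide does not state: the taps are spaced at the input sample rate C x m (which reproduces the 55 ns spread), and each adder stage halves the number of values, so the stages after the second sum to just under a quarter of the tap count. Variable names are illustrative only.

```python
# Slide parameters.
C_MHZ = 1368            # chip rate C in MHz
M = 4                   # over-sampling factor m
N = 8                   # number of parallel filters n
TAPS_PER_FILTER = 300

filter_rate_mhz = C_MHZ / N          # each parallel FIR runs at C/n
input_rate_mhz = C_MHZ * M           # input sample rate C x m
# Assumption: one tap per input sample, so the taps span taps / (C x m).
spread_ns = TAPS_PER_FILTER / input_rate_mhz * 1e3

total_taps = N * TAPS_PER_FILTER
first_stage = total_taps // 2        # pseudo-adders, one per pair of taps
second_stage = total_taps // 4       # 4 bit adders, one per pair of stage-1 outputs
# Later stages halve again: T/8 + T/16 + ... approaches T/4 from below.
rest_upper_bound = total_taps // 4

print(f"filter rate  : {filter_rate_mhz:.0f} MHz")
print(f"tap spread   : {spread_ns:.1f} ns")
print(f"total taps   : {total_taps}")
print(f"1st stage    : {first_stage}")
print(f"2nd stage    : {second_stage}")
print(f"later stages : at most {rest_upper_bound}")
```

Note that 1368 / 8 is exactly 171 MHz, which is the clock quoted with the gate counts.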
8 Gate count
- Total no. adders = 1200
- Average gates/adder = 34
  - 20 gates for 4 bit adder
  - Bits per adder grow down the tree
  - Synthesis tools increase drive strength, increasing gate count
- Total Adder Gates = 40,800
- Other gates = 8,600
- Total gates = 49,400 (no pipeline, 171 MHz clock)
- Total gates = 37,600 (1 pipeline stage, 274 MHz clock)
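As a quick check, the no-pipeline total can be rebuilt from the adder count, the synthesised average of 34 gates per adder, and the "other gates" figure. This is only a sketch of the arithmetic; the slides do not give a breakdown for the pipelined total, so it is not derived here.

```python
ADDERS = 1200               # second-stage adders plus all later stages
AVG_GATES_PER_ADDER = 34    # average from synthesis
OTHER_GATES = 8600          # everything that is not an adder

adder_gates = ADDERS * AVG_GATES_PER_ADDER
total_gates = adder_gates + OTHER_GATES

print(f"adder gates: {adder_gates:,}")
print(f"total gates: {total_gates:,}")
```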
9 Serial complex FIR implementation
Input rate: C x m/2 (real)
Input rate: C x m/2 (imag)
FIR real, FIR imag, FIR imag, FIR real
Filter rate: C = 1368 MHz (too fast)
Decimated output rate: C (complex)
C = chip rate, m = over-sampling factor
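The four blocks in this diagram (FIR real, FIR imag, FIR imag, FIR real) match the usual way of building a complex filter from four real ones. The sketch below shows that decomposition at full rate, without decimation, and checks it against Python's built-in complex arithmetic. The function names and test values are hypothetical, and the sign convention is an assumption: a matched filter using conjugated coefficients would flip the signs of the imaginary-coefficient terms.

```python
def fir(h, x):
    """Real FIR: y[i] = sum of h[t] * x[i - t], with zeros before the first sample."""
    return [sum(h[t] * x[i - t] for t in range(len(h)) if i - t >= 0)
            for i in range(len(x))]


def complex_fir_from_real(h_re, h_im, x_re, x_im):
    """Complex FIR built from four real FIRs.

    (x_re + j x_im) * (h_re + j h_im)
        = (x_re*h_re - x_im*h_im) + j (x_re*h_im + x_im*h_re)
    """
    rr = fir(h_re, x_re)   # FIR real on the real input
    ii = fir(h_im, x_im)   # FIR imag on the imaginary input
    ri = fir(h_im, x_re)   # FIR imag on the real input
    ir = fir(h_re, x_im)   # FIR real on the imaginary input
    y_re = [a - b for a, b in zip(rr, ii)]
    y_im = [a + b for a, b in zip(ri, ir)]
    return y_re, y_im


# Hypothetical coefficients and samples.
h_re, h_im = [1, 2, -1], [0, 1, 3]
x_re, x_im = [2, -1, 0, 4], [1, 1, -2, 0]
y_re, y_im = complex_fir_from_real(h_re, h_im, x_re, x_im)

# Reference: the same convolution using complex numbers directly.
h = [complex(a, b) for a, b in zip(h_re, h_im)]
x = [complex(a, b) for a, b in zip(x_re, x_im)]
ref = [sum(h[t] * x[i - t] for t in range(len(h)) if i - t >= 0)
       for i in range(len(x))]
print(all(complex(a, b) == r for a, b, r in zip(y_re, y_im, ref)))
```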
10 Parallel complex FIR implementation
Input rate: C x m/2 (complex)
FIR 0 complex, FIR 1 complex, FIR 2 complex, ..., FIR n complex (in parallel)
Filter rate: C/n
Output rate: C
11 Filter rate
- C = 1368, m = 4
- n = 8 => Filter rate = 171 MHz
- (Synthesis shows 193 MHz possible in 130 nm)
- Taps per filter = 150 (Spread of 55 ns)
- Total number of taps = 4 x n x 150 = 4800
- No. 1st stage pseudo-adders (or gates) = 2400
- No. second stage adders (4 bit) = 1200
- No. of rest of adders (third up to last stage) = 1200
12 Gate count
- Total no. adders = 2400
- Average gates/adder = 34 (from synthesis results)
  - 20 for 4 bit adder
  - Bits per adder grow down the tree
  - Synthesis tools increase drive strength, increasing gate count
- Total Adder Gates = 81,600
- Other gates = 8,600
- Total gates = 90,200 (no pipeline, 171 MHz clock)
- Total gates = 64,600 (1 pipeline stage, 274 MHz clock)
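The complex-filter totals can be reproduced with a few lines of arithmetic. The sketch assumes that each complex filter is built from four real filters of 150 taps (the factor 4 in the tap count) and that the taps are spaced at the per-rail sample rate C x m/2, which gives the same 55 ns spread. Variable names are illustrative; the pipelined total is not derived because the slides give no breakdown for it.

```python
C_MHZ, M, N = 1368, 4, 8        # chip rate (MHz), over-sampling factor, parallel filters
TAPS_PER_FILTER = 150
REAL_FIRS_PER_COMPLEX = 4       # assumption: four real FIRs per complex FIR
AVG_GATES_PER_ADDER = 34
OTHER_GATES = 8600

total_taps = REAL_FIRS_PER_COMPLEX * N * TAPS_PER_FILTER
# Assumption: taps spaced at the per-rail input rate C x m/2.
spread_ns = TAPS_PER_FILTER / (C_MHZ * M / 2) * 1e3

second_stage = total_taps // 4          # 4 bit adders
later_stages = total_taps // 4          # upper bound used as the estimate
adders = second_stage + later_stages

adder_gates = adders * AVG_GATES_PER_ADDER
total_gates = adder_gates + OTHER_GATES

print(f"total taps : {total_taps}")
print(f"tap spread : {spread_ns:.1f} ns")
print(f"adders     : {adders}")
print(f"adder gates: {adder_gates:,}")
print(f"total gates: {total_gates:,}")
```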
13 Example bit rates