A Multipurpose Signal-Processing Architecture - PowerPoint PPT Presentation

Provided by: setiathom

A Multipurpose Signal-Processing Architecture
Parsons, A., Backer, D., Werthimer, D., Wright, M.
Applications
Platform-Independent, Parameterized Libraries
Overview
Our group seeks to revolutionize the development
of radio astronomy signal processing
instrumentation by designing and demonstrating a
scalable, upgradeable, FPGA-based computing
platform and software design methodology that
targets a range of real-time radio telescope
signal processing applications. This project
relies on the development of a small number of
modular, connectible, upgradeable hardware
components and platform-independent signal
processing algorithms and libraries which can be
reused and scaled as hardware capabilities
expand. We have developed such a hardware
platform and many of the necessary signal
processing libraries for applications in antenna
array correlation, wide-band spectroscopy, and
pulsar surveys.
128 Million Channel SETI Spectrometer
We have developed a SETI spectrometer for use by
NASA's Jet Propulsion Laboratory. This
application required the analysis of a selectable
200 MHz band at under 2 Hz resolution for
anomalous narrow-band emission. This 128 Million
channel spectrometer was implemented on an IBOB
module connected to one BEE2 module. Although
the requirements of this application did not tax
the computational resources of these two boards,
it was nonetheless an important demonstrator of
the design flow and hardware connectivity which
are instrumental to the rapid development of
applications on our modular hardware. In the
development of this spectrometer, every hardware
component and interface on every board was
exercised. The reusability of the FFT and PFB
libraries, which were originally developed for
our SERENDIP V architecture, was also
demonstrated. This instrument is currently in
operation at NASA's Goldstone Deep Space
Communications Complex.
Biplex-Pipelined FFTs
We have developed a dual-pipelined architecture
for an FPGA-based FFT. By buffering only as many
incoming samples as are necessary and reusing
resources in the FFT butterflies, our design
performs two complex FFTs simultaneously, handles
a constant data stream, and uses approximately
1/6 of the resources of the standard FFT design
in an FPGA. Our design is parameterizable for
length and for real/complex FFTs.
Modular, FPGA-based System Hardware
Each BEE2 processor board can compute at 500
Gops/sec and has 180 Gbits/sec of I/O. Each board
has five FPGAs in a star network, 40 GBytes of
RAM, and eighteen 10 Gbit Ethernet ports. The
boards can be connected together like a Beowulf
cluster; 40 boards fit into a rack and yield
about 20 Teraops/sec.
32-Station FX Correlator
We are in the process of developing a 32-station,
full-Stokes FX correlator for a new dipole
antenna array being developed by UC Berkeley's
Don Backer and NRAO's Rich Bradley for probing
the period of early structure formation in our
universe, the Epoch of Reionization (EoR). Our
prototype imager demonstrates innovative and
crucial technologies needed for the next
generation of radio telescopes such as the ATA,
CARMA, the SKA, and the next generation EoR
array.
The first phase of development was to implement
an 8-antenna correlator on 4 IBOBs and 1 BEE2
which bypassed the need for inter-IBOB spectrum
synchronization by using IBOB compute resources
only for digital band extraction (mixing,
filtering, and decimation). PFB spectral
decomposition for the antennas was moved into
four FPGAs on the BEE2, and correlation was
performed in the fifth. Our next step will be to
provide for the synchronization of spectra
between IBOBs, so that the PFB and following
matrix transpose can be moved into the IBOB's
FPGA and ZBT SRAM, respectively. Once this has
been achieved, we will be able to attain up to
8096 spectral channels, and the recovered BEE2
compute resources will allow us to correlate a
maximum of 32 antennas before we require another
BEE2 module to meet the bandwidth requirements
between the IBOB's PFB and the BEE2's X Engines.
After 32 antennas, our next step will be to
develop a fully packetized, switched correlator
to provide for the interconnect between the IBOBs
and multiple BEE2s.
Polyphase Filter Banks
The Polyphase Filter Bank (PFB) is an efficient
algorithm based on the FFT which provides a
significant improvement in signal-to-noise
and out-of-band rejection at a modest cost in
buffering and computational resources. The
steepness, passband ripple, and filter width are
all selectable parameters in our library.
The PFB front-end interfaces as an add-on to our
FFT architecture, essentially extending the
window of data which is input to the FFT in order
to improve filter shapes.
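The idea of a PFB front end feeding an FFT can be sketched in a few lines of software. This is only an illustrative model, not the FPGA library described in this poster: the Hamming-windowed sinc prototype filter, the function names, and the use of a plain DFT in place of a pipelined FFT are all assumptions made for the example. A block of n_taps * n_chan samples is weighted by the prototype filter, folded down to n_chan samples by summing the taps, and then transformed.

```python
import cmath
import math


def pfb_coefficients(n_chan, n_taps):
    """Hamming-windowed sinc prototype filter (an assumed, common design)."""
    length = n_chan * n_taps
    coeffs = []
    for i in range(length):
        # Sinc whose first nulls sit one channel-length either side of center,
        # which gives a passband roughly one channel wide.
        x = (i - (length - 1) / 2.0) / n_chan
        sinc = 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * i / (length - 1))
        coeffs.append(sinc * window)
    return coeffs


def dft(samples):
    """Direct O(N^2) DFT, standing in for the FFT stage."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def pfb_spectrum(block, n_chan, n_taps):
    """Weight n_taps * n_chan samples, fold them to n_chan, then transform."""
    coeffs = pfb_coefficients(n_chan, n_taps)
    folded = [
        sum(block[p * n_chan + c] * coeffs[p * n_chan + c] for p in range(n_taps))
        for c in range(n_chan)
    ]
    return dft(folded)
```

In this model the extra cost over a bare transform is the buffer of n_taps * n_chan samples and one multiply-add per buffered sample, which corresponds to the buffering and computational cost mentioned above. Changing n_taps or the window changes the filter steepness and ripple.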
Correlator X-Engines
With the quantity of new arrays under
development, the need for current/future
correlators is tremendous. The trend in these
arrays is toward large numbers of antennas,
requiring careful attention to the scalability of
parameterized designs. In collaboration with Lynn
Urry of UC Berkeley's RAL, we have developed and
implemented a parameterized module for computing
and accumulating baselines in an FX correlation
architecture. Our implementation of an X Engine
correlator delays antenna streams to match the
baselines which are to be cross-multiplied.
Multipliers are multiplexed between different
baselines to avoid invalid computations and thus
attain 100% multiplier efficiency. This design
has the feature that it may be divided across
chips or boards between any stage. Applying our X
Engine architecture to a general FX correlator,
we see how an arbitrarily large correlation may
be mapped into multiple hardware modules.
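What the X stage computes can be modeled in software as below. This is a sketch under stated assumptions, not the delay-and-multiplex hardware pipeline described above: the function names and the frame layout (frame[antenna][channel]) are hypothetical, and the loop simply multiplies each antenna's spectrum by the conjugate of every other antenna's spectrum and accumulates the products per baseline and channel.

```python
from itertools import combinations_with_replacement


def baseline_pairs(n_ant):
    """All antenna pairs (i, j) with i <= j, autocorrelations included."""
    return list(combinations_with_replacement(range(n_ant), 2))


def x_engine(spectra_frames, n_ant, n_chan):
    """Accumulate x_i * conj(x_j) for every baseline and frequency channel.

    spectra_frames: iterable of frames, where frame[a][c] is the complex
    spectrum sample of antenna a in channel c (the output of the F stage).
    """
    acc = {pair: [0j] * n_chan for pair in baseline_pairs(n_ant)}
    for frame in spectra_frames:
        for (i, j), vis in acc.items():
            for c in range(n_chan):
                vis[c] += frame[i][c] * frame[j][c].conjugate()
    return acc
```

The number of baselines grows as n_ant * (n_ant + 1) / 2 when autocorrelations are counted, so 8 antennas give 36 baselines and 32 antennas give 528. This quadratic growth is why scalability matters for arrays with large numbers of antennas.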
The IBOB contains a single FPGA and ZBT SRAM for
data preprocessing and packetization into the 10
Gbit Ethernet format.
The ADC Board mates directly with the IBOB and
can digitize a single stream at up to 2
Gsamples/sec, or two streams at 1 Gsample/sec.
This architecture uses three easily replaceable,
upgradeable hardware modules connected with as
many identical modules as necessary to meet the
computational requirements of an application
("computing by the yard").
Towards a Beowulf-like, Packetized Processing Engine
Using packetized data routed through commercially
available switches, our proposed architecture
will look like a Beowulf cluster with
reconfigurable, modular hardware in the place of
CPUs. This architecture uses replaceable,
upgradeable hardware modules connected with as
many identical modules as necessary to meet the
computational requirements of an application.
This architecture can use multicast switches to
allow many commensal experiments to run
simultaneously on the same FPGA compute cluster.
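The "as many identical modules as necessary" idea comes down to simple sizing arithmetic. The sketch below uses only the BEE2 figures quoted in this poster (500 Gops/sec per board, 40 boards per rack, eighteen 10 Gbit ports per board). The function names are hypothetical, and it assumes ideal linear scaling with no interconnect or efficiency losses, which a real design would need to account for.

```python
import math

BEE2_GOPS_PER_BOARD = 500   # compute per board, as quoted in the poster
BEE2_BOARDS_PER_RACK = 40   # boards per rack, as quoted in the poster
BEE2_PORTS_PER_BOARD = 18   # 10 Gbit Ethernet ports per board
PORT_GBITS = 10             # line rate of each port in Gbit/s


def boards_needed(required_gops):
    """Boards required for a compute load, assuming ideal linear scaling."""
    return math.ceil(required_gops / BEE2_GOPS_PER_BOARD)


def racks_needed(required_gops):
    """Racks required to house that many boards."""
    return math.ceil(boards_needed(required_gops) / BEE2_BOARDS_PER_RACK)


def rack_teraops():
    """Aggregate compute of one full rack in Teraops/sec."""
    return BEE2_GOPS_PER_BOARD * BEE2_BOARDS_PER_RACK / 1000


def board_io_gbits():
    """Aggregate Ethernet I/O of one board in Gbit/s."""
    return BEE2_PORTS_PER_BOARD * PORT_GBITS
```

With these figures a full rack works out to 20 Teraops/sec and each board to 180 Gbit/s of I/O, matching the numbers stated for the BEE2.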
