Title: Low-Frequency Pulsar Surveys and Supercomputing
1. Low-Frequency Pulsar Surveys and Supercomputing
2. Outline
- Baseband Instrumentation
- MultiBOB
- MWA survey vs PKS MB survey
- Data rates
- CPU times
- Low-Frequency Pulsar Monitoring
- Future Supercomputers
3. Pulsar Dedispersion
4. Coherent Dedispersion
- Unresolved on µs timescales
- From young or millisecond pulsars
- Power-law distribution of energies
PSR J0218+4232
5. J1022+1001 Pulsar Timing (Kramer et al.)
6. CPSR2 Timing (Hotan, Bailes & Ord)
7. Swinburne Baseband Recorders etc.
- 1998: Canadian S2 to computer (16 MHz x 2)
  - $100K system, video tapes
- 2000: CPSR
  - 20 MHz x 2, DLT7000 drives x 4
- 2002: CPSR2
  - 128 MHz x 2, real-time supercomputer (60 cores)
- 2006: DiFX (Deller, Tingay, Bailes & West)
  - Software correlator (ATNF adopted)
- 2007: APSR
  - 1024 MHz x 2, real-time supercomputer (160 cores)
- 2008: MultiBOB
  - 13 x 1024 ch x 64 µs, fibre, 1600-core supercomputer
8. dspsr software
- Mature
- Delivers <100 ns timing on selected pulsars
- Total power estimation every 8 µs with RFI excision
- Write a loader
- Can do
  - Giant pulse work
  - Pulsar searching (coherent filterbanks)
  - Pulsar timing/polarimetry
  - Interferometry with pulsar gating
9. PSRDADA (van Straten)
- psrdada.sourceforge.net
- Generic UDP data capture system (APSR/MultiBOB)
- Ring buffer(s) (see the sketch after this list)
- Can attach threads to fold/dedisperse etc.
- Hierarchical buffers
- Shares available CPU resources/disk
- Web-based control/monitoring
- Free! Hooks to dspsr & psrchive.
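The ring-buffer plus attached-threads design above is the heart of such a capture system. Below is a minimal, hypothetical Python sketch of that pattern; it is not the PSRDADA API (which is a C shared-memory library), and all names here are illustrative. A capture thread fills reusable blocks while a processing thread drains them.

```python
# Illustrative producer/consumer ring buffer in the spirit of PSRDADA's design.
# NOT the PSRDADA API; RingBuffer, capture_thread and process_thread are hypothetical.
import threading
import queue

class RingBuffer:
    """A fixed pool of reusable data blocks shared by one writer and one reader."""
    def __init__(self, nblocks=8, block_bytes=1 << 20):
        self.free = queue.Queue()      # empty blocks, ready to be filled
        self.full = queue.Queue()      # filled blocks, ready to be processed
        for _ in range(nblocks):
            self.free.put(bytearray(block_bytes))

def capture_thread(ring, packets):
    """Writer side, e.g. UDP capture: fill blocks as data arrive."""
    for pkt in packets:
        block = ring.free.get()        # stalls if the consumer falls behind
        block[:len(pkt)] = pkt
        ring.full.put(block)
    ring.full.put(None)                # end-of-data marker

def process_thread(ring):
    """Reader side: attach any consumer here (fold, dedisperse, write to disk...)."""
    while (block := ring.full.get()) is not None:
        _ = sum(block[:16])            # placeholder for real processing
        ring.free.put(block)           # recycle the block

ring = RingBuffer()
fake_packets = [bytes([i % 256]) * 8192 for i in range(32)]
writer = threading.Thread(target=capture_thread, args=(ring, fake_packets))
reader = threading.Thread(target=process_thread, args=(ring,))
writer.start(); reader.start(); writer.join(); reader.join()
```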
10. APSR
- Takes 8 Gb/s voltages
- Forms
  - 16 x 128 channels (with coherent dedispersion)
  - 4 Stokes, umpteen pulsars
- Real-time fold to DM 250 pc/cc
- O(100) ops/sample
- Sustaining >>100 Gflops (see the check after this list)
- $100K of computers
- June 2008
  - 192 MHz working @ 4 bits
  - 768 MHz working @ 2 bits
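A quick throughput check of those figures; the 2-bit packing of the 8 Gb/s voltage stream is an assumption made only for this sketch, consistent with the 768 MHz @ 2-bit mode above.

```python
# Rough APSR compute-rate check (illustrative; 2-bit samples assumed).
input_rate_bps  = 8e9                                   # 8 Gb/s of raw voltages
bits_per_sample = 2
samples_per_s   = input_rate_bps / bits_per_sample      # ~4 Gsamples/s
ops_per_sample  = 100                                   # O(100) ops/sample
print(f"{samples_per_s * ops_per_sample / 1e9:.0f} Gflops")  # ~400 Gflops >> 100 Gflops
```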
11. Coherent Dedispersion BW vs Time
[Plot: coherently dedispersed bandwidth (MHz) against year, 1998-2008, rising from 16-20 MHz (1998-2000) through 128 MHz (~$300K) to 1024 MHz (~$100K).]
12. Coherent Dedispersion
- Now trivial
- FFT ease ∝ B⁻²ν³ (the dedispersion kernel length grows as DM·B²/ν³)
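Since the whole operation is a Fourier-domain filter, a compact illustration helps. The following is a minimal numpy sketch of coherent dedispersion, not the dspsr implementation; the function name, parameters and toy usage are illustrative, and the sign of the chirp depends on the sideband/FFT convention.

```python
# Minimal coherent dedispersion sketch: multiply complex baseband voltages in the
# Fourier domain by the inverse of the interstellar dispersion transfer function.
import numpy as np

D_CONST = 4.148808e15    # dispersion constant in Hz^2 s (for DM in pc cm^-3)

def coherent_dedisperse(volts, dm, f0_hz, bw_hz):
    """volts: complex baseband samples at centre frequency f0_hz spanning bw_hz."""
    n = volts.size
    f = np.fft.fftfreq(n, d=1.0 / bw_hz)          # offset from centre frequency (Hz)
    phase = 2.0 * np.pi * D_CONST * dm * f**2 / (f0_hz**2 * (f0_hz + f))
    chirp = np.exp(1j * phase)                    # sign flips for the opposite sideband
    return np.fft.ifft(np.fft.fft(volts) * chirp)

# Toy usage: 2^20 noise samples across 128 MHz centred at 1400 MHz, DM = 50 pc/cc.
rng = np.random.default_rng(0)
x = rng.standard_normal(1 << 20) + 1j * rng.standard_normal(1 << 20)
y = coherent_dedisperse(x, dm=50.0, f0_hz=1400e6, bw_hz=128e6)
```

In practice the data are processed in overlapping blocks whose length must exceed the dispersion smearing time, which is where the B²/ν³ kernel-length scaling above comes from.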
13. MultiBOB
- High Resolution Universe Survey (PALFA of the South)
- Werthimer's iBOB boards
- 1024 channels, down to 10 µs sampling
- Two pols
- FPGA coding hard
- Use software gain equalizer/summer
- 5 MB/s per beam (see the data-rate check after this list)
- 1 Gb/s fibre to Swinburne (>1000 km of fibre)
- Real-time searching!
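A quick check that 13 beams fit down one 1 Gb/s link; the 2-bit sample depth assumed here is not stated above and is only for illustration.

```python
# MultiBOB data-rate check (illustrative; 2-bit filterbank samples are an assumption).
n_beams  = 13
n_chan   = 1024
t_samp_s = 64e-6          # 64 us sampling as used by the survey configurations
bits     = 2              # assumed bits per channel sample

beam_MBps  = n_chan / t_samp_s * bits / 8 / 1e6        # MB/s per beam
total_Gbps = n_beams * beam_MBps * 8 / 1e3             # Gb/s for all 13 beams
print(f"per beam: {beam_MBps:.1f} MB/s, total: {total_Gbps:.2f} Gb/s")
# ~4 MB/s per beam before headers (slide quotes 5 MB/s); well under the 1 Gb/s fibre
```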
14. New PKS MB Survey
- Bailes
  - 13 beams
  - 9 minutes/pointing
  - 1024 channels
  - 300 MHz BW
  - 64 µs sampling
  - ±15 deg
- Kramer
  - 13 beams
  - 70 minutes/pointing
  - 1024 channels
  - 300 MHz BW
  - 64 µs sampling
  - ±3.5 deg
- Johnston
  - 13 beams
  - 4.5 minutes/pointing
  - 1024 channels
  - 300 MHz BW
  - 32 µs sampling
  - The rest
15. MWA
- Samples
  - Takes (24 x 1.3 MHz ≈ 32 MHz) x 2 pols x 512 tiles
  - Just 32 GB/s (64 Gsamples/s)
- FFTs it
  - (5N log₂N ops per N-pt FFT → 2.2 Tflops)
- X: multiplies & adds
  - ((512 x 256) x B x 4 ≈ 16 TMACs)
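The arithmetic behind those F and X figures, as a rough sketch; the 128-point channelising FFT length is an assumption, since only the resulting 2.2 Tflops is quoted above.

```python
# MWA front-end compute estimate following the slide's numbers (illustrative).
import math

bw_hz      = 32e6                    # ~24 x 1.3 MHz coarse channels
n_inputs   = 512 * 2                 # 512 tiles x 2 polarisations
total_samp = 2 * bw_hz * n_inputs    # Nyquist-sampled: ~64 Gsamples/s

# F stage: channelising FFTs at ~5*log2(N) ops per sample (N assumed to be 128).
n_fft   = 128
f_flops = total_samp * 5 * math.log2(n_fft)        # ~2.2 Tflops

# X stage: one complex MAC per baseline, per polarisation product, per Hz of band.
n_baselines = 512 * 513 // 2                       # ~512 x 256 tile pairs
x_macs = n_baselines * 4 * bw_hz                   # ~16 TMACs
print(f"F: {f_flops/1e12:.1f} Tflops, X: {x_macs/1e12:.1f} TMACs")
```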
16. Sensitivity
[Figure: sensitivity comparison (folded factor).]
17. PKS vs MWA
- G: 3-5x better
- Tsys: 14x worse (?)
- B^1/2: 3x worse
- Flux: 25x better (1400 vs 200 MHz)
- t^1/2: 32x better
- Single-pulse work: comparable. Coherent search: 32x improvement!
- But there is a limit to the time you can observe a pulsar! 4 min vs 144 min → 5x deeper.
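A hedged sketch of how these factors combine through the radiometer equation, S_min ∝ Tsys / (G √(B t)); the numbers are taken straight from the list above, with the flux factor standing in for pulsars being intrinsically brighter at 200 MHz.

```python
# Relative MWA-vs-Parkes search sensitivity from the factors quoted above (illustrative).
gain_better   = 4.0     # G: 3-5x better (take ~4)
tsys_worse    = 14.0    # Tsys: 14x worse
sqrt_b_worse  = 3.0     # B^1/2: 3x worse
flux_better   = 25.0    # pulsars ~25x brighter at 200 MHz than at 1400 MHz
sqrt_t_better = 32.0    # t^1/2: ~sqrt(1000) from much longer dwell times

single_pulse = gain_better * flux_better / (tsys_worse * sqrt_b_worse)
print(f"single-pulse sensitivity ratio: ~{single_pulse:.1f}x (i.e. comparable)")
print(f"additional factor for a coherent periodicity search: ~{sqrt_t_better:.0f}x")
```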
18. Scattering (b = 0°)
19. Scattering (b = 5°)
20. Scattering (b = 30°)
21. Search instrumentation?
[Diagram: signal paths from the sampled volts (36 GB/s, 32 MHz, x 512 inputs). Imaging path: F → spectra → X (x 512 x 256) → visibilities → grid (x 192²) → uv → inverse 2D FFT. Pulsar path: filterbanks → dedisperse → fold/FFT → spectra → pulsars (<1 bit/s). Quoted intermediate rates: 36 GB/s, 1024 GB/s at 32 bits, 600 GB/s, 30 GB/s at 5 bits, 200 GB/s at 32 bits.]
22. Search Timings
- 36,000 coherent beams ((768 m / 4 m = 192)²)
- 36 gigapixels/s
- Dedisperse per CPU core
  - 1 gigapixel per 120 s
- 36 x 120 = 4320 cores ≈ 500 machines ≈ 250 kW
- N_FFT = 36,000 x 1024 (DMs) / 8192 s = 4608 FFTs/s
- Seek: 3 s per 8192 x 1024-pt FFT
- ~14,000 cores ≈ 1800 machines ≈ 1 MW (~$1M/yr)
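A small, hypothetical helper that reproduces the core-count arithmetic above (and is reused for the 32x32-tile case later). The per-core throughputs (1 gigapixel dedispersed per 120 s; 3 s per 8192x1024-point FFT) come from the slide; cores per machine and power per machine are assumptions chosen to match the Green Machine figures.

```python
# Rough search-cost estimator following the per-core figures quoted above (illustrative).
def search_cost(n_beams, pixrate_gpix_s, n_dms, t_point_s=8192,
                dedisp_s_per_gpix=120, fft_s=3.0,
                cores_per_machine=8, kw_per_machine=0.5):
    dedisp_cores = pixrate_gpix_s * dedisp_s_per_gpix
    fft_rate = n_beams * n_dms / t_point_s          # FFTs needed per second
    fft_cores = fft_rate * fft_s
    cores = dedisp_cores + fft_cores
    machines = cores / cores_per_machine
    return cores, machines, machines * kw_per_machine

# Full 768-m array: (768/4)^2 ~ 36,000 beams, ~36 gigapixels/s, 1024 trial DMs.
print(search_cost(n_beams=192**2, pixrate_gpix_s=36, n_dms=1024))
# -> ~4,300 dedispersion + ~14,000 FFT cores, i.e. of order 2,000 machines and ~1 MW
```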
23. Supercomputing @ Swinburne
The Green Machine
- installed May/June 2007
- 185 Dell PowerEdge 1950 nodes
  - 2 quad-core processors (Clovertown Intel Xeon, 64-bit, 2.33 GHz)
  - 16 GB RAM
  - 1 TB disk → 300 TB total
- 1640 cores / 14 Tflops
- dual-channel gigabit ethernet
- CentOS Linux OS
- job queue submission
- 20 Gb InfiniBand (Q1 2008)
- 83 kW vs 130 kW cooling
- Machines: $1.2M; fuel: $100K/yr
24. Search Times
- Depend only upon
  - Npixels x Nchans x Tsamp⁻¹
- Requires
  - No acceleration trials
- PSR J0437-4715
  - In 8192 s, small width from acceleration
25. Search Timings (32x32 tiles)
- 36,000 → 1024 coherent beams
- 36 → 1 gigapixels/s
- Dedisperse per core
  - 1 gigapixel per 120 s
- 1 x 120 = 120 cores ≈ 15 machines ≈ 7 kW
- N_FFT = 1024 x 1024 (DMs) / 8192 (s/FFT) = 128 FFTs/s
- Seek: 3 s per (8192 x 1024)-pt FFT
- 378 cores ≈ 50 machines ≈ 25 kW
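The same hypothetical search_cost helper from the slide-22 sketch reproduces these much smaller numbers (again a sketch, not the actual pipeline):

```python
# 32x32 tiles: 1024 coherent beams, ~1 gigapixel/s, 1024 trial DMs.
print(search_cost(n_beams=1024, pixrate_gpix_s=1, n_dms=1024))
# -> ~120 dedispersion cores + ~380 FFT cores; a few tens of machines, ~30 kW in total
```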
26. RRATs
- Log N - Log S (helps with long pointings)
- 1000x integration time
- Maybe a good RRAT finder
27. Monitoring?
28. Monitoring
29. Build Your Own Telescope?
- May be cheaper to build a dedicated PSR telescope than attempt to process everything from existing telescopes!
- 32x32 tiles (2D FFT → 1D FFT → dedisperse → FFT)
  - $2M telescopes
  - $2M beamformer/receivers
  - $1M correlator
  - $1M supercomputer
  - $1M construction
  - $7-8M total
30. Next-Gen Supercomputers (I/O or Tflops?)
- InfiniBand 20 Gb (40 Gb)
  - 288-port switch
  - 10 Tb/s I/O capacity ($1-2K/node)
- Teraflop CPU capacity per node (140 Gflops now)
  - Teraflop server or Tflop GPU?
  - 10 GB/s vs 76 GB/s
- Power (0.1 W/$)
  - $2M → 200 kW
31. Architecture (2011??)
[Diagram: proposed FX architecture built around two 288-port 40 Gb/s switches (~$300K each) feeding two 144 Tflop compute stages (~$1M each).]
32. Summary
- Strong motivation for multiple (100) tied-array beams
  - PSRs/deg²
- Surveys only possible with compact configurations at present
- Future supercomputers may allow a search even with MWA-like telescopes