Title: David Radford
1 Signal Decomposition Algorithm for GRETINA
2 Outline
- Introduction to the problem
- Candidate algorithms
- Recent progress
- Adaptive Grid Search
- Singular Value Decomposition
- Plans and conclusions
3 Signal Decomposition
- Digital pulse processing of segment data
- Uses data from both hit segments and image charges
- Extracts multiple interaction positions and energies
- Must allow for at least two interactions per hit segment
- Uses a set of calculated basis pulse shapes
- Done on a per-crystal basis
- Ideally suited to parallel processing
- Requires about 90% of total CPU cycles
- - The major processing bottleneck
- - Risk: baseline design allows 4 ms/crystal/CPU for decomposition
4 Event Processing
Event Building Data Flow (diagram):
- 36 segments per detector
- Segment events → Crystal Event Builder → Crystal events
- Crystal events → Signal Decomposition → Interaction points
- Interaction points (1-30 crystals) and Data from Auxiliary Detectors → Global Event Builder → Global Events
- Global Events → Tracking → Analysis and Archiving
5 Signal Decomposition
- Candidate algorithms
- Adaptive Grid Search
- Singular Value Decomposition
- Constrained Least-Squares / Sequential Quadratic Programming
- When work began in 2003, best algorithm was taking 7 s / segment / CPU
- - would require 10^5 CPUs
6 Signal Decomposition
Most hit crystals have one or two hit segments.
Most hit segments have one or two interactions.
CPU time goes as: AGS ~ 100^n, SVD ~ n, SQP ~ n, for n interactions.
7 Signal Decomposition: AGS
Adaptive Grid Search algorithm: Start on a coarse grid, to roughly localize the
interactions, then refine the grid close by.
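The coarse-then-fine idea can be illustrated with a minimal sketch for a single interaction. Everything here is an assumption for illustration: the function name `adaptive_grid_search`, the integer grid coordinates, and the caller-supplied `chi2` callable are hypothetical, and the toy `chi2` stands in for a real comparison against basis pulse shapes.

```python
from itertools import product


def adaptive_grid_search(chi2, lo, hi, coarse_step, fine_step=1):
    """Coarse-to-fine search for the grid point that minimises chi2.

    chi2        -- hypothetical callable taking an (x, y, z) tuple of grid coords
    lo, hi      -- inclusive integer bounds of the search volume, per axis
    coarse_step -- spacing of the coarse grid, in fine-grid units
    fine_step   -- spacing of the fine grid
    """
    # Pass 1: scan every point of the coarse grid and keep the best one.
    axes = [range(lo[k], hi[k] + 1, coarse_step) for k in range(3)]
    best = min(product(*axes), key=chi2)

    # Pass 2: examine the fine-grid neighbours of the current best point,
    # move whenever one of them fits better, and repeat until none does.
    offsets = [d for d in product((-fine_step, 0, fine_step), repeat=3) if any(d)]
    improved = True
    while improved:
        improved = False
        for d in offsets:
            cand = tuple(best[k] + d[k] for k in range(3))
            inside = all(lo[k] <= cand[k] <= hi[k] for k in range(3))
            if inside and chi2(cand) < chi2(best):
                best, improved = cand, True
    return best


# Toy usage: chi2 is the squared distance to a made-up "true" position.
target = (13, 9, 11)
toy_chi2 = lambda p: sum((p[k] - target[k]) ** 2 for k in range(3))
print(adaptive_grid_search(toy_chi2, (0, 0, 0), (20, 20, 20), coarse_step=2))
```

The coarse pass touches only a fraction of the grid points, and the refinement pass evaluates only a handful of neighbours per step, which is where the saving over an exhaustive fine-grid scan comes from.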
8 Signal Decomposition: SVD
Singular Value Decomposition algorithm
- Very roughly:
- Full Least Squares matrix is underdetermined (singular).
- But it can be decomposed into the product of three matrices, one of which contains the correlations (eigenvalues). By neglecting the small eigenvalues, the product can be inverted.
- Then an approximate fit can be obtained with very little computational effort, using a precalculated SVD inverse.
- The more eigenvalues kept, the higher the quality of the fit.
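In standard linear-algebra notation, the procedure described above corresponds to a truncated pseudoinverse. The symbols are illustrative assumptions, not taken from the slides: A is the matrix whose columns are the basis signals for each grid point, s is the measured signal vector, x holds one amplitude per grid point, and k is the number of retained singular values (what the slide calls eigenvalues).

```latex
\begin{aligned}
A &= U \,\Sigma\, V^{\mathsf T},
  \qquad \Sigma = \operatorname{diag}(\sigma_1 \ge \sigma_2 \ge \dots \ge 0), \\
A_k^{+} &= V_k \,\Sigma_k^{-1}\, U_k^{\mathsf T}
  \qquad \text{(only the } k \text{ largest } \sigma_i \text{ are kept)}, \\
\hat{x} &= A_k^{+}\, s .
\end{aligned}
```

Because A depends only on the basis, the truncated inverse can be computed once in advance; each event then costs a matrix-vector product. Dropping the small singular values avoids dividing by near-zero numbers, at the price of an approximate rather than exact fit.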
9 Signal Decomposition: SVD
Singular Value Decomposition algorithm (early work, LBNL)
Results of SVD in three dimensions for two
interactions indicated by red squares. The upper
row shows obtained probability distributions
taking all possible positions into account. The
bottom row shows the results after limiting the
radius to the regions between the green lines.
The final result agrees well with the input
positions.
10 AGS
- First efforts completed in 2003
- Adaptive Grid Search, followed by constrained least-squares
- Grid search in position only; energy fractions are fitted (see following slides)
- 1 - 2 interactions per hit segment
- Excellent results for single-segment events
- Converges for 100% of events
- Reproduces positions of simulated events to < ½ mm
- Very fast: 7 ms/event/CPU (2 GHz P4)
11 AGS + SQP
Example events. Blue: measured; Red: fitted.
12 Some Math
13 Math (cont'd)
14 AGS
Some numbers for adaptive grid search:
- 35000 grid points in 1/6 crystal (one column, 1x1x1 mm)
- 2x2x2 mm (slices 1-3) or 3x3x3 mm (slices 4-6) coarse grid gives N ≈ 600 coarse grid points per segment.
- For two interactions in one segment, have N(N-1)/2 ≈ 1.8 x 10^5 pairs of points for grid search. This takes 3 ms/CPU to run through.
- But (N(N-1)/2)^2 ≈ 3.2 x 10^10 combinations for two interactions in each of 2 segments: totally unfeasible!
- Limit N to only 64 points; then (N(N-1)/2)^2 ≈ 4 x 10^6.
- But (N(N-1)/2)^3 ≈ 8 x 10^9 combinations for two interactions in each of 3 segments: still impossible.
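The combination counts quoted on this slide follow from simple counting of unordered pairs of grid points. A short check, as a sketch (the function names are illustrative):

```python
def pair_count(n_points):
    """Number of unordered pairs of distinct grid points, N(N-1)/2."""
    return n_points * (n_points - 1) // 2


def combinations(n_points, n_segments):
    """Position combinations for two interactions in each of n_segments segments."""
    return pair_count(n_points) ** n_segments


# (grid points per segment, number of hit segments), as discussed on the slide
for n_points, n_segments in [(600, 1), (600, 2), (64, 2), (64, 3)]:
    total = combinations(n_points, n_segments)
    print(f"N={n_points}, segments={n_segments}: {total:.1e} combinations")
```

If one assumes the cost per combination stays at the level implied by 1.8 x 10^5 pairs in 3 ms, the 3.2 x 10^10 case would take roughly nine minutes per event. That figure is an extrapolation under this assumption, not a number from the slides.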
15 AGS (cont'd)
Adaptive grid search fitting:
- Energies e_i and e_j are constrained, such that 0.1(e_i + e_j) ≤ e_i ≤ 0.9(e_i + e_j).
- Once the best pair of positions (lowest χ²) is found, then all neighbor pairs are examined on the finer (1x1x1 mm) grid. This is 9x9 = 81 pairs. If any of them are better, the procedure is repeated.
- For this latter procedure, the summed signal-products cannot be precalculated.
- Finally, nonlinear least-squares (SQP) can be used to interpolate off the grid. This improves the fit 50% of the time.
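For a fixed pair of grid positions, fitting the energies is a linear problem, which is why the grid search needs to run over positions only. The sketch below makes several assumptions that are not stated on the slides: the total energy E = e_i + e_j is taken as known, the basis signals are normalized to unit energy, and all names (`fit_energy_fraction`, `basis_i`, `basis_j`) are hypothetical.

```python
def dot(a, b):
    """Sum of element-wise products of two equal-length signals."""
    return sum(x * y for x, y in zip(a, b))


def fit_energy_fraction(signal, basis_i, basis_j, total_energy=1.0):
    """Fit f in signal ~ E * (f * b_i + (1 - f) * b_j) by least squares.

    Returns (f, chi2), with f clamped to [0.1, 0.9] so that
    0.1 * E <= e_i <= 0.9 * E, where e_i = f * E.
    """
    diff = [bi - bj for bi, bj in zip(basis_i, basis_j)]
    resid = [s - total_energy * bj for s, bj in zip(signal, basis_j)]
    denom = total_energy * dot(diff, diff)
    # Unconstrained minimum of the quadratic chi2(f); fall back to an even
    # split if the two basis signals are identical.
    f = dot(resid, diff) / denom if denom > 0 else 0.5
    # chi2 is quadratic in f, so clamping gives the constrained minimum.
    f = min(0.9, max(0.1, f))
    model = [total_energy * (f * bi + (1 - f) * bj)
             for bi, bj in zip(basis_i, basis_j)]
    chi2 = sum((s - m) ** 2 for s, m in zip(signal, model))
    return f, chi2


# Toy usage with made-up three-sample "signals".
b_i, b_j = [1.0, 0.0, 0.5], [0.0, 1.0, 0.5]
measured = [0.3 * x + 0.7 * y for x, y in zip(b_i, b_j)]
print(fit_energy_fraction(measured, b_i, b_j))
```

Expanding the dot products in this fit leaves only sums of products between the measured signal and the basis signals, and between pairs of basis signals. One plausible reading of the "summed signal-products" remark is that the basis-basis sums can be tabulated ahead of time for the coarse grid, but not for every fine-grid pair.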
16 AGS: Recent progress
- Code has been completely rewritten in C
- - Translated from FORTRAN
- Cleaner, easier to develop and maintain
- Results verified to be identical, and slightly faster
- Extended to handle two-segment events
- - Up to four interactions total
- Starts with one interaction per hit segment, then adds interactions
- - AGS again followed by constrained least-squares
- - Converges for 100% of events
- Excellent speed: 3-8 ms/event/CPU for 1 seg., 15-25 ms/event/CPU for 2 seg. (2 GHz P4)
17 SVD
- DOE SBIR (Phase I) with Tech-X Corp.
- Funded to investigate alternative algorithms
- Started with Singular Value Decomposition
- Used signal basis developed for AGS
- Two-step SVD
- 2 mm grid (50 eigenvalues) to localize interaction region, followed by 1 mm grid (200 e.v.) over reduced space
- Works perfectly for single interaction
- Currently tested for up to 3x2 interactions
- Results certainly good enough for input to SQP
- CPU time linear in number of interactions
- 6 ms/segment/CPU (2GHz G5)
18 New SVD Results
- 2D projections of SVD amplitudes
- Interaction sites at (13,9,11) and (8,11,11)
19 Current Concept
- 1 segment: AGS + SQP
- 2 segments: AGS + SQP or SVD + SQP
- 3 segments: SVD + SQP
- Need to include fitting of variable event time
- Cylindrical coordinate system (rather than Cartesian coordinates used presently)
- - should save time and improve accuracy
- - constraints programmed into SQP
- Need to develop good metrics for performance
20 To-Do List
- Try to include three-segment events in AGS?
- Continue SVD development with Tech-X Corp
- SVD → least-squares
- SVD → grid search → least-squares
- Replace Cartesian coordinate system with cylindrical or quasi-cylindrical coords (begun)
- Allow for variation in event start time
- Allow for occasional three interactions/segment?
- Compare reliability of AGS and SVD results
- Examine failure modes in detail, develop metrics
- Deal with irregular-hexagonal detectors
21 Conclusion
- Excellent progress made over past 12 months
- AGS for two-segment events
- SVD development with Tech-X
- - looks very promising for 2 or more segments
- Final algorithm speed should be sufficient
- - Moore's Law over 3-4 years