Tiny Triplet Finder - PowerPoint PPT Presentation

About This Presentation
Title:

Tiny Triplet Finder


Slides: 36
Provided by: jywu2
Learn more at: http://www-ppd.fnal.gov
Transcript and Presenter's Notes

Title: Tiny Triplet Finder


1
Tiny Triplet Finder
  • Jinyuan Wu, Z. Shi
  • Dec. 2003

2
Hits, Hit Data and Triplets
  • Hit data come out of the detector planes in
    random order.
  • Hit data from 3 planes generated by the same
    particle track are organized together to form
    triplets.

3
Triplet Finding
  • Three data items must satisfy the condition
    xA + xC = 2 xB.
  • A total of N^3 combinations must be checked (e.g.
    5 x 5 x 5 = 125).
  • Three layers of loops if the process is
    implemented in software.
  • Large silicon resource may be needed without
    careful planning.
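
The three-loop software approach mentioned above can be sketched as follows. This is only an illustration: the function name, the integer hit coordinates and the `tolerance` parameter are assumptions, not part of the presentation.

```python
def find_triplets_brute_force(hits_a, hits_b, hits_c, tolerance=0):
    """Check every (A, B, C) combination against xA + xC = 2 * xB."""
    triplets = []
    checked = 0
    for xa in hits_a:              # loop 1: plane A
        for xb in hits_b:          # loop 2: plane B
            for xc in hits_c:      # loop 3: plane C
                checked += 1
                if abs(xa + xc - 2 * xb) <= tolerance:
                    triplets.append((xa, xb, xc))
    return triplets, checked


# Hypothetical hit lists, 5 hits per plane.
hits_a = [3, 10, 20, 31, 40]
hits_b = [5, 12, 22, 30, 41]
hits_c = [7, 14, 24, 29, 45]
triplets, checked = find_triplets_brute_force(hits_a, hits_b, hits_c)
print(checked, triplets)
```

With 5 hits per plane the inner test runs 125 times, which is the N^3 growth the slide refers to.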

[Figure: hits on Plane A, Plane B and Plane C]
4
Tiny Triplet Finder (Animation)
(1) Fill in hits to the x1 and x3 planes.
Only x1 and x3 are implemented; the x2 plane is
for eye-guide only.
(2) Cycle through x2 hits.
Line matched.
[Figure: planes x1, x2 and x3]
5
Tiny Triplet Finder
Bit-wise Logic Block
Bit Array/Shifter A
Bit Array/Shifter C
Hash Sorter A
Hash Sorter C
6
TTF Operations, Phase I: Filling Bit Arrays
Bit Array/Shifters (note the flipped bit order)
  • xA + xC = 2 xB
  • xA = -xC + constant

Physical Planes
For any hit, fill a corresponding logic cell.
7
TTF Operations, Phase II: Making Match
Physical Planes
For any center plane hit:
Bit Array/Shifters
Logically shift the bit array.
Perform bit-wise AND in this range.
Triplet is found.
8
TTF Operations, Phase II: Making Match (Next Hit)
Physical Planes
Loop to next center plane hit.
Bit Array/Shifters
Logically shift the bit array.
Triplet is found through bit-wise AND.
A fake triplet may exist. It will be cut out in the
later stages.
9
TTF Operations, Phase II: Making Match (More)
Physical Planes
Loop to next center plane hit.
Bit Array/Shifters
Logically shift the bit array.
Triplet is found through bit-wise AND.
10
TTF Operations, Phase II: Making Match (and More)
Physical Planes
Loop to next center plane hit.
Bit Array/Shifters
Logically shift the bit array.
Triplet is found through bit-wise AND.
11
TTF Operations, Phase II: Making Match (Last Hit)
Physical Planes
Loop to next center plane hit.
Bit Array/Shifters
Logically shift the bit array.
Triplet is found through bit-wise AND.
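
The two phases can be modelled in software roughly as below. This is a minimal sketch, not the FPGA design: Python integers stand in for the bit arrays, hits are assumed to be already binned into 0..63, and all function names are hypothetical. Plane C is stored in flipped bit order so that, for a center hit at bin b, a single shift lines up every (a, c) pair with a + c = 2b.

```python
NBINS = 64
MASK = (1 << NBINS) - 1


def fill_bit_arrays(hits_a, hits_c):
    """Phase I: set one bit per hit; plane C uses flipped bit order."""
    array_a = 0
    array_c = 0
    for a in hits_a:
        array_a |= 1 << a
    for c in hits_c:
        array_c |= 1 << (NBINS - 1 - c)
    return array_a, array_c


def match_center_hit(array_a, array_c, b):
    """Phase II: shift the flipped C array, then bit-wise AND with A."""
    shift = NBINS - 1 - 2 * b
    if shift >= 0:
        aligned_c = array_c >> shift
    else:
        aligned_c = (array_c << -shift) & MASK
    matched = array_a & aligned_c
    # Every set bit a marks a candidate triplet (a, b, 2b - a).
    return [(a, b, 2 * b - a) for a in range(NBINS) if (matched >> a) & 1]


def find_triplets(hits_a, hits_b, hits_c):
    array_a, array_c = fill_bit_arrays(hits_a, hits_c)
    found = []
    for b in hits_b:               # loop over center plane hits only
        found.extend(match_center_hit(array_a, array_c, b))
    return found


found = find_triplets([3, 10, 20, 31, 40],
                      [5, 12, 22, 30, 41],
                      [7, 14, 24, 29, 45])
print(found)
```

Only one loop over the center plane hits remains; the work of the two inner loops is done by the shift and the AND.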
12
Multiple Hits and Triplets: Keep Them All
  • Each bin may be filled with more than one hit.
  • Hit data are kept in hash sorters, which allow
    multiple hits per bin.
  • There may be more than one match in each
    bit-wise AND operation.
  • They are all sent out, one by one, to the later
    stages for fine cut and arbitration processes.
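
As a rough software analogy of keeping multiple hits per bin, the sketch below groups hits by bin and expands one matched bin pair into every hit combination. The bin width, the dictionary-based storage and the function names are assumptions for illustration; the actual hash sorter is a hardware block.

```python
from collections import defaultdict
from itertools import product

BIN_WIDTH = 8  # assumed number of fine position units per bin


def hash_sort(hits):
    """Group hits by bin number, keeping every hit in a bin."""
    bins = defaultdict(list)
    for x in hits:
        bins[x // BIN_WIDTH].append(x)
    return bins


def expand_match(sorter_a, sorter_c, bin_a, bin_c, hit_b):
    """Yield, one by one, every hit combination behind one matched bin pair."""
    for xa, xc in product(sorter_a.get(bin_a, []), sorter_c.get(bin_c, [])):
        yield (xa, hit_b, xc)


sorter_a = hash_sort([16, 21, 40])   # bin 2 holds two hits
sorter_c = hash_sort([33, 38])       # bin 4 holds two hits
candidates = list(expand_match(sorter_a, sorter_c, 2, 4, 27))
print(candidates)
```

All four candidates are passed on; deciding which of them are real is left to the later fine cut and arbitration stages.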

13
Boundary Issues: Beyond Just Bit-wise AND
  • When the track hits near the boundary of a bin,
    simple bit-wise AND may miss the triplet.
  • The bit-wise OR-AND logic will cover the
    boundary.
  • The logic cells in most of today's FPGAs have 4
    inputs, so the OR-AND bit-wise logic doesn't
    increase resource usage.
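
One way to picture the OR-AND logic is shown below. It is a sketch under an assumption the slides do not state: that the OR covers one neighbouring bin on each side of the aligned bin. Under that assumption each output bit depends on one A bit and three C bits, i.e. 4 inputs.

```python
NBINS = 16
MASK = (1 << NBINS) - 1


def and_match(array_a, aligned_c):
    """Plain bit-wise AND: both hits must fall in exactly aligned bins."""
    return array_a & aligned_c


def or_and_match(array_a, aligned_c):
    """OR each C bit with its two neighbours, then AND with A."""
    widened = (aligned_c | (aligned_c << 1) | (aligned_c >> 1)) & MASK
    return array_a & widened


array_a = 1 << 5       # A hit in bin 5
aligned_c = 1 << 6     # C hit lands one bin off after alignment
print(and_match(array_a, aligned_c), or_and_match(array_a, aligned_c))
```

The plain AND returns 0 for this boundary case, while the OR-AND version still flags bin 5.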

14
Tiny Triplet Finder Block Diagram
[Figure: block diagram. Block and signal labels:
BitLogic, DA, KA, A, XB, C, KC, EN, Pop, Halt.
Resource annotations: 77 LUT 66 FF (twice),
172 LUT FF, 192 LUT 111 FF, 92 LUT FF.]
15
Schematics
16
Simulation
(1) Filling hits on Plane A and C.
(2) Looping over hits on Plane B.
The extra combination halts the pipeline for
proper operation.
This combination is a fake triplet.
Pipelined internal operations
Triplets are grouped together.
17
Logic Cell Usage
  • Both 64- and 128-bit TTF designs fit a $100 FPGA
    comfortably.
  • A simple 64-bit Hough transform design is shown
    for scale.
  • A $1200 FPGA is shown for scale.

18
Pentlet Finding: Beyond Just Bit-wise AND
  • Use 4 bit arrays.
  • There are 3 constraints in total.
  • More constraints help to eliminate fake tracks.
  • It is possible to use bit-wise majority logic
    (such as 3-out-of-4) to accommodate detector
    inefficiency issues.
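
A 3-out-of-4 bit-wise majority can be written with plain AND and OR operations, as sketched below. The four inputs are assumed to be bit arrays that have already been shifted into alignment; the function names and test patterns are hypothetical.

```python
def all_4_of_4(w, x, y, z):
    """Strict match: every aligned array must have the bit set."""
    return w & x & y & z


def majority_3_of_4(w, x, y, z):
    """Match if at least 3 of the 4 aligned arrays have the bit set."""
    return (w & x & y) | (w & x & z) | (w & y & z) | (x & y & z)


# Four aligned 4-bit patterns (bit 3 on the left).
w, x, y, z = 0b1010, 0b1110, 0b0110, 0b1011
strict = all_4_of_4(w, x, y, z)
relaxed = majority_3_of_4(w, x, y, z)
print(format(strict, "04b"), format(relaxed, "04b"))
```

Bit 3 is missing in one array only, so the majority logic keeps it while the strict AND drops it.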

[Figure: Planes A, B, C, D and E]
19
Comparison of Baseline and Tiny Triplet Finder
[Figure: data-flow diagrams, shown side by side for
the two designs. Inputs: Station N-1, Station N and
Station N+1, bend and non-bend views, each buffered
through FIFOs. Stage labels: AM Long Doublet,
AM Triplet, AM Short Doublet (three times),
Hash Sorter (twice), FTF Triplet,
Hash Sorter Short Doublet (three times).]
20
The End
  • Thanks

21
Short Doublet Stage: 1 of 3 Identical Ones
[Figure: block diagram. Blocks: Station N-1
Non-bend, FIFO, FIFO, Hash Sorter Short Doublet.
Resource annotation: 77 LUT 66 FF.
Triplets In fields: y3y (83bits), x3y (7bits),
y2y (83bits), x2y (7bits), y1y (83bits),
x1y (7bits), E, P, StnID, HitMap, H.
Triplets Out fields: y3y (83bits), x3y (7bits),
y2y (83bits), x2y (7bits), x1x (83bits),
Dy1(400u), y1y (83bits), Dx1(400u), E, P, StnID,
HitMap, H.]
22
Hash Sorter: Virtex-II Implementation
[Figure: schematic of the Virtex-II implementation.
Signal labels: POP, POPQ, RDY, REPOP, WT, MD, EVeq,
EV(9:0), IdK(7:0), IdKQ(7:0), EV(8:0), K(8:0),
DOB(27:0), DI(27:0), DIQ(27:0), IdN(7:0), PUSH,
PUSHQ, DUMPD, IdP(7:0), IdQ(7:0), SR, CLK.]
23
Hash Sorter: Cyclone Implementation
[Figure: schematic of the Cyclone implementation.
Signal labels: POP, POPQ, RDY, REPOP, WT, MD, EVeq,
EV(8:0), IdK(6:0), IdKQ(6:0), EV(7:0), K(7:0),
DOB(28:0), DI(28:0), DIQ(28:0), IdN(6:0), PUSH,
PUSHQ, DUMPD, IdP(6:0), IdQ(6:0), SR, CLK.]
24
Bit Array/Shifter: Xilinx Implementation
  • Each Bit Array/Shifter has 64 bins.
  • Each bin is a 16-bit RAM (look-up table).
  • The RAM operates like a D Flip-flop.
  • Each xc2v1000 FPGA has 10,240 such RAMs.

25
Bit Array/Shifter: Filling the Hits
  • Each hit is written into 8 to 16 different RAMs
    with different addresses.
  • One clock cycle (not 8 to 16) is used to fill a
    hit.

26
Bit Array/Shifter: Shift Reading
  • With given addresses of the RAMs, hit patterns
    appear at the outputs of the arrays.
  • Changing addresses shifts the hit patterns.
  • One array shifts in units of 1 bin, the other in
    units of 8 bins.
  • The relative shift of the two patterns covers 128
    bins.
  • One clock cycle is used for a reading, regardless
    of the distance of the relative shifting.
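
The RAM-per-bin idea can be modelled in software as below. This is a behavioural sketch with several assumptions: each list entry stands for one small RAM, the fine array uses 8 addresses with a step of 1 bin, the coarse array uses 16 addresses with a step of 8 bins in the opposite direction, and the class and function names are hypothetical. In hardware all the writes for one hit would go to different RAMs in the same clock cycle; the Python loop only models which RAMs are touched.

```python
class BitArrayShifter:
    def __init__(self, nbins, step, n_addr):
        self.nbins = nbins
        self.step = step          # bins shifted per address increment
        self.n_addr = n_addr      # number of addresses used per RAM
        self.rams = [0] * nbins   # one small 1-bit-wide RAM per bin

    def write_hit(self, position):
        """Write one hit into a different RAM for each address."""
        written = 0
        for addr in range(self.n_addr):
            bin_index = position - addr * self.step
            if 0 <= bin_index < self.nbins:
                self.rams[bin_index] |= 1 << addr
                written += 1
        return written

    def read(self, addr):
        """Read every RAM at the same address: the shifted hit pattern."""
        pattern = 0
        for bin_index, ram in enumerate(self.rams):
            if (ram >> addr) & 1:
                pattern |= 1 << bin_index
        return pattern


def split_shift(relative_shift):
    """Split a relative shift of 0..127 into (coarse, fine) addresses."""
    return divmod(relative_shift, 8)


fine = BitArrayShifter(nbins=64, step=1, n_addr=8)
coarse = BitArrayShifter(nbins=64, step=-8, n_addr=16)
rams_written = fine.write_hit(24)
coarse.write_hit(5)

# The two hits are 19 bins apart: 19 = 2 * 8 + 3.
q, r = split_shift(24 - 5)
aligned = fine.read(r) & coarse.read(q)
print(rams_written, (q, r), aligned != 0)
```

A single read with the right pair of addresses lines the two hits up, whatever the distance between them.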

27
Bit Array/Shifter Comments: Xilinx or Not
  • Writing and reading operations need only a
    single clock cycle.
  • Storage and shifter are combined: a big resource
    saving.
  • A maximum of 16 clock cycles is needed for
    resetting.
  • Non-Xilinx implementation: an additional
    64 x 8 = 512 LEs may be needed, but resetting
    needs only one clock cycle.

28
Challenge on Triplet Finding (1): The Facts in
Road Language
  • A triplet has two parameters.
  • With two hits in two planes, two parameters can
    be determined. The 3rd hit in the 3rd plane will
    provide the first constraint.
  • Assume each parameter is sliced into 64 bins.
    (That's not too many.)
  • There are 64 x 64 = 4096 roads, i.e., possible
    track configurations.
  • Any hit can belong to 64 possible roads. Two
    hits in two planes have one common road. Three
    hits must have one common road to be in a triplet.
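
The road counting above can be checked with a few lines of code. The sketch assumes, for illustration only, that a road is labelled by the pair (bin in plane A, bin in plane C) and that a center-plane bin b is consistent with a road when a + c = 2b.

```python
NBINS = 64

# A road is labelled here by (bin in plane A, bin in plane C).
roads = {(a, c) for a in range(NBINS) for c in range(NBINS)}


def roads_through_a(a):
    """All roads compatible with a hit in bin a of plane A."""
    return {(a, c) for c in range(NBINS)}


def roads_through_c(c):
    """All roads compatible with a hit in bin c of plane C."""
    return {(a, c) for a in range(NBINS)}


def third_hit_on_road(road, b):
    """The plane B hit supplies the constraint a + c = 2b."""
    a, c = road
    return a + c == 2 * b


common = roads_through_a(10) & roads_through_c(30)
print(len(roads), len(roads_through_a(10)), common)
```

The hit in plane A and the hit in plane C each select 64 roads, and the two sets share exactly one road, which a plane B hit in bin 20 would confirm.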

29
Challenge on Triplet Finding (2): A Possible
Implementation Using CAM
  • It is possible to use (static)
    content-addressable memory (CAM) to implement
    triplet finding.
  • Hit patterns are pre-stored in the CAM.
  • When a hit pattern matches a pre-stored one, the
    address represents the road.
  • Each of 3 planes has 64 possible hit locations.
    (64 x 64 = 4096 roads)
  • Assuming one road can generate one hit pattern, a
    CAM array with 64 x 3 = 192 input bits and 4096
    words is needed.
  • It can be done with 128 small CAMs (32-bit,
    32-word). EPC20K1000E has 160. (80%)
  • Oops: boundary issues have not been considered.
    (More patterns?)

[Figure: a road through Plane A, Plane B and Plane C]
30
Challenge on Triplet Finding (3): A Possible
Implementation Using Hough Transformation
  • Book a 2-D (64 x 64) histogram. Each bin of the
    histogram represents a road.
  • Each hit from plane A, C or B increments the
    counts of the 64 bins in a row, column or
    diagonal.
  • The bins with count 3 are valid triplet roads.
  • Each histogram bin is a counter plus selection
    logic.
  • Assume each bin can be implemented with 4 Altera
    logic elements (LEs).
  • 4096 bins need 16,384 LEs; EPC20K1000E has
    38,400. (16,384/38,400 = 43%)
  • However, finding the bins with count 3 and
    encoding them is not trivial.
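
A software version of this histogram method might look like the sketch below. It assumes hits are already binned into 0..63, that a plane B hit in bin b fills the diagonal a + c = 2b (clipped at the histogram edges), and that duplicate hits in a bin are counted once; the function name is hypothetical.

```python
NBINS = 64


def hough_triplet_roads(hits_a, hits_b, hits_c):
    """Fill a 64 x 64 histogram and return the bins with count 3."""
    hist = [[0] * NBINS for _ in range(NBINS)]
    for a in set(hits_a):              # plane A hit: one full row
        for c in range(NBINS):
            hist[a][c] += 1
    for c in set(hits_c):              # plane C hit: one full column
        for a in range(NBINS):
            hist[a][c] += 1
    for b in set(hits_b):              # plane B hit: diagonal a + c = 2b
        for a in range(NBINS):
            c = 2 * b - a
            if 0 <= c < NBINS:
                hist[a][c] += 1
    # Scanning all 4096 bins is the step that is costly in hardware.
    return [(a, c) for a in range(NBINS) for c in range(NBINS)
            if hist[a][c] == 3]


valid_roads = hough_triplet_roads([3, 10, 20], [5, 12, 30], [7, 14, 40])
print(valid_roads)
```

The final scan over every bin is trivial in software but corresponds to the encoding problem noted in the last bullet.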

31
Silicon Usage (Estimate)
  • The Tiny Triplet Finder (TTF) silicon usage is
    acceptable.
  • The hash sorter silicon usage is acceptable.
  • The baseline, if not implemented automatically,
    could be reduced in size a lot.

32
[Figure labels: NBin, DOB, DI, IdNext]
33
(No Transcript)
34
[Figure: grid of values 0 to 7; the layout is not
recoverable from the transcript.]
35
[Figure: grid of values 0 to 7; the layout is not
recoverable from the transcript.]