Title: SLHC Design Considerations for the CSC TrackFinder
1 SLHC Design Considerations for the CSC Track-Finder: An Asynchronous Trigger Proposal
- Darin Acosta and Alex Madorsky
- University of Florida
2 Outline
- CSC Muon Trigger
- Brief review of the CSC Track-Finder for CMS
- CSC BX assignment issues
- CSC occupancy estimates
- Some obvious changes for SLHC operation
- Interesting R&D Directions
- Xilinx RocketIO X technology
- Proposal to go asynchronous at Level-1
- After BX assignment, of course
3 CSC Muon Trigger Scheme
[Block diagram of the EMU trigger chain. Labels:]
- On-chamber trigger primitives: strip FE cards, wire FE cards, wire LCT card, Trigger Motherboard (TMB, UCLA); 2 μ / chamber
- Muon Port Card (MPC, Rice); 3 μ / port card, sent over optical links
- In counting house: Sector Receiver / Processor (SR/SP, U. Florida), 3-D track-finding and measurement; 3 μ / sector
- CSC Muon Sorter (Rice); 4 μ
- RPC Interface Module (RIM)
- DT and RPC; 4 μ each
- Global μ Trigger: combination of all 3 muon systems; 4 μ to Global L1
4 CSC Track-Finder Crate
Single crate solution, 2nd generation prototypes
under test
[Crate layout diagram: Clock Control Board (CCB), 12 Sector Receiver / Sector Processor boards (SR/SP), Muon Sorter (MS), and SBS 620 controller. Each SP has inputs from the MPCs (chambers 1A, 1B, 2, 3, 4) and an output to DAQ.]
- 180 × 1.6 Gbit/s optical links
- Data clocked in parallel at 80 MHz in 2 frames (effective 40 MHz)
- Custom 6U GTLP backplane for interconnections (mostly 80 MHz)
- Rear transition cards with 40 MHz LVDS SCSI cables to/from DT
5 SP2002 Main Board (SR Logic)
[Board photo. Labels: Phi Local LUT, Phi Global LUT, Eta Global LUT, PLL patch, TLK2501 transceiver, Front FPGA, SR logic, to/from custom GTLP backplane]
- Optical transceivers
- 15 × 1.6 Gbit/s links
6 SP Trigger Logic
- Xilinx Virtex-2 XC2V4000, 800 user I/O
- Same mezzanine card is used for Muon Sorter
- Track-Finding logic operates at 40 MHz
- Frequency of track stub data from optical links
- Easily upgradeable path
SP2002 mezzanine card
7 Track-Finding Latency
11 × 25 ns, or 275 ns
8 CSC BX Assignment
- CSC time resolution
- 60 ns maximum drift time per plane, 6 planes per chamber → 5 ns chamber resolution (ALCT takes time from 2nd hit)
- Not likely to change for SLHC
- Centered peak
- For time distribution centered in BX interval (plot panels: LHC, SLHC)
- Probability to get LCT in correct BX
- LHC: 98.8%
- SLHC: 80% (will clearly need to consider multi-BX)
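The two probabilities quoted above can be roughly reproduced with a short sketch. It assumes (this is not stated on the slide) that the LCT time is Gaussian with the 5 ns chamber resolution as its sigma and is centered in the BX interval; the function name `prob_correct_bx` is illustrative only.

```python
import math

def prob_correct_bx(sigma_ns: float, bx_ns: float) -> float:
    """P(|t| < bx/2) for a Gaussian arrival time centered in the BX interval.

    Assumption: Gaussian timing with sigma equal to the chamber resolution.
    """
    half_window = bx_ns / 2.0
    return math.erf(half_window / (sigma_ns * math.sqrt(2.0)))

# 5 ns resolution, 25 ns (LHC) and 12.5 ns (SLHC) bunch spacing
p_lhc = prob_correct_bx(5.0, 25.0)
p_slhc = prob_correct_bx(5.0, 12.5)

print(f"LHC  (25 ns BX):   {100 * p_lhc:.1f}%")
print(f"SLHC (12.5 ns BX): {100 * p_slhc:.1f}%")
```

Under this assumption the LHC value comes out near 98.8% and the SLHC value near 79%, consistent with the rounded 80% quoted above.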
9 From Andrey: TB99 Data
10 Other CSC Time Scenarios for SLHC
- Bifurcated peak
- Fit 98.8% of LCTs in 2 BX window, but ambiguity on which BX muon belongs
- Offset peak by ±3 ns
- Puts 97% of LCTs in 2 BX window
- 70% in central BX
- 27% or 3% in following BX (choice depends on BX algorithm for Track-Finder)
[Plots: bifurcated peak over BX1, BX2 with 49.4% in each BX and 0.6% in each tail; offset peak over BX1, BX2, BX3 with fractions 70%, 27%, and 3%]
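As a cross-check of the fractions on this slide, the sketch below integrates a Gaussian over consecutive 12.5 ns BX intervals. The Gaussian shape, the 5 ns sigma, and the helper names `norm_cdf` and `bx_fractions` are assumptions for illustration, not taken from the slides.

```python
import math

def norm_cdf(x: float, mean: float, sigma: float) -> float:
    """Cumulative distribution of a Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sigma * math.sqrt(2.0))))

def bx_fractions(mean_ns, sigma_ns, edges_ns):
    """Fraction of LCT times falling between consecutive BX edges."""
    cdf = [norm_cdf(e, mean_ns, sigma_ns) for e in edges_ns]
    return [hi - lo for lo, hi in zip(cdf, cdf[1:])]

SIGMA = 5.0   # ns, assumed Gaussian chamber resolution
BX = 12.5     # ns, SLHC bunch spacing

# Bifurcated peak: mean sits on the boundary between two BX
bifurcated = bx_fractions(0.0, SIGMA, [-BX, 0.0, BX])

# Offset peak: mean shifted 3 ns late from the center of the central BX.
# Entries are [preceding BX, central BX, following BX].
offset = bx_fractions(3.0, SIGMA, [-1.5 * BX, -0.5 * BX, 0.5 * BX, 1.5 * BX])

print("bifurcated:", [f"{100 * f:.1f}%" for f in bifurcated])
print("offset:    ", [f"{100 * f:.1f}%" for f in offset])
```

With these assumptions the bifurcated case gives about 49.4% per BX, and the offset case gives roughly 71% central, 26% following, and 3% preceding. The small difference from the quoted 27% suggests the exact offset or time distribution behind the slide differs slightly from this model.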
11 Track-Finder BX Assignment
- Current CSC Track-Finder takes earliest arriving LCT as definition of track BX over a multi-BX window
- LHC
- 2-station tracks
- Centered peak: 98.8% correct BX assignment
- SLHC
- 2-station tracks
- Offset peak: 87%
- Centered peak: 80%
- 3-station tracks
- Offset peak: 89%
- Centered peak: 73%
- 2 BX window (25 ns) for these efficiencies
- 88% BX i.d. efficiency with offset peak and taking earliest arriving LCT, two or more stations
Offset peak gives best performance
12 Alternative Track-Finder BX Assignment
- Take ALCT approach, and consider second arriving segment to define track BX
- SLHC
- 2-station tracks
- Offset peak: 87% correct BX assignment
- Centered peak: 80%
- 3-station tracks
- Offset peak: 78% (82%)
- Centered peak: 90% (95%)
- 2 BX (3 BX) windows
- Can improve BX i.d. efficiency to 95% with centered peak, taking second LCT, requiring 3 or more stations
Same as before (2-station tracks)
Offset peak gives worse performance for 3 stations
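The two BX-assignment rules discussed (earliest arriving LCT versus second arriving LCT) can be sketched as a single selection function. This is only an illustration: the function name `track_bx`, the `nth` parameter, and the way the acceptance window is counted from the earliest LCT are assumptions, not the Track-Finder firmware logic.

```python
from typing import Optional, Sequence

def track_bx(lct_bxs: Sequence[int], nth: int = 1, window: int = 2) -> Optional[int]:
    """Return the BX of the nth-earliest LCT (nth=1: earliest, nth=2: second).

    Only LCTs within `window` BX of the earliest one are considered
    (assumed window definition). Returns None if fewer than `nth` LCTs
    fall inside the window.
    """
    if not lct_bxs:
        return None
    ordered = sorted(lct_bxs)
    in_window = [bx for bx in ordered if bx - ordered[0] < window]
    if len(in_window) < nth:
        return None
    return in_window[nth - 1]

# Three stations report LCTs; one arrived a BX early
stubs = [101, 100, 101]
print(track_bx(stubs, nth=1))  # earliest-LCT rule
print(track_bx(stubs, nth=2))  # second-LCT rule
```

In this example the earliest-LCT rule picks BX 100 while the second-LCT rule picks BX 101, which shows why the second-LCT rule needs at least two (and benefits from three) stations.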
13 Andrey's Conclusion
14 Improved CSC Performance?
- It's worth considering what can be done to slightly improve CSC timing (change gas, increase high-voltage)
- Suppose 5 ns resolution → 4 ns
- BX i.d. from first hit
- 2-station tracks
- Offset peak: 93%
- Centered peak: 88%
- 3-station tracks
- Offset peak: 96%
- Centered peak: 83%
- BX i.d. from second hit
- 2-station tracks
- Offset peak: 61%
- Centered peak: 88%
- 3-station tracks
- Offset peak: 87% (87%)
- Centered peak: 96% (98%)
15 CSC Occupancy (LHC)
- Start from CMSIM trigger study
- R. Cousins, J. Mumford, and V. Valuev, CMS Note 2002/007
- Dedicated ORCA trigger simulation, albeit with LCT logic that does not exactly match final production hardware firmware
- Correlated LCT occupancy (ALCT-CLCT match)
- Entire CSC system: 0.05 / pp collision, or 0.9 LCTs per BX at L = 10^34
- MPC occupancies at L = 10^34
- ME1: 0.025 / BX, rescaled to 30° subsectors
- ME2-4: 0.008 / BX
- Recall that MPC can accept up to 3 LCTs / BX
- Adding neutrons leads to 30% increase in ME1-ME3, 3X higher in ME4
- Ignore neutrons for now. Correlated LCT rate from neutrons probably doesn't scale linearly with luminosity since it is composed mostly of random hits. Hard to make projections.
16 Projected CSC Occupancy (SLHC)
- SLHC
- Assume L = 10^35, BX interval decreases to 12.5 ns, chambers stay the same
- Correlated LCT occupancy
- Entire CSC system: 0.9 × 10 / 2 = 4.5 LCTs / BX at 80 MHz
- MPC occupancies at L = 10^35, assuming 80 MHz operation
- ME1: 0.125 / BX (ME2-4 is 3X smaller)
- P(≥2) = 0.7% (spoils di-μ measurement in 1 MPC)
- MPC occupancies at L = 10^35, assuming 40 MHz operation
- ME1: 0.25 / 25 ns (ME2-4 is 3X smaller)
- P(≥2) = 2.6% (spoils di-μ measurement in 1 MPC)
- Occupancies are not huge, but we neglected neutrons, and LCT ghost probability might be higher.
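The scaling and the two P(≥2) values above follow from simple arithmetic if the number of LCTs per MPC per BX is taken to be Poisson distributed. That distribution is an assumption made here for illustration, and `poisson_at_least` is a hypothetical helper name.

```python
import math

def poisson_at_least(k: int, mean: float) -> float:
    """P(N >= k) for a Poisson-distributed count N with the given mean."""
    below = sum(math.exp(-mean) * mean**n / math.factorial(n) for n in range(k))
    return 1.0 - below

# LHC values quoted at L = 10^34
system_lhc = 0.9    # LCTs per BX, entire CSC system
me1_lhc = 0.025     # LCTs per BX per MPC in ME1

# SLHC: 10x luminosity; at 80 MHz each BX is half as long
system_slhc = system_lhc * 10 / 2   # LCTs per 12.5 ns BX
me1_80mhz = me1_lhc * 10 / 2        # per 12.5 ns BX
me1_40mhz = me1_lhc * 10            # per 25 ns

p2_80mhz = poisson_at_least(2, me1_80mhz)
p2_40mhz = poisson_at_least(2, me1_40mhz)

print(f"system occupancy at 80 MHz: {system_slhc:.2f} LCTs/BX")
print(f"ME1 P(>=2) at 80 MHz: {100 * p2_80mhz:.1f}%")
print(f"ME1 P(>=2) at 40 MHz: {100 * p2_40mhz:.1f}%")
```

Under the Poisson assumption the results agree with the 0.7% and 2.6% quoted on the slide.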
17 Projected Occupancy in CSC Track-Finder
- For optimum efficiency and decreased sensitivity to exact CSC timing, we plan to use a 2 BX (50 ns) window at LHC to accept LCTs from the MPCs
- e.g. see A. Drozdetski's testbeam analysis talk from Oct. 03 EMU meeting
- LCT efficiency goes from 98% to 99.5% with 2 BX window; Track-Finding efficiency goes with square or cube of this
- Earliest arriving LCT defines BX (but this may not be best choice)
- SR occupancies for SLHC at L = 10^35, 4 BX window
- ME1: 0.5 / 50 ns (every other trigger BX!)
- ME2-4 is 3X smaller
- BX i.d. study suggests only 2-3 BX will be required
- Need to perform detailed rate studies to see if we pick up fake tracks that trigger
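The "square or cube" remark can be made concrete with a small sketch. It assumes a track needs an LCT in each of n stations and that station efficiencies are independent; `track_efficiency` is an illustrative name, not an existing tool.

```python
def track_efficiency(lct_eff: float, n_stations: int) -> float:
    """Track-finding efficiency if every one of n stations must supply an LCT.

    Assumption: stations are independent and all n are required.
    """
    return lct_eff ** n_stations

# LCT efficiency with a 1 BX window (98%) and a 2 BX window (99.5%)
table = {
    (eff, n): track_efficiency(eff, n)
    for eff in (0.98, 0.995)
    for n in (2, 3)
}

for (eff, n), value in table.items():
    print(f"LCT eff {100 * eff:.1f}%, {n} stations: {100 * value:.1f}%")
```

With these assumptions the wider window raises the 3-station efficiency from about 94% to about 98.5%, which is the motivation for accepting LCTs over more than one BX.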
18 Inclusion of Muon Data with Tracker
- It would be extremely desirable to include tracker data at Level-1
- How it is planned to be used at LHC for HLT
- Attach tracker hits to improve PT assignment precision from 15% (standalone muon measurement) to 1.5% with the tracker
- Will improve sign determination as well and offers vertex constraints
- Find pixel tracks within cone around muon track and compute sum PT as an isolation criterion
- Less sensitive to pile-up than calorimetric information if primary vertex of hard-scattering can be determined (100 vertices total at SLHC!)
- To do this requires η-φ information on the muons finer than the currently reported 0.05 × 2.5°
- No problem, since both are already available at 0.0125 and 0.015°
19 Muon Rate at L = 10^34
From DAQ TDR
Note limited rejection power (slope) without
tracker information
20 Preliminary Conclusions on CSC for SLHC
- Will probably want to upgrade front-end trigger boards and optical links to send LCT data at 80 MHz
- Simulated muon occupancy may be low enough to avoid this, but with strong caveats. Real data may be worse.
- BX identification is about 80% correct at LCT level
- BX identification is about 90% correct at Track-Finder using current algorithm
- 2 BX acceptance window, 2 or more stations, offset timing peak
- Can be increased to 95% correct BX i.d. at Track-Finder
- 3 BX acceptance window, 3 or more stations (need ME4!)
- Requires new logic in Track-Finder to take second LCT time
- Trivial to add finer eta and phi information to reported muon candidate for use by a muon-tracker match box
21 More Generic R&D
22 Xilinx RocketIO X
- Maximum speed: 10.3125 Gbit/s
- Latency is not yet published
- Will be better than RocketIO according to tech support
- Minimal latency calculation (in clock cycles) based on RocketIO documents
A lot, but
23 RocketIO X
- Clock cycle at 10 Gbit/s serial speed: 250 MHz
- Total latency: 84 ns (should be better for RocketIO X)
- Thus, probably an interesting avenue to explore to bring huge data volumes into an FPGA
24 An Asynchronous Level-1 Trigger?
- The high-speed data links that will be available for use at SLHC (10-100 Gbit/s), and the challenges of timing in a fully synchronous system and distributing a jitter-free clock at the sub-ps level, started us thinking about going asynchronous after the front-end BX assignment
- We DO still need a clock synchronous with the machine frequency distributed to the FE boards for BX assignment
- But unlike early LHC electronics, there will now be several orders of magnitude separating the machine frequency from the data communication frequency
- We KNOW this has to work because HLT is already asynchronous
- Moreover, we know that CSC trigger data collected synchronously at the CSC Track-Finder exactly matches the DAQ data collected asynchronously
25 Advantages of an Asynchronous Design
- De-couple clocking requirements of high-speed data links from synchronous BX assignment
- Data instead is sent with a BX label when it is available, and trigger logic assembles event for processing
- Can use on-board crystal oscillators for serdes reference clock, rather than multiplying an 80 MHz clock to multi-GHz
- Maximize the utility of the available bandwidth
- A synchronous system must keep a low occupancy on the data links, otherwise Poisson fluctuations in one BX will overflow the link → wasted bandwidth by sending lots of 0s
- Some trigger subsystems recognized this and already serialize data over multiple BX
- RPC, DT
- A step toward an asynchronous design already!
26 Advantages II
- Respond more robustly to bursts
- Could allow more segments/clusters in an unusually busy event by allowing transmission time to vary (imagine a black-hole event)
- Detector technology may not keep pace with shorter SLHC BX
- e.g. the CSC FE may not have the ability to accurately determine the 12.5 ns bunch. Track-Finder might need a 50 ns (4 BX) window to trigger efficiently.
- DT system may be in even worse shape with BX assignment
- Less compelling why data must be sent synchronously when it is naturally distributed over several BX
- Might it be possible to not have a machine BX at all, but one long train? (BX i.d. then becomes a time-stamp.) Big question will be detector occupancies
27 Advantages III
- Decouples clock frequency of algorithm from BX frequency
- A shorter BX does not mean faster logic. Logic works at transistor switching speed. Too short a clock period means wasted overhead to allow signals to settle before latch
- Might allow continued use of legacy 40 MHz boards
- Might allow incorporation of DSPs and CPUs into trigger architecture
- Is a high-performance DSP useful?
- What about embedded PowerPC chip in FPGA?
- If not for triggering, very useful for slow control, DAQ
- Perhaps track-finding, jet-clustering, b-tagging could benefit
- Might think of this as merging traditional L2 into L1, rather than L2 into L3 as CMS now does. Given that we want tracker at L1, maybe this is a way we need to go.
28 Constraints
- Proper BX assignment is still required at front-end
- Data is stamped with 12.5 ns bunch crossing time
- Distribution of synchronous 80 MHz clock much less challenging (it's done already) than one used to drive data links
- TTC system may be overly complicated for what we need
- Level-1 decision must be reached by a maximum latency or sooner
- Must have a time-out mechanism on data transmission and algorithms
- Still must keep latency short!
- Event building circuitry needed in FPGAs
- Should be small
- Escape clause
- It's always possible to wrap up the asynchronous logic into a synchronous black box (re-align data at trigger output)
- We considered this as an option for the MPC→SP optical link transmission in the current system, for example, to solve clock jitter issues
- A matter of determining how small or large an asynchronous block can be built
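To illustrate the event-building and time-out constraints listed above, here is a minimal software sketch of an event builder that groups BX-labelled fragments arriving out of order. It is a toy model, not a firmware design: the class `EventBuilder`, its methods, the tick-based time, and the assumption that every link sends one fragment (possibly empty) per BX are all hypothetical.

```python
from typing import Dict, List, Tuple

class EventBuilder:
    """Collect BX-labelled fragments from several links.

    An event is released once every link has reported for that BX, or once
    a time-out has elapsed since the first fragment for that BX arrived.
    """

    def __init__(self, n_links: int, timeout_ticks: int):
        self.n_links = n_links
        self.timeout_ticks = timeout_ticks
        self.pending: Dict[int, Dict[int, object]] = {}  # bx -> {link: payload}
        self.first_seen: Dict[int, int] = {}             # bx -> arrival tick

    def add(self, now: int, link: int, bx: int, payload: object) -> None:
        """Store one fragment; arrival order across links is arbitrary."""
        if bx not in self.pending:
            self.pending[bx] = {}
            self.first_seen[bx] = now
        self.pending[bx][link] = payload

    def poll(self, now: int) -> List[Tuple[int, Dict[int, object], bool]]:
        """Return (bx, fragments, complete) for events ready at tick `now`."""
        ready = []
        for bx in sorted(self.pending):
            complete = len(self.pending[bx]) == self.n_links
            timed_out = now - self.first_seen[bx] >= self.timeout_ticks
            if complete or timed_out:
                ready.append((bx, self.pending.pop(bx), complete))
                del self.first_seen[bx]
        return ready

builder = EventBuilder(n_links=3, timeout_ticks=5)
builder.add(0, link=0, bx=7, payload="a")
builder.add(1, link=2, bx=7, payload="c")
builder.add(1, link=1, bx=8, payload="x")
builder.add(2, link=1, bx=7, payload="b")

events = builder.poll(now=2)  # BX 7 is complete; BX 8 is still waiting
late = builder.poll(now=6)    # BX 8 is released by the time-out, incomplete
print(events)
print(late)
```

The time-out bounds the latency of any single event, which is the software analogue of requiring the Level-1 decision "by a maximum latency or sooner"; the output stage could re-align released events to a fixed latency if a synchronous black box is wanted.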
29 Advanced Detector Research Proposal
- Submitted a proposal to the DOE Advanced Detector Research program for FY04 to prototype such an asynchronous system
- Based on existing CSC detector, with upgrade to ALCT logic and redesigned Processor board