Title: Detector setup
1. Status of the TRD Electronics
ALICE Week, Heidelberg, 08.09. - 12.09.2003
Kirchhoff Institute of Physics, Chair of Computer Science, University of Heidelberg, Germany
Phone: +49 6221 54 4303, fax: +49 6221 54 4345
Email: ti_at_kip.uni-heidelberg.de, URL: www.ti.uni-hd.de
Venelin Angelov
2. TRD Electronics: participating institutes
- Kirchhoff Institute of Physics, Chair of Computer Science, University of Heidelberg, Germany
- Institute of Microelectronics, University of Kaiserslautern, Germany
- Institute of Physics, University of Heidelberg, Germany
- GSI Darmstadt, Germany
- University of Applied Sciences Cologne, Communications Engineering, Germany
3. TRD Electronics Overview
- 1156032 analog channels
- 10 MHz digitization rate
- 18+3 channels per module
- 64224 sources → 1 trigger bit
- track processing on chamber
- trigger/readout integrated
- maximum latency 6 µs
- 20 space points per tracklet
- 4-6 tracklets (layers) per track
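[Figure: data flow of the TRD electronics. Detector (6 layers, 1200000 channels, 18+3 channels per chip) → PASA → ADC → Tracklet Preprocessor (TPP) → Tracklet Processor (TP) → Tracklet Merger (TM) → GTU. The GTU sends the L1 trigger to the CTP and the tracklets to the HLT; the event buffer stores the raw data until L1A for readout to the DAQ.]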
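The headline numbers of the overview can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes that the 64224 trigger sources are the MCMs, that each MCM contributes 18 channels of its own to the total, that the ADC word is 10 bits wide (as on the ADC slide), and that the 20 space points correspond to 20 consecutive samples at 10 MHz. All function names are hypothetical.

```python
# Numbers quoted on the slides
N_MCM = 64224             # trigger sources, taken here to be the MCMs
CHANNELS_PER_MCM = 18     # ASSUMPTION: 18 own channels per MCM enter the total
SAMPLING_RATE_HZ = 10e6   # 10 MHz digitization rate
ADC_BITS = 10             # 10 bit ADC (from the ADC slide)


def analog_channels() -> int:
    """Total number of analog channels."""
    return N_MCM * CHANNELS_PER_MCM


def raw_data_rate_bits_per_s() -> float:
    """Raw digitized data rate if every sample were shipped off detector."""
    return analog_channels() * SAMPLING_RATE_HZ * ADC_BITS


def sampling_window_s(space_points: int = 20) -> float:
    """ASSUMPTION: one space point per sample, so 20 points span 2 us."""
    return space_points / SAMPLING_RATE_HZ


if __name__ == "__main__":
    print(analog_channels())
    print(raw_data_rate_bits_per_s() / 1e12, "Tbit/s")
    print(sampling_window_s() * 1e6, "us")
```

The raw rate of more than 100 Tbit/s is the reason why the track processing is done on the chamber and only tracklets are shipped within the 6 µs latency.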
4. Multi Chip Module
- PASA
- Internal ADCs (Kaiserslautern)
- Digital Frontend and Tracklet Preprocessor
- MIMD Processor: 4 CPUs, Global Register File, Interrupt controllers, Counter/Timers, Arbiter for the Global I/O Bus
- Instruction Memory
- Master State Machine
- External Pretrigger
- Serial Interface slave
- Serial Interface
- Global I/O-Bus
- Quad ported Data Memory
- Network Interface
- Readout Network
5. Preamplifier and shaping amplifier (PASA)
Signal chain: Input Pads (x18) → Charge Sensitive Preamplifier → P/Z cancellation → Shaper 1 → Shaper 2 → Differential output to the ADC
- FWHM: 120 ns
- ENC: 850 e at 25 pF
- Gain: 12 mV/fC
- INL: 0.3
- Power: 12 mW/channel
- Technology: 0.35 µm AMS, 21.3 mm²
[Plot: output of the preamplifier stage and output of shaper 1 versus time]
6. Test of the ADCs in TRAP2
- The ADC raises an IRQ, then one of the CPUs reads the data and stores it in the IMEM
- Simultaneous processor and ADC operation on the same die
7. Filter
- Non-linearity
- Pedestal
- Gain
- Tail
- Crosstalk
8. The MIMD Architecture
- Four RISC CPUs
- Coupled by Registers (GRF) and Quad ported data Memory
- Register coupling to the Preprocessor
- Global bus for Periphery
- Local busses for Communication, Event Buffer read and direct ADC read
- I-MEM: 4 single ported SRAMs
- Serial Interface for Configuration
- IRQ Controller for each CPU
- Counter/Timer/PsRG for each CPU and one on the global bus
- Low power design, CPU clocks gated individually
[Figure: block diagram of the MIMD processor. CPU 0 to CPU 3, each with its own IMEM and local bus; shared GRF, D-MEM, FIT and constant registers; interrupt controller, counter/timer, event buffer, configuration, global bus and network interface (4x10 inputs, 10 output).]
9. Readout tree specifications
- number of tracklets to be read out
  - 64224 MCMs
  - with max. 4 tracklets / MCM
  - BUT max. 40 tracklets / chamber is adequate (simulation)
- time for read out
  - 200 ns latency (for first tracklet)
  - 400 ns data transfer
- mechanical and electrical restrictions
  - chip pin count
  - transfer frequency
  - max. length for LVDS transfer on PCB
  - modular layout of readout boards
  - number, speed and cost of detector links
- RESULTING READOUT STRUCTURE
  - 8 bit data ports, DDR (+ strobe + parity + spare = 11 LVDS bits)
  - tree structure with a max. tree width of 4
  - → max. tree depth 4 (84 ns latency incl. TM, w/o opt. link delay)
  - → 2 links / chamber (1080 optical detector links, each 2.4 GBd)
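As a plausibility check of the readout budget, the following sketch estimates how long it takes to ship the maximum of 40 tracklets per chamber over the two optical links. It is only a rough estimate under stated assumptions: the tracklet is taken as one 32-bit word (as on the GTU slide), the tracklets are assumed to be split evenly over the two links, and the 2.4 GBd links are assumed to use 8b/10b line coding, which the slides do not state. The tree capacity is an upper bound that assumes a fan-in of 4 on every level.

```python
TRACKLET_BITS = 32                # one tracklet = one 32-bit word
MAX_TRACKLETS_PER_CHAMBER = 40    # adequate according to simulation
LINKS_PER_CHAMBER = 2
OPTICAL_LINKS = 1080
N_MCM = 64224
LINK_BAUD = 2.4e9                 # 2.4 GBd per optical link
LINE_CODE_EFFICIENCY = 8 / 10     # ASSUMPTION: 8b/10b coding, not on the slide
TRANSFER_BUDGET_S = 400e-9        # 400 ns data transfer
TREE_WIDTH = 4
TREE_DEPTH = 4


def chambers() -> int:
    """Number of chambers implied by 2 links per chamber."""
    return OPTICAL_LINKS // LINKS_PER_CHAMBER


def link_transfer_time_s() -> float:
    """Time to send one link's share of the tracklets of a full chamber."""
    bits = MAX_TRACKLETS_PER_CHAMBER / LINKS_PER_CHAMBER * TRACKLET_BITS
    return bits / (LINK_BAUD * LINE_CODE_EFFICIENCY)


def tree_capacity() -> int:
    """Upper bound on MCMs per link if every tree level merges 4 inputs."""
    return TREE_WIDTH ** TREE_DEPTH


def mean_mcms_per_link() -> float:
    return N_MCM / OPTICAL_LINKS


if __name__ == "__main__":
    print(chambers(), "chambers")
    print(link_transfer_time_s() * 1e9, "ns of", TRANSFER_BUDGET_S * 1e9, "ns")
    print(mean_mcms_per_link(), "MCMs per link, tree capacity", tree_capacity())
```

Under these assumptions the transfer takes about 333 ns, which fits the 400 ns budget, and the average of about 60 MCMs per link stays well below the 256 leaves of a width-4, depth-4 tree.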
10. Slow Control Serial Network (SCSN)
- Up to 126 slaves per ring
- CRC protected
- 24 Mbit/s transfer rate
- 16 address bits and 32 data bits per frame
11. The TRAP2 chip bonded on the MCM
12. ACEX PCI board with two MCMs
13. The test board with the MCM
[Photo labels: BGA socket developed at KIP; SCSN; NI output]
14. Summary of the test results (TRAP2)
- What has been tested
  - Serial Configuration, most of the configuration registers in all blocks, connected to the Global Bus
  - Clock gating, Global State Machine
  - The large LUTs (non-linearity, position), Event Buffers
  - CPUs with Register Files and Interrupt controllers
  - Instruction and DFF Quad Port Data Memories
  - Local Buses
  - ADCs, Acquisition in the event buffers
  - PLL, Clock and Pretrigger distribution outputs
- What is left for testing
  - Digital filters
  - Parallel Network inputs
  - Parallel Network outputs with the delay units
  - Real acquisition and readout mode
15. Summary of the test results (TRAP2)
- There are some small functional bugs or missing nice-to-have features, but none of them makes the chip unusable. All of them are fully understood
- The CPUs and the rest of the chip operate at the specified clock rate of 120 MHz
- The ADC problems are fully understood and are being reworked
- ADC performance is not influenced by the operation of the fast on-chip processors
16. Readout board
17. Global Tracking Unit (GTU)
- Online Trigger Architecture
  - local tracking units (MCMs) on detector perform straight line fits
  - readout network with 1080 optical links (2 per detector)
  - "tracklet" (32-bit word)
    - Y position
    - slope
    - Z position (padrow number)
    - charge
  - central point of processing: Global Tracking Unit (GTU)
- GTU architecture
  - tracklets from detectors on different planes are matched to tracks
  - matching is performed on a per-module basis: no inter-module matching
  - built of FPGAs
    - 1 (large) FPGA chip per module (12 links), 90 FPGAs
  - timing is critical
    - 1.4 µs total processing time
    - much to be done in parallel
  - problem: 3-dimensional matching of tracklets
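The 32-bit tracklet word can be pictured as four packed bit fields. The sketch below shows one way to pack and unpack such a word. The slide gives only the field names and the total width of 32 bits, so the individual widths (13, 7, 4 and 8 bits), the field order and the function names are assumptions chosen purely for illustration, and all fields are treated as raw unsigned codes.

```python
# HYPOTHETICAL layout: (name, width in bits), most significant field first.
# Only the field names and the 32-bit total come from the slide.
FIELDS = (("y", 13), ("slope", 7), ("z", 4), ("charge", 8))


def pack_tracklet(y: int, slope: int, z: int, charge: int) -> int:
    """Pack the four raw field codes into one 32-bit word."""
    word = 0
    for (name, width), value in zip(FIELDS, (y, slope, z, charge)):
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack_tracklet(word: int) -> dict:
    """Recover the raw field codes from a 32-bit tracklet word."""
    fields = {}
    for name, width in reversed(FIELDS):
        fields[name] = word & ((1 << width) - 1)
        word >>= width
    return fields


if __name__ == "__main__":
    w = pack_tracklet(y=1234, slope=17, z=5, charge=200)
    print(hex(w), unpack_tracklet(w))
```

With 12 links per FPGA, the 1080 optical links map onto the 90 FPGAs quoted above.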
18. Tracklet Matching
- matching required in 3 coordinates
  - z position
  - y position
  - deflection (dy/dx slope)
- deflection matching
  - deflection is only relevant if tracklets match in Y position
  - deflection matching condition: deflection angle difference < 3.0
- y position matching
  - project tracklets to a virtual plane in the middle of each module
  - sort tracklets in ascending order of Y
  - apply sliding window matching algorithm
  - matching condition: difference in projected Y position < 11.6 mm
  - from simulation: max. 2 tracks inside window per plane per Z-channel
- z position matching
  - handled separately
  - only few different values for Z
  - tracklets arrive sorted in Z
- final matching condition
  - a track is found if ≥ 4 tracklets from different detectors inside the same Y window also match in deflection angle
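The Y and deflection matching can be illustrated with a small software model. This is a simplified sketch, not the FPGA implementation of the GTU: it treats one Z-channel only, takes the lowest-Y unused tracklet as the seed of each window, accepts at most one tracklet per layer, and compares the deflection with the seed. The window sizes and the minimum of 4 layers are the values from the slide; the class and function names are hypothetical.

```python
from dataclasses import dataclass

Y_WINDOW_MM = 11.6     # max. difference in projected Y (from the slide)
MAX_DEFL_DIFF = 3.0    # max. deflection difference (unit not given on the slide)
MIN_LAYERS = 4         # tracklets from at least 4 different detectors


@dataclass(frozen=True)
class Tracklet:
    layer: int      # detector plane, 0..5
    y_proj: float   # Y projected to the virtual middle plane, in mm
    defl: float     # deflection


def match_tracks(tracklets):
    """Return a list of tracks, each a list of matched tracklets."""
    ordered = sorted(tracklets, key=lambda t: t.y_proj)  # ascending Y
    used = set()
    tracks = []
    for i, seed in enumerate(ordered):
        if i in used:
            continue
        chosen = {}  # layer -> index into `ordered`
        for j in range(i, len(ordered)):
            cand = ordered[j]
            if cand.y_proj - seed.y_proj >= Y_WINDOW_MM:
                break  # sorted input: everything further is outside the window
            if j in used or cand.layer in chosen:
                continue
            if abs(cand.defl - seed.defl) < MAX_DEFL_DIFF:
                chosen[cand.layer] = j
        if len(chosen) >= MIN_LAYERS:
            used.update(chosen.values())
            tracks.append([ordered[j] for j in sorted(chosen.values())])
    return tracks


if __name__ == "__main__":
    hits = [Tracklet(layer, 100.0 + 0.5 * layer, 1.0) for layer in range(5)]
    hits.append(Tracklet(0, 300.0, -2.0))  # isolated tracklet, no track
    for track in match_tracks(hits):
        print([t.layer for t in track])
```

Sorting first is what makes the sliding window cheap: each window only looks at neighbors in the sorted list and stops as soon as the Y difference exceeds the window.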
19. DCS Board
- First prototype tested
- Second version submitted
- ARM based technology
- 100k-gate FPGA for flexibility
- 32 MB SDRAM
- Linux system with EasyNet
20. Summary of the status
- PASA submitted, the chips are expected back at the end of September
- TRAP2 submitted, preliminary tests done
- The MCM for the TRAP2 and PASA chips and the readout board are ready
- Tests with PASA-TRAP1 do not indicate any noise problems
- The first prototype of the DCS board tested, the next version submitted
- Radiation hardness of TRAP1 and some DCS components tested
- Some building blocks of the GTU are developed
21. SPARE
22. ADC principle of operation
- 10 bit, 10 MSPS
- Low power: 6 to 9 mW/channel
- Small area
- Essential operations
  - comparison
  - +/- Vref
  - x2
  - (Sample & Hold)
[Figures: ADC stage; example of the quantization process; principle of the cyclic A/D conversion]
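The three essential operations (comparison, adding or subtracting Vref, multiplication by 2) are enough to build a cyclic converter. The sketch below is an idealized behavioral model of the textbook 1-bit-per-cycle algorithm, given as an assumption about the principle and not as a description of the actual Kaiserslautern circuit: the held sample is compared with zero, one bit is emitted, and the residue is doubled and shifted by Vref for the next cycle.

```python
def cyclic_adc(v_in: float, v_ref: float = 1.0, n_bits: int = 10) -> int:
    """Idealized cyclic A/D conversion of v_in in [-v_ref, +v_ref).

    Returns an offset-binary code in 0 .. 2**n_bits - 1.
    """
    if not -v_ref <= v_in < v_ref:
        raise ValueError("input outside the conversion range")
    residue = v_in              # sample & hold
    code = 0
    for _ in range(n_bits):     # one bit per cycle, MSB first
        bit = 1 if residue >= 0.0 else 0               # comparison
        # x2 and -/+ Vref: the residue stays within [-v_ref, +v_ref)
        residue = 2.0 * residue - v_ref if bit else 2.0 * residue + v_ref
        code = (code << 1) | bit
    return code


if __name__ == "__main__":
    for v in (-1.0, -0.5, 0.0, 0.25, 0.999):
        print(v, cyclic_adc(v))
```

Because the same stage is reused for every bit, the area stays small, at the price of needing 10 cycles per 10-bit sample.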
23. Filter and Tracklet Preprocessor
[Figure: block diagram of the filter and tracklet preprocessor for 18+1 channels. Each ADC feeds a digital filter (DFIL: non-linearity, tail cancellation, crosstalk, offset, gain) and an event buffer that is 64 timebins deep. The filtered charges Q go through the condition check into the hit select unit (max. 4 hits); for each selected hit, the position calculation (COG, LUT) and the parameter calculation fill the FIT register file with tracklet selection, which is read by CPU0 to CPU3.]
For the CPUs, the FIT register file is a read-only register file
24. Tracklet Fit Concept
During drift time (in Preprocessor):
- $N$: hit count
- $y_i$: position
- $\sum x_i$: timebin sum
- $\sum y_i$: position sum
- $\sum x_i y_i$: timebin × position sum
- $\sum y_i^2$: position² sum
- $\sum x_i^2$: timebin² sum
- $\sum Q_i$: sum charge
After drift time (in MIMD):
- $a$: intercept
- $b$: slope
- $\chi^2$: track quality
- merge track segments in padrow
All groups of 3 neighboring channels are checked for whether the hit conditions are met. For up to 4 hit candidates per timebin, the position and the other terms are calculated and stored in the Fit Register File.
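The split between the two phases can be sketched in software: the sums are accumulated hit by hit during the drift time, and the line parameters follow from the sums alone afterwards. The sketch below uses the standard unweighted least-squares formulas and takes the residual sum of squares as the quality measure; the exact definition of the track quality and the fixed-point arithmetic of the chip are not given on the slide, so both are assumptions, as are all names.

```python
from dataclasses import dataclass


@dataclass
class FitSums:
    """Sums accumulated per channel group during the drift time."""
    n: int = 0
    sx: float = 0.0    # sum of timebins x_i
    sy: float = 0.0    # sum of positions y_i
    sxy: float = 0.0   # sum of x_i * y_i
    sxx: float = 0.0   # sum of x_i ** 2
    syy: float = 0.0   # sum of y_i ** 2
    sq: float = 0.0    # sum of charges Q_i

    def add_hit(self, x: float, y: float, q: float) -> None:
        self.n += 1
        self.sx += x
        self.sy += y
        self.sxy += x * y
        self.sxx += x * x
        self.syy += y * y
        self.sq += q


def fit_line(s: FitSums):
    """Return (a, b, chi2) for y = a + b * x, computed from the sums only."""
    denom = s.n * s.sxx - s.sx ** 2
    if s.n < 2 or denom == 0:
        raise ValueError("need at least two hits in different timebins")
    b = (s.n * s.sxy - s.sx * s.sy) / denom   # slope
    a = (s.sy - b * s.sx) / s.n               # intercept
    # ASSUMPTION: unweighted residual sum of squares as quality measure
    chi2 = s.syy - a * s.sy - b * s.sxy
    return a, b, chi2


if __name__ == "__main__":
    sums = FitSums()
    for timebin in range(20):                 # 20 space points per tracklet
        sums.add_hit(timebin, 2.0 + 0.5 * timebin, q=100.0)
    print(fit_line(sums))
```

No individual hit has to be stored for the fit, which is why a small register file per channel group is sufficient.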
25. NI Datapath
[Figure: datapath of the network interface. Four input ports (port0 to port3) and one output port (port4) with delay units, each labeled with a width of 10; internal paths labeled with a width of 16 connect the ports to the local I/O interfaces (I/O 0 to I/O 3) of the four CPUs and to the global I/O interface (I/O G) with the configuration registers. The processor side shows the CPUs, DMEM, IMEM, GRF and the global bus arbiter.]
- Network Interface
  - local and global I/O interfaces
  - input port with data resync. and DDR decoding
  - input FIFOs (zero latency)
  - port mux to define readout order
  - output port with DDR encoding and programmable delay unit
26. Power management
- Power consumption is very important for this application. The CPU clocks and LVDS I/O cells are dynamically gated/enabled by the global state machine (GSM)
- The transitions of the state machine are triggered by the pretrigger signal or by writing to a special command register
- The GSM is coded with redundant bits (Hamming) for maximal safety (no illegal states)
- The only other modules with a permanent clock are the Slow Control Serial Network (SCSN) and the reset generator
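The idea of a Hamming-coded state register can be illustrated as follows. The slide does not say how many states the GSM has or which code is used, so this sketch assumes 16 states stored in a Hamming(7,4) codeword; the function names are hypothetical. With this code every 7-bit pattern is at most one bit away from a valid codeword, so a single bit upset is corrected and no register content decodes to an illegal state.

```python
def hamming74_encode(state: int) -> int:
    """Encode a 4-bit state (0..15) into a 7-bit Hamming codeword.

    Bit i of the result holds code position i + 1; parity bits sit at
    positions 1, 2 and 4, data bits at positions 3, 5, 6 and 7.
    """
    if not 0 <= state < 16:
        raise ValueError("state must fit in 4 bits")
    d1, d2, d3, d4 = ((state >> k) & 1 for k in (3, 2, 1, 0))
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p4 = d2 ^ d3 ^ d4
    bits = [p1, p2, d1, p4, d2, d3, d4]
    return sum(bit << i for i, bit in enumerate(bits))


def hamming74_decode(word: int) -> int:
    """Return the state stored in a 7-bit word, correcting one flipped bit."""
    bits = [(word >> i) & 1 for i in range(7)]
    syndrome = 0
    for position in range(1, 8):
        if bits[position - 1]:
            syndrome ^= position     # XOR of the positions of all set bits
    if syndrome:                     # non-zero syndrome = position of the error
        bits[syndrome - 1] ^= 1
    d1, d2, d3, d4 = bits[2], bits[4], bits[5], bits[6]
    return (d1 << 3) | (d2 << 2) | (d3 << 1) | d4


if __name__ == "__main__":
    word = hamming74_encode(9)
    upset = word ^ (1 << 4)          # single event upset in one flip-flop
    print(hamming74_decode(word), hamming74_decode(upset))
```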
27. DCS functional Block
- Altera FPGA with ARM core and 100k gates
- Ethernet physical layer chip
- 32 MByte SDRAM
- 16 MByte flash EPROM
- TTCrx clock recovery
- 3.3 and 1.8 Volt voltage regulators
- RS422 driver and receiver for JTAG
- LVDS clock and trigger driver
- Watchdog and Voltage Supervisor
- 16 (24) Bit ADC with 10 Samples per second