Title: Data errors
1. Data errors
- Common data blocks found in different events
- Bitflips
  - Found in MCM headers (event counters)
  - Missing in the empty ADC masks and in the tracklet end markers
- Conclusions, testpattern runs
2. Common blocks of data (1)
The same block of data is found in 2 events (in one case in 3) of the same run. The structure of the first event has no errors; the structure of the second event is broken.
Values are given as first event / second event. Distance and size are in 32-bit words.
- Run: 440_2 / 440_2
- Event: 365 / 392
- EventCnt: 357 / 366
- ROB: 4 / 2
- MCM: 1 / 8
- Link: 12 / 12
- Distance: 953395
- Size: 52

- Run: 466_1 / 466_1
- Event: 2500 / 3796
- EventCnt: 834 / 1266
- ROB: 6 / 6
- MCM: 7 / 6
- Link: 12 / 12
- Distance: 1050702
- Size: 44

Labels: ZS / no ZS (zero suppression)
3. Common blocks of data (2)
A common block of data from two adjacent events is found in a later event. The structure of the first two events has no errors; the structure of the third event is broken.
Values are given as earlier event / later event. Distance and size are in 32-bit words.
- Run: 470_1 / 470_1
- Event: 2686 / 4012
- EventCnt: 51133 / 51575
- ROB: 6 / 4
- MCM: 3 / 14
- Link: 12 / 12
- Distance: 813091
- Size: 24

<EOE>: the block continues in the next event.

- Run: 470_1 / 470_1
- Event: 2689 / 4012
- EventCnt: 51134 / 51575
- ROB: 2 / 4
- MCM: 12 / 14
- Link: 12 / 12
- Distance: 880527
- Size: 32
4. Common blocks of data (3)
A common block of data is found in 2 events in two adjacent runs. The structure of the first event has no errors; the structure of the second event is broken.
Values are given as first event / second event. Distance and size are in 32-bit words.
- Run: 472_3 / 473_1
- Event: 21691 / 460
- EventCnt: 141163 / 141546
- ROB: 4 / 6
- MCM: 10 / 0
- Link: 10 / 10
- Distance: 926564
- Size: 49

The consequences of these types of errors are wrong event counters, wrong last bits in the ADC data, and breaks in the event structure.
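A search for such repeated blocks can be sketched as follows. This is only an illustration, not the analysis code behind these slides: it assumes each event is already available as a list of 32-bit words, and the names `find_common_blocks` and `min_len` are hypothetical.

```python
def find_common_blocks(ev_a, ev_b, min_len=16):
    """Return (pos_a, pos_b, length) for runs of identical 32-bit words
    shared by two events. min_len is an assumed minimum block size."""
    # Index every window of min_len words of the first event.
    index = {}
    for i in range(len(ev_a) - min_len + 1):
        index.setdefault(tuple(ev_a[i:i + min_len]), []).append(i)

    found = []
    seen = set()
    for j in range(len(ev_b) - min_len + 1):
        for i in index.get(tuple(ev_b[j:j + min_len]), ()):
            # A match whose predecessor also matched belongs to a run
            # that has already been reported.
            continuation = (i - 1, j - 1) in seen
            seen.add((i, j))
            if continuation:
                continue
            # Extend the match forward as far as the words agree.
            n = min_len
            while (i + n < len(ev_a) and j + n < len(ev_b)
                   and ev_a[i + n] == ev_b[j + n]):
                n += 1
            found.append((i, j, n))
    return found
```

In real data, long runs of trivial words (for example 0x00000000 padding) would match everywhere and would have to be excluded before such a search is meaningful.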
5. Bit errors in event counter
- The event counters are correct in the previous and in the next event. This is not an internal TRAP error.
- The event counter (in the MCM header) is the first word sent by the MCM.
- Frequently found errors in bits 0, 2 and 16 of the event counter (add 4 for the bit position in the output data).
- Bit 20 (in event counter this is bit 16)
  - run 474: in the first 5000/3 events, in 47 cases the bit should be 0, but is 1
  - run 472: in the first 5000/3 events, in 13 cases the bit should be 1, but is 0
- Bit 6 (in event counter this is bit 2)
  - run 472: in the first 5000/3 events, in 8 cases the bit should be 0, but is 1
  - runs 473 and 466: in the first 5000/3 events, in 4 cases the bit should be 0, but is 1
- No preferred ROBs or MCMs; only links 1..3 do not appear.
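The relation between counter bits and output bits can be illustrated with a small check for single-bit errors. This is a sketch under assumptions: the shift of 4 comes from the slide, but the counter width `EC_BITS` is assumed, the function names are hypothetical, and the expected counter is taken as known (for example from the other MCM headers of the same event).

```python
EC_SHIFT = 4    # from the slide: counter bit n is output bit n + 4
EC_BITS = 20    # ASSUMPTION: counter width, not stated on the slide
EC_MASK = (1 << EC_BITS) - 1


def event_counter(header):
    """Extract the event counter field from an MCM header word."""
    return (header >> EC_SHIFT) & EC_MASK


def single_bit_error(header, expected_counter):
    """Return (output_bit, counter_bit, direction) if the counter in
    `header` differs from `expected_counter` in exactly one bit,
    otherwise None."""
    seen = event_counter(header)
    diff = seen ^ (expected_counter & EC_MASK)
    if diff == 0 or diff & (diff - 1):
        return None  # no difference, or more than one bit differs
    counter_bit = diff.bit_length() - 1
    direction = "0->1" if (seen >> counter_bit) & 1 else "1->0"
    return counter_bit + EC_SHIFT, counter_bit, direction
```

For a header carrying counter 357 with output bit 20 flipped, `single_bit_error` reports output bit 20, counter bit 16, direction 0->1.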
6. Bit errors in event counter, bit 0
- Bit 4 (in event counter this is bit 0)
The peaks do not correlate to any configuration
changes, but may be T?
7. Missing bit errors in tracklet endmarker
- In all analyzed runs the tracklet endmarker is 0xAAAAAAAA.
- If bit 4 (event counter bit 0) frequently tends to be 1 instead of 0, then one would expect to find tracklet endmarkers with bit 4 set: 0xAAAAAAAA -> 0xAAAAAABA
- In such a case, the endmarker will not be stripped by the TRAP logic in the column merger, board merger and half-chamber merger, and will propagate as a tracklet.
A search for 0xAAAAAAAA xor 2^n (n = 0..31) in runs 437..474 (4.5 GB) gave 8 times EA and 15 times BA, but as part of the pair of endmarkers and NOT like this:
<tracklet> 0xaaaaaaba <tracklet> 0xaaaaaaaa 0xaaaaaaaa
This can happen only between HCM and ORI, in ORI or later, but not between the TRAPs.
0xaaaaaaba 0xaaaaaaaa 0xaaaaaaaa
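The search for 0xAAAAAAAA xor 2^n can be sketched as below. It is a minimal illustration, assuming the raw data is available as a sequence of 32-bit words; the function name and the "adjacent intact endmarker" criterion for being part of the endmarker pair are assumptions, not the original tool.

```python
ENDMARKER = 0xAAAAAAAA
# All 32 single-bit variants, 0xAAAAAAAA xor 2**n, mapped to the bit n.
FLIPPED = {ENDMARKER ^ (1 << n): n for n in range(32)}


def find_flipped_endmarkers(words):
    """Return (index, flipped_bit, in_pair) for every word that equals
    the endmarker with exactly one bit flipped. in_pair is True when an
    intact endmarker sits directly before or after the word."""
    hits = []
    for i, w in enumerate(words):
        if w not in FLIPPED:
            continue
        prev_ok = i > 0 and words[i - 1] == ENDMARKER
        next_ok = i + 1 < len(words) and words[i + 1] == ENDMARKER
        hits.append((i, FLIPPED[w], prev_ok or next_ok))
    return hits
```

With this mapping, a low byte of BA corresponds to bit 4 and EA to bit 6. A hit with `in_pair` False would be the case of an endmarker propagating as a tracklet, which the search did not find.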
8. Examples of found bit errors in tracklet endmarker
Two typical examples of bit errors in the tracklet endmarker:
0x00000000 0x00000000 0x00000000 0x00000000
0x23b41d57 0xaaaaaaba 0xaaaaaaaa 0x83004049
0x7bf3c291 0xac0deaac 0xfe00000c
0x02409032 0x03009026 0xe30c989c 0xfe00000c
0x00000000 0xaaaaaaea 0xaaaaaaaa 0x830040cd
0x78cba251 0xbc0c989c 0xfe00000c
9. Missing bit errors in the empty ADC masks
If bits 20 and 4 are frequently 1 instead of 0, then we should expect the same error in the empty ADC masks.
- Bit 20 (event counter bit 16) pattern: 0xFE10000C <any MCM header> 0xFE00000C
  - Found only once, but as part of the known common block error.
- Bit 4 (event counter bit 0) pattern: 0xFE00001C <any MCM header> 0xFE00000C
  - Not found
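The pattern "flipped empty ADC mask, any MCM header, intact empty ADC mask" can be matched with a simple three-word scan. This is a sketch only: the function name is hypothetical, the middle word is treated as a pure wildcard, and the data is assumed to be a list of 32-bit words.

```python
EMPTY_ADC_MASK = 0xFE00000C


def find_flipped_masks(words, bit):
    """Return indices i where words[i] is the empty ADC mask with `bit`
    flipped, words[i + 1] is any word (the MCM header), and
    words[i + 2] is an intact empty ADC mask."""
    flipped = EMPTY_ADC_MASK ^ (1 << bit)
    return [i for i in range(len(words) - 2)
            if words[i] == flipped and words[i + 2] == EMPTY_ADC_MASK]
```

Calling it with `bit=20` looks for 0xFE10000C and with `bit=4` for 0xFE00001C.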
10. Conclusions
- The block errors are most probably due to timing errors in the GTU FPGAs.
- The reasons for the bitflip errors are still not very clear. They were definitely detected in the MCM headers, but not in the ADC masks (the two most frequently transferred data words in case of zero suppression).
- In order to understand more, we need to test the complete readout chain with testpatterns generated in the TRAPs.
11. Reference
Used runs:
437_01
440_01 440_02 440_03 440_04 440_05 440_06 440_07 440_08 440_09 440_10 440_11 440_12 440_13
440_14 440_15 440_16 440_17 440_18 440_19 440_20 440_21 440_22 440_23 440_24 440_25 440_26
441_01 441_02 441_04
445_01 446_01 447_01 448_01 449_01 450_01 451_01
454_01 461_01 462_01 466_01 467_01
470_01 471_01 472_01 472_02 472_03 473_01 474_01