Title: Transaction-based communication-centric debug
1 Transaction-based communication-centric debug
Kees Goossens, Bart Vermeulen, Remco van Steeden, Martijn Bennebroek
Research, NXP Semiconductors
Computer Engineering, Technical University Delft
Testable Design and Test of Integrated Systems, Technical University of Twente
Research, Philips, The Netherlands
presenter: Andreas Hansson (TUE)
- Kees Goossens
- SOC Architectures and Infrastructure (SAI) group
- Systems and Circuits sector, Research
- NXP Semiconductors (formerly Philips Research)
- kees.goossens@nxp.com
2 overview
- debug
- communication-centric
- transaction-based
- classification
- debug architecture
- examples
- conclusions
3 debug is
- Error localization when a chip does not work in its intended application
- Difficult due to limited visibility on the internal behaviour
- Becoming
- Time consuming
- Unpredictable
- which threatens
- Time-to-Market
- Brand Image
4 communication-centric debug
- debug of individual programmable processors is mature
- but modern SOCs contain 4 programmable processors
- multi-processor debug is a challenge
- SOC complexity resides in the interactions between IP blocks / tasks
- exchanging data and synchronising
[Figure: CPUs and memories (labels: mem, CPU, T, B)]
5 communication-centric debug
- older interconnects serialised all transactions
- a unique global SOC communication trace / behaviour (easy)
[Figure: CPUs and memories connected by a serialising interconnect (APB, VPB, AHB, etc.); labels: mem, CPU, T, B]
6 communication-centric debug
- latest interconnects allow split, pipelined, concurrent transactions
- many possible interleavings of sequential behaviours
- no unique SOC communication trace / behaviour (hard)
[Figure: CPUs and memories connected by a concurrent interconnect (ML-AHB, AXI, NOC); labels: mem, CPU, T, B]
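To give a feel for how quickly "many possible interleavings" grows, the following is a minimal sketch that enumerates every global order in which two masters' transaction sequences could be observed while each master's own order is preserved. The transaction names and the helper `interleavings` are hypothetical and are not taken from the slides.

```python
from itertools import combinations
from math import comb


def interleavings(a, b):
    """Yield every merge of a and b that keeps the internal order of each."""
    total = len(a) + len(b)
    # Choose which positions of the merged trace belong to sequence a.
    for slots in combinations(range(total), len(a)):
        slot_set = set(slots)
        it_a, it_b = iter(a), iter(b)
        yield [next(it_a) if i in slot_set else next(it_b) for i in range(total)]


# Hypothetical transaction sequences of two masters.
master1 = ["m1_t1", "m1_t2", "m1_t3"]
master2 = ["m2_t1", "m2_t2"]

traces = list(interleavings(master1, master2))
print(len(traces), comb(len(master1) + len(master2), len(master1)))
```

With m and n transactions the count is the binomial coefficient C(m+n, m), so even short sequential behaviours admit many global traces once the interconnect no longer serialises them.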
7 communication-centric debug
- examples of splitting, pipelining, reordering, and concurrency
- there may be no point in time (cycle) where all transactions have finished
[Figure: timing diagram over the topology. master1: req1, req2, req3, resp1, resp2, resp3; slave1: exec1, exec3; slave2: exec4, exec2, exec5; master2: req4, req5, resp4, resp5; horizontal axis: time]
8 communication-centric debug
- traditional processor-centric debug focuses on control of the IP (computation)
[Figure: two IPs connected by an interconnect, with monitors and debug control]
9 communication-centric debug
- because the interconnect is the locus of all IP interactions
- we propose to focus debug on the interactions between IPs through control of the interconnect (communication)
[Figure: IPs connected by an interconnect, with monitors and debug control]
10 processors / computation
message: request or response
granularity of internal IP control
[Figure: master (M) and slave (S) exchanging req and resp]
control and debug of IP internally
- processor centric, debugger per processor
- source-code-level debugging not shown
11 NOCs / communication
- LLFC, E2EFC, packet not meaningful outside NOC
control and debug of interconnect internally
granularity of internal NOC control
stop RNI globally consistent at NOC level
stop at IP-NOC boundary consistent at NOC and IP level
12 transactions communication computation
granularity of internal IP control
most useful
control and debug of communication between masters and slaves
useful
control and debug of IP and interconnect internally
granularity of internal NOC control
stop RNI globally consistent at NOC/IP level
stop at IP-NOC boundary consistent at NOC and IP level
13 transaction-based communication-centric debug
- processor and interconnect behaviours coincide at
- instruction/flit, message and transaction levels
- instruction and flit level is relevant for IP and NOC internal debug
- control granularity inside IP and NOC
- message and transaction levels are relevant for SOC debug
- interaction between IPs and NOC
14 transaction-based communication-centric debug
- message-level debug
- see requests and responses, i.e. interleaving of masters and slaves
- transaction-level debug
- see requests and responses atomically at masters, i.e. interleaving of masters only
[Figure: two panels, "message level" and "transaction level", each showing CPUs and memories connected by a NOC (labels: NOC, mem, CPU, T, B)]
15 transaction-based debug
- this enables message/transaction-level stopping, single stepping, etc.
[Figure: timing diagram over the topology. master1: req1, req2, req3, resp1, resp2, resp3; slave1: exec1, exec3; slave2: exec4, exec2, exec5; master2: req4, req5, resp4, resp5; points 1, 2, 3, 4 marked along the time axis]
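As a rough illustration of single stepping at transaction granularity, the sketch below holds back pending requests and releases a fixed number per step command. It is an assumed software model only; `SingleStepGate` and its methods are hypothetical names, and the slides do not describe the mechanism at this level.

```python
from collections import deque


class SingleStepGate:
    """Hypothetical gate that releases pending transactions one step at a time."""

    def __init__(self, pending):
        self.pending = deque(pending)  # transactions waiting at the gate
        self.released = []             # transactions allowed into the interconnect

    def step(self, count=1):
        """Release up to `count` pending transactions, oldest first."""
        for _ in range(count):
            if not self.pending:
                break
            self.released.append(self.pending.popleft())
        return list(self.released)


gate = SingleStepGate(["req1", "req2", "req3"])
first = gate.step()       # releases req1 only
second = gate.step(2)     # releases req2 and req3
print(first, second)
```

Stepping per message instead of per transaction would use the same structure, with requests and responses queued as separate items.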
16 architecture and operation
- transactions
- signal groups (command, write data, read data, ...)
- valid/accept (valid/ready) handshake
- basic concept: intervene in the valid/accept handshake
[Figure: request path from master to slave through NI shells, NI kernels, and a router, with valid/accept handshakes]
17 architecture and operation
- monitor or TAP (test access port) controller generates event
- event is distributed on broadcast interconnect
- follows NOC layout
- runs at NOC functional speed
[Figure: request path from master to slave through NI shells, NI kernels, and a router, with valid/accept handshakes, monitors, stop modules, the TAP controller, and the event distribution interconnect (EDI)]
18 architecture and operation
- finish ongoing handshakes, then mask accept/valid
[Figure: request path from master to slave through NI shells, NI kernels, and a router, with valid/accept handshakes, clock controllers, monitors, stop modules, the TAP controller, and the event distribution interconnect (EDI)]
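The rule "finish ongoing handshakes, then mask accept/valid" can be sketched as a small cycle-level model. This is an illustrative software sketch, not the actual hardware: `StopModule`, `request_stop`, and `step` are hypothetical names, and it assumes that a transfer takes place in any cycle where valid and accept are both high.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class StopModule:
    """Hypothetical cycle-level model of a stop module on one valid/accept channel."""

    stop_requested: bool = False  # set when a stop event arrives
    masking: bool = False         # True once valid and accept are forced low

    def request_stop(self) -> None:
        self.stop_requested = True

    def step(self, valid_in: bool, accept_in: bool) -> Tuple[bool, bool]:
        """Return (valid_out, accept_out) for one clock cycle."""
        if self.masking:
            return (False, False)
        handshake_done = valid_in and accept_in
        channel_idle = not valid_in
        if self.stop_requested and (handshake_done or channel_idle):
            # Nothing is left in flight: mask from the next cycle onwards.
            self.masking = True
        return (valid_in, accept_in)


stop = StopModule()
outputs = [stop.step(True, False)]      # handshake starts, not yet accepted
stop.request_stop()                     # stop event arrives mid-handshake
outputs.append(stop.step(True, False))  # still passed through unchanged
outputs.append(stop.step(True, True))   # ongoing handshake completes
outputs.append(stop.step(True, True))   # masked: no new handshake can start
print(outputs)
```

In this model an ongoing handshake is never cut short, so no data item is lost or duplicated at the point where the channel is stopped.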
19 architecture and operation
- check if NOC activity has ceased, otherwise force a stop
[Figure: request path from master to slave through NI shells, NI kernels, and a router, with valid/accept handshakes and EORQ signals, clock controllers, monitors, stop modules, the TAP controller, the debug data interconnect (DDI), e.g. JTAG, NOC, the debug control interconnect (DCI), e.g. JTAG, NOC, and the event distribution interconnect (EDI)]
20 architecture and operation
[Figure: request path from master to slave through NI shells, NI kernels, and a router, with valid/accept handshakes and EORQ signals, clock controllers, monitors, stop modules, the TAP controller, the debug data interconnect (DDI), e.g. JTAG, NOC, the debug control interconnect (DCI), e.g. JTAG, NOC, and the event distribution interconnect (EDI)]
21 interconnects
- functional data interconnect
- any interconnect, but here we assume a NOC
- event distribution interconnect (EDI)
- could use NOC, but requires minimum latency so as not to lose events
- here, very simple dedicated broadcast interconnect
- debug data interconnect (DDI)
- use a separate functional interconnect (e.g. control bus)
- using NOC is tricky
- here, reuse existing (cheap) debug infrastructure (scan chains and TAP)
- debug control interconnect (DCI)
- NOC in previous work of Ciordas
- here, reuse existing (cheap) debug infrastructure (scan chains and TAP)
22 example: normal operation
[Figure: master NI side (MNIP) and slave NI side (SNIP) exchanging write, read, wdata, and rdata; labels: live transactions, in between transactions]
23 example: transaction-level stopping
[Figure: master NI side (MNIP) and slave NI side (SNIP) exchanging write, read, wdata, and rdata, with stop_in on both sides; after the stop: no new write, finish rdata; labels: blocked with live transactions, blocked with no live transactions]
stop: finish current transactions, don't start any new transactions
24 example: message-level stopping
[Figure: master NI side (MNIP) and slave NI side (SNIP) exchanging write, read, wdata, and rdata, with stop_in on both sides; after the stop: no read, no write, no write data; labels: MNIP never blocked with live transactions, blocked with live transactions]
stop: finish current messages, don't start any new transactions
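The difference between the two stopping examples can be sketched on a flat message trace. This is a simplified model under stated assumptions: a transaction is one request message followed by one response message, the stop takes effect after a given number of messages, and the function `delivered` and the trace are hypothetical. It does not model where in the NI the blocking happens.

```python
from typing import List, Tuple

Message = Tuple[str, int]  # (kind, transaction id); kind is "req" or "resp"


def delivered(trace: List[Message], stop_index: int, level: str) -> List[Message]:
    """Return the messages that still go through when a stop hits after stop_index messages."""
    if level not in ("message", "transaction"):
        raise ValueError("level must be 'message' or 'transaction'")
    before = list(trace[:stop_index])
    if level == "message":
        # Message level: nothing new starts, live transactions stay unfinished.
        return before
    # Transaction level: responses of transactions that are already live may finish.
    started = {txn for kind, txn in before if kind == "req"}
    finished = {txn for kind, txn in before if kind == "resp"}
    live = started - finished
    tail = [(kind, txn) for kind, txn in trace[stop_index:]
            if kind == "resp" and txn in live]
    return before + tail


trace = [("req", 1), ("req", 2), ("resp", 1), ("req", 3), ("resp", 2), ("resp", 3)]
at_message_level = delivered(trace, 3, "message")
at_transaction_level = delivered(trace, 3, "transaction")
print(at_message_level)
print(at_transaction_level)
```

In this trace, transaction 2 is live when the stop arrives: message-level stopping leaves it half done, while transaction-level stopping lets its response through and only refuses the new transaction 3.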
25 conclusions
- transactions are a natural abstraction between IPs and interconnect (TLM was already invented..)
- interconnect is a natural place to control SOC behaviour
- performance is high
- EDI @ 500 MHz (functional NOC frequency)
- DCI and DDI @ 10 MHz (conventional test and debug frequency)
- costs are low
- extra NOC hardware cost is only 4.5 (monitors, EDI, NI shell)
- re-use scan and TAP for DDI
- flexible programmable hardware infrastructure, but at low level
- future work is software layer to offer debug programming, stopping, single stepping, etc.
26 acknowledgements
- monitoring
- Calin Ciordas (TUE)
- debug
- Siddharth Umrani (TUD)
- plus the whole Æthereal team, especially
- Martijn Coenen (NXP)
- Andreas Hansson (TUE)
- Jef van Meerbergen (Philips)
29 transaction-based debug
- software debug traditionally focusses on instructions or above
- hardware debug traditionally focusses on clock cycles or below
- these are hard to combine
- different tools
- no unique correspondence between cycles at different locations
- e.g. due to GALS
- there may be no point in time (cycle) where all instructions have finished
- transactions (requests and responses)