Title: Checksum
1 Checksum offloading
- A look at how the Pro1000 NICs can be programmed to compute and insert TCP/IP checksums
2 Network efficiency
- Last time (in our nictcp.c demo) we saw the amount of work a CPU would need to do when setting up an ethernet packet for transmission with TCP/IP protocol format
- In a busy network this amount of packet-computation becomes a bottleneck that degrades overall system performance
- But a lot of that work can be offloaded!
3 The loops are costly
- To prepare for a packet-transmission, the device-driver has to execute a few dozen assignment-statements, to set up fields in the packet's headers and in the Transmit Descriptor that will be used by the NIC
- Most of these assignments involve simple memory-to-memory copying of parameters
- But the checksum fields require loops
4 Can't unroll checksum-loops
- One programming technique for speeding up loop-execution is known as unrolling, to avoid the test-and-branch inefficiency
- But it requires knowing in advance what number of loop-iterations will be needed
    int sum = 0;
    sum += wp[0]; sum += wp[1]; sum += wp[2]; ... sum += wp[99];
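For contrast with the unrolled form, here is a minimal sketch of the loop a driver needs when the word count is only known at run time. The function names `sum_words` and `fold_sum`, and the parameter `nwords`, are illustrative assumptions, not names from nictcp.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum 'nwords' 16-bit words. The count is a run-time value,
   so the loop cannot be written out as a fixed list of additions. */
uint32_t sum_words(const uint16_t *wp, size_t nwords)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < nwords; i++)
        sum += wp[i];
    return sum;
}

/* Fold the carries back into the low 16 bits to obtain
   a one's-complement sum. */
uint16_t fold_sum(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)sum;
}
```

Every iteration pays for a compare and a branch, which is the per-packet cost that offloading removes.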
5 The offload solution
- Modern network controllers can be built to perform TCP/IP checksum calculations on packet-data as it is being fetched from ram
- This relieves a CPU from having to do the most intense portion of packet preparation
- But checksum offloading is an optional capability that has to be enabled and programmed for a specific packet-layout
6 Context descriptors
- Intel's Pro1000 network controllers employ special Context Transmit-Descriptors for enabling and configuring the checksum-offloading capability
- Two kinds of Context Descriptor are used
  - An Offload Context Descriptor (Type 0)
  - A Data Context Descriptor (Type 1)
7 Context descriptor (type 0)

First quadword
    bits 63-48   TUCSE
    bits 47-40   TUCSO
    bits 39-32   TUCSS
    bits 31-16   IPCSE
    bits 15-8    IPCSO
    bits 7-0     IPCSS

Second quadword
    bits 63-48   MSS
    bits 47-40   HDRLEN
    bits 39-36   RSV
    bits 35-32   STA
    bits 31-24   TUCMD
    bits 23-20   DTYP = 0
    bits 19-0    PAYLEN

DEXT = 1 (Extended Descriptor)

Legend
    IPCSS  (IP CheckSum Start)        TUCSS  (TCP/UDP CheckSum Start)
    IPCSO  (IP CheckSum Offset)       TUCSO  (TCP/UDP CheckSum Offset)
    IPCSE  (IP CheckSum Ending)       TUCSE  (TCP/UDP CheckSum Ending)
    PAYLEN (Payload Length)           DTYP   (Descriptor Type)
    TUCMD  (TCP/UDP Command)          STA    (TCP/UDP Status)
    HDRLEN (Header Length)            MSS    (Maximum Segment Size)
8 The TUCMD byte

    bit 7   IDE
    bit 6   SNAP
    bit 5   DEXT (1)
    bit 4   reserved (0)
    bit 3   RS
    bit 2   TSE
    bit 1   IP
    bit 0   TCP

Legend
    IDE  (Interrupt Delay Enable)     SNAP (Sub-Network Access Protocol)
    DEXT (Descriptor Extension)       RS   (Report Status)
    TSE  (TCP-Segmentation Enable)    IP   (Internet Protocol)
    TCP  (Transmission Control Protocol)

Key: always valid / valid only when TSE = 1
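The bit positions in the TUCMD byte can be given symbolic names so that a command value reads as a list of features. This is only a sketch: the `TUCMD_*` constants and the helper `tucmd_offload` are hypothetical names, and the value combines the DEXT and RS bits, the pair this deck later sets in its Type 0 context.

```c
#include <stdint.h>

/* Bit positions in the TUCMD byte (bit 4 is reserved and stays 0). */
enum {
    TUCMD_TCP  = 1 << 0,
    TUCMD_IP   = 1 << 1,
    TUCMD_TSE  = 1 << 2,
    TUCMD_RS   = 1 << 3,
    TUCMD_DEXT = 1 << 5,
    TUCMD_SNAP = 1 << 6,
    TUCMD_IDE  = 1 << 7
};

/* Command byte for a checksum-offload context without segmentation:
   extended descriptor format plus a status report. */
static inline uint8_t tucmd_offload(void)
{
    return TUCMD_DEXT | TUCMD_RS;
}
```

The result is the same value as the expression `(1<<5)|(1<<3)`, namely 0x28.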
9 Context descriptor (type 1)

First quadword
    bits 63-0    ADDRESS

Second quadword
    bits 63-48   VLAN
    bits 47-40   POPTS
    bits 39-36   RSV
    bits 35-32   STA
    bits 31-24   DCMD
    bits 23-20   DTYP = 1
    bits 19-0    DTALEN

DEXT = 1 (Extended Descriptor)

Legend
    DTALEN (Data Length)              DTYP  (Descriptor Type)
    DCMD   (Descriptor Command)       STA   (Status)
    RSV    (Reserved)                 POPTS (Packet Options)
    VLAN   (VLAN tag)
10 The DCMD byte

    bit 7   IDE
    bit 6   VLE
    bit 5   DEXT (1)
    bit 4   reserved (0)
    bit 3   RS
    bit 2   TSE
    bit 1   IFCS
    bit 0   EOP

Legend
    IDE  (Interrupt Delay Enable)     VLE  (VLAN Enable)
    DEXT (Descriptor Extension)       RS   (Report Status)
    TSE  (TCP-Segmentation Enable)    IFCS (Insert Frame CheckSum)
    EOP  (End Of Packet)

Key: always valid / valid only when EOP = 1
11 Our usage example
- We've created a module named offload.c which demonstrates the NIC's checksum-offloading capability for TCP/IP packets
- It's a modification of our earlier nictcp.c character-mode device-driver module
- We have excerpted the main changes in a class-handout; the full version is online
12 Data-type definitions
    // Our type-definition for the Type 0 Context-Descriptor
    typedef struct {
        unsigned char   ipcss;
        unsigned char   ipcso;
        unsigned short  ipcse;
        unsigned char   tucss;
        unsigned char   tucso;
        unsigned short  tucse;
        unsigned int    paylen:20;
        unsigned int    dtyp:4;
        unsigned int    tucmd:8;
        unsigned char   status;
        unsigned char   hdrlen;
        unsigned short  mss;
    } TX_CONTEXT_OFFLOAD;
13 Definitions (continued)
    // Our type-definition for the Type 1 Context-Descriptor
    typedef struct {
        unsigned long long  base_addr;
        unsigned int        dtalen:20;
        unsigned int        dtyp:4;
        unsigned int        dcmd:8;
        unsigned char       status;
        unsigned char       pkt_opts;
        unsigned short      vlan_tag;
    } TX_CONTEXT_DATA;

    typedef union {
        TX_CONTEXT_OFFLOAD  off;
        TX_CONTEXT_DATA     dat;
    } TX_DESCRIPTOR;
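Both descriptor formats are two quadwords long, so the union should occupy exactly 16 bytes. Bit-field layout is implementation-defined in C, so the sketch below assumes a compiler such as GCC or Clang on x86, which packs the three bit-fields into one 32-bit unit starting from the low bits. The compile-time checks are an addition for illustration and are not part of offload.c.

```c
#include <stddef.h>

typedef struct {
    unsigned char   ipcss, ipcso;
    unsigned short  ipcse;
    unsigned char   tucss, tucso;
    unsigned short  tucse;
    unsigned int    paylen:20, dtyp:4, tucmd:8;
    unsigned char   status, hdrlen;
    unsigned short  mss;
} TX_CONTEXT_OFFLOAD;

typedef struct {
    unsigned long long  base_addr;
    unsigned int        dtalen:20, dtyp:4, dcmd:8;
    unsigned char       status, pkt_opts;
    unsigned short      vlan_tag;
} TX_CONTEXT_DATA;

typedef union {
    TX_CONTEXT_OFFLOAD  off;
    TX_CONTEXT_DATA     dat;
} TX_DESCRIPTOR;

/* Fail the build if the compiler lays the fields out differently
   from the 16-byte hardware format. */
_Static_assert(sizeof(TX_DESCRIPTOR) == 16, "descriptor must be 16 bytes");
_Static_assert(offsetof(TX_CONTEXT_OFFLOAD, status) == 12, "status at byte 12");
_Static_assert(offsetof(TX_CONTEXT_DATA, vlan_tag) == 14, "vlan_tag at byte 14");
```

Because the `dtyp` field sits at the same bit position in both structures, either view of the union can be used to read the descriptor type.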
14 Our packet's layout

    Offset   Contents
    0        Ethernet Header (14 bytes)
    14       IP Header (20 bytes, no options); HDR CKSUM lies 10 bytes into this header
    34       TCP Header (20 bytes, no options); TCP CKSUM lies 16 bytes into this header
    54       Packet-Data (length varies)
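The offsets that the context descriptor needs follow directly from this layout. The sketch below derives them with `sizeof` and `offsetof` rather than hard-coding them; the header structures and constant names are illustrative assumptions (fields are declared as byte arrays so that no compiler padding can creep in), not the driver's own definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Headers without options, declared bytewise to rule out padding. */
struct eth_hdr { uint8_t dst[6], src[6], type[2]; };
struct ip_hdr  { uint8_t ver_ihl, tos, len[2], id[2], frag[2],
                         ttl, proto, cksum[2], src[4], dst[4]; };
struct tcp_hdr { uint8_t sport[2], dport[2], seq[4], ack[4],
                         off_flags[2], window[2], cksum[2], urgent[2]; };

enum {
    IPCSS      = sizeof(struct eth_hdr),                    /* IP header start     */
    IPCSO      = IPCSS + offsetof(struct ip_hdr, cksum),    /* IP checksum field   */
    TUCSS      = IPCSS + sizeof(struct ip_hdr),             /* TCP header start    */
    TUCSO      = TUCSS + offsetof(struct tcp_hdr, cksum),   /* TCP checksum field  */
    DATA_START = TUCSS + sizeof(struct tcp_hdr)             /* first payload byte  */
};
```

With option-free headers these evaluate to 14, 24, 34, 50 and 54, matching the layout table.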
15 How we use contexts
- Our offload.c driver will send a Type 0 Context Descriptor within module_init()
    txring[ 0 ].off.ipcss = 14;   // IP-header CheckSum Start
    txring[ 0 ].off.ipcso = 24;   // IP-header CheckSum Offset
    txring[ 0 ].off.ipcse = 34;   // IP-header CheckSum Ending
    txring[ 0 ].off.tucss = 34;   // TCP/UDP-segment CheckSum Start
    txring[ 0 ].off.tucso = 50;   // TCP/UDP-segment CheckSum Offset
    txring[ 0 ].off.tucse = 0;    // TCP/UDP-segment CheckSum Ending
    txring[ 0 ].off.dtyp = 0;     // Type 0 Context Descriptor
    txring[ 0 ].off.tucmd = (1<<5)|(1<<3);   // DEXT=1, RS=1
    iowrite32( 1, io + E1000_TDT );   // give ownership to NIC
16 Using contexts (continued)
- Our offload.c driver will then use a Type 1 context descriptor every time its write() function is called to transmit user-data
- The network controller remembers the checksum-offloading parameters that we sent during module-initialization, and so it continues to apply them to every outgoing packet (we keep our same packet-layout)
17 Sequence of write() steps
- Adjust the len argument (if necessary)
- Copy len bytes from the user's buf array
- Prepend the packet's TCP Header
- Insert the pseudo-header's checksum
- Prepend the packet's IP Header
- Prepend the packet's Ethernet Header
- Initialize the Data-Context Tx-Descriptor
- Give descriptor-ownership to the NIC
18 The TCP pseudo-header
- We do initialize the TCP Checksum field (but this only needs a short computation)
- The one's-complement sum of these six words is placed into TCP Checksum

    Source IP-address        (2 words)
    Destination IP-address   (2 words)
    Zero, Protocol-ID (= 6)  (1 word)
    TCP Segment-length       (1 word)
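The short computation over the six pseudo-header words can be sketched as follows. The function name `pseudo_sum` and its parameters are hypothetical, and the addresses are passed as plain 32-bit host-order values for illustration only; storing the result into the packet in network byte order is a separate step that this sketch does not cover.

```c
#include <stdint.h>

/* One's-complement sum of the six 16-bit pseudo-header words:
   two for each IP address, one holding Zero and the Protocol-ID,
   and one holding the TCP segment length. */
uint16_t pseudo_sum(uint32_t src_ip, uint32_t dst_ip,
                    uint8_t protocol, uint16_t seg_len)
{
    uint32_t sum = 0;

    sum += src_ip >> 16;
    sum += src_ip & 0xFFFF;
    sum += dst_ip >> 16;
    sum += dst_ip & 0xFFFF;
    sum += protocol;        /* high byte of this word is zero */
    sum += seg_len;

    /* End-around carry: fold anything above bit 15 back in. */
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);

    return (uint16_t)sum;
}
```

Only six additions are involved regardless of the payload size, which is why this part stays on the CPU while the per-byte work moves to the NIC.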
19 Setting up the Type-1 Context
    int txtail = ioread32( io + E1000_TDT );
    txring[ txtail ].dat.base_addr = tx_desc + (txtail * TX_BUFSIZ);
    txring[ txtail ].dat.dtalen = 54 + len;
    txring[ txtail ].dat.dtyp = 1;
    txring[ txtail ].dat.dcmd = 0;
    txring[ txtail ].dat.status = 0;
    txring[ txtail ].dat.pkt_opts = 3;      // IXSM=1, TXSM=1
    txring[ txtail ].dat.vlan_tag = vlan_id;
    txring[ txtail ].dat.dcmd |= (1<<0);    // EOP (End-Of-Packet)
    txring[ txtail ].dat.dcmd |= (1<<3);    // RS (Report Status)
    txring[ txtail ].dat.dcmd |= (1<<5);    // DEXT (Descriptor Extension)
    txring[ txtail ].dat.dcmd |= (1<<6);    // VLE (VLAN Enable)
    txtail = (1 + txtail) % N_TX_DESC;
    iowrite32( txtail, io + E1000_TDT );
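Two details of this sequence can be checked in isolation: the DCMD value that the four OR operations build up, and the wrap-around of the tail index. The sketch below is a user-space illustration only. The ring size of 8 is a made-up value (the driver's real N_TX_DESC is not shown here), and the names `DCMD_*`, `data_dcmd` and `next_tail` are hypothetical.

```c
#include <stdint.h>

enum { N_TX_DESC = 8 };   /* hypothetical ring size, for illustration only */

/* DCMD bit positions used by the write() path. */
enum {
    DCMD_EOP  = 1 << 0,   /* End Of Packet         */
    DCMD_RS   = 1 << 3,   /* Report Status         */
    DCMD_DEXT = 1 << 5,   /* Descriptor Extension  */
    DCMD_VLE  = 1 << 6    /* VLAN Enable           */
};

/* Build the command byte the same way the driver does: start at 0,
   then OR in one flag at a time. */
uint8_t data_dcmd(void)
{
    uint8_t dcmd = 0;
    dcmd |= DCMD_EOP;
    dcmd |= DCMD_RS;
    dcmd |= DCMD_DEXT;
    dcmd |= DCMD_VLE;
    return dcmd;
}

/* Index of the descriptor after 'txtail', wrapping at the ring's end. */
unsigned int next_tail(unsigned int txtail)
{
    return (1 + txtail) % N_TX_DESC;
}
```

The modulo keeps the tail inside the ring, so the descriptor after the last slot is slot 0 again.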
20 In-class demonstration
- We can demonstrate checksum-offloading by using our dram.c device-driver to look at the packet that is being transmitted from one of our anchor machines, and to look at the packet that gets received by another anchor machine
- The checksum-fields (at offsets 24 and 50) do get modified by the network hardware!
21 In-class exercise
- The NIC can also deal with packets having the UDP protocol-format, but you need to employ different parameters in the Type 0 Context Descriptor and arrange a header for the UDP segment that has a different length and arrangement of parameters
- Also, the UDP protocol-ID is 17 (0x11)
22 UDP Header

    Bits 00-15          Bits 16-31
    Source Port         Destination Port
    Length              Checksum
    Data (spans bits 00-31)

UDP header (traditional Big-Endian representation)
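As a starting point for the exercise, the UDP checksum offsets can be derived the same way as for TCP. This sketch assumes the Ethernet header (14 bytes) and option-free IP header (20 bytes) are unchanged from the TCP layout; the structure and constant names are illustrative and do not come from offload.c.

```c
#include <stddef.h>
#include <stdint.h>

/* The four 16-bit UDP header fields, declared bytewise. */
struct udp_hdr { uint8_t sport[2], dport[2], length[2], cksum[2]; };

enum {
    UDP_PROTOCOL_ID = 17,                                       /* 0x11 */
    UDP_TUCSS = 14 + 20,                                        /* UDP header start   */
    UDP_TUCSO = UDP_TUCSS + offsetof(struct udp_hdr, cksum),    /* UDP checksum field */
    UDP_DATA  = UDP_TUCSS + sizeof(struct udp_hdr)              /* first payload byte */
};
```

Under those assumptions the UDP checksum sits at offset 40 instead of 50, and the data begins at offset 42 instead of 54.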