Title: Zero Copy
1. Zero Copy
- Chia-Tai Tsai
- tai_at_cis.nctu.edu.tw
2. Introduction
- Data copy and checksum overhead dominate processing time for high-throughput networking applications.
- Single-Copy (CPU copy)
  - Single-copy and checksum overhead still accounts for 60% of networking software overhead.
- Zero-Copy
  - Moving data between application domains and network interfaces without CPU intervention
3. Where do copies occur?
Sender
- Move data from application to system buffer (COPY OVERHEAD)
- TCP/IP Protocol: Compare Checksum
- Network Driver: Transmit packet to network interface
Receiver
- Move data from system buffer to application (COPY OVERHEAD)
- TCP/IP Protocol: Compare Checksum
- Network Driver: Deposit packet in host memory
4. General Operating System Structure and Data Path
[Figure: data path across user space and kernel space]
5. Example
- Thus, copy operations are expensive:
  - bandwidth is limited
  - consumes CPU cycles
  - affects the cache
Note: these transfers only show data movement between sub-systems. Additionally, data-touching operations within a sub-system require that data be moved from memory to the CPU, e.g.
- checksum calculation
- encryption
- data encoding
- forward error correction
[Figure: Pentium 4 Processor with registers and cache(s), RDRAM banks, and PCI slots]
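To make the cost of a data-touching operation concrete, here is a minimal sketch of a 16-bit one's-complement checksum in the style of RFC 1071. The function name `internet_checksum` is chosen for illustration only. The point is that every byte of the payload has to pass through the CPU, even when no copy is made.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum in the style of RFC 1071."""
    view = memoryview(data)  # a view, not a copy, of the input
    total = 0
    # Sum the data as big-endian 16-bit words; the CPU reads every byte.
    for i in range(0, len(view) - 1, 2):
        total += (view[i] << 8) | view[i + 1]
    if len(view) % 2:
        total += view[-1] << 8  # pad a trailing odd byte with zero
    # Fold carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


if __name__ == "__main__":
    sample = bytes.fromhex("0001f203f4f5f6f7")
    print(hex(internet_checksum(sample)))
```

A receiver can verify a packet by running the same function over the data with the checksum appended: for even-length data the result is zero when nothing was corrupted.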
6. To Eliminate Copies
- Main idea: pass data by reference all the way down through the protocol stack
- We need:
  - Advanced network devices: scatter/gather DMA
  - Modification of the OS kernel
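A user-level analogue of passing data by reference with scatter/gather can be sketched with the standard `socket.sendmsg` and `socket.recvmsg_into` calls, assuming a Unix-like system. The helper names `send_gathered` and `recv_scattered` are hypothetical. This only avoids assembling header and payload into one user-space buffer; the kernel may still copy internally, so it illustrates the idea rather than true zero copy.

```python
import socket


def send_gathered(sock, header, payload):
    """Send header and payload as one message without concatenating them."""
    # sendmsg takes a list of buffers (gather I/O), so no combined
    # user-space buffer is built first.
    return sock.sendmsg([header, memoryview(payload)])


def recv_scattered(sock, header_len, payload_len):
    """Receive one message directly into separate header and payload buffers."""
    header = bytearray(header_len)
    payload = bytearray(payload_len)
    # recvmsg_into fills the buffers in order (scatter I/O).
    nbytes, _ancdata, _flags, _addr = sock.recvmsg_into([header, payload])
    return nbytes, header, payload


if __name__ == "__main__":
    # Datagram socket pair so that one send maps to exactly one receive.
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    send_gathered(left, b"HDR1", bytearray(b"payload-bytes"))
    print(recv_scattered(right, 4, 13))
    left.close()
    right.close()
```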
7. Zero-Copy: Basic Idea
[Figure: mbuf (m_data), buf (b_data), bus(es)]
8. Zero-Copy: Dynamic Allocation
[Figure: user space memory (application); mbuf memory pools and buf memory pools; file system and communication system; mbuf, buf, mbuf cluster, buf cluster]
9. Zero-Copy: Static Allocation
- Allocate all needed memory during stream initialization
- If possible, set all buf and mbuf data pointers
- Use alternating buffers
[Figure: header, data pointer, mbuf pointer, buf pointer; bufs, mbufs, data area]
10. Zero-Copy: Operations
- Stream initialization
- Read operation
- Send operation
- Stream close
[Figure: currently used buffer, header, send offset, bufs, mbufs, data area]
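The static-allocation scheme and its four operations can be approximated in user space. The sketch below is not the kernel mbuf/buf mechanism: the class `StaticStream` and its parameters are hypothetical, and Python file-like objects stand in for the file system and the communication system. It shows two buffers allocated once at stream initialization, a header written once, reads that fill the data area in place, and sends that pass a view of the buffer instead of a new copy.

```python
import io


class StaticStream:
    """Hypothetical stream with two preallocated, alternating buffers."""

    def __init__(self, source, sink, header: bytes, block_size: int):
        # Stream initialization: allocate all memory once, header included.
        self.source, self.sink = source, sink
        self.header_len = len(header)
        self.buffers = []
        for _ in range(2):
            buf = bytearray(self.header_len + block_size)
            buf[:self.header_len] = header  # written once, reused on every send
            self.buffers.append(buf)
        self.current = 0  # index of the currently used buffer
        self.filled = 0   # bytes of valid data in the current data area

    def read(self) -> int:
        # Read operation: fill the data area of the current buffer in place.
        data_area = memoryview(self.buffers[self.current])[self.header_len:]
        self.filled = self.source.readinto(data_area) or 0
        return self.filled

    def send(self) -> int:
        # Send operation: pass a view (a reference) covering header + data,
        # then switch to the other buffer.
        end = self.header_len + self.filled
        sent = self.sink.write(memoryview(self.buffers[self.current])[:end])
        self.current ^= 1
        return sent

    def close(self) -> None:
        # Stream close: release the preallocated buffers.
        self.buffers.clear()


if __name__ == "__main__":
    stream = StaticStream(io.BytesIO(b"abcdefgh"), io.BytesIO(), b"H:", 4)
    while stream.read():
        stream.send()
    print(stream.sink.getvalue())
    stream.close()
```

No buffer is allocated after initialization; the same two `bytearray` objects are reused for the lifetime of the stream.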
11. Zero Copy Schemes
- User accessible interface memory
- Kernel-network shared memory
- User-kernel shared memory
- User-kernel page remapping with COW (copy-on-write)
12. User accessible interface memory
- The network interface memory is accessible and pre-mapped into user and kernel address space.
- Cons
  - Requires complicated hardware support and software changes.
  - On the receive side, it requires intelligence in the network hardware to direct incoming data to the right interface memory pool.
  - The application is required to use special buffer management calls to allocate and use the interface memory.
  - Limited interface memory could pose a serious resource problem.
13. Kernel-network shared memory
- The OS kernel manages the interface memory and uses DMA or PIO (programmed I/O) to move data between interface memory and application buffers.
- Does not require the application to be modified
- Cons
  - Requires kernel code changes to manage a special pool of memory from the network interface.
14. User-kernel shared memory
- Defines a new set of APIs with shared semantics between the user and kernel address spaces.
- Uses DMA to move data between the shared memory and the network interface.
- Fast Buffers
  - Uses a per-process buffer pool that is pre-mapped in both the user and kernel address spaces.
- Cons
  - No compatibility with existing applications because of the new APIs
  - Network hardware must be capable of targeting the DMA transfer of an incoming packet to the correct memory pool allocated by the client.
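The pre-mapped buffer pool idea can be illustrated with a single-process analogy: one anonymous memory mapping seen through two views, where data written through one view is visible through the other without being copied. This is only a sketch. The names `POOL_SIZE`, `buffer_slot`, `producer_view`, and `consumer_view` are hypothetical, and a real implementation would map the pool into both the user and kernel address spaces.

```python
import mmap

POOL_SIZE = 4 * mmap.PAGESIZE  # hypothetical pool of four page-sized buffers

pool = mmap.mmap(-1, POOL_SIZE)   # anonymous mapping standing in for the shared pool
producer_view = memoryview(pool)  # e.g. the application side
consumer_view = memoryview(pool)  # e.g. the kernel or driver side


def buffer_slot(view, index):
    """Return a view of one page-sized buffer in the pool (no copy)."""
    start = index * mmap.PAGESIZE
    return view[start:start + mmap.PAGESIZE]


msg = b"packet payload"
# The producer writes directly into slot 2 of the pool.
buffer_slot(producer_view, 2)[:len(msg)] = msg
# The consumer sees the same bytes through its own view of the same memory.
received = buffer_slot(consumer_view, 2)[:len(msg)]
```

Only the slot index has to be communicated between the two sides; the payload itself stays where it was written.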
15. User-kernel shared memory
16. User-kernel page remapping with COW
- Uses DMA to transfer data between interface memory and kernel buffers, and remaps buffers by editing the MMU (Memory Management Unit) table to give the appearance of data transfer (with copy-on-write).
- No modification to the socket interface and VM system
- Cons
  - All the buffers involved must be aligned on page boundaries and occupy an integral number of MMU pages.
    - This is a problem when the MTU size is larger than the system page size.
  - Network drivers must arrange receive buffers on a page boundary.
    - They need to predict the header size.
  - Applications should avoid reusing busy buffers.
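The alignment constraints above reduce to simple address arithmetic. The following sketch uses hypothetical helper names (`is_remappable`, `pages_spanned`, `receive_buffer_start`) and plain integers in place of real addresses. It shows why a buffer must start on a page boundary and span whole pages, and why the driver has to predict the header size to place the payload on a page boundary.

```python
import mmap

PAGE_SIZE = mmap.PAGESIZE  # system page size; 4096 is passed explicitly in the examples


def is_remappable(address: int, length: int, page_size: int = PAGE_SIZE) -> bool:
    """True if the buffer starts on a page boundary and spans whole pages."""
    return length > 0 and address % page_size == 0 and length % page_size == 0


def pages_spanned(address: int, length: int, page_size: int = PAGE_SIZE) -> int:
    """Number of pages touched by the byte range [address, address + length)."""
    first = address // page_size
    last = (address + length - 1) // page_size
    return last - first + 1


def receive_buffer_start(page_address: int, predicted_header_size: int) -> int:
    """Where the driver should place a packet so its payload starts at page_address."""
    return page_address - predicted_header_size


if __name__ == "__main__":
    # A page-aligned, page-sized payload can be remapped; an offset one cannot.
    print(is_remappable(8192, 4096, 4096), is_remappable(8192 + 40, 4096, 4096))
    # A 9000-byte payload spans several 4096-byte pages.
    print(pages_spanned(4096, 9000, 4096))
```

If the predicted header size is wrong, the payload no longer starts on a page boundary and `is_remappable` fails for it, which is the case where a remapping scheme would have to fall back to copying.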
17. Conclusions
- Efficient zero-copy implementation for network I/O.
- Designs based on virtual memory page remapping and copy-on-write require the least amount of changes.