Title: CS 498 Lecture 9 Traffic Control for QoS
1CS 498 Lecture 9 Traffic Control for QoS
- Jennifer Hou
- Department of Computer Science
- University of Illinois at Urbana-Champaign
- Reading: Chapter 18, The Linux Networking
Architecture: Design and Implementation of
Network Protocols in the Linux Kernel
2Traffic Control
- Two major functions
- Policing
- Usually implemented at the router.
- Data connections are monitored, and packets
transmitted in violation of a specified policy
are discarded.
- Traffic shaping
- Usually implemented at end hosts.
- Data connections are regulated to conform to a
certain rate. Surplus packets are either marked
and then sent, or delayed at the sender side until
they conform to the rate constraint.
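The difference between the two functions can be sketched with a token bucket, which is one common way to express a rate constraint. This is a user-space illustration under stated assumptions, not kernel code: the type `tbucket` and the functions `tb_refill`, `police`, and `shape_delay_ms` are hypothetical names, time is passed in explicitly in milliseconds, and packets are assumed to be no longer than the burst size.

```c
#include <stdint.h>

/* Hypothetical token bucket: one token allows one byte to be sent. */
struct tbucket {
    uint64_t rate_per_ms;  /* tokens (bytes) added per millisecond */
    uint64_t burst;        /* bucket depth in bytes */
    uint64_t tokens;       /* bytes currently available */
    uint64_t last_ms;      /* time of the last refill */
};

/* Add the tokens earned since the last refill, capped at the burst size. */
static void tb_refill(struct tbucket *tb, uint64_t now_ms)
{
    uint64_t earned = (now_ms - tb->last_ms) * tb->rate_per_ms;
    uint64_t total = tb->tokens + earned;

    tb->tokens = total > tb->burst ? tb->burst : total;
    tb->last_ms = now_ms;
}

/* Policing: returns 1 if the packet conforms and may pass,
 * 0 if it exceeds the rate and is discarded. */
static int police(struct tbucket *tb, uint64_t now_ms, uint64_t len)
{
    tb_refill(tb, now_ms);
    if (tb->tokens < len)
        return 0;
    tb->tokens -= len;
    return 1;
}

/* Shaping: returns how many milliseconds the sender must hold the
 * packet before it conforms (0 means it can be sent now). Tokens are
 * consumed only when the packet can be sent immediately.
 * Assumes len <= tb->burst and rate_per_ms > 0. */
static uint64_t shape_delay_ms(struct tbucket *tb, uint64_t now_ms,
                               uint64_t len)
{
    uint64_t missing;

    tb_refill(tb, now_ms);
    if (tb->tokens >= len) {
        tb->tokens -= len;
        return 0;
    }
    missing = len - tb->tokens;
    /* Round up to whole milliseconds. */
    return (missing + tb->rate_per_ms - 1) / tb->rate_per_ms;
}
```

The policer never stores a packet, which is why it suits a router. The shaper needs a buffer at the sender to hold the packet for the returned delay.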
3Processing of Network Data
[Figure: processing of network data. Labels: Ingress policing, Input de-multiplexing, Forwarding, Upper layers (TCP, UDP, ...), Output queuing, Traffic control.]
4Traffic Control in Linux Kernel
[Figure: traffic control in the Linux kernel. Labels: Local delivery, Locally created data, net/core/dev.c, net/ipv4/ip_input.c, Forwarding, dev_queue_xmit, net/core/dev.c and driver.c (dev->hard_start_xmit).]
- net/sched/sch_ingress.c: traffic control in the incoming direction
- net/sched/sch_*.c, net/sched/cls_*.c: traffic control in the outgoing direction
5Traffic Control in Linux Kernel
[Figure: packet path through the kernel with two CPUs (CPU1, CPU2) and two devices (eth0, eth1); Scheduler appears on both paths.
Receive-side labels: net_interrupt, dev_alloc_skb() and eth_type_trans() (driver.c); netif_rx (dev.c); softnet_data[cpun].input_pkt_queue; do_softirq; net_rx_action (dev.c); handle_bridge (br_input.c, CONFIG_BRIDGE); p8022_rcv (ETH_P_802_2), arp_rcv, ip_rcv.
Transmit-side labels: ip_queue_xmit, arp_send; dev_queue_xmit (dev.c); dev->qdisc->enqueue; qdisc_run; qdisc_restart; dev->qdisc->dequeue; net_tx_action (dev.c); dev->hard_start_xmit (driver.c).]
6Components of Traffic Control
- Queuing disciplines
- Packets to be sent are passed to a queueing discipline
and sorted within the queue according to
specific rules.
- Packets can be removed no earlier than when the
queueing discipline has marked them as ready for
transmission.
- Classes (within a queuing discipline)
- Within a queueing discipline, packets can be
allocated to different classes.
- Filters are used to allocate packets to classes
within a queueing discipline.
7Queuing Discipline
- Each network device has a queuing discipline
- It controls how packets enqueued on the
device are treated
- Possible operations: keep, drop, mark
- A simple one may consist of just a single queue
Queuing discipline
8Complex Queuing Discipline
- Queuing discipline
- May use filters to distinguish among different
classes of packets
- Processes each class in a specific way
- Two filters can point to one class
- Classes
- Do not store packets themselves
- They use another queuing discipline to do that
[Figure: a queueing discipline with an enqueue and a dequeue side; three filters map packets to Class 1 and Class 2, and each class has its own inner queueing discipline.]
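The relation between filters, classes, and inner queueing disciplines can be sketched in user space as follows. All names here (`toy_fifo`, `toy_class`, `toy_filter`, `toy_pkt`, `toy_enqueue`) are hypothetical, and matching on a destination port range is only an assumed example of a filter rule; the kernel structures are different.

```c
#include <stddef.h>

#define TOY_QLEN 8

/* Inner queueing discipline: a bounded FIFO of packet ids. */
struct toy_fifo {
    int pkts[TOY_QLEN];
    size_t head;
    size_t len;
};

/* A class stores no packets itself; it refers to an inner queue. */
struct toy_class {
    struct toy_fifo *inner;
};

/* A filter matches a destination port range and names a class. */
struct toy_filter {
    int port_lo;
    int port_hi;
    struct toy_class *target;
};

struct toy_pkt {
    int id;
    int dport;
};

/* Append a packet id to the FIFO; -1 if the FIFO is full. */
static int fifo_put(struct toy_fifo *f, int id)
{
    if (f->len == TOY_QLEN)
        return -1;
    f->pkts[(f->head + f->len) % TOY_QLEN] = id;
    f->len++;
    return 0;
}

/* Outer enqueue: ask each filter in turn. The first match selects the
 * class, and the inner queue of that class stores the packet.
 * Returns 0 on success, -1 if no filter matched or the queue is full. */
static int toy_enqueue(const struct toy_filter *filters, size_t nfilters,
                       const struct toy_pkt *p)
{
    size_t i;

    for (i = 0; i < nfilters; i++) {
        if (p->dport >= filters[i].port_lo &&
            p->dport <= filters[i].port_hi)
            return fifo_put(filters[i].target->inner, p->id);
    }
    return -1;
}
```

With filters for ports 80 and 443 both pointing at the same class, packets for either port end up in the same inner queue, which mirrors the "two filters can point to one class" case.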
9Complex Queuing Discipline
10Policing
- When packets of a connection are enqueued, the
connection can be policed by
- Letting the packets go
- Dropping the packets
- Letting the packets go but marking them
11Data Structures
- include/net/pkt_sched.h
- include/net/sch_generic.h
12Traffic Control in Linux Kernel
- Traffic control kernel code resides mainly in
net/sched
- Traffic control in the incoming direction is
handled by net/sched/sch_ingress.c.
- Scheduling disciplines and classifiers for the outgoing
direction are implemented in
- net/sched/sch_*.c
- net/sched/cls_*.c
13Traffic Control in Linux Kernel
- Interfaces used inside the kernel can be found in
- /usr/src/linux-(version)/include/net/pkt_cls.h
- /usr/src/linux-(version)/include/net/pkt_sched.h
- Interfaces between kernel traffic control and
user space programs are declared in
- /usr/include/linux/pkt_cls.h
- /usr/include/linux/pkt_sched.h
14Inserting Traffic Control
[Figure: a queueing discipline with an enqueue and a dequeue side; three filters map packets to Class 1 and Class 2, and each class has its own inner queueing discipline.]
15Queueing Discipline -- Qdisc
- struct Qdisc {
- int (*enqueue)(struct sk_buff *skb, struct Qdisc *dev);
- struct sk_buff *(*dequeue)(struct Qdisc *dev);
- unsigned flags;
- #define TCQ_F_BUILTIN 1
- #define TCQ_F_THROTTLED 2
- #define TCQ_F_INGRESS 4
- int padded;
- struct Qdisc_ops *ops;
- u32 handle;
- u32 parent;
- atomic_t refcnt;
- struct sk_buff_head q;
- struct net_device *dev;
- struct list_head list;
- struct gnet_stats_basic bstats;
- struct gnet_stats_queue qstats;
- ...
- };
- ops: the Qdisc_ops data structure
- q: the socket buffer queue governed by this qdisc
- dev: the network device to which the Qdisc is attached
- reshape_fail: when an outer queue passes a packet to an inner
queue, the packet may have to be discarded. If the
outer queueing discipline implements the
callback function reshape_fail, the inner queueing
discipline can invoke it.
16Queuing Disciplines Qdisc_ops
- struct Qdisc_ops {
- struct Qdisc_ops *next;
- struct Qdisc_class_ops *cl_ops;
- char id[IFNAMSIZ];
- int priv_size;
- int (*enqueue)(struct sk_buff *, struct Qdisc *);
- struct sk_buff *(*dequeue)(struct Qdisc *);
- int (*requeue)(struct sk_buff *, struct Qdisc *);
- unsigned int (*drop)(struct Qdisc *);
- int (*init)(struct Qdisc *, struct rtattr *arg);
- void (*reset)(struct Qdisc *);
- void (*destroy)(struct Qdisc *);
- int (*change)(struct Qdisc *, struct rtattr *arg);
- int (*dump)(struct Qdisc *, struct sk_buff *);
- };
- requeue(): the packet is put back at the position in
the queueing discipline where it was before.
- A queueing discipline can be added via
register_qdisc() in init_module().
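The `next` pointer and the `id` field suggest how registration works: all known disciplines sit on a linked list and are found by name. The following user-space sketch illustrates that idea only. `toy_qdisc_ops`, `toy_register_qdisc`, and `toy_qdisc_lookup` are hypothetical names and do not reproduce the kernel's register_qdisc().

```c
#include <stddef.h>
#include <string.h>

#define TOY_IDSIZ 16

/* Hypothetical, trimmed-down analogue of struct Qdisc_ops. */
struct toy_qdisc_ops {
    struct toy_qdisc_ops *next;  /* links all registered disciplines */
    char id[TOY_IDSIZ];          /* name used to look the discipline up */
};

/* Head of the registry list. */
static struct toy_qdisc_ops *toy_qdisc_base;

/* Append ops to the registry; returns -1 if the id is already taken. */
static int toy_register_qdisc(struct toy_qdisc_ops *ops)
{
    struct toy_qdisc_ops **pp;

    for (pp = &toy_qdisc_base; *pp; pp = &(*pp)->next)
        if (strcmp((*pp)->id, ops->id) == 0)
            return -1;
    ops->next = NULL;
    *pp = ops;
    return 0;
}

/* Find a registered discipline by name, or return NULL. */
static struct toy_qdisc_ops *toy_qdisc_lookup(const char *id)
{
    struct toy_qdisc_ops *p;

    for (p = toy_qdisc_base; p; p = p->next)
        if (strcmp(p->id, id) == 0)
            return p;
    return NULL;
}
```

A module would call the registration function once from its initialization routine, and the configuration path would later look the discipline up by the name the user supplied.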
17Qdisc_ops
- enqueue()
- Enqueues a packet
- Return values are
- NET_XMIT_SUCCESS, if the packet is accepted
- NET_XMIT_DROP, if the packet is discarded
- NET_XMIT_CN, if the packet is discarded because
of buffer overflow
- NET_XMIT_POLICED, if the packet is discarded
because of violation of a policing rule
- NET_XMIT_BYPASS, if the packet is accepted but
will not leave the queue via the regular
dequeue() function
18Qdisc_ops
- dequeue()
- Returns a pointer to a packet (skb) eligible for
sending
- A return value of NULL means that there are no
packets ready to be sent. (The total number of
packets in the queue is given by q.qlen in struct
Qdisc.)
- requeue()
- Puts a packet back into the position in
the queue where it was before.
- The count of packets that have passed through the queue
should not be incremented.
- drop()
- Drops one packet from the queue
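A minimal user-space FIFO can illustrate how enqueue(), dequeue(), requeue(), and drop() relate to each other. This is a sketch under assumptions: packets are plain integers, the names `toy_fifo`, `toy_enqueue`, `toy_dequeue`, `toy_requeue`, `toy_drop`, and the `TOY_XMIT_*` codes are hypothetical, and dropping the newest packet is an arbitrary choice made for the example.

```c
#include <stddef.h>

#define TOY_LIMIT 4

/* Local stand-ins for the NET_XMIT_* style return codes. */
enum { TOY_XMIT_SUCCESS = 0, TOY_XMIT_DROP = 1 };

struct toy_fifo {
    int slot[TOY_LIMIT];   /* packet ids, stored as a ring */
    size_t head;           /* index of the oldest packet */
    size_t qlen;           /* packets currently queued */
    unsigned long passed;  /* packets accepted so far (statistics) */
    unsigned long drops;   /* packets discarded so far */
};

/* Accept the packet at the tail, or refuse it if the queue is full. */
static int toy_enqueue(struct toy_fifo *q, int pkt)
{
    if (q->qlen == TOY_LIMIT) {
        q->drops++;
        return TOY_XMIT_DROP;
    }
    q->slot[(q->head + q->qlen) % TOY_LIMIT] = pkt;
    q->qlen++;
    q->passed++;
    return TOY_XMIT_SUCCESS;
}

/* Returns 1 and stores the oldest packet in *pkt, or 0 if empty. */
static int toy_dequeue(struct toy_fifo *q, int *pkt)
{
    if (q->qlen == 0)
        return 0;
    *pkt = q->slot[q->head];
    q->head = (q->head + 1) % TOY_LIMIT;
    q->qlen--;
    return 1;
}

/* Put a just-dequeued packet back at the front. The 'passed' counter
 * is left alone because the packet was already counted. */
static int toy_requeue(struct toy_fifo *q, int pkt)
{
    if (q->qlen == TOY_LIMIT) {
        q->drops++;
        return TOY_XMIT_DROP;
    }
    q->head = (q->head + TOY_LIMIT - 1) % TOY_LIMIT;
    q->slot[q->head] = pkt;
    q->qlen++;
    return TOY_XMIT_SUCCESS;
}

/* Drop one packet (here the newest). Returns 1 if a packet was
 * dropped, 0 if the queue was empty. */
static unsigned int toy_drop(struct toy_fifo *q)
{
    if (q->qlen == 0)
        return 0;
    q->qlen--;
    q->drops++;
    return 1;
}
```

A typical use of requeue() is a sender that dequeues a packet, finds the device busy, and hands the packet back so that it is the next one returned by dequeue().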
19Qdisc_ops
- init()
- Initializes the queuing discipline
- reset()
- Resets the queuing discipline to its initial
state (empty queue, reset counters, delete timers)
- destroy()
- Removes a queuing discipline and frees all the
resources reserved during its runtime.
- change()
- Changes the parameters of a queuing discipline
- dump()
- Outputs the configuration parameters and
statistics of a queueing discipline.
20Qdisc_class_ops
- struct Qdisc_class_ops {
- /* Child qdisc manipulation */
- int (*graft)(struct Qdisc *, unsigned long cl, struct Qdisc *, struct Qdisc **);
- struct Qdisc *(*leaf)(struct Qdisc *, unsigned long cl);
- /* Class manipulation routines */
- unsigned long (*get)(struct Qdisc *, u32 classid);
- void (*put)(struct Qdisc *, unsigned long);
- int (*change)(struct Qdisc *, u32, u32, struct rtattr **, unsigned long *);
- int (*delete)(struct Qdisc *, unsigned long);
- void (*walk)(struct Qdisc *, struct qdisc_walker *arg);
- /* Filter manipulation */
- struct tcf_proto **(*tcf_chain)(struct Qdisc *, unsigned long);
- unsigned long (*bind_tcf)(struct Qdisc *, unsigned long, u32 classid);
- void (*unbind_tcf)(struct Qdisc *, unsigned long);
- /* rtnetlink specific */
- ...
- };
21Qdisc_class_ops
- graft() binds a queueing discipline to a class
- leaf() returns a pointer to the queueing
discipline currently bound to the class
- get() maps the classid to the internal
identification and increments the reference
counter by one.
- Each class is associated with two ids
- classid (of type u32) is used by the user and by the
configuration tools in user space.
- Internal identification (of type unsigned long)
is used within the kernel
- put() decrements the usage counter.
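The get()/put() pairing can be sketched as a lookup that hands out an internal handle and counts references. This is an illustration only: `toy_class`, `toy_get`, and `toy_put` are hypothetical names, and using "table index plus one" as the internal identification is an assumption made to keep the example portable.

```c
#include <stddef.h>
#include <stdint.h>

struct toy_class {
    uint32_t classid;  /* id visible to user-space tools */
    int refcnt;        /* usage counter */
};

/* get(): map classid to the internal identification and take a
 * reference. The internal id is the table index plus one, so 0 can
 * signal "no such class". */
static unsigned long toy_get(struct toy_class *tbl, size_t n,
                             uint32_t classid)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (tbl[i].classid == classid) {
            tbl[i].refcnt++;
            return (unsigned long)(i + 1);
        }
    }
    return 0;
}

/* put(): release the reference taken by toy_get(). */
static void toy_put(struct toy_class *tbl, unsigned long cl)
{
    if (cl != 0)
        tbl[cl - 1].refcnt--;
}
```

Every successful get() should be matched by one put(), so the counter returns to its earlier value once the caller is done with the class.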
22Qdisc_class_ops
- change() changes the class parameters
- delete() checks whether the class is still referenced
and, if not, deletes the class.
- walk() walks through the linked list of all
the classes of a queueing discipline and invokes
the associated callback function to obtain
configuration/statistics data.
- tcf_chain() returns a pointer to the linked list
of filters bound to the class.
- bind_tcf() binds a filter to a class.
- dump_class() gives configuration and statistics
data of a class.
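The callback style of walk() and the reference check in delete() can be sketched as follows. The names `toy_class`, `toy_walker`, `toy_sum_refcnt`, `toy_walk`, and `toy_delete` are hypothetical, and an array stands in for the linked list of classes to keep the example short.

```c
#include <stddef.h>

struct toy_class {
    unsigned int classid;
    int refcnt;   /* usage counter */
    int in_use;   /* 0 once the class has been deleted */
};

/* Hypothetical walker: carries the callback plus bookkeeping. */
struct toy_walker {
    int stop;         /* set when the callback asks to end the walk */
    int count;        /* classes visited so far */
    long total_refs;  /* example of data gathered by a callback */
    int (*fn)(const struct toy_class *cl, struct toy_walker *w);
};

/* Example callback: accumulate the reference counts. */
static int toy_sum_refcnt(const struct toy_class *cl, struct toy_walker *w)
{
    w->total_refs += cl->refcnt;
    return 0;
}

/* walk(): visit every live class and invoke the callback on it.
 * A negative return value from the callback ends the walk. */
static void toy_walk(const struct toy_class *tbl, size_t n,
                     struct toy_walker *w)
{
    size_t i;

    for (i = 0; i < n && !w->stop; i++) {
        if (!tbl[i].in_use)
            continue;
        if (w->fn(&tbl[i], w) < 0)
            w->stop = 1;
        w->count++;
    }
}

/* delete(): refuse while the class is still referenced. */
static int toy_delete(struct toy_class *cl)
{
    if (cl->refcnt > 0)
        return -1;
    cl->in_use = 0;
    return 0;
}
```

Passing the callback in a walker structure lets the same traversal serve different purposes, such as dumping configuration or collecting statistics.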
23tcf_proto
- struct tcf_proto {
- /* Fast access part */
- struct tcf_proto *next;
- void *root;
- int (*classify)(struct sk_buff *, struct tcf_proto *, struct tcf_result *);
- u32 protocol;
- /* All the rest */
- u32 prio;
- u32 classid;
- struct Qdisc *q;
- void *data;
- struct tcf_proto_ops *ops;
- };
24tcf_proto_ops
- struct tcf_proto_ops {
- struct tcf_proto_ops *next;
- char kind[IFNAMSIZ];
- int (*classify)(struct sk_buff *, struct tcf_proto *, struct tcf_result *);
- int (*init)(struct tcf_proto *);
- void (*destroy)(struct tcf_proto *);
- unsigned long (*get)(struct tcf_proto *, u32 handle);
- void (*put)(struct tcf_proto *, unsigned long);
- int (*change)(struct tcf_proto *, unsigned long, u32 handle, struct rtattr **, unsigned long *);
- int (*delete)(struct tcf_proto *, unsigned long);
- void (*walk)(struct tcf_proto *, struct tcf_walker *arg);
- /* rtnetlink specific */
- int (*dump)(struct tcf_proto *, unsigned long, struct sk_buff *skb, struct tcmsg *);
- struct module *owner;
- };
25tcf_proto_ops
- classify() classifies a packet (checks if the
filtering rule applies to the packet)
- Possible return values are
- TC_POLICE_OK: the packet is accepted by the
filter.
- TC_POLICE_RECLASSIFY: the packet violates agreed
parameters and should be allocated to a different
class.
- TC_POLICE_SHOT: the packet was dropped because
of violation of agreed parameters
- TC_POLICE_UNSPEC: the rule does not match the
packet, and the packet should be passed to the
next filter.
- tcf_result contains the classid and the internal
identification of the class.
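How the return values steer a chain of filters can be sketched in user space. Everything here is hypothetical: the types `toy_pkt`, `toy_result`, `toy_filter`, the functions `toy_classify_one` and `toy_classify`, and the `TOY_*` verdicts, which only borrow their meaning from the TC_POLICE_* names. Matching on a protocol number and shooting over-long packets are assumed example rules.

```c
#include <stddef.h>

/* Verdicts local to this sketch. */
enum { TOY_UNSPEC = -1, TOY_OK = 0, TOY_RECLASSIFY = 1, TOY_SHOT = 2 };

struct toy_pkt {
    unsigned int proto;
    unsigned int len;
};

struct toy_result {
    unsigned int classid;
};

struct toy_filter {
    struct toy_filter *next;  /* next filter in the chain */
    unsigned int proto;       /* protocol this filter matches */
    unsigned int max_len;     /* longer packets violate the agreement */
    unsigned int classid;     /* class for conforming packets */
};

/* One filter: no match, violation, or accept with a class. */
static int toy_classify_one(const struct toy_filter *f,
                            const struct toy_pkt *p,
                            struct toy_result *res)
{
    if (p->proto != f->proto)
        return TOY_UNSPEC;
    if (p->len > f->max_len)
        return TOY_SHOT;
    res->classid = f->classid;
    return TOY_OK;
}

/* Walk the chain: TOY_UNSPEC hands the packet to the next filter,
 * any other verdict ends the search. */
static int toy_classify(const struct toy_filter *chain,
                        const struct toy_pkt *p,
                        struct toy_result *res)
{
    const struct toy_filter *f;

    for (f = chain; f; f = f->next) {
        int verdict = toy_classify_one(f, p, res);

        if (verdict != TOY_UNSPEC)
            return verdict;
    }
    return TOY_UNSPEC;
}
```

If every filter answers with the "unspecified" verdict, the caller has to fall back to some default treatment of the packet.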
26Queueing Discipline Example
27RED
28Dropping Probability pa
Linux implementation
pb
29RED implementation I
- struct red_sched_data {
- /* Parameters */
- u32 limit; /* HARD maximal queue length */
- u32 qth_min; /* Min average length threshold: A scaled */
- u32 qth_max; /* Max average length threshold: A scaled */
- char Wlog; /* log(W) */
- char Plog; /* random number bits */
- unsigned long qave; /* Average queue length: A scaled */
- int qcount; /* Packets since last random number generation */
- u32 qR; /* Cached random number */
- psched_time_t qidlestart; /* Start of idle period */
- struct tc_red_xstats st;
- };
30RED implementation II Compute average queue
length
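- We want
- $\text{avg} \leftarrow \text{avg} \cdot (1 - w) + w \cdot \text{backlog}$
- Code in Linux
- q->qave += sch->stats.backlog - (q->qave >> q->Wlog)
- Why
- $\text{avg} = \texttt{q->qave} \cdot w$
- $w = 2^{-\text{Wlog}}$
- Dividing the desired update by $w$ gives $\texttt{qave} \leftarrow \texttt{qave} + \text{backlog} - \texttt{qave} \cdot w$, and $\texttt{qave} \cdot w$ is the shift qave >> Wlog.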
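The fixed-point update can be tried out in user space. The sketch below assumes that the scaled average is held in a 64-bit integer and uses the hypothetical function names `red_update_qave` and `red_avg`; only the arithmetic is taken from the line of Linux code quoted on the slide.

```c
#include <stdint.h>

/* One update step. qave holds avg scaled by 2^wlog, so that
 * qave >> wlog equals avg (rounded down) and also equals qave * w. */
static uint64_t red_update_qave(uint64_t qave, uint32_t backlog,
                                unsigned wlog)
{
    return qave + backlog - (qave >> wlog);
}

/* Recover the unscaled average queue length. */
static uint64_t red_avg(uint64_t qave, unsigned wlog)
{
    return qave >> wlog;
}
```

For example, with wlog = 2 (w = 1/4), starting from qave = 0 and a constant backlog of 8, the first two updates give qave = 8 and qave = 14, that is avg = 2 and avg = 3.5 before rounding. The value settles once qave >> wlog equals the backlog.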
31RED implementation III
- Ideally, avg should be calculated at a constant
clock interval
- In Linux, it is updated whenever an outgoing packet is enqueued
- Care needs to be taken for idle periods
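One common way to account for an idle period is to pretend that m packets arrived at an empty queue while the link was idle, which multiplies avg by (1 - w) m times. The sketch below shows that idea with a plain loop. It is an assumption-laden illustration, not the kernel's implementation: `red_idle_decay` is a hypothetical name, and how m is derived from the idle time is left to the caller.

```c
#include <stdint.h>

/* Decay the scaled average as if m updates with an empty queue
 * (backlog = 0) had taken place during the idle period. */
static uint64_t red_idle_decay(uint64_t qave, unsigned wlog, unsigned m)
{
    while (m-- > 0 && qave != 0)
        qave -= qave >> wlog;
    return qave;
}
```

A caller would typically estimate m as the idle time divided by the transmission time of an average packet, and a production implementation would likely avoid the loop.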
32RED implementation IVDecide dropping probability
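- We want to enqueue if
- $\text{max\_P} \cdot \dfrac{\text{avg} - \text{qth\_min}}{\text{qth\_max} - \text{qth\_min}} < \dfrac{\text{rnd}}{\text{qcount}}$
- Linux code
- if (((q->qave - q->qth_min) >> q->Wlog) * q->qcount < q->qR) goto enqueue;
- $\text{max\_P} = (\text{qth\_max} - \text{qth\_min}) / 2^{\text{Plog}}$
- $\texttt{q->qR} = \text{rnd} \cdot 2^{\text{Plog}}$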
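The integer test from the Linux code can be placed in a small decision function to see how qcount and the cached random number interact. This is a user-space sketch under assumptions: `toy_red`, `toy_verdict`, and `toy_red_decide` are hypothetical names, the random number is passed in by the caller, and the handling of the two thresholds around the test follows the usual RED scheme rather than anything shown on the slide.

```c
#include <stdint.h>

struct toy_red {
    uint32_t qth_min;  /* min threshold, scaled by 2^Wlog */
    uint32_t qth_max;  /* max threshold, scaled by 2^Wlog */
    unsigned Wlog;     /* log2(1/w) */
    unsigned Plog;     /* random number bits, assumed < 32 */
    uint64_t qave;     /* scaled average queue length */
    int32_t qcount;    /* packets since last random number, -1 = none */
    uint32_t qR;       /* cached random number in [0, 2^Plog) */
};

enum toy_verdict { TOY_ENQUEUE, TOY_MARK };

/* Decide the fate of one arriving packet; rnd is a fresh random value
 * that is used only when a new qR has to be drawn. */
static enum toy_verdict toy_red_decide(struct toy_red *q, uint32_t rnd)
{
    uint32_t mask = (1u << q->Plog) - 1;

    if (q->qave < q->qth_min) {        /* below min: always enqueue */
        q->qcount = -1;
        return TOY_ENQUEUE;
    }
    if (q->qave >= q->qth_max) {       /* above max: always mark */
        q->qcount = -1;
        return TOY_MARK;
    }
    if (++q->qcount) {
        /* The test from the slide, with the product widened to 64 bits. */
        if (((q->qave - q->qth_min) >> q->Wlog) * (uint64_t)q->qcount
            < q->qR)
            return TOY_ENQUEUE;
        q->qcount = 0;
        q->qR = rnd & mask;
        return TOY_MARK;
    }
    /* First packet between the thresholds: only draw a number. */
    q->qR = rnd & mask;
    return TOY_ENQUEUE;
}
```

Because the left-hand side grows with qcount while qR stays fixed, a packet is eventually marked, and the gaps between marked packets are spread out rather than clustered.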