Linux Kernel Implementation of the INET Address Family

1
Linux Kernel Implementation of the INET Address Family
  • by Robert Ball

2
Explanation of organization of paper
  • Key operating systems components cited in the
    requirements were used in the paper. Large parts
    of my notes were left out, or not explained in
    detail, to shorten the length of the paper and to
    conform to this requirement.

3
Conclusion
  • This project has been something I have wanted to
    do for some time now. At first it was very
    difficult to figure out how things are structured
    in the source code. Although there are several
    good reference books on the Linux kernel, none of
    them went into the detail that I required for
    this paper. After going through the kernel,
    looking at large amounts of source code, and
    reading several references, I found that I had
    far too much data for this assignment.
  • I also had to limit the amount of detail that I
    explained so as to not turn this document into a
    book. It was very difficult to determine exactly
    what to put in and what to take out.

4
What is Linux
  • In short, Linux is an operating system based on
    Unix that was first developed by Linus Torvalds
    as a hobby.
  • To many people, Linux appears to be Microsoft's
    only competition in operating systems for
    personal computers.
  • Linux is open-source.

5
Linux kernel vs. Distribution
  • A distribution of Linux often comes with the
    Linux operating system and many other features
    such as editors, programming languages, games,
    etc.
  • The kernel is used to read from and write to
    files and networks, buffer text and graphics to
    one or more monitors, manage memory, etc. All
    Linux distributions have this core, the Linux
    kernel, and much more besides.

6
Linux Kernel
  • Highly modularized.
  • Written in either C or hardware-specific
    assembly.
  • The reason for going with C and not C++ or
    another language is three-fold:
  • Main kernel writers know C best.
  • They believe C++ is slower.
  • The kernel is already written in C, and it would
    take too long to rewrite it in C++ or another
    language.

7
Kernel Version Number
  • Most recent kernel when the paper was written:
    2.4.22.
  • Source code can be found at http://www.kernel.org.
  • A Linux kernel version is a three-part number
    such as 2.4.22:
  • First number is the version number.
  • Second number is the series:
  • Even number denotes stable kernel.
  • Odd number denotes development kernel.
  • Third number denotes release number.
  • Latest version as of today: 2.4.23.
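
The numbering rule above can be sketched in a few lines of C. This is only an illustration of the even/odd convention as the slide states it for 2.x kernels; the struct and function names (kver, kver_parse, kver_is_stable) are made up for this example and do not come from the kernel source.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper type for illustration; not a kernel structure. */
struct kver {
    int version;  /* first number */
    int series;   /* second number: even = stable, odd = development */
    int release;  /* third number */
};

/* Parse "a.b.c" into its three fields; returns true only if all three parse. */
bool kver_parse(const char *s, struct kver *out)
{
    return sscanf(s, "%d.%d.%d",
                  &out->version, &out->series, &out->release) == 3;
}

/* Apply the slide's rule: an even second number marks a stable series. */
bool kver_is_stable(const struct kver *v)
{
    return v->series % 2 == 0;
}
```

With this sketch, "2.4.22" parses to version 2, series 4, release 22 and is reported as stable, while a 2.5.x string would be reported as a development kernel.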

8
Kernel File Structure
  • The Linux kernel source is divided logically into
    13 folders, each with a specific purpose.
  • The networking files can be found in three
    places, with a few exceptions:
  • /net/ipv4/
  • /include/net/
  • /include/linux/
  • The include folder contains all the header files
    for the kernel.
  • The net folder has 26 subfolders. Each subfolder
    contains the code for a different networking
    protocol such as ATM, TCP/IP, Netware, AppleTalk,
    etc.

9
From Sockets to Device Drivers
  • A socket is nothing more than a specialized pipe
    between the network layer and the user's
    application. That is to say, if an application
    wishes to talk to another application on another
    computer, a socket is created to act as a
    transparent layer between the two computers.

11
BSD Socket
  • The BSD socket code is located at /net/socket.c.
  • It is the only interface that the user has
    between the network and his application.
  • By having only one generic, highly customizable
    socket (interface), all network applications can
    know what to expect regardless of which protocol
    they actually use.
  • The function interfaces between users and the
    kernel are called system calls.
  • The generic system call through which all the
    other networking system calls in the kernel can
    be reached, sys_socketcall, receives a call
    number and the arguments for that call. When a
    socket is created, the arguments select the
    protocol(s) the user application wishes to use,
    and the call returns a handle (a file descriptor)
    to the created socket.
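
From the application's side, the interface described above looks like the short sketch below. It uses only the standard socket(), getsockopt(), and close() calls; the two wrapper function names are made up for this example. Whether the C library reaches the kernel through sys_socketcall or through a dedicated system call depends on the architecture and library version, so that detail is an assumption here rather than something the code shows.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Ask the kernel for an INET stream (TCP) socket.
 * Returns the descriptor (the "handle"), or -1 on failure. */
int open_inet_stream_socket(void)
{
    return socket(AF_INET, SOCK_STREAM, 0);
}

/* Read the socket type back through the generic socket-option interface.
 * Returns the type (for example SOCK_STREAM), or -1 on failure. */
int socket_type_of(int fd)
{
    int type = -1;
    socklen_t len = sizeof(type);

    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &type, &len) < 0)
        return -1;
    return type;
}
```

The same two calls work unchanged for any address family the kernel supports, which is the point of the single generic socket interface.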

12
System Calls
  • 1 of 17
  • asmlinkage long sys_socketcall(int call, unsigned
    long *args)
  • The argument call is defined as a number from
    1-17 as follows. (All other numbers return an
    error. It should also be pointed out that each of
    these system calls can be called by the user's
    application independently as well.)
  • asmlinkage long sys_socket(int family, int type,
    int protocol)
  • asmlinkage long sys_bind(int fd, struct sockaddr
    *umyaddr, int addrlen)
  • asmlinkage long sys_connect(int fd, struct
    sockaddr *uservaddr, int addrlen)
  • asmlinkage long sys_listen(int fd, int backlog)
  • asmlinkage long sys_accept(int fd, struct
    sockaddr *upeer_sockaddr, int *upeer_addrlen)
  • asmlinkage long sys_getsockname(int fd, struct
    sockaddr *usockaddr, int *usockaddr_len)
  • asmlinkage long sys_getpeername(int fd, struct
    sockaddr *usockaddr, int *usockaddr_len)
  • asmlinkage long sys_socketpair(int family, int
    type, int protocol, int usockvec[2])
  • asmlinkage long sys_send(int fd, void *buff,
    size_t len, unsigned flags)
  • asmlinkage long sys_recv(int fd, void *ubuf,
    size_t size, unsigned flags)
  • asmlinkage long sys_sendto(int fd, void *buff,
    size_t len, unsigned flags, struct sockaddr
    *addr, int addr_len)
  • asmlinkage long sys_recvfrom(int fd, void *ubuf,
    size_t size, unsigned flags, struct sockaddr
    *addr, int *addr_len)
  • asmlinkage long sys_shutdown(int fd, int how)
  • asmlinkage long sys_setsockopt(int fd, int level,
    int optname, char *optval, int optlen)
  • asmlinkage long sys_getsockopt(int fd, int level,
    int optname, char *optval, int *optlen)
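
Because sys_socketcall receives only a call number and a pointer to an argument block, it has to know how many arguments each call takes before it can forward them. The following is a user-space sketch of that lookup, not the kernel's code. The call numbers mirror SYS_SOCKET (1) through SYS_RECVMSG (17) from /include/linux/net.h, and the argument counts match the signatures listed above (calls 16 and 17, sendmsg and recvmsg, take three arguments each). The SC_ prefix and the function name socketcall_nargs are made up for this example.

```c
#include <errno.h>

/* Call numbers mirroring SYS_SOCKET (1) .. SYS_RECVMSG (17); the SC_ prefix
 * is used only in this sketch to avoid clashing with real headers. */
enum {
    SC_SOCKET = 1, SC_BIND, SC_CONNECT, SC_LISTEN, SC_ACCEPT,
    SC_GETSOCKNAME, SC_GETPEERNAME, SC_SOCKETPAIR, SC_SEND, SC_RECV,
    SC_SENDTO, SC_RECVFROM, SC_SHUTDOWN, SC_SETSOCKOPT, SC_GETSOCKOPT,
    SC_SENDMSG, SC_RECVMSG
};

/* Number of arguments per call, indexed by call number (slot 0 is unused). */
static const unsigned char sc_nargs[18] = {
    0, 3, 3, 3, 2, 3, 3, 3, 4, 4, 4, 6, 6, 2, 5, 5, 3, 3
};

/* Return how many unsigned longs the argument block must hold for `call`,
 * or -EINVAL for any number outside 1..17. */
int socketcall_nargs(int call)
{
    if (call < SC_SOCKET || call > SC_RECVMSG)
        return -EINVAL;
    return sc_nargs[call];
}
```

A dispatcher built on this table would copy that many values from the user's argument block and then switch on the call number to reach sys_socket, sys_bind, and so on.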

13
sk_buff
14
INET Socket
  • The INET address family is a set of protocols
    that have been standardized by the IETF.
  • The INET socket is created by the BSD socket
    code (/net/socket.c) when the system call
    sys_socket is called with arguments that indicate
    an INET socket. Two specific functions are
    involved.
  • The first, inet_init, is called when the INET
    module is loaded into the kernel.
  • The second, inet_create, is called by the BSD
    socket code. The BSD socket layer calls into the
    INET socket layer through the registered INET
    proto_ops data structure to perform work for it.
  • inet_init is declared as follows:
  • static int __init inet_init(void)
  • It first creates a dummy sk_buff structure and
    then initializes all the necessary structures and
    information that the kernel needs to know about
    this module.
  • module_init(inet_init)
  • (Line 1205 in /net/ipv4/af_inet.c)
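
The hand-off from the generic BSD layer to an address family can be pictured as a table of per-family create functions: the family registers itself when its module is initialized, and socket creation looks the family up and calls its create function. The sketch below is a simplified user-space model of that idea. Every name prefixed with demo_ or DEMO_ is hypothetical, and the real kernel structures carry far more state than shown here.

```c
#include <errno.h>
#include <stddef.h>

#define DEMO_NPROTO  32   /* hypothetical size of the family table */
#define DEMO_AF_INET 2    /* AF_INET has the value 2 on Linux */

/* Minimal stand-in for a generic (BSD-level) socket. */
struct demo_socket {
    int family;
    int type;
    int protocol;
};

/* What an address family hands to the generic layer when it registers. */
struct demo_family {
    int family;
    int (*create)(struct demo_socket *sock, int protocol);
};

static const struct demo_family *demo_families[DEMO_NPROTO];

/* Called once by a family's init function (the role inet_init plays). */
int demo_sock_register(const struct demo_family *f)
{
    if (f->family < 0 || f->family >= DEMO_NPROTO)
        return -EINVAL;
    if (demo_families[f->family] != NULL)
        return -EEXIST;
    demo_families[f->family] = f;
    return 0;
}

/* Generic layer: find the family, then let it finish the socket. */
int demo_sock_create(int family, int type, int protocol,
                     struct demo_socket *sock)
{
    if (family < 0 || family >= DEMO_NPROTO || demo_families[family] == NULL)
        return -EAFNOSUPPORT;
    sock->family = family;
    sock->type = type;
    return demo_families[family]->create(sock, protocol);
}

/* Family-specific part (the role inet_create plays). */
int demo_inet_create(struct demo_socket *sock, int protocol)
{
    sock->protocol = protocol;
    return 0;
}

const struct demo_family demo_inet_family = { DEMO_AF_INET, demo_inet_create };
```

Until demo_sock_register has been called for a family, demo_sock_create rejects it, which mirrors why the INET module must be initialized before any INET socket can be created.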

15
INET Address Family Protocols
  • The paper discusses the following protocols from
    the INET address family
  • TCP
  • IP
  • ARP
  • UDP
  • ICMP

16
I did not put this figure in the paper as I ran out
of space, but here is how it all fits together.

17
TCP (Transmission Control Protocol)
  • TCP is a connection-based protocol that
    guarantees delivery of packets in the correct
    order.

18
TCP/IP Stack
19
TCP, cont.
  • When the INET module is being loaded into memory,
    the INET code calls the TCP module. The function
    that initializes the TCP module is
  • void __init tcp_init(void)
  • (Line 2535 in /net/ipv4/tcp.c)
  • The TCP layer is a large, robust layer that
    guarantees that packets will be delivered and in
    the correct order. To make such promises the TCP
    layer needs several files to complete its rather
    complicated task. This paper cannot go into the
    details of TCP, but the following list of files
    makes up the TCP layer:
  • /include/net/tcp.h
  • Header file that defines the structures that the
    TCP layer uses.
  • /net/ipv4/tcp.c
  • Driver for all the TCP functions.
  • /net/ipv4/tcp_diag.c
  • Module for monitoring TCP sockets.
  • /net/ipv4/tcp_input.c
  • Functions specific for TCP input.
  • /net/ipv4/tcp_output.c
  • Functions specific for TCP output.
  • /net/ipv4/tcp_ipv4.c
  • Code for IPv4 specific functions.
  • /net/ipv4/tcp_minisocks.c
  • Implements the time-wait state to close
    connections gracefully.
  • /net/ipv4/tcp_timer.c

20
IP (Internet Protocol)
  • Once the data has been formed in the TCP layer,
    it is passed to the IP layer. The IP layer is
    simply a layer that is used for delivery of
    packets across the Internet, like a postal
    system. It allows the delivery of packets from
    one computer to another without a direct link
    between the computers. When TCP is used, TCP
    establishes a logical connection between the two
    computers so that they can exchange messages as
    if they were directly linked. IP is usually
    accompanied by either a TCP or UDP layer.
  • There are over 50 files used in the IP layer.
    These files include functions used exclusively
    for routers, for servers, and for clients. They
    also include generic files for fragmenting,
    defragmenting, and forwarding packets, as well as
    additional support for the TCP and UDP layers.
    Even a simple description of every file would be
    too long. Instead, an overview of the layer is
    given below.
  • /include/net/ip.h
  • Header file that defines the main structures that
    the IP layer uses. See also /include/net/ip_fib.h,
    /include/net/ipconfig.h, /include/net/ipip.h,
    and /include/linux/ip.h.
  • /net/ipv4/ipip.c
  • IP-in-IP tunneling driver (IP packets carried
    inside other IP packets).
  • /net/ipv4/ip_forward.c
  • Main file used for forwarding packets.
  • /net/ipv4/ip_fragment.c
  • Handles fragmented packets: collects incoming
    fragments and reassembles them.
  • /net/ipv4/ip_input.c
  • Main file used for receiving incoming packets.
  • /net/ipv4/ip_sockglue.c
  • Glue between the IP layer and the socket
    interface, such as IP-level socket options.
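
To make the idea of fragmenting concrete, here is a small sketch of how a payload could be split for a given MTU. It relies on two facts from the IPv4 specification: the fragment offset field counts in 8-byte units, so every fragment except the last must carry a multiple of 8 payload bytes, and a header without options is 20 bytes. The struct and function names are made up for this example, the payload is assumed to fit in one IPv4 datagram, and the code only plans the split; it is not how the kernel builds fragments.

```c
#include <stddef.h>

#define IPV4_HDR_LEN 20  /* IPv4 header without options */

struct frag_desc {
    unsigned offset_units; /* fragment offset field, in 8-byte units */
    size_t   len;          /* payload bytes carried by this fragment */
    int      more;         /* 1 if the "more fragments" flag would be set */
};

/* Plan the fragments for `payload_len` bytes sent over a link with the given
 * MTU. Returns the number of fragments written to `out`, or 0 if the MTU is
 * too small, `out` is too small, or there is nothing to send. */
size_t plan_fragments(size_t payload_len, size_t mtu,
                      struct frag_desc *out, size_t max_out)
{
    if (mtu < IPV4_HDR_LEN + 8)
        return 0;

    /* Largest payload per fragment, rounded down to a multiple of 8. */
    size_t per_frag = (mtu - IPV4_HDR_LEN) & ~(size_t)7;
    size_t n = 0;
    size_t off = 0;

    while (off < payload_len) {
        if (n == max_out)
            return 0;
        size_t left = payload_len - off;
        size_t len = left > per_frag ? per_frag : left;

        out[n].offset_units = (unsigned)(off / 8);
        out[n].len = len;
        out[n].more = (off + len < payload_len);
        off += len;
        n++;
    }
    return n;
}
```

For a 4000-byte payload and an MTU of 1500, this plans three fragments of 1480, 1480, and 1040 bytes at offsets 0, 185, and 370 (in 8-byte units). Reassembly on the receiving side uses the same offsets to put the pieces back in order.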

21
ARP, UDP, and ICMP
  • ARP (Address Resolution Protocol), defined in RFC
    826, is used to determine the Ethernet address of
    a host given its IP address. The files that it
    uses are the following:
  • /include/net/arp.h
  • Header file that defines the structures that the
    ARP layer uses.
  • /net/ipv4/arp.c
  • Main driver for the implementation of ARP.
  • /net/ipv4/netfilter/arp_tables.c
  • Packet matching code for ARP packets.
  • UDP (User Datagram Protocol) is a connectionless
    protocol that does not guarantee delivery or a
    particular order of its packets. This protocol is
    often used for things like video/audio streaming
    and other such applications that rely on speed
    more than accuracy. The UDP layer uses the
    following files
  • /include/net/udp.h
  • Header file that defines the structures that the
    UDP layer uses.
  • /net/ipv4/udp.c
  • Implementation of UDP.
  • ICMP (Internet Control Message Protocol) is a
    protocol that supports packets containing error,
    control, and informational messages. The ping
    command, for example, uses ICMP to test an
    Internet connection. ICMP is defined by RFC 792
    and uses the following files:
  • /include/net/icmp.h
  • Header file that defines the structures that the
    ICMP layer uses.
  • /net/ipv4/icmp.c
  • Implementation of ICMP.
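
Every ICMP message, including the echo request sent by ping, carries a 16-bit Internet checksum over the whole message. The sketch below computes that checksum in the style of RFC 1071 over a plain byte buffer. The function name inet_checksum is made up for this example, and the kernel uses its own optimized routines rather than code like this.

```c
#include <stddef.h>
#include <stdint.h>

/* Internet checksum (RFC 1071 style): one's-complement sum of the data taken
 * as big-endian 16-bit words, then complemented. An odd trailing byte is
 * padded with a zero byte. */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (len & 1)
        sum += (uint32_t)data[len - 1] << 8;

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);

    return (uint16_t)~sum;
}
```

The sender computes the checksum with the checksum field set to zero and stores the result in that field. The receiver runs the same function over the message as received and expects a result of zero.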

22
Interesting Things Found
  • It should be pointed out to the reader at this
    time that the Linux kernel writers try to follow
    the published standards so as to be able to
    communicate with non-Linux systems. Although this
    goal is often achieved, there are sometimes other
    considerations that are taken into account, such
    as security. To illustrate one specific example
    of where the Linux kernel writers did not follow
    the published standards, consider the following
    excerpt from /net/ipv4/tcp_timer.c (lines
    93-104):

      /* Do not allow orphaned sockets to eat all our resources.
       * This is direct violation of TCP specs, but it is required
       * to prevent DoS attacks. It is called when a retransmission timeout
       * or zero probe timeout occurs on orphaned socket.
       *
       * Criterium is still not confirmed experimentally and may change.
       * We kill the socket, if:
       * 1. If number of orphaned sockets exceeds an administratively configured
       *    limit.
       * 2. If we have strong memory pressure.
       */
      static int tcp_out_of_resources(struct sock *sk, int do_reset)
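
The two conditions named in that comment can be written as a very small decision function. This is a sketch of the rule as the comment states it, not the body of tcp_out_of_resources; the struct, its fields, and the function name are hypothetical, and the real function takes further factors into account.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the two conditions named in the kernel comment. */
struct orphan_state {
    int  orphan_count;     /* sockets no longer attached to any process */
    int  max_orphans;      /* administratively configured limit */
    bool memory_pressure;  /* true under strong memory pressure */
};

/* Kill the orphaned socket if either condition from the comment holds. */
bool should_kill_orphan(const struct orphan_state *s)
{
    return s->orphan_count > s->max_orphans || s->memory_pressure;
}
```

The design choice is to give up on a connection the TCP specification would keep retrying, in exchange for a bound on the resources that abandoned connections can hold.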

23
Interesting, cont.
  • The Linux kernel writers try to separate modules
    as best they can, but there are occasions when
    some modules, such as the IP module and the TCP
    module, are so closely related that the writers
    cannot (or are not willing to) completely
    separate them. Please consider the following
    snippet as an example:
  • static int tcp_invert_tuple(struct
    ip_conntrack_tuple *tuple, const struct
    ip_conntrack_tuple *orig)
  • (Lines 110-111 in
    /net/ipv4/netfilter/ip_conntrack_proto_tcp.c)
  • This snippet of code illustrates the fact that
    the IP module cannot be cleanly separated from
    the TCP module, as they are interrelated. In fact
    the entire file
    /net/ipv4/netfilter/ip_conntrack_proto_tcp.c is
    used only for TCP packets that are currently in
    the IP layer. There are several other such files
    used exclusively for other layers such as UDP and
    ICMP. Although this breaks the modularity
    paradigm that the kernel writers try to adhere
    to, there are few of these cross-module
    implementations in the kernel as a whole.
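
The general idea behind inverting a connection tuple is that the reply direction of a connection has source and destination swapped. The sketch below shows that idea on a flattened, made-up tuple type; demo_tuple and demo_invert_tuple are hypothetical names, and the kernel's ip_conntrack_tuple is laid out differently and splits the work between generic IP code and per-protocol helpers such as tcp_invert_tuple.

```c
#include <stdint.h>

/* Hypothetical flattened tuple: IP-level addresses plus TCP-level ports. */
struct demo_tuple {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
};

/* Build the tuple that a reply to `orig` would carry. */
void demo_invert_tuple(struct demo_tuple *inverse,
                       const struct demo_tuple *orig)
{
    inverse->src_ip   = orig->dst_ip;
    inverse->dst_ip   = orig->src_ip;
    inverse->src_port = orig->dst_port;
    inverse->dst_port = orig->src_port;
}
```

Note that even this toy version needs both IP-layer fields (addresses) and TCP-layer fields (ports) in one structure, which is the kind of cross-layer coupling the slide describes.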

24
Interesting, cont.
  • To create a robust kernel that deals with
    networking, as with all commercial operating
    systems, the kernel writers must suppose that the
    world is not a perfect place and must deal with
    differences of opinion and implementation, as is
    evidenced in the following snippet of code:

      #ifndef I_WISH_WORLD_WERE_PERFECT
      /* It is not :-( All the routers (except for Linux) return only
       * 8 bytes of packet payload. It means, that precise relaying of
       * ICMP in the real Internet is absolutely infeasible.
       */

  • (Lines 294-299 in /net/ipv4/ipip.c)
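
The arithmetic behind that comment can be sketched as follows. RFC 792 asks an ICMP error message to quote the offending datagram's IP header plus the first 8 bytes of its payload. For an IP-in-IP tunnel, the payload of the outer datagram begins with the inner IPv4 header, which is at least 20 bytes, so 8 quoted bytes cannot contain it. The function names and constants below are made up for this illustration and are not taken from ipip.c.

```c
#include <stdbool.h>
#include <stddef.h>

#define RFC792_QUOTED_PAYLOAD 8   /* payload bytes RFC 792 asks routers to quote */
#define IPV4_MIN_HDR_LEN     20   /* IPv4 header without options */

/* Bytes of the offending datagram quoted in an ICMP error: its IP header plus
 * `quoted_payload` bytes, capped at the datagram's own length. */
size_t icmp_quoted_len(size_t hdr_len, size_t datagram_len,
                       size_t quoted_payload)
{
    size_t want = hdr_len + quoted_payload;
    return datagram_len < want ? datagram_len : want;
}

/* For a tunneled packet, can the complete inner IPv4 header be read from the
 * quoted bytes? It starts right after the outer header. */
bool inner_header_recoverable(size_t outer_hdr_len, size_t quoted_len)
{
    return quoted_len >= outer_hdr_len + IPV4_MIN_HDR_LEN;
}
```

With a 20-byte outer header and the standard 8 quoted bytes, only 28 bytes come back, fewer than the 40 needed to see the whole inner header. That is why the tunnel code cannot always relay the error precisely to the original sender.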