Title: Internetworking Protocols and Programming
1 Internetworking Protocols and Programming
CSE 5348 / 7348
Instructor: Krish Pillai
Session 9
2 Elementary Socket System Calls
- bind System call: assigns a name to an unnamed socket
    #include <sys/types.h>
    #include <sys/socket.h>
    int bind(int sockfd, struct sockaddr *myaddr, int addrlen);
- Servers register their well-known addresses with the system so that the system can forward packets bound for this IP address and port number to the bound process
- A client can register a specific address for itself
- A connectionless client needs to ensure that the system assigns it some unique address, so that the other end has a valid return address to send its responses to
3 Elementary Socket System Calls
- connect System call: a client establishes a connection with the server
    #include <sys/types.h>
    #include <sys/socket.h>
    int connect(int sockfd, struct sockaddr *servaddr, int addrlen);
- sockfd is a descriptor returned from the socket call
- The second argument points to a sockaddr filled in with the server's address
- On a blocking socket, the connect call does not return until a connection is negotiated and established, or an error occurs
- A connection-oriented client does not have to bind to a local address before calling connect; the local address is assigned automatically
4 Elementary Socket System Calls
- listen System call: a connection-oriented server indicates to the system its willingness to receive connections
    #include <sys/types.h>
    #include <sys/socket.h>
    int listen(int sockfd, int backlog);
- The call is executed after both the socket and bind calls and before the accept system call
- backlog sets how many connection requests the system may queue while the server is busy handling an earlier accept (usually set to five)
- In concurrent connection-oriented servers, the server needs to accept a request and fork a child process before it can do another accept. This takes time, so requests may build up in the queue
5 Elementary Socket System Calls
- accept System call: after a connection-oriented server calls listen, it executes the accept system call
    #include <sys/types.h>
    #include <sys/socket.h>
    int accept(int sockfd, struct sockaddr *peer, int *addrlen);
- accept takes the first request in the queue and creates another socket with the same properties as sockfd, assigns a new descriptor and returns this value
- The sockaddr pointed to by peer is filled with the address of the client requesting service
- addrlen is a value-result parameter. Before the call it contains the size of the struct sockaddr; on return it is filled in with the actual size of the client address stored in peer
6 Elementary Socket System Calls
- send/sendto System calls: similar to write, but they require additional arguments
    #include <sys/types.h>
    #include <sys/socket.h>
    int send(int sockfd, char *buff, int nbytes, int flags);
    int sendto(int sockfd, char *buff, int nbytes, int flags, struct sockaddr *to, int addrlen);
- send transmits data on the socket identified by sockfd. The contents of the buffer pointed to by buff, up to nbytes in length, are transmitted. For sendto, the sockaddr pointed to by to holds the destination address
- The flags field is either zero or is formed by ORing the following
- MSG_OOB: send out-of-band data
- MSG_DONTROUTE: bypass routing (send or sendto)
7 Elementary Socket System Calls
- recv/recvfrom System calls: similar to read, but they require additional arguments
    #include <sys/types.h>
    #include <sys/socket.h>
    int recv(int sockfd, char *buff, int nbytes, int flags);
    int recvfrom(int sockfd, char *buff, int nbytes, int flags, struct sockaddr *from, int *addrlen);
- Both calls receive data from the peer. The recv system call is used with connection-oriented clients and servers
- recvfrom is used for UDP. It fills in from and addrlen with the sender's address and its length
- The flags field is either zero or is formed by ORing the following
- MSG_OOB: receive out-of-band data
- MSG_PEEK: peek at the incoming message without removing it from the queue (recv or recvfrom)
8 Elementary Socket System Calls
- Connectionless clients can also call the connect system call
- For UDP, connect is a purely local operation that does not send any packets out through the socket
- The call only sets up local data structures that record the destination address
- Once connected, the client can use send and recv to exchange data with the server
- The server address does not have to be supplied each time data is transmitted, as it does with sendto and recvfrom
- The term connect is a misnomer for a UDP client, but the call helps code efficiency
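A connected UDP client could look roughly like the sketch below. The name udp_connected_socket is hypothetical and the code assumes an IPv4 server address supplied by the caller.

```c
#include <assert.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: create a UDP socket and record server as its
 * default destination. connect() on a datagram socket sends nothing;
 * it only stores the address locally. Returns the descriptor or -1. */
int udp_connected_socket(const struct sockaddr_in *server)
{
    int fd;

    assert(server != NULL);
    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;
    if (connect(fd, (const struct sockaddr *)server, sizeof(*server)) < 0) {
        close(fd);
        return -1;
    }
    return fd;    /* send() and recv() now work without an address argument */
}
```

With the returned descriptor, a request becomes send(fd, buf, n, 0) and the reply is read with recv(fd, buf, cap, 0).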
9 Elementary Socket System Calls
- close System call: closes the socket
    #include <unistd.h>
    int close(int sockfd);
- If the protocol used by the socket is reliable, any queued data is still sent
- Normally the system returns from close immediately, while the kernel continues trying to send any queued data
10 Elementary Socket System Calls
- getsockname System call: returns the local protocol address associated with a socket
    #include <sys/types.h>
    #include <sys/socket.h>
    int getsockname(int sockfd, struct sockaddr *localaddr, int *addrlen);
- If a connection-oriented client does not call bind, getsockname can be used to return the local IP address and local port number assigned to the connection by the kernel
- After calling bind with a port number of zero, getsockname can be used to get the port allocated by the system for the process
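The port-zero case can be sketched as below. The helper name bind_kernel_chosen_port is hypothetical, and binding to the loopback address is an assumption made only to keep the example local to one machine.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: bind a new socket of the given type (SOCK_STREAM
 * or SOCK_DGRAM) to 127.0.0.1 with port 0, then ask which port the
 * kernel allocated. Returns the descriptor, or -1 on error.
 * *port is returned in host byte order. */
int bind_kernel_chosen_port(int type, unsigned short *port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int fd;

    assert(port != NULL);
    fd = socket(AF_INET, type, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);               /* 0 means "any free port" */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *port = ntohs(addr.sin_port);
    return fd;
}
```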
11 Elementary Socket System Calls
- getpeername System call: returns the foreign protocol address associated with a socket
    #include <sys/types.h>
    #include <sys/socket.h>
    int getpeername(int sockfd, struct sockaddr *peeraddr, int *addrlen);
- If a connection-oriented server calls accept and then execs a child process, getpeername is the only way the child process can obtain the client's identity
12 Client-Server Model
- A Server takes a request at a well-known port (WKP), performs its service and provides the result to the requester
- The Client sends a request to the Server and waits for a response
- Servers generally live longer than Clients and are harder to implement
- Servers can be Iterative or Concurrent
- Most servers allow multiple Clients to connect to the same port
- Concurrent servers are multi-tasking (fork) or multi-threaded (pthread_create) to achieve parallelism
13 The Iterative Connection Oriented Server
    int sockfd, newsockfd;

    if ((sockfd = socket(...)) < 0)
        bail_out("socket error");
    if (bind(sockfd, ...) < 0)
        bail_out("bind error");
    if (listen(sockfd, 5) < 0)
        bail_out("listen error");
    for ( ; ; ) {
        newsockfd = accept(sockfd, ...);    /* blocks */
        if (newsockfd < 0)
            bail_out("accept error");
        proc_request(newsockfd);            /* serve this client to completion */
        close(newsockfd);                   /* done with this client */
    }
- accept returns a new socket descriptor for each connection
- The process holds two descriptors
- The original, sockfd, and
- The new one, newsockfd
- Use each newsockfd to service a request, and sockfd to call the accept function
14 The Concurrent Connection Oriented Server
- accept returns a new socket descriptor for each connection
- Forking a child process leaves four descriptors referring to the same two sockets
- Both processes hold two descriptors
- The original, sockfd, and
- The new one, newsockfd
- The child uses newsockfd, and the parent uses sockfd
    int sockfd, newsockfd;

    if ((sockfd = socket(...)) < 0)
        bail_out("socket error");
    if (bind(sockfd, ...) < 0)
        bail_out("bind error");
    if (listen(sockfd, 5) < 0)
        bail_out("listen error");
    for ( ; ; ) {
        newsockfd = accept(sockfd, ...);    /* blocks */
        if (newsockfd < 0)
            bail_out("accept error");
        if (fork() == 0) {                  /* child: fork returns 0 */
            close(sockfd);
            proc_request(newsockfd);
            exit(0);
        }
        /* parent control flow: fork returned the child's PID */
        close(newsockfd);
    }
15 The Endian Problem
- Computer main memory is organized as aggregates of bytes
- Computer registers, on the other hand, operate on multibyte values
- Multibyte representation is not standardized across CPU architectures
- Different processors store and interpret multibyte values in different ways
- Consider a 16-bit integer composed of two bytes stored at main memory locations A and A+1
- There are two ways to store the 16-bit value in memory
16 The Endian Problem
- Little Endian: the low order byte is stored at location A, and the high order byte at location A+1
- Big Endian: the high order byte is stored at location A, and the low order byte at location A+1
[Figure: byte layout of a 16-bit value]
Little Endian (ends with the lower memory location): A+1 holds the high order byte, A holds the low order byte
Big Endian (ends with the higher memory location): A holds the high order byte, A+1 holds the low order byte
17 The Endian Problem
- SPARC, PowerPC, Intel i960 etc. follow the Big Endian approach
- The x86 family of processors, DEC PDP-11, DEC VAX etc. follow the Little Endian format
- The Network is standardized as Big Endian
- Problems arise when data is marshalled and sent as an octet stream to a remote machine with a different Endianness
- The data may end up getting unmarshalled and packed in reverse byte order
- An integer 45AB hex, when transmitted to a remote machine, may end up as AB45 hex due to Endian incompatibility
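The 45AB / AB45 mix-up can be reproduced with a few lines of C. This is a minimal sketch; the function names are hypothetical and the fixed-width type uint16_t from <stdint.h> stands in for the 16-bit integer in the slides.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy the two bytes of v exactly as this host stores them:
 * out[0] is the byte at address A, out[1] the byte at address A+1. */
void bytes_in_memory16(uint16_t v, unsigned char *out)
{
    assert(out != NULL);
    memcpy(out, &v, sizeof(v));
}

/* Returns 1 on a little-endian host, 0 on a big-endian host. */
int host_is_little_endian(void)
{
    unsigned char b[2];

    bytes_in_memory16(0x0001, b);
    return b[0] == 0x01;    /* low order byte stored at the lower address */
}

/* The value a receiver reconstructs if it reads the two bytes in the
 * opposite order from the sender. */
uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}
```

Here swap16(0x45AB) evaluates to 0xAB45, which is the corrupted value described above when sender and receiver disagree on byte order.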
18 Byte Ordering routines
- Network/Host ordering: to correct Endian inconsistencies, the API offers the htonl, htons, ntohl, and ntohs function calls
    #include <sys/types.h>
    #include <netinet/in.h>
    u_long htonl(u_long hostlong);
    u_short htons(u_short hostshort);
    u_long ntohl(u_long netlong);
    u_short ntohs(u_short netshort);
- The assumption is that a short integer occupies 16 bits and a long integer 32 bits
- On Big-Endian machines these functions are dummied out since there is nothing to be done
- On Little-Endian machines these functions swap the bytes of short and long integers to convert between host order and Network order (Big-Endian)
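As an illustration of these routines, the sketch below marshals a port and an address into a byte buffer and back. The function names and the 6-byte layout are assumptions made for this example only, and uint16_t / uint32_t are used instead of u_short / u_long because long is 64 bits on many current platforms.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/* Store a 16-bit port and a 32-bit address in network byte order.
 * out must have room for 6 bytes: 2 for the port, then 4 for the address. */
void marshal_endpoint(uint16_t port, uint32_t addr, unsigned char *out)
{
    uint16_t nport = htons(port);
    uint32_t naddr = htonl(addr);

    assert(out != NULL);
    memcpy(out, &nport, sizeof(nport));
    memcpy(out + 2, &naddr, sizeof(naddr));
}

/* Reverse of marshal_endpoint: read 6 bytes and convert to host order. */
void unmarshal_endpoint(const unsigned char *in, uint16_t *port, uint32_t *addr)
{
    uint16_t nport;
    uint32_t naddr;

    assert(in != NULL && port != NULL && addr != NULL);
    memcpy(&nport, in, sizeof(nport));
    memcpy(&naddr, in + 2, sizeof(naddr));
    *port = ntohs(nport);
    *addr = ntohl(naddr);
}
```

Whatever the host's byte order, the buffer always carries the high order byte first, so both ends agree on the value.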
19 Address Conversion
- An Internet address is usually written in dotted-decimal format, e.g. 192.168.0.1
- The Socket API handles Internet addresses as a struct in_addr containing a 32-bit Internet address
    struct in_addr {
        u_long s_addr;    /* 32-bit network ID + host ID */
    };
- The inet_addr and inet_ntoa functions convert between dotted-decimal notation and a 32-bit Internet address
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>
    unsigned long inet_addr(char *ptr);
    char *inet_ntoa(struct in_addr inaddr);
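A round trip through both conversion functions might look like the following sketch. The helper name dotted_round_trip is hypothetical. Newer code often prefers inet_pton and inet_ntop, but the example stays with the two functions named on the slide.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: parse dotted-decimal text into an in_addr and
 * format it back into out (capacity cap). Returns 0 on success, -1 if
 * the text is malformed or out is too small. */
int dotted_round_trip(const char *text, char *out, size_t cap)
{
    struct in_addr a;
    const char *dotted;

    assert(text != NULL && out != NULL);

    a.s_addr = inet_addr(text);     /* result is in network byte order */
    if (a.s_addr == INADDR_NONE)    /* malformed input (or 255.255.255.255) */
        return -1;

    dotted = inet_ntoa(a);          /* points at a static buffer */
    if (strlen(dotted) >= cap)
        return -1;
    strcpy(out, dotted);            /* copy before the next inet_ntoa call */
    return 0;
}
```

Because inet_ntoa returns a pointer to storage that later calls overwrite, the result is copied out before returning.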
20 Socket Options
- The getsockopt and setsockopt system calls allow the application to get socket attributes or to set them
- The two functions are defined as
    #include <sys/socket.h>
    int getsockopt(int sockfd, int level, int optname, void *optval, socklen_t *optlen);
    int setsockopt(int sockfd, int level, int optname, const void *optval, socklen_t optlen);
- Parameter sockfd refers to an open socket descriptor
- optval is a type-independent pointer to a variable from which setsockopt reads the new value of the option, or into which getsockopt stores the current value of the option upon return
21 Socket Options
- The type of the variable that optval points to depends on optname
- getsockopt or setsockopt will inspect parameters level and optname, and then decide how to interpret the contents of the void pointer optval
- Refer to Page 569 in text book (Vol. 3) for a listing of supported options. The example below shows how to read the maximum segment size for a TCP socket
    /* maxseg is of type int, optlen is of type socklen_t */
    /* sockfd is an open and valid socket descriptor */
    /* TCP_MAXSEG is declared in <netinet/tcp.h> */
    optlen = sizeof(maxseg);
    if (getsockopt(sockfd, IPPROTO_TCP, TCP_MAXSEG, (void *) &maxseg, &optlen) < 0)
        bail_out("TCP_MAXSEG getsockopt error");
    printf("TCP maxseg = %d\n", maxseg);
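The setting direction works the same way. The sketch below uses SO_REUSEADDR purely as an example of an integer-valued option at the SOL_SOCKET level; the choice of option and the helper name enable_reuseaddr are assumptions for illustration, not something taken from the slides.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: turn on SO_REUSEADDR and read the option back.
 * Returns 1 if the option is now enabled, 0 if not, -1 on error. */
int enable_reuseaddr(int sockfd)
{
    int on = 1;
    int current = 0;
    socklen_t len = sizeof(current);

    assert(sockfd >= 0);

    /* setsockopt reads the new value through optval. */
    if (setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) < 0)
        return -1;

    /* getsockopt stores the current value through optval. */
    if (getsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR, &current, &len) < 0)
        return -1;

    return current != 0;
}
```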
22 Socket Options
- Socket options can also be modified by the fcntl system call
    #include <fcntl.h>
    int fcntl(int fd, int cmd, int arg);
- The first parameter fd refers to an open socket descriptor
- Two commands of interest are F_SETFL and F_GETFL
- The FNDELAY argument used with the F_SETFL command allows the user to set a socket in non-blocking mode
- On a non-blocking socket, an I/O request that cannot complete immediately is not performed
- Instead the call returns an error with errno set to EWOULDBLOCK
- A connect system call that cannot complete immediately returns with errno set to EINPROGRESS
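A non-blocking socket can be set up and probed roughly as follows. Note the substitution: the sketch uses O_NONBLOCK, the POSIX name for the flag, in place of the older BSD name FNDELAY shown on the slide. The helper names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Add O_NONBLOCK to the descriptor's file status flags.
 * Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags;

    assert(fd >= 0);
    flags = fcntl(fd, F_GETFL, 0);          /* read the current flags */
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* On a non-blocking socket: returns 1 if a read would block right now,
 * 0 if data or end-of-file is available, -1 on any other error. */
int read_would_block(int fd)
{
    char c;
    ssize_t n = recv(fd, &c, 1, MSG_PEEK);  /* peek so nothing is consumed */

    if (n >= 0)
        return 0;
    return (errno == EWOULDBLOCK || errno == EAGAIN) ? 1 : -1;
}
```

Checking both EWOULDBLOCK and EAGAIN is a portability habit; on many systems the two names have the same value.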
23 Single Threaded Server
- Processes use up Virtual Memory space and put load on the Scheduler due to increased context switches
- If the arrival rate of transactions is high but the average session hold time is small, a single threaded architecture is adopted
- A TCP Client connecting to a well known port is assigned a new socket on the server as a result of the accept call
- Each new slave socket created by the accept call can be monitored by the server
- The master socket is continuously monitored for new requests
- The slave sockets are monitored for data availability
- The select function detects readable data on all of these sockets
24 Single Threaded Server
- The select() function can be used to monitor all socket descriptors
- The accept function call returns a new socket descriptor each time a client connects
[Figure: server at the WKP with descriptors fd1, fd2 and fd3]
25 Select system call
- The select() system call allows a user process to block until one or more events occur to wake up the process
    #include <sys/types.h>
    #include <sys/time.h>
    int select(int maxfdp1, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);
    FD_ZERO(fd_set *fdset);
    FD_SET(int fd, fd_set *fdset);
    FD_CLR(int fd, fd_set *fdset);
    FD_ISSET(int fd, fd_set *fdset);
- The timeout argument
    struct timeval {
        long tv_sec;     /* seconds */
        long tv_usec;    /* microseconds */
    };
26 Select system call
- Arguments readfds, writefds, and exceptfds are value-result parameters
- The maxfdp1 parameter is the highest descriptor number to be checked plus one; it tells select how many bits of each descriptor set to examine
- Descriptors are indicated by bit positions in a large bit array of type fd_set
- For example, to check if any descriptor in the set 1, 2, 3 is ready for reading
    fd_set fdvar;
    FD_ZERO(&fdvar);
    FD_SET(1, &fdvar); FD_SET(2, &fdvar); FD_SET(3, &fdvar);
- Other macros are
    FD_CLR(int fd, fd_set *fdset);      /* clears bit number fd */
    FD_ISSET(int fd, fd_set *fdset);    /* checks if bit fd is set */
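Put together, the macros and the value-result sets can be used as in the sketch below. The helper name count_readable is hypothetical; it polls with a zero timeout so that it never blocks, and it assumes every descriptor is smaller than FD_SETSIZE.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: count how many of fds[0..n-1] are readable
 * right now. Returns the count, or -1 if select fails. */
int count_readable(const int *fds, int n)
{
    fd_set readfds;
    struct timeval zero = {0, 0};   /* zero timeout: check and return */
    int i, maxfd = -1, ready = 0;

    assert(fds != NULL);

    FD_ZERO(&readfds);
    for (i = 0; i < n; i++) {
        FD_SET(fds[i], &readfds);   /* value: descriptors we care about */
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }

    if (select(maxfd + 1, &readfds, NULL, NULL, &zero) < 0)
        return -1;

    for (i = 0; i < n; i++)
        if (FD_ISSET(fds[i], &readfds))   /* result: only ready bits remain */
            ready++;
    return ready;
}
```

Because select overwrites the sets, a server loop has to rebuild readfds (or copy it from a saved master set) before every call.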
27 Select system call
- The value of timeout is interpreted as follows
- Return immediately after checking the descriptors: timeval contents are set to zero
- Return if at least one descriptor is ready for I/O, but limit the wait to the timeval contents
- Wait indefinitely for descriptors to become ready for I/O: timeout pointer set to null
- The indefinite wait for I/O readiness is a cancellation point and can be terminated or unblocked by sending an interrupt signal
- The select() function can be used as a high precision timer by setting the descriptor count to zero and the three descriptor set pointers to null, leaving only the timeval struct
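The three timeout modes and the timer trick can be sketched as below. The names sleep_ms and wait_readable, and the convention that a negative millisecond count means "wait indefinitely", are assumptions of this example.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/select.h>
#include <unistd.h>

/* Fill tv from a millisecond count. */
static void ms_to_timeval(long ms, struct timeval *tv)
{
    tv->tv_sec = ms / 1000;
    tv->tv_usec = (ms % 1000) * 1000;
}

/* Pause for about ms milliseconds: no descriptors, only a timeout.
 * Returns 0 when the time has elapsed, -1 if select fails
 * (for example when interrupted by a signal). */
int sleep_ms(long ms)
{
    struct timeval tv;

    assert(ms >= 0);
    ms_to_timeval(ms, &tv);
    return select(0, NULL, NULL, NULL, &tv);
}

/* Wait for fd to become readable.
 *   ms == 0 : poll and return immediately
 *   ms  > 0 : wait at most ms milliseconds
 *   ms  < 0 : wait indefinitely (null timeout pointer)
 * Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, long ms)
{
    fd_set readfds;
    struct timeval tv;
    int n;

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    if (ms >= 0)
        ms_to_timeval(ms, &tv);
    n = select(fd + 1, &readfds, NULL, NULL, ms >= 0 ? &tv : NULL);
    return n < 0 ? -1 : (n > 0);
}
```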
28 UDP single threaded server
- The accept call is meant only for connection oriented servers or stream sockets
- To simulate this function, a UDP server waits for requests on a master port
- As soon as a client's first request arrives, the server creates a new socket and supplies the client with its unreserved port number
- Each client requesting service at the WKP is thus redirected to an unreserved port
- The server iteratively does a select on all the sockets to see if they are ready to be read from, or written into
- The server also sets aside a new buffer for each connection, which avoids interleaving packets from different clients in the same buffer
29 Alternative Architectures
- The Client-Server architecture is demand driven, with network I/O occurring each time data is requested
- Data transfer efficiency can be increased by caching information on the client side
- HTTP clients or browsers do this for information that is fairly stable
- The Client saves the soft state of the Server, which should be aged out
- Another alternative is for the Server to periodically push out unsolicited information to all clients
- This is inefficient but is useful for applications such as ruptime
- This application broadcasts machine status once every minute to all machines on the physical network
30 Multiprotocol Servers
- Applications talking on both ends of a socket must speak the same protocol to communicate
- Sometimes the same service is offered using two different protocols
- It is desirable not to have a separate server for each protocol providing the same service
- The algorithm required for computing a response is the same irrespective of the protocol and should be reused
- The server can be made to open two sockets, one for each protocol, bound to the same port if needed
- If the TCP socket becomes ready, a client has requested a TCP connection and an accept function call follows
- If the UDP socket becomes ready, a recvfrom is used to read from it
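One way to arrange the two sockets is sketched below. All three function names are hypothetical, the sockets are bound to the loopback address only to keep the example local, and the retry loop is an assumption of this sketch to cover the case where the kernel-chosen TCP port number is already taken for UDP.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Create a socket of the given type bound to 127.0.0.1:*port (host order).
 * Port 0 lets the kernel choose; the port in use is stored back in *port. */
static int bound_socket(int type, unsigned short *port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int fd = socket(AF_INET, type, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(*port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *port = ntohs(addr.sin_port);
    return fd;
}

/* Open a listening TCP socket and a UDP socket on the same port number.
 * Returns 0 on success, -1 on failure. */
int open_tcp_udp(int *tcpfd, int *udpfd, unsigned short *port)
{
    int attempt;

    assert(tcpfd != NULL && udpfd != NULL && port != NULL);
    for (attempt = 0; attempt < 10; attempt++) {
        *port = 0;
        *tcpfd = bound_socket(SOCK_STREAM, port);
        if (*tcpfd < 0)
            return -1;
        *udpfd = bound_socket(SOCK_DGRAM, port);    /* same port number */
        if (*udpfd >= 0 && listen(*tcpfd, 5) == 0)
            return 0;
        if (*udpfd >= 0)
            close(*udpfd);
        close(*tcpfd);
    }
    return -1;
}

/* Block until one of the two sockets is readable.
 * Returns 'T' (call accept next), 'U' (call recvfrom next), 0 on error. */
int next_protocol(int tcpfd, int udpfd)
{
    fd_set readfds;
    int maxfd = tcpfd > udpfd ? tcpfd : udpfd;

    FD_ZERO(&readfds);
    FD_SET(tcpfd, &readfds);
    FD_SET(udpfd, &readfds);
    if (select(maxfd + 1, &readfds, NULL, NULL, NULL) <= 0)
        return 0;
    return FD_ISSET(tcpfd, &readfds) ? 'T' : 'U';
}
```

A real server would call next_protocol in a loop and pass the request, whichever socket it arrived on, to the same response-computing routine.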
31 Multiservice Servers
- Ideally a single server should be able to provide multiple services
- A connectionless multiservice Server can open multiple sockets and bind each to a WKP
- When a request comes in, the Server can invoke the code fragment designed to offer that service
- The Server uses the select() function call to wait for datagrams from various clients to arrive
- A connection oriented multiservice server can be based on the iterative algorithm
- The server opens a socket for each WKP and listens for requests
- The accept() function creates a new socket which can be added to the set of descriptors that is passed to select()
- Alternatively the server can be made concurrent for both protocols
32 Multiservice Servers
- A Multiservice server can be made to invoke separate programs based on the service it needs to offer
- The inetd daemon provided in most Operating Systems is a Multiservice Server of sorts
- The inetd server works on the basis of a configuration file
- The configuration file is an ASCII file that describes the service, the protocol port, and the program that is to be invoked
- The inetd server listens for service requests on all services defined in the configuration file
- When a request comes in, the daemon in turn invokes the program that provides the service and passes it the socket descriptor on which the service program should operate
- The advantage of this approach is that services are invoked only when clients connect, thereby saving on platform resources
33 Super Server - inetd
- A typical entry in the configuration file inetd.conf looks like
    ftp     stream  tcp  nowait  root  /usr/sbin/in.ftpd     in.ftpd
    telnet  stream  tcp  nowait  root  /usr/sbin/in.telnetd  in.telnetd
- The protocol port for each service is defined in the /etc/services file
- The fourth field indicates to inetd if the server should be run iteratively or concurrently
- The wait keyword forces inetd to wait until one invocation of the program terminates before starting another
- If nowait is specified, inetd starts a separate invocation for each client request without waiting
- The sixth field indicates the program that is to be run, followed by the arguments to the program
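From the service program's side, this can be sketched as a routine that works purely on descriptors. inetd conventionally hands the connected socket to a nowait stream service as descriptors 0, 1 and 2, so such a program never calls socket, bind, listen or accept itself. The function echo_service below is a hypothetical example service, not one of the programs named in the slides.

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical echo service: copy everything read from in_fd to out_fd
 * until end-of-file. Returns the number of bytes echoed, or -1 on error.
 * Under inetd, a main() would call echo_service(0, 1). */
long echo_service(int in_fd, int out_fd)
{
    char buf[512];
    long total = 0;
    ssize_t n;

    assert(in_fd >= 0 && out_fd >= 0);

    while ((n = read(in_fd, buf, sizeof(buf))) > 0) {
        ssize_t off = 0;

        while (off < n) {           /* write may accept fewer bytes than asked */
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
            if (w < 0)
                return -1;
            off += w;
        }
        total += n;
    }
    return n < 0 ? -1 : total;
}
```

Keeping the descriptors as parameters also lets the same routine be exercised with pipes or a socket pair outside inetd.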
34 Reading Assignment
- Read the following from the Text Book
- Chapters 20, 21 of Volume 1
- Chapters 9, 10, 11, 13, and 14 of Volume 3