Title: System-Level I/O and Networks
1System-Level I/O and Networks
- Topics
- Unix I/O
- Robust reading and writing
- Reading file metadata
- Sharing files
- I/O redirection
- Standard I/O
- Internetworking
- Client-server programming model
- Networks
- Internetworks
- Global IP Internet
- IP addresses
- Domain names
- Connections
- Network Programming
- Programmer's view of the Internet
- Sockets interface
- Writing clients and servers
2A Typical Hardware System
[Figure: block diagram of a typical hardware system. Labels: CPU chip (register file, ALU, bus interface), system bus, I/O bridge, memory bus, main memory, I/O bus, USB controller (mouse, keyboard), graphics adapter (monitor), disk controller (disk), and expansion slots for other devices such as network adapters.]
3Reading a Disk Sector Step 1
CPU initiates a disk read by writing a command, logical block number, and destination memory address to a port (address) associated with the disk controller.
[Figure: hardware system diagram. Labels: CPU chip (register file, ALU, bus interface), main memory, I/O bus, USB controller (mouse, keyboard), graphics adapter (monitor), disk controller (disk).]
4Reading a Disk Sector Step 2
Disk controller reads the sector and performs a direct memory access (DMA) transfer into main memory.
[Figure: hardware system diagram. Labels: CPU chip (register file, ALU, bus interface), main memory, I/O bus, USB controller (mouse, keyboard), graphics adapter (monitor), disk controller (disk).]
5Reading a Disk Sector Step 3
When the DMA transfer completes, the disk controller notifies the CPU with an interrupt (i.e., asserts a special interrupt pin on the CPU).
[Figure: hardware system diagram. Labels: CPU chip (register file, ALU, bus interface), main memory, I/O bus, USB controller (mouse, keyboard), graphics adapter (monitor), disk controller (disk).]
6Unix Files
- A Unix file is a sequence of m bytes
- B_0, B_1, ..., B_k, ..., B_{m-1}
- All I/O devices are represented as files
- /dev/sda2 (/usr disk partition)
- /dev/tty2 (terminal)
- Even the kernel is represented as a file
- /dev/kmem (kernel memory image)
- /proc (kernel data structures)
7Unix File Types
- Regular file
- Binary or text file.
- Unix does not know the difference!
- Directory file
- A file that contains the names and locations of other files.
- Character special and block special files
- Terminals (character special) and disks (block special)
- FIFO (named pipe)
- A file type used for interprocess communication
- Socket
- A file type used for network communication between processes
8Unix I/O
- The elegant mapping of files to devices allows the kernel to export a simple interface called Unix I/O.
- Key Unix idea: All input and output is handled in a consistent and uniform way.
- Basic Unix I/O operations (system calls)
- Opening and closing files
- open() and close()
- Changing the current file position (seek)
- lseek (not discussed)
- Reading and writing a file
- read() and write()
9Opening Files
- Opening a file informs the kernel that you are getting ready to access that file.
- Returns a small identifying integer file descriptor
- fd == -1 indicates that an error occurred
- Each process created by a Unix shell begins life with three open files associated with a terminal
- 0: standard input
- 1: standard output
- 2: standard error

    int fd;   /* file descriptor */

    if ((fd = open("/etc/hosts", O_RDONLY)) < 0) {
        perror("open");
        exit(1);
    }
10Closing Files
- Closing a file informs the kernel that you are finished accessing that file.
- Closing an already closed file is a recipe for disaster in threaded programs
- Moral: Always check return codes, even for seemingly benign functions such as close()

    int fd;     /* file descriptor */
    int retval; /* return value */

    if ((retval = close(fd)) < 0) {
        perror("close");
        exit(1);
    }
11Reading Files
- Reading a file copies bytes from the current file position to memory, and then updates file position.
- Returns number of bytes read from file fd into buf
- nbytes < 0 indicates that an error occurred.
- short counts (0 < nbytes < sizeof(buf)) are possible and are not errors!

    char buf[512];
    int fd;       /* file descriptor */
    int nbytes;   /* number of bytes read */

    /* Open file fd ... */
    /* Then read up to 512 bytes from file fd */
    if ((nbytes = read(fd, buf, sizeof(buf))) < 0) {
        perror("read");
        exit(1);
    }
12Writing Files
- Writing a file copies bytes from memory to the current file position, and then updates current file position.
- Returns number of bytes written from buf to file fd.
- nbytes < 0 indicates that an error occurred.
- As with reads, short counts are possible and are not errors!
- Transfers up to 512 bytes from address buf to file fd

    char buf[512];
    int fd;       /* file descriptor */
    int nbytes;   /* number of bytes written */

    /* Open the file fd ... */
    /* Then write up to 512 bytes from buf to file fd */
    if ((nbytes = write(fd, buf, sizeof(buf))) < 0) {
        perror("write");
        exit(1);
    }
13Unix I/O Example
- Copying standard input to standard output one byte at a time.
- Note the use of error handling wrappers for read and write (Appendix B).

    #include "csapp.h"

    int main(void)
    {
        char c;

        while (Read(STDIN_FILENO, &c, 1) != 0)
            Write(STDOUT_FILENO, &c, 1);
        exit(0);
    }
14Dealing with Short Counts
- Short counts can occur in these situations
- Encountering end-of-file (EOF) on reads.
- Reading text lines from a terminal.
- Reading and writing network sockets or Unix pipes.
- Short counts never occur in these situations
- Reading from disk files (except for EOF)
- Writing to disk files.
- How should you deal with short counts in your code?
- Use the RIO (Robust I/O) package from your textbook's csapp.c file (Appendix B).
15The RIO Package
- RIO is a set of wrappers that provide efficient and robust I/O in applications such as network programs that are subject to short counts.
- RIO provides two different kinds of functions
- Unbuffered input and output of binary data
- rio_readn and rio_writen
- Buffered input of binary data and text lines
- rio_readlineb and rio_readnb
- Cleans up some problems with Stevens's readline and readn functions.
- Unlike the Stevens routines, the buffered RIO routines are thread-safe and can be interleaved arbitrarily on the same descriptor.
- Download from
- csapp.cs.cmu.edu/public/ics/code/src/csapp.c
- csapp.cs.cmu.edu/public/ics/code/include/csapp.h
16Unbuffered RIO Input and Output
- Same interface as Unix read and write
- Especially useful for transferring data on network sockets
- rio_readn returns a short count only if it encounters EOF.
- rio_writen never returns a short count.
- Calls to rio_readn and rio_writen can be interleaved arbitrarily on the same descriptor.

    #include "csapp.h"

    ssize_t rio_readn(int fd, void *usrbuf, size_t n);
    ssize_t rio_writen(int fd, void *usrbuf, size_t n);

    Return: num. bytes transferred if OK, 0 on EOF (rio_readn only), -1 on error
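The claim that rio_writen never returns a short count can be illustrated with a loop that keeps calling write until every byte is transferred. The following is a minimal sketch under that assumption, not the csapp.c source; the name writen_sketch is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* writen_sketch (hypothetical name): keep calling write() until all n
 * bytes are transferred, so the caller never sees a short count.
 * Returns n on success, -1 on error. */
ssize_t writen_sketch(int fd, const void *usrbuf, size_t n)
{
    size_t nleft = n;
    const char *bufp = usrbuf;

    assert(usrbuf != NULL || n == 0);
    while (nleft > 0) {
        ssize_t nwritten = write(fd, bufp, nleft);
        if (nwritten <= 0) {
            if (nwritten < 0 && errno == EINTR)
                nwritten = 0;   /* interrupted by a signal handler: retry */
            else
                return -1;      /* give up; errno was set by write() */
        }
        nleft -= (size_t)nwritten;
        bufp += nwritten;
    }
    return (ssize_t)n;
}
```

A caller would use it exactly like write, for example writen_sketch(fd, buf, len), and only needs to check for -1.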
17Implementation of rio_readn
    /*
     * rio_readn - robustly read n bytes (unbuffered)
     */
    ssize_t rio_readn(int fd, void *usrbuf, size_t n)
    {
        size_t nleft = n;
        ssize_t nread;
        char *bufp = usrbuf;

        while (nleft > 0) {
            if ((nread = read(fd, bufp, nleft)) < 0) {
                if (errno == EINTR) /* interrupted by sig handler return */
                    nread = 0;      /* and call read() again */
                else
                    return -1;      /* errno set by read() */
            }
            else if (nread == 0)
                break;              /* EOF */
            nleft -= nread;
            bufp += nread;
        }
        return (n - nleft);         /* return >= 0 */
    }
18Buffered RIO Input Functions
- Efficiently read text lines and binary data from a file partially cached in an internal memory buffer
- rio_readlineb reads a text line of up to maxlen bytes from file fd and stores the line in usrbuf.
- Especially useful for reading text lines from network sockets.
- rio_readnb reads up to n bytes from file fd.
- Calls to rio_readlineb and rio_readnb can be interleaved arbitrarily on the same descriptor.
- Warning: Don't interleave with calls to rio_readn

    #include "csapp.h"

    void rio_readinitb(rio_t *rp, int fd);
    ssize_t rio_readlineb(rio_t *rp, void *usrbuf, size_t maxlen);
    ssize_t rio_readnb(rio_t *rp, void *usrbuf, size_t n);

    Return: num. bytes read if OK, 0 on EOF, -1 on error
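To make the idea of a file "partially cached in an internal memory buffer" concrete, here is a simplified sketch of a buffered line reader built on read. It is an illustration only, not the RIO implementation; the type linebuf_t, the functions linebuf_init, linebuf_getc, and linebuf_readline, and the buffer size are all hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define LB_BUFSIZE 8192              /* assumed internal buffer size */

typedef struct {
    int fd;                          /* descriptor being read */
    ssize_t cnt;                     /* unread bytes left in buf */
    char *ptr;                       /* next unread byte in buf */
    char buf[LB_BUFSIZE];            /* internal buffer */
} linebuf_t;

void linebuf_init(linebuf_t *lb, int fd)
{
    lb->fd = fd;
    lb->cnt = 0;
    lb->ptr = lb->buf;
}

/* Hand out one byte from the internal buffer, refilling it with one
 * read() call when it is empty. Returns 1 if OK, 0 on EOF, -1 on error. */
static int linebuf_getc(linebuf_t *lb, char *c)
{
    while (lb->cnt <= 0) {
        lb->cnt = read(lb->fd, lb->buf, sizeof(lb->buf));
        if (lb->cnt < 0) {
            if (errno != EINTR)
                return -1;           /* real error */
        } else if (lb->cnt == 0) {
            return 0;                /* EOF */
        } else {
            lb->ptr = lb->buf;       /* buffer refilled */
        }
    }
    *c = *lb->ptr++;
    lb->cnt--;
    return 1;
}

/* Store one text line (at most maxlen-1 bytes, newline included) in
 * usrbuf and NUL-terminate it. Returns bytes stored, 0 on EOF, -1 on error. */
ssize_t linebuf_readline(linebuf_t *lb, char *usrbuf, size_t maxlen)
{
    size_t n = 0;

    assert(maxlen > 0);
    while (n + 1 < maxlen) {
        char c;
        int rc = linebuf_getc(lb, &c);
        if (rc < 0)
            return -1;
        if (rc == 0)
            break;                   /* EOF: return what we have */
        usrbuf[n++] = c;
        if (c == '\n')
            break;
    }
    usrbuf[n] = '\0';
    return (ssize_t)n;
}
```

Because a single read may pull in several lines at once, later calls are served from memory. That is also why unbuffered reads on the same descriptor must not be mixed in: they would skip bytes already sitting in the buffer.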
19RIO Example
- Copying the lines of a text file from standard input to standard output.

    #include "csapp.h"

    int main(int argc, char **argv)
    {
        int n;
        rio_t rio;
        char buf[MAXLINE];

        Rio_readinitb(&rio, STDIN_FILENO);
        while ((n = Rio_readlineb(&rio, buf, MAXLINE)) != 0)
            Rio_writen(STDOUT_FILENO, buf, n);
        exit(0);
    }
20File Metadata
- Metadata is data about data, in this case file data.
- Maintained by kernel, accessed by users with the stat and fstat functions.

    /* Metadata returned by the stat and fstat functions */
    struct stat {
        dev_t         st_dev;      /* device */
        ino_t         st_ino;      /* inode */
        mode_t        st_mode;     /* protection and file type */
        nlink_t       st_nlink;    /* number of hard links */
        uid_t         st_uid;      /* user ID of owner */
        gid_t         st_gid;      /* group ID of owner */
        dev_t         st_rdev;     /* device type (if inode device) */
        off_t         st_size;     /* total size, in bytes */
        unsigned long st_blksize;  /* blocksize for filesystem I/O */
        unsigned long st_blocks;   /* number of blocks allocated */
        time_t        st_atime;    /* time of last access */
        time_t        st_mtime;    /* time of last modification */
        time_t        st_ctime;    /* time of last change */
    };
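As a small illustration of fstat, which takes an open descriptor rather than a filename, the sketch below reads st_mode and st_size for a file. The function name regular_file_size is hypothetical, and returning -1 for anything that is not a regular file is just a convention chosen for this example.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* regular_file_size (hypothetical name): open path, query its metadata
 * with fstat(), and return st_size if it is a regular file.
 * Returns -1 if the file cannot be opened or is not a regular file. */
long long regular_file_size(const char *path)
{
    struct stat st;
    long long size = -1;
    int fd;

    assert(path != NULL);
    if ((fd = open(path, O_RDONLY)) < 0)
        return -1;
    if (fstat(fd, &st) == 0 && S_ISREG(st.st_mode))
        size = (long long)st.st_size;   /* total size, in bytes */
    close(fd);
    return size;
}
```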
21Example of Accessing File Metadata
    /* statcheck.c - Querying and manipulating a file's meta data */
    #include "csapp.h"

    int main (int argc, char **argv)
    {
        struct stat stat;
        char *type, *readok;

        Stat(argv[1], &stat);
        if (S_ISREG(stat.st_mode))      /* file type */
            type = "regular";
        else if (S_ISDIR(stat.st_mode))
            type = "directory";
        else
            type = "other";
        if ((stat.st_mode & S_IRUSR))   /* OK to read? */
            readok = "yes";
        else
            readok = "no";

        printf("type: %s, read: %s\n", type, readok);
        exit(0);
    }

    bass> ./statcheck statcheck.c
    type: regular, read: yes
    bass> chmod 000 statcheck.c
    bass> ./statcheck statcheck.c
    type: regular, read: no
22How the Unix Kernel Represents Open Files
- Two descriptors referencing two distinct open files. Descriptor 1 (stdout) points to the terminal, and descriptor 4 points to an open disk file.
[Figure: three tables. Descriptor table (one table per process): fd 0 (stdin), fd 1 (stdout), fd 2 (stderr), fd 3, fd 4. Open file table (shared by all processes): entries File A (terminal) and File B (disk), each with file pos and refcnt=1. V-node table (shared by all processes): one entry per file with file access, file size, and file type (the info in the stat struct).]
23File Sharing
- Two distinct descriptors sharing the same disk file through two distinct open file table entries
- E.g., calling open twice with the same filename argument
[Figure: three tables. Descriptor table (one table per process): fd 0 to fd 4. Open file table (shared by all processes): entries File A and File B, each with its own file pos and refcnt=1. V-node table (shared by all processes): a single entry with file access, file size, and file type, referenced by both open file table entries.]
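One observable consequence of two distinct open file table entries is that each descriptor has its own file position. The sketch below opens the same file twice and reads one byte through each descriptor; the function name first_bytes_match is hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* first_bytes_match (hypothetical name): open path twice and read one
 * byte through each descriptor. Each open() gets its own open file
 * table entry and file position, so both reads start at offset 0.
 * Returns 1 if the two bytes are equal, 0 if not, -1 on error. */
int first_bytes_match(const char *path)
{
    int fd1, fd2, result = -1;
    char c1, c2;

    assert(path != NULL);
    if ((fd1 = open(path, O_RDONLY)) < 0)
        return -1;
    if ((fd2 = open(path, O_RDONLY)) < 0) {
        close(fd1);
        return -1;
    }
    if (read(fd1, &c1, 1) == 1 && read(fd2, &c2, 1) == 1)
        result = (c1 == c2);
    close(fd1);
    close(fd2);
    return result;
}
```

For a file containing "ab", both reads see 'a'. If the descriptors shared a file position, the second read would see 'b' instead.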
24How Processes Share Files
- A child process inherits its parent's open files. Here is the situation immediately after a fork
[Figure: three tables. Descriptor tables: parent's table and child's table, each with fd 0 to fd 4. Open file table (shared by all processes): entries File A and File B, each with file pos and refcnt=2. V-node table (shared by all processes): one entry per file with file access, file size, and file type. Matching descriptors in the parent and child refer to the same open file table entries.]
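Because parent and child refer to the same open file table entry after a fork, they also share one file position. The sketch below lets the child read one byte and then has the parent read from the same descriptor; the function name byte_after_child_read is hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* byte_after_child_read (hypothetical name): the child reads one byte
 * from the inherited descriptor, then the parent reads one byte.
 * Both use the same open file table entry, so the parent continues
 * where the child stopped. Returns the parent's byte, or -1 on error. */
int byte_after_child_read(const char *path)
{
    int fd, status;
    char c;
    pid_t pid;

    assert(path != NULL);
    if ((fd = open(path, O_RDONLY)) < 0)
        return -1;
    if ((pid = fork()) < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0)                       /* child: consume one byte */
        _exit(read(fd, &c, 1) == 1 ? 0 : 1);

    /* parent: wait for the child, then read from the shared position */
    if (waitpid(pid, &status, 0) < 0 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0 ||
        read(fd, &c, 1) != 1) {
        close(fd);
        return -1;
    }
    close(fd);
    return (unsigned char)c;
}
```

For a file containing "ab", the child consumes 'a' and the parent then reads 'b'.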
25Standard I/O Functions
- The C standard library (libc.a) contains a collection of higher-level standard I/O functions
- Documented in Appendix B of K&R.
- Examples of standard I/O functions
- Opening and closing files (fopen and fclose)
- Reading and writing bytes (fread and fwrite)
- Reading and writing text lines (fgets and fputs)
- Formatted reading and writing (fscanf and fprintf)
26Standard I/O Streams
- Standard I/O models open files as streams
- Abstraction for a file descriptor and a buffer in memory.
- C programs begin life with three open streams (defined in stdio.h)
- stdin (standard input)
- stdout (standard output)
- stderr (standard error)

    #include <stdio.h>
    extern FILE *stdin;   /* standard input (descriptor 0) */
    extern FILE *stdout;  /* standard output (descriptor 1) */
    extern FILE *stderr;  /* standard error (descriptor 2) */

    int main()
    {
        fprintf(stdout, "Hello, world\n");
    }
27Buffering in Standard I/O
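- Standard I/O functions use buffered I/O
[Figure: successive calls printf("h"), printf("e"), printf("l"), printf("l"), printf("o"), printf("\n") append the characters h e l l o \n to an in-memory buffer buf; fflush(stdout) then issues a single write of the 6 buffered bytes to descriptor 1.]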
28Standard I/O Buffering in Action
- You can see this buffering in action for yourself, using the always fascinating Unix strace program

    #include <stdio.h>
    #include <stdlib.h>

    int main()
    {
        printf("h");
        printf("e");
        printf("l");
        printf("l");
        printf("o");
        printf("\n");
        fflush(stdout);
        exit(0);
    }

    linux> strace ./hello
    execve("./hello", ["hello"], [/* ... */]).
    ...
    write(1, "hello\n", 6...) = 6
    ...
    _exit(0) = ?
29Unix I/O vs. Standard I/O vs. RIO
- Standard I/O and RIO are implemented using low-level Unix I/O.
- Which ones should you use in your programs?
[Figure: layered diagram. A C application program sits on top of the standard I/O functions (fopen, fdopen, fread, fwrite, fscanf, fprintf, sscanf, sprintf, fgets, fputs, fflush, fseek, fclose) and the RIO functions (rio_readn, rio_writen, rio_readinitb, rio_readlineb, rio_readnb). Both sit on top of the Unix I/O functions (open, read, write, lseek, stat, close), which are accessed via system calls.]
30Pros and Cons of Unix I/O
- Pros
- Unix I/O is the most general and lowest overhead form of I/O.
- All other I/O packages are implemented using Unix I/O functions.
- Unix I/O provides functions for accessing file metadata.
- Cons
- Dealing with short counts is tricky and error prone.
- Efficient reading of text lines requires some form of buffering, also tricky and error prone.
- Both of these issues are addressed by the standard I/O and RIO packages.
31Pros and Cons of Standard I/O
- Pros
- Buffering increases efficiency by decreasing the number of read and write system calls.
- Short counts are handled automatically.
- Cons
- Provides no function for accessing file metadata
- Standard I/O is not appropriate for input and output on network sockets
- There are poorly documented restrictions on streams that interact badly with restrictions on sockets
32Pros and Cons of Standard I/O (cont)
- Restrictions on streams
- Restriction 1: an input function cannot follow an output function without an intervening call to fflush, fseek, fsetpos, or rewind.
- Latter three functions all use lseek to change file position.
- Restriction 2: an output function cannot follow an input function without an intervening call to fseek, fsetpos, or rewind.
- Restriction on sockets
- You are not allowed to change the file position of a socket.
33Pros and Cons of Standard I/O (cont)
- Workaround for restriction 1
- Flush stream after every output.
- Workaround for restriction 2
- Open two streams on the same descriptor, one for reading and one for writing
- However, this requires you to close the same descriptor twice
- Creates a deadly race in concurrent threaded programs!

    FILE *fpin, *fpout;

    fpin = fdopen(sockfd, "r");
    fpout = fdopen(sockfd, "w");

    fclose(fpin);
    fclose(fpout);
34Choosing I/O Functions
- General rule: Use the highest-level I/O functions you can.
- Many C programmers are able to do all of their work using the standard I/O functions.
- When to use standard I/O?
- When working with disk or terminal files.
- When to use raw Unix I/O?
- When you need to fetch file metadata.
- In rare cases when you need absolute highest performance.
- When to use RIO?
- When you are reading and writing network sockets or pipes.
- Never use standard I/O or raw Unix I/O on sockets or pipes.
35A Client-Server Transaction
- Every network application is based on the client-server model
- A server process and one or more client processes
- Server manages some resource.
- Server provides service by manipulating resource for clients.
[Figure: a client process and a server process that manages a resource, exchanging one transaction.]
1. Client sends request
2. Server handles request
3. Server sends response
4. Client handles response
Note: clients and servers are processes running on hosts (can be the same or different hosts).
36Hardware Org of a Network Host
[Figure: block diagram of a network host. Labels: CPU chip (register file, ALU), system bus, I/O bridge, memory bus, main memory, expansion slots, I/O bus, USB controller (mouse, keyboard), graphics adapter (monitor), disk controller (disk), network adapter (network).]
37Computer Networks
- A network is a hierarchical system of boxes and wires organized by geographical proximity
- LAN (local area network) spans a building or campus.
- Ethernet is the most prominent example.
- WAN (wide-area network) spans country or world.
- Typically high-speed point-to-point phone lines.
- An internetwork (internet) is an interconnected set of networks.
- The Global IP Internet (uppercase I) is the most famous example of an internet (lowercase i)
- Let's see how we would build an internet from the ground up.
38Lowest Level Ethernet Segment
- Ethernet segment consists of a collection of hosts connected by wires (twisted pairs) to a hub.
- Spans room or floor in a building.
- Operation
- Each Ethernet adapter has a unique 48-bit address.
- Hosts send bits to any other host in chunks called frames.
- Hub slavishly copies each bit from each port to every other port.
- Every host sees every bit.
[Figure: hosts connected over 100 Mb/s links to the ports of a hub.]
39Next Level Bridged Ethernet Segment
- Spans building or campus.
- Bridges cleverly learn which hosts are reachable from which ports and then selectively copy frames from port to port.
[Figure: bridged Ethernet. Labels: hosts (including A, B, and C), hubs, bridges X and Y, 100 Mb/s links, and a 1 Gb/s link.]
40Conceptual View of LANs
- For simplicity, hubs, bridges, and wires are often shown as a collection of hosts attached to a single wire
[Figure: a row of hosts attached to a single wire.]
41Next Level internets
- Multiple incompatible LANs can be physically connected by specialized computers called routers.
- The connected networks are called an internet.
[Figure: hosts on LAN 1 and hosts on LAN 2, with routers connecting the LANs and WANs.]
LAN 1 and LAN 2 might be completely different, totally incompatible LANs (e.g., Ethernet and ATM)
42The Notion of an internet Protocol
- How is it possible to send bits across incompatible LANs and WANs?
- Solution: protocol software running on each host and router smooths out the differences between the different networks.
- Implements an internet protocol (i.e., set of rules) that governs how hosts and routers should cooperate when they transfer data from network to network.
- TCP/IP is the protocol for the global IP Internet.
43What Does an internet Protocol Do?
- 1. Provides a naming scheme
- An internet protocol defines a uniform format for host addresses.
- Each host (and router) is assigned at least one of these internet addresses that uniquely identifies it.
- 2. Provides a delivery mechanism
- An internet protocol defines a standard transfer unit (packet)
- Packet consists of header and payload
- Header contains info such as packet size, source and destination addresses.
- Payload contains data bits sent from source host.
44Transferring Data Over an internet
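[Figure: eight numbered steps moving data from the client on Host A (LAN1) to the server on Host B (LAN2) through a router. (1) The client hands data to the protocol software on Host A. (2) The protocol software adds a packet header (PH) and a LAN1 frame header (FH1), forming an internet packet inside a LAN1 frame. (3) The LAN1 adapter on Host A sends the frame. (4) The router's LAN1 adapter receives it. (5) The router's protocol software replaces FH1 with a LAN2 frame header (FH2). (6) The router's LAN2 adapter sends the LAN2 frame. (7) The LAN2 adapter on Host B passes the frame to its protocol software. (8) The protocol software strips the headers and delivers the data to the server.]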
45Other Issues
- We are glossing over a number of important questions
- What if different networks have different maximum frame sizes? (segmentation)
- How do routers know where to forward frames?
- How are routers informed when the network topology changes?
- What if packets get lost?
- These (and other) questions are addressed by the area of systems known as computer networking.
46Global IP Internet
- Most famous example of an internet.
- Based on the TCP/IP protocol family
- IP (Internet protocol)
- Provides basic naming scheme and unreliable delivery capability of packets (datagrams) from host-to-host.
- UDP (User Datagram Protocol)
- Uses IP to provide unreliable datagram delivery from process-to-process.
- TCP (Transmission Control Protocol)
- Uses IP to provide reliable byte streams from process-to-process over connections.
- Accessed via a mix of Unix file I/O and functions from the sockets interface.
47Hardware and Software Org of an Internet
Application
[Figure: an Internet client host and an Internet server host, each drawn as layers: client or server (user code), sockets interface (system calls), TCP/IP (kernel code), hardware interface (interrupts), and network adapter (hardware and firmware). The two network adapters are connected through the Global IP Internet.]
48A Programmer's View of the Internet
- 1. Hosts are mapped to a set of 32-bit IP addresses.
- 128.2.203.179
- 2. The set of IP addresses is mapped to a set of identifiers called Internet domain names.
- 128.2.203.179 is mapped to www.cs.cmu.edu
- 3. A process on one Internet host can communicate with a process on another Internet host over a connection.
491. IP Addresses
- 32-bit IP addresses are stored in an IP address struct
- IP addresses are always stored in memory in network byte order (big-endian byte order)
- True in general for any integer transferred in a packet header from one machine to another.
- E.g., the port number used to identify an Internet connection.

    /* Internet address structure */
    struct in_addr {
        unsigned int s_addr; /* network byte order (big-endian) */
    };

Handy network byte-order conversion functions:
- htonl: convert long int from host to network byte order.
- htons: convert short int from host to network byte order.
- ntohl: convert long int from network to host byte order.
- ntohs: convert short int from network to host byte order.
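The effect of the conversion functions can be checked by looking at the bytes in memory after a call to htonl. The following is a small sketch; the helper name network_bytes is hypothetical, and the value 0x8002C2F2 is used only as sample input.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* network_bytes (hypothetical name): convert a 32-bit host-order value
 * to network byte order and copy its in-memory bytes into out[0..3].
 * On any host, out[0] ends up holding the most significant byte. */
void network_bytes(uint32_t host_value, unsigned char out[4])
{
    uint32_t net = htonl(host_value);   /* host -> network (big-endian) */

    assert(out != NULL);
    memcpy(out, &net, sizeof(net));
}
```

For the input 0x8002C2F2 the four bytes are 0x80, 0x02, 0xC2, 0xF2 regardless of the host's own byte order, and ntohl(htonl(x)) always gives back x.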
50Dotted Decimal Notation
- By convention, each byte in a 32-bit IP address is represented by its decimal value and separated by a period
- IP address 0x8002C2F2 = 128.2.194.242
- Functions for converting between binary IP addresses and dotted decimal strings
- inet_aton: converts a dotted decimal string to an IP address in network byte order.
- inet_ntoa: converts an IP address in network byte order to its corresponding dotted decimal string.
- "n" denotes network representation. "a" denotes application representation.
512. Internet Domain Names
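[Figure: the domain name hierarchy as a tree. Unnamed root. First-level domain names: mil, edu, gov, com. Second-level domain names: cmu, berkeley, mit (under edu) and amazon (under com). Third-level domain names: cs and ece (under cmu) and www 208.216.181.15 (under amazon). Below cs: cmcl and pdl. Below cmcl: kittyhawk 128.2.194.242 and imperial 128.2.189.40.]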
52Domain Naming System (DNS)
- The Internet maintains a mapping between IP addresses and domain names in a huge worldwide distributed database called DNS.
- Conceptually, programmers can view the DNS database as a collection of millions of host entry structures
- Functions for retrieving host entries from DNS
- gethostbyname: query key is a DNS domain name.
- gethostbyaddr: query key is an IP address.

    /* DNS host entry structure */
    struct hostent {
        char  *h_name;       /* official domain name of host */
        char **h_aliases;    /* null-terminated array of domain names */
        int    h_addrtype;   /* host address type (AF_INET) */
        int    h_length;     /* length of an address, in bytes */
        char **h_addr_list;  /* null-terminated array of in_addr structs */
    };
53Properties of DNS Host Entries
- Each host entry is an equivalence class of domain names and IP addresses.
- Each host has a locally defined domain name localhost which always maps to the loopback address 127.0.0.1
- Different kinds of mappings are possible
- Simple case: 1-1 mapping between domain name and IP addr
- kittyhawk.cmcl.cs.cmu.edu maps to 128.2.194.242
- Multiple domain names mapped to the same IP address
- eecs.mit.edu and cs.mit.edu both map to 18.62.1.6
- Multiple domain names mapped to multiple IP addresses
- aol.com and www.aol.com map to multiple IP addrs.
- Some valid domain names don't map to any IP address
- for example: cmcl.cs.cmu.edu
54A Program That Queries DNS
    int main(int argc, char **argv)
    {   /* argv[1] is a domain name or dotted decimal IP addr */
        char **pp;
        struct in_addr addr;
        struct hostent *hostp;

        if (inet_aton(argv[1], &addr) != 0)
            hostp = Gethostbyaddr((const char *)&addr, sizeof(addr), AF_INET);
        else
            hostp = Gethostbyname(argv[1]);
        printf("official hostname: %s\n", hostp->h_name);

        for (pp = hostp->h_aliases; *pp != NULL; pp++)
            printf("alias: %s\n", *pp);

        for (pp = hostp->h_addr_list; *pp != NULL; pp++) {
            addr.s_addr = *((unsigned int *)*pp);
            printf("address: %s\n", inet_ntoa(addr));
        }
    }
553. Internet Connections
- Clients and servers communicate by sending streams of bytes over connections
- Point-to-point, full-duplex (2-way communication), and reliable.
- A socket is an endpoint of a connection
- Socket address is an IPaddress:port pair
- A port is a 16-bit integer that identifies a process
- Ephemeral port: Assigned automatically on client when client makes a connection request
- Well-known port: Associated with some service provided by a server (e.g., port 80 is associated with Web servers)
- A connection is uniquely identified by the socket addresses of its endpoints (socket pair)
- (cliaddr:cliport, servaddr:servport)
56Internet Connections
- Clients and servers communicate by sending streams of bytes over connections.
- Connections are point-to-point, full-duplex (2-way communication), and reliable.
[Figure: a client connected to a server (port 80).]
Client socket address: 128.2.194.242:51213
Server socket address: 208.216.181.15:80
Connection socket pair: (128.2.194.242:51213, 208.216.181.15:80)
Client host address: 128.2.194.242
Server host address: 208.216.181.15
Note: 51213 is an ephemeral port allocated by the kernel
Note: 80 is a well-known port associated with Web servers
57Clients
- Examples of client programs
- Web browsers, ftp, telnet, ssh
- How does a client find the server?
- The IP address in the server socket address identifies the host (more precisely, an adapter on the host)
- The (well-known) port in the server socket address identifies the service, and thus implicitly identifies the server process that performs that service.
- Examples of well-known ports
- Port 7: Echo server
- Port 23: Telnet server
- Port 25: Mail server
- Port 80: Web server
58Using Ports to Identify Services
[Figure: a client on a client host sends service requests to server host 128.2.194.242, where the kernel delivers each request to the Web server (port 80) or the echo server (port 7) according to the port.]
Service request for 128.2.194.242:80 (i.e., the Web server)
Service request for 128.2.194.242:7 (i.e., the echo server)
59Servers
- Servers are long-running processes (daemons).
- Created at boot-time (typically) by the init process (process 1)
- Run continuously until the machine is turned off.
- Each server waits for requests to arrive on a well-known port associated with a particular service.
- Port 7: echo server
- Port 23: telnet server
- Port 25: mail server
- Port 80: HTTP server
- A machine that runs a server process is also often referred to as a server.
60Server Examples
- Web server (port 80)
- Resource: files/compute cycles (CGI programs)
- Service: retrieves files and runs CGI programs on behalf of the client
- FTP server (20, 21)
- Resource: files
- Service: stores and retrieves files
- Telnet server (23)
- Resource: terminal
- Service: proxies a terminal on the server machine
- Mail server (25)
- Resource: email spool file
- Service: stores mail messages in spool file
See /etc/services for a comprehensive list of the services available on a Linux machine.
61Sockets Interface
- Created in the early 80s as part of the original Berkeley distribution of Unix that contained an early version of the Internet protocols.
- Provides a user-level interface to the network.
- Underlying basis for all Internet applications.
- Based on client/server programming model.
62Overview of the Sockets Interface
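[Figure: sequence of sockets interface calls. Client: socket, connect (together forming open_clientfd), then rio_writen and rio_readlineb, then close. Server: socket, bind, listen (together forming open_listenfd), accept, then rio_readlineb and rio_writen, a final rio_readlineb that returns EOF when the client closes, then close, after which the server awaits a connection request from the next client. The client's connect sends the connection request that the server's accept is waiting for.]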
63Sockets
- What is a socket?
- To the kernel, a socket is an endpoint of communication.
- To an application, a socket is a file descriptor that lets the application read/write from/to the network.
- Remember: All Unix I/O devices, including networks, are modeled as files.
- Clients and servers communicate with each other by reading from and writing to socket descriptors.
- The main distinction between regular file I/O and socket I/O is how the application opens the socket descriptors.
64Socket Address Structures
- Generic socket address
- For address arguments to connect, bind, and accept.
- Necessary only because C did not have generic (void *) pointers when the sockets interface was designed.
- Internet-specific socket address
- Must cast (sockaddr_in *) to (sockaddr *) for connect, bind, and accept.

    struct sockaddr {
        unsigned short sa_family;   /* protocol family */
        char           sa_data[14]; /* address data. */
    };

    struct sockaddr_in {
        unsigned short sin_family;  /* address family (always AF_INET) */
        unsigned short sin_port;    /* port num in network byte order */
        struct in_addr sin_addr;    /* IP addr in network byte order */
        unsigned char  sin_zero[8]; /* pad to sizeof(struct sockaddr) */
    };
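Filling in an Internet-specific socket address and then viewing it through the generic type might look like the sketch below. The helper names fill_sockaddr_in and as_generic are hypothetical, and the address is taken from a dotted decimal string for simplicity.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* fill_sockaddr_in (hypothetical name): zero the structure, then store
 * the family, the port, and the IP address, both in network byte order.
 * Returns 0 on success, -1 if dotted is not a valid address. */
int fill_sockaddr_in(struct sockaddr_in *sa, const char *dotted,
                     unsigned short port)
{
    assert(sa != NULL && dotted != NULL);
    memset(sa, 0, sizeof(*sa));
    sa->sin_family = AF_INET;
    sa->sin_port = htons(port);
    if (inet_aton(dotted, &sa->sin_addr) == 0)
        return -1;
    return 0;
}

/* as_generic (hypothetical name): the cast that connect, bind, and
 * accept expect for their address argument. */
const struct sockaddr *as_generic(const struct sockaddr_in *sa)
{
    return (const struct sockaddr *)sa;
}
```

The family field sits at the start of both structures, which is what lets the receiving function inspect sa_family and decide how to interpret the rest.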
65Echo Client Main Routine
    #include "csapp.h"

    /* usage: ./echoclient host port */
    int main(int argc, char **argv)
    {
        int clientfd, port;
        char *host, buf[MAXLINE];
        rio_t rio;

        host = argv[1];
        port = atoi(argv[2]);

        clientfd = Open_clientfd(host, port);
        Rio_readinitb(&rio, clientfd);

        while (Fgets(buf, MAXLINE, stdin) != NULL) {
            Rio_writen(clientfd, buf, strlen(buf));
            Rio_readlineb(&rio, buf, MAXLINE);
            Fputs(buf, stdout);
        }
        Close(clientfd);
        exit(0);
    }
66Echo Client open_clientfd
    int open_clientfd(char *hostname, int port)
    {
        int clientfd;
        struct hostent *hp;
        struct sockaddr_in serveraddr;

        if ((clientfd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
            return -1; /* check errno for cause of error */

        /* Fill in the server's IP address and port */
        if ((hp = gethostbyname(hostname)) == NULL)
            return -2; /* check h_errno for cause of error */
        bzero((char *) &serveraddr, sizeof(serveraddr));
        serveraddr.sin_family = AF_INET;
        bcopy((char *)hp->h_addr,
              (char *)&serveraddr.sin_addr.s_addr, hp->h_length);
        serveraddr.sin_port = htons(port);

        /* Establish a connection with the server */
        if (connect(clientfd, (SA *) &serveraddr, sizeof(serveraddr)) < 0)
            return -1;
        return clientfd;
    }

This function opens a connection from the client to the server at hostname:port.
67Echo Client open_clientfd (socket)
- socket creates a socket descriptor on the client.
- AF_INET: indicates that the socket is associated with Internet protocols.
- SOCK_STREAM: selects a reliable byte stream connection.

    int clientfd;  /* socket descriptor */

    if ((clientfd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
        return -1; /* check errno for cause of error */

    ... (more)
68Echo Client open_clientfd (gethostbyname)
- The client then builds the server's Internet address.

    int clientfd;                  /* socket descriptor */
    struct hostent *hp;            /* DNS host entry */
    struct sockaddr_in serveraddr; /* server's IP address */
    ...

    /* fill in the server's IP address and port */
    if ((hp = gethostbyname(hostname)) == NULL)
        return -2; /* check h_errno for cause of error */
    bzero((char *) &serveraddr, sizeof(serveraddr));
    serveraddr.sin_family = AF_INET;
    bcopy((char *)hp->h_addr,
          (char *)&serveraddr.sin_addr.s_addr, hp->h_length);
    serveraddr.sin_port = htons(port);
69Echo Client open_clientfd (connect)
- Finally the client creates a connection with the server.
- Client process suspends (blocks) until the connection is created.
- After resuming, the client is ready to begin exchanging messages with the server via Unix I/O calls on descriptor clientfd.

    int clientfd;                  /* socket descriptor */
    struct sockaddr_in serveraddr; /* server address */
    typedef struct sockaddr SA;    /* generic sockaddr */
    ...

    /* Establish a connection with the server */
    if (connect(clientfd, (SA *)&serveraddr, sizeof(serveraddr)) < 0)
        return -1;
    return clientfd;
70Echo Server Main Routine
    int main(int argc, char **argv)
    {
        int listenfd, connfd, port, clientlen;
        struct sockaddr_in clientaddr;
        struct hostent *hp;
        char *haddrp;

        port = atoi(argv[1]); /* the server listens on a port passed
                                 on the command line */
        listenfd = open_listenfd(port);

        while (1) {
            clientlen = sizeof(clientaddr);
            connfd = Accept(listenfd, (SA *)&clientaddr, &clientlen);
            hp = Gethostbyaddr((const char *)&clientaddr.sin_addr.s_addr,
                               sizeof(clientaddr.sin_addr.s_addr), AF_INET);
            haddrp = inet_ntoa(clientaddr.sin_addr);
            printf("server connected to %s (%s)\n", hp->h_name, haddrp);
            echo(connfd);
            Close(connfd);
        }
    }
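The main routine calls echo(connfd), whose body is not shown here. As an illustration only, the sketch below shows one way such a routine could work with plain read and write: it copies everything it receives back to the same descriptor until the client closes its end. The name echo_sketch and the 512-byte buffer are assumptions, and this version does not use the RIO functions.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* echo_sketch (hypothetical name): read from connfd until EOF and write
 * every byte back to connfd, retrying on short counts and EINTR.
 * Returns the total number of bytes echoed, or -1 on error. */
ssize_t echo_sketch(int connfd)
{
    char buf[512];
    ssize_t total = 0;

    assert(connfd >= 0);
    for (;;) {
        ssize_t nread = read(connfd, buf, sizeof(buf));
        if (nread < 0) {
            if (errno == EINTR)
                continue;           /* interrupted: read again */
            return -1;
        }
        if (nread == 0)
            break;                  /* EOF: client closed its end */

        ssize_t off = 0;
        while (off < nread) {       /* write may return a short count */
            ssize_t nw = write(connfd, buf + off, (size_t)(nread - off));
            if (nw <= 0) {
                if (nw < 0 && errno == EINTR)
                    continue;
                return -1;
            }
            off += nw;
        }
        total += nread;
    }
    return total;
}
```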
71Echo Server open_listenfd
    int open_listenfd(int port)
    {
        int listenfd, optval=1;
        struct sockaddr_in serveraddr;

        /* Create a socket descriptor */
        if ((listenfd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
            return -1;

        /* Eliminates "Address already in use" error from bind. */
        if (setsockopt(listenfd, SOL_SOCKET, SO_REUSEADDR,
                       (const void *)&optval , sizeof(int)) < 0)
            return -1;

        ... (more)
72Echo Server open_listenfd (cont)
        ...

        /* Listenfd will be an endpoint for all requests to port
           on any IP address for this host */
        bzero((char *) &serveraddr, sizeof(serveraddr));
        serveraddr.sin_family = AF_INET;
        serveraddr.sin_addr.s_addr = htonl(INADDR_ANY);
        serveraddr.sin_port = htons((unsigned short)port);
        if (bind(listenfd, (SA *)&serveraddr, sizeof(serveraddr)) < 0)
            return -1;

        /* Make it a listening socket ready to accept
           connection requests */
        if (listen(listenfd, LISTENQ) < 0)
            return -1;

        return listenfd;
    }
73Echo Server open_listenfd(socket)
- socket creates a socket descriptor on the server.
- AF_INET indicates that the socket is associated with Internet protocols.
- SOCK_STREAM selects a reliable byte stream connection.
int listenfd; /* listening socket descriptor */

/* Create a socket descriptor */
if ((listenfd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
    return -1;
74Echo Server open_listenfd(setsockopt)
- The socket can be given some attributes.
- Handy trick that allows us to rerun the server immediately after we kill it.
- Otherwise we would have to wait about 15 secs.
- Eliminates "Address already in use" error from bind().
- Strongly suggest you do this for all your servers to simplify debugging.
...
/* Eliminates "Address already in use" error from bind(). */
if (setsockopt(listenfd, SOL_SOCKET, SO_REUSEADDR,
               (const void *)&optval, sizeof(int)) < 0)
    return -1;
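One way to confirm that the option took effect is to read it back with getsockopt. The sketch below is an illustration only; the helper names enable_reuseaddr and reuseaddr_enabled are hypothetical and do not appear in the slides.

```c
#include <assert.h>
#include <sys/socket.h>

/* Hypothetical helper: turn on SO_REUSEADDR for fd.
   Returns 0 on success, -1 on error. */
int enable_reuseaddr(int fd) {
    int optval = 1; /* nonzero enables the option */

    assert(fd >= 0);
    return setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &optval, sizeof(optval));
}

/* Hypothetical helper: read the option back.
   Returns 1 if set, 0 if clear, -1 on error. */
int reuseaddr_enabled(int fd) {
    int optval = 0;
    socklen_t len = sizeof(optval);

    assert(fd >= 0);
    if (getsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &optval, &len) < 0)
        return -1;
    return optval != 0;
}
```

The option must be set after socket and before bind for it to affect the bind call.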
75Echo Server open_listenfd (initialize socket address)
- Next, we initialize the socket address with the server's Internet address (IP address and port).
- IP address and port are stored in network (big-endian) byte order.
- htonl() converts longs from host byte order to network byte order.
- htons() converts shorts from host byte order to network byte order.
struct sockaddr_in serveraddr; /* server's socket addr */
...
/* listenfd will be an endpoint for all requests to port
   on any IP address for this host */
bzero((char *) &serveraddr, sizeof(serveraddr));
serveraddr.sin_family = AF_INET;
serveraddr.sin_addr.s_addr = htonl(INADDR_ANY);
serveraddr.sin_port = htons((unsigned short)port);
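To make the byte-order point concrete, the following sketch wraps the initialization in a hypothetical helper, init_any_addr, so the stored bytes can be inspected. It uses memset in place of bzero; the behavior is otherwise the same as the slide code.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>

/* Hypothetical helper: fill *addr with "any local IP address" and the
   given port, both stored in network (big-endian) byte order. */
void init_any_addr(struct sockaddr_in *addr, unsigned short port) {
    assert(addr != NULL);
    memset(addr, 0, sizeof(*addr));
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_ANY); /* 32-bit value */
    addr->sin_port = htons(port);              /* 16-bit value */
}
```

Port 5000 is 0x1388, so after init_any_addr(&a, 5000) the two bytes of a.sin_port in memory are 0x13 followed by 0x88 on any host, and ntohs(a.sin_port) recovers 5000.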
76Echo Server open_listenfd (bind)
- bind associates the socket with the socket
address we just created.
int listenfd;                  /* listening socket */
struct sockaddr_in serveraddr; /* server's socket addr */
...
/* listenfd will be an endpoint for all requests to port
   on any IP address for this host */
if (bind(listenfd, (SA *)&serveraddr, sizeof(serveraddr)) < 0)
    return -1;
77Echo Server open_listenfd (listen)
- listen indicates that this socket will accept connection (connect) requests from clients.
- We're finally ready to enter the main server loop that accepts and processes client connection requests.
int listenfd; /* listening socket */
...
/* Make it a listening socket ready to accept connection requests */
if (listen(listenfd, LISTENQ) < 0)
    return -1;
return listenfd;
78Echo Server Main Loop
- The server loops endlessly, waiting for
connection requests, then reading input from the
client, and echoing the input back to the client.
main() {
    /* create and configure the listening socket */

    while (1) {
        /* Accept(): wait for a connection request */
        /* echo(): read and echo input lines from client til EOF */
        /* Close(): close the connection */
    }
}
79Echo Server accept
- accept() blocks waiting for a connection request.
- accept returns a connected descriptor (connfd) with the same properties as the listening descriptor (listenfd).
- Returns when the connection between client and server is created and ready for I/O transfers.
- All I/O with the client will be done via the connected socket.
- accept also fills in the client's IP address.
int listenfd; /* listening descriptor */
int connfd;   /* connected descriptor */
struct sockaddr_in clientaddr;
int clientlen;

clientlen = sizeof(clientaddr);
connfd = Accept(listenfd, (SA *)&clientaddr, &clientlen);
80Echo Server accept Illustrated
1. Server blocks in accept, waiting for connection request on listening descriptor listenfd.
[Figure: Client (clientfd) and Server (listenfd(3)), not yet connected]
2. Client makes connection request by calling and blocking in connect.
[Figure: connection request from Client (clientfd) to Server (listenfd(3))]
3. Server returns connfd from accept. Client returns from connect. Connection is now established between clientfd and connfd.
[Figure: Client (clientfd) connected to Server (connfd(4)); listenfd(3) remains open]
81Connected vs. Listening Descriptors
- Listening descriptor
- End point for client connection requests.
- Created once and exists for lifetime of the server.
- Connected descriptor
- End point of the connection between client and server.
- A new descriptor is created each time the server accepts a connection request from a client.
- Exists only as long as it takes to service client.
- Why the distinction?
- Allows for concurrent servers that can communicate over many client connections simultaneously.
- E.g., each time we receive a new request, we fork a child to handle the request.
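The fork-per-request idea can be sketched as follows. This is an illustration under stated assumptions, not the course's server: accept_and_fork and send_greeting are hypothetical names, and the handler simply sends a fixed greeting instead of echoing.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical per-connection handler: send a fixed greeting. */
void send_greeting(int connfd) {
    static const char msg[] = "hello\n";

    if (write(connfd, msg, sizeof(msg) - 1) < 0)
        return; /* client went away; nothing more to do */
}

/* Hypothetical helper: accept one connection on listenfd and service it
   in a forked child. Returns the child's pid, or -1 on error. */
pid_t accept_and_fork(int listenfd, void (*handler)(int)) {
    int connfd;
    pid_t pid;

    assert(handler != NULL);
    connfd = accept(listenfd, NULL, NULL);
    if (connfd < 0)
        return -1;

    pid = fork();
    if (pid == 0) {      /* child: owns the connected descriptor */
        close(listenfd); /* child does not accept new connections */
        handler(connfd);
        close(connfd);
        _exit(0);
    }
    close(connfd);       /* parent keeps only the listening descriptor */
    return pid;          /* -1 if fork failed */
}
```

The parent must close its copy of connfd; otherwise the client never sees EOF, because the connection stays open until every descriptor that refers to it is closed. A long-running server would also reap finished children, for example with waitpid.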
82Echo Server Identifying the Client
- The server can determine the domain name and IP
address of the client.
struct hostent *hp; /* pointer to DNS host entry */
char *haddrp;       /* pointer to dotted decimal string */

hp = Gethostbyaddr((const char *)&clientaddr.sin_addr.s_addr,
                   sizeof(clientaddr.sin_addr.s_addr), AF_INET);
haddrp = inet_ntoa(clientaddr.sin_addr);
printf("server connected to %s (%s)\n", hp->h_name, haddrp);
83Echo Server echo
- The server uses RIO to read and echo text lines until EOF (end-of-file) is encountered.
- EOF notification caused by client calling close(clientfd).
- IMPORTANT: EOF is a condition, not a particular data byte.
void echo(int connfd) {
    size_t n;
    char buf[MAXLINE];
    rio_t rio;

    Rio_readinitb(&rio, connfd);
    while ((n = Rio_readlineb(&rio, buf, MAXLINE)) != 0) {
        printf("server received %d bytes\n", (int)n); /* cast: n is a size_t */
        Rio_writen(connfd, buf, n);
    }
}
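The point that EOF is a condition can be seen without RIO by using plain read and write. In this sketch, the hypothetical function echo_plain ends its loop when read returns 0, which happens once the peer has closed its end and all buffered data has been consumed. It does not retry on EINTR, which a robust version would.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical sketch: echo everything read from fd back to fd until EOF.
   Returns the total number of bytes echoed, or -1 on error. */
long echo_plain(int fd) {
    char buf[1024];
    long total = 0;
    ssize_t n;

    assert(fd >= 0);
    while ((n = read(fd, buf, sizeof(buf))) != 0) {
        ssize_t off = 0;

        if (n < 0)
            return -1;      /* read error */
        while (off < n) {   /* write may transfer fewer bytes than asked */
            ssize_t w = write(fd, buf + off, (size_t)(n - off));
            if (w < 0)
                return -1;
            off += w;
        }
        total += n;
    }
    return total;           /* read returned 0: EOF, no byte was delivered */
}
```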
84Testing Servers Using telnet
- The telnet program is invaluable for testing servers that transmit ASCII strings over Internet connections
- Our simple echo server
- Web servers
- Mail servers
- Usage
- unix> telnet <host> <portnumber>
- Creates a connection with a server running on <host> and listening on port <portnumber>.
85Testing the Echo Server With telnet
bass> echoserver 5000
server established connection with KITTYHAWK.CMCL (128.2.194.242)
server received 5 bytes: 123
server established connection with KITTYHAWK.CMCL (128.2.194.242)
server received 8 bytes: 456789

kittyhawk> telnet bass 5000
Trying 128.2.222.85...
Connected to BASS.CMCL.CS.CMU.EDU.
Escape character is '^]'.
123
123
Connection closed by foreign host.
kittyhawk> telnet bass 5000
Trying 128.2.222.85...
Connected to BASS.CMCL.CS.CMU.EDU.
Escape character is '^]'.
456789
456789
Connection closed by foreign host.
kittyhawk>
86Running the Echo Client and Server
bass> echoserver 5000
server established connection with KITTYHAWK.CMCL (128.2.194.242)
server received 4 bytes: 123
server established connection with KITTYHAWK.CMCL (128.2.194.242)
server received 7 bytes: 456789
...

kittyhawk> echoclient bass 5000
Please enter msg: 123
Echo from server: 123
kittyhawk> echoclient bass 5000
Please enter msg: 456789
Echo from server: 456789
kittyhawk>
87For More Information
- W. Richard Stevens, Unix Network Programming: Networking APIs: Sockets and XTI, Volume 1, Second Edition, Prentice Hall, 1998.
- THE network programming bible.