Title: CPEG 419
1CPEG 419
Introduction to Data Networking
- Review of Lecture 1 and continuation of chapter 1
2Announcements
- Homework 1 due next week
- Project 1 due next week
3Today
- Review and complete Chapter 1
- Start Chapter 2
4Packet Switching Case
- What is the probability of more than 100 users being active?
- It is the probability of 101 users being active, plus the probability of 102 users being active, ..., plus the probability of 200 users being active.
- We conclude that if there are 200 users, then things will work fine pretty much always.
- Suppose that there are 300 users
  - Might be acceptable performance
- Suppose that there are 400 users
- Therefore circuit switching could support 100 users, while packet switching can support 400 users. A factor of 4 more!
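The sum described above can be evaluated numerically. The sketch below assumes that users are active independently of one another and that each user is active with probability 0.1; that value is not given on the slide and is only an illustrative assumption, as is the function name `prob_more_than`.

```python
from math import comb


def prob_more_than(n_users: int, threshold: int, p_active: float) -> float:
    """P(more than `threshold` of `n_users` are active at once).

    Assumes each user is active independently with probability p_active,
    so the number of active users is binomially distributed.
    """
    return sum(
        comb(n_users, k) * p_active**k * (1 - p_active) ** (n_users - k)
        for k in range(threshold + 1, n_users + 1)
    )


P_ACTIVE = 0.1  # assumption: the slide does not state this value

for n in (200, 300, 400):
    print(n, "users:", prob_more_than(n, 100, P_ACTIVE))
```

With a different activity probability the conclusions for 300 and 400 users may change, so the numbers printed here should be read as an illustration of the method only.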
5Losses and delay in packet switched networks
- Losses
  - Transmission losses
    - In fiber links, the bit-error rate is 10^-12 or better (i.e., less).
    - What is the probability of packet error when there are 1400 bytes in a packet?
    - In wireless links, the bit-error rate can be very high
  - Congestion losses
    - If too many packets arrive at the same time, then the buffers will fill up and packets are lost.
    - Increasing the link speeds or reducing the number of users can reduce the probability of loss.
    - Increasing the size of the buffer reduces losses, but also increases delay.
- Delay
  - Queuing delay
  - Transmission delay
  - Propagation delay
  - Processing delay
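The packet-error question above can be worked out directly. This is a minimal sketch that assumes bit errors are independent, so a packet is received correctly only if every one of its bits is; the function name is hypothetical.

```python
import math


def packet_error_probability(bit_error_rate: float, packet_bytes: int) -> float:
    """Probability that at least one bit of the packet is in error.

    Computes 1 - (1 - bit_error_rate) ** bits using log1p/expm1 so that
    very small error rates do not get lost to floating-point rounding.
    """
    bits = 8 * packet_bytes
    return -math.expm1(bits * math.log1p(-bit_error_rate))


# Fiber link from the slide: bit-error rate 10^-12, 1400-byte packet.
print(packet_error_probability(1e-12, 1400))
```

For such a small bit-error rate the result is very close to the number of bits times the bit-error rate, i.e. about 11200 x 10^-12.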
6In the news
- News sources
- www.lightreading.com (general networks)
- www.unstrung.com (wireless and mobile)
- www.darkreading.com (network security)
- www.alleyinsider.com (general tech business news)
- arstechnica.com (general tech news)
7The Protocol Stack
- The application layer includes network applications and network application protocols
  - e.g., applications: web, IM, email
  - e.g., application protocols: OSCAR, HTTP, SMTP, FTP, DNS
- Provides a service to a user or another application.
- Requires service from the lower layers, but typically only interacts with the transport layer.
8The Protocol Stack
- The transport layer (typically) transports messages from and to applications
- Different transport layer protocols provide different types of services.
- Types of services MAY include
  - Reliability: the sender application can be assured that the data is correctly received, or else it receives an error message.
  - Congestion and flow control: attempt to send data quickly, but not so quickly as to cause congestion in the network or at the receiving host
  - Error detection / correction
  - In-order delivery
  - Breaking long messages into small chunks suitable for transmission over the network
  - Multiplexing, so that multiple transport layer connections can occur simultaneously
- Note that when a transport protocol provides these services, the application does not have to.
  - This makes implementation of applications easier.
  - This allows careful design of transport protocols, following the divide and conquer approach
- The transport layer uses the network layer to deliver packets, but does not require any type of service guarantees from the network layer
  - In practice, the transport layer hopes for in-order delivery.
9Transport layer protocols TCP and UDP
- TCP and UDP are the most widely used transport protocols.
  - Other protocols include SCTP (UD and Cisco are active in developing SCTP) and RTP (for multimedia such as VoIP)
- TCP and UDP will be covered in great detail later. But for now:
- TCP provides many services
  - Congestion control
  - Flow control
  - Reliability
  - Multiplexing
  - Error detection
- UDP provides few services
  - Error detection
  - Multiplexing
  - The application must implement any other services that it requires.
- TCP requires a connection to be established; UDP does not
10Transport Multiplexing
- Transport layers use ports to provide multiplexing
- Two hosts can have multiple simultaneous connections by using ports.
- Well-known ports can be used to specify a particular application
  - E.g., web servers will accept TCP connections on port 80
- A host can have two connections with a web server by using different ports
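[Figure: a host and a host acting as web server, each with separate TCP and UDP port ranges running from 0 to 2^16 - 1. The labeled ports are 4567 and 4568 on the host and 80 on the web server.]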
11Sockets: gateway between the app layer and the transport layer
- process sends/receives messages to/from its socket
- socket is analogous to a door
  - sending process shoves message out the door
  - sending process relies on transport infrastructure on the other side of the door, which brings the message to the socket at the receiving process
12TCP Sockets
- An application accesses TCP and UDP through sockets.
- TCP is connection based, so one host must be listening and the other must be connecting (calling)
- The basic steps for a TCP listener
  - Define socket variable as a TCP socket
  - Bind socket to a port (the bind function)
    - If some other application is or was recently (120 sec) listening on this port, this function will fail.
    - The application must check that this command succeeds.
  - Listen on this port (the listen function) and wait for a connection (the accept function)
    - When the other host connects, the wait completes and data can be sent or received.
  - Close socket
- Basic steps for TCP caller
  - Define socket variable as a TCP socket
    - No local port is given; the OS will assign whichever port is available. The application normally does not choose this port.
  - Connect
  - Send data
  - Close socket
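The listener and caller steps above can be sketched with Python's standard `socket` module. This is an illustration only and not the code used in Project 1: it runs both ends on the loopback address in one program, binds the listener to port 0 so that the OS picks a free port (an assumption made to keep the demo from colliding with other programs), and the function names are hypothetical.

```python
import socket
import threading


def serve_one_connection(listener: socket.socket) -> None:
    """Listener side: wait for one caller, echo its data back in upper case."""
    conn, _caller_addr = listener.accept()  # blocks until a caller connects
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())


def tcp_round_trip(message: bytes) -> bytes:
    # Listener: define a TCP socket, bind it, listen.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        # bind() raises OSError if the port cannot be used, so a failure
        # cannot go unnoticed. Port 0 asks the OS for any free port.
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        port = listener.getsockname()[1]
        worker = threading.Thread(target=serve_one_connection, args=(listener,))
        worker.start()

        # Caller: define a TCP socket, connect, send, close.
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as caller:
            caller.connect(("127.0.0.1", port))
            caller.sendall(message)
            reply = caller.recv(1024)
        worker.join()
    return reply


print(tcp_round_trip(b"hello"))
```

The caller never names its own port; the OS assigns one when `connect` is called.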
13UDP Sockets
- UDP is connectionless.
  - A host sends a packet when it wants.
  - There is no concept of one host connecting to another.
  - There is only the concept of one host sending a packet and the other host receiving the packet. And either host can send or receive
- Steps to send and then receive a UDP message
  - Define socket as a UDP socket
  - Bind socket to a port
    - If this port is in use, bind will fail
  - Send message
  - Wait for message
    - There are two ways to wait for messages, blocking or non-blocking
    - A blocking function will wait for a message to arrive. It might wait forever.
    - A non-blocking function will return immediately, but if no message was waiting in the transport layer, then no message is returned
    - The select function allows a timeout to be set, so the function will wait until a message arrives or the timeout elapses.
  - Close socket
- Steps to receive a UDP message
  - Define socket as a UDP socket
  - Bind socket to a port
    - If this port is in use, bind will fail
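Waiting for a UDP message with a timeout, as described above, might look like the following sketch. It uses Python's standard `socket` and `select` modules on the loopback address; the function names, the port choice (0, meaning any free port) and the buffer size are assumptions made for the example.

```python
import select
import socket
from typing import Optional


def wait_for_message(sock: socket.socket, timeout_s: float) -> Optional[bytes]:
    """Wait until a datagram arrives or the timeout elapses.

    Returns the datagram payload, or None if nothing arrived in time.
    """
    readable, _, _ = select.select([sock], [], [], timeout_s)
    if not readable:
        return None
    data, _sender_addr = sock.recvfrom(2048)
    return data


def udp_exchange(message: bytes) -> Optional[bytes]:
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))  # raises OSError if the bind fails
        # No connection is set up: the sender simply sends a packet.
        sender.sendto(message, receiver.getsockname())
        return wait_for_message(receiver, timeout_s=1.0)


print(udp_exchange(b"ping"))
```

Calling `recvfrom` directly, without `select`, would be the blocking variant that might wait forever.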
14Project 1
Due 9/16
- In this project, messages will be sent over TCP and UDP.
- The project description is currently at
  - http://www.eecis.udel.edu/bohacek/Classes/CPEG419_2005/Proj1/project1_part1.htm
- All the required information should be online.
- This project can be completed by cutting and pasting from the web site. But try to understand the steps.
- Let me know if there are typos.
15The Protocol Stack
- The network layer routes packets (datagrams) through the network
- The network layer gets packets from the transport layer or from the link layer.
- Depending on the destination address, the network layer will give the packet to the transport protocol or to a specific link layer to send on a specific link
- The network layer also provides fragmenting of a large packet into chunks suitable for the link layer
16The Protocol Stack
- The link layer moves packets (frames) between two hosts
- However, the link layer may provide a wide range of services, including
  - Media access control
  - Error detection / correction
  - Routing over layer 2 networks
  - Reliability (where the network layer is informed if the transmission fails)
17The Protocol Stack
- The physical layer moves packets (frames) between two connected hosts
- This requires putting the bits onto a physical medium and decoding them from the medium.
- In this course we mostly neglect the physical layer and assume that it works correctly (each layer always assumes that the other layers work correctly)
- But the performance of a protocol at one layer is often dependent on the other layers.
  - One approach is cross-layer design
18Encapsulation
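[Figure: encapsulation. The source and destination each show the application, transport, network, link, and physical layers. The data unit is called a message, segment, datagram, and frame as it moves down the stack, and it passes through a switch and a router between source and destination.]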
19Chapter 2 The Application Layer
20Goals of this Chapter
- To understand how common application protocols work
  - Web (HTTP)
  - Email (SMTP)
  - FTP
  - DNS
  - P2P
  - IM
- To understand the design alternatives for applications
  - A network application runs on many hosts; it is a distributed application
  - This chapter discusses several designs of distributed applications
21Road Map
- Application basics
- Web
- Email
- FTP
- DNS
- P2P
- Graph theory
- State diagrams
- P2P design
- IM
23Creating a network app
- write programs that
  - run on (different) end systems
  - communicate over the network
  - e.g., web server software communicates with browser software
- No need to write software for network-core devices
  - Network-core devices do not run user applications
  - keeping applications on end systems allows for rapid app development and propagation
24An App-layer protocol defines
- Types of messages exchanged
  - e.g., request, response
- Message syntax
  - what fields are in messages and how fields are delineated
- Message semantics
  - meaning of information in fields
- Rules for when and how processes send and respond to messages
- Public-domain protocols
  - defined in RFCs
  - allows for interoperability
  - e.g., HTTP, SMTP
- Proprietary protocols
  - e.g., Skype
25Ports
- An application is identified by the host's IP address, transport protocol, and port
  - E.g., a web server has a particular IP address and listens with TCP on port 80.
  - A web browser on a host will connect and request a file from the web server. The browser is identified by the host's IP address and a TCP port.
26What transport service does an app need?
- Throughput
  - some apps (e.g., multimedia) require a minimum amount of throughput to be useful (i.e., in order for the user to gain utility)
  - other apps (elastic apps) make use of whatever throughput they get
- Security
  - Encryption, data integrity, ...
- Data reliability
  - some apps (e.g., audio) can tolerate some loss
  - other apps (e.g., file transfer, telnet) require 100% reliable data transfer
- Timing
  - some apps (e.g., Internet telephony, interactive games) require low delay to be effective
27Transport service requirements of common apps
Application | Throughput | Time sensitive | Data loss
file transfer | elastic | no | no loss
e-mail | elastic | no | no loss
Web documents | somewhat elastic | not really | no loss
real-time audio/video | audio: 5 kbps-1 Mbps, video: 10 kbps-5 Mbps | yes, 100s msec | loss-tolerant
stored audio/video | same as above | yes, few secs | loss-tolerant
interactive games | few kbps up | yes, 100s msec | loss-tolerant
instant messaging | elastic | yes and no | no loss
28Internet transport protocols services
- TCP service
  - connection-oriented: setup required between client and server processes
  - reliable transport between sending and receiving process
  - flow control: sender won't overwhelm receiver
  - congestion control: throttle sender when network is overloaded
  - does not provide: timing, minimum throughput guarantees, security
- UDP service
  - unreliable data transfer between sending and receiving process
  - does not provide: reliability, flow control, congestion control, timing, throughput guarantee, or security
  - Does not require connection set-up
  - Packets can be sent at any rate desired (but this might cause considerable congestion)
29Internet apps application, transport protocols
Application | Application layer protocol | Underlying transport protocol
e-mail | SMTP (RFC 2821) | TCP
remote terminal access | Telnet (RFC 854) | TCP
Web | HTTP (RFC 2616) | TCP
file transfer | FTP (RFC 959) | TCP
streaming multimedia | HTTP (e.g., YouTube), RTP (RFC 1889) | TCP or UDP
Internet telephony | SIP, RTP, proprietary (e.g., Skype) | typically UDP
30Road Map
- Application basics
- Web
- Email
- FTP
- DNS
- P2P
- Graph theory
- State diagrams
- P2P design
- IM
31Web and HTTP
- Web page consists of objects
- Object can be HTML file, JPEG image, Java applet, audio file, ...
- Web page consists of a base HTML file which includes several referenced objects
  - The browser first requests the base file
  - The base file specifies text and URLs of objects
  - The browser requests these objects, wherever they are (not always on the same server)
  - HTTP is used to request the base file and all the other files
  - Note that HTTP can be used for other applications besides the web
- Each object is addressable by a URL
- Example URL
32HTTP overview
- HTTP: hypertext transfer protocol
- Web's application layer protocol
- client/server model
  - client: browser that requests, receives, and displays Web objects
  - server: Web server sends objects in response to requests
33HTTP overview (continued)
- Uses TCP
  - client initiates TCP connection (creates socket) to server, port 80
  - server accepts TCP connection from client
  - HTTP messages (application-layer protocol messages) exchanged between browser (HTTP client) and Web server (HTTP server)
  - TCP connection closed
- HTTP is stateless
  - server maintains no information about past client requests
- Aside: protocols that maintain state are complex!
  - past history (state) must be maintained
  - if server/client crashes, their views of state may be inconsistent and must be reconciled
34HTTP connections
- Nonpersistent HTTP
- At most one object is sent over a TCP connection.
- Persistent HTTP
- Multiple objects can be sent over single TCP
connection between client and server.
35Nonpersistent HTTP
- Suppose user enters URL www.someSchool.edu/someDepartment/home.index (contains text, references to 10 jpeg images)
  - 1a. HTTP client initiates TCP connection to HTTP server (process) at www.someSchool.edu on port 80
  - 1b. HTTP server at host www.someSchool.edu, waiting for TCP connection at port 80, accepts connection, notifying client
  - 2. HTTP client sends HTTP request message (containing URL) into TCP connection socket. Message indicates that client wants object someDepartment/home.index
  - 3. HTTP server receives request message, forms response message containing requested object, and sends message into its socket
  - 4. HTTP server closes TCP connection.
  - 5. HTTP client receives response message containing html file, displays html. Parsing html file, finds 10 referenced jpeg objects
  - 6. Steps 1-5 repeated for each of 10 jpeg objects
36Non-Persistent HTTP Response time
- Definition of RTT: time for a small packet to travel from client to server and back.
- Response time
  - one RTT to initiate TCP connection
  - one RTT for HTTP request and first few bytes of HTTP response to return
  - file transmission time
  - total = 2 RTT + transmit time
[Figure: client/server timeline. One RTT to initiate the TCP connection, one RTT to request the file, then the file is received.]
37Persistent HTTP
- Nonpersistent HTTP issues
  - requires 2 RTTs per object
  - OS overhead for each TCP connection
  - browsers often open parallel TCP connections to fetch referenced objects
- Persistent HTTP
  - server leaves connection open after sending response
  - subsequent HTTP messages between same client/server sent over open connection
  - client sends requests as soon as it encounters a referenced object
  - as little as one RTT for all the referenced objects
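The difference between the two connection styles can be estimated with a rough model. The sketch below assumes that nonpersistent HTTP fetches objects strictly one after another (no parallel connections), that persistent HTTP sends all requests for referenced objects back to back so they cost one RTT in total, and that every object has the same transmission time. The RTT value and the function names are illustrative assumptions.

```python
def nonpersistent_time(rtt: float, n_referenced: int, transmit: float = 0.0) -> float:
    """Base file plus n referenced objects, each costing 2 RTT + transmit time."""
    return (1 + n_referenced) * (2 * rtt + transmit)


def persistent_time(rtt: float, n_referenced: int, transmit: float = 0.0) -> float:
    """2 RTT + transmit for the base file, then one RTT for all referenced objects."""
    base_file = 2 * rtt + transmit
    referenced = rtt + n_referenced * transmit
    return base_file + referenced


RTT = 0.1  # seconds; assumed value for illustration
print("nonpersistent:", nonpersistent_time(RTT, 10))
print("persistent:   ", persistent_time(RTT, 10))
```

With 10 referenced objects and negligible transmission time, the model gives 22 RTTs for serial nonpersistent HTTP against 3 RTTs for persistent HTTP.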
38HTTP request message
- two types of HTTP messages: request, response
- HTTP request message
  - ASCII (human-readable format)

request line (GET, POST, HEAD commands):
    GET /somedir/page.html HTTP/1.1
header lines:
    Host: www.someschool.edu
    User-agent: Mozilla/4.0
    Connection: close
    Accept-language: fr
    (extra carriage return, line feed)
Carriage return, line feed on a line by itself indicates end of message
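A request in the format shown above is just ASCII text with CRLF line endings, so it can be assembled by hand. This is a small sketch; the helper name `build_get_request` is hypothetical.

```python
from typing import Dict, Optional


def build_get_request(host: str, path: str,
                      extra_headers: Optional[Dict[str, str]] = None) -> bytes:
    """Build the bytes of an HTTP/1.1 GET request.

    The request line comes first, then one header per line, then an
    empty line (an extra CRLF) that marks the end of the message.
    """
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (extra_headers or {}).items():
        lines.append(f"{name}: {value}")
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


request = build_get_request(
    "www.someschool.edu",
    "/somedir/page.html",
    {"User-agent": "Mozilla/4.0", "Connection": "close", "Accept-language": "fr"},
)
print(request.decode("ascii"))
```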
39HTTP request message general format
40HTTP response message
status line (protocol, status code, status phrase):
    HTTP/1.1 200 OK
header lines:
    Connection: close
    Date: Thu, 06 Aug 1998 12:00:15 GMT
    Server: Apache/1.3.0 (Unix)
    Last-Modified: Mon, 22 Jun 1998 ...
    Content-Length: 6821
    Content-Type: text/html
data, e.g., requested HTML file:
    data data data data data ...
41HTTP response status codes
In first line of the server->client response message. A few sample codes:
- 200 OK
  - request succeeded, requested object later in this message
- 301 Moved Permanently
  - requested object moved, new location specified later in this message (Location)
- 400 Bad Request
  - request message not understood by server
- 404 Not Found
  - requested document not found on this server
- 505 HTTP Version Not Supported
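Because the status line has a fixed shape (protocol, status code, status phrase), a client can split it apart before looking at anything else in the response. A minimal sketch, with a hypothetical function name:

```python
from typing import Tuple


def parse_status_line(line: str) -> Tuple[str, int, str]:
    """Split 'HTTP/1.1 200 OK' into (protocol, status code, status phrase)."""
    # Split on the first two spaces only: the phrase may itself contain spaces.
    protocol, code, phrase = line.rstrip("\r\n").split(" ", 2)
    return protocol, int(code), phrase


for sample in ("HTTP/1.1 200 OK",
               "HTTP/1.1 301 Moved Permanently",
               "HTTP/1.1 505 HTTP Version Not Supported"):
    print(parse_status_line(sample))
```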
42Trying out HTTP (client side) for yourself
- 1. Telnet to your favorite Web server:
    telnet cis.poly.edu 80
  Opens a TCP connection to port 80 (default HTTP server port) at cis.poly.edu. Anything typed in is sent to port 80 at cis.poly.edu.
- 2. Type in a GET HTTP request:
    GET /ross/ HTTP/1.1
    Host: cis.poly.edu
  By typing this in (hit carriage return twice), you send this minimal (but complete) GET request to the HTTP server.
- 3. Look at the response message sent by the HTTP server!
43Wireshark (ethereal)
- Wireshark captures all packets that pass through the host's interface
- To run Wireshark, libpcap (Linux) or WinPcap (Windows) must be installed. It comes with the Wireshark package
- Then, run Wireshark
- Select Capture
- Find the active interface
  - E.g., not generic dialup, nor VPN, nor packet scheduler, but wireless, with an IP address
- Then select prepare
- Let's watch TCP packets on port 80
  - Next to capture filter, enter: tcp port 80
- Select update in realtime and autoscroll
- Might need to enable or disable capture in promiscuous mode
- Press start
- Press close
- Load www.eecis.udel.edu page in browser
- Press stop in Wireshark
- Find HTTP request to 128.4.40.10.
- Right click and select follow TCP stream
44Web caches (proxy server)
Goal: reduce network utilization by satisfying client requests without involving the origin server
- user sets browser: Web accesses via cache
- browser sends all HTTP requests to cache
  - if object is in cache: cache returns object
  - else: cache requests object from origin server, then returns object to client
[Figure: two clients send requests to a proxy server, which contacts two origin servers when needed.]
45More about Web caching
- cache acts as both client and server
- typically cache is installed by ISP (university, company, residential ISP)
- Why Web caching?
  - reduce response time for client request
  - reduce traffic on an institution's access link.
  - An Internet dense with caches enables "poor" content providers to effectively deliver content (but so does P2P file sharing)
46Caching example
- Assumptions
  - average object size = 100,000 bits
  - avg. request rate from institution's browsers to origin servers = 15/sec
  - delay from institutional router to any origin server and back to router = 2 sec
- Consequences
  - utilization on LAN = 15%
  - utilization on access link = 100%
  - total delay = Internet delay + access delay + LAN delay = 2 sec + minutes + milliseconds
[Figure: origin servers on the public Internet, a 1.5 Mbps access link, and an institutional network with a 10 Mbps LAN and an institutional cache.]
47Caching example (cont)
- possible solution
  - increase bandwidth of access link to, say, 10 Mbps
- consequence
  - utilization on LAN = 15%
  - utilization on access link = 15%
  - total delay = Internet delay + access delay + LAN delay = 2 sec + msecs + msecs
  - often a costly upgrade
[Figure: origin servers on the public Internet, a 10 Mbps access link, and an institutional network with a 10 Mbps LAN and an institutional cache.]
48Caching example (cont)
- possible solution: install cache
  - suppose hit rate is 0.4
- consequence
  - 40% of requests will be satisfied almost immediately
  - 60% of requests satisfied by origin server
  - utilization of access link reduced to 60%, resulting in negligible delays (say 10 msec)
  - total avg delay = Internet delay + access delay + LAN delay = 0.6 * (2.01) secs + 0.4 * milliseconds < 1.4 secs
[Figure: origin servers on the public Internet, a 1.5 Mbps access link, and an institutional network with a 10 Mbps LAN and an institutional cache.]
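The numbers in the caching example follow from simple arithmetic, reproduced in the sketch below. The constants come from the slides; the 10 msec delay for a cache hit is an assumption standing in for the "milliseconds" on the slide, and the function names are hypothetical.

```python
OBJECT_BITS = 100_000        # average object size
REQUESTS_PER_SEC = 15        # average request rate
LAN_BPS = 10_000_000         # 10 Mbps LAN
ACCESS_BPS = 1_500_000       # 1.5 Mbps access link
MISS_DELAY_S = 2.01          # 2 sec Internet delay + about 10 msec access delay
HIT_DELAY_S = 0.01           # assumed: a cache hit costs about 10 msec


def utilization(link_bps: float, requests_per_sec: float) -> float:
    """Fraction of the link capacity used by the offered traffic."""
    return requests_per_sec * OBJECT_BITS / link_bps


def average_delay(hit_rate: float) -> float:
    """Average delay when a fraction hit_rate of requests is served by the cache."""
    return (1 - hit_rate) * MISS_DELAY_S + hit_rate * HIT_DELAY_S


print("LAN utilization:            ", utilization(LAN_BPS, REQUESTS_PER_SEC))
print("access link, no cache:      ", utilization(ACCESS_BPS, REQUESTS_PER_SEC))
print("access link, hit rate 0.4:  ", utilization(ACCESS_BPS, 0.6 * REQUESTS_PER_SEC))
print("average delay, hit rate 0.4:", average_delay(0.4))
```

The average delay with the cache comes out a little above 1.2 seconds, consistent with the bound of 1.4 seconds stated above.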
49Conditional GET
- Goal: don't send object if cache has up-to-date cached version
- cache: specify date of cached copy in HTTP request
  - If-modified-since: <date>
- server: response contains no object if cached copy is up-to-date
  - HTTP/1.0 304 Not Modified
[Figure: two exchanges between cache and server. Object not modified: the cache sends an HTTP request msg with If-modified-since: <date> and the server answers HTTP/1.0 304 Not Modified. Object modified: the cache sends an HTTP request msg with If-modified-since: <date> and the server answers HTTP/1.0 200 OK followed by <data>.]
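The server-side decision in a conditional GET is a single date comparison. The sketch below uses Python's standard `email.utils` helpers for the date format seen in HTTP headers; the function names and the example dates are illustrative assumptions, not part of any particular server.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def if_modified_since_value(cached_copy_time: datetime) -> str:
    """Format the date of the cached copy for an If-modified-since header.

    cached_copy_time must be a timezone-aware datetime in UTC.
    """
    return format_datetime(cached_copy_time, usegmt=True)


def conditional_get_status(last_modified: datetime, header_value: str) -> str:
    """Status line the server would send for a conditional GET."""
    cached_copy_time = parsedate_to_datetime(header_value)
    if last_modified <= cached_copy_time:
        return "HTTP/1.0 304 Not Modified"  # no object in the response
    return "HTTP/1.0 200 OK"                # object follows in the response


cached = datetime(1998, 8, 6, 12, 0, 15, tzinfo=timezone.utc)
header = if_modified_since_value(cached)
print("If-modified-since:", header)
print(conditional_get_status(datetime(1998, 6, 22, tzinfo=timezone.utc), header))
print(conditional_get_status(datetime(1998, 8, 7, tzinfo=timezone.utc), header))
```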
50Road Map
- Application basics
- Web
- FTP
- Email
- DNS
- P2P
- Graph theory
- State diagrams
- P2P design
- IM
51FTP the file transfer protocol
[Figure: a user at a host, with a local file system, transfers files to and from a remote file system.]
- transfer file to/from remote host
- client/server model
  - client: side that initiates transfer (either to/from remote)
  - server: remote host
- ftp: RFC 959
- ftp server listens on port 21
52FTP is weird: separate control and data connections
- FTP client contacts FTP server at port 21; TCP is the transport protocol
- client is authorized over the control connection
  - This is done in clear text (i.e., unencrypted)
  - So if someone is sniffing packets, your password might be learned.
  - Sniffing packets is difficult on Ethernet, encrypted WiFi, and DSL, but is possible on cable modems
- client browses the remote directory by sending commands over the control connection.
- Data is transferred over different connections. Two approaches:
  - Active
  - Passive
- Active mode is a problem for firewalls
  - If my desktop is not a server, it should not receive any requests for connections.
  - But FTP servers will make such requests
- Active
  - The client opens a TCP socket on some port (port number > 1024)
  - The client sends the server the port
  - The server connects to the client's port, where the server's source port is 20
53FTP Passive mode
- When a file is to be transferred, the server opens a port (number > 1024 and not 20)
- The server sends this port number over the command connection
- The client connects to the server on this port.
- Drawback of passive
  - Some enterprises (companies) like to control which applications are used
    - E.g., web browsing is ok, but Skype is not
  - One way to do this is to block outgoing connections based on the port.
  - However, this will cause FTP to fail, unless the device that blocks connections is smart
54Road Map
- Application basics
- Web
- FTP
- Email
- DNS
- P2P
- Graph theory
- State diagrams
- P2P design
- IM
55Email Protocol Design
- Basic assumption: weak user agents and strong mail servers
  - The user wants to send the mail and leave
  - The user wants to get the mail
  - The user may come and go whenever (e.g., roaming laptop)
  - It should be possible to send mail to a user even if the two users are never online at the same time.
  - We conclude that there must be a middle man/mail server.
- Servers are not that strong: the protocol must be as robust as possible to servers being offline
- No single server. Why?
  - Single point of failure
  - The server would have to be too big (congestion)
  - We conclude that there should be many mail servers
- Two types of hosts
  - Users
  - Mail servers
- Each user has a mailbox in its mail server
  - Users retrieve mail from their mail server at their convenience
  - Users give mail to their mail servers to deliver the mail
57Email Protocol Design
- Two types of hosts
  - Users
  - Mail servers
- Each user has a mailbox in its mail server
  - Users retrieve mail from their mail server at their convenience
  - Users give mail to their mail servers to deliver the mail
- Mail servers communicate with
  - The users that have mailboxes in the server
  - Other mail servers
[Figure: path of a mail message, with protocol labels SMTP, SMTP, and POP3/IMAP]
- User composes mail and sends it to its mail server (or a mail server that will send mail for it)
- Mail server finds the destination mail server and attempts to send the mail
- Destination user requests emails from mailbox
- Destination server gives mails to user
58Electronic Mail Details
- Three major components
  - user agents
  - mail servers
  - simple mail transfer protocol: SMTP
- User Agent
  - a.k.a. mail reader
  - composing, editing, reading mail messages
  - e.g., Eudora, Outlook, elm, Mozilla Thunderbird
  - Puts outgoing messages on server (with SMTP)
  - Gets incoming messages from server
[Figure: mail servers, each with user mailboxes and an outgoing message queue, connected to each other and to user agents by SMTP.]
59Electronic Mail mail servers
- Mail Servers
  - mailbox contains incoming messages for user
  - message queue of outgoing (to be sent) mail messages
  - SMTP protocol between mail servers to send email messages
    - client: sending mail server
    - server: receiving mail server
  - Reliable: makes several attempts and provides notification if delivery fails
60Electronic Mail SMTP RFC 2821
- uses TCP to reliably transfer email message from client to server, port 25
- direct transfer: sending server to receiving server
- Emails are pushed to servers (but users pull messages from servers)
- three phases of transfer
  - handshaking (greeting)
  - transfer of messages
  - closure
- command/response interaction
  - commands: ASCII text
  - response: status code and phrase
- messages must be in 7-bit ASCII
  - Makes it difficult to send attachments
61Scenario Alice sends message to Bob
- 1) Alice uses UA to compose message and address it to bob@someschool.edu
- 2) Alice's UA sends message to her mail server; message placed in message queue
- 3) Client side of SMTP opens TCP connection with Bob's mail server
- 4) SMTP client sends Alice's message over the TCP connection
- 5) Bob's mail server places the message in Bob's mailbox
- 6) Bob invokes his user agent to read message
62Sample SMTP interaction
Client connects to server
    S: 220 hamburger.edu
    C: HELO crepes.fr
    S: 250 Hello crepes.fr, pleased to meet you
    C: MAIL FROM: <alice@crepes.fr>
    S: 250 alice@crepes.fr... Sender ok
    C: RCPT TO: <bob@hamburger.edu>
    S: 250 bob@hamburger.edu ... Recipient ok
    C: DATA
    S: 354 Enter mail, end with "." on a line by itself
    C: Do you like ketchup?
    C: How about pickles?
    C: .
    S: 250 Message accepted for delivery
    C: QUIT
    S: 221 hamburger.edu closing connection
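The client half of the dialogue above follows a fixed command order, which the sketch below generates as a list of lines. It only builds the text and does not talk to a server. As a simplification it assumes that no body line consists of a single "." or starts with one, since such lines need extra escaping in real SMTP. The function name is hypothetical.

```python
from typing import List


def smtp_client_lines(helo_domain: str, sender: str, recipient: str,
                      body_lines: List[str]) -> List[str]:
    """Return the lines an SMTP client sends, in order, each ending in CRLF."""
    lines = [
        f"HELO {helo_domain}",
        f"MAIL FROM: <{sender}>",
        f"RCPT TO: <{recipient}>",
        "DATA",
    ]
    lines.extend(body_lines)
    lines.append(".")     # a lone "." marks the end of the message
    lines.append("QUIT")
    return [line + "\r\n" for line in lines]


for line in smtp_client_lines("crepes.fr", "alice@crepes.fr", "bob@hamburger.edu",
                              ["Do you like ketchup?", "How about pickles?"]):
    print("C:", line, end="")
```

In a real exchange the client waits for the server's status code after each command before sending the next one.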
63Try SMTP interaction for yourself
- telnet mail.eecis.udel.edu 25
- see 220 reply from server
- enter HELO, MAIL FROM, RCPT TO, DATA, QUIT commands
- the above lets you send email without using an email client (reader)
64SMTP final words
- SMTP uses persistent connections
- SMTP requires message (header & body) to be in 7-bit ASCII
- SMTP server uses CRLF.CRLF to determine end of message
- Comparison with HTTP
  - HTTP: pull
  - SMTP: push
  - both have ASCII command/response interaction, status codes
  - HTTP: each object encapsulated in its own response msg
  - SMTP: multiple objects sent in multipart msg
65Mail access
- POP3 and IMAP are two protocols for accessing mail on a mail server
- Web-based mail works differently: the web mail server and the mail server can be integrated, so that there is no user agent.
66Mail access protocols
[Figure: mail reaches the receiver's mail server by SMTP and is retrieved from it with an access protocol.]
- SMTP: delivery/storage to receiver's server
- Mail access protocol: retrieval from server
  - POP: Post Office Protocol (RFC 1939)
    - authorization (agent <--> server) and download
  - IMAP: Internet Mail Access Protocol (RFC 1730)
    - more features (more complex)
    - manipulation of stored msgs on server
  - HTTP: Gmail, Hotmail, Yahoo! Mail, etc.
67Road Map
- Application basics
- Web
- FTP
- Email
- DNS
- P2P
- Graph theory
- State diagrams
- P2P design
- IM
68DNS domain name system
- Translates names, like www.yahoo.com, into IP addresses.
- Services provided by DNS
  - Name to address translation
  - Host aliasing
    - A host relay1.west-coast.yahoo.com could have two aliases, yahoo.com and www.yahoo.com.
    - In this case, the canonical hostname is relay1.west-coast.yahoo.com.
    - DNS can provide canonical host names
  - Mail server aliasing
    - When a mail server wants to send mail to Me@udel.edu, it does not send it to www.udel.edu, but to mail.udel.edu. Or maybe udmail.udel.edu. DNS can translate udel.edu to mail.udel.edu
  - (Cheap) Load distribution
    - Cnn.com has several servers.
    - DNS will respond with all addresses, but it will reorder the addresses every time.
    - If each client uses the first address listed, then different clients will use different servers.
    - Content distribution networks (CDN) are better ways of load balancing
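The reordering trick for load distribution can be sketched as a rotating list: every answer contains all addresses, but the starting point moves by one each time. The class name is hypothetical and the addresses are placeholders from the 192.0.2.0/24 documentation range, not real server addresses.

```python
from collections import deque
from typing import Iterable, List


class RotatingRecordSet:
    """Answers every query with all addresses, rotated by one each time."""

    def __init__(self, addresses: Iterable[str]) -> None:
        self._addresses = deque(addresses)

    def answer(self) -> List[str]:
        response = list(self._addresses)
        self._addresses.rotate(-1)  # next query starts with the next address
        return response


# Placeholder addresses for illustration only.
records = RotatingRecordSet(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
for _ in range(4):
    # A client that always uses the first address lands on a different server.
    print(records.answer()[0])
```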
69DNS - structure
- Centralized DNS?
  - Pros: somewhat easy to maintain (there is only one system). But it must always be online
  - Cons
    - Single point of failure (the system crashes -> no web)
    - Congestion
    - Server would be far from some hosts (delay)
    - Database would be too big
    - To register bohacek-pc1.pc.udel.edu would require interacting with the big server
- Instead, a distributed hierarchical database is used.
70Domain Hierarchy
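[Figure: domain hierarchy tree. Top-level domains: edu, com, gov, mil, org, net, uk, in. Second-level domains shown: UD, upenn, yahoo, cisco, whitehouse, nasa, navy, arpa, acm. Below UD: art and eecis. Below eecis: bohacek_pc1 and bohacek_pc10.]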
71Administrative Zones in the Domain Hierarchy
[Figure: the same domain hierarchy, starting from root, partitioned into administrative zones.]
- It is possible that .edu and .gov are administered together
- Note that UD administers art but not eecis
- Sometimes a single service provider will administer the domains for a large number of .coms
72Root servers
- Each layer in the hierarchy knows about the domain names below it
- The highest level is the root.
- There are 13 root servers
- Each of these servers is actually several servers, and some of the machines that comprise a server are distributed geographically.
- 13 root name servers worldwide:
  - a Verisign, Dulles, VA
  - b USC-ISI Marina del Rey, CA
  - c Cogent, Herndon, VA (also LA)
  - d U Maryland College Park, MD
  - e NASA Mt View, CA
  - f Internet Software C. Palo Alto, CA (and 36 other locations)
  - g US DoD Vienna, VA
  - h ARL Aberdeen, MD
  - i Autonomica, Stockholm (plus 28 other locations)
  - j Verisign (21 locations)
  - k RIPE London (also 16 other locations)
  - l ICANN Los Angeles, CA
  - m WIDE Tokyo (also Seoul, Paris, SF)
73overview
- Top-level domain (TLD) servers
  - There are around 200 top-level domains
  - These include com, edu, mil, info, in, uk, cn, ...
  - Currently,
    - Network Solutions maintains the TLD servers for com
    - Educause maintains the TLD servers for edu
- The root servers know the addresses and names of all top-level servers
- Organizations have a hierarchy of DNS servers
74DNS queries
- Suppose a host needs the IP address of bohacek-pc1.eecis.udel.edu
- If this IP address is not in cache, the host asks its local DNS server.
- If the DNS server does not have it in cache, it checks if it has the IP address of the DNS server of eecis.udel.edu in cache
- If not, it checks if it has the IP address of the DNS server of udel.edu in cache
- If not, it checks if it has the IP address of the top-level domain server of edu in cache
- If not, it asks the root server for the IP address of the edu TLD server
  - The DNS server always has the IP address of the root servers
- The local DNS server asks the edu TLD server for the address of bohacek-pc1.eecis.udel.edu.
- The TLD server does not know that IP address, but instead gives the IP address of the DNS server for UD
- The local DNS server asks the UD DNS server for the address of bohacek-pc1.eecis.udel.edu.
- The UD DNS server does not know the address, but instead returns the address of the eecis DNS server.
- The local DNS server asks the eecis DNS server for the address of bohacek-pc1.eecis.udel.edu
- The eecis DNS server replies with the address.
- This address is returned to the host that originally asked the question.
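The lookup procedure above can be imitated with a toy model of a local DNS server that caches both final answers and the servers it has been referred to. Everything in the sketch is an illustration: the server names are invented labels rather than real hosts, the data lives in a dictionary instead of being fetched over the network, 128.4.1.2 is the address the DNS Queries diagrams give for www.eecis.udel.edu, and the address for bohacek-pc1 is a placeholder from the 192.0.2.0/24 documentation range.

```python
from typing import Dict, List

# Hypothetical view of what each server knows.
SERVERS: Dict[str, Dict[str, Dict[str, str]]] = {
    "root": {"referrals": {"edu": "edu-tld"}},
    "edu-tld": {"referrals": {"udel.edu": "ud-dns"}},
    "ud-dns": {"referrals": {"eecis.udel.edu": "eecis-dns"}},
    "eecis-dns": {"addresses": {
        "www.eecis.udel.edu": "128.4.1.2",
        "bohacek-pc1.eecis.udel.edu": "192.0.2.10",  # placeholder address
    }},
}


def suffixes(name: str) -> List[str]:
    """'a.b.c' -> ['a.b.c', 'b.c', 'c'], most specific first."""
    labels = name.split(".")
    return [".".join(labels[i:]) for i in range(len(labels))]


class LocalDnsServer:
    def __init__(self) -> None:
        self.address_cache: Dict[str, str] = {}  # name -> IP address
        self.server_cache: Dict[str, str] = {}   # domain -> its DNS server
        self.queries_sent: List[str] = []        # servers asked, in order

    def resolve(self, name: str) -> str:
        if name in self.address_cache:
            return self.address_cache[name]
        # Start at the most specific DNS server already in cache, else the root.
        server = next((self.server_cache[d] for d in suffixes(name)[1:]
                       if d in self.server_cache), "root")
        while True:
            self.queries_sent.append(server)
            known = SERVERS[server]
            if name in known.get("addresses", {}):
                self.address_cache[name] = known["addresses"][name]
                return self.address_cache[name]
            # The server does not know; follow its referral and remember it.
            referrals = known.get("referrals", {})
            domain = next((d for d in suffixes(name) if d in referrals), None)
            if domain is None:
                raise LookupError(name)
            server = self.server_cache[domain] = referrals[domain]


local = LocalDnsServer()
print(local.resolve("www.eecis.udel.edu"), local.queries_sent)
print(local.resolve("bohacek-pc1.eecis.udel.edu"), local.queries_sent)
```

The second lookup goes straight to the eecis DNS server because its address was cached during the first one.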
76DNS Queries
(Diagram: full lookup when nothing is cached. Root server IP addresses are always known.)
- Browser wants to show www.eecis.udel.edu
- Browser needs the IP address of www.eecis.udel.edu
- Host asks local DNS server for IP address of www.eecis.udel.edu
- Local DNS server checks if it has the IP address of www.eecis.udel.edu in cache.
- If not, it checks if it has the IP address of the DNS server of eecis.udel.edu in cache
- If not, it checks if it has the IP address of the DNS server of udel.edu in cache
- If not, it checks if it has the IP address of the top-level domain server of edu in cache
- If not, ...
- Local DNS server to root server: What is the IP address of www.eecis.udel.edu?
- Root server does not know. Instead, it responds with the name and address of a server that might, specifically, the TLD server for .edu
- Local DNS server to TLD server for .edu: What is the IP address of www.eecis.udel.edu?
- TLD server does not know. Instead it replies with the name and IP address of the UD DNS server
- Local DNS server to UD DNS server: What is the IP address of www.eecis.udel.edu?
- UD DNS server does not know. Instead it replies with the name and IP address of the eecis DNS server.
- Local DNS server to eecis DNS server: What is the IP address of www.eecis.udel.edu?
- eecis DNS server to local DNS server: It is 128.4.1.2
- Local DNS server to host: It is 128.4.1.2
77DNS Queries
(Diagram: the host address is already in the local DNS server's cache.)
- Browser wants to show www.eecis.udel.edu
- Browser needs the IP address of www.eecis.udel.edu
- Host asks local DNS server for IP address of www.eecis.udel.edu
- Local DNS server checks if it has the IP address of www.eecis.udel.edu in cache.
- If yes, then return it
- Local DNS server to host: It is 128.4.1.2
78DNS Queries
(Diagram: the address of the eecis DNS server is in the local DNS server's cache.)
- Browser wants to show www.eecis.udel.edu
- Browser needs the IP address of www.eecis.udel.edu
- Host asks local DNS server for IP address of www.eecis.udel.edu
- Local DNS server checks if it has the IP address of www.eecis.udel.edu in cache.
- If not, it checks if it has the IP address of the DNS server of eecis.udel.edu in cache
- If yes, query it
- Local DNS server to eecis DNS server: What is the IP address of www.eecis.udel.edu?
- eecis DNS server to local DNS server: It is 128.4.1.2
- Local DNS server to host: It is 128.4.1.2
79DNS Queries
(Diagram: the address of the edu TLD server is in the local DNS server's cache, so the root server is not needed.)
- Browser wants to show www.eecis.udel.edu
- Browser needs the IP address of www.eecis.udel.edu
- Host asks local DNS server for IP address of www.eecis.udel.edu
- Local DNS server checks if it has the IP address of www.eecis.udel.edu in cache.
- If not, it checks if it has the IP address of the DNS server of eecis.udel.edu in cache
- If not, it checks if it has the IP address of the DNS server of udel.edu in cache
- If not, it checks if it has the IP address of the top-level domain server of edu in cache
- If so, then query it
- Local DNS server to TLD server for .edu: What is the IP address of www.eecis.udel.edu?
- TLD server does not know. Instead it replies with the name and IP address of the UD DNS server
- Local DNS server to UD DNS server: What is the IP address of www.eecis.udel.edu?
- UD DNS server does not know. Instead it replies with the name and IP address of the eecis DNS server.
- Local DNS server to eecis DNS server: What is the IP address of www.eecis.udel.edu?
- eecis DNS server to local DNS server: It is 128.4.1.2
- Local DNS server to host: It is 128.4.1.2
80Attack on DNS
- Hackers have tried to bring down DNS by performing a DoS attack on the root servers
- DoS: denial of service. The attacker sends more packets or requests for service than the server can accommodate, resulting in poor service for normal users.
- This failed because
- There are many very strong root servers, and they have firewalls/filters
- The attacks used ICMP ping packets
- DNS requests would have been more effective
- It is rare that a root server is needed
- Usually only the TLD server is needed
- Or only a domain server.
81DNS Message Details
- DNS Record
- (Name, Value, Type, Class, TTL)
- If Type = A
- Name is the host name
- Value is the IP address of the host
- If Type = NS
- Name is a domain name
- Value is the name of the DNS server for the domain
- E.g., (udel.edu, dns.udel.edu, NS, , )
- If Type = MX
- Name is the domain name
- Value is the name of the mail server for the domain
- E.g., (udel.edu, mail.udel.edu, MX, , )
- If Type = CNAME
- Name is a host name
- Value is the canonical name of the host
- E.g., (www.yahoo.com, relay-east.yahoo.com, CNAME, , )
- TTL is the time to live, so DNS caches can be timed out
- Class is no longer used; it is set to IN
82DNS query
- (Name, Type, Class)
- (UDel.edu, MX, IN)
- Please provide the name of UD's mail server
- (mail.UDel.edu, A, IN)
- Please provide the IP address for mail.udel.edu
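A minimal sketch of how a server might match a (Name, Type, Class) query against stored records. The record table, the TTL values, and the address 192.0.2.25 are assumed for illustration; only the tuple layout and the udel.edu names come from the slides.

```python
from collections import namedtuple

# Field order follows the slides: (Name, Value, Type, Class, TTL).
Record = namedtuple("Record", "name value type cls ttl")

# Illustrative records only; the TTLs and the address are assumed values.
RECORDS = [
    Record("udel.edu", "dns.udel.edu", "NS", "IN", 3600),
    Record("udel.edu", "mail.udel.edu", "MX", "IN", 3600),
    Record("mail.udel.edu", "192.0.2.25", "A", "IN", 300),
]


def answer(name, rtype, cls="IN"):
    """Return the values of all records matching a (Name, Type, Class) query."""
    name = name.lower()  # DNS names are compared case-insensitively
    return [r.value for r in RECORDS
            if (r.name, r.type, r.cls) == (name, rtype, cls)]
```

Here `answer("UDel.edu", "MX")` gives the mail server name, and a follow-up `answer("mail.UDel.edu", "A")` gives its address, mirroring the two queries on the slide.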
83DNS message format
- DNS protocol: query and reply messages, both with the same message format
- msg header
- identification: 16-bit number for the query; the reply to a query uses the same number
- flags
- query or reply
- recursion desired
- recursion available
- reply is authoritative
84DNS message format
- Questions: name, type fields for a query
- Answers: RRs in response to query
- Authority: records for authoritative servers
- Additional information: additional helpful info that may be used
85DNS Queries
(Diagram nodes: root server (IP addresses are always known), TLD server for .edu, UD DNS server, eecis DNS server.)
- Browser wants to show www.eecis.udel.edu
- Browser needs the IP address of www.eecis.udel.edu
- Local DNS server checks if it has the IP address of www.eecis.udel.edu in cache.
- If not, it checks if it has the IP address of the DNS server of eecis.udel.edu in cache
- If not, it checks if it has the IP address of the DNS server of udel.edu in cache
- If not, it checks if it has the IP address of the top-level domain server of edu in cache
- If not, ...
86DNS Flags
- The DNS header has a query ID
- The query has this ID and the server copies this ID into the response
- Flag indicating query or answer
- Flag indicating whether the server is the authoritative server for the answer (as opposed to a cached answer)
- A recursion-desired flag indicating that the host/server would like the server to perform the recursive DNS lookup
- A recursion-available flag indicating whether the server is available to do the recursive lookup
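As a rough sketch, the ID and the four flags listed here can be packed into and read back from the 12-byte DNS header. The bit positions follow the header layout in RFC 1035. The helper names `build_header` and `parse_header` are made up for this example, and the other header bits (opcode, truncation, response code) are left at zero.

```python
import struct

# Flag bit positions in the 16-bit flags field (RFC 1035, section 4.1.1).
QR = 0x8000  # 0 = query, 1 = reply
AA = 0x0400  # reply is authoritative
RD = 0x0100  # recursion desired
RA = 0x0080  # recursion available


def build_header(query_id, flags, qdcount=1):
    """Pack the 12-byte header: ID, flags, then four 16-bit record counts."""
    return struct.pack("!HHHHHH", query_id, flags, qdcount, 0, 0, 0)


def parse_header(data):
    """Read back the ID and the four flags discussed on this slide."""
    query_id, flags = struct.unpack("!HH", data[:4])
    return {
        "id": query_id,
        "is_reply": bool(flags & QR),
        "authoritative": bool(flags & AA),
        "recursion_desired": bool(flags & RD),
        "recursion_available": bool(flags & RA),
    }
```

A query with recursion desired would be `build_header(0x1234, RD)`. A matching authoritative reply copies the same ID and sets `QR | AA | RD | RA`.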
87DNS
- Which transport protocol should DNS use?
- Why?
88Peer-to-peer file sharing
- About P2P
- 30% or more of the bytes transferred on the Internet are from P2P users
- Skype is a very successful P2P VoIP app
- Written in 3-4 months
- Topics covered
- Scalability
- P2P querying
- Case study
- BitTorrent
- Skype
89Pure P2P architecture
- Review: What is the difference between peer-to-peer and client/server?
- Each host acts as both a server and a client.
- no always-on server
- arbitrary end systems directly communicate
- peers are intermittently connected and may change IP addresses
- Pure P2P has significant drawbacks.
- P2P-like systems with some central servers are more common.
- But in all cases, the file transfer is between peers, not from servers.
90File Distribution Server-Client vs P2P
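- Question: How much time does it take to distribute a file from one server to N peers?
- us: server upload bandwidth
- ui: peer i upload bandwidth
- di: peer i download bandwidth
- File, size F
- Network (with abundant bandwidth)
(Diagram: a server with upload link us and peers 1, 2, ..., N with links u1/d1, u2/d2, ..., uN/dN, all attached to the network.)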
91File distribution time server-client
- Time for the server to send a copy to a single client
- F/us
- Time for the server to send N copies
- NF/us
- Client i takes F/di time to download
(Diagram: server holding file F with upload link us, and peers with links u1/d1, u2/d2, ..., uN/dN, attached to a network with abundant bandwidth.)
92File distribution time P2P
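- Server must send one copy
- F/us time
- Client i download time
- F/di
- Total data to be downloaded
- NF
- Fastest possible transfer rate: us + Σ ui (the server plus all peers uploading), so downloading NF takes at least NF/(us + Σ ui)
- Can you make a schedule for the download that takes this amount of time?
(Diagram: server holding file F with upload link us, and peers with links u1/d1, u2/d2, ..., uN/dN, attached to a network with abundant bandwidth.)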
93Server-client vs. P2P example
Client upload rate = u, F/u = 1 hour, us = 10u, dmin ≥ us
Conclusion: P2P systems are scalable. But the load is distributed to all users, so P2P users have more load than clients in the client-server model.
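The comparison can be worked numerically. The sketch below combines the terms from the previous two slides into lower bounds on the distribution time. It assumes the example's parameters read u = 1, F/u = 1 hour, us = 10u, and dmin equal to us (the smallest value consistent with dmin ≥ us). The function names are made up for this example.

```python
def client_server_time(n, f, us, dmin):
    """Lower bound: the server uploads N copies; the slowest client downloads F."""
    return max(n * f / us, f / dmin)


def p2p_time(n, f, us, dmin, u):
    """Lower bound when each of the n peers also uploads at rate u."""
    return max(f / us, f / dmin, n * f / (us + n * u))


# Units chosen so that u = 1 and F = 1, which makes F/u equal to 1 hour.
for n in (10, 100, 1000):
    cs = client_server_time(n, 1, 10, 10)
    p2p = p2p_time(n, 1, 10, 10, 1)
    print(f"N={n}: client-server {cs:.2f} h, P2P {p2p:.2f} h")
```

Under these assumptions the client-server bound grows linearly with N, while the P2P bound stays below F/u = 1 hour for any N, which is the scalability claim in the conclusion.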
94Peer-to-peer Querying
- While the file is transferred from a peer, how do we find the file?
- Options
- Centralized directory
- Napster
- Single point of failure
- Performance bottleneck
- Target for the RIAA
- Always up
- Easy to find
- Easy protocol
- Query flooding
- Gnutella
- Hosts find other hosts and form a network of neighbors (overlay network)
- Search for a file (covered next)
- How to set up the network (bootstrap)?
- Have a central list of peers
- Have distributed lists of peers
- Search out a peer by scanning, like in the project
95Querying Flooding State Diagram
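(State diagram, requesting peer)
- User request for file: set AttemptCounter = 0
- Send out a request for the file to all neighbors; set Timer = 0
- Wait
- Reply from peer: inform user of file location
- Timer > TO: increment AttemptCounter
- If AttemptCounter > MaxAttempts: inform user that query failed
- Else: send out the request to all neighbors again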
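One possible reading of this state diagram, written as a loop. The two callables `send_request` and `wait_for_reply` are hypothetical stand-ins for real network code, and the interpretation that AttemptCounter is incremented on each timeout is an assumption.

```python
def query_with_retries(send_request, wait_for_reply, max_attempts, timeout):
    """Requester side of query flooding.

    send_request() floods the request to all neighbors.
    wait_for_reply(timeout) returns a file location, or None once the
    timer exceeds the timeout (Timer > TO in the diagram).
    """
    attempt_counter = 0
    while True:
        send_request()                   # also restarts the timer
        location = wait_for_reply(timeout)
        if location is not None:
            return location              # inform user of file location
        attempt_counter += 1             # timer expired
        if attempt_counter > max_attempts:
            return None                  # inform user that query failed
```

Returning `None` stands for the "query failed" outcome; a real client would surface that to the user rather than return a value.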
96Listening Peer
(State diagram, listening peer)
- Wait
- Request arrives: get request ID
- Have seen request before: return to wait
- Otherwise, check for file in directory
- File is in local dir: send response to peer that requested file
- Otherwise, send request to all neighbors
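A small in-memory sketch of the listening peer's behavior. The `Peer` class and its method names are invented for illustration. Peers call each other directly instead of sending packets, and the response goes straight to the requesting peer, so this shows only the duplicate check and the forwarding rule, not a real protocol.

```python
class Peer:
    """Listening peer in a query-flooding overlay (toy, in-memory)."""

    def __init__(self, name, files=()):
        self.name = name
        self.files = set(files)
        self.neighbors = []
        self.seen = set()    # request IDs already handled
        self.found = []      # (filename, holder) responses received

    def on_request(self, request_id, filename, origin, sender=None):
        if request_id in self.seen:
            return                       # have seen request before: drop it
        self.seen.add(request_id)
        if filename in self.files:       # file is in local dir
            origin.on_response(filename, self.name)
            return
        for peer in self.neighbors:      # otherwise send to all neighbors
            if peer is not sender:
                peer.on_request(request_id, filename, origin, self)

    def on_response(self, filename, holder):
        self.found.append((filename, holder))
```

Remembering request IDs is what stops a request from circulating forever when the overlay contains loops.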
97Expanding ring
98(hierarchical peer-to-peer network)
- KaZaA
- Not all peers are equal: super peers (?)
- Super peers (group leaders) have higher bit-rate connections, are more stable, etc.
- Peers connect to group leaders
- The group leaders keep a list of files shared by all their children peers.
- Group leaders connect to a small number of other group leaders
- A child host will ask its group leader for a file. If the group leader does not know where it is, it will flood the network of group leaders. The response from other group leaders follows a reverse path to the asking group leader (so other leaders can cache the response)
- A file is identified with an ID (e.g., MD5), computed by a function that takes a string (the file) and produces an ID. A small change in the file causes a large change in the ID. Ideally it is infeasible to construct two files that have the same ID (practical collisions are now known for MD5). The ID is a fingerprint.
- Since files are ID-ed, multiple copies of the same file can be found and these copies can be downloaded from multiple hosts in parallel.
- Note that if you are downloading while also uploading to others, the uploading slows down the downloading, but only a little bit.
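The fingerprint idea can be tried with Python's standard `hashlib`. This is only a sketch: the helper name `file_id` and the sample strings are made up, and MD5 is used because the slide names it, not because it is a good choice today.

```python
import hashlib


def file_id(data: bytes) -> str:
    """Return a hex ID computed from the file contents (MD5, as on the slide)."""
    return hashlib.md5(data).hexdigest()


original = b"the quick brown fox"
modified = b"the quick brown fix"   # one character changed

print(file_id(original))
print(file_id(modified))
```

The two printed IDs look unrelated even though the inputs differ by one character, and two hosts holding identical copies compute the same ID.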
99BitTorrent
- Centralized P2P
- A centralized server, or tracker, tracks the clients involved in the P2P transfer
- This is similar to Napster
- Companies that host these sites get sued and are attacked by DDoS
- Components of a BitTorrent system
- Torrent files
- Trackers
- Seeders
- Peers
100Torrent File
- Required to download
- Can be found on web sites or sent by email
- Contains information about the file and the tracker
- Announce: the URL of the tracker
- Creation date
- Info
- Length of file
- Name of file
- Length of each piece (except for the last)
- Pieces: the 20-byte SHA-1 value of each piece
- Note, the number of pieces can be determined by counting the number of bytes in the pieces field and dividing by 20
- If the download contains multiple files, then a single torrent file will contain information about all files.
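A minimal sketch of how the pieces field relates to the file, assuming the file is already in memory as bytes. The function names are made up for this example, and the encoding of the torrent file itself is not shown.

```python
import hashlib


def pieces_field(data: bytes, piece_length: int) -> bytes:
    """Concatenate the 20-byte SHA-1 value of each piece of the file."""
    return b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    )


def piece_count(pieces: bytes) -> int:
    """Number of pieces, recovered from the length of the pieces field."""
    return len(pieces) // 20
```

For a 1000-byte file with 256-byte pieces, the field is 80 bytes long, so `piece_count` returns 4; the last piece is shorter than the others.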
101Tracker
- Make an HTTP GET request to the tracker specifying the SHA-1 hash of the file to be downloaded
- The request also includes the number of bytes downloaded and the number uploaded
- If the client does not upload enough, the tracker might not provide a reply
- The reply contains
- The time when the tracker information should be refreshed (usually 30 minutes)
- A list of the peers
- IP address and port (usually 6881)
- Peer ID
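A sketch of how such a request URL might be assembled. The query parameter names (`info_hash`, `peer_id`, `port`, `uploaded`, `downloaded`, `left`) follow the published BitTorrent protocol specification rather than these slides. The tracker address in the usage note is a placeholder, and nothing is actually sent.

```python
from urllib.parse import urlencode


def tracker_url(announce, info_hash, peer_id, port, uploaded, downloaded, left):
    """Build the GET URL for a tracker request (no network access)."""
    query = urlencode({
        "info_hash": info_hash,    # 20-byte SHA-1 value, percent-encoded
        "peer_id": peer_id,
        "port": port,
        "uploaded": uploaded,      # bytes uploaded so far
        "downloaded": downloaded,  # bytes downloaded so far
        "left": left,              # bytes still needed
    })
    return f"{announce}?{query}"
```

For example, `tracker_url("http://tracker.example/announce", bytes(range(20)), "-XX0001-123456789012", 6881, 0, 1024, 4096)` produces a URL whose query string starts with the percent-encoded hash and ends with the three byte counters.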
102File distribution with BitTorrent
tracker tracks peers participating in torrent
103BitTorrent (1)
- file divided into 256KB chunks.
- peer joining torrent
- has no chunks, but will accumulate them over time
- registers with tracker to get list of peers, connects to subset of peers (neighbors)
- while downloading, peer uploads chunks to other peers.
- peers may come and go
- once peer has entire file, it may (selfishly) leave or (altruistically) remain
104BitTorrent (2)
- Sending chunks: tit-for-tat
- Alice sends chunks to the four neighbors currently sending her chunks at the highest rate
- re-evaluate top 4 every 10 secs
- every 30 secs: randomly select another peer, start sending it chunks
- newly chosen peer may join top 4
- "optimistically unchoke"
- Pulling chunks
- at any given time, different peers have different subsets of file chunks
- periodically, a peer (Alice) asks each neighbor for the list of chunks that they have.
- Alice sends requests for her missing chunks
- rarest first
- So the rarest chunks are spread, and chunks become uniformly common
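The two selection rules can be sketched as plain functions. Everything here is illustrative: the function names, the dictionary inputs, and the tie-breaking by lowest chunk number are assumptions, and the 10-second and 30-second timers are left to the caller.

```python
import random
from collections import Counter


def choose_unchoked(rates, optimistic=None, top_n=4):
    """Pick the top_n neighbors by the rate at which they send to us,
    plus one optimistically unchoked peer if one is given."""
    top = sorted(rates, key=rates.get, reverse=True)[:top_n]
    if optimistic is not None and optimistic not in top:
        top.append(optimistic)
    return top


def pick_optimistic(rates, current_top, rng=random):
    """Randomly select a peer outside the current top group (None if no one)."""
    others = [p for p in rates if p not in current_top]
    return rng.choice(others) if others else None


def rarest_first(missing, neighbor_chunks):
    """Return the missing chunk held by the fewest neighbors (None if none have it)."""
    counts = Counter(c for chunks in neighbor_chunks.values() for c in chunks)
    available = [c for c in missing if counts[c] > 0]
    return min(available, key=lambda c: (counts[c], c)) if available else None
```

A client would call `choose_unchoked` on the measured rates every 10 seconds, call `pick_optimistic` every 30 seconds, and call `rarest_first` whenever it is ready to request another chunk.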
105BitTorrent Tit-for-tat
(1) Alice optimistically unchokes Bob
(2) Alice becomes one of Bob's top-four providers; Bob reciprocates
(3) Bob becomes one of Alice's top-four providers
With a higher upload rate, a peer can find better trading partners and get the file faster!
106BitTorrent Pros/Cons
- Centralized server
- Slow to get the transfer started
- Web transfers start much faster and will achieve a sustained rate
- Peers must upload
- Some peers might not be in a position to upload (e.g., mobile phone)
- Chunks can be corrupted
- HBO distributed fake chunks
- Since the SHA-1 hash does not match what is given in the torrent file, the chunk is dropped after it is downloaded
- This wastes bandwidth and can greatly increase download time