INFO 330 Computer Networking Technology I


1
INFO 330 Computer Networking Technology I

Chapter 4
The Network Layer
Glenn Booker

2
The Network Layer

So, the transport layer provides process to
process communication
The network layer is expected to provide host to
host communication
Cool.
Um, how?

3
The Network Layer

The Network Layer has to do two things
Forwarding is the process within a single router
to determine which outgoing link a packet has to
take
Routing is the process (and algorithm) of
choosing the best path (route) between source
and destination
Forwarding is like deciding which turn to make
at one intersection
Routing is deciding which roads to take

4
The Network Layer

Recall the network layer is expected to
Receive segments from the transport layer
Encapsulate them into datagrams (how much does
data weigh?)
And pass them through the network
The job of most routers is to look at the network
header information, and determine which link to
pass the datagram to
The application and transport layer information
are invisible and irrelevant to routers

5
The Network Layer

A router has a forwarding table which tells which
link to take, based on the header's destination
address
The forwarding table is written based on output
from a routing algorithm
Routing algorithms may be centrally controlled
and then downloaded to each router, or each
router may follow its own algorithm

6
The Network Layer

A packet switch is a device that transfers a
packet from an input link to an output link
Some are link-layer switches, which use the link
layer header info
The rest we call routers, which use network layer
header info
Another function in the network layer can be
connection setup
Only for virtual circuit networks (ATM, X.25)

7
Network Service Model

What services could we expect from a network
layer?
Guaranteed delivery of all packets
Delivery within a specified time (bounded delay)
Delivery of packets in order
Guaranteed minimal bandwidth
Guaranteed maximum jitter (delay variation)
Security services
Would be nice, huh?

8
Network Service Model

What do we get from the Internet?
Best-effort service
Meaning, none of the above!!
Some VC networks, such as ATM, can provide many
of the ideal services (see p. 322)
Constant Bit Rate (CBR) and Available Bit Rate
(ABR) are types of ATM service

9
Network Service Model

Refining our earlier definition, the network
layer can provide connection-based or
connection-less service
A network that provides only a connection-based
service at the network layer is a virtual
circuit (VC) network
A network that provides only connectionless
service at the network layer is a datagram
network

10
Virtual Circuit Networks

A VC Network needs to have
A path from source to destination
VC numbers, one per link along the path
Entries in the forwarding table in each router
along the path
Each packet carries a VC number which changes as
it goes along each link in the VC
This keeps from having to store and coordinate VC
numbers across routers

11
Virtual Circuit Networks

Each router has to know the VC numbers for
incoming and outgoing links: Incoming Link,
Incoming VC, Outgoing Link, and Outgoing VC
Each foursome of in/out link and VC numbers
corresponds to how one VC is handled in that
router, so each VC being created adds one line of
data (which is later removed)
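
As a rough sketch of the table described on this slide, the Python below keeps one line per VC, keyed by incoming link and incoming VC number. The function names and the link and VC numbers are made up for illustration; real routers do this in hardware.

```python
# Hypothetical VC forwarding table for a single router.
# Key: (incoming link, incoming VC); value: (outgoing link, outgoing VC).
vc_table = {}

def vc_setup(in_link, in_vc, out_link, out_vc):
    """Add one line to the table when a VC through this router is created."""
    vc_table[(in_link, in_vc)] = (out_link, out_vc)

def vc_forward(in_link, in_vc):
    """Return the outgoing link and the VC number the packet carries next."""
    return vc_table[(in_link, in_vc)]

def vc_teardown(in_link, in_vc):
    """Remove the line when the VC is torn down."""
    del vc_table[(in_link, in_vc)]

vc_setup(in_link=1, in_vc=12, out_link=2, out_vc=22)
next_hop = vc_forward(1, 12)   # packet leaves on link 2 carrying VC 22
vc_teardown(1, 12)
```

Because the VC number is rewritten at every hop, each router only needs numbers that are unique on its own links.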

12
Virtual Circuit Networks

So a simple VC might have VC 12 on the first
link, then get VC 22 on the second link, and VC
37 on the third
So the life of a VC connection includes
VC setup: the network layer defines the routers
in the VC, sets VC numbers for each link, and
creates new entries in the forwarding table of
each router

13
Virtual Circuit Networks

Data transfer is the intended purpose of the VC
connection
VC teardown is when sender or receiver tells the
network layer it wants to end the connection; then
the forwarding tables are updated to remove the
entries associated with this VC
Notice that VC setup and teardown involve the
hosts and all routers along the path, whereas
TCP only involved the hosts

14
Virtual Circuit Networks

The messages to set up and tear down a VC are
signaling messages, which have their own
protocols, e.g. ATM's Q.2931
No, we're not going to dissect them
yippee

15
Datagram Networks

Datagram networks stamp each packet with the
address of the destination host, and send it into
the network
There is no state information about connections,
because there aren't any connections within the
network!

16
Datagram Networks

Each router between hosts uses the address to
forward the packet using a forwarding table
If our addresses had 32 bits, there could be
4,294,967,296 entries in that table!

17
Datagram Networks

Fortunately, we don't need to look at ALL of the
address to determine its correct link (a key
observation!)
Instead, match the address prefix with
forwarding table entries
Use the longest prefix matching rule
Match the longest prefix possible in the
forwarding table
For this to be practical, large ranges of
addresses should go to each link, or the table
will be huge!

18
Longest prefix matching rule

The router just finds the longest matching prefix and
uses that entry in the forwarding table to forward the
packet
Prefix                        Link
11001000 00010111 00010       0
11001000 00010111 00011000    1
11001000 00010111 00011       2
Otherwise                     3
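
The longest prefix matching rule can be sketched in a few lines of Python using the table on this slide. The function names and the test addresses are illustrative assumptions; a real router uses specialized hardware or tree structures rather than a linear scan.

```python
# Forwarding table from the slide: (prefix bits, outgoing link).
TABLE = [
    ("11001000 00010111 00010".replace(" ", ""), 0),
    ("11001000 00010111 00011000".replace(" ", ""), 1),
    ("11001000 00010111 00011".replace(" ", ""), 2),
]
DEFAULT_LINK = 3  # the "Otherwise" row

def to_bits(dotted):
    """Convert dotted-decimal IPv4 text into a 32-character bit string."""
    return "".join(f"{int(octet):08b}" for octet in dotted.split("."))

def lookup(dotted):
    """Return the link of the longest prefix that matches the address."""
    bits = to_bits(dotted)
    best_len, best_link = -1, DEFAULT_LINK
    for prefix, link in TABLE:
        if bits.startswith(prefix) and len(prefix) > best_len:
            best_len, best_link = len(prefix), link
    return best_link

link_a = lookup("200.23.22.161")  # matches only the first row
link_b = lookup("200.23.24.170")  # matches rows two and three; row two is longer
link_c = lookup("200.23.25.1")    # matches only the third row
link_d = lookup("10.0.0.1")       # matches nothing, so the default applies
```

The second address shows why the rule matters: two rows match, and the more specific (longer) prefix wins.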

19
Datagram Networks

So even though there is no connection data,
routers in datagram networks need to maintain the
forwarding tables
The routing algorithm typically updates them
every 1-5 minutes
Hence it's quite possible for the later part of a
long session to follow a different path than the
first part!

20
More History

The VC network came about because of its
similarity to telephone networks
But the Internet was connecting complex
computers, so the datagram network was created
because the computers could handle more complex
operations than the routers (recall our IMP
friends from Chapter 1)
This also makes it easier to connect dissimilar
networks, and create many new applications

21
Router Innards

Now look at forwarding in more detail
A router has four kinds of parts
Input ports
Output ports
Switch fabric between the inputs and outputs
And a routing processor to control the switch
fabric, using the routing protocols

22
Router Innards
23
Router Innards

The input and output ports
Provide the physical connection to the network, and
Take the signal through the data link layer
The input ports also look up the destination
address, decide how to forward the packet, and
create control packets to send to the routing
processor
The three boxes represent the physical layer,
data link layer, and lookup/forward module

24
Input Ports

The routing processor determines the forwarding
table contents, and shadow copies it to each
input port
This avoids a processing bottleneck
Looking up where to forward packets is simple in
concept; the challenge is maintaining line speed
Want to process each packet in less time than it
takes to receive the next one

25
Tree Lookup

One way to look up the correct output port is
through a binary tree data structure
Look at the first bit in the address; if it's a
zero, follow the left branch of the tree,
otherwise follow the right branch
Repeat as many times as needed to resolve the
address
Sadly, this is still too slow
Content addressable memories (CAMs), caching,
and better data structures are possible solutions

26
Tree Lookup
For a 3-bit address
27
Switching Fabric

The input ports determine the output port needed;
the switching fabric makes it happen
Many approaches for switching fabric have been
used
Switching via memory uses the CPU directly
Switching via bus makes every packet go over a
bus before getting off at the correct output;
very slow

28
Switching Fabric

Switching via interconnect network uses 2n
horizontal and vertical buses to connect n inputs
to n outputs, but this can produce blockages
Lots of other approaches have been used
Switches handle staggering data rates (400
million packets/sec as of 11/09), so their
technology is constantly being pushed

29
Switching Fabric Approaches
30
Output Ports

The output ports take packets from the output
port memory (queue) and transmit them over the
outgoing link
Hence the three functions of output ports are
Queuing
Data link processing
Physical line termination

31
Queuing

We've discussed buffers in connection with
output ports, but they also exist with input
ports
Packet loss can occur at input or output queues,
depending on
Input traffic load
Switching fabric speed
Line speed

32
Switching Fabric Speed

For a router with n input and n output ports
If the switching fabric has a speed n times as
fast as the input line speed, no queuing can
occur at the inputs
But the output ports can easily become overloaded
if many inputs all feed the same output port
A packet scheduler at the output port decides
which packet is next for transmission

33
Packet Scheduler

The packet scheduler needs rules
Could use first come, first served (FCFS)
approach
Could use weighted fair queuing (WFQ)
The packet scheduler affects the quality of
service of the connection
More details on this in Chapter 7, which we
aren't covering this term

34
Incoming Buffer

If there's not enough room in the buffer for a
new incoming packet, have to decide
Drop the new packet (called drop tail), or
Drop an existing packet to make room
Can also mark packets for congestion control when
buffer is getting full
Dropping and marking strategies are Active Queue
Management (AQM) algorithms

35
Incoming Buffer

Examples of AQM algorithms include
Random Early Detection (RED), which uses random
variables to decide when to drop or mark a packet
when buffer approaches full
If the switch fabric is too slow, packets have to
wait in the input queue before moving to an
output queue
Head-of-the-line (HOL) blocking is when a packet
has to wait for the packet ahead of it in the input
queue to cross the fabric, even though its own
output port is open

36
The Internet Protocol (IP)

Now see how all this applies to the Internet
We'll cover both the existing IPv4 and the
emerging IPv6 (versions 4 and 6)
The network layer has three major parts
Internet Protocol, which handles addressing
Routing protocols (e.g. RIP, OSPF, BGP), which
choose the best path for packets
Internet Control Message Protocol (ICMP), which
handles error reporting and signaling

37
Datagram Format

A segment in the transport layer becomes one or
more datagrams in the network layer
First discuss IPv4, with hints how IPv6 is
different

38
Datagram Format

The IPv4 datagram header has at least five
4-byte (32-bit) fields, like TCP
Version number, header length, type of service,
and datagram length in bytes
Identifier, some flags, and fragmentation offset
Time-to-live, upper layer protocol, and header
checksum
Source IP address (32 bits)
Destination IP address (32 bits)
Then options, followed by the segment data

39
Datagram Format

Version number is 4 bits for the IP version
Header length is 4 bits giving the size of the IP
header in 32-bit words (usually 5, i.e. 20 B)
Type of service (TOS) is 8 bits which allow one
to specify different levels of service (real
time or not)
Datagram length in bytes is the total of the
header plus the actual data segment
Is a 16 bit field, but typical length is under
1500 B

40
Datagram Format

The Identifier, flags, and fragmentation offset
all relate to IP fragmentation (breaking a
segment into multiple datagrams)
Time-to-live (TTL) is a countdown integer, to
prevent packets from wandering in the network for
40 years
It is decremented by one at each router, and the
router kills the datagram when it gets to zero

41
Datagram Format

Protocol is the transport layer protocol
Only used when the datagram gets to the destination host
E.g. 6 = TCP, 17 = UDP; see RFC 3232 for others
Header checksum: hey, didn't we have a transport
checksum?
Yes, but this only covers the IP header, not the
segment data
And TCP might be run over other network
protocols, e.g. our VC buddy, ATM

42
Datagram Format

Source and destination IP addresses: we'll
discuss these in more detail soon
Option fields allow for rarely used functions,
but slow IP processing
Hence these are not part of the fixed IPv6 header
The Data in the datagram can be the TCP or UDP
segment, or contain other message formats such as
ICMP
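
To make the field layout concrete, here is a minimal Python sketch that packs and unpacks a 20-byte IPv4 header with no options, using the standard struct module. The function names, addresses, and default values (identifier 777, TTL 64) are assumptions chosen for illustration, not anything prescribed by the course.

```python
import socket
import struct

IPV4_FORMAT = "!BBHHHBBH4s4s"   # the five 32-bit words of a header without options

def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit words, then complemented."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_header(src, dst, payload_len, ident=777, ttl=64, protocol=6):
    fields = [
        (4 << 4) | 5,        # version 4, header length 5 words (20 B)
        0,                   # type of service
        20 + payload_len,    # datagram length: header plus data
        ident,               # identifier
        0,                   # flags (3 bits) and fragmentation offset (13 bits)
        ttl,
        protocol,            # 6 = TCP, 17 = UDP
        0,                   # checksum placeholder
        socket.inet_aton(src),
        socket.inet_aton(dst),
    ]
    fields[7] = internet_checksum(struct.pack(IPV4_FORMAT, *fields))
    return struct.pack(IPV4_FORMAT, *fields)

def parse_header(header):
    (ver_ihl, tos, length, ident, flags_frag,
     ttl, protocol, checksum, src, dst) = struct.unpack(IPV4_FORMAT, header[:20])
    return {
        "version": ver_ihl >> 4,
        "header_bytes": (ver_ihl & 0x0F) * 4,
        "tos": tos,
        "length": length,
        "id": ident,
        "flags": flags_frag >> 13,
        "offset": flags_frag & 0x1FFF,
        "ttl": ttl,
        "protocol": protocol,
        "checksum": checksum,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }

header = build_header("10.0.0.1", "200.23.16.5", payload_len=20)
info = parse_header(header)
```

A receiver can verify the header by running the same checksum over all 20 bytes; an undamaged header gives zero.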

43
Fragmentation

A frame can hold up to the Maximum Transmission
Unit (MTU) bytes of data
But not all link-layer protocols can handle the
same size packets
Ethernet handles up to 1500 B frames
Some WAN protocols only handle 500 B frames
Since datagrams get passed from one router to the
next, and don't know the path ahead, some routers
have to break up a datagram

44
Fragmentation

An IP datagram can be broken into two or more
fragments
Expect the fragments to be reassembled by the
destination host's network layer
Recurring theme: minimize work done by routers
Each initial datagram has an identification
number, in addition to the source and destination
addresses

45
Fragmentation

This is the Identification field in the header
The identification number is incremented for
each new datagram the source sends
Each fragment keeps the original identification
number
The last fragment has Flag = 0 set; all other frags
with that ID number have Flag = 1
The offset field identifies where the frag fits
in the original datagram: the number of 8-byte
chunks from the start

46
Fragmentation Example, p. 347

Suppose we have a 4000 B datagram (20 B of
header, plus 3980 B of segment), but the MTU only
allows 1500 B per frame
Make three fragments (3980/1480 rounded up, since
each frame holds at most 1480 B of data)
All frags have the same identifier (e.g. 777)
The first two frags will have 1480 B of data,
plus 20 B of IP header; the last frag will have
the remaining data (1020 B) plus 20 B header
The first two frags have Flag = 1; the last has Flag = 0

47
Fragmentation Example, p. 347

The offset value is weird: it counts 8-byte chunks
Offset is 0 for the first frag (it's the first
frag), 185 8-byte chunks (1480 B) for the second
frag, and 370 8-byte chunks (2960 B) for the
third frag
Why 8-byte chunks? Offset is a 13-bit field, but
the offset in bytes could be 16 bits long, hence
use 8 (2^3) byte chunks to describe offset
Forces the data in every fragment except the last
to be a multiple of 8 bytes in size
Fortunately, IPv6 gets rid of router fragmentation
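
The arithmetic in the fragmentation example can be checked with a short Python sketch. The function name and the dictionary keys are illustrative assumptions; the 20 B header and identifier 777 follow the example.

```python
def fragment(total_len, mtu, header_len=20, ident=777):
    """Split one datagram into fragments that each fit in the MTU."""
    data_len = total_len - header_len
    # Data per fragment must be a multiple of 8 bytes (except in the last one).
    max_data = (mtu - header_len) // 8 * 8
    frags = []
    sent = 0
    while sent < data_len:
        size = min(max_data, data_len - sent)
        more = 1 if sent + size < data_len else 0
        frags.append({
            "id": ident,
            "data_bytes": size,
            "total_bytes": size + header_len,
            "offset": sent // 8,   # counted in 8-byte chunks
            "flag": more,          # 1 = more fragments follow, 0 = last
        })
        sent += size
    return frags

frags = fragment(total_len=4000, mtu=1500)
offsets = [f["offset"] for f in frags]      # 0, 185, 370
sizes = [f["data_bytes"] for f in frags]    # 1480, 1480, 1020
flags = [f["flag"] for f in frags]          # 1, 1, 0
```

Multiplying each offset by 8 gives the byte position of that fragment's data in the original datagram: 0, 1480, and 2960.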

48
Evil Fragmentation

Fragmentation can be used for attacks
Jolt2 attack: send a lot of incomplete fragments
to a server (e.g. none have zero offset); it'll
eventually run out of storage and crash
Send overlapping frags to a server; some get
confused and crash

49
IPv4 Addressing

Recall that hosts have to have interfaces to the
network, over which to send datagrams
Routers need many interfaces, since they are
connected to multiple links
Therefore every IP address is associated with an
interface, not a host or router
IPv4 addresses are 32 bits (4 bytes), written in
dotted decimal notation (byte.byte.byte.byte)

50
IPv4 Addressing

Every interface visible to the Internet must have a
unique IP address
Local networks can hide many systems behind one
IP using network address translation (NAT)
IP addresses are given out as hierarchically as
possible, so many local addresses have the same
prefix or subnet (leftmost bits in the IP
address)
Subnet = IP network = network in much of the literature

51
IPv4 Addressing

How many bits of the address are used to define
the subnet is given as a suffix after a slash,
e.g. 213.1.3.0/24 means the first 24 bits of the
address identify the subnet (the /24 is the subnet mask)
Often the links of a router each point to a
different subnet, e.g. in Fig 4.15
Subnets also can be defined for the interfaces
between routers
A subnet is essentially an isolated part of a
larger network

52
Fig 4.15 Subnet example
53
Pre-CIDR

Internet domains originally had prefixes of
Class A = 8, Class B = 16, or Class C = 24 bits
Led to lots of wasted address space!
Class A -> 16,777,216 hosts per domain
Class B -> 64k (65,536) hosts
Class C -> 256 hosts

54
CIDR

Now we use Classless Interdomain Routing (CIDR,
RFC 1519) to avoid that limitation
Any subnet of the form a.b.c.d/x can be used
The x is called the prefix or network prefix
Outside of the network (subnet), only the prefix
is used for routing
The rest of the address defines hosts within the
network

Image from http://www.naturalandsustainable.com/category/hard-cider/
55
CIDR

So if a prefix is of the form a.b.c.d/21,
21 bits of the address are the prefix
The remaining 32 - 21 = 11 bits are unique to each
device within that subnet
Giving you room for 2^11 = 2048 hosts
The a.b.c.d part of the CIDR address can be
anything that fits within the prefix length in
binary
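
A quick way to check this arithmetic is Python's standard ipaddress module. The particular /21 block below is an assumption picked for illustration; any a.b.c.d whose last 11 bits are zero would do.

```python
import ipaddress

# Hypothetical /21 network: 21 prefix bits, 11 host bits.
net = ipaddress.ip_network("200.23.16.0/21")

host_bits = 32 - net.prefixlen        # 11
address_count = net.num_addresses     # 2**11 = 2048
mask = str(net.netmask)               # the /21 written as a dotted mask

inside = ipaddress.ip_address("200.23.23.255") in net
outside = ipaddress.ip_address("200.23.24.0") in net
```

The module refuses prefixes such as 200.23.17.0/21 unless strict=False is passed, because bits beyond the prefix length are set; that is the "anything that fits within the prefix length" condition.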

56
Broadcast Address

The IP broadcast address is a special IP address,
255.255.255.255 (or all ones,
11111111.11111111.11111111.11111111)
When the destination address is that value, the
message goes to all hosts within the subnet
Routers usually won't forward these messages, but
might

57
Obtaining IP Addresses

Typically an ISP gets a block of IP addresses,
and assigns them to customers
E.g. the ISP might get 200.23.16.0/20, which it
breaks down into smaller subnets for each
customer: 200.23.16.0/23 for one, 200.23.18.0/23
for another, etc.
That way, routing knows anything starting with
200.23.16.0/20 goes to that ISP, and the ISP
routes it more specifically to each customer,
who then routes it to each specific host
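
The ISP example can be sketched with Python's standard ipaddress module. The variable names and the sample destination address are assumptions for illustration.

```python
import ipaddress

isp_block = ipaddress.ip_network("200.23.16.0/20")

# Carve the ISP's /20 into /23 blocks, one per customer.
customers = list(isp_block.subnets(new_prefix=23))

def customer_for(address):
    """Return the customer block holding this address, or None if not ours."""
    ip = ipaddress.ip_address(address)
    if ip not in isp_block:        # the rest of the Internet only checks the /20
        return None
    for block in customers:        # the ISP then routes more specifically
        if ip in block:
            return block
    return None

first = str(customers[0])
second = str(customers[1])
owner = str(customer_for("200.23.19.7"))
```

The /20 leaves 3 extra prefix bits before /23, so the ISP can serve 2^3 = 8 customers of this size while advertising a single prefix to the outside world.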

58
Obtaining IP Addresses
The use of a prefix for multiple subnets is
called address or route aggregation, or route
summarization
59
Managing IP Addresses

While ideally it would be nice to have a unique
subnet for everything, in reality it gets
messier; many ISPs might have several subnet
ranges assigned to them
ICANN manages IP addresses, based on RFC 2050,
as well as managing domain names

60
Getting a Host IP Address

An organization assigns host addresses within its
subnet
Routers have IP addresses manually assigned
Hosts can be manually assigned, but usually use
Dynamic Host Configuration Protocol (DHCP)
DHCP sets the host IP address, the subnet mask,
defines the first-hop router (default gateway),
and local DNS server
DHCP is often known as a plug-and-play protocol,
because it makes network admin much easier!

61
DHCP

For example, an ISP can use DHCP to assign IP
addresses to dialup customers
Need fewer IP addresses than you have customers,
since all won't be online at once
Need to manage which IP addresses are in use, and
which are available to be assigned
DHCP is also handy for mobile clients, such as
connecting to Dragonfly

62
DHCP

Dynamic Host Configuration Protocol (DHCP) makes
our lives much easier
DHCP is client/server based
There must be at least one DHCP server to tell
everyone else what their IP addresses are
A router can act as a DHCP relay agent, so that
multiple subnets can share one DHCP server

63
DHCP

A new host on a subnet follows a four-step
process to get an address
DHCP server discovery: use a DHCP discover
message (using UDP, port 67) to the broadcast IP
of 255.255.255.255, with a source IP of all zeros
A relay agent will pass the message to the server
DHCP server offer(s): each DHCP server responds
with a DHCP offer message, including IP, network
mask, address lease time (TTL), etc.
Many offers can be received by a host

64
DHCP

DHCP request: the new host (client) chooses from
the offers, selects one, and sends a DHCP request
message to that server
DHCP ACK: the server responds with an ACK
message, and confirms the requested parameters
Once the client is connected with its assigned
IP, the lease can be renewed
One minor drawback is that an IP address can't be
kept between subnets, bad for mobile clients

65
Network Address Translation

Network Address Translation (NAT) allows local
networks to define IP addresses that are
invisible to the outside world
The NAT router looks like a device with one IP
address to the outside world, but usually uses
DHCP to assign IP addresses from private
networks to local devices
It doesn't have to use private networks; you
could use publicly visible IP addresses

66
Private networks

NAT typically uses prefixes reserved for private
networks, per RFC 1918
The Internet Assigned Numbers Authority (IANA)
has reserved the following three blocks of the IP
address space for private internets
10.0.0.0/8
172.16.0.0/12
192.168.0.0/16

67
Network Address Translation
68
Network Address Translation

The NAT router keeps a translation table
Destination address and port number
Source local host IP AND port number
Hence NAT has to change the addressing of every
datagram going in or out of the network!
Some purists object to this, because it
interferes with host-to-host communication
Need workarounds for P2P applications
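
The translation table can be sketched as a pair of Python dictionaries. This is a simplified model under stated assumptions: the class name, the starting port 5001, and the sample addresses are all made up, and a real NAT also tracks the protocol and expires idle entries.

```python
import itertools

class NatRouter:
    """Toy NAT: rewrites source IP/port outbound, destination IP/port inbound."""

    def __init__(self, public_ip, first_port=5001):
        self.public_ip = public_ip
        self._ports = itertools.count(first_port)
        self.out_map = {}   # (local IP, local port) -> public port
        self.in_map = {}    # public port -> (local IP, local port)

    def outbound(self, src_ip, src_port, dst_ip, dst_port):
        key = (src_ip, src_port)
        if key not in self.out_map:
            port = next(self._ports)
            self.out_map[key] = port
            self.in_map[port] = key
        return (self.public_ip, self.out_map[key], dst_ip, dst_port)

    def inbound(self, src_ip, src_port, dst_port):
        local = self.in_map.get(dst_port)
        if local is None:
            return None     # no table entry: unsolicited traffic is dropped
        return (src_ip, src_port, local[0], local[1])

nat = NatRouter("138.76.29.7")
sent = nat.outbound("10.0.0.1", 3345, "128.119.40.186", 80)
reply = nat.inbound("128.119.40.186", 80, dst_port=sent[1])
stray = nat.inbound("128.119.40.186", 80, dst_port=6000)
```

The dropped stray packet is the reason P2P applications need workarounds: an outside peer cannot start a conversation unless a table entry already exists.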

69
UPnP

Peer to peer applications need an easy way to
cross a NAT router (NAT traversal)
Universal Plug and Play (UPnP) does that, for
either TCP or UDP packets

70
ICMP

ICMP is an old (1981) protocol (RFC 792) to
communicate error messages across the network
layer
E.g. Destination network unreachable
ICMP sits a nudge above IP, since ICMP messages are
carried inside IP datagrams, in place of a TCP or
UDP segment
ICMP messages have a type and code field (p.
364), plus the header and first 8 bytes of the
offending IP datagram

71
ICMP Ping

ICMP messages also convey other kinds of
information, such as congestion control, bad IP
header data, TTL expired, etc.
Ping uses an ICMP message type 8, code 0, which
is an echo request
The reply should be type 0, code 0, echo reply

72
Traceroute

Traceroute sends UDP segments with bad port
numbers and successive TTL (1, then 2, then 3,
etc.) and times each datagram
When a datagram's TTL expires, an ICMP warning
message (type 11, code 0, TTL expired) is sent back
from that router, which gives the round trip time
(RTT) and the router's information

73
Traceroute

When a datagram gets to the other host, the UDP
segment has a weird port number, which prompts an
ICMP message of type 3, code 3, destination port
unreachable
That tells traceroute the other host has been
reached, so no more datagrams are needed
Sneaky!

74
ICMP and Firewalls

Firewalls typically inspect the headers of
packets to look for threatening contents
Pings coming from outside your network can map IP
addresses, for example
Port scans can look for open ports
An Intrusion Detection System (IDS) goes further
by looking at packet contents (data), and
comparing them to known attacks

75
IPv6

The IETF realized that the Internet would run out
of IP address space, and CIDR, NAT, and DHCP
aren't enough to save it
By 1996, 100% of Class A addresses were used, 62%
of Class B addresses, and 37% of Class C
IPv6 was first called IPng (next generation)
IPv6 is defined by RFC 2460
What's different from IPv4?

76
IPv6 Datagram

The IP addresses went from 32 to 128 bits
2^128 is about
340,282,366,920,938,463,463,374,607,431,770,000,000
Really, we won't run out of IP addresses. Ever.
In contrast, the number of cells in 6 billion
people is about 6E9 x 1E12 = 6E21, a factor of 56
million billion under the 3.4E38 possible
addresses

77
IPv6 Datagram

Adds an anycast address type, which can go to
any in a group of hosts
Header is a fixed 40 bytes (2 x 4 B + 2 x 16 B)
Adds flow labeling and priority, where a flow is
a group of packets requiring special handling
(real time service, or paid priority enhancement)

78
IPv6 Datagram

IPv6 addresses can be written as 16 decimal byte
values, e.g.
128.91.45.157.220.40.0.0.0.0.252.87.212.200.31.255
but the standard notation is the hex equivalent in
eight colon-separated 16-bit groups,
805B:2D9D:DC28:0000:0000:FC57:D4C8:1FFF
There are lots of rules for abbreviating IPv6
addresses; most common is ::, which hides a
bunch of zeroes
Removes from IPv4:
Fragmentation, Header checksum, and Options
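
The conversion between the byte values and hex notation can be tried with Python's standard ipaddress module, which prints IPv6 addresses as colon-separated hex groups. The variable names are illustrative; the 16 byte values are the ones from this slide.

```python
import ipaddress

# The 16 byte values of the example address on this slide.
octets = [128, 91, 45, 157, 220, 40, 0, 0, 0, 0, 252, 87, 212, 200, 31, 255]

addr = ipaddress.IPv6Address(bytes(octets))

full = addr.exploded      # every group written out with four hex digits
short = addr.compressed   # the run of zero groups collapsed to "::"

address_space = 2 ** 128  # number of possible 128-bit addresses
```

Here full is the eight-group form and short replaces the two all-zero groups with a double colon; the abbreviation may be used only once per address so the result stays unambiguous.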

79
IPv6 Datagram

Specifically, IPv6 headers have the following
fields
IP version, now obviously a 6
Traffic class, similar to the TOS field
Flow label, an identifier for a given flow
Payload length: number of bytes in the data
Does not count the header, since that's a fixed
40 B
Next header is the protocol field from IPv4
Hop limit acts like the time-to-live (TTL) field
Source and destination addresses, are 128 bits
each
Then the data

80
ICMPv6

ICMP has been updated for new messages under IPv6
It also takes over the Internet Group Management
Protocol (IGMP), which we'll get to later; it
involves joining and leaving multicast groups

81
IPv4 versus IPv6

The transition from IPv4 to IPv6 is huge: tens
of millions of hosts and routers only speak IPv4
Three major approaches for making the transition
to v6
Flag day approach
Have everyone (in the whole world) update to v6
by a given specific day; only run v6 after that
day
Isn't logistically or financially possible

82
IPv4 versus IPv6

The dual stack approach means implement v4 and v6
at the same time, and switch back and forth as
needed
Every v6 node also runs v4; this is called an
IPv6/IPv4 node
Works, but often loses the benefit of v6 existing
Tunneling is also possible
Wherever a section of IPv4 links needs to be
crossed, package the IPv6 datagram in an IPv4
datagram
Then unwrap the v6 datagram when back in v6 land

83
IPv6 Adoption

The adoption of IPv6 has been slow, partly
because of CIDR, NAT, and DHCP
However large scale technology changes typically
take a long time
How many phone lines are optical yet?
Network protocols are very slow to change,
whereas apps are easy to change
IPv6 will probably be around a long time!

84
IP Security

IPv4 was designed in the 1970s, long before
anyone expected the Internet to be a public
medium and hence it has no security in it
IPsec was created to work with IPv4 or IPv6 and
add security to the network layer
It allows TCP and UDP traffic to take place in a
secure environment

85
IP Security

IPsec
Allows hosts to negotiate encryption protocols
Use that protocol to encrypt each datagram
Verify that the header and data retain their
integrity
Authenticate that a datagram came from a trusted source
This is covered more in chapter 8

86
Routing Algorithms

Mostly have focused on forwarding; now address
routing
Both datagram and VC networks need to perform
routing, i.e. find good paths between sender and
receiver
A host is typically attached to its default
router (first hop), which we'll call the source
router; similarly the destination has a
destination router

87
Routing Algorithms

A good route typically minimizes cost, but may
also avoid other concerns (e.g. ownership of
networks, privacy of data, etc.)
Use a graph to show routing problems, with N
nodes (routers) and E edges (links)
Assume the cost of each edge is given:
c(x,y) is the cost of the edge between nodes x and y,
and (x,y) is the edge between those nodes

88
Routing Algorithms

The cost of an edge not available is infinite
A path is defined by a sequence of nodes
$(x_1, x_2, x_3, \dots, x_n)$
The cost of a path is the sum of the edge costs
along it: $c(x_1,x_2) + c(x_2,x_3) + \dots + c(x_{n-1},x_n)$
Some path between nodes x and y is the least-cost
path
If all edges have the same cost, the shortest
path is also the least-cost path

89
Routing Algorithms

Two key ways to classify routing are
A global routing algorithm uses knowledge of the
entire network to calculate the best path
Also called link-state (LS) algorithms
A decentralized routing algorithm finds the least
cost path in an iterative decentralized manner;
no node has complete knowledge of the network
Only the local costs are known
The distance-vector (DV) algorithm is one example

90
Routing Algorithms

Another way to classify routing algorithms is
static vs dynamic
Static routing algorithms change slowly over
time, often by human intervention
Dynamic routing algorithms change to adjust for
traffic, topology, etc.
Can update periodically, or adjust for network
changes

91
Routing Algorithms

A third classification (!) is load-sensitive
versus load-insensitive algorithms
Does congestion change the routing?
In load-sensitive routing, a congested link gets a
high cost so traffic routes around it, but most
Internet algorithms are load-insensitive
So we have global vs. decentralized, static vs.
dynamic, and load-sensitive vs.
load-insensitive

92
Link-State Routing Algorithm

The LS algorithm uses complete knowledge of
network topology and link costs
The identity and cost of links for each router
are broadcast using a link-state broadcast, such
as the Internet's OSPF protocol
The actual routing is calculated using Dijkstra's
algorithm (named for Edsger Dijkstra)

93
Link-State Routing Algorithm

Dijkstra's algorithm is iterative, so that after
k iterations, the least-cost paths are known to k
destination nodes
The global routing algorithm initializes all
nodes, then does a loop as many times as you have
nodes in the network
Each loop adds the lowest cost node to N, the
list of nodes no longer under consideration,
until all nodes are in N

94
Dijkstra's Algorithm
95
Dijkstra's Algorithm

For example, the algorithm finds the cost to get
from u to w is first 5 (path uw), then 4 (uxw),
then 3 (uxyw), and can't improve on the cost of 3
When done, we have the lowest cost path from the
source to all other nodes
Complexity of this algorithm is the need to
search n(n+1)/2 nodes, which is O(n^2) (the
order of n squared)
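
A compact Python sketch of Dijkstra's algorithm follows. The graph is an assumption: edge costs were chosen to be consistent with the u-to-w example on this slide (5 via uw, 4 via uxw, 3 via uxyw), since the figure itself is not in the transcript. This version also uses a priority queue rather than the plain scan described above, which changes the running time but not the answer.

```python
import heapq

# Assumed six-node graph; costs agree with the u-to-w example on the slide.
GRAPH = {
    "u": {"v": 2, "x": 1, "w": 5},
    "v": {"u": 2, "x": 2, "w": 3},
    "x": {"u": 1, "v": 2, "w": 3, "y": 1},
    "w": {"u": 5, "v": 3, "x": 3, "y": 1, "z": 5},
    "y": {"x": 1, "w": 1, "z": 2},
    "z": {"w": 5, "y": 2},
}

def dijkstra(graph, source):
    """Return least costs from source and each node's predecessor on its path."""
    dist = {node: float("inf") for node in graph}
    prev = {}
    dist[source] = 0
    done = set()                      # nodes whose least cost is final
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        for nbr, cost in graph[node].items():
            if d + cost < dist[nbr]:  # found a cheaper way to reach nbr
                dist[nbr] = d + cost
                prev[nbr] = node
                heapq.heappush(heap, (dist[nbr], nbr))
    return dist, prev

def path(prev, source, dest):
    """Walk predecessors back from dest to rebuild the path as a string."""
    nodes = [dest]
    while nodes[-1] != source:
        nodes.append(prev[nodes[-1]])
    return "".join(reversed(nodes))

dist, prev = dijkstra(GRAPH, "u")
path_to_w = path(prev, "u", "w")
```

The first hop of each least-cost path is what ends up in the source router's forwarding table.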

96
Oscillations

If the cost of a path depends on the direction
through that path, algorithms can undergo
oscillations where the best path changes from
clockwise to counter-clockwise with each
iteration
To avoid this, don't run the algorithm on all
nodes at the same time
Or don't use load-based link costs

97
Distance-Vector (DV) Routing

The Distance-Vector Routing Algorithm is
iterative, asynchronous, and distributed
Nodes get data from directly attached neighbors,
and distribute the results to the neighbors
Assume we're going from node x to node y, and the
neighbors of x are nodes v
The Bellman-Ford equation gives us
$d_x(y) = \min_v \{ c(x,v) + d_v(y) \}$

98
Distance-Vector Routing

Say what?
Start at node x
For each neighbor v, find the cost to get from v
to y, which is $d_v(y)$
The cost from x to y through each neighbor is the
cost from x to v, plus the cost from v to y, or
$c(x,v) + d_v(y)$
The cheapest cost from x to y is the smallest
value of the previous bullet for any neighbor of x

99
Distance-Vector Routing

Cute parlor trick?
Actually this is the basis for forwarding tables!
For some destination y, the lowest cost path
goes through a particular neighbor v
The DV algorithm essentially follows the
Bellman-Ford equation
As each node gets cost data from its neighbors,
the cost to get anywhere in the network
approaches the ideal value $d_x(y)$
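
The repeated application of the Bellman-Ford equation can be sketched in Python. The three-node network and its costs are hypothetical, and this version updates all nodes in one loop inside a single process, whereas real DV nodes exchange vectors asynchronously as messages.

```python
INF = float("inf")

# Hypothetical three-node network: direct link costs c(a, b).
COST = {
    "x": {"y": 2, "z": 7},
    "y": {"x": 2, "z": 1},
    "z": {"x": 7, "y": 1},
}

def distance_vectors(cost):
    """Apply d_x(y) = min over neighbors v of c(x,v) + d_v(y) until stable."""
    nodes = list(cost)
    # Each node starts out knowing only the cost to itself and its neighbors.
    d = {a: {b: (0 if a == b else cost[a].get(b, INF)) for b in nodes}
         for a in nodes}
    changed = True
    while changed:                      # stop once the network is quiescent
        changed = False
        for x in nodes:
            for y in nodes:
                if x == y:
                    continue
                best = min(cost[x][v] + d[v][y] for v in cost[x])
                if best < d[x][y]:
                    d[x][y] = best
                    changed = True
    return d

d = distance_vectors(COST)
x_to_z = d["x"]["z"]   # 3 via y, cheaper than the direct link of cost 7
```

The neighbor v that achieves the minimum is the next hop x records in its forwarding table for destination y.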

100
Distance-Vector Routing

This depends on asynchronous data exchange among
nodes
And after all nodes have exchanged information,
the routing won't change (becomes quiescent)
until there's a change in link cost or a dead
link
Many protocols use some variation on this
approach, including ARPAnet, the Internet's RIP
and BGP protocols, Novell IPX, ISO IDRP, etc.

101
DV Changes

If the cost of a link decreases, updates to its
neighbors will generally occur peacefully
If a cost goes up, leftover incorrect information
can cause a routing loop (bounce back and forth
between nodes)
Large cost increases can result in thousands of
bounces before the problem corrects itself, hence
known as the count-to-infinity problem

102
DV Changes

Fix somewhat with the poisoned reverse
Pretend the cost to go backward on a link is
infinite, so it won't try to bounce back
But if the loop involves more than two nodes,
this doesn't help

103
Compare LS vs. DV Routing

Under LS, nodes talk to all other nodes, but
exchange costs of direct connections
Under DV, nodes only talk to neighbors, but
give cost estimates to all other nodes
Message complexity
LS sends cost changes to every node in the
network; DV only propagates a change when it
alters a node's least-cost path

104
Compare LS vs. DV Routing

Speed of convergence
LS converges with speed O(n^2); DV converges
slowly, and can suffer from routing loops and the
count-to-infinity problem
Robustness
If a node fails under LS, the rest of the network
is relatively unaffected (for routing); under DV,
a faulty router can mislead the rest of the
network
So both approaches have advantages

105
Other Routing Approaches

LS and DV are the only routing approaches widely
used in the Internet
Many others have been defined over the years
Network flow problems model the network as a big
equation to solve
Circuit-switched routing algorithms use
telephone-like logic to find the cheapest routes

106
Hierarchical Routing

LS and DV assume the network is a herd of
connected routers, all peers or equals
Scaling for LS routing is daunting for a huge
number of routers
Most administrators want autonomy to decide their
structure
What happens if there's structure to routers?
Organize routers into autonomous systems (AS)

107
Autonomous Systems (AS)

Under AS, groups of routers
Are under control of one administration authority
Use one routing protocol (LS or DV) within that
group, their intra-autonomous system routing
protocol
Connect to other groups via gateway routers
Routing information separates routing within the
AS from routing outside the AS
Need to know which outside addresses are best
reached from which gateway routers

108
Autonomous Systems (AS)
109
Autonomous Systems (AS)

In order for the AS to talk to each other, they
need to use the same inter-AS routing protocol,
called BGP4 for the Internet
BGP4 defines which subnets are reachable from
various gateway routers (assuming more than one
exists)
One common strategy is hot-potato routing, where
you send a packet to the gateway router that is
cheapest to reach
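
A minimal sketch of hot-potato routing. The gateway names, subnet labels, and costs are hypothetical; the idea is only that, among gateways that can reach a destination, a router picks the one with the lowest intra-AS cost.

```python
# Hypothetical intra-AS least costs from one router to each gateway router.
cost_to_gateway = {"gw_a": 4, "gw_b": 2, "gw_c": 7}

# Hypothetical inter-AS reachability: which gateways can reach which subnet.
gateways_for_subnet = {
    "subnet_1": ["gw_a", "gw_b"],
    "subnet_2": ["gw_a", "gw_c"],
}

def hot_potato(subnet):
    """Pick the gateway for this subnet that is cheapest to reach inside the AS."""
    return min(gateways_for_subnet[subnet], key=cost_to_gateway.__getitem__)

print(hot_potato("subnet_1"))  # gw_b
print(hot_potato("subnet_2"))  # gw_a
```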

110
Autonomous Systems (AS)

AS communicate with each other about new
destinations nearby
Large ISPs may set up dozens of AS just for
themselves; smaller ISPs might be one AS
Now look at two intra-AS routing protocols (RIP
and OSPF) and the inter-AS routing protocol BGP

111
RIP

The Routing Information Protocol (RIP) is an
older intra-AS routing protocol
Based on work by Xerox and part of the BSD Unix
distribution in 1982
RIP version 2 is defined by RFC 2453
Works based on the DV model
Cost is based on hop count; each link has cost = 1
Hop count is the number of subnets crossed to get
from source to destination

112
RIP

Max cost allowed in RIP is 15 hops
Routing updates are sent every 30 sec using RIP
response messages, also called advertisements
Each RIP router maintains a routing table
The routing table contains the destination
subnet, the next router to get there, and the
number of hops to that destination
Exchanging routing tables allows routers to find
the cheapest routes
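
A sketch of how a RIP-style router might merge a neighbor's advertisement into its routing table. The router and subnet names are hypothetical. Two assumptions go beyond the slides: a hop count of 16 is treated as "unreachable" (the usual RIP convention, since 15 is the maximum), and a router always accepts news from the neighbor it already routes through.

```python
RIP_INFINITY = 16  # assumption: 16 hops means unreachable, since the max is 15

def apply_advertisement(table, neighbor, advertised):
    """Merge a neighbor's {subnet: hops} advertisement into our table.

    table maps subnet -> (next_router, hops).
    """
    for subnet, hops in advertised.items():
        new_hops = min(hops + 1, RIP_INFINITY)  # one more hop to reach the neighbor
        current = table.get(subnet)
        # Accept a route that is new, cheaper, or from our current next router.
        if current is None or new_hops < current[1] or current[0] == neighbor:
            table[subnet] = (neighbor, new_hops)
    return table

table = {"w": ("A", 2), "z": ("B", 7)}
apply_advertisement(table, "A", {"w": 1, "x": 1, "z": 4})
print(table)
```

After the advertisement from A, the route to z switches from B (7 hops) to A (5 hops), and a new entry for x appears.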

113
RIP

If a neighboring router doesn't provide an update
for three minutes, it's assumed to be dead (rest
in peace?), and the routing table is adjusted
accordingly
RIP messages go over UDP using port 520
In Unix, the daemon routed (pronounced "route dee")
implements RIP

114
OSPF (think sunscreen?)

OSPF and its cousin, IS-IS, are widely used for
intra-AS routing
OSPF version 2 is defined by RFC 2328
IS-IS is defined by RFC 1195
OSPF uses LS routing, and creates a complete
topological map of the entire AS
Then it follows Dijkstra's algorithm to find the
shortest paths everywhere in the AS

OSPF = Open Shortest Path First, IS =
Intermediate System
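
A compact sketch of Dijkstra's algorithm run over a link-state map. The four-router topology and its costs are made up for illustration; this is not how any particular OSPF implementation stores its database.

```python
import heapq

def dijkstra(graph, source):
    """Return (dist, prev): least costs from source and each node's predecessor."""
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry; a cheaper path to u was already found
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, prev

# Hypothetical topology: router -> {neighbor: link cost}
graph = {
    "r1": {"r2": 1, "r3": 4},
    "r2": {"r1": 1, "r3": 2},
    "r3": {"r1": 4, "r2": 2, "r4": 1},
    "r4": {"r3": 1},
}

dist, prev = dijkstra(graph, "r1")
print(dist)  # least costs from r1: r2=1, r3=3, r4=4
```

Following `prev` backward from a destination gives the shortest path, and its first hop is what goes in the forwarding table.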
115
OSPF

Link cost can be 1 (just count hops) or weighted
inversely to the link's capacity (to put more
traffic where it can be handled well)
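
One way to weight cost inversely to capacity is to divide a reference bandwidth by the link bandwidth. The function name and the 100 Mbps reference value below are assumptions for illustration; administrators choose their own weights.

```python
def ospf_cost(link_bps, reference_bps=100_000_000):
    """Cost inversely proportional to capacity, never below 1."""
    return max(1, reference_bps // link_bps)

print(ospf_cost(10_000_000))     # 10 for a 10 Mbps link
print(ospf_cost(100_000_000))    # 1 for a 100 Mbps link
print(ospf_cost(1_000_000_000))  # 1 for a 1 Gbps link (floor of 1)
```

With this choice, any link faster than the reference bandwidth gets the same cost of 1, so the reference value has to be picked with the fastest links in mind.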

116
OSPF

All routers in the AS broadcast state information
to all other routers
1) when there's a change in link cost or status,
or
2) every 30 minutes to say they're alive
OSPF messages are carried straight over IP

117
OSPF

OSPF advantages include
Security: exchanges between OSPF routers can be
authenticated, either by simple password or by
MD5 hashing
Use multiple paths that are the same cost
Also handles multicast (MOSPF)
Allows creation of hierarchy within the AS
Defines areas, which connect to the AS boundary
routers through area border routers and maybe
backbone routers

118
OSPF Internal Hierarchy
119
BGP

So, RIP or OSPF can be used for routing within an
AS
But when the path between source and destination
hosts crosses more than one AS, we need BGP, the
Border Gateway Protocol (currently BGP4)
BGP gives AS the means to
Get subnet info from neighboring AS
Propagate that info to routers within the AS
Find good routes to subnets

120
BGP

BGP is massively complex
BGP uses semi-permanent TCP connections (using
port 179) between routers that connect AS, and
between routers within an AS
Connections between AS are external BGP (eBGP)
Connections within an AS are internal BGP (iBGP)

121
BGP

Which destinations are reachable through a
neighboring AS is expressed using CIDR prefixes,
e.g. 138.67.16/24
Each AS is identified by an ASN (AS number)
ASNs are assigned by ICANN regional registries;
see RFC 1930
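
A quick look at what a CIDR prefix like the one above covers, using Python's standard `ipaddress` module. The shorthand 138.67.16/24 is written out in full as 138.67.16.0/24; the two test addresses are arbitrary.

```python
import ipaddress

prefix = ipaddress.ip_network("138.67.16.0/24")

print(prefix.num_addresses)                             # 256
print(ipaddress.ip_address("138.67.16.200") in prefix)  # True
print(ipaddress.ip_address("138.67.17.1") in prefix)    # False
```

The /24 means the first 24 bits are fixed, so one advertised prefix stands for all 256 addresses that share them.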

122
BGP

BGP peers (routers) advertise routes to each
other
Routes consist of a prefix and BGP attributes
BGP learns all possible routes, then follows a
set of rules to determine which to keep
Policies are established to determine what kind
of routes are allowed, not just possible
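
A much-simplified sketch of route selection. The slides do not list the actual rules, so the ordering here is an assumption based on commonly described BGP behavior: apply policy first, then prefer the highest local preference, then the shortest AS path, then the closest next hop. The `Route` class, field names, and AS numbers are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Route:
    prefix: str
    as_path: tuple          # AS numbers the advertisement has passed through
    next_hop: str
    local_pref: int = 100   # policy knob set by the administrator
    cost_to_next_hop: int = 0

def allowed(route, blocked_asns):
    """Policy check: reject routes that pass through a blocked AS."""
    return not any(asn in blocked_asns for asn in route.as_path)

def best_route(routes, blocked_asns=frozenset()):
    """Pick one route from those that policy allows, or None if none remain."""
    candidates = [r for r in routes if allowed(r, blocked_asns)]
    if not candidates:
        return None
    return min(candidates,
               key=lambda r: (-r.local_pref, len(r.as_path), r.cost_to_next_hop))

routes = [
    Route("138.67.16.0/24", (65001, 65002), "gw_a"),
    Route("138.67.16.0/24", (65003,), "gw_b"),
    Route("138.67.16.0/24", (65004, 65005, 65006), "gw_c", local_pref=200),
]

print(best_route(routes).next_hop)                        # gw_c
print(best_route(routes, blocked_asns={65004}).next_hop)  # gw_b
```

The second call shows the difference between a possible route and an allowed one: the gw_c route still exists, but policy removes it before selection.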

123
Broadcast and Multicast

So far everything has focused on one source and
one destination trying to communicate (unicast)
Broadcast routing sends a packet from a source to
all other nodes in the network
Multicast routing sends from a source node to
selected other network nodes

124
Broadcast Routing

A simple way to handle broadcasting is to make N
copies of a packet, and send one to each of the N
destination nodes (hosts)
This is N-way-unicast, since it really isn't a
broadcast method at all
Major disadvantages of this simple approach
It's really inefficient, and overloads the first
link
It's hard to know all target addresses, unless
you add on a broadcast membership protocol

125
Uncontrolled Flooding

A possible approach is for a node to send a
packet to its neighbors, who send it to their
neighbors, etc.
Massive problems include
Forwarding never ends if there are loops in the
network
Multiple interconnections result in a broadcast
storm when a node gets e.g. three messages to
broadcast to all its neighbors, who get
multiple broadcast messages, and so on

126
Controlled Flooding

Try flooding, but with more logic to prevent a
broadcast storm
Several possible approaches
In sequence-number-controlled flooding, the source
adds its address and a broadcast sequence number
to the packet
Nodes check whether they have already received
this source and sequence number (e.g. broadcast
1254); if not, duplicate it and send to neighbors
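
A small simulation of sequence-number-controlled flooding. The four-node topology (which contains loops) and the function name are assumptions for illustration.

```python
from collections import deque

def flood(graph, source, seq):
    """Flood one broadcast; return (nodes that received it, transmissions)."""
    seen = {n: set() for n in graph}   # (source, seq) pairs each node has seen
    seen[source].add((source, seq))
    transmissions = 0
    queue = deque((source, nbr) for nbr in graph[source])
    while queue:
        sender, node = queue.popleft()
        transmissions += 1
        if (source, seq) in seen[node]:
            continue                    # duplicate: drop it
        seen[node].add((source, seq))
        for nbr in graph[node]:
            if nbr != sender:           # do not send back where it came from
                queue.append((node, nbr))
    delivered = {n for n in graph if (source, seq) in seen[n]}
    return delivered, transmissions

# Hypothetical topology with loops: node -> list of neighbors
graph = {
    "a": ["b", "c"],
    "b": ["a", "c", "d"],
    "c": ["a", "b", "d"],
    "d": ["b", "c"],
}

delivered, transmissions = flood(graph, "a", 1254)
print(sorted(delivered), transmissions)  # ['a', 'b', 'c', 'd'] 7
```

The flood terminates even though the graph has loops, but 7 transmissions reach only 3 new nodes, so some copies are still wasted.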

127
Controlled Flooding

Reverse path forwarding (RPF) or reverse path
broadcasting (RPB) is subtle
When a packet is received, send it out on all
other links ONLY IF it arrived on the link that is
on the node's own shortest unicast path back to
the source
Otherwise, throw it out
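
A sketch of reverse path forwarding on a made-up topology. It assumes every link has cost 1, so a breadth-first search from the source stands in for each node's shortest unicast path; a real router would consult its unicast routing table instead.

```python
from collections import deque

def next_hop_to_source(graph, source):
    """BFS parents: each node's neighbor on a shortest path back to source."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return parent

def rpf_broadcast(graph, source):
    """Return (accepted, dropped) packet counts for one broadcast."""
    parent = next_hop_to_source(graph, source)
    accepted = dropped = 0
    queue = deque((source, nbr) for nbr in graph[source])
    while queue:
        sender, node = queue.popleft()
        if parent[node] == sender:      # arrived on the reverse shortest path
            accepted += 1
            for nbr in graph[node]:
                if nbr != sender:
                    queue.append((node, nbr))
        else:
            dropped += 1                # arrived on any other link: discard
    return accepted, dropped

graph = {
    "a": ["b", "c"],
    "b": ["a", "c", "d"],
    "c": ["a", "b", "d"],
    "d": ["b", "c"],
}

print(rpf_broadcast(graph, "a"))  # (3, 4)
```

Each of the three non-source nodes accepts the packet exactly once. Unlike sequence-number flooding, no node has to remember which broadcasts it has seen.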

128
Spanning-Tree Broadcast

While the controlled flooding approaches do avoid
a broadcast storm, they can still send duplicate
packets
A spanning tree connects all the nodes in a
network and contains no loops
One that has minimum cost is a minimum spanning
tree
Hence a possible broadcast approach is to
construct a minimum spanning tree and use it

129
Spanning-Tree Broadcast

Once defined, the spanning tree can be used to
initiate a broadcast from any node
Each node only knows which adjacent nodes are
part of the tree
Many algorithms can be used to create spanning
trees
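
A sketch of one such algorithm (Prim's) building a minimum spanning tree, followed by a broadcast that only uses tree links. The topology and costs are made up, and the slides do not say which algorithm is used in practice.

```python
import heapq

def minimum_spanning_tree(graph, start):
    """Prim's algorithm; returns the tree as node -> list of tree neighbors."""
    in_tree = {start}
    tree = {n: [] for n in graph}
    heap = [(w, start, v) for v, w in graph[start].items()]
    heapq.heapify(heap)
    while heap and len(in_tree) < len(graph):
        w, u, v = heapq.heappop(heap)
        if v in in_tree:
            continue
        in_tree.add(v)
        tree[u].append(v)
        tree[v].append(u)
        for nxt, w2 in graph[v].items():
            if nxt not in in_tree:
                heapq.heappush(heap, (w2, v, nxt))
    return tree

def tree_broadcast(tree, source):
    """Forward along tree links only; return the number of transmissions."""
    transmissions = 0
    stack = [(None, source)]
    while stack:
        sender, node = stack.pop()
        for nbr in tree[node]:
            if nbr != sender:
                transmissions += 1
                stack.append((node, nbr))
    return transmissions

# Hypothetical topology: node -> {neighbor: link cost}
graph = {
    "a": {"b": 1, "c": 4},
    "b": {"a": 1, "c": 2, "d": 5},
    "c": {"a": 4, "b": 2, "d": 1},
    "d": {"b": 5, "c": 1},
}

tree = minimum_spanning_tree(graph, "a")
print(tree)
print([tree_broadcast(tree, n) for n in graph])  # [3, 3, 3, 3]
```

Whichever node starts the broadcast, exactly N - 1 = 3 transmissions are needed and no node receives a duplicate.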

130
Reality v Broadcast Algorithms

Broadcast algorithms are used at the application
and network layers
Gnutella uses app-layer broadcasting, with a
time-to-live hop number countdown to give
limited-scope flooding
OSPF and IS-IS use sequence-number-controlled
flooding to broadcast link-state advertisements
(LSAs)
Sequence number and age data are used by OSPF to
tell old LSAs from newer ones

131
Multicast

Multicast sends a packet only to select nodes in
a network
There also may be more than one sender
Examples of uses include
Bulk software upgrades
Streaming media to a class or meeting
Shared apps like teleconferencing
Data feeds (stock prices)
Interactive gaming

132
Multicast

Key problems are
How to identify the receivers of the message
How to address those receivers
In unicast, the IP address of the recipient was
enough; but now, does every packet carry the list
of all recipients?
The addressing could be larger than the message
Solve using address indirection

133
Multicast

Address indirection uses a single identifier
(here, a class D multicast address) for the group
of receivers, and addresses the packet only with
that single identifier
The receivers sharing that identifier are a
multicast group
So how do we manage this multicast group? Create
an RFC! (duh!)
Internet Group Management Protocol
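
Class D addresses are the IPv4 range 224.0.0.0/4. A short check with Python's standard `ipaddress` module; the two sample addresses are arbitrary.

```python
import ipaddress

class_d = ipaddress.ip_network("224.0.0.0/4")

group = ipaddress.ip_address("239.1.2.3")
host = ipaddress.ip_address("192.0.2.10")

print(group in class_d, group.is_multicast)  # True True
print(host in class_d, host.is_multicast)    # False False
```

A sender puts the group address in the destination field, and never needs to list the individual receivers.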

134
IGMP

The Internet Group Management Protocol (IGMP),
version 3, RFC 3376, works only between a
first-hop (gateway) router and the hosts on its
LAN
IGMP allows a host to tell the router that a
hosted app wants to join a multicast group
Then the router communicates to other routers
using a network-layer multicast routing
algorithm, e.g. PIM, DVMRP, or MOSPF
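
From an application's point of view, joining a group is a socket option. The sketch below uses the standard `socket` and `struct` modules; the group address and port are hypothetical, and `join_group` is defined but not called because it needs a live network interface. On typical systems the operating system responds to the join by sending the IGMP membership report, which is an assumption about the host, not something this code shows.

```python
import socket
import struct

def build_mreq(group, interface="0.0.0.0"):
    """Pack the membership request: group address + local interface address."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))

def join_group(group, port):
    """Open a UDP socket and ask the OS to join the multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, build_mreq(group))
    return sock

mreq = build_mreq("239.1.2.3")
print(len(mreq))  # 8 bytes: two IPv4 addresses

# Usage (needs a multicast-capable interface):
# sock = join_group("239.1.2.3", 5007)
# data, sender = sock.recvfrom(1500)
```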

135
IGMP

IGMP only has three message types, carried in an
IP datagram
Membership_query is sent by the router to find
all groups joined by hosts on that interface, or
to determine if a particular group has been joined
Membership_report is sent by the hosts to reply
to a query, or to tell the router when a group
has first been joined

136
IGMP

Leave_group message is oddly optional; a host can
leave a group by not responding to queries
So joining a multicast group is based on receiver
host action: sending a membership_report to the
router
This means the sender doesn't control membership,
and doesn't add new receivers to the group

137
Multicast Routing

Multicast routing algorithms need to ensure that
all routers with hosts in the group get the
desired packets
Other routers might have to get them too, but
avoid that where possible
Two major approaches are used for multicast
routing
Using a group-shared tree
Using a source-based tree

138
Using a group-shared tree

Like the spanning-tree algorithm, build a tree
that includes all edge routers with hosts in the
group
Uses a single tree to allow sending from any
sender; kind of a global approach
A central node is used to coordinate the process,
so new routers send messages to it to get added
to the tree
Also called a center-based tree approach
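
A sketch of how a router might join a center-based tree: its join message travels toward the center and stops at the first router already on the tree. The router names and next-hop table are hypothetical.

```python
def join(tree, next_hop_to_center, router):
    """Graft router onto the tree; return the links the join message added."""
    added = []
    node = router
    while node not in tree:
        tree.add(node)
        added.append((node, next_hop_to_center[node]))
        node = next_hop_to_center[node]
    return added

# Hypothetical unicast next hops toward the center node.
next_hop_to_center = {"r1": "r2", "r2": "center", "r3": "r2"}

tree = {"center"}                              # the tree starts as just the center
print(join(tree, next_hop_to_center, "r1"))    # [('r1', 'r2'), ('r2', 'center')]
print(join(tree, next_hop_to_center, "r3"))    # [('r3', 'r2')]
```

The second join stops at r2 because r2 is already on the tree, so the message never has to reach the center.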

139
Using a source-based tree

Focuses on making a separate routing tree for
each specific source (sender)
Uses the RPF (reverse path forwarding) algorithm,
tweaked for multicast
Can result in thousands of unwanted packets to
routers with no group members
Routers that get unwanted packets send a pruning
message to the router upstream from them
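
A sketch of the effect of pruning on a source-based tree. The tree shape and router names are hypothetical; the function works out which routers stay on the tree once every router with no group members below it has pruned itself.

```python
def prune(children, root, has_members):
    """Return the set of routers left on the tree after pruning."""
    keep = set()

    def visit(node):
        needed = has_members.get(node, False)
        for child in children.get(node, []):
            if visit(child):            # a child that stays keeps its parent too
                needed = True
        if needed:
            keep.add(node)
        return needed

    visit(root)
    keep.add(root)                      # the source router always stays
    return keep

# Hypothetical RPF tree rooted at the source's router: parent -> children
children = {"s": ["r1", "r2"], "r1": ["r3", "r4"], "r2": ["r5"]}
has_members = {"r3": True}              # only r3 has attached group members

print(sorted(prune(children, "s", has_members)))  # ['r1', 'r3', 's']
```

Here r4 and r5 prune themselves, and r2 follows because everything downstream of it has pruned.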

140
Multicast in the Internet

The first multicast routing protocol used in the
Internet was the Distance-Vector Multicast
Routing Protocol (DVMRP, RFC 1075)
Uses source-based trees with RPF and pruning
Uses a DV algorithm to find the shortest path to
the source
Also monitors downstream dependent routers
Has graft messages to, yes, undo a pruning

141
Multicast in the Internet

The Protocol-Independent Multicast (PIM, RFC
3973) routing protocol is widely used
Uses dense or sparse modes, depending on the
density of routers with group member hosts
Dense mode uses flood-and-prune RPF
Sparse mode uses center-based tree, like the
core-based tree (CBT) protocol
Can switch from group-shared tree to source-based
tree after joining

142
Multicast in the Internet

PIM sparse domains can be joined at rendezvous
points using Multicast Source Discovery Protocol
(MSDP, RFC 4611)
A third option for multicast is Source-Specific
Multicast (SSM, RFC 4607)
Under SSM only one host can send traffic into
the multicast tree, which makes defining the
tree a lot easier

143
Multicast in the Internet

BGP can also support multicast (RFC 4271)
RFC 5110 is good for more discussion of multicast
routing
Increasingly multicast is being handled at the
application layer, such as End System Multicast
(ESM) from Carnegie Mellon

144
Multicast Babel?

So far we've assumed all routers use the same
multicast protocol
Within an AS this should be true
But different AS could run different protocols
RFC 2715 defines interoperability rules for
multicast routing protocols to play nicely with
each other
DVMRP is the de facto standard, but PIM and BGP
are also viable

145
Are We Dead Yet?