Title: A Hierarchical IPv4 Framework
1A Hierarchical IPv4 Framework
- Patrick Frejborg
- pfrejborg_at_gmail.com
- 24 Feb 2009
2Why hIPv4?
- Addressing RFC 4984
- It is commonly recognized that today's Internet
routing and addressing system is facing serious
scaling problems. The ever-increasing user
population, as well as multiple other factors
including multi-homing, traffic engineering, and
policy routing, have been driving the growth of
the Default Free Zone (DFZ) routing table size at
an increasing and potentially alarming rate.
While it has been long recognized that the
existing routing architecture may have serious
scalability problems, effective solutions have
yet to be identified, developed, and deployed.
3Influence sources
- The Locator ID Separation Protocol (LISP) development work at IRTF
- MPLS solutions, mainly the shim header that made it possible to create new services on top of an IP backbone
- Anycast Rendezvous Point (RP) with Multicast Source Discovery Protocol (MSDP)
- IPv6 installations at enterprises
  - Why would enterprises migrate to IPv6, and what will they gain?
  - A bigger migration project than Y2K, for what reason?
  - Applications have to be ported to IPv6, a lot of work to be done. Who will sponsor it?
  - Shortage of IPv4 addresses is not an enterprise's problem; it will use NAT instead!
- PSTN architecture
  - Haven't seen or heard that the PSTN will soon run out of decimal numbers and that we have to migrate to hexadecimal keypads, have you?
  - Not aware of scalability issues with SS7 either: hidden prefixes are used between PSTN switches to solve routing issues
4So, what if
- What if we borrow concepts from existing solutions and glue them together?
  - Basic ideas and goals in LISP are definitely interesting, especially the Routing Locator (RLOC) and Endpoint ID (EID) concept
  - MPLS forwarding and shim header concept
  - Anycast RP
  - Numbering architecture from the PSTN, i.e. country and national destination code concepts are ported to the IPv4 world: an Internet "country" is an Autonomous System or an area of a service provider!
- The trade-off is
  - New hardware is needed at some spots in the Internet
  - Minor software upgrade for Internet routers
  - Extensions are needed for DNS and DHCP
  - Extension to the current IPv4 stack at hosts, but most applications continue to use the IPv4 socket API (stream and datagram sockets)
  - Raw socket applications need to be enhanced
5Some basic rules (1)
- Allocate a globally unique IPv4 block for RLOC allocations, hereafter called the Global RLOC Block (GRB)
- Assign one RLOC for each Autonomous System (AS) or service provider; this AS or service provider area is called an RLOC realm
- Only GRB prefixes are exchanged between RLOC realms
- A multihomed enterprise with an AS number will have an RLOC assigned and thus is an RLOC realm
- Regional Internet Registries will allocate Provider Independent (PI) IP addresses for enterprises, both single-homed and multihomed. This assignment is unique in the country/countries where the IP block is deployed
- Residential/consumer customers will use Provider Aggregatable (PA) IP addresses
6Some basic rules (2)
- Introduce extensions to current protocols
  - DNS: add an RLOC record for each host
  - DHCP: add an RLOC option for a scope
  - Current IGP and BGP are still valid routing protocols
- Define a shim header that contains RLOC and EID information. The new shim header is called a LISP header
- When the LISP header is inserted into an IPv4 datagram, the new header combination is called a hIPv4 header
- Introduce new functionalities; routing is still done upon the IPv4 forwarding plane
  - LISP Switch Router (LSR): in a certain situation the LSR shall swap the IPv4 and LISP headers
  - The RLOC identifier is configured as an Anycast address on one or several LSRs within an RLOC realm
  - Intermediate routers need to support hIPv4 in the control plane in order to reply to ICMP requests
7Outcome, when hIPv4 is fully implemented
- Gaining several recyclable IPv4 address blocks
  - Allocations of PI blocks are unique within a country or countries of deployment
  - PA addresses are only locally significant within the RLOC realm
- Creating hierarchy at the control plane
  - Only GRB prefixes are announced between RLOC realms
  - Multihomed enterprises will only advertise their assigned RLOC to the service providers
  - Single-homed PI addresses are installed in the RIB of the local RLOC realm
  - PA addresses are installed in the RIB of the local RLOC realm
  - The current size of the Default Free Zone (DFZ) RIB is decreased
- No or minor changes to the current DFZ topology
  - Neither new signaling protocols nor an overlay topology are introduced; instead, AS destination based routing with IPv4 as the forwarding plane!
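
The rule that only GRB prefixes cross the boundary between RLOC realms can be pictured as a simple export filter. The following is a minimal sketch, not part of the framework itself: the GRB value `172.16.0.0/24` is an assumption chosen only because it covers the RLOCs used in the example diagrams, and `exportable_prefixes` and the sample RIB entries are hypothetical.

```python
import ipaddress

# Assumption: a stand-in Global RLOC Block (GRB); the slides do not name one.
GRB = ipaddress.ip_network("172.16.0.0/24")


def exportable_prefixes(rib, grb=GRB):
    """Return the prefixes an RLOC realm would announce to other realms.

    Only prefixes that fall inside the GRB are announced; PI and PA
    prefixes stay in the RIB of the local RLOC realm.
    """
    announced = []
    for prefix in rib:
        if ipaddress.ip_network(prefix).subnet_of(grb):
            announced.append(prefix)
    return announced


# Hypothetical local RIB of one RLOC realm.
local_rib = [
    "172.16.0.3/32",  # RLOC assigned to this realm (inside the GRB)
    "10.1.1.0/24",    # PA block, locally significant only
    "10.9.0.0/16",    # single-homed PI block, kept in the local RIB
]

print(exportable_prefixes(local_rib))
```

With this input only the RLOC entry survives the filter, which is the hierarchy the slide describes: the DFZ carries RLOCs, and everything else stays local.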
8Life of a hIPv4 connection
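9Client -> Server
[Diagram: addresses seen at the IPv4 API, in the IPv4 header and in the LISP header on the path from the client to the server. S = source, D = destination, R = RLOC, E = EID.]
- DNS query for www.foo.com returns A-record 10.2.2.2, RLOC 172.16.0.5
- IPv4 API: S 10.1.1.1, D 10.2.2.2
- Before the SWAP: IPv4 header S 10.1.1.1, D 172.16.0.5; LISP header R 172.16.0.3, E 10.2.2.2
- After the SWAP: IPv4 header S 172.16.0.3, D 10.2.2.2; LISP header R 172.16.0.5, E 10.1.1.1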
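10Server -> Client
[Diagram: addresses seen at the IPv4 API, in the IPv4 header and in the LISP header on the return path from the server to the client. S = source, D = destination, R = RLOC, E = EID.]
- IPv4 API: S 10.2.2.2, D 10.1.1.1
- Before the SWAP: IPv4 header S 10.2.2.2, D 172.16.0.3; LISP header R 172.16.0.5, E 10.1.1.1
- After the SWAP: IPv4 header S 172.16.0.5, D 10.1.1.1; LISP header R 172.16.0.3, E 10.2.2.2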
11The hIPv4 header
- Version 4 is still valid, but new protocol IDs are needed for current IPv4 protocols (ICMP, IGMP, TCP, UDP, IP in IP, GRE, ESP, AH etc.) in order for the stack to identify whether an IPv4 or a hIPv4 header is applied
- Forwarding network devices will calculate the IPv4 header checksum at each hop
- Hosts shall calculate the TCP and UDP pseudoheader checksum including the RLOC and EID values
- Since the remote LSR will swap the IPv4 and LISP headers, the TCP checksum will be bogus, unless
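
To make the pseudoheader rule concrete, here is a minimal sketch of a TCP checksum that covers the RLOC and EID values. The slides do not define the pseudoheader layout, so the one used here (source, destination, RLOC, EID, zero byte, protocol, length) is an assumption, as are the function names. The sum itself is the standard 16-bit one's-complement Internet checksum.

```python
import ipaddress
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum as used by the Internet checksum."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return total


def hipv4_tcp_checksum(src, dst, rloc, eid, segment: bytes) -> int:
    """Checksum over an ASSUMED hIPv4 pseudoheader plus the TCP segment.

    `segment` is the TCP header and payload with its checksum field zeroed.
    """
    pseudo = b"".join(
        ipaddress.IPv4Address(addr).packed for addr in (src, dst, rloc, eid)
    )
    # 6 is today's TCP protocol number; the slides call for new protocol
    # IDs for hIPv4, which are not specified, so 6 is only a placeholder.
    pseudo += struct.pack("!BBH", 0, 6, len(segment))
    return ~ones_complement_sum(pseudo + segment) & 0xFFFF


segment = bytes(20)  # dummy 20-byte TCP header, checksum field zeroed

before_swap = hipv4_tcp_checksum(
    "10.1.1.1", "172.16.0.5", "172.16.0.3", "10.2.2.2", segment)
after_swap = hipv4_tcp_checksum(
    "172.16.0.3", "10.2.2.2", "172.16.0.5", "10.1.1.1", segment)
print(before_swap == after_swap)
```

Under this assumed layout the swap only reorders the same four addresses, and a one's-complement sum does not depend on the order of its words, so both calls give the same value. Whether the framework relies on that property is not stated in the slides.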
12LSR functionality
- The assigned RLOC shall be configured as an Anycast address and announced to the Internet
- When the destination address in the IPv4 header of the hIPv4 packet is equal to the RLOC at the remote LSR, then
  - verify the IP and TCP/UDP checksums, including the RLOC and EID values in the pseudoheader calculation
  - replace the source address in the IPv4 header with the RLOC address of the LISP header
  - replace the destination address in the IPv4 header with the EID address of the LISP header
  - replace the RLOC address in the LISP header with the original destination address of the IPv4 header
  - replace the EID address in the LISP header with the original source address of the IPv4 header
  - decrease the TTL by one
  - calculate the IP and TCP/UDP checksums, including the RLOC and EID values in the pseudoheader calculation
  - forward the datagram upon the destination address of the IPv4 header
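
The swap performed by the LSR can be sketched in a few lines. This is an illustration under stated assumptions, not an implementation: `HIPv4Header` and `lsr_swap` are hypothetical names, addresses are kept as strings, the default TTL of 64 is arbitrary, and checksum verification and recalculation are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HIPv4Header:
    src: str       # IPv4 header source address
    dst: str       # IPv4 header destination address
    rloc: str      # LISP header RLOC field
    eid: str       # LISP header EID field
    ttl: int = 64  # arbitrary starting TTL for the example


def lsr_swap(hdr: HIPv4Header, local_rloc: str) -> HIPv4Header:
    """Swap IPv4 and LISP header addresses if the packet targets this RLOC."""
    if hdr.dst != local_rloc:
        return hdr  # not addressed to this realm's RLOC, so no swap
    # All four new values are taken from the ORIGINAL header, so the
    # fields are exchanged rather than overwritten one after another.
    return HIPv4Header(
        src=hdr.rloc,
        dst=hdr.eid,
        rloc=hdr.dst,
        eid=hdr.src,
        ttl=hdr.ttl - 1,
    )


# Values from the client-to-server example: client 10.1.1.1 in realm
# 172.16.0.3 contacting server 10.2.2.2 in realm 172.16.0.5.
before = HIPv4Header("10.1.1.1", "172.16.0.5", "172.16.0.3", "10.2.2.2")
after = lsr_swap(before, local_rloc="172.16.0.5")
print(after)
```

After the swap the header reads S 172.16.0.3, D 10.2.2.2, R 172.16.0.5, E 10.1.1.1, so the packet can be forwarded on the local destination address while the LISP header keeps what the server needs to reply.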
13The hIPv4 stack functionalities
- The IPv4 socket API still uses the tuples
- RLOC identifiers are provided by DHCP and DNS schemas
- The hIPv4 stack must assemble the outgoing datagram with
  - local IP address -> src IP address
  - remote IP address -> EID
  - local RLOC -> RLOC
  - remote RLOC -> dst IP address
- The hIPv4 stack must present the headers of the incoming datagram to the IPv4 socket API as
  - src IP address -> remote RLOC
  - dst IP address -> local IP address
  - RLOC -> local RLOC
  - EID -> remote IP address
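
The two mappings above can be written down directly. This is a minimal sketch assuming a four-field header; `Header`, `assemble_outgoing` and `present_incoming` are hypothetical names and the header is modeled as a plain tuple rather than a wire format.

```python
from typing import NamedTuple


class Header(NamedTuple):
    src: str   # IPv4 header source address
    dst: str   # IPv4 header destination address
    rloc: str  # LISP header RLOC field
    eid: str   # LISP header EID field


def assemble_outgoing(local_ip, remote_ip, local_rloc, remote_rloc) -> Header:
    """Map what the socket API and DNS/DHCP provide onto header fields."""
    return Header(src=local_ip, dst=remote_rloc, rloc=local_rloc, eid=remote_ip)


def present_incoming(hdr: Header) -> dict:
    """Map the fields of a received (already swapped) header back for the API."""
    return {
        "remote_rloc": hdr.src,
        "local_ip": hdr.dst,
        "local_rloc": hdr.rloc,
        "remote_ip": hdr.eid,
    }


# Client 10.1.1.1 (realm RLOC 172.16.0.3) sends to server 10.2.2.2,
# whose RLOC 172.16.0.5 was learned from DNS.
sent = assemble_outgoing("10.1.1.1", "10.2.2.2", "172.16.0.3", "172.16.0.5")
print(sent)

# The header as it arrives at the server after the swap at the remote LSR.
received = Header("172.16.0.3", "10.2.2.2", "172.16.0.5", "10.1.1.1")
print(present_incoming(received))
```

The socket API on the server still sees the pair 10.1.1.1 / 10.2.2.2 as remote and local IP address, while the remote RLOC is kept by the stack for the reply.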
14Considerations
15Src IP Dst IP considerations
- Since source and destination addresses are only locally significant within an RLOC realm, there is a slight chance that the source and destination addresses at the API will be the same when connections are established between RLOC realms.
- The connection is still unique, since two processes communicating over TCP form a logical connection that is uniquely identifiable by the tuples involved, that is, by the combination of <local_IP_address, local_port, remote_IP_address, remote_port>
16Src IP Dst IP considerations
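[Diagram: a client with address 10.2.2.2 in the RLOC realm 172.16.0.4 connects to a server that also has address 10.2.2.2. S = source, D = destination, R = RLOC, E = EID.]
- DNS query for www.foo.com returns A-record 10.2.2.2, RLOC 172.16.0.5
- IPv4 API: S 10.2.2.2, D 10.2.2.2
- Before the SWAP: IPv4 header S 10.2.2.2, D 172.16.0.5; LISP header R 172.16.0.4, E 10.2.2.2
- After the SWAP: IPv4 header S 172.16.0.4, D 10.2.2.2; LISP header R 172.16.0.5, E 10.2.2.2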
17Identical connection situation
- Since source and destination addresses are only locally significant within an RLOC realm, there is a slight chance that the source and destination addresses and the source ports at the API will be the same when connections are established from two clients residing in separate RLOC realms contacting a server in a third RLOC realm.
- A connection is unique since two processes communicating over TCP form a logical connection that is uniquely identifiable by the tuples involved, that is, by the combination of <local_IP_address, local_port, remote_IP_address, remote_port>
- But if the source ports of both clients have the same value, the connection is no longer unique!
- The solution is that the hIPv4 stack must accept only one unique connection upon RLOC information; the identical connection is not allowed and the client is informed by an ICMP notification
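
The acceptance rule for identical connections can be sketched as a connection table that remembers which remote RLOC owns each classic 4-tuple. This is one possible reading of the rule, not a specified algorithm: `ConnectionTable`, its method and the port numbers are hypothetical, and sending the ICMP notification is left to the caller.

```python
class ConnectionTable:
    """Accepts a 4-tuple only once; the remote RLOC identifies its owner."""

    def __init__(self):
        # (local_ip, local_port, remote_ip, remote_port) -> remote RLOC
        self._owner = {}

    def accept(self, local_ip, local_port, remote_ip, remote_port, remote_rloc):
        """Return True if the connection may proceed.

        False means an identical connection from another RLOC realm already
        exists; the caller would then inform the client via ICMP.
        """
        key = (local_ip, local_port, remote_ip, remote_port)
        owner = self._owner.setdefault(key, remote_rloc)
        return owner == remote_rloc


table = ConnectionTable()

# Two clients with the same address and source port in different realms.
first = table.accept("10.2.2.2", 80, "10.1.1.1", 40000, "172.16.0.3")
second = table.accept("10.2.2.2", 80, "10.1.1.1", 40000, "172.16.0.4")
print(first, second)
```

The first client is accepted and the second, which is indistinguishable at the socket API, is refused. A retry from a different source port would produce a new 4-tuple and be accepted.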
18Identical connection situation
[Diagram: two clients, both with address 10.1.1.1, in the RLOC realms 172.16.0.3 and 172.16.0.4, connect to the same server. S = source, D = destination, R = RLOC, E = EID.]
- DNS query for www.foo.com returns A-record 10.2.2.2, RLOC 172.16.0.5 for both clients
- IPv4 API at both clients: S 10.1.1.1, D 10.2.2.2
- Client in realm 172.16.0.3, before the SWAP: IPv4 header S 10.1.1.1, D 172.16.0.5; LISP header R 172.16.0.3, E 10.2.2.2
- Client in realm 172.16.0.3, after the SWAP: IPv4 header S 172.16.0.3, D 10.2.2.2; LISP header R 172.16.0.5, E 10.1.1.1
- Client in realm 172.16.0.4, before the SWAP: IPv4 header S 10.1.1.1, D 172.16.0.5; LISP header R 172.16.0.4, E 10.2.2.2
- Client in realm 172.16.0.4, after the SWAP: IPv4 header S 172.16.0.4, D 10.2.2.2; LISP header R 172.16.0.5, E 10.1.1.1
19Traceroute considerations
- The routers and devices in the path to the remote RLOC realm need to support ICMP extensions
- ICMP services are deployed in the control plane; the forwarding plane remains intact
  - That is, a software upgrade is needed for the control plane
- The hIPv4 ICMP extensions shall be compatible with RFC 4884
20Traceroute,1 (intra-AS)
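[Diagram: traceroute www.foo.com, reply from a router inside the local AS. S = source, D = destination, R = RLOC, E = EID; OIF as labeled in the original figure.]
- DNS: A-record 10.2.2.2, RLOC 172.16.0.5
- Probe: IPv4 header S 10.1.1.1, D 172.16.0.5; LISP header R 172.16.0.3, E 10.2.2.2
- Reply with ICMP extensions: IPv4 header S 172.16.0.3, D 10.1.1.1; LISP header R 172.16.0.5, E OIF
- IPv4 API: S 10.2.2.2, D 10.1.1.1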
21Traceroute,2 (inter-AS)
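[Diagram: traceroute www.foo.com, reply from a router in an intermediate AS with RLOC 172.16.0.1. S = source, D = destination, R = RLOC, E = EID; OIF as labeled in the original figure.]
- DNS: A-record 10.2.2.2, RLOC 172.16.0.5
- Probe: IPv4 header S 10.1.1.1, D 172.16.0.5; LISP header R 172.16.0.3, E 10.2.2.2
- Reply with ICMP extensions, before the SWAP: IPv4 header S OIF, D 172.16.0.3; LISP header R 172.16.0.1, E 10.1.1.1
- Reply with ICMP extensions, after the SWAP: IPv4 header S 172.16.0.1, D 10.1.1.1; LISP header R 172.16.0.3, E OIF
- IPv4 API: S 10.2.2.2, D 10.1.1.1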
22Traceroute,3 (target-AS)
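[Diagram: traceroute www.foo.com, reply from a router inside the target AS. S = source, D = destination, R = RLOC, E = EID; OIF as labeled in the original figure.]
- DNS: A-record 10.2.2.2, RLOC 172.16.0.5
- Probe, before the SWAP: IPv4 header S 10.1.1.1, D 172.16.0.5; LISP header R 172.16.0.3, E 10.2.2.2
- Probe, after the SWAP: IPv4 header S 172.16.0.3, D 10.2.2.2; LISP header R 172.16.0.5, E 10.1.1.1
- Reply with ICMP extensions, before the SWAP: IPv4 header S OIF, D 172.16.0.3; LISP header R 172.16.0.5, E 10.1.1.1
- Reply with ICMP extensions, after the SWAP: IPv4 header S 172.16.0.5, D 10.1.1.1; LISP header R 172.16.0.3, E OIF
- IPv4 API: S 10.2.2.2, D 10.1.1.1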
23Multicast considerations
- The source address (S) for a group (G) is no longer visible outside the local RLOC realm (only GRB prefixes are seen); therefore Reverse Path Forwarding (RPF) is only valid within the local RLOC realm
- In order to enable RPF globally for an (S,G), the multicast-enabled LSR (mLSR) at the source RLOC realm must replace the source address with the local RLOC identifier
- The LSR in the source RLOC realm shall act as an Anycast RP with MSDP capabilities
- The mLSR will decide which multicast groups are announced to other ASes
- The receiver will locate the source via MSDP; the shared tree can be established to the mLSR
- The Source Specific Multicast (SSM) schema will need an extension: RLOC and EID options shall be added to SSM
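
The source rewrite at the mLSR can be sketched as follows. This is an illustration only: `Header` and `mlsr_rewrite_source` are hypothetical names, and the sketch assumes the host has already filled the LISP header with its local RLOC and its own address as EID, as the forwarding example suggests.

```python
import ipaddress
from typing import NamedTuple


class Header(NamedTuple):
    src: str   # IPv4 header source address
    dst: str   # IPv4 header destination (multicast group) address
    rloc: str  # LISP header RLOC field
    eid: str   # LISP header EID field


def mlsr_rewrite_source(hdr: Header, local_rloc: str) -> Header:
    """Replace the source address with the local RLOC for multicast packets.

    Outside the source realm only GRB prefixes are routable, so an RPF
    check there can only succeed against the RLOC, not the original source.
    """
    if not ipaddress.ip_address(hdr.dst).is_multicast:
        return hdr  # unicast traffic is not touched by this step
    return hdr._replace(src=local_rloc)


# Source 10.1.1.1 in realm 172.16.0.3 sending to group 225.5.5.5.
inside_realm = Header("10.1.1.1", "225.5.5.5", "172.16.0.3", "10.1.1.1")
leaving_realm = mlsr_rewrite_source(inside_realm, local_rloc="172.16.0.3")
print(leaving_realm)
```

The original source address stays available in the EID field, so receivers can still identify the sender even though RPF is performed against the RLOC.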
24Multicast forwarding
[Diagram: multicast forwarding from source 10.1.1.1 in the RLOC realm 172.16.0.3 to group 225.5.5.5. S = source, D = destination, G = group, R = RLOC, E = EID.]
- IPv4 API: S 10.1.1.1, G 225.5.5.5
- Before the SWAP: IPv4 header S 10.1.1.1, D 225.5.5.5; LISP header R 172.16.0.3, E 10.1.1.1
- After the SWAP: IPv4 header S 172.16.0.3, D 225.5.5.5; LISP header R 172.16.0.3, E 10.1.1.1
25RTCP receiver reports
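[Diagram: RTCP receiver reports sent from receiver 10.2.2.2 back to the multicast source 10.1.1.1. S = source, D = destination, G = group, R = RLOC, E = EID.]
- Multicast stream: S 10.1.1.1, D 225.5.5.5 (IPv4 API: S 10.1.1.1, G 225.5.5.5)
- Receiver report, before the SWAP: IPv4 header S 10.2.2.2, D 172.16.0.3; LISP header R 172.16.0.5, E 10.1.1.1
- Receiver report, after the SWAP: IPv4 header S 172.16.0.5, D 10.1.1.1; LISP header R 172.16.0.3, E 10.2.2.2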
26Traffic Engineering considerations
- Load balancing is influenced by the placement of LSRs within an RLOC realm: the LSR provides a nearest-routing schema
- A service provider can have several RLOCs assigned; traffic engineering and filtering can be done upon RLOC addresses
- If needed, an RLOC identifier based Traffic Engineering solution can perhaps be developed: establish explicit routing paths upon RLOC information, that is, create explicit paths that can be engineered via specific RLOC realms.
27Path MTU Discovery considerations
- Since the hIPv4 header is assembled at the host, the hIPv4 packet will use current PMTUD mechanisms
- The network will not see any difference between the sizes of an IPv4 and an hIPv4 datagram
28SIP considerations
- SIP uses the local IP address of the host in the messages
  - In SDP, as the target of the media
  - In the Contact of a REGISTER, as the target for incoming INVITEs
  - In the Via of a request, as the target for a response
- Since SIP carries the IP addresses of hosts, it has caused a lot of problems in NAT environments; hIPv4 can mitigate the pain since it will reduce the need for NAT
- SIP needs to be extended to support the hIPv4 framework, i.e. carry RLOC information in the SIP messages
  - A new SDP attribute is needed to provide the RLOC information to the remote UA
  - Add an RLOC Extension Header Field for SIP
29SIP considerations, INVITE
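[Diagram: INVITE from UA 10.1.1.1 (RLOC 172.16.0.3) via the SIP server 10.3.3.3 (RLOC 172.16.0.4) to bob at 10.2.2.2 (RLOC 172.16.0.5). S = source, D = destination, R = RLOC, E = EID.]
- DNS query for sip.foo.com returns A-record 10.3.3.3, RLOC 172.16.0.4
- Binding at the SIP server: bob_at_10.2.2.2, R 172.16.0.5
- Message UA to server: INVITE bob_at_foo.com, SDP a 10.1.1.1, SDP m 45668 RTP, SDP l 172.16.0.3
- UA to server, before the SWAP: IPv4 header S 10.1.1.1, D 172.16.0.4; LISP header R 172.16.0.3, E 10.3.3.3
- UA to server, after the SWAP: IPv4 header S 172.16.0.3, D 10.3.3.3; LISP header R 172.16.0.4, E 10.1.1.1
- Message server to bob: INVITE bob_at_10.2.2.2, SDP a 10.1.1.1, SDP m 45668 RTP, SDP l 172.16.0.3
- Server to bob, before the SWAP: IPv4 header S 10.3.3.3, D 172.16.0.5; LISP header R 172.16.0.4, E 10.2.2.2
- Server to bob, after the SWAP: IPv4 header S 172.16.0.4, D 10.2.2.2; LISP header R 172.16.0.5, E 10.3.3.3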
30SIP considerations, 200 OK
[Diagram: 200 OK from bob at 10.2.2.2 (RLOC 172.16.0.5) via the SIP server 10.3.3.3 (RLOC 172.16.0.4) back to UA 10.1.1.1 (RLOC 172.16.0.3). S = source, D = destination, R = RLOC, E = EID.]
- Message on both legs: 200 OK, SDP a 10.2.2.2, SDP m 35678 RTP, SDP l 172.16.0.5
- Bob to server, before the SWAP: IPv4 header S 10.2.2.2, D 172.16.0.4; LISP header R 172.16.0.5, E 10.3.3.3
- Bob to server, after the SWAP: IPv4 header S 172.16.0.5, D 10.3.3.3; LISP header R 172.16.0.4, E 10.2.2.2
- Server to UA, before the SWAP: IPv4 header S 10.3.3.3, D 172.16.0.3; LISP header R 172.16.0.4, E 10.1.1.1
- Server to UA, after the SWAP: IPv4 header S 172.16.0.4, D 10.1.1.1; LISP header R 172.16.0.3, E 10.3.3.3
31SIP considerations, RTP
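[Diagram: RTP media flows directly between the two UAs, using the addresses and RLOCs exchanged in SDP. S = source, D = destination, R = RLOC, E = EID.]
- INVITE bob_at_10.2.2.2: SDP a 10.1.1.1, SDP m 45668 RTP, SDP l 172.16.0.3
- 200 OK: SDP a 10.2.2.2, SDP m 35678 RTP, SDP l 172.16.0.5
- RTP from 10.1.1.1: IPv4 header S 10.1.1.1, D 172.16.0.5; LISP header R 172.16.0.3, E 10.2.2.2
- RTP from 10.2.2.2: IPv4 header S 10.2.2.2, D 172.16.0.3; LISP header R 172.16.0.5, E 10.1.1.1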
32Mobility considerations
- Site mobility: a site wishes to change its attachment point to the Internet without changing its IP address block. The change of attachment point is possible when PI addresses are allocated to the site. Only the local RLOC identifier needs to be changed.
- Host mobility: Alex C. Snoeren and Hari Balakrishnan's "An End-to-End Approach to Host Mobility" is interesting. Since the IPv4 stack needs to be enhanced, studies should be carried out to see if its TCP connection method can be implemented in the hIPv4 stack.
- Another interesting host mobility solution is the "Reliable Network Connections" paper by Victor C. Zandy and Barton P. Miller. Studies should be carried out to see if rocks and racks can be integrated into the hIPv4 stack. http://pages.cs.wisc.edu/~zandy/rocks/
33Transition considerations
- Upgrades of host stacks, DNS and DHCP databases, security devices and network devices can be carried out in parallel without a change of topology or major network breaks
- LSRs can be added to an AS or a service provider area when commercially available in order to create an RLOC realm
- When the hIPv4 framework is ready at an RLOC realm, the RLOC record can be added for those hosts in the DNS, one by one.
- Legacy IPv4 clients will still use the legacy IPv4 schema, but when a hIPv4 client receives a DNS response with an RLOC (not matching the local RLOC) it can use the hIPv4 framework to reach the server. Intra-RLOC realm connections (remote RLOC = local RLOC) will use legacy IPv4 connections; there is no added value in using the hIPv4 framework inside an RLOC realm.
- When will the Internet migrate from a flat to a hierarchical topology?
  - Possible tipping point 1: when the RIB of the DFZ is getting close to the capabilities of current hardware. Who will pay for the upgrade? Or will the service provider only accept GRB prefixes from other providers and avoid capital expenses?
  - Possible tipping point 2: when the exhaustion of IPv4 addresses is causing enough problems for enterprises
- Both customer and provider have a common interest that the Internet is available and affordable!
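
The rule for choosing between legacy IPv4 and hIPv4 during the transition can be sketched as a small decision function. This is a minimal sketch under stated assumptions: `choose_schema` is a hypothetical name, and the DNS answer is modeled as a plain dictionary with an optional `"rloc"` key, since the format of the RLOC record is not defined in the slides.

```python
def choose_schema(dns_answer: dict, local_rloc: str, hipv4_capable: bool = True) -> str:
    """Return "hipv4" or "ipv4" for a new connection.

    dns_answer is an assumed shape: {"a": <address>, "rloc": <RLOC or absent>}.
    """
    remote_rloc = dns_answer.get("rloc")
    if not hipv4_capable:
        return "ipv4"   # legacy client: ignores any RLOC record
    if remote_rloc is None:
        return "ipv4"   # server not yet published with an RLOC record
    if remote_rloc == local_rloc:
        return "ipv4"   # intra-realm: no added value in using hIPv4
    return "hipv4"      # remote realm reachable via its RLOC


# Client in realm 172.16.0.3 resolving three different servers.
print(choose_schema({"a": "10.2.2.2", "rloc": "172.16.0.5"}, "172.16.0.3"))
print(choose_schema({"a": "10.1.1.9", "rloc": "172.16.0.3"}, "172.16.0.3"))
print(choose_schema({"a": "192.0.2.10"}, "172.16.0.3"))
```

Because the decision is made per DNS answer, hosts and records can be upgraded one by one, which is what allows the parallel, incremental rollout described above.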
34Security considerations
- Hijacking of prefixes by longest match from another RLOC realm is no longer possible since the source prefix is separated by a locator.
- In order to execute a hijack of a certain prefix, the whole RLOC realm must be routed via a bogus RLOC realm. Studies should be carried out with the Secure Inter-Domain Routing (SIDR) working group to determine whether the RLOC identifiers can be protected from hijacking.
35Summary
36Carrots for Everyone, Long Term
- Enterprises
  - No need to learn a new protocol; only the RLOC concept is introduced
  - Minimize porting of applications to a new protocol; the IPv4 socket API is extended
  - Get Provider Independent addresses without a multihoming requirement, i.e. achieve site mobility
  - When hosts are upgraded to support the hIPv4 framework, NAT solutions can be removed
- Internet Service Providers
  - No need to learn new routing protocols
  - Remove IPv4 address constraints
  - Hierarchical BGP, smaller RIB for each RLOC realm
  - Internal prefix flaps are not seen in other RLOC realms; only GRB state changes are reflected globally, so update churn is reduced