Title: DHCP Failover Protocol
1 DHCP Failover Protocol
DECUS Europe 2000
Thursday, 13 Apr 2000, 9:00 - 9:45
Jeff Schreiber
schreiber_at_process.com
2 Outline
- DHCP Basic Operation
- Existing forms of Redundancy
- Requirements for Failover Redundancy
- Problems, Goals, and Limitations
- How it works
3 Redundancies
- DNS: both primary and secondary
- Hardware configurations
- But only makeshift redundancies for DHCP
4 Basically, how DHCP works
- Client: DHCPDISCOVER
- Server: DHCPOFFER
- Client: DHCPREQUEST (or DHCPDECLINE)
- Server: DHCPACK
  - Lease time
  - IP address
  - DNS server and other information
5 Basically, how DHCP works
- Client goes into a bound state and starts the T1 and T2 timers.
- T1 is 1/2 the lease time.
- T2 is about 80% of the lease time (the RFC 2131 default is 87.5%).
- At T1, the client sends a (unicast) DHCPREQUEST to renew the lease.
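The timer arithmetic above can be sketched in a few lines of Python. This is an illustration only: the function name `renewal_timers` and its parameters are made up for this example, and the T2 fraction of 0.8 simply mirrors the approximate figure on the slide.

```python
def renewal_timers(lease_seconds, t1_fraction=0.5, t2_fraction=0.8):
    """Return (T1, T2) as seconds after the start of the lease.

    The fractions are assumptions taken from the slide, not fixed constants.
    """
    if not 0 < t1_fraction < t2_fraction < 1:
        raise ValueError("expected 0 < T1 fraction < T2 fraction < 1")
    return lease_seconds * t1_fraction, lease_seconds * t2_fraction


ONE_DAY = 24 * 60 * 60
t1, t2 = renewal_timers(ONE_DAY)
print(f"T1 (unicast renew) after {t1 / 3600:.1f} h")
print(f"T2 (broadcast) after {t2 / 3600:.1f} h")
```

Passing `t2_fraction=0.875` gives the RFC 2131 default instead of the rounded value.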
6 Basically, how DHCP works
- 3 things can happen:
- Server says no, and the client gives up the address at the end of the lease.
- No response. The client keeps trying until T2, when it sends out a broadcast.
- Client gets the renewal as desired. The new lease starts here.
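The three outcomes above can be written as a small decision function. This is a sketch of the behaviour as the slides describe it, not client source code; `Reply`, `client_step`, and the 0.5 / 0.8 timer fractions are assumptions for illustration.

```python
from enum import Enum


class Reply(Enum):
    ACK = "ack"    # server granted the renewal
    NAK = "nak"    # server said no
    NONE = "none"  # nothing heard yet


def client_step(elapsed, lease, reply, t1_fraction=0.5, t2_fraction=0.8):
    """Decide what a bound client does `elapsed` seconds into its lease."""
    if elapsed >= lease:
        return "lease expired: give up address"
    if reply is Reply.ACK:
        return "renewed: new lease starts now"
    if reply is Reply.NAK:
        return "refused: give up address at end of lease"
    # No response so far: which request, if any, should be outstanding?
    if elapsed >= lease * t2_fraction:
        return "broadcast DHCPREQUEST to any server"
    if elapsed >= lease * t1_fraction:
        return "unicast DHCPREQUEST to original server"
    return "bound: wait for T1"


for seconds in (100, 600, 850):
    print(seconds, client_step(seconds, 1000, Reply.NONE))
```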
7 Basically, how DHCP works
- Clients on a different network from the server use a DHCP relay that forwards DHCP communications from one subnet to the other.
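A rough sketch of what a relay does with a client broadcast: it records its own address in the `giaddr` field (so the server knows which subnet the client is on), increments `hops`, and unicasts the packet to each configured server. The dictionary representation, the function name `relay_forward`, and the addresses are assumptions for illustration; a real relay works on binary BOOTP packets.

```python
def relay_forward(packet, relay_address, servers):
    """Return the (server, packet) pairs a relay would unicast."""
    forwarded = dict(packet)
    # Only the first relay on the path fills in giaddr.
    if forwarded.get("giaddr", "0.0.0.0") == "0.0.0.0":
        forwarded["giaddr"] = relay_address
    forwarded["hops"] = forwarded.get("hops", 0) + 1
    # One independent copy per server, e.g. a primary and a secondary.
    return [(server, dict(forwarded)) for server in servers]


discover = {"type": "DHCPDISCOVER", "giaddr": "0.0.0.0", "hops": 0}
for server, pkt in relay_forward(discover, "10.0.1.1", ["10.0.0.5", "10.0.0.6"]):
    print(server, pkt)
```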
8 Existing forms of DHCP redundancy
- 2 DHCP servers, both active at the same time.
- No synchronization or communication between servers.
- 2 disjoint address pools.
- Inefficient.
- Wastes addresses.
- Increases network resource usage.
- Both servers respond to clients.
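To make the waste concrete, here is a minimal sketch of the split-pool arrangement. The address range and the helper `split_pool` are invented for this example.

```python
import ipaddress


def split_pool(first, last):
    """Split an inclusive IPv4 range into two disjoint halves."""
    start = int(ipaddress.IPv4Address(first))
    end = int(ipaddress.IPv4Address(last))
    addresses = [ipaddress.IPv4Address(n) for n in range(start, end + 1)]
    half = len(addresses) // 2
    return addresses[:half], addresses[half:]


pool_a, pool_b = split_pool("192.0.2.10", "192.0.2.109")
# Each server can only ever hand out its own half, so when one server
# is down the clients are limited to the survivor's addresses.
print(len(pool_a), len(pool_b), set(pool_a).isdisjoint(pool_b))
```

With 100 addresses in the range, losing either server leaves only 50 usable, even if the other 50 are idle.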
9 Existing forms of DHCP redundancy
- Brute force
- Have a standby server and periodically save the lease database.
- Performance problems.
- Possibility of issuing one address to two clients.
- Proprietary primary/backup solutions
- Do not provide safe failover (one address can be given to two clients).
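The duplicate-address risk of the brute-force approach can be reproduced with a toy model. Everything here (the two-address pool, the `allocate` helper, the client names) is hypothetical and only illustrates the timing problem.

```python
POOL = ["192.0.2.10", "192.0.2.11"]


def allocate(leases, client):
    """Give `client` the first address not present in this lease table."""
    for address in POOL:
        if address not in leases:
            leases[address] = client
            return address
    return None


primary = {}
allocate(primary, "client-a")
snapshot = dict(primary)           # periodic save of the lease database
allocate(primary, "client-b")      # allocated after the snapshot was taken

standby = dict(snapshot)           # primary crashes; standby loads stale data
duplicate = allocate(standby, "client-c")
print(duplicate, "is held by", primary[duplicate], "and", standby[duplicate])
```

The standby has no record of the lease made after the last save, so it hands the same address to a second client.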
10 Requirements for Failover servers
- Cannot give two clients the same address.
- The secondary should be able to take over for the primary.
- Do not change the fundamental way that DHCP works.
- Do not change the client.
- Server can change (albeit slightly).
- Client gives up the lease when told to, or at the end of the lease if it does not get renewed.
11 Things to address
- How and when does the primary server update the secondary?
- Failover assumes that a client in INIT_REBOOT does not have an existing address. This scenario can happen if the client gets its first address while the primary cannot talk to the secondary, and then reboots.
12 Things to address
- Server updates require stable storage to work reliably. We don't want to add significantly to the time that it already takes to do this.
- Clients may not be on the same network. Therefore a DHCP relay is needed that forwards DHCP traffic to a particular server, or that can send a request to more than one server.
13 Problems to be aware of
- Primary crashes before it can update the secondary. The secondary has no record of the primary's allocation (DHCPACK).
- Primary and secondary cannot talk, but clients can see both (network partitioning).
- Inherent to TCP connections are keepalives to make sure that the secondary is there.
14 Problems to be aware of
- A TCP connection (as opposed to UDP) can take up to 9 minutes to time out. This usually cannot be changed, and it is too long for DHCP.
- Result: TCP is useful for reliable message delivery, but cannot be depended upon to detect server failure.
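One common way around slow TCP timeouts is an application-level timer that tracks when the partner was last heard from. The sketch below illustrates that general idea only; the slides do not say which mechanism the failover protocol uses, and the class name `PartnerMonitor` and the 60-second limit are assumptions.

```python
class PartnerMonitor:
    """Track when the partner server was last heard from (hypothetical helper)."""

    def __init__(self, max_silence_seconds):
        self.max_silence = max_silence_seconds
        self.last_heard = None

    def message_received(self, now):
        """Record the arrival time of any message from the partner."""
        self.last_heard = now

    def partner_reachable(self, now):
        """True if the partner has spoken within the allowed silence window."""
        if self.last_heard is None:
            return False
        return (now - self.last_heard) <= self.max_silence


monitor = PartnerMonitor(max_silence_seconds=60)
monitor.message_received(now=0)
print(monitor.partner_reachable(now=30))    # still inside the window
print(monitor.partner_reachable(now=540))   # 9 minutes of silence
```

Times are passed in explicitly so the behaviour is easy to test; a real server would read a monotonic clock.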
15 Goals (continued)
- Must work with existing clients.
- Must work with existing BOOTP relay agents.
- Must provide failover redundancy between servers that are not located on the same subnet.
- Provide service to DHCP clients in the event of primary server failure.
16 Goals (continued)
- Avoid binding (giving) an address to a client when another client already has it (no duplicate addresses).
- Minimize the need for manual administrative intervention.
- Impose no additional client delays as a result of primary-backup communications.
- Share IP pools between primary and secondary servers.
17 Goals (continued)
- Handle partitioned networks.
- Resynchronize without operator intervention when the primary failure is corrected.
- Enable one server to be secondary to many primary servers.
- Allow proper lease renewal from either server.
18 Goals (continued)
- If either server loses all of the information
that it has stored in stable storage, it should
be able to refresh from the other server.
19 Limitations
- Only one secondary server.
- Have a subset of addresses that only the secondary can hand out.
- Neither server hands out addresses while recovering from a failure.
20 MCLT
- Maximum Client Lead Time
- A lease time known to both the secondary and primary servers.
- Places an upper bound on the difference allowed between the lease time given to a client by a server and the lease time known by the other server.
- Is much less than the real lease time.
21 MCLT
- Tell the client what the other server knows, plus MCLT.
- Tell the other server what the client wanted (or what the client was supposed to get), plus 1/2 of what it got.
- Don't give the client more than it asked for (or what it was supposed to get).
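The three rules above translate almost directly into arithmetic. The following is a minimal sketch of that reading, with durations in seconds; the function names and the 1-hour MCLT / 1-day lease values are assumptions chosen for the example.

```python
HOUR = 60 * 60
DAY = 24 * HOUR


def lease_for_client(partner_known_remaining, mclt, desired):
    """What the other server knows, plus MCLT, but never more than desired."""
    return min(partner_known_remaining + mclt, desired)


def lease_for_partner(desired, granted):
    """What the client wanted, plus half of what it actually got."""
    return desired + granted // 2


# First request: the other server knows nothing about this client yet.
granted = lease_for_client(partner_known_remaining=0, mclt=HOUR, desired=DAY)
reported = lease_for_partner(desired=DAY, granted=granted)
print(granted // 60, "minutes to the client")
print(reported // 60, "minutes reported to the other server")
```

The client's first lease is short (one MCLT), while the other server is told about a much longer one, so a later renewal can grant the full lease without the servers disagreeing by more than MCLT.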
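22 Practical Use
Participants: Client, Primary DHCP Server, Secondary DHCP Server
- DHCPREQUEST: client gets 1 hour (MCLT); secondary is told 1 day 1/2 hour
- 1/2 hour later, renew request: client gets 1 day (lease); secondary is told 1 day 1/2 hour
- 1/2 day later, renew request: client gets 1 day (lease); secondary is told 1 day 1/2 hour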
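The timeline on this slide can be replayed as a short simulation. This is a sketch under stated assumptions: MCLT is 1 hour, the desired lease is 1 day, all times are absolute seconds from the first request, and `renew` is a made-up helper. The values reported to the secondary follow the MCLT rule literally (desired plus half of what was granted), so later steps may differ from the labels in the original figure.

```python
HOUR = 60 * 60
DAY = 24 * HOUR
MCLT = HOUR
DESIRED = DAY


def renew(now, partner_expiry):
    """Return (granted, client_expiry, new_partner_expiry) for one request."""
    remaining = max(partner_expiry - now, 0)    # what the secondary knows
    granted = min(remaining + MCLT, DESIRED)    # never more than desired
    return granted, now + granted, now + DESIRED + granted // 2


partner_expiry = 0   # the secondary has no record of this client yet
timeline = []
for now in (0, HOUR // 2, HOUR // 2 + DAY // 2):
    known_before = partner_expiry
    granted, client_expiry, partner_expiry = renew(now, partner_expiry)
    # Safety property: the client never runs more than MCLT ahead of
    # what the secondary already knew before this request.
    assert client_expiry <= max(known_before, now) + MCLT
    timeline.append((now, granted))
    print(f"t={now / HOUR:5.1f} h  client granted {granted / HOUR:4.1f} h")
```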
23 Practical Use
Participants: Client, Primary DHCP Server, Secondary DHCP Server
- DHCPREQUEST: client gets 1 hour (MCLT); secondary is told 1 day 1/2 hour
- 1/2 hour later, renew request to the primary: (no answer)
- Request broadcast: secondary gives the client 1 hour (MCLT)
- Primary returns: "I'm back"; secondary replies: "Here's what I've done"
24 Questions
That's all, folks. Any questions?
25 Getting the Slides
Slides available via anonymous FTP: ftp://ftp.process.com/decus/europe_2000/dhcp_failover.ppt