Title: Surviving Large Scale Internet Failures
1. Surviving Large Scale Internet Failures
- Dr. Krishna Kant
- Intel Corp.
2. The Problem
- Internet has two critical elements
- Routing (inter- & intra-domain)
- Name resolution
- How robust are they against large scale failures/attacks?
- How do we improve them?
3. Outline
- Overview
- Internet Infrastructure elements & Large Scale Failures
- Dealing with Routing Failures
- Routing algorithms & their properties
- Improving BGP Convergence
- Other Performance Metrics
- Dealing with Name Resolution Failures
- Name resolution preliminaries
- DNS vulnerabilities & Solution
- Conclusions and Open Issues
4. Internet Routing
- Not a homogeneous network
- A network of autonomous systems (ASes)
- Each AS under the control of an ISP
- Large variation in AS sizes; typically heavy-tailed
- Inter-AS routing
- Border Gateway Protocol (BGP). A path-vector algorithm.
- Serious scalability/recovery issues.
- Intra-AS routing
- Several algorithms; usually work fine
- Central control, smaller network, ...
5. Internet Name Resolution
- Domain Name System (DNS)
- May add significant delays
- Replication of TLDs & others resists attacks, but extensive caching makes attacks easier!
- Not designed for security; can be easily attacked
- DNS security
- Crypto techniques can stop many attacks, but substantial overhead & other challenges
- Other solutions
- Peer-to-peer based, but no solution is entirely adequate
6. Large Scale Failures
- Characteristics
- Affects a significant fraction of infrastructure in some region
- Routers, links, name servers
- Generally non-uniformly distributed, e.g., confined to a geographical area
- Why study large scale failures?
- Several instances of moderate-sized failures already
- Larger scale failures only a matter of time
- Potentially different behavior
- Secondary failures due to large recovery traffic, substantial imbalance in load, ...
7. Routing Failure Causes
- Large-area router/link damage (e.g., earthquake)
- Large scale failure due to buggy SW update
- High-BW cable cuts
- Router configuration errors
- Aggregation of large un-owned IP blocks
- Happens when prefixes are aggregated for efficiency
- Incorrect policy settings resulting in large scale delivery failures
- Network-wide congestion (DoS attack)
- Malicious route advertisements via worms
8. Name Resolution Failure Causes
- Records can be easily altered to contain fake (name, IP) info
- Poisoning of records doesn't even require compromising the server!
- Extensive caching → more points of entry
- Poisoning of TLD records (or other large chunks of name space)
- Disables access to a huge number of sites
- Example: March 2005 .com attack
- Poisoning is a perfect way to gain control of sensitive information on a large scale
9. Major Infrastructure Failure Events
- Blackout: widespread power outage (NY & Italy, 2003)
- Hurricane: widespread damage (Katrina)
- Earthquake: undersea cable damage (Taiwan, Dec 2006)
- Infrastructure induced (e.g., May 2007, Japan)
- Many other potential causes
10. Taiwan Earthquake (Dec 2006)
- Issues
- Global traffic passes through a small number of seismically active choke points
- Luzon strait, Malacca strait, south coast of Japan
- Satellite & overland cables don't have enough capacity to provide backup
- Several countries depend on only 1-2 distinct landing points
- Outlook
- Economics makes change unlikely
- May be exploited by collusion of pirates & terrorists
- Will perhaps see repeat performance!
- Reference: http://www.pinr.com/report.php?ac=view_report&report_id=602&language_id=1
11. Hurricane Katrina (Aug 2005)
- Major local outages. No major regional cable routes through the worst affected areas.
- Outages persisted for weeks & months. Notable after-effects in FL (significant outages 4 days later!)
- Reference: http://www.renesys.com/tech/presentations/pdf/Renesys-Katrina-Report-9sep2005.pdf
12. NY Power Outage (Aug 2003)
- No. of concurrent network outages vs. time
- Large ASes suffered less than smaller ones
- Behavior very similar to the Italian power outage of Sept 2003
- A significant no. of ASes had all their routers down for >4 hours
13. Slammer Worm (Jan 2003)
- Scanner worm started w/ buffer overflow of MS SQL
- Very rapid replication, huge congestion buildup in 10 mins
- Korea falls out, 5/13 DNS root servers fail, failed ATMs, ...
- High BGP activity to find working routes
- Reference: http://www.cs.ucsd.edu/~savage/papers/IEEESP03.pdf
14. Infrastructure Induced Failures
- En-masse use of backup routes by 4000 Cisco routers in May 2007 (Japan)
- Routing table rewrites caused 7 hr downtime in NE Japan
- Reference: http://www.networkworld.com/news/2007/051607-cisco-routers-major-outage-japan.html
- Akamai CDN failure, June 2004
- Probably widespread failures in Akamai's DNS
- Reference: http://www.landfield.com/isn/mail-archive/2004/Jun/0064.html
- Worldcom router misconfiguration, Oct 2002
- Misconfigured eBGP router flooded internal routers with routes
- Reference: http://www.isoc-chicago.org/internetoutage.pdf
15. Outline
- Overview
- Internet Infrastructure elements & Large Scale Failures
- Dealing with Routing Failures
- Routing algorithms & their properties
- Improving BGP Convergence
- Other Performance Metrics
- Dealing with Name Resolution Failures
- Name resolution preliminaries
- DNS vulnerabilities & Solution
- Conclusions and Open Issues
16. Routing Algorithms
- Basic methods
- Distance Vector based (DV)
- Link State based (LS)
- Path Vector based (PV)
- DV examples
- RIP (Routing Information Protocol)
- IGRP (Interior Gateway Routing Protocol)
- LS examples
- OSPF (Open Shortest Path First)
- IS-IS (Intermediate System to Intermediate System)
- PV examples
- BGP (Border Gateway Protocol)
- There are intra-domain (iBGP) & inter-domain (eBGP) versions
17. Distance Vector (DV) Protocols
- Each node advertises its path costs to its neighbors
- Very simple, but count-to-infinity problem
- Node w/ a broken link will receive the old cost & use it to replace the broken path!
- Several versions to fix this
- Difficult to use policies
(Figure: example network with nodes A-F; routing table for A shown below)

Dest | Next | Cost
B    | B    | 1
C    | C    | 1
D    | B    | 4
E    | B    | 2
F    | C    | 3
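The count-to-infinity behavior above can be reproduced in a few lines. This is a minimal sketch on a hypothetical 3-node chain A-B-C (not the slide's figure): after the B-C link fails, B keeps hearing A's stale 2-hop advertisement, so the two nodes bounce the route between themselves and the cost climbs round by round.

```python
INF = 999

def dv_round(cost, links):
    """One synchronous DV round toward the single destination C:
    cost[n] = min over neighbors m of link(n, m) + cost[m]."""
    new = {}
    for n in cost:
        if n == "C":
            new[n] = 0
            continue
        best = INF
        for m, w in links.get(n, {}).items():
            best = min(best, w + cost[m])
        new[n] = min(best, INF)
    return new

# Hypothetical chain A - B - C with unit link costs.
links = {"A": {"B": 1}, "B": {"A": 1, "C": 1}, "C": {"B": 1}}
cost = {"A": INF, "B": INF, "C": 0}
for _ in range(3):
    cost = dv_round(cost, links)
print(cost)   # converged: {'A': 2, 'B': 1, 'C': 0}

# The B-C link fails; B now only hears A's stale 2-hop advertisement,
# so A and B "count to infinity" in steps of the A-B link cost.
del links["B"]["C"]
for _ in range(4):
    cost = dv_round(cost, links)
print(cost)   # costs have crept up to {'A': 6, 'B': 5, 'C': 0}
```

Split horizon and poisoned reverse (the "several versions" above) suppress exactly this stale back-advertisement.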
18. Link State (LS) Protocols
- Each node keeps a complete adjacency/cost matrix & computes shortest paths locally
- Difficult in a large network
- Any failure propagated via flooding
- Expensive in a large network
- Loop-free & can use policies easily
(Figure: example network with nodes A-E and numbered links; link-state database shown below)

Src | Dest | Link | Cost
A   | B    | 1    | 1
A   | C    | 2    | 1
B   | A    | 2    | 1
B   | D    | 3    | 2
C   | A    | 2    | 1
C   | D    | 4    | 1
C   | E    | 5    | 3
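The "compute shortest paths locally" step is just Dijkstra over the flooded database. A sketch, assuming undirected costs A-B:1, A-C:1, B-D:2, C-D:1, C-E:3 (read off the table above):

```python
import heapq

def dijkstra(adj, src):
    """Return shortest-path cost from src to every reachable node."""
    dist = {src: 0}
    pq = [(0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue                       # stale queue entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

adj = {"A": [("B", 1), ("C", 1)],
       "B": [("A", 1), ("D", 2)],
       "C": [("A", 1), ("D", 1), ("E", 3)],
       "D": [("B", 2), ("C", 1)],
       "E": [("C", 3)]}
print(dijkstra(adj, "A"))   # {'A': 0, 'B': 1, 'C': 1, 'D': 2, 'E': 4}
```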
19. Path Vector Protocols
- Each node initialized w/ some paths for each dest
- Active paths updated much like in DV
- Explicitly withdraw failed paths (& advertise the next best)
- Filtering on incoming/outgoing paths, path selection policies
- Paths A to D
- Via B: B/E/F/D, cost 3
- Via C: C/E/F/D, cost 4
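The key property that makes path vectors loop-free is visible in two tiny functions. A simplified BGP-style sketch (hypothetical, single-letter AS names as in the example above): a node accepts a path only if it does not already appear in it, and prepends itself when re-advertising.

```python
# Path-vector loop avoidance: reject any advertised path that already
# contains our own AS, then prepend ourselves before re-advertising.
def accept(my_as, as_path):
    """True if using this path cannot form a routing loop through us."""
    return my_as not in as_path

def readvertise(my_as, as_path):
    return [my_as] + as_path

path = ["B", "E", "F", "D"]          # path to D heard from B
assert accept("A", path)             # A may install it
assert not accept("E", path)         # E must reject: it is already on the path
print(readvertise("A", path))        # ['A', 'B', 'E', 'F', 'D']
```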
20. Intra-domain Routing under Failures
- Intra-domain routing can usually limp back to normal rather quickly
- Single domain of control
- High visibility, common management network, etc.
- Most ASes are small
- Very simple policies only
- Routing algorithms
- Distance-vector (RIP, IGRP): simple; enhancements prevent most count-to-infinity problems
- Link state (OSPF): flooding handles failures quickly
- Path vector (iBGP): behavior similar to eBGP
21. Inter-domain Routing
- BGP: default inter-AS protocol (RFC 1771)
- Path vector protocol, runs on TCP
- Scalable, rich policy settings
- But prone to long convergence delays
- High packet loss & delay during convergence
22. BGP Routing Table
- Prefix: origin address for dest & mask (e.g., 207.8.128.0/17)
- Next hop: neighbor that announced the route

Dest prefix    | Next hop       | Cost
204.70.0.0/15  | 207.240.10.143 | 10
192.67.95.0/24 | 192.67.95.57   | 2
140.222.0.0/16 | 207.240.10.120 | 5

- One active route, others kept as backup
- Route attributes -- some may be conveyed outside
- AS path: used for loop avoidance
- MED (multi-exit discriminator): preferred incoming path
- Local pref: used for local path selection
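Forwarding against such a table uses longest-prefix match: the most specific entry covering the destination wins. A linear-scan sketch using the example prefixes above (real routers use tries, not a scan):

```python
import ipaddress

table = {                      # prefixes from the example table above
    "204.70.0.0/15": "207.240.10.143",
    "192.67.95.0/24": "192.67.95.57",
    "140.222.0.0/16": "207.240.10.120",
}

def lookup(dst):
    """Return the next hop of the longest matching prefix, or None."""
    best, hop = -1, None
    for pfx, nh in table.items():
        net = ipaddress.ip_network(pfx)
        if ipaddress.ip_address(dst) in net and net.prefixlen > best:
            best, hop = net.prefixlen, nh
    return hop

print(lookup("192.67.95.10"))   # 192.67.95.57
print(lookup("10.0.0.1"))       # None (no default route in this table)
```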
23. BGP Messages
- Message types
- Open (establish TCP conn), notification, update, keepalive
- Update
- Withdraw old routes and/or advertise new ones
- Only active routes can be advertised
- May need to also advertise a sub-prefix (e.g., 207.8.240.0/24, which is contained in 207.8.128.0/17)
24. Routing Process
- Input & output policy engines
- Filter routes (by attributes, prefix, etc.)
- Manipulate attributes (e.g., local pref, MED, etc.)
25. BGP Recovery

Symbol | Convergence Condition
Tup    | A down node/link restored
Tshort | A shorter path advertised
Tlong  | Switchover to longer paths due to a link failure
Tdown  | All paths to a failed node withdrawn

- BGP convergence delay
- Time for ALL routes to stabilize
- 4 different times defined!
- BGP parameters
- Minimum Route Advertisement Interval (MRAI)
- Path cost, path priority, input filter, output filter, ...
- MRAI specifics
- Applies only to advertisements, not withdrawals
- Intended per destination; implemented per peer
- Damps out cycles of withdrawals & advertisements
- Convergence delay vs. MRAI: a V-shaped curve
(Figure: convergence delay vs. MRAI)
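The per-peer MRAI behavior above can be sketched as a simple rate limiter. This is a hypothetical simplified model (the `Peer` class and its fields are illustrative, not BGP implementation code): advertisements are suppressed until the timer expires, withdrawals go out immediately.

```python
class Peer:
    """Toy model of per-peer MRAI rate limiting."""
    def __init__(self, mrai):
        self.mrai = mrai
        self.next_ok = 0.0          # earliest time an advertisement may go out
        self.sent = []              # (time, kind, route) log

    def advertise(self, now, route):
        if now >= self.next_ok:
            self.sent.append((now, "adv", route))
            self.next_ok = now + self.mrai
            return True
        return False                 # held back; retried once the timer expires

    def withdraw(self, now, route):
        self.sent.append((now, "wdr", route))   # MRAI does not apply
        return True

p = Peer(mrai=30.0)
assert p.advertise(0.0, "10.0.0.0/8")      # first adv goes out
assert not p.advertise(5.0, "10.0.0.0/8")  # suppressed: within the MRAI window
assert p.withdraw(5.0, "10.0.0.0/8")       # withdrawals bypass the timer
assert p.advertise(31.0, "10.0.0.0/8")     # timer expired, adv goes out
```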
26. Impact of BGP Recovery
- Long recovery times
- Measurements for isolated failures
- >3 minutes for 30% of isolated failures
- >15 minutes for 10% of cases
- Even larger for large scale failures
- Consequences
- Connection attempts over invalid routes will fail
- Long delays & compromised QoS due to heavy packet loss
- Packet loss: 30X increase; delay: 4X
- Graphs taken from ref 2, Labovitz, et al.
27. BGP Illustration (1)
- Example network
- All link costs 1 except as shown
- Notation for best path: PSD(N, cost) = X
- S, D: source & destination nodes
- N: neighbor of S thru which the path goes
- X: actual path (for illustration only)
- Sample starting paths to C
- PBC(D,3) = BDAC, PDC(A,2) = DAC, PFC(E,3) = FEAC, PIC(H,5) = IHGAC
- Paths shown using arrows (all share segment AC)
- Failure of A
- BGP does not attempt to diagnose the problem or broadcast failure events
28. BGP Illustration (2)
- NOTE: affected node names in blue, rest in white
- A's neighbors install new paths avoiding A →
- PDC(B,5) = DBFEAC, PEC(F,5) = EFBDAC, PGC(H,6) = GHIBDAC
- D advertises PDC = DBFEAC to B
- Current PBC is via D → B must pick a path not via D →
- B installs PBC(F,4) = BFEAC & advertises it to F & I (first time)
- Note: green indicates first adv by B
29. BGP Illustration (3)
- E advertises PEC = EFBDAC to F
- Current PFC is via E →
- F installs PFC(B,4) = FBDAC & advertises to E & B
- G advertises PGC = GHIBDAC to H
- Current PHC is via G →
- H installs PHC(I,5) = HIBDAC & advertises to I
30. BGP Illustration (4)
- B's adv BFEAC reaches F & I
- PFC(B,4) = FBDAC is thru B → F withdraws PFC & has no path to C!
- PIC(H,5) = IHGAC is shorter → I retains it
- F's adv FBDAC reaches B; PBC(F,4) = BFEAC is thru F →
- B installs PBC(I,6) = BIHGAC and advertises to D, F & I
- Note: green text: B's first adv; grey text: B's subsequent adv (disallowed by MRAI)
31. BGP Illustration (5a)
- H's adv HIBDAC reaches I
- PIC(H,5) = IHGAC is thru H → I installs PIC(B,6) = IBDAC & advertises to B & H
- B's adv BIHGAC reaches D, F
- D updates PDC(B,8) = DBIHGAC (just a local update)
- F updates PFC(B,8) = FBIHGAC & advertises to E
- w/ MRAI
- D & F have the wrong (lower) cost metric, but will still follow the same path thru B
32. BGP Illustration (5b)
- B's adv BIHGAC reaches I
- PIC(B,6) = IBDAC is thru B → I withdraws PIC & has no path to C!
- w/ MRAI
- I will continue to use the nonworking path IBDAC. Same as having no path.
- I's adv IBDAC reaches B & H
- H changes its path to HIBDAC
- B's path is thru I, so B installs (C,10) & advertises to its neighbors D, F & I
33. BGP Illustration (5c)
- F's update reaches E
- E updates its path locally
- I's withdrawal of IBDAC reaches H (& also B)
- H withdraws the path HIBDAC & has no path to C!
- H's withdrawal of HIBDAC reaches G (& also I)
- G withdraws the path GHIBDAC & has no path to C!
- w/ MRAI
- Nonworking paths stay at E, H & G
34. BGP Illustration (6) No MRAI
- B's adv BC reaches D, F & I (in some order)
- D updates its path cost (B,11)
- F updates its path cost (B,11) & advertises PFC to E
- I updates its path cost (B,13) & advertises PIC to H
- Final updates
- F's update FBC reaches E, which updates its path locally
- I's adv IBC reaches H
- H updates its path cost (I,14) = HIBC & advertises PHC to G
- G does a local update
35. BGP Illustration (5) w/ MRAI
- H's adv HIBDAC reaches I
- PIC(H,5) = IHGAC is thru H → I installs PIC(B,6) = IBDAC & advertises to B & H
- I's adv IBDAC reaches B & H
- H changes its path to HIBDAC
- B's path is thru I, so B installs (C,10)
- When MRAI expires, B advertises to its neighbors D, F & I
- Note: if MRAI is large, path recovery gets delayed
36. BGP Illustration (6) w/ MRAI
- B's adv BC reaches D, F & I (in some order)
- D updates its path cost (B,11)
- F updates its path cost (B,11) & advertises PFC to E
- I installs updated path IBC and advertises it to H
- Final updates: same as for (6)
- W/ vs. w/o MRAI
- MRAI avoids some unnecessary path updates (less router load)
37. BGP: Known Analytic Results
- Lots of work for isolated failures
- Labovitz [1]
- Convergence delay bound for full-mesh networks: O(n^3) for average case, O(n!) for worst case
- Labovitz [2], Obradovic [3], Pei [8]
- Assuming unit cost per hop
- Convergence delay ∝ length of the longest path involved
- Griffin and Premore [4]
- V-shaped curve of convergence delay wrt MRAI
- Messages wrt MRAI decrease at a decreasing rate
- Large scale failures: even harder!
38. Evaluation of Large Scale Failures
- Evaluation methods
- Primarily simulation. Analysis is intractable.
- BGP simulation tools
- Several available, but simulation expense is the key!
- SSFNet: scalable, but max 240 nodes on a 32-bit machine
- SSFNet default parameter settings
- MRAI jittered by 25% to avoid synchronization
- OSPFv2 used as the intra-domain protocol
39. Topology Modeling
- Topology generation: BRITE
- Enhanced to generate arbitrary degree distributions
- Heavy-tailed, based on actual measurements
- Approx 70% low-degree & 30% high-degree nodes
- Mostly used 1 router/AS → easier to see trends
- Failure topology: geographical placement
- Emulated by placing all AS routers and ASes on a 1000x1000 grid
- The area of an AS ∝ no. of routers in the AS
40. Convergence Delay vs. Failure Extent
- Initial rapid increase, then flattens out
- Delays & increase rate both go up with network size
- → Large failures can be a problem!
41. Delay & Msg Traffic vs. MRAI
- Small networks in simulation →
- Optimal MRAI for isolated failures is small (0.375 s)
- Chose a few larger values
- Main observations
- Larger failure → larger MRAI more effective
42. Convergence Delay vs. MRAI
- A V-shaped curve, as expected
- Curve flattens out as failure extent increases
- Optimal MRAI shifts to the right with failure extent
43. Impact of AS Distance
- ASes are more likely to be connected to other nearby ASes
- b indicates the preference for shorter distances (smaller b → higher preference)
- Lower convergence delay for lower b
44. Outline
- Overview
- Internet Infrastructure elements & Large Scale Failures
- Dealing with Routing Failures
- Routing algorithms & their properties
- Improving BGP Convergence
- Other Performance Metrics
- Dealing with Name Resolution Failures
- Name resolution preliminaries
- DNS vulnerabilities & Solution
- Conclusions and Open Issues
45. Reducing Convergence Delays
- Many schemes in the literature
- Most evaluated only for isolated failures
- Some popular schemes
- Ghost Flushing
- Consistency Assertions
- Root Cause Notification
- Our work (large scale failure focused)
- Dynamic MRAI
- Batching
- Speculative Invalidation
46. Ghost Flushing
- Bremler-Barr, Afek, Schwarz; Infocom 2003
- An adv. implicitly replaces the old path
- GF withdraws the old path immediately
- Pros
- Withdrawals will cascade thru the network
- More likely to install new working routes
- Cons
- Substantial additional load on routers
- Flushing takes away a working route!
- Install BC →
- Routes at D, F, I via B will start working
- Flushing will take them away
47. Consistency Assertion
- Pei, Zhao, et al., Infocom 2002
- If S has two paths SN1xD & SN2yN1xD, and the first path is withdrawn, then the second path is not used (considered infeasible)
- Pros
- Avoids trying out paths that are unlikely to be working
- Cons
- Consistency checking can be expensive
(Figure: S with paths to D via N1 and via N2/N1)
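The assertion above reduces to a sub-path test. A hypothetical simplification (the real scheme reasons per-neighbor over route attributes): when SN1xD is withdrawn, any alternate path that embeds the same N1-x-D segment is marked infeasible.

```python
def infeasible(alternate, withdrawn):
    """True if `alternate` contains `withdrawn` as a contiguous sub-path,
    i.e. it relies on the same (now withdrawn) route segment."""
    n, m = len(alternate), len(withdrawn)
    return any(alternate[i:i + m] == withdrawn for i in range(n - m + 1))

withdrawn = ["N1", "x", "D"]              # S loses path S-N1-x-D
alt = ["N2", "y", "N1", "x", "D"]         # alternate via N2 reuses N1-x-D
assert infeasible(alt, withdrawn)         # so it is considered infeasible
assert not infeasible(["N2", "z", "D"], withdrawn)   # independent path survives
```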
48. Root Cause Notification
- Pei, Azuma, Massey, Zhang; Computer Networks, 2004
- Modify BGP messages to carry the root cause (e.g., node/link failure)
- Pros
- Avoid paths w/ failed nodes/links → substantial reduction in conv. delay
- Cons
- Change to the BGP protocol. Unlikely to be adopted.
- Applicability to large scale failures unclear
(Figure: example network with nodes A-I)
- D, E, G diagnose whether A or the link to A has failed
- Propagate this info to neighbors
49. Large Scale Failures: Our Approach
- What we can't or wouldn't do
- No coordination between ASes
- Business issues, security issues, very hard to do, ...
- No change to the wire protocol (i.e., no new msg type)
- No substantial router overhead
- Critical for large scale failures
- Solution applicable to both isolated & large scale failures
- What we can do
- Change MRAI based on network and/or load parms (e.g., degree dependent, backlog dependent, ...)
- Process messages (& generate updates) differently
50. Key Idea: Dynamic MRAI
- Increase MRAI when the router is heavily loaded
- Reduces load of route changes
- Relationship to large scale failure
- Larger failure size → greater router loading → larger MRAI more appropriate
- Router-load-directed MRAI caters to all failure sizes!
- Implementation
- Queue-length-threshold based MRAI adjustment
(Figure: queue with increase/decrease thresholds th1 & th2)
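The threshold scheme above can be sketched as a hysteresis controller. All numeric values here (thresholds, step, bounds) are hypothetical placeholders; the paper derives its own settings.

```python
def adjust_mrai(mrai, qlen, th_lo=50, th_hi=200,
                step=5.0, lo=5.0, hi=60.0):
    """Raise MRAI when the update queue is long, lower it when it drains;
    leave it unchanged in the dead band between the two thresholds."""
    if qlen > th_hi:
        mrai = min(mrai + step, hi)     # heavily loaded: damp advertisements
    elif qlen < th_lo:
        mrai = max(mrai - step, lo)     # lightly loaded: converge faster
    return mrai

m = 30.0
m = adjust_mrai(m, qlen=500)   # backlog spikes -> 35.0
m = adjust_mrai(m, qlen=120)   # in the dead band -> unchanged
m = adjust_mrai(m, qlen=10)    # queue drained -> back to 30.0
assert m == 30.0
```

The dead band between th1 and th2 prevents the MRAI from oscillating on small queue fluctuations.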
51. Dynamic MRAI: Effect on Delay
- Change wrt fixed MRAI = 9.375 secs
- Improves convergence delay as compared to fixed values
52. Key Idea: Message Batching
- BGP default: FIFO message processing →
- Unnecessary processing if
- A later update (already in queue) changes the route to dest
- Timer expiry before a later msg is processed
- Relationship to large scale failure
- Significant batching (and hence batching advantage) likely for large scale failures only
- Algorithm
- A separate logical queue/dest allows processing of all updates to a dest as a batch
- >1 update from the same neighbor → delete the older ones
(Figure: FIFO queue rearranged into per-destination logical queues)
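The batching rule above is a small regrouping step. A sketch with a hypothetical message format `(dest, neighbor, seq)`: updates are grouped per destination, and within each group only the newest update from each neighbor survives.

```python
from collections import OrderedDict

def batch(queue):
    """FIFO queue -> {dest: {neighbor: latest update}}, arrival order kept."""
    batches = OrderedDict()
    for dest, neighbor, seq in queue:
        batches.setdefault(dest, {})[neighbor] = seq   # newer overwrites older
    return batches

queue = [("A", "B", 1), ("C", "B", 2), ("A", "B", 3), ("A", "C", 4)]
b = batch(queue)
assert b["A"] == {"B": 3, "C": 4}   # B's older update (seq 1) was superseded
assert b["C"] == {"B": 2}
```

Processing each `batches[dest]` as a unit yields one route computation per destination instead of one per message.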
53. Batching: Effect on Delay
- Behavior similar to dynamic MRAI, w/o actually making it dynamic
- Combination w/ dynamic MRAI works somewhat better
54. Key Idea: Speculative Invalidation
- Large scale failure →
- A lot of route withdrawals for the failed AS, say X
- No. of withdrawn paths w/ AS X ∈ AS_path > threshold → invalidate all paths containing X
- Implementation issues
- Going through the routes for invalidation is inefficient
- Use route filters at each node
- Threshold estimation → computed (see paper)
- Reverting routes to valid state → time-slot based
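The counting-and-filtering step above can be sketched directly. A hypothetical simplification (fixed threshold; the paper computes it, and uses route filters rather than a list scan):

```python
def invalidated_as(withdrawals, threshold):
    """Return ASes that appear in more than `threshold` withdrawn paths."""
    counts = {}
    for path in withdrawals:
        for asn in set(path):
            counts[asn] = counts.get(asn, 0) + 1
    return {asn for asn, c in counts.items() if c > threshold}

# Three withdrawals name AS X; one does not.
withdrawals = [["B", "X", "C1"], ["D", "X", "C2"], ["E", "X", "C3"], ["E", "F"]]
bad = invalidated_as(withdrawals, threshold=2)
assert bad == {"X"}                 # X crossed the threshold, others did not

# Speculatively drop every known path through a suspect AS.
rib = [["G", "X", "C1"], ["G", "H"]]
alive = [p for p in rib if not bad & set(p)]
assert alive == [["G", "H"]]
```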
55. Effect of Invalidation
- Avoids exploring unnecessary paths
- Reduces conv. delay significantly, but
- May affect connectivity adversely
- Implement only at nodes with degree 4 or higher
56. Comparison of Various Schemes
- CA is the best scheme throughout!
- GF is rather poor
- Batching & dynamic MRAI do pretty well considering their simplicity
57. Outline
- Overview
- Internet Infrastructure elements & Large Scale Failures
- Dealing with Routing Failures
- Routing algorithms & their properties
- Improving BGP Convergence
- Other Performance Metrics
- Dealing with Name Resolution Failures
- Name resolution preliminaries
- DNS vulnerabilities & Solution
- Conclusions and Open Issues
58. What's the Right Performance Metric?
- Convergence delay
- Network centric, not user centric
- Instability in infrequently used routes is almost irrelevant
- User centric metrics
- Packet loss & packet delays
- Convergence delay does not correlate well with user centric metrics
59. User Centric Metrics
- Computed over all routes & the entire convergence period
- Single metric: overall avg over routes & time
- Distribution wrt routes, time-dependent rate, etc.
- Fraction of pkts lost
- Fractional increase in pkt delay
- Absolute delay depends on route length & is not meaningful
- Requires E2E measurements → much harder than pkt loss
60. Comparison between Schemes
- Comparing some major schemes
- Consistency Assertion (CA)
- Ghost Flushing (GF)
- Speculative Invalidation (SI)
- All 3 schemes reduce conv delay substantially, but
- Only CA can really reduce the pkt losses!
61. How Schemes Affect Routes
- Cumulative time for which there is no valid path
- T_noroute: time for which there is no route at all
- T_allinval: time for which all neighbors advertise an invalid route
- T_BGPinval: time for which BGP chooses an invalid route (even though some neighbor has a valid route)
- GF increases T_noroute the most; CA reduces T_allinval the most
62. Changes to Reduce Pkt Losses
- GF: difficult to reduce T_noroute. Not attempted.
- CA: use the best route even if all of them are infeasible, but don't advertise infeasible routes
- Improves substantially
- SI: mark the route invalid probabilistically depending on fail count (instead of deterministically)
- Improves substantially
63. Conclusions & Open Issues
- Inter-domain routing does not perform very well for large scale failures
- Considered several schemes for improvement. Room for further work.
- Convergence delay is not the right metric
- Defined a pkt loss related metric & a simple scheme to improve it
- Open issues for large scale failures
- Analytic modeling of convergence properties
- What aspects affect pkt losses & can we model them?
- How do we improve pkt loss performance?
64. Outline
- Overview
- Internet Infrastructure elements & Large Scale Failures
- Dealing with Routing Failures
- Routing algorithms & their properties
- Improving BGP Convergence
- Other Performance Metrics
- Dealing with Name Resolution Failures
- Name resolution preliminaries
- DNS vulnerabilities & Solution
- Conclusions and Open Issues
65. DNS Infrastructure
(Figure: client apps (browser, e-mail, FTP) → resolver → local DNS server / DNS proxy with cache → hierarchy of authoritative DNS servers: root; TLDs sg, nz, au; then gov, edu; then ips, sa, gb)
66. DNS Usage
- Name → IP mapping
- Best-matching
- Time-to-Live (TTL)
- Iterative vs. recursive lookup
- Delegation chains
- Avg length 4-6!
- Makes DNS very vulnerable
(Figure: DNS proxy resolving Q: abc.gb.gov.au? through .au, .gov, .gb, .ips, .abc, .xyz)
67. DNS Resource Record (RR)
- Domain name
- (length, name) pairs, e.g., intel.com → 05intel03com00
- Record types
- DNS internal types
- Authority: NS, SOA; DNSSEC: DS, DNSKEY, RRSIG, ...
- Many others: TSIG, TKEY, CNAME, DNAME, ...
- Terminal RR
- Address records: A, AAAA
- Informational: TXT, HINFO, KEY, ... (data carried to apps)
- Non-terminal RR
- MX, SRV, PTR, ... w/ domain names resulting in further queries
- Other fields
- RL: record length; RDATA: IP address, referral, ...
- TTL: Time To Live in a cache
68. Outline
- Overview
- Internet Infrastructure elements & Large Scale Failures
- Dealing with Routing Failures
- Routing algorithms & their properties
- Improving BGP Convergence
- Other Performance Metrics
- Dealing with Name Resolution Failures
- Name resolution preliminaries
- DNS vulnerabilities & Solution
- Conclusions and Open Issues
69. DNS Attacks
- Inject incorrect RR into DNS proxy (poisoning)
- Compromise DNS proxy (hard)
- Intercept query & send fake or addl response
- Query interception relatively easy
- UDP based → don't need any context!
- DNS query uses a 16-bit client-id to connect query w/ response
- Fairly static, can be guessed easily
- Response can include additional RRs
- Intercept updates to authoritative server
- Technically not poisoning, but a problem
70. Poisoning Consequences
- Can be exploited in many ways
- Disallow name resolution
- Direct all traffic to a small set of servers
- DDoS attack!
- Direct to a malicious server to collect info or drop malware
- Scale of attack simply depends on the level in the hierarchy!
- Poison propagates downwards
- Set large TTL to avoid expiry
- Actual scenario in Mar '05 (.com entry poisoned)
(Figure: poisoned proxy cache)
71. Making DNS Robust
- TSIG (symmetric key crypto)
- Intended for secure master-slave & proxy comm.
- Issues: not general; scalability
- DNSSEC
- Stops cache poisoning, but issues of overhead, infrastructure change, key mgmt, etc.
- Based on PKI; a symmetric-key version also exists
- Cooperative lookup
- Direct requests to responsive clients (CoDNS)
- Distributed hash table (DHT) structure for DNS (CoDoNS)
- Cooperative checking between clients (DoX)
72. PK-DNSSEC
- Auth. chain starts from the root
- Parent signs child certificates (avoids lying about the public key)
- Encrypted exchange also supplies signed public keys
- F: public key; f: private key
(Figure: DNS proxy walks the chain from the root cert: Fgov(query) → gov, fgov(resp, Fgb) back; Fgb(query) → gb, fgb(resp) back)
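The chain-of-trust idea above can be sketched without real PKI machinery. In this illustration HMAC stands in for public-key signatures (a deliberate simplification; DNSSEC uses asymmetric keys): a resolver that trusts only the root key validates each zone key with the key validated one step earlier.

```python
import hmac, hashlib

def sign(key, data):
    """HMAC as a toy stand-in for a real signature."""
    return hmac.new(key, data, hashlib.sha256).digest()

root_key = b"root-secret"
gov_key, gb_key = b"gov-key", b"gb-key"
chain = [                      # (child name, child key, parent's signature)
    (b"gov", gov_key, sign(root_key, b"gov" + gov_key)),
    (b"gb",  gb_key,  sign(gov_key,  b"gb" + gb_key)),
]

def validate(trusted, chain):
    """Walk the delegation chain, checking each signature with the key
    validated at the previous step."""
    key = trusted
    for name, child_key, sig in chain:
        if not hmac.compare_digest(sig, sign(key, name + child_key)):
            return False
        key = child_key
    return True

assert validate(root_key, chain)
# A forged .gb key (poisoning attempt) signed with the wrong key fails:
forged = chain[:1] + [(b"gb", b"evil", sign(b"evil", b"gb" + b"evil"))]
assert not validate(root_key, forged)
```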
73. CoDoNS
- Organize DNS using a DHT (distributed hash table)
- Enhances availability via distribution and replication
- Explicit version control to keep all copies current
- Issues
- DHT issues (equal capacity nodes)
- Explicit version control unscalable
- Not directed towards poisoning control (but DNSSEC can be used)
74. Domain Name Cross-referencing (DoX)
- Client peer groups
- Diversity & common interest based
- Peers agree to cooperate on verification of popular records
- Mutual verification
- Assumes that the authoritative server is not poisoned
(Figure: peers 1-4 verifying records with each other)
75. Choosing Peers
- Give & get
- Give: a peer must participate in verification even if it is not interested in the address → overhead
- Get: immediate poison detection, high data currency
- Selection of peers
- Topic channel w/ subscription by peers
- E.g., names under a Google/Yahoo directory
- Community channel, e.g., peers within the same org
- Minimizing overhead
- Verify only popular (perhaps most vulnerable) names
- May be adequate given the usual Zipf-like popularity dist.
76. DoX Verification
- Verification cache per peer
- Avoids unnecessary re-verification
- Verification
- DNS copy (Rd) = verified copy (Rv) → stop
- Else send (Ro,Rn) = (Rv,Rd) to all other peers
- At least m peers agree → stop; else obtain authoritative copy Ra; if Ra ≠ Rd, poison detected
- Agreement procedure
- Involves local copy Rv & remotely received (Ro,Rn)
- If Rv = Rn → agree; else peer obtains its own authoritative copy Ra
- Several cases, e.g., if Rv = Ro, Ra = Rn → agree
- Verified copy was obsolete, got the correct one now → forced removal of the obsolete copy
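The decision flow above can be sketched as one function. This is a hypothetical simplification of DoX (the real protocol exchanges (Ro,Rn) pairs and runs the agreement procedure at each peer; here peer agreement is collapsed into a vote count):

```python
def verify(rd, rv, peer_votes, m, fetch_authoritative):
    """Return (status, record). rd: fresh DNS copy; rv: verified cache copy;
    peer_votes: how many peers agreed with rd; m: agreement quorum."""
    if rd == rv:
        return ("verified", rd)          # matches the verified cache: stop
    if peer_votes >= m:
        return ("verified", rd)          # enough peers confirm the new copy
    ra = fetch_authoritative()           # last resort: authoritative server
    if ra != rd:
        return ("poison-detected", ra)   # DNS copy disagrees with authority
    return ("verified", rd)              # record legitimately changed

auth = lambda: "1.2.3.4"
assert verify("1.2.3.4", "1.2.3.4", 0, 3, auth) == ("verified", "1.2.3.4")
assert verify("6.6.6.6", "1.2.3.4", 0, 3, auth) == ("poison-detected", "1.2.3.4")
assert verify("5.6.7.8", "1.2.3.4", 3, 3, lambda: "5.6.7.8") == ("verified", "5.6.7.8")
```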
77. Handling Multiple IPs per Name
- DNS directed load distribution
- Easily handled with set comparison
- Multiple views
- Used to differentiate inside/outside clients
- All peers should belong to the same view (statically or by trial & error)
- Content Distribution Networks (CDNs)
- Same name translates to different IP addresses in different regions
- Need a flowset based IP address comparison
78. Results: Normal DNS
- Poison spreads in the cache
- More queries are affected
79. Results: DoX
- Poison removed immediately
80. DoX vs. DNSSEC

Characteristic | DNSSEC | CoDoNS | DoX
Poison detection | No poisoning | Yes | Yes, unless all peers poisoned
Poison removal | No poisoning | No | Quick, when possible
Effect on obsolete data | No effect | Explicit update propagation | Improves data currency
Overhead | High CPU overhead, increased msg size | Significant replication overhead | Latency overhead of inter-peer comm.
Protocol impact | Substantial change | Easy to interface w/ regular DNS | None (can implement on top of base DNS)
Vulnerability | Cryptographically secure | Bit more robust (due to distribution & proactive replication) | Coordinated attack can defeat it. Updates must be secured.
Deployment | Difficult | Fairly easy | Easy
Other challenges | Key distribution | Unequal node capacities | Good choice of peers
81. Conclusions & Open Issues
- DNS has numerous vulnerabilities & is easy to attack
- Several proposed solutions; none entirely satisfactory
- Large deployed base resists significant overhaul
- Future work
- Combine the best of DNSSEC, CoDoNS & DoX
- Choice of peers & hardening against malicious peers
- Tackling the delegation mess
- Math. analysis w/ delegation, non-Poisson traffic, ...
- How do we make DNS robust against large scale coordinated attacks?
82. That's all folks! Questions?
83. BGP References
- A.L. Barabasi and R. Albert, "Emergence of Scaling in Random Networks," Science, pp. 509-512, Oct. 1999.
- A. Bremler-Barr, Y. Afek, and S. Schwarz, "Improved BGP convergence via ghost flushing," in Proc. IEEE INFOCOM 2003, vol. 2, San Francisco, CA, Mar 2003, pp. 927-937.
- S. Deshpande and B. Sikdar, "On the Impact of Route Processing and MRAI Timers on BGP Convergence Times," in Proc. GLOBECOM 2004, vol. 2, pp. 1147-1151.
- T.G. Griffin and B.J. Premore, "An experimental analysis of BGP convergence time," in Proc. ICNP 2001, Riverside, California, Nov 2001, pp. 53-61.
- F. Hao, S. Kamat, and P. V. Koppol, "On metrics for evaluating BGP routing convergence," Bell Laboratories Tech. Rep., 2003.
- C. Labovitz, G. R. Malan, and F. Jahanian, "Internet Routing Instability," IEEE/ACM Transactions on Networking, vol. 6, no. 5, pp. 515-528, Oct. 1998.
- C. Labovitz, A. Ahuja, et al., "Delayed internet routing convergence," in Proc. ACM SIGCOMM 2000, Stockholm, Sweden, Aug 2000, pp. 175-187.
- C. Labovitz, A. Ahuja, et al., "The Impact of Internet Policy and Topology on Delayed Routing Convergence," in Proc. IEEE INFOCOM 2001, vol. 1, Anchorage, Alaska, Apr 2001, pp. 537-546.
- A. Lakhina, J.W. Byers, et al., "On the Geographic Location of Internet Resources," IEEE Journal on Selected Areas in Communications, vol. 21, no. 6, pp. 934-948, Aug. 2003.
- A. Medina, A. Lakhina, et al., "Brite: Universal topology generation from a user's perspective," in Proc. MASCOTS 2001, Cincinnati, Ohio, Aug 2001, pp. 346-353.
- D. Obradovic, "Real-time Model and Convergence Time of BGP," in Proc. IEEE INFOCOM 2002, vol. 2, New York, June 2002, pp. 893-901.
- D. Pei, et al., "A study of packet delivery performance during routing convergence," in Proc. DSN 2003, San Francisco, CA, June 2003, pp. 183-192.
- D. Pei, B. Zhang, et al., "An analysis of convergence delay in path vector routing protocols," Computer Networks, vol. 30, no. 3, Feb. 2006, pp. 398-421.
84. BGP References
- D. Pei, X. Zhao, et al., "Improving BGP convergence through consistency assertions," in Proc. IEEE INFOCOM 2002, vol. 2, New York, NY, June 23-27, 2002, pp. 902-911.
- Y. Rekhter, T. Li, and S. Hares, "Border Gateway Protocol 4," RFC 4271, Jan. 2006.
- J. Rexford, J. Wang, et al., "BGP routing stability of popular destinations," in Proc. Internet Measurement Workshop 2002, Marseille, France, Nov. 6-8, 2002, pp. 197-202.
- A. Sahoo, K. Kant, and P. Mohapatra, "Characterization of BGP recovery under Large-scale Failures," in Proc. ICC 2006, Istanbul, Turkey, June 11-15, 2006.
- A. Sahoo, K. Kant, and P. Mohapatra, "Improving BGP Convergence Delay for Large Scale Failures," in Proc. DSN 2006, June 25-28, 2006, Philadelphia, Pennsylvania, pp. 323-332.
- A. Sahoo, K. Kant, and P. Mohapatra, "Speculative Route Invalidation to Improve BGP Convergence Delay under Large-Scale Failures," in Proc. ICCCN 2006, Arlington, VA, Oct. 2006.
- A. Sahoo, K. Kant, and P. Mohapatra, "Improving Packet Delivery Performance of BGP During Large-Scale Failures," submitted to Globecom 2007.
- SSFNet: Scalable Simulation Framework. [Online]. Available: http://www.ssfnet.org/
- W. Sun, Z. M. Mao, and K. G. Shin, "Differentiated BGP Update Processing for Improved Routing Convergence," in Proc. ICNP 2006, Santa Barbara, CA, Nov. 12-15, 2006, pp. 280-289.
- H. Tangmunarunkit, J. Doyle, et al., "Does Size Determine Degree in AS Topology?," ACM SIGCOMM, vol. 31, issue 5, pp. 7-10, Oct. 2001.
- R. Teixeira, S. Agarwal, and J. Rexford, "BGP routing changes: Merging views from two ISPs," ACM SIGCOMM, vol. 35, issue 5, pp. 79-82, Oct. 2005.
- B. Waxman, "Routing of Multipoint Connections," IEEE Journal on Selected Areas in Communications, vol. 6, no. 9, pp. 1617-1622, Dec. 1988.
- B. Zhang, R. Liu, et al., "Measuring the internet's vital statistics: Collecting the internet AS-level topology," ACM SIGCOMM, vol. 35, issue 1, pp. 53-61, Jan. 2005.
- B. Zhang, D. Massey, and L. Zhang, "Destination Reachability and BGP Convergence Time," in Proc. GLOBECOM 2004, vol. 3, Dallas, TX, Nov 2004, pp. 1383-1389.
85. DNS References
- R. Arends, R. Austein, et al., "DNS Security Introduction & Requirements," RFC 4033, 2005.
- G. Ateniese and S. Mangard, "A new approach to DNS security (DNSSEC)," in Proc. 8th ACM Conf. on Computer & Communications Security, 2001.
- D. Atkins and R. Austein, "Threat analysis of the domain name system," http://www.rfc-archive.org/getrfc.php?rfc=3833, August 2004.
- R. Curtmola, A. D. Sorbo, and G. Ateniese, "On the performance and analysis of DNS security extensions," in Proc. CANS, 2005.
- M. Theimer and M. B. Jones, "Overlook: Scalable name service on an overlay network," in Proc. 22nd ICDCS, 2002.
- K. Park, V. Pai, et al., "CoDNS: Improving DNS performance and reliability via cooperative lookups," in Proc. 6th Symp. on OS Design & Impl., 2004.
- L. Yuan, K. Kant, et al., "DoX: A peer-to-peer antidote for DNS cache poisoning attacks," in Proc. IEEE ICC, 2006.
- L. Yuan, K. Kant, and P. Mohapatra, "A proxy view of quality of domain name service," in IEEE Infocom 2007.
- V. Ramasubramanian and E.G. Sirer, "The design & implementation of a next generation name service for the internet," Sigcomm 2004.
86. Backup
87. Quality of DNS Service (QoDNS)
- Availability
- Measures if DNS can answer the query
- Prob of correct referral when the record is not cached
- Accuracy
- Prob of hitting a stale record in the proxy cache
- Poison propagation
- Prob(poison at leaf level at time t | level k poisoned at t=0)
- Latency: additional time per query
- Overhead: additional msgs/BW per query
88. Computation of Metrics
- Modification at authoritative server
- Copy obsolete, but proxy not aware until TTL expires & a new query forces a reload
(Figure: timeline of cache hits & misses between modifications)
- XR: residual time of the query arrival process
- MR: residual time of the modification process
- Y: inter-miss period = TTL + XR
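The accuracy metric above can also be estimated by direct simulation of this model. A sketch with hypothetical Poisson query and modification processes: a cache hit is "stale" if the authoritative record was modified after the cached copy was fetched.

```python
import random

def stale_fraction(ttl, query_rate, mod_rate, horizon=1e5, seed=1):
    """Fraction of cache hits that serve a stale record."""
    rng = random.Random(seed)
    t = 0.0
    next_mod = rng.expovariate(mod_rate)   # next authoritative modification
    reload_t = None                        # when the cached copy was fetched
    last_mod = -1.0
    hits = stale = 0
    while t < horizon:
        t += rng.expovariate(query_rate)   # next query arrival
        while next_mod < t:                # advance the modification process
            last_mod = next_mod
            next_mod += rng.expovariate(mod_rate)
        if reload_t is not None and t - reload_t < ttl:
            hits += 1
            if last_mod > reload_t:
                stale += 1                 # record changed since we cached it
        else:
            reload_t = t                   # miss: reload from authoritative server
    return stale / hits

# Larger TTL -> more hits served from an outdated copy.
print(stale_fraction(ttl=10, query_rate=1.0, mod_rate=0.01))
print(stale_fraction(ttl=1, query_rate=1.0, mod_rate=0.01))
```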
89. Dealing with Hierarchy
(Figure: .gov at level h-1 with children .sa, .ips, .gb at level h)
- A miss at a node → a query at its parent
- Superposition of the miss processes of the children → query arrival process of the parent
- Recursively derive the arrival process bottom-up
90. Deriving QoDNS Metrics
- Accuracy
- Prob. leaf record is current
- (Un)Availability
- Prob. BMR is an obsolete referral
- Latency
- RTT x no. of referrals
- Overhead
- Related to no. of referrals for current RRs & no. of tries for obsolete RRs
(Figure: DNS hierarchy from root through .sg/.nz/.au, .gov/.edu, .sa/.ips/.gb down to leaves .abc/.xys/.qwe)
91. Model Validation
- Poisson and Gamma arrival models
- Uniform/Zipf popularity
(Plots: accuracy, overhead, latency & unavailability vs. query arrival rate; higher rate → more caching)
92. Survey of TTL Values
- 2.7 million names on dmoz.org
- 1 hr, 1 day, 2 days dominate
- Some extremely small values
- How to pick the TTL for a domain?
(Figure: CDF of TTLs)
93. Impact of TTL
(Plot: overhead & failures vs. TTL for no, moderate & frequent modification)