A Study of DNS Lameness

Learn more at: http://www.iepg.org

1
A Study of DNS Lameness
  • Edward Lewis <edlewis@arin.net>

2
Agenda
  • Lameness
  • Why
  • (Surprise) Spotty(?) results
  • Approach
  • Plans

3
Lameness is...
  • When an NS RR right-hand-side is a domain name that
      • has no address record(s)
      • does not respond to queries
      • responds negatively for the zone
  • Lameness might happen when the domain name
      • has multiple addresses and at least one fits the above
      • responds non-authoritatively (i.e., recursively)
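The definition above can be sketched as a small classifier. This is only an illustration, not the code used in the study: the `Probe` record and the labels "ok", "lame" and "maybe" are hypothetical names, and treating an empty answer section as a negative response is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Probe:
    """Outcome of one query sent to one address of an NS name (hypothetical)."""
    replied: bool                 # False if no reply ever arrived
    authoritative: bool = False   # AA bit of the reply
    rcode: int = 0                # 0 means NOERROR
    answers: int = 0              # ANCOUNT of the reply


def probe_state(p: Probe) -> str:
    if not p.replied:
        return "lame"             # does not respond to queries
    if p.rcode != 0 or p.answers == 0:
        return "lame"             # responds negatively for the zone
    if not p.authoritative:
        return "maybe"            # answered, but non-authoritatively
    return "ok"


def classify(probes: List[Probe]) -> str:
    """probes holds one entry per address; an empty list means no address record."""
    if not probes:
        return "lame"
    states = {probe_state(p) for p in probes}
    if states == {"ok"}:
        return "ok"
    if states == {"lame"}:
        return "lame"
    return "maybe"                # mixed addresses, or a recursive answer


print(classify([Probe(False), Probe(True, True, 0, 1)]))
```

A name with one dead address and one good address comes out as "maybe", which matches the "might happen" case on the slide.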

4
Why Bother?
  • ARIN membership raised the issue of cleaning this
    up
  • Lame delegations cause some popular software to
    behave badly on the Internet
  • Lame delegations can be limited easily
  • Intermittent network problems make it infeasible
    to eliminate it completely

5
Reverse Map
  • This effort is targeted at ARIN's reverse map
    delegations
  • ARIN's /8's
  • Legacy /8's
  • Not all /8's - not RIPE's, not APNIC's
  • Dependencies are
  • simplifying assumptions about the parsing of the
    zone files
  • summary output breaks results into /16's and /24's

6
State of This Work
  • Code first ran at NANOG 25
  • "just in time" development
  • Trying at home set me back a bit
  • Code now runs at the home office
  • Results have not been verified
  • Results "look" to be valid
  • In brief, problem was traffic shaping
  • Unreliable UDP as a testing mechanism
  • Bandwidth bottleneck upstream (i.e., T-1)

7
Early results
  • Remember, this is not all of in-addr.arpa...
  • Using counts from last run
  • Number of NS RR's 548,667
  • Number of zones 231,240
  • Number of name server names 25,047
  • Number of unique IP addresses 21,846
  • Of three runs made just before leaving for here,
    two runs had very similar counts, all three had
    similar percentages

8
per Zone demographics
  • Servers per zone - max 7, avg 2.37
  • Addresses per zone - max 26, avg 2.32
  • Zones with no addresses 3,062
  • Zones with one address 7,365
  • All zones have multiple NS RR's
  • Some lacked glue for one, some had two names with
    identical glue, some duplicates slipped through

9
per Name Server
  • Zones - max 5772, avg 21.9
  • No address - 3,178
  • Multiple addresses - 219
  • Addresses - max 24, avg not counted
  • Longest name 41 chars
  • just to tell me how big to make name array
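The per-zone and per-name-server averages can be cross-checked against the counts on the "Early results" slide. A minimal sketch, assuming each average is simply the number of NS RR's divided by the number of zones or of server names (the slides do not say how the averages were computed):

```python
# Counts quoted on the "Early results" slide
ns_rrs = 548_667
zones = 231_240
server_names = 25_047

avg_servers_per_zone = round(ns_rrs / zones, 2)
avg_zones_per_server = round(ns_rrs / server_names, 1)

print(avg_servers_per_zone)   # 2.37
print(avg_zones_per_server)   # 21.9
```

Both values agree with the averages quoted on the slides.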

10
per IP
  • Zones - max 5772, avg 24.6
  • Addresses with multiple domain names pointing to
    them - 291
  • Max number of domain names pointing to an address
    - 9
  • PTR records not checked

11
Results in percentages
  • Counting by IP addresses
    Number of zones  100%  75-99%  50-74%  25-49%  <24%  dead  sample size
    any                46       8       8       5    17    16       21,846
    1                  64       0       0       0    16    20       10,085
    2                  51       0      20       0    11    17        2,214
    3-4                42       6      14      12    10    17        1,778
    5-8                34      13      12       9    17    15        1,305
    9-16               30      17       9      10    24    10        1,883
    17-32              25      20      13      11    21    11        1,624
    33-64              14      29      16      11    19    11        1,218
    65-128              9      33      18      12    22     5          896
    129-1024            7      33      17      18    18     7          811
    1025-2048           0      64       7      29     0     0           14
    2049-8096           0      56       0      44     0     0           18

12
What the preceding means
  • 100% servers are those that answered
    authoritatively for all of the claimed zones
  • 75-99% servers answered positively for almost all
    (one timed-out zone would knock a 100%'er to this)
  • 0-24% servers are likely not answering positively
    for much
  • dead means there was never any reply to a query
    (not even servfail)
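The bucketing described above can be sketched as follows. The function name and arguments are hypothetical, and the exact boundary between the lowest bucket and 25-49% is an assumption, since the slides only give the labels.

```python
def bucket(zones_claimed: int, zones_ok: int, any_reply: bool) -> str:
    """Place one server address into a results column.

    zones_claimed: zones whose NS RR's point at this address
    zones_ok:      zones it answered authoritatively and positively
    any_reply:     False if no query ever drew a reply of any kind
    """
    if not any_reply:
        return "dead"
    if zones_ok == zones_claimed:
        return "100%"
    pct = 100 * zones_ok / zones_claimed
    if pct >= 75:
        return "75-99%"
    if pct >= 50:
        return "50-74%"
    if pct >= 25:
        return "25-49%"
    return "0-24%"


print(bucket(2, 1, True))
```

Under this scheme a server claimed by a single zone can only land in 100%, 0-24% or dead, and a server claimed by two zones can never land in 75-99% or 25-49%, which is consistent with the zeros in the "1" and "2" rows of the table.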

13
Counting by Zones
    Category (% of zones)  All  /16's  /24's  Note
    No IP address            1      1      1  unreachable
    One IP address           3      5      3
    Multi address           95     94     96  "the requirement"
    No working              38     21     39  zones not reachable
    One working             10     12     10
    Multi working           52     67     51
    No broken               49     58     49  "perfect" zones
    Some broken             13     21     12
    All broken              38     21     39  unreachable

14
What the preceding means
  • "No working" means that one will never get a
    reply about that zone (terminal lameness)
  • "No broken" means that all NS records lead to
    good servers (no lameness)
  • "Some broken" means that there is some lameness
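The zone categories can be sketched as a function over per-address test results. The function name is hypothetical, and counting a zone with no addresses at all as "all broken" is an assumption, chosen because the "No working" and "All broken" rows carry identical figures.

```python
from typing import List, Tuple


def zone_status(address_ok: List[bool]) -> Tuple[str, str]:
    """address_ok holds one True/False test result per server address of a zone."""
    working = sum(address_ok)
    broken = len(address_ok) - working

    if working == 0:
        reach = "no working"       # terminal lameness: no reply will ever come
    elif working == 1:
        reach = "one working"
    else:
        reach = "multi working"

    if working == 0:
        health = "all broken"      # assumption: also covers zones with no addresses
    elif broken == 0:
        health = "no broken"       # every NS record leads to a good server
    else:
        health = "some broken"     # some lameness

    return reach, health


print(zone_status([True, False, True]))
```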

15
What's Missing
  • A good measure of how many NS RR's are faulty
  • It dawned on me last week that I hadn't counted
    this - d'oh!
  • Code now dumps results 1:1 with NS RR's
  • Has to deal with multiple-address situations
  • Need to sort into canonical order for comparisons
  • Need to account for changes in NS RR's over time

16
Verifying Results
  • A list of all test results is produced
  • Just added
  • Should be 1:1 to NS RR's but 12K were missing
    during the last run
  • Spot checks ought to be done, as testing via UDP
    is inherently inaccurate
  • List of results from different network locations
    should be correlated

17
Discussion Points
  • Test takes 11 hours via a T-1
  • Could speed up if servers always answered (djbdns
    issue)
  • UDP congestion control would help
  • Coordinating multiple instances of test
  • Eliminate false positives
  • Not for here: what will ARIN/RIRs do with this?

18
Approach
  • Build the following lists from the zone files

[Diagram: a Zone Record, with two NS Domain Names and their IP addresses]
19
Why?
  • This has been seen (1.128.in-addr.arpa)

[Diagram: a Zone Record, with two NS Domain Names and three IP addresses]
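The lists on the two preceding slides (zone record, NS domain names, IP addresses) could be built along these lines. This is a sketch only: it uses nested dictionaries rather than the linked lists the program uses, assumes simplified one-line NS records, and `build_lists`, `lookup` and the `resolve` parameter are hypothetical names.

```python
import socket
from collections import defaultdict
from typing import Callable, Dict, Iterable, List


def lookup(name: str) -> List[str]:
    """Return every IPv4 address for name, or an empty list if none is found."""
    try:
        return socket.gethostbyname_ex(name)[2]
    except OSError:
        return []


def build_lists(
    ns_lines: Iterable[str],
    resolve: Callable[[str], List[str]] = lookup,
) -> Dict[str, Dict[str, List[str]]]:
    """Map zone -> NS domain name -> list of IP addresses."""
    zones: Dict[str, Dict[str, List[str]]] = defaultdict(dict)
    cache: Dict[str, List[str]] = {}
    for line in ns_lines:
        fields = line.split()
        # Assumed layout: <zone> [ttl] [class] NS <server>
        if len(fields) < 3 or fields[-2].upper() != "NS":
            continue
        zone, server = fields[0].lower(), fields[-1].lower()
        if server not in cache:           # resolve each server name only once
            cache[server] = resolve(server.rstrip("."))
        zones[zone][server] = cache[server]
    return dict(zones)
```

Keeping a list of addresses per name covers the case on the "Why?" slide, where one zone ends up with more addresses than names.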
20
The program
  • Runs in two phases
      • Reads NS RR's
          • Builds linked lists
          • Uses gethostbyname() to get "glue"
      • Runs through IP addresses
          • Issues SOA queries
          • Looks for aa=1, rcode=0, ancount=1
  • Both steps print results
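The second phase's check can be sketched with the standard DNS message header layout: build an SOA query, then test the reply header for the AA bit, RCODE 0 and ANCOUNT 1. The function names and the fixed query ID are hypothetical; this is not the program's own code.

```python
import struct


def build_soa_query(zone: str, qid: int = 0x1234) -> bytes:
    """Build a non-recursive SOA query for zone (class IN)."""
    # Header: ID, flags (all zero, so RD is off), QDCOUNT=1, AN/NS/AR counts = 0
    header = struct.pack("!HHHHHH", qid, 0x0000, 1, 0, 0, 0)
    labels = zone.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    return header + qname + struct.pack("!HH", 6, 1)   # QTYPE=SOA (6), QCLASS=IN (1)


def is_good_reply(reply: bytes) -> bool:
    """True if the reply header shows aa=1, rcode=0 and ancount=1."""
    if len(reply) < 12:
        return False
    _qid, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", reply[:12])
    is_response = (flags >> 15) & 1
    aa = (flags >> 10) & 1
    rcode = flags & 0x000F
    return is_response == 1 and aa == 1 and rcode == 0 and ancount == 1


query = build_soa_query("1.128.in-addr.arpa")
print(len(query))
```

In use, the query bytes would be sent over UDP to port 53 of each address under test, and whatever comes back passed to `is_good_reply`.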

21
Impact on the 'net
  • [Diagram: the tester queries many name servers (NS), with queries about 1 second apart]
  • Performance hit is close to home
22
Chief Implementation Issue
  • Speeding up tests
  • When there's no answer, I use 3 @ 30-second
    timeouts
  • No reason to wait on bad servers
  • Queries are parallelized in two dimensions
  • Multiple IP addresses can be under test
    simultaneously
  • Multiple zone requests are pipelined to a server
  • Wouldn't need to speed this up if down servers
    could be eliminated quickly
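A rough estimate shows why dead servers dominate the running time. It rests on assumptions not stated on the slides: that the timeout setting means three attempts of 30 seconds each, that each dead address is queried for only one zone, and that nothing runs in parallel. The 16% dead share and the 21,846 addresses come from the results table.

```python
addresses = 21_846           # unique IP addresses tested
dead_share = 0.16            # "dead" column, "any" row
attempts = 3                 # assumed meaning of the timeout setting
timeout_seconds = 30

dead_addresses = round(addresses * dead_share)
wait_per_dead_query = attempts * timeout_seconds
serial_hours = dead_addresses * wait_per_dead_query / 3600

print(dead_addresses, wait_per_dead_query, round(serial_hours, 1))
```

Even under these generous assumptions, waiting on dead servers one at a time would take far longer than the 11 hours quoted for the whole test, so the waits have to overlap.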

23
Cost of Speeding Up
  • Test environment

[Diagram: tester connected at 10/100Mb to a router, which connects at 1.5Mb to the Internet; excess packets?]
24
Solution
  • Needed to shape the traffic
  • Limit number of IP addresses tested
  • Stagger pipelined requests (one second apart)
  • Seems to slow transmission
  • Seems to avoid any rate limiting (if any)
  • Watching queries on network shows traffic is
    smooth, not bursty
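The shaping described above (a cap on addresses under test, plus staggered requests to each server) can be sketched with asyncio. This is an illustration rather than the program's implementation: `probe_all`, `send`, `max_ips` and `gap` are hypothetical names, and the default cap of 20 is arbitrary.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


async def probe_all(
    zones_by_ip: Dict[str, List[str]],
    send: Callable[[str, str], Awaitable[None]],
    max_ips: int = 20,     # arbitrary cap on addresses tested at once
    gap: float = 1.0,      # seconds between requests to the same server
) -> None:
    """send(ip, zone) transmits one query and returns without awaiting a reply."""
    limit = asyncio.Semaphore(max_ips)

    async def probe_ip(ip: str, zones: List[str]) -> None:
        async with limit:                      # limit addresses under test
            for i, zone in enumerate(zones):
                if i:
                    await asyncio.sleep(gap)   # stagger the pipelined requests
                await send(ip, zone)

    await asyncio.gather(*(probe_ip(ip, z) for ip, z in zones_by_ip.items()))
```

With `max_ips` addresses active and one request per `gap` seconds to each, the outgoing rate is bounded by roughly `max_ips / gap` queries per second, which keeps the traffic smooth on a slow uplink.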

25
Next steps
  • Finish tweaks to code
  • Distribute and run from different locations
  • Present observations to membership
  • Investigate the use of this data

26
Questions
???
27
Answers
!!!