Title: Collecting Data on the Internet
1Collecting Data on the Internet
- Henk Uijterwaal
- RIPE NCC New Projects Group
- Leiden, September 2000
2Overview
- Introduction
- History 101
- Test Traffic Measurements
- Routing Information Service
- Conclusions and questions
3History 101: How the RIPE NCC got involved in
this field
- 1989: RIPE, Réseaux IP Européens
- Informal organization of people interested in
IP-networks
- Volunteers doing work in working groups and on
mailing lists
- Some activities became more and more work
- 1992: RIPE NCC, RIPE Network Coordination Centre
- Organization for activities that ISPs need to
organize as a group, even though they are
competitors
- Association
- Neutral, independent, not-for-profit
4RIPE NCC
- Membership organization
- Offices located in Amsterdam, NL, 85 staff
- Almost 3000 members in Europe, Africa, former
USSR, Middle East
- Main activities
- RIR: IP and AS registration
- whois database with operational information
- k-root server
- Meetings and training courses
- Liaison with EU, governments, IANA, IETF, ITU,
- New Projects
5New Projects Group
- New things that are of interest to the community
as a group
- Start with initial plans...
- development ...
- offer as a service to the membership
- Current projects
- DISI: Deployment of Internet Security
Infrastructure
- TTM: Test Traffic Measurements
- RIS: Routing Information Service
- New New Projects
- 9 people plus 1 or 2 students
6New Projects Group
7Overview
- Introduction
- History 101
- Test Traffic Measurements
- Background and goals
- Principles of active measurements
- Implementation
- Building a measurement network
- Analysis and some results
- Future plans
- Routing Information Service
- Conclusions and questions
8Background
- Interest from the RIPE community in active
performance measurements
- Performance measurements between ISPs are best
done by a neutral third party
- Independent
- Access to PoPs of competitors
- No bias due to commercial interests
- Project started in 1997
- Since October 2000 a regular service for the
entire Internet community
9TTM Goals
- One way measurements
- Dedicated measurement infrastructure
- Active measurements only
- Real traffic
- Inter-provider networks only
- Techniques can be used for internal networks
though
- Scientifically defensible, well-defined
standards
- IETF IPPM
- RFCs 2330, 2679, 2680, ...
10TTM Service Goals
- Black box
- No configuration by the user
- No user access
- Guarantees well-defined environment for the
measurements
- Easy to install
- Little maintenance
- Host only has to look at the results
11One-way delay and loss measurements
ISP A
ISP B
Internet
Border Router
Border Router
Internal Network
Internal Network
12One-way delay and loss measurements
ISP A
ISP B
GPS Clock
Probe
Probe
Internet
Border Router
Border Router
Internal Network
Internal Network
13One-way delay and loss measurements
ISP A
ISP B
GPS Clock
Probe
Probe
Delay
Internet
Border Router
Border Router
Loss
Internal Network
Internal Network
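The measurement drawn on these slides can be sketched as follows: the one-way delay of a packet is its receive timestamp minus its send timestamp, and a packet with no receive record counts as lost. The record layout and the function name `one_way_stats` are assumptions made for illustration, not the TTM implementation.

```python
from typing import Dict, List, Tuple

def one_way_stats(sent: Dict[int, float],
                  received: Dict[int, float]) -> Tuple[List[float], float]:
    """Return (delays in seconds, loss fraction).

    sent and received map a packet sequence number to a timestamp in
    seconds, taken from synchronized clocks on the two probes.
    """
    # Delay is defined only for packets seen at both ends.
    delays = [received[seq] - t_send
              for seq, t_send in sorted(sent.items())
              if seq in received]
    # Anything sent but never received is counted as lost.
    lost = sum(1 for seq in sent if seq not in received)
    loss_fraction = lost / len(sent) if sent else 0.0
    return delays, loss_fraction

# Hypothetical data: packet 3 never arrived.
sent = {1: 10.000, 2: 11.000, 3: 12.000, 4: 13.000}
received = {1: 10.025, 2: 11.031, 4: 13.027}
delays, loss = one_way_stats(sent, received)
```

The subtraction is only meaningful if both clocks agree, which is why the following slides spend so much time on timekeeping.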
14Routing Vectors
ISP A
ISP B
GPS Clock
Probe
Probe
Internet
Border Router
Border Router
Traceroute
Internal Network
Internal Network
15Unix timekeeping
- Hardware oscillator
- Interrupt every 10ms
- Software counter
- Counts interrupts since 1/1/70
- User access to time
- gettimeofday(), adjtime()
- Resolution only 10ms
- same order of magnitude as typical network
delays
- 2nd (faster) counter can improve that to 1 µs
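To see why a 10 ms tick is too coarse, here is a small sketch of a clock that only advances on each interrupt. The function names are hypothetical and times are integer microseconds to keep the arithmetic exact.

```python
TICK_US = 10_000  # 10 ms between clock interrupts, in microseconds

def tick_clock(true_time_us: int, tick_us: int = TICK_US) -> int:
    """Time reported by a clock that only advances once per tick."""
    return (true_time_us // tick_us) * tick_us

def measured_delay(send_us: int, true_delay_us: int,
                   tick_us: int = TICK_US) -> int:
    """Delay seen when both timestamps come from the tick clock."""
    arrival_us = send_us + true_delay_us
    return tick_clock(arrival_us, tick_us) - tick_clock(send_us, tick_us)

# The same 3 ms delay, sent at two different moments within a tick.
early = measured_delay(4_900, 3_000)   # both timestamps fall in one tick
late = measured_delay(9_900, 3_000)    # arrival falls in the next tick
```

Depending on where in the tick the packet is sent, a true delay of 3 ms is reported as either 0 or 10 ms.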
16Unix timekeeping
- A resolution of 1 µs is several orders of
magnitude better than the typical delays on the
Internet
- But the clocks on two machines will run
completely independently of each other
- We have to synchronize our clocks
- Set the clock to the right initial value
- Tune it to run at the right speed
- Correct for experimental effects
- To do that, we need
- An external time reference source
- Flywheel to keep the clock running at the right
speed
17Flywheel/Phase Locked Loop
18Flywheel/Phase Locked Loop
- Second Counter
- External time source
- PLL
- Determine the difference between internal and
external clock
- Make the internal clock run faster/slower
- Correct for variations over time
- Kernel level code
- NTP
- Internal clock synchronized to a few µs
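The loop described above can be illustrated with a toy simulation. This is a plain proportional-integral loop, not the kernel or NTP algorithm; the gains `kp` and `ki`, the once-per-second pulse, and the error-free reference are all assumptions.

```python
def discipline(initial_offset, drift, steps=100, kp=0.5, ki=0.25):
    """Simulate a clock steered once per second by an external pulse.

    initial_offset: local minus reference time at start, in seconds
    drift: frequency error of the free-running oscillator (s per s)
    Returns (offset after each correction, final frequency correction).
    """
    offset = initial_offset
    freq_corr = 0.0
    history = []
    for _ in range(steps):
        # One second passes: the oscillator drifts, minus what was tuned out.
        offset += drift - freq_corr
        # Difference between internal and external clock (assumed exact).
        error = offset
        freq_corr += ki * error   # make the internal clock run faster/slower
        offset -= kp * error      # remove part of the phase error
        history.append(offset)
    return history, freq_corr

# Hypothetical start: 1 ms off, oscillator 50 ppm fast.
offsets, freq = discipline(initial_offset=1e-3, drift=50e-6)
```

With these gains the offset decays towards zero and the frequency correction settles at the oscillator's drift, which is the flywheel effect: the clock keeps running at the right speed between reference pulses.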
19External Time Source
- NTP to stratum-1 reference servers
- Depends on third party
- Accuracy depends on network conditions
- DCF77 and other long-wave radio sources
- Not available everywhere in our service region
- Needs operator intervention for maximal accuracy
- GPS
- Available everywhere
- Clock pulses with 100 ns accuracy
- Cheap (500)
- Runs without operator intervention
20GPS receivers
- Solution 1 Motorola
- Coax cable between receiver and antenna
- Limited length
- Hard to install
- Solution 2 Trimble
- Palisade or Acutime 2000
- RS422 over cat5 cable
- 250m
- Needs interface
- Easy to install
21RIPE NCC Test-Box
- PC
- 4u rack mount
- Celeron 600 MHz, 256 MB
- 99% unused
- 10/100 base-T Ethernet
- 2 disks on disk trays
- backups
- GPS interface
- Installation
- Pre-configured, plug and play
- Operated as a black box
22Software: Top Level DFD
- SendData
- Creates UDP packets
- ReceiveData
- BPF
- DoTraceroutes
- Routing vectors through a version of traceroute
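A minimal sketch of what a SendData-style process might do: put a sequence number and a send timestamp into a UDP payload. The 12-byte layout in `PROBE_FORMAT` and all function names are hypothetical, not the TTM packet format, and the real receiver takes its timestamps via BPF rather than in user space.

```python
import socket
import struct
import time

# Hypothetical layout: 32-bit sequence number, 64-bit send time in µs,
# both in network byte order.
PROBE_FORMAT = "!IQ"
PROBE_SIZE = struct.calcsize(PROBE_FORMAT)

def build_probe(seq: int, send_time_us: int) -> bytes:
    """Encode one probe payload."""
    return struct.pack(PROBE_FORMAT, seq, send_time_us)

def parse_probe(payload: bytes):
    """Decode (sequence number, send time in µs) from a payload."""
    return struct.unpack(PROBE_FORMAT, payload[:PROBE_SIZE])

def send_probe(sock: socket.socket, addr, seq: int) -> int:
    """Timestamp and send one probe; returns the send time in µs."""
    now_us = time.time_ns() // 1000
    sock.sendto(build_probe(seq, now_us), addr)
    return now_us
```

A caller would create the socket with `socket.socket(socket.AF_INET, socket.SOCK_DGRAM)` and pass the address of the remote test-box.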
23Accuracy of the measurements
- Plot of measurements on a single piece of
Ethernet
- Statistical errors of the order of 10 µs
24Expanding the network
- Test-boxes have been installed since 1998
- Currently about 60
- 100 planned late 2001
- Fairly optimistic
25TTM Service
- TBs can be bought from the RIPE NCC
- 2500, hardware costs
- Plug and play
- Service fee for operation
- 3000/year
- coffee and cookies
- System administration and maintenance
- Standard Analysis
- Host only has to look at the results
- Available to the entire community
- Also outside the RIPE NCC region
26Data collection and analysis
- Daily Analysis Job
- Collect results
- Insert Routing Vectors in DB
- Merge send and receive info
- Convert to ROOT
- Produce plots
- Create Satellite plots
- Data can then be used for further analysis
- ROOT: http://root.cern.ch
- Data analysis package
27Raw data sets
- Delays: ROOT files with
- Source and Destination IP-address and port
- Arrival time
- Delay
- Experimental error estimates
- Traceroutes
- MySQL DB, 2 tables
- All IP paths seen
- Times during which they were in use
- Data available for statistical and scientific
analysis
- There is an AUP, talk to us for details
28Presenting the data
- Passive: user has to look
- Daily plots
- Plots on Demand
- Routing Vectors
- ...
- Active: we warn the user
- Network alarms
29Plots on demand
30Plots on demand
31Plots on demand
32Network alarms (example)
- Measure delays over a long period
- Calculate median and percentile delays
- Compare against a short period
33Network alarms
- If 95% of the delays in the last 15 minutes are
outside the expected range, send an alarm
- Clear alarm when back to normal
- Warning for a NOC before customers complain
- Tunable
- Email today, SNMP in the future
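The alarm rule can be sketched as below. The slides do not say how the expected range is derived, so taking it as the 2.5th to 97.5th percentile of the long-term delays is an assumption, as are the function names and the nearest-rank percentile.

```python
import math

def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (pct in 0..100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[min(rank, len(ordered)) - 1]

def should_alarm(long_term, recent,
                 low_pct=2.5, high_pct=97.5, threshold=0.95):
    """True if at least `threshold` of the recent delays fall outside
    the range expected from the long-term delays."""
    low = percentile(long_term, low_pct)
    high = percentile(long_term, high_pct)
    outside = sum(1 for d in recent if d < low or d > high)
    return outside / len(recent) >= threshold

# Hypothetical delays in ms: a quiet path that suddenly gets slow.
long_term = [20 + (i % 10) * 0.5 for i in range(1000)]
normal_window = [21.0] * 30
bad_window = [80.0] * 29 + [21.0]
```

The percentiles and the threshold are the tunable part: a wider expected range or a higher threshold gives fewer, later alarms.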
34Other analysis work
- Studies comparing data with other experiments
- Papers in PAM2000 and PAM2001
- Verified our work
- Trends in the data
- M.Sc thesis
- Now turned into production code
35Analysis work in progress
- Trends in the data
- Summary numbers
- IPDV a.k.a. jitter
- Bandwidth and throughput
- Modeling of delays
- ...
36Overview
- Introduction
- History 101
- Test Traffic Measurements
- Routing Information Service
- Goals and motivation
- Set up of the RIS
- Some results
- Conclusions and questions
37Motivation
AS2
Router
Router
AS5
AS1
www.x.com
User
AS3
AS4
- AS1's NOC gets a user complaint
- "Last night, I could not reach www.x.com."
- AS1's NOC looks at the current routing tables
- "Well, it works now."
38Motivation
- Something is wrong with your routing
- Current tools
- Log in to your router
- Use a looking glass on other routers
- Problems
- How to find right looking glass?
- What if the looking glass cannot be reached
either?
- Accessing multiple LGs takes a lot of time
- No history mechanism
- Solution: Routing Information Service (RIS)
39Goals of the RIS
- Collect default-free, time-stamped BGP
announcements between ASs and store them in a
database
- At several points on the Internet
- Set up interactive queries to database
- Giant looking glass with history
- Network reachability from other networks
- Provide raw data
- for reality checks, RRCC project
- to generate trend analysis
- Available to the Community
40History of the RIS
- Fall 1998: First idea
- July 1999: Official start of the project
- July - December 1999: Proof of concept
- RIPE-200, design document
- First prototypes
- 2000/2001: Development of the RIS
- More collection points
- Better database machine
- More user queries, daily reports
- Now: Turn into a regular RIPE NCC service
- Next: Analysis and additional products
41Setup of the RIS
IX1
EBGP
Default free Border Router
Remote Route Collector
Data Collection
Database
database
BGP Updates RIB dumps
Statistical analysis
Command line Web Interface
Raw data
Email, Web
User Interface
42Data Collection: Remote Routing Collectors (RRC)
- Pentium III
- 2 NICs
- Peering LAN
- External connection
- Lots of memory and disk
- Collection software Zebra
43What does an RRC collect?
- Snapshots of routing table 3 times a day
- Why?
- BGP4 IPv4 announcements between ASs
- AS path, origin AS
- Route prefix and length
- BGP attributes
- Errors in the BGP announcements
- RRCs are BGP listeners
- They do not announce routes to peers
- They will not forward IP traffic to peers
44Where are the RRCs located?
- RIPE NCC, Amsterdam, NL
- EBGP all over the world
- LINX, London, UK
- SFINX, Paris, FR
- AMS-IX, Amsterdam, NL
- CIX, Geneva, CH
- VIX, Vienna, AT
- Otemachi, Tokyo, JP
- Plan to add a few more, total O(10)
45Data Base @ the RIPE NCC
RIPE NCC
RRC00
User Interface
Insertion Machine
Server Machine
RRC01
RRC02
Disks
Tape
46Data Volume (GBytes, estimated)
47Data sets
- Routing table dumps
- Updates since the last dump
- Available from our DB server
- http://www.ripe.net/ris/ris-index.html
- No restrictions
- Send us a mail if you publish something about the
data
48User Interface: Interactive Queries
- 4 Interactive Queries
- RIS Database by AS
- RIS Database by prefix
- ASInuse
- Status information
- Operational purposes
- Both are available through the RIS webpage
- Beta-service with no uptime guarantees (yet)
- More queries will be added in the future
- Feedback from the operators
49RIS User Interface (by AS)
50RIS User Interface (2)
51RIS User Interface (by prefix)
52RIS User Interface (4)
53ASInuse
- Shows if an AS is announced anywhere
- Intended for our Registration Service Department
- Other usage
- ISP starting to announce an AS
- Check if it is actually seen somewhere
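In essence, an ASInuse-style query asks whether an AS number appears in any AS path seen by any collector. A rough sketch, assuming routes are held in memory as `(collector, prefix, as_path)` tuples; the real service queries the RIS database, and the names and data here are hypothetical (documentation prefixes and AS numbers).

```python
def as_in_use(asn, routes):
    """Return the sorted names of collectors that see `asn` in any AS path.

    routes: iterable of (collector, prefix, as_path), where as_path is a
    list of AS numbers ending at the origin AS.
    """
    return sorted({collector
                   for collector, _prefix, as_path in routes
                   if asn in as_path})

routes = [
    ("rrc00", "192.0.2.0/24", [64500, 64496]),
    ("rrc01", "192.0.2.0/24", [64501, 64500, 64496]),
    ("rrc01", "198.51.100.0/24", [64501, 64497]),
]
seen_at = as_in_use(64496, routes)
unseen = as_in_use(64499, routes)
```

An empty result means no collector has seen the AS, which is the answer an ISP that has just started announcing would want to check.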
54ASInuse
55ASInuse
56User Interface: Daily Analysis
- There is lots of interest in daily analyses of
the BGP data
- The RIS has all the data
- Current implementations use only 1 view of the
Internet
- The RIS has the advantage of having dozens of
views of the Internet
57Daily Analysis
- Project to produce a daily report of the RIS
data
- Last 24 hours or
- Recent snapshot
- 2 views
- General view
- View for a specific peer (selected by the user)
- Will be available by Email
- Keep data for trend analysis
58Daily Analysis
RIS
Queries
- RIS Database
- SQL2RRD: Universal tool to do large amounts of
(similar) queries
- RRD Database
- Plots on the Web
- Last command to generate report
SQL2RRD
RRD
WWW
email
59RIS report
- Fully automated setup to produce daily updated
plots of the RIS data
- Flexible
- Expandable
- Day, Week, Month and Year view
- 10 year view in the future
- Documented
- Some screenshots showing the RIS report
- Do not copy the URL, it has changed!
60RIS report/Initial screen
61RIS report/BGP update count
62RIS report/Prefix distribution
63RIS report/Other plots
- Current plots
- BGP updates/minute
- Prefix distribution
- BGP distribution of types
- BGP unique announcements
- More plots and statistics can be easily added
- Personalized report possible by book-marking the
output pages
64User Interface: RRCC
- Routing Registry Consistency Check
- The RIPE NCC has a routing registry (RR)
- Data is maintained on a best effort basis by
LIRs.
- This leads to inconsistencies w.r.t. reality
- Routes registered and not announced
- Routes announced and not registered
- Routes announced with different aggregation
- ...
65RRCC (2)
- Take a snapshot of the routing table from the
RIS
- Take the current data from the RR
- Compare
- Produce a list of differences
- The owner of the data looks at this list and
- Uses the list to fix the RR
- Changes his router configurations
- Being beta-tested at the moment
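The comparison step amounts to set differences between what is registered and what is announced. A sketch under stated assumptions: both sides are reduced to `(prefix, origin AS)` pairs, and "different aggregation" is approximated as an announced prefix that overlaps a registered one with the same origin. The function name and data are hypothetical.

```python
import ipaddress

def consistency_check(registered, announced):
    """Compare two sets of (prefix, origin_as) pairs.

    Returns (registered_not_announced, announced_not_registered,
    reaggregated), where reaggregated holds announced pairs that
    overlap a registered prefix of the same origin AS.
    """
    reg_only = registered - announced
    ann_only = announced - registered
    reaggregated = set()
    for a_prefix, a_origin in ann_only:
        a_net = ipaddress.ip_network(a_prefix)
        for r_prefix, r_origin in reg_only:
            r_net = ipaddress.ip_network(r_prefix)
            if a_origin == r_origin and a_net.overlaps(r_net):
                reaggregated.add((a_prefix, a_origin))
    return reg_only, ann_only, reaggregated

registered = {("192.0.2.0/24", 64496),
              ("198.51.100.0/24", 64497),
              ("203.0.113.0/24", 64498)}
announced = {("192.0.2.0/24", 64496),
             ("198.51.100.0/25", 64497)}
reg_only, ann_only, reagg = consistency_check(registered, announced)
```

Each of the three result sets corresponds to one of the inconsistency types listed on the previous slide.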
66Analysis plans
- Expand RIS report
- Long list of variables that can be added
- Calculate routing table from snapshot plus
updates
- Black holes in IP address space
- Route flaps
- Statistics
- Experiments
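Calculating a routing table from a snapshot plus updates means replaying announcements and withdrawals on top of the last table dump. A minimal sketch for a single peer, assuming updates are `(timestamp, kind, prefix, as_path)` tuples with kind `"A"` (announce) or `"W"` (withdraw); this record format is an assumption, not the RIS storage format.

```python
def table_at(snapshot, updates, until):
    """Rebuild one peer's routing table as of time `until`.

    snapshot: dict mapping prefix -> AS path at the time of the dump
    updates:  iterable of (timestamp, kind, prefix, as_path)
    """
    table = dict(snapshot)            # do not modify the dump itself
    for ts, kind, prefix, as_path in sorted(updates, key=lambda u: u[0]):
        if ts > until:
            break
        if kind == "A":
            table[prefix] = as_path   # new or replaced route
        elif kind == "W":
            table.pop(prefix, None)   # withdrawn route
    return table

snapshot = {"192.0.2.0/24": [64500, 64496]}
updates = [
    (100, "A", "198.51.100.0/24", [64501, 64497]),
    (200, "W", "192.0.2.0/24", None),
    (300, "A", "192.0.2.0/24", [64502, 64496]),
]
at_250 = table_at(snapshot, updates, until=250)
at_300 = table_at(snapshot, updates, until=300)
```

Counting how often the same prefix is withdrawn and re-announced in such a replay would be one starting point for the route-flap statistics listed above.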
67Dark Addresses in IP space, May 11, 2000
[Plot of IPv4 address space]
68Overview
- Introduction
- History 101
- Test Traffic Measurements
- Routing Information Service
- Conclusions and questions
69URLs, Contact Addresses
- TTM
- http://www.ripe.net/test-traffic
- Papers
- Presentations
- For future test-box hosts
- ttm@ripe.net: TTM Crew @ NCC
- tt-wg@ripe.net: RIPE WG on this topic (Majordomo)
- RIS
- http://www.ripe.net/ris/ris-index.html
- Presentations
- Access to the data
- ris@ripe.net: RIS Crew @ NCC
- routing-wg@ripe.net: RIPE WG on this topic
(Majordomo)
70Conclusions
- TTM: Device to measure one-way delays, available
to the community
- RIS: Looking glass on drugs
- Data sets for research purposes
71Questions, Discussion