Title: NSI Registry Engineering
1. NSI Registry Engineering Operations Update
- Ari Balogh
- VP of Engineering
- abalogh_at_netsol.com
2. High-Level Architecture
3. Registrar Growth
4. Average Daily Transactions
In millions, compared to Original Plan and New Projections (peak of 27.5M)
5. Total Transactions Summary
In millions (chart data labels: 19, 145, 38, 33, 38, 88, 49)
6. Availability Performance
- Service Level Agreement (SLA) allowances
- 8 hours total outage per month, 4 hours unplanned
- 3 seconds average for check domain (excluding worst 5%)
- 5 seconds average for add domain (excluding worst 5%)
- January observed performance
- 3.5 hours planned outage to implement governance issues, no unplanned
- 600 ms per check domain, 2.5 seconds per add
- February observed performance
- No planned or unplanned outages
- 700 ms per check domain, 2.6 seconds per add
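The SLA response-time allowances above are averages taken after excluding the slowest samples. A minimal sketch of that calculation follows, assuming "worst 5" means the slowest 5 percent of samples; the function name and the sample data are hypothetical and do not come from the registry's own tooling.

```python
def sla_average_ms(samples_ms, exclude_worst_percent=5):
    """Average response time after dropping the slowest samples.

    Assumption: "excluding worst 5" means the slowest 5 percent of samples.
    """
    if not samples_ms:
        raise ValueError("no samples")
    ordered = sorted(samples_ms)
    # Number of slowest samples to discard (rounded down).
    drop = len(ordered) * exclude_worst_percent // 100
    kept = ordered[: len(ordered) - drop]
    return sum(kept) / len(kept)


# Hypothetical data: 19 fast check-domain calls and one slow outlier.
samples = [100] * 19 + [10_000]
print(sla_average_ms(samples))
```

With these hypothetical samples the single outlier is discarded, so the reported average reflects the typical request rather than the worst case.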
7. Availability Performance
- March observed performance
- Two 2 hour planned outages, 1.25 hour unplanned outage
- 60 ms per check domain, 300 ms per add
- April observed performance
- 2.5 hours planned outage, no unplanned
- 78.5 ms per check domain, 319.5 ms per add
- May observed performance
- 2 hours planned outage, no unplanned
- 34.7 ms per check domain, 257.2 ms per add
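The monthly outage figures can be checked against the SLA allowances with a few lines of arithmetic. The sketch below assumes the allowance means at most 8 hours of total outage per month, of which at most 4 hours may be unplanned; the outage hours are taken from the slides, while the names are illustrative.

```python
SLA_TOTAL_HOURS = 8.0      # total outage allowed per month
SLA_UNPLANNED_HOURS = 4.0  # unplanned portion allowed per month

# (planned hours, unplanned hours) as reported on the slides.
observed = {
    "January": (3.5, 0.0),
    "February": (0.0, 0.0),
    "March": (2 * 2.0, 1.25),  # two 2 hour planned outages
    "April": (2.5, 0.0),
    "May": (2.0, 0.0),
}


def within_sla(planned, unplanned):
    """True if both the total and the unplanned allowance are respected."""
    total = planned + unplanned
    return total <= SLA_TOTAL_HOURS and unplanned <= SLA_UNPLANNED_HOURS


for month, (planned, unplanned) in observed.items():
    print(month, planned + unplanned, within_sla(planned, unplanned))
```

Under this reading, March is the tightest month at 5.25 hours of total outage, still inside the 8 hour allowance.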
8. A Root Performance - UDP Packets/Second
5 Minute Average
30 Minute Average
9. A Root Performance - Drops & Overflows
Drops - 5 Minute Average
Overflows - 5 Minute Average
10. J gTLD Performance - UDP Packets/Second
5 Minute Average
30 Minute Average
11. M gTLD Performance - UDP Packets/Second
5 Minute Average
30 Minute Average
12. The Infrastructure Problem
- SLA that incurs $500K/day outage and performance penalties
- Single shared database experiencing 30 - 90% per month OLTP growth
- Heavyweight stored procedures
- Sustained 50-70% utilization with peaks to 100% and no more easy software fixes
- Increasing extract duration for zones, Whois, registrar extracts, 5 - 14 hours
- Immature or end-of-life HA options for E4500
- Sun, Veritas, EMC version and support issues
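To put the OLTP growth figure in perspective, the sketch below estimates how quickly load doubles. It assumes the "30 - 90 per month" figure is a percentage growth rate that compounds monthly, which the slide does not state explicitly; the function name is illustrative.

```python
import math


def doubling_time_months(monthly_growth):
    """Months for load to double at a compounding monthly growth rate.

    monthly_growth is a fraction, e.g. 0.30 for 30 percent per month.
    """
    return math.log(2) / math.log(1 + monthly_growth)


for rate in (0.30, 0.90):
    print(f"{rate:.0%} per month: doubles in {doubling_time_months(rate):.1f} months")
```

Under that assumption, transaction load doubles roughly every one to three months, which is why sustained 50-70% utilization leaves very little headroom.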
13. DB Server Evaluation
- Evaluated top Unix machines
- Sun E10000, HP V2500, IBM S7A/S80
- Narrowed to E10000 and S7A/S80
- Conducted three-month live test of S7A/S80
- Ported gateway and application servers to IBM Java environment
- Created RRP path configuration
- Demonstrated performance and availability (HA/CMP)
- Investigated impacts of E10K
- Different administrative model
- EMC integration issues
14. Definitive Results
- Excellent Java and C code portability
- S80 clear performance leader, benchmarks and real-world
- Approximately 3 times the throughput per CPU vs. E10K
- Noticeably improved Java performance (!)
- Robust HA implementation
- Complete 64-bit environment
- Native file system and volume management, excellent EMC integration
- Impressive and thorough support
- Demonstrated appreciation for multi-vendor, mission-critical computing
15. Scaling DNS
- Domain name resolutions on A Root
- 4Q99 - 220M per day
- 1Q00 - 430M per day
- 2Q00 - 650M per day
- 4Q00 - 1.5B per day, more?
- Need 64-bit machines to scale past 4GB/23M domain name wall
- Developing BIND extensions for high performance gTLD
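The daily resolution counts translate into average per-second query rates, and the 4GB/23M figure implies a rough memory footprint per domain name. The sketch below works through both; it assumes "4GB" means a full 32-bit address space (2^32 bytes) and that the load is spread evenly over the day, so real peaks would be higher. The names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# A Root resolutions per day from the slide (4Q00 is the slide's projection).
resolutions_per_day = {
    "4Q99": 220e6,
    "1Q00": 430e6,
    "2Q00": 650e6,
    "4Q00": 1.5e9,
}


def average_qps(per_day):
    """Average queries per second, assuming an even spread over the day."""
    return per_day / SECONDS_PER_DAY


def bytes_per_name(address_space_bytes=2**32, names=23e6):
    """Approximate memory per domain name if 23M names fill a 4 GB space."""
    return address_space_bytes / names


for quarter, per_day in resolutions_per_day.items():
    print(quarter, round(average_qps(per_day)), "queries/second on average")
print(round(bytes_per_name()), "bytes per domain name (approximate)")
```

On these assumptions the 4Q99 load averages about 2,500 queries per second, the 4Q00 projection about 17,000, and the 32-bit limit corresponds to under 200 bytes per name.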
16. 64-bit DNS Evaluation
- Engaged Unix vendors to aid with in-house evaluation of 64-bit mid-range Unix servers
- HP N4000, IBM H70, Sun E3500
- E3500 eliminated early -- scale and 64-bit issues
- H70 within 15% of N4000, upcoming upgrade substantially faster
- Chose M80 as new root/gTLD platform
- Using E4500s as alternate platform and placeholder for UltraSPARC III generation
17. The Dot Problem
Resolutions per day. A Root meltdown?
18. Dot Diagnosis and Fix
- Too much load for existing E450
- Qualified and put into production the evaluation H70
- Greater than 60% increased throughput
- Jump from 220M resolutions per day to over 400M
- Qualified and put into production an S80 as placeholder for upcoming M80 deployment
- Greater than factor of three improvement over previous E450
- Tweaked TCP keepalive defaults and BIND select loop
- Filtered dynamic updates
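As a quick sanity check on the figures above, the jump in daily resolutions can be expressed as a percentage increase. This is plain arithmetic on the slide's numbers; 400M is used as a lower bound because the slide says "over 400M", and the function name is illustrative.

```python
def percent_increase(before, after):
    """Relative increase from before to after, in percent."""
    return (after - before) / before * 100


# 220M resolutions per day before the H70, over 400M after.
print(f"{percent_increase(220e6, 400e6):.0f}% more resolutions per day")
```

The result is a little over 80 percent, which sits above the "greater than 60" throughput figure quoted for the H70. The slide does not say whether the two numbers measure the same thing.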
19. The New Dot
A Root resolutions per day with H70
20. Packet Drops
Percent packets dropped, day of H70 deployment
Deployed 11 a.m.
Current time (9 a.m. day after)
21. Upcoming access - www.dnsentral.net