Title: Structure Preserving Anonymization of Router Configuration Data
1Structure Preserving Anonymization of Router
Configuration Data
- David A. Maltz, Jibin Zhan, Geoffrey Xie, Hui Zhang
- Carnegie Mellon University
- Gisli Hjalmtysson, Albert Greenberg, Jennifer Rexford
- AT&T Labs Research
2Why Configuration Files are Valuable
- Configuration file: program loaded on each router
- Controls operation of router
- Controls interactions between routers
- Configuration files allow researchers to study the details of real networks
- The problem is getting access to them
- We have developed a technique for anonymizing configuration files
- We have a proposal for how configs could be made accessible to the research community
3Why Configuration Files are Valuable - 2
- The set of configurations defines the network
- Captures many of the network's properties
- Topology (node degree, interconnectivity)
- Policies (CoS, QoS, packet filters, reachability)
- Routing (neighbors, OSPF weights, BGP policies)
- Security (vulnerabilities, mitigations)
- Only source of insight for Enterprise networks
- 10K networks that are currently a mystery
- Interesting! 10 to 1200 routers, global scale
- Configs are the only way to look at them
- Networks firewalled, external probes dropped
4Topology
Internet
Router 1 Config
Router 2 Config
interface Serial2/1.5 ip address 1.1.1.2/30
interface Serial1/0.5 ip address 1.1.1.1/30
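The two interface lines above share the subnet 1.1.1.0/30, which is how a link between the routers can be read out of their configs. A minimal sketch of that inference follows. The router names, the assignment of each interface to a router, and the helper `infer_links` are assumptions made for illustration; the slide does not say which interface belongs to which router.

```python
import ipaddress
import re
from collections import defaultdict

# Hypothetical per-router snippets modelled on the slide's example.
# Which interface sits on which router is an assumption.
configs = {
    "router1": "interface Serial1/0.5\n ip address 1.1.1.1/30\n",
    "router2": "interface Serial2/1.5\n ip address 1.1.1.2/30\n",
}

IFACE = re.compile(r"^interface\s+(\S+)")
ADDR = re.compile(r"^\s*ip address\s+(\d+\.\d+\.\d+\.\d+/\d+)")

def infer_links(configs):
    """Group interfaces by subnet; a subnet seen on 2+ routers suggests a link."""
    by_subnet = defaultdict(list)
    for router, text in configs.items():
        iface = None
        for line in text.splitlines():
            m = IFACE.match(line)
            if m:
                iface = m.group(1)
                continue
            m = ADDR.match(line)
            if m and iface:
                net = ipaddress.ip_interface(m.group(1)).network
                by_subnet[net].append((router, iface))
    return {str(net): ends for net, ends in by_subnet.items()
            if len({router for router, _ in ends}) > 1}

links = infer_links(configs)
print(links)
```

Because the inference relies only on addresses sharing a prefix, it still works on anonymized configs as long as the anonymizer preserves subnet relationships.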
5Quality of Service
- class-map GoodCustomer
- match access-group 136
- policy-map GoldService
- class GoodCustomer
- bandwidth 2000
- queue-limit 40
- class class-default
- fair-queue 16
- queue-limit 20
- interface Serial0/0
- service-policy output GoldService
Class definition
CB-WFQ parameters
CB-WFQ policy name
6Routing
AS Numbers
- router bgp 65501
- neighbor EdgeSwitch peer-group
- neighbor EdgeSwitch remote-as 64740
- neighbor EdgeSwitch distribute-list 11 in
- neighbor EdgeSwitch route-map exportRoutes out
- neighbor 192.168.96.8 peer-group EdgeSwitch
- neighbor 192.168.96.9 peer-group EdgeSwitch
- neighbor 10.217.248.14 remote-as 65500
- neighbor 10.217.248.14 ebgp-multihop 5
Policies
Peers
7Security Issues
- access-list 143 deny 53 any any
- access-list 143 deny 55 any any
- access-list 143 deny 77 any any
- access-list 143 permit ip any any
- interface Serial0.2 multipoint
- ip access-group 143 in
- ip address 66.248.162.13 255.255.255.224
- interface Ethernet0
- ip address 144.201.41.59 255.255.255.0
Access list 143: drops packets that can attack Cisco interfaces
Serial0.2 applies access list 143 inbound: this interface is safe
Ethernet0 applies no access list: this interface is not
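The check on this slide (which interfaces apply access list 143 inbound) can be automated over a config. The sketch below is illustrative only; `interfaces_missing_acl` is a hypothetical helper, and the config text is the slide's fragment written one command per line.

```python
import re

# Config fragment from the slide, one command per line.
config = """\
access-list 143 deny 53 any any
access-list 143 deny 55 any any
access-list 143 deny 77 any any
access-list 143 permit ip any any
interface Serial0.2 multipoint
 ip access-group 143 in
 ip address 66.248.162.13 255.255.255.224
interface Ethernet0
 ip address 144.201.41.59 255.255.255.0
"""

def interfaces_missing_acl(text, acl):
    """Return the interfaces that do not apply `acl` to inbound traffic."""
    protected = {}
    current = None
    for line in text.splitlines():
        m = re.match(r"interface\s+(\S+)", line)
        if m:
            current = m.group(1)
            protected[current] = False
        elif current and re.match(rf"\s+ip access-group\s+{acl}\s+in\b", line):
            protected[current] = True
    return [name for name, ok in protected.items() if not ok]

unprotected = interfaces_missing_acl(config, 143)
print(unprotected)  # ['Ethernet0']
```

The same check runs unchanged on anonymized configs, since access list numbers and command keywords are left intact.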
8How to Get Configuration Files?
- Considered proprietary secrets of network owners
- Discloses business strategy
- Discloses vulnerabilities
- Anonymization breaks tie between data and owner
- Anonymized configs will show some network is vulnerable, but which/where to attack?
- We developed a method for anonymizing configuration files
- Approach convinced some customers of AT&T to disclose their configs to CMU researchers
9Anonymization Challenges
- We don't know the intended use of the data
- Must anonymize entire configuration file
- A customized data set is easier to anonymize
- Must preserve structure of information in files
- Relationships of identifiers inside/between files
- IP address subnet relationships
- Traditional parsing tools are of no use
- No published grammar for Cisco IOS
- 200 different versions seen in 31 networks
10Anonymize Non-numeric Tokens
- Created pass list of words by string-scraping Cisco's web pages
- Contains most IOS commands
- Other words are generic networking terms (IETF)
- All tokens not in pass list are hashed with salted SHA1
Before:
router bgp 64780
 redistribute ospf 64 match route-map NYOffice
 neighbor 1.2.3.4 remote-as 701
route-map NYOffice deny 10
 match ip address 4
After:
router bgp 64780
 redistribute ospf 64 match route-map 8aTzlvBrbaW
 neighbor 66.253.160.68 remote-as 701
route-map 8aTzlvBrbaW deny 10
 match ip address 4
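The pass-list rule on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' tool: the pass list here is a tiny stand-in for the one scraped from Cisco's web pages, and the salt value, the base32 encoding, and the 11-character truncation are assumptions. Numeric tokens (including IP addresses) are passed through because the slides handle them in separate steps.

```python
import base64
import hashlib
import re

# Tiny illustrative pass list (assumption); the real list is far larger.
PASS_LIST = {"router", "bgp", "redistribute", "ospf", "match", "route-map",
             "neighbor", "remote-as", "deny", "ip", "address"}
SALT = b"example-secret-salt"  # hypothetical; must be kept private

def hash_token(token, salt=SALT, length=11):
    """Salted SHA-1 of the token, shortened to a printable identifier."""
    digest = hashlib.sha1(salt + token.encode("utf-8")).digest()
    return base64.b32encode(digest).decode("ascii")[:length]

def anonymize_line(line):
    """Hash every token that is neither on the pass list nor numeric."""
    out = []
    for tok in line.split(" "):
        if tok == "" or tok.lower() in PASS_LIST or re.fullmatch(r"[\d./]+", tok):
            out.append(tok)              # safe word, or number handled elsewhere
        else:
            out.append(hash_token(tok))  # unknown word: hash it out
    return " ".join(out)

print(anonymize_line("route-map NYOffice deny 10"))
```

Because the hash is deterministic for a given salt, every occurrence of `NYOffice` maps to the same identifier, so references between the `redistribute` line and the `route-map` definition survive anonymization.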
11Anonymize Specific Numbers
- Most numbers are harmless, some reveal identity
- Public AS numbers
- Phone numbers (NOCs, backup modems)
- 26 rules used to find and anonymize context-dependent items
- "neighbor\\sipAddrPatt\\sremote-as"
- " neighbor\s\w\sremote-as "
Before:
router bgp 64780
 redistribute ospf 64 match route-map NYOffice
 neighbor 1.2.3.4 remote-as 701
route-map NYOffice deny 10
 match ip address 4
After:
router bgp 64780
 redistribute ospf 64 match route-map 8aTzlvBrbaW
 neighbor 66.253.160.68 remote-as 1237
route-map 8aTzlvBrbaW deny 10
 match ip address 4
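One context-dependent rule of the kind described above might look like the following sketch. It is written under several assumptions that are not in the slides: AS numbers are 16-bit, the private range 64512 to 65535 is left unchanged, and public AS numbers are remapped through a keyed shuffle so that no two of them collide. The regular expression is a simplified stand-in for the authors' rules, and Python's `random` module is used only for brevity; it is not a cryptographic primitive.

```python
import random
import re

SALT = b"example-secret-salt"        # hypothetical secret, kept private
PRIVATE_ASNS = range(64512, 65536)   # private-use 16-bit AS numbers

# Keyed permutation of the public range, so two public ASNs never collide.
_public = list(range(1, 64512))
_shuffled = _public[:]
random.Random(SALT).shuffle(_shuffled)
ASN_MAP = dict(zip(_public, _shuffled))

def anonymize_asn(asn):
    """Leave private AS numbers alone; remap public ones."""
    if asn in PRIVATE_ASNS:
        return asn
    return ASN_MAP[asn]              # KeyError for values outside 1..65535

# The AS number is only rewritten when it follows "neighbor <x> remote-as".
REMOTE_AS = re.compile(r"(neighbor\s+\S+\s+remote-as\s+)(\d+)")

def apply_remote_as_rule(line):
    return REMOTE_AS.sub(
        lambda m: m.group(1) + str(anonymize_asn(int(m.group(2)))), line)

print(apply_remote_as_rule(" neighbor 66.253.160.68 remote-as 701"))
```

Anchoring the rule on its surrounding keywords is what lets harmless numbers elsewhere in the file, such as the `10` in `deny 10`, pass through untouched.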
12Limits of Anonymization
- Anonymization is a lossy process
- Comments and meaningful identifiers removed
- (Were they right anyway???)
- Anonymizer preserves relationships it knows about
- Doesn't know about IP addr <-> ASN mapping
- A packet filter, based on IP address, and route policy, based on ASN, could target same AS
- Post-anonymization both mechanisms preserved, but won't show them targeting same AS
- (Router didn't have that external information either)
13Potential Vulnerabilities: Textual Attacks
- Identifying information left in configs
- Heuristics used as double-check
- Rules that anonymize public AS numbers record the public AS numbers they find
- Search post-anonymization file for any remaining occurrences
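The double-check described above can be sketched as a scan of the anonymized output for any recorded public AS number that still appears as a whole number. The helper `find_leaks` and the sample text are hypothetical; hits are only candidates for manual review, since a small AS number can coincide with an unrelated number.

```python
import re

def find_leaks(anonymized_text, recorded_asns):
    """Map each recorded AS number to the line numbers where it still appears."""
    leaks = {}
    lines = anonymized_text.splitlines()
    for asn in sorted(recorded_asns):
        # Lookarounds keep 701 from matching inside 1701 or 7010.
        pattern = re.compile(rf"(?<!\d){asn}(?!\d)")
        hits = [n for n, line in enumerate(lines, 1) if pattern.search(line)]
        if hits:
            leaks[asn] = hits
    return leaks

# Hypothetical anonymized output in which one rule missed an occurrence.
sample = (
    "neighbor 66.253.160.68 remote-as 1237\n"
    "ip as-path access-list 5 permit _701_\n"
)
leaks = find_leaks(sample, {701})
print(leaks)  # {701: [2]}
```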
14Potential Vulnerabilities: Fingerprinting Attacks
- Network characteristics (fingerprint) extracted from anonymized configs matched against public data
- Potential fingerprints
- BGP community strings
- Number of POPs, number of BGP peers
- Structure of address space utilization
- Others
- Evaluation still in progress
- Seems like backbone networks are identifiable
- Seems like enterprise networks are not
15A Clearinghouse for Configuration Data
[Diagram: network owners and researchers interact through a website enforcing single-blind methodology]
- Network owners: retrieve anonymizer, anonymize test configs, upload configs
- Researchers: register with site, retrieve configs, analyze data
- Questions & results pass between the two sides by blinded email
- Run tools on site (scalable, pictures)
- Boot-strap with configs from academic/research institutions?
16Questions?
17Fingerprinting Attacks
[Plot: BGP peers per POP, by POP (sorted by peers/POP); data from networks in repository of anonymized configs]
- 1. For each anonymized network, compute fingerprint from anonymized config files
- Will be 100% accurate
- 2. Experimentally measure real networks
18Fingerprinting Attacks
[Plot: BGP peers per POP, by POP (sorted by peers/POP)]
- Evaluation still in progress
- Seems like backbone networks are identifiable
- Seems like enterprise networks are not
19Anonymize Regular Expressions
- Some AS numbers appear in regular expressions
- Expressions w/ only private AS numbers -> no change
- Expressions w/ public AS numbers -> expand and anonymize
ip as-path access-list 101 permit _70[1-3]_
Expand: 701, 702, 703
Anonymize: 1234, 543, 21
ip as-path access-list 101 permit _(1234|543|21)_
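The expand-and-anonymize step can be sketched by enumerating which AS numbers an expression matches, mapping each one, and emitting an alternation. The sketch assumes 16-bit AS numbers, handles only expressions of the form `_<pattern>_`, and uses Python's `re` syntax as a stand-in for Cisco's regular expression dialect. The helper names are hypothetical, and the mapping table is the one shown on the slide.

```python
import re

PRIVATE_ASNS = range(64512, 65536)   # private-use 16-bit AS numbers

def expand_as_regex(body):
    """List every 16-bit AS number whose decimal form the pattern matches."""
    pat = re.compile(body)
    return [asn for asn in range(1, 65536) if pat.fullmatch(str(asn))]

def anonymize_as_regex(expr, asn_map):
    m = re.fullmatch(r"_(.+)_", expr)
    if not m:
        raise ValueError("sketch only handles expressions of the form _<pattern>_")
    matched = expand_as_regex(m.group(1))
    if all(asn in PRIVATE_ASNS for asn in matched):
        return expr                   # only private AS numbers: no change
    mapped = [str(asn) if asn in PRIVATE_ASNS else str(asn_map[asn])
              for asn in matched]
    return "_(" + "|".join(mapped) + ")_"

asn_map = {701: 1234, 702: 543, 703: 21}   # mapping from the slide's example
result = anonymize_as_regex("_70[1-3]_", asn_map)
print(result)  # _(1234|543|21)_
```

Expansion is needed because anonymized AS numbers no longer share a digit pattern, so a character class such as `[1-3]` cannot describe them.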
20Anonymize IP Addresses
- Extended Minshall's prefix-preserving algorithm
- Made it class preserving
- Class A to Class A, etc.
- RIP and older protocols are classful
- Made it subnet address preserving
- Assume 128.2.0.0/16 is subnet
- We want 128.2.0.0 -> 150.7.0.0
- Before extension, 128.2.0.0 -> 150.7.43.66
21Anonymize IP Addresses - 2
- Made it special address preserving
- Multicast, private address space
- Must fix collisions in mapping function
[Flowchart: IP Addr, "Special?" decision (Y/N), Anonymize]
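The three extensions (class preserving, subnet address preserving, special address preserving) can be illustrated with a small sketch. It does not reproduce Minshall's table-based algorithm; it swaps in a keyed-hash construction in which each output bit depends only on the preceding input bits, which is one common way to get prefix preservation. The key, the list of special ranges, and the `subnet_len` parameter (the caller supplies the prefix length, for example from the netmask beside the address) are all assumptions.

```python
import hashlib
import hmac
import ipaddress

KEY = b"example-secret-key"   # hypothetical secret
SPECIAL = [ipaddress.ip_network(n) for n in
           ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "224.0.0.0/4")]

def is_special(addr):
    """True for private and multicast addresses, which pass through unchanged."""
    return any(addr in net for net in SPECIAL)

def _flip(prefix_bits):
    """Keyed pseudo-random bit that depends only on the preceding bits."""
    return hmac.new(KEY, prefix_bits.encode(), hashlib.sha1).digest()[0] & 1

def anonymize_ip(text, subnet_len=None):
    addr = ipaddress.IPv4Address(text)
    if is_special(addr):
        return str(addr)
    bits = format(int(addr), "032b")
    out = []
    for i, b in enumerate(bits):
        # Leading 1s and the first 0 (within 4 bits) decide the address class.
        class_bit = i < 4 and bits[:i] == "1" * i
        # An all-zero host part stays zero, so subnet addresses stay subnet addresses.
        host_zero = (subnet_len is not None and i >= subnet_len
                     and set(bits[subnet_len:]) == {"0"})
        if class_bit or host_zero:
            out.append(b)
        else:
            out.append(str(int(b) ^ _flip(bits[:i])))
    return str(ipaddress.IPv4Address(int("".join(out), 2)))

net = anonymize_ip("128.2.0.0", subnet_len=16)
host = anonymize_ip("128.2.43.66", subnet_len=16)
print(net, host)
```

An ordinary address can still be mapped into one of the special ranges. The slides say such collisions must be fixed but do not say how, so this sketch leaves that open; `is_special` can be applied to the output to detect them.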
22Anonymization Overview
- Minimize dependence on context
- If in doubt, hash it out
- Remove all comments
- Find all IP addresses and hash using specialized prefix-preserving anonymization
- Hash all non-numeric tokens not known to be safe
- Anonymize specific numeric tokens using regular expressions
- Anonymize regular expressions appearing in configs
23Why Configuration Files are Valuable
- Configuration file: program loaded on each router
- Controls operation of router
- Controls interactions between routers
- The set of configurations defines the network
- Captures many of the network's properties
- Policies, topology, routing, feature set
- Configs give insight on Enterprise networks
- These networks are currently a mystery
- Interesting things happen there
- Configs are the only way to look at them
- Networks firewalled, with external probes dropped
- Configs allow study of the details of real
networks
24Anonymization Overview
- Minimize dependence on context
- If in doubt, hash it out
- Use regular expressions to establish context when needed (examples)
- Remove all comments
- Anonymize public AS numbers (ASN)
25Anonymize IP Addresses
- Extended Minshall's prefix-preserving algorithm
- Made it class preserving
- Class A to Class A, etc.
- RIP and older protocols are classful
- Made it subnet address preserving
- If 128.2.0.0/16 is subnet, want 128.2.0.0 -> 150.7.0.0
- Before extension, 128.2.0.0 -> 150.7.43.66
- Made it special address preserving
- Multicast, private address space
- Must fix collisions in mapping function