Title: QoS and security using traditional services for new ends
1 QoS and security - using traditional services for new ends
- Henning Schulzrinne
- Dept. of Computer Science
- Columbia University
2 Overview
- Some impolite remarks about network research and QoS
- QoS challenges in real networks
- NATs and firewalls
- DOS
- reliability
- Permission-based networking
- GIMPS: next steps in signaling
3 Impolite remarks on QoS and network research
4 Lifecycle of technologies
traditional technology propagation: military → corporate → consumer
- military: opex/capex doesn't matter; expert support; "Can it be done?"
- corporate: capex/opex sensitive, but amortized; expert support; "Can I afford it?"
- consumer: capex sensitive; amateur; "Can my mother use it?"
5 Networking research is fashion-driven
[Figure labels: workshop, white paper; DARPA, NSF; EU Nth framework; trailing-edge research; Sigcomm, Infocom, Mobicom, ICNP; networking courses; First (European) workshop on X; YAP on X; secondary conferences; ATM, DQDB, QoS; mobile networks, wireless ad-hoc, sensor; active networks]
6 Impact of network research
- What's promising/interesting: two different axes
- Intellectual merit → interesting analysis, broadly applicable, ...
- Satisfies practical needs → may not be a scientific breakthrough
- Field has few grand challenges and metrics
- cf. speech understanding or face recognition
- Depends largely on external technology inputs
- faster CPUs, better optical gear, compression
- typical performance improvements in queueing: 20-50%
- Networking research impact
- on deployed systems and protocols?
- on understanding network behavior?
- on other papers?
- Which of the 10,000 QoS papers had real impact?
- What papers were responsible for most important networking advances?
- TCP?, web?, email?
7 Maturing network research
- Old questions
- Can we make X work over packet networks?
- All major dedicated network applications (flight reservations, embedded systems, radio, TV, telephone, fax, messaging, ...) are now available on IP
- Can we get M/G/T bits to the end user?
- Raw bits everywhere: any media, anytime, anywhere
- New questions
- Dependency on communications → Can we make the network reliable?
- Can non-technical users use networks without becoming amateur sys-admins? → auto/zeroconfiguration, autonomous computing, self-healing networks, ...
- Can we prevent social and financial damage inflicted through networks (viruses, spam, DOS, identity theft, privacy violations, ...)?
8 Observations on network research
- Frustration with inability to change network infrastructure in less than 10-20 year horizons
- IPv6
- Layer-3 multicast
- QoS
- Security
- Network research community has dismal track record for new applications
- web, IM, P2P (Gnutella, BitTorrent), ... vs. video-on-demand
- Niche applications get disproportionate attention
- active networks, ad-hoc networks, (structured) P2P
- successful applications don't care if they don't scale
- centralized IM search, unstructured P2P, ...
- Disconnect from standardization
- Few attempts to bring research work into standards bodies
- Standards bodies slow to catch up (e.g., P2P)
9 Why do good ideas fail?
- Research: O(.), CPU overhead
- per-flow reservation (RSVP) doesn't scale → not the problem
- at least now: networks routinely handle O(50,000) routing states
- Reality
- deployment cost of any new L3 technology is probably in the billions of dollars
- Cost of failure
- conservative estimate (1 grad student year = 2 papers)
- 10,000 QoS papers at $20,000/paper → $200 million
10 Cause of death for the next big thing
11 QoS
- QoS is meaningless to users
- difficult to engineer service that is consistently poor, but usable
- common QoS models now
- scavenger service (worse-than-best-effort) → self-protection
- DiffServ on access routers and NAT boxes
- care about service availability → reliability
- but most commercial service is good enough for VoIP/video/... most of the time
- charging model problem → users will arbitrage and buy basic quality except during congestion periods
- see multi-homing vs. high-end providers
- as more and more value depends on network services, can't afford random downtimes
12 Why did QoS (mostly) fail?
- hypothesis: The success of a technology is inversely proportional to the number of papers published before its popularity.
- ACM: 10,158 papers with "QoS" or "quality of service" in abstract
- IEEE: 7,297 papers
- real-time streaming video-on-demand → DVD via Netflix or TCP onto 200 GB hard disk
- bandwidth too cheap to meter
- undemocratic: some traffic is more equal than others
- reminds you of your mom: "no, you can't have that 10 Mb/s now"
- socialist: administer scarcity - we like SUVs (or to drive 100 mph)!
- risky scheme: security
- only displacement applications (such as telephony) need QoS
- requires cooperation: edge ISP, transit ISPs, end systems
- snake oil: add QoS, lose half your bandwidth
13 Why did QoS fail? (cont'd)
- dishonesty: we only talk about the beneficiaries
- network has become harder to evolve
- network address translation
- firewalls
- high packetization overhead (VPNs, IPv6)
- to be useful, has to be nearly universally supported ("no, you can't make calls to AS 123")
- network QoS vs. business class model: "coach is empty, please refund fare"
- currently, the ISP interface is IP and BGP: adding a third one is a big deal
- new Internet service model: TCP client (inside) to server (outside)
- exception: peer-to-peer on college campuses
- network to host: "you first", "no, you first"
- failure of IP QoS → success of MPLS
- more TE than QoS
14 Where did QoS technology succeed?
- Edge network
- VLAN prioritization
- 802.11e MAC layer priority
- IP TOS byte (not quite DiffServ): known since 1980s
- DOCSIS/PacketCable → application-initiated
- Mostly deals with self-interference
- No admission control
- No authorization (except DOCSIS)
15 Network reliability
- we don't know precisely why network applications fail
- components and backbones appear to be pretty reliable
- but we measured at 99.5% of usable time → far below 99.999% in telecom networks
- lots of possible culprits, including DNS and carrier interconnects
- temporary overloads
- reduce operator errors
- e.g., XCONF effort in IETF
- inherently safe or fail-safe protocols?
- faster convergence in routing protocols
- BGP → up to 20-30 minutes!
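To make the gap between the measured figure and the telecom target concrete, the two availabilities can be converted into expected downtime per year. This is a minimal sketch; the function name and the 365-day year are assumptions made for illustration, not part of the talk.

```python
# Convert an availability percentage into expected downtime per year.
# Assumption: a 365-day year; names are illustrative only.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

def downtime_minutes_per_year(availability_percent: float) -> float:
    """Expected minutes per year during which the service is unusable."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR

measured = downtime_minutes_per_year(99.5)    # figure measured on the Internet
telecom = downtime_minutes_per_year(99.999)   # "five nines" telecom target

print(f"99.5%   -> {measured / 60:.1f} hours/year")
print(f"99.999% -> {telecom:.2f} minutes/year")
print(f"ratio   -> {measured / telecom:.0f}x more downtime")
```

Under these assumptions, 99.5% corresponds to roughly 44 hours of downtime per year, against about 5 minutes for 99.999%.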
16 New applications: need for QoS?
- New bandwidth-intensive applications
- Reality-based networking
- (security) cameras
- Distributed games often require only low-bandwidth control information
- current game traffic ≈ VoIP
- Computation vs. storage vs. communications
- communications cost has decreased less rapidly than storage costs
- Emphasis on user control of communications
- from "anywhere, anytime, any media" to "where appropriate, my time, my media"
- Guess: #1 user-selected research problem: fix spam
- #2: keep cell phone from ringing in the movie theater
17 New network architectures for security
18 Security challenges
- DOS, security attacks → permissions-based communications
- only allow modest rates without asking
- effectively, back to circuit-switched
- Higher-level security services → more application-layer access via gateways, proxies, ...
- User identity
- problem is not availability, but rather over-abundance
19 Trustability: Internet decay
- Decay of inner cities: small number of bad elements, lack of social controls and law enforcement
- Small number of miscreants
- "The bulk of U.S. spam is coming from a very limited set of IPs with high-bandwidth connections," said Alperovitch, who estimated that the high-volume spamming addresses number fewer than 10,000 and the number of spammers at less than 200. (Informationweek, Aug. 2004)
- Naïve users
- with increasing firepower
20 Trustability problems
- Traditional security didn't solve user interface problem
- is citi-bank.com my bank or phishing?
- traditional firewall (crunchy outside, squishy inside)
- fails with any content: even JPEGs aren't safe
- email usability rapidly decreasing
- most spam proposals unlikely to work
- notion of "global village" is an oxymoron
- in a village, you know your neighbors
- on-going approaches useful, but limited
- conversion of protocols to secured versions (e.g., via TLS)
- prevent source address spoofing
- OS and application robustness against buffer overflow attacks
- IETF MARID (SenderID, SPF, ...) for email sender identification
- DOS traceback
- thus, may need to rethink network architecture
21 Trustability: A more polite Internet
- introduce yourself first
- shoot first, ask later (Bush)
- ask first, shoot later (Kerry)
- exchange: "may I send?" answered by "yes, up to 10 kb/s"
- limits large-scale DDOS
- more circuit-oriented
- may get permission slip for future use
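One way to picture the "may I send?" exchange is a receiver-side rate limiter: unknown senders get a modest default rate, and a sender that asks first is granted a higher ceiling. The sketch below is illustrative only. The class names, the 1 kb/s default, and the one-second burst allowance are assumptions; only the 10 kb/s grant comes from the slide.

```python
from dataclasses import dataclass

DEFAULT_RATE_BPS = 1_000    # assumed "modest rate without asking"
GRANTED_RATE_BPS = 10_000   # "yes, up to 10 kb/s"

@dataclass
class TokenBucket:
    rate_bps: float      # refill rate, bits per second
    burst_bits: float    # bucket depth (here: one second's worth)
    tokens: float = 0.0  # bits currently available
    last: float = 0.0    # time of the last update, in seconds

    def allow(self, now: float, size_bits: int) -> bool:
        """Refill for the elapsed time, then admit the packet if it fits."""
        elapsed = now - self.last
        self.tokens = min(self.burst_bits, self.tokens + elapsed * self.rate_bps)
        self.last = now
        if size_bits <= self.tokens:
            self.tokens -= size_bits
            return True
        return False

class PoliteReceiver:
    """Hypothetical receiver (or gateway) enforcing permission-based rates."""

    def __init__(self) -> None:
        self.buckets: dict[str, TokenBucket] = {}

    def may_i_send(self, sender: str, now: float, wanted_bps: int) -> int:
        """Answer a permission request; returns the granted rate in bits/s."""
        granted = min(wanted_bps, GRANTED_RATE_BPS)
        self.buckets[sender] = TokenBucket(granted, granted, tokens=granted, last=now)
        return granted

    def accept(self, sender: str, now: float, size_bits: int) -> bool:
        """Admit or drop a packet; senders that never asked get the default rate."""
        if sender not in self.buckets:
            self.buckets[sender] = TokenBucket(
                DEFAULT_RATE_BPS, DEFAULT_RATE_BPS, tokens=DEFAULT_RATE_BPS, last=now
            )
        return self.buckets[sender].allow(now, size_bits)

rx = PoliteReceiver()
print(rx.accept("stranger", 0.0, 800))        # fits in the default allowance
print(rx.accept("stranger", 0.0, 800))        # exceeds it, dropped
print(rx.may_i_send("friend", 0.0, 50_000))   # request is capped at the grant
print(rx.accept("friend", 0.0, 8_000))        # fits in the granted allowance
```

The per-sender state is what makes the scheme "more circuit-oriented": a flood from many unapproved sources is held to the default rate per source.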
22 Restoring the village part of the global village
- It's not what you know, it's who you know
- Authentication works only if addresses can be recognized by policy or human
- Doesn't work well for first-time contacts → much of communications
- won't be fixed by SPF and SenderID
- Need to leverage indirect knowledge
- our approach: social networks for recognizing users in SIP systems
- leverage knowledge across media: visiting web page enables receipt of email from related address → make phishing more difficult
23 GIMPS: a modular data plane signaling protocol
(with Robert Hancock, Hannes Tschofenig, S. van den Bosch, G. Karagiannis, A. McDonald, X. Fu and others)
24 Overview
- Signaling: application vs. data plane
- Resource control
- DiffServ vs. IntServ
- What's wrong with RSVP?
- Components of a general solution
- NSIS = NTLP (GIMPS) + NSLP
- Route change detection
25 Signaling: the big picture
[Figure labels: SIP proxy server; session signaling; off-path NE; off-path signaling; data; AS1; AS2; on-path signaling; datapath signaling]
26 Need for data plane state establishment
- Differentiated treatment of packets
- QoS
- firewall (loss = 100% vs. loss = 0%)
- Mapping state
- network address translation (NAT)
- Counting packets
- accounting
- Other state establishment
- setting up active network capsules
- MPLS paths
- pseudo-wire emulation (PWE): T1 over IP
- Related: visit subset of data path nodes, but don't leave state behind
- diagnostics → better traceroute
- link speeds, load, loss, packet treatment, ...
27 On-path vs. off-path signaling
- On-path (path-coupled): visit subset of routers on data path
- Off-path (path-decoupled): anything else, but presumably roughly along data path
- one proposal: one touch point for each AS
- bandwidth broker
- difficult part is resource tracking, not signaling
- No fundamental differences in protocol → separate out next-hop discovery to allow re-use
28 Differentiated packet handling
- Not just QoS, but also
- firewall
- network address translation
- accounting and measurement
[Figure labels: filter management; IntServ; DiffServ; traffic shaping, handling measurement; traffic filtering]
29 DiffServ vs. IntServ
- Filter always uses packet characteristic
- 5-tuple (protocol, source/destination address and port) or global label (TOS)
- multiple flows can be mapped to one treatment mechanism
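The point that both models reduce to "filter on a packet characteristic, then apply a treatment" can be sketched as a single classifier with two kinds of filters. All names here are hypothetical, and the choice to check the 5-tuple before the TOS label is an assumption, not something the slides specify.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Packet:
    proto: str
    src: str
    sport: int
    dst: str
    dport: int
    tos: int = 0  # global label carried in the IP header

class Classifier:
    """Maps packets to a treatment by 5-tuple (IntServ-like) or TOS (DiffServ-like)."""

    def __init__(self) -> None:
        self.by_flow: dict[tuple, str] = {}
        self.by_tos: dict[int, str] = {}

    def add_flow_filter(self, proto, src, sport, dst, dport, treatment: str) -> None:
        self.by_flow[(proto, src, sport, dst, dport)] = treatment

    def add_tos_filter(self, tos: int, treatment: str) -> None:
        self.by_tos[tos] = treatment

    def classify(self, p: Packet) -> str:
        # Assumption: a per-flow filter, if present, wins over the global label.
        key = (p.proto, p.src, p.sport, p.dst, p.dport)
        if key in self.by_flow:
            return self.by_flow[key]
        return self.by_tos.get(p.tos, "best-effort")

c = Classifier()
# Two distinct flows share one treatment mechanism.
c.add_flow_filter("udp", "10.0.0.1", 5004, "10.0.0.2", 5004, "voice-queue")
c.add_flow_filter("udp", "10.0.0.3", 5006, "10.0.0.2", 5006, "voice-queue")
c.add_tos_filter(0xB8, "voice-queue")

print(c.classify(Packet("udp", "10.0.0.1", 5004, "10.0.0.2", 5004)))
print(c.classify(Packet("tcp", "10.0.0.9", 40000, "10.0.0.2", 80, tos=0xB8)))
print(c.classify(Packet("tcp", "10.0.0.9", 40000, "10.0.0.2", 80)))
```

Only the lookup key differs between the two filter types; the treatment side is shared.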
30 The scaling bogeyman
"It doesn't scale!"
- Networks routinely handle large-scale per-flow state
- firewalls
- NATs
- scaling: cost per flow is constant (or decreasing)
- flow numbers are modest
- OC-48 can handle 31,875 DS-0 voice calls
- Mean call duration 9 min → 60 requests/second
- probably about 3 MB of data
- partially explained by poor initial RSVP explanations
- where flow search time O(N) rather than O(1)
- likely limitations are in AAA, not router signaling
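The 60 requests/second figure follows from Little's law (concurrent calls = arrival rate × mean holding time). The short check below reproduces it, along with the implied per-flow state size. Reading "3 MB" as 3 × 10^6 bytes is an assumption, and the variable names are illustrative.

```python
# Little's law: N = lambda * T, so lambda = N / T.
CONCURRENT_CALLS = 31_875      # DS-0 voice calls on an OC-48 (from the slide)
MEAN_HOLD_SECONDS = 9 * 60     # mean call duration of 9 minutes
STATE_BYTES = 3 * 10**6        # "about 3 MB", taken as 3 * 10^6 bytes (assumption)

def setup_rate(concurrent: int, hold_seconds: float) -> float:
    """Call setups per second needed to sustain `concurrent` calls."""
    return concurrent / hold_seconds

rate = setup_rate(CONCURRENT_CALLS, MEAN_HOLD_SECONDS)
bytes_per_flow = STATE_BYTES / CONCURRENT_CALLS

print(f"setup rate: {rate:.1f} requests/second")
print(f"state per flow: {bytes_per_flow:.0f} bytes")
```

The result is about 59 setups per second and under 100 bytes of state per flow, consistent with the slide's rounded numbers.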
31 RSVP characteristics
- soft-state: state vanishes if not refreshed
- two-pass signaling: path discovery, then reservation
- receiver-based resource reservation
- separation of QoS signaling from routing
- with some router feedback
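Soft state can be illustrated with a table whose entries carry an expiry time: a refresh pushes the expiry forward, and anything not refreshed disappears on its own. This is a generic sketch with hypothetical names and lifetimes, not RSVP's actual timer rules.

```python
class SoftStateTable:
    """Per-flow state that vanishes unless it is refreshed within its lifetime."""

    def __init__(self, lifetime_s: float) -> None:
        self.lifetime_s = lifetime_s
        self._expiry: dict[str, float] = {}  # flow id -> absolute expiry time

    def refresh(self, flow_id: str, now: float) -> None:
        """Install the state or extend it (an initial setup is just the first refresh)."""
        self._expiry[flow_id] = now + self.lifetime_s

    def remove(self, flow_id: str) -> None:
        """Explicit teardown; optional, since a timeout would clean up anyway."""
        self._expiry.pop(flow_id, None)

    def expire(self, now: float) -> list[str]:
        """Drop every entry whose lifetime has run out; return what was dropped."""
        dead = [f for f, t in self._expiry.items() if t <= now]
        for f in dead:
            del self._expiry[f]
        return dead

    def active(self, now: float) -> set[str]:
        self.expire(now)
        return set(self._expiry)

table = SoftStateTable(lifetime_s=90)
table.refresh("flow-a", now=0)
table.refresh("flow-b", now=0)
table.refresh("flow-a", now=60)   # only flow-a keeps refreshing
print(table.active(now=100))      # flow-b has timed out
```

If a sender crashes or a route moves, no explicit cleanup message is needed: the orphaned state simply times out.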
32 The problem with RSVP
- Designed for QoS establishment, used mostly for other things (RSVP-TE)
- Designed for large-scale IP multicast → customer never materialized
- adds significant complexity
- receiver-based → PATH, RESV
- designed for ASM (any-source) rather than SSM (source-specific)
- receiver-based: motivated by receiver diversity, not very useful in practice
- Designed in simpler days (1997)
- does not work well with mobile nodes (IP mobility or changing IP addresses)
- no support for NATs
- security mostly bolted on: non-standard mechanisms
- single-purpose, with no clear extensibility model
- very primitive transport mechanism
- either refresh or exponential decay (refresh reduction, RFC 2961)
33 The cost of multicast for RSVP
- reservation styles
- multiple senders in same group: shared vs. distinct
- sender selection: explicit vs. wildcard
- receiver-oriented
- motivated by heterogeneous receivers
- can do leaf-initiated join rather than root-initiated
- but still need periodic PATH to visit new sub-tree
- three different flow specs
- Sender_TSpec, ADSpec, (TSpec, RSpec)
- fairly tightly woven into core protocol
- state merging and management
- killer reservation (KR-II)
- generally, error handling problematic
[Figure: reservation merging example with request values 10, 20, 30, 40 and 60 and a "ResvErr!" label]
draft-fu-rsvp-multicast-analysis
34 IETF NSIS working group
- chartered in Dec. 2001, after BOF in March 2001
- Motivated by Braden's two-layer model (draft-lindell-waypoint, draft-braden-2level-signal-arch)
- Active participation from Roke Manor, Siemens, NEC Europe, Nokia, Samsung, Columbia
- Based partially on CASP protocol designed by Columbia/Siemens group and prototyped at UKy
35 NSIS protocol structure
[Figure: protocol stack. NSLP (C): QoS, NAT/FW, ...; NTLP (GIMPS): GIMPS; transport layer: UDP, TCP, SCTP; IP router alert]
- client layer does the real work
- reserve resources
- open firewall ports
- ...
- messaging layer
- establishes and tears down state
- negotiates features and capabilities
- transport layer
- reliable transport
36 NSIS properties
- Network friendly
- congestion-controlled
- re-use of state across applications
- application-neutral
- add more applications later
- transport neutral
- any reliable protocol
- initially, TCP and SCTP
- also, UDP for initial probing
- policy neutral
- no particular AAA policy or protocol
- interaction with COPS, DIAMETER needs work
- soft state
- per-node time-out
- explicit removal of state
- extensible
- data format
- negotiation
37 NSIS properties, cont'd.
- Topology hiding
- not recommended, but possible
- Light weight
- implementation complexity
- security associations (re-use)
- may not need kernel implementation
38 What is GIMPS?
- Generic signaling transport service
- establishes state along path of data
- one sender, typically one receiver
- can be multiple receivers → multicast (not in initial version)
- can be used for QoS per-flow or per-class reservation
- but not restricted to that
- avoid restricting users of protocol (and religious arguments)
- sender vs. receiver orientation
- more or less closely tied to data path
- initially, router-by-router (path-coupled)
- later, network (AS) path (path-decoupled)
39 NSIS network model: path-coupled
[Figure labels: selective; NTLP chain; QoS; QoS; QoS, midcom; omnivorous]
- NTLP nodes form NTLP chain
- not every node processes all client protocols
- non-NTLP node: regular router
- omnivorous: processes all NTLP messages
- selective: bypassed by NTLP messages with unknown client protocols
40 Network model: path-decoupled
[Figure labels: bandwidth broker; NAC; NTLP; AS15465; AS17; AS 1249; data]
- Also route network-by-network
- can combine router-by-router with out-of-path messaging
41 GIMPS messages
- Regular NTLP messages
- establish or tear down state
- carry client protocol
- datagram (D) or connection (C) mode
- Hop-by-hop reliability
- Generated by any node along the chain
42 NSIS transport protocol usage
- Most signaling messages are small and infrequent
- but
- not all applications → e.g., mobile code for active networks
- digital signatures
- re-"dialing" when resources are busy
- Need
- reliability → to avoid long setup delays
- flow control → avoid overloading signaling server
- congestion control → avoid overloading network
- fragmentation of long signaling messages
- in-sequence delivery → avoid race conditions
- transport-layer security → integrity, privacy
- This defines standard reliable transport protocols
- TCP
- SCTP
- Avoid re-inventing wheel → see SIP experience
43 GIMPS transport protocol usage
- One transport connection → many NSLP sessions
- may use multiple TCP/SCTP ports
- can use TLS for transport-layer security
- compared to IPsec, well-exercised key establishment
- not quite clear what the principal is
- re-use of transport →
- no overhead of TCP and SCTP session establishment
- avoid TLS session setup
- better timer estimates
- SCTP avoids HOL blocking
44 Message forwarding
- Route: stateless or state-full
- stateless: record route and retrace
- state-full: based on next-hop information in C node
- Destination
- address → look at destination address
- address + record → record route
- route → based on recorded route
- state forward → based on next-hop state
- state backward → based on previous-hop state
- State
- no-op → leave state as is
- ADD → add message (and maybe client) state
- DEL → delete message state
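The five destination modes listed above can be read as a dispatch on a flag in the message. The sketch below is one hypothetical rendering of that dispatch. The type and field names are invented for illustration, and a plain dictionary stands in for IP routing; it does not reproduce the actual GIMPS encoding.

```python
from dataclasses import dataclass, field
from enum import Enum, auto

class Dest(Enum):
    ADDRESS = auto()          # follow the destination address
    ADDRESS_RECORD = auto()   # follow the address and record the route
    ROUTE = auto()            # retrace a previously recorded route
    STATE_FORWARD = auto()    # use stored next-hop state
    STATE_BACKWARD = auto()   # use stored previous-hop state

@dataclass
class Message:
    session: str
    dest_flag: Dest
    dest_addr: str = ""
    route: list = field(default_factory=list)  # addresses of nodes visited

@dataclass
class Node:
    addr: str
    ip_next_hop: dict = field(default_factory=dict)  # dest addr -> next node
    next_hop: dict = field(default_factory=dict)     # session -> downstream peer
    prev_hop: dict = field(default_factory=dict)     # session -> upstream peer

def forward(node: Node, msg: Message) -> str:
    """Return the address of the node that should receive `msg` next."""
    if msg.dest_flag is Dest.ADDRESS:
        return node.ip_next_hop[msg.dest_addr]
    if msg.dest_flag is Dest.ADDRESS_RECORD:
        msg.route.append(node.addr)   # stateless: the message carries the path
        return node.ip_next_hop[msg.dest_addr]
    if msg.dest_flag is Dest.ROUTE:
        return msg.route.pop()        # retrace: most recently visited node first
    if msg.dest_flag is Dest.STATE_FORWARD:
        return node.next_hop[msg.session]
    return node.prev_hop[msg.session]

a = Node("A", ip_next_hop={"C": "B"})
b = Node("B", ip_next_hop={"C": "C"}, next_hop={"s1": "C"}, prev_hop={"s1": "A"})

request = Message("s1", Dest.ADDRESS_RECORD, dest_addr="C")
print(forward(a, request), forward(b, request), request.route)

reply = Message("s1", Dest.ROUTE, route=list(request.route))
print(forward(Node("C"), reply), forward(b, reply))
```

In the stateless case the nodes keep nothing and the reply retraces the recorded route; in the state-full case the same reply could instead use the STATE_BACKWARD flag and the per-session state stored in each node.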
45 Message format
[Figure: message layout: common header, extensions, client protocol data]
- No GIMPS distinction between requests and responses
- just routed in different directions
- client protocol may define requests and responses
- Common header defines
- destination flag
- state flag
- session identifier
- traffic selector: identify traffic "covered" by this session
- message sequence number
- response sequence number
- message cookie → avoid IP address impersonation
- origin address → may not be data source or sink
- destination address or scope
46 Message format, cont'd
- Limit session lifetime
- Avoid loops → hop counter
- Mobility
- dead branch removal flag
- branch identifier
- Record route: gathers up addresses of NSIS nodes visited
- Route: addresses that NSIS message should visit
47 Capability negotiation
- NSIS has named capabilities
- including client protocols
- Three mechanisms
- discovery: count capabilities along a path
- "10 out of 15 can do QoS"
- record: record capabilities for each node
- require: for scout message, only stop once node supports all capabilities (or-of-and)
- avoid protocol versioning
48 Next-hop discovery
- scout messages are special NSIS messages
- limited to < MTU size
- addressed to session destination
- UDP with router alert option → get looked at by each router
- reflected when matching NSIS node found
[Flowchart: "next IP hop NSIS-aware?" If no, use D mode to find next NSIS hop. "existing transport connection?" If no, establish transport connection. If yes to both, done.]
49 Mobility and route changes
- avoids session identification by end point addresses
- avoid use of traffic selector as session identifier
- remove dead branch
[Figure labels: DEL (B2); discovers new route on refresh; B1; ADD B2]
50 QoS-NSLP: resource reservation
- NSLP for signaling QoS reservations in the Internet
- both sender- and receiver-initiated reservations
- soft-state
- peer-to-peer signaling and refresh (rather than end-to-end)
- bundled sessions (e.g., video + audio)
- agnostic about QoS models (IntServ, DiffServ, RMD, ...)
51 QoS-NSLP: sender-initiated reservation
[Message sequence across QNI, QNE, QNE, QNR: RESERVE on each of the three hops, each with its own sequence number (RSN 4, RSN 17, RSN 3), followed by RESPONSE on each hop]
52 QoS-NSLP: receiver-initiated reservation
[Message sequence across QNI, QNE, QNE, QNR: QUERY on each of the three hops, then RESERVE on each hop, then RESPONSE on each hop]
53 QoS flow aggregation
[Figure labels: aggregate; QoS-NSLP style (RFC 3175); traffic sink (LAN); sinktree style (BGRP)]
54 Route change detection
- Don't want to wait for periodic rediscovery: delay of 30s
- Not all route changes matter
- e.g., only changes between NSIS routers
- Data plane detection
- TTL change of arriving data packets
- propagation delay change for data packets
- monitoring propagation delay (≈ min(e2e delay))
- increases in packet loss or jitter
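Two of the data-plane signals listed above, a TTL change and a shift in the minimum end-to-end delay, can be combined in a small detector. This is a rough sketch: the class name, the window size, and the 5 ms tolerance are all assumptions chosen for illustration.

```python
class RouteChangeDetector:
    """Flags likely route changes from the TTL and delay of arriving data packets."""

    def __init__(self, window: int = 20, delay_tolerance_ms: float = 5.0) -> None:
        self.window = window              # packets per delay estimate
        self.tol = delay_tolerance_ms     # shift in minimum delay that counts
        self.ttl = None                   # TTL of the previous packet
        self.baseline = None              # minimum delay of the previous window
        self.recent: list[float] = []     # delays seen in the current window

    def observe(self, ttl: int, delay_ms: float) -> list[str]:
        """Feed one packet; return the reasons (if any) to suspect a route change."""
        reasons = []
        if self.ttl is not None and ttl != self.ttl:
            reasons.append("ttl")         # hop count changed
        self.ttl = ttl

        # The minimum delay over a window approximates propagation delay,
        # since queueing only ever adds to it.
        self.recent.append(delay_ms)
        if len(self.recent) == self.window:
            current = min(self.recent)
            if self.baseline is not None and abs(current - self.baseline) > self.tol:
                reasons.append("delay")
            self.baseline = current
            self.recent = []
        return reasons

detector = RouteChangeDetector(window=3)
for ttl, delay in [(60, 20.0), (60, 22.0), (60, 21.0), (58, 35.0), (58, 36.0), (58, 34.5)]:
    print(ttl, delay, detector.observe(ttl, delay))
```

A positive result would only trigger an early rediscovery; as the slide notes, not every detected change affects the NSIS path.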
55 Route change measurements
- 12 measurement sites (looking glass)
- one traceroute every 15 minutes → 2.75 hours per pair
- availability 99.8%
- 0.1% repeated IP addresses
- 4.4% single hop with multiple IP addresses
- 422 route changes observed after data cleanup (13,074 records)
- 67 out of 422 also showed AS changes
- often, indicates multi-homing
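The 2.75-hour figure is consistent with each of the 12 sites cycling through its 11 peers at one traceroute every 15 minutes. That reading of the measurement schedule is an assumption; the snippet below only checks the arithmetic and derives the shares implied by the counts on the slide.

```python
SITES = 12
INTERVAL_MINUTES = 15        # assumed unit for the slide's "15"
ROUTE_CHANGES = 422
AS_CHANGES = 67
RECORDS = 13_074

# Assumption: a site probes its 11 peers in turn, one traceroute per interval,
# so a given pair is revisited once per full cycle.
hours_per_pair = (SITES - 1) * INTERVAL_MINUTES / 60

as_change_share = AS_CHANGES / ROUTE_CHANGES     # route changes that crossed ASes
change_share = ROUTE_CHANGES / RECORDS           # records that showed a change

print(f"revisit time per pair: {hours_per_pair} hours")
print(f"route changes with AS change: {as_change_share:.1%}")
print(f"records showing a route change: {change_share:.1%}")
```

Under this reading, roughly one route change in six also involved an AS change.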
56 Route changes
57 On-going and planned work
- Finish NTLP (GIMPS) and NSIS clients (NAT-FW and QoS)
- Longer term: off-path signaling (new WG?)
- New applications: diagnostics
- Mobility support
58 Conclusion
- QoS deployment: 25 year old technology at edge only
- can do 95% with 5% of the complexity
- Security concerns trump utilization optimization
- prioritize user traffic → deny resources to attacker
- GIMPS: a re-engineered generic signaling mechanism