Title: The importance of the network
1The importance of the network
Physical Network View
Overlay View
- From a distributed systems standpoint, the
physical network provides the backbone for
overlays.
- Distributed systems developers take for granted
that a node can talk (reliably, if need be) to
any other node in the same physically connected
component via some identifier (say, an IP address).
- Tools, such as network coordinates, can help
developers.
2End-to-End Arguments in System Design
CS525 What about the network?
- J.H. Saltzer
- D.P. Reed
- D.D. Clark
3Overview
- The end-to-end argument
- Examples/applications of the argument
- Discussion
4Function placement
- Should functions that end users/applications
perform be implemented at lower or higher levels?
- If we want to transfer a file reliably, what
should be the job of each computer subsystem?
- How about encryption? Delivery acknowledgement?
Duplicate message suppression?
5End-to-end argument
- The function in question can completely and
correctly be implemented only with the knowledge
and help of the application standing at the end
points of the communication system. Therefore,
providing that questioned function as a feature
of the communication system itself is not
possible. (Sometimes an incomplete version of the
function provided by the communication system may
be useful as a performance enhancement.)
11Careful file transfer
[Figure: file F is transferred from Computer A to Computer B]
- File transfer program on A asks file system to
read F from disk
- File transfer program on A asks communication
system to send file
- Communication system transmits packets
- Communication system gives F to file transfer
program on B
- File transfer program on B asks file system to
write F to disk
14What can go wrong?
[Figure: Computer A and Computer B with failure points marked A, B, and C]
- Reading from and writing to the file system (A)
- Breaking up file / reassembling file (B)
- Transmitting file over communication system (C)
15Possible solution 1
- Ensure each step by some form of error checking:
duplicate copies, redundancy, timeout and retry,
etc.
- Packet error checking at each hop
- Send every packet three times
- Acknowledge packet reception at each hop
16Problems with this solution
[Figure: Computer A and Computer B with failure points A and B still marked]
- Not complete: still requires application-level
checking
- May not be economical
17Possible solution 2
- End-to-end check and retry
- Application commits or retries based on checksum
value.
- If errors along the way are rare, this will most
likely finish on first try.
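The end-to-end check and retry loop can be sketched as follows. This is a minimal illustration, not code from the paper: `unreliable_channel` is a hypothetical stand-in for the whole path (file systems, buffers, network), SHA-256 is an assumed choice of checksum, and the expected checksum is assumed to reach the receiver intact.

```python
import hashlib
import random


def unreliable_channel(data: bytes, corrupt_prob: float, rng: random.Random) -> bytes:
    """Hypothetical lossy path: with probability corrupt_prob, flip one bit."""
    if data and rng.random() < corrupt_prob:
        i = rng.randrange(len(data))
        corrupted = bytearray(data)
        corrupted[i] ^= 0x01
        return bytes(corrupted)
    return data


def careful_transfer(data: bytes, corrupt_prob: float = 0.2,
                     max_tries: int = 20, seed: int = 0):
    """Send data, compare end-to-end checksums, and retry on mismatch.

    Returns (received_bytes, attempts_used).
    """
    rng = random.Random(seed)
    expected = hashlib.sha256(data).hexdigest()  # computed by the sending application
    for attempt in range(1, max_tries + 1):
        received = unreliable_channel(data, corrupt_prob, rng)
        if hashlib.sha256(received).hexdigest() == expected:
            return received, attempt  # checksums match: commit
    raise RuntimeError("transfer failed after max_tries attempts")
```

When corruption is rare, the loop usually returns on the first attempt, which is the point of the last bullet above.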
18Performance
- Lower levels can be reliable as a performance
booster
- Transferring large files
- Regardless of data communication, end-to-end
check must be done
- Tradeoff based on performance, not correctness
- Is the amount of effort put into the reliability
worth the performance gain?
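A rough way to see why lower-level reliability matters for large files: assuming (as a simplification not stated on the slide) that each packet is corrupted independently with probability `p_packet_error` and that any error forces a whole-file retry, the chance of finishing on the first try shrinks exponentially with file size. The function names below are hypothetical.

```python
def first_try_success(p_packet_error: float, n_packets: int) -> float:
    """Probability that all n_packets arrive intact, assuming independent errors."""
    return (1.0 - p_packet_error) ** n_packets


def expected_attempts(p_packet_error: float, n_packets: int) -> float:
    """Expected number of whole-file attempts (mean of a geometric distribution)."""
    p_ok = first_try_success(p_packet_error, n_packets)
    if p_ok == 0.0:
        return float("inf")
    return 1.0 / p_ok


if __name__ == "__main__":
    for n in (10, 1_000, 100_000):
        print(n, first_try_success(0.001, n), expected_attempts(0.001, n))
```

The end-to-end check is still required either way; making the lower level more reliable only reduces how often the retry is needed.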
19Delivery guarantee
[Figure: two message exchanges between Computer A and Computer B; one is answered with "RFNM", the other with "got it"]
- ARPANET returns RFNM to acknowledge successful
message delivery
- Is this really useful to end application?
20Data encryption
Computer A
Computer B
- Communication system needs keys
- Cleartext at host, before application
- Authenticity check must be performed
21Data encryption
Computer A
Computer B
- Keys are maintained by end application
- Ciphertext before application
- Authenticity by default (assuming both keys are
private)
22Identifying the ends
- Low-level bit checking is bad for real-time voice
transfer; high-level error checking is better.
- However, low-level reliability measures may be
fine if the voice is being stored.
23Discussion: Layering model
[Figure: protocol stacks. Computer A and Computer B: Application, Transport, Network, Data Link, Physical. Router: Network, Data Link, Physical.]
- TCP (usually) runs only at end hosts
- Does TCP violate end-to-end by being below
application?
- Is giving the application the option of TCP or
UDP the way to go?
24Discussion: TCP splitting
Computer B
Computer A
- Performance much better in wired section
- Intermediate node acts as end host
- What else can we do?
25Discussion: Spam
- The end user for email is generally considered to
be a human.
- By the end-to-end argument, the network should
deliver all mail to the user.
- Are spam control mechanisms therefore in
violation of the end-to-end argument?
- If so, is it an appropriate violation?
26Discussion: End-to-end today
- Is the end-to-end argument still valid today?
- Is hardware good enough that we don't have to
worry about end checks?
- Applications are becoming more and more complex.
- Do P2P systems, such as Chord, violate
end-to-end?
- Does in-network aggregation, such as in sensor
networks, violate end-to-end?
27Stable and Accurate Network Coordinates
- J. Ledlie et al. (Harvard University)
- In International Conference on Distributed
Computing Systems (ICDCS '06)
Some slides taken from the authors' presentation
28Outline
- Background
- Two Practical Problems
- Latencies are not static
- Changing coordinates is expensive
- Proposed Solutions
- Latency Filter
- Update Filter
- Conclusion
30Motivation of Network Coordinates
[Figure: a player and candidate game servers, nodes A through I, placed at 2-D network coordinates such as (0,8), (25,8), and (20,20); the RTT between A and B is marked]
Direct measurement is not scalable!
Predict latency by coordinates
Pick server with lowest mean latency for all
players.
Use centroid of network coordinates! Server A
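The centroid idea can be sketched in a few lines. The coordinates in the usage below are made up for illustration and are not the ones in the figure; `centroid` and `pick_server` are hypothetical helper names. Choosing the server nearest the players' centroid is a heuristic for low mean latency, not a guarantee of the optimum.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of coordinates."""
    n = len(points)
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / n for d in range(dims))


def pick_server(players, servers):
    """Return the name of the server whose coordinate is nearest the players' centroid.

    players: list of coordinate tuples
    servers: dict mapping server name -> coordinate tuple
    """
    c = centroid(players)
    return min(servers, key=lambda name: math.dist(servers[name], c))


if __name__ == "__main__":
    players = [(-10.0, 0.0), (10.0, 0.0), (0.0, 30.0)]   # illustrative values
    servers = {"s1": (0.0, 8.0), "s2": (25.0, 8.0)}      # illustrative values
    print(pick_server(players, servers))
```

No probing between players and servers is needed at selection time; only the coordinates are compared.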
31Benefits of NCs
- Estimate/Predict RTT without direct probing
- Scalability
- Make well-understood geometric algorithms
applicable to distributed systems problems
- Powerful abstraction
32How Network Coordinates Work
- A starts measurement to B.
- B replies with its coord. A deduces RTT.
- A computes estimate and error.
- A moves toward ideal coord, relative to B.
- Repeat with C, D, E.
- Predict to X.
[Figure: A at (100,80) measures RTT = 60ms to B at (70,40), then moves to (103,84); other neighbors C, D, E; X at (140,20)]
Estimate = ||(100,80) - (70,40)|| = 50ms
Error = RTT - Estimate = 60 - 50 = 10ms
Goal: minimize global prediction error
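One update step of this kind can be sketched as below. This is a simplified spring-style update with a fixed step size `delta`, which is an assumption: Vivaldi adapts its step size using error estimates. With `delta = 0.5`, the sketch reproduces the numbers on the slide, moving A from (100,80) to (103,84).

```python
import math


def predict_rtt(a, b):
    """Predicted RTT is the Euclidean distance between two coordinates."""
    return math.dist(a, b)


def update_coord(a, b, rtt_ms, delta=0.5):
    """Move coordinate a along the line through b to reduce the prediction error.

    delta is a fixed step size (assumed here for illustration).
    """
    estimate = predict_rtt(a, b)
    error = rtt_ms - estimate           # positive: a is too close to b
    if estimate == 0.0:
        return tuple(a)                 # direction undefined; leave a unchanged in this sketch
    unit = tuple((ai - bi) / estimate for ai, bi in zip(a, b))
    return tuple(ai + delta * error * ui for ai, ui in zip(a, unit))


if __name__ == "__main__":
    a_new = update_coord((100.0, 80.0), (70.0, 40.0), rtt_ms=60.0)
    print(a_new)
    print(predict_rtt(a_new, (140.0, 20.0)))  # prediction to X, with no probe sent to X
```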
33Vivaldi Network Coordinates
- Simple
- Adaptive
- Periodic RTT measurements with neighbors
- Refine coordinates (pulled or pushed by each
neighbor)
- Decentralized
- Works well in simulation
34Outline
- Network Coordinates
- Two Practical Problems
- Latencies are not static
- Changing coordinates is expensive
- Proposed Solutions
- Latency Filter
- Update Filter
- Conclusion
35Problem 1: Latencies are not Static
- Raw latency data have errors and change
RTT(A,C) = 5ms, 5ms, 6ms, 40ms, 41ms, 40ms
RTT(A,B) = 60ms, 60ms, 59ms, 1000ms, 70ms, 60ms
[Figure: nodes A, B, and C]
36Problem 1(a): Errors are Unpredictable
Three hours of measurements from berkeley to
uvic.ca
82% of measurements within 1ms of median
37Problem 1(b): Latencies can Change
Three days of measurements from ntu.edu.tw to
6planetlab.edu.cn
Need to remove noise, but remain adaptive
39Solution 1: Latency Filter
Problem: Latencies are not Static
- Filtering with histories
- Minimum of previous four samples works best.
History window over time (newest sample on the left, oldest on the right):
t0: 60 61 59 62
t1: 1000 60 61 59 (receives 1000ms RTT)
t2: 70 1000 60 61
How do they find out?
40Solution 1: Latency Filter
- General Moving Percentile (MP) filter
- h: size of the history window
- p: percentile returned as the prediction
- e.g., minimum of previous four samples
- h = 4, p = 25
- Run experiments on the 3-day trace, varying h and
p
- Evaluation metric: Relative Error
- h = 4, p = 25 achieves the lowest error
Relative Error = (RTT - Estimate) / RTT
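A moving percentile filter of this shape can be sketched as follows. The class name is hypothetical, and the nearest-rank percentile definition is an assumption (chosen so that h = 4, p = 25 returns the minimum of the window, as the slide states); the authors' implementation may compute percentiles differently.

```python
import math
from collections import deque


class MovingPercentileFilter:
    """Keep the last h RTT samples and return their p-th percentile."""

    def __init__(self, h=4, p=25):
        self.p = p
        self.history = deque(maxlen=h)  # oldest sample is dropped automatically

    def update(self, rtt_ms):
        """Add a raw sample and return the filtered RTT estimate."""
        self.history.append(rtt_ms)
        ordered = sorted(self.history)
        # Nearest-rank percentile (assumed definition).
        rank = max(1, math.ceil(self.p / 100 * len(ordered)))
        return ordered[rank - 1]


def relative_error(rtt, estimate):
    """Relative error as written on the slide."""
    return (rtt - estimate) / rtt


if __name__ == "__main__":
    f = MovingPercentileFilter(h=4, p=25)
    for sample in [62, 59, 61, 60, 1000, 70]:
        print(sample, f.update(sample))
```

Fed the samples 62, 59, 61, 60, 1000, 70 in arrival order, the 1000ms outlier never reaches the output, while a sustained shift in latency would appear once it fills the window.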
41Latency Filter in the Big Picture
Simple Thresholds
Sliding Windows
42Latency Filter in Practice
226 PlanetLab nodes (coord in 3D space)
[Video panels: Raw Coordinates vs. Latency Filter (h = 4, p = 25)]
Latency Filters eliminate outliers that cause
distortions of many coords all at once (e.g.,
minute 38 of the video)
44Problem 2: Changing Coordinates is Expensive
- Frequent coord change, even with Latency Filter
- App-specific cost
- e.g., cascading heavyweight process migration in
streaming DBs
- Most apps would prefer to be notified only when
significant change occurs
- Is it possible to tell apps less frequently and
retain high accuracy?
46Solution 2: Update Filter
Problem: Changing Coordinates is Expensive
- Distinguish system-level coordinates Cs from
application-level coordinates Ca
Simple Thresholds
Sliding Windows
47Solution 2: Window-based Update Filter
- Keep history of recent coordinates
- Divide history into two windows (sets):
current (newest) and start (oldest)
- When current and start diverge (by some metric),
update application with new coordinate
- Two Metrics
- Local Relative Distance
- Energy
50Update Filters: Local Relative Distance
- Remember nearest known neighbor
- Add coords to start and current windows
- Compare centroids of two windows
[Figure: nodes A, B, and C with distances d and dmin; system coordinates C0 to C5 split between Start Window Ws and Current Window Wc]
If ||Centroid(Ws) - Centroid(Wc)|| > d × ε
51Update Filters: Local Relative Distance
- Remember the nearest known neighbor
- Add coords to start and current windows
- Compare centroids of two windows
- Update app-level coordinate
[Figure: nodes A, B, and C with distances d and dmin; Start Window Ws holds C0 to C3, Current Window Wc holds C87 to C90]
If ||Centroid(Ws) - Centroid(Wc)|| > dmin × ε
Ca = Centroid(Wc)
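The window comparison can be sketched as below. Several details are assumptions made for illustration rather than the authors' design: the class name, fixed-size windows of `window` coordinates, measuring the neighbor distance from the start-window centroid, and clearing both windows after each update.

```python
import math
from collections import deque


def centroid(points):
    """Component-wise mean of a non-empty list of coordinates."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


class RelativeUpdateFilter:
    """Pass a new coordinate to the application only when the windows diverge."""

    def __init__(self, window=4, eps=0.3):
        self.window = window
        self.eps = eps
        self.start = []                       # Ws: oldest coords since last update
        self.current = deque(maxlen=window)   # Wc: newest coords
        self.app_coord = None                 # Ca

    def observe(self, sys_coord, nearest_neighbor):
        """Feed one system-level coordinate Cs; return True if Ca changed."""
        self.current.append(sys_coord)
        if len(self.start) < self.window:
            self.start.append(sys_coord)
        if self.app_coord is None:
            self.app_coord = tuple(sys_coord)  # first coordinate always goes to the app
            return True
        if len(self.current) < self.window:
            return False
        ws, wc = centroid(self.start), centroid(list(self.current))
        d_min = math.dist(ws, nearest_neighbor)
        if math.dist(ws, wc) > self.eps * d_min:
            self.app_coord = wc
            self.start = []                    # restart both windows (assumed policy)
            self.current.clear()
            return True
        return False
```

Scaling the threshold by the distance to the nearest neighbor means a node in a dense cluster reports small moves, while an isolated node reports only large ones.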
52Update Filters: Energy
- Use a statistical test that specifically measures
the Euclidean distance between two multi-D
distributions
- Input: A = {a1, a2, ..., an}, B = {b1, b2, ..., bm}
- Output: dAB = energy(A, B)
- If energy(Ws, Wc) > Threshold
- Update
Ca = Centroid(Wc)
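One common form of such a statistic is the Székely and Rizzo e-distance, sketched below. Whether the paper uses exactly this scaling is an assumption; the slide only says that `energy(A, B)` measures Euclidean distance between two multi-dimensional samples.

```python
import math


def mean_pairwise(xs, ys):
    """Mean Euclidean distance over all pairs (x, y) with x in xs and y in ys."""
    return sum(math.dist(x, y) for x in xs for y in ys) / (len(xs) * len(ys))


def energy(a, b):
    """E-distance between samples a and b (lists of coordinate tuples).

    Zero when the two samples are identical; grows as they separate.
    """
    n, m = len(a), len(b)
    between = mean_pairwise(a, b)
    within_a = mean_pairwise(a, a)
    within_b = mean_pairwise(b, b)
    return (n * m / (n + m)) * (2 * between - within_a - within_b)
```

Unlike a comparison of centroids alone, this statistic also takes the spread of each window into account.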
53Update Latency In Practice
- Latency Filter and Update Filter combined to
create a much more stable set of coords.
Latency Filter
Latency and Update Filters
54Conclusion
- Stable and Accurate NCs
- Latency Filters
- Remove outliers while adapting to change
- Update Filters
- Tell app only when necessary
55Discussion
- How can NCs help with:
- Unstructured overlay construction?
- Message routing?
- Content distribution (e.g., multicast)?
- Resource location?
- Resource placement?
- Can you think of others?
56CAIDA Tools Overview
- Developed and supported by CAIDA (Cooperative
Association for Internet Data Analysis) at Univ.
of California at San Diego
57Taxonomy
- Topology
- mapping Internet topology
- Workload characterization
- Passive monitoring
- Performance evaluation
- Active probing (e.g., rtt, link quality)
- Routing
- Route stability
- Visualization
- Massive datasets, complex data attributes
- Network management
- Monitoring, databases, visualization
58RRDTool: Round Robin Database Tool
- Industry standard application to store and
display time-series data
- e.g., network bandwidth, machine-room temperature,
server load average
- Round Robin data store
- Constant-sized database
59RRDTool Features
- Data Acquisition
- It is virtually impossible to collect data and
feed it into RRDtool at exact intervals, so
RRDtool interpolates automatically
- Consolidation
- Define consolidation functions (CF) (average,
minimum, maximum, total, last) and the interval
over which they are applied
- Maintained on the fly
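The round-robin store and on-the-fly consolidation can be illustrated with a small sketch. This is not RRDtool's API or file format: `RoundRobinArchive` and its parameters are hypothetical names, and the interpolation step is left out.

```python
from collections import deque
from statistics import mean


class RoundRobinArchive:
    """Constant-sized archive: each row consolidates several raw samples."""

    def __init__(self, slots, steps_per_slot, cf=mean):
        self.rows = deque(maxlen=slots)   # when full, the oldest row is overwritten
        self.steps_per_slot = steps_per_slot
        self.cf = cf                      # consolidation function, e.g. mean, min, max
        self._pending = []

    def update(self, value):
        """Add one raw sample; consolidate once enough samples have arrived."""
        self._pending.append(value)
        if len(self._pending) == self.steps_per_slot:
            self.rows.append(self.cf(self._pending))
            self._pending = []


if __name__ == "__main__":
    archive = RoundRobinArchive(slots=3, steps_per_slot=2, cf=max)
    for sample in range(1, 11):
        archive.update(sample)
    print(list(archive.rows))
```

Because the number of rows is fixed, storage never grows no matter how long data is collected; older detail is simply overwritten.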
60RRDTool Features (contd)
- Aberrant Behavior Detection
- An algorithm for predicting the value of a time
series one time step into the future.
- A measure of deviation between predicted and
observed values.
- A mechanism to decide if and when an observed
value or sequence of observed values is too
deviant from the predicted value(s).
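The three pieces listed above (a predictor, a deviation measure, and a decision rule) can be sketched with simple exponential smoothing. This is a deliberately simpler stand-in: RRDtool's own detection is based on Holt-Winters forecasting, and the class name, `alpha`, and `tolerance` here are assumptions for illustration.

```python
class AberrantDetector:
    """Flag observations that stray too far from a smoothed prediction."""

    def __init__(self, alpha=0.5, tolerance=3.0):
        self.alpha = alpha            # smoothing weight for new observations
        self.tolerance = tolerance    # allowed multiple of the smoothed deviation
        self.prediction = None        # one-step-ahead prediction
        self.deviation = 0.0          # smoothed absolute prediction error

    def observe(self, value):
        """Return True if value is too deviant from the current prediction."""
        if self.prediction is None:
            self.prediction = value   # first sample only seeds the predictor
            return False
        error = abs(value - self.prediction)
        aberrant = self.deviation > 0 and error > self.tolerance * self.deviation
        # Update the deviation measure and the prediction for the next step.
        self.deviation = self.alpha * error + (1 - self.alpha) * self.deviation
        self.prediction = self.alpha * value + (1 - self.alpha) * self.prediction
        return aberrant


if __name__ == "__main__":
    detector = AberrantDetector()
    print([detector.observe(v) for v in [10, 11, 10, 11, 10, 50]])
```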
61RRDTool Screenshots
Spam server statistics
Santa Claus and Xmas trees ?
Inbound and outbound traffic on a switch.
62Taxonomy Tool: Andover Internet Traffic Report
North America:
Avg. Response Time: 171
Avg. Packet Loss: 9
Total Routers: 50
Network up: 86
- Monitors how fast and reliable connections to
different parts of the world are.
- Several servers placed around the world ping a
specific router at the same time.
- The RTTs are averaged and compared to previous
RTTs from the same server over the past 7 days.
- A score from 0 (bad) to 100 (good) is assigned to
that router.
63Taxonomy Tool: Andover Internet Traffic Report
Global Index
Taken 3/13/07 at 9:25pm
http://www.internettrafficreport.com/
64Taxonomy Tool: Gomez Performance Network
- Allows a user to choose between approx. 60
globally located computers to constantly monitor
website responsiveness.
- Users can script custom transaction tests,
allowing GPN to monitor objects.
- Users can choose frequency of polling.
65Taxonomy Tool: Gomez Performance Network
Test Errors and Availability
Valid Tests: 70
Failed Tests: 1
Eliminated Tests: 0
Total Tests: 71
Test Success Rate: 98.59%
Response Time
Average: 14.274 seconds
Maximum: 132.86 seconds
Minimum: 1.56 seconds
Historical Average Response
Last 7 days: 13.25 seconds
Last 14 days: 8.842 seconds
Last 31 days: 5.658 seconds
Object Errors and Availability
Valid Objects: 3095
Failed Objects: 14
Total Objects: 3109
Object Success Rate: 99.55%
66Importance to P2P
- These tools can be utilized by P2P systems.
- When forming an overlay, may want to base the
decision on physical link quality.
- Discussion: What other uses do these tools have
for P2P systems?