Title: A General Introduction to Tomography
1A General Introduction to Tomography: Link Delay Inference with EM Algorithm
- Presented by Joe, Wenjie Jiang
- 21/02/2004
2Outline of Talk
- Why tomography?
- Introduction to tomography
- Internal Link Delay Inference
- Basic EM
- A simple example to infer internal link delay using EM algorithm
- Conclusion
3Terminology: Tomography
Brain tomography: access is difficult!
Network tomography: access is difficult!
The term "network tomography" was introduced by Vardi (1996).
4Why tomography?
- What is the
- Bandwidth?
- Loss rate?
- Link Delay?
- Traffic demands?
- Connectivity of links in the network? (Topology Inference)
Path: a connection between two end nodes, consisting of several links.
Link: a direct connection with no intermediate routers/hosts.
5Motivation
- Identify congestion points and performance bottlenecks
- Dynamic routing
- Optimized service provisioning
- Security: detection of anomalous/malicious behavior
- Capacity planning
6Why tomography - Difficulty
- Decentralized, heterogeneous and unregulated nature of the internal network.
- No incentive for individuals to collect and distribute this information freely.
- Collecting all statistics imposes an impracticable overhead expense.
- ISPs regard the statistics as highly confidential.
- Relaying measurements to the decision-making point consumes bandwidth.
7Why tomography - Solution
- Widespread internal network monitoring is expensive and infeasible
- Edge-based measurement and statistical analysis is practical and scalable
8Brain Tomography
9Network Tomography
11Introduction to tomography
- Use a limited number of measurements to infer network (link) performance parameters, using
- -- Maximum Likelihood Estimation
- -- Expectation Maximization
- -- Bayesian Inference
- and assuming a prior model.
- Categories of problems
- -- Link level parameter estimation
- -- Sender-Receiver traffic intensity.
- -- Topology Inference
12Introduction to tomography (2)
- Two forms of network tomography
- -- link-level metric estimation based on end-to-end traffic measurements (counts of sent/received packets, time delays between sent/received packets)
- -- path-level (sender-receiver path) traffic intensity estimation based on link-level measurements (counts of packets through nodes)
- Passive or active measurements?
- Multicast or unicast?
13Problem Description
- To solve the linear system $y = A\theta + \epsilon$
- $A$, $\theta$ and $\epsilon$ have special structures.
- Goal: to maximize the likelihood function
14Problem Description (2)
- $A$: routing matrix (graph)
- $\theta$: packet queuing delays for each link
- $y$: packet delays measured at the edge
- $\epsilon$: noise, inherent randomness in traffic measurements
Statistical likelihood function
15Problem Description (3)
[Figure: a virtual multicast tree with seven links l1 ... l7 and four receivers Y1 ... Y4]
A virtual multicast tree with four receivers
$Y_1 = X_1 + X_2 + X_4$
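The relation for Y1 is one row of a noise-free linear system y = A x, where A marks which links lie on each receiver's path. The following is a minimal sketch of that idea. Only the path of Y1 (links 1, 2, 4) comes from the slide; the wiring of the other three paths, the function names and the per-link delays are assumptions made for illustration.

```python
# Noise-free instance of y = A * x for a four-receiver tree.
# ASSUMPTION: link 1 feeds links 2 and 3; link 2 feeds links 4 and 5;
# link 3 feeds links 6 and 7. Only the path of Y1 (links 1, 2, 4) is
# given on the slide; the other three paths are illustrative.
PATHS = {
    "Y1": [1, 2, 4],
    "Y2": [1, 2, 5],
    "Y3": [1, 3, 6],
    "Y4": [1, 3, 7],
}
NUM_LINKS = 7


def routing_matrix(paths, num_links):
    """Return A with A[r][l-1] = 1 if link l lies on the path to receiver r."""
    return [[1 if link in path else 0 for link in range(1, num_links + 1)]
            for path in paths.values()]


def end_to_end_delays(matrix, link_delays):
    """Compute y = A * x (no noise term) with plain Python lists."""
    return [sum(a * x for a, x in zip(row, link_delays)) for row in matrix]


A = routing_matrix(PATHS, NUM_LINKS)
x = [2, 1, 0, 3, 0, 1, 4]          # hypothetical per-link delays X1..X7
y = end_to_end_delays(A, x)        # y[0] is X1 + X2 + X4
```

Each row of A is a 0/1 indicator of a path, which is the "special structure" that tomography methods exploit.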
17Physical Topology
Measure end-to-end (from sender to receiver)
delays
18Logical Topology
Logical topology is formed by considering only
the branching points in the physical topology
Infer the logical link-level queuing delay
distributions!
19The basic idea of internal link delay tomography
Send a back-to-back packet pair from a sender, each packet heading to a different receiver.
Use the fact that delays are highly correlated on shared links.
The queuing delay difference between the two end-to-end measurements can be attributed to the unshared links.
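The cancellation of the shared-link delay can be checked with a small simulation. This is only a sketch under idealized assumptions: a two-receiver tree, identical delay for both packets on the shared link, and integer delays drawn uniformly at random. The function name and the delay range are hypothetical.

```python
import random


def send_packet_pair(rng, max_delay=3):
    """Simulate one back-to-back pair on a two-receiver tree.

    ASSUMPTION: both packets see the same delay x1 on the shared link and
    independent delays x2, x3 on the two unshared links.
    """
    x1, x2, x3 = (rng.randint(0, max_delay) for _ in range(3))
    y1, y2 = x1 + x2, x1 + x3          # end-to-end delays at the receivers
    return (x1, x2, x3), (y1, y2)


rng = random.Random(0)                  # fixed seed for repeatability
pairs = [send_packet_pair(rng) for _ in range(1000)]
# The shared delay cancels: y1 - y2 depends only on the unshared links.
shared_cancels = all(y1 - y2 == x2 - x3 for (x1, x2, x3), (y1, y2) in pairs)
```

In real networks the two packets only see approximately equal delays on shared links, so the identity holds approximately rather than exactly.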
20Delay Estimation
- Measure end-to-end delay of packet pairs
Packets experience the same delay on link 1
$d_2 = d_{\min} = 0$
$d_3 > 0$
Extra delay on link 3!
21Packet-pair measurements
- Key Assumptions
- Fixed known routes
- Temporal independence
- Spatial independence
- Packet-pair delays are identical on shared links.
N delay measurements in all
22Parameters
$a_i$: parameter of the delay pmf on link i
[Figure: tree whose nine links are labeled with the parameters a1 ... a9]
23Link delay model
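- $a_i$: delay pmf on link i
- Link delay model could be multinomial
- Quantized delay model: delay $\in \{0, 1, 2, 3, \dots, L, \infty\}$
- $a_i = (a_{i,0}, a_{i,1}, a_{i,2}, \dots, a_{i,L}, a_{i,\infty})$
- $a_{i,j} = P(\text{delay on link } i = j)$
- $a_{i,0} + a_{i,1} + a_{i,2} + \dots + a_{i,L} + a_{i,\infty} = 1$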
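To make the quantized model concrete, here is a small sketch that maps delays to levels and builds a pmf that sums to 1. It assumes that the overflow level (infinity) collects both lost packets and delays beyond level L, and that the bin width and the sample values are made up. It also treats the link delays as directly observed, which is precisely what tomography cannot do; the point is only to show the shape of the pmf.

```python
import math


def quantize_delay(delay_ms, bin_width_ms, num_levels):
    """Map a queuing delay to a level in {0, ..., L} or math.inf.

    ASSUMPTION: delays beyond level L and lost packets (delay_ms is None)
    share the overflow level, written as infinity.
    """
    if delay_ms is None:
        return math.inf
    level = int(delay_ms // bin_width_ms)
    return level if level <= num_levels else math.inf


def empirical_pmf(levels, num_levels):
    """Relative frequency of each level; the values sum to 1."""
    support = list(range(num_levels + 1)) + [math.inf]
    return {s: levels.count(s) / len(levels) for s in support}


samples_ms = [0.2, 1.4, 2.9, 7.5, None, 0.7]     # hypothetical delays
levels = [quantize_delay(d, bin_width_ms=1.0, num_levels=3) for d in samples_ms]
pmf = empirical_pmf(levels, num_levels=3)
```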
24Goal
$p(y^{(n)} \mid a)$ is the probability of the n-th measurement.
$p(y \mid a) = \prod_{n=1}^{N} p(y^{(n)} \mid a)$ is the probability of all measurements.
Our goal: find $\hat{a} = \arg\max_a p(y \mid a)$.
26Review of MLE (Maximum Likelihood Estimation)
27Review of MLE (Maximum Likelihood Estimation)
- The basic idea of MLE: the event that was actually observed is assumed to be the one with the highest probability, so the MLE of $\theta$ is the value that makes the observed sample most likely.
- Note: we assume $X = \{x_1, \dots, x_N\}$ to be i.i.d.
- The solution could be easy or hard depending on the form of $p(X \mid \theta)$.
- e.g. if $p(x \mid \theta)$ is a single Gaussian with $\theta = (\mu, \sigma^2)$, we can set the derivative of $\log L(\theta \mid X)$ to zero and solve it directly.
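For the single-Gaussian case, setting the derivative of the log-likelihood to zero yields the sample mean and the divide-by-N sample variance. A minimal sketch follows; the function names and the data are hypothetical.

```python
import math


def gaussian_mle(samples):
    """Closed-form MLE of (mu, sigma^2) for i.i.d. Gaussian samples.

    Setting the derivative of the log-likelihood to zero gives the sample
    mean and the (biased, divide-by-N) sample variance.
    """
    n = len(samples)
    mu = sum(samples) / n
    var = sum((x - mu) ** 2 for x in samples) / n
    return mu, var


def log_likelihood(samples, mu, var):
    """log L(theta | X) = sum over n of log N(x_n; mu, var)."""
    return sum(-0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)
               for x in samples)


data = [2.0, 4.0, 4.0, 5.0, 5.0]        # hypothetical sample
mu_hat, var_hat = gaussian_mle(data)
```

Evaluating `log_likelihood` at nearby parameter values gives a quick numerical check that the closed-form estimate is a maximum.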
28Complete Data
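- The sample $X = \{x_1, \dots, x_N\}$ together with the missing (or latent) data $Y$ is called the complete data.
- The complete likelihood is $L(\theta \mid X, Y) = p(X, Y \mid \theta)$,
- where $p(x, y \mid \theta)$ is the joint density of X and Y given the parameter $\theta$.
- The complete log-likelihood is $\log L(\theta \mid X, Y) = \log p(X, Y \mid \theta)$.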
29Complete MLE
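- By the definition of conditional density, $p(x, y \mid \theta) = p(y \mid x, \theta)\, p(x \mid \theta)$,
- where $p(y \mid x, \theta)$ is the conditional density of Y given $X = x$ and $\theta$.
- The complete MLE: $\hat{\theta} = \arg\max_\theta \log p(x, y \mid \theta)$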
30Basic idea of EM
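- Given $X = x$ and $\theta = \theta^{t-1}$, where $\theta^{t-1}$ is the current estimate of the unknown parameters,
- $\log p(x, Y \mid \theta)$ is a function of Y whose unique best Mean Squared Error (MSE) predictor is the conditional expectation $Q(\theta, \theta^{t-1}) = E[\log p(x, Y \mid \theta) \mid x, \theta^{t-1}]$.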
31EM steps
32The magic of EM
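- The direct MLE of $\log p(x \mid \theta)$ is relatively hard to solve.
- But the MLE of the complete log-likelihood $\log p(x, y \mid \theta)$ is relatively easier to obtain.
- Since $\log p(x, y \mid \theta)$ is a function of x and y (y is hidden), we take its expectation over y given x and $\theta^{t-1}$.
- So
E-step: $Q(\theta, \theta^{t-1}) = E[\log p(x, Y \mid \theta) \mid x, \theta^{t-1}]$
M-step: $\theta^{t} = \arg\max_\theta Q(\theta, \theta^{t-1})$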
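The slides do not show why alternating these two steps is safe, so here is the standard argument, sketched in the notation $Q(\theta, \theta^{t-1}) = E[\log p(x, Y \mid \theta) \mid x, \theta^{t-1}]$. The symbol $H$ is introduced here only for this derivation.

```latex
% Decompose the observed-data log-likelihood, taking expectations over
% Y ~ p(y | x, \theta^{t-1}):
\log p(x \mid \theta)
  = \underbrace{E\big[\log p(x, Y \mid \theta) \mid x, \theta^{t-1}\big]}_{Q(\theta,\, \theta^{t-1})}
  \;-\; \underbrace{E\big[\log p(Y \mid x, \theta) \mid x, \theta^{t-1}\big]}_{-H(\theta,\, \theta^{t-1})}

% Gibbs' inequality gives H(\theta, \theta^{t-1}) \ge H(\theta^{t-1}, \theta^{t-1})
% for every \theta, hence
\log p(x \mid \theta^{t}) - \log p(x \mid \theta^{t-1})
  \;\ge\; Q(\theta^{t}, \theta^{t-1}) - Q(\theta^{t-1}, \theta^{t-1})
  \;\ge\; 0 .
```

The last inequality holds because the M-step picks $\theta^{t}$ to maximize $Q(\cdot, \theta^{t-1})$, so each EM iteration cannot decrease the likelihood of the observed data.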
34EM in link delay inference
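Note that here x and y have the opposite meaning to the x and y used in the previous EM slides: x now denotes the hidden per-link delays and y the observed end-to-end delays.
[Figure: tree whose nine links are labeled with delay pmfs a1 ... a9 and link delays x1 ... x9]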
35EM in link delay inference (2)
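- Complete data: $Z = (X, Y)$
- The complete-data log-likelihood: $\log P_a(X, Y) = \log P_a(Y \mid X) + \log P_a(X) = \log P(Y \mid X) + \sum_i \sum_j m_{i,j} \log a_{i,j}$
- $P_a(Y \mid X)$ has nothing to do with $a$.
- $m_{i,j}$ is the total number of packets that experience a delay j on link i over N measurements.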
36EM in link delay inference (3)
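The MLE of $a$ would be $\hat{a} = \arg\max_a \sum_i \sum_j m_{i,j} \log a_{i,j}$, subject to $\sum_j a_{i,j} = 1$ for every link i.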
37EM in link delay inference (4)
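MLE: $\hat{a}_{i,j} = m_{i,j} / \sum_k m_{i,k}$,
which is the relative frequency of the event counted by $m_{i,j}$.
A simple example: we toss a die with $P(\text{result} = i) = a_i$ ($i = 1, 2, \dots, 6$), and $m_i$ is how many times we see result i. The MLE is then $\hat{a}_i = m_i / \sum_k m_k$.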
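The die example amounts to counting and normalizing. A minimal sketch, with made-up tosses and a hypothetical function name:

```python
from collections import Counter


def multinomial_mle(outcomes, support):
    """MLE of a categorical pmf: a_i = m_i / N, the relative frequency."""
    counts = Counter(outcomes)            # m_i: how many times result i was seen
    total = len(outcomes)                 # N: number of tosses
    return {i: counts[i] / total for i in support}


tosses = [1, 3, 3, 6, 2, 3, 6, 1, 5, 3]   # hypothetical die rolls
a_hat = multinomial_mle(tosses, support=range(1, 7))
```

In the link delay problem the counts are not observed directly, which is why the E-step replaces them with expected counts.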
38EM in link delay inference (5)
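- We notice that $Q(a, a^{t-1})$ is similar to the complete-data log-likelihood,
- the only difference being that $m_{i,j}$ is replaced by its conditional expectation $\hat{m}_{i,j} = E[m_{i,j} \mid y, a^{t-1}]$.
- So the MLE: $a^{t}_{i,j} = \hat{m}_{i,j} / \sum_k \hat{m}_{i,k}$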
39EM in link delay inference (6)
Probability Propagation
40A simple example
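[Figure: two-receiver tree. Link 1 (delay x1) connects node 0 to node 1; links 2 and 3 (delays x2, x3) connect node 1 to nodes 2 and 3, where the end-to-end delays y1 = x1 + x2 and y2 = x1 + x3 are measured.]
The delay on each link falls into {0, 1, 2, 3}.
$a_{i,j} = P(\text{delay on link } i = j)$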
41A simple example (2)
- Suppose there are 5 measurements (y1, y2):
- (3,2), (4,2), (6,5), (0,0), (4,1)
[Figure: the two-receiver tree from slide 40]
42A simple example (3)
Bayes Formula
[Figure: the two-receiver tree from slide 40]
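For this two-receiver tree the Bayes step can be written out explicitly. The sketch below assumes y1 = x1 + x2 and y2 = x1 + x3 with independent link delays, and takes $a_{i,d} = 0$ whenever d falls outside {0, 1, 2, 3}. The uniform starting value in the worked case is an assumption.

```latex
% Posterior of the shared-link delay given one packet-pair measurement:
P(x_1 = j \mid y_1, y_2)
  = \frac{a_{1,j}\; a_{2,\,y_1 - j}\; a_{3,\,y_2 - j}}
         {\sum_{k=0}^{3} a_{1,k}\; a_{2,\,y_1 - k}\; a_{3,\,y_2 - k}}

% Worked case: (y_1, y_2) = (3, 2) with all a_{i,j} = 1/4.
% Feasible values are x_1 \in \{0, 1, 2\}, i.e. (x_1, x_2, x_3) \in
% \{(0,3,2),\ (1,2,1),\ (2,1,0)\}, each with weight (1/4)^3, so
P(x_1 = 0 \mid 3, 2) = P(x_1 = 1 \mid 3, 2) = P(x_1 = 2 \mid 3, 2) = \tfrac{1}{3},
\qquad P(x_1 = 3 \mid 3, 2) = 0 .
```

Once x1 is fixed, x2 and x3 are determined by the measurement, so this single posterior gives the expected contribution of the measurement to every count $m_{i,j}$.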
43A simple example (4)
[Figure: the two-receiver tree from slide 40]
44A simple example (5)
Similarly
[Figure: the two-receiver tree from slide 40]
45A simple example (6)
link i \ delay j    0       1       2       3
1                   4/3     11/6    5/6     1
2                   1       1/3     5/6     17/6
3                   17/6    5/6     4/3     0
$m_{i,j}$ computed in the first iteration.
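These expected counts can be reproduced exactly with rational arithmetic. The sketch below assumes y1 = x1 + x2 and y2 = x1 + x3, and a uniform initial guess of 1/4 for every $a_{i,j}$. The starting point is not stated on the slides, but this choice reproduces the table. Links are indexed 0, 1, 2 in the code, and the function name is hypothetical.

```python
from fractions import Fraction

DELAYS = range(4)                                          # link delays in {0, 1, 2, 3}
MEASUREMENTS = [(3, 2), (4, 2), (6, 5), (0, 0), (4, 1)]    # (y1, y2) pairs


def e_step(a, measurements):
    """Expected counts m[i][j] given the current pmfs a[i][j] (links 0, 1, 2).

    ASSUMPTION: y1 = x1 + x2 and y2 = x1 + x3, with independent link delays.
    """
    m = [[Fraction(0)] * 4 for _ in range(3)]
    for y1, y2 in measurements:
        # Weight of every feasible hidden configuration (x1, x2, x3).
        weights = {}
        for x1 in DELAYS:
            x2, x3 = y1 - x1, y2 - x1
            if x2 in DELAYS and x3 in DELAYS:
                weights[(x1, x2, x3)] = a[0][x1] * a[1][x2] * a[2][x3]
        total = sum(weights.values())
        # Bayes: posterior of a configuration is its weight over the total.
        for config, w in weights.items():
            for link, delay in enumerate(config):
                m[link][delay] += w / total
    return m


uniform = [[Fraction(1, 4)] * 4 for _ in range(3)]   # initial guess a[i][j] = 1/4
m = e_step(uniform, MEASUREMENTS)
```

Each row of `m` sums to 5, the number of measurements, since every packet pair crosses all three links.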
46A simple example (7)
The physical meaning of $a_{1,0}$ is the (expected) number of packets that experience delay 0 on link 1, divided by the total number of packets that travel through link 1.
47A simple example (8)
link i \ delay j    0       1       2       3
1                   4/15    11/30   1/6     1/5
2                   1/5     1/15    1/6     17/30
3                   17/30   1/6     4/15    0
$a_{i,j}$ computed in the first iteration.
48A simple example (9)
Iteration: iterate the E-step and M-step until some termination criterion is satisfied.
link i \ delay j    0       1       2       3
1                   0.4     0.4     0       0.2
2                   0.2     0       0       0.8
3                   0.4     0.2     0.4     0
After 6 iterations, $a_{i,j}$ converges to a fixed value.
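The whole iteration for this example fits in a few lines. The following is a sketch rather than the presenters' implementation: it assumes y1 = x1 + x2, y2 = x1 + x3, a uniform starting point, and a fixed iteration count in place of a termination test. Links are indexed 0, 1, 2 and the function names are hypothetical.

```python
MEASUREMENTS = [(3, 2), (4, 2), (6, 5), (0, 0), (4, 1)]   # (y1, y2) pairs
LEVELS = range(4)                                          # delays 0..3


def em_iteration(a, measurements):
    """One E-step plus M-step; returns the updated pmfs a[link][delay]."""
    m = [[0.0] * 4 for _ in range(3)]
    for y1, y2 in measurements:
        # E-step: posterior over feasible hidden delays (x1, x2, x3).
        configs = [(x1, y1 - x1, y2 - x1) for x1 in LEVELS
                   if (y1 - x1) in LEVELS and (y2 - x1) in LEVELS]
        weights = [a[0][c[0]] * a[1][c[1]] * a[2][c[2]] for c in configs]
        total = sum(weights)
        for config, w in zip(configs, weights):
            for link, delay in enumerate(config):
                m[link][delay] += w / total
    # M-step: normalise expected counts into pmfs, one per link.
    return [[count / sum(row) for count in row] for row in m]


def run_em(measurements, iterations):
    a = [[0.25] * 4 for _ in range(3)]     # ASSUMPTION: uniform start
    for _ in range(iterations):
        a = em_iteration(a, measurements)
    return a


a_hat = run_em(MEASUREMENTS, iterations=50)
```

A practical version would stop once the change in `a` between iterations drops below a tolerance. EM can also converge to different fixed points from different starting values, so the uniform start matters here.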
49A simple example (10)
- (3,2), (4,2), (6,5), (0,0), (4,1)
[Figure: the two-receiver tree from slide 40]
50Complexity
52Conclusion
- The field is just emerging.
- Deploying measurement/probing schemes and
inference algorithms in larger networks is the
next key step.
53Problems
- The spatially and temporally stationary and independent traffic model has limitations, especially in heavily loaded networks.
- There is a trend toward environments that are highly uncooperative for active probing, which motivates passive traffic monitoring techniques, for example those based on sampling TCP traffic streams.
54The End