Title: Time Series from their Observed Sums: Network Tomography
1Time Series from their Observed Sums: Network Tomography
- Edoardo M. Airoldi
- School of Computer Science
- Carnegie Mellon University
- (joint work with Christos Faloutsos)
SIGKDD, Seattle, WA, August 23rd, 2004
2Acknowledgements
- Srinivasan Seshan, CSD, CMU
- Russel Yount and Frank Kietzke, Network Development, CMU
- Stephen Fienberg, Statistics, CMU
- Jin Cao, Bell Labs
- Claudia Tebaldi, NCAR
- Yin Zhang, ATT Labs
3Outline
- Introduction / Motivation
- Survey
- Proposed Methods
- Results
- Conclusions
4Application Domains
- Communication Networks
- goal: Who is sending to whom?
- refs: Cao et al. (2001), Liang and Yu (2003), Zhang et al. (2004)
- Transportation Networks
- goal: Who is going where?
- Network Probing (Rish et al., IBM)
- goal: Which server is down?
- refs: Rish et al. (2002, 2004)
5Communication Networks
- A large ISP network has 100s of nodes, 1000s of links, 10000s of routes, and carries over 1 petabyte ($10^{15}$ bytes) per day
- Reliability analysis
- Predict link loads under unexpected/planned router/link failures
- Traffic engineering
- Optimize routes to minimize congestion
- Capacity planning
- Forecast future capacity requirements
[Diagram labels: OD flows, link loads]
6Mathematical Formulation
[Diagram: OD flows X = (X1, X2, X3, X4) and link loads Y at a single LINK node; situation at time t]
One constraint: total $\sum_i Y_i = 0$
7Problem Definition
Given: topology, a fixed routing scheme $A_{n \times m}$, and the traffic on the links of the network $Y(t) = [Y_1(t), \dots, Y_n(t)]$ over time $t = 1, \dots, T$.
Find: the non-observable traffic between origin-destination (OD) pairs $X(t) = [X_1(t), \dots, X_m(t)]$ over time $t = 1, \dots, T$.
$Y(t) = A\,X(t)$
Under-constrained
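A minimal sketch of the relation Y(t) = A X(t), assuming a hypothetical 3x4 routing matrix for a small star network (3 measured links, 4 OD flows) and synthetic log-Normal OD flows. Neither the matrix nor the numbers come from the talk; they only illustrate why the system has fewer independent equations than unknowns.

```python
import numpy as np

# Hypothetical routing matrix (an assumption for illustration):
# rows are measured links, columns are OD flows 1->1, 1->2, 2->1, 2->2.
A = np.array([
    [1, 1, 0, 0],   # traffic leaving origin 1
    [0, 0, 1, 1],   # traffic leaving origin 2
    [1, 0, 1, 0],   # traffic arriving at destination 1
])

rng = np.random.default_rng(0)
T = 5                                                # number of time points
X = rng.lognormal(mean=2.0, sigma=1.0, size=(T, 4))  # synthetic OD flows
Y = X @ A.T                                          # link loads, one row per time point

n_links, n_flows = A.shape
rank = np.linalg.matrix_rank(A)
print(Y.shape, rank, n_flows)  # rank is smaller than the number of OD flows
```

Only Y and A would be available in practice; X is generated here so the sums can be formed.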
8A Glance at the Data
Find OD flows X(t) (unknown):
X1(t1) X2(t1) X3(t1) X4(t1)
X1(t2) X2(t2) X3(t2) X4(t2)
X1(t3) X2(t3) X3(t3) X4(t3)
X1(t4) X2(t4) X3(t4) X4(t4)
Measure link flows Y(t):
Y1(t1) Y2(t1) Y3(t1)
Y1(t2) Y2(t2) Y3(t2)
Y1(t3) Y2(t3) Y3(t3)
Y1(t4) Y2(t4) Y3(t4)
[Plot: link traffic (Kb) vs. hour of the day]
9Our Problem: No Traffic Matrix
- Traffic matrix
- Gives traffic volumes between origin and destination
- Very difficult to measure directly
- Direct measurement (Feldmann et al., 2000)
- Collect flow-level data around the whole edge of the network
- Combine with routing data
- Semi-standard router feature: NetFlow
- Cisco, Juniper, etc.
- Not always well supported
- Potential performance impact on routers
- Huge amount of data (500 GB/day)
- Widely available SNMP data gives only link loads
- Even this data is not perfect (glitches, loss, ...)
10Outline
- Introduction / Motivation
- Survey
- Proposed Methods
- Results
- Conclusions
11Infinite Exact Solutions
- Measurements ($Y_t$) and routing scheme $A_{3 \times 4}$ allow for many feasible OD flows ($X_t$)
- For example: [example figure]
- The problem is under-constrained, so we need some assumptions
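A small numerical illustration of the non-uniqueness, assuming a hypothetical 3x4 routing matrix and made-up flow values (not the example shown in the talk). Two different non-negative OD flow vectors produce exactly the same link loads, because they differ by a vector in the null space of A.

```python
import numpy as np

A = np.array([[1, 1, 0, 0],
              [0, 0, 1, 1],
              [1, 0, 1, 0]])           # hypothetical 3x4 routing matrix
x_a = np.array([3.0, 1.0, 2.0, 4.0])   # one set of OD flows (made-up numbers)
y = A @ x_a                            # link loads it produces

# Direction in the null space of A: moving along it leaves A @ x unchanged.
v = np.array([1.0, -1.0, -1.0, 1.0])
assert np.allclose(A @ v, 0)

x_b = x_a + 1.0 * v                    # a different, still non-negative, solution
print(x_b, A @ x_b)                    # [4. 0. 1. 5.] and the same link loads as x_a
```

Any step size that keeps all components non-negative gives another feasible solution, so the feasible set is a continuum.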
12Related Work
$y = Ax$
- Solutions in the past
- Direct solution: SVD
- Scoring criterion: GLS, maximum likelihood, entropy, Bayesian methods, ...
- Regularization: assume independent OD flows
- Estimate OD flows $x_t$ using $y_{t-?}, \dots, y_{t+?}$
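A sketch of what a direct SVD-based solution looks like, assuming a hypothetical 3x4 routing matrix and made-up flows. The pseudo-inverse (which NumPy computes through the SVD) returns the minimum-norm solution of y = Ax; it matches the link loads exactly but need not match the flows that generated them.

```python
import numpy as np

A = np.array([[1, 1, 0, 0],
              [0, 0, 1, 1],
              [1, 0, 1, 0]], dtype=float)   # hypothetical routing matrix
x_true = np.array([3.0, 1.0, 2.0, 4.0])     # made-up OD flows
y = A @ x_true

# The pseudo-inverse picks the minimum-norm solution of y = A x.
x_hat = np.linalg.pinv(A) @ y

print(x_hat)        # reproduces y, yet differs from x_true
print(A @ x_hat)
```

This is one reason a scoring criterion or a prior is needed on top of the linear system: the minimum-norm choice is arbitrary with respect to the real traffic.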
13Pitfalls of Past Approaches
- Unrealistic Models
- Gaussian or Poisson OD traffic flows. But we observe bursty, log-Normal traffic flows.
- Time Dependence across Epochs
- Never explicitly addressed; $x_t$ is typically assumed independent over time. But we observe time dependence within single OD flows.
14Empirical Laws: log-Normality
- Aggregate OD flows look log-log Normal
[Histograms: counts vs. log bytes; counts vs. log-log bytes]
12321 OD time series. CMU validation data.
15Outline
- Introduction / Motivation
- Survey
- Proposed Method
- 1st Stage - Linear Dynamical Systems
- 2nd Stage - Bayesian Dynamical Systems
- Results
- Conclusions
16The Model
- A smooth average process $\{\lambda_t : t > 0\}$
- A possibly bursty process $\{x_t : t > 0\}$ to model the OD traffic flows
17Parameter Estimation
- Estimate parameters underlying the average process $\{\lambda_t : t > 0\}$
- Calibrate priors for the parameters driving the dynamics of the OD flows process $\{x_t : t > 0\}$
- Estimate the OD flows using a Particle Filter
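The last step can be sketched with a toy bootstrap particle filter. This is not the authors' filter: the log-Normal random-walk dynamics, the Gaussian observation noise, the routing matrix, and every parameter value below are assumptions chosen only to show the propagate / weight / resample cycle.

```python
import numpy as np

def bootstrap_particle_filter(Y, A, n_particles=2000, step_sd=0.2, obs_sd=0.5, seed=0):
    """Toy bootstrap particle filter for y_t = A x_t + noise with positive x_t.

    All model choices (log-Normal random-walk dynamics, Gaussian observation
    noise, parameter values) are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n_flows = A.shape[1]
    particles = rng.lognormal(mean=1.0, sigma=1.0, size=(n_particles, n_flows))
    estimates = np.empty((len(Y), n_flows))
    for t, y_t in enumerate(Y):
        # 1. Propagate: multiplicative log-Normal noise keeps flows positive.
        particles = particles * rng.lognormal(0.0, step_sd, size=particles.shape)
        # 2. Weight: Gaussian likelihood of the observed link loads.
        resid = y_t - particles @ A.T
        log_w = -0.5 * np.sum((resid / obs_sd) ** 2, axis=1)
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        # 3. Estimate: weighted mean of the particles.
        estimates[t] = w @ particles
        # 4. Resample: draw particles in proportion to their weights.
        particles = particles[rng.choice(n_particles, size=n_particles, p=w)]
    return estimates

A = np.array([[1, 1, 0, 0], [0, 0, 1, 1], [1, 0, 1, 0]], dtype=float)  # hypothetical
rng = np.random.default_rng(1)
X_true = rng.lognormal(1.0, 0.5, size=(20, 4))   # synthetic OD flows
Y = X_true @ A.T                                 # synthetic link loads
X_hat = bootstrap_particle_filter(Y, A)
print(X_hat.shape)
```

With uninformative dynamics like these, the estimates are only weakly pinned down by the link loads, which is the motivation for calibrating priors in the second step.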
18Outline
- Introduction / Motivation
- Survey
- Proposed Method
- 1st Stage - Linear Dynamical Systems
- 2nd Stage - Bayesian Dynamical Systems
- Results
- Conclusions
19Introducing Time Dependence
- We introduce explicit time dependence
- $\lambda(t) = F_{n \times n} \cdot \lambda(t-1) + e(t)$
- The distinct OD flows, components of $\lambda(t)$, are assumed to be independent
- Use EM algorithm
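A sketch of a linear dynamical system of this form, with a Kalman filter run on the simulated link loads. The Kalman filter is the standard building block of the E-step when EM is applied to linear Gaussian state-space models; the code below is not the authors' EM implementation, and the routing matrix, F, noise covariances, and starting values are all assumed.

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[1, 1, 0, 0], [0, 0, 1, 1], [1, 0, 1, 0]], dtype=float)  # hypothetical
n = A.shape[1]
F = np.diag([0.9, 0.95, 0.8, 0.85])   # diagonal F: components evolve independently (assumed values)
Q = 0.1 * np.eye(n)                   # covariance of e(t) (assumed)
R = 1e-6 * np.eye(A.shape[0])         # tiny measurement noise (assumed)

# Simulate lambda(t) = F lambda(t-1) + e(t) and link loads y(t) = A lambda(t).
T = 50
lam = np.empty((T, n))
prev = np.full(n, 5.0)
for t in range(T):
    prev = F @ prev + rng.multivariate_normal(np.zeros(n), Q)
    lam[t] = prev
Y = lam @ A.T

# Kalman filter: mean m and covariance P of lambda(t) given y(1), ..., y(t).
m, P = np.full(n, 5.0), np.eye(n)
filtered = np.empty((T, n))
for t in range(T):
    m, P = F @ m, F @ P @ F.T + Q               # predict
    S = A @ P @ A.T + R                         # innovation covariance
    K = P @ A.T @ np.linalg.inv(S)              # Kalman gain
    m = m + K @ (Y[t] - A @ m)                  # update mean
    P = (np.eye(n) - K @ A) @ P                 # update covariance
    filtered[t] = m
```

Because the model is Gaussian, nothing stops the filtered values from going negative, which is one motivation for the non-Gaussian second stage.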
20Introducing Time Dependence
- Our Linear Dynamical System contains the models
by Cao et al. as a special case
21Outline
- Introduction / Motivation
- Survey
- Proposed Method
- 1st Stage - Linear Dynamical Systems
- 2nd Stage - Bayesian Dynamical Systems
- Results
- Conclusions
22Bayesian Dynamical System
- Gamma and log-Normal OD flows (Xt)
- Use preliminary estimates of $\{\lambda_t : t > 0\}$, the average OD flows, to softly constrain the dynamical behavior of the OD flows and identify the correct solution for $X_t$
23Non-Deterministic Dynamics
- Introduce explicit non-deterministic dynamics (F) on the average OD flows
- $\lambda(t+1) = F_{n \times n}\,\lambda(t)$
- Diagonal matrix $F_{n \times n}$: $F_{i,i} \sim$ log-Normal
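A short simulation of multiplicative dynamics with a diagonal F whose entries are log-Normal. The starting values and the log-Normal parameters are assumptions. The point is that the averages stay positive and that, on the log scale, the dynamics become an additive Gaussian random walk.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T = 4, 200
lam = np.empty((T, n))
lam[0] = [10.0, 5.0, 20.0, 8.0]       # made-up starting averages

for t in range(T - 1):
    f_diag = rng.lognormal(mean=0.0, sigma=0.05, size=n)  # F_ii ~ log-Normal (parameters assumed)
    lam[t + 1] = np.diag(f_diag) @ lam[t]                 # next averages = F times current averages

# On the log scale the same dynamics are additive Gaussian steps.
log_steps = np.diff(np.log(lam), axis=0)
print(lam.min() > 0, log_steps.std())
```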
24Learning Latent Dynamics
- We want a preliminary estimate for $F_t$ in
- $\lambda_{t+1} = F_{t+1} \cdot \lambda_t$
- Given $P(\lambda_{246} \mid Y_{246})$ and $P(\lambda_{247} \mid Y_{247})$, solve for $F_{247}$
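A sketch of the "solve for F" step under a simplifying assumption: each posterior is summarized by a single point estimate of the average OD flows (for instance a posterior mean). The numbers are made up. With a diagonal F the equation decouples, so each diagonal entry is a ratio of consecutive averages.

```python
import numpy as np

# Made-up point estimates of the average OD flows at two consecutive epochs.
lam_246 = np.array([12.0, 3.0, 7.5, 20.0])
lam_247 = np.array([15.0, 2.4, 7.5, 30.0])

# With a diagonal F, lam_247 = F_247 @ lam_246 decouples component by component.
F_247 = np.diag(lam_247 / lam_246)

print(np.diag(F_247))
```

Working with full posteriors instead of point estimates would give a distribution over F rather than a single matrix.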
25Outline
- Introduction / Motivation
- Survey
- Proposed Methods
- Results
- Datasets
- Importance of Time Dependence
- Importance of non-Gaussianity
- Informative Priors for non-Gaussian BDS
- Conclusions
26Validation Data sets
- Consider star network topologies
- 4 OD flows, 9 OD flows and 16 OD flows
- Carnegie Mellon: 12321 time series
- Lucent Technologies: 32 time series
[Diagram: OD flows X = (X1, X2, X3, X4) and link loads Y at a single LINK node; situation at time t]
27Log-Normal OD Traffic Flows
- The validation OD traffic flows are skewed on
both data sets
28Outline
- Introduction / Motivation
- Survey
- Proposed Methods
- Results
- Datasets
- Importance of Time Dependence
- Importance of non-Gaussianity
- Informative Priors for non-Gaussian BDS
- Conclusions
29Reduce Variability
- Narrower range of possible values for the OD traffic flows, i.e. those that receive positive posterior probability
30Robust Estimates
- Capture sharp changes in the distribution of the
OD traffic flows
31Outline
- Introduction / Motivation
- Survey
- Proposed Methods
- Results
- Datasets
- Importance of Time Dependence
- Importance of non-Gaussianity
- Informative Priors for non-Gaussian BDS
- Conclusions
32Capture Several Bursts
[Plot: traffic (Kb) vs. time]
33Outline
- Introduction / Motivation
- Survey
- Proposed Methods
- Results
- Datasets
- Importance of Time Dependence
- Importance of non-Gaussianity
- Informative Priors for non-Gaussian BDS
- Conclusions
34Priors and Bayesian inference
- Informative priors on $\{\lambda_t : t > 0\}$ lead to uni-modal posteriors
35Speed and Scalability
- The computing time is about 3 minutes
- 4 OD flows, 3 links, using R on a Mac G4 (667 MHz)
- Linear in the number of OD flows for each time point
- 1 day's worth of data in 45 minutes
36Model Comparison
37Numerical Comparison
[Comparison in $\ell_2$ error]
38Outline
- Introduction / Motivation
- Survey
- Proposed Methods
- Results
- Conclusions
39Past Approaches
- Unreasonable Models
- Gaussian or Poisson arrivals
- Time Dependence
- never explicitly addressed
40Conclusions
- Log-Normal models account for skewed and bursty, non-observable OD flows
- Novel BDS captures time dependence of data, thus reducing the variability of the estimates
- Informative priors serve as soft constraints to overcome the under-determinacy of the problem
41Future Work
- More tests on bigger networks
- from 2-star (4-D) to 4-star (16-D)
- Fit non-parametric seasonal components for the
non-observable OD flows
42 43Network Engineering
- State of the art: guess and tweak
- Guess based on experience and intuition
- Manually tweak things, and hope for the best
- Disadvantages
- Manual process: time-consuming, error-prone
- Not very reliable: intuition may be wrong, unexpected side effects
- Suboptimal performance: wastes resources/time
- Need to repeat the exercise when the traffic pattern changes
44A More Scientific Approach?
A: "Well, we don't know the topology, we don't know the traffic matrix, the routers don't automatically adapt the routes to the traffic, and we don't know how to optimize the routing configuration. But, other than that, we're all set!" (Rexford 2000, Kurose 2003)
45Contributions
- Realistic Models: Gamma and log-Normal
- $P(\text{OD Flows}(t) \mid \lambda(t))$
- Explicit Time Dependence
- $E(\text{OD Flows}(t) \mid y(t), \dots, y(1))$
46Contributions
- Informative priors in a Bayesian Dynamical System for an under-constrained problem
- Drive our inferences to the correct solution
- Get high quality particles
- Easy solution for Sparse Traffic
47Exploring the OD space
- Gibbs sampler with Metropolis steps is able to explore $P(X_t \mid Y_t)$
- We prove irreducibility of the chains
- Gamma, log-Normal
[Diagram: regions of the OD space with $P(X_t \mid Y_t) > 0$, separated by a region with $P(X_t \mid Y_t) = 0$]
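A simplified stand-in for this kind of sampler: a random-walk Metropolis chain that moves along the null space of a hypothetical routing matrix, so every sample reproduces the observed link loads, with an assumed independent log-Normal density on the flows. It is not the authors' Gibbs sampler, and the matrix, loads, and density parameters are made up.

```python
import numpy as np

A = np.array([[1, 1, 0, 0], [0, 0, 1, 1], [1, 0, 1, 0]], dtype=float)  # hypothetical
y = np.array([4.0, 6.0, 5.0])                 # made-up link loads
x = np.array([3.0, 1.0, 2.0, 4.0])            # feasible start: A @ x == y, x > 0
v = np.array([1.0, -1.0, -1.0, 1.0])          # null-space direction: A @ v == 0

def log_density(x, mu=1.0, sigma=1.0):
    """Log density (up to a constant) of independent log-Normal flows; -inf if any flow <= 0."""
    if np.any(x <= 0):
        return -np.inf
    return np.sum(-np.log(x) - 0.5 * ((np.log(x) - mu) / sigma) ** 2)

rng = np.random.default_rng(0)
samples = []
for _ in range(5000):
    proposal = x + rng.normal(0.0, 0.5) * v   # symmetric move that keeps A @ x fixed
    if np.log(rng.uniform()) < log_density(proposal) - log_density(x):
        x = proposal                          # Metropolis accept
    samples.append(x)
samples = np.array(samples)
print(samples.mean(axis=0))
```

Proposals that would make any flow non-positive get zero density and are always rejected, so the chain never leaves the feasible region.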
48Non-Deterministic Dynamics
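- Introduce explicit non-deterministic dynamics (F) on the average OD flows
- $\lambda(t+1) = F_{n \times n}\,\lambda(t)$
- Diagonal matrix $F_{n \times n}$ with $F_{i,i} \sim$ log-Normal leads to
- $\lambda(t+1) = F\lambda(t) \;\Rightarrow\; e^{\log\lambda(t+1)} = e^{\log F}\,e^{\log\lambda(t)} \;\Rightarrow\; \log\lambda(t+1) = \log F + \log\lambda(t)$ (componentwise along the diagonal)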
49Better OD Flows in 4 Steps
[Figure: four panels, numbered 1 to 4]
50Immanuel Kant o(1)
- In making inferences on non-observable quantities we find the model we look for!
- Assume a model that reasonably approximates real OD flows, and of course it does not hurt to have a prior opinion about it
51Learning OD Flows
- Typical solutions are based on
- Generalized Least Squares
- Maximum Likelihood
- Bayesian methods
- Entropy
- These methods generate one set of OD flows $X$ from multiple observations $Y_1, \dots, Y_T$. In general:
$\max_X \; p\,D_1[X, X_{obs}] + q\,D_2[Y, Y_{obs}]$ s.t. $Y = AX$, $X \ge 0$, $p, q \in [0, 1]$ fixed
Random
52Intrinsic Dimensionality
- The routing matrix $A$ has $m$ rows $<$ $n$ columns, and its $m$ rows are linearly independent
- The space $\mathbb{R}^n$, where the OD flows live, can be decomposed into a sub-space $\mathbb{R}^{n-m}$ with an open interior and a degenerate sub-space $\mathbb{R}^m$
It is possible to rearrange $A = [A_1, A_2]$ and $X = [X_1, X_2]$ accordingly, so that given $X_2 \in \mathbb{R}^{n-m}$:
$X_1 = A_1^{-1}(Y - A_2 X_2) \in \mathbb{R}^m$
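A worked instance of this decomposition, assuming a hypothetical routing matrix with m = 3 rows and n = 4 columns and made-up link loads. Once the n - m free components are chosen, the remaining m components follow by solving a square linear system.

```python
import numpy as np

A = np.array([[1, 1, 0, 0],
              [0, 0, 1, 1],
              [1, 0, 1, 0]], dtype=float)   # hypothetical: m = 3 rows, n = 4 columns
y = np.array([4.0, 6.0, 5.0])               # made-up link loads

m = A.shape[0]
A1, A2 = A[:, :m], A[:, m:]                 # split the columns; A1 is square and invertible here

x2 = np.array([4.0])                        # free choice of the remaining n - m flows
x1 = np.linalg.solve(A1, y - A2 @ x2)       # solve A1 x1 = y - A2 x2

x = np.concatenate([x1, x2])
print(x)                                    # [3. 1. 2. 4.]
```

Here the first m columns happen to be linearly independent; in general the columns may need to be reordered first so that the square block is invertible.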
53Doubly Stochastic BDS