I2.2: Analysis of significant substructures in time-varying networks

1
I2.2: Analysis of significant substructures in time-varying networks
Ambuj Singh (in collaboration with P. Bogdanov, M. Mongiovi, X. Yang)
NS-CTA INARC Mid-Year Review, March 2011
03/22/11
2
Dynamic networks
  • Dynamic networks are commonplace
  • online interaction networks
  • Twitter, Wikipedia, LinkedIn, Facebook, ...
  • mobile networks
  • Cyber-physical scenario (EDIN, INARC)
  • virus propagation (E2.1)
  • Generative models to explain the network
    structure
  • preferential attachment Barabasi '99
  • forest-fire Leskovec '09
  • Markov Chain models (discrete, continuous)
  • when, where, what changes Avin '08, Clementi
    '08
  • Latent space / context models Zheng '05
  • Network flow/traffic Daganzo '94, Bickel '01,
    Stoev '09
  • Disease propagation, blog cascade, SIS Leskovec
    '07
  • Stochastic actor-based models Snijders '09

3
Our focus
  • Dynamic edge attributes
  • Simplest case
  • edge value is 1 or -1
  • 1 means flow of interest
  • e.g. congestion: flow above a historical threshold
  • real values are the general case and can also be
    considered
  • Query: find the highest-scoring substructures in
    the graph over time
  • combines graph structure and time

4
Motivation: traffic congestion
5
Re-tweet rate of music in Twitter
6
Outline
  • Motivation
  • Problem definition
  • Solving for a fixed time interval
  • Heuristic for multiple time intervals
  • Path Forward

7
Problem definition
  • A time-evolving graph
  • $G = (V, E, F_t(e))$
  • $V$: set of nodes
  • $E$: set of edges
  • $F_t(e)$: mapping of edges to $\{-1, 1\}$ at time $t$
  • Score of an edge $e$ in interval $[t_1, t_2]$: $\mathrm{score}(e) = \sum_{t=t_1}^{t_2} F_t(e)$
  • Score of a subgraph in the interval: $\sum_{e} \mathrm{score}(e)$, for
    all $e$ in the subgraph
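
The two score definitions above can be sketched in a few lines of Python. This is a minimal illustration, assuming each edge's values are stored as a list indexed by 0-based time step; the names `edge_score`, `subgraph_score` and the toy graph `F` are hypothetical and do not come from the slides.

```python
from typing import Dict, Iterable, List, Tuple

Edge = Tuple[str, str]

def edge_score(series: List[int], t1: int, t2: int) -> int:
    """Sum F_t(e) over the closed interval [t1, t2] (0-based time steps)."""
    return sum(series[t1 : t2 + 1])

def subgraph_score(F: Dict[Edge, List[int]], edges: Iterable[Edge],
                   t1: int, t2: int) -> int:
    """Sum of the edge scores over all edges of the subgraph."""
    return sum(edge_score(F[e], t1, t2) for e in edges)

# Toy time-evolving graph: three edges observed at four time steps.
F = {
    ("a", "b"): [1, 1, -1, 1],
    ("b", "c"): [-1, 1, 1, 1],
    ("c", "d"): [-1, -1, -1, 1],
}

print(edge_score(F[("a", "b")], 0, 2))                      # 1 + 1 - 1 = 1
print(subgraph_score(F, [("a", "b"), ("b", "c")], 1, 3))    # 1 + 3 = 4
```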

[Figure: example time-evolving graph over time steps t1 to t4, with each edge labeled 1 or -1 at each step]
8
Prize-collecting Steiner Tree (PCST)
  • Given a graph $G = (V, E)$ with positive node weights (prizes)
    $p(v)$ and negative edge weights (costs) $c(e)$, find a
    subtree $T = (V', E')$ that optimizes one of
  • Goemans-Williamson Minimization (GW-PCST): minimize
    $GW(T) = \sum_{e \in E'} c(e) + \sum_{v \notin V'} p(v)$
  • Net Worth Maximization (NW-PCST): maximize
    $NW(T) = \sum_{v \in V'} p(v) - \sum_{e \in E'} c(e)$
  • Both are NP-hard (equivalent objective functions)
    Johnson00
  • GW-PCST has an approximation factor $2 - 1/(n-1)$
  • The rooted version of NW-PCST is NP-hard to
    approximate within any constant factor Feig01
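
A small sketch may help show why the two objectives are called equivalent: for any tree, GW(T) + NW(T) equals the total prize of the graph, a constant, so minimizing one maximizes the other. The function names and the toy instance below are hypothetical, and `cost` is assumed to hold nonnegative cost magnitudes.

```python
def gw_objective(prize, cost, tree_nodes, tree_edges):
    """GW(T): cost of the chosen edges plus prizes of the nodes left out."""
    paid = sum(cost[e] for e in tree_edges)
    missed = sum(p for v, p in prize.items() if v not in tree_nodes)
    return paid + missed

def nw_objective(prize, cost, tree_nodes, tree_edges):
    """NW(T): prizes of the chosen nodes minus cost of the chosen edges."""
    collected = sum(prize[v] for v in tree_nodes)
    paid = sum(cost[e] for e in tree_edges)
    return collected - paid

# Toy instance (assumed values): path a - b - c.
prize = {"a": 5, "b": 1, "c": 4}
cost = {("a", "b"): 2, ("b", "c"): 3}

# Candidate tree: nodes {a, b} joined by edge (a, b).
tree_nodes, tree_edges = {"a", "b"}, [("a", "b")]
gw = gw_objective(prize, cost, tree_nodes, tree_edges)   # 2 + 4 = 6
nw = nw_objective(prize, cost, tree_nodes, tree_edges)   # 6 - 2 = 4
print(gw, nw, gw + nw == sum(prize.values()))
```

The shared optimum does not carry approximation guarantees from one objective to the other, which is the point of the next slide.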
9
Why doesn't the same guarantee hold for NW?
  • In this specific example
  • GW-PCST
  • APX = 3(k-1)
  • OPT = 2k
  • ratio OPT/APX ≈ 2/3 for large k
  • NW-PCST
  • OPT = k
  • APX = 3
  • ratio OPT/APX = k/3, which grows without bound in k
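
The arithmetic behind the two ratios can be checked directly. The sketch below only plugs the slide's APX and OPT values into OPT/APX for a few values of k; the function names are hypothetical.

```python
def gw_ratio(k: int) -> float:
    """OPT/APX for the GW objective in the example: 2k / (3(k-1))."""
    return (2 * k) / (3 * (k - 1))

def nw_ratio(k: int) -> float:
    """OPT/APX for the NW objective in the example: k / 3."""
    return k / 3

for k in (10, 100, 1000):
    # The GW ratio settles near 2/3 while the NW ratio keeps growing with k.
    print(k, round(gw_ratio(k), 3), round(nw_ratio(k), 1))
```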

[Figure: example graph with weights 0, 2, 3 and k, with the APX and OPT solutions marked. Optimal solution: the whole graph.]
10
Merge-and-refine approximation
  • Merge nodes into clusters in a bottom-up fashion
  • shortest-path metric graph using edge costs 
  • Merge triangle and star structures considering
    both node values and interconnect cost
  • Multiple refinement iterations
  • Approximation quality
  • $\mathrm{OPT} < \mathrm{APX} + c\,N(\mathrm{OPT})$, where $N$ is the cost of
    interconnection
  • Good approximation for instances in which there
    are cheaply connected clusters of high-prize
    nodes
  • Challenges
  • Relatively high computational cost due to all
    pairs shortest path computation

11
An example
  • Aggregate edge values within the interval
  • Transform the edge-weighted graph into NW-PCST
  • Apply the Merge-and-refine approximation 
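
The first two steps above can be sketched as follows. The slides do not spell out the transformation, so the reduction in `to_nw_pcst` is an assumption: each positively scored edge becomes a prize node attached to its endpoints at zero cost, and each non-positive edge keeps its place with cost equal to the magnitude of its score. All names are hypothetical, and the merge-and-refine step itself is not shown.

```python
def aggregate(F, t1, t2):
    """Sum each edge's values over the closed interval [t1, t2]."""
    return {e: sum(series[t1 : t2 + 1]) for e, series in F.items()}

def to_nw_pcst(agg):
    """Assumed reduction from an edge-weighted graph to node prizes and edge costs."""
    prize, cost = {}, {}
    for (u, v), w in agg.items():
        prize.setdefault(u, 0)           # original nodes carry no prize
        prize.setdefault(v, 0)
        if w > 0:
            mid = ("edge", u, v)         # new node standing in for the edge
            prize[mid] = w
            cost[(u, mid)] = 0
            cost[(mid, v)] = 0
        else:
            cost[(u, v)] = -w            # non-positive score becomes a cost
    return prize, cost

# Toy data (assumed): three edges over four time steps.
F = {
    ("a", "b"): [1, 1, -1, 1],
    ("b", "c"): [-1, 1, 1, 1],
    ("c", "d"): [-1, -1, -1, 1],
}
agg = aggregate(F, 0, 3)
prize, cost = to_nw_pcst(agg)
print(agg)
print(prize)
print(cost)
```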

12
Running time of merge-and-refine 
  • All-pairs shortest paths (APSP) computation takes
    90% of the approximation running time
  • Takes more than a second for N = 360 for one
    interval

13
Baseline solution across time
  • Find the best subgraph in time by exhaustive
    enumeration
  • Consider all $O(t^2)$ intervals
  • Apply the solution for a fixed interval in each
  • Take the best obtained subgraph in all intervals
  • Polynomial cost, but impractical for real-world
    problems
  • The highway system of Southern California has
    4k edges with live-traffic measurements
  • The Autonomous Systems (AS)-level Internet
    backbone has hundreds of thousands of links
  • The baseline solution would not be practical for
    networks of this scale
  • Need for scalable solutions of acceptable quality
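
The exhaustive baseline can be sketched as a double loop over interval endpoints, with one solver call per interval. In this sketch the fixed-interval solver is passed in as a function; `best_single_edge` is only a trivial stand-in for merge-and-refine, and all names and the toy data are hypothetical.

```python
from itertools import accumulate

def prefix_sums(F):
    """Per-edge prefix sums so any interval aggregate is one subtraction."""
    return {e: [0] + list(accumulate(series)) for e, series in F.items()}

def baseline(F, T, solve_interval):
    """Try all T(T+1)/2 intervals and keep the best solution found."""
    P = prefix_sums(F)
    best = (float("-inf"), None, None)   # (score, interval, edges)
    calls = 0
    for t1 in range(T):
        for t2 in range(t1, T):
            agg = {e: p[t2 + 1] - p[t1] for e, p in P.items()}
            score, edges = solve_interval(agg)
            calls += 1
            if score > best[0]:
                best = (score, (t1, t2), edges)
    return best, calls

def best_single_edge(agg):
    """Stand-in solver: returns the single highest-scoring edge."""
    e = max(agg, key=agg.get)
    return agg[e], [e]

F = {
    ("a", "b"): [1, 1, -1, 1],
    ("b", "c"): [-1, 1, 1, 1],
    ("c", "d"): [-1, -1, -1, 1],
}
best, calls = baseline(F, 4, best_single_edge)
print(best, calls)
```

With T time steps the loop makes T(T+1)/2 solver calls, which is the quadratic cost the slide refers to.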

14
Best-first approach using bounds
  • Idea: reduce the number of calls to
    Merge-and-refine
  • Estimate solutions for different intervals
  • Evaluate the most "promising" intervals first
  • Prune intervals that do not contain the best
    solution
  • Bound the solution in an interval
  • Computationally simple to compute
  • Effective in terms of pruning power
  • Best first procedure
  • Order intervals by their upper bound
  • Prune infeasible intervals using lower bound
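
The best-first procedure can be sketched with a priority queue keyed on the upper bound. This is a minimal sketch, assuming `upper_bound`, `lower_bound` and `solve` are supplied by the caller and that the bounds are valid for the value `solve` returns; these names and the toy score table are hypothetical.

```python
import heapq

def best_first(intervals, upper_bound, lower_bound, solve):
    """Evaluate intervals in decreasing upper-bound order, pruning the rest."""
    intervals = list(intervals)
    # Cheap global lower bound: some interval is known to reach this score.
    best_lb = max(lower_bound(iv) for iv in intervals)
    # heapq is a min-heap, so store negated upper bounds.
    heap = [(-upper_bound(iv), iv) for iv in intervals]
    heapq.heapify(heap)
    best_score, best_iv, calls = float("-inf"), None, 0
    while heap:
        neg_ub, iv = heapq.heappop(heap)
        ub = -neg_ub
        if ub < best_lb or ub <= best_score:
            break  # every remaining interval has an even smaller upper bound
        score = solve(iv)          # the expensive call
        calls += 1
        if score > best_score:
            best_score, best_iv = score, iv
    return best_score, best_iv, calls

# Toy setup: true scores per interval, with bounds that are off by one.
true_score = {(0, 0): 1, (0, 1): 4, (1, 1): 2}
result = best_first(
    true_score,
    upper_bound=lambda iv: true_score[iv] + 1,
    lower_bound=lambda iv: true_score[iv] - 1,
    solve=true_score.get,
)
print(result)
```

In this toy run only one of the three intervals is passed to `solve`; the other two are pruned by their upper bounds.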

15
Upper bound (UB)
  • Offline 
  • Consider a hyper-graph in which original edges
    become nodes and original nodes become
    hyper-edges 
  • Split the original edges into k partitions via
    hyper-graph partitioning
  • Maintain edges at partition "boundaries"
  • Online UB estimation for a fixed interval
  • UB of a partition is the aggregate of its
    positive edges
  • Edges between partitions
  • 0 cost if there is at least one positive boundary
    edge
  • cheapest boundary edge otherwise
  • Solve the NW-PCST on the obtained coarse-level
    graph
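
The online step can be sketched as building a coarse graph with one node per partition. This follows the bullets literally and assumes the offline partitioning is already available as two lookup tables; how the scores of positive boundary edges are credited is not stated in the slides, so this sketch does not credit them. All names and data are hypothetical, and solving NW-PCST on the coarse graph is left out.

```python
from collections import defaultdict

def coarse_graph(agg, part_of, boundary):
    """
    agg:      edge -> aggregated score in the interval
    part_of:  internal edge -> partition id
    boundary: boundary edge -> (partition id, partition id)
    Returns (prize per partition, cost per partition pair).
    """
    prize = defaultdict(int)
    for e, p in part_of.items():
        prize[p] += max(agg[e], 0)       # aggregate of the positive edges only
    between = defaultdict(list)
    for e, (p, q) in boundary.items():
        between[tuple(sorted((p, q)))].append(agg[e])
    cost = {}
    for pair, scores in between.items():
        # Zero cost if any boundary edge is positive, else the cheapest one.
        cost[pair] = 0 if max(scores) > 0 else -max(scores)
    return dict(prize), cost

agg = {"e1": 2, "e2": -1, "e3": 3, "e4": -2, "e5": -1, "e6": 1, "e7": 2}
part_of = {"e1": 0, "e2": 0, "e3": 1, "e6": 2}
boundary = {"e4": (0, 1), "e5": (1, 0), "e7": (1, 2)}
prize, cost = coarse_graph(agg, part_of, boundary)
print(prize, cost)
```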

16
Upper bound example
17
Upper bound effectiveness
  • The upper bound is more effective if
  • Partitions are well connected (small diameter)
  • Edges within partitions are correlated
  • Boundary edges are minimal and have expected
    value closer to -1 than within-partition edges
  • The upper bound is a coarse aggregation of the
    original graph
  • Coarseness is controlled by partitions
  • Trade-off between efficiency and effectiveness

18
Upper bound quality
  • Random Markovian graph (N = 150, M = 180, T = 300).
  • Number of partitions: 2 to 64.
  • "Random 64" is a random partitioning of edges into
    64 partitions.

19
Lower bound
  • Local iterative search in the solution space
    within an interval
  • Simulated Annealing (SA) procedure that
    grows/shrinks a subgraph within an interval
  • Possible moves: add an edge to, or remove an edge
    from, an existing solution
  • Allow sub-optimal moves according to an annealing
    schedule
  • Better quality than simple greedy algorithm
  • Due to sub-optimal moves, high-score clusters can
    be joined even if they are more than 2 hops away
  • Better running time than Merge-and-refine
  • No computation of all pairs shortest paths
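
A simulated-annealing search of this kind might look like the sketch below. The slides do not give the schedule or acceptance rule, so the geometric cooling, the Metropolis-style acceptance test, and the parameters `steps`, `temp`, `cooling` and `seed` are all assumptions; `anneal` and `is_connected` are hypothetical names.

```python
import math
import random

def is_connected(edges):
    """True if the (non-empty) edge set forms one connected subgraph."""
    edges = list(edges)
    if not edges:
        return False
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    start = edges[0][0]
    seen, stack = {start}, [start]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(adj)

def anneal(agg, steps=2000, temp=2.0, cooling=0.995, seed=0):
    """Grow/shrink a connected edge set; returns (best score, best edge set)."""
    rng = random.Random(seed)
    all_edges = list(agg)
    start = max(all_edges, key=agg.get)          # begin at the best single edge
    current, cur_score = {start}, agg[start]
    best, best_score = set(current), cur_score
    for _ in range(steps):
        e = rng.choice(all_edges)
        if e in current:                          # move: remove an edge
            candidate, delta = current - {e}, -agg[e]
        else:                                     # move: add an edge
            candidate, delta = current | {e}, agg[e]
        # Accept improving moves always, worsening moves with decaying odds.
        accept = delta >= 0 or rng.random() < math.exp(delta / temp)
        if accept and is_connected(candidate):
            current, cur_score = candidate, cur_score + delta
            if cur_score > best_score:
                best, best_score = set(current), cur_score
        temp = max(temp * cooling, 1e-9)
    return best_score, best

# Two high-scoring edges separated by two negative edges (assumed toy data).
agg = {("a", "b"): 3, ("b", "c"): -1, ("c", "d"): -1, ("d", "e"): 3}
score, edges = anneal(agg)
print(score, sorted(edges))
```

Because the best solution seen so far is tracked separately, the returned score is never worse than the starting single edge, and it is a valid lower bound for the interval.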

20
Summary
  • Dynamic graphs with changing edge attributes
  • Simplest query: find the highest-scoring
    substructure
  • Heuristics under development
  • Approximation guarantee
  • Empirical validation on
  • traffic network
  • Twitter messages

21
Path forward
  • Maximal scoring subgraph is a building block for
    richer queries and analyses
  • What is the structure of a congestion? Global
    (short and large), longitudinal (prolonged and
    localized) or a combination of both?
  • What characterizes the evolution of a network?
  • How do different network regions compare? 
  • Is evolution similar across networks of different
    genres?  
  • Index structures
  • Use statistical models for indexing real-world
    networks
  • Exploit locality within the network and locality
    in time
  • Represent the network at different levels of
    coarseness
  • Queries constrained by
  • Time
  • Neighborhood
  • Similarity queries

22
Connections
  • Queries/analysis of information flow (E2.1)
  • Queries on mobile networks (E2.2, E2.3)
  • Formal modeling of time (E1.1)
  • Dynamic network models (E2.1)

23
Army relevance
  • Query/analysis of mobility networks
  • Cyber-physical scenario
  • Query/analysis of evolving networks
  • Patterns of behavior in composite networks
  • Find terrorist groups using temporal interactions

24
Publications
  • P. Bogdanov, B. Baumer, P. Basu, A. Singh, and
    A. Bar-Noy, Discovering Influential Groups of
    Agents Using Composite Network Analysis,
    submitted to NetSci 2011.
  • P. Bogdanov, Nicholas D. Larusso and Ambuj K.
    Singh, Towards Community Discovery in Signed
    Collaborative Interaction Networks, published in
    SIASP at 2010 IEEE International Conference on
    Data Mining, 2010.
  • K. Macropol and A. Singh, Content-based Modeling
    and Prediction of Information Dissemination,
    submitted to ASONAM 2011.
  • M. Mongiovi, A. Singh, X. Yan, B. Zong, K.
    Psounis, An Indexing System for Mobility-aware
    Information Management, submitted to VLDB.
  • Ziyu Guan, Jian Wu, Zheng Yun, Ambuj K. Singh and
    Xifeng Yan, Assessing and Ranking Structural
    Correlations in Graphs, to appear at SIGMOD 2011.
  • Nicholas D Larusso and Ambuj K. Singh, Synopses
    for Probabilistic Data over Large Domains, in
    EDBT 2011.

25
THANK YOU!
26
Markovian dynamic models
  • Markovian: the graph state is a Markov chain
  • Fixed set of nodes
  • Edges at time t depend on edges at time t-1
  • Cover Time of Dynamic Graphs Avin et al. '08
  • Introduction of Markovian Dynamic Graphs
  • Exponential cover time
  • Lazy random walks
  • Information spread in Markovian graphs Clementi
    '09
  • Edge-Markovian
  • Geometric Markovian: node mobility
  • Evolving range-dependent graphs Grindrod '09
  • Edge dynamics as a birth/death process
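
An edge-Markovian graph with birth/death dynamics can be simulated in a few lines. This is a sketch under simple assumptions: every potential edge evolves independently, an absent edge appears with probability `p_birth` and a present edge disappears with probability `p_death` at each step, and the graph starts empty. The function name and parameters are hypothetical.

```python
import random

def edge_markovian(edges, steps, p_birth, p_death, seed=0):
    """Return the list of edge sets present at each time step."""
    rng = random.Random(seed)
    present = set()                       # start with no edges
    history = []
    for _ in range(steps):
        nxt = set()
        for e in edges:
            if e in present:
                if rng.random() >= p_death:   # edge survives
                    nxt.add(e)
            elif rng.random() < p_birth:      # edge is born
                nxt.add(e)
        present = nxt
        history.append(frozenset(present))
    return history

potential = [("a", "b"), ("b", "c"), ("c", "d")]
for t, snapshot in enumerate(edge_markovian(potential, 5, 0.3, 0.5)):
    # The state at step t depends only on the state at step t-1.
    print(t, sorted(snapshot))
```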

27
Dynamic models of traffic
  • The cell transmission model (CTM) Daganzo '94
  • Dynamic model of highway traffic
  • Inspired by hydrodynamic theory 
  • Traffic Flow on a Freeway Network Bickel '01
  • Time and context Markovian model of the traffic
    flow
  • The state of a segment at time t depends on the
    state of its neighbors and itself at time t-1
  • Model of a single highway. How about junctions? 
  • Computer Network Traffic Stoev '09
  • Statistical model of traffic flow across all
    links
  • Applied to traffic prediction 

28
Background literature
  • Avin '08 Chen Avin and Zvi Lotker. "How to
    Explore a Fast-Changing World." 2008
  • Bickel '01 Peter Bickel, Chao Chen, Jaimyoung
    Kwon, and John Rice. "Traffic Flow on a Freeway
    Network", Electrical Engineering, 2001.
  • Clementi '09 Andrea Clementi, Angelo Monti,
    Francesco Pasquale, and Riccardo Silvestri.
    "Information Spreading in Stationary Markovian
    Evolving Graphs", Informatica, 2009
  • Feig01 J. Feigenbaum, C. Papadimitriou, and S.
    Shenker, "Sharing the Cost of Multicast
    Transmissions", JCSS, 63, 21-41, 2001.
  • Grindrod '09 Peter Grindrod and Desmond J.
    Higham. "Evolving Graphs: Dynamical Models,
    Inverse Problems and Propagation", 2009
  • Johnson00 D. Johnson, M. Minkoff, S. Phillips,
    "The Prize Collecting Steiner Tree Problem:
    Theory and Practice", ACM SODA, 2000.
  • Leskovec '07 Jure Leskovec, Mary McGlohon,
    Christos Faloutsos, Natalie Glance, Matthew
    Hurst, "Cascading behavior in large blog graphs:
    Patterns and a Model", SDM, 2007
29
Background literature
  • Ribeiro '11 B. Ribeiro, D. Figueiredo, E. de
    Souza e Silva, and D. Towsley, "Characterizing
    Dynamic Graphs with Continuous-time Random
    Walks" SIGMETRICS 2011.
  • Snijders '09 Tom A.B. Snijders, Gerhard G. van
    de Bunt, Christian E.G. Steglich, "Introduction
    to Stochastic Actor-Based Models for Network
    Dynamics", Social Networks, 2009
  • Stoev '09 Stilian A. Stoev, George Michailidis,
    and Joel Vaughan. "Global Modeling and Prediction
    of Computer Network", Arxiv 2009
  • Zheng '05 A. X. Zheng and A. Goldenberg "A
     Generative Model for Dynamic Contextual
    Friendship Networks", Learning, 2005
