Measuring and Extracting Proximity in Networks - PowerPoint PPT Presentation

About This Presentation
Title:

Measuring and Extracting Proximity in Networks

Description:

Collection of edges or links. Collection of nodes ... walk starting at v1 for path P= v1-v2-...-vr, will follow this path is given by: ... – PowerPoint PPT presentation

Number of Views:19
Avg rating:3.0/5.0
Slides: 40
Provided by: stude76
Learn more at: https://www.cs.kent.edu
Category:

less

Transcript and Presenter's Notes

Title: Measuring and Extracting Proximity in Networks


1
Measuring and Extracting Proximity in Networks
  • By - Yehuda Koren, Stephen C.North and Chris
    Volinsky


  • - Rahul Sehgal

2
Introduction
  • Network
  • Information hidden in a network
  • What is Proximity?
  • Why do we need proximity in a network?
  • Methods for measuring and extracting proximities
    in a network.
  • CFEC ( Cycle free effective conductance) and how
    to compute cycle-free escape probability
  • Extracting proximity through proximity graphs
    obtained by CFEC
  • Working with large networks
  • Experiments
  • Conclusion
  • Questions

3
Network
  • Collection of edges or links
  • Collection of nodes
  • These edges and links help in deciding the
    proximity in a given network.

4
Hidden information in a network
C
D
E
B
A
F
5
Hidden information in a network
C
  • If two people speak on the phone to many common
    friends, the probability is high that they will
    talk to each other in the future, or perhaps that
    they already communicate through some other
    medium as email.

D
E
B
A
F
6
Hidden information in a network
C
  • If two nodes are connected to a common node then
    it might be possible that they have a strong
    relationship or they dont have any relationship
    at all.
  • for example in a telephone network where
    many people call to service center but it might
    be possible that two people who are calling they
    dont know each other at all.

D
E
B
A
F
7
What is proximity?
  • Proximity is a method of measuring distance or
    closeness between different objects.
  • It is a method of finding hidden relationship
    between the objects.
  • It is a method of finding similarities and helps
    in clustering objects or nodes.

8
Need of proximity
  • It measures potential information exchange
    between two non-linked objects through
    intermediaries
  • It can measure the extent to which two nodes
    belong to each other.
  • It helps in knowing the likelihood that a link
    will exist in future.
  • In social network settings proximity helps to
    predict or track the propagation of an idea,
    product or disease .
  • It helps in discovering unexpected communities in
    a network.

9
Various Methods
  • Graph-theoretic distance
  • It is length of shortest path connecting two
    nodes measured either by number of hops or sum of
    weight of edges
  • Limitations
  • Proximity decays as nodes become farther apart.
  • Information may be lost due to friction or noise
    at a particular node.
  • This method doesnt assume that proximity can
    exist via multiple paths.

10
Various Methods contd
  • Network Flow (maximal network flow)
  • Limited capacity is assigned to each edge,
    depending upon the weight of that edge and then
    compute the maximal number of units that can be
    simultaneously delivered from node s to node t.
  • It prefers high weight edges and captures the
    premise that an increasing number of alternative
    paths increases the proximity.
  • For example we consider the adjacent figure.

a1
B
A
b1
A
B
a1
11
Various Methods contd
  • Network Flow (maximal network flow)
  • Limitations
  • It disregards the length of the path.
  • It also follow that the maximal s-t flow (that is
    graph flow from node s to node t) in a graph
    equals the minimal s-t cut that is the minimal
    edge capacity we need to remove to disconnect s
    from t, therefore we cannot implement it in a
    robust system.

12
Various Methods contd
  • Effective Conductance (EC)
  • Modeling of networks as an electric circuit by
    treating the edges as resistors whose conductance
    is the given edge weight.
  • In this method we keep the starting node (s) as 1
    and the end node (t) as 0. and then we solve the
    linear equations for getting voltage and current
    on each edge.
  • It accounts for both path length and number of
    alternative paths .
  • It avoids dependence on single shortest path and
    bottlenecks
  • It has a monotonicity property which states that
    in an electrical resistor network, increasing the
    conductance of any resistor or increasing the
    number of resistors increase conductance between
    any two nodes.

13
Various Methods contd
  • Limitations of Effective Conductance-
  • Monotonicity has its limitations consider
    following example

s1
a1
t1
same EC
a1
t
s
14
Various Methods contd
  • Sink augmented effective conductance
  • Each node in a network is connected to a
    universal sink which is at voltage zero.
  • This universal sink competes with the node t.
  • Universal sink tax each node that absorbs a
    portion of out going current. Consequently it
    forces all the node to have degree greater than
    1. So, our restriction for degree one node
    doesnt exist any more.

15
Various Methods contd
  • Limitations of sink augmented effective
    conductance
  • Its required to know how much current will flow
    through each node . Understanding how such
    parameter will influence the proximity is a
    difficult task.
  • It destroys the concept of monotonicity . It
    means whenever a new node is added to the network
    it has a direct link to the sink but not to t. It
    strengthens sink and it compete with node t and
    thus the proximity between s and t is lost.
  • So, we look for a solution that can overcome the
    limitations of above methods..

16
Random Walk
  • Definitions
  • Random Walk is a transition from one state to
    another without depending on the previous state.
    The transition could be to the same state also.

t
a3
a4
s
a2
a1
17
Random Walk
  • Random walk in network proximity is the infinite
    number of attempts that is made to reach from
    starting node s to end node t. It might be
    possible that when we traverse this path we might
    return back to the starting node s.

s
t
18
Random Walk
  • Explanation
  • In network it might be possible that during
    random walk we might go back to s without going
    to t.
  • As we can see in the diagram it might be possible
    that we return back to s via path sa1-a2-a3-s
    without going to t.
  • We are having a cyclic path which is leading us
    back to starting point s.
  • continued gt

19
Random Walk
  • Diagram

a3
a2
s
a1
a5
t
a4
20
What is our goal?
  • We have to improve the effective conductance
    measure and avoid any cyclic path..

21
CFEC( Cycle Free Effective Conductance)
  • It considers random walk interpretation
  • DEFINITION The cycle-free escape probability
    from s to t is the probability that a random walk
    originating at s will reach t without visiting
    any node more than once.
  • In random walk, a probability of transition from
    node i to node j is
  • probability of transition
    from node i to node j
  • weight of edge from i
    to j
  • degree of node i.

22
CFEC( Cycle Free Effective Conductance)
  • The probability that a random walk starting at v1
    for path P v1-v2--vr, will follow this path is
    given by

23
Features of CFEC
  • We have following equalities
  • Multiplying by the degree
  • R set of simple paths from s to t, simple path
    are those that never visit the same node twice
  • CFEC discourages long paths as the probability of
    following the path decays exponentially with its
    length. It is given by
  • It supports proximity measure for multiple paths.
  • Degree-1 nodes dilute the significance of path
    from s to t. So we cut the main graph into
    sub-graph. And we preserve the original degree of
    the nodes.

24
Explanation of CFEC features
  • CFEC discourages long paths

ck
s
t
c1
c2
Proximity decreases
25
Computing Cycle Free Escape Probability
  • Restrict the sum in following equation to K most
    probable simple path.
  • Edge weights are transformed into edge lengths,
    establishing 1-1 correspondence between path
    probability and path length.
  • Path probability is given as

26
Extracting Proximity Through Proximity Graphs
  • Extract a subgraph that maximizes the
    ratio
  • subgraph
  • constant,
  • Find subset of R, which required for above
    problem.
  • Set of simple paths R is sorted in ascending
    order of weights of paths.

27
Extracting Proximity Through Proximity Graphs
  • Optimizing the given formula using branch and
    bound path merging algorithm

28
Branch and Bound path merging algorithm
29
Working with large networks
  • Growing candidate graphs via s, t neighborhoods
  • Producing the candidate graph is to find a sub
    graph containing the highest weight paths
    originating at either s or t.
  • Now, the problem becomes , find a sub graph
    containing shortest paths originating at either s
    or t. Our objective is to expand the
    neighborhoods of s and t. This is done by
    Dijkstras algorithm for computing shortest path
    on graphs with non-negative edge lengths.

30
Working with large networks contd.
  • Determining neighborhood size
  • we determine paths which are not longer than L
    log(). Path lengths greater than is are not
    useful. Here L is the length of shortest s-t
    path.
  • In practice, (L log())/2 neighborhoods might
    be too large.

31
Working with large networks contd.
  • Pruning the neighborhood
  • We prune the neighbor in such a way that
    dist(s,i)dist(t,i) gt ß , where s is our starting
    node and t is terminating node.
  • Then, we exclude from the neighborhood any i for
    which dist(s,i)dist(i,t) gt L log().

32
Experiments
  • Online movie database IMDB.
  • Co-authorship graph.
  • Telecommunication graph.

33
Experiments
34
Telecommunication Graph
  • Extract the candidate graph by growing
    neighborhoods from the two nodes of interest.
  • 2000 random pairs of telephone numbers to
    calculate the CFEC value.
  • For 1808(90) of these pairs paths between them
    were found. Others do not have any known path
    between them, this could be due to, these number
    were not in frequent use.
  • We have values for as 1,5,10 and 50.

35
Telecommunication Graph (contd)
36
Conclusion
  • CFEC proximity allows us to readily compute
    proximity graphs, which are small portions of the
    network that are aimed at capturing a related
    proximity value.
  • An analyst studying proximity in a graph has to
    focus on the most relevant part of the graph.
  • It is extension of connection graph which is
    capable of presenting compact relationship
    between objects of a network.
  • We can deduce relationship between more than two
    endpoints, the flexibility to handle edge
    direction, and the fact that they are obtained by
    solving an intuitively tunable optimization
    problem.

37
  • QUESTIONS??????

38
References
39
References
Write a Comment
User Comments (0)
About PowerShow.com