Title: Microscopic Evolution of Social Network
1 Microscopic Evolution of Social Network
Zheng Jiangchuan
2 Outline
1. Motivation
2. Introduction
3. Method Overview
4. Evaluation
3 Motivation
- Conventional social network study
- Primarily focuses on the static structure of social networks
- Reveals statistical network properties observed in real-world data: power-law degree distribution, small-world property, community structure
4 Motivation
- What is missing or rarely studied
- How did the social network we observe come about?
- What forces drive the social network to exhibit the noted static macroscopic structural properties?
- How will the social network evolve in the future?
5 Introduction
- Basic Intuition
- The answer lies in the laws governing the temporal evolution of the social network
- Since a social network is a self-organized network, these laws are primarily hidden in the temporal behaviors of individual nodes
6 Introduction
- Temporal Behaviors of Individual Node
Node Arrival Process
- At what rate does a new node arrive?
- How long will a new node stay active during its lifetime?
Edge Initiation and Selection Process
- When a node creates a new edge, which target will it most likely connect to?
- How long will a node sleep before creating a new edge?
7 Introduction
- Individual node behavior overview
- A node arrives at some rate
- The newly arrived node decides its active lifetime
- The node initiates its first edge to a specific target node
- The node goes to sleep for some time
- The node wakes up; if its lifetime has not expired, it selects a target node to connect to
- All nodes carry out this process simultaneously, collectively producing the macroscopic evolution of the social network
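The per-node process above can be sketched as a small event-driven simulation. This is only an illustration under placeholder assumptions that are not taken from the talk: Poisson node arrivals, exponentially distributed lifetimes and sleep times, a uniformly random target for every edge, and a node that keeps alternating between sleeping and creating edges until its lifetime expires. The function `simulate` and all parameter names are hypothetical.

```python
import heapq
import random


def simulate(num_nodes, arrival_rate, mean_lifetime, mean_sleep, seed=0):
    """Return the list of (source, target) edges created by the node process.

    Placeholder assumptions: Poisson arrivals, exponential lifetimes and
    sleep times, uniformly random edge targets.
    """
    rng = random.Random(seed)

    # Schedule every node's arrival as its first event.
    events = []  # min-heap of (time, node)
    t = 0.0
    for node in range(num_nodes):
        t += rng.expovariate(arrival_rate)
        heapq.heappush(events, (t, node))

    death = {}    # node -> time at which its lifetime expires
    arrived = []  # nodes present in the network so far
    edges = []
    while events:
        now, node = heapq.heappop(events)
        if node not in death:
            # Arrival: the node decides its active lifetime.
            death[node] = now + rng.expovariate(1.0 / mean_lifetime)
            arrived.append(node)
        elif now >= death[node]:
            continue  # woke up after its lifetime expired: no more edges

        # Create an edge (first edge on arrival, later edges on wake-up).
        others = [v for v in arrived if v != node]
        if others:
            edges.append((node, rng.choice(others)))

        # Go to sleep until the next wake-up.
        wake = now + rng.expovariate(1.0 / mean_sleep)
        heapq.heappush(events, (wake, node))
    return edges
```

Repeated edges between the same pair are not filtered, and the very first node has nobody to connect to on arrival. A fuller model would replace the uniform target choice and the placeholder distributions with fitted ones.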
8 Introduction
- Basic Task
- Develop a generative model capable of describing the evolution of a social network
- At an intuitive level, this model is specified by the process on the previous slide
- Every step in this process needs to be quantified mathematically from empirical observations
9 Introduction
- What can this generative model be used for?
- Provide mathematical insight into how a social network with the observed static properties comes about
- Predict the future evolution of the current social network
- Explain the temporal behaviors of humans in the social domain
10 Method Overview
- How to quantify each step in the generative process mathematically?
- Work out the mathematical expression for each step by estimating it from empirical temporal social network data, following the maximum likelihood estimation (MLE) principle
- For each step, select the model and parameters that maximize the likelihood of the observed data
11 Data Set
- Four data sets with temporal information
- Flickr (03/2003-09/2005)
- Delicious (05/2006-02/2007)
- Answers (03/2007-06/2007)
- LinkedIn (05/2003-10/2006)
12 Edge Attachment Process
- Basic method
- Evolve the network edge by edge; for every edge arriving in the network, measure the likelihood that its particular endpoints would be chosen under a given model
- Pick the model and associated parameters that maximize the sample likelihood
13 Edge Attachment Process
- Candidate models
- Before proceeding to the MLE experiments, we need to propose some candidate models
- Edge attachment by degree
- Edge attachment by age of the node
- Carry out some simple experiments to justify the proposed models qualitatively
14 Edge Attachment Process
- Edge attachment by degree
- The probability that a new edge connects to a specific node is proportional to that node's degree at the moment
- This agrees with intuition: people are more likely to know influential individuals
15 Edge Attachment Process
- Edge attachment by degree
- Simple experiments for justification, not MLE
- Plot the probability that a new edge connects to a node with a given degree
The experimental results match the degree preferential attachment model well
16 Edge Attachment Process
- Edge attachment by age of the node
- The probability that a new edge connects to a specific node is proportional to that node's age at the moment
- The intuition is that older, more experienced users of a social network are also more engaged and thus absorb more edges
17 Edge Attachment Process
- Edge attachment by age of the node
- Plot the number of edges absorbed by nodes of a given age, normalized by the number of nodes that have reached that age
The experimental results do not match the proposed model well, but it remains a possible candidate
18 Edge Attachment Process
- Maximum likelihood estimation
- Four models
- D: degree preferential attachment
- DR: combination of degree preferential attachment and uniformly random attachment
- A: age preferential attachment
- DA: combination of degree preferential attachment and age preferential attachment
- For each model and each data set, plot the sample likelihood with respect to the model parameters
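For model D, the likelihood computation might look like the sketch below. It assumes, as one possible parameterization that is not stated on the slide, that a target of degree d is chosen with probability proportional to d raised to a power `tau`. It also simplifies by scoring only edges whose target already has positive degree and by not excluding the source node from the candidate set. The names `log_likelihood` and `tau` are hypothetical.

```python
import math
from collections import defaultdict


def log_likelihood(edges, tau):
    """Log-likelihood of the observed edge targets under P(v) ~ degree(v)**tau.

    `edges` is a list of (source, target) pairs in arrival order.
    """
    degree = defaultdict(int)
    total = 0.0  # running sum of degree**tau over nodes with degree > 0
    ll = 0.0
    for u, v in edges:
        # Score the target using the network state just before this edge.
        if degree[v] > 0:
            ll += tau * math.log(degree[v]) - math.log(total)
        # Add the edge and keep the normalizing sum up to date.
        for node in (u, v):
            d = degree[node]
            if d > 0:
                total -= d ** tau
            degree[node] = d + 1
            total += (d + 1) ** tau
    return ll
```

Evaluating `log_likelihood(edges, tau)` over a grid of `tau` values and keeping the maximizer yields a likelihood-versus-parameter curve of the kind described above.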
19 Edge Attachment Process
- Maximum likelihood estimation
Conclusion: model D performs reasonably well compared to the more sophisticated variants based on degree and age
20 Locality of edge attachment
- Basic Intuition
- Although the degree preferential attachment model appears reasonable, it fails to take the locality of edge attachment into account
- Intuitively, people are more likely to connect to people with whom they share friends; that is, a new edge tends to span a small number of hops
21 Locality of edge attachment
- Experiments that empirically justify this intuition
- Plot the probability that a newly created edge spans a given number of hops
The results on real data do not match the preferential attachment (PA) model well in terms of the rate of decrease
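The number of hops a new edge spans can be measured as the shortest-path distance between its endpoints just before the edge is added. The following is a minimal sketch using breadth-first search on an undirected graph; the function names `bfs_distance` and `hops_spanned` are hypothetical, and treating the network as undirected is an assumption.

```python
from collections import defaultdict, deque


def bfs_distance(adj, src, dst):
    """Shortest-path length from src to dst, or None if dst is unreachable."""
    if src == dst:
        return 0
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        for nb in adj.get(node, ()):
            if nb == dst:
                return dist + 1
            if nb not in seen:
                seen.add(nb)
                queue.append((nb, dist + 1))
    return None


def hops_spanned(edges):
    """For each edge in arrival order, the hop distance it spans when created."""
    adj = defaultdict(set)
    result = []
    for u, v in edges:
        result.append(bfs_distance(adj, u, v))  # distance before adding the edge
        adj[u].add(v)
        adj[v].add(u)
    return result
```

A histogram of the non-`None` values gives the empirical hop distribution. A value of 2 means the new edge closes a triangle.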
22 Locality of edge attachment
- Insight from the experiments
- The double-exponential decrease of the edge probability with the number of hops suggests that newly created edges are very likely to span only a small number of hops, forming triangles
- So the degree preferential attachment model should be replaced by triangle-closing models, i.e., each new edge connects to a node two hops away
23 Locality of edge attachment
- Mathematical model of triangle-closing
- The edge-creation process can be decomposed into two steps: select a neighbor by some random rule, then select one of that neighbor's neighbors by a possibly different rule
- There are many possible triangle-closing models, depending on how the neighbor is selected at each step
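As an illustration of the two-step decomposition, the sketch below implements one simple variant in which both steps pick uniformly at random. The function name `close_triangle` is hypothetical, and excluding nodes already adjacent to the source is an assumption made here so that the chosen target is always exactly two hops away.

```python
import random


def close_triangle(adj, u, rng):
    """Pick a target for node u by two uniform random steps, or None.

    `adj` maps each node to the set of its neighbors (undirected graph).
    """
    if not adj[u]:
        return None
    # Step 1: choose an intermediate neighbor uniformly at random.
    first = rng.choice(sorted(adj[u]))
    # Step 2: choose one of that neighbor's neighbors uniformly at random,
    # skipping u itself and nodes u is already connected to.
    candidates = sorted(adj[first] - adj[u] - {u})
    if not candidates:
        return None
    return rng.choice(candidates)
```

Other triangle-closing variants would replace either `rng.choice` call with a different selection rule, for example one weighted by degree.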
24 Locality of edge attachment
- Select the best triangle-closing model using MLE
On average, the random-random triangle-closing model performs relatively well and will be used to describe the evolution.
25 Node lifetime
- Selected Model
- Similar maximum likelihood estimation experiments show that node lifetimes are best modeled by an exponential distribution
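For an exponential distribution, the maximum likelihood estimate of the rate has a closed form: the number of samples divided by their sum. The sketch below computes it together with the log-likelihood at the optimum. It assumes every lifetime is fully observed (no nodes still active at the end of the data), and the function name `fit_exponential` is hypothetical.

```python
import math


def fit_exponential(lifetimes):
    """Return (rate, log_likelihood) of the MLE exponential fit."""
    n = len(lifetimes)
    total = sum(lifetimes)
    rate = n / total  # MLE: reciprocal of the sample mean
    log_lik = n * math.log(rate) - rate * total
    return rate, log_lik
```

Comparing this log-likelihood with that of other candidate distributions fitted to the same lifetimes is one way to carry out the model selection described above.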
26 Time gap between edges
- Selected Model
- Intuitively, individuals with more friends are likely to make new friends sooner, so the gap distribution should differ for nodes of different degrees
- More precisely, the gap distribution should be conditional on the degree of the node
27 Time gap between edges
- Selected Model
- For a specific data set, estimating the parameters a and k for each degree d by maximum likelihood estimation shows how a and k depend on d
While a is a constant independent of d, k is a linear function of d, although the linear coefficient b varies across data sets
28 Complete Model
29 Evaluation
- Very novel and rigorous evaluation method
- Basic idea: for a specific data set, if the evolution model is correct, then the static properties of the final network computed mathematically from the model should be close to those observed in the data set
30 Evaluation
- Very novel and rigorous evaluation method
- 1. Based on the proposed evolution model, analytically derive a mathematical expression for the degree distribution of the final network as a function of the parameters in the evolution model
- 2. For a specific temporal social network data set, estimate the parameters needed by the evolution model
- 3. Substitute the estimated parameters into the mathematical expression for the degree distribution of the final network
- 4. Compare the result with the true degree distribution observed in the final snapshot of this social network data set
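Step 4 needs a degree-distribution parameter estimated directly from the final snapshot. The sketch below assumes the degree distribution is summarized by a power-law exponent and uses the standard continuous maximum likelihood estimator; the talk does not say which estimator was used, so this choice, the function name `powerlaw_exponent`, and the cutoff `d_min` are all assumptions.

```python
import math


def powerlaw_exponent(degrees, d_min=1):
    """Continuous MLE of the power-law exponent for degrees >= d_min."""
    tail = [d for d in degrees if d >= d_min]
    log_sum = sum(math.log(d / d_min) for d in tail)
    if log_sum == 0:
        raise ValueError("need at least one degree above d_min")
    return 1.0 + len(tail) / log_sum
```

The returned value can then be set side by side with the exponent predicted by the evolution model from its estimated parameters.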
31 Evaluation
Estimate the evolution-model parameters from the temporal social network data set
Compare the resulting prediction with the parameter of the degree distribution estimated directly from the real data set
32 Evaluation
Surprisingly Similar!
33 Thank you!
Q & A