A Random-Surfer Web-Graph Model

Learn more at: http://www.cs.cmu.edu
1
A Random-Surfer Web-Graph Model
Mugizi Rwebangira
  • (Joint work with Avrim Blum and Hubert Chan)

2
The Web as a Graph
Consider the World Wide Web as a graph, with web
pages as nodes and hyperlinks between pages as
edges.
3
Studying the Web
  • Since the Web emerged there has been a lot of
    interest in:
      • Empirically studying properties of the Web Graph.
      • Modeling the Web Graph mathematically.
  • Benefits of generative models:
      • Simulation: when real data is scarce.
      • Extrapolation: how will the graph change?
      • Understanding: inspire further research on real
        data.

4
Power Law
$f(x) \sim g(x)$ if $\lim_{x \to \infty} f(x)/g(x) = 1$,
e.g. $(x+1) \sim (x+2)$.
The distribution of a random variable $X$ follows a
power law if $\Pr[X = k] \sim C k^{-\alpha}$.
Example: $\Pr[X = k] \sim k^{-2}$
5
Power Law: $\Pr[X = k] \sim k^{-2}$
6
Power Law
$\Pr[X = k] \sim C k^{-\alpha}$
$\log \Pr[X = k] \approx \log C - \alpha \log k$
$\Pr[X = k] \sim k^{-2}$
$\log \Pr[X = k] \approx -2 \log k$
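The log form above says that a power law appears as a straight line of slope -a on a log-log plot. A minimal sketch of that check follows; the helper name `loglog_slope` and the sample data are illustrative assumptions, not part of the talk.

```python
import math

def loglog_slope(ks, probs):
    """Least-squares slope of log(prob) against log(k)."""
    xs = [math.log(k) for k in ks]
    ys = [math.log(q) for q in probs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

ks = list(range(1, 101))
power_law = [k ** -2 for k in ks]    # proportional to k^-2 (unnormalized)
geometric = [0.5 ** k for k in ks]   # exponential decay, for contrast

slope_power = loglog_slope(ks, power_law)   # exponent recovered as the slope
slope_geom = loglog_slope(ks, geometric)    # much steeper; the points are not collinear
```

Normalizing the probabilities would only shift the line by log C and leave the slope unchanged.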
7
Power Law Log-Log plot
8
Power Law, contd.
More general definition:
$\Pr[X \ge k] \sim C k^{-\alpha}$
Particularly useful if $X$ takes on real values.
Sometimes referred to as "heavy tailed" or
"scale free".
9
Power Laws in Degree distribution
Let G be a graph.
Let $X_k$ be the proportion of nodes with degree $k$
in G.
Then if $X_k \sim C k^{-\alpha}$ we say that G has a power-law
degree distribution.
10
Properties of the Web Graph
A Power-law degree distribution has been observed
in a wide variety of graphs including citation
networks, social networks, protein-protein
interaction networks and so on.
It has also been observed in the Web Graph.
[Barabási & Albert]
11
Outline
  • Background/Previous Work
  • Motivation
  • Models
  • Theoretical results
  • Experimental results
  • Conclusions

12
Classic Random Graph Models
  • In the G(n,p) random graph model
  • There are n nodes.
  • There is an edge between any two nodes with
    probability p.
  • It was proposed by Erdős and Rényi in the 1960s.

13
Online G(n,p)
  • In this model each new node makes k connections
    to existing nodes uniformly at random.

For this talk we will focus on k = 1, hence the
graph will be a tree.
14
Online G(n,p)
15
Properties of Online G(n,p)
  • $E[\text{degree of first node}] = 1 + 1/2 + 1/3 + 1/4 + \dots + 1/n
    = \Theta(\log n)$
  • $E[\text{max degree}] = \Theta(\log n)$
  • $X_k$ = proportion of nodes with degree $k$
  • $E[X_k] = \Theta((1/2)^k)$

NOT POWER LAWED!!
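A small simulation can illustrate the geometric decay claimed above. This is a sketch under assumptions: it uses k = 1, counts in-degree only (edges received from later nodes), and the function name and seed are made up for illustration.

```python
import random
from collections import Counter

def online_uniform_tree(n, seed=0):
    """Each new node links to one earlier node chosen uniformly at random.

    Returns the in-degree of every node (number of later nodes that chose it).
    """
    rng = random.Random(seed)
    indegree = [0] * n
    for t in range(1, n):
        parent = rng.randrange(t)   # uniform over the t existing nodes 0..t-1
        indegree[parent] += 1
    return indegree

n = 100_000
indegree = online_uniform_tree(n)
counts = Counter(indegree)
fractions = {k: counts[k] / n for k in sorted(counts)}
# The fraction of nodes with in-degree k roughly halves each time k grows by 1,
# so the tail decays exponentially rather than polynomially.
```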
16
Online G(n,p) (n = 100,000, average of 100 runs)
17
Preferential Attachment
In the Preferential Attachment model, each
new node connects to the existing nodes with a
probability proportional to their degree.
[Barabási & Albert]
18
Preferential Attachment
19
Preferential Attachment
$E[\text{degree of 1st node}] \approx \sqrt{n}$
Preferential Attachment gives a power-law degree
distribution. [Mitzenmacher; Cooper & Frieze 03;
KRRSTU00]
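For comparison with uniform attachment, here is a hedged sketch of preferential attachment with one edge per new node. Two details are assumptions rather than anything the slides specify: the graph starts as a single node with a self-loop, and degree counts both edge endpoints. Sampling a uniform entry from the list of edge endpoints picks a node with probability proportional to its degree.

```python
import random

def preferential_attachment_tree(n, seed=0):
    """Each new node attaches to an existing node chosen proportionally to degree."""
    rng = random.Random(seed)
    endpoints = [0, 0]                 # assumed start: node 0 with a self-loop
    degree = [2] + [0] * (n - 1)
    for new in range(1, n):
        target = rng.choice(endpoints)  # uniform endpoint = degree-biased node
        endpoints.extend((new, target))
        degree[new] += 1
        degree[target] += 1
    return degree

n = 100_000
degree = preferential_attachment_tree(n)
largest = max(degree)   # typically on the order of sqrt(n), far above log n
```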
20
Preferential Attachment
21
Other Models
Kumar et al. proposed the copying model.
[KRRSTU00] Leskovec et al. proposed a forest
fire model, which has some similarities to this
work. [LKF05]
22
Outline
  • Background/Previous Work
  • Motivation
  • Models
  • Theoretical results
  • Experimental results
  • Conclusions

23
Motivating Questions
  • Why would a new node connect to nodes of high
    degree?
  • Are high degree nodes more attractive?
  • Or are there other explanations?

How does a new node find out what the high degree
nodes are?
24
Motivating Questions
Motivating Observation
  • Suppose each page has a small probability p of
    being interesting.
  • Suppose a user does an (undirected) random walk
    until they find an interesting page.
  • If p is small then this is essentially the same as
    preferential attachment, since a long undirected
    random walk visits each node with frequency
    proportional to its degree.
  • What about other processes and directed graphs?

25
Outline
  • Background/Previous Work
  • Motivation
  • Models
  • Theoretical results
  • Experimental results
  • Conclusions

26
Directed 1-step Random Surfer, p = 0.5
27
Directed 1-step Random Surfer
It turns out this model is a mixture of
connecting to nodes uniformly at random and
preferential attachment.
Has a power-law degree distribution.
But taking one step is not very natural.
What about doing a real random walk?
28
Directed Coin Flipping model
  1. Pick a node uniformly at random.
  2. Flip a coin of bias p.
     If HEADS, connect to the current node; else walk to a
     neighbor and flip again.

[Figure: nodes A, B, C, D and the NEW NODE. The walk begins at the RANDOM STARTING NODE A. Coin tosses: 1. TAIL at node A, 2. TAIL at node B, 3. HEAD at node C, so the new node connects to C.]
29
Directed Coin Flipping model
  1. At time 1, we start with a single node with a
    self-loop.
  2. At time t, a new node arrives and we choose an
    existing node u uniformly at random.
  3. We then flip a coin of bias p.
  4. If the coin comes up heads, the new node connects
    to the current node.
  5. Else we walk to a random neighbor and go to step
    3.

(Each page has equal probability p of being
interesting to us.)
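The steps above can be sketched as a short simulation. It assumes out-degree 1, so "a random neighbor" is the single node the current node points to, and it counts in-degree from other nodes only; the function name and seed are illustrative.

```python
import random

def directed_coin_flipping(n, p, seed=0):
    """Grow a graph with n nodes using the directed coin-flipping rule.

    parent[v] is the node that v links to; node 0 links to itself (self-loop).
    Returns (parent, indegree), where indegree ignores the self-loop.
    """
    rng = random.Random(seed)
    parent = [0]
    indegree = [0]
    for new in range(1, n):
        current = rng.randrange(new)     # step 2: uniform existing node
        while rng.random() >= p:         # tails with probability 1 - p
            current = parent[current]    # step 5: follow the out-edge
        parent.append(current)           # step 4: heads, connect here
        indegree.append(0)
        indegree[current] += 1
    return parent, indegree

parent, indegree = directed_coin_flipping(n=20_000, p=0.5)
```

Because of the self-loop, a walk that reaches node 0 simply stays there until the coin comes up heads, so the loop terminates with probability 1 for any p > 0.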
30
Outline
  • Background/Previous Work
  • Motivation
  • Models
  • Theoretical results
  • Experimental results
  • Conclusions

31
Is Directed Coin-Flipping Power-lawed?
We don't know, but we do have some partial
results ...
32
Virtual Degree
Definitions
Let $l_i(u)$ be the number of level-$i$ descendants of
node $u$: $l_1(u)$ = # of children, $l_2(u)$ = # of
grandchildren, etc.
Let $\beta = (\beta_1, \beta_2, \dots)$ be a sequence of real numbers
with $\beta_1 = 1$.
Then $v_\beta(u) = 1 + \beta_1 l_1(u) + \beta_2 l_2(u) + \beta_3 l_3(u) + \dots$
We'll call $v_\beta(u)$ the virtual degree
of $u$ with respect to $\beta$.
33
Virtual Degree
34
Virtual Degree
Easy observation: If we set $\beta_i = (1-p)^i$ then the
expected increase in $\deg(u)$ is proportional to
$v(u)$.
Expected increase in $\deg(u)$ = $p/t + (1-p)\,p\,l_1(u)/t
+ (1-p)^2\,p\,l_2(u)/t + \dots = (p/t)\,v(u)$
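The observation can be checked exactly on a small tree. The sketch below computes, in two independent ways, the probability that the next node attaches to each existing node. The example tree, the helper names, and the use of exact fractions are assumptions for illustration. The identity is checked for non-root nodes only, because the self-loop at the root absorbs all of the remaining walk probability.

```python
from fractions import Fraction

def stop_probabilities(parent, p):
    """Exact probability that the walk stops at each node (uniform start)."""
    t = len(parent)
    mass = [Fraction(1, t)] * t          # probability of arriving at each node
    stop = [Fraction(0)] * t
    for w in range(t - 1, 0, -1):        # children have larger indices than parents
        stop[w] = p * mass[w]
        mass[parent[w]] += (1 - p) * mass[w]
    stop[0] = mass[0]                    # the root's self-loop absorbs the rest
    return stop

def virtual_degrees(parent, p):
    """v(u) = 1 + sum_i (1-p)^i * l_i(u), with l_i = number of level-i descendants."""
    t = len(parent)
    v = [Fraction(1)] * t
    for w in range(1, t):
        ancestor, level = parent[w], 1
        while True:
            v[ancestor] += (1 - p) ** level
            if ancestor == 0:
                break
            ancestor, level = parent[ancestor], level + 1
    return v

parent = [0, 0, 0, 1, 1, 3, 2]           # hypothetical 7-node tree, root 0
p = Fraction(1, 4)
t = len(parent)
stop = stop_probabilities(parent, p)
v = virtual_degrees(parent, p)
matches = all(stop[u] == (p / t) * v[u] for u in range(1, t))
```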
35
Virtual Degree
  • Theorem: There always exist $\beta_i$ such that
      • For $i \ge 1$, $|\beta_i| \le 1$.
      • As $i \to \infty$, $\beta_i \to 0$ exponentially.
      • The expected increase in $v(u)$ is proportional to
        $v(u)$.

Recurrence: $\beta_1 = 1$, $\beta_2 = p$, $\beta_{i+1} = \beta_i - (1-p)\,\beta_{i-1}$
E.g., for $p = 3/4$, $\beta_i$ = 1, 3/4, 1/2, 5/16, 3/16,
7/64, ...
for $p = 1/2$, $\beta_i$ = 1, 1/2, 0, -1/4, -1/4,
-1/8, 0, 1/16, ...
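The recurrence is easy to evaluate exactly. The following sketch (the function name `beta_sequence` is an assumption) reproduces the two example sequences using rational arithmetic.

```python
from fractions import Fraction

def beta_sequence(p, count):
    """First `count` terms of beta_1 = 1, beta_2 = p, beta_{i+1} = beta_i - (1-p)*beta_{i-1}."""
    betas = [Fraction(1), Fraction(p)]
    while len(betas) < count:
        betas.append(betas[-1] - (1 - p) * betas[-2])
    return betas[:count]

three_quarters = beta_sequence(Fraction(3, 4), 6)
# -> 1, 3/4, 1/2, 5/16, 3/16, 7/64
one_half = beta_sequence(Fraction(1, 2), 8)
# -> 1, 1/2, 0, -1/4, -1/4, -1/8, 0, 1/16
```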
36
Virtual Degree, continued
Let $v_t(u)$ be the virtual degree of node $u$ at time
$t$ and $t_u$ be the time when node $u$ first appears.
Theorem: For any node $u$ and time $t \ge t_u$,
$E[v_t(u)] = \Theta((t/t_u)^p)$
So, the expected virtual degrees follow a power
law.
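One way to see where the exponent p could come from is sketched below. This is not the proof from the talk, which the slides do not give. It assumes the expected one-step increase in the virtual degree is exactly $(p/t)\,v_t(u)$, the same constant as in the easy observation for $\deg(u)$, that $v_{t_u}(u) = 1$, and it ignores the special role of the root's self-loop.

```latex
\begin{align*}
E[v_{t+1}(u)] &= \Bigl(1 + \frac{p}{t}\Bigr)\, E[v_t(u)]
  && \text{(assumed one-step growth)} \\
E[v_t(u)] &= \prod_{s=t_u}^{t-1} \Bigl(1 + \frac{p}{s}\Bigr)
  && \text{(since } v_{t_u}(u) = 1\text{)} \\
\ln E[v_t(u)] &= \sum_{s=t_u}^{t-1} \ln\Bigl(1 + \frac{p}{s}\Bigr)
  = p \sum_{s=t_u}^{t-1} \frac{1}{s} + O\Bigl(\sum_{s \ge t_u} \frac{1}{s^2}\Bigr)
  = p \ln\frac{t}{t_u} + O(1) \\
E[v_t(u)] &= \Theta\bigl((t/t_u)^p\bigr)
\end{align*}
```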
37
Actual Degree
We can also obtain lower bounds on the expected
values of the actual degrees
Theorem: For any node $u$ and time $t \ge t_u$,
$E[\text{degree}(u)] = \Omega((t/t_u)^{p(1-p)})$
38
Outline
  • Background/Previous Work
  • Motivation
  • Models
  • Theoretical results
  • Experimental results
  • Conclusions

39
Experiments
  • Random graphs of n = 100,000 nodes
  • Compute statistics averaged over 100 runs.
  • k = 1 (Every node has out-degree 1)

40
Online Erdős-Rényi
41
Directed 1-Step Random Surfer, p = 3/4
42
Directed 1-Step Random Surfer, p = 1/2
43
Directed 1-Step Random Surfer, p = 1/4
44
Directed Coin Flipping, p = 1/2
45
Directed Coin Flipping, p = 1/4
46
Undirected Coin Flipping, p = 1/2
47
Undirected Coin Flipping, p = 0.05
48
Outline
  • Background/Previous Work
  • Motivation
  • Models
  • Theoretical results
  • Experimental results
  • Conclusions

49
Conclusions
  • Directed random walk models appear to generate
    power laws (supported by partial theoretical results).

Power laws can naturally emerge, even if all
nodes have the same intrinsic attractiveness.
50
Open questions
  • Can we prove that the degrees in the directed
    coin-flipping model do indeed follow a power law?
  • Can we analyze the degree distribution for the
    undirected coin-flipping model with p = 1/2?
  • Suppose page i has interestingness $p_i$. Can we
    analyze the degree as a function of t, i and $p_i$?

51
Questions?