Title: Extrapolation Methods for Accelerating PageRank Computations

1
Extrapolation Methods for Accelerating PageRank
Computations

Sepandar D. Kamvar
Taher H. Haveliwala
Christopher D. Manning
Gene H. Golub
Stanford University

2
Motivation

Problem: Speed up PageRank
Motivation: Personalization, Freshness

Note: PageRank computations don't get faster as
computers do.
3
Outline

Definition of PageRank
Computation of PageRank
Convergence Properties
Outline of Our Approach
Empirical Results

4
Link Counts
Sep's Home Page
Taher's Home Page
CS361
CNN
DB Pub Server
Yahoo!
Linked by 2 Important Pages
Linked by 2 Unimportant pages
5
Definition of PageRank

The importance of a page is given by the
importance of the pages that link to it.

6
Definition of PageRank
Taher
Sep
Yahoo!
CNN
DB Pub Server
7
PageRank Diagram
0.333
0.333
0.333
Initialize all nodes to rank 1/n (here 1/3, about 0.333)
8
PageRank Diagram
0.167
0.333
0.333
0.167
Propagate ranks across links (multiplying by link
weights)
9
PageRank Diagram
0.5
0.333
0.167
10
PageRank Diagram
0.167
0.5
0.167
0.167
11
PageRank Diagram
0.333
0.5
0.167
12
PageRank Diagram
0.4
0.4
0.2
After a while
13
Computing PageRank

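Initialize: x_i(0) = 1/n for every page i
Repeat until convergence: x_i(k+1) = sum, over the pages j that link to i, of x_j(k) / deg(j), where deg(j) is the number of out-links of page j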
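
The propagation steps on the PageRank Diagram slides can be sketched in a few lines of Python. The three-page graph used here (A links to B and C, B links to C, C links to A) is an assumption: the slide graph itself is not part of the transcript, but this graph reproduces the slide values (0.333, 0.167, 0.5 after one step and 0.4, 0.2, 0.4 in the limit). The names `links`, `propagate`, and `iterate_ranks` are illustrative only.

```python
# Hypothetical three-page graph (an assumption, chosen to match the slide values).
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}


def propagate(ranks, links):
    """One step: each page splits its rank evenly among the pages it links to."""
    new_ranks = {page: 0.0 for page in links}
    for page, targets in links.items():
        share = ranks[page] / len(targets)
        for target in targets:
            new_ranks[target] += share
    return new_ranks


def iterate_ranks(links, tol=1e-12, max_iter=1000):
    """Start from uniform ranks and propagate until the ranks stop changing."""
    n = len(links)
    ranks = {page: 1.0 / n for page in links}
    for _ in range(max_iter):
        new_ranks = propagate(ranks, links)
        change = sum(abs(new_ranks[p] - ranks[p]) for p in links)
        ranks = new_ranks
        if change < tol:
            break
    return ranks


first_step = propagate({page: 1.0 / 3 for page in links}, links)
final_ranks = iterate_ranks(links)
```

This version assumes every page has at least one out-link; a page with none would need separate handling.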

14
Matrix Notation
15
Matrix Notation
Find x that satisfies x = P^T x
16
Power Method

Initialize: x(0) = (1/n, ..., 1/n)
Repeat until convergence: x(k+1) = P^T x(k)

17
A side note

PageRank doesn't actually use P^T. Instead, it
uses A = cP^T + (1-c)E^T.
So the PageRank problem is really Ax = x,
not P^T x = x.

18
Power Method

And the algorithm is really . . .
Initialize: x(0) = (1/n, ..., 1/n)
Repeat until convergence: x(k+1) = A x(k)
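
As a rough sketch of the iteration with A = cP^T + (1-c)E^T, the Python below applies A without ever forming it. It assumes a uniform teleport distribution (so that E^T x is the uniform vector whenever x sums to 1) and assumes every page has at least one out-link; neither assumption is stated on the slides. The function name `pagerank_power`, the default c = 0.85, and the small test graph are illustrative choices.

```python
def pagerank_power(out_links, c=0.85, tol=1e-10, max_iter=10000):
    """Power method for x = A x with A = c P^T + (1 - c) E^T.

    out_links[j] lists the pages that page j links to (pages are 0..n-1).
    Assumes uniform teleportation and no pages without out-links.
    Returns (ranks, number_of_iterations).
    """
    n = len(out_links)
    x = [1.0 / n] * n
    for iteration in range(1, max_iter + 1):
        # Teleport term: (1 - c) * E^T x is uniform because x sums to 1.
        new_x = [(1.0 - c) / n] * n
        # Link term: c * P^T x, pushing rank along each out-link.
        for j, targets in enumerate(out_links):
            share = c * x[j] / len(targets)
            for i in targets:
                new_x[i] += share
        residual = sum(abs(a - b) for a, b in zip(new_x, x))
        x = new_x
        if residual < tol:
            return x, iteration
    return x, max_iter


# Hypothetical graph: 0 -> 1, 2; 1 -> 2; 2 -> 0.
ranks, iterations = pagerank_power([[1, 2], [2], [0]])
```

Working from the out-link lists keeps the cost of each iteration proportional to the number of links, which is what makes the method practical when A is huge and sparse.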

19
Outline

Definition of PageRank
Computation of PageRank
Convergence Properties
Outline of Our Approach
Empirical Results

20
Power Method
Express x(0) in terms of eigenvectors of A
u1 1
u2 α2
u3 α3
u4 α4
u5 α5
21
Power Method
u1 1
u2 α2 λ2
u3 α3 λ3
u4 α4 λ4
u5 α5 λ5
22
Power Method
u1 1
u2 α2 λ2^2
u3 α3 λ3^2
u4 α4 λ4^2
u5 α5 λ5^2
23
Power Method
u1 1
u2 α2 λ2^k
u3 α3 λ3^k
u4 α4 λ4^k
u5 α5 λ5^k
24
Power Method
u1 1
u2 0
u3 0
u4 0
u5 0
25
Why does it work?

Imagine our n x n matrix A has n distinct
eigenvectors ui.

26
Why does it work?

From the last slide
To get the first iterate, multiply x(0) by A.
First eigenvalue is 1.
Therefore
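
The equations for this step are missing from the transcript. A plausible reconstruction, writing α_i for the coefficients and λ_i for the eigenvalues as on the neighboring Power Method slides, and assuming the eigenvalues are ordered so that 1 = λ_1 > |λ_2| ≥ ... ≥ |λ_n|, is:

```latex
\begin{align*}
x^{(0)} &= u_1 + \alpha_2 u_2 + \alpha_3 u_3 + \dots + \alpha_n u_n \\
x^{(1)} = A x^{(0)} &= u_1 + \alpha_2 \lambda_2 u_2 + \alpha_3 \lambda_3 u_3 + \dots + \alpha_n \lambda_n u_n \\
x^{(k)} = A^k x^{(0)} &= u_1 + \alpha_2 \lambda_2^k u_2 + \alpha_3 \lambda_3^k u_3 + \dots + \alpha_n \lambda_n^k u_n
\end{align*}
```

Because |λ_i| < 1 for every i ≥ 2, each term except u1 shrinks geometrically as k grows.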

27
Power Method
u1 1
u2 α2
u3 α3
u4 α4
u5 α5
28
Convergence

The smaller |λ2|, the faster the convergence of the
Power Method.

u1 1
u2 α2 λ2^k
u3 α3 λ3^k
u4 α4 λ4^k
u5 α5 λ5^k
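
To get a feel for how strongly the second eigenvalue controls the running time, the sketch below estimates how many iterations it takes for the slowest-decaying term to fall below a tolerance, using the condition |λ2|^k ≤ tol. It ignores the size of the coefficient on u2 and all other eigenvalues, so it is only a rule of thumb. The function name `iterations_needed` and the sample values are illustrative.

```python
import math


def iterations_needed(lambda2, tol):
    """Approximate smallest k with |lambda2|**k <= tol.

    Valid for 0 < |lambda2| < 1 and 0 < tol < 1.
    """
    return math.ceil(math.log(tol) / math.log(abs(lambda2)))


# Compare a small, a moderate, and a near-1 second eigenvalue.
for lambda2 in (0.5, 0.85, 0.99):
    print(lambda2, iterations_needed(lambda2, 1e-3))
```

The count grows quickly as the second eigenvalue approaches 1, which is the regime where acceleration matters most.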
29
Our Approach
Estimate the components of the current iterate in the
directions of the second and third eigenvectors, and
eliminate them.
u1
u2
u3
u4
u5
30
Why this approach?

For traditional problems:
A is smaller, often dense.
λ2 is often close to λ1, making the power method
slow.
In our problem:
A is huge and sparse.
More importantly, λ2 is small [1].
Therefore, the Power Method is actually much faster
than other methods.

[1] The Second Eigenvalue of the Google Matrix,
dbpubs.stanford.edu/pub/2003-20

31
Using Successive Iterates
32
Using Successive Iterates
x(0)
x(1)
u1
33
Using Successive Iterates
x(0)
x(1)
x(2)
u1
34
Using Successive Iterates
x(0)
x(1)
x(2)
u1
35
Using Successive Iterates
x(0)
x(1)
x u1
36
How do we do this?

Assume x(k) can be written as a linear
combination of the first three eigenvectors (u1,
u2, u3) of A.
Compute an approximation to the u2 and u3 components, and subtract it
from x(k) to get an improved estimate of u1.

37
Assume

Assume x(k) can be represented by the first 3
eigenvectors of A

38
Linear Combination

Let's take some linear combination of these 3
iterates.

39
Rearranging Terms

We can rearrange the terms to get

Goal: Find b1, b2, b3 so that the coefficients of u2
and u3 are 0, and the coefficient of u1 is 1.
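
The equations on the last three slides did not survive in the transcript. One reconstruction consistent with the stated goal, assuming the three iterates are x(k), x(k+1), and x(k+2), and writing α_2, α_3 for the coefficients of x(k) on u2, u3 and λ_2, λ_3 for the matching eigenvalues, is:

```latex
\begin{align*}
x^{(k)}   &= u_1 + \alpha_2 u_2 + \alpha_3 u_3 \\
x^{(k+1)} &= u_1 + \alpha_2 \lambda_2 u_2 + \alpha_3 \lambda_3 u_3 \\
x^{(k+2)} &= u_1 + \alpha_2 \lambda_2^2 u_2 + \alpha_3 \lambda_3^2 u_3 \\[4pt]
b_1 x^{(k)} + b_2 x^{(k+1)} + b_3 x^{(k+2)}
  &= (b_1 + b_2 + b_3)\, u_1 \\
  &\quad + \alpha_2 \left(b_1 + b_2 \lambda_2 + b_3 \lambda_2^2\right) u_2 \\
  &\quad + \alpha_3 \left(b_1 + b_2 \lambda_3 + b_3 \lambda_3^2\right) u_3
\end{align*}
```

Under this reading, the goal amounts to b1 + b2 + b3 = 1 together with λ2 and λ3 being roots of the quadratic b1 + b2 λ + b3 λ^2.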
40
Summary

We make an assumption about the current iterate.
Solve for dominant eigenvector as a linear
combination of the next three iterates.
We use a few iterations of the Power Method to
clean it up.
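
A minimal Python sketch of the extrapolation step follows. The transcript does not show how the coefficients are computed, so the sketch assumes one concrete recipe: take four successive iterates, fit the coefficients of a cubic polynomial with a root at 1 by least squares, and divide out that root to get the weights b1, b2, b3. The function names (`dot`, `quadratic_extrapolation`, `step`) and the three-page test graph are illustrative, and the code assumes the difference vectors are not parallel.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def quadratic_extrapolation(x0, x1, x2, x3):
    """Estimate the dominant eigenvector from four successive iterates.

    Assumes x0 lies (approximately) in the span of the first three
    eigenvectors and that the dominant eigenvalue is 1.
    """
    # Differences from the oldest iterate.
    y1 = [a - b for a, b in zip(x1, x0)]
    y2 = [a - b for a, b in zip(x2, x0)]
    y3 = [a - b for a, b in zip(x3, x0)]
    # Least-squares solve of g1*y1 + g2*y2 = -y3 (with g3 fixed to 1)
    # through the 2x2 normal equations and Cramer's rule.
    a11, a12, a22 = dot(y1, y1), dot(y1, y2), dot(y2, y2)
    r1, r2 = -dot(y1, y3), -dot(y2, y3)
    det = a11 * a22 - a12 * a12
    if det == 0.0:
        raise ValueError("difference vectors are parallel; cannot extrapolate")
    g1 = (r1 * a22 - a12 * r2) / det
    g2 = (a11 * r2 - a12 * r1) / det
    g3 = 1.0
    # Divide the cubic by (lambda - 1) to get the combination weights.
    b1, b2, b3 = g1 + g2 + g3, g2 + g3, g3
    estimate = [b1 * p + b2 * q + b3 * r for p, q, r in zip(x1, x2, x3)]
    total = sum(estimate)
    return [value / total for value in estimate]


def step(x):
    """One power-method step on a hypothetical graph: 0 -> 1, 2; 1 -> 2; 2 -> 0."""
    return [x[2], x[0] / 2, x[0] / 2 + x[1]]


x0 = [1 / 3, 1 / 3, 1 / 3]
x1 = step(x0)
x2 = step(x1)
x3 = step(x2)
extrapolated = quadratic_extrapolation(x0, x1, x2, x3)
```

On this three-page graph the three-eigenvector assumption holds exactly, so the estimate agrees with the limiting ranks up to rounding. On a large web graph the assumption is only approximate, which is why a few ordinary Power Method iterations follow each extrapolation.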

41
Outline

Definition of PageRank
Computation of PageRank
Convergence Properties
Outline of Our Approach
Empirical Results

42
Results
Quadratic Extrapolation speeds up convergence.
Extrapolation was only used 5 times!
43
Results
Extrapolation dramatically speeds up convergence
for high values of c (c = 0.99)
44
Take-home message

Speeds up PageRank by a fair amount, but not by
enough for true Personalized PageRank.
Ideas are useful for further speedup algorithms.
Quadratic Extrapolation can be used for a whole
class of problems.

45
The End