Extrapolation Methods for Accelerating PageRank Computations

1
Extrapolation Methods for Accelerating PageRank
Computations
  • Sepandar D. Kamvar
  • Taher H. Haveliwala
  • Christopher D. Manning
  • Gene H. Golub
  • Stanford University

2
Motivation
  • Problem
      • Speed up PageRank
  • Motivation
      • Personalization
      • Freshness

Note: PageRank computations don't get faster as computers do.
3
Outline
  • Definition of PageRank
  • Computation of PageRank
  • Convergence Properties
  • Outline of Our Approach
  • Empirical Results

4
Link Counts
Sep's Home Page
Taher's Home Page
CS361
CNN
DB Pub Server
Yahoo!
Linked by 2 important pages
Linked by 2 unimportant pages
5
Definition of PageRank
  • The importance of a page is given by the
    importance of the pages that link to it.

6
Definition of PageRank
Taher
Sep
Yahoo!
CNN
DB Pub Server
7
PageRank Diagram
Node ranks: 0.333, 0.333, 0.333
Initialize all nodes to rank 1/n (0.333 for the 3 nodes shown).
8
PageRank Diagram
Rank sent along each link: 0.167, 0.333, 0.333, 0.167
Propagate ranks across links (multiplying by link weights).
9
PageRank Diagram
Node ranks: 0.5, 0.333, 0.167
10
PageRank Diagram
Rank sent along each link: 0.167, 0.5, 0.167, 0.167
11
PageRank Diagram
Node ranks: 0.333, 0.5, 0.167
12
PageRank Diagram
Node ranks: 0.4, 0.4, 0.2
After a while
13
Computing PageRank
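  • Initialize: $x_i^{(0)} = 1/n$ for every page $i$
  • Repeat until convergence: $x_i^{(k+1)} = \sum_{j \to i} x_j^{(k)} / \deg(j)$, where $\deg(j)$ is the number of outlinks of page $j$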
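
The initialize-and-propagate loop can be sketched in a few lines of Python. The three-page graph below is an assumption: it is one graph that reproduces the numbers on the diagram slides (0.333 everywhere at the start, then 0.5 / 0.333 / 0.167, and finally 0.4 / 0.4 / 0.2), not necessarily the one drawn in the original figure. The names `OUT_LINKS`, `propagate` and `simple_pagerank` are illustrative only.

```python
# Toy graph inferred from the numbers on the diagram slides (an assumption):
# page 0 links to pages 1 and 2, page 1 links to page 2, page 2 links to page 0.
OUT_LINKS = {0: [1, 2], 1: [2], 2: [0]}


def propagate(ranks, out_links):
    """One round: every page splits its rank evenly among its outlinks."""
    new_ranks = [0.0] * len(ranks)
    for page, targets in out_links.items():
        share = ranks[page] / len(targets)
        for target in targets:
            new_ranks[target] += share
    return new_ranks


def simple_pagerank(out_links, tol=1e-10, max_iter=1000):
    """Repeat propagation from a uniform start until the ranks stop changing."""
    n = len(out_links)
    ranks = [1.0 / n] * n
    for _ in range(max_iter):
        new_ranks = propagate(ranks, out_links)
        if sum(abs(a - b) for a, b in zip(new_ranks, ranks)) < tol:
            return new_ranks
        ranks = new_ranks
    return ranks


if __name__ == "__main__":
    print([round(r, 3) for r in simple_pagerank(OUT_LINKS)])  # [0.4, 0.2, 0.4]
```

This sketch has no damping and assumes every page has at least one outlink, which holds for the toy graph but not for the real web.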

14
Matrix Notation
15
Matrix Notation
Find $x$ that satisfies $x = P^T x$, where $P_{ij} = 1/\deg(i)$ if page $i$ links to page $j$ and 0 otherwise.
16
Power Method
  • Initialize: $x^{(0)} = (1/n, \dots, 1/n)^T$
  • Repeat until convergence: $x^{(k+1)} = P^T x^{(k)}$

17
A side note
  • PageRank doesn't actually use $P^T$. Instead, it
    uses $A = cP^T + (1-c)E^T$.
  • So the PageRank problem is really $x = Ax$,
  • not $x = P^T x$.

18
Power Method
  • And the algorithm is really . . .
  • Initialize: $x^{(0)} = (1/n, \dots, 1/n)^T$
  • Repeat until convergence: $x^{(k+1)} = A x^{(k)}$
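
A minimal Python sketch of the Power Method with $A = cP^T + (1-c)E^T$ follows. Several things here are assumptions rather than content from the slides: $E$ is taken to be the uniform teleportation matrix (every entry $1/n$), the default $c = 0.85$ is just a commonly quoted value, every page is assumed to have at least one outlink, and `GRAPH` is a hypothetical three-page example. The matrix $A$ is never formed explicitly; the two terms are applied separately, which is what keeps the iteration cheap on a sparse graph.

```python
# Hypothetical toy graph: page 0 -> 1, 2; page 1 -> 2; page 2 -> 0.
GRAPH = {0: [1, 2], 1: [2], 2: [0]}


def damped_power_method(out_links, c=0.85, tol=1e-10, max_iter=1000):
    """Iterate x <- A x with A = c*P^T + (1-c)*E^T, without forming A.

    Assumptions for this sketch: E is the uniform teleportation matrix
    (every entry 1/n) and every page has at least one outlink.
    Returns the final vector and the number of iterations used.
    """
    n = len(out_links)
    x = [1.0 / n] * n
    for iteration in range(1, max_iter + 1):
        # (1-c) * E^T x: each page receives an equal share of the total rank.
        teleport = (1.0 - c) * sum(x) / n
        new_x = [teleport] * n
        # c * P^T x: each page sends its rank evenly along its outlinks.
        for page, targets in out_links.items():
            share = c * x[page] / len(targets)
            for target in targets:
                new_x[target] += share
        residual = sum(abs(a - b) for a, b in zip(new_x, x))
        x = new_x
        if residual < tol:
            return x, iteration
    return x, max_iter


if __name__ == "__main__":
    ranks, iterations = damped_power_method(GRAPH)
    print([round(r, 4) for r in ranks], iterations)
```

Setting `c=1.0` removes the teleportation term and reduces the loop to the plain $x^{(k+1)} = P^T x^{(k)}$ iteration.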

19
Outline
  • Definition of PageRank
  • Computation of PageRank
  • Convergence Properties
  • Outline of Our Approach
  • Empirical Results

20
Power Method
Express x(0) in terms of eigenvectors of A
u1: 1
u2: $\alpha_2$
u3: $\alpha_3$
u4: $\alpha_4$
u5: $\alpha_5$
21
Power Method
u1: 1
u2: $\alpha_2 \lambda_2$
u3: $\alpha_3 \lambda_3$
u4: $\alpha_4 \lambda_4$
u5: $\alpha_5 \lambda_5$
22
Power Method
u1: 1
u2: $\alpha_2 \lambda_2^2$
u3: $\alpha_3 \lambda_3^2$
u4: $\alpha_4 \lambda_4^2$
u5: $\alpha_5 \lambda_5^2$
23
Power Method
u1: 1
u2: $\alpha_2 \lambda_2^k$
u3: $\alpha_3 \lambda_3^k$
u4: $\alpha_4 \lambda_4^k$
u5: $\alpha_5 \lambda_5^k$
24
Power Method
u1 1
u2 0
u3 0
u4 0
u5 0
25
Why does it work?
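  • Imagine our $n \times n$ matrix $A$ has $n$ linearly independent
    eigenvectors $u_i$.
  • Then any starting vector can be written as
    $x^{(0)} = u_1 + \alpha_2 u_2 + \dots + \alpha_n u_n$.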

26
Why does it work?
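  • From the last slide: $x^{(0)} = u_1 + \alpha_2 u_2 + \dots + \alpha_n u_n$
  • To get the first iterate, multiply $x^{(0)}$ by $A$.
  • First eigenvalue is 1.
  • Therefore: $x^{(1)} = A x^{(0)} = u_1 + \alpha_2 \lambda_2 u_2 + \dots + \alpha_n \lambda_n u_n$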

27
Power Method
u1: 1
u2: $\alpha_2$
u3: $\alpha_3$
u4: $\alpha_4$
u5: $\alpha_5$
28
Convergence
  • The smaller $|\lambda_2|$, the faster the convergence of the
    Power Method.

u1: 1
u2: $\alpha_2 \lambda_2^k$
u3: $\alpha_3 \lambda_3^k$
u4: $\alpha_4 \lambda_4^k$
u5: $\alpha_5 \lambda_5^k$
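
Written out as a formula, the convergence claim looks as follows. This is a sketch under two assumptions that the slides leave implicit: $A$ has $n$ linearly independent eigenvectors, and the eigenvalues are ordered so that $1 = \lambda_1 > |\lambda_2| \ge \dots \ge |\lambda_n|$.

```latex
x^{(k)} = A^k x^{(0)}
        = u_1 + \alpha_2 \lambda_2^k u_2 + \dots + \alpha_n \lambda_n^k u_n,
\qquad
\left\| x^{(k)} - u_1 \right\| = O\!\left( |\lambda_2|^k \right)
```

Every component other than $u_1$ shrinks by a factor $|\lambda_i|$ per iteration, so the slowest one to vanish is the $u_2$ component.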
29
Our Approach
Estimate the components of the current iterate in the
directions of the second and third eigenvectors, and
eliminate them.
u1
u2
u3
u4
u5
30
Why this approach?
  • For traditional problems:
      • $A$ is smaller, often dense.
      • $\lambda_2$ is often close to $\lambda_1$, making the power method
        slow.
  • In our problem:
      • $A$ is huge and sparse.
      • More importantly, $\lambda_2$ is small [1].
  • Therefore, the Power Method is actually much faster
    than other methods.
  • [1] "The Second Eigenvalue of the Google Matrix",
    dbpubs.stanford.edu/pub/2003-20

31
Using Successive Iterates
32
Using Successive Iterates
x(0)
x(1)
u1
33
Using Successive Iterates
x(0)
x(1)
x(2)
u1
34
Using Successive Iterates
x(0)
x(1)
x(2)
u1
35
Using Successive Iterates
x(0)
x(1)
x u1
36
How do we do this?
  • Assume $x^{(k)}$ can be written as a linear
    combination of the first three eigenvectors ($u_1$,
    $u_2$, $u_3$) of $A$.
  • Compute an approximation to the $u_2$ and $u_3$ components, and
    subtract it from $x^{(k)}$ to get an improved estimate of $u_1$.

37
Assume
  • Assume that $x^{(k)}$ can be represented by the first 3
    eigenvectors of $A$, i.e. $x^{(k)} = u_1 + \alpha_2 u_2 + \alpha_3 u_3$.
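
Under that assumption, each further multiplication by $A$ scales the $u_2$ and $u_3$ components by their eigenvalues, exactly as in the Power Method slides. A sketch of the resulting three iterates, where indexing them as $k$, $k+1$, $k+2$ is a labeling choice made here:

```latex
\begin{aligned}
x^{(k)}   &= u_1 + \alpha_2 u_2 + \alpha_3 u_3 \\
x^{(k+1)} &= A x^{(k)}   = u_1 + \alpha_2 \lambda_2 u_2 + \alpha_3 \lambda_3 u_3 \\
x^{(k+2)} &= A x^{(k+1)} = u_1 + \alpha_2 \lambda_2^2 u_2 + \alpha_3 \lambda_3^2 u_3
\end{aligned}
```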

38
Linear Combination
  • Let's take some linear combination of these 3
    iterates.

39
Rearranging Terms
  • We can rearrange the terms to get

Goal: Find $\beta_1, \beta_2, \beta_3$ so that the coefficients of $u_2$
and $u_3$ are 0, and the coefficient of $u_1$ is 1.
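
The rearrangement can be worked through explicitly. The sketch below assumes $x^{(k)} = u_1 + \alpha_2 u_2 + \alpha_3 u_3$ and that $x^{(k+1)}$ and $x^{(k+2)}$ are the next two power iterates; pairing $\beta_1$ with $x^{(k)}$, $\beta_2$ with $x^{(k+1)}$ and $\beta_3$ with $x^{(k+2)}$ is a labeling choice made here, not something the transcript preserves.

```latex
\begin{aligned}
\beta_1 x^{(k)} + \beta_2 x^{(k+1)} + \beta_3 x^{(k+2)}
  &= (\beta_1 + \beta_2 + \beta_3)\, u_1 \\
  &\quad + \alpha_2 \left( \beta_1 + \beta_2 \lambda_2 + \beta_3 \lambda_2^2 \right) u_2 \\
  &\quad + \alpha_3 \left( \beta_1 + \beta_2 \lambda_3 + \beta_3 \lambda_3^2 \right) u_3
\end{aligned}
```

The goal therefore amounts to three conditions:

```latex
\beta_1 + \beta_2 + \beta_3 = 1, \qquad
\beta_1 + \beta_2 \lambda_i + \beta_3 \lambda_i^2 = 0 \quad (i = 2, 3)
```

In other words, the polynomial $\beta_1 + \beta_2 \lambda + \beta_3 \lambda^2$ must vanish at $\lambda_2$ and $\lambda_3$ and equal 1 at $\lambda = 1$.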
40
Summary
  • We make an assumption about the current iterate.
  • Solve for dominant eigenvector as a linear
    combination of the next three iterates.
  • We use a few iterations of the Power Method to
    clean it up.
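
The steps above can be sketched in Python. This is an illustration of the idea under stated assumptions, not the authors' implementation: the coefficients are fitted from four successive iterates by least squares using the 2x2 normal equations (a simplification chosen here to keep the code short), and `step` hard-codes a hypothetical three-page graph with no damping. All function names are illustrative.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def quadratic_extrapolation(x0, x1, x2, x3):
    """Estimate the dominant eigenvector from four successive power iterates.

    Assumes x0 lies in the span of three eigenvectors of A, the first with
    eigenvalue 1, so that g0*x0 + g1*x1 + g2*x2 + x3 = 0 with g0 = -(g1+g2+1).
    """
    # Differences from x0 remove g0 from the problem: g1*y1 + g2*y2 = -y3.
    y1 = [a - b for a, b in zip(x1, x0)]
    y2 = [a - b for a, b in zip(x2, x0)]
    y3 = [a - b for a, b in zip(x3, x0)]
    # Least-squares fit via the 2x2 normal equations (Cramer's rule).
    a11, a12, a22 = dot(y1, y1), dot(y1, y2), dot(y2, y2)
    b1, b2 = -dot(y1, y3), -dot(y2, y3)
    det = a11 * a22 - a12 * a12
    g1 = (b1 * a22 - a12 * b2) / det
    g2 = (a11 * b2 - a12 * b1) / det
    # Dividing the cubic by (lambda - 1) gives the extrapolation weights.
    beta0, beta1, beta2 = g1 + g2 + 1.0, g2 + 1.0, 1.0
    estimate = [beta0 * a + beta1 * b + beta2 * c for a, b, c in zip(x1, x2, x3)]
    total = sum(estimate)
    return [value / total for value in estimate]


def step(x):
    """One power iteration on the toy graph 0->1, 0->2, 1->2, 2->0 (no damping)."""
    return [x[2], 0.5 * x[0], 0.5 * x[0] + x[1]]


if __name__ == "__main__":
    x0 = [1 / 3, 1 / 3, 1 / 3]
    x1 = step(x0)
    x2 = step(x1)
    x3 = step(x2)
    print([round(v, 6) for v in quadratic_extrapolation(x0, x1, x2, x3)])
```

With only three pages the three-eigenvector assumption holds exactly, so the extrapolated vector is already the stationary one (0.4, 0.2, 0.4). On a real web graph the assumption is only approximate, which is why a few further Power Method iterations are used to clean up the result.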

41
Outline
  • Definition of PageRank
  • Computation of PageRank
  • Convergence Properties
  • Outline of Our Approach
  • Empirical Results

42
Results
Quadratic Extrapolation speeds up convergence.
Extrapolation was only used 5 times!
43
Results
Extrapolation dramatically speeds up convergence
for high values of c (c = 0.99).
44
Take-home message
  • Speeds up PageRank by a fair amount, but not by
    enough for true Personalized PageRank.
  • Ideas are useful for further speedup algorithms.
  • Quadratic Extrapolation can be used for a whole
    class of problems.

45
The End
  • Paper available at http://dbpubs.stanford.edu/pub/2003-16