Online Topological Ordering

1
Online Topological Ordering
  • Siddhartha Sen, COS 518
  • 11/20/2007

2
Outline
  • Problem statement and motivation
  • Prior work (summary)
  • Result by Ajwani et al.
  • Algorithm
  • Correctness
  • Running time
  • Implementation
  • Comparison to prior work
  • Incremental complexity analysis
  • Practical implications
  • Open problems
  • Breaking news

3
Problem statement
  • Offline or static version (STO)
  • Given a DAG G = (V, E) (with n = |V| and m = |E|),
    find a linear ordering T of its nodes such that
    for all directed paths from x ∈ V to y ∈ V (x ≠
    y), T(x) < T(y), where T: V → [1..n] is a
    bijective mapping
  • Online version (DTO)
  • Edges of G are not known beforehand, but are
    revealed one by one
  • Each time an edge is added to the graph, T must
    be updated
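
To make the two problem versions concrete, here is a minimal Python sketch. It assumes nodes are hashable labels and edges are (x, y) pairs; the function names are illustrative and do not come from the slides. The static version uses Kahn's algorithm, and the naive online baseline simply recomputes T from scratch after every insertion, which is the cost the algorithms in this talk try to beat.

```python
from collections import deque


def static_topological_order(nodes, edges):
    """Return T as a dict node -> position in 1..n (Kahn's algorithm).

    Raises ValueError if the edges contain a cycle.
    """
    succ = {w: [] for w in nodes}
    indegree = {w: 0 for w in nodes}
    for x, y in edges:
        succ[x].append(y)
        indegree[y] += 1
    ready = deque(w for w in nodes if indegree[w] == 0)
    T = {}
    while ready:
        w = ready.popleft()
        T[w] = len(T) + 1
        for y in succ[w]:
            indegree[y] -= 1
            if indegree[y] == 0:
                ready.append(y)
    if len(T) != len(nodes):
        raise ValueError("graph has a cycle")
    return T


def is_valid(T, edges):
    """Checking every edge is enough: T(x) < T(y) then holds along every path."""
    return all(T[x] < T[y] for x, y in edges)


def naive_online(nodes, edge_stream):
    """Baseline for the online version: recompute T after each revealed edge."""
    edges = []
    for edge in edge_stream:
        edges.append(edge)
        yield static_topological_order(nodes, edges)
```

For example, `static_topological_order(["a", "b", "c"], [("c", "a"), ("a", "b")])` places c first, then a, then b.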

4
Problem statement
[Figure: inserting u → v invalidates the topological order. Node labels: a, b, c, d, u, v; the affected region is marked]
5
Motivation
  • Traditional applications
  • Online cycle detection in pointer analysis
  • Incremental evaluation of computational circuits
  • Semantic checking by structure-based editors
  • Maintaining dependences between modules during
    compilation
  • Other applications
  • Scheduling jobs in grid computing systems, where
    dependences arise between the subtasks of a job

6
Prior work (summary)
  • Offline problem per edge
  • for m edges
  • Alpern et al. (AHRSZ, 90)
    per edge
  • Marchetti-Spaccamela et al. (MNR, 96)
    per edge (amortized)
  • for m edges
  • Pearce and Kelly (PK, 04)
    per edge
  • Katriel and Bodlaender (KB, 05)
    per edge (amortized)
  • for m edges

incremental complexity analysis
7
Ajwani et al. (AFM)
  • Contributions
  • Solves DTO in O(n^2.75) time, regardless of the
    number of edges m inserted
  • Uses generic bucket data structure with efficient
    support for insert, delete, collect-all
  • Analysis based on tunable parameter t = max
    number of nodes in each bucket
  • Shortcomings
  • Poor discussion of motivating applications
  • No insight into how algorithm works or achieves
    running time
  • No intuitive comparison with prior algorithms
    (AHRSZ, MNR, etc.)

8
Notation
  • d(u,v) denotes |T(u) − T(v)|
  • u < v is shorthand for T(u) < T(v)
  • u → v denotes an edge from u to v
  • u ⇝ v means v is reachable from u

9
Algorithm AFM
10
Algorithm AFM
[Figure: u → v invalidates topological order. Node labels: a, b, c, d, u, v]
Call: Reorder(u,v); Set A: v, a; Set B: c, u
11
Algorithm AFM
[Figure node labels: a, b, c, d, u, v]
Call: Reorder(c,a); Set A: Ø; Set B: Ø
12
Algorithm AFM
[Figure node labels: c, b, a, d, u, v]
Call: Reorder(c,a); Set A: Ø; Set B: Ø; Swap!
13
Algorithm AFM
[Figure node labels: c, b, a, d, u, v]
Call: Reorder(u,v); Set A: v, a; Set B: c, u
14
Algorithm AFM
[Figure node labels: c, b, a, d, u, v]
Call: Reorder(u,a); Set A: a, b; Set B: u
15
Algorithm AFM
[Figure node labels: c, b, a, d, u, v]
Call: Reorder(u,b); Set A: Ø; Set B: Ø
16
Algorithm AFM
[Figure node labels: c, u, a, d, b, v]
Call: Reorder(u,b); Set A: Ø; Set B: Ø; Swap!
17
Algorithm AFM
[Figure node labels: c, u, a, d, b, v]
Call: Reorder(u,a); Set A: a, b; Set B: u
18
Algorithm AFM
[Figure node labels: c, u, a, d, b, v]
Call: Reorder(u,a); Set A: Ø; Set B: Ø
19
Algorithm AFM
[Figure node labels: c, a, u, d, b, v]
Call: Reorder(u,a); Set A: Ø; Set B: Ø; Swap!
20
Algorithm AFM
[Figure node labels: c, a, u, d, b, v]
Call: Reorder(u,v); Set A: v, a; Set B: c, u
21
Algorithm AFM
[Figure node labels: c, a, u, d, b, v]
Call: Reorder(c,v); Set A: Ø; Set B: Ø
22
Algorithm AFM
[Figure node labels: v, a, u, d, b, c]
Call: Reorder(c,v); Set A: Ø; Set B: Ø; Swap!
23
Algorithm AFM
[Figure node labels: v, a, u, d, b, c]
Call: Reorder(u,v); Set A: v, a; Set B: c, u
24
Algorithm AFM
[Figure node labels: v, a, u, d, b, c]
Call: Reorder(u,v); Set A: Ø; Set B: Ø
25
Algorithm AFM
[Figure node labels: u, a, v, d, b, c]
Call: Reorder(u,v); Set A: Ø; Set B: Ø; Swap!
26
Algorithm AFM
[Figure node labels: u, a, v, d, b, c]
Call: Reorder(u,v); Set A: Ø; Set B: Ø; Done!
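
The pseudocode slide did not survive in this transcript, so the following Python sketch is a reconstruction from the correctness and running-time slides rather than the authors' listing. It assumes A = {w : v → w and w ≤ u} and B = {w : w → u and v ≤ w}, swaps when both are empty, and otherwise recurses over {v} ∪ A in decreasing order and B ∪ {u} in increasing order. Plain Python sets stand in for the bucket structure, so it shows the control flow only, not the stated running time. The class name, the `swaps` log, and the example graph (edges v → a, c → u, a → b with initial order v, a, c, b, u) are assumptions inferred from the Set A and Set B columns of the trace; node d is left out because its position cannot be read from the transcript.

```python
class OnlineTopologicalOrder:
    """Sketch of Insert/Reorder with plain sets instead of buckets."""

    def __init__(self, nodes):
        self.pos = {w: i for i, w in enumerate(nodes)}  # T: node -> position
        self.succ = {w: set() for w in nodes}           # outgoing edges
        self.pred = {w: set() for w in nodes}           # incoming edges
        self.swaps = []                                 # log of swapped pairs

    def insert(self, u, v):
        """Insert edge u -> v, repairing the order first if it is violated."""
        if self.pos[v] <= self.pos[u]:
            self._reorder(u, v)
        self.succ[u].add(v)
        self.pred[v].add(u)

    def _reorder(self, u, v):
        # Cycle check: a path v ~> v' and u' ~> u already exists for every call.
        if u == v or u in self.succ[v]:
            raise ValueError(f"edge closes a cycle through {u!r} and {v!r}")
        a = [w for w in self.succ[v] if self.pos[w] <= self.pos[u]]
        b = [w for w in self.pred[u] if self.pos[w] >= self.pos[v]]
        if not a and not b:
            self.pos[u], self.pos[v] = self.pos[v], self.pos[u]
            self.swaps.append((u, v))
            return
        targets = sorted(a + [v], key=self.pos.get, reverse=True)  # {v} ∪ A
        sources = sorted(b + [u], key=self.pos.get)                # B ∪ {u}
        for v2 in targets:
            for u2 in sources:
                if self.pos[u2] > self.pos[v2]:  # only pairs still out of order
                    self._reorder(u2, v2)

    def order(self):
        return sorted(self.pos, key=self.pos.get)


g = OnlineTopologicalOrder(["v", "a", "c", "b", "u"])
for x, y in [("v", "a"), ("c", "u"), ("a", "b")]:
    g.insert(x, y)
g.insert("u", "v")
print(g.swaps)    # sequence of swapped pairs
print(g.order())  # repaired order
```

Under these assumptions the swaps occur in the same sequence as the "Swap!" slides: (c,a), (u,b), (u,a), (c,v), (u,v).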
27
Data structures
  • Store T and T⁻¹ as arrays
  • O(1) lookup for topological order and inverse
  • Graph stored as array of vertices, where each
    vertex has two adjacency lists (for
    incoming/outgoing edges)
  • Each adjacency list stored as array of buckets
  • Each bucket contains at most t nodes for a fixed
    t
  • i-th bucket of node u contains all adjacent nodes
    v with i · t < d(u,v) ≤ (i + 1) · t
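
As a small illustration of the bucket rule, the sketch below maps a distance d(u,v) to its bucket index, assuming the rule reads i · t < d(u,v) ≤ (i + 1) · t and buckets are numbered from 0. The function names and the dict-of-sets layout are illustrative only.

```python
def bucket_index(d, t):
    """Index i with i*t < d <= (i+1)*t, for integers d >= 1 and t >= 1."""
    if d < 1 or t < 1:
        raise ValueError("d and t must be positive integers")
    return (d - 1) // t


def build_buckets(u, neighbours, T, t):
    """Group the neighbours of u by the bucket their distance d(u,v) falls in."""
    buckets = {}
    for v in neighbours:
        i = bucket_index(abs(T[u] - T[v]), t)
        buckets.setdefault(i, set()).add(v)
    return buckets
```

Each bucket covers t consecutive distances, so it can hold at most t nodes from one side of u, matching the size limit on the slide.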

28
Data structures
  • A bucket is any data structure with efficient
    support for the following operations
  • Insert: insert an element into a given bucket
  • Delete: given an element and a bucket, delete the
    element from the bucket (if found; otherwise,
    return 0)
  • Collect-all: copy all elements from a given
    bucket to some vector
  • Analysis assumes a generic bucket data structure
    and counts the number of bucket operations
  • Later, we will consider different implementations
    of the data structure and corresponding running
    times/space usage
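
A minimal sketch of the three-operation interface, backed by a Python set (roughly the hashing variant discussed later in the talk). The class name is hypothetical, and returning 1 on a successful delete is an assumption; the slide only specifies the 0 returned when the element is not found.

```python
class Bucket:
    """Generic bucket supporting insert, delete and collect-all."""

    def __init__(self):
        self._items = set()

    def insert(self, x):
        self._items.add(x)

    def delete(self, x):
        """Remove x if present; return 0 when it was not found."""
        if x in self._items:
            self._items.remove(x)
            return 1  # assumed success value
        return 0

    def collect_all(self, out):
        """Copy every element of the bucket into the list `out`."""
        out.extend(self._items)
```

Counting calls to these three methods, rather than their cost, is what lets the analysis stay independent of the concrete implementation.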

29
Correctness
  • Theorem 1. Algorithm AFM returns a valid
    topological order after each edge insertion.
  • Lemma 1. Given a DAG G and a valid topological
    order, if u ⇝ v and u < v, then all subsequent
    calls to REORDER will maintain u < v.
  • Lemma 2. Given a DAG G with v ⇝ y and x ⇝ u, a
    call of REORDER(u,v) will ensure that x < y.
  • Theorem 2. The algorithm detects a cycle iff
    there is a cycle in the given edge sequence.

30
Correctness
  • Theorem 1. Algorithm AFM returns a valid
    topological order after each edge insertion.
  • Proof: use Lemmas 1 and 2.
  • For a graph with no edges, any ordering is a
    topological ordering
  • Need to show that Insert(u,v) maintains correct
    topological order of G' = G ∪ {(u,v)}
  • If u < v, this is trivial; otherwise,
  • Show that x < y for all nodes x, y of G' with x ⇝
    y. If there was a path x ⇝ y in G, Lemma 1 gives
    x < y. Otherwise, x ⇝ y was introduced to G' by
    (u,v), and Lemma 2 gives x < y in G' since there
    is x ⇝ u → v ⇝ y in G'.

31
Correctness
  • Lemma 1. Given a DAG G and a valid topological
    order, if u ⇝ v and u < v, then all subsequent
    calls to Reorder will maintain u < v.
  • Proof by contradiction
  • Consider the first call of Reorder that leads to
    u > v. Either this led to swapping u and w with w
    > v or swapping w and v with w < u. In the first
    case:
  • Call was Reorder(w,u) and A = Ø
  • However, ∃ x ∈ A for which u → x ⇝ v (since v is
    between u and w), leading to a contradiction

32
Correctness
  • Lemma 2. Given a DAG G with v ⇝ y and x ⇝ u, a
    call of Reorder(u,v) will ensure that x < y.
  • Proof by induction on recursion depth of
    Reorder(u,v)
  • For leaf nodes, A = B = Ø. If x < y before, Lemma
    1 ensures x < y will continue; otherwise, x = u
    and y = v and swapping gives x < y.
  • Assume lemma is true up to a certain tree level
    (show this implies higher levels). If A ≠ Ø,
    there is a v' such that v → v' ⇝ y, otherwise v'
    = v = y. If B ≠ Ø, there is a u' such that x ⇝ u'
    → u, otherwise u' = u = x. Hence v' ⇝ y and x ⇝ u'.
  • For loops will call Reorder(u',v'), which ensures
    x < y by inductive hypothesis
  • Lemma 1 ensures further calls to Reorder maintain
    x < y

33
Correctness
  • Theorem 2. The algorithm detects a cycle iff
    there is a cycle in the given edge sequence.
  • Proof (⇒)
  • Within a call to Insert(u,v), there are paths v ⇝
    v' and u' ⇝ u for each recursive call to
    Reorder(u',v')
  • Trivial for first call and follows by definition
    of A and B for subsequent calls
  • If algorithm detects a cycle in line 1, then we
    have v ⇝ v' = u' ⇝ u and adding u → v completes
    the cycle

34
Correctness
  • Theorem 2. The algorithm detects a cycle iff
    there is a cycle in the given edge sequence.
  • Proof (⇐), by induction on number of nodes in path
    v ⇝ u
  • Consider edge (u,v) of the cycle v ⇝ u → v
    inserted last. Since v ⇝ u before inserting this
    edge, Theorem 1 states that v < u, so Reorder(u,v)
    will be called.
  • Call of Reorder(u,v) with u = v or v → u
    clearly reports a cycle
  • Consider path v → x ⇝ y → u of length k > 2 and
    call to Reorder(u,v). Since v < x ≤ y < u before
    the call, x ∈ A and y ∈ B, so Reorder(y,x) will
    be called. The path x ⇝ y has k − 2 nodes, so this
    call to Reorder will detect the cycle (by the
    inductive hypothesis).

35
Algorithm AFM
36
Running time
  • Theorem 3. Online topological ordering can be
    computed using O(n^3.5 / t) bucket inserts and
    deletes, O(n^3 / t) bucket collect-all operations
    collecting O(n^2 t) elements, and O(n^2.5 + n^2 t)
    operations for sorting.
  • Lemma 4. Reorder is called O(n^2) times.
  • Lemma 5. The summation of |A| + |B| over all
    calls of Reorder is O(n^2).
  • Lemma 6. Calculating the sorted sets A and B over
    all calls of Reorder can be done by O(n^3 / t)
    bucket collect-all operations touching a total of
    O(n^2 t) elements and O(n^2.5 + n^2 t) operations for
    sorting these elements.
  • Lemma 9. Updating the data structure over all
    calls of Reorder requires O(n^3.5 / t) bucket
    inserts and deletes.

37
Running time
  • Theorem 3. Online topological ordering can be
    computed using O(n^3.5 / t) bucket inserts and
    deletes, O(n^3 / t) bucket collect-all operations
    collecting O(n^2 t) elements, and O(n^2.5 + n^2 t)
    operations for sorting.
  • Proof
  • Use Lemmas 4, 6, and 9. Additionally, show that
    merging sets A and B (lines 6-7 in the algorithm)
    takes O(n^2) time
  • Merging takes O(|A| + |B|), which is O(n^2) over
    all calls to Reorder by Lemma 5; finding vertices
    in B that exceed the chosen v' takes O(the number
    of those vertices), which is also the number of
    recursive calls to Reorder made. Lemma 4 says the
    latter value is O(n^2).

38
Running time
  • Lemma 4. Reorder is called O(n^2) times.
  • Proof
  • Consider the first time Reorder(u,v) is called.
    If A = B = Ø, then u and v are swapped.
    Otherwise, Reorder(u',v') is called recursively
    for all v' ∈ {v} ∪ A and u' ∈ B ∪ {u} with u' >
    v'. The order in which recursive calls are made
    and the fact that Reorder is local (only touches
    the affected region) ensure that Reorder(u,v) is
    not called except as the last recursive call. In
    this second call to Reorder(u,v), A = B = Ø:
  • Consider all v' ∈ A and u' ∈ B from the first
    call of Reorder(u,v). Reorder(u,v') and
    Reorder(u',v) must have been called by the for
    loops before the second call to Reorder(u,v).
    Therefore, u < v' and u' < v for all v' ∈ A and
    u' ∈ B, so u and v are swapped during the second
    call.
  • Reorder(u,v) will not be called again because u <
    v.

39
Running time
  • Lemma 9. Updating the data structure over all
    calls of REORDER requires O(n^3.5 / t) bucket
    inserts and deletes.
  • Proof: use LP
  • Data structure requires O(d(u,v) · n/t) bucket
    inserts and deletes to swap two nodes u and v.
  • Need to update adjacency lists of u and v and all
    w adjacent to u and/or v. If d(u,v) > t, build
    from scratch in O(n). Otherwise, can show that at
    most d(u,v) nodes need to transfer between any
    pair of consecutive buckets. This yields a bound
    of O(d(u,v) · n/t).
  • Each node pair is swapped at most once (Lemma 7),
    so summing up over all calls of REORDER(u,v)
    where u and v are swapped, we need O(Σ d(u,v) · n/t)
    bucket inserts and deletes. Σ d(u,v) = O(n^2.5) by
    Lemma 8, so the result follows.
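
Written out, the last step is a one-line calculation. It uses only the per-swap cost and the Lemma 8 bound quoted above; the sum ranges over the pairs (u,v) that are actually swapped.

```latex
\sum_{(u,v)\ \text{swapped}} O\!\left(\frac{d(u,v)\,n}{t}\right)
  = O\!\left(\frac{n}{t}\sum_{(u,v)\ \text{swapped}} d(u,v)\right)
  = O\!\left(\frac{n}{t}\cdot n^{2.5}\right)
  = O\!\left(\frac{n^{3.5}}{t}\right)
```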

40
Running time
  • How to prove Σ d(u,v) = O(n^2.5)?
  • Use an LP
  • Let T* denote the final topological ordering and
    X(T*(u), T*(v)) = d(u,v) if and when Reorder(u,v)
    leads to a swapping, 0 otherwise
  • Model some linear constraints on X(i,j)
  • 0 ≤ X(i,j) ≤ n for all i, j ∈ [1..n]
  • X(i,j) = 0 for all j ≤ i
  • Σ_{j>i} X(i,j) − Σ_{j<i} X(j,i) ≤ n for all 1 ≤ i ≤ n
  • Over insertion of all edges, a node's net
    movement right and left in the topological
    ordering must be less than n

41
Running time
  • Yields the following LP
  • And its dual

42
Running time
  • Which yields the following feasible solution
  • This solution has a value of

43
Implementation of data structure
  • Balanced binary tree gives O(1 + log τ) time
    insert and delete and O(1 + τ) collect-all, where
    τ is the number of elements in the bucket
  • Total time is O(n^2 t + n^3.5 log n / t) by Theorem 3.
    Setting t = n^0.75 (log n)^(1/2), we get a total time
    of O(n^2.75 (log n)^(1/2)) and O(n^2) space
  • n-bit array gives O(1) insert and delete and
    O(total output size + total number of deletes)
    collect-all operation
  • Total time is O(n^2 t + n^3.5 / t). Setting t = n^0.75
    gives O(n^2.75) time and O(n^2.25) space for
    O(n^2 / t) buckets
  • Uniform hashing is similar to n-bit array
  • O(n^2.75) expected time and O(n^2) space
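
The choices of t above come from balancing the two terms of the total time. The short derivation below is a worked check of that arithmetic. The space line assumes each of the O(n^2/t) buckets is an n-bit array.

```latex
\begin{align*}
\text{n-bit array:}\quad
  & n^{2}t = \frac{n^{3.5}}{t}
    \;\Longrightarrow\; t^{2} = n^{1.5}
    \;\Longrightarrow\; t = n^{0.75},
  \qquad n^{2}t + \frac{n^{3.5}}{t} = O\!\left(n^{2.75}\right) \\
\text{space:}\quad
  & O\!\left(\frac{n^{2}}{t}\right)\cdot n
    = O\!\left(\frac{n^{3}}{n^{0.75}}\right)
    = O\!\left(n^{2.25}\right) \\
\text{balanced tree:}\quad
  & n^{2}t = \frac{n^{3.5}\log n}{t}
    \;\Longrightarrow\; t = n^{0.75}(\log n)^{1/2},
  \qquad n^{2}t + \frac{n^{3.5}\log n}{t}
    = O\!\left(n^{2.75}(\log n)^{1/2}\right)
\end{align*}
```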

44
Empirical comparison
  • Compared against PK, MNR, and AHRSZ for the
    following hard-case graph

45
Empirical comparison
46
Comparison to prior work
  • No insight provided by Ajwani et al.
  • Pearce and Kelly compare PK, AHRSZ, and MNR using
    incremental complexity analysis
  • In dynamic problems, typically no fixed input
    captures the minimal amount of work to be
    performed
  • Use complexity analysis based on input size:
    measure work in terms of a parameter δ
    representing the (minimal) change in input and
    output required
  • For DTO problem, input is current DAG and
    topological order, output after an edge insertion
    is updated DAG and (any) valid ordering
  • Algorithm is bounded if time complexity can be
    expressed only in terms of |δ|; otherwise, it is
    unbounded

47
Comparison to prior work
  • Runtime comparisons
  • AHRSZ is bounded by |K_min|, the minimal cover of
    vertices that are incorrectly ordered after an
    edge insertion, plus adjacent edges
  • PK is bounded by |δ_uv|, the set of vertices in
    the affected region which reach u or are
    reachable from v, plus adjacent edges; PK is
    worst-case optimal wrt number of vertices
    reordered
  • MNR takes Θ(‖δ_uv^F‖ + |AR_uv|) in the incremental
    complexity model, where AR_uv is the set of
    vertices in the affected region
  • |K_min| ≤ |δ_uv| ≤ |AR_uv|, so AHRSZ is strictly
    better than PK, but PK and MNR are more difficult
    to compare (former expected to outperform the
    latter on sparse graphs)
  • KB analyzes a variant of AHRSZ
  • AFM appears to improve the bound on the time to
    insert m edges for AHRSZ
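
To make the sets in these bounds concrete, the sketch below computes the affected region AR_uv and the two halves of δ_uv for a proposed edge u → v, following the definitions on this slide: vertices in the affected region reachable from v, and vertices in the affected region that reach u. The function names are illustrative, and treating the forward and backward halves as δ_uv^F and δ_uv^B is an assumption, since the slides do not define those symbols.

```python
def affected_region(order, u, v):
    """Nodes between v and u (inclusive) when v currently precedes u."""
    pos = {w: i for i, w in enumerate(order)}
    if pos[u] < pos[v]:
        return []  # order is already consistent with u -> v
    return order[pos[v]: pos[u] + 1]


def delta_sets(order, edges, u, v):
    """Return (forward, backward): nodes of the affected region reachable
    from v, and nodes of the affected region that reach u."""
    region = set(affected_region(order, u, v))
    if not region:
        return set(), set()
    succ, pred = {}, {}
    for x, y in edges:
        succ.setdefault(x, []).append(y)
        pred.setdefault(y, []).append(x)

    def search(start, adjacency):
        # Depth-first search that never leaves the affected region.
        seen, stack = {start}, [start]
        while stack:
            w = stack.pop()
            for nxt in adjacency.get(w, []):
                if nxt in region and nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    return search(v, succ), search(u, pred)


order = ["v", "x", "a", "c", "u"]
edges = [("v", "a"), ("c", "u")]
forward, backward = delta_sets(order, edges, "u", "v")
print(affected_region(order, "u", "v"), forward, backward)
```

In this example the isolated node x lies in the affected region but in neither half of δ_uv, which is the gap between the PK and MNR bounds.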

48
Comparison to prior work
  • Intuitive comparison
  • AHRSZ performs simultaneous forward and backward
    searches from u and v until the two frontiers
    meet; nodes with incorrect priorities are placed
    in a set and corrected using DFSs in this set
  • MNR does a similar DFS to discover incorrect
    priorities, but visits all nodes in the affected
    region during reassignment
  • PK is similar to MNR but reassigns priorities
    using only positions previously held by members
    of δ_uv
  • KB and AFM appear to be improvements in the
    runtime analysis of variants of AHRSZ

49
Comparison to prior work
  • Practical implications
  • PK and MNR use simpler data structures (arrays)
    than AHRSZ (priority queues and Dietz and Sleator
    ordered list structure)
  • PK and MNR use simpler traversal algorithms than
    AHRSZ
  • PK visits fewer nodes during reassignments
  • Experiments run by Pearce and Kelly
  • MNR performs poorly on sparse graphs, but is the
    most efficient on dense graphs
  • PK performs well on very sparse/dense graphs, but
    not so well in between
  • AHRSZ is relatively poor on sparse graphs, but
    has constant performance otherwise (competitive
    with the others)

50
Open problems
  • Only known lower bound for the problem is Ω(n log n)
    for inserting n − 1 edges, by Ramalingam and Reps;
    better lower bounds?
  • Reduce the (wide) gap between best known lower
    and upper bounds
  • Answer: does the definition of δ for DTO need to
    include adjacent edges?
  • Does the bounded complexity model capture the
    power of amortization?
  • Include edge deletions in the analysis of AFM or
    any of the other algorithms
  • Perform a theoretical and empirical analysis of a
    parallel version of AFM or any of the other
    algorithms

51
Breaking news
  • Kavitha and Mathew improve the upper bound to
    O(min{n^2.5, (m + n log n) m^0.5})
  • Doesn't appear to be anything wildly unique about
    their algorithm
  • Do a better job of keeping the sizes of sets δ_uv^F
    and δ_uv^B close to each other

52
Thank you