Recursive Graph Deduction and Reachability Queries

Yangjun Chen, Dept. Applied Computer Science, University of Winnipeg, 515 Portage Ave., Winnipeg, Manitoba, Canada R3B 2E9

1
Recursive Graph Deduction and Reachability Queries
  • Yangjun Chen
  • Dept. Applied Computer Science,
  • University of Winnipeg
  • 515 Portage Ave.
  • Winnipeg, Manitoba, Canada R3B 2E9

2
Outline
  • Motivation
  • Graph deduction
    - Basic definitions
    - Critical nodes and critical subgraphs
    - Evaluation of reachability queries
  • Recursive graph deduction (RGD)
    - Recursive deduction
    - Evaluation of reachability queries based on RGD
  • Conclusion

3
Motivation
  • Efficient method to evaluate graph reachability
    queries
  • Given a directed acyclic graph (DAG) G, check
    whether a node v is reachable from another node u
    through a path in G.
  • Applications
  • XML data processing, gene-regulatory networks, and
    metabolic networks. XML documents are usually
    represented as trees. However, an XML document
    may contain IDREF/ID references that turn it into
    a directed but sparse graph: a tree structure plus
    a few reference links. For a metabolic network,
    graph reachability models whether two genes
    interact with each other or whether two proteins
    participate in a common pathway. Many such graphs
    are sparse.

4
Motivation
  • A simple method
    - store the transitive closure of G as a matrix M

[Figure: a graph G and its transitive closure matrix M]
Space: O(n^2); query time: O(1)
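As a minimal sketch of this baseline, the matrix can be filled with Warshall's algorithm, after which every query is a single lookup. The function name and the four-node graph are assumptions made up for illustration; they are not taken from the slides.

```python
def transitive_closure(n, edges):
    """Return an n x n boolean matrix M with M[u][v] True iff a
    non-empty path leads from u to v."""
    reach = [[False] * n for _ in range(n)]
    for u, v in edges:
        reach[u][v] = True
    # Warshall's algorithm: allow node k as an intermediate stop.
    for k in range(n):
        for u in range(n):
            if reach[u][k]:
                for v in range(n):
                    if reach[k][v]:
                        reach[u][v] = True
    return reach


# Hypothetical DAG: 0 -> 1 -> 2 and 0 -> 3.
M = transitive_closure(4, [(0, 1), (1, 2), (0, 3)])
print(M[0][2])  # O(1) lookup: is 2 reachable from 0?
print(M[3][2])  # O(1) lookup: is 2 reachable from 3?
```

The lookup is constant time, but the matrix always occupies n^2 entries, however sparse G is.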
5
Motivation
Question: Is it possible to reduce the size of
M but still have a constant query time?
6
Graph deduction
  • Basic definitions

Let G be a sparse graph. We first find a
spanning tree T of G.
[Figure: example graph G with nodes a, b, c, d, e, f, g, h, i, j, k, r]
The spanning tree T of G, which covers all nodes
of G, is drawn with solid arrows.
7
Graph deduction
Edge classification
  • tree edges (Etree): edges appearing in T.
  • cross edges (Ecross): any edge (u, v) such that u
    and v are not on the same path in T.
  • forward edges (Eforward): any edge (u, v) not
    appearing in T, for which there is a path from
    u to v in T.
  • back edges (Eback): any edge (u, v) not appearing
    in T, for which there is a path from v to u in T.

[Figure: the example graph G]
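The four edge classes can be told apart with nothing more than parent pointers in T. The sketch below assumes a parent map for the example tree, reconstructed from the interval labels quoted later in the deck. The sample edges are hypothetical and only exercise each case (a back edge cannot occur in a DAG).

```python
# Assumed parent map of the example spanning tree T (root: a).
PARENT = {"a": None, "b": "a", "r": "a", "h": "a", "c": "b", "d": "b",
          "k": "c", "e": "r", "f": "e", "g": "e", "i": "h", "j": "h"}


def is_ancestor(parent, u, v):
    """True iff T contains a non-empty path from u down to v."""
    node = parent[v]
    while node is not None:
        if node == u:
            return True
        node = parent[node]
    return False


def classify(parent, u, v):
    """Classify the edge (u, v) with respect to the tree given by parent."""
    if parent[v] == u:
        return "tree"      # assumes no parallel edges
    if is_ancestor(parent, u, v):
        return "forward"   # T already has a path from u to v
    if is_ancestor(parent, v, u):
        return "back"      # T has a path from v to u
    return "cross"         # u and v are not on the same path in T


for edge in [("a", "b"), ("a", "e"), ("f", "d"), ("k", "b")]:
    print(edge, classify(PARENT, *edge))
```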
8
Graph deduction
  • Tree encoding
  • Let G be a DAG. We first find a spanning
    tree T of G.
  • Each node v in T is assigned an interval
    [start, end), where start is v's preorder number
    and end - 1 is the largest preorder number among
    all the nodes in Tv (the subtree of T rooted
    at v).
  • So another node u labeled [start', end') is a
    descendant of v (with respect to T) iff
    start' ∈ [start, end).
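The labeling can be sketched as a single preorder walk. The child lists below are an assumption: they are chosen so that the resulting labels agree with the interval values quoted later in the slides (for example [5, 9) for r and [4, 5) for d). The function names are illustrative.

```python
# Assumed child lists of the example spanning tree T (root: a).
CHILDREN = {"a": ["b", "r", "h"], "b": ["c", "d"], "c": ["k"],
            "r": ["e"], "e": ["f", "g"], "h": ["i", "j"]}


def label_intervals(children, root):
    """Assign [start, end) to every node: start is the preorder number,
    end - 1 is the largest preorder number in the node's subtree."""
    labels = {}
    next_number = 0

    def visit(node):
        nonlocal next_number
        start = next_number
        next_number += 1
        for child in children.get(node, []):
            visit(child)
        labels[node] = (start, next_number)

    visit(root)
    return labels


def is_descendant(labels, u, v):
    """True iff u lies in the subtree of v: start_u falls in [start_v, end_v)."""
    start_u, _ = labels[u]
    start_v, end_v = labels[v]
    return start_v <= start_u < end_v


labels = label_intervals(CHILDREN, "a")
print(labels["r"], labels["d"])
print(is_descendant(labels, "f", "r"), is_descendant(labels, "d", "r"))
```

With these labels a tree-reachability test costs two comparisons, which is what keeps the tree part of the index at O(n) space.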

9
Graph deduction
  • Tree encoding
  • Let v and u be two nodes in T, labeled [a, b) and
    [a', b'), respectively.
  • If a ∈ [a', b'), v is a descendant of u. In this
    case, we say [a, b) is subsumed by [a', b').
  • Also, we must have b ≤ b'. Therefore, if v and u
    are not on the same path in T, we have either
    a' ≥ b or a ≥ b'.
  • In the former case, we say [a, b) is smaller
    than [a', b'), denoted [a, b) ≺ [a', b'). In the
    latter case, [a', b') is smaller than [a, b).
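The two relations between tree intervals, subsumption and the "smaller than" order, reduce to comparisons of endpoints. This is a small sketch with assumed function names. It relies on the inputs being labels of the same tree, so that two intervals either nest or are disjoint.

```python
def subsumed_by(x, y):
    """x = [a, b) is subsumed by y = [a2, b2) when a falls inside y."""
    a, _ = x
    a2, b2 = y
    return a2 <= a < b2


def smaller(x, y):
    """x = [a, b) is smaller than y = [a2, b2) when x ends before y starts."""
    _, b = x
    a2, _ = y
    return a2 >= b


def compare(x, y):
    """Describe how two tree intervals relate to each other."""
    if subsumed_by(x, y):
        return "x subsumed by y"
    if subsumed_by(y, x):
        return "y subsumed by x"
    return "x smaller than y" if smaller(x, y) else "y smaller than x"


# Interval values as they appear in the deck's example: f, e and d.
print(compare((7, 8), (6, 9)))   # f against e
print(compare((4, 5), (6, 9)))   # d against e
```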

10
Graph deduction
  • Critical nodes and critical subgraph
  • We denote by E' the set of all cross edges.
    Denote by V' the set of all the end points of
    the cross edges. That is, V' = Vstart ∪ Vend,
    where Vstart contains all the start nodes and
    Vend all the end nodes of the cross edges.

Vstart = {d, f, g, h}, Vend = {c, k, e, d, g}
[Figure: the example graph G with its cross edges]
11
Graph deduction
  • Critical nodes and critical subgraph

Definition 1 (anti-subsuming subset) A subset S
⊆ Vstart is called an anti-subsuming subset iff
|S| > 1 and no two nodes in S are related by an
ancestor-descendant relationship with respect to
T.
Anti-subsuming subsets:
{d, f}, {d, g}, {d, h}, {f, g}, {f, h}, {g, h},
{d, f, g}, {d, f, h}, {d, g, h}, {f, g, h},
{d, f, g, h}
[Figure: the example graph G]
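To make Definition 1 concrete, the subsets can be enumerated by brute force. This is only an illustration under assumptions: the parent map is the example tree as reconstructed from the deck's interval labels, and the enumeration is exponential in |Vstart|, so it is not how a real index would be built.

```python
from itertools import combinations

# Assumed parent map of the example spanning tree T (root: a).
PARENT = {"a": None, "b": "a", "r": "a", "h": "a", "c": "b", "d": "b",
          "k": "c", "e": "r", "f": "e", "g": "e", "i": "h", "j": "h"}
V_START = ["d", "f", "g", "h"]


def is_ancestor(parent, u, v):
    """True iff T contains a non-empty path from u down to v."""
    node = parent[v]
    while node is not None:
        if node == u:
            return True
        node = parent[node]
    return False


def related(parent, u, v):
    """True iff u and v are in an ancestor-descendant relationship."""
    return is_ancestor(parent, u, v) or is_ancestor(parent, v, u)


def anti_subsuming_subsets(parent, v_start):
    """All subsets of v_start with size > 1 and no two related nodes."""
    found = []
    for size in range(2, len(v_start) + 1):
        for subset in combinations(v_start, size):
            pairs = combinations(subset, 2)
            if all(not related(parent, x, y) for x, y in pairs):
                found.append(set(subset))
    return found


subsets = anti_subsuming_subsets(PARENT, V_START)
print(len(subsets))
```

Because d, f, g and h are pairwise unrelated in this tree, every subset of size two or more qualifies, which matches the list on the slide.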
12
Graph deduction
  • Critical nodes and critical subgraph

Definition 2 (critical node) A node v in a
spanning tree T of G is critical if v ∈ Vstart or
there exists an anti-subsuming subset S = {v1,
v2, ..., vk} for k ≥ 2 such that v is the lowest
common ancestor of v1, v2, ..., vk. We denote by
Vc the set of all critical nodes.
In the graph, node e is the lowest common
ancestor of {f, g}, and node a is the lowest
common ancestor of {d, f, g, h}. So e and a are
critical nodes. In addition, each v ∈ Vstart is a
critical node. So all the critical nodes of G
with respect to T are d, f, g, h, e, a.
Vc = {d, f, g, h, e, a}
[Figure: the example graph G with its critical nodes]
13
Graph deduction
  • Critical node recognition
  • Algorithm critical-node-recognition(T)
  1. Mark any node in T which belongs to Vstart.
  2. Let v be the first marked node encountered during
     the bottom-up searching of T. Create the first
     node for v in Gc.
  3. Let u be the currently encountered node in T. Let
     u' be a node in T for which a node in Gc was
     created just before u is met. Do (4) or (5),
     depending on whether u is a marked node or not.
  4. If u is a marked node, then do the following.
     (a) If u' is not a child (descendant) of u,
     create a link from u to u', called a
     left-sibling link and denoted as
     left-sibling(u) = u'.

14
Graph deduction
Critical node recognition
Algorithm critical-node-recognition(T) (continued)
  (b) If u' is a child (descendant) of u, we first
  create a link from u' to u, called a parent
  link and denoted as parent(u') = u. Then, we
  go along the left-sibling chain starting from u'
  until we meet a node u'' which is not a child
  (descendant) of u. For each encountered node w
  except u'', set parent(w) ← u. Set
  left-sibling(u) ← u''. Remove left-sibling(w)
  for each child w of u.
  5. If u is a non-marked node, then do the following.
  (c) If u' is not a child (descendant) of u, no
  node is created.
  (d) If u' is a child (descendant) of u, we go
  along the left-sibling chain starting from u'
  until we meet a node u'' which is not a child
  (descendant) of u. If the number of nodes
  encountered during the chain navigation (not
  including u'') is more than 1, we create a new
  node in Gc and do the same operation as (4.b).
  Otherwise, no node is created.
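The set of critical nodes can also be computed without the left-sibling links, using a plain bottom-up count. This is a simpler substitute and not the algorithm above. It rests on the observation that a non-marked node is a lowest common ancestor of an anti-subsuming subset exactly when at least two of its child subtrees contain a marked node. The child lists are the assumed example tree.

```python
# Assumed child lists of the example spanning tree T (root: a).
CHILDREN = {"a": ["b", "r", "h"], "b": ["c", "d"], "c": ["k"],
            "r": ["e"], "e": ["f", "g"], "h": ["i", "j"]}
V_START = {"d", "f", "g", "h"}


def critical_nodes(children, root, v_start):
    """Return Vc: nodes in v_start, plus nodes having at least two
    child subtrees that each contain a node of v_start."""
    critical = set()

    def visit(node):
        # Count the child subtrees that contain a marked node.
        hits = sum(1 for child in children.get(node, []) if visit(child))
        if node in v_start or hits >= 2:
            critical.add(node)
        # Report whether this subtree contains a marked node at all.
        return node in v_start or hits >= 1

    visit(root)
    return critical


print(sorted(critical_nodes(CHILDREN, "a", V_START)))
```

On the example this yields the same set Vc as Definition 2. It does not build the parent and left-sibling links that the slide algorithm produces as a by-product.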
15
Graph deduction
Sample trace
[Figure: panels (a) to (f) trace the algorithm on the example tree, showing the nodes created so far and the links to the left sibling. In the first step, u' is not a child of u, so a left-sibling link is created.]
16
Graph deduction
  • Tree deduction
  • Let T be a spanning tree of G. Denote by Tr a
    reduction of T obtained by removing all those
    nodes v ∉ Vc ∪ Vend. Deleting a node v entails
    connecting v's parent to each of v's children.
    So, removing a node in this way corresponds to
    the elimination of a tree edge.
  • Example: Tr is obtained by removing the nodes b,
    r, i, and j one by one. (Note that none of them
    belongs to Vc ∪ Vend. Vc = {a, d, e, f, g, h} and
    Vend = {c, d, e, g, k}.)

[Figure: Tr, with nodes a, c, d, e, f, g, h, k]
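The reduction can be sketched as one top-down pass that re-attaches every kept node to its closest kept ancestor. The child lists are the assumed example tree, and the function name is illustrative. The sketch also assumes the root is kept, which holds here because a is critical.

```python
# Assumed child lists of the example spanning tree T (root: a).
CHILDREN = {"a": ["b", "r", "h"], "b": ["c", "d"], "c": ["k"],
            "r": ["e"], "e": ["f", "g"], "h": ["i", "j"]}
V_C = {"a", "d", "e", "f", "g", "h"}
V_END = {"c", "d", "e", "g", "k"}


def reduce_tree(children, root, keep):
    """Return the child lists of Tr: nodes outside keep are spliced out
    and their children move up to the closest kept ancestor."""
    reduced = {}

    def visit(node, kept_ancestor):
        if node in keep:
            reduced.setdefault(node, [])
            if kept_ancestor is not None:
                reduced[kept_ancestor].append(node)
            kept_ancestor = node
        for child in children.get(node, []):
            visit(child, kept_ancestor)

    visit(root, None)
    return reduced


tr = reduce_tree(CHILDREN, "a", V_C | V_END)
print(tr)
```

Nodes b, r, i and j disappear, and c, d and e become children of a, as in the example on the slide.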
17
Graph deduction
Critical subgraph
Definition 4 (critical subgraph) Let G(V, E) be a
DAG. Let T be a spanning tree of G. The critical
subgraph Gc of G with respect to T is the graph
with node set V(Tr) and edge set E(Tr) ∪ Ecross.
The reachability of any two nodes can be checked
by using T or Gc.
18
Graph deduction
19
Graph deduction
  • Evaluation of reachability queries
  • Definition 5 (anchor nodes) Let G be a DAG and T
    a spanning tree of G. Let v be a node in T.
    Denote by Cv all the critical nodes in Tv. We
    associate two anchor nodes with v as below.
  • A node u ∈ Cv is called an anchor node (of the
    first kind) of v if u is closest to v. u is
    denoted v*.
  • A node w is called an anchor node (of the second
    kind) of v if it is the lowest ancestor of v (in
    T) which has a cross incoming edge. w is denoted
    v**. Here v counts as an ancestor of itself.

Example. r* = e, because node e is critical and
closest to node r in the subtree rooted at r. But
r** does not exist, since r does not have an
ancestor which has a cross incoming edge.
e* = e** = e. That is, both the first and second
kinds of anchor nodes of e are e itself.
20
Graph deduction
  • Evaluation of reachability queries

Example (continued). f** = e.
21
Graph deduction
  • Evaluation of reachability queries

Definition 6 (non-tree labels) Let v be a node in
G. The non-tree label of v is a pair <x, y>, where
  • x = v* if v* exists. If v* does not exist, let
    x be the special symbol -.
  • y = v** if v** exists. If v** does not exist,
    let y be -.
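Both anchors can be filled in with one traversal: the second kind is inherited top-down, the first kind is propagated bottom-up. The sketch below is an illustration under assumptions: it uses the example tree reconstructed from the deck's interval labels, the Vc and Vend sets from the earlier slides, and a made-up function name.

```python
# Assumed child lists of the example spanning tree T (root: a).
CHILDREN = {"a": ["b", "r", "h"], "b": ["c", "d"], "c": ["k"],
            "r": ["e"], "e": ["f", "g"], "h": ["i", "j"]}
V_C = {"a", "d", "e", "f", "g", "h"}
V_END = {"c", "d", "e", "g", "k"}


def non_tree_labels(children, root, critical, v_end):
    """Return {v: (x, y)} with x the first-kind and y the second-kind
    anchor of v, or '-' when the anchor does not exist."""
    first, second = {}, {}

    def visit(node, inherited):
        # Second kind: lowest ancestor-or-self with an incoming cross edge.
        second[node] = node if node in v_end else inherited
        below = None
        for child in children.get(node, []):
            visit(child, second[node])
            # A non-critical node has critical nodes in at most one
            # child subtree, so at most one child reports an anchor.
            if first[child] is not None:
                below = first[child]
        # First kind: closest critical node in the subtree of node.
        first[node] = node if node in critical else below

    visit(root, None)
    return {v: (first[v] or "-", second[v] or "-") for v in first}


for node, label in sorted(non_tree_labels(CHILDREN, "a", V_C, V_END).items()):
    print(node, label)
```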

22
Graph deduction
  • Example

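Query: is d reachable from r?
r is labeled [5, 9) and d is labeled [4, 5), so d
is not a descendant of r in T.
r* = e and d** = d.
d is reachable from e through a path in Gc. So d
is reachable from r.

Non-tree labels in the example:
  a <a, ->   b <d, ->   c <-, c>   d <d, d>
  e <e, e>   f <f, e>   g <g, g>   h <h, ->
  i <-, ->   j <-, ->   k <-, k>   r <e, ->
[Figure: the example graph G annotated with the labels above]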
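Putting the pieces together, a query first tries the tree intervals and only then consults the anchors. In the sketch below the intervals and non-tree labels are the ones shown in the deck. The cross edges are an assumption inferred from Vstart, Vend and the chain indexes (the slides do not list them), and a breadth-first search stands in for the chain-based index over Gc.

```python
from collections import deque

INTERVAL = {"a": (0, 12), "b": (1, 5), "c": (2, 4), "d": (4, 5),
            "e": (6, 9), "f": (7, 8), "g": (8, 9), "h": (9, 12),
            "i": (10, 11), "j": (11, 12), "k": (3, 4), "r": (5, 9)}
LABEL = {"a": ("a", "-"), "b": ("d", "-"), "c": ("-", "c"), "d": ("d", "d"),
         "e": ("e", "e"), "f": ("f", "e"), "g": ("g", "g"), "h": ("h", "-"),
         "i": ("-", "-"), "j": ("-", "-"), "k": ("-", "k"), "r": ("e", "-")}
# Edges of Tr plus ASSUMED cross edges f->d, d->k, h->g, g->c, h->e.
GC_EDGES = {"a": ["c", "d", "e", "h"], "c": ["k"], "e": ["f", "g"],
            "f": ["d"], "d": ["k"], "h": ["g", "e"], "g": ["c"]}


def reachable_in_gc(src, dst):
    """Breadth-first search over Gc (a stand-in for the chain index)."""
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in GC_EDGES.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def reachable(u, v):
    """Is v reachable from u in G?"""
    start_u, end_u = INTERVAL[u]
    if start_u <= INTERVAL[v][0] < end_u:
        return True                  # v is a descendant of u in T
    x, _ = LABEL[u]                  # first-kind anchor of u
    _, y = LABEL[v]                  # second-kind anchor of v
    if x == "-" or y == "-":
        return False
    return reachable_in_gc(x, y)


print(reachable("r", "d"), reachable("d", "r"))
```

With a constant-time index over Gc in place of the search, the whole query stays constant time, which is the point of the construction.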
23
Graph deduction
  • Evaluation of reachability queries

Reachability checking over Gc
Decompose Gc into chains.
Index(v)
  a (1, 1)   c (2, 3)   d (1, 4)   e (1, 2)
  f (1, 3)   g (2, 2)   h (2, 1)   k (1, 5)
24
Graph deduction
  • Evaluation of reachability queries

Reachability checking over G
[Figure: G decomposed into five chains (1st to 5th chain). Each node carries its index followed by an index sequence, for example (1, 1) (2, 4)(3, -)(4, -)(5, -); (1, 2) (2, -)(3, -)(4, -)(5, -); (1, 3) (2, -)(3, -)(4, -)(5, -); (5, 1) (1, -)(2, -)(3, -)(4, -).]
25
Recursive graph decomposition
  • Recursive deduction

From the above discussion, we can see that Gc is
much smaller than G. However, Gc itself can be
reduced further, which lowers the space
requirement again.
Using the above method, we can find a series of
graph reductions G0 = G, G1, ..., Gk (k ≥ 1),
where Gi is a critical subgraph of Gi-1 (i =
1, ..., k). In order to construct such critical
subgraphs, a series of spanning trees has to be
established: T0, T1, ..., Tk-1, where each Ti is
a spanning tree of Gi (i = 0, ..., k - 1), used
to construct Gi+1.
26
Recursive graph decomposition
  • Recursive deduction

To check reachability efficiently, each node v in
G is associated with two sequences, an interval
sequence and an anchor node sequence:
1) [α0(v), β0(v)), ..., [αj(v), βj(v))
(j ≤ k - 1), where each [αi(v), βi(v)) is an
interval generated by labeling Ti.
2) (x0(v), y0(v)), ..., (xj(v), yj(v)), where
each xi(v) is a pointer to an anchor node of the
first kind (a node appearing in Gi+1) and each
yi(v) is a pointer to an anchor node of the
second kind (also a node in Gi+1).
27
Recursive graph decomposition
  • Recursive deduction

G0:  u [α0(u), β0(u))   v [α0(v), β0(v))   w [α0(w), β0(w))   z [α0(z), β0(z))
G1:  u [α1(u), β1(u))   v [α1(v), β1(v))   w [α1(w), β1(w))   z [α1(z), β1(z))
...
Gj:  u [αj(u), βj(u))   v [αj(v), βj(v))   w [αj(w), βj(w))   z [αj(z), βj(z))






28
Recursive graph decomposition
  • Recursive deduction

Example
[Figure: G0 = G, its critical subgraph G1 with the non-tree labels of its nodes, and G2, the critical subgraph of G1. G2 contains the nodes c and k, with Index(c) = (1, 1) and Index(k) = (1, 2).]
29
Recursive graph decomposition
  • Recursive deduction

Example
  node  interval (T0)  interval (T1)  label (G0)  label (G1)
  a     [0, 12)        [0, 8)         <a, ->      <c, ->
  b     [1, 5)                        <d, ->
  c     [2, 4)         [7, 8)         <-, c>      <c, ->
  d     [4, 5)         [4, 6)         <d, d>      <-, ->
  e     [6, 9)         [2, 8)         <e, e>      <c, ->
  f     [7, 8)         [3, 6)         <f, e>      <-, ->
  g     [8, 9)         [6, 8)         <g, g>      <c, ->
  h     [9, 12)        [1, 8)         <h, ->      <c, ->
  i     [10, 11)                      <-, ->
  j     [11, 12)                      <-, ->
  k     [3, 4)         [5, 6)         <-, k>      <-, k>
  r     [5, 9)                        <e, ->
30
Recursive graph decomposition
  • Evaluation of reachability queries

Anchor node sequence
  a <a, -> <c, ->   b <d, ->          c <-, c> <c, ->
  d <d, d> <-, ->   e <e, e> <c, ->   f <f, e> <-, ->
  g <g, g> <c, ->   h <h, -> <c, ->   i <-, ->
  j <-, ->          k <-, k> <-, k>   r <e, ->
[Figure: the chains of the example with the index of each node]
Query: is k reachable from g?
[α0(g), β0(g)) = [8, 9), [α0(k), β0(k)) = [3, 4)
[α1(g), β1(g)) = [6, 8), [α1(k), β1(k)) = [5, 6)
Neither interval of k is subsumed by the
corresponding interval of g, so the anchor nodes
are followed: in G1, the first-kind anchor of g is
c and the second-kind anchor of k is k. In G2, k
is reachable from c, which shows that k is
reachable from g.
31
Summary
  • Transitive closure compression based on graph
    deduction
    - DAG decomposition: a spanning tree and a
      subgraph
    - Reachability checking: tree labels and
      reachability of anchor nodes in the subgraph
  • Transitive closure compression based on recursive
    graph deduction
    - DAG decomposition: a series of spanning trees
      and a subgraph
    - Reachability checking: interval sequences and
      anchor node sequences

32
Summary
  • Computational complexities
    - labeling time: O(ke + b_k^{1.5} n_k)
    - space overhead: O(kn + b_k n_k)
    - query time: O(k)
  • where n is the number of nodes of G,
    e is the number of edges of G,
    n_k is the number of nodes of Gk, and
    b_k is the width of Gk.

33
Thank you.