Title: Katia Abbaci
1A Similarity Skyline Approach for Handling
Graph Queries - A Preliminary Report
- Katia Abbaci, Allel Hadjali, Ludovic Liétard, Daniel Rocacher
- IRISA/ENSSAT, University of Rennes1
- {Katia.Abbaci, Allel.Hadjali, Daniel.Rocacher}_at_enssat.fr
- IRISA/IUT, University of Rennes1
- Ludovic.Lietard_at_univ-rennes1.fr
2Outline
- Introduction
- Background
- Skyline Query
- Graph Query
- Graph Similarity Measures
- Graph Similarity Skyline
- Refinement Graph Similarity Skyline
- Summary and Outlook
3Introduction (1/3)
- Context
- Graphs: modeling of structured and complex data
- Application Domains
- Medicine, Web, Chemistry, Imaging, XML documents, Bioinformatics, ...
GDM 2011
4Introduction (2/3)
- Main problem
- Search for graphs similar to a graph query
- Existing approaches: a single similarity measure
- Several methods for measuring the similarity between two graphs
- Each method is limited to one application class
- No method fits all
5Introduction (3/3)
- Motivations
- Model for different classes of applications
- Model incorporating multiple features
- Contributions
- Graph Similarity Skyline, in order to answer a graph query optimally in the sense of Pareto
- A refinement method of the skyline based on a diversity criterion among graphs
6Skyline Query
- Identification of interesting objects from a multi-dimensional dataset
- $p = (p_1, \dots, p_m)$, $q = (q_1, \dots, q_m)$: multidimensional objects
- $p$ Pareto dominates $q$, denoted $p \prec q$, iff
- on each dimension, $1 \le i \le m$, $p_i \le q_i$
- on at least one dimension, $p_j < q_j$
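The dominance test above can be sketched in a few lines of Python. This is a minimal illustration, assuming smaller values are better on every dimension; the function names `dominates` and `skyline` and the sample points are hypothetical and not taken from the slides.

```python
from typing import Dict, List, Sequence


def dominates(p: Sequence[float], q: Sequence[float]) -> bool:
    """True if p Pareto-dominates q (smaller is better on every dimension)."""
    no_worse = all(pi <= qi for pi, qi in zip(p, q))
    strictly_better = any(pi < qi for pi, qi in zip(p, q))
    return no_worse and strictly_better


def skyline(objects: Dict[str, Sequence[float]]) -> List[str]:
    """Names of the objects that are not dominated by any other object."""
    return [
        name
        for name, p in objects.items()
        if not any(dominates(q, p) for other, q in objects.items() if other != name)
    ]


# Hypothetical two-dimensional points, invented for illustration only.
points = {"a": (50, 3.0), "b": (80, 1.0), "c": (90, 3.5), "d": (50, 4.0)}
result = skyline(points)
print(result)  # ['a', 'b']
```

This naive version compares every pair of objects, which is enough to illustrate the definition but not tuned for large datasets.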
7Sample Skyline Query
- Find a cheap hotel that is as close as possible to downtown
Skyline = {H2, H4, H6}
8Graph Query
- Two categories of graph queries
- Graph containment search
- $q$: a query; $D = \{g_1, \dots, g_n\}$: a graph database (GDB)
- Subgraph containment search
- Retrieve all graphs $g_i$ of $D$ such that $q \subseteq g_i$
- Supergraph containment search
- Retrieve all graphs $g_i$ of $D$ such that $q \supseteq g_i$
- Graph similarity search
- Retrieve graphs structurally similar to the query graph
9Graph Similarity Measures
- Several methods for computing graph similarity
- Edit Distance (DistEd)
- Maximum common subgraph based distance (DistMcs)
- Graph union based distance (DistGu)
10Graph Similarity Measures
Distance between g and g' Similarity between g and g'
Edit Distance
Mcs-based Distance
Gu-based Distance
Tab. 2 Similarity Measures
11Edit Distance example
- Transformation of g into g'
- deletion of the edge (d, e),
- re-labeling the edge (a, d) from 1 to 4,
- re-labeling the node d with e,
- insertion of the edge (a, f) with the label 1.
- Use of the uniform distance
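As a small worked illustration, the cost of the edit script listed above can be computed under a uniform cost model. This sketch assumes each operation costs 1; the names `edit_script` and `script_cost` are hypothetical. Note that the edit distance is the minimum cost over all possible edit scripts, so this only evaluates the one script given on the slide.

```python
from typing import List, Tuple

# The four operations listed in the example, as (operation, target) pairs.
edit_script: List[Tuple[str, str]] = [
    ("delete_edge", "(d, e)"),
    ("relabel_edge", "(a, d): 1 -> 4"),
    ("relabel_node", "d -> e"),
    ("insert_edge", "(a, f) with label 1"),
]


def script_cost(script: List[Tuple[str, str]], cost_per_op: float = 1.0) -> float:
    """Total cost of an edit script when every operation has the same cost."""
    return cost_per_op * len(script)


cost = script_cost(edit_script)
print(cost)  # 4.0
```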
12Distances based on Mcs and Gu example
- Identification of the size of
- Computation of Mcs-based distance
- Computation of Gu-based distance
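The formulas on this slide did not survive extraction. The sketch below therefore assumes the definitions commonly used in the literature for these two measures: the Mcs-based distance as 1 - |mcs| / max(|g|, |g'|), and the graph-union-based distance as 1 - |mcs| / (|g| + |g'| - |mcs|). The function names and the sizes in the usage lines are hypothetical.

```python
def dist_mcs(mcs_size: int, size_g: int, size_h: int) -> float:
    """Mcs-based distance (assumed form): 1 - |mcs| / max(|g|, |g'|)."""
    return 1.0 - mcs_size / max(size_g, size_h)


def dist_gu(mcs_size: int, size_g: int, size_h: int) -> float:
    """Graph-union-based distance (assumed form): 1 - |mcs| / (|g| + |g'| - |mcs|)."""
    union_size = size_g + size_h - mcs_size
    return 1.0 - mcs_size / union_size


# Hypothetical sizes: two graphs of size 6 sharing a common subgraph of size 4.
d_mcs = round(dist_mcs(4, 6, 6), 2)
d_gu = round(dist_gu(4, 6, 6), 2)
print(d_mcs, d_gu)  # 0.33 0.5
```

Both functions only need the three sizes, so the costly part in practice is finding the maximum common subgraph itself, which is not shown here.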
13Graph Similarity Skyline (1/2)
- Graph compound similarity (GCS) between two graphs: a vector of local distance measures
14Graph Similarity Skyline (2/2)
- $q$: a query; $D = \{g_1, \dots, g_n\}$: a GDB
- For $i = 1$ to $n$, do
- Compare
- Extract the Graph Similarity Skyline (GSS)
- Similarity-Dominance Relation: $g$ dominates $g'$ w.r.t. $q$ iff
- $\forall i \in \{1, \dots, d\}$, $Dist_i(g, q) \le Dist_i(g', q)$,
- $\exists k \in \{1, \dots, d\}$, $Dist_k(g, q) < Dist_k(g', q)$.
15Illustrative Example (1/2)
Mcs(gi, q)
(g1, q) 4
(g2, q) 4
(g3, q) 4
(g4, q) 3
(g5, q) 5
(g6, q) 5
(g7, q) 6
Tab. 3 Information about Mcs(gi, q)
16Illustrative Example (2/2)
- Computation of GCS(gi, q), for i = 1 to 7
         DistEd(gi,q)  DistMcs(gi,q)  DistGu(gi,q)
(g1, q)  4             0.33           0.50
(g2, q)  4             0.43           0.56
(g3, q)  3             0.43           0.56
(g4, q)  2             0.50           0.67
(g5, q)  3             0.38           0.44
(g6, q)  4             0.44           0.50
(g7, q)  4             0.40           0.40
Tab. 4 Distance Measures
GSS(D, q) = {g1, g4, g5, g7}
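The skyline above can be checked against the distance vectors of Tab. 4 with a short script. This is a minimal sketch of the similarity-dominance test applied to those values; the names `table4`, `sim_dominates` and `graph_similarity_skyline` are hypothetical.

```python
from typing import Dict, List, Tuple

Vector = Tuple[float, float, float]

# (DistEd, DistMcs, DistGu) of each graph w.r.t. the query q, copied from Tab. 4.
table4: Dict[str, Vector] = {
    "g1": (4, 0.33, 0.50),
    "g2": (4, 0.43, 0.56),
    "g3": (3, 0.43, 0.56),
    "g4": (2, 0.50, 0.67),
    "g5": (3, 0.38, 0.44),
    "g6": (4, 0.44, 0.50),
    "g7": (4, 0.40, 0.40),
}


def sim_dominates(u: Vector, v: Vector) -> bool:
    """u is no farther from q than v on every measure and closer on at least one."""
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))


def graph_similarity_skyline(gcs: Dict[str, Vector]) -> List[str]:
    """Graphs whose distance vector is not dominated by any other graph's vector."""
    return [
        g
        for g, u in gcs.items()
        if not any(sim_dominates(v, u) for h, v in gcs.items() if h != g)
    ]


gss = graph_similarity_skyline(table4)
print(gss)  # ['g1', 'g4', 'g5', 'g7']
```

With these values, g2 and g6 are dominated by g1, and g3 is dominated by g5.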
17Refinement of Graph Similarity Skyline (1/3)
- Large Skyline
- Need k dissimilar answers
- Solution: diversity criterion
- Extract a subset (S) of size k with maximal diversity
- Provide the user with a global picture of the whole set GSS
18Refinement of Graph Similarity Skyline (2/3)
- Diversity of a subset S of size k is
-
- diversity in the ith dimension of the subset S
- s. t.
19Refinement of Graph Similarity Skyline (3/3)
- Refinement Algorithm
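- For $j = 1$ to $\binom{|GSS|}{k}$, enumerate $S_j \subseteq GSS$, with $|S_j| = k$
- For $i = 1$ to $d$, rank-order all $S_j$ in decreasing order of their diversity
- Let $r_i(S_j)$ be the rank of $S_j$ w.r.t. the ith dimension
- rank 1: the best diversity value
- last rank: the worst diversity value
- Evaluate $S_j$ by $Val(S_j) = \sum_{i=1}^{d} r_i(S_j)$
- Extract the subset $S_j$ with the smallest $Val(S_j)$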
20Illustrative Example
Si             v1    v2    v3    r1  r2  r3  Val(Si)
S1 = {g1, g4}  0.86  0.67  0.80  2   2   1   5
S2 = {g1, g5}  0.83  0.50  0.60  3   5   6   14
S3 = {g1, g7}  0.87  0.60  0.67  1   4   4   9
S4 = {g4, g5}  0.80  0.62  0.73  4   3   3   10
S5 = {g4, g7}  0.83  0.70  0.77  3   1   2   6
S6 = {g5, g7}  0.75  0.50  0.61  5   5   5   15
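The ranking step of this example can be reproduced with a short script. The sketch below rests on two points inferred from the numbers in the example rather than stated in the text: tied diversity values share a rank (dense ranking), and Val(Si) equals r1 + r2 + r3. Picking the subset with the smallest Val is an assumption, and the names `diversity`, `dense_ranks` and `evaluate` are hypothetical.

```python
from typing import Dict, Tuple

# Diversity vectors (v1, v2, v3) of the six candidate subsets, from the example.
diversity: Dict[str, Tuple[float, float, float]] = {
    "S1": (0.86, 0.67, 0.80),
    "S2": (0.83, 0.50, 0.60),
    "S3": (0.87, 0.60, 0.67),
    "S4": (0.80, 0.62, 0.73),
    "S5": (0.83, 0.70, 0.77),
    "S6": (0.75, 0.50, 0.61),
}


def dense_ranks(values: Dict[str, float]) -> Dict[str, int]:
    """Rank 1 goes to the largest value; equal values share the same rank."""
    ordered = sorted(set(values.values()), reverse=True)
    rank_of = {v: r for r, v in enumerate(ordered, start=1)}
    return {name: rank_of[v] for name, v in values.items()}


def evaluate(div: Dict[str, Tuple[float, ...]]) -> Dict[str, int]:
    """Sum, for each subset, its ranks over all dimensions."""
    dims = len(next(iter(div.values())))
    val = {name: 0 for name in div}
    for i in range(dims):
        ranks = dense_ranks({name: vec[i] for name, vec in div.items()})
        for name, r in ranks.items():
            val[name] += r
    return val


val = evaluate(diversity)
best = min(val, key=val.get)
print(val)   # {'S1': 5, 'S2': 14, 'S3': 9, 'S4': 10, 'S5': 6, 'S6': 15}
print(best)  # S1
```

Under these assumptions the selected subset is S1 = {g1, g4}, with S5 = {g4, g7} a close second.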
21Summary and Outlook
- Skyline approach for searching graphs by similarity
- Extraction of all DB graphs non-dominated by any other graph
- Preserving information about the similarity on different features
- Selection of the subset of graphs with maximal diversity from the skyline
- Implementation step to demonstrate the effectiveness of the approach on a real database
- Investigation of other similarity measures
22Thank you
Questions ?