PIXML: Probabilistic Semistructured Databases - PowerPoint PPT Presentation

1
PIXML: Probabilistic Semistructured Databases
  • Edward Hung, Lise Getoor, V.S. Subrahmanian
  • University of Maryland, College Park

2
Outline
  • Motivating example
  • Semistructured data model
  • PIXML data model
  • Semantics
  • Interpretation
  • Satisfaction
  • Other work done
  • Related work
  • Future work

3
Motivating Example
  • Surveillance applications monitor a region of a
    battlefield
  • An image processing system identifies vehicles in
    convoys appearing in the region at different times
  • Convoys
  • Timestamp
  • Tanks, trucks, etc.
  • Uncertainty
  • Number of vehicles
  • Category and identity of a vehicle, e.g., is it a
    tank? A T-72?

4
Motivating Example
  • Doppler speed system detects the speed and
    velocity of convoys and infers their possible
    destinations
  • Convoys
  • Timestamp
  • Possible destinations
  • Uncertainty
  • Number of places the convoy will go
  • The name of the places

5
Motivating Example
  • Semistructured data model
  • General hierarchical structure is known.
  • The schema is not fixed
  • Number of vehicles
  • Properties of vehicles
  • Our work: store uncertain information in
    probabilistic environments.

6
Semistructured Data Model
  • Instance S = (V, lch, t, val)
  • lch(o, l): the set of children of o with label l
  • G = (V, lch) is a rooted, directed, edge-labeled
    graph
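
A minimal sketch of how an instance S = (V, lch, t, val) might be held in memory. The class name `Instance`, the dictionary layout, and the type names "time" and "truck-type" are assumptions made for illustration; the slides only fix the four components and the example values.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """S = (V, lch, t, val), with V derived from the root and lch."""
    root: str
    lch: dict = field(default_factory=dict)  # (object, label) -> set of children
    t: dict = field(default_factory=dict)    # leaf -> type name (hypothetical names)
    val: dict = field(default_factory=dict)  # leaf -> value

    def children(self, o):
        """All children of o, over every edge label."""
        return set().union(*(kids for (p, _), kids in self.lch.items() if p == o))

    def objects(self):
        """V: every object reachable from the root."""
        seen, stack = set(), [self.root]
        while stack:
            o = stack.pop()
            if o not in seen:
                seen.add(o)
                stack.extend(self.children(o))
        return seen


# One convoy with a timestamp (value 15) and two trucks (types mac and rover).
s = Instance(
    root="S1",
    lch={("S1", "convoy"): {"convoy2"},
         ("convoy2", "ts"): {"ts2"},
         ("convoy2", "truck"): {"truck3", "truck4"}},
    t={"ts2": "time", "truck3": "truck-type", "truck4": "truck-type"},
    val={"ts2": 15, "truck3": "mac", "truck4": "rover"},
)
print(sorted(s.objects()))
```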

7
Semistructured Data Model
Time 10
8
Semistructured Data Model
Time 15
9
Semistructured Data Model
  • Example

10
PIXML Data Model
  • Uncertainty
  • Existence of sub-objects
  • Number of sub-objects
  • Identity of the sub-objects

11
PIXML Data Model
  • Weak instance W = (V, lch, t, val, card)
  • The cardinality constraint card(o, l) gives the
    bounds on the number of sub-objects with edge
    label l connected to the same parent o.

12
PIXML Data Model
  • Example
  • Convoy 2 surely has a timestamp
  • card(convoy2, ts) = [1, 1]
  • Convoy 2 may have one to two trucks
  • card(convoy2, truck) = [1, 2]

13
PIXML Data Model (Cardinality)
  • Example of cardinality

Weak instance W = semistructured instance + card
14
PIXML Data Model
  • Compatible Instances
  • A semistructured instance S = (VS, lchS, tS,
    valS) is compatible with a weak instance W = (VW,
    lchW, tW, valW, card) if
  • (VS, lchS) is a rooted connected graph.
  • If o is a leaf in S, then
  • if o is also a leaf in W, tS(o) = tW(o) and
    valS(o) = valW(o); otherwise, the type and value
    are defined as unknown.
  • Otherwise, card(o, l).min ≤ k ≤ card(o, l).max,
    where k is the number of l-labeled children of o,
    i.e. k = |lchS(o, l)|
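
The cardinality part of the compatibility test can be sketched as below. This is an illustration under assumptions, not the authors' implementation: the function names and dictionary layout are hypothetical, the check that S only uses children allowed by W is an added assumption, and the leaf type/value conditions are not covered.

```python
def is_leaf(lch, o):
    """o is a leaf if no edge label gives it any children."""
    return not any(p == o and kids for (p, _), kids in lch.items())


def compatible_cardinalities(lch_s, lch_w, card, root):
    """Check card(o, l).min <= |lchS(o, l)| <= card(o, l).max for every
    non-leaf object o of S reachable from root."""
    stack, seen = [root], set()
    while stack:
        o = stack.pop()
        if o in seen or is_leaf(lch_s, o):
            continue
        seen.add(o)
        for (p, l), (lo, hi) in card.items():
            if p != o:
                continue
            kids = lch_s.get((o, l), set())
            if not kids <= lch_w.get((o, l), set()):
                return False  # assumption: S may only use children W allows
            if not lo <= len(kids) <= hi:
                return False
            stack.extend(kids)
    return True


# Weak instance from the convoy example.
lch_w = {("S1", "convoy"): {"convoy1", "convoy2"},
         ("convoy1", "ts"): {"ts1"}, ("convoy1", "truck"): {"truck1"},
         ("convoy1", "tank"): {"tank1", "tank2"},
         ("convoy2", "ts"): {"ts2"}, ("convoy2", "truck"): {"truck3", "truck4"}}
card = {("S1", "convoy"): (2, 2),
        ("convoy1", "ts"): (1, 1), ("convoy1", "truck"): (1, 1),
        ("convoy1", "tank"): (1, 1),
        ("convoy2", "ts"): (1, 1), ("convoy2", "truck"): (1, 2)}

good = {("S1", "convoy"): {"convoy1", "convoy2"},
        ("convoy1", "ts"): {"ts1"}, ("convoy1", "truck"): {"truck1"},
        ("convoy1", "tank"): {"tank1"},
        ("convoy2", "ts"): {"ts2"}, ("convoy2", "truck"): {"truck3"}}
bad = dict(good)
del bad[("convoy2", "truck")]  # convoy2 now has no truck, violating [1, 2]

print(compatible_cardinalities(good, lch_w, card, "S1"))
print(compatible_cardinalities(bad, lch_w, card, "S1"))
```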

15
PIXML Data Model
  • Example

16
PIXML Data Model
  • Example
  • There are surely 2 convoys.
  • card(S, convoy) = [2, 2]
  • Convoy 1 surely has a timestamp, a truck and a
    tank.
  • card(convoy1, ts) = [1, 1]
  • card(convoy1, truck) = [1, 1]
  • card(convoy1, tank) = [1, 1]
  • Convoy 2 surely has a timestamp
  • card(convoy2, ts) = [1, 1]
  • Convoy 2 may have one to two trucks
  • card(convoy2, truck) = [1, 2]

17
PIXML Data Model
  • D(W): the set of all semistructured instances
    compatible with a weak instance W

19
PIXML Data Model (Weak Instance)
  • Example of a weak instance W

card(S1, convoy) = [2, 2]
card(convoy1, ts) = [1, 1]
card(convoy1, truck) = [1, 1]
card(convoy1, tank) = [1, 1]
card(convoy2, ts) = [1, 1]
card(convoy2, truck) = [1, 2]
20
PIXML Data Model
  • Example of an instance compatible with W

card(S1, convoy) = [2, 2]
card(convoy1, ts) = [1, 1]
card(convoy1, truck) = [1, 1]
card(convoy1, tank) = [1, 1]
card(convoy2, ts) = [1, 1]
card(convoy2, truck) = [1, 2]
21
  • D(W): the set of all semistructured instances
    compatible with the weak instance W

22
PIXML Data Model
  • Potential child set
  • PC(o), the potential child set of a non-leaf
    object o in a weak instance W, is
  • the set of all possible sets of children of o
    that satisfy the cardinality constraints

23
PIXML Data Model
  • Example
  • Convoy 2 surely has one timestamp, whose value is
    surely 15. Convoy 2 may have a truck of type mac
    and/or a truck of type rover
  • card(convoy2, truck) = [1, 2]
  • card(convoy2, ts) = [1, 1]
  • PC(convoy2) = { {ts2, truck3}, {ts2, truck4},
    {ts2, truck3, truck4} }

24
Potential child set of convoy2, PC(convoy2):
{ts2, truck3, truck4},
{ts2, truck3},
{ts2, truck4}
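
As a rough sketch, PC(o) can be enumerated by picking, for each edge label, every subset of the allowed children whose size lies within card(o, l), and then taking one pick per label. The function name and dictionary layout are hypothetical; only the definition and the convoy2 values come from the slides.

```python
from itertools import combinations, product


def potential_child_sets(lch_w, card, o):
    """Enumerate PC(o): one admissible subset per label, unioned together."""
    per_label = []
    for (p, l), kids in lch_w.items():
        if p != o:
            continue
        lo, hi = card[(o, l)]
        choices = [frozenset(c)
                   for k in range(lo, hi + 1)
                   for c in combinations(sorted(kids), k)]
        per_label.append(choices)
    return {frozenset().union(*combo) for combo in product(*per_label)}


lch_w = {("convoy2", "ts"): {"ts2"},
         ("convoy2", "truck"): {"truck3", "truck4"}}
card = {("convoy2", "ts"): (1, 1), ("convoy2", "truck"): (1, 2)}

pc = potential_child_sets(lch_w, card, "convoy2")
for c in sorted(pc, key=sorted):
    print(sorted(c))
```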
25
PIXML Data Model
  • Probabilistic instance I = (V, lch, t, val, card,
    ipf)
  • The interval probability function ipf(o, c) w.r.t.
    the set PC(o) associates, with each c ∈ PC(o), a
    closed subinterval [lb(c), ub(c)] ⊆ [0, 1]

26
PIXML Data Model
  • Example
  • PC(convoy2) = { {ts2, truck3}, {ts2, truck4},
    {ts2, truck3, truck4} }
  • ipf(convoy2, {ts2, truck3}) = [0.2, 0.3]
  • ipf(convoy2, {ts2, truck4}) = [0.3, 0.5]
  • ipf(convoy2, {ts2, truck3, truck4}) = [0.2, 0.4]

27
Probabilistic instance I = weak instance W + ipf
ipf(convoy2, {ts2, truck3, truck4}) = [0.2, 0.3]
ipf(convoy2, {ts2, truck3}) = [0.3, 0.5]
ipf(convoy2, {ts2, truck4}) = [0.2, 0.4]
28
PIXML Data Model
  • Here the ipf assigns the probability interval to
    each possible set of children.
  • More independence assumptions are possible to
    make the representation more compact
  • e.g. independence between trucks and tanks.
  • e.g. all trucks are indistinguishable.

29
Semantics (Global Interpretation)
  • Interpretation
  • Global interpretation, P
  • a mapping from D(W) (the set of semistructured
    instances compatible with W) to [0, 1] s.t. the
    probabilities sum to 1: Σ_{S ∈ D(W)} P(S) = 1

30
Compatible instances S1a to S1f:
P(S1a) = 0.12
P(S1b) = 0.08
P(S1c) = 0.2
P(S1d) = 0.18
P(S1e) = 0.12
P(S1f) = 0.3
31
Semantics (Local Interpretation)
  • An object probability function (OPF) for an object
    o w.r.t. a weak instance W is a mapping w: PC(o)
    → [0, 1] s.t. Σ_{c ∈ PC(o)} w(c) = 1

32
Semantics
  • Example
  • ipf(convoy2, {ts2, truck3}) = [0.2, 0.3]
  • ipf(convoy2, {ts2, truck4}) = [0.3, 0.5]
  • ipf(convoy2, {ts2, truck3, truck4}) = [0.2, 0.4]
  • w_convoy2({ts2, truck3}) = 0.2
  • w_convoy2({ts2, truck4}) = 0.5
  • w_convoy2({ts2, truck3, truck4}) = 0.3

33
Semantics (Local Interpretation)
  • Previously, probabilities were assigned to each
    compatible instance globally.
  • Now we assign probabilities to the actual
    children of each non-leaf object in a local
    manner.

34
The object probability function (OPF) for convoy2
w.r.t. W is a mapping w_convoy2: PC(convoy2) → [0, 1]
whose values sum to 1:
w_convoy2({ts2, truck3, truck4}) = 0.2
w_convoy2({ts2, truck3}) = 0.5
w_convoy2({ts2, truck4}) = 0.3
35
Semantics (Local Interpretation)
  • Interpretation
  • Local interpretation, p
  • a mapping from the set of non-leaf objects to
    OPFs
  • Example
  • p(convoy2) = w_convoy2

36
Semantics (Local → Global)
  • Assume that the probability of any potential
    child of an object o is independent of the
    non-descendants of o.
  • W operator
  • The W operator returns the probabilities assigned
    to every semistructured instance compatible with
    a given weak instance, consistent with a given
    local interpretation.
  • Given a semistructured instance S compatible with
    a weak instance W and a local interpretation p
    for W
  • W(p)(S) = ∏_{o ∈ S} p(o)(C_S(o)), where the
    product ranges over the non-leaf objects o of S
    and C_S(o) is the set of children of o in S
  • Theorem
  • W(p) is a global interpretation for W

37
Semantics
  • Example
  • ipf(S1, {convoy1, convoy2}) = [1, 1]
  • w_S1({convoy1, convoy2}) = 1
  • ipf(convoy1, {ts1, truck1, tank1}) = [0.2, 0.6]
  • ipf(convoy1, {ts1, truck1, tank2}) = [0.4, 0.8]
  • w_convoy1({ts1, truck1, tank1}) = 0.4
  • w_convoy1({ts1, truck1, tank2}) = 0.6
  • ipf(convoy2, {ts2, truck3}) = [0.2, 0.3]
  • ipf(convoy2, {ts2, truck4}) = [0.3, 0.5]
  • ipf(convoy2, {ts2, truck3, truck4}) = [0.2, 0.4]
  • w_convoy2({ts2, truck3}) = 0.2
  • w_convoy2({ts2, truck4}) = 0.5
  • w_convoy2({ts2, truck3, truck4}) = 0.3

38
Semantics
  • Example
  • W(p)(S1a)
  • = p(S1)({convoy1, convoy2}) x p(convoy1)({ts1,
    truck1, tank1}) x p(convoy2)({ts2, truck3,
    truck4})
  • = w_S1({convoy1, convoy2}) x w_convoy1({ts1,
    truck1, tank1}) x w_convoy2({ts2, truck3, truck4})
  • = 1 x 0.4 x 0.3
  • = 0.12

39
Semantics
w_S1({convoy1, convoy2}) = 1
w_convoy1({ts1, truck1, tank1}) = 0.4
w_convoy2({ts2, truck3, truck4}) = 0.3
  • W(p)(S1a)

= p(S1)({convoy1, convoy2}) x p(convoy1)({ts1,
truck1, tank1}) x p(convoy2)({ts2, truck3,
truck4})

= w_S1({convoy1, convoy2}) x w_convoy1({ts1,
truck1, tank1}) x w_convoy2({ts2, truck3, truck4})

= 1 x 0.4 x 0.3 = 0.12
40
Semantics
  • Example
  • Similarly, we can get
  • W(p)(S1a) = 0.12
  • W(p)(S1b) = 0.08
  • W(p)(S1c) = 0.2
  • W(p)(S1d) = 0.18
  • W(p)(S1e) = 0.12
  • W(p)(S1f) = 0.3
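
The six values above can be reproduced with a short sketch of the W operator: walk down from the root, pick one child set per non-leaf object, and multiply the local probabilities. The function name and data layout are hypothetical, and the enumeration assumes a tree-shaped instance (no object shared by two parents).

```python
# Local interpretation p from the example: object -> OPF (child set -> probability).
p = {
    "S1": {frozenset({"convoy1", "convoy2"}): 1.0},
    "convoy1": {frozenset({"ts1", "truck1", "tank1"}): 0.4,
                frozenset({"ts1", "truck1", "tank2"}): 0.6},
    "convoy2": {frozenset({"ts2", "truck3"}): 0.2,
                frozenset({"ts2", "truck4"}): 0.5,
                frozenset({"ts2", "truck3", "truck4"}): 0.3},
}


def enumerate_instances(p, root):
    """Yield (instance, probability) pairs. An instance maps each non-leaf
    object it contains to its chosen child set C_S(o); the probability is
    the product of p(o)(C_S(o)) over those objects."""
    def expand(frontier, chosen, prob):
        if not frontier:
            yield dict(chosen), prob
            return
        o, rest = frontier[0], frontier[1:]
        if o not in p:  # leaf object: nothing to choose
            yield from expand(rest, chosen, prob)
            return
        for c, pr in p[o].items():
            yield from expand(rest + sorted(c), {**chosen, o: c}, prob * pr)
    yield from expand([root], {}, 1.0)


results = list(enumerate_instances(p, "S1"))
for inst, prob in results:
    print(sorted(inst["convoy1"]), sorted(inst["convoy2"]), round(prob, 2))
```

The probabilities of all enumerated instances sum to 1, which is what the theorem that W(p) is a global interpretation requires.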

41
Semantics (Global → Local)
  • (Same assumption) The probability of any
    potential child of an object o is independent of
    the non-descendants of o.
  • Given a global interpretation P for a weak
    instance W
  • P satisfies W iff P(c_o | ndes(o)) = P(c_o)
  • ndes(o) is the set of non-descendants of o.

42
Semantics (Global → Local)
  • D operator
  • The D operator returns the probabilities assigned
    to each possible set of children of every non-leaf
    object, consistent with a given global
    interpretation.
  • Given a global interpretation P that satisfies a
    weak instance W, for any non-leaf object o and any
    c in PC(o),
    w_{P,o}(c) = Σ_{S ∈ D(W), o ∈ S, C_S(o) = c} P(S)
    / Σ_{S ∈ D(W), o ∈ S} P(S)
  • D(P) returns a function defined as follows: for
    any non-leaf object o, D(P)(o) = w_{P,o}

43
Semantics (Global → Local)
  • Theorem
  • D(P) is a local interpretation for W
  • Example
  • Derive D(P)(convoy2)

44
Compatible instances S1a to S1f:
P(S1a) = 0.12
P(S1b) = 0.08
P(S1c) = 0.2
P(S1d) = 0.18
P(S1e) = 0.12
P(S1f) = 0.3
D(P)(convoy2) = w_{P,convoy2}
  • w_{P,convoy2}({ts2, truck3, truck4}) =
    (0.12 + 0.18)/1 = 0.3

45
D(P)(convoy2) = w_{P,convoy2}
  • w_{P,convoy2}({ts2, truck3, truck4}) =
    (0.12 + 0.18)/1 = 0.3
  • w_{P,convoy2}({ts2, truck3}) = (0.08 + 0.12)/1 = 0.2
  • w_{P,convoy2}({ts2, truck4}) = (0.2 + 0.3)/1 = 0.5

46
Semantics
  • Example
  • Derive D(P)(convoy2) = w_{P,convoy2}
  • w_{P,convoy2}({ts2, truck3, truck4}) =
    (0.12 + 0.18)/1 = 0.3
  • w_{P,convoy2}({ts2, truck3}) = (0.08 + 0.12)/1 = 0.2
  • w_{P,convoy2}({ts2, truck4}) = (0.2 + 0.3)/1 = 0.5
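
The derivation above amounts to marginalizing the global interpretation over the instances that contain the object. A minimal sketch follows; the function name `d_operator` and the representation of instances as dictionaries are assumptions, while the probabilities are the ones from the example.

```python
from collections import defaultdict

T3, T4 = "truck3", "truck4"
TANK1 = frozenset({"ts1", "truck1", "tank1"})
TANK2 = frozenset({"ts1", "truck1", "tank2"})


def inst(convoy1_children, trucks):
    """Build one compatible instance: non-leaf object -> its child set."""
    return {"S1": frozenset({"convoy1", "convoy2"}),
            "convoy1": convoy1_children,
            "convoy2": frozenset({"ts2", *trucks})}


# Global interpretation P over S1a..S1f, as (instance, probability) pairs.
P = [(inst(TANK1, [T3, T4]), 0.12), (inst(TANK1, [T3]), 0.08),
     (inst(TANK1, [T4]), 0.2), (inst(TANK2, [T3, T4]), 0.18),
     (inst(TANK2, [T3]), 0.12), (inst(TANK2, [T4]), 0.3)]


def d_operator(P, o):
    """w_{P,o}(c): probability mass of instances where o has child set c,
    divided by the mass of all instances that contain o."""
    mass = defaultdict(float)
    for instance, prob in P:
        if o in instance:
            mass[instance[o]] += prob
    total = sum(mass.values())
    return {c: m / total for c, m in mass.items()}


w = d_operator(P, "convoy2")
for c, prob in w.items():
    print(sorted(c), round(prob, 2))
```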

47
Semantics (Local ↔ Global)
  • Theorems
  • Suppose p is a local interpretation for a weak
    instance W; then D(W(p)) = p.
  • Suppose P is a global interpretation that
    satisfies a weak instance W; then W(D(P)) = P.

48
Semantics (Satisfaction)
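  • Given a probabilistic instance I and a non-leaf
    object o,
  • OC(o), the object constraints, are
    lb(c) ≤ p(c) ≤ ub(c) for each c ∈ PC(o), where
    [lb(c), ub(c)] = ipf(o, c), and
    Σ_{c ∈ PC(o)} p(c) = 1
  • p(c) is a real-valued variable denoting the
    probability that c is the actual set of children
    of o.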

49
Semantics (Satisfaction)
  • Example
  • ipf(convoy2, {ts2, truck3}) = [0.2, 0.3]
  • ipf(convoy2, {ts2, truck4}) = [0.3, 0.5]
  • ipf(convoy2, {ts2, truck3, truck4}) = [0.2, 0.4]
  • OC(convoy2)

50
Semantics (Local Satisfaction)
  • An OPF w satisfies a non-leaf object o iff w is a
    probability distribution over PC(o) such that
    w(c) lies within ipf(o, c) for every c in PC(o).
  • A local interpretation p satisfies a non-leaf
    object o iff p(o) satisfies o.
  • A local interpretation p satisfies a
    probabilistic instance I iff p satisfies every
    non-leaf object of I.
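
Local satisfaction of a single object can be checked directly from this definition. The sketch below assumes an OPF is stored as a dictionary from child sets to probabilities and an ipf as a dictionary from child sets to (lb, ub) pairs; the function name and the tolerance parameter are illustrative.

```python
def satisfies(w, ipf, tol=1e-9):
    """True iff w is a probability distribution over PC(o) and every w(c)
    lies inside its interval ipf[c] = (lb, ub)."""
    if set(w) != set(ipf):
        return False
    if abs(sum(w.values()) - 1.0) > tol:
        return False
    return all(lb - tol <= w[c] <= ub + tol for c, (lb, ub) in ipf.items())


A = frozenset({"ts2", "truck3"})
B = frozenset({"ts2", "truck4"})
C = frozenset({"ts2", "truck3", "truck4"})

ipf_convoy2 = {A: (0.2, 0.3), B: (0.3, 0.5), C: (0.2, 0.4)}
w_convoy2 = {A: 0.2, B: 0.5, C: 0.3}   # the OPF from the example
w_other = {A: 0.1, B: 0.5, C: 0.4}     # sums to 1 but 0.1 is below lb(A)

print(satisfies(w_convoy2, ipf_convoy2))
print(satisfies(w_other, ipf_convoy2))
```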

51
Semantics (Global Satisfaction)
  • A global interpretation P satisfies a
    probabilistic instance I iff D(P) satisfies I.
  • Corollary
  • A local interpretation p satisfies a
    probabilistic instance I iff W(p) satisfies I.

52
Semantics (Consistency)
  • A probabilistic instance is locally consistent
    iff there is a local interpretation that
    satisfies it.
  • A probabilistic instance is globally consistent
    iff there is a global interpretation that
    satisfies it.
  • Theorem
  • Every probabilistic instance is locally and
    globally consistent.

53
Other Work Done
  • Algebra
  • Projection, selection, Cartesian product
  • Probabilistic point query
  • returns the probability that a given object
    satisfies a given path expression
  • R-answer to a query
  • returns the set of objects that satisfy a query
    with probability r or more
  • Implementation of a prototype

54
Other Work Done
  • Experiment
  • Execution time is linear in the total number of
    ipf entries, i.e., the instance size
  • Papers submitted to ICDE and ICDT

55
Related Work
  • Semistructured Probabilistic Objects (SPOs)
    (Dekhtyar, Goldsmith, Hawkes, in SSDBM, 2001)
  • SPOs express contexts (not random variables) in a
    semistructured manner
  • PIXML data model stores XML data AND
    probabilistic information.

56
Related Work
  • ProTDB (Nierman, Jagadish, in VLDB, 2002)
  • Point probabilities VS interval probabilities
  • Independent probabilities assigned to each child
    VS arbitrary distributions over sets of children
  • Tree-structured VS arbitrary acyclic
  • Our model theory provides two formal semantics
  • Differences in their queries and our algebra and
    query.

57
Related Work
  • Algebras TAX, SAL
  • TAX (Jagadish, Lakshmanan, Srivastava, 2001)
  • use pattern tree to extract subsets of nodes, one
    for each embedding of pattern tree.
  • fixed number of children
  • SAL (Beeri, Tzaban, 1999)
  • bind objects to variables
  • original structure is totally lost

58
Future Work
  • System implementation
  • Query optimization

59
Summary
  • PIXML data model
  • Semistructured instance
  • Weak instance (add cardinality)
  • Probabilistic instance (add ipf)
  • Semantics
  • Local and Global
  • Interpretation
  • Satisfaction

60
Algebra
  • Operators
  • Projection
  • Selection
  • Cross-product
  • Path expression
  • o.l1.l2. ... .ln

e.g. S1.convoy.truck
61
Algebra (Projection)
  • Ancestor projection
  • Descendant projection
  • Single projection

62
Algebra (Projection)
Semistructured Instance
  • Ancestor projection ( )

63
Weak Instance
  • Ancestor projection ( )

64
Probabilistic Instance
  • Ancestor projection ( )

Before projection:
card(I2, convoy) = [1, 1], ipf(I2, {convoy1}) = 1
card(convoy1, ts) = [1, 1]
card(convoy1, truck) = [1, 1]
card(convoy1, tank) = [1, 1]
ipf(convoy1, {ts1, truck1, tank1}) = [0, 0.3]
ipf(convoy1, {ts1, truck1, tank2}) = [0.1, 0.4]
ipf(convoy1, {ts1, truck2, tank1}) = [0.3, 0.5]
ipf(convoy1, {ts1, truck2, tank2}) = [0.3, 0.6]
After projection:
card(I2, convoy) = [1, 1], ipf(I2, {convoy1}) = 1
card(convoy1, truck) = [1, 1]
Children of convoy1 before: C_I2(convoy1) = {ts1,
truck1, truck2, tank1, tank2}
Children of convoy1 after: C_I2'(convoy1) = {truck1,
truck2}
Let Cd = C_I2(convoy1) - C_I2'(convoy1) = {ts1,
tank1, tank2}
PC'(convoy1) = { {truck1}, {truck2} }
65
Probabilistic Instance
  • Ancestor projection ( )

For each c' in PC'(convoy1), let a and b be the sums
of the lower and upper bounds of ipf(convoy1, c) over
all c in PC(convoy1) that project to c'. Then
ipf'(convoy1, c') = [a, min(1, b)]
ipf'(convoy1) → tight(ipf'(convoy1))
(Dekhtyar, Goldsmith, 2002)
66
Probabilistic Instance
  • Ancestor projection ( )

For {truck1},
a = 0.0 + 0.1 = 0.1
b = 0.3 + 0.4 = 0.7
ipf'(convoy1, {truck1}) = [0.1, min(1, 0.7)]
= [0.1, 0.7]
67
Probabilistic Instance
  • Ancestor projection ( )

For {truck2},
a = 0.3 + 0.3 = 0.6
b = 0.5 + 0.6 = 1.1
ipf'(convoy1, {truck2}) = [0.6, min(1, 1.1)]
= [0.6, 1]
68
Probabilistic Instance
  • Ancestor projection ( )

ipf'(convoy1) → tight(ipf'(convoy1))
Before tight: ipf'(convoy1, {truck1}) = [0.1, 0.7],
ipf'(convoy1, {truck2}) = [0.6, 1]
After tight: ipf'(convoy1, {truck1}) = [0.1, 0.4],
ipf'(convoy1, {truck2}) = [0.6, 0.9]
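
The worked example for convoy1 can be retraced with the sketch below. `project_ipf` follows the a / min(1, b) rule from the slides. The slides only cite Dekhtyar, Goldsmith (2002) for `tight`, so the tightening rule used here (each bound is clipped against what the other intervals leave over when everything must sum to 1) is an assumption; it does reproduce the numbers on the slide. All function names are hypothetical.

```python
def project_ipf(ipf, kept):
    """Group child sets by the children they keep, summing lower bounds (a)
    and upper bounds (b); the new interval is [a, min(1, b)]."""
    a, b = {}, {}
    for c, (lb, ub) in ipf.items():
        key = frozenset(c) & kept
        a[key] = a.get(key, 0.0) + lb
        b[key] = b.get(key, 0.0) + ub
    return {key: (a[key], min(1.0, b[key])) for key in a}


def tight(ipf):
    """Assumed tightening rule: since the probabilities sum to 1, each one
    is at least 1 minus the other upper bounds and at most 1 minus the
    other lower bounds."""
    sum_lb = sum(lb for lb, _ in ipf.values())
    sum_ub = sum(ub for _, ub in ipf.values())
    return {c: (max(lb, 1.0 - (sum_ub - ub)), min(ub, 1.0 - (sum_lb - lb)))
            for c, (lb, ub) in ipf.items()}


ipf_convoy1 = {
    frozenset({"ts1", "truck1", "tank1"}): (0.0, 0.3),
    frozenset({"ts1", "truck1", "tank2"}): (0.1, 0.4),
    frozenset({"ts1", "truck2", "tank1"}): (0.3, 0.5),
    frozenset({"ts1", "truck2", "tank2"}): (0.3, 0.6),
}

projected = project_ipf(ipf_convoy1, frozenset({"truck1", "truck2"}))
tightened = tight(projected)
for c in sorted(tightened, key=sorted):
    lo, hi = tightened[c]
    print(sorted(c), round(lo, 2), round(hi, 2))
```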
69
  • Ancestor projection ( )

card(S1, convoy) = [2, 2]
w_S1({convoy1, convoy2}) = 1
card(convoy1, ts) = [1, 1]
card(convoy1, truck) = [1, 1]
card(convoy1, tank) = [1, 1]
w_convoy1({ts1, truck1, tank1}) = 0.4
w_convoy1({ts1, truck1, tank2}) = 0.6
card(convoy2, ts) = [1, 1]
card(convoy2, truck) = [1, 2]
w_convoy2({ts2, truck3}) = 0.2
w_convoy2({ts2, truck4}) = 0.5
w_convoy2({ts2, truck3, truck4}) = 0.3
70
Algebra (Projection)
  • Descendant projection ( )

card(I3, truck) = [0, 3], ipf(I3, c) = [0, 1]
One naive strategy
Our better strategy: similar to the one in cross
product
71
Algebra (Projection)
  • Single projection ( )

card(I3, truck) = [0, 3], ipf(I3, c) = [0, 1]
72
Algebra (Projection)
  • Equivalence

Equivalent
73
Algebra (Projection)
  • Equivalence

Equivalent
74
Algebra (Projection)
  • Equivalence

Equivalent
e1 and e2 are each a sequence of zero or more
edges. Thus, I.e1.lm can include I.lm, I.l1.lm,
I.l2.l3.lm, etc.
75
In general non-equivalent
76
Algebra (Selection) ( )
  • Similar to ancestor projection
  • Path expression specifies leaf objects with a
    specified value.

77
Algebra (Selection)
Semistructured Instance
I1
78
Algebra (Selection) ( )
card(I7, convoy) = [1, 2]: w_I7({convoy1}) = 0.2,
w_I7({convoy2}) = 0.5, w_I7({convoy1, convoy2}) = 0.3
card(convoy1, tank) = [1, 1]: w_convoy1({tank1}) = 0.3,
w_convoy1({tank2}) = 0.7
card(convoy2, tank) = [1, 1]: w_convoy2({tank2}) = 0.4,
w_convoy2({tank3}) = 0.6
0.14 + 0.3 + 0.054 + 0.036 + 0.084 = 0.614
(Figure: selection on I7. The instance probabilities
0.14, 0.3, 0.054, 0.036 and 0.084 are each divided by
0.614; the figure also shows the values 0.06, 0.126
and 0.2.)
79
Algebra (Selection) ( )
card(I7, convoy) = [1, 2]: ipf(I7, {convoy1}) = [0.1, 0.3],
ipf(I7, {convoy2}) = [0.4, 0.6],
ipf(I7, {convoy1, convoy2}) = [0.2, 0.4]
card(convoy1, tank) = [1, 1]: ipf(convoy1, {tank1}) =
[0.2, 0.4], ipf(convoy1, {tank2}) = [0.6, 0.8]
card(convoy2, tank) = [1, 1]: ipf(convoy2, {tank2}) =
[0.3, 0.5], ipf(convoy2, {tank3}) = [0.5, 0.7]
Conditionalization of interval probabilities
(Dekhtyar, Goldsmith, 2002)
(Figure: selection on I7. The compatible instances
carry the probability intervals [0.012, 0.08],
[0.02, 0.12], [0.02, 0.112], [0.06, 0.24],
[0.036, 0.16], [0.08, 0.24], [0.24, 0.48] and
[0.06, 0.224].)
80
Algebra (Cross product (x))
  • Probabilistic conjunction strategies
  • Example
  • Ignorance
  • Positive correlation
  • Negative Correlation
  • Independence
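
The slides name the four conjunction strategies without giving formulas. The sketch below uses the interval formulas commonly given for these strategies in the probabilistic database literature; treating them as the definitions intended here is an assumption, and the function names are hypothetical. Each function combines two probability intervals (l1, u1) and (l2, u2) into an interval for the conjunction.

```python
def conj_ignorance(x, y):
    """Nothing is known about how the two events depend on each other."""
    (l1, u1), (l2, u2) = x, y
    return (max(0.0, l1 + l2 - 1.0), min(u1, u2))


def conj_positive(x, y):
    """Positive correlation: one event implies the other."""
    (l1, u1), (l2, u2) = x, y
    return (min(l1, l2), min(u1, u2))


def conj_negative(x, y):
    """Negative correlation: the events overlap as little as possible."""
    (l1, u1), (l2, u2) = x, y
    return (max(0.0, l1 + l2 - 1.0), max(0.0, u1 + u2 - 1.0))


def conj_independence(x, y):
    """Independent events: bounds multiply."""
    (l1, u1), (l2, u2) = x, y
    return (l1 * l2, u1 * u2)


# Intervals taken from the cross product example: ipf(I4, {truck1}) and
# ipf(I5, {tank1}).
truck1, tank1 = (0.2, 0.7), (0.1, 0.6)
for name, f in [("ignorance", conj_ignorance),
                ("positive correlation", conj_positive),
                ("negative correlation", conj_negative),
                ("independence", conj_independence)]:
    lo, hi = f(truck1, tank1)
    print(name, round(lo, 3), round(hi, 3))
```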

81
Algebra (Cross product (x))
card(I4, truck) = [1, 1], ipf(I4, {truck1}) = [0.2, 0.7],
ipf(I4, {truck2}) = [0.3, 0.8]
card(I5, tank) = [1, 1], ipf(I5, {tank1}) = [0.1, 0.6],
ipf(I5, {tank2}) = [0.4, 0.9]
card(I6, truck) = [1, 1], card(I6, tank) = [1, 1]
I4 x I5
85
Algebra (Cross product)
  • Equivalence
  • (I1 x I2) x I3
  • I1 x (I2 x I3)
  • (I1 x I3) x I2

Equivalent
90
Summary
  • PIXML data model
  • Semistructured instance
  • Weak instance (add cardinality)
  • Probabilistic instance (add ipf)
  • Semantics
  • Local and Global
  • Interpretation
  • Satisfaction
  • Algebra
  • Projections, selection, cross product

95
Related Work
  • Bayesian net (Pearl, 1988)
  • random variables (probability of events)
  • ours: the existence of children requires the
    existence of their parents