Storing and Maintaining Semistructured Data Efficiently in an Object-Relational Database presentation

1
Storing and Maintaining Semistructured Data
Efficiently in an Object-Relational Database

Mo Yuanying and Ling Tok Wang

2
Contents

1. Main accomplishment
2. Related Works
3. ORA-SS
4. Storing Algorithm
5. Comparison with Related Works
6. Conclusion

3
Main Accomplishment

This study provides an efficient and consistent
storage for semistructured data by developing
algorithms that map the XML document to logical
ORA-SS model and then to an object-relational
data store.

4
Contents

1. Main accomplishment
2. Related Works
3. ORA-SS
4. Storing Algorithm
5. Comparison with Related Works
6. Conclusion

5
(1) Using the file system
Related Works

Store each XML document as a separate operating
system file and use a DOM or SAX parser whenever
the document is accessed by a query.
Disadvantages
XML files in ASCII format must be parsed every
time they are accessed, whether for browsing or
querying.
With DOM, the entire parsed file must be
memory-resident during query processing.
It is hard to build and maintain indices on
documents stored this way.
Update operations are difficult to implement.
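The re-parsing cost described above can be illustrated with a minimal Python sketch using the standard xml.dom.minidom module. The function name query_names, the sample document, and the tag names are assumptions made for illustration only; they do not come from the presentation.

```python
import os
import tempfile
from xml.dom import minidom


def query_names(path, tag):
    """Answer one query by parsing the whole file into a DOM tree first."""
    doc = minidom.parse(path)  # the entire document becomes memory-resident
    try:
        return [
            node.firstChild.data
            for node in doc.getElementsByTagName(tag)
            if node.firstChild is not None
        ]
    finally:
        doc.unlink()  # release the DOM tree once the query is answered


# Hypothetical sample document stored as an ordinary operating system file.
xml_text = (
    "<projects>"
    "<project><name>P1</name></project>"
    "<project><name>P2</name></project>"
    "</projects>"
)
with tempfile.NamedTemporaryFile("w", suffix=".xml", delete=False) as f:
    f.write(xml_text)
    path = f.name

first = query_names(path, "name")   # parses the file
second = query_names(path, "name")  # parses the same file again
os.remove(path)
```

Each call pays the full parsing cost, and no index survives between calls, which is the behaviour the disadvantages list describes.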

6
(2) Using a relational DBMS
Related Works

XML data is stored in relations, and the XML query
language (for example, XQuery) is translated to
SQL and executed by the underlying relational
database system.

Disadvantages
A great deal of redundancy
Searching and updating are difficult
Handling multi-valued attributes is expensive

-- The Edge Approach
-- The Attribute Approach
-- Universal Table
-- Normalized Universal Approach
-- STORED

7
(3) Using a storage manager
Related Works

The XML query is parsed, translated to a suitable
operator tree representation, optimized, and then
executed by an XML query engine.
-- Shore
-- B-tree

Disadvantage
Searching and updating are inconvenient

8
(4) Our approach -- Store ORA-SS in nested
relations
Related Works

Problems in existing storage approaches
Flat files: querying and updating are
time-consuming and difficult.
Relational DBMS: these approaches cannot capture
the semantic information.
ORA-SS reflects the nested structure of
semi-structured data and distinguishes between
object classes, relationship types and
attributes. It can specify the degree of n-ary
relationship types and indicate whether an
attribute belongs to a relationship type or to an
object class. Such information is essential for
designing an efficient and non-redundant storage
organization for semi-structured data.
Nested relations handle multi-valued attributes
better.

9
Contents

1. Main accomplishment
2. Related Works
3. ORA-SS
4. Storing Algorithm
5. Comparison with Related Works
6. Conclusion

10
ORA-SS

A semantically richer data model for
semi-structured data
3 main concepts
Object class
Relationship type
Attribute

11
Example
ORA-SS

Binary relationship type

12
Example (Cont)
ORA-SS

Ternary relationship type

13
Example (Cont)
ORA-SS

The distinction between binary and ternary
relationship types cannot be made in other
semi-structured data models.

14
ORA-SS

ORA-SS can specify the degree of n-ary
relationship types
ORA-SS can indicate if an attribute is an
attribute of a relationship type or an attribute
of an object class
Existing semi-structured data models cannot
specify such information, although it is
essential for storage

15
Contents

1. Main accomplishment
2. Related Works
3. ORA-SS
4. Storing Algorithm
5. Comparison with Related Works
6. Conclusion

16
ORA-SS to OR database
Storing Algorithm

An object-relational database can handle
multi-valued attributes efficiently.
Multi-valued attributes are treated as repeating
groups in nested relations.
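A small Python sketch may help show why a repeating group is attractive. It compares a nested row holding a multi-valued attribute with one possible flat encoding that repeats the other columns once per value. The student data, the attribute names, and the flatten helper are hypothetical. A flat design could instead use a separate table, which avoids the repetition but needs a join.

```python
# Hypothetical object class: student, with a multi-valued attribute "hobby".
students = [
    {"sno": "S1", "name": "Ann", "hobby": ["chess", "tennis", "piano"]},
    {"sno": "S2", "name": "Bob", "hobby": []},
]


def flatten(rows, multi_attr):
    """Flat encoding: one tuple per value, repeating every other column."""
    flat = []
    for row in rows:
        values = row[multi_attr] or [None]  # keep rows that have no value
        for value in values:
            flat.append({**row, multi_attr: value})
    return flat


flat_rows = flatten(students, "hobby")
nested_count = len(students)  # one nested tuple per student
flat_count = len(flat_rows)   # one flat tuple per (student, hobby) pair
```

In the nested form each student's name is stored once, while the flat form repeats it for every hobby.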

17
ORA-SS to OR database
Storing Algorithm

Main rules
Each object class together with its attributes
forms a nested relation, with its multi-valued
attributes as repeating groups of this relation
(object relation).
Each relationship type (that is, the object
classes involved in it) together with its
attributes forms a nested relation, with its
multi-valued attributes as repeating groups of
this relation (relationship relation).

18
(1)Object class translation algorithm
Storing Algorithm

O1 The identifier and candidate keys of this
object class become the primary key and candidate
keys of the generated relation.
O2 Each single-valued attribute of this object
class is a single-valued attribute of the
generated relation.
O3 Composite attributes of the object class are
represented directly. They are replaced by their
components in the generated relation.

19
Object class translation algorithm (cont)
Storing Algorithm

O4 Each multi-valued attribute of this object
class forms a repeating group in this relation.
O5 Each reference is a foreign key in this
relation.
O6 Each disjunctive attribute is treated as two
attributes.
O7 For the ID dependency relationship type, the
rule for the ID dependent object class is the
same as the rule for the regular object class.
The ID dependent object class together with its
attributes forms a nested relation within its
parent object class.
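Rules O1 to O7 can be sketched in Python as a function that turns a description of an object class into a description of a nested relation. This is only an illustrative reading of the rules. The Attribute and ObjectClass classes, the dictionary layout of the result, and the student example are all assumptions, not the authors' implementation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Attribute:
    name: str
    multi_valued: bool = False
    components: List[str] = field(default_factory=list)    # composite parts
    alternatives: List[str] = field(default_factory=list)  # disjunctive options


@dataclass
class ObjectClass:
    name: str
    identifier: str
    attributes: List[Attribute] = field(default_factory=list)
    references: List[str] = field(default_factory=list)
    dependents: List["ObjectClass"] = field(default_factory=list)


def translate_object_class(oc):
    """Build a nested-relation description from an object class."""
    rel = {
        "relation": oc.name,
        "primary_key": oc.identifier,          # O1
        "columns": [oc.identifier],
        "repeating_groups": [],
        "foreign_keys": list(oc.references),   # O5
    }
    for attr in oc.attributes:
        if attr.multi_valued:
            rel["repeating_groups"].append(attr.name)   # O4
        elif attr.components:
            rel["columns"].extend(attr.components)      # O3
        elif attr.alternatives:
            rel["columns"].extend(attr.alternatives)    # O6
        else:
            rel["columns"].append(attr.name)            # O2
    # O7: ID-dependent classes are nested inside their parent's relation.
    rel["nested"] = [translate_object_class(d) for d in oc.dependents]
    return rel


# Hypothetical object class used only to exercise the rules.
student = ObjectClass(
    name="student",
    identifier="sno",
    attributes=[
        Attribute("name"),
        Attribute("address", components=["street", "city"]),
        Attribute("hobby", multi_valued=True),
        Attribute("contact", alternatives=["phone", "email"]),
    ],
    references=["dept_no"],
)
student_rel = translate_object_class(student)
```

Candidate keys are left out of the sketch for brevity. They would be carried over alongside the primary key under O1.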

20
Translation Example 1
Storing Algorithm

21
(2)Relationship type translation algorithm
Storing Algorithm

R1 All the identifiers of the object classes
participating in this relationship type form the
single-valued attributes of the nested relation.
The key of the relationship type can be
determined by the participation constraint of the
relationship type.
R2 Each single-valued attribute of this
relationship type is a single-valued attribute of
the generated relation.

22
Relationship type translation algorithm (cont)
Storing Algorithm

R3 Composite attributes of the relationship type
are represented directly. They are replaced by
their components in the generated relation.
R4 Each multi-valued attribute of this
relationship type forms a repeating group in this
relation.
R5 A disjunctive relationship type is treated as
two relationship types.
R6 There is no need to translate an ID dependency
relationship type.
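The relationship rules can be sketched the same way. The Python below covers R1, R2 and R4 only, and it takes the key as an input, because R1 says the key follows from participation constraints that this sketch does not model. The RelationshipType class, the result layout, and the supplier/part/project example are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RelationshipType:
    name: str
    participants: Dict[str, str]  # object class name -> its identifier
    key: List[str]                # chosen from participation constraints
    single_valued: List[str] = field(default_factory=list)
    multi_valued: List[str] = field(default_factory=list)


def translate_relationship_type(rt):
    """Build a nested-relation description from a relationship type."""
    identifiers = list(rt.participants.values())           # R1
    return {
        "relation": rt.name,
        "degree": len(rt.participants),
        "key": list(rt.key),
        "columns": identifiers + list(rt.single_valued),   # R1, R2
        "repeating_groups": list(rt.multi_valued),         # R4
    }


# Hypothetical binary and ternary relationship types.
sp = RelationshipType(
    name="sp",
    participants={"supplier": "sno", "part": "pno"},
    key=["sno", "pno"],
    single_valued=["price"],
)
jsp = RelationshipType(
    name="jsp",
    participants={"project": "jno", "supplier": "sno", "part": "pno"},
    key=["jno", "sno", "pno"],
    single_valued=["qty"],
)
sp_rel = translate_relationship_type(sp)
jsp_rel = translate_relationship_type(jsp)
```

Keeping price with the binary relation and qty with the ternary one shows how recording the degree of a relationship type lets each attribute be stored exactly once in the relation it belongs to.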

23
Translation Example 1
Storing Algorithm

24
Translation for Ordering and ANY
Storing Algorithm

(3) Translation for Ordering
We define an additional attribute named ordinal
within the ordered object class (i.e., the
ordered attribute).
(4) Translation for ANY
An attribute whose structure is unknown, or which
may have a different structure in different
instances, is denoted as ANY.
We define a separate table as (Identifier, ANY,
ANY-value).
Identifier is the identifier of the object class
or the relationship type to which this ANY
belongs.
ANY is the structure name (the tag) used by the
particular instance.
ANY-value is its value.
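Both translations can be sketched in Python with the standard xml.etree.ElementTree module. The sample book element, the KNOWN_TAGS set, and the helper names are hypothetical, and storing the element's text as ANY-value is a simplifying assumption for the case where the unknown content is plain text.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance: ordered authors plus two elements of unknown structure.
xml_text = (
    '<book isbn="123">'
    "<author>A</author><author>B</author>"
    "<remark>draft</remark><rating>5</rating>"
    "</book>"
)
KNOWN_TAGS = {"author"}  # assumption: tags covered by the schema


def ordered_rows(elem, tag, id_attr):
    """Ordering: add an ordinal column giving each value's position."""
    ident = elem.get(id_attr)
    return [
        (ident, ordinal, child.text)
        for ordinal, child in enumerate(elem.findall(tag), start=1)
    ]


def any_rows(elem, id_attr):
    """ANY: emit (Identifier, ANY, ANY-value) for every unknown tag."""
    ident = elem.get(id_attr)
    return [
        (ident, child.tag, "".join(child.itertext()).strip())
        for child in elem
        if child.tag not in KNOWN_TAGS
    ]


book = ET.fromstring(xml_text)
author_rows = ordered_rows(book, "author", "isbn")
extra_rows = any_rows(book, "isbn")
```

The ordinal column preserves document order that a relation would otherwise lose, and the ANY table absorbs content whose tag is not known in advance.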

25
Translation Results
Storing Algorithm

Following these algorithms, a normal form ORA-SS
schema results in normal form nested relations.
The undesirable update anomalies in
semi-structured databases are removed, and any
redundancy due to many-to-many relationships and
n-ary relationships is controlled.

26
Contents

1. Main accomplishment
2. Related Works
3. ORA-SS
4. Storing Algorithm
5. Comparison with Related Works
6. Conclusion

27
Comparison

Other models
Supply(J, S, P, price, Qty)

28
Conclusion

Our approach is to use ORA-SS as the data model
and an object-relational database as the
database management system.
We can store and access semi-structured data
correctly, more efficiently, and without
avoidable redundancy.
No node ID is needed in our approach.

29
Conclusion (cont)

Our approach can capture the semantic information
which is essential and important for storage.
Our approach can represent the degree of n-ary
relationship types.
Our approach can represent whether an attribute
belongs to an object class or to a relationship
type.
