Title: Translating Relational Schemas to XML Schemas
1Translating Relational Schemas to XML Schemas
- Dongwon Lee, Murali Mani,
- Frank Chiu, Wesley. W. Chu
- UCLA
2Our goals.
- Map from relational to XML model automatically.
- Maintain semantic constraints during the mapping.
- Existing work such as XML Extender (from IBM),
db2XML, XML-DBMS, SilkRoute, EXPERANTO rely on
user-specified mapping, and do not maintain
semantic constraints.
3NeT Nesting-based Translation
- Nesting on a single attribute, X ?C, where C is
column set of a table t groups the tuples of t
with the same value for (C X). - NeT is especially useful to
- Remove redundancies due to MVDs.
- Group on attributes to get a more intuitive
schema. - Semantic constraints are maintained.
4nestC(t)
t
nestA(nestC(t))
5NeT optimization
- Perform minimum number of nesting.
- Properties used for optimization
- Idempotency of nesting we need not perform
nesting more than once on any single attribute. - Nesting need not be done on an attribute that
does not participate in any one candidate key. - Experimental results NeT almost always decreases
the data size tremendously for one particular
data set, size of the nested table is 6 of the
size of the original table.
6CoT Constraint-based Translation Step 1
- Consider IND s? ? t?, where ? ? X, ? ? Y, ?
is primary key, and ? is non-nullable. - If ? is unique, M(t) (Y, s?), else M(t) (Y,
s) - M(s) (X - ?)
- Key for s is (Ks - ?)
M(professor) (Pname, Age, student) M(student)
(Sname, Course)
ltprofessorgt ltPnamegtMuntzlt/Pnamegt
ltAgegt60lt/Agegt ltstudentgt ltSnamegtJohnlt/Snamegt
ltCoursegtDBlt/Coursegt lt/studentgt ltstudentgt
ltSnamegtJohnlt/Snamegt ltCoursegtN/Wlt/Coursegt
lt/studentgt lt/professorgt
student
ltprofessorgt ltPnamegtChult/Pnamegt
ltAgegt55lt/Agegt lt/professorgt
professor
7CoT Step 2
- Consider tables, s, t1, t2 with column set X, Y1,
Y2, and INDs s? ? t1?1, and s? ? t2 ?2,
where ?1, ?2 are primary keys and ? , ? are
non-nullable - Translate one IND as in Step1, and translate the
other to IDREF as - M(t1) (Y1, s), M(t2) (Y2), M(s) (X -
? - ?), A(t2) ID_t2ID, A(s)
Ref_t2IDREF
ltprofessorgt ltPnamegtMuntzlt/Pnamegt
ltAgegt60lt/Agegt ltstudent Ref_courseDBgt
ltSnamegtJohnlt/Snamegt lt/studentgt ltstudent
Ref_courseN/Wgt ltSnamegtJohnlt/Snamegt
lt/studentgt lt/professorgt
ltcourse ID_courseDB/gt ltcourse
ID_courseN/W/gt ltprofessorgt
ltPnamegtChult/Pnamegt ltAgegt55lt/Agegt lt/professorgt
course
8CoT Step 3
- Consider a relational schema with tables t1,
t2,,tn and INDs ti?i ? tj?j - Construct an IND-graph
- Identify top-nodes as
- Nodes that do not have any IND are top-nodes
- In a strongly connected component formed from
table-set, S, if there is no IND from a node in S
to a node outside S, then one of the nodes in S
must be a top-node. - Perform BFS and translate the IND as in Step 1 or
Step 2.
9M (course) (Cid, Title, Room, course) M(prof)
(student) M(student) (Sid, Name) M(emp)
(Eid, Name, dept, proj) M(dept) (Dno) M(proj)
(Pname) A(emp) ID_empID,
Ref_projIDREF A(prof) Ref_empIDREF A(proj
) ID_projID
IND-Graph
10Conclusions
- Automatically map from relational to a good XML
model. - Maintain semantic constraints.
- Remove some of the redundancies that could have
been present in the relational model.