Title: Rainbow: Bridging XML and Relational Databases Design, Implementation, and Evaluation
1. Rainbow: Bridging XML and Relational Databases
Design, Implementation, and Evaluation
- MQP Advisor: Prof. Elke A. Rundensteiner
- Sponsor: Verizon Laboratories Incorporated
- MQP Project Members: Tien Vu, Mirek Cymer, John Lee
2. HTML vs. XML
- Microsoft, IBM, Informix, Oracle, Sun, ...
3. XML Data Management by RDBMS
- Advantages
- Efficient query and analysis tools.
- Mature database tools available.
- Easy integration with existing business databases.
- Issues
- Mapping between the XML and relational models.
- Update propagation.
- Query translation and optimization.
4. Motivation for Mapping
[Figure: two alternate mappings of the same car data (make: Ford, model: Mustang, year: 2001). Diagram labels: car, <EMPTY>, make, model, year, Alternate Mapping.]
- Query performance varies with how the data is mapped.
- Flexible mapping: fixed translation and restructuring.
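The point about query performance can be illustrated with a minimal sketch. It assumes Python's built-in sqlite3 module rather than the project's Oracle setup, and the table names (car_a, make_a, model_a, year_a, car_b) and the iid/pid/value columns are hypothetical, chosen only to mirror the car example. One mapping stores each sub-element in its own table; the alternate mapping inlines the values into the car table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Mapping 1 (assumed layout): every sub-element has its own table
# and points to its parent car row through pid.
conn.executescript("""
CREATE TABLE car_a   (iid INTEGER PRIMARY KEY);
CREATE TABLE make_a  (iid INTEGER PRIMARY KEY, pid INTEGER, value TEXT);
CREATE TABLE model_a (iid INTEGER PRIMARY KEY, pid INTEGER, value TEXT);
CREATE TABLE year_a  (iid INTEGER PRIMARY KEY, pid INTEGER, value TEXT);
INSERT INTO car_a   VALUES (1);
INSERT INTO make_a  VALUES (1, 1, 'Ford');
INSERT INTO model_a VALUES (1, 1, 'Mustang');
INSERT INTO year_a  VALUES (1, 1, '2001');
""")

# Mapping 2 (assumed layout): the values are inlined as columns of car.
conn.executescript("""
CREATE TABLE car_b (iid INTEGER PRIMARY KEY, make TEXT, model TEXT, year TEXT);
INSERT INTO car_b VALUES (1, 'Ford', 'Mustang', '2001');
""")

# The same question needs a three-way join under mapping 1 ...
joined = conn.execute("""
    SELECT make_a.value, model_a.value, year_a.value
    FROM car_a, make_a, model_a, year_a
    WHERE make_a.pid = car_a.iid
      AND model_a.pid = car_a.iid
      AND year_a.pid = car_a.iid
""").fetchall()

# ... and a single-table scan under mapping 2.
inlined = conn.execute("SELECT make, model, year FROM car_b").fetchall()

print(joined)
print(inlined)
```

Both queries return the same row; only the amount of join work differs, which is the motivation for making the mapping adjustable.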
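5. Rainbow Architecture
[Figure: Rainbow architecture diagram with a legend. Labeled components: User, XML Query Engine, XML Data Subsystem, Restructuring Subsystem, DTDM Manager, XML Manager, RDBMS. Labeled inputs and outputs: XMLQuery, XML, DTD.]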
6. Goals of Our MQP
- What
- Implement and evaluate restructuring subsystems within the large-scale Rainbow system.
- How
- Learn about database technologies and web tools.
- Translate research ideas into software system design.
- Practice software engineering techniques
- UML, engineering and reusing code.
- Design an experimental test plan and test bed.
- Conduct a performance study and analysis.
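7. Restructuring Subsystem
[Figure: Rainbow architecture with the Restructuring Subsystem expanded, with a legend. Labeled components: User, XML Query Engine, Mapping, Query Storage Subsystem, Restructuring, Restructure Operator Library, Restructurer, DTDM Manager, XML Manager, Internal Process. Labeled models and data: XML Model, Relational Model, XMLQuery, XML, DTD.]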
8. Restructuring Operators
- 11 Restructuring Operators
- Rename Item/Attribute
- Switch Nesting
- Pushup/Pushdown Attribute
- Pushup/Pushdown Nesting
- Split/Merge Nesting
- Reference/Dereference
9. Mapping: Sequence of Restructure Operators
[Figure: an invoice element with children account_num and bill_period, each holding a value, mapped to a summary element (<empty>) with account_num and bill_period.]
- A mapping is modeled as a sequence of reversible restructuring operators, each written as an operator name followed by its parameters.
- For example:
pushUpAttribute(account_number, value, invoice, account_number)
pushUpAttribute(bill_period, value, invoice, bill_period)
renameItem(invoice, summary)
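A minimal sketch of why reversibility matters: if each operator has an inverse, the whole mapping can be undone by inverting the operators in reverse order. The Op record and the invert and reverse_mapping helpers are hypothetical names, not part of Rainbow. The pairing of pushUpAttribute with pushDownAttribute follows the operator list on the previous slide, but the assumption that pushDownAttribute takes the same parameters, and that renameItem is undone by swapping its two arguments, is only an illustration.

```python
from typing import List, NamedTuple, Tuple


class Op(NamedTuple):
    """One restructuring step: operator name plus its parameters."""
    name: str
    params: Tuple[str, ...]


def invert(op: Op) -> Op:
    """Return the operator that undoes `op` (assumed conventions)."""
    if op.name == "renameItem":
        old, new = op.params
        return Op("renameItem", (new, old))
    if op.name == "pushUpAttribute":
        # Assumption: the inverse takes the same parameter list.
        return Op("pushDownAttribute", op.params)
    if op.name == "pushDownAttribute":
        return Op("pushUpAttribute", op.params)
    raise ValueError(f"no inverse known for {op.name}")


def reverse_mapping(mapping: List[Op]) -> List[Op]:
    """Undo a mapping: invert each step, last step first."""
    return [invert(op) for op in reversed(mapping)]


mapping = [
    Op("pushUpAttribute", ("account_number", "value", "invoice", "account_number")),
    Op("pushUpAttribute", ("bill_period", "value", "invoice", "bill_period")),
    Op("renameItem", ("invoice", "summary")),
]

undo = reverse_mapping(mapping)
for op in undo:
    print(op.name, op.params)
```

Reversing the undo sequence again gives back the original mapping, which is the property a reversible operator set is meant to guarantee.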
10. SQLs for Push-Up Attributes
[Figure: Push-up. Before: A with child B, which holds b. After: A holds a, and B no longer holds b.]
- CREATE VIEW new.A (<all-columns>, a) AS
- SELECT A.<all-columns>, B.b
- FROM old.A, old.B
- WHERE B.pid = A.iid
- CREATE VIEW new.B (<all-columns-but-b>) AS
- SELECT B.<all-columns-but-b>
- FROM old.B
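The template above can be turned into a small generator, sketched here under stated assumptions. The function name push_up_attribute_sql and its explicit column-list arguments are hypothetical; the order of the first four parameters (child, attribute, parent, new attribute name) is inferred from the pushUpAttribute examples in this deck rather than from Rainbow's code. The sketch only builds the SQL text and does not run it.

```python
from typing import List, Tuple


def push_up_attribute_sql(child: str, attr: str, parent: str, new_attr: str,
                          parent_cols: List[str],
                          child_cols: List[str]) -> Tuple[str, str]:
    """Build the two CREATE VIEW statements for one push-up step."""
    # Columns the child keeps once `attr` has moved to the parent.
    remaining = [c for c in child_cols if c != attr]

    parent_view_cols = ", ".join(parent_cols + [new_attr])
    parent_select = ", ".join(f"{parent}.{c}" for c in parent_cols)
    child_view_cols = ", ".join(remaining)
    child_select = ", ".join(f"{child}.{c}" for c in remaining)

    parent_view = (
        f"CREATE VIEW new.{parent} ({parent_view_cols}) AS\n"
        f"SELECT {parent_select}, {child}.{attr}\n"
        f"FROM old.{parent}, old.{child}\n"
        f"WHERE {child}.pid = {parent}.iid"
    )
    child_view = (
        f"CREATE VIEW new.{child} ({child_view_cols}) AS\n"
        f"SELECT {child_select}\n"
        f"FROM old.{child}"
    )
    return parent_view, child_view


# Push column b of B up into A, where it becomes column a.
a_view, b_view = push_up_attribute_sql(
    "B", "b", "A", "a",
    parent_cols=["iid", "pid"],
    child_cols=["iid", "pid", "b"],
)
print(a_view)
print(b_view)
```

Passing the column lists explicitly stands in for the <all-columns> placeholders; a real system would presumably read them from its stored metadata.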
11. Example SQLs
- Inline make.value into car as Attribute make.
- Mapping
- pushUpAttribute(account_number, value, invoice, account_number)
- SQL statements
- CREATE VIEW new.invoice (iid, pid, account_number) AS
- SELECT invoice.iid, invoice.pid, account_number.value
- FROM old.invoice, old.account_number
- WHERE account_number.pid = invoice.iid
- CREATE VIEW new.account_number (iid, pid) AS
- SELECT account_number.iid, account_number.pid
- FROM old.account_number
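To check what these views produce, the statements can be run against a toy database. This is a sketch using Python's built-in sqlite3 module, not the Oracle 8i setup of the project. Because SQLite views cannot reference tables in another attached schema, the old. and new. qualifiers are replaced by old_ and new_ name prefixes, which is an adaptation made only for this example. The sample row values are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# "Old" relational layout: account_number is a child table of invoice.
conn.executescript("""
CREATE TABLE old_invoice (iid INTEGER PRIMARY KEY, pid INTEGER);
CREATE TABLE old_account_number (iid INTEGER PRIMARY KEY, pid INTEGER, value TEXT);
INSERT INTO old_invoice VALUES (1, NULL);
INSERT INTO old_account_number VALUES (10, 1, '555 777-3158 573 234');
""")

# "New" layout defined as views: value is pushed up into invoice
# as the column account_number.
conn.executescript("""
CREATE VIEW new_invoice (iid, pid, account_number) AS
    SELECT old_invoice.iid, old_invoice.pid, old_account_number.value
    FROM old_invoice, old_account_number
    WHERE old_account_number.pid = old_invoice.iid;

CREATE VIEW new_account_number (iid, pid) AS
    SELECT old_account_number.iid, old_account_number.pid
    FROM old_account_number;
""")

invoice_rows = conn.execute(
    "SELECT iid, pid, account_number FROM new_invoice").fetchall()
account_rows = conn.execute(
    "SELECT iid, pid FROM new_account_number").fetchall()

print(invoice_rows)
print(account_rows)
```

The invoice view now carries the account number directly, while the account_number view keeps only its keys, so a query for an invoice's account number no longer needs the join.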
12. Rainbow Implementation
- Development Tools
- Java Visual Café2, Javadocs, JAVA2
- Oracle 8i, XML 4J, JDBC1.2, SQL Queries
- Code Facts
- 44 total system classes
- 17 classes of Rainbow
- 27 classes reused
- ? lines of system code
- ? lines of Rainbow code
- ? lines of code reused
13. Screen Shot
14. Screen Shot
15. Rainbow Test Experimental Evaluation
- Experimental Setup
- Oracle 8i
- Windows NT
- Data
- Created a DTD
- Randomly generated XML
- Hand translated queries
- Factors
- Type of query
- Number of operations
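As a rough illustration of the "randomly generated XML" step, the sketch below builds invoice-like documents with a seeded random generator so that runs are repeatable. Everything here is an assumption: the project's actual DTD and generator are not shown in this deck, the function name random_invoice is hypothetical, and the element names and the 0.05-per-minute rate are borrowed from the invoice example elsewhere in these slides.

```python
import random
import xml.etree.ElementTree as ET


def random_invoice(rng: random.Random, n_calls: int) -> ET.Element:
    """Build one invoice element with n_calls random itemized calls."""
    invoice = ET.Element("invoice")
    account = ET.SubElement(invoice, "account_number")
    account.text = str(rng.randrange(10**9, 10**10))

    total_cents = 0
    for no in range(1, n_calls + 1):
        minutes = rng.randint(1, 30)
        cents = 5 * minutes  # assumed flat rate of 0.05 per minute
        total_cents += cents
        ET.SubElement(invoice, "itemized_call", {
            "no": str(no),
            "min": str(minutes),
            "amount": f"{cents / 100:.2f}",
        })

    total = ET.SubElement(invoice, "total")
    total.text = f"{total_cents / 100:.2f}"
    return invoice


rng = random.Random(42)  # fixed seed keeps the test data repeatable
doc = random_invoice(rng, n_calls=3)
xml_text = ET.tostring(doc, encoding="unicode")
print(xml_text)
```

Varying n_calls gives documents of different sizes, which is one way the data volume could be controlled across experiment runs.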
16. Query Performance Evaluation
17. Rainbow Conclusions
- Technical accomplishments
- Functional prototype system
- Feasibility of Rainbow concepts
- Automated test bed designed
- Performance evaluations show that
- (Ideal) Moving data up on the embedded-relational level yields better query performance for join queries.
- Knowledge gained
- OO, Java, JDBC, SQL, RDBMS, XML, DTD
- Teamwork, S/W Engineering, Software Reuse
- Logistics of setting up an experiment
- Future work
- Experimental test plans and test beds to realize the full potential of the restructuring component.
18. Rainbow: XML and Relational Database Design, Implementation, and Evaluation
- Project Members
- Tien Vu, Mirek Cymer, John Lee
- Advisor
- Elke A. Rundensteiner
- Ph.D. Student
- Xin Zhang
- Sponsored By
- Verizon Laboratories Incorporated
- Visit Rainbow at http://davis.wpi.edu/dsrg/TJM/
19. Recycled!!!
20. XML: The Future of the Web
- Benefits
- Efficient query and analysis tools.
- Mature data warehousing support.
- Easy integration with existing business databases.
- Applications
- E-commerce
- Web-based industries
- <invoice>
- <account_number>555 777-3158 573 234</account_number>
- <bill_period>Jun 9 - Jul 8, 2000</bill_period>
- <carrier>Sprint</carrier>
- <itemized_call no="1" date="JUN 10" number_called="973 555-8888" time="10:17pm" rate="NIGHT" min="1" amount="0.05" />
- <itemized_call no="2" date="JUN 13" number_called="973 650-2222" time="10:19pm" rate="NIGHT" min="1" amount="0.05" />
- <itemized_call no="3" date="JUN 15" number_called="206 365-9999" time="10:25pm" rate="NIGHT" min="3" amount="0.15" />
- <total>0.25</total>
- </invoice>
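A document like this invoice can be broken into per-element rows that link to their parent through iid and pid, the column names used by the push-up SQL in this deck. The sketch below is one hypothetical way to do that with Python's standard xml.etree.ElementTree; the shred function, the document-order numbering of iid, and the choice to store text content in a value column are assumptions, not Rainbow's actual loader. The attribute quotes and the colon in the time values are reconstructed for well-formedness.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

XML = """<invoice>
  <account_number>555 777-3158 573 234</account_number>
  <bill_period>Jun 9 - Jul 8, 2000</bill_period>
  <carrier>Sprint</carrier>
  <itemized_call no="1" date="JUN 10" number_called="973 555-8888"
                 time="10:17pm" rate="NIGHT" min="1" amount="0.05" />
  <itemized_call no="2" date="JUN 13" number_called="973 650-2222"
                 time="10:19pm" rate="NIGHT" min="1" amount="0.05" />
  <itemized_call no="3" date="JUN 15" number_called="206 365-9999"
                 time="10:25pm" rate="NIGHT" min="3" amount="0.15" />
  <total>0.25</total>
</invoice>"""


def shred(root):
    """Return {element name: [row, ...]} with iid/pid links between rows."""
    tables = defaultdict(list)
    next_iid = 0

    def visit(elem, pid):
        nonlocal next_iid
        next_iid += 1
        iid = next_iid  # assumed: ids assigned in document order
        row = {"iid": iid, "pid": pid}
        text = (elem.text or "").strip()
        if text:
            row["value"] = text
        row.update(elem.attrib)  # XML attributes become columns
        tables[elem.tag].append(row)
        for child in elem:
            visit(child, iid)

    visit(root, None)
    return dict(tables)


tables = shred(ET.fromstring(XML))
for name, rows in tables.items():
    print(name, rows)
```

Each element name ends up as one table, so account_number.pid = invoice.iid is exactly the join condition that the push-up views rely on.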
21. XML and Relational Database
- Problem
- Many applications change their data very frequently.
- e.g., flight reservation, online billing, inventory.
- Current Solution
- Reloading the complete XML document whenever it changes, which is very expensive.
- Rainbow Solution
- Incrementally propagate XML document updates to the stored XML data.
- Goal: XML repository implemented using an RDBMS
- Approach: Flexible mapping
- Features
- DTD Metadata Management in RDB
- Automatic Schema Creation
- Incremental Update Propagation
- XML Query Optimization
22. Rainbow Analysis
23. Rainbow Analysis (Cont.)
24. HTML vs. XML
- HTML
- <h1>Car</h1>
- <h2>Make</h2>
- <p>Ford Mustang
- <h2>Seats</h2>
- <p>5
- <h2>Top Speed</h2>
- <p>70 m.p.h
- XML
- <h1>Car</h1>
- <make>Ford Mustang</make>
- <seats>5</seats>
- <speed units="mph">70</speed>