Title: 146 pages
1Thesis Presentation CERIA Laboratory
Design and Implementation of a Scalable
Distributed Database System SD-SQL Server
Soror SAHRI Soror.Sahri_at_dauphine.fr
http//ceria.dauphine.fr/soror/soror.html
21. Introduction
2. State of The Art
P L A N
3. SD-SQL Server Architecture
4. SD-SQL Server Application Interface
5. Implementation of SD-SQL Server
6. Performance Measurements
7. Conclusion Future Work
3Introduction State of the Art SD-SQL Server
Architecture SD-SQL Server Application
Interface Implementation of SD-SQL
Server Conclusion Future Work
Facts Objectve
Facts
- Most of DBSs have distributed/parallel versions
- SQL Server, Oracle, DB2
- DBSs do not provide dynamically scalable tables
- All require manual repartitioning when tables
scale-up
4Introduction State of the Art SD-SQL Server
Architecture SD-SQL Server Application
Interface Implementation of SD-SQL
Server Conclusion Future Work
Facts Objective
Objective
- Scalable Distributed Partitioning of Relational
Tables
Scalable Distributed Database System
SD-SQL Server
51. Introduction
2. State of the Art
P L A N
3. SD-SQL Server Architecture
4. SD-SQL Server Application Interface
5. Implementation of SD-SQL Server
6. Performance Measurements
7. Conclusion Future Work
6Introduction State of the Art SD-SQL Server
Architecture SD-SQL Server Application
Interface Implementation of SD-SQL
Server Conclusion Future Work
Parallel DBMSs SDDSs
Parallel DBMSs
LB05 K, Loney B, Bryla. Oracle Database
10g, DBA Handbook
7Introduction State of the Art SD-SQL Server
Architecture SD-SQL Server Application
Interface Implementation of SD-SQL
Server Conclusion Future Work
Parallel DBMS SDDSs
SDDSs
- An SDDS is a new class of data structures
- Specific for multicomputers, P2P, Grids
- SDDSs provide many scalable distributed
partitioning schemes - LH, RP, k-RP, LHRS
- These schemes can serve as the basis for an
SD-DBS architecture
81. Introduction
2. State of the Art
P L A N
3. SD-SQL Server Architecture
4. SD-SQL Server Application Interface
5. Implementation of SD-SQL Server
6. Performance Measurements
7. Conclusion Future Work
9Introduction State of the Art SD-SQL Server
Architecture SD-SQL Server Application
Interface Implementation of SD-SQL
Server Conclusion Future Work
Overview Nodes, NDBs SDBs Scalable Tables Images
SD-SQL Server?
- SD-SQL Server is a Scalable Distributed Database
System (SD-DBS) - SD-SQL Server uses the reference architecture
- Proposed by Pr. Litwin, Pr. Schwartz Pr. Risch
- 2nd Intl. Workshop on Cooperative Internet
Computing, 2002 - SD-SQL Server is based on the RP SDDS principles
- SD-SQL Server runs on Microsoft SQL Server 2000
10Introduction State of the Art SD-SQL Server
Architecture SD-SQL Server Application
Interface Implementation of SD-SQL
Server Conclusion Future Work
Overview Nodes, NDBs SDBs Scalable Tables Images
Gross Architecture
Litwin Sahri. WDAS 2004
- The SD-SQL Server originality ?
- The automatic extension of the scalable tables
- of their NDBs
- of their SD-SQL Server nodes
11Introduction State of the Art SD-SQL Server
Architecture SD-SQL Server Application
Interface Implementation of SD-SQL
Server Conclusion Future Work
Overview Nodes, NDBs SDBs Scalable Tables Images
The Nodes, NDBs SDBs
12Introduction State of the Art SD-SQL Server
Architecture SD-SQL Server Application
Interface Implementation of SD-SQL
Server Conclusion Future Work
Overview Nodes, NDBs SDBs Scalable Tables Images
Scalable Tables
- A scalable (distributed) table is a collection of
segments - Segments are SQL tables
- A scalable table has, initially, only one primary
segment - At some server or peer NDB
- All the segments of a scalable table have the
same scheme -
13Introduction State of the Art SD-SQL Server
Architecture SD-SQL Server Application
Interface Implementation of SD-SQL
Server Conclusion Future Work
Overview Nodes, NDBs SDBs Scalable Tables Images
Scalable Tables Meta-data
- Each scalable table has meta-data
- The segment capacity
- The actual partitioning of the scalable table
- The check constraint of each segment
A check constraint defines the Min and Max for
each segment
- These meta-data are stored in the meta-tables
- excluding the check constraints
14Introduction State of the Art SD-SQL Server
Architecture SD-SQL Server Application
Interface Implementation of SD-SQL
Server Conclusion Future Work
Overview Nodes, NDBs SDBs Scalable Tables Images
Scalable Tables Meta-data
DB1 SDB
.
T Scalable Table
15Introduction State of the Art SD-SQL Server
Architecture SD-SQL Server Application
Interface Implementation of SD-SQL
Server Conclusion Future Work
Overview Nodes, NDBs SDBs Scalable Tables Images
Scalable Tables Splitting
- The number of segments in a scalable table is
variable - A segment that overflows splits
- A split occurs when an insert overflows the
segment capacity - Every split produces one or more new segments for
a scalable table
16Introduction State of the Art SD-SQL Server
Architecture SD-SQL Server Application
Interface Implementation of SD-SQL
Server Conclusion Future Work
Overview Nodes, NDBs SDBs Scalable Tables Images
Scalable Tables Splitting
Check Constraint?
b1
b
SELECT TOP P INTO Ni.Si FROM S ORDER BY C ASC
SELECT TOP P WITH TIES INTO Ni.S1 FROM S ORDER
BY C ASC
17Introduction State of the Art SD-SQL Server
Architecture SD-SQL Server Application
Interface Implementation of SD-SQL
Server Conclusion Future Work
Overview Nodes, NDBs SDBs Scalable Tables Images
Scalable Tables Splitting
sd_create_node
sd_create_node_database
.
18Introduction State of the Art SD-SQL Server
Architecture SD-SQL Server Application
Interface Implementation of SD-SQL
Server Conclusion Future Work
Overview Nodes, NDBs SDBs Scalable Tables Images
Images
- An Image hides the scalable table partitioning
- An image is an SQL Server distributed updateable
partitioned view of the table - An SQL Server Union-all view with check
constraints - An image resides on client or peer NDBs
19Introduction State of the Art SD-SQL Server
Architecture SD-SQL Server Application
Interface Implementation of SD-SQL
Server Conclusion Future Work
Overview Nodes, NDBs SDBs Scalable Tables Images
Image Types
- Primary image
- Resides at the creation node
- Has the name of the scalable table
- Secondary images
- Reside at other client or peer NDBs of the SDB
- Have a specific name, other than that of the
table - To avoid name conflict
20Introduction State of the Art SD-SQL Server
Architecture SD-SQL Server Application
Interface Implementation of SD-SQL
Server Conclusion Future Work
Overview Nodes, NDBs SDBs Scalable Tables Images
Image Adjustment
- An image presents the actual partitioning of its
scalable table - Defines the partitioning as known to the client
- It do not address any new segments resulted from
a split - Are dynamically adjustable by the client
- When a query to the image comes in
- Image checking
- Image adjustment if necessary
21Introduction State of the Art SD-SQL Server
Architecture SD-SQL Server Application
Interface Implementation of SD-SQL
Server Conclusion Future Work
Overview Nodes, NDBs SDBs Scalable Tables Images
Image Adjustment
- Get the number of segments presented in the
image, N1 - Get the number of segments of the scalable table,
N2 - Compare N1 and N2
- If N1ltN2 then Image Adjustment
- Alter the partitioned view definition
22Introduction State of the Art SD-SQL Server
Architecture SD-SQL Server Application
Interface Implementation of SD-SQL
Server Conclusion Future Work
Overview Nodes, NDBs SDBs Scalable Tables Images
Images Example
DB1 SDB
T Image
T Scalable Table
CREATE VIEW T AS SELECT FROM N2.DB1.SD._N1_T
CREATE VIEW T AS SELECT FROM N2.DB1.SD._N1_T
UNION ALL SELECT FROM N3.DB1.SD._N1_T
UNION ALL SELECT FROM N4.DB1.SD._N1_T
231. Introduction
2. State of the Art
P L A N
3. SD-SQL Server Architecture
4. SD-SQL Server Application Interface
5. Implementation of SD-SQL Server
6. Performance Measurements
7. Conclusion Future Work
24Principles Nodes Management SDBs NDBs
Management Scalable Tables Images
Management Scalable Queries Management
Introduction State of the Art SD-SQL Server
Architecture SD-SQL Server Application
Interface Implementation of SD-SQL
Server Conclusion Future Work
Principles
Litwin, Schwartz Sahri. IASTED-DBA 2006
- The application interface manipulates scalable
tables through SD-SQL Server commands - The SD-SQL Server commands start with sd_ to
distinguish from SQL Server commands for static
tables -
25Introduction State of the Art SD-SQL Server
Architecture SD-SQL Server Application
Interface Implementation of SD-SQL
Server Conclusion Future Work
Principles Nodes Management SDBs NDBs
Management Scalable Tables Images
Management Scalable Queries Management
Nodes Management
- Node Creation
- sd_create_node Dell1 / Server by default /
- sd_create_node Ceria, client
- Node Alteration
- sd_alter_node Ceria, ADD server / Becomes
peer/ - Node Removal
- sd_drop_node Ceria
26Principles Nodes Management SDBs NDBs
Management Scalable Tables Images
Management Scalable Queries Management
Introduction State of the Art SD-SQL Server
Architecture SD-SQL Server Application
Interface Implementation of SD-SQL
Server Conclusion Future Work
SDB NDB Management
- SDB Creation
- sd_create_scalable_database
- SkyServer, Dell1, Server, 2
- / Creates the primary SkyServer NDB as well at
Dell1/ - SDB Alteration
- sd_create_node_database SkyServer, Ceria,
Client - SDB Removal
- sd_drop_scalable_database SkyServer
27Principles Nodes Management SDBs NDBs
Management Scalable Tables Images
Management Scalable Queries Management
Introduction State of the Art SD-SQL Server
Architecture SD-SQL Server Application
Interface Implementation of SD-SQL
Server Conclusion Future Work
Scalable Tables Management
- Scalable Table Creation
- sd_create_table PhotoObj (Objid BIGINT PRIMARY
KEY..), 10000 - Scalable Table Alteration
- sd_alter_table PhotoObj ADD t INT, 1000
- sd_create_index run_index ON Photoobj (run)
- sd_drop_index PhotoObj.run_index
- Scalable Table Removal
- sd_drop_table PhotoObj
28Introduction State of the Art SD-SQL Server
Architecture SD-SQL Server Application
Interface Implementation of SD-SQL
Server Conclusion Future Work
Principles Nodes Management SDBs NDBs
Management Scalable Tables Images
Management Scalable Queries Management
Images Management
- Secondary Image Creation
- sd_create_image Ceria, PhotoObj
- Secondary Image Removal
- sd_drop_image 'PhotoObj
29Introduction State of the Art SD-SQL Server
Architecture SD-SQL Server Application
Interface Implementation of SD-SQL
Server Conclusion Future Work
Principles Nodes Management SDBs NDBs
Management Scalable Tables Images
Management Scalable Queries Management
Scalable Queries Management
- USE SkyServer / SQL Server command /
- Scalable Update Queries
- sd_insert INTO PhotoObj SELECT FROM
Ceria.Skyserver-S.dbo.PhotoObj - Scalable Search Queries
- sd_select FROM PhotoObj
- sd_select TOP 5000 INTO PhotoObj1 FROM
PhotoObj, 500
301. Introduction
2. State of the Art
P L A N
3. SD-SQL Server Architecture
4. SD-SQL Server Application Interface
5. Implementation of SD-SQL Server
6. Performance Measurements
7. Conclusion Future Work
31Introduction State of the Art SD-SQL Server
Architecture SD-SQL Server Application
Interface Implementation of SD-SQL
Server Conclusion Future Work
Internal Processing Concurrency Experiments
Command Processing
Litwin, Schwartz Sahri. WDAS 2006
- Let Q a scalable query using the PhotoObj image
- sd_select COUNT () FROM PhotoObj
Find Images in Q
PhotoObj Image Adjustment
Execution of Q
32Introduction State of the Art SD-SQL Server
Architecture SD-SQL Server Application
Interface Implementation of SD-SQL
Server Conclusion Future Work
Internal Processing Concurrency Experiments
Concurrency
- SD-SQL Server processes every command as SQL
distributed transaction at Repeatable Read
isolation level - Much less blocking than at Serializable Level
- SD-SQL Server performs the split asynchronously
with the insert that triggered it - It launches the actual splitting as an
asynchronous job called splitter
33Introduction State of the Art SD-SQL Server
Architecture SD-SQL Server Application
Interface Implementation of SD-SQL
Server Conclusion Future Work
Internal Processing Concurrency Experiments
Concurrency
- Splits use exclusive locks on segments and on
tuples in RP meta-table. - Shared locks on other meta-tables Primary, NDB
meta-tables - Scalable queries use basically shared locks on
meta-tables and any other table involved
34Introduction State of the Art SD-SQL Server
Architecture SD-SQL Server Application
Interface Implementation of SD-SQL
Server Conclusion Future Work
Internal Processing Concurrency Experiments
Concurrency example
X
Exclusive Lock
Waiting
Shared Lock
X
Exclusive Lock
Exclusive Lock
351. Introduction
2. State of the Art
P L A N
3. SD-SQL Server Architecture
4. SD-SQL Server Application Interface
5. Implementation of SD-SQL Server
6. Performance Measurements
7. Conclusion Future Work
36Introduction State of the Art SD-SQL Server
Architecture SD-SQL Server Application
Interface Implementation of SD-SQL
Server Conclusion Future Work
Internal Processing Concurrency Experiments
Experimental Environment
- 6 Machines Pentium IV 1.7 GHz
- RAM 780 Mb 1 Gb
- Operating System Windows 2K Server
- Ethernet Network max bandwidth of 1 Gb/s
- Use of SQL Analyzer for editing queries
- Use of SQL Profiler to take measurements
37Introduction State of the Art SD-SQL Server
Architecture SD-SQL Server Application
Interface Implementation of SD-SQL
Server Conclusion Future Work
Principles Nodes Management SDBs NDBs
Management Scalable Tables Images
Management Scalable Queries Management
The SkyServer Benchmark
- We use SkyServer database as benchmark
- Provided and installed at Ceria by Dr. Gray
- SkyServer brings the entire database of the Sloan
Digital Sky Survey, SDSS - We use of the PhotoObj table as an example
scalable table - In our experiments, PhotoObj has 158,426 tuples
(about 260 MB) - Originally, it has 14 M tuples
38Introduction State of the Art SD-SQL Server
Architecture SD-SQL Server Application
Interface Implementation of SD-SQL
Server Conclusion Future Work
Internal Processing Concurrency Experiments
Splitting Measurements
Litwin, Sahri Schwartz. WDAS 2004
Splitting of PhotoObj scalable table into 2, 3, 4
and 5 segments according to different capacities
39Introduction State of the Art SD-SQL Server
Architecture SD-SQL Server Application
Interface Implementation of SD-SQL
Server Conclusion Future Work
Internal Processing Concurrency Experiments
Image Adjustment
(Q1) sd_select TOP 10 objid FROM PhotoObj
WHERE objid not in
(SELECT objid FROM PhotoObj WHERE objid lt
_at_objidMax
Query (Q1) execution time
40Introduction State of the Art SD-SQL Server
Architecture SD-SQL Server Application
Interface Implementation of SD-SQL
Server Conclusion Future Work
Internal Processing Concurrency Experiments
Comparison between SD-SQL Server and SQL Server
Litwin, Sahri Schwartz. BNCOD 2006
- (Q2) sd_select COUNT () FROM PhotoObj
-
Execution time of (Q2) on SQL Server and SD-SQL
Server
411. Introduction
2. State of the Art
P L A N
3. SD-SQL Server Architecture
4. SD-SQL Server Application Interface
5. Implementation of SD-SQL Server
6. Performance Measurements
7. Conclusion Future Work
42Introduction State of the Art SD-SQL Server
Architecture SD-SQL Server Application
Interface Implementation of SD-SQL
Server Conclusion Future Work
Conclusion Future Work
Conclusion
- Scalable distributed databases with scalable
tables are now a reality with SD-SQL Server - No more manual repartitioning
- Unlike in any other DBS we know about
- The performance analysis proves
- Efficiency of our design
- Immediate utility of SD-SQL Server
43Introduction State of the Art SD-SQL Server
Architecture SD-SQL Server Application
Interface Implementation of SD-SQL
Server Conclusion Future Work
Conclusion Future Work
Future Work
- More performance measurements
- With the SDSS queries
- With the SkyServer benchmark of 80 Gb size
- Error processing
- Management of fault tolerance
- Use of the high availability methods
44Introduction State of the Art SD-SQL Server
Architecture SD-SQL Server Application
Interface Implementation of SD-SQL
Server Conclusion Future Work
Conclusion Future Work
Future Work
- Application on other DBMSs
- Oracle, DB2, etc.
- Use of the SD-SQL Server principles on P2P
systems or Grid Computing - Use of SD-SQL Server as core component of a
virtual repository of eGov documents
45Introduction State of the Art SD-SQL Server
Architecture SD-SQL Server Application
Interface Implementation of SD-SQL
Server Conclusion Future Work
Conclusion Future Work
Thanking
- Work partly supported by
- CEE Project eGov
- MS Research
- CEE Project ICONS
46Introduction State of the Art SD-SQL Server
Architecture SD-SQL Server Application
Interface Implementation of SD-SQL
Server Conclusion Future Work
Conclusion Future Work
Thank you for your Attention
Soror SAHRI Soror.Sahri_at_dauphine.fr
http//ceria.dauphine.fr/soror/soror.html