Title: The Future of Distributed Databases
2. The Future of Distributed Databases
- Self-Organizing Distributed DBMSs
- Multimedia Databases
- Object-Oriented DBMSs
- Course Summary
3. Problems Facing the DBA: Practical Advice
- A set of requests is only an approximation of the total load
- Request frequencies are approximations
- Statistical summaries of data may not reflect future changes
- Models used in calculations may not be completely accurate
- There may be no fast algorithm to find the best solution
- So
- Find a reasonable design, not an optimal design
4. Need for Tuning
- Characteristics of data change
- Size of fragments
- Location of fragments
- Replication of fragments
- New fragments are added
- Existing fragments are split and/or combined
- Applications change
- Old applications evolve
- New applications are implemented
- Users acquire more sophistication
- So
Tuning is a Continual Process
5. Where to Get Information?
- Problems facing the DBA
- Obtain accurate usage pattern
- Develop model of operation
- Rate alternative designs
- Obtain meaningful predictions
- So
- Look to DBMS for assistance
- DBMS is the best source of information
6. Design Evaluation Module: Overview
[Diagram components: DBA, Usage Pattern, Design Evaluation Module, Information Gathering Module, DB Internal Characteristics]
7. Design Evaluation Module: Detail
[Diagram components: Transaction Cost Estimator, Design Cost Estimator, Estimated Design Cost]
- Also used to evaluate the cost of user queries
8. Automated Design Strategy 1
- Iterate design evaluation over all possible designs
- Disadvantage
- Astronomical number of possible designs
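The growth in the number of designs can be made concrete with a minimal sketch. It assumes a design is an allocation of fragments to sites; the cost table `access_cost`, the function names, and the counting rule (each fragment stored at any non-empty subset of sites) are illustrative assumptions, not part of the original slides.

```python
from itertools import product

def count_designs(num_fragments: int, num_sites: int) -> int:
    """Count allocations when every fragment may be replicated at any
    non-empty subset of sites (an assumed design space)."""
    return (2 ** num_sites - 1) ** num_fragments

def exhaustive_search(num_fragments, num_sites, cost):
    """Evaluate every non-replicated allocation and return the cheapest."""
    best = min(product(range(num_sites), repeat=num_fragments), key=cost)
    return best, cost(best)

# Hypothetical cost table: access_cost[f][s] is the cost of placing
# fragment f at site s.
access_cost = [[4, 1, 3],
               [2, 5, 1],
               [3, 3, 2]]

def toy_cost(design):
    """Sum the placement cost of each fragment in the design."""
    return sum(access_cost[f][s] for f, s in enumerate(design))

best_design, best_cost = exhaustive_search(3, 3, toy_cost)
print(best_design, best_cost)   # cheapest of the 27 non-replicated designs
print(count_designs(10, 5))     # 10 fragments, 5 sites, with replication
print(count_designs(50, 10))    # already far beyond exhaustive evaluation
```

Exhaustive evaluation is feasible for three fragments and three sites, but the count grows exponentially in the number of fragments.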
9. Automated Design Strategy 2
- Derive analytic cost equations
- Apply mathematical optimization techniques
- Disadvantages
- Difficult to derive cost equations
- Entails oversimplification
- Expensive optimization techniques
10. Automated Design Strategy 3
- Use heuristic search techniques to select a near-optimal design
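One possible heuristic is hill climbing, sketched below as an illustration only; the slides do not say which heuristic is intended. The cost table, the site capacity, and the overload penalty are hypothetical. Hill climbing can stop at a local optimum, which is why the result is described as near-optimal rather than optimal.

```python
def hill_climb(num_fragments, num_sites, cost, start=None):
    """Move one fragment at a time to another site while that lowers cost."""
    design = list(start) if start is not None else [0] * num_fragments
    current = cost(design)
    evaluations = 1
    improved = True
    while improved:
        improved = False
        for f in range(num_fragments):
            for s in range(num_sites):
                if s == design[f]:
                    continue
                candidate = design[:f] + [s] + design[f + 1:]
                candidate_cost = cost(candidate)
                evaluations += 1
                if candidate_cost < current:
                    design, current, improved = candidate, candidate_cost, True
    return tuple(design), current, evaluations

# Hypothetical inputs: per-site placement costs and a storage limit.
access_cost = [[4, 1, 3],
               [2, 5, 1],
               [3, 3, 2]]
CAPACITY = 2            # assumed maximum number of fragments per site
OVERLOAD_PENALTY = 10   # assumed penalty per fragment above capacity

def toy_cost(design):
    """Placement cost plus a penalty for overloaded sites."""
    total = sum(access_cost[f][s] for f, s in enumerate(design))
    for site in set(design):
        excess = design.count(site) - CAPACITY
        if excess > 0:
            total += OVERLOAD_PENALTY * excess
    return total

design, design_cost, evaluations = hill_climb(3, 3, toy_cost)
print(design, design_cost, evaluations)
```

On this small input the search evaluates fewer designs than the 27 an exhaustive scan would need; the saving matters far more as the number of fragments and sites grows.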
11. Summary
- Engineering discipline is replacing black art.
- DBA concentrates on design requirements.
- Automated aids
12. The Future of Distributed Databases
- Self-Organizing Distributed DBMSs
- Multimedia Databases
- Object-Oriented DBMSs
- Course Summary
13. Multimedia Databases
- Use distributed DBMSs to store information about text, graphics, voice, and video
14. Thesaurus
What terms are more specific than pine?
    select SpecificTerm from Thesaurus where GeneralTerm = "pine"
What term is more general than pine?
    select GeneralTerm from Thesaurus where SpecificTerm = "pine"
15. Term-Document Index
What are the numbers of documents that contain the words "white pine"?
    select DocumentNumber from Index where Term = "white pine"
What are the numbers of documents that contain the words "white pine" and "blister rust"?
    select DocumentNumber from Index where Term = "white pine"
    INTERSECT
    select DocumentNumber from Index where Term = "blister rust"
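The thesaurus and term-document queries can be tried with Python's built-in `sqlite3` module. This is a sketch under stated assumptions: the sample rows are invented for illustration, and the table name `Index` is quoted because INDEX is a keyword in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Thesaurus (GeneralTerm TEXT, SpecificTerm TEXT);
    CREATE TABLE "Index" (Term TEXT, DocumentNumber INTEGER);
    -- Hypothetical sample rows, not taken from the slides.
    INSERT INTO Thesaurus VALUES
        ('tree', 'pine'), ('pine', 'white pine'), ('pine', 'red pine');
    INSERT INTO "Index" VALUES
        ('white pine', 1), ('white pine', 2),
        ('blister rust', 2), ('blister rust', 3);
""")

def column(sql, params=()):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql, params)]

# Terms more specific than "pine".
more_specific = column(
    "SELECT SpecificTerm FROM Thesaurus WHERE GeneralTerm = ? "
    "ORDER BY SpecificTerm", ("pine",))

# Term more general than "pine".
more_general = column(
    "SELECT GeneralTerm FROM Thesaurus WHERE SpecificTerm = ?", ("pine",))

# Documents that contain both "white pine" and "blister rust".
both_terms = column("""
    SELECT DocumentNumber FROM "Index" WHERE Term = ?
    INTERSECT
    SELECT DocumentNumber FROM "Index" WHERE Term = ?
""", ("white pine", "blister rust"))

print(more_specific, more_general, both_terms)
```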
16. Location
    select text (location) from LocationTable where DocumentNumber = 1
17. Location Example
Display text with the term "white pine"
    select text (Location) from LocationTable, Index
    where LocationTable.DocumentNumber = Index.DocumentNumber
    and Term = "white pine"
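The join above can be sketched with `sqlite3` as follows. The slides write `text (Location)` without saying how the text is fetched, so this example assumes a user-defined SQL function, here called `fetch_text` (a hypothetical name), that looks the text up in an in-memory dictionary standing in for the document store. All sample rows are invented.

```python
import sqlite3

# Hypothetical document store: storage location -> document text.
storage = {
    1200: "Text of document 1",
    2400: "Text of document 2",
}

conn = sqlite3.connect(":memory:")
# Register the stand-in for the slides' text(location) operation.
conn.create_function("fetch_text", 1, lambda location: storage[location])
conn.executescript("""
    CREATE TABLE LocationTable (DocumentNumber INTEGER, Location INTEGER);
    CREATE TABLE "Index" (Term TEXT, DocumentNumber INTEGER);
    INSERT INTO LocationTable VALUES (1, 1200), (2, 2400);
    INSERT INTO "Index" VALUES
        ('white pine', 1), ('white pine', 2), ('blister rust', 2);
""")

def texts_for(term):
    """Return the text of every document indexed under the given term."""
    rows = conn.execute("""
        SELECT fetch_text(Location)
        FROM LocationTable, "Index"
        WHERE LocationTable.DocumentNumber = "Index".DocumentNumber
          AND Term = ?
        ORDER BY LocationTable.DocumentNumber
    """, (term,))
    return [row[0] for row in rows]

white_pine_texts = texts_for("white pine")
blister_rust_texts = texts_for("blister rust")
print(white_pine_texts, blister_rust_texts)
```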
18. Typical Indexing Strategy
- Remove punctuation
- Remove special characters
- Remove common words (stop words)
- Replace words by their stems
- Remove words that occur fewer than n times in the text
- Place words into INDEX
Example: "Many white pines have blister rust."
[Figure: the example sentence annotated with the step numbers 3, 3, 4, 1]
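The indexing steps listed above can be sketched in a few lines. The stop-word list, the crude suffix-stripping `stem` function, and the lowercasing step are assumptions made for illustration; a real system would use a proper stemmer and stop list. This sketch indexes single words, whereas the slides' Index table also holds phrases such as "white pine".

```python
from collections import Counter

# Assumed stop-word list; the slides do not give one.
STOP_WORDS = {"many", "have", "the", "a", "of", "and"}

def stem(word):
    """Very crude stand-in for a stemmer: strip a trailing 's'."""
    return word[:-1] if word.endswith("s") and len(word) > 3 else word

def index_terms(text, n=1):
    """Apply the indexing steps to one text and return its index terms."""
    # Steps 1-2: replace punctuation and special characters with spaces.
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " "
                      for ch in text.lower())
    # Step 3: remove stop words.
    words = [w for w in cleaned.split() if w not in STOP_WORDS]
    # Step 4: replace words by their stems.
    stems = [stem(w) for w in words]
    # Step 5: drop terms that occur fewer than n times.
    counts = Counter(stems)
    return sorted(term for term, count in counts.items() if count >= n)

def build_index(documents, n=1):
    """Step 6: map each term to the numbers of the documents containing it."""
    index = {}
    for number, text in sorted(documents.items()):
        for term in index_terms(text, n):
            index.setdefault(term, []).append(number)
    return index

terms = index_terms("Many white pines have blister rust .")
index = build_index({1: "Many white pines have blister rust .",
                     2: "White pine"})
print(terms)
print(index)
```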
19. Pictures: Location Table
LocationTable
    Illustration Number    Location
    1                      4460
    2                      5460
    select illustration (location) from LocationTable where IllustrationNumber = 1
20. Pictures: Example
[Figures: Illustration 1, Illustration 2]
Index
    Illustration Number    Term
    1                      white pine
    2                      super-car
Display illustrations showing "white pines"
    select illustration (location) from LocationTable, Index
    where LocationTable.IllustrationNumber = Index.IllustrationNumber
    and Term = "white pine"
21. Compound Document
- Compound document X consists of all the data
units (perhaps of different media) in the leaves
plus the structure of the subtree with root X
[Diagram: tree for a compound document; labels include Monthly Report, Layout Composer, Section 1, Section 2, and placeholder text leaves]
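The definition of a compound document as "leaves plus subtree structure" can be sketched as a small tree type. The `Node` class, the helper names, and the sample report contents are hypothetical and chosen only to illustrate the definition.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    """A node in a compound document tree; only leaves carry a medium."""
    name: str
    medium: Optional[str] = None
    children: List["Node"] = field(default_factory=list)

def leaves(node):
    """Return the data units (leaf nodes) of the subtree rooted at node."""
    if not node.children:
        return [node]
    found = []
    for child in node.children:
        found.extend(leaves(child))
    return found

def structure(node, depth=0):
    """Return the subtree structure as indented lines of node names."""
    lines = ["  " * depth + node.name]
    for child in node.children:
        lines.extend(structure(child, depth + 1))
    return lines

# Hypothetical monthly report with leaves of different media.
report = Node("Monthly Report", children=[
    Node("Section 1", children=[Node("intro", "text"),
                                Node("sales chart", "graphics")]),
    Node("Section 2", children=[Node("summary", "text"),
                                Node("voice note", "voice")]),
])

section1 = report.children[0]
section1_units = [leaf.name for leaf in leaves(section1)]
report_media = [leaf.medium for leaf in leaves(report)]
print(section1_units, report_media)
print("\n".join(structure(report)))
```

Under this reading, compound document "Section 1" is the pair of its two leaves and the subtree shape returned by `structure(section1)`.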
22. Compound Document Contents
- Text
- Graphics
- Images
- Voice
- Spreadsheets and charts
- Annotations
23. Mixed Media Impact on Distributed Databases
- Extend fragmentation and allocation procedures
- Extend request optimization algorithms
- Extend or replace user interface
- Optimize distributed DBMS to minimize handling and transfer of nontabular data
24. Disadvantages of Using RDBMSs for Multimedia
- Forced to fit nontabular data into tables
- Additional processing is needed to reconstruct nontabular data structures
- RDBMSs are good for tabular data, but may not handle complex structures well.
25. The Future of Distributed Databases
- Self-Organizing Distributed DBMSs
- Multimedia Databases
- Object-Oriented DBMSs
- Course Summary
26. Classes and Instances
[Diagram: the classes PERSON and EMPLOYEE and their instances, connected by IsA links]
- Classes represent sets of homogeneous real-world objects.
- A class instance (object) represents a single real-world object.
27. Inheritance
[Diagram: PERSON and EMPLOYEE connected by IsA links, with an example instance]
    Name: Jones
    Birthdate: Dec. 11, 1956
    Department: Toy
    Works on: Rambo Doll
    Dependents: Sally, Sam
- An object instance inherits all of the attributes of its parent.
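Inheritance of attributes can be illustrated with the Jones example. The split of attributes between the two classes (name and birthdate on PERSON, the rest on EMPLOYEE) is an assumption; the slide lists the attributes without saying where each is declared.

```python
from dataclasses import dataclass, field, fields
from typing import List

@dataclass
class Person:
    # Assumed to be the attributes declared on PERSON.
    name: str
    birthdate: str

@dataclass
class Employee(Person):
    # Assumed to be the attributes added by EMPLOYEE (IsA PERSON).
    department: str = ""
    works_on: str = ""
    dependents: List[str] = field(default_factory=list)

jones = Employee("Jones", "Dec. 11, 1956",
                 department="Toy", works_on="Rambo Doll",
                 dependents=["Sally", "Sam"])

# The employee instance carries the inherited attributes and its own.
attribute_names = [f.name for f in fields(jones)]
print(attribute_names)
print(isinstance(jones, Person))
```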
28. Methods and Encapsulation
GiveRaise (100)
- A method is like a subroutine inside of a class.
- Users don't know how the subroutine is represented.
- Users only know the interface to the subroutine.
- Hiding the implementation behind the interface is encapsulation.
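A minimal sketch of the GiveRaise example follows. The `Employee` class, the salary figures, and the choice to store the salary internally in cents are all hypothetical; the point is only that callers use `give_raise` and `salary` without depending on the internal representation.

```python
class Employee:
    """Callers see give_raise() and salary, not how the salary is stored."""

    def __init__(self, name, salary):
        self.name = name
        # Internal representation (an assumption): whole cents.
        self._salary_cents = round(salary * 100)

    def give_raise(self, amount):
        """Increase the salary by the given amount."""
        self._salary_cents += round(amount * 100)

    @property
    def salary(self):
        """Return the salary in the same units the caller supplied."""
        return self._salary_cents / 100

jones = Employee("Jones", 1000)
jones.give_raise(100)
print(jones.salary)
```

The internal field could later change, for example to a decimal type, without affecting any caller of `give_raise`.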
29. Complex Objects
[Diagram: the class SECTION and its object instances, connected by PartOf links]
- Object instances may be arbitrarily complex.
30. Object-Oriented DBMS
- Object-oriented DBMSs are like traditional DBMSs
- Persistence
- Sharing
- Consistency
- Resilience
- Associative retrieval
- Object-oriented DBMSs also support operations specific to objects.
- In a sense, business logic is performed by object-oriented DBMSs when the user or the application invokes methods.
31. Distributed Object-Oriented DBMS
- Objects can be distributed throughout a network of database servers.
- Current research problems
- Object allocation is difficult because
- Access frequencies are not usually available.
- Complex objects may need to be partitioned.
- Query optimization is difficult because
- Optimizer may not know about special methods.
- Optimizer may not know how methods are implemented.
32. Future of Distributed DBMSs: Summary
- DBMSs will be extended to support multimedia and hypertext.
- Design tools will be used to determine placement of fragments in distributed DBMSs.
- Some distributed DBMSs will automatically migrate fragments from site to site.
33. The Future of Distributed Databases
- Self-Organizing Distributed DBMSs
- Multimedia Databases
- Object-Oriented DBMSs
- Course Summary
34. Course Summary
- Choose the features you need in a distributed DBMS and then choose a distributed DBMS architecture.
- Choose data fragments and distribute/replicate them as necessary.
- Use a high-level language for formulating requests; let optimizers determine the distributed execution plan.
- Planning and cooperation are necessary for a successful distributed DBMS.
35. Course Summary, cont.
- Object-oriented, mixed-media databases are coming
- Lots of challenges ahead
- Things are moving fast
- It's an exciting time to be working with distributed DBMSs