PicoDBMS: Scaling down Database Techniques for the Smartcard - PowerPoint PPT Presentation

1
PicoDBMS: Scaling down Database Techniques for
the Smartcard
  • Authors: Christophe Bobineau, Luc Bouganim,
    Philippe Pucheral, Patrick Valduriez
  • Presented by: ON Saitung

2
Outline
  • Smartcard Overview
  • Motivation of DBMS on Smartcard
  • Challenges of Scaling down DBMS for Smartcard
  • PicoDBMS: an elegant solution
  • - Storage Model
  • - Query Processing Model
  • Conclusion

3
Smartcard Products
4
Smartcard Architecture
5
Smartcard Overview
  • Advantages
  • - Portability (small size, light weight)
  • - Security
  • Physical view: tamper-proof
  • Data view: strongly encrypted
  • - Availability (anytime, anywhere, on any
    terminal)
  • - Widely used (banking, healthcare, insurance,
    etc.)
  • - Low cost (to be sold in large volumes)

6
Smartcard Overview
  • Disadvantages
  • - Limited storage capacity (up to 128Kb EEPROM)
  • - Very slow write speed in EEPROM (10ms/word)
  • - Extremely small size of the RAM (i.e., 4KB)
  • - Lack of autonomy: no independent power supply,
    thus no asynchronous or disconnected processing

7
Motivation of DBMS on Smartcard
  • The Trend towards multi-application, versatility

8
Motivation of DBMS on Smartcard
  • Smartcards will become more powerful
  • Moore's law for processor and memory capacities
  • To handle complex queries on a large volume of
    data
  • To keep data consistent
  • i.e. to meet the transaction ACID properties
  • The natural merit of a DBMS
  • Security concern
  • implementing the DBMS outside the smartcard is
    unsuitable

9
Why should the traditional DBMS be scaled down?
  • Recall the limitations of the smartcard
  • - tiny RAM, little stable storage
  • - costly writes and lack of autonomy
  • - high demands on security
  • Traditional DBMS
  • - Consume significant amounts of RAM
  • - Use caching and asynchronous I/Os to reduce
    disk accesses
  • - Complex algorithms

Conflict!
Therefore, how to scale down database techniques
becomes the major problem!
10
The existing solutions to scale down Database
techniques
  • Light DBMS versions used on PDAs have a small
    footprint but still use much RAM and stable memory
  • - Sybase Adaptive Server Anywhere
  • - Oracle 8i Lite
  • - DB2 Everywhere
  • Solutions used on smart cards limit their
    optimization to simple select queries
  • - ISOL's SQLJava Machine DBMS
  • - SCQL - standard for smart card database
    language

Therefore, further scaling down database
techniques is required!
11
Database Management Requirements
Database management requirements of various
classes of smartcard applications
12
Detailed study on Health Card Application
  • The representative of personal folder
    applications
  • The amount of data is significant (duplicates
    exist)
  • Queries can be rather complex
  • Mature access rights management is required
  • Atomicity must be preserved.
  • Durability is a must.

13
Challenges of Scaling down DBMS for Smartcard
  • How to minimize the data structures to fit on the
    smart card?
  • How to minimize the RAM usage so that no overflow
    happens?
  • How to minimize write operations to enhance
    performance?
  • How to take advantage of the fast read operations
    of EEPROM?
  • How to enforce ACID for transactions?
  • How to minimize the complexity of the algorithms
    to keep them secure?

14
PicoDBMS Architecture
  • Storage manager
  • manages the storage of data and indices
  • Query manager
  • processes query plans composed of select,
    project, join and aggregates.
  • Transaction manager
  • enforces the ACID properties.
  • Access right manager
  • provides access rights on base data and on
    user-defined views.

15
PicoDBMS Design Rules
  • Compactness rule
  • Minimize the size of data and indices
  • RAM rule
  • Minimize the RAM usage while executing
    queries
  • Write rule
  • Minimize writes in stable storage
  • Read rule
  • Take advantage of fast read of EEPROM
  • Access rule
  • Take advantage of direct access capability
    of EEPROM
  • Security rule
  • Never externalize protected data; simplify
    algorithms

These rules address the aforementioned
challenges!
16
PicoDBMS Storage Model
  • Existing storage models
  • Storage cost evaluation
  • - A combination of the existing storage models
  • - Chooses the best storage model for each
    attribute

17
Flat Storage model
  • Tuples are stored sequentially, with attribute
    values embedded.
  • Two drawbacks
  • 1. Space consuming (cannot avoid value
    duplicates)
  • 2. Inefficient (operations have to be computed
    by sequential scans)
  • Indexed FS
  • solves the second drawback, but makes the first
    drawback worse.

[Figure: Relation R in flat storage]
18
Domain Storage model
  • Factors out values into a domain table for
    data compactness.
  • Three advantages
  • 1. Possible space saving
  • 2. The amount of writes can possibly be reduced
  • 3. All tuples of all relations become
    fixed-size
  • One disadvantage
  • Inefficient (operations have to be computed
    by sequential scans)
  • Only use when
  • 1. data size > pointer size
  • 2. duplicates exist
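
As a minimal sketch of the idea above (not the PicoDBMS implementation), the Python below factors a column's values into a domain table and compares byte counts. The names DomainTable, intern, flat_size and domain_size are hypothetical, and the 3-byte pointer is taken from the cost-model slide.

```python
from typing import Dict, List

POINTER_SIZE = 3  # bytes; assumption borrowed from the cost model (p = 3)


class DomainTable:
    """Holds each distinct attribute value exactly once."""

    def __init__(self) -> None:
        self.values: List[str] = []
        # Build-time lookup only (an assumption of this sketch);
        # it is not part of the stored data.
        self._position: Dict[str, int] = {}

    def intern(self, value: str) -> int:
        """Return the position of `value`, adding it if it is new."""
        if value not in self._position:
            self._position[value] = len(self.values)
            self.values.append(value)
        return self._position[value]


def flat_size(column: List[str]) -> int:
    """Bytes used when every tuple embeds its own copy of the value."""
    return sum(len(v) for v in column)


def domain_size(column: List[str]) -> int:
    """Bytes used with one pointer per tuple plus each distinct value once."""
    domain = DomainTable()
    pointers = [domain.intern(v) for v in column]
    return len(pointers) * POINTER_SIZE + sum(len(v) for v in domain.values)


drug_types = ["antibiotic", "analgesic", "antibiotic", "antibiotic"]
print(flat_size(drug_types), domain_size(drug_types))
```

The saving only appears because the values are longer than a pointer and are duplicated, which matches the two "only use when" conditions.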

19
Domain Storage Illustration
[Figure: Relation R and Relation S, with tuples pointing to domain values]
20
Join on Domain Storage
  • R.a is a primary key; S.b is a foreign key
    referencing R.a
  • It naturally forms a unidirectional join
    index from S.b to R.a
  • e.g. select * from S, R where S.b = R.a
  • Cost: O(m) vs. O(mn)

[Figure: Relation R with attribute R.a and Relation S with attribute S.b; cardinalities labelled m and n]
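
The cost difference can be illustrated with a small sketch. It assumes m is the cardinality of S and n the cardinality of R, and that each S tuple stores the position of the R tuple it references (the hypothetical field b_ptr). The function names are illustrative only.

```python
from typing import Any, Dict, Iterator, List, Tuple

Row = Dict[str, Any]


def nested_loop_join(s_rows: List[Row], r_rows: List[Row]) -> Iterator[Tuple[Row, Row]]:
    """Value-based join: compares every S tuple with every R tuple, O(m*n)."""
    for s in s_rows:
        for r in r_rows:
            if s["b"] == r["a"]:
                yield s, r


def pointer_join(s_rows: List[Row], r_rows: List[Row]) -> Iterator[Tuple[Row, Row]]:
    """Pointer-based join: one pointer dereference per S tuple, O(m)."""
    for s in s_rows:
        yield s, r_rows[s["b_ptr"]]


# R.a is the primary key; S.b references it both by value and by position.
R = [{"a": 10, "name": "Smith"}, {"a": 20, "name": "Jones"}]
S = [
    {"b": 20, "b_ptr": 1, "year": 1999},
    {"b": 10, "b_ptr": 0, "year": 1999},
    {"b": 20, "b_ptr": 1, "year": 2000},
]

for s, r in pointer_join(S, R):
    print(s["year"], r["name"])
```

The pointer only helps in one direction: going from an R tuple to its matching S tuples still needs a scan of S.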
21
Ring Storage Model
  • Addresses both index compactness and data
    compactness.
  • Replaces tuple-to-value pointers with
    value-to-tuple pointers
  • One pointer per domain value, whatever the
    cardinality is.

[Figure: Relation S with attribute attr, the domain table, and the index on S.attr]
22
Select on Ring Storage
[Figure: Relation S with attribute attr, the domain table, and the index on S.attr]
e.g. select * from S where S.attr = 1
23
Join on Ring Storage
  • R.a is a primary key; S.b is a foreign key
    referencing R.a
  • Here, it naturally forms a bi-directional join
    index between S.b and R.a
  • e.g. select * from S, R where S.b = R.a
  • Cost: O(m^2/(2n)) vs. O(mn)

[Figure: Relation R with attribute R.a and Relation S with attribute S.b; cardinalities labelled m and n]
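
The ring idea can be sketched as follows, under the assumption that each domain value points to one tuple, each tuple points to the next tuple sharing that value, and the last tuple points back to the value. RingColumn and its methods are hypothetical names, and tagged Python tuples stand in for physical pointers.

```python
from typing import Any, List, Tuple

Pointer = Tuple[str, int]  # ("tuple", tuple_id) or ("value", value_id)


class RingColumn:
    """One attribute stored as rings: value -> tuple -> ... -> value."""

    def __init__(self) -> None:
        self.values: List[Any] = []
        self.heads: List[Pointer] = []  # one pointer per domain value
        self.next: List[Pointer] = []   # one pointer per tuple

    def insert(self, value: Any) -> int:
        """Add a tuple carrying `value` and return its tuple id."""
        if value in self.values:
            value_id = self.values.index(value)
        else:
            value_id = len(self.values)
            self.values.append(value)
            self.heads.append(("value", value_id))  # empty ring
        tuple_id = len(self.next)
        self.next.append(self.heads[value_id])     # link to the old head
        self.heads[value_id] = ("tuple", tuple_id)  # value points to new tuple
        return tuple_id

    def select(self, value: Any) -> List[int]:
        """Value to tuples: walk the ring starting from the domain value."""
        if value not in self.values:
            return []
        kind, idx = self.heads[self.values.index(value)]
        result = []
        while kind == "tuple":
            result.append(idx)
            kind, idx = self.next[idx]
        return result

    def value_of(self, tuple_id: int) -> Any:
        """Tuple to value: follow the ring until the domain value is reached."""
        kind, idx = self.next[tuple_id]
        while kind == "tuple":
            kind, idx = self.next[idx]
        return self.values[idx]


col = RingColumn()
for v in [1, 2, 1, 1]:
    col.insert(v)
print(col.select(1), col.value_of(0))
```

Reading a tuple's value costs a partial walk around its ring. That is the price paid for having both directions with a single pointer per tuple and per domain value.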
24
Storage cost evaluation
  • - The proposed storage model is a combination
    of FS, DS, RS and Indexed FS.
  • - Cost evaluation helps to decide which model
    to adopt for storing each attribute.
  • - For attributes that need indexing, the choice
    is between RS and Indexed FS
  • - For attributes that do not need indexing, the
    choice is between FS and DS

25
Parameters of Cost Model
  • CardRel: cardinality of the relation holding the
    attribute
  • CardDom: cardinality of the attribute domain
    table
  • a: average length of the attribute (bytes)
  • p: pointer size (3 bytes..)
  • S: selectivity factor of the attribute, $S \le 1$
    (redundancy of the attribute); $S = CardDom / CardRel$

26
Cost Formula
  • $Cost(FS) = CardRel \cdot a$
  • $Cost(DS) = CardRel \cdot p + S \cdot CardRel \cdot a$
  • $Cost(Indexed\ FS) = Cost(FS) + S \cdot CardRel \cdot a + CardRel \cdot p$
  • $Cost(RS) = Cost(DS) + S \cdot CardRel \cdot p$

- $Cost(FS) = Cost(DS)$ when $S = (a - p)/a$
- $Cost(Indexed\ FS) = Cost(RS)$ when $S = a/p$
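
The formulas can be checked numerically with a small sketch. The function names and the choose_model helper are hypothetical; only the four cost expressions come from the slide.

```python
def cost_fs(card_rel: int, a: float) -> float:
    """Flat storage: every tuple embeds the value."""
    return card_rel * a


def cost_ds(card_rel: int, a: float, p: float, s: float) -> float:
    """Domain storage: one pointer per tuple plus each distinct value once."""
    return card_rel * p + s * card_rel * a


def cost_indexed_fs(card_rel: int, a: float, p: float, s: float) -> float:
    """Flat storage plus an index holding the values and one pointer per tuple."""
    return cost_fs(card_rel, a) + s * card_rel * a + card_rel * p


def cost_rs(card_rel: int, a: float, p: float, s: float) -> float:
    """Ring storage: domain storage plus one pointer per domain value."""
    return cost_ds(card_rel, a, p, s) + s * card_rel * p


def choose_model(card_rel: int, a: float, p: float, s: float, indexed: bool) -> str:
    """Pick the cheaper model for one attribute (ties go to the first option)."""
    if indexed:
        rs = cost_rs(card_rel, a, p, s)
        ifs = cost_indexed_fs(card_rel, a, p, s)
        return "RS" if rs <= ifs else "Indexed FS"
    ds = cost_ds(card_rel, a, p, s)
    fs = cost_fs(card_rel, a)
    return "DS" if ds <= fs else "FS"


# With a = 12 and p = 3, the FS/DS break-even is S = (12 - 3) / 12 = 0.75.
print(cost_fs(1000, 12), cost_ds(1000, 12, 3, 0.75))
print(choose_model(1000, 12, 3, 0.5, indexed=False))
```

Substituting the definitions shows where the break-even points come from: $CardRel \cdot a = CardRel \cdot p + S \cdot CardRel \cdot a$ reduces to $S = (a - p)/a$, and equating the two indexed costs reduces to $a = S \cdot p$.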
27
FS vs. DS
  • $Cost(FS) = Cost(DS)$ when $S = (a - p)/a$

[Figure: S plotted against a for p = 3, with regions labelled "FS better" and "DS better"]
28
Indexed FS vs. RS
  • - $Cost(Indexed\ FS) = Cost(RS)$ when $S = a/p$

[Figure: S plotted against a for p = 3, with regions labelled "FS better" and "RS better"]
29
Traditional Query Processing
  • Traditional query processing consumes a large
    amount of main memory for storing
  • - Intermediate results
  • - Temporary data structures (e.g. hash
    tables)
  • When there is no space to hold them,
    state-of-the-art algorithms store them on disk to
    avoid memory overflow

30
Why not use these traditional algorithms?
  • Writes to stable memory are ruled out, to extend
    the lifetime of the stable memory.
  • The required RAM size cannot be estimated.
  • Those state-of-the-art algorithms are complex,
    and thus may bring security problems for the
    smartcard.

PicoDBMS proposes a query processing technique
that neither uses any RAM nor incurs any writes to
stable memory!
31
Query Execution
  • Query processing is done in two steps
  • - Query execution plan (QEP) generation
  • - Query engine executes this QEP
  • Different shapes of QEP
  • - Left deep tree
  • - Right deep tree
  • - Bushy tree
  • - Extreme right deep tree

32
Query Example (From Healthcare Database)
  • Doctor (DocId,name,specialty,...)
  • Prescription (VisId,DrugId,qty,...)
  • Visit (VisId,DocId,date,diagnostics,...)
  • Drug (DrugId,name,type)
  • Query Q1: Who prescribed antibiotics in 1999?

33
Left deep tree
  • Operators are executed sequentially
  • Each intermediate result is stored in stable
    memory.

Query Q1: Who prescribed antibiotics in 1999?
34
Right deep tree
  • Operators are executed in a pipelined fashion
  • All left-operands must be stored in memory

Query Q1: Who prescribed antibiotics in 1999?
35
Bushy tree
  • Both the intermediate results and left operands
    need to be materialized.

Query Q1: Who prescribed antibiotics in 1999?
36
Extreme right deep tree
  • Uses pipelining on every operator (including
    select)
  • As left operands are always base relations,
    they are already materialized in stable memory.
  • Thus, no RAM is consumed and no writes are
    incurred

Query Q1: Who prescribed antibiotics in 1999?
37
Iterator Model
  • Pipelined execution can be easily achieved using
    the Iterator Model.
  • Each operator is an iterator that supports 3
    procedure calls
  • 1. Open - prepares an operator for producing
    items
  • 2. Next - produces an item
  • 3. Close - performs final clean-up
  • Execution starts from the root of the QEP.
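
A minimal sketch of the iterator model described above, assuming three illustrative operators (Scan, Select, Project) and a driver function run. All four names are hypothetical and are not taken from PicoDBMS. Each operator asks its child for one tuple at a time, so no intermediate result is materialized between operators.

```python
from typing import Any, Callable, Dict, List, Optional

Row = Dict[str, Any]


class Scan:
    """Leaf operator: reads a base relation tuple by tuple."""

    def __init__(self, rows: List[Row]) -> None:
        self.rows = rows
        self.pos = 0

    def open(self) -> None:
        self.pos = 0

    def next(self) -> Optional[Row]:
        if self.pos >= len(self.rows):
            return None  # end of stream
        row = self.rows[self.pos]
        self.pos += 1
        return row

    def close(self) -> None:
        pass


class Select:
    """Passes on only the tuples that satisfy the predicate."""

    def __init__(self, child: Any, predicate: Callable[[Row], bool]) -> None:
        self.child = child
        self.predicate = predicate

    def open(self) -> None:
        self.child.open()

    def next(self) -> Optional[Row]:
        row = self.child.next()
        while row is not None and not self.predicate(row):
            row = self.child.next()
        return row

    def close(self) -> None:
        self.child.close()


class Project:
    """Keeps only the listed attributes of each tuple."""

    def __init__(self, child: Any, attributes: List[str]) -> None:
        self.child = child
        self.attributes = attributes

    def open(self) -> None:
        self.child.open()

    def next(self) -> Optional[Row]:
        row = self.child.next()
        if row is None:
            return None
        return {name: row[name] for name in self.attributes}

    def close(self) -> None:
        self.child.close()


def run(root: Any) -> List[Row]:
    """Drive the plan from the root; the list is only for display here."""
    root.open()
    out = []
    row = root.next()
    while row is not None:
        out.append(row)
        row = root.next()
    root.close()
    return out


visits = [
    {"doc": "Smith", "year": 1999},
    {"doc": "Jones", "year": 1998},
    {"doc": "Lee", "year": 1999},
]
plan = Project(Select(Scan(visits), lambda r: r["year"] == 1999), ["doc"])
print(run(plan))
```

The only state each operator keeps is a cursor position or a reference to its child. That is what makes this style compatible with a very small RAM.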

38
Complex Query Execution
  • Complex query operators
  • - Aggregate (e.g., count, min, max)
  • - Sort
  • - Duplicate removal
  • Classical solution
  • - Relies on materializing intermediate results
  • - Thus, not suitable for the smartcard
39
Complex Query Execution without RAM
  • PicoDBMS solution
  • - If the incoming tuples are already grouped by
    distinct values, aggregation and duplicate
    removal can be done in the pipeline.
  • - Pipelined operators preserve order because
    tuples are consumed in their arrival order.

Therefore, maintaining the right order at the
leaf of the execution tree allows pipelined
aggregation and duplicate removal!
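
A minimal sketch of why grouped input is enough: when tuples with the same key arrive consecutively, a count per group needs only the current key and a running counter. The function names pipelined_count and pipelined_distinct are hypothetical.

```python
from typing import Any, Callable, Iterable, Iterator, Tuple


def pipelined_count(
    tuples: Iterable[Any], key: Callable[[Any], Any]
) -> Iterator[Tuple[Any, int]]:
    """Count tuples per group, assuming the input is already grouped by key."""
    started = False
    current = None
    count = 0
    for t in tuples:
        k = key(t)
        if started and k != current:
            yield current, count  # the previous group is complete
            count = 0
        current = k
        started = True
        count += 1
    if started:
        yield current, count  # emit the last group


def pipelined_distinct(keys: Iterable[Any]) -> Iterator[Any]:
    """Remove duplicates from a grouped stream by remembering one key."""
    for k, _ in pipelined_count(keys, key=lambda x: x):
        yield k


prescriptions = [("Smith", "drug1"), ("Smith", "drug2"), ("Jones", "drug1")]
print(list(pipelined_count(prescriptions, key=lambda t: t[0])))
```

If the input were not grouped, the same key could be emitted more than once. This is why the order has to be established at the leaf of the execution tree.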
40
Example of Complex Query Execution
Query Q2: Number of antibiotics prescribed per
doctor in 1999
41
Another Example
The output data should be grouped by type of
drug rather than by Drug.id; therefore, an
additional join is required between Drug and
Drug.type.
Query Q3: Number of prescriptions per type of
drug
42
Performance Evaluation
  • Evaluate whether PicoDBMS performance matches the
    requirements of smartcard applications
  • Evaluate which techniques are really necessary
  • - ring indices
  • - query optimization

43
Performance: Simple Query
For simple queries,
ring indices are necessary in medium and
large databases!
Query optimization is useful for large
databases!
Query Q1: Who prescribed antibiotics in 1999?
44
Performance: Complex Query
For complex queries,
ring indices are necessary in medium and
large databases!
Query optimization is useful for large
databases!
Query Q4: Number of prescriptions per doctor
and type of drug
45
Conclusion
  • Overall, a great contribution to industry
  • - Ranked the best paper of VLDB 2000
  • - Elegant ring-based storage model
  • - Powerful query processing engine
  • - Meets the requirements of smartcard
    applications
  • Large potential for future deployment

46
  • End