PicoDBMS: Scaling down Database Techniques for the Smartcard - PowerPoint PPT Presentation

1 / 46

About This Presentation

Title:

PicoDBMS: Scaling down Database Techniques for the Smartcard

Description:

Domain Storage Illustration. 1. 2. 3. 4. Relation R. Domain value ... Operators are executed in a pipelined fashion. All left-operands must be stored in memory ... – PowerPoint PPT presentation

Number of Views:123

Avg rating:3.0/5.0

Slides: 47

Provided by: course2

Category:

more less

Transcript and Presenter's Notes

Title: PicoDBMS: Scaling down Database Techniques for the Smartcard

1
PicoDBMS Scaling down DatabaseTechniques for
the Smartcard

Authors Christophe Bobineau,
Luc Bouganim,
Philippe Pucheral,
Patrick Valduriez
Presented by ON Saitung

2
Outline

Smartcard Overview
Motivation of DBMS on Smartcard
Challenges of Scaling down DBMS for Smartcard
PicoDBMS an elegant solution
- Storage Model
- Query Processing Model
Conclusion

3
Smartcard Products
4
Smartcard Architecture
5
Smartcard Overview

Advantages
- Portability (small size, light weight)
- Security
Physical view tamper proof
Data view strong encrypted
- Availability (anytime, anywhere, on any
terminal)
- Widely used
banking, healthcare, insurance, etc.
- Low cost (to be sold in large volumes)

6
Smartcard Overview

Disadvantages
- Limited storage capacity (up to 128Kb EEPROM)
- Very slow write speed in EEPROM (10ms/word)
- Extremely small size of the RAM (i.e., 4KB)
- Lack of autonomy
i.e., no independent power supply, thus no
asynchronous and disconnected processing

7
Motivation of DBMS on Smartcard

The Trend towards multi-application, versatility

8
Motivation of DBMS on Smartcard

Smartcard will become more powerful
Moores law for processor and memory capacities
To handle complex queries on a large volume of
data
To Keep data consistency
i.e. to meet the transaction ACID properties
The nature merit of DBMS
Security concern
unsuitable to implement DBMS outside smartcard

9
Why should the traditional DBMS be scaled down?

Recall the limitations on Smart Card
- tiny RAM, little stable storage
- costly write and lack of autonomy
- high demand on security issues
Traditional DBMS
- Consume significant amounts of RAM
- Use caching and asynchronous I/Os to reduce
disk accesses
- Complex algorithms

Conflict
Therefore, how to scale down database techniques
becomes the major problem!
10
The existing solutions to scale down Database
techniques
Used on PDA, have a small footprint but still
use much RAM and stable memory

- Sybase Adaptive Server Anywhere
- Oracle 8i Lite
- DB2 Everywhere
- ISOLs SQLJava Machine DBMS
- SCQL - standard for smart card database
language

Used on smart card, their optimization is limited
on simple select query
Therefore, further scaling down database
techniques is required!
11
Database Management Requirements
Database management requirements of various
classes of smartcard applications
12
Detailed study on Health Card Application

The representative of personal folder
applications
The amount of data is significant (duplicates
exist)
Queries can be rather complex
Mature access rights management are required
Atomicity must be preserved.
Durability is a must.

13
Challenges of Scaling down DBMS for Smartcard

How to minimize the data structures to fit on the
smart card?
How to minimize the RAM usage so that no overflow
happened?
How to minimize write operations to enhance
performance?
How to take advantage of the fast read operations
of EEPROM?
How to enforce ACID for transactions?
How to minimize the algorithms complexity to
keep it secure?

14
PicoDBMS Architecture

Storage manager
manages the storage of data indices
Query manager
processes query plans composed of select,
project, join and aggregates.
Transaction manager
enforces the ACID properties.
Access right manager
provides access rights on base data and on
user-defined views.

15
PicoDBMS Design Rules

Compactness rule
Minimize size of data indices
RAM rule
Minimize the RAM usage while executing
queries
Write rule
Minimize writes in stable storage
Read rule
Take advantage of fast read of EEPROM
Access rule
Take advantage of direct access capability
of EEPROM
Security rule
Never externalize protected data Simplify
algorithms

These rules are addressing the aforementioned
challenges!
16
PicoDBMS Storage Model

Existing storage models
Storage cost evaluation

- A combination of the existing storage models
- Chooses the best storage model for each
attribute

17
Flat Storage model

Tuples are stored sequentially with attribute
values embedded.
Two drawbacks
1. Space consuming (cannot avoid value
duplicates)
2. Inefficient (have to compute sequentially)
Indexed FS
solve the second drawback, but makes the first
drawback worse.

Relation R
18
Domain Storage model

Factoring out values into a domain table for
data compactness.
Three advantages
1. Possible space saving
2. Amount of Writes can be possibly reduced
3. All tuples of all relations become
fixed-size
One disadvantage
Inefficient (have to compute sequentially)
Only use when
1. data size gt pointer size
2. duplicates exist

19
Domain Storage Illustration
Relation S
Relation R
Domain value
Relation S
Relation R
20
Join on Domain Storage
Relation R
Relation S
R.a
S.b
It naturally forms an unidirectional join
index from S.b to R.a
m
n
S.b foreign key referencing R.a
R.a primary key
e.g. select from S, R where S.b R.a
Cost O(m) v.s. O(mn)
21
Ring Storage Model

Address both index compactness and data
compactness.
Replacing Tuple to Value pointers with Value to
Tuple pointers
One pointer per domain value whatever the
cardinality is.

Relation S
attr
Relation S
Index on S.attr
attr
Domain
22
Select on Ring Storage
Relation S
Relation S
attr
Index on S.attr
attr
Domain
e.g. Select from S where S.attr 1
23
Join on Ring Storage
R.a
Relation R
Here, it naturally forms a bi-directional join
index between S.b and R.a
Relation S
S.b
m
n
S.b foreign key referencing R.a
R.a primary key
e.g. select from S, R where S.b R.a
Cost O(m2/(2n)) v.s. O(mn)
24
Storage cost evaluation

- The proposed storage model is a combination
of FS,DS,RS and Indexed FS.
- Cost evaluation helps to decide which model
to be adopted for storing each attributes.
- For attributes need indexing, the choice is
between RS and
Indexed FS
- For attributes do not need indexing, the
choice is between FS and DS

25
Parameters of Cost Model

CardRel cardinality of the relation holding the
attribute
CardDom cardinality of the attribute domain
table
a Average length of attribute (bytes)
p Pointer size (3 bytes..)
S Selectivity factor of the attribute lt1
(redundancy of the attribute) SCardDom/CardRel

26
Cost Formula

Cost (FS) CardRela
Cost (DS) CardRelpSCardRela
Cost (Indexed FS) Cost (FS)SCardRelaCardRel
p
Cost (RS) Cost (DS)SCardRelp

- Cost (FS) Cost (DS) when S(a-p)/a - Cost
(Indexed FS) Cost (RS) when Sa/p
27
FS vs. DS

Cost (FS) Cost (DS) when S (a - p) /a

S
P 3
FS better
DS better
a
28
Indexed FS vs. RS

- Cost (Indexed FS) Cost (RS) when S a/p

P 3
S
FS better
RS better
a
29
Traditional Query Processing

Traditional Query Processing consumes large main
memory for storing
- Intermediate Results
- Temporary data structures (e.g. hash
functions)
When no space to hold them, state-of-the-art
algorithms store them on disk to avoid memory
overflow

30
Why not use these traditional algorithms?

For a longer lifetime of stable memory, writes on
stable memory are forbidden.
Unable to estimate the required RAM size
Those state-of-the-art algorithms are complex,
thus may bring security problems for smartcard.

A query processing technique which do not use any
RAM nor incur any writes in the stable memory is
proposed in PicoDBMS!
31
Query Execution

Query processing is done in two steps
- Query execution plan (QEP) generation
- Query engine executes this QEP
Different shapes of QEP
- Left deep tree
- Right deep tree
- Bushy tree
- Extreme right deep tree

32
Query Example (From Healthcare Database)

Doctor (DocId,name,specialty,...)
Prescription (VisId,DrugId,qty,...)
Visit (VisId,DocId,date,diagnostics,...)
Drug (DrugId,name,type)
Query Q1 Who prescribes antibiotics in 1999

33
Left deep tree
- Operators are executed sequentially - Each
intermediate result is stored in stable memory.
Query Q1 Who prescribes antibiotics in 1999
34
Right deep tree

Operators are executed in a pipelined fashion
All left-operands must be stored in memory

Query Q1 Who prescribes antibiotics in 1999
35
Bushy tree

Both the intermediate results and left operands
need to be materialized.

Query Q1 Who prescribes antibiotics in 1999
36
Extreme right tree
- Uses pipelining on every operator (including
select) - As left operands are always base
relations, they are already materialized in
stable memory. - Thus, no ram consumption and
no writes incur
Query Q1 Who prescribes antibiotics in 1999
37
Iterator Model

Pipeline execution can be easily achieved using
Iterator Model.
Each operator is an iterator that supports 3
procedure calls
1. Open - prepare an operator for producing an
item
2. next - to produce an item
3. Close - perform final clean up
Starts from the root.

38
Complex Query Execution

Complex query operators

- Aggregate (e.g., count, min, max) - Sort -
Duplicate removal

Classical solution

- Rely on materializing intermediate results -
Thus, not suitable for smartcard
39
Complex Query Execution without RAM

PicoDBMS solution

- If the incoming tuples are already grouped by
distinct values, aggregate and duplicate removal
can be done in pipeline. - Pipeline operators
are in order because tuples are consumed in their
arrival order.
Therefore, maintaining a right order at the
leaf of the execution tree allows pipelined
aggregation and duplicate removal!
40
Example of Complex Query Execution
Query Q2 Number of antibiotics prescribed per
doctor in 1999
41
Another Example
The output data should be grouped by type of
drugs rather than Drug.id, therefore, an
additional join is required between Drug and
Drug.type.
Query Q3 Number of prescription per type of
drug
42
Performance Evaluation

Evaluate whether PicoDBMS performance matches the
smartcard applications requirements
Evaluate which techniques are really necessary
- ring indices
- query optimization

43
Performance Simple Query
For simple query,
Ring indices are necessary in medium and
large database!
Query optimization is useful for large
database!
Query Q1 Who prescribes antibiotics in 1999
44
Performance Complex Query
For complex query,
Ring indices are necessary in medium and
large database!
Query optimization is useful for large
database!
Query Q4 Number of prescriptions per doctor
and type of drugs
45
Conclusion