Title: On Using Secure Hardware in Outsourced Databases
1 On Using Secure Hardware in Outsourced Databases
- Einar Mykletun, Gene Tsudik
- Computer Science Department
- School of Information and Computer Science
- University of California, Irvine
- mykletun AT ics.uci.edu, gts AT ics.uci.edu
- http://sconce.ics.uci.edu
Supported in part by NSF ITR grant "Security and Privacy in Database-as-a-Service"
2 Bird's-eye view
[Figure labels: DB, Server Site]
- Outsourced Database (ODB) Model
- DB management outsourced to service provider for backup, administration, restoration, space management, upgrades, etc.
- Clients use the database as-a-service
- take advantage of the server's SW, HW, and human resources instead of one's own
3 Challenges
- Economic/business model?
- How to charge for service, what kind of service guarantees can be offered, costing of guarantees, liability of service provider.
- Interfaces to support complete application development environments
- User interface for SQL, support for embedded SQL programming, support for user-defined interfaces, etc.
- Scalability in the web environment
- Overhead costs due to network latency (data proxies?)
4 Rough Idea
- Client encrypts its data and stores it at the server
- (Could, of course, ask the server to return the entire DB for each query)
- Client
- submits queries to the server, which runs them over the encrypted remote data
- and returns results
- Most of the work is done by the server
5 Bucketization/Partitioning: Main Idea
- Basic operations don't need to be fully implemented over encrypted data
- To test a predicate (e.g., AGE > 40), it suffices for the test to succeed in most cases, i.e., some false positives are okay (but not false negatives!)
- If the test does not result in a clear positive or negative over the encrypted representation, resolve it later at the client side, after decryption.
6 Outsourced Databases (ODB)
- Query execution over encrypted data
- Encrypted database is augmented with additional information (bucket ids)
- Allows some query processing at server
- Meta-data describing bucketization kept at client side
- Split queries into server- and client-side queries
- Server-side queries (QS) executed on encrypted data
- Client-side queries (QC) executed on results returned by server
- Server returns superset of records to client
- Client performs post-processing (QC)
- Filtering
- Specific query execution
7 Bucketization / Partitioning
- Partition domain through bucketization
- Assign bucket ids to each encrypted attribute
- Example
- Attribute: Age
- Domain: 0-100
Owner/client converts attributes using meta-data
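As a rough illustration of the conversion step, the sketch below partitions the Age domain 0-100 into five equal-width buckets and tags each record with an opaque bucket id. The bucket boundaries, the ids, the helper names `bucket_id` and `to_server_row`, and the `encrypt` callback are all assumptions made for illustration; the slides do not specify them.

```python
import bisect

# Hypothetical partition of the Age domain 0-100 (not given in the slides).
LOWER_BOUNDS = [0, 20, 40, 60, 80]        # inclusive lower bound of each bucket
BUCKET_IDS = ["B", "G", "Q", "K", "T"]    # opaque ids, deliberately not in sorted order

def bucket_id(age):
    """Return the id of the bucket that covers `age`."""
    if not 0 <= age <= 100:
        raise ValueError("age outside the domain 0-100")
    return BUCKET_IDS[bisect.bisect_right(LOWER_BOUNDS, age) - 1]

def to_server_row(record, encrypt):
    """Build what the server stores: the encrypted record plus its bucket id."""
    return {"etuple": encrypt(record), "age_id": bucket_id(record["age"])}

# The metadata (LOWER_BOUNDS, BUCKET_IDS) stays with the owner/client.
# The lambda is a placeholder for whatever cipher the owner actually uses.
row = to_server_row({"name": "alice", "age": 47}, encrypt=lambda r: "<ciphertext>")
print(row["age_id"])
```

The server sees only `etuple` and `age_id`; without the client's metadata it cannot tell which Age range an id stands for.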
8 Query Execution Example
[Figure labels: Meta-data (client), Untrusted Server]
1) Query: Select where Age < 52
2) Client creates server-side query QS: Select where Age_ID in {B, G, Q}
3) Client sends QS to server
4) Server computes QS, returns superset
5) Client post-processes (filters), maps bucket ids to original values
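The five steps above can be sketched end to end as follows. This is a minimal illustration, not the authors' implementation: the bucket ranges are assumed (chosen so that Age < 52 maps to B, G, Q as on the slide), and `seal`/`unseal` are base64 placeholders standing in for real encryption.

```python
import base64
import json

# Hypothetical client-side metadata: bucket id -> [low, high) range of Age.
BUCKETS = {"B": (0, 20), "G": (20, 40), "Q": (40, 60), "K": (60, 80), "T": (80, 101)}

def seal(record):
    """Stand-in for encryption (base64 only); NOT real cryptography."""
    return base64.b64encode(json.dumps(record).encode()).decode()

def unseal(blob):
    """Stand-in for decryption."""
    return json.loads(base64.b64decode(blob))

def bucket_of(age):
    """Client side: map a plaintext Age to its bucket id."""
    for bid, (low, high) in BUCKETS.items():
        if low <= age < high:
            return bid
    raise ValueError("age outside the domain 0-100")

def server_side_ids(limit):
    """Step 2: ids of every bucket that may hold a value with Age < limit."""
    return {bid for bid, (low, _high) in BUCKETS.items() if low < limit}

def server_execute(table, ids):
    """Step 4: the server filters on bucket ids only and returns a superset."""
    return [row for row in table if row["age_id"] in ids]

def client_postprocess(rows, limit):
    """Step 5: the client decrypts and discards the false positives."""
    records = [unseal(row["etuple"]) for row in rows]
    return [r for r in records if r["age"] < limit]

ages = [10, 45, 55, 70]
table = [{"etuple": seal({"age": a}), "age_id": bucket_of(a)} for a in ages]
qs_ids = server_side_ids(52)
superset = server_execute(table, qs_ids)
result = client_postprocess(superset, 52)
```

In this run the record with Age 55 falls into bucket Q, so the server returns it as a false positive and the client removes it after decryption. No record with Age < 52 can be missed, because every bucket overlapping the range is requested.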
9 Motivation for hardware-assist
- Bucketization/Partitioning is somewhat effective, but
- Supports mainly range (select-type) queries
- Other query types hard to accommodate, e.g., aggregation, nested queries
- Returns too much information
- Consumes lots of bandwidth
- Requires post-processing by client
- A secure co-processor can help
10 Example (IBM 4758)
- No public/official info on performance of IBM 4758
- 4758 was not designed for multi-round external-to-internal-to-external processing
- 600-800 KB/sec throughput (3DES engine measured at 18 MB/sec)
- Main bottleneck: DMA between encryption engine and SC's internal RAM
- Bringing in data via DES, 3DES or no DES yields roughly the same performance
- PCI bus: 130 MB/sec (allows for multiple SCs)
- Latency: R/T time for an 8-byte echo packet is 5.5 msec
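To put the throughput figures in perspective, the back-of-the-envelope sketch below estimates how long it would take to stream a batch of data through the SC versus across the PCI bus. The 100 MB size and the convention 1 KB = 1024 bytes are assumptions for illustration; only the rates come from the slide.

```python
def seconds_to_stream(size_bytes, kb_per_sec):
    """Time to push `size_bytes` through a link running at `kb_per_sec` KB/sec."""
    return size_bytes / (kb_per_sec * 1024)   # assumption: 1 KB = 1024 bytes

DB_BYTES = 100 * 1024 * 1024   # hypothetical 100 MB of records

slow = seconds_to_stream(DB_BYTES, 600)        # SC at 600 KB/sec
fast = seconds_to_stream(DB_BYTES, 800)        # SC at 800 KB/sec
pci = seconds_to_stream(DB_BYTES, 130 * 1024)  # PCI bus at 130 MB/sec

print(f"SC: {fast:.0f}-{slow:.0f} s, PCI bus: {pci:.2f} s")
```

Under these assumptions the co-processor, not the bus, bounds the rate by roughly two orders of magnitude, which is why the amount of data sent into the SC matters.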
11 Secure Co-Processor (SC)
A general-purpose computer, possibly with cryptographic support, secure against all foreseeable physical and logical attacks
- Limitations (IBM 4758)
- CPU power: 486 processor, 99 MHz
- RAM
- 4 MB Flash, slow write, limited number of writes
- 32 KB battery-backed, read/write comparable with DRAM, key storage, zeroizable
- Notes
- Equipped with public/private key
- Located at host (via PCI bus)
- IBM 4758: leading SC on the market
- FIPS rated
- Properties
- Has CPU, RAM, I/O (bus)
- Tamper-resistant device
- Enclosed in casing, dipped in epoxy
- Zeroizes data upon tamper detection
- Detectors: penetration, temperature change, radiation, etc.
- Will allow attacker to see insides, but not data
- Secure storage
- Cryptographic keys, certificates, keeps persistent state
- Programmable
- Allows user to write applications for execution inside SC
- Crypto support
- HW encryption, math accelerator (PKC)
12 Secure co-processor general usage model
- Acts as an island of trust in an untrusted environment
[Figure labels: Pre-established secret key(s), SC, application, Untrusted Server]
13 SC in ODB
- Off-load computation from client to SC
- Split queries into server-side and SC queries
- Server-side queries (QS) executed on encrypted data
- SC queries (QSC) correspond to post-processing of results returned by server
- Meta-data shared between client and SC (bucketization)
- Session key KC-SC shared between client and SC
- Secret communication channel; can even be done via SSL/TLS
- Benefits
- Reduce communication overhead between server and client
- Not returning supersets of records matching query predicates
- Full range of queries without client involvement
- Aggregation, nesting, etc.
- Client only decrypts results; no other computation
14 SC in ODB Model
[Figure labels: Meta-Data, SC, DB, Server]
1) Client query: Select where Salary < 20K
2) Client splits query (based on meta-data)
- Server query (QS): Select where Salary ID = 2 or 3
- SC query (QSC): Select where Salary < 20K
3) Client sends queries to server
4) Server executes QS, sends superset and QSC to SC
5) SC executes QSC, sends encrypted results to server
6) Server sends encrypted results to client
No communication overhead (server to client). No post-processing of query results at client.
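The flow on this slide can be sketched as below. This is a toy model under stated assumptions: the bucket ranges are invented so that Salary < 20K maps to ids 2 and 3, the key labels `K_DB` and `K_C_SC` are hypothetical, the SC is assumed to hold the database key, QSC is assumed to travel sealed under the client-SC session key, and `seal`/`unseal` are placeholders rather than real cryptography.

```python
import base64
import json

K_DB, K_C_SC = "k-db", "k-c-sc"   # hypothetical key labels

def seal(obj, key):
    """Stand-in for encryption under `key`; NOT real cryptography."""
    return base64.b64encode(json.dumps({"key": key, "obj": obj}).encode()).decode()

def unseal(blob, key):
    """Stand-in for decryption; fails when the wrong key is presented."""
    box = json.loads(base64.b64decode(blob))
    if box["key"] != key:
        raise PermissionError("wrong key")
    return box["obj"]

# Hypothetical metadata shared by client and SC: bucket id -> [low, high).
SALARY_BUCKETS = {1: (30_000, 1_000_000), 2: (0, 12_000), 3: (12_000, 30_000)}

def client_split(limit):
    """Client: derive QS (bucket ids) and QSC (exact predicate, sealed for the SC)."""
    qs = {bid for bid, (low, _high) in SALARY_BUCKETS.items() if low < limit}
    return qs, seal({"salary_lt": limit}, K_C_SC)

def server_execute(table, qs):
    """Server: select on bucket ids only and forward the superset to the SC."""
    return [row["etuple"] for row in table if row["salary_id"] in qs]

def sc_execute(superset, qsc):
    """SC: decrypt, apply the exact predicate, re-encrypt for the client."""
    limit = unseal(qsc, K_C_SC)["salary_lt"]
    records = [unseal(blob, K_DB) for blob in superset]
    return seal([r for r in records if r["salary"] < limit], K_C_SC)

rows = [(8_000, 2), (15_000, 3), (25_000, 3), (50_000, 1)]
table = [{"etuple": seal({"salary": s}, K_DB), "salary_id": b} for s, b in rows]
qs, qsc = client_split(20_000)
superset = server_execute(table, qs)
answer = unseal(sc_execute(superset, qsc), K_C_SC)   # client only decrypts
```

Compared with the client-only scheme, the false positive (salary 25,000 in bucket 3) is filtered inside the SC, so only exact matches cross the network to the client.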
15 Issues
- I/O correlation
- How to hide records selected by SC
- Is server still needed in query processing?
- Remove bucketization? After all, it's inherently insecure
- Other
- Parallelism offered by multiple SC devices
- Client participation in query processing
- Query optimization: multi-round query processing
- Key management
- Implementation and experiments
16 Access Privacy
- If we had a Super SC (unlimited RAM / CPU)
- Store database and process queries inside SC
- No information is ever leaked (except size of result)
- Reality: very limited on-board RAM
- Cannot store intermediate results (e.g., only 4 MB storage)
- Crypto Paging
- Page swapping
- SC's virtual memory consists of server's memory/disk
- Data in virtual memory encrypted under KSC
- SC's crypto paging strategy should minimize information adversary can learn
- Frequency, location, context, contents of page swaps can be observed
- Problem
- Server sends pages containing supersets of records to SC
- SC outputs encrypted records (to be returned to client)
- Server learns relationship between bucket ids and returned results
- For every input to the SC, the server can observe the output
17 Possible Solutions to Access Privacy
- Similar to Private Information Retrieval (PIR)
- How to query a server and hide results
- PIR solutions are often too theoretical, i.e., very inefficient
- Secure computation
- Multi-round protocols
- Probabilistic approach
- Remove 1-to-1 correspondence (input and output)
- SC randomly chooses when to output
- SC required to store multiple pages
- Take advantage of multiple SCs
- Work in parallel; can handle supersets better
- Introduces new difficulties (e.g., need to coordinate)
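One way to picture the probabilistic approach is a buffer inside the SC that holds matching records and releases them at random moments and in random order. The sketch below is only an illustration of that decoupling idea; the class name `BufferedEmitter`, the `capacity` and `release_prob` parameters, and the release policy are assumptions, and the scheme as written carries no formal privacy guarantee (the server can still observe output counts and timing).

```python
import random

class BufferedEmitter:
    """Toy model of an SC that decouples its outputs from its inputs."""

    def __init__(self, capacity, release_prob=0.5, rng=None):
        self.capacity = capacity          # records the SC can hold internally
        self.release_prob = release_prob  # chance of releasing one more record
        self.rng = rng or random.Random()
        self.pending = []

    def process(self, page, matches):
        """Consume one input page; return zero or more buffered records."""
        self.pending.extend(r for r in page if matches(r))
        out = []
        # Release is forced only while the buffer exceeds capacity;
        # otherwise each further release is a coin flip.
        while self.pending and (len(self.pending) > self.capacity
                                or self.rng.random() < self.release_prob):
            out.append(self.pending.pop(self.rng.randrange(len(self.pending))))
        return out

    def flush(self):
        """End of query: release whatever is left, in random order."""
        self.rng.shuffle(self.pending)
        out, self.pending = self.pending, []
        return out

rng = random.Random(7)   # fixed seed so the demonstration is repeatable
emitter = BufferedEmitter(capacity=4, release_prob=0.3, rng=rng)
pages = [[1, 12, 5], [20, 3], [8, 30, 2]]
emitted = []
for page in pages:
    emitted.extend(emitter.process(page, matches=lambda r: r < 10))
emitted.extend(emitter.flush())
```

Every matching record is emitted exactly once, but a record may leave the SC several pages after it entered, which is why the SC needs room for multiple pages.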
18 Is server needed for query processing?
- Server's participation is needed for bucketization
- can't send entire database to client for each query
- However, with SC
- server can send entire database to SC
- no bucketization scheme
- no added computation for client
- But
- Lots of communication/computation overhead at SC
[Figure labels: QSC, SC, Server]
19 Summary
- Secure co-processors can help in the ODB model
- Current state-of-the-art SCs are too limited
- Bucketization still necessary to reduce the amount of data flowing through SC
- Crypto paging not straightforward: must hide I/O correlation (open issue)
- Need to build, and experiment with, this model