Efficient Secure Query Evaluation over Encrypted XML Databases

1
Efficient Secure Query Evaluation over Encrypted
XML Databases
  • Wendy Hui Wang
  • Laks V.S. Lakshmanan
  • University of British Columbia, Canada

2
Outline
  • Introduction
  • Design of metadata
  • Secure and efficient query processing

3
Database-as-Service (DAS) Model
  • Data owner
  • Small business with a limited budget (e.g., an
    online art gallery owner)
  • Owns a large XML database (e.g., a database
    containing information on paintings and
    customers)
  • Cannot afford a suitable database server
  • More cost-effective: host the database on a
    third-party remote server
  • E.g., Caspio web database service provider

4
Security Concerns in DAS Model
  • Data owner
  • Does NOT trust the server
  • Protects the sensitive information in the
    database
  • Individual XML elements with their content
    (structure of the subelements, data values,
    etc.)
  • E.g., the customer's financial information
  • Associations between data values
  • E.g., the customer's name and the paintings
    he/she purchased

[Figure: XML fragment with a FinancialAccount element
and visa, mastercard, visa beneath it]

5
Database-as-Service Model (Cont.)
  • Data Owner
  • Stores the encrypted database on the server
  • Keeps decryption keys to himself
  • Server
  • Provides data storage and a query engine as
    services
  • Does not have the decryption keys

6
The Queries by Data Owner
  • Sent remotely from the data owner's handheld
    devices
  • The answers are a very small portion of the
    database
  • E.g., the names of the paintings that Andy bid
    for
  • The answers are post-processed on the handheld
    devices
  • The devices have a decryptor and a query engine
    installed
  • Limited bandwidth
  • Limited memory and processing power

7
Naïve Method of Query Processing
  • Returns the whole encrypted database back to the
    client
  • Disadvantages
  • High cost of data transfer, decryption, and
    query post-processing
  • May exceed the computational capabilities of
    handheld devices

[Figure: the untrusted server holds the encrypted XML
database and sends it to the client, where the XML
decryptor and the query executor produce the answer
of the query]
8
Another Option for Query Processing
  • Encrypt the tags and data values in the database
    individually
  • Tags and values in the query are encrypted the
    same way as in the database
  • E.g., purchase[cname="Andy"]/pname
  • Query processing is more efficient than with the
    naïve method
  • But there is a security breach!
  • E.g., the attacker knows Andy is the biggest
    customer of the art gallery. Then the encrypted
    customer value with the largest # of
    occurrences must correspond to Andy.

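[Figure: two purchase elements, each with cname and
pname children, shown before and after encryption;
the cname value Andy is encrypted as A, and the
query becomes purchase[cname="A"]/pname]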
9
Our Goals
  • Security
  • Guarantee no leakage of sensitive information to
    the untrusted server/disk
  • Efficient query evaluation
  • The server returns ONLY the portions of the
    database that are relevant to the data owner's
    query

10
Our Approach
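[Figure: system architecture. Client: query
translator, XML decryptor, and query executor.
Untrusted server: encrypted XML database, metadata,
and query executor. The client translates query Q,
//purchase[//cname="Andy"]/pname, into Qs and sends
it to the server; the server uses the metadata to
return the encryption blocks relevant to Q; the
client decrypts them and evaluates Q to obtain the
answer, Lily. The example database is a purchases
element with purchase children (cname, pname) and
the values Andy, Lily, Betty, Reflection.]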
11
Our Contributions
  • Security constraints (see paper)
  • Formal definition of attack model and security
  • Construction of the secure encryption scheme (see
    paper)
  • Finding an optimal secure encryption scheme is
    NP-hard
  • Design of the metadata on the server
  • Efficient and secure query processing

12
Outline
  • Introduction
  • Design of metadata
  • Structural index
  • Value index
  • Secure and efficient query processing

13
Structural Index
  • Purpose: efficient processing of tags and
    XPath predicates (/, //, siblings, etc.) in
    the query
  • The interval index of the element
  • Each element is assigned an interval [start, end]
  • For parent u and child v, u.start < v.start <
    v.end < u.end
  • The intervals of adjacent nodes do not overlap
  • The structural index
  • Index table entry: <(encrypted) tag, the interval
    index>
  • Encryption block table entry: <encryption block
    ID, the interval index>
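
The containment rule above can be sketched in a few lines of Python. This is only an illustration: the `Interval` type, the `descendants` helper, and the tag names "a" and "d" in the toy index table are assumptions made for the example, not part of the presented system.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    start: float
    end: float


def is_ancestor(u: Interval, v: Interval) -> bool:
    """True if u strictly contains v: u.start < v.start < v.end < u.end."""
    return u.start < v.start < v.end < u.end


# Hypothetical index table: (encrypted) tag -> intervals of its elements.
index_table = {
    "a": [Interval(0.0, 1.0)],
    "d": [Interval(0.8, 0.82), Interval(0.83, 0.84), Interval(0.85, 0.9)],
}


def descendants(anc_tag, desc_tag, table):
    """Evaluate anc_tag//desc_tag using only the intervals, no decryption."""
    return [v for u in table[anc_tag] for v in table[desc_tag]
            if is_ancestor(u, v)]


print(descendants("a", "d", index_table))
```

Because the test needs only interval comparisons, a server could in principle answer structural predicates without seeing plaintext tags or content.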

14
Attacks on Structural Index
  • By accessing structural index T and encrypted
    element E, the attacker constructs the candidates
    of the original element that
  • have the same structural index T
  • the size of the encrypted candidate is the same
    as that of E

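[Figure: index table for an encrypted element. The
root α has interval [0, 1]; the entries for β, γ,
and three δ nodes have the intervals [0.2, 0.25],
[0.55, 0.75], [0.8, 0.82], [0.83, 0.84], and
[0.85, 0.9].]
The # of such candidates is 1, i.e., the attacker
can reconstruct the structure of the original
element!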
15
More Secure Structural Index
  • Grouping on the intervals in the index table
  • The intervals of the adjacent nodes with the same
    tag and encrypted in the same block are grouped
    together

[Figure: the index table before grouping and the
index table after grouping]
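
A minimal sketch of the grouping step, assuming that "adjacent" means consecutive entries in start order and that each entry carries its tag and block ID. The tuple layout, the function name, and the assignment of tags B and C to the first two intervals are hypothetical; the interval values come from the structural index example in the slides.

```python
def group_intervals(entries):
    """Merge consecutive entries with the same tag and the same block.

    entries: list of (tag, block_id, start, end), sorted by start.
    Returns a list in the same format with merged covering intervals.
    """
    grouped = []
    for tag, block, start, end in entries:
        if grouped and grouped[-1][0] == tag and grouped[-1][1] == block:
            prev_tag, prev_block, prev_start, _ = grouped[-1]
            # Extend the previous interval to cover this node as well.
            grouped[-1] = (prev_tag, prev_block, prev_start, end)
        else:
            grouped.append((tag, block, start, end))
    return grouped


entries = [
    ("B", 1, 0.2, 0.25),
    ("C", 1, 0.55, 0.75),
    ("D", 1, 0.8, 0.82),
    ("D", 1, 0.83, 0.84),
    ("D", 1, 0.85, 0.9),
]
grouped = group_intervals(entries)
print(grouped)  # the three D entries collapse into one interval (0.8, 0.9)
```

After grouping, the table no longer reveals how many D nodes lie inside [0.8, 0.9], which is what makes several candidate structures possible.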

16
Security Example of Structural Index
[Figure: the original element A [0, 1] has 5 leaf
nodes (B, C, D, D, D) covered by 3 intervals:
[0.2, 0.25], [0.55, 0.75], [0.8, 0.9]. Candidate 1
and Candidate 2 are elements A [0, 1] with different
leaf nodes (drawn from B, C, D) over the same 3
intervals. The figure also gives the # of
candidates.]
17
Technical Result of Security of Structural
Metadata
  • We prove there exists a large number of candidate
    databases (including the true hosted database)
    such that
  • By applying any query that is captured by any
    security constraint, only the true database
    returns the non-empty answer
  • By looking at the structural index, the
    candidates are pairwise indistinguishable

18
Related Work of Structural Index
  • Efficient Tree Search in Encrypted XML Database
    [Brinkman et al., 2004]
  • Stores a relational table containing structural
    information of the database on the server
  • Compromises the security of structural information
  • XML interval index schemes [Al-Khalifa et al.,
    2002; Chien et al., 2002; etc.]
  • Focus only on efficiency; do not consider security

19
Outline
  • Design of metadata
  • Structural index
  • Value index
  • Secure and efficient query processing

20
Value Index
  • Purpose: efficient processing of value-based
    constraints in the queries
  • Every encrypted data value in the database is
    indexed in the format <(encrypted) value, block IDs>
  • By accessing the value index, the attacker can
    count the # of occurrences of encrypted values

[Table: value index with the columns "Encrypted value"
and "Block IDs". Encrypted values shown: Enc(50),
Enc(90), Enc(100), Enc(30), Enc(20), Enc(70). Block
ID lists shown: {1, 2, 5, 6}, {3, 6}, {3, 4}, {1, 2}.]

21
Attacks on Value Index
  • Attacker's aim: infer the mapping between
    plaintext values and the corresponding index
    entries, and thereby crack the associations
    between data values
  • E.g., he wants to find out which paintings
    Andy has bought. The names of paintings are not
    encrypted, but the names of customers are.
  • His prior knowledge: the # of occurrences of
    some data values in the original database
  • E.g., from the newspaper, he knows Andy has
    bought 10 paintings from the art gallery for
    charity.

22
Attacks on Value Index (Cont.)
  • His approach: map the encrypted values to
    plaintext values based on their # of occurrences
  • E.g., A is the only value in the index whose
    # of occurrences is 10. Then A must map to Andy.
    Consequently, the attacker finds out which
    paintings Andy has purchased
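
The occurrence-matching attack can be illustrated with a short Python sketch. The dictionary layout, the function name, and the counts for the ciphertexts "B" and "C" are assumptions made for the example; only the pairing of ciphertext A, 10 occurrences, and Andy comes from the slides.

```python
from collections import Counter


def frequency_attack(index_counts, prior_counts):
    """Match plaintext values to ciphertexts by # of occurrences.

    index_counts: ciphertext -> # of occurrences seen in the value index
    prior_counts: plaintext  -> # of occurrences known to the attacker
    Returns plaintext -> ciphertext wherever the count is unique.
    """
    how_many = Counter(index_counts.values())
    mapping = {}
    for plain, n in prior_counts.items():
        if how_many[n] == 1:  # exactly one ciphertext has this count
            mapping[plain] = next(c for c, k in index_counts.items() if k == n)
    return mapping


index_counts = {"A": 10, "B": 3, "C": 3}   # toy value index statistics
recovered = frequency_attack(index_counts, {"Andy": 10})
print(recovered)  # {'Andy': 'A'}
```

The attack needs no keys at all: a unique occurrence count is enough to link a ciphertext to a known plaintext.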

23
Our Solution
  • Order preserving encryption with splitting and
    scaling (OPESS)
  • Order preserving: efficient query processing
  • Splitting and scaling
  • Purpose: make the frequency distribution of the
    encrypted data values in the value index differ
    from that of the original values

24
Splitting
  • Every plaintext value p is encrypted into
    multiple distinct ciphertext values v1, v2, ..., vn
    by using distinct keys, with Σ|vi| = |p|, where
    |x| is the # of occurrences of x
  • ∀ encrypted value vi, |vi| ∈ {m-1, m, m+1}
  • Order is preserved: encrypted values corresponding
    to different plaintext values never straddle each
    other
  • The mapping between encrypted values and plaintext
    values is unique, i.e., splitting alone is not
    secure!
  • E.g., for data values on attribute CustomerName

Plaintext value    # of          Encrypted value    # of
of CustomerName    occurrences   of CustomerName    occurrences
Andy               12            KA                 3
                                 KH                 4
                                 KT                 5
Betty              5             SF                 5
Carl               9             WA                 4
                                 WE                 5
(3 + 4 + 5 = 12; 5 = 5; 4 + 5 = 9)
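
The counting side of splitting can be sketched as follows. This is a hypothetical helper, not the authors' algorithm: it only partitions a plaintext value's # of occurrences into chunks from {m-1, m, m+1}, and it does not model the distinct keys or the order-preserving ciphertexts. With m = 4 it yields the same chunk sizes as the CustomerName example.

```python
def split_count(count, m):
    """Partition `count` into chunk sizes that each lie in {m-1, m, m+1}.

    Uses the smallest feasible number of chunks. Raises ValueError when
    no such partition exists (e.g., count < m - 1).
    """
    if m < 2:
        raise ValueError("m must be at least 2")
    for k in range(1, count // (m - 1) + 1):
        if k * (m - 1) <= count <= k * (m + 1):
            chunks = [m - 1] * k
            extra = count - k * (m - 1)  # at most 2 per chunk by the test above
            for i in range(k):
                add = min(2, extra)
                chunks[i] += add
                extra -= add
            return chunks
    raise ValueError(f"cannot split {count} with m={m}")


for name, occurrences in [("Andy", 12), ("Betty", 5), ("Carl", 9)]:
    print(name, sorted(split_count(occurrences, 4)))
```

Each chunk would then be assigned to one of the distinct ciphertext values of that plaintext value.
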
25
Scaling
  • Every encrypted value is replicated multiple
    times, so its # of occurrences is scaled up.
  • With scaling, the mapping between encrypted values
    and plaintext values is not unique!
  • E.g., for data values on attribute CustomerName

Plaintext value    # of          Encrypted value    # of          Scale
of CustomerName    occurrences   of CustomerName    occurrences   to
Andy               12            KA                 3             6
                                 KH                 4             6
                                 KT                 5             6
Betty              5             SF                 5             6
Carl               9             WA                 4             6
                                 WE                 5             6
To map 6 distinct ciphertext values to 3 distinct
plaintext values, the # of mappings is …
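
A rough sketch of the effect of scaling, following the CustomerName example in which every ciphertext ends up with 6 occurrences. The helper names are hypothetical, and scaling everything to one common target is an assumption taken from that example; the actual scheme may scale different values by different amounts.

```python
from collections import Counter

# # of occurrences per ciphertext after splitting (from the example).
split_counts = {"KA": 3, "KH": 4, "KT": 5, "SF": 5, "WA": 4, "WE": 5}


def replicas_needed(counts, target):
    """Extra copies of each ciphertext needed to reach `target` occurrences."""
    if target < max(counts.values()):
        raise ValueError("target must be at least the largest count")
    return {c: target - n for c, n in counts.items()}


def uniquely_identifiable(counts):
    """Ciphertexts whose # of occurrences is shared by no other ciphertext."""
    freq = Counter(counts.values())
    return sorted(c for c, n in counts.items() if freq[n] == 1)


extra = replicas_needed(split_counts, 6)
scaled = {c: n + extra[c] for c, n in split_counts.items()}

print(uniquely_identifiable(split_counts))  # ['KA']
print(uniquely_identifiable(scaled))        # []
```

Before scaling, KA still stands out by its count; afterwards no ciphertext can be singled out by occurrence count alone.
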
26
Technical Results of Security of Value Index
  • We prove there exists a large number of candidate
    databases (including the true hosted database)
    such that
  • By applying any query that is captured by any
    security constraint, only the true database
    returns the non-empty answer
  • By looking at the value index, the candidates are
    pairwise indistinguishable

27
Related Work of Value Index
  • Efficient processing of queries on encrypted
    relational databases [Hacigumus et al., 2002]
  • Index on the bucket ID, which represents the
    partition to which the unencrypted value belongs
  • Does NOT consider an occurrence-based distribution
    model
  • Order-preserving encryption for numeric data
    [Agrawal et al., 2004]
  • Considers a DIFFERENT histogram-based distribution
    model
  • Balancing security and efficiency in untrusted
    relational DBMSs [Damiani et al., 2003]
  • Proposes an indexing scheme based on direct
    encryption and hashing, and measures the
    information exposure
  • For the same occurrence-based distribution model
    as ours, their probability of information
    exposure can be HIGH
  • The encryption is NOT order-preserving

28
Outline
  • Introduction
  • Design of metadata
  • Secure and efficient query processing

29
Example of Query Processing

purchases
Encrypted XML Database
purchase
purchase
cname
pname
cname
mname
purchase
Reflection
Andy
Lily
Betty

cname
mname
Block 1
Block 2
Lily
Andy
Block 1
Block 1
Block 1, 2
Block 1
Join
Structural index
Value index
ßKA AND ß?KT
//a ß/?
Translated Query Qs
Untrusted Server
ß
// a
/?
/?
ßKA AND ß?KT
Query Translator
XML Decryptor
Client
Query Executor
Query Q
//purchasecnameAndy/pname
Lily
30
Technical Results of Security of Query Answering
  • Let A be any query that is captured by the
    security constraints, and Bel(B(A)) be the
    attacker's belief probability that the
    hosted database satisfies A
  • We prove that answering queries does not
    increase Bel(B(A))

31
Experiments
  • Impact of Optimization

Compared with the naïve method, our approach achieves
> 80% savings!
32
Conclusion
  • We consider the problem of efficient and secure
    evaluation of XPath queries on encrypted XML
    databases
  • We formally define the attack model and security
    (see paper)
  • We propose
  • The security constraints (see paper)
  • The secure encryption scheme (see paper)
  • The design of secure structural and value index
  • The secure and efficient query evaluation

33
Future Work
  • More prior knowledge
  • Tag distribution
  • Query workload distribution
  • Correlations between data values
  • Updates on database
  • Definition of security
  • Secure encryption scheme
  • How to design metadata

34
  • Thank you!
  • Questions?

35
Extra Slides
  • Extra slides

36
Similar Application Scenario: Untrusted Disk
  • The attacker may install a Trojan on the
    disk where the databases are stored (maybe
    locally) and spy on the operations on the
    databases
  • The disk is no longer trusted, which is similar
    to the untrusted server

37
More Discussion on Security of Structural Index
  • The attacker can still infer the structural
    relations (e.g., parent/child, siblings, etc.)
    between the nodes in the encrypted elements
  • However, he cannot reconstruct the exact content
    of the original element

38
Other Contributions: Security Constraints
  • Node type constraint
  • For a sensitive XML element with its content
  • E.g., //customer//prescription
  • Association type constraint
  • For sensitive associations between data values
  • E.g., //customer (/name, //purchase//mname)

39
Security Definitions
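[Figure: the client (query translator, XML decryptor,
query executor) sends the translated query Qs for
query Q to the untrusted server (query executor,
metadata, encrypted XML database), which returns a
set of encryption blocks; the client derives the
answer of query Q from them.]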
40
XML Encryption
  • W3C standard
  • Different encryption granularity

[Figure: example XML tree rooted at purchases, with
an Info node and two purchase elements, each with
cname and pname children; values shown: Andy, Lily,
Betty, Last supper.]
41
XML Encryption (Cont.)
  • Tradeoff exists between encryption granularity
    and efficient query processing
  • The next question is
  • What is the optimal encryption scheme s.t.
  • (1) it is secure, and
  • (2) it facilitates query processing?

42
Secure Encryption Scheme
  • Encryption Scheme S
  • Every security constraint is enforced
  • ∀ node type constraint c, every node that c binds
    to is encrypted
  • E.g., for the security constraint
    //customer//prescription, encrypt every
    prescription element
  • ∀ association type constraint p(q1, q2), the nodes
    that bind to either p/q1 or p/q2 are encrypted
  • E.g., for the security constraint //customer
    (/name, //purchase//mname), either
    //customer/name or //customer//purchase//mname is
    encrypted
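
The two enforcement rules can be checked mechanically. The sketch below is a simplification under stated assumptions: constraints and encrypted targets are compared as plain path strings rather than bound to actual XML nodes, and the function name and argument layout are hypothetical.

```python
def enforces(encrypted, node_constraints, assoc_constraints):
    """Check whether a set of encrypted paths enforces all constraints.

    encrypted:         set of path strings chosen for encryption
    node_constraints:  list of paths that must be encrypted
    assoc_constraints: list of (p, q1, q2); p+q1 or p+q2 must be encrypted
    """
    nodes_ok = all(c in encrypted for c in node_constraints)
    assoc_ok = all(p + q1 in encrypted or p + q2 in encrypted
                   for p, q1, q2 in assoc_constraints)
    return nodes_ok and assoc_ok


node_constraints = ["//customer//prescription"]
assoc_constraints = [("//customer", "/name", "//purchase//mname")]

scheme = {"//customer//prescription", "//customer/name"}
print(enforces(scheme, node_constraints, assoc_constraints))  # True
```

For an association constraint it is enough to encrypt one side, which is why several different schemes can all be secure and the choice among them becomes an optimization problem.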

43
Secure Encryption Scheme (Cont.)
  • More protection
  • Every leaf element containing data values is
    encrypted with an encryption decoy
  • Effect: every encrypted value occurs only once
  • E.g., the original values (AIDS, AIDS, cold)
    are encrypted as (CCED, PACS, DAEE)
  • Goal: defend against frequency-based attacks

44
Secure Encryption Scheme (Cont.)
  • Theorem: the encryption scheme is a secure
    encryption scheme
  • Theorem: finding an optimal secure encryption
    scheme is NP-hard in the size of the security
    constraints

45
Unsafety of Continuous Interval Index
[Figure: original database with the intervals
A [1, 10], B [2, 5], B [6, 9], C [3, 4], D [7, 8].
Second panel: A [1, 10], B [2, 9], B [2, 5],
B [6, 9], C [3, 4], D [7, 8].]
The original structure is revealed by the gap!
46
Safety of Discontinuous Interval Index
[Figure: original database with the intervals
A [0, 1], B [0.1, 0.4], B [0.5, 0.9], C [0.2, 0.25],
D [0.55, 0.75]. A fake candidate: A [0, 1],
B [0.1, 0.9], C [0.2, 0.25], D [0.55, 0.6],
D [0.65, 0.75].]
47
Splitting
  • E.g., for data values on attribute Age

Plaintext value   # of          Ciphertext value   # of
of Age            occurrences   of Age             occurrences ki
18                10            …                  …
20                5             …                  …
30                27            …                  …
ki ∈ {m-1, m, m+1}, m = 6
48
Scaling
[Table: the plaintext values of price 18, 20, and 30,
with 10, 5, and 27 occurrences, are mapped to the
ciphertext values 101, 124, 189, 210, 312, 367, 371,
and 389, each listed with its # of occurrences.]
To map 8 distinct ciphertext values to 3 distinct
plaintext values, the # of mappings is …
49
Value Metadata (Cont.)
50
Query Processing at Client
  • The tags and values are encrypted
  • E.g., original query
    //customer[//zipcode=12500]//name
  • customer → α, zipcode → β, name → γ
  • zipcode = 12500 → 4000 ≤ β and β ≤ 7000
  • Translated query:
    //α[4000 ≤ β and β ≤ 7000]//γ
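
A minimal sketch of the client-side translation, assuming the client keeps a tag map and, for each plaintext value, the range of ciphertext values that OPESS assigned to it. The `TAG_MAP` and `VALUE_RANGES` tables and the `translate` function are hypothetical; the mappings themselves are taken from the zipcode example.

```python
# Hypothetical client-side secrets: tag encryption and value ranges.
TAG_MAP = {"customer": "α", "zipcode": "β", "name": "γ"}
VALUE_RANGES = {("zipcode", 12500): (4000, 7000)}


def translate(anchor, pred_tag, pred_value, result):
    """Translate //anchor[pred_tag = pred_value]//result into Qs.

    Tags are replaced by their encrypted names, and the equality on a
    plaintext value becomes a range over its ciphertext values.
    """
    lo, hi = VALUE_RANGES[(pred_tag, pred_value)]
    a, b, g = TAG_MAP[anchor], TAG_MAP[pred_tag], TAG_MAP[result]
    return f"//{a}[{lo} ≤ {b} and {b} ≤ {hi}]//{g}"


qs = translate("customer", "zipcode", 12500, "name")
print(qs)  # //α[4000 ≤ β and β ≤ 7000]//γ
```

The equality becomes a range because splitting maps one plaintext value to several ciphertext values, and order preservation keeps those values contiguous.
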
51
Query Processing at Server
(1) Query tags //α[β]//γ → structural index → a set
of encryption block IDs Bs
(2) Value-based constraints 4000 ≤ β and β ≤ 7000 →
value index → a set of encryption block IDs Bv
(3) The blocks corresponding to Bs ∩ Bv are
returned to the client. Each returned block
contains answers to the original query
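
The three server-side steps can be sketched with plain sets. The index contents below are invented toy data, and reducing the structural index to a tag-to-blocks map is a strong simplification (the real lookup joins intervals); only the final intersection of Bs and Bv mirrors the slide directly.

```python
# Toy metadata (hypothetical): encrypted tag -> block IDs containing it.
structural_index = {"α": {1, 2, 3}, "β": {1, 2}, "γ": {2, 3}}
# Toy value index: (ciphertext value, block IDs containing it).
value_index = [(3500, {1}), (4200, {2}), (6800, {2, 4}), (7500, {3})]


def blocks_for_tags(tags):
    """Bs: blocks that contain every encrypted tag of the query."""
    return set.intersection(*(structural_index[t] for t in tags))


def blocks_for_range(lo, hi):
    """Bv: blocks holding some ciphertext value v with lo <= v <= hi."""
    hits = set()
    for value, blocks in value_index:
        if lo <= value <= hi:
            hits |= blocks
    return hits


bs = blocks_for_tags(["α", "β", "γ"])
bv = blocks_for_range(4000, 7000)
to_return = bs & bv  # only these encryption blocks go back to the client
print(bs, bv, to_return)
```

The server never decrypts anything here; it works purely on the metadata and ships the selected blocks for the client to decrypt and post-process.
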
52
Experiments (Cont.)
  • Effects of Various Secure Encryption Schemes
  • The optimal encryption scheme always has the best
    query evaluation performance
  • The performance of the approximate scheme is around
    1.1-1.3 times that of the optimal encryption
    scheme