Poorvi Vora - PowerPoint PPT Presentation

About This Presentation

Title:

Poorvi Vora

Description:

Dept. of Computer Science. George Washington University ... x = f(S) (including adaptive, related queries) queries are channel codes ... – PowerPoint PPT presentation

Slides: 24

Provided by: poo69

Learn more at: https://www2.seas.gwu.edu

Transcript and Presenter's Notes

Title: Poorvi Vora

1
Information Theory and the Security of Binary
Data Perturbation

Poorvi Vora
Dept. of Computer Science
George Washington University

2
Statistical Database

Database A
$Q = \{q_1, q_2, \ldots, q_i, \ldots\}$ (queryable bits) and
$S = \{s_1, s_2, \ldots, s_i, \ldots\}$ (sensitive bits).
Data collector B can ask for
$f_i(q_1, q_2, q_3, \ldots) = X_i$, with $q_j \in Q$

3
The statistical database security problem

B can query multiple
$f_i(q_1, q_2, q_3, \ldots) = X_i$, with $q_j \in Q$,
and solve them simultaneously
(perfect ZK protocols do not leak additional
information about $x_i$, but the $A_i$ are revealed, so this is
not a traditional cryptographic problem)

4
Random Data Perturbation (RDP)

Used in the public health community for twenty-odd
years; can be used together with cryptographic
techniques
If $x_i$ is perturbed each time, the simultaneous
equations are inconsistent
$f_i(q_1 + \epsilon_{1i}, q_2 + \epsilon_{2i}, q_3 + \epsilon_{3i}, \ldots) = X_i + \epsilon_i$
Security and attack characterization has been an open problem
for 20 years, despite many attempts (Denning,
Adams, Duncan, Landers).

5
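RDP

[Figure: two RDP examples. Salary: values 25,000 and 40,000, perturbation between -25,000 and 25,000, distributions F(x) and G(x). HIV? (Yes): binary channel on 0 and 1 with transition probabilities p = 1 - q and q.]
Statistics over many records are accurate

6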
Known Security Property of RDP

$m$ repeated queries
$\omega_m$: probability of error
$\omega_m \to 0$ as $m \to \infty$
Chernoff bound:
$m = \ln(2/\delta) / (0.38\,\rho^2) \Rightarrow \omega_m < \delta$
Probability of lie: $0.5 - \rho$
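
The repeated-query bound above can be tried out numerically. The following is a minimal sketch, assuming the lost symbols are a bias rho (probability of lie 0.5 - rho) and a target error delta, that the attacker decodes by majority vote, and that the constant 0.38 is used exactly as printed on the slide. The function names and the parameter values are illustrative only.

```python
import math
import random


def repetitions_needed(rho: float, delta: float) -> int:
    """Repeated queries from the slide's bound m = ln(2/delta) / (0.38 * rho^2)."""
    return math.ceil(math.log(2 / delta) / (0.38 * rho ** 2))


def majority_error_rate(rho: float, m: int, trials: int, seed: int = 0) -> float:
    """Empirical error of a majority vote over m answers, each a lie with prob. 0.5 - rho."""
    rng = random.Random(seed)
    lie_probability = 0.5 - rho
    errors = 0
    for _ in range(trials):
        lies = sum(rng.random() < lie_probability for _ in range(m))
        if 2 * lies >= m:  # ties are counted as errors (pessimistic assumption)
            errors += 1
    return errors / trials


# Hypothetical parameters: bias 0.2, target error 0.05.
m = repetitions_needed(rho=0.2, delta=0.05)
error = majority_error_rate(rho=0.2, m=m, trials=2000)
print(m, error)
```

The number of repetitions grows as delta shrinks, which is the cost the later slides compare against coded attacks.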

7
A simple inference attack

Query 1 Female?
Query 2 Over 40?
Query 3 Losing Calcium?
Really asking about age and gender
How does one characterize all such attacks?
What can one say about security with respect to such attacks?

8
Our definitions

Definition
An inference attack is a set of queries $x$ not
independent of the set of sensitive bits $S$, i.e.
$I(S; x) \neq 0$
Definition
A small error inference attack is one in which
$\lim_{m \to \infty} \omega_m = 0$.
Definition
The query complexity per bit of a query sequence $x$
of length $m$, as a means of distinguishing among $M$
possible values of $x$, is
$\eta_m = m / \log_2 M$.

9
Recall attack example

Query 1 Female?
Query 2 Over 40?
Query 3 Losing Calcium?
Query 3 checks the answers to Queries 1 and 2
It is a parity-check bit of sorts, but not quite
If Queries 1 and 2 are independent, $\eta = 3/2$
$\omega_m \to 0 \Rightarrow \eta_m \to \infty$?
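
The value 3/2 follows directly from the definition of query complexity per bit: three queries distinguish among the four combinations of two independent bits. A small sketch, where the function name is hypothetical and eta stands for the query complexity symbol lost in extraction:

```python
import math


def query_complexity(m: int, num_values: int) -> float:
    """Queries per bit: m queries used to distinguish among num_values possibilities."""
    return m / math.log2(num_values)


# Three queries (Female?, Over 40?, Losing Calcium?) over two independent bits.
eta_attack = query_complexity(m=3, num_values=4)

# For comparison: asking a single bit 243 times (a repetition code).
eta_repetition = query_complexity(m=243, num_values=2)

print(eta_attack, eta_repetition)
```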

10
Our analogy (ISIT 03)

All attacks are communication over a channel
When attacks are codes, $x = f(S)$:
What B queries is a codeword bit
What B receives is the transmitted codeword, which
he decodes

11
Shannon's theorems apply when $x = f(S)$ and $\eta$
constant (ISIT 03)

Assuming
$x = f(S)$ (including adaptive, related queries)
queries are channel codes
constant $\eta$: reliable transmission
Result
$\omega_m \to 0 \Rightarrow \eta \ge 1/C$
Above this bound, $\omega_m \to 0$ exponentially;
below it, $\omega_m$ increases exponentially
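
To get a feel for the bound 1/C, the perturbation can be modeled as a binary symmetric channel. This is a sketch under that assumption: each answer is flipped independently with the probability of lie q, so C = 1 - H(q) in bits per query. The function names are illustrative.

```python
import math


def binary_entropy(q: float) -> float:
    """Entropy in bits of a Bernoulli(q) variable."""
    if q in (0.0, 1.0):
        return 0.0
    return -q * math.log2(q) - (1 - q) * math.log2(1 - q)


def bsc_capacity(q: float) -> float:
    """Capacity in bits per query of a binary symmetric channel with flip probability q."""
    return 1.0 - binary_entropy(q)


def min_queries_per_bit(q: float) -> float:
    """The bound 1/C on query complexity per bit (undefined at q = 0.5, where C = 0)."""
    return 1.0 / bsc_capacity(q)


for q in (0.1, 0.3, 0.45):
    print(q, bsc_capacity(q), min_queries_per_bit(q))
```

As q approaches 0.5 the capacity goes to zero and the number of queries needed per bit grows without bound.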

12
What about the general zero-error inference
attack?

Not all inference attacks are codes, i.e. in general $x \neq f(S)$.
$\eta$ is not necessarily kept constant as $m \to \infty$, i.e.
transmission is not necessarily reliable.

13
Thm. 1

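$\lim_{m \to \infty} \omega_m = 0$
$\Rightarrow \exists \{\mu_m\}_{m=1}^{\infty}$ s.t. $\eta_i \ge \mu_m \ \forall i \ge m$ and $\lim_{m \to \infty} \mu_m = 1/C$
The proof modifies the proof of the converse of Shannon's
channel coding theorem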

14
The Proof

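$\log_2 M = H(s^m) = H(s^m \mid y^m) + I(s^m; y^m)$
$\le 1 + E_m \log_2 M + I(s^m; y^m)$
$\le 1 + E_m \log_2 M + mC$
$\eta_m = m / \log_2 M \ge (1 - E_m) / (1/m + C) = \mu_m$
$\lim_{m \to \infty} \mu_m = 1/C$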
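
The step from the Fano-type bound $\log_2 M \le 1 + E_m \log_2 M + mC$ to the bound on the query complexity is a rearrangement. Spelled out, assuming $E_m < 1$ and writing $\eta_m = m/\log_2 M$:

```latex
\begin{align*}
\log_2 M &\le 1 + E_m \log_2 M + mC \\
(1 - E_m)\,\log_2 M &\le 1 + mC \\
\frac{\log_2 M}{m} &\le \frac{1/m + C}{1 - E_m} \\
\eta_m = \frac{m}{\log_2 M} &\ge \frac{1 - E_m}{1/m + C}
\end{align*}
```

As $E_m \to 0$ and $m \to \infty$, the right-hand side tends to $1/C$.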

15
Thm. 2

Small error attacks with constant $\eta \ge 1/C$ exist.
Proof: follows from the channel coding theorem

16
Thm. 3

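For data of entropy $H$, a stationary record
sequence, $N_r$ records, and $\eta_m$ the number of
queries per record,
$\lim_{m \to \infty} \omega_m = 0$
$\Rightarrow \exists \{\mu_m\}_{m=1}^{\infty}$ s.t. $\eta_i \ge \mu_m \ \forall i \ge m$ and $\lim_{m \to \infty} \mu_m = H/C$
Proof: modification of the source-channel coding
theorem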

17
Proof

Given Theorem 1, smaller lengths can be shown to
violate Shannon's source coding theorem when the
data is stationary.

18
Corollary

$\eta_m \approx \ln 2 / (2\rho^2)$
when $p = 0.5 - \rho$,
for any probability of error
Different from the Chernoff bound: does not increase
with a smaller probability of error
This is the improvement bought over the
repetition code
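
The figure $\ln 2/(2\rho^2)$ can be checked against the capacity bound $1/C$. A sketch of the expansion, assuming the perturbation acts as a binary symmetric channel with flip probability $\tfrac12 - \rho$ and writing $H_b$ for the binary entropy in bits:

```latex
\begin{align*}
C &= 1 - H_b\!\left(\tfrac12 - \rho\right)
   = \frac{1}{2\ln 2} \sum_{n=1}^{\infty} \frac{(2\rho)^{2n}}{n(2n-1)}
   = \frac{2\rho^2}{\ln 2} + \frac{4\rho^4}{3\ln 2} + \cdots \\
\frac{1}{C} &\approx \frac{\ln 2}{2\rho^2} \quad \text{for small } \rho
\end{align*}
```

The Chernoff figure for repeated queries, $\ln(2/\delta)/(0.38\,\rho^2)$ per bit, has the same $1/\rho^2$ dependence but grows as the target error $\delta$ shrinks, while $1/C$ does not depend on $\delta$.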

19
Where to?

Block ciphers as channels for properties of the
key (Filiol, ePrint 2003)
Attacks on stream ciphers as codes over key bits
(Johansson et al., Golic et al., Filiol et al.)
It appears there is a framework (Vora, working
documents):
all statistical attacks as channel communication
efficient attacks as codes
related-input (key, message) attacks as
concatenated codes [1]
Wagner's Cryptanalytic Model (FSE 03) to
determine inner codes
Do related-key attacks provide an improvement in
efficiency over repeated-key attacks?
[1] Filiol shows the repeated-key attack on block
ciphers as a concatenated code with the outer
code as the repetition code

20
Also traffic analysis, e.g. Crowds (Reiter and
Rubin, Lucent and AT&T)

$N$ nodes, $C$ colluding, $p_f$ probability of forwarding
At node $i+1$:
Probability that node $i$ originated the message (probability of truth): $1 - p_f (N-C-1)/N$
Probability of any other non-collaborating node originating the message: $p_f/N$
Observable information changes the pdf on the data of interest: the originator of the message
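
The two probabilities above form a distribution over the non-collaborating nodes: node i itself plus the N - C - 1 other honest nodes. A short check of this, taking the slide's formulas as given; the function name and the parameter values (N = 20, C = 3, pf = 0.75) are hypothetical.

```python
def crowds_posterior(n: int, c: int, pf: float) -> tuple[float, float]:
    """Return (P[node i is the originator], P[a given other honest node is the originator])."""
    p_truth = 1 - pf * (n - c - 1) / n
    p_other = pf / n
    return p_truth, p_other


# Hypothetical crowd: 20 nodes, 3 colluding, forwarding probability 0.75.
n, c, pf = 20, 3, 0.75
p_truth, p_other = crowds_posterior(n, c, pf)

# Node i plus the n - c - 1 other non-collaborating nodes exhaust the possibilities.
total = p_truth + (n - c - 1) * p_other
print(p_truth, p_other, total)
```

With these values the predecessor is the originator with probability 0.4, compared with 0.0375 for any other honest node, which is the sense in which the observation changes the pdf on the originator.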
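
21
The Crowds protocol as a simplex channel

[Figure: channel with input X and output Y]
$F: X \to Y$, $F(X) = Y$, where $X$ is the set of originator nodes $\{0, \ldots, N-3\}$ and $Y$ is the set of predecessor nodes $\{0, \ldots, N-3\}$
Assumption: all senders equally likely
$P(Y = j \mid X = i) = p_{ij} = p_f/N$ for $i \neq j$, and $1 - p_f (N-2)/N$ for $i = j$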

22
The Crowds protocol

[Figure: channel with input X and output Y]
C 1 (N-2)pf/N log 1- (N-2)pf/N pf/N log
pf/N 2log2/N if pf1 ? 2log2/N (N-1)?2 if
pf 1 - ? Average path length (1 - ?)/? O(1/
?)

23
The replay attack on Crowds