Poorvi Vora - PowerPoint PPT Presentation

1
Information Theory and the Security of Binary Data Perturbation
  • Poorvi Vora
  • Dept. of Computer Science
  • George Washington University

2
Statistical Database
  • Database A
  • $Q = \{q_1, q_2, \ldots, q_i, \ldots\}$ (queryable bits) and
  • $S = \{s_1, s_2, \ldots, s_i, \ldots\}$ (sensitive bits).
  • Data collector B can ask for
  • $f_i(q_1, q_2, q_3, \ldots) = X_i$, $q_j \in Q$

3
The statistical database security problem
  • Can query multiple
  • $f_i(q_1, q_2, q_3, \ldots) = X_i$, $q_j \in Q$
  • and solve them simultaneously
  • (perfect zero-knowledge protocols do not leak additional
    information about $x_i$, but the $A_i$ are revealed; thus
    this is not a traditional cryptographic problem)

4
Random Data Perturbation (RDP)
  • Used in the public health community for twenty-odd
    years; can be used together with cryptographic
    techniques
  • If $x_i$ is perturbed each time, the simultaneous
    equations are inconsistent
  • $f_i(q_1 + \eta_{1i}, q_2 + \eta_{2i}, q_3 + \eta_{3i}, \ldots) = X_i + \eta_i$
  • Security and attack characterization: an open problem
    for 20 years despite many attempts (Denning,
    Adams, Duncan, Landers).

5
RDP
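[Figure: RDP examples. Panel labels: Salary 25,000; Salary 40,000; perturbation range -25,000 to 25,000; distributions F(x), G(x). Binary example: answer to "HIV?" (Yes), bits 0 and 1 with transition probabilities q and p = 1 - q.]
  • Statistics over many records are accurate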
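
A minimal Python sketch of the binary case, assuming each sensitive bit is reported unchanged with probability p = 1 - q and flipped with probability q. The function names, the flip probability 0.3, and the 10% base rate are illustrative assumptions, not values from the slides. It shows why aggregate statistics stay accurate even though any single answer is unreliable.

```python
import random


def perturb(bit, q, rng):
    """Return the flipped bit with probability q, the true bit otherwise."""
    return bit ^ 1 if rng.random() < q else bit


def estimate_fraction(reports, q):
    """Estimate the true fraction of 1s from perturbed reports (requires q != 0.5)."""
    observed = sum(reports) / len(reports)
    # E[observed] = t * (1 - q) + (1 - t) * q; solve for the true fraction t.
    return (observed - q) / (1 - 2 * q)


rng = random.Random(0)   # fixed seed so the run is repeatable
q = 0.3                  # hypothetical flip probability
true_bits = [1 if rng.random() < 0.1 else 0 for _ in range(100_000)]
reports = [perturb(b, q, rng) for b in true_bits]

true_fraction = sum(true_bits) / len(true_bits)
estimated_fraction = estimate_fraction(reports, q)
print(true_fraction, estimated_fraction)
```

The estimator only inverts the known flip probability; it says nothing about any individual record, which is the property RDP relies on.
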
6
Known Security Property of RDP
  • $m$ repeated queries
  • $\epsilon_m$: probability of error
  • $\epsilon_m \to 0$ as $m \to \infty$
  • Chernoff bound:
  • $m = \ln(2/\beta) / (0.38\,\delta^2) \Rightarrow \epsilon_m < \beta$
  • Probability of lie: $0.5 - \delta$
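
The bound can be checked numerically. The sketch below assumes the attacker repeats one query m times and takes a majority vote, with each answer a lie independently with probability 0.5 - delta; the function names and the parameter values are illustrative assumptions. It rounds the slide's query count up to an integer and compares the target error with the exact majority-vote error.

```python
from math import ceil, comb, log


def chernoff_queries(beta, delta):
    """Query count from the slide's bound, m = ln(2/beta) / (0.38 * delta**2), rounded up."""
    return ceil(log(2 / beta) / (0.38 * delta ** 2))


def majority_error(m, delta):
    """Exact probability that a majority vote over m answers is wrong.

    Each answer is a lie with probability 0.5 - delta; ties are counted as errors.
    """
    lie = 0.5 - delta
    return sum(comb(m, k) * lie ** k * (1 - lie) ** (m - k)
               for k in range((m + 1) // 2, m + 1))


beta, delta = 0.05, 0.2          # hypothetical target error and bias
m = chernoff_queries(beta, delta)
print(m, majority_error(m, delta))
```
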

7
A simple inference attack
  • Query 1: Female?
  • Query 2: Over 40?
  • Query 3: Losing calcium?
  • Really asking about age and gender
  • How does one characterize all such attacks?
  • What can one say about security with respect to such attacks?

8
Our definitions
  • Definition
  • An inference attack is a set of queries $x$ not
    independent of the set of sensitive bits $S$, i.e.
  • $I(S; x) \neq 0$
  • Definition
  • A small error inference attack is one in which the probability of error $\epsilon_m$ satisfies
  • $\lim_{m \to \infty} \epsilon_m = 0$.
  • Definition
  • The query complexity per bit, of query sequence $x$
    of length $m$, as a means of distinguishing among $M$
    possible values of $x$ is
  • $\gamma_m = m / \log_2 M$.

9
Recall attack example
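  • Query 1: Female?
  • Query 2: Over 40?
  • Query 3: Losing calcium?
  • Query 3 checks the answers to Queries 1 and 2
  • It is a parity-check bit of sorts, but not quite
  • If Queries 1 and 2 are independent, the query complexity per bit is $\gamma = 3/2$ (3 queries for 2 bits)
  • $\epsilon_m \to 0 \Rightarrow \gamma_m \to \infty$? (with $\epsilon_m$ the probability of error)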
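
The value 3/2 follows directly from the definition of query complexity per bit, m / log2(M). A small sketch, with a hypothetical helper name, for the three-query example and for a plain repetition attack on a single bit:

```python
from math import log2


def query_complexity_per_bit(m, M):
    """Queries per bit: m queries used to distinguish among M possible values."""
    return m / log2(M)


# Three queries about two independent bits (gender, age): M = 4 possible values.
gamma_example = query_complexity_per_bit(3, 4)

# Repeating one query five times about a single bit: M = 2 possible values.
gamma_repetition = query_complexity_per_bit(5, 2)

print(gamma_example, gamma_repetition)
```
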

10
Our analogy (ISIT 03)
  • All attacks are communication over a channel
  • Attacks are codes when $x = f(S)$
  • What B queries is a codeword bit
  • What B receives is the transmitted codeword, which
    he decodes

11
Shannon's theorems apply when $x = f(S)$ and $\gamma$
constant (ISIT 03)
  • Assuming
  • $x = f(S)$ (including adaptive, related queries):
    queries are channel codes
  • constant $\gamma$ (query complexity per bit): reliable transmission
  • Result
  • $\epsilon_m \to 0 \Rightarrow \gamma \geq 1/C$
  • Above this bound, the probability of error $\epsilon_m \to 0$ exponentially,
  • Below it, $\epsilon_m$ increases exponentially
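
To make the 1/C threshold concrete, the sketch below assumes the perturbation behaves as a binary symmetric channel that flips each answer with probability q, so that C = 1 - H(q) bits per query. The channel model, the function names, and the value q = 0.4 are assumptions made for illustration.

```python
from math import log2


def binary_entropy(p):
    """Entropy in bits of a binary variable that is 1 with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)


def bsc_capacity(q):
    """Capacity in bits per use of a binary symmetric channel with flip probability q."""
    return 1.0 - binary_entropy(q)


def min_queries_per_bit(q):
    """The threshold 1/C on query complexity per bit; infinite when the capacity is zero."""
    c = bsc_capacity(q)
    return float("inf") if c == 0 else 1.0 / c


print(bsc_capacity(0.4), min_queries_per_bit(0.4))
```

With q = 0.5 the answers carry no information, the capacity is zero, and no finite number of queries per bit suffices.
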

12
What about the general zero-error inference
attack?
  • Not all inference attacks are codes, i.e. in general $x \neq f(S)$.
  • $\gamma$ (query complexity per bit) is not necessarily kept constant as $m \to \infty$, i.e.
    transmission is not necessarily reliable.

13
Thm. 1
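  • $\lim_{m \to \infty} \epsilon_m = 0$ (probability of error vanishes)
  • $\Rightarrow \exists \{\lambda_m\}_{m=1}^{\infty}$ s.t. $\gamma_i \geq \lambda_m \ \forall i \geq m$ and $\lim_{m \to \infty} \lambda_m = 1/C$, where $\gamma_i$ is the query complexity per bit
  • The proof modifies the proof of the converse of Shannon's
    channel coding theorem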

14
The Proof
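  • $\log_2 M = H(s^m) = H(s^m \mid y^m) + I(s^m; y^m)$
  • $\leq 1 + \epsilon_m \log_2 M + I(s^m; y^m)$ (Fano's inequality, with $\epsilon_m$ the probability of error)
  • $\leq 1 + \epsilon_m \log_2 M + mC$
  • $\gamma_m = m / \log_2 M \geq (1 - \epsilon_m) / (1/m + C) = \lambda_m$
  • $\lim_{m \to \infty} \lambda_m = 1/C$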

15
Thm. 2
  • Small error attacks with constant query complexity per bit $\gamma$ arbitrarily close to $1/C$ exist.
  • Proof: follows from the channel coding theorem

16
Thm. 3
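  • For data of entropy $H$, a stationary record
    sequence, $N_r$ records, and $\nu_m$ the number of
    queries per record,
  • $\lim_{m \to \infty} \epsilon_m = 0$ (probability of error vanishes)
  • $\Rightarrow \exists \{\lambda_m\}_{m=1}^{\infty}$ s.t. $\nu_i \geq \lambda_m \ \forall i \geq m$ and $\lim_{m \to \infty} \lambda_m = H/C$
  • Proof: modification of the source-channel coding
    theorem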

17
Proof
  • Given Theorem 1, smaller lengths can be shown to
    violate Shannon's source coding theorem when the
    data is stationary.

18
Corollary
  • Query complexity per bit $\gamma_m \approx \ln 2 / (2\delta^2)$
  • when $p = 0.5 - \delta$
  • for any probability of error
  • Different from the Chernoff bound: does not increase
    with a smaller probability of error
  • This is the improvement bought over the
    repetition code
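
A numerical comparison may help here. The sketch assumes a binary symmetric channel with flip probability 0.5 - delta and uses the Chernoff-style count ln(2/beta) / (0.38 * delta**2) for the repetition attack; function names and parameter values are illustrative assumptions. For small delta, 1/C is close to ln 2 / (2 * delta**2), and neither depends on the target error beta.

```python
from math import log, log2


def inverse_capacity(delta):
    """1/C for a binary symmetric channel with flip probability 0.5 - delta."""
    p = 0.5 - delta
    entropy = -p * log2(p) - (1 - p) * log2(1 - p)
    return 1.0 / (1.0 - entropy)


def corollary_value(delta):
    """Small-delta approximation of 1/C: ln 2 / (2 * delta**2)."""
    return log(2) / (2 * delta ** 2)


def chernoff_queries(beta, delta):
    """Repetition-attack query count for target error beta."""
    return log(2 / beta) / (0.38 * delta ** 2)


delta = 0.1   # hypothetical bias
for beta in (1e-2, 1e-4, 1e-8):
    # The first column grows as beta shrinks; the other two do not change.
    print(beta, chernoff_queries(beta, delta), corollary_value(delta), inverse_capacity(delta))
```
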

19
Where to?
  • Block Ciphers as channels for properties of the
    key (Filiol, ePrint 2003)
  • Attacks on Stream Ciphers as codes over key bits
    (Johansson et al, Golic et al, Filiol et al)
  • It appears there is a framework (Vora, working
    documents):
  • all statistical attacks as channel communication
  • efficient attacks as codes
  • related-input (key, message) attacks as
    concatenated codes [1]
  • Wagner's Cryptanalytic Model (FSE 03) to
    determine inner codes
  • Do related-key attacks provide an improvement in
    efficiency over repeated-key attacks?
  • [1] Filiol shows the repeated-key attack on block
    ciphers as a concatenated code with the outer
    code as the repetition code

20
Also traffic analysis, e.g. Crowds (Reiter and Rubin, Lucent and AT&T)
  • $N$ nodes, $C$ colluding, $p_f$: probability of forwarding
  • At node $i+1$: probability that node $i$ originated the message (probability of truth) $= 1 - p_f (N - C - 1)/N$
  • Probability of any other non-collaborating node originating the message $= p_f / N$
  • Observable information changes the pdf on the data of interest: the originator of the message
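
The two probabilities can be sanity-checked in a few lines. In the sketch below the function name and the values N = 20, C = 3, p_f = 0.75 are illustrative assumptions; the formulas are the ones on the slide. Summed over the predecessor and the other N - C - 1 non-collaborating nodes, they should form a probability distribution.

```python
def crowds_originator_probabilities(n, c, pf):
    """Return (P[predecessor is the originator], P[a given other honest node is])."""
    p_predecessor = 1 - pf * (n - c - 1) / n
    p_other = pf / n
    return p_predecessor, p_other


n, c, pf = 20, 3, 0.75   # hypothetical crowd size, colluders, forwarding probability
p_predecessor, p_other = crowds_originator_probabilities(n, c, pf)

# One predecessor plus (n - c - 1) other non-collaborating candidates.
total = p_predecessor + (n - c - 1) * p_other
print(p_predecessor, p_other, total)
```
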
21
The Crowds protocol as a simplex channel
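  • $F: X \to Y$, $F(X) = Y$
  • $X$: set of originator nodes $\{0, \ldots, N-3\}$
  • $Y$: set of predecessor nodes $\{0, \ldots, N-3\}$
  • Assumption: all senders equally likely
  • $P(Y = j \mid X = i) = p_{ij} = p_f / N$ for $i \neq j$, and $1 - p_f (N-2)/N$ for $i = j$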
22
The Crowds protocol
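Capacity $C$ of the channel $X \to Y$ (expression only partly recoverable from the slide):
  • terms of the form $(1 - (N-2) p_f / N) \log (1 - (N-2) p_f / N)$ and $(p_f / N) \log (p_f / N)$
  • $p_f = 1$: $2 \log 2 / N$
  • $p_f = 1 - \epsilon$: expression involving $2 \log 2 / N$ and $(N-1)\epsilon^2$
  • Average path length: $(1 - \epsilon)/\epsilon = O(1/\epsilon)$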
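
As a rough numerical companion, the sketch below builds the transition matrix with diagonal entries 1 - p_f (N-2)/N and off-diagonal entries p_f / N, and evaluates the textbook capacity of a symmetric channel, log2|Y| - H(row). It assumes N - 1 honest nodes so that every row sums to one; that node count, the function names, and the parameter values are assumptions of this sketch and are not taken from the slide.

```python
from math import log2


def crowds_channel_matrix(n, pf):
    """Transition matrix P(Y=j | X=i) for n - 1 honest nodes (assumed) and one colluder."""
    k = n - 1
    diag = 1 - pf * (n - 2) / n
    off = pf / n
    return [[diag if i == j else off for j in range(k)] for i in range(k)]


def symmetric_channel_capacity(matrix):
    """Capacity in bits of a symmetric channel: log2(number of outputs) - H(any row)."""
    row = matrix[0]
    row_entropy = -sum(p * log2(p) for p in row if p > 0)
    return log2(len(row)) - row_entropy


n = 10   # hypothetical crowd size
for pf in (0.0, 0.5, 1.0):
    print(pf, symmetric_channel_capacity(crowds_channel_matrix(n, pf)))
```

Under these assumptions the capacity falls as p_f grows, which matches the intuition that longer paths leak less about the originator per observation.
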
23
The replay attack on Crowds
  • The repetition code corresponds to resending the message along a
    different (randomly chosen) route
  • How about attacks corresponding to other codes?
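
The repetition-code view of the replay attack can be simulated. This sketch assumes one colluding node, N - 1 honest nodes, and that each resend yields an independent observed predecessor drawn from the channel P(Y = j | X = i) with diagonal 1 - p_f (N-2)/N; the attacker decodes by taking the most frequent predecessor. All names and parameter values are illustrative assumptions.

```python
import random
from collections import Counter


def observed_predecessor(originator, n, pf, rng):
    """Draw one observed predecessor for a message sent by `originator`."""
    honest = n - 1   # assumption: exactly one colluding node
    if rng.random() < 1 - pf * (n - 2) / n:
        return originator
    others = [node for node in range(honest) if node != originator]
    return rng.choice(others)   # each other honest node has probability pf / n


def replay_attack(originator, n, pf, resends, rng):
    """Decode the repetition code: return the most frequently observed predecessor."""
    counts = Counter(observed_predecessor(originator, n, pf, rng)
                     for _ in range(resends))
    return counts.most_common(1)[0][0]


rng = random.Random(1)   # fixed seed so the run is repeatable
trials = 200
hits = sum(replay_attack(0, 10, 0.75, 50, rng) == 0 for _ in range(trials))
success_rate = hits / trials
print(success_rate)
```
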