Sergey Yekhanin - PowerPoint PPT Presentation

About This Presentation
Title:

Sergey Yekhanin

Description:

( Linear programming / Polynomial interpolation) [DMT,DY] ... Polynomial interpolation attack. Assume c is prime. ... Polynomial interpolation attack. Privacy ... – PowerPoint PPT presentation

Number of Views:57
Avg rating:3.0/5.0
Slides: 15
Provided by: sayo4
Learn more at: http://www.cs.cmu.edu
Category:

less

Transcript and Presenter's Notes

Title: Sergey Yekhanin


1
Lower Bounds on Noise
  • Sergey Yekhanin
  • Institute for Advanced Study

2
Setting
  • Database of information about individuals
  • E.g. Medical history, Census data, Customer info.
  • Need to guarantee confidentiality of individual
    entries
  • Want to make deductions about the database learn
    large scale trends.
  • E.g. Learn that a drug V increases likelihood of
    heart disease
  • Do not leak info about individual patients

3
Message
  • Two approaches to database privacy
  • Interactive Analyst asks questions curator
    returns approximate answers

Analyst
Curator
4
Message
  • Two approaches to database privacy
  • Interactive Analyst asks questions curator
    returns approximate answers
  • Non-interactive Publish a summary of the
    database analyst can use summary to get answers

Analyst
Summary
Curator
5
Message
  • Two approaches to database privacy
  • Interactive Analyst asks questions curator
    returns approximate answers
  • Non-interactive Publish a summary of the
    database analyst can use summary to get answers
  • Thesis The interactive approach is the right way
    to give good accuracy for a given level of
    privacy
  • Any non-interactive solution permitting too
    accurate answers to too many questions leaks
    private information.

6
Plan
  • Mathematical model of database and queries
  • Attacks
  • Somewhat accurate answers to all queries lead to
    privacy leakage. (Fourier analysis) Y (extends
    DiNi).
  • Somewhat accurate answers to a fraction of
    queries lead to privacy leakage. (Linear
    programming / Polynomial interpolation) DMT,DY
  • Study of privacy leads to a variety of
    mathematical challenges!

7
Model
  • Dinur-Nissim Simple Model (easily justifiable)
  • Database n-bit binary vector x
  • Query vector a
  • True answer Dot product ax
  • Response is ax e True Answer Noise
  • Privacy Leakage Attacker learns a certain bit
    of x.
  • Blatant Non-Privacy Attacker learns n-o(n) bits
    of x.

8
Fourier attack
  • Theorem If a curator adds o(vn) noise to every
    response then an attacker can ask n questions,
    perform O(n log n) computation and recover n-o(n)
    bits of the database.
  • Put database records in one-to-one correspondence
    with elements of a group .
  • Think of a database as a function D from to
    0,1.
  • Choose queries to ask for Fourier coefficients of
    D.
  • Noisy Fourier coefficients approximately
    determine the Boolean function D! (Parseval
    identity).

9
Linear programming attack
  • Theorem If a curator adds o(vn) noise to 0.773
    fraction of responses then an attacker can ask
    O(n) questions, perform O(n3) computation and
    recover n-o(n) bits of the database.
  • Arbitrarily large error on arbitrary and unknown
    0.239 fraction on answers.

10
Linear programming attack
  • Ask O(n) random 1/-1 questions
  • Obtain yAxe, where e is the error vector
  • A natural approach to recover x from y
  • Solve min e'0 such that yAx'e, x' in
    Rn (hard!)
  • Solve a linear program D, CT, MT
  • min e'1
  • such that
  • yAx'e'
  • x' in Rn

Ax'
y
11
Polynomial interpolation attack
  • Model Questions have O(c) large coefficients
  • Theorem If a curator adds o(c) noise to 0.501
    fraction of responses then an attacker can ask c
    questions, perform O(c4) computation and reliably
    recover any particular bit of the database.
  • Arbitrarily large error on arbitrary and unknown
    0.499 fraction on answers.

12
Polynomial interpolation attack
  • Assume c is prime.
  • Think of the space of queries as a linear space
    .
  • To obtain a reliable answer to query x (1,0,
    , 0) ,
  • draw a degree two curve through x.
  • Ask all queries that correspond to points on the
    curve.
  • Use polynomial interpolation to carefully combine
    the answers.

q1
q5
x
q6
q4
q2
q3
13
Implications
  • Privacy has a Price
  • There is no safe way to avoid increasing the
    noise as the number of queries increases
  • Applies to Non-Interactive Setting
  • Any non-interactive solution permitting answers
    that are too accurate to too many questions
    is vulnerable to attack.
  • Cannot just output a noisy table.

14
Helps to know what you need
  • Non-interactive approach has inherent limitations
  • Interactive approach works
  • Can also publish a summary, as long as its clear
    which stats are accurate, and which ones are not.
  • Future directions
  • Fewer queries
  • Understand what can and what cannot be done
    privately
Write a Comment
User Comments (0)
About PowerShow.com