Poorvi Vora - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Poorvi Vora


1
A model for data revelation
  • Poorvi Vora
  • Dept. of Computer Science
  • George Washington University

2
Security frameworks
  • Binary
  • Divide the world into trusted and untrusted
    parties
  • Provides complete revelation of information or
    complete protection
  • E.g. multiparty computation, encrypted data

3
Even a statistic or aggregate reveals private
information
  • Secure multiparty computation reveals
  • f(x1, x2, ..., xn)
  • And nothing more.
  • Yet, this reveals information about all xi
  • Thus, typical security assurances not enough
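
A minimal sketch of why an aggregate alone leaks information, assuming (for illustration only, not from the slides) that f is the sum of n independent, uniformly random bits. The function name `posterior_given_sum` is hypothetical.

```python
from fractions import Fraction
from itertools import product


def posterior_given_sum(n, s, i=0):
    """P(x_i = 1 | x_1 + ... + x_n = s) for independent uniform bits.

    Computed by brute-force enumeration, so keep n small.
    """
    matching = [xs for xs in product((0, 1), repeat=n) if sum(xs) == s]
    if not matching:
        raise ValueError("sum s is not attainable with n bits")
    ones = sum(xs[i] for xs in matching)
    return Fraction(ones, len(matching))


# Before the sum is revealed, each bit is 1 with probability 1/2.
# After the sum is revealed, the belief about any single bit shifts.
for s in range(5):
    print(s, posterior_given_sum(4, s))
```

Under this assumption the posterior moves away from the prior of 1/2 whenever the sum is not exactly n/2, even though nothing beyond f was revealed.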

4
What is privacy
  • Control over information
  • Extent of information revelation
  • Tensions between
  • Access to aggregate information for community
  • Vs.
  • Individual control
  • Reputation vs. prejudice

5
Individual control requires more than binary
security of personal information
  • Information is often given up for something in
    return
  • Safeway card
  • Monthly charge to be kept out of phone books
  • Information for community statistics
  • Health statistics
  • Collaborative filtering/personalization in
    virtual communities

6
A model: introduce uncertainty; maximum
uncertainty (i.e. secrecy) corresponds to crypto
protocols
  • Alice and Bob determine
  • a binary data point from Alice's personal
    information, x
  • a probability of truth, p
  • a return, y
  • Alice reveals a variable z, where z = x with probability p
  • Bob provides, in return, y
  • z exists in the ether as Alice's value x with
    probability p
  • This is not mutually exclusive with cryptographic
    protection (p = 0.5 is cryptographic)
  • Used in public health community for twenty odd
    years
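
The revelation step can be sketched as follows. This is an illustrative sketch, not the author's implementation: the function names `reveal` and `estimate_fraction`, the population of 10,000 people, and the value p = 0.8 are all assumptions. It also shows why the approach resembles randomized-response surveys: Bob can estimate a population fraction from noisy answers without learning any individual's bit with certainty.

```python
import random


def reveal(x, p, rng):
    """Return z = x with probability p, otherwise the flipped bit 1 - x."""
    return x if rng.random() < p else 1 - x


def estimate_fraction(zs, p):
    """Estimate the true fraction of ones from noisy answers.

    If a fraction q of the true bits are 1, then
    E[z] = p*q + (1 - p)*(1 - q), so q = (mean - (1 - p)) / (2p - 1).
    """
    if p == 0.5:
        raise ValueError("p = 0.5 reveals nothing, so no estimate is possible")
    mean = sum(zs) / len(zs)
    return (mean - (1 - p)) / (2 * p - 1)


rng = random.Random(0)                      # fixed seed for repeatability
true_bits = [1] * 3000 + [0] * 7000         # hypothetical population, 30% ones
answers = [reveal(x, 0.8, rng) for x in true_bits]
estimate = estimate_fraction(answers, 0.8)  # should be close to 0.3
```

At p = 1 the answer is always true, and at p = 0.5 the answer is independent of x, which matches the slide's remark that p = 0.5 corresponds to cryptographic protection.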

7
Outcome
  • Protocol is a mathematical game between Alice and
    Bob
  • Optimal situation not when no information is
    revealed, but when Alice gets maximum benefit for
    her information
  • Think about this: should women in Africa test for
    HIV when they will certainly not obtain any
    treatment for it?

8
An analogy
  • The protocol is a communication channel
  • The sender is Alice, the receiver (malicious?)
    Bob
  • The probability of error is the probability of a
    lie

9
Security properties of randomization
  • Repeated queries
  • Error → 0 as n → ∞
  • And n → ∞ as Error → 0
  • Cost to attacker increases without bound if error
    not bounded above zero
  • This is a repetition code over channel
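
A small sketch of the repeated-query attack, assuming Bob asks the same binary question n times (n odd), each answer is independently true with probability p, and Bob decides by majority vote. The function name `majority_error` is hypothetical.

```python
from math import comb


def majority_error(n, p):
    """Probability that a majority vote over n independent answers is wrong.

    Each answer is true with probability p; the vote fails when more than
    half of the n answers are lies. n must be odd so that no ties occur.
    """
    if n % 2 == 0:
        raise ValueError("n must be odd")
    q = 1 - p  # probability of a lie
    return sum(comb(n, k) * q**k * p**(n - k)
               for k in range((n + 1) // 2, n + 1))


for n in (1, 3, 11, 101):
    print(n, majority_error(n, 0.8))
```

For p > 0.5 the error shrinks toward 0 as n grows, but it only reaches 0 in the limit, which is the sense in which the attacker's cost grows without bound.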

10
Other attacks
  • Query 1: Graying?
  • Query 2: Balding?
  • Query 3: Weight?
  • Query 4: Sports?
  • Really asking about age and gender
  • How does one characterize all such attacks?
  • What can one say about security with respect to
    such attacks?

11
An analogy
  • The protocol is a communication channel
  • The sender is Alice, the receiver (malicious?)
    Bob
  • The probability of error is the probability of a
    lie
  • The attributes that Bob wants to determine form
    the message

12
A simple attack
  • Query 1: Female?
  • Query 2: Over 40?
  • Query 3: Losing calcium?
  • Query 3 checks the answers to Queries 1 and 2
  • It is a parity-check bit
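
The parity-check idea can be sketched under a simplifying assumption: the third answer is treated as the exact XOR of the first two, whereas the slide's third query is only statistically related to them. The names `is_consistent` and `detection_probabilities` are hypothetical.

```python
def is_consistent(a1, a2, a3):
    """True when the third answer equals the XOR (parity) of the first two."""
    return (a1 ^ a2) == a3


def detection_probabilities(p):
    """Return (no lies, detected lies, undetected lies) for three answers.

    Each answer is independently true with probability p. An odd number of
    lies breaks the parity and is detected; exactly two lies go unnoticed.
    """
    q = 1 - p
    no_lies = p**3
    detected = 3 * q * p**2 + q**3
    undetected = 3 * q**2 * p
    return no_lies, detected, undetected


print(is_consistent(1, 0, 1))        # answers agree with the parity
print(is_consistent(1, 1, 1))        # at least one answer must be a lie
print(detection_probabilities(0.9))
```

A single check bit only detects some lies; it takes more redundancy for Bob to correct them.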

13
An analogy
  • All attacks are communication over channel
  • Good attacks are codes
  • What Bob queries is a codeword bit
  • What he receives is the transmitted codeword that
    he decodes

14
Shannon's theorems apply
  • In fact, assuming
  • any functions of Alice's data points as queries
    (adaptive, related queries)
  • and error probability → 0 as n → ∞
  • The number of queries required per bit of entropy
  • is asymptotically tightly bounded below by the
    inverse of the channel capacity
  • Above this bound, error tends exponentially to 0
  • Below it, it increases exponentially with n
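
To make the bound concrete, here is a sketch that assumes the protocol behaves as a binary symmetric channel whose crossover probability is the probability of a lie, 1 - p. The slides do not name the channel type, so this is an assumption, and the function names are hypothetical.

```python
from math import log2


def binary_entropy(q):
    """Entropy in bits of a coin that lands heads with probability q."""
    if q in (0.0, 1.0):
        return 0.0
    return -q * log2(q) - (1 - q) * log2(1 - q)


def bsc_capacity(p):
    """Capacity in bits per query when each answer is true with probability p."""
    return 1.0 - binary_entropy(1.0 - p)


def min_queries_per_bit(p):
    """Asymptotic lower bound on queries per bit of entropy: 1 / capacity."""
    c = bsc_capacity(p)
    return float("inf") if c == 0 else 1.0 / c


for p in (1.0, 0.9, 0.75, 0.6, 0.5):
    print(p, bsc_capacity(p), min_queries_per_bit(p))
```

As p approaches 0.5 the capacity approaches 0 and the number of queries Bob needs per bit grows without bound.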

15
Questions
  • How does one determine the entropy of a
    particular data set, or a general data set?
  • What kinds of attacks are computationally
    feasible?
  • This was a very powerful attacker. What are
    reasonable limits on the attacker's abilities?
  • Result in itself, independent of model.
  • Partly published at Int. Symp. Info. Theory, 2003
  • Journal paper in review, at website

16
Value-free model
  • Human rights aspects covered through crypto
    protocols
  • Necessary health information and community
    information can be gathered
  • Consumer behaviour treated through this game
  • Criticism: very adversarial model

17
Another application: anonymous delivery
Crowds: Reiter and Rubin / Lucent and AT&T
  • At node i+1, node i is more likely than any other
    to be the sender
  • Receiver: Node i+1
  • Message: sending node
  • Received symbol: Node i
  • Channel characteristic: probability that true
    sender is Node i, probability that other nodes
    are senders
  • Traffic analysis/data mining: correlations among
    senders (communication across channel, less
    efficient than some error-correcting code)
  • [Figure: crowd of N nodes (A, B, C, D, E); pf is
    the probability of forwarding]

18
An example of model use to measure the value of
information (with Yu-An Sun and Sumit Joshi)
  • Auction bids reveal much about an individual's
    profile
  • Consider the Vickrey sealed second-highest-bid
    auction
  • Optimal strategy is to bid one's valuation
  • Bids (and hence valuations) can be protected with
    secure multiparty computation
  • But, bids allow determination of market demand
    (efficient markets)
  • Need for an aggregate value, not well-defined at
    the moment of the auction
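
For reference, the standard Vickrey rule mentioned above can be sketched as follows: the highest bidder wins and pays the second-highest bid. The function name `vickrey_outcome` and the sample bids are hypothetical, and ties are broken by input order as a simplifying assumption.

```python
def vickrey_outcome(bids):
    """Return (winner, price) for a sealed second-price auction.

    bids maps bidder name to bid amount. The highest bidder wins and pays
    the second-highest bid. Ties are broken by insertion order.
    """
    if len(bids) < 2:
        raise ValueError("need at least two bidders")
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]
    return winner, price


print(vickrey_outcome({"alice": 120, "bob": 100, "carol": 90}))
```

Because the price does not depend on the winner's own bid, bidding one's true valuation is optimal, which is also why the bids reveal so much about each bidder.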

19
Variably Private Vickrey: Bidding Round
Introduce uncertainty
  • The seller announces a minimum sale price and a
    maximum randomization setting.
  • Each bidder submits a sealed interval containing
    her bid. The size of the interval is her choice.
  • A bidder is in the running with the high end of
    her interval, and is committed to the low end

20
Variably Private Vickrey: Revealing Round
  • Bidders not in the running will reveal no more
    information on their valuations.
  • The largest of the others will reveal which half
    of their interval contains their valuation

21
Sale Price
  • [Figure: sale price, showing what the seller gets
    and what the buyer pays]
  • Divided among all bidders in proportion to the
    interval width

22
Properties?
  • Provides various demand statistics
  • In general, accuracy of future bid estimation
    lower for more uncertainty
  • Allows for bidder to vary uncertainty, and pay
    for it
  • Allows seller to obtain more than regular
    Vickrey, depending on how much information is
    valued
  • Bidder with highest valuation still wins auction
    as long as she can tolerate revealing her
    valuation to the extent required.

23
Summary
  • A model that we hope will
  • Provide choices not currently typically available
    to users
  • Extend the security framework to include problems
    like those in statistical databases
  • Provide a means of measuring uncertainty in
    situations where it is partial, rather than none
    or complete
  • Include other leakage from security-related
    protocols such as anonymous delivery and ciphers
  • Be useful for measuring the economic value of
    information