Title: Research Overview: Security and Incentives in Emerging Applications
1Research Overview: Security and Incentives in Emerging Applications
2Summary of Previous, Ongoing and Future Research
- Privacy and Anonymity in Computer Networks [GZ02, ZLLY04a]
- Incentives in Ad Hoc Networks [ZYC03, ZLLY04b]
- Privacy-Preserving Data Mining [Z03, YZW04]
- Other Interesting Topics [ZY03, AFYZ04]
3Part 1: Privacy and Anonymity in Computer Networks
- Finished: Optimistic Mix for Exit Polls [GZ02]
- Under Review: Privacy-Preserving Location-based Services [ZLLY04a]
4Optimistic Mix [GZ02]
- A mix network (consisting of a group of mix servers) is a construction for anonymizing communications.
- Security requirements
  - Privacy: Infeasible to associate any input with the corresponding output.
  - Verifiability: Functionality of the mix can be verified.
5Global Picture of a Mix
- Operations of each server
  - Rerandomization: generate a set of random ciphertexts corresponding to the same set of cleartexts
  - Reshuffling: change the order of ciphertexts randomly
[Figure: ciphertexts pass through a chain of three mix servers, followed by a private algorithm for joint decryption]
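A minimal sketch of the two per-server operations, assuming ElGamal encryption over a toy group. The parameters P, Q, G and all function names are illustrative assumptions, not taken from GZ02. A single key pair stands in for the servers' joint decryption, and `random.Random` stands in for a cryptographic random source.

```python
import random

# Toy parameters (assumptions for illustration only; far too small to be secure).
P = 2039   # safe prime, P = 2*Q + 1
Q = 1019   # prime order of the subgroup generated by G
G = 4      # generator of the order-Q subgroup


def keygen(rng):
    """Return (private key x, public key h = G^x mod P)."""
    x = rng.randrange(1, Q)
    return x, pow(G, x, P)


def encrypt(h, m, rng):
    """ElGamal encryption of a nonzero residue m under public key h."""
    r = rng.randrange(1, Q)
    return pow(G, r, P), (m * pow(h, r, P)) % P


def decrypt(x, c):
    """Recover m from c = (a, b) with private key x (needs Python 3.8+)."""
    a, b = c
    return (b * pow(pow(a, x, P), -1, P)) % P


def rerandomize(h, c, rng):
    """Multiply by a fresh encryption of 1: same cleartext, new ciphertext."""
    a, b = c
    s = rng.randrange(1, Q)
    return (a * pow(G, s, P)) % P, (b * pow(h, s, P)) % P


def mix_server(h, ciphertexts, rng):
    """One mix server: rerandomize every ciphertext, then reshuffle."""
    out = [rerandomize(h, c, rng) for c in ciphertexts]
    rng.shuffle(out)
    return out


if __name__ == "__main__":
    rng = random.Random(7)
    x, h = keygen(rng)
    inputs = [encrypt(h, m, rng) for m in (4, 16, 64, 256)]
    batch = inputs
    for _ in range(3):          # three servers in a chain
        batch = mix_server(h, batch, rng)
    print(sorted(decrypt(x, c) for c in batch))
```

The output ciphertexts decrypt to the same set of cleartexts as the inputs, but none of them equals an input ciphertext, so the order alone no longer links inputs to outputs.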
6Designing a Mix
- Key Technical Question: How to ensure each server follows the protocol?
  - Zero-knowledge proofs, but proofs in previous solutions are expensive.
- We proposed Proof of Product with Checksum, which is much more efficient.
- However, Proof of Product with Checksum also introduces a type of attack. We use a novel technique, double encryption, to defeat this attack.
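The slides do not spell out the Proof of Product with Checksum, so the following is only a sketch of the algebraic fact a product check can rest on, assuming ElGamal ciphertexts over a toy group: the component-wise product of a batch encrypts the product of its cleartexts, and mixing only rerandomizes that product. All names and parameters are hypothetical. Here the verifier is simply handed the sum of the fresh exponents (`total`); a real protocol would prove the same relation in zero knowledge instead of revealing it.

```python
import random
from math import prod

# Toy group (assumption; insecure size): G generates the order-Q subgroup mod P.
P, Q, G = 2039, 1019, 4


def rerandomize_all(h, ciphertexts, rng):
    """Rerandomize and shuffle a batch; also return the sum of fresh exponents."""
    out, total = [], 0
    for a, b in ciphertexts:
        s = rng.randrange(1, Q)
        total = (total + s) % Q
        out.append(((a * pow(G, s, P)) % P, (b * pow(h, s, P)) % P))
    rng.shuffle(out)
    return out, total


def product_ciphertext(ciphertexts):
    """Component-wise product: an encryption of the product of the cleartexts."""
    return (prod(a for a, _ in ciphertexts) % P,
            prod(b for _, b in ciphertexts) % P)


def product_preserved(h, inputs, outputs, total):
    """True if the output product is the input product rerandomized by `total`."""
    a_in, b_in = product_ciphertext(inputs)
    a_out, b_out = product_ciphertext(outputs)
    return (a_out == (a_in * pow(G, total, P)) % P and
            b_out == (b_in * pow(h, total, P)) % P)
```

A product check by itself would not notice a server that multiplies one cleartext by some factor and divides another by the same factor, since the overall product is unchanged.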
7Our Result and Open Question
- We design a mix that is more efficient than any previous work with provable security.
  - In normal cases (no cheating), our mix only needs a few modular exponentiations. Thus it is more efficient than previous work.
  - Our mix net achieves privacy similar to standard ElGamal-based mix nets; its operations are publicly verifiable.
- Open Question: Can we further improve the efficiency? For example, can we avoid using modular exponentiations?
8Progress of Talk
- Privacy and Anonymity in Computer Networks
  - Finished: Optimistic Mix for Exit Polls [GZ02]
  - → Under Review: Privacy-Preserving Location-based Services [ZLLY04a]
- Incentives in Ad Hoc Networks [ZYC03, ZLLY04b]
- Privacy-Preserving Data Mining [Z03, YZW04]
- Other Interesting Topics [ZY03, AFYZ04]
9Privacy-Preserving Location-based Services
- Problem 1: A user authorizes a set of entities to retrieve her location information. This set changes dynamically.
- At any moment, all entities in the designated set are able to retrieve her location, while any other entity learns no information about her location.
10Design Techniques
- Each user stores her location information on a location server, encrypted using a key specifically chosen for the set of entities that are authorized to retrieve the location information.
  - Only the entities in the given set can derive the key to decrypt the location information.
  - Infeasible for any other entity to derive the key.
- To achieve the above goal, we use a cryptographic technique motivated by Akl and Taylor's work on hierarchical access control [AT83] and Fiat and Naor's work on broadcast encryption [FN93].
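The slides do not give the construction itself. As an illustration of how a per-set key can be derivable only by set members, here is a sketch in the style of the basic number-theoretic scheme from Fiat and Naor's broadcast encryption work: each entity holds the user's secret raised to its own prime, and the key for a set is the secret raised to the product of the members' primes. The modulus, the secret, the prime assignment, and the function names are all assumptions for illustration. This basic form resists only a single outsider (two colluding outsiders can recover the secret), and the toy modulus is trivially factorable.

```python
from math import prod

# Toy parameters (assumptions for illustration only).
N = 1009 * 1013          # RSA-style modulus; a real one must be hard to factor
SECRET_G = 123456        # known only to the user who owns the location data
ENTITY_PRIME = {"a": 5, "b": 13, "c": 17, "d": 19, "e": 29, "f": 31}  # public


def entity_secret(entity):
    """Issued once to each entity by the user: SECRET_G^(p_entity) mod N."""
    return pow(SECRET_G, ENTITY_PRIME[entity], N)


def set_key(authorized):
    """Key the user derives for the currently authorized set."""
    return pow(SECRET_G, prod(ENTITY_PRIME[e] for e in authorized), N)


def derive_key(entity, secret, authorized):
    """What a member computes from its own secret and the public set."""
    if entity not in authorized:
        # An outsider would need a p-th root modulo N, which is assumed
        # infeasible without knowing the factorization of N.
        raise ValueError("entity is not in the authorized set")
    exponent = prod(ENTITY_PRIME[e] for e in authorized if e != entity)
    return pow(secret, exponent, N)
```

When the authorized set changes, the user only recomputes `set_key` for the new set and re-encrypts; no new secrets need to be handed to the entities.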
11Solution to Problem 1 Illustrated
[Figure: Users 1, 2, 3 and Entities a-f around a Location Server. User 1's location is stored encrypted using K1. Entities authorized by User 1 CAN derive K1; other entities CANNOT derive K1.]
12Privacy-Preserving Location-based Services: Problem 2
- Dating service: "Does anybody around me satisfy these requirements?" This can be easily implemented:
  - Users register with a server and provide profiles.
  - Server keeps track of all users' locations.
  - When a user sends a request, the server searches for another user at the same location whose profile matches the requirements.
- However, in the above solution, users can be tracked by the service provider or by other users. Can this service be implemented without revealing any user's current location?
13Analysis of Problem 2
- When there is an incoming request for a match, it is very easy for the service provider to find the set of registered users that meet the requestor's requirements.
- The technical problem is how to decide whether any of the matched users' locations is the same as the requestor's, without revealing either the requestor's or the matched users' locations.
14Solution to Problem 2 and Open Question
- We give a two-round protocol for Problem 2.
  - Privacy is guaranteed under the Decisional Diffie-Hellman assumption.
  - Computational overhead: A user needs to do 3 modular exponentiations; the server needs to do k+1 modular exponentiations (for k matched users).
- Open Question: Can we design an (efficient) one-round protocol for this problem?
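The protocol itself is not given in the slides. To make the underlying problem concrete, here is a generic Diffie-Hellman-style private equality test between two parties, the kind of building block whose privacy is usually argued from the Decisional Diffie-Hellman assumption. It is not the protocol of ZLLY04a and does not reproduce its message flow or its exponentiation counts; the group, the hash-to-group mapping, and all names are assumptions.

```python
import hashlib
import random

# Toy safe-prime group (assumption; far too small to be secure).
P = 2039   # P = 2*Q + 1
Q = 1019   # prime order of the subgroup of squares mod P


def hash_to_group(location):
    """Map a location label to a square mod P, i.e. into the order-Q subgroup."""
    digest = hashlib.sha256(location.encode("utf-8")).digest()
    base = int.from_bytes(digest, "big") % (P - 1) + 1   # in 1..P-1
    return pow(base, 2, P)


def blind(element, secret):
    """Raise a group element to a secret exponent."""
    return pow(element, secret, P)


def same_location(requester_elem, other_elem, rng):
    """Two-round equality test on group elements without exchanging them."""
    a = rng.randrange(1, Q)   # requester's one-time secret
    b = rng.randrange(1, Q)   # other party's one-time secret
    u = blind(requester_elem, a)                     # round 1: requester -> other
    w, v = blind(u, b), blind(other_elem, b)         # round 2: other -> requester
    return w == blind(v, a)                          # x^(ab) == y^(ba) ?


if __name__ == "__main__":
    rng = random.Random(5)
    here = hash_to_group("grid-cell-42")
    print(same_location(here, hash_to_group("grid-cell-42"), rng))
```

Because the subgroup has prime order, exponentiation by a nonzero secret is a bijection, so the final comparison succeeds exactly when the two group elements are equal.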
15Progress of Talk
- Privacy and Anonymity in Computer Networks [GZ02, ZLLY04a]
- → Incentives in Ad Hoc Networks
  - Finished: Sprite [ZYC03]
  - Under Review: Corsac [ZLLY04b]
- Privacy-Preserving Data Mining [Z03, YZW04]
- Other Interesting Topics [ZY03, AFYZ04]
16Sprite [ZCY03]
- Wireless multi-hop networks are formed by mobile nodes, with no pre-existing infrastructure.
- Nodes depend on other nodes to relay packets.
- A node may have no incentive to forward others' packets.
17Sprite System Architecture
[Figure: architecture with three labeled components: Credit-Clearance System, Internet, wide-area wireless network]
18Big Picture: Saving Receipts
[Figure: a packet is relayed among nodes A, B, C, D, and nodes save receipts (protected by digital signature). Also shown: Credit-Clearance System, Internet, wide-area wireless network.]
19Big Picture: Getting Payment
[Figure: nodes A, B, C, D and a receipt, with the Credit-Clearance System reached over the Internet]
20Our Results on Mobile Ad Hoc Networks
- We designed a simple scheme to stimulate cooperation.
- Cheating cannot increase a player's expected welfare.
- In case of collusion, cheating cannot increase the sum of the colluding players' expected welfares.
- Evaluations have shown that the system has good performance.
21Corsac [ZLLY04b]
- Further game-theoretic study of ad hoc networks.
- Consider routing and packet forwarding as a
complete game.
22Packet Forwarding: Ideal Solution
- A standard solution concept in game theory: dominant action.
  - An action is dominant for a player if it brings that player the maximum utility regardless of other players' actions.
- A protocol is Forwarding-Dominant if forwarding traffic is a dominant action of each node.
- Note that, in Sprite, forwarding packets only maximizes the expected welfare.
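The definition of a dominant action can be checked mechanically on a small game. The sketch below does so for a hypothetical two-node game with made-up payoffs (benefit 3 when the other node forwards your packet, cost 1 for forwarding, optional credit of 2 per forwarded packet paid by the packet's owner). It illustrates the definition only and does not model the ad hoc network setting analyzed in the talk.

```python
from itertools import product


def is_dominant(utility, player, action, action_sets):
    """True if `action` maximizes `player`'s utility against every
    combination of the other players' actions."""
    others = [acts for i, acts in enumerate(action_sets) if i != player]
    for rest in product(*others):
        def profile(own):
            full = list(rest)
            full.insert(player, own)   # put own action back at index `player`
            return tuple(full)
        best = max(utility(player, profile(alt)) for alt in action_sets[player])
        if utility(player, profile(action)) < best:
            return False
    return True


ACTIONS = [("forward", "drop"), ("forward", "drop")]


def utility_no_credit(player, profile):
    """Hypothetical payoffs: benefit 3 if the other forwards, cost 1 to forward."""
    mine, other = profile[player], profile[1 - player]
    return 3 * (other == "forward") - 1 * (mine == "forward")


def utility_with_credit(player, profile):
    """Same game, but each forwarded packet earns 2 credits paid by its owner."""
    mine, other = profile[player], profile[1 - player]
    return (3 - 2) * (other == "forward") + (2 - 1) * (mine == "forward")
```

In this toy game, dropping is dominant without credit and forwarding becomes dominant once the credit exceeds the forwarding cost.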
23Difficulties in Incentive-Compatible Design
- Although [AE03] claims to have found a dominant-action solution, we show that forwarding-dominant protocols do NOT really exist!
- There are also other difficulties in incentive-compatible design of ad hoc networks, e.g., determining link costs requires the help of neighbors.
24New Solution Concept: Cooperation-Optimal Protocols
- We propose a new solution concept for the entire game of routing and forwarding.
- A protocol is cooperation-optimal if
  - its routing protocol is a dominant-action solution to the routing subgame;
  - for ANY routing decision generated by actions designated in the routing protocol, following its forwarding protocol is optimal for each node.
25Corsac: A Cooperation-Optimal Protocol
- Corsac: a Cooperation-optimal routing-and-forwarding protocol in wireless ad-hoc networks using cryptographic techniques
- Several cryptographic techniques are used in designing Corsac.
- We are able to show that Corsac is cooperation-optimal.
26Open Question for Incentive-Compatible Ad Hoc
Networks
- Are there any alternative solution concepts (which are both feasible and useful)?
- Can we significantly reduce the communication overhead? (All existing solutions/pseudo-solutions need too much communication.)
27Progress of Talk
- Privacy and Anonymity in Computer Networks [GZ02, ZLLY04a]
- Incentives in Ad Hoc Networks [ZYC03, ZLLY04b]
- → Privacy-Preserving Data Mining
  - Tech Report: Privacy-Preserving Mining of Frequent-itemset [Z03]
  - Under Review: Privacy-Preserving Classification without Loss of Accuracy [YZW04]
- Other Interesting Topics [ZY03, AFYZ04]
28Privacy-Preserving Mining of Frequent Itemsets [Z04]
- Association rule: Milk ⇒ Cereal.
  - {Milk, Cereal} is frequent (i.e., |{Milk, Cereal}| is large).
  - |{Milk, Cereal}| / |{Milk}| is close to 1.
- The key technical problem in association-rule mining is to find frequent itemsets.
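The two conditions on the rule Milk ⇒ Cereal are the usual support and confidence measures. A small sketch on a made-up transaction list (the data and function names are illustrative assumptions, and nothing here is privacy-preserving):

```python
def support(transactions, itemset):
    """Number of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t))


def confidence(transactions, lhs, rhs):
    """support(lhs and rhs together) / support(lhs) for the rule lhs => rhs."""
    base = support(transactions, lhs)
    if base == 0:
        raise ValueError("left-hand side never occurs")
    return support(transactions, set(lhs) | set(rhs)) / base


# Hypothetical market-basket data.
TRANSACTIONS = [
    {"milk", "cereal"},
    {"milk", "cereal", "bread"},
    {"milk", "cereal"},
    {"milk", "eggs"},
    {"bread"},
]

if __name__ == "__main__":
    print(support(TRANSACTIONS, {"milk", "cereal"}))        # 3
    print(confidence(TRANSACTIONS, {"milk"}, {"cereal"}))   # 0.75
```

Finding all itemsets whose support exceeds a threshold is the expensive step; confidence then follows from two support counts.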
29Privacy in Distributed Mining
- Distributed Mining
  - Two (or more) miners.
  - Each miner holds a portion of a database.
  - Goal: Jointly mine the entire database.
- Privacy: Each miner learns nothing about others' data, except the output.
30Vertical Partition: Weakly Privacy-Preserving Algorithm
- Vertical partition: Each miner holds a subset of the columns.
- Algorithm provides weak privacy: only the support count (# of appearances of the candidate itemset) is revealed.
- Computational overhead: Linear in # of transactions.
  - Previous solution has a quadratic overhead.
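To see what quantity the two miners need in the vertically partitioned case: if each miner marks, per transaction, whether its own columns of the candidate itemset are all present, the support count is the scalar product of the two 0/1 vectors. The sketch below computes it in the clear on made-up data, purely to show the target of the computation; it provides no privacy, and the names are assumptions.

```python
def indicator(rows, columns):
    """1 for each transaction that contains every item in `columns`, else 0."""
    return [int(all(row[c] for c in columns)) for row in rows]


def support_count(vector_a, vector_b):
    """Scalar product of two 0/1 vectors, linear in the number of transactions."""
    return sum(a * b for a, b in zip(vector_a, vector_b))


# Hypothetical data: miner A holds the "milk" column, miner B the "cereal" column.
MINER_A = [{"milk": 1}, {"milk": 1}, {"milk": 1}, {"milk": 0}, {"milk": 1}]
MINER_B = [{"cereal": 1}, {"cereal": 1}, {"cereal": 0}, {"cereal": 1}, {"cereal": 1}]

if __name__ == "__main__":
    xa = indicator(MINER_A, ["milk"])
    xb = indicator(MINER_B, ["cereal"])
    print(support_count(xa, xb))   # 3
```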
31Vertical Partition: Strongly Privacy-Preserving Algorithm
- Algorithm provides strong privacy: no information (except the output) is revealed.
- Computational overhead: Also linear in # of transactions.
  - Slightly more expensive than the weakly privacy-preserving algorithm.
32Horizontal Partition
- Horizontal partition: Each miner holds a subset of rows.
- Computational overhead: Still linear in # of transactions.
- Works for two or more parties.
  - Previous solution only works for three or more parties.
33Progress of Talk
- Privacy and Anonymity in Computer Networks [GZ02, ZLLY04a]
- Incentives in Ad Hoc Networks [ZYC03, ZLLY04b]
- Privacy-Preserving Data Mining
  - Tech Report: Privacy-Preserving Mining of Frequent-itemset [Z03]
  - → Under Review: Privacy-Preserving Classification without Loss of Accuracy [YZW04]
- Other Interesting Topics [ZY03, AFYZ04]
34Privacy-Preserving Classification without Loss of Accuracy [YZW04]
- Scenario: A large number of customers, each holding a piece of data.
- Problem: Learn classification rules on their data.
- Naïve solution: Survey customers to collect data, and then learn rules from the collected data.
- Why doesn't it work? Because we have a privacy requirement: the sensitive attributes of these customers need to be protected.
35Previous Solutions
- Previous solutions use randomization techniques in the survey.
- Basic idea: Change the survey question with a (small) probability.
- Example: To survey for customers' marital status (which is sensitive), ask "Are you married?" with probability 95%, and "Are you single?" with probability 5%.
  - The miner cannot decide whether a customer is married from the answer alone.
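The randomized survey still lets the miner estimate the overall fraction of married customers. Assuming every customer is either married or single and answers the asked question truthfully, the expected "yes" rate is p·π + (1−p)·(1−π), where p is the probability of asking "Are you married?" and π is the true married fraction, so π can be solved for. A sketch with hypothetical function names:

```python
import random


def randomized_answer(is_married, p_direct, rng):
    """Answer 'yes'/'no' (True/False) to a randomly chosen question.

    With probability p_direct the question is "Are you married?",
    otherwise it is "Are you single?". The miner sees only the answer.
    """
    asked_married = rng.random() < p_direct
    return is_married if asked_married else not is_married


def estimate_married_fraction(yes_rate, p_direct):
    """Invert yes_rate = p*pi + (1-p)*(1-pi) to recover pi."""
    if p_direct == 0.5:
        raise ValueError("p_direct = 0.5 makes the answers uninformative")
    return (yes_rate - (1 - p_direct)) / (2 * p_direct - 1)


if __name__ == "__main__":
    rng = random.Random(11)
    truth = [i < 3000 for i in range(10000)]          # 30% married (made up)
    answers = [randomized_answer(t, 0.95, rng) for t in truth]
    print(estimate_married_fraction(sum(answers) / len(answers), 0.95))
```

The closer p is to 0.5, the less each answer reveals, but the noisier the estimate becomes for a given number of customers.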
36Tradeoff
- The essence of the previous solution is to trade accuracy for privacy.
  - The more each customer's private information is protected, the less accurate the result the miner obtains.
  - Conversely, the more accurate the result is, the less privacy the customers have.
- Question: Is it possible to protect customer privacy without loss of accuracy?
37Our Solution
- We propose a privacy-preserving method to compute joint frequencies of values (or tuples of values) in the customers' data, conditioned on the class value.
  - No info about the sensitive data is revealed.
- Compared with general-purpose cryptographic protocols, this method
  - requires no interaction between customers;
  - each customer only needs to send a single flow of communication to the data miner;
  - but we still have a cryptographically strong guarantee of privacy.
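The slides do not describe the method's internals. One way a frequency can be computed from a single message per customer is an ElGamal-style sum: each customer masks g^(bit) with a factor that cancels only when all messages are multiplied together. The sketch below is an assumption-laden illustration of that idea over a toy group, not a statement of the YZW04 construction. The names are hypothetical, customers' public keys are assumed to be published beforehand, and `bit` stands for "this customer's data matches the value and class being counted".

```python
import random
from math import prod

# Toy group (assumption; insecure size): G generates the order-Q subgroup mod P.
P, Q, G = 2039, 1019, 4


def customer_keys(rng):
    """Per-customer private pair (x, y) and public pair (G^x, G^y)."""
    x, y = rng.randrange(1, Q), rng.randrange(1, Q)
    return (x, y), (pow(G, x, P), pow(G, y, P))


def combine_public(public_keys):
    """X = product of all G^x_i, Y = product of all G^y_i."""
    return (prod(px for px, _ in public_keys) % P,
            prod(py for _, py in public_keys) % P)


def customer_message(bit, private, X, Y):
    """The single flow sent to the miner: (G^bit * X^y, Y^x)."""
    x, y = private
    return (pow(G, bit, P) * pow(X, y, P)) % P, pow(Y, x, P)


def miner_count(messages):
    """Multiply everything; the masks cancel and G^(sum of bits) remains."""
    m_all = prod(m for m, _ in messages) % P
    h_all = prod(h for _, h in messages) % P
    target = (m_all * pow(h_all, -1, P)) % P     # needs Python 3.8+
    for d in range(len(messages) + 1):           # the sum is at most n
        if pow(G, d, P) == target:
            return d
    raise ValueError("no count matches; some message was malformed")
```

The miner learns the exact count, so no accuracy is lost, while an individual message is masked by a factor that depends on every customer's keys. The number of customers must stay below the subgroup order Q for the count to be unambiguous.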
38Application of Our Solution
- Using the above proposed method, we can construct
  - a privacy-preserving algorithm for naïve Bayes learning;
  - a privacy-preserving algorithm for ID3 tree learning;
  - a privacy-preserving algorithm for association rule learning.
- The privacy guarantee and efficiency vary for different applications.
39Next Step in Privacy-Preserving Data Mining
- Is there an efficient protocol for privately computing the scalar product of integer vectors?
  - A central question in many PPDM problems.
- Can we add economic considerations to privacy-preserving data mining?
40Progress of Talk
- Privacy and Anonymity in Computer Networks [GZ02, ZLLY04a]
- Incentives in Ad Hoc Networks [ZYC03, ZLLY04b]
- Privacy-Preserving Data Mining [Z03, YZW04]
- → Other Interesting Topics
  - Finished: Verifiable Distributed Oblivious Transfer and Mobile-Agent Security [ZY03] (covered in last talk; skipped)
  - Finished: Data Entanglement [AFYZ04]
41Data Entanglement [AFYZ04]
- Question: Suppose you store your data on a remote server. How do you ensure that it is not corrupted by the server?
- Answer: Have your data entangled with that of some VIPs, such that
  - corruption of your data ⇒ corruption of theirs.
- Ideally, we should have All-or-Nothing Integrity (AONI): with high probability,
  - one user cannot recover data ⇒ no user can recover data.
42Basic Framework of Models
43Models: Recovery Algorithms and Adversaries
- What recovery algorithms do users have?
  - All users use a standard-recovery algorithm provided by the system designer.
  - All users use a public-recovery algorithm provided by the adversary.
  - Each individual uses a private-recovery algorithm provided by the adversary.
- What kind of adversary?
  - Destructive adversary that reduces the entropy of the data store
  - Arbitrary adversary
44Our Results
- AONI is possible in the standard recovery model.
- AONI is impossible in the public/private recovery models with a general adversary.
- AONI is possible in the private recovery model with an entropy-reducing adversary.
45Open Question in Data Entanglement
- Is there a model
  - in which AONI is possible, but
  - which has weaker assumptions than the standard recovery model?
- Can we extend the model to consider multiple rounds of operations?
46THANK YOU