Privacy-preserving Anonymization of Set Value Data

Manolis Terrovitis, Nikos Mamoulis (University of Hong Kong); Panos Kalnis (National University of Singapore)

Transcript and Presenter's Notes


1
Privacy-preserving Anonymization of Set Value Data
Manolis Terrovitis, Nikos Mamoulis (University of Hong Kong)
Panos Kalnis (National University of Singapore)
www.comp.nus.edu.sg/kalnis
2
Motivation
Helen: 0% Milk, Beer, Pregnancy test
  • Attacker can see up to m items
  • Any m items
  • No distinction between sensitive and
    non-sensitive items

3
Motivation (cont.)
Attacker: Find all transactions that contain Beer and 0% Milk

Original
  t1: Beer, 0% Milk, Pregnancy test
  t2: Cola, Cheese
  t3: 2% Milk, Coffee
  ...
  tn: Wine, Beer, Full-fat Milk

Published
  t1: Beer, Milk, Pregnancy test
  t2: Cola, Cheese
  t3: Milk, Coffee
  ...
  tn: Wine, Beer, Milk
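
The attack on this slide can be sketched in a few lines. This is only an illustration: it assumes the four listed transactions are the whole database, reads the item names as 0% Milk and 2% Milk, and assumes the published version replaces every milk variant with Milk. The helper name `matching` is made up for the example.

```python
# Original transactions from the slide (t1, t2, t3, tn).
original = {
    "t1": {"Beer", "0% Milk", "Pregnancy test"},
    "t2": {"Cola", "Cheese"},
    "t3": {"2% Milk", "Coffee"},
    "tn": {"Wine", "Beer", "Full-fat Milk"},
}

# Assumed published version: every milk variant is generalized to "Milk".
MILK = {"0% Milk", "2% Milk", "Full-fat Milk"}
published = {
    tid: {"Milk" if item in MILK else item for item in items}
    for tid, items in original.items()
}


def matching(db, known_items):
    """Return ids of transactions that contain every item the attacker knows."""
    return sorted(tid for tid, items in db.items() if known_items <= items)


print(matching(original, {"Beer", "0% Milk"}))  # ['t1']
print(matching(published, {"Beer", "Milk"}))    # ['t1', 'tn']
```

On the original data the attacker's two items single out t1; on the generalized data the same knowledge matches two transactions.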
4
k^m-anonymity
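
The slide text did not survive beyond its title. As a hedged reading of the attacker model above: a database is k^m-anonymous if an attacker who knows up to m items of a transaction cannot narrow it down to fewer than k transactions. A brute-force check might look like the sketch below; the function name and the toy data are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def is_km_anonymous(transactions, k, m):
    """True if every itemset of size <= m that occurs in the data
    is contained in at least k transactions."""
    support = Counter()
    for t in transactions:
        items = sorted(set(t))
        for size in range(1, m + 1):
            for combo in combinations(items, size):
                support[combo] += 1
    return all(count >= k for count in support.values())


# Toy database (not from the slides).
db = [
    {"Beer", "Milk", "Pregnancy test"},
    {"Beer", "Milk"},
    {"Beer", "Milk", "Pregnancy test"},
    {"Beer", "Milk"},
]
print(is_km_anonymous(db, k=2, m=2))  # True
print(is_km_anonymous(db, k=3, m=2))  # False
```

The check enumerates all itemsets of up to m items per transaction, so it is only practical for small m.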
5
Related Work: k-Anonymity [Swe02]
NOT suitable for high-dimensionality

(a) Microdata
Age  ZipCode  Disease
42   25000    Flu
46   35000    AIDS
50   20000    Cancer
54   40000    Gastritis
48   50000    Dyspepsia
56   55000    Bronchitis

(b) 2-anonymous microdata
Age    ZipCode      Disease
42-46  25000-35000  Flu
42-46  25000-35000  AIDS
50-54  20000-40000  Cancer
50-54  20000-40000  Gastritis
48-56  50000-55000  Dyspepsia
48-56  50000-55000  Bronchitis

Quasi-identifier: Age, ZipCode
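
To make the k-anonymity property of the generalized table concrete, here is a minimal sketch that groups rows by the quasi-identifier columns (Age, ZipCode) and checks the group sizes. The function name is illustrative, not taken from [Swe02].

```python
from collections import Counter

# Generalized table from the slide: (Age, ZipCode, Disease).
rows = [
    ("42-46", "25000-35000", "Flu"),
    ("42-46", "25000-35000", "AIDS"),
    ("50-54", "20000-40000", "Cancer"),
    ("50-54", "20000-40000", "Gastritis"),
    ("48-56", "50000-55000", "Dyspepsia"),
    ("48-56", "50000-55000", "Bronchitis"),
]


def is_k_anonymous(rows, qi_indexes, k):
    """True if every quasi-identifier combination is shared by >= k rows."""
    groups = Counter(tuple(row[i] for i in qi_indexes) for row in rows)
    return all(size >= k for size in groups.values())


print(is_k_anonymous(rows, (0, 1), 2))  # True
print(is_k_anonymous(rows, (0, 1), 3))  # False
```

This works because the quasi-identifier is a fixed, small set of columns. In transactional data any combination of items can play that role, which is the high-dimensionality problem the slide points at.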

[Swe02] L. Sweeney. k-Anonymity: A Model for Protecting Privacy. Int. J. of Uncertainty, Fuzziness and Knowledge-Based Systems, 10(5):557-570, 2002.
6
Related Work: l-diversity in Transactions
Requires knowledge of (non)-sensitive attributes
[GTK08] G. Ghinita, Y. Tao, P. Kalnis. On the Anonymization of Sparse High-Dimensional Data. ICDE, 2008.
7
Our Approach: Employs Generalization
Generalization Hierarchy
k = 2, m = 2
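
The hierarchy figure is lost, so the sketch below uses a hypothetical one (a1, a2 under A; b1, b2 under B; A, B under ALL) and an invented four-transaction database to show how a generalization is applied to every transaction. All names here are assumptions for illustration.

```python
# Hypothetical generalization hierarchy: child -> parent.
PARENT = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "A": "ALL", "B": "ALL"}


def ancestors(item):
    """Return the item followed by its ancestors up to the root."""
    chain = [item]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def generalize(transactions, cut):
    """Map every item to the node of `cut` found on its path to the root.
    Items with no node in the cut on that path are left unchanged."""
    result = []
    for t in transactions:
        new_t = set()
        for item in t:
            new_t.add(next((n for n in ancestors(item) if n in cut), item))
        result.append(new_t)
    return result


# Toy database (not from the slides).
db = [{"a1", "b1"}, {"a2", "b1"}, {"a1", "b2"}, {"a2", "b2"}]
generalized = generalize(db, cut={"A", "b1", "b2"})
print([sorted(t) for t in generalized])
# [['A', 'b1'], ['A', 'b1'], ['A', 'b2'], ['A', 'b2']]
```

In the toy database every pair of items occurs only once. After a1 and a2 are both replaced by A, each single item and each pair occurs at least twice, which is what k = 2, m = 2 asks for.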
8
Lattice of Generalizations
9
Count Tree
[Figure: count tree (the node labels 2, 3, 2, 2 survive from the diagram)]
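
Since the diagram is lost, here is a simplified sketch of the idea as a prefix tree that stores, for every itemset of up to m items, how many transactions contain it. It orders items lexicographically and does not insert generalized items, which may differ from the structure on the slide. The function names and the data are assumptions.

```python
from itertools import combinations


def build_count_tree(transactions, m):
    """Nested-dict trie: each node is {"count": int, "children": {item: node}}.
    The path from the root to a node spells a sorted itemset of size <= m,
    and the node's count is the number of transactions containing it."""
    root = {"count": len(transactions), "children": {}}
    for t in transactions:
        items = sorted(set(t))
        for size in range(1, m + 1):
            for combo in combinations(items, size):
                node = root
                for item in combo:
                    node = node["children"].setdefault(
                        item, {"count": 0, "children": {}})
                node["count"] += 1
    return root


def support(tree, itemset):
    """Look up the count of an itemset; 0 if it never occurs."""
    node = tree
    for item in sorted(itemset):
        node = node["children"].get(item)
        if node is None:
            return 0
    return node["count"]


# Toy database (not from the slides).
db = [{"A", "b1"}, {"A", "b1"}, {"A", "b2"}, {"A", "b2"}]
tree = build_count_tree(db, m=2)
print(support(tree, {"A"}))         # 4
print(support(tree, {"A", "b1"}))   # 2
print(support(tree, {"b1", "b2"}))  # 0
```

Any node whose count falls below k marks an itemset that would let an attacker narrow down a transaction too far.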
10
Optimal Algorithm
11
Direct Anonymization
  • Solves each problem independently

COUNT(a1,a2) = 1
12
Apriori-based Anonymization
  • Construct the count-tree incrementally
  • Prune unnecessary branches

13
Small Datasets (2-15K, BMS-WebView2)
  • |I| = 40..60, k = 100, m = 3

14
Small Datasets (BMS-WebView2)
  • |D| = 10K, k = 100, m = 1..4

15
Apriori Anonymization for Large Datasets
|D|   |I|
515K  1657
59K   497
77K   3340
  • k = 5
  • m = 3

16
Points to Remember
  • Anonymization of Transactional Data
      • Attacker knows m items
      • Any m items can be the quasi-identifier
  • Global recoding method
      • Optimal solution too slow
      • Apriori Anonymization: fast, with low information loss
  • On-going work
      • Local recoding (sort by Gray order and partition)
      • Transactional data in streaming environments

17
Bibliography on LBS Privacy
  • http://anonym.comp.nus.edu.sg