Privacy-Preserving Databases and Data Mining - PowerPoint PPT Presentation

Slides: 19
Provided by: peopleSab
Transcript and Presenter's Notes

1
Privacy-Preserving Databases and Data Mining
  • Yücel SAYGIN
  • ysaygin_at_sabanciuniv.edu
  • http://people.sabanciuniv.edu/ysaygin/

2
Outline
  • Privacy: an informal discussion
  • Overview of data mining
  • Overview of privacy preserving databases and data
    mining
  • Privacy preserving data mining
  • Privacy protection against data mining
  • Privacy preserving databases
  • Future research directions

3
Privacy: What, Why, and How
  • Privacy: giving people the right to be left
    alone
  • It is one of the fundamental rights of people in
    Western civilizations
  • Privacy of data: giving data owners the right
    to say what can be done with their data

4
Is data privacy something new?
  • Privacy has been one of the fundamental rights of
    people
  • Maybe termed differently but it has been studied
    in the past
  • Statistical databases, statistical disclosure
    control
  • The inference problem

5
Why is privacy a really big issue these days?
  • Technology is deeply integrated into our personal
    lives
  • New technology: networking, the Web
  • New devices: mobile phones, RFID tags, computers,
    digital cameras
  • This means that data about us, and about what we
    are doing, can be collected easily and at a
    fraction of what it cost 10 years ago.
  • Navigation patterns on the Web
  • Location information (wireless phones, RFID tags)
  • Transactions (e-commerce, POS)
  • Your emails (now scanned by Gmail to display
    ads; this was a big discussion at the CFP conference
    at Berkeley this year)

6
Why is privacy a really big issue these days?
CAPPS II (Computer Assisted Passenger
Prescreening System) collects flight reservation
information as well as commercial information
about passengers. This data, in turn, can be
utilized by government security agencies.
Although CAPPS II is a US national data
collection effort, it also affects
other countries.

7
Why is privacy a really big issue these days?
The following sign at the KLM ticket desk in
Amsterdam International Airport demonstrates the
point: "Please note that KLM Royal Dutch
Airlines and other airlines are required by new
security laws in the US and several other
countries to give security, customs and
immigration authorities access to passenger data.
Accordingly any information we hold about you and
your travel arrangements may be disclosed to the
concerning authorities of these countries in your
itinerary."

8
Why is privacy a really big issue these days?
Some of the largest airline companies in the US,
including American, United and Northwest, turned
over millions of passenger records to the FBI.
Schwartz J., Micheline M. (2004). Airlines
Gave F.B.I. Millions of Records on Travelers
After 9/11. NY Times, May 1.

9
Why is privacy a really big issue these days?
The Total Information Awareness (TIA) project in the US,
which aimed to build a centralized database that
would store the credit card transactions, emails,
web site visits, and flight details of Americans, was
not funded by Congress due to privacy
concerns.

10
Why is privacy a really big issue these days?
  • Data about us is being collected and stored
    somewhere
  • We need to have the right to control:
  • what data is collected about us,
  • how long it should be stored,
  • who is going to see it,
  • and how it is going to be used

11
But we have all this security research going on
for decades!
  • Security (database, network, etc.) is necessary but
    not sufficient to ensure full privacy.
  • Once someone has access to the data, what can be
    done with it (e.g. giving your email to a third
    party, giving away your profile, shopping
    behavior, etc.) needs to be regulated.

12
Some of the past research in the context of
security is useful for data privacy
  • Disclosure Control in statistical databases
  • The inference problem and proposed solutions
  • Encryption techniques
  • Secure multi-party computation
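
As an illustration of the last item, here is a minimal sketch of a secure sum built on additive secret sharing, one of the simplest secure multi-party computation building blocks. It is not taken from the slides: the modulus, the function names, and the example values are assumptions made for illustration, the parties are simulated in one process, and everyone is assumed to follow the protocol honestly.

```python
import random

# Hypothetical public modulus; it must be larger than any possible total.
MODULUS = 2**61 - 1


def make_shares(secret, n_parties, rng):
    """Split a non-negative `secret` into n shares that sum to it mod MODULUS."""
    shares = [rng.randrange(MODULUS) for _ in range(n_parties - 1)]
    last = (secret - sum(shares)) % MODULUS
    return shares + [last]


def secure_sum(private_values, rng=None):
    """Simulate one secure-sum round; each list entry belongs to one party."""
    rng = rng or random.Random()
    n = len(private_values)
    # Party i splits its value and sends share j to party j.
    all_shares = [make_shares(v, n, rng) for v in private_values]
    # Party j adds the shares it received and publishes only that partial sum.
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % MODULUS for j in range(n)
    ]
    # The partial sums add up to the total without exposing any single input.
    return sum(partial_sums) % MODULUS


# Made-up example: three sites each hold a private count.
total = secure_sum([120, 45, 300], random.Random(0))
```

Each individual share is a uniformly random number, so a party that sees only the shares sent to it learns nothing about another party's input; only the final total is revealed.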

13
So what do Data Mining and Databases have to do with
Privacy?
  • They deal with data that is mostly about people.
    Therefore we need to integrate privacy into
    database systems and data mining tools.
  • Data mining is seen as a magic tool that can find
    secret information in piles of data, so the
    public is somewhat wary of data
    mining
  • This is partially true
  • But these are just tools designed by human beings;
    they need good training data and experts to
    interpret the results.

14
Data Mining and Privacy Issues Gained Momentum in
the US
  • The Pentagon has released a study recommending
    that the government pursue specific technologies as
    potential safeguards against the misuse of
    data-mining systems similar to those now being
    considered by the government to track civilian
    activities electronically in the United States
    and abroad.
  • "Perhaps the strongest protection against
    abuse of information systems is strong audit
    mechanisms; we need to watch the watchers"
  • Markoff J. (2002). Study Seeks
    Technology Safeguards for Privacy. NY Times, 19
    December.
  • This shows us that even the most aggressive data
    collectors in the US are aware that
    data mining tools could be misused and that we
    need a mechanism to protect the confidentiality
    and privacy of people.

15
Privacy Issues Gained Momentum among Researchers
  • More research funding
  • More projects
  • More sessions on privacy in database and data
    mining conferences
  • Search Google for "privacy data mining" and you will
    get pages of results. It was not like that 3-4
    years ago.
  • Centers for privacy research (IBM Almaden,
    Stanford, Purdue Univ.)

16
Overview of Data Mining
  • Data mining is a combination of statistics
  • Data mining models
  • Patterns (associations, sequences,)
  • Clusters
  • Classification

17
Privacy preserving data mining
  • Privacy preserving classification model
    construction
  • Privacy preserving data clustering
  • Privacy preserving association rule mining

18
Privacy preserving classification
  • Reference: Rakesh Agrawal and Ramakrishnan
    Srikant. Privacy-Preserving Data Mining.
    SIGMOD, 2000, Dallas, TX.
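
The cited paper is known for value perturbation: individuals add random noise to sensitive values before releasing them, and the miner works from the perturbed data. The sketch below shows only that perturbation step, on made-up ages with an assumed uniform noise range, and checks that an aggregate (the mean) survives while individual values are masked. It does not implement the paper's distribution-reconstruction procedure, and all names and parameters are illustrative assumptions.

```python
import random
import statistics


def perturb(values, spread, rng):
    """Return each value plus independent uniform noise in [-spread, +spread]."""
    return [v + rng.uniform(-spread, spread) for v in values]


rng = random.Random(7)

# Made-up sensitive attribute: ages of 10,000 hypothetical respondents.
true_ages = [rng.randint(18, 80) for _ in range(10_000)]

# Only the perturbed values would be handed to the data miner.
released = perturb(true_ages, spread=30, rng=rng)

# The noise has zero mean, so averages over many records stay close.
true_mean = statistics.fmean(true_ages)
released_mean = statistics.fmean(released)
mean_gap = abs(true_mean - released_mean)
```

A single released value can be off by up to 30 years from the true age, yet with this many records `mean_gap` should be small, which is the intuition behind building models from perturbed data.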