0wning Y0ur Inb0x: Attacks on Spam Filtering

1
0wning Y0ur Inb0x: Attacks on Spam Filtering
  • Marco Barreno, Jack Chi, Anthony Joseph, Blaine
    Nelson, Benjamin Rubinstein, Udam Saini, Charles
    Sutton, Doug Tygar, Kai Xia

2
Motivation for SecML
  • Many security-sensitive applications use adaptive
    learning techniques
  • Using learning techniques in these systems
    introduces new security vulnerabilities
  • Learning techniques can be misled by malicious
    data
  • How much of a threat is this new adversary?
  • How hard is an attack for the adversary?
  • Are there defenses against these threats?

3
Big Picture
[Diagram labels: high-level spec, compiler, low-level spec, instrumentation backplane, director, policy-aware switching, log mining, training data, performance and power consumption models, new apps, equipment, global policies (e.g. SLA), offered load, resource utilization, etc.]
4
Scope of this Talk
  • Goals
  • Develop attacks against spam filters
  • Suggest need for better filters
  • Our Previous Work
  • Taxonomy of attack strategies
  • Analysis of simple learners
  • Concrete attacks on real systems

5
Taxonomy of Attacks
  • Criteria to qualify attacks and design new systems
  • Type of Security Violation
  • Integrity or Availability
  • Attacker's Influence
  • Causative or Exploratory
  • Specificity of the Attack
  • Ranges from Targeted to Indiscriminate

6
A Different Type of Attack
  • Causative Availability attacks

[Diagram labels: Adversary, Attack Corpus, Contamination, Email Distribution, Training Corpus, Training, SpamBayes Filter, New Ham Email, INBOX, Spam Folder]
7
Our Target: SpamBayes
8
SpamBayes Overview
  • SpamBayes is a statistical spam filter
  • Unigram word frequency model
  • Naïve Bayes assumption of independence
  • Uses a chi-square test to score email
  • Thresholds spam score to label messages as ham,
    unsure, or spam
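
The thresholding step in the last bullet can be sketched in Python as follows. The cutoff values 0.20 and 0.90, the function name label_message, and the handling of scores that fall exactly on a cutoff are assumptions for illustration; the slide does not state them.

```python
# Assumed cutoffs for illustration; the slide does not give the values.
HAM_CUTOFF = 0.20
SPAM_CUTOFF = 0.90


def label_message(score, ham_cutoff=HAM_CUTOFF, spam_cutoff=SPAM_CUTOFF):
    """Map a spam score in [0, 1] to 'ham', 'unsure', or 'spam'."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    if score < ham_cutoff:
        return "ham"
    if score < spam_cutoff:
        return "unsure"
    return "spam"
```

An availability attack only needs to push ham scores above the lower cutoff to make the filter annoying, since "unsure" messages still require manual review.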

9
SpamBayes Token Probabilities
  • SpamBayes estimates the probability of spam given
    each token w
  • Uses Bayes' Rule to compute this probability
  • Each estimate is smoothed
  • Uses Bayesian smoothing to compute a score
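
A minimal sketch of a smoothed per-token estimate, written from the description above rather than taken from the SpamBayes source. The function name, the prior of 0.5, and the smoothing strength of 0.45 are assumptions; the formulas on the slide did not survive in the transcript.

```python
def token_spam_probability(spam_count, ham_count, n_spam, n_ham,
                           prior=0.5, strength=0.45):
    """Smoothed estimate of P(spam | token).

    spam_count / ham_count: number of spam / ham messages containing the token.
    n_spam / n_ham: total number of spam / ham training messages.
    prior, strength: assumed smoothing parameters (hypothetical defaults).
    """
    n = spam_count + ham_count
    if n == 0:
        return prior  # unseen token: fall back to the prior belief
    spam_ratio = spam_count / n_spam if n_spam else 0.0
    ham_ratio = ham_count / n_ham if n_ham else 0.0
    # Raw estimate from Bayes' Rule, normalizing for class sizes.
    raw = spam_ratio / (spam_ratio + ham_ratio)
    # Shrink the raw estimate toward the prior; rare tokens shrink the most.
    return (strength * prior + n * raw) / (strength + n)
```

Under these assumptions a token seen only in spam never reaches exactly 1.0, and a token seen once moves far less than a token seen many times.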

10
SpamBayes Combining Probabilities
  • SpamBayes treats each f(Spam | w) as an independent
    test for spam
  • Tests are combined with Fisher's test
  • H and S combine into a spam score
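
The combination step might look like the following sketch. It assumes the usual form of Fisher's method (a chi-square test on -2 times the summed log probabilities, with 2n degrees of freedom) and assumes the final score is (1 + S - H) / 2. The slide's own formula is not in the transcript, and all names here are illustrative.

```python
import math


def chi2_survival(x2, dof):
    """P(X >= x2) for a chi-square variable with an even number of dof."""
    assert dof > 0 and dof % 2 == 0
    m = x2 / 2.0
    term = total = math.exp(-m)
    for i in range(1, dof // 2):
        term *= m / i
        total += term
    return min(total, 1.0)


def spam_score(token_probs):
    """Combine per-token spam probabilities into one score in [0, 1]."""
    n = len(token_probs)
    if n == 0:
        return 0.5  # no evidence either way
    eps = 1e-12  # keep log() finite for probabilities of exactly 0 or 1
    probs = [min(max(p, eps), 1.0 - eps) for p in token_probs]
    ln_h = sum(math.log(p) for p in probs)
    ln_s = sum(math.log(1.0 - p) for p in probs)
    s_stat = 1.0 - chi2_survival(-2.0 * ln_s, 2 * n)  # evidence for spam
    h_stat = 1.0 - chi2_survival(-2.0 * ln_h, 2 * n)  # evidence for ham
    return (1.0 + s_stat - h_stat) / 2.0
```

With this form, a message whose tokens all score near 0.5 lands at 0.5, while uniformly spammy or hammy tokens push the score toward 1 or 0.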

11
Our Email Datasets
  • Ham email from the Enron dataset
  • Email made public in trial
  • Total of 92,189 messages (52,790 spam, 39,399 ham)

12
Our Attacks against SpamBayes
13
Targeted Attack
  • Goal: cause a particular target email to be
    misclassified as spam
  • Example

Dear Sir, I wanted to make you aware that your son was absent from school today. Sincerely, S. Skinner

Dear Sir, Sincerely school was I son S. you aware that make your to Skinner wanted absent from today.

Dear Sir, I wanted to make you aware that your son was absent from school today. Sincerely, S. Skinner
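
The scrambled message in the example can be produced by a few lines of Python. This is a sketch, assuming the attacker knows (or guesses) some fraction of the target's words and sends them in a message that the victim's filter will train on as spam. The tokenizer, the known_fraction parameter, and the function names are hypothetical.

```python
import random


def tokenize(text):
    """Very rough tokenizer: lowercase words with '.' and ',' stripped."""
    return {w.strip(".,").lower() for w in text.split()} - {""}


def build_attack_email(target_text, known_fraction=1.0, seed=0):
    """Return a shuffled body made of tokens the attacker expects in the target.

    known_fraction models partial knowledge: 1.0 means every token is known.
    """
    rng = random.Random(seed)
    tokens = sorted(tokenize(target_text))
    k = int(known_fraction * len(tokens))
    return " ".join(rng.sample(tokens, k))


target = ("Dear Sir, I wanted to make you aware that your son was absent "
          "from school today. Sincerely, S. Skinner")
attack_body = build_attack_email(target)
```

Once such messages are trained as spam, every token in the real email carries a higher spam score when it finally arrives.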
14
Targeted Attack Results
15
Targeted Attack Results
16
Dictionary Attack
  • Goal: make the spam filter unusable by causing it to
    classify ham as spam
  • Example
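
The slide's own example is not in the transcript. As a stand-in, here is a toy simulation of how training on attack messages that contain a whole word list can raise the score of ordinary ham words. TinyFilter is a hypothetical token-count model with assumed smoothing parameters, not the SpamBayes implementation, and the five-word "dictionary" is only a placeholder for a real word list.

```python
from collections import Counter


class TinyFilter:
    """Toy per-token spam estimator (illustrative only)."""

    def __init__(self):
        self.n_spam = self.n_ham = 0
        self.spam_counts = Counter()
        self.ham_counts = Counter()

    def train(self, tokens, is_spam):
        tokens = set(tokens)  # count each token once per message
        if is_spam:
            self.n_spam += 1
            self.spam_counts.update(tokens)
        else:
            self.n_ham += 1
            self.ham_counts.update(tokens)

    def token_score(self, token, prior=0.5, strength=0.45):
        s, h = self.spam_counts[token], self.ham_counts[token]
        n = s + h
        if n == 0:
            return prior
        spam_ratio = s / self.n_spam if self.n_spam else 0.0
        ham_ratio = h / self.n_ham if self.n_ham else 0.0
        raw = spam_ratio / (spam_ratio + ham_ratio)
        return (strength * prior + n * raw) / (strength + n)


dictionary = ["meeting", "school", "budget", "report", "lunch"]  # placeholder
f = TinyFilter()
for _ in range(20):
    f.train(["meeting", "budget", "report"], is_spam=False)
for _ in range(20):
    f.train(["viagra", "winner"], is_spam=True)

before = f.token_score("meeting")
for _ in range(5):
    f.train(dictionary, is_spam=True)  # attack messages, trained as spam
after = f.token_score("meeting")
```

In this toy setup the score of "meeting" rises after only five attack messages, because the word now appears in spam training data for the first time.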

17
Dictionary Attack Results
18
Distribution-based Attacks
  • Problems
  • Targeted attacks require specific information
    about the desired target
  • Dictionary attacks require huge emails (100,000
    words per message)
  • Using partial information (distribution of words
    used) we can reduce the size of messages
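
One possible reading of the last bullet, sketched in Python: instead of sending an entire dictionary, the attacker ranks words by how often they appear in sample text believed to resemble the victim's mail and keeps only the top few. The function name, the sample corpus, and the ranking rule are assumptions, not the authors' stated method.

```python
from collections import Counter


def attack_vocabulary(sample_texts, size):
    """Pick the `size` words that occur in the most sample messages."""
    counts = Counter()
    for text in sample_texts:
        counts.update(set(text.lower().split()))
    # Sort by descending message count, then alphabetically for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:size]]


samples = ["the meeting is today", "the budget meeting", "the report"]
small_attack = attack_vocabulary(samples, 2)
```

The size parameter captures the trade-off on the slide: a smaller vocabulary means smaller attack messages but fewer of the victim's words are covered.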

19
The Specificity Spectrum
  • Ranges from targeted to indiscriminate
  • Represents the amount of knowledge the attacker has
    about the victim's email
  • Targeted, Distribution, Dictionary Attacks

20
Distribution-based Attacks
21
Real World Attacks
  • Attacks or Failed Spam?

Sent 11/27/06
Sent 7/16/07
Sent 7/22/07
brocade crown bethought chimney. angelo
asphyxiate brad abase decompression codebreak.
crankcase big conjuncture chit contention acorn
cpa bladderwort chick. cinematic agleam
chemisorb brothel choir conformance airfield.
calvert dawson blockage card. coercion
choreograph asparagine bonnet contrast bloop.
coextensive bodybuild bastion chalkboard
denominate clare churchgo compote
act. childhood ardent brethren commercial
complain concerto depressor.
"what, is he coming home, and without poor
lydia?" she cried. "sure he will not leave london
Sent 11/28/06
"i am quite sorry, lizzy, that you should be
forced to have that disagreeable man all to
yourself.
Glosses for words in the sample above: asparagine (an amino acid), compote (a fruit dish), bladderwort (a freshwater plant), chemisorb (to chemically bind to)
22
Conclusions & Future Work
23
Future Work
  • Further Analyses
  • More realistic variants of our attacks
  • Better understanding of attacks
  • Quantifying the work function of the attacker
  • Defenses against attacks
  • Designing filters with an attacker in mind
  • Design strategies for general purpose ML

24
Data Collaboration
  • Expanding the scope of our work
  • Other spam filters
  • Other learning domains
  • Other datasets
  • We are interested in industry collaboration!

25
Conclusion
  • We Explored Availability Attacks
  • Successfully targeted a specific ham message
  • Successfully caused a general DoS attack
  • Ongoing research into distribution-based attacks
  • Ongoing research on defenses

26
Questions?
28
Extra Slides
29
Outline
  • Motivation
  • Our Target: SpamBayes spam filter
  • Our Attacks against SpamBayes
  • Conclusions & Future Work

30
Taxonomy of Attacks
  • Criteria to qualify attacks and design new systems
  • Type of Security Violation
  • Integrity or Availability
  • Attacker's Influence
  • Causative or Exploratory
  • Specificity of the Attack
  • Ranges from Targeted to Indiscriminate

31
Previous Work
  • Other Spam Attacks
  • Kryptonite Word Attack
  • Good Word Attacks
  • ACRE Learning
  • Game Theoretic Approach
  • All are approaches for creating better spam -
    integrity attacks

32
SpamBayes Token Probabilities
  • SpamBayes estimates the probability of spam given
    each token w
  • Each estimate is smoothed

33
Dictionary Attack Results
34
Other Attacks
  • Pseudo-Spam Attack
  • Send ham bodies with spam headers, making header
    tokens less informative
  • Dilution Attack
  • Insert tokens from a target spam into ham emails
    to decrease the tokens' spam scores