0wning Y0ur Inb0x: Attacks on Spam Filtering

Transcript and Presenter's Notes

1
0wning Y0ur Inb0x: Attacks on Spam Filtering

Marco Barreno, Jack Chi, Anthony Joseph, Blaine
Nelson, Benjamin Rubinstein, Udam Saini, Charles
Sutton, Doug Tygar, Kai Xia

2
Motivation for SecML

Many security-sensitive applications use adaptive
learning techniques
Using learning techniques in these systems
introduces new security vulnerabilities
Learning techniques can be misled by malicious
data
How much of a threat is this new adversary?
How hard is an attack for the adversary?
Are there defenses against these threats?

3
Big Picture
[Figure: system architecture diagram. Labels: high-level spec, compiler, low-level spec, instrumentation backplane, director, policy-aware switching, log mining, training data, performance and power consumption models. Inputs: new apps, equipment, global policies (e.g., SLA); offered load, resource utilization, etc.]

4
Scope of this Talk

Goals
Develop attacks against spam filters
Suggest need for better filters
Our Previous Work
Taxonomy of attack strategies
Analysis of simple learners
Concrete attacks on real systems

5
Taxonomy of Attacks

Criteria to qualify attacks and design new systems
Type of Security Violation
Integrity or Availability
Attacker's Influence
Causative or Exploratory
Specificity of the Attack
Ranges from Targeted to Indiscriminate
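
The three axes above can be written down as a small data structure. This is only a sketch: the class and member names are made up for illustration, the one-line glosses in the comments are a reading of the taxonomy rather than text from the slides, and specificity is reduced to its two endpoints even though the slide describes a range.

```python
from dataclasses import dataclass
from enum import Enum


class Violation(Enum):
    INTEGRITY = "integrity"        # assumed gloss: spam gets through as ham
    AVAILABILITY = "availability"  # assumed gloss: ham is blocked as spam


class Influence(Enum):
    CAUSATIVE = "causative"      # assumed gloss: attacker alters training data
    EXPLORATORY = "exploratory"  # assumed gloss: attacker only probes the filter


class Specificity(Enum):
    TARGETED = "targeted"              # one end of the range
    INDISCRIMINATE = "indiscriminate"  # the other end of the range


@dataclass(frozen=True)
class AttackClass:
    """One point in the taxonomy (hypothetical container, not from the talk)."""
    violation: Violation
    influence: Influence
    specificity: Specificity


# The talk places its dictionary attack among the causative availability
# attacks, at the indiscriminate end of the specificity range.
dictionary_attack = AttackClass(
    Violation.AVAILABILITY, Influence.CAUSATIVE, Specificity.INDISCRIMINATE
)
```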

6
A Different Type of Attack

Causative Availability attacks

[Figure: attack flow diagram. Labels: Adversary, Attack Corpus, Contamination, Email Distribution, Training Corpus, Training, SpamBayes Filter, New Ham Email, INBOX, Spam Folder.]

7
Our Target: SpamBayes
8
SpamBayes Overview

SpamBayes is a statistical spam filter
Unigram word frequency model
Naïve Bayes assumption of independence
Uses a chi-square test to score email
Thresholds spam score to label messages as ham,
unsure, or spam
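
The three-way labeling can be sketched as a pair of cutoffs on the spam score. The function name, the cutoff values (0.15 and 0.9), and the handling of scores that fall exactly on a cutoff are assumptions for illustration; the slide does not give them.

```python
def label_message(score, ham_cutoff=0.15, spam_cutoff=0.9):
    """Map a spam score in [0, 1] to 'ham', 'unsure', or 'spam'.

    The cutoffs are illustrative assumptions, not values from the slides.
    """
    if score <= ham_cutoff:
        return "ham"
    if score <= spam_cutoff:
        return "unsure"
    return "spam"


print(label_message(0.05), label_message(0.5), label_message(0.95))
```

An availability attack only needs to push ham scores above the lower cutoff to start degrading the inbox, since "unsure" mail already demands the user's attention.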

9
SpamBayes Token Probabilities

SpamBayes estimates the probability of spam given
each token w
Uses Bayes' Rule to compute each estimate
Each estimate is smoothed
Uses Bayesian smoothing to compute a score
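
The equations for this slide are missing from the transcript. The sketch below shows one common form of the computation, assuming the raw estimate is the token's spam frequency normalized against its ham frequency and that smoothing pulls the estimate toward a prior when the token is rare. The function names and the default prior (0.5) and strength (0.45) are assumptions, not values taken from the slides.

```python
def raw_spam_prob(n_spam_w, n_ham_w, n_spam, n_ham):
    """Raw estimate of P(spam | w) from training counts.

    n_spam_w / n_ham_w: spam / ham messages containing token w.
    n_spam / n_ham: total spam / ham messages trained on.
    """
    spam_ratio = n_spam_w / n_spam if n_spam else 0.0
    ham_ratio = n_ham_w / n_ham if n_ham else 0.0
    return spam_ratio / (spam_ratio + ham_ratio)


def smoothed_score(n_spam_w, n_ham_w, n_spam, n_ham, prior=0.5, strength=0.45):
    """Smoothed token score: a weighted average of the prior and the raw
    estimate, weighted by how often the token was seen (assumed form)."""
    n_w = n_spam_w + n_ham_w
    if n_w == 0:
        return prior  # never-seen token: fall back to the prior
    p = raw_spam_prob(n_spam_w, n_ham_w, n_spam, n_ham)
    return (strength * prior + n_w * p) / (strength + n_w)


print(smoothed_score(10, 0, 100, 100))  # token seen only in spam
print(smoothed_score(0, 0, 100, 100))   # unseen token
```

Under this form, a token that appears in a handful of attacker-supplied spam messages quickly moves from the neutral prior toward a high spam score.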

10
SpamBayes Combining Probabilities

SpamBayes treats each f(Spam|w) as an independent
test for spam
Tests are combined with Fisher's test
H and S combine into a spam score
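
The formulas for H and S did not survive in the transcript. A minimal sketch of Fisher's method applied to token scores follows, assuming S measures spam evidence, H measures ham evidence, and the final score is (1 + S - H) / 2. The function names and this exact combination are assumptions for illustration.

```python
import math


def chi2q(x2, dof):
    """Upper-tail probability P(X >= x2) for chi-square with even dof."""
    assert dof > 0 and dof % 2 == 0
    m = x2 / 2.0
    term = total = math.exp(-m)
    for i in range(1, dof // 2):
        term *= m / i
        total += term
    return min(total, 1.0)


def spam_score(token_scores):
    """Combine per-token scores f(w) into one message score in [0, 1]."""
    n = len(token_scores)
    if n == 0:
        return 0.5  # no evidence either way
    # Clamp away from 0 and 1 so the logarithms stay finite.
    fs = [min(max(f, 1e-12), 1.0 - 1e-12) for f in token_scores]
    ln_h = sum(math.log(f) for f in fs)
    ln_s = sum(math.log(1.0 - f) for f in fs)
    # Fisher's method: -2 * sum(ln p) follows chi-square with 2n dof.
    s = 1.0 - chi2q(-2.0 * ln_s, 2 * n)  # near 1 when tokens look spammy
    h = 1.0 - chi2q(-2.0 * ln_h, 2 * n)  # near 1 when tokens look hammy
    return (1.0 + s - h) / 2.0


print(spam_score([0.99] * 5), spam_score([0.01] * 5), spam_score([0.5] * 4))
```

Because every token contributes to both sums, raising the scores of many ordinary tokens at once shifts the whole message toward spam.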

11
Our Email Datasets

Ham email from the Enron dataset
Email made public in trial
Total of 92,189 messages (52,790 spam, 39,399
ham)

12
Our Attacks against SpamBayes
13
Targeted Attack

Goal: cause a particular target email to be
misclassified as spam
Example

Target email:
Dear Sir, I wanted to make you aware that
your son was absent from school
today. Sincerely, S. Skinner
Attack email (the target's words, reordered):
Dear Sir, Sincerely school was I son S. you
aware that make your to Skinner wanted absent
from today.
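
The example above, where the attack email carries the target's words in a different order, can be sketched as follows. Whitespace tokenization, the function name, and the `known_fraction` parameter (how much of the target the attacker can guess) are hypothetical simplifications, not details from the slides.

```python
import random


def build_targeted_attack(target_text, known_fraction=1.0, seed=0):
    """Build an attack email body from tokens of the target email.

    known_fraction is a hypothetical knob: the share of the target's
    tokens the attacker is assumed to know.
    """
    rng = random.Random(seed)
    tokens = target_text.split()  # simplification: split on whitespace
    k = int(known_fraction * len(tokens))
    # sample() picks k tokens without replacement, in random order.
    return " ".join(rng.sample(tokens, k))


target = ("Dear Sir, I wanted to make you aware that your son was "
          "absent from school today. Sincerely, S. Skinner")
print(build_targeted_attack(target))
```

If the victim's filter trains on this message as spam, the tokens of the real email gain spam evidence before it arrives.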
14
Targeted Attack Results
15
Targeted Attack Results
16
Dictionary Attack

Goal: make the spam filter unusable by causing it to
classify ham as spam
Example
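
The slide's own example is not in the transcript. As a stand-in, the toy simulation below shows how attack emails that contain every dictionary word, once trained as spam, raise the score of an ordinary ham word. The tiny corpus, the word list, the function names, and the smoothing constants (prior 0.5, strength 0.45) are all assumptions for illustration.

```python
from collections import Counter


def token_score(word, spam_counts, ham_counts, n_spam, n_ham,
                prior=0.5, strength=0.45):
    """Smoothed per-token spam score (assumed form and constants)."""
    s_w, h_w = spam_counts[word], ham_counts[word]
    n_w = s_w + h_w
    if n_w == 0:
        return prior
    spam_ratio = s_w / n_spam if n_spam else 0.0
    ham_ratio = h_w / n_ham if n_ham else 0.0
    p = spam_ratio / (spam_ratio + ham_ratio)
    return (strength * prior + n_w * p) / (strength + n_w)


def count_tokens(emails):
    """Count, per token, how many emails contain it."""
    counts = Counter()
    for email in emails:
        counts.update(set(email.split()))
    return counts


def score_before_and_after(word, ham, spam, dictionary, n_attack):
    """Score `word` before and after n_attack dictionary emails are
    added to the spam training set."""
    attack_email = " ".join(dictionary)
    ham_counts = count_tokens(ham)
    before = token_score(word, count_tokens(spam), ham_counts,
                         len(spam), len(ham))
    poisoned = spam + [attack_email] * n_attack
    after = token_score(word, count_tokens(poisoned), ham_counts,
                        len(poisoned), len(ham))
    return before, after


# Toy data, invented for this sketch.
ham = ["meeting budget report", "school report meeting"]
spam = ["viagra lottery", "lottery viagra prize"]
dictionary = ["meeting", "budget", "school", "report", "viagra", "lottery"]

before, after = score_before_and_after("meeting", ham, spam, dictionary, 2)
print(before, after)
```

The attack does not need to know which words the victim uses: covering the whole dictionary raises the score of every one of them.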

17
Dictionary Attack Results
18
Distribution-based Attacks

Problems
Targeted attacks require specific information
about the desired target
Dictionary attacks require huge emails (100,000
words per message)
Using partial information (distribution of words
used) we can reduce the size of messages
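
One way to use partial information is to keep only the words the victim is most likely to use. The sketch below picks the most frequent words from a sample of text assumed to resemble the victim's mail; this selection rule, the function name, and the `budget` parameter are assumptions, since the slide does not describe how the words are chosen.

```python
from collections import Counter


def build_distribution_attack(sample_texts, budget):
    """Return up to `budget` words for an attack email, most frequent first.

    sample_texts: text assumed to resemble the victim's ham (hypothetical
    input); budget: maximum number of words in the attack email.
    """
    counts = Counter(
        word for text in sample_texts for word in text.lower().split()
    )
    return [word for word, _ in counts.most_common(budget)]


sample = [
    "the budget meeting is on monday",
    "please send the budget report",
    "the meeting moved to tuesday",
]
print(build_distribution_attack(sample, budget=3))
```

Compared with a full dictionary, the attack email shrinks to `budget` words while still covering the words most likely to appear in the victim's ham.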

19
The Specificity Spectrum

Ranges from targeted to indiscriminate
Represents the amount of knowledge the attacker has
about the victim's email
Targeted, Distribution, Dictionary Attacks

20
Distribution-based Attacks
21
Real World Attacks

Attacks or Failed Spam?

Sent 11/27/06
Sent 7/16/07
Sent 7/22/07
brocade crown bethought chimney. angelo
asphyxiate brad abase decompression codebreak.
crankcase big conjuncture chit contention acorn
cpa bladderwort chick. cinematic agleam
chemisorb brothel choir conformance airfield.
calvert dawson blockage card. coercion
choreograph asparagine bonnet contrast bloop.
coextensive bodybuild bastion chalkboard
denominate clare churchgo compote
act. childhood ardent brethren commercial
complain concerto depressor.
"what, is he coming home, and without poor
lydia?" she cried. "sure he will not leave london
Sent 11/28/06
"i am quite sorry, lizzy, that you should be
forced to have that disagreeable man all to
yourself.
Glosses for words in the emails above: asparagine (an amino acid), compote (a fruit dish), bladderwort (a freshwater plant), chemisorb (to chemically bind to)
22
Conclusions & Future Work
23
Future Work

Further Analyses
More realistic variants of our attacks
Better understanding of attacks
Quantifying the work function of the attacker
Defenses against attacks
Designing filters with an attacker in mind
Design strategies for general purpose ML

24
Data Collaboration

Expanding the scope of our work
Other spam filters
Other learning domains
Other datasets
We are interested in industry collaboration!

25
Conclusion

We Explored Availability Attacks
Successfully targeted a specific ham message
Successfully mounted a general DoS attack
Ongoing research into distribution-based attacks
Ongoing research on defenses

26
Questions?
28
Extra Slides
29
Outline

Motivation
Our Target: SpamBayes spam filter
Our Attacks against SpamBayes
Conclusions & Future Work

30
Taxonomy of Attacks

Criteria to qualify attacks and design new systems
Type of Security Violation
Integrity or Availability
Attacker's Influence
Causative or Exploratory
Specificity of the Attack
Ranges from Targeted to Indiscriminate

31
Previous Work

Other Spam Attacks
Kryptonite Word Attack
Good Word Attacks
ACRE Learning
Game Theoretic Approach
All are approaches for creating better spam -
integrity attacks

32
SpamBayes Token Probabilities

SpamBayes estimates the probability of spam given
each token w
Each estimate is smoothed

33
Dictionary Attack Results
34
Other Attacks

Pseudo-Spam Attack
Send ham bodies with spam headers, making header
tokens less informative
Dilution Attack
Insert tokens from a target spam into ham emails
to decrease the tokens' spam scores
