Title: Filtering Spam with Behavioral Blacklisting
1 Filtering Spam with Behavioral Blacklisting
- Authors: Anirudh Ramachandran, Nick Feamster, and Santosh Vempala
- Publication: ACM Conference on Computer and Communications Security, 2007
- Presenter: Melvin Rodriguez, for CAP 6133, Spring 2008
2 Filtering Spam with Behavioral Blacklisting
- What is spam / spamming?
  - Indiscriminate sending of unsolicited bulk messages
  - Different types of spam
- Why is it used?
  - Reaches potential customers / markets
  - Little or no operating cost for senders
E-mail spam: unsolicited bulk email
Source: Wikipedia - http://en.wikipedia.org/wiki/Spam_%28Monty_Python%29
3 Filtering Spam with Behavioral Blacklisting
- The problem with spam
  - Users waste time and resources
  - Users receive multiple unwanted emails
    - Promoting events
    - Advertising new products and sales
    - Promoting services
  - Spammers use increasingly sophisticated techniques
  - More resources are needed
    - Increased server capacity, storage, and bandwidth
    - Increased time to manage items
Spam consumes needed resources and increases costs
Source: Wurd - http://www.wurd.com/cl_email_faq_spam.php
4 Filtering Spam with Behavioral Blacklisting
- A 2007 study by Osterman Research Inc.
  - A growing proportion of spam is generated by zombies that are part of enormous botnets of infected computers.
  - Symantec reported in March 2007 that it had discovered more than six million zombies worldwide.
  - More than 80% of spam is today generated by zombies
  - Spam campaigns are constantly changing strategies
Spam is very hard to control
Source: Osterman Research Inc. - http://www.ostermanresearch.com/whitepapers/or_sym0607.pdf
5 Filtering Spam with Behavioral Blacklisting
- How to solve the problem
  - Spam filters
  - Blacklisting
    - Publicizing known IP addresses that send spam
  - Issues
    - Need to know what to block / filter
    - Need constant updates
    - Need to adapt to changes in spam campaigns
- Behavioral blacklisting: SpamTracker
  - Classifies senders based on their sending behavior rather than IP identity
  - Similar patterns in spammers' sending behavior form a fingerprint
Behavioral spam filter: based on sending behavior
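The difference between blocking by IP identity and blocking by sending behavior can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the log format, the domain names, the IP addresses (taken from documentation ranges), the use of cosine similarity, and the 0.9 threshold. None of it is taken from the paper.

```python
from collections import Counter
from math import sqrt

# Hypothetical log records: (sender IP, targeted domain). Invented data.
LOG = [
    ("192.0.2.10", "a.example"), ("192.0.2.10", "b.example"),
    ("192.0.2.10", "c.example"), ("192.0.2.10", "a.example"),
    ("198.51.100.7", "a.example"), ("198.51.100.7", "b.example"),
    ("198.51.100.7", "c.example"), ("198.51.100.7", "a.example"),
    ("203.0.113.5", "d.example"), ("203.0.113.5", "d.example"),
]

IP_BLACKLIST = {"192.0.2.10"}  # assumed: only this sender is listed so far


def fingerprint(ip, log):
    """Count how many messages `ip` sent to each targeted domain."""
    return Counter(domain for sender, domain in log if sender == ip)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def ip_listed(ip):
    """Conventional blacklist check: is this exact IP already listed?"""
    return ip in IP_BLACKLIST


def behaves_like_known_spammer(ip, log, threshold=0.9):
    """Flag `ip` if its fingerprint is close to that of any listed sender."""
    fp = fingerprint(ip, log)
    return any(cosine(fp, fingerprint(bad, log)) >= threshold for bad in IP_BLACKLIST)
```

In this invented data, `198.51.100.7` is not on the IP blacklist, yet it targets the same domains in the same proportions as the listed sender, so the behavioral check flags it while the identity check does not.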
6 Filtering Spam with Behavioral Blacklisting
- Behavioral blacklisting
- SpamTracker
  - Clusters email senders based on the domains they target
  - Builds blacklist clusters based on known spammers
  - Tracks sending patterns from other senders
  - Uses fast spectral clustering algorithms
- Two phases
  - Clustering (spectral): groups senders with similar behavior across their target domains
    - Gather initial data and create clusters
  - Classification: assign a value and compare
    - Obtain sending patterns from servers
    - Compare the computed value to known patterns
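The two phases can be sketched roughly as follows. This is not the paper's algorithm: a simple greedy threshold clustering stands in for the spectral clustering named on the slide, and the vector layout (one count per targeted domain), the cosine measure, the 0.8 threshold, and the sample data are all assumptions chosen for illustration.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def average(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cluster(patterns, threshold=0.8):
    """Phase 1 (stand-in): greedily group similar sending patterns.

    NOTE: simple threshold clustering, NOT spectral clustering.
    Returns one average pattern per cluster.
    """
    clusters = []  # each cluster is a list of pattern vectors
    for p in patterns:
        for members in clusters:
            if cosine(p, average(members)) >= threshold:
                members.append(p)
                break
        else:
            clusters.append([p])  # no cluster was close enough
    return [average(members) for members in clusters]


def score(pattern, cluster_averages):
    """Phase 2: similarity of a new sender to its closest cluster."""
    return max(cosine(pattern, c) for c in cluster_averages)


# Hypothetical data. Rows: known spammers; columns: messages sent to each
# of four targeted domains.
KNOWN_SPAMMERS = [
    [5, 4, 0, 0],
    [6, 5, 0, 0],
    [0, 0, 7, 3],
    [0, 0, 6, 2],
]
centers = cluster(KNOWN_SPAMMERS)
```

A new sender with pattern `[4, 3, 0, 0]` scores close to 1 against the first cluster, while a sender that spreads mail evenly, `[1, 1, 1, 1]`, scores noticeably lower against both. Where to set the cut-off between the two is a tuning decision that this sketch does not address.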
7 Filtering Spam with Behavioral Blacklisting
[Figure: SpamTracker high-level design]
8 Filtering Spam with Behavioral Blacklisting
- Conclusion
  - New spam detection technique using behavioral blacklisting
  - Classifies email based on senders' sending patterns
  - Creates email blacklist clusters
9 Filtering Spam with Behavioral Blacklisting
- Contributions
  - Improvements in detecting email spam
  - Uses new algorithms to detect and classify email spam
  - Capable of detecting new spam senders earlier than existing processes
10 Filtering Spam with Behavioral Blacklisting
- Weaknesses
  - Dependent on the number of data sources
  - Limited number of data collection points
  - Limited testing pool of domains
  - Process sequence is not clearly depicted
  - Lack of integration with existing spam systems
11 Filtering Spam with Behavioral Blacklisting
- How to Improve
  - More testing needed for analysis of false positives
  - Increase the number of data collection points
  - Add additional features to algorithms
  - Add integration capabilities with existing email spam services
  - Add missing diagrams discussed in paper
  - Present process sequence in more detail
13 Filtering Spam with Behavioral Blacklisting
- Origins of the use of the word "spam"
  - "Spam" is a popular Monty Python sketch, first televised in 1970. In the sketch, two customers are trying to order a breakfast from a menu that includes the processed meat product in almost every dish. The term spam (in electronic communication, and as of 2007, general slang) is derived from this sketch.
Source: Wikipedia - http://en.wikipedia.org/wiki/Spam_%28Monty_Python%29
14 Filtering Spam with Behavioral Blacklisting
- The 2007 list of the top 12 countries that spread spam around the globe
  - USA - 28.4%
  - South Korea - 5.2%
  - China (including Hong Kong) - 4.9%
  - Russia - 4.4%
  - Brazil - 3.7%
  - France - 3.6%
  - Germany - 3.4%
  - Turkey - 3.
  - Poland - 2.7%
  - Great Britain - 2.4%
  - Romania - 2.3%
  - Mexico - 1.9%
  - Other countries - 33.9%
Source: Wikipedia - http://en.wikipedia.org/wiki/Spam_%28Monty_Python%29
15 Filtering Spam with Behavioral Blacklisting
- Data sets used in evaluation
  - Trace: Organization
    - Date range: Mar. 1-31, 2007
    - Fields: received time, remote IP, targeted domain, whether rejected
  - Trace: Blacklist
    - Date range: Apr. 1-30, 2007
    - Fields: IP address (or range), time of listing
- Our primary data is a set of email logs from a provider (Organization) that hosts and manages mail servers for over 115 domains. The trace also contains an indication of whether it rejected the SMTP connection or not. We also use the full database of Spamhaus [37] for one month, including all additions that happened within the month (Blacklist), to help us evaluate the performance of SpamTracker relative to existing blacklists. We choose the Blacklist traces for the time period immediately after the email traces end so that we can discover the first time an IP address, unlisted at the time email from it was observed in the Organization trace, was added to the Blacklist trace.
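The comparison described above (finding when a sender first seen in the Organization trace was later added to the Blacklist trace) might be computed along these lines. This is a minimal sketch under stated assumptions: the record layouts mirror the fields listed on the slide, but the values are invented, the function name is hypothetical, and blacklist entries are treated as single IP addresses (address ranges are not handled).

```python
from datetime import datetime

# Hypothetical Organization records:
# (received time, remote IP, targeted domain, whether rejected)
ORGANIZATION = [
    (datetime(2007, 3, 5, 10, 0), "192.0.2.10", "a.example", False),
    (datetime(2007, 3, 20, 9, 30), "192.0.2.10", "b.example", True),
    (datetime(2007, 3, 12, 8, 0), "198.51.100.7", "a.example", False),
]

# Hypothetical Blacklist records: (IP address, time of listing)
BLACKLIST = [
    ("192.0.2.10", datetime(2007, 4, 2, 0, 0)),
    ("192.0.2.10", datetime(2007, 4, 15, 0, 0)),
]


def first_listing_delays(org_trace, blacklist_trace):
    """Map each IP to the time between its first observed email and its
    first subsequent blacklist listing. IPs never listed are omitted."""
    first_seen = {}
    for received, ip, _domain, _rejected in org_trace:
        if ip not in first_seen or received < first_seen[ip]:
            first_seen[ip] = received

    delays = {}
    for ip, listed_at in blacklist_trace:
        seen = first_seen.get(ip)
        if seen is None or listed_at < seen:
            continue  # never observed, or already listed when observed
        delay = listed_at - seen
        if ip not in delays or delay < delays[ip]:
            delays[ip] = delay
    return delays


delays = first_listing_delays(ORGANIZATION, BLACKLIST)
```

With this sample data, `192.0.2.10` is first seen on Mar. 5 and first listed on Apr. 2, so its delay is a little under 28 days, while `198.51.100.7` never appears in the Blacklist records and is left out of the result.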