Title: Extracting the Ham from Spam
1Extracting the Ham from Spam
2Introduction
- History
- Spam
- Terminology
- ASSP
- Benchmarks
- Demo
- Questions
3History
- Where did the term spam come from?
4SPiced hAM
5SPAM sketch
http://www.youtube.com/results?search_query=spam+monty+python
http://video.google.com/videosearch?q=spam+monty+python
- Scene: A cafe. One table is occupied by a group of Vikings wearing horned helmets. Whenever the word "spam" is repeated, they begin singing and/or chanting. A man and his wife enter. The man is played by Eric Idle, the wife is played by Graham Chapman (in drag), and the waitress is played by Terry Jones, also in drag.
Man: You sit here, dear.
Wife: All right.
Man: Morning!
Waitress: Morning!
Man: Well, what've you got?
Waitress: Well, there's egg and bacon; egg sausage and bacon; egg and spam; egg bacon and spam; egg bacon sausage and spam; spam bacon sausage and spam; spam egg spam spam bacon and spam; spam sausage spam spam bacon spam tomato and spam
Vikings: Spam spam spam spam...
Waitress: ...spam spam spam egg and spam; spam spam spam spam spam spam baked beans spam spam spam...
Vikings: Spam! Lovely spam! Lovely spam!
Waitress: ...or Lobster Thermidor a Crevette with a mornay sauce served in a Provencale manner with shallots and aubergines garnished with truffle pate, brandy and with a fried egg on top and spam.
Wife: Have you got anything without spam?
Waitress: Well, there's spam egg sausage and spam, that's not got much spam in it.
Wife: I don't want ANY spam!
Man: Why can't she have egg bacon spam and sausage?
Wife: THAT'S got spam in it!
Man: Hasn't got as much spam in it as spam egg sausage and spam, has it?
Vikings: Spam spam spam spam... (Crescendo through next few lines...)
Wife: Could you do the egg bacon spam and sausage without the spam then?
Waitress: Urgghh!
Wife: What do you mean 'Urgghh'? I don't like spam!
Vikings: Lovely spam! Wonderful spam!
Waitress: Shut up!
Vikings: Lovely spam! Wonderful spam!
Waitress: Shut up! (Vikings stop) Bloody Vikings! You can't have egg bacon spam and sausage without the spam.
Wife: I don't like spam!
Man: Sshh, dear, don't cause a fuss. I'll have your spam. I love it. I'm having spam spam spam spam spam spam spam baked beans spam spam spam and spam!
Vikings: Spam spam spam spam. Lovely spam! Wonderful spam!
Waitress: Shut up!! Baked beans are off.
Man: Well could I have her spam instead of the baked beans then?
Waitress: You mean spam spam spam spam spam spam... (but it is too late and the Vikings drown her words)
Vikings: Spam spam spam spam. Lovely spam! Wonderful spam! Spam spa-a-a-a-a-am spam spa-a-a-a-a-am spam. Lovely spam! Lovely spam! Lovely spam! Lovely spam! Lovely spam! Spam spam spam spam!
6Spam Spam Spam lyrics
- Lovely spam, wonderful spa-a-m,
Lovely spam, wonderful S Spam,
Spa-a-a-a-a-a-a-am,
Spa-a-a-a-a-a-a-am,
SPA-A-A-A-A-A-A-AM,
SPA-A-A-A-A-A-A-AM,
LOVELY SPAM, LOVELY SPAM,
LOVELY SPAM, LOVELY SPAM,
LOVELY SPA-A-A-A-AM...
SPA-AM, SPA-AM, SPA-AM, SPA-A-A-AM!
7What is spam?
- Unsolicited Bulk e-mail (UBE)
- Unsolicited Commercial Email (UCE)
- The abuse of electronic messaging systems to
send unsolicited, undesired bulk messages
8The cost of spam
- Productivity: it is estimated that 80-85% of all email is spam
- Payload may contain malware (virus, worm, trojan, etc.)
- Internet bandwidth
9How do spammers get e-mail addresses?
- Replying to a spam e-mail
- Auto-responders (vacation)
- Viewing HTML spam (web beacons)
- Clicking on URLs to websites listed in spam
- Chain e-mail (MUA virus)
- Mining
- Usenet postings/message boards/chat rooms
- Usenet article message-IDs
- Company or personal websites
- DNS SOA records
- whois database
- Opt-out websites
- E-mail worms harvesting address books
- Shady businesses selling addresses to spammers
- Dictionary attacks
- Zombies
10Anti-spam best practices
- Turn off email preview
- Use throw away email addresses
- Do not use an auto responder
- Do not read spam
- Do not click on URLs in spam
- Give your e-mail address only to closely trusted acquaintances
- Use images or other obfuscation techniques
- Google your own email address to see where it appears
- Use a good spam filter
11Terminology
Filter verdict | Message is not SPAM | Message is SPAM
Negative (not identified as SPAM) | True Negative | False Negative
Positive (identified as SPAM) | False Positive | True Positive
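The four outcomes can be named mechanically from two facts about a message: whether it really is spam and whether the filter identified it as spam. The following is a minimal sketch; the function names and the sample data are hypothetical and are not part of ASSP.

```python
from collections import Counter


def outcome(is_spam: bool, flagged: bool) -> str:
    """Name the terminology-table cell for one message."""
    if flagged:
        # The filter said "spam": right if it is spam, a false alarm otherwise.
        return "true positive" if is_spam else "false positive"
    # The filter said "not spam": a miss if the message really was spam.
    return "false negative" if is_spam else "true negative"


def tally(messages):
    """Count outcomes for an iterable of (is_spam, flagged) pairs."""
    return Counter(outcome(is_spam, flagged) for is_spam, flagged in messages)


# Hypothetical sample: (actually spam?, identified as spam by the filter?)
sample = [(True, True), (True, False), (False, False), (False, True), (True, True)]
counts = tally(sample)
for name, count in sorted(counts.items()):
    print(f"{name}: {count}")
```

A false positive (good mail blocked) is usually the more expensive error, which is why the later slides spend so much time on whitelisting and sender notification.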
12xxxxx Listing
- Whitelisting: a list of email addresses which would generally never send you spam
- Blacklisting: a list of email addresses or domains you do not wish to receive any email from
- Greylisting: temporarily reject email from an unknown sender by imposing a fixed delay before accepting it (ASSP calls this "Delaying" due to a name conflict)
- Redlisting: keeps an address off the whitelist
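To make the greylisting idea concrete, here is a rough sketch of the "fixed delay before accepting" rule. It is an illustration only, not ASSP's Delaying code: the class name, the choice of keying on client IP, sender and recipient, and the 300-second delay are all assumptions.

```python
import time


class Greylist:
    """Toy greylist keyed on (client_ip, sender, recipient)."""

    def __init__(self, delay_seconds=300):
        self.delay_seconds = delay_seconds  # assumed delay, not an ASSP default
        self.first_seen = {}                # key -> time of first attempt

    def check(self, client_ip, sender, recipient, now=None):
        """Return "tempfail" until the delay has elapsed, then "accept"."""
        now = time.time() if now is None else now
        key = (client_ip, sender.lower(), recipient.lower())
        # Remember the first attempt; later attempts reuse that timestamp.
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.delay_seconds:
            return "accept"
        return "tempfail"  # a real server would answer with a 4xx SMTP code


greylist = Greylist(delay_seconds=300)
print(greylist.check("192.0.2.1", "a@example.org", "b@example.com", now=0))
print(greylist.check("192.0.2.1", "a@example.org", "b@example.com", now=600))
```

The technique relies on legitimate mail servers retrying after a temporary failure, while much bulk-mailing software does not retry.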
13More ASSP terms
- Spam Lover
- Spam Bucket
- Honeypot
- Postmaster
- Bayesian
- MTA
- MUA
- SMTP
14Processing matrix
 | Filtered Mail | Unfiltered Mail
Contributes to whitelist | Normal ASSP operation | Spam Lover
Doesn't contribute to whitelist | Redlist (but does contribute to spam/nospam collections) | No processing (also doesn't contribute to spam/nospam collections)
15What is ASSP?
- Anti-Spam SMTP Proxy
- An Open Source platform-independent transparent SMTP proxy server that leverages numerous methodologies and technologies to both rigidly and adaptively identify spam. -- wikipedia.org
16Theory of Operation
- When you install ASSP, a colony of super-intelligent thermophilus bacteria takes up residence on your CPU and begins reading all your email. They communicate using radio waves directly with the CPU and interface with the ASSP software, choosing between spam and nonspam mail.
- If you choose to read further this myth will be sadly dispelled, and I take no responsibility for the consequences.
- However, you can always refer your users to this slide to prove to them that their email is actually being filtered by super-intelligent bacteria.
17True Theory of Operation
- ASSP uses three complementary strategies to allow good email and to block unsolicited email:
- Whitelisting
- Spambuckets
- Bayesian filtering
- Local mail domain users are not whitelisted
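As a rough illustration of the third strategy, the sketch below scores a message from per-token counts taken from a spam collection and a non-spam collection. It is a generic naive Bayes combination, not ASSP's actual scoring code; the function names, the 0.01 to 0.99 clamp and the independence assumption are all choices made for this example.

```python
import math


def token_spam_probability(spam_count, ham_count, spam_total, ham_total):
    """Estimate P(spam | token) from how often the token appears in each collection."""
    spam_rate = spam_count / spam_total
    ham_rate = ham_count / ham_total
    if spam_rate + ham_rate == 0:
        return 0.5  # unseen token: no evidence either way
    p = spam_rate / (spam_rate + ham_rate)
    return min(0.99, max(0.01, p))  # clamp so a single token cannot decide alone


def combine(probabilities):
    """Combine per-token probabilities, assuming the tokens are independent."""
    log_spam = sum(math.log(p) for p in probabilities)
    log_ham = sum(math.log(1.0 - p) for p in probabilities)
    d = log_spam - log_ham
    # Numerically stable logistic function of the log-odds.
    if d >= 0:
        return 1.0 / (1.0 + math.exp(-d))
    e = math.exp(d)
    return e / (1.0 + e)


# Hypothetical counts: one token seen mostly in spam, one seen mostly in good mail.
p_spammy = token_spam_probability(40, 1, 100, 100)
p_hammy = token_spam_probability(2, 30, 100, 100)
print(combine([p_spammy, p_hammy]))
```

Because the probabilities come from the site's own mail collections, the filter adapts as those collections are rebuilt, which is the "adaptive" half of the description on the previous slide.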
18ASSP Implementation
- Version 1.2.5
- It is a single Perl script
- 360 KB
- 10,000 lines
- Built in web server
- Built in Pseudo-SMTP server
19ASSP Target User Base
- ASSP's primary target audience is mail administrators or system administrators at smallish institutions. If you operate an ISP or a mailhost with a heterogeneous user base, you may not have a good enough consensus about what is or is not considered spam. It should work well with between 1 and 300 client addresses and a mail volume of up to around 100,000 messages per day. Testing has not been done to verify these ranges.
- ASSP is not for the following:
- Individual clients -- ASSP must be installed together with an SMTP server
- Domains which receive mail indirectly, for example if you use fetchmail
20ASSP Philosophy
- Reject SPAM before the SMTP server
- Work with any SMTP MTA
- Adapt quickly as spammers change attack strategies
- Require low maintenance after initial setup
21Main ASSP capabilities
- Automatic Whitelisting
- Spam Traps
- Bayesian filtering
- Greylist
- Whitelist RE Matching
- Email interface
- Mail Analyzer
- Automatic Statistics
- SPF (Sender Policy Framework)
- DNSBL (DNS Black Lists)
- ClamAV virus scanner
- Mail host Headers
22ASSP Features
- Uses existing MTA and MUAs
- Runs on Linux, Unix, Windows, OS X, and more
- Automatic whitelist: no one you email will ever be blocked
- Redlist keeps an address off the whitelist
- Uses honeypot type spambucket addresses to automatically recognize spam and update your spam database
- Bayesian filter intelligently classifies email into spam and non-spam
- Supports site-defined regular expressions to identify spam or non-spam email
- Accepts whitelist submissions and spam error reports by authorized email
- Browser based setup
- Keeps spam statistics for your site
- Recognizes MIME encoded and other camouflaged spam
- Can listen on more than one SMTP port
- Basic anti-virus filtering using the ClamAV virus databases
- Optionally blocks no mail but adds an email header and/or updates the message subject (SPAM)
- Can block spam-bombs (when spammers forge your domain in the from field)
- More
23ASSP Flexibility
- Whitelist-only mode
- Don't filter, just tag subject line
- Let specific addresses receive SPAM
- Use a mail list behind ASSP
- Use ASSP with redundant MX domains
- Web based configuration
24ASSP Mail Processing
- In what order does ASSP process mail to check if it is spam?
- Local or whitelisted?
- Blacklisted Domain?
- Spam Helo?
- Addressed to spam-bucket?
- Mail bomb?
- Blocked attachment?
- Matches expression to identify non-spam?
- Matches expression to identify spam?
- Bayesian evaluation
- If the message is identified as spam at any step along the way it goes to the spam directory. If the message is local or whitelisted it goes to the notspam directory.
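The ordered checks amount to a first-match-wins chain with the Bayesian evaluation as the final fallback. The sketch below shows that shape with only three of the checks filled in; the message fields, the example lists and the function names are hypothetical and do not reflect ASSP's internals.

```python
# Hypothetical site data, for illustration only.
WHITELIST = {"friend@example.org"}
BLACKLISTED_DOMAINS = {"spammer.example"}
SPAM_BUCKETS = {"old-employee@example.com"}

# (reason, predicate, verdict) in the order the checks are applied.
CHECKS = [
    ("local or whitelisted",
     lambda m: m["sender"] in WHITELIST, "notspam"),
    ("blacklisted domain",
     lambda m: m["sender"].rsplit("@", 1)[-1] in BLACKLISTED_DOMAINS, "spam"),
    ("addressed to spam bucket",
     lambda m: m["recipient"] in SPAM_BUCKETS, "spam"),
]


def classify(message, checks, bayesian):
    """Return (verdict, reason) from the first check that matches."""
    for reason, predicate, verdict in checks:
        if predicate(message):
            return verdict, reason
    # No earlier check decided, so fall through to the statistical filter.
    return ("spam" if bayesian(message) else "notspam"), "Bayesian evaluation"


message = {"sender": "x@spammer.example", "recipient": "me@example.com"}
print(classify(message, CHECKS, bayesian=lambda m: False))
```

The remaining steps (spam HELO, mail bomb, blocked attachment and the two regular-expression checks) would slot into the same list at their positions in the order above.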
25Installation Overview
- Install ASSP and dependencies
- Configure ASSP
- Put ASSP in test mode
- Modify mail flow of test user(s)
- Test that it is working
- Prime the system
- Create the Bayesian database
- Automate daily Bayesian database updates
- Monitor spam filtering
- Correct false negatives and false positives
- Take ASSP out of test mode
- Train user community
- Modify mail flow of trained users
26ASSP Installation
- Install Perl
- Install Perl modules from CPAN
- Compress::Zlib NEEDED - Standard Perl installation
- Digest::MD5 NEEDED - Standard Perl installation
- Time::HiRes NEEDED - Standard Perl installation
- Net::DNS NEEDED TO RUN RBL, SPF and 1.2.X
- Email::Valid OPTIONAL, BUT ADVISED
- File::ReadBackwards OPTIONAL, BUT ADVISED
- Mail::SPF::Query OPTIONAL
- Mail::SRS OPTIONAL
- Sys::Syslog OPTIONAL
- Net::LDAP OPTIONAL, NEEDED IF YOU RUN LDAP
- Win32::Daemon NEEDED to run as a service on Windows
- No installation script
- GUNZIP assp.tar.gz to /usr/local/assp
- In /usr/local create the following directories
- assp/spam
- assp/notspam
- assp/errors
27Configure ASSP
- Start ASSP
- perl assp.pl
- Configure ASSP
- http://127.0.0.1:55555
- Login: <empty>
- Password: nospam4me (default)
- Beware of the Show Advanced Configuration Option
28ASSP Configuration
29Initial Configuration
- Change values for
- Web Admin Password
- Accept All Mail
- Local Domains
- Spam Error
- Spam Addresses
- Addresses of recipients at your site that only
receive spam (website spam-bait, ex-employees)
30Mail Flow
[Diagram: inbound and outbound mail flow. Without ASSP the chain is Internet, Mail Svr, Clients. With ASSP the chain is Internet, ASSP, Mail Svr, Clients. Placing ASSP between the mail server and the clients (Internet, Mail Svr, ASSP, Clients) is marked Invalid.]
31Email Flow
[Diagram: inbound and outbound email flow among the Internet, ASSP, the MTA, GroupWise/Exchange and the clients. Labels on the ASSP detail: smtp0; white, red, black and grey lists; ports 25 and 125; ASSP and MTA, in and out; Bayesian DB; Errors, spam and Not spam.]
321999
[Figure: sample message text, "This is an email that is being sent to the Internet."]
332003
342004
[Diagram labels: GroupWise, Internet MTA, GWIA, MTA, sendmail, virtusertable, aliases, POA]
352006
[Diagram labels: GroupWise, Internet MTA, GWIA, MTA, sendmail, virtusertable, aliases, POA]
36Phase In
[Diagram labels: GroupWise, Internet MTA, GWIA, MTA, sendmail, virtusertable, aliases, POA]
37Flow with Anti-Virus
[Diagram: inbound and outbound mail flow with anti-virus. Labels: Internet, ASSP, Mail Svr, Antivirus, Clients.]
38Flow with Groupware
[Diagram: inbound and outbound mail flow with groupware. Labels: Internet, ASSP, MTA, Groupware, Clients.]
- To use ASSP with Exchange, Lotus Notes or GroupWise, you'll also need to implement a smarthost relay like sendmail, qmail, postfix, exim or one of a number of others
39DNSBL vs Greylist
- The ASSP Greylist supersedes DNSBL
- ASSP Greylist is not to be confused with Greylisting
- Use of DNSBL is discouraged (if a DNSBL lookup blocks, ASSP will block due to its multiplex design)
40Penalty Box
- The penalty box blocks an SMTP server from sending to your server for about 72 hours if it violates basic SMTP connection conventions beyond a certain threshold.
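The penalty box behaviour can be pictured as a per-server violation counter plus a timed ban. This is a sketch of the idea only, not ASSP's code: the class name and the threshold of 3 are assumptions, and only the 72-hour figure comes from the slide.

```python
class PenaltyBox:
    """Toy penalty box: ban a server once it exceeds a violation threshold."""

    def __init__(self, threshold=3, ban_seconds=72 * 3600):
        self.threshold = threshold      # assumed; the slide only says "a certain threshold"
        self.ban_seconds = ban_seconds  # "about 72 hours" per the slide
        self.violations = {}            # ip -> violations counted so far
        self.banned_until = {}          # ip -> time at which the ban expires

    def record_violation(self, ip, now):
        count = self.violations.get(ip, 0) + 1
        self.violations[ip] = count
        if count > self.threshold:
            # Over the threshold: start (or restart) the ban and reset the count.
            self.banned_until[ip] = now + self.ban_seconds
            self.violations[ip] = 0

    def is_blocked(self, ip, now):
        return now < self.banned_until.get(ip, 0)


box = PenaltyBox(threshold=3)
for second in range(4):
    box.record_violation("192.0.2.9", now=second)
print(box.is_blocked("192.0.2.9", now=10))
```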
41SMTP Ports
- For example, internet mail needs to connect to
ASSP on port 25 (ASSP's listen port), and ASSP
can proxy to your mail server on port 125 (or any
port you choose) -- ASSP's SMTP Destination. You
need to change your mail server to match.
42Sender Notification
- With most client-based filters (POPFile, SpamBayes, SpamAssassin) senders receive NO NOTIFICATION if their mail isn't delivered. With most of these solutions, the user bears full responsibility to VERIFY that no good mail is blocked.
- ASSP's solution to this is that when spam is blocked the SENDER RECEIVES NOTIFICATION, and it does this without generating non-delivery reports that bounce and bounce again because spammers forge their from address.
43Catch-22
- Issue: Let's say a client receives a non-delivery report. How can he (not in the whitelist) send a message to the organization if he is still not in the whitelist? If the recipient or ASSP admin does not receive the notification, they will not know that there is a false positive and will not add the unknown client to the whitelist.
- Solution: Set up an email address and put it in the Spam-Lover Address configuration option. Then modify the spam error message to direct people to it: "500 Mail appears to be unsolicited (spam) -- please forward this email to not-spam@mydomain.com if you feel this is in error." Any false positives that bounce back to clients will hopefully be reported to the Mail Admin via the spam lover address (they just forward it), assuming they read the rejected email.
44Email Interface
- Any user can help to improve ASSP's spam filtering accuracy. Users can use the email interface to add addresses to the whitelist, report spam, or report false positives. To use it, you must have it enabled in the configuration, and have names set for the addresses. The interface only accepts mail addressed to addresses at any of your local domains, and only from "Accept All Mail" hosts or authenticated SMTP connections.
- assp-white -- for whitelist additions
- assp-spam -- to report spam that got through
- assp-notspam -- to report mail mis-categorized as spam
- Whitelisting: Assuming that your local domain is yourdomain.com, to add addresses to the whitelist, you'd create a message to assp-white@yourdomain.com. You can either put the addresses in the body of the message, or as recipients of the message. For example, if you wanted to add all the addresses in your address book to the whitelist, create a message to assp-white@yourdomain.com and then add your entire address book to the BCC part of the message and click send. Note that no mail will be delivered to any address except assp-white@yourdomain.com (and that won't actually be passed to your mail transport). Within a short time you'll receive a response from ASSP showing the results of your mail.
- False Negatives: To report a spam that got through, simply forward the mail to assp-spam@yourdomain.com. It's best to forward it as an attachment, but you can just forward it normally if you must. In a short time you will receive a confirmation.
- False Positives: The process is the same to report mail miscategorized as spam, but send it to assp-notspam@yourdomain.com.
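A whitelist submission is just an ordinary message whose body lists the addresses. The sketch below builds such a message with Python's standard email library; the helper name, the subject line and the sample addresses are made up for illustration, and actually sending it (for example with smtplib over an authenticated connection) is left out.

```python
from email.message import EmailMessage


def build_whitelist_request(sender, addresses, local_domain="yourdomain.com"):
    """Build a message to assp-white with one address per line in the body."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = f"assp-white@{local_domain}"
    msg["Subject"] = "whitelist additions"  # hypothetical subject line
    msg.set_content("\n".join(addresses) + "\n")
    return msg


request = build_whitelist_request(
    "me@yourdomain.com",
    ["alice@example.org", "bob@example.net"],
)
print(request)
```

Reporting a false negative or false positive works the same way, with the offending message forwarded to the assp-spam or assp-notspam address instead.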
45Spam Report
46Benchmarks
- Spam Bucket
- Ex-employee that left the company 5 years ago
- Receives 50-80 spam mails per day
47Filter effectiveness
- SpamAssassin: 60-65% effective in 2004
- Deteriorated to 11% by 2006
- (267 of 2238 True Positives)
- ASSP in first 3 weeks of operation: 99.7%
- (1336 of 1340 True Positives)
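The effectiveness figures are the share of spam messages that the filter caught (true positives divided by all spam received). A short calculation using the counts on this slide, with a hypothetical helper name:

```python
def detection_rate(true_positives, total_spam):
    """Percentage of spam messages that the filter identified as spam."""
    return 100.0 * true_positives / total_spam


spamassassin_2006 = detection_rate(267, 2238)
assp_first_weeks = detection_rate(1336, 1340)
assp_missed = 1340 - 1336  # spam that got through (false negatives)

print(f"SpamAssassin, 2006: {spamassassin_2006:.1f}%")
print(f"ASSP, first 3 weeks: {assp_first_weeks:.1f}%")
print(f"ASSP false negatives: {assp_missed}")
```

Note that this measures only how much spam was caught; it says nothing about false positives, which have to be tracked separately.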
48ASSP vs SpamAssassin
- SpamAssassin
- is difficult to install
- requires a great investment in hand-made regular expressions and header analysis to identify spam
- Hand-crafted expressions are brittle as spammers adjust their strategies
- Requires frequent updates to accurately identify spam
- ASSP
- is low maintenance
- is easy to install
- is a complete spam blocking solution, not just a filter that must be integrated into your MTA
- works with nearly every MTA on any OS
- Poorly documented
49Before ASSP
50Turning ASSP on
51With ASSP
52stat.pl Statistics
root@smtp# perl stat.pl /tmp/m.log
As of Mon Jan 22 21:48:46 2007 the mail logfile shows:
0 proxy / smtp connections
253 were dropped for attempted relays (0.0% of total)
31523 messages, 16758 were spam (53.2%) in 65 days
for 485.0 messages per day or 257.8 spams per day
1518 additions to / verifications of the whitelist (23.4 per day)
14643 were judged spam by the bayesian filter (87.4% of spam)
2115 were to spam addresses (12.6% of spam)
0 were rejected for executable attachments (0% of spam)
10121 were sent from local clients (68.5% of nonspam)
842 were from whitelisted addresses (5.7% of nonspam)
0 messages were passed to SPAMLOVERs
3802 were ok after a bayesian check (25.8% of nonspam)
1498 addresses are on the whitelist
0 hits on the blacklist
0 resulted in spam (0.0% of Bayesian spam, 0.0% of blacklist hits)
0 resulted in non-spam (0.000% of blacklist hits)
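The derived figures in this report can be reproduced from the raw counts, which is a handy sanity check when reading stat.pl output. The sketch below redoes the arithmetic for the numbers above; the variable and function names are hypothetical and this is not how stat.pl itself is written.

```python
def pct(part, whole):
    """Percentage of whole represented by part."""
    return 100.0 * part / whole


messages, spam, days = 31523, 16758, 65
nonspam = messages - spam

# Breakdown of spam and non-spam as reported by stat.pl.
spam_parts = {"bayesian": 14643, "spam addresses": 2115, "executables": 0}
nonspam_parts = {"local clients": 10121, "whitelisted": 842, "bayesian ok": 3802}

print(f"spam share: {pct(spam, messages):.1f}%")
print(f"messages per day: {messages / days:.1f}")
print(f"spams per day: {spam / days:.1f}")
for name, count in spam_parts.items():
    print(f"{name}: {pct(count, spam):.1f}% of spam")
for name, count in nonspam_parts.items():
    print(f"{name}: {pct(count, nonspam):.1f}% of nonspam")
```

The spam sub-counts add up to the 16758 spam messages and the non-spam sub-counts add up to the remaining 14765, so the report is internally consistent.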
53ASSP Statistics
54Issues
- Vacation
- Auto Replies
- TLS and secure SMTP
- ASSP is site based, not per-user
55Lessons Learned
- Whitelist + spambucket + Bayesian is a great spam filtering strategy
- By default, SPF failures are filtered even if the sender is whitelisted
- Be very careful what you put in the relay hosts list
- ASSP is not multi-process or multi-threaded
56Utilities
- rebuildspamdb.pl
- repair.pl
- move2num.pl
- stat.pl
57Demo
- Web configuration
- Mail analyzer
58Resources on the Internet
- http://www.spamland.com
- http://antispam.yahoo.com
- http://www.openspf.org
59Questions