Title: Deny-Spammers: Spam Blocking with a Dynamically Updated Firewall Ruleset
1Deny-Spammers: Spam Blocking with a Dynamically Updated Firewall Ruleset
- 11.06.02
- chris tracy
- deeann m.m. mikula
2Motivation for Paper
- Deeann presented a Spam BOF at LISA 2001
- generated a lot of excitement
- idea to write the paper came from this
- seemed like a novel approach to fighting spam
- share our good idea
- get feedback to improve the idea
3Introduction
- We will
- detail our methods for controlling spam at a small ISP
- discuss initial unsuccessful tactics
- discuss the resulting development of our unique spam blocking system
4Introduction
- We will show how our spam blocking system
- classifies hosts as probable spammers
- dynamically manages a firewall ruleset
- conserves system resources
- effectively blocks spam
5Theft of Service
- Lost bandwidth
- CPU cycles
- Disk space
- Lost time
- end-users and administrators
- Obviously this is lost money
- Apparent escalation in recent years
6Spam Horror Stories
- "Hotmail states that 80% of its almost 2 billion processed email messages are spam."
- - Lee Gomes, The Wall Street Journal
- "Brightmail...now records 140,000 spam attacks a day, each potentially involving thousands of messages, if not millions."
- - Jennifer Lee, The New York Times
7Tools for Coping with Spam
- Simple Mail Filters
- Mail::Audit, procmail...
- patches for various MTAs
- qmail-uce, rblsmtpd...(lots of these)
- versatile spam filters
- spamassassin
- sieve
- bayespam
8Tools for Coping with Spam
- Databases
- Realtime Blackhole Lists (RBL)
- lists the IPs of known spammers, open relays, dialup/DSL address pools
- Razor
- Pyzor
- DCC - Distributed Checksum Clearinghouse
- http://www.rhyolite.com/anti-spam
9Trouble with Spam Filtering
- Requires
- more CPU power
- network bandwidth
- if accessing networked databases
- RBL, DCC, Razor, Pyzor...
- a more complicated mail system
- administrative overhead
- updates, configuration...
10Trouble with Spam Filtering
- False positives (Type I error)
- legitimate messages that are marked as spam
- spam filters are getting better and better, but...
- users are more likely to ignore everything that their spam filter catches
11Our Problem
- Spammers were effectively DoSing our mail server
- Wanted a way to be able to selectively deny hosts
- take away the ability to connect to our mail server if we detect them as a spammer
- RFC 706 (next slide)
12RFC 706 - On the Junk Mail Problem
- In a nutshell
- No mechanism for a mail host to selectively refuse messages
- Lots of unwanted messages by a misbehaving host would constitute a DoS attack
- Both local users and network communication could suffer
13Hardware/Software Platform
- Software
- FreeBSD 2.2.8
- server is just a little behind...
- qmail-1.03
- patched with qmail-uce checklocal patch
- Hardware
- 1GHz Athlon processor
- 640MB of RAM
14What We Tried First
- qmail-uce checklocal patch
- denies mail for non-existent mailboxes
- by default, qmail accepts mail for these users
- this is actually an anti-spam mechanism to prevent spammers from getting valid addresses
- with the patch, qmail-smtpd returns a 550 when a client attempts to send to a non-existent mailbox
15What We Tried First
- qmail-uce checklocal patch
- example
- RCPT TO: abcdefghijklmnop@telerama.com
- 550 Sorry, no mailbox here by that name. (#5.1.1)
16What We Tried First
- qmail-uce checklocal patch
- example of logging output
- Oct 10 13:09:24 mail smtpd 1034269764.717203
7678 DENYMAIL RCPT_TO_Filter.NoUser_ relay
unknown 205.201.1.215 FROM <test@telerama.com>
ADDR <abcdefghijklmnop@telerama.com>
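A log line like this one carries the two fields a blocker needs: the epoch timestamp and the IP address of the sending relay. The following is a minimal Python sketch of pulling them out. The regular expression and the name `parse_denymail` are illustrative assumptions, not taken from the Deny-Spammers source (which is written in Perl), and the exact punctuation of real checklocal log lines may differ from the sample used here.

```python
import re

# Assumed shape: "... smtpd <epoch.usec> <pid> DENYMAIL RCPT_TO_Filter.NoUser ... <ip> ..."
# Optional colons are tolerated because the exact punctuation of real
# checklocal log lines is not confirmed here.
DENYMAIL_RE = re.compile(
    r"smtpd:?\s+(?P<epoch>\d+\.\d+)\s+.*?DENYMAIL:?\s+RCPT_TO_Filter\.NoUser"
    r".*?(?P<ip>\d{1,3}(?:\.\d{1,3}){3})"
)


def parse_denymail(line):
    """Return (timestamp, ip) for a no-such-user rejection, or None."""
    match = DENYMAIL_RE.search(line)
    if match is None:
        return None
    return float(match.group("epoch")), match.group("ip")


# Sample line modelled on the logging output shown on the slide.
sample = (
    "Oct 10 13:09:24 mail smtpd 1034269764.717203 7678 DENYMAIL "
    "RCPT_TO_Filter.NoUser_ relay unknown 205.201.1.215 "
    "FROM <test@telerama.com> ADDR <abcdefghijklmnop@telerama.com>"
)
print(parse_denymail(sample))
print(parse_denymail("Oct 10 13:09:25 mail smtpd ordinary delivery line"))
```

Lines that do not describe a rejected recipient simply yield `None`, so the caller can skip them.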
17What We Tried First
- qmail-uce checklocal patch
- limitations
- kept the queue size down, but didn't prevent spammers from making 50-100 parallel SMTP connections
- sluggish performance when there were many parallel SMTP connections to the server
- as a result, load average still >> 1
18What We Tried First
- rblsmtpd (part of ucspi-tcp)
- a.k.a. tcpserver
- http://cr.yp.to/ucspi-tcp.html
- queries any number of RBL sources (and anti-RBL sources) to catch spammers
- open relays, dialup/DSL pools, known spammers
- rejections are temporary (451) or permanent (553)
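RBL-style lists are typically queried over DNS: the client reverses the octets of the connecting IPv4 address, appends the list's zone, and looks the resulting name up. An answer means the address is listed. The Python sketch below only builds that query name. The zone `rbl.example.org` is a placeholder rather than a real list, the function name is hypothetical, and the DNS lookup itself is left out.

```python
def dnsbl_query_name(ip, zone):
    """Build the DNS name used to look up an IPv4 address in a DNS blocklist."""
    octets = ip.split(".")
    if len(octets) != 4 or not all(o.isdigit() and 0 <= int(o) <= 255 for o in octets):
        raise ValueError(f"not a dotted-quad IPv4 address: {ip!r}")
    # Octets are reversed so the most specific part comes first,
    # mirroring how reverse DNS (in-addr.arpa) names are built.
    return ".".join(reversed(octets)) + "." + zone


# "rbl.example.org" is a placeholder zone used only for illustration.
print(dnsbl_query_name("205.201.1.215", "rbl.example.org"))
```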
19What We Tried First
- rblsmtpd (part of ucspi-tcp)
- toggled on when heavily spammed
- off if queue size < 2000
- on if queue size > 2000
- this was done to limit complaints
- on/off method made 451 errors effectively useless
- limitations
- too many false positives
- many complaints from customers about mail delays
(451) or bounces (553)
20A Decision to Start Coding
- Should we buy more hardware?
- more expensive
- Or write software to manage a firewall based on the checklocal logging output?
- less expensive
- Obviously, this is what we opted to do
21Design Goals
- What has not worked for us in the past?
- Do we have enough resources to allow client-side filtering options?
- Do we have the time and expertise to create our own spam blocking solutions?
- Would it be more effective to purchase faster and better hardware than to script a custom solution?
- How transparent does the spam blocking need to be to the user base?
- Are we concerned with bandwidth consumed by spam attacks?
22Requirements
- Method must conserve system resources.
- Method must reduce the amount of bandwidth consumed by spam attacks.
- Method must not add much additional overhead to mail processing.
- Method must prevent spamming sites from getting mail into the mail queue.
- The system must be manageable in a way that allows us to exempt certain hosts or networks.
- Keep our customers happy by minimizing the number of false positives.
- The process must be as transparent as possible to end users.
23Data Flow Diagram
24Data Flow Diagram
- Oct 10 13:09:24 mail smtpd 1034269764.717203
7678 DENYMAIL RCPT_TO_Filter.NoUser_ relay
unknown 205.201.1.215 FROM <test@telerama.com>
ADDR <abcdefghijklmnop@telerama.com>
25Data Flow Diagram
- Add rule
- ipfw add 1000 unreach filter-prohib tcp from 205.201.1.215 to any 25
- Delete rule
- ipfw delete 1000
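The add and delete commands differ only in the rule number and the banned address, so a blocker can generate them from a template. The Python sketch below builds the argument lists without running them. The function names, and the idea of tracking one rule number per banned host, are assumptions for illustration and do not describe how Deny-Spammers itself allocates rule numbers.

```python
def ipfw_add_args(rule_number, ip):
    """Argument list for the 'add' form shown on the slide."""
    return ["ipfw", "add", str(rule_number), "unreach", "filter-prohib",
            "tcp", "from", ip, "to", "any", "25"]


def ipfw_delete_args(rule_number):
    """Argument list for removing a previously added rule."""
    return ["ipfw", "delete", str(rule_number)]


# Print the commands instead of executing them; running ipfw needs root
# on a FreeBSD host.
print(" ".join(ipfw_add_args(1000, "205.201.1.215")))
print(" ".join(ipfw_delete_args(1000)))
```

Keeping the commands as argument lists means they could later be handed to `subprocess.run` without going through a shell.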
26Data Structures
- 3 hash structures
- Host Tracking: spammer (hash of lists)
- keys of hash - host IP address
- values of hash - lists of timestamps
- timestamps - represent times that a host sent a mail to a nonexistent address
- Banned Hosts: banned (1-level hash)
- keys of hash - host IP address
- values of hash - timestamp for when a host was banned
- Exception List: noban_list (4-level hash)
- keys represent octets
- first level - first set of octets, etc...
- hash structure chosen for performance
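The three structures map naturally onto dictionaries. The Python sketch below mirrors the names on the slide (`spammer`, `banned`, `noban_list`). The helper functions are hypothetical, and the sketch assumes the exception list holds full four-octet addresses only, which may be narrower than what the real tool accepts.

```python
from collections import defaultdict

# Host tracking: IP address -> list of timestamps of rejected deliveries.
spammer = defaultdict(list)

# Banned hosts: IP address -> time the ban was installed.
banned = {}

# Exception list: nested dicts, one level per octet.
noban_list = {}


def add_exception(ip):
    """Insert a full IPv4 address into the 4-level exception structure."""
    node = noban_list
    for octet in ip.split("."):
        node = node.setdefault(octet, {})


def is_exempt(ip):
    """Walk one level per octet; exempt only if all four octets match."""
    octets = ip.split(".")
    if len(octets) != 4:
        return False
    node = noban_list
    for octet in octets:
        if octet not in node:
            return False
        node = node[octet]
    return True


add_exception("192.0.2.10")
spammer["205.201.1.215"].append(1034269764.717203)
banned["205.201.1.215"] = 1034269800.0
print(is_exempt("192.0.2.10"), is_exempt("192.0.2.11"))
```

Each lookup costs at most four dictionary probes regardless of how many exceptions are loaded, which is one plausible reading of "chosen for performance".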
27Other Configuration Variables
- MAX_SPAMMER_ENTRIES (default 50)
- number of timestamp entries to keep for each spammer
- SPAM_TIMESPAN (default 3600 seconds)
- 5-minute sampling interval
- timespan to check for spam attempts
- SPAM_TRIGGER (default 10)
- number of nonexistent mailbox delivery attempts required to trigger block
- BAN_TIME (default 3 days)
- how long a host should stay banned for
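The interaction of SPAM_TIMESPAN and SPAM_TRIGGER can be shown as a small window count: a host trips the block once the number of its rejected deliveries inside the timespan exceeds the trigger. The Python sketch below uses the default values from the slide. The function names are hypothetical, and the strict "greater than" comparison follows the pseudo-code rather than the Perl source, which was not consulted.

```python
MAX_SPAMMER_ENTRIES = 50        # timestamps kept per host (not used below)
SPAM_TIMESPAN = 3600            # seconds
SPAM_TRIGGER = 10               # attempts tolerated within the timespan
BAN_TIME = 3 * 24 * 3600        # 3 days, in seconds (not used below)


def recent_attempts(timestamps, now, timespan=SPAM_TIMESPAN):
    """Count attempts that fall inside the window ending at `now`."""
    return sum(1 for t in timestamps if now - t <= timespan)


def should_ban(timestamps, now):
    """True once the host has exceeded SPAM_TRIGGER attempts in the window."""
    return recent_attempts(timestamps, now) > SPAM_TRIGGER


burst = [100.0 + i for i in range(11)]   # 11 attempts, one second apart
print(should_ban(burst, now=200.0))
print(should_ban(burst[:10], now=200.0))
```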
28Implementation / Pseudo-code
- While (true)
- match maillog lines against a regexp for undeliverable messages to non-existent addresses and parse timestamp and IP address
- skip line if host is in the exception list
- trim the timestamp list for this host to MAX_SPAMMER_ENTRIES
- add the timestamp to the host's list contained in the spammer hash
- check how many delivery attempts to non-existent addresses this host has made in the sampling interval SPAM_TIMESPAN
- if (nondeliverable messages count > SPAM_TRIGGER) add_firewall_rule()
- if (time() > next_refresh)
- next_refresh = time() + REFRESH_INTERVAL
- reload the exception list into noban_list hash
- prune banned hash (un-ban hosts who have been banned for BAN_TIME)
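The loop above can be sketched in Python as a small class that is fed one rejected delivery at a time. Several details are assumptions rather than facts about Deny-Spammers: the `Blocker` class and its callbacks are hypothetical, the firewall is replaced by injected functions so the sketch runs anywhere, the exception list is a plain set instead of the 4-level hash, the value of REFRESH_INTERVAL is invented because the slides do not state it, and hosts that are already banned are skipped.

```python
import time
from collections import defaultdict

MAX_SPAMMER_ENTRIES = 50
SPAM_TIMESPAN = 3600          # seconds
SPAM_TRIGGER = 10
BAN_TIME = 3 * 24 * 3600      # 3 days
REFRESH_INTERVAL = 300        # assumed; the slides do not give a value


class Blocker:
    def __init__(self, add_rule, delete_rule, load_exceptions, now=None):
        self.spammer = defaultdict(list)   # ip -> rejection timestamps
        self.banned = {}                   # ip -> time of ban
        self.add_rule = add_rule           # hypothetical firewall callbacks
        self.delete_rule = delete_rule
        self.load_exceptions = load_exceptions
        self.noban = set(load_exceptions())
        start = time.time() if now is None else now
        self.next_refresh = start + REFRESH_INTERVAL

    def handle(self, ip, stamp):
        """Process one delivery attempt to a non-existent address."""
        if ip in self.noban or ip in self.banned:
            return
        stamps = self.spammer[ip]
        del stamps[:-MAX_SPAMMER_ENTRIES]      # trim, keeping the newest entries
        stamps.append(stamp)
        recent = sum(1 for t in stamps if stamp - t <= SPAM_TIMESPAN)
        if recent > SPAM_TRIGGER:
            self.add_rule(ip)
            self.banned[ip] = stamp
            del self.spammer[ip]

    def refresh(self, now):
        """Periodic housekeeping: reload exceptions, lift expired bans."""
        if now <= self.next_refresh:
            return
        self.next_refresh = now + REFRESH_INTERVAL
        self.noban = set(self.load_exceptions())
        for ip, since in list(self.banned.items()):
            if now - since >= BAN_TIME:
                self.delete_rule(ip)
                del self.banned[ip]


blocker = Blocker(
    add_rule=lambda ip: print("ban", ip),
    delete_rule=lambda ip: print("unban", ip),
    load_exceptions=lambda: ["192.0.2.10"],
    now=0,
)
for i in range(11):                      # 11 rejections within a few seconds
    blocker.handle("205.201.1.215", 1000 + i)
blocker.refresh(1010 + BAN_TIME)         # three days later the ban is lifted
```

Passing timestamps in explicitly, instead of calling `time.time()` inside the methods, keeps the sketch deterministic and easy to exercise against a saved log.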
29In Production
Firewall Reset
30Quick Historical Statistics
- 25,284 Dec 2001
- 46,338 Jan 2002
- 35,849 Feb 2002
- 44,652 Mar 2002
- 25,175 Apr 2002
- 26,808 May 2002
- 33,298 Jun 2002
- 18,787 Jul 2002
- 24,781 Aug 2002
- 28,883 Sep 2002
- 16,935 Oct 2002
Number of hosts banned by month
31Limitations
- CIDR notation not supported in exception list
- only compatible with FreeBSD checklocal patched qmail
- limited scalability
- checklocal exploitable by spammers to find valid addresses
- easy to work around this
32Future Plans
- Address scalability issues
- add ability to use a separate firewall
- Integration with a 3rd-party app
- SpamAssassin, Anomy Sanitizer...
- use results from app to ban hosts
- Improve statistics generation
- facilitate research
- look for interesting patterns
33Future Plans
- Develop a better interface...
- for unbanning hosts
- managing the exception list
- Interoperability with other operating systems and MTAs
- Develop more spam signatures...?
- number of concurrent SMTP connections
- number of recipients in RCPT TO list
34Availability
- Deny-Spammers is freely available
- source code and documentation
- http://deny-spammers.telerama.com
- written in Perl 5
- only works with FreeBSD checklocal patched qmail
35The End