Spam

From Antiflux Wiki

(Difference between revisions)
m
Line 6: Line 6:
=== More aggressive spam analysis with user-configurable filtering ===
=== More aggressive spam analysis with user-configurable filtering ===
-
Okcomputer runs incoming mail through [http://www.spamassassin.org/ Spamassassin] before delivering it. If Spamassassin determines that the email is spam, it tags it with a special header but it does not reject it. Users can configure their email programs to automatically delete the tagged email or sort it into a special folder. This gives users control over how they want to filter their email, by choosing thresholds and particular tests.
+
Okcomputer runs incoming mail through [http://www.spamassassin.org/ Spamassassin] and [http://www.nuclearelephant.com/projects/dspam/ DSPAM] before delivering it. If either program determines that the email is spam, it tags the message with special headers like the ones below but it does not reject it. Users can configure their email programs to automatically delete the tagged email or sort it into a special folder. This gives users control over how they want to filter their email, by choosing thresholds and particular tests.
-
=== Virus scanning ===
+
<pre><nowiki>
-
On top of analyzing messages for spam characteristics, okcomputer also uses [http://www.amavis.org/ Amavis] to scan for viruses. Like Spamassassin, Amavis inserts special headers into the messages to let the end user decide what to do with incoming viruses.
+
-
 
+
-
=== Statistics ===
+
-
We also keep some [http://antiflux.org/mrtg/spam.html rough spam statistics].
+
-
[[Image:http://antiflux.org/mrtg/spam-day.png]]
+
-
 
+
-
== Filtering tagged email ==
+
-
For specific directions on configuring your email program to filter mail based on header information, we suggest reading the [https://web.interchange.ubc.ca/index.cfm?p=main/support/email/setup/spam.inc UBC spam filtering page]. Basically, Spamassassin inserts two special headers like the following.
+
-
 
+
-
<pre>
+
-
<nowiki>
+
X-Spam-Status: Yes, hits=10.7 required=5.0 tests=FROM_STARTS_WITH_NUMS,
X-Spam-Status: Yes, hits=10.7 required=5.0 tests=FROM_STARTS_WITH_NUMS,
FROM_ENDS_IN_NUMS,NO_REAL_NAME,CLICK_BELOW,WEB_BUGS,BIG_FONT,
FROM_ENDS_IN_NUMS,NO_REAL_NAME,CLICK_BELOW,WEB_BUGS,BIG_FONT,
CLICK_HERE_LINK,CTYPE_JUST_HTML version=2.20
CLICK_HERE_LINK,CTYPE_JUST_HTML version=2.20
X-Spam-Level: **********
X-Spam-Level: **********
-
</nowiki>
+
</nowiki></pre>
-
</pre>
+
 
 +
<pre><nowiki>
 +
X-DSPAM-Result: Spam
 +
X-DSPAM-Confidence: 0.5910
 +
X-DSPAM-Probability: 1.0000
 +
X-DSPAM-Signature: 422a4197102002045975100
 +
X-DSPAM-User: tim
 +
</nowiki></pre>
 +
 
 +
For specific directions on configuring your email program to filter mail based on header information, we suggest reading the [https://web.interchange.ubc.ca/index.cfm?p=main/support/email/setup/spam.inc UBC spam filtering page]. The examples below show the headers we insert to let your email program identify spam and viruses.
You can configure your email application to check the email header (not the email body!) for either "X-Spam-Status: Yes" if you trust the system default threshold or "X-Spam-Level: ****" (adjust the number of * characters) if you want to pick your own threshold.
You can configure your email application to check the email header (not the email body!) for either "X-Spam-Status: Yes" if you trust the system default threshold or "X-Spam-Level: ****" (adjust the number of * characters) if you want to pick your own threshold.
Line 31: Line 29:
You can also filter based on the "tests=" section. Spamassassin performs a [http://spamassassin.rediris.es/tests.html long list of tests] on each message and tags the message with the names of the tests that indicate the message's possible spaminess. For example, if you want to filter out email from servers listed in the [http://www.ordb.org/about/ relays.ordb.org] database, set your email program to check the X-Spam-Status header for the string " RCVD_IN_RELAYS_ORDB_ORG".
You can also filter based on the "tests=" section. Spamassassin performs a [http://spamassassin.rediris.es/tests.html long list of tests] on each message and tags the message with the names of the tests that indicate the message's possible spaminess. For example, if you want to filter out email from servers listed in the [http://www.ordb.org/about/ relays.ordb.org] database, set your email program to check the X-Spam-Status header for the string " RCVD_IN_RELAYS_ORDB_ORG".
-
=== Example 1: filtering spam using procmail ===
+
=== Virus scanning ===
-
[http://www.procmail.org/ Procmail] is a very flexible mail processor with many uses, including sorting incoming mail into different folders. To have procmail dump email marked as spam into a special folder called "spamassassin", create (or edit) a file called .procmailrc in your home directory with the following text.
+
On top of analyzing messages for spam characteristics, okcomputer also uses [http://www.amavis.org/ Amavis] to scan for viruses. Like Spamassassin and DSPAM, Amavis inserts special headers like the ones below into the messages to let the end user decide what to do with incoming viruses.
-
<pre>
+
<pre><nowiki>
 +
X-Virus-Scanned: by amavisd-new-20030616-p10 (Debian) at antiflux.org
 +
X-Amavis-Alert: INFECTED, message contains virus: Worm.Bagle.Gen-zippwd,
 +
Worm.Bagle.Gen-zippwd
 +
</nowiki></pre>
 +
 
 +
=== Statistics ===
 +
We also keep some [http://antiflux.org/mrtg/spam.html statistics]
 +
http://antiflux.org/mrtg/spam-day.png
 +
 
 +
== Filtering spam using procmail ==
 +
[http://www.procmail.org/ Procmail] is a very flexible mail processor with many uses, including sorting incoming mail into different folders. One big advantage of procmail is that it processes your mail when it arrives instead of when you check it. This can save you the time and frustration associated with downloading many spam messages.
 +
 
 +
=== Step 1: You don't need to configure your account to use procmail for mail delivery ===
 +
This is not really a step, just a reference for people reading other procmail tutorials. Most places will tell you to create a .forward file with something like "| /usr/bin/procmail" in it. Don't do that for your Antiflux account. All mail already goes through procmail.
 +
 
 +
=== Step 2: Create files and directories ===
 +
We suggest that you keep a procmail log so that you can keep track of what it's doing. This is helpful in case anything goes wrong. Create a .procmail directory in your home directory.
 +
 
 +
Create a .procmailrc file in your home directory and add the following lines to it. This assumes that your mail is stored in a directory called "mail" in your home directory. If you use Pine, this is the default.
 +
 
 +
<pre><nowiki>
MAILDIR=$HOME/mail
MAILDIR=$HOME/mail
 +
PMDIR=$HOME/.procmail
 +
LOGFILE=$PMDIR/log
 +
</nowiki></pre>
 +
 +
=== Step 3: filter spam detected by Spamassasin ===
 +
Add the following to your .procmailrc file to filter anything Spamassasin classifies as spam into a mailbox called "spamassasin".
 +
 +
<pre><nowiki>
:0:
:0:
* ^X-Spam-Status: Yes
* ^X-Spam-Status: Yes
spamassassin
spamassassin
-
</pre>
+
</nowiki></pre>
-
(Optional) Create a directory called .procmail in your home directory (run "mkdir ~/.procmail" without the quotes) and add the following line to the top of your .procmailrc file to enable logging. This is useful for troubleshooting problems with procmail.
+
This uses Spamassasin's default threshold set by the Antiflux Management. We tend to be a little conservative, so we may set the threshold a bit high to prevent legitimate email being tagged as spam even if it means a few spams slip by. If you would like to set your own threshold, you can filter based on the X-Spam-Level header like this.
-
<pre>
+
<pre><nowiki>
-
LOGFILE=$HOME/.procmail/log
+
:0:
-
</pre>
+
* ^X-Spam-Level: \*\*\*\*
 +
spamassassin.level4
 +
</nowiki></pre>
-
=== Example 2: filtering viruses using procmail ===
+
=== Step 4: filter spam detected by DSPAM ===
 +
This step may be more work than it's worth for many users, so feel free to skip it. Unlike Spamassasin which works well "out of the box," DSPAM requires training. To get good results, you'll need a mailbox containing at least a thousand spams (and no legitimate messages) and another mailbox with at least a thousand legitimate emails (and no spams). You might want to run with Spamassin for a while, manually removing spams that manage to get through to your inbox.
-
Create a file called .procmailrc in your home directory, if you haven't already. Add the following to the end of your .procmailrc file to have procmail dump mail containing identified viruses to a folder called "virus".
+
<pre><nowiki>
 +
nice dspam_corpus <user> <mailbox>
 +
nice dspam_corpus --addspam <user> <spambox>
 +
</nowiki></pre>
-
<pre>
+
Once you have trained DSPAM, add the following to your .procmailrc file.
 +
 
 +
<pre><nowiki>
 +
:0:
 +
* ^X-DSPAM-Result: Spam
 +
dspam
 +
</nowiki></pre>
 +
 
 +
=== Step 5: filter viruses ===
 +
Add the following to the end of your .procmailrc file to have procmail dump mail containing identified viruses to a folder called "virus".
 +
 
 +
<pre><nowiki>
:0:
:0:
* ^X-Amavis-Alert: INFECTED
* ^X-Amavis-Alert: INFECTED
virus
virus
-
</pre>
+
</nowiki></pre>
-
=== A note about "This email scanned by [...]" messages ===
+
== A note about "This email scanned by [...]" messages ==
Some systems, typically run by corporate IT departments with something to prove, like to advertise that they scan outgoing email for viruses and spam. You'll often see something like this.
Some systems, typically run by corporate IT departments with something to prove, like to advertise that they scan outgoing email for viruses and spam. You'll often see something like this.
Line 83: Line 127:
</pre>
</pre>
-
That text at the end is worthless from a security point of view and is considered spam (i.e. advertising the scanner software/service) by some. Since it's only plain text, it would be trivial for a virus to add it to every message it sends out. Indeed, some viruses are [http://www.theregister.co.uk/content/56/36526.html starting to do just that]. There might be some value in cryptographically signing the message so that people can verify it using a public key, but that's beyond the abilities of casual email users.
+
That text at the end is worthless from a security point of view and is really just advertising for the scanner software. Since it's only plain text, it would be trivial for a virus to add it to every message it sends out. Indeed, some viruses are [http://www.theregister.co.uk/content/56/36526.html starting to do just that]. There might be some value in cryptographically signing the message so that people can verify it using a public key, but that's beyond the abilities of casual email users.
-
Scanning outgoing mail is essentially worthless because there's no (easy, secure) way for the recipient to trust the sender's scanner. It's up to the recipient's mail client (or mail server) to scan incoming mail. It's also nicer to add headers to the email rather than adding text to the message body since the scan message is metadata, not actual content.
+
Scanning outgoing mail is essentially worthless because there's no easy, secure way for the recipient to trust the sender's scanner. It's up to the recipient's mail client (or mail server) to scan incoming mail. It's also nicer to add headers to the email rather than adding text to the message body so that it's less distracting to the user.

Revision as of 00:40, 6 March 2005

Overview

We hate spam and viruses. Our goal is to block 100% of them without blocking any legitimate email. We also realize that our goal is nearly impossible to reach in the real world. As a compromise, we attack the problem from three sides.

Conservative policy to bounce messages based on sending server address

The server uses the Spamhaus SBL and XBL to reject mail from known spammers. Rejected messages are bounced back to the sender with an explanation of why they were rejected. End users do not need to configure anything.

More aggressive spam analysis with user-configurable filtering

Okcomputer runs incoming mail through Spamassassin and DSPAM before delivering it. If either program determines that the email is spam, it tags the message with special headers like the ones below but it does not reject it. Users can configure their email programs to automatically delete the tagged email or sort it into a special folder. This gives users control over how they want to filter their email, by choosing thresholds and particular tests.

X-Spam-Status: Yes, hits=10.7 required=5.0 tests=FROM_STARTS_WITH_NUMS,
        FROM_ENDS_IN_NUMS,NO_REAL_NAME,CLICK_BELOW,WEB_BUGS,BIG_FONT,
        CLICK_HERE_LINK,CTYPE_JUST_HTML version=2.20
X-Spam-Level: **********
X-DSPAM-Result: Spam
X-DSPAM-Confidence: 0.5910
X-DSPAM-Probability: 1.0000
X-DSPAM-Signature: 422a4197102002045975100
X-DSPAM-User: tim

For specific directions on configuring your email program to filter mail based on header information, we suggest reading the UBC spam filtering page. The example above shows the headers we insert to let your email program identify spam.

You can configure your email application to check the email header (not the email body!) for either "X-Spam-Status: Yes" if you trust the system default threshold or "X-Spam-Level: ****" (adjust the number of * characters) if you want to pick your own threshold.
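
If it helps to see the two checks spelled out, here is a minimal Python sketch of what an email program does when it applies them. It is only an illustration and not part of the Antiflux setup: the function name is_spam and the min_stars parameter are made up for this example, and the sample message is a trimmed copy of the headers shown above.

```python
from email import message_from_string

# Trimmed sample built from the headers shown above (the body is invented).
SAMPLE = """\
X-Spam-Status: Yes, hits=10.7 required=5.0 tests=FROM_STARTS_WITH_NUMS,
        FROM_ENDS_IN_NUMS,NO_REAL_NAME version=2.20
X-Spam-Level: **********
Subject: example

body text
"""


def is_spam(raw_message, min_stars=None):
    """Hypothetical helper: decide whether a message is tagged as spam.

    With min_stars=None, trust the server's X-Spam-Status verdict.
    Otherwise apply a personal threshold to the X-Spam-Level header.
    Only the header is inspected, never the body.
    """
    msg = message_from_string(raw_message)
    if min_stars is None:
        status = msg.get("X-Spam-Status", "")
        return status.strip().startswith("Yes")
    level = msg.get("X-Spam-Level", "")
    return level.count("*") >= min_stars
```

Calling is_spam(SAMPLE) trusts the system default threshold, while is_spam(SAMPLE, min_stars=4) applies your own.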

You can also filter based on the "tests=" section. Spamassassin performs a long list of tests on each message and tags the message with the names of the tests that indicate the message's possible spamminess. For example, if you want to filter out email from servers listed in the relays.ordb.org database, set your email program to check the X-Spam-Status header for the string " RCVD_IN_RELAYS_ORDB_ORG".
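
As a rough illustration of what is in the "tests=" section, the following Python sketch pulls the test names out of an X-Spam-Status header that is folded across several lines, as in the example above. The function name spam_tests is hypothetical, and the parsing assumes the header layout shown on this page (a comma-separated list after "tests=", followed by "version="). Other Spamassassin versions may format the header differently.

```python
import re
from email import message_from_string

# Sample copied from the headers shown above (the body is invented).
SAMPLE = """\
X-Spam-Status: Yes, hits=10.7 required=5.0 tests=FROM_STARTS_WITH_NUMS,
        FROM_ENDS_IN_NUMS,NO_REAL_NAME,CLICK_BELOW,WEB_BUGS,BIG_FONT,
        CLICK_HERE_LINK,CTYPE_JUST_HTML version=2.20
X-Spam-Level: **********

body text
"""


def spam_tests(raw_message):
    """Hypothetical helper: list the test names in X-Spam-Status."""
    msg = message_from_string(raw_message)
    status = msg.get("X-Spam-Status", "")
    # Capture everything after "tests=" up to the next "key=" field
    # (here " version=") or the end of the header value.
    match = re.search(r"tests=(.*?)(?:\s+\w+=|$)", status, re.DOTALL)
    if not match:
        return []
    # Names are comma-separated; strip the whitespace left by line folding.
    return [name.strip() for name in match.group(1).split(",") if name.strip()]
```

Checking "RCVD_IN_RELAYS_ORDB_ORG" in spam_tests(message) compares whole test names, so it cannot be fooled by a longer test name that merely contains the same text.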

Virus scanning

On top of analyzing messages for spam characteristics, okcomputer also uses Amavis to scan for viruses. Like Spamassassin and DSPAM, Amavis inserts special headers like the ones below into the messages to let the end user decide what to do with incoming viruses.

X-Virus-Scanned: by amavisd-new-20030616-p10 (Debian) at antiflux.org
X-Amavis-Alert: INFECTED, message contains virus: Worm.Bagle.Gen-zippwd,
    Worm.Bagle.Gen-zippwd
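
For the curious, here is a small Python sketch of how a program might read the virus header. It assumes the X-Amavis-Alert layout in the example above ("INFECTED", then a comma-separated list of names after "virus:"), which is the only sample we have to go on. The function name virus_names is made up for this example.

```python
from email import message_from_string

# Sample copied from the headers shown above (the body is invented).
SAMPLE = """\
X-Virus-Scanned: by amavisd-new-20030616-p10 (Debian) at antiflux.org
X-Amavis-Alert: INFECTED, message contains virus: Worm.Bagle.Gen-zippwd,
    Worm.Bagle.Gen-zippwd

body text
"""


def virus_names(raw_message):
    """Hypothetical helper: return the distinct virus names Amavis reported."""
    msg = message_from_string(raw_message)
    alert = msg.get("X-Amavis-Alert", "")
    if not alert.startswith("INFECTED"):
        return []
    # Assumed layout: names follow "virus:" and are separated by commas.
    _, _, names = alert.partition("virus:")
    unique = []
    for name in names.split(","):
        name = name.strip()
        if name and name not in unique:
            unique.append(name)
    return unique
```

An empty list means the message carried no INFECTED alert. A message that was scanned and found clean still has the X-Virus-Scanned header, so that header alone says nothing about infection.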

Statistics

We also keep some statistics: http://antiflux.org/mrtg/spam.html

http://antiflux.org/mrtg/spam-day.png

Filtering spam using procmail

Procmail is a very flexible mail processor with many uses, including sorting incoming mail into different folders. One big advantage of procmail is that it processes your mail when it arrives instead of when you check it. This can save you the time and frustration associated with downloading many spam messages.

Step 1: You don't need to configure your account to use procmail for mail delivery

This is not really a step, just a reference for people reading other procmail tutorials. Most places will tell you to create a .forward file with something like "| /usr/bin/procmail" in it. Don't do that for your Antiflux account. All mail already goes through procmail.

Step 2: Create files and directories

We suggest that you keep a procmail log so that you can keep track of what it's doing. This is helpful in case anything goes wrong. Create a .procmail directory in your home directory (run "mkdir ~/.procmail" without the quotes).

Create a .procmailrc file in your home directory and add the following lines to it. This assumes that your mail is stored in a directory called "mail" in your home directory. If you use Pine, this is the default.

MAILDIR=$HOME/mail
PMDIR=$HOME/.procmail
LOGFILE=$PMDIR/log

Step 3: Filter spam detected by Spamassassin

Add the following to your .procmailrc file to filter anything Spamassassin classifies as spam into a mailbox called "spamassassin".

:0:
* ^X-Spam-Status: Yes
spamassassin

This uses Spamassassin's default threshold, which is set by the Antiflux Management. We tend to be a little conservative, so we may set the threshold a bit high to prevent legitimate email from being tagged as spam, even if it means a few spams slip by. If you would like to set your own threshold, you can filter based on the X-Spam-Level header like this. The recipe below matches any message with four or more asterisks.

:0:
* ^X-Spam-Level: \*\*\*\*
spamassassin.level4
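
To see why the pattern acts as a threshold instead of an exact match, here is a quick Python check using the same regular expression. Python's re module is not procmail's matching engine, so treat this as a sketch of the idea, not a test of procmail itself. The names LEVEL4 and matches_level4 are made up for this example.

```python
import re

# Same pattern as the recipe above. Nothing anchors the end of the line,
# so any header with at least four asterisks matches.
LEVEL4 = re.compile(r"^X-Spam-Level: \*\*\*\*", re.MULTILINE)


def matches_level4(headers):
    """Hypothetical helper: True if the headers reach the four-star level."""
    return LEVEL4.search(headers) is not None
```

To raise or lower your threshold, add or remove \* pairs in the recipe. Procmail tries recipes in order and stops at the first one that delivers the message, so a message matching both this recipe and the X-Spam-Status one ends up in whichever mailbox comes first in your .procmailrc.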

Step 4: Filter spam detected by DSPAM

This step may be more work than it's worth for many users, so feel free to skip it. Unlike Spamassassin, which works well "out of the box," DSPAM requires training. To get good results, you'll need a mailbox containing at least a thousand spams (and no legitimate messages) and another mailbox with at least a thousand legitimate emails (and no spams). You might want to run with Spamassassin for a while, manually removing spams that manage to get through to your inbox. Once you have both mailboxes, train DSPAM on each of them, replacing <user>, <mailbox> (legitimate mail), and <spambox> (spam) with your own values.

nice dspam_corpus <user> <mailbox>
nice dspam_corpus --addspam <user> <spambox>

Once you have trained DSPAM, add the following to your .procmailrc file.

:0:
* ^X-DSPAM-Result: Spam
dspam
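
The recipe above only looks at X-DSPAM-Result. If you want to be stricter, the X-DSPAM-Confidence header from the example near the top of this page could be checked as well. The Python sketch below shows the idea. The function name dspam_says_spam and the min_confidence parameter are made up for this example, and we are assuming the confidence value is a plain decimal number, as in that sample.

```python
from email import message_from_string

# Sample copied from the DSPAM headers shown earlier (the body is invented).
SAMPLE = """\
X-DSPAM-Result: Spam
X-DSPAM-Confidence: 0.5910
X-DSPAM-Probability: 1.0000
X-DSPAM-User: tim

body text
"""


def dspam_says_spam(raw_message, min_confidence=0.0):
    """Hypothetical helper: True if DSPAM tagged the message as spam
    with at least min_confidence."""
    msg = message_from_string(raw_message)
    if msg.get("X-DSPAM-Result", "").strip() != "Spam":
        return False
    try:
        confidence = float(msg.get("X-DSPAM-Confidence", "0"))
    except ValueError:
        # Unexpected header format: stay on the safe side and do not filter.
        return False
    return confidence >= min_confidence
```

With the sample above, dspam_says_spam(SAMPLE) is true, but asking for min_confidence=0.9 is not, since the reported confidence is only 0.5910.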

Step 5: Filter viruses

Add the following to the end of your .procmailrc file to have procmail dump mail containing identified viruses to a folder called "virus".

:0:
* ^X-Amavis-Alert: INFECTED
virus

A note about "This email scanned by [...]" messages

Some systems, typically run by corporate IT departments with something to prove, like to advertise that they scan outgoing email for viruses and spam. You'll often see something like this.


Date: Fri, 5 Mar 2004 13:40:22 -0700 (MST)
From: William 'Bill' Lumbergh
To: Peter Gibbons
Subject: new cover sheets for TPS reports

Hey Peter, what's happening? Just wanted to let you know that we're putting
those new cover sheets on all TPS reports before sending them out now, so
if you can remember to do that from now on, that would be great.

Bill Lumbergh
"My other car is also a Porsche"

==================================================================
This message certified virus-free by CompuGlobal HyperScanner 2000
Enterprise Edition.
http://www.compuglobalhypermeganet.com/
==================================================================

That text at the end is worthless from a security point of view and is really just advertising for the scanner software. Since it's only plain text, it would be trivial for a virus to add it to every message it sends out. Indeed, some viruses are starting to do just that. There might be some value in cryptographically signing the message so that people can verify it using a public key, but that's beyond the abilities of casual email users.

Scanning outgoing mail is essentially worthless because there's no easy, secure way for the recipient to trust the sender's scanner. It's up to the recipient's mail client (or mail server) to scan incoming mail. It's also nicer to add headers to the email rather than adding text to the message body so that it's less distracting to the user.
