SpamAssassin is one of the most widely used spam filtering systems today. Unfortunately, because it can be deployed in so many different ways, it remains subject to many performance problems. Fortunately, several optimizations can be applied to most SpamAssassin installations to speed up rule processing, especially on Debian and Ubuntu GNU/Linux systems. These instructions can be adapted to other operating systems as well.
This article does not discuss configuring your mail filtering system (e.g. procmail, maildrop). That depends completely on your setup, and there are plenty of other articles that describe how best to set up what you want.
The spamc/spamd client/server combination
A normal SpamAssassin setup invokes the spamassassin command for each incoming e-mail that needs to be scanned. This is expensive when you consider what it involves: for each incoming e-mail, a perl interpreter starts and loads all of SpamAssassin’s perl code, including its thousands of rules. Starting SpamAssassin like this, with a decent amount of incoming e-mail, increases I/O load and slows things down, fast.
SpamAssassin 2.x introduced the spamc/spamd combination to alleviate this issue.
spamd is a version of SpamAssassin intended to be run as a long-running process that accepts connections from spamc. spamc is a lightweight C program made to replace the spamassassin command, taking a message and passing it to spamd for processing. This eliminates the I/O caused by loading perl and all of SpamAssassin’s rules repeatedly. Instead, both perl and SpamAssassin’s rule sets are loaded only once.
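To see the difference on your own system, a rough timing comparison can help. The sketch below is only an illustration: time_filter is a hypothetical helper, sample.eml is a placeholder file name, and the one-second resolution of date makes it suitable only for coarse comparisons over several runs.

```shell
# Hypothetical helper: run a filter command over the same message several
# times and report the elapsed wall-clock time in whole seconds.
time_filter() {
  filter=$1          # command to run, e.g. spamassassin or spamc
  message=$2         # path to a sample message file
  runs=${3:-10}      # number of repetitions (default 10)
  start=$(date +%s)
  i=0
  while [ "$i" -lt "$runs" ]; do
    "$filter" < "$message" > /dev/null
    i=$((i + 1))
  done
  end=$(date +%s)
  echo "$filter: $((end - start))s for $runs runs"
}

# Example (sample.eml is a placeholder; spamc needs a running spamd):
#   time_filter spamassassin sample.eml 20
#   time_filter spamc sample.eml 20
```

If the explanation above holds for your setup, the spamassassin runs should take noticeably longer, since each one starts perl and loads the rules again.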
Setting this up is relatively quick and easy. Install:
sudo aptitude install spamc
Then, edit /etc/default/spamassassin to enable startup of the spamd process. In this file, change:
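The exact setting is not reproduced here. As a sketch, assuming the file uses the ENABLED=0 flag that the Debian package shipped at the time (an assumption; check your own copy of the file first), the change could be scripted with a hypothetical helper like this:

```shell
# Assumption: the defaults file contains a line "ENABLED=0" that must
# become "ENABLED=1" for the init script to start spamd.
enable_spamd() {
  defaults_file=${1:-/etc/default/spamassassin}
  # Keep a .bak copy of the file, then flip the flag only if it is 0.
  sed -i.bak 's/^ENABLED=0/ENABLED=1/' "$defaults_file"
}

# Equivalent one-off command, run with root privileges:
#   sudo sed -i.bak 's/^ENABLED=0/ENABLED=1/' /etc/default/spamassassin
```

Editing the file by hand works just as well; the helper only makes the step repeatable.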
Start the spamd process:
sudo /etc/init.d/spamassassin start
Next, in your mail filtering scripts, wherever you invoke the spamassassin command, invoke the spamc command instead. After verifying that spam filtering still works after all these changes, you are done.
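One simple way to verify the switch is to pipe a sample message through spamc and check that the result carries SpamAssassin's headers. The sketch below assumes the default configuration, which adds an X-Spam-Status header to scanned messages; has_spam_status is a hypothetical helper and sample.eml a placeholder file name.

```shell
# Succeeds if the message on stdin has an X-Spam-Status header in its
# header section (everything before the first blank line).
has_spam_status() {
  awk '
    /^$/ { exit }                                   # end of headers
    tolower($0) ~ /^x-spam-status:/ { found = 1; exit }
    END { exit !found }                             # 0 if found, 1 if not
  '
}

# Example (requires a running spamd):
#   spamc < sample.eml | has_spam_status && echo "scanned by spamd"
```

If the check fails, the message probably came back unscanned, for example because spamd is not running.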
One important thing to note: the spamd/spamc combination adds another service running on your system, which may open a security hole waiting to be exploited. Make sure to take the usual precautions (securing user accounts, setting up a firewall, etc.) before enabling new network services.
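As a quick exposure check, it helps to know that spamd's default port is 783 and that, unless told otherwise, it listens only on the loopback address. The sketch below is a hypothetical helper that reads the output of ss -ltn and prints any listener on port 783 bound to a non-loopback address. It assumes the default port and the usual ss column layout.

```shell
# Reads `ss -ltn` output on stdin and prints local addresses listening on
# port 783 that are NOT bound to loopback (127.x.x.x or [::1]).
non_local_spamd_listeners() {
  awk '$4 ~ /:783$/ && $4 !~ /^(127\.|\[::1\])/ { print $4 }'
}

# Example:
#   ss -ltn | non_local_spamd_listeners
# No output means no spamd listener was found on a public interface.
```

Any address it prints is worth reviewing against your firewall rules and spamd options.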
Erich Schubert wrote about SpamAssassin 3.2’s ability to precompile rules a few months ago. While Perl’s regular expressions are fast, C regular expressions can be faster (and consume less memory). SpamAssassin can compile its rules into a C shared library that is used instead of interpreted perl code. Erich mentions a downside to precompilation: it requires your mail server to have a C compiler.
If you’re comfortable with that, setting up your system for precompiling SpamAssassin’s rules is easy. Install the required dependencies:
sudo aptitude install re2c build-essential
and run the sa-compile command:
sudo sa-compile
You then need to configure SpamAssassin to use the precompiled rules. This is done by editing /etc/spamassassin/v320.pre, and uncommenting the line:
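The line itself is not reproduced here. In SpamAssassin 3.2 the plugin that loads compiled rules is Mail::SpamAssassin::Plugin::Rule2XSBody, so the sketch below assumes that its loadplugin entry is the commented-out line in question. enable_compiled_rules is a hypothetical helper; check the file by hand before running anything like it.

```shell
# Assumption: v320.pre contains a commented-out line of the form
#   # loadplugin Mail::SpamAssassin::Plugin::Rule2XSBody
enable_compiled_rules() {
  pre_file=${1:-/etc/spamassassin/v320.pre}
  # Keep a .bak copy, then strip the leading "#" from that one line only.
  sed -i.bak \
    's/^#[[:space:]]*\(loadplugin[[:space:]]\{1,\}Mail::SpamAssassin::Plugin::Rule2XSBody\)/\1/' \
    "$pre_file"
}

# Example, run with root privileges:
#   enable_compiled_rules /etc/spamassassin/v320.pre
```

If you run spamd, it will most likely need a restart (for example sudo /etc/init.d/spamassassin restart) before it picks up the change.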