Speeding up SpamAssassin rule processing on Debian and Ubuntu

SpamAssassin is one of the most widely used spam filtering systems. Unfortunately, because it can be deployed in so many different ways, it is prone to performance problems. Fortunately, several optimizations can be applied to most SpamAssassin installations to speed up rule processing, especially on Debian and Ubuntu GNU/Linux systems. These instructions can be adapted to other operating systems as well.

This article does not discuss configuring your mail filtering system (e.g. procmail, maildrop). That depends entirely on your setup, and there are more than likely plenty of other articles that describe the best way to set up what you want.

The spamc/spamd client/server combination

A normal SpamAssassin setup invokes the spamassassin command for each incoming e-mail that needs to be scanned. This is expensive when you consider what it involves: for each incoming e-mail, a perl interpreter starts and loads all of SpamAssassin’s perl code, including its thousands of rules. Starting SpamAssassin like this under a decent amount of incoming e-mail increases I/O load and slows things down, fast.

SpamAssassin 2.x introduced the spamc/spamd combination to alleviate this issue. spamd is a version of SpamAssassin intended to be run as a long-running process that accepts connections from spamc. spamc is a lightweight C program made to replace the spamassassin command: it takes a message and passes it to spamd for processing. This avoids the I/O caused by loading perl and all of SpamAssassin’s rules for every message. Instead, both perl and SpamAssassin’s rule sets are loaded only once, when spamd starts.

Setting this up is relatively quick and easy. Install spamc:

sudo aptitude install spamc

Then, edit /etc/default/spamassassin to enable startup of the spamd process. In this file, change ENABLED=0 to:

ENABLED=1
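
If you would rather script this edit, the following is a minimal sketch. It assumes the file already contains a line starting with ENABLED= and that GNU sed (with its -i option) is available; the function name enable_spamd is made up for this example.

```shell
#!/bin/sh
# enable_spamd FILE: set ENABLED=1 in a Debian-style defaults file.
# Hypothetical helper; assumes GNU sed (-i) and an existing ENABLED= line.
enable_spamd() {
    sed -i 's/^ENABLED=.*/ENABLED=1/' "$1"
}

# Usage, as root:
#   enable_spamd /etc/default/spamassassin
```

All other lines in the file are left untouched.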

Start the spamd process:

sudo /etc/init.d/spamassassin start

Next, wherever your mail filtering scripts invoke the spamassassin command, invoke the spamc command instead. Once you have verified that spam filtering still works with these changes, you are done.
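
One way to verify the setup is to feed spamc a message containing the GTUBE test string, which SpamAssassin is designed to score as spam. The following is a rough sketch, not a complete test suite: the function name gtube_message is made up for this example, and it relies on spamc's -c option, which prints score/threshold and exits with status 1 when the message is spam.

```shell
#!/bin/sh
# Print a minimal test message containing the GTUBE string, which a working
# SpamAssassin setup scores as spam by design. (Hypothetical helper name.)
gtube_message() {
    printf 'Subject: GTUBE test\n\n'
    printf '%s\n' 'XJS*C4JDBQADN1.NSBN3*2IDNEN*GTUBE-STANDARD-ANTI-UBE-TEST-EMAIL*C.34X'
}

# Only talk to spamd if spamc is actually installed.
if command -v spamc >/dev/null 2>&1; then
    # spamc -c prints "score/threshold" and exits 1 when the message is spam.
    if gtube_message | spamc -c; then
        echo "GTUBE was not flagged: is spamd running?"
    else
        echo "GTUBE was flagged as spam: spamc and spamd are talking."
    fi
fi
```

If spamc cannot reach spamd, it does not report the message as spam, which is why the first branch suggests checking the daemon.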

One important thing to note: the spamd/spamc combination adds another service running on your system, which is one more potential security hole waiting to be exploited. Make sure to take the usual precautions (securing user accounts, setting up a firewall, etc.) before enabling new network services.

Precompiled Rules

Erich Schubert wrote about SpamAssassin 3.2’s ability to precompile rules a few months ago. While Perl’s regular expressions are fast, C regular expressions can be faster (and consume less memory). SpamAssassin can compile its rules into a C shared library that is used instead of interpreted perl code. Erich mentions one drawback of precompilation: it requires your mail server to have a C compiler.

If you’re comfortable with that, setting up your system for precompiling SpamAssassin’s rules is easy. Install the required dependencies:

sudo aptitude install re2c build-essential

and run the sa-compile command:

sudo sa-compile

You then need to configure SpamAssassin to use the precompiled rules. This is done by editing /etc/spamassassin/v320.pre and uncommenting the line:

loadplugin Mail::SpamAssassin::Plugin::Rule2XSBody
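
The uncommenting step can also be scripted. This is a minimal sketch, assuming the line is commented out with a single leading "#" and that GNU sed (with its -i option) is available; the function name enable_rule2xsbody is made up for this example.

```shell
#!/bin/sh
# enable_rule2xsbody FILE: strip the leading "#" from the Rule2XSBody
# loadplugin line. Hypothetical helper; assumes GNU sed (-i).
enable_rule2xsbody() {
    sed -i 's/^#[[:space:]]*\(loadplugin[[:space:]][[:space:]]*Mail::SpamAssassin::Plugin::Rule2XSBody\)/\1/' "$1"
}

# Usage, as root:
#   enable_rule2xsbody /etc/spamassassin/v320.pre
#   /etc/init.d/spamassassin restart   # so spamd loads the compiled rules
```

If you are running spamd, restart it afterwards so it picks up the plugin. The compiled library reflects the rules as they were when sa-compile ran, so run sa-compile again whenever the rules are updated.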
