Re: [gutvol-d] Spam on PG lists?

24 Mar 2005

...
I don't want to turn this mailing list into a dspam vs SpamAssassin 
war, but I think your information about SA is out of date.
You're right, my information is a bit out of date: dspam is 
quite a bit ahead of SA now, further than I originally surmised (see 
further down). 

	But I agree, let's not turn this into a religious war.
...
SA v3 supports multi-tiered (e.g., global, domain, user) 
configurations, and has Bayesian filtering as one of several rules 
for determining spam.
Does SA support allowing the user to configure their own mail 
preferences via a simple web interface? Does it support adding and 
revoking tokens by simply sending the false-positives back through 
email, without involving a mail administrator? Sure, those things can 
be written, but do they come as part of the core package? Does that 
capability exist in the base engine?

	Incidentally, dspam supports the following, out of the box: 

	- Bayesian filtering
		- Graham Bayes
		- Burton Bayes
		- Noise Reduction
	- Robinson Geometric Mean calculation
	- Fisher-Robinson Inverse Chi-Square calculation
	- Robinson Combined P-Values
	- Chained Tokens
	- Neural Networking
	- Message Inoculation

	..and quite a bit more for filtering mail.

	Does SpamAssassin v3?
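
	For anyone who hasn't seen them, two of the combining 
calculations named above can be sketched in a few lines of Python. 
This is only the textbook form of the Graham and Fisher-Robinson 
formulas, not code taken from dspam; the function names and the 
clamping constant are my own assumptions.

```python
import math

EPS = 1e-6  # assumed clamp so log(0) and division by zero cannot occur


def _clamp(p):
    """Keep a per-token spam probability strictly inside (0, 1)."""
    return min(max(p, EPS), 1.0 - EPS)


def graham_combine(probs):
    """Graham-style naive Bayes combination of per-token probabilities."""
    probs = [_clamp(p) for p in probs]
    spam = math.prod(probs)
    ham = math.prod(1.0 - p for p in probs)
    return spam / (spam + ham)


def chi2q(x2, v):
    """Upper tail of the chi-square distribution for even v."""
    m = x2 / 2.0
    total = term = math.exp(-m)
    for i in range(1, v // 2):
        term *= m / i
        total += term
    return min(total, 1.0)


def fisher_robinson_combine(probs):
    """Fisher-Robinson inverse chi-square indicator in [0, 1]."""
    probs = [_clamp(p) for p in probs]
    n = len(probs)
    # Evidence that the message is spam
    s = 1.0 - chi2q(-2.0 * sum(math.log(1.0 - p) for p in probs), 2 * n)
    # Evidence that the message is ham
    h = 1.0 - chi2q(-2.0 * sum(math.log(p) for p in probs), 2 * n)
    return (1.0 + s - h) / 2.0
```

	In both cases a result near 1 suggests spam, a result near 0 
suggests ham, and the Fisher-Robinson score stays near 0.5 when the 
evidence is mixed or weak.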

	I'm glad that SA is now beginning to incorporate some of these 
things, and they've got a good base project to learn from. I've 
been very disappointed with SA, and dspam has already trounced it in 
our case, so we have no need to de-evolve to something that doesn't 
suit our needs. 

	Fewer than 10 spam messages total have reached any user's 
mailbox in over a year now (that we've been told about), and only a 
small handful of legitimate messages (ham) were wrongly caught as spam. 

	With the web interface, the user just sends them on to their 
normal account, and dspam scores them lower, so similar messages 
aren't caught in future. Works great, and I don't have to be involved 
in the mail management process _at all_ anymore.
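
	To illustrate the idea of revoking tokens when a false 
positive is sent back, here is a toy Python sketch. It is not how 
dspam stores or scores tokens internally; the TokenStore class, its 
method names, and the whitespace tokenizer are all hypothetical.

```python
from collections import Counter


class TokenStore:
    """Toy per-user token statistics (hypothetical, not dspam's format)."""

    def __init__(self):
        self.spam = Counter()
        self.ham = Counter()
        self.n_spam = 0
        self.n_ham = 0

    @staticmethod
    def tokenize(text):
        # Naive tokenizer: unique lowercase whitespace-separated words
        return set(text.lower().split())

    def learn(self, text, is_spam):
        """Record one message as spam or ham."""
        counts = self.spam if is_spam else self.ham
        counts.update(self.tokenize(text))
        if is_spam:
            self.n_spam += 1
        else:
            self.n_ham += 1

    def correct_false_positive(self, text):
        """Undo a wrong spam classification and relearn the message as ham."""
        for tok in self.tokenize(text):
            if self.spam[tok] > 0:
                self.spam[tok] -= 1
        self.n_spam = max(self.n_spam - 1, 0)
        self.learn(text, is_spam=False)

    def spam_probability(self, token):
        """Relative spam frequency of a token; 0.5 if never seen."""
        s = self.spam[token] / max(self.n_spam, 1)
        h = self.ham[token] / max(self.n_ham, 1)
        if s + h == 0:
            return 0.5
        return s / (s + h)
```

	After correct_false_positive() runs on a wrongly caught 
message, each of its tokens scores lower, which is the effect the 
user sees without a mail administrator getting involved.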
...
I'd also like to point out that being written in Perl does not imply 
that something is always much slower than C, especially when a large 
amount of regular expression pattern matching is involved.
True, poorly-written C can definitely be worse than Perl, but 
well-written C is ALWAYS going to be faster than equivalently written 
Perl. I don't think I've ever seen SA process 100 messages/sec., but 
dspam has no problem sustaining that rate, every day.
...
Perl developers have spent a lot of time optimizing its pattern 
matching. The SA Wiki suggests that if you find that SA is slow, you 
should examine the rule set you're using, and disable inappropriate 
rules (for example, ones requiring DNS lookups).
You're preaching to the choir here. I'm a very heavy user and 
supporter of Perl, and I use it for 99% of my tasks... but there are 
some cases where an interpreted language just can't compete with 
natively compiled object code.

	Anyway, good discussions all around. Use whatever tool fits 
your needs. In my case (heavy mail use from very disparate sources), 
dspam easily beat what SA could do, hands down, in terms of quality, 
speed, and flexibility. The added benefit is that now I don't have to 
micro-manage mail, whitelists, or rulesets anymore.

David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com