UPDATED: 1/8/2011
This article describes the free usage limits of the network test providers used by SpamAssassin, along with recommendations on whether paying for service is worthwhile for sites large enough to need a data feed. Recommendations are based on statistical data from SpamAssassin’s weekly masscheck, as collected at RuleQA.
It is important for SpamAssassin sysadmins to know the limits and usage restrictions of the various network test providers. If a provider decides that you are abusing its service, it might silently block your IP address. This can cause significant problems: mail delivery slows down as SpamAssassin waits for a DNS timeout during each mail scan, and the failed tests can cripple your spam filter.
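To make the cost of a blocked lookup concrete: a DNSBL test is an ordinary DNS query in which the octets of the IPv4 address are reversed and the list's zone is appended. The following is a minimal Python sketch of that naming convention only. The function name `dnsbl_query_name` is hypothetical, and no network lookup is performed.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name that is looked up for an IPv4 address in a DNSBL zone."""
    # IPv4Address validates the input and raises ValueError on bad addresses.
    addr = ipaddress.IPv4Address(ip)
    # DNSBL convention: reverse the octets, then append the zone name.
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone.strip('.')}"


if __name__ == "__main__":
    # 192.0.2.10 is a documentation-only address used purely for illustration.
    print(dnsbl_query_name("192.0.2.10", "zen.spamhaus.org"))
```

Every network test that SpamAssassin runs against a DNSBL results in at least one query of this shape, which is why a silently blocked resolver translates into a timeout on each scanned message.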
COMING SOON: The next article will describe troubleshooting and testing procedures to determine if you have been blocked by a network test provider. If you have been blocked, you probably want to disable that rule in order to avoid timeout delays.
NOTE: AT&T Business DNS note. AHBL not accessible from China.
Spamhaus.org: RCVD_IN_PBL, RCVD_IN_SBL, RCVD_IN_XBL
Spamhaus is the undisputed best provider of DNSBL service. It is generally safe to outright block MTA connections from hosts listed in zen.spamhaus.org. Their free limits of 100,000 SMTP connections/day or 300,000 DNS queries/day are very generous, but if your usage exceeds those limits, this is perhaps the most worthwhile provider to pay for a data feed.
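As a rough illustration of checking your own volume against those numbers, here is a minimal Python sketch. The constant and function names are hypothetical, the thresholds are simply the figures quoted above, and treating "exceeds either limit" as over the free tier is an assumption; the provider's current terms are what actually apply.

```python
# Figures quoted in this article; verify against the provider's current terms.
SPAMHAUS_FREE_SMTP_PER_DAY = 100_000
SPAMHAUS_FREE_DNS_PER_DAY = 300_000


def exceeds_spamhaus_free_limits(smtp_per_day: int, dns_per_day: int) -> bool:
    """Return True if either daily count is above the quoted free limits.

    Assumption: going over either threshold counts as exceeding free use.
    """
    return (
        smtp_per_day > SPAMHAUS_FREE_SMTP_PER_DAY
        or dns_per_day > SPAMHAUS_FREE_DNS_PER_DAY
    )


if __name__ == "__main__":
    # Example daily counts, e.g. taken from your MTA and resolver logs.
    print(exceeds_spamhaus_free_limits(smtp_per_day=40_000, dns_per_day=350_000))
```

The daily counts themselves have to come from your own MTA and DNS resolver logs; how to extract them depends on your setup.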
PSBL.org: RCVD_IN_PSBL
No limits, but the rsync data feed is free, so you might as well run your own rbldnsd mirror if you handle more than 100k mails per day.
Spamcop.net: RCVD_IN_BL_SPAMCOP_NET
No limits posted, but their data feed costs $1,000/yr. If you are cut off from DNS, I’m unsure whether buying their data feed is worthwhile, as their safety rating has consistently been mediocre.
Barracuda Reputation Block List: RCVD_IN_BRBL_LASTEXT
No apparent limits are posted, but they request that you register your IP addresses to enable DNSBL access. I don’t know if they enforce that registration requirement, as SpamAssassin uses this BL by default and most sysadmins are certainly unaware of the requirement. In any case, it seems they have no data feed available.
AHBL: DNS_FROM_AHBL_RHSBL
They ask that you contact them if you use more than 100k queries per day. But apparently this RHSBL is detecting so little spam (0.02% in recent counts) that you are probably better off disabling it to save yourself a network query during every scan.
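In SpamAssassin, a rule is normally disabled by setting its score to 0 in a local configuration file such as local.cf. The following minimal Python sketch generates those lines for a list of rule names. The helper name `disable_rule_lines` is hypothetical, and where the output should be written (the path of your local.cf) is an assumption that depends on your installation.

```python
def disable_rule_lines(rule_names):
    """Return configuration lines that set each named rule's score to 0."""
    lines = []
    for name in rule_names:
        # Accept only plain rule names (letters, digits, underscores).
        # Wildcards such as RCVD_IN_SORBS_* are rejected: each rule must be
        # listed by its full name.
        if not name.replace("_", "").isalnum():
            raise ValueError(f"not a plain rule name: {name!r}")
        lines.append(f"score {name} 0")
    return lines


if __name__ == "__main__":
    # Rule name taken from this article.
    print("\n".join(disable_rule_lines(["DNS_FROM_AHBL_RHSBL"])))
```

After adding such lines, run `spamassassin --lint` to check the configuration and restart spamd (or whatever daemon calls SpamAssassin) so the change takes effect.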
SORBS: RCVD_IN_SORBS_*
No published limits, although the bottom of this page does suggest you ask for data feed access.
NJABL.org: RCVD_IN_NJABL_*
RFC-Ignorant.org: DNS_FROM_RFC_*
Their sites do not seem to indicate any usage limits, but with poor 1% spam detection rates and high overlap, you may want to consider disabling them if you want to squeeze out a tiny efficiency improvement.
DNSWL.org: RCVD_IN_DNSWL_*
Their limit is 100k DNS queries/day. In my opinion DNSWL is the most useful of the whitelist providers. Keep in mind, however, that whitelist providers in general are not necessary for SpamAssassin’s safe operation, though they do help slightly.
ReturnPath: RCVD_IN_RP_*
The page describing their SAFE and CERTIFIED lists states that the limits are 100k/day.
SuretyMail: RCVD_IN_IADB_*
No limits and free data feed.
SURBL: (Most of the URIBL_* rules.)
250k limit per day. If you are a large site, it is well worth the price for a data feed. URIBLs are powerful, effective, and safe.
Vipul’s Razor: RAZOR2_CF*
No mention of limits, but this page mentions Cloudmark for large deployments.
Pyzor: PYZOR_CHECK
No mention of limits, but if you are a large site you should probably look into running your own Pyzor server.
DCC: DCC_*
100k limit per day; beyond that, you should run your own DCC server and join the global network.