A fairly common type of spam is now a very short message of the form:
- WordsJoinedUpLikeThis
- https://somedomain.com
This sort of message is very difficult to detect using content analysis techniques.
For the message to achieve its goal, the URL needs to point somewhere reasonably stable, so that the intended targets of the message can follow it (although such destinations typically do not stay around for long).
This gives a means to detect the spam, based on where a URL in the message points. SURBLs (Spam URI Realtime Blocklists) provide a means to do this. They use a technique similar to the well-known RBLs (Realtime Blocklists), which publish the locations of spam sources in the domain name system. RBLs are now largely ineffective, as spammers rapidly move location using botnets.
SURBL refers to both a technique and a specific deployed SURBL, run by surbl.org. The data that drives all of this comes from SpamCop's "Spamvertised Web Sites." Operational experience suggests this is currently very effective at removing spam that is not easily susceptible to other techniques.
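As a rough illustration of the lookup described above, here is a minimal Python sketch. It rests on several assumptions that are not from this post: the function names are made up for illustration, the zone name multi.surbl.org and the convention that a listed entry resolves to an address in 127.0.0.0/8 should be checked against the list operator's own documentation, and the "last two labels" rule for finding the registered domain is a simplification (a real deployment would need a proper list of two-level TLDs such as co.uk).

```python
import ipaddress
import re
import socket
from urllib.parse import urlsplit

# Deliberately simple URL matcher for illustration; real mail bodies need more care.
URL_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def hosts_in_body(body):
    """Return the distinct hosts (domains or IPs) found in links in the body."""
    hosts = []
    for url in URL_RE.findall(body):
        try:
            host = urlsplit(url).hostname
        except ValueError:  # malformed URL, e.g. a broken IPv6 literal
            continue
        if host and host not in hosts:
            hosts.append(host)
    return hosts


def query_name(host, zone="multi.surbl.org"):
    """Build the DNS name to look up for a host (zone name is an assumption)."""
    try:
        ip = ipaddress.IPv4Address(host)
    except ValueError:
        # Not an IPv4 address: treat as a domain name.
        # Simplification: take the last two labels as the registered domain.
        labels = host.lower().rstrip(".").split(".")
        key = ".".join(labels[-2:])
    else:
        # IPv4 addresses are queried with their octets reversed, as with RBLs.
        key = ".".join(reversed(str(ip).split(".")))
    return f"{key}.{zone}"


def is_listed(host, resolve=socket.gethostbyname):
    """True if the blocklist zone returns a 127.x.x.x answer for this host."""
    try:
        answer = resolve(query_name(host))
    except OSError:  # includes socket.gaierror: no record, so not listed
        return False
    return answer.startswith("127.")
```

For the sample message at the top of this post, hosts_in_body would yield somedomain.com, and is_listed would then look up somedomain.com.multi.surbl.org. The resolve parameter is there only so the lookup can be swapped out for testing without touching the network.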
... Steve Kille
One Comment
Another interesting technique spammers use to fool content analysis engines, and even humans, is to use legitimate links from search engine results, or high-reputation sites such as Flickr for spam images.
Amir,
This is right. Fortunately, it is straightforward in most cases to distinguish this sort of URL from human-generated URLs. Google seems to be a popular site for spammers to use with this technique.
Steve
Hi. I’m Rob McEwen, one of the SURBL administrators.
Good article.
I’d add that using SURBL in a generic sense is an interesting situation, and one that was probably intended by the creators of SURBL. (Just like Coca-Cola loves it when people generically refer to a soda as a “coke”.)
Likewise, uribl.com, another URI blacklist, is enjoying the fact that many people generically refer to such lists as a “uribl”, with uribl being short for “URI blacklist”.
Personally, I think the simplest and most direct label for such blacklists is simply “URI Blacklist”.
Since these do hit on the IPs and domains contained within links in the messages, “URI” is the appropriate term. The term “domain blacklist” is insufficient, since that would exclude the IP addresses contained within the links. Also, the term “URL blacklist” is inappropriate because only the actual IP or domain name is listed, not the full URL.
Additionally, some people confuse URI blacklists with RHSBLs (“right-hand-side” blacklists). They are different. RHSBLs block based on the domain used in the “from” address. In contrast, URI blacklists (such as surbl, uribl, and ivmuri) all block based on the domains and IPs extracted from clickable links found in the body of a message, ignoring everything in the header and SMTP envelope of the message. Sadly, many misuse URI blacklists and treat them like RHSBLs.
Anyway, good post. I hope this info adds to your page.