Most of you are fighting a lot of spam in your e-mail inboxes. Some of you have filters that help sort out the spam. Some of you even have ISPs that are trying to deal with the problem. Watching the spam, though, is like seeing a battle unfold inside your inbox between the spammers and the filterers.
One of the obvious filtering strategies is to look for key words: Viagra, mortgage, etc. That's why the spammers have taken to coding the words with such disguises as V1agra and m0rtgage. Such tactics serve two purposes: they evade simple filters, and they can still be read by the recipient. The only thing a filter software company can do is hire someone to add all these variants to its hunt-and-kill list, or get its users to contribute to these lists (a la Cloudmark). Obviously this is an area where the human brain does a lot better at "error correction" than the automated filters, and the user participation is very effective. As many have pointed out, all this coded product and service naming makes the advertised products look a lot less professional, so there is probably a loss of efficiency for the advertiser there. But at fractions of a cent per e-mail, and with millions of targets, these are obviously still useful methods.
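To make the keyword game concrete, here is a minimal sketch in Python of a filter that tries to undo the most common character swaps before checking a word list. The substitution table, the keyword list, and the function names are all made up for illustration; this is not how Cloudmark or any particular product works.

```python
import re

# Hypothetical table mapping look-alike characters back to letters.
LOOKALIKES = str.maketrans({
    "0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "@": "a", "$": "s",
})

# Hypothetical hunt-and-kill list; a real one would be far longer.
KEYWORDS = {"viagra", "mortgage"}


def normalize(text):
    """Lowercase the text and reverse the look-alike substitutions."""
    return text.lower().translate(LOOKALIKES)


def flagged_keywords(text):
    """Return the listed keywords found in the text after normalization."""
    words = re.findall(r"[a-z]+", normalize(text))
    return sorted(set(words) & KEYWORDS)


print(flagged_keywords("Cheap V1agra and low m0rtgage rates!"))
```

A table like this only catches the disguises someone has already thought of. It also mangles legitimate digits (a time like "10am" is turned into "ioam"), and a spammer can simply move on to V.i.a.g.r.a or Vlagra, which is why the human-maintained lists still matter.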
Other filter strategies include categorizing senders as spammers, and blocking all mail from particular addresses. But, as a lot of the latest viruses illustrate, sending e-mail with a fake origin address is all too easy. People send e-mail that looks like it's coming from email@example.com, firstname.lastname@example.org, and even email@example.com all the time (although I'd be reeeally cautious about that last one).
One particular tactic I have seen lately is to add words from dictionaries to the spam. Often they are hidden simply by using a white typeface so that they don't show against a white background (a tactic that originated in websites trying to increase their ranking in search engines). Lately, though, I have noticed that many of these seem to be using some kind of program to generate random sentences, rather than simple lists of rather odd words. I would assume that this is an attempt to emulate a real e-mail message with other content, since filters are now looking at syntax as a detection tool.
These results can often be quite amusing. That's where I got "Sandwich dilettantes" from. It has a certain ring to it, don't you think? There has to be a rock band out there that got its inspiration for its name from this kind of thing. "Any pit viper can make love" is another.
Can you imagine a future where spam is given credit for driving intelligent machine recognition of language? I can. It's happening right now. People are paying a lot of money to keep their inboxes tidy.