Microsoft Cans Search Engine Spam

March 21, 2007

Search engine spam. It’s a problem, not only for the search engines, but for the frustrated users who are trying to find what they’re looking for. In the past, shady search engine optimization firms have used techniques like doorway pages and cloaking to get pages ranked in the SERPs. However, according to the New York Times Technology article, “Researchers Track Down a Plague of Fake Web Pages,” Microsoft is now saying that it has found a way to combat search engine spam, along with methods for detecting the companies behind it.

The researchers’ paper, “Spam Double-Funnel: Connecting Web Spammers with Advertisers,” outlines a five-layer double-funnel model for analyzing redirection spam. In this model, ads from merchant advertisers are funneled through a number of syndicators, aggregators and redirection domains to get displayed on spam doorway pages, while click-through traffic from these spam ads is funneled in the reverse direction, through aggregators and syndicators, to reach the advertisers. The domains in the middle layers provide the critical infrastructure for converting spam traffic into money.
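For readers who think in code, here is a rough sketch of those five layers as an ordered chain. The layer order follows the description above; the domain names are made-up placeholders, and the function names are purely illustrative, not anything taken from the paper.

```python
# The five layers, ordered from the page a searcher lands on
# to the merchant who ultimately pays for the click.
LAYERS = [
    "doorway page",
    "redirection domain",
    "aggregator",
    "syndicator",
    "advertiser",
]

# Hypothetical example chain; every domain here is a placeholder.
example_chain = {
    "doorway page": "spam-doorway.example.com",
    "redirection domain": "redirector.example.net",
    "aggregator": "aggregator.example.org",
    "syndicator": "syndicator.example.com",
    "advertiser": "merchant.example.com",
}


def click_path(chain):
    """Route a click-through takes: doorway page toward the advertiser."""
    return [f"{layer}: {chain[layer]}" for layer in LAYERS]


def ad_path(chain):
    """Route an ad takes: advertiser toward the doorway page (the reverse)."""
    return list(reversed(click_path(chain)))


if __name__ == "__main__":
    print("Ad flow:")
    for hop in ad_path(example_chain):
        print("  ", hop)
    print("Click flow:")
    for hop in click_path(example_chain):
        print("  ", hop)
```

The two functions walk the same chain in opposite directions, which is the "double funnel" idea: ads flow down to the doorway pages, and clicks (and money) flow back up.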

“A small number of rogue actors who know what they are doing can create an enormous amount of disruption,” said David L. Sifry, chief executive of Technorati, a blog-indexing company that works to keep junk pages of this sort out of its indexes. Until now, these shady operators have been hiding behind the scenes, and they have succeeded mainly because search engine spiders typically don’t crawl the JavaScript that these spammers hide behind.

Using ‘search monkey’ programs, Microsoft claims that it can emulate a real person’s click-through activity and track the offending spam back to its operators. Researchers noted that the vast bulk of the junk listings came from just two Web hosting companies and that as many as 68 percent of the advertisements sampled were placed by just three advertising syndicators.
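Once each spam ad has been traced back through its redirection chain, a concentration figure like the one the researchers report is a simple tally. The sketch below shows that arithmetic only; the sample records are invented for illustration and are not the researchers’ data, and `top_share` is a hypothetical helper name.

```python
from collections import Counter


def top_share(traced_ads, key, n):
    """Return the fraction of traced ads accounted for by the n most
    common values of `key`, plus the names of those top values."""
    if not traced_ads:
        raise ValueError("no traced ads to tally")
    counts = Counter(ad[key] for ad in traced_ads)
    top = counts.most_common(n)
    covered = sum(count for _, count in top)
    return covered / len(traced_ads), [name for name, _ in top]


# Invented sample: ten traced ads and the syndicator each was traced to.
sample_ads = (
    [{"syndicator": "syndicator-a.example"}] * 4
    + [{"syndicator": "syndicator-b.example"}] * 2
    + [{"syndicator": "syndicator-c.example"}]
    + [{"syndicator": "syndicator-d.example"}]
    + [{"syndicator": "syndicator-e.example"}]
    + [{"syndicator": "syndicator-f.example"}]
)

if __name__ == "__main__":
    share, names = top_share(sample_ads, "syndicator", 3)
    print(f"Top 3 syndicators placed {share:.0%} of sampled ads")
```

The same helper could be pointed at a different field, such as the hosting company behind each doorway page, to see how few operators sit behind most of the junk.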

It should be interesting to see what the search engines do with the information.