BULLAMANKA-PINHEADS Archives

The listserv where the buildings do the talking

BULLAMANKA-PINHEADS@LISTSERV.ICORS.ORG

From: deb bledsoe
Reply To: A man of honor pays his debts with his own money. --DeGaulle
Date: Wed, 9 Jun 2004 07:03:12 -0400

From Salon, June 7, 2004

www.salon.com   - some content not available to non-subscribers....


[In Tech & Business]

Invasion of the spambots

By Sam Williams

For Lawrence Kestenbaum, the realization that a new species of
intelligent agent -- or "bot" -- was prowling the Internet first dawned
about two years ago.

It was about that time, Kestenbaum says, that a series of "fluke"
addresses started popping up in the HTTP referrer log of his personal
Web site, the historical cemetery database Political Graveyard.

"If you're at all concerned with how your Web site is being received,
you're almost compulsively checking the logs to see who's coming in and
from where," says Kestenbaum, laying the scene. "You get to know what
sites are linking to you. Anything new gets your attention."

Even more attention-grabbing, Kestenbaum adds, was the fact that the
fluke referrals came in bunches. Curious, Kestenbaum pasted in the URL
and went to look. His disappointment was immediate. Expecting something
interesting, he instead found a page filled with nothing but banner and
pop-up ads.

For a moment, Kestenbaum says, he suspected a glitch. How else could one
explain a dozen or so Internet browsers flipping directly from a site
boasting zero unpaid content to one documenting historical graveyards?
It didn't make sense.

"That's when I had this 'Aha' moment," says Kestenbaum. "I'd visited the
site because of the very technique they'd used to advertise it. Somebody
had taken the trouble to write a program that would plant strange links
in referrer logs knowing that the people curious enough to check those
logs would also be curious enough to follow the link."
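
A site owner who wants to spot this pattern without reading the logs by
hand could count referrers that arrive in bunches from unfamiliar hosts.
The Python sketch below is illustrative only and is not something
Kestenbaum describes using. It assumes a "combined"-style access log in
which the referrer is the quoted field after the status code and
response size; the names unfamiliar_referrers, known_hosts and min_hits
are hypothetical.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Assumed log layout: "REQUEST" STATUS SIZE "REFERRER" "USER-AGENT"
LOG_LINE = re.compile(r'"[^"]*" \d{3} \S+ "(?P<referrer>[^"]*)"')


def unfamiliar_referrers(log_lines, known_hosts, min_hits=3):
    """Return {host: hits} for referrer hosts outside known_hosts
    that appear at least min_hits times."""
    hits = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # line does not follow the assumed layout
        # hostname is None for an empty referrer such as "-"
        host = urlparse(match.group("referrer")).hostname
        if host and host not in known_hosts:
            hits[host] += 1
    return {host: n for host, n in hits.items() if n >= min_hits}
```

A burst of hits from a host the owner has never seen is only a hint, not
proof of spam; a new legitimate link would look the same at first.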

Scary as it may seem, spam is evolving. The automated, Web-spidering
technology that delivers bulk c1alis and vi@gra ads to your daily e-mail
in box has mutated into a dozen variants, targeting everything from
cellphones to blogs to instant messenger accounts. Feeding off the two
divergent trends in online publishing -- increased specialization of
content and increased generalization in the use of basic software tools
such as Google, AIM and Movable Type -- many of these mutations no
longer even demand your attention. In some cases, a place to hide in a
chat room or forum is the only thing they need.

"There are tons of ways to monetize any type of traffic you can get,"
notes Aaron Wall, author
of "The SEO Book," a newly published treatise on the art of
"search-engine optimization" and other traffic-boosting techniques. "The
indirect technique isn't as noticed yet, because so many people are
still fighting off the direct stuff," Wall says.

So-called indirect techniques vary. Aside from referrer-log spam -- the
general term for what happened to Kestenbaum's site in 2002 -- there's
"blog spam" (using bots to post unsolicited HTTP links in the "comment"
sections of blog listings), and chat-room spam. Recently, marketers have
even resorted to targeting wiki sites such as Wikipedia, taking
advantage of their anyone-can-edit policies.

"We've only been noticing it for six months," says Tim Starling, an
Australian Wikipedia contributor who has taken a leadership role in the
site's attempts to ward off the bot menace. "The bots will go through a
site and spam every page. They'll start with the smaller [non-English]
language versions, which aren't watched as closely. So it takes longer
to pick them up."

In each case, the goal isn't so much to solicit a purchase or confirm
receipt -- the tactic of most e-mail spam campaigns -- as to boost
visibility. With more than a third of all Internet search queries now
running through Google, site marketers have crafted their automated
campaigns with an eye to Google's PageRank algorithm, which factors the
total number of incoming links to a site as a sign of relevance.
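
To see why planted links pay off, consider a simplified, textbook-style
version of link-based ranking. The Python sketch below is an assumption
for illustration, not Google's actual system: the function pagerank, the
damping value of 0.85 and the toy page names are all hypothetical. It
shows a page's score rising once a handful of throwaway pages link to it.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    rank = {page: 1.0 / len(pages) for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / len(pages) for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its score among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / len(pages)
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Toy web: nobody links to "shop" until five planted pages do.
honest = {"a": ["b"], "b": ["a"], "shop": ["a"]}
spammed = dict(honest)
for i in range(1, 6):
    spammed[f"spam{i}"] = ["shop"]

print(pagerank(honest)["shop"], pagerank(spammed)["shop"])
```

In this toy graph the planted links raise the score of "shop" even
though none of the linking pages has any standing of its own.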

Although Google publishes clearly stated policies forbidding the use of
"link farms" -- sites that manipulate link totals as a way to boost
(and rent out) page ranks -- the percentage of offenders dropped
entirely from Google search listings is microscopically small.

That, says British SEO specialist Phil Craven, leaves plenty of room for
other people to push the envelope.

"If a search engine like Google can make link text so important, then
people are going to go out of their way to get link text," says Craven.
"So-called spamming is perfectly valid, if necessary."

Such words are tempered by Craven's own experience as a target of exotic
spam. As manager of the SEO forum Web Workshop, Craven says he recently
had to upgrade his site-registration system to ward off bots that had
been masquerading as human guests in an effort to deposit links in the
open forum and profile sections.

"Basically, the bot would come along and register five names at a time,"
says Craven. "The names always began with a non-alphanumeric character
and ended with a non-alphanumeric character, like a percentage symbol or
an exclamation point."
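
The naming pattern Craven describes is simple enough to check
mechanically. The Python sketch below is a hypothetical filter, not
Craven's code: it assumes "non-alphanumeric" means anything other than an
ASCII letter or digit, and the name looks_like_bot_name is invented for
the example.

```python
import re

# Assumed rule: the first and last characters are both something other
# than an ASCII letter or digit; anything may appear in between.
SUSPICIOUS_NAME = re.compile(r"[^A-Za-z0-9].*[^A-Za-z0-9]", re.DOTALL)


def looks_like_bot_name(username):
    """True if username starts and ends with a non-alphanumeric character."""
    return SUSPICIOUS_NAME.fullmatch(username) is not None
```

A check like this would catch only bots that keep using the same naming
habit, so it is a stopgap rather than a defense.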

To stop the bot, Craven simply modified the registration process,
forcing registrants to confirm their chosen username before getting the
usual welcome e-mail. The trick worked only because the bot's author,
knowing that most forum operators run their software in default security
mode, didn't bother accounting for such a variation.

"I can do that because I'm a programmer," Craven says. "A lot of forums
don't have programmers operating them and they simply wouldn't be able
to do it."

Such modifications are similar in their simplicity to the now-common
anti-spam technique of spelling out e-mail addresses using "at" and
"dotcom." The only thing keeping bot writers from anticipating the
trick, Wall says, is the level of effort. Currently, bot writers and
copiers find that there are enough newbie operators out there to serve
as unwilling page-rank boosters.
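
The spelled-out address trick can be sketched in a few lines of Python.
Everything here is an assumption for illustration: spell_out is a
hypothetical helper, and NAIVE_HARVESTER stands in for the kind of
simple pattern an unsophisticated address-harvesting bot might use.

```python
import re

# A deliberately simple pattern of the sort a basic harvester might use.
NAIVE_HARVESTER = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def spell_out(address):
    """Rewrite user@example.com as 'user at example dot com'."""
    local, _, domain = address.partition("@")
    if not local or not domain:
        raise ValueError("expected an address of the form local@domain")
    return f"{local} at {domain.replace('.', ' dot ')}"


plain = "Write to someone@example.com for details."
hidden = f"Write to {spell_out('someone@example.com')} for details."

print(NAIVE_HARVESTER.findall(plain))   # finds the address
print(NAIVE_HARVESTER.findall(hidden))  # finds nothing
```

As Wall's remark suggests, a bot writer willing to add one more pattern
could undo the disguise; it works only while few bother.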

"The main thing that's driving specialization is whatever's exploitable
and easy," Wall says. "Once it's no longer exploitable and easy, people
move on to something else."

To get a glimpse of innovation in the bot world, the best place to look,
as usual, is in the realm of adult entertainment.

"The adult industry will likely be married to spam and its attendant
distribution methods long past the evolution of man into beings of pure
energy," jokes Domenic Merenda, vice president of business development
for Edge Productions, a company that operates adult-media properties.

Merenda says his company doesn't resort to spam but admits to having
"rubbed elbows with the kingpins." The experience has given him a chance
to divide so-called porn bots into three major categories:
lead-generation bots, URL-proliferator bots and address-harvesting bots.

Of the three categories, lead-generation programs tend to be the most
sophisticated and most expensive. Unleashed on X- and R-rated chat-room
logs, they run through transcripts, seeking out the names and addresses
of the most active participants. Once acquired, these contacts become
fodder for third-party vendors eager to advertise webcams, escort
services and other variations on the adult-entertainment theme.

Aside from the obvious legal issues, such programs face a growing
hurdle: Many of the most active participants in public chat rooms
nowadays are other bots masquerading as human users, often for
commercial purposes.

To cut down on this practice, many chat rooms now use CAPTCHA, an
automated tool developed by computer scientists at Carnegie Mellon
University. Short for "completely automated public Turing test to tell
computers and humans apart," CAPTCHA is the chat-room equivalent of an
immune system T cell. It asks registrants to prove their non-bot status
by identifying a randomly generated word. Instead of displaying the word
as normal text, however, it displays it as a distorted image, usually
with a patterned background, a format that can befuddle even the most
sophisticated optical character recognition systems.

"We settled on something humans could do, but machines can't," says Luis
von Ahn, a Carnegie Mellon grad student and CAPTCHA project member.

Like the helper T cell, however, CAPTCHA is far from perfect. In 2002,
less than a year after the Carnegie Mellon group delivered a working
prototype of the CAPTCHA system, programmers at the University of
California were already claiming the ability to crack CAPTCHA-generated
images in Yahoo's e-mail account-registration system. Porn marketers,
meanwhile, have recruited eager users to beat the system. To gain entry
or special privileges on many sites, users identify CAPTCHA images piped
in by bots currently attempting to register fresh accounts.

If such ploys seem slightly Darwinian, maybe that's because the people
charged with designing them see the Internet in survival-of-the-fittest
terms.

When the referrer-log spam phenomenon first attracted attention two
years ago, Francois Lane, owner of the Canadian marketing firm
Mastodonte Communication, took credit for the outbreak while at the same
time disavowing any sense of guilt.

"I'm not too worried about my reputation," Lane wrote in response to
blogger complaints. "Marketing is all about being innovative, different,
adaptive, taking risks and knowing how to use the technology. I'm trying
to be all that."

- - - - - - - - - - - -

Sam Williams is a freelance reporter who covers software and
software-development culture. He is also the author of "Free as in
Freedom: Richard Stallman's Crusade for Free Software."

--
To terminate puerile preservation prattling among pals and the
uncoffee-ed, or to change your settings, go to:
<http://maelstrom.stjohns.edu/archives/bullamanka-pinheads.html>
