By John Borland 08.14.07 | 2:00 AM
graduate student Virgil Griffith built a search tool that traces IP addresses
of those who make Wikipedia changes. Photo: Jake
On November 17th, 2005, an anonymous Wikipedia user deleted 15 paragraphs
from an article on e-voting machine-vendor Diebold, excising an entire section
critical of the company's machines. While anonymous, such changes typically
leave behind digital fingerprints offering hints about the contributor, such as
the location of the computer used to make the edits.
In this case, the changes came from an IP address reserved for the corporate
offices of Diebold itself. And it is far from an isolated case. A new
data-mining service launched Monday traces millions of Wikipedia entries to
their corporate sources, and for the first time puts comprehensive data behind
longstanding suspicions of manipulation, which until now have surfaced only
piecemeal in investigations of specific allegations.
Wikipedia Scanner -- the brainchild of Caltech computation and neural-systems
graduate student Virgil
Griffith -- offers users a searchable database that ties millions of anonymous
Wikipedia edits to organizations where those edits apparently originated, by
cross-referencing the edits with data on who owns the associated block of
internet IP addresses.
Inspired by news last year that Congress members' offices had been editing
their own entries, Griffith says he got curious, and wanted to know whether big
companies and other organizations were doing things in a similarly
self-interested way.
"Everything's better if you do it on a huge scale, and automate
it," he says with a grin.
This database is possible thanks to a combination of Wikipedia policies and
(mostly) publicly available information.
The online encyclopedia allows anyone to make edits, but keeps detailed logs
of all these changes. Users who are logged in are tracked only by their user
name, but anonymous changes leave a public record of their IP address.
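That distinction is visible in Wikipedia's XML page exports, where each revision's contributor is recorded either as a user name or as an IP address. A minimal sketch of separating the two, assuming a simplified export: the sample page, the user name, the timestamps and the address (taken from a range reserved for documentation) are all invented, and the XML namespace that real exports declare is left out for brevity.

```python
import xml.etree.ElementTree as ET

# Invented sample shaped like a page export; real exports also declare
# an XML namespace on the root element, which is omitted here.
SAMPLE = """
<page>
  <title>Example article</title>
  <revision>
    <timestamp>2007-01-01T00:00:00Z</timestamp>
    <contributor><username>ExampleUser</username><id>42</id></contributor>
  </revision>
  <revision>
    <timestamp>2007-01-02T00:00:00Z</timestamp>
    <contributor><ip>192.0.2.17</ip></contributor>
  </revision>
</page>
"""


def anonymous_edits(xml_text):
    """Return (title, timestamp, ip) for every revision made without logging in."""
    page = ET.fromstring(xml_text.strip())
    title = page.findtext("title")
    edits = []
    for revision in page.iter("revision"):
        ip = revision.findtext("contributor/ip")
        if ip is not None:  # logged-in edits carry <username> instead of <ip>
            edits.append((title, revision.findtext("timestamp"), ip))
    return edits


print(anonymous_edits(SAMPLE))
```

Only the second revision is returned, because the first names a logged-in user rather than an address.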
The organization also allows downloads of the complete Wikipedia, including
records of all these changes.
Griffith thus downloaded the entire encyclopedia, isolating the XML-based
records of anonymous changes and IP addresses. He then correlated those IP
addresses with public net-address lookup services such as ARIN, as well as
private domain-name data provided by IP2Location.com.
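The correlation step can be pictured as a lookup of each anonymous edit's address against a table of address blocks and their registered owners. The following is a rough sketch under stated assumptions, not Griffith's actual code: the ownership table, the organization names and the article titles are hypothetical, and the addresses come from ranges reserved for documentation.

```python
import ipaddress

# Hypothetical ownership table. In practice the blocks and owners would
# come from registry data such as ARIN's, not from a hard-coded list.
BLOCKS = [
    ("192.0.2.0/24", "Example Corp"),
    ("198.51.100.0/24", "Example University"),
]

NETWORKS = [(ipaddress.ip_network(cidr), owner) for cidr, owner in BLOCKS]


def owner_of(ip_text):
    """Return the owner of the first block containing ip_text, or None."""
    address = ipaddress.ip_address(ip_text)
    for network, owner in NETWORKS:
        if address in network:
            return owner
    return None


def group_by_owner(edits):
    """Map each known owner to the articles edited from its address blocks."""
    grouped = {}
    for article, ip_text in edits:
        owner = owner_of(ip_text)
        if owner is not None:
            grouped.setdefault(owner, []).append(article)
    return grouped


# Invented (article, ip) pairs standing in for anonymous edit records.
edits = [
    ("Example article", "192.0.2.17"),
    ("Another article", "198.51.100.200"),
    ("Third article", "203.0.113.5"),  # not in any listed block
]
print(group_by_owner(edits))
```

A linear scan over the table is enough for illustration; with millions of edits, a sorted or tree-based range index would be the more likely choice. Note that a match only shows that an edit came from an organization's address block, not who at that address made it.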
The result: A database of 34.4 million edits, performed by 2.6 million
organizations or individuals ranging from the CIA to Microsoft to Congressional
offices, now linked to the edits they or someone at their organization's net
address has made.
Some of these edits appear to be transparently self-interested, either adding
positive, press release-like material to entries, or deleting whole swaths of
critical material.
Voting-machine company Diebold provides a good example of the latter, with
someone at the company's IP address apparently deleting long paragraphs
detailing the security industry's concerns over the integrity of its voting
machines, and information about its CEO's fund-raising for President Bush.
The text, deleted in November 2005, was quickly restored by another
Wikipedia contributor, who advised the anonymous editor, "Please stop
removing content from Wikipedia. It is considered vandalism."
A Diebold Election Systems spokesman said he'd look into the matter but
could not comment by press time.
Edits from Wal-Mart's net address include a series of relatively small changes
in 2005 that burnish the company's image on its own entry while often leaving
criticism in place. One, for example, changes a line saying that its wages are
lower than those at other retail stores to a note that it pays nearly double
the minimum wage. Another leaves activist criticism on community impact intact,
while citing a "definitive" study showing Wal-Mart raised the total number of
jobs in a community.
As has been previously reported, politicians' offices are
heavy users of the system. Former Montana Sen. Conrad Burns' office, for
example, apparently changed one critical paragraph headed "A controversial
voice" to "A voice for farmers," with predictably image-friendly
content following it.
Perhaps interestingly, many of the most apparently self-interested changes
come from before 2006, when news of the Congressional offices' edits reached
the headlines. This may indicate a growing sophistication with the workings of
Wikipedia over time, or even the rise of corporate Wikipedia policies.
Wikipedia founder Jimmy Wales told Wired News he was aware of the new
service, but needed time to experiment with it before commenting.
The vast majority of changes are fairly innocuous, however. Employees at the
CIA's net address, for example, have been busy -- but with little that would
indicate their place of apparent employment, or a particular bias.
One entry on "Black September in Jordan" contains wholesale additions, with
specific details that read like a popular history book or an eyewitness'
account.
Many more are simple copy edits, or additions to local town entries or
school histories. One CIA entry deals with the details of lyrics sung in a Buffy
the Vampire Slayer episode.
Griffith says he launched the project hoping to find scandals, particularly
at obvious targets such as Halliburton. But there's a more
practical goal, too: By exposing the anonymous edits that companies such as
drug makers and big pharmaceutical firms make in entries that affect their
businesses, it could help experts check up on the changes and make sure they're
accurate, he says.
For now, he has just scratched the surface of the database of millions of
entries. But he's putting it online so others can look too.
The nonprofit Wikimedia Foundation, which runs Wikipedia, did not respond to
e-mail and telephone inquiries Monday.