For more than a year, Gawker’s website has remained in a sort of stasis, a goodbye post from its founder, Nick Denton, adorning the top of the page. That might, however, soon change. In an article on the impending sale of the site—currently held by Gawker Media’s bankruptcy estate—the Wall Street Journal suggests that buyers might consider purchasing the site in order to shut it down for good.
From a certain perspective, the very possibility that such a development might play out is appalling. And yet, the immediacy of information on the internet often goes hand in hand with its instability: Now and again, articles vanish before crawlers can archive them. On poorly maintained sites, old interactives fall into obsolescence, rendering award-winning journalism inaccessible. And sometimes, maybe more often than we realize, people delete whole sites out of spite.
In centuries past, if an aggrieved plutocrat wanted to truly destroy a publication, he would have had to go to absurd lengths. Shutting an irksome newspaper down is one thing, but to remove all trace of its existence, you’d have to buy up every extant copy, including those still held in private hands. By contrast, eliminating Gawker’s supposed offenses might just be a matter of deleting the troublesome articles from its servers.
It’s a grim possibility, both because Gawker’s legal defeat was a blow to the free press and because the site itself is an important landmark of internet history. Bemoaning the prospect on Twitter, former Gawkerite Hamilton Nolan proposed that “a wealthy fan of journalism, the First Amendment, historic preservation, or speaking truth to power in general” should buy the site.
Here, one might reasonably wonder why such an intervention would be necessary at all. Couldn’t the Library of Congress do something? Doesn’t the Internet Archive already preserve records of such sites?
The Library of Congress, for one, does actively capture past versions of some websites, though it (like the Internet Archive) makes copies of sites rather than preserving the sites themselves. Further, the library strives neither to archive the web as a whole nor to pull items into its collection piecemeal. As it explains in a programmatic document on the topic, “This means that the usual practice is not to acquire individual web sites one‐by‐one, but as part of a named subject, event, or theme‐based collection.” Those collections tend to be selective and carefully constructed, encompassing topics such as specific Olympic games and past U.S. elections.
It would certainly be possible to make a case that Gawker should have a place in the library’s holdings. It is, after all, a key representative of the golden age of blogging, which might earn it a place in, for example, the American Folklife Center’s Web Culture collection, which I’ve written about in the past. The trouble is, it might not be possible for the library to capture the site, even if it wanted to. As it notes in an FAQ about its archiving program, it regularly “asks permission to archive or to provide off-site access to researchers.” The institution’s commitment to this process means that it sometimes can’t include certain sites within its collections, even if it would be ideal to incorporate them. Further, it’s possible that even if Gawker’s current owners agreed to let the library archive the site, future owners could revoke those permissions. Much the same might be true for the Internet Archive’s collections, though nothing of the kind appears to have happened in the past.
(I have inquired about the Library of Congress and the Internet Archive’s policies and will update if I hear back.)
Other problems arise when you turn to the Internet Archive’s Wayback Machine. While it has saved more than 17,000 images of Gawker over the past 14 years, most of those records are copies in the loosest possible sense. On my browser, at least, trying to visit some older images of the site produces what is effectively unintelligible gibberish. All the information is clearly there, but it doesn’t render in a way that lets me get to it.
Perhaps more importantly, it’s difficult (though not entirely impossible) to search for information in the Internet Archive’s cloned Gawker pages. Because of the way the Internet Archive crawls sites, Gawker’s own search engine sometimes won’t turn up results, or will only show you the first page of hits. There might be ways around these difficulties, but even if there are, the original experience of the page has been lost. Unless you know exactly what you’re looking for and when it was published, you’ll have a hard time finding anything on the site.
Together, these uncertainties and frustrations speak to why it’s important to preserve sites like Gawker in their original condition. While digital archives provide a valuable service—buffering against the total loss of the internet’s past and showing us sites as they once were—they can’t replace sites themselves.