Future Tense

The Library of Congress Will Stop Archiving Every Tweet. Good.

Photo illustration by Slate. Photos by iStock, Twitter.

Taken as a whole, Twitter is a sort of Borgesian fever dream. Its users—some humans, some bots—send out hundreds of millions of messages a day. Some of those missives contribute to ongoing conversations, while many more go unread altogether. In aggregate, the volume is deafening, noise drowning out signal.

For that reason, it always seemed vaguely quixotic that the Library of Congress was set on archiving the platform as a whole. It first announced the project in 2010, and the effort continued in the years that followed, swelling to encompass 170 billion tweets by 2013. Soon, however, this improbable endeavor will finally end. And that’s almost certainly for the best.

As the institution explained in a blog post on Tuesday, it will cease to archive every new tweet starting in January 2018. Instead, it will then begin to “acquire tweets on a selective basis.” Elaborating on that shift in a separate white paper, the library explained that it would focus on gathering collections of “thematic and event-based [tweets], including events such as elections, or themes of ongoing national interest, e.g. public policy.”

Summing up the library’s rationale for its new policy, Gizmodo’s Matt Novak jokes, “Why is it stopping? Because tweets are trash now.” Unsurprisingly, the institution itself takes a more diplomatic tone. As it explains in its white paper, Twitter itself has changed, thanks to both the increasing number of tweets and the larger size of those messages. Further, the library now has what it initially set out to acquire: extensive documentation of the platform’s early years.

From an institutional perspective, this is arguably the important point. Like most archival institutions, the Library of Congress doesn’t collect for the sake of collecting. As I’ve written before, for example, its collection of internet folklore is highly selective, driven by the recommendations of scholars and researchers. Similar guidelines have driven its approach to other internet archiving projects. Its goal has never been to archive the web as a whole, only to preserve portions of it. With that in mind, continuing to archive all of Twitter as such seems largely unnecessary, and possibly even counterproductive if future scholars really do want to look into the platform’s rise. Hence its shift to more selective archiving, which will, as the white paper puts it, bring the library’s Twitter “collecting practice more in line with its [other] collection policies.”

In any case, the library still has several questions to resolve before its Twitter archive is even available for use. Among other things, the current all-encompassing collection isn’t set up to reflect the desires of Twitter users who delete past tweets or otherwise limit access to them, say by retroactively making their accounts private. Until the library can determine how to respond to these and other issues, the collection will remain inaccessible to researchers, who could, in any case, presumably still find much of the relevant material on Twitter itself.

Ultimately, if the collection is ever going to be truly useful, the library will have to grapple with some other concerns as well. At present, its Twitter collection captures only the text of a tweet, meaning that images and other data get left out. It’s also not clear whether metadata associated with a tweet, including dynamically shifting information such as the number of retweets and likes it receives, is included in the archive. Without such material, the collection could only ever be fragmentary, recording the way we spoke, but not the tone we took; what we spoke about, but not the ways we spoke with one another. As a result, it may present an actively distorted picture of the ways we used the social media platform.

By embracing a more selective collection strategy, the Library of Congress has the opportunity to resolve some of these concerns. In the process, it might find better ways to help us make sense of what we’ve been doing online precisely because it will be gathering less information about it. Here there may be a lesson for all of us: Perhaps we too would do well to be a bit more deliberate in how we attend to Twitter in 2018.