In the tech policy world, antitrust is on everyone’s minds, and breaking up Big Tech is on everyone’s lips. For those looking for another way to fix tech’s competition problem, one idea keeps popping up. Mark Zuckerberg named it as one of his “Four Ideas to Regulate the Internet.” Rep. David Cicilline, a Democrat from Rhode Island and chairman of the House Judiciary Committee’s antitrust subcommittee, said it could “give power back to Americans.” It’s already enshrined as a right in the European Union as part of the General Data Protection Regulation, and in California’s new Consumer Privacy Act as well.
The idea is data portability: the concept that users should be able to download their information from one platform and upload it to another. That way, the theory goes, people can more easily try new products, and startups can jump-start their products with existing user data. The family group chat can move off of WhatsApp without leaving behind years of data. Members of the anarcho-socialist Facebook group can bring their conversations with them and take their Marxist memes with them. A whole new world can flourish off of years of built-up data. It’s competition without the regulatory and technological headache of breaking up companies.
But data portability might not be the regulatory golden goose the private and public sectors hope it is. It’s not even a new idea: Facebook has allowed users to export their data through a “Download Your Information” tool since 2010. Google Takeout has been around since 2011. Most major tech companies introduced some form of data portability in 2018 to comply with GDPR. Yet no major competitors have been built from these offerings. We sought to find out why.
To do this, we focused our research on Facebook’s Download Your Information tool, which allows users to download all of the information they have ever entered into Facebook. We showed the actual data Facebook makes available in this tool to the people we would expect to use it to build new competitors—engineers, product managers, and founders. Consistently, they did not feel that they could use it to create new, innovative products.
Just by looking at the sheer volume of data Facebook makes available, it’s hard to believe this is true. The Download Your Information export includes dozens of the user’s files, containing every event attended, comment posted, page liked, and ad interacted with. It also is a stark reminder of just how many features Facebook has (a fully fledged payments platform! Something called “Town Hall”!) and how many have been retired (remember pokes?). When Katie Day Good got her data from Facebook, the PDF ran to 4,612 pages.
But the people we interviewed—the ones who might actually make use of all this information—noted some serious shortcomings in the data. A user can download a comment made on a status, but not the original status or its author (at least in a way useful for developers). A user can get the start time and name of an event attended, but not the location or any fellow attendees. Users can get the time they friended people, but little else about their social graphs. Time and time again, Facebook data was insufficient to re-create almost any of the platform’s features.
From a privacy perspective, these shortcomings make sense. Facebook draws a hard line around what it considers one user’s data versus another’s in order to ensure that no one has access to information not their own. Sometimes, though, the hard line makes the data less useful to competitors. Information falls in the gaps, leaving conversations unable to be reconstructed, even if both sides upload their data. Facebook mused extensively on the privacy trade-offs involved in data portability in a white paper published in September. It concluded, more or less, that there need to be more conversations on this subject. (Mark Zuckerberg himself has given a similar line about data portability since as early as 2010.)
Conversations aside, there is some low-hanging fruit to make current data portability options more useful for competitors and easier for users. Almost no platforms we looked at gave any sense of what downloaded data might actually look like, and without this kind of documentation, developers would have a hard time incorporating this data into any real products. The process of actually downloading data could also be improved. Currently, many platforms hide their data exports deep in menus, limit how frequently users can download their data, and take a long time to make the data accessible. Spotify, for example, can take up to 30 days to create its data export.
One-user-at-a-time data portability might also be the wrong approach. On social platforms, users want to be where their friends are, and portability pioneers may find themselves on barren networks. But alternative forms of data portability might address this problem and work better for competition. For example, platforms could allow users to move their data in coordinated groups. The family WhatsApp could agree to move to Vibe all at once, or the anarcho-socialist Facebook group could put it to a vote. Similarly, open and continuous integration may be more effective than one-time data transfers. There is room for the kind of experimentation and innovation Silicon Valley is famous for.
Even with all of these improvements, data portability is in danger of being a waste of time. It has all the trappings of a radical, win-win way to increase competition on the internet, but when put into practice, it has so far fallen short. It might work for nonsocial applications, like music streaming or fitness apps, but as of now it acts as a distraction from proposals for more systemic integration, including those put forward as part of the Senate’s recent ACCESS Act. Data portability is just one narrow tool to improve competition in the tech sector—and it’s an Allen wrench, not a Swiss Army knife.