The Hive

Every Day We Write the Book

What would happen if Facebook made its data available for research?

Facebook is a data supernova. According to the site’s official stats page, there are more than 500 million users, with the average user creating 90 pieces of content a month. Previously in this Hive, I discussed how my iPhone’s camera has nudged me into better documenting my life. For even more of us, the status update is a fossil record of our existence: TV shows we watched, stuff that occurred to us, articles we’ve read, family photos, FarmVille accomplishments, wedding announcements, and on and on without end. I was curious who was looking at this data and what larger trends they discovered.

Our first stop is Openbook. The site lets you search public Facebook updates and was created to demonstrate how FB’s privacy settings are confusing: People don’t realize how widely they are sharing personal information. And, indeed, when you do a search like “cheated on my wife,” you discover updates that would’ve been better left in the privacy of one’s own mind. Same with “my boss sucks.”

As you move beyond obvious “gotcha” searches, the vastness, weirdness, and potential usefulness of Facebook become even more apparent. A search like “brushing my teeth” reveals the amazing variety of pop music that launches people into their day. It would satisfy a small curiosity to map the status updates about UFO sightings, and I could imagine tech-happy CNN showing where love for President Obama is currently cresting. I also like doing lunchtime marketing research on how people feel about organic food, or comparing the patrons of Pizza Hut with those of Taco Bell.

But there is a more serious type of analysis to accomplish. It would be helpful for transportation planners to know the places where people complain the most about traffic. Educators could use sentiment analysis to see how a community feels about its local schools. The writer Marshall Kirkpatrick has called for Facebook to open up its data for research. He points to the fact that the discriminatory practice of redlining was discovered “when both U.S. Census information and real estate mortgage loan information were made available for bulk analysis.” And he rightly speculates that “patterns of comparable importance” could be found in Facebook’s enormous social graph.

Facebook’s challenge is to leverage that social graph in a way that doesn’t alienate us all. The site analyzes us for the benefit of its advertisers but offers only limited peeks at what its engineers are capable of. The Facebook Data Team, for example, tries to measure how happy people are on Facebook each day with the Gross National Happiness Index. The index tracks the numbers of positive and negative words in status updates. In America, we just hit a happiness peak on Thanksgiving Day; Mother’s Day is a distant second. (Fun fact: We are happier on the day of the Super Bowl than we are on Easter.) The data team has also analyzed the diversity of its U.S.-based users and voter turnout trends in the recent election.
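For the curious, the word-counting idea behind a happiness index can be sketched in a few lines of Python. This is only an illustration of the general technique, not Facebook’s method: the word lists, the `daily_scores` function, the scoring formula, and the sample updates are all made up for the example.

```python
import re
from collections import defaultdict

# Hypothetical word lists for illustration; not the Facebook Data Team's lexicon.
POSITIVE = {"happy", "love", "great", "thankful", "awesome"}
NEGATIVE = {"sad", "hate", "awful", "angry", "sucks"}


def daily_scores(updates):
    """Score each day as (positive words - negative words) / total words."""
    counts = defaultdict(lambda: [0, 0, 0])  # positive, negative, total
    for day, text in updates:
        for word in re.findall(r"[a-z']+", text.lower()):
            tally = counts[day]
            tally[0] += word in POSITIVE
            tally[1] += word in NEGATIVE
            tally[2] += 1
    return {
        day: (pos - neg) / total
        for day, (pos, neg, total) in counts.items()
        if total
    }


# Invented status updates, keyed by date.
sample = [
    ("2010-11-25", "So thankful for my family, love you all"),
    ("2010-11-25", "Great turkey, happy day"),
    ("2010-11-29", "Traffic is awful and my boss sucks"),
]
scores = daily_scores(sample)
```

In this toy data the Thanksgiving updates score above zero and the Monday commute scores below it. A real index would presumably need a much larger vocabulary and some way of handling negation and sarcasm.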

The larger trend here is that Facebook keeps very close tabs on its information. The poster boy for FB’s data hoarding is entrepreneur Pete Warden. He built his own database of 210 million publicly available Facebook profiles and created a whimsical map of the United States that divided the country into regions like the “Nomadic West” and “Socalistan,” based on where people’s friends were likely to be located. His widely circulated Fan Page Analytics showed, say, what things people who liked NPR also liked, or the top states for Megan Fox lust. Warden’s plan was to make his data available to researchers, but he was threatened with a lawsuit by Facebook, and that was that. (Be sure to look at Warden’s new project, OpenHeatMap.)

A basic hurdle with self-tracking or a volunteer data collection project like Track Your Happiness is simply getting into the habit of collecting the numbers. Facebook is a natural platform for these efforts, and I know that many “quantified self” tools are integrated with the site. Tell me more about how you’ve tried using Facebook for, say, fitness or goal-keeping projects. I’d also like to hear more ideas about how we could respectfully (i.e., anonymously) analyze Facebook data for the benefit of all. Finally, please link to great Facebook data visualizations that I may have missed, like this one on when break-ups are most likely to happen.

Slate is looking for great ways that we can collect and analyze data to improve our lives. You can submit your idea from now through Friday, Dec. 3. We’ll be tracking your most interesting ideas throughout the month. And don’t forget to vote on the proposals you like best. In early December, we’ll take a closer look at the three top-vote-getting ideas and write about them.