The Prose of the Machines

“Robots” are surprisingly good at writing news stories, but humans still have one big edge.

Here are the opening paragraphs of a pair of business stories. One was written by a human, one by a computer. See if you can tell which is which.

Story No. 1:

BURBANK, Calif. (AP) — The Walt Disney Co. (DIS) reported a 33 percent increase in its fiscal first-quarter net income, beating analysts’ estimates.

Disney, which is based in Burbank, California, earned $1.84 billion in the quarter, up from $1.38 billion in the same period a year ago. Per-share earnings climbed to $1.03 from 77 cents.

The average estimate of analysts surveyed by Zacks was 92 cents per share.

Revenue rose 9 percent to $12.31 billion from $11.34 billion. Analysts expected $11.8 billion.

Story No. 2:

LOS ANGELES (AP) — Disney on Tuesday posted second-quarter earnings that beat Wall Street forecasts, helped by the home video sales of blockbuster movies “Frozen” and “Thor: The Dark World.”

Both films showed the power of buying multibillion-dollar content brands. “Thor” comes from Disney’s $4 billion purchase of Marvel Entertainment in 2009. “Frozen” was a direct result of adding creative talent from Pixar after Disney bought it for $7.4 billion in 2006.

The difference is fairly obvious, right? The second report was written by AP business reporter Ryan Nakashima. The first was composed by a bot. The human-written earnings story feels more natural, and it weaves the “why” into the lede, whereas the bot’s report is limited to the “who,” “what,” “where,” and “when.”

Nevertheless, one of the world’s largest news organizations is letting the bots take over the task, at least for certain types of articles. The Associated Press announced last month that “the majority of U.S. corporate earnings stories for our business news report will eventually be produced using automation technology.” In the coming weeks, its first machine-written articles, produced with software supplied by the Durham, North Carolina–based startup Automated Insights and corporate earnings data from Zacks Investment Research, will go live on the AP’s global news wires.

Automated Insights’ software platform, called Wordsmith, is already being used by Edmunds to generate car descriptions and by Yahoo to write personalized fantasy football recaps for millions of users. Forbes is publishing earnings previews using software from a rival startup, Chicago-based Narrative Science.

The rise of automated journalism has been greeted by some reporters with fear and disgust. They fret that “robots” will take their jobs or depress their salaries. Others, noting the dry and, well, robotic quality of the bots’ reports, dismiss automated news as a misguided fad.

They’re both wrong. In journalism, as in most other fields, “robots”—better described as software programs, really—can’t compete with humans at the things humans do best. But they’re far faster and more efficient at certain basic tasks. In time, they might also generate certain sorts of insights that even Pulitzer Prize–winning humans would overlook. Yet, for reasons I’ll explain, they’ll also continue to overlook points that even a rookie reporter would find obvious.

First let’s look at what humans do well. We’re good at telling stories. We’re good at picking out interesting anecdotes and drawing analogies and connections. We’re good at framing information: We can squint at the amorphous cloud of information that surrounds a news event and discern a familiar form. And we have an intuitive sense of what our fellow humans will find relevant and interesting. None of these qualities come naturally to machines.

In theory, well-designed software programs could acquire such soft skills with enough data, development, training, and processing power. But teaching machines to think like humans is one of technology’s most daunting tasks, and—Eugene Goostman’s passable 13-year-old-boy impression notwithstanding—we’re nowhere close to achieving it. If we ever do, the impacts on the journalism job market will hardly be humanity’s greatest concern.

And again: Humans are already better at thinking like humans than computers will ever be.

Computers, obviously, are far better at computing—that is, at quickly scanning, crunching, and identifying patterns in large sets of data. They’re also reliable. They won’t be in the bathroom, out to lunch, or asleep at home when big news breaks.

And they’re fast. Given the proper algorithms, they can turn inputs (like a 40-page spreadsheet) into outputs (like a 150-word news brief) faster than a human reporter can say to her editor, “Oh, hey, maybe I should write something on this.” One more thing: Once you’ve built the software, the marginal cost of producing each story approaches zero. That’s how Automated Insights churned out 300 million reports last year for its various clients—a rate of 9.5 reports every second. This year it’s aiming to more than triple that output.
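The per-second figure is easy to check. Here is a quick back-of-the-envelope calculation, assuming the 300 million reports were spread evenly over a 365-day year (an assumption for the arithmetic, not something the company has said):

```python
# Back-of-the-envelope check of the reports-per-second figure.
REPORTS_PER_YEAR = 300_000_000          # figure cited in the article
SECONDS_PER_YEAR = 365 * 24 * 60 * 60   # assumes a non-leap year: 31,536,000 seconds

reports_per_second = REPORTS_PER_YEAR / SECONDS_PER_YEAR
print(f"{reports_per_second:.1f} reports per second")  # prints "9.5 reports per second"
```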

All of which gives software a great advantage over people when it comes to things like quickly summarizing key data in an earnings report. It also makes it an ideal tool for writing stories that humans otherwise wouldn’t get to, like recaps of Little League ballgames or the aforementioned fantasy-league draft reports.

As New York magazine’s Kevin Roose observes, “The stories that today’s robots can write are, frankly, the kinds of stories that humans hate writing anyway.” I made a similar point when I wrote about Quakebot, the Los Angeles Times program that cranks out a breaking-news brief whenever an earthquake hits. Even Robbie Allen, Automated Insights’ CEO, agrees. “Automation is about augmenting what reporters do,” he told me. “Now that you don’t have to have a reporter listening in on every earnings call, they can be focused on other things.”

But what happens when the machines get smarter? What will keep them from encroaching on human reporters’ territory?

You might think that what separates human writing from robo-journalism is the ability to write with flair. In fact, Automated Insights’ machines have little trouble couching their reports in a snarky tone if that’s what the client requests. Here, for example, are a few choice lines from a machine-generated Yahoo fantasy football draft recap:

“Good News, Bad News: The good news is that the draft for Benjamin’s Boss Team was consistent throughout. The bad news is that it was consistently the worst in the league in terms of projected points (over both the first and second halves of the draft).”

“Fantasy Fútbol: This is the American version, folks. Benjamin’s Boss Team elected to go with two kickers, rather than adding depth at other positions.”

Snark, it turns out, is just another software setting that you can dial up or down according to your predilections.

What will keep robo-journalism from usurping people’s jobs anytime soon is not the quality of the machines’ writing. Strange as it might sound, it’s the quality of their data.

Recall that Automated Insights’ Disney story made no mention of Frozen or Thor, recent works that readers might readily associate with the company. That’s because the company’s software, which draws exclusively on Zacks financial data for its reports, has no way of knowing that these movies are tied to Disney’s strategically important acquisitions of Pixar and Marvel, respectively. Even if it did, it couldn’t begin to generate a broader insight like the AP reporter Nakashima’s judgment that “both films showed the power of buying multibillion-dollar content brands.”
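To see why, consider a toy version of the idea. This is not Wordsmith's actual method, which isn't described here; it is only a minimal sketch of template-filling, with an invented `EarningsRow` record and an invented `write_brief` function, fed the figures from Story No. 1. Whatever isn't in the row can't appear in the sentence.

```python
from dataclasses import dataclass


@dataclass
class EarningsRow:
    """One row of a hypothetical earnings feed (field names are invented)."""
    company: str
    net_income_bn: float        # latest quarter, billions of dollars
    prior_net_income_bn: float  # same quarter a year earlier
    eps: float                  # earnings per share, dollars
    eps_estimate: float         # average analyst estimate, dollars


def write_brief(row: EarningsRow) -> str:
    """Fill a fixed sentence template using only the numbers in the row."""
    change = (row.net_income_bn - row.prior_net_income_bn) / row.prior_net_income_bn
    direction = "increase" if change >= 0 else "decrease"
    if row.eps > row.eps_estimate:
        verdict = "beating"
    elif row.eps < row.eps_estimate:
        verdict = "missing"
    else:
        verdict = "matching"
    return (
        f"{row.company} reported a {abs(change) * 100:.0f} percent {direction} "
        f"in net income, {verdict} analysts' estimates."
    )


# Figures taken from Story No. 1 above.
disney = EarningsRow("The Walt Disney Co.", 1.84, 1.38, 1.03, 0.92)
brief = write_brief(disney)
print(brief)
```

The output is a serviceable lede built from five fields. There is no field for a hit movie or an acquisition, so the sketch has nothing to say about either.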

To pick out what might be most interesting to business readers about Disney’s latest earnings report isn’t a matter of processing power. It’s a matter of having the right kinds of data at your disposal. A good story about Disney requires a journalist who already has a conception of what the company is about and why it’s important in the wider scheme of things. Nakashima’s piece also draws on his understanding of the big abstract questions looming over 21st-century business management. Are big-time content brand acquisitions, in general, worth the money? Disney’s earnings report suggests they can be, at least in some circumstances.

Software programs can scan a spreadsheet in a fraction of a second. They can rapidly compare the numbers in that spreadsheet to all sorts of other data sets, from past earnings reports to historical averages to the recent performance of other companies in the relevant industry sector. Over time, this ability should allow them to spot trends that human reporters would overlook. For instance, Allen says, Automated Insights’ software might notice that a given company’s revenues have beaten estimates every third quarter for the past five years. “Certain kinds of analysis are completely missed all the time today because humans aren’t ideally suited to that kind of thing.”
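The kind of check Allen describes is simple to express in code. The sketch below is an illustration only, not Automated Insights' software: the function name is invented, the revenue and estimate numbers are made up, and it reads "every third quarter" as the third fiscal quarter of each year.

```python
def quarters_that_always_beat(history, years=5):
    """Return the quarters (1-4) in which revenue beat the estimate
    in every one of the most recent `years` years on record."""
    recent_years = sorted({year for year, _, _, _ in history})[-years:]
    always = []
    for q in (1, 2, 3, 4):
        results = [
            revenue > estimate
            for year, quarter, revenue, estimate in history
            if quarter == q and year in recent_years
        ]
        # Require a result for every recent year, and a beat in each one.
        if len(results) == len(recent_years) and all(results):
            always.append(q)
    return always


# Made-up history: (year, quarter, revenue, estimate), in arbitrary units.
# Third quarters always beat; other quarters beat only some of the time.
history = []
for year in range(2010, 2015):
    for quarter in (1, 2, 3, 4):
        estimate = 100.0
        if quarter == 3:
            revenue = 104.0
        else:
            revenue = 98.0 if (year + quarter) % 2 == 0 else 101.0
        history.append((year, quarter, revenue, estimate))

pattern = quarters_that_always_beat(history)
print(pattern)  # prints [3]
```

A loop like this can run across thousands of companies in moments. What it cannot do is say why the third quarter keeps surprising analysts.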

But it still takes a human to situate those sorts of insights in the context of broader trends, current controversies, and ongoing narratives. In the end, the data in a company’s earnings report means little until it’s combined with all the data that’s already in people’s heads, accumulated through years of experience—not only experience covering business, but experience watching Disney movies, going to Disneyland, mulling over the role of corporations in children’s lives and the cultural forces that shaped a man like Walt Disney himself. In short, the data in people’s heads doesn’t just come from spreadsheets. It comes from years of experience being human.

Fortunately, we don’t have to choose between automated reporting and real journalism. The Disney reports above are a perfect example of how we can have both. For some smaller companies, whose earnings reports would have merited only a brief anyway, the automated reports may supplant human-written stories. But for the larger companies, like Disney, the AP will continue to supply both. The algorithmic brief will hit the wires first, conveying the essential data to people who want it as soon as possible. Then the humans will come in behind it, bringing fresh reporting, adding context, and making the sorts of connections that turn a factual brief into a real story.

Robot journalists, it turns out, are neither robots nor journalists. They’re just another useful tool for presenting data in a form that humans can use to do what only humans can—make sense of the world.