Do The Math

Guilt by Calculation

It takes more than an Excel sheet to prove the Iranian election was fixed.


Were Iran’s election numbers too good to be true? That’s the question that the blog Tehran Bureau raised hours after Friday’s election, when it noted a strange trend in the government’s electoral data: Each time a new vote total was released, President Mahmoud Ahmadinejad had won a nearly identical percentage, around 67 percent. As more results rolled in, his tally climbed in linear lock step.

We’re used to watching the lead fluctuate wildly in American elections as returns come in, particularly early in the night, so the perfect straight line on Tehran Bureau’s graph suggested the numbers were faked—and ham-handedly at that. Within hours, the graph was showing up in tweets and blogs all over the world. Atlantic blogger Andrew Sullivan saw it as conclusive evidence. “They didn’t even attempt to disguise the fraud,” he wrote. “This graph is a red flag to Iran and the world.”

This kind of statistical gumshoeing has a long history. In 1936, for example, English biologist and statistician R.A. Fisher went gunning for Gregor Mendel, whose experimental results Fisher believed had been tweaked to be more favorable to Mendel’s ideas. “Fictitious data can seldom survive a careful scrutiny,” Fisher wrote, “and, since most men underestimate the frequency of large deviations arising by chance, such data may be expected generally to agree more closely with expectation than genuine data would.” In other words, it was precisely the beautiful agreement of experiment with theory that exposed Mendel’s thumb on the scale. Only once in 15,000 times, Fisher computed, could one expect such strong conformity. (The controversy over Mendel’s research practices continues to this day, with notable scientists lining up on both men’s sides.)

More recently, John Darsee, a rising star in cardiology, was caught reporting an unusually consistent series of measurements. When his supervisor demanded to see the original printed-out readings, Darsee said he’d thrown them out to make room in a filing cabinet. In the end, Darsee lost his position at Harvard, and 82 of his research papers had to be junked.

So it’s natural to be suspicious when you see that the vote total at each of the six official reporting times follows a linear formula almost exactly. But that’s precisely what we should expect from the way the data were reported, as running totals. As more and more of the total vote was counted, it would have taken larger and larger surges by one or the other candidate to noticeably tip the proportions. Political stats whiz Nate Silver made a roughly analogous chart of the 2008 U.S. presidential election, based on the imaginary scenario in which states reported in alphabetical order, and found a linear trend just about as strong as the one reported in Iran.

A better way to assess the plausibility of the Iran data is to examine the six batches of votes separately, rather than cumulatively, as they appeared in Tehran Bureau’s graph. You see a big first batch, 36 percent of the total vote, which comes in 70 percent for Ahmadinejad. Next come two smaller batches, 18 percent and 21 percent of the total vote, respectively, each of which Ahmadinejad wins with about 66 percent of the vote. The last three batches are smaller still (10 percent, 6 percent, and 8 percent of the total vote), and the incumbent takes these with 67 percent, 64 percent, and 62 percent of the vote. So Ahmadinejad’s official share really is fairly consistent from batch to batch.
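To see how much the cumulative view smooths things out, here is a minimal Python sketch that rebuilds the running totals from the six batch figures quoted above. The batch sizes and shares are the rounded numbers from this article, not the raw official counts, and the function name is just an illustrative choice.

```python
# Batch sizes (percent of the total vote) and Ahmadinejad's share in each
# batch, as quoted in the paragraph above (rounded figures).
batch_sizes = [36, 18, 21, 10, 6, 8]
batch_shares = [70, 66, 66, 67, 64, 62]


def cumulative_shares(sizes, shares):
    """Return the running share of the vote after each batch is added."""
    running_votes = 0.0   # votes for the candidate so far, in size units
    running_total = 0.0   # all votes counted so far, in size units
    result = []
    for size, share in zip(sizes, shares):
        running_votes += size * share / 100
        running_total += size
        result.append(100 * running_votes / running_total)
    return result


cumulative = cumulative_shares(batch_sizes, batch_shares)
for size, share, cum in zip(batch_sizes, batch_shares, cumulative):
    print(f"batch of {size:>2}% at {share}%  ->  running share {cum:.1f}%")
```

The batch shares span 8 points, from 62 to 70, while the running share moves by less than 3. That is the smoothing effect of adding each new batch to an ever-larger pile of votes already counted.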

But unbelievably so? Fisher used sophisticated statistical techniques to track down Mendel’s fiddling, but we can get away with much less. We’re simply asking the following: How much do we expect Ahmadinejad’s percentage in a batch to deviate from his overall total of 67.2 percent, given a diverse electorate whose allegiances vary from place to place? The answer is given by a standard deviation, a mathematical measure that tells us roughly how far we expect any given measurement to stray from the overall average value. Here’s one way to scratch out an estimate: Let’s say the 27 million Iranians who voted last week are divided into 1,000 different regions with 27,000 voters each. For the sake of argument, we’ll say that half of these regions are 87.2 percent for Ahmadinejad (20 points over his overall average) while half are 47.2 percent for Ahmadinejad (20 points below). For each region, the deviation from Ahmadinejad’s overall share is exactly 20 percentage points, and when you average all the regions together, you get his 67.2 percent.
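The toy electorate is simple enough to check directly. The short Python sketch below builds the 1,000 hypothetical regions and confirms the two numbers the argument relies on: an overall share of 67.2 percent and a per-region standard deviation of 20 points. The variable names are illustrative, and the regions themselves are the made-up ones from the paragraph above, not real Iranian districts.

```python
from statistics import mean, pstdev

# Toy electorate from the text: 1,000 regions of 27,000 voters each,
# half at 87.2 percent for the incumbent and half at 47.2 percent.
N_REGIONS = 1000
VOTERS_PER_REGION = 27_000
region_shares = [87.2] * (N_REGIONS // 2) + [47.2] * (N_REGIONS // 2)

total_voters = N_REGIONS * VOTERS_PER_REGION

# All regions are the same size, so a plain mean gives the overall share.
overall_share = mean(region_shares)

# Population standard deviation: the typical distance of a region's share
# from the overall share.
region_sd = pstdev(region_shares)

print(f"total voters: {total_voters:,}")
print(f"overall share: {overall_share:.1f}%")
print(f"per-region standard deviation: {region_sd:.1f} points")
```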

Now let’s look at that first batch of votes, made up of 360 of our 1,000 regions (corresponding to the first real batch of 36 percent of the votes). Absent any reason to think that this particular sample is skewed compared with the overall vote, we can employ the following beautiful and simple formula: “The standard deviation of the average over N regions is the standard deviation of each region divided by the square root of N.”

So the amount by which it’s reasonable to expect that batch to differ from the overall average of 67.2 percent is 20 points divided by the square root of 360, or about 1.05 percentage points. In other words, even if we assume a wide variation in the support for Ahmadinejad from region to region (20 points in either direction), a batch consisting of 36 percent of the electorate is likely to wander from the average only by somewhere in the neighborhood of 1 point.
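For readers who would rather test the square-root rule than take it on faith, here is a rough Python sketch. It computes the predicted spread for a 360-region batch, then simulates 2,000 such batches to check the prediction. One assumption is baked into the simulation and is mine, not the article's: each region in a batch is treated as an independent coin flip between the two region types, which is the setting in which the formula holds exactly.

```python
import math
import random
from statistics import pstdev

REGION_SD = 20.0   # per-region deviation assumed in the text, in points
OVERALL = 67.2     # overall share, in percent
BATCH_REGIONS = 360


def sd_of_batch_average(region_sd, n_regions):
    """Standard deviation of the average over n independent regions."""
    return region_sd / math.sqrt(n_regions)


predicted = sd_of_batch_average(REGION_SD, BATCH_REGIONS)
print(f"predicted spread for a 360-region batch: {predicted:.2f} points")

# Simulation check. Assumption: every region in a batch is an independent,
# equally likely draw from the two region types (87.2 or 47.2 percent).
rng = random.Random(0)  # fixed seed so the run is repeatable
region_types = (OVERALL + REGION_SD, OVERALL - REGION_SD)
batch_averages = [
    sum(rng.choice(region_types) for _ in range(BATCH_REGIONS)) / BATCH_REGIONS
    for _ in range(2000)
]
simulated = pstdev(batch_averages)
print(f"simulated spread over 2,000 batches: {simulated:.2f} points")
```

The two printed figures should land close to each other, both near 1 point, even though every individual region sits a full 20 points away from the average.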

And Ahmadinejad’s reported total of 70 percent for the first 36 percent of the vote misses his average by substantially more than that, suggesting even messier data than our scenario predicts. The same argument estimates the standard deviations of the other five batches as 1.5 percent, 1.4 percent, 2 percent, 2.6 percent, and 2.2 percent, respectively. In other words, these figures, though they may seem eerily consistent at first glance, are actually just what we would expect. That’s the nature of large batches of data, governed by what’s called the Law of Large Numbers: Averages of widely varying quantities can, and usually do, yield results that look almost perfectly uniform. Given enough data, the outliers tend to cancel one another out.
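The same arithmetic can be laid out for all six batches at once. This Python sketch, again under the toy assumption of 1,000 equal regions with a 20-point spread, prints the predicted standard deviation for each batch next to the gap between that batch's reported share and the 67.2 percent overall figure. The batch numbers are the rounded ones quoted earlier in this article.

```python
import math

REGION_SD = 20.0   # per-region deviation assumed in the text, in points
OVERALL = 67.2     # overall share, in percent
N_REGIONS = 1000   # toy number of equal-sized regions

# (percent of total vote, reported share) for each of the six batches.
batches = [(36, 70), (18, 66), (21, 66), (10, 67), (6, 64), (8, 62)]

rows = []
for size_pct, share in batches:
    n = N_REGIONS * size_pct // 100          # regions in this batch
    predicted_sd = REGION_SD / math.sqrt(n)  # square-root rule
    observed_gap = abs(share - OVERALL)      # distance from overall share
    rows.append((size_pct, n, predicted_sd, observed_gap))
    print(
        f"{size_pct:>2}% of vote, {n:>3} regions: "
        f"predicted sd {predicted_sd:.1f}, observed gap {observed_gap:.1f}"
    )
```

Several of the observed gaps, including the first and last batches, come out larger than the predicted standard deviation, which is the sense in which the reported data look messier, not cleaner, than the toy model.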

Of course, these estimates depend vitally on the arbitrary guesses about the sizes of the regions and their individual vote totals we made when setting up our estimate. But every reasonable guess I tried yielded the same result; on purely statistical grounds, the Iranian election numbers look more or less reasonable. It might be a different story if Ahmadinejad had drawn between 67.1 percent and 67.3 percent in all six batches, suggesting a standard deviation of less than 0.1 percent—or if 500 mini-batches of data, each making up 0.2 percent of the vote, were all in that 62 percent to 70 percent range. (One reason American readers may be more used to seeing wide swings in the vote totals is that our fine-grained media start reporting results when just a few percent of the votes are in.)

I’m not saying the election wasn’t fixed; Juan Cole and Richard Sexton offer more reasons for doubting the government’s numbers. On the other side, Ken Ballen and Patrick Doherty argue that their pre-election polling is consistent with a big Ahmadinejad win. Either way, the final verdict on the Iranian election won’t be settled by drawing a graph. The official numbers may or may not be authentic, but they’re definitely messy enough to be true.