What’s the Best Jury Size?

The Supreme Court rejected math before, but it’s now possible to calculate the jurors and margins you need for justice.

Illustration by Rob Donnelly

Several years ago, a man from Oregon named Alonso Alvino Herrera borrowed a friend’s car and never returned it. He was arrested and charged with unauthorized use of a vehicle and possession of stolen property. In 2008, a jury acquitted him of the second charge but convicted him of the first charge … by a 10-to-2 vote.

Does something strike you as odd about this story? Isn’t a verdict in a criminal trial supposed to be unanimous? The answer is yes in 48 states and yes if the case is tried in a federal court. But two states, Oregon and Louisiana, allow convictions by a nonunanimous vote. In both states, the threshold in noncapital cases is 10 to 2.* Arguably, Herrera had to go to jail for the crime of living in Oregon.

The Supreme Court has allowed this conflict between federal and state law (as well as between state law and conventional wisdom) to persist for more than 40 years, during which time it has come up with a mishmash of seemingly arbitrary rules about what constitutes a legal trial. A jury of six, the Supreme Court has decided, is constitutional (Williams v. Florida, 1970). A jury of five, however, is not constitutional (Ballew v. Georgia, 1978). In a jury of six, conviction must be unanimous (Burch v. Louisiana, 1979). But in a jury of 12, conviction does not have to be unanimous (Johnson v. Louisiana and Apodaca v. Oregon, 1972). (At the time of these decisions, Louisiana required a 9-to-3 vote to convict in noncapital cases, which the court upheld as constitutional. The state has since changed its threshold to 10 to 2.)

The Supreme Court’s rationale has been as confusing as its decisions. “The line between five- and six-member juries is difficult to justify, but a line has to be drawn somewhere,” wrote Justice Lewis Powell in Ballew v. Georgia. Justice Harry Blackmun, in the majority opinion for that case, cited several studies by social scientists on the error rate of juries of different sizes and wrote in a footnote: “We have considered [these studies] carefully because they provide the only basis, besides judicial hunch, for a decision about whether smaller and smaller juries will be able to fulfill the purpose and functions of the Sixth Amendment.” Powell, though agreeing with the decision, chastised Blackmun’s justification: “I have reservations as to the wisdom—as well as the necessity—of Mr. Justice Blackmun’s heavy reliance on numerology derived from statistical studies.”

As a mathematician and a science writer, I was shocked to see such an unabashedly anti-scientific sentiment emanating from the highest court in the land. Powell was perhaps right to be skeptical of how well academic studies capture the nuances of jurors’ decision making, but he was wrong if he wanted us to simply avert our eyes from the scientific study of these questions. Science and mathematics exist for this very purpose: to enable us to question our assumptions and understand how difficult-to-observe processes work. Besides, juries are fascinating. Nobody really understands how they arrive at a verdict and how the legal requirements interact with the personalities in the jury box.

I was introduced to the mathematical study of juries by Jeff Suzuki, a Brooklyn College mathematician, earlier this year at the national meeting of the American Mathematical Society. In his lecture, he provided a purely mathematical brief for reconsidering the Supreme Court verdicts from the 1970s.

The Marquis de Condorcet, a French philosopher and mathematician, was perhaps the first person to subject juries to mathematical analysis. In 1785, he proved a theorem that could be succinctly summarized, “More is better.” Assuming that all the members of a jury have a better than 50 percent chance of determining the true innocence or guilt of the accused—a big assumption but a common one in mathematical studies of juries—larger juries are more likely to decide correctly than smaller ones.
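For readers who want to check the “more is better” claim, here is a minimal Python sketch of the calculation behind it. It assumes, as the theorem does, that every juror votes independently and has the same competence. The function name `majority_correct` and the 60 percent competence figure are illustrative choices, not anything taken from Condorcet or Suzuki.

```python
from math import comb


def majority_correct(n, p):
    """Probability that a strict majority of n jurors votes correctly.

    Assumes each juror is independently correct with probability p.
    """
    votes_needed = n // 2 + 1  # smallest strict majority
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(votes_needed, n + 1)
    )


# Odd jury sizes only, so a tie is impossible.
for n in (1, 3, 5, 11):
    print(n, round(majority_correct(n, 0.6), 4))
```

With a competence above 50 percent the printed probabilities rise as the jury grows. With a competence below 50 percent they fall, which is why the assumption matters so much.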

However, Condorcet was writing only about majority decisions; his theorem does not say whether unanimous decisions are better than a simple majority. An important difference is that if you require unanimity, then you might not get a decision on the first ballot. Also, Condorcet assumed that all jurors vote independently, which is demonstrably false. For instance, if everybody on a 12-person jury has a competence of 60 percent (that is, a 60 percent chance of drawing the right conclusion) and all their votes are independent, there is only a 1-in-500 chance that they will reach a unanimous verdict on the first ballot. Reaching unanimity in real life is difficult, but it’s not that difficult.
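The first-ballot figure is easy to verify. The sketch below assumes that “unanimous” means either all 12 jurors are right or all 12 are wrong, and that the votes are independent. The function name `first_ballot_unanimity` is made up for this example.

```python
def first_ballot_unanimity(n, p):
    """Chance that n independent jurors all vote the same way.

    Each juror is correct with probability p, so the jury is unanimous
    when everyone is right (p**n) or everyone is wrong ((1 - p)**n).
    """
    return p**n + (1 - p) ** n


chance = first_ballot_unanimity(12, 0.6)
print(f"{chance:.4%}")            # a bit over 0.2 percent
print(f"about 1 in {1 / chance:.0f}")
```

Almost all of that small probability comes from the all-correct case. Twelve jurors who are each wrong 40 percent of the time almost never err in unison.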

In 2008, Eric Helland and Yaron Raviv refined Condorcet’s model to take into account the deliberations that take place after the first ballot. They assumed that the likelihood of a conviction is simply proportional to the votes on the initial ballot. In other words, if the first ballot is 11 to 1 to convict, then the odds of a conviction are 11 to 1; if the first ballot is 6 to 6, then the odds of a conviction are also even. With this not too unrealistic assumption, they came to a conclusion that is the exact opposite of Condorcet’s: “More isn’t better.” In fact, a jury of one (i.e., a judge) is just as likely to get it right as a jury of 12! This suggests that requiring a unanimous verdict is actually worse than requiring a simple majority because it somehow squanders the advantage of having a large jury.
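To see why jury size drops out, here is a rough Python sketch of the model as the paragraph above summarizes it, not of Helland and Raviv’s paper itself. It assumes the first ballot is a set of independent votes, each correct with probability `p`, and that the jury then reaches the correct verdict with probability equal to the share of correct first-ballot votes. The function name is hypothetical.

```python
from math import comb


def proportional_model_accuracy(n, p):
    """Chance of a correct verdict under the proportional assumption.

    k of the n first-ballot votes are correct with binomial probability;
    deliberation then yields the correct verdict with probability k / n.
    """
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k) * (k / n)
        for k in range(n + 1)
    )


for n in (1, 6, 12):
    print(n, round(proportional_model_accuracy(n, 0.6), 6))
```

Every size gives the same answer, the individual competence `p`, because the average share of correct votes on the first ballot is `p` no matter how many jurors there are.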

This example should give us pause. Before we make policy based on mathematical models, we should scrutinize the assumptions very carefully. Perhaps Powell was right after all!

Another problem with the Condorcet model is the assumption that jurors make the “right” decision a certain fixed percent of the time. Who is to say what is right? How can we ever measure the competence of a juror?

In a very provocative 1992 paper, George Thomas, a law professor at Rutgers University, and his student Barry Pollack, now a partner at Pollack Solomon Duffy LLP in Boston, argued that the function of a jury is to serve as a proxy for society. In ancient Greece every citizen of the polis served on the jury. In the modern world this is impractical, so we settle for juries of 12.

According to Thomas and Pollack, then, the most objective measure of a jury’s success is whether it agrees with what society would have decided. In his lecture, Suzuki phrased it more cynically: The function of a jury is to preserve the appearance of justice. Consider the fact that the most notorious judicial outcomes of recent years—the O. J. Simpson case, the Casey Anthony trial—are exactly the ones where society disagreed with the jury’s decision. Or, looking at it the opposite way, the jury failed to do its job of agreeing with the rest of us!

Whether or not you subscribe to this definition of a jury’s purpose, Thomas and Pollack’s model does make it possible to determine mathematically how likely such disagreements are. If you pick 12 people at random, how likely is it that they will disagree unanimously with the majority of society? Not very likely. How likely is it that they will disagree with society by a 9-to-3 majority? Thomas and Pollack crunched the numbers, and Suzuki recrunched them. And they found a surprising consistency. For every margin that the Supreme Court has allowed to stand (6 to 0, 10 to 2, 9 to 3), the probability of a disagreement between society and the jury is less than 1.5 percent. And for every margin that the Supreme Court has ruled unconstitutional (5 to 1, 5 to 0), the probability of disagreement is greater than 1.5 percent. Thus, without realizing it, the Supreme Court has consistently held that there should be less than a 1-in-60 chance that the jury will disagree with society. Judicial hunch meets mathematical rigor!
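The article does not spell out the inputs Thomas and Pollack used, so the following Python sketch shows only the general shape of such a calculation and will not reproduce the 1.5 percent figures. It assumes the jury is a random sample in which each juror independently sides against society’s majority with probability `minority_share`, a hypothetical parameter. The function names are likewise invented for illustration.

```python
from math import comb

# Margins named in the text, as (jury size, votes needed for a verdict).
MARGINS = {
    "6 to 0": (6, 6),
    "5 to 1": (6, 5),
    "5 to 0": (5, 5),
    "10 to 2": (12, 10),
    "9 to 3": (12, 9),
}


def tail_probability(jury_size, votes_needed, minority_share):
    """Chance that at least votes_needed jurors hold the minority view."""
    return sum(
        comb(jury_size, k)
        * minority_share**k
        * (1 - minority_share) ** (jury_size - k)
        for k in range(votes_needed, jury_size + 1)
    )


def disagreement_table(minority_share):
    """Disagreement probability for each margin, under the assumed share."""
    return {
        label: tail_probability(size, needed, minority_share)
        for label, (size, needed) in MARGINS.items()
    }


# 0.4 is an arbitrary illustrative value, not one from the article.
for label, prob in disagreement_table(0.4).items():
    print(f"{label:>8}: {prob:.4%}")
```

Under this assumed setup the qualitative pattern is what one would expect: for a fixed jury size, a looser margin makes disagreement more likely, and for unanimous verdicts a smaller jury does too.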

However, the Supreme Court should not pat itself on the back too hard. Suzuki pointed out that nonunanimous juries fail the Thomas-Pollack test if some of the jurors are biased in favor of conviction. A 12-member jury with a unanimity requirement, he showed, can absorb up to five biased jurors and still meet the 1.5 percent threshold. By contrast, if the required margin for conviction is only 10 to 2, then a single biased member is enough for the jury to make the wrong decision (from society’s point of view) more than 2 percent of the time. Six-member juries fare even worse. Even with a unanimous vote required to convict, one biased juror will cause the jury to make the wrong decision more than 5 percent of the time. Both 6-to-0 and 10-to-2 convictions, Suzuki concluded, should be held unconstitutional. “The Supreme Court should reconsider its decisions and prohibit six-person juries,” he said.
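The effect of a biased juror can be sketched the same way. The Python example below is not Suzuki’s calculation and is not meant to reproduce his percentages. It assumes that biased jurors always vote to convict and that each remaining juror independently votes to convict with probability `convict_share`, a hypothetical parameter standing in for how often a randomly chosen citizen would side against society’s majority.

```python
from math import comb


def conviction_probability(jury_size, votes_needed, convict_share, biased=0):
    """Chance of reaching votes_needed conviction votes.

    `biased` jurors always vote to convict; the other jurors vote to
    convict independently with probability `convict_share`.
    """
    if not 0 <= biased <= jury_size:
        raise ValueError("biased must be between 0 and jury_size")
    unbiased = jury_size - biased
    still_needed = max(votes_needed - biased, 0)
    return sum(
        comb(unbiased, k)
        * convict_share**k
        * (1 - convict_share) ** (unbiased - k)
        for k in range(still_needed, unbiased + 1)
    )


share = 0.4  # arbitrary illustrative value, not from the article
for label, size, needed in [("12-0", 12, 12), ("10-2", 12, 10), ("6-0", 6, 6)]:
    clean = conviction_probability(size, needed, share)
    tainted = conviction_probability(size, needed, share, biased=1)
    print(f"{label}: {clean:.4%} with no biased juror, {tainted:.4%} with one")
```

Each biased juror removes one vote that the rest of the jury would otherwise have to supply, so the chance of a wrongful conviction rises in every configuration.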

Will the Supreme Court pay attention to mathematics this time? Given its record, the answer seems almost certain to be no. “The Supreme Court is remarkably uninterested in statistical research,” says Thomas. “The jury is, according to the court, a black box that deserves to remain black as long as the verdicts stay within the parameters the court has set out.”

However, the issue is not completely dead, because the Supreme Court continues to receive appeals from Louisiana and Oregon, including Alonso Herrera’s. (The court was interested enough in that case to request a brief from the state of Oregon, but it ultimately declined to hear his appeal.)

In addition, there are very good legal (as opposed to mathematical) reasons to reconsider the decisions of the 1970s. The discrepancy between federal and state law is perhaps the most compelling one. Under the Supreme Court’s incorporation doctrine, the Fourteenth Amendment extends most Bill of Rights protections (such as the jury trial) to the states. In Apodaca v. Oregon, eight of the nine Supreme Court justices agreed that federal and state laws should require the same quota for a conviction. The only trouble was that the justices didn’t agree on what the quota should be. Four thought unanimity should be required, and four thought it should not. Justice Powell (yes, him again) broke the deadlock. He was the only justice who believed federal juries should be unanimous but states should be allowed to “experiment” with nonunanimous juries. Perhaps it is time to declare the experiment over.

Correction, April 26, 2013: This article misstated that Louisiana requires a vote of 9 to 3 to convict in noncapital cases. This was true at the time of the Supreme Court cases, but the state has since changed its threshold to 10 to 2.