Medical Examiner

Influenza Body Count

The math behind estimating seasonal flu deaths.

By now, the swine flu panic has started to recede. Kids in Mexico are back at school; President Obama worked a flu joke into his White House Correspondents’ Dinner routine; drugstore face mask displays have been demoted from the impulse-purchase bin to the medical aisle. And in the media, the swine flu backlash has begun. According to the CDC, 36,000 Americans die of ordinary strains of flu every year—so why, the new narrative goes, did we get so agitated over a bug whose victims worldwide, as of this writing, number just 65?

The problem is, we can’t compare those numbers. The official swine flu deaths are patients who were confirmed by lab tests to have been infected with the H1N1 strain. The 36,000 figure, by contrast, isn’t a count of people whose death certificate lists “flu” as cause of death; in 2005, the total number of those was just 1,812. That count is low because people who die of flu are often no longer infected when they die. Instead, they succumb to pneumonia or heart disease or emphysema, ailments they would have survived if they hadn’t been weakened by the flu. That’s why the 2,000 or so certified flu deaths represent an underestimate of the flu’s real cost.

How does the CDC come up with 34,000 more flu victims? The number comes from a 2003 study led by William W. Thompson. All winter, about 80 labs across the United States continually test patients for flu virus, so we have a pretty good measure of how much flu was circulating in any given week of the last 20 years: the number of specimens that tested positive. We also know how many Americans in total died each week.

Suppose 52,000 people died in the first week of February 2004; 55,000 in the same week in 2005; 51,000 in 2006; and 54,000 in 2007. Suppose furthermore that the number of influenza specimens confirmed by labs was 1,000, 2,500, 500, and 2,000 in the four weeks in question. Then it certainly looks like the flu is killing people (whether directly or by opening the door to another lethal illness) at a rate of about two deaths per confirmed specimen; in a world without influenza, the death rate would be constant at 50,000 per week.

In real life, though, the numbers aren’t that clean—they never are. Lots of nonflu factors push the death rate around from week to week and year to year. But a statistical technique called regression allows us to find the value of X such that the formula

[Total deaths] = [Deaths if there were no such thing as flu] + X*[number of confirmed flu cases]

matches the data as closely as possible. The rightmost term, X*[number of confirmed flu cases], is then our estimate for the number of deaths you can attribute to flu. In the example above, you’d choose [Deaths without flu] to be 50,000 and X to be 2. And if 18,000 specimens test positive for flu over the course of a year, you’d blame 36,000 deaths on the flu.
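For readers who want to see the arithmetic, here is a minimal sketch of that fit in Python, run on the made-up four-week numbers from the example. It uses ordinary least squares with a single predictor, which is an assumption made for illustration; the actual CDC model is more elaborate, and the names `fit_line`, `baseline`, and `deaths_per_specimen` are hypothetical.

```python
# Toy data from the example above (invented numbers, not real surveillance data).
confirmed_specimens = [1000, 2500, 500, 2000]
total_deaths = [52000, 55000, 51000, 54000]


def fit_line(xs, ys):
    """Ordinary least squares for ys = intercept + slope * xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# baseline plays the role of [Deaths if there were no such thing as flu];
# deaths_per_specimen plays the role of X.
baseline, deaths_per_specimen = fit_line(confirmed_specimens, total_deaths)

# Deaths attributed to flu in a year with 18,000 positive specimens.
annual_specimens = 18000
attributed_deaths = deaths_per_specimen * annual_specimens

print(f"Baseline deaths per week: {baseline:.0f}")
print(f"Deaths per confirmed specimen (X): {deaths_per_specimen:.2f}")
print(f"Deaths attributed to flu: {attributed_deaths:.0f}")
```

Because the toy numbers fall exactly on a line, the fit recovers a baseline of 50,000 and an X of 2. With real weekly data the points scatter, and the fit returns whichever pair of values leaves the smallest total squared error.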

Not everybody’s comfortable with a body count that consists of statistically inferred victims instead of, well, bodies. And there are potential glitches—for example, if a snowy winter causes both more flu (people spend more time indoors) and more car accidents (slippery roads), the model is going to blame the flu for a lot of traffic deaths. For this reason, some versions of the model, including Thompson’s, exclude causes of death, like car crashes, that don’t seem plausibly related to flu.

But what’s the alternative to the estimate? Counting only the 1,812 people who died with the flu still in their lungs? That would be like recording the cause of death as “car accident” only for victims who died in the car and filing everyone who bled out in the ambulance under “anemia.” Or like restricting your account of the lives lost to the Iraq war to documented violent deaths, like those in the Iraq Body Count, instead of making a statistical best estimate as the Lancet study did. (While the specific methodology used in the Lancet study has drawn some criticism, the use of statistical techniques to estimate excess deaths is standard.) That 36,000 estimate is far from an exact figure—tweaking the technique can easily knock it up or down by 10,000 or so—but it’s a “least bad” estimate; the 1,812 number is very precise but also very incorrect.