The recent discovery of alarming rates of false-positive results in several research literatures—the so-called replication crisis—has led to a series of reforms in science publishing and practice, with the goal of making studies more transparent and reproducible. In the past few weeks, this push for openness by academics has been taken up by the Trump administration.
Under the direction of Administrator Scott Pruitt, the U.S. Environmental Protection Agency put out a plan in late April to “strengthen transparency” in its use of research findings. If that rule goes through (we’re in the middle of a 30-day period for public comment), then any data, models, or analyses used by the agency for setting environmental standards would have to be made available for public scrutiny. The proposal claims that these changes would be in line with the open-science movement as a whole and that they would be “informed by the policies recently adopted by some major scientific journals, spurred in some part by the ‘replication crisis.’ ”
The phrase bad faith gets thrown around a lot these days, but this supposed methodological awakening at the EPA—Pruitt’s come-to-Nosek moment, if you will—feels like it could be the baddest faith of all. To state the obvious: Over the past 15 months, the agency has wallowed in the shadows, given in to paranoia, and otherwise displayed a deep aversion to transparency. Thirty-five years ago, the agency’s first administrator promised that the EPA would “operate in a fishbowl” and communicate “as openly as possible.” Pruitt and his team have smoked the glass at every opportunity. They’ve ignored requests for public records and avoided taking notes at crucial meetings; they’ve installed biometric locks, armed guards, and soundproof phone rooms; they’ve kept Pruitt’s calendar a secret and his travel plans mysterious even for his senior staff; they’ve given him control of four different email accounts; they’ve booted academic scientists from advisory boards and replaced them with representatives of industry.
In light of all these facts—or, rather, in the darkness of them—one may as well assume the worst about the “transparency” rule: that it isn’t really meant to make the use of science any more reliable, reproducible, or open, but rather to sideline any research that might force the agency into regulatory action. Then again, motivations aren’t everything, and rejecting policies solely on account of their “bad faith” amounts to little more than faith-based reasoning. What about the rule itself, and the principles of open science it claims to represent? In reflexively decrying Pruitt’s rule (and the closely related efforts from the climate science–hating wing of Congress), have we given up on something with potential?
Opponents of the rule—including the 985 scientists who signed an April 23 open letter to Pruitt—have tried to underline its purported flaws: The problem, they say, is that it would cut off access to a large, important body of research. The goal of sharing data is a noble one, but for many studies—maybe even most of them, in certain fields of inquiry—sharing data would be de facto impossible, due to “intellectual property, proprietary, and privacy concerns,” as the scientists write in their letter. According to Stanford meta-scientist and reproducibility maven John Ioannidis, whose work is cited three times in the EPA’s proposal, just a tiny fraction of past research studies have made their raw data available for review and reanalysis. If all the rest were ruled ineligible for the agency’s consideration, he pointed out earlier this month, then “science will be practically eliminated from all decision-making processes. Regulation would then depend uniquely on opinion and whim.” This would have catastrophic consequences, critics say; for example, the rule could lead the agency to ignore all past research on lead toxicity and end its efforts to reduce exposure.
The black-or-white approach to open science that’s described above would of course be very bad. It’s not exactly what the rule suggests, however. The EPA’s draft does not argue for a blanket ban on unshared data sets; rather, it says that “all reasonable efforts” must be made to render data available for public use, in a manner that is “consistent with law, protects privacy, confidentiality, confidential business information, and is sensitive to national and homeland security.” When such reasonable efforts fail (in situations where sharing data turns out to be impractical or impossible, for example), the rule may be suspended. The EPA also asks for public comments “on which criteria the Agency should use to base any exceptions” to the transparency requirements, and whether application of the rule would hamper the “timeliness and quality of the scientific information available.”
The kind of rule imagined by its most alarmed opponents—retroactively applied, with no exceptions, and in such a way that might further poison us with lead—would be insane. It would also be unwelcome, whether you’re an earnest advocate for open science or a shill for corporate interests. In its true-boogeyman form, the transparency provision might end up posing obstacles not just for academics doing vital public health research, but also, say, agrochemical companies. According to the EPA’s internal emails, obtained via Freedom of Information Act request by the Union of Concerned Scientists, this very issue came up when Pruitt first ordered that the policy be written up in January. “This directive needs to be revised,” replied industry stalwart Nancy Beck, who is now in charge of the agency’s toxic chemical unit. It would be “incredibly burdensome,” she said, for those companies to publish all the private data they keep on their pesticides.
In fact, the EPA has already endorsed the use of open science with the stipulation “when feasible” in official documents. Its Obama-era Scientific Integrity Policy, issued in February 2012, describes seven measures “to enhance transparency within Agency scientific processes,” including the use of data and models that can be shared online. “It’s already explicit,” said environmental chemist Paul Anastas, who served as an EPA assistant administrator and science adviser from 2009 to 2012 and was instrumental in drafting that policy.
The new transparency proposal doubles down on this idea and would seem to make the policy somewhat more prescriptive. But it also adds an alarming caveat. As written, Pruitt’s rule appears to leave all discretion on the matter of transparency exceptions to the EPA’s administrator—which is to say, him. He’d be the one to decide whether “reasonable efforts” have been made to share a given data set, and when the rule ought to be suspended. In effect, it gives him the power to usurp scientific judgment from the scientists themselves.
The rule is not yet in its final form, but there may well be a better version that is not so ripe for exploitation. Decisions on the feasibility of data sharing could be left to subject experts instead of political appointees, for example. The push for open science in the government could also follow the path drawn in academic science, which attempts to recalibrate interpretations of published literature rather than expunge it. “The real point of the transparency and reproducibility movement is to expose imperfections so we can make wise decisions,” said Brian Nosek, co-founder and executive director of the Center for Open Science. “It’s not about getting perfect evidence.” With that in mind, one should not default to tossing out decades-old research with unshared data. Rather, this material could be given somewhat lesser weight in decision-making than it has been in the past. Across the sciences, researchers have learned to ascribe more value to trials that have been conducted under full transparency than to vintage studies that were based on data and methods that can’t be shared or scrutinized. It’s no different from how researchers ascribe more value to trials with randomized and well-controlled designs.
A more restrictive open-data mandate could then be applied prospectively by the EPA, to research that hasn’t yet been carried out or published. Instead of pretending that classic work on environmental toxicology could or should be brought in line, ex post facto, with today’s best practices, the agency might say it wants to ensure those practices are in place for all research going forward, at least given reasonable constraints. As the text of the proposed rule points out, this sort of forward-looking, top-down push for more transparency has already been adopted by some academic journals. (Not all of these journals adhere to their stated policies, however.) So why not add another push from the government?
Such requirements would slacken the pace of research, at least in the short term, since it’s cheaper and easier to conduct research based on private data and unspecified plans for analysis. But the current pace may be harmful in the long run, if it leads to spurious results. According to Nosek, open science provides for more efficiency in the overall pursuit of knowledge: “You will slow down the production of science,” he said, “but will you slow down the gathering of credible evidence?”
There’s always a trade-off between pace and rigor in scientific practice, but the setting of that balance has unusually high stakes at the EPA. Delaying science on, say, the effects of lead or air pollution can have deadly consequences, while unnecessary or overeager regulation may have massive costs. In practice, this dial has been twisted somewhat to the right already, slowing down potential rules by drawing out debate. Regulatory science can be glacial as it stands—it took decades for the agency to move on formaldehyde, for example—so any means of making that process more meticulous, through adding open-data mandates or extra layers of peer review, will tilt things more in favor of commercial interests, maybe at the cost of public health.
It’s also hard to figure how the agency’s apparent push for better, more transparent research (its claim to be swapping short-term speed for long-term reproducibility) can be squared with the Trump administration’s call for radical cuts to the EPA science budget. If Congress really cared to improve the quality of science at the agency, says Anastas, it might fund a study of the problem by the National Academies of Sciences, which could make recommendations for greater openness. Then the EPA would have to act on those recommendations or explain why it was going against the latest science on how to use science most responsibly.
Writing in the Washington Post last week, the Brookings Institution’s Robert Hahn argues that the EPA’s proposal may not be perfect but that it would likely end up doing more good than harm to science. He also faults the scientific community for “throwing stones” instead of offering practical suggestions for how to make it better. It’s hard to share Hahn’s optimism, given how the rule was drafted and the politics behind it. But even if Pruitt and the EPA’s appropriation of the open-science movement doesn’t leave much room for meaningful engagement, it seems ill-advised for scientists to shoot it down with mindless affirmations of the status quo.
Indeed, some opponents of the rule have made rather retrograde arguments against it, contending, for example, that standard scientific practices work well enough as is. “This is not about all of the details that scientists need to scrutinize each other’s work. That information is already widely available, and scientists spend a tremendous amount of time disclosing all of their data and methods to get their work published,” said Gretchen Goldman, research director for the Center for Science and Democracy.
Others have claimed that certain types of research, like studies of the Deepwater Horizon oil spill, simply can’t be reproduced—though as Andrew Gelman pointed out, you don’t need to restage a disaster to make the data from its fallout public, and those analyses could be scrutinized and tried again. Many also cite the cost of making data public, pointing to a Congressional Budget Office report from 2014, which said an open-science rule would add between $10,000 and $30,000 to the ledger for each study used by the EPA (for a total of $250 million per year).
These claims recapitulate some objections academic scientists made in the early days of the reform movement, according to Simine Vazire, co-founder of the Society for the Improvement of Psychological Science. Indeed, the new proposal’s use of replication rhetoric and research, including citations of both Ioannidis and Nosek, has put the open-science movement in a somewhat awkward position. “My summary assessment is that it’s 80 percent great and 20 percent off the rails,” Nosek told me. But if you were going only by the outrage that’s been expressed by environmental scientists and activists, it would be difficult to sort out those proportions.
Those who favor open science should also worry over backlash. Even before the EPA went out with its proposal, Vazire had heard nervous chatter that the methodological reform movement would get “weaponized” by right-wing ideologues and used to damage the credibility of science as a whole. Now those predictions have come true. “I think there’s a real danger in overreacting to the [EPA] rule,” she said, even if it does seem poorly designed, hypocritical, or even anti-scientific. “I’m at least as worried that people will react too strongly in the other direction, and say we need to protect our data against people who want to use it to draw conclusions that we don’t like. That makes scientists sound crazy.”
Put another way, it may seem as though scientists’ debates about the practice of their research (and its many flaws) do more harm than good when aired in public. But if these conversations were carried out with more transparency, if science weren’t quite so vain about its blunders and so overconfident in its self-governance, then perhaps the right-wing rhetoric that has swirled around this new proposal, with its talk of “secret science” and “hidden agendas,” wouldn’t be so potent to begin with.