How the media hypes “research that is absurd on its face”

Aaron Brown says his new book, Wrong Number: How to Extract Truth from a Blizzard of Quantitative Disinformation, “isn’t an exposé of fraud—Retraction Watch covers that ground. It’s about legitimate-looking research that is absurd on its face.” 

In the book, published this month by Wiley, Brown uses dozens of case studies to show “why widely reported and influential studies in top journals are not just wrong, but obviously and egregiously illogical or contrary to simple fact. My focus is less on the policy and statistical errors than on why no one seems to care,” he says.

Brown is a risk manager working in hedge fund management. He also teaches statistics at New York University and the University of California San Diego and writes columns for Reason and Bloomberg, among other outlets. We asked him to tell us more about how he thinks about the nexus of science, journalism and the publish-or-perish system that also pushes researchers to engage with non-experts to promote their work.

RW: Your case studies often suggest that if journalists consulted statisticians more often, the public would be less likely to adopt bad information. Is that a fair characterization?

Brown: Closer to the opposite: Don’t outsource your skepticism. 

There are times when journalists should defer to experts — including statisticians — but the message of Wrong Number is you don’t need statistics, you just need to treat a journal article’s claim with the same skepticism you would apply to someone trying to sell you a mutual fund.

Ernest Rutherford is reported to have said, “If your experiment needs statistics, you ought to have done a better experiment.” I would extend that to journalists. If you need a statistician to write about a study, you ought to report on a better study.

RW: You argue for the occasional use of what you call “limited honesty” about scientific findings. Tell us about that.

Brown: I understand the dilemma of researchers who come up with findings they know will cause harm if released. A study finding some negative outcomes from a vaccine could cause parents who distrust the medical establishment to withhold vaccines from a thousand children who need them for every thoughtful medical decision it informs. When the science behind the nuclear winter hypothesis — that even a limited nuclear war threatened all life on the planet — began to be challenged, I understand why prominent scientists preferred to conceal that fact as the hypothesis seemed to be driving progress on arms control and peace negotiations.

What matters more than honesty is truth. When Gregor Mendel probably fudged his work to make the results neater, he made it more likely its truth would be recognized. A narrow, accountant’s honesty can get in the way of brilliant scientists forging revolutionary advances. If the ideas are right, we can forgive some unjustified data cleaning. If the ideas are wrong, data honesty doesn’t make them any more useful.

RW: The book includes a lot of reanalysis of overly simplistic statistical claims, and in your reanalysis, you often rely on government statistics. How would you advise when to trust and when not to trust governmental numbers and claims?

Brown: I’m a universal skeptic. Treat government figures with the same skepticism you would the odometer of a used car. That said, many government statistics have some advantages. The methodologies are usually disclosed in detail, and the numbers compiled by career staffers. With documented exceptions they’re computed the same way every period and they’re subjected to rigorous criticism and frequent use. If nothing else, they provide a common starting point for researchers, which is useful even if they’re not particularly meaningful.

But I give plenty of examples in the book where people go wrong. A headline National Transportation Safety Board study claiming that curbside bus services had seven times the fatal accident rate of traditional terminal carriers — used to shut down 26 “Chinatown” bus services with excellent safety records — turned out to have included 30 traditional carrier fatal accidents (24 by Greyhound) stuffed alongside seven curbside carrier fatal accidents. A subfield of economics, “kinked demand curve” theory, turned out to be based on a misunderstanding of how the government collected price information. We often see figures like the unemployment rate among Black workers in Wyoming based on a survey too small to have covered any Black workers in Wyoming.

RW: You come across as a big fan of rigorous peer review. But you also worry that peer review can quash novel ideas that ought to be given a hearing. How do you reconcile this?

Brown: I’m a fan of rigorous post-publication peer review — the messy, ongoing process of papers being replicated, contested, extended, or quietly ignored as the field moves past them. That’s where science self-corrects. Pre-publication review is a device for enforcing conformity, not filtering out error.

A finding becomes trustworthy when it gets woven into the broader web, when other researchers build on it, find related effects, fail to replicate it, modify it. A claim that lives only in its original paper, never built upon and never refuted, is not yet science regardless of where it was published. Letting competing researchers block entrants into the process causes more harm than good.

That said, I absolutely support rigorous audit by statisticians without stakes in the subject matter findings. Alongside pre-registration of hypotheses, full disclosure of data and code and holdout samples, this could filter out a lot of bad research. Other gatekeepers could check citations (a major issue is papers that cite studies for crucial assumptions, where the study’s abstract says precisely the opposite of the paper’s claim). But studies should be blocked for errors, not because anonymous colleagues don’t like the findings. And decisions about importance should be made by named editors who take responsibility for them.

I have an even more radical suggestion: In most fields researchers shouldn’t be doing the studies in the first place. Being a subject-matter expert in psychology, medicine, astronomy or some other field does not make you competent at the data collection and analysis to test your ideas. I would love to see disinterested, specialized testing institutions perform studies suggested by subject-matter experts.

RW: Speaking of peer review, your book is published by Wiley, which of course also publishes thousands of scholarly journals. What was the peer review and fact-checking process like?

Brown: I won’t pretend a trade book gets the kind of scrutiny a journal article does. It doesn’t. So readers should treat my book with the same skepticism I urge for everything else.

There were multiple rounds of editorial review focused on argument, structure, and accessibility, and I went through line-edits and copy-editing. The publisher sent the manuscript to outside readers whose comments shaped the revisions. I sent it out on my own, including to people whose views differ from mine. For a random example, Victor Haghani, a friend of mine who blurbed the book, caught a major error in my account of capture-recapture analysis.

RW: Do you think of yourself as a scientific sleuth?

Brown: A sleuth investigates wrongdoing — fraud, fabrication, paper mills, image manipulation. That’s important work and I’m glad people do it, but it isn’t mine. The studies I dissect were, for the most part, honestly produced by researchers who believed their results. My complaint isn’t that they cheated. It’s that their logic or statistics didn’t support what they claimed, and that the surrounding ecosystem distorted and amplified the claim.

I’m a critic, not a detective. I’m not trying to get papers retracted; I’m trying to refute them. Retraction removes a thread from what Xenophanes called the “woven web of conjecture.” Refutation leaves both sides of the issue accessible for future researchers to learn from and build on. That’s more useful for results that are honestly reported but reinterpreted by better analysis. Future researchers can take up the debate, or learn from prior mistakes.

RW: In many ways, this book tries to help average readers see past biases. You write, “I am no populist, but I am a libertarian.” Did you attempt to control for your own biases in the writing of this book, and if so, how?

Brown: Readers who embrace my philosophy of skepticism should be skeptical of my belief that biases played no part in my selections. But I don’t think you’ll find a specifically libertarian bias. The book does seem to have a populist bias, but only because populists are the ones who distrust official dogma, and I’m criticizing bad research that either has become official dogma, or aspires to it.

Libertarians have promoted plenty of wrong numbers. I know many who think Sam Peltzman (a former professor of mine and a major influence) established definitively that seatbelt requirements increase traffic fatalities, and that the Food and Drug Administration kills one hundred patients by delaying or rejecting good treatments for every life it saves by blocking bad ones. Sam did good work on both questions, but there’s a lot of other literature that casts considerable doubt on both claims.

I’ve written about these and other libertarian wrong numbers, but I don’t need to fight bias to do it. I’m a moral libertarian. I believe it’s wrong for the government to coerce a competent adult to wear a seatbelt or to prevent her from choosing her preferred medical treatment. I have no strong belief either way about whether that principle would result in more or fewer deaths.

RW: You write that “researchers and editors don’t even try to publish stuff that’s more likely true than false.” (p. 129) What do you mean by this, and how would you remedy the situation?

Brown: To publish more true results than false ones you have three main parameters to work with: the ratio of true to false hypotheses that you test, the fraction of false hypotheses you reject and the fraction of true hypotheses you accept. Most journals establish a standard only for the middle parameter and do not report even general information about the other two.

Journals prefer surprising results, creating an incentive to test a lot of hypotheses that are likely false. Low-power tests that fail to reject lots of false hypotheses and support relatively few true ones are cheaper and easier to run (sloppier methods, smaller sample sizes) and can generate more publications by failing to reject false hypotheses than they miss by failing to support true ones.

For example, suppose you test 1,000 hypotheses that would be surprising if true, so only 10% are true. You reject 95% of the 900 false hypotheses, but publish the 45 remaining false results. Meanwhile you reject 80% of the 100 true hypotheses. You end up publishing 45 false results to 20 true ones.
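
The arithmetic in Brown's example can be reproduced in a few lines. The sketch below is illustrative only: the function name `published_counts` and its parameter names are hypothetical labels for the three parameters he describes (the share of tested hypotheses that are true, the fraction of false hypotheses that escape rejection, and the fraction of true hypotheses that are supported). It also assumes, as the example does, that every hypothesis that is not rejected gets published.

```python
def published_counts(n_hypotheses, prior_true, false_pass_rate, power):
    """Return (false_published, true_published) for a batch of tested hypotheses.

    prior_true      -- share of tested hypotheses that are actually true
    false_pass_rate -- fraction of false hypotheses that are NOT rejected
    power           -- fraction of true hypotheses that are supported
    All names are hypothetical; they are not taken from the book.
    """
    n_true = n_hypotheses * prior_true
    n_false = n_hypotheses - n_true
    false_published = n_false * false_pass_rate
    true_published = n_true * power
    return false_published, true_published


# Numbers from the example: 1,000 hypotheses, 10% true,
# 95% of false ones rejected (5% pass), 80% of true ones rejected (20% pass).
false_pub, true_pub = published_counts(1000, 0.10, 0.05, 0.20)
share_true = true_pub / (false_pub + true_pub)

print(f"{false_pub:.0f} false vs {true_pub:.0f} true; "
      f"{share_true:.0%} of published results are true")
# prints: 45 false vs 20 true; 31% of published results are true
```

Changing only `power` from 0.20 to 0.80 in this sketch would give 80 true results against the same 45 false ones, which is why the two parameters journals rarely disclose matter as much as the significance threshold.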

It is possible to do good research under this standard. Skilled researchers can identify hypotheses likely to be true, but surprising to others. They can apply stringent evidentiary standards — extraordinary claims require extraordinary evidence. They can do careful, large-sample, high-power tests. But few researchers have the talent and resources to do this at a rate to satisfy university and granting agency publication requirements. So even many good researchers do some low-quality work, and some researchers can do only low-quality work. The journal system is set up to accommodate academic careers, not to publish reliably true results.

And sadly, this is not the end of the story. In addition to the fraud and other issues highlighted at Retraction Watch, there are many well-known flaws even in the “p-value” significance calculations for the middle parameter.

It’s no surprise that, as John Ioannidis pointed out in 2005, most published research findings are false.

Journals that wanted a low ratio of false results would at least make some effort to disclose standards for prior probability and power, in addition to significance. If they did publish these and readers did the math there would be a lot more skepticism about published results.

