‘I thought I had messed up my experiment’: How a grad student discovered an error that might affect hundreds of papers

Susanne Stoll

Earlier this month, we reported on how Susanne Stoll, a graduate student in the Department of Experimental Psychology at University College London, discovered an error that toppled a highly cited 2014 article — and which might affect hundreds of other papers in the field of perception.

We spoke with Stoll about the experience. 

Retraction Watch (RW): What did it feel like to find such a significant error? Did you doubt yourself at first, and, if so, when did you realize you’d found something both real and important?

Stoll: When I first came across the “error”, I had no clue what I was dealing with. I thought I had messed up my experiment because I obtained similar results in fairly distinct experimental conditions. I didn’t experience any extraordinary feeling – errors are everyday business in science, after all. I simply told my supervisor Sam Schwarzkopf and started checking my code to figure out where I had erred.

After lots of double checking, it was clear that there was nothing wrong with my experiment code, so I started checking my analysis code. Initially, I was unable to spot any errors.

It was not until I ran a control analysis that I realized the patterns of results I had initially quantified became clearer the more noisy measurements I kept in the data set. That was when I had the first solid evidence that something had gone awry, so I called Elisa Infanti into my office to tell her that we might be dealing with an artifact. She suggested performing my analysis on random data, which I did. This produced similar (albeit still quite different) results. I forwarded those to Sam, and we started developing more appropriate simulations and control analyses. That was the true beginning of our quest.
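Running an analysis on random data, as Stoll describes, is a general way to expose artifacts: if structured results emerge from pure noise, the structure must come from the analysis itself rather than from the data. Below is a minimal sketch of that idea in Python, illustrating a classic regression-to-the-mean artifact from binning noisy measurements; the pipeline and variable names are purely illustrative, not Stoll’s actual code.

```python
# Hypothetical sketch: feed pure noise through a bin-then-compare analysis
# and watch a spurious "condition difference" appear.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

true_values = rng.uniform(0, 10, n)          # identical ground truth in both conditions
cond_a = true_values + rng.normal(0, 2, n)   # noisy measurement, condition A
cond_b = true_values + rng.normal(0, 2, n)   # noisy measurement, condition B

# Bin the data by the noisy condition-A values, then compare conditions within bins.
bins = np.linspace(cond_a.min(), cond_a.max(), 11)
for lo, hi in zip(bins[:-1], bins[1:]):
    sel = (cond_a >= lo) & (cond_a < hi)
    if sel.any():
        # Although A and B measure the same quantity, B regresses toward the
        # grand mean within each bin, mimicking a systematic condition difference.
        print(f"bin [{lo:5.2f}, {hi:5.2f}): mean A-B difference = "
              f"{(cond_a[sel] - cond_b[sel]).mean():+.2f}")
```

Because the two “conditions” share identical underlying values, any apparent difference that survives this null test is manufactured by the selection and binning steps, which is exactly the kind of signal such a control analysis is meant to reveal.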

RW: Did you seek out advice from anyone (Dr. Schwarzkopf, for example)?

Stoll: I received plenty of advice along the way, such as the suggestion from Elisa I mentioned above. The simulations and control analyses Sam and I developed with Elisa’s help became fairly complex, because I performed my analyses in 2D space and, as a brain researcher, I tend to deal with thousands of data points. So at some point we realized that we needed to simplify matters to make further progress. That was when Sam broke the whole analysis down to a single data point.

Of course, we knew this scenario was overly simplistic, but it helped us a great deal in understanding the interdependencies in our analysis logic. We then used this simplistic case as a starting point to add in more complexity and to explore further factors we thought would matter in how the artifact manifests.

One of these factors I identified in the supplementary material of Benjamin de Haas’s 2014 study, so I ran a few simulations to see whether they supported my hypotheses, and they did. I then contacted Ben. He took me very seriously immediately, and some of his reanalyses corroborated mine. He also ran his own series of simulations and identified further factors modulating the artifact’s appearance. Scientifically speaking, this was excellent because we essentially cross-validated and complemented one another. At the same time, Sam ran another set of complementary simulations, and everything just fell into place, really.

After this epiphany moment, I had to properly connect all the issues we identified to the relevant previous literature. I also received a lot of indirect advice from reading up on the topic.

Some of the literature I really enjoyed reading was a book on regression artifacts by Donald Campbell and David Kenny (1999), articles on regression artifacts in other subfields by Nicholas Holmes (2009) and David Shanks (2017), an article on regression away from the mean by Schwarz and Reike (2018), and an article on circularity by Nikolaus Kriegeskorte and colleagues (2009).

RW: Do you have any advice for people in your position who find similar errors?

Stoll: Validate all your analysis procedures using simulations and/or repeat data. Try to surround yourself with researchers who actively practice open science, such as Sam and his lab, including Ben. Just to give you an example: although I did not know exactly what I was dealing with a year ago, Sam and Elisa had no issue with me presenting my ‘tale of unexpected complexities’ at a conference and even encouraged me to do so. Likewise, Ben was as keen as everybody else on the team to figure things out.
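One simple way to act on the “validate with simulations” advice is to wrap an analysis in a function and run it many times on data that, by construction, contains no effect; the null distribution of the reported effect should be centred on zero. The sketch below uses an intentionally biased stand-in pipeline to show the check catching a problem; all names are illustrative, not part of any published method.

```python
# Hypothetical sketch: validate an analysis pipeline on simulated null data.
import numpy as np

rng = np.random.default_rng(1)

def my_pipeline(data: np.ndarray) -> float:
    """Stand-in for a real analysis: a selection step followed by a summary 'effect'.

    The selection step deliberately biases the result, so the null check below
    should flag it.
    """
    top = data[data > np.median(data)]
    return float(top.mean() - data.mean())

# Run the pipeline on data that, by construction, contains no effect.
null_effects = [my_pipeline(rng.normal(0, 1, 500)) for _ in range(1000)]

print(f"mean effect under the null: {np.mean(null_effects):.3f}")
print(f"central 95% of null effects: "
      f"[{np.percentile(null_effects, 2.5):.3f}, {np.percentile(null_effects, 97.5):.3f}]")
# If the null distribution is not centred on zero, the pipeline itself
# manufactures an effect and needs rethinking before it touches real data.
```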


5 thoughts on “‘I thought I had messed up my experiment’: How a grad student discovered an error that might affect hundreds of papers”

  1. Do you think scientists have time to validate results with software they will never use again after the grad student leaves the lab? They need to publish as soon as possible, data quality be damned.

      1. Yes. Very little validation gets done, and that is due to the pressures of the system to have positive results immediately to get publications, and therefore grant money, which is all the school really cares about.

  2. I’m surprised that “talk to a statistician” does not seem to have been one of the options considered when having problems with a statistical analysis.

  3. I love hearing about actual scientific exploration from active researchers like Susanne and Ben. No defensiveness – no social drama – just science. Pure science. It gives me goosebumps every time!

