Last month, we published a guest post by Jean Hazel Mendoza about the retraction of a Molecular Cell paper for sampling errors, flawed analysis, and miscalculation.
Mendoza heard back from Jean-François Allemand, the head of one of the labs involved. Allemand tells Retraction Watch by email that when his group tried to repeat the experiment, they began to suspect “missing” or “averaging” of data points in the retracted paper:
Initially, different contributors to the work were in charge of different aspects of the experiment, and in particular only one person was in charge of the analyses leading from raw images in movies to intensity plots. When we re-analyzed the data from the same movies, we, even the person who did the published analysis, could not reproduce all the same points, as if some points were missing or some averaging, which we could not reproduce, had been performed on the initial data. Clearly this step contained errors and biases.
One of the reviewers had trouble understanding the data, Allemand says, but accepted the word of the first author, Giuseppe Lia, “the person in charge of this experiment”:
The data in this type of experiment are not like a gel to analyse, where you have clear bands that you can see on a picture, or a simple electric signal. They come from a naturally very noisy system. You need to spend hours analysing many individual fluorescent spots that do not look exactly the same to get an average picture. The protocol for the analysis was indeed clear and rational (as can be read from the paper). If the curves obtained from this analysis show no clear bias or error, it is not simple to detect a mistake that results from a long averaging process. There are some standard elements to take into account (error bars, meaningful statistical distributions…) but the data presented looked OK. Indeed, one of the reviewers found a curve that was difficult to interpret, but the person in charge of this experiment said it had been reproduced several times, so we had to trust the experimental facts, and this is also what the referee accepted. Among the authors, my colleague B. Michel was the first one who felt there was a problem when the duplication of the setup started. She played a crucial role in the discovery of the problems.
Lia has since left his post at Centre de Génétique Moléculaire, Retraction Watch has learned.
Allemand says his group is undergoing the “painful” but “necessary process” of repeating previous studies that used the same methods:
When a paper has no more solid data to support its main results it is natural to retract it. In fact, we are currently performing analyses and experiments on a previous work that used the same technique, to validate or not the other published data. It is a painful, time-consuming, but necessary process, part of common scientific procedure.
He adds:
The best solution is to give greater value to publications that confirm or contradict previous results as compared to totally new ones. After all, the essence of scientific inquiry is reproducibility and independence of the data from the observer.
We couldn’t have said it better ourselves.
“When a paper has no more solid data to support its main results it is natural to retract it.” While in the current context I agree that the discrepancy between the results and the data (after pre-processing) warrants a retraction, I think this statement should be read with caution: it should not apply to null findings that result from high statistical and methodological rigor (even with regard to noisy data, as long as the preprocessing can be reproduced).
This is pretty shocking. The issues with observational methods are not complicated. You get two or three raters, each one rates independently, and you do a reliability analysis: you look for appropriate variation between samples and little variation within samples. Again and again, the lack of methodological sophistication and the failure to use appropriate expertise have led to a bad outcome for the authors. This is done in psychiatry, radiology, and any experiment involving human judgement. The first step is the training of raters. The second is the reliability study. The third is the validity study. Then you do real data.
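For readers wondering what the reliability analysis this commenter describes looks like in practice, here is a minimal sketch of a one-way intraclass correlation, a standard agreement measure that is high exactly when variation between samples dominates variation among raters scoring the same sample. It assumes Python with NumPy; the function name and example scores are illustrative only and have nothing to do with the retracted study.

```python
import numpy as np

def icc_oneway(ratings):
    """One-way random-effects ICC(1,1) for a (samples x raters) matrix.

    Values near 1 mean most variance lies between samples rather than
    between raters scoring the same sample -- the pattern the commenter
    describes as "appropriate variation between samples and little
    variation within samples".
    """
    ratings = np.asarray(ratings, dtype=float)
    n, k = ratings.shape                      # n samples, k raters each
    grand_mean = ratings.mean()
    sample_means = ratings.mean(axis=1)

    # Between-sample and within-sample sums of squares
    ss_between = k * np.sum((sample_means - grand_mean) ** 2)
    ss_within = np.sum((ratings - sample_means[:, None]) ** 2)

    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))

    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Hypothetical example: 5 items, each scored independently by 3 raters
scores = [[4.1, 4.0, 4.3],
          [2.0, 2.2, 1.9],
          [3.5, 3.4, 3.6],
          [5.0, 4.8, 5.1],
          [1.2, 1.0, 1.1]]
print(f"ICC(1,1) = {icc_oneway(scores):.2f}")  # near 1 => raters agree
```

An ICC near 1 indicates that independent raters largely agree; a value near 0 would flag exactly the kind of observer-dependent analysis Allemand describes, where only one person could produce the published numbers.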