The week at Retraction Watch featured mass resignations from a journal’s editorial board, software that writes papers for you, and a retracted retraction. Here’s what was happening elsewhere:
- “Academic journal publishing is headed for a day of reckoning,” says librarian Patrick Burns. (The Conversation)
- A Beijing BBQ joint, named after The Lancet, is offering discounts to researchers who have recently published papers, based on impact factor. (China Daily)
- A European consortium announces a prize for publishing “negative” scientific results. Got any nominations? (ECNP)
- “Null results are exceedingly common” and should be published, says Anupam Jena, who offers up some of his own. (STAT)
- Neuroskeptic sees you making Trump jokes in your paper titles and wants it to stop. (Twitter)
- In management research, “Controlling for other factors, women are slightly more likely to be cited than men.” And “In general, gender differences in citation impact are marginal to non-existing.” (Journal of Informetrics)
- “Statistical Criticism is Easy; I Need to Remember That Real People are Involved,” writes Frank Harrell. (Statistical Thinking)
- “Too much discussion gets locked away in the journal clubs and never sees the light of day,” says David Kent. (University Affairs)
- Leif Nelson, Joseph Simmons, and Uri Simonsohn say psychology is having a renaissance. (Annual Review of Psychology)
- Developing countries “are particularly vulnerable to predatory journals,” says Sioux McKenna. (The Conversation)
- “Reviewing Better.” Thoughts from Ben Britton. (Medium)
- “Although I do receive intellectual gratification from my peer review efforts, it is increasingly difficult to reconcile these intangible benefits with the time required.” (Andrew Wilner, Medscape)
- Bart Penders argues that “two different modi operandi emerge when it comes to authorship.” (Accountability in Research)
- A new astronomy journal “publishes summaries of ongoing research, brief clarifications, and comments that don’t warrant a lengthy scientific paper.” (Dalmeet Singh Chawla, Physics Today)
- “Clearly now is the time for the U.S. research enterprise, and for us at Johns Hopkins, to re-evaluate our processes and incentive systems,” says Paul Rothman, the dean of Hopkins’ medical school. (Karen Nitkin, Hopkins Medicine)
- “What makes a data-sharing effort trustworthy to you?” asks Angela Villanueva. (Baylor College of Medicine blog)
- In South Africa, a publisher resists the government’s demands to retract a journalist’s exposé of the president. (Sunday Times)
- Gareth Jones writes about receiving invitations to write or speak “on topics about which I know nothing.” (Otago Daily Times)
- An international team presents the “percentage-based author contribution index.” (Research Integrity and Peer Review)
- A think tank takes back its assessment of the GOP House tax bill after finding an error, prompting Reuters to retract a story. (Washington Post)
- How much does it cost to run eLife 2.0? Head of Technology Paul Shannon runs the numbers. (eLife)
- Why you gotta be so mean? Taylor Swift’s lawyer sends a blogger a threatening letter — and draws the ACLU’s (winking) ire. (NPR via WBGO)
- Many retractions aren’t consistently indexed in PubMed and Web of Science, says Marion Schmidt. (JASIST)
- Another blow for Sci-Hub: The American Chemical Society wins a copyright suit against the site, and a US court orders internet companies to block it. (Science)
- Too good to be true: An over-hyped study helped create the narrative that chocolate is a health food. (Julia Belluz, Vox)
- How we choose to judge science has a cultural and historical legacy, says Alex Csiszar. (Nature)
- A new survey says women in PhD programs publish less often than their male counterparts. (Katarina Zimmer, The Scientist)
- Coosje Veldkamp offers ideas on how to recognize and address “the human fallibility of scientists.” (PsyArXiv)
Like Retraction Watch? Consider making a tax-deductible contribution to support our growth. You can also follow us on Twitter, like us on Facebook, add us to your RSS reader, sign up on our homepage for an email every time there’s a new post, or subscribe to our daily digest. Click here to review our Comments Policy. For a sneak peek at what we’re working on, click here. If you have comments or feedback, you can reach us at [email protected].
Ask a thousand academics: the only people who dislike Sci-Hub are rent-seekers. It will be interesting to see how far US courts are willing to go to support rent-seeking.
Frank Harrell’s comments about the ORBITA blinded placebo controlled clinical trial of heart stents are most unfortunate indeed. I continue to lament that a statistician I have long trusted now rails against reasonable statistical studies, bashing p-values inappropriately, and harping that only Bayesian methods can save us.
The high-profile paper published in The Lancet (the ORBITA trial) shows a very good understanding of statistical issues. The authors recognized that no truly blinded study of this medical manoeuvre had ever been done. Years of anecdotal publications litter the literature, enough to convince many who do not understand statistical issues that such a trial would be unethical. What is unethical is to continue to promote ill-founded medical manoeuvres on the basis of poorly done studies.
These authors worked hard to convince others of the errors in that thinking, and arranged for a properly blinded clinical trial. They registered their trial plan in advance with ClinicalTrials.gov and pre-published the protocol in The Lancet. They identified the smallest between-group difference of medical relevance (30 seconds) and performed a power calculation, using then-available data, showing that 100 cases per treatment group would provide 80% power to detect such a difference.
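To make the arithmetic concrete, here is a minimal sketch of that kind of a-priori power calculation in Python, using statsmodels. The 30-second threshold and the 100-cases-per-group figure come from the comment above; the 75-second standard deviation is a hypothetical value chosen so that the numbers reproduce the stated 80% power, since the trial's actual variance assumption is not quoted here.

```python
# Hedged sketch of an a-priori power calculation for a two-sample t-test.
# delta and the per-group target are from the comment; sd is hypothetical.
from statsmodels.stats.power import TTestIndPower

delta = 30.0  # smallest between-group difference of medical relevance (seconds)
sd = 75.0     # assumed within-group standard deviation (hypothetical value)
effect_size = delta / sd  # Cohen's d = 0.4

analysis = TTestIndPower()

# Power achieved with 100 cases per treatment group at alpha = 0.05
power = analysis.power(effect_size=effect_size, nobs1=100, alpha=0.05,
                       ratio=1.0, alternative='two-sided')
print(f"power with 100 cases per group: {power:.2f}")  # ~0.80

# Or, equivalently, solve for the sample size needed to reach 80% power
n = analysis.solve_power(effect_size=effect_size, power=0.80, alpha=0.05,
                         ratio=1.0, alternative='two-sided')
print(f"cases per group for 80% power: {n:.0f}")  # ~99
```

Under these illustrative assumptions, 100 cases per group yields almost exactly the 80% power the authors reported, which is the sense in which the sample size was adequate by design.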
The authors, reviewers, and editor did not commit the classic “absence of evidence is not evidence of absence” error that Harrell alleges in an update to a previous blog entry of his (“Statistical Errors in the Medical Literature”, first published April 8 and updated November 4, 2017). In the presence of a power analysis showing adequate sample size to detect any difference larger than the smallest difference of medical relevance, a large p-value does provide sound statistical evidence that a medically relevant difference is likely absent, i.e. that the null hypothesis is the relevant hypothesis to accept, at the stated type II error rate. This study measured a difference of 16.6 seconds between the two groups, well below the minimum difference of medical relevance specified a priori. Had the true difference exceeded 30 seconds, 4 out of 5 such studies would have detected it. This one did not, so the authors' statistical conclusion is entirely valid: these data support the null hypothesis at the stated type II error rate.
Harrell has become fond of picking the endpoint of a confidence interval and saying, “see, the difference could be this big, so accepting the null hypothesis is bogus, and they should have done a Bayesian analysis”. Harrell declares this clinical trial to be “small”. Yet the authors’ power analysis showed that 200 cases would be adequate to detect a difference of 30 seconds or more in 80% of trial attempts (a type II error rate of 20%). In what sense, then, is this trial small? It enrolled the requisite number of cases indicated by a proper power analysis. Testing too few cases is a waste of resources: with too few cases the type II error rate is either unknown or unacceptably large, so a large p-value cannot be interpreted as licensing acceptance of the null hypothesis and yields no interpretable result. Testing many more cases than the power analysis indicates risks denying an effective treatment to the control group, should the treatment show a medically relevant improvement. That is the whole point of doing a priori power analyses: to ensure that neither too few nor too many cases are enrolled, since both scenarios carry ethical problems.
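For contrast, here is a rough sketch of the confidence-interval reading Harrell favours, under the same hypothetical assumptions as the power sketch above (SD of 75 seconds, roughly 100 cases per group). The trial's actual interval is not quoted in the comment, so these numbers are illustrative only.

```python
# Illustrative 95% CI for the observed 16.6-second difference, assuming a
# hypothetical SD of 75 s and ~100 cases per group (not the trial's data).
import math
from scipy import stats

diff = 16.6           # observed between-group difference (seconds)
sd, n1, n2 = 75.0, 100, 100
se = sd * math.sqrt(1 / n1 + 1 / n2)         # standard error of the difference
tcrit = stats.t.ppf(0.975, df=n1 + n2 - 2)   # two-sided 95% critical value

lo, hi = diff - tcrit * se, diff + tcrit * se
print(f"95% CI: ({lo:.1f}, {hi:.1f}) seconds")  # roughly (-4, 38)

# Harrell's reading: the upper endpoint can exceed the 30-second threshold,
# so the interval alone does not rule out a clinically relevant effect.
# The commenter's reading: the design had 80% power to detect a 30-second
# difference, so a non-significant result supports accepting the null at
# the pre-specified 20% type II error rate.
```

The two readings start from the same numbers; the disagreement is over whether the pre-specified power analysis or the interval's endpoints should govern the interpretation.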
This weekend I will host a visiting friend who is recovering from a stroke induced by a portion of a stent breaking away and lodging in his brain. My friend will travel by train, no longer being able to drive because of probably permanent damage to the part of the brain that processes important driving-related visual cues. Placing a stent is not a benign manoeuvre, and if people are going to suffer consequences like those my friend experienced, there should be solid statistical evidence that the manoeuvre provides a substantial medical benefit, enough to outweigh the harms it can also induce. Unfortunate comments such as Harrell's do not help clarify these issues for those not well versed in statistical methods.