A second psychology researcher has resigned after statistical scrutiny of his papers by another psychologist revealed data that was too good to be true.
Ed Yong, writing in Nature, reports that Lawrence Sanna, most recently of the University of Michigan, left his post at the end of May. That was several months after Uri Simonsohn, a University of Pennsylvania psychology researcher, presented Sanna, his co-authors, and Sanna’s former institution, the University of North Carolina, Chapel Hill, with evidence of “odd statistical patterns.”
Simonsohn is the researcher who also forced an investigation into the work of Dirk Smeesters, who resigned last month. Last week, Yong reported that Simonsohn had uncovered another case that hadn’t been made official yet.
According to today’s story, Sanna has asked the editor of the Journal of Experimental Social Psychology — which is also retracting one of Smeesters’ papers — to retract three papers published from 2009 to 2011. These are the three he seems to have published there during that time:
- Rising up to higher virtues: Experiencing elevated physical height uplifts prosocial actions, cited twice, according to Thomson Scientific’s Web of Knowledge
- Think and act globally, think and act locally: Cooperation depends on matching construal to action levels in social dilemmas, cited three times
- When thoughts don’t feel like they used to: Changing feelings of subjective ease in judgments of the past, cited three times
The resignations of course follow that of Diederik Stapel, another psychology researcher.
Read Yong’s full report here.
That really is an odd pattern of significance:
“When thoughts don’t feel like they used to: Changing feelings of subjective ease in judgments of the past”: every significant effect was p < .001.
“Rising up to higher virtues: Experiencing elevated physical height uplifts prosocial actions”: every significant effect was p < .001.
“Think and act globally, think and act locally: Cooperation depends on matching construal to action levels in social dilemmas”: every significant effect was p < .01.
Where are all the p = .059's that I'm always cursed with?
Rising stars get p<.001! That's why they are rising, while the rest of us losers usually get p=.059! 🙂
p=.059 is plain rubbish.
p=.0501 is the number at which you may sincerely be fed up with your data…
exactly, think about all the published articles with p<0.00000000000000000000000001
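For a rough sense of why that pattern looks odd (this is a back-of-the-envelope sketch, not part of Simonsohn’s actual analysis): even when a real effect exists, honestly run studies at typical social-psychology sample sizes scatter their p-values widely, and very few land below .001. The numbers below (a true effect of d = 0.5, 20 subjects per group) are illustrative assumptions, not anything taken from the papers.

```python
# Minimal sketch: where do p-values land when a real but modest effect
# is tested at typical sample sizes? All parameters are assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
d, n, trials = 0.5, 20, 100_000   # assumed true effect size and per-group n

p_values = np.empty(trials)
for i in range(trials):
    control = rng.normal(0.0, 1.0, n)   # control group scores
    treated = rng.normal(d, 1.0, n)     # treatment group, true effect = d
    p_values[i] = stats.ttest_ind(treated, control).pvalue

print(f"p < .001         : {np.mean(p_values < 0.001):.1%}")
print(f".001 <= p < .01  : {np.mean((p_values >= 0.001) & (p_values < 0.01)):.1%}")
print(f".01 <= p < .05   : {np.mean((p_values >= 0.01) & (p_values < 0.05)):.1%}")
print(f"p >= .05         : {np.mean(p_values >= 0.05):.1%}")
```

Under those assumptions only a few percent of experiments clear p < .001, so a body of work in which every significant effect does so is exactly the kind of too-good-to-be-true regularity the post describes.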
The first one is reminiscent of this paper: http://www.ncbi.nlm.nih.gov/pubmed/20424062 , same year.
Another Psychological Science fluff paper of the same kind (“Elevation leads to altruistic behavior”).
Re: the papers, yes, it makes sense that those would be the three, and two of them are the ones that Simonsohn investigated. Cooper is currently travelling and away from his records, so he couldn’t confirm the exact papers to be retracted.
Cooper being the journal editor.
I have to agree with your “Another Psychological Science fluff paper of the same kind” because the titles of all these papers (Stapel included) make them sound like “well, duh!” Hard to avoid the suspicion that he/they deliberately made up trivial studies for which to manufacture data.
Here is another real aspect of this sadly ongoing problem. I was an editor back when a brief report by one of the above authors crossed my desk. While I no longer recall the details (it was in the early or mid 1990s), I do remember being impressed at how clear the results were, and I accepted the paper almost “as is” (some minor quibbles, but nothing serious).
Now, that paper may have been just as good as it appeared. But in light of current concerns, everything becomes tainted by association — leaving investigators interested in a particular topic with no idea how to proceed.
This is all quite sad for psychology. I hope the problems are addressed and issues redressed thoroughly; and that ultimately due diligence will lead to better understanding and knowledge of what is among the most fascinating topics — how and why we behave and think (I am not a behaviorist — thus the distinction) as we do.
The main problem in psychology is not so much data fabrication (though that may be the case for social psychology). The main problem is selective reporting. There are a ton of “classic” effects that are cited in textbooks and all that simply DO NOT REPLICATE. That is because the original articles reported the results of a very specific set of experimental parameters under which the cool results were obtained. However, the original articles misleadingly portrayed the results as much more general and robust than they really are. Why did they do that? Well, career reasons, obviously: Saying that the cool results would only be found for a very narrow range of parameters would render them much less interesting and general.
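To make that mechanism concrete, here is a small simulation (a sketch with assumed numbers, not anything from the commenter): run many underpowered studies of a weak true effect, “publish” only the significant ones, and the published effect sizes come out several times larger than the truth, so replications powered to the published literature will routinely fail.

```python
# Hypothetical sketch of selective reporting: only significant results of
# underpowered studies get written up, inflating the published effect size.
# The parameters (true d = 0.2, 15 subjects per group) are assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
true_d, n, studies = 0.2, 15, 20_000

published = []
for _ in range(studies):
    a = rng.normal(0.0, 1.0, n)         # control group
    b = rng.normal(true_d, 1.0, n)      # treatment group, weak true effect
    t, p = stats.ttest_ind(b, a)
    if p < 0.05 and t > 0:              # file-drawer: only "hits" are published
        pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
        published.append((b.mean() - a.mean()) / pooled_sd)  # observed Cohen's d

print(f"true effect size          : d = {true_d}")
print(f"mean published effect     : d = {np.mean(published):.2f}")
print(f"share of studies published: {len(published) / studies:.1%}")
```

The inflation is purely an artifact of the significance filter; nothing dishonest happens inside any single study, which is what makes the resulting non-replication so hard to trace back.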
Agreed. That certainly is one aspect of a complex set of issues. Is it unique to psychology? I doubt it. Is it more prevalent in psychology? Probably.
The specific issue with this particular problem is that graduate students are not instructed in the nature of coherent argument and the art of drawing permissible conclusions from a set of empirical findings. Thus, they generalize lab-based outcomes beyond (often way beyond) their comfort zone. This in turn occurs because psychology instructors themselves are often lacking when it comes to the logic of argument and inference vis-à-vis experimental results (or, for that matter, reasons to examine a question).
I do not think this regrettable situation is unique to psychology, but, in conjunction with the imperative to publish or perish, we do tend to turn out an uncomfortably large number of “well-educated drones” who follow the edict of “find, find, find” — irrespective of the broader context and greater relevance of those findings.
While the same is also true in other fields of science, their paradigms and methods are better established and clearer, and they have a general guiding theoretical superstructure that partially masks these pervasive drawbacks.
What to do? Well, in the case of psychology (the only area about which I can speak semi-coherently), a strong dose of the philosophy of logic and scientific argument would certainly help in understanding whether one’s findings refer to “nature at her ‘interesting’ joints” or merely to a particular task.
This is true of all disciplines:
“…instructors themselves are often lacking when it comes to the logic of argument and inference vis-à-vis experimental results”
So I removed the word “psychology.” I’m in the health sciences. In 20 years of graduate-level teaching and advising, I estimate that fewer than 7% of students understand logical argument construction and inference based on the data. The conclusions are often written as a reiteration of the study rationale.