Two years after questions surfaced about work by former Harvard psychology professor Marc Hauser, an official government report is finally out.
It’s not pretty.
The findings by the Office of Research Integrity were first reported by the Boston Globe, which was also first to report the issues in Hauser’s work. They’re extensive, covering misconduct in four different NIH grants (we’ve added some links for context):
• Respondent published fabricated data in Figure 2 of the paper Hauser, M.D., Weiss, D., & Marcus, G. “Rule learning by cotton-top tamarins.” Cognition 86:B15-B22, 2002, which reported data on experiments designed to determine whether tamarin monkeys habituated to a sound pattern consisting of three sequential syllables (for example AAB) would then distinguish a different sound pattern (i.e., ABB). Figure 2 is a bar graph showing results obtained with 14 monkeys exposed either to the same or different sound patterns than they were habituated to. Because the tamarins were never exposed to the same sound pattern after habituation, half of the data in the graph was fabricated. Figure 2 is also false because the actual height of the bars for the monkeys purportedly receiving the same test pattern that they had been habituated to totaled 16 animals (7.14 subjects as responding and 8.87 subjects as non-responding).
Respondent retracted the paper in 2010 (Cognition 117:106).
• In two unpublished experiments designed to test whether or not tamarin monkeys showed a greater response to certain combinations of unsegmented strings of consonants and vowels than others, Respondent falsified the coding of some of the monkeys’ responses, making the results statistically significant when the results coded by others showed them to be non-significant. Respondent acknowledged to his collaborators that he miscoded some of the trials and that the study failed to provide support for the initial hypothesis.
This research was never written up for publication.
• In versions of a manuscript entitled “Grammatical Pattern Learning by Human Infants and Monkeys” submitted to Cognition, Science, and Nature, Respondent falsely described the methodology used to code the results for experiments 1 and 3 on “grammar expectancy violations” in tamarin monkeys either by claiming coding was done blindly or by fabricating values for inter-observer reliabilities when coding was done by only one observer, in both cases leading to a false proportion or number of animals showing a favorable response.
Specifically, in three different experiments in which tamarin monkeys were exposed first to human voice recordings of artificial sounds that followed grammatical structure and then exposed to stimuli that conformed to or violated that structure, Respondent (1) provided an incorrect description of the coding methodology by claiming in the early versions of the manuscripts that “two blind observers” coded trials and a third coded trials to resolve differences, while all of the coding for one experiment was done just by the Respondent, and (2) in a revised manuscript, while Respondent no longer mentioned “two blind observers,” he claimed that “Inter-observer reliabilities ranged from 0.85 to 0.90,” a statement that is false because there was only one observer for one of the experiments.
Furthermore, in an earlier version of the manuscript, Respondent falsely reported that “16 out of 16 subjects” responded more to the ungrammatical rather than the grammatical stimuli for the predictive language condition, while records showed that one of the sixteen responded more to grammatical than ungrammatical stimuli, and one responded equally to grammatical and ungrammatical.
Respondent and his collaborators corrected all of these issues, including recoding of the data for some of the experiments prior to the final submission and publication in Cognition 2007.
• In the paper Hauser, M.D., Glynn, D., Wood, J. “Rhesus monkeys correctly read the goal relevant gestures of a human agent.” Proceedings of the Royal Society B 274:1913-1918, 2007, Respondent falsely reported the results and methodology for one of seven experiments designed to determine whether rhesus monkeys were able to understand communicative gestures performed by a human.
Specifically, (1) in the “Pointing without food” trial, Respondent reported that 31/40 monkeys approached the target box while the records showed only 27 approached the target (both results are statistically significant), and (2) there were only 30 videotapes of the “Pointing without food” trials, while Respondent falsely claimed in the paper’s Materials and Methods that “each trial was videotaped.” Respondent was not responsible for the coding, analyses, or archiving but takes full responsibility for the falsifications reported in the published paper. Respondent and one of his coauthors replicated these findings with complete data sets and video records and published them in Proceedings Royal Society B 278(1702):58-159, 2011.
• Respondent accepts responsibility for a false statement in the Methodology section for one experiment reported in the paper Wood, J.N., Glynn, D.D., Phillips, B.C., & Hauser, M.D. “The perception of rational, goal-directed action in nonhuman primates.” Science 317:1402-1405, 2007. The statement in the paper’s supporting online material reads that “All individuals are . . . readily identifiable by natural markings along with chest and leg tattoos and ear notches.” In fact, only 50% of the subjects could be identified by this method, thus leading to the possibility of repeated testing of the same animal.
Respondent and one of his coauthors replicated these findings with complete data sets and video records and published them in Science 332:537, 2011 (www.sciencemag.org/cgi/content/full/317/5843/1402/DC2 – published online 25 April 2011).
• Respondent engaged in research misconduct by providing inconsistent coding of data in his unpublished playback experiment with rhesus monkeys exploring an abstract pattern in the form of AXA by falsely changing the coding results where the prediction was that habituated animals were more likely to respond to an ungrammatical stimulus than a grammatical one. After an initial coding of the data by his research assistant, in which both Respondent and assistant agreed that an incorrect procedure was used, the Respondent recoded the 201 trials and his assistant coded a subset for a reliability check.
The Respondent’s codes differed from the original in 36 cases, 29 of them in the theoretically predicted direction, thereby producing a statistically significant probability of p = <0.01. Respondent subsequently acknowledged to his collaborators that his coding was incorrect and that the study failed to provide support for the initial hypothesis. This research was never written up for publication.
Hauser “neither admits nor denies committing research misconduct” but has agreed to a three-year period, starting August 9, 2012, in which he will have any of his research that is funded by the Public Health Service — that’s the parent body of NIH — supervised and certified. He also can’t do any peer review for the PHS, nor serve on any PHS committees.
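For readers who want a feel for the numbers in the report, here is a rough sketch of how counts like these map onto significance levels. The report does not say which statistical tests were used, so this assumes a two-sided exact binomial (sign) test against a 50% chance level; the function name `binom_two_sided_p` is ours, purely for illustration. Under that assumption it checks the report's remarks that 27 of 40 and 31 of 40 approaches would both be significant, and that 29 of 36 coding changes falling in the predicted direction is very unlikely by chance.

```python
from math import comb


def binom_two_sided_p(k: int, n: int) -> float:
    """Two-sided exact binomial p-value for k successes in n trials.

    Assumes a chance level of 0.5 (our assumption, not stated in the report),
    so the distribution is symmetric and the two-sided p-value is twice the
    probability of the more extreme tail, capped at 1.
    """
    extreme = max(k, n - k)  # distance from n/2 is the same on either side
    tail = sum(comb(n, i) for i in range(extreme, n + 1)) / 2**n
    return min(1.0, 2 * tail)


# Counts taken from the report; the choice of test is an assumption.
cases = {
    "pointing trial, as recorded (27/40)": (27, 40),
    "pointing trial, as published (31/40)": (31, 40),
    "recoded trials in predicted direction (29/36)": (29, 36),
}

for label, (k, n) in cases.items():
    print(f"{label}: p = {binom_two_sided_p(k, n):.4g}")
```

Under these assumptions all three p-values come out below 0.05, and the 29-of-36 case is below 0.01, which is consistent with the report's wording. A different test or chance level would give different figures.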