Retraction Watch

Tracking retractions as a window into the scientific process

Language of a liar named Stapel: Can word choice be used to identify scientific fraud?

with 9 comments

A pair of Cornell researchers have analyzed the works of fraudster Diederik Stapel and found linguistic tics that stand out in his fabricated articles.

David Markowitz and Jeffrey Hancock looked at 49 of the Dutch social psychologist’s papers — 24 of which included falsified data. (Stapel has lost 54 papers so far.)

According to the abstract for the article, “Linguistic Traces of a Scientific Fraud: The Case of Diederik Stapel,” which appeared in PLoS ONE:

When scientists report false data, does their writing style reflect their deception? In this study, we investigated the linguistic patterns of fraudulent (N = 24; 170,008 words) and genuine publications (N = 25; 189,705 words) first-authored by social psychologist Diederik Stapel. The analysis revealed that Stapel’s fraudulent papers contained linguistic changes in science-related discourse dimensions, including more terms pertaining to methods, investigation, and certainty than his genuine papers. His writing style also matched patterns in other deceptive language, including fewer adjectives in fraudulent publications relative to genuine publications. Using differences in language dimensions we were able to classify Stapel’s publications with above chance accuracy. Beyond these discourse dimensions, Stapel included fewer co-authors when reporting fake data than genuine data, although other evidentiary claims (e.g., number of references and experiments) did not differ across the two article types. This research supports recent findings that language cues vary systematically with deception, and that deception can be revealed in fraudulent scientific discourse.

In more detail:

Liars have difficulty approximating the appropriate frequency of linguistic dimensions for a given genre, such as the rate of spatial details in fake hotel reviews [8], the frequency of positive self-descriptions in deceptive online dating profiles [10], or the proportion of extreme positive emotions in false statements from corporate CEOs [11]. Here we investigated the frequency distributions for linguistic dimensions related to the scientific genre across the fake and genuine reports, including words related to causality (e.g., determine, impact), scientific methods (e.g., pattern, procedure), investigations (e.g., feedback, assess), and terms related to scientific reasoning (e.g., interpret, infer). We also considered language features used in describing scientific phenomena, such as quantities (e.g., multiple, enough), terms expressing the degree of relative differences (e.g., amplifiers and diminishers) and words related to certainty (e.g., explicit, certain, definite).

We were also interested in whether the fake reports contained patterns associated with deception in other contexts.
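
In plain terms, the comparison comes down to counting how often words from each of these semantic categories appear, normalizing by document length, and checking whether the rates differ between the fabricated and the genuine papers. Here is a minimal sketch of that kind of category-frequency count in Python; the word lists are just the illustrative examples quoted above, not Wmatrix's actual semantic domains, and the file names are hypothetical:

    import re
    from collections import Counter

    # Illustrative category word lists drawn from the excerpt above;
    # the study itself relied on Wmatrix's much larger semantic domains.
    CATEGORIES = {
        "causality":     {"determine", "impact"},
        "methods":       {"pattern", "procedure"},
        "investigation": {"feedback", "assess"},
        "certainty":     {"explicit", "certain", "definite"},
    }

    def category_rates(text):
        """Return each category's frequency per 1,000 words of text."""
        words = re.findall(r"[a-z']+", text.lower())
        counts = Counter(words)
        total = len(words) or 1
        return {name: 1000 * sum(counts[w] for w in vocab) / total
                for name, vocab in CATEGORIES.items()}

    # Hypothetical usage: compare a fabricated and a genuine paper's full text.
    # print(category_rates(open("fraudulent_paper.txt").read()))
    # print(category_rates(open("genuine_paper.txt").read()))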

To probe Stapel’s studies, Markowitz and Hancock:

applied a corpus analytic method using Wmatrix [19], [20], an approach that is commonly used for corpus comparisons (e.g., [21], [22]). Wmatrix is a tool that provides standard corpus linguistics analytics, including word frequency lists and analyses of major grammatical categories and semantic domains. Wmatrix tags parts of speech (e.g., adjectives, nouns) in relation to other words within the context of a sentence (e.g., the word “store” can take the noun form as a retail establishment or a verb, as the act of supplying an object for future use).

You can see a table of Stapel’s word choices here.
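
Wmatrix itself is a web-based corpus tool rather than a library you can call directly, so as a rough stand-in for its part-of-speech layer, the same kind of context-sensitive tagging can be sketched with NLTK (an assumption of this illustration, not the authors' actual pipeline):

    import nltk

    # One-time model downloads for the tokenizer and tagger
    # (resource names can differ across NLTK versions).
    nltk.download("punkt", quiet=True)
    nltk.download("averaged_perceptron_tagger", quiet=True)

    sentence = "We store the results and then walk to the store."
    tagged = nltk.pos_tag(nltk.word_tokenize(sentence))
    print(tagged)
    # The first "store" is tagged as a verb and the second as a noun,
    # the same context-sensitive distinction Wmatrix makes before it
    # counts adjectives, nouns, and other grammatical categories.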

But the Cornell researchers express caution about the obvious leap here — using linguistic tools to probe manuscripts for evidence of fraud before they’re published:

… [I]t is tempting to consider linguistic analysis as a forensic tool for identifying fraudulent science. This does not seem feasible, at least for now, for several reasons. First, nearly thirty percent of Stapel’s publications would be misclassified, with 28% of the articles incorrectly classified as fraudulent while 29% of the fraudulent articles would be missed. Second, this analysis is based only on Stapel’s research program and it is unclear how models based on his discourse style would generalize to other authors or to other disciplines.
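
To put those percentages in concrete terms (reading them, as an assumption, as 28% of the 25 genuine papers being flagged and 29% of the 24 fraudulent papers being missed), the arithmetic works out to roughly seven errors of each kind, or about 71% overall accuracy:

    # Back-of-the-envelope check of the quoted error rates; treating
    # "28%" and "29%" as per-class rates is an assumption here.
    n_genuine, n_fraudulent = 25, 24
    false_positives = 0.28 * n_genuine    # genuine papers flagged as fraudulent (~7)
    misses = 0.29 * n_fraudulent          # fraudulent papers classified as genuine (~7)
    accuracy = 1 - (false_positives + misses) / (n_genuine + n_fraudulent)
    print(round(false_positives), round(misses), round(accuracy, 3))  # 7 7 0.715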

Written by Adam Marcus

August 27, 2014 at 9:30 am

9 Responses

  1. Reblogged this on This Got My Attention and commented:
    Identifying fraudsters and plagiarizers is always difficult.

    Mike

    August 27, 2014 at 10:42 am

  2. “the frequency of positive self-descriptions in deceptive online dating profiles”

    You mean there are online dating profiles that aren’t deceptive?

    littlegreyrabbit

    August 27, 2014 at 11:03 am

  3. This is an interesting analysis of a unique dataset. As a practicing “data scientist” who works on text mining, I would also caution (as the authors already do) about hoping for a “silver bullet”-type solution based on such work. As we all know, prediction is quite difficult – even in this case of a balanced gold standard set, accuracy is encouraging, but has a long way to go – let alone in the more realistic case of a wildly unbalanced test set, on a different domain, etc. Some additional comments on this blog (in Dutch, via Twitter):

    http://nederl.blogspot.nl/2014/08/je-kunt-ons-alles-wijs-maken-over.html

    Note there is a newer 2014 work by the same authors: http://www.academia.edu/6643662/Linguistic_patterns_in_fraudulent_science_writing_style

    From the abstract on the conference site (didn’t have time to ask for/read paper yet) it seems that when executing a similar type of analysis on additional data, fraudulent papers may have lower “readability scores” and less “concrete language” than genuine ones. Again, authors seem to cautiously emphasize the descriptive rather than predictive nature of the work.

    In my humble opinion, nothing will beat U. Simonsohn’s recommendation to “just post” the data. And many fields are definitely moving in that direction.

    Maria

    August 27, 2014 at 11:47 am

  4. Reblogged this on fragmentedvision and commented:
    Do journals have the same tool these researchers used? As long as the false-positive rate is not too high, could journals use this as a screening method to evaluate manuscripts before sending them out for peer review?

    harshark1978

    August 27, 2014 at 12:53 pm

    • The results (and features) in the paper above are specific to the work of one researcher (and even in that case, the rate of false positives is high). However, if the follow-up 2014 work shows “readability scores” (for which there are standard measures based on outputs of standard tools) may be a signal of problematic work, a journal could easily compute such a score. The tools used by the featured researchers are well-known tools from previous work (they have significant limitations I won’t go into here though :)).

      A journal may automatically score a submission and have a separate mechanism for handling very low readability ones – the nice thing is that no one needs to be unnecessarily accused of fraud, as an unclear, unreadable submission clearly needs additional scrutiny regardless.

      Maria

      August 27, 2014 at 2:02 pm
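
      For reference, a standard readability measure such as Flesch Reading Ease can be computed in a few lines with an off-the-shelf package like textstat; this is purely illustrative, since neither paper specifies its tooling, and the cutoff below is made up:

          # pip install textstat
          import textstat

          manuscript = open("submission.txt").read()        # hypothetical file
          score = textstat.flesch_reading_ease(manuscript)  # higher = easier to read
          print("Flesch Reading Ease:", round(score, 1))
          if score < 30:  # illustrative cutoff, not taken from either paper
              print("Very hard to read; flag for extra editorial scrutiny.")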

  5. It would be interesting to study James Hunton’s 50 or so papers, after Bentley University implied possible misconduct, to see whether a pattern similar to Stapel’s exists.

    https://www.bentley.edu/files/Hunton%20report%20July21.pdf

    Scott Allen

    August 27, 2014 at 2:13 pm

  6. Unsurprising that the usual data-dependent cherry picking is the basis for digging out differences post hoc. Ironically, they resort to data-dredging while they overlook a blatant signal Stapel sent.

    Mayo

    August 27, 2014 at 3:18 pm

  7. In Stapel’s autobiography he claims that he hated inventing data, became nervous whenever he had to do it, and as a result, did it in a hurry.

    This might also apply to when he was writing up papers based on made-up data. It would be interesting to check the rate of ‘typos’ in the original manuscripts (the final papers will have been copyedited).

    • I don’t think so, Neuroskeptic. As a psychologist, I think that because Stapel’s intention was to become more famous and admired, he had to submit his manuscripts properly, no matter whether the data were real or not.

      Betfried van Efget

      September 8, 2014 at 7:34 pm

