First author blamed for retraction in prestigious medical journal

The authors of a Journal of Experimental Medicine paper have retracted it, blaming the first author for data and figure manipulation.

The paper, “The requirements for natural Th17 cell development are distinct from those of conventional Th17 cells,” was initially published in September 2011 and has been cited 25 times, according to Thomson Scientific’s Web of Knowledge. First author Jiyeon Kim was an MD-PhD candidate at the University of Pennsylvania until this year, according to a LinkedIn profile.

Here’s the notice:

The co-corresponding authors, Drs. Koretzky and Jordan, are retracting this publication due to their discovery that many of the figures prepared by the first listed author do not accurately represent the data from the underlying experiments. The primary issues discovered with this publication are listed below:

(1) The data presented in Fig. 1 A include inappropriate gating, inappropriate controls, and use of different cell populations than indicated.

(2) The data presented in Fig. 2 D include inappropriate gating, inappropriate controls, and use of different cell populations than indicated.

(3) The data presented in Fig. 3 A include the use of a different cytokine than indicated, and the data presented in Fig. 3 D include inappropriate gating.

(4) The data presented in several panels of Fig. 4 C were not derived from the correct congenic allele and inappropriate gates were used.

(5) The data presented in Fig. 5 D were not derived from the correct congenic marker, and the histograms in Fig. 5 E do not accurately depict the cell populations indicated.

As a consequence, any conclusions based on these data remain unsupported. None of the other authors were aware of these inaccuracies at the time of submission and review. All of the authors have agreed to this retraction. We deeply regret this circumstance and apologize for any adverse consequences that this might cause to the scientific community.

We contacted Kim, both principal investigators, and the school to find out more, but have yet to hear back. We also reached out to the editor, and we’ll update with any news.

Update, 1:30 p.m. Eastern, 9/3/14: As a commenter points out, Kim and colleagues retracted another paper on August 28, this one in Nature Immunology. Here’s the notice for “Natural and inducible TH17 cells are regulated differently by Akt and mTOR pathways”:

We are retracting this publication due to the discovery by the authors that some of the results presented in the paper are unsupported. We deeply regret this circumstance and apologize for any adverse consequences that this might have for the scientific community.

All authors agreed to the retraction of the paper with the following exceptions and clarifications: Jiyeon S. Kim and Weihong Hu could not be reached to comment on the retraction; however, J.S.K. signed an initial version of the retraction submitted to the journal.

The paper has been cited 13 times.

44 thoughts on “First author blamed for retraction in prestigious medical journal”

    1. All the problems in the figures are associated with flow cytometry (inappropriate gating, controls) of the samples. So the question is: did the supervisor ensure that the student was trained in flow cytometry? Why wasn’t it checked how the samples were being gated, and which controls were used, when the preliminary results were obtained? Supervisors need to take an interest in the process of deriving the results rather than just seeing the figures.

  1. How can we trust the other articles published by the first author after this retraction, with its long list of flawed content? J. Kim is first author on a Nature Immunology paper (2013) and another J. Exp. Med. paper (2014).
    Has there been an institutional investigation to clear these papers?

  2. I am particularly puzzled by this retraction, since this paper does not have a single “first author” but rather two co-first authors. Are both authors involved in these issues? More information would be helpful.

    “J.S. Kim and J.E. Smith-Garvin contributed equally to this paper.” – per the “author notes” below the publication on the JEM website.

    1. The retraction notice states that the “first listed author” was responsible for the misrepresentative figures, not “the first author” or “first authors.”

    1. It is my understanding that in scientific publishing all authors are co-responsible for all content. This is not restricted to the first author(s).

      1. It is my understanding that in scientific publishing all authors are co-responsible for all content.

        If this were genuinely the case, one would not see footnotes that read “equal contributor,” much less the budding of the “who did what” section in the general vicinity of the acknowledgments.

        1. I think contribution and responsibility are two completely different entities. Have you ever seen a publication with a footnote stating that two authors have equal responsibility? In the “who did what” section it is generally stated that all authors have read and consented to the final manuscript. Thus, all authors assume responsibility for the complete content of the paper, not just for the part they contributed.

      2. This is a complete fallacy and I’d hope that that would be well known. The majority of “Authors” listed on biomedical papers provided a tissue sample, plasmid or protein and at most read the manuscript before tacking it onto their CV.

  3. Gotta love the back-pedaling away from their student and from the paper they gladly put their names on, once the problems were exposed. How dishonorable to not assume ANY responsibility whatsoever.

    1. “…inappropriate gating, inappropriate controls, and use of different cell populations than indicated.”

      These are issues that the advisor should be helping the trainee with. Otherwise, who is teaching the student how to perform good science? “Use the FACS, talk to Bob, and present data in group meeting next week” is not training. One possibility is that the lab is dysfunctional and the student had to figure out the FACS machine all by themselves, which is truly sad. The other is poor on-the-job training from the senior grad student/post-doc/technician, which you would hope a student would notice, but might not until it is too late.

      “…the use of a different cytokine than indicated…”

      “…not derived from the correct congenic allele…”

  4. The Nature Immunology paper has also been retracted.

    Does anyone know the full story here? These are not easy manipulations to spot like Westerns or dodgy microscope images. Why did they start to think that something was up?

    (Apart perhaps from the fact that a PhD student getting two J Exp Med papers and a Nature Immunology paper is a rather unusual occurrence in itself…!)

    1. Indeed, and all within three years. That kind of productivity is suspicious, especially for a student, and should have raised eyebrows.

  5. This student should at least lose her PhD and be banned from publishing as an MD.

    It will be interesting to see how UPenn deals with this…

    1. I believe there is something very flawed with this logic. Indeed, the student has made errors. But she is a student, not a person with a PhD. Surely the ones who should be losing their jobs and being banned are Dr. Koretzky and Dr. Jordan, who appear to be the direct PhD supervisors. Good (i.e., responsible and professional) supervisors should:
      a) show their PhD students the ropes;
      b) check the data, including the accuracy of figures and if the figures correspond accurately to the data they claim to represent;
      c) take the blame in their role as supervisors and collectively as a team.

      The notice states clearly, “None of the other authors were aware of these inaccuracies at the time of submission and review.” That implies that the supervisors also failed to do quality control of their own paper, and of the student they were meant to guide and supervise. Contrary to what you claim, UPenn needs to consider clause 4 of the responsibilities of authors, as stated in the new definitions of the ICMJE [see discussion here: 1], although – strangely – JEM does not appear to follow the ICMJE’s definitions of authorship (at least none of the pages associated with the Instructions for Authors covers this issue). In fact, a search for “authorship” with the JEM search function returns no hits, which indicates that authorship, its definition, and the responsibilities of authors are not covered by JEM. This is a serious gap and an editorial/publisher oversight, in my opinion. Interestingly, there is a good editorial from 2008 in defense of the Creative Commons License that states, “Even before the United States existed, copyright law was introduced in Britain to protect authors from publishers’ monopoly on printed material. However, scientific publishers twisted these laws to their advantage, usurping copyright from authors.”


  6. I’ve been keeping an eye on this retraction for a while now, and I’m really pleased that RW finally covered it.

    JEM managed to produce a truly transparent retraction notice; the only problem is that it was written by the corresponding authors, so it represents nothing but their personal opinion about this case. I am adamant that not being aware of a grad student’s misconduct cannot serve as an excuse for a PI/corresponding author; rather, it is an admission of failure as a supervisor.

  7. I think there are clearly some facts that are missing that would shed light on this story, and I think people need to evaluate the information at hand a bit more carefully.

    First, there are only 2 first-author papers in question here, both of which have been retracted. The listing in PubMed in J Exp Med for 2014 is the actual retraction notice for the 2011 paper.

    Second, I think it’s premature to blame either of the advisors, Drs Koretzky and Jordan, or the co-first author until more details are known about the nature of the mistakes. While I agree that all co-authors are responsible for the content, fraud of the kind described in the retraction would be extremely hard to notice if it was intentional fraud. For instance:

    “…include the use of a different cytokine than indicated” is an example where it would be impossible to know by looking at the data if a different cytokine was intentionally used to mislead.

    Likewise data “not derived from the correct congenic allele and inappropriate gates were used” would be difficult to spot if outright fraud were involved. For instance, a Ly5.1 versus a Ly5.2 gate could show gating on Ly5.1 and the subsequent “daughter” gate could actually be data from the other. High resolution multi-color FACS experiments generate a huge amount of data, and it’s unrealistic to expect any advisor to sit next to the student and ensure the plots reflect the gating strategy depicted. While I agree they should check fidelity of data as much as possible and ensure students are trained, I don’t believe anyone advocates for a 2nd person to watch, for instance, data input into excel spreadsheets. Frankly, if PIs are spending their time on such activities, they are mismanaging their labs. There is a huge difference between ensuring data integrity while trusting students/staff/etc. and being remiss as a supervisor.

    The types of mistakes described in the retraction notice indicate intentional fraud in my mind, although I’m willing to wait for the full story. Again, all of the co-authors are responsible for the data, but 16 people across two papers failed to detect the fraud, along with everyone who reviewed the manuscripts. Ultimately, the supervisors are responsible for the content, but they also seem to have driven the retractions, along with a detailed accounting of the errors.

    On a purely anecdotal note, I have seen the first author present this data at conferences, and have spoken with her about the data. She spoke quite intelligently not only about FACS in her own work, but in the work of several others at the conference. Whether or not she physically ran the samples herself, she was perfectly aware of what are appropriate flow controls and gating.

    1. I agree with N’s points (September 3, 2014 at 5:51). The journals have given more than the usual detail about why the papers were retracted, and the authors appear to have done the right thing by retracting them. Of course we are not in a position to be certain that all the statements made are correct, but the more information that is provided, the more confidence I have in the process.

    2. ” I think it’s premature to blame either of the advisors, Drs Koretzky and Jordan”
      It is premature to blame anyone. At the moment we have the senior lab members pointing at a junior lab member: “It was all his/her fault.”
      Generally the person who physically performed the experiments can never escape blame, but it is not unknown for senior people to say: “This is how they do it in other labs….”
      At the very least, the senior people gave the junior a project where the hypothesis they proposed, which arose out of their preceding work, was false. And as pointed out above, they seemed quite happy with the high level of productivity as well.

  8. It is quite common for a PI to design a hypothesis, employ a PhD student/postdoc for a pre-defined project, and stress the implications of proving the hypothesis experimentally (big paper/career prospects) vs. failing to prove it (no paper, no future). When affirmative data is brought in, the PI is happy; if results don’t match, the PI blames it on the employee’s shortcomings. Oh, and without “good” results, the grant proposal that would pay the employee’s salary is in danger.
    So here is a question of ethics: is the PI co-responsible when the data turns out to have been manipulated? I think yes.

    1. @Leonid Schneider: I think that is very well put, and you wrote exactly what I was thinking. I would attribute all the data fabrication and research misconduct that we see these days to this single trait of certain PIs. They create an environment that is fertile ground for fabrication.

      1. And by extension, departments and universities are to blame for fostering these traits in newly minted PIs. I’ve seen young PIs’ behaviors change from doing careful science with active involvement at the bench to retreating to their offices and only poking their heads out to boast about a new glamor mag acceptance or grant. The emphasis on impact factor (e.g. my former department had websites and talks dedicated to the publicizing of “high impact factor publications” – by extension, the rest didn’t matter) and immediate translational relevance by department heads, program directors, and deans is indirectly to blame for the degradation of the scientific integrity in individual labs.

        1. I believe this is an extremely pertinent observation, in particular the “abuse” of the impact factor (IF). No matter how its proponents try to swing mass opinion, it is a fact that the IF is used to game the system. This is a recurrent theme and underlies so much of what drives misconduct, and partially retractions. So we need to understand the power plays that underlie these events, and which may be among the direct causes, even if only in part. Let’s not be naïve: science and the publishing business is a multi-billion-dollar investment and revenue-creator, so the interests at stake are not quite black and white.

        Problem 1: the scientific community is failing to provide a comprehensive listing of those institutes that factor the IF score into grants, research funding, bonuses, and salaries. Until we get such comprehensive lists on a university-by-university basis, Thomson Reuters will just sit back and watch scientists and the universities that impose nonsensical rules upon them fight among themselves. For Thomson Reuters, whether science implodes or not is likely irrelevant, because its web of resources around the globe is so vast and so broad that if science implodes, it will just move its investments elsewhere. However, Thomson Reuters can see that science, despite these power plays and struggles, will survive, and it is investing heavily now in expanding its metrics, to make the “game” all the more complex, and thus attractive. Take any Elsevier journal, for example, that carries an IF score. In plant science, let me choose Scientia Horticulturae: until roughly the end of June, before the 2013 IF scores were released, one would have observed only the IF score and, where available, the 5-year IF score. Now one sees four metrics for the journal, including the Source Normalized Impact per Paper (SNIP) and the SCImago Journal Rank (SJR) [1].

        Translated: total crap, meaningless numbers that are simply meant to inflate the egos of scientists into thinking that what they have published, and where they have published it, is actually of some worth, simply because it carries a number. But while scientists and “academic” institutes continue to lend their direct support to the IF, they only fortify the relevance and importance of Thomson Reuters on this planet. Seeing that this page is about the White House, could some US scientist please indicate here the campaign donations that Thomson Reuters has been making to Republican and Democratic candidates over the past 3–5 years? What “action groups,” PACs, and/or super-PACs do they support, if any? What about Elsevier and other publishers? I understand that unlimited donations are not illegal in the USA. So one has to look behind this request by the White House for transparency in science, because it doesn’t quite gel with the current spirit of honesty in the US Government, I believe. Current events over the past 5–10 years would support my latter notion. When a government starts to ask such questions, one has to ask: why the sudden interest, and whose interest is being satisfied?

    2. It is quite common to a PI to design a hypothesis, employ a PhD/postdoc for a pre-defined project,


      Just how “common” is it? It does not work like that in the computational sciences. A PhD student is actually expected to come up with his own “hypotheses,” not merely to implement or apply models proposed by others.

        1. This is quite common in the biological sciences. Most of the time the PI comes up with a hypothesis, writes a grant on it, and then asks graduate students and postdocs to prove that hypothesis with experiments. If they fail to prove the hypothesis, the PI blames their technical skills. But if they show positive results by any means, the PI will be happy, because now he/she will get manuscripts, tenure, and new grants. So, as said earlier, this is how they create fertile ground for data fabrication. If somebody later finds the scientific misconduct, all the blame is simply put on these junior researchers.

          1. In theory, your hypothesis makes perfect sense. Why else would PIs, or professors, aim to conduct science that a priori would lead to negative results, which could not be published except in the Negative Results Journal (or journals with similar titles)? So the theoretical basis of your claim is unfortunately true, but this might not necessarily be a result of the dishonesty of the scientist or the PI/supervisor/professor; it may be a direct result of the corrupting influence of the publishers and the publishing industry. Allow me to explain why in a bit more detail.

          Most publishers would most likely not be able to sell a journal if the results published therein contained negative data, because no one wants to read about negative data. Think about it: in practice, how much money would a scientist or university pay for a subscription to a journal filled with negative data versus one filled with positive data? Even though negative data are absolutely essential, and most likely account for the great majority of results that emerge from any experiment or lab, scientists will tend to report only the positive results (while ignoring, or not reporting, the negative ones) simply because, I believe, the publishers have created a toxic climate of “unreasonable positive thinking” that only tends to accept papers with “novel” and “positive” results. Thus, I believe the publishers are also responsible for indoctrinating the editors and the peer pool, too. In other words, it is the skewed impetus created by publishers (who are now seriously threatened by alternative models like F1000Research (paid OA), the All Results Journals (free OA) [1], etc.) that is also driving academic “bias” and/or fraud (in extreme cases), I believe. If you remove this impetus, this skewed motivation, and tell scientists that their positive AND negative results will both be equally considered upon submission to a journal, then you might actually see more “honest,” balanced, and unbiased research in the literature. I think most scientists can attest that many of their experiments did not work; in fact, many “negative” results are “disguisedly” reported in published data sets (either as significantly worse than the best treatment, or even than the control), but they are not portrayed in a negative light simply because they are contrasted against “the best,” “the most significant,” or the most “effective” treatments.
          So, there is not only bias in the data set itself, and in what it states, but also bias in the way scientists report on negative results (by only emphasizing the positive ones). In that sense, statistics can be used to strengthen or weaken that bias, depending on the “desired” outcome.

          What my rather wordy claim above is trying to state, in a more synthetic way, is:
          a) Publishers have created an ambience of “peer review” that favors positive results over negative ones.
          b) Scientists are stuck in an “ethical” conundrum: do they emphasize the positive at the expense of the negative in order to generate a paper that is likely to be accepted only if it clearly shows the former? Why is this issue not covered by COPE, as a subset of “honest” reporting?
          c) Understandably, there will be downstream effects in academia when such a set of conditions exists, i.e., with such a biased or skewed publishing environment (driven by “positive thinking” capitalism): grants will only be given to PIs, or labs, that generate papers. The greater the achievement (aka Nature), the greater the grant (also factoring into the equation the gambling factor, the impact factor). The core of the problem is thus quite simple and easy to understand, but difficult to erode or replace.
          d) Ultimately, given the “toxic” nature of the field of publishing into which science then finds itself inserted (i.e., manipulated and controlled by the publishers, who then also control scientists’ voices and freedoms), is it surprising to then learn that PIs would encourage (or force) their students to “design a hypothesis, employ a PhD/postdoc for a pre-defined project” (FooBar), or the multiple claims that Mathew makes?

          With respect to the superiority of the current publishers, it is the words of Mickemusing that resonate loudest and most clearly: “forced positivity and constant (often inappropriate) cheer leading are just insidious ways to get out of taking responsibility while assuming moral superiority.” July 29, 2014 [2].


  9. From the description in the retraction notice (especially for the 2011 JEM study), it appears that the first author (the first listed author) is at fault. It seems she went above and beyond to manipulate data in ways that no PI would be able to detect even if they spent hours scouring every aspect of the experiment. While I agree that PIs are often at fault, this is SIMPLY NOT THE CASE in this instance. I feel terrible for the PIs involved, who are being punished for reasons beyond their control, which is quite frankly unfair.

    1. Dear KungFuNinja,
      a scientific project takes several years between the early ideas and publication. During this time the PIs had many occasions to wonder about the excessive productivity of their PhD student, whose results had to be unambiguously perfect to allow two publications in top-tier journals in such a short time. Those who work in a lab know how rare such results are, yet how welcome they are to many PIs. Are you actually aware that PIs sometimes do not even want to know the details of how experiments were performed and quantified, so that they can enjoy the perfect results without any bad conscience?

      1. Dear Leonid,

        The PIs could have wondered all they wanted, but given how flow cytometry works, no reasonable person would have been able to figure out the first author’s manipulations. My guess is that extremely strong suspicions were necessary before the PIs even thought to go back and look at the raw flow cytometry data. Kudos to them for doing that and coming forward with it.

        1. An additional problem: blind faith. Any experienced or responsible PI knows never to trust anything, or anyone, blindly. To trust the work of the first author, a student who is so inexperienced, and then use the student as the scapegoat to avoid taking a hit, while trying to appear as the hero by “bringing forth” the evidence of the manipulated FC data, is just plain naive. The PIs, like any supervisor, professor, or senior author, must be directly responsible for the mistakes made by junior scientists, even more so when those individuals are mere, inexperienced students. No one is condoning the manipulation by the student in this case. But man up and take responsibility, PIs! You get paid a big salary precisely because some of your core responsibilities are to be RESPONSIBLE for your students, to OVERSEE their activities, and to GUIDE them (from experimental conception, through the conducting phases, to data analysis and paper write-up). There were ample opportunities for quality control BEFORE the paper was published, but these were clearly squandered. The result of student manipulation AND poor supervisory responsibility = retraction. Students should not be thrown under the bus, with two strata of justice applied when the #$”& hits the fan. An almost identical analogy could be made with the Obokata case: there, Obokata was the junior (the invited), while Sasai and Niwa were the senior, and responsible. It is usually the PIs and senior researchers who get the fat salaries and the juicy grants, some of which trickle down to the student or post-doc. The greater those sums, the greater the responsibility for oversight. These really basic issues seem to be being ignored.

    2. ” I feel terrible for the PI’s involved who are being punished”

      How are the PIs being punished? I have missed this.

      My point is that if apportioning blame is the right thing to do in such circumstances, then my personal feeling is that I lack the confidence to do so at this point, since there is a large power imbalance between the last authors and the first author.

      I agree with LabGuy that it is very easy to fake flow cytometry (actually, it is very easy to fake a lot of biology methods – only fools use Photoshop), but since this group’s results are heavily flow cytometry based, we can’t look through their past papers to get any idea whether this problem was a one-off or not.

      We could get a little more confidence perhaps if we knew the events leading up to this retraction. For example, if a member of the group noticed problems and went to the PIs, who supported this internal whistleblower, then we might tentatively say it was a one-off problem. But even then, a really comprehensive investigation might want to know whether the first author had been reporting negative results and what the PIs’ response to this was.
      On the other hand, if the retraction had been driven from outside, with other groups trying to replicate the work and saying they were finding nothing like what had been reported at all – then we would still be in the dark.

      But for me personally, I don’t feel it is fair, in the absence of any impartial investigation that genuinely tries to get the accounts of all parties involved, to simply accept the word of the PIs as to where responsibility lies.

  10. In all actuality, they probably went through her data once she left the lab, opened up the FlowJo files, and saw that the plots she used did not match the actual channels in the file. This may have been because they were suspicious of the student in general, tried to reproduce the data and were unsuccessful, or were reviewing the data for a new lab member to continue the project. There is no way one could know just by looking at the finished figures that they were false. The discovery was made by someone who knew what they were looking for. I’m glad that when I was a student, a new postdoc tried to reproduce my FACS stainings and they worked (everyone is always nervous that things work only in their hands; it’s the nature of biology). It sucks that a lot of labs probably spent a lot of money and resources ($) following up on this line of research.
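    The scenario described above – plots whose channels don’t match the channels actually recorded in the data file – amounts to a simple consistency check once someone thinks to look. The sketch below is purely illustrative: the channel names, gate names, and plain-dict data structures are invented for this example, and real FlowJo workspaces store gating hierarchies in a far more complex XML format.

    ```python
    # Hypothetical sketch of the consistency check the commenter describes:
    # flag any gate whose referenced channels are absent from the data file.
    # All names and structures here are invented for illustration.

    def find_mismatched_gates(file_channels, gates):
        """Return gates that reference channels absent from the data file.

        file_channels: set of channel names recorded in the FCS file
        gates: mapping of gate name -> list of channel names the gate uses
        """
        mismatched = {}
        for gate_name, used_channels in gates.items():
            missing = [ch for ch in used_channels if ch not in file_channels]
            if missing:
                mismatched[gate_name] = missing
        return mismatched

    # Example: a plot claims to gate on a CD45.1 channel, but the sample
    # was only stained and acquired for CD45.2.
    channels = {"FSC-A", "SSC-A", "CD45.2", "IL-17A"}
    gates = {
        "lymphocytes": ["FSC-A", "SSC-A"],
        "donor cells": ["CD45.1"],          # channel not in the file
        "Th17": ["CD45.1", "IL-17A"],
    }
    print(find_mismatched_gates(channels, gates))
    # flags "donor cells" and "Th17" as referencing a missing channel
    ```

    Of course, as the thread notes, the harder frauds to catch are those where the referenced channel does exist but the displayed population was taken from a different sample – no automated check on file metadata would detect that.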
