How critics say a computer scientist in Spain artificially boosted his Google Scholar metrics

Juan Corchado

Want a higher h-index? Here’s a way – but be warned, it’s a method that will raise some eyebrows.

Take the example of Juan Manuel Corchado, a computer scientist at the University of Salamanca in Spain. He has the 145th-highest h-index in the country. But many of his nearly 39,000 citations come from his own papers.

This conference abstract, about the Internet of Things and blockchain for smart cities, for instance, includes 44 references to Corchado’s own papers out of a total of 322. And this conference abstract, presented at a conference about artificial intelligence in educational technology in Wuhan, China, in July 2021, contains exactly the same references as the one about blockchain for smart cities.

Other examples of short conference abstracts by Corchado listing dozens of citations to his own previous papers also exist. 

It’s not clear why anyone would try to inflate their Google Scholar metrics in Spain, according to Alberto Martín-Martín, an information scientist and bibliometrician at the University of Granada in Spain. Research evaluation there still focuses heavily on the Journal Impact Factor, said Martín-Martín.

“In other countries like Italy, I know that the evaluation procedures take into account citation counts of individual documents, but in Spain that’s not so much the case,” he added. 

“The self-citation in this case is certainly extensive,” says Simone Ragavooloo, research integrity manager at the British Medical Journal in London, UK. “Whilst I would hesitate to call this behaviour suspicious or harmful, it looks as though the author has not understood the purpose of citation and attribution in scientific literature.” 

Corchado told Retraction Watch on February 16 that he had broken his arm and would be slow in sending us his comments. He has not responded to further requests for comment since then.

Martín-Martín said that by his calculations, just under 22% of Corchado’s citations are to his own papers that are listed on Google Scholar — around 8,400 out of nearly 39,000 citations. 

The main drawback of Martín-Martín’s approach is that not all of Corchado’s papers are listed on his Google Scholar page, making it difficult to accurately determine the exact rate of self-citation. 

According to Martín-Martín’s analysis, more than 11,000 citations to Corchado’s work come from papers posted on ResearchGate, of which 9,300 are from papers that are not available anywhere else. Some researchers whose papers on ResearchGate cite Corchado’s work appear to do so excessively across several papers.

One of those is Arturo Perez Pulido of Televant (Telecom Ventures) in Spain, who has cited Corchado’s work just under 4,000 times to date. For instance, all 40 references in this conference paper of Pulido’s on ResearchGate cite Corchado’s papers. “He barely receives any citations, but he makes a lot of them,” Martín-Martín says, referring to Pulido.

We weren’t able to find contact details for Pulido. 

Even if someone is deliberately excessively citing certain papers and uploading them to Google Scholar, it’s difficult to point a finger at the culprit, Martín-Martín says. “It is very easy for anyone to upload documents that include references to repositories, and these references could be pointed to any paper,” he says. “In a similar manner, anyone could copy texts from any paper and try to pass them as their own in unmoderated repositories such as ResearchGate.”

Petr Heneberg of the Charles University in Prague, who has conducted research about citation manipulation, says he doesn’t see anything wrong in uploading one’s own abstracts or presentations to Google Scholar. He added: “The WOS citation counts of this person contain 1,300 self-cites, which could appear as a high number, but the total number of cites is 13,656, so it is well below the acceptable limit — some 10% is ok, even more self-cites would still be ok.”
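The self-citation figures quoted in this article can be put side by side with a few lines of arithmetic. The Python sketch below is only an illustration: the function name is hypothetical, the Google Scholar total is rounded to 39,000 because the article says only "nearly 39,000", and "WOS" is read as Web of Science. The two databases index different sets of documents, and the Google Scholar figure counts only papers listed on Corchado's profile, so the two percentages are not directly comparable.

```python
def self_citation_share(self_citations: int, total_citations: int) -> float:
    """Return self-citations as a percentage of all citations."""
    if total_citations <= 0:
        raise ValueError("total_citations must be positive")
    return 100.0 * self_citations / total_citations


# Figures as quoted in the article. The Google Scholar total is given only
# as "nearly 39,000", so 39,000 is an approximation (assumption).
google_scholar = self_citation_share(8_400, 39_000)
web_of_science = self_citation_share(1_300, 13_656)

print(f"Google Scholar: {google_scholar:.1f}%")  # prints "Google Scholar: 21.5%"
print(f"Web of Science: {web_of_science:.1f}%")  # prints "Web of Science: 9.5%"
```

On these inputs the Web of Science share sits just under the roughly 10% that Heneberg describes as acceptable, while the Google Scholar estimate is a little more than twice that.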

What Heneberg found more surprising was Corchado’s publication record. He notes that although Corchado published a doable number of papers recently — 25 papers in 2021, for instance — his numbers were significantly higher before. For instance, in 2009, Corchado co-authored a striking 603 papers — most of them conference papers. 

Nicholas Robinson-Garcia, a social scientist at the University of Granada who has penned a study showing how easy it is to manipulate citations on Google Scholar, agrees that the trends are suspicious. “I believe the harm depends on the importance we place in tools such as Google Scholar for evaluative purposes,” he says. 

Robinson-Garcia said one researcher even contacted him after he published his Google Scholar paper asking for advice on how to boost his citations. He didn’t reply, “but some time later checked him out and he had definitely done it,” he said. “I’ve also seen some cases of compulsive self-citation including teaching materials, but I do not know how much is pure egomania and how much is specifically directed at boosting their citations, or both.”

Robinson-Garcia adds: “If these behaviours are not unique to this individual, then it reflects serious problems with the excessive attention paid to bibliometric indicators at the individual level in Spain. The Spanish system is an anomaly in Europe (maybe also along with Italy) in the fact that it has an evaluative system centralised at the national level which assesses individual performance. This puts such a burden in the administration that it has to rely on bibliometric indicators to be efficient, which harms the quality of such evaluation.”

19 thoughts on “How critics say a computer scientist in Spain artificially boosted his Google Scholar metrics”

  1. I definitely couldn’t write 603 articles in a single year, even if someone else already did all of the data collection/analysis. Although I’m sure co-authors account for much of the work, I wonder if there isn’t a shred of ‘recycling’ going on somewhere in that vast body of work.

    1. Where does that 603 articles in 2009 come from? Even on Google Scholar I see “only” less than 200 and it seems to me that some of those are actually different persons with the same or similar name.

      1. This is a good point, so I would like to clarify that my use of ‘603’ is only a reference to the text.

        There is also the possibility that these are mixed results from ResearchGate or some other indexing service, or that Scholar is missing conference results or something.

    2. I was rather surprised when 25 papers in one year was mentioned as a “doable” amount. If someone publishes 25 papers in one year, I wonder how much they actually contributed to each.

    3. Well, it is more than just “recycling”. If you look at Corchado’s papers referenced below, you will see that he has published the same paper (the text is almost the same in all versions) in four different journals:
      – Florentino Fdez-Riverola, Juan M. Corchado: FSfRT: Forecasting System for Red Tides. Appl. Intell. 21(3): 251-264 (2004)
      – Florentino Fdez-Riverola, Juan M. Corchado: FSFRT: Forecasting System for Red Tides. A Hybrid Autonomous AI Model. Appl. Artif. Intell. 17(10): 955-982 (2003)
      – Florentino Fernández Riverola, Juan M. Corchado: Forecasting red tides using an hybrid neuro-symbolic system. AI Commun. 16(4): 221-233 (2003)
      – Florentino Fernández Riverola, Juan M. Corchado: CBR based system for forecasting red tides. Knowl. Based Syst. 16(5-6): 321-328 (2003)

  2. I have to say I’m somewhat disappointed by Corchado’s strategy of boosting his h index! He could do better! How? Get a couple of like-minded chumps into an authorship cartel by making everyone co-author on everybody’s papers, even if they haven’t lifted a single finger for the research reported in that paper. Ideally, have some hapless and contract-dependent grad student write the actual papers, but cede first author position to whoever of the “seniors” has been promised this honor (and then have the grad student do his own data collection/writing work on the side, in addition to spending her or his paid work time on this scheme). Now if everyone involved also cites everybody’s work in such papers, you can boost your h index to the moon. So from this perspective, Corchado is actually an under-achiever.

  3. We all know you can buy citations from papermills now, right? They’ll churn out a load of fake papers in ghost author names just to boost your cites. And even if the papers get retracted, the cites still count. Can’t lose.

    This system of excellence = citations is a disaster

  4. Everybody boosts their h-index with self-citations; it’s no secret. That’s why the Scopus h-index (not Scholar) without self-citations is mandatory at my university for confirmations or promotions. This guy went a little over lol.

    1. True, but Scopus is also a misery: you need to actively feed it for it to work, and since they claim it’s automatic, not many people do. I have published around 65 papers. Scholar has all of them, Semantic Scholar (good midway point between Scholar and Scopus in my view) is missing a couple. Scopus has 45. And I am not talking about conference papers, in my field we don’t do that. Also not in predatory journals. Scopus is just bad at indexing, period.

      And the problem is that sometimes academic authorities use Scopus without telling you, for things where it’s important. We got news one day that our faculty was using this for determining funding between departments. Which is the first time I checked my Scopus profile. It turns out that 15 of my papers, some as a postdoc and some with several other authors, were all listed at a university where I have actually never worked but where some of the coauthors were located. 10 emails and a few months later, they were all listed as mine and in my institution, but still.. I am not opposed to them using Scopus, but the limitations should be emphasized and people should be told it will be used so they can check completeness.

  5. Calling Arturo Perez Pulido, come in Arturo Perez Pulido. Are you a sock puppet, Arturo Perez Pulido?

      1. If you take a closer look at these “conference” papers, they don’t even cite these papers in the text.
        This whole scheme is an exploitation not only of the metric system, but also of the underlying algorithm that identifies references.

  6. I just have one question before I lose my mind: How common is this?

    We are literally putting some of the most creative minds on the planet to compete against bots over who writes the longest intertwingled list!! And we are still promoting the bots’ work!

    Bots/Automated Requests should be detected with the latest and greatest technology we have available. As a software developer myself, Google Scholar’s “Inclusion Guidelines for Webmasters” document indicates to me that the crawling mechanism is rudimentary.

    Once the bar is raised, it’s going to be pointless to try to pull this little stunt, because it will immediately raise red flags. We can evaluate the reputation of the source in the citation, which would shift the complexity of an attack from a stupidly repetitive one to one where the system uses multiple criteria to score the probability that your profile is a real person. The score should always start at 1 (the maximum) and lose points whenever cross-referenced data indicates that the account is a bot, down to a suspension threshold. But this process also has a feedback loop, as the attacker can always figure out what combination of inputs and actions can fly below the radar.

    After we have perfected the fight against bots, the majority of the community (presumably mostly real humans by now) can regulate itself through voting. All things considered, the community will always hold a majority against the remaining bots (similar to a Bitcoin majority attack), at which point an attempt to rig the system would be so extraordinary that it would be obvious to an admin, who could intervene.

    Science deserves this. Google Scholar is a strong influence on thought in an ever-more connected world, and they should live up to this responsibility by ridding it of the plague that is bots.

    1. I can’t edit my comment, but in the part where I talk about “voting”, the concept I meant wasn’t “votes” but the citations and the citation relationships themselves, which would be self-regulating and a fair way to rank articles.
