Scale whose copyright owner defends zealously falls under scrutiny — and journal takes two years to publish a critique

Donald Morisky

As long-time readers of this blog know, we’ve spilled more than a few pixels on the work of Donald Morisky. His Morisky Medication Adherence Scale (MMAS) has been a financial boon to himself — and the bane of many researchers who have been forced to either retract papers or pay Morisky what they consider to be exorbitant fees to retroactively license the instrument.  

But lately things have been a bit rocky for Morisky. Last year, he and his former business associate (read, legal enforcer) found themselves embroiled in a lawsuit which claims, as we reported, that Morisky used: 

their company as a personal piggy bank, taking steps to starve the business of clients and funnel money to his family. 

And now, a researcher has questioned the validity of the MMAS, arguing that his review of a foundational paper underpinning the instrument shows serious flaws. 

In August 2019, Michael Ortiz, of the University of New South Wales, in Sydney, Australia, emailed the Journal of Clinical Hypertension, a Wiley title, about a concerning finding in a 2008 paper by Morisky and colleagues on the MMAS-8, an updated version of Morisky’s original scale and the one in widespread use today. 

The paper has been cited well over 1,000 times. Ortiz said he discovered that: 

This article seems to contain misleading information in describing the Sensitivity and Specificity of the MMAS-8 questionnaire. 

I respectfully request that the A.S.P.E.N. policy be followed and that the author be approached to explain how the Sensitivity and Specificity values in this article were calculated as well as explain any discrepancies. 

Ortiz also noted that the authors of the 2008 paper declared no conflicts of interest — which clearly ignores Morisky’s considerable financial stake in the MMAS. 

The journal thanked Ortiz for his letter, then did nothing. Ortiz, who said he vetted his concerns with other researchers, including Mark Bolland of New Zealand, reiterated his concerns in June 2020 to Jiguang Wang, the editor in chief — noting additional “inconsistencies” with the MMAS-8. Among these, he wrote: 

The S&S [sensitivity and specificity] values described in the 2008 article have not been replicated in any other published MMAS-8 study and there are more than 100. This suggests that there is either an error in the numbers used to calculate S&S or there is an error calculating S&S. That is why I requested that appropriate procedures for serious errors be followed. I respectfully request that the A.S.P.E.N. policy be followed and that the author be approached to explain how the Sensitivity and Specificity values from this article were calculated as well as explain any discrepancies. 

The implications of a serious error could be profound given the extensive use of this instrument over the last 10 years and the financial investment and profile of the first author. 
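For reference, the sensitivity and specificity at issue come straight from a 2 x 2 table of test result against true condition. A minimal sketch with invented counts, not the study's data:

```python
def sens_spec_acc(tp, fn, fp, tn):
    """Sensitivity, specificity, and accuracy from 2 x 2 cell counts."""
    sens = tp / (tp + fn)                  # true positive rate
    spec = tn / (tn + fp)                  # true negative rate
    acc = (tp + tn) / (tp + fn + fp + tn)  # overall proportion correct
    return sens, spec, acc

# Invented counts: rows = screened non-adherent / adherent,
# columns = uncontrolled / controlled blood pressure.
sens, spec, acc = sens_spec_acc(tp=93, fn=7, fp=47, tn=53)
# sens = 0.93, spec = 0.53, acc = 0.73 for these made-up numbers
```

Any claimed sensitivity/specificity pair implies cell counts like these, which is what makes reported values checkable against a paper's tables.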

Wang responded that the publication would “communicate with the author” about the questions. That might have happened, but Ortiz didn’t hear back from the journal until March 2021, when he sent another email complaining about the long-delayed follow-up to his concerns. Wang replied: 

I forwarded your emails to my colleague in Wiley. I believe that they will look at this matter very seriously.

In April of this year, Ortiz received word that the journal would, finally, be publishing his 2019 letter to the editor — but that, too, was delayed because Morisky and his co-authors did not respond in a timely fashion. Oh, and the journal messed up.

Ortiz forwarded us a June 24 email explaining the lag: 

The estimated time for submitting a response to the letter to the editor has been due already and the invitation has been canceled because of the overdue.

We were unable to export the manuscript to the publication due to a system error regarding the article type.

Rest assured that your manuscript will be exported to the publication as soon as the error has been rectified.

That error must have been a doozy, because a month later Ortiz’s letter still remained unpublished, prompting yet another email from the researcher asking for a status report. In early August, the journal replied to say that:

the system error has been sorted out by our team and your manuscript has been exported to production. 

You will be receiving the final proof for editing before publication. 

The letter indeed appeared online in August. And although Morisky and his co-authors did not respond to the journal, they did respond to our questions about Ortiz’s critique. 

Alfonso Ang, a statistician at UCLA and co-author of the 2008 paper, told us:

I hope his critique does not confuse readers, make false claims, and set research and technology back 40 years!  Thanks!

Ang and Morisky, now a professor emeritus at the UCLA Fielding School of Public Health, said: 

In a nutshell, M. Ortiz degraded the MMAS scale which ranges from 1 to 8 to a binary (yes/no) analysis – and of course, it would underestimate values due to loss of information. This is an issue that has been around for over 40 years (i.e., the general consensus is don’t do it!) 

Outside the nutshell, Morisky wrote: 

It is unfortunate that the author (M. Ortiz) misunderstood and misapplied the numbers reported in the 2008 paper. The numbers presented were raw percentages and as such should not be used in the calculation of the sensitivity and specificity of the AUC (area under the curve) and ROC (Receiver Operating Characteristic). The MMAS scale ranges from 1 to 8, and for descriptive purposes, we presented the percentages in a 2 x 3 table. Ortiz made false assumption that this 2 x 3 table was the actual scale and erroneously dichotomized this into a 2 x 2 contingency table. The MMAS scale should not be degraded into a yes/no scale, but rather the original scale should be appropriately used in a multiple logistic regression model. From this multiple logistic regression analysis, the predicted values were generated and the AUC ROC estimated. These analyses can be performed using standard statistical packages such as SAS and R. The sensitivity and specificity were calculated based on the predicted values of the ROC – this cannot be hand calculated from the erroneously collapsed raw percentages. This study was not based on a 2 x 2 model design but considers many sociodemographic factors in the analysis. Not only would the 2 x 2 Chi-Square yield the wrong sensitivity and specificity since the model was misspecified, it also lacked context – both methodologically and conceptually. Methodologically, the logistic regression needs to control for sociodemographic factors such as age and income, education, etc. Conceptually, the MMAS scale does not exist in a vacuum but should be analyzed in the clinical context which include patient characteristics, clinical setting and patient chronic conditions. 

The original study was mainly on patients with hypertension. The 2017 meta-analysis paper includes patients with other medical conditions aside from hypertension, such as diabetes, osteoporosis, myocardial infarction, seizure, etc. Translated versions were also sometimes used. As such, the meta-analysis results show an expected heterogeneity. The sensitivity and specificity measures are diagnostic assessment tools meant to help clinicians. In general, the higher the sensitivity, the lower the specificity and vice versa. There is a tradeoff between achieving high sensitivity vs high specificity. Depending on the context, higher sensitivity may be preferred over specificity in some cases, and in other cases, higher specificity would be preferred over sensitivity.

This issue had been brought up more than 40 years ago (see Cohen, 1983), which discussed the huge cost of dichotomization (i.e., loss of statistical power, loss of precision of scale, etc.) and there is rarely any justification for dichotomizing when the true scale is available (MacCallum et al, 2002.) Based on the flawed hand calculated unadjusted sensitivity and specificity, the author makes a totally faulty conclusion saying that the “measure may be no more accurate in detecting patients with uncontrolled BP, than tossing a coin to decide.” This is totally false due to his faulty simple analysis and assumptions, plus the fact that he completely misunderstood what the function of sensitivity and specificity were. In order to make that conclusion, one should examine the entire ROC curve at various cutoff points. The cutoff point of 6 is just one suggested cutoff point. Various cutoff points of the ROC should be examined and possibly perform some likelihood ratio tests in order to assess the viability of the scale. In the meta-analysis, the AUC ranges roughly from 0.6 to 0.7 on average in most of the studies. This is greater than 0.50, which means the measure may be useful in predicting adherence. The meta-analysis implied possibly using higher cut off points to increase either sensitivity or specificity. There are also other factors to consider when assessing the scale – the reliability, validity, factor loadings and variance explained by the scale which the critique failed to mention. The sensitivity and specificity of the measure can vary among different studies depending on the study population and patient characteristics, and is just a diagnostic tool and should not be miscalculated or misused to invalidate measures as suggested by the author.
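Morisky's prescription of examining sensitivity and specificity at every cutoff of the scale, then judging the curve as a whole, can be sketched in a few lines. The scores and outcomes below are invented for illustration and are not the study's data:

```python
def roc_points(scores, labels):
    """(FPR, TPR) at each cutoff c, calling a patient test-positive
    (non-adherent) when score < c. labels: 1 = uncontrolled BP."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = []
    for c in range(min(scores), max(scores) + 2):  # sweep every cutoff
        tp = sum(1 for s, y in zip(scores, labels) if s < c and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s < c and y == 0)
        points.append((fp / neg, tp / pos))
    return sorted(points)

def trapezoid_auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

# Invented 1-8 scale scores and BP outcomes (1 = uncontrolled):
scores = [2, 3, 3, 4, 5, 5, 6, 6, 7, 8, 4, 6, 7, 8, 8, 2]
labels = [1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1]
auc = trapezoid_auc(roc_points(scores, labels))  # about 0.78 here
```

At any single cutoff, the (sensitivity, 1 − specificity) pair is just one point on this curve, which is why a lone pair such as 93%/53% should be reproducible from the curve itself.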

When shown Morisky’s response, Ortiz said:

I agree with a lot of what he says, but it is what he doesn’t say that is important.

He did not provide one number to defend his claims of 93%, 53% and 80% for Sensitivity, Specificity and Accuracy.

He should have these numbers as they are in the output of his logistic regression (binary outcome). He has specified the cut point on the ROC as 6, so he can read off the y value at 6 (it should be 0.93), the x value (which is 1 − 0.53), and the area under the curve (c statistic), which should be 0.8. 

Unfortunately when I solved the four simultaneous equations, the results indicate calculation errors. These equations depend on: study sample size, sensitivity, specificity and accuracy % values ONLY and do not include any results. They do not use the proportions reported in the study and there is only one possible mathematical solution.

In addition,  Professor Morisky forgot to mention that many of the articles, in which he is a co-author, use 2 x 2 tables to estimate Sensitivity and Specificity. Just like he forgot to mention he had a conflict of interest in the MMAS-8. 
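The four simultaneous equations Ortiz describes pin the table down uniquely: if P of the N patients truly have uncontrolled BP, then sensitivity x P are true positives, specificity x (N − P) are true negatives, and their sum must equal accuracy x N, which forces P = N(accuracy − specificity)/(sensitivity − specificity). A sketch of that consistency check, using the reported percentages with a hypothetical N rather than the study's sample size:

```python
def back_calc_cells(n, sens, spec, acc):
    """Recover the unique 2 x 2 cell counts implied by N, sensitivity,
    specificity, and accuracy. Fractional cells flag an inconsistency."""
    p = n * (acc - spec) / (sens - spec)  # true positives + false negatives
    tp = sens * p
    fn = (1 - sens) * p
    tn = spec * (n - p)
    fp = (1 - spec) * (n - p)
    return tp, fn, fp, tn

# Reported values with a hypothetical N of 1,000:
tp, fn, fp, tn = back_calc_cells(1000, sens=0.93, spec=0.53, acc=0.80)
# tp comes out near 627.75: fractional patients, so for this N the four
# reported figures cannot all describe one real 2 x 2 table.
```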

Ortiz also noted that he has found similar issues with a 2017 meta-analysis in PLoS ONE by Morisky and colleagues on the MMAS-8 — a paper which in 2018 was corrected to reflect a missing conflict of interest statement from the researcher noting his financial relationship with the tool — and that he has alerted the journal to those concerns:

In my opinion some corrections will be required and depending on the response more action may be appropriate.

If he fails to respond or co-operate, then the response should be more severe.


6 thoughts on “Scale whose copyright owner defends zealously falls under scrutiny — and journal takes two years to publish a critique”

  1. I read the 2008 paper and found it to be muddled. It seems they had created an 8-item scale to assess the level of medication adherence in patients with hypertension. However, they never showed that it in fact does this. Their reported results suggest that the MMAS-8 is associated with other factors that are associated with medication adherence. So what?
    The sensitivity and specificity (just percentages reported, no raw numbers or confidence intervals!) were for whether the patient’s blood pressure was “under control,” based on two blood pressure measurements taken five minutes apart. Again, so what? If you want to know if the patient’s blood pressure is under control, you take the patient’s blood pressure, not ask them 8 questions. Morisky’s response that he used the AUC/ROC from the logistic regression to get the sensitivity and specificity doesn’t make sense since that modelled which factors are associated with “medication adherence,” not which patients had uncontrolled blood pressure.

  2. I remain astonished that after all Morisky’s harassment of authors with demands for excessive payments and forced retractions, the fact that he has such a glaring ethical breach in claiming no conflict of interest hasn’t been a lightning rod. Many papers have been retracted for less.

  3. There seem to be problems with the Sensitivity and Specificity values from the MMAS-8 in this study, as well as many other studies using the MMAS-8.

    There seems to be an obvious calculation error, as the mathematical solution for sensitivity and specificity is implausible.

    All that is required is for the authors to provide the data analysis that supports the 93% sensitivity, 53% specificity, and 80.3% accuracy.

    Unless the authors can provide evidence to support these values then they need to remove these values from their publication.

    All this requires is a simple ROC from the study, as these values can be read off the curve at the cut point of 6.

    Professor Ang acknowledged that the AUC values were between 0.6 and 0.7 for most MMAS-8 studies. Based on the rule of thumb below, this suggests that the MMAS-8 is a poor diagnostic tool. This is also supported by the Moon et al 2017 systematic review. It is important not to confuse acceptable psychometric properties of the MMAS-8 with its poor diagnostic test accuracy.

    The area under the ROC curve (AUC) results were considered excellent for AUC values between 0.9-1, good for AUC values between 0.8-0.9, fair for AUC values between 0.7-0.8, poor for AUC values between 0.6-0.7 and failed for AUC values between 0.5-0.6.

  4. The nutshell response attributed to Ang and Morisky, and the three-paragraph response attributed to Morisky, are odd in light of what was published by Morisky, Ang et al. in the Journal of Clinical Hypertension in 2008. That article presents information that was used by Ortiz in his 2021 letter to the editor to derive the 2 by 2 table. The Morisky et al. article states that “all possible cutpoints were examined” and “final cutpoints were chosen based on the relationship with blood pressure control, so that the medication adherence scale could provide useful information in a clinical setting” (p. 350). It was Morisky et al. that dichotomized the adherence scale at a score of 6 (p. 350), and the only reasonable implication from the article is that this was the basis of the reported sensitivity of 93% and specificity of 53%. But, as Ortiz indicated, these estimates do not correspond to the 2 by 2 table that can be inferred from what was reported by Morisky et al.

    Morisky (or another author) needs to provide information about the multiple logistic regression model that putatively produced the results reported in the seminal article. Were there unnamed variables in the model beyond adherence that improved the prediction of blood pressure control? If so, why were these variables not indicated in the paper? Why was it implied that the prediction of adherence could be done using just the 8-item adherence scale?

  5. Professor Morisky has not responded to my concerns.

    He has not presented any evidence that supports the “reported” S&S values.

    There should be no difference in the total number of patients with uncontrolled BP between the “actual” (Table 4 Morisky et al 2008) and the “reported” total numbers derived from the reported S&S values.

    The total number of Non adherent patients should be the same for the “actual” and the “reported” total non adherent patient numbers. There was a large unexplained difference between the “actual” and the “reported” total number of non adherent patients which suggests that they used a cutoff of 8 (and not 6) in the reported S&S values.

    Since Morisky et al have failed to explain any of these obvious inconsistencies, they should withdraw their article until they can produce properly analysed data that support their S&S values of 93% and 53%. Alternatively the Editor in Chief, who has been sitting on his hands for three years, could finally attach a warning to this publication.

  6. Is this the End of the MMAS-8 as a tool to screen for Medication Non Adherence?
    The failure to correct errors in a 2008 article has become a sad story of a dysfunctional journal peer-review system and a failure to follow procedures:
    • Journal Editors did not follow their own procedures and then compounded this by failing to act despite compelling evidence of errors.
    • A lack of co-operation by the Authors is contrary to UCLA Policy 993 (“Responding to Allegations of Research Misconduct”), part of the University Code of Conduct, which states that:
    1. “All persons engaged in Research at UCLA are responsible for adhering to the highest standards of intellectual honesty and integrity. Those who supervise Research have a responsibility to create an environment that encourages those high standards through open publication and discussion, emphasis on Research quality, appropriate supervision, maintenance of accurate and detailed Research procedures and results, and suitable assignment of credit and responsibility for Research.”
    2. “All members of the UCLA community are expected to cooperate in reporting suspected Research Misconduct and in responding to Allegations by acting in Good Faith, providing Research Records and other relevant information, participating in Research Misconduct Proceedings, and refraining from Retaliation or interference with a Research Misconduct Proceeding.”
    Context
    The MMAS-8 was developed in 2008 because the MMAS-4 had borderline psychometric properties:
    • The MMAS-4’s criterion related validity for sensitivity and specificity was 81% and 44%, respectively.
    • Cronbach’s alpha reliability was 0.61, which is below the acceptable level of 0.7.
    Despite only having fair psychometric properties, the MMAS-4 was used in a large number of studies because it is easy to use.
    A modified eight-item Morisky Medication Adherence Scale (MMAS-8) was developed by adding four additional items to the original four-item Morisky scale. The authors reported an exceptional improvement in its psychometric properties: sensitivity and specificity of 93% and 53%, respectively, and a Cronbach’s alpha of 0.83. Consequently, the use of the MMAS-8 has become common in various clinical settings in order to identify patients at risk of poor medication adherence leading to treatment failure.
    Three years ago I observed that several authors were incorrectly claiming that the MMAS-8 measured medication adherence. I considered MMAS-8 scores to be surrogate outcomes at best and decided to review whether it was a valid screening tool for medication non-adherence.
    The History
    When I searched the medication adherence literature, I found that the MMAS-8 has been extensively validated as a predictor of a broad range of surrogate biomarkers and endpoints. I was also surprised that the MMAS-8 was rarely validated against medication adherence measures like pill counts, or PDC (proportion of days covered) or MPR (medication possession ratio) from prescription claims data. Most of the MMAS-8 studies used biomarkers like BP control or HbA1c thresholds.
    What also surprised me were the inconsistencies in the way that MMAS-8 scores were applied in many of these studies. The MMAS-8 3 x 2 matrices used to calculate S&S were not always consistent, so to correct for these inconsistencies I recalculated all the S&S values from first principles using the same method as Lee et al (2017). I also used REVMAN 5.3 to generate summary ROC curves, followed by the trapezoid method to calculate their AUC.
    This seemed to work well until I encountered the Morisky et al (2008) article which was mysteriously excluded by Moon et al (2017) from their S&S calculations. It took me several days to realize that Morisky et al had made an error in their reported S&S values and it took me even longer to understand the implications of their error.
    I wrote to the Editor of the Journal of Clinical Hypertension in August 2019 and he did not even acknowledge my email. I wrote to the Editor again one year later (June 2020). The new editor wrote back to me and requested that I send a Letter to the Editor explaining my concerns. I sent a Letter to the Editor in August 2020 and the editor acknowledged receipt. He responded that: “I will try to convey this important message to the authors. I hope that they will respond. If yes, the Journal of course is willing to publish the correspondence and correct any possible error(s). This is important for the Journal and for the scientific community.”
    The Editor sought a response from the authors, however Morisky et al (2008) failed to provide any evidence to support their Sensitivity, Specificity and Accuracy claims for the MMAS-8 (93%, 53% and 80%). My letter was finally published in August 2021.
    Dr Ang responded to a separate RetractionWatch commentary about the MMAS-8 error saga in 2021. He did not include any patient numbers or actual results. He could easily have resolved this matter with the ROC curve and the c statistic from the statistical output (that is, had it been conducted).
    A systematic review of the MMAS-8 by Moon et al (2017), on which Professor Morisky was a co-author, clearly stated that: “using the cut-off value of 6, criterion validity was not enough good to validly screen a patient with nonadherence to medication.”
    More recently I found a post on ResearchGate by Professor Morisky which described how Lee et al (2017) calculated the S&S values from a 3 x 2 matrix. The methodology described in Professor Morisky’s post about Lee et al (2017) was identical to the method Ortiz (2021) used to calculate the Sensitivity and Specificity. These data produced actual S&S values of 38% and 75% respectively, with an accuracy of 53%. There was no description in the Morisky et al article of the actual patient numbers, or of how the Sensitivity, Specificity, or accuracy values (93%, 53% and 80% respectively) were derived. It is obvious to most epidemiologists and biostatisticians that the two S&S results come from two different patient populations or two different MMAS-8 cut offs (6 vs 8).
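The 3 x 2-to-2 x 2 collapse described above is mechanical, and it also shows how strongly the choice of cutoff (6 versus 8) drives the resulting S&S. A sketch with invented counts, not the paper's data:

```python
def collapse_and_score(table, nonadherent_rows):
    """Collapse a 3 x 2 adherence-by-BP table to 2 x 2 and score it.
    table maps adherence level -> (uncontrolled, controlled) counts;
    nonadherent_rows lists the levels treated as test-positive."""
    tp = sum(table[r][0] for r in nonadherent_rows)
    fp = sum(table[r][1] for r in nonadherent_rows)
    fn = sum(table[r][0] for r in table if r not in nonadherent_rows)
    tn = sum(table[r][1] for r in table if r not in nonadherent_rows)
    n = tp + fp + fn + tn
    return tp / (tp + fn), tn / (tn + fp), (tp + tn) / n

# Invented (uncontrolled, controlled) counts per adherence level:
t = {"low": (60, 40), "medium": (50, 70), "high": (30, 90)}
cut6 = collapse_and_score(t, ["low"])            # cutoff 6: low vs medium+high
cut8 = collapse_and_score(t, ["low", "medium"])  # cutoff 8: low+medium vs high
# The two collapses give very different sensitivity/specificity pairs.
```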
    Considerations
    I am concerned that unless the Journal Editor attaches a warning to this article, researchers will continue to use the MMAS-8 despite its being compromised. Furthermore, the majority of the patients classified as non-adherent are actually adherent.
    The problem for Professor Morisky is that if he corrects his error, then the MMAS-8 performance will appear to be worse than the MMAS-4. All four authors have benefited professionally and/or financially from the MMAS-8, so it came as no surprise that they did not respond to the Journal Editor.
    What disappoints me the most is that Professor Morisky understands the implications of the error. He is showing scant regard for academic integrity in favor of his self-interest. I am concerned that unless the Journal Editor steps in and attaches a warning to the Morisky et al article, researchers will continue to use the MMAS-8 tool.
    Conclusions
    It is important to differentiate between falsification, fabrication, plagiarism and negligence, and whether or not there is intent to deceive. When researchers intentionally deceive their colleagues by falsifying information, fabricating research results or using others’ words without acknowledgement, they are violating fundamental research standards. These actions are a clear violation of scientific standards because they undermine the trust on which scientific research is based.
    Based on the evidence above, it seems that Professor Morisky has not followed UCLA policy. Furthermore, Professor Morisky’s refusal to cooperate may be sufficient to be considered academic misconduct by the University.
    I also find it interesting that this research received Government funds, and there are severe penalties for the misuse of Government funds. This research was supported by:
    • the National Heart, Lung, and Blood Institute, award number RO-H251119, and
    • in part by grant number R01 AG022536 from the National Institute on Aging.
    In addition, if the research was funded by Government and the Authors were paid their salaries by their Universities, is it reasonable that the Copyright belongs to the first author?
    Given that Professor Morisky requires researchers to correct or retract their publications if they fail to pay him a royalty for the use of the MMAS-8, it seems reasonable to expect a Journal Editor to follow a similar pathway to address Professor Morisky’s refusal to cooperate and to correct his serious error.
    Therefore, I believe that Professor Morisky’s actions should be assessed as possible academic misconduct.
