Nutrition paper claims intervention cuts child obesity. Experts disagree.

Does incorporating gardens and their harvest into school-based nutrition programs help children get healthier? A 2017 paper claims it does, but a group of outside experts disagrees — strongly.

The 2017 paper reported that adding gardens to schools and teaching kids how to cook the harvest, among other elements, helped kids learn about nutrition and even improved their body mass index, a measure of body weight relative to height.

However, soon after the paper appeared, a group of outside experts told the journal the data reported by the paper didn’t support its conclusions — namely, the authors hadn’t shown that the intervention had any effect. The authors performed an inappropriate analysis of the data, the critics claimed, and the paper needed to be either corrected or retracted outright.

But the Journal of Nutrition Education and Behavior has not amended the paper in any way. Instead, last month, it published the outside experts’ criticism of the paper, including their explicit calls to either correct or retract it, along with the authors’ response to the critics.

David Allison, the last author on the critical letter and the dean of the school of public health at Indiana University Bloomington, said he was surprised that the journal chose to publish his critical letter but not to alter the paper itself:

These are just straight facts that this is the wrong analysis, and the analysis doesn’t support the conclusions made…I’m still hopeful that the journal and editor will choose to correct this in some way. I don’t know if that’s the case, but I’m hopeful…We think if a mistake has been pointed out… then I think you do have an obligation to fix it.

The editor of the Journal of Nutrition Education and Behavior, Karen Chapman-Novakofski, declined our request for an interview. First author Rachel Scherr at the University of California, Davis told us:

We do agree that this was pilot study, and this needs further evaluation in a larger-scale multicomponent study that includes several schools in several districts.

She added:

At this time we are not planning for a retraction or correction as we had responded to the concerns of the critic in the response letter.

Digging for data

This is not the first paper Allison has tried to publicly correct; in 2016, Allison described in Nature his attempts to work with journals over problematic papers, with mixed results.

The 2017 paper focused on an intervention known as the Shaping Healthy Choices Program (SHCP). When initially reported, the findings received media coverage from outlets such as the Daily Mail and Medscape.

In 2014, Scherr and her colleagues published a methods paper describing how they would test whether SHCP had any effect. In short, they would test the intervention in four schools from two school districts; each district included one school that implemented SHCP, and one control school. By combining the data from both districts, they could determine whether the program had an effect.

The approach was lauded by editor Chapman-Novakofski in a 2014 comment:

The paper by Scherr et al…is a good example of the detail and format for JNEB’s Methods papers. Want an example of how and why to determine sample size? Read this paper. Thinking about how to recruit schools to be in your study? Read this paper. Need an example of a detailed LOGIC model? You guessed it — read this paper.

The problem, according to Allison and his co-authors, is that Scherr and her colleagues didn’t follow their original methods when making their claims about SHCP’s effects. When the authors combined the data from both districts, they saw no statistically significant effect of SHCP on kids’ body mass index (BMI). But when the authors analyzed each district separately, they did.
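To make the arithmetic behind this dispute concrete, here is a minimal Python sketch. The paper reports the combined-district effect as Beta = -4.63 with a 95% confidence interval of -12.6 to 3.36 and P = .26. The sketch recovers an approximate standard error and p-value from that interval, and then shows how the chance of at least one false positive grows when two districts are tested separately. It assumes a normal approximation, a 0.05 significance level, and independent tests across the two districts. None of these assumptions comes from the paper, and the function names are illustrative.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def p_from_ci(estimate, ci_low, ci_high, z_crit=1.96):
    """Approximate SE, z statistic, and two-sided p-value from a 95% CI.

    Assumes the interval is symmetric and based on a normal approximation.
    """
    se = (ci_high - ci_low) / (2.0 * z_crit)
    z = estimate / se
    p = 2.0 * (1.0 - normal_cdf(abs(z)))
    return se, z, p


def familywise_error(alpha, n_tests):
    """Chance of at least one false positive across independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


# Combined-district estimate and interval as reported in the paper.
se, z, p = p_from_ci(-4.63, -12.6, 3.36)

# Two separate district-level tests at an assumed alpha of 0.05.
alpha = 0.05
fwer = familywise_error(alpha, 2)
bonferroni_alpha = alpha / 2  # per-test threshold that keeps the overall rate near alpha

print(f"approximate SE: {se:.2f}, z: {z:.2f}, two-sided p: {p:.2f}")
print(f"family-wise error rate for 2 tests: {fwer:.4f}")
print(f"Bonferroni per-test threshold: {bonferroni_alpha:.3f}")
```

Under these assumptions, the recovered p-value agrees with the reported P = .26 after rounding, and testing two districts separately at 0.05 each gives roughly a 10% chance of at least one spurious "significant" result even if the program did nothing.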

Scherr told us she and her colleagues analyzed the two school districts separately because each implemented SHCP differently:

In our sub-analysis of the school districts, we did see that in the district that implemented all aspects of the program there was a significant change in BMI percentile and the district that did not implement all aspects of the program there was a non-significant for change in BMI percentile, which was also published in the paper. We were very clear to indicate that this was only in the one district and not across both. We did see significant improvements to nutrition knowledge and vegetable identification across both schools…The sub-analysis was not part of the original methods paper because we did not plan or anticipate the differential implementation at the two school districts.

But some of the messaging around the paper is less subtle. The title of the paper is “A Multicomponent, School-Based Intervention, the Shaping Healthy Choices Program, Improves Nutrition-Related Outcomes;” in an accompanying press release, Scherr said:

The dramatic decrease in BMI, although unexpected in this short time frame, demonstrated that the SHCP was effective due to positive health messages and reinforcing nutrition concepts throughout the school and home environments.

“The report should be corrected or retracted”

According to Allison, the authors simply can’t make the claim that the program had an effect on kids’ BMI based on their analysis. For instance, if researchers placed signs on one college campus encouraging people to walk more, and compared behavior to another campus that lacked those signs, they could say the signs were associated with an increase in walking among students. But they couldn’t argue the signs caused the healthy behavior without adding more campuses, Allison explained, which helps eliminate local factors that could independently influence the results.
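Allison's campus analogy can be illustrated with a small simulation. The sketch below is hypothetical: it assumes one treated school and one control school, a true treatment effect of zero, and a random local shift for each school standing in for "local factors." The spread of those shifts (`school_sd`), the spread among kids (`kid_sd`), and the 100 kids per school are made-up values, not figures from the paper. The simulation counts how often a naive kid-level comparison of the two schools declares a significant difference.

```python
import math
import random


def mean_and_variance(values):
    """Sample mean and unbiased sample variance."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, var


def naive_false_positive_rate(school_sd, n_sims=2000, kids_per_school=100,
                              kid_sd=1.0, seed=0):
    """Share of simulated studies in which a naive two-sample z-test on kids
    rejects at the 5% level, although the treatment has no effect at all."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        # Each school gets its own local shift, unrelated to treatment.
        shift_treated = rng.gauss(0.0, school_sd)
        shift_control = rng.gauss(0.0, school_sd)
        treated = [shift_treated + rng.gauss(0.0, kid_sd)
                   for _ in range(kids_per_school)]
        control = [shift_control + rng.gauss(0.0, kid_sd)
                   for _ in range(kids_per_school)]
        mean_t, var_t = mean_and_variance(treated)
        mean_c, var_c = mean_and_variance(control)
        # Naive standard error: treats kids as independent, ignores schools.
        se = math.sqrt(var_t / kids_per_school + var_c / kids_per_school)
        if abs((mean_t - mean_c) / se) > 1.96:
            rejections += 1
    return rejections / n_sims


rate_with_school_effects = naive_false_positive_rate(school_sd=0.5)
rate_without_school_effects = naive_false_positive_rate(school_sd=0.0)

print(f"false-positive rate with local school effects: {rate_with_school_effects:.2f}")
print(f"false-positive rate without local school effects: {rate_without_school_effects:.2f}")
```

With these made-up settings, the naive test should reject far more often than the nominal 5% once schools differ for reasons unrelated to the program, because a single treated school cannot separate the program's effect from that school's own quirks. Adding more schools per arm lets those local differences average out, which is the point of Allison's example.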

In their letter, Allison and his colleagues note:

…the analyses by Scherr and colleagues are unable to assess the effect of the SHCP, and so conclusions stating that the data demonstrated effects of the SHCP on BMI are unsubstantiated. Therefore, the report should be corrected or retracted. Since July 7, 2017, we have made efforts to explain the issues and offered our support to advise on reanalysis or even to recalculate the statistical analyses ourselves if needed. The authors did not pursue our offers.

Allison and his colleagues weren’t the only ones to publicly criticize the paper; last month, Columbia University statistician Andrew Gelman published a blog post about it, noting:

So you’re clear that it’s a pilot study, but you still released a statement saying that your data “demonstrated that [the treatment] was effective”???

Noooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

According to Allison, this isn’t just about correcting the literature — data related to public health topics can have real-world impacts. “If a school invests [in SHCP], they don’t invest in something else,” he said, such as books, access to the flu vaccine, or scholarships:

Anytime you choose to invest in something, there’s an opportunity cost. In that sense, it’s important.

5 thoughts on “Nutrition paper claims intervention cuts child obesity. Experts disagree.”

Abdurrahman Güner says:

April 18, 2018 at 11:20 am

The motto “publish or perish” seems to override the moral of abstention from misleading effect of purely statistical data analysis based conclusions. The data size may be insufficient in size and scope, without any critical discussion on the adequacy of the data, especially in case of data related to complex human attributes. This occurs (not happens) when ardent researchers, ignore some of the factors without any test on the significance of all probable factors, including factor interactions, affecting the response(s), in combination with the publishers’ looking for materials to publish.

Eli Rabett says:

April 18, 2018 at 12:11 pm

The headline on this post is misleading. “Nutrition paper claims intervention cuts child obesity. Experts disagree”.

From what is written here the critics claim that the paper did not analyze the data properly and could show no effect, not that there was no effect, moreover the authors of the original paper appear also to be experts.

Steven McKinney says:

April 18, 2018 at 5:28 pm

This improper finding is a great example of p-hacking that currently plagues the scientific literature. Consultation with a knowledgeable statistician would have spared all this hand wringing. That should ideally have happened at the authors’ institution(s), barring that, from a competent statistical reviewer.

A proper analysis that will include assessment of a subgroup would entail an initial omnibus test, on the whole data set, using a model that would show subgroup estimates as well. If the initial omnibus test for any difference is not significant, then it is inappropriate to drill down into the subgroups hunting for something, an example of an inappropriate p-hacking exercise. The omnibus test protects against faulty multiple comparisons missteps, ensuring the overall stated significance level is appropriate.

In this case the omnibus test failed to demonstrate a difference

“The effect of the intervention in the combined-district model (Beta = -4.63; P = .26; 95% confidence interval [CI], -12.6 to 3.36) was not significant”

and analysis should have ended there. Drilling down into subgroups after the whole group test fails is inappropriate. The reported subgroup p-value is not adjusted appropriately for multiple comparisons.

Since this was a pilot study (as the authors admit in their reply to Wood et al’s critique), and had no a-priori power analysis to ensure adequate sample size to detect effects of a scientifically relevant size, failure to reject the null hypothesis yields an inconclusive finding. The data from this pilot study could be used to calculate sample sizes sufficient to enable detection of an effect of some relevant size with reasonably high power in a future study, but since the omnibus test in the pilot yielded a large p-value, no conclusions can be drawn from this pilot data.

DWalker says:

April 19, 2018 at 12:20 pm

It’s too bad that “interested laypersons” like me can’t read the criticism or the response.

GNelson says:

May 1, 2018 at 1:04 pm

If I understand the intent of DWalker correctly, a visit to the local library and use of “inter-library loan” available free of charge to interested laypersons and others would be of benefit. Individual articles from virtually any professional or scientific journal can be obtained free-of-charge.
