Decoding the Hype

Identifying real evidence in health & nutrition studies

John (The John & Calvin Podcast)

Health and nutrition research is messy.

  • No study is flawless. Every design has limits.
  • Headlines & influencers turn nuance into clickbait.
  • The Fix:
    study design • basic stats • common pitfalls

Overview

  1. The Research Question
  2. Study Population
  3. Study Design (Power)
  4. Study Structure
  5. How Results Are Reported
  6. Interpreting Results
  7. Controlling for Other Factors
  8. Misleading Studies & Influencers

The Research Question

Main goal of the study

  • Drives study design, statistics, sample size
  • Includes measured metric (e.g., BP change, risk)
  • States the primary outcome (vs. exploratory or secondary)

Best practice (especially in RCTs): pre-register outcomes and analysis plan

Example (published RCT)
“We tested the hypothesis that long-term supplementation with omega-3 fatty acids would reduce cardiovascular events in this population.”
NEJM, 2012

The Research Question

Questions to Ask

  • What single question is the study trying to answer?
  • Is the primary outcome clearly stated—and the right one for that question?
  • Are secondary outcomes labelled as exploratory, or are they being oversold?
  • Does this question matter to you?

Study Population

Population matters for relevance
Not all studies are done on humans, and not all humans are like you.

  • Cells (in vitro)
  • Animals (in vivo)
  • Humans (in vivo)

Study Population

Questions to Ask

  • Who or what was actually studied?
  • Why did the researchers pick that population or model?
  • How similar are these subjects to you or the group you care about?

Study Design

What a study can claim depends on its design.


Observational

• Cross-sectional
• Prospective cohort
• Case-control

Interventional

• RCT
• Pre–post (single-arm)

Evidence Synthesis

• Systematic review
• Meta-analysis

Other Study Designs (reference)


Observational

• Case-control – with vs without outcome
• Case report / series – detailed look at few patients
• Ecological – group-level data only
• Retrospective cohort – past records follow exposure → outcome


Mixed / Natural

• Longitudinal – same subjects over time
• Natural experiment – exposure assigned by external factors

Interventional

• N-of-1 trial – single participant, alternating treatments
• Cross-over trial – each subject receives all treatments
• Before–after study – compare pre- vs post-intervention


Synthesis

• Systematic review – structured summary, no pooling
• Umbrella review – review of systematic reviews

Study Design

Sample Size & Power

  • Power (1 − β): the probability of finding an effect if it really exists
    Usual target: 80%

  • Grows with bigger N, larger effect, lower noise

  • Calculated before the trial to size it properly
  • Set from the minimum clinically important difference (MCID) (ideally)

Study Design

Sample Size & Power

  • Too small → wide CIs, missed effects
  • Too large → trivial but ‘significant’ findings
  • Rule-of-thumb study sizes
    • < 50 per arm = pilot / feasibility
    • 50 – 300 per arm = typical nutrition RCT
    • > 300 per arm = large, clinically robust
  • Power targets the primary outcome; secondaries often under-powered
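
The sizing logic above can be sketched with the standard two-arm normal-approximation formula. The effect size (5 mm Hg, the MCID) and SD (10 mm Hg) below are illustrative assumptions, not figures from any particular trial:

```python
import math

def n_per_arm(delta, sd):
    """Per-arm sample size, two-sided alpha = 0.05, power = 80%.

    delta: minimum clinically important difference (MCID)
    sd:    expected standard deviation of the outcome
    """
    z_alpha = 1.96    # critical value for two-sided alpha = 0.05
    z_beta = 0.8416   # critical value for 80% power
    n = 2 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2
    return math.ceil(n)  # round up: you can't recruit a fraction of a person

# Illustrative: detect a 5 mm Hg BP drop when the outcome SD is 10 mm Hg
print(n_per_arm(delta=5, sd=10))  # → 63 per arm
```

Note how halving the detectable difference quadruples the required N, which is why small trials so often miss modest but real effects.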

Study Design

Questions to Ask

  • What type of study is this: observational, interventional, or a synthesis?

  • Can this design support a causal claim, or only association?

  • Is the control / comparison group appropriate?

  • For meta-analyses, are the pooled studies similar enough?

  • Was the sample size justified, and did they report a power calculation for the primary endpoint?

Study Structure

  1. Abstract
  2. Introduction
  3. Methods
  4. Results
  5. Discussion
  6. Conclusion
  7. Supplementary

Study Structure

Questions to Ask

  • Does the Abstract match what’s really in the paper?
  • Are the Methods detailed enough to judge rigor?
  • Are Results shown with raw numbers, effect sizes and confidence intervals?
  • Are any Results missing, or highlighted more than they should be?
  • Do the Discussion/Conclusion stay within the data’s limits?

How Results Are Reported

Population Metrics

  • Counts – raw tally
  • Percent / Proportion – share of the population
  • Rates – cases per person / time
  • Standardised rates – adjusted for age / other factors
  • Incidence vs prevalence – new vs total cases
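
The incidence-vs-prevalence distinction is easiest to see computed from the same toy numbers (all figures below are invented, and person-time at risk is ignored for simplicity):

```python
# Hypothetical town of 10,000 people followed for one year
population = 10_000
existing_cases = 500   # already ill at the start of the year
new_cases = 120        # newly diagnosed during the year

# Prevalence: what share of the population has the condition (total cases)
prevalence = (existing_cases + new_cases) / population

# Incidence: how many NEW cases arise per person per year
# (a real incidence rate would use person-years at risk in the denominator)
incidence = new_cases / population

print(f"Prevalence: {prevalence:.1%}")          # → 6.2%
print(f"Incidence: {incidence:.1%} per year")   # → 1.2% per year
```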

How Results Are Reported

Group difference

  • Percent change
  • Mean difference

Risk comparison

  • Absolute risk (AR)
  • Relative risk (RR)
  • Odds ratio (OR)
  • Number Needed to Treat (NNT)
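
All four risk metrics above come from the same 2×2 table. The event counts here are invented to show how a "50% relative risk reduction" can coexist with a small absolute benefit:

```python
# Hypothetical trial: events / total per arm
events_treat, n_treat = 5, 100
events_ctrl, n_ctrl = 10, 100

ar_treat = events_treat / n_treat   # absolute risk, treatment arm
ar_ctrl = events_ctrl / n_ctrl      # absolute risk, control arm
rr = ar_treat / ar_ctrl             # relative risk
arr = ar_ctrl - ar_treat            # absolute risk reduction
nnt = 1 / arr                       # number needed to treat
odds_ratio = (events_treat / (n_treat - events_treat)) / (
    events_ctrl / (n_ctrl - events_ctrl))

print(f"RR = {rr:.2f} (sold as a '50% risk reduction')")
print(f"ARR = {arr:.0%} → NNT = {nnt:.0f}")   # treat 20 people to help 1
print(f"OR = {odds_ratio:.2f}")               # close to RR only when events are rare
```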

How Results Are Reported

Time-to-Event Metrics

  • Incidence rate
  • Hazard ratio (HR)
  • Survival curves

Metrics in Meta-analyses

  • Forest plot
  • Standardized Mean Difference
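
A meta-analysis pools each study's effect weighted by its precision. Below is a minimal fixed-effect inverse-variance sketch with invented study estimates; real meta-analyses also check heterogeneity and often use random-effects models:

```python
# (effect estimate, standard error) for three hypothetical studies
studies = [(-4.0, 2.0), (-6.0, 1.5), (-2.0, 3.0)]

# Precision weight = 1 / variance: tighter studies count more
weights = [1 / se ** 2 for _, se in studies]
pooled = sum(w * eff for (eff, _), w in zip(studies, weights)) / sum(weights)
pooled_se = (1 / sum(weights)) ** 0.5

print(f"Pooled effect: {pooled:.2f} ± {1.96 * pooled_se:.2f} (95% CI half-width)")
```

This is the arithmetic behind each row and the diamond of a forest plot: precise studies get big squares and pull the summary toward themselves.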

How Results Are Reported

Questions to Ask

  • Which metric is used: count, rate, risk ratio, mean difference, etc.?
  • Is that metric appropriate, or could it mislead (e.g., relative vs. absolute)?
  • Does it answer the main research question?
  • Could the metric exaggerate or downplay the finding?

Interpreting Results

P-values & Confidence Intervals


Is this effect real? How exact is it?


  Concept                              What it tells us             Quick example
  Statistical significance (p-value)   Chance vs. real effect?      Supplement ↓ BP 5 mm Hg, p = 0.03
  Precision (95% CI)                   How exact is the estimate?   5 mm Hg (CI −8 to −2)

Interpreting Results

The p-value

  • Threshold test → typically p < 0.05
  • Smaller p → stronger evidence against chance
  • A p-value means little unless you also know the effect size
  • p ≠ probability the result is true

“5 mm Hg decrease, p = 0.03” → significant
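
Given an effect estimate and its standard error, the 95% CI and a p-value follow mechanically under a normal approximation. The numbers below (a 5 mm Hg drop with SE 2.5) are illustrative, not taken from any cited trial:

```python
import math

def summarize(effect, se):
    """95% CI and two-sided p-value under a normal approximation."""
    lo, hi = effect - 1.96 * se, effect + 1.96 * se
    z = abs(effect) / se
    # Two-sided tail probability: 2 * P(Z > |z|), via the error function
    p = 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))
    return (lo, hi), p

(ci_lo, ci_hi), p = summarize(effect=-5.0, se=2.5)
print(f"95% CI: ({ci_lo:.1f}, {ci_hi:.1f}), p = {p:.3f}")
```

The same effect with a larger SE would widen the CI past zero and push p above 0.05, which is exactly why CI width matters as much as the point estimate.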

Interpreting Results

Statistical vs. Practical Significance


  Finding (effect)         p-value   Interpretation
  −0.15 kg (12 wk)         0.001     Statistically significant; clinically trivial
  −8.5 mm Hg systolic BP   0.09      Not statistically significant; could matter if real


Interpreting Results

Questions to Ask

  • Is the result statistically significant and by how much?
  • What does the confidence interval say about size and precision?
  • Is the effect big enough to matter in real life?
  • If it’s borderline, could high variation, small N, selective data processing or multiple testing explain it?

Controlling for Other Factors

Goal: isolate the effect of one variable while accounting for others.

Coffee & Heart Disease
Control for smoking so coffee isn’t blamed for smokers’ risk.

How?

Include the other factors in the statistical model.

Controlling for Other Factors

Example Cohort Study

Question: Does coffee raise heart-disease risk?

We record coffee cups/day, smoking, age, and heart-disease outcome.


Model with controls
Heart disease = Coffee + Smoking + Age + error
→ Estimates the coffee effect independent of smoking & age

 

Model without smoking
Heart disease = Coffee + Age + error
→ Smoking still influences both coffee & disease → confounding
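
The confounding above can be made concrete without any regression: stratify by the confounder and compare crude vs within-stratum risks. The counts below are invented so that coffee has no effect at all within each smoking stratum (stratification is a simple stand-in here for the model-based adjustment described above):

```python
# Hypothetical counts: {group: (n, disease risk)} per smoking stratum.
# Smokers drink more coffee AND have higher baseline risk.
smokers = {"coffee": (80, 0.40), "no_coffee": (20, 0.40)}
nonsmokers = {"coffee": (20, 0.10), "no_coffee": (80, 0.10)}

def crude_risk(group):
    """Pool both strata for one coffee group, ignoring smoking."""
    cases = people = 0
    for stratum in (smokers, nonsmokers):
        n, risk = stratum[group]
        cases += n * risk
        people += n
    return cases / people

crude_rr = crude_risk("coffee") / crude_risk("no_coffee")
print(f"Crude RR: {crude_rr:.2f}")        # coffee looks like it doubles risk
print("Within-stratum RR: 1.00 in both")  # the true (null) effect
```

The crude comparison blames coffee for risk that belongs to smoking; comparing like with like (smokers vs smokers, non-smokers vs non-smokers) makes the association vanish.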

Controlling for Other Factors

Under the Hood


\[ Y = \alpha \;+\; \beta_1 X_{coffee} \;+\; \beta_2 X_{smoke} \;+\; \beta_3 X_{age} \;+\; \varepsilon \]

  Symbol              Plain English
  \(Y\)               Outcome (heart disease)
  \(\alpha\)          Baseline when all X = 0
  \(\beta_{1,2,3}\)   Effect of each variable holding the others constant
  \(\varepsilon\)     Random noise / unexplained variation


More math? See the linked PSU resource

Controlling for Other Factors

  • Works only for variables we measured and included.
  • Unmeasured confounding can still bias results.
  • Randomized trials help by balancing both known and unknown confounders.

Controlling for Other Factors

Questions to Ask

  • Did the study adjust for the key known confounders?
  • What important factors might be missing or unmeasured?
  • Is the adjustment method explained and sensible?

Core Questions to Ask

  1. Is the main question clear, and is it the one you care about?
  2. Study type: does it address causation, or only association?
  3. Who / what (and how many!) was studied, and how relevant is that to you?
  4. Metric chosen: what does it reveal and what does it hide?
  5. Result strength: significant? How large, precise, and practically useful?
  6. Red flags: stats ‘tricks’, sensational claims, or conflicts?

Studies Can Mislead

  • Small samples inflate effects
  • Placebo isn’t inert
  • Publication bias
  • Funding or pet theories
  • Multiple comparisons (p-hacking), cherry-picking subgroups
  • Self-reported or recall data
  • Lack of blinding
  • Surrogate ≠ clinical outcome

Influencers Do Mislead

  • One study ≠ truth
  • Petri-dish (mechanism) hype
  • Diet ≠ morality
  • Relative risk without absolute numbers
  • Association ≠ causation
  • Anecdotes ≠ strong evidence
  • Claim comes from headline, not published study

Three Things to Remember

  1. Start with the study’s main question
    Know the real question; the headline or influencer might not.


  2. Know what kind of study you’re looking at
    Association ≠ causation. RCTs are strongest, but every type has limits.


  3. Don’t skip the statistics
    A result is only as strong as the methods and statistics behind it.

Follow-up on Vegan Twin Study