Strata Academy

Clinical vs Statistical Significance Explained

P-values, confidence intervals, minimally important differences, and absolute effects — when a significant result still does not matter clinically

Quick answer

Statistical significance means that a result at least as extreme as the one observed would be unlikely if there were truly no effect (conventionally p < 0.05). Clinical significance means the effect is large enough to matter to patients, judged by absolute effects, minimally important differences, and guideline thresholds. A result can be statistically significant but clinically trivial, or potentially important but too imprecise to draw a conclusion (wide CI).

1. Two different meanings of 'significant'

In statistics, 'significant' usually means the p-value falls below a pre-specified alpha (commonly 0.05): data at least this extreme would be unlikely if the true effect were null and the model's assumptions held. This is a mathematical statement about chance, not about patient benefit.

In clinical practice, 'significant' means the effect is large enough to change management, burden, or outcomes that patients care about. A 2-point reduction on a 100-point scale may be statistically significant with n = 5,000 but clinically meaningless if the minimally important difference is 10 points.
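The 2-point example can be checked numerically. The sketch below is illustrative only: the standard deviation of 20 points and the even split of 2,500 patients per arm are assumptions added here, not values from a real trial, and the test is a simple normal-approximation z-test.

```python
import math

def two_sample_z(mean_diff, sd, n_per_arm):
    """Two-sided z-test for a difference in means (equal SD and equal arm sizes assumed)."""
    se = sd * math.sqrt(2.0 / n_per_arm)           # standard error of the difference
    z = mean_diff / se
    p = math.erfc(abs(z) / math.sqrt(2.0))         # two-sided normal p-value
    ci = (mean_diff - 1.96 * se, mean_diff + 1.96 * se)
    return p, ci

# Hypothetical inputs: 2-point difference, SD 20, 2,500 patients per arm
p_value, ci_95 = two_sample_z(mean_diff=2.0, sd=20.0, n_per_arm=2500)
mcid = 10.0  # minimally important difference from the example in the text

statistically_significant = p_value < 0.05
clinically_important = ci_95[0] >= mcid  # whole CI at or above the MCID

print(f"p = {p_value:.4g}, 95% CI = ({ci_95[0]:.2f}, {ci_95[1]:.2f})")
print("statistically significant:", statistically_significant)
print("clinically important:", clinically_important)
```

Under these assumptions the p-value is well below 0.05, yet even the upper end of the interval sits far below the 10-point threshold.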

Medical students must use both lenses: statistical inference (is there an effect?) and clinical interpretation (does it matter?). Examiners and journal clubs reward this distinction.

2. Limits of p-values alone

P-values depend on sample size. Large trials can detect trivial differences as statistically significant; small trials may miss clinically important effects (low power).
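Both halves of this point can be illustrated with a rough normal-approximation sketch. The 2-point and 10-point differences, the SD of 20, and the arm sizes are hypothetical values chosen for illustration; the power formula ignores the negligible opposite tail.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def two_sided_p(mean_diff, sd, n_per_arm):
    """Two-sided p-value if the observed difference equals mean_diff."""
    se = sd * math.sqrt(2.0 / n_per_arm)
    return math.erfc(abs(mean_diff) / se / math.sqrt(2.0))

def approx_power(true_diff, sd, n_per_arm, z_crit=1.96):
    """Approximate power to detect true_diff at two-sided alpha = 0.05."""
    se = sd * math.sqrt(2.0 / n_per_arm)
    return normal_cdf(abs(true_diff) / se - z_crit)

# Same trivial 2-point difference, increasing sample size
p_by_n = {n: two_sided_p(2.0, 20.0, n) for n in (50, 500, 5000)}

# Clinically important 10-point difference, small vs adequate trial
power_small = approx_power(10.0, 20.0, n_per_arm=10)
power_adequate = approx_power(10.0, 20.0, n_per_arm=64)

for n, p in p_by_n.items():
    print(f"n per arm = {n:>4}: p = {p:.3g}")
print(f"power with 10 per arm: {power_small:.2f}")
print(f"power with 64 per arm: {power_adequate:.2f}")
```

The trivial difference only becomes 'significant' in the largest trial, while the smallest trial would usually miss an effect the size of the assumed MCID.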

P-values do not measure effect size. p = 0.001 does not mean a large clinical benefit, only that the estimate is far enough from the null, relative to its standard error, to reject it.

Multiple comparisons inflate false positives without adjustment. Subgroup p-values in post-hoc analyses are hypothesis-generating, not confirmatory.
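The inflation is easy to quantify if the tests are assumed to be independent (a simplifying assumption; correlated outcomes inflate less). The sketch below computes the chance of at least one false positive across 20 unadjusted comparisons and the Bonferroni-adjusted threshold; the choice of 20 comparisons is illustrative.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false positive across independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests

def bonferroni_alpha(alpha, n_tests):
    """Per-test threshold that keeps the family-wise error rate at or below alpha."""
    return alpha / n_tests

fwer_20 = familywise_error_rate(0.05, 20)
adjusted_alpha = bonferroni_alpha(0.05, 20)

print(f"Chance of >= 1 false positive in 20 tests: {fwer_20:.1%}")
print(f"Bonferroni per-test threshold: {adjusted_alpha}")
```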

3. Confidence intervals bridge statistics and clinical judgement

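A 95% confidence interval shows the range of effects compatible with the data. If the interval runs from trivial benefit to large benefit, the estimate is imprecise even though it is statistically significant, because the true effect may be too small to matter. If the interval runs from benefit to harm, it crosses the null: the result is not statistically significant and is inconclusive rather than negative.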

For binary outcomes, ask whether the CI for the risk difference or NNT includes values that would change practice. For continuous outcomes, compare the CI to the minimal clinically important difference (MCID).
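For a binary outcome, this check might look like the sketch below. It uses a simple Wald interval for the risk difference; the event counts (40/1,000 vs 30/1,000) and the practice-changing threshold of 20 fewer events per 1,000 are hypothetical values chosen for illustration.

```python
import math

def risk_difference_ci(events_ctrl, n_ctrl, events_trt, n_trt, z=1.96):
    """Absolute risk reduction (control minus treatment) with a Wald 95% CI."""
    p_c = events_ctrl / n_ctrl
    p_t = events_trt / n_trt
    arr = p_c - p_t
    se = math.sqrt(p_c * (1 - p_c) / n_ctrl + p_t * (1 - p_t) / n_trt)
    return arr, arr - z * se, arr + z * se

# Hypothetical trial: 40/1000 events on control, 30/1000 on treatment
arr, low, high = risk_difference_ci(40, 1000, 30, 1000)

threshold = 0.020  # assumed smallest ARR that would change practice
crosses_null = low <= 0.0 <= high
includes_important_benefit = high >= threshold

print(f"ARR = {arr * 1000:.1f} per 1,000 "
      f"(95% CI {low * 1000:.1f} to {high * 1000:.1f})")
print("CI crosses no effect:", crosses_null)
print("CI includes a practice-changing benefit:", includes_important_benefit)
```

Here the interval includes both no effect and a benefit above the assumed threshold, so the estimate is inconclusive. Because the interval crosses zero, the matching NNT interval would pass through infinity, which is why the risk difference is reported instead.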

In meta-analysis, a wide pooled CI can trigger a GRADE downgrade for imprecision, which is where the statistical and clinical significance frameworks connect.

4. Absolute effects: ARR, NNT, and events per 1,000

Relative risk reduction sounds compelling ('50% reduction!') but hides baseline risk. Absolute risk reduction (ARR) and number needed to treat (NNT) translate effects into patient terms.

Example: RR 0.75 with control event rate 4% → treated event rate 3% → absolute risk reduction of 1 percentage point → NNT 100. Whether NNT 100 is clinically worthwhile depends on treatment cost, harms, and alternatives.
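The arithmetic in this example can be reproduced with a few lines. The helper name `absolute_effects` is made up for this sketch, and the second call with a 40% baseline risk is an added illustration of how the same relative risk gives a very different NNT.

```python
def absolute_effects(control_event_rate, relative_risk):
    """Translate a relative risk into absolute terms (assumes relative_risk < 1)."""
    treated_event_rate = control_event_rate * relative_risk
    arr = control_event_rate - treated_event_rate
    return {
        "arr": arr,                                   # absolute risk reduction
        "nnt": 1.0 / arr,                             # number needed to treat
        "control_per_1000": 1000.0 * control_event_rate,
        "treated_per_1000": 1000.0 * treated_event_rate,
    }

low_risk = absolute_effects(control_event_rate=0.04, relative_risk=0.75)
high_risk = absolute_effects(control_event_rate=0.40, relative_risk=0.75)

for label, result in (("4% baseline", low_risk), ("40% baseline", high_risk)):
    print(f"{label}: ARR = {result['arr']:.3f}, NNT = {result['nnt']:.0f}, "
          f"{result['control_per_1000']:.0f} vs {result['treated_per_1000']:.0f} per 1,000")
```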

GRADE Summary of Findings tables present absolute effects per 1,000 for this reason — read them before the abstract conclusion.

5. Minimal clinically important differences (MCID)

For continuous outcomes (pain, disability scores, quality of life), field-specific MCIDs define the smallest change patients perceive as beneficial. Compare mean differences to MCID, not only to zero.
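One way to make this comparison explicit is to classify the whole confidence interval against the MCID rather than the point estimate alone. The function below is a sketch, not a validated rule: it assumes positive values mean benefit, and the category wording and the example intervals are inventions for illustration.

```python
def interpret_against_mcid(ci_low, ci_high, mcid):
    """Classify a 95% CI for a mean difference; positive values = benefit (assumed)."""
    if ci_low <= 0.0 <= ci_high:
        return "not statistically significant"
    if ci_high < 0.0:
        return "significant harm"
    if ci_low >= mcid:
        return "significant and clinically important"
    if ci_high < mcid:
        return "significant but smaller than the MCID"
    return "significant, clinical importance uncertain (CI crosses the MCID)"

# Hypothetical intervals on a 100-point scale with an MCID of 10 points
examples = {
    "large trial, small effect": (0.9, 3.1),
    "moderate effect": (4.0, 14.0),
    "large effect": (11.0, 16.0),
    "small trial": (-2.0, 6.0),
}
verdicts = {name: interpret_against_mcid(lo, hi, mcid=10.0)
            for name, (lo, hi) in examples.items()}

for name, verdict in verdicts.items():
    print(f"{name}: {verdict}")
```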

MCIDs are context-specific: the same point change on a scale may matter in chronic pain but not in a surrogate laboratory marker.

Anchor-based and distribution-based methods exist for deriving MCIDs — cite established thresholds from guidelines or validation studies when available.

6. Clinical vs statistical significance in meta-analysis

A pooled odds ratio whose diamond barely excludes 1 may be statistically significant but clinically weak — especially if absolute event rates are low.
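To see why low event rates matter, a pooled odds ratio can be converted to an absolute effect at an assumed baseline risk, using the standard conversion from odds to risk. The odds ratio of 0.85 and the baseline risks of 2% and 20% below are hypothetical.

```python
def risk_with_treatment(baseline_risk, odds_ratio):
    """Risk in the treated group implied by an odds ratio and a baseline risk."""
    return (odds_ratio * baseline_risk) / (1.0 - baseline_risk + odds_ratio * baseline_risk)

def fewer_per_1000(baseline_risk, odds_ratio):
    """Events avoided per 1,000 patients treated."""
    return 1000.0 * (baseline_risk - risk_with_treatment(baseline_risk, odds_ratio))

pooled_or = 0.85  # hypothetical pooled estimate
low_risk_fewer = fewer_per_1000(0.02, pooled_or)
high_risk_fewer = fewer_per_1000(0.20, pooled_or)

print(f"Baseline 2%:  {low_risk_fewer:.1f} fewer events per 1,000")
print(f"Baseline 20%: {high_risk_fewer:.1f} fewer events per 1,000")
```

The same pooled odds ratio avoids roughly 3 events per 1,000 in the low-risk population and roughly 25 per 1,000 in the high-risk one.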

Heterogeneity complicates interpretation: a significant pooled effect may not apply to all patient subgroups represented in included trials.

Forest plots show statistical precision; clinical importance requires absolute effect translation and GRADE certainty — see our meta-analysis and SoF guides.

7. Appraisal workflow for journal club

Read the primary outcome result. Note p-value and 95% CI. Calculate or locate absolute effect and NNT. Compare continuous outcomes to MCID or guideline threshold. State whether you would change practice for a typical patient in your setting.

Document imprecision: if the CI includes both meaningful benefit and meaningful harm, the trial is inconclusive for practice regardless of p-value.

For systematic reviews, repeat per outcome in the SoF table — do not rely on a single significant secondary endpoint in the abstract.

  1. Identify primary outcome and pre-registration status
  2. Record effect estimate, 95% CI, and p-value
  3. Translate to absolute effect or compare to MCID
  4. Assess precision: would another trial plausibly shift the conclusion?
  5. State clinical bottom line separate from statistical conclusion
