Strata Academy
Regression essentials for paper appraisal
Linear, logistic, and survival models – what authors should report
Quick answer
Match the regression model to the outcome type, insist on effect sizes with 95% CIs, and scrutinise how confounders were chosen. Adjusted associations are not causal without design support.
1. Why regression appears in clinical papers
Regression models estimate associations between predictors and an outcome while adjusting for other variables (confounders). They appear in most observational papers and many secondary analyses of trials.
The model type must match the outcome: continuous → linear regression; binary → logistic regression; time-to-event → survival models (e.g. Cox proportional hazards).
Mis-specified models (for example, linear regression on bounded scores, or logistic regression on clustered data without accounting for the correlation) can produce misleading coefficients and confidence intervals that are too narrow.
Appraisal question: does the model answer the question the abstract claims to answer?
2. Linear regression
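Each coefficient is the expected change in the outcome per one-unit increase in the predictor, holding the other variables in the model constant. Units matter: a coefficient reported 'per year of age' is ten times smaller than the same association reported 'per decade'.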
Check whether outcomes or predictors were transformed (log, square root) and whether that transformation was justified.
Assumptions include linearity, independent errors, and homoscedasticity (roughly equal residual spread). Authors rarely report assumption checks – note the gap.
Prefer coefficients with 95% CIs and a clear statement of which variables were entered together in the final model.
- Report coefficients with 95% CIs
- Clarify units (e.g. mmHg per year of age)
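To make the point about units concrete, here is a minimal sketch that rescales a coefficient and its 95% CI from 'per year' to 'per decade'. The function name `rescale_coefficient` and all numbers are illustrative assumptions, not results from any study.

```python
def rescale_coefficient(beta, ci_low, ci_high, factor):
    """Rescale a linear regression coefficient and its CI to a new predictor unit.

    factor is how many old units make up one new unit (10 years = 1 decade).
    Assumes factor is positive, so the CI limits keep their order.
    """
    return beta * factor, ci_low * factor, ci_high * factor


# Hypothetical result: systolic BP rises 0.5 mmHg per year of age (95% CI 0.3 to 0.7)
per_decade = rescale_coefficient(0.5, 0.3, 0.7, factor=10)
print("mmHg per decade: %.1f (95%% CI %.1f to %.1f)" % per_decade)
```

The association is identical in both versions; only the reporting unit changes, which is why a coefficient cannot be judged 'large' or 'small' without its units.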
3. Logistic regression and odds ratios
Logistic regression models log-odds of a binary outcome. Exponentiated coefficients are odds ratios (ORs).
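As a rough illustration of the exponentiation step, the sketch below converts a log-odds coefficient and its standard error into an odds ratio with a Wald-type 95% CI. The function name and the coefficient values are made up for the example.

```python
import math


def odds_ratio_from_logit(beta, se, z=1.96):
    """Convert a logistic regression coefficient (log-odds scale) and its
    standard error into an odds ratio with a Wald 95% CI."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Hypothetical coefficient for a binary exposure: beta = 0.693, SE = 0.25
or_, low, high = odds_ratio_from_logit(0.693, 0.25)
print(f"OR {or_:.2f} (95% CI {low:.2f} to {high:.2f})")
```

Because the interval is built on the log-odds scale and then exponentiated, it is not symmetric around the OR. A paper that reports a symmetric interval around an OR deserves a second look.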
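Odds ratios approximate risk ratios when the outcome is rare (as a rule of thumb, under about 10%). When the outcome is common, the OR lies further from 1 than the corresponding risk ratio, so reading an OR as if it were a risk ratio overstates the effect. Risk differences, where available, show the effect on an absolute scale.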
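The gap between odds ratios and risk ratios can be checked with simple arithmetic. The sketch below uses two invented 2×2 tables, one with a rare outcome and one with a common outcome, both with a risk ratio of 2.

```python
def risk_ratio_and_odds_ratio(events_exposed, n_exposed, events_control, n_control):
    """Return (risk ratio, odds ratio) from the counts of a 2x2 table."""
    risk_e = events_exposed / n_exposed
    risk_c = events_control / n_control
    rr = risk_e / risk_c
    odds_ratio = (risk_e / (1 - risk_e)) / (risk_c / (1 - risk_c))
    return rr, odds_ratio


# Hypothetical counts, chosen so the risk ratio is 2 in both scenarios
rare = risk_ratio_and_odds_ratio(20, 1000, 10, 1000)      # risks 2% vs 1%
common = risk_ratio_and_odds_ratio(600, 1000, 300, 1000)  # risks 60% vs 30%

print(f"rare outcome:   RR {rare[0]:.2f}, OR {rare[1]:.2f}")      # RR 2.00, OR 2.02
print(f"common outcome: RR {common[0]:.2f}, OR {common[1]:.2f}")  # RR 2.00, OR 3.50
```

With the rare outcome the two measures almost coincide. With the common outcome the same doubling of risk appears as an OR of 3.5.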
Distinguish adjusted ORs from unadjusted comparisons. Ask which covariates were included, whether they were prespecified, and whether the model was built in one step or stepwise.
Stepwise selection without external validation inflates false-positive findings and tends to exaggerate the coefficients of the variables it retains. It is common in exploratory observational papers.
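A back-of-envelope calculation shows why screening many candidate predictors produces spurious findings. The sketch assumes independent tests of predictors with no true association. Real stepwise procedures are more complicated than this, so treat the numbers as an intuition aid only.

```python
def chance_of_any_false_positive(n_candidates, alpha=0.05):
    """Probability that at least one of n independent null predictors
    reaches p < alpha purely by chance."""
    return 1 - (1 - alpha) ** n_candidates


for k in (1, 5, 20):
    print(k, "candidates:", round(chance_of_any_false_positive(k), 2))
```

With 20 unrelated candidates, the chance that at least one appears 'significant' is roughly 64%, which is why the number of variables screened should be reported alongside the final model.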
4. Survival analysis
Kaplan–Meier curves describe time-to-event in the presence of censoring. Log-rank tests compare curves without adjustment.
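For readers who want to see how censoring enters a Kaplan–Meier estimate, here is a minimal pure-Python sketch of the product-limit calculation. The data are invented, and a real analysis would use an established statistics package instead.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  : follow-up time for each participant
    events : 1 if the event was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group everyone whose follow-up ends at time t (event or censored)
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            # Survival drops only at event times
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        # Censored participants leave the risk set without lowering survival
        at_risk -= n_leaving
    return curve


# Hypothetical follow-up in months; 0 marks a censored participant
curve = kaplan_meier([2, 3, 3, 5, 8, 9], [1, 1, 0, 1, 0, 1])
for t, s in curve:
    print(t, round(s, 3))
```

Censored participants count in the denominator up to the time they leave, then drop out of the risk set. The estimate is valid only if censoring is unrelated to the risk of the event.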
Cox proportional hazards models estimate hazard ratios (HRs) adjusting for covariates. An HR is a ratio of instantaneous event rates (hazards), which is not the same as a risk ratio at a fixed time point.
Check whether censoring is plausibly non-informative, whether the proportional hazards assumption was tested (Schoenfeld residuals or another stated test), and whether competing risks matter (death vs recurrence, for example).
Immortal time bias and time-varying exposures require specialised handling, such as time-dependent covariates or landmark analysis. A naive Cox model that fixes exposure at baseline may be invalid.
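The difference between a hazard ratio and a risk ratio can be illustrated under a deliberately simple assumption: constant (exponential) hazards in both groups, with invented rates. Under that assumption the HR stays fixed at 2 while the risk ratio depends on when it is measured.

```python
import math


def risk_ratio_at(t, baseline_rate, hazard_ratio):
    """Cumulative risk ratio at time t, assuming constant hazards in both groups."""
    risk_control = 1 - math.exp(-baseline_rate * t)
    risk_exposed = 1 - math.exp(-baseline_rate * hazard_ratio * t)
    return risk_exposed / risk_control


# Hypothetical baseline rate of 0.1 events per person-year and HR = 2
for t in (1, 5, 20):
    print(f"year {t}: risk ratio {risk_ratio_at(t, 0.1, 2.0):.2f}")
```

The risk ratio starts close to the HR and drifts towards 1 as follow-up lengthens, because eventually most participants in both groups have had the event. An HR of 2 therefore does not mean 'twice the risk' at any particular time point.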
5. Appraisal checklist
Authors should state how variables were selected. Clinical knowledge and protocol pre-specification are stronger than data-driven stepwise selection without validation.
Authors should also state how missing data were handled (complete-case analysis, multiple imputation, inverse probability weighting). Complete-case analysis can bias adjusted estimates when data are not missing completely at random, and it reduces precision.
Interactions should be pre-specified; post-hoc interaction claims are exploratory.
For cluster trials or repeated measures, ask whether clustering was accounted for (mixed models, robust SEs, GEE).
- Is the model type appropriate for the outcome?
- Are effect sizes and CIs reported for all key predictors?
- Are missing data handled explicitly (in the model, by weighting, or by imputation)?
- Are interactions pre-specified or exploratory?