Strata Academy
How to critically appraise a research paper – complete walkthrough
A step-by-step workflow for students: PICO, study design, risk of bias, statistics, limitations, and structured summary for journal club or dissertation
Quick answer
Critical appraisal steps: (1) define your PICO question, (2) identify the true study design, (3) apply the matching framework (ROB 2 for RCTs, ROBINS-I for non-randomised interventions, QUADAS-2 for diagnostics, AMSTAR 2 + PRISMA for reviews), (4) evaluate the statistics using effect sizes and confidence intervals, not p-values alone, (5) judge applicability to your setting.
1. Why critical appraisal matters
Reading a paper is not the same as trusting it. Critical appraisal asks whether the study design, conduct, analysis, and reporting are strong enough for the conclusions the authors draw.
Supervisors, journal clubs, and clinical guidelines all depend on this skill. A structured approach stops you from being swayed by a significant p-value or a confident abstract alone.
In coursework and theses, examiners look for explicit framework use – not a vague paragraph saying 'the study had some limitations'.
- Validity – could the results be explained by bias or chance?
- Precision – how uncertain are the estimates?
- Applicability – does this study answer your patient or research question?
2. First pass: title, abstract, and PICO
Start with the abstract, but never stop there. Identify the population, intervention or exposure, comparator, outcomes, and study design (PICO/PECO).
Ask: Is this paper attempting to establish causation, describe association, measure diagnostic accuracy, or synthesise existing studies? The answer determines which appraisal tool you need next.
Check trial registration (ClinicalTrials.gov, ISRCTN) and protocol documents if the paper is a trial or review – compare registered outcomes to published outcomes.
- Who was studied, and can you generalise to your setting?
- What was done or measured?
- What outcomes matter for your decision?
- Is the design named correctly (RCT, cohort, case–control, cross-sectional, systematic review)?
- Who funded the study and are conflicts declared?
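Comparing registered outcomes with published outcomes is essentially a set comparison. The Python sketch below is a minimal illustration, assuming you have typed both outcome lists in by hand; the function name compare_outcomes and the example outcomes are hypothetical, and real outcome wording usually needs manual matching rather than exact string comparison.

```python
def compare_outcomes(registered, published):
    """Compare prespecified outcomes with those reported in the paper."""
    # Normalise case and whitespace so trivial differences are not flagged.
    reg = {o.strip().lower() for o in registered}
    pub = {o.strip().lower() for o in published}
    return {
        "missing_from_paper": sorted(reg - pub),  # registered, not reported
        "added_in_paper": sorted(pub - reg),      # reported, not registered
        "reported_as_planned": sorted(reg & pub),
    }


# Hypothetical lists copied from a registry entry and from the paper.
registered = ["All-cause mortality", "Hospital readmission", "Quality of life"]
published = ["all-cause mortality", "Length of stay", "Quality of life "]

result = compare_outcomes(registered, published)
for label, outcomes in result.items():
    print(f"{label}: {outcomes}")
```

Anything listed under missing_from_paper or added_in_paper is a prompt to look for an explanation in the paper, not proof of selective reporting on its own.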
3. Match study design to the right framework
Using the wrong checklist is the most common student error. Randomised trials need ROB 2, not tools built for cohort studies. Systematic reviews need AMSTAR 2 and PRISMA – not ROB 2 applied to the review as if it were a trial.
Use our interactive framework picker on the guides hub if you are unsure after reading the methods section.
- RCT → ROB 2 + CONSORT reporting
- Non-randomised intervention → ROBINS-I + STROBE
- Cohort / case–control → Newcastle–Ottawa Scale (NOS) or design-specific tools + STROBE
- Diagnostic accuracy → QUADAS-2 + STARD
- Systematic review / meta-analysis → AMSTAR 2, PRISMA, ROBIS, GRADE
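The routing in the list above can be written down as a plain lookup table. The Python sketch below is only an illustration: the dictionary keys and the function name frameworks_for are hypothetical labels, and the values are copied from the list above rather than from any official source.

```python
# Design-to-framework mapping copied from the list above.
# The keys are illustrative labels, not an official taxonomy.
FRAMEWORKS = {
    "rct": ["ROB 2", "CONSORT"],
    "non_randomised_intervention": ["ROBINS-I", "STROBE"],
    "cohort": ["NOS or design-specific tool", "STROBE"],
    "case_control": ["NOS or design-specific tool", "STROBE"],
    "diagnostic_accuracy": ["QUADAS-2", "STARD"],
    "systematic_review": ["AMSTAR 2", "PRISMA", "ROBIS", "GRADE"],
}


def frameworks_for(design):
    """Return the appraisal and reporting tools for a named study design."""
    key = design.strip().lower().replace("-", "_").replace(" ", "_")
    if key not in FRAMEWORKS:
        # Failing loudly is safer than silently falling back to a wrong checklist.
        raise ValueError(f"Unknown study design: {design!r}")
    return FRAMEWORKS[key]


print(frameworks_for("RCT"))
print(frameworks_for("Diagnostic accuracy"))
```

Raising an error for an unrecognised design is a deliberate choice here: it forces you back to the methods section instead of defaulting to a familiar tool.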
4. Appraise risk of bias domain by domain
Official tools break bias into domains (e.g. randomisation, deviations from intended interventions, missing outcome data). Work through signalling questions from the official tool rather than gut feeling.
Distinguish risk of bias from reporting quality. A poorly written paper may still be at low risk of bias if its methods were sound; conversely, polished writing cannot fix fundamental design flaws.
For each domain, note your judgement and one sentence of justification – examiners and journal club audiences expect reasoning, not only a traffic-light colour.
- Selection bias – who entered the study and who was analysed?
- Performance bias – apart from the intervention itself, were the groups treated the same (blinding, co-interventions)?
- Detection bias – could outcome assessment differ between groups?
- Attrition bias – is missing data related to outcome?
- Reporting bias – are all prespecified outcomes reported?
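One way to keep a judgement and its justification together for each domain is a small record type. The Python sketch below is a minimal illustration under stated assumptions: the class DomainJudgement, the function overall_judgement, and the example entries are hypothetical. The three levels follow ROB 2's wording, but the worst-domain rule used here is a simplification, since the official guidance also lets you rate a study as high risk when several domains raise some concerns.

```python
from dataclasses import dataclass

# Ordered from least to most severe, following ROB 2's wording.
LEVELS = ("low", "some concerns", "high")


@dataclass
class DomainJudgement:
    domain: str
    judgement: str      # one of LEVELS
    justification: str  # one sentence of reasoning

    def __post_init__(self):
        if self.judgement not in LEVELS:
            raise ValueError(f"Judgement must be one of {LEVELS}")
        if not self.justification.strip():
            raise ValueError("Each judgement needs a written justification")


def overall_judgement(domains):
    """Simplified rule: the overall rating is the most severe domain rating."""
    if not domains:
        raise ValueError("No domains were appraised")
    return max((d.judgement for d in domains), key=LEVELS.index)


# Hypothetical appraisal of a single trial.
appraisal = [
    DomainJudgement("Randomisation process", "low",
                    "Central computer-generated allocation, concealed."),
    DomainJudgement("Deviations from intended interventions", "low",
                    "Participants and staff blinded; analysis by assigned group."),
    DomainJudgement("Missing outcome data", "some concerns",
                    "12% dropout and no sensitivity analysis reported."),
]

for d in appraisal:
    print(f"{d.domain}: {d.judgement} ({d.justification})")
print("Overall:", overall_judgement(appraisal))
```

Rejecting an empty justification in the record itself mirrors the expectation that every rating comes with its reasoning.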
5. Evaluate the statistics (not just the p-value)
Check whether the analysis matches the design: logistic regression for binary outcomes, survival methods for time-to-event data, paired tests only when pairing exists.
Look for effect sizes with confidence intervals, not only p-values. For trials, prefer intention-to-treat analyses unless there is a clear and justified per-protocol secondary analysis.
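If a paper reports only event counts and a p-value, you can often reconstruct an effect size and interval yourself. The Python sketch below computes a risk ratio with an approximate 95% confidence interval using the standard log-scale (Wald) method; the function name risk_ratio_ci and the trial counts are made up for illustration, and the approximation is unreliable when event counts are very small or zero.

```python
import math


def risk_ratio_ci(events_tx, n_tx, events_ctrl, n_ctrl, z=1.96):
    """Risk ratio with an approximate confidence interval (log-scale Wald method)."""
    risk_tx = events_tx / n_tx
    risk_ctrl = events_ctrl / n_ctrl
    rr = risk_tx / risk_ctrl
    # Standard error of log(RR) for two independent groups.
    se_log_rr = math.sqrt(1 / events_tx - 1 / n_tx + 1 / events_ctrl - 1 / n_ctrl)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical trial: 30/200 events on treatment, 45/200 on control.
rr, lower, upper = risk_ratio_ci(30, 200, 45, 200)
print(f"RR {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

With these made-up counts the interval runs from about 0.44 to about 1.01, so the data are compatible with both a substantial benefit and no effect. That range is more informative than a p-value just above 0.05.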
For subgroup analyses, ask whether they were pre-specified. Post-hoc fishing without multiplicity adjustment is a red flag.
- Was the sample size justified a priori?
- Are confidence intervals reported for main estimates?
- Is multiple testing acknowledged or adjusted where needed?
- How was missing data handled?
- For reviews: was heterogeneity explored (I², τ², prediction intervals)?
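To see what I² measures, it helps to compute it once by hand. The Python sketch below derives Cochran's Q and I² from study effect estimates and their standard errors using inverse-variance weights; the function name i_squared and the four log risk ratios are hypothetical, and the sketch does not estimate τ² or a prediction interval.

```python
def i_squared(effects, std_errors):
    """Return (fixed-effect pooled estimate, Cochran's Q, I² in percent)."""
    if len(effects) != len(std_errors) or len(effects) < 2:
        raise ValueError("Need matching lists covering at least two studies")
    weights = [1 / se ** 2 for se in std_errors]  # inverse-variance weights
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I² is the share of variability beyond what chance alone would produce.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, q, i2


# Hypothetical log risk ratios from four trials with equal precision.
log_rr = [-0.5, -0.1, 0.3, -0.7]
se = [0.1, 0.1, 0.1, 0.1]

pooled, q, i2 = i_squared(log_rr, se)
print(f"Pooled log RR {pooled:.2f}, Q = {q:.1f}, I² = {i2:.0f}%")
```

In this invented example the trials disagree in direction as well as size, so I² is very high and a single pooled estimate would be hard to defend without exploring why the studies differ.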
6. Interpret results in context
Separate statistical significance from clinical or practical importance. A large sample can make trivial differences significant; a small sample can miss real effects, and when it does reach p < 0.05 the estimate is often imprecise and may overstate the true effect.
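The gap between statistical and practical importance is easy to demonstrate with arithmetic. The Python sketch below runs a pooled two-proportion z-test on two invented trials; the function name compare_proportions and all counts are hypothetical, and whether a given absolute difference matters remains a clinical judgement that no test can make for you.

```python
import math


def compare_proportions(events_a, n_a, events_b, n_b):
    """Absolute risk difference and two-sided p-value (pooled z-test)."""
    p_a, p_b = events_a / n_a, events_b / n_b
    diff = p_a - p_b
    pooled = (events_a + events_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = diff / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return diff, p_value


# Hypothetical very large trial: 10.0% vs 9.7% event rates.
large = compare_proportions(20_000, 200_000, 19_400, 200_000)
# Hypothetical small trial: 10.0% vs 5.0% event rates.
small = compare_proportions(20, 200, 10, 200)

for name, (diff, p) in (("large", large), ("small", small)):
    print(f"{name} trial: risk difference {diff:.3f}, p = {p:.4f}")
```

The large trial reaches p < 0.01 for an absolute difference of 0.3 percentage points, while the small trial misses p < 0.05 despite a difference of 5 percentage points.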
Read the limitations section critically, then add your own – especially generalisability, confounding, and whether outcomes were patient-centred.
Consider whether the discussion overstates causation from observational data or extrapolates beyond the studied population.
7. Summarise for your supervisor or journal club
A good appraisal ends with a plain-language verdict: strengths, main biases, key numbers, and whether you would act on this evidence for your question.
Structured tools encode this workflow: study-type routing, framework-aligned domains, and explicit scoring – so your appraisal is reproducible and auditable.
- One-sentence study aim in your own words
- Design + framework used
- Top 2–3 strengths and top 2–3 concerns
- Bottom line for your PICO question
- What would change your mind (future evidence)
8. Appraising systematic reviews differently
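When the paper is a systematic review, add four checks: whether the numbers in the PRISMA flow diagram reconcile, how the review rates on AMSTAR 2, how risk of bias in the included studies was assessed, and how certain the evidence is under GRADE. The failure modes are different from single-trial appraisal.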
See our systematic review methodology guide for the full workflow.
9. AI tools – use with caution
Chat tools can summarise text but often mismatch frameworks and invent checklist items. For coursework, disclose AI use and verify every claim against the PDF.
Framework-aligned appraisal tools route study type automatically – compare AI chat output to structured appraisal on the same paper.
10. Practice with a structured workflow
Pick one paper from your reading list each week. Appraise it with the same template until domain thinking becomes automatic.
Upload the PDF to StrataResearch quick analysis and compare your manual ROB 2 or AMSTAR 2 worksheet to the structured output – disagreement is where learning happens.