Strata Academy

How to critically appraise a research paper – complete walkthrough

A step-by-step workflow for students: PICO, study design, risk of bias, statistics, limitations, and a structured summary for a journal club or dissertation

Quick answer

Critical appraisal steps: (1) define your PICO question, (2) identify the true study design, (3) apply the matching framework (RoB 2 for RCTs, ROBINS-I for non-randomised interventions, QUADAS-2 for diagnostic accuracy studies, AMSTAR 2 + PRISMA for systematic reviews), (4) evaluate the statistics using effect sizes and confidence intervals, not p-values alone, (5) judge applicability to your setting.

1. Why critical appraisal matters

Reading a paper is not the same as trusting it. Critical appraisal asks whether the study design, conduct, analysis, and reporting are strong enough for the conclusions the authors draw.

Supervisors, journal clubs, and clinical guidelines all depend on this skill. A structured approach stops you from being swayed only by a significant p-value or a confident abstract.

In coursework and theses, examiners look for explicit framework use – not a vague paragraph saying 'the study had some limitations'.

2. First pass: title, abstract, and PICO

Start with the abstract, but never stop there. Identify the population, intervention or exposure, comparator, and outcomes (PICO/PECO), then note the study design.

Ask: Is this paper attempting to establish causation, describe association, measure diagnostic accuracy, or synthesise existing studies? The answer determines which appraisal tool you need next.

Check trial registration (ClinicalTrials.gov, ISRCTN) and protocol documents if the paper is a trial or review – compare registered outcomes to published outcomes.
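The registered-versus-published comparison can be done by hand, but a small script makes the check explicit. The following is a minimal sketch in Python, assuming you have typed the outcome names from the registry entry and from the paper into two lists. The function names and the example outcomes are hypothetical, and exact string matching is a crude stand-in for reading the definitions yourself, since the same outcome is often worded differently in the two places.

```python
def normalise(outcome: str) -> str:
    """Lower-case and collapse whitespace so trivial differences do not count."""
    return " ".join(outcome.lower().split())


def compare_outcomes(registered, published):
    """Return outcomes that appear in only one of the two lists."""
    reg = {normalise(o) for o in registered}
    pub = {normalise(o) for o in published}
    return {
        "registered_but_not_reported": sorted(reg - pub),
        "reported_but_not_registered": sorted(pub - reg),
    }


# Hypothetical example data, not taken from any real trial.
registered = ["All-cause mortality at 12 months", "Hospital readmission"]
published = ["Hospital readmission", "Quality of life score"]

report = compare_outcomes(registered, published)
for label, items in report.items():
    print(label, "->", items)
```

Anything listed under either heading is a prompt to go back to the protocol and the paper, not proof of selective reporting on its own.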

3. Match study design to the right framework

Using the wrong checklist is one of the most common student errors. Randomised trials need RoB 2, not tools built for cohort studies. Systematic reviews need AMSTAR 2 and PRISMA – not RoB 2 applied to the review as if it were a trial.
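The design-to-framework routing described in this guide can be written down as a simple lookup. This is an illustrative sketch only: the dictionary covers just the four pairings named in this guide, and the key wording and the function name pick_framework are assumptions made for the example. It deliberately refuses to guess for designs it does not know.

```python
# Pairings taken from this guide; other designs need their own tools.
FRAMEWORKS = {
    "randomised trial": ["RoB 2"],
    "non-randomised intervention study": ["ROBINS-I"],
    "diagnostic accuracy study": ["QUADAS-2"],
    "systematic review": ["AMSTAR 2", "PRISMA"],
}


def pick_framework(study_design: str) -> list:
    """Return the appraisal tools for a study design, or fail loudly."""
    key = study_design.strip().lower()
    if key not in FRAMEWORKS:
        raise ValueError(
            f"No framework listed for {study_design!r}; "
            "re-read the methods section before choosing a checklist."
        )
    return FRAMEWORKS[key]


print(pick_framework("Systematic review"))
```

Raising an error for an unknown design mirrors the advice above: an unclear design is a reason to re-read the methods, not to reach for the nearest checklist.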

Use our interactive framework picker on the guides hub if you are unsure after reading the methods section.

4. Appraise risk of bias domain by domain

Official tools break bias into domains (e.g. randomisation, deviations from intended interventions, missing outcome data). Work through the signalling questions from the official tool rather than relying on gut feeling.

Distinguish risk of bias from reporting quality. A poorly written paper may still be at low risk of bias if its methods were sound; conversely, polished writing cannot fix fundamental design flaws.

For each domain, note your judgement and one sentence of justification – examiners and journal club audiences expect reasoning, not only a traffic-light colour.
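One way to keep judgement and justification together is to record them as structured data. The sketch below assumes the five RoB 2 domains and its three judgement levels. The class name, the example justifications, and the overall_risk rule are illustrative assumptions; in particular, the rule is simplified (the official RoB 2 guidance also allows an overall "high" when several domains raise some concerns), so treat it as a starting point and not as the official algorithm.

```python
from dataclasses import dataclass


@dataclass
class DomainJudgement:
    domain: str
    judgement: str      # "low", "some concerns", or "high"
    justification: str  # one sentence of reasoning


def overall_risk(judgements) -> str:
    """Simplified roll-up: the worst domain drives the overall judgement."""
    levels = {j.judgement for j in judgements}
    if "high" in levels:
        return "high"
    if "some concerns" in levels:
        return "some concerns"
    return "low"


# Hypothetical appraisal of an imaginary trial.
appraisal = [
    DomainJudgement("Randomisation process", "low",
                    "Central computer-generated allocation, concealed."),
    DomainJudgement("Deviations from intended interventions", "some concerns",
                    "Open-label design; no information on co-interventions."),
    DomainJudgement("Missing outcome data", "low",
                    "Outcome data available for nearly all participants."),
    DomainJudgement("Measurement of the outcome", "low",
                    "Outcome assessors were blinded."),
    DomainJudgement("Selection of the reported result", "low",
                    "Analysis matches the registered plan."),
]

for j in appraisal:
    print(f"{j.domain}: {j.judgement} - {j.justification}")
print("Overall:", overall_risk(appraisal))
```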

5. Evaluate the statistics (not just the p-value)

Check whether the analysis matches the design: logistic regression for binary outcomes, survival methods for time-to-event data, paired tests only when pairing exists.

Look for effect sizes with confidence intervals, not only p-values. For trials, prefer intention-to-treat as the primary analysis, and treat per-protocol results as a secondary analysis that is credible only when clearly justified.
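If a paper reports only event counts and a p-value, you can often reconstruct the effect size and its interval yourself. The sketch below computes a risk ratio with a 95% confidence interval from a 2x2 table, using the standard large-sample formula on the log scale. The function name and the event counts are made up for illustration, and the method assumes no zero cells.

```python
import math


def risk_ratio_ci(events_tx, n_tx, events_ctrl, n_ctrl, z=1.96):
    """Risk ratio and Wald confidence interval computed on the log scale."""
    rr = (events_tx / n_tx) / (events_ctrl / n_ctrl)
    # Standard error of log(RR); undefined if either event count is zero.
    se_log_rr = math.sqrt(
        1 / events_tx - 1 / n_tx + 1 / events_ctrl - 1 / n_ctrl
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical trial: 30/200 events on treatment, 45/200 on control.
rr, lower, upper = risk_ratio_ci(30, 200, 45, 200)
print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

In this made-up example the point estimate suggests a one-third reduction in risk, but the interval runs from roughly 0.44 to just above 1, so the data are compatible with both a large benefit and no effect. That is the information a bare p-value hides.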

For subgroup analyses, ask whether they were pre-specified. Post-hoc fishing without multiplicity adjustment is a red flag.
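To see what a multiplicity adjustment does to a set of subgroup results, you can apply one yourself. The sketch below uses the Holm step-down procedure, chosen here only as one common option; the paper you are appraising may have used a different method or none at all. The function name and the four p-values are hypothetical.

```python
def holm_reject(p_values, alpha=0.05):
    """Holm step-down procedure: True where the null hypothesis is rejected."""
    m = len(p_values)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is compared with alpha / (m - k + 1).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test fails, all larger p-values fail too
    return reject


# Hypothetical p-values from four subgroup analyses.
subgroup_p = [0.04, 0.01, 0.03, 0.20]
unadjusted = [p < 0.05 for p in subgroup_p]
adjusted = holm_reject(subgroup_p)
print("Unadjusted:", unadjusted)
print("Holm-adjusted:", adjusted)
```

With these invented numbers, three subgroups look significant before adjustment and only one survives it, which is why unadjusted post-hoc subgroup claims deserve scepticism.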

6. Interpret results in context

Separate statistical significance from clinical or practical importance. A large sample can make trivial differences significant; a small study can reach p < 0.05 and still give an imprecise estimate with a wide confidence interval.
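A short worked example shows how a very large trial can make a trivial difference statistically significant. The sketch assumes a two-arm comparison of means with equal arm sizes, a common standard deviation, and a normal approximation. The blood-pressure numbers and the minimal clinically important difference (MCID) of 5 mmHg are invented for illustration, and the function assumes a larger positive difference means more benefit.

```python
import math


def mean_difference_summary(diff, sd, n_per_arm, mcid, z_crit=1.96):
    """Compare a mean difference with both zero and a clinical threshold."""
    se = sd * math.sqrt(2 / n_per_arm)        # SE of a difference in means
    z = diff / se
    p = math.erfc(abs(z) / math.sqrt(2))      # two-sided normal p-value
    lower, upper = diff - z_crit * se, diff + z_crit * se
    return {
        "p_value": p,
        "ci": (lower, upper),
        "statistically_significant": p < 0.05,
        # Important only if even the lower limit reaches the MCID.
        "clinically_important": lower >= mcid,
    }


# Hypothetical: 0.5 mmHg reduction, SD 15 mmHg, 50,000 patients per arm.
summary = mean_difference_summary(diff=0.5, sd=15, n_per_arm=50_000, mcid=5)
print(summary)
```

Here the p-value is far below 0.05, yet the whole confidence interval sits well under the assumed 5 mmHg threshold, so the result is statistically significant but clinically unimportant.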

Read the limitations section critically, then add your own – especially generalisability, confounding, and whether outcomes were patient-centred.

Consider whether the discussion overstates causation from observational data or extrapolates beyond the studied population.

7. Summarise for your supervisor or journal club

A good appraisal ends with a plain-language verdict: strengths, main biases, key numbers, and whether you would act on this evidence for your question.

Structured tools encode this workflow: study-type routing, framework-aligned domains, and explicit scoring – so your appraisal is reproducible and auditable.

8. Appraising systematic reviews differently

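When the paper is a systematic review, add four checks: reconcile the PRISMA flow diagram with the numbers reported in the text, rate the review's conduct with AMSTAR 2, examine how risk of bias in the included studies was assessed, and consider the GRADE certainty of the evidence. The failure modes are different from single-trial appraisal.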
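Flow reconciliation is mostly arithmetic: each stage of the PRISMA diagram should equal the previous stage minus the records removed. The sketch below checks a deliberately simplified three-step flow. The field names and counts are hypothetical, and a real PRISMA 2020 diagram has additional boxes (for example, reports not retrieved and records from other sources) that this example ignores.

```python
def reconcile_flow(counts):
    """Return a list of stages whose reported count does not add up."""
    checks = [
        ("screened",
         counts["identified"] - counts["duplicates_removed"]),
        ("full_text_assessed",
         counts["screened"] - counts["excluded_at_screening"]),
        ("included",
         counts["full_text_assessed"] - counts["excluded_at_full_text"]),
    ]
    problems = []
    for field, expected in checks:
        if counts[field] != expected:
            problems.append(
                f"{field}: reported {counts[field]}, expected {expected}"
            )
    return problems


# Hypothetical counts copied from an imaginary flow diagram.
flow = {
    "identified": 1240,
    "duplicates_removed": 310,
    "screened": 930,
    "excluded_at_screening": 850,
    "full_text_assessed": 80,
    "excluded_at_full_text": 58,
    "included": 24,
}

problems = reconcile_flow(flow)
print(problems or "Flow diagram is internally consistent.")
```

A mismatch like the one in this invented example (24 studies included where the arithmetic gives 22) is worth raising in your appraisal, since it may mean studies were added or dropped without explanation.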

See our systematic review methodology guide for the full workflow.

9. AI tools – use with caution

Chat tools can summarise text but often mismatch frameworks and invent checklist items. For coursework, disclose AI use and verify every claim against the PDF.

Framework-aligned appraisal tools route by study type automatically. Compare AI chat output with a structured appraisal of the same paper to see where the two diverge.

10. Practice with a structured workflow

Pick one paper from your reading list each week. Appraise it with the same template until domain thinking becomes automatic.

Upload the PDF to StrataResearch quick analysis and compare your manual RoB 2 or AMSTAR 2 worksheet to the structured output – disagreement is where learning happens.
