Strata Academy

Systematic Review Data Extraction: Forms, Dual Review & PRISMA

Design extraction forms, pilot on 2–3 studies, resolve disagreements, and link extracted data to risk-of-bias assessment and meta-analysis

Quick answer

Data extraction translates each included study into structured fields for synthesis and risk-of-bias assessment. Cochrane methods expect a pre-specified form, piloting on 2–3 studies, dual independent extraction for key outcomes, and documented disagreement resolution. Copying numbers from abstracts alone does not meet that standard.

1. Why extraction is not copy-paste from PDFs

After screening, each included study must be converted into structured data: population characteristics, interventions, comparators, outcomes (with time points), effect sizes, variance measures, and funding or conflicts. This step determines what enters meta-analysis, narrative synthesis, and GRADE tables.

Abstract-only extraction is a common student error. Abstracts often omit non-significant secondary outcomes, per-protocol analyses, or subgroup results that appear only in tables. Extraction from the full text, including supplements, is the minimum standard for coursework marked at postgraduate level.

Poor extraction propagates through the entire review: wrong effect sizes in forest plots, misclassified study designs in subgroup analyses, and GRADE certainty statements that do not match the underlying data.

2. Building the extraction form

Start from your registered PICO and pre-specified outcomes. Each field on the form should map to an analysis planned in the protocol. Ad hoc fields added mid-review without protocol amendment weaken reproducibility.

Typical sections: bibliographic identifiers (DOI, trial registry ID), design and setting, population and eligibility, intervention and comparator details, outcome definitions and time points, effect estimates with CIs or SEs, sample sizes, missing data handling, funding, and conflicts of interest.

For meta-analysis, record enough detail to compute or verify effect sizes: means and SDs for continuous outcomes, events and totals for dichotomous outcomes, hazard ratios with CIs for time-to-event data. Note whether ITT or per-protocol analysis was reported.
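One way to enforce "enough detail to compute or verify effect sizes" is a completeness check per outcome type. The sketch below is illustrative only: the field names (`events_int`, `n_ctl`, `analysis_set`, and so on) are hypothetical and should be replaced by the fields defined in your own protocol.

```python
from typing import Dict, List

# Hypothetical field names; align these with the form in your protocol.
COMMON_FIELDS = ["study_id", "outcome", "time_point", "analysis_set"]
REQUIRED_FIELDS: Dict[str, List[str]] = {
    "continuous": ["mean_int", "sd_int", "n_int", "mean_ctl", "sd_ctl", "n_ctl"],
    "dichotomous": ["events_int", "n_int", "events_ctl", "n_ctl"],
    "time_to_event": ["hazard_ratio", "ci_lower", "ci_upper"],
}


def missing_fields(record: dict, outcome_type: str) -> List[str]:
    """Return the required fields that are absent or None in one record.

    A value of 0 counts as present, because zero events is real data.
    """
    required = COMMON_FIELDS + REQUIRED_FIELDS[outcome_type]
    return [name for name in required if record.get(name) is None]


# Example: a dichotomous outcome where the extractor has not yet recorded
# the control-arm events or whether the analysis was ITT or per-protocol.
record = {
    "study_id": "S01",
    "outcome": "all-cause mortality",
    "time_point": "12 months",
    "events_int": 0,
    "n_int": 120,
    "n_ctl": 118,
}
print(missing_fields(record, "dichotomous"))
```

Running a check like this during piloting shows quickly which fields the form fails to capture before the full extraction begins.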

3. Dual independent extraction

Cochrane recommends independent extraction by two reviewers for key data, with a third resolver for conflicts. This applies especially to outcome data, sample sizes, and risk-of-bias judgements that feed directly into synthesis.

Single extraction is acceptable only in rapid reviews with explicit limitation statements. For dissertation or publication targets, single extraction without justification will attract methodological criticism.

Blinding extractors to study authors and journal where feasible reduces prestige bias. Covidence, DistillerSR, and Rayyan support dual extraction workflows with conflict flags.
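If extraction is done in spreadsheets rather than a dedicated tool, the conflict-flagging step can be approximated with a field-by-field comparison. This is a minimal sketch, not how any of the tools named above work internally; the function name and record layout are assumptions made for illustration.

```python
import math
from typing import Any, Dict, Tuple


def find_disagreements(
    reviewer_a: Dict[str, Any],
    reviewer_b: Dict[str, Any],
    rel_tol: float = 1e-9,
) -> Dict[str, Tuple[Any, Any]]:
    """Compare two independent extractions of the same study.

    Returns {field: (value_a, value_b)} for every field where the two
    reviewers differ, including fields only one of them filled in.
    """
    conflicts = {}
    for name in sorted(set(reviewer_a) | set(reviewer_b)):
        a = reviewer_a.get(name)
        b = reviewer_b.get(name)
        if isinstance(a, (int, float)) and isinstance(b, (int, float)):
            # Numeric fields: tolerate floating-point noise only.
            same = math.isclose(a, b, rel_tol=rel_tol)
        else:
            same = a == b
        if not same:
            conflicts[name] = (a, b)
    return conflicts


a = {"n_int": 120, "events_int": 14, "analysis_set": "ITT"}
b = {"n_int": 120, "events_int": 41}
for name, (va, vb) in find_disagreements(a, b).items():
    print(f"{name}: reviewer A = {va!r}, reviewer B = {vb!r}")
```

The flagged fields go to discussion or to the third reviewer, and the resolution is logged so that the process can be reported.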

4. Linking extraction to risk of bias

Extract the data needed for framework judgements in the same pass where possible. For RoB 2: randomisation method, allocation concealment, blinding of participants and personnel, participant flow and attrition, and outcome assessor blinding. For ROBINS-I: confounding control, selection of participants, classification of interventions, and missing data.

Do not treat risk of bias as a separate checkbox exercise after extraction. The same missing ITT table that blocks effect size extraction often drives a high-risk judgement in RoB 2 Domain 3 (missing outcome data).
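As an illustration of how one extracted item serves both purposes, the participant-flow counts recorded for effect sizes can also be summarised as attrition per arm. The sketch below uses hypothetical argument names and deliberately applies no cut-off: the proportions inform the reviewers' judgement, they do not replace it.

```python
from typing import Dict


def attrition_summary(
    randomised_int: int,
    analysed_int: int,
    randomised_ctl: int,
    analysed_ctl: int,
) -> Dict[str, float]:
    """Proportion of randomised participants missing from the analysis,
    per arm, plus the difference between arms (intervention - control)."""

    def missing_share(randomised: int, analysed: int) -> float:
        if randomised <= 0 or not 0 <= analysed <= randomised:
            raise ValueError("check the extracted participant-flow counts")
        return (randomised - analysed) / randomised

    m_int = missing_share(randomised_int, analysed_int)
    m_ctl = missing_share(randomised_ctl, analysed_ctl)
    return {
        "missing_int": m_int,
        "missing_ctl": m_ctl,
        "difference": m_int - m_ctl,
    }


summary = attrition_summary(100, 90, 100, 95)
print({key: round(value, 3) for key, value in summary.items()})
```

The `ValueError` doubles as an extraction check: more participants analysed than randomised usually means a count was copied from the wrong table.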

5. From extraction sheet to meta-analysis

Before pooling, check clinical homogeneity: are populations, interventions, and outcome definitions similar enough that a common effect is meaningful? Statistical heterogeneity (I², τ²) is assessed after clinical plausibility, not instead of it.
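For readers who want to see what the two statistics measure, here is a minimal sketch that computes Cochran's Q, I², and τ² from study effects and their variances. It assumes the DerSimonian-Laird estimator for τ², which is one of several available; your software may default to a different one, such as REML.

```python
from typing import Dict, Sequence


def heterogeneity(effects: Sequence[float], variances: Sequence[float]) -> Dict[str, float]:
    """Cochran's Q, I-squared (%) and DerSimonian-Laird tau-squared.

    effects   : study effect estimates on the analysis scale (e.g. log OR)
    variances : their sampling variances (SE squared)
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching inputs")

    weights = [1.0 / v for v in variances]          # inverse-variance weights
    sum_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum_w

    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum_w - sum(w * w for w in weights) / sum_w

    tau2 = max(0.0, (q - df) / c)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {"Q": q, "I2": i2, "tau2": tau2}


# Two studies with effects 0.0 and 1.0, each with variance 0.25:
# Q = 2.0, I2 = 50.0, tau2 = 0.25
print(heterogeneity([0.0, 1.0], [0.25, 0.25]))
```

With only a handful of studies both statistics are imprecise, which is another reason the clinical judgement comes first.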

Transform effect sizes consistently (log OR, log RR, mean difference). Document the assumptions you make when deriving a standard error from a confidence interval or p-value. Cochrane RevMan and R meta-analysis packages such as meta and metafor expect specific input formats, so validate a subset of entries by hand.
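The two most common conversions can be sketched as follows, using the standard large-sample formulas for a log odds ratio and for a standard error recovered from a ratio-measure confidence interval. Function names are illustrative. The CI conversion assumes the interval was built symmetrically on the log scale with a normal approximation, which is an assumption to document for each study.

```python
import math
from statistics import NormalDist
from typing import Tuple


def log_or_from_counts(
    events_int: int, n_int: int, events_ctl: int, n_ctl: int
) -> Tuple[float, float]:
    """Log odds ratio and its standard error from a 2x2 table."""
    a, b = events_int, n_int - events_int
    c, d = events_ctl, n_ctl - events_ctl
    if min(a, b, c, d) <= 0:
        # Zero cells need a pre-specified approach (e.g. a continuity
        # correction); this sketch refuses rather than choosing one.
        raise ValueError("zero or negative cell: apply the protocol's rule")
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, se


def se_from_ratio_ci(lower: float, upper: float, level: float = 0.95) -> float:
    """Standard error of a log ratio (OR, RR, HR) from its reported CI."""
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for 95%
    return (math.log(upper) - math.log(lower)) / (2 * z)


log_or, se = log_or_from_counts(10, 100, 20, 100)
print(round(log_or, 4), round(se, 4))   # -0.8109 0.4167
```

Checking a few studies both ways, once from the counts and once from the reported interval, is a quick form of the manual validation described above.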

Contact authors for missing data when pre-specified in the protocol. Document non-response as a limitation. Do not impute missing study-level data without sensitivity analysis and clear assumptions.

6. Reporting extraction in PRISMA

PRISMA 2020 Item 9 expects a description of the data extraction methods: the number of reviewers, the process for resolving disagreements, and any software used. Item 10 covers the data items extracted.

If extraction methods changed after piloting, report the change transparently. Post-hoc changes without protocol update are a common reason for major revision at peer review.
