What, why & when?

  • On this page we:

- explain what we mean by IPD and an IPD meta-analysis

- give an example of IPD from a randomised trial

- compare a meta-analysis dataset containing IPD with one containing aggregate data

- discuss why and when IPD meta-analysis projects are most important.


What do we mean by IPD and an IPD meta-analysis?

  • Systematic reviews are the cornerstone of evidence synthesis and evidence-based decision-making in healthcare.

  • They use transparent methods to identify, appraise and combine a body of research evidence.

  • The goal is to produce summary results that guide best practice for stakeholders including patients, clinicians, health professionals, and policy makers.

  • Most systematic reviews include a meta-analysis, which is a statistical technique for combining (synthesising) quantitative data obtained from multiple research studies.

  • Traditionally, most meta-analyses have used aggregate data extracted from study publications, but there is growing demand for meta-analyses that utilise individual participant data (IPD). 

  • IPD refers to the raw (but de-identified) information recorded for each participant in a research study (e.g. a randomised trial), such as baseline characteristics, prognostic factors, treatments received, outcomes and follow-up details.

  • In contrast, aggregate data refers to information averaged or estimated across all participants in a particular study.

  • Examples include the treatment effect estimate, the total number of participants, and the mean age and proportion of males in each treatment group.

  • Such aggregate data are derived from the IPD, and therefore the IPD can be considered the original source material. 

  • An IPD meta-analysis project involves the collection, checking, harmonisation, and synthesis of IPD from multiple studies, and offers huge promise and potential in the new era of data sharing and personalised healthcare.
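
The relationship between IPD and aggregate data described above can be illustrated with a minimal Python sketch. The record fields (participant_id, treatment, age, male) and all values are hypothetical and invented for illustration; they are not taken from any real trial.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Participant:
    # Hypothetical fields, loosely mirroring the kinds of variables listed above.
    participant_id: int
    treatment: int  # 1 = treatment group, 0 = control group
    age: float      # years
    male: int       # 1 = male, 0 = female


# Hypothetical IPD: one record per participant (values invented for illustration).
ipd = [
    Participant(1, 1, 60, 1),
    Participant(2, 1, 50, 0),
    Participant(3, 0, 70, 1),
    Participant(4, 0, 40, 1),
    Participant(5, 1, 55, 0),
    Participant(6, 0, 52, 0),
]


def summarise_group(records, treatment):
    """Collapse participant-level records into aggregate data for one group."""
    group = [p for p in records if p.treatment == treatment]
    return {
        "n": len(group),
        "mean_age": mean(p.age for p in group),
        "proportion_male": mean(p.male for p in group),
    }


# Aggregate data keyed by treatment group (0 = control, 1 = treatment).
aggregate = {arm: summarise_group(ipd, arm) for arm in (0, 1)}
print(aggregate)
```

The aggregate summaries can always be recomputed from the IPD, but the individual records cannot be recovered from the summaries, which is why the IPD can be regarded as the original source material.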


Example of IPD from a randomised trial

  • An example of hypothetical IPD from a randomised trial examining anti-hypertensive treatment is shown in the table below.

  • It is clear that the IPD has information at the participant level about treatment group allocation, prognostic factor values (baseline characteristics) and outcomes such as systolic blood pressure (SBP), death, and length of follow-up.

  • Without IPD for this trial, the meta-analyst would be reliant on reported aggregate data, such as the treatment effect estimate on change in SBP (a mean difference and its 95% CI), the treatment effect on the rate of death (a hazard ratio and its 95% CI), the number of participants (and events) in the treatment and control groups, the mean age in each group, and so forth.
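
As a rough illustration of how one such aggregate result could be derived from IPD, the sketch below computes a mean difference in SBP change and an approximate 95% CI. The SBP values are hypothetical, and the interval uses a simple normal approximation (z = 1.96), which is an assumption made here for brevity; a real trial analysis would typically use a regression model, for example one adjusting for baseline SBP.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical change in SBP (mmHg, follow-up minus baseline) per participant.
sbp_change_treatment = [-12.0, -8.0, -15.0, -10.0, -5.0]
sbp_change_control = [-4.0, -6.0, 0.0, -2.0, -3.0]


def mean_difference_with_ci(treated, control, z=1.96):
    """Mean difference (treated minus control) with a normal-approximation 95% CI."""
    diff = mean(treated) - mean(control)
    # Standard error of a difference between two independent sample means.
    se = sqrt(variance(treated) / len(treated) + variance(control) / len(control))
    return diff, diff - z * se, diff + z * se


md, lower, upper = mean_difference_with_ci(sbp_change_treatment, sbp_change_control)
print(f"Mean difference: {md:.1f} mmHg (95% CI {lower:.1f} to {upper:.1f})")
```

A meta-analyst with only the publication would see the final line of output at most; with IPD, the participant-level values behind it are available for checking and re-analysis.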


Comparison of a meta-analysis dataset containing IPD or containing aggregate data

  • The collection, checking and harmonisation of IPD for meta-analysis is a considerable undertaking, often taking 1-2 years or more to complete.

  • An excerpt of IPD collected from 10 randomised trials for an IPD meta-analysis project is given in the box below.

  • The IPD meta-analysis dataset contains a single row per participant in every trial.

  • In contrast, a conventional meta-analysis would use aggregate data (usually as extracted from trial publications), and so its dataset would typically contain just a single row per trial.
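
The difference in dataset shape can be sketched in Python as follows. The column layout (trial_id, participant_id, treatment, sbp_change), the three trials and all values are hypothetical, and the per-trial summary is deliberately simplistic.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical stacked IPD: one row per participant, with a trial identifier.
# Columns: (trial_id, participant_id, treatment, sbp_change)
ipd_rows = [
    (1, 1, 1, -12.0), (1, 2, 0, -4.0), (1, 3, 1, -8.0), (1, 4, 0, -2.0),
    (2, 1, 1, -6.0), (2, 2, 0, -1.0), (2, 3, 1, -10.0), (2, 4, 0, -3.0),
    (3, 1, 1, -9.0), (3, 2, 0, -5.0), (3, 3, 1, -7.0), (3, 4, 0, -5.0),
]


def collapse_to_aggregate(rows):
    """Reduce stacked IPD to one aggregate row per trial."""
    # For each trial, collect outcomes by arm (0 = control, 1 = treatment).
    by_trial = defaultdict(lambda: {0: [], 1: []})
    for trial_id, _pid, treatment, sbp_change in rows:
        by_trial[trial_id][treatment].append(sbp_change)
    return [
        {
            "trial_id": trial_id,
            "n_treatment": len(arms[1]),
            "n_control": len(arms[0]),
            "mean_difference": mean(arms[1]) - mean(arms[0]),
        }
        for trial_id, arms in sorted(by_trial.items())
    ]


aggregate_rows = collapse_to_aggregate(ipd_rows)
print(len(ipd_rows), "IPD rows;", len(aggregate_rows), "aggregate rows")
```

Here the IPD dataset has one row for each participant in each trial, whereas the aggregate dataset has only as many rows as there are trials.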


Why and when are IPD meta-analysis projects needed?

  • IPD meta-analysis projects began to emerge in the late 1980s and early 1990s, originating mainly in the cancer and cardiovascular disease fields.

  • In the decades since, the number of IPD meta-analysis projects has risen sharply.

  • The growth of IPD meta-analysis projects reflects their potential to revolutionise healthcare research, especially as they align with three major contemporary initiatives:

- reducing research waste

- data sharing, and

- personalised healthcare.

  • Sharing of IPD maximises the contribution of existing data from millions of research participants, and so is becoming an increasingly frequent stipulation of research funding.

  • Compared to using published aggregate data, the availability of IPD can:

- improve the quantity and quality of data

- help standardise outcome and covariate definitions across trials

- enable data checking and independent scrutiny

- produce more flexible and sophisticated analyses than are possible with only existing aggregate data.

  • In particular, IPD meta-analysis projects allow a far broader and more detailed set of analyses and research questions to be addressed than is possible with published aggregate data.

  • For example, IPD are vital for a thorough investigation of modelling assumptions, examining treatment effect modifiers (treatment-covariate interactions), tailoring diagnostic strategies, identifying risk and prognostic factors, and individualising risk prediction.

  • However, IPD meta-analysis projects generally require more resources than a traditional aggregate data meta-analysis, including additional costs, time and expertise.

  • Given the additional resources, it is important to consider when an IPD project is needed. This depends on the particular research question, and whether IPD would produce a more reliable and comprehensive answer than using published aggregate data.

  • It is useful to first undertake a review to identify existing studies and whether they report suitable aggregate data to answer the research question of interest. Where the focus is on overall treatment effects, published aggregate data may suffice; however, when going beyond the overall effect, IPD meta-analysis projects are usually required.

  • Even when IPD meta-analysis projects are needed, the available IPD needs to be of sufficient quality, record the required participant-level characteristics and outcomes of interest, and have reasonable statistical power to address the research question(s) reliably.
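
To illustrate what going beyond the overall effect can look like, the sketch below estimates a treatment-covariate interaction within a single trial as the difference between the treatment effects in two age subgroups. The data, the tuple layout (treatment, age, sbp_change) and the age cut-off of 65 are all hypothetical; splitting a continuous covariate in two is done here only to keep the example short, and is not presented as a recommended analysis.

```python
from statistics import mean

# Hypothetical IPD from one trial: (treatment, age, sbp_change)
ipd = [
    (1, 72, -14.0), (1, 68, -12.0), (0, 70, -2.0), (0, 66, -4.0),
    (1, 45, -6.0), (1, 50, -8.0), (0, 48, -3.0), (0, 52, -5.0),
]


def treatment_effect(rows):
    """Mean difference in SBP change (treatment minus control) for the given rows."""
    treated = [y for t, _age, y in rows if t == 1]
    control = [y for t, _age, y in rows if t == 0]
    return mean(treated) - mean(control)


# Illustrative subgroups; the cut-off of 65 years is an arbitrary assumption.
older = [row for row in ipd if row[1] >= 65]
younger = [row for row in ipd if row[1] < 65]

effect_older = treatment_effect(older)
effect_younger = treatment_effect(younger)

# The interaction is how much the treatment effect differs between subgroups.
interaction = effect_older - effect_younger
print(effect_older, effect_younger, interaction)
```

Subgroup-specific effects of this kind are rarely reported in full in trial publications, which is one reason such questions usually need IPD rather than published aggregate data.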