Power and sample size

Before IPD collection, exploring the statistical power of a planned IPD meta-analysis can provide useful insight into the value and viability of the project.

Power calculations can be made conditional on the aggregate data (e.g. number of participants and events) known for studies (potentially) promising their IPD.

The calculations may indicate when the power is likely to be too low even if IPD were obtained from the majority of trials.

Conversely, they may provide reassurance that even with a conservative assumption about IPD availability, the resulting power is likely to be sufficient.

A variety of approaches to power calculations are available, which we outline below.

For further details see Chapter 12 of our book and various references.

Power calculations based on studies promising their IPD

The power of an IPD meta-analysis depends on the research aim (e.g. to examine a treatment-covariate interaction), the number of trials and the number of participants (and events) for which IPD can potentially be obtained, and many intricate aspects including the distribution of covariate values and magnitude of assumed effects.

For example, closed-form solutions to calculate the power of an IPD meta-analysis for treatment-covariate interactions are available for continuous outcomes.

These utilise likelihood-based solutions for the variance of the summary interaction estimate, which depend only on the sample sizes, residual variances, and variances of covariate values from those trials (potentially) promising their IPD. Such information can be extracted from trial publications.
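To illustrate how such a calculation might look, the following is a minimal sketch rather than the exact published method. It assumes 1:1 randomisation, a common residual variance across arms, a large-sample approximation in which each trial's interaction variance is 4 × residual variance / (n × covariate variance), and a two-stage common-effect pooling of the trial-specific interaction estimates. The function names, the example trials and the assumed interaction size are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def interaction_variance(n, resid_sd, cov_sd):
    """Approximate variance of one trial's treatment-covariate interaction.

    Assumption: 1:1 allocation and equal residual variance in both arms,
    so the interaction is the difference of two within-arm slopes.
    """
    return 4.0 * resid_sd ** 2 / (n * cov_sd ** 2)


def interaction_power(trials, gamma, alpha=0.05):
    """Power to detect a true interaction `gamma` (two-sided test).

    `trials` is a list of (participants, residual SD, covariate SD).
    Assumption: common-effect (inverse-variance) pooling across trials.
    """
    weights = [1.0 / interaction_variance(n, s, z) for (n, s, z) in trials]
    pooled_se = sqrt(1.0 / sum(weights))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    shift = gamma / pooled_se
    power = NormalDist().cdf(shift - z_crit) + NormalDist().cdf(-shift - z_crit)
    return power, pooled_se


# Hypothetical aggregate data extracted from trial publications
trials = [(250, 12.0, 9.0), (400, 11.0, 6.5), (120, 13.0, 10.0)]
power, se = interaction_power(trials, gamma=0.2)
print(f"pooled SE = {se:.3f}, power = {power:.2f}")
```

Re-running the function with only a subset of the trials gives a quick check of how power changes under more conservative assumptions about IPD availability.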

Closed-form solutions for binary and time-to-event outcomes are difficult to derive reliably before IPD collection, because participant-level response variances are conditional on the actual covariate values, which are unknown without IPD.

A flexible alternative is to use simulation-based power calculations, where IPD meta-analysis datasets of a particular size (chosen to reflect the trials promising their IPD) are generated many times (e.g. 10,000) based on a particular data-generating model, and on each occasion an IPD meta-analysis is performed to estimate the effect of interest.

The proportion of simulated datasets that give a p-value < 0.05 (or equivalently a 95% confidence interval excluding the null value) provides the estimated power.
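The simulation procedure described above can be sketched as follows. This is an illustrative example under stated assumptions, not a recommended implementation: it assumes a binary outcome, a binary covariate, 1:1 randomisation, a logistic data-generating model, and a two-stage common-effect analysis in which each trial's interaction is the difference in log odds ratios between covariate subgroups (with 0.5 added to every cell to avoid zero counts). All parameter values, function names and trial sizes are hypothetical, and the number of simulations is kept small for speed.

```python
import random
from math import exp, log, sqrt
from statistics import NormalDist


def simulate_trial(n, gamma, rng, p_cov=0.5, base_logit=-1.0,
                   beta_trt=-0.3, beta_cov=0.2):
    """Generate one trial and return (interaction estimate, variance)."""
    # cells[(z, t)] = [events, non-events] for covariate z and treatment t
    cells = {(z, t): [0, 0] for z in (0, 1) for t in (0, 1)}
    for _ in range(n):
        z = 1 if rng.random() < p_cov else 0
        t = 1 if rng.random() < 0.5 else 0
        lp = base_logit + beta_trt * t + beta_cov * z + gamma * t * z
        event = rng.random() < 1.0 / (1.0 + exp(-lp))
        cells[(z, t)][0 if event else 1] += 1
    est, var = 0.0, 0.0
    for (z, t), (events, non_events) in cells.items():
        events, non_events = events + 0.5, non_events + 0.5  # continuity correction
        sign = 1 if z == t else -1  # log OR (z=1) minus log OR (z=0)
        est += sign * (log(events) - log(non_events))
        var += 1.0 / events + 1.0 / non_events
    return est, var


def simulated_power(trial_sizes, gamma, n_sims=1000, alpha=0.05, seed=2024):
    """Proportion of simulated meta-analyses with a two-sided p-value < alpha."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    significant = 0
    for _ in range(n_sims):
        results = [simulate_trial(n, gamma, rng) for n in trial_sizes]
        weights = [1.0 / v for _, v in results]
        pooled = sum(w * e for w, (e, _) in zip(weights, results)) / sum(weights)
        pooled_se = sqrt(1.0 / sum(weights))
        if abs(pooled / pooled_se) > z_crit:
            significant += 1
    return significant / n_sims


# Hypothetical sizes of the trials promising their IPD
print(simulated_power([300, 500, 250], gamma=0.8, n_sims=500))
```

In practice the data-generating model, the analysis model and the number of simulations would be chosen to match the planned IPD meta-analysis; a larger number of simulations reduces the Monte Carlo error in the estimated power.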

Power calculations also reveal which trials contribute most to the power, which is useful when time and resource constraints mean that IPD collection must be prioritised from a subset of trials (i.e. those that provide the most information). A trial’s contribution depends not just on its sample size or number of events, but also on other factors, in particular the variance of covariate values.
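As a rough illustration of this point, the sketch below computes each trial's share of the total inverse-variance weight for a treatment-covariate interaction with a continuous outcome. It assumes 1:1 randomisation and the large-sample approximation that a trial's weight is n × covariate variance / (4 × residual variance); the two trials and the function name are hypothetical.

```python
def contribution_shares(trials):
    """Share of total inverse-variance weight contributed by each trial.

    `trials` maps a trial label to (participants, residual SD, covariate SD).
    Assumption: weight = n * cov_sd^2 / (4 * resid_sd^2), i.e. 1:1 allocation
    and a continuous outcome with equal residual variance in both arms.
    """
    weights = {
        name: n * cov_sd ** 2 / (4.0 * resid_sd ** 2)
        for name, (n, resid_sd, cov_sd) in trials.items()
    }
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Hypothetical trials: B is half the size of A but has twice the covariate SD
trials = {"Trial A": (400, 10.0, 5.0), "Trial B": (200, 10.0, 10.0)}
for name, share in contribution_shares(trials).items():
    print(f"{name}: {share:.0%} of the total weight")
```

Under these assumptions the smaller trial carries about two thirds of the weight, because its covariate values are more widely spread.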

In addition to calculating power, researchers may also want to check that their IPD meta-analysis will give sufficiently precise estimates of the effect of interest, so that confidence intervals will be narrow enough to inform clinical decision making.

IPD meta-analysis results might also inform the sample size required for a new primary study.