Advanced statistical methods 

  • IPD meta-analysis projects allow more nuanced research questions to be addressed.

  • However, often these require more advanced and complex statistical methods.

  • For example, fundamental one-stage and two-stage approaches (described here) can be extended to:

- examine treatment-covariate interactions (participant-level effect modifiers)

- jointly model multiple correlated outcomes (using multivariate meta-analysis)

- compare multiple treatments (using network meta-analysis)

  • Missing data is a common problem in IPD projects, and novel approaches may be required to deal with it.

Treatment-covariate interactions (participant-level effect modifiers)

  • A key component of stratified and precision medicine research is to identify participant-level characteristics (covariates) that are associated with changes in a treatment’s effect. These are known as treatment-covariate interactions. 

  •  When IPD from multiple randomised trials are available, an IPD meta-analysis provides the opportunity to increase power to detect true treatment-covariate interactions.

  • When done properly, an IPD meta-analysis avoids using across-trial information from a meta-regression of the observed treatment effects (based on all trial participants) and aggregated values of participant-level covariates (such as mean age, proportion male).

  • Such analyses are prone to aggregation bias, and may not reflect actual interactions at the participant-level within trials.

  • A two-stage IPD approach to estimating treatment-covariate interactions avoids aggregation bias by estimating treatment-covariate interactions in each trial separately, and then synthesising them in the second stage. This ensures that only within-trial information is used.

  • A one-stage IPD meta-analysis approach to estimating treatment-covariate interactions must ensure that within-trial and across-trial information are separated, by either (i) stratifying all nuisance parameters by trial, or (ii) centering the covariate by its mean within each trial and allowing the mean covariate value to explain between-trial heterogeneity.

  • Many current IPD meta-analysis projects apply a one-stage model that amalgamates within-trial and across-trial information; this is not recommended, as the summary treatment-covariate interaction may then be influenced by aggregation bias. 

  • If interactions do exist, they are more likely to be detected when continuous covariates and outcomes are analysed on their continuous scale. 

  • Treatment-covariate interactions may be non-linear (e.g. U or J shaped), and this should be investigated; for example, a two-stage multivariate IPD meta-analysis that summarises interactions defined by a restricted cubic spline function is recommended.

  • Treatment-covariate interactions may depend on the scale of analysis; for example, they may arise on the odds ratio or hazard ratio scale, even when there is no interaction on the risk ratio scale, simply due to changes in the baseline risk across covariate values.

  • Measurement error may also lead to apparent treatment-covariate interactions when they actually do not exist, or conversely mask genuine interactions.

  • Predicting an individual’s treatment effect conditional on their covariate values is a complex issue, potentially requiring the combination of risk prediction models and treatment-covariate interactions, whilst still avoiding aggregation bias.
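
As a rough illustration of the two-stage approach described above, the following Python sketch estimates the interaction within each trial separately and then pools the estimates with inverse-variance weights. It assumes a continuous outcome analysed with a linear model, a common-effect second stage (a random-effects model would often be preferred in practice), and simulated data. All function names, trial sizes and parameter values are hypothetical and are not taken from any particular IPD project.

```python
import random
from statistics import mean


def within_trial_interaction(rows):
    """Stage 1: estimate the treatment-covariate interaction in ONE trial.

    rows: list of (treat, x, y) tuples with treat in {0, 1}.
    For the linear model y = a + b*treat + c*x + d*treat*x + error, the
    least-squares estimate of d equals the slope of y on x in the treated
    arm minus the slope in the control arm.
    Returns (estimate, variance).
    """
    slopes, sxx_by_arm, rss = {}, {}, 0.0
    for arm in (0, 1):
        xs = [x for t, x, _ in rows if t == arm]
        ys = [y for t, _, y in rows if t == arm]
        mx, my = mean(xs), mean(ys)
        sxx = sum((x - mx) ** 2 for x in xs)
        sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        slope = sxy / sxx
        rss += sum((y - my - slope * (x - mx)) ** 2 for x, y in zip(xs, ys))
        slopes[arm], sxx_by_arm[arm] = slope, sxx
    sigma2 = rss / (len(rows) - 4)  # four regression parameters per trial
    estimate = slopes[1] - slopes[0]
    variance = sigma2 * (1 / sxx_by_arm[0] + 1 / sxx_by_arm[1])
    return estimate, variance


def pool_common_effect(estimates):
    """Stage 2: inverse-variance weighted average of (estimate, variance) pairs."""
    weights = [1 / v for _, v in estimates]
    pooled = sum(w * e for w, (e, _) in zip(weights, estimates)) / sum(weights)
    return pooled, 1 / sum(weights)


def simulate_trial(rng, n, mean_x, true_interaction):
    """Simulate one hypothetical trial (alternating allocation for simplicity)."""
    rows = []
    for i in range(n):
        treat = i % 2
        x = rng.gauss(mean_x, 1.0)
        y = 1.0 + 0.3 * x + treat * (2.0 + true_interaction * x) + rng.gauss(0.0, 1.0)
        rows.append((treat, x, y))
    return rows


rng = random.Random(2024)
# Four simulated trials whose mean covariate values differ across trials.
trials = [simulate_trial(rng, 200, m, 0.5) for m in (-1.0, 0.0, 1.0, 2.0)]
stage1 = [within_trial_interaction(rows) for rows in trials]
pooled, pooled_var = pool_common_effect(stage1)
print(f"pooled interaction = {pooled:.3f} (SE {pooled_var ** 0.5:.3f})")
```

Because each trial contributes only its own interaction estimate, the differences in mean covariate values across the simulated trials do not enter the pooled result; only within-trial information is used.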


Multivariate meta-analysis of correlated outcomes

  • Often IPD meta-analysis projects are interested in a treatment’s effect on each of multiple correlated outcomes (e.g. systolic and diastolic blood pressure); however, most perform a separate meta-analysis for each outcome, and ignore the correlation between them.

  • Statistical models for multivariate IPD meta-analysis address this by analysing treatment effects for multiple outcomes simultaneously, whilst accounting for within-trial and between-trial correlation.

  • In the first stage of a two-stage multivariate IPD meta-analysis, each trial is analysed separately to obtain a treatment effect estimate and its corresponding variance for each outcome, and the within-trial correlation between treatment effect estimates for each pair of outcomes.

  • Within-trial correlation arises from participants in the same trial having correlated data for each of the multiple outcomes. It can be estimated by using joint models or, more generally, using bootstrapping.

  • The second stage requires a multivariate meta-analysis model, which typically assumes multivariate normality, for both the treatment effect estimates within each trial, and the true treatment effects across trials. When there is between-trial heterogeneity, the true treatment effects for the outcomes may also be correlated across trials, a phenomenon known as between-trial correlation.

  • A variety of statistical software and estimation methods are available to fit the multivariate meta-analysis models.

  • By accounting for correlation amongst outcomes, the multivariate meta-analysis can borrow strength across outcomes (i.e. gain information) to provide more precise summary results for each outcome; this is especially useful when some outcomes are not available in all trials.

  • Accounting for correlation in a multivariate meta-analysis also enables more appropriate joint inferences across outcomes, such as the probability that a treatment is beneficial for both outcome 1 and outcome 2.

  • Alternative one-stage IPD meta-analysis models are also possible to handle multiple outcomes, especially for multiple continuous outcomes or for a multinomial outcome.

  • Multivariate IPD meta-analysis has many other applications; for example, for modelling multiple time-points (longitudinal data), examining surrogate outcomes, and joint synthesis of multiple model parameters, such as for dose-response relationships and non-linear trends.
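
The second stage described above can be sketched in Python as follows. This is a minimal illustration only: it assumes a common-effect model (no between-trial heterogeneity, so between-trial correlation is ignored), two outcomes, and within-trial correlations that are already known from the first stage. The function names and the trial values are hypothetical.

```python
def invert_2x2(m):
    """Invert a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def bivariate_common_effect(trials):
    """Pool two correlated treatment effects by generalised least squares.

    trials: list of (y1, y2, v1, v2, rho) tuples holding the stage-one
    estimates, their variances and their within-trial correlation.
    y2, v2 and rho are None when a trial did not record outcome 2.
    Returns (pooled estimates, covariance matrix of the pooled estimates).
    """
    info = [[0.0, 0.0], [0.0, 0.0]]  # sum of within-trial precision matrices
    score = [0.0, 0.0]               # sum of precision-weighted estimates
    for y1, y2, v1, v2, rho in trials:
        if y2 is None:
            # Only outcome 1 is available: it adds to that element alone.
            info[0][0] += 1 / v1
            score[0] += y1 / v1
            continue
        cov = rho * (v1 * v2) ** 0.5
        w = invert_2x2([[v1, cov], [cov, v2]])
        for r in range(2):
            score[r] += w[r][0] * y1 + w[r][1] * y2
            for c in range(2):
                info[r][c] += w[r][c]
    pooled_cov = invert_2x2(info)
    pooled = [pooled_cov[r][0] * score[0] + pooled_cov[r][1] * score[1]
              for r in range(2)]
    return pooled, pooled_cov


# Made-up stage-one results: (effect 1, effect 2, var 1, var 2, correlation).
trials = [
    (-9.0, -5.0, 1.0, 0.5, 0.7),
    (-11.0, -6.0, 2.0, 0.8, 0.7),
    (-8.0, None, 1.5, None, None),  # outcome 2 not recorded in this trial
]
pooled, pooled_cov = bivariate_common_effect(trials)

# Variance for outcome 2 from a separate (univariate) common-effect analysis.
univariate_var_2 = 1 / (1 / 0.5 + 1 / 0.8)
print("multivariate variance, outcome 2:", round(pooled_cov[1][1], 4))
print("univariate variance, outcome 2:  ", round(univariate_var_2, 4))
```

In this made-up example the third trial reports only outcome 1, yet it still reduces the variance of the pooled result for outcome 2, because the other trials link the two outcomes through their within-trial correlation. This is the borrowing of strength described above.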

Network meta-analysis of multiple treatments

  • There are often multiple treatments available for the same clinical condition; evidence synthesis of existing randomised trials can inform decisions about which treatments are best.

  • Existing randomised trials rarely directly compare all the available treatments; rather, each directly compares a subset of treatments of interest.

  • A network meta-analysis simultaneously synthesises such trials, allowing the direct evidence about available treatment comparisons to be combined with indirect evidence propagated through the network. This produces summary treatment effect estimates, and provides a coherent framework for all the treatments to be compared and ranked.

  • Rankings can be very sensitive to the uncertainty in summary results and do not reveal clinical value. In particular, focusing on the probability of being ranked first is potentially misleading: a treatment ranked first may also have a high probability of being ranked last, and its benefit over other treatments may be of little clinical relevance.

  • A network meta-analysis combines direct and indirect evidence by assuming consistency between these two sources of evidence.

  • The consistency assumption should be evaluated in each network where possible. There is usually low power to detect inconsistency, which mainly arises when trial-level or participant-level effect modifiers are systematically different in the subsets of trials providing direct and indirect evidence.

  • IPD has the same potential advantages for network meta-analysis as it does for pairwise meta-analysis, including standardising participant inclusion criteria, outcome definitions and length of follow-up in each trial; allowing analyses that adjust for prognostic factors and treatment effect modifiers; and conducting analyses within subgroups of individuals.

  • The main additional advantages of IPD for network meta-analysis pertain to inconsistency and its detection or reduction. In particular, IPD enables researchers to examine and plot covariate distributions to improve detection, and allows statistical models to include adjustment for covariates that cause inconsistency (e.g. effect modifiers, and prognostic factors when modelling non-collapsible measures such as odds ratios).

  • Even if IPD are available from just one or a few trials in the network, it may be possible to adjust for effect modifiers by fitting a multi-level network meta-regression model, and to produce estimates of population-adjusted average treatment effects.
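
To make the idea of combining direct and indirect evidence concrete, here is a small Python sketch for the simplest possible network of three treatments (A, B and C). It assumes summary treatment effects on an additive scale (for example, log odds ratios), two-arm trials only so that the direct and indirect estimates are independent, and common-effect pooling. The function names and numerical values are hypothetical and chosen purely for illustration.

```python
from math import sqrt


def indirect_comparison(d_ab, var_ab, d_ac, var_ac):
    """Indirect estimate of C versus B through the common comparator A.

    d_ab is the effect of B versus A and d_ac the effect of C versus A,
    both on an additive scale such as the log odds ratio.
    """
    return d_ac - d_ab, var_ab + var_ac


def combine_under_consistency(direct, indirect):
    """Inverse-variance average of two (estimate, variance) pairs.

    Only appropriate if direct and indirect evidence are consistent.
    """
    w_dir, w_ind = 1 / direct[1], 1 / indirect[1]
    estimate = (w_dir * direct[0] + w_ind * indirect[0]) / (w_dir + w_ind)
    return estimate, 1 / (w_dir + w_ind)


def inconsistency_z(direct, indirect):
    """z-statistic for the difference between direct and indirect evidence."""
    return (direct[0] - indirect[0]) / sqrt(direct[1] + indirect[1])


# Hypothetical summary log odds ratios and variances.
indirect_cb = indirect_comparison(d_ab=-0.5, var_ab=0.04, d_ac=-0.8, var_ac=0.05)
direct_cb = (-0.2, 0.06)  # from trials comparing C with B head to head

mixed_estimate, mixed_var = combine_under_consistency(direct_cb, indirect_cb)
z = inconsistency_z(direct_cb, indirect_cb)
print(f"indirect C vs B: {indirect_cb[0]:.2f} (variance {indirect_cb[1]:.3f})")
print(f"combined C vs B: {mixed_estimate:.2f} (variance {mixed_var:.3f})")
print(f"inconsistency z: {z:.2f}")
```

The indirect estimate has a larger variance than either of its two components, which is one reason why the power to detect inconsistency is usually low. A full network meta-analysis generalises this calculation to many treatments and to multi-arm trials.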

Dealing with missing data in IPD meta-analysis projects

  • Missing data is a common problem in healthcare research, including IPD meta-analysis projects.

  • Complete-case analysis discards participants with missing data, but is generally undesirable, as it may be invalid when data are not missing completely at random (MCAR) and leads to a loss of information.

  • An exception is in the analysis of randomised trials to evaluate treatment effects, where missing values for prognostic factors (adjustment variables) can be handled using mean imputation (for continuous variables) or the missing indicator method (for categorical variables).

  • In prognosis and prediction research, it is usually preferable to assume data are missing at random (MAR), and to adopt multiple imputation methods that impute missing values for a variable conditional on observed values for other variables.

  • Multiple imputation can be achieved by explicitly defining a joint distribution of the observed data (joint modelling), or by defining a series of conditional distributions (fully conditional specification, FCS).

  • In an IPD meta-analysis, missing data are either sporadically or systematically missing. Sporadically missing data occur when variables are missing for some (but not all) participants in one or more studies.  Systematically missing values occur when variables are missing for all participants of one or more studies. 

  • A practical approach to account for sporadically missing values is to multiply impute each study separately. This avoids borrowing information across studies, but requires systematically missing variables to be dropped.

  • The imputation of systematically missing data requires borrowing information across studies and adopting multilevel imputation methods that utilise the FCS or joint modelling framework. These methods account for the clustering of participants within studies, and allow for between-study heterogeneity on key parameters of the imputation model.

  • The choice between multilevel joint modelling and FCS approaches is context specific, and recommendations are only just emerging.

  • It is important that the generation of imputed values is consistent (“congenial”) with the analysis of the imputed data.

  • In some situations, it is possible to avoid imputing missing values at the participant-level, and rather borrow information directly at the study-level. In particular, when some studies have a full set of variables and other studies have the same subset of variables, a bivariate meta-analysis can be used to jointly synthesise fully adjusted and partially adjusted results, whilst accounting for their correlation.
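
The distinction between sporadically and systematically missing data can be checked mechanically before choosing an imputation strategy. The Python sketch below is one possible way to do this, assuming the IPD are held as a dictionary of studies, each a list of participant records in which a missing value is stored as None. The data layout, function names and toy values are hypothetical.

```python
def classify_missingness(ipd, variables):
    """Label each variable in each study by its pattern of missingness.

    ipd: dict mapping study name -> list of participant dicts.
    Returns a dict keyed by (study, variable) with one of
    'complete', 'sporadically missing' or 'systematically missing'.
    """
    status = {}
    for study, participants in ipd.items():
        for var in variables:
            n_missing = sum(1 for p in participants if p.get(var) is None)
            if n_missing == 0:
                status[(study, var)] = "complete"
            elif n_missing == len(participants):
                # Missing for every participant in this study.
                status[(study, var)] = "systematically missing"
            else:
                # Missing for some, but not all, participants.
                status[(study, var)] = "sporadically missing"
    return status


def usable_with_separate_imputation(status, variables):
    """Variables that remain if each study is imputed separately.

    Variables that are systematically missing in any study have to be
    dropped under that strategy.
    """
    dropped = {var for (_, var), s in status.items()
               if s == "systematically missing"}
    return [var for var in variables if var not in dropped]


# Toy data for two hypothetical studies.
ipd = {
    "trial_1": [{"age": 61, "smoker": 1}, {"age": None, "smoker": 0}],
    "trial_2": [{"age": 55, "smoker": None}, {"age": 70, "smoker": None}],
}
variables = ["age", "smoker"]
status = classify_missingness(ipd, variables)
for key, value in sorted(status.items()):
    print(key, value)
print("usable with separate imputation:",
      usable_with_separate_imputation(status, variables))
```

Here smoking status is systematically missing in the second study, so imputing each study separately would force it to be dropped. Retaining it would require a multilevel imputation method that borrows information across studies.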
