Advanced statistical methods

IPD meta-analysis projects allow more nuanced research questions to be addressed.

However, often these require more advanced and complex statistical methods.

For example, fundamental one-stage and two-stage approaches (described here) can be extended to:
 examine treatment-covariate interactions (participant-level effect modifiers)
 jointly model multiple correlated outcomes (using multivariate meta-analysis)
 compare multiple treatments (using network meta-analysis)

Missing data is a common problem in IPD projects, and novel approaches may be required to deal with it.

Treatment-covariate interactions (participant-level effect modifiers)

A key component of stratified and precision medicine research is to identify participant-level characteristics (covariates) that are associated with changes in a treatment’s effect. These are known as treatment-covariate interactions.

When IPD from multiple randomised trials are available, an IPD meta-analysis provides the opportunity to increase power to detect true treatment-covariate interactions.

When done properly, an IPD meta-analysis avoids using across-trial information from a meta-regression of the observed treatment effects (based on all trial participants) on aggregated values of participant-level covariates (such as mean age, proportion male).

Such meta-regression analyses are prone to aggregation bias, and may not reflect actual interactions at the participant level within trials.

A two-stage IPD approach to estimating treatment-covariate interactions avoids aggregation bias by estimating treatment-covariate interactions in each trial separately, and then synthesising them in the second stage. This ensures that only within-trial information is used.
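The two stages can be sketched as follows for a continuous outcome. This is a minimal illustration, not a recommended analysis: the data are simulated, the function names `trial_interaction` and `pool_common_effect` are hypothetical, the first stage uses ordinary least squares, and the second stage assumes a common-effect inverse-variance model (a random-effects model would often be preferred in practice).

```python
import numpy as np

def trial_interaction(y, treat, z):
    """Stage 1: fit y ~ 1 + treat + z + treat*z by OLS within ONE trial.
    Returns the interaction estimate and its variance."""
    X = np.column_stack([np.ones_like(z), treat, z, treat * z])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])   # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)            # OLS covariance matrix
    return beta[3], cov[3, 3]

def pool_common_effect(estimates, variances):
    """Stage 2: inverse-variance weighted average of the trial estimates."""
    w = 1.0 / np.asarray(variances)
    pooled = np.sum(w * np.asarray(estimates)) / np.sum(w)
    return pooled, 1.0 / np.sum(w)

# Simulated (hypothetical) IPD: 5 trials, true within-trial interaction = 0.3
rng = np.random.default_rng(1)
ests, variances = [], []
for trial in range(5):
    n = 400
    treat = rng.integers(0, 2, n).astype(float)
    z = rng.normal(50 + 3 * trial, 10, n)            # mean covariate differs by trial
    y = 1.0 + 2.0 * treat + 0.1 * z + 0.3 * treat * z + rng.normal(0, 1, n)
    est, var = trial_interaction(y, treat, z)
    ests.append(est)
    variances.append(var)

pooled, pooled_var = pool_common_effect(ests, variances)
print(f"summary interaction: {pooled:.3f} (SE {pooled_var ** 0.5:.3f})")
```

Because each interaction is estimated inside a single trial, differences in mean covariate values between trials cannot influence the summary estimate.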

A one-stage IPD meta-analysis approach to estimating treatment-covariate interactions must ensure that within-trial and across-trial information are separated out, by either (i) stratifying all nuisance parameters by trial, or (ii) centering the covariate by its trial-specific mean and allowing the mean covariate value to explain between-trial heterogeneity.
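One possible way to write option (ii) for a continuous outcome is sketched below. The notation is an assumption made for illustration: $y_{ij}$ is the outcome for participant $j$ in trial $i$, $x_{ij}$ is the treatment indicator, $z_{ij}$ is the covariate, and $\bar{z}_i$ is its mean in trial $i$.

```latex
\begin{aligned}
y_{ij} &= \alpha_i + \theta_i x_{ij} + \beta_i z_{ij}
          + \gamma_W \, x_{ij}\,(z_{ij} - \bar{z}_i) + e_{ij},
          \qquad e_{ij} \sim N(0, \sigma_i^2), \\
\theta_i &= \theta + \gamma_A \, \bar{z}_i + u_i,
          \qquad u_i \sim N(0, \tau^2).
\end{aligned}
```

Under this formulation $\gamma_W$ is the within-trial interaction of interest, whilst $\gamma_A$ captures the across-trial association that is prone to aggregation bias; keeping them as separate parameters prevents the two sources of information from being amalgamated.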

Many current IPD meta-analysis projects apply a one-stage model that amalgamates within-trial and across-trial information; this is not recommended, as the summary treatment-covariate interaction may then be influenced by aggregation bias.

If interactions do exist, they are more likely to be detected when continuous covariates and outcomes are analysed on their continuous scale.

Treatment-covariate interactions may be non-linear (e.g. U-shaped or J-shaped), and this should be investigated; for example, a two-stage multivariate IPD meta-analysis that summarises interactions defined by a restricted cubic spline function is recommended.

Treatment-covariate interactions may depend on the scale of analysis; for example, they may arise on the odds ratio or hazard ratio scale, even when there is no interaction on the risk ratio scale, simply due to changes in the baseline risk across covariate values.
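A small worked calculation illustrates this scale dependence. The numbers are hypothetical: the risk ratio is fixed at 0.5 in two subgroups whose baseline (control group) risks are assumed to be 0.2 and 0.8.

```python
def odds(p):
    """Convert a risk (probability) to odds."""
    return p / (1.0 - p)

risk_ratio = 0.5  # assumed identical in both subgroups (no interaction on this scale)
baseline_risks = {"low-risk subgroup": 0.2, "high-risk subgroup": 0.8}

odds_ratios = {}
for label, p0 in baseline_risks.items():
    p1 = risk_ratio * p0                      # risk in the treated group
    odds_ratios[label] = odds(p1) / odds(p0)  # same effect on the odds ratio scale
    print(f"{label}: risk ratio = {p1 / p0:.2f}, odds ratio = {odds_ratios[label]:.3f}")
```

The odds ratio is about 0.44 in the low-risk subgroup and about 0.17 in the high-risk subgroup, so an apparent interaction appears on the odds ratio scale even though the risk ratio is constant.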

Measurement error may also lead to apparent treatment-covariate interactions when they actually do not exist, or conversely mask genuine interactions.

Predicting an individual’s treatment effect conditional on their covariate values is a complex issue, potentially requiring the combination of risk prediction models and treatment-covariate interactions, whilst still avoiding aggregation bias.

For further details on treatment-covariate interactions see Chapter 7 of our book and various references.

Multivariate meta-analysis of correlated outcomes

Often IPD meta-analysis projects are interested in a treatment’s effect on each of multiple correlated outcomes (e.g. systolic and diastolic blood pressure); however, most perform a separate meta-analysis for each outcome, and ignore the correlation between them.

Statistical models for multivariate IPD meta-analysis address this by analysing treatment effects for multiple outcomes simultaneously, whilst accounting for within-trial and between-trial correlation.

In the first stage of a two-stage multivariate IPD meta-analysis, each trial is analysed separately to obtain a treatment effect estimate and its corresponding variance for each outcome, and the within-trial correlation between treatment effect estimates for each pair of outcomes.

Within-trial correlation arises from participants in the same trial having correlated data for each of the multiple outcomes. It can be estimated by using joint models or, more generally, using bootstrapping.
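The bootstrap route can be sketched as below for two continuous outcomes in a single trial. This is an illustrative sketch under stated assumptions: the data are simulated, the treatment effect is taken to be a simple difference in means, participants are resampled with replacement regardless of arm, and the function names are hypothetical.

```python
import numpy as np

def treatment_effects(y1, y2, treat):
    """Difference in means (treated minus control) for each outcome."""
    t = treat == 1
    return y1[t].mean() - y1[~t].mean(), y2[t].mean() - y2[~t].mean()

def bootstrap_within_trial_corr(y1, y2, treat, n_boot=1000, seed=0):
    """Correlation between the two treatment effect estimates across
    bootstrap samples; each participant's two outcomes stay together."""
    rng = np.random.default_rng(seed)
    n = len(treat)
    effects = np.empty((n_boot, 2))
    for b in range(n_boot):
        idx = rng.integers(0, n, n)          # resample participants with replacement
        effects[b] = treatment_effects(y1[idx], y2[idx], treat[idx])
    return np.corrcoef(effects[:, 0], effects[:, 1])[0, 1]

# Simulated (hypothetical) trial with two positively correlated outcomes
rng = np.random.default_rng(42)
n = 500
treat = rng.integers(0, 2, n)
shared = rng.normal(0, 1, n)                 # participant-level component common to both outcomes
y1 = -5.0 * treat + 8.0 * shared + rng.normal(0, 6.0, n)
y2 = -3.0 * treat + 5.0 * shared + rng.normal(0, 4.5, n)

rho = bootstrap_within_trial_corr(y1, y2, treat)
print(f"estimated within-trial correlation: {rho:.2f}")
```

The resulting correlation, together with the two variances, defines the within-trial covariance matrix that each trial carries into the second stage.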

The second stage requires a multivariate meta-analysis model, which typically assumes multivariate normality, for both the treatment effect estimates within each trial, and the true treatment effects across trials. When there is between-trial heterogeneity, the true treatment effects for the outcomes may also be correlated across trials, a phenomenon known as between-trial correlation.
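For two outcomes, the second-stage model described above is commonly written as follows. The symbols are assumptions introduced here for illustration: $\hat{\boldsymbol{\theta}}_i$ holds the two treatment effect estimates from trial $i$, $\mathbf{S}_i$ is their within-trial covariance matrix (treated as known), and $\boldsymbol{\Sigma}$ is the between-trial covariance matrix.

```latex
\hat{\boldsymbol{\theta}}_i \mid \boldsymbol{\theta}_i \sim N(\boldsymbol{\theta}_i, \mathbf{S}_i),
\qquad
\boldsymbol{\theta}_i \sim N(\boldsymbol{\mu}, \boldsymbol{\Sigma}),
\qquad
\mathbf{S}_i =
\begin{pmatrix}
s_{i1}^2 & \rho_{Wi}\, s_{i1} s_{i2} \\
\rho_{Wi}\, s_{i1} s_{i2} & s_{i2}^2
\end{pmatrix},
\quad
\boldsymbol{\Sigma} =
\begin{pmatrix}
\tau_1^2 & \rho_B\, \tau_1 \tau_2 \\
\rho_B\, \tau_1 \tau_2 & \tau_2^2
\end{pmatrix}.
```

Here $\rho_{Wi}$ is the within-trial correlation in trial $i$, $\rho_B$ is the between-trial correlation, and $\boldsymbol{\mu}$ contains the summary treatment effect for each outcome.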

A variety of statistical software and estimation methods are available to fit multivariate meta-analysis models.

By accounting for correlation amongst outcomes, the multivariate meta-analysis can borrow strength across outcomes (i.e. gain information) to provide more precise summary results for each outcome; this is especially useful when some outcomes are not available in all trials.

Accounting for correlation in a multivariate meta-analysis also enables more appropriate joint inferences across outcomes, such as the probability that a treatment is beneficial for both outcome 1 and outcome 2.

Alternative one-stage IPD meta-analysis models are also possible to handle multiple outcomes, especially for multiple continuous outcomes or for a multinomial outcome.

Multivariate IPD meta-analysis has many other applications; for example, for modelling multiple time-points (longitudinal data), examining surrogate outcomes, and joint synthesis of multiple model parameters, such as for dose-response relationships and non-linear trends.

For further details on multivariate IPD meta-analysis see Chapter 13 of our book and various references.

Network meta-analysis of multiple treatments

There are often multiple treatments available for the same clinical condition; evidence synthesis of existing randomised trials can inform decisions about which treatments are best.

Existing randomised trials rarely directly compare all the available treatments; rather, each directly compares a subset of treatments of interest.

A network meta-analysis simultaneously synthesises such trials, allowing the direct evidence about available treatment comparisons to be combined with indirect evidence propagated through the network. This produces summary treatment effect estimates, and provides a coherent framework for all the treatments to be compared and ranked.

Rankings can be very sensitive to the uncertainty in summary results and do not reveal clinical value. In particular, focusing on the probability of being ranked first is potentially misleading: a treatment ranked first may also have a high probability of being ranked last, and its benefit over other treatments may be of little clinical relevance.
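The point about ranking probabilities can be illustrated with a small simulation. Everything here is hypothetical: three treatments with made-up summary effects relative to a common reference (lower is better) and made-up standard errors, drawn independently from normal distributions, which ignores the correlation between summary estimates that a real network meta-analysis would produce.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical summary effects versus a reference (more negative = better)
means = np.array([-0.30, -0.25, -0.28])   # treatments A, B, C
ses = np.array([0.05, 0.05, 0.40])        # C is estimated very imprecisely

draws = rng.normal(means, ses, size=(20000, 3))
ranks = draws.argsort(axis=1).argsort(axis=1)   # 0 = ranked first within each draw

p_first = (ranks == 0).mean(axis=0)
p_last = (ranks == 2).mean(axis=0)
for name, pf, pl in zip("ABC", p_first, p_last):
    print(f"treatment {name}: P(first) = {pf:.2f}, P(last) = {pl:.2f}")
```

In this sketch the imprecisely estimated treatment C has a large probability of being ranked first and a similarly large probability of being ranked last, which is why the probability of being first should not be read in isolation.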

A network meta-analysis combines direct and indirect evidence by assuming consistency between these two sources of evidence.
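For the simplest network of three treatments (A, B and C), the role of the consistency assumption can be sketched as follows. The summary estimates are hypothetical log odds ratios, `d_xy` denotes the effect of treatment Y relative to X, and the function names are illustrative only; a full network meta-analysis would fit all comparisons in one model rather than in separate steps.

```python
def indirect_estimate(d_ab, var_ab, d_ac, var_ac):
    """Indirect B-versus-C estimate through the common comparator A.
    Under consistency d_BC = d_AC - d_AB, and the variances add."""
    return d_ac - d_ab, var_ab + var_ac

def combine(direct, var_direct, indirect, var_indirect):
    """Inverse-variance weighted average of direct and indirect evidence,
    valid only if both estimate the same underlying effect (consistency)."""
    w_d, w_i = 1.0 / var_direct, 1.0 / var_indirect
    estimate = (w_d * direct + w_i * indirect) / (w_d + w_i)
    return estimate, 1.0 / (w_d + w_i)

# Hypothetical summary log odds ratios and their variances
d_ab, var_ab = -0.30, 0.02                 # B versus A (direct)
d_ac, var_ac = -0.50, 0.03                 # C versus A (direct)
d_bc_direct, var_bc_direct = -0.25, 0.05   # C versus B (direct)

d_bc_ind, var_bc_ind = indirect_estimate(d_ab, var_ab, d_ac, var_ac)
d_bc, var_bc = combine(d_bc_direct, var_bc_direct, d_bc_ind, var_bc_ind)
print(f"indirect: {d_bc_ind:.3f} (var {var_bc_ind:.3f})")
print(f"combined: {d_bc:.3f} (var {var_bc:.3f})")
```

With these numbers the direct and indirect estimates are equally precise, so the combined estimate lies midway between them and its variance is halved; the gain in precision is only meaningful if the two sources are genuinely consistent.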

The consistency assumption should be evaluated in each network where possible. There is usually low power to detect inconsistency, which mainly arises when trial-level or participant-level effect modifiers are systematically different in the subsets of trials providing direct and indirect evidence.

IPD has the same potential advantages for network meta-analysis as it does for pairwise meta-analysis, including standardising participant inclusion criteria, outcome definitions and length of follow-up in each trial; allowing analyses that adjust for prognostic factors and treatment effect modifiers; and conducting analyses within subgroups of individuals.

The main additional advantages of IPD for network meta-analysis pertain to inconsistency and its detection or reduction. In particular, IPD enables researchers to examine and plot covariate distributions to improve detection, and allows statistical models to include adjustment for covariates that cause inconsistency (e.g. effect modifiers, and prognostic factors when modelling non-collapsible measures such as odds ratios).

Even if IPD are available from just one or a few trials in the network, it may be possible to adjust for effect modifiers by fitting a multilevel network meta-regression model, and to produce estimates of population-adjusted average treatment effects.

For further details on network IPD meta-analysis see Chapter 14 of our book and various references.

Dealing with missing data in IPD meta-analysis projects

Missing data is a common problem in healthcare research, including IPD meta-analysis projects.

Complete-case analysis discards participants with missing data, but is generally undesirable, as it may be invalid when data are not missing completely at random (MCAR) and leads to a loss of information.

An exception is in the analysis of randomised trials to evaluate treatment effects, where missing values for prognostic factors (adjustment variables) can be handled using mean imputation (for continuous variables) or the missing indicator method (for categorical variables).

In prognosis and prediction research, it is usually preferable to assume data are missing at random (MAR), and to adopt multiple imputation methods that impute missing values for a variable conditional on observed values for other variables.

Multiple imputation can be achieved by explicitly defining a joint distribution of the observed data (joint modelling), or by defining a series of conditional distributions (fully conditional specification, FCS).

In an IPD metaanalysis, missing data are either sporadically or systematically missing. Sporadically missing data occur when variables are missing for some (but not all) participants in one or more studies. Systematically missing values occur when variables are missing for all participants of one or more studies.
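The distinction can be checked mechanically before choosing an imputation strategy. The sketch below is illustrative only: the data structure (a dictionary of studies, each mapping variable names to arrays with `NaN` for missing values), the toy values and the function name `classify_missingness` are all assumptions.

```python
import numpy as np

def classify_missingness(studies):
    """For each study and variable, report whether values are complete,
    sporadically missing (some participants) or systematically missing (all)."""
    result = {}
    for study, variables in studies.items():
        result[study] = {}
        for name, values in variables.items():
            frac_missing = np.isnan(values).mean()
            if frac_missing == 0:
                label = "complete"
            elif frac_missing == 1:
                label = "systematically missing"
            else:
                label = "sporadically missing"
            result[study][name] = label
    return result

# Hypothetical IPD from two studies
nan = np.nan
studies = {
    "study_1": {"age": np.array([54.0, 61.0, nan, 47.0]),
                "bmi": np.array([27.1, 30.4, 24.9, 26.0])},
    "study_2": {"age": np.array([66.0, 58.0, 71.0]),
                "bmi": np.array([nan, nan, nan])},   # BMI never recorded in this study
}

for study, labels in classify_missingness(studies).items():
    print(study, labels)
```

In this toy example age is sporadically missing in the first study, whilst BMI is systematically missing in the second.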

A practical approach to account for sporadically missing values is to multiply impute each study separately. This avoids borrowing information across studies, but requires systematically missing variables to be dropped.

The imputation of systematically missing data requires borrowing information across studies and adopting multilevel imputation methods that utilise the FCS or joint modelling framework. These methods account for the clustering of participants within studies, and allow for between-study heterogeneity on key parameters of the imputation model.

The choice between multilevel joint modelling and FCS approaches is context-specific, and recommendations are only just emerging.

It is important that the generation of imputed values is consistent (“congenial”) with the analysis of the imputed data.

In some situations, it is possible to avoid imputing missing values at the participant level, and rather borrow information directly at the study level. In particular, when some studies have a full set of variables and other studies have the same subset of variables, a bivariate meta-analysis can be used to jointly synthesise fully adjusted and partially adjusted results, whilst accounting for their correlation.

For further details on missing data in IPD projects see Chapter 18 of our book and various references.