
Two-stage approach

  • The flagship package for conducting a two-stage IPD meta-analysis is ipdmetan, developed by Dr David Fisher for the Stata software package. See:

Fisher DJ. Two-stage individual participant data meta-analysis and generalized forest plots. Stata Journal 2015;15(2):369-96

  • This package automates the calculation of trial-specific aggregate data (e.g. treatment effect estimates and their variances) from the first stage, and then produces summary meta-analysis results and a forest plot from the second stage.

  • It can be installed from within Stata, simply by typing ‘ssc install ipdmetan’. Type ‘help ipdmetan’ for a detailed help file and range of examples.

  • The package is applicable to IPD meta-analyses aiming to summarise a particular effect of interest defined by a single parameter in a regression model (such as a treatment effect, or another measure that can be estimated in a regression model, such as a prognostic effect or a treatment-covariate interaction).

  • To use ipdmetan, IPD from all trials should be collated in a single dataset that includes a column identifying the trial in which each participant was included. The user then specifies the trial identification variable, which regression model to fit in the first stage (calling standard Stata regression commands such as reg, logit, or stcox), and which meta-analysis model and estimation method(s) to use in the second stage.

  • The package then fits a regression model to the IPD from each trial separately and stores the derived results (aggregate data) for each trial, which are then immediately used to fit the chosen meta-analysis model in the second stage. A wide variety of model and estimation options are available for the second stage, including common or random treatment effects, REML estimation, Hartung-Knapp-Sidik-Jonkman (HKSJ) confidence intervals and tailored display of forest plots.
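
As a rough sketch of what the second stage computes (written in standard meta-analysis notation, which is an assumption here and not taken from the ipdmetan help file): let \hat{\theta}_i and s_i^2 be the treatment effect estimate and its variance from trial i of K trials. A random treatment effects model with an HKSJ confidence interval can then be written as:

```latex
\begin{align*}
\hat{\theta}_i &\sim N(\theta_i, s_i^2), \qquad \theta_i \sim N(\theta, \tau^2), \qquad i = 1, \dots, K \\
\hat{\theta} &= \frac{\sum_{i=1}^{K} w_i \hat{\theta}_i}{\sum_{i=1}^{K} w_i}, \qquad w_i = \frac{1}{s_i^2 + \hat{\tau}^2} \\
\widehat{\mathrm{Var}}_{\mathrm{HKSJ}}(\hat{\theta}) &= \frac{\sum_{i=1}^{K} w_i (\hat{\theta}_i - \hat{\theta})^2}{(K-1)\sum_{i=1}^{K} w_i} \\
\text{95\% CI:} \quad & \hat{\theta} \pm t_{K-1,\,0.975}\sqrt{\widehat{\mathrm{Var}}_{\mathrm{HKSJ}}(\hat{\theta})}
\end{align*}
```

Here \hat{\tau}^2 is the estimated between-trial variance (for example from REML), and setting \tau^2 = 0 gives the common treatment effect model.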

In Chapter 5 of our book, we focus on two-stage meta-analysis of randomised trials. Below is the code we used to run the examples (we cannot share the IPD itself).

  • For the time-to-event outcome example of Box 5.4, the following syntax was used to apply the ipdmetan package to the IPD from 10 trials of anti-hypertensive treatment:

ipdmetan, study(trial) re(reml, hksj) forest(xtitle(hazard ratio) boxsca(30) xlab(0.2 0.5 1 2) astext(40)) hr effect(HR) : stcox treat smk sbpi age bmi


The terms in the syntax are explained below:

  • study(trial) denotes that the column called ‘trial’ in the dataset is the trial identification covariate

  • re(reml, hksj) denotes that a random treatment effects model is to be fitted using REML estimation, with the 95% confidence interval derived using the HKSJ method

  • forest() includes various options for tailoring the display of the forest plot, including the size of the boxes around the trial treatment effect estimates (boxsca()), the title of the x-axis (xtitle()), the values of the hazard ratio to be displayed on the x-axis (xlab()) and the share of the plot width given to text (astext())

  • hr denotes that results should be displayed on the hazard ratio scale (and not the log hazard ratio scale)

  • effect(HR) denotes that HR should be displayed above the column of trial treatment effect estimates on the right-hand side of the plot.

  • the syntax after the colon specifies the regression model to be fitted to each trial separately in the first stage

  • stcox treat denotes that a Cox regression model should be fitted in each trial separately, with the treatment variable (column called ‘treat’ containing a value of 1 for participants in the treatment group and 0 for participants in the control group) included alongside prognostic factors of smoking (‘smk’), SBP (‘sbpi’), age (‘age’) and BMI (‘bmi’). Unless told otherwise (using the poolvar() option before the colon), ipdmetan will produce meta-analysis results for the first covariate listed within the subsequent regression statement.
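
The poolvar() behaviour described above can be sketched as follows. This is a minimal illustration, not the code used in the book: it assumes the data first need to be declared as survival-time data with stset, and the variable names 'time' and 'event' are hypothetical. The treatment variable is deliberately listed last in the stcox statement, so poolvar(treat) is needed to tell ipdmetan which coefficient to pool.

```stata
* Declare the survival-time structure before any stcox call.
* 'time' and 'event' are hypothetical variable names (assumptions).
stset time, failure(event)

* Same first-stage Cox model, but with treat listed last;
* poolvar(treat) identifies the coefficient to be pooled across trials.
ipdmetan, study(trial) poolvar(treat) re(reml, hksj) hr : stcox smk sbpi age bmi treat
```

Without poolvar(treat), this call would instead pool the coefficient for smk, the first covariate listed.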

  • For the continuous outcome example of Box 5.5, the following syntax was used to apply the ipdmetan package to the IPD from 33 trials examining the effect of interventions to reduce unnecessary weight gain in pregnancy:

ipdmetan, study(studyid) re(reml, hksj) forest(spacing(2) boxsca(30) effect(mean difference) lcols(intervention study_name n) xlab(-5 -4 -3 -2 -1 0 1 2) astext(70)): reg final_wt trt baseline_wt

Here, reg invokes a linear regression analysis; final_wt and baseline_wt are the final and baseline weights, respectively, and trt is 1 for those in the treatment group and 0 for those in the control group.

  • For the binary outcome example of Box 5.6, the following syntax was used to run the first-stage analysis with ipdmetan and store the study results in a dataset called 'AD.dta', followed by a meta-regression using the metareg command:

ipdmetan, study(studyid) re(reml, hksj) forest(spacing(2) boxsca(30) lcols(intervention study_name) xlab(0.5 1 2 3) astext(40) ) saving(AD.dta, replace) or: logit outcm_comp_exclbase trt2

use AD.dta

* this dataset has stored the effect estimates (_ES) and standard errors (_seES)

* plus study-level covariates including intervention type (intervention) and study name (study_name)
sort intervention study_name

* create new dummy variables for the different types of interventions
tabulate intervention, gen(int_type)

* fit a meta-regression with exercise (int_type2 = 1) and mixed (int_type3=1) as covariates
metareg _ES int_type2 int_type3, wsse(_seES) eform

  • Of course, if you are willing to generate and collate the aggregate data outside of ipdmetan (indeed, sometimes this is necessary, for example if studies have different designs, or if the IPD are not all stored together), then other meta-analysis packages are also useful.

  • That is, researchers could take the dataset of treatment effect estimates and variances obtained from the first stage, and use another package to fit their meta-analysis (or meta-regression) model in the second stage.

  • For example, available packages in Stata include meta, metan, metaan and metareg. In R, suitable packages include the exceptional metafor, and also rmeta and metaplus, amongst others.
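
As an illustration of running the two stages separately, the sketch below uses statsby for the first stage and the official meta suite for the second. It is one possible approach under stated assumptions, not the method used in the book: it assumes Stata 16 or later (for the meta commands), borrows the variable names described for the Box 5.5 example (studyid, final_wt, trt, baseline_wt), and introduces the hypothetical names b and se for the stored estimates.

```stata
* First stage: fit the same linear regression within each trial and keep
* the treatment coefficient and its standard error (one row per trial).
* Note: 'clear' replaces the IPD in memory with the aggregate data.
statsby b=_b[trt] se=_se[trt], by(studyid) clear: regress final_wt trt baseline_wt

* Second stage: declare the aggregate data and request a
* random-effects model estimated by REML.
meta set b se, random(reml)

* Summary result with a Knapp-Hartung adjustment to the standard error.
meta summarize, se(khartung)

* Forest plot of the trial-specific and summary estimates.
meta forestplot
```

The same aggregate dataset could equally be exported and analysed with another package, such as metafor in R.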
