Abstract
Objective Missing outcome data are a common problem in clinical trials and systematic reviews, as they compromise inference by reducing precision and potentially biasing the results. Systematic reviewers often assume that the missing outcome problem has been resolved at the trial level. However, many clinical trials employ a complete case analysis or suboptimal imputation techniques, and the problem accumulates in a quantitative synthesis of trials via meta-analysis. The risk of bias due to missing data depends on the missingness mechanism. Most statistical analyses assume missing data to be missing at random, which is an unverifiable assumption. The aim of this paper is to present methods used to account for missing outcome data in a systematic review and meta-analysis.
Methods The following methods to handle missing outcome data are presented: (1) complete case analysis, (2) imputation methods from observed data, (3) best/worst case scenarios, (4) uncertainty interval for the summary estimate and (5) a statistical model that makes assumptions about how treatment effects in missing data are connected to those in observed data. Examples are used to illustrate all the methods presented.
Results Different methods yield different results. A complete case analysis leads to imprecise and potentially biased results. The best-case/worst-case scenarios give unrealistic estimates, while the uncertainty interval produces very conservative results. Imputation methods that replace missing data with values from the observed data do not properly account for the uncertainty introduced by the unobserved data and tend to underestimate SEs. Employing a statistical model that links treatment effects in missing and observed data, unlike the other methods, reduces the weight assigned to studies with large missing rates.
Conclusions Unlike clinical trials, in systematic reviews and meta-analyses we cannot adopt pre-emptive methods to account for missing outcome data. There are statistical techniques implemented in commercial software (eg, STATA) that quantify the departure from the missing at random assumption and adjust results appropriately. A sensitivity analysis with increasingly stringent assumptions on how parameters in the unobserved and observed data are related is a sensible way to evaluate robustness of results.
Introduction
The term attrition is widely used in the clinical trials literature to refer to situations where outcome data are not available for some participants. Missing data may invalidate results from clinical trials by reducing precision and, under certain circumstances, yielding biased results. Missing outcome data are an important and common problem in mental health trials, as the dropout rate may exceed 50% for certain conditions.1 The Cochrane Collaboration regards incomplete outcome data as a major factor affecting the credibility of a study and requires systematic reviewers to assess the level of bias in all included trials via the Cochrane Risk of Bias tool.2,3 The ideal solution is to avoid missing data altogether, and the National Research Council has suggested ideas for limiting the possibility of missing data in the design of clinical trials.4 Systematic reviews and meta-analyses are retrospective by nature, and preventive measures for the avoidance of missing outcome data cannot be used.
The intention-to-treat (ITT) principle is widely accepted as the most appropriate way to analyse data in randomised controlled trials (RCTs).5,6 The ITT principle requires analysing all participants in the group to which they were originally randomised, irrespective of the treatment they actually received. The Cochrane Handbook suggests employing an ITT analysis as the least biased way to estimate intervention effects from randomised trials.7 However, in order to include in the analysis participants whose outcomes are unknown, one needs to employ an imputation technique and make assumptions about the missing data, which may affect the reliability and robustness of study findings.
Missing data mechanisms
There are several reasons why data may be missing, and not all of them introduce bias. The risk of bias due to missing data depends on the missing data mechanism, which describes how the propensity for missing data depends on participants' characteristics and outcomes. Missing data mechanisms can be classified as follows:

I. Missing completely at random (MCAR)
The probability of a missing outcome is the same for all participants and does not depend on any participant characteristic (eg, if a participant misses some appointments due to scheduling difficulties). The MCAR assumption means that the group of participants who provided data is a random sample of the total population of participants, but this is often unrealistic in practice.

II. Missing at random (MAR)
The propensity for missingness is related to participants' characteristics, but the probability of a missing outcome is not related to the outcome itself. For instance, suppose that primary school children are randomised to different interventions aiming to reduce school-related anxiety measured on a symptom severity scale. Younger children are less likely to provide data because they have a harder time understanding the items of the symptom severity scale. In the study, the proportion of young children and the missing rates are expected to be comparable across interventions. The MAR assumption implies that outcomes for the younger children who dropped out are expected to be similar to outcomes for the younger children who completed the study.4 Under the MAR assumption, missingness does not depend on the actual outcome, although it is associated with some participant or setting characteristics. The term MAR is often confusing and sometimes misunderstood as MCAR. Statistical analyses usually start with the MAR assumption and, if it is true, an analysis of completers only can provide an unbiased estimate of the relative treatment effect.8 However, the MAR assumption is formally untestable in meta-analyses, as we usually have aggregated data and not enough information about those who dropped out; hence, we should consider a different approach for dealing with missing data. By contrast, it is possible to explore the MAR assumption using auxiliary data within an individual trial: for example, if baseline disease severity predicts missingness, it is sensible to assume that final disease severity would predict missingness as well and the data may not be MAR.9

III. Missing not at random (MNAR) or informatively missing (IM)
Even after accounting for all the available observed information, the probability that an observation is missing still depends on the unseen observations themselves. Participants may drop out for reasons that are associated with the actual effect of the intervention. In schizophrenia trials, for example, placebo arms show larger dropout rates than antipsychotic arms because of placebo's lack of efficacy. An analysis of the participants who completed the study under MNAR would provide a biased estimate of the relative treatment effect. When missing data are MCAR or MAR they are termed ignorable; an MNAR mechanism is termed non-ignorable.
We use a hypothetical example to illustrate the differences between the three categories. Consider an RCT with 200 participants randomised equally (1:1) to the experimental or control group (table 1). We assume that the true response rate is 33.3% in the control group and 50% in the experimental group, so that the estimated OR should be 2. In the MCAR scenario, 10% of the participants dropped out because they missed the appointment. In the MAR scenario, young people dropped out because summer started and they left for vacation. In both these scenarios the missing rate is the same across treatment groups, because the groups are expected to have participants with similar baseline characteristics (eg, age); therefore, the probability of dropping out is the same in both groups. In the MNAR scenario, 40% of the participants who did not see any improvement dropped out of the study. The number of those not improved is larger in the control group; hence, the missing rate is larger in the control group. Reasons for dropping out are related to the intervention received and, more specifically, to the actual outcome of the study. Missingness introduces bias that favours the control group, because successes in completers remained unchanged in both groups but the number of completers is now smaller in the control group. We get unbiased results in both the MCAR and MAR scenarios, but ignoring missing data in the MNAR scenario may give biased results. Bias does not diminish as the sample size grows; hence, large studies with data that are MNAR may give seriously biased, yet apparently precise, results.
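The arithmetic of this kind of example can be reproduced with a short sketch (Python, working with expected cell counts; the specific MNAR dropout rates below are illustrative assumptions, not the exact figures of table 1):

```python
# Hypothetical RCT as in the text: 100 participants per arm, true response
# rates 1/2 (experimental) vs 1/3 (control), so the full-data OR is 2.
def completer_odds_ratio(n_per_arm, p_exp, p_ctrl, dropout):
    """OR among completers; dropout[(arm, improved)] gives the fraction
    of each outcome group that drops out (applied to expected counts)."""
    odds = {}
    for arm, p in (("exp", p_exp), ("ctrl", p_ctrl)):
        events = n_per_arm * p * (1 - dropout[(arm, True)])
        non_events = n_per_arm * (1 - p) * (1 - dropout[(arm, False)])
        odds[arm] = events / non_events
    return odds["exp"] / odds["ctrl"]

cells = [("exp", True), ("exp", False), ("ctrl", True), ("ctrl", False)]

full = {k: 0.0 for k in cells}    # no missing data
mcar = {k: 0.10 for k in cells}   # 10% drop out everywhere
# MNAR: non-improvers drop out more often, and more so in the control arm
mnar = {("exp", True): 0.0, ("exp", False): 0.20,
        ("ctrl", True): 0.0, ("ctrl", False): 0.40}

print(completer_odds_ratio(100, 0.5, 1/3, full))   # ~2.0
print(completer_odds_ratio(100, 0.5, 1/3, mcar))   # ~2.0: MCAR leaves the OR unbiased
print(completer_odds_ratio(100, 0.5, 1/3, mnar))   # ~1.5: completer OR biased towards the null
```

Uniform dropout (MCAR) cancels out of the odds, whereas outcome- and arm-dependent dropout (MNAR) shifts the completer OR away from 2.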
Methods
Methods to address missing data may improve the power and precision of results. Systematic reviewers are encouraged to collect information about the amount of missing outcome data and the techniques used to estimate treatment effects in each trial. There is a plethora of both ad hoc and more sophisticated approaches for handling missing outcome data at the meta-analysis level.10
Imputations of missing outcome data in individual studies and their meta-analysis
Trials often employ methods to impute values for missing outcomes. Two very common approaches include (1) replacing missing values with the mean value of the participants who provided data (simple imputation) and (2) replacing missing values with the last observed value. The latter method is called last observation carried forward (LOCF) and is routinely used in mental health trials. The LOCF approach has been criticised for producing biased results, as disorders in the mental health field are rarely stable and usually progressive.11,12 Hence, it is not sensible to assume that participants who dropped out at intermediate steps would have remained stable until the end of the study (especially when data are MNAR, the LOCF approach cannot be defended, because participants could have left the study for reasons associated with their unobserved outcome data, eg, they got worse). Simple imputation of missing values using the two aforementioned methods usually underestimates the SE of the outcome, because it fails to account for the fact that missing values are imputed rather than observed. Multiple imputation replaces missing values with a number of different plausible imputed values and subsequently combines them to get parameter estimates.13 Plausible imputed values are sampled from their predictive distributions given the observed data (eg, using regression models). The SE of the treatment effect is estimated from the variance within each imputed dataset as well as from the variance between the imputed datasets. A naïve or unprincipled imputation method may create more problems than it solves by introducing bias both in parameter estimates and in their SEs.8 When data are MAR, multiple imputation may correct for bias, but it will still yield biased results when MNAR holds. To give unbiased results, the variable(s) that are predictive of missing data should be included in the imputation model.
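The SE underestimation caused by simple mean imputation can be seen directly (a minimal sketch with hypothetical symptom scores):

```python
import math
import statistics

# Hypothetical symptom scores for the six completers of a ten-person arm
observed = [12.0, 15.0, 11.0, 14.0, 13.0, 16.0]
n_missing = 4

# Complete-case SE of the mean
se_cc = statistics.stdev(observed) / math.sqrt(len(observed))

# Naive simple imputation: fill every dropout with the completer mean,
# then analyse all ten values as if they had been observed
imputed = observed + [statistics.mean(observed)] * n_missing
se_naive = statistics.stdev(imputed) / math.sqrt(len(imputed))

# The imputed values add no spread (shrinking the SD) yet inflate the
# apparent sample size, so the SE is understated on both counts
print(round(se_cc, 3), round(se_naive, 3))   # 0.764 0.441
assert se_naive < se_cc
```

Multiple imputation avoids this by drawing imputed values with appropriate variability and adding a between-imputation variance component to the SE.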
Published study reports typically present results for fully observed and imputed outcomes together. Consequently, a meta-analyst has little choice but to synthesise study outcomes as reported in the trials, even when the imputation technique is inappropriate. In some cases, trial reports present the outcomes for completers only, as well as the results from the merged sample of observed and imputed outcomes.
Synthesis of studies with missing outcome data
It is often the case that studies report results from the participants who provided the outcome of interest and only report the number of participants for whom the outcome is unknown. From a meta-analysis perspective, several synthesis options exist and are outlined below.
Complete cases meta-analysis
This is usually the reference approach in many meta-analyses. From each study, only individuals whose outcome is known are included. If the MCAR assumption holds, then a complete cases meta-analysis will give unbiased results and the only consequence of missing data is the loss of power. The larger the missing rate, the less reliable the results of this analysis when the data are MNAR.10
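A complete cases meta-analysis simply pools the log ORs computed from the observed 2×2 tables, with inverse-variance weights (a sketch with hypothetical completer counts; a fixed-effect model is used for brevity):

```python
import math

def log_or_complete_cases(a, b, c, d):
    """Log OR and its SE from the observed 2x2 counts only;
    missing participants simply never enter the table."""
    return math.log((a * d) / (b * c)), math.sqrt(1/a + 1/b + 1/c + 1/d)

def pool_fixed(log_ors, ses):
    """Inverse-variance (fixed-effect) pooling of log ORs."""
    weights = [1 / se ** 2 for se in ses]
    pooled = sum(w * y for w, y in zip(weights, log_ors)) / sum(weights)
    return pooled, math.sqrt(1 / sum(weights))

# Hypothetical completer counts: (events_exp, no_exp, events_ctrl, no_ctrl)
studies = [(40, 40, 30, 50), (25, 20, 18, 27), (60, 55, 45, 70)]
log_ors, ses = zip(*(log_or_complete_cases(*s) for s in studies))
pooled, se_pooled = pool_fixed(log_ors, ses)

print(round(math.exp(pooled), 2))                       # pooled OR
print(round(math.exp(pooled - 1.96 * se_pooled), 2),
      round(math.exp(pooled + 1.96 * se_pooled), 2))    # 95% CI
```

Missing participants shrink every cell count, inflating the SEs and widening the pooled CI, which is the loss of power described above.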
Best-case and worst-case scenarios
A typical simple imputation technique for dichotomous outcomes involves the best-case and worst-case scenarios.14 The best-case scenario assumes that all missing participants have a favourable outcome in the experimental group and a poor outcome in the control group; the converse is assumed for the worst-case scenario. These two extremes are typically used as a sensitivity analysis and may produce unrealistic results in practice, especially if missing rates are high.
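The two scenarios amount to reassigning all dropouts to opposite outcome cells (a sketch with hypothetical counts):

```python
def scenario_or(r_e, n_e, m_e, r_c, n_c, m_c, best=True):
    """OR after counting every missing participant as a responder in one
    arm and a non-responder in the other. r_e, r_c: observed responders;
    n_e, n_c: completers; m_e, m_c: dropouts."""
    if best:   # all experimental dropouts respond, all control dropouts fail
        a, c = r_e + m_e, r_c
    else:      # the converse
        a, c = r_e, r_c + m_c
    b = n_e + m_e - a
    d = n_c + m_c - c
    return (a * d) / (b * c)

# Hypothetical study: 50/100 vs 40/100 observed responders, 20 dropouts per arm
print(scenario_or(50, 100, 20, 40, 100, 20, best=True))    # 2.8
print(scenario_or(50, 100, 20, 40, 100, 20, best=False))   # ~0.71
```

With only 17% missingness the OR already swings from 2.8 to about 0.71 around a complete-case OR of 1.5, illustrating how extreme these bounds become as missing rates grow.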
Uncertainty intervals
Gamble and Hollis14 suggested that studies for which there is a big discrepancy between the best-case and worst-case scenarios should be down-weighted. The best-case and worst-case scenarios give an interval, called the uncertainty interval, for the treatment effect that includes all uncertainty due to missing data.14 A pseudo SE is estimated from this interval for each study and is subsequently used to down-weight studies. A summary estimate is computed using the revised weights from the uncertainty intervals. This method has low power when there is a large amount of missing data.14
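One way to back a pseudo SE out of such an interval is to ask what SE a normal 95% CI would need in order to span it (a simplified sketch of the idea, with hypothetical interval limits, not the exact Gamble and Hollis computation):

```python
import math

def pseudo_se(worst_lo, best_hi):
    """Pseudo-SE (log-OR scale) from an uncertainty interval running from
    the worst-case lower CI limit to the best-case upper CI limit."""
    return (math.log(best_hi) - math.log(worst_lo)) / (2 * 1.96)

# Hypothetical interval limits (ORs): a low-dropout vs a high-dropout study
se_low = pseudo_se(worst_lo=1.10, best_hi=2.20)
se_high = pseudo_se(worst_lo=0.60, best_hi=4.50)

# Inverse-variance weights collapse as the uncertainty interval widens
print(round(1 / se_low ** 2, 1), round(1 / se_high ** 2, 1))
assert se_high > se_low
```

Because the pseudo SE grows with the missing rate, studies with heavy missingness receive tiny weights, which is precisely why the pooled estimate loses power.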
Using a statistical model that relates missing data to observed data
For the case of missing binary outcomes, White and colleagues presented a meta-analysis model where the degree of departure from the MAR assumption is quantified by the informative missingness OR (IMOR).15 The IMOR describes the relationship between the unknown odds of the outcome among missing participants and the known odds among observed participants.15,16 By relating missing treatment effects to the observed ones, it is possible to adjust treatment effects within each study. The adjusted treatment effects are then summarised via meta-analysis. Higgins et al16 suggested a sensitivity analysis for the IMOR approach with risks imputed for the missing data over a plausible range of values. Expert opinion can be used to elicit information on how the risk in the missing participants is related to that in the observed participants, and that information can be used to adjust treatment effects. Experts may be asked how much larger or smaller the risk in the missing participants is compared with that in the observed participants for each treatment group and study. Giving a range of plausible values for the ratio or difference of risks may help quantify uncertainty around the IMOR. The IMOR approach is implemented in STATA17 in the metamiss command18 (http://www.mrc-bsu.cam.ac.uk/software/stata-software/). The IMOR approach can incorporate the best-case/worst-case scenarios as special cases and has several advantages. It does not aim to estimate the missing outcomes, but rather to make valid inferences on the summary treatment effects.19 It also accounts for the uncertainty induced by missing outcome data, unlike the naïve approaches that consider the imputed values as if they were fully observed. An obvious downside of this approach is its complexity compared with the naïve approaches, as it requires the involvement of a knowledgeable statistician and expert opinion.
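The core adjustment can be sketched for a single arm (hypothetical counts; note that the full model, as implemented in metamiss, also propagates uncertainty about the IMOR into the SE, which this point-estimate sketch omits):

```python
def imor_adjusted_risk(events, completers, missing, imor):
    """Overall event risk in one arm, assuming the odds of the outcome
    among missing participants equal IMOR times the odds among completers."""
    p_obs = events / completers
    odds_missing = imor * p_obs / (1 - p_obs)
    p_missing = odds_missing / (1 + odds_missing)
    return (completers * p_obs + missing * p_missing) / (completers + missing)

# Hypothetical arm: 40 responders among 80 completers, 20 participants missing
print(imor_adjusted_risk(40, 80, 20, imor=1.0))   # IMOR=1 (MAR): risk stays 0.5
print(imor_adjusted_risk(40, 80, 20, imor=0.5))   # missing assumed to do worse
```

Setting IMOR=1 recovers the MAR analysis; IMOR→∞ or IMOR→0 reproduce the best-case/worst-case extremes for that arm, which is how the scenarios arise as special cases.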
Results
The approaches described above are illustrated via an example of a systematic review of all registration studies that compared the effectiveness of an atypical antipsychotic (amisulpride) with a typical one (haloperidol or flupenthixol) in schizophrenia.20–24 The condition of the participants was measured with the Brief Psychiatric Rating Scale (BPRS). The primary outcome is a 50% reduction in BPRS between baseline and week 6. We carried out a meta-analysis and the results are reported here.
In figure 1 we present the ORs and the weights used in a random-effects meta-analysis, reporting the corresponding estimates for each of the five studies, using complete cases meta-analysis (black lines), best-case scenario (blue), worst-case scenario (red), synthesis of studies where missing outcomes have been imputed with LOCF (green) and the IMOR model (brown). For the IMOR approach, we assumed that the odds in the missing participants are on average the same as those in the observed participants (IMOR=1), but with some uncertainty, expressed as a 70% chance that the IMOR varies from 0.5 to 2. The differences between the results depend heavily on the rate of missing participants. The studies by Colonna and Carriere, for example, have the lowest dropout rates (about 10%) and hence their ORs change only slightly across the different methods. By contrast, studies with much higher dropout rates (about 30%) present results that conflict sharply when different assumptions are made about the missing outcome data. The best-case/worst-case scenario analysis gives extreme results, whereas the LOCF approach erroneously reduces uncertainty, because it treats imputed data as if they were truly observed. The IMOR approach, unlike the other approaches, drastically reduces the weights assigned to studies with large missing rates, while the relative weights increase for studies with lower missingness.
In figure 2 we present the pooled meta-analytic results for each approach. The LOCF method provides the most precise estimate, while the Gamble and Hollis uncertainty interval is very wide and there is probably not enough power to detect a non-zero effect. The difference in significance between the complete cases analysis and the IMOR model is attributable to the fact that IMOR reduces the weight of the Möller 1997 study,23 which has a large missing rate and an OR that favours (though not significantly) the typical antipsychotics. Although the IMOR approach increases within-study variability, it may also result in a decrease in between-study variability, and uncertainty around the summary estimate does not necessarily increase when a random-effects model is assumed.
Discussion
Meta-analysts typically do not have access to the reasons for missing outcome data, and consequently the MAR assumption cannot be tested empirically. A sensitivity analysis is the only viable way to evaluate the effect of different scenarios for the missing data mechanism.13 Missing data should not be ignored, and systematic reviewers should not take for granted that the problem has been appropriately handled at the trial level. Naïve imputation techniques such as imputing the mean value, LOCF and the best-case and worst-case scenarios, though widely used, may produce biased results and underestimated SEs. The uncertainty interval has low power. Assuming a statistical model that relates treatment effects in the missing data to those in the observed data helps to assess how robust results are to departures from the MAR assumption. Clinical expertise may inform such a model, or a sensitivity analysis can be employed assuming increasingly stringent scenarios.
The Cochrane Handbook7 urges systematic reviewers to

contact the original investigators to request missing data;

state explicitly the assumptions of any methods used to cope with missing values;

perform sensitivity analysis to assess how robust results are;

address the potential impact of missing data on the findings of the review.
In most trials, especially mental health trials, missing data are likely not to be missing at random. Researchers should document the reasons why data are missing and collect data on auxiliary variables that may be predictive of both the outcome and the probability of dropping out.
Footnotes

Competing interests DM and GS received research funding from the European Research Council (IMMA 260559), AC and OE received funding from Greek national funds through the Operational Program ‘Education and Lifelong Learning’ of the National Strategic Reference Framework (NSRF)—Research Funding Program: ARISTEIA. Investing in knowledge society through the European Social Fund.