Background: The aim of this paper was to perform a systematic review and, when feasible, a meta-analysis of randomized controlled trials (RCT) which used benzodiazepines (BZD) as a monotherapy versus placebo, antidepressant drugs (AD), or both. Methods: Keyword searches were conducted for identifying RCT comparing BZD and AD, and/or placebo in the treatment of depression, using electronic databases from their inception up to April 2017. We selected reports of RCT in which BZD were compared to AD and/or placebo in the treatment of adult patients with a primary diagnosis of depressive disorder or anxious depression. When feasible, data were subjected to meta-analysis. Results: A total of 38 studies met the criteria for inclusion and were then included in the systematic review. Only 1 study concerned a newer AD, fluvoxamine. For the meta-analysis, we submitted data on response rate from 22 RCT, considering BZD versus placebo (8 comparisons) and BZD versus tricyclic antidepressants (TCA) (20 comparisons). There was a lack of significant differences as to response rate between BZD and placebo, as well as between BZD and TCA. Analysis of individual studies disclosed that, in more than half of the studies comparing BZD to TCA and/or placebo, BZD were significantly more effective than placebo and as effective as TCA. Conclusions: BZD are a therapeutic option in anxious depression and there are no indications that AD are preferable. There is a pressing need for RCT of adequate methodological quality and follow-up comparing BZD to second-generation AD and placebo in anxious depression.
Benzodiazepines (BZD) are often used in the treatment of depression in addition to antidepressant drugs (AD) in order to address any initial symptoms of anxiety [1‒5]. However, data regarding the use of BZD as a monotherapy and their specific antidepressant action are lacking.
The reviews available in the literature are older than 20 years, are not systematic, and lack important information about the inclusion criteria of the reported studies. In 1978, Schatzberg  analyzed 20 double-blind controlled trials in which BZD were compared to AD, or to AD and placebo for the treatment of depressive disorders. They concluded that, even though BZD may elevate mood, they exert limited effects on the core symptoms of depression. Subsequent published reviews on the comparisons between BZD and AD, and/or placebo in the treatment of depression [7‒9] did not yield clear indications.
The aim of this paper was to provide a systematic review of the use of BZD in depressive disorders, including only studies involving a monotherapy compared with AD and/or placebo. Particular attention was paid to anxious depression, since it is a highly prevalent condition for which AD are likely to yield limited responses.
PRISMA guidelines  were used to conduct a systematic review of the literature to identify randomized controlled trials (RCT) comparing BZD and AD, and/or placebo for the treatment of depression and anxious depression. Keywords were: “depression,” “benzodiazepine,” and “antidepressant.” From each of these words, a more extensive list of terms was obtained, including: “alprazolam,” “adinazolam,” “bromazepam,” “clonazepam,” “chlordiazepoxide,” “diazepam,” “lorazepam,” “major depressive disorder,” “dysthymic disorder,” “minor depression,” “neurotic depression,” “reactive depression,” “endogenous depression,” “melancholic depression,” “catatonic depression,” “atypical depression,” “seasonal depression,” “postpartum depression,” “psychotic depression,” “anxious depression,” “mixed anxiety depression,” “TCA,” “tricyclic antidepressant,” “amitriptyline,” “clomipramine,” “desipramine,” “dosulepin,” “doxepin,” “imipramine,” “nortriptyline,” “SSRI,” “serotonin reuptake inhibitor,” “citalopram,” “escitalopram,” “fluoxetine,” “fluvoxamine,” “paroxetine,” and “sertraline.” Only published trials in the English language were considered for inclusion. Electronic databases included PsycINFO, PsycARTICLES, CINAHL, the Cochrane Library, MEDLINE, Web of Science, KCI-Korean Journal Database and SciELO citation index, from the inception of each database up to April 2017. Reference lists from relevant studies and reviews were examined for further clinical trials not yet identified. Authors of significant articles and other experts in the field were contacted.
The search and the rating of target responses were carried out independently by 2 investigators (G.B. and E.O.); disagreements were resolved by consensus by these primary raters and a senior investigator (G.A.F.). We selected English reports of RCT in which BZD were compared to AD and/or placebo in the treatment of adult patients with a primary diagnosis of depressive disorder or anxious depression.
We excluded (a) nonrandomized controlled trials, (b) studies conducted with nonclinical samples, (c) those including patients < 18 years of age, and (d) those in which symptoms of depression and anxiety occurred in the context of primary diagnoses other than unipolar depressive disorders. We further excluded trials in which (e) BZD were not directly compared with AD or placebo, (f) BZD, AD, or placebo were administered in combination with other treatments, (g) patients affected by bipolar depression, anxiety disorders, or other conditions were included without a separate analysis of the results, (h) no data about age were reported, and (i) the antidepressant effect of BZD was not assessed.
The primary outcome measure was response rate, according to standardized instruments and the clinician’s global evaluation. Secondary outcome measures were the dropout rate and the occurrence of adverse events during treatment.
Data were independently extracted by both reviewers with the use of a precoded form. The following data were extracted from studies meeting the criteria for inclusion in the systematic review: study design; demographics (age, gender distribution, and size of the sample); the primary diagnosis and other features (severity, clinical subgroups, and setting); the methods used to define response to treatment and adverse events; the type, duration, and dose of pharmacological treatment; group comparisons and the number of patients randomized to each treatment arm; response rate and dropout rate (and reasons); the frequency and severity of adverse events; and details about compliance, length of follow-up, and treatment discontinuation.
The methodological quality of the included trials was assessed on 2 levels. At the study level, we considered the presence of random allocation of treatments and blinding of outcome assessment; at the outcome level, we considered the presence of intent-to-treat (ITT) analysis. Definitions of withdrawal, postwithdrawal, and rebound symptomatology were based on the criteria developed by Chouinard and Chouinard.
When feasible, data were subjected to a meta-analysis. The primary outcome of the meta-analysis was treatment efficacy, expressed as response rate. Therefore, the risk ratio (RR) of response and its standard error (SE) were calculated from each eligible study. Examination of the pooled results was performed based on the random-effects model to increase the generalizability of findings, since this model is more conservative than the fixed-effects model. An α level of 0.05 was used for hypothesis tests.
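The computation described above can be illustrated with a minimal sketch. It assumes that the SE is taken on the log scale and that the random-effects model uses the DerSimonian-Laird estimator; the paper states neither detail, and the analyses were actually run in Stata. The function names and the trial counts are hypothetical and serve only as an illustration.

```python
import math

def log_rr_and_se(resp_a, n_a, resp_b, n_b):
    """Log risk ratio of response (arm A vs. arm B) and its standard error.

    Assumes no zero cells; a continuity correction would be needed otherwise.
    """
    rr = (resp_a / n_a) / (resp_b / n_b)
    se = math.sqrt(1 / resp_a - 1 / n_a + 1 / resp_b - 1 / n_b)
    return math.log(rr), se

def pool_random_effects(log_rrs, ses):
    """Pool log risk ratios with DerSimonian-Laird weights (an assumption)."""
    w = [1 / s**2 for s in ses]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_rrs)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_rrs))
    df = len(log_rrs) - 1
    c = sw - sum(wi**2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    w_star = [1 / (s**2 + tau2) for s in ses]
    pooled = sum(wi * yi for wi, yi in zip(w_star, log_rrs)) / sum(w_star)
    se_pooled = math.sqrt(1 / sum(w_star))
    z = pooled / se_pooled
    return {
        "rr": math.exp(pooled),
        "ci_low": math.exp(pooled - 1.96 * se_pooled),
        "ci_high": math.exp(pooled + 1.96 * se_pooled),
        "p": math.erfc(abs(z) / math.sqrt(2)),  # two-sided normal p value
        "tau2": tau2,
    }

# Hypothetical (responders, randomized) counts per arm, NOT data from the review.
trials = [(30, 50, 20, 50), (25, 40, 18, 42), (40, 80, 30, 78)]
effects = [log_rr_and_se(*t) for t in trials]
result = pool_random_effects([e[0] for e in effects], [e[1] for e in effects])
print(result)
```

When the estimated between-study variance is zero, the random-effects weights reduce to the fixed-effects weights, so the two models give the same pooled RR.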
In addition to point estimates and confidence intervals, the Q statistic was computed to assess heterogeneity between study results. This statistic tests the null hypothesis that the effect sizes from the individual studies are similar enough for a common population effect size to be calculated. However, the Q statistic only informs about the presence versus the absence of heterogeneity; it does not report on the extent of such heterogeneity. The I² statistic, which expresses heterogeneity as a percentage, was therefore also calculated. A value of 0% indicates no observed heterogeneity, and larger values show increasing heterogeneity: 25% = low, 50% = moderate, and 75% = high heterogeneity.
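As an illustration of how these two heterogeneity measures relate, the sketch below computes Cochran's Q from study-level effects and derives I² from Q and its degrees of freedom using the usual definition, I² = max(0, (Q − df)/Q) × 100. The function names are hypothetical. The p value for Q (a chi-square tail probability) is omitted so that the example needs only the standard library.

```python
def cochran_q(effects, ses):
    """Cochran's Q: weighted squared deviations from the inverse-variance mean."""
    w = [1 / s**2 for s in ses]
    pooled = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    return sum(wi * (yi - pooled) ** 2 for wi, yi in zip(w, effects))

def i_squared(q, df):
    """I-squared in percent; truncated at 0 when Q does not exceed df."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0

# Q and df as reported in this review for the two pooled comparisons.
print(i_squared(0.442, 7))    # BZD versus placebo
print(i_squared(4.772, 19))   # BZD versus TCA
# Hypothetical values, for contrast with the reported ones.
print(i_squared(20.0, 10))
```

Because both reported Q values are smaller than their degrees of freedom, the truncation at zero applies, which is consistent with the I² of 0% reported for both comparisons.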
The likelihood of significant publication bias was assessed by visual inspection of the Begg funnel plot, and asymmetry was tested using the Egger test statistic. Sensitivity analyses were implemented in order to estimate the influence of each study, by deleting each in turn from the analysis and noting the degree to which the size and significance of the treatment effect changed. Meta-regression was performed to investigate how certain characteristics (i.e., treatment duration, the presence of comorbid anxiety symptoms, and ITT) influenced the treatment effects. Finally, clinical heterogeneity between studies was explored by performing subgroup analyses based on the presence of comorbid anxiety symptoms (i.e., anxious vs. nonanxious depression). All analyses were conducted by J.G. using the user-written packages for meta-analysis available in Stata v10.1 (Stata Corp., College Station, TX, USA).
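The leave-one-out sensitivity analysis described above can be sketched as follows. This is an illustration only: it assumes DerSimonian-Laird pooling on the log RR scale, which the paper does not specify, and it uses hypothetical function names and hypothetical effect sizes rather than data from the included trials.

```python
import math

def dl_pool(y, se):
    """DerSimonian-Laird pooled log effect and its standard error."""
    w = [1 / s**2 for s in se]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    c = sw - sum(wi**2 for wi in w) / sw
    tau2 = max(0.0, (q - (len(y) - 1)) / c) if c > 0 else 0.0
    ws = [1 / (s**2 + tau2) for s in se]
    pooled = sum(wi * yi for wi, yi in zip(ws, y)) / sum(ws)
    return pooled, math.sqrt(1 / sum(ws))

def leave_one_out(y, se):
    """Re-pool the data k times, each time omitting one study."""
    rows = []
    for i in range(len(y)):
        pooled, pooled_se = dl_pool(y[:i] + y[i + 1:], se[:i] + se[i + 1:])
        rows.append((
            i,                                     # index of the omitted study
            math.exp(pooled),                      # pooled RR without it
            math.exp(pooled - 1.96 * pooled_se),   # lower 95% limit
            math.exp(pooled + 1.96 * pooled_se),   # upper 95% limit
        ))
    return rows

# Hypothetical log RRs and standard errors, NOT data from the review.
log_rrs = [0.2, 0.5, -0.1, 0.3]
ses = [0.3, 0.4, 0.35, 0.5]
for omitted, rr, low, high in leave_one_out(log_rrs, ses):
    print(f"without study {omitted}: RR = {rr:.3f} (95% CI {low:.3f}-{high:.3f})")
```

A study would be flagged as influential if omitting it moved the pooled RR markedly or shifted the confidence interval across 1.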
Characteristics of the Included Studies
The initial search identified 10,355 reports (online suppl. Fig. 1; for all online suppl. material, see www.karger.com/doi/10.1159/000486696). After adjusting for duplicates, the abstracts and titles of 2,190 reports were screened for a first evaluation. Of these, 2,120 did not meet the inclusion criteria and were excluded. Main reasons for exclusion were: language (i.e., other than English), sample characteristics (i.e., animals, children, adolescents, or a nonclinical population), type of document (i.e., comment, case report, pilot study, review, meta-analysis), another primary diagnosis (e.g., bipolar disorder, anxiety disorder, psychotic disorder, and substance abuse), and the experimental condition (e.g., a comparison with antipsychotics or a combination with other drugs, ECT, or psychotherapy). The full texts of the remaining 70 studies were then analyzed, with the additional elimination of 32 studies. Eight of these concerned subjects < 18 years of age or did not report any information about age, 7 compared BZD with ECT, psychotherapy, drug combinations or nonantidepressant medications, 1 did not evaluate the antidepressant effects of BZD, 3 were renamed copies of the same study and did not satisfy the blinded randomization criterion, 2 reported data on a subsample of subjects that were included in the main study, and 11 included subjects with anxiety or bipolar disorders in the total sample without a separate analysis of results.
Therefore, a total of 38 studies met the criteria for inclusion. Three compared BZD and placebo, 13 compared BZD and tricyclic antidepressants (TCA) and placebo (3 studies involved 2 BZD), and 21 compared BZD and TCA for the treatment of depression with or without comorbid anxiety. Only 1 study concerned newer AD. All these studies will be summarized and critically reviewed.
For the meta-analysis, we submitted data on response rate from 22 RCT involving BZD versus placebo (8 comparisons) and BZD versus TCA (20 comparisons) as separate analyses.
Systematic Review of RCT on the Efficacy of BZD in the Treatment of Depressive Disorder
BZD versus Placebo
Anxious Depression. One study found no significant differences between chlordiazepoxide and placebo in mixed anxiety/depression  (online suppl. Table 1).
Nonanxious Depression. Two reports showed adinazolam to be associated with a rapid onset of action and significantly greater efficacy than placebo in the treatment of DSM-III major depressive disorder [19, 20] (online suppl. Table 1). Smith and Glaudin  reported no medical events after abrupt or tapered discontinuation of adinazolam, and Cohn et al.  observed no seizures after gradual tapering of adinazolam.
BZD versus TCA and Placebo
Anxious Depression. Nine reports compared BZD with TCA and placebo in the treatment of mixed anxiety/depressive states [21‒23], reactive or neurotic depression , major depressive disorder with symptoms of anxiety [25‒28], and ICD-9 neurotic depression or affective psychosis  (online suppl. Table 1).
Regarding efficacy, in 6 studies, all active treatments were significantly more effective than placebo, with no significant differences between BZD and TCA [21, 24‒26, 28, 29]. Only alprazolam, and not amitriptyline, differed significantly from placebo in the study by Imlah , and the difference between active treatments and placebo did not reach a significant level in Rickels et al. . In 3 of these studies, BZD showed a faster onset of action than TCA [24, 25, 29]. Results in favor of BZD were found in 1 study , in which bromazepam was associated with a faster onset of action and a significantly greater improvement than amitriptyline. Lastly, in 2 of the studies, response to treatment was significantly greater with TCA, with no differences between chlordiazepoxide and placebo [22, 27].
Dropout rates were significantly higher among patients treated with placebo, presumably due to ineffectiveness [21, 26] and/or deterioration [23, 24, 27], and in those treated with TCA due to the side effects [21, 23, 26, 27]. No significant differences in attrition rates were observed between BZD and TCA and placebo in 2 studies [22, 25]. Adverse reactions, such as anticholinergic symptoms, agitation, and tachycardia were more frequently reported for TCA [21, 24‒28], while BZD generally resulted in greater drowsiness [27, 28]. However, in Rickels et al. [21, 26], sedation and drowsiness occurred significantly more frequently with all active treatments compared with placebo. In Shammas , the proportion of patients experiencing side effects was similar with bromazepam and amitriptyline, and was significantly higher for both medications compared to placebo. No withdrawal reactions or nightmares were observed by Shammas  and Laakman et al. .
Nonanxious Depression. Four reports compared BZD with TCA and placebo in patients suffering from DSM-III major depressive disorder [30, 31], with melancholic characteristics in 2 cases [32, 33] (online suppl. Table 1).
As for the response to treatment, in 1 of these studies, alprazolam and imipramine were significantly more effective than placebo and produced a comparable effect . In 2 of them, TCA yielded a significantly greater improvement than both BZD and placebo [31, 33]. In the fourth study, both alprazolam and imipramine were significantly more effective than diazepam and placebo . Most of these studies reported a faster action with BZD than with TCA [30‒32].
Dropouts due to ineffectiveness were significantly more frequently associated with placebo [30‒32] whereas those due to side effects were significantly more prevalent among patients treated with TCA [30, 31], except for 1 study . Two investigations provided follow-up data 1 week after the end of treatment, during which the dosage was decreased to 50% for 3 days and then to zero for 4 more days. Rickels et al.  reported a slight worsening only in the alprazolam- and diazepam-treated groups, but no changes with imipramine and placebo. Rickels et al.  documented that only adinazolam resulted in a rebound of symptoms.
BZD versus TCA
Anxious Depression. Seven reports directly compared BZD and TCA in patients suffering from mixed anxiety/depressive states [34‒37], neurotic depression [38, 39], and DSM-III major depressive disorder with anxiety  (online suppl. Table 1).
As for adverse reactions, a significantly greater number of side effects was reported in patients treated with TCA by Singh et al. , whereas a comparable likelihood of reporting side effects was observed with both BZD and TCA in 3 other studies [35, 37, 39].
Nonanxious Depression. Fourteen reports compared BZD and TCA in the treatment of depression [41, 42], major depressive disorder [43‒47], and serious forms of endogenous depression [48‒53] including 1 with melancholic characteristics  (online suppl. Table 1).
Regarding response to treatment, 9 of these studies found an overall improvement, with no significant differences across medication classes [41‒43, 45, 46, 49, 51, 52, 54], even in the setting of melancholia , in both outpatients and inpatients . Results significantly favoring the use of TCA were found in 4 studies [44, 47, 48, 50]. However, in Rush et al. , alprazolam showed a more rapid action than TCA, with a significantly higher percentage of alprazolam-treated patients considered as being in remission after 1 week of treatment. Eriksson et al.  found that amitriptyline was significantly superior to alprazolam only in a subgroup of patients suffering from depression with psychomotor retardation, a low level of anxiety, and an absence of precipitating factors, but they found no differences between treatments in the other cases. Finally, in Ansseau et al. , amitriptyline was significantly more effective than diazepam; adinazolam showed an intermediate position between amitriptyline and diazepam, with amitriptyline being significantly superior to adinazolam only on the endogenomorphy subscale.
Dropout rates were estimated to be approximately similar for both medication classes in 4 studies [43, 45, 47, 52]. However, in Hubain et al. , a significantly greater number of alprazolam-treated patients dropped out of the study (due to ineffectiveness) than amitriptyline-treated patients. The dropout rate was significantly greater with TCA in 2 studies [46, 51]; in Ansseau et al. , it was significantly more common with diazepam than with amitriptyline due to ineffectiveness after 2 weeks. Adverse reactions, such as anticholinergic symptoms, were significantly more frequent with TCA in 5 studies [42, 44‒46, 53], but no differences were observed between TCA and BZD in 3 other reports [41, 47, 52].
BZD versus Newer AD
Laws et al.  compared lorazepam to fluvoxamine in the treatment of 112 subjects affected by a mixed state of anxiety and depression (online suppl. Table 1). They observed no differences between the 2 medications. Compliance and adverse events were similar for both drugs, even though nausea and vomiting were significantly more frequently observed among fluvoxamine-treated patients.
Meta-Analysis on the Efficacy of BZD in the Treatment of Depressive Disorder
Twenty-two reports contributed data for the meta-analysis. Participants were assigned at random to the conditions in all studies, and all were double-blind trials. ITT analyses were performed in 8 reports [19, 20, 23, 28, 30, 31, 42, 53], completers’ data only were reported in 12 studies [25, 27, 29, 37, 39, 41, 44, 45, 47, 49, 50, 54], and handling of attrition was not reported in 2 studies [46, 48].
BZD versus Placebo
Eight studies contributed data for the meta-analysis of response rate comparing BZD and placebo in the treatment of depressive disorder with or without comorbid anxiety [19, 20, 25, 27‒31]. They reported response rates for a total of 1,302 participants (723 patients in the BZD treatment arm and 579 in the placebo arm). Participants averaged 40.1 (SD 3.0) years of age, and 64.9% (range 45.9–75%) were female.
The data did not show significant differences in the response rate between BZD and placebo (Fig. 1). The pooled RR for response was 1.878 (95% CI 0.574–6.145; p = 0.297) in the random-effects model. Heterogeneity across trials was not statistically significant (Q = 0.442; df = 7; p = 1.000). The I² statistic also indicated no significant heterogeneity (0%) among the pooled studies. Visual inspection of the Begg funnel plot and the Egger test (p = 0.886) did not suggest the presence of publication bias. A sensitivity analysis was performed to determine the contribution of each study to the overall effect size, and no study appeared to markedly influence the observed RR for response.
Performing meta-regression analysis, we did not find any significant effect of the abovementioned characteristics (i.e., treatment duration, the presence of comorbid anxiety symptoms, and ITT) on the response rate among the included studies.
Subgroup analyses were also performed, considering anxious depression and nonanxious depression separately. No significant differences were detected between BZD and placebo as to response rate in either anxious depression (RR = 1.818, 95% CI 0.339–9.758, p = 0.486, based on 4 studies) or nonanxious depression (RR = 1.940, 95% CI 0.364–10.328, p = 0.437, based on 4 studies).
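As a reading aid, the reported p value can be checked against the reported confidence interval. The sketch below assumes that the interval is a symmetric normal-approximation (Wald) interval on the log RR scale, which is the usual convention but is not stated in the paper; the function name is hypothetical.

```python
import math

def se_and_p_from_ci(rr, ci_low, ci_high, z_crit=1.96):
    """Recover SE(log RR), z, and the two-sided p value from a 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(rr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return se, z, p

# Pooled estimate reported above for BZD versus placebo.
se, z, p = se_and_p_from_ci(1.878, 0.574, 6.145)
print(f"SE(log RR) = {se:.3f}, z = {z:.3f}, p = {p:.3f}")
```

Under that assumption, the recovered p value should come out close to the reported p = 0.297, which indicates that the point estimate, interval, and p value are mutually consistent. The width of the interval, rather than the size of the point estimate, is what drives the nonsignificant result.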
BZD versus TCA
Twenty studies contributed data for the meta-analysis of response rate comparing BZD and TCA in the treatment of depressive disorder with or without comorbid anxiety [23, 25, 27‒31, 37, 39, 41, 42, 44‒50, 53, 54]. They reported response rates for a total of 2,118 participants (1,146 patients in the BZD treatment arm and 972 in the TCA arm). Participants averaged 40.2 (SD 5.05) years of age, and 67.6% (range 43.3–86.4%) were female.
The pooled RR for response was 0.809 (95% CI 0.514–1.273; p = 0.359) in the random-effects model, suggesting no significant differences in response to treatment with BZD and TCA (Fig. 2). Heterogeneity across trials was not statistically significant (Q = 4.772; df = 19; p = 1.000). The I² statistic also indicated no significant heterogeneity (0%) among the pooled studies. Visual inspection of the Begg funnel plot and the Egger test (p = 0.379) did not suggest the presence of publication bias. A sensitivity analysis was performed to determine the contribution of each study to the overall effect size, and no study appeared to markedly influence the observed RR for response.
Performing meta-regression analyses, we did not find any significant effect of the abovementioned characteristics (i.e., treatment duration, the presence of comorbid anxiety symptoms, and ITT) on response rate among the included studies.
Subgroup analyses were performed, considering anxious depression and nonanxious depression separately. We did not find significant differences in the response rate between BZD and TCA in the treatment of either anxious depression (RR = 1.172, 95% CI 0.482–2.848, p = 0.727, based on 7 studies) or nonanxious depression (RR = 0.709, 95% CI 0.418–1.203, p = 0.203, based on 13 studies).
Our systematic review included 38 RCT examining the efficacy of BZD in the treatment of depression. None of these studies was conducted after the 1990s and most of them compared BZD to TCA and/or placebo, with only 1 involving newer AD. Analysis of individual studies disclosed that, in more than half of the studies comparing BZD to TCA and/or placebo, BZD were found to be significantly more effective than placebo and as effective as TCA. In 11 studies, TCA were better than BZD, while BZD were better than TCA in only 1. In 12 of these studies, BZD were associated with a faster onset of action than TCA [23‒25, 29‒32, 38, 43, 45, 50, 51]. Dropouts due to ineffectiveness were more frequently observed in the placebo group, while in the TCA group they were more likely to be associated with adverse reactions. Furthermore, side effects occurred more frequently with TCA, while BZD induced more drowsiness and cognitive impairment.
Only the studies that were deemed to share certain clinical characteristics a priori  were included in the meta-analysis, the main findings of which were lack of significant differences as to response rate between BZD and placebo as well as between BZD and TCA. Even though most of the studies recruited depressed patients with anxiety symptoms, an attempt was made to analyze clinical subgroups according to the presence of comorbid anxiety (i.e., anxious depression) separately. These subgroup analyses confirmed the previous result of a lack of significant differences in response rate between BZD and placebo and BZD and TCA in both anxious and nonanxious depression.
The interpretation of these findings is hampered by a number of methodological problems: small sample sizes, the presence of a wide variability across studies regarding treatment duration (ranging from 3 to 12 weeks), diagnosis definition (interestingly, it is only in DSM-5 that an “anxious distress” specifier was added to the diagnosis of depressive disorder), inclusion criteria, methods used to evaluate response to treatment, information about severity, adverse reactions or dropouts, and the use of ITT analysis. Furthermore, very little information was available about the long-term antidepressant efficacy of BZD and TCA. Two studies [31, 32] with a follow-up 1 week after the treatment reported a worsening with BZD but not with TCA. Laakman et al.  observed a comparable worsening in all the BZD, TCA, and placebo groups after 2, 4, and 6 weeks of follow-up.
The only study comparing BZD and newer AD  did not find significant differences in response to treatment between lorazepam and fluvoxamine in the treatment of mixed anxiety/depression, with a better tolerability of BZD. These findings should be considered with caution, since subjects who dropped out during the first 2 weeks of treatment were not included in the efficacy analysis and a placebo control group was lacking. It is indeed difficult to draw firm conclusions about the comparative efficacy of BZD and TCA (or newer AD) when studies do not include a placebo control group.
Finally, our systematic review relied on published findings only. This could explain the paucity of trials on the comparison between BZD and newer AD, since the financial interests in the latter may have led to a selective publication of results about their efficacy [57, 58].
This systematic analysis of the literature, even considering these limitations, has shown BZD to be as effective as AD, with a faster action and better tolerability, in the short-term treatment of depression, particularly with comorbid anxiety symptoms and regardless of the severity of the major depressive disorder. A secondary analysis  of data, obtained by Rickels et al. , reported many significant interactions between drug and patient subgroups within a 4-week period, e.g., a lack of significant differences between treatment groups in cases of low levels of anxiety and depression, greater efficacy of chlordiazepoxide and amitriptyline than placebo in cases of high levels of anxiety and depression, greater efficacy of amitriptyline than chlordiazepoxide and placebo in cases of high levels of depression and low levels of anxiety, and greater efficacy of chlordiazepoxide than amitriptyline and placebo in cases of high levels of anxiety and low levels of depression.
These findings have important clinical implications for the treatment of anxious depression. About half of outpatients with major depressive disorder also have clinically meaningful anxiety, measured either by a dimensional approach or with co-occurring anxiety disorder [11, 60]. Such characterization should be differentiated from mixed anxiety-depression , which is concerned with subsyndromal symptoms that do not reach the threshold diagnosis of major depressive disorder, generalized anxiety disorder, or any other full-syndrome disorder . Anxious depression tends to display a worse response to AD than other forms of depression , and AD are not more effective than placebo in the treatment of minor or subthreshold depression [62, 63]. Nevertheless, AD, especially SSRI and SNRI, are considered to be a primary choice for treating anxious depression [2‒4] as well as anxiety disorders, despite the fact that direct comparisons indicate that BZD are more effective and tolerable than AD for the latter [64‒66]. Dosages of BZD used in trials concerned with depression were comparable to those employed in the treatment of anxiety [64‒66].
The findings of our meta-analysis confirm the low response of anxious depression to psychotropic drug treatment, whether AD or BZD, which failed to display any significant differences when compared to placebo. At the same time, however, there was no evidence to support the superiority of AD over BZD in anxious depression. BZD are nonetheless portrayed as having a marginal role in the treatment of depression, because of concerns about toxicity, abuse, and dependence. Such risks, usually associated with a prolonged use of BZD, have been widely emphasized in the literature, even though the evidence does not support these concerns [67‒69]. The risk of abuse of BZD is actually rather low considering the number of people who use these compounds [69, 70]. Withdrawal reactions may follow after the discontinuation of both BZD and AD [13, 71, 72]. Furthermore, while BZD have been related to greater sedation, anterograde amnesia, and cognitive or psychomotor impairment [64, 69], long-term treatment with SSRI and SNRI has been linked to serious side effects such as gastrointestinal symptoms, weight gain and metabolic abnormalities, cardiovascular disturbances, sexual dysfunction, osteoporosis and the risk of fractures, bleeding, and behavioral toxicity [73‒75]. Finally, the initiation of treatment with AD may be hampered by symptoms such as jitteriness, especially in anxious patients.
BZD may thus be considered in the treatment of anxious depression, with attention to compounds that are less likely to induce dependence, such as clonazepam . The temporary use of BZD may attenuate a patient’s anxious and depressive symptoms, and, if associated with psychotherapeutic management (the application of psychological understanding in the management of the patient, which includes helping them to identify and deal with current life situations, without the use of formal psychotherapy), may improve the overall clinical picture. The results do not provide evidence (in terms of efficacy, side effects, or dependence) to support the clinical stance that AD should be preferred in the range of mild severity. There is a pressing need for RCT of adequate methodological quality and follow-up  that compare BZD, second-generation AD, and placebo in the treatment of anxious depression.
No conflicts of interest were reported.