Recent years have seen major developments in psychotherapy research that suggest the need to address critical methodological issues. These recommendations, developed by an international group of researchers, do not replace those for randomized controlled trials, but rather supplement strategies that need to be taken into account when considering psychological treatments. The limitations of traditional taxonomy and assessment methods are outlined, with suggestions for consideration of staging methods. Active psychotherapy control groups are recommended, and adaptive and dismantling study designs offer important opportunities. The treatments that are used, and particularly their specific ingredients, need to be described in detail for both the experimental and the control groups. Assessment should be performed blind before and after treatment and at long-term follow-up. A combination of observer- and self-rated measures is recommended. Side effects of psychotherapy should be evaluated using appropriate methods. Finally, the number of participants who deteriorate after treatment should be noted according to the methods that were used to define response or remission.
Research in psychotherapy is crucial to test and provide support for psychological treatments which may improve public health and reduce the burden of mental illness and behavioral problems. Psychotherapeutic strategies may also have the potential to improve psychological coping, quality of life, illness behavior, and affective components of medical illness. A clinical response after treatment is not synonymous with an effect that can be attributed to psychotherapy. The latter can only be accurately estimated with reference to an appropriate control group. Randomized controlled trials (RCT) play the most important role in this process, even though other forms of investigation may also yield valuable information [3-9]. The methodology of trials of psychological interventions has been discussed in the literature [3, 5, 6, 8, 9-13]. In Psychotherapy and Psychosomatics, specific reference to the selection and design of control conditions has been made. However, in recent years there have been important developments that suggest the need to address critical methodological issues. These recommendations do not replace those that are operational for RCT, with particular reference to the CONSORT statement for nonpharmacological treatments, but rather supplement them with issues that need to be taken into account for an updated consideration of clinical trials concerned with the efficacy of psychological interventions. We will not discuss designs that aim to identify outcome predictors or process variables, nor qualitative studies.
Diagnostic criteria have been developed in psychiatry and clinical psychology to improve the interrater reliability of diagnostic assessments. These criteria are particularly helpful in setting a threshold for conditions worthy of clinical attention and, not surprisingly, clinical studies concerned with psychological interventions and/or pharmacotherapy are generally based on DSM nomenclature. However, questions may be raised about the effectiveness of diagnostic criteria in yielding conditions that are relatively homogeneous in terms of severity and course of illness and thus amenable to testing by RCT. As an addition to diagnostic categorical approaches, network analysis may yield valuable insights into changes in patterns of symptoms and signs in psychotherapy trials [17, 18].
The standard RCT design in medicine is still based on the acute disease model and ideally evaluates therapeutic effects in untreated patients who have a recent acute onset of their disorders. This is in sharp contrast with the fact that previous treatments may have actually modified the course and responsiveness of the individual patient [19-22]. Any type of treatment, such as long-term use of psychotropic medications, may increase the risk of experiencing additional health problems that do not necessarily subside with discontinuation of the medication and that modify responsiveness to subsequent treatments. The term “iatrogenic comorbidity” refers to unfavorable modifications in the course, characteristics, and responsiveness of an illness that may be related to treatments previously administered. Such vulnerabilities may occur during treatment administration (whether pharmacotherapy or psychotherapy) and/or manifest themselves after its discontinuation. The changes can be persistent and not limited to a short phase, such as in the case of persistent postwithdrawal disorders after discontinuation of psychotropic medications, and cannot be subsumed under the generic rubrics of adverse events or side effects. Patients are generally included in a trial irrespective of previous treatments (the so-called “nowhere patients”), even though these features may affect its outcome. Meta-analyses of these groups of patients may amplify the heterogeneous nature of the clinical populations, particularly if the randomization process does not take these variables into account.
Staging offers important opportunities to incorporate a patient’s history into the assignment to randomization. Staging methods have been developed for unipolar depression, bipolar disorder, panic disorder, schizophrenia, eating disorders, and alcohol use disorders [25, 26]. Staging differs from conventional diagnostic practice in that it defines not only the extent of progression of a disorder at a particular point in time but also where a person currently stands along the continuum of the course of illness, including the previous response to treatment. Staging has an important role in planning psychotherapeutic interventions in patients with mental disorders, particularly in the sequential model where psychotherapeutic intervention is applied to the residual phase after pharmacotherapy of depression [28, 29]. Further, staging allows classification of previous responses to treatment (including psychotherapy), such as resistance or loss of a clinical effect [26, 30].
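One way in which staging could enter the randomization process is as a stratification factor. The following is a minimal sketch, assuming permuted blocks within each stage; the function name, the arm labels, and the integer stage codes are hypothetical and are not taken from any of the staging systems cited above.

```python
import random
from collections import defaultdict


def stratified_block_randomize(patients, arms=("experimental", "control"), seed=0):
    """Allocate patients to arms using permuted blocks within each stage.

    patients: iterable of (patient_id, stage) pairs; the stage codes are
              placeholders for whatever staging system a trial adopts.
    Returns a dict mapping patient_id -> arm.
    """
    rng = random.Random(seed)
    pending = defaultdict(list)  # stage -> arms left in the current block
    allocation = {}
    for patient_id, stage in patients:
        if not pending[stage]:
            # Start a new block for this stage: one slot per arm, shuffled.
            block = list(arms)
            rng.shuffle(block)
            pending[stage] = block
        allocation[patient_id] = pending[stage].pop()
    return allocation


patients = [("p1", 2), ("p2", 2), ("p3", 3), ("p4", 2), ("p5", 3), ("p6", 2)]
allocation = stratified_block_randomize(patients)
```

Because every block contains each arm once, the arms stay balanced within each stage whenever a block is complete, so that patients with a history of treatment resistance, for example, would not cluster in one arm by chance.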
In addition to the inclusion criteria, other important points for the validity of a psychotherapy study are how the recruitment of patients takes place and how representative the sample examined is.
Study Design and Choice of the Control Group
Parallel group designs are the most widely used modalities of investigation of psychological interventions. Since the effects of a psychotherapeutic approach cannot be withdrawn or switched to attention-placebo, crossover designs entail major methodological problems. The design of a control group in a parallel controlled study very much depends on the purposes of the study. The most frequently used models are as follows.
Comparison between Experimental and Control Psychotherapy
This is the classic model of RCT in psychological interventions. In psychotherapy research, the no-treatment control condition was soon found to be clearly inadequate, since it does not incorporate any of the ingredients that are subsumed under the placebo effect. The waiting list control (WLC) does not provide most of the variables which occur within a psychotherapeutic process, such as establishing a therapeutic relationship or encouragement. The WLC may be followed by the experimental treatment after a certain time lag (the crossover variant). Another option is the minimal attention control group, which offers the people on the WLC a minimum of contact with the researcher and/or monitoring of symptoms and well-being. Even though it may provide a certain control for the natural course of the condition and for patients’ expectations, the WLC may overestimate the true effects of an intervention and is likely to be inadequate in view of the previous treatments that may have been experienced. Further, the illness may change during the waiting period. The WLC may thus be suitable for preliminary testing of an experimental procedure and for pilot studies.
Comparison between Experimental Psychotherapy and Treatment as Usual
Routine interventions provided by clinicians in the settings from which participants are recruited provide another commonly used form of control. This type of control does not establish whether any significant difference yielded by the addition of psychotherapy to treatment as usual (TAU) or care as usual was actually due to specific treatment ingredients introduced by the experimental procedure or to nonspecific factors, such as attention and opportunity for disclosure [2, 33, 34]. Further, TAU is seldom manualized, monitored, or supervised and may be anything but usual. The value of TAU lies in providing a demonstration of the clinical value of an approach once its specific features have been demonstrated against other forms of control groups. Considering the growing role of psychotropic medications in the management of patients, iatrogenic comorbidity is likely to considerably affect the outcomes of TAU in both experimental and control conditions.
Comparison between Experimental Psychotherapy and Nonspecific Treatment Component Control Group
An experimental form of psychological intervention may be compared to an attention placebo group. Placebo effects are often attributed to clinical interactions and contextual factors that affect the expectations of the patient about the treatment and result in symptom changes. There are many forms of attention placebo control groups. In “clinical management” (CM), a control group receives the same amount of time and attention from a professional figure as the experimental group, but without any specific interventions such as exposure, a structured diary, or cognitive restructuring. It applies psychological understanding to the management of an individual patient, identifying current problems and providing opportunities for disclosure. It may be associated with medication monitoring [37, 38], but its primary focus may also be unrelated to pharmacological treatment [39-42]. Since CM provides the nonspecific ingredients of the psychotherapeutic approach, significant differences between an experimental treatment and CM are likely to reflect specific ingredients entailed by the experimental approach, unlike what takes place with TAU or WLC. CM should be differentiated from nonspecific factor component control, where patients are informed only about treatments available or self-help options and psychotherapeutic management is missing.
Attention placebo control groups may suffer from low acceptance by participants, resulting in unbalanced and/or heightened dropout rates.
Another option is to submit the control group to a treatment that was found to be devoid of therapeutic effectiveness in other studies. For instance, in the London-Toronto study, exposure treatment was compared to relaxation, a treatment modality that was found to be ineffective in panic disorder and could thus act as a psychological placebo.
Comparison between Experimental Psychotherapy and Other Treatments
Another modality is to compare a psychotherapeutic approach to another treatment that was found to be effective in other studies, whether pharmacotherapy or psychotherapy. The problem with this type of comparison is that we do not know whether, in that specific situation, the gold standard (whether pharmacotherapy or psychotherapy) would be superior to placebo or not. Only a 3- or 4-arm design including pharmacological and/or psychological placebo could solve this problem.
The comparison of 2 active psychological treatments may lead to nonsignificant differences in statistical tests. Although the methods for noninferiority trials have been refined, these approaches may have a bias in favor of (erroneous) noninferiority conclusions.
An adaptive intervention is a multistage process that is based on the patient’s characteristics and intermediate outcomes collected during an intervention, such as the patient’s response after the first line of treatment. The Sequential Multiple Assignment Randomized Trial (SMART) involves multiple intervention stages; each stage corresponds to one of the critical decisions, and at each stage the participant is randomly reassigned to one of the intervention options. Examples are adaptive interventions that followed nonresponse to initial treatment (whether pharmacotherapy or psychotherapy or both) in mood [47-49] and anxiety disorders [50-52]. The conceptual assumption is that after testing a standard treatment in a group of patients we are left with a fairly homogeneous group characterized by resistance. Actually, nonresponse may have a very wide range of explanations (inadequate treatment in terms of indication, dosage, or duration; the occurrence of side effects prevailing over benefits; partial compliance; previous exposure to that specific treatment; psychosocial events and/or physical health problems intervening during the trial; problems in the patient-therapist relationship; and modifications in the patient’s lifestyle and illness behavior). Various forms of tolerance (e.g., resistance upon rechallenge with the same medication that yielded a response, and loss of clinical effect) and dependence may occur with pharmacological treatment and are likely to affect the treatment response [22, 31, 53]. Nonresponse thus encompasses very heterogeneous features. Such clinical events, however, are very seldom considered for inclusion of patients in a trial.
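The two-stage logic of a SMART can be illustrated with a short simulation. This is only a sketch under stated assumptions: the option labels ("switch", "augment", "continue"), the function names, and the use of simple (unblocked) randomization are hypothetical choices made for illustration, and the list of responses stands in for intermediate outcomes that in a real trial would be observed after the first stage.

```python
import random

# Hypothetical intervention options at each decision point.
STAGE1_OPTIONS = ("psychotherapy", "pharmacotherapy")
STAGE2_OPTIONS = ("switch", "augment")


def assign_stage1(rng):
    """First randomization: every participant receives a first-line option."""
    return rng.choice(STAGE1_OPTIONS)


def assign_stage2(responded, rng):
    """Second randomization: only nonresponders are reassigned at random."""
    if responded:
        return "continue"
    return rng.choice(STAGE2_OPTIONS)


def run_smart(responses, seed=0):
    """Return the (stage 1, stage 2) pathway for each participant.

    responses: list of booleans, True if the participant responded to the
               first-line treatment (placeholder for observed outcomes).
    """
    rng = random.Random(seed)
    pathways = []
    for responded in responses:
        first = assign_stage1(rng)
        second = assign_stage2(responded, rng)
        pathways.append((first, second))
    return pathways


responses = [True, False, False, True, False]
pathways = run_smart(responses)
```

Note that the sketch treats nonresponse as a single yes/no variable, which is exactly the simplification questioned above; a design that recorded the reason for nonresponse could use it to restrict or stratify the second randomization.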
Treatment outcome in medicine is the cumulative result of the interaction of several classes of variables with a selected treatment, i.e., living conditions, patient characteristics, self-management, illness characteristics, and previous treatments, as well as treatment setting. A psychotherapeutic approach is likely to include multiple treatment ingredients [10, 12, 14]. As a result, dismantling designs may determine whether a specific treatment component adds value to the treatment package. They include both studies in which full and partial treatment packages are compared and those in which a given component is added to an existing therapy. For instance, classic studies performed in the seventies and eighties in anxiety disorders demonstrated the redundant nature of relaxation and therapist-aided exposure compared to homework exposure in phobic disorders. Other examples of dismantling designs may involve the role of meditation in mindfulness-based cognitive therapy or the sequential use of 2 different psychotherapeutic strategies compared to a single psychotherapeutic technique. Dismantling designs may also assess the relative impact of including, or even targeting, family members in psychotherapy with children and adolescents. Alternative designs that could yield valuable insights may involve microtrials/case series designs. However, dismantling or specific-factor component control studies may present problems regarding sample size and statistical power.
Interaction of Experimental Psychotherapy with Pharmacotherapy
There are 4 models of interaction: (1) addition (the effect of 2 interventions combined equals the sum of their individual effects), (2) potentiation (the effect of 2 interventions combined is greater than the sum of their individual effects), (3) inhibition (the effect of 2 interventions combined is less than each individual effect), and (4) reciprocation (the effect of 2 interventions combined equals the individual effect of the more potent intervention). Even though most of the studies are compatible with the additive and reciprocal models of interaction, inhibitory effects can also occur [60-63]. Adequate study of the interaction between psychotherapy and pharmacotherapy requires the use of both an active medication and placebo. For instance, one of the most influential efficacy studies of antidepressant medications in adolescents, the Treatment for Adolescents with Depression Study (TADS), included a comparison between cognitive behavioral therapy alone, fluoxetine, placebo, and the combination of fluoxetine and cognitive behavioral therapy. There was not, however, a placebo associated with the cognitive behavioral therapy group, a decision that favored the pharmacotherapy/psychotherapy combination and excluded the possibility of detecting an inhibitory effect (with psychotherapy plus placebo being significantly superior to pharmacotherapy plus psychotherapy), as was found to be the case in the London-Toronto study where both alprazolam and placebo were investigated.
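The 4 models can be restated as simple comparisons between the combined effect and the 2 individual effects. The sketch below is illustrative only: it assumes positive effect estimates on a common scale (e.g., standardized mean differences versus placebo), compares point estimates with an arbitrary tolerance, and ignores the statistical testing that an actual trial would require. The function name, the tolerance value, and the "unclassified" label are assumptions introduced here.

```python
import math


def classify_interaction(effect_a, effect_b, effect_combined, tol=0.05):
    """Label the interaction model suggested by three effect estimates.

    effect_a, effect_b: effects of each intervention alone (assumed > 0).
    effect_combined:    effect of the 2 interventions given together.
    tol:                arbitrary absolute tolerance for "equals".
    """
    total = effect_a + effect_b
    stronger = max(effect_a, effect_b)
    weaker = min(effect_a, effect_b)

    if math.isclose(effect_combined, total, abs_tol=tol):
        return "addition"       # combined equals the sum
    if effect_combined > total:
        return "potentiation"   # combined exceeds the sum
    if math.isclose(effect_combined, stronger, abs_tol=tol):
        return "reciprocation"  # combined equals the more potent one
    if effect_combined < weaker:
        return "inhibition"     # combined is below each individual effect
    return "unclassified"       # falls between the 4 defined models


print(classify_interaction(0.4, 0.3, 0.7))   # addition
print(classify_interaction(0.4, 0.3, 0.9))   # potentiation
print(classify_interaction(0.4, 0.3, 0.2))   # inhibition
print(classify_interaction(0.4, 0.3, 0.4))   # reciprocation
```

Combined effects lying between the stronger individual effect and the sum, or between the 2 individual effects, do not match any of the 4 definitions and are returned as "unclassified" in this sketch.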
Description of Treatment Components
Psychological interventions are usually complex and involve several components, each of which may influence the estimated treatment effect [10, 11]. Regardless of the study design, all potential treatment ingredients that were found to yield effects in controlled trials should be detailed in the description of treatment packages in psychotherapy investigations, including the order in which therapeutic components are administered. Citing a published manual or adding a supplement containing details can increase the probability of attempts to replicate the findings.
Psychological assessment blind to treatment assignment has been a cornerstone of psychotherapy research. Just as assessors in double-blind placebo-controlled studies may recognize patients assigned to medication or placebo, single-blind studies of psychotherapy may present the same problem (e.g., the patient may mention the type of treatment he/she is receiving). It is also important to assess the patients in a trial not only before and after treatment but also at some time during follow-up in order to verify long-term outcomes. This is the most suitable strategy for differentiating psychopharmacological and psychotherapeutic approaches. The following issues deserve brief comment.
The term “clinimetrics” [30, 65, 66] indicates a domain concerned with the measurement of clinical issues that do not find room in customary clinical taxonomy. Such issues include the types, severity, and sequence of symptoms; the rate of progression in illness (staging); the severity of comorbidity; problems of functional capacity; reasons for medical decisions (e.g., treatment choices); and many other aspects of daily life, such as well-being and distress. Clinimetrics has a set of rules that govern the structure of indices, the choice of component variables, and the evaluation of consistency and validity. These rules differ from those of psychometrics, which was developed outside of the clinical field, mainly in the educational and social areas. An essential clinimetric requisite for an assessment method is its discrimination properties (responsiveness/sensitivity), which means that it should be able to detect clinically relevant changes in health status over time. Equally important is the concept of incremental validity, which refers to the unique contribution (or incremental increase) in predictive power associated with a particular assessment procedure in the clinical decision process. Accordingly, each distinct modality of measurement should deliver a unique increase in information in order to qualify for inclusion. In clinical research, several scales are often used under the misguided assumption that nothing will be missed. On the contrary, violation of the concept of incremental validity leads to conflicting results.
In clinical trials, priority has been given to the standardization of observer-rated scales that could serve as gold standards in differentiating the efficacy of a psychotropic medication from that of placebo. Such standardization stems from the necessity of comparing studies in different countries which may have different languages. As a result, a limited number of symptoms is selected and psychological measurements are targeted to test efficacy. These pragmatic needs, however, have limited the field and prevented developments. Excessive reliance on symptoms that are part of diagnostic criteria of mental disorders (e.g., major depressive disorder and generalized anxiety disorder) has impoverished clinical assessment in psychotherapy research and does not reflect the broad spectrum of variables that affect clinical presentations, such as demoralization and irritable mood, psychological well-being and euthymia [67, 68], mental pain, social adjustment and functioning [70-72], illness behavior [2, 73], and patient satisfaction. Patient-reported outcomes are frequently used in psychotherapy research. They may be more conservative than clinician-rated outcomes in assessing changes over time [74, 75]. Many self-rating scales reflect general aspects of distress and not necessarily specific treatment targets. Self-rated methods might be particularly indicated in the case of computer-assisted internet-delivered treatments. In child psychotherapy research studies, assessment often includes reports from multiple informants. Finally, it is important to differentiate between outcome and process measures.
Another major limitation of standard assessment strategies has to do with the fact that targets of assessment have predominantly involved the desired effects of a medication. The evaluation of adverse events has been neglected despite the fact that appraisal of side effects of psychotherapy has attracted increasing interest [79, 80]. Psychotherapists are biased against recognizing their own treatment’s side effects. The assessment of side effects of psychotherapy entails a number of problems: side effects may be related not only to symptoms or course of illness but also to other areas in life, and it is difficult to ascertain the relationship between a certain event and a treatment. As with adverse events induced by medications, recognition of side effects depends on the adequacy of collection strategies [79, 80], and both interviews and self-rated instruments can be used.
It has become common practice in RCT to quantify the number of participants who, after a pharmacologic and/or psychotherapeutic trial, achieve response or remission according to specific cutoff points of rating scales [12, 14, 62]. Remission can be expressed either as a categorical variable (present/absent) or as a comparative category (nonrecovered, slightly recovered, moderately recovered, or greatly recovered) which refers to the clinical distance between the current state of the patient and his/her pretreatment position. In the same vein, many studies are concerned with relapse and recurrence as primary outcome measures, even though adequate criteria are not available for all mental health conditions. It is important, however, to indicate the number of participants who display deterioration after treatment according to specific cutoff points of the same rating scales. In fact, in clinical trials where differentiation according to cogent subgroups is made, a treatment which is helpful on average may be ineffective in some patients (no difference from placebo) and even harmful in others (worse than placebo) [7, 82]. The same phenomena have been described with the outcome of psychotherapy.
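Reporting deterioration with the same scale and the same kind of cutoff used for response and remission can be sketched as follows. The thresholds are hypothetical and are not taken from any specific instrument: the sketch assumes a scale on which higher scores mean greater severity, a pretreatment score above zero, remission defined as a posttreatment score at or below a cutoff, response as a reduction of at least 50%, and deterioration, symmetrically, as an increase of at least 50%.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Scores:
    pre: float   # pretreatment score (assumed > 0, higher = more severe)
    post: float  # posttreatment score on the same scale


def classify_outcome(s, remission_cutoff=7.0, change_fraction=0.5):
    """Classify one participant using hypothetical cutoffs on a single scale."""
    change = (s.pre - s.post) / s.pre  # positive values mean improvement
    if s.post <= remission_cutoff:
        return "remission"
    if change >= change_fraction:
        return "response"
    if change <= -change_fraction:
        return "deterioration"
    return "no change"


def tabulate(arm):
    """Count participants per outcome category in one trial arm."""
    return Counter(classify_outcome(s) for s in arm)


arm = [Scores(20, 5), Scores(30, 14), Scores(16, 25), Scores(20, 18)]
counts = tabulate(arm)
```

Tabulating each arm separately would show the number of participants who deteriorated alongside those who responded or remitted, which an average change score alone would conceal.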
Further, including biomarkers as secondary outcomes, in addition to the primary outcome that is used in any psychotherapy trial, may be particularly helpful. Examples of biomarkers may encompass changes in neurotrophins such as BDNF, neuroimaging changes after psychological intervention, or digital biomarkers such as actigraphy [68, 83]. Big data approaches may open new avenues in psychotherapy research through behavioral biomarkers.
The main methodological recommendations that we have discussed are summarized in Table 1. Psychotherapy research has produced major clinical advances in the treatment of mental disorders. The question is how to put the available evidence within the context of individual unique assets and liabilities. The methodological innovations that we have outlined may demarcate major prognostic and therapeutic differences in psychological trials and yield valuable insights into the clinical role of psychotherapy.
Drs. Guidi, Bockting, Brakemeier, Cosci, Cuijpers, Jarrett, Linden, Marks, Peretti, Rafanelli, Rief, Schneider, Schnyder, Sensky, Tomba, Vazquez, Vieta, Zipfel, and Fava have no financial conflicts of interest to disclose.
Dr. Wright has an equity interest in Empower Interactive and Mindstreet, developers and distributors of computer programs for behavioral health. He receives no royalties or other payments from sales of these programs. His conflict of interest is managed by an agreement with the University of Louisville.