Abstract
Drug dependence has long been thought to have a genetic component. Research seeking to identify the genetic basis of addiction has gone through important transitions over its history, in part based upon the emergence of new technologies, but also as the result of changing perspectives. Early research approaches were largely dictated by available technology, with technological advancements having highly transformative effects on genetic research, but the limitations of technology also affected modes of thinking about the genetic causes of disease. This review explores these transitions in thinking about the genetic causes of addiction in terms of the “streetlight effect,” which is a type of observational bias whereby people search for something only where it is easiest to search. In this way, the genes that were initially studied in the field of addiction genetics were chosen because they were the most “obvious,” and formed current understanding of the biological mechanisms underlying the actions of drugs of abuse and drug dependence. The problem with this emphasis is that prior to the genomic era the vast majority of genes and proteins had yet to be identified, much less studied. This review considers how these initial choices, as well as subsequent choices that were also driven by technological limitations, shaped the study of the genetic basis of drug dependence. While genome-wide approaches overcame the initial biases regarding which genes to choose to study inherent in candidate gene studies and other approaches, genome-wide approaches necessitated other assumptions. These included additive genetic causation and limited allelic heterogeneity, which both appear to be incorrect. Thus, the next stage of advancement in this field must overcome these shortcomings through approaches that allow the examination of complex interactive effects, both gene × gene and gene × environment interactions. 
Techniques for these sorts of studies have recently been developed and represent the next step in our understanding of the genetic basis of drug dependence.
Addiction Problems Past and Present
Although the problems associated with drug addiction and drug overdose have been a major public health focus in recent years, as well as the focus of much political attention, for addiction researchers this is not a new problem. Addiction is not a phenomenon unique to our society, or even to modern society, as is occasionally suggested. Indeed, it is clear that there have been opioid addiction epidemics in the USA after every major war since the Civil War in the mid-19th century, as well as periods where opioid misuse resulted from poor medical practice that underestimated the potential for addiction, in addition to other contributing societal, cultural, economic, and geopolitical factors [Courtwright, 1978, 2001]. Moreover, many other drugs have been abused in the USA and elsewhere, and the abuse of particular drugs waxes and wanes, in part due to changes in public perception of their danger, which rises as drug use, abuse, and overdose increase, but is then quickly forgotten as the drug becomes less popular. It is also true that in each society certain drugs are treated with much more tolerance, culturally and legally, and thus present more consistent problems for that society, as alcohol has for the USA.
One of the reasons that drugs are often tolerated within a society is the simple fact that the vast majority of people who use those drugs do not develop abuse problems, even for many “hard drugs.” In considering the problems associated with drug abuse and addiction it is therefore important to consider this obvious fact and turn the question around: what is different about people who develop drug use problems from those who do not? The answer to this question is far from simple, and certainly involves a complex mix of environmental and genetic influences. This review will focus largely upon the question of genetic contributions to drug addiction and abuse liability, how the role of these genetic factors has been studied, the weaknesses that were inherent in those research approaches, and how those weaknesses may now be overcome.
Drug Dependence Has a Genetic Basis
One of the first steps in science is to formalize ways of thinking about what non-scientists often view as “obviously true.” One of the benefits of this formal way of thinking is the frequent demonstration that “common knowledge” is often false. With regard to the genetic basis of addiction, however, common beliefs based on everyday observations are largely true. These beliefs rest upon the observation that addiction tends to occur more often in closely related individuals. Of course, closely related individuals usually share common environments as well as genetics, so the greatest difficulty in confirming a genetic contribution to the likelihood of developing drug use problems lies in separating these factors.
There were 3 primary types of evidence that were initially used to establish the genetic relationship for drug dependence and drug abuse liability. Family, adoption, and twin studies examining dependence and abuse for a range of substances have all supported the view that there is a substantial genetic contribution to drug dependence [Uhl et al., 1995; Tsuang et al., 1998; True et al., 1999; Kendler et al., 2000]. There is strong evidence from studies in closely related individuals (as summarized in Schuckit [1985]) for a genetic component to drug dependence liability: (1) parental alcohol use disorder (AUD; the term we would now use instead of the term “alcoholic”) is observed in 31% of persons with AUD; (2) monozygotic and dizygotic twin concordance rates are 55 and 28%, respectively; and (3) adoption studies show a 44% higher rate of AUD in the adopted offspring of persons with AUD compared to the offspring of non-affected persons. Furthermore, an examination comparing a sample of alcohol-dependent individuals and their siblings [Bierut et al., 1998] found that the siblings of alcohol-dependent probands had elevated rates of alcohol dependence (50% for men and 25% for women) compared to non-alcohol-dependent control subjects and their siblings. Similar results have been found for dependence on other drugs, including opioids, cocaine, cannabis, and/or alcohol [Merikangas et al., 1998], with an 8-fold increased risk of drug disorders observed in the family members of drug-dependent individuals. It might be suggested that environmental factors, including simple exposure to particular drugs, might also play a role here, but other evidence suggests that specific heritable factors contribute not only to the liability for drug dependence overall, but also to the liability to abuse particular drugs. 
The familial clustering of drug use problems [Kendler et al., 1997] and the increased risk associated with having a parent with drug use problems [Midanik, 1983] strongly suggest the importance of genetic factors in drug abuse and dependence. Other approaches have helped to separate the inherent problems involved in comparing closely related individuals that also have shared environments, and in particular the influence that an addicted parent can have over the behavioral and psychological development of their offspring.
Another early approach taken to try to tease out this complex web of environmental and genetic influences was to compare children who had been adopted away from their biological parents, and therefore did not share a common environment with their biological siblings. These comparisons involved examination of the concordance for drug-related phenotypes (e.g., drug dependence) between offspring and their biological and adoptive parents: a greater similarity between offspring and biological parents suggests a genetic basis for the phenotype, while greater similarity between offspring and adoptive parents suggests an environmental basis of the phenotype. Studies of this type [Cadoret et al., 1986, 1995, 1996] have found strong evidence for genetic factors in drug dependence liability, but have also found evidence for environmental risk factors, including parental divorce and parental psychiatric disorders in the adoptive parents. Before considering the different magnitudes of the contributions of these genetic and environmental factors to drug dependence and abuse liability, it must first be recognized that adoption studies have certain intrinsic limitations. Notably, biological parents of adoptees, adoptive parents, and adopted children are not representative of the population as a whole. There are reasons that parents give their children up for adoption, which may be simple, such as being young, poor, and lacking social support from their families or society as a whole. Although less information is available for biological parents of adoptees than for adoptive parents [United Nations, 2009], it is clear that biological parents have an increased incidence of virtually every type of personal (and societal) problem compared to adoptive parents, including poverty, poor health, addiction, mental illness, criminality, low education, low IQ, and so forth [Kreider and Lofquist, 2014]. 
Many of the traits that may underlie these outcomes are also thought to have strong genetic components and might contribute to increased drug dependence in biological parents of adoptees, as well as their adopted children. Furthermore, part of what makes the contrast between biological and adoptive parents so stark is that adoptive parents are actively selected against all of those same factors and are consequently wealthier, more educated, and more intelligent than the average parent [Kreider and Lofquist, 2014]. Additionally, many adoptions do not occur at birth, so children may have negative experiences associated with their biological parents prior to adoption. Even if children are adopted at birth, prenatal influences could also affect later outcomes, including prenatal illicit drug exposure, which would certainly be more likely in adopted children than in the general population. An example of this is one of the current consequences of the opioid epidemic whereby large numbers of children have been left parentless as a result of opioid drug overdoses [Winstanley and Stover, 2019].
It appears clear that a substantial genetic component contributes to drug dependence and drug abuse vulnerability. The next most important question is how much of the liability for drug dependence is due to genetic factors, that is, the “heritability” of that trait. Comparisons between monozygotic and dizygotic twins, who would presumably share many of the same environmental influences, have often been used to calculate heritability, although other genetic comparisons have also been used in this manner. Reviews of such comparisons, largely from twin and adoption studies, indicate that the heritability of AUD is between 0.5 and 0.6 [Schuckit, 2009; Verhulst et al., 2015], while the range for other abused substances is slightly wider, between 0.3 and 0.8, in part depending on the substance considered [Tsuang et al., 2001; Vink et al., 2005; Agrawal and Lynskey, 2006, 2008]. Heritability estimates give some indication that there is a genetic component for drug dependence, but the weaknesses of such approaches are often overlooked or underestimated [Benchek and Morris, 2013]. Among many issues, a primary problem is that from the outset, to make the calculations of heritability feasible, interactions between and within components are assumed to be minimal. As shall be seen, the same additive, non-interactive assumptions have been necessary for many genetic approaches. This basic assumption may not be correct and may consequently skew the perception of the causes of drug dependence.
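To make the additive assumption concrete, the following is a minimal sketch of one classical way that twin comparisons are turned into a heritability estimate, Falconer's formula. The formula is not named in the text above and is used here only as an illustration; the function name and the twin correlations are hypothetical. Note that the formula operates on twin correlations in liability, not on the raw concordance rates quoted earlier.

```python
def falconer_estimates(r_mz: float, r_dz: float) -> dict:
    """Decompose trait variance from twin correlations (Falconer's formula).

    Assumes purely additive genetic effects, equal shared environments for
    monozygotic (MZ) and dizygotic (DZ) pairs, and no gene x gene or
    gene x environment interaction.
    """
    h2 = 2.0 * (r_mz - r_dz)   # additive genetic share (heritability)
    c2 = 2.0 * r_dz - r_mz     # shared-environment share
    e2 = 1.0 - r_mz            # non-shared environment plus measurement error
    return {"h2": h2, "c2": c2, "e2": e2}


# Hypothetical correlations, chosen only for illustration.
additive = falconer_estimates(r_mz=0.60, r_dz=0.35)

# When r_mz exceeds twice r_dz, the shared-environment term turns negative.
# A negative variance share is impossible, which signals that the additive,
# non-interactive model does not fit these (hypothetical) data.
nonadditive = falconer_estimates(r_mz=0.60, r_dz=0.20)

for label, est in (("additive fit", additive), ("poor fit", nonadditive)):
    print(label, {k: round(v, 2) for k, v in est.items()})
```

The second call shows how the calculation itself gives a warning when interactions are not minimal: the three shares still sum to 1, but one of them is no longer interpretable.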
Ultimately, the existence of a genetic component for drug abuse liability also leads to questions of its origins. In considering the persistence of drug dependence across human cultures and across time, it is clear that the genetic contributions to addiction are certainly not an aberration, although they are commonly treated this way, as if they are something that evolution has just not gotten around to eliminating. The affinity of the anthropoid lineage for alcohol has been suggested to relate in part to a highly frugivorous diet [Dudley, 2002], although this trait is something that varies substantially across the primate lineage and is certainly not characteristic of our own species. Moreover, humans clearly have the potential for dependence upon drugs acting through many other mechanisms that involve very old genetic variants. As just one example, a variant of the fatty acid amide hydrolase gene that is associated with drug dependence is estimated to be around 150,000 years old [Flanagan et al., 2006]. Moreover, many of the mechanisms that underlie drug dependence have much older evolutionary roots than that, far older than even just the mammalian lineage [van Staaden and Huber, 2018]. These roots may lie more in the common and deep evolutionary roots for motivation and its role in learning rather than for addiction per se. Nonetheless, the potential for addiction may lie in these roots. As researchers began to explore the potential genetic basis for addiction the initial focus was upon systems involved in motivation, although certainly that focus has broadened over time to include various aspects of motivation, behavioral control, and decision making. Ultimately, the persistence of gene variants across time indirectly implies that there is a reason for that persistence. The genetically mediated aspects of drug dependence are certainly deleterious, so they must be in some way counterbalanced by other advantageous effects also produced by those gene variants. 
As we come to understand the genetic basis of drug dependence this relationship should become clearer.
In any case, estimates of heritability are just a starting point in understanding the genetic contribution to drug dependence liability. Regardless of the size of the genetic contribution, demonstration of a heritable component still leaves questions as to its underlying architecture, including how many genes are involved, which genes are involved, and how heterogeneous this genetic contribution to drug dependence liability is. Nor does the demonstration of a genetic component constitute an understanding of how genetic factors exert their effects upon drug dependence liability.
Candidate Gene Studies: Guessing Which Genes Are Important for Addiction
Evidence that addiction and abuse liability have some degree of a genetic component led to studies attempting to specify the genes that underlie this genetic basis. Somewhat naturally, in the pre-genomic era, the choice of which genes to study was based upon a priori understanding of the mechanisms thought to underlie the reinforcing effects of drugs of abuse. Indeed, early sequencing efforts focused on particular genes for this purpose, such as the dopamine transporter (DAT or SLC6A3 [Shimada et al., 1991]) and the µ-opioid receptor (MOR or OPRM1 [Wang et al., 1993]), thought to be the primary molecular targets for major classes of addictive drugs, psychostimulants and opioidergic drugs, respectively. Identification of DNA sequences for these genes led to efforts to identify variation in these genes that could be used for linkage and association studies, most commonly single nucleotide polymorphisms (SNPs). The cumulative findings of candidate gene studies of drug dependence have been extensively reviewed many times [Kreek et al., 2005; Gelernter and Kranzler, 2009; Jones and Comer, 2015], including reviews that focus upon particular substances [Stickel et al., 2017; Icick et al., 2020; Thorpe et al., 2020]. Many reviews have tended to indicate greater degrees of concordance between studies than have actually been observed. A previous review by one of the present authors [Hall, 2016] analyzed results of comparisons for several genes that have been widely studied for their potential relationship to drug dependence. These included the DRD2 and OPRM1 genes, for which clearly less than half of the candidate gene studies in the literature have found positive associations, despite the well-known publication bias against negative findings.
The logic behind many candidate gene studies was generally sound from the point of view of biological relevance, although this was probably considered from a rather limited point of view, focusing mostly on the most immediate actions of drugs of abuse. Nonetheless, the consequences of some polymorphisms appeared to support the logic behind such choices in candidate gene studies. For instance, the MOR A118G SNP produces an amino acid change that alters endogenous ligand binding [Bond et al., 1998], which would be expected to affect responses to opioid drugs, including drug-induced subjective effects and drug reinforcement. However, despite the “obvious” relevance of such a genetic change, that study found no significant association of this polymorphism with opioid dependence. Subsequent studies have both found [Szeto et al., 2001; Tan et al., 2003; Bart et al., 2004] and failed to find [Li et al., 2000; Franke et al., 2001; Shi et al., 2002; Crowley et al., 2003; Tan et al., 2003] significant associations for the same polymorphism. Population admixture has been suggested to be involved in positive associations with this polymorphism when they have been observed in some, though not all, cases [Kreek et al., 2005]. Other factors associated with the populations studied and other aspects of the experimental designs surely contribute to whether or not significant associations are found, although it must be considered whether this indicates that there is no true association underlying these observations, whether the effect size is very small, hindering the observation of significant associations, or whether there is substantial heterogeneity underlying these genetic effects which also impairs the ability to identify significant associations.
Although drugs of abuse differ in many of their initial acute effects, there is a convergence in many downstream effects, including those that are involved in drug reinforcement, other forms of drug learning, and adaptations to chronic drug exposure [for review, see Scofield et al., 2016; Koob, 2020; Wise and Robble, 2020]. Consequently, research focus has also been placed upon the downstream mechanisms that may be involved in these processes, particularly those involving dopaminergic and glutamatergic systems that are directly or indirectly affected by all drugs of abuse. The genes targeted in these systems are primarily those involved in synaptic transmission, particularly receptor mechanisms, and to a large extent this focus has persisted to the present day.
Candidate-gene approaches relate variation in particular gene loci with a chosen phenotype using linkage or association techniques that examine related and unrelated individuals, respectively. A large number of studies have examined broad phenotypes, in particular the genetic propensity to develop drug dependence overall [Smith et al., 1992; Comings et al., 1994; Berrettini et al., 1997; Vandenbergh et al., 1997; Kranzler et al., 1998; Krebs et al., 1998; Gelernter et al., 1999a; Blomqvist et al., 2000; Hoehe et al., 2000; Vandenbergh et al., 2000; Luo et al., 2003; Agrawal et al., 2006; Zhang et al., 2008], or dependence upon particular types of drugs. A priori considerations dictated many of the choices in these studies, including focusing upon the immediate targets of drugs of abuse. Thus, studies of dependence on psychomotor stimulant drugs like cocaine and amphetamine focused upon genes in dopaminergic systems [Noble et al., 1993; Persico et al., 1996; Comings et al., 1999b; Gelernter et al., 1999b; Serý et al., 2001; Tsai et al., 2002; Hong et al., 2003; Liu et al., 2004; Guindalini et al., 2006], while studies of opioid dependence focused upon opioid system genes [Mayer et al., 1997; Bond et al., 1998; Comings et al., 1999a; Franke et al., 1999; Li et al., 2000; Zimprich et al., 2000; Szeto et al., 2001; Shi et al., 2002; Xu et al., 2002; Crowley et al., 2003; Tan et al., 2003; Bart et al., 2004; Yuferov et al., 2004], studies of alcohol dependence focused on GABAergic system genes [Noble et al., 1998], and studies of nicotine dependence focused upon nicotinic cholinergic system genes [Saccone et al., 2007]. As understanding of the broader circuitry underlying drug dependence developed this focus expanded beyond the initial targets of drugs of abuse, but this still focused upon a priori knowledge. 
Because of the focus on dopamine as an essential mediator of drug reinforcement, there was much focus on dopaminergic genes, which can be seen in studies that examined alcohol dependence [Blum et al., 1993; Sander et al., 1997b; Noble, 1998a, b; Noble et al., 1998; Gelernter et al., 1999a; Vandenbergh et al., 2000; Gorwood et al., 2003] and opioid dependence [Kotler et al., 1997; Lawford et al., 2000], among other drugs of abuse. Similarly, because opioid systems are also thought to have a broader role in drug dependence, not just in the actions of opioids, the relationship of opioid system genes to dependence on other drugs has also been a focus of investigation [Bergen et al., 1997; Franke et al., 1999; Chen et al., 2002; Ide et al., 2004; Dahl et al., 2005; Xuei et al., 2006]. Because the basic presumption here was that the genetic effects were additive, studies also began to look at combinations of genes, although again often focusing on those thought to be involved in addiction based on their known position in drug reinforcement circuitry. For example, one study looked at the additive relationship between variants in the dopamine receptor D2 gene (DRD2) and the GABA receptor subunit gene β3 (GABRB3) in alcoholism [Noble et al., 1998].
In candidate gene studies, associations between allelic markers for selected genes and dependence on specific drugs of abuse are not always consistent, for reasons that have been discussed. Surprisingly, this has been the case even for genes thought to be intimately involved in the immediate pharmacological mechanisms of particular abused drugs. For example, despite the supposed importance of dopamine in the effects of alcohol, Hack et al. [2011] found that out of 10 dopaminergic system genes, only DRD4 and SLC6A3 were associated with alcohol dependence, while no significant associations were found for DRD1, DRD2, DRD3, DRD5, SLC18A2, DDC, TH, or COMT. Even for drugs known to interact quite directly with the dopaminergic system, such as cocaine and methamphetamine, consistent associations with allelic markers in dopaminergic genes have not been observed. In one study of 13 candidate genes, no dopaminergic genes were associated with cue reactivity in cocaine-dependent individuals [Smelson et al., 2012], including DRD1, DRD2, DRD3, DRD4, and SLC6A3. Interestingly, significant associations were found in that study for variants in the GABRA2 and OPRM1 genes, which code for proteins that are less directly involved in the actions of psychostimulant drugs. Similar patterns have been observed in larger studies. Of over 1,000 SNPs examined in 130 candidate genes assessed for association with heroin dependence, 17 SNPs were nominally positive, including SNPs in a number of genes associated with dopamine, serotonin, GABA, and glutamate function; however, none of the associations were significant after correction for multiple testing [Levran et al., 2009]. This has been a consistent problem, which probably reflects all of the problems that have been discussed, including small individual effect sizes and allelic heterogeneity, as well as the tendency for studies to be underpowered given those 2 facts in particular. 
Somewhat consistent findings have been shown for DRD2 variants in opiate dependence [Lawford et al., 2000; Li et al., 2002; Clarke et al., 2014], although it has been suggested that some DRD2 associations may in fact involve adjacent genes [see discussion in Gorwood et al. 2012]. As will be discussed in the context of genome-wide association studies (GWAS), the absence of implication of genes encoding proteins that are central to the immediate neural actions of many drugs of abuse may indicate that those genes may not tolerate as much variability as other genes, or, simply, that other genes are more important contributors to the development of drug use disorders despite the clear involvement of those neurotransmitter systems in the pathophysiology of addiction. Indeed, as discussed below, genetically modified mouse studies have clearly identified neurotransmitter genes (among others) that play a role in the acute actions of several drugs of abuse, including cocaine, amphetamines, and opioids.
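The multiple-testing problem noted above for the heroin dependence study can be illustrated with simple arithmetic. The sketch below uses a Bonferroni correction as a stand-in, since the correction method and the nominal threshold actually used by Levran et al. [2009] are not stated here; the round figure of 1,000 tests, both alpha values, and all function names are assumptions made for illustration.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def expected_null_hits(nominal_alpha: float, n_tests: int) -> float:
    """Expected count of nominally 'positive' tests if no true effect exists."""
    return nominal_alpha * n_tests


def survivors(p_values: list[float], alpha: float = 0.05) -> list[float]:
    """Return the p-values that remain significant after Bonferroni correction."""
    cutoff = bonferroni_threshold(alpha, len(p_values))
    return [p for p in p_values if p < cutoff]


N_TESTS = 1000  # assumed round number; the text says "over 1,000 SNPs"

cutoff = bonferroni_threshold(0.05, N_TESTS)
null_hits_05 = expected_null_hits(0.05, N_TESTS)
null_hits_01 = expected_null_hits(0.01, N_TESTS)

print(f"Bonferroni cutoff per SNP: {cutoff:.1e}")
print(f"Expected chance hits at nominal 0.05: {null_hits_05:.0f}")
print(f"Expected chance hits at nominal 0.01: {null_hits_01:.0f}")
```

Under these assumptions, a panel of this size is expected to yield tens of nominally positive SNPs by chance alone, so a handful of nominal hits carries little evidential weight until it survives correction.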
The studies mentioned so far in this review primarily looked at the very broad phenotype of liability to develop drug dependence, or perhaps dependence on a particular drug. However, drug use disorders are far from homogeneous, and like other psychiatric conditions, individuals with the same diagnosis often have a different pattern of symptoms despite some core underlying similarity. This apparent heterogeneity of symptoms and underlying causes contributed to the development of the concept of an “endophenotype” [Gottesman and Gould, 2003], the core idea being that the wider phenotype, such as drug dependence, is formed from intermediate phenotypes or sub-phenotypes, termed endophenotypes, that have a stronger relationship to specific underlying mechanisms, including genetic mechanisms. To some extent, thinking along these lines led to an overall reconsideration of the basis of psychiatric diagnoses using the Research Domain Criteria (RDoC) nosology [Morris and Cuthbert, 2012]. This initiative sought to reframe the basis of psychiatric nosology, recognizing both that underlying symptoms differed between individuals with the same Diagnostic and Statistical Manual (DSM) diagnosis and that the same symptoms could be observed in individuals with different diagnoses. This perspective has obvious connections to the concept of endophenotypes. Moreover, the other primary goal was to link psychiatric diagnoses to underlying biological mechanisms, which had never been a part of psychiatric diagnoses in any version of the DSM, even the most recent version, DSM-5 [American Psychiatric Association DSM-5 Task Force, 2013]. An important corollary of the RDoC approach to psychiatric nosology is that the underlying biological mechanisms, including genetic ones, may be more strongly linked to endophenotypes than to the all-encompassing phenotypes that comprise psychiatric diagnoses, including drug use disorders.
Prior to the development of these concepts, in part due to negative or inconsistent findings for the broad phenotype of drug dependence, the relationships between genetic variance and endophenotypes for drug dependence began to be explored. Indeed, such studies found stronger relationships between allelic variants and specific drug-related phenotypes [Comings et al., 1994; Sander et al., 1997a; Shi et al., 2002; Gorwood et al., 2003; Ide et al., 2004] than were found between those same variants and drug dependence overall. In this regard, although some people come to abuse any drug (polydrug abusers), many individuals tend to seek specific substances. This may relate to underlying mechanisms that specifically predispose individuals to abuse those drugs, which may include premorbid psychiatric conditions for which individuals essentially self-treat [for an extensive discussion of this topic in relation to nicotine dependence, see Hall et al., 2015]. Many lines of evidence support the overall idea of the role of self-medication in the development of drug dependence, but one of the simplest is that genetic associations may be specific to particular substances. For example, Comings et al. [1999b] found that variants in the dopamine D3 receptor (DRD3) were associated with cocaine dependence, but not dependence on several other classes of drugs, including amphetamines, opioids, and alcohol.
One of the fields that has developed alongside those discussed above is pharmacogenetics. This is most commonly applied to predicting responses to therapeutic drugs, which, like illicit drugs, vary substantially in individual responses. GWAS for the subjective effects of methamphetamine identified some overlap with results for GWAS for drug dependence [Hart et al., 2012], including the cell adhesion molecule CDH13. This is essentially similar to pharmacogenetic studies attempting to identify differences in response to therapeutic drug actions, although such differences might also result from different underlying disease mechanisms that require different therapeutic approaches. There are very few effective treatments for drug dependence, particularly for certain illicit drugs like methamphetamine [Kitanaka et al., 2016, 2019], but as therapeutic treatments for drug use disorders are developed it is quite likely that there will be differences in underlying mechanisms among individuals, as well as differences in therapeutic responses. For those drugs of abuse for which treatments exist, variants in certain genes have been shown to be predictors of effective treatments. For example, the effectiveness of bupropion in nicotine cessation trials is associated with variants in the dopamine D4 receptor (DRD4) [Leventhal et al., 2012]. In any case, these types of effects show effect sizes as small as those seen for other drug dependence phenotypes, and the study of the genetics underlying therapeutic responses has similarly relied primarily upon candidate gene approaches.
Of course, there may be many reasons for inconsistencies between candidate gene studies, but the most obvious should be considered – that some positive findings are false positives. It is important to consider how these genes were chosen for study in the first place. One of the premises of this review is that these choices are based largely upon the “streetlight effect,” as exemplified in this old Mutt and Jeff cartoon (Fig. 1). Like Jeff, geneticists began their search for the genetic basis of addiction based on where the light shone best; that is, those systems that were the most well-studied, for which the most tools existed, and which were most strongly linked to the acute effects of drugs of abuse. The rest of the genome was largely in the dark, in many cases only being gradually illuminated after initial discovery in the genome, with no prior understanding of their role in biology whatsoever, except based upon homology to known genes.
Thus, it seems likely that many of the findings of candidate gene studies may simply be wrong; there was great persistence in studying certain genes, even when presented with initial negative findings. After all, the proteins coded by these genes were “obviously” important for responses to addictive drugs. Conversely, other genes were simply not given consideration (they stood outside of the immediate illumination of the streetlight), or when negative findings were observed, they were accepted much more easily. The possibility that many initial findings from candidate gene studies were false positives will be reconsidered later in this review when GWAS are discussed. However, given the importance of the genes that were initially considered in candidate gene studies in the actions of drugs of abuse other possibilities should be considered. To begin with, it appears that the magnitude of the contributions of any particular gene variant to the overall genetic variation is rather small. Analysis of the magnitude of effects clearly shows this. For instance, Comings et al. [1999b] found that DRD2 and DRD3 receptor gene variants accounted for only 1.76 and 1.64% of the variance, respectively, in the liability to develop cocaine dependence. These are likely at the high end of such effect magnitudes for genetic effects in addiction, outside of some rare exceptions for alcohol dependence that involve alcohol metabolizing genes [Couzigou et al., 1994; Higuchi, 1994; Chen et al., 1999; Edenberg et al., 2019]. Another approach to capturing a larger proportion of this variance has been to examine the collective contributions of sets of gene variants to a phenotype. In one such study of cocaine dependence, Derringer et al. [2012] examined 8 dopaminergic system genes, finding that 4 SNPs in 4 of these genes accounted for 2.76% of the phenotypic variance in an initial trial sample, but only 0.54% in a subsequent test sample. 
If these combined effects represent the larger effects among the gene variants contributing to cocaine dependence, and the rest of the gene variants produce even smaller additive effects, it seems likely that the total genotypic contribution involves a large number of gene variants. Moreover, this would also suggest that these smaller gene effects will be very difficult to detect consistently, which might explain the mix of positive and negative findings in the case of true positives, but also suggest that there may be many false positives in this literature as well. Additionally, this situation is likely to be exacerbated if there is substantial allelic or locus heterogeneity, or if genetic contributions to addiction involve interactive effects with other gene variants or with environmental influences on the phenotype, none of which would be apparent in a typical association study.
One of the approaches taken to potentially overcome the problem of small effect sizes, as well as the consistent problem that many candidate gene studies are statistically underpowered, is meta-analysis. In some cases, the combination of multiple studies in meta-analyses has found overall significant effects, even when there is a mix of positive and negative findings amongst the original studies. For example, Noble [1998a] combined 15 studies that examined the relationship between a DRD2 receptor polymorphism and alcoholism in a meta-analysis. Of the studies included in the analysis, 8 had shown a significant association and 7 had failed to find a significant association, yet the meta-analysis found a highly significant association between DRD2 polymorphisms and drug dependence. This is not always the case though. As another example, ethnic-specific meta-analyses of the A118G OPRM1 polymorphism found a significant overall effect in 5 studies in individuals of Asian descent, but no significant overall effect in 7 studies that examined individuals of European descent [Chen et al., 2012]. This latter study not only demonstrates the problem of potential false positives and false negatives, but also the problem that genetic background influences the outcomes of genetic studies. This is likely due both to differences in the causal variants between populations and to differing relationships between genetic markers and causal variants. As an example of this situation, the same haplotype variant of the neuronal cell adhesion molecule gene (NRCAM) was associated with substance abuse vulnerability in European American and African American samples, but the phase of the association was opposite in each ethnic group [Ishiguro et al., 2006].
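The way pooling can reveal a small effect that individual underpowered studies miss can be sketched with a fixed-effect, inverse-variance meta-analysis. This is only an illustration under stated assumptions: the eight (log odds ratio, standard error) pairs below are invented, the fixed-effect model is a simplification, and nothing here reproduces the method or data of any study cited above. The toy case is also more extreme than the mixed results described in the text, since no single study reaches significance on its own.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical (log odds ratio, standard error) pairs for eight studies.
# These values are invented for illustration; they come from no real data set.
STUDIES = [
    (0.25, 0.15), (0.10, 0.18), (0.30, 0.20), (0.05, 0.16),
    (0.22, 0.14), (0.35, 0.22), (0.12, 0.17), (0.28, 0.19),
]


def z_score(effect, se):
    """Wald z statistic for an effect estimate and its standard error."""
    return effect / se


def fixed_effect_pool(studies):
    """Inverse-variance weighted pooled effect and its standard error."""
    weights = [1.0 / se ** 2 for _, se in studies]
    total_weight = sum(weights)
    pooled = sum(w * eff for w, (eff, _) in zip(weights, studies)) / total_weight
    pooled_se = sqrt(1.0 / total_weight)
    return pooled, pooled_se


def two_sided_p(z):
    """Two-sided p value for a standard normal z statistic."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


individual_z = [z_score(eff, se) for eff, se in STUDIES]
pooled_effect, pooled_se = fixed_effect_pool(STUDIES)
pooled_z = z_score(pooled_effect, pooled_se)

if __name__ == "__main__":
    print("individual z:", [round(z, 2) for z in individual_z])
    print("pooled z:", round(pooled_z, 2), "p:", two_sided_p(pooled_z))
```

A random-effects model would be the more natural choice when, as in the ethnic-specific example, the true effect is expected to differ between populations; the fixed-effect form is used here only because it is the shortest to state.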
While candidate gene approaches did establish that there is a substantial genetic component that contributes to drug dependence liability, when there has been evidence of involvement of a particular gene variant in dependence, results have been inconsistent at best. No large gene effects have been found, and given the nature of the phenotypes and the effects in question it appears likely that many genetic loci are involved. However, it has been argued that many early studies were simply too underpowered, and the effect sizes too small, to yield consistent results. Some meta-analyses have suggested as much. One such study found a highly significant association between DRD2 polymorphisms and drug dependence, as have others focusing on dopaminergic genes. Nonetheless, due to inconsistent findings, it has been hypothesized that allelic variants may correlate with endophenotypes, such as severity of dependence or particular symptoms. It is also possible that genetic variation primarily influences endophenotypes relevant to specific drug classes rather than a drug’s action. In any case, it has been a consistent observation across these candidate gene studies that drug dependence is highly polygenic and heterogeneous.
GWAS: Leaving a priori Considerations Behind
It is clear that large gene effects are rarely found in candidate gene studies for drug dependence phenotypes. Additionally, given the magnitude of the effects, many studies have certainly been underpowered, increasing the chances of false negative findings, meaning that much of the genetic variation underlying the genetic component of predisposition to drug dependence has not been revealed by such approaches. Given that the effect of any particular genetic variant contributing to drug dependence liability is rather small, and consequently that the genetic contribution to drug dependence liability is highly polygenic, approaches examining the association of markers across the entire genome appeared to be more likely to identify a larger proportion of the gene variants contributing to drug dependence liability and other addiction-related phenotypes. The development of genome-wide approaches also resulted in fundamental change in genetic studies – a move away from a priori predictions about the nature of the genetic basis of addiction towards approaches that are not based on any preconceptions. In terms of the metaphor we have used here, this move away from a priori predictions moved beyond the range of the streetlight’s illumination. To some extent this therefore became a test of the validity of those initial predictions, one which has largely shown that they were wrong, or at the very least that they vastly underestimated the complexity of the genetic causes of drug dependence.
GWAS overcame the initial bias inherent in candidate gene approaches, but also came with a new set of problems. One of these was based upon the very large number of comparisons involved in GWAS. It was clear that such a large number of comparisons would result in an increased potential for false positive findings, but the concern for false positive results led to an increased acceptance of false negative results [for a discussion of this issue, see Sebastiani et al., 2009]. One approach to addressing this problem, one which was also more reflective of the actual scientific process, was to seek replication of the same genomic markers, or markers in the same genomic regions, in independent samples across multiple GWAS experiments. This approach identified replicable genetic influences on addiction liability, although these effects once again appeared to be highly heterogeneous, small additive effects. In many studies, nominally significant results were reported, and replication of these nominally significant results was sought across studies as a way to overcome the potential for false negative results inherent in the large number of comparisons in GWAS. This reporting of “nominal significance” was common because few studies identified any comparisons reaching the p < 10⁻⁸ significance threshold that was calculated from the large number of individual comparisons, a value commonly referred to as “genome-wide” significance. This stringency, particularly given the smaller sample sizes common in most initial GWAS, produced a bias toward type II errors. 
Early approaches to overcoming this initial bias included the identification of clusters of positive SNPs in individual analyses [Liu et al., 2006], as well as identification of the same chromosomal regions across multiple studies [Drgon et al., 2011], presuming that false positive results would not recur in the same regions across multiple comparisons (indeed p values could be calculated for this probability as well). Using these approaches, not only were a greater number of gene variants associated with drug dependence, and a greater proportion of the genetic component identified, but the strategy also produced a greater degree of replicability. This replicability, although highly significant, was far from perfect, a reflection of both the underlying heterogeneity as well as the small effect sizes of the individual components.
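The stringency trade-off that motivated these replication-based strategies can be made concrete with a rough power calculation. The sketch below is an approximation under stated assumptions, not a reconstruction of any cited analysis: it treats the association test as a two-sided z-test whose expected value is sqrt(n × r² / (1 − r²)) for a variant explaining a fraction r² of phenotypic variance, and the values of 0.5% variance explained, 1,000 or 20,000 subjects, and one million tests are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def bonferroni_threshold(alpha, n_tests):
    """Per-test p value threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def expected_z(variance_explained, n_subjects):
    """Approximate expected z statistic for a variant explaining a given
    fraction of variance (simplifying assumption: quantitative-trait model)."""
    return sqrt(n_subjects * variance_explained / (1.0 - variance_explained))


def power_two_sided(z_expected, p_threshold):
    """Probability that a two-sided z-test rejects at p_threshold."""
    z_crit = _STD_NORMAL.inv_cdf(1.0 - p_threshold / 2.0)
    upper = 1.0 - _STD_NORMAL.cdf(z_crit - z_expected)
    lower = _STD_NORMAL.cdf(-z_crit - z_expected)
    return upper + lower


# Hypothetical inputs chosen only to illustrate the trade-off.
R2 = 0.005            # variant explains 0.5% of variance
GENOME_WIDE_P = 1e-8  # order of magnitude of the threshold discussed in the text

power_nominal_small = power_two_sided(expected_z(R2, 1_000), 0.05)
power_gw_small = power_two_sided(expected_z(R2, 1_000), GENOME_WIDE_P)
power_gw_large = power_two_sided(expected_z(R2, 20_000), GENOME_WIDE_P)

if __name__ == "__main__":
    print("Bonferroni threshold, 1e6 tests:", bonferroni_threshold(0.05, 1_000_000))
    print("power, n=1000, p<0.05:", round(power_nominal_small, 3))
    print("power, n=1000, p<1e-8:", power_gw_small)
    print("power, n=20000, p<1e-8:", round(power_gw_large, 3))
```

Under these assumptions the same small effect is detected more often than not at nominal significance in 1,000 subjects, almost never at the genome-wide threshold in that sample, and almost always at the genome-wide threshold in 20,000 subjects, which is the type II bias described above.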
One of the most important findings of these early GWAS was the dearth of positive findings for classical neurotransmitter system genes (e.g., receptors, transporters, etc.), many of which had been a primary focus of candidate gene studies. For example, of the 96 genes identified by clusters of positive SNPs by Liu et al. [2006], almost no monoaminergic system genes were among the significant associations. Instead, the major gene category implicated in drug dependence by this study was cell-adhesion molecules, which were 28% of the total positive associations. This study was from a series of GWAS conducted at the National Institute on Drug Abuse beginning in 2001 [Uhl et al., 2001a, b, 2007, 2008c; Liu et al., 2006; Johnson et al., 2006, 2008; Drgon et al., 2011]. Comparisons across these studies that involved different populations and different ethnicities identified replicable findings suggesting that the genetic basis of drug dependence liability is highly heterogeneous and highly polygenic, consisting of small additive effects [Drgon et al., 2011]. The standard of “genome-wide significance” in GWAS erred too much in favor of false negative results in its effort to avoid false positive results. Consequently, the criterion was so stringent that most GWAS could not identify any significant associations, despite clear evidence for a genetic contribution to the propensity to develop drug dependence. The alternative approach was to report “nominally significant” results at a lower statistical stringency, but then to seek replication across studies. Monte Carlo simulations examining the likelihood of repeatedly identifying the same genomic regions in multiple studies identified 50 highly replicated gene loci that were associated with drug dependence [Uhl et al., 2008a]. 
This list contrasts in several ways with previous candidate gene studies, including the greater number of loci identified, the higher degree of replication, and the gene classes of those genes identified. Although there was a high degree of replication, which was highly significant as determined by Monte Carlo simulations, this does not mean that the same associations were found in all studies or even that any single association was found 100% of the time. This no doubt reflects the underlying heterogeneity of drug dependence liability, consistent with previous findings in candidate gene studies. Importantly, many more associations were found, which is likely the result of consideration of genes that had been largely ignored in candidate gene studies. Although not a primary emphasis of this review, candidate gene studies have been conducted for many individual substances as well as dependence on any illicit drug. It is clear that although GWAS or candidate gene studies for dependence upon particular substances identify some substance-specific genes, the majority of the genetic contributions to substance use disorders involve general contributions to drug dependence [Kendler et al., 2003; see also discussion in Hall, 2016].
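The logic of testing whether regional replication across studies exceeds chance can be sketched with a small Monte Carlo simulation. This is a minimal illustration and not the procedure of the cited studies: the genome is assumed to be divided into equally sized bins, each study's nominally positive bins are assumed to fall uniformly at random under the null, and the number of bins, hits per study, and "observed" count of replicated bins are all hypothetical.

```python
import random
from collections import Counter


def simulate_null_overlap(n_regions, hits_per_study, min_studies, n_sims, seed=0):
    """For each simulated null data set, count regions that are nominally
    positive in at least `min_studies` studies purely by chance."""
    rng = random.Random(seed)
    null_counts = []
    for _ in range(n_sims):
        tally = Counter()
        for n_hits in hits_per_study:
            # Each study's positive regions are placed uniformly at random.
            tally.update(rng.sample(range(n_regions), n_hits))
        null_counts.append(sum(1 for c in tally.values() if c >= min_studies))
    return null_counts


def empirical_p(null_counts, observed):
    """Empirical p value with the usual +1 correction."""
    exceed = sum(1 for c in null_counts if c >= observed)
    return (exceed + 1) / (len(null_counts) + 1)


# Hypothetical setup: 10,000 genomic bins, four studies with 200 nominal
# hits each, and replication defined as a hit in at least three studies.
NULL_COUNTS = simulate_null_overlap(10_000, [200, 200, 200, 200], 3, 500)
OBSERVED = 10  # hypothetical number of replicated bins in real data
P_VALUE = empirical_p(NULL_COUNTS, OBSERVED)

if __name__ == "__main__":
    print("largest chance overlap:", max(NULL_COUNTS))
    print("empirical p for observed overlap:", P_VALUE)
```

Real marker positions are not uniformly distributed and neighboring markers are correlated through linkage disequilibrium, so a realistic null would need to preserve that structure; the uniform null is used here only to keep the idea visible.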
Although the prominence of cell adhesion molecule genes among those identified in GWAS was initially surprising, this was reconsidered in the light of the importance of learning mechanisms in drug dependence [Uhl, 2004]. Clearly the importance of neuroplasticity and learning mechanisms in drug dependence [for a review, see Robbins et al., 2008; Badiani and Robinson, 2004] is consistent with a role of cell adhesion molecules in drug dependence mechanisms [Muskiewicz et al., 2018]. These learning mechanisms include both the development of positive reinforcement early in the course of drug dependence and negative reinforcement later in the course of drug dependence. These mechanisms underlie important aspects of drug dependence, including drug craving and consequent drug-seeking behavior driven by drug withdrawal, stress, and drug-associated stimuli. Greater sensitivity to the development of positive and negative reinforcement in some individuals may result from changes that occur early in neurodevelopment prior to drug exposure or neuroplasticity that occurs during the course of drug exposure, but in either case could likely involve variation in genes involved in neurodevelopment and neuroplasticity, including cell adhesion molecules. The first possibility is certainly consistent with the model of behavioral endophenotypes being characteristic of drug-dependent individuals even prior to drug experiences, as well as the high degree of comorbidity of drug dependence with other psychiatric disorders that may contribute to negative reinforcement and self-medication [for a discussion of this topic, see Hall et al., 2015, 2017].
The most common gene classes that were identified in GWAS studies, including cell adhesion molecules, had not been widely considered in previous candidate gene studies, so the findings were generally novel. This clearly shows that previous candidate gene studies were limited by the streetlight effect, failing to recognize whole classes of genes that have important roles in drug dependence. Nonetheless, this does not necessarily mean that the previous candidate gene findings were erroneous for reasons previously discussed, including substantial allelic and locus heterogeneity, and stronger association with more specific drug dependence endophenotypes. For example, variation in OPRM1 has been associated with the analgesic and reinforcing effects of opioids [for a review, see Ikeda et al., 2005]. These allelic effects are highly heterogeneous; more than 100 polymorphisms in OPRM1 produce functional consequences on OPRM1 expression, binding affinity, or function. The fact that such broad variation exists within a single gene that may contribute to drug dependence phenotypes suggests 2 things: (1) once again, that many false negatives might result from examining only a small number of markers or even functional variants of that gene in either candidate gene studies or GWAS; and (2) that when a gene plays an important role in drug dependence it might accumulate more mutations than other genes. This is something that is certainly true for “Mendelian” disorders such as cystic fibrosis [Paranjapye et al., 2020] that are associated with a great number of potential causal genetic variants. Indeed, this genetic basis is thought to characterize such disorders. 
Although they are often thought of as “simple” genetic disorders, Mendelian disorders are associated with a great number of “causal” allelic variants within the primary gene that underlies the disorder; in addition, the severity of the disease differs substantially with the particular allelic variant and with a number of variants in other genes that also modify the severity of the disease. The OPRM1 findings just mentioned may indicate that the genetic architecture that underlies Mendelian disorders is not all that different from that of complex diseases that have a polygenic basis.
The assumptions underlying GWAS are expressed by the Common Disease Common Variant Hypothesis [Bush and Moore, 2012], which suggests that common diseases result from genetic variation that is common in the population. The idea of small effect sizes for individual genes, as well as substantial polygenic causation, naturally follows. Part of the presumption here is that GWAS will be able to identify common variants because they will occur significantly more often in the affected group compared to an unaffected control group. Heterogeneity presents a problem for identifying those variants because although a large number of individuals in the affected group might have disruption of the same genes, this might be the result of different variants in the same gene, that is, allelic heterogeneity. It has now been recognized that there is far greater allelic heterogeneity in the genome overall than initially thought (or perhaps hoped for) [Hormozdiari et al., 2014, 2017]. Moreover, there is a substantial effect of sample size on the observation of allelic heterogeneity. Much higher allelic heterogeneity is observed in larger studies, suggesting that underpowered studies with smaller sample sizes have tended to give an impression that there is far less allelic heterogeneity than there actually is, even though such heterogeneity would be an obvious explanation for the largely inconsistent results of candidate gene studies.
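Why allelic heterogeneity undermines single-marker tests can be illustrated with a simple calculation. The sketch below rests on deliberately artificial assumptions: carriers of a disrupted gene are spread evenly across a number of distinct causal variants, each carrier has exactly one of them, each marker tags exactly one variant, and the carrier frequencies (30% of cases, 20% of controls) and group size (1,000 per group) are hypothetical. It compares the expected two-proportion z statistic for the gene as a whole with that for any one variant.

```python
from math import sqrt


def two_proportion_z(p_cases, p_controls, n_per_group):
    """Expected z statistic for a difference in carrier frequency between
    equally sized case and control groups (pooled-variance form)."""
    pooled = (p_cases + p_controls) / 2.0
    se = sqrt(2.0 * pooled * (1.0 - pooled) / n_per_group)
    return (p_cases - p_controls) / se


def single_variant_z(gene_freq_cases, gene_freq_controls, n_variants, n_per_group):
    """Expected z for ONE causal variant when gene-level carriers are split
    evenly across `n_variants` distinct variants (simplifying assumption)."""
    return two_proportion_z(
        gene_freq_cases / n_variants,
        gene_freq_controls / n_variants,
        n_per_group,
    )


# Hypothetical gene-level carrier frequencies and group size.
CASE_FREQ, CONTROL_FREQ, N_PER_GROUP = 0.30, 0.20, 1_000

Z_BY_HETEROGENEITY = {
    h: single_variant_z(CASE_FREQ, CONTROL_FREQ, h, N_PER_GROUP)
    for h in (1, 2, 5, 10, 20)
}

if __name__ == "__main__":
    for n_variants, z in Z_BY_HETEROGENEITY.items():
        print(f"{n_variants:>2} causal variants -> expected z per variant {z:.2f}")
```

With these numbers the gene-level signal is strong, but once it is divided among ten variants the expected statistic for any single variant falls below even nominal significance, although the gene's overall contribution is unchanged.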
The identification of large numbers of associations in GWAS for common diseases did not identify the causal variants mediating those associations, only associations between marker variants. Fine mapping approaches that incorporate LD structure have been used to try to identify the actual causal variants [Maller et al., 2012; Schaid et al., 2018]. However, most of these methods assume low allelic heterogeneity, which has been shown to be much higher than initially thought [Hormozdiari et al., 2014, 2017]. Analyses examining allelic heterogeneity have looked at genetic loci implicated in several common disorders, including psychiatric disorders. Methods for examining allelic heterogeneity have yet to be applied to drug dependence, but based upon the substantial overlap between genes associated with drug dependence and those associated with psychiatric disorders [Uhl et al., 2008b, 2010; Johnson et al., 2009], it seems likely that there will be substantial allelic heterogeneity for drug dependence as well.
If allelic heterogeneity is higher than initially thought, false negatives are likely to be an ongoing problem in GWAS for drug dependence. Of the 15 GWAS for alcohol dependence listed in the NHGRI Catalog of GWAS [Hindorff et al., 2014], only 1 identified a significant association with aldehyde dehydrogenase 2 (ALDH2), a gene widely associated with alcohol dependence [Edenberg and Foroud, 2013]. In any case, even this very large gene effect only accounted for 17% of the genetic variance contributing to alcohol dependence. As associations with drug dependence go, this is a very large effect, but still fails to account for the majority of the variance in genetic contributions to alcohol dependence and is fully consistent with a highly polygenic and heterogeneous genetic architecture. Using the clustering strategy previously mentioned [Drgon et al., 2010], a high degree of replication across GWAS for drug dependence has been observed. Despite this potential problem of allelic heterogeneity, the analysis of multiple GWAS findings that was discussed previously [Uhl et al., 2008a] did find a high degree of replication. Certainly, each drug-dependence related gene is not identified in every study 100% of the time, but based on the considerations discussed here 100% concordance would not be expected across studies; that is, it is likely that the set of genes involved in drug dependence in different populations, in dependence on different substances, in different levels of dependence, and so forth, may not completely overlap. As this field progresses it is likely that we will be able to demonstrate different sets of genes associated with each of these modifying conditions.
By their nature, GWAS make no presumptions about which particular gene variants contribute to the heritable component of drug dependence liability in humans, although by necessity they do maintain the presumption of an additive model. Many of the other problems that plagued candidate gene studies persisted with GWAS, including underpowered studies, particularly in the context of the far greater number of comparisons, and consequent debate over statistical stringency and the trade-off between type I and type II errors, that is, false positives versus false negatives. Of course, one of the solutions to these problems would be very large sample sizes, not just hundreds of subjects, but tens of thousands of subjects [Visscher et al., 2017]. Such studies are a massive effort in terms of resources, and few such studies have been done. This solution alone, however, does not seem to identify more genes than were observed in previous, much smaller GWAS. For instance, one study of AUD examined over 250,000 subjects but only found 10 positive associations [Kranzler et al., 2019]. Some of the positive findings in this study replicated previous findings from candidate gene studies or smaller GWAS, but several had not been previously observed. When the data were split by ethnicity (i.e., genetic background), the findings in European Americans overlapped to a rather small extent with the findings in other ethnicities. Overall, the hope that much larger studies will identify a majority of the genetic contribution to drug dependence liability does not seem to be supported, although any solution to this problem is still likely to require much larger samples. The failure to identify more of the genetic components of drug dependence liability by just increasing the sample size may mean that one of the primary assumptions of GWAS, additivity, was not true. 
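A toy two-locus model shows how a purely interactive effect can be invisible to single-locus additive tests regardless of sample size. The penetrance table below is hypothetical and deliberately extreme: risk is elevated only when exactly one of two binary risk genotypes is present, and both genotypes are assumed to be independent with a frequency of 0.5. It is meant only as an illustration of the principle, not as a model of any gene pair discussed in this review.

```python
# Hypothetical penetrance table: P(affected | genotype at locus A, locus B).
# Risk is raised only when exactly one of the two risk genotypes is present.
PENETRANCE = {
    (0, 0): 0.10,
    (0, 1): 0.30,
    (1, 0): 0.30,
    (1, 1): 0.10,
}


def marginal_risk(penetrance, locus, genotype, other_freq=0.5):
    """Average risk for one genotype at `locus` (0 = A, 1 = B), averaged over
    the other locus, assuming the two loci are independent."""
    total = 0.0
    for other in (0, 1):
        key = (genotype, other) if locus == 0 else (other, genotype)
        weight = other_freq if other == 1 else 1.0 - other_freq
        total += weight * penetrance[key]
    return total


def marginal_effect(penetrance, locus, other_freq=0.5):
    """Risk difference a single-locus test would be trying to detect."""
    return (marginal_risk(penetrance, locus, 1, other_freq)
            - marginal_risk(penetrance, locus, 0, other_freq))


def interaction_contrast(penetrance):
    """Departure from additivity on the risk scale."""
    return (penetrance[(1, 1)] - penetrance[(1, 0)]
            - penetrance[(0, 1)] + penetrance[(0, 0)])


EFFECT_A = marginal_effect(PENETRANCE, locus=0)
EFFECT_B = marginal_effect(PENETRANCE, locus=1)
INTERACTION = interaction_contrast(PENETRANCE)

if __name__ == "__main__":
    print("marginal effect of A:", EFFECT_A)
    print("marginal effect of B:", EFFECT_B)
    print("interaction contrast:", INTERACTION)
```

In this constructed case each locus has no marginal effect at all, so no increase in sample size would help a single-locus test, whereas a test of the joint genotypes would see a threefold difference in risk.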
It may be possible that the answers to understanding the genetic component of addiction, and other complex diseases, lie in examining interactions, either gene-gene interactions or gene-environment interactions. Before considering this issue, studies that have sought to validate the findings of GWAS should be addressed.
Confirming that the Results of GWAS Studies Are Not Just False Positives
Association studies, whether candidate gene studies or GWAS, establish a link between genetic markers and an outcome. The search for a causal variant is often the next logical stage of inquiry, but functional genetic studies are far less common than GWAS [Gallagher and Chen-Plotkin, 2018]. When the issue of causal variants is raised there has long been a bias towards searching for variants that cause amino acid changes, even though these types of variants are far less common than other variants. Indeed, more than 90% of disease-associated gene variants are located in non-protein coding regions [Schaub et al., 2012], and are enriched in transcriptional regulatory elements [Maurano et al., 2012]. Consistent with these findings, a fine mapping study that followed initial positive findings in GWAS identified an NRCAM haplotype that influences NRCAM gene expression [Ishiguro et al., 2006]. The differences in gene expression resulting from the gene variants underlying these sorts of haplotypes, as well as other variants in high linkage disequilibrium, likely contribute to changes in gene expression in a complex manner. In the same way that the between-locus effects appear to be highly complex and interactive, the within-locus effects are also likely to be much more complex and interactive than was initially presumed (or hoped). This situation will make the search for “functional variants” for complex diseases quite difficult and perhaps require a paradigm shift to a more probabilistic and less absolute causal conception for it to be truly understood.
Another approach taken to confirm the results of GWAS is to examine manipulations of those genes in mouse models. Just as preclinical studies were widely used as a starting point for nominating genes to be studied in candidate gene studies, they were also used to confirm the potential role of novel genes nominated by GWAS. In particular, genetically modified mice were widely used for this purpose. These studies have been widely reviewed [Charbogne et al., 2014; Uhl et al., 2014; Muskiewicz et al., 2018], so only a few points will be made on this topic here. GWAS for drug dependence nominated many novel genes for potential involvement in drug dependence phenotypes. Among the approaches taken was to examine the effects of eliminating the gene (e.g., a gene “knockout”) on drug abuse-related phenotypes. This approach was in fact one of the first approaches taken in parallel with candidate gene studies prior to the genomic era. Like the genes that were studied in candidate gene studies, the main target proteins for the major drugs of abuse were of particular interest in gene knockout studies, including the main molecular targets for opioids (Oprm1), amphetamines (Vmat2), and cocaine (Slc6a3; DAT; e.g., Sora et al. [1997], Takahashi et al. [1997], and Sora et al. [1998], and for some overall summaries see Sora et al. [2010], Hall et al. [2013], and Moriya et al. [2013]). The primary goal of these initial studies, and many others that have examined the primary molecular targets for other drugs of abuse, was to determine whether gene modifications would alter behavioral responses to drugs of abuse, particularly those that were most relevant to drug dependence. In these studies, there were no real attempts to model human allelic variation using knock-in mice in which human allelic variants associated with drug dependence are inserted in place of the endogenous version of the gene. 
Some knock-in mice have been studied, but rather than modeling human variation these genetically modified mice were designed to either induce a hypersensitive receptor [e.g., Tapper et al., 2004] or to eliminate substrate binding [e.g., Thomsen et al., 2009]. Like the more commonly studied homozygous gene knockout mice that eliminate all expression of the target gene/protein, these knock-in approaches likely had more powerful effects than naturally occurring variants. Although not terribly illuminating with respect to naturally occurring genetic variation that might underlie the predisposition to drug dependence, these studies did identify critical proteins involved in behavioral responses to drugs of abuse, including many phenotypes that are relevant to drug dependence.
In part because of this history, the same logic was used to confirm the potential involvement of some of the novel genes implicated in drug dependence by GWAS. Again, these approaches did not seek to duplicate the sort of functional consequences of the human variants, and like the studies of candidate genes the magnitude of the changes was likely much greater than the magnitude of effects resulting from naturally occurring variation in these genes. Instead, the logic was that if these genes were important for drug dependence phenotypes then the complete elimination of gene/protein expression should affect drug dependence phenotypes. This was really intended to be more of a test for potential false positive outcomes from GWAS than an exploration of the mechanisms of the causal variants underlying GWAS. If genetic modifications in a small number of genes identified from all of the human genes tested in GWAS approaches, which had no previous implication in drug dependence, were found to affect drug dependence phenotypes, then this would be taken as an indication that these were true positives rather than false positives. Studies in genetically modified mice have shown that reductions in the expression of several CAM genes that were repeatedly associated with drug dependence, including NrCAM [Ishiguro et al., 2006], CDH13 [Drgonova et al., 2016], CSMD1 [Drgonova et al., 2015a], and PTPRD [Drgonova et al., 2015b; Uhl et al., 2018], affect behavioral responses to drugs of abuse and/or behavioral phenotypes relevant to addiction. Consequently, these studies strongly support the findings of GWAS that implicate variation in these genes in the predisposition to develop drug dependence. It must be remembered that these genes were identified in genome-wide approaches, and thus selected from the entire genome, and the likelihood of them all affecting drug dependence phenotypes by chance would be extremely low.
The identification of novel classes of genes, and the confirmation of their importance in drug dependence, may also provide some illumination of the evolutionary mechanisms that lie behind the persistence of these gene variants through time. The ability to learn about motivationally relevant stimuli, the ability to store those memories over long periods, and the resulting motivational effects of those stimuli will certainly have positive effects upon survival. Organisms that remember better or have strong induction of motivational states by conditioned stimuli might be better able to survive and reproduce. However, just as it is for other traits, enhanced memory and motivation may be a two-edged sword if certain types of stimuli gain too much control over behavior, at the expense of other organismal needs. Indeed, in recent decades this has been a common description of the propensity to develop drug dependence, as actions too quickly or too strongly lead to the development of habits and compulsions [Everitt and Robbins, 2005, 2016]. As our understanding of the totality of genetic contributions to drug dependence liability deepens in the post-GWAS era, including the nature of gene-environment interactions, so will our ability to trace the evolutionary roots of addiction.
Moving Forward in the Post-GWAS Era
This review has considered several important transitions in approaches to studying the genetic basis of addiction. In particular, this review has used the streetlight analogy to illustrate how initial attempts to understand addiction were biased by our choice of gene targets to study. GWAS overcame this initial bias but came with a separate set of problems. GWAS moved the field forward in many ways, but to continue to move the field forward it will be necessary to once again step back and consider what preconceptions continue to limit progress. One of the core assumptions that was necessary for GWAS was that the genetic effects are additive. A deeper understanding of the genetic basis of addiction will require considering gene × gene and gene × environment interactions and developing methods for doing so. Recent analytical and technical advancements are beginning to allow us to look not just at highly interactive genetic effects [Joubert et al., 2018, 2019], but also at multiple “omic” levels simultaneously [Weighill et al., 2019]. These multiple omic levels include the genome, the epigenome, the transcriptome, and the proteome, among others. Some of the inconsistency in genetic findings from candidate gene and GWAS approaches probably results not only from allelic heterogeneity, but also from the fact that the genetic “signal” is obscured by a heterogeneous set of complex genetic and environmental interactions that should be observable in alterations at other levels, including the epigenome and transcriptome. Moreover, it has become clear that many of the underlying mechanisms mediating drug dependence liability involve changes in gene expression that are highly tissue and cell specific [Gallagher and Chen-Plotkin, 2018]. This of course means that a single transcriptome analysis will not provide a full explanation of the underlying mechanisms. 
In the coming years newer approaches to the study of drug dependence, and other complex diseases, will be able to specify the genetic contributions to addiction that have been missed so far by examining gene × gene and gene × environment interactions, and how they affect multiple functional levels in a cell type- and tissue-specific manner. This will allow a much clearer view of the etiology and biology of addiction, as well as identifying more critical points that can be used in developing addiction therapeutics.
Acknowledgements
The authors thank the J.B. Johnston Club in Evolutionary Neuroscience and Karger for organizing the 31st Karger Workshop on Evolutionary Neuroscience.
Conflict of Interest Statement
The authors have no conflicts of interest to declare.
Funding Sources
This review was supported by The University of Toledo.
Author Contributions
F.S.H. wrote the first draft; all authors then contributed to editing the manuscript.