Abstract
Natural variations across animals in form, function, and behavior have long been sources of inspiration to scientists. Despite this, experimentalists studying the neural bases of behavior have increasingly focused on a select few model species. This consolidation is motivated primarily by the availability of resources and technologies for manipulation in these species. Recent years have witnessed a proliferation of experimental approaches that were developed primarily in traditional model species, but that may in principle be readily applied to any species. High-throughput sequencing, CRISPR gene editing, transgenesis, and other technologies have enabled new insights through their deployment in non-traditional model species. The availability of such approaches changes the calculation of which species to study, particularly when a trait of interest is most readily observed in a non-traditional model organism. If these technologies are widely adopted in many new species, they promise to revolutionize the field of neuroethology.
For such a large number of problems there will be some animal of choice or a few such animals on which it can be most conveniently studied.
August Krogh, 1929 [Krogh, 1929]
Introduction
Many transformative discoveries have leveraged the most advantageous organisms. These discoveries include green fluorescent protein (jellyfish), PCR (thermophilic bacterium), ion flux modeling of the action potential (squid), and the molecular basis of learning (sea hare). Which animal or animals are most convenient to study for a given question about the brain and behavior? Many factors influence the selection of an animal model, and ideally the primary considerations are the scientific questions that motivate an investigator. In reality, however, logistical and scientific considerations must both be weighed when making a decision; here I outline some practical concerns for deciding which species will make advantageous models. Importantly, the available methodological capabilities have shifted dramatically in recent years, and many of the new developments can be readily applied to any organism. I speculate on how this will impact model selection as key technologies mature. Finally, I present a case study involving research from my own lab on social behavior evolution in cichlid fish. This system highlights many of the advantages now available in non-traditional model organisms.
Over the past several decades research on vertebrates, which will be the focus of this review, has increasingly moved toward work in the laboratory mouse (and to a lesser extent, the zebrafish Danio rerio), catalyzed in large part by technological advances. Transgenesis permitted the insertion of foreign sequences into the mouse genome [Brinster et al., 1981], while homologous recombination-based gene knockouts [Thomas and Capecchi, 1987; Koller et al., 1989] allowed the analysis of loss-of-function mutants. These approaches have allowed reverse genetics research to assign functions to genes in a manner that was previously inaccessible. The use of homologous recombination to edit the genome relied on inbred genetic lines (homologous recombination requires near-perfect sequence matches across thousands of nucleotides) and stable embryonic stem (ES) cells. Standardization of rat and mouse models drastically reduced levels of genetic polymorphisms via inbreeding, and the isolation of ES cells brought gene knockouts within reach [Evans and Kaufman, 1981; Martin, 1981]. The ability to directly test gene function accelerated the coalescence around the mouse as the primary mammalian research animal, and further technologies developed in this animal as a result (e.g., genome sequences, antibodies, viral vectors). Centralized repositories for genetic lines of mice were established, as were consortia and for-profit companies serving researchers in those communities.
There is reason to believe that a process to reverse this consolidation is afoot: the move back to a wider variety of animal models. This is facilitated by two broad phenomena: the recognition that research on additional species will provide insights inaccessible in the existing models, and the development of key technologies that are species independent (Table 1). Furthermore, animal species are surprisingly similar at genetic, developmental, neuroanatomical, and hormonal levels [Carroll, 2008; O’Connell and Hofmann, 2012]. Most evolutionary changes tend not to affect protein-coding sequences, but rather the regulatory switches that determine when and where a gene is expressed. Therefore, the protein targets of interest to an experimentalist remain largely intact across organisms, permitting study on a variety of species. Thus, as researchers survey the landscape of potential research species, the choice of a species may be pulled less toward animals that have been historically preferred due to the aforementioned technologies, and indeed drawn to those for which scientific answers would be most compelling. The work in traditional model systems will remain important and robust, but work in a menagerie of new species will provide complementary and novel insights.
Table 1. Examples of technologies that may be applied across animal models, those that require some adaptation to be used in new species, and those that are inherently species specific

For an appreciation of how fruitful a comparative approach can be, consider the field of evolutionary developmental biology (evo-devo). The field leverages comparisons across species to gain insights into the genetic changes that occur over evolution to effect changes in animal development [Raff and Kaufman, 1983; Carroll, 2008]. Studies of development have been fruitful, as morphology is relatively straightforward to quantify, and the fossil record can provide historical perspective. The mapping of natural genetic variants that control body color and armor, development of eyes and body plan, and evolutionary enlargement of the human brain have all transformed our understanding of biology [Quiring et al., 1994; Carroll, 1995; McLean et al., 2011; Jones et al., 2012; Bilandžija et al., 2018]. In contrast to animal forms, behavior is ephemeral and rarely leaves fossil evidence [Hu and Hoekstra, 2017]. Thus, it can be difficult to quantify to the extent required in genetic studies. However, important strides have been made by ethologists and computer scientists [Anderson and Perona, 2014], and promise to reveal genetic mechanisms that regulate the incredible variation in behaviors across animals.
Researchers working with mice, Drosophila melanogaster, and Caenorhabditis elegans have access to unparalleled genetic tools. However, these animals are not ideal models for all phenomena. For example, traditional model organisms do not fully recapitulate key aspects of human circadian rhythms, cortical development, reward learning, social behavior, and sexual selection. This review focuses not on disease relevance per se, but rather on understanding the basic rules by which genes and neurons control behavior. Below I lay out some key practical questions that may help a researcher decide in which animal model a topic can be most conveniently studied, with a focus on research guided by molecular genetic experiments to uncover the regulation of behavior.
Animal Model Selection
Can the Species Be Studied in a Laboratory Setting?
It is essential that an animal reliably exhibits the behavior under study while in a controlled setting. Experimental science requires manipulation of hypothesized control systems, so the repeated elicitation of the behavior under multiple manipulations is required. While some animals perform fascinating behaviors in the wild, under captive conditions these behaviors may not be recapitulated due to stress or lack of natural stimuli. Ease of elicitation also promotes reproducibility because many independent trials can be run.
In order to maximize the sample size of animals tested, one should consider choosing a species with high fertility and fecundity in the laboratory. Many factors affect animal husbandry in the laboratory. When removed from their natural environments, many animals lack the cues that would typically drive reproduction in the mating season. Often, animals that breed year-round make excellent model systems, as replication of their preferred conditions in the lab leads to high fertility rates. An ideal species would have a high rate of mating, many viable offspring per reproductive cycle, and a short time to sexual maturity (i.e., generation time). These are of course characteristics often observed in animal pests; in this light it is no surprise that rats, mice, and fruit flies are utilized by scientists!
If a researcher’s goal is to use genetic manipulations, she should consider carefully aspects of an organism’s reproductive biology. The generation of genetic lines typically requires access to fertilized zygotes [but see also Chaverra-Rodriguez et al., 2018]. What is the feasibility of obtaining these embryos, and of growing them to adulthood after genetic manipulation? Some species also require parental care to survive. For example, in mammals zygotes are typically recovered from superovulated females and after manipulation are re-implanted into pseudopregnant females. While such protocols have been recently developed for new models such as prairie voles (Microtus ochrogaster) [Donaldson et al., 2009], this process is difficult and time consuming. In contrast, externally fertilizing species such as fish and many invertebrates provide ready access to embryos to be manipulated.
Other practical considerations concern physically housing the animals. The density at which animals can be kept limits the number of specimens that can be analyzed. Many species have extreme stress reactions to confinement in a small space, while others become highly aggressive in confined groups. These characteristics can limit the productivity of an animal colony. Furthermore, the behaviors they exhibit may not be naturalistic under such stressful conditions. Some researchers seeking to study non-human primates have chosen to study marmosets rather than macaques, since differences in the size of the animals, their space requirements, and their social structures permit the use of dramatically less laboratory space [Mitchell and Leopold, 2015]. Furthermore, the costs of housing animals rise with the space they require, with more specialized housing needs, and with the amount of trained care they receive. At one extreme, work with macaque monkeys runs into the tens of thousands of dollars per year, per animal. In addition, the ethical and legal constraints increase with the use of animals that are more closely related to humans – those that are believed to experience pain and suffering in a manner similar to humans.
One might also consider whether a species can be studied easily in its natural environment. Field studies can be used to test the generalizability of a result, and data from field experiments can inform further studies to be performed in the lab. Typically, experiments in the field are less invasive than those permitted in a laboratory, and cannot include genetic manipulations. Despite these limitations, there are many advantages to field work. First, it allows an analysis of the external validity of a mechanism in a natural environment and reveals important nuances. For example, in a laboratory setting, prairie voles exhibit a preference for spending time with a single partner, a phenomenon likened to monogamy [Walum and Young, 2018]. However, in naturalistic settings, important nuances arise. Not only does it become clear that prairie voles are not all sexually monogamous, but their propensity to philander can be correlated with natural variation in expression of a vasopressin receptor [Okhovat et al., 2015]. Thus, molecular and genetic studies in the field can highlight important questions to be answered through genetic manipulations in the lab.
Are Findings Using the Species Likely to Be Broadly Informative?
While studies in any species can reveal important biological principles, a researcher might ask how broadly useful results in a given species will be. For researchers who wish to impact human health in the (relatively) short term, studying animals most closely related to humans may be more fruitful. However, some caution is warranted here: promising therapeutic avenues discovered in laboratory mice often fail to translate into clinical progress [van der Worp et al., 2010]. It is an outstanding question whether there is another single species that combines the convenience of the mouse with a higher predictive value for clinical development. In any case, if a mechanism that controls a given phenotype is found in only a single laboratory species, it is unlikely to extend to humans. Conversely, when experiments show a mechanism to be a key feature across a variety of species, additional species (including humans) are more likely to share it. Thus, an expansion of the species studied may improve the rate of clinical successes of therapeutics [Striedter et al., 2014].
The laboratory mouse also does not recapitulate other important features of the human experience. For example, the durable social bonds that humans form among mating partners and between fathers and offspring are not well modeled in the mouse. In contrast, numerous species including many mammals, birds, and fish exhibit monogamous bonds and paternal care [Whiteman and Côte, 2004; Kvarnemo, 2018]. Mice are also nocturnal and therefore have visual systems that differ in some fundamental ways from humans [Refinetti and Kenagy, 2018]. Many behavioral tasks that seek to test complex cognitive processes, including drug addiction, require training that traditional genetic model species do not perform well [Spanagel, 2017]. Researchers with interests in specific behaviors might wish to investigate alternative species.
Studying how a phenomenon has evolved is key to understanding it deeply [Dobzhansky, 1973]. A careful selection of species from an advantageous position on the phylogenetic tree will permit the most useful generalizations. By studying species (or genetic strains) within a clade that vary with respect to a phenotype of interest, it is possible to discover correlations between that phenotype and other traits, including molecular, anatomical, or behavioral processes. Furthermore, by sampling species more widely, one gains an appreciation for intermediate steps that occurred as a phenotype arose (and why it was subsequently lost among some lineages). This may, in turn, lead to an appreciation of additional key mechanisms that control a phenomenon of interest.
How Large Is the Community of Scientists Working with the Species?
There are advantages to studying species with either large or small research communities. If a species is understudied, there may be results that are “low-hanging fruit.” However, benefits accrue to large communities studying a species as well. Other researchers in the field will perform complementary research, which provides opportunities for collaborations and improves the rigor of science. These communities attract researchers who apply different experimental approaches, and who study different questions in the same organism but utilize similar tools. The economies of scale provided by a large community incentivize “infrastructure” development. Examples of this include resources often developed by large consortia including shared genetic lines, genome sequences, experimental protocols, species-specific tools including behavioral testing apparatus, detection reagents like antibodies or validated oligonucleotides, and viral vectors. These features of larger communities can push research forward more rapidly and permit more opportunities for experimental replication to ensure that results have external validity. Therefore, investing funds and energy in a carefully selected set of species is recommended [Striedter et al., 2014]. This will optimize the balance of benefits from larger community size with transformational discoveries that await, unexplored in neglected organisms.
The Future of Research in Non-Traditional Model Organisms
Importantly, some technologies created by a community are restricted to a single species, while others can be readily ported. In fact, many technologies have recently been revolutionized in ways that render them species independent. We next explore the potential of these tools to affect a variety of genetic and neural circuit-level analyses in neuroethology. New tools significantly lower the barriers to performing sophisticated experiments in non-traditional model species and this may, in turn, shift the landscape of species under study in the near future.
Genetic Mapping
Work in the traditional systems of D. melanogaster and C. elegans has leveraged randomly induced gene mutations to identify functions for genes. However, non-traditional model systems have been indispensable for mapping naturally occurring gene variants that control traits. For example, divergent populations of stickleback fish (Gasterosteus aculeatus) and Mexican tetra fish (Astyanax mexicanus) have recently adapted to a variety of environments. Along with differences in their morphology and physiology [Jeffery, 2009; Jones et al., 2012], they have also evolved differences in social behavior. In particular, certain populations form tight, aligned schools, while others do not; genetic regions have been identified in each species that regulate the variance in behavior [Greenwood et al., 2013; Kowalko et al., 2013]. It is also possible to identify genetic regions that control the differences in behavior across species, if those species generate fertile hybrid offspring. In one example, two species of Peromyscus mice, P. maniculatus and P. polionotus, exhibit striking differences in the extent of paternal care provided to offspring. A large study used these differences to map genes that regulate paternal behavior to a few genetic loci, providing insights into evolutionary changes that regulate social bond formation [Bendesky et al., 2017]. Mapping gene variants responsible for traits has been greatly facilitated in the past decade by improvements to sequencing technology and the concomitant precipitous drop in its price. High-quality genome sequences can now be determined for several thousand dollars, and the bioinformatic pipelines for computational assembly can now be run rapidly and easily according to standardized protocols.
With complete genome sequences of many species now available in public databases and complementary experimental protocols, mapping genomic loci that control differences in traits across strains or species has become simpler and faster [Peterson et al., 2012].
Gene Expression Profiling
An alternate route to identifying key genes for particular behaviors is through the profiling of mRNA transcripts which are differentially expressed across groups. For example, comparisons can be readily performed across species, social groupings, or sexes. High-throughput sequencing has turbocharged transcriptomics similarly to genomics. Previously, transcriptome profiling work relied on the use of microarrays upon which a short segment of each gene was printed in known locations, and a sample was subsequently hybridized, allowing the calculation of expression level for every gene. However, this work required a significant investment in creating the microarrays, and each is designed to optimally detect sequences from a specific species. In contrast, high-throughput sequencing is agnostic to species; it works as well on non-traditional model organisms as on well-studied organisms. There is also no a priori requirement for a sequenced genome, although it does facilitate the process of mapping sequencing data to specific genes. Several other sequencing-based technologies are also portable to non-traditional organisms including chromatin immunoprecipitation (histone modifications are highly conserved, permitting cross-species antibody usage) and transcriptional profiling of neurons active during behavior [Knight et al., 2012].
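As a rough illustration of the comparison described above, the sketch below normalizes raw read counts by library size and reports a log2 fold change per gene between two groups. The gene names, the counts, and the pseudocount are hypothetical, and the calculation is deliberately minimal: a real analysis would use biological replicates and a dedicated statistical package rather than a single ratio.

```python
import math


def counts_per_million(counts):
    """Scale raw read counts by library size so samples are comparable."""
    total = sum(counts.values())
    return {gene: 1e6 * n / total for gene, n in counts.items()}


def log2_fold_change(counts_a, counts_b, pseudocount=1.0):
    """Return log2(B / A) per gene after counts-per-million normalization.

    The pseudocount (an assumption of this sketch) avoids division by
    zero for genes with no reads in one group.
    """
    cpm_a = counts_per_million(counts_a)
    cpm_b = counts_per_million(counts_b)
    return {
        gene: math.log2((cpm_b[gene] + pseudocount) / (cpm_a[gene] + pseudocount))
        for gene in cpm_a
        if gene in cpm_b
    }


# Hypothetical read counts for three made-up genes in two groups
# (for example, two social states or two species).
group_a = {"geneX": 100, "geneY": 500, "geneZ": 400}
group_b = {"geneX": 400, "geneY": 500, "geneZ": 100}

lfc = log2_fold_change(group_a, group_b)
for gene, value in sorted(lfc.items()):
    print(f"{gene}: log2 fold change = {value:+.2f}")
```

Positive values indicate higher relative expression in the second group, negative values lower. Because the input is simply a table of counts per gene, nothing in the calculation depends on which species the reads came from.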
Upon identification of a gene of interest, a common next step is the determination of its spatial expression pattern. This experiment is often performed using antibodies specific for the protein, but unfortunately there are often amino acid changes when comparing a traditional model species to the one of interest. Therefore, antibodies often do not recognize the orthologous protein in more distant species, necessitating the generation of new, sequence-specific antibodies. This is a time-consuming and expensive process. Fortunately, two complementary technologies fill this gap for detecting expression levels. First, mass spectrometry enables the quantitative detection of specific proteins from tissues. Although mass spectrometry typically utilizes homogenized tissue, it is also possible to detect a wide array of proteins (or indeed non-protein molecules such as lipids) in an unbiased manner [Hanrieder et al., 2013; Hosp and Mann, 2017]. Second, detection of mRNA sequences is readily performed using in situ hybridization, permitting the visualization of transcript expression at cellular resolution, across tissues. Probes for mRNA detection can be generated in a few hours in a process that can be easily customized for a gene in any species.
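To make the customization step concrete: an in situ hybridization probe is antisense to its target, so its sequence is the reverse complement of a chosen region of the transcript. The sketch below computes that sequence in DNA letters. The function names and the toy transcript are hypothetical, and real probe design also weighs length, specificity, and sequence composition, none of which is modeled here.

```python
# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def antisense_probe(transcript, start, length):
    """Return the antisense sequence for a region of a transcript.

    `transcript` is the sense-strand sequence written in DNA letters,
    `start` is a 0-based offset, and `length` is the probe length.
    """
    region = transcript[start:start + length]
    if len(region) < length:
        raise ValueError("requested region extends past the transcript")
    return reverse_complement(region)


# Toy transcript fragment (made up for illustration).
toy_transcript = "AAATGCCC"
print(antisense_probe(toy_transcript, 2, 4))
```

The only species-specific input is the transcript sequence itself, which is why this step ports so easily to a new organism once sequence data are available.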
Genetic Manipulation
Upon mapping the control of a trait to a gene (or more commonly, a region containing a few genes), a challenge remains to formally test whether the hypothesized gene(s) controls the trait. The most straightforward way to test this is to engineer a mutation into the gene in one of the parental species or strains, and then determine whether the trait in question is affected. Until recently, targeted mutations of the genome were difficult to perform in any species other than laboratory mice. Stable ES cells from inbred lines permitted gene manipulation through homologous recombination, but these cells are unavailable for most species. However, a series of genome editing tools has revolutionized biological research in the past 15 years or so. Zinc-finger nucleases, and later transcription activator-like effector nucleases (TALENs), coupled tailor-made DNA-binding proteins to a nuclease, resulting in a tool that would cleave the genome at a desired site [Gaj et al., 2013]. More recently, CRISPR (for clustered regularly interspaced short palindromic repeats) has emerged as a simpler, cheaper, and more reliable tool that utilizes an RNA-guided nuclease to achieve targeted genome modifications [Cong et al., 2013; Doudna and Charpentier, 2014]. Using standard molecular biology reagents and simple protocols, a researcher synthesizes an RNA that base pairs with a desired site in the genome, and delivers it to an embryo together with the associated nuclease, Cas9 (Fig. 1b). After DNA cleavage, the cell’s endogenous machinery often repairs the site through a mechanism that may introduce a small mutation (i.e., non-homologous end joining; NHEJ), typically <10 base pairs long. Animals carrying a mutation in the targeted gene can be readily recovered, and those carrying a loss-of-function mutation can be propagated and studied.
In the 7 years since CRISPR was first shown to be a viable tool, this workflow has generated mutants in many species previously refractory to manipulation [Sasaki et al., 2009, 2014; Li et al., 2014; Basu et al., 2015; Harel et al., 2015; Juntti et al., 2016; Hart and Miller, 2017; Stern et al., 2017; Trible et al., 2017; Bryant et al., 2018; Horie et al., 2019]. Note that gene knockdowns via RNA interference or morpholino technology have been used extensively in the past [Nasevicius and Ekker, 2000]. However, use of these techniques should be initiated with great care. Recent evidence in zebrafish has revealed extensive divergence in observed phenotypes using morpholinos, compared to loss-of-function mutations induced by CRISPR and TALENs [Kok et al., 2015]. The reasons for the discrepancies are unclear, but they may result from high off-target rates of morpholinos leading to aberrant phenotypes or from insufficient gene knockdown to obtain a detectable effect. Thus, recent developments in gene editing technology (e.g., CRISPR) provide a more reliable path to determining gene function. With the recent proliferation of technologies, the primary barriers to performing genetic experiments in non-traditional model species are no longer a lack of molecular genetic tools, but rather issues described above regarding animal husbandry and embryology.
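The target-selection step of this workflow can be sketched in a few lines. The example below assumes the commonly used Cas9 from Streptococcus pyogenes, which requires an NGG motif (the PAM) immediately 3' of a 20-nucleotide target; the text above does not specify these details, so treat them as assumptions of the sketch. It scans only the forward strand and ignores off-target scoring, both of which a real design tool would handle. The second function captures why NHEJ indels so often produce loss-of-function alleles: any indel whose length is not a multiple of 3 shifts the reading frame. All function names and the toy sequence are hypothetical.

```python
import re


def find_cas9_sites(seq, protospacer_len=20):
    """Return (start, protospacer, pam) for each forward-strand target.

    A target is a protospacer immediately followed by an NGG PAM.
    A lookahead is used so that overlapping targets are all reported.
    """
    seq = seq.upper()
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


def causes_frameshift(indel_length):
    """True if an insertion or deletion of this length shifts the frame.

    Pass a positive length for insertions and a negative one for
    deletions; only the magnitude matters.
    """
    return abs(indel_length) % 3 != 0


# Toy sequence (made up): a 20-nt stretch followed by the PAM "AGG".
example = "ATGCATGCATGCATGCATGCAGGTTTT"
for start, protospacer, pam in find_cas9_sites(example):
    print(f"target at {start}: {protospacer} | PAM {pam}")

print(causes_frameshift(-7))  # 7-bp deletion
print(causes_frameshift(6))   # 6-bp insertion
```

Because the search needs only a DNA sequence, the same scan applies to any species with sequence data for the gene of interest.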
Fig. 1. Methods for genetic manipulation. (a) A plasmid encoding a modified transposon contains terminal repeat transposase recognition sites (TR) flanking a cassette consisting of a promoter that drives transgene expression. A transposase enzyme provided in trans catalyzes the insertion into the genome, at a random location. (b) CRISPR gene editing utilizes a guide RNA (blue) which base pairs to a selected genomic location. The Cas9 nuclease catalyzes a double-strand break (DSB, at arrowhead) that can be repaired by the host cell’s non-homologous end joining (NHEJ) or homology-directed repair (HDR) mechanisms. NHEJ repair often results in small insertions or deletions (blue X) that lead to frameshift, loss-of-function mutations. When a “donor” sequence contains a desired sequence (e.g., a transgene), flanked by a sequence that matches the chromosomal location surrounding the DSB, the host cell may use HDR-based insertion of this sequence into this site in the genome. This permits faithful transgene expression under the control of endogenous regulatory elements. (c) A transgene and associated promoter may be packaged into a viral vector. Researchers produce viral particles in vitro by delivering transgene cargo and components for viral coat and integration factors. Viral particles can be delivered to fertilized embryos with the goal of germline modification, and thus the creation of transgenic lines. Alternatively, viral vectors can be injected into selected regions of adult tissues, thereby effecting transgene expression in a localized cell population.
A second approach to genome manipulation utilizes transgenesis to insert sequences from one species into another, permitting a variety of studies. For the cases described above in which a trait has been mapped to a given locus, an alternate demonstration that the gene (or genes) at the mapped locus is causative for the trait of interest can be performed through transgenesis. By inserting the dominant allele via transgenesis into the animal carrying the recessive allele, a genetic rescue confirms the role of the candidate gene. In principle, any gene can be inserted into the genome, and this has been used to great effect in order to monitor and manipulate neural processes. For example, fluorescent proteins may be expressed in specific cell types, enabling the visualization of these cells or their synaptic connections. While transgenesis was initially developed in traditional model organisms [Chang and Cohen, 1974; Brinster et al., 1982], the underlying technologies have been ported successfully to other species. Four main methods exist for delivering transgenic constructs. First, a researcher may deliver naked DNA that encodes the transgene and its associated promoter (this may drive expression that is cell type specific or ubiquitous). Most protocols deliver the transgene into recently fertilized zygotes, thus maximizing the rate at which the transgene is incorporated into the germline (i.e., sperm and eggs). This is a crucial step in all methods of transgenesis as it is necessary to establish a genetic line. However, the likelihood that naked DNA inserts successfully into the genome is relatively low, and transgene sequences often integrate as unstable concatemers. A second method for transgene delivery utilizes modified transposons (Fig. 1a). Generally speaking, a naturally occurring transposon is stripped of the sequence that encodes a transposase enzyme, which is replaced by the desired transgenic sequence. 
Importantly, this process leaves in place terminal repeat sequences that the enzyme recognizes in order to insert the transgene. Subsequently, a researcher injects the modified transposon into zygotes, and delivers simultaneously (i.e., in trans) the transposase enzyme. In this manner, the transposase can insert the transgene into the genome. Since the injected enzyme is quickly degraded, in its absence the transgene becomes fixed in the genome. Both viral vector (described below) and transposon-mediated approaches suffer from the drawback that they integrate at a random location in the genome. Due to an effect known as position-effect variegation [Weiler and Wakimoto, 1995], two transgenes integrated at different locations may express in different spatial patterns, at different levels, or may be altogether silenced. Thus, in order to obtain a transgene that is expressed in the desired pattern, several independently founded transgenic lines may need to be screened. The use of the Tol2 transposon system has gained traction for its high efficiency in a variety of species, and other systems are available [Ivics et al., 2009].
The most reliable method to drive transgene expression in a predictable manner is through the use of homologous recombination to insert a transgene into the locus of a gene with a known expression pattern (a gene knock-in). This is commonly performed by either replacing the first coding exon with a transgene (knock-in/knock-out approach) or by co-expressing a transgene in a second cistron after the endogenous gene [Chan et al., 2011]. This has been shown to reliably express the transgene in a similar pattern to the endogenous gene. Current work using CRISPR editing is making rapid progress toward efficient transgene knock-ins. Following CRISPR-induced double-strand breaks, the cell repairs the genome using either NHEJ or homology directed repair (HDR, a set of DNA repair mechanisms that includes homologous recombination; Fig. 1b). The latter process is guided by a template, which may be either the homologous chromosome or a DNA template provided by the experimenter which contains a sequence for insertion, flanked by sequence that matches the insertion site. This process of knock-in via CRISPR-HDR is fairly efficient in mammals and some invertebrates. Interestingly, in fish NHEJ predominates and knock-in alleles are very difficult to recover, though some success has been reported [Kimura et al., 2014; Hisano et al., 2015; Auer and Del Bene, 2016; Wierson et al., 2018]. As gene editing technology improves at a stunning rate, there is reason to be optimistic that it will enable sophisticated genetic manipulations including Cre/lox, Gal4-UAS, and others in a wide variety of species.
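The structure of an experimenter-provided HDR template can be illustrated as follows: the sequence to be inserted is flanked by "homology arms" copied from the genome on either side of the double-strand break. The sketch below builds such a donor and shows the expected knock-in allele. The function names, the toy locus, and the 4-nucleotide arms are hypothetical; real arms are far longer, and details such as preventing re-cutting of the edited allele are not modeled.

```python
def build_hdr_donor(genome, cut_index, insert, arm_len):
    """Return a donor: left arm + insert + right arm.

    `cut_index` is the 0-based position of the break, i.e. the break
    lies between genome[cut_index - 1] and genome[cut_index].
    """
    if cut_index - arm_len < 0 or cut_index + arm_len > len(genome):
        raise ValueError("homology arms extend past the available sequence")
    left_arm = genome[cut_index - arm_len:cut_index]
    right_arm = genome[cut_index:cut_index + arm_len]
    return left_arm + insert + right_arm


def knock_in_allele(genome, cut_index, insert):
    """Return the locus sequence expected after a precise insertion."""
    return genome[:cut_index] + insert + genome[cut_index:]


# Toy locus and insert (made up); lower case marks the inserted bases.
locus = "AAAACCCCGGGGTTTT"
donor = build_hdr_donor(locus, cut_index=8, insert="atgtaa", arm_len=4)
edited = knock_in_allele(locus, cut_index=8, insert="atgtaa")

print(donor)
print(edited)
```

The donor appears verbatim inside the edited allele, which is the property that lets the repair machinery use it as a template.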
Viral vectors that drive genomic insertion may also be used (Fig. 1c). Lentiviruses have received particular attention, especially vectors derived from one of the best-studied viruses, HIV. HIV has been modified to eliminate its pathogenicity and ability to replicate; it can also carry approximately 8 kilobases (kb) of genetic cargo, permitting many promoter-transgene combinations [Miyoshi et al., 1998]. This approach has been used to generate transgenic voles, rats, and songbirds [Lois et al., 2002; Agate et al., 2009; Donaldson et al., 2009]. The envelope protein of the vesicular stomatitis virus (VSV-G) grants lentiviruses a wide host range; it permits infection of invertebrates and vertebrates alike [Cronin et al., 2005]. Viruses also allow for delivery of genetic cargo to somatic tissues in a region-specific manner. Thus, a transgene can be stereotaxically delivered to a single brain area of interest, without affecting neighboring regions. This approach is currently used extensively in mouse molecular genetic studies of the brain, as it permits the delivery of transgenes that visualize and manipulate selected populations of neurons, as described in the next section. Most work in the mouse utilizes adeno-associated viral (AAV) vectors, which can be generated at high titers. The subtypes (i.e., serotypes) of AAV may not function equivalently across species, however, so any application to new species must be carefully tested. Notably, VSV lentiviral vectors appear to successfully infect a wide variety of cell types in species across the animal kingdom [Mundell et al., 2015]. Thus, viral transduction into germline cells and targeted brain areas is feasible in a wide variety of species.
Circuit-Level Analysis
Many of the key technologies for analyzing neural activity patterns, circuit function, and neuronal connectivity (e.g., electrophysiology, lesions and pharmacology, and neuronal tracers, respectively) are applicable in virtually any model organism, and were indeed often developed outside of traditional model species. A new generation of complementary, genetically encoded tools that allow the observation and manipulation of molecularly defined neuronal subsets has accelerated our understanding of the nervous system. These technologies may also be applied in non-traditional model species, but they require efficient delivery of the requisite transgenes. We next turn our attention to the prospects of using these technologies.
Many electrophysiology experiments that record from or activate multiple cells suffer from a drawback: they indiscriminately affect neurons in a given region. Every brain region exhibits a high degree of heterogeneity in the cell types represented. For example, excitatory glutamatergic neurons are often intermingled with inhibitory GABAergic cells. Each class may be further subdivided by the expression of other genes, such as D1 and D2 dopamine receptor subtypes, which respond oppositely to their monoamine ligand [Gerfen and Surmeier, 2011]. One region of the basal forebrain, the preoptic area, which controls mating and parental behaviors, comprises approximately 70 distinct cell types, each of which is defined by a unique gene expression profile [Moffitt et al., 2018]. Each of these cell types may control different aspects of behavior or physiology, and determining the function of each necessitates technologies that can access these cells independently of one another. While electrophysiological properties can be used to classify cell types during and after recording, finding rare cells within a brain region presents a challenge. Molecularly distinct subsets of cells can be identified through the use of fluorescent transgene expression. Such a label permits an electrophysiologist to selectively record activity in these defined cells [Ma et al., 2015].
Additional transgenes expand the toolbox for circuit analysis beyond electrophysiology. A number of recently developed tools allow the visualization or manipulation of activity of a specified cell set using chemical or optical means. The basic outline is as follows. A cell type is implicated in a behavior, perhaps through genetic or pharmacological tests. In order to directly test its function, a transgene is expressed in this cell type, rendering it manipulatable. By simultaneously manipulating activity and recording behavior, a functional relationship between cell type and behavioral output can be directly assessed.
Optical methods have been developed to observe simultaneously the calcium influx associated with action potentials across large numbers of neurons. This approach uses either calcium-binding dyes for acute experiments or genetically encoded calcium sensors for long-term imaging. Genetically encoded calcium sensors such as the GCaMP transgenes [Tian et al., 2012; Lin and Schnitzer, 2016] enable imaging of molecularly defined subsets of neurons over multiple days [Jennings et al., 2019]. This provides information about when various classes of neurons become active during selected behavioral routines, though it does not have the ultrafast temporal resolution of traditional electrophysiology. After identifying a population of neurons that is activated during a behavior, a test of their causal role may be performed using “effector” transgenes expressed in those cells. One class of these effectors is the optogenetic transgenes. These encode proteins that are sensitive to particular wavelengths of light, and they respond by passing current across the membrane of the cell in which they are expressed. Thus, when light is shone upon a brain region, only the cells that have been genetically programmed to express the optogenetic transgene respond. Various transgenes have been identified or engineered that depolarize or hyperpolarize cells in response to light, and these vary with respect to activation wavelength and kinetic properties [Lerner et al., 2016]. The delivery of light into the brain is a challenge. Small, translucent animals permit direct visualization and light activation of neurons, an approach leveraged in the larval zebrafish and the hydra; these species enable visualization of the entire nervous system simultaneously [Bene et al., 2010; Dupre and Yuste, 2017]. In larger animals, light delivery and imaging deep in the brain are enabled by implanted fiber optics fitted with lenses [Zhang et al., 2019].
A complementary technology termed chemogenetics uses normally inert chemical compounds to activate selected cells that have been rendered sensitive to the chemical by transgene expression. Two approaches have been utilized. The first is to use receptors for which an endogenous ligand does not exist. For example, only in mammals is the heat-responsive cation channel Trpv1 sensitive to the noxious component of chili peppers, capsaicin [Jordt and Julius, 2002] (non-mammalian species express Trpv1 but it does not respond to capsaicin). The mammalian Trpv1 channel can therefore be expressed in selected neuron types of non-mammalian species, resulting in depolarization selectively in these cells when exposed to capsaicin [Chen et al., 2016]. The second approach relies on protein engineering: designer receptors have been created that respond exclusively to designer drugs (i.e., drugs that have no known endogenous targets) [Rogan and Roth, 2011]. This enables manipulations of membrane potentials – either excitatory or inhibitory, depending on the receptor sequence – to be induced in selected genetic subsets.
These effectors are primarily deployed in traditional model organisms and rely mainly on two delivery mechanisms, viral vectors and transgenesis. In the former, the transgene encoding the effector, with a cell type-specific promoter or bipartite genetic switch such as Cre-lox, is inserted into the genome of a recombinant virus that is injected into a brain region. Upon infection of all cells at the injection site, only a subset of cells will activate transcription via the promoter, resulting in cell-type specificity of the manipulation [Schnütgen et al., 2003; Atasoy et al., 2008]. Typically, viral vectors are deployed in animals with relatively large brains, permitting a researcher to target neurons throughout a single brain region. In non-traditional model species, there has not been any concerted effort to develop Cre transgenic lines, so researchers who wish to use this approach will have to generate these lines themselves. However, a single Cre line enables the use of many virally delivered transgenes for use in activation, ablation, tracing, etc. Fortunately, the creation of a panel of viral vectors for these purposes is simpler than generating individual transgenic lines. Many viruses are already available through core facilities at Addgene and Janelia Research Campus.
In smaller animals, it may not be feasible to target individual brain regions with a viral vector due to difficulties in performing stereotaxic surgery, including virus spillover into neighboring regions. In such species, it may be easier to generate transgenic animals, as is commonly done with zebrafish. In order to maximize the utility of transgenic lines, work in Drosophila, mosquitoes, and zebrafish utilizes Gal4-UAS (and the conceptually similar Q-system) [Baier and Scott, 2009; Ghosh and Halpern, 2016; Riabinina and Potter, 2016]. These systems permit separate lines to determine the expression pattern (Gal4 lines) and the effector gene (UAS lines). Although this requires the generation of additional transgenic lines initially, the modularity of the system ultimately results in improved flexibility and fewer lines in the longer term. For non-traditional model species, the optimal gene delivery method will depend on the ease of transgenesis, utility of viral vectors, and the need for brain region specificity.
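The trade-off between initial effort and long-term economy in a bipartite system can be made concrete with a short sketch. It assumes, for illustration only, that every expression pattern is to be combined with every effector; the driver and effector names are hypothetical placeholders rather than existing lines.

```python
from itertools import product


def lines_single_construct(n_patterns: int, n_effectors: int) -> int:
    """Lines needed if each pattern/effector pair is built as its own transgenic line."""
    return n_patterns * n_effectors


def lines_bipartite(n_patterns: int, n_effectors: int) -> int:
    """Lines needed with a bipartite system: one driver line per pattern plus one effector line per effector."""
    return n_patterns + n_effectors


def available_crosses(drivers, effectors):
    """All driver x effector combinations obtainable by crossing existing lines."""
    return [f"{d} x {e}" for d, e in product(drivers, effectors)]


# Hypothetical line names, for illustration only.
drivers = ["patternA-Gal4", "patternB-Gal4", "patternC-Gal4"]
effectors = ["UAS-sensor", "UAS-activator", "UAS-silencer", "UAS-reporter"]

print(lines_single_construct(1, 1), lines_bipartite(1, 1))   # one pair: bipartite costs more
print(lines_single_construct(len(drivers), len(effectors)),
      lines_bipartite(len(drivers), len(effectors)))         # several pairs: bipartite costs less
print(len(available_crosses(drivers, effectors)))
```

For a single pattern and a single effector the bipartite design requires two lines rather than one, which is the initial cost noted above; once several patterns and effectors are in use, the number of lines grows as a sum rather than a product.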
The Evolution of Cichlid Fish as a Model Organism
I provide here an example of a non-traditional genetic model organism that has much to teach us about neuroscience and evolutionary biology, the cichlid fish Astatotilapia burtoni. The cichlid family contains >2,000 species which are primarily endemic to the Rift Valley lakes of East Africa [Kocher, 2004]. These fish have radiated explosively in just the last ~12 million years – an eyeblink in evolutionary time. Furthermore, the species show extensive diversity on multiple phenotypic axes: morphological, physiological, and behavioral. There has been great interest in mapping the genetic determinants of these traits. Due to the recent divergence of these species, many of them can be hybridized, which has enabled QTL mapping of traits including craniofacial structure (which correlates with feeding behavior), sex determination system, coloration, and visual color sensation [Albertson et al., 2003; Gammerdinger and Kocher, 2018; Kratochwil et al., 2018; Nandamuri et al., 2018]. In response to the interest in genetic mapping, researchers sequenced the genomes of cichlid fish: the first five representative species [Brawand et al., 2014], and more recently hundreds more species [Irisarri et al., 2018; Salzburger, 2018; Conte et al., 2019]. It is likely that, in the near future, all cichlid species genomes will be sequenced, thus permitting powerful comparative genomic approaches to determine control of many different traits. Notably, ethologists have studied the fascinating behaviors of these species for over a century, cataloging and comparing the unique behaviors shown by most of these species, with a particular focus on social and reproductive behaviors [Baerends and Baerends-Van Roon, 1950; Maruska, 2014]. Cichlids make for a terrific set of species to study in the lab because they are hardy and do not require highly specialized equipment, making them inexpensive. They breed well, externally fertilizing 30–100 eggs on a monthly basis, which is important for performing gene modifications.
As social behaviors are regulated by hormones, we reason that the mechanisms that control sex hormone production and hormone sensing are ideal entry points for studying the neural circuits that underlie social behaviors. Hormone production is actively regulated in the brain in response to social stimuli; hormone receptors must be expressed in the neural circuits that produce sexually dimorphic behaviors. The neurons that produce gonadotropin-releasing hormone (GnRH1) are the master regulators of the hypothalamic-pituitary-gonadal axis, and thus control levels of the steroid hormones testosterone, estradiol, and progestin in all vertebrates. In order to monitor the function of these key cells, we generated a line of transgenic A. burtoni that expresses EGFP specifically in GnRH1+ neurons [Ma et al., 2015]. We adapted the Tol2 transposon system originally identified in medaka fish and repurposed in several other species for efficient transgene integration [Kawakami, 2007; Juntti et al., 2013]. This permitted us to perform fluorescence-guided electrophysiological recordings from pairs of these cells. We observed that these cells exhibit synchronous firing during spontaneous activity, as has been observed in other vertebrates. This coordination is essential, as only pulsatile, discrete periods of release are sufficient to elicit hormone increases [Belchetz et al., 1978]. Furthermore, we demonstrated the presence of electrical synapses connecting these cells, providing a mechanism for the previously elusive coordination between them. These experiments show that non-traditional model organisms can be readily manipulated with transgenic approaches, enabling researchers to ask questions with relevance to a wide variety of species, including humans.
Female Mating Behavior and Its Evolutionary Consequences
Why have cichlids evolved so rapidly? Evolutionary biologists hypothesize that choosy females drive intense sexual selection, and variations in female preference can establish species barriers [Wagner et al., 2012]. Therefore, to understand sexual selection, a key question becomes: how is female reproductive behavior controlled by the brain? Previous work in goldfish and other fish showed that the signaling molecule prostaglandin F2α (PGF), which is produced by the oviduct at ovulation, promotes sexual behavior [Stacey, 1976; Liley and Tan, 1985; Villars et al., 1985; Kidd et al., 2013]. How this acts on a molecular level in the brain was not known, however. We used the sequenced A. burtoni genome to identify a likely receptor for PGF [Juntti et al., 2016]. We then adapted CRISPR to cichlids, and induced mutations in the PGF receptor. Mutant female A. burtoni never spawned with males, demonstrating that PGF signaling is necessary for female sexual behavior. We also used in situ hybridization to map expression of the receptor to just four brain regions, two of which are activated during mating. Since PGF signaling is central to initiating mating, we infer that one or more subsets of these neurons are key nodes in a neural circuit that selects a mate and initiates sexual behavior. Thus, we speculate that the features that each cichlid species finds attractive are processed by neurons that interact with PGF receptor-expressing cells, which in turn receive information regarding reproductive status from the ovary (Fig. 2). The sensory systems that perceive these features may therefore be evolving such that signals from males of only one species drive activity in key sites of the PGF-sensitive circuit. Future studies will leverage the species-independent technologies detailed above to identify cells that perceive features during male courtship, and those cells that effect female motor patterns during spawning.
Fig. 2. Female reproductive behavior and hypothesized control mechanisms. a Female cichlids perform a stereotyped mating routine that is activated by prostaglandin F (PGF) and requires Ptgfr. b Model for reproductive behaviors: Ptgfr+ neurons integrate social cues with PGF signaling and act through unidentified neurons to produce reproductive behaviors. Adapted from Juntti et al., 2016.
How might further progress be made in determining the neural and genetic control of mating? There are many approaches that utilize species-independent tools. One is to identify neurons that are active during mate choice or sexual behavior. This can be performed by using in situ hybridization to label cells that express “immediate-early genes,” which are transcribed upon initiation of neural activity. When these cells are simultaneously labeled for candidate neuronal marker genes, the region and subset of activated cells may be identified. Due to the slow time course of transcription (tens of minutes), this approach does not provide the temporal resolution to determine during exactly which phase of behavior a given neuron is activated. Calcium imaging is an attractive alternative, as it allows the visualization of neural activity on the sub-second timescale. It is likely that calcium imaging transgenes will be useful in numerous non-traditional model species. In the future, transgenic animals can be engineered to render specific cell populations manipulatable, thus permitting causal experiments that link them to behavior.
What genes regulate social behaviors? Likely candidates include those that are differentially expressed across sexes or are induced by sex hormone signaling. High-throughput sequencing of whole transcriptomes may rapidly detect such genes. The function of these genes can be rapidly assessed through CRISPR gene editing. In cichlids, mutant alleles can be generated at high rates for virtually any gene [Juntti et al., 2016 and unpubl. data]. The large family of genetically similar cichlids is also a major asset because it permits comparative genomic studies. Comparing features of sequenced genomes in closely related species, or directly mapping “preference genes” in species hybrids, may highlight key regulators of neural circuits for mate choice. Additional comparative studies of cichlid species will also reveal the genes that regulate species differences in mating preference, parenting style, and mating partner pair bonding.
One might ask, must all the species-specific reagents (e.g., transgenic and CRISPR mutant lines) be replicated in each species? To ask this another way, to what extent should the field focus its efforts on a small number of species? Consolidation brings benefits of sharing data and genetic lines, and provides a standard species platform on which to test hypotheses. One approach is that utilized by researchers studying multiple species of fruit flies. Here, researchers generate mechanistic hypotheses through the use of genetic crosses between multiple Drosophila species, high-throughput sequencing, and other methods. However, when testing those hypotheses, D. melanogaster is the species utilized [Chung et al., 2014], though gene editing technologies are improving among alternative Drosophila species [Seeholzer et al., 2018]. Among cichlids, A. burtoni is a good choice for hypothesis testing, as it is similar to the predicted ancestor of a majority of cichlid species (i.e., haplochromines). Furthermore, many resources have already been generated for these animals, including a genome sequence and protocols for genetic modification. For each model system, this decision may be made independently. One solution is for transgenic lines to be created in one reference species, while CRISPR gene editing of individual genes may be performed in the reference as well as those species hypothesized to have gain-of-function mutations that give rise to a trait. Additional reference species may also be selected among the non-haplochromine cichlids for testing phenotypes such as pair bonding that cannot be assessed in A. burtoni (or indeed in the >95% of vertebrate species that are polygamous).
The use of non-traditional model species has grown recently and appears poised to rapidly expand further. This expansion will be fueled by new technologies that can be rapidly deployed in many species, while the growth of new communities researching previously understudied organisms will provide fertile ground for new ideas. Judicious choices of species will permit tests of key questions in biology using ideal model organisms and yield unforeseen and exciting results.
Acknowledgements
S.J. would like to thank E. Haag for insightful comments on the manuscript. S.J. is supported by the National Science Foundation (IOS-1825723), the Human Frontiers in Science Program, and the William J. Higgins fund.
Disclosure Statement
The author has no conflicts of interest to declare.