Abstract
The selection of model species tends to involve two typically unstated assumptions, namely: (1) that the similarity between species decreases steadily with phylogenetic distance, and (2) that similarities are greater at lower levels of biological organization. The first assumption holds on average, but species similarities tend to decrease with the square root of divergence time, rather than linearly, and lineages with short generation times (which include most model species) tend to diverge faster than average, making the decrease in similarity non-monotonic. The second assumption is more difficult to test. Comparative molecular research has traditionally emphasized species similarities over differences, whereas comparative research at higher levels of organization frequently highlights the species differences. However, advances in comparative genomics have brought to light a great variety of species differences, not just in gene regulation but also in protein coding genes. Particularly relevant are cases in which homologous high-level characters are based on non-homologous genes. This phenomenon of non-orthologous gene displacement, or “deep non-homology,” indicates that species differences at the molecular level can be surprisingly large. Given these observations, it is not surprising that some findings obtained in model species do not generalize across species as well as researchers had hoped, even if the research is molecular.
Transgenic models have been so successful that they have become a standard tool in molecular genetics and biomedical studies and are being used to fulfill one of the main goals of the post-genomic era: to assign functions to each gene in the genome. However, the assumption that gene functions and genetic systems are conserved between models and humans is taken for granted, often in spite of evidence that gene functions and networks diverge during evolution.
Vincent J. Lynch, 2009
Introduction
The choice of species in biological research has long been driven by “convenience” [Jørgensen, 2001]. As the Nobel Prize-winning physiologist August Krogh put it in 1929: “For a large number of problems there will be some animal of choice or a few such animals on which it can be most conveniently studied.” Convenience in this context includes ready availability, year-round fertility, low cost, and reduced ethical concerns, as well as anatomical or physiological features that facilitate experimental manipulations and analyses [Krebs, 1975]. An excellent example of such a convenient species is the long-finned squid, Loligo forbesii, whose giant axons allowed early neurophysiologists to make intracellular recordings of neuronal action potentials and, eventually, discover their ionic basis [Hodgkin and Huxley, 1939]. A key element of Krogh’s principle, as it came to be called [Krebs, 1975], was that different research questions tend to require different species for optimal progress. Indeed, experimental biologists in the first half of the 20th century studied a wide variety of species.
By 1950, however, the laboratory rat had emerged as the dominant species for behavioral and physiological research [Beach, 1950; Logan, 2005]. Geneticists, meanwhile, focused their efforts on a relatively small number of species, notably corn, domesticated mice, and the fruit fly Drosophila melanogaster [Davis, 2004; Lynch, 2009]. Work with strategically selected bacteriophages, fungi (yeast), and various bacteria then led to powerful techniques for genome manipulation, opening the doors to molecular biology [Davis, 2004]. These new techniques, in turn, facilitated extensive research on some multicellular species, notably the flowering plant Arabidopsis thaliana, the roundworm Caenorhabditis elegans, and the zebrafish Danio rerio. All of these “model organisms” were initially selected for their experimental convenience, especially their short generation times. However, as research communities formed around these species and began to accumulate species-specific knowledge, techniques, and resources, those communities became another major benefit of selecting these species for research [Bolker, 2017]. As a result, research on mice, zebrafish, Drosophila, C. elegans, and Arabidopsis increased dramatically, while research on most other species waned, at least in proportion [Dietrich et al., 2014; Peirson et al., 2017].
The concentration of research on just a few species seemed justified because many features, especially at the molecular level, were discovered to be broadly shared across species. Many genes and proteins were found to have orthologs (i.e., strict homologs) in distantly related species [Zuckerkandl and Pauling, 1965a; Graham et al., 1989; Mushegian et al., 1998; Mushegian, 2010], and numerous aspects of cellular function and embryonic development likewise appeared similar between quite distant relatives [Krebs, 1975; Carroll, 2008]. Buoyed by these discoveries, some investigators concluded that “it did not matter which animal you chose – fundamental processes were fundamentally conserved” [Grunwald and Eisen, 2002]. For example, it has become relatively common to think that one can study disease mechanisms even in species that do not actually fall ill with the disease of interest, as long as the species has, or is made to possess, homologs of genes that are linked to this disease in Homo sapiens [Rubin et al., 2000; Gould and Gottesman, 2006; Lilienfeld, 2014]. Instead of finding the right species to study a particular problem, as Krogh had advised, biologists increasingly believed that a select group of “model species” could be used to study a wide variety of biological problems. In essence, they came to regard the favored species as “Rosetta stones” [Gest, 1995] and saw them as “examples rather than ‘models’” [Krebs, 1975].
Of course, some biologists continued to be interested in species differences, especially if they are linked to evolutionary “survival value” or reproductive fitness [Krebs and Krebs, 1980]. Moreover, the rapidly increasing number of sequenced genomes has sparked new interest in genetic variation across lineages [e.g., Freilich et al., 2008; Peregrín-Alvarez et al., 2009], and an ever-growing arsenal of gene editing techniques is now allowing researchers to examine gene and protein functions in a wide variety of species [Juntti, 2019]. Students of human biology have also developed renewed interest in variation, especially with regard to sex differences [Cahill, 2006, 2014; Clayton and Collins, 2014] and precision medicine, which is predicated on the idea that humans vary significantly in how they respond to medical stressors and therapies [National Research Council, 2011]. But this interest in variation among humans raises an important question: if critical aspects of human biology vary significantly across individuals, is it reasonable to assume that most of the variation across species can be dismissed as being “not fundamental” and, therefore, not medically or biologically important? Increasingly, the answer is “probably not,” given that the vast majority of therapies developed in model species have failed in clinical trials, particularly those aimed at neurological disorders [Cummings et al., 2014; Petrov et al., 2017].
If species differences are to be taken seriously, then they must impact our decisions about which species to select for research. To that end, it would help to have some “laws of variation” [Darwin, 1859] that could be used to predict which findings are likely to be generalizable across which species. For example, allometric scaling laws can be used to make good predictions about drug dosages for animals of different body size [West et al., 1962]. Outside of pharmacology, however, scaling laws are rarely used to inform biomedical research. Therefore, I here focus on two other principles of variation that have been more widely used to guide species selection, namely:
1. the idea that species differences increase monotonically with phylogenetic distance, and
2. the idea that species similarities are greater at lower levels of biological organization.
Both of these principles seem to be widely accepted among biologists, but they are rarely tested or stated explicitly. Even if they are to some extent “straw men,” I here evaluate them critically.
The Relationship between Similarity and Phylogenetic Distance
As populations divide to form separate species and then split into additional species, the differences between the various lineages tend to accumulate. As a result, it seems reasonable to argue that “the most closely related taxa are those most similar to one another” [Fitch, 1977]. This assumption lies at the heart of all algorithms aimed at recovering the “phylogenetic signal” in complex patterns of species similarities and differences [Felsenstein, 1988; Blomberg and Garland, 2002]. As more and more data from an ever-increasing number of species have been fed into these algorithms, the inferred phylogenies have become increasingly congruent. Some relationships remain stubbornly controversial, and debates about which features carry the most phylogenetic signal persist [Brocchieri, 2001; Salichos and Rokas, 2013; Shen et al., 2017]. However, the basic assumption that species differences tend to accumulate with phylogenetic distance is surely valid, as long as the data set is large enough. Still, there is some value to being more specific about how lineages diverge.
Consider a biological feature that evolves by Brownian motion, along a “random walk” as lineages split repeatedly (Fig. 1a). Simulations of such an evolutionary scenario reveal that the mean difference in trait values between any two taxa increases monotonically with phylogenetic distance (also known as divergence time). It is important to note, however, that the mean difference increases with the square root of divergence time, rather than linearly [Letten and Cornwell, 2015]. This means that closely related (i.e., recently diverged) species tend to vary more than one might expect if one were to extrapolate linearly from the differences observed between more distant relatives. Another important caveat is that the square root rule applies only to the mean trait difference. As illustrated in Figure 1, individual traits may converge as well as diverge, even when they are evolving “randomly.” Adding natural selection into the mix increases the rate at which some traits converge, as well as the divergence rate of other traits. Indeed, an excellent way to test whether a trait is likely to be under selection is to examine whether it diverges more than expected under the random-walk, neutral model of evolution [e.g., Smaers et al., 2017, 2018]. Conversely, traits under strong constraints or stabilizing selection may vary less than expected [Hansen, 1997]. In short, the rule that differences accrue with phylogenetic distance holds only on average.
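The square root relationship described above can be illustrated with a minimal simulation sketch. It is not the simulation shown in Figure 1; it assumes a single trait, two lineages that split from a common ancestor, and independent Gaussian change in each lineage whose variance grows linearly with time (the endpoint distribution of Brownian motion). The function names, the rate parameter sigma, and the number of simulated pairs are hypothetical choices made for illustration.

```python
import math
import random


def mean_abs_difference(divergence_time, sigma=1.0, n_pairs=20000, seed=0):
    """Simulated mean |trait_a - trait_b| for two lineages that split
    `divergence_time` time units ago.

    Assumption: each lineage changes independently, and its trait value at
    the end is Gaussian with variance sigma**2 * divergence_time.
    """
    rng = random.Random(seed)
    sd = sigma * math.sqrt(divergence_time)
    total = 0.0
    for _ in range(n_pairs):
        trait_a = rng.gauss(0.0, sd)  # lineage A, relative to the ancestor
        trait_b = rng.gauss(0.0, sd)  # lineage B, relative to the ancestor
        total += abs(trait_a - trait_b)
    return total / n_pairs


def expected_abs_difference(divergence_time, sigma=1.0):
    """Analytical expectation under the same assumptions.

    The difference is Normal(0, 2 * sigma**2 * t), and the mean absolute
    value of Normal(0, s**2) is s * sqrt(2 / pi).
    """
    return 2.0 * sigma * math.sqrt(divergence_time / math.pi)


if __name__ == "__main__":
    for t in (1, 4, 16, 64):
        simulated = mean_abs_difference(t)
        expected = expected_abs_difference(t)
        print(f"t={t:>3}  simulated={simulated:.3f}  expected={expected:.3f}")
```

Under these assumptions, quadrupling the divergence time only doubles the mean difference. The sketch says nothing about individual trait pairs, which may end up closer together or further apart than the mean.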
Although the diverge-with-the-square-root-of-distance rule probably applies to traits at all levels of organization, molecular biologists have long argued that it applies most predictably at the molecular level, where evolution is thought to be predominantly neutral or “nearly neutral” [Kimura, 1985; Nei, 2005]. Specifically, they have argued that evolutionary changes in genes and proteins occur at relatively steady rates, and that this “molecular clock” can be used to estimate lineage divergence dates [Zuckerkandl and Pauling, 1965b]. However, it soon became apparent that different molecules evolve at different rates [Ho and Duchêne, 2014] and that molecules, just like higher level traits, can be under strong selection and exhibit significant amounts of convergent evolution [e.g., Parker et al., 2013; Gallant et al., 2014; Nagy et al., 2014]. Only when one examines aggregate data for many genes or proteins does the expected square root relationship between species differences and divergence time emerge (Fig. 1c, d). Even then, the observed relationship tends to be rather “noisy,” mainly because some taxonomic groups exhibit unusually high rates of evolutionary change.
Indeed, numerous studies have shown that the molecular clock runs at different speeds in different lineages [Mushegian et al., 1998; Berná et al., 2009; Ho and Lo, 2013; Berná and Alvarez-Valin, 2014]. Especially high rates of evolutionary divergence are seen in yeasts, nematodes, fruit flies, tunicates, and teleost fishes (Fig. 2). Most of these species have very short generation times (e.g., 3 days for C. elegans, 6 days for the larvacean tunicate Oikopleura, 7–19 days for Drosophila, and about 3 months for zebrafish and laboratory mice). Since one would expect heritable mutations to accumulate across generations, it is not surprising that these species change rapidly during the course of evolution, at least at the molecular level. In contrast, lineages with longer generation times tend to evolve far more slowly [Amemiya et al., 2013]. Coelacanths, for example, have a gestation time of more than 1 year and do evolve very slowly [Casane and Laurenti, 2013]. Turtles and crocodilians also tend to have long generation times and low evolutionary rates [Green et al., 2014]. Divergence rates can also be influenced by factors other than generation time, including variations in gene repair mechanisms, metabolic rate, and population size [Thomas et al., 2006, 2010]. However, for our purposes, the crucial point is that the variations in evolutionary rates cause graphs of trait difference versus divergence time to exhibit pronounced peaks and valleys (Fig. 2b). Thus, the relationship between these variables is not monotonic; some taxa are major exceptions to the diverge-with-the-square-root-of-distance rule!
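A small numerical sketch may help to show how rate variation can break monotonicity. It assumes that each lineage evolves by Brownian motion at its own rate (variance per unit time), so that the expected absolute difference between two lineages is sqrt(2 (r_a + r_b) t / pi). All rates and divergence times below are hypothetical values chosen for illustration, not estimates for any real species.

```python
import math


def expected_abs_difference(rate_a, rate_b, divergence_time):
    """Expected |trait_a - trait_b| for two lineages with different
    Brownian-motion rates (variance per unit time) that split
    `divergence_time` time units ago."""
    variance = (rate_a + rate_b) * divergence_time
    return math.sqrt(2.0 * variance / math.pi)


# Hypothetical scenario: a focal lineage compared with two candidate models.
FOCAL_RATE = 1.0

# A distant relative that evolves slowly (illustrative numbers only).
slow_distant = expected_abs_difference(FOCAL_RATE, 0.2, divergence_time=400)

# A closer relative that evolves quickly (illustrative numbers only).
fast_close = expected_abs_difference(FOCAL_RATE, 10.0, divergence_time=100)

if __name__ == "__main__":
    print(f"slow, distant relative: {slow_distant:.2f}")
    print(f"fast, close relative:   {fast_close:.2f}")
```

With these made-up numbers, the fast-evolving close relative is expected to differ more from the focal lineage than the slowly evolving distant one, which is the kind of peak that makes plots of difference versus divergence time non-monotonic.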
How do these considerations influence which species one should select for research? First, if the goal is to extrapolate the findings to humans, it is best to select a species as close to humans as possible (given budget, ethical, and technical constraints), all the while keeping in mind that even closely related species may diverge considerably in a few traits. Second, one should be wary of studying species that have short generation times and are already known to evolve rapidly. This includes many of the traditional model species but, fortunately, the development of other, less divergent models (e.g., induced pluripotent stem cells from human patients) is now feasible and may, in the long run, be more cost-efficient. Third, it is advisable to compare findings across several non-human species before assuming that they will generalize to H. sapiens. Importantly, those comparisons should include at least some species that do not develop and reproduce as rapidly as the traditional model species, because many of the features shared between the traditional models may turn out to be convergent adaptations to rapid development, rather than primitive traits [Bolker, 1995]. Of course, if the aim of the research is something other than cross-species extrapolation, then working with the traditional model species can still be incredibly useful. As detailed toward the end of this essay, such research can reveal (and often has revealed) general principles that are broadly applicable, even if the details of their implementation vary across species.
The Relationship between Conservation and Biological Level
The discovery that many genes and proteins have orthologs even in distantly related species came as a major surprise in the 1970s and 1980s, exemplified by the finding that most of the vertebrate hox genes have fruit fly orthologs that resemble their mammalian counterparts not only in gene sequence but also in their expression patterns and order on the chromosomes [Graham et al., 1989]. Moreover, many of these distant orthologs were shown to have conserved functions, as evidenced most clearly by cross-species substitution and rescue experiments. For example, expression of a hox gene from a chicken can “rescue” morphological defects observed in Drosophila mutants lacking the gene’s ortholog [Lutz et al., 1996]. Another famous demonstration of functional similarity between orthologous genes was that pax-6 from mice can substitute for its Drosophila ortholog (eyeless) insofar as its expression can induce ectopic eye development in flies [Halder et al., 1995]. Subsequent discoveries highlighted that entire networks of interacting genes and proteins (notably transcription factors) are also more conserved than scientists had expected [Davidson, 2010].
All this conservation at the molecular level created an interesting paradox: if protein-coding genes are so highly conserved, how can species be so different from one another in their morphology, their physiology, and their behavior? As King and Wilson [1975] put it in a highly influential paper that focused on comparing humans to their closest living relatives: “… the genetic distance between humans and the chimpanzee is probably too small to account for their substantial organismal differences.” How can this be, when the genes are the principal driver of organismal development? King and Wilson [1975] answered their own question by proposing that the organismal differences “result chiefly from genetic changes in a few regulatory systems.” This proposal was supported by extensive subsequent research, which emphasized evolutionary changes in cis regulatory elements that determined when and where proteins were expressed [see Carroll, 2005, 2008]. Some researchers also allowed for substantial changes in transcription factors and other protein-coding genes [Lynch and Wagner, 2008], but the general principle still held: evolutionary changes in gene regulation create most of the species differences in morphology, physiology, and behavior.
More generally, biologists believed that novelty at supramolecular levels of organization is created by the combination of highly conserved molecular elements (be they individual proteins or small networks of them) in novel ways. This idea was expressed most clearly by François Jacob [1977], who compared evolution to a human tinkerer, who creates novel objects by combining old elements. This idea has been enormously influential and, at its core, is surely correct. It is consistent, for example, with the discovery that the number of protein-coding genes is much more similar between species than one might expect, given their broad range of phenotypic complexity [Pray, 2008]. However, it seems to me that many biologists have taken the idea too far, concluding that conservation at the molecular level is much higher than it actually is. Since this assertion is difficult to prove, I here take a different tack, namely to review recent discoveries of major species differences in protein-coding genes. In addition, I seek to challenge the more specific notion that species similarities are always greater at the lower levels of organization by describing some instances in which molecular changes were unaccompanied by phenotypic change.
Divergence and Novelty at the Molecular Level
The early days of comparative molecular biology featured numerous reports of specific genes and proteins that have unexpected homologs (usually orthologs) in distant relatives. By contrast, failures to find homologs were rarely reported, both because such failures could stem from diverse technical issues and because reports of non-homology were much less surprising. With the advent of whole genome sequencing, however, researchers began to inventory and compare entire genomes, typically finding that 20–40% of the protein-coding genes did not have orthologs between the major eukaryotic lineages. As additional genomes were added to the mix, more and more of these lineage-specific genes were uncovered [e.g., Villanueva-Cañas et al., 2017], and it became possible to reconstruct which genes were gained or lost at which points in phylogeny [e.g., Babenko and Krylov, 2004; Albalat and Cañestro, 2016]. These studies showed, for example, that more than 1,000 protein-coding gene families were gained with the origin of metazoans and, once more, with the origin of bilaterian animals [Paps and Holland, 2018]. They also revealed extensive gene loss in most lineages [Kortschak et al., 2003; Ogura et al., 2005]. Particularly striking is that roughly 15% of the protein families thought to be ancestral for eumetazoans (all metazoans except sponges and choanoflagellates) seem to have been lost in the lineage leading to fruit flies and, independently, in nematodes [Putnam et al., 2007].
One could argue that the evolutionary gain and loss of genes merely reflects the divergence of homologous genes beyond the threshold at which we can recognize them as homologs (based on above-chance sequence identity). Indeed, gene losses and gains tend to be most frequent in the lineages that exhibit the fastest rate of sequence divergence [e.g., Wyder et al., 2007]. However, genes can clearly be lost without immediately generating a new gene; after all, this is how pseudogenes originate. Moreover, the number of gained genes at a given phylogenetic node is often quite different from the number of lost genes [e.g., Paps and Holland, 2018], which is not what one would expect if novel genes are merely severely modified old genes. In fact, some lineages exhibit bursts of new gene creation, followed by more protracted periods of steady gene loss [Wolf and Koonin, 2013]. One could still argue that novel genes are really just highly modified duplicates of old genes [Ohno, 1970], which would allow gains to outnumber losses. Although there is extensive support for this general hypothesis [e.g., Assis and Bachtrog, 2013], new genes can also form by the fusion of several genes or gene fragments [e.g., Vakirlis et al., 2018] and even from non-coding DNA [see Ponce et al., 2012]. Again, one could argue that such genes are also “not really new,” since they are formed by the recombination of old elements, but this line of argument risks becoming absurdly reductionist, as all genes can be thought of as “merely” unique combinations of the four ancient DNA nucleotides, which in turn are formed from elements that, ultimately, can be traced back to the helium in stars. Do we really want to argue that there is nothing new under the sun?
Leaving aside the issue of new genes, it is clear that even recognizably homologous genes and proteins can differ substantially in structure (Fig. 1c) and function. As mentioned previously, some of the most impressive evidence for conserved functionality comes from cross-species rescue experiments, but those experiments are very challenging and not always successful [Baumeister, 2002]. Moreover, the ability of a gene from one species to rescue deletions of its homolog is often not reciprocated or only partial. For example, the otd gene of fruit flies can rescue some but not all of the deficits caused by deletion of its mouse homolog (otx1). Specifically, otd can rescue the deficits in cortical structure and function, but not the abnormalities in the vestibular apparatus, midbrain, and cerebellum [Acampora et al., 1998]. Similarly, the nkx2.5 gene of mice can rescue visceral mesoderm but not cardiac development in fruit flies lacking its homolog (tinman) [Park et al., 1998; Ranganayakulu et al., 1998]. Such incomplete rescue experiments reveal that homologous proteins may acquire or lose some functions during the course of evolution, even as they share others [Lynch and Wagner, 2008; Lynch, 2009]. Substantial functional divergence is especially likely to occur between duplicated genes [i.e., paralogs; Khor and Ettensohn, 2017].
Whether the functional divergence of homologous genes is common or rare is difficult to determine from individual experiments, especially since cross-species rescue experiments may fail for reasons other than functional divergence. However, 27 out of 120 genes that are essential for survival or reproduction in humans were found to be non-essential in mice [Liao and Zhang, 2008], implying that those genes must have changed some of their functions substantially. Therefore, the mere fact that 177 out of 280 human disease-linked genes have homologs in flies [Reiter et al., 2001] need not imply that the functions of these fruit fly genes are shared with humans. To list just one clear counter-example, mutations in the presenilin gene are linked to Alzheimer’s disease in humans, whereas deletion of its homolog in nematodes causes egg-laying deficits [Levitan et al., 1996]. Since the human gene in this case can substitute for its nematode homolog, one could argue that the functional divergence involved only the higher-level phenotypic functions, while the lower-level, biochemical functions remained conserved. However, those molecular functions also diverged, at least to some extent, since nematodes lack beta-secretase [Link, 2006], which in mammals collaborates with presenilin (and some other molecules) to generate the beta-amyloid fragments that are often considered to be a root cause of Alzheimer’s disease. Even when rescue experiments are successful, the interacting partners of the substituted gene may be astonishingly different [e.g., Montalta-He et al., 2002]. Some of this variation is clearly due to evolutionary changes in non-coding sequences [Odom et al., 2007; Wilson et al., 2008], but the main point remains: interspecies variation at the gene and protein level is considerable, and much of that variation is functionally significant.
Comparative molecular biologists have long allowed for a great deal of “neutral” gene or protein sequence divergence, but they traditionally argued that this divergence rarely has functional consequences [Nei, 2005]. However, as reviewed above, functional changes at the molecular level are being discovered at a surprising clip. Even fundamental aspects of cellular metabolism, such as the citric acid cycle, are far more variable than the initial examination of just a few species had indicated [Huynen et al., 1999]. Despite such examples, most biologists are likely to remain convinced that morphological traits are less conservative than genes. This traditional view stems at least in part from the fact that we tend to focus on trait values (i.e., similarities and differences) for morphology and on homologies for molecules. For example, we tend to see bird wings and human arms as very different traits, despite their homology, while homologous proteins are generally assumed to be quite similar even when their amino acid sequences (i.e., their trait values) diverged substantially. Instead, it would be better to ask whether the proportion of genes and proteins that have homologs between two or more species is greater or lower than the proportion of morphological traits that have such homologs. Since homologs may differ in both structure and function [Owen, 1848], this question would be difficult to answer at all levels of biology. Until it is answered, however, it would be unwise simply to assume that evolutionary conservation is much greater for molecules than it is for higher-level traits. Behavior may turn out to be more variable than morphology or physiology [Blomberg et al., 2003], but this, too, remains far from certain.
Deep Non-Homology
A good way to challenge one’s intuitions about the degrees of variation at different levels of biological organization is to consider cases in which variation at the genetic level is not accompanied by corresponding variation at higher levels of organization. Consider, for example, the molecular basis of cell cycle regulation. Cell division is as ancient as life itself and clearly homologous between all species. However, comparisons among plants, yeasts, and mammals revealed that, at least for the transition from G1 to S phase, many of the genes controlling cell division share essentially no sequence similarity [Cross et al., 2011]. Despite this variation in the individual molecules, the topology of the regulatory network appears to be maintained, including both positive and negative feedback loops (Fig. 3). One might argue that the functionally matching genes in these networks are “cryptic homologs” that have diverged beyond the threshold of detectability, but it is at least as likely that some genes were replaced with non-homologous yet functionally similar genes. The latter hypothesis is supported by the discovery of other significant differences in cell cycle regulation between plants and animals, between yeasts and other eukaryotes, and between yeasts and bacteria [Brazhnik and Tyson, 2006; Shultz et al., 2007; Dissmeyer et al., 2010].
The use of non-homologous genes to generate homologous higher-level traits was called “non-orthologous gene displacement” by Koonin et al. [1996], who compared the first two fully sequenced bacterial genomes and found that 12 of 233 investigated orthologs were essential in one species but not in the other, implying functional substitution. An analogous argument can be made for the previously mentioned discovery that 22% of human essential genes have non-essential mouse orthologs [Liao and Zhang, 2008]. Another good example is provided by the circadian clock [Rubin et al., 2006; Tomioka and Matsumoto, 2010; Lam and Chiu, 2018], because the timeless gene tim1 is essential for circadian clock function in flies but has no orthologs in mammals. Moreover, a cryptochrome gene entrains the circadian clock to the light-dark cycle in flies [Hardin, 2005], but the orthologous genes in vertebrates are not influenced by light and, instead, perform a function similar to that of tim1 in flies; meanwhile, vertebrates entrain their circadian clock through very different, non-homologous mechanisms [Van Gelder and Buhr, 2016]. One could argue that flies and vertebrates evolved their circadian clocks and light entrainment mechanisms independently of one another, but the almost universal distribution of light-sensitive circadian clocks among metazoans makes this scenario quite unlikely; so does the recent finding that the circadian clock of honeybees is in some key respects more similar to that of vertebrates than of fruit flies [Rubin et al., 2006]. Yet another example of non-orthologous gene displacement is provided by the wide variety of phylogenetically unrelated crystallins that help build lenses in the eyes of diverse vertebrates, whose last common ancestor surely already had a lens [Wistow, 1993].
The idea that non-homologous mechanisms can be involved in the generation of homologous higher-level traits was first articulated with respect to evolutionary changes in development. Specifically, it has long been known that homologous structures may, on occasion, develop from non-homologous embryonic precursors [e.g., de Beer, 1971; Striedter and Northcutt, 1991; Bolker and Raff, 1996]. Evidence for evolutionary changes in embryonic origin is relatively scarce, but numerous studies have shown that the development of homologous structures frequently involves the action of at least some non-homologous genes. One example of such “developmental systems drift” [True and Haag, 2001] comes from detailed comparative studies of reproductive organ (vulva) development in different nematode species. Vulva development is triggered by an EGF (epidermal growth factor)-like molecule in one of these species, but by WNT (Wingless) signaling in the other [Wang and Sommer, 2011]. Similarly, notochord development involves only partially overlapping sets of genes in different species of tunicates [Kugler et al., 2011], and vertebrate notochords express numerous genes that are not expressed in their tunicate homologs [Kugler et al., 2008]. In short, the data indicate that homologous traits may vary in their developmental mechanisms, just as they may vary in adult structure and function [Wagner, 2007; González, 2015].
The idea that non-homologous mechanisms may undergird homologous traits is counterintuitive, because we tend to think that homology requires “continuity of information” [Van Valen, 1982] and that this information must be provided by the genes. If genes are swapped in or out (or both) during the course of evolution, how can the higher-level traits remain homologous? The best answer is that higher-level traits [or “characters,” see Wagner, 2000] generally emerge through the interactions of many lower-level mechanisms, some of which can compensate for one another. Because of this capacity, the higher-level characters tend to be robust to both environmental and genetic perturbations, a phenomenon that Conrad Waddington [1957] called “canalization.” As a result, traits can maintain their phylogenetic identity as lower-level elements drop out or new ones are added into their causal nexus [Striedter, 1998]. The key idea is that several different constellations of lower-level mechanisms are capable of giving rise to the higher-level trait, and evolution can move smoothly between them. Harking back to Jacob’s [1977] metaphor, evolution tinkers even when it keeps the end product the same! Importantly, the canalization of higher-level traits does not just keep the traits the same [Owen, 1848]; it also makes it easier for evolution to explore new regions of trait space, increasing the traits’ evolvability [Wagner, 2008a, b]. Given these observations, one would actually expect deep non-homology to be fairly common.
Conclusions and Implications
Based on the reviewed studies, I conclude that species differences at the molecular level are more pronounced and more likely to be functionally significant than early studies in comparative molecular biology had suggested. The differences tend to increase with the square root of phylogenetic distance, but lineages with short generation times tend to diverge faster than others. Importantly, these fast-evolving lineages include most of the traditional model species. Whether rates of divergence increase at higher levels of organization is difficult to determine, but molecular similarities tend to be overstated in basic biology textbooks. The Hox genes, for example, are much more variable across species than typical summaries depict; in particular, the tight clustering of Hox genes on chromosomes appears to be a derived feature of vertebrates [Duboule, 2007].
These considerations are important, because features that are not broadly conserved are often viewed as being “not fundamental,” although it is rarely clear what “fundamental” in this context means. If it means “relating to mechanism,” then the extensive variation at the molecular level clearly is fundamental. If it is merely synonymous with “broadly conserved,” then one must be careful not to assume that this means “ancestral,” because some similarities may have evolved convergently. Moreover, assuming that only the broadly conserved elements and processes were present in ancestral species risks assigning to those ancestors a highly impoverished set of mechanisms that is unlikely to have been functional, as evident from examining the conserved elements in Figure 3 [see also Mushegian and Koonin, 1996].
As stated in this paper’s opening quotation, the evidence that genes and gene networks sometimes differ substantially between species threatens to undermine the model species paradigm. If causal drift occurs as species diverge, then a putative “therapeutic target” identified in a non-human species may not pay dividends when it is targeted in human patients. One may mitigate this problem by testing the intervention in multiple non-human species (ideally including non-human primates) before beginning clinical trials, but this approach is expensive and leaves open the possibility that human biology diverged from that of the examined species in one or more critical respects. These challenges are underscored by the discovery that many therapies tested in men are less effective, or even deleterious, in women [Clayton and Collins, 2014] and that mutations in disease-linked genes tend to manifest differently in different individuals, even for monogenic diseases like phenylketonuria [Scriver and Waters, 1999]. The general problem is that the functions of many genes are context dependent [Wagner, 2015] and may, therefore, vary between individuals, sexes, and species. There is no easy way around this dilemma, but a good first step would be to recognize the differences as being important. The high failure rate of clinical trials is often attributed to flaws in preclinical research methods [Landis et al., 2012], but species, sex, and individual differences are surely contributing factors.
Finally, it is crucial to note that model species research can reveal important biological principles even when the implementation of those principles varies across species. For example, the early work on cell cycle regulation in yeast revealed functional principles that have turned out to be quite general, even if most of the implementing proteins are not broadly conserved. In fact, biology textbooks are replete with such general principles. Even principles that are lineage- or species-specific can be useful insofar as they may yield bio-inspired technologies or therapies. For example, understanding the mechanisms underlying the ability of salamanders to regrow a leg may someday lead to therapies for human amputees, even if limb regeneration is not a broadly conserved trait. Species differences in disease susceptibility may likewise yield invaluable insights [e.g., Tian et al., 2013]. In short, variation across species need not be an obstacle to translational research; it can be useful. Its utility could be enhanced by further research into biological principles of variation, as these can help predict (or at least understand) some species differences. For all of these reasons, it continues to be important for biologists to conduct research on a wide variety of species, rather than a shrinking set of “standard” model organisms [Striedter et al., 2014; Logan, 2019; Preuss, 2019].
Acknowledgments
I am grateful to Karger Publishers for its support of this workshop and Special Issue and, more generally, for its sustained support of evolutionary neurobiology. I also thank Todd Preuss, the workshop’s co-organizer, and the other participants for helping to highlight the various issues surrounding species selection in neuroscience research.
Disclosure Statement
The author has no conflicts of interest to declare.