Abstract
Interest in giant viruses has risen sharply since 2003, following the discovery of the Mimivirus and four other protist-infecting giant viruses that are linked to the nucleocytoplasmic large DNA viruses (NCLDVs). Despite considerable heterogeneity in hosts and genome sizes, the NCLDVs have been shown to be monophyletic based on analyses of their sequences and gene repertoires, and recent studies have proposed that these viruses share a common ancient ancestor and compose a fourth domain of life. In addition, several characteristics of these giant viruses do not match the criteria used for the canonical definition of viruses, and the NCLDV denomination is not completely appropriate. We therefore propose to define a new viral order named Megavirales.
Introduction
The existence of viruses with singularly large particle and genome sizes has been appreciated since the discovery of jumbo bacteriophages in the 1970s and the phycodnaviruses in the early 1980s [1,2]. The interest in giant viruses increased dramatically in 2003 with the discovery of Acanthamoeba polyphaga Mimivirus, whose genome was the largest ever described among viruses (1,181 kb). It encodes more than 900 proteins, including some never identified previously in viruses [3,4]. Overall, the Mimivirus discovery has led to considerable breakthroughs in our understanding of the definition, origin, and evolution of viruses [4,5,6,7]. Consequently, the number of publications and citations related to giant viruses has increased by more than 1 log, i.e. more than tenfold (online suppl. fig. S1; for all online suppl. material, see www.karger.com?doi=10.1159/000336562). Since 2008, several new giant viruses, including close relatives of Mimivirus (Mamavirus, Terra2, Moumou, Courdo 11, Megavirus chilensis) and others more distantly related (Cafeteria roenbergensis virus (CroV), Marseillevirus, and Lausannevirus), have been recovered from different phagocytic protists and water samples by four teams (table 1; online suppl. table S1; fig. 1a) [5,8,9,10,11,12,13,14].
Main features of nucleocytoplasmic large DNA viruses whose genomes are available in the NCBI GenBank genome database

Phylogeny reconstruction from a cured concatenated alignment of universal NCVOGs [including primase-helicase (NCVOG0023), DNA polymerase (NCVOG0038), packaging ATPase (NCVOG0249), and A2L-like transcription factor (NCVOG0262)] for the giant viruses currently classified as NCLDVs (a) [modified from [14]] and the Mimiviridae (b). Probabilities are mentioned near branches as a percentage and are used as confidence values of tree branches. Only probabilities at major nodes are shown. Scale bar represents the number of estimated changes per position for a unit of branch length.
All of the previously mentioned protist-associated giant viruses have been linked to nucleocytoplasmic large DNA viruses (NCLDVs) (tables 1, 2, 3, 4) [15,16,17,18]. However, this grouping is not completely appropriate, and several unique features of the NCLDVs do not match the criteria for the canonical definition of viruses [6,32,33]. In addition, these giant viruses were suggested to share a common ancestral origin and to compose a new domain of life alongside Bacteria, Archaea, and Eukarya [17,18,34]. Therefore, we propose here to define a new viral order named Megavirales.
A brief history of key steps in the definition of the NCLDV superfamily and the Megavirales order

Rationale and Argument Supporting the Definition of a New Viral Order
The Current Definition and Classification of Giant Viruses Are Inappropriate
The canonical definition of viruses was described by Lwoff [33] during the pregenomic era in 1957 and was historically based on negative criteria (online suppl. table S2). Later, genomics failed to identify any common gene in the virosphere that could be equivalent to universal proteins or ribosomal RNA for Eukarya, Archaea, and Bacteria [6,35,36,37]. Thus, viruses remained separate from these biological entities. Recently, a new classification was proposed that defines viruses as capsid-encoding organisms, as opposed to the ribosome-encoding organisms that compose the three canonical domains of life and that viruses use to complete their life cycle (fig. 2). In addition, major monophyletic classes of viruses were tentatively defined [15,37]. Most giant viruses were linked to one of these classes, NCLDV, which includes Poxviridae, Asfarviridae, Iridoviridae, Ascoviridae, Phycodnaviridae, Mimiviridae, and Marseilleviridae (Marseillevirus and Lausannevirus) (tables 1, 2, 3; fig. 1a; online suppl. table S1) [13,17]. Regarding Mimiviridae, new giant viruses infecting amoebae were described by La Scola et al. [9] in 2010, and phylogeny reconstructions based on highly conserved genes delineate three lineages, referred to as A, B, and C. One of these lineages (A) is composed of Mimivirus and closely related viruses (table 1; fig. 1b). The recently described Megavirus chilensis [10] is closely related to a giant virus previously recovered and classified within lineage C (table 1; fig. 1b). CroV has also been classified among the Mimiviridae, apart from the group composed of lineages A, B, and C (table 1; fig. 1b) [11,14].
Schematic illustrating the relationships between ribosome-encoding organisms and capsid-encoding organisms, including the Megavirales members.
Despite large heterogeneity in their hosts and genome sizes, the monophyly of the NCLDVs has been attested by phylogenetic and phyletic analyses, and the gene repertoires of these viruses distinguish them from bacteria, archaea, and eukaryotes [15,18,37]. The NCLDVs were originally defined as sharing nine genes found in all families, including three viral hallmark genes (table 3) [15]. Later, Yutin et al. [17] identified a set of 1,445 NCLDV clusters of orthologous groups of proteins, referred to as NCVOGs, of which 177 are represented in two or more NCLDV families and 5 are present in all NCLDVs. Other viruses, including Myoviridae, Nimaviridae, Herpesviridae, and Polydnaviridae, exhibit large genome and particle sizes, but their gene content precludes their incorporation within the NCLDVs [15,37]. Some viral hallmark genes are shared between the NCLDVs and other large DNA viruses, as exemplified by the B-family DNA polymerases that are shared with herpesviruses and baculoviruses, but there are considerable numbers of other genes shared by the NCLDVs to the exclusion of all other large viruses [17,37]. Moreover, the DNA replication and transcription of herpesviruses and baculoviruses occur exclusively in the nucleus, in contrast to NCLDVs [15]. Myoviridae, for their part, are tailed bacteriophages [28].
Based on current knowledge, giant viruses and other canonical viruses differ in many aspects, which is not consistent with the concept (conveyed by Lwoff’s classification) that the viral world is a homogeneous class of entities (online suppl. table S2) [6,16,32]. As an example, the huge gap between the Mimivirus and the hepatitis C virus is striking. The specific NCLDV features that strongly challenge the canonical definition of viruses are listed below (online suppl. table S2).
The NCLDVs have a capsid diameter that ranges between 150 and 500 nm, which contradicts the historical concept of viruses as small, ultrafilterable entities [32,38,39,40]. In addition, the NCLDVs have large genomes that range in size between 103 and 1,259 kb and harbor 95–1,120 genes (table 4).
Viral messenger RNAs were detected in Mimivirus and Marseillevirus particles [4,12]. These transcripts notably encode capsid protein, DNA polymerase, or TFII-like transcription factor. The presence of RNA in vaccinia virus particles has also been reported [41]. This contradicts a key point of Lwoff’s viral classification, which stated that viruses only harbor one type of nucleic acid [32,33].
The Mimiviridae and Marseilleviridae genomes encode proteins involved in translation, which represents a unique feature of these viruses [4,12,32]. In addition, the genomes of Mimiviridae and Phycodnaviridae encode tRNAs [4,28].
The NCLDVs were suggested to have a common ancestral origin dating back to an early stage of Eukarya evolution (table 3) [15,17,34,37]. Thus, Yutin et al. [17] used maximum-likelihood reconstruction to delineate a set of 47 conserved genes that were probably present in the genome of the NCLDV common ancestor, which may have been a giant virus (fig. 3). Additionally, the NCLDVs infect a considerable diversity of hosts that belong to the three canonical domains of life [17,28,42]. Moreover, cross-mapping of the NCLDV and host eukaryotic trees generated a complex network in which members of the same NCLDV branch exhibited relationships with eukaryotic organisms of different supergroups [17]. For example, although irido-/ascoviruses and Marseillevirus are related, the former infect animals, while the latter infects a protist. Yutin et al. [17] have proposed the hypothesis of a ‘Big Bang-like’ event concomitant with eukaryogenesis for the origin of the NCLDVs [43].
Functional annotation and probable origin of the reconstructed core gene set of the common ancestor of the NCLDVs (47 NCVOGs) [adapted from [18]].
Furthermore, it was proposed in 2010 that NCLDVs might define a fourth domain of life. This has been based on phylogenetic and phyletic studies of the repertoires of genes involved in information storage and processing and nucleotide transport and metabolism, and shared by Eukarya, Bacteria, Archaea, and the NCLDVs [34]. This work provided additional data supporting the monophyly and common origin of these giant viruses. In addition, it supports the hypotheses that the core genome of the NCLDVs may be as ancient as those of the three current canonical domains of life and that NCLDVs may have emerged as ancient roots from the rhizome of life [34,44]. It was claimed in a recent work that the methodology used by Boyer et al. [34] for phylogenetic reconstructions was not the most appropriate to avoid spurious tree topologies generated by compositional heterogeneity and homoplasy, and alternative informational gene phylogenies did not support a fourth domain of life for NCLDVs [45]. Nevertheless, these trees fail to show a monophyly of Eukarya as well. In addition, other recent findings based on extensive analysis of metagenomic data suggest the existence of domains other than Eukarya, Archaea, and Bacteria [46].
Other Major Features of Giant Viruses Classified Along with NCLDVs
The NCLDVs can be characterized by other peculiar features in addition to those that radically classify them as separate from other viruses.
Poxviridae, Iridoviridae, and Asfarviridae can build viral factories [47], also reported in the case of Mimivirus, Megavirus, Marseillevirus, and Lausannevirus [10,12,13,48]. These factories are associated with a massive production of virions.
The NCLDVs display a high level of genomic plasticity. Indeed, lineage-specific gene expansion and horizontal gene transfer have played a major role in the shaping of their genomes [4,42,49,50,51,52,53]. The proportion of duplicated genes in these viruses was found to range between 8 and 44%, with the highest proportions observed in Mimivirus [50,51]. In addition, horizontal gene transfer has generated considerable genome plasticity and mosaicism, although the direction or source of the transfers and the fraction of gene content involved remain controversial [7,51]. According to Filée [51], 0.8–11.9% of the genes were exchanged with the viral hosts, with the highest proportion being observed in Poxviridae, and up to 9.6% of the gene content was exchanged with bacteria, with the highest proportion being in Mimivirus. The potential mechanisms by which Poxviridae shape their genome through transfers of genes of host or viral origin have been described in particular detail [54,55,56,57]. In addition, the numbers of gene transfers with bacteria are the greatest for the Mimiviruses, Marseilleviruses, and phycodnaviruses that infect hosts feeding on bacteria [42]. The sympatric lifestyle within a phagocytic protist that grazes on bacteria, giant viruses, and virophages indeed provides many opportunities for these pathogens to gain and exchange genes. Thus, amoebae have been described as hot spots for gene transfer that may lead to the emergence of chimeric viruses and even the creation of new species [12,58]. Interestingly, a reduction in genome size by approximately 16% was recently observed for the Mimivirus when subcultured 150 times in a germ-free amoebal host [59].
In addition to the core gene set, NCLDV genomes contain open reading frames (ORFs) without detectable homologs, also known as ORFans [60]. Strikingly, 2.8–75.2% of ORFs in the NCLDV genomes lack homologs in the NCBI GenBank reference protein sequence database. Moreover, 0.3–10.4% of these ORFs have homologs in the GenBank environmental protein sequence database.
The NCLDVs themselves can be infected by viruses [61], as has been previously shown for eukaryotes, bacteria, and archaea. In 2008, La Scola et al. [5] identified Sputnik, a virus infecting Mamavirus, which led to the creation of the virophage concept. Since then three new virophages have been described in association with Mimiviridae and Phycodnaviridae [9,35,62]. Importantly, virophages may be involved in gene transfer [5,35].
Giant Viruses Classified with the NCLDVs Are Probably Common Inhabitants of Our Biosphere
According to our current knowledge, the NCLDVs remain a minority in the virosphere. Nevertheless, several findings indicate that they are common inhabitants of our biosphere. It is noteworthy that their presence has probably been largely underestimated up to this point because most metagenomic studies have adhered to the dogma of the small size of viruses by filtering samples prior to analysis (fig. 4) [63,64,65]. Even so, sequences similar to those from Mimivirus, African swine fever virus, and iridoviruses have already been identified in marine environmental samples or human serum and sewage [66,67,68,69,70,71]. Furthermore, giant viruses have been recovered from five different geographical areas worldwide, and they have been isolated from approximately 20% of water samples in one study by optimizing amoebal culture protocols [9].
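The size argument can be made concrete with a short sketch that compares the capsid diameters quoted above for the NCLDVs (150–500 nm) with the filter pore sizes typically used (0.2–0.45 µm, i.e. 200–450 nm). The function name, the intermediate 250 nm diameter, and the rule that a particle is retained whenever its diameter exceeds the nominal pore size are simplifying assumptions made for illustration; they are not taken from the cited studies.

```python
# Nominal pore sizes quoted in the text (0.2 and 0.45 µm), expressed in nm.
FILTER_PORES_NM = (200, 450)

# Ends of the NCLDV capsid diameter range quoted in the text (150 and 500 nm),
# plus one illustrative intermediate value (250 nm, an assumption).
EXAMPLE_DIAMETERS_NM = (150, 250, 500)


def retained_by_filter(particle_nm, pore_nm):
    """Return True if the particle is held back by the filter.

    Simplifying assumption: retention depends only on whether the particle
    diameter exceeds the nominal pore size.
    """
    return particle_nm > pore_nm


for pore in FILTER_PORES_NM:
    for diameter in EXAMPLE_DIAMETERS_NM:
        if retained_by_filter(diameter, pore):
            status = "retained on the filter (absent from the filtrate)"
        else:
            status = "passes into the filtrate"
        print(f"{diameter} nm particle, {pore} nm pore: {status}")
```

Under this simplified rule, only the smallest particles of the range pass a 0.2-µm filter, and the largest are retained even at 0.45 µm.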
Schematic illustrating how the giant viruses may have been excluded during the assessment of viromes by metagenomic studies that have filtered samples prior to analysis. Such procedures inevitably prevent the detection of viruses larger than the pores of the filters used, i.e. 0.2–0.45 µm.
NCLDV Is Not an Appropriate Denomination and Has No Recognized Taxonomic Meaning
Finally, the NCLDV denomination does not take into account that Mimivirus and Marseillevirus harbor both DNA and RNA. Moreover, the NCLDVs compose a superfamily, a grouping that has no formally recognized taxonomic meaning according to the International Committee on Taxonomy of Viruses (ICTV) (http://www.ictvonline.org/virusTaxonomy.asp?bhcp=1). In the current ICTV classification, none of the NCLDV families are assigned to a viral order. We propose that these viruses should be assigned to a newly defined order (a group of families sharing certain common characteristics according to the ICTV) named Megavirales, in reference to the uncommon size of both the members’ particles and their genomes.
Definition of the Megavirales
Viral members of the new Megavirales order correspond to the giant viruses previously classified within the NCLDVs (table 1; online suppl. table S1). Megavirales can be defined by the criteria mentioned below, as illustrated in figure 5.
Major features of Megavirales members and criteria required for membership in the Megavirales order.
All of the following single characteristics are required for membership in the order (the monothetical system [72]):
• Giant viral particle and genome, capsid diameter >150 nm and genome size >100 kb (or in that order of magnitude).
• Presence in the gene content of all nine class I NCLDV core genes, i.e. VV D5-type ATPase (superfamily III helicase), DNA polymerase (B family), VV A32 virion packaging ATPase, VV A18 helicase (superfamily II), capsid protein D13L, thiol oxidoreductase, VV D6R/D11L-like helicase (superfamily II), S/T protein kinase, transcription factor VLTF2 [15] and all five NCVOGs found in all NCLDVs (i.e. NCLDV major capsid protein, D5-like helicase-primase, DNA polymerase elongation subunit family B, A32-like packaging ATPase and Poxvirus Late Transcription Factor VLTF3-like) [18]. These genes have various functions and origins.
• Common ancestral origin and membership in the proposed fourth domain of life.
• A jelly-roll capsid protein, which is a hallmark viral protein [6,37]. The capsid is icosahedral in all NCLDVs except poxviruses, in which the capsid protein forms intermediate structures during virion morphogenesis but is not a protein of the virion [37,73].
Different combinations of properties among those listed above are required for membership in the order (the polythetical system): presence of both DNA and RNA; presence of proteins involved in the translation apparatus; substantial proportions of genes duplicated and involved in horizontal gene transfer within the genome; substantial proportions of ORFans and metaORFans among the gene repertoire; presence of viral factories; some or all steps of DNA replication and transcription occurring in the host cytoplasm, and possible infection by a virophage.
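The two-tier logic of these criteria can be sketched as a simple check, given here only as an illustration. All identifiers (gene labels, feature labels, the `VirusRecord` type, and the two example records) are hypothetical names introduced for the sketch. The ancestry criterion is reduced to a flag that would have to come from a separate phylogenetic analysis, and because the text does not state how many polythetic properties are needed, the sketch only reports which ones are present.

```python
from dataclasses import dataclass
from typing import List

# Size thresholds from the criteria above. The text adds "or in that order of
# magnitude", so these are adjustable parameters rather than hard limits.
MIN_CAPSID_NM = 150.0
MIN_GENOME_KB = 100.0

# Hypothetical short labels for the nine class I NCLDV core genes.
CLASS_I_CORE_GENES = frozenset({
    "D5_type_ATPase", "DNA_polymerase_B", "A32_packaging_ATPase",
    "A18_helicase", "D13L_capsid_protein", "thiol_oxidoreductase",
    "D6R_D11L_helicase", "ST_protein_kinase", "VLTF2",
})
# Hypothetical short labels for the five NCVOGs found in all NCLDVs.
UNIVERSAL_NCVOGS = frozenset({
    "major_capsid_protein", "D5_helicase_primase",
    "DNA_polymerase_elongation_B", "A32_like_packaging_ATPase", "VLTF3_like",
})
REQUIRED_GENES = CLASS_I_CORE_GENES | UNIVERSAL_NCVOGS

# Polythetic properties, in the order in which the text lists them.
POLYTHETIC_FEATURES = (
    "dna_and_rna", "translation_proteins", "gene_duplication_and_hgt",
    "orfans_and_metaorfans", "viral_factories",
    "cytoplasmic_replication_steps", "virophage_infection",
)


@dataclass(frozen=True)
class VirusRecord:
    name: str
    capsid_diameter_nm: float
    genome_size_kb: float
    genes: frozenset
    jelly_roll_capsid: bool
    shares_ncldv_ancestry: bool  # assumed to come from a phylogenetic analysis
    features: frozenset = frozenset()


def monothetic_failures(v: VirusRecord) -> List[str]:
    """List every required (monothetic) criterion that the record fails."""
    failures = []
    if v.capsid_diameter_nm <= MIN_CAPSID_NM:
        failures.append("capsid diameter not above threshold")
    if v.genome_size_kb <= MIN_GENOME_KB:
        failures.append("genome size not above threshold")
    missing = REQUIRED_GENES - v.genes
    if missing:
        failures.append("missing genes: " + ", ".join(sorted(missing)))
    if not v.shares_ncldv_ancestry:
        failures.append("no shared ancestry with the NCLDVs")
    if not v.jelly_roll_capsid:
        failures.append("no jelly-roll capsid protein")
    return failures


def polythetic_features_present(v: VirusRecord) -> List[str]:
    """Return the polythetic properties recorded for the virus."""
    return [f for f in POLYTHETIC_FEATURES if f in v.features]


# Illustrative records only; the values are not curated data.
example_giant = VirusRecord(
    "illustrative giant virus", 500.0, 1181.0, REQUIRED_GENES,
    jelly_roll_capsid=True, shares_ncldv_ancestry=True,
    features=frozenset({"dna_and_rna", "viral_factories"}),
)
example_small = VirusRecord(
    "illustrative small virus", 50.0, 10.0, frozenset(),
    jelly_roll_capsid=False, shares_ncldv_ancestry=False,
)

print(monothetic_failures(example_giant))          # []
print(polythetic_features_present(example_giant))  # ['dna_and_rna', 'viral_factories']
print(len(monothetic_failures(example_small)))     # 5
```

Returning the list of failed criteria, rather than a single yes/no answer, keeps each monothetic requirement visible when a candidate virus is excluded.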
Conclusion
The tremendous recent increase in knowledge about giant viruses has generated divergence rather than reinforced the borders of the previously defined viral world. Megavirales gather viral entities that appear to be incompatible within the framework of the virosphere as it has been defined since the beginning of virology. Moreover, they lay the foundation for a new understanding in which viruses consolidate their status as early protagonists in evolution.