Abstract
Systems biology refers to system-wide changes in biological components such as RNA/DNA (genomics), protein (proteomics) and lipids (lipidomics). In this review, we provide comprehensive information about morbillivirus replication. Besides discussing the role of individual viral/host proteins in virus replication, we also discuss how systems-level analyses could improve our understanding of morbillivirus replication, host-pathogen interaction, immune response and disease resistance. Finally, we discuss how viroinformatics is likely to provide important insights for understanding genome-genome, genome-protein and protein-protein interactions.
Introduction
Morbilliviruses are classified under the subfamily Paramyxovirinae, the family Paramyxoviridae and the order Mononegavirales [Gibbs et al., 1979]. There are seven known members of the genus Morbillivirus: measles virus (MV), rinderpest virus (RPV), peste des petits ruminants virus (PPRV), canine distemper virus (CDV), cetacean morbillivirus, phocine distemper virus and feline morbillivirus [Kumar et al., 2014]. Morbillivirus infection in humans and animals causes profound immunosuppression [de Vries et al., 2015]; however, individuals that survive infection usually develop lifelong immunity [Kerdiles et al., 2006]. Cross-protection is believed to occur among the various morbilliviruses [Kumar et al., 2014].
Besides the raw materials required for nucleic acid and protein synthesis, viruses also require several other host factors for successful propagation inside the host. Significant research has been conducted on the individual host factors exploited by viruses; however, comprehensive quantitative functional insights into all the host factors required for virus replication are still lacking. Upon infection of the host cells, viruses exploit the cellular machinery for their own effective replication in the form of ‘viral factories’ through various protein-RNA and protein-protein interactions. The molecular interactions between viral and cellular factors determine the host range and viral pathogenesis. After the advent of high-throughput sequencing tools and proteomics technologies, thousands of host factors required for successful virus replication were rapidly identified, thus providing insights into attractive targets for antiviral drug development [Kumar et al., 2011b].
Virion Structure and Genome Organization
Morbilliviruses have linear, single-stranded, negative sense, nonsegmented RNA genomes [Kumar et al., 2014]. The exact lengths of various morbillivirus genomes vary due to the variable size of the junction between the matrix (M) and the fusion (F) protein genes [Radecke et al., 1995]. PPRV virions are pleomorphic particles with a lipid envelope enclosing a helical nucleocapsid that exhibits a characteristic herring bone appearance (fig. 1) [Gibbs et al., 1979]. The PPRV genome consists of 15,948 nucleotides (nt) that encode six structural and two nonstructural proteins in the order of 3′-N-P/C/V-M-F-HN-L-5′ (fig. 2) [Munir et al., 2013]. The hemagglutinin (H) protein of the PPRV also exhibits neuraminidase activity and, hence, is named the hemagglutinin-neuraminidase (HN) protein.
As in other morbilliviruses, the transcription and replication of the PPRV are controlled by untranslated regions (UTRs) at the 3′ and 5′ ends of the genome, known as the genome promoters (GPs) and antigenome promoters (AGPs) [Lamb and Kolakofsky, 2001], which are represented by nt 1-107 and 15840-15948 in the PPRV genome, respectively. The 3′ and the 5′ ends of the PPRV genome consist of a 52-nt-long leader and a 37-nt-long trailer region, respectively. The 52-nt-long leader sequence, together with the 3′ UTR of the nucleoprotein (N) gene and the conserved 3-nt-long intergenic (IG) region between them, serves as the GP for the synthesis of mRNA and complementary/antigenomic RNA (fig. 2). The AGP is composed of the trailer region, the 5′ UTR of the large (L) protein gene (after the stop codon) and the IG region between them. The AGP is only involved in the synthesis of the genomic RNA [Barrett et al., 2006]. A stretch of 23-31 nt at the 3′ terminus of both GPs and AGPs is conserved among PPRV strains and is believed to act as an essential domain for promoter activity [Bailey et al., 2007]. The field and vaccine strains of PPRV differ by 6 nt in GPs (at positions 5, 12, 26, 36, 42 and 81) and 1 nt in AGPs (at position 15842) [Abraham, 2005]. Mutation in the GP at position 26 is linked to the attenuated phenotype in morbilliviruses. Mutations in the GP at positions 5 and 12 are only present in the PPRV and RPV vaccine strains, and four other mutations at positions 36, 42, 81 and 15842 are present only in the PPRV vaccine strain. Interestingly, as compared to other morbilliviruses and the PPRV vaccine strain, the field strains of PPRV contain a U instead of a C residue at position 36. Some of the mutations described above in the GP/AGP regions, alone or in combination, may be involved in the attenuation/virulence of PPRV, and hence present another region [besides the F and nucleocapsid (N) protein genes] in the PPRV genome for phylogenetic analysis [Munir et al., 2013].
The viral polymerase synthesizes mRNA by reading the genomic RNA template in the 3′-to-5′ direction. The terminator region of each gene is followed by a 3-nt-long IG region (fig. 2). The IG region is also found at the junction of the N gene and the leader sequence, and between the L gene and the trailer region. The IG region consists of a semi-conserved polyadenylation signal, a highly conserved GAA sequence, a semi-conserved start signal for the next gene and variable lengths of 5′ and 3′ UTRs [Barrett et al., 2006]. In the PPRV, at the junction of the L gene and the trailer region, GAA is replaced by GAU. In some PPRV strains, the GAA at the junction of the H and L genes may be replaced by GCA. Each gene begins with the conserved UCCU/C sequence. To produce an individual viral protein, the transcriptional unit is composed of the coding sequence, the IG region and the conserved start and stop signals that flank it [Munir et al., 2013]. All the paramyxoviruses contain a conserved trinucleotide (AGG) sequence at the start of each mRNA species. UTRs of varying lengths are also present both before and after the open reading frame (ORF) of each gene. A poly U tract located 52 bases downstream of the N ORF stop codon is highly conserved among morbilliviruses and acts as a polyadenylation signal for the positive sense transcripts produced by the viral RNA polymerase. The N, M, F and L proteins appear to be the most conserved morbillivirus proteins [Diallo et al., 1994].
Virus Replication
Attachment
The first step of infection, binding of the virus to the host cell and delivery of the nucleocapsid into the host cell cytoplasm, plays an important role in the pathogenesis of the virus and the susceptibility of the host. The first interaction of the PPRV with the host is mediated by binding of its attachment protein, the HN protein, to the cellular receptor(s). Morbilliviruses initially target lymphoid organs and replicate efficiently in the lymphocytes. The signaling lymphocyte activation molecule (SLAM), also called CD150, is the principal cellular receptor for morbilliviruses. It is exclusively expressed on immune cells and, therefore, the viruses have strong lymphoid cell tropism.
Signaling Lymphocyte Activation Molecule
Tatsuo et al. [2000b] first identified SLAM by screening a cDNA library derived from B95a cells that are highly permissive for MV. Transfection of a single clone of cDNA from marmoset B (B95a) cells made 293T cells susceptible (which otherwise are nonsusceptible) to the vesicular stomatitis virus pseudotype bearing the H protein of MV [Tatsuo et al., 2000b]. The single cDNA clone capable of making transfected 293T cells susceptible to the MV-H protein was identified as SLAM. Furthermore, MV, amplified only from SLAM-positive cells, was able to produce clinical signs in the infected animals [Bankamp et al., 2008], therefore SLAM acts as the principal cellular receptor for MV in vivo [Tatsuo et al., 2001]. SLAMs are principally expressed on lymphocytes, monocytes, dendritic cells and macrophages [Aversa et al., 1997b]. SLAMs have a broad involvement in the modulation of innate and acquired immune responses as they regulate T cell activation and have the ability to regulate the functions of natural killer and dendritic cells [Aversa et al., 1997a; Wu and Veillette, 2016].
All morbilliviruses bind to the V domain of SLAM. The SLAM-associated protein (SAP) or EWS/FLI-1-activated transcript 2 are the adaptor molecules associated with the cytoplasmic tail of SLAM [Yan et al., 2007]. The extracellular domain of SLAM may associate with another SLAM molecule present on the adjacent cells. SLAM engagement induces its binding to SAP, and triggers downstream signaling for the upregulation of T helper 2 cytokines [Veillette et al., 2007]. The MV-H protein residues that interact with SLAM are I194, D505, D507, Y529, D530, T531, R533, H536, Y553 and P554 [Masse et al., 2004]. The SLAM-mediated cell entry is crucial for the development of complete pathogenicity of a morbillivirus. The recombinant SLAM-blind lapinized strain of RPV is highly virulent in rabbits and reproduces similar pathogenicity as virulent RPV in cattle, and therefore serves as a useful model for illustrating the in vivo pathogenicity of RPV [Sato et al., 2012].
Cellular receptors determine the host range and tissue tropism of a virus. SLAMs of respective host species (humans, dog, cattle and goats) act as common receptors for MV, CDV, RPV and PPRV [Tatsuo et al., 2001]. For PPRV isolation from clinical specimens, monkey cells expressing goat SLAM are more sensitive than those expressing cattle SLAM [Adombi et al., 2011]. B95a cells express a high level of SLAM on the cell surface [Tatsuo et al., 2000a] and hence serve as a common cell line for the isolation of MV, CDV, RPV and PPRV.
Epithelial Cell Receptors (Nectin-4)
In addition to SLAM-expressing immune cells, morbilliviruses also infect epithelial cells of the intestines, liver, lungs, trachea, bronchial tubes, oral cavity, esophagus, pharynx and bladder that do not express SLAM, suggesting the existence of alternative cellular receptors. Several in vitro studies have also illustrated morbillivirus-induced cytopathology as well as virus production in SLAM-negative cell types, such as epithelial or neuronal cells [Tahara et al., 2008]. Before spreading in the lymphatic cells, paramyxoviruses infect the upper respiratory tract epithelium from the luminal side [Yanagi et al., 2006]. However, according to a new model, the systemic spread of wild-type MV depends only on the infection of SLAM-expressing lymphatic cells and the initial virus amplification in the respiratory epithelial cells is not required [von Messling et al., 2006; Yanagi et al., 2006]. When used to infect rhesus monkeys, an epithelial receptor-blind MV that cannot recognize epithelial receptors but maintains SLAM-dependent entry remained virulent but was not shed, suggesting a role of other cellular receptors in virus dissemination [Leonard et al., 2008].
In 2011, by employing microarray and siRNA knockdown, two independent research groups discovered a new morbillivirus receptor, PVRL4 (Nectin-4), which is expressed on epithelial cells [Muhlebach et al., 2011; Noyce et al., 2011] and binds strongly to the H protein [Muhlebach et al., 2011]. The region of the H protein that interacts with the epithelial cell receptors has been mapped (I456, L464, L482, P497, Y541 and Y543) [Leonard et al., 2008; Tahara et al., 2008]. The Nectin family proteins comprise three Ig-like loops (one V and two C2-type domains) in their extracellular domains. Out of the four members of the Nectin family (Nectin-1 to Nectin-4), only Nectin-4 functions as an epithelial cell receptor [Muhlebach et al., 2011; Noyce et al., 2011]. The details of the interaction between Nectin-4 and the morbillivirus H protein are largely unknown. After systemic infection, it is believed that the infected lymphocytes and dendritic cells transmit the virus to epithelial cells using Nectin-4 located at the basolateral side.
Alternative Receptors
Besides lymphocytes and epithelial cells, morbilliviruses have also been detected in endothelial and neuronal cells [Sato et al., 2012]. MV and CDV exhibit strong tropism for neuronal cells (persistent encephalitis) [Sato et al., 2012], which express neither SLAM nor Nectin-4. Morbillivirus cell entry independent of SLAM and CD46 (and probably Nectin-4) has also been observed in vitro in a variety of cell lines [Fujita et al., 2007; Hashimoto et al., 2002]. This evidence indicates the existence of alternative receptors. Emerging evidence suggests the involvement of heparin-like glycosaminoglycans [Fujita et al., 2007], cellular cyclophilin B and CD147 as cellular receptors for morbilliviruses [Watanabe et al., 2010].
Both SLAM and Nectin-4 have been implicated in PPRV entry into the host cell. Whereas SLAM is important for initial interaction, Nectin-4 serves as an exit receptor for dissemination of the virus throughout the body and promotion of the amplification and subsequent release of the virus via different secretions [Birch et al., 2013].
Entry
Morbillivirus entry principally depends on the H and F proteins [Ader-Ebert et al., 2015] that closely associate to facilitate membrane fusion at a neutral pH. Binding of the H protein to a specific cell surface receptor acts as a stimulus to trigger the F protein [Jardetzky and Lamb, 2014]. The activated F undergoes a series of irreversible conformational changes that subsequently lead to merger of the viral envelope and host cell membrane, therefore resulting in fusion pore formation [Porotto et al., 2011].
Replication and Transcription
As in most other RNA viruses, following release of the nucleocapsid from the viral envelope, the replication and transcription of the morbillivirus RNA occur in the cytoplasm. The RNA-dependent RNA polymerase (RdRp) carried in the infecting virions initiates the synthesis of both mRNA and the complementary RNA (cRNA). Transcription begins following binding of the RdRp at the GP located on the genomic RNA [Barrett et al., 2006]. Each transcriptional unit (coding sequence and noncoding flanking regions) is synthesized in the ‘start-stop’ mode. The RdRp can access the next downstream transcriptional unit only when the preceding unit has been completely synthesized. During transcription, the RdRp may detach from the template (at the IG region) and reinitiate transcription at the GP; this controls the quantity of each individual protein that is synthesized. The N protein, which is required in the greatest proportion, is most abundantly transcribed because its gene is located closest to the GP (fig. 1). In contrast, the L protein, whose gene is located farthest from the GP, is transcribed in the lowest amount. In the paramyxoviruses, each mRNA species is transcribed as naked RNA, which undergoes capping at its 5′ end and polyadenylation at its 3′ end by the virus-encoded polymerase, and hence is stable and can be efficiently translated by the host ribosomes [Barrett et al., 2006]. Polyadenylation signals (UUUU) of the mRNA transcripts are present before each IG region [Munir et al., 2013].
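The polar transcription gradient described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the RdRp crosses each gene junction with the same fixed probability (`p_continue`, a hypothetical value that is not taken from the literature cited here) and otherwise detaches from the template; only the gene order comes from the genome map.

```python
# Gene order from the genome promoter (3' end) towards the 5' end.
GENE_ORDER = ["N", "P", "M", "F", "HN", "L"]


def relative_transcript_levels(genes, p_continue=0.7):
    """Return the expected mRNA level of each gene relative to the first gene.

    p_continue is a hypothetical probability that the polymerase crosses a
    gene junction and reinitiates at the next gene instead of detaching.
    """
    if not 0.0 <= p_continue <= 1.0:
        raise ValueError("p_continue must be between 0 and 1")
    levels = {}
    level = 1.0
    for gene in genes:
        levels[gene] = level
        level *= p_continue  # attenuation at the junction after this gene
    return levels


if __name__ == "__main__":
    for gene, level in relative_transcript_levels(GENE_ORDER).items():
        print(f"{gene}: {level:.3f}")
```

Under this assumption the N transcript is always the most abundant and the L transcript the least abundant, whatever value below 1 is chosen for `p_continue`; real junctions need not attenuate equally.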
Unlike the other viral genes, each of which produces a single protein, the morbillivirus P gene produces three different proteins: P, a structural protein, and two nonstructural proteins, C and V. The P protein is produced from the first initiation codon, whereas an alternative reading frame starting at the second start codon produces the C protein [Munir et al., 2013]. This is possible because the first AUG does not lie within a perfect Kozak consensus sequence (A/GXXAUGG), which is required for the efficient initiation of protein synthesis [Kozak, 1984]. The mRNA for the V protein is generated by cotranscriptional editing, in which one or more G residues are added to the P mRNA at a conserved editing site (3′-AAUUUUUCCCGUGUC-5′) [Schneider et al., 1997].
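The Kozak criterion mentioned above can be checked mechanically. The following is a minimal sketch that lists every AUG in an mRNA and flags whether it sits in the (A/G)XXAUGG context, that is, a purine at position -3 and a G at position +4. The function name and the toy sequence are hypothetical and are not derived from a real P gene transcript.

```python
def start_codon_contexts(mrna):
    """List every AUG in an mRNA with a flag for a strong Kozak context.

    A context is called strong here when position -3 is a purine (A or G)
    and position +4 is G, i.e. the pattern (A/G)XXAUGG.
    Returns (0-based position of the A of AUG, is_strong) tuples.
    """
    mrna = mrna.upper().replace("T", "U")
    results = []
    pos = mrna.find("AUG")
    while pos != -1:
        purine_at_minus3 = pos >= 3 and mrna[pos - 3] in "AG"
        g_at_plus4 = pos + 3 < len(mrna) and mrna[pos + 3] == "G"
        results.append((pos, purine_at_minus3 and g_at_plus4))
        pos = mrna.find("AUG", pos + 1)
    return results


# Hypothetical toy sequence: a weak first AUG followed by a strong second AUG.
TOY_MRNA = "CCUCCAUGCAGAGCCAUGGAUCC"

if __name__ == "__main__":
    for position, strong in start_codon_contexts(TOY_MRNA):
        print(position, "strong" if strong else "weak")
```

A weak context at the first AUG is what allows some ribosomes to scan past it and initiate at a downstream start codon; the sketch only reports sequence context and does not model initiation efficiency.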
Sometime after synthesis of the mRNA, the RdRp switches to synthesize complementary RNA (antigenome RNA). Like the genomic RNA, cRNA is also associated with the N protein. According to one model, accumulation of unassembled N protein in the cytoplasm is a major driving force in switching the RdRp function from mRNA to cRNA synthesis [Wertz et al., 1998], whereas another model is based on the existence of two different forms of RdRp, one for replication and another for transcription [Kolakofsky et al., 2004].
Virus Assembly and Release
Assembly of the surface glycoproteins, the M protein and the ribonucleoproteins (RNPs) at the plasma membrane, and their subsequent budding, forms new paramyxovirus particles. The process of morbillivirus assembly and release is poorly studied. The M protein plays a major role in the assembly and release of paramyxoviruses. It serves as an adapter to link together the structural components of the virions (viral glycoproteins and RNPs) and cellular membranes, as well as driving incorporation of the genomic RNA into budding virions by interacting with nucleocapsid at virus assembly sites. Although M is the major protein responsible for paramyxovirus assembly and release, other viral proteins such as H, F and C, as well as several host factors, have also been implicated.
Assembly
The assembly of viral proteins is a specific and complex process that involves coalescence of the viral components at discrete sites on cellular membranes followed by host cell membrane protrusion. Lipid rafts are membrane microdomains rich in cholesterol and sphingolipids; they have a rigid, ordered structure with limited flexibility, and therefore can act as platforms for virus assembly [Takahashi and Suzuki, 2011]. Envelope glycoproteins of RNA viruses are not evenly distributed on the cell surface but rather clustered within the lipid raft microdomains to form the nucleation points for budding [Lyles, 2013]. Paramyxovirus glycoproteins are selectively targeted to the raft microdomains. In some viruses only the F, but not the H protein, has the intrinsic ability to be incorporated into membrane rafts [Vincent et al., 2000], whereas in others both the glycoproteins can associate with the rafts [Laliberte et al., 2007]. In addition to the glycoproteins, other viral components, such as the N [Laliberte et al., 2007] and M proteins [Pohl et al., 2007], can also associate with the lipid raft microdomains. Accumulation of the viral components at the cell membranes facilitates coalescence of multiple membrane microdomains, where viruses create their own assembly platforms [Lyles, 2013]. Besides acting as sites of assembly, the raft domains also contribute to the infectivity of the newly formed paramyxovirus particles [Chang et al., 2012]. Like other enveloped viruses, paramyxovirus particles are formed when all the structural components of the virus have assembled at selected sites on the membranes, where newly synthesized virus particles bud and then pinch off to be released from the infected cells, allowing the infection to spread to new cells/hosts [El Najjar et al., 2014]. In some paramyxoviruses, such as MV and Sendai virus (SV), the lipid rafts are needed as platforms for assembly but do not contribute to budding [Gosselin-Grenet et al., 2006]. Therefore, the functional significance of these raft domains in the paramyxovirus life cycle varies among family members.
Budding
Enveloped viruses bud by the formation of membrane protrusion followed by membrane scission and release of the virus particles from the infected cells. Besides the viral-viral and the viral-host protein interactions, interactions of the viral proteins with the membrane lipids play an important role in the induction of membrane curvature and the final membrane fission [Rossman and Lamb, 2013]. Budding of paramyxovirus is principally driven by the M protein, which binds and oligomerizes underneath the plasma membrane to drive the membrane deformation needed for the curvature formation [Takimoto et al., 2001].
Multiple mechanisms of paramyxovirus budding are known [Harrison et al., 2010; Takimoto and Portner, 2004]. A short stretch of amino acids in the M proteins of many paramyxoviruses, known as the L domain (the motif varies among viruses: P[T/S]AP, PPxY, YxxL), has late budding functions. The L domain interacts with the cellular proteins of the ESCRT (endosomal sorting complex required for transport; part of the vacuolar protein sorting pathway) and promotes the membrane fission step that eventually leads to release of the progeny virus particles from the plasma membrane [McDonald and Martin-Serrano, 2009]. However, both ESCRT-dependent [Duan et al., 2014] and ESCRT-independent [Salditt et al., 2010] mechanisms have been implicated in paramyxovirus budding. SV budding is well characterized among paramyxoviruses [Fouillot-Coriou and Roux, 2000]; the ability of the SV-F protein to form a bud depends on a TYTLE motif present in the cytoplasmic tail of the protein [Essaidi-Laziosi et al., 2013], suggesting that this motif is required for efficient binding.
Increasing evidence also suggests a role of the cytoskeleton in paramyxovirus budding, based on the observations that a large amount of actin was found associated with SV and that mutations in the actin-binding domain of the SV-F protein resulted in a significant reduction in SV virus-like particle production. In association with cellular factors that are usually involved in exocytic pathways, the cytoplasmic tail domains of the glycoproteins have been implicated in paramyxovirus budding [El Najjar et al., 2014]. Clustering of glycoproteins in the lipid raft microdomains creates a pulling force on the plasma membrane to induce an initial membrane deformation that is further elongated by oligomerization of the M protein [Liljeroos et al., 2013].
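The three L domain consensus motifs named above are simple enough to search for with regular expressions. The following is a minimal sketch; the function name and the toy peptide are hypothetical, and a sequence match only nominates a candidate motif, since it says nothing about whether the motif is exposed or functional in a given M protein.

```python
import re

# Late (L) domain consensus motifs; x stands for any amino acid.
# Lookaheads are used so that overlapping matches are also reported.
LATE_DOMAIN_MOTIFS = {
    "P[T/S]AP": re.compile(r"(?=(P[TS]AP))"),
    "PPxY": re.compile(r"(?=(PP[A-Z]Y))"),
    "YxxL": re.compile(r"(?=(Y[A-Z]{2}L))"),
}


def find_late_domains(protein):
    """Return (motif name, 1-based position, matched residues) tuples."""
    protein = protein.upper()
    hits = []
    for name, pattern in LATE_DOMAIN_MOTIFS.items():
        for match in pattern.finditer(protein):
            hits.append((name, match.start() + 1, match.group(1)))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical toy peptide, not a real matrix protein sequence.
TOY_PEPTIDE = "MSPTAPGGPPEYKKYRALQ"

if __name__ == "__main__":
    for name, position, residues in find_late_domains(TOY_PEPTIDE):
        print(f"{name} at residue {position}: {residues}")
```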
Systems Virology and Host-Pathogen Interaction
Knowledge of the role of the individual viral proteins in virus replication and disease pathogenesis has paved the way for the development of antiviral drugs that target viral components. However, due to frequent mutations in the viral genome, the virus-centric approach to drug development has resulted in drug resistance. Several isolated studies have identified some unique host-targeting antiviral agents that tend not to induce drug resistance [Kumar et al., 2011b]. However, the full set of cellular proteins required for effective viral replication cannot be predicted by analyzing individual pathways/proteins; such analyses are therefore not well suited to delineating the complex and multifaceted virus-host interactions.
Systems biology deals with comprehensive understanding of biological systems through the combined use of biology, mathematics and computer science. Systems-level analyses utilize high-throughput technologies to evaluate system-wide changes in biological components, such as RNA/DNA (genomics), proteins (proteomics), metabolites (metabolomics), lipids (lipidomics) and carbohydrates (glycomics). The biological system for systems virology may range from an infected cell, to tissues, to whole organisms. The emergence of next-generation sequencing has created enormous possibilities for generating system-wide information, including: (i) to report quantitative and qualitative differences within individuals of the same species [Abecasis et al., 2010]; (ii) to characterize the interaction spectrum of DNA-binding proteins [Park, 2009], and (iii) to create genome-wide profiles of epigenetic modifications [Datta et al., 2015]. The high-throughput data are integrated and analyzed using computer/mathematical algorithms to generate predictive models of the system, which eventually allow experimental perturbations of the system. Rather than focusing on a predetermined small set of molecules (genes, proteins or metabolites), systems virology is an unbiased approach that deals with system-wide changes in the host following virus infection and, hence, provides a comprehensive systems-level view of host-virus interaction.
Genome-Wide Changes in Host Genes following Virus Infection
Genome-wide host transcriptome profiling following infection with a wide variety of viruses has contributed an enormous amount of data to the DNA data banks. Transcriptome analyses of peripheral blood mononuclear cells (PBMCs) following morbillivirus infection [Nanda et al., 2009], including PPRV [Manjunath et al., 2015], have also been reported. Transcriptome analyses of bovine dendritic cells following infection with RPV (a bovine pathogen) or a wild-type MV (a human pathogen) suggest that, compared to RPV, MV induces a robust and rapid interferon response. Pathogenic and nonpathogenic RPV also induce significantly different responses, with the latter inducing a slightly higher interferon response as well as significant effects on the transcription of genes involved in cell cycle regulation [Nanda et al., 2009]. PPRV infection of PBMCs leads to differential expression of at least 985 genes that are involved in immune regulatory, spliceosomal and apoptotic pathways [Manjunath et al., 2015]. Previously, with a traditional reductionist approach, only a handful of these genes were known. The newly identified host genes are likely to provide new insights into virus replication, pathogenesis and immune response.
Host-Proteome Signatures following Virus Infection
To effectively propagate and evade the host's immune response, viruses lead to alteration in the metabolic function of numerous host cells. Previous studies have identified numerous host proteins that regulate virus replication, though comprehensive information about changes in all the host proteins could not be ascertained using such a reductionist approach. Global quantitative proteome profiling has been studied by employing technologies such as two-dimensional gel electrophoresis coupled with MALDI-TOF identification, mass spectrometry, surface-enhanced laser desorption/ionization protein chip technology, reverse-phase protein array technology and stable isotope labeling by amino acids in cell culture combined with LC-MS/MS. This rapidly evolving field has identified quantitative proteome profiles of a wide variety of viruses, including MV, where its infection in A549/hSLAM cells was found to induce the differential expression of 38 proteins, 18 of which were uniquely associated with MV infection [Billing et al., 2014]. With the bioinformatics analyses, protein groups such as cytoskeleton, transcription/translation, metabolism, immune response and mitochondrial proteins were identified that are involved in regulating cell death and apoptosis. The approach can also be used to identify the role of host cell kinases in virus infection, and the comparisons can be made between two different viruses [Pelkmans et al., 2005]. These approaches and their systems-level analyses should improve our understanding of various aspects of disease pathogenesis, as well as uncovering new biomarkers.
Virus-Host Interactomes
System-wide siRNA or shRNA screens have identified numerous host factors required for efficient virus replication. Computational analyses have been used to construct and describe virus-host interactomes [Watanabe and Kawaoka, 2015] which in turn have identified cellular targets for therapeutic intervention [Kumar et al., 2011a]. Such host-virus interactomes have been generated for a wide variety of viruses, including PPRV [Manjunath et al., 2015], and have highlighted the cellular factors that may be important in viral replication, virulence, pathogenesis and immune response [Law et al., 2013]. A meta-analysis of virus-host interactomes identified both common as well as virus-specific host targets, suggesting that a common drug target for multiple viruses can be developed [Watanabe and Kawaoka, 2015].
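The distinction between common and virus-specific host targets drawn by such meta-analyses reduces to set operations on the hit lists of the individual screens. The following is a minimal sketch under that assumption; the virus labels and gene symbols are placeholders rather than real screen results, and the function name is hypothetical.

```python
def split_common_and_specific(screens):
    """Split host factors into those shared by all screens and those unique to one.

    screens maps a virus name to the set of host genes identified for it.
    Returns (common, specific), where specific maps each virus to the genes
    found for that virus only.
    """
    if not screens:
        return set(), {}
    gene_sets = {virus: set(genes) for virus, genes in screens.items()}
    common = set.intersection(*gene_sets.values())
    specific = {}
    for virus, genes in gene_sets.items():
        others = set().union(*(g for v, g in gene_sets.items() if v != virus))
        specific[virus] = genes - others
    return common, specific


# Placeholder data for illustration only.
TOY_SCREENS = {
    "virus_1": {"GENE_A", "GENE_B", "GENE_C"},
    "virus_2": {"GENE_A", "GENE_C", "GENE_D"},
    "virus_3": {"GENE_A", "GENE_E"},
}

if __name__ == "__main__":
    common, specific = split_common_and_specific(TOY_SCREENS)
    print("common:", sorted(common))
    for virus, genes in specific.items():
        print(virus, "only:", sorted(genes))
```

Genes shared by some but not all viruses (GENE_C in the toy data) fall into neither group, which is one reason why published meta-analyses typically work with overlap counts or network-level measures rather than strict intersections.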
Glycomics
Protein-carbohydrate interactions (glycoconjugates) not only occur inside cells as part of various biological processes, but also take place at the host cell surface during the initiation of infection by viruses. In the postgenomic era, glycomics (the functional study of carbohydrates in living organisms) has emerged as one of the important fields in virus research. Protein glycosylation patterns may vary in different cell types [Basak and Compans, 1983]. Individual cell lines vary in their sequon (three amino acid local sequence requirement for N-glycosylation) usage and hence produce different glycan structures that could complicate antigen presentation. For example, insect cells are more likely to utilize certain sequons than egg or mammalian cell platforms and hence the compositions, branching patterns, sizes and electrostatic charges of the HA-linked N-glycans vary strikingly according to the cell type [An et al., 2013]. These differences could affect vaccine properties where a standardized set of reagents is prepared from a single source (e.g. a hen egg), and hence inadvertently affect the results of vaccine potency testing. Methods are available for nanoLC/MSE glycan MALDI-TOF MS permethylation profiling to analyze and monitor HA glycosylation in influenza vaccines for lot-to-lot comparisons [An et al., 2015]. The application of such methodologies to morbillivirus research should improve our understanding of disease pathogenesis and aid product development.
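Candidate sequons can be located directly in a protein sequence. The following is a minimal sketch that uses the commonly applied definition of the N-glycosylation sequon, N-X-S/T with X being any residue except proline; the function name and the toy peptide are hypothetical. In keeping with the cell type dependence described above, a match marks only a potential site and does not indicate whether, or with which glycan, it is occupied.

```python
import re

# N-X-S/T sequon, where X is any residue except proline.
# The lookahead allows overlapping sequons (e.g. N-N-T-S) to be reported.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(protein):
    """Return (1-based position of the asparagine, tripeptide) tuples."""
    protein = protein.upper()
    return [(match.start() + 1, match.group(1)) for match in SEQUON.finditer(protein)]


# Hypothetical toy peptide, not a real glycoprotein sequence.
TOY_PEPTIDE = "MNGTANPSKNNTS"

if __name__ == "__main__":
    for position, tripeptide in find_sequons(TOY_PEPTIDE):
        print(f"N{position}: {tripeptide}")
```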
Noncoding RNAs
The Encyclopedia of DNA Elements (ENCODE) is a collaborative consortium of research groups with the goal of building a comprehensive list of functional elements in the human genome. It was established in 2003 (pilot phase) and contains all the data produced by ENCODE investigators. Besides containing data on protein coding genes, it also contains information about noncoding genes, such as long noncoding RNA and microRNA that are known to play roles in transcriptional and epigenetic gene regulation.
RNA-seq analyses of the host response to viral infections have revealed the differential expression of a variety of host's long noncoding RNAs in the infected cells that may potentially be involved in regulating the innate immune response to a variety of viruses [Peng et al., 2010]. Furthermore, the sequencing of small RNAs in virus-infected cells revealed the differential expression of over 200 small RNAs, which include small nuclear RNAs, piwi-associated small RNAs and host microRNAs (miRNAs) [Chang et al., 2011] that play important roles in transcription, immune activation and regulation of the cell cycle. The miRNAs are important in host-virus interactions where the host limits virus infection by differentially expressing miRNAs that target essential viral genes [Xie et al., 2012]. On the other hand, viruses, particularly the DNA viruses, have also evolved the ability to downregulate or upregulate the expression of specific cellular RNAs to regulate their replication [Bao et al., 2011]. To detect and quantify miRNA expression, a number of methodologies have been developed, including Northern blot [Lee et al., 2003], real-time polymerase chain reaction [Cheng and Li, 2005], microarrays [Liu et al., 2004], deep sequencing [Friedlander et al., 2008], an adeno-associated virus (AAV) reverse infection array [Dong et al., 2010] and an AAV reverse infection array-based dual-reporter system designated as the miRNA Asensor array [Tian et al., 2012]. Following the advent of these high-throughput miRNA-profiling methods, there has been a rapid accumulation of data on virus-associated host miRNA. Increasing evidence also suggests the role of miRNA in morbillivirus replication [Baertsch et al., 2014; Leber et al., 2011]. Such information is likely to provide insights for a better understanding of morbillivirus-host interactions.
Host-Associated Signatures of Virulent and Avirulent Viral Phenotypes
The molecular signatures of the host can explain the virus strain-dependent (virulent/avirulent strains) severity of the disease. For example, studies in influenza A virus-infected lung epithelial (A549) cells revealed subtle differences in the ability of the viruses to induce specific host responses, which make H5N1 influenza viruses more virulent than H1N1 viruses [Chakrabarti et al., 2010]. It was evident from such studies that highly pathogenic viruses upregulate or downregulate almost the same set of genes as do the less pathogenic viruses, though the former do so with a greater magnitude and with different kinetics. Therefore, just acquiring qualitative information on the differentially expressed genes in response to infection may only provide a part of the information needed to predict pathogenicity. The kinetics and magnitude of the host response are important determinants of the outcome of the disease, and this may have important implications for antiviral therapy. Though the host-targeting agents have less tendency to induce drug resistance, emerging evidence suggests that they are not entirely successful [Resa-Infante et al., 2015]. It is likely that rather than depending only on the target, effective host-directed therapy will also depend on the timing at which elements of the host response are suppressed or enhanced. The future application of systems virology to PPRV and other morbilliviruses is likely to explain the disease mechanisms of the avirulent (vaccine strain), virulent and highly virulent strains.
Resistance and Susceptibility of Hosts to Viral Infections
Following an acute infection, the recovery rate in PPRV-affected goats is lower than that in sheep. Similarly, RPV has a higher affinity for Asian cattle than for African cattle [Couacy-Hymann et al., 1995]. A breed effect on susceptibility/resistance to PPRV has also been reported [Lefevre and Diallo, 1990]. Although comprehensive information about all the host factors is lacking, one factor that has been identified as making water buffalo resistant to PPR (compared to goats) is the higher basal expression level of Toll-like receptors 3/7 [Dhanasekaran et al., 2014]. Systems virology could unravel all the host factors that may be responsible for disease resistance, like those identified for chicken flocks that differ in susceptibility to necrotic enteritis [Kim et al., 2014].
Signatures of Vaccine Efficacy
The application of systems-level analyses to vaccinology (vaccinomics) has enabled the identification of host gene signatures predictive of vaccine immunogenicity [Nakaya and Pulendran, 2015]. Transcriptomic analysis of PBMCs isolated 3-7 days postvaccination from healthy adults vaccinated with the yellow fever vaccine (YF-17D) revealed host gene signatures involved in antiviral sensing and viral immunity, including the type I IFN pathway [Querec et al., 2009]. The functional relevance of one of the genes contained within the predictive signatures, eukaryotic translation initiation factor 2-α kinase 4 (EIF2AK4), was later identified in programming dendritic cells to stimulate CD8+ T cell responses [Querec et al., 2009]. Further computational analysis identified signatures of gene expression, induced 3 or 7 days postvaccination, which correlated with the subsequent magnitude of the antibody and cell-mediated immune responses [Querec et al., 2009], providing proof-of-concept evidence that systems approaches can indeed be used to identify early host correlates of the later immunogenicity of a vaccine. Similarly, systems vaccinology was used to evaluate immunity to the influenza vaccine with the goal of identifying early host gene signatures that correlate with immunogenicity [Nakaya et al., 2011]. Vaccination with either inactivated or live-attenuated influenza vaccine yielded predictive host signatures consisting of genes with previously known functions in the antibody response as well as genes with previously unidentified roles in antibody responses. One such gene from the predictive signature was CAMK4, encoding the CaMKIV kinase, which was known to be involved in multiple immune system processes but had no known role in antibody responses [Nakaya et al., 2011]. Later studies with CAMK4-knockout mice confirmed that CAMK4 was important in regulating B cell responses [Liu et al., 2012].
Vaccine-associated genome-wide host gene signatures of morbilliviruses are largely unknown, but such information is likely to provide important insights that could help improve the quality, quantity and persistence of the vaccine-induced immune response.
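The idea of an early correlate of later immunogenicity can be illustrated with a minimal Python sketch. It is not the method used in the cited studies: it assumes, for illustration, that each vaccinee has an early expression value per gene (for example, a day 3 or day 7 fold change) and a later antibody titer, and it ranks genes by the absolute Spearman rank correlation between the two. The gene labels GENE_A and GENE_B, the function names and all numbers are hypothetical placeholders.

```python
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2.0 + 1.0
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 if either input has zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation computed as the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need at least three matched observations")
    return pearson(average_ranks(x), average_ranks(y))


def rank_genes(early_expression: Dict[str, Sequence[float]],
               later_titer: Sequence[float]) -> List[Tuple[str, float]]:
    """Order genes by the absolute correlation of early expression with later titer."""
    scores = {gene: spearman(values, later_titer)
              for gene, values in early_expression.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Invented values for five hypothetical vaccinees; these are NOT measured data.
titer_day28 = [40, 80, 160, 320, 640]
early = {
    "GENE_A": [0.2, 0.5, 0.9, 1.4, 2.0],   # placeholder gene label
    "GENE_B": [1.0, 0.3, 0.8, 0.2, 0.9],   # placeholder gene label
}

for gene, rho in rank_genes(early, titer_day28):
    print(f"{gene}: rho = {rho:+.2f}")
```

With thousands of genes and few subjects, a ranking of this kind would need correction for multiple testing and confirmation in an independent cohort before any gene could be treated as a predictive signature.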
Viroinformatics
The advent of new sequencing tools and bioinformatics has led to the accumulation of a large amount of genomic and experimental data. Viroinformatics is the amalgamation of virology and bioinformatics, and involves the application of information and communication technology to various aspects of virus research, such as viral metagenomics (virome), viral recombination and integration, RNA folding, RNAi studies, protein-protein interaction, structural analysis, phylotyping, genotyping and drug design [Sharma et al., 2015]. As of July 2015, the International Committee on Taxonomy of Viruses (ICTV) listed 3,704 viral species across 7 orders, 111 families and 609 genera. The data bank of the National Center for Biotechnology Information (NCBI) contains nucleotide sequences of >40,000 viral strains (across more than 900 species). Currently, there are more than 100 web servers and databases containing virus-related information [Sharma et al., 2015]. The web resources of general utility that may be applicable to most viruses are listed in table 1. Comprehensive information on virus-specific web resources may be found elsewhere [Sharma et al., 2015]. This knowledge is likely to improve our understanding of genome-genome, protein-genome and protein-protein interactions for the development of effective vaccines and common drug targets against viral pathogens.
Acknowledgements
The Department of Science and Technology, Government of India supported this work under project number SB/SO/AS-20/2014 (to N. Kumar). The funders had no role in the study design, collection, analysis and interpretation of data, writing of the report or the decision to submit the article for publication.
Disclosure Statement
The authors declare no conflicts of interest.