Genome sequences are available for three human-infecting malaria parasites, Plasmodium falciparum, P. vivax and P. knowlesi, and population genomics data are available for many endemic regions. This review summarizes how genomic data have been used to develop new, species-specific molecular targets for better malaria diagnosis. A combination of bioinformatics and genomics was used to identify new sequence targets suitable for diagnostic applications and to assess their viability in the context of global Plasmodium sequence variation. The selection criteria were chosen to maximize the sensitivity and specificity of the novel targets. At least one target from each species was found to be suitable for molecular diagnosis of malaria, with some advantages over existing molecular methods. Genome sequence data hold strong promise for developing sensitive, genus- or species-specific diagnostic methods for other pathogens of public health interest. We discuss this undertaking, together with our vision of the future of malaria diagnosis in the ‘omic’ era.
Advances in sequencing technology have led to an explosion of sequence generation, especially for infectious agents. As a result, genome sequences for many eukaryotic and prokaryotic pathogens have been generated and are publicly available. For example, the NCBI Genome Project database (http://www.ncbi.nlm.nih.gov/genome) provides access to complete genomes of a myriad of human pathogens. This near-exponential growth in available pathogen genome sequences has been made possible by major technological advances, the ‘next-generation’ DNA sequencing techniques, which have dramatically reduced the cost and time required to generate sequence data.
The genome sequence of the most lethal human malaria parasite, Plasmodium falciparum (∼23 Mb), was first published in 2002, based on the laboratory-adapted parasite isolate 3D7. Since then, thousands of global Plasmodium strains and clinical isolates have been examined for single nucleotide polymorphisms (SNPs) or sequenced to generate a map of global variation and to identify and track changes related to major phenotypes such as drug resistance [7,8]. The nuclear genome sequences of the human-infecting P. vivax (∼27 Mb) and the zoonotic P. knowlesi (∼24 Mb) are also known [3,4], and all available genome sequences can be easily accessed at PlasmoDB (http://PlasmoDB.org), along with other genome-wide datasets such as SNPs, expression data and proteomics. PlasmoDB also provides tools for comparing emerging high-throughput data sets for these organisms.
Now, more than ever, the role of genomics in understanding Plasmodium parasite biology is evident. It is hoped that the availability of Plasmodium genome sequences will revolutionize the discovery of new drug targets and the development of new vaccine targets and drug resistance markers, and will lead to improvements in other tools for fighting and controlling malaria, such as diagnostics. Recently, genome-based association studies have contributed to the identification of genomic regions associated with drug resistance [7,8]. The significance of genomic data for understanding parasite biology, pathogenesis, drug resistance and population structure has recently been reviewed extensively. This review instead highlights how genomics is contributing to the immediate public health needs of malaria diagnosis.
The molecular biologist's toolbox has improved tremendously since the advent of the polymerase chain reaction (PCR) in the mid-1980s. PCR-based assays are fairly standard in most laboratories and can be used for disease detection, forensics and drug-resistance surveillance testing. While several other disease-detection methods exist, molecular tools are still the most specific and sensitive tools. Molecular diagnostic techniques rely on in-depth knowledge of the target sequence and its global variability. Therefore, molecular diagnostic tools benefit tremendously from the increased availability of pathogen genome sequences. For example, genome sequences can be mined to discover conserved sequence regions suitable for the design of specific primers that do not cross-react with either the host or other species of the same genus. Likewise, highly variable regions can be avoided when designing detection assays, as these would lead to false negative results. Sequences which exist in high copy numbers can be used to design a more sensitive assay compared to assays utilizing single- or low-copy number sequence targets.
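As a toy illustration of mining sequences for conserved regions, the sketch below scans a set of pre-aligned sequences for windows in which every sequence is identical and gap-free. It assumes the alignment has already been produced by a separate tool; the function name `conserved_windows` and the 20-base default window are illustrative choices, not part of any published pipeline.

```python
def conserved_windows(aligned_seqs, window=20):
    """Return 0-based start indices of windows in which all aligned
    sequences carry the same non-gap base at every column."""
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to equal length")

    # A column is conserved if every sequence has the same base and it is not a gap.
    conserved = [
        len({s[i] for s in aligned_seqs}) == 1 and aligned_seqs[0][i] != "-"
        for i in range(length)
    ]

    starts = []
    run = 0  # length of the current run of consecutive conserved columns
    for i, ok in enumerate(conserved):
        run = run + 1 if ok else 0
        if run >= window:
            starts.append(i - window + 1)
    return starts


# Hypothetical mini-alignment: the third sequence differs at column 4.
example = ["ACGTACGTAC", "ACGTACGTAC", "ACGTTCGTAC"]
print(conserved_windows(example, window=4))
```

In practice, candidate windows found this way would still need to be checked against host and related-species sequences before primers are designed.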
Why New Tools for Malaria Diagnosis?
Malaria continues to be a major global public health challenge. There were approximately 225 million cases of malaria in 2009, with an estimated 781,000 deaths. Five different Plasmodium species with different clinical implications infect humans in different combinations around the world. There is a renewed, concerted, international effort to fight and eliminate malaria through funding from the Global Fund, the U.S. President's Malaria Initiative, the Bill and Melinda Gates Foundation and other private and public sources. Thus, new diagnostic assays and tools are needed that are robust enough to accurately identify the infecting parasite species for case management, to detect transmission foci and malaria reservoirs (submicroscopic infections), and to monitor the success of malaria control and elimination programs.
Existing tools for malaria diagnosis include microscopy, parasite antigen/enzyme detection kits (commonly referred to as rapid diagnostic tests, RDTs) and molecular tools, which are mostly restricted to reference laboratories (reviewed in ). Microscopy remains the gold standard when a suitable infrastructure is available. It is the cheapest method, can differentiate species of malaria parasites and provides quantitative data on the level of parasitemia. One limitation of microscopy is that it can fail to identify mixed infections and/or low levels of parasitemia. Microscopic detection limits range from 50 to 100 parasites/μL, depending on the expertise of the microscopist and other factors (reviewed in ). As malaria often occurs in communities where even microscopic diagnosis is not readily available, RDTs are an alternative diagnostic tool. Current RDTs capture parasite products such as histidine-rich protein-2 (PfHRP-2), which is P. falciparum-specific, or the enzymes aldolase and lactate dehydrogenase (LDH), which are genus-specific and can therefore detect all Plasmodium species but cannot discriminate among them. However, highly divergent species such as P. ovale can produce negative RDT results due to sequence variation. In addition, the absence of the hrp-2 gene in some P. falciparum parasites in parts of South America and Africa [13,14] has raised concerns about HRP-2-based RDTs because of the potential for false negative results. Proteomics data generated from multiple strains can be used to develop new-generation RDTs that use multiple protein targets to overcome these challenges.
Nucleic acid-based techniques such as PCR have revolutionized pathogen detection and identification by offering high sensitivity and specificity. Molecular methods for malaria diagnosis have at least two advantages over the other methods: they can accurately define the species of malaria parasite(s), and they can detect parasite densities well below the limits of microscopy. Previously, the field of malaria diagnosis was mired in debate over which tool is clinically relevant for the diagnosis and treatment of malaria, especially in highly endemic regions; the debate arose because molecular tools are sensitive enough to detect submicroscopic infections, which often do not produce clinical manifestations (subclinical infections). Others argued that molecular tests may detect ‘lingering’ parasite DNA rather than active infections and may therefore not be appropriate assays to use. However, the malaria elimination era makes it paramount that even the ‘last parasite’ is found. In addition, recent studies have clearly demonstrated that submicroscopic gametocyte carriers (who can only be detected molecularly) are capable of transmitting malaria. Molecular tools are thus appealing for both case management and programmatic operations, such as the monitoring and evaluation of control programs. Molecular diagnostic tools have also helped to identify zoonotic transmission of P. knowlesi in parts of Southeast Asia; these infections were previously diagnosed as P. malariae or P. falciparum by microscopy. Molecular tools are likewise helping to identify new species of malaria parasites, including P. falciparum-like parasite species in non-human primates.
Evolution of Molecular Tools for Malaria Diagnostics
Several molecular diagnostic tools for malaria are available, the majority of which are PCR-based assays. Recent technologies have led to the development of other amplification tools, such as loop-mediated isothermal amplification assays, that are simpler and more field-adaptable [18,19]. Molecular diagnosis of malaria parasites began about 20 years ago with the use of the 18S ribosomal RNA (18S rRNA) gene as the target, and this method, with various modifications, is widely used in many reference laboratories. This target was a logical choice in the pre-genomics era. Its regions of conserved sequence allowed cloning from multiple Plasmodium species, facilitating the subsequent design of species-specific primers. Also, all eukaryotic organisms that had been examined at that time contained multiple, often hundreds of, identical copies of the 18S rRNA gene, so it seemed likely that this target would yield a very sensitive assay. Plasmodium genome sequences have subsequently revealed that the 18S rRNA target is present in only 4-8 divergent copies, depending upon the species [2,3]. The sensitivity of PCR is strongly influenced by the starting copy number of the target molecule; a low target copy number within the parasite limits the detection capability of these assays, especially when parasitemia is low. Previous multiplex assays for simultaneous detection of malaria parasite species showed decreased sensitivity, particularly in detecting the minor species. Because many primers compete for the same target, the 18S rRNA gene also presents challenges for effective multiplex platforms, which would otherwise cut costs and test time and reduce the possibility of contamination.
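The effect of target copy number on sensitivity can be made concrete with a back-of-the-envelope calculation. The sketch below is an illustration only: it assumes one genome per parasite, perfect DNA recovery unless stated otherwise, and Poisson sampling of template molecules into the reaction. The function names and the example values (5 versus 40 copies per genome, 5 μL of blood equivalent per reaction) are hypothetical and are not measurements from the assays discussed here.

```python
import math


def expected_target_copies(parasites_per_ul, blood_volume_ul,
                           copies_per_genome, recovery=1.0):
    """Mean number of target molecules entering one reaction.

    Assumes one genome per parasite; `recovery` is the fraction of DNA
    retained through extraction (1.0 = no loss)."""
    return parasites_per_ul * blood_volume_ul * copies_per_genome * recovery


def prob_at_least_one_template(mean_copies):
    """Poisson probability that a reaction receives one or more templates."""
    return 1.0 - math.exp(-mean_copies)


# Hypothetical comparison at a very low parasitemia (0.05 parasites/uL),
# with 5 uL of blood equivalent per reaction.
for label, copies in [("low-copy target", 5), ("high-copy repeat", 40)]:
    mean = expected_target_copies(0.05, 5, copies)
    print(label, round(mean, 2), round(prob_at_least_one_template(mean), 3))
```

Under these assumptions, the higher-copy target delivers proportionally more template molecules per reaction, so the chance that a low-density sample contains no amplifiable target at all drops accordingly.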
Use of Genomics and Bioinformatics to Improve Malaria Diagnostic Assays
The new era of genomics and sequence availability can be harnessed to improve molecular diagnostic tools for malaria. Recently, we developed a bioinformatics approach to identify new sequence targets that are suitable for diagnostic applications (fig. 1) in P. falciparum, P. vivax and P. knowlesi. Ideal targets will be species-specific, present in many, well-conserved copies (ensuring high sensitivity) and amenable to PCR or other nucleic acid amplification techniques. Our semi-automated bioinformatics pipeline employed a number of screens to ensure candidate targets met these criteria [23,24].
Our candidate target selection process began by accessing genome sequence data for P. falciparum (strain 3D7), P. vivax (strain Sal-1) and P. knowlesi (strain H) from PlasmoDB. Each genome was mined for repetitive sequence content, and a consensus repeat sequence (CRS) for each repeat family was generated. We eliminated all CRSs that showed significant similarity to human sequences, to artificial sequences that may have been introduced during genome sequencing, or to sequences from other Plasmodium spp., as well as those containing internal tandem repeats that could interfere with PCR amplification. Only species-specific CRSs that did not exhibit high levels of polymorphism in existing SNP data sets were considered further. We also eliminated candidates shorter than 300 bp to allow for primer design and evaluation of target conservation. Repeat families with at least six copies were considered for further testing, yielding a total of 21 P. falciparum, 68 P. vivax and 19 P. knowlesi candidates.
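The screening logic described above can be sketched as a simple filter. This is a minimal illustration, not the pipeline itself: it assumes that the similarity searches, tandem-repeat detection and SNP lookups have already been run and reduced to boolean flags, and the class and field names are hypothetical. Only the two numeric thresholds (at least 300 bp, at least six copies) come from the text.

```python
from dataclasses import dataclass

MIN_LENGTH_BP = 300  # candidates shorter than this were eliminated
MIN_COPIES = 6       # repeat families needed at least six copies


@dataclass
class RepeatCandidate:
    """One consensus repeat sequence (CRS) with precomputed screen results.
    All field names are illustrative."""
    name: str
    length_bp: int
    copies: int
    hits_human: bool             # significant similarity to human sequence
    hits_artificial: bool        # similarity to artificial (e.g. vector) sequence
    has_tandem_repeat: bool      # internal tandem repeats present
    hits_other_plasmodium: bool  # similarity to other Plasmodium spp.
    highly_polymorphic: bool     # high polymorphism in existing SNP data


def passes_screens(c: RepeatCandidate) -> bool:
    """True if the candidate survives every exclusion screen and threshold."""
    excluded = (c.hits_human or c.hits_artificial or c.has_tandem_repeat
                or c.hits_other_plasmodium or c.highly_polymorphic)
    return (not excluded
            and c.length_bp >= MIN_LENGTH_BP
            and c.copies >= MIN_COPIES)


def select_candidates(candidates):
    """Return the candidates that pass all screens, preserving input order."""
    return [c for c in candidates if passes_screens(c)]
```

Expressing each screen as an independent flag makes it easy to report which criterion removed a given candidate, or to relax one criterion when adapting the approach to another organism.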
Six P. falciparum, seven P. vivax and four P. knowlesi putative targets were selected for further validation. More than 64 primer pairs were designed against these targets and empirically tested in conventional PCR amplification assays and multiplex assays (for P. falciparum and P. vivax). At least one putative target from each species (summarized in table 1) was found to significantly improve existing diagnostic capabilities: candidates Pfr364 (P. falciparum), Pvr47 and Pvr64 (P. vivax; designed for loop-mediated isothermal amplification assays) and Pkr140 (P. knowlesi).
Validation of PCR Primers from Novel Targets
All primers designed against the four novel targets were shown to amplify the intended target with high specificity and sensitivity (table 1). In addition, the limits of detection of these assays were improved. Of particular interest is the novel primer set for detection of P. knowlesi, a species that exhibits stage-dependent morphological similarities to P. malariae and P. falciparum. The commonly used molecular assay for the detection of P. knowlesi infection was recently noted to cross-react with P. vivax, leading to potential false positive results for a small proportion of human clinical P. vivax samples, and with other simian Plasmodium species. Primers designed against the Pkr140 target consistently identified P. knowlesi without cross-reacting with other human malaria parasites or with five other primate Plasmodium parasites (P. simiovale, P. cynomolgi, P. inui, P. coatneyi and P. hylobati).
Genomic data mining allowed us to develop sensitive and highly specific PCR tests for three Plasmodium species. Similar approaches can be used for the other human-infecting species as data are available. The newly developed tests highlighted here need further validation in the field, and their performance can be improved by modifying reaction parameters as needed.
Limitations of the Described Approach
What we have described above is only one example of how genome data can be exploited to develop new diagnostic methods. The genome search methodology we have developed can be modified further depending on the specific needs of the tools being developed. The use of a single strain to select a candidate sequence, as described here, is a limitation that can be addressed. To confirm that the designed primers detect diverse strains of the same species, one can test the novel primers against different strains from across the world, as was previously done by Demas et al. Moreover, as additional SNP and population variation data become freely accessible via PlasmoDB and other resources such as MalariaGEN (Malaria Genomic Epidemiology Network), these databases can be used to determine sequence variation in much greater detail and to aid the design of novel primers capable of detecting field isolates.
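One simple way to use population variation data during primer design is to flag primers whose binding sites overlap known SNP positions. The sketch below assumes that primer footprints and SNP positions are given as 1-based coordinates on the same reference sequence; the function name, data layout and example coordinates are hypothetical and are not tied to the PlasmoDB or MalariaGEN data formats.

```python
def primers_overlapping_snps(primers, snp_positions):
    """Map each primer name to the known SNP positions inside its footprint.

    primers: dict of name -> (start, end), 1-based inclusive coordinates.
    snp_positions: iterable of 1-based SNP coordinates on the same reference.
    Primers with no overlapping SNP are omitted from the result."""
    snps = sorted(set(snp_positions))
    flagged = {}
    for name, (start, end) in primers.items():
        hits = [p for p in snps if start <= p <= end]
        if hits:
            flagged[name] = hits
    return flagged


# Hypothetical coordinates for a forward and a reverse primer.
primers = {"fwd": (101, 120), "rev": (380, 400)}
known_snps = [50, 110, 401]
print(primers_overlapping_snps(primers, known_snps))
```

A flagged primer is not necessarily unusable, but a variant under the binding site, particularly near the 3' end, is a reason to redesign it or to test it against isolates that carry the variant.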
New-Generation Molecular Tools
The current ‘omic’ era greatly complements the growing number of new-generation tests and platforms for molecular diagnosis. Recently, a microfluidic technology-based TaqMan low-density array (Life Technologies, Carlsbad, Calif., USA) card for real-time PCR detection of 21 different respiratory pathogens was described. Such array formats can be adapted for the detection of multiple malaria parasite species and can be integrated with the detection of other relevant pathogens. Further development of less expensive, simple, field-usable DNA microarrays for malaria detection will be useful for large-scale surveillance programs. Several isothermal amplification assays have been described, e.g. the SAMBA HIV-1 test for the rapid visual detection of HIV-1 by dipstick, loop-mediated isothermal amplification, and its modification for real-time detection using portable devices. Mens et al. recently developed a direct-blood PCR assay, visualized by nucleic acid lateral flow immunoassay, that is capable of detecting Plasmodium; this assay attempts to circumvent the DNA isolation step, a major hurdle to the simplification of molecular tests. The current direction is towards simple, robust and cheap tests that can detect multiple disease agents and can, it is hoped, be used in point-of-care settings with limited resources.
Use of Proteomics to Improve Malaria Antigen-Based RDTs
The next generation of RDTs will require a better selection of Plasmodium antigens that are highly abundant, highly conserved across geographic regions and species-specific. The use of genomics and bioinformatics is not limited to the development of molecular assays; these tools, together with proteomics, can be used to select Plasmodium antigens that meet these standards. In an effort to improve current malaria RDTs, Kattenberg et al. recently tested monoclonal antibodies against three Plasmodium antigens (glutamate-rich protein, dihydrofolate reductase-thymidylate synthase and heme detoxification protein) for their potential use in RDTs. This study highlights the feasibility of developing next-generation RDTs with multiple protein targets that could be identified and validated using proteomics data.
Molecular Barcoding for Differentiating Parasite Populations and Their Use in Outbreak and Elimination Programs
In outbreak investigations and during post-elimination monitoring of malaria, it will be important to identify the geographic source of malaria parasites. Recently, a molecular barcode comprising 24 SNPs from the P. falciparum genome was developed and tested for its utility in differentiating parasite populations. Further validation and improvement of such molecular barcodes, and curation of publicly shared databases housing these data, will be invaluable for public health investigations and for tracking drug-resistant parasite populations. Using multi-locus genotyping of microsatellite markers, it has been shown that recent Peruvian P. falciparum populations evolved from five clonal lineages in the post-malaria eradication era, demonstrating the utility of genomic data in facilitating the public health response to control and eliminate malaria.
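To illustrate how SNP barcodes might be compared, the sketch below computes the fraction of matching alleles between two barcodes, skipping positions where either call is missing. The string encoding (one character per SNP, with "X" marking a missing call) and the function name are assumptions made for this example and may not match the encoding of the published barcode.

```python
def barcode_identity(barcode_a, barcode_b, missing="X"):
    """Fraction of SNP positions with identical alleles in two barcodes.

    Positions where either barcode has a missing call are ignored.
    Returns None if no position can be compared."""
    if len(barcode_a) != len(barcode_b):
        raise ValueError("barcodes must cover the same SNP positions")
    compared = [
        (a, b) for a, b in zip(barcode_a, barcode_b)
        if a != missing and b != missing
    ]
    if not compared:
        return None
    matches = sum(a == b for a, b in compared)
    return matches / len(compared)


# Two hypothetical 24-SNP barcodes that differ at the first position only.
isolate_1 = "ACGT" * 6
isolate_2 = "TCGT" + "ACGT" * 5
print(barcode_identity(isolate_1, isolate_2))
```

Pairwise identities of this kind could feed a clustering step to group isolates by likely origin, although any threshold for calling two parasites related would need to be validated empirically.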
The availability of genome data for human malaria parasites has led to significant advances in understanding the biology of malaria parasites and in identifying genetic targets associated with evolving resistance to artemisinin, among other applications. We have highlighted the value of generating population-level genomic data, applying bioinformatics methods and mining these data to identify novel targets for malaria diagnosis and surveillance. We view these initial forays into bioinformatics as only the tip of the iceberg when considering the possible uses of genomic data to address a wide variety of public health demands. As genomic knowledge of malaria parasites advances, novel diagnostic tests with better sensitivity and specificity than current tests can be developed. Such tools will be highly relevant for malaria control programs and an efficient public health response.