Abstract
Epigenetics is the study of heritable and non-heritable genetic coding that is additive to information contained within classical DNA base pair sequences. Differential methylation has a fundamental role in the development and outcome of malignancies, chronic and degenerative diseases and aging. DNA methylation can be measured accurately and easily via various molecular methods and has become a key technology for research and healthcare delivery, with immediate roles in the elucidation of disease natural history, diagnostics and drug discovery. This review focuses on cancers of the lower genital tract, for which the most epigenetic information exists. DNA methylation has been proposed as a triage for women infected with human papillomavirus (HPV) and may eventually directly complement or replace HPV screening as a one-step molecular diagnostic and prognostic test. Methylation of human genes is strongly associated with cervical intraepithelial neoplasia (CIN) and cancer. Of the more than 100 human methylation biomarker genes tested so far in cervical tissue, close to 20 have been reported in different studies, and approximately 10 have been repeatedly shown to have elevated methylation in cervical cancers and high-grade CIN (CIN2 and CIN3), most prominently CADM1, EPB41L3, FAM19A4, MAL, miR-124, PAX1 and SOX1. Obtaining consistent performance data from the literature is quite difficult because most methylation studies used a variety of different assay methodologies and had incomplete and/or biased clinical specimen sets, varying assay thresholds and disparate target gene regions. There have been relatively few validation studies of DNA methylation biomarkers in large population-based screening studies, but an encouraging development more recently is the execution of well-designed studies to test the true performance of the markers in real-world settings. 
Methylation of HPV genes, especially HPV16, HPV18, HPV31, HPV33 and HPV45, in disease progression has been a major focus of research. Elevated methylation of the HPV16 L1 and L2 open reading frames, in particular, is associated with CIN2, CIN3 and invasive cancer. Essentially all cancers have high levels of methylation for human genes and for driver HPV types, which suggests that quantitative methylation tests may have utility in predicting CIN2 and CIN3 that are likely to progress. It is still early in the process of development of methylation biomarkers, but already they are showing strong promise as a universal and systematic approach to molecular triage, applicable to all cancers, not just cancer of the cervix. DNA methylation testing is better than HPV genotyping triage and is competitive with or complementary to other approaches such as cytology and p16 staining. Genome-wide studies are underway to systematically expand methylation classifier panels and find the best combinations of biomarkers. Methylation testing is likely to show big improvements in performance in the next 5 years.
Epigenetics
A principal goal of epigenetics is to understand the molecular patterning controls that govern the decoding and flow of genetic information [1,2]. For example, a simple sequence of DNA nucleotides such as ACTGACTG embedded in a chromosome carries information that the cell may decode in different ways based on the associated context. The cell may use the information to make part of a protein, part of a regulatory or structural element or perhaps a silent spacer. If the sequence changes to ACCGACTG, the mutation from T to C may or may not lead to functionally different information in the cell, depending on how the sequence is decoded. Epigenetics relates to how a gene sequence at any particular chromosomal location may be decoded in different ways without recourse to nucleotide mutations. As we will see below, differential decoding can lead to profoundly altered outcomes such as a change from a normal to a cancerous cell. Epigenetic mechanisms play a dynamic role in the creation and development of the human body, maintenance of health, ageing and death [1,2]. In contrast to genetic traits, epigenetic patterns are rarely inherited by offspring in a Mendelian manner; rather, they accumulate and increase in complexity from relatively simple imprinted profiles that evolve actively during development. Although the vast majority of methylated CpG motifs are not inherited meiotically, they are transmitted mitotically through cell lineages with good fidelity.
Epigenetic patterns guide future developmental possibilities and lock differentiated tissues into given states. The processes are affected by the environment, which leads to some interesting consequences; for example, particular epigenetic patterns accumulate on DNA and proteins during embryogenesis and may be further modified over time in ways that are different for every individual. Detrimental exposure to carcinogens, a poor diet, stress, a lack of exercise, obesity, etc., result in disease-promoting patterns, while beneficial lifestyle factors and effective drugs may result in health-maintaining or homeostatic patterns. The patterns are fluid and evolve to represent the molecular history of exposures and events in the body. With new molecular assays, we are learning how to read these codes, a pathway of discovery likely to dramatically change prognostic medicine and provide much greater accuracy in the prediction of disease outcomes to the point of practical personalized medicine [1,2,3].
Epigenetic patterns dictate how and when given sets of genes are expressed or silenced in tissues. A certain pattern of methylation may support neuronal differentiation, while another may support carcinoma (fig. 1). An interesting aspect of epigenetics is that the patterns can change during life and, although they are quite stable once established, they are potentially reversible or modifiable by drugs. Some epigenetic drugs may allow a return to an earlier state of health, while other exposures may induce progression to worse disease. The epigenetic mechanism can be imagined as a conductor of the genetic orchestra, bringing out the artistic individuality of each played piece.
Schematic representation of DNA methylation, somatic mutations and HPV infection events during the development of normal epithelial tissues and progression to cervical cancer. There are tens of thousands of DNA changes that occur during the process depicted in this example, most of which are methylation. Epigenetic events are highly complex and play an important role in both normal development and progression to cancer. Normal methylation activity is shown along the arrows by CH3 in green, while abnormal methylation is shown in red. Mutational events (mut) are shown as an averaged mutational burden estimated from exome-sequencing studies [Lau and Lorincz, in preparation], with the larger letters (mut) indicating more mutations during progression from CIN3 to cancer. The main steps of high-risk HPV (hrHPV) infection and phenotypic effects are shown by the hexagons with the letter H and the red arrows.
The Interface of Epigenetics and Genetics
The heart of the epigenetic mechanism is methylation patterning, which occurs on various macromolecules including DNA and proteins [1,2]. At its simplest, the chemistry of mammalian DNA methylation involves the enzymatic addition or removal of a methyl group (CH3) on specific cytosine (C) nucleotides in a DNA sequence. This results in an alteration of the local steric and hydrophobic properties of the DNA. The presence of one or a few methyl groups may interfere with binding of regulatory factors due to steric or electrostatic exclusion, or it may enhance other interactions due to newly conducive van der Waals surfaces. Many methyl groups on a given sequence will lead to substantially more hydrophobic stretches of DNA that can become locally condensed and less accessible to proteins in general. At extreme levels of methylation, long segments of DNA or entire chromosomes condense into heterochromatin and become predominantly transcriptionally inert. The overall effects of changes in DNA methylation lead to temporary or permanent masking or unmasking of genetic information represented by the affected sequences. Aberrant methylation has additional effects on DNA integrity during carcinogenesis. DNA instability leads to an accumulation of somatic mutations encompassing nucleotide changes and gene copy number variations due to indels, translocations, transpositions, amplifications and loss of heterozygosity [1,2,4]. DNA methylation occurs most commonly on the 5-position of the aromatic ring of a C that is immediately followed by a G; these sequences are called CpG sites or dyads because they form a reversed image on the anti-parallel DNA strand. Methylated CpGs are often located in CpG islands, which are regions of DNA, mostly in the vicinity of gene promoters or in gene bodies, that have a higher-than-expected representation of CpG motifs [5].
Much past research on DNA methylation has concentrated on CpG sites in these islands, but it is now clear that important methylation and demethylation events often occur on the edges of CpG islands (CpG shores, shelves or boundaries) or indeed completely away from the CpG islands [5,6,7]. Methylation also occurs on proteins, most prominently on histones that coat DNA in cell nuclei and form chromatin. CpG islands are often heavily methylated on their DNA and histones, except in regions where there are active gene promoters. It is for this reason that methylation of tumour suppressor gene promoters is so important - given enough methyl groups, the gene is silenced and has an effect analogous to a disabling Watson-Crick mutation or indel [1,2,5]. DNA and protein methylation work in concert to reinforce the overall patterns that govern decoding of genetic information.
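The definitions of a CpG site and of a CpG-rich region can be made concrete with a short Python sketch. It counts CpG dyads in a sequence, confirms that the same dyads appear on the anti-parallel strand, and computes an observed/expected CpG ratio. The ratio used here (number of CpGs multiplied by sequence length, divided by the product of the C and G counts) is a commonly used heuristic for "higher-than-expected representation" and is an assumption of this example, not a criterion taken from the studies cited above. The function names are illustrative only.

```python
"""Count CpG sites and compute a simple observed/expected CpG ratio."""

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the sequence of the anti-parallel strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def count_cpg(seq: str) -> int:
    """Count positions where a C is immediately followed by a G."""
    seq = seq.upper()
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == "CG")


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (CpG count * length) / (C count * G count)."""
    seq = seq.upper()
    c_count, g_count = seq.count("C"), seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    return count_cpg(seq) * len(seq) / (c_count * g_count)


if __name__ == "__main__":
    # The two short sequences used as an example in the Epigenetics section.
    for s in ("ACTGACTG", "ACCGACTG"):
        print(s, count_cpg(s), count_cpg(reverse_complement(s)))
```

In this toy case the T-to-C change in ACCGACTG creates a single CpG site that is absent from ACTGACTG, and the site is found on both strands because the reverse complement of CG is again CG.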
Given its biological importance, it is not surprising that DNA methylation is tightly controlled by specific enzymes (DNA methyl transferases) and regulatory factors, which are in turn controlled by the innate system circuitry of the tissue [1,2,4]. Methylation command systems have a window to the outside world, and external stimuli are allowed to impact and guide the patterns that develop. The epigenetic-environmental interface is part of a natural system that exists for the purpose of providing a more responsive and dynamic somatic palette than available from chance accumulation of Watson-Crick-type nucleotide mutations. Mutations that occur in germ cells can result in an intergenerational evolutionary step, but it is obvious that this response rate is many orders of magnitude lower than the epigenetic response rate. Of course, since carcinogenesis in the soma is a dead-end pathway, the build-up of mutational and epi-mutational events during disease progression does not noticeably impact the intergenerational flow of coding information. A key difference between epigenetic change and classical Watson-Crick mutations is that methylation changes are, at least in principle, more easily reversible, a feature that gives hope for future epi-reversals of minimally genetically damaged cancers. In other words, as long as the cancer has not festered too long and developed huge numbers of somatic mutations, it could be induced to differentiate into a permanently quiescent tissue by targeted epigenetic drugs.
Measurable methylation pattern change in humans can be quite rapid, taking only a few months; hence, it is a deep reservoir for discovery of biomarkers that precede or evolve with disease [8,9]. Epigenetic biomarkers are especially important in complex multi-gene diseases such as common cancers, and epigenetic changes are much more frequent than genetic changes. The study of DNA methylation in particular is producing a huge number of interesting biomarkers. Although there are many diverse aspects to methylation, the translational studies usually focus on CpG patterns. DNA methylation is a mitotically transmitted epigenetic motif that can be measured with good accuracy in tissue biopsies, exfoliated cells, blood and other fluids and can be related to exact physical locations in the genome; this feature lends itself more readily to medical diagnostics. Clinically relevant methylation changes are known in many human cancers such as cervix, prostate, breast, colon, bladder, stomach, oesophagus and lung cancers [1,10,11,12]. Differential methylation may have a central role in the development and outcome of most if not all human malignancies. The advent of accurate quantitative assays, including epigenome-wide deep sequencing, holds great promise for biomarker and drug discovery and the implementation of routine molecular clinical testing.
DNA Methylation and Cervical Cancer
Changes in DNA methylation are central to most cancers, including genital tract cancers. The changes cause defective gene expression, genetic instability through faulty condensation of chromosomes, and silencing of mobile DNAs such as jumping genes (transposons) and viruses. Attempted silencing of mobile DNA can spur molecular evolution as variants of the targeted DNA escape repression, which adds to host genetic instability. Methylation mechanisms can also amplify the effects of host mutations and lead to pre-cancers and cancers with relatively few detectable relevant genetic changes. Genome-wide studies of cancers generally show thousands of host somatic mutations or polymorphisms, but most changes have no significant associations or have quite small effect sizes. Much causal molecular pathway information seems to be missing or hidden in an excess of incidental mutations [13,14], suggesting that key disease progression drivers may often be epigenetic in origin.
From a global perspective, cervical cancer was one of the most important cancers in women before the advent of widespread cytological screening; unfortunately, it is still among the most common cancers in many developing countries due to ineffective control measures. GLOBOCAN [15] data show an estimated 528,000 cases and 266,000 deaths worldwide due to cervical cancer, of which 83,000 cases and 35,000 deaths occurred in more developed regions and 445,000 cases and 230,000 deaths occurred in less developed regions. Papanicolaou (Pap) cytology screening programs detect most cervical intraepithelial neoplasias (CIN) with a potential to transform into malignancy and for which treatment may prevent the cancer. Unfortunately, the Pap test is difficult to implement and retain at high quality, especially in underdeveloped countries [16]. Numerous studies over the last 20 years have shown that even high-quality Pap cytology may miss 30% or more of high-grade CIN (CIN2 and CIN3) and invasive cancers [17,18,19,20]. Pap screening has been successful because it is repeated frequently over the lifetime of a woman.
Human papillomavirus DNA testing is more sensitive but less specific than Pap cytology [20,21,22]. The large majority of HPV infections become undetectable within a few years, and such transient infections are not associated with progression to cancer [21]. If performed at similar intervals, HPV testing produces positive results much more often in women who do not have CIN2 or CIN3 than does cytology testing; the medical community has struggled with this fact since the introduction of HPV DNA testing [21,22].
New Molecular Triage Tests for HPV-Positive Women
It seems increasingly likely that HPV testing will eventually replace Pap cytology in most parts of the world [18,21,22]. The relatively low specificity of HPV DNA testing is slowing progress. Finding a better triage test for HPV-infected patients remains an important goal. Of greater importance are accurate molecular prognostic classifiers for people with hrHPV-positive (hrHPV+) test results, which could be done on the screening specimen and would reflexively indicate the future risk of progression. The ability to accurately tell whether the HPV infection will become a CIN3 or disappear would radically transform screening programs. The results would be reduced testing, lower costs, fewer overtreatments and less anxiety; this process may eventually eliminate cytology in cervical cancer screening.
With high anticipated volumes of HPV testing, the impact of small differences in assay performance is greatly magnified, and at a population level there is a large cost to pursuing a testing strategy with an overall lower specificity. There are many current triage tests for hrHPV+ people, including cytology, immunostaining for p16 and Ki-67 and genotyping for HPV16, HPV18, HPV33 and HPV45 or combinations [17,23,24,25,26]. Only genotyping has the benefit of a fully integrated molecular test, although merely differentiating between high-risk and medium-risk HPV types adds relatively little clinical value [18,24,25,27]. Numerically, most cervical HPV infections are of medium risk, which furthermore represents a diverse group, and they cannot be ignored because these types lead to many cases of CIN3 and cancer [21,24,25,26,27,28,29].
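The population-level cost of a lower specificity can be illustrated with simple arithmetic. The Python sketch below compares two hypothetical triage tests that differ only in specificity. The cohort size, disease prevalence, sensitivity and specificity values are invented for illustration and are not taken from any of the cited studies.

```python
"""Effect of a small specificity difference on triage referrals."""


def triage_referrals(n_women: float, prevalence: float,
                     sensitivity: float, specificity: float) -> dict:
    """Expected true and false positives, and PPV, for a triage test."""
    diseased = n_women * prevalence
    healthy = n_women - diseased
    true_pos = diseased * sensitivity
    false_pos = healthy * (1.0 - specificity)
    return {
        "true_positives": true_pos,
        "false_positives": false_pos,
        "ppv": true_pos / (true_pos + false_pos),
    }


if __name__ == "__main__":
    # Hypothetical cohort: 100,000 hrHPV+ women, 10% with CIN2+.
    for spec in (0.65, 0.70):
        result = triage_referrals(100_000, 0.10, 0.75, spec)
        print(f"specificity {spec:.2f}: "
              f"{result['false_positives']:.0f} false positives, "
              f"PPV {result['ppv']:.3f}")
```

Under these assumed inputs, a 5-point gain in specificity removes about 4,500 unnecessary referrals per 100,000 hrHPV+ women while the number of detected cases is unchanged.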
A new and promising triage for hrHPV+ women is DNA methylation testing [1,8,10,17,27,28,30]. Methylation measurement can be done as a simple reflex to the original screening specimen. The sample for the test can be placed in a transport liquid or mailed to the lab as a dry swab, eliminating the need for expensive and inconvenient transport media required for the preservation of morphology. There are many different molecular methodologies for DNA methylation testing. The most common is quantitative methylation-specific PCR (QMSP) [31]. Some studies have used pyrosequencing because it can be more accurately correlated to an absolute level of methylation, usually expressed as a simple percentage or proportion [8]. An older and almost obsolete method is bisulphite sequencing. The method gives good details of methylation variation on individual strands of DNA but is labour intensive and involves a potentially biasing cloning step [32]. More recently, methylation testing has moved into the genome-wide era with various chip-based and massively parallel deep-sequencing approaches [31].
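Most of the methods listed above rely on bisulphite treatment, which converts unmethylated C to uracil (read as T after PCR) while leaving methylated C unchanged. The Python sketch below mimics that conversion in silico and shows how a methylation level at one CpG can be expressed as a simple percentage of C signal over total C plus T signal. The example sequence, the methylated position and the signal values are hypothetical, and the functions are a simplified illustration, not a description of any particular assay.

```python
"""In-silico bisulphite conversion and percentage methylation at one CpG."""


def bisulfite_convert(seq: str, methylated_positions) -> str:
    """Convert unmethylated C to T; methylated C (0-based index) stays C."""
    methylated = set(methylated_positions)
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def percent_methylation(c_signal: float, t_signal: float) -> float:
    """Methylation level as 100 * C / (C + T) at a single CpG position."""
    total = c_signal + t_signal
    if total <= 0:
        raise ValueError("no signal at this position")
    return 100.0 * c_signal / total


if __name__ == "__main__":
    # Hypothetical template with one methylated CpG (C at index 2).
    print(bisulfite_convert("ACCGACTG", methylated_positions=[2]))
    # Hypothetical signals: 30 units of C and 70 units of T.
    print(percent_methylation(30, 70))
```

After conversion, only the methylated C survives as C, so the proportion of C at a given CpG across many template molecules estimates the proportion of methylated strands.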
DNA Methylation Testing: Viral or Host Targets?
Regardless of the methodology, DNA methylation assays have 3 basic target sequence designs that interrogate differentially methylated CpG patterns on HPV genomes, human genomes or a combination of HPV and human DNA. It is not obvious which of these designs will produce the best classifiers going forward. Methylation assessment of both HPV and human genes provides synergistic information, but the assay design is slightly more complex [32,33,34]. The combination assay may have a greater sensitivity for CIN3 and a better unbiased area under the receiver operating characteristic curve (AUC) [35]. Almost all of the cancers can be detected in exfoliated cell samples by any of the 3 approaches, but catching cancers is not as important as catching pre-cancers [10,36]. The argument that an absence of detectable methylation in CIN2 and CIN3 identifies high-grade lesions that do not progress is an interesting idea and, if true, would quickly revolutionize patient management strategies. However, the idea needs to be proven in large, well-designed clinical studies, of which there are none to date.
HPV Genome Methylation
Differential changes in HPV DNA methylation were first reported in HPV1a in 1984 [37]. Similarly to methylation patterns seen more recently in the genital hrHPV types, HPV1 genomes extracted from warts had high levels of methylation of the viral late regions and low levels in the upstream regulatory and early region [9]. Gross genomic methylation patterns in the genital hrHPV types are quite similar. HPV16 has many differentially methylated CpG sites, with increased methylation of many but not all CpG sites in the L1, L2 and E2 genes being strongly associated with cervical carcinogenesis (fig. 2) [9,30,38]. HPV16 and HPV18 combined contribute to approximately 70% of cervical cancers, while HPV31 and HPV33 are among the next most prevalent types, causing another 8% of cancers [21,24]. Expanding the methylation classifier to a panel of hrHPVs has been diagnostically worthwhile [34,35]. Accurate measurement of individual CpG DNA methylation on HPV genomes may be useful not only in epidemiological studies but also in following up HPV-infected women. A number of papers indicate that CpG methylation in HPV16 increases at a rate of about 0.5-0.7% per year while the infection remains persistent [8,9]. It is possible that this feature could be used to track infections in populations over time and to calculate on-going risks of progression to CIN3 as methylation approaches levels characteristic of high-grade lesions. An interesting question is whether some patients have a faster methylation rate than others and, if so, whether this may be linked to a higher risk of cancer.
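The idea of tracking methylation over time can be sketched with a very simple linear projection. The Python example below assumes that the reported rate of 0.5-0.7% per year refers to absolute percentage points of methylation and that accumulation is linear during persistence; both are simplifying assumptions of this example. The current level and the threshold used in the demonstration are hypothetical values, not clinically validated cut-offs.

```python
"""Linear projection of HPV16 CpG methylation during a persistent infection."""


def years_to_threshold(current_pct: float, threshold_pct: float,
                       rate_pct_per_year: float) -> float:
    """Years until methylation reaches the threshold at a constant rate."""
    if rate_pct_per_year <= 0:
        raise ValueError("rate must be positive")
    return max(0.0, (threshold_pct - current_pct) / rate_pct_per_year)


if __name__ == "__main__":
    # Hypothetical patient: 2% methylation now, 5% taken as a level
    # characteristic of high-grade lesions (illustrative only).
    for rate in (0.5, 0.7):
        years = years_to_threshold(2.0, 5.0, rate)
        print(f"rate {rate}%/year: about {years:.1f} years")
```

With these invented inputs the projected interval is roughly 4 to 6 years, which shows how sensitive such an estimate would be to individual variation in the methylation rate.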
Simplified representation of DNA methylation patterns during progression to cervical cancer. a Levels of DNA methylation across the genome of HPV16, adapted from figure 1A of Mirabello et al. [9]. The lines represent the median percentage of methylation averaged across discrete sets of CpG sites in each HPV16 genomic region. HPV16 contains 113 CpG sites (depending on the variant), and the levels depicted by the lines represent an approximate running average across the genome of HPV16, grouping adjacent (10-20) methylated CpGs. The levels of methylation of individual CpG sites are highly variable, with some CpG sites between other highly methylated sites showing no methylation. The bar at the bottom shows the various genomic regions, such as the upstream regulatory region (URR) and the early (E) and late (L) open reading frames of the HPV16 circular genome, which is shown here linearized in the URR for ease of visualization. b Median levels of DNA methylation of the human gene EPB41L3 and 2 CpG sites in HPV16 L1 (6,367 and 6,389; genomic positions as shown in a) as a function of histopathological diagnosis. The lower and upper edges of the boxes represent the 25th and 75th percentiles, respectively, while the horizontal white lines show the median methylation values, and the 5 and 95% values are shown by the lower and upper limits of the vertical lines, respectively. Adapted from figure 1 of Louvanto et al. [10]. CIN grades 1, 2 and 3 are represented by CIN1, CIN2 and CIN3, respectively. Cervical cancer is shown by SCC and cervical adenocarcinoma by ADC.
Human Genome Methylation
More than 100 human genes have been proposed as possible methylation biomarkers of cervical cancer [1,27,28,32,35,36,39,40,41,42,43,44]. New genes are being discovered every year, increasingly by quasi-genomic approaches. So far, there does not seem to be any single gene that is sufficiently sensitive and specific for identifying CIN3, but some panels of genes look quite promising. Many alternative biomarker combinations can be constructed, and there are few overlaps of genes in many of the well-described panels. A plausible explanation for the abundance of riches is that aberrant methylation is an extensive and pervasive phenomenon in carcinogenesis, and it is practical to compose varied classifier panels from many different pathways to achieve similar if not identical triage possibilities. Table 1 shows a systematic review of the performance of human and HPV gene classifiers for detecting CIN2, CIN3 and cancer in 19 studies published since 2011 that met certain quality criteria as listed.
Performance characteristics of selected DNA methylation studies in exfoliated cervical cell specimens for a high-grade CIN and cancer histopathological endpoint

A pair of genes with extensive evidence of good performance in the scientific literature are MAL and CADM1. A combination QMSP test for these two genes in a set of specimens of hrHPV+ women from a screening population with a high-grade CIN and cancer (collectively CIN2+) endpoint gave a sensitivity of 68% (95% CI 50-81%) and a specificity of 75% (95% CI 70-80%) at a selected cut-off, compatible with greater than 70% specificity. At another cut-off (table 1), the sensitivity of CADM1/MAL was 84% (95% CI 72-93%), the specificity was 52% (95% CI 48-57%), and the positive predictive value (PPV) was 25% (95% CI 17-32%), with an AUC of 0.72 [40]. In the same set of women, the triage sensitivity and specificity values for cytology were 66% (95% CI 50-79%) and 79% (95% CI 74-83%), respectively (table 1). Cytology combined with genotyping for HPV16 and HPV18 had a sensitivity of 84% (95% CI 72-93%) and a specificity of 54% (95% CI 47-59%).
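The relationship between sensitivity, specificity, PPV and disease prevalence can be checked with Bayes' rule. The Python sketch below computes the PPV from the other three quantities and, conversely, back-calculates the prevalence implied by a reported sensitivity, specificity and PPV. Applying it to the CADM1/MAL point estimates quoted above (84%, 52% and 25%) is only a rough consistency check on rounded values; the resulting prevalence is an inference of this example and is not reported in the text.

```python
"""PPV from sensitivity, specificity and prevalence, and the reverse."""


def ppv_from_rates(sensitivity: float, specificity: float,
                   prevalence: float) -> float:
    """Positive predictive value by Bayes' rule (all inputs as fractions)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def implied_prevalence(sensitivity: float, specificity: float,
                       ppv: float) -> float:
    """Prevalence consistent with a given sensitivity, specificity and PPV."""
    numerator = ppv * (1.0 - specificity)
    return numerator / (sensitivity * (1.0 - ppv) + numerator)


if __name__ == "__main__":
    # Rounded CADM1/MAL point estimates quoted in the text.
    prevalence = implied_prevalence(0.84, 0.52, 0.25)
    print(f"implied CIN2+ prevalence: {prevalence:.2f}")
    print(f"PPV recomputed: {ppv_from_rates(0.84, 0.52, prevalence):.2f}")
```

On these rounded inputs the implied CIN2+ prevalence among the hrHPV+ women is about 16%, which helps explain why even a test with 84% sensitivity yields a PPV of only about 25%.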
In a self-sampling study, DNA methylation testing for MAL and miR-124 as a triage for women who did not comply with the screening program and who tested positive for HPV on self-collected specimens showed a sensitivity similar to that of cytology done in the clinic. The sensitivity for CIN2+ was 70.5% (95% CI 66.1-75.0%) for methylation triage and 70.8% (95% CI 66.1-75.4%) for cytology triage, while the PPV was significantly lower (31.7%, 95% CI 26.3-37.1%, p < 0.001) for DNA methylation triage than for cytology triage (50.3%, 95% CI 42.3-58.4%) [41].
There have been a number of studies of DNA methylation testing of the genes EPB41L3, JAM3, TERT and C13ORF18; a combination of these genes gave a sensitivity of 65% and a specificity of 79% for detecting CIN2+ in an early study on a set of tissue bank convenience specimens (table 1) [42]. A QMSP study on the genes DLX1, ITGA4, RXFP3, SOX17 and ZNF671 showed 100% (95% CI 85.4-100%) sensitivity for 19 invasive squamous cancers, while the sensitivity and specificity of a CIN2/3 (excludes cancer) endpoint were 43.1% (95% CI 30.8-56%) and 88.7% (95% CI 82.1-93.5%), respectively [36].
Studies on mainly Asian women employing QMSP tests for methylation of the genes PAX1 and SOX1 for the detection of CIN3+ have shown a promising performance. In a study of patients in a colposcopy population, the sensitivity of PAX1 was 64% and for SOX1 it was 71%, while the specificity was 91 and 77%, respectively, and the AUC was 0.77 and 0.83, respectively (table 1) [43]. Another methylated gene of interest is POU4F3, which has been tested in several studies from Asia [45]. Methylation levels of POU4F3 were assayed using QMSP on a set of 85 testing (verification) specimens at a pre-defined cut-off established in a larger training set of specimens. The sensitivity of the POU4F3 test for CIN3+ was 74%, and the specificity was 89%, with an AUC of 0.86. These results are quite encouraging, and it will be interesting to see the validation results. A caveat of the POU4F3 study (excluded from table 1) is that the testing set was highly enriched for squamous cancers and adenocarcinomas, which may produce bias in the estimation of test performance. A study of SOX1 (with POU4F3 and some additional genes) suggests that methylation could be used to triage women with atypical glandular cells [46], with several potential genes giving similar results. The study included 55 women with atypical glandular cells on cytology, and the histological endpoint was 9 cases of CIN3+ (1 adenocarcinoma and 1 squamous cell carcinoma). Interestingly, the results showed that the best gene, i.e. SOX1, had a sensitivity of 100% (95% CI 59-100%), a specificity of 67% (95% CI 52-80%) and an AUC of 0.83 compared to 57% (95% CI 18-90%), 75% (95% CI 60-85%) and 0.66, respectively, for HPV DNA testing.
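The wide confidence interval around the 100% sensitivity of SOX1 reflects the small number of endpoint cases. When every case is detected, the lower limit of an exact (Clopper-Pearson) binomial interval has a closed form, which the Python sketch below evaluates. The interval method and the denominators used in the demonstration are assumptions of this example; a denominator of 7 is shown because it reproduces a lower limit of about 59%, not because that count is stated in the text.

```python
"""Exact lower confidence limit when all n cases test positive."""


def exact_lower_limit_all_detected(n_cases: int,
                                   confidence: float = 0.95) -> float:
    """Clopper-Pearson lower limit for n of n successes: (alpha/2)**(1/n)."""
    if n_cases <= 0:
        raise ValueError("n_cases must be positive")
    alpha = 1.0 - confidence
    return (alpha / 2.0) ** (1.0 / n_cases)


if __name__ == "__main__":
    # Illustrative denominators only.
    for n in (7, 9, 50):
        lower = exact_lower_limit_all_detected(n)
        print(f"{n} of {n} detected: lower 95% limit {100 * lower:.0f}%")
```

Even a perfect observed sensitivity is therefore compatible with a true sensitivity well below 70% when fewer than 10 cases are available, which is why validation in larger series matters.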
Some of the more recent methylation studies since 2014 on screening and colposcopy populations had better designs and larger sample sizes, and sometimes they also had distinct training and validation phases. These studies therefore may give more realistic assessments of DNA methylation classifier performance [47,48,49,50,51,52,53,54,55,56,57]. The studies showed sensitivity for CIN2+ or CIN3+ to be in the range of 69-74% for clinician-collected specimens, while the specificity ranged from 66 to 76%.
Combination HPV and Host DNA Testing
Two papers published simultaneously by different teams showed that methylation measurements of either DAPK1 or EPB41L3 combined with methylation of HPV16, HPV18, HPV31 or HPV45 provide good triage for women with hrHPV+ test results [32,33,34,35]. The study with the DAPK1-HPV combination on samples from a US colposcopy population, which was enriched with additional cancers from Norway, gave a sensitivity of 80% and a specificity of 89% for detecting CIN2+. In comparison, the study with the EPB41L3-HPV combination on a UK colposcopy population (not enriched for cancers) gave a sensitivity of 90%, a specificity of 49%, a PPV of 51% and an AUC of 0.82 [32,33,34]. The EPB41L3-HPV gene combination (also called the S5 classifier) was validated in a separate large set of hrHPV+ women from a screening population in the UK and gave a sensitivity of 74% (95% CI 59-85%), a specificity of 65% (95% CI 60-70%) and an AUC of 0.78 for CIN2+, while for CIN3+ the sensitivity was 84% (95% CI 62-94%), the specificity was 63% (95% CI 58-68%), and the AUC was 0.84 (table 1) [35]. In comparison to HPV16 and HPV18 genotyping on the same samples, the S5 classifier had a much better triage performance, giving a higher sensitivity and specificity (p < 0.0001).
Virtues and Limitations of DNA Methylation Testing
DNA methylation classifiers are a work in progress. There are many options for improving the classifiers by finding better genes and combining them in more sophisticated weighting combinations. The current performance leaves much to be desired. While some studies have shown an impressive performance, with a sensitivity and specificity close to 100%, this has been in the context of artificial diagnostic scenarios. In rigorous studies with more formal training and validation designs, the sensitivity of methylation testing is usually less than 90%, with an accompanying acceptable specificity and PPV which may range from 50 to 70% and from 20 to 50%, respectively. Some critics may note that methylation assay performance characteristics are quite poor; however, it should be remembered that the data are for triage performance in enriched sets of women with on-going hrHPV infections and genetic and environmental risk factors of unknown intensity and duration. In such settings, point specificity and PPV never approach what may be considered excellent values, and experts who understand the cervical cancer screening field would not expect current performance values of specificity and PPV close to 100%. In comparison to DNA methylation testing, the use of cytology to triage screen-positive women has a similar but slightly better performance, especially when trying to maximize sensitivity (fig. 3). For the moment, in screening programs where cytology is already established and a switch to HPV screening is underway, it seems the better choice to designate cytology as the triage test for hrHPV+ women. However, in the case of new HPV screening programs that are thinking of establishing cytology as the triage test and now need to face the burdensome task of training cytologists and maintaining quality over long periods of time, careful consideration should be given to the alternative of DNA methylation triage.
Scatterplot of DNA methylation studies in table 1 (circles) showing the data in a receiver operating characteristic format, plotting sensitivity (y-axis) vs. 1 - specificity (x-axis). The figure plots methylation data for both CIN2+ and CIN3+ endpoints with all genes combined or individually, when available. In some studies, both CIN2+ and CIN3+ data were given, thus the graph presents some methylation data as multiply counted; for the individual values, please refer to table 1. The relationship between the x and y methylation data in the tabulated studies was fitted as a log model, with the solid line representing the relationship of best fit: y = 11.387ln(x) + 33.374; R² = 0.3253. Repeat cytology for triage of women with an earlier diagnosis of ASCUS or worse cytology in the meta-analysis of Arbyn et al. [20] showed a sensitivity of 82% and a specificity of 58% and is indicated by an open square and an arrow. The other open squares depict results obtained from cytology triage testing done in some of the studies shown in table 1. Boers et al. [55]; endpoint CIN2+, sensitivity = 84, specificity = 51; endpoint CIN3+, sensitivity = 91, specificity = 48. De Strooper et al. [56]; endpoint CIN2+, sensitivity = 66, specificity = 84; endpoint CIN3+, sensitivity = 66, specificity = 79. De Strooper et al. [57]; endpoint CIN2/3, sensitivity = 77, specificity = 76; endpoint CIN3, sensitivity = 88, specificity = 76. De Strooper et al. [56]; endpoint CIN2+, sensitivity = 64, specificity = 84; endpoint CIN3+, sensitivity = 64, specificity = 79. De Vuyst et al. [50]; endpoint CIN2+, sensitivity = 95, specificity = 46; endpoint CIN3+, sensitivity = 95, specificity = 46. Verhoef et al. [28]; endpoint CIN2+, sensitivity = 74, specificity = 81; endpoint CIN3+, sensitivity = 78, specificity = 78. Hesselink et al. [40]; endpoint CIN2+, sensitivity = 66, specificity = 79.
The specificity estimates for CIN3+ contain additional uncertainty due to lack of information in some studies on whether the CIN2 cases were included in or excluded from the normal/CIN1 category. The red line represents the relationship of best fit: y = 20.46ln(x) + 9.4293; R² = 0.7022. The relative performance of HPV16 and HPV18 genotyping for triage of CIN2+ or CIN3+ was obtained from some recent triage studies and is shown by the triangles. Hesselink et al. [27]; endpoint CIN3+, sensitivity = 58%, specificity = 88%. Verhoef et al. [28]; endpoint CIN2+, sensitivity = 59%, specificity = 66%; endpoint CIN3+, sensitivity = 68%, specificity = 66%. Lagos et al. [29]; endpoint CIN2+, sensitivity = 54%, specificity = 66%; endpoint CIN3+, sensitivity = 64%, specificity = 66%. De Vuyst et al. [50]; endpoint CIN2+ or CIN3+, sensitivity = 40%, specificity = 76%. Lorincz et al. [35]; endpoint CIN2+, sensitivity = 54%, specificity = 71%. The model of best fit is shown by the green line and has the equation: y = 5.5308ln(x) + 38.434; R² = 0.0565. The study by Lagos et al. did not have associated methylation data.
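The three fitted log models in the legend can be compared at a common specificity with a few lines of code. This is a rough sketch under two assumptions that the legend does not state outright: that x and y are both expressed in percent (x = 100 - specificity), and that the red line is the fit to the cytology data points. The dictionary keys and function name are hypothetical labels introduced here. Given the low R² values, the outputs indicate a trend only and should not be read as performance estimates.

```python
import math

# (slope, intercept) of y = slope * ln(x) + intercept, copied from the legend.
# Assumed mapping: solid line = methylation, red line = cytology,
# green line = HPV16/18 genotyping.
FITS = {
    "methylation": (11.387, 33.374),
    "cytology": (20.46, 9.4293),
    "genotyping": (5.5308, 38.434),
}


def fitted_sensitivity(test, specificity_pct):
    """Sensitivity (%) predicted by the fitted curve at a given specificity (%)."""
    slope, intercept = FITS[test]
    x = 100.0 - specificity_pct  # 1 - specificity, in percent
    if x <= 0:
        raise ValueError("specificity must be below 100% for a log model")
    return slope * math.log(x) + intercept


if __name__ == "__main__":
    # Evaluate each curve at the specificity reported for repeat cytology (58%).
    for test in FITS:
        print(f"{test}: {fitted_sensitivity(test, 58.0):.1f}% sensitivity")
```

At a specificity of 58%, the cytology curve lies above the methylation curve, and both lie above the genotyping curve, which agrees with the qualitative ordering described in the text.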
It may be predicted that in risk-based algorithms that incorporate risk information accumulated over time, the interval specificity and PPV values will be much higher. Most of the risk puzzle may be solved by delaying colposcopy until an optimal moment, which will become possible with an accurate understanding of the patient's comprehensive risk profile. Strong follow-up programs and the willingness of patients to wait under surveillance until a critical intervention point has been reached are other important components of such a follow-up algorithm. DNA methylation tests need further development and commercialization. Assays need to be simplified and automated. While these are limitations in the near term, molecular assay simplification, automation and cost reductions are standard practice in today's technologically sophisticated world and merely represent short-term barriers that are easily and quickly overcome. It is likely to take longer for clinicians to become comfortable with the use of DNA methylation tests and to disseminate their use than for companies to simplify the technologies and make them available routinely.
Future Perspectives
DNA methylation assays are relatively simple, inexpensive and robust. In principle, hundreds of tests can be performed every week by average research laboratory technicians, and robotic equipment can increase the throughput to thousands of specimens per week. Improved methylation assays that have better or additional human genes and other hrHPV types should increase the AUC of the classifiers further. A triage AUC >0.9 may be within reach in a few years and is consistent with a sensitivity of 90% and a specificity of ∼80%. Considering a combination reflex testing approach with hrHPV screening at a sensitivity and specificity of 95 and 90%, respectively, and applying a reflex methylation triage with an AUC of 0.9 to the hrHPV positives, we could expect a final molecular screening algorithm sensitivity and specificity of approximately 85 and 98%, respectively, for CIN2/3 in the first round of screening. Close to 100% of the cervical cancers would be detected [10,18,36]. A way to improve on the sensitivity of 85% for CIN2/3 is to refer women who are hrHPV+ but methylation-negative to retesting for HPV in 1-2 years, which would identify women with persistent infection and reveal most of the remaining CIN2+ cases. DNA methylation assays have competitive performance versus other triage options today, while some realistic near-term improvements would place methylation testing at the forefront of reflex triage tests for hrHPV+ women. In principle, DNA methylation testing could replace HPV DNA testing altogether, with just modest improvements in test technology. Overall, the performance of methylation assays may be substantially superior if the method can differentiate between CIN2/3 that progresses and lesions that do not. A combination of methylation biomarkers with somatic mutations identified by deep sequencing may produce hybrid classifier panels with additional prognostic features that could allow both certainty of event and estimated time to event calculations.
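The approximate figures of 85 and 98% for the two-step algorithm can be reproduced with simple arithmetic. The sketch below shows one way to arrive at them, assuming that the screening and triage tests err independently of each other given true disease status; this independence assumption is made here for illustration and is not stated in the text. The function name is hypothetical.

```python
def reflex_algorithm(screen_sens, screen_spec, triage_sens, triage_spec):
    """Overall sensitivity and specificity of screen-then-triage testing.

    A woman is referred only if BOTH tests are positive. The two tests are
    assumed to be conditionally independent given disease status.
    """
    # A diseased woman is detected only if both tests detect her.
    overall_sens = screen_sens * triage_sens
    # A healthy woman is falsely referred only if both tests are falsely positive.
    overall_spec = 1.0 - (1.0 - screen_spec) * (1.0 - triage_spec)
    return overall_sens, overall_spec


if __name__ == "__main__":
    # hrHPV screening at 95%/90%, methylation triage at 90%/80%.
    sens, spec = reflex_algorithm(0.95, 0.90, 0.90, 0.80)
    print(f"overall sensitivity {sens:.1%}, overall specificity {spec:.1%}")
```

With these inputs the overall sensitivity is 0.95 × 0.90 = 0.855 and the overall specificity is 1 - 0.10 × 0.20 = 0.98. The calculation also makes clear why sequential testing trades some sensitivity for a large gain in specificity.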
We can expect to see many better assays and also commercially competing methylation assays in the near future.
Acknowledgements
I thank all of the participating patients in my studies. I am indebted to my work colleagues and collaborators whose roles in my studies were of profound importance. Particular thanks go to the staff of the Centre for Cancer Prevention, Wolfson Institute of Preventive Medicine, including Jack Cuzick, Peter Sasieni, Adam Brentnall, Amar Ahmad, Louise Cadman and Janet Austin. I also thank the current and past members of the Molecular Epidemiology Laboratory including Wai King Lau, Natasa Vasiljevic, Caroline Reuter, Dorota Scibior Bentkowska, Rhian Warman, Rawinder Banwait and Paul Carter. The unpublished DNA methylation systematic review and meta-analysis of Helen Kelley was of great value for this review. Interested parties may contact me or Helen for additional details of the review and characteristics of the data selection and mining. This work was supported by a Cancer Research UK program grant (No. C569/A10404). Funders had no role in the review design, collection, analysis, interpretation of data or writing of this review.
Disclosure Statement
There were no conflicts of interest of which the author was aware in the conduct of this review. The author did not receive any payments and does not anticipate any future payments, compensation, financial or beneficial considerations or favours for the conduct of the review or the writing of this paper.