Abstract
Objective: To evaluate the efficacy of the automated screening system FocalPoint for cervical cytology quality control (QC) rescreening. False-negative rates (FNRs) were evaluated by a multi-institutional retrospective study. Study Design: Cervical cytology slides that had already been reported as negative for intraepithelial lesion or malignancy (NILM) were chosen arbitrarily for FocalPoint rescreening. Slides stratified into the highest 15% probability of being abnormal were rescreened by a cytotechnologist. Slides judged abnormal on rescreening were reevaluated by cytopathologists to confirm whether they were false negatives. Results: Rescreening of 12,000 slides, i.e. 9,000 conventional slides and 3,000 liquid-based cytology (LBC) slides, was performed; 9,826 (7,393 conventional and 2,433 LBC) were satisfactory for FocalPoint (2,174 were determined unsatisfactory) and those within the highest 15% of probability (1,496, i.e. 1,123 conventional and 373 LBC) were rescreened. As a result, 117 (96 conventional and 21 LBC) were determined as abnormal (other than NILM) and the FNR was 1.19%. Among these 117 slides, 40 (35 conventional and 5 LBC) were determined as high-grade squamous intraepithelial lesion and greater (HSIL+). Conclusion: Of 117 (1.19%) abnormal slides detected, 40 (0.41%) were determined to be HSIL+. This result suggests that FocalPoint is effective for QC rescreening of cervical cytology.
Introduction
Under the ‘Health Service Law for the Aged’, population-based, organized cervical cancer screening using cervical cytology was launched in 1983 nationwide in Japan. Currently, in organized screening, women who are aged ≥20 years are eligible to be tested biennially. For cervical cytology, conventional or liquid-based cytology (LBC) is utilized, but to reduce costs, conventional cytology is predominantly chosen.
The cervical cancer screening rate in 2007 was 24.5% in Japan [1], and is extremely low when compared with Europe (approx. 80%) and the USA (82.6%) [2]. Results in 2007 showed that 3,538,132 Japanese women participated in the organized cervical cancer screening, and of these, 40,023 (1.1%) had slides that were classified as abnormal, i.e. other than negative for intraepithelial lesion or malignancy (NILM) and were recalled. Only 24,153 (60.3%) women participated in the recall examination and 1,921 (0.05%) were diagnosed as having high-grade squamous intraepithelial lesion and greater (HSIL+) [1].
Recently, the government has made efforts to raise awareness for cervical cancer screening. As such, the screening rate has been slowly increasing; in 2013, it was 32.7% [3], and it is expected to continue to rise (the target level is 50%). As a result, the number of specimens will increase.
In Japan, the Cytology Quality Control Guidelines [4] recommend quality control (QC) rescreening of 10% of slides, chosen arbitrarily. Currently, the specimens outside of the arbitrarily chosen 10% that have been diagnosed as NILM by cytotechnologists are reported without any further evaluation by a cytopathologist. Only abnormal (other than NILM) slides are reevaluated by a cytopathologist. In an ideal situation, for QC, we would be able to rescreen all NILM specimens microscopically and then reevaluate. However, this is practically impossible due to the sheer number of specimens.
In the last 20 years, several studies have demonstrated that rescreening the highest 10-20% of specimens stratified according to the risk of abnormality by an automated screening system, AutoPap 300 (the predecessor of FocalPoint, NeoPath Inc., Washington, USA), is more accurate and efficient than the 10% random manual rescreening [5,6,7]. The Food and Drug Administration (FDA) approved the AutoPap 300 as a QC rescreening device for cervical cytology in 1996 [8].
In Japan, 2 studies published in 2001 showed that the false-negative rates (FNRs) determined by using FocalPoint (Nippon Becton Dickinson Co. Ltd., Tokyo, Japan) for QC rescreening were only 0.05% [9] and 0.19% [10], respectively. As a result of these studies, the QC application of FocalPoint has not become widespread.
Considering the growing number of specimens, it is important to find false-negative (FN) specimens more efficiently. This study reviewed the efficiency of the automated screening system FocalPoint for QC rescreening by evaluating the FNRs at multiple institutions.
Materials and Methods
Study Design and Collected Samples
This study was a retrospective, multi-institutional study involving 9 institutions in Japan and was conducted with the approval of the institutional review board of each institution involved. Cervical cytology slides from each institution that had been previously manually screened by a cytotechnologist and reported as NILM from January to December, 2013, were chosen arbitrarily and then sent to the Cancer Institute Hospital for automated rescreening. Both conventional slides and LBC slides were reevaluated.
For the LBC method, the BD SurePath LBC system (Nippon Becton Dickinson Co.) was utilized. The Papanicolaou staining procedure was the routine procedure in each institution. In Japan, only physicians are permitted to take smears and the type of physician depends on the facility (smears are mainly taken by gynecologists). The cervical sampling device (cotton swab, wooden spatula or broom-type brush) differs by facility.
The total number of cervical cytology slides and the number of abnormal slides (other than NILM) in each of the 9 facilities for 2013 (January to December) were also counted.
Rescreening Procedure Using the Automated Screening System
For rescreening purposes, FocalPoint was utilized. This is an automated screening system with FDA approval (in 1996) for use as a QC rescreening device [8]. Essentially, it is an automated microscopy station with on-board computers. It can process approximately 300 conventional or LBC slides within a 24-hour period. Detailed steps are shown in figure 1. Slides that were reported primarily as NILM by a cytotechnologist were then rescreened by the FocalPoint system. The rescreening method was as follows. To begin, the slides were assessed as to whether they could be processed by the machine. Slides that were deemed unsatisfactory by FocalPoint were not rescreened and were excluded from the study. In accordance with the FocalPoint requirements, slides were classified as unsatisfactory because of a failure of either the specimen preparation procedure, e.g. insufficient staining (process review, PR), or the specimen collection process, e.g. poor cellularity (scant cellularity, SC). Only slides that were accepted for rescreening by FocalPoint, henceforth referred to as ‘satisfactory’, were incorporated into this study. These satisfactory slides were stratified according to the likelihood of an abnormality, and then those within the highest 15% of probability were labeled ‘for QC review’ [8]. After labeling, they were rescreened by 2 experienced cytotechnologists separately and in accordance with the 2001 Bethesda System [11].
Fig. 1. Flow chart of the study design. Abnormal: intraepithelial lesion or malignancy.
The slides that were reclassified as abnormal (other than NILM) by at least 1 of the cytotechnologists were then microscopically reevaluated by 3 different cytopathologists independently, in order to determine the final diagnosis. A slide was determined as FN only when at least 2 of the cytopathologists confirmed a final diagnosis of abnormal.
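The two-stage review rule described above can be written as a short decision function. This is only an illustrative sketch: the function name `is_false_negative` and the encoding of each reviewer's call as a Bethesda category string are assumptions made for the example and are not part of the study protocol.

```python
def is_false_negative(cytotech_calls, cytopath_calls):
    """Apply the two-stage review rule to one slide first reported as NILM.

    cytotech_calls: categories assigned by the 2 cytotechnologists.
    cytopath_calls: categories assigned by the 3 cytopathologists.
    Any category other than "NILM" counts as abnormal (hypothetical encoding).
    """
    # Stage 1: the slide goes forward if at least 1 cytotechnologist calls it abnormal.
    flagged = sum(call != "NILM" for call in cytotech_calls) >= 1
    if not flagged:
        return False
    # Stage 2: the slide is FN only if at least 2 cytopathologists call it abnormal.
    return sum(call != "NILM" for call in cytopath_calls) >= 2


print(is_false_negative(["NILM", "ASC-H"], ["HSIL", "ASC-H", "NILM"]))
print(is_false_negative(["LSIL", "LSIL"], ["NILM", "NILM", "ASC-US"]))
```

Under this rule, a slide flagged by a cytotechnologist but judged abnormal by only 1 cytopathologist is not counted as FN.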
Among the slides that were determined as FN, cervical biopsies were performed at some of the facilities (not all), as all cases were first evaluated as NILM and for NILM cases, a biopsy is deemed unnecessary. For the FN slides, manual rescreening by a cytotechnologist and then reevaluation by a cytopathologist are considered the gold standard.
Statistical Analysis
The 9 participating institutions were classified into the following 5 categories: medical health check-up facilities, screening centers (laboratories), general hospitals, university hospitals and centers for cancer.
The FN and HSIL+ rate was calculated using the total number of satisfactory slides (scanned by FocalPoint) as a denominator and the number of slides with FN and HSIL+ as numerators. The occurrence rate of unsatisfactory slides was calculated using the number of all investigated slides as a denominator and the number of certain unsatisfactory slides as numerators. The percentage of slides and the corresponding 95% confidence interval (CI), based on the Clopper-Pearson method, were calculated. The χ2 test was used to compare the proportions between groups. Statistical significance was defined as p < 0.05. All of the data were statistically analyzed using SAS v9.3 (SAS Institute, Inc., Cary, N.C., USA) and R v3.1.2 software (R Foundation for Statistical Computing, Austria).
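As an illustration of the interval calculation, the following sketch computes a Clopper-Pearson (exact binomial) CI using only the Python standard library. It is not the SAS or R code used in the study; the helper names `binom_cdf` and `clopper_pearson` and the bisection approach are assumptions chosen to keep the example self-contained.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with terms computed in log space."""
    if k < 0:
        return 0.0
    if k >= n or p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 0.0
    total = 0.0
    for i in range(k + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                   + i * math.log(p) + (n - i) * math.log1p(-p))
        total += math.exp(log_pmf)
    return min(total, 1.0)


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) CI for a proportion with k events in n trials."""
    def bisect(p_is_too_small):
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if p_is_too_small(mid):
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower limit: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else bisect(lambda p: 1 - binom_cdf(k - 1, n, p) < alpha / 2)
    # Upper limit: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else bisect(lambda p: binom_cdf(k, n, p) > alpha / 2)
    return lower, upper


n_satisfactory = 9826  # satisfactory slides, the denominator for FN and HSIL+ rates
for label, k in [("FN", 117), ("HSIL+", 40)]:
    low, high = clopper_pearson(k, n_satisfactory)
    print(f"{label}: {100 * k / n_satisfactory:.2f}% "
          f"(95% CI {100 * low:.2f}-{100 * high:.2f})")
```

With the slide counts from this study as input, the intervals should fall close to those reported in the Results; small differences in the last digit can arise from rounding.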
Results
A total of 12,000 NILM slides including 9,000 conventional slides from all 9 participating facilities and 3,000 LBC slides from 2 facilities (Nos. 5 and 9) were rescreened (table 1). There were 2,174 slides that were disqualified as unsatisfactory by FocalPoint. Of the satisfactory 9,826 slides, 1,496 that scored within the highest 15% of probability were extracted for reevaluation (QC review; table 1).
False-Negative Rate
In total, 1,496 slides were extracted and then rescreened by cytotechnologists and 182 (1.9% of the 9,826 satisfactory slides) were reclassified as abnormal. After independent reevaluation by 3 cytopathologists, 117 of the 182 slides (1.19% of the satisfactory slides, 95% CI 0.99-1.43) were determined as FN (table 1: FN, other than NILM). Of the FN slides, 40 [0.41%, 95% CI 0.29-0.55; atypical squamous cells, cannot exclude HSIL (ASC-H): 22; HSIL: 14; HSIL + atypical glandular cells (AGC): 1; squamous cell carcinoma (SCC): 3] were determined as HSIL+ (table 1: HSIL+). For the HSIL+ slides, the main cause of FN was the presence of only a few abnormal cells in a hemorrhagic or inflammatory background.
There were 65 slides that were diagnosed as abnormal by the cytotechnologists and were evaluated as NILM by the cytopathologists. The cells were reevaluated by the cytopathologists as benign cells, e.g. squamous metaplastic cells, reactive squamous cells with slightly enlarged nucleus or with perinuclear halo without atypia, metaplastic endocervical cells, reserve cells, etc.
The number of FNs in the subgroup of conventional slides was 96 (1.30%, 95% CI 1.05-1.58) and the number of FNs in the subgroup of LBC slides was 21 (0.86%, 95% CI 0.51-1.32). There was no significant difference between the FNRs determined by the conventional and LBC methods (p = 0.108). The number of HSIL+ in the subgroup conventional slides was 35 (0.47%, 95% CI 0.33-0.66; ASC-H: 18; HSIL: 13; HSIL + AGC: 1; SCC: 3) and that in the subgroup LBC slides was 5 (0.21%, 95% CI 0.07-0.48; ASC-H: 4; HSIL: 1). There was no significant difference between the HSIL+ rate of the conventional method and LBC method (p = 0.107).
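The comparison of FNRs between preparation methods can be reproduced approximately with a χ2 test on a 2 × 2 table. The sketch below is an assumption-laden illustration, not the study's SAS or R code: the function name `chi2_2x2` is hypothetical, and the source does not state whether a continuity correction was applied. With the Yates correction switched on, the sketch gives a p value close to the reported 0.108; without it, the p value is lower.

```python
import math


def chi2_2x2(a, b, c, d, yates=True):
    """Chi-square test for the 2x2 table [[a, b], [c, d]]; returns (statistic, p)."""
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed[i][j] - expected)
            if yates:
                # Yates continuity correction (an assumption; not stated in the source).
                diff = max(diff - 0.5, 0.0)
            stat += diff ** 2 / expected
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# FN versus non-FN among satisfactory slides, by preparation method.
conventional = (96, 7393 - 96)
lbc = (21, 2433 - 21)

stat, p = chi2_2x2(*conventional, *lbc)
print(f"with correction:    chi2 = {stat:.2f}, p = {p:.3f}")
stat_u, p_u = chi2_2x2(*conventional, *lbc, yates=False)
print(f"without correction: chi2 = {stat_u:.2f}, p = {p_u:.3f}")
```

The same function can be applied to the HSIL+ counts (35 of 7,393 conventional and 5 of 2,433 LBC slides) by swapping in those numbers.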
The facility-specific FNR for the conventional method varied from 0.13 to 4.0%, demonstrating differences across facilities (p < 0.001; table 2). The FNR range for the LBC method was 0.60-1.28%, showing little difference between facilities (p = 0.081; table 2). The FNR in each facility category is shown in figure 2. The FNR for both university hospitals and centers for cancer was significantly higher than for the other facilities (p < 0.001).
Fig. 2. Facility category-specific FNRs for all slides: 0.53% for medical health check-up (MHC) facilities, 1.01% for screening centers, 0.75% for general hospitals, 1.14% for university hospitals and 1.88% for centers for cancer.
Occurrence Rate of Unsatisfactory Slides
Detailed information about the unsatisfactory slides in each facility is summarized in table 3. The number of unsatisfactory slides for all slides assessed was 2,174 (18.1%); 1,113 (9.3%) were disqualified due to the failure of the specimen preparation procedure (PR) and 1,061 (8.8%) were disqualified due to the failure of the specimen collection process (SC).
The number of unsatisfactory slides for the conventional and LBC methods was 1,607 (17.9%) and 567 (18.9%), respectively, and there was no significant difference between methods (p = 0.198). Furthermore, the number of disqualifications due to PR for the conventional and LBC methods was 753 (8.4%) and 360 (12.0%), respectively, showing a significantly higher occurrence with the LBC method (p < 0.001). The number of disqualifications due to SC for the conventional and LBC methods was 854 (9.5%) and 207 (6.9%), respectively, showing a significantly higher occurrence with the conventional method (p < 0.001).
The facility-specific proportions of disqualification ranged from 3.5 to 33.2% for the conventional method and from 5.9 to 25.4% for the LBC method, demonstrating differences between the facilities (both p < 0.001).
Discussion
In this study, cervical cytology slides that were classified as NILM were rescreened utilizing the FocalPoint system. Over the course of our rescreening, 117 (1.19%) FNs were detected and 40 (0.41%) of these were diagnosed as HSIL+.
Previous studies in Japan that utilized FocalPoint for QC rescreening reported an FNR of only 0.05% for a screening center (laboratory) [9] and 0.19% for 5 medical health check-up facilities [10]. For this study, rescreening was also conducted in several other types of facilities. Our study determined that the FNR is 1.19%, which is higher than in the previous studies. On the other hand, studies in the USA observed the FNR to be 1.43% [5], 1.18% [6], 1.29% [12] and 1.13% [13]; the FNR in our study is similar to these.
In addition, the occurrence rate of HSIL+ in all FN specimens (HSIL+/FN rate) in this study was 34.2% (40/117). Several US studies, which also utilized FocalPoint and were performed under similar conditions to ours, determined that the HSIL+/FN rates were 8.00% (4/50, QC = 13.8%) [5], 7.75% (11/142, QC = 20%) [6], 15.90% (7/44, QC = 10%) [12] and 3.68% (13/353, QC = 10%) [13]; the HSIL+/FN rate (34.2%) in our study is, however, significantly higher than these.
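The HSIL+/FN rates quoted above follow directly from the counts given in parentheses. A minimal sketch of the arithmetic, using only those counts (the helper name `hsil_share` is hypothetical):

```python
def hsil_share(hsil_plus, false_negatives):
    """Percentage of FN slides that were HSIL+."""
    return 100 * hsil_plus / false_negatives


# (label, HSIL+ count, FN count) as quoted in the text.
studies = [
    ("present study", 40, 117),
    ("[5]", 4, 50),
    ("[6]", 11, 142),
    ("[12]", 7, 44),
    ("[13]", 13, 353),
]

for label, hsil_plus, fn in studies:
    print(f"{label}: {hsil_plus}/{fn} = {hsil_share(hsil_plus, fn):.1f}%")
```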
Previous studies in the USA on QC with 10% random manual rescreening compared with FocalPoint showed that the FN detection rate was 3-8 times higher when rescreened by FocalPoint [5,6,7,12,13,14,15]. When only HSIL+ cases were defined as FN, the rates were still 3.5-5.2 times higher [15,16,17]. This suggests that FocalPoint is more effective in determining if a specimen is either HSIL+ or FN. As our FNR was similar to past studies, a comparison with 10% manual rescreening was not conducted.
In this study, the FNRs using different specimen preparation methods were also calculated. The FNR with the conventional method was 1.30% and the FNR with the LBC method was 0.86%. From this, we see that there is no significant difference. However, in the 2 facilities that implemented both of these methods, the number of FN slides with the conventional method was 37/2,391 (1.55%; online suppl. table 1; for all online suppl. material, see www.karger.com/doi/10.1159/000449499), higher than with the LBC method at 21/2,433 (0.86%; online suppl. table 1). A previous study in the USA reported that the FNR with the conventional method was 11% and the FNR of the LBC method (BD SurePath) was 5%, i.e. the former is much higher than the latter [18]. The differences between the conventional and LBC methods could also be clarified by a comparison of the facilities.
The facility-specific FNR varied from 0.13 to 4.0% (that for the university hospital and the center for cancer was much higher). A higher rate is due to the number of potential abnormal cases being higher than in other facilities (online suppl. table 2). Thus, the FNR depends on the prevalence of potential abnormalities.
In this study, the number of unsatisfactory slides was 2,174 (18.1%); 1,113 (9.3%) were disqualified due to the failure of the specimen preparation procedure (PR) and 1,061 (8.8%) were disqualified due to the failure of the specimen collection process (SC).
Previous studies in Japan reported that the disqualification rate for FocalPoint was 12.3% for the screening center [9] and 10.3% (PR: 8.1%; SC: 2.2%) for the medical health check-up facilities [10]. In studies in the USA, the disqualification rate was 2.5% [5], 4.8% [13] and 5.9% (PR rate only) [6]. The disqualification rates observed in Japan, including our study, are thus much higher than those in the USA.
In a recent study from Portugal, the disqualification rate of the conventional method was 30.8%, mostly resulting from PR, such as staining deficiency or smears being too thick [19]. Reviewing the unsatisfactory slides in our study according to specimen preparation method, the PR rate was higher with the LBC method (12.0%) than with the conventional method (8.4%) and the SC rate was higher with the conventional method (9.5%) than with the LBC method (6.9%); the reason the SC rate was higher is presumed to be that cotton swabs were used for collecting the specimens in 4 of the 9 participating facilities. The facility-specific disqualification rate for the conventional method varied from 3.5 to 33.2% and the facilities using cotton swabs (Nos. 3, 7, 8 and 9) showed a higher tendency to have SC. As for the 2 facilities implementing the LBC method, the disqualification rate of facility No. 5 was 5.95% and that of facility No. 9 was 25.4%. Facility No. 5 had a much lower rate because FocalPoint had previously been introduced there for QC purposes and because the staff had been making efforts to ensure that all slides are satisfactory (adjusting the staining conditions to minimize unsatisfactory slides). In any case, to minimize the number of unsatisfactory slides, it is suggested that the devices and methods used for specimen collection be carefully considered and that the specimen preparation procedure be adapted to FocalPoint.
There are limitations to this study. (1) When determining FN slides, manual rescreening by cytotechnologists and cytopathologists was used as the gold standard and histological diagnosis was not conducted for all FNs. (2) Microscopic reevaluations of unsatisfactory slides (18.1%) were not conducted, so the reason for disqualification and the possibility of FN in the unsatisfactory slides were not assessed. (3) For the LBC method, only the BD SurePath LBC system was used and LBC slides processed by other instruments were not examined. (4) Although this was a multi-institutional study, it was retrospective and the specimen collection devices and preparation procedures utilized by each of the participating facilities differed. (5) This was not a prospective case-control study comparing the currently recommended 10% random rescreening with the conditions used in this study.
Despite the limitations listed above, we found 117 FN slides through this multi-institutional study and also determined that 40 of these slides were HSIL+, which potentially requires clinical treatment. This demonstrates that the use of FocalPoint is effective for QC rescreening of cervical cytology and can help to prevent FN slides from being reported as negative.
Acknowledgements
We would like to thank Drs. Toshiko Jobo, Kuniko Iihara and Masako Suzuki for their contributions to the re-evaluations, Mr. Shigeharu Hatakeyama, Mr. Shigenori Ohtsuka and Mr. Shinji Miyake for their contributions to the rescreening process, Dr. Kiyoko Kato, Dr. Tetsuro Oishi, Mr. Kisaburo Ueno, Mr. Minoru Tagami and Mr. Eisaku Toji for their contributions to slide preparation, Mr. Junzo Fujiyama for his contributions to the FocalPoint operation and, finally, Mr. Tatsuo Kagimura for contributions to the statistical analysis. The Translational Research Informatics Center analyzed the existing dataset that other investigators had collected and guaranteed the quality.
This study was supported by funding from the Japanese Society of Clinical Cytology for the use of automated screening systems for quality control purposes regarding cervical cytology.
Disclosure Statement
The authors of this study have no relevant financial interests to report.