Introduction: The Japan Lung Cancer Society (JLCS) and the Japanese Society of Clinical Cytology (JSCC) have proposed a new four-tiered cytology reporting system for lung carcinoma (JLCS-JSCC system). Prior to the proposal, the Papanicolaou Society of Cytopathology (PSC) had proposed a revised reporting system (PSC system), which comprises the “neoplastic, benign neoplasm, and low-grade carcinoma” category (N-B-LG category), in addition to the 4 categories of the JLCS-JSCC system. This study aimed to evaluate the interobserver agreement of the JLCS-JSCC system with an additional dataset with more benign lesions in comparison with the PSC system. Methods: We analyzed 167 cytological samples, which included 17 benign lesions, obtained from the respiratory system. Seven observers classified these cases into each category by reviewing one Papanicolaou-stained slide per case according to the JLCS-JSCC system and PSC system. Results: The interobserver agreement was moderate in the JLCS-JSCC (k = 0.499) and PSC (k = 0.485) systems. Of the 167 samples, 17 samples were benign lesions: 7 pulmonary hamartomas, 5 sclerosing pneumocytomas, 2 squamous papillomas, one solitary fibrous tumor, one meningioma, and one lymphocytic proliferation. There were diverse sample types as follows: 11 touch smears, 3 brushing smears, 2 aspirations, and one sputum sample. Fourteen samples (82.3%) were categorized into “negative” or “atypical” by more than half of the observers in the JLCS-JSCC system. Conversely, 3 samples were categorized as “suspicious” or “malignant” by more than half of the observers in the JLCS-JSCC system. On the other hand, 11 samples (64.7%) were categorized into the N-B-LG category by more than half of the observers in the PSC system. Conclusions: The concordance rate in the JLCS-JSCC system was slightly higher than that in the PSC system; however, the interobserver agreement was moderate in both the JLCS-JSCC and PSC systems. 
These results indicate that both the JLCS-JSCC and PSC systems are clinically useful and are expected to have clinical applications. It may be important to integrate the 2 systems and construct a universal system that can be used more widely in clinical practice.
Lung cancer is the leading cause of cancer-related deaths worldwide [1, 2]. Pathological and cytological classifications of lung carcinoma are crucial for effective patient management. With the possibility of long-term survival through personalized medicine, cytology is becoming increasingly important, and there is a need for the standardization of guidelines in the field of cytology. However, no widely accepted categorization system for respiratory cytology has been established to date. In 2016, the Papanicolaou Society of Cytopathology (PSC) proposed a pulmonary cytology specimen terminology and revised a classification (PSC system) [4, 5]; it comprises the following 6 categories: “nondiagnostic,” “negative for malignancy,” “atypical,” “neoplastic, benign neoplasm, and low-grade carcinoma,” “suspicious for malignancy,” and “positive for malignancy.” Separately, the Japan Lung Cancer Society (JLCS) and Japanese Society of Clinical Cytology (JSCC) have proposed a new reporting system for lung carcinoma (JLCS-JSCC system) based on the results evaluated by 7 cytopathologists from multiple centers. The JLCS-JSCC reporting system comprises 3 steps (Table 1). First, the samples are classified as adequate for diagnostic evaluation or inadequate, owing to processing and/or staining artifacts. Second, if the samples are adequate, they are classified into the following 4 categories: “negative for malignancy,” “atypical cells,” “suspicious for malignancy,” and “malignancy.” One of the differences between the PSC and JLCS-JSCC systems is the categorization of benign lesions. Our previous study included only a few cases diagnosed as benign neoplasms; in the present study, we collected additional cytological specimens, including more benign neoplasm cases, from multiple centers in the same manner. This study aimed to confirm the utility of the JLCS-JSCC system, in comparison with the PSC system, using an additional dataset with more benign lesions.
Materials and Methods
Cytological samples were collected in the same manner as in our previous study. Briefly, 167 cytological samples, extending the previous cohort, were collected from 8 hospitals (Kagawa University Hospital, Osaka Habikino Medical Center, National Hospital Organization Ibarakihigashi National Hospital, Tokyo Medical University Hospital, Osaka International Cancer Institute, National Hospital Organization Osaka National Hospital, National Cancer Center Hospital, and Kitasato University Hospital). The hospitals randomly selected the cases from electronic medical records and collected one Pap-stained glass slide per patient. After excluding cases in the “inadequate” category, in which target cells could not be evaluated on the provided glass slides, the slides were sent to a management cytotechnologist (A.T.) with clinical information (age and sex); the final clinicopathological diagnosis of each case was made at the hospitals. Subsequently, the glass slides were randomized and anonymized by the management cytotechnologist and were sent to 7 institutes, including the 3 hospitals, without clinical information or the final clinicopathological diagnosis.
Cytotechnologists and cytopathologists from the 7 hospitals classified the slides into each category according to both the JLCS-JSCC system (four-tier method) and PSC system (five-tier method) [5, 6]. Briefly, the JLCS-JSCC system comprises the following categories: negative for malignancy (NM), atypical cells (ACs), suspicious for malignancy (SM), and malignancy (ML); the PSC system comprises the following categories: NM, AC, the neoplastic, benign neoplasm, and low-grade carcinoma (N-B-LG) category, SM, and ML. The observers had 7–36 years of experience. They evaluated the entire glass slide, although there were some markings by cytotechnologists on the slides. The evaluation was performed after a washout interval of at least 2 months. After the evaluation, the data were sent for analysis. An agreement was regarded as unanimous if all the observers agreed, sub-unanimous if 6 or 7 observers agreed, and consensus if 4 or more of the observers agreed. Risk of malignancy (ROM) was defined for each category as the number of confirmed malignant cases divided by the total number of cases in the diagnostic category of each observer, and the mean ROM among the observers was calculated.
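The agreement definitions above can be illustrated with a minimal Python sketch. It is not the authors' analysis code; the function names and the example ratings are hypothetical. The thresholds (all 7, at least 6, at least 4) are taken from the text and are treated as cumulative, so a unanimous case also counts toward the sub-unanimous and consensus rates.

```python
from collections import Counter

def agreement_flags(ratings, n_observers=7):
    """Flag one case as unanimous / sub-unanimous / consensus.

    `ratings` holds one category label per observer. The thresholds
    follow the definitions in the text for 7 observers.
    """
    if len(ratings) != n_observers:
        raise ValueError("expected one rating per observer")
    # Size of the largest group of observers choosing the same category.
    top = max(Counter(ratings).values())
    return {
        "unanimous": top == n_observers,
        "sub_unanimous": top >= 6,
        "consensus": top >= 4,
    }

def agreement_rates(cases):
    """Proportion of cases reaching each agreement level."""
    flags = [agreement_flags(r) for r in cases]
    return {k: sum(f[k] for f in flags) / len(flags) for k in flags[0]}

# Hypothetical ratings for four cases (not study data).
cases = [
    ["ML"] * 7,                                  # all 7 agree
    ["ML"] * 6 + ["SM"],                         # 6 agree
    ["NM"] * 4 + ["AC"] * 3,                     # 4 agree
    ["NM", "NM", "AC", "AC", "SM", "SM", "ML"],  # no majority
]
print(agreement_rates(cases))
```

Applied to the full set of 167 cases, the same counting would yield the consensus and unanimous percentages reported in the Results.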
Fleiss’ kappa (κ) was calculated for overall agreement among all observers, and kappa was also calculated for pairwise comparisons between individual observers. The level of agreement based on the kappa statistics was defined as follows: κ ≤ 0.20 as slight, 0.21–0.40 as fair, 0.41–0.60 as moderate, 0.61–0.80 as good, and 0.81–1.00 as very good agreement. The kappa score was calculated using the irr package (CRAN) in the statistical software R (ver. 4.1.0).
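For readers unfamiliar with the statistic, the following is a minimal pure-Python sketch of the standard Fleiss’ kappa formula together with the agreement bands defined above. It is an illustrative re-implementation, not the irr routine used in the study; the function names and toy ratings are assumptions.

```python
from collections import Counter

def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of cases, each a list of category labels
    (one label per observer, same number of observers for every case)."""
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    if any(len(r) != n_raters for r in ratings):
        raise ValueError("every case needs the same number of ratings")
    categories = sorted({label for r in ratings for label in r})
    counts = [Counter(r) for r in ratings]
    # Observed agreement per case, then averaged over cases.
    p_i = [
        (sum(c[k] ** 2 for k in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_bar = sum(p_i) / n_cases
    # Chance agreement from the overall category proportions.
    p_j = [sum(c[k] for c in counts) / (n_cases * n_raters) for k in categories]
    p_e = sum(p * p for p in p_j)
    # Undefined (division by zero) if every rating falls in one category.
    return (p_bar - p_e) / (1 - p_e)

def interpret_kappa(kappa):
    """Map kappa to the agreement bands given in the text."""
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "very good"

# Toy example: complete agreement on two cases in different categories.
toy = [["NM"] * 7, ["ML"] * 7]
print(fleiss_kappa(toy), interpret_kappa(0.499))
```

Under these bands, the kappa values reported in this study (0.499 and 0.485) both fall in the moderate range.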
Patients’ ages ranged from 29 to 88 years; 102 of them were men, and 65 were women. Most of the samples were obtained via bronchial brushing (n = 72, 43.1%), followed by touch smear (n = 27, 16.2%) and transbronchial aspiration cytology (TBAC) (n = 20, 12.0%). Samples from endobronchial ultrasound-guided transbronchial needle aspiration (EBUS-TBNA), sputum, bronchoalveolar lavage, transcutaneous aspiration cytology, and CT-guided needle cytology were also included (Table 2). The final clinicopathological diagnosis was made based on the pathological diagnosis at the hospitals. Because biopsy could not be performed in some cases, these cases were diagnosed based only on cytology specimens or clinical findings (Table 3). The final clinicopathological diagnoses of the cases without malignancy at the hospitals were as follows: 17 benign tumors (10.2%), 15 negatives (9.0%), 13 nonspecific inflammation (7.8%), 8 granulomas (4.8%), 5 infections (3.0%), 3 interstitial pneumonia (1.8%), and 6 others (3.6%). Malignant tumors comprised 95 primary lung carcinomas and 5 others as follows: 50 adenocarcinomas (29.9%), 21 squamous-cell carcinomas (12.6%), 12 small-cell carcinomas (7.2%), 4 carcinoid tumors (2.4%), 4 not-otherwise-specified carcinomas (2.4%), 2 large-cell neuroendocrine carcinomas (1.2%), 2 salivary gland-type carcinomas (1.2%), 3 metastatic lung tumors (1.8%), and 2 lymphomas (1.2%).
Cytological Diagnosis Agreement among Seven Observers in the JLCS-JSCC and PSC Systems
The distribution of the number of concordant observers in the JLCS-JSCC and PSC systems is presented in Figure 1. The consensus agreement and unanimous agreement in the JLCS-JSCC system were 87.4% and 40.7%, respectively, whereas those in the PSC system were 85.6% and 35.3%, respectively. If the SM and ML categories were combined in each system, sub-unanimous agreement in the JLCS-JSCC and PSC systems was 60.4% and 52.7%, respectively.
The interobserver agreement was moderate in both reporting systems. However, the agreement was better for the JLCS-JSCC system with a kappa of 0.499 (range, 0.405–0.650) than for the PSC system with a kappa of 0.485 (range, 0.379–0.681).
Determination of Benign Lesions Using the JLCS-JSCC and PSC Systems
The cohort contained 17 benign tumor cases (10.2%). Hamartomas were the most common lesion (7/17, 41.2%), followed by sclerosing pneumocytoma (SP) (5/17, 29.4%) and squamous papilloma (PP) (2/17, 11.8%), as well as one meningioma, one solitary fibrous tumor (SFT), and one lymphoid hyperplasia (Table 4). The most common procedure type was touch smear (11/17, 64.7%), and other procedure types included brushing, 3 cases; aspiration, 2; and sputum, one. In the JLCS-JSCC system, most cases (15/17, 88.2%) were of the benign category (NM or AC). A typical image of a hamartoma showing proliferation of single spindle-shaped cells intermingling with the cartilaginous matrix is presented in Figure 2 (case 3). Six cytotechnologists determined this case to be in the benign category. Three cases, which more than half of the cytotechnologists determined as the SM or ML category, included SP (case 8), PP (case 13), and SFT (case 15). Figure 3 presents the cytological features of SP (case 8), showing a prominent papillary growth pattern (Fig. 3a) with differently sized cuboidal cells (Fig. 3b) admixed with hemosiderin-laden macrophages (Fig. 3c). In some areas, a large nucleus or nuclear grooving was observed (Fig. 3b), and many observers in our study suspected pulmonary adenocarcinoma based on these features. The cytological features of the SFT with a cellular area composed of oval to short spindle cells with scant cytoplasm are presented in Figure 4 (case 15). Since this case contained a cellular area with ACs, more than half of the observers suspected non-small cell carcinoma with high-grade atypia. According to the PSC system, 11 cases (64.7%) were categorized as N-B-LG by more than half of the observers (Table 4). The details of each observer’s assessment of N-B-LG category lesions according to the PSC system are presented in Table 5. 
There were cases wherein 5 of the 7 observers judged the lesion to be malignant, even though they should be in the benign category (average, 26.7%; range, 0–80%).
ROM, Sensitivity, and Specificity
The average ROM in the JLCS-JSCC system according to the 7 observers was 17.7% (13.6–27.5%) for the NM category, 42.8% (21.1–62.5%) for the AC category, 67.1% (50.0–80.0%) for the SM category, and 91.4% (80.9–96.8%) for the ML category (Table 6). It was clearly stratified from the NM category to the ML category, which was the same as our previous finding. If the NM and AC categories and SM and ML categories in the JLCS-JSCC system were regarded as negative and positive results, respectively, the average sensitivity and specificity were 80.6% (range, 65.6–91.0%) and 80.7% (range, 67.0–89.0%), respectively. On the other hand, the average ROM in the PSC system according to the 7 observers was 20.0% (13.6–28.5%) for the NM category, 42.1% (23.5–77.2%) for the AC category, 31.7% (0–80.0%) for the N-B-LG category, 69.6% (59.1–80.0%) for the SM category, and 89.9% (70.4–98.1%) for the ML category.
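The ROM, sensitivity, and specificity calculations described above can be sketched for a single observer as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names and the ten example calls are hypothetical, and SM and ML are treated as positive results and all other categories as negative, as described for the JLCS-JSCC system.

```python
POSITIVE = {"SM", "ML"}  # categories regarded as a positive result

def rom_by_category(calls, is_malignant):
    """Risk of malignancy per category for one observer:
    confirmed malignant cases / all cases placed in that category."""
    totals, malignant = {}, {}
    for call, truth in zip(calls, is_malignant):
        totals[call] = totals.get(call, 0) + 1
        malignant[call] = malignant.get(call, 0) + int(truth)
    return {c: malignant[c] / totals[c] for c in totals}

def sensitivity_specificity(calls, is_malignant):
    """Sensitivity and specificity with SM/ML as positive calls."""
    pairs = list(zip(calls, is_malignant))
    tp = sum(1 for c, t in pairs if c in POSITIVE and t)
    fn = sum(1 for c, t in pairs if c not in POSITIVE and t)
    tn = sum(1 for c, t in pairs if c not in POSITIVE and not t)
    fp = sum(1 for c, t in pairs if c in POSITIVE and not t)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical calls by one observer on ten cases (not study data).
calls = ["NM", "NM", "AC", "AC", "SM", "SM", "ML", "ML", "ML", "ML"]
truth = [False, False, False, True, False, True, True, True, True, True]

print(rom_by_category(calls, truth))
print(sensitivity_specificity(calls, truth))
```

Repeating this for each of the 7 observers and averaging the per-observer values gives the mean ROM, sensitivity, and specificity of the kind reported in Table 6.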
We demonstrated that the consensus and unanimous agreement of the JLCS-JSCC system were 87.4% and 40.7%, respectively, which were better than those in our prior study and those of the PSC system in our study. Further, the average ROM in the JLCS-JSCC system according to the 7 observers was clearly stratified from the NM category to the ML category, which was the same as our previous finding. For benign lesions, most cases (88.2%) were categorized as negative or atypical in the JLCS-JSCC system. Since our study was extended from our previous study with more benign tumors involved, our results may indicate that the JLCS-JSCC system is acceptable for clinical use.
Since a widely accepted classification scheme for respiratory cytology, which would lead to good communication between physicians and cytopathologists, has not been established, the PSC proposed in 2016 a revision of the reporting system for pulmonary cytology based on a multidisciplinary formulation involving pathologists, radiologists, oncologists, and surgeons. The categories include nondiagnostic, NM, AC, N-B-LG, SM, and ML. To clarify the utility of the system, some researchers have reported the ROM of each category to support rational patient management. In the report by Layfield et al., ROMs were 20–43% for the NM category, 54% for AC, 82% for SM, and 77–100% for ML (Table 6). Canberk et al. calculated the ROM with 1,290 respiratory cytology samples, with 48.27% for the NM category, 59.09% for AC, 90% for SM, and 89.74% for ML. In our study, the average ROM in the PSC system according to the 7 observers was 20.0% for the NM category, 42.1% for the AC category, 69.6% for the SM category, and 89.9% for the ML category. On the other hand, the average ROM in the JLCS-JSCC system according to the 7 observers was 17.7% for the NM category, 42.8% for AC, 67.1% for SM, and 91.4% for ML. Although the definitions of NM, AC, SM, and ML in the JLCS-JSCC system were not the same as those in the PSC system, similar results were obtained in both systems. These results indicate that both the JLCS-JSCC system and the PSC system can be used in clinical settings and are expected to have clinical applications.
Despite the usefulness of the PSC system from the ROM point of view, one of the reasons why the PSC system is difficult to use in clinical practice is the “N-B-LG” category. In the PSC system, the benign neoplasm category includes the following lesions: pulmonary hamartomas, PPs, granular cell tumors, hemangiomas, and SPs. Additionally, the “neoplasm-undetermined malignant potential” category includes the following lesions: epithelioid hemangioendotheliomas, clear cell tumors, SPs, meningiomas, Langerhans cell histiocytosis, SFTs, inflammatory myofibroblastic tumors, and myoepithelial neoplasms. However, the categorization of such benign lesions has not been sufficiently evaluated, including in our previous study. Therefore, we evaluated the utility of both systems using the new cohort containing more benign tumors. In this cohort, there were 17 cases of benign lesions: 7 pulmonary hamartomas, 5 SPs, 2 PPs, one SFT, one meningioma, and one lymphoid hyperplasia. SP is a rare benign tumor of the lung and histologically shows papillary, solid, angiomatoid, or sclerotic patterns or combinations of these 4 basic patterns. SP typically presents as a single discrete coin lesion on imaging; therefore, it is easy to suspect SP clinically. However, in our study, one of the 5 SPs was diagnosed as a malignant tumor by more than half of the observers in the JLCS-JSCC system; this may be because cellular ACs were observed and because the diagnosis was made without clinical information, such as radiology findings. SP can be misdiagnosed as malignant if clinical information is not provided. Moreover, one SFT was included in this cohort. An SFT is a mesenchymal tumor of the fibroblastic type that can affect virtually any region of the body. Neoplastic cells are arranged in a patternless architecture with alternating hypo- and hypercellular areas and a prominent branching vasculature. Most tumors present as well-defined, slow-growing masses, which can be cured by surgery. 
In contrast, a small percentage of SFTs (10–25%) behave more aggressively, with local recurrence. Determining whether the tumor is benign or malignant from cytology and biopsy specimens is difficult if clinical information, such as age or tumor size, is lacking. In our study, over half of the observers categorized this case as SM or ML. We consider that this decision was reasonable because an SFT is a very rare tumor, and the specimen showed a very cellular area composed of oval to short spindle-shaped cells with scant cytoplasm. Further, there were cases in which 5 of the 7 observers judged the lesion as malignant, although they should be in the benign category. Fundamentally, the purpose of cytology is not to accurately detect benign lesions but to prevent malignant lesions from being overlooked. Taken together, the N-B-LG category may not be appropriate for clinical practice. However, in the current study, the average ROM in the PSC system according to the 7 observers was 31.7% for the N-B-LG category, which was lower than that for the AC category. This result may support the establishment of the N-B-LG category; however, it may also reflect the fact that the data were obtained from observers with sufficient experience in this field. We believe that the classification scheme should be a universal system that can be used by both experienced and inexperienced cytopathologists worldwide. Therefore, further studies of the N-B-LG category are necessary.
Regarding the nonadequate category, previous studies have reported the ROM of nonadequate cases. Layfield et al. reported that the ROM of the category was 40% (Table 6). On the other hand, Canberk et al. reported that the ROM of the category was 64.01% in their study of 1,290 respiratory cytology samples. In the JLCS-JSCC system, cases are classified into the nonadequate or adequate category at the first step. Only adequate cases were included in this study; thus, data for the nonadequate cases are unavailable. Since the nonadequate cases have been shown to have a high ROM in previous studies, we intend to study the ROM in such cases in a future study.
The previous cohort included only 2 benign lesions, and thus, we collected more benign lesions in the present study. However, our study still has some limitations. First, as mentioned in our previous study, since clinical information or imaging findings were not provided to the observers, the cytological diagnosis was made based only on Pap-stained slides. This method of cytological evaluation is different from that of cytological diagnosis in daily practice. If clinical information and imaging findings had been provided to the observers, cytological agreement among the observers would likely have improved. Second, samples were not consecutive and were selected by cytotechnologists or cytopathologists of the hospitals for the purpose of this interobserver reproducibility study. In particular, cases in which differentiation between benign and malignant lesions was difficult were included. This selection bias made cytological diagnosis more difficult than the routine cytological diagnosis in daily practice. Further studies on routine samples using the JLCS-JSCC system are needed to confirm our results. Third, TBAC and EBUS-TBNA samples, which are increasingly used in diagnostic practice, were limited in our study. For such samples, determining whether a sample is adequate or inadequate is more important than for other types of specimens. In our study, inadequate cases were excluded. Further study with samples obtained using TBAC or EBUS-TBNA is needed to generalize our findings.
In conclusion, the JLCS-JSCC system is considered appropriate for clinical use, and the PSC system can also be used in clinical settings; thus, it is hard to consider one as more useful clinically than the other. Therefore, a unified system developed within the framework of international organizations such as the WHO is needed.
The authors thank Prof. Robert Y. Osamura, Prof. Fernando Schmitt, and Prof. Lukas Bubendorf for inspiring us to establish a new cytology reporting system for lung cancer. The authors also thank Dr. Satoru Shimizu for giving us advice on the statistical analyses of the data.
Statement of Ethics
Study approval statement: The study protocol was approved by the Ethics Committee of Tokyo Women’s Medical University (approval number: 4873; approval date: July 26, 2018) and the Ethics Committee of all participating facilities. Consent was waived for this retrospective study which analyzed cytological specimens with limited clinical information.
Conflict of Interest Statement
The authors have no conflicts of interest to declare.
The JLCS and JSCC covered the expenses of the meeting, postal fee of the glass slides, and fee for the English editing service.
Yoshizawa A., Hiroshima K., Takenaka A., Haba R., Kawahara K., Minami Y., Kakinuma H., Shibuki Y., Miyake S., and Satoh Y. contributed to the study concept and design. Haba R., Kawahara K., Miyake S., Kajio K., Kiyonaga K., Matsubayashi J., and Nagao T. provided glass slides. Takenaka A. collected and managed glass slides. Takenaka A., Haba R., Kawahara K., Kakinuma H., Shibuki Y., Miyake S., Kajio K., Kiyonaga K., Nagatomo M., Nishimura S., Mano M., Matsubayashi J., Motoi N., Nagao T., Nakatsuka S., and Yoshida T. reviewed the glass slides. Yoshizawa A., Hiroshima K., and Satoh Y. contributed to analysis and interpretation of data. Yoshizawa A. drafted the manuscript. Hiroshima K., Haba R., Kawahara K., Minami Y., and Satoh Y. contributed to critical revision of the manuscript.
Data Availability Statement
All data generated or analyzed during this study are included in this article. Further inquiries can be directed to the corresponding author.