Abstract
Introduction: Since no universal cytological classification system for lung cancer has been established, the Japanese Lung Cancer Society and the Japanese Society of Clinical Cytology (JSCC) jointly established and reported four cytological categories: negative for malignancy, atypical cells, suspicious for malignancy, and malignancy. In 2022, the WHO Reporting System for Lung Cytopathology was published. This system presented five cytological categories: the four categories above plus insufficient/inadequate/nondiagnostic. Creating a classification alone does not ensure that it can be applied in actual clinical practice. Thus, we evaluated the reproducibility of the classification through tutorials and identified the issues and problems involved in the wide dissemination of this classification. Methods: Forty-two cases were selected from those used in previously published articles, and diagnosis and tutorial systems were created. The first diagnostic round, the tutorial, and the second diagnostic round were conducted on the web. Participants were recruited via the JSCC website and emails. Images (×100 and ×400) of the lesions to be diagnosed were categorized by 4 cytological categories (benign, atypical, suspicious for malignancy, malignant), 7 suggestive pathological diagnoses, and 4 cytological features. The mean correct or incorrect answer rates for the 42 cases and the mean correct response rates for 105 participants were compared between the first and second rounds using McNemar’s test and t tests to identify cases with diagnostic difficulties and high tutorial effects. Results: For the cytological categories, the correct response improved significantly in 17 of 42 cases. The mean number of correct answers for the four cytological categories increased significantly from 16.0 (38.1%) in the first round to 20.3 (48.3%) in the second round (p < 0.001). For the seven suggestive pathological diagnoses, the mean number of correct answers increased significantly from 20.3 (48.3%) in the first round to 25.1 (59.8%) in the second round (p < 0.001). The mean number of correct responses per case increased significantly from 40.2 (38%) in the first round to 51.5 (49%) in the second round (p = 0.0147). Four cases were difficult to match even after the tutorial and three cases were highly affected by the tutorial. The most important basis for diagnoses was nuclear findings in both the first and second rounds. Conclusion: Comprehensive tutorials on diagnostic criteria are needed to effectively implement this system globally. In particular, devising ways to appropriately diagnose cancers with mild atypia or without characteristic morphology is important.
Introduction
Histological and cytological classifications are essential for diagnosing and managing lung cancer. Cytology specimens of lung cancer are useful for morphological and genetic diagnosis even when tissue samples are unavailable. However, unlike for histology, no universal category classification was available for lung cytopathology. Instead, the Papanicolaou Society of Cytopathology in the USA published a category classification [1‒3], and Japan used a separate cytological classification system [4, 5]. Thus, categorization differed between countries, and a common cytological classification system that could be used in articles and presentations at academic conferences was needed. Therefore, the Japanese Lung Cancer Society and the Japanese Society of Clinical Cytology (JSCC) jointly developed a lung cytological classification system, which included negative for malignancy, atypical cells, suspicious for malignancy, and malignancy categories [6]. Yoshizawa et al. performed a follow-up study using these four categories plus benign tumors and confirmed the usefulness of this system [7]. In 2022, the WHO Reporting System for Lung Cytopathology was published. This system was based on the Japanese system and included the following cytology categories: insufficient/inadequate/nondiagnostic, benign, atypical, suspicious for malignancy, and malignant [8‒10]. Diagnostic categories in a cytological reporting system must be linked to diagnostic management recommendations to improve communication with clinicians and support patient care. The WHO reporting system must be truly international to serve the needs of patients worldwide in settings with widely differing medical resources.
However, creating a classification system does not by itself ensure that it can be applied in actual clinical practice. Thus, we evaluated the reproducibility of the classification system through a tutorial and investigated the issues and problems associated with the wide dissemination of this classification system. This study is the first reproducibility test of the WHO Reporting System for Lung Cytopathology and is the first step toward international standardization.
Material and Methods
Study Cohort
Forty-two cases from our previous analyses were selected for this study [6, 7]. All cases were histologically confirmed by biopsies or surgical tissues. The cases were selected through discussions among four cytopathologists and six cytotechnologists and consisted of a mixture of easily diagnosed and difficult-to-diagnose cases. Benign, inflammatory, and malignant cases were selected, including 10 benign, 12 atypical, 12 suspicious for malignancy, and 8 malignant cases. These included 26 brushing samples, 6 touch preparations of tumors, 4 lavage specimens, 3 sputum samples, 2 transbronchial aspiration samples and 1 washing forceps specimen. All specimens were fixed in 95% ethanol and stained with Papanicolaou staining.
No insufficient/inadequate/nondiagnostic case was included in this study. The correct answers for these cases were given in the previous studies [6, 7]. The cases are listed in Table 1.
Case summary. Correct responses for the cytological category were analyzed in all cases; correct responses for the 5 suggestive pathological diagnoses were analyzed in “suspicious for malignancy” and “malignant” cases only.
Case No. | Cytological category | Category, 1st round* | Category, 2nd round** | Category, p value | Pathological diagnosis | Diagnosis, 1st round* | Diagnosis, 2nd round** | Diagnosis, p value |
---|---|---|---|---|---|---|---|---|
1 | Suspicious for malignancy | 15 (14.3%) | 29 (27.6%) | 0.022 | SQCC | 32 (30.5%) | 49 (46.7%) | 0.017 |
2 | Benign | 75 (71.4%) | 83 (79%) | 0.170 | Normal cell | |||
3 | Atypical | 11 (10.5%) | 46 (43.8%) | <0.001 | Reactive | |||
4 | Malignant | 44 (41.9%) | 50 (47.6%) | 0.471 | Mucoepidermoid carcinoma | 22 (21.0%) | 40 (38.1%) | 0.007 |
5 | Atypical | 38 (36.2%) | 51 (48.6%) | 0.061 | Granuloma | |||
6 | Suspicious for malignancy | 15 (14.3%) | 36 (34.3%) | 0.001 | ADC | 61 (58.1%) | 72 (68.6%) | 0.136 |
7 | Benign | 87 (82.9%) | 93 (88.6%) | 0.264 | Interstitial pneumonia | |||
8 | Suspicious for malignancy | 26 (24.8%) | 38 (36.2%) | 0.067 | ADC | 79 (75.2%) | 82 (78.1%) | 0.710 |
9 | Atypical | 28 (26.7%) | 28 (26.7%) | 1.000 | Interstitial pneumonia | |||
10 | Suspicious for malignancy | 19 (18.1%) | 34 (32.4%) | 0.021 | ADC | 41 (39.0%) | 47 (44.9%) | 0.361 |
11 | Malignant | 60 (57.1%) | 72 (68.6%) | 0.074 | Adenoid cystic carcinoma | 48 (45.7%) | 75 (71.4%) | <0.001 |
12 | Atypical | 35 (33.3%) | 53 (50.5%) | 0.010 | Interstitial pneumonia | |||
13 | Suspicious for malignancy | 28 (26.7%) | 39 (37.1%) | 0.127 | ADC | 70 (66.7%) | 75 (71.4%) | 0.499 |
14 | Malignant | 69 (65.7%) | 74 (70.5%) | 0.472 | ADC | 87 (82.9%) | 87 (82.9%) | 1.000 |
15 | Benign | 31 (29.5%) | 59 (56.2%) | <0.001 | Sclerosing pneumocytoma | |||
16 | Atypical | 31 (29.5%) | 39 (37.1%) | 0.302 | Radiation pneumonia | |||
17 | Suspicious for malignancy | 33 (31.4%) | 43 (41%) | 0.155 | ADC | 39 (37.1%) | 45 (42.9%) | 0.391 |
18 | Benign | 91 (86.7%) | 96 (91.4%) | 0.302 | Sarcoidosis | |||
19 | Atypical | 30 (28.6%) | 48 (45.7%) | 0.007 | Organizing pneumonia | |||
20 | Suspicious for malignancy | 35 (33.3%) | 42 (40%) | 0.324 | ADC | 70 (66.7%) | 80 (76.2%) | 0.078 |
21 | Malignant | 94 (89.5%) | 93 (88.6%) | 1.000 | ADC | 69 (57.1%) | 65 (61.9%) | 0.499 |
22 | Benign | 98 (93.3%) | 96 (91.4%) | 0.752 | Hamartoma | |||
23 | Benign | 47 (44.8%) | 58 (55.2%) | 0.100 | Granuloma | |||
24 | Suspicious for malignancy | 18 (17.1%) | 34 (32.4%) | 0.008 | SQCC | 25 (23.8%) | 38 (36.2%) | 0.031 |
25 | Malignant | 94 (89.5%) | 89 (84.8%) | 0.267 | SCLC | 100 (95.2%) | 90 (85.7%) | 0.024 |
26 | Suspicious for malignancy | 26 (24.8%) | 40 (38.1%) | 0.040 | ADC | 58 (55.2%) | 77 (73.3%) | 0.001 |
27 | Atypical | 31 (29.5%) | 47 (44.8%) | 0.021 | Hypersensitivity pneumonitis | |||
28 | Benign | 33 (31.4%) | 51 (48.6%) | 0.004 | Meningioma | |||
29 | Malignant | 63 (60%) | 67 (63.8%) | 0.571 | Carcinoid | 66 (62.9%) | 80 (76.2%) | 0.014 |
30 | Atypical | 11 (10.5%) | 30 (28.6%) | 0.002 | Eosinophilic pneumonia | |||
31 | Suspicious for malignancy | 20 (19%) | 32 (30.5%) | 0.067 | ADC | 88 (83.8%) | 91 (86.7%) | 0.579 |
32 | Atypical | 34 (32.4%) | 44 (41.9%) | 0.155 | Acute pneumonia | |||
33 | Benign | 29 (27.6%) | 35 (33.3%) | 0.307 | Papilloma | |||
34 | Suspicious for malignancy | 36 (34.3%) | 38 (36.2%) | 0.871 | ADC | 88 (83.8%) | 90 (85.7%) | 0.480 |
35 | Atypical | 25 (23.8%) | 39 (37.1%) | 0.035 | Acute interstitial pneumonia | |||
36 | Malignant | 68 (64.8%) | 79 (75.2%) | 0.063 | SQCC | 99 (94.3%) | 104 (99.0%) | 0.074 |
37 | Atypical | 26 (24.8%) | 47 (44.8%) | 0.002 | Interstitial pneumonia | |||
38 | Benign | 7 (6.7%) | 15 (14.3%) | 0.061 | Granuloma | |||
39 | Malignant | 60 (57.1%) | 54 (51.4%) | 0.391 | ADC | 63 (60.0%) | 76 (72.4%) | 0.012 |
40 | Benign | 29 (27.6%) | 43 (41%) | 0.040 | Solitary fibrous tumor | |||
41 | Atypical | 5 (4.8%) | 26 (24.8%) | <0.001 | Inflammation | |||
42 | Suspicious for malignancy | 35 (33.3%) | 54 (51.4%) | 0.007 | ADC | 68 (64.8%) | 69 (65.7%) | 1.000 |
Including the average number and percentage of correct responses among the 105 participants.
McNemar’s test.
*1st round: cytological diagnosis before tutorial.
**2nd round: cytological diagnosis after tutorial.
Research Participants
Participants were members of JSCC and qualified as cytotechnologists or cytopathologists. Participants were recruited through advertisements on the JSCC website and through the mailing list of 10,921 JSCC members. Written informed consent for the collection of participant data was obtained when a request for participation was received. The following data about the participants were collected: participant name, facility name, age, facility type, years of cytology experience, job title, and specialty. Participant names and facility names were collected only to assign personal numbers for participation in the study. Personal data were anonymized by assigning a personal number.
Study Design
A web-based response system using WEBCAS (WOW WORLD Inc., Tokyo, Japan) was developed. Images at ×100 and ×400 centered on the lesion to be diagnosed were included for each of the 42 cases. The images were classified based on the following: 4 categories, including 10 benign, 12 atypical, 12 suspicious for malignancy, and 8 malignant lesions; 7 suggestive pathological diagnoses, meaning the histological type inferred from the morphology of the cytology specimen, including 5 benign tumors, 17 inflammatory changes, 13 adenocarcinoma (ADC), 3 squamous cell carcinoma (SQCC), 1 small-cell carcinoma (SCLC), 0 large-cell neuroendocrine carcinoma, and 3 other malignant tumors; and 4 choices for the first and second basis for the diagnosis, including cell cluster, cytoplasmic, nuclear findings, or other. All answers were collected online; the first-round response system was open for 4 weeks. One week after the end of the first round, the correct answers and tutorial were presented for 2 weeks. The tutorial comprised a PDF file, available online for those 2 weeks, that detailed the cytological findings and key observation points based on case images from the study and allowed participants to access and study the material at their convenience. The second round started 4 weeks after the end of the presentation of correct answers and tutorial. The cases and the questions for the second round were the same as in the first round (Table 1; Fig. 1). This study schedule was designed with reference to the Ebbinghaus forgetting curve of human memory [11].
Test schedule and participants. The outline of the study was announced by email and on the website of the Japanese Society for Clinical Cytology. Two weeks were set aside for accepting participants, and 180 applications were received. The study system was then opened for 4 weeks, and the first round was conducted with 140 participants (77.8% of the applicants). One week after the completion of the study, the correct answers were presented for 2 weeks as an educational period. The second round was conducted 4 weeks after the end of the presentation of correct answers. The second round included 105 participants (58.3% of the applicants and 75% of the participants in the first round).
Calculation Method
The correct and incorrect answer rates were defined as the number of correct and incorrect answers divided by the total number of cases. The correct response rates were defined as the number of correct respondents divided by the total number of participants.
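The two rates defined above can be illustrated with a short sketch. The response matrix is hypothetical (3 participants by 4 cases), and the function names are illustrative only; they are not part of the study's software.

```python
from typing import List


def answer_rate(participant_row: List[bool]) -> float:
    """Correct answer rate for one participant: correct answers / number of cases."""
    return sum(participant_row) / len(participant_row)


def response_rate(matrix: List[List[bool]], case_index: int) -> float:
    """Correct response rate for one case: correct respondents / number of participants."""
    return sum(row[case_index] for row in matrix) / len(matrix)


# Hypothetical data: rows are participants, columns are cases; True = correct.
responses = [
    [True, False, True, True],
    [False, False, True, True],
    [True, False, False, True],
]

per_participant = [answer_rate(row) for row in responses]
per_case = [response_rate(responses, j) for j in range(len(responses[0]))]

print("answer rate per participant:", per_participant)
print("response rate per case:", per_case)
```

In the study, the same two quantities would be computed over 105 participants and 42 cases; the incorrect answer rate is obtained in the same way by counting incorrect answers instead.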
Assessment Factors
We compared the mean correct and incorrect answer rates for the 42 cases among the participants before and after the tutorial, and the mean correct response rates for the 105 participants among the cases. Cases that were difficult to match were extracted, and diagnostic points were examined. To evaluate the impact of the tutorial based on cytology experience, we analyzed changes in correct response rates after the tutorial based on years of experience (19 participants with 1–5, 15 with 6–10, 26 with >10 years, 19 with >20 years, and 26 with >30 years of experience).
Statistics
McNemar’s test was used to analyze the difference in the number of correct responses for the cytological category. Paired Student t tests were used to compare mean correct and incorrect answer rates and mean correct response rates. To examine the effects of the tutorial on each category and to identify cases in which concordant diagnoses were difficult to reach and cases with high tutorial effects, the first- and second-round correct response rates were compared for each of the 42 cases using McNemar’s tests. A p value of <0.05 was considered significant.
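As a minimal sketch of how a per-case McNemar comparison works, the function below computes the two-sided exact (binomial) form of the test from the discordant pairs only. The paper does not state whether the exact or the chi-square form was used, and Table 1 reports only the marginal counts, so the discordant counts in the example are hypothetical.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p value from the discordant pair counts.

    b: participants correct in round 1 but incorrect in round 2
    c: participants incorrect in round 1 but correct in round 2
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of change
    k = min(b, c)
    # Under the null hypothesis each discordant pair is equally likely to
    # fall either way, so the smaller count follows Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts for a single case (not taken from the study).
p_value = mcnemar_exact(b=5, c=19)
print(f"p = {p_value:.4f}")
```

Participants who were correct in both rounds or incorrect in both rounds do not enter the calculation; only those whose answer changed between rounds carry information about the tutorial effect.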
Results
Study Participants
There were 180 research applicants, of whom 140 (77.8%) participated in the first round. The second round included 105 participants (58.3% of applicants, 75% of first-round participants) (Fig. 1). Results from these 105 participants were analyzed.
Comparison of the Correct Answers by Experience
A significant difference was found only between participants with 1–5 years and those with 6–10 years of experience (p = 0.03).
Comparison of the Correct Response for Cytological Category
As shown in Table 1, the correct response improved significantly in 17 of 42 cases, notably in 8 of 12 atypical cases and 6 of 12 suspicious for malignancy cases.
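The tally of significant cases can be reproduced from the cytological category p values in Table 1. The sketch below is illustrative: the values are transcribed from Table 1 in case order, and entries reported as "<0.001" are stored as 0.001, which is an assumption made only so that they can be compared numerically.

```python
from collections import Counter

B, A, S, M = "Benign", "Atypical", "Suspicious for malignancy", "Malignant"

# (cytological category, p value) for cases 1-42, transcribed from Table 1.
# "<0.001" is stored as 0.001 (an upper bound, assumed for comparison only).
table1 = [
    (S, 0.022), (B, 0.170), (A, 0.001), (M, 0.471), (A, 0.061), (S, 0.001),
    (B, 0.264), (S, 0.067), (A, 1.000), (S, 0.021), (M, 0.074), (A, 0.010),
    (S, 0.127), (M, 0.472), (B, 0.001), (A, 0.302), (S, 0.155), (B, 0.302),
    (A, 0.007), (S, 0.324), (M, 1.000), (B, 0.752), (B, 0.100), (S, 0.008),
    (M, 0.267), (S, 0.040), (A, 0.021), (B, 0.004), (M, 0.571), (A, 0.002),
    (S, 0.067), (A, 0.155), (B, 0.307), (S, 0.871), (A, 0.035), (M, 0.063),
    (A, 0.002), (B, 0.061), (M, 0.391), (B, 0.040), (A, 0.001), (S, 0.007),
]

ALPHA = 0.05  # significance threshold stated in the Statistics section

# Count cases per category whose p value falls below the threshold.
significant = Counter(category for category, p in table1 if p < ALPHA)

print("significant cases in total:", sum(significant.values()))
for category in (B, A, S, M):
    print(f"{category}: {significant[category]}")
```

The count reflects significance only; the direction of change is read from the first- and second-round columns of Table 1.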
Comparison of the Correct Answers per Participant
As shown in Table 2 and Figure 2, the mean number of correct answers for cytologic diagnostic categories increased from 16.0 of 42 cases (38.1%) in the first round to 20.3 of 42 cases (48.3%) in the second round (p < 0.001). For the benign, atypical, and suspicious for malignancy categories, the mean number and rate of correct answers increased significantly between rounds 1 and 2 (p < 0.001). For malignant cases, the mean number and rate of correct answers also increased between the first and second rounds, but the difference was not statistically significant (p = 0.223).
Mean number and percentage of correct and incorrect answers in each category among the 42 cases
Cytological category | Answers | Incorrect cytological category | 1st round* | 2nd round** | p value |
---|---|---|---|---|---|
All (42) | Correct | | 16.0 (38.1%) | 20.3 (48.3%) | <0.001 |
Benign (10) | Correct | | 5.0 (50%) | 6.0 (60%) | <0.001 |
Benign (10) | Incorrect | Atypical | 2.1 (21%) | 2.2 (22%) | 0.273 |
Benign (10) | Incorrect | Suspicious for malignancy | 1.1 (11%) | 1.0 (10%) | 0.157 |
Benign (10) | Incorrect | Malignant | 1.8 (18%) | 0.87 (8.7%) | <0.001 |
Atypical (12) | Correct | | 2.9 (24.2%) | 5.0 (41.7%) | <0.001 |
Atypical (12) | Incorrect | Benign | 2.5 (21%) | 2.7 (22.5%) | 0.278 |
Atypical (12) | Incorrect | Suspicious for malignancy | 2.4 (20%) | 2.2 (18%) | 0.240 |
Atypical (12) | Incorrect | Malignant | 4.2 (35%) | 2.1 (17.5%) | <0.001 |
Suspicious for malignancy (12) | Correct | | 2.9 (24.2%) | 4.4 (36.6%) | <0.001 |
Suspicious for malignancy (12) | Incorrect | Benign | 0.9 (7.5%) | 0.7 (5.8%) | 0.079 |
Suspicious for malignancy (12) | Incorrect | Atypical | 1.8 (15%) | 1.9 (15.8%) | 0.283 |
Suspicious for malignancy (12) | Incorrect | Malignant | 6.3 (52.5%) | 5.0 (41.7%) | 0.002 |
Malignant (8) | Correct | | 5.3 (66.3%) | 5.5 (68.8%) | 0.223 |
Malignant (8) | Incorrect | Benign | 0.52 (6.5%) | 0.32 (4%) | 0.027 |
Malignant (8) | Incorrect | Atypical | 0.63 (7.9%) | 0.5 (6.25%) | 0.134 |
Malignant (8) | Incorrect | Suspicious for malignancy | 1.6 (20%) | 1.7 (21%) | 0.282 |
Paired Student t test, one-sided test.
*1st round: cytological diagnosis before tutorial.
**2nd round: cytological diagnosis after tutorial.
The mean numbers of correct answers by category for the 105 participants in the first and second rounds are shown. Significance was set at p < 0.05; the mean number of correct answers increased after the tutorial.
Comparison of the Incorrect Answers per Participant
The incorrect answers for the 42 cases by cytological diagnostic category are shown in Table 2. The mean numbers of the 10 benign and 12 atypical cases miscategorized as malignant were significantly lower in the second round than in the first round (p < 0.001). Of the 12 suspicious for malignancy cases, the mean number miscategorized as malignant was significantly lower in the second round than in the first (p = 0.002). Of the 8 malignant cases, the mean number miscategorized as benign was significantly lower in the second round than in the first (p = 0.027).
Comparison of the Correct Answer Rates by Suggestive Pathological Diagnosis
As shown in Table 3, the mean number of correct answers for all participants increased significantly from 20.3 of 42 (48.3%) in the first round to 25.1 of 42 (59.8%) in the second round (p < 0.001). The mean numbers of correct answers for benign tumors, for inflammatory changes, and for carcinomas (the five malignant suggestive pathological diagnoses combined into one category) all increased significantly in the second round (p < 0.001).
Mean number and percentage of correct answers per case in the benign tumor, inflammation, and cancer groups among the 42 cases
Cytological diagnosis | 1st round* | 2nd round** | p value |
---|---|---|---|
All (42) | 20.3 (48.3%) | 25.1 (59.8%) | <0.001 |
Benign tumor (5) | 2.3 (46%) | 3.0 (60%) | <0.001 |
Inflammation (17) | 6.0 (35.3%) | 8.1 (47.6%) | <0.001 |
Cancer (20) | 12.0 (60%) | 13.5 (67.5%) | <0.001 |
Paired Student t test, one‐sided test.
*1st round: cytological diagnosis before tutorial.
**2nd round: cytological diagnosis after tutorial.
Comparison of the Correct Response Rates per Case
As shown in Table 4, the mean number of correct responses per case increased significantly from 40.2 of 105 (38%) in the first round to 51.5 of 105 (49%) in the second round (p = 0.015). By category, the mean number of correct responses for the 10 benign and 8 malignant cases increased after the tutorial, but the difference was not significant. In contrast, the mean number of correct responses for the 12 atypical cases increased significantly from 25.3 (24%) in the first round to 41.5 (39.5%) in the second round (p < 0.001). The mean number of correct responses for the 12 suspicious for malignancy cases increased significantly from 25.5 (24.3%) in the first round to 38.3 (36.5%) in the second round (p < 0.001). The mean number of correct responses for the suggestive pathological diagnosis in suspicious for malignancy and malignant cases increased after the tutorial, although the difference was not significant.
Mean number and percentage of correct responses per case among the 105 participants
Cytological category | 1st round* | 2nd round** | p value |
---|---|---|---|
All | 40.2 (38%) | 51.5 (49%) | 0.015 |
Benign | 52.5 (50%) | 62.8 (59.8%) | 0.228 |
Atypical | 25.3 (24%) | 41.5 (39.5%) | <0.001 |
Suspicious for malignancy | 25.5 (24.3%) | 38.3 (36.5%) | <0.001 |
Malignant | 69.0 (65.7%) | 72.0 (68.6%) | 0.359 |
Presumptive tissue type (suspicious for malignancy and malignant cases) | 63.7 (60.6%) | 71.5 (68.0%) | 0.123 |
Paired Student t test, one-sided test.
*1st round: cytological diagnosis before tutorial.
**2nd round: cytological diagnosis after tutorial.
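The overall per-case means in Table 4 can be checked against the per-case counts in Table 1. The sketch below is an illustrative recomputation, not the authors' analysis script: the counts of correct respondents for the cytological category are transcribed from Table 1 in case order and averaged over the 42 cases.

```python
from statistics import mean

N_PARTICIPANTS = 105

# Correct respondents per case (cases 1-42), transcribed from Table 1.
round1 = [
    15, 75, 11, 44, 38, 15, 87, 26, 28, 19, 60, 35, 28, 69,
    31, 31, 33, 91, 30, 35, 94, 98, 47, 18, 94, 26, 31, 33,
    63, 11, 20, 34, 29, 36, 25, 68, 26, 7, 60, 29, 5, 35,
]
round2 = [
    29, 83, 46, 50, 51, 36, 93, 38, 28, 34, 72, 53, 39, 74,
    59, 39, 43, 96, 48, 42, 93, 96, 58, 34, 89, 40, 47, 51,
    67, 30, 32, 44, 35, 38, 39, 79, 47, 15, 54, 43, 26, 54,
]

mean1 = mean(round1)  # mean correct respondents per case, 1st round
mean2 = mean(round2)  # mean correct respondents per case, 2nd round

print(f"1st round: {mean1:.1f} of {N_PARTICIPANTS} ({mean1 / N_PARTICIPANTS:.0%})")
print(f"2nd round: {mean2:.1f} of {N_PARTICIPANTS} ({mean2 / N_PARTICIPANTS:.0%})")
```

After rounding, the results agree with the "All" row of Table 4 (40.2 and 51.5 correct responses per case).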
Cases of Low Tutorial Effectiveness
McNemar’s test was performed for each case to analyze differences in the number of correct responses for cytological categories among the 105 participants before and after the tutorial. In each category, cases with no significant difference before and after the tutorial, low correct response rates, and poor tutorial effects were selected (Table 1). Case 38 (granulomatous lesion, benign) had the lowest correct response rate among the benign cases (1st: 6.7% vs. 2nd: 14.3%, p = 0.061). Case 9 (interstitial pneumonia, atypical) had the same correct response rate in the first and second rounds (1st: 26.7% vs. 2nd: 26.7%, p = 1.000). The correct response rate in Case 34 (ADC, suspicious for malignancy) did not increase significantly after the tutorial (1st: 34.3% vs. 2nd: 36.2%, p = 0.871). Case 4 (mucoepidermoid carcinoma, malignant) had the highest correct response rate among these four cases, but the rate did not increase significantly after the tutorial (1st: 41.9% vs. 2nd: 47.6%, p = 0.471) (Fig. 3a–d).
Cases with low tutorial effects. a Case 38, granuloma in the benign category. In this case, the cells do not have a distinct nuclear size difference or chromatin hyperexpansion. The cytoplasm is somewhat thicker and more abundant, with additional squamous metaplasia; magnification: ×100 (ai) and ×400 (aii). b Case 9, interstitial pneumonia in the atypical category. The nuclei of the cells are not different in size, and the chromatin is uniform, but dense proliferating cells are observed. The cells are presumed to be bronchial epithelial cells, although no obvious lineage brushes can be observed; magnification: ×100 (bi) and ×400 (bii). c Case 34, ADC in the suspicious for malignancy category. Cells with diverse chromatin are irregularly stacked. ADC is presumed, but size disparity and nuclear atypia are not obvious; magnification: ×100 (ci) and ×400 (cii). d Case 4, mucoepidermoid carcinoma in the malignant category. Cell clusters are observed in the cytoplasm, and mucus-containing, eccentric nucleus and mucus-free cells are adjacent to each other. The nuclear findings are similar. The case is presumed to be mucoepidermoid carcinoma; magnification: ×100 (di) and ×400 (dii).
McNemar’s test was performed to analyze the difference in the number of correct responses for the five suggestive pathological diagnoses (ADC, SQCC, SCLC, large-cell neuroendocrine carcinoma, and other malignancies) for the 20 suspicious for malignancy and malignant cases. A significant difference between rounds was detected in 8 of 20 cases (Table 1). The correct response rate for Case 4 (mucoepidermoid carcinoma, malignant) was low but significantly increased after the tutorial (1st: 21% vs. 2nd: 38.1%, p = 0.007) (Fig. 3d). In Case 24 (SQCC, suspicious for malignancy), the correct response rate also increased significantly after the tutorial (1st: 23.8% vs. 2nd: 36.2%, p = 0.031) (Fig. 4a). Case 25 (SCLC, malignant) had a high percentage of correct responses in both rounds, although the percentage decreased significantly after the tutorial (1st: 95.2% vs. 2nd: 85.7%, p = 0.024) (Fig. 4b).
Difficult-to-diagnose cases by histological type. a Case 24, SQCC in the suspicious for malignancy category. The background of the specimen is necrotic. In this case, irregularly stacked cell clusters and large atypical cells are seen. Estimating the histological classification is difficult; magnification: ×100 (ai) and ×400 (aii). b Case 25, SCLC in the malignant category. In this case, the chromatin is fine and granular, and the naked nucleus-like cells lacking cytoplasm appear as loosely bound aggregates. SCLC is presumed; magnification: ×100 (bi) and ×400 (bii).
Cases of High Tutorial Effectiveness
A McNemar’s test was performed for the cytological category of each case, and cases with a high tutorial effect (an increased number of correct responses) were selected (Table 1). The correct response rate for Case 15 (sclerosing pneumocytoma, benign) significantly increased (1st: 29.5% vs. 2nd: 56.2%, p < 0.001) (Fig. 5a). The correct response rate for Case 3 (rheumatoid lung, atypical) (Fig. 5b) significantly increased (1st: 10.5% vs. 2nd: 43.8%, p < 0.001). The correct response rate for Case 41 (inflammatory changes, atypical) (Fig. 5c) significantly increased (1st: 4.8% vs. 2nd: 24.8%, p < 0.001). The correct response rate for Case 6 (ADC, suspicious for malignancy) (Fig. 5d) significantly increased (1st: 14.3% vs. 2nd: 34.3%, p = 0.001).
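As an illustration of the per-case comparison described above, the following is a minimal sketch of an exact (binomial) McNemar’s test on paired first-round and second-round responses. It assumes the exact two-sided form of the test; the article does not state which variant was used. The function name and the discordant counts in the usage example are hypothetical and are not data from this study.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the discordant pair counts.

    b: participants correct in round 1 but incorrect in round 2
    c: participants incorrect in round 1 but correct in round 2
    Participants who were correct (or incorrect) in both rounds do not
    enter the test statistic.
    """
    n = b + c
    if n == 0:
        # No discordant pairs: no evidence of change between rounds.
        return 1.0
    k = min(b, c)
    # Under the null hypothesis, b follows Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Double the smaller tail for a two-sided test, capped at 1.
    return min(1.0, 2.0 * tail)


if __name__ == "__main__":
    # Hypothetical counts for one case (NOT taken from the study):
    # 3 participants changed from correct to incorrect, 12 from incorrect to correct.
    p_value = mcnemar_exact(b=3, c=12)
    print(f"exact McNemar p-value: {p_value:.4f}")
```

Only the participants whose answers changed between rounds contribute to the test, which is why the marginal correct response rates reported for each case are not sufficient on their own to reproduce the p-values.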
The cases with high tutorial effects. a Case 15, sclerosing pneumocytoma in the benign category. In this specimen, the vascular interstitium is in the center of the cell aggregates, and little variation in the size of the cells can be detected. In addition, the cell density has increased. No obvious nuclear atypia can be detected, and the cells appear to be a cluster of atypical cells similar to type II alveolar epithelium. Spindle cells are also seen in the specimen. We presume sclerosing alveolar epithelioma, which is a benign tumor; magnification: ×100 (ai) and ×400 (aii). b Case 3, rheumatoid lung in the atypical category. The background of this specimen is inflammatory. The cells are irregularly stacked, with markedly different sizes of nuclei and nuclear atypia. However, these cells are uniform with no chromatin hyperplasia, and normal lineage cylinder epithelial cells are present. Regenerative bronchial epithelial cells are presumed; magnification: ×100 (bi) and ×400 (bii). c Case 41, inflammation in the atypical category. Cell-dense aggregates are arranged in a fenestrated pattern with some stacking. Although the nuclei are enlarged, the individual cells do not show clear atypia, suggesting bronchial epithelial cells; magnification: ×100 (ci) and ×400 (cii). d Case 6, ADC in the suspicious for malignancy category. An inflammatory background can be observed. The cell aggregates show irregular stacking. The cytoplasm of these cells is lacy, the nuclei are irregularly sized, and small nucleoli are present in the nuclei. Nuclear atypia is also present, and the chromatin is fused. The cells are suspected to be ADC but are also suspicious for malignancy due to the marked degeneration and vacuolation of the cytoplasm; magnification: ×100 (di) and ×400 (dii).
Most Important Basis for Diagnoses in Each Case
Nuclear findings were the most important basis for diagnoses in 31 of 42 cases (74%) in both the first and second rounds. The details of the most important basis for diagnosis are listed in Table 5.
Most important basis for categorization in each case
Categorization | First round: first basis for categorization | First round: second basis for categorization | Second round: first basis for categorization | Second round: second basis for categorization |
---|---|---|---|---|
Nuclear findings | 31 | 20* | 31 | 20* |
Cell aggregate | 10 | 13* | 10 | 12* |
Cytoplasmic | 1 | 10 | 1 | 12 |
*An equal number of participants chose different categories.
Discussion
The mean correct answer rates increased significantly after the tutorial. As shown in Table 2, the effects of the tutorial for the new reporting system for lung cytopathology were significant, especially in atypical and suspicious for malignancy cases. The tutorial also affected the correct answer rates for benign and malignant category cases, but the effects were limited. Part of the problem in reproducibility, especially for intermediate categories, is that the morphologic changes are a spectrum through which we draw artificial lines in an attempt to correlate with biological behavior [12‒14]. In other words, lesions at the benign and malignant ends of this spectrum are relatively easy to distinguish morphologically, whereas the intermediate categories are not.
The tutorial was not effective in improving diagnoses in some cases. For example, Case 38 was a benign granulomatous lesion. However, the participants may have diagnosed the case as atypical or malignant based on the nuclear findings, cytoplasmic thickness, and squamous metaplasia. Case 9 was an atypical case that most participants diagnosed as benign. It was a case of interstitial pneumonitis with no cell size discrepancy and with homogeneous chromatin, but the image exhibited dense proliferating cells. These findings may have led the participants to believe that the case was benign, but the dense proliferating cells indicate that the case should be diagnosed as atypical. In Case 34 (suspicious for malignancy), almost one-third of the participants diagnosed the case as malignant in both the first and second rounds, indicating that the tutorial had no effect. In this case, the irregular dense proliferating cells had a variety of chromatin, suggesting ADC. However, no clear differences in size or nuclear atypia were observed, and the case should have been diagnosed as suspicious for malignancy rather than malignant. Case 4 was mucoepidermoid carcinoma. The nuclear findings were similar between the two types of cells, and the participants may have been confused about the difference between malignancy and suspicious for malignancy.
Meanwhile, the highest tutorial effect was indicated in Case 15, a sclerosing pneumocytoma in the benign category. Although the cells lacked nuclear atypia, the high cell density may have discouraged participants from categorizing the case as benign in the first round, and many participants categorized the image as malignant or suspicious for malignancy. However, the tutorial highlighted the lack of size disparity in the cells and the lack of nuclear atypia. Thus, the percentage of correct responses increased in the second round.
The most important diagnostic point evaluated by the participants was nuclear findings, which may have led to diagnostic confusion when the atypia of nuclei was mild. Cytological diagnoses should be based on differences in the nuclear size and chromatin distribution. Reactive cells may also have more pleomorphic, binucleated, or enlarged nuclei than malignant cells, which may lead to overdiagnosis if the focus is solely on nuclear findings [15]. However, specimens assigned to the atypical category will demonstrate lesser degrees of cytomorphologic abnormality than those assigned to the suspicious for malignancy category [16]. In this study, the percentage of correct responses for benign, atypical, and suspicious for malignancy categories increased when the tutorial highlighted the lack of chromatin increase and nuclear atypia.
The main limitation of this study was that only two images were presented instead of the entire slide. In general, cytology specimens are examined for the target cells, the background cells, necrosis, and mucin. However, this study focused on the tutorial effects of the new classification system. Thus, we wanted the participants to focus only on the target cells, and we presented only two images. In addition, this study involved many cytology specialists. Using virtual slides with a large number of cases would have lengthened the time required for case observation, made it difficult to maintain concentration, and possibly interfered with the routine work of the cytology specialists. The number of cases was therefore limited to 42, slightly fewer than in the previous study [17].
Conclusion
We showed that web-based tutorials are particularly effective for the benign, atypical, and suspicious for malignancy categories. The malignant category had a high correct answer rate both before and after the tutorial; therefore, a tutorial of a volume similar to that used in our study should be sufficient for this category. For this reporting system to be used effectively in many countries, it is necessary to provide and disseminate appropriate tutorials on diagnostic criteria via the internet. In particular, detailed explanations and annotations on important points should be provided to enable correct diagnosis of cancers with mild atypia and cancers lacking characteristic morphology.
Acknowledgments
We thank the 140 participants, especially the 105 cytotechnologists and cytopathologists who participated in both rounds of this study, and the JSCC secretariat. We also thank Dr. Satoru Shimizu of Tokyo Women's Medical University for giving us advice on statistical analysis of the data in this reproducibility study of the classification through tutorials.
Statement of Ethics
The requirement for written informed consent was waived by the Ethics Committee of Tokyo Women’s Medical University for this retrospective study, which analyzed cytological specimens with limited clinical information (Approval No.: 4873-R; approval date: May 19, 2021).
Conflict of Interest Statement
The authors have no conflicts of interest to declare.
Funding Sources
The Japan Lung Cancer Society and the Japanese Society of Clinical Cytology paid for the development of the web-based response and tutorial system and for the English editing service.
Author Contributions
Study concept and design: Yuko Minami, Kenzo Hiroshima, Akihiko Yoshizawa, Akemi Takenaka, Reiji Haba, Kunimitsu Kawahara, Hirokuni Kakinuma, Yasuo Shibuki, Shinji Miyake, and Yukitoshi Satoh. Reviewing the cases: Yuko Minami, Akemi Takenaka, Hirokuni Kakinuma, Yasuo Shibuki, and Shinji Miyake. Management of images: Akemi Takenaka. Analysis and interpretation of data: Yuko Minami, Kenzo Hiroshima, Akihiko Yoshizawa, and Yukitoshi Satoh. Drafting of the manuscript: Yuko Minami. Critical revision of the manuscript: Yuko Minami, Kenzo Hiroshima, Akihiko Yoshizawa, Reiji Haba, Kunimitsu Kawahara, and Yukitoshi Satoh.
Data Availability Statement
All data analyzed during this study are included in this article. Further inquiries can be directed to the corresponding author.