Introduction: The aim of the study was to perform the first meta-analysis assessing the pooled risk of malignancy of each category of the Sydney system for reporting lymph node aspirates, along with an evaluation of its diagnostic accuracy. Methods: PubMed/MEDLINE and Embase were searched with the following keywords: “(Lymph node) AND (fine needle aspiration biopsy) OR (International system OR Sydney system)” in the timeframe 2020 to August 4, 2023. The selected articles were assessed for the risk of bias by the QUADAS-2 tool. The meta-analysis for sensitivity (SN) and specificity for each cut-off, that is, “atypical considered positive,” “suspicious of malignancy considered positive,” and “malignant considered positive” for the lesions, was carried out after excluding the inadequate samples in each study. To assess the diagnostic accuracy, summary receiver operating characteristic curves were constructed, and the diagnostic odds ratio was pooled for each cut-off. Results: Nine studies, all of which were retrospective cross-sectional studies, were evaluated with a total of 13,205 cases. The SN and specificity for the “atypical and higher risk categories” considered positive for malignancy were 97% (95% CI, 95–99%) and 96% (95% CI, 91–98%), respectively. The SN and specificity for the “suspicious of malignancy and higher risk categories” considered positive for malignancy were 91% (95% CI, 85–95%) and 99% (95% CI, 97–100%), respectively. The SN and specificity for the “malignant” category considered positive for malignancy were 75% (95% CI, 65–84%) and 100% (95% CI, 99–100%), respectively. The pooled area under the curve was 99–100% for each of the cut-offs. Conclusion: This meta-analysis highlights the accuracy of the Sydney system in reporting lymph node aspirates. It demonstrates the significance of the “suspicious” and “malignant” categories in diagnosing malignancy and of the “benign” category in excluding malignancy.

Fine needle aspiration biopsy (FNAB) is a minimally invasive, cost-effective, rapid, and safe technique to evaluate lymph nodal swellings, including both palpable and non-palpable enlarged nodes. The Sydney system for reporting lymph node cytopathology was proposed to provide a standardised format for reporting lymph node aspirates and to improve communication between the clinician and the pathologist. In addition, each category of the Sydney system was linked to its risk of malignancy and to management recommendations [1].

The categories of the Sydney system include I/L1: non-diagnostic; II/L2: benign; III/L3: atypical cells of undetermined significance or atypical lymphoid cells of uncertain significance; IV/L4: suspicious of malignancy; and V/L5: malignant [1]. Following its development, several studies have evaluated the usefulness of the Sydney system in the diagnosis of lymph node aspirates [2‒10]. However, a precise estimate of the sensitivity (SN) and specificity of the Sydney system in the diagnosis of lymph node aspirates, derived from a systematic review and meta-analysis of diagnostic accuracy, is lacking.

The primary objective of this study was to estimate the diagnostic accuracy of FNAB in diagnosing malignancy in patients with nodal swelling. The secondary objective was to evaluate the inadequacy rate of FNAB in nodal aspirates.

This systematic review with meta-analysis was registered in the Open Science Framework on August 4, 2023. The review protocol is available at https://osf.io/4aq2k. The studied population consisted of any patient presenting to the hospital with symptomatic or radiologically detected lymph nodal swelling. FNAB for a primary nodal swelling reported by the Sydney system was considered the index test. Histopathological examination by either core needle biopsy or incisional/excisional biopsy was considered the reference standard. However, adequate clinicoradiological follow-up was also accepted as the reference standard for the benign aspirates. The outcome measures evaluated were SN and specificity in diagnosing malignancy per lesion in adequate nodal aspirates for each category.

The malignant category comprised Hodgkin’s lymphoma, non-Hodgkin’s lymphoma, and metastatic carcinoma. The proportion of inadequate aspirates in each study was the outcome variable for the secondary objective. There were three possible decision cut-offs to decide if an FNAB reported by the Sydney system was positive for malignancy: (i) “atypical” or any higher risk category was considered positive for malignancy, (ii) “suspicious of malignancy” or any higher risk category was considered positive for malignancy, and (iii) only “malignant” was considered positive for malignancy.

The SN and specificity of each of these diagnostic cut-offs were assessed separately. The accuracy of the procedure was also evaluated using the area under the curve (AUC).
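The three cut-offs can be illustrated with a minimal sketch that collapses per-category counts from one study into a 2×2 table at each threshold. The function name `two_by_two`, the `Accuracy` class, and the example counts are hypothetical and invented for illustration; they are not data from any included study. Inadequate aspirates are ignored, in line with the exclusion described above.

```python
from dataclasses import dataclass

# Sydney categories ordered by increasing risk; inadequate (L1) is excluded.
CATEGORIES = ["benign", "atypical", "suspicious", "malignant"]


@dataclass
class Accuracy:
    tp: int
    fp: int
    fn: int
    tn: int

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)


def two_by_two(counts: dict, cutoff: str) -> Accuracy:
    """Collapse category counts into a 2x2 table.

    counts maps category -> (malignant on reference, benign on reference).
    Categories at or above `cutoff` count as a positive index test.
    Keys outside CATEGORIES (e.g. "inadequate") are ignored.
    """
    start = CATEGORIES.index(cutoff)
    tp = fp = fn = tn = 0
    for i, category in enumerate(CATEGORIES):
        malignant, benign = counts.get(category, (0, 0))
        if i >= start:
            tp += malignant
            fp += benign
        else:
            fn += malignant
            tn += benign
    return Accuracy(tp, fp, fn, tn)


# Hypothetical single-study counts, for illustration only.
example = {
    "inadequate": (3, 6),
    "benign": (5, 95),
    "atypical": (13, 7),
    "suspicious": (28, 2),
    "malignant": (100, 0),
}

for cutoff in ("atypical", "suspicious", "malignant"):
    acc = two_by_two(example, cutoff)
    print(cutoff, round(acc.sensitivity, 3), round(acc.specificity, 3))
```

Raising the cut-off moves cases from the positive to the negative side of the table, so SN can only fall and specificity can only rise, which is the pattern the pooled estimates show.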

PubMed and Embase were searched for studies published from 2020 to August 4, 2023, which was the date of the final search. The search included studies from 2020 onward because the first proposal for the Sydney system was published online in July 2020 [1]. The authors were contacted if full texts could not be retrieved or for any supplementary information.

The search strategy aimed to find articles at the intersection of four main concepts: lymph node (the site of the lesion), FNAB (the index test), diagnostic accuracy (the type of study), and 2020–2023 (the period of interest). These four concepts were joined by the Boolean “AND” operator. Related search terms within each main concept were joined using the Boolean “OR” operator to expand the search. For example, the PubMed search was “(lymph node) AND (“fine needle aspiration biopsy” OR FNAB) OR (International system OR Sydney system).” The search results were imported into the Rayyan online software and screened blindly and independently by two investigators. Only journal articles were included. The full text of potentially eligible articles was then reviewed. Studies evaluating lymph node FNAB using the Sydney system in patients presenting to a hospital with a nodal swelling were included. In case of conflict between the two independent reviewers, consensus was reached by discussion with the other authors, with at least three of the five authors needing to agree. The relevant data were extracted from the studies by a single investigator and checked by a second investigator.

Statistical analysis was performed using Stata (version 13) and RevMan 5.4. Each selected article was assessed for the risk of bias using the QUADAS-2 tool, and conflicts were resolved through discussion. The direction of bias was deduced. For flow and timing, the risk of bias for overestimating accuracy was determined. The meta-analysis for SN and specificity for each cut-off, that is, “atypical considered positive,” “suspicious of malignancy considered positive,” and “malignant considered positive” for the lesions, was carried out after excluding the inadequate samples in each study. To estimate the pooled risk of malignancy (ROM) of each category, a random-effects single-proportion model was applied, and the between-study heterogeneity (τ²) was estimated using the maximum likelihood method. To assess the diagnostic accuracy, summary receiver operating characteristic (SROC) curves were constructed, and the diagnostic odds ratio (DOR) was pooled for each cut-off. The SROC curves, their summary points (false positive [FP] rate on the horizontal axis and SN on the vertical axis), and the pooled AUC value were estimated using the summary ROC model. The DORs were calculated from the true positive, true negative, FP, and false negative counts of each study and were meta-analysed by applying a random-effects model using the inverse variance method and the restricted maximum likelihood estimator for τ². Heterogeneity was evaluated using the I² statistic. We used funnel plots to evaluate potential publication bias. The strength and consistency of the diagnostic accuracy estimates were assessed by a likelihood ratio scatter matrix. A two-tailed p < 0.05 was considered statistically significant.
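The DOR pooling step can be sketched as follows. This is an illustration under stated assumptions, not the analysis code used in the study: it swaps the restricted maximum likelihood estimator for the closed-form DerSimonian-Laird estimator of τ², adds a 0.5 continuity correction to every cell so that studies with a zero cell remain usable, and runs on hypothetical 2×2 counts. The function names `log_dor` and `pool_random_effects` are invented for this sketch.

```python
import math


def log_dor(tp, fp, fn, tn):
    """Log diagnostic odds ratio and its variance (0.5 added to every cell)."""
    a, b, c, d = (x + 0.5 for x in (tp, fp, fn, tn))
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def pool_random_effects(effects, variances):
    """Inverse-variance random-effects pooling.

    Uses the DerSimonian-Laird estimator for tau^2 (an assumption of this
    sketch; the study reports REML). Returns pooled effect, its standard
    error, tau^2, and I^2 in percent.
    """
    w = [1 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return pooled, se, tau2, i2


# Hypothetical (TP, FP, FN, TN) counts for three studies, for illustration only.
studies = [(141, 9, 5, 95), (200, 4, 12, 150), (88, 10, 3, 60)]
effects, variances = zip(*(log_dor(*s) for s in studies))

pooled, se, tau2, i2 = pool_random_effects(effects, variances)
dor = math.exp(pooled)
ci_low, ci_high = math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)
print(f"pooled DOR {dor:.1f} (95% CI {ci_low:.1f}-{ci_high:.1f}), I2 = {i2:.1f}%")
```

Pooling is done on the log scale because the log DOR is approximately normally distributed; the pooled value and its confidence limits are exponentiated only at the end.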

The PRISMA diagram for the study selection is presented in Figure 1. Nine studies were identified, and the distribution of the FNAB categories along with their risk of malignancy and inadequacy rates is tabulated in Table 1. The true positive, true negative, FP, and false negative counts of each study for the three analyses, that is, “atypical considered positive,” “suspicious considered positive,” and “malignant considered positive,” are presented in online supplementary Table 1 (for all online suppl. material, see https://doi.org/10.1159/000535797). An attempt was made to separate the cases of lymphoma from the metastatic cancers; however, this was not possible for all studies because of the design of the studies evaluated.

Fig. 1.

PRISMA flow diagram for selection of studies.

Table 1.

Distribution of the FNAB categories along with their risk of malignancy and inadequacy rate

| Author | Year | Inadequate, % of cases | Inadequate, ROM % | Benign, % of cases | Benign, ROM % | Atypical, % of cases | Atypical, ROM % | Suspicious, % of cases | Suspicious, ROM % | Malignant, % of cases | Malignant, ROM % |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Makarenko, 2022 [2] | 2021 | 6.9 | 58.3 | 31.2 | 6.4 | 14.9 | 69.2 | 8.6 | 96.7 | 38.4 | 99.3 |
| Gupta, 2021 [3] | 2021 | 4.1 | 27.5 | 48.6 | 11.5 | 0.5 | 66.7 | 1.4 | 88 | 45.4 | 99.6 |
| Vigliar, 2021 [4] | 2021 | 6.7 | 50 | 34.7 | 1.9 | 8.3 | 58.3 | 4.3 | 100 | 46 | 100 |
| Caputo, 2022 [5] | 2021 | 0.6 | 66.7 | 49.1 | 9.4 | 1.6 | 28.6 | 3.9 | 100 | 44.8 | 99.8 |
| Ahuja, 2022 [6] | 2021 | 4.4 | 9.1 | 40.5 | 1.5 | 0.8 | 37.5 | 22.8 | 96.9 | 31.5 | 98.2 |
| Uzun, 2022 [7] | 2022 | 4.8 | 16.6 | 56.2 | 0.7 | 7.1 | 88.8 | 9.5 | 100 | 22.4 | 100 |
| Kanhe, 2023 [8] | 2022 | | | 80.3 | 0.2 | 0.4 | 100 | 0.9 | 92.3 | 12.3 | 100 |
| Juanita, 2023 [9] | 2023 | 3.7 | 66.7 | 20.1 | 15.6 | 8.2 | 76.9 | 10.7 | 94 | 57.2 | 98.9 |
| Balasubramanian, 2023 [10] | 2023 | 10.6 | 26.3 | 54.6 | 7.2 | | 76.9 | 3.7 | 82.3 | 27 | 100 |

The risk of bias assessed for each study is given in Figure 2a, and the summary of the risk of bias assessment across all included studies is given in Figure 2b. The risk of bias in patient selection was unclear in five of the nine studies. The risk of index test bias was unclear in six of the nine studies. Concerns regarding patient applicability, index test applicability, and reference test applicability were low in all studies. None of the studies reported whether the FNAB was reported before the histopathological examination, so the risk of reference test bias was rated as unclear in all of them. Two of the nine studies had a high risk of flow bias.

Fig. 2.

a Traffic light plot for the risk of bias of the included studies. b Summary of the risk of bias across all included studies.

Table 2 shows the pooled ROM associated with each category of the Sydney system. The “insufficient,” “benign,” “atypical,” “suspicious,” and “malignant” categories were associated with a pooled ROM of 34% (95% confidence interval [95% CI], 22–45%), 5% (95% CI, 2–8%), 65% (95% CI, 52–78%), 93% (95% CI, 89–98%), and 100% (95% CI, 99–100%), respectively. Heterogeneity was higher in the “insufficient” and “atypical” categories compared with the other categories.

Table 2.

Pooled ROM associated with each category

| Category | Number of studies pooled | ROM, % | 95% CI, % | τ² | τ | I², % |
|---|---|---|---|---|---|---|
| Insufficient | | 34 | 22–45 | 0.02 | 0.141 | 83.56 |
| Benign | | 5 | 2–8 | | | 98.43 |
| Atypical | | 65 | 52–78 | 0.03 | 0.173 | 80.53 |
| Suspicious | | 93 | 89–98 | | | 53.06 |
| Malignant | | 100 | 99–100 | | | 45.74 |

The forest plots for SN and specificity and bivariate SN-specificity (SROC) plots for the analysis considering “atypical and higher risk categories” as positive for malignancy are depicted in Figures 3a and 4a, respectively. The SN was 97% (95% CI, 95–99%), and the specificity was 96% (95% CI, 91–98%). The pooled AUC was 99%, indicating excellent diagnostic accuracy (Fig. 5a, Table 3).

Fig. 3.

a Forest plot of SN and specificity for the threshold where the “atypical” category was considered positive for malignancy. b Forest plot of SN and specificity for the threshold where the “suspicious of malignancy” category was considered positive for malignancy. c Forest plot of SN and specificity for the threshold where the “malignant” category was considered positive for malignancy.

Fig. 4.

a Bivariate sensitivity versus specificity plot showing the results of the meta-analysis for the accuracy when the “atypical” category was considered positive for malignancy. b Bivariate sensitivity versus specificity plot showing the results of the meta-analysis for the accuracy when the “suspicious of malignancy” category was considered positive for malignancy. c Bivariate sensitivity versus specificity plot showing the results of the meta-analysis for the accuracy when the “malignant” category was considered positive for malignancy.

Fig. 5.

a Forest plot for the AUC with the results of fixed and random-effects meta-analysis for the accuracy when the “atypical” category was considered positive for malignancy. b Forest plot for the AUC with the results of fixed and random-effects meta-analysis for the accuracy when the “suspicious of malignancy” category was considered positive for malignancy. c Forest plot for the AUC with the results of fixed and random-effects meta-analysis for the accuracy when the “malignant” category was considered positive for malignancy.

Table 3.

Meta-analysis of the AUC when the “atypical” and higher risk categories were considered positive for malignancy

| Study | AUC | Standard error | 95% CI | z | p value |
|---|---|---|---|---|---|
| Makarenko, 2022 [2] | 0.91 | 0.0179 | 0.875–0.945 | | |
| Gupta, 2021 [3] | 0.92 | 0.0128 | 0.895–0.945 | | |
| Vigliar, 2021 [4] | 0.97 | 0.0128 | 0.945–0.995 | | |
| Caputo, 2022 [5] | 0.95 | 0.0102 | 0.930–0.970 | | |
| Ahuja, 2022 [6] | 0.96 | 0.0102 | 0.940–0.980 | | |
| Uzun, 2022 [7] | 0.99 | 0.00765 | 0.975–1.000 | | |
| Kanhe, 2023 [8] | 0.99 | 0.00765 | 0.975–1.000 | | |
| Juanita, 2023 [9] | 0.90 | 0.0255 | 0.850–0.950 | | |
| Balasubramanian, 2023 [10] | 0.94 | 0.0179 | 0.905–0.975 | | |
| Total (fixed effects) | 0.966 | 0.00369 | 0.959–0.973 | 261.784 | <0.001 |
| Total (random effects) | 0.952 | 0.00998 | 0.932–0.972 | 95.34 | <0.001 |
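The fixed-effects row of Table 3 appears consistent with standard inverse-variance weighting of the per-study AUCs. The sketch below recomputes it from the AUC and standard error columns; the weighting scheme is an assumption, since the text does not state how the AUC totals were computed, and `pool_fixed` is a name invented here.

```python
import math

# (AUC, standard error) per study, copied from Table 3.
table3 = [
    (0.91, 0.0179),   # Makarenko
    (0.92, 0.0128),   # Gupta
    (0.97, 0.0128),   # Vigliar
    (0.95, 0.0102),   # Caputo
    (0.96, 0.0102),   # Ahuja
    (0.99, 0.00765),  # Uzun
    (0.99, 0.00765),  # Kanhe
    (0.90, 0.0255),   # Juanita
    (0.94, 0.0179),   # Balasubramanian
]


def pool_fixed(estimates):
    """Inverse-variance fixed-effect pooled estimate and its standard error."""
    weights = [1 / se ** 2 for _, se in estimates]
    total = sum(weights)
    pooled = sum(w * auc for w, (auc, _) in zip(weights, estimates)) / total
    return pooled, math.sqrt(1 / total)


pooled_auc, pooled_se = pool_fixed(table3)
ci_lower = pooled_auc - 1.96 * pooled_se
ci_upper = pooled_auc + 1.96 * pooled_se
print(f"AUC {pooled_auc:.3f}, SE {pooled_se:.5f}, "
      f"95% CI {ci_lower:.3f}-{ci_upper:.3f}, z {pooled_auc / pooled_se:.1f}")
```

Under this assumption the pooled AUC, standard error, and confidence limits agree with the tabulated fixed-effects values to the reported precision; small differences in z are expected because the inputs are rounded.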

The forest plots for SN and specificity and bivariate SN-specificity (SROC) plots for the analysis considering “suspicious of malignancy and higher risk categories” as positive for malignancy are depicted in Figures 3b and 4b, respectively. The SN was 91% (95% CI, 85–95%) and the specificity was 99% (95% CI, 97–100%). The pooled AUC was 99%, indicating excellent diagnostic accuracy (Fig. 5b, Table 4).

Table 4.

Meta-analysis of the AUC when the “suspicious of malignancy” and higher risk categories were considered positive for malignancy

| Study | ROC area | Standard error | 95% CI | z | p value |
|---|---|---|---|---|---|
| Makarenko, 2022 [2] | 0.89 | 0.0179 | 0.855–0.925 | | |
| Gupta, 2021 [3] | 0.91 | 0.0128 | 0.885–0.935 | | |
| Vigliar, 2021 [4] | 0.97 | 0.0128 | 0.945–0.995 | | |
| Caputo, 2022 [5] | 0.98 | 0.0051 | 0.970–0.990 | | |
| Ahuja, 2022 [6] | 0.97 | 0.00765 | 0.955–0.985 | | |
| Uzun, 2022 [7] | 0.91 | 0.0153 | 0.880–0.940 | | |
| Kanhe, 2023 [8] | 0.97 | 0.0102 | 0.950–0.990 | | |
| Juanita, 2023 [9] | 0.91 | 0.0255 | 0.860–0.960 | | |
| Balasubramanian, 2023 [10] | 0.91 | 0.0179 | 0.875–0.945 | | |
| Total (fixed effects) | 0.961 | 0.00334 | 0.955–0.968 | 287.551 | <0.001 |
| Total (random effects) | 0.939 | 0.0114 | 0.917–0.961 | 82.119 | <0.001 |

The forest plots for SN and specificity and bivariate SN-specificity (SROC) plots for the analysis considering “malignant” as positive for malignancy are depicted in Figures 3c and 4c, respectively. The SN was 75% (95% CI, 65–84%) and the specificity was 100% (95% CI, 99–100%). The pooled AUC was 100%, indicating excellent diagnostic accuracy (Fig. 5c; Table 5).

Table 5.

Meta-analysis of the AUC when only the “malignant” category was considered positive for malignancy

| Study | ROC area | Standard error | 95% CI | z | p value |
|---|---|---|---|---|---|
| Makarenko, 2022 [2] | 0.82 | 0.023 | 0.775–0.865 | | |
| Gupta, 2021 [3] | 0.88 | 0.0128 | 0.855–0.905 | | |
| Vigliar, 2021 [4] | 0.93 | 0.0179 | 0.895–0.965 | | |
| Caputo, 2022 [5] | 0.95 | 0.0102 | 0.930–0.970 | | |
| Ahuja, 2022 [6] | 0.72 | 0.0255 | 0.670–0.770 | | |
| Uzun, 2022 [7] | 0.79 | 0.0204 | 0.750–0.830 | | |
| Kanhe, 2023 [8] | 0.95 | 0.0102 | 0.930–0.970 | | |
| Juanita, 2023 [9] | 0.85 | 0.0306 | 0.790–0.910 | | |
| Balasubramanian, 2023 [10] | 0.86 | 0.023 | 0.815–0.905 | | |
| Total (fixed effects) | 0.903 | 0.00518 | 0.893–0.913 | 174.363 | <0.001 |
| Total (random effects) | 0.863 | 0.0243 | 0.816–0.911 | 35.535 | <0.001 |

The pooled DOR was 777.47 (95% CI, 263.23–2,296.32) when “atypical and higher risk categories” were considered positive (Fig. 6a), 1,064.94 (95% CI, 337.31–3,362.21) when “suspicious and higher risk categories” were considered positive (Fig. 6b), and 823.72 (95% CI, 249.63–2,718.02) when only “malignant” was considered positive (Fig. 6c). Each of these values indicates a high level of diagnostic accuracy.

Fig. 6.

a DOR forest plot for detecting malignancy when the “atypical” category was considered positive for malignancy. b DOR forest plot for detecting malignancy when the “suspicious of malignancy” category was considered positive for malignancy. c DOR forest plot for detecting malignancy when the “malignant” category was considered positive for malignancy.

The likelihood ratio scatter matrix suggests that there is substantial evidence that the cut-off “atypical and higher categories considered positive” is useful for both ruling in and ruling out malignancy, as the negative likelihood ratio was less than 0.1 and the positive likelihood ratio was greater than 10 (Fig. 7a). For the cut-off “suspicious of malignancy and higher risk category considered positive,” the matrix suggests moderate evidence that the cut-off is useful for ruling in but not for ruling out malignancy (Fig. 7b). For the cut-off “only malignant considered positive,” the matrix likewise suggests moderate evidence that the cut-off is useful for ruling in but not for ruling out malignancy, as it has a positive likelihood ratio greater than 100 (Fig. 7c). Lastly, funnel plots were constructed for “atypical and higher categories,” “suspicious of malignancy and higher risk category,” and “malignant” considered positive; they did not reveal the presence of publication bias (p = 0.65, p = 0.72, and p = 0.48, respectively) (Fig. 8).
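The likelihood ratio thresholds quoted above follow from the usual definitions, LR+ = SN / (1 − specificity) and LR− = (1 − SN) / specificity. The sketch below applies them to the rounded pooled estimates reported in this article; the function name `likelihood_ratios` is invented here, and values obtained from rounded summary figures are only approximations of those produced by the fitted meta-analytic model.

```python
import math


def likelihood_ratios(sn: float, sp: float):
    """Positive and negative likelihood ratios from SN and specificity.

    Returns math.inf where the denominator is zero, which happens for LR+
    when specificity is reported as exactly 100%.
    """
    lr_pos = sn / (1 - sp) if sp < 1 else math.inf
    lr_neg = (1 - sn) / sp if sp > 0 else math.inf
    return lr_pos, lr_neg


# Rounded pooled estimates from the Results (SN, specificity).
atypical = likelihood_ratios(0.97, 0.96)
malignant = likelihood_ratios(0.75, 1.00)

print("atypical cut-off: LR+ %.2f, LR- %.3f" % atypical)
print("malignant cut-off: LR+ %s, LR- %.2f" % malignant)
```

For the “atypical” cut-off this gives an LR+ well above 10 and an LR− well below 0.1, matching the rule-in and rule-out reading. For the “malignant” cut-off the LR− of 0.25 is too high to rule out malignancy, while the rounded specificity of 100% makes the LR+ unbounded.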

Fig. 7.

Likelihood ratio scatter matrices for the different diagnostic thresholds. The points represent the studies and are numbered according to the order given in Table 1. The red diamond with lines represents the meta-analytic estimate of the likelihood ratios with confidence intervals. a Likelihood ratio scatter matrices for detecting malignancy when the “atypical” category was considered positive for malignancy. b Likelihood ratio scatter matrices for detecting malignancy when the “suspicious of malignancy” category was considered positive for malignancy. c Likelihood ratio scatter matrices for detecting malignancy when the “malignant” category was considered positive for malignancy.

Fig. 8.

Funnel plot to evaluate the publication bias in the study for different thresholds. a “Atypical” category was considered positive for malignancy. b “Suspicious of malignancy” category was considered positive for malignancy. c “Malignant” category was considered positive for malignancy.

To our knowledge, the present study is the first systematic review and meta-analysis regarding the utility of the Sydney system in reporting lymph node cytopathology. Previously, similar work has been done on the IAC Yokohama system and the Bethesda System for Reporting Thyroid Cytopathology [11‒13]. The pooled risk of malignancy was calculated for each category of the Sydney system utilising the data from the 9 published studies. The “insufficient,” “benign,” “atypical,” “suspicious,” and “malignant” categories were associated with a pooled ROM of 34%, 5%, 65%, 93%, and 100%, respectively. Heterogeneity was higher in the “insufficient” and “atypical” categories compared with the other categories. The reasons attributed to inadequate cases were predominantly deep-seated difficult-to-aspirate nodes. The time for follow-up for evaluation of the benign lesions ranged from 6 months to 1 year for different studies.

The present study demonstrated the role of FNAB for both ruling in and ruling out malignancy through different thresholds. The cut-off of “atypical and higher categories” had an SN of 97% (lower 95% confidence limit, 95%) and a specificity of 96%. The negative likelihood ratio was less than 0.1 and the positive likelihood ratio was greater than 10; thus, this threshold is useful for both ruling in and ruling out malignancy.

The threshold of “suspicious of malignancy and higher categories” had an SN of 91% (lower 95% confidence limit, 85%) and a high specificity of 99% (lower 95% confidence limit, 97%). The cut-off of “malignant” had a low SN of 75% (lower 95% confidence limit, 65%) and a very high specificity of 100%. Thus, both these thresholds had a high positive likelihood ratio and are useful for ruling in the diagnosis of malignancy. This is supported by the rule of thumb of SNOUT and SPIN, whereby a negative result on a test with high SN helps in ruling out a disease and a positive result on a test with high specificity helps in ruling it in [14].

The statistical analysis of SN, specificity, and diagnostic accuracy should be done after excluding the cases in the insufficient category, as they cannot be classified as either negative or positive for malignancy. However, two of the nine studies included insufficient cases in their statistical analysis [3, 5].

The Sydney system recommends the use of rapid on-site evaluation (ROSE) to assess the cellularity and adequacy of the sample. ROSE helps in reducing the inadequacy rate and also allows the division of the sample for ancillary studies like flow cytometry and cell block, if needed. While seven of the nine studies used ROSE to assess the adequacy of the sample, none of them performed a separate analysis to evaluate the utility of ROSE in lymph node fine needle aspirates [2, 4‒6, 8‒10]. The Sydney system proposes two diagnostic levels. The first level involves the categorisation of the lymph node aspirates into the five diagnostic categories, while the second level involves the application of ancillary techniques like flow cytometry and immunohistochemistry to arrive at a final specific diagnosis. All studies except Juanita et al. [9] performed ancillary techniques for further definite categorisation of the nodal aspirates.

One of the problems of the Sydney system is its reproducibility in different labs. A recent study by Caputo et al. [15] evaluated the digital examination of lymph node cytopathology using the Sydney system. The overall interobserver concordance was moderate (Fleiss κ = 0.476). Category‐wise analysis revealed higher agreement for the inadequate, benign, and malignant categories and much less agreement for the atypical and suspicious categories (p < 0.0001 for all). The lower agreement for atypical and suspicious categories was expected because these cases represent a grey zone in which the diagnostic confidence is insufficient to render a clear‐cut diagnosis of benign or malignant categories.

The present systematic review and meta-analysis had a few limitations. Firstly, all the studies included were retrospective. Secondly, high levels of heterogeneity were observed in the assessment of the pooled ROM, especially for the insufficient, benign, and atypical categories.

This meta-analysis highlights the accuracy of the Sydney system in reporting lymph node aspirates. It demonstrates the significance of the “suspicious” and “malignant” categories in diagnosing malignancy and of the “benign” category in excluding malignancy.

An ethics statement is not applicable because this study is based exclusively on the published literature.

The authors have no conflicts of interest to declare.

The authors received no funding for this study.

Sana Ahuja contributed to the writing of the protocol, the assessment of the risk of bias, and the primary writing of the manuscript; was an independent assessor in the screening of studies; and reviewed the extracted data. Adil Aziz Khan was an independent assessor in the screening of studies, checked the extracted data, checked and reviewed the risk of bias assessment, and critically reviewed the manuscript. Rhea Ahuja contributed to the literature search and statistics and critically reviewed the manuscript. Pragun Ahuja contributed to the literature search, helped to achieve consensus in case of conflict in screening, and critically reviewed the manuscript. Sufian Zaheer was involved in study ideation, checked the extracted data, checked and reviewed the risk of bias assessment, and critically reviewed the manuscript.

All data generated in this study are included in this article and its supplementary files. Further enquiries can be directed to the corresponding author.

1. Al-Abbadi MA, Barroca H, Bode-Lesniewska B, Calaminici M, Caraway NP, Chhieng DF, et al. A proposal for the performance, classification, and reporting of lymph node fine-needle aspiration cytopathology: the Sydney system. Acta Cytol. 2020;64(4):306–22.
2. Makarenko VV, DeLelys ME, Hasserjian RP, Ly A. Lymph node FNA cytology: diagnostic performance and clinical implications of proposed diagnostic categories. Cancer Cytopathol. 2022;130(2):144–53.
3. Gupta P, Gupta N, Kumar P, Bhardwaj S, Srinivasan R, Dey P, et al. Assessment of risk of malignancy by application of the proposed Sydney system for classification and reporting lymph node cytopathology. Cancer Cytopathol. 2021;129(9):701–18.
4. Vigliar E, Acanfora G, Iaccarino A, Mascolo M, Russo D, Scalia G, et al. A novel approach to classification and reporting of lymph node fine-needle cytology: application of the proposed Sydney system. Diagnostics. 2021;11(8):1314.
5. Caputo A, Ciliberti V, D’Antonio A, D’Ardia A, Fumo R, Giudice V, et al. Real-world experience with the Sydney System on 1458 cases of lymph node fine needle aspiration cytology. Cytopathology. 2022;33(2):166–75.
6. Ahuja S, Malviya A. Categorisation of lymph node aspirates using the proposed Sydney system with assessment of risk of malignancy and diagnostic accuracy. Cytopathology. 2022;33(4):430–8.
7. Uzun E, Erkilic S. Diagnostic accuracy of Thinprep® in cervical lymph node aspiration: assessment according to the Sydney system. Diagn Cytopathol. 2022;50(5):253–62.
8. Kanhe R, Tummidi S, Kothari K, Agnihotri M. Utility of the proposed Sydney system for classification of fine-needle aspiration cytopathology of lymph node: a retrospective study at a tertiary care center. Acta Cytol. 2023;67(5):455–67.
9. Juanita J, Ikram D, Sungowati NK, Purnama IP, Amalia A, Ningrati AF, et al. Diagnostic accuracy of lymph nodes fine needle aspiration biopsy based on the Sydney system for reporting lymph node cytology. Asian Pac J Cancer Prev. 2023;24(6):1917–22.
10. Shanmugasundaram S, Balasubramanian NB, Sundari Amirthakatesan A. The application of the proposed Sydney system for reporting lymph node cytopathology: a five-year experience of an academic institution in south India. Acta Cytol. 2023;67(4):365–77.
11. Nikas IP, Vey JA, Proctor T, AlRawashdeh MM, Ishak A, Ko HM, et al. The use of the international academy of cytology Yokohama system for reporting breast fine-needle aspiration biopsy. Am J Clin Pathol. 2023;159(2):138–45.
12. Paul P, Azad S, Agrawal S, Rao S, Chowdhury N. Systematic review and meta-analysis of the diagnostic accuracy of the international academy of cytology Yokohama system for reporting breast fine-needle aspiration biopsy in diagnosing breast cancer. Acta Cytol. 2023;67:1–16.
13. Vuong HG, Chung DGB, Ngo LM, Bui TQ, Hassell L, Jung CK, et al. The use of the Bethesda system for reporting thyroid cytopathology in pediatric thyroid nodules: a meta-analysis. Thyroid. 2021;31(8):1203–11.
14. Sackett DL, Straus S. On some clinically useful measures of the accuracy of diagnostic tests. ACP J Club. 1998;129(2):A17–9.
15. Caputo A, Fraggetta F, Cretella P, Cozzolino I, Eccher A, Girolami I, et al. Digital Examination of LYmph node CYtopathology Using the Sydney system (DELYCYUS): an international, multi-institutional study. Cancer Cytopathol. 2023;131(11):679–92.