Abstract
Introduction: Nodal metastases (lymph node metastasis [LNM]) are one of the major determinants of prognosis following surgery for intrahepatic cholangiocarcinoma (ICC). Previous studies investigating the correlation between clinical-radiological features and the probability of LNM included patients undergoing inadequate nodal sampling. The aim of this study was to develop a model to predict the risk of LNM in patients undergoing adequate lymphadenectomy using preoperative clinical and radiological features. Methods: Patients undergoing radical surgery for ICC with adequate lymphadenectomy at seven Italian Centers between 2000 and 2023 were collected and divided into a derivation and a validation cohort. Logistic regression and dominance analysis were applied in the derivation cohort to identify variables associated with LNM at pathology. The final coefficients were derived from the model having the highest c-statistic in the derivation cohort with the lowest number of variables included (parsimony). The model was then tested in the external validation cohort, and the linear predictor was divided into quartiles to generate four risk categories. Results: A total of 693 patients were identified. Preoperative CA 19-9, clinically suspicious lymph nodes at radiology, patients’ age, and tumor burden score were significantly associated with LNM. These factors were included in a model (https://aicep.website/calculators/) showing a c-statistic of 0.723 (95% CI: 0.680, 0.766) and 0.771 (95% CI: 0.699, 0.842) in the derivation and validation cohort, respectively. A progressive increase of pathological lymph node positivity across risk groups was observed (29.9% in low-risk, 45.1% in intermediate-low risk, 51.5% in intermediate-high risk, and 87.3% in high-risk patients; p = 0.001).
Conclusions: A novel model that combines preoperative CA 19-9, clinically suspicious lymph nodes at radiology, patients’ age, and tumor burden score was developed to predict the risk of LNM before surgery. The model exhibited high accuracy and has the potential to assist clinicians in the management of patients who are candidates for surgery.
Introduction
Intrahepatic cholangiocarcinoma (ICC) is a challenging and aggressive cancer, for which liver resection can offer the chance of a cure [1]. Although lymphadenectomy in the setting of ICC is being increasingly adopted, more than 40% of patients still do not receive any form of nodal sampling during resection [2]. The 8th edition of the American Joint Committee on Cancer (AJCC) staging manual recommends routine evaluation of at least six lymph nodes (adequate lymphadenectomy) as part of the surgical procedure [3]. The ongoing debate around lymphadenectomy for ICC is due to the relatively limited evidence on whether or not it confers a survival benefit [4]. Indeed, while some studies have demonstrated improved survival [5, 6], other data have failed to show this advantage and argued that routine lymph node dissection might be unnecessary [7].
To date, several studies have investigated the correlation between clinical and radiological features and the probability of lymph node metastases (LNMs) at pathology [8‒11]. Although some evidence exists in this regard, these studies include radiological features that are not routinely assessed [12], utilize machine learning algorithms without the possibility of external validation and clinical use, and, notably, attempt to predict LNM regardless of the adequacy of lymphadenectomy. The ability to preoperatively estimate the likelihood of pathological LNM is crucial for strategic oncological and surgical management. Preoperative imaging alone using size and appearance of regional lymph nodes has poor accuracy in predicting LNM [8]. Moreover, the risk of LNM is poorly correlated with radiological T stage, and as many as 30% of patients with single lesions <3 cm have positive lymph nodes [13]. Given these considerations, all patients with ICC should undergo adequate lymphadenectomy to be staged correctly. However, given the scarce evidence proving a survival benefit, some authors believe that lymph node dissection may be unnecessary or potentially harmful, particularly in cases at high risk for postoperative morbidity.
This study aimed to develop a model to predict the likelihood of LNMs at pathology in patients undergoing surgery for ICC using clinical and radiological features routinely collected preoperatively. The model was developed on a surgical series of patients for whom an adequate lymphadenectomy (at least 6 lymph nodes) was performed. Generalizability was assessed through a derivation/validation cohort approach. Finally, the odds ratio of missing LNMs because of inadequate lymphadenectomy was estimated.
Methods
A multicenter, prospectively collected database from seven Italian Centers was used to gather consecutive patients who underwent liver resection for ICC between January 2000 and December 2023. All patients were included, regardless of preoperative suspicion of lymph nodal involvement. Additional inclusion criteria were age >18 years, a confirmed pathological diagnosis of ICC, liver resection performed with curative intent, and the absence of extrahepatic metastases. From the whole database, patients were divided into an “adequate lymphadenectomy” group (excision of at least 6 lymph nodes regardless of the anatomical site according to the TNM AJCC 8th edition Cancer Staging Manual) and an “inadequate lymphadenectomy” group (excision of 1–5 lymph nodes regardless of the anatomical site).
The study protocol fulfilled the Declaration of Helsinki and its updates. Data analysis was approved by the Institutional Review Board of each participating center as a retrospective, multicenter, observational study to be reported according to the Strengthening the Reporting of Cohort, Cross-Sectional and Case-Control Studies in Surgery (STROCSS) guidelines [14]. This data collection complies with the Regulation (EU) 2016/679 of the European Parliament and of the Council, of April 27, 2016, on the protection of individuals with regard to the processing of personal data. The study is compliant with the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) [15].
Data Collection and Definitions
Demographic and clinical data were collected. Clinical suspicion of LNMs was defined as radiological evidence of regional nodes larger than 1 cm or the presence of regional nodes <1 cm that were metabolically active at 18FDG-positron emission tomography (PET). Pathological tumor characteristics were also collected, including the number of harvested lymph nodes and the number of metastatic lymph nodes. Perioperative data such as type of liver resection and associated procedures were also collected. Major hepatectomy was defined as the removal of ≥3 adjacent Couinaud segments. Tumor burden score (TBS) was defined as the distance from the origin on a Cartesian plane that incorporated maximum tumor size and number of liver lesions. The calculation of TBS was as follows: $\mathrm{TBS}^2 = (\text{maximum tumor diameter})^2 + (\text{number of liver lesions})^2$.
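As an illustration of the TBS definition above, the following is a minimal sketch of the calculation. The function name `tumor_burden_score` and its input checks are assumptions introduced for this example and are not part of the study's software.

```python
import math


def tumor_burden_score(max_diameter_cm: float, n_lesions: int) -> float:
    """Return TBS as the distance from the origin of the point
    (maximum tumor diameter in cm, number of liver lesions)."""
    # Input checks are an assumption of this sketch, not a study rule.
    if max_diameter_cm <= 0 or n_lesions < 1:
        raise ValueError("diameter must be positive and lesion count at least 1")
    # math.hypot computes sqrt(x**2 + y**2)
    return math.hypot(max_diameter_cm, n_lesions)


# Hypothetical patient: a single lesion of 5.4 cm
print(round(tumor_burden_score(5.4, 1), 2))
```

For solitary lesions the score stays close to the tumor diameter, which is in line with the similar median diameter and TBS values reported for the study cohorts.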
Model Development
The flow diagram of the study population is reported in Figure 1. Briefly, once patients with adequate lymphadenectomy were identified, the entire cohort was divided into one derivation and one (external) validation cohort. The derivation cohort was composed of four of the seven involved centers, whereas the remaining three formed the validation cohort. The division into derivation and validation cohorts was based on each center's sample size, aiming at obtaining a 75%/25% proportion, and not on specific characteristics of the facilities. Subsequently, logistic regression and dominance analysis were performed in the derivation cohort to identify variables significantly associated with pathologically positive lymph nodes. The final model construction started by including the variable ranked first in the dominance analysis and then reevaluating discrimination and calibration while adding the subsequent variables, until the best values were reached with as few variables as possible. Through 10-fold cross validation (CV), different prediction models were evaluated, and final coefficients were derived from the model having the highest c-statistic in the derivation cohort with the lowest number of variables included (parsimony). The final model was then tested in the external validation cohort. Additionally, the linear predictor was divided into quartiles generating four risk groups labeled as having “low,” “intermediate-low,” “intermediate-high,” and “high” probability of LNMs [16].
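The quartile-based grouping of the linear predictor described above can be sketched as follows. This is only an illustration: the function names, the interpolation method used for the cut points, and the handling of values that fall exactly on a cut point are assumptions, and the example values are hypothetical rather than study data.

```python
import statistics
from bisect import bisect_right

# Risk group labels as named in the text
LABELS = ("low", "intermediate-low", "intermediate-high", "high")


def quartile_cutoffs(linear_predictors):
    """Return the three cut points that split the values into quartiles."""
    # method="inclusive" is an assumption; the text does not specify one.
    return statistics.quantiles(linear_predictors, n=4, method="inclusive")


def risk_group(value, cutoffs):
    """Assign a linear predictor value to one of the four risk groups.

    Values equal to a cut point go to the higher group (an assumption)."""
    return LABELS[bisect_right(cutoffs, value)]


# Hypothetical linear predictor values, not taken from the study
values = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
cuts = quartile_cutoffs(values)
print([risk_group(v, cuts) for v in values])
```

In practice the cut points would be estimated once in the derivation cohort and then applied unchanged to new patients.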
Statistical Analysis
Patients with missing data were excluded from the analysis. Data were described as number and percentages or median and interquartile ranges (IQRs) as appropriate. Logarithmic transformation (Log10) was applied to skewed predictors before entering the model. Logistic regressions used clustered standard errors to account for the multicentric nature of the study. The dominance analysis developed by Azen [17] was used to determine the relative importance of each predictor to the outcome. Dominance analysis compares pairs of predictors across all subsets of the predictors in a model to determine the additional contribution that each predictor makes to the prediction model. A predictor is considered dominant or more important when it makes a greater contribution to every possible subset of predictors than any of the other predictors.
Discrimination was assessed through the c-statistic [18], and calibration was evaluated using Spiegelhalter’s z-statistic [19]. Negative z test values indicate that observed outcomes are lower than the predicted ones (i.e., the model overestimated the probabilities); positive z test values indicate that observed outcomes are higher than the predicted ones (i.e., the model underestimated the probabilities). The closer the z value is to zero, the better the calibration. A p value for the z test <0.05 indicates that the model is not calibrated. The number needed to treat or to harm was also calculated. All the analyses were performed using STATA (StataCorp., Stata Statistical Software: Release 18).
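For readers who wish to reproduce these two measures outside STATA, the sketch below computes the c-statistic by pairwise comparison and Spiegelhalter’s z-statistic from observed outcomes and predicted probabilities. The function names are assumptions of this example. The upper-tail p value is also an assumption: it is inferred from the reported pairs of z and p values (for instance, z = −0.201 with p = 0.580) and is not stated in the text.

```python
import math


def c_statistic(y, p):
    """Proportion of (event, non-event) pairs in which the event has the
    higher predicted probability; ties count as one half."""
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    score = sum(1.0 if a > b else 0.5 if a == b else 0.0
                for a in pos for b in neg)
    return score / (len(pos) * len(neg))


def spiegelhalter_z(y, p):
    """Spiegelhalter's z: sum((y - p)(1 - 2p)) divided by
    sqrt(sum((1 - 2p)^2 * p * (1 - p)))."""
    num = sum((yi - pi) * (1 - 2 * pi) for yi, pi in zip(y, p))
    var = sum((1 - 2 * pi) ** 2 * pi * (1 - pi) for pi in p)
    if var == 0:
        raise ValueError("z is undefined when the variance term is zero")
    return num / math.sqrt(var)


def upper_tail_p(z):
    """Upper-tail standard normal probability, P(Z > z) (assumed convention)."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# Hypothetical outcomes and predicted probabilities, not study data
y = [0, 0, 1, 1]
p = [0.1, 0.4, 0.35, 0.8]
print(c_statistic(y, p), round(spiegelhalter_z(y, p), 3))
```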
Results
The flow diagram of the study population selection is reported in Figure 1. After selection criteria were applied, excluding 136 patients who did not receive lymphadenectomy and 86 with missing data, 1,037 patients were retained for the planned analysis, with a median follow-up from surgery of 33 months (IQR: 18–77). Among the selected study population, 344 (33.2%) had inadequate lymphadenectomy (1–5 lymph nodes retrieved) and were consequently set aside from the initial analyses. This left 693 patients, with complete clinico-pathological data, who had adequate lymphadenectomy (≥6 lymph nodes). This latter group represented the training cohort, which was split into a derivation set of 522 patients, originating from four centers, and a validation set of 171 patients, originating from the remaining three centers. The characteristics of the study populations are reported in Table 1. In particular, 287 of the 522 patients in the derivation cohort (55%) and 83 of the 171 patients in the validation cohort (48.5%) were found to have at least one metastatic node at final pathology.
Variables | Adequate lymphadenectomy, derivation cohort (n = 522) | Adequate lymphadenectomy, validation cohort (n = 171) | Inadequate lymphadenectomy (n = 344) |
---|---|---|---|
Era | |||
2000–2005 | 28 (5.4%) | 4 (2.3%) | 48 (14.0%) |
2006–2011 | 149 (28.5%) | 14 (8.2%) | 85 (24.7%) |
2012–2017 | 202 (38.7%) | 74 (43.3%) | 134 (39.0%) |
2018–2023 | 143 (27.4%) | 79 (46.2%) | 77 (22.4%) |
Age, years | 67 (59, 74) | 68 (58, 73) | 66 (58, 72) |
Female | 306 (58.6%) | 92 (53.8%) | 158 (45.9%) |
Chronic liver disease | 77 (14.8%) | 47 (27.5%) | 76 (22.1%) |
Albumin, g/dL | 3.9 (3.5, 4.4) | 4.3 (4.0, 4.6) | 4.2 (3.8, 4.5) |
Bilirubin, mg/dL | 0.91 (0.66, 1.30) | 0.58 (0.46, 0.80) | 0.71 (0.50, 1.01) |
ALBI score | −2.57 (−2.93, −2.21) | −2.97 (−3.20, −2.70) | −2.83 (−3.08, −2.50) |
MELD score | 9 (8, 12) | 7 (6, 8) | 8 (7, 9) |
CA 19–9, U/mL | 53 (28, 130) | 39 (10, 126) | 34 (16, 100) |
Multifocal | 104 (19.9%) | 22 (12.9%) | 65 (18.9%) |
Diameter, cm | 5.4 (4.2, 7.3) | 6.0 (4.0, 8.0) | 5.5 (3.7, 8.0) |
TBS | 5.6 (4.4, 7.5) | 6.1 (4.1, 8.1) | 5.6 (3.8, 8.1) |
Macrovascular invasion | 71 (13.6%) | 32 (18.7%) | 46 (13.8%) |
Clinical suspicious lymph nodes | 138 (26.4%) | 48 (28.1%) | 61 (17.7%) |
Major hepatectomy | 225 (43.1%) | 118 (69.0%) | 211 (61.3%) |
Venous reconstruction | 12 (2.3%) | 10 (5.9%) | 18 (5.2%) |
Examined lymph nodes | 10 (7, 13) | 9 (7, 14) | 3 (2, 4) |
Positive lymph nodes | 287 (55.0%) | 83 (48.5%) | 70 (21.0%) |
PLNR (only in N+) | 0.30 (0.19, 0.45) | 0.25 (0.12, 0.50) | 0.50 (0.33, 1.00) |
Nx patients were excluded from all the analyses. Continuous data are reported as medians and IQR.
ALBI, Albumin-Bilirubin score; MELD, model for end-stage liver disease; PLNR, positive lymph node ratio.
Preoperative Lymph Node Positivity Prediction Model
In the derivation cohort, logistic regression identified preoperative CA 19-9, clinically suspicious lymph nodes at radiology, radiological size, radiological multifocality, and the TBS as related to lymph node positivity after adequate lymphadenectomy with a p < 0.05, whereas age and female sex were related to the outcome with a p < 0.10 (Table 2). Dominance analysis showed that clinically suspicious lymph nodes at radiology were the most important variable, accounting for 65.2% of the overall fit of the multivariable model, followed by preoperative CA 19-9, which accounted for 14.1% of the overall fit.
Variables | Univariable coeff (95% CI) | Univariable p value | Multivariable standardized dominance statistic | Multivariable rank |
---|---|---|---|---|
Age (years) | −0.015 (−0.030, 0.001) | 0.061 | 0.029 | 5 |
Female (vs. male) | 0.311 (−0.039, 0.662) | 0.082 | 0.023 | 7 |
Chronic liver disease (vs. absent) | 0.165 (−0.325, 0.655) | 0.509 | - | - |
Albumin (g/dL) | −0.097 (−0.414, 0.221) | 0.550 | - | - |
Bilirubin (mg/dL) | −0.067 (−0.228, 0.094) | 0.413 | - | - |
ALBI score | −0.061 (−0.406, 0.285) | 0.727 | - | - |
MELD score | 0.020 (−0.038, 0.079) | 0.496 | - | - |
CA 19-9 (Log10; U/mL) | 0.896 (0.538, 1.254) | 0.001 | 0.141 | 2 |
Multifocal (vs. single) | 0.540 (0.092, 0.987) | 0.018 | 0.028 | 6 |
Diameter (cm) | 0.070 (0.002, 0.138) | 0.045 | 0.058 | 4 |
TBS | 0.086 (0.015, 0.156) | 0.017 | 0.069 | 3 |
Macrovascular invasion (vs. absent) | 0.199 (−0.302, 0.700) | 0.436 | - | - |
Clinical lymph nodes (vs. absent) | 2.159 (1.624, 2.694) | 0.001 | 0.652 | 1 |
Major hepatectomy (vs. minor) | −0.340 (−0.687, 0.211) | 0.157 | - | - |
Venous reconstruction (vs. absent) | −0.205 (−1.349, 0.940) | 0.726 | - | - |
Variables with p < 0.10 were entered in the multivariable model. The dominance statistic determines the importance of independent variables; standardized values are derived by normalizing them to sum to 100% of the overall model fit statistic. TBS was derived from the formula $\mathrm{TBS} = \sqrt{\text{diameter}^2 + \text{tumor number}^2}$ and was preferred to its components because of its higher dominance. Consequently, age, female sex, suspicion of clinical lymph node positivity, Log10 CA 19-9, and TBS were retained for subsequent coefficient estimations.
ALBI, Albumin-Bilirubin score; MELD, model for end-stage liver disease.
Modeling results are reported in Table 3. After 10-fold CV, the highest c-statistic with the lowest number of variables was obtained with the inclusion of clinically suspicious lymph nodes at radiology, preoperative CA 19-9, TBS, and age, resulting in a c-statistic of 0.723 (95% CI: 0.680, 0.766), with a z-statistic of −0.201, indicating that the developed model was well calibrated (p = 0.580). When the model was tested in the validation cohort of 171 patients, the c-statistic was 0.771 (95% CI: 0.699, 0.842) and the z value was −0.670 (p = 0.749), confirming that both discrimination and calibration were maintained.
Variables introduced in the 10-fold CV model . | c-statistic . | z-statistic . | z-statistic p value . |
---|---|---|---|
Clinical suspicious lymph nodes | 0.671 (0.637, 0.704) | −0.012 | 0.505 |
Clinical suspicious lymph nodes + CA 19-9 | 0.699 (0.654, 0.743) | −0.041 | 0.516 |
Clinical suspicious lymph nodes + CA 19-9 + TBS | 0.706 (0.662, 0.750) | −0.074 | 0.529 |
Clinical suspicious lymph nodes + CA 19-9 + TBS + ageᵃ | 0.723 (0.680, 0.766) | −0.201 | 0.580 |
Clinical suspicious lymph nodes + CA 19-9 + TBS + age + female sex | 0.723 (0.681, 0.765) | −0.275 | 0.608 |
Calibration was assessed through Spiegelhalter’s z-statistic: negative values indicate that observed outcomes are lower than the predicted ones (i.e., the model overestimated probabilities); positive values indicate that observed outcomes are higher than the predicted ones (i.e., the model underestimated probabilities). The closer the z value is to zero, the better the calibration. A z-statistic p value >0.05 indicates that the model is well calibrated.
The linear predictor of the model can be calculated as 0.403 + 0.383 (if lymph nodes are suspicious at preoperative imaging) + 0.104 × Log10 CA 19-9 + 0.014 × TBS − 0.003 × age (in years).
TBS, tumor burden score.
ᵃ The final coefficients were calculated for the clinical suspicious lymph nodes (0.383), Log10 CA 19-9 (0.104), TBS (0.014), and age (−0.003) together with the constant value (0.403). The model including the variable female sex did not improve either the c-statistic or the z-statistic compared with the model with four variables.
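The linear predictor reported in the footnotes of Table 3 can be computed as in the following sketch. The coefficients are those given in the text; the function name, argument names, and the example patient are hypothetical. The sketch returns the linear predictor only and does not convert it to a probability, because the text does not describe that step; the online calculator cited in the Abstract remains the reference implementation.

```python
import math


def linear_predictor(suspicious_nodes: bool, ca19_9_u_ml: float,
                     tbs: float, age_years: float) -> float:
    """Linear predictor using the coefficients reported with Table 3."""
    if ca19_9_u_ml <= 0:
        raise ValueError("CA 19-9 must be positive to take its log10")
    return (0.403                                    # constant
            + 0.383 * (1 if suspicious_nodes else 0)  # suspicious nodes at imaging
            + 0.104 * math.log10(ca19_9_u_ml)         # Log10 CA 19-9 (U/mL)
            + 0.014 * tbs                             # tumor burden score
            - 0.003 * age_years)                      # age in years


# Hypothetical patient: suspicious nodes, CA 19-9 of 100 U/mL, TBS 5, age 60
print(round(linear_predictor(True, 100.0, 5.0, 60.0), 3))
```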
Comparison with Previous Clinical Models
In the validation cohort, the present model was also tested against previously published models. The c-statistic of the “enhanced imaging model” [10] was 0.725 (95% CI: 0.647, 0.802), which was similar to that of the present model (p = 0.095). However, its z-statistic was 3.933, indicating underestimation of the actual probabilities (p = 0.001). The c-statistic of the “clinical risk score” [9] was 0.611 (95% CI: 0.527, 0.695), which was lower than that of the present model (p = 0.004). Its z-statistic was 4.211, indicating that this model also underestimated the actual probabilities (p = 0.001).
Risk Groups and the Risk of Missing Positive Lymph Nodes
The linear predictor of the model was grouped into quartiles (Fig. 2). In the training cohort, this led to a progressive increase of lymph node positivity across groups: 29.9% in patients at low risk, 45.1% in patients at intermediate-low risk, 51.5% in those at intermediate-high risk, and 87.3% in high-risk patients (p = 0.001 for trend across groups). As expected, in patients with inadequate lymphadenectomy, lymph node positivity was lower, but risk groups could nonetheless be stratified by the model (p = 0.001 for trend across groups). This was further confirmed when the probability of finding nodal metastases in each risk category was stratified according to the number of nodes removed when performing inadequate lymphadenectomy (Fig. 3): for instance, in patients at low risk, the probability of finding nodal metastases rises linearly from 4.6% when one node is removed to 23.1% when five nodes are removed.
Finally, as reported in Figure 2, a higher preoperative risk correlated with higher odds of missing positive lymph nodes in the case of inadequate lymphadenectomy. In other words, in patients at low risk, 1 in every 7 patients (number needed to harm = 6, IQR 9–3) would have missed positive lymph nodes because of inadequate lymphadenectomy. Conversely, in patients at high risk, 1 in every 3 patients (number needed to harm = 2, IQR 3–1) would have missed positive lymph nodes because of inadequate lymphadenectomy.
Discussion
In this study, we developed and validated a score to preoperatively predict pathological LNMs in patients undergoing surgery for ICC. Clinically suspicious lymph nodes at radiology, preoperative CA 19-9, TBS (combination of primary tumor size and number), and patients’ age allow for a linear stratification of the risk, potentially guiding perioperative treatment strategies in patients with ICC who are candidates for surgical resection.
Regional lymphadenectomy is a crucial part of the surgical procedure in many cancers. In some, such as in breast cancer, it has a role in staging and guides adjuvant treatment. In others, such as in gastric or rectal cancer, evidence suggests that LND could play a role as part of the curative surgical process in terms of oncological radicality [20]. In the setting of ICC, although its role in staging has been confirmed and the impact of LNM on prognosis is dramatic, the debate on the survival benefit of LND is ongoing [4, 5, 21]. A recent systematic review did not show any benefit, but most of the included studies were unmatched comparisons of retrospective series and did not consider the preoperative suspicion of LNM [4]. In a recent paper [5], we showed a survival benefit of adequate LND in patients with no preoperative evidence of LNM, particularly in those at a less advanced stage (in terms of number and size of the lesions and tumoral markers) and with no liver disease. However, no definitive benefit in terms of local recurrence has been demonstrated so far. Several reports showed that, to provide an accurate staging of the disease, a minimum of 6 lymph nodes should be retrieved defining it as adequate lymphadenectomy according to the TNM AJCC manual 8th edition [3, 22]. This has been increasingly adopted by the surgical community, as recently shown in a registry study from the National Cancer Database (NCDB), in which a significant increase both in sampling of at least one lymph node (from 55.5% in 2010–2012 to 61.7% in 2016–2019) and in adequate lymphadenectomy (from 14.2% in 2010–2012 to 18.2% in 2016–2019) has been observed [2]. In the present series of 1,259 patients, only 10.8% did not receive any nodal sampling. 
Among 1,037 patients who had at least one lymph node analyzed at pathology, 693 patients (66.8%) received an adequate lymphadenectomy (>5 lymph nodes retrieved) and formed a high-quality case series for evaluating the predictors of pathological LNM.
Contrast-enhanced CT scan and MRI have a low accuracy in determining the nodal status of patients with ICC, with a sensitivity of only 40–50% and specificity of 77–92% [8, 23, 24]. This accuracy might be only partially improved by incorporating 18FDG-PET into preoperative staging, leading to a pooled sensitivity and specificity of 69.1% and 88.4%, respectively, according to the most recent review [25]. Therefore, several attempts at developing prediction models combining both clinical and radiological features to evaluate the presence of nodal metastases at pathology have been conducted. In a recent study, Rhee et al. [9] developed an LNM score that combines serum carcinoembryonic antigen and two MRI findings, suspicious LN and bile duct invasion. The score was associated with LNM at pathology and outperformed MRI-suspicious LN alone (area under the curve 0.703 vs. 0.604, p = 0.004). In another large registry study, Tsilimigras et al. [10] developed an enhanced imaging model incorporating clinical and imaging data to predict LNM. Similar to our study, they found age, multifocality, and CA 19-9 as predictors of LNM when combined with imaging findings. We tested these models in the validation cohort, against our model: the c-statistic of the first was 0.611, significantly lower than the present one (p = 0.004). The c-statistic of the second was 0.725 (95% CI: 0.647, 0.802), similar to that of the present model (p = 0.095). However, its z-statistic was 3.933, indicating that that model significantly underestimated current probabilities (p = 0.001).
The previously cited studies, and to the best of our knowledge all studies assessing the preoperative risk of having LNM, use pathological data (namely, nodal status according to pathology from the resected specimen) to determine the final nodal status and therefore to measure the model’s accuracy. Pathological data are derived from heterogeneous surgical specimens in terms of lymphadenectomy extension, which is often confined to the sampling of one or two lymph nodes. Consequently, it is likely that these models base their accuracy on a pathological understaging. In this study, our model was developed on a cohort of patients undergoing adequate lymphadenectomy with the removal of at least 6 lymph nodes, the minimum standard for providing a correct pathological nodal staging of the disease. We identified four risk categories for LNM at pathology according to clinical and radiological features. These categories demonstrated a good correlation with pathological findings both in patients receiving adequate LND and in those receiving retrieval of at least one LN.
The relative risk of LNM for each category was significantly higher in patients receiving adequate LND, as the result of a correct pathological staging. For instance, the lowest risk category displays a risk of 29.9% of having LNM when the model is applied to patients who underwent adequate LND. This risk appears very high when compared to the 11.6% risk of patients with the same characteristics but receiving an inadequate LND (as shown in Fig. 2) or when compared to the lowest risk categories of the previously cited studies. The comparison of these two populations (adequate vs. inadequate LND) allows for the evaluation of the number of patients needed to treat, in terms of harm, when performing an inadequate lymphadenectomy: intuitively, this number decreases when the risk of having LNM increases, and becomes negligible when the risk is very high. In other words, the lower the risk of having LNM, the higher the need of an adequate LND to detect occult nodal disease. Considering that a systematic LND might increase the risk of postoperative morbidity and mortality, particularly in patients with cirrhosis or comorbidities [7, 26], our results add a piece of information when assessing the risk/benefit ratio of the procedure. In fact, in patients at high risk of LNM, the risk of nodal understaging is lower and a simple nodal sampling is more likely to result in a correct staging of the disease.
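The comparison between adequate and inadequate LND in the low-risk category can be worked through numerically. The sketch below assumes that the number needed to harm is the reciprocal of the absolute difference in LNM detection rates and that it is rounded up to the next whole patient; the text does not state how the number was derived, so both the formula and the rounding are assumptions, as is the function name.

```python
import math


def number_needed_to_harm(rate_adequate: float, rate_inadequate: float) -> float:
    """Reciprocal of the absolute difference in LNM detection rates
    (assumed definition; not stated in the text)."""
    difference = rate_adequate - rate_inadequate
    if difference <= 0:
        raise ValueError("adequate LND is expected to detect more LNM")
    return 1.0 / difference


# Low-risk category: 29.9% with adequate LND vs. 11.6% with inadequate LND
nnh = number_needed_to_harm(0.299, 0.116)
print(round(nnh, 2), math.ceil(nnh))
```

Under these assumptions the low-risk figures give a value of about 5.5, which rounds up to 6, the number needed to harm reported for low-risk patients.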
The results of the present study should be interpreted considering some limitations. First, due to its retrospective nature, selection bias could have been introduced in terms of patients and tumor selection. This might be even more pronounced considering the large time span in which the cohort was enrolled and the multicenter nature of the study. However, we sought to statistically reduce these biases using a training and a validation cohort, and by developing the prediction model by weighting the coefficients of a 10-fold CV. A second limitation is the radiological preoperative LN assessment: in this study, clinical suspicion of LNM was defined as the radiological evidence of regional nodes >1 cm, or the presence of regional nodes <1 cm metabolically active at 18FDG-PET. This definition is not universally accepted, as recently shown in a systematic review and meta-analysis [4], and therefore, the accuracy of the present model may slightly vary if other criteria for clinical suspicion are used. Moreover, preoperative data on histological classification of ICC from biopsy were collected in less than 10% of cases, and therefore, we were not able to assess the impact of different types of ICC (i.e., large bile duct type vs. small bile duct type) on the risk of LNM. Lastly, the model was not externally validated. Additional validation from other health care systems (i.e., Eastern or US patient cohorts) would further test the model’s performance and eventually adjust it as per TRIPOD guidelines [15].
In summary, in a cohort of 693 patients who underwent adequate lymphadenectomy for ICC, the risk of having LNM was around 50%. A novel model that combines preoperative radiological and clinical data was developed to predict the risk of LNM before surgery. Through internal validation, this model exhibited high accuracy, surpassing the performance of other clinico-radiological models. The proposed model has the potential to assist clinicians in preoperative decision-making regarding the extension of lymphadenectomy during ICC resection, offering a chance to better weigh the risks and benefits of the procedure.
Statement of Ethics
The study protocol conformed to the ethical guidelines of the 2013 Declaration of Helsinki, the 2018 edition of the Declaration of Istanbul, and was approved by the Research Ethics Committee of the Fondazione IRCCS Istituto Nazionale Tumori di Milano (National Cancer Institute, INT 0087/23). Formal consent by written signature was waived for this type of retrospective study according to the Ethical Committee of the Fondazione IRCCS Istituto Nazionale Tumori di Milano.
Conflict of Interest Statement
The authors have no conflicts of interest to declare.
Funding Sources
No funding was needed for this study.
Author Contributions
Study conception and data analysis: C.S. and A.C. Acquisition of data: F.R., L.A., F.A., S.D.S., M.S., G.M., and M.M. Interpretation of data: G.M.E., M.C., F.D.B., F.G., A.R., G.E., L.A., and G.B. Drafting manuscript: C.S., A.C., V.M., and G.B. All the authors critically reviewed and revised the draft manuscript and approved the final version for submission.
Data Availability Statement
All data generated or analyzed during this study are included in this article. Further inquiries can be directed to the corresponding author.