Abstract
Introduction: Chronic kidney disease (CKD) is classified according to the estimated glomerular filtration rate (eGFR), but kidney volume (KV) can also provide meaningful information. Very few radiomics (RDX) studies on CKD have utilized computed tomography (CT). This study aimed to determine whether non-enhanced computed tomography (NECT)-based RDX can be useful in the evaluation of patients with CKD and to compare it with KV. Methods: The NECT scans of 64 subjects with impaired kidney function (defined as eGFR <60 mL/min/1.73 m2) and 60 controls with normal kidney function were retrospectively analyzed. Kidney segmentation, volume measurement, and RDX feature extraction were performed. Machine-learning models using RDX were constructed to classify the kidneys as having structural markers of impaired or normal function. Results: The median KV in the impaired kidney function group was 114.83 mL vs. 159.43 mL (p < 0.001) in the control group. There was a statistically significant strong positive correlation between KV and eGFR (rs = 0.579, p < 0.001) and a strong negative correlation between KV and serum creatinine level (rs = −0.514, p < 0.001). The KV-based models achieved the best area under the curve (AUC) of 0.746, whereas the RDX-based models achieved the best AUC of 0.878. Conclusions: RDX can be useful in identifying patients with impaired kidney function on NECT. RDX-based models outperformed KV-based models. RDX has the potential to identify patients with a higher risk of CKD based on imaging, which we believe can indirectly support clinical decision-making.
Introduction
Chronic kidney disease (CKD) refers to a group of conditions that impair kidney function, characterized by a decline in glomerular filtration rate (GFR) below 60 mL/min per 1.73 m2 persisting for at least 3 months [1]. The estimated GFR (eGFR) is calculated using various formulas, such as the Modification of Diet in Renal Disease (MDRD) or Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) equations [1].
Serum creatinine levels are commonly measured prior to a radiological procedure involving intravenous contrast medium administration due to the risk of contrast nephrotoxicity [2]. Although low eGFR is a primary limitation of performing contrast-enhanced examinations, computed tomography (CT) still plays a crucial role in diagnosing the underlying causes of kidney diseases. Furthermore, CT can indirectly provide valuable information regarding the structural progression of CKD. This can be achieved using a variety of methods, including visual assessment by a radiologist, kidney length measurements [3], and more advanced techniques such as kidney volume (KV) quantification [3] or radiomics (RDX).
RDX is a method for quantitatively analyzing medical images by extracting mathematical features that describe shape and texture, which cannot be assessed by a radiologist’s eye alone [4]. These features can be categorized into several classes, including shape, first-order features (describing the intensity distribution), second-order features (describing relationships between voxels), and higher-order features [4]. Such data can then be analyzed, often using machine-learning models [4, 5]. As fibrosis is the most common histopathological feature of CKD [6], kidneys affected by CKD may exhibit RDX features that differ from those of healthy kidneys. While RDX studies on CKD have used magnetic resonance imaging (MRI) or ultrasound (US) to evaluate kidney function, very few have explored the topic of CT-based RDX in CKD [6].
This study aimed to determine whether non-enhanced CT-based RDX can be useful in the identification of CKD and to compare its performance with models using KV as a single variable. Upon reviewing the literature, we believe this topic has not been extensively explored to date.
Methods
This retrospective study was conducted in accordance with the Declaration of Helsinki and was reviewed and approved by the Bioethics Committee (Approval No. RNN/174/24/KE; date of approval: 9th July 2024). Informed written consent was obtained from all patients prior to the study. The study was also conducted in adherence to the CheckList for EvaluAtion of Radiomics (CLEAR) research guidelines for RDX studies [7]. The CLEAR checklist is provided in the online supplementary material (for all online suppl. material, see https://doi.org/10.1159/000543305).
Abdominal computed tomography (CT) scans performed between November 2019 and June 2024 were retrospectively analyzed. The inclusion criterion was a serum creatinine level measured prior to the examination. The exclusion criteria were as follows: (1) clinical context of acute kidney injury; (2) presence of renal tumors (however, patients with a history of renal tumors who had previously undergone nephrectomy were eligible for inclusion); (3) renal transplant; (4) hydronephrosis; (5) polycystic kidney disease; (6) horseshoe kidney; (7) incomplete medical documentation; and (8) poor image quality. Kidneys of patients with eGFR >60 mL/min/1.73 m2 were labeled as normal functioning, whereas kidneys of patients with eGFR <60 mL/min/1.73 m2 were labeled as impaired kidney function. The MDRD formula was used for the eGFR calculation.
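The labeling rule above can be illustrated with a minimal sketch of the four-variable MDRD equation. The text does not state which variant of the equation was used, so the coefficient 175 (for IDMS-standardized creatinine, with 186 as the original alternative) is an assumption, as are the function names and the handling of an eGFR of exactly 60, which the text leaves undefined.

```python
def egfr_mdrd(creatinine_mg_dl, age_years, female, black=False, idms=True):
    """Four-variable MDRD estimate of GFR in mL/min per 1.73 m2.

    Assumption: 175 is the coefficient for IDMS-standardized creatinine,
    186 the coefficient of the original equation. Which one the study
    used is not stated in the text.
    """
    coefficient = 175.0 if idms else 186.0
    egfr = coefficient * creatinine_mg_dl ** -1.154 * age_years ** -0.203
    if female:
        egfr *= 0.742  # sex correction factor of the MDRD equation
    if black:
        egfr *= 1.212  # race correction factor of the original equation
    return egfr


def label_kidney_function(egfr, threshold=60.0):
    """Hypothetical helper: 'impaired' below the threshold, else 'normal'.

    Assumption: a value of exactly 60 is treated as normal here.
    """
    return "impaired" if egfr < threshold else "normal"


# Illustrative inputs only (not patient data from the study).
example = egfr_mdrd(creatinine_mg_dl=2.0, age_years=70, female=True)
example_label = label_kidney_function(example)
```

Because both exponents are negative, the estimate falls as creatinine or age rises, which is why a single creatinine value combined with age and sex is sufficient to assign a label.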
Abdominal CT images were acquired utilizing a GE Revolution CT 64 Slice scanner (GE Healthcare, Chicago, IL, USA). The acquisition parameters were as follows: 120 kV voltage and 2.5 mm slice thickness. Most examinations were multiphasic and included contrast-enhanced phases; however, in this study, only the pre-contrast phases were analyzed. Kidney segmentation was performed using 3D Slicer version 5.7.0 software (www.slicer.org) employing TotalSegmentator extension version 8426cdf (https://github.com/wasserth/TotalSegmentator) [8].
This extension performs fully automated segmentation of all human organs [8], which is crucial for RDX research, as automation enhances reproducibility and reduces interobserver variability [9]. Segmentations of abdominal structures other than the kidneys were deleted by the authors. The primary objective was to obtain segmentations of the kidney parenchyma, encompassing both the cortex and medulla, while excluding the renal sinus. Large cysts, particularly those extending beyond the kidney surface and causing volume exaggeration, were also excluded. In certain cases, manual adjustments were made to the automated segmentations using 3D Slicer tools (e.g., brush, eraser). Each segmentation was reviewed and approved by consensus between two radiologists. The step-by-step segmentation process is illustrated in Figure 1, with the final sample segmentation shown in Figure 2. The KVs were measured based on segmentation performed using 3D Slicer tools. The segmentation masks were saved as separate files, and the paths to these files, along with the original image paths, were recorded in a CSV file.
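As a rough illustration of how a volume follows from a segmentation mask, the sketch below multiplies the number of segmented voxels by the volume of one voxel. This is not the 3D Slicer implementation; the function name and the in-plane voxel spacing in the example are hypothetical, and only the 2.5 mm slice thickness comes from the acquisition parameters above.

```python
def mask_volume_ml(n_voxels, spacing_mm):
    """Volume of a binary mask in mL from voxel count and voxel spacing.

    spacing_mm is (x, y, z) in millimeters; 1 mL equals 1000 mm^3.
    """
    sx, sy, sz = spacing_mm
    voxel_volume_mm3 = sx * sy * sz
    return n_voxels * voxel_volume_mm3 / 1000.0


# Hypothetical example: 0.75 mm in-plane spacing is an assumption,
# 2.5 mm is the slice thickness reported for the scans.
example_volume = mask_volume_ml(n_voxels=100_000, spacing_mm=(0.75, 0.75, 2.5))
```

Under this definition, excluding the renal sinus or a large cyst from the mask lowers the voxel count and therefore the reported KV directly.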
NECT images (voltage: 120 kV, slice thickness: 2.5 mm) of a patient with impaired kidney function. All images in the axial plane. Step-by-step segmentation process is presented: image before segmentation (a); after automated segmentation of all abdominal organs and structures in 3D Slicer with TotalSegmentator extension (b); after deletion of abdominal structures other than the kidneys (c); after manual adjustments (d).
NECT images (voltage: 120 kV, slice thickness: 2.5 mm) of a patient with normal kidney function in axial, coronal, and sagittal plane, as well as a volumetric reconstruction. Sample kidney segmentation performed in 3D Slicer with TotalSegmentator extension is presented. In this case, no manual adjustment was needed.
RDX features were extracted separately for each kidney using PyRadiomics version 3.0.1 (https://pyradiomics.readthedocs.io) [10] within the Python framework. A YAML format parameter file was created to configure image processing settings [9]. Voxels were resampled to a resolution of 1 × 1 × 1 mm [7, 10]. Additional filters were applied, including wavelet transform (with low (L) and high (H) pass filters, covering all eight possible decomposition combinations: LLL, LLH, LHL, LHH, HLL, HLH, HHL, and HHH) [7, 11], as well as the Laplacian of Gaussian (LoG) filter [7, 9, 11] with three different sigma values: 0.5, 1.0, and 2.0. A binwidth of 8 was chosen.
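The extraction settings described above could be expressed as a PyRadiomics parameter dictionary, as in the following sketch. The authors' actual YAML file is not reproduced in the text, so this is a reconstruction under assumptions: the helper names are hypothetical, the enabled feature classes are taken from the list of extracted classes given in this section, and every setting not mentioned in the text is left at the library default. The `radiomics` import is deferred so that the configuration itself can be inspected without the package installed.

```python
from itertools import product

# The eight wavelet sub-bands: every low (L) / high (H) pass combination
# along the three image axes.
WAVELET_BANDS = ["".join(combo) for combo in product("LH", repeat=3)]


def build_params():
    """Parameter dictionary mirroring the settings described in the text."""
    feature_classes = ("shape", "firstorder", "glcm", "gldm",
                       "glrlm", "glszm", "ngtdm")
    return {
        "imageType": {
            "Original": {},
            "Wavelet": {},
            "LoG": {"sigma": [0.5, 1.0, 2.0]},
        },
        # None enables all features of a class.
        "featureClass": {name: None for name in feature_classes},
        "setting": {
            "resampledPixelSpacing": [1, 1, 1],  # 1 x 1 x 1 mm voxels
            "binWidth": 8,
        },
    }


def extract_features(image_path, mask_path, label=1):
    """Hypothetical wrapper: run extraction for one kidney mask.

    Assumption: the mask label value of each kidney must be supplied by
    the caller; the text does not state which label values were used.
    """
    from radiomics import featureextractor  # requires PyRadiomics

    extractor = featureextractor.RadiomicsFeatureExtractor(build_params())
    return extractor.execute(image_path, mask_path, label=label)
```

Calling `extract_features` once per kidney mask would yield one feature vector per kidney, matching the per-kidney extraction described above.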
The extracted features belonged to the following classes: shape (3D), first-order features, gray-level co-occurrence matrix (GLCM), gray-level dependence matrix (GLDM), gray-level run length matrix (GLRLM), gray-level size zone matrix (GLSZM), and neighboring gray tone difference matrix (NGTDM) (https://pyradiomics.readthedocs.io/en/latest/features.html).
Further analysis was performed using the Orange Data Mining version 3.37.0 framework (https://orangedatamining.com) [4, 12]. The data were preprocessed using minimum-maximum normalization. To reduce dataset dimensionality and eliminate redundant data [9], logistic regression with the Least Absolute Shrinkage and Selection Operator (LASSO) was applied [5]. The selected features were then used to construct machine-learning models to classify the kidneys as having normal or impaired function. Among the many available machine-learning algorithms, random forest (RF) and support vector machine (SVM) are regarded as the most accurate for RDX research [13, 14] and were thus chosen for this study. K-nearest neighborhood (kNM) was also included for comparison. The models' hyperparameters were manually optimized to achieve the best area under the curve (AUC). The hyperparameters were set as follows: for RF: number of trees = 500, maximum depth = 5, minimum samples split = 5; for SVM: radial basis function (RBF) kernel, cost (C) = 1.3, regression loss (ɛ) = 0.10; for kNM: Euclidean distance metric, weighting by distance, number of neighbors = 5. The performance of the models was evaluated using 10-fold cross-validation.
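For readers who prefer code to a visual workflow, the pipeline described above can be approximated in scikit-learn. This is a swapped-in library, not the Orange workflow used in the study, and several details are assumptions: the data are synthetic, the LASSO strength `C=1.0` is arbitrary, feature selection is placed inside the cross-validation pipeline, and the SVM ε parameter is omitted because scikit-learn's `SVC` classifier has no such parameter.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Synthetic stand-in for the radiomics table (rows = kidneys).
X, y = make_classification(n_samples=244, n_features=100,
                           n_informative=10, random_state=0)


def lasso_top_ten():
    """L1-penalized logistic regression keeping the ten largest |coefficients|."""
    lasso = LogisticRegression(penalty="l1", solver="liblinear", C=1.0)
    return SelectFromModel(lasso, max_features=10, threshold=-np.inf)


classifiers = {
    "RF": RandomForestClassifier(n_estimators=500, max_depth=5,
                                 min_samples_split=5, random_state=0),
    "SVM": SVC(kernel="rbf", C=1.3),
    # k-nearest neighbors, abbreviated kNM in this paper
    "kNM": KNeighborsClassifier(n_neighbors=5, weights="distance",
                                metric="euclidean"),
}

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
aucs = {}
for name, clf in classifiers.items():
    # Scaling and selection are refit inside every training fold.
    pipe = make_pipeline(MinMaxScaler(), lasso_top_ten(), clf)
    aucs[name] = cross_val_score(pipe, X, y, cv=cv, scoring="roc_auc").mean()
```

Placing normalization and feature selection inside the pipeline is a design choice of this sketch: it keeps the held-out fold from influencing which features are selected. The text does not specify how this was handled in the original workflow.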
Separate models were created based on KV as a single variable. The performance of these models was then compared using AUC.
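For a single variable such as KV, the AUC has a direct interpretation: it is the probability that a randomly chosen kidney from one group has a larger value than a randomly chosen kidney from the other group, with ties counted as one half. The sketch below computes it by pairwise comparison; the function name and the volumes are illustrative and not taken from the study.

```python
def pairwise_auc(higher_group, lower_group):
    """AUC as the fraction of pairs ranked correctly (ties count 0.5)."""
    wins = 0.0
    for high in higher_group:
        for low in lower_group:
            if high > low:
                wins += 1.0
            elif high == low:
                wins += 0.5
    return wins / (len(higher_group) * len(lower_group))


# Made-up kidney volumes in mL, for illustration only.
normal_kv = [160.0, 150.0, 120.0]
impaired_kv = [110.0, 130.0, 90.0]
example_auc = pairwise_auc(normal_kv, impaired_kv)  # 8 of 9 pairs ordered correctly
```

This pairwise count is the Mann-Whitney U statistic divided by the number of pairs, which links the AUC reported for the models to the non-parametric group comparison used elsewhere in the study.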
Conventional statistical methods were also employed in this study. The Shapiro-Wilk test was applied to test for normal distribution. For non-parametric features, the Mann-Whitney U test was used. The chi-square test was applied for categorical data. Correlations were calculated using the Spearman formula. Statistical significance was defined as p < 0.05. Conventional statistical calculations were performed in Python using the SciPy 3.9.2 library.
Results
A total of 64 patients (124 kidneys) with impaired kidney function as subjects and 60 patients (120 kidneys) with normal kidney function as controls were enrolled. The characteristics of the study cohort are summarized in Table 1.
Characteristics of the study cohort
Characteristic | Patients with impaired kidney function | Patients with normal kidney function | p value |
---|---|---|---|
Patients, n (% male) | 64 (48.4) | 60 (46.7) | 0.84 |
Age, years | 74.5 (21–93) | 50.0 (22–89) | <0.001 |
Serum creatinine level, mg/dL | 1.95 (1.11–14.23) | 0.68 (0.43–1.15) | <0.001 |
eGFR (MDRD), mL/min per 1.73 m2 | 27.9 (3.6–58) | 104.45 (64.8–195.6) | <0.001 |
Number of kidneys (number of patients with one kidney) | 124 (4) | 120 (0) | - |
KV, mL | 114.83 (9.4–269.16) | 159.43 (53.41–317.62) | <0.001 |
Data are presented as median and range in parentheses, if not specified otherwise.
MDRD, Modification of Diet in Renal Disease; eGFR, estimated glomerular filtration rate.
The KV boxplots are shown in Figure 3. There was a statistically significant strong positive correlation between KV and eGFR (rs = 0.579, p < 0.001) and a strong negative correlation between KV and serum creatinine level (rs = −0.514, p < 0.001). For each included kidney, a total of 1,106 RDX features were individually extracted.
KV boxplots. Medians presented as horizontal lines inside boxplots. Outliers presented as dots outside boxplots.
Using KV as the only parameter, RF, SVM, and kNM achieved their best AUC of 0.732, 0.746, and 0.684, respectively. The additional details are presented in Table 2. The receiver operating characteristic curves are shown in Figure 4. Among the machine-learning models, SVM showed the best performance.
Performance of machine-learning models using KV as the only variable
Model | AUC | CA | F1 | Prec | Recall | MCC | LogicLoss |
---|---|---|---|---|---|---|---|
SVM | 0.746 | 0.701 | 0.700 | 0.703 | 0.701 | 0.404 | 0.588 |
RF | 0.732 | 0.664 | 0.664 | 0.664 | 0.664 | 0.328 | 0.879 |
kNM | 0.684 | 0.627 | 0.627 | 0.628 | 0.627 | 0.255 | 2.179 |
SVM, support vector machine; RF, random forest; kNM, k-nearest neighborhood; CA, classification accuracy; Prec., precision; MCC, Matthews correlation coefficient; LogicLoss, logistic loss.
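To make the reported metrics concrete, the sketch below derives accuracy, class-weighted recall, and the Matthews correlation coefficient from a binary confusion matrix. The counts are invented for illustration, and the assumption that the tabulated precision, recall, and F1 are class-weighted averages is not confirmed by the text; it would, however, be consistent with CA and Recall being identical in every row above, because class-weighted recall is algebraically equal to accuracy.

```python
import math


def binary_metrics(tp, fn, fp, tn):
    """Accuracy, class-weighted recall, and MCC from confusion-matrix counts."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total

    # Recall of each class, weighted by that class's share of the samples.
    n_pos, n_neg = tp + fn, tn + fp
    weighted_recall = (n_pos / total) * (tp / n_pos) + (n_neg / total) * (tn / n_neg)

    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {"accuracy": accuracy, "weighted_recall": weighted_recall, "mcc": mcc}


# Invented counts, for illustration only.
metrics = binary_metrics(tp=45, fn=15, fp=5, tn=35)
```

Unlike accuracy, the MCC uses all four cells of the confusion matrix and stays near zero for a classifier that merely predicts the majority class, which makes it a useful companion to AUC when group sizes differ.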
Receiver operating characteristic (ROC) curves for machine-learning models with KV as the only variable. Curve colors: green for support vector machine (SVM), orange for random forest (RF), and purple for k-nearest neighborhood (kNM).
Logistic regression with LASSO achieved the highest AUC of 0.881, with a lambda value of 0.33. RDX features were ranked based on their logistic regression coefficients and the top ten [15] were selected for further analysis. These selected features were compared between the groups using conventional statistics, with statistical significance presented in Table 3. Not all features deemed important for the model by LASSO were found to be significant using conventional statistics.
Statistical significance of features selected for further analysis
Feature | p value |
---|---|
log-sigma-1-0-mm-3D_glcm_Imc1 | 0.68 |
wavelet-LHH_glszm_SizeZoneNonUniformityNormalized | 0.08 |
log-sigma-2-0-mm-3D_firstorder_Mean | <0.001 |
wavelet-LLL_glcm_Imc2 | <0.001 |
original_glszm_LargeAreaEmphasis | <0.001 |
original_glszm_ZoneVariance | <0.001 |
wavelet-LHH_firstorder_Mean | 0.15 |
log-sigma-2-0-mm-3D_glszm_GrayLevelNonUniformityNormalized | 0.47 |
wavelet-HLL_firstorder_Mean | 0.009 |
wavelet-HLH_glszm_ZoneVariance | <0.001 |
Using RDX features, RF, SVM, and kNM achieved the best AUC of 0.868, 0.878, and 0.728, respectively. Detailed results are provided in Table 4, with the corresponding receiver operating characteristic curves illustrated in Figure 5. Similarly, SVM outperformed the other machine-learning models.
Performance of machine-learning models using RDX features
Model | AUC | CA | F1 | Prec | Recall | MCC | LogicLoss |
---|---|---|---|---|---|---|---|
SVM | 0.878 | 0.787 | 0.787 | 0.787 | 0.787 | 0.574 | 0.438 |
RF | 0.868 | 0.795 | 0.795 | 0.795 | 0.795 | 0.590 | 0.456 |
kNM | 0.728 | 0.668 | 0.668 | 0.669 | 0.668 | 0.339 | 2.011 |
SVM, support vector machine; RF, random forest; kNM, k-nearest neighborhood; CA, classification accuracy; Prec., precision; MCC, Matthews correlation coefficient; LogicLoss, logistic loss.
Receiver operating characteristic (ROC) curves for machine-learning models using RDX features. Curve colors: green for support vector machine (SVM), orange for random forest (RF), and purple for k-nearest neighborhood (kNM).
Discussion
Imaging-based evaluation of the structural progression of CKD has traditionally emphasized kidney measurements, with kidney length historically regarded as the most critical parameter [3]. More recent methods utilize KV as a predictor of kidney function, calculated using either the ellipsoid formula [16] or segmentation techniques, which may be semi-automated or fully automated [17]. It is well established that KV exhibits a strong correlation with kidney function [18‒22].
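The ellipsoid formula mentioned above approximates the kidney as an ellipsoid whose three diameters are the measured length, width, and thickness, giving V = π/6 × length × width × thickness. The sketch below is a generic illustration with hypothetical dimensions; it was not used in this study, which measured KV by segmentation.

```python
import math


def ellipsoid_volume_ml(length_cm, width_cm, thickness_cm):
    """Ellipsoid approximation of kidney volume; cm^3 is equivalent to mL."""
    return math.pi / 6.0 * length_cm * width_cm * thickness_cm


# Hypothetical dimensions in centimeters, for illustration only.
example_kv = ellipsoid_volume_ml(length_cm=11.0, width_cm=5.0, thickness_cm=4.0)
```

Because the formula relies on three linear measurements only, it cannot account for irregular contours or excluded structures such as the renal sinus, which is one reason segmentation-based volumetry is used as an alternative.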
While some RDX-based studies have explored CKD, the field remains relatively novel, with US and MRI being the primary modalities investigated [6]. For example, studies have demonstrated the utility of US, showing that US-based RDX is effective in evaluating CKD patients [23‒26], including the quantification of renal fibrosis [24, 26]. Similarly, MRI-based RDX research has shown its usefulness in the preliminary assessment and prediction of renal function in patients with autosomal dominant polycystic kidney disease (ADPKD) [27].
Few studies have made use of CT-based RDX in the context of CKD. A recent review by Zhao et al. [6] highlighted only one such study, by Amiri et al. [11], which employed CT scans obtained prior to radiotherapy (RTH) alongside clinical data to predict the development of CKD in patients undergoing RTH of the abdominal region. This study achieved remarkable performance, with an AUC of 0.99 and an accuracy of 0.94 using the RF classifier. Upon reviewing the literature, we identified very few additional studies with similarities to ours. Other research utilizing CT-based RDX demonstrated its effectiveness in distinguishing healthy kidneys from diabetic kidney disease [28] and healthy kidneys from CKD [15], and in evaluating renal function in autosomal dominant polycystic kidney disease (ADPKD), where RDX outperformed height-adjusted total KV [29]. Among these, the study by Kizildag et al. [15] is the only one with a comparable study protocol. Although their cohort was larger (225 vs. 124), our study extracted a substantially greater number of RDX features (1,106 vs. 13) [15]. We also excluded patients with ADPKD. Notably, their findings of significant correlations between certain RDX features and kidney function align with our results [15].
In our study, we observed a significant correlation between KV and kidney function (rs = 0.579, p < 0.001), consistent with findings reported by other researchers [19‒22]. We assessed the utility of both RDX and KV in relation to kidney function and their ability to differentiate between patients with eGFR >60 mL/min/1.73 m2 and <60 mL/min/1.73 m2. Machine-learning models constructed in this study demonstrated the superiority of RDX-based models over KV-based models, with the former achieving a higher AUC (0.878 vs. 0.746). These results align with those reported by Calvaruso et al. [29], although their study specifically focused on patients with ADPKD.
Some features used in our calculations would have been deemed insignificant had we relied solely on conventional statistical methods instead of machine learning. Additionally, our findings emphasize the critical role of filter application in RDX studies. Notably, only two of the ten most important features of our models were derived from the original images, aligning with observations from other studies [11, 23]. The remaining eight features were extracted using LoG or wavelet transform filters. Among feature classes, GLSZM emerged as the most significant, with half of the top ten features belonging to this class. The GLSZM is a class of RDX features that quantify the spatial relationships of connected voxels with identical gray-level intensities [30]. This differs from other studies that emphasized the importance of GLRLM or GLCM features [11, 23], highlighting the need for additional research in this area.
RDX is an emerging and evolving field of science that relies on complex methodologies, which can differ significantly among researchers [6, 7, 9]. Currently, RDX is neither quick nor user-friendly, requiring extensive software use and at least basic programming skills. Given these challenges, we recognize that it is still premature for its routine application in clinical practice. Indeed, we found no examples in the literature of its clinical use in the context of CKD.
Nevertheless, we remain optimistic that continued research in this area will lead to the development of simplified tools and greater standardization of methods, enabling RDX to support clinical decision-making, at least indirectly. As suggested by other researchers [6], we believe RDX has the potential to aid in the early identification of patients at risk of CKD, complementing KV and eGFR assessments. This could facilitate earlier implementation of nephroprotective strategies to slow the progression of kidney disease.
Our study has several limitations. The study cohort was small, consisting of 64 patients with impaired kidney function and 60 controls, and it was heterogeneous in terms of kidney disease types and their underlying causes. Additionally, the control group was significantly younger. Another limitation was that the inclusion criterion relied on only a single serum creatinine measurement. It is important to note that we consider this study a pilot investigation, and we plan to expand our research to include a larger patient population. Future studies will also incorporate more comprehensive clinical data, such as a history of kidney diseases, comorbidities, and pharmacological treatments.
Conclusions
Based on our findings, we conclude that RDX can be a valuable tool for identifying patients with structural markers of CKD on non-enhanced CT. RDX-based models outperformed KV-based models in the assessment of structural advancement of CKD. RDX shows promise in identifying patients at higher risk of CKD through imaging, potentially enabling earlier implementation of nephroprotective strategies to mitigate the progression of kidney disease.
Statement of Ethics
This retrospective study was conducted in accordance with the Declaration of Helsinki and was reviewed and approved by the Bioethics Committee of the Medical University of Lodz (Approval No. RNN/174/24/KE; date of approval: 9th July 2024). Informed written consent was obtained from all patients prior to the study.
Conflict of Interest Statement
The authors declare no conflicts of interest.
Funding Sources
The research was funded by the statutory fund of the department under Grant No. 503/1-136-01/503-11-001.
Author Contributions
P.B. and L.S. conceptualized this study. P.B., A.D., and K.F. contributed to data collection. P.B. performed formal data analysis and drafted the manuscript. A.D. helped draft the manuscript. P.B., I.K. and L.S. revised the manuscript. All authors have read and approved the final manuscript.
Data Availability Statement
The data that support the findings of this study are not publicly available due to their containing information that could compromise the privacy of research participants but are available from the corresponding author [P.B.] upon reasonable request.