Abstract
Background: There is substantial overlap in MRI findings between phyllodes tumors (PTs) and fibroadenomas (FAs). This study investigated the value of conventional MRI texture analysis in the differential diagnosis of PTs and FAs. Methods: Preoperative MRI data, including axial T1WI, T2WIFS (T2WI with fat suppression), and DCE-T1WI2min and DCE-T1WI7min (T1WI at 2 and 7 min after contrast enhancement, respectively, on dynamic contrast-enhanced [DCE]-MRI), of 45 patients with PTs and 67 patients with FAs were retrospectively analyzed. Using MaZda 4.7 software, the maximum ROIs were drawn manually at the same lesion level on each of these MRI images. The feature selection methods were Fisher’s coefficient, probability of classification error and average correlation coefficient (POE + ACC), and mutual information (MI), as well as a combination of the 3 methods (F + POE + ACC + MI [FPM]). The misclassification rates of PTs and FAs were compared between texture analysis and subjective diagnosis by radiologists. Results: The DCE-T1WI7min images had the lowest misclassification rate of 10.71% (12/112). The misclassification rate for the radiologists’ analysis (31.25%, 35/112) was higher than that of every texture analysis; the difference was statistically significant for the FPM method on the T2WIFS and DCE-T1WI2min images (all p < 0.05) and for the Fisher and FPM methods on the DCE-T1WI7min images (all p < 0.05). Conclusion: Texture analysis of conventional MRI can serve as an auxiliary tool that provides an objective basis for differentiating PTs from FAs.
Introduction
Phyllodes tumors (PTs) are a rare type of fibroepithelial tumor of the breast, accounting for 2–3% of all fibroepithelial lesions and <1% of all breast tumors [1]. They are characterized by the bidirectional differentiation of epithelial and mesenchymal cells and have clinical manifestations, molecular expression, and histological characteristics similar to those of fibroadenomas (FAs); the 2 tumor types can coexist or transform into each other [2, 3]. Nonetheless, preoperative diagnosis is crucial because they require different therapies. FAs need only enucleation or follow-up, whereas PTs require wide local excision with adequate margins due to the high local recurrence rate [1, 4]. There are many similarities between the MRI findings of PTs and FAs [4, 5], especially when PTs appear as smooth, homogeneous nodules, which are difficult to distinguish from FAs. In addition, approximately 4% of FAs can present as large, rapidly growing masses with necrosis, which are indistinguishable from invasive PTs [6]. Although lobulation and cystic components were found to differ significantly between these 2 tumors [4], such qualitative findings may yield different results depending on the subjective judgment of the physician. Currently, the traditional diagnosis of breast tumors is based mainly on expert experience and places heavy demands on the reader. Furthermore, breast MRI involves many sequences, especially dynamic contrast-enhanced MRI (DCE-MRI), and generates a large amount of image information that is very challenging for a human reader to integrate. When the reader’s experience and knowledge are insufficient, diagnosis is slow and less accurate. With the development of digital image processing, pattern recognition, and artificial intelligence, computer-aided diagnosis for cancer has received increasing attention. In recent years, texture analysis has emerged as a computer-based image postprocessing technique.
Through quantitative analysis of grayscale distribution features, spatial features, and pixel intensities in images, a series of texture feature parameters that are helpful in revealing the potential heterogeneity within lesions are obtained [7]. At present, MRI texture analysis has been applied to breast cancer molecular subtype classification, treatment response prediction, and prognosis [8, 9]. However, there are few studies on conventional MRI sequence texture analyses for the differential diagnosis of PTs and FAs. The purpose of this study was to explore the value of texture analysis in different MRI sequences in the differentiation between PTs and FAs of the breast.
Patients and Methods
Patients
Clinical and MRI data on 45 patients with PTs and 67 patients with FAs confirmed by postoperative pathology in our hospital from October 2012 to March 2019 were collected. The inclusion criteria were as follows: (a) all tumors must be pathologically confirmed as PTs or FAs by surgery or effective biopsy, and (b) all patients underwent routine MRI examination of the breast in our hospital 2 weeks before surgery, including axial T1WI, T2WIFS, and DCE-MRI scanning. The exclusion criteria were as follows: (a) patients who had received chemotherapy or radiation before MRI examination, and (b) patients whose MRI quality was poor and could not be used for texture analysis. All 112 patients were female: 45 patients had PTs (48 lesions), among whom 43 patients had single unilateral lesions and 2 patients had multiple unilateral lesions; 67 patients (72 lesions) had FAs, among whom 62 patients had single unilateral lesions and 5 patients had multiple unilateral lesions. For the PT group, the patients were between 16 and 71 years of age, with an average age of 45.0 ± 10 years; in the FA group, the patients were between 12 and 66 years of age, with an average age of 32.0 ± 14 years.
Imaging Protocol
All patients were examined with a 1.5-T MRI scanner (Siemens Magnetom Aera, Germany) in the standard prone position with the breasts immobilized using an 8-channel dedicated breast coil. The MRI acquisition protocols were standardized in all cases and consisted of 3 parts in the following order. First, transverse T1WI (spin echo, TR = 8.6 ms, TE = 4.7 ms) and fat-suppressed T2WI (fast spin echo, TR = 5,600 ms, TE = 57 ms) images were obtained. Second, transverse DWI was performed using the spin echo-echo planar imaging sequence; the diffusion sensitivity coefficient (b value) was 1,000 s/mm², TR = 3,300 ms, TE = 94 ms; flip angle = 90°; layer thickness = 5 mm; FOV = 320 × 320 mm, matrix = 128 × 28; number of averages = 3; parallel acquisition with an acceleration factor of 2; an acquisition time of approximately 120 s, and the application of motion probing gradient pulse along the x, y, and z directions with b values of 0 and 800 s/mm². Third, after DWI, axial 7-phase dynamic contrast acquisitions were performed with a 3-dimensional fat-suppressed T1 fast-field echo sequence (TR = 4.62 ms, TE = 1.75 ms; layer thickness = 1.5 mm, interlayer spacing = 0, FOV = 360 × 360 mm, and matrix = 384 × 320). After a 90-s plain scan, an enhanced scan was performed. The contrast agent, Gd-DTPA (Omniscan, GE Healthcare, Ireland), was injected into the elbow vein by a high-pressure syringe at a dose of 0.1 mmol/kg and a flow rate of 4.0 mL/s. Subsequently, 7 phases were continuously collected without intervals. Each scanning duration was approximately 60.01 s, the layer thickness was 3 mm, and the number of single-phase scanning layers was 112.
Texture Analysis and Feature Selection
Image selection: MRI images of all the patients were re-binned to 6 bits/pixel and exported in BMP format from the picture archiving and communication system (PACS) workstation using in-house software. All the images were adjusted to a consistent window width and window level. Then, these data were stored on a personal computer for further analysis. Without knowledge of the clinical or histological information and without viewing any other images, 2 senior radiologists (C. Zhang and X. Luo, who had 10 and 15 years of experience in breast imaging diagnosis, respectively) together analyzed all the MRI sequences in a random per-patient order on 2,048 × 2,560 pixel grayscale PACS monitors. Because the axial DCE-T1WI7min sequence shows the best contrast enhancement, the layer with the largest lesion cross-section was selected on this sequence, and the T1WI, T2WIFS, and DCE-T1WI2min images of the same layer were used. When the 2 radiologists disagreed, they reached a consensus through consultation.
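The re-binning step can be illustrated with a minimal sketch. The in-house export software is not described in the text, so the function name `rebin_to_6_bits`, the 8-bit input depth, and the mapping by dropping the least significant bits are all assumptions made for illustration.

```python
def rebin_to_6_bits(pixels, input_bits=8):
    """Map integer gray levels from `input_bits` down to 6 bits (0..63).

    Assumption: re-binning drops the least significant bits, so that
    neighboring input levels share one output bin.
    """
    if input_bits < 6:
        raise ValueError("input must have at least 6 bits per pixel")
    shift = input_bits - 6
    return [[value >> shift for value in row] for row in pixels]


# Hypothetical 8-bit pixel values, not study data.
image = [[0, 3, 4, 128], [200, 252, 255, 64]]
rebinned = rebin_to_6_bits(image)
print(rebinned)
```

Reducing the images to 64 gray levels keeps matrix-based texture features, such as those from a co-occurrence matrix, at a manageable size.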
Texture analysis: MaZda (version 4.7; Lodz University of Technology, Institute of Electronics, http://www.eletel.p.lodz.pl/mazda/) was used to analyze the images at the selected level of maximum lesion size, and the texture feature parameters were calculated from manually drawn ROIs. The ROI was manually outlined by one of the radiologists along the edge of the tumor, including cystic, necrotic, and hemorrhagic areas. The DCE-T1WI7min images were analyzed first, and the ROIs of the other sequences were then drawn to match that of the DCE-T1WI7min image as closely as possible (Fig. 1). For each ROI, gray-level normalization was performed using the limitation of dynamics μ ± 3σ (μ is the gray-level mean; σ is the gray-level SD) to minimize the influence of contrast and brightness variations [10]. With MaZda, textural features of all the ROIs were calculated from the gray-level co-occurrence matrix (GLCM), the gray-level run length matrix (run), the absolute gradient (GRA), the autoregressive model (ARM), and the wavelet transform (WAV) (Table 1). Because there are many texture features, those most capable of discriminating between the 2 lesions must be selected first. MaZda provides 3 methods for the selection of texture features, namely, the Fisher coefficient, probability of classification error combined with average correlation coefficients (POE + ACC), and mutual information (MI); each method selects 10 texture feature parameters with discriminating value. In addition, the 3 methods can be combined (Fisher + POE + ACC + MI [FPM]) to select a total of 30 texture feature parameters for further classification analysis. In this study, all 4 methods were used to select the most valuable texture feature parameters.
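The μ ± 3σ normalization and one co-occurrence feature can be sketched as follows. This is a simplified illustration, not MaZda’s implementation: it assumes a rectangular ROI (the study used free-hand outlines), 64 output gray levels, a single horizontal pixel offset, and GLCM contrast as the only feature. The function names and the pixel values are hypothetical.

```python
import math


def normalize_roi(values, levels=64):
    """Clip gray levels to [mu - 3*sigma, mu + 3*sigma] and rescale to 0..levels-1."""
    n = len(values)
    mu = sum(values) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / n)
    if sigma == 0:
        return [0] * n  # flat ROI: every pixel maps to the same level
    low, high = mu - 3 * sigma, mu + 3 * sigma
    scaled = []
    for v in values:
        clipped = min(max(v, low), high)
        scaled.append(round((clipped - low) / (high - low) * (levels - 1)))
    return scaled


def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized gray-level co-occurrence matrix for one pixel offset."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = image[r][c], image[r2][c2]
                counts[a][b] += 1
                counts[b][a] += 1  # count each pair in both directions
    total = sum(map(sum, counts))
    if total == 0:
        raise ValueError("no pixel pairs exist for this offset")
    return [[v / total for v in row] for row in counts]


def contrast(p):
    """GLCM contrast: co-occurrence probabilities weighted by squared level difference."""
    size = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(size) for j in range(size))


# Hypothetical 4 x 4 ROI; the values are illustrative only.
roi = [
    [52, 55, 61, 59],
    [62, 59, 55, 104],
    [63, 65, 66, 113],
    [58, 60, 64, 120],
]
flat = [v for row in roi for v in row]
norm = normalize_roi(flat)
width = len(roi[0])
norm_image = [norm[i:i + width] for i in range(0, len(norm), width)]
matrix = glcm(norm_image, levels=64)
print("GLCM contrast:", contrast(matrix))
```

A homogeneous region gives a contrast of 0, whereas alternating gray levels give larger values, which is the kind of heterogeneity such features are meant to capture.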
Then, we input the selected texture feature parameters into the B11 statistical software provided with MaZda and classified the texture features of the 4 sequences through automatic training of an artificial neural network (ANN) model to evaluate the accuracy of the differential diagnosis of PTs and FAs. The results are expressed as the misclassification rate. The lower the misclassification rate, the better the sequence discriminates PTs from FAs and the more of its texture features are useful for distinguishing these 2 tumors.
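The feature-ranking idea behind the Fisher coefficient can be sketched as follows. The sketch assumes a common two-class form of the score, the squared difference of the class means divided by the sum of the class variances; MaZda’s own definition may weight the classes differently. The feature names and values are hypothetical and are not study data.

```python
def mean(xs):
    return sum(xs) / len(xs)


def variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def fisher_score(class_a, class_b):
    """Between-class separation divided by within-class spread for one feature."""
    between = (mean(class_a) - mean(class_b)) ** 2
    within = variance(class_a) + variance(class_b)
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return between / within


def rank_features(features, k=10):
    """Return the names of the k features with the highest Fisher score."""
    scores = {name: fisher_score(a, b) for name, (a, b) in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical feature values for a few PT and FA cases (not study data).
toy = {
    "feature_A": ([1.0, 2.0, 3.0], [7.0, 8.0, 9.0]),  # well separated
    "feature_B": ([1.0, 5.0, 9.0], [2.0, 5.0, 8.0]),  # identical means
    "feature_C": ([4.0, 5.0, 6.0], [5.0, 6.0, 7.0]),  # slight shift
}
top = rank_features(toy, k=2)
print(top)
```

In the study each selection method kept 10 features; `k=2` is used here only because the toy set is small.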
Subjective Judgment by Radiologists
The diagnosis results were obtained by the analysis of the above 4 MRI sequences by 2 senior radiologists (C. Zhang and X. Luo) without knowing the pathological results. When the 2 doctors disagreed, consensus was reached through consultation. Based on the pathological results, the misclassification rate for the imaging radiologists was calculated.
Statistics
The following were the objectives of the statistical analysis: (1) to compare the difference in the misclassification rates for Fisher’s coefficient, POE + ACC, MI, and FPM between the 4 sequences, (2) to compare the difference in misclassification rates from these 4 methods between any 2 of the 4 sequences, and (3) to analyze the difference between the radiologists’ misclassification rate and the lowest misclassification rate of each sequence. The χ² test was adopted for the above statistical analysis, and the data were processed in SPSS (version 20.0 for Windows; SPSS, Chicago, IL, USA). All count data were expressed as percentages. Differences were considered statistically significant if p < 0.05.
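The comparison of two misclassification rates can be sketched as a 2 × 2 χ² test. The sketch assumes the uncorrected Pearson statistic with 1 degree of freedom; the text does not state whether a continuity correction was applied in SPSS. The function name is hypothetical. The counts are the radiologists’ 35/112 and the FPM method’s 12/112 on the DCE-T1WI7min images, for which the Results report χ² = 14.244.

```python
import math


def chi_square_2x2(wrong_a, total_a, wrong_b, total_b):
    """Pearson chi-square (no continuity correction) for two misclassification rates."""
    a, b = wrong_a, total_a - wrong_a  # method A: misclassified, correct
    c, d = wrong_b, total_b - wrong_b  # method B: misclassified, correct
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


chi2, p = chi_square_2x2(35, 112, 12, 112)
print(f"chi2 = {chi2:.3f}, p = {p:.5f}")
```

Under these assumptions the statistic agrees with the reported value to three decimal places, which suggests that no continuity correction was used for this comparison.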
Results
All the texture analysis results from MaZda are summarized in Table 1, and the demonstration ROIs are shown on axial-sectional images based on T2-weighted (Fig. 1a, 2a), and DCE-T1WI-weighted images (Fig. 1b, c, 2b, c), including the cystic areas and necrosis. Texture parameters are shown in Figures 1d, 2d.
Among the 4 sequences, the texture features that distinguish PTs from FAs came mainly from the DCE-T1WI7min images, with the lowest misclassification rate of 10.71% (12/112) obtained by the FPM method (shown in Table 2); 10 FAs were misdiagnosed as PTs and 2 PTs were misdiagnosed as FAs. The misclassification rates were similar for Fisher’s coefficient, POE + ACC, and MI (16.07–28.57%, on average 23.21% for Fisher’s coefficient; 20.54–24.11%, on average 22.32% for POE + ACC; 20.54–26.78%, on average 23.88% for MI). However, the misclassification rate of the combination of the 3 methods (10.71–20.54%, on average 15.85% for FPM) was lower than that of any single method (shown in Table 2). There was no statistically significant difference in misclassification rates between any 2 of the 4 sequences, except between the DCE-T1WI7min and T1WI images for the FPM and Fisher’s coefficient methods (χ² = 4.097 and 5.046; p = 0.043 and 0.025, respectively; shown in Table 3).
The misclassification rate for the radiologists’ subjective reading of PTs and FAs was 31.25% (35/112); 22 PTs were misdiagnosed as FAs and 13 FAs were misdiagnosed as PTs. The radiologists’ subjective misclassification rate was higher than all those obtained by texture analysis of each sequence. No statistically significant difference was found in the misclassification rates between the radiologists’ subjective reading and T1WI image texture analysis. However, there was a statistically significant difference between the radiologists’ misclassification rates and those from the FPM method for the T2WIFS and DCE-T1WI2min images (χ² = 6.247 and 8.114, p = 0.012 and 0.004, respectively), and from the Fisher and FPM methods for the DCE-T1WI7min images (χ² = 7.143 and 14.244, p = 0.008 and p < 0.001, respectively; shown in Table 4).
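The reported error counts can be broken down by tumor type with simple arithmetic. The sketch assumes that the denominators are the patient counts (45 PTs and 67 FAs, which sum to the 112 used in the text) rather than the lesion counts. The per-class percentages it prints are derived here and are not reported by the authors; the function name is hypothetical.

```python
def error_rates(pt_total, fa_total, pt_as_fa, fa_as_pt):
    """Return overall, PT, and FA misclassification rates in percent."""
    overall = 100 * (pt_as_fa + fa_as_pt) / (pt_total + fa_total)
    pt_rate = 100 * pt_as_fa / pt_total
    fa_rate = 100 * fa_as_pt / fa_total
    return overall, pt_rate, fa_rate


# Error counts from the text; denominators assume one case per patient.
readers = {
    "FPM on DCE-T1WI7min": error_rates(45, 67, pt_as_fa=2, fa_as_pt=10),
    "Radiologists": error_rates(45, 67, pt_as_fa=22, fa_as_pt=13),
}
for name, (overall, pt_rate, fa_rate) in readers.items():
    print(f"{name}: overall {overall:.2f}%, PTs {pt_rate:.2f}%, FAs {fa_rate:.2f}%")
```

Under this assumption, most of the gap between the two approaches comes from PTs read as FAs, which is the error with the greater consequence for the choice of surgery.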
Discussion
PTs are unique neoplastic lesions composed of both stromal and epithelial components, and these tumors may have clinical features and a histopathological appearance similar to those of FA, which is the most frequent benign tumor of the breast [2, 5]. PTs are subdivided into benign, borderline, and malignant categories depending on histologic features including stromal cellularity, nuclear atypia, mitotic activity, and tumor margin appearance [1]. The majority of PTs behave in a benign fashion, with the risk of local recurrence ranging from 17% in benign PTs to 30% in malignant PTs [1, 11]. Overall, distant metastasis rates reported in a recent literature review are 16.71% (66/395), 1.62% (7/431), and 0.13% (2/1,524) for malignant, borderline, and benign PTs, respectively [11]. However, pathological details of these extremely rare cases of metastatic benign PTs were not provided [11–13]. Clinically, the treatment of PTs is dramatically different from that of FAs; both benign and malignant PTs require complete surgical excision with wide tumor-free margins (generally, >1 cm), in contrast to the follow-up or simple enucleation for FAs [5, 14]. Therefore, accurate preoperative distinction between PTs and FAs is particularly important for both the choice of therapy and the prognosis of patients.
Currently, most radiologists distinguish these 2 tumors by tumor size, boundary, T2WI signal characteristics, DCE-MRI enhancement pattern, time-intensity curve (TIC) type, axillary lymph node status, and similar findings. However, misdiagnosis occurs from time to time, which is related, to some extent, to the clinical experience of radiologists and the lack of objective quantitative indicators. In addition, it is a challenge for the human brain to integrate a large amount of multiparameter image information and to determine the relative weight of different MRI sequences, especially when PTs and FAs yield very similar MRI findings. Compared with traditional empirical image interpretation, texture analysis can provide much information that is not visible to the naked eye. Its advantage is that it does not rely on the subjective judgment and clinical experience of imaging physicians and provides only objective quantitative information, such as the gray levels of the image itself [7].
MaZda, the texture analysis tool used in this study, can calculate many kinds of texture feature parameters, including histograms, gray-level co-occurrence matrices, gray-level run length matrices, absolute gradients, autoregressive models, and wavelet transforms, for a total of 287 parameters in 6 classes, covering lesion shape, direction, pixel signal intensity distribution, and texture features such as the wavelet transform, all of which are quantitative data [15]. The ANN classification analysis method in MaZda is a machine learning method that consists of subparts such as preprocessing, feature extraction, and classification. It has the ability to self-learn and can produce reasonable output from its input. It takes diagnostic factors selected from medical imaging data as input variables, and through continuous training, a model able to classify the image data is established for auxiliary medical image analysis [16]. At present, ANN technology has been used in the assisted diagnosis of breast tumors. Markopoulos et al. [17] used a specific ANN to identify benign and malignant breast calcifications and found that, on average, the ANN test results were statistically significantly better than those of 3 physicians in the control group. Peng et al. [18] found that ANNs can be used for the differential diagnosis of benign and malignant breast diseases from mammograms with a specificity of 98.00%, a sensitivity of 97.33%, and a misclassification rate of 2.50%. MaZda selects and extracts the most valuable texture features as the independent variables for an ANN dichotomous function. Generally, the ROI set is taken as the dependent variable (a maximum of 16 ROIs can be set at the same time, and ROIs with different names represent different pathological types), and the misclassification rate is obtained through automatic training.
In this study, conventional MRI images of the breast were subjected to texture analysis, and the value of ANN classification for identifying PTs and FAs was evaluated. The results suggest that texture analysis can be used as an auxiliary tool for the differential diagnosis of PTs and FAs on conventional MRI images; the misclassification rate of PTs and FAs was as low as 10.71% (12/112), significantly lower than the 31.25% (35/112) derived from the radiologists’ subjective reading. The results also showed that Fisher’s coefficient, POE + ACC, and MI alone were not significantly different in differentiating PTs from FAs, while the FPM method was more effective and gave the best differential diagnosis. The FPM method is a combination of Fisher’s coefficient, POE + ACC, and MI and draws on more comprehensive texture features (30 texture features can be selected). In addition, the training frequency and training sample size of the ANN were also increased, which made up for the deficiencies of the single Fisher coefficient, POE + ACC, and MI methods [15]. Therefore, the weight value obtained for FPM was the most stable, the misclassification rate was the lowest, and the method was more advantageous than traditional imaging assessment based on radiologists’ subjective readings.
T2WI sequence texture analysis has been suggested to be of very important application value in the differential diagnosis of tumors. For example, T2WI image texture analysis has importance in the screening and differential diagnosis of benign breast lesions and can improve the diagnosis rate and specificity [19]. Some studies suggest that T2WIFS texture analysis can distinguish liver cancer, hepatic hemangiomas, and metastatic tumors [20]. The main reason is that the T2WI echo time is relatively long, which increases the tissue resolution [21], so the image contains more differential texture features with identification value. The results in our study indicate that T2WIFS sequence texture analysis is feasible in the differential diagnosis of PTs and FAs, and its minimum misclassification rate is significantly lower than that derived from the radiologists’ subjective reading. We believe that the differences in the T2WIFS signal of these 2 tumors may be an important reason for the differences in the texture characteristics. PTs usually present heterogeneous, high signals on T2WIFS images, which is related to the existence of cracked glandular cavity structures and cystic necrosis [22]. Therefore, the heterogeneity is high, and the image texture is disorganized and unevenly distributed. In contrast, most FAs showed a homogeneous, high signal (related to tumor mucinous degeneration), and a few cases showed a homogeneous, low signal (related to tumor interstitial sclerosis), with relatively low heterogeneity. Breast ultrasound and histopathological studies have also shown that PTs are generally more prone to internal structural heterogeneity than FAs, which may be related to the degenerative changes in the internal structure caused by the rapid growth of PTs [5, 23].
DCE-MRI of the breast has become a routine clinical examination method to analyze hemodynamic changes in lesions by measuring dynamic changes in the signal intensity (such as TIC curves). Although these curves are simple and feasible, they reflect only the change in the mean signal enhancement and are not helpful in the differential diagnosis of PTs and FAs [4, 5]. In delayed contrast-enhanced images, based on the imaging experience and subjective visual judgment of physicians, PTs are more prone to heterogeneous enhancement, while FAs are more prone to homogeneous enhancement; however, there is no statistically significant difference between the 2 enhancement patterns [4]. Therefore, both BI-RADS (Breast Imaging Reporting and Data System) classification and ordinary experience suggest that the texture features of lesions and their changes with dynamic enhancement are more important. Previous studies have shown that dynamic enhanced T1WI breast texture analysis can better reflect tumor heterogeneity, and the texture difference may reflect the potential pathological subtypes of breast cancer [8, 24] and can monitor the tumor’s response to treatment [10, 25]. In the texture analysis results for the 4 sequences, we found that the lowest misclassification rates were from the DCE-T1WI7min images, followed by the DCE-T1WI2min images, and both were lower than those of the T1WI images. There was a statistically significant difference between the misclassification rates obtained from the DCE-T1WI7min and T1WI images, indicating that dynamic enhanced T1WI can improve the contrast between breast tumors and glandular tissue, and there was a substantial difference in the changes in the internal texture features of the lesion during the DCE-T1WI7min scan, which had a high diagnostic value [26].
In addition, these 2 misclassification rates were also lower than those of radiologists’ subjective readings, indicating that the recognition of internal texture features by computer-aided diagnosis is more advantageous than the recognition of signal intensity changes by human vision alone.
Conclusion
The results of this study indicate that conventional MRI texture analysis can be used in the noninvasive, simple, and accurate differentiation of PTs and FAs, which shows great application prospects and provides a certain objective basis for the establishment of clinical strategies. This study also has some limitations. First, texture analysis can be used only as a new method for auxiliary diagnosis, so to obtain a better diagnostic effect, it needs to be combined with conventional MRI findings. Second, the 2 radiologists did not classify the grades of PTs because the differentiation of benign and malignant PTs is difficult with conventional MRI. Third, the ROI data in this study are 2-dimensional, which may lead to some data bias.
Statement of Ethics
The Ethics Committee of the Daping Hospital of the Army Medical University approved the study. Informed consent was obtained from all individual participants included in the study.
Disclosure Statement
The authors declare that they have no competing interests.
Funding Sources
This work was supported by funding from the Chongqing Clinical Research Centre of Imaging and Nuclear Medicine (CSTC2015YFPT-gcjsyjzx0175) and the special fund of the Central Government guide for the development of local science and technology (YDZX20175000004270).
Author Contributions
X. Li: study concept and design. N. Jiang: drafting of the manuscript. L. Zhong and P. Zhong: acquisition of data. C. Zhang and X. Luo: analysis and interpretation of data. X. Li: critical revision of the manuscript for important intellectual content.