Abstract
Introduction: The aim of this study was to develop and validate a deep learning algorithm to automatically identify and locate epiretinal membrane (ERM) regions in OCT images. Methods: OCT images of 468 eyes were retrospectively collected from a total of 404 ERM patients. One expert manually annotated the ERM regions for all images. A total of 422 images (90%) and the remaining 46 images (10%) were used as the training dataset and validation dataset for deep learning algorithm training and validation, respectively. One senior and one junior clinician read the images, and their diagnostic results were compared. Results: The algorithm accurately segmented and located the ERM regions in OCT images. The image-level accuracy was 95.65%, and the ERM region-level accuracy was 90.14%. In the comparison experiments, the accuracies of the junior clinician improved from 85.00% and 61.29% without the assistance of the algorithm to 100.00% and 90.32% with the assistance of the algorithm. The corresponding results of the senior clinician were 96.15% and 95.00% without the assistance of the algorithm, and 96.15% and 97.50% with the assistance of the algorithm. Conclusions: The developed deep learning algorithm can accurately segment ERM regions in OCT images. This deep learning approach may help clinicians achieve better accuracy and efficiency in clinical diagnosis.
Introduction
Epiretinal membrane (ERM) is a proliferation of cellular tissue on the surface of the retina, which can occur in any area of the retina [1]. It is called macular ERM when the fibrous proliferative membrane is located around the macula. According to etiology, ERM can be divided into idiopathic macular ERM (IMEM) and secondary macular ERM [2, 3]. Many studies indicate that retinal pigment epithelium (RPE) cells are the main participants in ERM, including IMEM. Patients may experience vision loss, visual deformation, metamorphopsia, micropsia or macropsia, photopsia, and diplopia [4, 5]. At present, the diagnosis of ERM is usually based on clinical examination and imaging studies such as optical coherence tomography (OCT). However, in the early stage, ERMs appear as a glinting, watered-silk, shifting light reflex from the inner surface of the retina, which makes the diagnosis prone to confusion, especially for beginners. If ERM is not diagnosed in the early stages, it may cause macular edema and macular holes in the later stages, which seriously affect vision.
Spectral-domain OCT is a noncontact, noninvasive imaging technique that produces two-dimensional and three-dimensional views of living retinal tissue by spectral analysis of the interference patterns of backscattered light. Despite the significant advantages of OCT in the diagnosis of fundus diseases, examining and interpreting the images remains a time-consuming task for ophthalmologists. In OCT images, macular ERM mainly manifests as an irregular, highly reflective layer on the retina. Part of the ERM may exert traction on the retina, sometimes causing tractional retinal detachment and causing the foveal depression of the macula to disappear. At present, the diagnosis of macular ERM can only be judged subjectively by clinicians based on OCT images, and the high reflection in the macular region may come from the ERM or from the retina itself. Furthermore, silicone oil filling after vitrectomy or residual silicone oil can also produce high reflection. All these factors lead to difficulties in clinical diagnosis.
Thanks to advances in artificial intelligence (AI), especially deep learning algorithms, recent years have witnessed encouraging AI applications in healthcare and clinical practice [6‒9]. Deep learning (DL) is a subfield of AI with exciting recent advances [10]. Technically, deep learning algorithms are neural network structures that can extract representations of input data. DL algorithms based on convolutional neural networks and other more sophisticated structures have been successfully applied in image analysis. Notably, in medical image reading, deep learning algorithms have achieved human-level performance in tasks such as detection [11‒13], segmentation [14‒19], and classification [20, 21]. Recently, DL has been successfully applied in analyzing medical images from ultrasound [22, 23], CT [24‒26], and MRI [27‒29]. In ophthalmology, there is also an emerging body of studies applying AI to analyze ultrasound images [30], retinal fundus images [31], and OCT [11, 15, 19, 32‒36]. More recently, there have been emerging reports of utilizing AI algorithms on OCT images to detect diseases such as diabetic retinopathy [37], glaucoma [11, 38], macular degeneration [36], and retinal detachment [39]. However, there is still a lack of AI studies for the diagnosis of ERM. Recent studies applied a binary classification algorithm to identify ERM cases from normal cases using OCT images [40, 41]. However, the ERM regions were not segmented in the OCT images, which limits the value of these methods in assisting clinicians in reading OCT images. There is relatively little literature on the use of DL to further identify and diagnose ERM, and the clinical diagnosis of ERM still relies on the subjective judgment of the examiner, even though ERM is an abnormality visible on OCT. DL models have been shown to detect image features perceived by human observers, as well as more subtle abnormalities that human observers do not perceive. It is therefore important to study the clinical value of DL in ERM diagnosis. Accordingly, this study aimed to develop a deep learning algorithm to directly identify and locate ERM regions in OCT images.
Materials and Methods
In this retrospective study, data of ERM patients treated in our hospital were collected. This study was approved by the Ethics Review Committee of the Affiliated Hospital of Southwest Medical University (KY2019049). This study follows the principles of the Declaration of Helsinki.
The overall schema of this study is illustrated in Figure 1. We collected an OCT image dataset of ERM cases and invited senior clinicians to manually annotate the ERM regions as ground truths. A training and validation approach was adopted to develop a deep neural network algorithm with the U-Net structure [42]. Afterward, we further evaluated the usefulness of the AI method in assisting senior and junior clinicians in reading OCT images. The experiments showed that the proposed deep learning algorithm could segment the ERM regions in OCT images with encouraging, human-level accuracies. Furthermore, the algorithm could assist both senior and junior clinicians in improving the accuracy of their OCT image readings.
Patient Characteristics
We retrospectively collected the OCT images of ERM patients treated in the Department of Ophthalmology at the Affiliated Hospital of Southwest Medical University between January 1, 2018, and December 31, 2018. A total of 404 patients were studied, including 183 male patients and 221 female patients. The presence of ERM was confirmed by two experienced ophthalmologists in all images. All collected OCT images were anonymized before further processing to protect patient privacy.
OCT Image Acquisition
All subjects were examined using a Cirrus™ HD-OCT instrument (5000 Angioplex, version 9.5; Zeiss, Germany). The acquired OCT images were HD scans. OCT images with a signal intensity greater than 8 were selected (the image signal range was 0–10; the higher the signal intensity, the higher the image quality). In total, we collected 468 OCT images.
Image Annotation
We invited one expert with more than 20 years of experience to manually annotate the ERM region of interest (ROI) in OCT images using in-house developed software. The expert strictly followed a quality control protocol to ensure all annotations were of the same standard. Since the ERM regions might appear in multiple places, the expert drew the outlines of all visible ROIs in each OCT image. The annotations of all ROIs were processed to generate pixel-wise masks, which were used as ground truth. The raw OCT images and accompanying masks formed the dataset for later algorithm development and evaluation.
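As a concrete illustration, the following is a minimal sketch of how ROI outlines could be rasterized into pixel-wise masks, assuming the annotations are exported as lists of polygon vertices; the in-house software's actual export format is not described here, so the data layout, function name, and example coordinates are illustrative only.

```python
import numpy as np
from PIL import Image, ImageDraw


def polygons_to_mask(polygons, height, width):
    """Rasterize ERM ROI outlines (lists of (x, y) vertices) into one binary mask.

    Every annotated ROI is drawn as a filled region; overlapping ROIs simply merge.
    """
    canvas = Image.new("L", (width, height), 0)     # single-channel, background = 0
    drawer = ImageDraw.Draw(canvas)
    for outline in polygons:
        drawer.polygon([(float(x), float(y)) for x, y in outline], outline=1, fill=1)
    return np.asarray(canvas, dtype=np.uint8)       # pixel-wise ground-truth mask


# Illustrative usage: two hypothetical ROIs annotated on a 496 x 768 B-scan.
example_rois = [
    [(120, 200), (180, 190), (240, 205), (230, 220), (130, 215)],
    [(400, 180), (470, 175), (480, 195), (410, 200)],
]
mask = polygons_to_mask(example_rois, height=496, width=768)
print(mask.shape, int(mask.sum()))                  # mask size and number of ERM pixels
```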
After annotation, we randomly divided the images (n = 468) into a training cohort (n = 422, 90%) and an independent validation cohort (n = 46, 10%) at the patient level. This approach ensured that no images from the same individual appeared in both the training and validation datasets. The images from the training cohort formed the training dataset for algorithm training, while the images from the validation cohort were later used to evaluate the performance of the proposed algorithm and for comparisons with human readers.
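A minimal sketch of such a patient-level split is given below, assuming each image record carries a patient identifier; the record structure, function name, and random seed are illustrative assumptions, while the 10% validation fraction follows the text.

```python
import random
from collections import defaultdict


def split_by_patient(records, val_fraction=0.10, seed=42):
    """Split image records into training/validation sets at the patient level,
    so that no patient contributes images to both sets.

    `records` is a list of dicts such as {"patient_id": ..., "image_path": ...}.
    """
    by_patient = defaultdict(list)
    for record in records:
        by_patient[record["patient_id"]].append(record)

    patient_ids = sorted(by_patient)
    random.Random(seed).shuffle(patient_ids)

    train_records, val_records = [], []
    target_val_size = val_fraction * len(records)
    for pid in patient_ids:
        # Assign whole patients to validation until it holds ~10% of the images.
        if len(val_records) < target_val_size:
            val_records.extend(by_patient[pid])
        else:
            train_records.extend(by_patient[pid])
    return train_records, val_records
```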
Deep Learning Algorithm
In this study, our aim was to identify and localize the ERM regions in OCT images in order to examine the presence of ERM. To accomplish this, we developed a deep learning algorithm composed of multiple neural network layers. As shown in the schema of the deep learning algorithm (Fig. 2), the utilized U-Net was composed of one encoder and one decoder [42]. The encoder took raw OCT images as inputs and transformed them into a lower-dimensional latent space by performing downsampling, with the annotated masks serving as training targets. In this way, the abstract information in the images was extracted. Conversely, the decoder utilized the representations in the latent space, applied upsampling to restore the spatial size, and finally generated masks as outputs. The encoder involved multiple convolutional and pooling layers, while the decoder involved deconvolutional (transposed convolutional) and unpooling layers. Together, the encoder-decoder architecture has achieved competitive performance in many medical imaging tasks [18].
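For readers unfamiliar with this structure, the following is a minimal PyTorch sketch of a two-level U-Net-style encoder-decoder with a skip connection; the channel widths, depth, and output activation are illustrative assumptions rather than the exact configuration used in this study.

```python
import torch
import torch.nn as nn


def conv_block(in_ch, out_ch):
    # Two 3x3 convolutions with ReLU, the basic U-Net building block.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(inplace=True),
    )


class TinyUNet(nn.Module):
    """Two-level U-Net: the encoder downsamples, the decoder upsamples with a skip connection."""

    def __init__(self, in_ch=1, out_ch=1):
        super().__init__()
        self.enc1 = conv_block(in_ch, 32)
        self.pool = nn.MaxPool2d(2)
        self.enc2 = conv_block(32, 64)
        self.up = nn.ConvTranspose2d(64, 32, kernel_size=2, stride=2)  # learned upsampling
        self.dec1 = conv_block(64, 32)            # 64 = 32 (skip) + 32 (upsampled)
        self.head = nn.Conv2d(32, out_ch, kernel_size=1)

    def forward(self, x):
        s1 = self.enc1(x)                         # full-resolution features
        bottleneck = self.enc2(self.pool(s1))     # lower-resolution latent features
        up = self.up(bottleneck)                  # restore spatial size
        out = self.dec1(torch.cat([up, s1], dim=1))
        return torch.sigmoid(self.head(out))      # per-pixel ERM probability


# Illustrative usage on a random 256 x 256 single-channel input:
# model = TinyUNet(); mask_prob = model(torch.randn(1, 1, 256, 256))
```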
We implemented the deep learning algorithm in the Python (3.7.3) programming language with the publicly available libraries NumPy (1.16.2), PyTorch (1.1.0), and CUDA (10.1.105). We trained and evaluated the algorithm on a computing server equipped with an NVIDIA Tesla P40 GPU.
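In that environment, a training loop could be sketched as below; the random tensors stand in for the real OCT dataset, and the pixel-wise binary cross-entropy loss and learning rate are assumptions, while the Adam optimizer, two epochs, and batch size of one follow the settings reported in the Results. The TinyUNet class refers to the sketch above.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Illustrative stand-in for the real OCT dataset: tensors of images and masks.
images = torch.randn(8, 1, 256, 256)                 # normalized OCT B-scans
masks = (torch.rand(8, 1, 256, 256) > 0.9).float()   # pixel-wise ERM ground truth
loader = DataLoader(TensorDataset(images, masks), batch_size=1, shuffle=True)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = TinyUNet().to(device)                        # model sketch from above
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = torch.nn.BCELoss()                       # assumed pixel-wise loss

for epoch in range(2):                               # two epochs, as reported
    for image, mask in loader:
        image, mask = image.to(device), mask.to(device)
        optimizer.zero_grad()
        pred = model(image)                          # per-pixel ERM probability
        loss = criterion(pred, mask)
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch + 1}: loss {loss.item():.4f}")
```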
Performance Evaluation
For a given OCT image, there might be one or multiple ERM regions present. In practice, the major aim of clinicians is to examine the presence of ERM regardless of the number of regions. If the deep learning algorithm successfully identifies at least one ERM region among all presented regions, that is sufficient to decide whether the OCT image shows ERM. Therefore, we introduced the image-level accuracy, with $\mathrm{image}_i = 1$ if at least one ERM region was correctly identified in the $i$th image, and $\mathrm{image}_i = 0$ otherwise. From the perspective of ROIs, we introduced the ERM region-level accuracy to describe how many ERM regions were successfully identified and localized in an image, with $\mathrm{ROI}_i$ denoting the number of ERM regions correctly identified in the $i$th image. In other words, for the given $i$th OCT image, the value of $\mathrm{image}_i$ indicated whether the image was judged an ERM case or a non-ERM case, while the value of $\mathrm{ROI}_i$ indicated how many ERM regions in the image were identified. Both $\mathrm{image}_i$ and $\mathrm{ROI}_i$ are metrics based on individual images. To describe the average performance of the algorithm on a dataset of OCT images, we introduced the corresponding average accuracies as

$$\mathrm{Accuracy}_{\mathrm{image}} = \frac{1}{n}\sum_{i=1}^{n} \mathrm{image}_i,$$

where $n$ is the number of images, and

$$\mathrm{Accuracy}_{\mathrm{ROI}} = \frac{1}{m}\sum_{i=1}^{n} \mathrm{ROI}_i,$$

where $m$ is the total number of ERM regions in all images.
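Once the judge has determined, for each validation image, how many of its annotated ERM regions were correctly identified, the two average accuracies can be computed as in the following sketch; the data structure and function name are illustrative.

```python
def average_accuracies(per_image_counts):
    """Compute the image-level and ROI-level average accuracies.

    `per_image_counts` is a list with one (identified, annotated) tuple per image,
    where `identified` is the number of correctly identified ERM regions and
    `annotated` is the number of annotated ERM regions in that image.
    """
    n = len(per_image_counts)                                 # number of images
    m = sum(annotated for _, annotated in per_image_counts)   # total ERM regions
    image_hits = sum(1 for identified, _ in per_image_counts if identified >= 1)
    roi_hits = sum(identified for identified, _ in per_image_counts)
    return image_hits / n, roi_hits / m


# Toy usage: three images containing 2, 2, and 1 annotated ERM regions.
image_acc, roi_acc = average_accuracies([(1, 2), (2, 2), (0, 1)])
print(f"image-level accuracy {image_acc:.2%}, ROI-level accuracy {roi_acc:.2%}")
```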
To further evaluate the value of the proposed algorithm in assisting clinicians, we conducted comparison experiments. We first randomly divided the 46 images of the validation dataset into two sets: one with 20 images and another with 26 images. We invited one senior clinician with about 20 years of experience and one junior clinician with 5 years of experience to participate. The two clinicians were first trained to follow the same annotation protocol. In the first experiment, the first set (dataset A) of 20 raw OCT images, without any annotations predicted by the algorithm, was provided for the senior and junior clinicians to annotate. In the second experiment, the second set (dataset B) of 26 OCT images, with the annotations predicted by the algorithm provided for reference, was given to the participants. After the two experiments, an expert clinician served as the judge to examine their annotation results. The annotations judged correct were used to calculate the aforementioned $\mathrm{image}_i$ and $\mathrm{ROI}_i$ for the senior and junior clinicians, respectively. The results of both clinicians, obtained with and without the assistance of the algorithm's predictions, were then compared with those of the algorithm.
Results
Patient Characteristics and OCT Images
A summary of the patient characteristics is provided in Table 1. OCT images were retrospectively collected from a total of 404 ERM patients. We randomly divided all 468 OCT images into two separate datasets, namely one training dataset (n = 422, 90%) and one validation dataset (n = 46, 10%).
Performance
We first trained the algorithm on the OCT images of the training dataset. In this stage, we applied the Adam optimizer, training for two epochs with a batch size of one. Afterward, the algorithm was evaluated on the validation OCT image dataset. The predicted segmentations were visualized on the OCT images and assessed by the judge to determine whether the results agreed with the ground truth. As a result, among the total 46 images in the validation set, the algorithm successfully identified 44 images as ERM cases, corresponding to an image-level average accuracy of 95.65%. Among the total 71 ERM ROIs, the algorithm correctly identified 64 ROIs, an ROI-level average accuracy of 90.14%.
Figure 3 shows examples of the segmentations of ERM regions produced by the algorithm. As shown in the examples, multiple ERM regions were segmented, agreeing with the manual annotations by the human expert. The evaluation on the validation dataset indicated that the algorithm was capable of accurately identifying and locating ERM regions.
Table 2 provides the results of the comparison experiments for the algorithm and the two participants. As summarized in the table, the algorithm achieved a human level of accuracy in identifying ERM cases and ROIs. The results of the participants improved significantly with the assistance of the algorithm compared to the results obtained without it. In particular, the junior clinician benefited markedly from the algorithm's assistance, with improvements that brought the results close to those of the senior clinician.
Discussion
In this study, we developed an AI algorithm based on a deep learning neural network to automatically identify and locate ERM in OCT images. Our results showed that the deep learning algorithm is capable of achieving accuracies in encouraging agreement with professional clinicians. Furthermore, our experiments showed that the proposed algorithm could significantly improve the accuracy of clinicians in subjectively detecting the presence of ERM on OCT images. Our findings suggest that this quantitative and objective method, using a state-of-the-art deep learning algorithm, may assist clinicians in ERM diagnosis and has important implications for wider clinical practice.
Accurate diagnosis of ERM is critical for surgical treatment. At present, OCT imaging has become a convenient method for clinicians in retinal diagnosis. In a typical ERM diagnosis, a clinician needs to examine OCT images to determine the presence of ERM. However, this subjective manual examination relies mainly on the experience of the individual clinician, which inevitably leads to deviations among clinicians and difficulties in diagnostic quality control. The examination is also challenging because the ERM may present as only a small region in blurry OCT images. Furthermore, OCT imaging devices can easily produce a large number of images, and it would be a time-consuming task for clinicians to manually examine all generated OCT images. Therefore, with the advances in OCT imaging technologies, it is necessary to develop an AI method to automatically examine OCT images for ERM.
Deep learning algorithms have attracted significant attention in medical image analysis [18, 43‒47]. However, compared to the abundant literature on retinal diseases, only a few reported studies have focused on diagnosing ERM with deep learning algorithms based on OCT images [40, 41]. In a previous study, deep learning was applied to discriminate ERM cases from normal cases in OCT images [40]. The reported algorithm slightly outperformed the clinicians in classifying the OCT images [40]. Similarly, in another study, support vector machine and deep learning methods were compared in detecting ERM using 3D-OCT images [41]. However, as pointed out by the authors, that work only focused on comparing the discriminating ability of binary classification methods using 3D-OCT images and did not investigate the value of these methods in clinical settings [41]. Furthermore, 3D-OCT images are not as easy to obtain as 2D OCT images, limiting the clinical adoption of the approach. Although the methods of these two studies can assist clinicians by identifying ERM images, they remain unsatisfactory because they only classify OCT images and return a label of ERM or normal without any visualized segmentation of ERM in the OCT images. Unlike these previous studies, this study directly identifies and locates ERM regions in OCT images with competitive accuracies at both the image level and the ERM region level. The proposed method is capable of automatically generating a visualization of the pixel-wise segmentation in OCT images. As our experiments showed, this advantage can significantly assist clinicians in ERM diagnosis.
Our study contributes to the literature on applying AI technologies to ERM diagnosis based on OCT images. The proposed deep learning algorithm achieved acceptable accuracy in segmenting ERM on OCT images and showed promising value in assisting clinicians. First, this approach is capable of generating visualized segmentations of ERM, which can improve the accuracy of clinicians in ERM diagnosis. The objective and quantitative results may help avoid the subjective bias of clinicians. Second, this fully automated approach can significantly improve clinical efficiency by saving clinicians' time. With the help of the deep learning algorithm, the number of OCT imaging sessions can increase to allow more careful examination without concerns about increased reading time. Lastly, this study also adds new insights to the line of studies using AI technologies in the diagnosis of retinal diseases.
However, this study still has several limitations. First, the size of the dataset is small. AI methods, especially deep learning algorithms, usually require data from a large number of cases. Although the proposed algorithm trained on this small dataset performs well, it can be further refined by training with more accumulated OCT images. As another limitation, the present study only focuses on ERM segmentation rather than discriminating ERM from normal cases; therefore, our dataset contains only ERM images. In future research, the algorithm needs to be further expanded and investigated so that it can classify ERM images against normal cases and simultaneously generate visualizations of all ERM presences in the OCT images. Achieving these two objectives would significantly enhance the value of the deep learning approach. Finally, this study is based on 2D OCT images rather than 3D OCT images. It would be worth further investigating deep learning algorithms capable of automatically segmenting ERM in 3D OCT images, which might further assist clinicians in ERM diagnosis.
Conclusion
In conclusion, deep learning algorithms can automatically identify and locate ERM in OCT images with promising accuracy. The results showed that the proposed algorithm could also improve the accuracy of clinicians in OCT image readings. This approach provides encouraging quantitative and objective results, which may improve clinical diagnostic accuracy and efficiency. This study shows the attractive potential of deep learning applications in diagnosing ERM and other retinal diseases.
Statement of Ethics
In this retrospective study, all data were collected from ERM patients treated in our hospital. All patient information was anonymized. The data presented in this study were provided by the Affiliated Hospital of Southwest Medical University. Written informed consent from participants was not required for this study in accordance with local guidelines. Ethics approval for this study was obtained from the Ethics Review Committee of the Affiliated Hospital of Southwest Medical University (KY2019049).
Conflict of Interest Statement
The authors report no conflicts of interest.
Funding Sources
This study was supported by the National Natural Science Foundation (82272077), the National Key Research and Development Program (2020YFF0305104), the Key Research and Development Project of the Science & Technology Department of Sichuan Province (2020YFS0324), the Applied Basic Research Project of the Science & Technology Department of Luzhou City (2018-JYJ-45), and the Youth Innovation in Medical Research in Sichuan Province program (Q15014).
Author Contributions
Conceived and designed the experiments: Yong Tang, PhD, Xiaorong Gao, MS, Weijia Wang, MS, Yue He, MD, and Yujiao Dan, MS. Performed the experiments: Yong Tang, PhD, Xiaorong Gao, MS, Weijia Wang, MS, and Yujiao Dan, MS. Analyzed the data: Yong Tang, PhD, Weijia Wang, MS, and Linjing Zhou, BS. Contributed reagents/materials/analysis tools: Xiaorong Gao, MS, Yujiao Dan, MS, Song Su, MD, Jiali Wu, MS, and Hongbin Lv, MD.
Data Availability Statement
The data that support the findings of this study are available on request from the corresponding author. The data are not publicly available due to their containing information that could compromise the privacy of research participants.
Additional Information
Yong Tang and Xiaorong Gao contributed equally to this work.