Abstract
Introduction: In patients with acute ischemic stroke, the location and volume of an irreversible infarct core determine prognosis and treatment. We aimed to determine if automated CT perfusion (CTP) is non-inferior to diffusion-weighted imaging (DWI) or fluid-attenuated inversion recovery (FLAIR) in predicting the acute infarct core. Methods: In this systematic review and meta-analysis, we searched MEDLINE and EMBASE from 1960 to December 2020. Five outcome measures were examined: volumetric difference, volumetric correlation, sensitivity and specificity at the patient level, Dice coefficient, and sensitivity and specificity at the voxel level. A random-effects meta-analysis was performed for volumetric difference and correlation. Results: From 3,986 studies retrieved, 48 studies met our inclusion criteria with 46 studies on anterior circulation, one study on posterior circulation, and one study on lacunar infarct strokes. In anterior circulation stroke, there were no significant mean volumetric differences between CTP and acute DWI (cerebral blood flow [CBF] 0.52 mL, 95% CI [−0.07, 1.11], I2 0.0%; relative CBF [rCBF] 3.01 mL, 95% CI [−0.46, 6.48], I2 82.6%; relative cerebral blood volume [rCBV] −12.84 mL, 95% CI [−38.56, 12.88], I2 96.2%) and between CTP and delayed DWI or FLAIR (rCBF −1.29 mL, 95% CI [−6.49, 3.92], I2 91.8%; rCBV −5.80 mL, 95% CI [−16.20, 4.60], I2 84.2%). Mean correlation between CTP and acute DWI was 0.90 (95% CI [0.80, 0.95], I2 60.0%) for rCBF and 0.84 (95% CI [0.58, 0.94], I2 93.5%) for rCBV. Mean correlation between CTP and delayed DWI or FLAIR was 0.74 (95% CI [0.57, 0.85], I2 94.6%) for rCBF and 0.90 (95% CI [0.69, 0.97], I2 93.1%) for rCBV. Sensitivity and specificity at the patient level were reported by three studies and Dice coefficient by four studies. Statistical analysis could not be performed for sensitivity and specificity at the voxel level. Limited evidence was available for posterior circulation or lacunar infarct strokes. 
Conclusion: Due to significant heterogeneity and insufficient high-quality studies reporting each outcome, there is insufficient evidence to reliably determine the accuracy of CTP prediction of the infarct core compared to DWI or FLAIR.
Introduction
In patients with acute ischemic stroke, the location and volume of an irreversible infarct core determine prognosis and treatment. Thrombolysis [1‒3] and thrombectomy [4, 5] may improve patient outcomes when administered beyond traditional time windows in patients with a reversible ischemic penumbra that is distinct from an irreversible infarct core. Infarct core volumes can be measured by different imaging modalities, including diffusion-weighted imaging (DWI), fluid-attenuated inversion recovery (FLAIR), and computed tomography perfusion (CTP).
DWI is the preferred reference standard for defining the acute infarct lesion because it can detect alterations in water diffusion within minutes of the onset of ischemia [6]. A potential limitation of DWI is partial reversibility of early DWI infarct lesions, especially after reperfusion, but this is relatively uncommon [7]. FLAIR is an alternative to DWI for defining the acute infarct core [8], but it does not reliably detect the infarct lesion within the first few hours after the onset of ischemia [9]. Other limitations of both DWI and FLAIR are limited access to magnetic resonance imaging (MRI) scanners, longer acquisition times than CT, patient contraindications, and intolerance.
CTP provides a faster and more accessible alternative to DWI and FLAIR in defining the acute infarct core [10]. Following CTP data acquisition, source images are transferred to a post-processing workstation where attenuation-time curves and perfusion parameters are calculated [11]. These parameters include cerebral blood flow (CBF), cerebral blood volume (CBV), mean transit time, time-to-peak, and time-to-maximum (Tmax). Each parameter is displayed as a parametric map of the brain with a colour scale representing the values. These parameters can be interpreted clinically where the infarct core is an area of reduced CBF and CBV, while the surrounding penumbra has prolonged time-to-peak or mean transit time. Multiple observational studies have validated the ability of CTP parameters to delineate the infarct core by applying automatic threshold values and comparing against DWI and FLAIR [12, 13]. For example, a relative CBF (rCBF) value of <30% is recommended by RAPID software to estimate the infarct core [14]. Limitations of CTP include radiation and contrast exposure. Additionally, patient factors such as proximal arterial stenosis or low cardiac output can affect contrast delivery and calculations of perfusion parameters [11]. Indeed, the accuracy of automated CTP estimation of infarct core has recently been called into question [15, 16].
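The thresholding step described above can be sketched in a few lines. This is a minimal illustration only, not the RAPID implementation: the function name, the choice of a single reference CBF value (for example, a mean taken from normal tissue), and the voxel volume are all assumptions made for the example.

```python
def rcbf_core_volume_ml(cbf_values, reference_cbf, voxel_volume_ml, threshold=0.30):
    """Estimate core volume from voxels whose relative CBF is below a threshold.

    cbf_values      -- CBF per voxel (any consistent unit)
    reference_cbf   -- hypothetical normal-tissue CBF used for normalisation
    voxel_volume_ml -- volume of one voxel in mL (assumed known from the scan)
    threshold       -- relative CBF cut-off; 0.30 mirrors the rCBF <30% example
    """
    if reference_cbf <= 0:
        raise ValueError("reference_cbf must be positive")
    # Count voxels whose CBF relative to the reference falls below the cut-off.
    core_voxels = sum(1 for cbf in cbf_values if cbf / reference_cbf < threshold)
    return core_voxels * voxel_volume_ml


# Toy example: six voxels, three of which fall below 30% of the reference.
toy_cbf = [50.0, 12.0, 9.0, 40.0, 14.9, 20.0]
toy_volume = rcbf_core_volume_ml(toy_cbf, reference_cbf=50.0, voxel_volume_ml=0.008)
```

Commercial packages also apply steps such as smoothing and restricting the search to tissue with delayed perfusion, which this sketch deliberately omits.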
Two previous systematic reviews [17, 18] have determined overall sensitivity and specificity of CTP for diagnosing acute ischemic stroke. These systematic reviews were predominantly composed of studies that used visual assessment of perfusion maps to outline the infarct core, which may be inaccurate at identifying penumbra-infarct mismatch when compared to automated threshold criteria [19]. Furthermore, to our knowledge, no systematic review [17, 18, 20, 21] has examined volumetric or spatial accuracy measures of CTP compared to DWI or FLAIR. These measures include volumetric differences between CTP and DWI or FLAIR, volumetric correlation, Dice similarity coefficient, and sensitivity and specificity at the voxel level [12‒14]. We aimed to conduct an updated systematic review and meta-analysis to determine if automated CTP is non-inferior to DWI or FLAIR in predicting the infarct core in patients with acute ischemic stroke.
Methods
We conducted a systematic review and meta-analysis according to an a priori protocol and reported it according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses framework [22] (online suppl. Data; for all online suppl. material, see www.karger.com/doi/10.1159/000524916). The protocol and data supporting the findings of this study are available from the corresponding author upon reasonable request.
Search Strategy
An electronic search of MEDLINE and EMBASE was conducted from 1960 to 23 December 2020 without language restrictions. We used the key terms stroke, brain ischemia, computer tomography perfusion, and infarct core (online suppl. Data). Two reviewers (N.E.L. and B.C.) independently screened the titles and abstracts obtained from the search and excluded irrelevant reports. The two authors retrieved full-text articles for the remaining references and independently screened the full-text articles for inclusion. Abstracts without a full publication were treated as full text. We also screened the reference lists of included studies and review articles to identify additional publications.
Inclusion and Exclusion Criteria
We included all studies on acute ischemic stroke in adult patients that identified the infarct core using automated CTP with follow-up DWI or FLAIR within 7 days as the reference standard. This included studies with acute DWI imaging and studies with delayed DWI or follow-up FLAIR imaging. We also included studies with multi-parametric modelling and deep learning.
We excluded studies that required human input for CTP infarct identification, e.g., manual outlining of the infarct core on CTP; studies that did not perform comparisons between the CTP and DWI or FLAIR infarct cores; and studies that did not report the CTP parameter utilized. Reasons for exclusion of the full-text articles were recorded. Disagreements on inclusion of studies were resolved by a third reviewer (A.B.).
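Quality Assessment
Risk of bias and applicability concerns of the included studies were assessed with the QUADAS-2 tool [23].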
Data Extraction
We extracted data on study design, patient population and size, reperfusion therapies and timing, CTP parameters and thresholds for defining the infarct core, sensitivity and specificity for each threshold where provided, assessment of volumetric and spatial accuracy between CTP and MRI, image acquisition protocols and timing, processing, and statistical methods. Data were extracted by N.E.L. and entered into a spreadsheet.
Data Analysis
We summarized study demographics, reperfusion therapies, image acquisition, processing, and statistical methods. We identified five outcome measures of diagnostic or spatial accuracy of the acute infarct core. These were volumetric difference, volumetric correlation, sensitivity or specificity at the patient level, Dice similarity coefficient, and sensitivity or specificity at the voxel level.
Mean volumetric difference was defined as the difference between CTP and DWI or FLAIR infarct volume. Median volumetric differences and interquartile ranges were converted to mean volumetric differences and standard deviation [25]. Studies were considered to report mean volumetric differences if they reported either mean or median with a measure of dispersion. Mean volumetric differences were analysed by a random-effects meta-analysis with restricted maximum likelihood method. Meta-analysis was performed separately for studies with acute DWI imaging and studies with delayed DWI or FLAIR imaging.
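The two computational steps in this paragraph can be illustrated as follows. This is a sketch under stated assumptions rather than the analysis code used in the review (which was run with the R meta package): the conversion uses a common large-sample approximation, which may differ from the method in reference [25], and the between-study variance is obtained with a simple fixed-point iteration for the restricted maximum likelihood estimate. All function names are hypothetical.

```python
import math


def mean_sd_from_median_iqr(q1, median, q3):
    """Approximate mean and SD from quartiles (large-sample approximation).

    Assumption: mean ~ (q1 + median + q3) / 3 and SD ~ IQR / 1.35.
    """
    return (q1 + median + q3) / 3.0, (q3 - q1) / 1.35


def reml_random_effects(effects, variances, tol=1e-10, max_iter=500):
    """Random-effects pooling with tau^2 estimated by REML fixed-point iteration.

    effects   -- per-study mean volumetric differences
    variances -- per-study sampling variances (squared standard errors)
    Returns (pooled effect, standard error of pooled effect, tau^2).
    """
    tau2 = 0.0
    for _ in range(max_iter):
        w = [1.0 / (v + tau2) for v in variances]
        mu = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
        num = sum(wi ** 2 * ((yi - mu) ** 2 - vi)
                  for wi, yi, vi in zip(w, effects, variances))
        # REML update, truncated at zero because a variance cannot be negative.
        new_tau2 = max(0.0, num / sum(wi ** 2 for wi in w) + 1.0 / sum(w))
        converged = abs(new_tau2 - tau2) < tol
        tau2 = new_tau2
        if converged:
            break
    w = [1.0 / (v + tau2) for v in variances]
    mu = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    return mu, math.sqrt(1.0 / sum(w)), tau2
```

A 95% CI for the pooled difference would then be the pooled effect plus or minus 1.96 standard errors under a normal approximation.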
Volumetric correlation outcome measures were Pearson’s R, intraclass correlation, and Spearman’s correlation. These correlation parameters were converted to Fisher’s Z and analysed using a random-effects meta-analysis with restricted maximum likelihood method. Meta-analysis was performed separately for studies with acute DWI imaging and studies with delayed DWI or FLAIR imaging. The effect of different types of correlation parameters was planned to be analysed on meta-regression.
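As an illustration of the transformation step, the sketch below converts a correlation coefficient to Fisher's Z, attaches the usual approximate variance of 1/(n − 3), and back-transforms a pooled Z and its confidence limits to the correlation scale. The function names are hypothetical, and applying the same variance formula to intraclass and Spearman's correlations is an approximation assumed here for simplicity.

```python
import math


def fisher_z(r, n):
    """Return Fisher's Z for correlation r and its approximate variance 1/(n - 3)."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    if n <= 3:
        raise ValueError("n must be greater than 3")
    return math.atanh(r), 1.0 / (n - 3)


def z_to_r(z):
    """Back-transform Fisher's Z to the correlation scale."""
    return math.tanh(z)


def pooled_r_with_ci(z_pooled, se_pooled, z_crit=1.96):
    """Back-transform a pooled Z and its normal-approximation CI to correlations."""
    lower = z_to_r(z_pooled - z_crit * se_pooled)
    upper = z_to_r(z_pooled + z_crit * se_pooled)
    return z_to_r(z_pooled), lower, upper
```

Because the back-transformation is non-linear, the resulting interval is asymmetric around the pooled correlation, as in the intervals reported in the Results.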
Sensitivity and specificity at the patient level were presented in a table due to low number of studies anticipated from previous systematic reviews [17, 18]. Dice similarity coefficients were also summarized in a table.
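For readers unfamiliar with the Dice similarity coefficient, a minimal sketch of its calculation from two binary lesion masks is given below. The flat-list mask representation and the function name are assumptions for illustration; returning 1.0 when both masks are empty is one possible convention, not something specified in the included studies.

```python
def dice_coefficient(predicted_mask, reference_mask):
    """Dice = 2 * |A and B| / (|A| + |B|) for two equal-length binary masks."""
    if len(predicted_mask) != len(reference_mask):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for p, r in zip(predicted_mask, reference_mask) if p and r)
    total = sum(1 for p in predicted_mask if p) + sum(1 for r in reference_mask if r)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * overlap / total


# Toy example: CTP core and DWI core each cover two voxels and share one.
toy_dice = dice_coefficient([1, 1, 0, 0], [1, 0, 1, 0])
```

A value of 1 indicates perfect spatial overlap and 0 indicates none, so the coefficient penalises spatial mismatch even when the two volumes are identical.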
Sensitivity and specificity at the voxel level were initially planned to be analysed by a hierarchical summary receiver-operating characteristic curve using the Rutter and Gatsonis model [26]. Due to the lack of true-positive (TP), false-positive (FP), true-negative (TN), and false-negative (FN) values at the voxel level, this analysis could not be conducted. We were also unable to generate forest plots of sensitivity or specificity due to the lack of standard deviation or confidence interval (CI) values.
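The voxel-level counts referred to above can be derived from a pair of binary masks as sketched below. The helper names and the flat-list mask format are hypothetical; the point is only to show which quantities (TP, FP, TN, FN) a study would need to report for a hierarchical model to be fitted.

```python
def voxel_confusion_counts(predicted_mask, reference_mask):
    """Return (TP, FP, TN, FN) voxel counts, treating the reference mask as truth."""
    if len(predicted_mask) != len(reference_mask):
        raise ValueError("masks must have the same number of voxels")
    tp = fp = tn = fn = 0
    for p, r in zip(predicted_mask, reference_mask):
        if p and r:
            tp += 1
        elif p and not r:
            fp += 1
        elif not p and r:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def sensitivity_specificity(tp, fp, tn, fn):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy example with five voxels.
toy_counts = voxel_confusion_counts([1, 1, 0, 0, 0], [1, 0, 1, 0, 0])
toy_sens, toy_spec = sensitivity_specificity(*toy_counts)
```

Because non-infarcted voxels usually far outnumber infarcted ones, specificity computed this way depends strongly on how much of the brain is included in the mask.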
Funnel plots were used to assess reporting bias in meta-analysis with at least 10 studies or subgroups [27]. Heterogeneity was explored by meta-regression if meta-analysis was performed on at least 10 studies or subgroups [27]. The following factors were planned to be investigated on meta-regression: CTP coverage, post-processing algorithm, threshold, method of delineating the infarct core on MRI, median time from stroke onset to CTP, median time from CTP to MRI, MRI modality, type of reperfusion therapy, time to reperfusion, whether reperfusion or recanalization was successful, and whether the study was validating a previously defined threshold. Due to a limited number of studies, each covariate was tested by itself on meta-regression. The Bonferroni correction was applied for meta-regression p values. N.E.L. conducted statistical analyses under supervision by M.B. Statistical analyses were conducted in R-4.0.3 with meta package 4.18-0 [28].
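Because each covariate was tested separately, the Bonferroni correction amounts to multiplying every meta-regression p value by the number of covariates tested and capping the result at 1. A minimal sketch, with a hypothetical function name and made-up p values:

```python
def bonferroni_adjust(p_values):
    """Multiply each p value by the number of tests, capping the result at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Made-up p values for three covariates tested one at a time.
toy_adjusted = bonferroni_adjust([0.01, 0.04, 0.5])
```

Equivalently, the unadjusted p values can be compared against a significance level of 0.05 divided by the number of covariates.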
Results
The search identified a total of 3,986 records. After removing duplicates and screening title and abstracts for irrelevant studies, there were 163 abstracts remaining. A search of the references of included studies and previous systematic reviews [17, 18, 20, 21] identified 4 additional studies. A total of 167 full-text papers were reviewed and 48 studies [8, 12‒14, 29‒72] met the inclusion criteria (Fig. 1). These studies included a total of 3,782 participants. There were 46 studies on anterior circulation stroke (n = 3,683), one study on posterior circulation stroke (n = 83), and one study on lacunar infarcts (n = 16) (online suppl. Table 1).
Flowchart of the studies retrieved from the systematic review search strategy.
Mean/median age of the anterior circulation stroke participants ranged from 57 to 77 years and mean/median admission National Institutes of Health Stroke Scale (NIHSS) scores ranged from 8 to 18. Mean/median ages of the posterior and lacunar stroke groups were 71 and 63 years, respectively. Median admission NIHSS scores of the posterior and lacunar stroke groups ranged between 3 and 4.
Mean/median time to CTP from stroke onset in the anterior circulation group ranged from 1.82 to 24 h. Median time to CTP from stroke onset was 4.08 h in the lacunar infarct group and was not reported in the posterior stroke group. In the anterior circulation stroke group, 3,157 participants had DWI and 526 participants had FLAIR imaging as the reference standard. Of the 3,157 participants who received DWI in the anterior circulation stroke group, 1,304 participants had acute DWI imaging (mean/median time of 0.28–4.10 h between CTP and DWI) and 1,738 participants received delayed DWI imaging (mean/median time of 15–38 h between CTP and DWI). Of the 1,738 participants with delayed DWI imaging, 1,485 participants had successful reperfusion. Successful reperfusion was defined according to each individual study’s definition. It was unclear if the remaining 115 DWI anterior circulation group participants received acute or delayed DWI imaging and whether they had successful reperfusion. Of the 526 participants receiving FLAIR imaging (mean/median time of 24–168 h between CTP and FLAIR), 427 participants had successful reperfusion. All participants in the posterior and lacunar stroke groups received delayed DWI imaging.
Quality Assessment
An overview of the results of the QUADAS-2 assessment is shown in Figure 2 and online supplementary Figure 1. Overall, 21 of the 48 studies were judged to be low risk of bias or applicability concerns for all domains. Almost all studies were observational cohort studies. Three studies were assessed as high risk of bias for patient selection with one study utilizing a case-control design and two studies including patients from an artificial intelligence competition dataset. Another 13 studies were judged to be unclear risk of bias for patient selection as insufficient information was provided to ascertain if a consecutive sample was enrolled. Most studies were low risk of bias for the index test domain with adequate reporting of CTP protocols. Twelve studies did not report CTP coverage and were assessed as unclear risk of bias for index test domain. One study did not report their thresholds or CTP protocol and was assessed as high risk of bias for index test domain. All included studies utilized DWI or FLAIR and were low risk of bias for the reference standard. Most studies were low risk of bias for flow and timing of analysis. Ten studies were judged to be high risk of bias for flow and timing: 4 studies included patients with unsuccessful reperfusion in delayed MRI imaging and did not account for potential infarct core growth between CTP and MRI; another 5 studies provided minimal information on patient flow and timing between scans; one study utilized reperfusion therapy between CTP and acute DWI imaging. Finally, 6 studies were assessed as unclear risk of bias for flow and timing for not reporting time to successful reperfusion or time to reperfusion therapy in the context of delayed MRI imaging. Studies that were judged to be high risk in at least one domain were excluded from the analysis. Additionally, another four studies were excluded from the analysis: two studies were subsets of other included studies, and it was uncertain if another two studies were abstracts of a subsequently published full-text report (online suppl. Table 1).
Risk of bias and applicability concerns graph: review authors’ judgements about each QUADAS 2 domain presented as percentages across included studies.
Anterior Circulation Stroke
Overall, 30 studies focusing on anterior circulation strokes were included in the analysis. Five outcome measures were identified in the anterior circulation stroke group. Each study reported at least one outcome measure. These outcome measures were as follows: (1) mean volumetric difference reported by 16 studies (n = 1,149); (2) correlation reported by 14 studies with 9 studies reporting Pearson’s R coefficient (n = 634), 2 studies reporting intraclass correlation (n = 231), and 3 studies reporting Spearman’s correlation (n = 183); (3) sensitivity and specificity at the patient level reported by 3 studies (n = 134); (4) Dice coefficient reported by 4 studies (n = 366); and (5) sensitivity and specificity at the voxel level reported by 13 studies (n = 940).
There were no significant mean volumetric differences between CTP and acute DWI infarct volumes for CBF (mean difference 0.52 mL, 95% CI [−0.07, 1.11]), rCBF (mean difference 3.01 mL, 95% CI [−0.46, 6.48]), and rCBV (mean difference −12.84 mL, 95% CI [−38.56, 12.88]) (Fig. 3). Likewise, there were no significant mean volumetric differences between CTP and delayed DWI or FLAIR infarct volumes for rCBF (mean difference −1.29 mL, 95% CI [−6.49, 3.92]) and rCBV (mean difference −5.80 mL, 95% CI [−16.20, 4.60]) (Fig. 3). The funnel plot for CTP and delayed DWI or FLAIR rCBF analysis was suggestive of missing small studies with negative mean volumetric difference (online suppl. Fig. 2). Funnel plots were not generated for the other meta-analyses due to fewer than 10 included studies or subgroups. There were also insufficient studies reporting infarct volumes for other CTP parameters to perform meta-analysis. Significant heterogeneity was noted on CTP and acute DWI analysis for both rCBF (I2 82.6%, 95% CI [67.0, 90.8]) and rCBV (I2 96.2%, 95% CI [89.3, 98.6]). Minimal heterogeneity was noted on CTP and acute DWI analysis for CBF (I2 0.0%). Significant heterogeneity was also noted on CTP and delayed DWI or FLAIR analysis for both rCBF (I2 91.8%, 95% CI [87.6, 94.6]) and rCBV (I2 84.2%, 95% CI [52.9, 94.7]). Meta-regression was performed for CTP and delayed DWI or FLAIR rCBF meta-analysis. No variables were significant on meta-regression (online suppl. Table 2). Meta-regression was not performed for the other meta-analyses due to insufficient studies.
Forest plot of mean volumetric difference between CTP and DWI or FLAIR. Prediction intervals are calculated for meta-analysis with more than two studies. a CTP and acute DWI absolute CBF. b CTP and acute DWI rCBF. c CTP and acute DWI rCBV. d CTP and delayed DWI or FLAIR rCBF. e CTP and delayed DWI or FLAIR rCBV.
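The heterogeneity statistic and prediction intervals shown with these meta-analyses can be illustrated with the sketch below. It assumes the conventional definitions, which the text does not spell out: I2 is derived from Cochran's Q as max(0, (Q − df)/Q), and the prediction interval combines the between-study variance with the squared standard error of the pooled estimate, using a critical value from a t distribution with k − 2 degrees of freedom. The function names are hypothetical and the critical value is supplied by the caller.

```python
import math


def cochran_q(effects, variances):
    """Cochran's Q: weighted squared deviations from the fixed-effect mean."""
    w = [1.0 / v for v in variances]
    mu = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    return sum(wi * (yi - mu) ** 2 for wi, yi in zip(w, effects))


def i_squared_percent(effects, variances):
    """I2 (%) = max(0, (Q - df) / Q) * 100, with df = number of studies - 1."""
    q = cochran_q(effects, variances)
    df = len(effects) - 1
    if q <= 0.0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def prediction_interval(mu, se, tau2, t_crit):
    """Interval for the effect in a new study: mu +/- t_crit * sqrt(tau2 + se^2).

    t_crit is the t quantile with k - 2 degrees of freedom (caller-supplied),
    which is why the interval needs more than two studies.
    """
    half_width = t_crit * math.sqrt(tau2 + se ** 2)
    return mu - half_width, mu + half_width
```

When I2 is high, the between-study variance dominates the half-width, so the prediction interval can be much wider than the confidence interval for the pooled mean.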
Mean volumetric correlation between CTP and acute DWI infarct volumes was 0.90 (95% CI [0.80, 0.95]) for rCBF and 0.84 (95% CI [0.58, 0.94]) for rCBV (Fig. 4). Mean volumetric correlation between CTP and delayed DWI or FLAIR infarct core volumes was 0.74 (95% CI [0.57, 0.85]) for rCBF and 0.90 (95% CI [0.69, 0.97]) for rCBV (Fig. 4). Funnel plots were not generated due to insufficient studies. There was significant heterogeneity for CTP and acute DWI analysis for both rCBF (I2 60.0%, 95% CI [0.0, 90.7]) and rCBV (I2 93.5%, 95% CI [86.5, 96.8]). Similarly, significant heterogeneity was noted for CTP and delayed DWI or FLAIR analysis for both rCBF (I2 94.6%, 95% CI [91.7, 96.5]) and rCBV (I2 93.1%, 95% CI [85.5, 96.7]). Meta-regression was not performed for correlation meta-analyses due to insufficient studies.
Forest plot of volumetric correlation between CTP-predicted volumes and DWI or FLAIR reference volumes. Prediction intervals are calculated for meta-analysis with more than two studies. a CTP and acute DWI rCBF. b CTP and acute DWI rCBV. c CTP and delayed DWI or FLAIR rCBF. d CTP and delayed DWI or FLAIR rCBV.
Three studies were included in the analysis of sensitivity and specificity at the patient level. Sensitivity values ranged from 83.3% to 100% and specificity values ranged from 0% to 81.8% (Table 1). Two of the three studies had the same definitions for TP, FP, TN, and FN infarct lesions (online suppl. Fig. 3).
Mean or median Dice similarity coefficients ranged from 0.16 to 0.60 across the four included studies (Table 2). Three studies utilized conventional thresholds to determine the infarct core, while one study utilized a neural network.
The planned hierarchical summary receiver-operating characteristic curve meta-analysis for sensitivity and specificity at the voxel level was not performed due to lack of TP, FP, TN, and FN voxel values. Instead, the sensitivity, specificity, and area under the curve values for each study are reported in online supplementary Table 3. Five of the 13 included studies directly compared rCBF to rCBV using sensitivity and specificity at the voxel level (online suppl. Table 3). Additionally, sensitivity and specificity values for multi-parametric models, including one neural network model, were reported in three studies (online suppl. Table 4). However, there were insufficient data to compare these multi-parametric models to individual CTP parameters.
Posterior Circulation Stroke and Lacunar Infarcts
In the posterior circulation stroke group, CTP had a sensitivity of 31% and a specificity of 94% in the posterior fossa compared to DWI at the patient level. Four studies from the anterior circulation stroke analysis group also included patients with posterior circulation lesions (online suppl. Table 5). However, no subgroup analyses were provided for the posterior strokes in those studies.
In the lacunar infarct group, CTP identified 12/16 of the confirmed lacunar infarcts on DWI. Additionally, one of the studies from the anterior circulation stroke analysis group (online suppl. Table 6) that examined sensitivity and specificity at the patient level (n = 65) included patients with lacunar infarcts (n = 5) and focal infarcts in the middle cerebral artery (n = 9). The study reported that all false negative scans contained lacunar or small cortical infarcts.
Discussion
Summary of Main Results
This review aimed to determine if automated CTP is non-inferior to DWI or FLAIR in predicting the acute infarct core in patients with ischemic stroke. We included 48 observational studies with 3,782 participants. There were 46 studies focusing on anterior circulation stroke (n = 3,683), one study on posterior circulation stroke (n = 83), and one study on lacunar infarcts (n = 16). Twenty-one of the 48 studies were judged to be low risk of bias and applicability concerns for all QUADAS-2 domains [23]. Another 12 studies were assessed as high risk of bias for at least one domain, and the remaining 15 studies were assessed as unclear risk of bias for at least one domain. Studies judged to be high risk of bias for at least one domain were excluded from subsequent analysis. Another four studies were also excluded from the analysis due to potential double counting of participants from shared datasets with other included studies.
Overall, 30 studies were included in the anterior circulation group analysis. There were no significant mean volumetric differences between CTP and acute DWI (CBF 0.52 mL, 95% CI [−0.07, 1.11], I2 0.0%; rCBF 3.01 mL, 95% CI [−0.46, 6.48], I2 82.6%; rCBV −12.84 mL, 95% CI [−38.56, 12.88], I2 96.2%) and delayed DWI or FLAIR infarct volumes (rCBF −1.29 mL, 95% CI [−6.49, 3.92], I2 91.8%; rCBV −5.80 mL, 95% CI [−16.20, 4.60], I2 84.2%). Strong mean volumetric correlation was observed between CTP and acute DWI (rCBF 0.90, 95% CI [0.80, 0.95], I2 60.0%; rCBV 0.84, 95% CI [0.58, 0.94], I2 93.5%) and delayed DWI or FLAIR infarct volumes (rCBF 0.74, 95% CI [0.57, 0.85], I2 94.6%; rCBV 0.90, 95% CI [0.69, 0.97], I2 93.1%). Sensitivity and specificity at the patient level were reported across three studies, and Dice similarity coefficients were reported by four studies. Analysis of sensitivity and specificity at the voxel level could not be performed. Due to significant heterogeneity and insufficient studies reporting each outcome, there is insufficient evidence to reliably determine the accuracy of CTP prediction of infarct core. Nevertheless, the strong volumetric correlation may suggest that CTP volumetric prediction could be non-inferior to DWI or FLAIR. In the posterior circulation group, only one study reported CTP sensitivity and specificity at the patient level. In the lacunar infarct group, only one study reported the sensitivity of CTP when compared to DWI-confirmed lacunar infarcts. Hence, we were also unable to draw conclusions on the efficacy of automated CTP in determining the infarct core in posterior circulation and lacunar infarcts.
Potential Sources of Heterogeneity
Significant heterogeneity was observed across all outcomes in the anterior circulation group. This likely derives from differences among the pooled studies, including imaging and treatment protocols and patient factors. Meta-regression was planned to investigate heterogeneity for mean volumetric difference and correlation meta-analysis. Owing to an insufficient number of studies, meta-regression was only performed for rCBF mean volumetric difference comparing CTP to delayed DWI or FLAIR. No factors were found to be significant on the meta-regression.
Previous studies showed that variations in imaging and treatment protocols influence the optimal CTP parameters and accuracy of identifying the infarct core. Insufficient CTP coverage can result in underestimation of CTP infarct core compared to reference core on MRI [31, 51], particularly in the posterior fossa where limited coverage and beam-hardening artefact from bony interference limit the sensitivity of CTP in posterior strokes [67]. However, recent sensitivity analyses have not shown any effects of CTP coverage on the derived optimal thresholds for anterior circulation infarcts [49, 59]. Moreover, modern CT machines can perform whole-brain CTP, extending coverage to posterior strokes. Various post-processing methods can lead to different optimal thresholds and performance in identifying the infarct core [8, 38, 51, 57‒59, 70]. Examples of post-processing methods for calculating perfusion maps include maximum slope [38], partial deconvolution [38], singular value deconvolution [8, 13, 34, 36‒38, 42, 51, 57‒59, 63], singular value deconvolution with delay or dispersion correction [12, 33, 38, 41, 44, 45, 51‒54, 57, 59, 66], closed-form deconvolution [29, 40], block-circulant deconvolution [38], stroke-stenosis [38], and Bayesian algorithms [8, 58, 70]. Due to insufficient studies, we were unable to individually examine the different types of post-processing algorithms. We had planned to conduct meta-regression by comparing delay-independent and delay-dependent algorithms. In addition to post-processing algorithms, different software brands have other differentiating features such as noise reduction and smoothing methods [72]. Noise reduction aims to optimize the signal-to-noise ratios of different CTP parameters. However, CTP has inherently lower signal-to-noise ratios and hence higher measurement errors when compared to DWI [73]. This might explain why heterogeneity was high even though the meta-analysis showed minimal volume difference and strong volumetric correlation. Hence, it is likely that some of the observed heterogeneity is due to variations in software, post-processing algorithms, and measurement error.
The time between CTP and MRI can influence the measured accuracy of the CTP infarct core due to potential infarct core growth [16]. Studies that included participants without successful reperfusion in delayed MRI imaging will appear to have underestimated the final infarct core on CTP. Consequently, these studies were considered high risk of bias and excluded from the analysis. Meta-analysis was also performed separately for studies with early MRI imaging and delayed MRI imaging. Optimal CTP thresholds for predicting the infarct core also depend on the type of reperfusion therapy and time to reperfusion. Patients who received thrombectomy or achieved earlier reperfusion may have lower rCBF thresholds compared to patients who received thrombolysis [52] or later reperfusion [52, 66].
Patient factors can also influence the optimal CTP parameters and thresholds for the infarct core. These include baseline cerebral small vessel disease [63], collateral status [74], and cardiovascular status [11]. Patients with higher grade leukoaraiosis are known to have poor correlation between CTP acute infarct volume and final infarct volume on DWI [63]. Poor collateral status may be associated with infarct core overestimation [74]. Cardiovascular status such as cardiac output and proximal arterial stenosis can influence contrast bolus delivery [11]. Hence, these patient factors may also account for some of the observed heterogeneity.
Strengths and Weaknesses of the Review Process
Strengths of our review include a comprehensive electronic literature search of both full-text studies and abstracts without any language restrictions. We identified four additional studies through manual searching of the references of included studies and previous reviews [17, 18, 20, 21]. We are therefore confident that we have included most, if not all, eligible published studies.
Our review has several limitations. Insufficient studies reporting each type of outcome measure prevent reliable conclusions on the accuracy of CTP prediction of infarct core. Additionally, insufficient studies within most of the meta-analyses impede examination of reporting bias through funnel plots. The meta-analyses also pooled studies with different imaging and treatment protocols. Consequently, the resulting heterogeneity prevents reliable conclusions from being drawn. Furthermore, the inability to explain sources of heterogeneity on meta-regression precludes analysing which protocol factors may have significant impacts on heterogeneity and prediction of the infarct core.
Comparison to Previous Research
To our knowledge, this is the first systematic review examining volumetric and spatial outcomes of automated CTP compared to DWI or FLAIR. Previous systematic reviews and meta-analyses have examined sensitivity and specificity of CTP at the patient level but included studies with visual assessments of perfusion maps [17, 18]. These meta-analyses reported moderate to high sensitivity and high specificity values for CTP but also observed high levels of heterogeneity for both sensitivity and specificity [17, 18]. In our review, we included three studies reporting sensitivity and specificity at the patient level for analysis. The low number of studies could contribute to the observed heterogeneous sensitivity and specificity values in our review. Another systematic review examined the various definitions for the acute infarct core and concluded that there were significant variations in the definitions [20]. This is in keeping with our findings of significant heterogeneity and varying CTP parameter thresholds.
Implications for Practice
Automated CTP remains inferior to DWI or FLAIR in identifying the acute infarct core. There are high levels of heterogeneity and consequently large prediction intervals for the mean volumetric difference between CTP and DWI or FLAIR. This might have clinical implications with large variations in accuracy between patients when applying mismatch criteria with cut-off volumes such as the EXTEND [3], DEFUSE 3 [4], and DAWN [5] criteria. Yet, these trials [3‒5] have all reported large positive treatment effects for those selected by RAPID CTP processing via their respective criteria. This suggests that even though automated CTP is an inconsistent estimate of core, there is still much benefit to be gained from using it to select patients for treatment.
Implications for Research
There remains significant unexplained heterogeneity in our review. More research is required in examining potential factors such as software, post-processing algorithms, and patient factors such as cardiac output. Understanding the unexplained heterogeneity is clinically important as it will enable reliable application of volume-based mismatch criteria [3‒5]. We hypothesize that a large proportion of unexplained heterogeneity results from the variation in software and algorithms. It might be necessary to standardize CTP acquisition protocols and perform benchmark testing of various software and algorithms on a common dataset [49]. Standardized reporting for future CTP studies would be useful, such as consistent reporting of patient factors and time to reperfusion. These standardized studies would help provide high-quality evidence for evaluating CTP adoption. Future directions for possible research on automated CTP include Bayesian algorithms, neural networks, and using different thresholds depending on brain region [35] or for grey and white matter [43, 50, 59]. Larger studies are needed to examine the efficacy of automated CTP for identifying posterior circulation strokes and lacunar infarcts. Finally, it has been suggested that in patients with large-vessel occlusion, selection for thrombectomy by simplified imaging protocols consisting of non-contrast CT and CT angiography has similar outcomes to selection by CTP [75]. Further randomized trials might be required to compare the efficacy of simplified imaging protocols to automated CTP.
Statement of Ethics
Ethical approval was not required because this study was based exclusively on the published literature.
Conflict of Interest Statement
Andrew Bivard and Mark Parsons have research partnerships with Apollo, Siemens, and Canon.
Funding Sources
No funding was received for this study.
Author Contributions
Nicholas Lim, Graeme Hankey, and Andrew Bivard conceived the study. Nicholas Lim designed and ran the searches. Nicholas Lim and Benjamin Chia screened titles and abstracts and assessed full-text studies for inclusion. Nicholas Lim, Graeme Hankey, Andrew Bivard, and Mark Parsons contributed to risk of bias assessment. Nicholas Lim performed data extraction. Nicholas Lim and Max Bulsara contributed to meta-analyses. Nicholas Lim, Benjamin Chia, Max Bulsara, Mark Parsons, Graeme Hankey, and Andrew Bivard contributed to manuscript drafting and critical revision.
Data Availability Statement
Data supporting the findings of this study are available from the corresponding author upon reasonable request.