Abstract
Background: In Japan, biennial esophagogastroduodenoscopy (EGD) screening for gastric cancer (GC) has been implemented for adults aged ≥50 years as part of a population-based screening program. This approach has facilitated early GC detection and demonstrated individual-level benefits. However, due to a generational decline in Helicobacter pylori infection, a reduction in the long-term effectiveness of this strategy is anticipated. Given the allocation of national resources, a cost-effectiveness evaluation is essential.
Summary: As GC prevalence declines in the target population, the cost-effectiveness of existing screening practices may diminish. Mathematical simulation models are commonly employed to assess the comparative effectiveness and cost-effectiveness of various cancer screening strategies. Microsimulation models, which track individual-level outcomes, are utilized to evaluate person-level effects. By contrast, macrosimulation models – such as Markov and decision tree models – are used to assess population-level outcomes.
Key Messages: Nine studies have compared different EGD screening strategies. These studies support the effectiveness of biennial screening in high-risk countries. In low- to intermediate-risk countries – and among lower risk populations within high-risk countries – extending the screening interval to ≥3 years appears reasonable. Additionally, strategies incorporating risk stratification using alternative modalities, such as serological tests, are more cost-effective. Continued discussion is necessary to optimize the EGD screening approach.
Introduction
Helicobacter pylori (H. pylori) is recognized as the primary etiological agent for gastric cancer (GC) and has been classified as a group 1 carcinogen by the International Agency for Research on Cancer. Over the past several decades, the prevalence of H. pylori infection has remained notably high in East Asia, particularly Japan and South Korea, posing a significant public health challenge in reducing GC mortality. In both Japan and South Korea, population-based GC screening programs utilizing radiographic and endoscopic methods have been implemented to facilitate early detection and treatment of H. pylori-associated GC, thereby reducing mortality [1‒4].
However, due to improvements in hygiene, the prevalence of H. pylori infection in Japan has markedly declined – from approximately 50% among individuals born in the 1950s, to approximately 20% among those born in the 1970s, and to <5% among those born in the 2010s [5‒8]. In light of these epidemiological changes, the optimization of screening strategies is warranted, with careful consideration given to early detection efficiency, safety, and cost-effectiveness.
Mathematical simulation models are widely used to evaluate the comparative effectiveness of different cancer screening strategies within a theoretical framework. For example, the effectiveness of screening programs for gastric, breast, and colorectal cancer has been evaluated using such models, thereby contributing to evidence-based policy-making [9‒12]. This article presents an overview of simulation models used in cancer screening, followed by an assessment of whether the current esophagogastroduodenoscopy (EGD) screening strategy for GC remains effective in contemporary and future Japan, given the substantial decline in H. pylori prevalence.
Global and Regional Burden of GC
The prevalence and mortality of GC have undergone significant changes worldwide over the past several decades [13]. These trends vary greatly by region and country, with significant differences observed between high-risk countries, such as Japan and South Korea, and low-risk regions, such as North America and Western Europe. In high-risk countries, although a decline in GC mortality has been reported, both incidence and mortality rates remain high. Thus, GC control continues to represent a major public health concern.
Concept of Population-Based GC Screening
Globally, screening asymptomatic individuals for GC is rare and generally limited to countries with a high incidence of the disease. For example, the American Gastroenterological Association advises against GC screening in asymptomatic individuals in the USA, except for immigrants originating from high-risk regions such as East Asia [14]. In East Asian countries, like Japan and South Korea, the historically high prevalence of H. pylori infection has resulted in a greater likelihood of infection even among asymptomatic individuals, thereby supporting the implementation of mass screening programs.
A global survey using cross-sectional data from 2010 to 2022 reported that, although the prevalence of H. pylori has steadily declined in East Asia, it remains substantially higher among adults in the region compared with those in North America and Europe [15]. In Japan, GC screening is carried out using various modalities and serves different objectives depending on the organizing body – municipalities, corporations, or private individuals. Population-based screening programs, administered by local governments and supported through public funding, are primarily intended to reduce GC mortality (Table 1). By contrast, opportunistic screening, undertaken by corporations or individuals, focuses on the early detection of GC to enhance health outcomes for employees or private individuals. Such screening programs typically do not impose age restrictions, with participation largely determined by the availability of financial resources. Conversely, population-based screening programs frequently restrict eligibility by age due to limitations in public funding.
Table 1. Comparison of the population-based gastric cancer screening programs
Country | Implementation status | Screening method | Target age group |
---|---|---|---|
Japan | Implemented nationwide | EGD or X-ray | ≥50 years (varies by municipality) |
South Korea | Implemented nationwide | EGD or X-ray | ≥40 years |
China | Implemented in some regions | Mainly EGD (risk-based) | ≥40 years (high-risk population) |
USA and Europe | Not implemented | Individual based for high-risk groups | - |
EGD, esophagogastroduodenoscopy.
In contrast to Japan, South Korea employs a centralized system for managing EGD examination data through the National Health Insurance Service (NHIS) database. Under the National Cancer Screening Program (NCSP), EGD for GC screening is performed in adults aged ≥40 years. Additionally, EGDs carried out as part of opportunistic screening are also documented in the NHIS database, enabling comprehensive data collection [16‒18]. Consequently, clinical studies reflecting nationwide EGD utilization are feasible in South Korea. An analysis of these data has demonstrated that the participation rates in EGD-based screening are higher than those observed for X-ray-based screening in South Korea [19]. However, low participation in GC screening remains a significant challenge in Japan. A recent study using the synthetic control method evaluated the impact of GC screening program implementation on mortality in South Korea and Japan [20]. The study reported a 17% reduction in age-standardized GC mortality in South Korea (risk ratio = 0.83, 95% confidence interval: 0.71–0.96). By contrast, no corresponding reduction was observed in Japan, likely due to the low screening participation rates. Although precise national data are lacking in Japan, the total number of EGD screenings has increased since the implementation of population-based GC screening using EGD. This trend suggests an increasing need for the development of screening strategies with a greater emphasis on EGD-based modalities in the future [21].
By integrating NHIS and NCSP data, the effectiveness of EGD GC screening can be evaluated across different age groups. A study by Jun et al. [45], analyzing data from 16,584,283 participants aged ≥40 years, showed that EGD screening reduced GC mortality up to the age of 74 years. However, no mortality benefit was observed among individuals aged ≥75 years. Accordingly, the establishment of an upper age limit of 75 years for EGD screening in South Korea appears to be justified. However, the construction of such comprehensive databases requires substantial effort and resources. In Japan, where comparable infrastructure is lacking, the identification of the optimal screening age based on original data remains a challenge. Furthermore, rigorous evaluation of screening effectiveness requires implementation within target populations and the measurement of outcomes such as GC mortality. Whether conducted prospectively or retrospectively, these studies require significant time, labor, and financial investment.
Moreover, earlier GC detection does not necessarily translate to a reduction in mortality. When evaluating cancer screening programs, lead-time bias must be considered, as it can result in the overestimation of the benefits related to disease-specific mortality [22, 23]. Nonetheless, the assessment of mortality reduction requires comprehensive databases capable of tracking individual outcomes. Mathematical simulation models provide a robust approach for testing clinical hypotheses that are otherwise difficult to validate using original data.
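Lead-time bias can be illustrated with a small arithmetic sketch. The ages below are hypothetical values chosen only for illustration and are not taken from any cited study: the same person dies at the same age whether the cancer is found by screening or by symptoms, yet survival measured from diagnosis appears longer after screening.

```python
def survival_after_diagnosis(age_at_diagnosis: float, age_at_death: float) -> float:
    """Years lived between diagnosis and death."""
    if age_at_death < age_at_diagnosis:
        raise ValueError("death cannot precede diagnosis")
    return age_at_death - age_at_diagnosis


# Hypothetical person (illustrative ages, not study data).
AGE_SCREEN_DETECTED = 65   # cancer found early by screening
AGE_SYMPTOM_DETECTED = 68  # cancer found later because of symptoms
AGE_AT_DEATH = 72          # identical in both scenarios: no mortality benefit

survival_screened = survival_after_diagnosis(AGE_SCREEN_DETECTED, AGE_AT_DEATH)
survival_unscreened = survival_after_diagnosis(AGE_SYMPTOM_DETECTED, AGE_AT_DEATH)
lead_time = AGE_SYMPTOM_DETECTED - AGE_SCREEN_DETECTED

# The apparent survival gain equals the lead time, although age at death is unchanged.
apparent_gain = survival_screened - survival_unscreened
```

In this sketch the entire apparent gain in survival is lead time, which is why disease-specific mortality, rather than survival after diagnosis, is the outcome of interest when screening programs are evaluated.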
Mathematical Simulation Models Used in Cancer Screening
Various simulation methodologies are employed in medical research, particularly in the context of cancer screening. Despite their methodological diversity, all simulation models are grounded in a common principle: they model transitions between defined health states to generate the corresponding outcomes. Specifically, practical simulations evaluate how interventions (strategies) influence health state transitions from two perspectives: individual-level outcomes and population-level dynamics (Fig. 1).
Fig. 1. Concept of mathematical simulation for cancer screening strategies. Mathematical simulations are used in cancer screening to compare the effectiveness of various strategies. These methods are broadly categorized into microsimulations and macrosimulations, which focus on tracking individual outcomes and population-level behaviors, respectively. By evaluating the results obtained, these simulations aim to quantify the effectiveness and cost-effectiveness of each strategy, thereby informing policy decision-making.
At the individual level, simulations are used to model the progression of a person’s health trajectory through transitions among defined health states. By contrast, population-level simulations are designed to assess collective outcomes by modeling such transitions across an entire population. These approaches are employed to evaluate the extent to which specific interventions influence outcomes at both levels. When the costs associated with interventions are incorporated alongside health outcomes, these models further enable cost-effectiveness analyses. By integrating three core functions – predicting the effectiveness of screening strategies, identifying optimal strategies through comparative analysis, and informing policy decisions – simulation modeling is ultimately intended to guide the implementation of evidence-based screening programs within society (Fig. 1).
Individual-Level Simulations
Simulations at the individual level reduce real-world populations into smaller units – such as municipalities, corporations, households, or individuals – while modeling transitions between health states. This method is known as microsimulation [24, 25]. A basic application of microsimulation involves population forecasting; by specifying birth rates and age-adjusted mortality rates for a given year and updating them annually, future population changes – for example, in a cohort of 1 million individuals over a decade – can be projected. This method has also been applied to simulate future demographic trends in Japan, integrated with a model of GC incidence and progression at the individual level [26]. In individual-based microsimulations, a virtual cohort comprising tens of thousands to millions of individuals (e.g., individuals A, B, and C) is constructed. For each simulated participant, disease onset, progression, detection, and outcomes are modeled, with transition probabilities assigned to each defined health state. After simulating over multiple years or decades, aggregate outcomes – such as the most common health trajectories – can be assessed.
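The individual-based procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the model used in any cited study: the state names and the annual transition probabilities in `TRANSITIONS` are hypothetical placeholders, and a real model would also include age, screening, and treatment.

```python
import random

# Hypothetical annual transition probabilities (placeholders, not study estimates).
TRANSITIONS = {
    "healthy": [("healthy", 0.985), ("preclinical_gc", 0.005), ("dead", 0.010)],
    "preclinical_gc": [("preclinical_gc", 0.70), ("clinical_gc", 0.28), ("dead", 0.02)],
    "clinical_gc": [("clinical_gc", 0.80), ("dead", 0.20)],
    "dead": [("dead", 1.0)],  # absorbing state
}


def simulate_person(rng: random.Random, years: int) -> list:
    """Return one individual's yearly health-state trajectory."""
    state = "healthy"
    history = [state]
    for _ in range(years):
        states, weights = zip(*TRANSITIONS[state])
        state = rng.choices(states, weights=weights, k=1)[0]
        history.append(state)
    return history


def simulate_cohort(n_people: int, years: int, seed: int = 0) -> list:
    """Simulate a virtual cohort; each person is tracked separately."""
    rng = random.Random(seed)
    return [simulate_person(rng, years) for _ in range(n_people)]


def final_state_counts(cohort: list) -> dict:
    """Aggregate individual trajectories into counts of final states."""
    counts = {state: 0 for state in TRANSITIONS}
    for history in cohort:
        counts[history[-1]] += 1
    return counts


cohort = simulate_cohort(n_people=1_000, years=30)
summary = final_state_counts(cohort)
```

Because each trajectory is stored per person, outcomes can later be grouped by any individual characteristic added to the model, which is the main reason to accept the extra computational cost.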
Dynamic microsimulation models incorporate interactions among sociodemographic characteristics, health status, mortality rates, and behavioral risk factors, allowing for the prediction of individual healthy aging trajectories [27]. During the coronavirus disease 2019 pandemic, microsimulations were also employed to estimate the impact of individual-level interventions – such as lockdowns and mobility restrictions – on the spread of infectious diseases [28‒30]. Thus, microsimulation is considered essential not only for examining individual behavior but also for estimating the broader societal effects of interventions.
Population-Level Simulations
Unlike microsimulation, which tracks individual transitions, macro-level models provide a broader perspective on health phenomena and interventions. Among macro approaches, decision tree and Markov models are particularly prevalent in medical research, including studies on cancer. Decision tree models define transition probabilities between health states but do not account for the timing of transitions. Therefore, they are most suitable for short-term analyses or scenarios where the timing of events – such as specific ages – is not critical. In contrast, Markov models incorporate time by assigning transition probabilities at fixed intervals, enabling the modeling of health state changes over decades. These time-based transitions can reflect age-dependent variations in risk [31].
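A cohort Markov model of the kind described above can be sketched as follows. The four states, the one-year cycle length, and every probability are assumptions made for illustration; the helper `matrix_at` is a hypothetical name showing how a transition probability might change with the cycle (that is, with age).

```python
STATES = ["healthy", "precancerous", "cancer", "dead"]


def matrix_at(cycle: int) -> list:
    """Hypothetical annual transition matrix; progression risk rises after cycle 20."""
    p_prog = 0.05 if cycle < 20 else 0.08  # placeholder values
    return [
        [0.97, 0.02, 0.00, 0.01],           # from healthy
        [0.00, 0.98 - p_prog, p_prog, 0.02],  # from precancerous
        [0.00, 0.00, 0.85, 0.15],           # from cancer
        [0.00, 0.00, 0.00, 1.00],           # dead is absorbing
    ]


def step(dist: list, matrix: list) -> list:
    """Advance the cohort distribution by one cycle (row vector times matrix)."""
    n = len(dist)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def run_markov(initial: list, cycles: int) -> list:
    """Return the cohort distribution over states at every cycle."""
    trace = [list(initial)]
    for cycle in range(cycles):
        trace.append(step(trace[-1], matrix_at(cycle)))
    return trace


# Whole cohort starts healthy; follow it for 40 one-year cycles.
trace = run_markov([1.0, 0.0, 0.0, 0.0], cycles=40)
```

Each step depends only on the current distribution and the current cycle, which is the memoryless property; the model tracks proportions of the cohort rather than individual histories.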
Macro-level models provide a simplified, population-wide perspective that is more intuitively accessible compared with microsimulations. However, Markov models possess notable limitations. First, they assume a memoryless process – only the current state influences future transitions – thus neglecting important historical information such as prior screening outcomes or treatment histories. This assumption can pose challenges when past events significantly affect future health risks. Moreover, Markov models discretize health into broad categories – such as “healthy,” “precancerous lesions,” “cancer,” and “death” – which limits their ability to accurately model continuous disease progression or heterogeneous clinical pathways. Consequently, their capacity to model complex, multistage disease processes, like cancer progression (e.g., tumor grade or differentiation), is limited. Some of these limitations, such as memorylessness and the loss of individual variability, can be mitigated through the use of microsimulation models [32].
Although microsimulations can project intervention effects at the individual level, determining whether these effects translate into significant population-level impacts often requires a macro-level perspective. Consequently, macro-level modeling remains a key component in public health policy – particularly in cancer control and infectious disease management – where evaluating large-scale interventions is essential. Nonetheless, macro models may overlook localized effects driven by specific individual characteristics. Thus, microsimulation and macro-level modeling should be viewed as complementary approaches (Table 2).
Table 2. Simulation models used in cancer screening studies
Characteristic | Microsimulation | Macrosimulation |
---|---|---|
Simulation unit | Individual | Group (municipality, nation, and society) |
Time horizon | Flexible | Fixed (Markov model: available; decision tree model: unavailable) |
Benefits | Available to investigate the associations between individual characteristics and outcomes | Available to verify the effect of the intervention on the target population as a whole |
Limitations | Difficult to visualize status changes in the target population | Unable to establish a relationship between individual characteristics and health state transitions |
Comparative Effectiveness of GC Screening Strategies Using Microsimulation Models
One of the primary strengths of microsimulation is its ability to meticulously track individual outcomes when screening strategies are implemented at the level of each citizen (Table 3). In a 2020 study, a virtual population modeled after the Japanese national population was used to evaluate 15 EGD screening strategies, which varied by starting age, stopping age, and screening intervals [33]. The findings indicated that EGD screening every 3 years from ages 50–75 years was the most cost-effective approach, with an incremental cost-effectiveness ratio of USD 45,665 per quality-adjusted life year gained. Compared with no screening, this strategy reduced GC mortality by 63% and resulted in 27.2 quality-adjusted life years gained per 1,000 individuals.
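The incremental cost-effectiveness ratio (ICER) reported in such analyses is the difference in cost between two strategies divided by the difference in quality-adjusted life years (QALYs). The sketch below shows the calculation; the `Strategy` type and the cost and QALY figures are hypothetical and are not taken from the cited study.

```python
from typing import NamedTuple


class Strategy(NamedTuple):
    name: str
    cost: float   # mean lifetime cost per person (e.g., USD)
    qalys: float  # mean quality-adjusted life years per person


def icer(reference: Strategy, alternative: Strategy) -> float:
    """Incremental cost per QALY gained when moving from reference to alternative."""
    d_cost = alternative.cost - reference.cost
    d_qalys = alternative.qalys - reference.qalys
    if d_qalys == 0:
        raise ValueError("ICER is undefined when the QALY difference is zero")
    return d_cost / d_qalys


def per_person(value_per_1000: float) -> float:
    """Convert an outcome reported per 1,000 individuals to a per-person value."""
    return value_per_1000 / 1000


# Hypothetical strategies (illustrative numbers only).
less_intensive = Strategy("less intensive screening", cost=1_000.0, qalys=20.0)
more_intensive = Strategy("more intensive screening", cost=26_000.0, qalys=20.5)

ratio = icer(less_intensive, more_intensive)  # 25,000 / 0.5 = 50,000 per QALY
```

A strategy is usually judged cost-effective when its ICER against the next-best non-dominated alternative falls below a willingness-to-pay threshold, so the choice of comparator matters as much as the ratio itself.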
Table 3. Simulation studies that compared multiple EGD-based GC screening strategies
Study | Country | Type of simulation | Outcome | Target | Data source | Comparator | Optimal strategy |
---|---|---|---|---|---|---|---|
Babazono et al. [34] (1995) | Japan | Markov model | Cost-effectiveness | All GC | Open database | 4 EGD strategies | ≥40 years, 3-year interval |
Dan et al. [35] (2006) | Singapore | Markov model | Cost-effectiveness | All GC | Published data | EGD screening vs. no screening | 50–70 years old, 2-year interval |
Chang et al. [36] (2012) | South Korea | Markov model | Cost-effectiveness | All GC | Open database | 12 EGD or X-ray strategies | 50–80 years old, 1-year interval (men); 50–80 years old, 2-year interval (women) |
Zhou et al. [37] (2013) | Singapore | Markov model | Cost-effectiveness | All GC | Published data | 4 EGD strategies | Initial EGD screening followed by 2-year interval EGD for GC high-risk individuals |
Huang et al. [33] (2020) | Japan | Microsimulation model | Cost-effectiveness | All GC | Open database | 15 EGD strategies | 50–75 years old, 3-year interval |
Xia et al. [38] (2021) | China | Markov model | Cost-effectiveness | All GC | Published data | 5 EGD strategies | 40–69 years old, 2-year interval |
Ascherman et al. [39] (2021) | Japan, Brazil, France, Nigeria, and USA | Markov model | Cost-effectiveness | All GC | Open database | 8 EGD strategies | Japanese: 2-year interval for GC high-risk individuals; 10-year interval for GC low-risk individuals |
Ishibashi et al. [40] (2024) | Japan | Markov model | Cost-effectiveness | All GC | Original data | 15 EGD strategies | ≥40 years, 4-year interval |
Ishibashi et al. [26] (2024) | Japan | Microsimulation model | Efficacy of GC detection | HpNGC | Original data | 9 EGD strategies | ≥45 years, 5-year interval |
EGD, esophagogastroduodenoscopy; GC, gastric cancer; HpNGC, H. pylori-naïve gastric cancer.
In Japan, the prevalence of H. pylori infection has significantly declined across birth cohorts in recent years. Consequently, the proportion of GCs detected through screening classified as H. pylori-naïve GC (HpNGC) is increasing [41, 42]. As a result, future screening programs in Japan will likely need to account for both H. pylori-associated GC and HpNGC as target lesions. To determine the optimal strategy for detecting HpNGC, a comparative analysis of nine strategies was conducted using a microsimulation model [26]. This model was developed based on data from 519,368 EGD procedures and 97 patients with HpNGC collected from 12 Japanese medical institutions. The screening strategies varied by starting age (40, 45, or 50 years) and interval (2, 5, or 10 years). The analysis showed that screening every 5 years starting at age 45 years provided the highest detection efficiency for HpNGC, as measured by the number of tests needed. A notable strength of this microsimulation model is its ability to simulate GC development and progression at the individual level, enabling detailed evaluation of cancer characteristics. As demonstrated, the model accurately reflects specific features such as the slow progression typical of HpNGC, thereby enhancing the reliability of its findings.
Simulation of GC Screening Strategies Using Markov Models
Markov models have been employed since the early stages of cost-effectiveness research and have been applied to the evaluation of EGD GC screening since the 1990s (Table 3). In 1995, Babazono et al. [34] utilized a Markov model to assess the cost-effectiveness of screening asymptomatic individuals in Japan starting at age 40 years, using intervals of 1, 2, 3, or 5 years. The analysis concluded that a 3-year interval was the most cost-effective strategy at the time.
A study conducted in South Korea – a country with a high risk of GC – compared no screening with various endoscopy and X-ray strategies, initiated at ages 30, 40, or 50 years and repeated at 1- or 2-year intervals [36]. The findings showed that annual screening beginning at age 50 years was most cost-effective for men, whereas biennial screening at the same age was optimal for women. This study highlighted the importance of sex-based risk stratification in optimizing screening strategies. In China, biennial EGD screening for individuals aged 40–69 years was also found to be cost-effective [38]. However, in several high-risk countries, the generational decline in H. pylori infection rates has prompted updated cost-effectiveness analyses based on more recent data. Using original data from urban Japanese cohorts with low H. pylori prevalence in the 2010s, a Markov model was applied to evaluate multiple EGD screening strategies. Screening every 4 years starting at age 40 years was identified as the most cost-effective approach, indicating that Japan’s current biennial strategy may be suboptimal [40].
By contrast, findings from low- to intermediate-risk countries have yielded different conclusions. A study conducted in Singapore assessed the cost-effectiveness of biennial EGD screening versus no screening in an intermediate-risk population [35]. Results demonstrated that EGD screening remained cost-effective even in this context. Another study from Singapore evaluated four follow-up strategies for individuals identified as high risk after initial EGD: annual or biennial surveillance, repeat biennial EGD, and an intensive strategy combining biennial EGD with annual surveillance [37]. In individuals categorized as low risk after the initial EGD, only usual care without scheduled follow-up screenings was provided. Surveillance every 2 years following the initial EGD was identified as the most cost-effective strategy. These findings underscore the importance of GC risk stratification in improving the cost-effectiveness of screening programs. A recent study comparing the cost-effectiveness of EGD screening in Japan with that in low- to intermediate-risk countries – including Brazil, France, Nigeria, and the USA – demonstrated that Japan’s biennial strategy would be less effective in those settings [39].
Several studies have indicated that, in low-risk countries, combining EGD with other risk stratification methods is essential to maintain cost-effectiveness. In the USA, a strategy incorporating H. pylori diagnosis through serum pepsinogen testing for risk stratification was more cost-effective than EGD alone [43]. In Japan, where historically high H. pylori infection rates have been prevalent, H. pylori eradication therapy strategies have demonstrated greater cost-effectiveness compared with repeated EGD without eradication [44]. However, as H. pylori prevalence continues to decline, the incorporation of risk stratification methods is expected to become increasingly critical. Recent Japanese studies suggest that extending EGD screening intervals beyond 2 years could improve cost-effectiveness. Similarly, studies conducted in low- to intermediate-risk countries have emphasized the importance of risk stratification to enhance the efficiency of GC screening strategies.
Potential of Risk-Stratified Strategies in EGD Screening
To maximize the cost-effectiveness of GC screening strategies, incorporating risk stratification is a critical consideration. Risk stratification may be implemented through two primary approaches. The first approach involves using findings from the initial EGD examination – such as H. pylori infection status or the presence of intestinal metaplasia – to determine the appropriate interval for subsequent examinations (i.e., surveillance). For example, a study conducted in Singapore proposed different surveillance intervals based on initial EGD findings [37]. The second approach involves assessing GC risk prior to endoscopy by measuring serum H. pylori antibody or serum pepsinogen levels and by accounting for additional factors such as male sex and a family history of GC. This prescreening risk assessment helps determine whether EGD screening should be performed. A study from South Korea recommended applying different screening intervals based on sex [36], whereas research from the USA demonstrated the effectiveness of risk stratification using serological testing [43]. By combining these two approaches into a hybrid risk-stratified strategy – where the timing of the initial EGD is determined through pre-EGD GC risk assessment and surveillance intervals are set based on the findings of the initial EGD – a more efficient and effective GC screening program can be implemented.
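The two-step logic of such a hybrid strategy can be written as a pair of decision functions. The sketch below is purely structural: the risk categories, starting ages, and surveillance intervals are PLACEHOLDER values invented for illustration, not recommendations from any cited study, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PreEgdProfile:
    """Information available before endoscopy."""
    hp_antibody_positive: bool
    pepsinogen_abnormal: bool
    male: bool
    family_history_gc: bool


@dataclass
class EgdFindings:
    """Findings from the initial EGD."""
    hp_infection: bool
    intestinal_metaplasia: bool


# PLACEHOLDER values for illustration only (not clinical guidance).
START_AGE_BY_RISK = {"high": 40, "intermediate": 45, "low": 50}
INTERVAL_IF_FINDINGS = 2     # years
INTERVAL_IF_NO_FINDINGS = 5  # years


def pre_egd_risk(profile: PreEgdProfile) -> str:
    """Step 1: classify GC risk from serology and demographic factors."""
    if profile.hp_antibody_positive or profile.pepsinogen_abnormal:
        return "high"
    if profile.male or profile.family_history_gc:
        return "intermediate"
    return "low"


def initial_egd_age(profile: PreEgdProfile) -> int:
    """Timing of the first EGD, determined by pre-EGD risk."""
    return START_AGE_BY_RISK[pre_egd_risk(profile)]


def surveillance_interval(findings: EgdFindings) -> int:
    """Step 2: set the surveillance interval from the initial EGD findings."""
    if findings.hp_infection or findings.intestinal_metaplasia:
        return INTERVAL_IF_FINDINGS
    return INTERVAL_IF_NO_FINDINGS


person = PreEgdProfile(False, False, True, False)
first_exam_age = initial_egd_age(person)
next_interval = surveillance_interval(EgdFindings(False, True))
```

Keeping the two steps as separate functions mirrors the text: either can be replaced by a validated risk model without changing the other, and the placeholder constants are the quantities a simulation study would vary when comparing strategies.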
Conclusion
EGD screening has served as a cornerstone of population-based GC screening programs in Japan and South Korea, contributing to early detection and reductions in cancer-related mortality. However, the recent and significant decline in age-specific H. pylori infection rates in both countries necessitates a reassessment of existing EGD screening strategies. Emerging evidence from cost-effectiveness analyses suggests that extending the screening interval may enhance the overall efficiency and effectiveness of these programs.
Conflict of Interest Statement
Dr. Fumiaki Ishibashi has received honoraria for lectures from Fujifilm and Olympus Corporation and was a member of the journal’s editorial board at the time of submission. The other author declares no conflicts of interest.
Funding Sources
This work was supported by the Japan Cancer Society Research Grant 2024.
Author Contributions
F.I. designed the study and processed the data. F.I. and K.O. acquired the data, interpreted the data, and prepared the manuscript.