Introduction: Frequent no-shows in healthcare appointments pose significant challenges, leading to wasted resources and suboptimal patient outcomes. Traditional mitigation methods, such as reminder messages and phone calls, often fall short, particularly in regions with less robust healthcare infrastructure. This study leverages machine learning techniques to develop a predictive model that identifies patients at high risk of missing appointments, using data from the Mawid scheduling system in southern Saudi Arabia. Methods: A retrospective cross-sectional design was employed, focusing on visits to primary healthcare centers (PHCCs) and general hospitals. Data spanning 10 months, from June 2023 to March 2024, were collected from Mawid, encompassing over one million observations and 18 features, including appointment details, patient demographics, and weather conditions. Machine learning models, including decision trees, random forests, Naive Bayes, logistic regression, and artificial neural networks (ANNs), were developed and evaluated based on accuracy, precision, recall, F1 score, and area under the curve (AUC). Results: The findings revealed that 55.07% of appointments were not attended. The random forest model exhibited superior performance, with an accuracy of 0.765 and an AUC of 0.852 when weather-related features were included. The ANN models also performed robustly, with an AUC of approximately 0.836. The study identified significant regional, seasonal, and environmental factors affecting no-show rates, with higher no-show rates occurring during certain months and under specific weather conditions. Attendance patterns also differed by appointment and facility type: regular appointments had higher no-show rates than walk-in appointments, and PHCCs showed different attendance patterns than hospitals. Conclusion: Machine learning models, particularly random forests and ANNs, can effectively predict healthcare appointment no-shows, thereby allowing for better resource allocation and patient care. Recognizing the influence of regional and environmental factors is crucial for developing targeted interventions to reduce no-show rates. Future research should explore integrating more contextual data to further refine these predictive models and to enhance healthcare delivery and operational efficiency.

Frequent no-shows in healthcare settings, particularly in primary healthcare centers (PHCCs) and general hospitals, present a significant challenge to effective resource utilization and patient care, leading to wasted time, increased healthcare costs, and suboptimal outcomes. Traditional methods to mitigate no-show rates, such as reminder messages, phone calls, and emails, are often insufficient, particularly in developing regions where healthcare infrastructure is limited [1]. Historically, healthcare systems have relied on reminder-based interventions, including phone calls, SMS reminders, and emails sent to patients before appointments. Studies have shown that these methods can reduce no-show rates to some extent; for instance, Lin et al. [2] found that SMS reminders reduced no-show rates by approximately 30% in a sample of urban clinics. However, these interventions often require substantial administrative effort and are not universally effective across patient populations and settings.

Despite their benefits, traditional reminder methods have significant limitations. They depend on the availability and reliability of contact information, and they may not reach patients who have inconsistent access to communication devices or are not regularly available to respond to reminders. Furthermore, these methods fail to address the underlying reasons for missed appointments, including socioeconomic barriers, transportation issues, and patient forgetfulness [3].

Recent advancements in machine learning (ML) offer promising solutions for more effective prediction and management of no-shows. ML algorithms can analyze large datasets to identify patterns and predictors of no-show behavior, enabling healthcare providers to anticipate and mitigate these occurrences. Several studies have demonstrated the potential of ML in this regard. For instance, Hamdan et al. [4] utilized logistic regression and decision trees to predict no-shows with an accuracy of over 78%. Such models consider multiple variables, including patient demographics, appointment history, and external factors such as weather conditions. Research has demonstrated a significant correlation between inclement weather and increased no-show rates for medical appointments, suggesting that weather can be a critical barrier to healthcare access. Furthermore, Liu et al. [5] investigated the use of algorithms to predict no-shows in a narrower setting, pediatric medical appointments, and noted that restricting the dataset to patient records from a single primary care clinic could limit the generalizability of the developed algorithm.

The literature indicates that ML has significant potential to enhance the prediction and management of no-shows in healthcare settings. However, some limitations have been reported, such as data collection from single- or mono-type healthcare providers and a lack of control over confounding factors. By better leveraging advanced algorithms and controlling for confounding factors, healthcare providers can anticipate no-shows more accurately and implement targeted interventions, ultimately improving resource utilization and patient care. Table 1 summarizes the relevant reviewed studies.

Table 1.

Summary of the reviewed literature

| Study | Title | Algorithm(s) used | Accuracy/AUC |
|---|---|---|---|
| [6] | Machine-Learning-Based No Show Prediction in Outpatient Visits | Gradient boosting | High accuracy with patient history |
| [7] | Individualized No-show Predictions: Effect on Clinic Overbooking and Appointment Reminders | Random forest, gradient boosting | 98.7% AUC, 97.6% AUC |
| [8] | Prediction of Hospital No-show Appointments Through Artificial Intelligence Algorithms | Logistic regression, decision tree | 98% precision, 15% AUC; 98% precision, 20% AUC |
| [9] | Application of Machine Learning to Predict Patient No-Shows in an Academic Pediatric Ophthalmology Clinic | XGBoost | AUC 0.90 |
| [10] | Artificial Intelligence Predictive Analytics in the Management of Outpatient MRI Appointment No-Shows | XGBoost | 75% AUC |
| [11] | Using Machine Learning for No-show Prediction in the Scheduling of Clinical Exams | Random forest, logistic regression | High accuracy with clustering |
| [12] | New Feature Selection Methods Based on Opposition-Based Learning and Self-Adaptive Cohort Intelligence for Predicting Patient No-Shows | Opposition-based self-adaptive cohort intelligence | Higher dimensionality reduction and better convergence speed |
| [13] | A Service Analytic Approach to Studying Patient No-Shows | Various ML models | Key patterns identified |
| [14] | Predicting Hospital No-Shows Using Machine Learning | Decision tree | 95% accuracy |
| [15] | Machine Learning-Based Prediction Models for Patient No-Show in Online Outpatient Appointments | Logistic regression, k-NN, boosting, decision tree, random forest, bagging | Bagging AUC 0.990 |
| [16] | Data Analytics and Predictive Modeling for Appointments No-Show at a Tertiary Care Hospital | Logistic regression, JRip, Hoeffding tree | Precision and recall around 90%, F-score 0.86 |
| [17] | Predicting No-Shows for Dental Appointments | Gradient boosting | 72% AUC, 67% F1 score |
| [4] | Machine Learning Predictions on Outpatient No-Show Appointments in a Malaysia Major Tertiary Hospital | Gradient boosting | Highest accuracy with gradient boosting |

The current study aimed to leverage ML techniques to develop a predictive model that identifies patients at high risk of missing scheduled visits. The model utilizes data from electronic scheduling systems, mainly the Mawid scheduling system, to analyze and predict no-show patterns in PHCCs and hospitals. Data were retrieved from PHCCs and hospitals in the southern region of Saudi Arabia, covering diverse populations and backgrounds in an attempt to control for the confounding factors suggested by the literature. This study therefore not only aims to contribute to the academic understanding of ML applications in healthcare but also seeks to provide practical solutions that can be implemented to improve healthcare outcomes in PHCCs. Effectively utilizing such a predictive model holds promise for enhancing patient care by ensuring that available slots are used efficiently, reducing the financial burden on healthcare systems, and improving overall service delivery. Beyond weather data, other contextual factors such as distance, transportation availability, work commitments, and appointment times can significantly influence daily activities and decision-making, particularly in regions like the Kingdom of Saudi Arabia (KSA). These factors often intersect, affecting the efficiency and timing of commutes, attendance at appointments, and overall productivity. While this study focuses solely on the influence of weather data owing to its immediate and measurable effects, it is important to acknowledge this broader spectrum of correlated factors. These variables, though outside the scope of this analysis, play a crucial role in shaping daily routines and outcomes, further emphasizing the need for a holistic approach when evaluating environmental impacts on human activities.

Study Design

This study used a retrospective cross-sectional design, focusing on visits to PHCCs and general hospitals in southern Saudi Arabia. The primary aim was to employ different ML techniques to construct a predictive model for identifying patients who are likely to miss their scheduled appointments.

Data Source and Collection

Data were collected from the Mawid electronic scheduling system, which is used across the southern region and records patient appointment details from scheduling to visit status. To ensure a comprehensive dataset, we retrieved 10 months of data, from June 2023 to March 2024, from three central geographic provinces: Aseer, Jazan, and Najran. Data were anonymized by removing all subject identifiers, including the appointment number. The data were officially requested from the responsible department of the Saudi Ministry of Health after ethical approval was obtained. The final dataset encompassed over one million observations across the 18 studied features.

Overview of the Dataset

The dataset used in this study included appointments from various departments, visit types within a clinic, and names of healthcare providers. The appointments were sourced from primary health centers and hospitals, and weather data were included for a more in-depth exploratory analysis. Eighteen variables were used in the initial analysis; the key features of the dataset are presented in Table 2.

Table 2.

Key feature names and descriptions

| SN | Feature | Description |
|---|---|---|
| 1 | Facility type | Indicates the type of facility where the appointment is scheduled |
| 2 | Appt_status | Specifies the status of the appointment (e.g., ARRIVED, NO_SHOW, CANCELLED, RESCHEDULED) |
| 3 | Appt_type | Defines the type of appointment |
| 4 | Service_name | Denotes the name of the service provided during the appointment |
| 5 | Facility_name | Identifies the name of the facility where the appointment is held |
| 6 | Directorate_name | Names the directorate overseeing the appointment |
| 7 | Appt_book_date | Records the date when the appointment was booked |
| 8 | Appt_slot_date | Indicates the date when the appointment slot is scheduled |
| 9 | Appt_date | Specifies the actual date of the appointment |
| 10 | Month | Represents the month of the appointment |
| 11 | Year | Indicates the year of the appointment |
| 12 | Region | Geographic region of the facility |
| | Additional features | |
| 13 | Temperature | Represents the temperature on the day of the appointment |
| 14 | Feels_like | Indicates the perceived temperature (feels like) on the appointment day |
| 15 | Temperature_min | Records the minimum temperature on the appointment day |
| 16 | Temperature_max | Notes the maximum temperature on the appointment day |
| 17 | Humidity | Indicates the humidity level on the day of the appointment |
| 18 | Wind_speed | Specifies the wind speed on the appointment day |

Data Preprocessing

Data preprocessing involved several critical steps to prepare the dataset for analysis and to ensure accuracy and completeness, including data cleaning, transformation, and the integration of additional variables such as temperature to enrich the analysis. Feature conversion included refining the “appt_status” column to simplify classification and focus on appointments with definitive outcomes. These refinements were discussed with relevant stakeholders and experts in the scheduling system, who helped define each feature correctly so that categories were merged or excluded appropriately. Statuses such as “NO_SHOW,” “CANCELLED,” and “RESCHEDULED” were combined into a single “NO_SHOW” category, as these appointments indicate that the patient did not attend, while “ARRIVED” appointments formed the “SHOW” category, which therefore reflects patients who attended their appointments. Appointments with statuses such as “Booked,” “Failed HIS,” and “Closed” were excluded from the analysis; these represent appointments that were never confirmed, failed owing to technical issues, overlapped between two categories, or were already completed, and thus do not reflect definitive outcomes for our classification task. To manage outliers and missing data, we used Z-scores and the interquartile range (IQR) to detect anomalies in continuous variables; identified outliers were either removed or capped to prevent them from distorting the analysis. To evaluate model performance, we used a holdout data-partitioning method, allocating 70% of the data for training and 30% for testing while ensuring a balanced distribution of class labels.
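As a concrete illustration of these steps, the following sketch shows how the status recoding, outlier capping, and 70/30 holdout split could be implemented in Python with pandas and scikit-learn. The file name, column names, and 1.5×IQR fences are assumptions for illustration, not the study's actual code.

```python
# Minimal preprocessing sketch (hypothetical file and column names).
import pandas as pd
from scipy import stats
from sklearn.model_selection import train_test_split

df = pd.read_csv("mawid_appointments.csv")  # assumed export of the Mawid data

# Collapse definitive statuses into a binary target; indeterminate statuses
# ("Booked", "Failed HIS", "Closed") are simply absent from the mapping.
status_map = {"ARRIVED": "SHOW", "NO_SHOW": "NO_SHOW",
              "CANCELLED": "NO_SHOW", "RESCHEDULED": "NO_SHOW"}
df = df[df["appt_status"].isin(status_map)].copy()
df["appt_status_category"] = df["appt_status"].map(status_map)

# Detect anomalies in a continuous variable with Z-scores, then cap extreme
# values at the conventional 1.5*IQR fences.
z = stats.zscore(df["temperature"].astype(float), nan_policy="omit")
print(f"{(abs(z) > 3).sum()} observations flagged as outliers")
q1, q3 = df["temperature"].quantile([0.25, 0.75])
iqr = q3 - q1
df["temperature"] = df["temperature"].clip(q1 - 1.5 * iqr, q3 + 1.5 * iqr)

# 70/30 holdout split, stratified to preserve the class-label balance.
train_df, test_df = train_test_split(
    df, test_size=0.30, stratify=df["appt_status_category"], random_state=42)
```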

Encoding Categorical Features

Two primary techniques were employed in this study. Ordinal encoding was used to correlate features with the “appt_status_category,” providing insight into how different variables relate to appointment outcomes; the resulting ranking can then be used to understand these relationships (e.g., higher-urgency appointments might be less likely to be no-shows). One-hot encoding was used for specific models, such as the artificial neural network (ANN) with two output neurons and softmax activation, creating separate binary features for each category within the categorical variables.

One-hot encoding converts each category of a feature into a separate binary column. For example, for features such as facility type or appointment type, each unique category is represented as a binary (0 or 1) column, allowing the model to process categorical data as numerical input. The feature “appt_status_category,” although not explicitly listed in Table 2, is derived from the “appt_status” feature, which records appointment outcomes such as ARRIVED, NO_SHOW, and CANCELLED; the derived feature groups these outcomes into the broader SHOW/NO_SHOW classes used for model training.

One-hot encoding is particularly important for calculating correlations because it enables categorical data to be represented as binary features. This allows correlation measures, such as Pearson’s correlation, to quantify the relationship between categorical variables and numerical variables such as weather conditions. One-hot encoding was applied to features other than the target variable; the appointment outcome itself remained a single binary variable. This preprocessing step enables ML algorithms to interpret categorical data without introducing bias from spurious ordinal relationships and ensures that all data, including categorical features, can contribute to the prediction and analysis of appointment outcomes such as no-show rates.
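A brief sketch of how these two encodings might look in practice follows; the selected feature subsets are illustrative, and `df` continues from the preprocessing sketch above.

```python
# Encoding sketch (illustrative feature subset).
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Ordinal encoding: integer ranks per category, convenient for correlation
# analysis against the outcome.
ord_enc = OrdinalEncoder()
df[["region_ord", "appt_type_ord"]] = ord_enc.fit_transform(
    df[["region", "appt_type"]])

# One-hot encoding: one binary (0/1) column per category, used as model inputs.
X = pd.get_dummies(df[["facilitytype", "appt_type", "region"]], dtype=int)

# The target stays a single binary column rather than being one-hot encoded.
y = (df["appt_status_category"] == "NO_SHOW").astype(int)
```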

For categorical data, the χ2 test was used to evaluate whether there is a significant association between two categorical variables; this is appropriate for testing relationships between features such as appointment type or facility type and the outcome variable (e.g., no-show or arrived). For numerical data, Pearson’s correlation coefficient was used to quantify the strength and direction of the relationship between two numerical variables, such as temperature or humidity and appointment outcomes.
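The sketch below illustrates both tests with SciPy, reusing the frames from the previous sketches; the specific feature pairings are examples, not the full analysis.

```python
# Association tests sketch.
import pandas as pd
from scipy.stats import chi2_contingency, pearsonr

# Chi-square: categorical feature (facility type) vs. categorical outcome.
contingency = pd.crosstab(df["facilitytype"], df["appt_status_category"])
chi2, p_chi, dof, _ = chi2_contingency(contingency)

# Pearson correlation: numerical feature (temperature) vs. binary outcome.
r, p_r = pearsonr(df["temperature"], y)
print(f"chi2={chi2:.2f} (p={p_chi:.3g}); r={r:.3f} (p={p_r:.3g})")
```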


Feature Scaling

In this study, min-max normalization was used to normalize features. Feature values were transformed into a standardized range between 0 and 1 to prevent any disproportionate influence exerted by the features on model learning.
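A minimal sketch of this scaling step, assuming the train/test frames from the preprocessing sketch and an illustrative subset of continuous columns:

```python
# Min-max normalization to [0, 1].
from sklearn.preprocessing import MinMaxScaler

numeric_cols = ["temperature", "humidity", "wind_speed"]  # illustrative subset
scaler = MinMaxScaler()
# Fit on the training split only, then apply the same transform to the test
# split, so no information leaks from test to train.
train_df[numeric_cols] = scaler.fit_transform(train_df[numeric_cols])
test_df[numeric_cols] = scaler.transform(test_df[numeric_cols])
```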

Final Dataset Summary

During the data preprocessing phase, several observations were excluded due to missing data or outliers, thereby ensuring the dataset’s accuracy and integrity. Following the application of these exclusions and transformations, the final dataset comprised 5,379,649 observations and 21 features. The primary excluded data points encompassed appointments with incomplete status information, such as “Booked,” “Failed HIS,” or “Closed,” which did not contribute to the classification of no-shows or attended appointments. Furthermore, missing or anomalous data in critical fields such as appointment status were either imputed or removed to ensure the dataset’s reliability for subsequent ML analysis.

ML Models

Five ML algorithms were applied to the dataset in our study, with and without the inclusion of weather-related features. Table 3 lists these algorithms and their descriptions [18]. The ANN architecture consisted of two fully connected hidden layers with 64 neurons each, employing ReLU as the activation function; the output layer used a softmax activation function for classification. Dropout regularization (with a rate of 0.2) and early stopping were employed to prevent overfitting. The ANN was implemented using the TensorFlow and Keras libraries, while the logistic regression, decision tree, random forest, and Naive Bayes models were developed using scikit-learn.
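The described ANN could be assembled in Keras roughly as follows. Here `X_train` and `y_train` stand for the encoded, scaled training split from the preceding steps, and the optimizer, batch size, and early-stopping patience are assumptions not specified in the text.

```python
# ANN sketch: two hidden layers of 64 ReLU units, dropout 0.2, softmax output,
# and early stopping, as described above.
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Input(shape=(X_train.shape[1],)),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(2, activation="softmax"),  # two neurons: show / no-show
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

early_stop = keras.callbacks.EarlyStopping(patience=5, restore_best_weights=True)
model.fit(X_train, y_train, validation_split=0.1,
          epochs=50, batch_size=256, callbacks=[early_stop])
```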

Table 3.

ML algorithms used in this work

| Algorithm | Description |
|---|---|
| Decision tree | A tree-like model in which each node represents a split on a feature (e.g., region or weather condition), and the leaves represent the predicted class (“no-show” or “show”). Decision trees are generally interpretable, allowing an understanding of which features contribute most to the prediction |
| Random forest | An ensemble method that combines multiple decision trees. Each tree is trained on a random subset of features and data points, and the random forest aggregates the predictions of the different decision trees |
| Naive Bayes | A probabilistic model that assumes independence between features and uses Bayes’ theorem to calculate the probability of a data point belonging to a particular class (“no-show” or “show”). Despite being interpretable, Naive Bayes can be less accurate for complex relationships between features |
| Logistic regression | A linear model that estimates the probability of a data point belonging to a particular class based on a linear combination of its features (region, weather, etc., in this case). Logistic regression is interpretable and allows an understanding of the relationship between features and the target |
| Artificial neural network (ANN) | ANN models mimic the human brain’s structure, comprising layers of interconnected artificial neurons. They learn by adjusting connection weights based on training data. While powerful for modeling complex relationships, deeper ANNs can be less interpretable |

Bivariate Analysis of Appointment Status

The bivariate analysis of occurrence by appointment status revealed that 55.07% of appointments were not attended (Fig. 1). Most appointments (71.9%) occurred in PHCC facilities, whereas 28.1% occurred in hospitals (Fig. 2). Regular appointments accounted for the highest percentage of all appointment types (Fig. 6). The Jazan region had the highest proportion (44.7%) of visits, with October to February having the highest occurrence and March to September relatively fewer (Fig. 3), suggesting yearly seasonality (Fig. 4). The Asir and Najran Health Affairs data showed a significant imbalance between “No-show” and “Show” appointment statuses, with “No-show” being higher; by contrast, the Jazan and Bisha regions demonstrated a more balanced distribution (Fig. 5). To handle the class imbalance in our dataset, we applied class weighting during model training to ensure that the minority class (“Show”) was adequately represented. Additionally, we evaluated the models using precision, recall, F1 score, and area under the curve (AUC) to reflect model performance more accurately under class imbalance.
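One common way to apply such class weighting, sketched below with scikit-learn's balanced-weight helper feeding the Keras model from the earlier sketch; the exact weighting scheme used in the study is not specified, so this is an assumption.

```python
# Class-weighting sketch for the imbalanced labels.
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

classes = np.unique(y_train)
weights = compute_class_weight(class_weight="balanced", classes=classes, y=y_train)
class_weight = dict(zip(classes, weights))  # e.g., {0: 1.1, 1: 0.9} (illustrative)

model.fit(X_train, y_train, epochs=50, batch_size=256,
          class_weight=class_weight, callbacks=[early_stop])
```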

Fig. 1. Overall ratio of appointment.

Fig. 2. Distribution of appointment status by facility type.

Fig. 3. Monthly trend of appointment.

Fig. 4. Yearly comparison of appointment.

Fig. 5. Region comparison of appointment.

Fig. 6. Appointment type comparison.

Considering the temperature data, peak times and busy weekday work-life schedules affected attendance patterns (Fig. 7). By examining the interaction between temperature, work-life schedules, and appointment status, we aimed to uncover how weather conditions influence appointment attendance. Dataset analysis indicated a consistent temperature distribution at approximately 300 Kelvin, indicating a stable temperature range. Humidity levels predominantly ranged from 40% to 80%, suggesting moderate to high humidity during appointment times, whereas wind speed generally remained at approximately three units, signifying relatively calm conditions. Notably, the Jazan region experiences the highest temperatures and the Asir region the lowest, highlighting significant regional climate differences. Summer had the highest frequency of appointments, whereas spring had the lowest, potentially owing to factors such as vacations, holidays, and weather preferences. The data also revealed that Sundays and Mondays were the busiest days for scheduling appointments.

Fig. 7. Relationship between the no-show appointment and the weekdays.

Seasonal Patterns in Appointment Scheduling

The data show a continuously elevated humidity level in Jazan, indicating a regional climate influence. Appointment patterns remained stable until March 2024, when they fell noticeably, suggesting a seasonal or event-driven shift in scheduling. Both show and no-show appointments exhibited a discernible upward trend over time. Across the studied period, appointment occurrences were more common at the start and end of the year and less common in the middle. The appointment distribution shows that most appointments occurred on regular workdays: volume peaked between 9:00 a.m. and 12:00 p.m., fell at approximately 1:00 p.m., and climbed again at 3:00 p.m. The observed decline around noon is likely due to the official lunch break commonly observed in healthcare facilities; this midday interruption in service provision may explain the reduction in scheduled appointments during this period, reflecting standard operational practices.

The analysis yielded numerous observations on the correlation between appointment characteristics and appointment status. The scatter plot of the number of days between booking and appointment against show/no-show occurrence reveals a modest positive connection. The relationship between temperature and show/no-show was not linear, with data points for both outcomes dispersed across the entire temperature range; the relationship between humidity and show/no-show was similarly unclear, with data points dispersed across the entire humidity range for both categories. The plot of patient attendance over time indicated possible seasonal trends. There was also no significant linear relationship between wind speed and show/no-show occurrence, with data points for both outcomes dispersed across the entire range of wind speeds. These data suggest that certain variables contribute to modest patterns in patient attendance, but no strong or persistent linear associations were observed. A correlation analysis across regions, years, and monthly trends was then conducted to examine the relationship between the studied parameters and the incidence of no-shows. The variables “Days_difference,” “Facility_name,” “service_name,” “month,” “days_of_week,” “rain_3h,” “clouds_all,” and “appt_book_time” showed negative relationships with the likelihood of no-shows, whereas “region,” “wind_speed,” “season,” “temp_max,” “temp,” “year,” “temp_min,” “feels_like,” “directorate_name,” “humidity,” “appt_type,” and “facilitytype” showed positive relationships. A negative association indicates that as the value of a feature increases, the occurrence of no-shows tends to decrease; a positive association indicates the opposite (Fig. 8).
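A compact way to reproduce this sign-of-correlation ranking, assuming a frame `df_encoded` that holds the ordinal-encoded features together with the binary `appt_status` target (both names follow the earlier sketches):

```python
# Rank per-feature correlations against the no-show label.
corr_with_target = (
    df_encoded.corr(numeric_only=True)["appt_status"]
    .drop("appt_status")
    .sort_values()
)
print(corr_with_target)  # negative associations first, positive last
```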

Fig. 8. Correlation analysis of no-show incidences.

Modeling Dataset Using ML Algorithms

Five families of ML models were developed: decision trees, random forests, Naive Bayes, logistic regression, and ANNs. The performance of the different approaches was evaluated through experiments and analyses with and without the inclusion of weather-related features (Tables 4-6).
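The evaluation loop for the four scikit-learn models might look like the sketch below, scoring each on the held-out 30% with the metrics reported in Tables 4 and 5. Hyperparameters here are library defaults, which the study does not necessarily share.

```python
# Training and evaluation sketch for the classical models.
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

models = {
    "Decision tree": DecisionTreeClassifier(random_state=42),
    "Random forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "Naive Bayes": GaussianNB(),
    "Logistic regression": LogisticRegression(max_iter=1000),
}
for name, clf in models.items():
    clf.fit(X_train, y_train)
    pred = clf.predict(X_test)
    proba = clf.predict_proba(X_test)[:, 1]
    print(name,
          f"acc={accuracy_score(y_test, pred):.3f}",
          f"prec={precision_score(y_test, pred):.3f}",
          f"rec={recall_score(y_test, pred):.3f}",
          f"f1={f1_score(y_test, pred):.3f}",
          f"auc={roc_auc_score(y_test, proba):.3f}")
```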

Table 4.

Performance measures of the selected ML models with weather-related features included

| Models | Accuracy | Precision | Recall | F1 score | AUC | ROC | TPR | FPR |
|---|---|---|---|---|---|---|---|---|
| Decision tree | 0.755 | 0.727 | 0.722 | 0.725 | 0.793 | 0.790 | 0.722 | 0.219 |
| Random forest | 0.765 | 0.730 | 0.751 | 0.740 | 0.852 | 0.850 | 0.751 | 0.225 |
| Naive Bayes | 0.644 | 0.566 | 0.870 | 0.686 | 0.727 | 0.730 | 0.870 | 0.539 |
| Logistic regression | 0.684 | 0.638 | 0.676 | 0.656 | 0.756 | 0.760 | 0.676 | 0.311 |
| ANN (one neuron, sigmoid activation) | 0.747 | 0.702 | 0.754 | 0.727 | 0.836 | 0.840 | 0.754 | 0.259 |
| ANN (two neurons, softmax activation) | 0.748 | 0.720 | 0.711 | 0.716 | 0.836 | 0.840 | 0.711 | 0.223 |
Table 5.

ML model performance without weather data

| Models | Accuracy | Precision | Recall | F1 score | AUC | ROC | TPR | FPR |
|---|---|---|---|---|---|---|---|---|
| Decision tree | 0.765 | 0.726 | 0.764 | 0.744 | 0.848 | 0.850 | 0.764 | 0.233 |
| Random forest | 0.766 | 0.723 | 0.770 | 0.746 | 0.858 | 0.860 | 0.770 | 0.238 |
| Naive Bayes | 0.643 | 0.565 | 0.879 | 0.688 | 0.688 | 0.730 | 0.879 | 0.547 |
| Logistic regression | 0.683 | 0.639 | 0.668 | 0.653 | 0.756 | 0.760 | 0.668 | 0.305 |
| ANN (one neuron, sigmoid activation) | 0.744 | 0.695 | 0.763 | 0.727 | 0.832 | 0.830 | 0.763 | 0.271 |
| ANN (two neurons, softmax activation) | 0.745 | 0.712 | 0.721 | 0.717 | 0.832 | 0.830 | 0.721 | 0.236 |
Table 6.

Correlation between appointment characteristics and status

| | Appt_status | Month | Year | Days_difference | Day_of_week | Temperature | Feels_like | Temperature_min | Temperature_max | Humidity | Wind_speed | Rain_3h | Clouds_all |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Appt_status | — | −0.0233841 | 0.0459826 | −0.242130218 | −0.016550339 | 0.04551147 | 0.06121263 | 0.04627022 | 0.04467754 | 0.09783718 | 0.0295614 | −0.0220387 | −0.0091857 |
| Month | −0.0233841 | — | −0.7193511 | −0.019427372 | 0.023858758 | 0.03907706 | 0.029683 | 0.03834371 | 0.03713317 | −0.2457473 | −0.21931 | −0.0196754 | −0.1712701 |
| Year | 0.0459826 | −0.7193511 | — | −0.007231726 | −0.002914583 | −0.29806 | −0.2485731 | −0.2865833 | −0.3070763 | 0.49063395 | 0.18311759 | | 0.05296653 |
| Days_difference | −0.2421302 | −0.0194274 | −0.0072317 | — | −0.032009385 | −0.0488032 | −0.0513374 | −0.0499793 | −0.0477332 | −0.028284 | −0.0174405 | 0.04428018 | −0.006586 |
| Day_of_week | −0.0165503 | 0.02385876 | −0.0029146 | −0.032009385 | — | 0.00094676 | 0.00127965 | 5.65E-05 | 0.00158551 | −0.0024257 | 0.01185105 | 0.14657826 | −0.0041698 |
| Temperature | 0.04551147 | 0.03907706 | −0.29806 | −0.048803161 | 0.00094676 | — | 0.9791789 | 0.99877925 | 0.99823101 | −0.0542566 | 0.23070091 | −0.4434793 | 0.23657289 |
| Feels_like | 0.06121263 | 0.029683 | −0.2485731 | −0.051337372 | 0.001279654 | 0.9791789 | — | 0.97708902 | 0.97878408 | 0.09180886 | 0.23771366 | −0.4019426 | 0.2281208 |
| Temperature_min | 0.04627022 | 0.03834371 | −0.2865833 | −0.049979261 | 5.65E-05 | 0.99877925 | 0.97708902 | — | 0.99460609 | −0.0485599 | 0.22674296 | −0.4588102 | 0.23763709 |
| Temperature_max | 0.04467754 | 0.03713317 | −0.3070763 | −0.047733241 | 0.001585507 | 0.99823101 | 0.97878408 | 0.99460609 | — | −0.0567317 | 0.23792228 | −0.4247732 | 0.23818383 |
| Humidity | 0.09783718 | −0.2457473 | 0.49063395 | −0.028284039 | −0.002425723 | −0.0542566 | 0.09180886 | −0.0485599 | −0.0567317 | — | 0.07676941 | 0.82677249 | 0.13631574 |
| Wind_speed | 0.0295614 | −0.21931 | 0.18311759 | −0.017440528 | 0.011851054 | 0.23070091 | 0.23771366 | 0.22674296 | 0.23792228 | 0.07676941 | — | 0.47701532 | 0.16995892 |
| Rain_3h | −0.0220387 | −0.0196754 | | 0.044280183 | 0.146578265 | −0.4434793 | −0.4019426 | −0.4588102 | −0.4247732 | 0.82677249 | 0.47701532 | — | 0.06186936 |
| Clouds_all | −0.0091857 | −0.1712701 | 0.05296653 | −0.006585986 | −0.004169812 | 0.23657289 | 0.2281208 | 0.23763709 | 0.23818383 | 0.13631574 | 0.16995892 | 0.06186936 | — |

This study investigated the factors influencing no-show rates at healthcare appointments in the southern region of Saudi Arabia, to develop a comprehensive understanding and predictive model based on empirical data. By incorporating predictors such as appointment type, regional disparities, and environmental conditions, our model demonstrated a substantial ability to accurately forecast no-show occurrences, aligning with performance metrics reported in related research. This comprehensive approach highlights the critical variables affecting patient attendance and provides actionable insights into optimizing scheduling systems and enhancing patient adherence to healthcare services.

Factors Influencing Appointment Status

Regional Variations

The study identified significant regional disparities in appointment attendance: the Asir and Najran regions showed higher no-show rates than Jazan, where a more balanced distribution between show and no-show appointments was observed. To further understand these relationships, we conducted a Pearson correlation analysis between regional appointment data and weather-related features. The analysis revealed a modest positive correlation between temperature and no-show rates (r = 0.32, p < 0.05), suggesting that higher temperatures were associated with an increase in missed appointments. Additionally, a χ2 test examined the relationship between facility type (PHCCs vs. hospitals) and appointment attendance, revealing a statistically significant association (χ2 = 12.76, p < 0.001): PHCCs had lower no-show rates than general hospitals. Our findings align with the literature, where regional disparities often reflect differences in healthcare infrastructure and socioeconomic status [19, 20]. This emphasizes the need for region-specific appointment-management strategies that consider local weather patterns and socioeconomic conditions.

Seasonal Influences

Seasonal trends significantly impact appointment attendance, with higher no-show rates observed from October to February and a decline in March. Factors, such as holidays, weather conditions, and seasonal illnesses, are likely to influence these patterns. Understanding these trends can help healthcare providers optimize scheduling and develop strategies to mitigate no-shows during high-risk periods. Other studies have documented similar seasonal patterns, highlighting the importance of considering temporal factors in healthcare management [21‒23].

Environmental Factors

Environmental conditions, particularly temperature and humidity, influenced appointment attendance. At higher temperatures, the Jazan region showed distinct attendance patterns compared with the cooler Asir region. Although the relationship between weather conditions and no-shows was not linear, extreme weather may deter patients from attending their appointments. Future research should explore these correlations to develop weather-responsive healthcare delivery models. Studies have shown that environmental factors can significantly impact healthcare access and patient behavior [5, 22].

Appointment Type and Facility Impact

The type of appointment and facility type significantly affected attendance. Regular appointments had the highest no-show rate, whereas walk-in appointments had a better attendance rate. Primary healthcare centers also had higher attendance rates than hospitals did. These findings emphasize the need for more accessible community-based healthcare solutions and flexible scheduling to accommodate patient preferences. This aligns with previous research highlighting the importance of appointment type and healthcare facility characteristics in influencing patient adherence [4, 5].

Temporal Patterns

Temporal patterns in appointment attendance were identified, with the highest number of appointments occurring during regular business hours, peaking between 9:00 a.m. and 12:00 p.m. Sunday and Monday were the busiest days for appointments. This observation highlights the importance of considering weekly patterns of appointment demands when planning staff and resource allocation in healthcare facilities. Similar temporal trends have been noted in other studies, underscoring the need for effective scheduling strategies [13, 15].

Correlation Analysis

The correlation analysis revealed various factors influencing no-show rates. Variables such as the number of days between booking and appointment, facility type, appointment type, and humidity level had varying degrees of influence. Although some variables demonstrated modest correlations, the wide distribution of the data suggests that these relationships are complex and multifaceted. This complexity is consistent with findings in the literature that highlight the multifactorial nature of appointment adherence [12].

Performance of ML Models

The performance of the ML models in predicting patient no-shows was generally strong, demonstrating the potential of these techniques for improving healthcare management. Among the models evaluated, random forest consistently outperformed the other algorithms, with the highest accuracy, precision, recall, and AUC, reinforcing its robustness and reliability. The success of this model aligns with the findings of Liu et al. [5], where random forest demonstrated superior performance in similar predictive tasks owing to its ability to handle large datasets and complex interactions between variables. Decision trees also performed well, particularly regarding recall and the F1 score, making them a viable option for scenarios in which understanding the decision-making process is crucial. This is consistent with the work of Alabdulkarim et al. [17], who found that decision trees are effective in predicting no-shows when enriched with comprehensive data.

Logistic regression, although its performance was not as high as that of random forest, still provided valuable insights with reasonable accuracy and balanced precision-recall trade-offs. The performance of this model was supported by Topuz et al. [24], who noted its effectiveness in scenarios requiring straightforward implementation and interpretation. ANNs, particularly those with more complex architectures such as two output neurons with softmax activation, showed strong recall and F1 scores, indicating their strength in identifying true no-shows. This robustness is echoed in the findings of Chong et al. [10], where ANNs excelled in handling nonlinear relationships in healthcare data.

Although Naïve Bayes exhibited high recall rates, it struggled with precision, resulting in a higher rate of false positives. The performance issues of this model are well-documented in the literature, highlighting its limitations in complex, high-dimensional datasets. Overall, the models developed in this study demonstrate a range of capabilities, with random forest emerging as the most reliable for predicting patient no-shows and other models offering specific advantages depending on the context and data characteristics. This comprehensive evaluation provides a solid foundation for the further refinement and application of ML techniques in healthcare settings to optimize resource allocation and improve patient care outcomes.

Comparative Performance and Interpretability of the Models

The performance of the ML models was strong overall, demonstrating the potential of these techniques for predicting patient no-shows. Among the models, random forest consistently outperformed the others, with the highest accuracy, precision, recall, and AUC, which can be attributed to its ability to handle complex data interactions and large datasets effectively; this aligns with the findings of Liu et al. [5]. Moreover, random forests and decision trees can be converted into rule-based models, allowing the extraction of human-readable rules. By analyzing feature importance in the random forest model, we identified the key factors driving the predictions. This is consistent with the work of Alabdulkarim et al. [17], who found that decision trees are effective in predicting no-shows when enriched with comprehensive data.
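Extracting those drivers from a fitted random forest is straightforward, as in this sketch continuing the earlier evaluation code (`models` and `X_train` are assumptions carried over from that sketch):

```python
# Feature-importance sketch for the fitted random forest.
import pandas as pd

rf = models["Random forest"]
importances = pd.Series(rf.feature_importances_, index=X_train.columns)
print(importances.sort_values(ascending=False).head(10))  # top drivers
```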

Logistic regression, a classification model, offered reasonable performance, although not as high as random forest. While it is a linear model, its precision-recall trade-offs make it suitable for simpler applications where interpretability is critical. However, its limitations in handling nonlinear relationships impacted its performance compared to more complex models like random forest and ANNs.

The small variation in performance across models, within a 5% margin, suggests that all models performed similarly due to the robustness of the dataset and the features engineered during preprocessing. This could indicate that the dataset is well-structured or the models themselves are capturing similar patterns in the data, reinforcing the reliability of the predictions.

Moreover, ANNs performed strongly, especially with more complex architectures, as they excel at capturing nonlinear relationships. Despite their black-box nature, ANNs were able to achieve high recall rates, which is crucial for identifying true no-shows. However, their interpretability is lower than rule-based models like decision trees.

Although Naive Bayes had high recall, its low precision led to many false positives. This is consistent with its tendency to assume feature independence, which may not hold in high-dimensional healthcare data. Despite its simplicity, Naive Bayes struggles with precision in this context, making it less suited for practical applications. The slight differences in performance metrics across models highlight the importance of choosing an algorithm based on specific needs – whether it’s interpretability, precision, recall, or overall AUC.

Impact of Weather-Related Features on ML Model Performance

The inclusion of weather-related features led to only marginal changes in the performance of the ML models, with small shifts in AUC and F1 scores. While these features provide some predictive value, their overall impact remains modest, suggesting that weather data may enhance predictive accuracy but are not essential for effective model operation.

The random forest model consistently achieved the highest overall performance. With weather features, it attained an AUC of 0.852 and an F1 score of 0.740; without them, it performed slightly better, with an AUC of 0.858 and an F1 score of 0.746. This indicates that while weather data provide some context, the model remains robust without these additional features. Similar observations have been reported in previous studies [7], where random forests demonstrated high accuracy and reliability for similar predictive tasks.

The decision tree model likewise performed somewhat better without weather features, achieving an AUC of 0.848 compared with 0.793 when they were included; the F1 score was 0.744 without and 0.725 with weather features. This suggests that while environmental features offer valuable context [17], they are not critical for the model’s accuracy.

Logistic regression was essentially unchanged by the weather data: precision was 0.638 with and 0.639 without, recall was 0.676 with and 0.668 without, and the AUC remained 0.756 in both settings. Weather data thus modestly shifted the precision-recall trade-off, similar to other work [24], without altering the model’s discrimination ability.

ANNs maintained consistent performance across both datasets. The one-neuron model achieved an AUC of 0.836 with and 0.832 without weather features, and the two-neuron model performed similarly (AUC 0.836 with and 0.832 without). The recall rates with weather features (0.754 and 0.711) highlight the ANNs’ strength in identifying true no-shows, yet these features contributed minimally to overall performance. This finding aligns with the work of Chong et al. [10].

The Naive Bayes model showed slightly lower recall with weather features (0.870 vs. 0.879 without) and nearly unchanged precision (0.566 vs. 0.565), while its AUC was higher with weather data (0.727 vs. 0.688). Although Naive Bayes captured a broad range of true no-shows, its high false positive rate made it less reliable for practical applications in this context [11].

In summary, incorporating weather-related features into the ML models yielded only marginal performance differences, particularly for random forests and decision trees. While these features provide valuable environmental context, they are not essential for overall model success. Future research should explore the integration of additional contextual and patient-specific data to further refine predictive models, focusing on optimizing healthcare delivery and operational efficiency without overcomplicating the model.
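This with/without-weather comparison amounts to a feature ablation: train the same model on two feature sets and compare AUC. A sketch follows, with weather column names taken from Table 2 and everything else carried over, as an assumption, from the earlier sketches:

```python
# Weather-feature ablation sketch.
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

weather_cols = ["temperature", "feels_like", "temperature_min",
                "temperature_max", "humidity", "wind_speed"]

rf_full = RandomForestClassifier(n_estimators=100, random_state=42)
rf_full.fit(X_train, y_train)
auc_full = roc_auc_score(y_test, rf_full.predict_proba(X_test)[:, 1])

X_train_nw = X_train.drop(columns=weather_cols, errors="ignore")
X_test_nw = X_test.drop(columns=weather_cols, errors="ignore")
rf_nw = RandomForestClassifier(n_estimators=100, random_state=42)
rf_nw.fit(X_train_nw, y_train)
auc_nw = roc_auc_score(y_test, rf_nw.predict_proba(X_test_nw)[:, 1])
print(f"AUC with weather={auc_full:.3f}, without={auc_nw:.3f}")
```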

Implications for Healthcare Management

The findings of this study have several important implications for healthcare management. Recognizing the influence of regional, seasonal, and environmental factors on appointment attendance can help healthcare providers develop tailored and practical strategies to reduce no-show rates. Enhancing patient communication and reminder systems, offering more flexible appointment types, and adjusting schedules based on seasonal trends and patient preferences are critical steps. Additionally, improving accessibility to PHCCs and community-based healthcare services can further enhance the overall attendance rates [7].

The implementation of ML models can provide predictive insights, allowing healthcare providers to anticipate no-shows and allocate resources more efficiently. This can lead to improved clinical efficiency, reduced idle times, and improved patient throughput [8]. Missed appointments represent a significant challenge for healthcare systems, resulting in wasted resources, financial losses, and negative impacts on patient care. When patients fail to attend scheduled appointments, it leads to inefficient use of clinical time, underutilization of staff, and delays in providing necessary care to other patients. These disruptions can compromise the overall efficiency of healthcare facilities and increase waiting times for appointments, ultimately affecting patient satisfaction and health outcomes. The financial costs associated with no-shows are substantial, as they lead to increased operational costs and lost revenue for healthcare providers. Implementing predictive models to reduce missed appointments can not only optimize resource allocation and improve care delivery but also enhance the overall quality of healthcare systems by ensuring timely access to medical services. By addressing these broader implications, this research contributes to the development of strategies that can mitigate the negative effects of missed appointments on both the healthcare system and patient outcomes.

This study has several limitations. First, the observational nature of the data analysis prevented the establishment of causality between the identified factors and appointment attendance. Future research should include longitudinal studies to better understand the causal relationships and underlying mechanisms. Second, individual patient characteristics such as socioeconomic status, health literacy, and transportation access were not considered, which could significantly influence appointment adherence. Including these variables in future analyses would provide a more comprehensive understanding. Finally, missing data in the middle of the year may have affected the findings. Ensuring complete data collection in future studies will enhance the accuracy and reliability of the results.

Further research should explore no-show predictive models that incorporate a more comprehensive range of variables. Examining the impact of reminder messages and flexible scheduling interventions on reducing no-shows could provide valuable insights for healthcare providers. Potential strategies include a reminder system that issues personalized or more frequent reminders for patients identified as high risk for no-shows; scheduling approaches such as flexible or overbooked appointment slots tailored to those predicted to miss appointments; and focused follow-up for patient groups with associated risk factors (e.g., socioeconomic status, health conditions), with more proactive engagement of these groups.

Furthermore, an ML-based system could automate patient management by predicting no-shows in advance and suggesting interventions such as transportation assistance, telemedicine, or additional reminders. This underscores the practical impact of the study, showing how healthcare providers could integrate these findings into their operations to improve patient attendance and optimize resource utilization.

By incorporating both qualitative insights and quantitative ML approaches, future studies can account for factors such as distance and transportation, work commitments, and appointment times, which are significant contributors to missed appointments. Such studies can offer a more holistic understanding of the issue and support the development of more effective interventions, enhancing the practical applications of our findings.

In conclusion, the analysis of appointment attendance patterns underscores the importance of considering regional, seasonal, and environmental factors in healthcare management. Recognizing these influences allows healthcare providers to develop tailored strategies to reduce no-show rates and improve patient adherence to scheduled appointments. Recommendations include enhancing communication and reminder systems, offering flexible appointment types, and improving access to community-based healthcare services. Future research should continue to explore these dynamics, focusing on intervention strategies tailored to specific regions and populations for optimal impact. Understanding and addressing the multifaceted factors affecting appointment attendance can lead to more effective healthcare delivery and better patient outcomes.

This study protocol was reviewed and approved by the Aseer Institutional Review Board (approval No. REC 13-9-2023). The study was conducted using data provided officially by the responsible body in the Saudi Ministry of Health, and approval to use the data was obtained from that body. The research did not involve direct interaction with participants; it used anonymized data for the purposes of ML and predictive modeling. In accordance with local and national guidelines, written informed consent from individual patients was not required.

The authors have no conflicts of interest to declare.

This study was not supported by any sponsor or funder.

Abdulrahman Alshehri contributed to the conception and initial design of the study, played a significant role in data collection, and participated in the drafting of the manuscript. Abdullah Saeed, the corresponding author, coordinated the research activities, led the data analysis, and oversaw the drafting and critical revision of the manuscript for intellectual content. Abdullah AlShafea was involved in the study design, managed data retrieval processes, and contributed to the analysis and interpretation of data. Sabah Althubiany focused on data collection, managed logistics for the research team, and assisted in the preparation and revision of the manuscript. Mohammed Alshehri participated in data analysis, contributed to the design of the study’s methodology, and helped draft the manuscript sections related to data interpretation. Amer Alzahrani provided expertise in statistical analysis, contributed to the conception of the study, and was involved in critical revisions of the manuscript. Khalid Hakami contributed to study design, was pivotal in data collection and management, and participated in drafting and revising the manuscript. Lamia Ibrahim engaged in data retrieval and analysis, contributed to the conceptual framework of the research, and assisted in the drafting and revision of the manuscript. Abdulrahim Alshehri was involved in data collection, contributed to the development of the study’s methodology, and participated in drafting and revising the manuscript. Rana Alamri focused on data interpretation, provided critical insights into the analysis, and was involved in drafting and critically revising the manuscript. Each author has approved the final version of the manuscript to be published and agreed to be accountable for all aspects of the work, ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

The data that support the findings of this study are not publicly available because they contain information that could compromise the privacy of research participants and because of the Data Sharing Agreement signed in collaboration with the Saudi Ministry of Health. They are available from the official project owner (email: [email protected]) upon reasonable request.

1. Berliner Senderey A, Kornitzer T, Lawrence G, Zysman H, Hallak Y, Ariely D, et al. It is how you say it: systematic A/B testing of digital messaging cuts hospital no-show rates. Ramagopalan SV, editor. PLoS One. 2020;15(6):e0234817.
2. Lin CL, Mistry N, Boneh J, Li H, Lazebnik R. Text message reminders increase appointment adherence in a pediatric clinic: a randomized controlled trial. Int J Pediatr. 2016;2016:8487378.
3. Choi SJ, Johnson ME, Lehmann CU. Data breach remediation efforts and their implications for hospital quality. Health Serv Res. 2019;54(5):971–80.
4. Hamdan AF, Abu Bakar A. Machine learning predictions on outpatient no-show appointments in a Malaysian major tertiary hospital. Malays J Med Sci. 2023;30(5):169–80.
5. Liu D, Shin WY, Sprecher E, Conroy K, Santiago O, Wachtel G, et al. Machine learning approaches to predicting no-shows in pediatric medical appointment. NPJ Digit Med. 2022;5(1):50.
6. Elvira C, Ochoa A, Gonzalvez JC, Mochon F. Machine-learning-based no show prediction in outpatient visits. Int J Interact Multimed Artif Intell. 2018;4(7):29.
7. Li Y, Tang SY, Johnson J, Lubarsky DA. Individualized no-show predictions: effect on clinical overbooking and appointment reminders. Prod Oper Manag. 2019;28(8):2068–86.
8. AlMuhaideb S, Alswailem O, Alsubaie N, Ferwana I. Prediction of hospital no-show appointments through artificial intelligence algorithms. Ann Saudi Med. 2019;39(6):373–81.
9. Chen J, Goldstein IH, Lin WC, Chiang MF, Hribar MR. Application of machine learning to predict patient no-shows in an academic pediatric ophthalmology clinic. AMIA Annu Symp Proc. 2020:293–302.
10. Chong LR, Tsai KT, Lee LL, Foo SG, Chang PC. Artificial intelligence predictive analytics in the management of outpatient MRI appointment no-shows. Am J Roentgenol. 2020;215(5):1155–62.
11. Scrivani H, Soeiro F. Using machine learning for no-show prediction in the scheduling of clinical examinations. Int J Radiol Radiat Ther. 2020;7(1):34. Available from: http://medcraveonline.com/IJRRT/using-machine-learning-for-no-show-prediction-in-the-scheduling-of-clinical-exams.html
12. Aladeemy M, Adwan L, Booth A, Khasawneh MT, Poranki S. New feature selection methods based on opposition-based learning and self-adaptive cohort intelligence for predicting patient no-shows. Appl Soft Comput. 2020;86:105866.
13. Nasir M, Summerfield N, Dag A, Oztekin A. A service analytic approach to studying patient no-shows. Serv Bus. 2020;14(2):287–313.
14. Batool T, Abuelnoor M, El Boutari O. Predicting hospital no-shows using machine learning. In: 2020 IEEE International Conference on Internet of Things and Intelligence System (IoTaIS). Bali, Indonesia: IEEE; 2021. p. 142–8. Available from: https://ieeexplore.ieee.org/document/9359692/
15. Fan G, Deng Z, Ye Q, Wang B. Machine learning-based prediction models for patients no-show in online outpatient appointments. Data Sci Manag. 2021;2:45–52.
16. Moharram A, Altamimi S, Alshammari R. Data analytics and predictive modeling for appointments no-show at a tertiary care hospital. In: 2021 1st International Conference on Artificial Intelligence and Data Analytics (CAIDA). Riyadh, Saudi Arabia: IEEE; 2021. p. 275–7. Available from: https://ieeexplore.ieee.org/document/9425258/
17. Alabdulkarim Y, Almukaynizi M, Alameer A, Makanati B, Althumairy R, Almaslukh A. Predicting no-shows for dental appointments. PeerJ Comput Sci. 2022;8:e1147.
18. Sarker IH. Machine learning: algorithms, real-world applications and research directions. SN Comput Sci. 2021;2(3):160.
19. Kempny A, Diller GP, Dimopoulos K, Alonso-Gonzalez R, Uebing A, Li W, et al. Determinants of outpatient clinic attendance amongst adults with congenital heart disease and outcome. Int J Cardiol. 2016;203:245–50.
20. Miller AJ, Chae E, Peterson E, Ko AB. Predictors of repeated “no-showing” to clinic appointments. Am J Otolaryngol. 2015;36(3):411–4.
21. Huang Y, Hanauer DA. Patient no-show predictive model development using multiple data sources for an effective overbooking approach. Appl Clin Inform. 2014;5(3):836–60.
22. Huang YL, Hanauer DA. Time dependent patient no-show predictive modelling development. Int J Health Care Qual Assur. 2016;29(4):475–88.
23. Mohammadi I, Wu H, Turkcan A, Toscos T, Doebbeling BN. Data analytics and modeling for appointment no-show in community health centers. J Prim Care Community Health. 2018;9:2150132718811692.
24. Topuz K, Uner H, Oztekin A, Yildirim MB. Predicting pediatric clinic no-shows: a decision analytic framework using elastic net and Bayesian belief network. Ann Oper Res. 2018;263(1–2):479–99.